---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# MARTINI_enrich_BERTopic_jordansather

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_jordansather")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 40
* Number of training documents: 5580

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | disinformation - trump - influencers - youtube - flynn | 20 | -1_disinformation_trump_influencers_youtube |
| 0 | disinformation - influencers - awakening - believe - grifters | 3047 | 0_disinformation_influencers_awakening_believe |
| 1 | bitchute - gabtv - livestream - replays - embed | 310 | 1_bitchute_gabtv_livestream_replays |
| 2 | vaxxers - vaccinated - pfizer - injections - mrna | 201 | 2_vaxxers_vaccinated_pfizer_injections |
| 3 | ufo - pentagon - disclosure - drones - gatekeepers | 176 | 3_ufo_pentagon_disclosure_drones |
| 4 | ghislaine - conspirators - lawsuit - latham - dismissed | 150 | 4_ghislaine_conspirators_lawsuit_latham |
| 5 | qanon - shills - repost - sources - normies | 139 | 5_qanon_shills_repost_sources |
| 6 | hypochlorite - hydroxychloroquine - dioxido - ivermectin - cl02 | 115 | 6_hypochlorite_hydroxychloroquine_dioxido_ivermectin |
| 7 | ballots - maricopa - rigged - statewide - republican | 91 | 7_ballots_maricopa_rigged_statewide |
| 8 | qanonjohn - flynn - hypocritical - persecution - whistleblower | 74 | 8_qanonjohn_flynn_hypocritical_persecution |
| 9 | fbi - antifa - capitol - alleged - jan | 71 | 9_fbi_antifa_capitol_alleged |
| 10 | disinfo - qult_headquarters - nesara - scammy - weirdo | 69 | 10_disinfo_qult_headquarters_nesara_scammy |
| 11 | fraudlewski - godlewki - liars - gregg - suing | 66 | 11_fraudlewski_godlewki_liars_gregg |
| 12 | fednow - coinbase - banks - robinhood - deepfuckingvalue | 62 | 12_fednow_coinbase_banks_robinhood |
| 13 | russia - nordstream - zelensky - globalists - invade | 59 | 13_russia_nordstream_zelensky_globalists |
| 14 | scampeachment - donald - indicted - newsmax - courthouse | 57 | 14_scampeachment_donald_indicted_newsmax |
| 15 | biden - wikileaks - laptop - joe - emails | 55 | 15_biden_wikileaks_laptop_joe |
| 16 | truthsocial - newsom - launched - trolled - bots | 54 | 16_truthsocial_newsom_launched_trolled |
| 17 | biden - kamala - newsom - putin - helluva | 50 | 17_biden_kamala_newsom_putin |
| 18 | sunspots - auroras - geomagnetic - earthquakes - satellites | 49 | 18_sunspots_auroras_geomagnetic_earthquakes |
| 19 | vaccines - remdesivir - graphene - poison - sinopeg | 49 | 19_vaccines_remdesivir_graphene_poison |
| 20 | comey - mueller - dossier - indictments - sussman | 49 | 20_comey_mueller_dossier_indictments |
| 21 | wuhan - coronaviruses - darpa - leaked - redfield | 43 | 21_wuhan_coronaviruses_darpa_leaked |
| 22 | energy - geoengineering - tesla - turbines - decentralized | 41 | 22_energy_geoengineering_tesla_turbines |
| 23 | twitter - musk - takeover - unban - shareholder | 39 | 23_twitter_musk_takeover_unban |
| 24 | livestream - today - foxhole - chillin - 5pm | 36 | 24_livestream_today_foxhole_chillin |
| 25 | fake - shillbots - qanonofficial - telegram - ladymelaniatrump | 35 | 25_fake_shillbots_qanonofficial_telegram |
| 26 | johnmcafee - juan - larpdar - legit - griftin | 34 | 26_johnmcafee_juan_larpdar_legit |
| 27 | charlies - simon - parkes - spoof - cloned | 33 | 27_charlies_simon_parkes_spoof |
| 28 | shootings - pistol - texas - lunatics - manifesto | 33 | 28_shootings_pistol_texas_lunatics |
| 29 | gmo - aspartame - pasteurized - vitamins - toxins | 32 | 29_gmo_aspartame_pasteurized_vitamins |
| 30 | collagen - cordyceps - supplements - theanine - adaptogenic | 32 | 30_collagen_cordyceps_supplements_theanine |
| 31 | ballots - illegitimately - smartmatic - senators - sheriff | 30 | 31_ballots_illegitimately_smartmatic_senators |
| 32 | qanonjohn - flynn - soldiers - tyranny - disseminating | 30 | 32_qanonjohn_flynn_soldiers_tyranny |
| 33 | hamas - israelis - bombed - terrorizers - blaming | 28 | 33_hamas_israelis_bombed_terrorizers |
| 34 | livestreams - monday - tonight - hi - starlink | 28 | 34_livestreams_monday_tonight_hi |
| 35 | airspace - montana - drones - missiles - fairchild | 26 | 35_airspace_montana_drones_missiles |
| 36 | coinscammers - trumpcoinannouncements - scammer - bots - promoting | 23 | 36_coinscammers_trumpcoinannouncements_scammer_bots |
| 37 | twitter - suspended - majorpatriot - unbanning - updated | 23 | 37_twitter_suspended_majorpatriot_unbanning |
| 38 | whatevergender - transgenders - feminized - boobs - onlyfans | 21 | 38_whatevergender_transgenders_feminized_boobs |

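
The rows of the topic table follow a visible pattern: each label appears to be the topic ID followed by the first four keywords, joined with underscores, and topic -1 is the group BERTopic reserves for outlier documents. The sketch below is only an illustration of how a row could be read and checked with the standard library. The names `TopicRow`, `parse_row`, `rebuild_label`, and `share_of_corpus` are hypothetical helpers, not part of BERTopic, and the three sample rows are copied from the table.

```python
from typing import NamedTuple


class TopicRow(NamedTuple):
    """One row of the topic overview table (hypothetical helper type)."""
    topic_id: int
    keywords: list[str]
    frequency: int
    label: str


def parse_row(row: str) -> TopicRow:
    """Split one Markdown table row into typed fields."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    topic_id, keywords, frequency, label = cells
    return TopicRow(int(topic_id), keywords.split(" - "), int(frequency), label)


def rebuild_label(row: TopicRow, n_words: int = 4) -> str:
    """Rebuild the label as '<id>_<kw1>_..._<kwN>' (pattern inferred from the table)."""
    return "_".join([str(row.topic_id), *row.keywords[:n_words]])


def share_of_corpus(row: TopicRow, n_documents: int = 5580) -> float:
    """Fraction of the training documents assigned to this topic."""
    return row.frequency / n_documents


# Three rows copied verbatim from the table above.
SAMPLE_ROWS = [
    "| -1 | disinformation - trump - influencers - youtube - flynn | 20 | -1_disinformation_trump_influencers_youtube |",
    "| 0 | disinformation - influencers - awakening - believe - grifters | 3047 | 0_disinformation_influencers_awakening_believe |",
    "| 10 | disinfo - qult_headquarters - nesara - scammy - weirdo | 69 | 10_disinfo_qult_headquarters_nesara_scammy |",
]

rows = [parse_row(raw) for raw in SAMPLE_ROWS]
for row in rows:
    label_ok = rebuild_label(row) == row.label
    print(row.topic_id, label_ok, f"{share_of_corpus(row):.1%}")
```

Topic 0 alone accounts for more than half of the 5580 training documents, so per-topic shares are worth checking before treating the smaller topics as representative.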
## Training hyperparameters

* calculate_probabilities: True
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.40
* UMAP: 0.5.7
* Pandas: 2.2.3
* Scikit-Learn: 1.5.2
* Sentence-transformers: 3.3.1
* Transformers: 4.46.3
* Numba: 0.60.0
* Plotly: 5.24.1
* Python: 3.10.12
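
The training hyperparameters listed above correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch, they could be collected in a dictionary and passed on when configuring a comparable model. Several things here are assumptions: `HYPERPARAMS` and `build_topic_model` are hypothetical names, the installed BERTopic release is assumed to accept the zero-shot arguments, and the embedding model is left as a required argument because this card does not name the one used (`language: None` likely means a custom embedding model was supplied). The UMAP, HDBSCAN, and vectorizer settings are also not listed, so this would not reproduce the published topics.

```python
from typing import Any

# Values copied from the "Training hyperparameters" list on this card.
HYPERPARAMS: dict[str, Any] = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(embedding_model: Any):
    """Create an unfitted BERTopic model with the listed settings.

    `embedding_model` is an assumption: the card does not state which
    embedding model was used, so the caller has to choose one.
    """
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMS)
```

Calling `build_topic_model(...)` with a sentence-transformers model name or object, followed by `fit_transform` on a list of documents, would then train a new model under these settings.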