MARTINI_enrich_BERTopic_varlinas

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

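from bertopic import BERTopic

# Load the pretrained topic model from the Hugging Face Hub
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_varlinas")

# Summarize every topic (ID, document count, label) as a pandas DataFrame
topic_model.get_topic_info()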
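
Once the model is loaded, individual topics can also be inspected by ID. The following is a minimal sketch, assuming `get_topic` returns a list of `(word, score)` pairs for a known topic and `False` for an unknown one, as in recent BERTopic releases. `top_keywords` is a hypothetical helper written for this example, not part of the library.

```python
def top_keywords(topic_model, topic_id, n=5):
    """Return up to n keywords for one topic of a loaded BERTopic model.

    Hypothetical helper for illustration; not part of BERTopic itself.
    """
    # get_topic yields (word, score) pairs, or False when the ID is unknown.
    pairs = topic_model.get_topic(topic_id)
    if not pairs:
        return []
    return [word for word, _score in pairs[:n]]
```

With the model loaded as above, `top_keywords(topic_model, 4)` should return keywords along the lines of those listed for topic 4 in the topic overview, although the exact ordering is not guaranteed here.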

Topic overview

  • Number of topics: 35 (topics 0 to 33 plus the outlier topic -1)
  • Number of training documents: 4383
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | biden - trump - vaxxed - states - everything | 20 | -1_biden_trump_vaxxed_states |
| 0 | donetsk - mariupol - russia - ukrainians - zelensky | 2605 | 0_donetsk_mariupol_russia_ukrainians |
| 1 | pasveikinti - niekada - susiklausymo - rytas - metu | 166 | 1_pasveikinti_niekada_susiklausymo_rytas |
| 2 | hs1jq13jv6 - anonymous - 9129529 - ww - 1f0a66 | 144 | 2_hs1jq13jv6_anonymous_9129529_ww |
| 3 | countdown - boom - gematria - djt - 11th | 114 | 3_countdown_boom_gematria_djt |
| 4 | pfizer - vaccinated - covid - vaers - antibodies | 110 | 4_pfizer_vaccinated_covid_vaers |
| 5 | lietuva - jankevicius - kgb - sovietu - landzbergiu | 106 | 5_lietuva_jankevicius_kgb_sovietu |
| 6 | awakening - spiritual - righteousness - yehovah - greatness | 86 | 6_awakening_spiritual_righteousness_yehovah |
| 7 | twitter - parler - trump - starlink - banned | 86 | 7_twitter_parler_trump_starlink |
| 8 | нет - белоруссии - ростова - минобороны - солдат | 83 | 8_нет_белоруссии_ростова_минобороны |
| 9 | epstein - trafficked - ghislaine - pedophiles - clinton | 64 | 9_epstein_trafficked_ghislaine_pedophiles |
| 10 | petrodollar - greenback - federal - cryptocurrencies - cbdcs | 52 | 10_petrodollar_greenback_federal_cryptocurrencies |
| 11 | youtube - musutv - bukimevieningi - platformoje - transliacija | 52 | 11_youtube_musutv_bukimevieningi_platformoje |
| 12 | ballots - recount - maricopa - counties - senate | 47 | 12_ballots_recount_maricopa_counties |
| 13 | trumpas - prezidentui - bidenas - konstitucija - igaliojimus | 47 | 13_trumpas_prezidentui_bidenas_konstitucija |
| 14 | infekcijas - yersinia - staphylococcus - listeria - monocytogenes | 46 | 14_infekcijas_yersinia_staphylococcus_listeria |
| 15 | military - trump - coups - washington - united | 45 | 15_military_trump_coups_washington |
| 16 | spygate - dossier - dnc - subpoenas - fisa | 39 | 16_spygate_dossier_dnc_subpoenas |
| 17 | globalists - guterres - crises - nwo - klaus | 35 | 17_globalists_guterres_crises_nwo |
| 18 | komunistiniai - konservatyvizmas - nepasitikiu - konspiracininkus - kazachstana | 33 | 18_komunistiniai_konservatyvizmas_nepasitikiu_konspiracininkus |
| 19 | bioweapon - ukraine - disinformation - soros - pentagon | 32 | 19_bioweapon_ukraine_disinformation_soros |
| 20 | wuhan - pyongyang - lockdowns - pudong - typhoon | 30 | 20_wuhan_pyongyang_lockdowns_pudong |
| 21 | jairbolsonaro - 38presidente - santos - bolivia - argentinean | 29 | 21_jairbolsonaro_38presidente_santos_bolivia |
| 22 | hydroxychloroquine - molnupiravir - ivermectin - vaxx - antioxidant | 29 | 22_hydroxychloroquine_molnupiravir_ivermectin_vaxx |
| 23 | trump - doj - unredacted - subpoena - martha | 29 | 23_trump_doj_unredacted_subpoena |
| 24 | transgenderism - gay - rupaul - psychopaths - cyborgs | 29 | 24_transgenderism_gay_rupaul_psychopaths |
| 25 | gendarmerie - paris - protesters - strike - marched | 28 | 25_gendarmerie_paris_protesters_strike |
| 26 | satan - luciferian - baal - blavatsky - cronus | 27 | 26_satan_luciferian_baal_blavatsky |
| 27 | trump - impeachment - comey - defrauded - swalwell | 27 | 27_trump_impeachment_comey_defrauded |
| 28 | trump - tippytoppatriot - dow - savior - highest | 26 | 28_trump_tippytoppatriot_dow_savior |
| 29 | gazprom - eurozone - sanctions - rubles - preussenelektra | 24 | 29_gazprom_eurozone_sanctions_rubles |
| 30 | biden - unmasked - lectern - cpap - brandon | 24 | 30_biden_unmasked_lectern_cpap |
| 31 | pyramids - antarctica - monastery - chimney - smithsonian | 24 | 31_pyramids_antarctica_monastery_chimney |
| 32 | ketvergiai - laisva - susisiekti - pasivaikscioti - prisideti | 23 | 32_ketvergiai_laisva_susisiekti_pasivaikscioti |
| 33 | zuckerberg - metaverse - meghan - fakebook - clownsnewsnetwork | 22 | 33_zuckerberg_metaverse_meghan_fakebook |
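
The frequencies in the table above can be checked against the stated number of training documents and turned into per-topic shares. The sketch below copies the counts by hand from the table. `TOPIC_FREQUENCIES` and `topic_shares` are names introduced for this example only.

```python
# Document counts per topic ID, copied from the table above (-1 is the outlier topic).
TOPIC_FREQUENCIES = {
    -1: 20, 0: 2605, 1: 166, 2: 144, 3: 114, 4: 110, 5: 106, 6: 86, 7: 86,
    8: 83, 9: 64, 10: 52, 11: 52, 12: 47, 13: 47, 14: 46, 15: 45, 16: 39,
    17: 35, 18: 33, 19: 32, 20: 30, 21: 29, 22: 29, 23: 29, 24: 29, 25: 28,
    26: 27, 27: 27, 28: 26, 29: 24, 30: 24, 31: 24, 32: 23, 33: 22,
}


def topic_shares(frequencies):
    """Map each topic ID to its fraction of all documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


shares = topic_shares(TOPIC_FREQUENCIES)
print(f"total documents: {sum(TOPIC_FREQUENCIES.values())}")
print(f"topic 0 share: {shares[0]:.1%}")
print(f"outlier share: {shares[-1]:.1%}")
```

The counts sum to 4383, matching the number of training documents, and topic 0 alone covers roughly 59% of them, so the remaining topics are small by comparison.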

Training hyperparameters

  • calculate_probabilities: True
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
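
As a rough illustration, the values above map onto keyword arguments of the `BERTopic` constructor. This is a sketch under stated assumptions, not the authors' training script: the embedding model and the UMAP and HDBSCAN settings are not listed here, so `build_untrained_model` (a hypothetical helper) leaves the embedding model to the caller, and the zero-shot arguments assume a BERTopic release that supports them.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(embedding_model=None):
    """Create an unfitted BERTopic instance with the listed settings.

    Hypothetical helper: the embedding model used for this checkpoint is
    not stated in the list above, so it is left as a caller-supplied argument.
    """
    # Imported lazily so the settings dict is usable without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

Fitting such an instance on new documents would not necessarily reproduce the topics of this checkpoint, since the training data and the unlisted components also shape the result.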

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.40
  • UMAP: 0.5.7
  • Pandas: 2.2.3
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.3.1
  • Transformers: 4.46.3
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.12
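
When loading the model in a different environment, it can help to compare installed package versions against the list above. The following is a minimal sketch using only the standard library. The distribution names (for example `umap-learn` for UMAP) are assumptions about how these packages are published on PyPI, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above, keyed by the distribution names assumed on PyPI.
LISTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def version_mismatches(listed):
    """Return {package: (listed, installed)} for differing or missing packages."""
    mismatches = {}
    for name, wanted in listed.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in version_mismatches(LISTED_VERSIONS).items():
    print(f"{name}: listed {wanted}, installed {found or 'not installed'}")
```

A mismatch does not by itself mean the model will fail to load; it only flags where the environment differs from the one recorded here.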