MARTINI_enrich_BERTopic_Rus_truth

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_Rus_truth")

topic_model.get_topic_info()
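Once loaded, the model can also assign topics to unseen documents. The following is a minimal sketch, not the authors' own pipeline: `load_model` and `assign_topics` are hypothetical helper names, while `BERTopic.load`, `transform` and the `topic_labels_` attribute come from BERTopic. It assumes the repository stores a reference to its embedding model; if it does not, pass one via the `embedding_model` argument of `BERTopic.load`.

```python
def load_model(repo_id="AIDA-UPM/MARTINI_enrich_BERTopic_Rus_truth"):
    """Load the topic model from the Hugging Face Hub (needs network access)."""
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    return BERTopic.load(repo_id)


def assign_topics(topic_model, docs):
    """Return a (topic_id, label) pair for each document in `docs`."""
    # transform() returns the predicted topic ids and, optionally, probabilities.
    topics, _probabilities = topic_model.transform(docs)
    # topic_labels_ maps each topic id to its default label string.
    labels = topic_model.topic_labels_
    return [(int(t), labels.get(int(t), "unknown")) for t in topics]


if __name__ == "__main__":
    model = load_model()
    # Illustrative input only; any list of strings works.
    for topic_id, label in assign_topics(model, ["New sanctions were announced today."]):
        print(topic_id, label)
```

Topic `-1` is the outlier topic, so a document assigned to it did not match any of the learned topics closely enough.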

Topic overview

  • Number of topics: 9
  • Number of training documents: 995
Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | donetsk - zakharova - sanctions - mercenaries - nazi | 23 | -1_donetsk_zakharova_sanctions_mercenaries |
| 0 | mariupol - azov - missiles - evacuated - battalion | 604 | 0_mariupol_azov_missiles_evacuated |
| 1 | gazprombank - sanctions - euros - vladimir - poland | 141 | 1_gazprombank_sanctions_euros_vladimir |
| 2 | zelensky - volodymyr - scholz - slovakia - suzdaltsev | 73 | 2_zelensky_volodymyr_scholz_slovakia |
| 3 | kharkov - biolaboratories - pentagon - outbreak - borisovna | 37 | 3_kharkov_biolaboratories_pentagon_outbreak |
| 4 | beijing - taiwan - ambassador - zhang - sino | 36 | 4_beijing_taiwan_ambassador_zhang |
| 5 | marchers - nazis - victory - ivanovo - slovakia | 29 | 5_marchers_nazis_victory_ivanovo |
| 6 | lavrov - sanctions - kissinger - baltic - aggressors | 28 | 6_lavrov_sanctions_kissinger_baltic |
| 7 | missiles - howitzers - raytheon - supplied - cnn | 24 | 7_missiles_howitzers_raytheon_supplied |
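As a quick sanity check on the overview, the topic frequencies should add up to the number of training documents. The sketch below uses only the figures listed in this card; the variable names are illustrative.

```python
# Topic frequencies copied from the topic overview (topic id -> document count).
TOPIC_FREQUENCIES = {
    -1: 23,
    0: 604,
    1: 141,
    2: 73,
    3: 37,
    4: 36,
    5: 29,
    6: 28,
    7: 24,
}
TRAINING_DOCUMENTS = 995  # "Number of training documents" from the card

total = sum(TOPIC_FREQUENCIES.values())
outlier_share = TOPIC_FREQUENCIES[-1] / total
largest_topic = max(TOPIC_FREQUENCIES, key=TOPIC_FREQUENCIES.get)
largest_share = TOPIC_FREQUENCIES[largest_topic] / total

print(total == TRAINING_DOCUMENTS)                      # True
print(f"outliers: {outlier_share:.1%}")                 # outliers: 2.3%
print(f"topic {largest_topic}: {largest_share:.1%}")    # topic 0: 60.7%
```

The counts sum to 995, so every training document is accounted for. Topic 0 alone covers roughly three fifths of the corpus, and only 23 documents fall into the outlier topic `-1`.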

Training hyperparameters

  • calculate_probabilities: True
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
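The hyperparameters above map directly onto keyword arguments of the `BERTopic` constructor. The following sketch shows how an untrained model with the same settings might be created; `build_untrained_model` is a hypothetical helper. The card does not state which embedding, UMAP or HDBSCAN sub-models were used, so this does not reproduce the published model. A `language` of `None` usually means an embedding model was supplied explicitly, but that is an assumption here.

```python
# Hyperparameters copied from this card; keys are BERTopic constructor arguments.
HYPERPARAMS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(**overrides):
    """Create an unfitted BERTopic instance with the card's settings.

    Extra keyword arguments (for example embedding_model=...) override or
    extend the values above.
    """
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMS, **overrides})
```

Calling `build_untrained_model().fit_transform(docs)` on your own corpus would train a new model with these settings. Because UMAP is stochastic and the training data is not part of this card, the resulting topics will differ from the ones listed above.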

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.40
  • UMAP: 0.5.7
  • Pandas: 2.2.3
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.3.1
  • Transformers: 4.46.3
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.12
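Loading a serialized model tends to be most reliable when the local environment is close to the one listed above. The sketch below compares installed package versions against this list using only the standard library. `compare_versions` is a hypothetical helper, and the PyPI distribution names (for example `umap-learn` for UMAP) are the ones these packages are normally published under.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from this card, keyed by PyPI distribution name.
PINNED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def compare_versions(pinned=PINNED_VERSIONS):
    """Return {package: (expected, installed_or_None, versions_match)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {expected}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the report is only a starting point when troubleshooting.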