MARTINI_enrich_BERTopic_mediaandcensoring

This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

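from bertopic import BERTopic

# Download and load the model from the Hugging Face Hub
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_mediaandcensoring")

# Show a table of all topics with their sizes and labels
topic_model.get_topic_info()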
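
Once the model is loaded, it can also assign topics to new text. The helper below is a minimal sketch, not part of the model card: `assign_topics` is a hypothetical name, and it relies only on BERTopic's `transform` method, which returns the predicted topic IDs together with probabilities (which may be `None`). Depending on how the model was saved, `BERTopic.load` may need an `embedding_model` argument before `transform` works; that detail is an assumption to check against your setup.

```python
def assign_topics(topic_model, docs):
    """Return (document, topic_id) pairs for new documents.

    `topic_model` is expected to be a loaded BERTopic instance;
    only its `transform` method is used here.
    """
    # transform() returns (topic IDs, probabilities); the second value
    # is ignored in this sketch.
    topics, _ = topic_model.transform(docs)
    # Topic -1 is BERTopic's outlier topic (documents that fit no cluster).
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

With the model loaded as `topic_model`, calling `assign_topics(topic_model, ["some new post"])` gives one pair per input, and `topic_model.get_topic(topic_id)` then lists the keywords behind each assigned ID.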

Topic overview

  • Number of topics: 19
  • Number of training documents: 2142
Overview of all topics:

Topic ID | Topic Keywords | Topic Frequency | Label
-1 | bbc - putin - misinformation - banned - tiktok | 22 | -1_bbc_putin_misinformation_banned
0 | whatsapp - spying - vpn - honeypot - mail | 1197 | 0_whatsapp_spying_vpn_honeypot
1 | wikileaks - extradition - mossad - prisoner - belmarsh | 104 | 1_wikileaks_extradition_mossad_prisoner
2 | zakharova - russia - sputnik - lithuania - journalists | 91 | 2_zakharova_russia_sputnik_lithuania
3 | donetsk - mariupol - journalists - atrocities - slavyansk | 86 | 3_donetsk_mariupol_journalists_atrocities
4 | gaza - palestinians - israelis - journalists - ukrainehumanrightsabuses | 82 | 4_gaza_palestinians_israelis_journalists
5 | ukraine - missiles - fakes - bayraktar - firefighters | 75 | 5_ukraine_missiles_fakes_bayraktar
6 | propaganda - dictatorship - psychopathic - gaslight - consciousness | 57 | 6_propaganda_dictatorship_psychopathic_gaslight
7 | musk - twitterigtruth - msnbc - soros - fired | 54 | 7_musk_twitterigtruth_msnbc_soros
8 | infodemic - fauci - plandemic - ivermectin - monkeypox | 54 | 8_infodemic_fauci_plandemic_ivermectin
9 | ukraine - luhansk - russians - kherson - counteroffensive | 50 | 9_ukraine_luhansk_russians_kherson
10 | bbcisthevirus - cnn - worldwidedemonstration - radio - salford | 49 | 10_bbcisthevirus_cnn_worldwidedemonstration_radio
11 | nazies - ukrainehumanrightsabuses - zelenskiy - sarmatians - whitewashing | 39 | 11_nazies_ukrainehumanrightsabuses_zelenskiy_sarmatians
12 | tucker - cnn - conservative - redacted - viewership | 38 | 12_tucker_cnn_conservative_redacted
13 | slander - murdoch - sanctioning - sunak - partygate | 36 | 13_slander_murdoch_sanctioning_sunak
14 | bbc - agentsoftruth - alarmism - mi5 - thelightpaperdistribution | 34 | 14_bbc_agentsoftruth_alarmism_mi5
15 | youtube - weforum - misinformation - deleted - senate | 28 | 15_youtube_weforum_misinformation_deleted
16 | disinformation - biden - dhs - jankowicz - secretary | 23 | 16_disinformation_biden_dhs_jankowicz
17 | tweets - suspended - musk - spamming - radicalization | 23 | 17_tweets_suspended_musk_spamming
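
The topic frequencies above add up to the 2142 training documents, and the 19 topics include the outlier topic -1. The sketch below only restates that arithmetic and turns the counts into shares; the frequencies are copied from the overview, while `TOPIC_FREQUENCIES` and `topic_shares` are illustrative names, not part of BERTopic.

```python
# Topic frequencies copied from the overview (topic ID -> document count).
TOPIC_FREQUENCIES = {
    -1: 22, 0: 1197, 1: 104, 2: 91, 3: 86, 4: 82, 5: 75, 6: 57, 7: 54,
    8: 54, 9: 50, 10: 49, 11: 39, 12: 38, 13: 36, 14: 34, 15: 28,
    16: 23, 17: 23,
}


def topic_shares(frequencies):
    """Return each topic's fraction of all documents."""
    total = sum(frequencies.values())
    return {topic_id: count / total for topic_id, count in frequencies.items()}


total_documents = sum(TOPIC_FREQUENCIES.values())
shares = topic_shares(TOPIC_FREQUENCIES)

# Print the three largest topics as percentages of the corpus.
largest = sorted(shares.items(), key=lambda item: item[1], reverse=True)[:3]
for topic_id, share in largest:
    print(f"topic {topic_id}: {share:.1%}")
```

The distribution is heavily skewed: topic 0 alone holds roughly 56% of the documents, while the outlier topic -1 holds about 1%.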

Training hyperparameters

  • calculate_probabilities: True
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
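
For anyone training a comparable model on their own data, the listed hyperparameters map directly onto keyword arguments of the `BERTopic` constructor. The following is a sketch under assumptions: the model card does not say which embedding model or training corpus was used, so `embedding_model` is left as a parameter and `build_untrained_model` is a hypothetical helper name.

```python
# Hyperparameters exactly as listed in this model card.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(embedding_model=None):
    """Create an unfitted BERTopic instance with the listed settings.

    The embedding model is an assumption: the card does not name one,
    so callers pass their own (or None for BERTopic's default).
    """
    # Imported lazily so the settings above can be inspected without
    # BERTopic installed (pip install -U bertopic).
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

Calling `build_untrained_model().fit_transform(docs)` on a list of strings would then train a new model; it will not reproduce the topics above unless the original documents and embedding model are used as well.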

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.40
  • UMAP: 0.5.7
  • Pandas: 2.2.3
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.3.1
  • Transformers: 4.46.3
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.12
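
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` is enough. This is a sketch, not an official tool: the PyPI distribution names (for example `umap-learn` for UMAP and `scikit-learn` for Scikit-Learn) are my mapping of the list entries, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions from the list above, keyed by PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def version_mismatches(expected):
    """Map each differing package to (expected, installed or None)."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty result from `version_mismatches(EXPECTED_VERSIONS)` means the environment matches the listed versions; differing versions may still work, but loading behavior is only documented for the ones shown here.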