astro-hep-bert / README.md
metadata
license: apache-2.0
language:
  - en
pipeline_tag: fill-mask
widget:
  - text: >-
      The Standard Model (SM) of [MASK] physics has been tested by many
      experiments over the last four decades and has been shown to successfully
      describe high energy particle interactions.
    example_title: particle physics
  - text: >-
      Clear evidence for the production of a neutral boson with a measured mass
      of [MASK].0 ± 0.4 (stat) ± 0.4 (sys) GeV is presented.
    example_title: 126.0 ± 0.4 (stat) ± 0.4 (sys) GeV
  - text: >-
      An excess of [MASK] is observed above the expected background, with a
      local significance of 5.0 standard deviations, at a mass near 125 GeV,
      signalling the production of a new particle.
    example_title: excess of events
  - text: >-
      On September 14, 2015 at 09:50:45 UTC the two [MASK] of the Laser
      Interferometer Gravitational-Wave Observatory simultaneously observed a
      transient gravitational-wave signal.
    example_title: two detectors
  - text: >-
      These first images from the EHT achieve the highest [MASK] resolution in
      the history of ground-based VLBI.
    example_title: angular resolution
  - text: >-
      We propose a comprehensive theory of [MASK] matter that explains the
      recent proliferation of unexpected observations in high-energy
      astrophysics.
    example_title: dark matter
  - text: >-
      Formation of galaxy clusters corresponds to the collapse of the largest
      gravitationally bound overdensities in the initial [MASK] field and is
      accompanied by the most energetic phenomena since the Big Bang and by the
      complex interplay between gravity-induced dynamics of collapse and
      baryonic processes associated with galaxy formation.
    example_title: initial density field
  - text: >-
      The Event [MASK] Telescope (EHT) has led to the first images of a
      supermassive black hole, revealing the central compact objects in the
      elliptical galaxy M87 and the Milky Way.
    example_title: Event Horizon Telescope
datasets:
  - wikipedia
  - bookcorpus
  - arnosimons/astro-hep-corpus
tags:
  - arXiv
  - astrophysics
  - conceptual analysis
  - epistemic change
  - high-energy physics (HEP)
  - history of science
  - semantic shift detection
  - sociology of science
  - philosophy of science
  - physics
  - word embeddings

Model Card for Astro-HEP-BERT

Astro-HEP-BERT is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built on Google's bert-base-uncased, the model was further trained for three epochs on 21.84 million paragraphs drawn from more than 600,000 scholarly arXiv articles on astrophysics and/or HEP. The sole training objective was masked language modeling.
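As an illustration of the primary use case, the following minimal sketch extracts a contextualized embedding for one word in a sentence using the Hugging Face `transformers` library. Several things here are assumptions rather than details taken from this card: the Hub ID `arnosimons/astro-hep-bert` (inferred from the repository and author names), the choice of the last hidden layer, and averaging over the word's WordPiece tokens. The helper names `find_span`, `cosine`, and `word_embedding` are hypothetical.

```python
import math

# Assumed Hub ID, inferred from the repository and author names on this page.
MODEL_ID = "arnosimons/astro-hep-bert"


def find_span(tokens, target):
    """Return (start, end) of the first occurrence of `target` in `tokens`, else None."""
    n = len(target)
    if n == 0:
        return None
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            return i, i + n
    return None


def cosine(u, v):
    """Cosine similarity between two equal-length vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_embedding(sentence, word, model_id=MODEL_ID):
    """Average the last-layer hidden states over the WordPiece tokens of `word`.

    Using the last layer and mean pooling is an illustrative assumption,
    not a procedure documented in this model card.
    """
    # Imported lazily so the pure-Python helpers above work without torch installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    enc = tokenizer(sentence, return_tensors="pt", truncation=True)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    span = find_span(tokens, tokenizer.tokenize(word))
    if span is None:
        raise ValueError(f"{word!r} not found in the tokenized sentence")

    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # shape: (seq_len, hidden_size)
    return hidden[span[0]:span[1]].mean(dim=0).tolist()
```

Calling `word_embedding` on two sentences that contain the same term and comparing the results with `cosine` gives a rough measure of how similarly the term is used in the two contexts. Each call reloads the model from the Hub, so for larger analyses it would make sense to load the tokenizer and model once and reuse them.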

The Astro-HEP-BERT project demonstrates that training a customized bidirectional transformer for computational conceptual analysis in the history, philosophy, and sociology of science is feasible as an open-source endeavor without a substantial budget. Using only freely available code, weights, and text inputs, the entire training process was conducted on a single MacBook Pro laptop (M2/96GB).

For further insights into the model, the corpus, and the underlying research project (Network Epistemology in Practice), please refer to the Astro-HEP-BERT paper [link coming soon].

Model Details