Commit d83d12e
Parent(s): a6f4731
figure
- README.md +4 -66
- exoplanet_keywords.png +0 -0
README.md
CHANGED
@@ -10,7 +10,7 @@ A deep language model, GPT-2, is trained on scientific manuscripts from NASA's A
|
|
10 |
```python
|
11 |
from transformers import pipeline
|
12 |
|
13 |
-
exo = pipeline('text-generation',model='gpt2-exomachina
|
14 |
machina = lambda text: exo(text)[0]['generated_text']
|
15 |
|
16 |
print(machina("Transiting exoplanets are"))
|
@@ -19,7 +19,7 @@ print(machina("Transiting exoplanets are"))
|
|
19 |
## Training Samples
|
20 |
~40,000 abstracts from NASA's Astrophysics Data System (ADS) and arXiv.
|
21 |
|
22 |
-
[embedded image; original link not recoverable]
|
94 |
-
|
95 |
-
## Pre-processing
|
96 |
-
Extract abstracts from the database and create a new file where each line is a new sample. Try a new tokenizer.
|
97 |
-
|
98 |
-
## Things to improve
|
99 |
-
|
100 |
-
## Export the models to an iOS application
|
101 |
-
|
102 |
-
|
103 |
-
References
|
104 |
-
- https://huggingface.co/roberta-base
|
105 |
-
- GPT-2 generative text
|
106 |
-
- https://huggingface.co/docs
|
107 |
-
- https://huggingface.co/transformers/training.html
|
108 |
-
- https://huggingface.co/transformers/notebooks.html
|
109 |
-
https://colab.research.google.com/drive/1vsCh85T_Od7RBwXfvh1iysV-vTxmWXQO#scrollTo=ljknzOlNoyrv
|
110 |
-
http://jalammar.github.io/illustrated-gpt2/
|
|
|
10 |
```python
|
11 |
from transformers import pipeline
|
12 |
|
13 |
+
exo = pipeline('text-generation',model='pearsonkyle/gpt2-exomachina', tokenizer='gpt2', config={'max_length':1600})
|
14 |
machina = lambda text: exo(text)[0]['generated_text']
|
15 |
|
16 |
print(machina("Transiting exoplanets are"))
|
|
|
19 |
## Training Samples
|
20 |
~40,000 abstracts from NASA's Astrophysics Data System (ADS) and arXiv.
|
21 |
|
22 |
+
[embedded image; original link not recoverable]
|
23 |
|
24 |
A few generated samples are below:
|
25 |
|
|
|
28 |
`that large, deciduous forest fires are composed of mostly dry, unprocessed material that is distributed in a nearly patchy fashion. The distributions of these fires are correlated with temperature, and also with vegetation...`
|
29 |
- *Directly imaged exoplanets probe* `key aspects of planet formation and evolution theory, as well as atmospheric and interior physics. These insights have led to numerous direct imaging instruments for exoplanets, many using polarimetry. However, current instruments take`
|
30 |
|
31 |
+
A scrape of roughly 2 hours found articles from the following publications, listed in descending order of count:
|
32 |
```
|
33 |
5364 - The Astrophysical Journal
|
34 |
3365 - Astronomy and Astrophysics
|
|
|
45 |
129 - Planetary and Space Science
|
46 |
114 - Space Science Reviews
|
47 |
109 - Geophysical Research Letters
|
48 |
+
```
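The commit does not show how the per-publication counts above were produced. As a minimal sketch, assuming each scraped record is a dict with a hypothetical `pub` field holding the journal name, a tally in the same `count - journal` format could look like this:

```python
from collections import Counter


def tally_publications(records):
    """Count records per journal; return (count, journal) pairs, most frequent first."""
    # Skip records whose journal name is missing or empty.
    counts = Counter(rec["pub"] for rec in records if rec.get("pub"))
    return [(n, pub) for pub, n in counts.most_common()]


# Hypothetical sample records; the "pub" field name is an assumption,
# not something taken from the repository.
records = [
    {"pub": "The Astrophysical Journal"},
    {"pub": "Astronomy and Astrophysics"},
    {"pub": "The Astrophysical Journal"},
    {"pub": None},
]

for n, pub in tally_publications(records):
    print(f"{n} - {pub}")
```

`Counter.most_common()` already returns entries sorted by descending count, so no separate sort step is needed.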
|
exoplanet_keywords.png
ADDED