Update README.md
README.md
CHANGED
@@ -1,3 +1,54 @@
|
|
1 |
-
---
|
2 |
-
license: unknown
|
3 |
-
---
|
1 |
+
---
|
2 |
+
license: unknown
|
3 |
+
---
|
4 |
+
|
5 |
+
# Prediction of Securities
|
6 |
+
This project contains the files produced while working on the coursework.
|
7 |
+
## Project Structure
|
8 |
+
|
9 |
+
### data/stocks
|
10 |
+
- **CSV Files**: Various CSV files containing stock data and sentiment scores.
|
11 |
+
- `nytimes.csv`: sentiment scores from NYTimes.
|
12 |
+
- `reuters.csv`: sentiment scores from Reuters.
|
13 |
+
- **final_data/**: Contains the final processed stock data for specific companies, combined with sentiment scores from NYT and Reuters. These files were used on Kaggle to optimise and test the models.
|
14 |
+
- `AAPL.csv`: Apple Inc. stock data.
|
15 |
+
- `JPM.csv`: JPMorgan Chase & Co. stock data.
|
16 |
+
- `PG.csv`: Procter & Gamble Co. stock data.
|
17 |
+
- `TM.csv`: Toyota Motor Corporation stock data.
|
18 |
+
- `XOM.csv`: Exxon Mobil Corporation stock data.
|
19 |
+
|
20 |
+
- **Python Scripts**: Scripts related to data preprocessing and sentiment analysis.
|
21 |
+
- `preprocessing.py`: Script for preprocessing stock data.
|
22 |
+
- `stock_loader.py`: Script for loading stock data.
|
23 |
+
- `__init__.py`: Initialization file for the package.
|
24 |
+
|
25 |
+
### notebooks
|
26 |
+
- **Local**: Contains local Jupyter notebooks that were used for the early stages of optimisation and testing.
|
27 |
+
- `nyt_titles_loader.ipynb`: One of the web-scraping notebooks. There were too many to include, and they were spread across Colab and Kaggle.
|
28 |
+
- Other files showcase early attempts to use PyTorch with Optuna to tune RNNs.
|
29 |
+
- **Kaggle**: Contains files from Kaggle, covering later-stage optimisation on a GPU with pruning callbacks for Keras and XGBoost.
|
30 |
+
- `regression_plots_and_metrics.ipynb`: final values and plots used in the report
|
31 |
+
- `classification_plots_and_metrics.ipynb`: final values and plots used in the report
|
32 |
+
|
33 |
+
### rnn_model
|
34 |
+
- **Using Keras**: Contains RNN models implemented using Keras.
|
35 |
+
- `models.py`: Model getters (functions that return the Keras models).
|
36 |
+
- `optimise.py`: Optimisation functions for Keras. This file contains function definitions only; the optimisation itself was run on Kaggle using a Tesla P100 GPU.
|
37 |
+
- `__init__.py`: Initialization file for the package.
|
38 |
+
|
39 |
+
- **Using Torch**: Contains RNN models implemented using PyTorch.
|
40 |
+
- `classification.py`: Classification RNN models using PyTorch.
|
41 |
+
- `early_stopping.py`: Early stopping utility for RNN models in PyTorch.
|
42 |
+
- `loaders.py`: Data loaders for RNN models in PyTorch.
|
43 |
+
- `optimise.py`: Optimization routines for RNN models in PyTorch.
|
44 |
+
- `regression.py`: Regression RNN models using PyTorch.
|
45 |
+
- `train_eval.py`: Training and evaluation scripts for RNN models in PyTorch.
|
46 |
+
- `__init__.py`: Initialization file for the package.
|
47 |
+
|
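For readers unfamiliar with the idea behind an early stopping utility such as `early_stopping.py`, the following is a minimal, framework-free sketch. It is not the code from this repository: the class name `EarlyStopping`, the `step` method, and the `patience` and `min_delta` parameters are assumptions chosen for illustration.

```python
class EarlyStopping:
    """Hypothetical helper: signal a stop once validation loss stops improving."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0) -> None:
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember the new best and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            # No improvement this epoch.
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `step` would typically be called once per epoch with the validation loss, and the loop would break when it returns `True`.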
48 |
+
### utils
|
49 |
+
- **Utility Scripts**: Various utility scripts to support the main functionality.
|
50 |
+
- `sequences.py`: Utility functions for getting sequences.
|
51 |
+
- `stock_loader_utils.py`: Utility functions for loading stock data.
|
52 |
+
- `torch_train_util.py`: Utility functions for training PyTorch models.
|
53 |
+
- `utils.py`: General utility functions.
|
54 |
+
- `__init__.py`: Initialization file for the package.
|
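As an illustration of what "getting sequences" usually means when preparing time series for an RNN, here is a minimal sliding-window sketch. It is an assumption about the general technique, not the contents of `sequences.py`: the function name `make_sequences` and its `window` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_sequences(
    values: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Split a series into fixed-length input windows and next-step targets."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    inputs: List[List[float]] = []
    targets: List[float] = []
    # Each window of `window` consecutive values predicts the value that follows it.
    for start in range(len(values) - window):
        inputs.append(list(values[start:start + window]))
        targets.append(values[start + window])
    return inputs, targets
```

For example, `make_sequences([1, 2, 3, 4, 5], 3)` yields the inputs `[[1, 2, 3], [2, 3, 4]]` with targets `[4, 5]`. A series no longer than the window yields no sequences.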