|
--- |
|
license: unknown |
|
--- |
|
|
|
# Prediction of Securities |
|
This project contains the files produced while working on the coursework. |
|
## Project Structure |
|
|
|
### data/stocks |
|
- **CSV Files**: Various CSV files containing stock data and sentiment scores. |
|
- `nytimes.csv`: Sentiment scores from NYTimes. |
|
- `reuters.csv`: Sentiment scores from Reuters. |
|
- **final_data/**: Contains the final processed stock data for specific companies, combined with sentiment scores from NYT and Reuters. These files were used on Kaggle to optimise and test models. |
|
- `AAPL.csv`: Apple Inc. stock data. |
|
- `JPM.csv`: JPMorgan Chase & Co. stock data. |
|
- `PG.csv`: Procter & Gamble Co. stock data. |
|
- `TM.csv`: Toyota Motor Corporation stock data. |
|
- `XOM.csv`: Exxon Mobil Corporation stock data. |
|
|
|
- **Python Scripts**: Scripts related to data preprocessing and sentiment analysis. |
|
- `preprocessing.py`: Script for preprocessing stock data. |
|
- `stock_loader.py`: Script for loading stock data. |
|
- `__init__.py`: Initialization file for the package. |
|
|
|
### notebooks |
|
- **Local**: Contains local Jupyter notebooks that were used in the early stages of optimisation and testing. |
|
- `nyt_titles_loader.ipynb`: One of the web-scraping notebooks. There were too many to include, and they were spread across Colab and Kaggle. |
|
- The other files show early attempts to tune RNNs using PyTorch with Optuna. |
|
- **Kaggle**: Contains notebooks from Kaggle used in the later stages of optimisation, run on a GPU with the pruning callbacks for Keras and XGBoost. |
|
- `regression_plots_and_metrics.ipynb`: Final regression values and plots used in the report. |
|
- `classification_plots_and_metrics.ipynb`: Final classification values and plots used in the report. |
|
|
|
### rnn_model |
|
- **Using Keras**: Contains RNN models implemented using Keras. |
|
- `models.py`: Getter functions that return the Keras models. |
|
- `optimise.py`: Optimisation functions for Keras only. The optimisation itself was run on Kaggle using its Tesla P100 GPU. |
|
- `__init__.py`: Initialization file for the package. |
|
|
|
- **Using Torch**: Contains RNN models implemented using PyTorch. |
|
- `classification.py`: Classification RNN models using PyTorch. |
|
- `early_stopping.py`: Early stopping utility for RNN models in PyTorch. |
|
- `loaders.py`: Data loaders for RNN models in PyTorch. |
|
- `optimise.py`: Optimisation routines for RNN models in PyTorch. |
|
- `regression.py`: Regression RNN models using PyTorch. |
|
- `train_eval.py`: Training and evaluation scripts for RNN models in PyTorch. |
|
- `__init__.py`: Initialization file for the package. |
|
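For readers unfamiliar with the idea behind an early stopping utility such as `early_stopping.py`, the following is a minimal, framework-free sketch. It is not the code from this repository: the class name `EarlyStopping`, the parameters `patience` and `min_delta`, and the helper `epochs_run` are all assumptions made for illustration.

```python
class EarlyStopping:
    """Hypothetical helper: signal a stop once validation loss stops improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience=3):
    """Return the 1-based epoch at which training stops for a list of losses."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(val_losses)
```

In a real training loop, `step` would typically be called once per epoch with the validation loss, and the loop would break when it returns `True`.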
|
|
### utils |
|
- **Utility Scripts**: Various utility scripts to support the main functionality. |
|
- `sequences.py`: Utility functions for getting sequences. |
|
- `stock_loader_utils.py`: Utility functions for loading stock data. |
|
- `torch_train_util.py`: Utility functions for training PyTorch models. |
|
- `utils.py`: General utility functions. |
|
- `__init__.py`: Initialization file for the package. |
|
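As an illustration of what "getting sequences" usually means when preparing time series for an RNN, here is a small sliding-window sketch. It is an assumption about the general technique, not the contents of `sequences.py`: the function name `make_sequences` and its `window` parameter are hypothetical.

```python
import numpy as np


def make_sequences(values, window):
    """Split a series into overlapping input windows and next-step targets.

    Hypothetical helper: each input holds `window` consecutive observations,
    and the target is the observation that immediately follows that window.
    """
    values = np.asarray(values, dtype=float)
    if window < 1 or window >= len(values):
        raise ValueError("window must be between 1 and len(values) - 1")
    # One window per start index that still leaves a following target value.
    X = np.stack([values[i:i + window] for i in range(len(values) - window)])
    y = values[window:]
    return X, y
```

For example, `make_sequences([1, 2, 3, 4, 5], window=3)` yields the inputs `[1, 2, 3]` and `[2, 3, 4]` with the targets `4` and `5`.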
|