---
license: unknown
---
# Prediction of Securities
This project contains the files produced while preparing the course work.
## Project Structure
### data/stocks
- **CSV Files**: Various CSV files containing stock data and sentiment scores.
  - `nytimes.csv`: sentiment scores from NYTimes.
  - `reuters.csv`: sentiment scores from Reuters.
- **final_data/**: Contains the final processed stock data for specific companies, plus sentiments from NYTimes and Reuters. These files were used on Kaggle to optimise and test models.
  - `AAPL.csv`: Apple Inc. stock data.
  - `JPM.csv`: JPMorgan Chase & Co. stock data.
  - `PG.csv`: Procter & Gamble Co. stock data.
  - `TM.csv`: Toyota Motor Corporation stock data.
  - `XOM.csv`: Exxon Mobil Corporation stock data.
- **Python Scripts**: Scripts related to data preprocessing and sentiment analysis.
  - `preprocessing.py`: Script for preprocessing stock data.
  - `stock_loader.py`: Script for loading stock data.
  - `__init__.py`: Initialization file for the package.
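As a rough illustration of how sentiment scores could be attached to stock rows to produce files like those in `final_data/`, here is a minimal sketch. It is not the code from `preprocessing.py`: the column names `date`, `close`, and `sentiment`, the helper `merge_sentiment`, and the neutral fill value of `0.0` are all assumptions made for the example, and the data is made up.

```python
import csv
import io


def merge_sentiment(price_rows, sentiment_rows, name):
    """Attach a sentiment score to each price row by matching on date.

    Dates without a sentiment entry get 0.0 (an assumed neutral default).
    """
    by_date = {row["date"]: float(row["sentiment"]) for row in sentiment_rows}
    merged = []
    for row in price_rows:
        out = dict(row)  # copy so the input rows are left untouched
        out[name] = by_date.get(row["date"], 0.0)
        merged.append(out)
    return merged


# In-memory stand-ins for a price file and a sentiment file (made-up values).
prices = list(csv.DictReader(io.StringIO(
    "date,close\n2020-01-02,10.0\n2020-01-03,11.0\n")))
nyt = list(csv.DictReader(io.StringIO(
    "date,sentiment\n2020-01-02,0.5\n")))

merged = merge_sentiment(prices, nyt, "nyt_sentiment")
```

Calling the helper once per news source would give each row one sentiment column per source.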
### notebooks
- **Local**: Contains local Jupyter notebooks that were used for the early stages of optimisation and testing.
  - `nyt_titles_loader.ipynb`: one of the notebooks used for web scraping; there were too many to include, and they were spread out across Colab and Kaggle.
  - Other files showcase early attempts to use PyTorch with Optuna to tune RNNs.
- **Kaggle**: Contains notebooks from Kaggle used for later-stage optimisation on GPU, with pruning callbacks for Keras and XGBoost.
  - `regression_plots_and_metrics.ipynb`: final values and plots used in the report.
  - `classification_plots_and_metrics.ipynb`: final values and plots used in the report.
### rnn_model
- **Using Keras**: Contains RNN models implemented using Keras.
  - `models.py`: Model getters.
  - `optimise.py`: Optimisation functions for Keras (functions only; the optimisation itself was run on Kaggle using their Tesla P100 GPU).
  - `__init__.py`: Initialization file for the package.
- **Using Torch**: Contains RNN models implemented using PyTorch.
  - `classification.py`: Classification RNN models using PyTorch.
  - `early_stopping.py`: Early stopping utility for RNN models in PyTorch.
  - `loaders.py`: Data loaders for RNN models in PyTorch.
  - `optimise.py`: Optimisation routines for RNN models in PyTorch.
  - `regression.py`: Regression RNN models using PyTorch.
  - `train_eval.py`: Training and evaluation scripts for RNN models in PyTorch.
  - `__init__.py`: Initialization file for the package.
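For readers unfamiliar with early stopping, the idea behind a utility such as `early_stopping.py` can be sketched in a few lines of framework-independent Python. This is a generic illustration, not the project's implementation: the class name `EarlyStopping`, its `patience` and `min_delta` parameters, and the loss values are assumptions chosen for the example.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses standing in for a real training loop.
stopper = EarlyStopping(patience=2)
losses = [1.0, 0.8, 0.9, 0.85, 0.7]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a real training loop, the model weights from the best epoch would typically be saved whenever `best_loss` improves, so they can be restored after stopping.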
### utils
- **Utility Scripts**: Various utility scripts to support the main functionality.
  - `sequences.py`: Utility functions for getting sequences.
  - `stock_loader_utils.py`: Utility functions for loading stock data.
  - `torch_train_util.py`: Utility functions for training PyTorch models.
  - `utils.py`: General utility functions.
  - `__init__.py`: Initialization file for the package.
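RNNs are usually trained on fixed-length windows of a time series, which is presumably what "getting sequences" refers to. The sketch below shows one common way to build such windows; it is an assumption about the general technique, not the contents of `sequences.py`, and the function name `make_sequences` and its `window` parameter are hypothetical.

```python
def make_sequences(values, window):
    """Split a series into overlapping input windows and next-step targets.

    Each input holds `window` consecutive values; its target is the value
    that immediately follows the window.
    """
    inputs, targets = [], []
    for start in range(len(values) - window):
        inputs.append(values[start:start + window])
        targets.append(values[start + window])
    return inputs, targets


# Toy series standing in for a column of closing prices.
X, y = make_sequences([1, 2, 3, 4, 5], window=3)
# X == [[1, 2, 3], [2, 3, 4]], y == [4, 5]
```

A series shorter than or equal to `window` yields no sequences, so callers should check the input length first.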