suchkow committed on
Commit
3e188a2
1 Parent(s): 25e7dcb

Update README.md

  1. README.md +54 -3
README.md CHANGED
@@ -1,3 +1,54 @@
1
- ---
2
- license: unknown
3
- ---
1
+ ---
2
+ license: unknown
3
+ ---
4
+
5
+ # Prediction of Securities
6
+ This project contains the files produced while preparing the coursework.
7
+ ## Project Structure
8
+
9
+ ### data/stocks
10
+ - **CSV Files**: Various CSV files containing stock data and sentiment scores.
11
+ - `nytimes.csv`: sentiment scores from NYTimes.
12
+ - `reuters.csv`: sentiment scores from Reuters.
13
+ - **final_data/**: Final processed stock data for specific companies, combined with sentiment scores from NYT and Reuters. These files were used on Kaggle to optimise and test the models.
14
+ - `AAPL.csv`: Apple Inc. stock data.
15
+ - `JPM.csv`: JPMorgan Chase & Co. stock data.
16
+ - `PG.csv`: Procter & Gamble Co. stock data.
17
+ - `TM.csv`: Toyota Motor Corporation stock data.
18
+ - `XOM.csv`: Exxon Mobil Corporation stock data.
19
+
20
+ - **Python Scripts**: Scripts related to data preprocessing and sentiment analysis.
21
+ - `preprocessing.py`: Script for preprocessing stock data.
22
+ - `stock_loader.py`: Script for loading stock data.
23
+ - `__init__.py`: Initialization file for the package.
24
+
25
+ ### notebooks
26
+ - **Local**: Local Jupyter notebooks used in the early stages of optimisation and testing.
27
+ - `nyt_titles_loader.ipynb`: One of the web-scraping notebooks. The rest are not included because there were too many and they were spread across Colab and Kaggle.
28
+ - The other notebooks show early attempts to tune RNNs using PyTorch with Optuna.
29
+ - **Kaggle**: Notebooks from Kaggle covering the later stages of optimisation, using a GPU and pruning callbacks for Keras and XGBoost.
30
+ - `regression_plots_and_metrics.ipynb`: Final regression values and plots used in the report.
31
+ - `classification_plots_and_metrics.ipynb`: Final classification values and plots used in the report.
32
+
33
+ ### rnn_model
34
+ - **Using Keras**: Contains RNN models implemented using Keras.
35
+ - `models.py`: Model getters
36
+ - `optimise.py`: Optimisation functions for Keras. This file only defines the functions; the optimisation itself was run on Kaggle using a Tesla P100 GPU.
37
+ - `__init__.py`: Initialization file for the package.
38
+
39
+ - **Using Torch**: Contains RNN models implemented using PyTorch.
40
+ - `classification.py`: Classification RNN models using PyTorch.
41
+ - `early_stopping.py`: Early stopping utility for RNN models in PyTorch.
42
+ - `loaders.py`: Data loaders for RNN models in PyTorch.
43
+ - `optimise.py`: Optimization routines for RNN models in PyTorch.
44
+ - `regression.py`: Regression RNN models using PyTorch.
45
+ - `train_eval.py`: Training and evaluation scripts for RNN models in PyTorch.
46
+ - `__init__.py`: Initialization file for the package.
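
The README describes `early_stopping.py` only as an early stopping utility and does not show its contents. For orientation, here is a minimal, framework-independent sketch of what such a helper typically does. The class name `EarlyStopping`, its `patience` and `min_delta` parameters, and the `step` method are illustrative assumptions, not the repository's actual API.

```python
class EarlyStopping:
    """Signal when a monitored validation loss has stopped improving.

    Hypothetical helper for illustration; not the repository's code.
    """

    def __init__(self, patience: int = 5, min_delta: float = 0.0) -> None:
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Usage: made-up validation losses standing in for a real training loop.
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate([0.9, 0.7, 0.71, 0.72, 0.5]):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a PyTorch training loop, `step` would be called once per epoch with the validation loss, and the model weights from the best epoch would usually be saved alongside.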
47
+
48
+ ### utils
49
+ - **Utility Scripts**: Various utility scripts to support the main functionality.
50
+ - `sequences.py`: Utility functions for getting sequences.
51
+ - `stock_loader_utils.py`: Utility functions for loading stock data.
52
+ - `torch_train_util.py`: Utility functions for training PyTorch models.
53
+ - `utils.py`: General utility functions.
54
+ - `__init__.py`: Initialization file for the package.
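
The README does not show what `sequences.py` contains. As a rough illustration of what "getting sequences" usually means when preparing a price series for an RNN, the sketch below builds sliding windows and next-step targets. The function name `make_sequences` and its `window` and `horizon` parameters are assumptions made for this example, not names taken from the repository.

```python
from typing import List, Sequence, Tuple


def make_sequences(
    values: Sequence[float], window: int, horizon: int = 1
) -> Tuple[List[List[float]], List[float]]:
    """Split a 1-D series into overlapping input windows and targets.

    Each input holds `window` consecutive values; its target is the value
    `horizon` steps after the last element of that window.
    Hypothetical helper for illustration; not the repository's code.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs: List[List[float]] = []
    targets: List[float] = []
    # Last start index that still leaves room for the window and its target.
    last_start = len(values) - window - horizon
    for start in range(last_start + 1):
        end = start + window
        inputs.append(list(values[start:end]))
        targets.append(values[end + horizon - 1])
    return inputs, targets


# Usage with made-up closing prices.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
X, y = make_sequences(prices, window=3)
```

A series shorter than `window + horizon` yields empty lists rather than an error, which keeps the helper safe to call on short per-ticker slices.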