Prediction of Securities

This project contains the files produced while preparing the coursework.

Project Structure

data/stocks

  • CSV Files: Various CSV files containing stock data and sentiment scores.

    • nytimes.csv: sentiment scores from NYTimes.
    • reuters.csv: sentiment scores from Reuters.
    • final_data/: Contains the final processed stock data for specific companies, plus sentiment scores from NYT and Reuters. These files were used on Kaggle to optimise and test the models.
      • AAPL.csv: Apple Inc. stock data.
      • JPM.csv: JPMorgan Chase & Co. stock data.
      • PG.csv: Procter & Gamble Co. stock data.
      • TM.csv: Toyota Motor Corporation stock data.
      • XOM.csv: Exxon Mobil Corporation stock data.
  • Python Scripts: Scripts related to data preprocessing and sentiment analysis.

    • preprocessing.py: Script for preprocessing stock data.
    • stock_loader.py: Script for loading stock data.
    • __init__.py: Initialization file for the package.

notebooks

  • Local: Contains local Jupyter notebooks that were used for the early stages of optimisation and testing.
  • nyt_titles_loader.ipynb: One of the web-scraping notebooks. There were too many to include, and they were spread across Colab and Kaggle.
  • Other files showcase early attempts to use PyTorch with Optuna to tune RNNs.
  • Kaggle: Contains files from Kaggle: later-stage optimisation using a GPU, with pruning callbacks for Keras and XGBoost.
  • regression_plots_and_metrics.ipynb: final values and plots used in the report
  • classification_plots_and_metrics.ipynb: final values and plots used in the report

rnn_model

  • Using Keras: Contains RNN models implemented using Keras.

    • models.py: Model getters
    • optimise.py: Functions for Keras optimisation only; the optimisation runs themselves were done on Kaggle using a Tesla P100 GPU.
    • __init__.py: Initialization file for the package.
  • Using Torch: Contains RNN models implemented using PyTorch.

    • classification.py: Classification RNN models using PyTorch.
    • early_stopping.py: Early stopping utility for RNN models in PyTorch.
    • loaders.py: Data loaders for RNN models in PyTorch.
    • optimise.py: Optimization routines for RNN models in PyTorch.
    • regression.py: Regression RNN models using PyTorch.
    • train_eval.py: Training and evaluation scripts for RNN models in PyTorch.
    • __init__.py: Initialization file for the package.
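For readers unfamiliar with early stopping, the idea behind a utility like early_stopping.py can be sketched as follows. This is a minimal illustration in plain Python, not the code from this repository: the class name `EarlyStopping`, the method `step`, and the parameters `patience` and `min_delta` are all assumptions made for the example.

```python
class EarlyStopping:
    """Hypothetical helper: stop training when validation loss stops improving."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Usage with made-up validation losses (illustrative values, not project results).
stopper = EarlyStopping(patience=2)
losses = [1.0, 0.8, 0.85, 0.9, 0.7]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a real training loop, `step` would be called once per epoch with the validation loss, typically alongside saving the best model weights so far.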

utils

  • Utility Scripts: Various utility scripts to support the main functionality.
    • sequences.py: Utility functions for getting sequences.
    • stock_loader_utils.py: Utility functions for loading stock data.
    • torch_train_util.py: Utility functions for training PyTorch models.
    • utils.py: General utility functions.
    • __init__.py: Initialization file for the package.
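
As an illustration of what "getting sequences" usually means for RNN inputs, the sketch below cuts a series into fixed-length windows, each paired with the value that follows it. It is a generic sliding-window example, not the contents of sequences.py: the function name `make_sequences` and its `window` parameter are hypothetical.

```python
def make_sequences(values, window):
    """Hypothetical helper: split a series into (input window, next value) pairs."""
    if window < 1:
        raise ValueError("window must be at least 1")
    inputs, targets = [], []
    # Each window of `window` consecutive values predicts the value right after it.
    for start in range(len(values) - window):
        inputs.append(values[start:start + window])
        targets.append(values[start + window])
    return inputs, targets


# Made-up closing prices, used only to show the shape of the output.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
X, y = make_sequences(closes, window=3)
```

A series of length n yields n - window pairs, so a series no longer than the window produces no training examples.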