---
title: Cs482 Toxic Tweets
emoji: 
colorFrom: green
colorTo: green
sdk: streamlit
sdk_version: 1.17.0
app_file: app.py
pinned: false
duplicated_from: kya5/milestone-3
---

# Finetuning Language Models - Toxic Tweets

[![Sync to Hugging Face hub](https://github.com/jjmakes/cs482-project/actions/workflows/sync_to_hf.yml/badge.svg)](https://github.com/jjmakes/cs482-project/actions/workflows/sync_to_hf.yml)

## [See the deployed App on HuggingFace](https://huggingface.co/spaces/jjmakes/cs482-toxic-tweets)

CS 482 Project - [Instructions](https://pantelis.github.io/data-mining/aiml-common/projects/nlp/finetuning-language-models-tweets/index.html)

## Milestone 1 - Development Environment

## OS Version

This project was developed on Ubuntu 20.04, so steps for installing and developing on Windows are not included.

```
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
```

## Docker Installation

The instructions below install Docker on Ubuntu 20.04.6.

```
## Update list of existing packages
sudo apt update

## Install prerequisite packages
sudo apt install apt-transport-https ca-certificates curl software-properties-common

## Add GPG key for the official Docker repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

## Add the Docker repository to APT sources
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"

## Confirm the install candidate comes from the Docker repo, not the default Ubuntu repo
apt-cache policy docker-ce

## Install docker
sudo apt install docker-ce

## Check if docker is running
sudo systemctl status docker

## Add the current user to the docker group (log out and back in for this to take effect)
sudo usermod -aG docker ${USER}
```

## VS Code Installation

The instructions below install VS Code on Ubuntu 20.04.6.

[Download the VS Code .deb package (64 bit)](https://code.visualstudio.com/download)

```
## Navigate to downloads folder
cd ~/Downloads

## Install VS Code (replace <file> with the downloaded package)
sudo apt install ./<file>.deb
```

## Creating a development environment with Docker

[Quick Start Development Container](https://code.visualstudio.com/docs/devcontainers/containers#_quick-start-try-a-development-container)

1. **F1**, _Dev Containers: Open Folder in Container..._
2. Select starting image

Some notable images worth using are:

- Alpine: Barebones Linux OS
- Python3: Container for developing Python 3 Applications

![](./milestone-1.png)


## Milestone 2

The app is deployed to [HuggingFace](https://huggingface.co/spaces/jjmakes/cs482-toxic-tweets) via GitHub Actions, following the [instructions provided in this tutorial](https://www.youtube.com/watch?v=8hOzsFETm4I). HuggingFace provides documentation for performing [sentiment analysis with Python](https://huggingface.co/blog/sentiment-analysis-python).

### Testing with Streamlit Locally

To test with Streamlit, install the project dependencies locally with:
```
pip3 install -r requirements.txt
```

To run the project, use:
```
streamlit run app.py --server.port 8888
```

To enable hot reloading, select `Always rerun` in the prompt Streamlit shows after a source file changes.
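
For orientation, here is a minimal sketch of what an `app.py` for a Space like this might look like. It is not the project's actual file: the widget layout, the `render_result` helper, and the keyword-based `placeholder_classifier` are illustrative assumptions that stand in for a real model so the sketch runs without downloads.

```python
"""Hypothetical minimal app.py; not the project's actual file."""

try:
    import streamlit as st
except ImportError:
    st = None  # keeps the helpers usable where Streamlit is not installed


def render_result(result: dict) -> str:
    """Format one prediction dict ({'label': ..., 'score': ...}) for display."""
    return f"{result['label']} ({result['score']:.1%})"


def placeholder_classifier(text: str) -> dict:
    """Stand-in for a real model (assumption): flags a couple of keywords."""
    flagged = any(word in text.lower() for word in ("hate", "stupid"))
    return {"label": "toxic" if flagged else "non-toxic", "score": 0.5}


if st is not None:
    # Streamlit reruns this script top to bottom on every interaction.
    st.title("Toxic Tweets")
    text = st.text_area("Tweet text", "I love this project!")
    if st.button("Classify"):
        st.write(render_result(placeholder_classifier(text)))
```

Saved as `app.py`, a file like this would be started with the `streamlit run` command shown earlier in this section; swapping `placeholder_classifier` for a real model is left to the application.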

The models used are pretrained and provided by [HuggingFace](https://huggingface.co/models?pipeline_tag=text-classification&sort=likes&search=sentiment).
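
As a rough sketch of how a pretrained model can be loaded, the `pipeline` helper from the `transformers` library returns a callable that maps text to a label and a score. The `load_classifier` and `classify` helper names are hypothetical, and the model loaded here is whatever default the library selects for the task, which may differ from the models this project uses.

```python
from typing import Callable, Dict, List


def load_classifier() -> Callable:
    """Load a pretrained sentiment pipeline (downloads weights on first use)."""
    # Imported lazily so classify() can be used and tested without transformers.
    from transformers import pipeline

    return pipeline("sentiment-analysis")


def classify(texts: List[str], classifier: Callable) -> List[Dict]:
    """Run the classifier and pair each input text with its prediction."""
    predictions = classifier(texts)
    return [
        {"text": text, "label": pred["label"], "score": float(pred["score"])}
        for text, pred in zip(texts, predictions)
    ]
```

With `transformers` installed, `classify(["I love this project!"], load_classifier())` would return one dictionary per input text. Passing the classifier in as an argument keeps the formatting logic independent of which model is loaded.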