This is a basic guide that outlines what is necessary to train an NSFW detector
on top of a CLIP model.
Treat it as a general guideline and adapt the details to your own data and CLIP model.
## Dataset Prep
You will need to obtain thousands of NSFW and SFW images to train a good model.
Once the images are gathered, place each class in its own directory and use
[clip-retrieval](https://github.com/rom1504/clip-retrieval#clip-inference)
to produce a numpy array of CLIP embeddings for each class (see the clip-retrieval docs for details).
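For reference, the clip-retrieval README documents an inference command of the form
`clip-retrieval inference --input_dataset <image_folder> --output_folder <embeddings_folder>`;
running it once per class (NSFW and SFW) yields one set of `.npy` embedding files per class.
The exact flags may vary between versions, so double-check the clip-retrieval docs.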
Finally, you will need to create target values, which can be done directly in the
Python interpreter.
### Target Values & Dataset Combination
```python
import numpy as np

# load the positive (NSFW) and negative (SFW) embedding arrays produced by clip-retrieval
pos_x = np.load("path/to_positive/samples.npy")
neg_x = np.load("path/to_negative/samples.npy")

# find out how many samples of each class you have
num_pos = pos_x.shape[0]
num_neg = neg_x.shape[0]
# create target values
pos_y = np.ones((num_pos, 1))
neg_y = np.zeros((num_neg, 1))
# combine the x samples
# NOTE: we will rely on torch dataloader shuffling to break the ordering here
x = np.vstack((pos_x, neg_x))
y = np.vstack((pos_y, neg_y))
# save the dataset x & y
np.save("train_x.npy", x)
np.save("train_y.npy", y)
```
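As an optional sanity check (not part of the original guide), confirm that the
combined arrays line up and that the embedding width matches your CLIP model:

```python
print(x.shape, y.shape)
# Row counts must match, and x.shape[1] should equal your CLIP embedding
# dimension (e.g. 1024 for OpenCLIP ViT-H/14)
assert x.shape[0] == y.shape[0] == num_pos + num_neg
```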
## Model Training
Thankfully, a fairly simple model (a small MLP on top of the CLIP embeddings) is
enough to train an effective NSFW detector.
For the purposes of this guide we will reference [this repo](https://github.com/christophschuhmann/improved-aesthetic-predictor)
and its model architecture.
> NOTE: It is also possible to use the training script provided in that repo
> as boilerplate code, provided you have `.npy` files for your dataset's x & y.
### Model Architecture
Feel free to tweak the model architecture here, but the important thing to
remember is that your input dimension should match the dimension of your CLIP
embeddings (1024 for OpenCLIP ViT-H/14, which the default below assumes), and
your output dimension should be 1.
```python
import torch.nn as nn


class H14_NSFW_Detector(nn.Module):
    def __init__(self, input_size=1024):
        super().__init__()
        self.input_size = input_size
        self.layers = nn.Sequential(
            nn.Linear(self.input_size, 1024),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(1024, 2048),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(2048, 1024),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(1024, 256),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 16),
            nn.Linear(16, 1)
        )

    def forward(self, x):
        return self.layers(x)
```
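To double-check the architecture (an optional step, not part of the original guide),
you can push a dummy batch of embeddings through the untrained model and confirm the
shapes line up; this assumes 1024-dimensional embeddings:

```python
import torch

model = H14_NSFW_Detector(input_size=1024)
dummy_batch = torch.randn(4, 1024)  # four fake CLIP embeddings
print(model(dummy_batch).shape)     # expected: torch.Size([4, 1])
```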
### Training Snippets
Below is a list of snippets that walk through the general steps needed to train
an MLP in PyTorch. However, it is completely fine to replace the existing MLP in
[this repo](https://github.com/christophschuhmann/improved-aesthetic-predictor)
with the model provided above and begin training.
For those who wish to build out custom code, the following snippets should
get the ball rolling...
#### Import the necessary libraries:
```python
import numpy as np
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.utils.data import TensorDataset, DataLoader
```
#### Define the dataset and data loaders:
```python
# Define the dataset (cast to float32 so the inputs match the model's weights)
x = torch.from_numpy(np.load("train_x.npy")).float()
y = torch.from_numpy(np.load("train_y.npy")).float()
train_dataset = TensorDataset(x, y)

# Define the data loader
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
```
#### Initialize the model
```python
model = H14_NSFW_Detector()
```
#### Define the loss function
```python
criterion = nn.MSELoss()
```
#### Define the optimizer
```python
# Define the optimizer
optimizer = Adam(model.parameters())
```
#### Define the training loop
```python
# Define the number of epochs
num_epochs = 10

# Training loop
model.train()
for epoch in range(num_epochs):
    for inputs, labels in train_loader:
        # Clear gradients
        optimizer.zero_grad()
        # Forward pass
        outputs = model(inputs)
        # Compute the loss
        loss = criterion(outputs, labels)
        # Backward pass and optimization
        loss.backward()
        optimizer.step()
```
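If you want feedback while training (useful when tuning the hyperparameters
mentioned at the end of this guide), a lightly extended version of the same loop
can track the average loss per epoch; this is an optional addition, not part of
the original snippet:

```python
for epoch in range(num_epochs):
    running_loss = 0.0
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    # Report the average training loss over all batches in this epoch
    print(f"epoch {epoch + 1}/{num_epochs} - train loss: {running_loss / len(train_loader):.4f}")
```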
#### Define an evaluation loop
```python
# Evaluation loop
model.eval()
val_loss = 0.0
with torch.no_grad():
    for inputs, labels in val_loader:
        # Forward pass
        outputs = model(inputs)
        # Accumulate the loss
        val_loss += criterion(outputs, labels).item()
print(f"Validation loss: {val_loss / len(val_loader):.4f}")
```
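Note that the loop above assumes a `val_loader` built from a held-out validation
split. A minimal sketch, assuming you saved hypothetical `val_x.npy` / `val_y.npy`
files the same way as the training set:

```python
val_x = torch.from_numpy(np.load("val_x.npy")).float()
val_y = torch.from_numpy(np.load("val_y.npy")).float()
val_dataset = TensorDataset(val_x, val_y)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)
```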
Note that this is a basic guide, and you may need to add functionality
such as model saving and loading, early stopping, etc.
You may want to adjust the learning rate,
batch size, and number of epochs as well.
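For example, saving and reloading the trained weights only takes a couple of
lines (the filename below is just a placeholder):

```python
# Save the trained weights
torch.save(model.state_dict(), "h14_nsfw_detector.pt")

# Later: reload them into a fresh model instance before inference
model = H14_NSFW_Detector(input_size=1024)
model.load_state_dict(torch.load("h14_nsfw_detector.pt"))
model.eval()
```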