Russian Tank Destroyed or Not Classifier
Binary Image Classifier for whether a tank is destroyed or not using images from Oryx's collection of confirmed Russian armor losses.
Two models were trained:
- VGG
- Finetuned Resnet18
Model Description and Architecture
Models were trained in Pytorch. VGG was built from scratch while resnet18 was finetuned on pytorch's base resnet18 model with weights from IMAGENET1K_V1. Documentation for pytorch's resnet18 can be found here.
VGG Model Architecture:
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(5): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): ReLU(inplace=True)
(7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(8): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(10): ReLU(inplace=True)
(11): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(12): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(13): ReLU(inplace=True)
(14): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(5, 5))
(classifier): Sequential(
(0): Linear(in_features=3200, out_features=512, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.3, inplace=False)
(3): Linear(in_features=512, out_features=256, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.3, inplace=False)
(6): Linear(in_features=256, out_features=1, bias=True)
(7): Sigmoid()
)
)
Training Details
Dataset:
Training data was sourced from a kaggle dataset compiling images from Oryx's site. Only Russian tank images were used for training.
Image filenames in the original dataset contain a list of tags: ['destroyed', 'captured', 'abandoned']. Some images have multiple tags as a result of a vehicle being for example both abandoned then captured, or as a result of an image having multiple vehicles. Images with multiple vehicles or no tags were excluded. Single vehicle images with multiple tags were considered not destroyed if the destroyed tag was not present in the image filename.
After data cleaning and labeling, there were 487 destroyed tanks and 228 not destroyed tanks. These images were split 85/15 into a training and validation set. Imbalanced class representation was by using a weighted random sampler that overweighted the minority "not destroyed" class.
Images were resized and manipulated using the following transformation:
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
composed_transform = transforms.Compose([transforms.Resize((256, 256)),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
transforms.Normalize(mean, std)])
Model Hyperparameters:
VGG:
- Batch Size = 16
- Epochs = 20
- lr = 1e-3
- dropout_rate = 0.3
- Optimizer = Adam
- Loss = BCELoss
Resnet18:
- Batch Size = 16
- Epochs = 46
- lr = 1e-3
- Optimizer = Adam
- Loss = CrossEntropyLoss
Results:
VGG: 73% Accuracy Validation set
Resnet18: 90% Accuracy Validaton set
Has limited ability to correctly recognize the states of vehicles outside of the training scope.
Limitations:
Doesn't have the ability to recognize whether the provided image is a tank or not.