Model description

DinoVdrone is a model built on top of the DinoVdrone-large-2025_02_04_23294-bs32_freeze_probs model for underwater multilabel image classification. The classification head is a combination of linear, ReLU, batch normalization, and dropout layers.
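As an illustration of what such a head can look like, here is a minimal PyTorch sketch. The layer order, the hidden size of 512, the dropout rate, and the 1024-dimensional input embedding are all assumptions made for the example and are not taken from the released configuration; only the layer types and the 16 output classes come from this card. `build_head` is a hypothetical helper name.

```python
import torch
from torch import nn


def build_head(in_features: int, hidden: int, num_labels: int, p_drop: float = 0.5) -> nn.Sequential:
    """Hypothetical head: sizes and layer order are assumptions, not the released config."""
    return nn.Sequential(
        nn.Linear(in_features, hidden),
        nn.BatchNorm1d(hidden),
        nn.ReLU(),
        nn.Dropout(p_drop),
        nn.Linear(hidden, num_labels),  # one logit per class
    )


# 1024 is an assumed encoder embedding size; 16 matches the classes listed in this card.
head = build_head(in_features=1024, hidden=512, num_labels=16)
head.eval()  # use running batch-norm statistics and disable dropout

with torch.no_grad():
    logits = head(torch.randn(4, 1024))  # 4 random stand-in embeddings

probs = torch.sigmoid(logits)  # independent per-class scores in [0, 1]
```

A sigmoid is applied per class rather than a softmax over classes, since in a multilabel setting several classes can be present in the same image.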

The source code for training the model can be found in this Git repository.

Developed by: lombardata, credits to César Leblanc and Victor Illien

Intended uses & limitations

You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species.
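Because the task is multilabel, each class gets its own score and several classes can be reported for one image. The sketch below shows one common way to turn raw per-class logits into labels. The `decode` helper, the 0.5 threshold, and the example logits are assumptions for illustration; the class names are the ones listed in this card's class table, in the same order, which may differ from the order used by the released checkpoint.

```python
import math

# Class names as listed in this card (order is an assumption).
CLASSES = [
    "Acropore_branched", "Acropore_digitised", "Acropore_tabular", "Algae",
    "Atra/Leucospilota", "Dead_coral", "Fish", "Millepore",
    "No_acropore_encrusting", "No_acropore_massive", "No_acropore_sub_massive",
    "Rock", "Rubble", "Sand", "Sea_cucumber", "Sea_urchins",
]


def decode(logits, threshold=0.5):
    """Return (label, score) pairs whose sigmoid score reaches the threshold."""
    scores = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [(name, s) for name, s in zip(CLASSES, scores) if s >= threshold]


# Made-up logits: strongly negative everywhere except two classes.
example_logits = [-3.0] * len(CLASSES)
example_logits[CLASSES.index("Sand")] = 2.0
example_logits[CLASSES.index("Rubble")] = 0.5

predicted = decode(example_logits)
for name, score in predicted:
    print(f"{name}: {score:.3f}")
```

The threshold is a tunable choice: lowering it reports more classes per image at the cost of more false positives.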

Training and evaluation data

Details on the estimated number of images for each class are given in the following table:

Class	train	test	val	Total
Acropore_branched	1400	331	324	2055
Acropore_digitised	1235	307	296	1838
Acropore_tabular	705	254	270	1229
Algae	5179	1709	1694	8582
Atra/Leucospilota	780	121	133	1034
Dead_coral	3796	1093	1060	5949
Fish	2822	810	786	4418
Millepore	977	324	337	1638
No_acropore_encrusting	854	417	404	1675
No_acropore_massive	3430	1180	1185	5795
No_acropore_sub_massive	3261	975	935	5171
Rock	5251	1740	1737	8728
Rubble	5152	1708	1690	8550
Sand	5285	1764	1767	8816
Sea_cucumber	1942	497	527	2966
Sea_urchins	269	131	140	540

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Number of Epochs: 66
Learning Rate: 0.001
Train Batch Size: 32
Eval Batch Size: 32
Optimizer: Adam
LR Scheduler Type: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
Freeze Encoder: Yes
Data Augmentation: Yes
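To make the scheduler setting concrete, the following pure-Python sketch mimics the reduce-on-plateau rule with a patience of 5 epochs and a factor of 0.1. It is a simplified stand-in, not the training code: the `PlateauScheduler` class is hypothetical, and the relative improvement threshold of 1e-4 is an assumption (it is not stated in this card). Cooldown and minimum learning rate handling are left out.

```python
class PlateauScheduler:
    """Simplified stand-in for a reduce-on-plateau scheduler (minimising a loss)."""

    def __init__(self, lr, factor=0.1, patience=5, threshold=1e-4):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold  # assumed relative improvement threshold
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the learning rate to use next."""
        if val_loss < self.best * (1.0 - self.threshold):
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Reduce only once the number of bad epochs exceeds the patience.
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr


sched = PlateauScheduler(lr=1e-3)
# Toy losses: two improving epochs, then six epochs without improvement.
losses = [0.50, 0.45, 0.45, 0.45, 0.45, 0.45, 0.45, 0.45]
lrs = [sched.step(loss) for loss in losses]
```

Under this rule the learning rate is cut tenfold only after six consecutive epochs without sufficient improvement, which is the kind of stepwise decay visible in the Learning Rate column of the training results.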

Data Augmentation

Data were augmented using the following transformations:

Train Transforms

PreProcess: No additional parameters
Resize: probability=1.00
RandomHorizontalFlip: probability=0.25
RandomVerticalFlip: probability=0.25
ColorJiggle: probability=0.25
RandomPerspective: probability=0.25
Normalize: probability=1.00

Val Transforms

PreProcess: No additional parameters
Resize: probability=1.00
Normalize: probability=1.00

Training results

Epoch	Validation Loss	MAE	RMSE	KL div	Learning Rate
1	0.464070200920105	0.1411	0.1895	1.0901	0.001
2	0.4593982994556427	0.1389	0.1848	0.4544	0.001
3	0.4592764973640442	0.1384	0.1857	0.3269	0.001
4	0.4557758569717407	0.1355	0.1821	0.3961	0.001
5	0.4529845416545868	0.1346	0.1801	0.4950	0.001
6	0.4528021514415741	0.1321	0.1798	1.0128	0.001
7	0.4533347189426422	0.1330	0.1805	0.8411	0.001
8	0.4499141275882721	0.1301	0.1768	0.7630	0.001
9	0.45464015007019043	0.1310	0.1812	0.9855	0.001
10	0.45381101965904236	0.1336	0.1804	0.6065	0.001
11	0.4514716863632202	0.1301	0.1783	0.7073	0.001
12	0.4508911669254303	0.1314	0.1776	0.5892	0.001
13	0.45219239592552185	0.1316	0.1790	0.7082	0.001
14	0.44901221990585327	0.1301	0.1763	0.6661	0.001
15	0.4518992006778717	0.1337	0.1787	0.5407	0.001
16	0.4521825909614563	0.1319	0.1784	0.5890	0.001
17	0.4508560597896576	0.1308	0.1776	0.6908	0.001
18	0.45125871896743774	0.1326	0.1784	0.7074	0.001
19	0.45139938592910767	0.1338	0.1782	0.4506	0.001
20	0.45023980736732483	0.1315	0.1775	0.4707	0.001
21	0.44681790471076965	0.1274	0.1740	0.7163	0.0001
22	0.44754138588905334	0.1272	0.1742	0.7143	0.0001
23	0.44662079215049744	0.1269	0.1737	0.6689	0.0001
24	0.4464997351169586	0.1279	0.1737	0.6134	0.0001
25	0.44629067182540894	0.1266	0.1734	0.7332	0.0001
26	0.4470885097980499	0.1272	0.1736	0.7046	0.0001
27	0.4466330111026764	0.1274	0.1736	0.5721	0.0001
28	0.4454747140407562	0.1262	0.1727	0.6485	0.0001
29	0.44584745168685913	0.1263	0.1730	0.6530	0.0001
30	0.44628167152404785	0.1260	0.1731	0.6613	0.0001
31	0.44592905044555664	0.1261	0.1730	0.6828	0.0001
32	0.44724300503730774	0.1257	0.1739	0.8040	0.0001
33	0.4454258978366852	0.1250	0.1727	0.7662	0.0001
34	0.4446963667869568	0.1248	0.1721	0.7306	0.0001
35	0.4447169899940491	0.1258	0.1721	0.6649	0.0001
36	0.445063054561615	0.1258	0.1724	0.6900	0.0001
37	0.4462355971336365	0.1257	0.1731	0.7153	0.0001
38	0.445072203874588	0.1263	0.1723	0.6616	0.0001
39	0.44474172592163086	0.1256	0.1721	0.7085	0.0001
40	0.4460701048374176	0.1271	0.1731	0.5933	0.0001
41	0.4457239508628845	0.1261	0.1727	0.5261	1e-05
42	0.44464021921157837	0.1253	0.1720	0.6756	1e-05
43	0.4448077976703644	0.1247	0.1721	0.7392	1e-05
44	0.4451163709163666	0.1257	0.1722	0.5798	1e-05
45	0.44501832127571106	0.1251	0.1722	0.6602	1e-05
46	0.44481563568115234	0.1254	0.1720	0.5779	1e-05
47	0.44460880756378174	0.1249	0.1720	0.7530	1e-05
48	0.44483986496925354	0.1251	0.1720	0.6117	1e-05
49	0.4454316794872284	0.1255	0.1725	0.5545	1e-06
50	0.44486185908317566	0.1249	0.1721	0.6288	1e-06
51	0.4449009299278259	0.1247	0.1720	0.6185	1e-06
52	0.4446372091770172	0.1246	0.1719	0.6934	1e-06
53	0.44491493701934814	0.1251	0.1721	0.5841	1e-06
54	0.4448229670524597	0.1249	0.1721	0.6878	1e-06
55	0.4446983337402344	0.1250	0.1720	0.7091	1e-07
56	0.4444815218448639	0.1246	0.1717	0.6894	1e-07
57	0.4450804889202118	0.1251	0.1723	0.6663	1e-07
58	0.4454004764556885	0.1245	0.1724	0.6827	1e-07
59	0.4450150430202484	0.1246	0.1723	0.7230	1e-07
60	0.445451945066452	0.1249	0.1725	0.7024	1e-07
61	0.44454964995384216	0.1248	0.1718	0.7007	1e-07
62	0.44488847255706787	0.1248	0.1721	0.6939	1e-07
63	0.4447149932384491	0.1248	0.1719	0.6934	1e-08
64	0.44488242268562317	0.1247	0.1721	0.7439	1e-08
65	0.4448792636394501	0.1258	0.1722	0.5544	1e-08
66	0.444937139749527	0.1251	0.1721	0.6393	1e-08
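For readers unfamiliar with the metric columns, the sketch below shows one plausible way to compute MAE, RMSE, and KL divergence between a predicted score vector and a target probability vector for a single image. The exact definitions and averaging conventions used to produce the table are not stated in this card, so the helper functions, the normalisation inside `kl_div`, and the example vectors are all assumptions.

```python
import math


def mae(pred, target):
    """Mean absolute error over classes."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def rmse(pred, target):
    """Root mean squared error over classes."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def kl_div(target, pred, eps=1e-12):
    """KL(target || pred) after normalising both vectors to sum to 1 (assumed convention)."""
    t_sum, p_sum = sum(target), sum(pred)
    total = 0.0
    for t, p in zip(target, pred):
        t, p = t / t_sum, p / p_sum
        if t > 0:  # terms with zero target mass contribute nothing
            total += t * math.log(t / max(p, eps))
    return total


# Made-up vectors for four classes.
target = [0.0, 0.5, 0.5, 0.0]
pred = [0.1, 0.4, 0.4, 0.1]

print(f"MAE={mae(pred, target):.4f} RMSE={rmse(pred, target):.4f} KL={kl_div(target, pred):.4f}")
```

MAE and RMSE are bounded by the score range, whereas KL divergence is unbounded and sensitive to small predicted probabilities, which may explain why the KL column fluctuates more from epoch to epoch than the other two.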

Framework Versions

Transformers: 4.48.0
Pytorch: 2.5.1+cu124
Datasets: 3.0.2
Tokenizers: 0.21.0
