Questions about data split in hyperparam_optimiz_for_disease_classifier.py
#112 opened by Jiahaoszu
Hey,
I found something strange in 'hyperparam_optimiz_for_disease_classifier.py', lines 70-75.
It seems that all 42 donors' IDs are already in 'train_indiv', so no samples would be split into the eval sets.
Thank you for your interest in Geneformer! I changed the list of individuals to a set so that the IDs are unique before they are split into the train/valid/test sets. Please pull the updated version.
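For readers following along, here is a minimal sketch of the idea of deduplicating donor IDs before splitting. It is not the code from the repository: the helper name `split_donors`, the 70/15/15 proportions, the seed, and the toy `cell_donors` list are all assumptions made for illustration. The likely cause of the reported behavior is that a per-cell column repeats each donor ID many times, so sampling from it directly can cover every donor and leave nothing for evaluation.

```python
import random


def split_donors(donor_ids, train_frac=0.7, seed=42):
    """Hypothetical helper: split unique donor IDs into train/valid/test lists."""
    # Deduplicate first. Per-cell metadata repeats each donor ID once per cell,
    # so sampling from the raw column would be sampling cells, not donors.
    unique_ids = sorted(set(donor_ids))  # sorted for a reproducible order

    rng = random.Random(seed)
    rng.shuffle(unique_ids)

    # Assumed proportions: train_frac for training, the rest halved
    # between validation and test.
    n_train = round(train_frac * len(unique_ids))
    train = unique_ids[:n_train]
    held_out = unique_ids[n_train:]

    n_valid = len(held_out) // 2
    valid = held_out[:n_valid]
    test = held_out[n_valid:]
    return train, valid, test


# Toy stand-in for a per-cell "individual" column: 1000 cells from 42 donors.
cell_donors = [f"donor{i % 42}" for i in range(1000)]
train_indiv, valid_indiv, test_indiv = split_donors(cell_donors)

print(len(train_indiv), len(valid_indiv), len(test_indiv))
```

Because the split is done on unique IDs, every donor lands in exactly one of the three lists, and the eval lists are non-empty as long as there are enough donors.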
ctheodoris changed discussion status to closed