Which Ensembl ID is used?

#89
by jinbo1129 - opened

Hello, thanks for this great tool!

Now I want to train/test on my own data.
First, I have to convert the gene names to ensembl_id.
In the file gene_info_table.csv, most genes have a unique ensembl_id, but some have multiple ensembl_ids.
Can I just pick the first one for the genes that have multiple ensembl_ids? Could you give some suggestions?

Thanks !!!

Thank you for your interest in Geneformer! As mentioned in prior closed discussions, other types of gene annotations (such as gene names) can be converted to Ensembl IDs using Ensembl Biomart. I added this point to the instructions in the transcriptome tokenizing example to help clarify.
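
For readers doing this conversion, here is a minimal sketch of one way to apply a name-to-ID table exported from Ensembl BioMart and to separate genes that map to a single Ensembl ID from those that map to several. It is an illustration under stated assumptions, not the Geneformer authors' procedure: the column names `Gene name` and `Gene stable ID`, the inline table, the `EXAMPLE1` rows with placeholder IDs, and the helper names `build_name_to_ids` and `split_by_ambiguity` are all hypothetical. The sketch does not decide which ID to keep for an ambiguous gene; it only surfaces those genes for review.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for a table exported from Ensembl BioMart.
# The column names are assumptions and may differ in a real export.
# EXAMPLE1 and its IDs are placeholders showing a name with two IDs.
BIOMART_TSV = (
    "Gene name\tGene stable ID\n"
    "TP53\tENSG00000141510\n"
    "GAPDH\tENSG00000111640\n"
    "EXAMPLE1\tENSG_EXAMPLE_A\n"
    "EXAMPLE1\tENSG_EXAMPLE_B\n"
)


def build_name_to_ids(tsv_text):
    """Collect every Ensembl ID listed for each gene name, in file order."""
    name_to_ids = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        name = row["Gene name"].strip()
        ensembl_id = row["Gene stable ID"].strip()
        # Skip rows with a missing field and avoid duplicate IDs per name.
        if name and ensembl_id and ensembl_id not in name_to_ids[name]:
            name_to_ids[name].append(ensembl_id)
    return dict(name_to_ids)


def split_by_ambiguity(name_to_ids):
    """Separate names with exactly one ID from names with several."""
    unique = {n: ids[0] for n, ids in name_to_ids.items() if len(ids) == 1}
    ambiguous = {n: ids for n, ids in name_to_ids.items() if len(ids) > 1}
    return unique, ambiguous


name_to_ids = build_name_to_ids(BIOMART_TSV)
unique, ambiguous = split_by_ambiguity(name_to_ids)
print("one-to-one:", unique)
print("needs review:", ambiguous)
```

Keeping the ambiguous genes in a separate dictionary makes it possible to inspect them before choosing an ID, rather than silently taking the first entry.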

ctheodoris changed discussion status to closed
