E3-JSI/gliner-multi-pii-domains-v1

abpani1994

13 days ago

Just a note about the fine-tuning:
the model sometimes extracts details like this:
"type": "Email",
"matches": [
  { "value": "direct email" },
  { "value": "email client" },
  { "value": "email address" }
]

abpani1994

13 days ago

•

edited 13 days ago

So I feel the data you fine-tuned on has a lot of positive examples. For instance:
{'start': 0,
'end': 11,
'text': 'Partnership',
'label': 'organization',
'score': 0.8815939426422119},
{'start': 35,
'end': 46,
'text': 'Partnership',
'label': 'organization',
'score': 0.6203173995018005},
{'start': 255,
'end': 263,
'text': 'Investor',
'label': 'person',
'score': 0.7565749287605286},
{'start': 300,
'end': 308,
'text': 'Investor',
'label': 'person',
'score': 0.6845653653144836},

eriknovak

Department for Artificial Intelligence, Jožef Stefan Institute org 12 days ago

Thank you for your input.

We fine-tuned the model following the example GLiNER provides for fine-tuning their models. As with all NER models, it will sometimes extract values that are not in line with your expectations.

We suggest trying a different (higher) threshold to remove such examples. With a higher threshold, values are extracted only when the model is really certain about them. Furthermore, you can try initializing multiple models for different entities and setting a separate threshold for each one.
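As a rough illustration of the per-entity threshold idea, here is a minimal sketch that post-filters predictions with a different cutoff per label. This swaps in a simpler technique than initializing multiple models: it filters the output of a single model. The prediction dicts are copied from the output posted earlier in this thread; with the `gliner` package they would typically come from `model.predict_entities(text, labels, threshold=...)`, which is assumed here and not called. The names `LABEL_THRESHOLDS`, `DEFAULT_THRESHOLD`, and `filter_by_label_threshold`, as well as the cutoff values, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Predictions copied from the thread above (same dict shape as the posted output).
predictions: List[Dict] = [
    {"start": 0, "end": 11, "text": "Partnership",
     "label": "organization", "score": 0.8815939426422119},
    {"start": 35, "end": 46, "text": "Partnership",
     "label": "organization", "score": 0.6203173995018005},
    {"start": 255, "end": 263, "text": "Investor",
     "label": "person", "score": 0.7565749287605286},
    {"start": 300, "end": 308, "text": "Investor",
     "label": "person", "score": 0.6845653653144836},
]

# Hypothetical per-label cutoffs (illustrative values, not recommendations).
LABEL_THRESHOLDS: Dict[str, float] = {"organization": 0.7, "person": 0.8}
# Fallback cutoff for labels without an explicit entry (also an assumption).
DEFAULT_THRESHOLD = 0.5


def filter_by_label_threshold(
    preds: List[Dict], thresholds: Dict[str, float], default: float
) -> List[Dict]:
    """Keep only predictions whose score reaches the cutoff for their label."""
    return [p for p in preds if p["score"] >= thresholds.get(p["label"], default)]


kept = filter_by_label_threshold(predictions, LABEL_THRESHOLDS, DEFAULT_THRESHOLD)
for p in kept:
    print(p["label"], p["text"], round(p["score"], 3))
```

With these illustrative cutoffs, only the first 'Partnership' prediction (score about 0.88) survives. That also shows the limit of this approach: a threshold can only remove false positives that score below it.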

Hope this helps.

eriknovak changed discussion status to closed 12 days ago


False positives