Report for ProsusAI/finbert
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 3 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_66agree, split train).
👉Robustness issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.100 | Add typos | 100/1000 tested samples (10.0%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 10.0% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
3956 | Operating result , excluding one-off items , totaled EUR 9.1 mn compared to EUR 10.6 mn in continuing operations , excluding one-off items in 2004 . | Operating result , excluding one-off items , totaled EUR 9.1 mj cokpared to EUR 106 mn in continuinf opeations , excludig one-off items oin 2004 . | negative (p = 0.68) | neutral (p = 0.46) |
1737 | Commencing the construction works of Pearl Plaza is a significant step in our Russian projects . | Commencibg tge construction works of Pearl Plaza is a significant tep in kur Russian projecfs . | positive (p = 0.66) | neutral (p = 0.81) |
1442 | Sales of mid-strength beer decreased by 40 % . | Sales of mid-strength beer decreaxsesd by 40 % . | negative (p = 0.97) | positive (p = 0.93) |
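
For anyone who wants to reproduce the fail rate above, here is a minimal Python sketch of how such a metric can be computed: the share of samples whose predicted label changes after perturbation. Note that `fail_rate`, `toy_predict`, and `toy_perturb` are hypothetical names for illustration, not the Giskard API and not the real ProsusAI/finbert model. The perturbation simply reuses the typo from sample 1442.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    perturb: Callable[[str], str],
    texts: Sequence[str],
) -> float:
    """Share of texts whose predicted label changes after perturbation."""
    if not texts:
        raise ValueError("need at least one text")
    changed = sum(predict(t) != predict(perturb(t)) for t in texts)
    return changed / len(texts)


# Hypothetical stand-in for the real classifier: a keyword rule, for illustration only.
def toy_predict(text: str) -> str:
    return "negative" if "decreased" in text else "neutral"


# Hypothetical perturbation that injects the typo seen in sample 1442.
def toy_perturb(text: str) -> str:
    return text.replace("decreased", "decreaxsesd")


texts = [
    "Sales of mid-strength beer decreased by 40 % .",
    "Commencing the construction works of Pearl Plaza is a significant step in our Russian projects .",
]
rate = fail_rate(toy_predict, toy_perturb, texts)
print(f"Fail rate = {rate:.3f}")  # prints "Fail rate = 0.500" for this toy setup
```

In the scan above, the same ratio is 100 changed predictions out of 1000 tested samples, which gives the reported fail rate of 0.100.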
👉Performance issues (2)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | text contains "business" | Balanced Accuracy = 0.878 | — | 6.46% lower than global |
🔍✨Examples
For records in the dataset where `text` contains "business", the Balanced Accuracy is 6.46% lower than the global Balanced Accuracy.
Index | text | label | Predicted label |
---|---|---|---|
506 | `` These tests are part of a larger campaign which includes various customer trials and demonstrations to make LTE on 800 MHz commercially viable by this summer , '' Nokia Siemens head of LTE business line , Reino Tammela , said . | neutral | positive (p = 0.67) |
613 | ADP News - Apr 22 , 2009 - Finnish business information systems developer Solteq Oyj HEL : STQ1V said today its net loss widened to EUR 189,000 USD 245,000 for the first quarter of 2009 from EUR 10,000 for the same peri | negative | positive (p = 0.93) |
694 | The Group 's business is balanced by its broad portfolio of sports and presence in all major markets . | positive | neutral (p = 0.64) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | text contains "quarter" | Balanced Accuracy = 0.886 | — | 5.52% lower than global |
🔍✨Examples
For records in the dataset where `text` contains "quarter", the Balanced Accuracy is 5.52% lower than the global Balanced Accuracy.
Index | text | label | Predicted label |
---|---|---|---|
580 | The loss for the third quarter of 2007 was EUR 0.3 mn smaller than the loss of the second quarter of 2007 . | positive | negative (p = 0.93) |
586 | The third quarter result also includes a 400,000 euro ( $ 575,000 ) provision for down-sizing of lure manufacturing in Ireland . | neutral | negative (p = 0.56) |
612 | Finnish power supply solutions and systems provider Efore Oyj said its net loss widened to 3.2 mln euro $ 4.2 mln for the first quarter of fiscal 2006-2007 ending October 31 , 2007 from 900,000 euro $ 1.2 mln for the same period of fiscal 2005-06 . | negative | positive (p = 0.93) |
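
The slice comparisons reported above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the deviation is the relative difference between the slice score and the global score (the report does not state the exact formula), and it uses made-up toy records rather than financial_phrasebank data. `balanced_accuracy` and `slice_deviation` are hypothetical helper names, not Giskard functions.

```python
from collections import defaultdict
from typing import Sequence, Tuple


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    if not y_true:
        raise ValueError("need at least one record")
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def slice_deviation(
    texts: Sequence[str],
    y_true: Sequence[str],
    y_pred: Sequence[str],
    keyword: str,
) -> Tuple[float, float, float]:
    """Return (slice score, global score, relative deviation) for a keyword slice."""
    idx = [i for i, t in enumerate(texts) if keyword in t.lower()]
    if not idx:
        raise ValueError(f"no record contains {keyword!r}")
    global_score = balanced_accuracy(y_true, y_pred)
    slice_score = balanced_accuracy(
        [y_true[i] for i in idx], [y_pred[i] for i in idx]
    )
    # Assumption: deviation is measured relative to the global score.
    return slice_score, global_score, (slice_score - global_score) / global_score


# Toy records, invented for illustration.
texts = [
    "Our business is balanced .",
    "The business line was launched .",
    "Sales decreased by 40 % .",
    "Profit rose .",
]
y_true = ["positive", "neutral", "negative", "positive"]
y_pred = ["neutral", "neutral", "negative", "positive"]

slice_score, global_score, deviation = slice_deviation(texts, y_true, y_pred, "business")
print(f"slice = {slice_score:.3f}, global = {global_score:.3f}, deviation = {deviation:+.2%}")
```

Running the same comparison on your own predictions for the "business" and "quarter" slices is a quick way to check whether the drops flagged above hold up on your data.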
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Check out the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!