Plutus: Pioneering Greek Financial AI in a Global Context
Introduction
In the world of financial technology, language matters. For years, English has dominated financial AI benchmarks and models, leaving other languages underrepresented. Plutus is changing that. Plutus is the first comprehensive Greek financial language model and benchmark, aiming to bring advanced AI capabilities to the Greek finance sector (Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance). Named after the Greek god of wealth, Plutus signifies abundance of knowledge – and indeed it provides a wealth of tools for developers, financial analysts, and blockchain enthusiasts alike. This project introduces Plutus-ben, a Greek Financial Benchmark suite covering multiple tasks, and Plutus-8B, a custom Greek financial Large Language Model (LLM). This ported by NaCTeM (The National Centre for Text Mining), Archimedes AI, AIRC (Artificial Intelligence Research Center, AIST), and The Fin AI, whose contributions in computational resources, domain expertise, and research collaboration have been instrumental in developing Plutus. Together, they fill a crucial gap in AI: helping computers understand and generate Greek financial language with the nuance and accuracy that the industry demands.
Why does this matter? Greece has a pivotal role in the global economy and a rich financial history, yet until now there were no AI benchmarks or models dedicated to Greek finance. General-purpose or multilingual models often stumble on the linguistic complexity of Greek and the domain-specific jargon of finance. Plutus aims to bridge this gap. In this blog post, we’ll explore Plutus’s significance and applications, see how it compares with broader financial AI efforts like the FinBen (Open FinLLM Leaderboard), and understand its potential impact on the finance and blockchain communities.
Why Greek Finance Needs Plutus
Financial language is tough even in English – add the intricacies of Greek, and the challenge grows. Prior to Plutus, researchers noted “considerable performance disparities” when applying multilingual NLP models to Greek finance. In other words, an AI trained on many languages or general finance texts often fell short in Greek, missing context or misidentifying terms. This is partly due to limited Greek financial data to train on and the unique grammar and vocabulary of Greek (imagine an English-trained model trying to parse a Greek stock prospectus – it’s bound to struggle). Until Plutus, no dedicated Greek financial benchmarks or specialized Greek finance LLMs existed, meaning there was no standard way to measure how well AI performs on tasks like reading Greek annual reports or extracting data from Greek financial news.
Plutus was born to address these challenges. By focusing on a low-resource language in finance, it highlights an important principle: financial AI should not be one-size-fits-all. Just as Greek economists and analysts have their own lingo, an AI model needs specialized training to truly understand it. Plutus’s creators emphasize that cross-lingual transfer (simply using an English model on Greek text) has limitations. Their findings show that without Greek-specific training, even state-of-the-art models struggle with accuracy and reasoning on Greek content. In short, Greek finance needed Plutus to ensure it isn’t left behind in the AI revolution.
Plutus-ben
At the heart of Plutus is Plutus-ben, the first Greek Financial Evaluation Benchmark. This benchmark is a suite of five core NLP tasks that reflect real-world needs in the Greek financial domain. These tasks are:
Numeric Named Entity Recognition (Numeric NER) – identifying numeric entities in text, such as monetary values, percentages, dates, and other numbers. For example, in a Greek financial report, “€5 εκατ.” (5 million euros) should be recognized as a monetary amount. Plutus-ben provides a dataset called Plutus Finner Numeric for this, with expert-annotated examples from Greek annual reports. Models are evaluated by how accurately they tag these numeric entities (measured by F1 score).
Textual Named Entity Recognition (Text NER) – extracting named entities like people, organizations, and locations from financial documents. The Plutus Finner Text dataset focuses on Greek financial text (e.g., company names, executive names in reports). This helps in tasks like parsing news to find which companies or CEOs are mentioned. Again, an entity F1 score gauges performance.
Question Answering (QA) – answering finance questions based on provided context. Uniquely, Plutus’s QA is framed in a multiple-choice format with an accompanying passage. Many questions came from Greek university finance exams , ensuring they test real finance knowledge. For instance, a question might ask, “Ποιος είναι ο κύριος κίνδυνος για μια τράπεζα;” (“What is the main risk for a bank?”) with choices A) credit risk, B) liquidity risk, etc. The model must choose the correct answer. Evaluation is by accuracy – did the model pick the right option?
Abstractive Summarization – generating a summary of a financial document in Greek. This task addresses scenarios like summarizing annual reports or news articles for quick insights. Plutus includes an abstractive summarization component (the dataset is referred to as GRFNS-2023 in their results, built from FNS-2023 challenge datasets). Models are evaluated with metrics like ROUGE (a measure of overlap between the AI’s summary and a reference summary).
Topic Classification – categorizing financial texts by topic. The Plutus Multifin dataset does this with Greek financial news headlines based on the existing dataset MultiFin. Each headline (and a short context) comes with a set of possible topics, such as “Φορολογία & Λογιστική” (Taxation & Accounting), “Επιχειρήσεις & Διοίκηση” (Business & Management), “Οικονομικά” (Economics), etc., and the model must pick the correct category (TheFinAI/plutus-multifin · Datasets at Hugging Face). This is evaluated by accuracy as well.
Together, these five tasks in Plutus-ben cover a spectrum of skills – from understanding raw numbers and names to answering complex questions and summarizing long texts. Importantly, the benchmark isn’t just theory; it’s built on data meticulously curated and annotated by native Greek finance experts. The team created three novel high-quality Greek financial datasets, supplemented by two existing resources. For example, the topic classification task leverages the existing MultiFin dataset (a multilingual finance news collection) by extracting its Greek portion (TheFinAI/plutus-multifin · Datasets at Hugging Face). Most other tasks (NER and QA) required fresh Greek-specific data creation, like annotating annual reports and collecting exam Q&A. All of these datasets are released publicly, so anyone can inspect them or use them to train their own models.
Plutus-8B
Data alone isn’t enough – you need a model that can learn from it. Plutus-8B is that model: an 8-billion-parameter large language model fine-tuned specifically for Greek financial text. It’s built on the Llama 3 architecture (a modern LLM base model), adapted through LoRA fine-tuning (a technique that efficiently fine-tunes only parts of the model). In essence, the team started with a general Greek language model (in fact, we used Llama-Kríkri-8B, a Greek-trained model, and then trained it further on Greek financial data. This included not only the Plutus-ben tasks data but also additional Greek financial documents to give it broader knowledge of the domain.
The result is Plutus-8B-instruct, an instruction-tuned model that can understand prompts and produce answers in Greek, focused on finance. Being instruction-tuned means it was trained to follow human instructions, making it suitable for conversational agents or Q&A systems (think of asking it: “Summarize this earnings report” or “What is the debt-to-equity ratio of company X and why does it matter?” in Greek, and getting a coherent answer). The model card provided by the authors highlights that Plutus-8B is intended for “Greek-centric financial language tasks”, designed to generate and understand Greek financial text with high fidelity.
Crucially, Plutus-8B and its training data are open-source. This openness is a boon for developers and researchers: financial analysts can fine-tune it further on their own data, developers can integrate it into applications (for example, a chatbot that answers questions about Greek market data), and the community can help improve it. The creators explicitly released Plutus-8B and all datasets to promote reproducible research and broader multilingual inclusivity in finance. In practical terms, a Greek bank or fintech startup could take Plutus-8B and deploy it to automate report analysis or customer inquiries in Greek, tasks that previously might have required either an English-centric model or no AI at all.
Performance Insights: How Does Plutus Stack Up?
Building a model is one thing; proving its worth is another. The Plutus team evaluated 22 different LLMs (including Plutus-8B itself and many others) on the Plutus-ben benchmark to see how they perform. These models ranged from small open-source models to giant proprietary ones like GPT-4. The results tell a compelling story about the value of specialization.
Plutus-8B emerged as the top performer overall on the Greek financial tasks, with a mean performance score of 0.60 (out of 1.0) across the five task metrics. For context, the mighty GPT-4 – often considered the gold standard – scored a 0.52 mean on the same Greek tasks. This is a significant win for Plutus. To break it down further:
On numeric entity recognition, Plutus-8B achieved an F1 score of 0.70, whereas GPT-4 managed only 0.28. This huge gap suggests that Plutus’s focused training on Greek financial numbers paid off – it can identify things like monetary amounts or dates in Greek text far more reliably than even the largest general models. Those of us who deal with Greek financial filings know how tricky numeric formats and language can be; Plutus seems to have mastered it.
On textual entity recognition, Plutus-8B scored 0.57, nearly matching GPT-4’s 0.60. It also outperformed a host of other models, including a Greek-general model called “Meltemi” and various multilingual LLMs, in recognizing people, organizations, and places in Greek documents. In practice, this means Plutus is very good at, say, parsing a press release and pulling out the company and executive names correctly.
For the QA task, Plutus reached 64% accuracy, which was competitive though slightly behind GPT-4’s 71% and another large model’s 74%. Still, considering Plutus has a fraction of the parameters and is focused on Greek, that’s impressive. It likely means Plutus knows Greek financial concepts well but could occasionally be tripped up by very complex questions (where GPT-4’s vast general knowledge helps). However, Plutus dramatically outperformed smaller open models – many of which scored below 50% on QA.
On summarization, measured by ROUGE-1 overlap, Plutus was a bit lower (0.34) than GPT-4 (0.38). Summarizing long, nuanced reports is hard, and even GPT-4 doesn’t excel at it (sub-0.4 scores indicate there’s room for improvement for all models here). Plutus’s decent showing indicates it can produce fairly good summaries of Greek finance documents, though perhaps not as fluently as GPT-4. This isn’t surprising – summarization often benefits from extreme scale and training on massive corpora.
In topic classification, Plutus-8B hit 72% accuracy, tying for the top spot with a 72B-parameter model, and beating GPT-4 (63%). This means when it comes to sorting Greek financial news into the correct categories, Plutus is as good as any model out there. If you’re building a tool to route Greek news to analysts by topic, Plutus would be an excellent engine for it.
Overall, these results underscore a key point: bigger isn’t always better – training on the right data is essential. Plutus-8B, though much smaller than GPT-4, leverages its Greek-specific knowledge to outperform or rival the giant on multiple tasks. The authors note that Greek financial NLP remains challenging due to language and domain complexity, but Plutus shows a clear improvement over general models that attempt cross-lingual transfer learning. It validates the idea that investing in domain-specific, language-specific models can yield superior performance where it counts. This mirrors what we’ve seen in other languages and domains too, and it’s exciting to have evidence now in the Greek finance space.
For the technically curious, it’s also worth noting some “surprising results” observed on the broader Financial LLM Leaderboard: for instance, in stock price forecasting tasks, smaller fine-tuned models sometimes beat much larger ones. This aligns with Plutus’s success – a focused model can outshine a general-purpose giant on specialized tasks. It’s a reminder that quality of training beats sheer quantity of parameters in niche areas.
Real-World Applications and Impact
What can Plutus actually do in practice? The applications of a Greek financial LLM span a wide range of stakeholders:
For financial analysts and investors: Plutus-8B can read and summarize lengthy Greek financial reports, saving countless hours. Imagine an analyst covering Greek markets – instead of manually translating or skimming a 100-page annual report from a Greek corporation, they could use Plutus to get an instant summary of key financial metrics and narratives. The model’s abstractive summarization capability is particularly useful here, condensing complex information while maintaining accuracy (and evaluated with rigorous ROUGE benchmarks). Additionally, the QA capability means an analyst could ask specific questions: “What was the net profit of Company X in 2024 and how did it compare to 2023?” and get a quick answer drawn from the report. This kind of interactive analysis could dramatically speed up research in Greek equities or bonds.
For Greek financial institutions: Banks, insurers, and regulators in Greece produce a lot of text – filings, compliance documents, news releases. Plutus can be fine-tuned further to assist in information extraction from these documents. For example, the numeric NER function can pull all relevant figures from a regulatory filing (capital ratios, liquidity percentages, etc.) automatically. Regulators could use this to monitor firms’ disclosures more efficiently. Banks could integrate Plutus into their internal systems to flag important names or numbers in incoming documents (like identifying all mentions of their bank in Greek news daily, along with sentiment).
For fintech developers: Developers building financial apps for the Greek market now have a powerful tool in their arsenal. Consider a fintech app that provides conversational financial advice in Greek – it needs to understand user queries about Greek stocks or economic indicators. With Plutus-8B, the app can leverage a model that understands context like “ATHEX index” or “Ελληνική Τράπεζα” right out of the box. Moreover, because Plutus-8B is open, developers can deploy it on-premises, which is important for data privacy (financial data can be sensitive, and not all companies are comfortable sending queries to an API like ChatGPT). Plutus enables more secure, local AI solutions for Greek finance.
For the blockchain community: Interestingly, the name “Plutus” might ring a bell for blockchain enthusiasts – it’s also the name of a smart contract language on the Cardano blockchain. While unrelated, this coincidence highlights an intersection: blockchain and decentralized finance (DeFi) projects could use Plutus-8B to analyze on-chain financial data or Greek regulatory news about crypto. For example, if Greece releases new regulations on cryptocurrency, a model like Plutus could summarize and clarify the impact in Greek or even translate it to English for international readers. Cryptocurrency markets are global, and having a multilingual financial model means better insights into how different countries (like Greece) are engaging with blockchain and finance. In broader terms, blockchain projects often need to parse financial documents (whitepapers, legal texts) – a Greek-focused model adds to the toolkit for covering all bases.
Educational use: Plutus’s QA dataset sourced from university exams means it has knowledge aligned with academic finance curricula. Professors in Greek finance courses could use the model to develop tutoring systems or to generate practice questions. Students could query the model for explanations of concepts in Greek, making finance education more accessible. It’s like having a knowledgeable TA that speaks Greek finance jargon fluently.
Beyond these specific examples, the symbolic impact of Plutus is substantial. It demonstrates that multilingualism in AI is achievable and beneficial even in highly specialized domains. This can encourage similar projects for other languages with significant finance sectors (imagine a Plutus equivalent for Arabic finance, or for the Nordic languages). As the Plutus paper concludes, the goal is to foster “broader multilingual inclusivity in finance”. In a global economy, that inclusivity is crucial – we don’t want AI that only understands Wall Street and not Athens or Madrid. Plutus is a step toward democratizing financial AI.
Conclusion
Plutus stands at the intersection of finance, language, and technology – it’s a testament to what targeted AI development can achieve. By crafting a benchmark and model around Greek financial language, the creators of Plutus have not only provided a powerful tool for immediate use, but also set a precedent. They highlight that language should not be a barrier to AI empowerment in finance. Whether you’re parsing the balance sheet of a Greek shipping company, or trying to glean insights from a Hellenic Financial Stability report, Plutus brings the clarity of large language models to your fingertips in Greek.
For Greece’s finance sector, Plutus could herald the beginning of a new era where AI assists in daily operations, research, and decision-making, all in the native language of the users. For the global AI community, Plutus offers valuable lessons and a blueprint for incorporating more languages and domains. As we move forward, we might see Plutus integrated into the larger Open Financial LLM Leaderboard, inspiring competition and improvement on Greek tasks just as we have for English and Spanish. And the next FinBen will include even more languages, ensuring that the benefits of AI in finance truly reach everyone, from Wall Street to Sofokleous Street.
In the end, Plutus embodies the idea that financial AI should speak your language – literally. By empowering Greek finance with its own tailored AI benchmark and model, Plutus not only honors the legacy of its namesake (the god of wealth) by spreading “riches” of knowledge, but also ensures that the Greek financial community can partake in the AI revolution on equal footing. It’s a significant step toward a more inclusive, effective, and multilingual future for finance and technology.