ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis
Abstract
Bidirectional transformers excel at sentiment analysis, and Large Language Models (LLM) are effective zero-shot learners. Might they perform better as a team? This paper explores collaborative approaches between ELECTRA and GPT-4o for three-way sentiment classification. We fine-tuned (FT) four models (ELECTRA Base/Large, GPT-4o/4o-mini) using a mix of reviews from Stanford Sentiment Treebank (SST) and DynaSent. We provided input from ELECTRA to GPT as: predicted label, probabilities, and retrieved examples. Sharing ELECTRA Base FT predictions with <PRE_TAG>GPT-4o-mini</POST_TAG> significantly improved performance over either model alone (82.74 macro F1 vs. 79.29 ELECTRA Base FT, 79.52 <PRE_TAG>GPT-4o-mini</POST_TAG>) and yielded the lowest cost/performance ratio (\0.12/F1 point). However, when GPT models were fine-tuned, including predictions decreased performance. GPT-4o FT-M was the top performer (86.99), with <PRE_TAG>GPT-4o-mini</POST_TAG> FT close behind (86.77) at much less cost (0.38 vs. \$1.59/F1 point). Our results show that augmenting prompts with predictions from fine-tuned encoders is an efficient way to boost performance, and a fine-tuned <PRE_TAG>GPT-4o-mini</POST_TAG> is nearly as good as <PRE_TAG>GPT-4o FT</POST_TAG> at 76% less cost. Both are affordable options for projects with limited resources.
Models citing this paper 2
Datasets citing this paper 1
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper