Orkhan/llama-2-7b-absa is a fine-tuned version of the Llama-2-7b model, optimized for Aspect-Based Sentiment Analysis (ABSA) using a manually labelled dataset of 2000 sentences.
The fine-tuning equips the model to identify aspects and analyze the sentiment expressed toward each of them, which makes it useful for fine-grained sentiment analysis across applications.
Its advantage over traditional Aspect-Based Sentiment Analysis models is that you do not need to train a model on domain-specific labeled data, because the llama-2-7b-absa model generalizes very well. The trade-off is that you may need more computing power.
When running inference, please note that the model has been trained on single sentences, not on paragraphs. It fits in a free Google Colab notebook with a T4 GPU: https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing
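Since the model expects one sentence at a time, longer text can be split before inference. The following is a minimal sketch and not part of the original notebook: `split_into_sentences` is a hypothetical helper, and its regex-based splitting is deliberately naive (it will mis-split abbreviations such as "e.g." when followed by a space).

```python
import re


def split_into_sentences(paragraph):
    """Naively split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    # Drop empty fragments (for example when the input is blank).
    return [part for part in parts if part]


paragraph = "The food was great. The service was slow! Would I come back?"
for sentence in split_into_sentences(paragraph):
    print(sentence)
```

Each resulting sentence could then be passed to the model on its own, one call per sentence.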
What does it do? You prompt the model with a sentence and get back the aspects, opinions, sentiments and phrases (opinion + aspect) found in that sentence.
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print(output_dict)
>>>{'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'aspects': ['weather', 'birds', 'smell'],
'opinions': ['nice', 'flying', 'bad'],
'sentiments': ['Positive', 'Positive', 'Negative'],
'phrases': ['nice weather', 'flying birds', 'bad smell']}
Installation and usage:
install:
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
import:
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel
import torch
Load the model (FP16):
model_name = "Orkhan/llama-2-7b-absa"
# Load the checkpoint in FP16. This snippet does not run a separate LoRA merge;
# it loads the weights as published under model_name.
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},  # place the whole model on GPU 0
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1
tokenizer:
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
For processing input and output, it is recommended to use these ABSA-related functions:
def process_output(result, user_prompt):
    generated_text = result[0]['generated_text']
    # The text between the first '### Human:' and the first '### Assistant:' is the input as the model saw it.
    interpreted_input = generated_text.split('### Assistant:')[0].split('### Human:')[1]
    # Keep only the first answer; in the sample output it ends with ')' before a new turn starts.
    new_output = generated_text.split('### Assistant:')[1].split(')')[0].strip()

    aspects = new_output.split('Aspect detected:')[1].split('##')[0]
    opinions = new_output.split('Opinion detected:')[1].split('## Sentiment detected:')[0]
    sentiments = new_output.split('## Sentiment detected:')[1]

    # Split each field on commas. A field with a single item (no comma) now yields
    # a one-element list instead of an empty one.
    aspect_list = [aspect.strip() for aspect in aspects.split(',') if aspect.strip()]
    opinion_list = [opinion.strip() for opinion in opinions.split(',') if opinion.strip()]
    sentiments_list = [sentiment.strip() for sentiment in sentiments.split(',') if sentiment.strip()]
    # Pair each opinion with its aspect by position.
    phrases = [opinion + ' ' + aspect for opinion, aspect in zip(opinion_list, aspect_list)]

    output_dict = {
        'user_prompt': user_prompt,
        'interpreted_input': interpreted_input,
        'aspects': aspect_list,
        'opinions': opinion_list,
        'sentiments': sentiments_list,
        'phrases': phrases
    }
    return output_dict
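def process_prompt(user_prompt, model):
    # Wrap the sentence in the '### Human: ... ###' prompt format used in this card.
    edited_prompt = "### Human: " + user_prompt + '.###'
    # Cap the total length (prompt plus generated tokens) at four times the
    # prompt's token count. Relies on the module-level `tokenizer`.
    max_length = len(tokenizer.encode(user_prompt)) * 4
    pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=max_length)
    result = pipe(edited_prompt)
    output_dict = process_output(result, user_prompt)
    return result, output_dict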
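The output format can also be parsed without loading the model, which is handy for checking the parsing logic offline. The sketch below is an alternative, regex-based take on the same idea rather than the author's implementation: `parse_absa_text` and `FIELD_PATTERN` are hypothetical names, and the sample string is copied from the raw output shown in this card. Unlike the split-based approach, it returns `None` instead of raising when the expected markers are missing.

```python
import re

# Matches the three '## ... detected:' fields of one answer, in order.
FIELD_PATTERN = re.compile(
    r"## Aspect detected:(?P<aspects>.*?)"
    r"## Opinion detected:(?P<opinions>.*?)"
    r"## Sentiment detected:(?P<sentiments>[^)\n]*)",
    re.DOTALL,
)


def parse_absa_text(generated_text):
    """Parse the first Assistant answer; return None if the fields are missing."""
    _, marker, answer = generated_text.partition("### Assistant:")
    if not marker:
        return None
    # Ignore any further turns the model may have generated on its own.
    answer = answer.split("### Human:")[0]
    match = FIELD_PATTERN.search(answer)
    if match is None:
        return None

    def to_list(field):
        return [item.strip() for item in field.split(",") if item.strip()]

    return {name: to_list(value) for name, value in match.groupdict().items()}


sample = (
    "### Human: Such a nice weather, birds are flying, but there's a bad smell "
    "coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell "
    "## Opinion detected: nice, flying, bad "
    "## Sentiment detected: Positive, Positive, Negative)\n\n"
    "### Human: The new restaurant in town is amazing."
)
print(parse_absa_text(sample))
```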
inference:
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print('raw_result: ', raw_result)
print('output_dict: ', output_dict)
Output:
raw_result:
[{'generated_text': "### Human: Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)\n\n### Human: The new restaurant in town is amazing, the food is delicious and the ambiance is great.### Assistant: ## Aspect detected"}]
output_dict:
{'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'aspects': ['weather', 'birds', 'smell'],
'opinions': ['nice', 'flying', 'bad'],
'sentiments': ['Positive', 'Positive', 'Negative'],
'phrases': ['nice weather', 'flying birds', 'bad smell']}
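The lists in `output_dict` are parallel: the item at position i in `aspects`, `opinions` and `sentiments` all describe the same finding. A small sketch of regrouping them into one record per aspect follows; `to_records` and `negative_aspects` are hypothetical helpers, not part of the original notebook. Note that `zip` stops at the shortest list, so a malformed answer with uneven lists is silently truncated.

```python
def to_records(output_dict):
    """Zip the parallel lists into one dictionary per detected aspect."""
    return [
        {"aspect": aspect, "opinion": opinion, "sentiment": sentiment}
        for aspect, opinion, sentiment in zip(
            output_dict["aspects"],
            output_dict["opinions"],
            output_dict["sentiments"],
        )
    ]


def negative_aspects(output_dict):
    """Return only the aspects whose sentiment is 'Negative'."""
    return [r["aspect"] for r in to_records(output_dict) if r["sentiment"] == "Negative"]


# Values copied from the example output in this card.
example = {
    "aspects": ["weather", "birds", "smell"],
    "opinions": ["nice", "flying", "bad"],
    "sentiments": ["Positive", "Positive", "Negative"],
}
for record in to_records(example):
    print(record)
print(negative_aspects(example))
```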
The complete code is available in the Colab notebook linked above.