---
title: LiteraLingo Dev
emoji: 🐠
colorFrom: yellow
colorTo: red
sdk: gradio
sdk_version: 4.41.0
app_file: app.py
pinned: false
license: mit
short_description: Convert figurative sentences into their literal meanings.
---
Check out the configuration reference at https://huggingface.co./docs/hub/spaces-config-reference
# LiteraLingo_Dev
LiteraLingo_Dev is a Gradio app that converts figurative sentences into their literal meanings using various models. It serves as the inference API for the LiteraLingo app in the development environment, processing queries and returning the implied meaning of the given input. This README covers how to use the app, how to set it up, and some important notes on working with the Gradio API and model inference.
## API Usage

For direct API access, you can use the Gradio client:

```python
from gradio_client import Client

API_URL = "https://caddc612329739b198.gradio.live/"
client = Client(API_URL)

result = client.predict(
    model_typ="gemma",
    prompt="She has a heart of gold",
    max_length=256,
    api_token="",
    api_name="/predict",
)
print(result)
```
See the Notes and Cautions section below for important usage details.
## Prerequisites

Before using LiteraLingo_Dev, ensure the packages listed in requirements.txt are installed.
## Setup
### 1. Authentication
You need to authenticate with Hugging Face to access certain models. Obtain a token from Hugging Face and set it as an environment variable, or set the value of the `HF_TOKEN` secret in the settings of the Space hosting the app (LiteraLingo_Dev/settings):
```python
import os

HF_TOKEN = os.getenv('HF_TOKEN')
```
### 2. Model URLs
The settings of this Gradio app, including the URLs of the models used, can be changed in globals.py:
"""T5"""
T5_FILE_NAME = "model.safetensors"
simplet5_base_URL="angel1987/simplet5_metaphor_dev1"
simplet5_large_URL="angel1987/simplet5_metaphor_dev2"
"""models"""
gemma_2b_URL = "google/gemma-2b-it"
falcon_7b_URL = "tiiuae/falcon-7b-instruct"
Note that your HF_TOKEN must have access to these models for the app to run successfully.
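One quick way to verify that the token can reach a gated model is via huggingface_hub. This is only a sketch; the app itself does not perform this check:

```python
import os

from huggingface_hub import model_info

HF_TOKEN = os.getenv("HF_TOKEN")

# Raises an error (e.g. a 401 / gated-repo error) if the token lacks access.
info = model_info("google/gemma-2b-it", token=HF_TOKEN)
print(info.id)
```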
### 3. Model Initialization
Make sure the models are loaded correctly based on the available hardware: if a GPU is available it will be used; otherwise the models fall back to the CPU.
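A minimal sketch of that device check, assuming the models are loaded with the transformers API (the exact loading code in app.py may differ):

```python
import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

HF_TOKEN = os.getenv("HF_TOKEN")

# Use the GPU when one is visible; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Illustrative load of one configured model (gemma_2b_URL in globals.py).
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it", token=HF_TOKEN)
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it", token=HF_TOKEN).to(device)
```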
The examples in globals.py also serve as test cases and run every time the app starts; you can remove them if they interfere with the app's initialization.
## Usage - Gradio Interface
### Single Output
The app provides a Gradio interface for querying models and receiving a single output. The available models are:
- Gemma
- Falcon-7b-instruct
- Falcon-7b-instruct API (requires API token)
- SimpleT5 Base
- SimpleT5 Large
You can select the model type, input a sentence, and get the paraphrased literal meaning. You can also specify the maximum length of the generated sentence; this acts only as an upper bound and does not guarantee that the output will reach that length.
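For example, the same /predict endpoint from the API example above can target a different model with a custom length threshold. The model identifier string here is an assumption; check the app's dropdown or API docs for the exact values:

```python
from gradio_client import Client

client = Client("https://caddc612329739b198.gradio.live/")

# max_length caps generation; shorter outputs are returned unchanged.
result = client.predict(
    model_typ="simplet5_base",  # assumed identifier for "SimpleT5 Base"
    prompt="He spilled the beans",
    max_length=128,
    api_token="",               # only required for the Falcon API model
    api_name="/predict",
)
print(result)
```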
### Top K Output
For generating multiple responses, the Top K output feature is available. It provides up to five different responses based on the input sentence, model type, and temperature.
Producing multiple responses for a given input surfaces the range of paraphrases a model can generate. This diversity offers a richer set of options than a single response, and the SimpleT5 models can sometimes perform better with sampling enabled than with the deterministic approach.
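Under the hood this corresponds to sampled generation rather than deterministic decoding. A minimal sketch with the transformers API, assuming the SimpleT5 checkpoints load as standard T5 seq2seq models and ignoring the default prefix mentioned below:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# simplet5_base_URL from globals.py; these loading details are an assumption.
tokenizer = AutoTokenizer.from_pretrained("angel1987/simplet5_metaphor_dev1")
model = AutoModelForSeq2SeqLM.from_pretrained("angel1987/simplet5_metaphor_dev1")

inputs = tokenizer("She has a heart of gold", return_tensors="pt")

# Sampling with a temperature yields diverse candidates; num_return_sequences=5
# mirrors the "up to five different responses" of the Top K output.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.9,
    num_return_sequences=5,
    max_length=256,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```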
## Notes and Cautions
**Rate Limits**: The free Inference API may be rate limited under heavy use. We try to balance the load evenly across all our available resources and favor steady flows of requests. If your account suddenly sends 10k requests, you are likely to receive 503 errors saying models are loading. To prevent that, start running queries smoothly, ramping from 0 to 10k over the course of a few minutes.
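One way to keep that steady flow is to pace requests and back off on errors. A rough sketch against the /predict endpoint shown above (the pacing values are arbitrary placeholders):

```python
import time

from gradio_client import Client

client = Client("https://caddc612329739b198.gradio.live/")
sentences = ["She has a heart of gold"] * 100  # placeholder batch

for sentence in sentences:
    for attempt in range(5):
        try:
            client.predict(model_typ="gemma", prompt=sentence,
                           max_length=256, api_token="", api_name="/predict")
            break
        except Exception:
            # Back off exponentially on failures such as 503 "model loading".
            time.sleep(2 ** attempt)
    time.sleep(0.1)  # space requests out instead of bursting
```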
**Model Performance**: Inference times can vary. Inference with the Falcon-Instruct models is notably long, taking roughly 300 to 800 seconds, so using the API is recommended (~0.6 s, under 1 s in most cases). Remember to provide the corresponding HF token that grants access to the API with each query; otherwise only a warning message is shown.
**Model Limitations**: Some models may produce different results or have limitations depending on their configuration and the input provided. Note that a default prefix, chosen through prompt testing, is prepended to every incoming query. You can configure these default values in globals.py.