--- language: en tags: - Explain code - Code Summarization - Summarization license: mit --- # Gemini For in-depth understanding of our model and methods, please see our blog [here](https://www.describe-ai.com/gemini) ## Model description Gemini is a transformer based on Google's T5 model. The model is pre-trained on approximately 800k code/description pairs and then fine-tuned on 10k higher-level explanations that were synthetically generated. Gemini is capable of summarization/explaining short to medium code snippets in: - Python - Javascript (mostly vanilla JS, however, it can handle frameworks like React as well) - Java - Ruby - Go And outputs a description in English. ## Intended uses & limitations Gemini without any additional fine-tuning is capable of explaining code in a sentence or two and typically performs best in Python and Javascript. We recommend using Gemini for either simple code explanation, documentation or producing more synthetic data to improve its explanations. ### How to use You can use this model directly with a pipeline for Text2Text generation, as shown below: ```python from transformers import pipeline, set_seed summarizer = pipeline('text2text-generation', model='describeai/gemini') code = "print('hello world!')" response = summarizer(code, max_length=100, num_beams=3) print("Summarized code: " + response[0]['generated_text']) ``` Which should yield something along the lines of: ``` Summarized code: The following code is greeting the world. ``` ### Model sizes - Gemini (this repo): 770 Million Parameters - Gemini-Small - 220 Million Parameters ### Limitations Typically, Gemini may produce overly simplistic descriptions that don't encompass the entire code snippet. We suspect with more training data, this could be circumvented and will produce better results. ### About Us A Describe.ai, we are focused on building Artificial Intelligence systems that can understand language as well as humans. While a long path, we plan to contribute our findings to our API to the Open Source community.