metadata
title: Gemini Image to Code
emoji: 🌠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false

Gemini Image to Code

A web application that uses Google's Gemini Pro Vision model to convert images into code. Upload an image and get an interactive p5.js sketch that represents it.

Features

  • Image to code conversion using Gemini Pro Vision
  • Real-time code preview
  • Support for various image formats
  • Modern, responsive UI
  • Code syntax highlighting

Environment Variables

The following environment variables are required:

  • GEMINI_API_KEY: Your Google Gemini API key
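Since the app cannot call Gemini without this key, it can help to fail early with a clear message when the variable is missing. The following is a minimal sketch of such a check, not the project's actual code; the helper name `requireGeminiApiKey` is hypothetical, and it assumes a Node.js runtime where `process.env` is available.

```javascript
// Hypothetical helper (not part of this repo): return the Gemini API key
// or throw a descriptive error if it is missing or blank.
// `env` defaults to process.env so the check can be exercised with a plain object.
function requireGeminiApiKey(env = process.env) {
  const key = (env.GEMINI_API_KEY || "").trim();
  if (!key) {
    throw new Error(
      "GEMINI_API_KEY is not set. Export it in your shell or pass it with `docker run -e`."
    );
  }
  return key;
}
```

Calling a check like this once at server startup, or at the top of an API route, surfaces a missing key before any request reaches the model.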

Development

To run this project locally:

npm install
npm run dev

Docker

To run with Docker:

docker build -t gemini-image-to-code .
docker run -p 7860:7860 -e GEMINI_API_KEY=your_key_here gemini-image-to-code

Prompt Transparency

The prompt to transform images into p5js sketches can be found in pages/index.js.

You are a creative coding expert who turns images into clever code sketches using p5js. A user will upload an image and you will generate an interactive p5js sketch that represents the image. The code sketch always has some sort of interactive element that connects to the nature of the object in the real world.

## EXAMPLES

Here are some examples of what I mean by how the type of image could be turned into a clever creative coding sketch to capture the essence of the image.
- A photo of birds --> a boids flocking algorithm sketch where the boids follow your mouse 
- A photo of a tree --> a recursive fractal tree that grows as you move your mouse up and down
- A photo of a pond --> a sketch that has a ripple animation on mouse click
- A photo of a wristwatch --> a beautiful functioning clock that accesses system time and displays it like the wristwatch
- A photo of a lamp --> a sketch of the lamp, but when you click the screen the lamp turns on and off
- A photo of a zipper --> a sketch representing the shapes of the zipper, and when you move your mouse up and down the zipper opens and closes like a real zipper

## PROCESS

To achieve creating this sketch, you reflect and meditate on the nature of the object BEFORE picking an algorithmic approach to represent the image. You are an agent that is thoughtful, clever, delightful, and playful.

Before you start, think about the image and the best way to represent it in p5js.

1. Describe the behavioral properties of the image. List some ways it behaves in the real world or some patterns it exhibits. Describe the colors and vibe of the image as well. 

2. Given the behavioral properties of the image, identify a common creative coding algorithm that can be paired up to this image to make a delightful p5js sketch.

3. State the bounding boxes of the important parts of the composition of the photo. We will need to use these bounding boxes to make sure our composition of our sketch resembles the composition of the photo uploaded. Our sketch's composition needs to resemble the composition of the uploaded photo.

4. Implement an algorithm in p5js, using the properties of the image described earlier. Use either mouseMoved() or mouseClicked() to make it interactive. Generate a SINGLE, COMPLETE code snippet. We parse out the response you generate, so we should have only ONE code snippet that incorporates all of the information from steps 1 (behavioral description), 2 (creative coding algorithm to bring this to life), and 3 (bounding boxes to preserve compositional integrity).

## EXECUTION

Complete all of these steps. When you write your code, be sure to leave clear comments to describe the different parts of the code and what you are doing. 

Do not EVER try to load in external images or any other libraries. Everything must be self contained in the one file and code snippet.

And don't be too verbose.
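
Step 4 of the prompt says the response is parsed and must contain only one code snippet. The sketch below illustrates what such parsing could look like. It is an assumption for illustration, not the code in pages/index.js: the function name `extractSingleCodeBlock` is hypothetical, and it assumes the model wraps its code in a triple-backtick fence.

```javascript
// Build the fence marker programmatically so this snippet can itself live in a fenced block.
const FENCE = "`".repeat(3);

// Hypothetical helper (not the app's actual parser): return the contents of the
// single fenced code block in a model response, or throw if there are zero or several.
function extractSingleCodeBlock(responseText) {
  // Opening fence, optional language tag, newline, then lazily capture up to the closing fence.
  const pattern = new RegExp(
    FENCE + "[\\w-]*[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE,
    "g"
  );
  const blocks = [...responseText.matchAll(pattern)].map((match) => match[1].trim());
  if (blocks.length !== 1) {
    throw new Error("Expected exactly one code block, found " + blocks.length);
  }
  return blocks[0];
}
```

Rejecting responses with zero or multiple blocks, rather than silently taking the first one, makes it obvious when the model ignores the "SINGLE, COMPLETE code snippet" instruction.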

## Credits

Code by [Trudy Painter](https://www.trudy.computer/). Design by [Jose Guizar](https://joseguizar.com/).

## Contributing 🤝

Contributions are welcome! See the `CONTRIBUTING.md` file for more information.

## Disclaimer

This is an experiment showcasing Gemini 2.0's capabilities, not an official Google product. We'll do our best to support and maintain this experiment but your mileage may vary. We encourage open sourcing projects as a way of learning from each other. Please respect our and other creators' rights, including copyright and trademark rights when present, when sharing these works and creating derivative work. If you want more info on Google's policy, you can find that [here](https://www.google.com/permissions/).

## License

Licensed under the Apache-2.0 license.