Upload 4 files
- README.md +86 -14
- requirements.txt +0 -0
- runtime.txt +1 -0
- utils.py +42 -0
README.md
CHANGED
@@ -1,14 +1,86 @@
# DataBot: AI-Driven Data Analyst

An interactive data analysis application built with Streamlit and LangChain that helps users analyze and visualize data through natural language conversations.

## Project Structure

```
data-analytics-bot/
├── app.py              # Main application
├── utils.py            # Helper functions
├── requirements.txt    # Dependencies
├── .gitignore          # Git ignore file
└── README.md           # Project documentation
```

## Installation

1. Clone the repository:
   ```
   git clone <repository-url>
   cd data-analytics-bot
   ```

2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```

3. Create a `.env` file in the project root and set `GOOGLE_API_KEY`, which `utils.py` reads when it builds the Gemini-backed agent.
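
Before launching the app, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch, assuming `GOOGLE_API_KEY` (the name `utils.py` reads) is the only variable required; the helper `require_env` is hypothetical and not part of the project.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper, not part of the project.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


if __name__ == "__main__":
    # Assumption: GOOGLE_API_KEY is the only variable the app needs.
    try:
        require_env("GOOGLE_API_KEY")
        print("GOOGLE_API_KEY found")
    except RuntimeError as err:
        print(err)
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real environment.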

## Usage

1. Start the Streamlit app:
   ```
   streamlit run app.py
   ```

2. Open the app in your web browser at `http://localhost:8501` (Streamlit's default port).

## Features

- **Dark Mode Interface**: Clean, modern dark theme for comfortable viewing
- **Data Analysis**:
  - CSV file upload and processing
  - Basic statistical analysis
  - Pattern recognition
  - Data type detection

- **Interactive Visualizations**:
  - Bar charts
  - Scatter plots
  - Line graphs
  - Histograms
  - Box plots
  - Heat maps
  - Pair plots

- **AI-Powered Chat**:
  - Natural language queries
  - Data insights generation
  - Pattern analysis
  - Statistical summaries

## Importance in the Market

DataBot combines LangChain with Python packages such as pandas, Matplotlib, and Seaborn to analyze and plot uploaded data. Businesses and researchers often need quick answers from a dataset without writing analysis code, and DataBot lets users with varying levels of technical expertise ask those questions in plain language.

## Minimum Viable Product (MVP)

The MVP of the Data Analytics Bot includes the following functionality:
- Upload and process CSV files.
- Perform basic statistical analysis.
- Generate visualizations such as bar plots, scatter plots, and histograms.
- Provide an interactive chat interface for querying data.
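
As a rough illustration of the "upload a CSV and compute basic statistics" path, here is a minimal sketch that uses only the Python standard library. The project itself reads CSV files with pandas, so this is not its implementation; the function name `summarize_numeric_columns` and the sample data are hypothetical.

```python
import csv
import io
import statistics


def summarize_numeric_columns(csv_text):
    """Return {column: {"count", "mean", "min", "max"}} for numeric columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    if not rows:
        return summary
    for column in rows[0].keys():
        values = []
        for row in rows:
            try:
                values.append(float(row[column]))
            except (TypeError, ValueError):
                # Treat the column as non-numeric if any cell fails to parse.
                values = None
                break
        if values:
            summary[column] = {
                "count": len(values),
                "mean": statistics.mean(values),
                "min": min(values),
                "max": max(values),
            }
    return summary


# Hypothetical sample data: one text column and two numeric columns.
sample = "city,sales,units\nOslo,10.0,1\nRome,20.0,3\n"
print(summarize_numeric_columns(sample))
```

The text column `city` is skipped because its cells do not parse as numbers, which loosely mirrors the "data type detection" step listed under Features.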

## Dev Challenge Reference

This project was developed as part of the [GitHub Dev Challenge](https://dev.to/challenges/github). The challenge prompts inspired the creation of a tool that not only showcases technical skills but also addresses real-world data analysis needs. By participating in this challenge, we aimed to demonstrate the practical applications of LangChain and Python in building a user-friendly data analytics solution.

## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any enhancements or bug fixes.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
requirements.txt
ADDED
Binary file (374 Bytes).
runtime.txt
ADDED
@@ -0,0 +1 @@
python-3.9.18
utils.py
ADDED
@@ -0,0 +1,42 @@
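import ast
import os

import pandas as pd
from dotenv import load_dotenv
from langchain_experimental.agents import create_pandas_dataframe_agent
from langchain_google_genai import ChatGoogleGenerativeAI

# Load GOOGLE_API_KEY (and any other settings) from a local .env file.
load_dotenv()


def format_agent_output(output):
    """Return agent output as a DataFrame when possible, otherwise as text."""
    if isinstance(output, pd.DataFrame):
        return output
    if isinstance(output, str):
        if 'DataFrame' in output or 'describe' in output:
            payload = output.split('Input:')[-1].strip()
            try:
                # literal_eval accepts only Python literals, so text produced
                # by the model is parsed rather than executed as eval() would.
                return pd.DataFrame(ast.literal_eval(payload))
            except (ValueError, SyntaxError, TypeError):
                # Not a literal that pandas can tabulate; keep the raw text.
                return output
        return output
    return str(output)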


def readData(path):
    """Read a CSV file from `path` into a DataFrame."""
    try:
        return pd.read_csv(path)
    except Exception as e:
        # Chain the original exception so the traceback keeps the root cause.
        raise RuntimeError(f"Error reading data: {e}") from e


def getAgent(data):
    """Build a LangChain pandas agent backed by Gemini for the DataFrame `data`."""
    try:
        llm = ChatGoogleGenerativeAI(
            model="gemini-pro",
            temperature=0.5,
            google_api_key=os.environ.get("GOOGLE_API_KEY"),
        )

        # allow_dangerous_code=True lets the agent run LLM-generated Python
        # against the DataFrame; enable it only in a trusted or sandboxed setup.
        agent = create_pandas_dataframe_agent(
            llm,
            data,
            verbose=True,
            handle_parsing_errors=True,
            allow_dangerous_code=True,
        )
        return agent
    except Exception as e:
        raise RuntimeError(f"Error creating agent: {e}") from e
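
`format_agent_output` turns the text after the last `Input:` marker into a Python object before handing it to pandas. The sketch below isolates that parsing step using `ast.literal_eval`, which accepts only literals and so cannot run code embedded in model output. `parse_literal_payload` is a hypothetical helper name, and pandas is left out so the snippet runs on the standard library alone.

```python
import ast


def parse_literal_payload(text, marker="Input:"):
    """Parse the Python literal that follows the last `marker` in `text`.

    Returns the parsed object, or None when the payload is not a plain
    literal (for example, a function call or another expression).
    """
    payload = text.split(marker)[-1].strip()
    try:
        return ast.literal_eval(payload)
    except (ValueError, SyntaxError):
        return None


# A dictionary literal is parsed into a dict.
print(parse_literal_payload("Action Input: {'a': [1, 2], 'b': [3, 4]}"))
# An expression that would execute code is rejected and yields None.
print(parse_literal_payload("Input: __import__('os').getcwd()"))
```

Returning `None` for anything that is not a literal leaves the caller free to fall back to showing the raw text.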