Spaces: bintangyosua/political-ideology

bintangyosua committed on Nov 7, 2024

Commit ab89e07 (verified) · 1 Parent(s): d9b24a3

Upload 8 files

Files changed (4)

Dockerfile +16 -16
README.md +36 -4
app.py +78 -1
requirements.txt +16 -17

Dockerfile CHANGED

@@ -1,16 +1,16 @@
-FROM python:3.11.4
-COPY --from=ghcr.io/astral-sh/uv:0.4.20 /uv /bin/uv
-RUN useradd -m -u 1000 user
-ENV PATH="/home/user/.local/bin:$PATH"
-ENV UV_SYSTEM_PYTHON=1
-WORKDIR /app
-COPY --chown=user ./requirements.txt requirements.txt
-RUN uv pip install -r requirements.txt
-COPY --chown=user . /app
-USER user
-CMD ["marimo", "run", "app.py", "--include-code", "--host", "0.0.0.0", "--port", "7860"]

+FROM python:3.11.4
+COPY --from=ghcr.io/astral-sh/uv:0.4.20 /uv /bin/uv
+RUN useradd -m -u 1000 user
+ENV PATH="/home/user/.local/bin:$PATH"
+ENV UV_SYSTEM_PYTHON=1
+WORKDIR /app
+COPY --chown=user ./requirements.txt requirements.txt
+RUN uv pip install -r requirements.txt
+COPY --chown=user . /app
+USER user
+CMD ["marimo", "run", "app.py", "--include-code", "--host", "0.0.0.0", "--port", "7860"]

README.md CHANGED

@@ -1,13 +1,45 @@
 ---
-title: marimo app template
+title: Political Ideologies Analysis and Classification
 emoji: 🍃
 colorFrom: indigo
 colorTo: purple
 sdk: docker
 pinned: true
 license: mit
-short_description: Template for deploying a marimo application to HF
+short_description: Analysis and Classification
 ---
-Check out marimo at <https://github.com/marimo-team/marimo>
-Check out the configuration reference at <https://huggingface.co/docs/hub/spaces-config-reference>
+# Political Ideologies Analysis
+This project analyzes political ideologies using the Political Ideologies dataset from Hugging Face. The analysis covers data preprocessing, mapping ideological labels, and visualizing political statements through Word2Vec embeddings and t-SNE projections. It also includes an interactive tool for exploring political ideologies and their related issue types in a 2D space.
+## Project Overview
+The goal of this project is to analyze the political ideologies dataset to understand the distribution of political ideologies (conservative vs liberal) and their association with various issue types. The analysis involves:
+- **Data Loading and Cleaning**: Loading, cleaning, and mapping data from the Hugging Face dataset.
+- **Label Mapping**: Mapping ideological labels (conservative and liberal) and issue types to numerical values.
+- **Word2Vec Embeddings**: Generating word embeddings for political statements to create vector representations.
+- **Dimensionality Reduction**: Using t-SNE to reduce the dimensionality of embeddings and visualize them in 2D.
+- **Interactive Visualizations**: Visualizing the data using Altair with interactive charts to explore ideology and issue type distributions.
+## Dataset
+The dataset used in this project is the [Political Ideologies dataset](https://huggingface.co/datasets/JyotiNayak/political_ideologies) from Hugging Face, which contains political statements along with their corresponding labels (conservative or liberal) and issue types (economic, environmental, social, etc.).
+## Requirements
+- Python 3.x
+- TensorFlow
+- Gensim
+- Pandas
+- NumPy
+- Matplotlib
+- Seaborn
+- Altair
+You can install the necessary dependencies with:
+```bash
+pip install -r requirements.txt
+```
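
The label-mapping step described in the README could look roughly like the sketch below. It is an illustration only, not the project's code: the column names, the `IDEOLOGY_TO_ID` and `ISSUE_TO_ID` tables, and the placeholder rows are all assumptions, and the dataset may already ship with its own integer encoding.

```python
# Hypothetical label tables; the real dataset may use a different encoding.
IDEOLOGY_TO_ID = {"conservative": 0, "liberal": 1}
ISSUE_TO_ID = {"economic": 0, "environmental": 1, "social": 2}


def encode_row(row):
    """Return a copy of `row` with numeric ids added for ideology and issue type."""
    encoded = dict(row)
    # Normalize case and whitespace before the lookup so "Liberal " still maps.
    encoded["label_id"] = IDEOLOGY_TO_ID[row["label"].strip().lower()]
    encoded["issue_type_id"] = ISSUE_TO_ID[row["issue_type"].strip().lower()]
    return encoded


# Placeholder rows, for illustration only (not taken from the dataset).
rows = [
    {"statement": "example statement 1", "label": "Conservative", "issue_type": "economic"},
    {"statement": "example statement 2", "label": "liberal", "issue_type": "Environmental"},
]
encoded_rows = [encode_row(r) for r in rows]
```

An unknown label raises `KeyError` here, which makes unexpected values visible early instead of silently mapping them to a default.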

app.py CHANGED

@@ -33,12 +33,31 @@ def __():
     from gensim.models import Word2Vec
     from sklearn.manifold import TSNE
+    import tensorflow as tf
+    from tensorflow.keras.models import Sequential
+    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
     mo.md("""
     ## 1. Import all libraries needed
     The initial cells import the necessary libraries for data handling, visualization, and word embedding.
     """)
-    return TSNE, Word2Vec, alt, mo, np, pd, plt, sns
+    return (
+        Bidirectional,
+        Dense,
+        Embedding,
+        LSTM,
+        Sequential,
+        TSNE,
+        Word2Vec,
+        alt,
+        mo,
+        np,
+        pd,
+        plt,
+        sns,
+        tf,
+    )
 @app.cell(hide_code=True)
@@ -325,5 +344,63 @@ def __(mo):
     return
+@app.cell
+def __(mo):
+    mo.md(r"""## Building Bidirectional LSTM Model""")
+    return
+@app.cell
+def __():
+    max_length = 100
+    embedding_dim = 100
+    num_classes = 2
+    return embedding_dim, max_length, num_classes
+@app.cell
+def __(
+    Bidirectional,
+    Dense,
+    Embedding,
+    LSTM,
+    Sequential,
+    embedding_dim,
+    max_length,
+    num_classes,
+    word2vec_model,
+):
+    model = Sequential()
+    model.add(Embedding(input_dim=len(word2vec_model.wv.index_to_key), output_dim=embedding_dim, input_length=max_length))
+    model.add(Bidirectional(LSTM(64, return_sequences=False)))
+    model.add(Dense(num_classes, activation='softmax'))
+    return (model,)
+@app.cell
+def __(model):
+    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
+    model.summary()
+    return
+@app.cell
+def __(df, np):
+    X = np.vstack(df['embedding'].values)
+    y = df['label'].values
+    return X, y
+@app.cell
+def __(X, model, y):
+    model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)
+    return
+@app.cell
+def __():
+    return
 if __name__ == "__main__":
     app.run()
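
A note on the inputs to the model added in this commit: `Embedding(input_dim=..., input_length=max_length)` looks up rows by integer token index, so it expects each statement as a sequence of `max_length` integers, while `X` is stacked from `df['embedding']`. If that column holds float vectors (an assumption, since the cell that builds it is not shown in this diff), the two do not match. Below is a minimal sketch of one way to produce index sequences in plain Python; `build_vocab`, `encode`, the reserved ids 0 and 1, and the toy vocabulary are hypothetical and not part of the commit.

```python
PAD_ID = 0  # hypothetical id reserved for padding
OOV_ID = 1  # hypothetical id reserved for out-of-vocabulary tokens


def build_vocab(index_to_key):
    """Map each token to an integer id, leaving 0 and 1 for PAD and OOV."""
    return {token: i + 2 for i, token in enumerate(index_to_key)}


def encode(tokens, vocab, max_length):
    """Convert tokens to ids, then truncate or right-pad to exactly max_length."""
    ids = [vocab.get(tok, OOV_ID) for tok in tokens[:max_length]]
    return ids + [PAD_ID] * (max_length - len(ids))


# Toy stand-in for word2vec_model.wv.index_to_key.
toy_index_to_key = ["tax", "policy", "climate"]
vocab = build_vocab(toy_index_to_key)

sequence = encode(["climate", "policy", "reform"], vocab, max_length=5)
print(sequence)  # [4, 3, 1, 0, 0]
```

With this scheme, the `Embedding` layer's `input_dim` would need to be `len(vocab) + 2` to cover the two reserved ids.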

requirements.txt CHANGED

@@ -1,17 +1,16 @@
-marimo
-pandas
-numpy
-scipy==1.10.1
-pyarrow
-matplotlib
-seaborn
-altair
-gensim
-scikit-learn
-# Or a specific version
-# marimo>=0.9.0
-# Add other dependencies as needed

+marimo
+pandas
+numpy
+scipy==1.10.1
+matplotlib
+seaborn
+altair
+gensim
+scikit-learn
+# Or a specific version
+# marimo>=0.9.0
+# Add other dependencies as needed
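
`app.py` now imports `tensorflow`, but no `tensorflow` entry appears in the `requirements.txt` lines shown above. Gaps like this could be spotted with a small standard-library check along the lines of the sketch below. The `IMPORT_TO_PACKAGE` table and the inline sample strings are assumptions for illustration, and the check is deliberately naive: it ignores extras and transitive dependencies, and it would also flag standard-library imports.

```python
import ast

# Hypothetical mapping for imports whose module name differs from the package name.
IMPORT_TO_PACKAGE = {"sklearn": "scikit-learn"}


def imported_modules(source):
    """Collect top-level module names from absolute import statements in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def listed_packages(requirements_text):
    """Parse package names from requirements text, skipping comments and blanks."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            # Drop simple version specifiers such as "scipy==1.10.1".
            for sep in ("==", ">=", "<=", "~=", ">", "<"):
                line = line.split(sep, 1)[0]
            names.add(line.strip().lower())
    return names


def missing_packages(source, requirements_text):
    """Return imported packages that are not listed in the requirements text."""
    listed = listed_packages(requirements_text)
    needed = {IMPORT_TO_PACKAGE.get(m, m).lower() for m in imported_modules(source)}
    return sorted(needed - listed)


sample_source = (
    "import tensorflow as tf\n"
    "from sklearn.manifold import TSNE\n"
    "from gensim.models import Word2Vec\n"
)
sample_requirements = "gensim\nscikit-learn\nscipy==1.10.1\n# marimo>=0.9.0\n"
print(missing_packages(sample_source, sample_requirements))  # ['tensorflow']
```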