LSTM and Seq-to-Seq Language Translator

This project implements language translation using two approaches:
- LSTM-based Translator: A model that translates between English and Hebrew using a basic encoder-decoder architecture.
- Seq-to-Seq Translator: A sequence-to-sequence model without attention for bidirectional translation between English and Hebrew.

Both models are trained on a parallel dataset of 1000 sentence pairs and evaluated using BLEU and CHRF scores.
Model Architectures
1. LSTM-Based Translator

The LSTM model is built with the following components:
- Encoder: Embedding and LSTM layers to encode English input sequences into latent representations.
- Decoder: Embedding and LSTM layers initialized with the encoder's states, generating Hebrew translations token-by-token.
- Dense Layer: A fully connected output layer with a softmax activation to predict the next word in the sequence.

2. Seq-to-Seq Translator

The Seq-to-Seq model uses:
- Encoder: Similar to the LSTM-based translator, this encodes the input sequence into context vectors.
- Decoder: Predicts the target sequence without attention, relying entirely on the encoded context.
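Below is a minimal sketch of this encoder-decoder structure using the Keras functional API. The layer sizes (embedding dimension 128, 256 LSTM units) and the variable names (eng_vocab_size, heb_vocab_size) are illustrative assumptions, not values taken from the project.

```python
# Minimal encoder-decoder sketch; sizes and names are illustrative, not the project's exact values.
from tensorflow.keras.layers import Dense, Embedding, Input, LSTM
from tensorflow.keras.models import Model

eng_vocab_size = 5000   # assumption: real values come from the fitted tokenizers
heb_vocab_size = 5000
embed_dim, units = 128, 256

# Encoder: embeds the English sequence and keeps only the final LSTM states.
enc_inputs = Input(shape=(None,), name="encoder_inputs")
enc_emb = Embedding(eng_vocab_size, embed_dim, mask_zero=True)(enc_inputs)
_, state_h, state_c = LSTM(units, return_state=True)(enc_emb)

# Decoder: starts from the encoder's states and predicts the Hebrew sequence token by token.
dec_inputs = Input(shape=(None,), name="decoder_inputs")
dec_emb = Embedding(heb_vocab_size, embed_dim, mask_zero=True)(dec_inputs)
dec_outputs, _, _ = LSTM(units, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[state_h, state_c])

# Dense softmax layer over the Hebrew vocabulary scores the next word at every step.
outputs = Dense(heb_vocab_size, activation="softmax")(dec_outputs)
model = Model([enc_inputs, dec_inputs], outputs)
```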
Dataset

The models are trained on a custom parallel dataset containing 1000 English-Hebrew sentence pairs, formatted as JSON with the fields english and hebrew. The Hebrew text includes start-of-sequence and end-of-sequence tokens to support decoding.
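A sketch of how such a file might be loaded; the file name, the example sentence, and the literal <start>/<end> marker strings are assumptions used only to illustrate the format.

```python
import json

# Assumed file name and layout: a JSON list of {"english": ..., "hebrew": ...} objects,
# with the Hebrew side wrapped in start/end markers, e.g. "<start> בוקר טוב <end>".
with open("eng_heb_pairs.json", encoding="utf-8") as f:
    pairs = json.load(f)

eng_sentences = [p["english"] for p in pairs]
heb_sentences = [p["hebrew"] for p in pairs]
```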
Preprocessing:

- Tokenization: Text is tokenized using Keras' Tokenizer.
- Padding: Sequences are padded to a fixed length for training.
- Vocabulary Sizes: English: [English Vocabulary Size], Hebrew: [Hebrew Vocabulary Size]
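A tokenization and padding sketch under the same assumptions as above; the relaxed filter set on the Hebrew tokenizer (so the start/end markers survive) and the fixed length of 20 are guesses, not the project's settings.

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

max_len = 20   # assumed fixed sequence length

# One tokenizer per language, fitted on the raw sentences.
eng_tokenizer = Tokenizer()
eng_tokenizer.fit_on_texts(eng_sentences)
eng_seqs = pad_sequences(eng_tokenizer.texts_to_sequences(eng_sentences),
                         maxlen=max_len, padding="post")

# Drop < and > from the default filters so the start/end markers are kept as tokens.
heb_tokenizer = Tokenizer(filters='!"#$%&()*+,-./:;=?@[\\]^_`{|}~\t\n')
heb_tokenizer.fit_on_texts(heb_sentences)
heb_seqs = pad_sequences(heb_tokenizer.texts_to_sequences(heb_sentences),
                         maxlen=max_len, padding="post")

# Index 0 is reserved for padding, hence the +1.
eng_vocab_size = len(eng_tokenizer.word_index) + 1
heb_vocab_size = len(heb_tokenizer.word_index) + 1
```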
Training Details

Training Parameters:
- Optimizer: Adam
- Loss Function: Sparse Categorical Crossentropy
- Batch Size: 32
- Epochs: 20
- Validation Split: 20%
- Checkpoints: Models are saved at their best-performing stages based on validation loss using Keras' ModelCheckpoint.
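A minimal training sketch with the parameters listed above. The teacher-forcing split (decoder input is the Hebrew sequence one step behind the target) and the checkpoint file name are assumptions.

```python
from tensorflow.keras.callbacks import ModelCheckpoint

# Teacher forcing: the decoder reads tokens up to position t and is trained to predict t + 1.
decoder_input = heb_seqs[:, :-1]
decoder_target = heb_seqs[:, 1:]

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Keep only the weights with the best validation loss, as described above.
checkpoint = ModelCheckpoint("best_model.keras",   # assumed file name
                             monitor="val_loss",
                             save_best_only=True)

model.fit([eng_seqs, decoder_input], decoder_target,
          batch_size=32,
          epochs=20,
          validation_split=0.2,
          callbacks=[checkpoint])
```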
Training Metrics:

Both models track:
- Training Loss
- Validation Loss

Evaluation Metrics
1. BLEU Score: The BLEU metric evaluates the quality of translations by comparing them to reference translations. Higher BLEU scores indicate better translations.
   - LSTM Model BLEU: [BLEU Score for LSTM]
   - Seq-to-Seq Model BLEU: [BLEU Score for Seq-to-Seq]

2. CHRF Score: The CHRF metric evaluates translations using character-level F-scores. Higher CHRF scores indicate better translations.
   - LSTM Model CHRF: [CHRF Score for LSTM]
   - Seq-to-Seq Model CHRF: [CHRF Score for Seq-to-Seq]
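A corpus-level scoring sketch. It uses the sacrebleu package for both metrics, which is an approximation of the tools listed under Acknowledgments, and assumes hypotheses and references are already decoded Hebrew strings with the start/end markers stripped.

```python
import sacrebleu

# Assumption: `hypotheses` holds the model's decoded translations and `references`
# the gold Hebrew sentences; sacrebleu takes a list of reference streams, with one
# entry per hypothesis in each stream.
hypotheses = ["בוקר טוב"]        # illustrative placeholder output
references = [["בוקר טוב"]]      # a single reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}  CHRF: {chrf.score:.2f}")
```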
Results

- Training Loss Comparison: The Seq-to-Seq model achieved slightly better convergence than the LSTM model due to its structured architecture.
- Translation Quality: The BLEU and CHRF scores indicate that both models produce reasonable translations, with the Seq-to-Seq model performing better on longer sentences.

Acknowledgments

- Dataset: [Custom Parallel Dataset]
- Evaluation Tools: PyTorch BLEU, SacreBLEU CHRF