{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "pUWCd_Ch5J49" }, "source": [ "# Character-level recurrent sequence-to-sequence model\n", "\n", "**Author:** [fchollet](https://twitter.com/fchollet)
\n", "**Date created:** 2017/09/29
\n", "**Last modified:** 2020/04/26
\n", "**Description:** Character-level recurrent sequence-to-sequence model." ] }, { "cell_type": "markdown", "metadata": { "id": "y2uZhuQ-5J5B" }, "source": [ "## Introduction\n", "\n", "This example demonstrates how to implement a basic character-level\n", "recurrent sequence-to-sequence model. We apply it to translating\n", "short English sentences into short French sentences,\n", "character-by-character. Note that it is fairly unusual to\n", "do character-level machine translation, as word-level\n", "models are more common in this domain.\n", "\n", "**Summary of the algorithm**\n", "\n", "- We start with input sequences from a domain (e.g. English sentences)\n", "    and corresponding target sequences from another domain\n", "    (e.g. French sentences).\n", "- An encoder LSTM turns input sequences into 2 state vectors\n", "    (we keep the last LSTM state and discard the outputs).\n", "- A decoder LSTM is trained to turn the target sequences into\n", "    the same sequence but offset by one timestep in the future,\n", "    a training process called \"teacher forcing\" in this context.\n", "    It uses as initial state the state vectors from the encoder.\n", "    Effectively, the decoder learns to generate `targets[t+1...]`\n", "    given `targets[...t]`, conditioned on the input sequence.\n", "- In inference mode, when we want to decode unknown input sequences, we:\n", "    - Encode the input sequence into state vectors\n", "    - Start with a target sequence of size 1\n", "        (just the start-of-sequence character)\n", "    - Feed the state vectors and 1-char target sequence\n", "        to the decoder to produce predictions for the next character\n", "    - Sample the next character using these predictions\n", "        (we simply use argmax).\n", "    - Append the sampled character to the target sequence\n", "    - Repeat until we generate the end-of-sequence character or we\n", "        hit the character limit.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ymvVW7f55J5C" }, "source": [ "## Setup\n" ] }, {
"cell_type": "code", "execution_count": 1, "metadata": { "id": "IKzDuATV5J5C" }, "outputs": [], "source": [ "import numpy as np\n", "import tensorflow as tf\n", "from tensorflow import keras\n" ] }, { "cell_type": "markdown", "metadata": { "id": "KsdDP8835J5D" }, "source": [ "## Download the data\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "QjrXitpv5J5E", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "a5c71e87-b3c7-419e-d987-5f2551c0e236" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['Archive: fra-eng.zip',\n", " ' inflating: _about.txt ',\n", " ' inflating: fra.txt ']" ] }, "metadata": {}, "execution_count": 2 } ], "source": [ "!!curl -O http://www.manythings.org/anki/fra-eng.zip\n", "!!unzip fra-eng.zip\n" ] }, { "cell_type": "markdown", "metadata": { "id": "4Qi0m1NC5J5E" }, "source": [ "## Configuration\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "UB6qEq0b5J5F" }, "outputs": [], "source": [ "batch_size = 64  # Batch size for training.\n", "epochs = 100  # Number of epochs to train for.\n", "latent_dim = 256  # Latent dimensionality of the encoding space.\n", "num_samples = 10000  # Number of samples to train on.\n", "# Path to the data txt file on disk.\n", "data_path = \"fra.txt\"\n" ] }, { "cell_type": "markdown", "metadata": { "id": "50hqcmjH5J5F" }, "source": [ "## Prepare the data\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "XIoa7eHS5J5G", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "583ed656-723a-4c36-eede-259afa77ffba" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Number of samples: 10000\n", "Number of unique input tokens: 71\n", "Number of unique output tokens: 92\n", "Max sequence length for inputs: 15\n", "Max sequence length for outputs: 59\n" ] } ], "source": [ "# Vectorize the data.\n", "input_texts = []\n", "target_texts = []\n", "input_characters = set()\n", 
"target_characters = set()\n", "with open(data_path, \"r\", encoding=\"utf-8\") as f:\n", "    lines = f.read().split(\"\\n\")\n", "for line in lines[: min(num_samples, len(lines) - 1)]:\n", "    input_text, target_text, _ = line.split(\"\\t\")\n", "    # We use \"tab\" as the \"start sequence\" character\n", "    # for the targets, and \"\\n\" as \"end sequence\" character.\n", "    target_text = \"\\t\" + target_text + \"\\n\"\n", "    input_texts.append(input_text)\n", "    target_texts.append(target_text)\n", "    for char in input_text:\n", "        if char not in input_characters:\n", "            input_characters.add(char)\n", "    for char in target_text:\n", "        if char not in target_characters:\n", "            target_characters.add(char)\n", "\n", "input_characters = sorted(list(input_characters))\n", "target_characters = sorted(list(target_characters))\n", "num_encoder_tokens = len(input_characters)\n", "num_decoder_tokens = len(target_characters)\n", "max_encoder_seq_length = max([len(txt) for txt in input_texts])\n", "max_decoder_seq_length = max([len(txt) for txt in target_texts])\n", "\n", "print(\"Number of samples:\", len(input_texts))\n", "print(\"Number of unique input tokens:\", num_encoder_tokens)\n", "print(\"Number of unique output tokens:\", num_decoder_tokens)\n", "print(\"Max sequence length for inputs:\", max_encoder_seq_length)\n", "print(\"Max sequence length for outputs:\", max_decoder_seq_length)\n", "\n", "input_token_index = dict([(char, i) for i, char in enumerate(input_characters)])\n", "target_token_index = dict([(char, i) for i, char in enumerate(target_characters)])\n", "\n", "encoder_input_data = np.zeros(\n", "    (len(input_texts), max_encoder_seq_length, num_encoder_tokens), dtype=\"float32\"\n", ")\n", "decoder_input_data = np.zeros(\n", "    (len(input_texts), max_decoder_seq_length, num_decoder_tokens), dtype=\"float32\"\n", ")\n", "decoder_target_data = np.zeros(\n", "    (len(input_texts), max_decoder_seq_length, num_decoder_tokens), dtype=\"float32\"\n", ")\n", "\n", "for i, 
(input_text, target_text) in enumerate(zip(input_texts, target_texts)):\n", "    for t, char in enumerate(input_text):\n", "        encoder_input_data[i, t, input_token_index[char]] = 1.0\n", "    encoder_input_data[i, t + 1 :, input_token_index[\" \"]] = 1.0\n", "    for t, char in enumerate(target_text):\n", "        # decoder_target_data is ahead of decoder_input_data by one timestep\n", "        decoder_input_data[i, t, target_token_index[char]] = 1.0\n", "        if t > 0:\n", "            # decoder_target_data will be ahead by one timestep\n", "            # and will not include the start character.\n", "            decoder_target_data[i, t - 1, target_token_index[char]] = 1.0\n", "    decoder_input_data[i, t + 1 :, target_token_index[\" \"]] = 1.0\n", "    decoder_target_data[i, t:, target_token_index[\" \"]] = 1.0\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Nmmia38F5J5H" }, "source": [ "## Build the model\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "xUBfSVSH5J5H" }, "outputs": [], "source": [ "# Define an input sequence and process it.\n", "encoder_inputs = keras.Input(shape=(None, num_encoder_tokens))\n", "encoder = keras.layers.LSTM(latent_dim, return_state=True)\n", "encoder_outputs, state_h, state_c = encoder(encoder_inputs)\n", "\n", "# We discard `encoder_outputs` and only keep the states.\n", "encoder_states = [state_h, state_c]\n", "\n", "# Set up the decoder, using `encoder_states` as initial state.\n", "decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))\n", "\n", "# We set up our decoder to return full output sequences,\n", "# and to return internal states as well. 
We don't use the\n", "# return states in the training model, but we will use them in inference.\n", "decoder_lstm = keras.layers.LSTM(latent_dim, return_sequences=True, return_state=True)\n", "decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)\n", "decoder_dense = keras.layers.Dense(num_decoder_tokens, activation=\"softmax\")\n", "decoder_outputs = decoder_dense(decoder_outputs)\n", "\n", "# Define the model that will turn\n", "# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`\n", "model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "MYvCCy4i5J5I" }, "source": [ "## Train the model\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "3kgt3bnl5J5I", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "f347151f-3666-4f10-8a05-6949a2361301" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/100\n", "125/125 [==============================] - 8s 19ms/step - loss: 1.1334 - accuracy: 0.7368 - val_loss: 1.0400 - val_accuracy: 0.7264\n", "Epoch 2/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.8081 - accuracy: 0.7805 - val_loss: 0.8330 - val_accuracy: 0.7693\n", "Epoch 3/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.6407 - accuracy: 0.8185 - val_loss: 0.6837 - val_accuracy: 0.8008\n", "Epoch 4/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.5614 - accuracy: 0.8366 - val_loss: 0.6254 - val_accuracy: 0.8138\n", "Epoch 5/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.5160 - accuracy: 0.8490 - val_loss: 0.5773 - val_accuracy: 0.8346\n", "Epoch 6/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.4815 - accuracy: 0.8589 - val_loss: 0.5527 - val_accuracy: 0.8383\n", "Epoch 7/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.4538 - 
accuracy: 0.8659 - val_loss: 0.5317 - val_accuracy: 0.8430\n", "Epoch 8/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.4314 - accuracy: 0.8716 - val_loss: 0.5120 - val_accuracy: 0.8484\n", "Epoch 9/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.4118 - accuracy: 0.8768 - val_loss: 0.5096 - val_accuracy: 0.8493\n", "Epoch 10/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.3945 - accuracy: 0.8818 - val_loss: 0.4892 - val_accuracy: 0.8545\n", "Epoch 11/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.3785 - accuracy: 0.8864 - val_loss: 0.4884 - val_accuracy: 0.8550\n", "Epoch 12/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.3637 - accuracy: 0.8905 - val_loss: 0.4725 - val_accuracy: 0.8597\n", "Epoch 13/100\n", "125/125 [==============================] - 2s 14ms/step - loss: 0.3498 - accuracy: 0.8946 - val_loss: 0.4674 - val_accuracy: 0.8624\n", "Epoch 14/100\n", "125/125 [==============================] - 2s 15ms/step - loss: 0.3370 - accuracy: 0.8981 - val_loss: 0.4597 - val_accuracy: 0.8644\n", "Epoch 15/100\n", "125/125 [==============================] - 2s 14ms/step - loss: 0.3244 - accuracy: 0.9020 - val_loss: 0.4533 - val_accuracy: 0.8661\n", "Epoch 16/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.3124 - accuracy: 0.9056 - val_loss: 0.4569 - val_accuracy: 0.8655\n", "Epoch 17/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.3012 - accuracy: 0.9088 - val_loss: 0.4515 - val_accuracy: 0.8688\n", "Epoch 18/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2904 - accuracy: 0.9118 - val_loss: 0.4440 - val_accuracy: 0.8704\n", "Epoch 19/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2803 - accuracy: 0.9154 - val_loss: 0.4473 - val_accuracy: 0.8697\n", "Epoch 20/100\n", "125/125 [==============================] - 2s 13ms/step - 
loss: 0.2703 - accuracy: 0.9179 - val_loss: 0.4470 - val_accuracy: 0.8709\n", "Epoch 21/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2611 - accuracy: 0.9212 - val_loss: 0.4447 - val_accuracy: 0.8725\n", "Epoch 22/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2519 - accuracy: 0.9235 - val_loss: 0.4457 - val_accuracy: 0.8721\n", "Epoch 23/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2436 - accuracy: 0.9262 - val_loss: 0.4503 - val_accuracy: 0.8723\n", "Epoch 24/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2356 - accuracy: 0.9283 - val_loss: 0.4506 - val_accuracy: 0.8732\n", "Epoch 25/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2275 - accuracy: 0.9309 - val_loss: 0.4531 - val_accuracy: 0.8733\n", "Epoch 26/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2201 - accuracy: 0.9328 - val_loss: 0.4524 - val_accuracy: 0.8749\n", "Epoch 27/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2132 - accuracy: 0.9353 - val_loss: 0.4615 - val_accuracy: 0.8736\n", "Epoch 28/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.2064 - accuracy: 0.9370 - val_loss: 0.4609 - val_accuracy: 0.8740\n", "Epoch 29/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1999 - accuracy: 0.9390 - val_loss: 0.4612 - val_accuracy: 0.8750\n", "Epoch 30/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1933 - accuracy: 0.9411 - val_loss: 0.4701 - val_accuracy: 0.8734\n", "Epoch 31/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1877 - accuracy: 0.9427 - val_loss: 0.4718 - val_accuracy: 0.8747\n", "Epoch 32/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1816 - accuracy: 0.9443 - val_loss: 0.4749 - val_accuracy: 0.8747\n", "Epoch 33/100\n", "125/125 [==============================] - 
2s 13ms/step - loss: 0.1763 - accuracy: 0.9462 - val_loss: 0.4805 - val_accuracy: 0.8746\n", "Epoch 34/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1711 - accuracy: 0.9477 - val_loss: 0.4855 - val_accuracy: 0.8742\n", "Epoch 35/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1661 - accuracy: 0.9494 - val_loss: 0.4849 - val_accuracy: 0.8745\n", "Epoch 36/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1612 - accuracy: 0.9505 - val_loss: 0.4939 - val_accuracy: 0.8739\n", "Epoch 37/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1566 - accuracy: 0.9518 - val_loss: 0.5005 - val_accuracy: 0.8734\n", "Epoch 38/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1517 - accuracy: 0.9536 - val_loss: 0.5021 - val_accuracy: 0.8748\n", "Epoch 39/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1476 - accuracy: 0.9548 - val_loss: 0.5051 - val_accuracy: 0.8744\n", "Epoch 40/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1434 - accuracy: 0.9561 - val_loss: 0.5081 - val_accuracy: 0.8740\n", "Epoch 41/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1396 - accuracy: 0.9573 - val_loss: 0.5173 - val_accuracy: 0.8745\n", "Epoch 42/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1356 - accuracy: 0.9584 - val_loss: 0.5199 - val_accuracy: 0.8745\n", "Epoch 43/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1318 - accuracy: 0.9591 - val_loss: 0.5236 - val_accuracy: 0.8738\n", "Epoch 44/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1290 - accuracy: 0.9602 - val_loss: 0.5382 - val_accuracy: 0.8731\n", "Epoch 45/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1250 - accuracy: 0.9616 - val_loss: 0.5393 - val_accuracy: 0.8736\n", "Epoch 46/100\n", "125/125 
[==============================] - 2s 13ms/step - loss: 0.1218 - accuracy: 0.9624 - val_loss: 0.5392 - val_accuracy: 0.8734\n", "Epoch 47/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1189 - accuracy: 0.9633 - val_loss: 0.5483 - val_accuracy: 0.8742\n", "Epoch 48/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1159 - accuracy: 0.9642 - val_loss: 0.5486 - val_accuracy: 0.8740\n", "Epoch 49/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1127 - accuracy: 0.9652 - val_loss: 0.5606 - val_accuracy: 0.8734\n", "Epoch 50/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1104 - accuracy: 0.9654 - val_loss: 0.5610 - val_accuracy: 0.8738\n", "Epoch 51/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1075 - accuracy: 0.9664 - val_loss: 0.5674 - val_accuracy: 0.8735\n", "Epoch 52/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1050 - accuracy: 0.9673 - val_loss: 0.5702 - val_accuracy: 0.8731\n", "Epoch 53/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1027 - accuracy: 0.9679 - val_loss: 0.5756 - val_accuracy: 0.8732\n", "Epoch 54/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.1004 - accuracy: 0.9684 - val_loss: 0.5783 - val_accuracy: 0.8736\n", "Epoch 55/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0978 - accuracy: 0.9691 - val_loss: 0.5838 - val_accuracy: 0.8729\n", "Epoch 56/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0955 - accuracy: 0.9700 - val_loss: 0.5851 - val_accuracy: 0.8736\n", "Epoch 57/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0934 - accuracy: 0.9703 - val_loss: 0.5969 - val_accuracy: 0.8722\n", "Epoch 58/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0913 - accuracy: 0.9709 - val_loss: 0.6024 - val_accuracy: 0.8723\n", "Epoch 
59/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0890 - accuracy: 0.9717 - val_loss: 0.6073 - val_accuracy: 0.8723\n", "Epoch 60/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0873 - accuracy: 0.9720 - val_loss: 0.6113 - val_accuracy: 0.8731\n", "Epoch 61/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0858 - accuracy: 0.9725 - val_loss: 0.6190 - val_accuracy: 0.8726\n", "Epoch 62/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0836 - accuracy: 0.9732 - val_loss: 0.6139 - val_accuracy: 0.8731\n", "Epoch 63/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0819 - accuracy: 0.9737 - val_loss: 0.6242 - val_accuracy: 0.8725\n", "Epoch 64/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0803 - accuracy: 0.9740 - val_loss: 0.6318 - val_accuracy: 0.8709\n", "Epoch 65/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0784 - accuracy: 0.9748 - val_loss: 0.6384 - val_accuracy: 0.8728\n", "Epoch 66/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0768 - accuracy: 0.9749 - val_loss: 0.6392 - val_accuracy: 0.8721\n", "Epoch 67/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0755 - accuracy: 0.9754 - val_loss: 0.6453 - val_accuracy: 0.8718\n", "Epoch 68/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0741 - accuracy: 0.9758 - val_loss: 0.6492 - val_accuracy: 0.8716\n", "Epoch 69/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0720 - accuracy: 0.9765 - val_loss: 0.6505 - val_accuracy: 0.8720\n", "Epoch 70/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0711 - accuracy: 0.9768 - val_loss: 0.6605 - val_accuracy: 0.8720\n", "Epoch 71/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0698 - accuracy: 0.9771 - val_loss: 0.6621 - val_accuracy: 
0.8714\n", "Epoch 72/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0685 - accuracy: 0.9774 - val_loss: 0.6656 - val_accuracy: 0.8721\n", "Epoch 73/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0668 - accuracy: 0.9778 - val_loss: 0.6736 - val_accuracy: 0.8715\n", "Epoch 74/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0654 - accuracy: 0.9782 - val_loss: 0.6759 - val_accuracy: 0.8713\n", "Epoch 75/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0642 - accuracy: 0.9786 - val_loss: 0.6830 - val_accuracy: 0.8717\n", "Epoch 76/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0633 - accuracy: 0.9789 - val_loss: 0.6856 - val_accuracy: 0.8705\n", "Epoch 77/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0623 - accuracy: 0.9792 - val_loss: 0.6924 - val_accuracy: 0.8714\n", "Epoch 78/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0608 - accuracy: 0.9795 - val_loss: 0.6958 - val_accuracy: 0.8709\n", "Epoch 79/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0601 - accuracy: 0.9798 - val_loss: 0.7000 - val_accuracy: 0.8712\n", "Epoch 80/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0589 - accuracy: 0.9799 - val_loss: 0.6989 - val_accuracy: 0.8719\n", "Epoch 81/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0577 - accuracy: 0.9804 - val_loss: 0.7021 - val_accuracy: 0.8704\n", "Epoch 82/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0571 - accuracy: 0.9806 - val_loss: 0.7111 - val_accuracy: 0.8705\n", "Epoch 83/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0562 - accuracy: 0.9808 - val_loss: 0.7124 - val_accuracy: 0.8715\n", "Epoch 84/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0549 - accuracy: 0.9812 - val_loss: 0.7160 
- val_accuracy: 0.8711\n", "Epoch 85/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0541 - accuracy: 0.9815 - val_loss: 0.7220 - val_accuracy: 0.8707\n", "Epoch 86/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0537 - accuracy: 0.9817 - val_loss: 0.7173 - val_accuracy: 0.8711\n", "Epoch 87/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0521 - accuracy: 0.9820 - val_loss: 0.7312 - val_accuracy: 0.8702\n", "Epoch 88/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0514 - accuracy: 0.9822 - val_loss: 0.7252 - val_accuracy: 0.8718\n", "Epoch 89/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0507 - accuracy: 0.9825 - val_loss: 0.7324 - val_accuracy: 0.8703\n", "Epoch 90/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0503 - accuracy: 0.9824 - val_loss: 0.7375 - val_accuracy: 0.8696\n", "Epoch 91/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0493 - accuracy: 0.9829 - val_loss: 0.7417 - val_accuracy: 0.8699\n", "Epoch 92/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0485 - accuracy: 0.9831 - val_loss: 0.7448 - val_accuracy: 0.8712\n", "Epoch 93/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0484 - accuracy: 0.9831 - val_loss: 0.7448 - val_accuracy: 0.8699\n", "Epoch 94/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0470 - accuracy: 0.9834 - val_loss: 0.7461 - val_accuracy: 0.8709\n", "Epoch 95/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0468 - accuracy: 0.9834 - val_loss: 0.7468 - val_accuracy: 0.8712\n", "Epoch 96/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0463 - accuracy: 0.9838 - val_loss: 0.7601 - val_accuracy: 0.8701\n", "Epoch 97/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0456 - accuracy: 0.9839 - 
val_loss: 0.7589 - val_accuracy: 0.8702\n", "Epoch 98/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0448 - accuracy: 0.9840 - val_loss: 0.7604 - val_accuracy: 0.8709\n", "Epoch 99/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0445 - accuracy: 0.9840 - val_loss: 0.7593 - val_accuracy: 0.8701\n", "Epoch 100/100\n", "125/125 [==============================] - 2s 13ms/step - loss: 0.0442 - accuracy: 0.9842 - val_loss: 0.7654 - val_accuracy: 0.8698\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "WARNING:absl:Found untraced functions such as lstm_cell_layer_call_fn, lstm_cell_layer_call_and_return_conditional_losses, lstm_cell_1_layer_call_fn, lstm_cell_1_layer_call_and_return_conditional_losses, lstm_cell_layer_call_fn while saving (showing 5 of 10). These functions will not be directly callable after loading.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "INFO:tensorflow:Assets written to: s2s/assets\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "INFO:tensorflow:Assets written to: s2s/assets\n", "WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.\n", "WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming to avoid naming conflicts when loading with `tf.keras.models.load_model`. 
If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.\n" ] } ], "source": [ "# early_stopping_patience = 10\n", "\n", "# # Add early stopping\n", "# early_stopping = keras.callbacks.EarlyStopping(\n", "#     monitor=\"val_accuracy\", patience=early_stopping_patience, restore_best_weights=True\n", "# )\n", "\n", "model.compile(\n", "    optimizer=\"rmsprop\", loss=\"categorical_crossentropy\", metrics=[\"accuracy\"]\n", ")\n", "model.fit(\n", "    [encoder_input_data, decoder_input_data],\n", "    decoder_target_data,\n", "    batch_size=batch_size,\n", "    epochs=epochs,\n", "    validation_split=0.2,\n", "    # callbacks=[early_stopping]\n", ")\n", "# Save model\n", "model.save(\"s2s\")\n" ] }, { "cell_type": "markdown", "metadata": { "id": "HxkS8_Pf5J5I" }, "source": [ "## Run inference (sampling)\n", "\n", "1. Encode the input and retrieve the initial decoder state.\n", "2. Run one step of the decoder with this initial state\n", "    and a \"start of sequence\" token as target.\n", "    Output will be the next target token.\n", "3. 
Repeat with the current target token and current states\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "-KKcZuAa5J5I" }, "outputs": [], "source": [ "# Define sampling models\n", "# Restore the model and construct the encoder and decoder.\n", "model = keras.models.load_model(\"s2s\")\n", "\n", "encoder_inputs = model.input[0]  # input_1\n", "encoder_outputs, state_h_enc, state_c_enc = model.layers[2].output  # lstm_1\n", "encoder_states = [state_h_enc, state_c_enc]\n", "encoder_model = keras.Model(encoder_inputs, encoder_states)\n", "\n", "decoder_inputs = model.input[1]  # input_2\n", "decoder_state_input_h = keras.Input(shape=(latent_dim,))\n", "decoder_state_input_c = keras.Input(shape=(latent_dim,))\n", "decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]\n", "decoder_lstm = model.layers[3]\n", "decoder_outputs, state_h_dec, state_c_dec = decoder_lstm(\n", "    decoder_inputs, initial_state=decoder_states_inputs\n", ")\n", "decoder_states = [state_h_dec, state_c_dec]\n", "decoder_dense = model.layers[4]\n", "decoder_outputs = decoder_dense(decoder_outputs)\n", "decoder_model = keras.Model(\n", "    [decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states\n", ")\n", "\n", "# Reverse-lookup token index to decode sequences back to\n", "# something readable.\n", "reverse_input_char_index = dict((i, char) for char, i in input_token_index.items())\n", "reverse_target_char_index = dict((i, char) for char, i in target_token_index.items())\n", "\n", "\n", "def decode_sequence(input_seq):\n", "    # Encode the input as state vectors.\n", "    states_value = encoder_model.predict(input_seq)\n", "\n", "    # Generate empty target sequence of length 1.\n", "    target_seq = np.zeros((1, 1, num_decoder_tokens))\n", "    # Populate the first character of target sequence with the start character.\n", "    target_seq[0, 0, target_token_index[\"\\t\"]] = 1.0\n", "\n", "    # Sampling loop for a batch of sequences\n", "    # (to simplify, here we 
assume a batch of size 1).\n", "    stop_condition = False\n", "    decoded_sentence = \"\"\n", "    while not stop_condition:\n", "        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)\n", "\n", "        # Sample a token\n", "        sampled_token_index = np.argmax(output_tokens[0, -1, :])\n", "        sampled_char = reverse_target_char_index[sampled_token_index]\n", "        decoded_sentence += sampled_char\n", "\n", "        # Exit condition: either hit max length\n", "        # or find stop character.\n", "        if sampled_char == \"\\n\" or len(decoded_sentence) > max_decoder_seq_length:\n", "            stop_condition = True\n", "\n", "        # Update the target sequence (of length 1).\n", "        target_seq = np.zeros((1, 1, num_decoder_tokens))\n", "        target_seq[0, 0, sampled_token_index] = 1.0\n", "\n", "        # Update states\n", "        states_value = [h, c]\n", "    return decoded_sentence\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "pLvBXjXg5J5J" }, "source": [ "You can now generate decoded sentences as such:\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "7fG4EDSX5J5J", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "84f4486e-fc08-4269-fed2-48628b568240" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "-\n", "Input sentence: Go.\n", "Decoded sentence: Bouge !\n", "\n", "-\n", "Input sentence: Go.\n", "Decoded sentence: Bouge !\n", "\n", "-\n", "Input sentence: Go.\n", "Decoded sentence: Bouge !\n", "\n", "-\n", "Input sentence: Hi.\n", "Decoded sentence: Salut.\n", "\n", "-\n", "Input sentence: Hi.\n", "Decoded sentence: Salut.\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez 
!\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run!\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n", "-\n", "Input sentence: Run.\n", "Decoded sentence: Courez !\n", "\n" ] } ], "source": [ "for seq_index in range(20):\n", "    # Take one sequence (part of the training set)\n", "    # for trying out decoding.\n", "    input_seq = encoder_input_data[seq_index : seq_index + 1]\n", "    decoded_sentence = decode_sequence(input_seq)\n", "    print(\"-\")\n", "    print(\"Input sentence:\", input_texts[seq_index])\n", "    print(\"Decoded sentence:\", decoded_sentence)\n" ] }, { "cell_type": "code", "source": [ "import json" ], "metadata": { "id": "bqV-cbvJA5hd" }, "execution_count": 10, "outputs": [] }, { "cell_type": "code", "source": [ "with open(\"input_vocab.json\", \"w\", encoding='utf-8') as outfile:\n", "    json.dump(input_token_index, outfile, ensure_ascii=False)\n", "with open(\"target_vocab.json\", \"w\", encoding='utf-8') as outfile:\n", "    json.dump(target_token_index, outfile, ensure_ascii=False)" ], "metadata": { "id": "jXPS4ycZ9A9o" }, "execution_count": 13, "outputs": [] }, { "cell_type": "code", "source": [ "!pip install huggingface-hub\n", "!sudo apt-get install git-lfs\n", "!git-lfs install" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "MCQ_ND66BXn9", "outputId": "f58a6d0d-2c4b-4fb6-f44e-43b8167a5ded" }, "execution_count": 14, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Collecting huggingface-hub\n", " Downloading 
huggingface_hub-0.4.0-py3-none-any.whl (67 kB)\n", "\u001b[?25l\r\u001b[K |█████ | 10 kB 35.3 MB/s eta 0:00:01\r\u001b[K |█████████▉ | 20 kB 24.7 MB/s eta 0:00:01\r\u001b[K |██████████████▊ | 30 kB 18.8 MB/s eta 0:00:01\r\u001b[K |███████████████████▋ | 40 kB 16.2 MB/s eta 0:00:01\r\u001b[K |████████████████████████▌ | 51 kB 8.3 MB/s eta 0:00:01\r\u001b[K |█████████████████████████████▍ | 61 kB 9.6 MB/s eta 0:00:01\r\u001b[K |████████████████████████████████| 67 kB 4.1 MB/s \n", "\u001b[?25hRequirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (2.23.0)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (3.13)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (3.4.2)\n", "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (21.3)\n", "Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (4.10.1)\n", "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (3.10.0.2)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from huggingface-hub) (4.62.3)\n", "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=20.9->huggingface-hub) (3.0.7)\n", "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->huggingface-hub) (3.7.0)\n", "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub) (2.10)\n", "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub) (3.0.4)\n", "Requirement already satisfied: certifi>=2017.4.17 in 
/usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub) (2021.10.8)\n", "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub) (1.24.3)\n", "Installing collected packages: huggingface-hub\n", "Successfully installed huggingface-hub-0.4.0\n", "Reading package lists... Done\n", "Building dependency tree \n", "Reading state information... Done\n", "The following packages were automatically installed and are no longer required:\n", " cuda-command-line-tools-10-0 cuda-command-line-tools-10-1\n", " cuda-command-line-tools-11-0 cuda-compiler-10-0 cuda-compiler-10-1\n", " cuda-compiler-11-0 cuda-cuobjdump-10-0 cuda-cuobjdump-10-1\n", " cuda-cuobjdump-11-0 cuda-cupti-10-0 cuda-cupti-10-1 cuda-cupti-11-0\n", " cuda-cupti-dev-11-0 cuda-documentation-10-0 cuda-documentation-10-1\n", " cuda-documentation-11-0 cuda-documentation-11-1 cuda-gdb-10-0 cuda-gdb-10-1\n", " cuda-gdb-11-0 cuda-gpu-library-advisor-10-0 cuda-gpu-library-advisor-10-1\n", " cuda-libraries-10-0 cuda-libraries-10-1 cuda-libraries-11-0\n", " cuda-memcheck-10-0 cuda-memcheck-10-1 cuda-memcheck-11-0 cuda-nsight-10-0\n", " cuda-nsight-10-1 cuda-nsight-11-0 cuda-nsight-11-1 cuda-nsight-compute-10-0\n", " cuda-nsight-compute-10-1 cuda-nsight-compute-11-0 cuda-nsight-compute-11-1\n", " cuda-nsight-systems-10-1 cuda-nsight-systems-11-0 cuda-nsight-systems-11-1\n", " cuda-nvcc-10-0 cuda-nvcc-10-1 cuda-nvcc-11-0 cuda-nvdisasm-10-0\n", " cuda-nvdisasm-10-1 cuda-nvdisasm-11-0 cuda-nvml-dev-10-0 cuda-nvml-dev-10-1\n", " cuda-nvml-dev-11-0 cuda-nvprof-10-0 cuda-nvprof-10-1 cuda-nvprof-11-0\n", " cuda-nvprune-10-0 cuda-nvprune-10-1 cuda-nvprune-11-0 cuda-nvtx-10-0\n", " cuda-nvtx-10-1 cuda-nvtx-11-0 cuda-nvvp-10-0 cuda-nvvp-10-1 cuda-nvvp-11-0\n", " cuda-nvvp-11-1 cuda-samples-10-0 cuda-samples-10-1 cuda-samples-11-0\n", " cuda-samples-11-1 cuda-sanitizer-11-0 cuda-sanitizer-api-10-1\n", " cuda-toolkit-10-0 
cuda-toolkit-10-1 cuda-toolkit-11-0 cuda-toolkit-11-1\n", " cuda-tools-10-0 cuda-tools-10-1 cuda-tools-11-0 cuda-tools-11-1\n", " cuda-visual-tools-10-0 cuda-visual-tools-10-1 cuda-visual-tools-11-0\n", " cuda-visual-tools-11-1 default-jre dkms freeglut3 freeglut3-dev\n", " keyboard-configuration libargon2-0 libcap2 libcryptsetup12\n", " libdevmapper1.02.1 libfontenc1 libidn11 libip4tc0 libjansson4\n", " libnvidia-cfg1-510 libnvidia-common-460 libnvidia-common-510\n", " libnvidia-extra-510 libnvidia-fbc1-510 libnvidia-gl-510 libpam-systemd\n", " libpolkit-agent-1-0 libpolkit-backend-1-0 libpolkit-gobject-1-0 libxfont2\n", " libxi-dev libxkbfile1 libxmu-dev libxmu-headers libxnvctrl0 libxtst6\n", " nsight-compute-2020.2.1 nsight-compute-2022.1.0 nsight-systems-2020.3.2\n", " nsight-systems-2020.3.4 nsight-systems-2021.5.2 nvidia-dkms-510\n", " nvidia-kernel-common-510 nvidia-kernel-source-510 nvidia-modprobe\n", " nvidia-settings openjdk-11-jre policykit-1 policykit-1-gnome python3-xkit\n", " screen-resolution-extra systemd systemd-sysv udev x11-xkb-utils\n", " xserver-common xserver-xorg-core-hwe-18.04 xserver-xorg-video-nvidia-510\n", "Use 'sudo apt autoremove' to remove them.\n", "The following NEW packages will be installed:\n", " git-lfs\n", "0 upgraded, 1 newly installed, 0 to remove and 39 not upgraded.\n", "Need to get 2,129 kB of archives.\n", "After this operation, 7,662 kB of additional disk space will be used.\n", "Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 git-lfs amd64 2.3.4-1 [2,129 kB]\n", "Fetched 2,129 kB in 1s (1,551 kB/s)\n", "debconf: unable to initialize frontend: Dialog\n", "debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. 
at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76, <> line 1.)\n", "debconf: falling back to frontend: Readline\n", "debconf: unable to initialize frontend: Readline\n", "debconf: (This frontend requires a controlling tty.)\n", "debconf: falling back to frontend: Teletype\n", "dpkg-preconfigure: unable to re-open stdin: \n", "Selecting previously unselected package git-lfs.\n", "(Reading database ... 155113 files and directories currently installed.)\n", "Preparing to unpack .../git-lfs_2.3.4-1_amd64.deb ...\n", "Unpacking git-lfs (2.3.4-1) ...\n", "Setting up git-lfs (2.3.4-1) ...\n", "Processing triggers for man-db (2.8.3-2ubuntu0.1) ...\n", "Error: Failed to call git rev-parse --git-dir --show-toplevel: \"fatal: not a git repository (or any of the parent directories): .git\\n\"\n", "Git LFS initialized.\n" ] } ] }, { "cell_type": "code", "source": [ "!huggingface-cli login" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "vr7EzjdvBzHT", "outputId": "69fa8334-c203-4000-d970-1a8b8f9f1ba6" }, "execution_count": 15, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\n", " _| _| _| _| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _|_|_|_| _|_| _|_|_| _|_|_|_|\n", " _| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|\n", " _|_|_|_| _| _| _| _|_| _| _|_| _| _| _| _| _| _|_| _|_|_| _|_|_|_| _| _|_|_|\n", " _| _| _| _| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|\n", " _| _| _|_| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _| _| _| _|_|_| _|_|_|_|\n", "\n", " To login, `huggingface_hub` now requires a token generated from https://huggingface.co./settings/token.\n", " (Deprecated, will be removed in v0.3.0) To login with username and password instead, interrupt with Ctrl+C.\n", " \n", "Token: \n", "Login successful\n", "Your token has been saved to /root/.huggingface/token\n", "\u001b[1m\u001b[31mAuthenticated through git-credential store but this isn't the helper defined on your machine.\n", "You might have to re-authenticate when pushing to the 
Hugging Face Hub. Run the following command in your terminal in case you want to set this credential helper as the default\n", "\n", "git config --global credential.helper store\u001b[0m\n" ] } ] }, { "cell_type": "code", "source": [ "from huggingface_hub.keras_mixin import push_to_hub_keras\n", "push_to_hub_keras(model=model, repo_url=\"https://huggingface.co./keras-io/char-lstm-seq2seq\", organization=\"keras-io\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 345, "referenced_widgets": [ "0873642bfadd4c37b54e92dac1de35bf", "042ecadcff7b47fbb32069cdf1064d09", "942b7ed2e1f0404fb5f44b266a16d0cd", "8dcb06d036d64bad92a2117db8874bd9", "1ef6e4d8d6ff4f3cad20a518f8f4f2cb", "3679ebef4abc417ebb06c2f7d45bbf4b", "60a1fc7920304a5e82499d9268e6e1a8", "93e1a0fe3609475da590bd08fae66fa7", "3e3abf2a02724044af294b759243bbff", "173ac791952944b9a758a5ba47245e66", "34e8e3a5d2a0423cab28196710bdc684", "48c696f7e40c4ee5a41ff4e823e88e8d", "0fec85cbc02f43568b13b6b128513849", "0910616a312041489a80714c88219bd9", "734f0340dd2b4cd594290e0457a5df0b", "9462930017094b64bddaf19b5f66aa58", "63b105a140524e7abed1c7a2eee86d70", "ead2f71ec2034c70a970a72f18973607", "8b814bfa9fac40f1a24d1442f95d7ae8", "82502d93f92d4369ac159a5498c62015", "277a1e05f0dc4afb839468c0b5c08bd7", "0b0c60f854f54830adfdf0464701460c" ] }, "id": "ZhPSjrEAB26W", "outputId": "1e454a43-108a-450a-9e4e-3fa327b23e26" }, "execution_count": 16, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "Cloning https://huggingface.co./keras-io/char-lstm-seq2seq into local empty directory.\n", "WARNING:huggingface_hub.repository:Cloning https://huggingface.co./keras-io/char-lstm-seq2seq into local empty directory.\n", "WARNING:absl:Found untraced functions such as lstm_cell_2_layer_call_fn, lstm_cell_2_layer_call_and_return_conditional_losses, lstm_cell_3_layer_call_fn, lstm_cell_3_layer_call_and_return_conditional_losses, lstm_cell_2_layer_call_fn while saving (showing 5 of 10). 
These functions will not be directly callable after loading.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "INFO:tensorflow:Assets written to: char-lstm-seq2seq/assets\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "INFO:tensorflow:Assets written to: char-lstm-seq2seq/assets\n", "WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.\n", "WARNING:absl: has the same name 'LSTMCell' as a built-in Keras object. Consider renaming to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.\n" ] }, { "output_type": "display_data", "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0873642bfadd4c37b54e92dac1de35bf", "version_minor": 0, "version_major": 2 }, "text/plain": [ "Upload file saved_model.pb: 0%| | 3.39k/1.38M [00:00 main\n", "\n", "WARNING:huggingface_hub.repository:To https://huggingface.co./keras-io/char-lstm-seq2seq\n", " df51a58..69c5bbb main -> main\n", "\n" ] }, { "output_type": "execute_result", "data": { "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" }, "text/plain": [ "'https://huggingface.co./keras-io/char-lstm-seq2seq/commit/69c5bbba7cfcad71d97557b045f3592ad5b26c39'" ] }, "metadata": {}, "execution_count": 16 } ] }, { "cell_type": "code", "source": [ "" ], "metadata": { "id": "2TbeYdeuCJ5_" }, "execution_count": null, "outputs": [] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "lstm_seq2seq", "provenance": [], "machine_shape": "hm" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": 
"text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" }, "accelerator": "GPU", "widgets": { "application/vnd.jupyter.widget-state+json": { "0873642bfadd4c37b54e92dac1de35bf": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_view_name": "HBoxView", "_dom_classes": [], "_model_name": "HBoxModel", "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.5.0", "box_style": "", "layout": "IPY_MODEL_042ecadcff7b47fbb32069cdf1064d09", "_model_module": "@jupyter-widgets/controls", "children": [ "IPY_MODEL_942b7ed2e1f0404fb5f44b266a16d0cd", "IPY_MODEL_8dcb06d036d64bad92a2117db8874bd9", "IPY_MODEL_1ef6e4d8d6ff4f3cad20a518f8f4f2cb" ] } }, "042ecadcff7b47fbb32069cdf1064d09": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } }, 
"942b7ed2e1f0404fb5f44b266a16d0cd": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_view_name": "HTMLView", "style": "IPY_MODEL_3679ebef4abc417ebb06c2f7d45bbf4b", "_dom_classes": [], "description": "", "_model_name": "HTMLModel", "placeholder": "​", "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "value": "Upload file saved_model.pb: 100%", "_view_count": null, "_view_module_version": "1.5.0", "description_tooltip": null, "_model_module": "@jupyter-widgets/controls", "layout": "IPY_MODEL_60a1fc7920304a5e82499d9268e6e1a8" } }, "8dcb06d036d64bad92a2117db8874bd9": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_view_name": "ProgressView", "style": "IPY_MODEL_93e1a0fe3609475da590bd08fae66fa7", "_dom_classes": [], "description": "", "_model_name": "FloatProgressModel", "bar_style": "success", "max": 1444428, "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "value": 1444428, "_view_count": null, "_view_module_version": "1.5.0", "orientation": "horizontal", "min": 0, "description_tooltip": null, "_model_module": "@jupyter-widgets/controls", "layout": "IPY_MODEL_3e3abf2a02724044af294b759243bbff" } }, "1ef6e4d8d6ff4f3cad20a518f8f4f2cb": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_view_name": "HTMLView", "style": "IPY_MODEL_173ac791952944b9a758a5ba47245e66", "_dom_classes": [], "description": "", "_model_name": "HTMLModel", "placeholder": "​", "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "value": " 1.38M/1.38M [00:06<00:00, 188kB/s]", "_view_count": null, "_view_module_version": "1.5.0", "description_tooltip": null, "_model_module": "@jupyter-widgets/controls", "layout": "IPY_MODEL_34e8e3a5d2a0423cab28196710bdc684" } }, "3679ebef4abc417ebb06c2f7d45bbf4b": { 
"model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_view_name": "StyleView", "_model_name": "DescriptionStyleModel", "description_width": "", "_view_module": "@jupyter-widgets/base", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.2.0", "_model_module": "@jupyter-widgets/controls" } }, "60a1fc7920304a5e82499d9268e6e1a8": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } }, "93e1a0fe3609475da590bd08fae66fa7": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_view_name": "StyleView", "_model_name": "ProgressStyleModel", "description_width": "", "_view_module": "@jupyter-widgets/base", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.2.0", "bar_color": null, "_model_module": "@jupyter-widgets/controls" } }, 
"3e3abf2a02724044af294b759243bbff": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } }, "173ac791952944b9a758a5ba47245e66": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_view_name": "StyleView", "_model_name": "DescriptionStyleModel", "description_width": "", "_view_module": "@jupyter-widgets/base", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.2.0", "_model_module": "@jupyter-widgets/controls" } }, "34e8e3a5d2a0423cab28196710bdc684": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, 
"border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } }, "48c696f7e40c4ee5a41ff4e823e88e8d": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_view_name": "HBoxView", "_dom_classes": [], "_model_name": "HBoxModel", "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.5.0", "box_style": "", "layout": "IPY_MODEL_0fec85cbc02f43568b13b6b128513849", "_model_module": "@jupyter-widgets/controls", "children": [ "IPY_MODEL_0910616a312041489a80714c88219bd9", "IPY_MODEL_734f0340dd2b4cd594290e0457a5df0b", "IPY_MODEL_9462930017094b64bddaf19b5f66aa58" ] } }, "0fec85cbc02f43568b13b6b128513849": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, 
"grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } }, "0910616a312041489a80714c88219bd9": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_view_name": "HTMLView", "style": "IPY_MODEL_63b105a140524e7abed1c7a2eee86d70", "_dom_classes": [], "description": "", "_model_name": "HTMLModel", "placeholder": "​", "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "value": "Upload file keras_metadata.pb: 100%", "_view_count": null, "_view_module_version": "1.5.0", "description_tooltip": null, "_model_module": "@jupyter-widgets/controls", "layout": "IPY_MODEL_ead2f71ec2034c70a970a72f18973607" } }, "734f0340dd2b4cd594290e0457a5df0b": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_view_name": "ProgressView", "style": "IPY_MODEL_8b814bfa9fac40f1a24d1442f95d7ae8", "_dom_classes": [], "description": "", "_model_name": "FloatProgressModel", "bar_style": "success", "max": 15672, "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "value": 15672, "_view_count": null, "_view_module_version": "1.5.0", "orientation": "horizontal", "min": 0, "description_tooltip": null, "_model_module": "@jupyter-widgets/controls", "layout": "IPY_MODEL_82502d93f92d4369ac159a5498c62015" } }, "9462930017094b64bddaf19b5f66aa58": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { 
"_view_name": "HTMLView", "style": "IPY_MODEL_277a1e05f0dc4afb839468c0b5c08bd7", "_dom_classes": [], "description": "", "_model_name": "HTMLModel", "placeholder": "​", "_view_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "value": " 15.3k/15.3k [00:06<00:00, 2.02kB/s]", "_view_count": null, "_view_module_version": "1.5.0", "description_tooltip": null, "_model_module": "@jupyter-widgets/controls", "layout": "IPY_MODEL_0b0c60f854f54830adfdf0464701460c" } }, "63b105a140524e7abed1c7a2eee86d70": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_view_name": "StyleView", "_model_name": "DescriptionStyleModel", "description_width": "", "_view_module": "@jupyter-widgets/base", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.2.0", "_model_module": "@jupyter-widgets/controls" } }, "ead2f71ec2034c70a970a72f18973607": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, 
"grid_auto_columns": null, "margin": null, "display": null, "left": null } }, "8b814bfa9fac40f1a24d1442f95d7ae8": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_view_name": "StyleView", "_model_name": "ProgressStyleModel", "description_width": "", "_view_module": "@jupyter-widgets/base", "_model_module_version": "1.5.0", "_view_count": null, "_view_module_version": "1.2.0", "bar_color": null, "_model_module": "@jupyter-widgets/controls" } }, "82502d93f92d4369ac159a5498c62015": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } }, "277a1e05f0dc4afb839468c0b5c08bd7": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_view_name": "StyleView", "_model_name": "DescriptionStyleModel", "description_width": "", "_view_module": "@jupyter-widgets/base", "_model_module_version": 
"1.5.0", "_view_count": null, "_view_module_version": "1.2.0", "_model_module": "@jupyter-widgets/controls" } }, "0b0c60f854f54830adfdf0464701460c": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_view_name": "LayoutView", "grid_template_rows": null, "right": null, "justify_content": null, "_view_module": "@jupyter-widgets/base", "overflow": null, "_model_module_version": "1.2.0", "_view_count": null, "flex_flow": null, "width": null, "min_width": null, "border": null, "align_items": null, "bottom": null, "_model_module": "@jupyter-widgets/base", "top": null, "grid_column": null, "overflow_y": null, "overflow_x": null, "grid_auto_flow": null, "grid_area": null, "grid_template_columns": null, "flex": null, "_model_name": "LayoutModel", "justify_items": null, "grid_row": null, "max_height": null, "align_content": null, "visibility": null, "align_self": null, "height": null, "min_height": null, "padding": null, "grid_auto_rows": null, "grid_gap": null, "max_width": null, "order": null, "_view_module_version": "1.2.0", "grid_template_areas": null, "object_position": null, "object_fit": null, "grid_auto_columns": null, "margin": null, "display": null, "left": null } } } } }, "nbformat": 4, "nbformat_minor": 0 }