{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# TRANSFORMER MODELS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transformers, what can they do?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sentiment Analysis" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co./distilbert/distilbert-base-uncased-finetuned-sst-2-english).\n", "Using a pipeline without specifying a model name and revision in production is not recommended.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:From c:\\Users\\ACER\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\tf_keras\\src\\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.\n", "\n" ] }, { "data": { "text/plain": [ "[{'label': 'POSITIVE', 'score': 0.9598049521446228}]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from transformers import pipeline\n", "\n", "classifier = pipeline(\"sentiment-analysis\")\n", "classifier(\"I've been waiting for a HuggingFace course my whole life.\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'label': 'POSITIVE', 'score': 0.9598049521446228},\n", " {'label': 'NEGATIVE', 'score': 0.9994558691978455}]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# we can pass several sentences\n", "classifier(\n", " [\"I've been waiting for a HuggingFace course my whole life.\", \"I hate this so much!\"]\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Zero-shot classification" ] }, { "cell_type": "code", 
"execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co./facebook/bart-large-mnli).\n", "Using a pipeline without specifying a model name and revision in production is not recommended.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "13af57499d894e8aa77c7ed39138d3dd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "model.safetensors: 98%|#########8| 1.60G/1.63G [00:00 models.\", top_k=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Named Entity Recognition" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co./dbmdz/bert-large-cased-finetuned-conll03-english).\n", "Using a pipeline without specifying a model name and revision in production is not recommended.\n", "Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']\n", "- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. 
initializing a BertForSequenceClassification model from a BertForPreTraining model).\n", "- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n", "c:\\Users\\ACER\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\transformers\\pipelines\\token_classification.py:170: UserWarning: `grouped_entities` is deprecated and will be removed in version v5.0.0, defaulted to `aggregation_strategy=\"AggregationStrategy.SIMPLE\"` instead.\n", " warnings.warn(\n" ] }, { "data": { "text/plain": [ "[{'entity_group': 'PER',\n", " 'score': 0.99884915,\n", " 'word': 'Ahmad',\n", " 'start': 11,\n", " 'end': 16},\n", " {'entity_group': 'ORG',\n", " 'score': 0.9950792,\n", " 'word': 'University of Engineering and Technology',\n", " 'start': 31,\n", " 'end': 71},\n", " {'entity_group': 'LOC',\n", " 'score': 0.97850055,\n", " 'word': 'Lahore',\n", " 'start': 73,\n", " 'end': 79},\n", " {'entity_group': 'ORG',\n", " 'score': 0.78072757,\n", " 'word': \"Bechelor ' s\",\n", " 'start': 95,\n", " 'end': 105},\n", " {'entity_group': 'ORG',\n", " 'score': 0.92247367,\n", " 'word': 'Computer Science',\n", " 'start': 109,\n", " 'end': 125}]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from transformers import pipeline\n", "\n", "ner = pipeline(\"ner\", grouped_entities=True)\n", "ner(\"My name is Ahmad and I work at University of Engineering and Technology, Lahore. 
I was prsuing Bechelor's of Computer Science.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question answering" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co./distilbert/distilbert-base-cased-distilled-squad).\n", "Using a pipeline without specifying a model name and revision in production is not recommended.\n" ] } ], "source": [ "from transformers import pipeline\n", "\n", "question_answerer = pipeline(\"question-answering\")\n", "\n", "ans = question_answerer(\n", " question=\"where do I work?\",\n", " context=\"My name is Ahmad and I work at University of Engineering and Technology, Lahore\"\n", ")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'University of Engineering and Technology, Lahore'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ans['answer']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Summarization" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co./sshleifer/distilbart-cnn-12-6).\n", "Using a pipeline without specifying a model name and revision in production is not recommended.\n" ] } ], "source": [ "from transformers import pipeline\n", "\n", "summarizer = pipeline(\"summarization\")\n", "summary = summarizer(\n", " \"\"\"\n", " America has changed dramatically during recent years. 
Not only has the number of \n", " graduates in traditional engineering disciplines such as mechanical, civil, \n", " electrical, chemical, and aeronautical engineering declined, but in most of \n", " the premier American universities engineering curricula now concentrate on \n", " and encourage largely the study of engineering science. As a result, there \n", " are declining offerings in engineering subjects dealing with infrastructure, \n", " the environment, and related issues, and greater concentration on high \n", " technology subjects, largely supporting increasingly complex scientific \n", " developments. While the latter is important, it should not be at the expense \n", " of more traditional engineering.\n", "\n", " Rapidly developing economies such as China and India, as well as other \n", " industrial countries in Europe and Asia, continue to encourage and advance \n", " the teaching of engineering. Both China and India, respectively, graduate \n", " six and eight times as many traditional engineers as does the United States. \n", " Other industrial countries at minimum maintain their output, while America \n", " suffers an increasingly serious decline in the number of engineering graduates \n", " and a lack of well-educated engineers.\n", "\"\"\"\n", ")" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " America has changed dramatically during recent years . The number of engineering graduates in the U.S. has declined in traditional engineering disciplines such as mechanical, civil, electrical, chemical, and aeronautical engineering . 
Rapidly developing economies such as China and India continue to encourage and advance the teaching of engineering .\n" ] } ], "source": [ "print(summary[0]['summary_text'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Translation" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "import sentencepiece" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e7521143fb794a39b66b0f5d00f9fac8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "source.spm: 0%| | 0.00/802k [00:00