{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Movie Recommendation Project\n", "In this machine learning project, we build a recommendation system from the ground up that suggests movies to users based on their preferences.\n", "\n", "## Dataset\n", "We use the TMDB 5000 movie dataset (`tmdb_5000_movies.csv` and `tmdb_5000_credits.csv`), available from Kaggle.\n", "\n", "## What is a Recommendation System?\n", "A recommendation system suggests items to users based on criteria such as overall popularity, item metadata, or the behavior of similar users.\n", "\n", "There are three main types of recommendation systems:\n", "\n", "1. Demographic Filtering: The recommendations are the same for every user; they are generalized, not personalized. Systems of this type power sections like “Top Trending”.\n", "2. Content-based Filtering: These systems recommend items based on item metadata (for a movie, product, song, etc.). The main idea is that a user who likes an item is likely to enjoy items similar to it.\n", "3. Collaborative Filtering: These systems make recommendations by grouping users with similar interests. Item metadata is not required.\n", "\n", "In this project, we build a **content-based** recommendation engine for movies." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from ast import literal_eval\n", "from sklearn.feature_extraction.text import CountVectorizer\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "import pickle" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "credits_df = pd.read_csv(\"./data/tmdb_5000_credits.csv\")\n", "movies_df = pd.read_csv(\"./data/tmdb_5000_movies.csv\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | budget | genres | homepage | id | keywords | original_language | original_title | overview | popularity | production_companies | production_countries | release_date | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 237000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | [{"name": "Ingenious Film Partners", "id": 289... | [{"iso_3166_1": "US", "name": "United States o... | 2009-12-10 | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 |
| 1 | 300000000 | [{"id": 12, "name": "Adventure"}, {"id": 14, "... | http://disney.go.com/disneypictures/pirates/ | 285 | [{"id": 270, "name": "ocean"}, {"id": 726, "na... | en | Pirates of the Caribbean: At World's End | Captain Barbossa, long believed to be dead, ha... | 139.082615 | [{"name": "Walt Disney Pictures", "id": 2}, {"... | [{"iso_3166_1": "US", "name": "United States o... | 2007-05-19 | 961000000 | 169.0 | [{"iso_639_1": "en", "name": "English"}] | Released | At the end of the world, the adventure begins. | Pirates of the Caribbean: At World's End | 6.9 | 4500 |
| 2 | 245000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://www.sonypictures.com/movies/spectre/ | 206647 | [{"id": 470, "name": "spy"}, {"id": 818, "name... | en | Spectre | A cryptic message from Bond’s past sends him o... | 107.376788 | [{"name": "Columbia Pictures", "id": 5}, {"nam... | [{"iso_3166_1": "GB", "name": "United Kingdom"... | 2015-10-26 | 880674609 | 148.0 | [{"iso_639_1": "fr", "name": "Fran\u00e7ais"},... | Released | A Plan No One Escapes | Spectre | 6.3 | 4466 |
| 3 | 250000000 | [{"id": 28, "name": "Action"}, {"id": 80, "nam... | http://www.thedarkknightrises.com/ | 49026 | [{"id": 849, "name": "dc comics"}, {"id": 853,... | en | The Dark Knight Rises | Following the death of District Attorney Harve... | 112.312950 | [{"name": "Legendary Pictures", "id": 923}, {"... | [{"iso_3166_1": "US", "name": "United States o... | 2012-07-16 | 1084939099 | 165.0 | [{"iso_639_1": "en", "name": "English"}] | Released | The Legend Ends | The Dark Knight Rises | 7.6 | 9106 |
| 4 | 260000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://movies.disney.com/john-carter | 49529 | [{"id": 818, "name": "based on novel"}, {"id":... | en | John Carter | John Carter is a war-weary, former military ca... | 43.926995 | [{"name": "Walt Disney Pictures", "id": 2}] | [{"iso_3166_1": "US", "name": "United States o... | 2012-03-07 | 284139100 | 132.0 | [{"iso_639_1": "en", "name": "English"}] | Released | Lost in our world, found in another. | John Carter | 6.1 | 2124 |
| | movie_id | title | cast | crew |
|---|---|---|---|---|
| 0 | 19995 | Avatar | [{"cast_id": 242, "character": "Jake Sully", "... | [{"credit_id": "52fe48009251416c750aca23", "de... |
| 1 | 285 | Pirates of the Caribbean: At World's End | [{"cast_id": 4, "character": "Captain Jack Spa... | [{"credit_id": "52fe4232c3a36847f800b579", "de... |
| 2 | 206647 | Spectre | [{"cast_id": 1, "character": "James Bond", "cr... | [{"credit_id": "54805967c3a36829b5002c41", "de... |
| 3 | 49026 | The Dark Knight Rises | [{"cast_id": 2, "character": "Bruce Wayne / Ba... | [{"credit_id": "52fe4781c3a36847f81398c3", "de... |
| 4 | 49529 | John Carter | [{"cast_id": 5, "character": "John Carter", "c... | [{"credit_id": "52fe479ac3a36847f813eaa3", "de... |
| | budget | genres | homepage | id | keywords | original_language | original_title | overview | popularity | production_companies | ... | revenue | runtime | spoken_languages | status | tagline | title | vote_average | vote_count | cast | crew |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 237000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://www.avatarmovie.com/ | 19995 | [{"id": 1463, "name": "culture clash"}, {"id":... | en | Avatar | In the 22nd century, a paraplegic Marine is di... | 150.437577 | [{"name": "Ingenious Film Partners", "id": 289... | ... | 2787965087 | 162.0 | [{"iso_639_1": "en", "name": "English"}, {"iso... | Released | Enter the World of Pandora. | Avatar | 7.2 | 11800 | [{"cast_id": 242, "character": "Jake Sully", "... | [{"credit_id": "52fe48009251416c750aca23", "de... |
| 1 | 300000000 | [{"id": 12, "name": "Adventure"}, {"id": 14, "... | http://disney.go.com/disneypictures/pirates/ | 285 | [{"id": 270, "name": "ocean"}, {"id": 726, "na... | en | Pirates of the Caribbean: At World's End | Captain Barbossa, long believed to be dead, ha... | 139.082615 | [{"name": "Walt Disney Pictures", "id": 2}, {"... | ... | 961000000 | 169.0 | [{"iso_639_1": "en", "name": "English"}] | Released | At the end of the world, the adventure begins. | Pirates of the Caribbean: At World's End | 6.9 | 4500 | [{"cast_id": 4, "character": "Captain Jack Spa... | [{"credit_id": "52fe4232c3a36847f800b579", "de... |
| 2 | 245000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://www.sonypictures.com/movies/spectre/ | 206647 | [{"id": 470, "name": "spy"}, {"id": 818, "name... | en | Spectre | A cryptic message from Bond’s past sends him o... | 107.376788 | [{"name": "Columbia Pictures", "id": 5}, {"nam... | ... | 880674609 | 148.0 | [{"iso_639_1": "fr", "name": "Fran\u00e7ais"},... | Released | A Plan No One Escapes | Spectre | 6.3 | 4466 | [{"cast_id": 1, "character": "James Bond", "cr... | [{"credit_id": "54805967c3a36829b5002c41", "de... |
| 3 | 250000000 | [{"id": 28, "name": "Action"}, {"id": 80, "nam... | http://www.thedarkknightrises.com/ | 49026 | [{"id": 849, "name": "dc comics"}, {"id": 853,... | en | The Dark Knight Rises | Following the death of District Attorney Harve... | 112.312950 | [{"name": "Legendary Pictures", "id": 923}, {"... | ... | 1084939099 | 165.0 | [{"iso_639_1": "en", "name": "English"}] | Released | The Legend Ends | The Dark Knight Rises | 7.6 | 9106 | [{"cast_id": 2, "character": "Bruce Wayne / Ba... | [{"credit_id": "52fe4781c3a36847f81398c3", "de... |
| 4 | 260000000 | [{"id": 28, "name": "Action"}, {"id": 12, "nam... | http://movies.disney.com/john-carter | 49529 | [{"id": 818, "name": "based on novel"}, {"id":... | en | John Carter | John Carter is a war-weary, former military ca... | 43.926995 | [{"name": "Walt Disney Pictures", "id": 2}] | ... | 284139100 | 132.0 | [{"iso_639_1": "en", "name": "English"}] | Released | Lost in our world, found in another. | John Carter | 6.1 | 2124 | [{"cast_id": 5, "character": "John Carter", "c... | [{"credit_id": "52fe479ac3a36847f813eaa3", "de... |
5 rows × 22 columns
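
The merged frame has 22 columns: the 20 columns of the movies file plus `cast` and `crew` from the credits file. The cell that produced it is not shown in this section, so the following is only a minimal sketch of one way to arrive at that shape. It assumes the join key is the movie ID (`movie_id` in the credits file, `id` in the movies file) and that the duplicate `title` column from the credits file is dropped before merging. The frames `movies_demo` and `credits_demo` are hypothetical stand-ins for the real CSVs, and the `cast`/`crew` strings are placeholders.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files; IDs and titles come from
# the tables above, the cast/crew strings are placeholders.
movies_demo = pd.DataFrame({
    "id": [19995, 285],
    "title": ["Avatar", "Pirates of the Caribbean: At World's End"],
    "vote_average": [7.2, 6.9],
})
credits_demo = pd.DataFrame({
    "movie_id": [19995, 285],
    "title": ["Avatar", "Pirates of the Caribbean: At World's End"],
    "cast": ["<cast placeholder 0>", "<cast placeholder 1>"],
    "crew": ["<crew placeholder 0>", "<crew placeholder 1>"],
})

# Assumption: rename the key so both frames share it, and drop the
# duplicate title so the merge does not create title_x / title_y.
credits_keyed = credits_demo.rename(columns={"movie_id": "id"}).drop(columns="title")

# Inner join on the shared movie ID.
merged_demo = movies_demo.merge(credits_keyed, on="id", how="inner")
print(merged_demo.columns.tolist())
```

With the real files, the same pattern would add exactly two columns to the movies frame, which matches the 22-column result shown.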
| | cast | crew | keywords | genres |
|---|---|---|---|---|
| 0 | [{'cast_id': 242, 'character': 'Jake Sully', '... | [{'credit_id': '52fe48009251416c750aca23', 'de... | [{'id': 1463, 'name': 'culture clash'}, {'id':... | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... |
| 1 | [{'cast_id': 4, 'character': 'Captain Jack Spa... | [{'credit_id': '52fe4232c3a36847f800b579', 'de... | [{'id': 270, 'name': 'ocean'}, {'id': 726, 'na... | [{'id': 12, 'name': 'Adventure'}, {'id': 14, '... |
| 2 | [{'cast_id': 1, 'character': 'James Bond', 'cr... | [{'credit_id': '54805967c3a36829b5002c41', 'de... | [{'id': 470, 'name': 'spy'}, {'id': 818, 'name... | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... |
| 3 | [{'cast_id': 2, 'character': 'Bruce Wayne / Ba... | [{'credit_id': '52fe4781c3a36847f81398c3', 'de... | [{'id': 849, 'name': 'dc comics'}, {'id': 853,... | [{'id': 28, 'name': 'Action'}, {'id': 80, 'nam... |
| 4 | [{'cast_id': 5, 'character': 'John Carter', 'c... | [{'credit_id': '52fe479ac3a36847f813eaa3', 'de... | [{'id': 818, 'name': 'based on novel'}, {'id':... | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... |
| 5 | [{'cast_id': 30, 'character': 'Peter Parker / ... | [{'credit_id': '52fe4252c3a36847f80151a5', 'de... | [{'id': 851, 'name': 'dual identity'}, {'id': ... | [{'id': 14, 'name': 'Fantasy'}, {'id': 28, 'na... |
| 6 | [{'cast_id': 34, 'character': 'Flynn Rider (vo... | [{'credit_id': '52fe46db9251416c91062101', 'de... | [{'id': 1562, 'name': 'hostage'}, {'id': 2343,... | [{'id': 16, 'name': 'Animation'}, {'id': 10751... |
| 7 | [{'cast_id': 76, 'character': 'Tony Stark / Ir... | [{'credit_id': '55d5f7d4c3a3683e7e0016eb', 'de... | [{'id': 8828, 'name': 'marvel comic'}, {'id': ... | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... |
| 8 | [{'cast_id': 3, 'character': 'Harry Potter', '... | [{'credit_id': '52fe4273c3a36847f801fab1', 'de... | [{'id': 616, 'name': 'witch'}, {'id': 2343, 'n... | [{'id': 12, 'name': 'Adventure'}, {'id': 14, '... |
| 9 | [{'cast_id': 18, 'character': 'Bruce Wayne / B... | [{'credit_id': '553bf23692514135c8002886', 'de... | [{'id': 849, 'name': 'dc comics'}, {'id': 7002... | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... |
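
The single quotes in this table, compared with the double quotes in the raw CSV output, suggest that the four columns have been converted from JSON-like strings into Python lists of dictionaries. Given the `literal_eval` import at the top of the notebook, one plausible way to do that is sketched below. The actual cell is not shown in this section, so the column loop and the `parsed_demo` frame are assumptions; the sample strings are taken from the Avatar row.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical one-row frame holding the raw string form of two columns.
parsed_demo = pd.DataFrame({
    "genres": ['[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'],
    "keywords": ['[{"id": 1463, "name": "culture clash"}]'],
})

# literal_eval safely evaluates a string containing a Python literal.
# It works on these JSON-like strings because they contain only lists,
# dicts, strings, and numbers (JSON null/true/false would fail).
for column in ["genres", "keywords"]:
    parsed_demo[column] = parsed_demo[column].apply(literal_eval)

first_genres = parsed_demo.loc[0, "genres"]
print(type(first_genres))        # <class 'list'>
print(first_genres[0]["name"])   # Action
```

After this step each cell holds real Python objects, so individual fields such as `name` can be read with ordinary indexing.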
| | title | cast | director | keywords | genres |
|---|---|---|---|---|---|
| 0 | Avatar | [Sam Worthington, Zoe Saldana, Sigourney Weaver] | James Cameron | [culture clash, future, space war] | [Action, Adventure, Fantasy] |
| 1 | Pirates of the Caribbean: At World's End | [Johnny Depp, Orlando Bloom, Keira Knightley] | Gore Verbinski | [ocean, drug abuse, exotic island] | [Adventure, Fantasy, Action] |
| 2 | Spectre | [Daniel Craig, Christoph Waltz, Léa Seydoux] | Sam Mendes | [spy, based on novel, secret agent] | [Action, Adventure, Crime] |
| 3 | The Dark Knight Rises | [Christian Bale, Michael Caine, Gary Oldman] | Christopher Nolan | [dc comics, crime fighter, terrorist] | [Action, Crime, Drama] |
| 4 | John Carter | [Taylor Kitsch, Lynn Collins, Samantha Morton] | Andrew Stanton | [based on novel, mars, medallion] | [Action, Adventure, Science Fiction] |
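
In this table every list column holds at most three names and `director` holds a single name. The cells that produced it are not shown in this section, so the helpers below are a hedged sketch rather than the notebook's own code. They assume that each crew entry carries `job` and `name` keys, that each cast, keyword, and genre entry carries a `name` key, and that the first three entries are the ones kept. The function names `get_director` and `get_top_names` are hypothetical, as are the placeholder entries in the sample data.

```python
def get_director(crew):
    """Return the name of the first crew member whose job is Director."""
    for member in crew:
        if member.get("job") == "Director":
            return member["name"]
    return None  # no director listed


def get_top_names(entries, limit=3):
    """Return the 'name' field of the first `limit` entries."""
    if not isinstance(entries, list):
        return []  # missing or unparsed data
    return [entry["name"] for entry in entries[:limit]]


# Sample data: real names come from the Avatar row above; the
# "placeholder" entries are invented to exercise the helpers.
sample_crew = [
    {"job": "Editor", "name": "placeholder editor"},
    {"job": "Director", "name": "James Cameron"},
]
sample_cast = [
    {"character": "Jake Sully", "name": "Sam Worthington"},
    {"character": "placeholder", "name": "Zoe Saldana"},
    {"character": "placeholder", "name": "Sigourney Weaver"},
    {"character": "placeholder", "name": "placeholder fourth actor"},
]

print(get_director(sample_crew))
print(get_top_names(sample_cast))
```

On a parsed frame, helpers like these could be applied column by column, for example `df["crew"].apply(get_director)` for the director and `df["cast"].apply(get_top_names)` for the cast, where `df` is whichever frame holds the parsed columns.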