Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Updated May 19, 2021 - Scala
Movie Recommender based on the MovieLens Dataset (ml-100k) using item-item collaborative filtering.
Exploratory Data Analysis (EDA) will be uploaded to this repository. Libraries such as Pandas, Matplotlib, Seaborn, and Plotly will be used for data analysis.
This repository contains analysis of IMDb data from multiple sources, covering movies, cast, box office revenues, movie brands, and franchises.
Building a movie recommender system with factorization machines on Amazon SageMaker.
Implementation of Spotify's Generalist-Specialist score on the MovieLens dataset.
Analysis of MovieLens Dataset in Python
Spark MLlib: Collaborative Filtering Movie Recommendation System
MovieLens Dataset analysis using Hadoop and PySpark
Data analysis on big data. Uses various databases from 1M to 100M, including the MovieLens dataset, to perform analysis. Covers basic and advanced MapReduce using MongoDB.
Movie recommendation system based on collaborative filtering using Apache Spark
Data analysis on big data. Uses various databases from 1M to 100M, including the MovieLens dataset, to perform analysis. Covers basic and advanced MapReduce using Hadoop.
Analytics on the data: filter movies in the drama genre, find the most-rated movies, compute the number of users and the average rating for each age range, etc. Visualizes the count and age of moviegoers with the Matplotlib library and shows movies, age range, and average rating.
This is a project made as part of my data science master's program to analyze and draw inferences from MovieLens data.
Created visualizations of the MovieLens dataset using matrix factorization: http://www.yisongyue.com/courses/cs155/2018_winter/assignments/project2.pdf
Project to determine the ratings for a movie using each of the Spark and Hadoop ecosystems.
Contains my custom implementation of various machine learning models and analysis.
A recommendation algorithm capable of accurately predicting how a user will rate a movie they have not yet viewed, based on their historical preferences. The models and EDA are based on the MovieLens 1M dataset.
A feature-preference-based CF experiment on the MovieLens 100K dataset