This project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer various business questions based on the dataset. The following README provides a detailed account of the project's objectives, business problems, solutions, findings, and conclusions.
- Analyze the distribution of content types (movies vs TV shows).
- Identify the most common ratings for movies and TV shows.
- List and analyze content based on release years, countries, and durations.
- Explore and categorize content based on specific criteria and keywords.
The data for this project is sourced from the Kaggle dataset:
- Dataset Link: Movies Dataset
-- Preview the imported dataset
select * from dbo.netflix_titles;

select
type,
count(*) as totalcontent
from dbo.netflix_titles
group by type;
Objective: Determine the distribution of content types on Netflix.
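Raw counts are easier to compare when expressed as shares of the catalog. The following is a minimal sketch, assuming the same `dbo.netflix_titles` table; the `percent_of_catalog` alias is illustrative and not part of the original project.

```sql
-- Share of each content type as a percentage of the whole catalog.
select
    type,
    count(*) as total_content,
    -- sum(count(*)) over () adds up the per-group counts, i.e. the table total
    cast(count(*) * 100.0 / sum(count(*)) over () as decimal(5, 2)) as percent_of_catalog
from dbo.netflix_titles
group by type;
```

Multiplying by `100.0` forces decimal arithmetic, so the division does not truncate to an integer.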
select type, rating,
count(*) as commonrating
from dbo.netflix_titles
group by type, rating
order by type, commonrating desc;
Objective: Identify the most frequently occurring rating for each type of content.
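To return only the single most common rating per content type, one option is to rank the per-type counts with a window function. This is a sketch under the assumption that ties should all be reported; the names `rating_counts`, `rating_count`, and `rating_rank` are illustrative.

```sql
-- Rank ratings within each content type by how often they occur,
-- then keep only the top-ranked rating(s) per type.
with rating_counts as (
    select
        type,
        rating,
        count(*) as rating_count,
        rank() over (partition by type order by count(*) desc) as rating_rank
    from dbo.netflix_titles
    group by type, rating
)
select type, rating, rating_count
from rating_counts
where rating_rank = 1;
```

`rank()` keeps every rating that ties for first place; `row_number()` would return exactly one row per type by picking arbitrarily among ties.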
select title
from dbo.netflix_titles
where type = 'Movie' and
release_year = '2020';
Objective: Retrieve all movies released in a specific year (2020 here).
select TOP 5 country,
count(title) as totalcontent
from dbo.netflix_titles
where country is not null
group by country
order by count(title) desc;
Objective: Identify the top 5 countries with the highest number of content items.
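In this dataset a single `country` value can list several countries separated by commas, so grouping on the raw column counts each combination as its own "country". A hedged sketch of one way to count each country separately, assuming SQL Server 2016 or later (database compatibility level 130+, where `STRING_SPLIT` is available) and a `varchar`/`nvarchar` `country` column:

```sql
-- Split comma-separated country lists into one row per country,
-- then count titles per individual country.
select top 5
    ltrim(rtrim(s.value)) as country,
    count(*) as total_content
from dbo.netflix_titles as t
cross apply string_split(t.country, ',') as s
where t.country is not null
  and ltrim(rtrim(s.value)) <> ''   -- skip empty fragments such as a trailing comma
group by ltrim(rtrim(s.value))
order by total_content desc;
```

With this approach a co-production is counted once for every country it lists, so the totals can exceed the number of titles.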
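-- duration is stored as text (e.g. '90 min'), so convert it to a number before comparing
select top 1 with ties *
from dbo.netflix_titles
where type = 'Movie'
order by try_cast(replace(duration, ' min', '') as int) desc;
Objective: Find the movie with the longest duration.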
select *
from dbo.netflix_titles
where director like '%Rajiv Chilaka%';
Objective: List all content directed by 'Rajiv Chilaka'.
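-- duration is text such as '2 Seasons'; strip the suffix and cast it to compare numerically
select *,
try_cast(replace(replace(duration, ' Seasons', ''), ' Season', '') as int) as seasons
from dbo.netflix_titles
where type = 'TV Show'
and try_cast(replace(replace(duration, ' Seasons', ''), ' Season', '') as int) > 5;
Objective: Identify TV shows with more than 5 seasons.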
select * from dbo.netflix_titles where director is null
Objective: List content that does not have a director.
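Depending on how the CSV was imported, a missing director may be stored as an empty string rather than `NULL`. That is an assumption about the import step, not something the dataset guarantees. A minimal sketch that covers both cases:

```sql
-- Treat NULL and blank/whitespace-only director values as "no director".
select *
from dbo.netflix_titles
where director is null
   or ltrim(rtrim(director)) = '';
```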
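-- cast is bracketed because it is also the name of a T-SQL function
select * from dbo.netflix_titles where type = 'Movie' and [cast] like '%Salman Khan%';
Objective: Find all movies in which 'Salman Khan' appears in the cast.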
select category, count(*) as content_count
from (
select
case
when description like '%kill%' or
description like '%violence%' then 'bad_content'
else 'good_content'
end as category
from dbo.netflix_titles
) as categorized
group by category;
Objective: Categorize content as 'Bad' if its description contains 'kill' or 'violence' and 'Good' otherwise. Count the number of items in each category.