The document outlines a series of data manipulation tasks to be performed on a dataset. These tasks include checking the dataset's structure, handling missing values, creating new columns based on conditions, and performing group operations. The goal is to analyze and summarize information such as release years and unique country representations.
TDA Week1project
1. How many rows and columns are there in the dataset?
2. Print the names of all the columns.
3. Display the first 5 and last 3 rows of the dataset.
4. Select and display only the title and release_year columns.
5. Using iloc, show rows 10 to 20 (inclusive).
6. Check for missing values in each column.
7. Fill missing values in the country column with "Unknown" and in the director column with "Not Available".
8. Replace "United States" with "USA" in the country column.
9. Create a new column content_type that contains "Movie" if duration contains "min" and "TV Show" if it contains "Season".
10. Add a column years_since_release that shows how many years have passed since the release year (assuming the current year is 2025).
11. Group the dataset by release_year and count how many shows were released each year.
12. Find how many unique countries are represented in the dataset.
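One possible way to approach the tasks above with pandas is sketched below. The task sheet does not name the dataset file, so the sketch builds a small made-up DataFrame as a stand-in. The sample values and the variable names (shape, missing, per_year, and so on) are assumptions for illustration only. The column names title, director, country, release_year and duration are the ones mentioned in the tasks.

```python
import pandas as pd

# Hypothetical stand-in data (an assumption). In the actual project you would
# load the real file instead, for example with pd.read_csv(<path to your file>).
df = pd.DataFrame({
    "title": [f"Title {i}" for i in range(25)],
    "director": [None if i % 5 == 0 else f"Director {i}" for i in range(25)],
    "country": [None if i % 6 == 0 else ("United States" if i % 2 else "India")
                for i in range(25)],
    "release_year": [2015 + i % 5 for i in range(25)],
    "duration": ["90 min" if i % 3 else "2 Seasons" for i in range(25)],
})

# 1. Number of rows and columns
shape = df.shape
print("Rows, columns:", shape)

# 2. Column names
print(df.columns.tolist())

# 3. First 5 and last 3 rows
print(df.head(5))
print(df.tail(3))

# 4. Only the title and release_year columns
print(df[["title", "release_year"]])

# 5. Rows 10 to 20 inclusive; the end of an iloc slice is exclusive, hence 21
rows_10_to_20 = df.iloc[10:21]
print(rows_10_to_20)

# 6. Missing values per column
missing = df.isnull().sum()
print(missing)

# 7. Fill missing values
df["country"] = df["country"].fillna("Unknown")
df["director"] = df["director"].fillna("Not Available")

# 8. Replace whole-cell "United States" values with "USA"
df["country"] = df["country"].replace("United States", "USA")

# 9. content_type based on the text in duration
df["content_type"] = None
df.loc[df["duration"].str.contains("min", na=False), "content_type"] = "Movie"
df.loc[df["duration"].str.contains("Season", na=False), "content_type"] = "TV Show"

# 10. Years since release, taking 2025 as the current year (as the task states)
df["years_since_release"] = 2025 - df["release_year"]

# 11. Number of titles released per year
per_year = df.groupby("release_year").size()
print(per_year)

# 12. Number of distinct values in the country column
n_countries = df["country"].nunique()
print("Unique countries:", n_countries)
```

Two details are worth keeping in mind. Because step 7 runs before step 12, the placeholder "Unknown" is counted as one of the unique values; call nunique() before filling if it should be excluded. Also, replace() as used in step 8 only matches cells whose entire value is "United States", so cells that list several countries would need a string-level replacement instead.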