Practice with Pandas
This section covers the `Forbes_Billionaire_Homework.ipynb` notebook, which showcases in-class practice with the Pandas library. It focuses on analyzing and manipulating a dataset containing information about billionaires from the Forbes 2020 list.
- Mounting Google Drive in Google Colab: Enables seamless access to files stored in Google Drive for data loading.
- Importing the Pandas Library
- Loading the `real_estate.csv` Dataset into a DataFrame
- Displaying Basic Dataset Information:
- Use `.head()` and `.tail()` to view subsets of the data.
- Identify the Name at the 0th Index:
- Display the name assigned to the 0th index in the DataFrame.
- Print the Dataset Dimensions:
- Output the number of rows and columns in the dataset.
- Display Record Counts:
- Show the last record using the `.tail(1)` function.
- Display the total number of rows (records).
- Display the total number of columns (attributes).
- Check Data Types:
- Display the data types of all fields.
- Identify the data type of the `Source` variable.
- Check for Missing Values:
- Use `.isna().sum()` and `.isnull().sum()` to identify missing values.
- Analyze Data Values:
- Use `.value_counts()` to inspect values in specific columns.
- Check Column Types:
- Verify the types of the columns.
- Basic Statistics:
- Print descriptive statistics using the `.describe()` function.
- Identify the Youngest Billionaire:
- Find and display the youngest billionaire in the dataset.
- Sort and Reset the Index:
- Sort the dataset by `Age` in ascending order using `.sort_values()` and reset the index.
- Create a New Column:
- Add a column called `Year_of_birth`.
- Filter Billionaires Born in 1996:
- Select records of billionaires born in 1996.
- Count Billionaires Born in 1996:
- Display the total number of billionaires born in 1996 using the `.count()` function.
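
The inspection steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy rows are hypothetical stand-ins for the real file, and the column names `Name`, `Age`, and `Source` are assumptions inferred from the steps in the list.

```python
import pandas as pd

# Hypothetical toy rows standing in for the real dataset. The column names
# Name, Age, and Source are assumptions based on the steps listed above.
df = pd.DataFrame(
    {
        "Name": ["Person A", "Person B", "Person C", "Person D"],
        "Age": [56.0, 24.0, None, 24.0],
        "Source": ["Technology", "Retail", "Technology", None],
    }
)

# View subsets of the data.
print(df.head(2))   # first two records
print(df.tail(1))   # last record

# Name stored at index label 0.
first_name = df.loc[0, "Name"]

# Dimensions: number of rows (records) and columns (attributes).
n_rows, n_cols = df.shape

# Data types of all fields, and of the Source column alone.
dtypes = df.dtypes
source_dtype = df["Source"].dtype

# Missing values per column; .isnull() is an alias of .isna().
missing = df.isna().sum()

# Frequency of each value in one column (missing values are skipped by default).
source_counts = df["Source"].value_counts()

# Descriptive statistics for the numeric columns.
stats = df.describe()
print(stats)
```

In the notebook itself the DataFrame would come from `pd.read_csv(...)` on the file in Google Drive rather than from an inline dictionary.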
These exercises use Pandas functions to explore the dataset and extract insights: identifying the youngest billionaire, checking for missing data, creating new columns, and filtering records on specific criteria.
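
The transformation steps (youngest billionaire, sorting, the new column, and the 1996 filter) might look like the sketch below. The data is again hypothetical, and deriving `Year_of_birth` as 2020 minus `Age` is an assumption based on the list year; the notebook may compute it differently.

```python
import pandas as pd

# Hypothetical toy rows; the column names are assumptions, as before.
df = pd.DataFrame(
    {
        "Name": ["Person A", "Person B", "Person C", "Person D"],
        "Age": [56, 24, 71, 24],
    }
)

# Youngest billionaire(s): every row whose Age equals the minimum,
# so ties are all returned rather than just the first match.
youngest = df[df["Age"] == df["Age"].min()]

# Sort by Age in ascending order and rebuild a clean 0..n-1 index.
df_sorted = df.sort_values("Age").reset_index(drop=True)

# Assumption: birth year is approximated as the list year minus Age.
LIST_YEAR = 2020
df_sorted["Year_of_birth"] = LIST_YEAR - df_sorted["Age"]

# Keep only the records with a birth year of 1996.
born_1996 = df_sorted[df_sorted["Year_of_birth"] == 1996]

# .count() returns the number of non-missing values in the column.
n_born_1996 = born_1996["Name"].count()
print(n_born_1996)
```

Comparing against the minimum instead of calling `.idxmin()` is a deliberate choice here, since `.idxmin()` returns only one row when several people share the lowest age.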