Practice with Pandas

This section presents the Forbes_Billionaire_Homework.ipynb notebook, an in-class exercise with the Pandas library. It analyzes and manipulates a dataset of billionaires from the Forbes 2020 list.

  • Mounting Google Drive in Google Colab: Gives the notebook access to files stored in Google Drive for data loading.
  • Importing the Pandas Library
  • Loading the Forbes 2020 Billionaires Dataset (CSV) into a DataFrame
  • Displaying Basic Dataset Information:
    • Use .head() and .tail() to view subsets of the data.
  • Identify the Name at the 0th Index:
    • Display the name assigned to the 0th index in the DataFrame.
  • Print the Dataset Dimensions:
    • Output the number of rows and columns in the dataset.
  • Display Record Counts:
    • Show the last record using the .tail(1) function.
    • Display the total number of rows (records).
    • Display the total number of columns (attributes).
  • Check Data Types:
    • Display the data types of all fields.
    • Identify the data type of the Source variable.
  • Check for Missing Values:
    • Use .isna().sum() or .isnull().sum() to count missing values per column (.isnull() is an alias of .isna(), so both give the same result).
  • Analyze Data Values:
    • Use .value_counts() to inspect values in specific columns.
  • Check Column Types:
    • Verify the types of the columns.
  • Basic Statistics:
    • Print descriptive statistics using the .describe() function.
  • Identify the Youngest Billionaire:
    • Find and display the youngest billionaire in the dataset.
  • Sort and Reset the Index:
    • Sort the dataset by Age in ascending order using .sort_values() and reset the index.
  • Create a New Column:
    • Add a column called Year_of_birth.
  • Filter Billionaires Born in 1996:
    • Select records of billionaires born in 1996.
  • Count Billionaires Born in 1996:
    • Display the total number of billionaires born in 1996 using the .count() method, which counts the non-null values in each column.
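
The inspection steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are made-up placeholders built inline instead of being loaded from Google Drive, and only the Name, Age, and Source column names are taken from the exercise description.

```python
import pandas as pd

# Hypothetical stand-in rows; the notebook itself loads a CSV from Google Drive.
df = pd.DataFrame(
    {
        "Name": ["Person A", "Person B", "Person C", "Person D"],
        "Age": [56.0, 24.0, None, 71.0],  # one missing value on purpose
        "Source": ["Retail", "Software", "Retail", "Energy"],
    }
)

# View subsets of the data.
first_rows = df.head(2)
last_row = df.tail(1)

# Name stored at index label 0.
name_at_zero = df.loc[0, "Name"]

# Dimensions: (number of rows, number of columns).
n_rows, n_cols = df.shape

# Data types of all fields, and of the Source column alone.
all_dtypes = df.dtypes
source_dtype = df["Source"].dtype

# Missing values per column.
missing_per_column = df.isna().sum()

# Frequency of each distinct value in a column.
source_counts = df["Source"].value_counts()

# Descriptive statistics for a numeric column.
age_stats = df["Age"].describe()

print(first_rows)
print(last_row)
print(name_at_zero, n_rows, n_cols)
print(all_dtypes)
print(missing_per_column)
print(source_counts)
print(age_stats)
```

With real data, the same calls apply unchanged once df is created by pd.read_csv on the dataset file.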

These exercises use Pandas to explore the dataset: identifying the youngest billionaire, checking for missing values, creating new columns, and filtering records by specific criteria.
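
As a rough sketch of the sorting, column-creation, and filtering steps, the snippet below uses made-up placeholder rows. It assumes that Year_of_birth is derived as the list year (2020) minus Age; the notebook may compute it differently, and the LIST_YEAR name is introduced here only for illustration.

```python
import pandas as pd

LIST_YEAR = 2020  # assumption: ages are measured as of the 2020 list

# Hypothetical stand-in rows, not actual Forbes data.
df = pd.DataFrame(
    {
        "Name": ["Person A", "Person B", "Person C", "Person D", "Person E"],
        "Age": [56, 24, 38, 71, 24],
    }
)

# Youngest billionaire(s): keep every row that ties for the minimum age.
youngest = df[df["Age"] == df["Age"].min()]

# Sort by Age in ascending order and rebuild the index as 0..n-1.
sorted_df = df.sort_values("Age").reset_index(drop=True)

# New column derived from Age (see the assumption above).
sorted_df["Year_of_birth"] = LIST_YEAR - sorted_df["Age"]

# Boolean-mask filter: records of people born in 1996.
born_1996 = sorted_df[sorted_df["Year_of_birth"] == 1996]

# .count() on a single column returns its number of non-null values.
n_born_1996 = born_1996["Name"].count()

print(youngest)
print(sorted_df)
print(n_born_1996)
```

Passing drop=True to reset_index discards the old index rather than keeping it as an extra column, which is usually what is wanted after a sort.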