Looking for test answers and solutions for Batch-01_BSc_Semester-01_Basics Of Data Analytics? Browse our large collection of verified answers for Batch-01_BSc_Semester-01_Basics Of Data Analytics at iitjbsc.futurense.com.
Which of the following are indicators of good data visualizations? [Select all that apply]
To complete this assignment, please refer to the dataset linked below.
Customer_Survey_Dataset
🔧 Part A: Hands-On Data Cleaning Tasks
Dataset: Use any simple CSV file (e.g., customer information, survey data) with numeric and categorical columns.
Remove Duplicates
Identify and drop duplicate rows from the dataset.
Show the number of rows before and after.
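The duplicate-removal task can be sketched with pandas as follows. This is a minimal illustration, not the assignment's reference solution: the sample rows and the column names (customer_id, age, city) are made up for the example and are not taken from Customer_Survey_Dataset.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions for illustration only.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "age": [34, 45, 45, 29],
    "city": ["Pune", "Delhi", "Delhi", "Mumbai"],
})

rows_before = len(df)

# drop_duplicates() keeps the first occurrence of each fully identical row.
df_clean = df.drop_duplicates().reset_index(drop=True)
rows_after = len(df_clean)

print(f"Rows before: {rows_before}, rows after: {rows_after}")
```

To treat rows as duplicates based on a key column only, the subset argument of drop_duplicates can be used instead of comparing whole rows.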
Handle Missing Values
Find all missing values.
Replace missing numerical values with the column mean and missing categorical values with the column mode.
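One possible way to carry out the missing-value task with pandas is shown below. The data frame and its columns (age, gender) are hypothetical, and the sketch assumes every column has at least one non-missing value so that a mean or mode exists.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing numeric and one missing categorical value.
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 30.0],
    "gender": ["F", "M", None, "M"],
})

# Count missing values per column.
missing_counts = df.isna().sum()
print(missing_counts)

filled = df.copy()
for col in filled.columns:
    if pd.api.types.is_numeric_dtype(filled[col]):
        # Numeric column: impute with the mean of the observed values.
        filled[col] = filled[col].fillna(filled[col].mean())
    else:
        # Categorical column: impute with the most frequent value (first mode on ties).
        filled[col] = filled[col].fillna(filled[col].mode().iloc[0])

print(filled)
```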
Detect and Handle Data Errors
Manually introduce one incorrect age (e.g., -5 or 135).
Detect and handle the error using logical reasoning.
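The error-handling task might look like the sketch below. The plausible age range of 0 to 120 and the choice to replace invalid ages with the median of the valid ones are assumptions made for this example; dropping the rows or correcting them from another source would be equally defensible.

```python
import pandas as pd

# Hypothetical data with two deliberately incorrect ages (-5 and 135).
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena", "John"],
    "age": [28, -5, 41, 135],
})

# Assumed plausibility bounds for a human age.
MIN_AGE, MAX_AGE = 0, 120

# Flag rows whose age falls outside the plausible range.
invalid = ~df["age"].between(MIN_AGE, MAX_AGE)
print(df[invalid])

# Blank out invalid ages (they become NaN), then fill with the median of the valid ages.
valid_age = df["age"].where(~invalid)
df["age_clean"] = valid_age.fillna(valid_age.median())
print(df)
```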
Correct Formatting Issues
Format a column with inconsistent date formats (e.g., "2024-01-01", "01/01/2024").
Standardize all dates to YYYY-MM-DD.
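A sketch of the date standardization follows. It assumes only the two formats mentioned in the task occur and that slash-separated dates are day/month/year; a value such as "01/02/2024" is ambiguous, so that convention has to be confirmed against the real data. Each format is parsed explicitly rather than relying on automatic inference.

```python
import pandas as pd

# Hypothetical column mixing ISO dates and slash-separated dates.
raw = pd.Series(["2024-01-01", "01/01/2024", "15/02/2024", "2024-03-20"])

# Parse each known format separately; values that do not match become NaT.
iso = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")
slashed = pd.to_datetime(raw, format="%d/%m/%Y", errors="coerce")  # assumed DD/MM/YYYY

# Combine the two parses, then render every date as YYYY-MM-DD.
parsed = iso.fillna(slashed)
standardized = parsed.dt.strftime("%Y-%m-%d")
print(standardized.tolist())
```

Any value that matches neither format stays missing after this step, which makes unparsed dates easy to spot with parsed.isna().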
Compare Two Data Sources
Create two mini datasets with a record mismatch (e.g., salary of the same person differs).
Detect the discrepancy and choose which value to retain, providing justification.
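The comparison of two sources can be sketched as a merge followed by a column comparison. The two mini datasets (hr and payroll), the emp_id key, and the decision to trust the payroll value are all hypothetical; the justification in a real submission depends on which source is actually more reliable.

```python
import pandas as pd

# Two hypothetical sources describing the same employees.
hr = pd.DataFrame({"emp_id": [1, 2, 3], "salary": [50000, 62000, 58000]})
payroll = pd.DataFrame({"emp_id": [1, 2, 3], "salary": [50000, 65000, 58000]})

# Join on the shared key so the two salary values sit side by side.
merged = hr.merge(payroll, on="emp_id", suffixes=("_hr", "_payroll"))

# Rows where the sources disagree.
mismatches = merged[merged["salary_hr"] != merged["salary_payroll"]]
print(mismatches)

# Assumption for this example: payroll is retained because it records what was actually paid.
merged["salary_final"] = merged["salary_payroll"]
print(merged)
```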
🧮 Part B: Hands-On Transformation Tasks
Standardization (Z-score Normalization)
Apply standardization to a numeric column.
Show the original mean and standard deviation, and verify the new column’s mean is ~0 and std dev is ~1.
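As an illustration of z-score standardization, the sketch below uses a hypothetical income column. It uses the population standard deviation (ddof=0); pandas defaults to the sample standard deviation (ddof=1), so the same ddof must be used when verifying the result.

```python
import pandas as pd

# Hypothetical numeric column.
df = pd.DataFrame({"income": [30000, 45000, 52000, 61000, 75000]})

mean = df["income"].mean()
std = df["income"].std(ddof=0)  # population standard deviation
print(f"Original mean: {mean:.2f}, original std dev: {std:.2f}")

# z = (x - mean) / std
df["income_z"] = (df["income"] - mean) / std

new_mean = df["income_z"].mean()
new_std = df["income_z"].std(ddof=0)
print(f"Standardized mean: {new_mean:.6f}, standardized std dev: {new_std:.6f}")
```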
Min-Max Scaling
Apply Min-Max scaling to a numeric column.
Confirm values are scaled between 0 and 1.
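Min-Max scaling can be sketched in a few lines. The score column is hypothetical, and the formula assumes the column is not constant, since a column whose minimum equals its maximum would cause a division by zero.

```python
import pandas as pd

# Hypothetical numeric column.
df = pd.DataFrame({"score": [12, 45, 30, 88, 60]})

lo = df["score"].min()
hi = df["score"].max()

# x_scaled = (x - min) / (max - min)
df["score_scaled"] = (df["score"] - lo) / (hi - lo)

print(df)
print("Scaled min:", df["score_scaled"].min(), "Scaled max:", df["score_scaled"].max())
```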
Log Transformation
Apply a log transformation to a skewed column (use log(x + 1) if the column contains zeros).
Plot histogram before and after to visualize distribution improvement.
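The log transformation and the before/after histograms might be produced as sketched below. The spend column is invented, the output file name is arbitrary, and the non-interactive Agg backend is chosen only so the script runs without a display. np.log1p computes log(1 + x), which handles the zero in the data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; the figure is saved instead of shown
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical right-skewed column that includes a zero.
df = pd.DataFrame({"spend": [0, 5, 8, 12, 20, 35, 60, 150, 400, 1200]})

# log1p(x) = log(1 + x), defined at x = 0.
df["spend_log"] = np.log1p(df["spend"])

# Skewness gives a numeric check alongside the visual one.
skew_before = df["spend"].skew()
skew_after = df["spend_log"].skew()
print(f"Skew before: {skew_before:.2f}, skew after: {skew_after:.2f}")

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hist(df["spend"], bins=10)
axes[0].set_title("Before log transform")
axes[1].hist(df["spend_log"], bins=10)
axes[1].set_title("After log1p transform")
fig.tight_layout()
fig.savefig("log_transform_histograms.png")  # arbitrary file name
```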
Categorical Encoding
Use one-hot encoding for a gender column.
Use label encoding for a color column.
Use ordinal encoding for a satisfaction rating column (Low, Medium, High).
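The three encodings can be sketched with plain pandas. The sample values are hypothetical. Label encoding is done here with category codes, which number the categories in alphabetical order; that numbering carries no meaning, unlike the ordinal mapping, which is written out by hand to preserve Low < Medium < High.

```python
import pandas as pd

# Hypothetical categorical columns.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male"],
    "color": ["red", "blue", "green", "blue"],
    "satisfaction": ["Low", "High", "Medium", "Low"],
})

# One-hot encoding: one 0/1 indicator column per gender value.
gender_dummies = pd.get_dummies(df["gender"], prefix="gender", dtype=int)

# Label encoding: arbitrary integer per color (alphabetical category order).
df["color_label"] = df["color"].astype("category").cat.codes

# Ordinal encoding: explicit mapping that preserves the rating order.
satisfaction_order = {"Low": 0, "Medium": 1, "High": 2}
df["satisfaction_ord"] = df["satisfaction"].map(satisfaction_order)

encoded = pd.concat([df, gender_dummies], axis=1)
print(encoded)
```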
You are given a dataset on purchases with the following information.
Purchase ID | Purchase Date | Item | Quantity | Amount | Mode of Payment |
Match the analytics questions in Table 1 with the appropriate descriptive analytics in Table 2.
Table 1
The highest value of the purchase done
The most commonly used mode of payment
The busiest date of the store
The most commonly sold item
Table 2
Calculate the frequency of each Mode of Payment and identify the one with the highest frequency
Calculate the frequency of each Item and identify the one with the highest frequency
Sort the data from largest to smallest Amount and identify the highest value
Calculate the number of purchases per Purchase Date and identify the date with the highest number of purchases
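The four operations described in Table 2 can be sketched in pandas as follows. The column names come from the question, but the purchase records themselves are invented for illustration. Item frequency is counted per purchase row, as Table 2 describes, rather than summed over Quantity.

```python
import pandas as pd

# Hypothetical purchase records using the column names given in the question.
purchases = pd.DataFrame({
    "Purchase ID": [101, 102, 103, 104, 105],
    "Purchase Date": ["2024-01-05", "2024-01-05", "2024-01-06", "2024-01-05", "2024-01-07"],
    "Item": ["Pen", "Notebook", "Pen", "Bag", "Pen"],
    "Quantity": [2, 1, 5, 1, 3],
    "Amount": [40, 120, 100, 950, 60],
    "Mode of Payment": ["UPI", "Card", "UPI", "Card", "UPI"],
})

# Sort by Amount, largest first, and read off the top value.
highest_amount = purchases.sort_values("Amount", ascending=False)["Amount"].iloc[0]

# Frequency of each mode of payment; idxmax returns the most frequent one.
top_payment = purchases["Mode of Payment"].value_counts().idxmax()

# Number of purchases per date; the date with the most purchases is the busiest.
busiest_date = purchases["Purchase Date"].value_counts().idxmax()

# Frequency of each item across purchase rows.
top_item = purchases["Item"].value_counts().idxmax()

print(highest_amount, top_payment, busiest_date, top_item)
```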
Which of the following features cannot be engineered using information on an individual’s address?
Which of the following is NOT possible to figure out by filtering the data?
Which of the following is NOT possible to figure out by sorting the data?