Data Analysis with Python Cognitive Class Exam Quiz Answers

Clear My Certification, September 18, 2020, Cognitive Class

Data Analysis with Python Cognitive Class Certification Answers

Module 1 – Introduction

Question 1: What does CSV stand for?

Comma Separated Values
Car Sold values
Car State values
None of the above

Question 2: In the data set what represents an attribute or feature?

Row
Column
Each element in the data set

Question 3: What is another name for the variable that we want to predict?

Target
Feature
Dataframe

Question 4: What is the command to display the first five rows of a dataframe df?

df.head()
df.tail()

Question 5: What command do you use to get the data type of each column of the dataframe df?

df.dtypes
df.head()
df.tail()

Question 6: How do you get a statistical summary of a dataframe df?

df.describe()
df.head()
df.tail()

Question 7: If you use the method describe() without changing any of the arguments, you will get a statistical summary of all the columns of type object.

False
True
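
The pandas calls from Questions 4 to 7 can be tried on a small dataframe. The sketch below is only an illustration: the column names and values are made up and are not the course data set.

```python
import pandas as pd

# Hypothetical toy data; not the course data set.
df = pd.DataFrame({
    "make": ["audi", "bmw", "audi", "volvo", "bmw", "saab"],
    "price": [13950.0, 16430.0, 17450.0, 12940.0, 20970.0, 11850.0],
    "doors": [4, 2, 4, 4, 2, 2],
})

first_rows = df.head()      # first five rows
last_rows = df.tail()       # last five rows
column_types = df.dtypes    # one data type per column

numeric_summary = df.describe()             # numeric columns only by default
full_summary = df.describe(include="all")   # also summarizes non-numeric columns

print(column_types)
print(numeric_summary.columns.tolist())
print(full_summary.columns.tolist())
```

With the default arguments, the text column "make" is left out of the summary; it only appears when include="all" is passed.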

Module 2 – Data Wrangling

Question 1: Consider the dataframe “df”. What is the result of the following operation: df['symbolling'] = df['symbolling'] + 1?

Every element in the column “symbolling” will increase by one
Every element in the row “symbolling” will increase by one
Every element in the dataframe will increase by one

Question 2: Consider the dataframe “df”. What does the command df.rename(columns={'a': 'b'}) change about the dataframe “df”?

rename column “a” of the dataframe to “b”
rename the row “a” to “b”
nothing, as you must set the parameter inplace=True
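
A quick way to check the behaviour of rename is to run it with and without inplace=True. This is a minimal sketch on a made-up one-column dataframe:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# Without inplace=True, rename returns a new dataframe and leaves df alone.
renamed = df.rename(columns={"a": "b"})
columns_before = list(df.columns)
print(columns_before)          # df still has column "a"
print(list(renamed.columns))   # the returned copy has column "b"

# With inplace=True, df itself is modified and the call returns None.
df.rename(columns={"a": "b"}, inplace=True)
columns_after = list(df.columns)
print(columns_after)
```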

Question 3: Consider the dataframe “df”. What is the result of the following operation: df['price'] = df['price'].astype(int)?

convert or cast the row ‘price’ to an integer value
convert or cast the column ‘price’ to an integer value
convert or cast the entire dataframe to an integer value

Question 4: Consider the column of the dataframe df['a']. The column has been standardized. What is the standard deviation of the values, i.e., the result of applying the following operation: df['a'].std()?
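
Standardization here is taken to mean the usual z-score (subtract the mean, divide by the standard deviation); that reading is an assumption, since the question does not spell it out. A small sketch with made-up values:

```python
import pandas as pd

# Hypothetical values for column "a".
df = pd.DataFrame({"a": [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]})

# z-score standardization using pandas' own mean() and std()
df["a"] = (df["a"] - df["a"].mean()) / df["a"].std()

col_mean = df["a"].mean()
col_std = df["a"].std()
print(round(col_mean, 10), round(col_std, 10))
```

After this transform the column has a mean of (numerically) zero and a standard deviation of one.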

Question 5: Consider the column of the dataframe df['Fuel'], with two values 'gas' and 'diesel'. What will be the names of the new columns from pd.get_dummies(df['Fuel'])?

1 and 0
Just diesel
Just gas
Gas and diesel

Question 6: What are the values of the new columns from Question 5?

1 and 0
Just diesel
Just gas
Gas and diesel
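
Questions 5 and 6 can be checked with a few made-up rows. Note that the element type of the result depends on the pandas version (0/1 integers in older releases, True/False in newer ones); passing dtype=int asks for 0/1 explicitly.

```python
import pandas as pd

# Hypothetical rows; only the two fuel values come from the question.
df = pd.DataFrame({"Fuel": ["gas", "diesel", "gas", "gas"]})

dummies = pd.get_dummies(df["Fuel"])                 # one column per category
dummies_int = pd.get_dummies(df["Fuel"], dtype=int)  # force 0/1 values

print(dummies.columns.tolist())
print(dummies_int)
```

The new columns are named after the category values, and each row holds a 1 in the column that matches its original value and a 0 in the other.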

Module 3 – Exploratory Data Analysis

Question 1: Consider the dataframe “df”. Which method provides the summary statistics?

df.describe()
df.head()
df.tail()
df.summary()

Question 2: Consider the following dataframe:

df_test = df[['body-style', 'price']]

The following operation is applied:

df_grp = df_test.groupby(['body-style'], as_index=False).mean()

What are the resulting values of df_grp['price']?

The average price for each body style
The average price
The average body style
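
The grouping from Question 2 can be reproduced on made-up data. Selecting two columns needs a list inside the indexing brackets, so the selection is written with double brackets here.

```python
import pandas as pd

# Hypothetical prices; not the course data set.
df = pd.DataFrame({
    "body-style": ["sedan", "sedan", "wagon", "wagon", "hatchback"],
    "price": [10000.0, 14000.0, 9000.0, 11000.0, 8000.0],
    "make": ["audi", "bmw", "volvo", "audi", "saab"],
})

df_test = df[["body-style", "price"]]
df_grp = df_test.groupby(["body-style"], as_index=False).mean()
print(df_grp)
```

Each row of df_grp holds one body style together with the mean price of the cars in that group.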

Question 3: Correlation implies causation:

False
True

Question 4: What is the minimum possible value of Pearson’s correlation?

1
-100
-1

Question 5: What is the Pearson correlation between variables X and Y, if X=Y?

-1
1
0
X
Y
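
The two extreme cases of the Pearson correlation can be checked numerically. A minimal sketch with made-up numbers:

```python
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

r_same = x.corr(x)        # a variable against itself (X = Y)
r_opposite = x.corr(-x)   # a variable against its negation

print(r_same, r_opposite)
```

Series.corr uses the Pearson method by default, so the first value is the upper bound and the second is the lower bound of the coefficient.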

Module 4 – Model Development

Question 1: Let X be a dataframe with 100 rows and 5 columns, and let y be the target with 100 samples. Assuming all the relevant libraries and data have been imported, the following lines of code have been executed:

LR = LinearRegression()

LR.fit(X, y)

yhat = LR.predict(X)

How many samples does yhat contain?

5
500
100
0
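
The code from Question 1 can be run on random data of the stated size to see the length of the prediction. The data below is synthetic and only serves to give X and y the right shapes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: 100 rows, 5 feature columns, 100 target values.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=100)

LR = LinearRegression()
LR.fit(X, y)
yhat = LR.predict(X)

print(yhat.shape)  # one prediction per row of X
```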

Question 2: What value of R^2 (coefficient of determination) indicates your model performs best?

-100
-1
0
1

Question 3: What statement is true about Polynomial linear regression?

Polynomial linear regression is not linear in any way
Although the predictor variables of Polynomial linear regression are not linear, the relationship between the parameters or coefficients is linear.
Polynomial linear regression uses wavelets

Question 4: The larger the mean square error, the better your model has performed

False
True

Question 5: Assume all the libraries are imported, y is the target and X is the features or independent variables. Consider the following lines of code:

Input = [('scale', StandardScaler()), ('model', LinearRegression())]

pipe = Pipeline(Input)

pipe.fit(X,y)

ypipe = pipe.predict(X)

What have we just done in the above code?

Polynomial transform, Standardize the data, then perform a prediction using a linear regression model
Standardize the data, then perform prediction using a linear regression model
Polynomial transform then Standardize the data
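
The pipeline from Question 5 can be run end to end as follows. The variable names follow the question; the data is synthetic and the imports are the ones the question assumes are already in place.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features with a non-zero mean so the scaling step is visible.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(60, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(size=60)

# Steps run in order: scale the features, then fit the linear model.
Input = [("scale", StandardScaler()), ("model", LinearRegression())]
pipe = Pipeline(Input)
pipe.fit(X, y)
ypipe = pipe.predict(X)

# Inspect what the first step does to the features.
scaled = pipe.named_steps["scale"].transform(X)
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```

There is no PolynomialFeatures step in the list, so no polynomial transform takes place.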

Module 5 – Model Evaluation

Question 1: In the following plot, the vertical axis shows the mean square error and the horizontal axis represents the order of the polynomial. The red line represents the training error and the blue line is the test error. What is the best order of the polynomial given the possible choices on the horizontal axis?

Question 2: What is the correct use of the “train_test_split” function such that 40% of the data samples will be utilized for testing, the parameter “random_state” is set to zero, and the input variables for the features and targets are x_data and y_data respectively?

train_test_split(x_data, y_data, test_size=0, random_state=0.4)
train_test_split(x_data, y_data, test_size=0.4, random_state=0)
train_test_split(x_data, y_data)
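
The effect of test_size can be checked on a small synthetic array. The names x_data and y_data follow the question; the values are made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 50 samples with 2 features each.
x_data = np.arange(100).reshape(50, 2)
y_data = np.arange(50)

# test_size is the fraction held out; random_state fixes the shuffle.
x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.4, random_state=0
)

print(len(x_train), len(x_test))
```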

Question 3: What is the output of cross_val_score(lre, x_data, y_data, cv=2)?

The predicted values of the test data using cross validation.
The average R^2 on the test data for each of the two folds
This function finds the free parameter alpha
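
To see what cross_val_score returns, it helps to print the result. In this sketch lre is assumed to be a LinearRegression object, which the question does not state, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
x_data = rng.normal(size=(40, 2))
y_data = 3.0 * x_data[:, 0] + rng.normal(scale=0.1, size=40)

lre = LinearRegression()  # assumed estimator
scores = cross_val_score(lre, x_data, y_data, cv=2)

print(scores)         # one score per fold
print(scores.mean())  # average over the folds
```

The function returns an array with one score per fold, computed on the held-out part of that fold. For a regressor the default score is R^2, and averaging the folds is a separate step.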

Question 4: What is the code to create a ridge regression object “RR” with an alpha term equal to 10?

RR=LinearRegression(alpha=10)
RR=Ridge(alpha=10)
RR=Ridge(alpha=1)
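
The options can be compared by running them. A minimal sketch, using scikit-learn's Ridge and LinearRegression classes:

```python
from sklearn.linear_model import LinearRegression, Ridge

RR = Ridge(alpha=10)  # alpha sets the regularization strength
print(RR.alpha)

# LinearRegression has no alpha parameter, so the first option fails.
try:
    LinearRegression(alpha=10)
    linear_accepts_alpha = True
except TypeError:
    linear_accepts_alpha = False

print(linear_accepts_alpha)
```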

Question 5: What dictionary value would we use to perform a grid search for the following values of alpha: 1, 10, 100? No other parameter values should be tested.

alpha=[1,10,100]
[{'alpha': [1,10,100]}]
[{'alpha': [0.001,0.1,1, 10, 100, 1000,10000,100000,100000], 'normalize': [True,False]}]
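
A parameter grid of this form is what GridSearchCV expects. The sketch below assumes a Ridge estimator and synthetic data; the name parameters is only illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.0, 0.5, -1.0, 2.0]) + rng.normal(scale=0.5, size=60)

# Only alpha is listed, so only alpha is varied.
parameters = [{"alpha": [1, 10, 100]}]

grid = GridSearchCV(Ridge(), parameters, cv=4)
grid.fit(X, y)

tested = [p["alpha"] for p in grid.cv_results_["params"]]
best = grid.best_params_["alpha"]
print(tested, best)
```

Every key in the dictionary adds a dimension to the search, which is why a grid that also lists normalize would test more than the three requested values.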

Data Analysis with Python Final Exam Answers – Cognitive Class

Question 1: What does the following command do:

df.dropna(subset=["price"], axis=0)

Drop the rows with “not a number” (NaN) in the column price
Drop the row price
Rename the data frame price
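
The effect of the subset argument shows up when a second column also has missing values. A sketch with made-up numbers (the horsepower column is an illustrative addition):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "price": [13495.0, np.nan, 16500.0],
    "horsepower": [111.0, 154.0, np.nan],
})

# axis=0 drops rows; subset limits the NaN check to the "price" column.
cleaned = df.dropna(subset=["price"], axis=0)
print(cleaned)
```

The row with a missing price is removed, while the row that is only missing horsepower stays. The original df is unchanged because inplace was not set.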

Question 2: How would you provide many of the summary statistics for all the columns in the dataframe “df”?

df.describe(include="all")
df.head()
type(df)
df.shape

Question 3: How would you find the shape of the dataframe df?

df.describe()
df.head()
type(df)
df.shape

Question 4: What task does the command df.to_csv("A.csv") perform?

change the name of the column to “A.csv”
load the data from a csv file called “A” into a dataframe
Save the dataframe df to a csv file called “A.csv”
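
A round trip through a file shows what to_csv does. This sketch writes into a temporary directory rather than the working directory, and passes index=False so the row index is not written as an extra column; both choices are mine, not part of the question.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Write the dataframe to a file named A.csv inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "A.csv")
df.to_csv(path, index=False)

# Reading the file back gives the same table.
loaded = pd.read_csv(path)
print(loaded)
```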

Question 5: What task does the following line of code perform:

df['peak-rpm'].replace(np.nan, 5, inplace=True)

replace the not a number values with 5 in the column ‘peak-rpm’
rename the column ‘peak-rpm’ to 5
add 5 to the data frame
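
The same replacement can be written by assigning the result back to the column. The sketch uses that form because in-place changes on a selected column may trigger warnings or have no effect in newer pandas versions; the values are made up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"peak-rpm": [5000.0, np.nan, 5500.0]})

# Replace missing values with 5 and store the result in the same column.
df["peak-rpm"] = df["peak-rpm"].replace(np.nan, 5)

print(df["peak-rpm"].tolist())
```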

Question 6: What task does the following line of code perform:

df['peak-rpm'].replace(np.nan, 5, inplace=True)

replace the not a number values with 5 in the column ‘peak-rpm’
rename the column ‘peak-rpm’ to 5
add 5 to the data frame

Question 7: How do you “one hot encode” the column 'fuel-type' in the dataframe df?

pd.get_dummies(df["fuel-type"])
df.mean(["fuel-type"])
df[df["fuel-type"] == 1] = 1

Question 8: What does the vertical axis in a scatter plot represent?

independent variable
dependent variable

Question 9: What does the horizontal axis in a scatter plot represent?

independent variable
dependent variable

Question 10: If we have 10 columns and 100 samples, how large is the output of df.corr()?

10 x 100
10 x 10
100 x 100
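
The size of the correlation matrix can be checked on random data with the stated dimensions. The column names below are made up.

```python
import numpy as np
import pandas as pd

# Synthetic dataframe: 100 samples (rows) and 10 columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(100, 10)),
    columns=[f"c{i}" for i in range(10)],
)

corr = df.corr()  # pairwise Pearson correlation between columns
print(corr.shape)
print(np.diag(corr))  # each column against itself
```

The result has one row and one column per dataframe column, and the diagonal holds the correlation of each column with itself.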

Question 11: What is the largest possible element resulting from the operation “df.corr()”?

100
1000
1

Question 12: If the Pearson Correlation of two variables is zero:

the two variables have zero mean
the two variables are not correlated

Question 13: If the p value of the Pearson Correlation is 1:

the variables are correlated
the variables are not correlated
none of the above

Question 14: What does the following line of code do: lm = LinearRegression()

fit a regression object lm
create a linear regression object
predict a value

Question 15: If the predicted function is:

Yhat = a + b1 X1 + b2 X2 + b3 X3 + b4 X4

The method is

Polynomial Regression
Multiple Linear Regression

Question 16: What steps do the following lines of code perform:

Input=[('scale',StandardScaler()),('model',LinearRegression())]

pipe=Pipeline(Input)

pipe.fit(Z,y)

ypipe=pipe.predict(Z)

Standardize the data, then perform a polynomial transform on the features Z
find the correlation between Z and y
Standardize the data, then perform a prediction using a linear regression model using the features Z and targets y

Question 17: What is the maximum value of R^2 that can be obtained?

Question 18: We create a polynomial feature as follows: “PolynomialFeatures(degree=2)”. What is the order of the polynomial?
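
The degree argument can be explored by transforming a single made-up sample with two features:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])  # one sample with features x0 = 2 and x1 = 3

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Output columns: 1, x0, x1, x0^2, x0*x1, x1^2
print(X_poly)
```

The highest power that appears in the generated terms equals the degree passed to the constructor.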

Question 19: You have a linear model whose average R^2 value on your training data is 0.5. You perform a 100th order polynomial transform on your data, then use these values to train another model, and your average R^2 is 0.99. Which comment is correct?

100-th order polynomial will work better on unseen data
You should always use the simplest model
The results on your training data are not the best indicator of how your model performs; you should use your test data to get a better idea

Question 20: You train a ridge regression model. You get an R^2 of 1 on your training data and an R^2 of 0 on your validation data. What should you do?

Nothing, your model performs flawlessly on your test data
Your model is underfitting, perform a polynomial transform
Your model is overfitting, increase the parameter alpha

Introduction to Data Analysis with Python

Python is widely used for data analysis. It offers libraries and tools for each stage of the work, from data manipulation and cleaning to visualization and statistical modeling. Here’s a general overview of the key steps and libraries involved:

Data Collection: Start by gathering your data from various sources such as databases, CSV files, APIs, or web scraping.
Data Cleaning and Preprocessing: Use libraries like Pandas to clean and preprocess your data. This involves handling missing values, removing duplicates, converting data types, and more.
Exploratory Data Analysis (EDA): Explore your data to understand its structure, patterns, and relationships. Matplotlib, Seaborn, and Plotly are popular libraries for creating visualizations.
Statistical Analysis: Perform statistical analysis to uncover insights and trends in your data. You can use libraries like SciPy and StatsModels for statistical tests and modeling.
Machine Learning: If applicable, build machine learning models to make predictions or classify data. Scikit-learn is a powerful library for implementing machine learning algorithms in Python.
Data Visualization: Visualize your findings using libraries like Matplotlib, Seaborn, Plotly, or Bokeh to create informative and visually appealing plots and charts.
Reporting: Communicate your results effectively through reports, dashboards, or presentations using tools like Jupyter Notebooks, Dash, or Streamlit.

Some popular Python libraries for data analysis:

Pandas: Offers data structures and functions for efficient data manipulation and analysis.
NumPy: Provides support for numerical operations and arrays, often used in conjunction with Pandas.
Matplotlib: A versatile plotting library for creating static, interactive, and animated visualizations.
Seaborn: Built on top of Matplotlib, Seaborn offers a higher-level interface for statistical data visualization.
Scikit-learn: A comprehensive library for machine learning tasks such as classification, regression, clustering, and dimensionality reduction.
StatsModels: Offers classes and functions for estimating and interpreting various statistical models.
Plotly: Provides interactive and web-based visualizations, suitable for creating dashboards and presentations.
Jupyter Notebooks/JupyterLab: Interactive environments for writing and sharing code, visualizations, and narratives.

To get started with data analysis in Python, you can follow online tutorials, enroll in courses, or work on real-world projects to gain hands-on experience. Additionally, exploring documentation and examples of the aforementioned libraries will help you become proficient in data analysis with Python.
