Enroll Here: A Foundation Program in Data Science Certification
Question 1: Fill in the blanks with the correct option(s): Logistic regression is a ____________ regression technique that is used to model data having a ________ outcome.
- linear, numeric
- linear, binary
- nonlinear, numeric
- nonlinear, binary
Question 2: Which of the following is NOT a supervised learning method?
- PCA
- Decision Tree
- Linear Regression
- Naive Bayes
Question 3: Which of the following methods is used to find the best-fit line for data in linear regression?
- Least Squares Error
- Maximum Likelihood
- Logarithmic Loss
- Both A and B
Question 4: Which of the following assumptions in regression modelling impacts the trade-off between under-fitting and over-fitting the most?
- The polynomial degree
- Whether we learn the weights by matrix inversion or gradient descent
- The use of a constant-term
- None of the above
Question 5: Which one of the following statements is true regarding residuals in regression analysis?
- Mean of residuals is always zero
- Mean of residuals is always less than zero
- Mean of residuals is always greater than zero
- There is no such rule for residuals.
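As an illustration of how residuals behave, the sketch below fits a straight line by ordinary least squares with an intercept and then averages the residuals. The helper name `fit_line` and the sample data are made up for this example, and the behaviour shown relies on the fit including an intercept term.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)

# With an intercept in the model, the mean residual is zero
# up to floating-point rounding.
print(mean_residual)
```

If the intercept is dropped and the line is forced through the origin, the residuals are no longer guaranteed to average to zero.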
Question 6: Which one of the following is true about heteroskedasticity?
- Linear Regression with varying error terms
- Linear Regression with constant error terms
- Linear Regression with zero error terms
- None of these
Question 7: To test for a linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?
- Scatter plot
- Bar chart
- Histograms
- None of these
Question 8: Which of the following is true about “Ridge” and “Lasso” regression methods with respect to feature selection?
- Ridge regression uses subset selection of features
- Lasso regression uses subset selection of features
- Both use subset selection of features
- None of the above
Question 9: Which of the following options is true regarding “Regression” and “Correlation”? Note: y is the dependent variable and x is an independent variable.
- The relationship is symmetric between x and y in both.
- The relationship is not symmetric between x and y in both.
- The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.
- The relationship is symmetric between x and y in case of correlation but in case of regression it is not symmetric.
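To make the comparison between correlation and regression concrete, the sketch below computes the Pearson correlation and the least-squares slope in both directions on a small made-up data set. The function names `pearson` and `ols_slope` are hypothetical helpers written for this example, not part of any library.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def ols_slope(predictor, response):
    """Least-squares slope when regressing `response` on `predictor`."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_r = sum(response) / n
    spr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    spp = sum((p - mean_p) ** 2 for p in predictor)
    return spr / spp


# Made-up sample data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r_xy = pearson(xs, ys)            # correlation of x with y
r_yx = pearson(ys, xs)            # correlation of y with x
slope_y_on_x = ols_slope(xs, ys)  # regress y on x
slope_x_on_y = ols_slope(ys, xs)  # regress x on y

print(r_xy, r_yx)                  # the two correlations agree
print(slope_y_on_x, slope_x_on_y)  # the two slopes differ (0.6 vs 1.0 here)
```

Swapping the roles of x and y leaves the correlation unchanged but gives a different fitted slope, because regression minimizes errors in the response variable only.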
Question 10: Which of the following methods does not have a closed form solution for its coefficients?
- Ridge regression
- Lasso
- Both Ridge and Lasso
- Neither Ridge nor Lasso
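For background on how the two penalized methods are typically solved, the sketch below contrasts a direct formula for a one-feature ridge fit with an iterative coordinate-descent loop for lasso. All names (`ridge_1d`, `soft_threshold`, `lasso_cd`), the objective scalings, and the data are assumptions chosen for this example; no intercept is fitted, and every feature column is assumed to be non-zero.

```python
def ridge_1d(xs, ys, lam):
    """Ridge weight for one feature, no intercept.

    Minimizes sum((y - w * x)^2) + lam * w^2, which has the direct
    solution w = sum(x * y) / (sum(x^2) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`, clipping at zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(rows, ys, lam, n_iter=100):
    """Coordinate descent for 0.5 * sum((y - x.w)^2) + lam * sum(|w_j|)."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    for _ in range(n_iter):
        for j in range(n_features):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for row, y in zip(rows, ys):
                partial = sum(w[k] * row[k] for k in range(n_features) if k != j)
                rho += row[j] * (y - partial)
            z = sum(row[j] ** 2 for row in rows)
            w[j] = soft_threshold(rho, lam) / z
    return w


# Made-up sample data for illustration only.
ridge_w = ridge_1d([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], lam=14.0)

rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [1.0, 2.0, 3.0]
lasso_unpenalized = lasso_cd(rows, targets, lam=0.0)
lasso_heavy = lasso_cd(rows, targets, lam=1e6)

print(ridge_w)
print(lasso_unpenalized)
print(lasso_heavy)
```

The absolute-value penalty is not differentiable at zero, which is why the lasso loop updates one weight at a time with a soft-threshold step; a large enough penalty sets every weight exactly to zero.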
Question 11: Which of the following steps/assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?
- The polynomial degree
- Whether we learn the weights by matrix inversion or gradient descent
- The use of a constant-term
- None of the above
Question 12: Let’s say a “Linear regression” model perfectly fits the training data (train error is zero). Which of the following statements is true?
- You will always have zero test error
- You cannot have zero test error
- None of the above
- Both A and B
Question 13: Which of the following indicates a fairly strong relationship between X and Y?
- Correlation coefficient = 0.9
- The p-value for the null hypothesis Beta coefficient = 0 is 0.0001
- The t-statistic for the null hypothesis Beta coefficient = 0 is 30
- None of these
Question 14: Which of the following algorithms is not an example of an ensemble learning algorithm?
- Random Forest
- Extra Trees
- Gradient Boosting
- Decision Trees
Question 15: Which of the following is/are true while applying bagging to regression trees? 1. We build N regression trees with N bootstrap samples. 2. We take the average of the N regression trees. 3. Each tree has high variance with low bias.
- 1 and 2
- 2 and 3
- 1 and 3
- 1,2 and 3
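The mechanics of bagging can be sketched in a few lines: draw bootstrap samples, fit one regression tree per sample, and average the trees' predictions. For brevity this example uses depth-1 trees (stumps), whereas bagging is usually applied to deeper trees. All function names and the data are hypothetical and written only for this illustration.

```python
import random


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold with a mean on each side."""
    overall_mean = sum(ys) / len(ys)
    best = (float("inf"), None, overall_mean, overall_mean)
    # Try every distinct x value except the largest as a split point,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    if threshold is None:  # the sample had a single distinct x value
        return left_mean
    return left_mean if x <= threshold else right_mean


def bagged_stumps(xs, ys, n_models, seed=0):
    """Fit one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Average the predictions of all fitted stumps."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Made-up step-shaped data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]

models = bagged_stumps(xs, ys, n_models=25)
low = predict_bagged(models, 1.0)
high = predict_bagged(models, 8.0)
print(low, high)
```

Averaging over many bootstrap fits is intended to reduce the variance of the individual trees while leaving their bias largely unchanged.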
Question 16: How do you select the best hyperparameters in tree-based models?
- Measure performance over training data
- Measure performance over validation data
- Both of these
- None of these
Question 17: What are tree-based classifiers?
- Classifiers which form a tree with each attribute at one level.
- Classifiers which perform a series of condition checks with one attribute at a time.
- Both the options given above.
- None of the above
Question 18: How will you counter over-fitting in a decision tree?
- By pruning the longer rules
- By creating new rules
- Both ‘By pruning the longer rules’ and ‘By creating new rules’
- None of the options
Question 19: Which of the following sentence(s) is/are correct?
- In pre-pruning a tree is ‘pruned’ by halting its construction early.
- A pruning set of class-labeled tuples is used to estimate cost complexity.
- The best pruned tree is the one that minimizes the number of encoding bits.
- All of the above
Question 20: Which one of these is not a tree-based learner?
- CART
- ID3
- Bayesian Classifier
- Random Forest