Wednesday, November 6, 2024

Machine Learning with R Cognitive Class Exam Quiz Answers

Machine Learning with R Cognitive Class Certification Answers

Question 1: Machine Learning was developed shortly after statistical modelling (within the same century), and therefore adopted many of its practices.

  • True
  • False

Question 2: Supervised learning deals with unlabelled data, while unsupervised learning deals with labelled data.

  • True
  • False

Question 3: Machine Learning is applied in current technologies, such as:

  • Trend Prediction (ex. House Price Trends)
  • Gesture Recognition (ex. Xbox Kinect)
  • Facial Recognition (ex. Snapchat)
  • A and B, but not C
  • All of the above

Question 1: In K-Nearest Neighbors, which of the following is true:

  • A very high value of K (ex. K = 100) produces a model that is better than a very low value of K (ex. K = 1)
  • A very high value of K (ex. K = 100) produces an overly generalised model, while a very low value of K (ex. K = 1) produces a highly complex model.
  • A very low value of K (ex. K = 1) produces an overly generalised model, while a very high value of K (ex. K = 100) produces a highly complex model.
  • All of the Above
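
As a rough illustration of how the choice of K changes a K-Nearest Neighbors prediction, here is a minimal sketch. It is written in Python rather than R for brevity, it is not taken from the course, and the toy data and the name knn_predict are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query` (1-D feature)."""
    neighbours = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (feature, label). One "b" point sits inside the "a" region.
train = [(1.0, "a"), (2.0, "a"), (2.5, "b"), (3.0, "a"), (4.0, "a"),
         (8.0, "b"), (9.0, "b")]

pred_k1 = knn_predict(train, 2.6, k=1)   # "b": follows the single odd point
pred_k3 = knn_predict(train, 2.6, k=3)   # "a": the odd point is outvoted
# With K equal to the dataset size, every query gets the overall majority class.
pred_all = [knn_predict(train, q, k=len(train)) for q in (1.0, 8.5)]  # ["a", "a"]
```

With K = 1 the prediction tracks individual training points, while with K as large as the dataset the model ignores the query altogether and always returns the majority class.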

Question 2: A difficulty that arises from trying to classify out-of-sample data is that the actual classification may not be known, therefore making it hard to produce an accurate result.

  • True
  • False

Question 3: When building a decision tree, we want to split the nodes in a way that decreases entropy and increases information gain.

  • True
  • False
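
The relationship between entropy and information gain can be sketched as follows. This is an illustrative Python example, not course material; it assumes Shannon entropy in bits, and the labels and function names are made up for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
pure_split = [["yes", "yes"], ["no", "no"]]      # each child holds one class only
useless_split = [["yes", "no"], ["yes", "no"]]   # children are as mixed as the parent

gain_pure = information_gain(parent, pure_split)        # 1 bit gained
gain_useless = information_gain(parent, useless_split)  # nothing gained
```

The split that leaves the children with lower entropy is the one with the higher information gain.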

Question 1: Which of the following is generally true about the evaluation models: Train and Test on the Same Dataset and Train/Test Split.

  • Train and Test on the Same Dataset has a high training accuracy and high out-of-sample accuracy, while Train/Test Split has a low training accuracy and low out-of-sample accuracy.
  • Train and Test on the Same Dataset has a low training accuracy and high out-of-sample accuracy, while Train/Test Split has a high training accuracy and low out-of-sample accuracy.
  • Train and Test on the Same Dataset has a high training accuracy and low out-of-sample accuracy, while Train/Test Split has a low training accuracy and high out-of-sample accuracy.
  • Train and Test on the Same Dataset has a low training accuracy and low out-of-sample accuracy, while Train/Test Split has a high training accuracy and high out-of-sample accuracy.
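
To see why accuracy measured on the training data can be misleading, consider a minimal Python sketch with a deliberately over-complex model that simply memorises its training rows. The data are hypothetical (labels are pure noise), and the seed and the 75/25 split ratio are arbitrary assumptions.

```python
import random

def fit_memoriser(rows):
    """Lookup table of the training rows, with a majority-class fallback for unseen inputs."""
    table = {x: y for x, y in rows}
    labels = [y for _, y in rows]
    fallback = max(set(labels), key=labels.count)
    return lambda x: table.get(x, fallback)

def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(x) == y for x, y in rows) / len(rows)

rng = random.Random(0)                                  # fixed seed, arbitrary choice
data = [(i, rng.choice([0, 1])) for i in range(200)]    # labels carry no signal

# Evaluation 1: train and test on the same dataset.
same_data_accuracy = accuracy(fit_memoriser(data), data)   # 1.0 by construction

# Evaluation 2: train/test split.
shuffled = data[:]
rng.shuffle(shuffled)
train, test = shuffled[:150], shuffled[150:]
model = fit_memoriser(train)
holdout_accuracy = accuracy(model, test)   # estimate of out-of-sample accuracy
```

Because the labels are random, the perfect score on the training data says nothing about new data; the held-out score is the more honest estimate and should typically land near chance level.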

Question 2: Which of the following is true about bias and variance?

  • Having a high bias underfits the data and produces a model that is overly complex, while having high variance overfits the data and produces a model that is overly generalized.
  • Having a high bias underfits the data and produces a model that is overly generalized, while having high variance overfits the data and produces a model that is overly complex.
  • Having a high bias overfits the data and produces a model that is overly complex, while having high variance underfits the data and produces a model that is overly generalized.
  • Having a high bias overfits the data and produces a model that is overly generalized, while having high variance underfits the data and produces a model that is overly complex.

Question 3: Root Mean Squared Error is the most popular evaluation metric out of the three discussed, because it produces the same units as the response vector, making it easy to relate information.

  • True
  • False
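
The question does not name the three metrics; assuming they are Mean Absolute Error, Mean Squared Error and Root Mean Squared Error, a small Python sketch with made-up numbers shows how they relate.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors, in the units of the response."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean Squared Error: average squared error, in squared units of the response."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, back in the units of the response."""
    return math.sqrt(mse(actual, predicted))

actual = [10.0, 20.0]      # hypothetical observed values
predicted = [13.0, 16.0]   # hypothetical predictions (errors of -3 and 4)

print(mae(actual, predicted))    # 3.5
print(mse(actual, predicted))    # 12.5
print(rmse(actual, predicted))   # about 3.54
```

Taking the square root is what returns RMSE to the same units as the response, which is the property the question refers to.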

Question 1: What are some disadvantages that K-means clustering presents?

  • Updating can occur even though there is a possibility of a centroid not having data points in its group
  • K-means clustering is generally slower, compared to many other clustering algorithms
  • There is high bias in the models, due to where the centroids are initiated
  • None of the above
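
The sensitivity of K-means to where the centroids start can be seen in a small one-dimensional sketch. This is an illustrative Python example with hypothetical data; kmeans_1d is a simplified version of the standard assign-and-update loop, and keeping an empty cluster's centroid unchanged is one possible convention, not the only one.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain K-means on 1-D data: assign each point to its nearest centroid, then re-average."""
    centroids = [float(c) for c in centroids]
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid (assumed convention).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:   # no centroid moved: converged
            break
        centroids = updated
    return centroids

# Three obvious groups around 2, 12 and 22.
points = [1, 2, 3, 11, 12, 13, 21, 22, 23]

good_start = kmeans_1d(points, [1, 11, 21])   # [2.0, 12.0, 22.0]
poor_start = kmeans_1d(points, [1, 2, 3])     # [1.0, 2.5, 17.0]
```

Both runs converge, but the second one, started with all centroids inside the first group, settles on a clearly worse partition. Running the algorithm from several starting points and keeping the best result is a common way to reduce this effect.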

Question 2: Decision Trees tend to have high bias and low variance, which Random Forests fix.

  • True
  • False

Question 3: A Dendrogram can only be read for Agglomerative Hierarchical Clustering, not Divisive Hierarchical Clustering.

  • True
  • False

Question 1: Filters produce a feature set that does not contain assumptions based on the predictive model, making it a useful tool to expose relationships between features.

  • True
  • False

Question 2: Principal Component Analysis retains all information during the projection process of higher order features to lower orders.

  • True
  • False
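
How much information a projection keeps can be quantified as the share of total variance captured by the retained components. The following Python sketch handles only the two-dimensional case, where the largest eigenvalue of the covariance matrix has a closed form; the function name and the data are hypothetical.

```python
import math

def explained_variance_ratio_2d(points):
    """Share of total variance kept by the first principal component of 2-D data."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Largest eigenvalue of the covariance matrix [[var_x, cov_xy], [cov_xy, var_y]].
    half_trace = (var_x + var_y) / 2
    largest = half_trace + math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    return largest / (var_x + var_y)

on_a_line = [(0, 0), (1, 1), (2, 2)]        # perfectly correlated features
square = [(0, 0), (1, 0), (0, 1), (1, 1)]   # uncorrelated features

ratio_line = explained_variance_ratio_2d(on_a_line)   # 1.0: nothing is lost
ratio_square = explained_variance_ratio_2d(square)    # 0.5: half the variance is lost
```

Only when the discarded components carry no variance, as in the first dataset, does projecting onto fewer dimensions keep everything.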

Question 3: Which of the following is not a challenge to a recommendation system that uses collaborative filtering?

  • Diversity Sheep
  • Shilling Attacks
  • Scalability
  • Synonyms

Question 1: Randomness is important in Random Forests because it allows us to have distinct, different trees that are based off of different data.

  • True
  • False

Question 2: When building a decision tree, we want to split the nodes in a way that increases entropy and decreases information gain.

  • True
  • False

Question 3: Which of the following is true?

  • A high value of K in KNN creates a model with low bias and high variance
  • An observation must contain values for all features
  • A categorical value cannot be numeric
  • None of the above

Question 4: In terms of Bias and Variance, Variance is the inconsistency of a model due to small changes in the dataset.

  • True
  • False

Question 5: Which is the definition of entropy?

  • The purity of each node in a random forest.
  • Information collected that can increase the level of certainty in a particular prediction.
  • The information that is used to randomly select a subset of data.
  • The amount of information disorder in the data.

Question 6: Which of the following is true about hierarchical linkages?

  • Average linkage is the average distance of each point in one cluster to every point in another cluster
  • Complete linkage is the shortest distance between a point in two clusters
  • Centroid linkage is the distance between two randomly generated centroids in two clusters
  • Single linkage is the distance between any points in two clusters
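
For reference, the usual definitions of the four linkages can be written out directly. This is a minimal Python sketch using Euclidean distance on two hypothetical clusters; it needs Python 3.8 or later for math.dist.

```python
import math
from itertools import product

def pairwise(a, b):
    """All distances between a point in cluster a and a point in cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]

def single_linkage(a, b):
    return min(pairwise(a, b))    # closest pair of points

def complete_linkage(a, b):
    return max(pairwise(a, b))    # farthest pair of points

def average_linkage(a, b):
    distances = pairwise(a, b)
    return sum(distances) / len(distances)   # mean over every cross-cluster pair

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def centroid_linkage(a, b):
    return math.dist(centroid(a), centroid(b))   # distance between the cluster means

cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))     # 3.0
print(complete_linkage(cluster_a, cluster_b))   # sqrt(17), about 4.12
print(average_linkage(cluster_a, cluster_b))    # about 3.57
print(centroid_linkage(cluster_a, cluster_b))   # sqrt(12.5), about 3.54
```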

Question 7: In terms of Bias and Variance, Variance is the inconsistency of a model due to small changes in the dataset.

  • True
  • False

Question 8: Which is true about bootstrapping?

  • All data points must be used when bootstrapping is applied
  • The data points are randomly selected with replacement
  • The data points are randomly selected without replacement
  • It is the same as bagging
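
The difference between sampling with and without replacement is easy to show with the standard library. This Python sketch uses hypothetical data and an arbitrary seed.

```python
import random

rng = random.Random(42)   # fixed seed, arbitrary choice
data = list(range(10))

# Bootstrap sample: same size as the data, drawn WITH replacement,
# so some points can repeat and others can be left out entirely.
bootstrap_sample = rng.choices(data, k=len(data))
out_of_bag = sorted(set(data) - set(bootstrap_sample))   # points never drawn

# For contrast: a subsample drawn WITHOUT replacement never repeats a point.
subsample = rng.sample(data, k=5)
```

Bagging builds on this idea by fitting one model per bootstrap sample and aggregating their predictions, so the two terms are related but not interchangeable.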

Question 9: Machine Learning is still in early development and does not have much of an impact on the current society.

  • True
  • False

Question 10: In comparison to supervised learning, unsupervised learning has:

  • Fewer tests
  • More models
  • A better controlled environment
  • More tests, but fewer models

Question 11: Outliers are points that are classified by Density-Based Clustering that do not belong to any cluster.

  • True
  • False

Question 12: Which of the following is false about Linear Regression?

  • It does not require tuning parameters
  • It is highly interpretable
  • It is fast
  • It has a low variability on predictive accuracy

Question 13: Machine Learning uses algorithms that can learn from data without relying on standard programming practices.

  • True
  • False

Question 14: Which of the following are types of supervised learning?

  • Clustering
  • Regression
  • Classification
  • Both A and B

Question 15: A Bottom-Up version of hierarchical clustering is known as Divisive clustering. It is a more popular method than the agglomerative method.

  • True
  • False

Question 16: Which is NOT a specific outcome of how Dimensionality Performance improves production?

  • Highlights the main linear technique called Principal Component Analysis.
  • Creates step-wise regression.
  • Reduces number of features to be considered.
  • Highlights relevant variables only and omits irrelevant ones.

Question 17: Feature Selection is the process of selecting the variables that will be projected from a high-order dimension to a lower one.

  • True
  • False

Question 18: Hierarchical Clustering is one of the three main algorithms for clustering along with K-Means and Density Based Clustering.

  • True
  • False

Question 19: Which one is NOT a feature of Dimensionality Reduction?

  • It can be divided into two subcategories called Feature Selection and Feature Extraction
  • Removal of an “outsider” from the least cohesive cluster.
  • Feature Selection includes Wrappers, Filters, and Embedded.
  • Feature Extraction includes Principal Component Analysis.
  • It reduces the number of variables/features in review.

Question 20: Low bias tends to create overly generalized models, which can cause a loss of relevant relations between the features and target output. When a model has low bias, we say that it “underfits” the data.

  • True
  • False

Introduction to Machine Learning with R

“Introduction to Machine Learning with R” is a guide to machine learning using the R programming language. Here’s a breakdown of what such a book might cover:

  1. Basics of R: It would start with an introduction to R programming for those who are new to the language. This would cover basic syntax, data types, data structures, and functions in R.
  2. Introduction to Machine Learning: The book would provide a clear explanation of what machine learning is, its various types (supervised, unsupervised, reinforcement learning), and its applications in real-world scenarios.
  3. Data Preprocessing: Before diving into building machine learning models, it’s crucial to understand how to preprocess data. This includes tasks such as handling missing values, dealing with categorical variables, feature scaling, and data transformation.
  4. Supervised Learning Algorithms: The book would cover a range of supervised learning algorithms such as linear regression, logistic regression, decision trees, random forests, support vector machines (SVM), and neural networks. Each algorithm would be explained in detail along with practical examples using R.
  5. Unsupervised Learning Algorithms: Unsupervised learning techniques like clustering (K-means, hierarchical clustering) and dimensionality reduction (principal component analysis – PCA, t-distributed stochastic neighbor embedding – t-SNE) would be explored, again with practical examples in R.
  6. Model Evaluation and Selection: Understanding how to evaluate machine learning models is crucial. The book would cover techniques for evaluating model performance, including metrics like accuracy, precision, recall, F1-score, ROC curves, and AUC-ROC.
  7. Model Tuning and Optimization: To improve model performance, techniques for hyperparameter tuning and optimization would be discussed. This involves methods like cross-validation, grid search, and random search.
  8. Feature Selection and Engineering: Exploring methods for selecting relevant features and creating new features to improve model performance would be covered. Techniques might include recursive feature elimination, feature importance, and domain-specific feature engineering.
  9. Deployment and Productionization: Once a model is trained, it needs to be deployed into production. This section would discuss strategies for deploying machine learning models, including using R packages like plumber for creating APIs.
  10. Real-world Projects and Case Studies: The book might include real-world projects and case studies to illustrate how machine learning techniques can be applied to solve practical problems across various domains like healthcare, finance, marketing, and more.
  11. Ethical Considerations: As with any technology, it’s important to consider the ethical implications of machine learning. This section would explore topics like bias in algorithms, fairness, privacy, and transparency.
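
As an example of the evaluation metrics mentioned in item 6, accuracy, precision, recall and F1-score can all be derived from counts of correct and incorrect predictions. The sketch below is in Python rather than R, purely for illustration; the function name and the labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)   # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)   # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

y_true = [1, 1, 1, 0, 0]   # hypothetical actual labels
y_pred = [1, 1, 0, 1, 0]   # hypothetical predictions
metrics = classification_metrics(y_true, y_pred)
# accuracy is 3/5; precision, recall and F1 are all 2/3 for this data
```

Returning 0.0 when a denominator is zero is a convention chosen here for simplicity; some libraries warn or return an undefined value instead.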

Overall, “Introduction to Machine Learning with R” would serve beginners learning the fundamentals of machine learning in R, as well as practitioners seeking to deepen their understanding and expand their skill set.
