Predictive Modeling Fundamentals I Cognitive Class Certification Answers
Module 1: Introduction to Data Mining Quiz Answers – Cognitive Class
Question 1: Which of the following applications would require the use of data mining? Select all that apply.
- Predicting the outcome of flipping a fair coin
- Determining which products in a store are likely to be purchased together
- Predicting future stock prices using historical records
- Determining the total number of products sold by a store
- Sorting a student database by gender
Question 2: Which of the following is NOT a section of the Modeler Interface?
- Nodes
- Palettes
- Stream Canvas
- Stream, Outputs, and Model Manager
- All of the above are sections of the Modeler Interface
Question 3: Which of the following is NOT a part of the Cross-Industry Standard Process for Data Mining (CRISP-DM)?
- Business Understanding
- Evaluation
- Data Preparation
- Data Storage
- Modeling
Module 2: The Data Mining Process Quiz Answers – Cognitive Class
Question 1: Which phase of the data mining process focuses on understanding the project requirements and objectives?
- Data Understanding
- Data Exploration
- Data Preprocessing
- Business Understanding
- Data Preparation
Question 2: Which Data Preprocessing task focuses on removing outliers and filling in missing values?
- Data Integration
- Data Cleaning
- Data Transformation
- Data Reduction
- None of the above
Question 3: The IBM SPSS Modeler supports which data type?
- Nominal
- Categorical
- Ordinal
- Continuous
- All of the above
Module 3: Modeling Techniques Quiz Answers – Cognitive Class
Question 1: Which of the following methods are commonly used for supervised learning tasks? Select all that apply.
- Neural Networks
- Decision Trees
- K-Means
- CARMA
- Regression
Question 2: Classification is a subset of supervised learning that focuses on modeling continuous variables. True or false?
- True
- False
Question 3: Which of the following algorithms is NOT supported by the SPSS Modeler?
- Logistic Regression
- CARMA
- K-Means
- Apriori
- All of the above algorithms are supported
Module 4: Model Evaluation Quiz Answers – Cognitive Class
Question 1: What is the term for a negative data point that is incorrectly classified as positive?
- True Negative
- False Positive
- True Positive
- False Negative
- None of the above
Question 2: Which of the following is NOT a cost-sensitive performance metric?
- Precision
- Accuracy
- Specificity
- Sensitivity
- All of the above metrics are cost-sensitive
Question 3: What is the formula for the precision metric?
- (False Positive) / (True Negative + True Positive)
- (True Positive) / (True Positive + False Positive)
- (False Positive) / (True Positive + False Positive)
- (True Positive) / (True Positive + False Negative)
- (True Negative) / (True Negative + False Positive)
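The confusion-matrix terms used in this module can be made concrete with a short Python sketch. It uses the standard definitions of accuracy, precision, sensitivity, and specificity. The function names `confusion_counts` and `metrics` and the label lists are hypothetical, made up for illustration, and are not part of the course or of SPSS Modeler.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP, FN for a binary classification task."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # positive point correctly classified as positive
        elif p == positive and a != positive:
            fp += 1  # negative point incorrectly classified as positive
        elif p != positive and a == positive:
            fn += 1  # positive point incorrectly classified as negative
        else:
            tn += 1  # negative point correctly classified as negative
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Return common confusion-matrix metrics (assumes non-zero denominators)."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels, made up for illustration only.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
scores = metrics(tp, tn, fp, fn)
print(tp, tn, fp, fn)
print(scores)
```

The sketch does not guard against empty classes; a model that never predicts the positive class would make the precision denominator zero.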
Module 5: Deployment on IBM Bluemix Quiz Answers – Cognitive Class
Question 1: In general, the testing dataset should be significantly larger than the training dataset. True or false?
- True
- False
Question 2: Which of the following is NOT a model deployment solution?
- Bluemix
- CRISP-DM
- IBM Collaboration and Deployment Services
- SPSS Solution Publisher
- All of the above are model deployment solutions
Question 3: Which of the following statements are true of IBM Bluemix? Select all that apply.
- Bluemix generally takes about a week to deploy an app
- Bluemix is supported by a growing community
- Bluemix is closed-source
- Bluemix provides a self-service application-hosting environment
- Bluemix provides built-in load-balancing capabilities
Predictive Modeling Fundamentals I Final Exam Answers – Cognitive Class
Question 1: Which of the following suggests that the model is overfitting the data?
- High accuracy on training data and high accuracy on testing data
- Low accuracy on training data and high accuracy on testing data
- Low accuracy on training data and low accuracy on testing data
- High accuracy on training data and low accuracy on testing data
- None of the above
Question 2: Which of the following tasks would require the use of data mining?
- Predicting the outcome of rolling two fair dice
- Determining which products in a store are likely to be purchased together
- Sorting a customer database by age
- Computing the number of products sold over a given time period
- All of the above
Question 3: Suppose you have collected data on your customers and you wish to determine the demographics they fall into. Which technique is best suited for this task?
- Neural Network
- Logistic Regression
- Clustering
- Linear Regression
- Decision Tree
Question 4: Suppose you wish to use data mining in order to determine which customers are most likely to sign up for a new service. Which technique is best suited for this task?
- Apriori
- Decision Tree
- Sequence
- K-means
- CARMA
Question 5: Which SPSS Modeler node can be used to determine a model’s performance? Select all that apply.
- Evaluation Node
- Analysis Node
- Table Node
- Auto Classifier Node
- Sequence Node
Question 6: Which of the following is NOT a classification or prediction algorithm in SPSS Modeler?
- Linear Regression
- Neural Network
- Logistic Regression
- Discriminant
- Apriori
Question 7: Which SPSS Modeler node is used to specify whether a given field is an input or a target?
- Auto Classifier Node
- Data Audit Node
- Table Node
- Type Node
- Analysis Node
Question 8: Which SPSS Modeler node is useful for exploratory analysis on a data set?
- Analysis Node
- Auto Classifier Node
- Table Node
- Data Audit Node
- Evaluation Node
Question 9: Which SPSS Modeler node is used to both rename fields and exclude fields from the model?
- Restructure Node
- Filter Node
- Partition Node
- Data Audit Node
- Evaluation Node
Question 10: What is the formula for the accuracy metric? TP = true positive, TN = true negative, FP = false positive, and FN = false negative.
- TN / (TN + FP)
- TP / (TP + FN)
- TP / (TP + FP)
- (FP + FN) / (TP + TN + FP + FN)
- (TP + TN) / (TP + TN + FP + FN)
Question 11: Which major data preprocessing step focuses on feature selection and feature extraction?
- Data Integration
- Data Cleaning
- Data Reduction
- Data Audit
- Data Transformation
Question 12: Which SPSS Modeler node is used to identify missing data and screen out potentially problematic fields?
- Auto Data Preparation Node
- Auto Classifier Node
- Restructure Node
- Evaluation Node
- Data Audit Node
Question 13: SPSS Modeler provides automated tools that determine the best algorithm to use for an application. True or false?
- True
- False
Question 14: Which SPSS Modeler node is used for sampling the data set?
- Type Node
- Partition Node
- Data Audit Node
- Filter Node
- Restructure Node
Question 15: Which phase of the data mining process focuses on gathering insights about the data set?
- Data Integration
- Data Understanding
- Business Understanding
- Data Preparation
- Data Preprocessing
Introduction to Predictive Modeling Fundamentals I
Predictive modeling uses historical data to predict future outcomes. It applies across industries and domains, from forecasting sales and predicting customer churn to diagnosing diseases. Here’s a primer on the fundamental steps:
- Objective Definition: The first step in predictive modeling is to clearly define the objective. What are you trying to predict? This could be a binary outcome (e.g., whether a customer will buy a product) or a continuous variable (e.g., the price of a house).
- Data Collection and Preprocessing: Gathering relevant data is crucial. This often involves collecting data from various sources such as databases, APIs, or files. Data preprocessing is the step where you clean, transform, and prepare the data for modeling. This may include handling missing values, encoding categorical variables, and scaling numerical features.
- Feature Selection and Engineering: Features are the variables used to make predictions. Feature selection involves choosing the most relevant features that contribute to the predictive power of the model. Feature engineering involves creating new features or transforming existing ones to improve model performance.
- Model Selection: There are various algorithms available for predictive modeling, each with its strengths and weaknesses. Common algorithms include linear regression, decision trees, random forests, support vector machines, and neural networks. The choice of algorithm depends on the nature of the problem, the size and complexity of the data, and computational resources.
- Model Training: Once the data is prepared and the model is selected, it’s time to train the model on the historical data. During training, the model learns the patterns and relationships between the features and the target variable.
- Model Evaluation: After training, the model’s performance needs to be evaluated on data that was not used for training, using metrics appropriate for the specific problem. Common evaluation metrics for classification include accuracy, precision, recall, F1 score, and area under the ROC curve (AUC).
- Hyperparameter Tuning: Many machine learning algorithms have hyperparameters, settings chosen before training rather than learned from the data, that need to be tuned to optimize model performance. Hyperparameter tuning involves finding the best combination of these settings through techniques like grid search, random search, or Bayesian optimization.
- Model Deployment: Once a satisfactory model is trained and evaluated, it can be deployed to make predictions on new, unseen data. The model can be exposed as a web service or API, or integrated directly into existing systems.
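The preprocessing, training, tuning, and evaluation steps above can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not the course's method: the toy data is invented, k-nearest neighbours is chosen only because it is short to write, and the names `clean`, `predict`, and `accuracy` are hypothetical helpers.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: (store_visits, bought_product); None marks a missing value.
raw = [(1, 0), (2, 0), (3, 0), (None, 0), (2, 0), (4, 0),
       (7, 1), (8, 1), (9, 1), (8, 1), (10, 1), (9, 1)]

# Split by position into training, validation, and test sets. The split is fixed
# so the run is reproducible; real projects usually shuffle the data first.
train_raw, valid_raw, test_raw = raw[0::3], raw[1::3], raw[2::3]

# Preprocessing: fill missing values with the mean of the observed training values.
fill = mean(x for x, _ in train_raw if x is not None)


def clean(rows):
    """Replace missing feature values with the training mean."""
    return [(fill if x is None else x, y) for x, y in rows]


train, valid, test = clean(train_raw), clean(valid_raw), clean(test_raw)


def predict(x, k):
    """k-nearest neighbours: majority label among the k closest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(rows, k):
    """Fraction of rows whose predicted label matches the actual label."""
    return sum(predict(x, k) == y for x, y in rows) / len(rows)


# Hyperparameter tuning: a small grid search over k, scored on the validation set.
candidates = (1, 3)
best_k = max(candidates, key=lambda k: accuracy(valid, k))

# Evaluation: accuracy on the held-out test set, which was not used above.
test_accuracy = accuracy(test, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

Computing the fill value from the training rows only, and keeping the test set out of the grid search, prevents information from the evaluation data leaking into the model. The two classes in this toy data are cleanly separated, so the score it produces says nothing about real-world performance.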