Data Science Methodology Cognitive Class Certification Answers
Module 1 – From Problem to Approach Quiz Answers – Cognitive Class
Question 1: Select the correct statement.
- A methodology is an application for a computer program.
- A methodology is a set of instructions.
- A methodology is a system of methods used in a particular area of study or activity.
- All of the above statements are correct.
Question 2: Select the correct statement.
- The data science methodology described in this course is only used by certified data scientists.
- The data science methodology described in this course is outlined by John Rollins from IBM.
- The data science methodology described in this course is limited to IBM.
- None of the above statements are correct.
Question 3: Select the correct statement.
- The first stage of the data science methodology is data understanding.
- The first stage of the data science methodology is modeling.
- The first stage of the data science methodology is business understanding.
- The first stage of the data science methodology is data collection.
Module 2 – From Requirements to Collection Quiz Answers – Cognitive Class
Question 1: Select the correct statement.
- If a problem is a dish, then data is an answer.
- If a problem is a dish, then data is an ingredient.
- If a problem is a dish, then data is a list of information.
- None of the above statements are correct.
Question 2: Select the correct statement.
- A data requirement is never refined.
- A data requirement is set in stone.
- A data requirement is the initial set of ingredients.
- None of the above statements are correct.
Question 3: Select the correct statement.
- Data scientists determine how to prepare the data.
- Data scientists identify the data that is required for data modeling.
- Data scientists determine how to collect the data.
- All of the above.
Module 3 – From Understanding to Preparation Quiz Answers – Cognitive Class
Question 1: Select the correct statement about data preparation.
- Data preparation involves properly formatting the data.
- Data preparation involves correcting invalid values and addressing outliers.
- Data preparation involves removing duplicate data.
- Data preparation involves addressing missing values.
- All of the above statements are correct.
Question 2: Select the correct statement about data understanding.
- Data understanding encompasses removing redundant data.
- Data understanding encompasses all activities related to constructing the dataset.
- Data understanding encompasses sorting the data.
- All of the above statements about data understanding are correct.
Question 3: Select the correct statement about what data scientists and database administrators (DBAs) do during data preparation.
- During data preparation, data scientists and DBAs identify missing data.
- During data preparation, data scientists and DBAs determine the timing of events.
- During data preparation, data scientists and DBAs aggregate the data and merge them from different sources.
- During data preparation, data scientists and DBAs define the variables to be used in the model.
- All of the above statements are correct.
Module 4 – From Modeling to Evaluation Quiz Answers – Cognitive Class
Question 1: Select the correct statement.
- A training set is used for data visualization.
- A training set is used for predictive modeling.
- A training set is used for statistical analysis.
- A training set is used for descriptive modeling.
- None of the above statements are correct.
Question 2: A statistician calls a false negative a type I error and a false positive a type II error.
- True
- False
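For reference when weighing this statement: in the standard statistical convention, a false positive is a type I error and a false negative is a type II error. A minimal Python sketch, using made-up labels and a hypothetical helper named `confusion_counts`, shows where each error type falls when predictions are compared with actual outcomes:

```python
def confusion_counts(actual, predicted):
    """Count confusion-matrix cells for binary labels (True = positive)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1
        elif p and not a:
            counts["FP"] += 1  # type I error: predicted positive, actually negative
        elif a:
            counts["FN"] += 1  # type II error: predicted negative, actually positive
        else:
            counts["TN"] += 1
    return counts


# Hypothetical labels, invented purely for illustration.
actual = [True, True, False, False, True, False]
predicted = [True, False, True, False, True, False]

counts = confusion_counts(actual, predicted)
print(counts)
```

The four counts are the cells of a confusion matrix, the diagnostic measure mentioned later in the final exam questions on model evaluation.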
Question 3: Select the correct statement about model evaluation.
- Model evaluation can include statistical significance testing.
- Model evaluation includes ensuring that the data are properly handled and interpreted.
- Model evaluation includes ensuring the model is designed as intended.
- Model evaluation includes ensuring that the model is working as intended.
- All of the above statements are correct.
Module 5 – From Deployment to Feedback Quiz Answers – Cognitive Class
Question 1: The final stages of the data science methodology are an iterative cycle between model evaluation, deployment, and feedback.
- True
- False
Question 2: What is model evaluation used for?
- Assessing the model after it is deployed.
- Assessing the model before it is deployed.
- Determining if the model is good for other uses.
- All of the above.
- None of the above.
Question 3: Select the correct statement about the feedback stage of the data science methodology.
- Feedback is essential to the long-term viability of the model.
- Feedback is not helpful and gets in the way.
- Feedback is not required once launched.
- None of the above statements are correct.
Data Science Methodology Final Exam Answers – Cognitive Class
Question 1: Select the correct sentence about the data science methodology explained in the course.
- Data science methodology is not an iterative process – one does not go back and forth between methodological steps.
- Data science methodology is a specific strategy that guides processes and activities relating to data science only for text analytics.
- Data science methodology always starts with data collection.
- Data science methodology provides the data scientist with a framework for how to proceed to obtain answers.
- Data science methodology depends on a specific set of technologies or tools.
Question 2: Business understanding is an important stage in the data science methodology. Why?
- Because it shapes the rest of the methodological steps.
- Because it clearly defines the problem and the needs from a business perspective.
- Because it ensures that the work generates the intended solution.
- Because it involves domain expertise.
- All of the above.
Question 3: A data scientist determines that building a recommender system is the solution for a particular business problem at hand. What stage of the data science methodology does this represent?
- Modeling
- Deployment
- Model evaluation
- Analytic approach
- Data understanding
Question 4: Which of the following represent the two important characteristics of the data science methodology?
- It is a highly iterative process and immediately ends when the model is deployed.
- It is not an iterative process and it never ends.
- It has no endpoint because data collection occurs before identifying the data requirements.
- It immediately ends when the model is deployed because no feedback is required.
- It is a highly iterative process and it never ends.
Question 5: What do data scientists typically use for exploratory analysis of data and to get acquainted with them?
- They use support vector machines and neural networks as feature extraction techniques.
- They begin with regression, classification, or clustering.
- They use deep learning.
- They use descriptive statistics and data visualization techniques.
- All of the above.
Question 6: Select the correct statement about data preparation.
- Data preparation cannot be accelerated through automation.
- Data preparation involves dealing with missing or improperly coded data and can include using text analysis to structure unstructured or semi-structured text data.
- Data preparation is typically the least time-consuming methodological step.
- All of the above.
- None of the above.
Question 7: Which statement best describes the modeling stage of the data science methodology?
- Modeling is followed by the analytic approach stage.
- Modeling may require testing multiple algorithms and parameters.
- Modeling is always based on predictive models.
- Modeling always uses training and test sets.
- All of the above.
Question 8: Which of the following statements best describe the model evaluation stage of the data science methodology?
- Model evaluation may entail statistical significance tests, particularly when additional proof is necessary to justify some of the emerging recommendations.
- Model evaluation is important because it examines how well the model performs in the context of the business problem.
- Model evaluation entails computing graphs and/or various diagnostic measures such as a confusion matrix.
- Model evaluation is done using a test set if the model is a predictive one.
- All of the above.
Question 9: What does deploying a model into production represent?
- It represents the end of the iterative process that includes feedback, model refinement, and redeployment.
- It represents the beginning of an iterative process that includes feedback, model refinement and redeployment and requires the input of additional groups, such as marketing personnel and business owners.
- It represents the final data science product.
- None of the above.
Question 10: A data scientist, John, was asked to help reduce readmission rates at a local hospital. After some time, John provided a model that predicted which patients were more likely to be readmitted to the hospital and declared that his work was done. Which of the following best describes this scenario?
- John only provided one model as a solution and he should have provided multiple models.
- The scenario is already optimal.
- Even though John only submitted one solution, it might be a good one. However, John needed feedback on his model from the hospital to confirm that his model was able to address the problem appropriately and sufficiently.
- John’s mistake is that he lied in the analytic approach step of the data science methodology.
- John still needed to collect more data.
Question 11: A car company asked a data scientist to determine what type of customers are more likely to purchase their vehicles. However, the data comes from several sources and is in a relatively “raw format”. What kind of processing can the data scientist perform on the data to prepare it for modeling?
- Feature engineering.
- Transforming the data into more useful variables.
- Combining the data from the various sources.
- Addressing missing/invalid values.
- All of the above.
Question 12: High-performance, massively parallel systems can be used to facilitate the following methodological steps.
- Data Preparation and Modeling.
- Modeling only.
- Deployment.
- Business Understanding.
- All of the above.
Question 13: Data scientists may use either a “top-down” approach or a “bottom-up” approach to data science. These two approaches refer to:
- “Top-down” approach – the data, when sorted, is modeled from the “top” of the data towards the “bottom”. “Bottom-up” approach – the data is modeled from the “bottom” of the data to the “top”.
- “Top-down” approach – models are fit before the data is explored. “Bottom-up” approach – data is explored, and then a model is fit.
- “Top-down” approach – first defining a business problem then analyzing the data to find a solution. “Bottom-up” approach – starting with the data, and then coming up with a business problem based on the data.
- “Top-down” approach – using massively parallel warehouses with huge data volumes as the data source. “Bottom-up” approach – using a sample of small data before using large data.
- All of the above.
Question 14: The following are all examples of rapidly evolving technologies that affect data science methodology EXCEPT for?
- Data sampling.
- Automation.
- Text analysis.
- Platform growth.
- In-database analytics.
Question 15: Data understanding involves all of the following EXCEPT for?
- Discovering initial insights about the data.
- Visualizing the data.
- Assessing data quality.
- Understanding the content of the data.
- Gathering and analyzing feedback for assessment of the model’s performance.
Question 16: For predictive models, a test set, which is similar to – but independent of – the training set, is used to determine how well the model predicts outcomes. This is an example of what step in the methodology?
- Data preparation.
- Deployment.
- Analytic approach.
- Model evaluation.
- Data requirements.
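The idea of a test set that is "similar to, but independent of" the training set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, and the helper `train_test_split` below are hypothetical and written with the standard library, not taken from the course.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction of them as a test set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records, invented for illustration only.
rows = [{"id": i, "age": 20 + i, "readmitted": i % 3 == 0} for i in range(12)]

train, test = train_test_split(rows)

# Independent: no record appears in both sets.
train_ids = {r["id"] for r in train}
test_ids = {r["id"] for r in test}
print(train_ids.isdisjoint(test_ids))

# Similar: both sets share the same variables (same structure).
print(all(r.keys() == rows[0].keys() for r in train + test))
```

Both sets come from the same source and carry the same variables; they differ only in which records they contain, so the test set can show how the model behaves on data it was not fitted to.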
Question 17: “When ______ data is available (such as customer call center logs or physicians’ notes in unstructured or semi-structured format), _______ analytics can be useful in deriving new structured variables to enrich the set of predictors and improve model accuracy.” Which of the following most appropriately fills in the blanks?
- text; text
- market; statistical
- big; digital
- highly structured; text
- text; predictive
Question 18: Typically, in a predictive model, the training set and the test set are very different and independent, such as having a different set of variables or structure.
- True
- False
Question 19: Data scientists may frequently return to a previous stage to make adjustments, as they learn more about the data and the modeling.
- True
- False
Question 20: Why should data scientists maintain continuous communication with business sponsors throughout a project?
- So that business sponsors can provide domain expertise.
- So that business sponsors can ensure the work remains on track to generate the intended solution.
- So that business sponsors can review intermediate findings.
- All of the above.
- None of the above.
Introduction to Data Science Methodology
Data Science Methodology is a structured approach to solving complex problems using data. It encompasses a set of processes, techniques, and tools that enable data scientists to extract valuable insights and make informed decisions. Here’s a breakdown of the key components of data science methodology:
- Problem Definition: The first step in any data science project is to clearly define the problem you’re trying to solve. This involves understanding the business objectives, identifying stakeholders’ requirements, and defining success criteria.
- Data Collection: Once the problem is defined, the next step is to gather relevant data from various sources. This could include structured data from databases, unstructured data from text documents or social media, or semi-structured data from APIs.
- Data Preparation: Raw data is often messy and requires cleaning and preprocessing before analysis. This step involves tasks such as handling missing values, removing duplicates, standardizing formats, and transforming data into a suitable structure for analysis.
- Exploratory Data Analysis (EDA): EDA involves exploring the dataset to understand its characteristics, identify patterns, and uncover insights. Techniques such as summary statistics, data visualization, and correlation analysis are commonly used in this phase.
- Feature Engineering: Feature engineering involves creating new features or transforming existing features to improve the performance of machine learning models. This may include techniques such as encoding categorical variables, scaling numerical features, or creating interaction terms.
- Model Development: In this phase, predictive or descriptive models are developed using machine learning algorithms. This involves selecting appropriate algorithms, training the models on the data, tuning hyperparameters, and evaluating their performance using validation techniques.
- Evaluation: Models are evaluated based on their performance metrics and compared against baseline models or business benchmarks. This step helps determine whether the model meets the requirements defined in the problem definition phase.
- Deployment: Once a satisfactory model is developed, it is deployed into production where it can be used to make predictions or drive decision-making. This involves integrating the model into existing systems, monitoring its performance, and ensuring its scalability and reliability.
- Feedback Loop: Data science projects are iterative processes, and it’s essential to continuously monitor and refine models over time. Feedback from stakeholders, as well as changes in the underlying data or business environment, may necessitate updates to the model or methodology.
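Several of the stages above (data preparation, exploratory analysis, model development, and evaluation) can be sketched end to end in plain Python. This is a toy illustration, not a recommended implementation: the records, the field names, and the helpers `prepare`, `describe`, `accuracy`, and `fit_threshold` are all hypothetical, and a single-threshold rule stands in for a real machine learning algorithm.

```python
from statistics import mean, median

# Hypothetical raw records, invented for illustration; None marks a missing value.
raw = [
    {"id": 1, "age": 34, "visits": 2, "readmitted": False},
    {"id": 2, "age": None, "visits": 5, "readmitted": True},
    {"id": 2, "age": None, "visits": 5, "readmitted": True},  # duplicate row
    {"id": 3, "age": 71, "visits": 7, "readmitted": True},
    {"id": 4, "age": 45, "visits": 1, "readmitted": False},
    {"id": 5, "age": 63, "visits": 6, "readmitted": True},
    {"id": 6, "age": 29, "visits": 3, "readmitted": False},
]


def prepare(records):
    """Data preparation: drop duplicate ids, fill missing ages with the median."""
    seen, unique = set(), []
    for r in records:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(dict(r))
    fill = median(r["age"] for r in unique if r["age"] is not None)
    for r in unique:
        if r["age"] is None:
            r["age"] = fill
    return unique


def describe(records, field):
    """Exploratory analysis: summary statistics for one numeric field."""
    values = [r[field] for r in records]
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def accuracy(records, field, target, cut):
    """Evaluation: share of records where (field >= cut) matches the target."""
    hits = sum((r[field] >= cut) == r[target] for r in records)
    return hits / len(records)


def fit_threshold(records, field, target):
    """Model development: choose the cut-off with the best training accuracy."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({r[field] for r in records}):
        acc = accuracy(records, field, target, cut)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


clean = prepare(raw)
summary = describe(clean, "visits")
train, test = clean[:4], clean[4:]  # simple hold-out split for the sketch
cut = fit_threshold(train, "visits", "readmitted")
test_accuracy = accuracy(test, "visits", "readmitted", cut)
print(summary, cut, test_accuracy)
```

With only a handful of invented rows, the resulting accuracy says nothing about real performance. The sketch is meant to show how each stage feeds the next, and why a change in the data or in the problem definition can send the work back to an earlier stage.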