Wednesday, June 19, 2024

# Data Science 101 Cognitive Class Exam Quiz Answers

## Data Science 101 Cognitive Class Certification Answers

Question 1: According to the reading, a report by the McKinsey Global Institute projected that by 2018 the United States would face a shortage of people with deep analytical skills. What is the size of this shortage?

• 140 000 – 190 000 people
• 120 000
• 20 000 – 50 000 people
• 800 000 – 900 000 people
• 3 – 6 million people

Question 2: What has changed from the past to make data science an in-demand occupation?

• There is now a lack of data
• Laws have changed
• Vast amounts of data are being created
• The advent of the free market

Question 3: What is the minimum education requirement to become a data scientist?

• You must have a Degree in Computer Science
• You must have a Master’s degree in Statistics
• You must have a Ph.D. in Machine learning
• The above are all helpful, but they are not necessary to become a data scientist; the educational backgrounds of data scientists vary

Question 1: What is structured data?

• Data that can be stored in a database or some tabular form
• Images and video
• Segments of text
• Audio data

Question 2: What does the following formula represent: Base fare + Time x (Time in cab)

• A possible formula used in regression analysis to determine the cost of a cab ride
• The formula used to build a recommender system for rating a cab service
• A possible formula used in regression analysis to determine the price of a house
• What is the impact of lot size on housing price?
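
As a rough illustration of the kind of formula Question 2 refers to, the sketch below fits a straight line to a handful of made-up cab rides with ordinary least squares. The intercept plays the role of the base fare and the slope the per-minute charge. The ride data and the names `fit_line`, `minutes`, `fares`, and `predict_fare` are invented for this example and do not come from the course.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical rides: minutes spent in the cab and the fare charged.
# The fares follow 3.00 + 0.50 * minutes exactly, so the fit should recover that.
minutes = [5.0, 10.0, 15.0, 20.0, 30.0]
fares = [3.00 + 0.50 * m for m in minutes]

base_fare, rate_per_minute = fit_line(minutes, fares)

def predict_fare(time_in_cab):
    """Predicted fare = base fare + per-minute rate * time in cab."""
    return base_fare + rate_per_minute * time_in_cab

print(round(base_fare, 2), round(rate_per_minute, 2), round(predict_fare(12), 2))
```

With real ride data the points would not fall exactly on a line, and the fitted intercept and slope would only approximate the true pricing rule.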

Question 3: In the reading, what is an example of a question that can be put to a regression analysis?

• Do homes with brick exterior sell in rural areas?
• What is the impact of lot size on housing price?
• What are typical land taxes in a house sale?
• How much does a finished basement cost?
• How much should a house near a park cost?

Question 1: Complete the following sentence that best explains why businesses need to capture data: At the end of the day, businesses know one thing, that if they are unable to measure something:

• they are unable to graph it
• they are unable to improve it
• they are unable to show compliance with tax laws
• they are unable to facilitate meetings between sales and marketing

Question 2: A business should never:

• delete data
• use machine learning
• document data well
• use PowerPoint to deliver a message

Question 3: In the reading above, what is the role of the data scientist?

• Email the stakeholders about the analysis
• Manage a team of analysts to create a model
• Develop the strategy to fix the problems in the findings
• Use the insights to build the narrative to communicate the findings
• Use the data to tell the story the CEO wants to tell

Question 1: What popular product is primarily based on data science?

• Smartphone
• SpaceX’s rockets
• Tesla’s electric cars

Question 2: According to the readings, the results section is where you present:

• The empirical findings
• R Squared
• The conclusion
• The contributors
• The methods used

Question 3: Complete the sentence: Predictions are useful:

• they are always correct
• but you need lots of data
• but they must come from a complicated model
• they are always wrong

Question 1: In the reading, how does the author define ‘data science’?

• Data science is a way of understanding things, of understanding the world
• Data science is a physical science like physics or chemistry
• Data science is some data and more science
• Data science is what data scientists do
• Data science is the art of uncovering the hidden secrets in data

• His definition limits data science to activities involving machine learning
• His definition is only for people who program in Python
• His definition excludes statistics
• His definition is about weaving strong narratives into analytics
• His definition is inclusive of individuals from various academic backgrounds and training

Question 3: A good data scientist should:

• calculate confidence intervals
• be sceptical
• use complicated models
• only use big data

Question 1: In the reading, the output of a data mining exercise largely depends on:

• The engineer
• The programming language used
• The quality of the data
• The scope of the project
• The data scientist

Question 2: What has changed from the past to make data science an in-demand occupation?

• There is now a lack of data
• Laws have changed
• Vast amounts of data are being created
• The advent of the free market

Question 3: You develop an algorithm to predict rainy days. Your algorithm predicts a rainy day, but the prediction is false. What is this an example of?

• the r squared
• true negative
• false positive
• generated values
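
The four possible outcomes of a yes/no prediction, which Question 3 draws on, can be tallied as in the minimal sketch below. The rain observations and the helper name `confusion_counts` are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for was_rainy, said_rainy in zip(actual, predicted):
        if said_rainy and was_rainy:
            counts["TP"] += 1   # predicted rain, it rained
        elif said_rainy and not was_rainy:
            counts["FP"] += 1   # predicted rain, it stayed dry
        elif not said_rainy and not was_rainy:
            counts["TN"] += 1   # predicted dry, it stayed dry
        else:
            counts["FN"] += 1   # predicted dry, it rained
    return counts

# Hypothetical five-day record: what happened vs. what the algorithm predicted.
actual_rain = [True, False, False, True, False]
predicted_rain = [True, True, False, False, False]

counts = confusion_counts(actual_rain, predicted_rain)
print(counts)
```

In this made-up record, day two is the case the question describes: rain was predicted, but the day stayed dry.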

Question 4: What is an example of a regression problem?

• Finding an object in an image
• Reducing the size of a dataset
• Predicting the price of a house using the square footage
• Finding clusters in the data

Question 5: What should be a prime concern for storing data?

• Data safety and privacy
• Hiring the right database manager
• The size of the files
• The physical location of the servers

Question 6: What is a good starting point for data mining?

• Data Visualization
• Writing a data dictionary
• Non-parametric methods
• Creating a relational database
• Machine learning

Question 7: Complete the following sentence that best explains why businesses need to capture data: At the end of the day, businesses know one thing, that if they are unable to measure something:

• they are unable to graph it
• they are unable to improve it
• they are unable to show compliance with tax laws
• they are unable to facilitate meetings between sales and marketing

Question 8: When establishing data mining goals, the accuracy expected from the results also influences:

• The timelines for the project
• The scope of the project
• The costs
• The presentation
• Data scientist

Question 9: When processing data, what factor can lead to errors in data?

• Synchronizing the database
• Changing service providers
• Renaming variables
• Human error
• Overfitting

Question 10: A good data scientist should:

• calculate confidence intervals
• be sceptical
• use complicated models
• only use big data

## Introduction to Data Science 101

Data science is an interdisciplinary field that involves extracting insights and knowledge from data using various techniques and methodologies. Here’s a basic introduction to some key concepts in data science:

1. Data: Data is at the heart of data science. It can be any information that is collected, stored, and analyzed. Data comes in different forms: structured data (organized in a tabular format, as in databases), unstructured data (text, images, videos), and semi-structured data (formats such as JSON or XML, which carry some organizing tags but no fixed tabular schema).
2. Statistics: Statistics is fundamental to data science. It involves collecting, analyzing, interpreting, and presenting data. Statistical methods help in summarizing data, making inferences, and predicting future outcomes based on historical data.
3. Machine Learning: Machine learning is a subset of artificial intelligence (AI) that focuses on developing algorithms and models that enable computers to learn from data and make predictions or decisions without being explicitly programmed. There are various types of machine learning techniques, including supervised learning, unsupervised learning, and reinforcement learning.
4. Data Mining: Data mining is the process of discovering patterns, correlations, and insights from large datasets. It involves using techniques from statistics, machine learning, and database systems to extract useful information from data.
5. Data Visualization: Data visualization is the graphical representation of data. It helps in understanding complex data patterns, trends, and relationships by presenting information in a visual format such as charts, graphs, and maps.
6. Big Data: Big data refers to extremely large and complex datasets that traditional data processing methods are inadequate to handle. Big data technologies such as Hadoop and Spark are used to store, process, and analyze massive volumes of data.
7. Python/R Programming: Python and R are two of the most popular programming languages used in data science. They offer powerful libraries and tools for data manipulation, analysis, and visualization.
8. Data Preprocessing: Data preprocessing involves cleaning, transforming, and organizing raw data into a format suitable for analysis. It is a crucial step in the data science workflow as the quality of the input data directly impacts the results of analysis.
9. Data Ethics and Privacy: Data ethics and privacy are important considerations in data science. They involve ensuring that data is collected, stored, and used in an ethical and responsible manner, while respecting the privacy rights of individuals.
10. Domain Knowledge: Domain knowledge refers to expertise in a specific field or industry. It plays a vital role in data science as understanding the context and nuances of the data domain is essential for accurate analysis and interpretation.
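
To make the preprocessing and statistics steps in the list above a little more concrete, here is a minimal sketch using only the Python standard library. It assumes a tiny, invented set of house records in which some fields are missing or unparseable, drops the bad rows, and then summarizes what remains. The field names and values are hypothetical.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from a CSV export (all strings).
raw_rows = [
    {"lot_size": "5000", "price": "250000"},
    {"lot_size": " 7500 ", "price": "310000"},   # stray whitespace
    {"lot_size": "", "price": "199000"},         # missing lot size
    {"lot_size": "6200", "price": "n/a"},        # unparseable price
]

def clean(rows):
    """Convert every field to float; skip rows where any field fails to parse."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({key: float(value.strip()) for key, value in row.items()})
        except ValueError:
            continue  # drop rows with missing or malformed values
    return cleaned

houses = clean(raw_rows)
avg_price = mean(house["price"] for house in houses)
print(len(houses), avg_price)
```

Dropping incomplete rows is only one possible choice; depending on the data, filling in missing values or flagging them for review may be more appropriate.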