DataOps Methodology Cognitive Class Answers

Module 1: Establish DataOps – Prepare for operation Lesson 2 – Establish Data Strategy

Question 1: Before we can put together a data strategy, we need to have a good understanding of the data available and how it is used in the organization.

  • True
  • False

Question 2: What is a data strategy?

  • An architecture and actionable roadmap along with an action plan
  • A competitive publication to show that our organization is modern
  • A plan to move all legacy data systems to the cloud

Question 3: Implementing a data strategy should always result in cost savings in the year the plan is realized.

  • True
  • False  

Question 4: Which of the following statements about Data Strategy are true?

  • Whatever the type of data, it should only include internally produced data
  • All types of data – both structured and unstructured need to be considered
  • Volumes of data have increased hugely, but are now starting to stabilize
  • Only business executives should be consulted in putting together a strategy

Question 5: Data Governance is a key part of executing a data strategy.

  • True
  • False

Module 1: Establish DataOps – Prepare for operation Lesson 3 – Establish Team

Question 1: A DataOps team consists of members mostly from IT departments.

  • True
  • False

Question 2: Which of the following roles are active team members of any DataOps team?

  • Chief Technology Officer
  • Chief Data Officer
  • Data Engineer
  • Database Administrator
  • Data Steward
  • Data Architect
  • Data Scientist

Question 3: Creating and maintaining business terms is a major responsibility of which of the following roles?

  • Data Engineer
  • Data Quality Analyst
  • Data Steward
  • Data Scientist

Question 4: Only the Chief Data Officer can update the KPIs for a data sprint.

  • True
  • False

Question 5: DataOps relies heavily on the use of automation, so that communication among team members is not necessary.

  • True
  • False

Module 2: Establish DataOps – Optimize for operation Lesson 1 – Establish Toolchain

Question 1: DataOps toolchain helps you deliver quality data slowly.

  • True
  • False  

Question 2: DataOps Toolchain and DevOps are the same thing.

  • True
  • False  

Question 3: DataOps Toolchain can work without DataOps API(s).

  • True
  • False  

Question 4: What are the key components of DataOps Toolchain?

  • Continuous Deployment
  • Communication
  • Source Control
  • All of the above

Question 5: Who is responsible for creating DataOps Toolchain? (Choose all that apply)

  • Data Scientist
  • Administrator
  • DBA
  • Data Engineer

Module 2: Establish DataOps – Optimize for operation Lesson 2 – Establish Baseline

Question 1: Data Management is the same as Information Governance.

  • True
  • False

Question 2: What is the most costly result from an external influence to an organization?

  • Data Breach Fines and Penalties
  • Insurance Policy Payout
  • Claim Settlement
  • None of these

Question 3: Reference data is defined as data used as a permissible value within a data field.

  • True
  • False
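
To make the definition quoted in Question 3 concrete, here is a minimal Python sketch of reference data used as the set of permissible values for a field. The field name `country_code`, the sample codes, and the helper `invalid_rows` are hypothetical and chosen only for illustration; they do not come from the course.

```python
# Hypothetical reference data: the permissible values for a "country_code" field.
COUNTRY_CODES = {"US", "CA", "BR", "DE"}


def invalid_rows(rows, field, allowed):
    """Return the rows whose value for `field` is not one of the allowed values."""
    return [row for row in rows if row.get(field) not in allowed]


# Sample records (made up for this sketch).
customers = [
    {"id": 1, "country_code": "US"},
    {"id": 2, "country_code": "XX"},  # not in the reference data
    {"id": 3, "country_code": "BR"},
]

bad = invalid_rows(customers, "country_code", COUNTRY_CODES)
print(bad)
```

A check like this is one simple way a baseline can show how well existing data conforms to its reference values.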

Module 2: Establish DataOps – Optimize for operation Lesson 3 – Establish Business Priorities

Question 1: Business Priority should be the primary focus when deciding what the DataOps team should do.

  • True
  • False

Question 2: What is a data backlog?

  • A bottleneck in the data pipeline
  • A list of all data sources
  • A prioritized set of requirements expressed as data tasks
  • A plan to move all data into a catalog
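
One way to picture the option "a prioritized set of requirements expressed as data tasks" is the short Python sketch below. The `DataTask` class, the task names, and the single numeric `business_priority` score are assumptions made for illustration, not something the course defines.

```python
from dataclasses import dataclass


@dataclass
class DataTask:
    """A requirement expressed as a data task (hypothetical structure)."""
    name: str
    business_priority: int  # assumed scale: 1 is the highest priority


def prioritize(backlog):
    """Return the backlog ordered from highest to lowest business priority."""
    return sorted(backlog, key=lambda task: task.business_priority)


# Made-up tasks for this sketch.
backlog = [
    DataTask("Profile customer table", 2),
    DataTask("Catalog sales feed", 3),
    DataTask("Mask account numbers", 1),
]

ordered = prioritize(backlog)
print([task.name for task in ordered])
```

With the backlog kept in this order, the next iteration can begin by taking tasks from the front of the list.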

Question 3: A prioritized data backlog will reduce the time taken to start the next DataOps iteration.

  • True
  • False

Question 4: A Data Task should be prioritized by considering:

  • The cost of providing the data
  • The career advancement possibilities of solving business challenges
  • The impact to sales from implementing the data pipeline
  • All of the above  

Question 5: KPIs are used to determine the progress and throughput of a DataOps data sprint.

  • True
  • False

Module 3: Iterate DataOps – Know your data Lesson 1 – Discover

Question 1: You will need someone on your team with detailed knowledge of the business processes you’re going to analyze so selected data elements are appropriate to reaching your objectives.

  • True
  • False

Question 2: What should you do if you identify gaps or mismatches in the data required for the analysis?

  • Rethink how you will do the analysis with different data
  • Create the missing data
  • Find a new source for the missing or mismatched data
  • All of the above  

Question 3: You should trace the lineage of data elements to be used for analysis to make sure they come from a trusted source.

  • True
  • False

Question 4: What is the primary objective of the Discover phase?

  • Decide what the analytics team wants to have for lunch
  • Identify and locate the specific data elements required to accomplish an analysis
  • Uncover the meaning of data column headers and how they relate to the underlying data
  • Gain an understanding of the business goals and KPIs of an analysis effort

Question 5: A Data Engineer who thoroughly understands where specific data resides, including the specific databases and files where each identified data element resides, should be involved in the Data Discovery process.

  • True
  • False

Module 3: Iterate DataOps – Know your data Lesson 2 – Classify

Question 1: Classification of each data element will make it easier going forward for users to distinguish the meaning and applicability of the data for their purposes.

  • True
  • False

Question 2: Which description best defines taxonomy?

  • Organizing data elements into meaningful structures
  • An IBM network protocol which reduces network latency
  • The art of preparing, stuffing, and mounting the skins of animals with lifelike effect

Question 3: A single data element can be placed into an unlimited number of data domains.

  • True
  • False

Question 4: Which of the following is the objective of classification?

  • To bring out points of similarity and dissimilarity among various groups
  • To present data in a simple, logical and understandable form
  • To condense the mass of data
  • All of the above  

Question 5: You should design workflows which are specific to the classification tool you are using.

  • True
  • False

Module 4: Iterate DataOps – Trust your data Lesson 1 – Manage Qualities & Entities

Question 1: Data quality is data accuracy.

  • True
  • False

Question 2: All data across the enterprise should have the same data quality.

  • True
  • False

Question 3: A data quality framework consists of which of the following 4 phases:

  • Profile
  • Define
  • Remediate
  • Monitor
  • Assess
  • Deploy 

Question 4: When assessing data quality, you only need the data set containing the data, metadata is optional.

  • True
  • False

Module 4: Iterate DataOps – Trust your data Lesson 2 – Manage Policies

Question 1: How does data classification affect defining policies?

  • Inheritance, retention and probabilities
  • Protection, reporting and inheritance
  • Protection, accessibility and retention
  • Retention, deletion and storage

Question 2: What impact does a highly sensitive classification have on a policy definition?

  • Require data anonymization, de-identification, and masking
  • Limit access to the data and/or require data masking
  • Limit access to the data and make it unprintable
  • No impact
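
As an illustration of the option "limit access to the data and/or require data masking", the following Python sketch masks sensitive fields for users without access. The function names, the field names, and the rule of keeping the last four characters visible are all assumptions made for this example, not rules from the course or from any particular governance tool.

```python
def mask_value(value, visible=4, mask_char="*"):
    """Mask all but the last `visible` characters; fully mask short values."""
    text = str(value)
    if len(text) <= visible:
        return mask_char * len(text)
    return mask_char * (len(text) - visible) + text[-visible:]


def apply_policy(record, sensitive_fields, user_has_access):
    """Return a copy of the record, masking sensitive fields when access is limited."""
    if user_has_access:
        return dict(record)
    return {
        key: mask_value(value) if key in sensitive_fields else value
        for key, value in record.items()
    }


# Made-up record and field names for this sketch.
record = {"name": "Ann", "ssn": "123456789"}
masked = apply_policy(record, {"ssn"}, user_has_access=False)
print(masked)
```

The original record is never modified, so the same data asset can be served masked or unmasked depending on who asks for it.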

Question 3: What are the most common state, country or regional regulations affecting personal information?

  • SIN, SSN and BAN
  • FDIC, BCBS and SOX
  • CCPA, GDPR and LGPD
  • PCI, PII and PHI

Question 4: Once policies have been defined affecting the data, rules must be enforced to act.

  • True
  • False

Module 5: Iterate DataOps – Use your data Lesson 1 – Self Service

Question 1: Self Service of data is only possible when any data movement and transformation required to join multiple data assets have been performed.

  • True
  • False

Question 2: Self Service can use the following governance artefacts to refine a search in a catalog. (Choose all that apply)

  • Data Protection Rules
  • Business Terms
  • Tags

Question 3: A data consumer should not be able to access data that has been identified as sensitive, where there is not a business need to do so.

  • True
  • False

Question 4: Which of the following statements about Self Service are true?

  • Data consumers typically do not know how to manipulate the data
  • Data Protection rules prevent a data consumer from inadvertently seeing data that is sensitive
  • Creating multiple catalogs can partition data assets by their content and anticipated audience
  • A data consumer needs to know SQL to join multiple data assets

Question 5: Data Consumers provide valuable input to data scientists by clarifying the combination of data assets and how they need to be transformed, prior to data movement being designed and implemented.

  • True
  • False

Module 5: Iterate DataOps – Use your data Lesson 2 – Manage Movement & Integration

Question 1: You should define the use case at the outset of a Data Movement and Integration project to support a “Build It and They Will Come” strategy.

  • True
  • False

Question 2: Which of the following does not represent a data integration pattern:

  • Data virtualization
  • Data replication
  • Data lineage
  • Message-oriented movement
  • Bulk/batch

Question 3: Which of the following is not a Data Movement and Integration Job Design consideration?

  • Design for reusability
  • Deployment models (e.g. Containers, Kubernetes Orchestration, OpenShift)
  • Design for parallel processing
  • Everything should be programmed in Python
  • Design for job portability (build once and run anywhere)

Question 4: Hand coding generally provides a 10X productivity gain over commercial data integration software tooling.

  • True
  • False

Question 5: Which of the following is not an example of a message queuing system?

  • Kafka
  • VSAM
  • Microsoft Azure Queues
  • GCP PubSub
  • AWS Simple Queue Service
  • MQ

Module 5: Iterate DataOps – Use your data Lesson 3 – Improve/Complete

Question 1: DataOps is a completely new methodology and it doesn’t learn anything from Agile and DevOps.

  • True
  • False

Question 2: Data consumers can first start to provide feedback to the current data sprint in the stakeholder review meeting.

  • True
  • False

Question 3: Which of the following assets or artifacts could be found in a catalog?

  • Code
  • Business terms
  • Data rules
  • Source data
  • Data lineage

Question 4: All issues need to be remediated before moving on to the next data sprint.

  • True
  • False

Question 5: Completing a data sprint involves publishing governed artifacts and data assets to a production environment.

  • True
  • False

Module 6: Improve DataOps – Review and Refine DataOps

Question 1: DataOps is a fixed process which should not be changed once defined.

  • True
  • False

Question 2: Improvements to the DataOps process could involve changes to

  • Technology used in DataOps
  • DataOps team roles and responsibilities
  • Processes for ETL
  • All of the above  

Question 3: Reviewing the Data classification phase involves reviewing how accurate the data mappings to the business terms are.

  • True
  • False

Question 4: Reviewing the Establish Baseline Process should include reviewing how effective the processes are for establishing a baseline for –

  • External Regulatory requirements
  • Organization maturity and Readiness
  • Governance and Oversight
  • All of the above

Question 5: KPIs are key in determining the effectiveness of all parts of the DataOps process.

  • True
  • False

DataOps Methodology Final Exam Answers

Question 1: What is a data strategy?

  • An architecture and actionable roadmap along with an action plan
  • A competitive publication to show that our organization is modern
  • A plan to move all legacy data systems to the cloud

Question 2: Which of the following statements about Data Strategy are true?

  • Whatever the type of data, it should only include internally produced data
  • All types of data – both structured and unstructured need to be considered
  • Volumes of data have increased hugely, but are now starting to stabilize
  • Only business executives should be consulted in putting together a strategy

Question 3: Which of the following roles are active team members of any DataOps team?

  • Chief Technology Officer
  • Chief Data Officer
  • Data Engineer
  • Database Administrator
  • Data Steward
  • Data Architect
  • Data Scientist

Question 4: Creating and maintaining business terms is a major responsibility of which of the following roles?

  • Data Engineer
  • Data Quality Analyst
  • Data Steward
  • Data Scientist

Question 5: Business Priority should be the primary focus when deciding what the DataOps team should do.

  • True
  • False

Question 6: What is a data backlog?

  • A bottleneck in the data pipeline
  • A list of all data sources
  • A prioritized set of requirements expressed as data tasks
  • A plan to move all data into a catalog

Question 7: A Data Task should be prioritized by considering:

  • The cost of providing the data
  • The career advancement possibilities of solving business challenges
  • The impact to sales from implementing the data pipeline
  • All of the above

Question 8: KPIs are used to determine the progress and throughput of a DataOps data sprint.

  • True
  • False

Question 9: What are the key components of a DataOps toolchain?

  • Continuous Deployment
  • Communication
  • Source Control
  • All of the above

Question 10: Who is responsible for creating DataOps toolchain? (Choose all that apply)

  • Data Scientist
  • Administrator
  • DBA
  • Data Engineer

Question 11: What is the primary objective of the Discover phase?

  • Decide what the analytics team wants to have for lunch.
  • Identify and locate the specific data elements required to accomplish an analysis
  • Uncover the meaning of data column headers and how they relate to the underlying data.
  • Gain an understanding of the business goals and KPIs of an analysis effort.

Question 12: Which description best defines taxonomy?

  • Organizing data elements into meaningful structures.
  • An IBM network protocol which reduces network latency.
  • The art of preparing, stuffing, and mounting the skins of animals with lifelike effect.

Question 13: Which of the following is the objective of classification?

  • To bring out points of similarity and dissimilarity among various groups.
  • To present data in a simple, logical and understandable form.
  • To condense the mass of data.
  • All of the above

Question 14: A data quality framework consists of which of the following 4 phases:

  • Profile
  • Define
  • Remediate
  • Monitor
  • Assess
  • Deploy

Question 15: How does data classification affect defining policies?

  • Inheritance, retention and probabilities
  • Protection, reporting and inheritance
  • Protection, accessibility and retention
  • Retention, deletion and storage

Question 16: What impact does a highly sensitive classification have on a policy definition?

  • Require data anonymization, de-identification, and masking
  • Limit access to the data and/or require data masking
  • Limit access to the data and make it unprintable
  • No impact

Question 17: Self Service can use the following governance artefacts to refine a search in a catalog. (Choose all that apply)

  • Data Protection Rules
  • Business Terms
  • Tags

Question 18: Which of the following statements about Self Service are true?

  • A data consumer needs to know SQL to join multiple data assets
  • Data Protection rules prevent a data consumer from inadvertently seeing data that is sensitive
  • Creating multiple catalogs can partition data assets by their content and anticipated audience
  • Data consumers typically do not know how to manipulate the data

Question 19: Which of the following does not represent a data integration pattern:

  • Data virtualization
  • Data replication
  • Data lineage
  • Message-oriented movement
  • Bulk/batch

Question 20: Which of the following is not a Data Movement and Integration Job Design consideration?

  • Design for reusability
  • Deployment models (e.g. containers, Kubernetes orchestration, OpenShift)
  • Design for parallel processing
  • Everything should be programmed in Python
  • Design for job portability (build once and run anywhere)

Question 21: Data consumers can first start to provide feedback to the current data sprint in the stakeholder review meeting.

  • True
  • False

Question 22: Which of the following could be found in a catalog?

  • Code
  • Business terms
  • Data rules
  • Source data
  • Data lineage

Question 23: All issues need to be remediated before moving on to the next data sprint.

  • True
  • False

Question 24: Improvements to the DataOps process could involve changes to

  • Technology used in DataOps
  • DataOps team roles and responsibilities
  • Processes for ETL
  • All of the above

Question 25: Reviewing the Establish Baseline Process should include reviewing how effective the processes are for establishing a baseline for –

  • External Regulatory requirements
  • Organization maturity and Readiness
  • Governance and Oversight
  • All of the above
