Simplifying data pipelines with Apache Kafka Cognitive Class Exam Answers
Module 1: Introduction to Apache Kafka
Question 1: Which of the following are Kafka use cases?
- Messaging
- All of the above
- Stream Processing
- Website Activity Tracking
- Log Aggregation
Question 2: A Kafka cluster is comprised of one or more servers which are called “producers”
- True
- False
Question 3: Kafka requires Apache ZooKeeper
- True
- False
Module 2: Kafka Command Line
Question 1: There are two ways to create a topic in Kafka, by enabling the auto.create.topics.enable property and by using the kafka-topics.sh script.
- True
- False
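Besides the kafka-topics.sh script and auto-creation, a topic can also be created programmatically through the Kafka AdminClient. The following is a minimal sketch, not the course's reference code; the broker address localhost:9092 and the topic name "demo-topic" are assumptions for illustration.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;

public class CreateTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed broker address for illustration
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Hypothetical topic: 3 partitions, replication factor 1
            NewTopic topic = new NewTopic("demo-topic", 3, (short) 1);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```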
Question 2: Which of the following is NOT returned when --describe is passed to kafka-topics.sh?
- Configs
- None of the Above
- PartitionNumber
- ReplicationFactor
- Topic
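The same per-topic details that kafka-topics.sh --describe prints can be read programmatically with AdminClient.describeTopics. A sketch under the same assumptions as above (local broker, hypothetical "demo-topic"):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class DescribeTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            Map<String, TopicDescription> descriptions =
                    admin.describeTopics(Collections.singletonList("demo-topic")).all().get();
            descriptions.forEach((name, desc) ->
                    // Prints the topic name and how many partitions it has
                    System.out.printf("Topic: %s, partitions: %d%n",
                            name, desc.partitions().size()));
        }
    }
}
```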
Question 3: Topic deletion is disabled by default.
- True
- False
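Whether a delete request actually takes effect is governed by the broker setting delete.topic.enable. For completeness, a hedged sketch of issuing the delete through the AdminClient, reusing the assumed broker and topic name:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.Collections;
import java.util.Properties;

public class DeleteTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Only takes effect when the broker permits deletion (delete.topic.enable)
            admin.deleteTopics(Collections.singletonList("demo-topic")).all().get();
        }
    }
}
```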
Module 3: Kafka Producer Java API
Question 1: The setting of acks that provides the strongest guarantee is acks=1
- True
- False
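The strongest guarantee actually comes from acks=all (equivalent to -1), which makes the leader wait for the full set of in-sync replicas. A sketch of the relevant producer properties, with localhost:9092 and the string serializers assumed for illustration:

```java
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class AcksSettingsSketch {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // acks=0   -> do not wait for any acknowledgement from the server
        // acks=1   -> leader writes to its local log and responds without waiting for followers
        // acks=all -> leader waits for all in-sync replicas: the strongest guarantee
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        return props;
    }
}
```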
Question 2: The KafkaProducer is the client that publishes records to the Kafka cluster.
- True
- False
Question 3: Which of the following is not a Producer configuration setting?
- batch.size
- linger.ms
- key.serializer
- retries
- None of the above
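All of the listed settings (batch.size, linger.ms, key.serializer, retries) are standard producer configuration keys. A minimal KafkaProducer sketch that sets them and publishes one record; the broker address, topic name, and values chosen here are assumptions, not recommendations:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProducerConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.RETRIES_CONFIG, 3);        // retries
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // batch.size (bytes)
        props.put(ProducerConfig.LINGER_MS_CONFIG, 5);      // linger.ms

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish a single record to a hypothetical topic
            producer.send(new ProducerRecord<>("demo-topic", "key", "hello kafka"));
        }
    }
}
```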
Module 4: Kafka Consumer Java API
Question 1: The Kafka consumer handles various things behind the scenes, such as:
- Failures of servers in the Kafka cluster
- Adapts as partitions of data it fetches migrate within the cluster
- Data management and storage into databases
- a) and b) only
- All of the Above
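Those concerns (broker failures, partitions migrating between brokers) are hidden behind subscribe() and poll(). A minimal polling-loop sketch; the broker address, group id, and topic name are placeholders:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ConsumerLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                // poll() transparently handles broker failures and partition migration
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```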
Question 2: If enable.auto.commit is set to false, then committing offsets is done manually, which gives you more control.
- True
- False
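With enable.auto.commit=false, the application decides when offsets are committed, for example with commitSync() after a batch has been processed. A sketch under the same assumptions as the previous consumer example:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // manual offset control
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value()); // stand-in for real processing
                }
                // Commit offsets only after the whole batch has been handled
                consumer.commitSync();
            }
        }
    }
}
```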
Question 3: Rebalancing is a process where a group of consumer instances within a consumer group coordinate to own a mutually shared set of partitions of the topics that the group is subscribed to.
- True
- False
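An application can observe rebalances by passing a ConsumerRebalanceListener to subscribe(). A small sketch of such a listener (the logging is illustrative only); it would be registered with consumer.subscribe(topics, new LoggingRebalanceListener()):

```java
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

import java.util.Collection;

// Logs partition ownership changes as the group rebalances.
public class LoggingRebalanceListener implements ConsumerRebalanceListener {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        System.out.println("Rebalance: partitions revoked " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        System.out.println("Rebalance: partitions assigned " + partitions);
    }
}
```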
Module 5: Kafka Connect and Spark Streaming
Question 1: Which of the following are Kafka Connect features?
- A common framework for Kafka connectors
- Automatic offset management
- REST interface
- Streaming/batch integration
- All of the above
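One of those features, the REST interface, can be exercised from plain Java. A hedged sketch that lists deployed connectors, assuming a Connect worker running on the default REST port 8083 of localhost:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectRestSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // GET /connectors returns the names of the connectors deployed on the worker
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Deployed connectors: " + response.body());
    }
}
```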
Question 2: Kafka Connect has two types of worker nodes, called standalone mode and centralized cluster mode
- True
- False
Question 3: Spark periodically queries Kafka to get the latest offsets in each topic and partition that it is interested in consuming from.
- True
- False
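That offset-querying behaviour is the "direct" approach of the Spark Streaming Kafka integration (spark-streaming-kafka-0-10). A sketch of wiring it up in Java; the local master, broker address, group id, and topic name are assumptions for illustration:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SparkKafkaDirectSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaDirectSketch").setMaster("local[2]");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "spark-demo-group");

        // Direct stream: the driver periodically asks Kafka for the latest offsets
        // of each subscribed topic partition and schedules batches accordingly.
        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("demo-topic"), kafkaParams));

        stream.map(ConsumerRecord::value).print();

        ssc.start();
        ssc.awaitTermination();
    }
}
```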
Final Exam
If the auto.create.topics.enable property is set to false and you try to write to a topic that doesn’t yet exist, a new topic will be created.
- True
- False
Which of the following is false about Kafka Connect?
- Kafka Connect makes building and managing stream data pipelines easier
- Kafka Connect simplifies adoption of connectors for stream data integration
- It is a framework for small scale, asynchronous stream data integration
- None of the above
Kafka comes packaged with a command line client that you can use as a producer.
- True
- False
Kafka Connect worker processes work autonomously to distribute work and provide scalability with fault tolerance to the system.
- True
- False
What are the three Spark/Kafka direct approach benefits? (Place the answers in alphabetical order.)
Kafka Consumer is thread safe, as it can give each thread its own consumer instance
- True
- False
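The usual pattern is to give each thread its own KafkaConsumer instance rather than sharing a single instance across threads. A sketch of that thread-per-consumer pattern, with the broker, group id, and topic again assumed:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ConsumerPerThreadSketch {
    public static void main(String[] args) {
        // Each thread builds and owns its own KafkaConsumer instance;
        // instances are never shared between threads.
        for (int i = 0; i < 3; i++) {
            new Thread(ConsumerPerThreadSketch::runConsumer, "consumer-" + i).start();
        }
    }

    private static void runConsumer() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                consumer.poll(Duration.ofMillis(500))
                        .forEach(record -> System.out.println(
                                Thread.currentThread().getName() + " -> " + record.value()));
            }
        }
    }
}
```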
What other open source producers can be used to code producer logic?
- Java
- Python
- C++
- All of the above
If you set acks=1 in a Producer, it means that the leader will write the received message to the local log and respond after waiting for full acknowledgement from all of its followers.
- True
- False
Kafka has a cluster-centric design which offers strong durability and fault-tolerance guarantees.
- True
- False
Which of the following values of acks will not wait for any acknowledgement from the server?
- all
- 0
- 1
- -1
A Kafka cluster is comprised of one or more servers which are called “Producers”
- True
- False
What are In Sync Replicas?
- They are a set of replicas that are not active and are delayed behind the leader
- They are a set of replicas that are not active and are fully caught up with the leader
- They are a set of replicas that are alive and are fully caught up with the leader
- They are a set of replicas that are alive and are delayed behind the leader
In many use cases, you see Kafka used to feed streaming data into Spark Streaming
- True
- False
All Kafka Connect sources and sinks map to unified streams of records
- True
- False
Which is false about the KafkaProducer send method?
- The send method returns a Future for the RecordMetadata that will be assigned to a record
- All writes are asynchronous by default
- It is not possible to make asynchronous writes
- The method returns immediately once the record has been stored in the buffer of records waiting to be sent
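To make those points concrete: send() returns a Future<RecordMetadata> as soon as the record is buffered, blocking on that Future makes the write effectively synchronous, and a callback can run when the broker acknowledges the write. A sketch with the usual assumed broker and topic:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.Future;

public class SendMethodSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("demo-topic", "key", "value");

            // Asynchronous by default: send() returns as soon as the record is buffered
            Future<RecordMetadata> future = producer.send(record);

            // Blocking on the Future makes the write effectively synchronous
            RecordMetadata metadata = future.get();
            System.out.printf("Stored at partition %d, offset %d%n",
                    metadata.partition(), metadata.offset());

            // Or register a callback that fires when the send is acknowledged
            producer.send(record, (meta, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                }
            });
        }
    }
}
```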