Question 1: Spin up an insecure three-node cluster locally. At least one node will need to be on the default port, 26257.
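One way to bring up such a cluster (a sketch, assuming a recent cockroach binary on your PATH; the store directories and the ports chosen for the second and third nodes are arbitrary) is:
cockroach start --insecure --store=node1 --listen-addr=localhost:26257 --http-addr=localhost:8080 --join=localhost:26257,localhost:26258,localhost:26259 --background
cockroach start --insecure --store=node2 --listen-addr=localhost:26258 --http-addr=localhost:8081 --join=localhost:26257,localhost:26258,localhost:26259 --background
cockroach start --insecure --store=node3 --listen-addr=localhost:26259 --http-addr=localhost:8082 --join=localhost:26257,localhost:26258,localhost:26259 --background
cockroach init --insecure --host=localhost:26257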
To verify that you’ve done this correctly, you’ll need to run a command and paste its output into the box below.
For Windows users, run the following command in PowerShell:
-join (cockroach node ls --insecure --host localhost:26257)
For Mac users, run the following command in the terminal:
cockroach node ls --insecure --host localhost:26257 | xargs
Answer: id 1 2 3
Question 2: Connect to your cluster with the SQL shell, and create a table named defaultdb.events (i.e., create a table named events in the database defaultdb).
When you create your table, give it exactly three columns:
user_id column of type UUID
event_code column of type STRING
ts column of type TIMESTAMP
Finally, make the user_id column the primary key of the table.
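A minimal statement that satisfies these requirements (one possible form, run from the SQL shell) is:
CREATE TABLE defaultdb.events (
    user_id UUID PRIMARY KEY,
    event_code STRING,
    ts TIMESTAMP
);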
To verify that your table was built correctly, run the following statement from the SQL shell:
INSERT INTO events VALUES ('63616665-6630-3064-6465-616462656562', 'party', '2019-10-31'), ('63616665-6630-3064-6465-616462656563', 'catering', '2019-11-01T17:00:00');
Copy only the first line of the response, and paste it below.
Paste the first line of the response here:
Answer: INSERT 2
Question 3: Using your three-node cluster from the first problem, scale out to a five-node cluster.
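Assuming the three nodes were started as sketched in Question 1, two more nodes can be added by pointing them at the existing cluster (again a sketch; the store directories and ports are arbitrary):
cockroach start --insecure --store=node4 --listen-addr=localhost:26260 --http-addr=localhost:8083 --join=localhost:26257,localhost:26258,localhost:26259 --background
cockroach start --insecure --store=node5 --listen-addr=localhost:26261 --http-addr=localhost:8084 --join=localhost:26257,localhost:26258,localhost:26259 --background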
When finished, you’ll need to demonstrate that you did this correctly. Download the create_table_and_split_ranges.sql script (click on the link to download), and run it with the following command:
cockroach sql --insecure < create_table_and_split_ranges.sql
This script creates a table and splits it into many ranges. Wait 3-4 minutes for the cluster to rebalance those ranges among the nodes, then open the Admin UI landing page (for a local cluster, typically http://localhost:8080).
The “Replicas” column shows the number of replicas on each node, and while they’re not likely to be equal, they should be close to each other in value. Which of the following describes the number of replicas on most nodes?
- 0 <= Replicas < 100
- 100 <= Replicas < 200
- 200 <= Replicas < 300
- 300 <= Replicas < 400
Question 4: The cluster you spun up in the previous question should now have 5 nodes. For the default replication factor of 3, how many nodes can fail at the same time while still ensuring continuous data availability?
- No nodes can fail while still ensuring availability
- 1 node can fail while still ensuring availability
- 2 nodes can fail while still ensuring availability
- 3 nodes can fail while still ensuring availability
- 4 nodes can fail while still ensuring availability
Question 5: Following up on the previous question, how many simultaneous node failures would the cluster be able to handle while still ensuring data availability if the replication factor were increased to 5?
- No nodes can fail while still ensuring availability
- 1 node can fail while still ensuring availability
- 2 nodes can fail while still ensuring availability
- 3 nodes can fail while still ensuring availability
- 4 nodes can fail while still ensuring availability
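For context: Raft requires a majority (quorum) of each range's replicas to remain available, which is what determines how many simultaneous failures a given replication factor can tolerate. If you want to experiment, the default replication factor can be changed with a zone configuration statement from the SQL shell (a sketch affecting only the default zone):
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;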
Question 6: Which isolation level is used by CockroachDB?
- Read Uncommitted
- Read Committed
- Snapshot
- Repeatable Read
- Linearizable
- Serializable
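If you want to check the answer empirically, the current setting can be read from the SQL shell of a running cluster (one way, via the session variable):
SHOW transaction_isolation;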
Question 7: Which of the following are features of CockroachDB? Check all that apply.
- Transactions
- High availability
- Geo-replication
- Scalability
- Tunable consistency
Question 8: You have a table named real_estate. Its primary key is on the id field. Consider the following query:
> SELECT city, street_name, street_number FROM real_estate WHERE street_name = 'Penny Lane';
    city   | street_name | street_number
+----------+-------------+---------------+
  New York | Penny Lane  |             1
and its logical EXPLAIN plan:
> EXPLAIN SELECT * FROM real_estate WHERE street_name = 'Penny Lane';
  tree | field       |        description
+------+-------------+----------------------------+
       | distributed | true
       | vectorized  | false
  scan |             |
       | table       | real_estate@primary
       | spans       | ALL
       | filter      | street_name = 'Penny Lane'
Which of the following should most improve performance for this query (assuming the data set doesn’t change)?
- Build a secondary index on street_name
- Add an additional node to the cluster
- Build a secondary index on the id field
- Increase the replication factor of the real_estate table
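For reference, a single-column secondary index can be created from the SQL shell like this (an illustration only; CockroachDB picks an index name automatically unless you supply one):
CREATE INDEX ON real_estate (street_name);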
Question 9: With your cluster running, perform the following command from the terminal or from PowerShell:
cockroach workload init movr
This will build and populate several tables in the movr database.
The movr.vehicles table has one secondary index.
What is the name of that secondary index?
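One way to look this up from the SQL shell (assuming the workload finished loading) is:
SHOW INDEXES FROM movr.vehicles;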
Answer:
Question 10: You have a globally distributed cluster, with nine nodes spread across three datacenters, each more than 1,000 km from either of the others. Each node uses the `--locality` flag to identify its datacenter. You find that write latency is high, and upon investigating, you discover that the nodes are neither I/O limited nor bandwidth limited. The issue is that writes are suffering from speed-of-light latency during Raft consensus operations.
Which of the following would be most effective at reducing the high write latency?
- Geopartition the cluster data in order to keep Raft replicas close together
- Reduce the replication factor
- Archive infrequently accessed data
- Increase the number of nodes
- Build additional secondary indexes
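For context, the `--locality` flag mentioned above is passed when each node starts, as a comma-separated list of key=value tiers; the values below are purely illustrative:
cockroach start --locality=region=us-east1,datacenter=dc1 --store=node1 --listen-addr=<node address> --join=<join addresses>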