Question 1: Why would you use the Read Excel operator instead of the Import Data wizard to import data from an Excel spreadsheet into RapidMiner Studio? (Select one)
- To prevent RapidMiner from overwriting data in the Excel spreadsheet
- To import data from other spreadsheets with the same metadata without going through the Import Data wizard again
- To share the spreadsheet data with a colleague
- To ensure that RapidMiner reads the data as Excel, rather than CSV
Question 2: While working with a data set, you wish to create a new attribute which subtracts 1 week from an attribute named “date”. There are no time changes to be concerned with in the data.
Which of the following are valid methods to create this new attribute in RapidMiner Studio? (Select ANY correct answer)
- use the Adjust Date operator with adjustment=-7 and date unit=Day
- use the Generate Attributes operator with the function ‘date_add(date,-1,DATE_UNIT_WEEK)’
- use the Date to Numerical operator, subtract the number of milliseconds in one week, and then use the Numerical to Date operator
- none of the above methods are able to create this new attribute in RapidMiner Studio
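For intuition about the date arithmetic in Question 2, here is a minimal sketch in plain Python, not RapidMiner. It shows that subtracting 7 days, subtracting 1 week, and subtracting one week's worth of milliseconds from an epoch value give the same timestamp when no time changes are involved. The sample date and the helper names are hypothetical, and the sketch says nothing about which RapidMiner operators accept which parameters.

```python
from datetime import datetime, timedelta

# Number of milliseconds in one week: 7 days * 24 h * 60 min * 60 s * 1000 ms.
MS_PER_WEEK = 7 * 24 * 60 * 60 * 1000
EPOCH = datetime(1970, 1, 1)


def minus_seven_days(value: datetime) -> datetime:
    """Shift the date back by 7 units of 'day'."""
    return value - timedelta(days=7)


def minus_one_week(value: datetime) -> datetime:
    """Shift the date back by 1 unit of 'week'."""
    return value - timedelta(weeks=1)


def minus_week_in_ms(value: datetime) -> datetime:
    """Convert to milliseconds since the epoch, subtract a week, convert back."""
    ms_since_epoch = (value - EPOCH) // timedelta(milliseconds=1)
    return EPOCH + timedelta(milliseconds=ms_since_epoch - MS_PER_WEEK)


# Hypothetical sample value for the "date" attribute.
date = datetime(2021, 3, 15, 9, 30)
results = [minus_seven_days(date), minus_one_week(date), minus_week_in_ms(date)]
print(results)
```

All three helpers return the same value here only because the sketch uses naive timestamps with no daylight saving shifts, which matches the assumption stated in the question.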
Question 3: What is the function of the Process Documents operator with TF-IDF vectors selected? (Select one)
- to transform a text attribute into multiple numerical attributes for future modeling
- to identify common topics from the entire corpus
- to create an association rule graph
- to ascertain the quality of the text prior to modeling
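As background for Question 3, the sketch below computes TF-IDF weights in plain Python for three hypothetical one-line documents. It is an illustration of the general technique only: the tokenization (lowercase, split on whitespace) and the exact weighting formula are assumptions, and RapidMiner's Process Documents operator may tokenize, weight, and normalize differently.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocabulary, rows) where each row is one document's TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([
            (counts[token] / len(tokens)) * math.log(n_docs / doc_freq[token])
            for token in vocab
        ])
    return vocab, rows


# Hypothetical corpus, not taken from the quiz.
docs = ["the film was great", "the film was dull", "great cast"]
vocab, rows = tf_idf(docs)
for doc, row in zip(docs, rows):
    print(doc, [round(weight, 3) for weight in row])
```

Each document becomes one row with one numeric column per distinct token, so a single text value is turned into as many numbers as there are words in the vocabulary.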
Question 4: You have a data set containing first and last names like this:
You now want to create a new attribute called “Full Name” which will look like this:
Pick any valid way to accomplish this task (Select ANY correct answer)
- use the Reorder Attributes and the Generate Aggregation operator.
- use the Generate Concatenation operator.
- use the Generate Attributes operator and the concat function.
- use the Pivot operator.
Question 5: You have two ExampleSets and wish to join them into one ExampleSet as shown below:
Which type of join will produce the desired result? (Select ANY correct answer)
- inner
- left
- right
- outer
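To recall how the four join types in Question 5 differ, here is a minimal plain-Python sketch on two hypothetical tables keyed by "id". The data, the function name, and the assumption that keys are unique on each side are all illustrative; attributes missing from one side are simply absent from the joined row, whereas a real tool would typically show them as missing values.

```python
def join(left, right, key, how):
    """Join two lists of dict rows on `key`; assumes keys are unique per side."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    if how == "inner":
        keys = [k for k in left_by_key if k in right_by_key]
    elif how == "left":
        keys = list(left_by_key)
    elif how == "right":
        keys = list(right_by_key)
    elif how == "outer":
        keys = list(left_by_key) + [k for k in right_by_key if k not in left_by_key]
    else:
        raise ValueError(f"unknown join type: {how}")
    # Merge the matching rows; a side without a match contributes nothing.
    return [{**left_by_key.get(k, {}), **right_by_key.get(k, {})} for k in keys]


# Hypothetical example sets, not the ones pictured in the quiz.
left = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
right = [{"id": 2, "city": "Oslo"}, {"id": 3, "city": "Rome"}]

for how in ("inner", "left", "right", "outer"):
    print(how, [row["id"] for row in join(left, right, "id", how)])
```

Comparing which ids survive each join type against the desired result table is usually the quickest way to identify the join that was used.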
Question 6: You have an input table with direct mailing transactions. Two of the columns are label and earnings (indicated by red boxes below) and you want to aggregate the data into a 2×2 table as shown:
Which of the two processes shown below will produce this result? (Select one)
- Option A
- Option B
Question 7: The following text is a movie review of the film “Rat Race”:
After watching “Rat Race” last week, I noticed my cheeks were sore.
The text is entered into Create Document, then into Tokenize (by word), and then Stem (Porter) as shown below:
Which of the following is the correct output of this process? (Select one)
- watching Rat Race week I noticed cheeks sore
- after watch rat race last week i notic my cheek were sore
- After After_watching watching watching_Rat Rat Rat_Race Race Race_last last last_week week week_I I I_noticed noticed noticed_my my my_cheeks cheeks cheeks_were were were_sore sore
- After watching Race last week noticed cheeks were sore
Question 8: You observe your colleague dragging a process called “Normalization” from her Local Repository onto the Process panel and connecting it as shown:
What will this new operator named “Execute Normalization” do with the output from Filter Examples? (Select one)
- It will create a new visualization called “Normalization”.
- It will execute a Jupyter notebook called “Normalization”.
- It will create a new building block called “Normalization”.
- It will execute “Normalization” as an embedded process.
Question 9: You have imported a data set into RapidMiner Studio containing sales transactions which look like this:
Unfortunately the date attribute was imported as type nominal. To convert this attribute to a date-time type, you could (Select ANY correct answer)
- use the Nominal to Date operator with date format = ‘yyyy-MM-dd HH:mm:ss’
- use the Numerical to Date operator with no time offset
- use the Generate Attributes operator with the expression = ‘date_parse_custom(date,"yyyy-MM-dd HH:mm:ss")’
- add a new Turbo Prep subprocess with a ‘Change to date and time’ Transformation and format = ‘yyyy-MM-dd HH:mm:ss’
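The format pattern ‘yyyy-MM-dd HH:mm:ss’ in Question 9 describes a four-digit year, two-digit month and day, then 24-hour time. As a rough analogy in plain Python (not RapidMiner), the sketch below parses a text value with the equivalent `strptime` pattern. The sample value is hypothetical, since the original data preview is not reproduced here.

```python
from datetime import datetime

# Pattern as written in the quiz, and its assumed Python strptime equivalent.
QUIZ_FORMAT = "yyyy-MM-dd HH:mm:ss"
PYTHON_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical value of the nominal (text) "date" attribute.
raw_value = "2019-05-17 14:03:22"

# Parsing turns the text into a real date-time value that supports arithmetic.
parsed = datetime.strptime(raw_value, PYTHON_FORMAT)

print(type(raw_value).__name__, "->", type(parsed).__name__)
print(parsed.isoformat())
```

If the pattern does not match the text exactly (for example, a different separator or a missing seconds field), `strptime` raises a `ValueError`, which is the same kind of mismatch to watch for when choosing a date format in any tool.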
Question 10: You have two ExampleSets and wish to combine them into one ExampleSet as shown below:
Which of the following RapidMiner Studio processes will produce this desired result? (Select one)
- Option A
- Option B
- Option C
- Option D
Question 11: What is the function of this group of operators in RapidMiner Studio? (Select one)
- to read/write file objects in Amazon S3
- to read/write ExampleSets in Amazon S3
- to read/write processes in Amazon S3
- to stream databases in Amazon S3
Question 12: The Sonar data set has one nominal, special attribute named “class” and 60 real, regular attributes. A small part of the data set is shown below:
To rename ALL of the regular attributes so that they no longer contain the “_” character, you could (Select ANY correct answer)
- add the Rename by Replacing operator to your process, replacing the “_” character with nothing in the Parameters panel.
- add the Rename operator to your process, entering each “old” and “new” attribute name in the Parameters panel.
- go to Turbo Prep, rename the attributes, commit the transformations, and then add the result to your process.
- add the Rename by Generic Names operator to your process, entering the “_” character as the generic name stem in the Parameters panel.
Question 13: Which of the following techniques in RapidMiner Studio can be used to organize your process and/or make it more understandable to others? (Select ALL correct answers)
- grouping operators into subprocesses
- attaching colored notes to operators
- adding colored notes to a process
- editing the default names of operators in a process
- None of the above
Question 14: The Generate Sales Data operator creates a fictitious ExampleSet of sales transaction data:
You put this operator into a blank process and connect it to a Select Attributes operator with certain parameters, as shown below:
Which attribute(s) will be in the results? (Select one)
- date
- transaction_id, customer_id, product_id, amount
- customer_id, product_id, amount
- customer_id, product_id, date, amount
Question 15: In order to train a machine learning model to predict an attribute named “Churn”, you should set its role to (Select one)
- churn
- id
- label
- prediction
Question 16: To remove one or more attributes from an ExampleSet in a RapidMiner Studio process, you can (Select ALL correct answers)
- use the Delete Attributes operator.
- use the Select Attributes operator and check the ‘invert selection’ option in the Parameters panel.
- use the Filter Attributes operator.
- use the Select Attributes operator and choose the subset of desired attributes in the Parameters panel.
Question 17: You have an ExampleSet of movie reviews with a polynominal attribute named “text” as shown below:
To change the word “film” to “movie” in the attribute named “text”, you could (Select ANY correct answer)
- Option A
- Option B
- Option C
- Option D
Question 18: A sample of the Titanic data set with seven examples and five attributes is shown below:
A Filter Examples operator is now applied to the sample with these parameters:
How many examples will be in the resulting ExampleSet after this Filter Examples operator is applied to the sample? (Select one)
Hint: Notice the ‘Match any’ radio button in the image above.
- 1
- 2
- 3
- 4
Question 19: Which of the following allows you to create a new attribute containing the square root of an existing attribute? (Select ANY correct answer)
- Generate Function Set
- Auto Model
- Generate Attributes
- Turbo Prep → Generate
Question 20: You have a table with sales transactions over time. Three of the columns are Product Category, Units Sold, and State (indicated by red boxes below):
- Group by=‘Product Category’, Column grouping=‘Units Sold’, Aggregation=sum of ‘State’
- Group by=‘State’, Column grouping=‘Product Category’, Aggregation=sum of ‘Units Sold’
- Group by=‘Product Category’, Column grouping=‘Units Sold’, Aggregation=sum of ‘State’
- Group by=‘State’, Column grouping=‘Units Sold’, Aggregation=sum of ‘Product Category’
Question 21: In order to always import the most recent entries from a database table into your RapidMiner Studio process, you should (Select one)
- use the Manage Database Connections wizard each time you run the process.
- use the Update Database operator to refresh the data, and then use the Read Database operator.
- use the Read Database operator to retrieve data from the table each time you run the process.
- always ensure that you have the most recent JDBC drivers installed in RapidMiner Studio.
Question 22: You have two example sets that both contain an attribute named “Age” and wish to use Union to create one ExampleSet as shown below:
Which of the following processes could produce this desired result? (Select ANY correct answer)
- Option A
- Option B
- Option C
- Option D