Gold Standard data is used to ensure accuracy on your tasks. By labeling a set of known answers, we can prevent scammers from giving you bad results and train new users to deliver the results you expect. CrowdFlower provides a few interfaces for creating and editing Gold Standard data. This document describes those interfaces.
The Gold digging interface is the preferred method for creating Gold Standard data. It works best if you have already collected judgments for a portion of your units, which you can do by ordering a small percentage of your job (say 5%). Getting these judgments ahead of time allows us to suggest the most objective units as potential Gold. Objectivity is very important when choosing Gold. You can still flag somewhat subjective units as Gold, but you should increase the difficulty of those units. You can find detailed instructions for using the Gold digging interface at the following URL (replace job_id with the job you want to add Gold to):
You can also make a specific unit Gold by going to the following URL (replace unit_id with the ID of that unit):
Manual Gold setting
If you already have known answers for some of the rows in your data source, you can link them as Gold using the form-builder interface or the API. Once you have created your form elements, click “Mark Selected Question as Gold” in the form-builder interface. This brings up a list of all the column headers from your data source. Click the column header whose values define the Gold Standard answers. These values should correspond to the values submitted with your form. You can see what these values should be at the following URL (replace job_id with the job you are editing):
Note that fewer than one-third of the rows in your Gold column should have values. We generally suggest flagging 5-10% of the units in a given job as Gold. A Gold item can have multiple correct answers: when you specify Gold manually, we automatically split each Gold value on the newline character.
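As a rough illustration of these guidelines, the following Python sketch splits a Gold cell into its accepted answers and checks what share of rows carry Gold. The helper names, the sample rows, and the column name `sentiment_gold` are hypothetical, and the one-third check simply restates the guidance above. It is not CrowdFlower's own code.

```python
def split_gold_answers(cell):
    """Split a Gold cell on newlines into a list of accepted answers."""
    if cell is None:
        return []
    return [part.strip() for part in cell.splitlines() if part.strip()]


def gold_fraction(rows, gold_column):
    """Return the share of rows whose Gold column has at least one answer."""
    if not rows:
        return 0.0
    with_gold = sum(1 for row in rows if split_gold_answers(row.get(gold_column)))
    return with_gold / len(rows)


# Hypothetical data source: 20 rows, 2 of which carry Gold answers.
rows = [{"text": f"row {i}", "sentiment_gold": ""} for i in range(20)]
rows[0]["sentiment_gold"] = "positive"
rows[1]["sentiment_gold"] = "positive\nneutral"  # two accepted answers

fraction = gold_fraction(rows, "sentiment_gold")
print(split_gold_answers(rows[1]["sentiment_gold"]))  # ['positive', 'neutral']
print(f"{fraction:.0%} of rows are Gold")             # 10% of rows are Gold
print("within one-third limit:", fraction < 1 / 3)    # within one-third limit: True
```

Running a check like this before uploading a data source makes it easy to spot a Gold column that is filled in for too many rows.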
Creating Gold is an iterative process. Often you will find that some items you flagged as Gold are too subjective for users to answer correctly on a consistent basis. We provide an interface where you can monitor the quality of your Gold, add or remove values from your Gold definition, and supply a reason that helps our users understand why the answer you have chosen is correct. You can see a report of your Gold’s health at the following URL (replace job_id with the job you are editing):
Our users are allowed to contest the integrity of individual Gold items. The items in the Gold report are ordered by controversy. The more a Gold unit has been missed and contested, the earlier it is displayed in the list of Gold. The easiest way to deal with highly contested Gold items is to de-flag the unit as Gold. This will prevent it from being displayed to future workers on the task. Clicking an individual Gold item’s id will display an overview of the answers collected thus far. You can also edit the Gold unit by clicking the “edit” link on this page.
The Gold report is also available as a CSV download at:
You can edit this spreadsheet and upload it back to the current job or a new job to modify or add Gold data. The important columns in the Gold report CSV are:
You can flag Gold items as hidden (see “Hidden Gold” below) by setting the _hidden cell to “true” for a given Gold item. If you set this cell to “remove,” the Gold item will be de-flagged completely as Gold. If left blank, the unit will remain a normal Gold Standard item.
The _contention column contains a list of reasons users have given for why they believe your answer isn’t correct. This is helpful in determining why users are getting confused.
The _difficulty column should contain an integer between 1 and 100. The higher the number, the more difficult the Gold is considered, and the later it will be displayed in a worker’s judgment session.
The _id column must contain a valid unit_id. If the unit_id is from the current job, that unit will be modified. If the unit_id is from another job that you own, it will be copied to this job as a new Gold unit.
The _form_field column contains a newline-separated list, ordered by frequency, of the actual responses we are receiving for this specific Gold field (_form_field corresponds to a specific field in your form; see the legend above). You can use this data to find possible values to add to the Gold_column_name cells described below.
The Gold_column_name column contains the actual Gold field’s data and corresponds to the field you defined as Gold (by default this is “form_field_Gold”). If this cell contains any newlines, multiple correct answers will be created by splitting the value on newlines.
The Gold_column_name_reason column should be used to explain to users why your answer(s) are correct.
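The column descriptions above can be put together in a small editing script. The Python sketch below applies per-unit changes to a Gold report CSV before it is uploaded again. The unit ids, the form field name `sentiment`, the exact capitalization of the Gold column names, and the helper `edit_gold_report` are all assumptions made for illustration, so take the real column names from your own downloaded report.

```python
import csv
import io

# Hypothetical excerpt of a Gold report CSV; every value here is made up.
report_csv = (
    "_id,_hidden,_difficulty,sentiment,sentiment_gold,sentiment_gold_reason\n"
    "101,,50,positive,positive,\n"
    "102,,50,negative,negative,\n"
    "103,,50,positive,negative,\n"
)


def edit_gold_report(csv_text, edits):
    """Apply {unit_id: {column: value}} edits and return the new CSV text."""
    for cells in edits.values():
        difficulty = cells.get("_difficulty")
        # The text above asks for an integer between 1 and 100.
        if difficulty is not None and not 1 <= int(difficulty) <= 100:
            raise ValueError("_difficulty must be between 1 and 100")

    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    fieldnames = reader.fieldnames
    rows = []
    for row in reader:
        row.update(edits.get(row["_id"], {}))
        rows.append(row)

    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


edits = {
    # Accept a second answer (newline-separated), explain it, raise difficulty.
    "101": {
        "sentiment_gold": "positive\nneutral",
        "sentiment_gold_reason": "The review is favorable but mostly factual.",
        "_difficulty": "80",
    },
    "102": {"_hidden": "true"},    # turn into hidden Gold
    "103": {"_hidden": "remove"},  # de-flag as Gold entirely
}

updated_csv = edit_gold_report(report_csv, edits)
print(updated_csv)
```

The csv module quotes the cell that contains a newline, so the two accepted answers for unit 101 stay in a single cell when the file is written out.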
You can flag Gold items as hidden from the Gold report interface. This allows you to check the accuracy of your task once it is completed. Gold units flagged as hidden will not affect a worker’s trust score, and workers will not be informed when they miss them. Hidden Gold is provided as a way to perform a sanity check on the data you have collected. Hidden units have a value of FALSE for the _Golden column in your downloaded spreadsheets.
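One way to run such a sanity check is sketched below in Python. It rests on assumptions that are not confirmed by this document: that a hidden Gold row can be recognized by a FALSE value in the _Golden column combined with a non-empty Gold cell, and that the collected answer sits in a column named after the form field. The names `sentiment`, `sentiment_gold`, and `hidden_gold_accuracy` are hypothetical.

```python
def hidden_gold_accuracy(rows, answer_column, gold_column, golden_column="_Golden"):
    """Share of hidden Gold rows whose answer matches an accepted Gold answer.

    Assumption: a hidden Gold row has FALSE in the golden column but still
    carries a value in its Gold column. Returns None if no such rows exist.
    """
    checked = 0
    correct = 0
    for row in rows:
        gold_cell = row.get(gold_column) or ""
        accepted = [a.strip() for a in gold_cell.splitlines() if a.strip()]
        flagged_false = (row.get(golden_column) or "").strip().upper() == "FALSE"
        if not (flagged_false and accepted):
            continue  # regular Gold or an ordinary unit
        checked += 1
        if (row.get(answer_column) or "").strip() in accepted:
            correct += 1
    return correct / checked if checked else None


# Hypothetical rows from a downloaded spreadsheet.
rows = [
    {"_Golden": "FALSE", "sentiment": "positive", "sentiment_gold": "positive"},
    {"_Golden": "FALSE", "sentiment": "negative", "sentiment_gold": "positive\nneutral"},
    {"_Golden": "TRUE", "sentiment": "negative", "sentiment_gold": "negative"},
    {"_Golden": "FALSE", "sentiment": "neutral", "sentiment_gold": ""},
]

print(hidden_gold_accuracy(rows, "sentiment", "sentiment_gold"))  # 0.5
```

In this sample only the first two rows count as hidden Gold, and one of the two collected answers matches, which gives 0.5.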
If too many people are missing and contesting your Gold, we will automatically pause your job and send you an email. You should edit the most controversial Gold items as described above before resuming your job.