Units

A Unit represents information you want annotated. If you were to upload a spreadsheet of data, each row would be turned into a Unit. Each Unit for a particular Job should contain the same set of data fields. A Unit must belong to a Job.

Units are not ordered automatically when they are created. If you want every Unit added to a Job to be ordered automatically, set auto_order to true on your Job.

Attributes

Read / Write

  • job_id
  • missed_count
  • difficulty
  • state
  • data
  • agreement

Read-only

  • updated_at
  • created_at
  • judgments_count
  • id

Results

Any unit that has gathered trusted judgments will also have a results attribute.

results has a top-level attribute judgments, which contains a JSON array of all the Judgments that have been gathered so far during your job.

"results" : {
  "judgments"    : [
    {"job_id"    : 5539,
     "data"      : {...},
     "worker_id" : 12,
     "unit_id"   : 1001,
     ...
    }, ...
    ]
  ...
}

Any aggregate calculations that are done on your data will also be included in results. For example, if your job asks workers to visit a website and answer a series of questions about the website, your results structure might look like this:

"results" : {
  "what_is_the_url" : { "agg" : "http://blog.crowdflower.com",
                        "confidence" : "0.9" }
}
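As a minimal sketch of reading this structure, the Python below collects the aggregate value and confidence for each aggregated field and skips the judgments array. The function name summarize_results is hypothetical, and the sample dictionary simply mirrors the examples above; it is not real API output.

```python
def summarize_results(results):
    """Return {field: (agg, confidence)} for every aggregated field."""
    summary = {}
    for field, value in results.items():
        if field == "judgments":
            # Raw judgments are a list, not an aggregate.
            continue
        if isinstance(value, dict) and "agg" in value:
            confidence = value.get("confidence")
            # The example above shows confidence as a string, so convert it.
            summary[field] = (
                value["agg"],
                float(confidence) if confidence is not None else None,
            )
    return summary


# Sample data modeled on the structures shown above.
results = {
    "judgments": [{"job_id": 5539, "worker_id": 12, "unit_id": 1001}],
    "what_is_the_url": {
        "agg": "http://blog.crowdflower.com",
        "confidence": "0.9",
    },
}

print(summarize_results(results))
```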

See aggregation for more details on the types of aggregations that are possible.

Methods

Common

Action   URL / Params                    Verb
Create   /jobs/$JOB_ID/units             POST
Read     /jobs/$JOB_ID/units/($UNIT_ID)  GET
Update   /jobs/$JOB_ID/units/$UNIT_ID    PUT
Destroy  /jobs/$JOB_ID/units/$UNIT_ID    DELETE
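
The four common methods can be sketched as a small lookup that returns the HTTP verb and URL for an action. This is an illustration only: unit_request and ROUTES are hypothetical names, the base URL is taken from the cURL example elsewhere on this page, and authentication (the key parameter) and the .json suffix are left out.

```python
BASE_URL = "https://api.crowdflower.com/v1"  # assumed from the cURL example

# action -> (HTTP verb, whether a unit id is required)
ROUTES = {
    "create": ("POST", False),
    "read": ("GET", False),     # unit id is optional for Read
    "update": ("PUT", True),
    "destroy": ("DELETE", True),
}


def unit_request(action, job_id, unit_id=None):
    """Return (verb, url) for one of the common Unit actions."""
    verb, needs_unit_id = ROUTES[action]
    path = f"/jobs/{job_id}/units"
    if action != "create" and unit_id is not None:
        path += f"/{unit_id}"
    elif needs_unit_id:
        raise ValueError(f"{action} requires a unit_id")
    return verb, BASE_URL + path


print(unit_request("read", 6245))
print(unit_request("update", 6245, 1038109))
```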

Parameters

Create and Update both accept the Read / Write attributes listed above. Be sure to wrap each URL parameter name in unit[...], e.g.

unit[golden]=true&unit[data][somekey]=Some+value
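
One way to build such a parameter string is to flatten a nested dictionary into unit[...] keys and URL-encode the result. The sketch below uses only the Python standard library; the helper names flatten_unit_params and encode_unit_params are hypothetical. Note that urlencode percent-encodes the square brackets, which is the URL-encoded form of the string shown above.

```python
from urllib.parse import urlencode


def flatten_unit_params(attrs, prefix="unit"):
    """Flatten a nested dict into (name, value) pairs such as unit[data][k]."""
    pairs = []
    for key, value in attrs.items():
        name = f"{prefix}[{key}]"
        if isinstance(value, dict):
            # Nested dicts become additional bracketed segments.
            pairs.extend(flatten_unit_params(value, name))
        else:
            pairs.append((name, value))
    return pairs


def encode_unit_params(attrs):
    """Return the URL-encoded query string for the given unit attributes."""
    return urlencode(flatten_unit_params(attrs))


print(flatten_unit_params({"golden": "true", "data": {"somekey": "Some value"}}))
print(encode_unit_params({"golden": "true", "data": {"somekey": "Some value"}}))
```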

If auto_order is enabled, the unit is ordered as soon as it is created. If you have not set channels for this job, however, the unit is still saved successfully, but the response includes the following error message:

{
  "results": {
    "judgments": []
  },
  "created_at": "2010-05-03T14:04:31-07:00",
  "data": {
    "name": "pirate radio"
  },
  "updated_at": "2010-05-03T14:04:31-07:00",
  "judgments_count": 0,
  "id": 735275,
  "difficulty": 0,
  "job_id": 124056,
   "message": {
      "error": "Unit was successfully created, but it could not be ordered because channels have not been set for this job."
  },
  "state": "new"
}
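
Because the unit is saved even when ordering fails, a client may want to check the message key rather than treat the response as a failure. A minimal sketch, assuming the response has already been parsed into a dictionary (ordering_error is a hypothetical helper name):

```python
def ordering_error(unit):
    """Return the ordering error text from a unit response, or None."""
    message = unit.get("message") or {}
    return message.get("error")


# Trimmed copy of the response shown above.
created = {
    "id": 735275,
    "job_id": 124056,
    "state": "new",
    "message": {
        "error": "Unit was successfully created, but it could not be "
                 "ordered because channels have not been set for this job."
    },
}

error = ordering_error(created)
if error:
    print(f"Unit {created['id']} saved but not ordered: {error}")
```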

Status

If you are adding units from a spreadsheet or a feed, you can use ping to determine the status of the upload:

Action   URL / Params               Verb
Status   /jobs/$JOB_ID/units/ping   GET

Response

Pending / processing:

{
  "done": false,
  "count": 100
}

Finished:

{
  "done": true,
  "count": 200
}

Failed:

{
  "done": false,
  "count": 55,
  "error": "Unable to parse row 57. Please be sure your file is UTF-8 encoded."
}

Although multiple queued uploads are allowed, ping only returns the status of the most recent upload.
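
The three response shapes above can be told apart by checking for error first, then done. The following sketch classifies a parsed ping response and polls until the upload leaves the pending state. The names upload_state and wait_for_upload are hypothetical, and fetch_ping stands in for whatever function you use to perform the GET request and parse its JSON.

```python
import time


def upload_state(ping):
    """Classify a parsed ping response as 'failed', 'finished' or 'pending'."""
    if "error" in ping:
        return "failed"
    return "finished" if ping.get("done") else "pending"


def wait_for_upload(fetch_ping, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_ping() until the upload finishes, fails, or polls run out."""
    ping = None
    for _ in range(max_polls):
        ping = fetch_ping()
        state = upload_state(ping)
        if state != "pending":
            return state, ping
        sleep(interval)
    return "pending", ping


# Usage with canned responses instead of real HTTP calls.
canned = iter([{"done": False, "count": 100}, {"done": True, "count": 200}])
state, last = wait_for_upload(lambda: next(canned), sleep=lambda _: None)
print(state, last["count"])
```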

Cancel

If a unit has been ordered, you can use this method to cancel an individual unit. This action does not refund your account.

Action   URL / Params                          Verb
Cancel   /jobs/$JOB_ID/units/$UNIT_ID/cancel   POST

Response

{"success":"Unit 1038109 has been cancelled."}

In case of an error processing the request, the error will appear in a JSON key called error:

{"error":"Job 6245 does not contain Unit 1038109"}
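
A client can distinguish the two outcomes by checking which key is present. The sketch below parses the raw response body; parse_cancel_response is a hypothetical helper, and it assumes the body is one of the two JSON shapes shown above.

```python
import json


def parse_cancel_response(body):
    """Return (ok, message) for a cancel response body."""
    payload = json.loads(body)
    if "error" in payload:
        return False, payload["error"]
    return True, payload.get("success", "")


print(parse_cancel_response('{"success":"Unit 1038109 has been cancelled."}'))
print(parse_cancel_response('{"error":"Job 6245 does not contain Unit 1038109"}'))
```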

cURL Example

curl -X PUT "https://api.crowdflower.com/v1/jobs/6245/units/1038109/cancel.json?key=$API_KEY"

Bulk Split

Action   URL / Params                  Verb
Split    /jobs(/$JOB_ID)/units/split   PUT

Parameters

on
A comma-delimited list of columns to be split.
with
The internal delimiter for the column. Default is the space character (" ").

Response

A successful request will return a 200 “OK” response.

Explanation and Example

Note: You are strongly advised to post complex data formats to CrowdFlower using JSON rather than CSV. JSON is better suited for complex data — the split operation is provided as a convenience operation for existing datasets, and it is not recommended for new users.

If you upload a spreadsheet that has more complex data than a CSV can naturally represent, the split operation can be useful. Any column of your uploaded CSV may be internally delimited by a special character (by default, a blank space). Calling split on such a column tells CrowdFlower to treat its contents as a collection of discrete items rather than as a single block of text.

Suppose your existing dataset is an arbitrary collection of major authors.

author,major_works,countries_active
Homer,The Iliad|The Odyssey,Greece
Dickens,David Copperfield|Bleak House,England
Nabokov,Camera Obscura|Lolita,Russia|United States
Rabelais,Gargantua and Pantagruel,France
Cervantes,Don Quixote,Spain

When this data is posted as a CSV to CrowdFlower, one unit is created for each of the five rows of data. Each unit has data associated with the three CSV columns provided. When initially posted, CrowdFlower treats all of the values transferred as free-text values with no depth or structure. After the initial data post, Dickens’ major_works field is set to the single string "David Copperfield|Bleak House".

To let CrowdFlower know that the major_works and countries_active columns are each actually collections of delimited values, you can use the split operation.

PUT http://api.crowdflower.com/v1/jobs/$JOB_ID/units/split?on=major_works,countries_active&with=|

(Be careful to URL-encode the parameters — this is not done above for readability.)
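
For illustration, the URL-encoded form of that request can be produced with the Python standard library. The helper name split_url is hypothetical, the job ID 6245 is borrowed from the cancel example, and authentication is omitted just as it is in the request line above.

```python
from urllib.parse import urlencode


def split_url(job_id, columns, delimiter=" "):
    """Build the split URL with properly encoded 'on' and 'with' parameters."""
    query = urlencode({"on": ",".join(columns), "with": delimiter})
    return f"http://api.crowdflower.com/v1/jobs/{job_id}/units/split?{query}"


# The comma becomes %2C and the pipe becomes %7C.
print(split_url(6245, ["major_works", "countries_active"], "|"))
```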

After the PUT, CrowdFlower will consider Dickens’ major_works field to be set to the collection [ "David Copperfield", "Bleak House" ]. Similarly, Nabokov’s countries_active field will be set to [ "Russia", "United States" ]. The square brackets here indicate a data structure that is analogous to a List or Vector in Java, a list in Python, an Array in Ruby, etc. If you were to request Homer’s major_works from CrowdFlower, it would be returned as a JSON array:

major_works : [ "The Iliad","The Odyssey" ]

Because the author field was not split, it will not be treated as a collection:

author : "Homer"

The preferred way to achieve this with CrowdFlower is by posting your data in a structured format with JSON.
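
As a rough sketch of that preferred route, the delimited CSV above can be converted locally into structured records before posting. This runs entirely on your own machine and is not CrowdFlower's implementation of split; the function name rows_to_structured is hypothetical, and how the resulting JSON is posted is not covered here. In this sketch a single-valued cell such as Homer's countries_active becomes a one-element list.

```python
import csv
import io
import json

CSV_TEXT = """\
author,major_works,countries_active
Homer,The Iliad|The Odyssey,Greece
Dickens,David Copperfield|Bleak House,England
Nabokov,Camera Obscura|Lolita,Russia|United States
Rabelais,Gargantua and Pantagruel,France
Cervantes,Don Quixote,Spain
"""


def rows_to_structured(csv_text, list_columns, delimiter="|"):
    """Parse CSV text and turn the named columns into lists of values."""
    units = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in list_columns:
            row[column] = row[column].split(delimiter)
        units.append(dict(row))
    return units


units = rows_to_structured(CSV_TEXT, ["major_works", "countries_active"])
print(json.dumps(units[0], indent=2))
```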


Privacy Policy Terms of Service ©2011 CrowdFlower