# Working with Batches

An AI Task Builder Batch allows you to collect human annotations on your existing data. You provide a dataset, define instructions, and participants evaluate each datapoint according to your instructions. This guide covers the workflow for creating, configuring, and publishing a Batch.

## Workflow overview

1. **Create a dataset**: Set up a dataset to hold your data.
2. **Upload your data**: Request presigned URLs and upload your files to S3.
3. **Monitor dataset status**: Wait for the dataset to finish processing.
4. **Create a batch**: Initialise a new batch and attach your dataset.
5. **Create instructions**: Define what participants should do with each datapoint.
6. **Set up the batch**: Trigger task generation.
7. **Monitor batch status**: Wait for tasks to be generated.
8. **Create a study**: Create a Prolific study that references your batch.
9. **Publish the study**: Make the study available to participants.
10. **Retrieve responses**: Download the annotated data after participants complete their tasks.

## Creating a dataset

Create a dataset to hold your data.

```bash
POST /api/v1/data-collection/datasets
```

```json
{
  "name": "Product reviews Q4 2024",
  "workspace_id": "6278acb09062db3b35bcbeb0"
}
```

### Response

```json
{
  "id": "0192a3b5-e8f9-7a0b-1c2d-3e4f5a6b7c8d",
  "name": "Product reviews Q4 2024",
  "status": "UNINITIALISED"
}
```

## Uploading your data

Upload your dataset as a CSV file using presigned URLs.

### Step 1: Request a presigned URL

```bash
GET /api/v1/data-collection/datasets/{dataset_id}/upload-url/{filename}
```

For example:

```bash
GET /api/v1/data-collection/datasets/0192a3b5-e8f9-7a0b-1c2d-3e4f5a6b7c8d/upload-url/reviews.csv
```

### Step 2: Upload to S3

Use the presigned URL from the response to upload your CSV file directly to S3.

```bash
curl -X PUT \
  -H "Content-Type: text/csv" \
  --data-binary @reviews.csv \
  "{presigned_url}"
```

### CSV format

Your CSV should contain one row per datapoint. Each column is displayed to participants alongside the instructions.
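A file in this shape can be produced with Python's standard `csv` module. This is a minimal sketch; the file name and rows simply mirror the example below:

```python
import csv

# Rows mirroring the example dataset below; each row is one datapoint.
rows = [
    {"id": 1, "review_text": "Great product, exactly what I needed!",
     "product_name": "Widget Pro", "rating": 5},
    {"id": 2, "review_text": "Arrived damaged, very disappointed",
     "product_name": "Widget Pro", "rating": 1},
    {"id": 3, "review_text": "Works as expected, nothing special",
     "product_name": "Basic Widget", "rating": 3},
]

with open("reviews.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "review_text", "product_name", "rating"])
    writer.writeheader()  # the header row becomes the column names shown to participants
    writer.writerows(rows)
```

`csv.DictWriter` handles quoting automatically, so review text containing commas (as in the first row) is written safely.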
```csv
id,review_text,product_name,rating
1,"Great product, exactly what I needed!",Widget Pro,5
2,"Arrived damaged, very disappointed",Widget Pro,1
3,"Works as expected, nothing special",Basic Widget,3
```

For advanced options including metadata columns and custom task grouping, see [Working with Datasets](/api-reference/ai-task-builder/datasets).

## Monitoring dataset status

Poll the dataset endpoint to check when processing is complete.

```bash
GET /api/v1/data-collection/datasets/{dataset_id}
```

Wait for the status to change to `READY` before proceeding.

### Dataset status

| Status          | Description                                |
| --------------- | ------------------------------------------ |
| `UNINITIALISED` | Dataset created but no data uploaded       |
| `PROCESSING`    | Dataset is being processed                 |
| `READY`         | Dataset is ready to be attached to a batch |
| `ERROR`         | Something went wrong during processing     |

## Creating a batch

Once your dataset is ready, create a batch and attach the dataset.

```bash
POST /api/v1/data-collection/batches
```

```json
{
  "workspace_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Product review sentiment analysis",
  "dataset_id": "0192a3b5-e8f9-7a0b-1c2d-3e4f5a6b7c8d",
  "task_details": {
    "task_name": "Review Sentiment Classification",
    "task_introduction": "Read each product review carefully and classify its sentiment.",
    "task_steps": "1. Read the review text\n2. Consider the overall tone\n3. Select the appropriate sentiment"
  }
}
```

### Task details

The optional `task_details` object provides context to participants:

| Field               | Type   | Description                      |
| ------------------- | ------ | -------------------------------- |
| `task_name`         | string | Title displayed to participants  |
| `task_introduction` | string | Introduction or general guidance |
| `task_steps`        | string | Steps participants should follow |

All three fields support basic HTML formatting.

### Response

```json
{
  "id": "0192a3b4-d6e7-7f8a-0b1c-2d3e4f5a6b7c",
  "workspace_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Product review sentiment analysis",
  "dataset_id": "0192a3b5-e8f9-7a0b-1c2d-3e4f5a6b7c8d",
  "status": "UNINITIALISED",
  "total_task_count": 0
}
```

### Batch status

A batch transitions through the following states:

| Status          | Description                              |
| --------------- | ---------------------------------------- |
| `UNINITIALISED` | Batch created but contains no tasks      |
| `PROCESSING`    | Batch is being processed into tasks      |
| `READY`         | Batch is ready to be attached to a study |
| `ERROR`         | Something went wrong during processing   |

## Creating instructions

Instructions define what participants should do with each datapoint. Each instruction is displayed to participants sequentially alongside the datapoint.

```bash
POST /api/v1/data-collection/batches/{batch_id}/instructions
```

### Instruction types

| Type                             | Description |
| -------------------------------- | ----------- |
| `multiple_choice`                | Selection from a list of options. Use `answer_limit` to control how many options can be selected: `1` for single-select, `-1` for unlimited, or any number up to the total options. |
| `free_text`                      | Open-ended text input |
| `multiple_choice_with_free_text` | Selection from options, each with a heading and an associated free text field for additional input |
| `file_upload`                    | File submission (images, documents, etc.) |

By default, when there are 5 or more options, a dropdown is rendered instead of checkboxes or radio buttons. Set `disable_dropdown: true` to always use checkboxes or radio buttons.

See [Instructions](/api-reference/ai-task-builder/instructions) for full details on all instruction fields.

### Example: Sentiment classification

```json
{
  "order": 1,
  "type": "multiple_choice",
  "description": "What is the overall sentiment of this review?",
  "answer_limit": 1,
  "options": [
    { "label": "Positive", "value": "positive" },
    { "label": "Neutral", "value": "neutral" },
    { "label": "Negative", "value": "negative" }
  ]
}
```

### Example: Free text explanation

```json
{
  "order": 2,
  "type": "free_text",
  "description": "Briefly explain why you chose this sentiment rating",
  "placeholder_text_input": "e.g. The reviewer uses positive language and expresses satisfaction..."
}
```

## Setting up the batch

Once your instructions are created, trigger task generation. Each datapoint in your dataset is paired with all instructions to create a task. Tasks are then organised into **task groups**; participants complete one task group per submission.

```bash
POST /api/v1/data-collection/batches/{batch_id}/setup
```

```json
{
  "tasks_per_group": 5
}
```

The `tasks_per_group` parameter controls how many tasks are randomly assigned to each group. If omitted, each task group contains a single task. Participants complete all tasks within their assigned group in a single submission. No participant is assigned the same task group twice, even if they complete multiple submissions.

For custom task grouping based on your own criteria, see [Working with Datasets](/api-reference/ai-task-builder/datasets).

Calling this endpoint triggers task generation.
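Both the dataset and the batch are polled the same way, so the wait can share one helper. A minimal Python sketch, where `fetch_status` is a stand-in for a GET to the relevant endpoint and the timing defaults are illustrative:

```python
import time

def wait_until_ready(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns "READY"; fail fast on "ERROR"."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # fetch_status() would GET e.g. /api/v1/data-collection/batches/{batch_id}
        # and return the "status" field from the JSON response.
        status = fetch_status()
        if status == "READY":
            return
        if status == "ERROR":
            raise RuntimeError("processing failed")
        sleep(interval)  # statuses move UNINITIALISED -> PROCESSING -> READY
    raise TimeoutError("status still not READY after %.0f seconds" % timeout)
```

The same helper works for both waits in the workflow: pass a closure over the dataset endpoint before creating the batch, and one over the batch endpoint before creating the study.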
The batch status will change to `PROCESSING` and then to `READY` once complete.

## Monitoring batch status

Poll the batch endpoint to check when task generation is complete.

```bash
GET /api/v1/data-collection/batches/{batch_id}
```

Wait for the status to change to `READY` before creating a study.

```json
{
  "id": "0192a3b4-d6e7-7f8a-0b1c-2d3e4f5a6b7c",
  "workspace_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Product review sentiment analysis",
  "status": "READY",
  "total_task_count": 150
}
```

The `total_task_count` reflects the number of datapoints in your dataset.

## Publishing a batch

To make your batch available to participants, create a Prolific study that references it.

```bash
POST /api/v1/studies/
```

When creating the study, set `data_collection_method` to `AI_TASK_BUILDER_BATCH` and provide your batch ID:

```json
{
  "name": "Product Review Sentiment Analysis",
  "internal_name": "sentiment-analysis-q4-2024",
  "description": "Help us understand the sentiment in product reviews by classifying each review and explaining your reasoning.",
  "estimated_completion_time": 15,
  "maximum_allowed_time": 45,
  "reward": 300,
  "data_collection_method": "AI_TASK_BUILDER_BATCH",
  "data_collection_id": "0192a3b4-d6e7-7f8a-0b1c-2d3e4f5a6b7c",
  "data_collection_metadata": {
    "annotators_per_task": 3
  }
}
```

Use `annotators_per_task` in `data_collection_metadata` to specify how many participants should annotate each datapoint. The default is 1. After publishing, this value can only be increased.

Then publish the study:

```bash
POST /api/v1/studies/{study_id}/transition/
```

```json
{
  "action": "PUBLISH"
}
```

## Retrieving responses

After participants have completed their tasks, download the annotated data as a CSV.

```bash
GET /api/v1/data-collection/batches/{batch_id}/report/
```

This returns your original CSV with additional columns containing participant responses for each instruction.

***

By using AI Task Builder, you agree to our [AI Task Builder Terms](https://prolific.notion.site/Researcher-Terms-7787f102f0c541bdbe2c04b5d3285acb).