Use Case Examples
These end-to-end examples illustrate common ways researchers use Polyphony. Each example walks through the key steps from corpus upload to analysis.
1. Sentiment Classification with Human Annotators
Goal: Label a dataset of 1 000 product reviews as positive, neutral, or negative with two independent annotators, then compute agreement and create a gold standard.
Steps
- Upload corpus — Go to Corpus and upload a CSV with columns `review_id`, `text`, and `product`. The system creates a corpus with the uploaded columns.
- Define variable — Go to Variables and create a Single categorical variable named Sentiment with options positive, neutral, and negative.
- Build annotation form — Go to Annotation Builder, create a form, add a Document viewer block and an Input component block linked to the Sentiment variable.
- Invite annotators — Go to Annotator Management and invite two colleagues.
- Create workflow — Go to Human Workflow, create an Overlap workflow covering the whole corpus, assign both annotators, and activate it. Each annotator receives 1 000 tasks.
- Annotate — Annotators log in and work through their task queues.
- Analyse — Once both annotators are done, go to Analysis to view Cohen's Kappa and other IAA metrics. Create a gold standard column using majority vote.
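The agreement computation behind the Analysis step can be reproduced outside the tool as a sanity check. Below is a minimal stand-alone Cohen's kappa implementation over invented toy labels; it is illustrative only, not Polyphony's internal code:

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' parallel label lists."""
    assert len(a) == len(b)
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    ca, cb = Counter(a), Counter(b)
    # Chance agreement from each annotator's marginal label distribution.
    pe = sum((ca[c] / n) * (cb[c] / n) for c in set(ca) | set(cb))
    return (po - pe) / (1 - pe)

# Toy sentiment labels for six reviews from two annotators.
ann1 = ["positive", "neutral", "negative", "positive", "neutral", "positive"]
ann2 = ["positive", "neutral", "positive", "positive", "negative", "positive"]
print(round(cohen_kappa(ann1, ann2), 3))  # 0.429

# With only two annotators, "majority vote" reduces to keeping agreed
# labels and flagging disagreements (None) for adjudication.
gold = [x if x == y else None for x, y in zip(ann1, ann2)]
```

Note that a two-annotator gold standard needs a tie-breaking rule for disagreements; the `None` entries above mark items an adjudicator would still have to resolve.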
2. LLM-Assisted Annotation with Human Review
Goal: Use an LLM to pre-annotate 10 000 news articles for topic, then have a researcher review and correct a sample.
Steps
- Upload corpus — Upload a CSV with `article_id` and `body` columns.
- Create partitions — Split articles into `review-sample` (500 articles) and `llm-only` (9 500 articles) using the partitions column in the upload file, or via the Corpus view.
- Define variable — Create a Single categorical variable Topic with options politics, sports, technology, entertainment, other.
- Create LLM pipeline — Go to LLM Pipeline, create a pipeline for the corpus, add a prompt for the Topic variable, and run it. The LLM annotates all 10 000 articles.
- Human review — Create a Human Workflow scoped to the review-sample partition and assign the researcher. After reviewing, compute agreement between the human corrections and LLM outputs in Analysis.
- Export — Export the final corpus with LLM outputs as XLSX via the Corpus export button.
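The partition split in step 2 can be prepared before upload with a short script. A sketch assuming the upload file takes a `partitions` column holding the partition name per row (the column name, file layout, and article bodies here are illustrative):

```python
import csv
import random

random.seed(42)  # reproducible split

# Toy stand-ins for the 10 000 articles; real bodies come from your data.
articles = [{"article_id": str(i), "body": f"Article text {i}"} for i in range(10_000)]

# Randomly sample 500 article ids for human review; the rest are LLM-only.
review_ids = set(random.sample([a["article_id"] for a in articles], 500))

with open("articles.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["article_id", "body", "partitions"])
    writer.writeheader()
    for a in articles:
        partition = "review-sample" if a["article_id"] in review_ids else "llm-only"
        writer.writerow(dict(a, partitions=partition))
```

A random sample keeps the review set representative of the whole corpus; a stratified sample (e.g. by outlet or date, if those columns exist) would be the natural refinement.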
3. Multi-Dimensional Annotation for NLP Training Data
Goal: Collect fine-grained annotations for 500 customer support messages across three dimensions: intent, urgency, and sentiment, and produce training data for a supervised model.
Steps
- Upload corpus — Upload a CSV with `ticket_id` and `message` columns.
- Define variables — Create three variables:
  - Single categorical Intent (billing, technical, shipping, other)
  - Single categorical Urgency (low, medium, high)
  - Likert Sentiment (1–5)
- Build annotation form — Create one form with a document viewer and three input components, one for each variable.
- Create three-annotator workflow — Use Overlap mode with three annotators so every message is annotated three times.
- Analyse — After annotation, go to Analysis to compute Fleiss' Kappa for each variable. Create gold standard columns for variables with sufficient agreement.
- Export training data — Export the corpus as XLSX; the gold standard columns appear alongside the original text and metadata.
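For a three-annotator overlap workflow, the per-variable agreement from the Analysis step can likewise be checked independently. A minimal stand-alone Fleiss' kappa over invented toy Intent labels (again a reference computation, not Polyphony's code):

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for items each rated by the same number of raters.

    `ratings` is a list of per-item label lists, e.g. three labels per message.
    """
    n = len(ratings[0])  # raters per item (3 in the overlap workflow above)
    N = len(ratings)     # number of items
    cats = sorted({label for row in ratings for label in row})

    totals = dict.fromkeys(cats, 0)
    item_agreement = []
    for row in ratings:
        counts = {c: row.count(c) for c in cats}
        for c in cats:
            totals[c] += counts[c]
        # Proportion of agreeing rater pairs for this item.
        item_agreement.append((sum(v * v for v in counts.values()) - n) / (n * (n - 1)))

    p_bar = sum(item_agreement) / N                      # mean observed agreement
    p_e = sum((totals[c] / (N * n)) ** 2 for c in cats)  # chance agreement
    return (p_bar - p_e) / (1 - p_e)

# Toy Intent labels for four messages, three annotators each.
intent = [
    ["billing", "billing", "billing"],
    ["billing", "technical", "billing"],
    ["technical", "technical", "technical"],
    ["shipping", "billing", "technical"],
]
print(round(fleiss_kappa(intent), 3))  # 0.268
```

Fleiss' kappa generalises Cohen's kappa to more than two raters; "sufficient agreement" thresholds vary by field, but running the same computation per variable makes the gold-standard decision in the last step auditable.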