Thinking in public.

Perspectives, methodology notes, and field observations from the Deaimer team. We write to sharpen our own thinking — and because the AI data industry is better when practitioners talk openly.

METHODOLOGY · Mar 18, 2026

Why calibration sessions matter more than gold sets.

The gold-set-only approach to rater quality gives an incomplete picture. Here's what a disciplined calibration cadence actually looks like.

Read post →
OPERATIONS · Mar 8, 2026

What 248 live workflows taught us about scheduling.

Lessons from running hundreds of parallel annotation workflows — and why the simplest scheduling rule almost always wins.

Read post →
RESEARCH · Feb 25, 2026

Rater drift is real. Here's how we catch it.

How we monitor rater agreement over time, what we do when a senior rater starts drifting, and the dashboards that flag it in real time.

Read post →
ETHICS · Feb 12, 2026

The ethics refusal clause — and why we use it.

What our MSA refusal language actually covers, how we make judgment calls, and why this isn't just corporate positioning.

Read post →
INDUSTRY · Jan 30, 2026

What's actually changing in data operations in 2026.

Three patterns we're seeing across our enterprise and frontier-lab engagements heading into the year.

Read post →
METHODOLOGY · Jan 15, 2026

How to write a useful annotation rubric.

Rubric design is one of the highest-leverage activities in data operations. Here's what separates great rubrics from frustrating ones.

Read post →
OPERATIONS · Dec 28, 2025

What we got wrong in 2025.

An honest inventory of the mistakes we made, the lessons we took, and what we're doing differently going into 2026.

Read post →
RESEARCH · Dec 15, 2025

Why benchmark saturation isn't the whole story.

Public benchmarks are saturating. That doesn't mean evaluation is solved — it means evaluation work is just beginning.

Read post →
ETHICS · Dec 1, 2025

Why we published our labor audit — findings and all.

Transparency is a process, not a PR exercise. What we found in our own operation, what we're changing, and why we shared it publicly.

Read post →
INDUSTRY · Nov 18, 2025

The consolidation nobody is talking about.

The AI data industry is consolidating — but not in the direction most analysts expect. Here's what we're seeing on the ground.

Read post →
METHODOLOGY · Nov 2, 2025

Evaluation design for agentic systems.

Agentic evaluation is different: multi-step, tool-using, stateful. What we've learned from designing evals for these systems.

Read post →
OPERATIONS · Oct 20, 2025

How we onboard a new specialist panel in five days.

The operational playbook for ramping up a specialized annotator team — whether it's radiologists, attorneys, or CFAs.

Read post →
LET'S BUILD

Let's make your AI better together.

Tell us what you're training, aligning, or evaluating. We'll map a delivery plan, staffing model, and timeline within one working week.