Claira Help Desk

Objective Coding


Use Claira to extract and standardize factual document metadata like dates, titles, authors, and document types in Nuix Discover.


Objective coding is the process of adding or correcting factual metadata on documents -- things like the document date, title, author, document type, and language. Unlike legal coding (which involves judgment calls about relevance or privilege), objective coding focuses on verifiable information that can be read directly from the document.

This matters because metadata drives everything downstream: search filters, sort orders, chronologies, and production sets all depend on accurate, consistent metadata. When that metadata is missing or wrong, your review suffers.

When to use objective coding

  • Missing metadata. Documents were loaded without key fields like date or author.
  • Incomplete imports. Metadata extraction during processing captured some fields but missed others.
  • Large volumes needing standardization. Thousands of documents have metadata in inconsistent formats.
  • Chronologies needed. You need reliable dates to build a timeline, but the existing date field is unreliable.
Objective Coding                                    Legal Coding
Factual, verifiable                                 Requires judgment
Document Date, Title, Author, Doc Type, Language    Relevance, Privilege, Issue Tags
Can be automated with high confidence               Requires human review
Same answer regardless of reviewer                  May vary by reviewer
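The "inconsistent formats" problem above is what makes dates a prime objective-coding target. As a minimal sketch (the list of input formats is illustrative, not exhaustive), normalizing mixed date styles into the MM/DD/YYYY convention used later in this article might look like:

```python
from datetime import datetime

# Candidate input formats seen in inconsistent metadata (illustrative list).
FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%d %B %Y", "%B %d, %Y"]

def normalize_date(raw, fallback="--"):
    """Try each known format; return MM/DD/YYYY or the fallback value."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    return fallback

print(normalize_date("2023-07-04"))    # 07/04/2023
print(normalize_date("4 July 2023"))   # 07/04/2023
print(normalize_date("undated memo"))  # --
```

Anything that cannot be parsed falls through to the fallback value, mirroring the "--" convention recommended in the prompt examples below.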

The workflow

Objective coding in Claira follows a seven-step process. Each step builds on the previous one.

Step 1: Open your Claira workspace

Navigate to your case in Claira and open the workspace where you will run the objective coding scan.

Claira > Cases > Your Case > Workspace

Step 2: Confirm text extraction

Before Claira can analyze a document, it needs extracted text to work with. Verify that your documents have been processed and that text extraction completed successfully.

Claira analyzes extracted text only. If a document has no extracted text -- for example, because it is an image-only PDF that was not OCR'd -- Claira cannot read it and will return fallback values. Check your processing logs before running a bulk scan.
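One way to pre-flight this check, assuming your document records expose an extracted-text field (the `id` and `extracted_text` keys here are hypothetical, not a Claira or Nuix Discover API):

```python
def needs_ocr(docs):
    """Return IDs of documents whose extracted text is missing or empty,
    so they can be OCR'd (or scanned in Image mode) before a bulk run."""
    return [d["id"] for d in docs if not d.get("extracted_text", "").strip()]

docs = [
    {"id": "DOC-001", "extracted_text": "Quarterly report ..."},
    {"id": "DOC-002", "extracted_text": ""},   # image-only PDF, never OCR'd
    {"id": "DOC-003"},                         # text field never populated
]
print(needs_ocr(docs))  # ['DOC-002', 'DOC-003']
```

Running a check like this before the bulk scan avoids paying for scans that can only return fallback values.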

Step 3: Create or connect destination fields

You need fields in Nuix Discover to store Claira's output. We recommend creating dedicated AI fields rather than writing directly to your primary metadata fields. This gives you a chance to QC the results before committing them.

Example fields:

Field          Type    Description
OC Date        Date    AI-extracted document date
OC Title       Text    AI-extracted document title
OC Author      Text    AI-extracted author name
OC Doc Type    Text    AI-extracted document type
OC Language    Text    AI-detected language

Connect these fields in Claira's workspace settings so scan results are written to the right place.

Step 4: Build and test your prompt on a single document

Start with one document to make sure your prompt produces the output you expect. Open a document in the single review view, enter your prompt, and check the result.

Objective Coding Prompt -- Date

Identify the primary document date. If the document contains multiple dates, choose the most recent. Respond with ONLY the date in MM/DD/YYYY format. If no clear date can be determined, respond with --

This prompt is intentionally strict: it asks for a single date in a specific format, with a clear fallback value (--) when no date can be determined. That structure makes QC and downstream processing much easier.

Build one prompt per field. A single prompt that tries to extract date, title, author, and doc type all at once is harder to test, harder to QC, and more likely to produce inconsistent results.
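A strict prompt like the one above also makes results machine-checkable. As a sketch, a validator for the date prompt's contract (exactly MM/DD/YYYY or the fallback) could be:

```python
import re

# Matches exactly MM/DD/YYYY: months 01-12, days 01-31, four-digit year.
DATE_RE = re.compile(r"^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}$")
FALLBACK = "--"

def is_valid_output(value):
    """A strict prompt should yield exactly MM/DD/YYYY or the fallback;
    anything else (prose, extra words, odd formats) is a QC flag."""
    return value == FALLBACK or bool(DATE_RE.match(value))

print(is_valid_output("03/15/2021"))               # True
print(is_valid_output("--"))                       # True
print(is_valid_output("The date is 03/15/2021."))  # False: prose leaked in
```

If a test document fails this check, tighten the prompt before moving on to a bulk scan.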

Step 5: Run a bulk scan

Once you are satisfied with your prompt on individual documents, run it across your full document set (or a targeted subset) using Claira's bulk scan feature.

Workspace > Bulk Scan > Configure > Run

Monitor the scan progress in the workspace. Larger document sets will take longer, but you can continue working while the scan runs.

Step 6: Quality control

This is the most important step. Do not skip it.

  • Filter for fallback values. Search your destination field for the fallback value (e.g., --) to find documents where Claira could not extract the metadata. Review these manually.
  • Spot-check results. Open 20-30 documents across different document types and compare Claira's output to the actual document content.
  • Validate edge cases. Pay special attention to documents with unusual formats, multiple dates, or ambiguous authorship.
If more than 10-15% of documents returned the fallback value, your prompt may need refinement, or the underlying text extraction quality may be an issue. Investigate before proceeding.
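The fallback-rate check is simple arithmetic. A sketch, using illustrative scan results:

```python
def fallback_rate(values, fallback="--"):
    """Fraction of scanned documents that returned the fallback value."""
    return sum(v == fallback for v in values) / len(values)

# Illustrative results from a small scan: 3 of 10 documents fell back.
results = ["03/15/2021", "--", "07/04/2019", "--", "01/02/2020",
           "11/30/2018", "--", "05/05/2022", "09/09/2021", "12/12/2020"]

rate = fallback_rate(results)
print(f"{rate:.0%}")  # 30%
if rate > 0.15:
    print("Above the 10-15% threshold: review prompt and text extraction.")
```

At 30%, this example set would warrant investigation before promoting any values to primary fields.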

Step 7: Update primary metadata fields

Once you have QC'd the results and are confident in their accuracy, copy the values from your AI fields (e.g., OC Date) into your primary metadata fields (e.g., Document Date). This can be done through Nuix Discover's bulk field update tools.

This step is intentionally separate from the scan. Writing directly to primary metadata fields during a scan means any errors become part of your production-ready data with no easy way to roll back. Always QC first.
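The promotion logic itself is straightforward. As a minimal sketch, assuming results have been exported as rows of field/value pairs (the field names and data shape are hypothetical; in practice you would use Nuix Discover's bulk field update tools):

```python
FALLBACK = "--"

def promote_field(rows, ai_field="OC Date", primary_field="Document Date"):
    """Copy QC'd AI values into the primary field, skipping fallbacks so
    an existing primary value is never overwritten with '--'."""
    for row in rows:
        value = row.get(ai_field, FALLBACK)
        if value != FALLBACK:
            row[primary_field] = value
    return rows

rows = [
    {"OC Date": "03/15/2021", "Document Date": ""},           # gets promoted
    {"OC Date": "--",         "Document Date": "01/01/1999"}, # left alone
]
promote_field(rows)
```

Skipping fallbacks is the key design choice: a "--" should signal "needs manual review", never clobber a primary field.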

Multi-Code mode (single pass, multiple fields)

Use Multi-Code when one scan must populate multiple outputs (for example: date, title, author, and document type).

  • Configure up to 8 fields in the dedicated Multi-Code section.
  • Click Reset fields (top right of the Multi-Code card) to clear the shared instruction, field prompts, and destination field selections, and restore 2 empty rows.
  • In Prompt Lab, when Claira detects a multi-output prompt, click Use Multi-Code to convert it automatically:
    • shared preamble text is moved into the Multi-Code shared instruction,
    • each detected output request is split into a separate field prompt (2-8 parts),
    • existing Multi-Code rows are cleared and rebuilt from the converted parts.
  • Use the Insert into selector to set the default target for inserted content -- the shared instruction or a field prompt row (all rows are listed, including those still being configured).
  • Open View History or Quickstart (templates or prompt generator): Claira asks where to put the content -- the shared instruction, any existing field row, or New field (until you reach 8 rows). If that destination already has text, choose append or replace next.
  • In bulk scans, token usage is charged per document using the same rules as other bulk scans, with Multi-Code costs depending on the Scan as dropdown selection:
    • Text + Multi-Code: 1 token per configured (active) field (for example, 3 fields = 3 tokens per document; 8 fields = 8 tokens per document).
    • Image + Multi-Code: 5 tokens for the first field, plus 1 token for each additional field (3 fields = 7 tokens; 8 fields = 12 tokens per document).
    • Audio + Multi-Code: 10 tokens for the first field, plus 1 token for each additional field (3 fields = 12 tokens; 8 fields = 17 tokens per document).
    • Video + Multi-Code: 20 tokens for the first field, plus 1 token for each additional field (3 fields = 22 tokens; 8 fields = 27 tokens per document).
    • Image, audio, and video modes all require Pro plan or higher. See Media scans for accepted file formats per mode.
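The per-document pricing above reduces to a base cost for the first field plus one token per additional field. A sketch of the arithmetic:

```python
# Per-document token cost for Multi-Code bulk scans, per the pricing above:
# a base cost for the first field by mode, plus 1 token per additional field.
BASE = {"text": 1, "image": 5, "audio": 10, "video": 20}

def multicode_tokens(mode, fields):
    """Tokens charged per document for a Multi-Code scan (1-8 fields)."""
    if not 1 <= fields <= 8:
        raise ValueError("Multi-Code supports 1-8 configured fields")
    return BASE[mode] + (fields - 1)

print(multicode_tokens("text", 3))   # 3
print(multicode_tokens("image", 8))  # 12
print(multicode_tokens("audio", 3))  # 12
print(multicode_tokens("video", 3))  # 22
```

Multiply by document count to estimate total scan cost before committing to a mode.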

Best practices

  • Be specific in your prompts. "What is the date?" is too vague. "Identify the primary document date. Respond ONLY MM/DD/YYYY." is clear and testable.
  • Group similar document types. If your collection includes contracts, emails, and memos, consider running separate prompts tuned to each type. Contracts have "Effective Date" while emails have "Sent Date" -- one prompt may not handle both well.
  • Always QC before committing. The AI fields exist to give you a safety net. Use it.
  • Use fallback values. A clear fallback like -- or N/A is far better than a blank field. It tells you Claira tried and could not find the answer, which is different from Claira not having processed the document at all.

Limitations

  • Text mode reads extracted text only. In Text mode, Claira cannot read images, handwritten notes, audio, video, or content embedded in non-text formats unless they have been OCR'd or transcribed. For those documents, switch the Scan as dropdown to Image, Audio, or Video so the source file is sent directly to a multimodal model.
  • OCR quality matters in Text mode. If the OCR is poor (garbled text, missing characters), Claira's output will reflect that. Garbage in, garbage out -- switching to Image mode is often the right fix.
  • Documents are analyzed individually. Claira does not cross-reference between documents. If the author is only named in a cover email but not in the attached report, the report will not inherit that author.
  • Complex cases need human judgment. A document with five plausible dates requires a reviewer to decide which one is "primary." Claira will follow your prompt's instructions, but those instructions may not cover every edge case.

Need help? Contact support@claira.to
