Claira Help Desk

Comprehensive PII (surface forms)

Use this template for repeatable PII inventories in Claira: one comma-separated line of quoted values, each surface form exactly as written in the document.

When you select this template, Claira displays the prompt configuration panel. Map the output to a Text review field in your case (this output is not suitable for Date or single-choice fields). Restrict field visibility and follow your organization's data-handling policy when storing raw PII.

PII surface forms | Text | claira-pii-surface-01

What this prompt is for

This template lists every distinct written form of personally identifiable information (PII) without normalization, in a single machine-friendly line. The format works well as input for Search Term Families in Nuix Discover: each quoted value (or a group you treat as one family) can drive grouped search or QC across the collection.

Step-by-step in Claira

  1. Open Prompting > Template picker in your case.
  2. Under Investigations, select Comprehensive PII (surface forms) and map the output to one or more Text review fields.
  3. Run a 10–25 document sample first; confirm quoting, commas, and the NO PII DETECTED sentinel behave as you expect.
  4. Adjust field permissions and workflow settings outside the prompt body, then extend to bulk review.
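
The checks in step 3 can be scripted for the sample run. The following is a minimal sketch, not part of Claira: `check_output` is a hypothetical helper, and it assumes that no extracted value contains a double quotation mark or a line break, and that optional spaces after a comma are acceptable.

```python
import re

SENTINEL = "NO PII DETECTED"

# One or more "quoted" values separated by commas (optional spaces after a comma).
# Assumption: values never contain a double quote or a line break.
LINE_PATTERN = re.compile(r'"[^"\n]*"(?:,[ \t]*"[^"\n]*")*')
VALUE_PATTERN = re.compile(r'"([^"\n]*)"')


def check_output(text: str) -> list[str]:
    """Return a list of format problems; an empty list means the output looks valid."""
    problems = []
    stripped = text.strip()
    if "\n" in stripped:
        problems.append("contains line breaks")
    if not LINE_PATTERN.fullmatch(stripped):
        problems.append("not a single comma-separated list of quoted values")
    values = VALUE_PATTERN.findall(stripped)
    if len(values) != len(set(values)):
        problems.append("contains duplicate entries")
    if SENTINEL in values and len(values) > 1:
        problems.append("sentinel mixed with other values")
    return problems
```

Running this over the 10–25 sample outputs and reviewing any non-empty result gives a quick pass/fail view before bulk review.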

The prompt

Extract every unique instance of Personally Identifiable Information (PII) from this document. PII includes, but is not limited to: full names, partial names, initials, nicknames, titles with names (e.g., Mr. Doe, Dr. Smith), email addresses, phone numbers, physical addresses, dates of birth, government-issued identifiers (SSN, SIN, passport numbers, driver's license numbers), financial account numbers, medical record numbers, IP addresses, and usernames. Do not include dates other than dates of birth, and do not include URLs unless they are part of an email address.
Follow these rules exactly:

- Output every unique surface form of each PII instance exactly as it appears in the document. If the same person, address, or other entity is written in multiple formats, include each distinct format as a separate entry (e.g., "John Doe", "J. Doe", "Mr. Doe", "Doe, John").
- Do not include duplicates of the exact same formatted string. Each entry in the output must be unique character-for-character.
- Do not normalize, correct, reformat, expand, or abbreviate any value. Preserve original casing, punctuation, spacing, and spelling, including apparent typos.
- For addresses, include each distinct written format separately (e.g., "123 Main St.", "123 Main Street", "123 Main St, Montreal, QC").
- For phone numbers, include each distinct written format separately (e.g., "514-555-1234", "(514) 555-1234", "+1 514 555 1234").
- Wrap each value in straight double quotation marks. Separate entries with a comma. Do not use line breaks, bullets, numbering, curly braces, square brackets, or any other wrapper or delimiter.
- Do not include any introductory text, explanatory text, headings, labels, categories, counts, trailing commentary, or closing remarks. The response must begin with the first quotation mark and end with the final quotation mark.
If no PII is found, output exactly: "NO PII DETECTED"

  • Add case-specific scope in workflow instructions (custodians, date bounds, document types) without changing the output rules above.
  • Keep downstream tooling expectations explicit (e.g., how you split values into Search Term Families).
  • Version your configuration so bulk runs stay reproducible.
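
One tool-agnostic way to support the versioning point above is to record a short hash of the exact prompt text alongside each bulk run, so any later change to the prompt is detectable. This is only a sketch: `prompt_fingerprint` is a hypothetical helper, and nothing here describes a built-in Claira feature.

```python
import hashlib


def prompt_fingerprint(prompt_text: str) -> str:
    """Return a short, stable identifier for the exact prompt text."""
    # Normalize line endings and outer whitespace so copy/paste differences
    # between systems do not change the fingerprint.
    normalized = prompt_text.replace("\r\n", "\n").strip()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest[:12]  # 12 hex characters is enough to tell versions apart
```

Storing the fingerprint with the run date and field mapping makes it possible to confirm later which prompt wording produced a given set of results.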

Worked example

Input excerpt

Please contact John Doe at john@acme.co or J. Doe at the office via (514) 555-1234. Same person also listed as "Doe, John" on the fax cover.

Expected output shape

"John Doe","J. Doe","john@acme.co","(514) 555-1234","Doe, John"
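
Downstream tooling has to split this line back into values without breaking on commas inside quotes, as in "Doe, John". A minimal sketch using Python's standard `csv` module follows; `parse_surface_forms` is a hypothetical helper, and it assumes that values contain no embedded double quotation marks.

```python
import csv

SENTINEL = "NO PII DETECTED"


def parse_surface_forms(line: str) -> list[str]:
    """Split one line of quoted, comma-separated values into a list of strings."""
    line = line.strip()
    if not line:
        return []
    # csv.reader keeps commas that sit inside double quotes as part of the value.
    rows = list(csv.reader([line], skipinitialspace=True))
    values = rows[0] if rows else []
    # Treat the sentinel as "no values" rather than as a PII entry.
    if values == [SENTINEL]:
        return []
    return values


output = '"John Doe","J. Doe","john@acme.co","(514) 555-1234","Doe, John"'
print(parse_surface_forms(output))
# ['John Doe', 'J. Doe', 'john@acme.co', '(514) 555-1234', 'Doe, John']
```

A naive `output.split(",")` would cut "Doe, John" into two fragments, which is why a quote-aware parser is used here.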

Troubleshooting

  • If the model adds labels or line breaks, restate in workflow instructions that the answer must be a single comma-separated quoted list only.
  • If OCR quality is poor, or images were never run through OCR, PII in those pages may be missed. Verify the extracted text before relying on results.
  • If output is too noisy, run this template on a sample first, then narrow scope with a more targeted prompt from the PII Identification workflow.
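
Because every value is supposed to be a surface form exactly as written, one simple QC check is to confirm that each value occurs character-for-character in the document's extracted text. The sketch below assumes the extracted text is available as a plain string; `values_not_in_text` is a hypothetical helper, not a Claira function.

```python
def values_not_in_text(values: list[str], extracted_text: str) -> list[str]:
    """Return the values that do not occur character-for-character in the text."""
    return [value for value in values if value not in extracted_text]


excerpt = (
    "Please contact John Doe at john@acme.co or J. Doe at the office via "
    '(514) 555-1234. Same person also listed as "Doe, John" on the fax cover.'
)
values = ["John Doe", "J. Doe", "john@acme.co", "(514) 555-1234", "Doe, John", "514-555-1234"]
print(values_not_in_text(values, excerpt))  # ['514-555-1234']
```

A flagged value may mean the model normalized or invented it, but it can also come from line wrapping or OCR differences in the extracted text, so treat hits as items to review rather than as confirmed errors.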
