> ## Documentation Index
> Fetch the complete documentation index at: https://docs.getaftercare.com/llms.txt
> Use this file to discover all available pages before exploring further.

# AI Data Quality Analysis

> Evaluate the quality of your survey responses with AI

<Tip>
  Looking for the [API reference](/api-reference/data-quality/evaluate)?
</Tip>

## Introduction

Aftercare evaluates the quality of your responses in the context of the response itself and compared to all other survey responses. Not all
quality checks are built the same, so we've built a few different metrics to give you a holistic view of your responses.

<Note type="warning">
  The `qualityScore` field is deprecated and will be removed in a future
  release. Please use `demeritScore` instead. The `demeritScore` provides the
  same information but with a more intuitive scale - higher scores indicate
  lower quality responses.
</Note>

## Components

Aftercare AI breaks down the quality of your responses into a few different metrics:

### Respondent level metrics

* `Inconsistency`: how inconsistent a respondent is across multiple responses. Higher scores indicate more contradictions and inconsistencies in the responses.

### Response level metrics

* `Nonsense`: how coherent and logical the response itself is. (ie, gibberish, troll responses, etc.)
* `Relevance`: how pertinent the response is to the question.
* `Low-effort`: how much effort was put into the response. (length, specific details, etc.)
* `Llm-generated`: how likely the response is to be AI or LLM generated.
* `Self-duplication`: how similar a particular answer is to previous answers in the same survey response.
* `Shared-duplication`: how similar the response is to other responses for a particular question across respondents. Duplicates can be found in the data quality console in the platform.
  <Warning>Shared duplicate scores will only be calculated if a response identifier is provided. See [Shared duplicate detection](#shared-duplicate-detection) for more information.</Warning>
* `Keystroke-issues`: a higher score indicates suspicious non-human keystroke patterns. See [Keystroke event collection](#keystroke-event-collection) for more information.
* `Honeypot`: a higher score indicates that the respondant has likely fallen for the honeypot phrase. See [Honeypot detection](#honeypot-detection) for more information.

Aftercare will generate a confidence score for each of these metrics along with an overall demerit score from 0-100 you can use to compare the quality of your responses. The higher the demerit score, the lower the quality of the data.
Aftercare will let you know which (if any) of these quality metrics were violated and how many violations were found in total.

## Quality Detection Modes

In addition to checking for specific quality issues, Aftercare provides three detection modes that group related issues together for common use cases:

<Card title="Responsiveness" icon="reply">
  Focuses on identifying poor-quality responses by checking for issues like
  nonsensical content, irrelevant answers, and low-effort responses.

  <Tip>
    Use this mode when you want to quickly identify respondents who aren't
    engaging meaningfully with your survey.
  </Tip>
</Card>

<Card title="Authenticity" icon="shield-check">
  Concentrates on detecting potentially fraudulent responses by checking for
  issues like LLM-generated content and duplicated answers across respondents.

  <Tip>
    Use this mode when validating that responses are genuine and not
    artificially generated.
  </Tip>
</Card>

<Card title="Composite" icon="list-check">
  Performs a comprehensive evaluation by checking for all available quality
  issues. This is the most thorough analysis and is the default mode.

  <Tip>
    Use this mode when you want a comprehensive check of all quality factors
    across your dataset.
  </Tip>
</Card>

<Note>
  If you specify both a detection mode and a list of specific quality issues in
  your request, Aftercare will prioritize the detection mode.
</Note>

## Keystroke event collection

Collect real-time keystroke events from a respondent's browser and use them to detect bot activity. Aftercare will compute the following metrics:

* `total keystrokes`: the total number of keystrokes in the response
* `average hold time`: the average time a key is held down for
* `hold time standard deviation`: the standard deviation of the hold time of each key
* `average digraph delay`: the average time between digraphs (two-character combinations)
* `digraph delay standard deviation`: the standard deviation of the time between digraphs
* `digraph delay p90`: the 90th percentile of the time between digraphs
* `pause ratio`: the ratio of pause to keystroke events

From these metrics, Aftercare will compute a `keystroke-issues` score from 0-100. See these metrics in the data quality console in the platform.

To get started, see the [Events API](/integrations/events) documentation.

## Shared duplicate detection

Shared duplicate detection compares responses across respondents to detect copy pasted responses. Longer responses are weighted more heavily than shorter responses.

Shared duplicates will only compare across responses that have a response identifier. If your respondants are anonymous, you can generate a random uuid for each response and pass it in the `responseIdentifier` field.
It helps to prepend `random_` to the response identifier to make it easier to identify.

Already generated responses and want to retroactively add response identifiers? Reach out to [support@getaftercare.com](mailto:support@getaftercare.com) and we'll be happy to help. We're adding a feature to allow you to add response identifiers to existing responses in the platform soon.

## Honeypot detection

With the increasing sophistication of LLM powered bots, it is becoming more difficult to detect them using simple heuristics.

A honeypot is a hidden phrase that an LLM reading the page will see but a human will not. We can use a honeypot phrase to poison the LLMs context
and detect if a respondent has fallen for the honeypot phrase.

Pass a `honeypot phrase` to the data quality API and Aftercare will detect if a respondent has fallen for the honeypot phrase. See the [Data Quality API](/api-reference/data-quality/evaluate) documentation for more information.

## Inconsistency detection

Inconsistency scores are computed by comparing a respondent's multiple responses to each other. The API response will include the inconsistency score and whether it is flagged or not.

To see specific responses that are flagged as inconsistent and the specific rationales, see the [Data Quality Console](/data-quality-console) in the platform.

## Tips for improving your response quality

While you can use the data quality API to evaluate each response individually, providing a `survey ID`, `question ID`, and `response ID` will help tie together responses across respondents.
Doing so will provide a more accurate assessment of the quality of your responses.
