Source Citation v2

Introduction

When a Wikit Semantics APP answers a question, it can cite its sources as numbered references (e.g., [1], [2], [10]). This option must be enabled in the prompt. This API lets you retrieve the full metadata of these sources so that you can display them in your user interface.

Benefits:

Trace the origin of the information provided
Ensure transparency for users
Allow verification by accessing source documents

Prerequisites

This documentation assumes that you have already:

Executed a query via the /query-executions endpoint
Retrieved the queryId from the response

For more information on executing queries, see the dedicated section.

Retrieve sources for a query

Endpoint

GET /semantics/apps/{llm_app_id}/query-execution-sources/{query_id}/get-quoted-sources

Request example

bash

curl -X GET "https://apis.wikit.ai/semantics/apps/$APP_ID/query-execution-sources/$QUERY_ID/get-quoted-sources" \
  -H "accept: application/json" \
  -H "Wikit-Semantics-API-Key: $API_KEY" \
  -H "X-Wikit-Organization-Id: $ORG_ID"

ℹ️ The API key (`Wikit-Semantics-API-Key` header) is only required for private APPs.

The endpoint returns a list of passages identified in the document fragments.
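As a rough illustration, the same request could be issued from Python using only the standard library. The helper names `build_sources_request` and `fetch_quoted_sources` are hypothetical, as is the `timeout` parameter; the host, path, and header names are copied from the curl example above. The sketch assumes that public APPs accept requests without the API key header, so the header is sent only when a key is provided.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://apis.wikit.ai"  # host taken from the curl example


def build_sources_request(app_id, query_id, org_id, api_key=None):
    """Return (url, headers) for the get-quoted-sources endpoint."""
    # Percent-encode the identifiers so they are safe as path segments.
    url = (
        f"{BASE_URL}/semantics/apps/{urllib.parse.quote(app_id, safe='')}"
        f"/query-execution-sources/{urllib.parse.quote(query_id, safe='')}"
        "/get-quoted-sources"
    )
    headers = {
        "accept": "application/json",
        "X-Wikit-Organization-Id": org_id,
    }
    if api_key:  # assumption: the key is only sent for private APPs
        headers["Wikit-Semantics-API-Key"] = api_key
    return url, headers


def fetch_quoted_sources(app_id, query_id, org_id, api_key=None, timeout=10):
    """Perform the GET request and return the decoded JSON array."""
    url, headers = build_sources_request(app_id, query_id, org_id, api_key)
    request = urllib.request.Request(url, headers=headers, method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the URL and header construction separate from the network call makes it possible to check the request without contacting the API.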

Response example

json

[
  {
    "quoted_as": "1",
    "document": {
      "id": "686e90dd2ff687c57806d107",
      "name": "Remote_Chart_01_08_2023.pdf",
      "title": null,
      "url": null,
      "public_link": false,
      "storage_name": "storage_link",
      "data_source_id": "686d402ce52bbbf7116f55f8",
      "organization_id": "686d3ceb18a3087351f0fa35"
    },
    "chunk": {
      "id": "686e90e725e88b90bdd50eb2",
      "data": "## content of the chunk...",
      "document_id": "686e90dd2ff687c57806d107",
      "data_source_id": "686d402ce52bbbf7116f55f8",
      "llm_app_id": null,
      "user_id": null,
      "flags": null,
      "position": 29,
      "page_start": null,
      "page_end": null
    },
    "data_source": {
      "id": "686d402ce52bbbf7116f55f8",
      "name": "Human Resources"
    },
    "score": 0.88380575,
    "retriever_type": "SEMANTIC_SEARCH"
  }
]

Response structure

The response is an array of source objects, each object containing:

Main fields

Field	Type	Description
`quoted_as`	string	Citation number in the response (e.g.: "1", "2", "10")
`document`	object	Source document metadata
`chunk`	object	Document excerpt used as a source
`data_source`	object	Information about the data source
`score`	float	Semantic relevance score (between 0 and 1)
`retriever_type`	string	Search type used (e.g.: "SEMANTIC_SEARCH")

`document` object

Field	Type	Description
`id`	string	Unique document identifier
`name`	string	Source file name
`title`	string/null	Document title (if available)
`url`	string/null	Document URL (if available)
`public_link`	boolean	Indicates if the document has a public link
`storage_name`	string	Internal storage path
`data_source_id`	string	Data source identifier
`organization_id`	string	Organization identifier

`chunk` object

Field	Type	Description
`id`	string	Unique chunk identifier
`data`	string	Source text excerpt (markdown content)
`document_id`	string	Reference to the parent document
`position`	integer	Position of the chunk in the document
`page_start`	integer/null	Start page (if applicable)
`page_end`	integer/null	End page (if applicable)

`data_source` object

Field	Type	Description
`id`	string	Data source identifier
`name`	string	Data source name (e.g.: "Human Resources")
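The structure described above could be flattened into a small typed object before it reaches the UI layer. This is only a sketch: the `QuotedSource` class and the `parse_quoted_source` function are hypothetical names, and the selection of fields is an arbitrary choice made for display purposes, not part of the API.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class QuotedSource:
    """Flattened view of one item of the response array (hypothetical type)."""
    quoted_as: str
    document_name: str
    document_title: Optional[str]
    document_url: Optional[str]
    data_source_name: str
    excerpt: str
    position: int
    page_start: Optional[int]
    page_end: Optional[int]
    score: float
    retriever_type: str


def parse_quoted_source(item: dict[str, Any]) -> QuotedSource:
    """Map one raw source object to a QuotedSource."""
    document = item["document"]
    chunk = item["chunk"]
    return QuotedSource(
        quoted_as=item["quoted_as"],
        document_name=document["name"],
        # title, url and page fields may be null in the response
        document_title=document.get("title"),
        document_url=document.get("url"),
        data_source_name=item["data_source"]["name"],
        excerpt=chunk["data"],
        position=chunk["position"],
        page_start=chunk.get("page_start"),
        page_end=chunk.get("page_end"),
        score=item["score"],
        retriever_type=item["retriever_type"],
    )
```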

Best practices

1. Call timing

Call the sources API only after the response has finished streaming.

2. Conditional display

Only display the sources section if the API returns at least one source.

3. URL management

Some sources do not have a URL (the `url` field is null). Handle this case in your UI.
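One possible way to handle a missing URL is to render a link only when `document.url` is present and fall back to plain text otherwise. The sketch below assumes an HTML-based UI; the function name `render_source_label` is hypothetical, and falling back to the file name when `title` is null is a design choice, not an API requirement.

```python
import html


def render_source_label(source):
    """Return an HTML snippet for one source, linked only if a URL exists."""
    document = source["document"]
    # Prefer the title when available, otherwise fall back to the file name.
    text = document.get("title") or document["name"]
    label = f"[{source['quoted_as']}] {html.escape(text)}"
    url = document.get("url")
    if url:
        return f'<a href="{html.escape(url, quote=True)}">{label}</a>'
    return label  # no URL: plain, non-clickable label
```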

4. Duplicate documents

The API may return the same document multiple times; handle this case according to your needs.
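If the UI should list each document only once, the entries could be grouped by `document.id`. This is a minimal sketch with a hypothetical function name; it keeps every citation number and chunk attached to the document, in the order the API returned them. Other strategies (for example, keeping only the highest-scoring entry) are equally valid.

```python
def group_sources_by_document(sources):
    """Merge entries that point to the same document, in first-seen order."""
    grouped = {}  # dicts preserve insertion order
    for source in sources:
        doc_id = source["document"]["id"]
        entry = grouped.setdefault(
            doc_id,
            {"document": source["document"], "quoted_as": [], "chunks": []},
        )
        entry["quoted_as"].append(source["quoted_as"])
        entry["chunks"].append(source["chunk"])
    return list(grouped.values())
```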

5. Citation markers

If you wish, you can hide or transform the citation markers ([1], [2], ...) when displaying the response.
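Both options can be sketched with a regular expression that matches the numbered markers. The function names and the `#source-N` anchor format are hypothetical; the second function assumes an HTML-based UI in which each displayed source has a matching anchor.

```python
import re

CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def strip_citations(answer):
    """Remove the [n] markers together with the whitespace before them."""
    return re.sub(r"\s*\[\d+\]", "", answer)


def link_citations(answer, sources):
    """Turn each [n] that matches a returned source into an in-page link."""
    known = {source["quoted_as"] for source in sources}

    def replace(match):
        number = match.group(1)
        if number in known:
            return f'<a href="#source-{number}">[{number}]</a>'
        return match.group(0)  # leave markers without a source untouched

    return CITATION_PATTERN.sub(replace, answer)
```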

6. Common error codes

Code	Meaning	Solution
401	Invalid API Key	Check the API key and Organization ID
404	Query not found	Check the queryId or wait a few seconds
500	Server error	Retry after a delay
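The table above suggests waiting and retrying on 404 and 500, while a 401 calls for fixing the credentials. A minimal retry loop along those lines might look like the following. All names are hypothetical, the attempt count and delay are arbitrary defaults, and the sketch assumes that a successful call returns HTTP 200. `fetch_once` stands for any callable that performs the request and returns a `(status, payload)` pair.

```python
import time

RETRYABLE_STATUS = {404, 500}  # per the table: wait, then try again


class SourcesRequestError(Exception):
    """Raised when the sources request fails and will not be retried."""

    def __init__(self, status):
        super().__init__(f"sources request failed with HTTP {status}")
        self.status = status


def get_sources_with_retry(fetch_once, max_attempts=3, delay_seconds=2.0,
                           sleep=time.sleep):
    """Call fetch_once() until it succeeds or the attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, payload = fetch_once()
        if status == 200:  # assumption: success is reported as HTTP 200
            return payload
        # 401 and other non-retryable codes fail immediately.
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            raise SourcesRequestError(status)
        sleep(delay_seconds)
```

The `sleep` parameter is injectable so that the loop can be exercised without real delays.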

Source Citation v1 (deprecated)

Retrieving source citations for a query

After executing a query, it is possible to extract the citations identified in the sources (i.e., document fragments) used by the App.

POST /semantics/apps/{llm_app_id}/query-executions/{query_execution_id}/citations

bash

curl -X POST "https://apis.wikit.ai/semantics/apps/$SEMANTICS_APP_ID/query-executions/$SEMANTICS_QUERY_EXECUTION_ID/citations" \
  -H "Authorization: Bearer $SEMANTICS_TOKEN" \
  -H "X-Wikit-Organization-Id: $SEMANTICS_ORG_ID" \
  -H "X-Wikit-Response-Format: json" \
  -H "Content-Type: application/json" \
  -d "{}"

The endpoint returns a list of passages identified in the document fragments.

Example:

json

[
    {
        "_id": "668e9bf39bd4493d2557a6ea",
        "created_at": "2024-07-10T14:34:26Z",
        "query_execution_id": "668e9bf09bd4493d2557a6e6",
        "total_count": 1,
        "reply_sentence": "The colors of Wikit's visual identity are: Sky Blue, Midnight Blue, and Violet.",
        "source_sentence": "The colors of Wikit's visual identity are: - Sky Blue; - Midnight Blue; - Violet.",
        "chunk_data_snapshot": " [...]  {{source_sentence}}  [...] ",
        "start_char_idx": 0,
        "end_char_idx": 89,
        "chunk_id": "668e9b94ca8d43490c14ede3",
        "chunk_page_start": 2,
        "chunk_page_end": 2,
        "chunk_position": null,
        "document_id": "668e9b94ca8d43490c14ede0",
        "document_name": "wikit-visual-identity.pdf",
        "document_title": null,
        "document_url": "https://semantics-files.wikit.ai/v1/viewer/NjVkNWIzNDRhMDUxZWEyOGI3ZjU0ZDc2LzY2OGU5YjgzY2E4ZDQzNDkwYzE0ZWRkZi8xNzIwNjIxOTcxLjkwNzU3N193aWtpdF9kb2N1bWVudF9zYW1wbGUucGRm",
        "organization_id": "65d5b344a051ea28b7f54d76",
        "threshold": 50
    }
]

In this example, the fragment associated with page 3 (see the chunk_page_start and chunk_page_end indexes, which start at 0) of the document "wikit-visual-identity.pdf" (see the document_name field) was identified by Wikit Semantics as a source.
