Source Citation v2
Introduction
When a Wikit Semantics APP answers a question, it can cite its sources as numbered references (e.g., [1], [2], [10]). This option must be enabled in the prompt. This API lets you retrieve the full metadata of these sources so that you can display them in your user interface.
Benefits:
- Trace the origin of the information provided
- Ensure transparency for users
- Allow verification by accessing source documents
Prerequisites
This documentation assumes that you have already:
- Executed a query via the /query-executions endpoint
- Retrieved the queryId from the response
For more information on executing queries, see the dedicated section.
Retrieve sources for a query
Endpoint
GET /semantics/apps/{llm_app_id}/query-execution-sources/{query_id}/get-quoted-sources
Request example
curl -X GET "https://apis.wikit.ai/semantics/apps/$APP_ID/query-execution-sources/$QUERY_ID/get-quoted-sources" \
-H "accept: application/json" \
-H "Wikit-Semantics-API-Key: $API_KEY" \
-H "X-Wikit-Organization-Id: $ORG_ID"
ℹ️ The API_KEY is only required for private APPs.
The endpoint returns a list of passages identified in the document fragments.
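The same request can be prepared from application code. The following is a minimal sketch using only the Python standard library; the function names `build_sources_request` and `fetch_quoted_sources`, the `BASE_URL` constant, and the 10-second timeout are illustrative assumptions, not part of the Wikit API.

```python
import json
import urllib.request
from typing import Optional

# Host taken from the curl example above.
BASE_URL = "https://apis.wikit.ai"


def build_sources_request(app_id: str, query_id: str, org_id: str,
                          api_key: Optional[str] = None) -> urllib.request.Request:
    """Build the GET request for the quoted sources of one query (hypothetical helper)."""
    url = (f"{BASE_URL}/semantics/apps/{app_id}"
           f"/query-execution-sources/{query_id}/get-quoted-sources")
    headers = {
        "accept": "application/json",
        "X-Wikit-Organization-Id": org_id,
    }
    # The API key is only required for private APPs, so it is optional here.
    if api_key:
        headers["Wikit-Semantics-API-Key"] = api_key
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_quoted_sources(request: urllib.request.Request, timeout: float = 10.0) -> list:
    """Send the request and decode the JSON array of sources (hypothetical helper)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from the network call keeps the URL and header logic easy to check without contacting the API.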
Response example
[
{
"quoted_as": "1",
"document": {
"id": "686e90dd2ff687c57806d107",
"name": "Remote_Chart_01_08_2023.pdf",
"title": null,
"url": null,
"public_link": false,
"storage_name": "storage_link",
"data_source_id": "686d402ce52bbbf7116f55f8",
"organization_id": "686d3ceb18a3087351f0fa35"
},
"chunk": {
"id": "686e90e725e88b90bdd50eb2",
"data": "## content of the chunk...",
"document_id": "686e90dd2ff687c57806d107",
"data_source_id": "686d402ce52bbbf7116f55f8",
"llm_app_id": null,
"user_id": null,
"flags": null,
"position": 29,
"page_start": null,
"page_end": null
},
"data_source": {
"id": "686d402ce52bbbf7116f55f8",
"name": "Human Resources"
},
"score": 0.88380575,
"retriever_type": "SEMANTIC_SEARCH"
}
]
Response structure
The response is an array of source objects, each object containing:
Main fields
| Field | Type | Description |
|---|---|---|
| quoted_as | string | Citation number in the response (e.g.: "1", "2", "10") |
| document | object | Source document metadata |
| chunk | object | Document excerpt used as a source |
| data_source | object | Information about the data source |
| score | float | Semantic relevance score (between 0 and 1) |
| retriever_type | string | Search type used (e.g.: "SEMANTIC_SEARCH") |
document object
| Field | Type | Description |
|---|---|---|
| id | string | Unique document identifier |
| name | string | Source file name |
| title | string/null | Document title (if available) |
| url | string/null | Document URL (if available) |
| public_link | boolean | Indicates if the document has a public link |
| storage_name | string | Internal storage path |
| data_source_id | string | Data source identifier |
| organization_id | string | Organization identifier |
chunk object
| Field | Type | Description |
|---|---|---|
| id | string | Unique chunk identifier |
| data | string | Source text excerpt (markdown content) |
| document_id | string | Reference to the parent document |
| position | integer | Position of the chunk in the document |
| page_start | integer/null | Start page (if applicable) |
| page_end | integer/null | End page (if applicable) |
data_source object
| Field | Type | Description |
|---|---|---|
| id | string | Data source identifier |
| name | string | Data source name (e.g.: "Human Resources") |
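To illustrate how these fields fit together, here is a small sketch that indexes the returned sources by citation number and builds a display label. The helper names `index_by_citation` and `display_label` are hypothetical, and the sample is a trimmed copy of the response example above.

```python
from typing import Any, Dict, List

Source = Dict[str, Any]


def index_by_citation(sources: List[Source]) -> Dict[str, Source]:
    """Map each citation number ("1", "2", ...) to its source object."""
    return {source["quoted_as"]: source for source in sources}


def display_label(source: Source) -> str:
    """Prefer the document title; fall back to the file name when title is null."""
    document = source["document"]
    label = document["title"] or document["name"]
    return f'[{source["quoted_as"]}] {label} ({source["data_source"]["name"]})'


# Trimmed version of the response example (JSON null becomes None in Python).
sample = [{
    "quoted_as": "1",
    "document": {"id": "686e90dd2ff687c57806d107",
                 "name": "Remote_Chart_01_08_2023.pdf",
                 "title": None, "url": None},
    "chunk": {"id": "686e90e725e88b90bdd50eb2", "position": 29},
    "data_source": {"id": "686d402ce52bbbf7116f55f8", "name": "Human Resources"},
    "score": 0.88380575,
    "retriever_type": "SEMANTIC_SEARCH",
}]

by_number = index_by_citation(sample)
print(display_label(by_number["1"]))
# prints: [1] Remote_Chart_01_08_2023.pdf (Human Resources)
```

Keying on quoted_as makes it straightforward to look up the source behind a given [n] marker in the answer text.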
Best practices
1. Call timing
Call the sources API only after the response stream has finished.
2. Conditional display
Only display the sources section when the API returns at least one source.
3. URL management
Some sources do not have a URL. Handle this case in your UI.
4. Duplicate documents
The API may return the same document multiple times; handle this case according to your needs.
5. Citation markers
If you wish, you can hide or transform the [1], [2], ... markers when displaying the response.
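Practices 2 to 5 could be sketched as follows. This is one possible approach, not a prescribed one: the helper names, the "no link available" fallback text, and the choice to merge duplicates by document id are all assumptions made for illustration.

```python
import re
from typing import Any, Dict, List

# Matches markers such as [1] or [10], plus any whitespace just before them.
CITATION_MARKER = re.compile(r"\s*\[(\d+)\]")


def group_by_document(sources: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge entries that point at the same document, keeping every citation number."""
    grouped: Dict[str, Dict[str, Any]] = {}
    for source in sources:
        doc = source["document"]
        entry = grouped.setdefault(
            doc["id"], {"name": doc["name"], "url": doc["url"], "citations": []}
        )
        entry["citations"].append(source["quoted_as"])
    return list(grouped.values())


def render_sources(sources: List[Dict[str, Any]]) -> str:
    """Return an empty string when there is nothing to show (practice 2)."""
    if not sources:
        return ""
    lines = ["Sources:"]
    for entry in group_by_document(sources):
        numbers = ", ".join(f"[{n}]" for n in entry["citations"])
        # Some sources have no URL (practice 3): show a fallback instead of a link.
        target = entry["url"] or "no link available"
        lines.append(f"{numbers} {entry['name']} - {target}")
    return "\n".join(lines)


def strip_markers(answer: str) -> str:
    """Remove [1], [2], ... markers from the answer text (practice 5)."""
    return CITATION_MARKER.sub("", answer)
```

For example, `strip_markers("Remote work is allowed [1][2].")` returns `"Remote work is allowed."`. Replacing the markers with links instead of removing them would follow the same pattern, using a replacement function in `CITATION_MARKER.sub`.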
6. Common error codes
| Code | Meaning | Solution |
|---|---|---|
| 401 | Invalid API Key | Check the API key and Organization ID |
| 404 | Query not found | Check the queryId, or wait a few seconds and retry |
| 500 | Server error | Retry after a delay |
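The table above suggests a simple retry policy: 404 and 500 may succeed on a later attempt, while 401 will not. The sketch below assumes a caller-supplied `fetch` function returning a `(status_code, body)` pair; the function name, the three attempts, and the 2-second delay are illustrative choices, not values documented by the API.

```python
import time
from typing import Any, Callable, Tuple

# Status codes from the table above that are worth retrying.
RETRYABLE = {404, 500}


def get_sources_with_retry(fetch: Callable[[], Tuple[int, Any]],
                           attempts: int = 3,
                           delay_seconds: float = 2.0,
                           sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call fetch() up to `attempts` times, waiting between retryable failures."""
    status, body = fetch()
    for _ in range(attempts - 1):
        if status not in RETRYABLE:
            break
        sleep(delay_seconds)
        status, body = fetch()
    if status == 200:
        return body
    if status == 401:
        # Retrying cannot fix bad credentials.
        raise PermissionError("401: check the API key and Organization ID")
    raise RuntimeError(f"sources request failed with status {status}")
```

Passing `sleep` as a parameter lets tests replace the real delay with a no-op.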
Source Citation v1 (deprecated)
Retrieving source citations for a query
After executing a query, it is possible to extract the citations identified in the sources (i.e., document fragments) used by the LLM app.
POST /semantics/apps/{llm_app_id}/query-executions/{query_execution_id}/citations
curl -X POST "https://apis.wikit.ai/semantics/apps/$SEMANTICS_APP_ID/query-executions/$SEMANTICS_QUERY_EXECUTION_ID/citations" \
-H "Authorization: Bearer $SEMANTICS_TOKEN" \
-H "X-Wikit-Organization-Id: $SEMANTICS_ORG_ID" \
-H "X-Wikit-Response-Format: json" \
-H "Content-Type: application/json" \
-d "{}"
The endpoint returns a list of passages identified in the document fragments.
Example:
[
{
"_id": "668e9bf39bd4493d2557a6ea",
"created_at": "2024-07-10T14:34:26Z",
"query_execution_id": "668e9bf09bd4493d2557a6e6",
"total_count": 1,
"reply_sentence": "The colors of Wikit's visual identity are: Sky Blue, Midnight Blue, and Violet.",
"source_sentence": "The colors of Wikit's visual identity are: - Sky Blue; - Midnight Blue; - Violet.",
"chunk_data_snapshot": " [...] {{source_sentence}} [...] ",
"start_char_idx": 0,
"end_char_idx": 89,
"chunk_id": "668e9b94ca8d43490c14ede3",
"chunk_page_start": 2,
"chunk_page_end": 2,
"chunk_position": null,
"document_id": "668e9b94ca8d43490c14ede0",
"document_name": "wikit-visual-identity.pdf",
"document_title": null,
"document_url": "https://semantics-files.wikit.ai/v1/viewer/NjVkNWIzNDRhMDUxZWEyOGI3ZjU0ZDc2LzY2OGU5YjgzY2E4ZDQzNDkwYzE0ZWRkZi8xNzIwNjIxOTcxLjkwNzU3N193aWtpdF9kb2N1bWVudF9zYW1wbGUucGRm",
"organization_id": "65d5b344a051ea28b7f54d76",
"threshold": 50
}
]
In this example, Wikit Semantics identified the fragment on page 3 of the document "wikit-visual-identity.pdf" (see the document_name field) as a source. The page comes from the chunk_page_start and chunk_page_end fields, which are zero-based indexes, so the value 2 corresponds to page 3.