LLM App Query Execution

The LLM App is the central concept of the solution. It is responsible for orchestrating Large Language Models (LLMs) and related tools (for example, semantic search of document fragments).

Question-Answer type LLM Apps have Data Sources to extract answer elements for users.

Authentication and Authorization

An API key is required to invoke the query execution API.

This key must be sent in the Wikit-Semantics-API-Key HTTP header.

It is provided by Wikit.

Query Execution

POST /semantics/apps/{llm_app_id}/query-executions

bash
curl "https://apis.wikit.ai/semantics/apps/$SEMANTICS_APP_ID/query-executions" \
-H "Content-Type: application/json" \
-H "Wikit-Semantics-API-Key: $SEMANTICS_API_KEY" \
-H "X-Wikit-Response-Format: json" \
-H "X-Wikit-Organization-Id: $SEMANTICS_ORG_ID" \
-d '{
  "query": "How many days of special leave am I entitled to for my wedding?"
}'

Response

json
{
    "answer": "In France, you are entitled to 4 days of special leave for your marriage or PACS, according to the Labor Code (article L3142-1).",
    "queryId": "67c5646e05d2ed84ac20495d",
    "conversation_id": "67c5646e05d2ed84ac20495c",
    "metadata": null
}
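The same call can be made from any HTTP client. Below is a minimal Python sketch using the third-party `requests` library; the endpoint and headers come from the curl example above, while the function name and structure are illustrative.

```python
# Minimal sketch of the query-execution call, mirroring the curl example.
import os

BASE_URL = "https://apis.wikit.ai/semantics"

def build_request(app_id: str, query: str, org_id: str, api_key: str):
    """Return the (url, headers, payload) triple for a query execution."""
    url = f"{BASE_URL}/apps/{app_id}/query-executions"
    headers = {
        "Content-Type": "application/json",
        "Wikit-Semantics-API-Key": api_key,
        "X-Wikit-Response-Format": "json",
        "X-Wikit-Organization-Id": org_id,
    }
    payload = {"query": query}
    return url, headers, payload

if __name__ == "__main__":
    import requests  # third-party; pip install requests

    url, headers, payload = build_request(
        os.environ["SEMANTICS_APP_ID"],
        "How many days of special leave am I entitled to for my wedding?",
        os.environ["SEMANTICS_ORG_ID"],
        os.environ["SEMANTICS_API_KEY"],
    )
    response = requests.post(url, headers=headers, json=payload)
    print(response.json()["answer"])
```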

Query Execution with Streaming

POST /semantics/apps/{llm_app_id}/query-executions?is_stream_mode=true

Response streaming is the standard way LLMs deliver content: the answer is sent token by token as it is generated.

To enable it, set the is_stream_mode parameter to true in the query string of the URL:

bash
curl "https://apis.wikit.ai/semantics/apps/$SEMANTICS_APP_ID/query-executions?is_stream_mode=true" \
-H 'Content-Type: application/json' \
-H "Wikit-Semantics-API-Key: $SEMANTICS_API_KEY" \
-H "X-Wikit-Response-Format: json" \
-H "X-Wikit-Organization-Id: $SEMANTICS_ORG_ID" \
-d '{
  "query": "How many days of special leave am I entitled to for my wedding?"
}'

Response

text
data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk":"In "}STOP

data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk":"Fran"}STOP

data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk":"ce,"}STOP

data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk":" you"}STOP

data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk":" are"}STOP
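Reassembling the answer on the client side means concatenating the chunk field of each data event. Below is a hedged Python sketch; the `data: ` prefix and `STOP` terminator are taken from the sample output above, and the shortened event list is illustrative.

```python
# Sketch: rebuild the streamed answer from event lines of the form
# `data: {...}STOP`, as shown in the sample output above.
import json

def parse_stream_lines(lines):
    """Yield the `chunk` field of each `data:` event line."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines between events
        body = line[len("data: "):]
        if body.endswith("STOP"):
            body = body[: -len("STOP")]
        yield json.loads(body)["chunk"]

sample = [
    'data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk": "In "}STOP',
    'data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk": "Fran"}STOP',
    'data: {"queryId": "67c565542c8f2bb8fb52d43a", "chunk": "ce,"}STOP',
]
answer = "".join(parse_stream_lines(sample))
print(answer)  # → In France,
```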

Query Execution in an Existing Conversation

With the Conversational Question-Answer LLM App type, a conversation is created upon the first query execution via the /semantics/apps/{llm_app_id}/query-executions endpoint. The conversation ID is returned in the JSON output (conversation_id) or, in streaming mode, in the X-Wikit-Conversation-Id response header. To continue the conversation, pass this ID in the X-Wikit-Conversation-Id HTTP header of subsequent query executions:

bash
curl "https://apis.wikit.ai/semantics/apps/$SEMANTICS_APP_ID/query-executions?is_stream_mode=true" \
-H 'Content-Type: application/json' \
-H "Wikit-Semantics-API-Key: $SEMANTICS_API_KEY" \
-H "X-Wikit-Response-Format: json" \
-H "X-Wikit-Organization-Id: $SEMANTICS_ORG_ID" \
-H "X-Wikit-Conversation-Id: $SEMANTICS_CONVERSATION_ID" \
-d '{
  "query": "What about my daughter'\''s birthday?"
}'
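The pattern above can be captured as a small helper: the conversation header is omitted on the first turn and added once a conversation ID is known. This is a hedged Python sketch; the header names come from this doc, while the function itself is illustrative.

```python
# Sketch: build query-execution headers, attaching the conversation
# header only after the first response has returned a conversation ID.
def conversation_headers(api_key: str, org_id: str, conversation_id=None):
    """Return the HTTP headers for a query execution."""
    headers = {
        "Content-Type": "application/json",
        "Wikit-Semantics-API-Key": api_key,
        "X-Wikit-Response-Format": "json",
        "X-Wikit-Organization-Id": org_id,
    }
    if conversation_id is not None:
        headers["X-Wikit-Conversation-Id"] = conversation_id
    return headers
```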