Overview
The Vulgate Chat API lets you have AI-powered conversations grounded in your document libraries. Send one or more messages — including text, images, and files — and receive a streamed or complete response, backed by citations from your ingested content.
The API is designed to be compatible with the OpenAI Chat Completions interface, so you can use it as a drop-in replacement with any OpenAI-compatible SDK — just point it to your Vulgate instance and swap the model name.
Base URL
All Chat API endpoints are relative to your Vulgate instance:
```
https://vulgate.ai/api
```
Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/api/chat/completions` | POST | Generate a chat completion, with optional streaming |
Authentication
All endpoints require authentication via a Bearer token in the Authorization header:
```
Authorization: Bearer <your-api-key>
```
API keys are scoped to a team and carry the permissions of the team owner. See Getting Started for how to obtain and configure your API key.
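If you are not using an SDK, any HTTP client can attach this header directly. The sketch below shows a minimal helper, assuming the header format and endpoint path described on this page; `buildHeaders` is an illustrative name, not part of any Vulgate SDK:

```typescript
// Build the headers required by every Vulgate Chat API request.
// buildHeaders is an illustrative helper, not a published SDK function.
function buildHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}

// Example usage with fetch (network call shown for illustration only):
// const res = await fetch("https://vulgate.ai/api/chat/completions", {
//   method: "POST",
//   headers: buildHeaders(process.env.VULGATE_API_KEY!),
//   body: JSON.stringify({ model: "vulgate-1", messages: [/* ... */] }),
// });
```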
Models
Pass one of the following values in the `model` field of your request:

| Model | Description |
|---|---|
| `vulgate-1` | Latest Vulgate model. Recommended for all new integrations. |
| `vulgate` | Alias for `vulgate-1`. |
Libraries
Unlike the Search API, the Chat API does not require you to specify libraries explicitly. The AI automatically searches the libraries associated with your team based on your API key.
OpenAI compatibility
The /api/chat/completions endpoint follows the OpenAI Chat Completions API shape. This means you can use it with any OpenAI-compatible client by setting a custom base URL:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.VULGATE_API_KEY,
  baseURL: "https://vulgate.ai/api",
});

const response = await client.chat.completions.create({
  model: "vulgate-1",
  messages: [{ role: "user", content: "What does the Catechism say about prayer?" }],
});
```
Note that some OpenAI-specific fields (e.g. `response_format: json_schema`, `stop`) are accepted for compatibility but are not currently supported.
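The page mentions streamed responses; assuming the endpoint follows the OpenAI streaming convention (set `stream: true` and receive server-sent events whose `data:` lines carry `chat.completion.chunk` payloads), the deltas can be reassembled as sketched below. The exact chunk shape is an assumption based on the OpenAI format, not a published Vulgate schema:

```typescript
// Shape of one OpenAI-style streaming chunk (assumed, not confirmed by Vulgate docs).
interface ChatChunk {
  choices: { delta: { content?: string } }[];
}

// Reassemble the assistant's text from an SSE body. Assumes each event is a
// line of the form `data: {...}` and the stream terminates with `data: [DONE]`.
function extractDeltas(sseBody: string): string {
  let text = "";
  for (const line of sseBody.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines / comments
    const payload = trimmed.slice(5).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as ChatChunk;
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}
```

In practice you would feed `extractDeltas` incrementally from a `ReadableStream`, but the line-by-line parsing logic is the same.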
Quick example
Send a single user message to the completions endpoint:
```bash
curl -X POST "https://vulgate.ai/api/chat/completions" \
  -H "Authorization: Bearer $VULGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vulgate-1",
    "messages": [
      { "role": "user", "content": "What does Aquinas say about the existence of God?" }
    ]
  }'
```
The response contains the assistant’s answer under `choices[0].message.content`, along with a `citations` array referencing the source passages used:
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1741200000,
  "model": "vulgate-1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Thomas Aquinas presents five arguments for the existence of God in the *Summa Theologiae*, known as the Five Ways..."
      },
      "finish_reason": "stop"
    }
  ],
  "citations": [
    {
      "cited_text": "<p>The existence of God can be proved in five ways...</p>",
      "document_title": "Summa Theologiae",
      "document_index": 0,
      "document_author": "Thomas Aquinas",
      "source_url": null
    }
  ]
}
```
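The answer and its citations can be read straight off the parsed JSON. As a sketch, with types inferred from the sample response above (not from a published schema), you might render the answer together with a plain-text source list:

```typescript
// Types inferred from the sample response on this page; treat as assumptions.
interface Citation {
  cited_text: string;
  document_title: string;
  document_author: string;
  source_url: string | null;
}

interface ChatResponse {
  choices: { message: { content: string } }[];
  citations?: Citation[];
}

// Return the assistant's answer followed by a bulleted list of sources.
function formatAnswer(resp: ChatResponse): string {
  const answer = resp.choices[0].message.content;
  const sources = (resp.citations ?? [])
    .map((c) => `- ${c.document_title}, ${c.document_author}`)
    .join("\n");
  return sources ? `${answer}\n\nSources:\n${sources}` : answer;
}
```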
Rate limits
Requests to `/api/chat/completions` are subject to two limits:
- Rate limit: 10 requests per 10 seconds per team, using a sliding window.
- Execution timeout: 120 seconds per request. Requests that involve multiple tool calls (searching and reading documents) can take significantly longer than simple queries; if your request times out, try simplifying the query.
Exceeding the rate limit returns a 429 response with the following headers:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
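A client can use `X-RateLimit-Reset` to decide how long to back off before retrying a 429. The sketch below assumes the header carries a Unix timestamp in seconds, per the table above; the fallback delay is an arbitrary illustrative choice:

```typescript
// Compute how long to wait (in milliseconds) before retrying after a 429,
// using the X-RateLimit-Reset header (a Unix timestamp in seconds).
function retryDelayMs(
  headers: Record<string, string>,
  nowMs: number = Date.now(),
): number {
  const reset = Number(headers["X-RateLimit-Reset"]);
  if (!Number.isFinite(reset)) return 1000; // header absent: fall back to 1 s
  return Math.max(0, reset * 1000 - nowMs); // never return a negative delay
}
```

With a 10-requests-per-10-seconds sliding window, a delay computed this way is an upper bound; capacity may free up sooner as older requests age out of the window.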
Next steps
- Chat Completions - full reference for the completions endpoint, including streaming, multimodal input, and citations