
Documentation Index

Fetch the complete documentation index at: https://docs.aperium.apps.hillspire.com/llms.txt

Use this file to discover all available pages before exploring further.

Aperium reads its configuration from environment variables on the backend pods. This page lists every variable a self-hosted deployment needs, grouped by what they control.
Connector credentials are not configured here. Anything connector-specific (Google Workspace, Slack, Atlassian, Microsoft 365, Odoo, Salesforce, NetSuite, BigQuery, Onyx, and so on) is entered by an admin through the admin onboarding flow on first sign-in or the Admin Console’s MCP Servers tab afterward. Aperium stores those credentials encrypted against the tenant. See Integrations.

Application basics

| Variable | Example | Purpose |
| --- | --- | --- |
| APP_ENV | production | Application environment label. Drives a few defaulted behaviors and shows up in logs and traces. |
| LOG_LEVEL | INFO | Backend log level. Use DEBUG for noisy local debugging, INFO or WARNING in production. |
| SECRET_KEY | prod_secret_key | Server-side secret used for signing internal tokens. Generate a long random string per environment and treat it as a credential. |
| CORS_ORIGINS | https://aperium.apps.your-company.com | Comma-separated list of origins allowed to call the backend. Must include the frontend URL. |
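A minimal sketch of the basics block, assuming openssl is on PATH and writing to a hypothetical aperium.env file (values from the table above):

```shell
# Generate a long random secret once per environment and keep it in a secret manager.
SECRET_KEY="$(openssl rand -hex 32)"

# Write the application-basics variables to an env file.
cat > aperium.env <<EOF
APP_ENV=production
LOG_LEVEL=INFO
SECRET_KEY=${SECRET_KEY}
CORS_ORIGINS=https://aperium.apps.your-company.com
EOF
```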

Database

| Variable | Example | Purpose |
| --- | --- | --- |
| DATABASE_TYPE | postgresql | Database driver. PostgreSQL is the supported production option. |
| DATABASE_URL | postgresql://user:pass@host:5432/aperium | SQLAlchemy connection URL for the application database. |

LLM providers

Aperium supports calling Claude either through Anthropic’s API directly or through AWS Bedrock. Pick whichever fits your security and procurement requirements (or run both side by side and select per-deployment with PRIMARY_LLM_PROVIDER).

Provider selection

| Variable | Example | Purpose |
| --- | --- | --- |
| PRIMARY_LLM_PROVIDER | anthropic | Default provider for the main reasoning model. Use anthropic or bedrock. |
| PRIMARY_LLM_MODEL | claude-sonnet-4-6 | Default model used for the main agent loop. For Bedrock, use a Bedrock-style model ID such as us.anthropic.claude-sonnet-4-6. |
| SECONDARY_LIGHTWEIGHT_LLM_PROVIDER | anthropic | Provider for cheaper/faster auxiliary calls (routing, classification, summarization). |
| SECONDARY_LIGHTWEIGHT_LLM_MODEL | claude-haiku-4-5 | Model for the lightweight auxiliary path. |
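For example, an Anthropic-direct deployment that keeps Haiku on the auxiliary path might set (a sketch using the example values on this page):

```shell
PRIMARY_LLM_PROVIDER=anthropic
PRIMARY_LLM_MODEL=claude-sonnet-4-6
SECONDARY_LIGHTWEIGHT_LLM_PROVIDER=anthropic
SECONDARY_LIGHTWEIGHT_LLM_MODEL=claude-haiku-4-5
ANTHROPIC_API_KEY=sk-ant-...
```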

Anthropic API

| Variable | Example | Purpose |
| --- | --- | --- |
| ANTHROPIC_API_KEY | sk-ant-... | Anthropic API key used to call Claude when PRIMARY_LLM_PROVIDER=anthropic. |

AWS Bedrock

Set BEDROCK_ENABLED=true to register Bedrock as a provider, then either supply IAM keys explicitly or leave them blank to use the standard boto3 credential chain (env vars, ~/.aws/credentials, instance profile, IRSA on EKS, etc.).

| Variable | Example | Purpose |
| --- | --- | --- |
| BEDROCK_ENABLED | true | Master switch for the Bedrock provider. Defaults to false. |
| BEDROCK_AWS_REGION | us-west-2 | AWS region that hosts the Bedrock endpoint and inference profiles. |
| BEDROCK_AWS_ACCESS_KEY_ID | AKIA... | IAM access key with bedrock:InvokeModel permission. Optional; falls back to the boto3 default chain when blank. |
| BEDROCK_AWS_SECRET_ACCESS_KEY | secret | IAM secret key. Optional; falls back to the boto3 default chain when blank. |
| BEDROCK_AWS_SESSION_TOKEN | secret | Session token for temporary credentials (STS). Leave blank if you’re using long-lived keys or instance/IRSA credentials. |
| BEDROCK_DEFAULT_MODEL | us.anthropic.claude-sonnet-4-6 | Default Bedrock model ID used when no model is specified. |
| BEDROCK_MODELS | us.anthropic.claude-sonnet-4-6,us.anthropic.claude-haiku-4-5-20251001-v1:0 | Comma-separated allowlist of Bedrock model IDs available to the platform. Both inference profile IDs ({region}.anthropic.claude-...) and direct model IDs (anthropic.claude-...) are accepted. |
| BEDROCK_COST_MAP | us.anthropic.claude-sonnet-4-6:3.0:15.0 | Per-model cost tracking for usage reports, in model:input_cost_per_M_tokens:output_cost_per_M_tokens format. Comma-separate multiple entries. |
| BEDROCK_REQUEST_TIMEOUT | 60 | Per-request timeout in seconds. Range 5 to 300. |

Bedrock provider configuration is environment-driven only; there is no per-tenant override.
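A sketch of a Bedrock-only configuration that relies on the boto3 credential chain (for example, IRSA on EKS) rather than explicit IAM keys, using the example values above:

```shell
BEDROCK_ENABLED=true
BEDROCK_AWS_REGION=us-west-2
# BEDROCK_AWS_ACCESS_KEY_ID, BEDROCK_AWS_SECRET_ACCESS_KEY, and
# BEDROCK_AWS_SESSION_TOKEN are left unset so boto3 falls back to its
# default credential chain (env vars, ~/.aws/credentials, instance profile, IRSA).
PRIMARY_LLM_PROVIDER=bedrock
PRIMARY_LLM_MODEL=us.anthropic.claude-sonnet-4-6
BEDROCK_DEFAULT_MODEL=us.anthropic.claude-sonnet-4-6
BEDROCK_MODELS=us.anthropic.claude-sonnet-4-6,us.anthropic.claude-haiku-4-5-20251001-v1:0
BEDROCK_COST_MAP=us.anthropic.claude-sonnet-4-6:3.0:15.0
BEDROCK_REQUEST_TIMEOUT=60
```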

Other providers

| Variable | Example | Purpose |
| --- | --- | --- |
| GOOGLE_API_KEY | AIza... | Google AI Studio key, used when calling Gemini-family models or Google embeddings. |

Token and context window budgets

These caps control how Aperium splits the model’s context window across the system prompt, tool schemas, conversation history, and tool results.

| Variable | Example | Purpose |
| --- | --- | --- |
| MODEL_MAX_TOKENS | 200000 | Maximum context window Aperium plans against. |
| SYSTEM_PROMPT_TOKEN_RESERVE | 2000 | Tokens reserved for the system prompt. |
| TOOLS_SCHEMA_TOKEN_RESERVE | 30000 | Tokens reserved for advertising tool schemas to the model. |
| RESPONSE_TOKEN_RESERVE | 8192 | Tokens reserved for the model’s reply. |
| MAX_TOOL_RESULT_CHARS | 80000 | Per-tool-call hard cap on returned content (in characters). Larger results are truncated. |
| LONG_CONTEXT_MODE | auto | Whether to enable extended-context mode when the model supports it. auto, on, or off. |
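With the example values above, the headroom left for conversation history and tool results is simply the context window minus the three reserves:

```shell
# 200000 (window) - 2000 (system prompt) - 30000 (tool schemas) - 8192 (reply)
# leaves the token budget for conversation history and tool results.
echo $((200000 - 2000 - 30000 - 8192))
```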

Context management and loop detection

| Variable | Example | Purpose |
| --- | --- | --- |
| CONTEXT_MANAGEMENT_ENABLED | false | Whether Aperium actively compacts conversation history when it nears the budget. |
| MAX_LOOP_PATTERN_REPEATS | 9 | Maximum number of times Aperium will allow the same tool-call pattern to repeat before breaking the loop. |

Response streaming

| Variable | Example | Purpose |
| --- | --- | --- |
| ENABLE_RESPONSE_STREAMING | true | Stream the model’s reply to the frontend as it’s generated. Recommended true in production. |

Prompt caching

Prompt caching lets the model reuse a long static prefix (system prompt + tool schemas + skills) across requests, so you pay full price only on the first request; subsequent cache hits are billed at a reduced rate.

| Variable | Example | Purpose |
| --- | --- | --- |
| ENABLE_PROMPT_CACHING | true | Master switch for Anthropic prompt caching. |
| CACHE_TTL_TOOLS | 1h | TTL for the tools-schema cache block. |
| CACHE_TTL_SYSTEM_PROMPT | 1h | TTL for the system-prompt cache block. |
| CACHE_TTL_SKILLS | 1h | TTL for skills/instruction blocks. |
| CACHE_TTL_CONVERSATION | 5m | TTL for the conversation prefix cache. Short, since conversation grows quickly. |

MCP runtime

These variables control how Aperium talks to MCP servers (both built-in connectors and any custom ones registered through the Admin Console). They do not contain connector credentials.

| Variable | Example | Purpose |
| --- | --- | --- |
| MCP_AUTH_MODE | enforce | enforce requires every MCP call to carry a valid auth token. Use this in production. |
| MCP_CREDENTIAL_ENCRYPTION_KEY | base64-encoded 32-byte key | Encrypts tenant integration credentials at rest. Generate one per environment and treat it as a master secret. |
| MCP_POOL_ENABLED | true | Reuse MCP client connections across requests. |
| MCP_POOL_MAX_PER_SERVER | 3 | Maximum pooled clients per MCP server. |
| MCP_POOL_IDLE_TIMEOUT | 300 | Seconds an idle pooled client is kept before being closed. |
| MCP_POOL_KEEPALIVE_INTERVAL | 60 | Keepalive ping interval (seconds) on long-lived MCP connections. |
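MCP_CREDENTIAL_ENCRYPTION_KEY needs to be a base64-encoded 32-byte value; one common way to produce one, assuming openssl is available, is:

```shell
# Emit 32 random bytes, base64-encoded. Generate once per environment
# and store the result in your secret manager, not in source control.
openssl rand -base64 32
```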

Tool loading and routing

Capability routing trims which tools are advertised to the model on each turn so context budget and latency stay reasonable when many MCP servers are connected.

| Variable | Example | Purpose |
| --- | --- | --- |
| TOOL_LOADING_CAPABILITY_ROUTING_ENABLED | true | Turn on capability-based tool selection. |
| TOOL_LOADING_CAPABILITY_ROUTING_SHADOW_MODE | true | When true, run capability routing alongside the full tool set and log differences without changing behavior. Use this to validate before flipping fully on. |
| TOOL_SEMANTIC_EMBEDDING_PROVIDER | sentence-transformers | Provider used to embed tool descriptions for routing. |
| TOOL_SEMANTIC_EMBEDDING_MODEL | jinaai/jina-embeddings-v5-text-nano | Specific embedding model used by the routing provider. |
| ENABLE_PARALLEL_TOOL_EXECUTION | false | Allow the agent to call multiple tools in parallel within a single turn. |
| ENABLE_FORK_MODEL | false | Enable the fork-model execution path for branched agent runs. |
| SEMANTIC_ROUTING_ENABLED | false | Master switch for semantic routing of agent intents. |

Multi-pod and Redis

If you run more than one backend pod, you must enable Redis so pods can share session state and broadcast notifications.

| Variable | Example | Purpose |
| --- | --- | --- |
| MULTI_POD_ENABLED | false | Set to true whenever the backend runs with more than one replica. |
| REDIS_ENABLED | true | Master switch for Redis-backed features. Required when MULTI_POD_ENABLED=true. |
| REDIS_URL | redis://10.55.1.11:6379 | Connection string for Redis. |
| REDIS_PUBSUB_CHANNEL_PREFIX | aperium:notifications | Channel prefix Aperium uses for cross-pod notifications. |
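For example, a deployment scaled to two or more backend replicas would enable both switches together (a sketch using the example values above):

```shell
MULTI_POD_ENABLED=true
REDIS_ENABLED=true
REDIS_URL=redis://10.55.1.11:6379
REDIS_PUBSUB_CHANNEL_PREFIX=aperium:notifications
```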

Vector database (Qdrant)

Used for retrieval, memory, and other features that need vector search.

| Variable | Example | Purpose |
| --- | --- | --- |
| VECTOR_DB_PROVIDER | qdrant | Vector store backend. |
| QDRANT_MODE | server | server for a managed Qdrant deployment, local for an embedded one. |
| QDRANT_HOST | qdrant.qdrant.svc.cluster.local | Hostname of the Qdrant server. |
| QDRANT_API_KEY | secret | API key for Qdrant. |

Tabular query backend

Aperium can compute its tabular analytics either against PostgreSQL or against BigQuery. Choose one with TABULAR_QUERY_BACKEND and configure that backend’s variables.

| Variable | Example | Purpose |
| --- | --- | --- |
| TABULAR_QUERY_BACKEND | postgresql | postgresql or bigquery. |

PostgreSQL backend

| Variable | Example | Purpose |
| --- | --- | --- |
| TABULAR_POSTGRESQL_SCHEMA | tabular | Schema name Aperium reads and writes tabular data into. |
| TABULAR_POSTGRESQL_QUERY_TIMEOUT_SECONDS | 30 | Per-query timeout. |
| TABULAR_POSTGRESQL_LOCK_TIMEOUT_MS | 1000 | Lock acquisition timeout in milliseconds. |
| TABULAR_POSTGRESQL_MAX_RESULT_ROWS | 1000 | Hard cap on rows returned per query. |
| TABULAR_POSTGRESQL_MAX_BYTES_RETURNED | 2000000 | Hard cap on bytes returned per query. |
| TABULAR_POSTGRESQL_SWEEPER_BATCH_SIZE | 100 | Batch size used by the cleanup sweeper. |
| TABULAR_POSTGRESQL_CLEANUP_RUNNER_MODE | cronjob | How the cleanup runner is scheduled (cronjob for a Kubernetes CronJob, otherwise an in-process scheduler). |

BigQuery backend

| Variable | Example | Purpose |
| --- | --- | --- |
| TABULAR_BIGQUERY_PROJECT | hs-apps-prod | GCP project hosting the BigQuery dataset. |
| TABULAR_BIGQUERY_DATASET | aperium_tabular | Dataset name. |
| TABULAR_BIGQUERY_LOCATION | US | BigQuery location for the dataset. |

File uploads and shared storage

Aperium needs a place to store uploaded files. Choose local (a shared RWX volume) or gcs (a Google Cloud Storage bucket) and configure the matching variables.

| Variable | Example | Purpose |
| --- | --- | --- |
| FILE_UPLOAD_STORAGE_BACKEND | local | local or gcs. |
| FILE_UPLOAD_LOCAL_DIR | /shared | Mount path for the shared volume when using local storage. |
| FILE_UPLOAD_LOCAL_IS_SHARED | true | Set true when the local directory is mounted into every backend pod (required for multi-pod). |
| FILE_UPLOAD_GCS_BUCKET | aperium-prod | GCS bucket name when using the GCS backend. |
| FILE_UPLOAD_GCS_PROJECT | hs-apps-prod | GCP project owning the upload bucket. |
| FILE_UPLOAD_GCS_UPLOAD_PREFIX | uploads/ | Prefix used inside the bucket. |
| FILE_CACHE_CLEANUP_RUNNER_MODE | cronjob | How file-cache cleanup runs (cronjob for a Kubernetes CronJob, otherwise in-process). |
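Two illustrative configurations, one per backend, using the example values above (pick one; the GCS variant is shown commented out):

```shell
# Local backend: an RWX volume mounted at the same path in every backend pod.
FILE_UPLOAD_STORAGE_BACKEND=local
FILE_UPLOAD_LOCAL_DIR=/shared
FILE_UPLOAD_LOCAL_IS_SHARED=true

# GCS backend: a bucket instead of a shared volume.
# FILE_UPLOAD_STORAGE_BACKEND=gcs
# FILE_UPLOAD_GCS_BUCKET=aperium-prod
# FILE_UPLOAD_GCS_PROJECT=hs-apps-prod
# FILE_UPLOAD_GCS_UPLOAD_PREFIX=uploads/
```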

Document processing

| Variable | Example | Purpose |
| --- | --- | --- |
| DOC_PROCESSOR_PDF_MAX_CONTENT_SIZE_MB | 50 | Hard cap on the size of an uploaded PDF before processing. |

Sharing

Used by the share-link feature.

| Variable | Example | Purpose |
| --- | --- | --- |
| SHARING_FRONTEND_URL | https://aperium.apps.your-company.com | Public URL of the frontend, used to build share links. |
| SHARING_GCS_BUCKET | bucket name | GCS bucket used for shared assets when sharing is backed by GCS. |
| SHARING_GCS_PROJECT | project id | GCP project owning the sharing bucket. |

Email (SMTP)

Used for notification emails and invitations.

| Variable | Example | Purpose |
| --- | --- | --- |
| SMTP_HOST | smtp.gmail.com | SMTP server hostname. |
| SMTP_PORT | 587 | SMTP port. |
| SMTP_USER | aperium@your-company.com | Username for SMTP auth. |
| SMTP_PASSWORD | secret | Password or app password. |
| SMTP_FROM_EMAIL | aperium@your-company.com | "From" address on outgoing email. |
| SMTP_FROM_NAME | Aperium AI | Display name on outgoing email. |
| SMTP_USE_TLS | true | Use STARTTLS. |
| SMTP_USE_SSL | false | Use implicit TLS instead of STARTTLS. Set one of SMTP_USE_TLS or SMTP_USE_SSL, not both. |
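A STARTTLS configuration on port 587, sketched from the example values above:

```shell
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=aperium@your-company.com
SMTP_PASSWORD=secret
SMTP_FROM_EMAIL=aperium@your-company.com
SMTP_FROM_NAME=Aperium AI
# STARTTLS and implicit TLS are mutually exclusive; enable exactly one.
SMTP_USE_TLS=true
SMTP_USE_SSL=false
```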

Tracing and observability

Aperium emits OpenTelemetry traces for the agent loop and tool calls. Phoenix is the bundled trace UI.

| Variable | Example | Purpose |
| --- | --- | --- |
| ENABLE_TRACING | True | Master switch for tracing. |
| TRACING_ENDPOINT | http://otel-collector.monitoring.svc.cluster.local:4318 | OTLP HTTP endpoint for the trace collector. |
| TRACING_STARTUP_MODE | deferred | deferred initializes tracing lazily on the first span; eager initializes it at boot. |
| PHOENIX_API_KEY | secret | API key used by the backend to authenticate to Phoenix. |

Sentry

| Variable | Example | Purpose |
| --- | --- | --- |
| SENTRY_ENABLED | true | Master switch for Sentry error reporting. |
| SENTRY_DSN | https://...@sentry.io/... | Sentry DSN for your project. |
| SENTRY_ENVIRONMENT | production | Environment label attached to events. |
| SENTRY_WITH_LOCALS | true | Include local variables on stack frames in Sentry events. |

Guardrails

| Variable | Example | Purpose |
| --- | --- | --- |
| GUARDRAILS_ENABLED | false | Master switch for input/output guardrail policies. Editable from the Admin Console once enabled. |

Agent intelligence

| Variable | Example | Purpose |
| --- | --- | --- |
| AGENT_INTELLIGENCE_SCHEDULER_ENABLED | true | Enables the background scheduler that runs proactive agent jobs. |

Embedding cache (multi-pod)

When the backend runs with more than one replica, point Hugging Face and sentence-transformers at a shared volume so models aren’t downloaded to every pod.

| Variable | Example | Purpose |
| --- | --- | --- |
| HF_HOME | /shared/hf-cache | Cache directory used by Hugging Face downloads. |
| SENTENCE_TRANSFORMERS_HOME | /shared/hf-cache | Cache directory used by the sentence-transformers library. |

Feature flags

These toggle UI features. They can be flipped per environment.

| Variable | Example | Purpose |
| --- | --- | --- |
| GALLERY_ENABLED | true | Show the gallery surface. |
| DASHBOARD_V2_ENABLED | false | Switch to the v2 dashboard. |
| DAILY_BRIEF_ENABLED | true | Enable the daily brief feature. |
| LYRA_DASHBOARD_ENABLED | true | Enable Lyra, our newer and more powerful dashboard service. Recommended true for new deployments. |

Frontend

The frontend reads URLs from environment variables prefixed with VITE_:

```shell
VITE_API_URL=http://localhost:8080
VITE_WEBSOCKET_URL=ws://localhost:8080/ws/chat
VITE_FRONTEND_PORT=3002
```