Environment Variables

Copy .env.example to .env and fill in your values.

Required

| Variable | Description |
| --- | --- |
| DATABASE_URL | PostgreSQL connection string. Example: postgresql://user:pass@localhost:5432/clearsight |
| REDIS_URL | Redis connection string. Default: redis://localhost:6379 |
| AZURE_OPENAI_ENDPOINT | Full chat completions URL (see below) |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key |
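
A startup check for the required variables listed above might look like the following minimal sketch. The helper name missingRequiredVars and the Env type are hypothetical, not taken from the project. The sketch treats all four variables as required because the table lists them that way; whether the application falls back to the documented Redis default when REDIS_URL is unset is not stated here.

```typescript
// Hypothetical helper for illustration; not the project's own code.
type Env = Record<string, string | undefined>;

// The four variables the "Required" table lists.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "REDIS_URL",
  "AZURE_OPENAI_ENDPOINT",
  "AZURE_OPENAI_API_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function missingRequiredVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node process, calling missingRequiredVars(process.env) at startup and exiting when the result is non-empty would surface a misconfigured .env file before any crawl begins.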

AZURE_OPENAI_ENDPOINT format is critical.

It must be the complete URL including deployment name and api-version:

https://<resource>.openai.azure.com/openai/deployments/<model>/chat/completions?api-version=2025-01-01-preview

The code uses this URL directly as the fetch target. Do not omit the path or query string.
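
Because the URL is used verbatim, a malformed value only fails at request time. The sketch below shows one way to check the shape up front, assuming the path and query layout shown above. The function name checkAzureEndpoint is hypothetical; the project may not perform any such validation.

```typescript
// Hypothetical validator for illustration; not the project's own code.
// Returns a list of problems; an empty list means the URL has the expected shape.
function checkAzureEndpoint(endpoint: string): string[] {
  let url: URL;
  try {
    url = new URL(endpoint);
  } catch {
    return ["not a valid absolute URL"];
  }

  const problems: string[] = [];

  // Expect /openai/deployments/<model>/chat/completions as the path.
  const pathPattern = /^\/openai\/deployments\/[^/]+\/chat\/completions\/?$/;
  if (!pathPattern.test(url.pathname)) {
    problems.push("path must be /openai/deployments/<model>/chat/completions");
  }

  // The api-version query parameter must be present and non-empty.
  if (!url.searchParams.get("api-version")) {
    problems.push("missing api-version query parameter");
  }

  return problems;
}
```

A bare resource URL such as https://<resource>.openai.azure.com would be reported with both problems, which matches the two omissions the note above warns about.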

Optional

| Variable | Default | Description |
| --- | --- | --- |
| AZURE_OPENAI_API_VERSION | 2025-01-01-preview | Azure OpenAI API version |
| MAX_CRAWL_PAGES | unlimited | Hard cap on pages per crawl |
| CRAWL_DELAY_MS | 200 | Delay (ms) between page fetches during discovery |
| WORKER_CONCURRENCY | 3 | Parallel Playwright instances for page scanning |
| AI_CONCURRENCY | 2 | Parallel AI enrichment workers |
| BULL_BOARD_PORT | 3001 | Port for Bull Board admin UI |
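
To illustrate how these defaults apply, here is a minimal sketch of a loader that falls back to the documented values when a variable is unset. The function names, the shape of the returned object, and the use of Infinity to stand for "unlimited" are all assumptions made for the example; the project's actual parsing may differ.

```typescript
// Hypothetical loader for illustration; not the project's own code.
type Env = Record<string, string | undefined>;

// Parses an integer variable, using `fallback` when it is unset or blank.
// Throws if the value is not an integer or is below `min`.
function intFromEnv(env: Env, name: string, fallback: number, min = 0): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min) {
    throw new Error(`${name} must be an integer >= ${min}, got "${raw}"`);
  }
  return value;
}

// Defaults mirror the "Optional" table.
function loadOptionalConfig(env: Env) {
  return {
    azureOpenAiApiVersion: env["AZURE_OPENAI_API_VERSION"] ?? "2025-01-01-preview",
    // Assumption: Infinity represents "unlimited" when MAX_CRAWL_PAGES is unset.
    maxCrawlPages: intFromEnv(env, "MAX_CRAWL_PAGES", Infinity, 1),
    crawlDelayMs: intFromEnv(env, "CRAWL_DELAY_MS", 200),
    workerConcurrency: intFromEnv(env, "WORKER_CONCURRENCY", 3, 1),
    aiConcurrency: intFromEnv(env, "AI_CONCURRENCY", 2, 1),
    bullBoardPort: intFromEnv(env, "BULL_BOARD_PORT", 3001, 1),
  };
}
```

Rejecting non-numeric values at load time, rather than letting them become NaN, keeps a typo such as WORKER_CONCURRENCY=three from silently changing behavior.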

Tuning concurrency

WORKER_CONCURRENCY controls how many pages are scanned simultaneously. Each worker runs its own headless Chromium instance, so higher values give faster crawls but use more RAM.

Rule of thumb: 1 Playwright worker ≈ 300–500 MB RAM. With WORKER_CONCURRENCY=3, budget ~1.5 GB (3 × 500 MB) for the worker process.
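
The rule of thumb can be turned into a quick calculation. The sketch below is illustrative only: the function names are hypothetical, and the per-worker figures are just the 300–500 MB range quoted above, not measurements.

```typescript
// Per-worker figures taken from the rule of thumb (300–500 MB per Playwright worker).
const MB_PER_WORKER_LOW = 300;
const MB_PER_WORKER_HIGH = 500;

// Estimated RAM range for a given WORKER_CONCURRENCY.
function estimateWorkerRamMb(workerConcurrency: number): { lowMb: number; highMb: number } {
  return {
    lowMb: workerConcurrency * MB_PER_WORKER_LOW,
    highMb: workerConcurrency * MB_PER_WORKER_HIGH,
  };
}

// Largest WORKER_CONCURRENCY whose upper-bound estimate fits in `budgetMb`.
// Never returns less than 1, since at least one worker is needed to scan at all.
function maxWorkersForBudget(budgetMb: number): number {
  return Math.max(1, Math.floor(budgetMb / MB_PER_WORKER_HIGH));
}
```

For the default of 3 workers this gives a range of 900 to 1500 MB, which is where the ~1.5 GB budget comes from. Sizing against the upper bound leaves headroom for heavier pages.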

AI_CONCURRENCY controls how many Azure OpenAI calls run in parallel. Keep it low enough that the combined request rate stays within your Azure deployment’s rate limit.
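
AI_CONCURRENCY is a count of in-flight calls, while Azure rate limits are typically expressed per minute, so it helps to convert one to the other. The sketch below shows a steady-state estimate (request rate = concurrency / average latency) and a generic way to cap parallel calls. Both functions are hypothetical illustrations; the project most likely relies on its queue's own concurrency setting rather than code like this.

```typescript
// Steady-state estimate: with every worker busy, requests/min = concurrency * 60 / latency.
function estimateRequestsPerMinute(aiConcurrency: number, avgLatencySeconds: number): number {
  return (aiConcurrency * 60) / avgLatencySeconds;
}

// Runs `task` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}
```

With AI_CONCURRENCY=2 and calls that average 3 seconds, the estimate is about 40 requests per minute. Comparing that figure against the deployment's requests-per-minute quota gives a more direct check than comparing the concurrency value itself.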
