Self-Host Langfuse: Complete LLM Observability for Your Homelab

Deploy Langfuse in your homelab to monitor LLM calls, track tokens, manage prompts, and debug AI applications with full control over your data.

• 7 min read
Tags: langfuse, llm, observability, self-hosted, docker, homelab, monitoring, ai

Running LLMs at home means tracking every token, debugging agent loops, and understanding exactly what your models are doing. Langfuse brings enterprise-grade LLM observability to your homelab without sending data to third parties.

Why LLM Observability Matters

Every time you call an LLM—whether it’s a simple chat completion or a multi-step agent workflow—you’re dealing with:

  • Cost: API calls cost money; local models consume electricity
  • Latency: Complex chains can take seconds or minutes
  • Quality: Are outputs meeting expectations?
  • Debugging: When agents fail, why did they fail?

Without observability, you’re flying blind. Langfuse gives you visibility into all of this.

What is Langfuse?

Langfuse is an open-source LLM engineering platform that provides:

  • Tracing: End-to-end visualization of LLM application flows
  • Prompt Management: Version control for prompts with testing and deployment
  • Evaluation: Human and automated scoring for output quality
  • Token Tracking: Monitor usage across all your LLM applications
  • Agent Observability: Visualize multi-step agent executions

The best part? It’s completely self-hostable. Run it in your homelab and keep your data private.

[Image: Langfuse dashboard showing traces and metrics] Langfuse gives you complete visibility into every LLM call

Architecture: How Langfuse Works

Langfuse v3 uses a modern microservices architecture designed for production scale:

[Image: Langfuse architecture diagram showing components] Langfuse v3 architecture with dedicated containers for scalability

Core Components

Component          Purpose
---------          -------
Langfuse Web       UI and API server
Langfuse Worker    Async event processing
Postgres           Transactional data (users, projects)
Clickhouse         Observability data (traces, scores)
Redis/Valkey       Caching and queue operations
S3/Blob Storage    Raw events and multi-modal data

This architecture means:

  • High ingestion throughput without timeouts
  • Fast analytical queries via Clickhouse
  • Recoverable events (stored in S3 first)
  • Background migrations for smooth upgrades

Homelab Deployment with Docker Compose

Let’s get Langfuse running in your homelab. The Docker Compose setup is perfect for homelab scale—single VM, full functionality.

System Requirements

Scale        CPU        RAM       Disk
-----        ---        ---       ----
Testing      2 cores    8 GiB     50 GiB
Production   4+ cores   16 GiB    100 GiB

Observability data grows fast. Plan your storage accordingly.
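A quick back-of-envelope calculation helps with disk sizing. The trace volume, average size, and retention window below are invented numbers, so substitute your own workload:

```python
# Rough disk estimate for trace data (all figures are hypothetical --
# plug in your own trace volume and retention window).
traces_per_day = 5_000
avg_trace_kib = 20          # payloads, spans, and scores per trace
days_retained = 90

total_gib = traces_per_day * avg_trace_kib * days_retained / (1024 * 1024)
print(f"~{total_gib:.1f} GiB for trace data alone")  # ~8.6 GiB
```

Double this for Postgres, S3/MinIO copies of raw events, and headroom, and the 100 GiB production figure above starts to look sensible.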

Step 1: Clone and Configure

# Clone the Langfuse repository
git clone https://github.com/langfuse/langfuse.git
cd langfuse

# Open docker-compose.yml and update all secrets marked with # CHANGEME
# Use strong, random passwords for production
nano docker-compose.yml

The key secrets to change:

  • SALT_01, SALT_02, SALT_03 - Used for encryption
  • ENCRYPTION_KEY - 32-byte key for sensitive data
  • POSTGRES_PASSWORD - Database password
  • Various API secrets
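Python's secrets module is a convenient way to generate these random values. The lengths below are sketches: the 32-byte hex string matches the 32-byte key the ENCRYPTION_KEY comment describes, and the others are simply strong random strings:

```python
# Generate strong random secrets to paste into docker-compose.yml.
# Key lengths are reasonable defaults, not official requirements --
# check the comments in the compose file for each variable.
import secrets

print("ENCRYPTION_KEY:", secrets.token_hex(32))      # 32 bytes -> 64 hex chars
print("POSTGRES_PASSWORD:", secrets.token_urlsafe(24))
print("SALT:", secrets.token_hex(16))
```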

Step 2: Start Langfuse

# Start all services
docker compose up -d

# Watch the logs
docker compose logs -f langfuse-web

Wait about 2-3 minutes. When you see “Ready” in the logs, you’re good to go.
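If you script your deployments, you can poll the health endpoint instead of watching logs. This sketch assumes the default port and Langfuse's public health-check path; adjust the URL for your host:

```python
# Poll Langfuse until it reports healthy (sketch; the URL assumes the
# default port 3000 and the /api/public/health endpoint).
import time
import urllib.request

def wait_for_ready(url="http://localhost:3000/api/public/health", timeout=300):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # not up yet; keep polling
        time.sleep(5)
    return False
```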

Step 3: Access the UI

Open http://localhost:3000 (or http://your-vm-ip:3000 for a remote server).

Create your first organization and project. You’ll get API keys for your applications.

[Image: Langfuse setup screen] Create your first project to start collecting traces

Integrating with Your LLM Applications

Langfuse provides SDKs for Python and JavaScript/TypeScript. Here’s how to integrate with a local Ollama setup.

Python SDK Setup

from langfuse import Langfuse

# Initialize with your project keys
langfuse = Langfuse(
    public_key="pk-xxx",
    secret_key="sk-xxx",
    host="http://your-langfuse-server:3000"
)

# Trace a generation
trace = langfuse.trace(
    name="chat-completion",
    metadata={"model": "llama3.2", "source": "ollama"}
)

generation = trace.generation(
    name="llm-call",
    model="llama3.2",
    input=[{"role": "user", "content": "Hello!"}],
    output={"role": "assistant", "content": "Hi there!"}
)

# Flush to ensure data is sent
langfuse.flush()

With OpenAI-Compatible APIs

If you’re using Ollama’s OpenAI-compatible endpoint or another compatible service:

# The langfuse.openai drop-in wrapper traces calls automatically
from langfuse.openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama"  # Required by the client but ignored by Ollama
)

# Traces are automatically sent to Langfuse
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

JavaScript/TypeScript SDK

import { Langfuse } from "langfuse";

const langfuse = new Langfuse({
  publicKey: "pk-xxx",
  secretKey: "sk-xxx",
  baseUrl: "http://your-langfuse-server:3000"
});

const trace = langfuse.trace({
  name: "agent-workflow",
  metadata: { agent: "my-agent" }
});

Key Features for Homelab Use

Token Tracking

Monitor exactly how many tokens your applications consume:

  • Input/output tokens per request
  • Daily/weekly/monthly aggregation
  • Cost estimation (configure your token costs)
  • Breakdown by model and application

Token tracking dashboard Track token usage across all your LLM applications
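The cost math itself is just tokens multiplied by a configured price. Langfuse performs this for you once you set per-model prices in the UI; conceptually it looks like this (the prices below are placeholders, not real rates):

```python
# The token-cost math behind cost estimation. Prices are placeholders,
# in USD per token -- configure your real rates in the Langfuse UI.
PRICES = {
    "llama3.2":    {"input": 0.0,     "output": 0.0},      # local = "free"
    "gpt-4o-mini": {"input": 0.15e-6, "output": 0.60e-6},  # hypothetical
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICES[model]
    return input_tokens * p["input"] + output_tokens * p["output"]

print(f"${estimate_cost('gpt-4o-mini', 1200, 300):.6f}")  # $0.000360
```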

Prompt Versioning

Managing prompts is like managing code:

# Get the current version of a prompt
prompt = langfuse.get_prompt("my-prompt-name")

# Use it in your application
formatted = prompt.compile(variable="value")

# Prompts are cached client-side for performance

Features:

  • Version history with diff views
  • A/B testing different prompts
  • Deploy to production without code changes
  • Rollback if something breaks
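Under the hood, compiling a prompt is plain template substitution over the {{variable}} placeholders stored in Langfuse. A minimal stand-in for what compile does, for illustration only:

```python
# Minimal illustration of {{variable}} substitution -- the same idea
# prompt.compile() applies to templates stored in Langfuse.
import re

def compile_prompt(template: str, **variables) -> str:
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: str(variables.get(m.group(1), m.group(0))),  # keep unknowns
        template,
    )

print(compile_prompt("Summarize {{doc}} in {{n}} words.", doc="the report", n=50))
# -> Summarize the report in 50 words.
```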

Session-Based Tracing

Group traces by user session to understand journeys:

trace = langfuse.trace(
    name="user-session",
    id="user-123-session-abc",
    user_id="user-123",
    session_id="session-abc"
)

Perfect for debugging multi-turn conversations or agent workflows.
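The grouping itself is simple. This is roughly what the Sessions view does with your trace records (the sample data is invented):

```python
# Grouping traces by session_id -- the idea behind the Sessions view.
from collections import defaultdict

traces = [  # sample records, invented for illustration
    {"id": "t1", "session_id": "session-abc", "name": "turn-1"},
    {"id": "t2", "session_id": "session-abc", "name": "turn-2"},
    {"id": "t3", "session_id": "session-xyz", "name": "turn-1"},
]

by_session = defaultdict(list)
for t in traces:
    by_session[t["session_id"]].append(t["name"])

print(dict(by_session))
# {'session-abc': ['turn-1', 'turn-2'], 'session-xyz': ['turn-1']}
```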

Score-based Evaluation

Add quality metrics to your traces:

# Human evaluation
trace.score(name="helpfulness", value=5)

# Automated evaluation
trace.score(name="hallucination-check", value=0.2, data_type="NUMERIC")

Create dashboards to track quality over time.
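An automated score can be as simple as a heuristic function whose result you attach via trace.score(). A crude grounding check, shown here purely for illustration:

```python
# A crude grounding heuristic: the fraction of the answer's words that
# appear in the source text. Real evaluators (LLM-as-judge, semantic
# similarity) are more sophisticated; this just shows where a numeric
# score can come from.
def grounding_score(answer: str, source: str) -> float:
    answer_words = set(answer.lower().split())
    source_words = set(source.lower().split())
    if not answer_words:
        return 0.0
    return len(answer_words & source_words) / len(answer_words)

score = grounding_score(
    "paris is the capital",
    "the capital of france is paris",
)
print(score)  # 1.0 -- every answer word appears in the source
```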

Dashboard Views

Langfuse provides several built-in views:

Sessions View

  • Group traces by session or user
  • See the full conversation flow
  • Filter by date, model, metadata

Traces View

  • Individual trace inspection
  • Nested spans for complex operations
  • Input/output inspection
  • Timing breakdown

Generations View

  • Filter by model
  • See prompt used
  • Compare inputs/outputs
  • Token counts and latency

Scores View

  • Evaluation results
  • Trend analysis
  • Filter by score name

Production Considerations

Security

Authentication: Enable email/password or SSO:

# In docker-compose.yml environment
AUTH_DISABLE_LOCAL_AUTH: "false"  # Keep email/password login enabled

For SSO with Google:

AUTH_GOOGLE_CLIENT_ID: "your-client-id"
AUTH_GOOGLE_CLIENT_SECRET: "your-secret"

Encryption: All sensitive data is encrypted at rest using the ENCRYPTION_KEY you configured.

Network Isolation: Run Langfuse in an air-gapped environment with no internet access. Perfect for sensitive homelab data.

Data Retention

Observability data grows. Plan for retention:

  • Clickhouse stores all traces/scores
  • S3/MinIO stores raw events
  • Configure retention policies based on your needs

Backups

Docker Compose doesn’t include automated backups. For production:

  1. Database backups: Regular Postgres dumps
  2. Clickhouse backups: Use Clickhouse backup tools
  3. S3/MinIO: Configure object versioning
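A nightly pg_dump is easy to script. The container, user, and database names below are assumptions based on a default compose setup, so verify yours with `docker compose ps` first:

```python
# Sketch of a nightly Postgres dump. Container, user, and database names
# are assumptions from a default compose setup -- verify them with
# `docker compose ps` before relying on this.
import datetime

def build_dump_cmd(container="langfuse-postgres-1",
                   user="postgres", db="postgres"):
    return ["docker", "exec", container, "pg_dump", "-U", user, db]

outfile = f"langfuse-db-{datetime.date.today().isoformat()}.sql"
cmd = build_dump_cmd()
print(" ".join(cmd), ">", outfile)
# Run from cron, e.g.:
#   subprocess.run(cmd, stdout=open(outfile, "wb"), check=True)
```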

Scaling

The Docker Compose setup scales vertically. For horizontal scaling (high availability), migrate to Kubernetes.

Integration with Homelab Stack

Langfuse fits naturally into a monitoring stack:

Tool                 Purpose
----                 -------
Langfuse             LLM observability
Prometheus/Grafana   Infrastructure metrics
Uptime Kuma          Uptime monitoring
Loki                 Log aggregation

All self-hosted, all under your control.

Langfuse vs. Alternatives

FeatureLangfuseLangSmithHeliconeW&B
Self-Hosted✅ Full features⚠️ Limited
Open Source✅ MIT⚠️ Partial
Prompt Management
Agent Tracing
Evaluations
CostFreePaidFreemiumFreemium

For homelab use, Langfuse is the clear winner: fully open-source, fully self-hosted, no usage limits.

Best Practices for Homelab Scale

1. Start Small, Grow as Needed

Docker Compose on a single VM is perfect for exploration. Scale to Kubernetes only when you need high availability.

2. Configure Retention Early

Observability data accumulates fast. Set retention policies before disk fills up.

3. Use Prompt Versioning

Treat prompts like code. Version them, test changes, and deploy confidently.

4. Add Custom Metadata

trace = langfuse.trace(
    name="my-app",
    metadata={
        "environment": "homelab",
        "server": "proxmox-vm-3",
        "model_version": "llama3.2-3b"
    }
)

Filter traces by metadata for powerful debugging.

5. Set Up Alerts for Cost Spikes

If someone accidentally creates an infinite loop in an agent, you want to know before your API bill explodes.
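A simple threshold check against recent usage, run on data you pull from the Langfuse API on a schedule, catches runaway loops early. The usage numbers below are invented:

```python
# Flag a spike when today's token usage exceeds a multiple of the
# recent daily average. Usage figures are invented for illustration.
def is_spike(today: int, recent_days: list, factor: float = 3.0) -> bool:
    baseline = sum(recent_days) / len(recent_days)
    return today > factor * baseline

recent = [12_000, 15_000, 11_000]   # tokens on the last three days
print(is_spike(90_000, recent))     # True -- go investigate that agent loop
```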

When to Choose Cloud Over Self-Hosted

Consider Langfuse Cloud if:

  • You don’t want to manage infrastructure
  • Your team needs zero-setup collaboration
  • You need managed backups and updates
  • You want support contracts

Self-hosted is better when:

  • Data must stay on-premises
  • You have homelab infrastructure ready
  • You want zero per-seat/usage costs
  • Complete control is required

Getting Started Checklist

  • Provision VM: 4 cores, 16 GiB RAM, 100 GiB disk
  • Install Docker and docker-compose-plugin
  • Clone Langfuse repository
  • Update all secrets in docker-compose.yml
  • Run docker compose up -d
  • Access UI at port 3000
  • Create organization and project
  • Install Python or JS SDK
  • Add tracing to your first application
  • Set up prompt versioning
  • Configure retention policies

Conclusion

Langfuse brings enterprise LLM observability to your homelab without the enterprise price tag. With Docker Compose deployment, you can be up and running in minutes, gaining visibility into every token, prompt, and agent execution.

For homelab enthusiasts running local LLMs—whether Ollama, LocalAI, or cloud APIs—the ability to trace, evaluate, and optimize is invaluable. Langfuse self-hosted means complete control, zero usage fees, and your data never leaves your network.

Next Steps: Check out the Langfuse documentation for SDK integration guides, evaluation strategies, and advanced configuration options.


Have questions about self-hosting Langfuse? Join the Langfuse Discord or check the GitHub discussions.

Anthony Lattanzio


Tech Enthusiast & Builder

I'm a tech enthusiast who loves building things with hardware and software. By night, I run a homelab that's grown way beyond what any reasonable person needs. Check out about me for more.
