Agentic Hub
Evals & Observability · TypeScript

langfuse

langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

Stars: 25k
Forks: 2.5k
Last push: today
License: NOASSERTION
Good first issues: 0
Help wanted: 0

Topics

analytics, autogen, evaluation, langchain, large-language-models, llama-index, llm, llm-evaluation
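
For a sense of how the integrations listed above are used, here is a minimal tracing sketch with the langfuse TypeScript SDK. The keys, model name, and input/output values are placeholders, not real project data:

```ts
import { Langfuse } from "langfuse";

async function main() {
  // Keys come from the project settings in Langfuse Cloud or a self-hosted instance.
  const langfuse = new Langfuse({
    publicKey: process.env.LANGFUSE_PUBLIC_KEY,
    secretKey: process.env.LANGFUSE_SECRET_KEY,
    baseUrl: "https://cloud.langfuse.com", // or your self-hosted URL
  });

  // One trace per user request; generations and spans nest under it.
  const trace = langfuse.trace({ name: "support-chat", userId: "user-123" });

  const generation = trace.generation({
    name: "answer",
    model: "gpt-4o-mini", // placeholder model name
    input: [{ role: "user", content: "How do I reset my password?" }],
  });

  // ... call your LLM here (OpenAI SDK, LiteLLM, Langchain, etc.) ...

  generation.end({ output: "You can reset it under Settings -> Security." });

  // Flush buffered events before the process exits.
  await langfuse.shutdownAsync();
}

main();
```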

More Evals & Observability


promptfoo

promptfoo/promptfoo
Evals & Observability

Test your prompts, agents, and RAG pipelines. Red-team, pentest, and vulnerability-scan AI applications. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command-line and CI/CD integration (see the config sketch below). Used by OpenAI and Anthropic.

20k stars · 1.7k forks · TypeScript · last push today
No open beginner issues
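
Those declarative configs are plain YAML. A minimal sketch, with placeholder prompt, providers, and assertion values (illustrative, not copied from the promptfoo docs):

```yaml
# promptfooconfig.yaml -- all values below are illustrative
prompts:
  - "Answer concisely: {{question}}"

providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-haiku-20241022

tests:
  - vars:
      question: "What port does HTTPS use by default?"
    assert:
      - type: contains
        value: "443"
```

Running `npx promptfoo@latest eval` in the same directory executes the test matrix against both providers and prints a side-by-side comparison.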

opik

comet-ml/opik
Evals & Observability

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

19k stars · 1.4k forks · Python · last push today
No open beginner issues

evals

openai/evals
Evals & Observability

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

18k stars · 2.9k forks · Python · last push 2d ago
No open beginner issues
Agentic Hub

Your launchpad into the agentic AI open source ecosystem. AI-curated projects, beginner scores, and live status. Updated weekly.


Resources

  • Good First Issue
  • Up For Grabs
  • GitHub: ai-agents

MIT License. Built with Next.js on Cloudflare Pages. Data refreshed weekly via GitHub Actions + AI enrichment.
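
As a rough illustration of that weekly refresh, a scheduled GitHub Actions workflow could look like the sketch below; the workflow name and the `refresh-data` script are hypothetical, not the site's actual pipeline:

```yaml
# .github/workflows/refresh.yml -- hypothetical sketch, not the real pipeline
name: weekly-data-refresh
on:
  schedule:
    - cron: "0 6 * * 1" # Mondays, 06:00 UTC
  workflow_dispatch: {}

jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # Hypothetical script: fetch GitHub stats, run AI enrichment, commit results.
      - run: npm run refresh-data
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```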
