The Definitive Tool for Testing LangGraph and CrewAI Agents
Testing complex AI agent workflows is tough. You're often left sifting through terminal logs, trying to piece together a narrative from scattered print statements. It's time-consuming and makes debugging a chore, especially when your agents involve multiple steps, tools, and branching logic. What if you could see the entire execution flow, understand decisions, and inspect state changes visually?
That's exactly what eval-view sets out to solve. It's a new open-source tool designed specifically for developers building with LangGraph and CrewAI, turning the opaque process of agent execution into something you can actually see and understand.
What It Does
eval-view is a visual debugging and evaluation dashboard for AI agent frameworks. In simple terms, it hooks into your LangGraph or CrewAI runs and gives you a clean, web-based interface to replay exactly what happened. Instead of a linear log, you get a detailed, step-by-step trace of the entire workflow.
You can see which nodes (or agents) were executed, in what order, what their inputs and outputs were, and the state of the graph at any point in time. It captures the context, the tool calls, the reasoning—everything you need to figure out why your agent took a weird turn or where it excelled.
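To make that concrete, a single captured step in such a trace might look something like the following. This is an illustrative sketch only; the field names are hypothetical and not eval-view's actual schema.

```python
import json

# Hypothetical shape of one step in an agent execution trace.
# Field names are illustrative, not eval-view's real schema.
trace_step = {
    "step": 3,
    "node": "research_agent",  # which node/agent executed
    "input": {"query": "example user question"},
    "output": {"summary": "example answer text"},
    "tool_calls": [
        {"tool": "web_search", "args": {"q": "example search"}}
    ],
    "state_after": {"messages_count": 7},  # snapshot of graph state
}

# Traces are typically serialized to JSON so a dashboard can replay them.
serialized = json.dumps(trace_step, indent=2)
print(serialized.splitlines()[0])
```

A record like this, one per node execution, is enough for a viewer to reconstruct the order of steps, the data flowing between them, and the state at each point.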
Why It's Cool
The magic of eval-view isn't just that it visualizes runs; it's how seamlessly it integrates and how much insight it provides. For LangGraph, it renders the actual graph structure of your state machine, letting you click through the execution path. For CrewAI, it lays out the task sequence and agent contributions clearly.
What makes it stand out is its focus on the developer experience. It's not a heavy, monolithic service. You can run it locally, point it at your trace data (which it helps you generate), and immediately start inspecting. This turns the iterative process of tweaking prompts, tools, and graph logic from a guessing game into a methodical debugging session. You can compare runs, see where costs might be ballooning from unnecessary tool calls, and validate that your agents are following the intended reasoning process.
How to Try It
Getting started is straightforward. The project is on GitHub, and the README has clear setup instructions.
- Clone the repo:

  ```bash
  git clone https://github.com/hidai25/eval-view.git
  cd eval-view
  ```

- Install dependencies: It's a Python project, so the usual `pip install -r requirements.txt` applies.

- Instrument your agent: You'll need to add a callback or tracer to your LangGraph or CrewAI code to export the execution data. The repository provides clear examples for both frameworks. Essentially, you're adding a few lines to your existing code to capture the trace.

- Launch the dashboard: Run the provided Streamlit app, and load your generated trace file. Suddenly, you have a full interactive view of your agent's run.
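The instrumentation step above can be sketched in plain Python. This is a minimal, framework-agnostic illustration of the idea, not eval-view's actual API; the class and method names (`TraceRecorder`, `on_step`) are invented for the example, and the real hooks for LangGraph and CrewAI are in the repository's docs.

```python
import json
from datetime import datetime, timezone

class TraceRecorder:
    """Minimal sketch of a tracer you might wire into a LangGraph node
    or CrewAI task callback. Names here are hypothetical, not the
    actual eval-view API."""

    def __init__(self, path="trace.json"):
        self.path = path
        self.events = []

    def on_step(self, node, inputs, outputs):
        # Record one node/agent execution with a timestamp.
        self.events.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "node": node,
            "inputs": inputs,
            "outputs": outputs,
        })

    def save(self):
        # Write the full trace to disk so a dashboard can load and replay it.
        with open(self.path, "w") as f:
            json.dump(self.events, f, indent=2)

recorder = TraceRecorder()
recorder.on_step("planner", {"goal": "summarize repo"}, {"plan": ["clone", "read"]})
recorder.on_step("executor", {"plan": ["clone", "read"]}, {"result": "done"})
recorder.save()
print(len(recorder.events))  # 2
```

The key design point is that the recorder is passive: your agent code calls it at each step, and everything it knows ends up in one JSON file that a viewer can replay offline.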
Head over to the eval-view GitHub repository for the detailed code snippets and to see the interface in action.
Final Thoughts
As someone who's wrestled with debugging multi-step AI agents, a tool like eval-view feels like turning on the lights. The visual, replayable trace it provides is infinitely more useful than console logging for understanding non-linear workflows. If you're building anything moderately complex with LangGraph or CrewAI—whether it's a research assistant, a coding partner, or an automated workflow—integrating this early will save you hours of frustration. It moves agent development from "hoping it works" to actually seeing how it works.
Follow for more interesting projects: @githubprojects
Repository: https://github.com/hidai25/eval-view