Integrating LangGraph Agents into Open WebUI

As an AI developer, you often need to demo your machine learning services or products. When it comes to chatbot development, especially for privacy-critical applications, two of the most useful demo tools are Streamlit and Open WebUI.

These tools serve similar purposes but cater to different needs. Open WebUI is a feature-rich, user-friendly platform that looks and feels like ChatGPT and can be hosted completely offline. Streamlit, on the other hand, requires more custom development but provides greater flexibility for designing unique interfaces. Depending on your project’s goals, one might be more suitable than the other.

In this article, I will show you how to:

Host Open WebUI locally for offline operation
Integrate LangGraph agents into it using FastAPI

Step 1: Setting Up an LLM Inference Engine

Before starting, you need an LLM inference backend such as Ollama or vLLM. Open WebUI supports several LLM runners, including Ollama and OpenAI-compatible APIs. If you want a simple and fast setup for demos, Ollama is an excellent choice.

Installing Ollama

Ollama can be installed easily on most systems. Follow the steps below for your platform.

Linux (with curl):

curl -fsSL https://ollama.com/install.sh | sh

macOS and Windows:

Download the installer from https://ollama.com/download and follow the setup instructions.

Once Ollama is installed and running, Open WebUI can connect to it automatically through its local API endpoint (http://localhost:11434 by default).
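As a quick sanity check before wiring up Open WebUI, the sketch below asks the local Ollama server which models it has available. It assumes Ollama's default address (http://localhost:11434) and its /api/tags endpoint; the helper names tags_url, parse_model_names, and list_local_models are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def tags_url(base_url: str = DEFAULT_OLLAMA_URL) -> str:
    """Build the URL of the endpoint that lists locally available models."""
    return base_url.rstrip("/") + "/api/tags"


def parse_model_names(payload: dict) -> list[str]:
    """Extract the model names from a decoded /api/tags JSON response."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = DEFAULT_OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Query the running Ollama server and return the names of its local models."""
    with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

With Ollama running, list_local_models() returns the names of the models you have pulled. A connection error usually means the server is not running or is listening on a different address.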

Step 2: Building the AI Agent with LangGraph

LangGraph is an open-source framework for building AI agents and multi-step reasoning pipelines as graphs of nodes and edges. It supports both simple and multi-step workflows that interact with various tools and data sources.

Here is a high-level structural overview of a Pandas DataFrame Chatbot implemented using LangGraph (you can find the full code here):

import os
import sqlite3
# ... imports go here

df = pd.read_csv(DATAFRAME_ASSISTANT_SAMPLE_CSV_PATH)

llm = ChatOllama(model=OLLAMA_MODEL, temperature=TEMPERATURE)

ValueType = Annotated[
    Literal["default", "pandas_agent", "ml_agent"],
    Field(
        description="Allowed values: 'default' (no additional context is needed), 'pandas_agent' (data analysis of a pandas dataframe "
                    "is required for an accurate answer), or 'ml_agent' (machine learning operations like model training or inference are required)"
    )
]

class RouterModel(BaseModel):
    value: ValueType

structured_llm = llm.with_structured_output(RouterModel)

class State(MessagesState):
    summary: str
    context: str

async def build_graph(
        return_memory: bool = False,
) -> Union[
    CompiledStateGraph,
    Tuple[CompiledStateGraph, SqliteSaver],
]:
    """
    Construct and return a compiled LangGraph StateGraph tailored for conversational reasoning using conditional
    routing between ML Agent and Pandas DataFrame Agent.
    """
    summary_thr = 10
    n_msg_overlap = 2

    def router_system_message(summary_msg: str):
        sys_msg = SystemMessage(content=f"""
            You are an intelligent decision-making agent tasked with routing user queries based on their content.
            ...

            {summary_msg}

            Your response: Output only one of the following options based on the user query: 'default', 'pandas_agent', or 'ml_agent'.
            """)

        return sys_msg

    # Node
    def conversation_node(state: State):
        filtered_messages = [
            msg for msg in state["messages"] if msg.type in {"human", "ai"} and msg.content
        ]
        n_msg = len(filtered_messages)

        if n_msg > summary_thr and (n_msg - 1) % summary_thr == 0:
            messages = filtered_messages[-(summary_thr + 1):]

            conversation_history = []
            for msg in messages:
                # some code goes here populating conversation_history
                ...

            human_ai_messages = "\n".join(conversation_history[:-1])
            summary = state.get("summary")
            if summary:
                summary_message = f"""
                This is a summary of the conversation so far: '{summary}'

                Update and extend the summary ...
                <start>
                {human_ai_messages}
                <end>
                """
            else:
                summary_message = f"""
                Generate a brief summary (3-4 sentences) of the conversation below, ...
                <start>
                {human_ai_messages}
                <end>
                """

            response = llm.invoke(summary_message)

            return {"summary": response.content}

    # Node
    def pandas_agent_node(state: State):
        filtered_messages = [
            msg for msg in state["messages"] if msg.type in {"human", "ai"} and msg.content
        ]
        n_msg = len(filtered_messages)

        # Create Python REPL tool with access to dataframe
        python_tool = PythonAstREPLTool(locals={
            "df": df
        })

        # some code goes here ...

        prompt = f"""You are a helpful assistant that has access to the tools below to answer user questions.

        Available tools:
        - python_tool: A Python shell with access to a pandas DataFrame (df) with the following columns:
        {df.columns.tolist()}

        IMPORTANT: When calling tools, you MUST use the exact JSON format:
        {{"query": "your_python_code_here"}}

        Example tool call:
        {{"query": "df.head()"}}

        Given a user question, write the Python code to answer it and wrap it in the proper JSON tool call format.

        If tools have been used, summarize their outputs clearly in your final response to the user.
        DO NOT respond until you have reviewed the results of any tool invocations.
        Please make your answer as concise as possible.

        {summary_prompt}
        """

        pandas_agent = create_react_agent(
            model=llm,
            tools=[python_tool],
            prompt=prompt,
        )

        try:
            result = pandas_agent.invoke({"messages": messages}, {"recursion_limit": 10})
            return {"messages": result["messages"]}
        except ResponseError as e:
            # some code goes here ...
            ...
        except GraphRecursionError as e:
            # some code goes here ...
            ...

    # Node
    def ml_agent_node(state: State):
        # Create Python REPL tool with access to ML functions
        python_tool = PythonAstREPLTool(locals={
            "build_decision_tree_classifier": build_decision_tree_classifier,
            "model_inference": model_inference,
            "compute_accuracy_metrics": compute_accuracy_metrics,
            "export_decision_tree_to_text": export_decision_tree_to_text
        })

        # some code goes here ...

        prompt = f"""You are a helpful ML assistant that has access to the tools below to answer user questions.

        Available tools:
        - python_tool: A Python shell with access to a function 'build_decision_tree_classifier(target_variable, average)' for building decision tree models,
        'model_inference(feature_values)' for using the trained decision tree model for making predictions, 'compute_accuracy_metrics(target_variable, average)'
        for evaluating a trained decision tree model at user request, and 'export_decision_tree_to_text()' for exporting a trained decision tree model into a
        human-readable text representation containing the decision rules of the tree, formatted as nested if/else statements.

        - To build a decision tree model call "build_decision_tree_classifier(target_variable, 'weighted')".
        - To make inference call ...

        {summary_prompt}
        """

        ml_agent = create_react_agent(
            model=llm,
            tools=[python_tool],
            prompt=prompt,
        )

        try:
            result = ml_agent.invoke({"messages": messages}, {"recursion_limit": 10})
            return {"messages": result["messages"]}
        except ResponseError as e:
            # some code goes here ...
            ...
        except GraphRecursionError as e:
            # some code goes here ...
            ...

    def router(state: State) -> Literal['pandas_agent', 'ml_agent']:
        # some code goes here ...

        try:
            # some more code goes here ...
            result = structured_llm.invoke([router_system_message(summary_prompt)] + messages[-1:])

            if result is None:
                raise ValueError("LLM returned None!")
        except (KeyError, ValidationError, ValueError, TypeError) as e:
            logger.warning(f"Unexpected router output: {e}. Falling back to 'pandas_agent'.")
            result = RouterModel(value="pandas_agent")

        if result.value == 'ml_agent':
            return 'ml_agent'
        else:
            return 'pandas_agent'  # we forward 'default' -> 'pandas_agent'

    # Sqlite checkpointer
    sqlite_db_path = SQLITE_CHECKPOINTER_DB_PATH.as_posix()
    os.makedirs(os.path.dirname(sqlite_db_path), exist_ok=True)
    conn = sqlite3.connect(sqlite_db_path, check_same_thread=False)
    memory = SqliteSaver(conn)

    # Graph
    builder = StateGraph(State)

    # Define nodes: these do the work
    builder.add_node("conversation", conversation_node)
    builder.add_node("pandas_agent", pandas_agent_node)
    builder.add_node("ml_agent", ml_agent_node)

    # Define edges: these determine how the control flow moves
    builder.add_edge(START, "conversation")
    builder.add_conditional_edges(
        "conversation",
        router,
    )
    builder.add_edge("pandas_agent", END)
    builder.add_edge("ml_agent", END)

    aconn = await aiosqlite.connect(sqlite_db_path)
    saver = AsyncSqliteSaver(aconn)

    graph = builder.compile(checkpointer=saver)

    if return_memory:
        return graph, memory

    return graph

This code defines a LangGraph-based conversational system that routes user queries to specialized agents depending on the task type.
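Two pieces of control logic in the listing can be checked in isolation, without an LLM: the condition that decides when conversation_node refreshes the summary, and the router's mapping of the structured output onto a graph node. The sketch below restates both as plain functions. The names should_summarize and pick_node are hypothetical helpers introduced here for illustration, and the threshold of 10 mirrors summary_thr in the listing.

```python
SUMMARY_THR = 10  # mirrors summary_thr in build_graph


def should_summarize(n_msg: int, summary_thr: int = SUMMARY_THR) -> bool:
    """Same condition as in conversation_node: fire once the count of
    human/AI messages exceeds the threshold, then every summary_thr messages."""
    return n_msg > summary_thr and (n_msg - 1) % summary_thr == 0


def pick_node(value: str) -> str:
    """Same mapping as in router: only 'ml_agent' reaches the ML agent;
    'default' and 'pandas_agent' both go to the pandas agent."""
    return "ml_agent" if value == "ml_agent" else "pandas_agent"


# Message counts (out of the first 34) at which a summary would be generated.
trigger_points = [n for n in range(1, 35) if should_summarize(n)]
print(trigger_points)  # [11, 21, 31]
```

So with a threshold of 10, the summary is first produced at the 11th message and refreshed every 10 messages after that, and a 'default' routing decision ends up at the pandas agent.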

At a high level:

It loads a sample CSV file into a pandas DataFrame and initializes a large language model (llm) using ChatOllama.
A router model is defined to decide whether a query should go to a pandas data analysis agent, a machine learning agent, or be treated as a default conversation (which this implementation forwards to the pandas agent).