LLM Executor¶

The LLM executor is a pluggable responder that handles llm.request events by calling a user-supplied LLM function, then emits llm.response events with the result.

LLMExecutorResponder¶

`kando.responders.llm_executor.LLMExecutorResponder(llm_fn)` ¶

Factory: returns a Responder that calls llm_fn for every llm.request event.

Usage

def my_llm(messages, model, max_tokens): ... # call Anthropic/OpenAI/etc return response_text, cost_usd

responders = create_kit() + [LLMExecutorResponder(my_llm)]

Source code in kando/responders/llm_executor.py

def LLMExecutorResponder(llm_fn: LLMFn) -> Responder:
    """Factory: returns a Responder that calls llm_fn for every llm.request event.

    Usage:
        def my_llm(messages, model, max_tokens):
            ...  # call Anthropic/OpenAI/etc
            return response_text, cost_usd

        responders = create_kit() + [LLMExecutorResponder(my_llm)]
    """
    return Responder(
        name="llm_executor",
        pattern=frozenset({LLM_REQUEST}),
        fn=_make_executor_fn(llm_fn),
    )

LLMFn signature¶

LLMFn = Callable[[list[dict], str, int], tuple[str, float]]
#                  messages   model max_tokens  -> (text, cost_usd)

The function receives:

Parameter	Type	Description
`messages`	`list[dict]`	Chat messages (OpenAI/Anthropic format)
`model`	`str`	Model identifier from the `llm.request` event
`max_tokens`	`int`	Max response tokens

It must return a (response_text, cost_usd) tuple.

Built-in implementations¶

Anthropic¶

Uses the Anthropic SDK directly. Activated when ANTHROPIC_API_KEY is set.

from kando.responders.anthropic_llm import anthropic_llm
from kando.responders.llm_executor import LLMExecutorResponder

responders = create_kit() + [LLMExecutorResponder(anthropic_llm)]

OpenRouter¶

Routes through the OpenRouter API. Activated when OPENROUTER_API_KEY is set (takes priority over Anthropic).

from kando.responders.openrouter_llm import openrouter_llm
from kando.responders.llm_executor import LLMExecutorResponder

responders = create_kit() + [LLMExecutorResponder(openrouter_llm)]

Set OPENROUTER_MODEL to override the default model (anthropic/claude-haiku-4-5).

Usage¶

from kando.responders.llm_executor import LLMExecutorResponder

# Custom LLM function
def my_llm(messages: list[dict], model: str, max_tokens: int) -> tuple[str, float]:
    # Call your LLM provider
    response = call_api(messages=messages, model=model, max_tokens=max_tokens)
    return response.text, response.cost

responders = create_kit() + [LLMExecutorResponder(my_llm)]
runtime = Runtime(ledger=store, responders=responders)
world = runtime.run(seed)

Caching¶

The executor checks world.context["cache"] (an LLMCache) before calling the LLM function. If a cached response exists for the same (messages, model, max_tokens) triple, it is served without an API call. New responses are written to the cache automatically.

How the CLI wires it¶

When you run kando run, the CLI auto-attaches an LLMExecutorResponder if an API key is available:

OPENROUTER_API_KEY → uses openrouter_llm
ANTHROPIC_API_KEY → uses anthropic_llm
Neither set → no executor attached; llm.request events are logged but unanswered