Using the RLM Client

RLM is the main class for recursive language model completions. It enables LMs to programmatically examine, decompose, and recursively call themselves over their input.

Quick Example

from rlm import RLM

rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
)
result = rlm.completion("Your prompt here")
print(result.response)

Constructor

RLM(
    backend: str = "openai",
    backend_kwargs: dict | None = None,
    environment: str = "local",
    environment_kwargs: dict | None = None,
    depth: int = 0,
    max_depth: int = 1,
    max_iterations: int = 30,
    custom_system_prompt: str | None = None,
    other_backends: list[str] | None = None,
    other_backend_kwargs: list[dict] | None = None,
    logger: RLMLogger | None = None,
    verbose: bool = False,
)
backend (str, default: "openai")

LM provider to use for completions.

Value          Provider
"openai"       OpenAI API
"anthropic"    Anthropic API
"portkey"      Portkey AI gateway
"openrouter"   OpenRouter
"litellm"      LiteLLM (multi-provider)
"vllm"         Local vLLM server
backend_kwargs (dict | None, default: None)

Provider-specific configuration (API keys, model names, etc.).

# OpenAI / Anthropic
backend_kwargs={
    "api_key": "...",
    "model_name": "gpt-5-mini",
}

# vLLM (local)
backend_kwargs={
    "base_url": "http://localhost:8000/v1",
    "model_name": "meta-llama/Llama-3-70b",
}

# Portkey
backend_kwargs={
    "api_key": "...",
    "model_name": "@openai/gpt-5-mini",
}
environment (str, default: "local")

Code execution environment for REPL interactions.

Value       Description
"local"     Same-process with sandboxed builtins
"docker"    Docker container
"modal"     Modal cloud sandbox
environment_kwargs (dict | None, default: None)

Environment-specific configuration.

# Docker
environment_kwargs={"image": "python:3.11-slim"}

# Modal
environment_kwargs={
    "app_name": "my-app",
    "timeout": 600,
}

# Local
environment_kwargs={"setup_code": "import numpy as np"}
max_iterations (int, default: 30)

Maximum REPL iterations before forcing a final answer.

max_depth (int, default: 1)

Maximum recursion depth. When depth >= max_depth, the call falls back to a regular LM completion.

Note: Deeper recursion is a TODO. Only max_depth=1 is currently supported.

custom_system_prompt (str | None, default: None)

Override the default RLM system prompt.
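
For instance, a sketch of passing an override; the prompt text here is purely illustrative:

# Replace the default RLM system prompt with a custom one
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    custom_system_prompt="You are a careful analyst. Decompose the task before answering.",
)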

other_backends (list[str] | None, default: None)

Additional backends available for sub-LM calls within the REPL.

rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    other_backends=["anthropic"],
    other_backend_kwargs=[{"model_name": "claude-sonnet-4-20250514"}],
)
other_backend_kwargs (list[dict] | None, default: None)

Configurations for other_backends; entries must be in the same order as other_backends.

logger (RLMLogger | None, default: None)

Logger for saving RLM execution trajectories to JSON-lines files.

from rlm.logger import RLMLogger

logger = RLMLogger(log_dir="./logs")
rlm = RLM(..., logger=logger)
verbose (bool, default: False)

Enable rich console output showing iterations, code execution, and results.
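
As a sketch, verbose output pairs naturally with a tighter iteration budget when debugging (the values are illustrative):

# Print each iteration, code execution, and result to the console
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    max_iterations=10,
    verbose=True,
)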

Methods

completion()

Main method for RLM completions. Executes the recursive loop and returns the final result.

The method returns an RLMChatCompletion object containing the final response, execution metadata, and usage statistics.

rlm.completion(
    prompt: str | dict,
    root_prompt: str | None = None,
) -> RLMChatCompletion

Arguments

Name           Type           Description
prompt         str | dict     Input context (becomes context variable in REPL)
root_prompt    str | None     Optional hint visible only to the root LM call
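
For example (the prompt strings below are illustrative):

# The prompt becomes the `context` variable inside the REPL;
# root_prompt is an extra hint shown only to the root LM call.
result = rlm.completion(
    prompt="<a long document or structured dict goes here>",
    root_prompt="Summarize the key findings in three bullet points.",
)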

Returns

RLMChatCompletion object with:

Attribute         Type            Description
response          str             Final answer from the RLM
execution_time    float           Total execution time in seconds
usage_summary     UsageSummary    Aggregated token usage across all LM calls
root_model        str             Model name used for root completion
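
For example, inspecting the returned object:

result = rlm.completion("Your prompt here")
print(result.response)          # final answer
print(result.execution_time)    # total seconds spent
print(result.root_model)        # model used for the root completion
print(result.usage_summary)     # aggregated token usage across all LM calls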