IPythonREPL

IPythonREPL executes code inside a real IPython session instead of plain exec(). It supports two kernel modes — an in-process shell that runs in the same Python process as the RLM (the default, fastest), and a subprocess kernel that runs a real ipykernel in a separate Python process for hard cell timeouts and full namespace isolation from the RLM host. Both modes give the LM access to IPython's full surface (cell magics, rich repr, line tracebacks).

Prerequisite: install the optional extra:

```shell
pip install 'rlms[ipython]'
# or with uv:
# uv pip install -e ".[ipython]"
```

```python
from rlm import RLM

# In-process (default kernel_mode): same process, fast.
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    environment="ipython",
    environment_kwargs={
        "kernel_mode": "in_process",
        "cell_timeout": 30,         # SIGALRM-based; Unix main thread only
    },
)

# Subprocess: separate Python process, hard timeouts, full isolation.
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    environment="ipython",
    environment_kwargs={
        "kernel_mode": "subprocess",
        "cell_timeout": 30,         # Hard guarantee via interrupt_kernel
        "startup_timeout": 60,
        "max_concurrent_subcalls": 4,
    },
)
```

Arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `kernel_mode` | `"in_process" \| "subprocess"` | `"in_process"` | Where the IPython session runs |
| `cell_timeout` | `float \| None` | `None` | Per-cell timeout in seconds; `None` disables |
| `startup_timeout` | `float` | `60.0` | Subprocess kernel boot timeout |
| `subcall_timeout` | `float \| None` | `None` | Per-request kernel→broker socket timeout (subprocess) |
| `max_concurrent_subcalls` | `int` | `4` | Global cap on concurrent `subcall_fn` invocations |
| `setup_code` | `str` | `None` | Code to run at initialization |
| `custom_tools` | `dict` | `None` | Functions / values injected into the namespace |
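As a concrete sketch of the last two arguments, `setup_code` and `custom_tools` let the host preload the session's namespace. The helper name `word_count` and the constant `TAU` below are illustrative, not part of the library:

```python
import math

def word_count(text: str) -> int:
    """Example host-side tool exposed to the LM's namespace."""
    return len(text.split())

# Passed as environment_kwargs={...} to the RLM constructor shown above.
environment_kwargs = {
    "kernel_mode": "in_process",
    "setup_code": "import json, re",   # runs once, at session init
    "custom_tools": {"word_count": word_count, "TAU": 2 * math.pi},
}
```

In subprocess mode the same `custom_tools` values must survive dill pickling, so prefer module-level functions over closures there.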

In-process vs. subprocess

| | `in_process` | `subprocess` |
| --- | --- | --- |
| Process | Same as host | Separate Python via ipykernel |
| Subcall path | Direct Python call | TCP broker (4-byte length-prefixed JSON) |
| `cell_timeout` | Best-effort SIGALRM (Unix, main thread) | Hard, via `interrupt_kernel` |
| Cell magics (`%%timeit`, …) | Yes | Yes |
| `input()` | Disabled (raises) | Disabled (`allow_stdin=False`) |
| Isolation from host | Shares stdout/stderr/cwd/SIGALRM | Full process isolation |
| Custom tool injection | Direct namespace inject | Pickled with dill over ZMQ |

How It Works

In-process

  1. Creates a fresh InteractiveShell with a per-instance user module (so multiple in-process REPLs don't share sys.modules['__main__']) and IPython's history database disabled.
  2. Injects scaffold helpers (llm_query, rlm_query, FINAL_VAR, SHOW_VARS) and a stubbed input() into user_ns.
  3. execute_code runs each cell via shell.run_cell under an RLock; cell_timeout is enforced with SIGALRM + setitimer.
  4. rlm_query calls subcall_fn directly, gated by a per-instance semaphore so kernel-side fan-out can't exceed max_concurrent_subcalls.

Subprocess

  1. Starts a TCP broker on 127.0.0.1:0 (ephemeral port).
  2. Launches an ipykernel subprocess pinned to the host's sys.executable (so it inherits the same site-packages — important for dill, custom imports, etc.).
  3. Bootstraps kernel-side scaffold helpers that route llm_query to the LM Handler and rlm_query / FINAL_VAR to the broker over the 4-byte-prefixed JSON protocol.
  4. Each user cell first sets a unique _RLM_CURRENT_CELL cell-id in the kernel via a separate execute_interactive call (so cell magics still work). Every broker request carries this id.
  5. cell_timeout is enforced by kc.execute_interactive(timeout=…) + km.interrupt_kernel(). Subcall completions whose subcall_fn finishes after the originating cell timed out are stored under that cell's id and discarded as stale on the next drain — they aren't misattributed to a later cell.

```
┌──────────────────────────────────────────┐
│ Host (RLM process)                       │
│  ┌─────────────┐ Socket ┌──────────────┐ │
│  │ Subcall     │◄──────►│  LM Handler  │ │
│  │ broker      │        └──────────────┘ │
│  │ (TCP + JSON)│                         │
│  └─────┬───────┘                         │
└────────┼─────────────────────────────────┘
         │ ZMQ (jupyter_client)
┌────────┼─────────────────────────────────┐
│ ipykernel subprocess                     │
│  ┌─────▼────────┐                        │
│  │  IPython     │ rlm_query() / FINAL_VAR│
│  │  user_ns     │  → broker over TCP     │
│  │  (cell_id    │ llm_query()            │
│  │   tagged)    │  → LM Handler          │
│  └──────────────┘                        │
└──────────────────────────────────────────┘
```
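The "4-byte length-prefixed JSON" framing on the broker socket can be sketched like this. This is an illustrative implementation assuming a big-endian unsigned 32-bit length header; the library's exact wire format (field names, endianness) may differ:

```python
import json
import socket
import struct

def send_msg(sock: socket.socket, obj) -> None:
    """Serialize obj as JSON and prefix it with a 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        buf.extend(chunk)
    return bytes(buf)

def recv_msg(sock: socket.socket):
    """Read one length-prefixed frame and decode the JSON payload."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))
```

The length prefix is what lets the broker delimit messages on a plain TCP stream; a request carrying the cell id might look like `{"op": "rlm_query", "cell_id": ..., "prompt": ...}`.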

Notable behavior

  • Per-instance serialization. execute_code takes an RLock, so concurrent calls from different threads are serialized within an instance.
  • Global subcall cap. max_concurrent_subcalls bounds total in-flight subcall_fn invocations on the instance — even if user code spawns kernel-side threads that each fan out a batch.
  • Reentry guard. If subcall_fn calls execute_code back on the parent REPL (or a cell reaches the parent's execute_code via rlm_query.__self__ in in-process mode), the call raises RuntimeError instead of deadlocking on the cell lock or corrupting the in-flight cell's tracking. subcall_fn should spawn a child REPL instead.
  • Cell-id attribution. Subcall completions are tagged with the originating cell's id so a slow subcall_fn that finishes after its cell timed out is never counted under a later cell. Long-lived kernel threads that call rlm_query after their spawning cell ends will, however, be tagged with whatever cell is active at call time.
  • In-process is not isolated. Two in-process instances each get a unique __main__ substitute, but they still share the host's stdout/stderr/cwd/SIGALRM. Use subprocess if you need true isolation.

When to use which mode

  • in_process — fastest path, no IPC, fine for trusted code, development, short-lived cells. cell_timeout is best-effort (Unix main thread only).
  • subprocess — when you need a hard cell_timeout guarantee, or want full namespace / signal / cwd isolation between the LM's code and the RLM host.