IPythonREPL

IPythonREPL executes code inside a real IPython session instead of plain exec(). It supports two kernel modes — an in-process shell that runs in the same Python process as the RLM (the default, fastest), and a subprocess kernel that runs a real ipykernel in a separate Python process for hard cell timeouts and full namespace isolation from the RLM host. Both modes give the LM access to IPython's full surface (cell magics, rich repr, line tracebacks).

Prerequisite: install the optional extra:

```shell
pip install 'rlms[ipython]'
# or with uv:
# uv pip install -e ".[ipython]"
```

```python
from rlm import RLM

# In-process (default kernel_mode): same process, fast.
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    environment="ipython",
    environment_kwargs={
        "kernel_mode": "in_process",
        "cell_timeout": 30,         # SIGALRM-based; Unix main thread only
    },
)

# Subprocess: separate Python process, hard timeouts, full isolation.
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5-mini"},
    environment="ipython",
    environment_kwargs={
        "kernel_mode": "subprocess",
        "cell_timeout": 30,         # Hard guarantee via interrupt_kernel
        "startup_timeout": 60,
        "max_concurrent_subcalls": 4,
    },
)
```

Arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| `kernel_mode` | `"in_process" \| "subprocess"` | `"in_process"` | Where the IPython session runs |
| `cell_timeout` | `float \| None` | `None` | Per-cell timeout in seconds; `None` disables |
| `startup_timeout` | `float` | `60.0` | Subprocess kernel boot timeout |
| `subcall_timeout` | `float \| None` | `None` | Per-request kernel→broker socket timeout (subprocess) |
| `max_concurrent_subcalls` | `int` | `4` | Global cap on concurrent `subcall_fn` invocations |
| `setup_code` | `str` | `None` | Code to run at initialization |
| `custom_tools` | `dict` | `None` | Functions / values injected into the namespace |
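As a concrete sketch of the last two arguments, `setup_code` and `custom_tools` let the host preload the session's namespace. The helper name `word_count` and the constant `TAU` below are illustrative, not part of the library:

```python
import math

def word_count(text: str) -> int:
    """Example host-side tool exposed to the LM's namespace."""
    return len(text.split())

# Passed as environment_kwargs={...} to the RLM constructor shown above.
environment_kwargs = {
    "kernel_mode": "in_process",
    "setup_code": "import json, re",   # runs once, at session init
    "custom_tools": {"word_count": word_count, "TAU": 2 * math.pi},
}
```

In subprocess mode the same `custom_tools` values must survive dill pickling, so prefer module-level functions over closures there.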

In-process vs. subprocess

| | `in_process` | `subprocess` |
| --- | --- | --- |
| Process | Same as host | Separate Python via ipykernel |
| Subcall path | Direct Python call | TCP broker (4-byte length-prefixed JSON) |
| `cell_timeout` | Best-effort SIGALRM (Unix, main thread) | Hard, via `interrupt_kernel` |
| Cell magics (`%%timeit`, …) | Yes | Yes |
| `input()` | Disabled (raises) | Disabled (`allow_stdin=False`) |
| Isolation from host | Shares stdout/stderr/cwd/SIGALRM | Full process isolation |
| Custom tool injection | Direct namespace inject | Pickled with dill over ZMQ |

How It Works

In-process

  1. Creates a fresh InteractiveShell with a per-instance user module (so multiple in-process REPLs don't share sys.modules['__main__']) and IPython's history database disabled.
  2. Injects scaffold helpers (llm_query, rlm_query, FINAL_VAR, SHOW_VARS) and a stubbed input() into user_ns.
  3. execute_code runs each cell via shell.run_cell under an RLock; cell_timeout is enforced with SIGALRM + setitimer.
  4. rlm_query calls subcall_fn directly, gated by a per-instance semaphore so kernel-side fan-out can't exceed max_concurrent_subcalls.

Subprocess

  1. Starts a TCP broker on 127.0.0.1:0 (ephemeral port).
  2. Launches an ipykernel subprocess pinned to the host's sys.executable (so it inherits the same site-packages — important for dill, custom imports, etc.).
  3. Bootstraps kernel-side scaffold helpers that route llm_query to the LM Handler and rlm_query / FINAL_VAR to the broker over the 4-byte-prefixed JSON protocol.
  4. Each user cell first sets a unique _RLM_CURRENT_CELL cell-id in the kernel via a separate execute_interactive call (so cell magics still work). Every broker request carries this id.
  5. cell_timeout is enforced by kc.execute_interactive(timeout=…) + km.interrupt_kernel(). Subcall completions whose subcall_fn finishes after the originating cell timed out are stored under that cell's id and discarded as stale on the next drain — they aren't misattributed to a later cell.

```
┌──────────────────────────────────────────┐
│ Host (RLM process)                       │
│  ┌─────────────┐ Socket ┌──────────────┐ │
│  │ Subcall     │◄──────►│  LM Handler  │ │
│  │ broker      │        └──────────────┘ │
│  │ (TCP + JSON)│                         │
│  └─────┬───────┘                         │
└────────┼─────────────────────────────────┘
         │ ZMQ (jupyter_client)
┌────────┼─────────────────────────────────┐
│ ipykernel subprocess                     │
│  ┌─────▼────────┐                        │
│  │  IPython     │ rlm_query() / FINAL_VAR│
│  │  user_ns     │  → broker over TCP     │
│  │  (cell_id    │ llm_query()            │
│  │   tagged)    │  → LM Handler          │
│  └──────────────┘                        │
└──────────────────────────────────────────┘
```
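The "4-byte length-prefixed JSON" framing on the broker socket can be sketched like this. This is an illustrative implementation assuming a big-endian unsigned 32-bit length header; the library's exact wire format (field names, endianness) may differ:

```python
import json
import socket
import struct

def send_msg(sock: socket.socket, obj) -> None:
    """Serialize obj as JSON and prefix it with a 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        buf.extend(chunk)
    return bytes(buf)

def recv_msg(sock: socket.socket):
    """Read one length-prefixed frame and decode the JSON payload."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))
```

The length prefix is what lets the broker delimit messages on a plain TCP stream; a request carrying the cell id might look like `{"op": "rlm_query", "cell_id": ..., "prompt": ...}`.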

Notable behavior

  • Per-instance serialization. execute_code takes an RLock, so concurrent calls from different threads are serialized within an instance.
  • Global subcall cap. max_concurrent_subcalls bounds total in-flight subcall_fn invocations on the instance — even if user code spawns kernel-side threads that each fan out a batch.
  • Reentry guard. If subcall_fn calls execute_code back on the parent REPL (or a cell reaches the parent's execute_code via rlm_query.__self__ in in-process mode), the call raises RuntimeError instead of deadlocking on the cell lock or corrupting the in-flight cell's tracking. subcall_fn should spawn a child REPL instead.
  • Cell-id attribution. Subcall completions are tagged with the originating cell's id so a slow subcall_fn that finishes after its cell timed out is never counted under a later cell. Long-lived kernel threads that call rlm_query after their spawning cell ends will, however, be tagged with whatever cell is active at call time.
  • In-process is not isolated. Two in-process instances each get a unique __main__ substitute, but they still share the host's stdout/stderr/cwd/SIGALRM. Use subprocess if you need true isolation.

When to use which mode

  • in_process — fastest path, no IPC, fine for trusted code, development, short-lived cells. cell_timeout is best-effort (Unix main thread only).
  • subprocess — when you need a hard cell_timeout guarantee, or want full namespace / signal / cwd isolation between the LM's code and the RLM host.