mlx-code

Community Article Published June 6, 2026

Upvote

Josef Albers

JosefAlbers

A lightweight coding agent built on Apple's MLX framework.

Features

Composable by design: Agent, Tool, and the REPL are separate pieces you can import and wire together however you like
Swappable backends: point the harness at the local MLX server, a remote provider, or any OpenAI-compatible endpoint without changing anything else
Git worktree isolation: every session gets a fresh worktree so the agent can't silently corrupt your working tree
9 built-in tools: Read, Write, Edit, Bash, Grep, Find, Ls, Skill, Agent
Interactive REPL commands: /clear, /history, /tools, /branch, /abort

Quick Start

pip install mlx-code
mlc

Command Line

`mlc`: local server + harness

Starts the MLX inference server and launches a harness against it.

# Default: local MLX server + built-in REPL harness
mlc

# Use a different harness (routes traffic through the local server)
mlc --leash claude
mlc --leash gemini
mlc --leash codex

# Server only, no harness
mlc --leash none

# Specify a model
mlc --model mlx-community/Qwen3.5-4B-OptiQ-4bit

# Restrict the tools available to the agent
mlc --tools Read Write Bash

# Custom system prompt
mlc --system "You are a helpful assistant."

# Load skills from a directory (scans recursively for SKILL.md files)
mlc --skill ./my-skills

# Resume a previous session from a git commit hash
mlc --resume <commit-hash>

# Because `mlc` reads from stdin when it isn't a TTY, it composes naturally with shell pipes:
echo "Here's the solution you proposed: <excerpt>$(mlc -p "write code for a chrome extension to play youtube x5 speed")</excerpt> Now argue against it. What are the edge cases this doesn't handle? What assumptions did you make that might not hold in a production system? What would you change if you knew this code would be read by a senior engineer in a security audit?" | mlc

`mlc-run`: harness only

Runs the agent harness against an already-running server or a remote provider.

# Connect to a local server at 127.0.0.1:8000 (default)
mlc-run

# Remote providers
mlc-run --api claude
mlc-run --api gemini
mlc-run --api deepseek --model deepseek-v4-pro
mlc-run --api codex

# Custom endpoint
echo "explain lsp.py" | mlc-run -a deepseek | cat - PLAN.md | mlc-run --url http://localhost:9000

Using as a Library

Import the pieces you need to build background workers, scheduled jobs, or event-triggered handlers.

Spawn an agent from Python

import asyncio
from mlx_code.repl import Agent

async def main():
    agent = Agent(system="You are a concise technical writer.")
    await agent.run("Summarise all *.py files changed in the last 7 days. Save to digest.md.")

asyncio.run(main())

Multi-agent pipeline

import asyncio
from mlx_code.repl import Agent

async def main():
    researcher = Agent(system="You are a research assistant.")
    await researcher.run("Research PBFT consensus. Save a structured summary to kb/draft.md.")

    reviewer = Agent(system="You are a critical reviewer.")
    await reviewer.run(
        "Read kb/draft.md. Write a one-paragraph critique to kb/critique.md. "
        "Use only information in that file."
    )

asyncio.run(main())

Parallel workers with `asyncio.gather`

import asyncio
from mlx_code.repl import Agent

async def main():
    topics = ["history", "algorithms", "industry_usage"]
    agents = [Agent() for _ in topics]
    await asyncio.gather(*[
        a.run(f"Research the {t} of Byzantine Fault Tolerance. Save to kb/{t}.md.")
        for a, t in zip(agents, topics)
    ])
    reducer = Agent()
    await reducer.run("Read all files in kb/. Synthesise into final_report.md.")

asyncio.run(main())

Resume a session from a git commit

mlx-code stores the full conversation as JSON in each commit message, so you can restore both the workspace state and the agent's memory from any checkpoint.

import asyncio
from mlx_code.gits import resume_worktree
from mlx_code.repl import Agent, repl

async def main():
    gwt, messages = resume_worktree(".", "abc1234")
    agent = Agent(ctx={"gwt": gwt})
    agent.messages = messages
    await repl(agent)

asyncio.run(main())

Custom tools

Subclass Tool, define a Pydantic schema, and pass the class at instantiation.

from mlx_code.tools import Tool
from mlx_code.repl import Agent
from pydantic import BaseModel, Field

class QueryParams(BaseModel):
    query: str = Field(description="SQL query to run")

class LiveDBTool(Tool):
    name = "QueryDB"
    description = "Execute a query against the dev database"
    parameters = QueryParams

    async def execute(self, params: QueryParams, signal=None) -> dict:
        result = run_query(params.query)   # your logic here
        return {"content": [{"type": "text", "text": result}], "is_error": False}

agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])

Credits

Built on mlx and mlx-lm. Inspired by Mario Zechner's pi.

License

Apache License 2.0: see LICENSE for details.

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment