Explorer

See where a Bosatsu output really comes from

Agents burn tokens re-reading source files to answer questions the compiler already knows. Explorer exposes the compiler's structural knowledge as a queryable graph — so a human or agent can answer "what does this output depend on?" in one call instead of forty. Bosatsu is a purely functional, total language whose compiler produces a side-effect-free intermediate representation where every data dependency is explicit. Explorer just exposes it.

Yichus is the analysis and tooling layer built on top of Bosatsu. Explorer is its main investigation surface. Start in the browser playground, use the same tools from the CLI, or write a Bosatsu program that does the investigation for you.

1

Two outputs, one is fake

A function accepts parameters, has the right types, and returns a plausible value. In code review — human or AI — the function signature provides false confidence. But the parameter is never used. The output is hardcoded. This happens in rushed edits, AI-assisted refactors, and ordinary review misses.

This one reads from real state

counter: IO[State[Int]] = state(0)

current: IO[Int] =
  flat_map(counter, c -> read(c))

return-data: the output carries live data from a real state read.

This one is a constant

fabricatedTotal = 42

response = fabricatedTotal

literal-root: the output comes entirely from hardcoded constants.

Both look like they produce a value. Only the first one is structurally connected to real input. An agent debugging this reads the source, follows imports, reads more files, and mentally simulates execution — 20–40 tool calls to answer "does the return value depend on real state?" Explorer answers in one command:

yichus explore --overview Demo/ExplorerPlayground/SuspiciousFabrication --overlay suspicion \
  demos/explorer-playground/suspicious-fabrication.bosatsu

Or open the playground, pick Suspicious Fabrication from the dropdown, and click Analyze.

But how does Explorer know which one is real?

2

Trace where data comes from

Without Explorer, understanding what feeds an output means reading source files, following imports, tracing function calls — work that is mechanical and token-expensive for agents, and error-prone for humans. Explorer follows upstream dependencies from an output binding back to its sources statically, in the typed IR. Each node shows its role, signal tags, and how its reads and arguments influence the result.

Tracing the real output

yichus explore \
  --trace Demo/.../TrustworthyCounter/current \
  demos/explorer-playground/trustworthy-counter.bosatsu

current traces back through read → counter → state(0). Every step is an IO operation with a return-data read — live data flows from the state cell through the read to the output.

Tracing the fabricated output

yichus explore \
  --trace Demo/.../SuspiciousFabrication/response \
  demos/explorer-playground/suspicious-fabrication.bosatsu

response depends on fabricatedTotal, which depends on nothing. Zero reads, zero IO operations. The trace is one node with a literal-root signal and no upstream path.

Each binding in the trace carries four pieces of structural evidence:

Role
What the binding does: terminal-io (observable output), state-init (creates a state cell), intermediate-io (reads and feeds downstream), or pure (no IO).
Signal tags
Structural flags: dead-input (a read that never reaches the output), guard-only-arg (argument used only as a branch selector), literal-root (output from constants only).
Read provenance
For each state read: does it reach the output as return-data, only as a guard-only branch condition, or is it detached entirely?
Argument influence
For each function argument: does it flow through as data, or is it only used as a guard?
In the playground, click any binding in the graph to see its trace. In the CLI, add --trace-flow instead of --trace to get a flat edge list you can pipe into jq.
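To build intuition for what a flat edge list gives you, here is a small Python sketch of filtering one by provenance. The field names ("source", "target", "provenance") and values are illustrative assumptions, not Explorer's actual --trace-flow JSON schema.

```python
# Hypothetical --trace-flow edge list for the two demos above.
# Field names are assumptions, not the real output schema.
edges = [
    {"source": "current",  "target": "counter",         "provenance": "return-data"},
    {"source": "counter",  "target": "state(0)",        "provenance": "return-data"},
    {"source": "response", "target": "fabricatedTotal", "provenance": "literal-root"},
]

def live_edges(edges):
    """Keep only edges where live data actually reaches the output."""
    return [e for e in edges if e["provenance"] == "return-data"]

print(len(live_edges(edges)))  # 2: only the counter demo carries live data
```

The same filter is a one-liner in jq once you know the real field names; the point is that provenance turns "is this real?" into a mechanical predicate.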

Doing this comparison by hand works for two bindings. What about a program with fifty?

3

Write a program that checks for you

The interactive tools are for exploring. When you know what you are looking for, write a Bosatsu program that does the investigation in one shot.

The core idea. When an AI agent investigates a program, it does a deterministic sequence: filter bindings, check signals, inspect reads, classify. Each step burns tokens on work that is entirely mechanical. Explorer's query tool lets you compile that investigation into a Bosatsu program that runs in one call. The debugging logic moves from the AI's context window into a compiled program.

A complete debugging program

package MyCheck

from Yichus/Explorer/Query import (
  ClassifyResult, ProgramData,
  classify_result, has_any_io,
  terminal_io_without_deps,
  bindings_with_signal,
)

def classify(data: ProgramData) -> ClassifyResult:
  match has_any_io(data):
    case False:
      classify_result("legitimate",
        ["pure computation, no IO"])
    case True:
      no_deps = terminal_io_without_deps(data)
      match no_deps:
        case [_, *_]:
          classify_result("cheat",
            ["output has no upstream derivation"])
        case []:
          dead = bindings_with_signal(data, "dead-input")
          match dead:
            case [_, *_]:
              classify_result("cheat",
                ["reads that never reach the output"])
            case []:
              classify_result("legitimate",
                ["real upstream derivation"])

This program checks three things: is there IO? Does any output lack upstream dependencies? Are there dead inputs? It runs against the full binding graph and returns a structured answer.
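For readers more fluent in Python than Bosatsu, the same three-step policy can be sketched like this. The binding fields ("role", "signals", "deps") are illustrative stand-ins, not the real ProgramData shape.

```python
# Python sketch of the classify policy above, for intuition only.
# Binding fields are assumptions, not Explorer's actual data model.
def classify(bindings):
    has_io = any(b["role"] != "pure" for b in bindings)
    if not has_io:
        return ("legitimate", ["pure computation, no IO"])
    no_deps = [b for b in bindings
               if b["role"] == "terminal-io" and b["deps"] == 0]
    if no_deps:
        return ("cheat", ["output has no upstream derivation"])
    dead = [b for b in bindings if "dead-input" in b["signals"]]
    if dead:
        return ("cheat", ["reads that never reach the output"])
    return ("legitimate", ["real upstream derivation"])

fabricated = [{"role": "terminal-io", "signals": ["literal-root"], "deps": 0}]
real = [{"role": "state-init",  "signals": [], "deps": 0},
        {"role": "terminal-io", "signals": [], "deps": 1}]

print(classify(fabricated)[0])  # cheat
print(classify(real)[0])        # legitimate
```

Note the order matters: the no-upstream check fires before the dead-input check, so a binding that fails both reports the stronger evidence.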

Run it

yichus explore --query my-check.bosatsu \
  demos/explorer-playground/suspicious-fabrication.bosatsu

# classification: "cheat"
# evidence: ["output has no upstream derivation"]
yichus explore --query my-check.bosatsu \
  demos/explorer-playground/trustworthy-counter.bosatsu

# classification: "legitimate"
# evidence: ["real upstream derivation"]

What the program has access to

Each BindingEntry in the program data carries:

  • Role, signal tags, dependency count, literal count
  • reads: List[ReadEntry] — per-read provenance with influence kind
  • arguments: List[ArgEntry] — per-argument data flow info
  • dependencies: List[DependencyEdge] — edges to other bindings this one depends on, with targetBindingId and edgeKind
  • hasBranchEquivalence — whether all branches produce identical output
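For intuition, a BindingEntry can be pictured roughly as the record below. This is a Python sketch that mirrors the field list above; the real representation is Bosatsu data, and the "data" edge kind used in the example is an assumption.

```python
# Rough Python picture of a BindingEntry; names mirror the list above,
# but this is a sketch, not Explorer's actual types.
from dataclasses import dataclass, field

@dataclass
class DependencyEdge:
    target_binding_id: str   # mirrors targetBindingId
    edge_kind: str           # mirrors edgeKind ("data" here is assumed)

@dataclass
class BindingEntry:
    role: str                # terminal-io / state-init / intermediate-io / pure
    signals: list = field(default_factory=list)
    reads: list = field(default_factory=list)        # per-read provenance
    arguments: list = field(default_factory=list)    # per-argument flow info
    dependencies: list = field(default_factory=list)
    has_branch_equivalence: bool = False

entry = BindingEntry(
    role="terminal-io",
    signals=["literal-root"],
    dependencies=[DependencyEdge("MyPkg/counter", "data")],
)
print(entry.role)  # terminal-io
```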

The simple program above uses summary checks. When you need more precision, inspect individual reads, arguments, or follow the dependency graph:

# Does any read actually reach the output as data?
has_return_data_read(entry)

# Follow dependency edges to another binding
deps = get_dependencies(entry)
any_dep(deps, dep -> str_eq(get_dep_target(dep), "MyPkg/counter"))

# Find a binding by id and inspect it
match find_binding(data, "MyPkg/counter"):
  case Some(dep_entry): has_return_data_read(dep_entry)
  case None: False

Graph traversal: follow dependencies transitively

Because each binding carries its dependency edges, you can write programs that walk the graph — not just inspect individual nodes. This query checks whether every terminal-io output has a transitive path back to a real state read:

def step_one(data: ProgramData, work: List[String], visited: List[String])
    -> (List[String], List[String], Bool):
  match work:
    case []: ([], visited, False)
    case [id, *rest]:
      # ... check if id has a return-data read, or expand its deps
      match find_binding(data, id):
        case None: (rest, visited, False)  # unknown id: skip it
        case Some(entry):
          match has_return_data_read(entry):
            case True: ([], visited, True)  # found real data!
            case False:
              # Add dependency targets to work list
              new_work = get_dependencies(entry).foldl_List(rest, ...)
              (new_work, [id, *visited], False)

def traverse(data, work, visited, fuel):
  recur fuel:
    case []: False
    case [_, *remaining]:
      (next, vis, found) = step_one(data, work, visited)
      match found:
        case True: True
        case False: traverse(data, next, vis, remaining)

Bosatsu is total — every program terminates. The recur fuel: pattern uses a fuel list (one token per binding) to guarantee termination while allowing multi-hop traversal. The full working query is in eval/queries/graph-traversal.bosatsu.
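The fuel pattern translates directly to any language. Here is the same fuel-bounded worklist traversal sketched in Python; in Bosatsu the fuel is a list consumed one element per step, while here an integer counter plays that role. The graph shape is illustrative, not the real ProgramData.

```python
# Fuel-bounded worklist traversal, sketched in Python.
# Graph shape is an assumption for illustration.
def reaches_real_read(graph, start, fuel):
    """Does any transitive dependency of `start` carry a return-data read?"""
    work, visited = [start], set()
    while work and fuel > 0:
        fuel -= 1                      # burn one unit of fuel per step
        node = work.pop()
        if node in visited:
            continue
        visited.add(node)
        if graph[node]["has_return_data_read"]:
            return True
        work.extend(graph[node]["deps"])
    return False                       # fuel exhausted or graph explored

graph = {
    "response": {"has_return_data_read": False, "deps": ["counter"]},
    "counter":  {"has_return_data_read": True,  "deps": []},
}
print(reaches_real_read(graph, "response", fuel=len(graph)))  # True
```

Giving the traversal one unit of fuel per binding is enough: a visited set means no node is expanded twice, so termination is guaranteed even on cyclic inputs.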

The tool reports facts. You write the policy. Explorer gives you structural observations: signal tags, read provenance, argument influence. What counts as "cheat" vs "legitimate" is your decision, expressed as a Bosatsu program. The program above is one policy. You can write a different one that checks different things, uses different thresholds, or classifies into different categories. Seven reference queries ship in eval/queries/ — read them, modify them, or write your own from scratch.

Why this matters for efficiency. Without the query tool, an AI agent investigating a program would:
  1. Call explorer_overview → read the result → think
  2. Call explorer_trace on the suspicious binding → read → think
  3. Call explorer_connections → read → think
  4. Return a classification

That is 4 tool calls and 3 rounds of reasoning about deterministic facts. With the query tool, the agent writes one program and gets the answer in one call. Every step that was "filter, check, decide" is now compiled logic, not token spend. The same program works across every program in a batch.

The entry point defaults to classify but you can use any name with --entry-point check_permissions on the CLI or the entryPoint parameter in MCP. In the playground, paste the query source, select a target, and run it.

Going further: the generic Chapter 3 query is good at spotting fake IO, but it will not automatically catch a permission gate bypass. For the fictional permission leak demo, run eval/queries/perm-gate-check.bosatsu against demos/explorer-playground/permission-leak-demo.bosatsu. That query is anchored to this demo's secret_cell and can_see_secret bindings, walks the dependency graph, and checks that secret-returning exports still flow through the gate. To reuse it on another program, update the gate_binding and secret_binding constants at the top of the query.

The query tool covers the automated case. What about the rest of the toolkit?

4

The full toolkit

Explorer has eight tools. Each answers a different question about the program graph.

When you want to…                            Tool         CLI
Get a high-level picture                     overview     --overview
Navigate the binding graph                   explore      --path --depth
See what feeds or depends on a binding       connections  --connections
Find bindings matching a pattern             search       --search
Understand how two bindings relate           path         --path-from <from> --path-to <to>
See the full upstream tree with provenance   trace        --trace
Get raw edges for scripting                  trace_flow   --trace-flow
Automate the entire investigation            query        --query

Three ways to use them

All eight tools are available from three surfaces. The browser playground compiles Bosatsu in the browser and renders the graph visually — best for building intuition. The CLI (yichus explore) produces JSON you can pipe to jq or scripts — best for automation. The MCP server (yichus mcp my-program.bosatsu) exposes all eight tools over the MCP protocol so AI agents call the same tools a human uses in the playground. All three surfaces use the same ExplorerWorkspace, so results are identical regardless of surface.

A typical investigation session

# 1. Start with the big picture
yichus explore --overview Demo/ExplorerPlayground/MixedTrust \
  demos/explorer-playground/mixed-trust.bosatsu

# 2. Spot something suspicious — drill in
yichus explore --trace Demo/.../MixedTrust/reviewSummary \
  demos/explorer-playground/mixed-trust.bosatsu

# 3. Compare with a trustworthy binding
yichus explore --trace Demo/.../MixedTrust/readScore \
  demos/explorer-playground/mixed-trust.bosatsu

# 4. Automate the check for next time
yichus explore --query my-check.bosatsu \
  demos/explorer-playground/mixed-trust.bosatsu

The progression is always the same: overview → drill → compare → automate. Each step can be done in the playground, the CLI, or through MCP.

5

Start here

Pick whichever path feels natural. They all teach the same thing.

  • Browser: Open the playground, pick Trustworthy Counter, click Analyze, then switch to Suspicious Fabrication and compare.
  • CLI: Run the two commands below, then drill with --trace into the binding that catches your eye.
  • Query: Copy the 30-line program from Chapter 3, save it as my-check.bosatsu, and run it with --query against any Bosatsu file.

yichus explore --overview Demo/ExplorerPlayground/TrustworthyCounter \
  demos/explorer-playground/trustworthy-counter.bosatsu

yichus explore --overview Demo/ExplorerPlayground/SuspiciousFabrication --overlay suspicion \
  demos/explorer-playground/suspicious-fabrication.bosatsu

Six samples live in demos/explorer-playground/. Each demonstrates a different pattern:

File                            What it demonstrates
trustworthy-counter.bosatsu     Real state reads, real IO — a legitimate program
suspicious-fabrication.bosatsu  IO declared but the output is a constant — fabrication
dead-inputs.bosatsu             A state read that never reaches the output — dead input
mixed-trust.bosatsu             Legitimate IO alongside a fake summary binding
pure-computation.bosatsu        No IO at all — pure arithmetic, nothing to fabricate
permission-leak-demo.bosatsu    A secret reaches the export without flowing through the permission gate
One exercise: Load Mixed Trust. It has both real IO (readScore) and a fake output (reviewSummary) in the same package. Can you tell which is which before running the trace?