
Why Your AI Coding Agent Trusts Terminal Output It Shouldn’t — and What to Fix
Why terminal output is a trust boundary for AI coding agents
The report’s main point is simple: an AI coding agent can be steered by terminal output if it treats shell text as guidance instead of untrusted input. That matters because the agent is not just looking at a screen. It is consuming bytes from stdout and stderr, then deciding what to do next.
My position is blunt: terminal output should be treated as hostile input whenever it comes from anything the agent did not fully control. That includes package install scripts, test runners, helper CLIs, and repository code the agent has just cloned.
The core claim from the report and why it is plausible
The SecurityWeek report says decades-old Bash tricks can expose AI coding agents to supply-chain style attacks. The key point is not that Bash is broken. Bash has always allowed output formatting, escape sequences, and control characters. The risk appears when an agent loop reads that output and turns it into the next tool action.
That is plausible for a few reasons:
- shell output can contain non-printing control characters
- terminal emulators render those characters, but logs preserve them
- agents often parse raw text, not just the final pixels a human sees
- many coding agents are built to infer intent from tool output
So the real boundary is not “can a human see this line?” It is “can this text influence the agent’s next command, file edit, or network request?”
What changes when an agent reads stdout, not just a human
A human usually reads terminal output as a status signal. An agent can read it as evidence, instruction, and context all at once. That is the dangerous difference.
With a human reviewer, control characters are mostly a display nuisance. With an agent, they can become part of the decision input. A line that is visually hidden, overwritten, or colorized may still exist in the raw stream. A line that looks like a normal status message may be treated as a recommendation.
In practice, terminal output is not passive. It is a potential control channel.
How decades-old Bash behavior becomes a modern prompt-injection channel
Bash and the terminal ecosystem predate LLM agents by decades. They were designed for human operators, where terminal control sequences were a convenience. That old behavior becomes risky when a model is in the loop.
Escape sequences, terminal titles, and other non-printing output tricks
The classic tricks are simple but effective:
- carriage return (
\r) can overwrite earlier text on the same line - ANSI escape sequences can change color, clear lines, or move the cursor
- terminal title sequences can change the window title without changing visible content
- backspaces can rewrite text as it is displayed
- embedded control characters can make one log viewer show something different from another
None of that is new. Bash and terminal emulators have supported these patterns for a long time. What changed is that an AI agent may consume the raw stream and use it to reason about what happened.
Why visually harmless output can still steer an agent’s next action
This is the part teams often miss. The output does not need to look malicious to be dangerous.
An attacker only needs terminal text that changes the agent’s conclusion. For example:
- a build script can print a reassuring summary while hiding earlier warnings with
\r - a dependency install can emit a postinstall message that looks like a maintainer note
- a test runner can print “fix suggested” text that nudges the agent toward editing the wrong file
- a helper script can claim a package is outdated or unsafe and suggest a replacement
A human might ignore that as noisy CLI chatter. An agent may not.
A realistic attack chain in a developer workflow
The likely attack path is not exotic. It fits normal developer behavior.
The setup: package install, test run, or helper script output
A coding agent is asked to:
- install dependencies
- run tests
- inspect build output
- make the smallest fix needed
That is exactly where untrusted terminal text enters the loop. The source can be:
- a package
postinstallscript - a test framework banner
- a repo helper script
- a git hook
- a one-off build tool from an external repository
If the agent is allowed to observe the output and keep acting automatically, the attack surface is already there.
The decision point: the agent treats terminal text as instruction or evidence
The failure is not the shell. The failure is the decision point.
At that moment the agent may decide:
- a dependency change is safe because the terminal says it passed
- a file needs modification because the output says tests indicate a missing patch
- a package should be swapped because the output points to an alternate source
- a network fetch is acceptable because the script claims it is part of setup
That is supply-chain drift in practice: the agent’s trust in output changes the chain of actions that follows.
The impact: poisoned dependency choices, modified files, or supply-chain drift
The damage is usually indirect, which is why it is easy to underestimate.
Impact can include:
- selecting the wrong dependency
- accepting a tampered helper package
- modifying the wrong file based on misleading output
- committing a change that was never needed
- letting untrusted build steps influence future runs
I would rank this as a real workflow risk, not a theoretical one. The most dangerous cases are not dramatic exploit payloads. They are small trust shifts that push the agent toward the wrong next step.
What I would test locally to verify the risk
If you want to see the mechanics without involving a real dependency attack, test with a harmless Bash script that emits deceptive terminal text.
A minimal Bash script that emits deceptive but harmless terminal text
#!/usr/bin/env bash
set -euo pipefail
printf 'Running tests: 3 passed, 0 failed\r'
printf 'Running tests: 0 passed, 3 failed\n'
printf '\033]0;package install complete\a'
printf 'NEXT STEP: inspect dependency changes before you trust this run.\n'What this shows:
- the first status line is overwritten on a normal terminal
- the raw stream still contains both lines
- the terminal title changes without changing the visible file output
- a text consumer that reads the stream may see more than a human notices
The observed difference between human review and agent interpretation
A human looking at a live terminal may mostly remember the final line. A raw log reader may see the overwrite and the control sequence.
| View | What appears to happen | Why it matters |
|---|---|---|
| Human terminal | the status seems to settle on the final line | the display hides earlier text |
| Raw log / agent input | both the overwritten line and control characters remain | the agent may infer a different story |
| Triage screenshot or copy/paste | control effects are often lost | the evidence can be misleading |
That gap is the bug class. The screen is not the truth source. The byte stream is.
What to capture in logs so the behavior is reproducible
If you are testing an agent or wrapper, capture:
- raw stdout and stderr, not only rendered terminal text
- the terminal mode used by the runner
- whether the agent reads a transcript, structured events, or both
- whether control characters are preserved, stripped, or normalized
One simple way to inspect the raw stream is to wrap the script and then print control characters visibly:
script -q /tmp/session.log -c './deceptive-output.sh'
cat -v /tmp/session.log
A representative raw-log view will expose characters like ^M for carriage return and escape sequences that were invisible in the terminal.
Where the actual control failure lives
This is where teams often aim at the wrong layer.
UI trust failure versus backend trust failure
If the problem were only visual deception, a safer terminal UI would be enough. It is not only a UI issue.
The real failure is backend trust: the agent or its orchestration layer accepts terminal text as evidence strong enough to guide the next action.
That means the fix is not “make the terminal prettier.” The fix is “stop letting opaque tool output drive autonomous decisions without checks.”
Why sandboxing the shell is not the same as securing the agent
Sandboxing is necessary, but it is not sufficient.
A sandbox can reduce file-system and network damage. It does not solve the logic problem that the agent may still believe whatever the shell printed. If the agent is tricked into choosing a different dependency, writing a different patch, or asking for a networked action, the sandbox has only limited value.
The shell can be contained while the reasoning loop remains compromised.
Why model prompts alone do not solve terminal-output abuse
Prompt instructions help only when the model is deciding between clean alternatives. They do not help much when the tool output itself is the injected content.
If the agent sees a command result that says “this package is deprecated, switch to X” or “the last step failed, rerun with elevated permissions,” a prompt rule rarely blocks every path. The model still has to interpret the text. That is why prompt-only defenses are brittle here.
Fixes that reduce the attack surface
The right response is layered. No single control is enough.
Separate machine-readable command output from human-readable terminal output
This is the best structural fix.
Give the agent structured tool results when possible:
- exit code
- stdout as a raw field
- stderr as a raw field
- metadata about the command
- a separate human-facing transcript if needed
Do not force the agent to infer meaning from a terminal rendering when you can pass explicit fields.
Strip or neutralize control characters before the agent sees them
If the agent must read text, sanitize it first.
At minimum:
- remove or escape ANSI control sequences
- normalize carriage returns and backspaces
- cap the amount of output fed into the reasoning loop
- mark output from untrusted commands as untrusted in the prompt wrapper
This does not make the content safe by itself, but it reduces the chance that rendering tricks change the agent’s interpretation.
Require confirmation for package installs, file writes, and networked actions
High-risk actions should not be auto-executed based on terminal output.
Require human confirmation or a policy gate for:
- dependency installation
- writes outside the working tree
- network fetches from new hosts
- package publication or commit creation
- permission changes and secret access
If an agent has to ask before taking those actions, terminal output has less power to steer it.
Pin dependencies, verify provenance, and isolate untrusted build steps
The supply-chain side of this problem needs ordinary hygiene too:
- pin versions and lockfiles
- verify package provenance when possible
- run untrusted builds in isolated environments
- keep postinstall and helper scripts from getting broad access
- treat fresh repositories and newly installed packages as hostile until proven otherwise
This is not glamorous, but it is where practical defense lives.
What not to trust in an AI-assisted shell session
My rule of thumb is simple: if the message came from code you did not fully trust, do not let it steer the agent.
Status lines, banners, and tool-generated summaries
Do not trust:
- “all tests passed” banners
- install summaries from package managers
- colored success/failure lines
- convenience wrappers that summarize a build
- terminal titles or notifications that masquerade as status
These are easy to fake and easy to overread.
Output from freshly installed packages or unknown repositories
Be especially skeptical of:
- postinstall messages
- scripts from newly added dependencies
- output from cloned repositories you have not audited
- generated helper tools the agent just installed
That output is part of the attack surface, not proof of correctness.
Any message that asks the agent to change its own plan mid-run
This is the most important red flag.
If terminal text tells the agent to:
- edit a different file
- install another dependency
- retry with a new flag
- upload something
- relax a guardrail
treat it as untrusted instruction, not helpful advice.
Conclusion: treat terminal output as untrusted input, not guidance
The SecurityWeek report is pointing at a real design problem, not a novelty trick. Old shell behavior is still old shell behavior. The new part is that AI coding agents may ingest terminal output as if it were trustworthy context.
My view is that teams should stop thinking about terminal output as a human-facing convenience and start treating it as a hostile input channel. That shift matters more than any one Bash trick.
The right fix order for teams shipping AI coding agents
If I were hardening an agent stack, I would do this in order:
- gate high-risk actions like installs, writes, and network calls
- separate structured command results from rendered terminal text
- sanitize or strip control characters before they reach the model
- isolate untrusted build steps and dependency scripts
- pin and verify provenance so the agent has less room to drift
That order is deliberate. It puts policy before prettification.


