AI agent harnesses and developer tooling updates #8

May 4, 2026 — 4 min read

Today's Letter

VS Code pull request #310226 merged on Apr. 16, reportedly enabling AI co-authoring by default.
The change appears to add a "Co-Authored-By: Copilot" trailer to commits even when Copilot was not explicitly used.
The public PR page shows 2 commits and 1 changed file, but no description was provided.
The page does not confirm exact trigger conditions or whether the footer can be disabled globally.
Community response was sharply negative, with 372 thumbs-down reactions visible on the PR.
If shipped as implied, commit attribution and repository history could change without clear user intent.
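
For teams that want to audit their history for this footer, a minimal sketch of a trailer check follows. It assumes the footer appears on its own line in the form "Co-Authored-By: Copilot ..."; the exact text VS Code writes is not confirmed by the PR page, and the function names and example address here are hypothetical.

```python
import re

# Assumed footer shape: "Co-Authored-By: Copilot <...>" on its own line.
# Casing varies across tools, so the match ignores case.
COPILOT_TRAILER = re.compile(
    r"^co-authored-by:\s*copilot\b.*$", re.IGNORECASE | re.MULTILINE
)


def has_copilot_trailer(message: str) -> bool:
    """Return True if any line of the commit message is a Copilot co-author trailer."""
    return COPILOT_TRAILER.search(message) is not None


def strip_copilot_trailer(message: str) -> str:
    """Drop Copilot co-author trailer lines and tidy trailing whitespace."""
    kept = [line for line in message.splitlines() if not COPILOT_TRAILER.match(line)]
    return "\n".join(kept).rstrip() + "\n"


if __name__ == "__main__":
    msg = "Fix bug\n\nCo-Authored-By: Copilot <copilot@example.com>\n"
    print(has_copilot_trailer(msg))
    print(repr(strip_copilot_trailer(msg)))
```

A check like this could run in a commit-msg hook or a CI step; whether to strip the footer or only flag it is a policy choice for each repository.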

Source: github.com

Kimi K2.6 from Moonshot AI placed first in ThinkPol's Day 12 AI Coding Contest with 22 match points and a 7-1-0 record, according to the report.
The task was a Word Gem Puzzle on 10×10 to 30×30 boards, where models had 10 seconds per round to slide tiles and claim English words formed in straight horizontal or vertical lines.
Scoring favored longer words: words under seven letters lost points, while words of seven or more letters scored their length minus six (a seven-letter word earns 1 point, a nine-letter word earns 3), which pushed most serious entries toward filtering for longer words.
The report says Kimi used an aggressive greedy sliding loop and posted the highest cumulative score at 77, while Xiaomi's MiMo V2-Pro finished second at 20 match points with a mostly static word-scan strategy.
GPT-5.5 ranked third with 16 match points, Claude Opus 4.7 ranked fifth with 12, and the write-up argues that models that did not slide often broke down on larger 30×30 boards.
The results are not yet independently confirmed beyond the primary source, but the benchmark highlights how protocol handling, search strategy, and penalty-aware planning can outweigh brand-name model positioning on structured coding tasks.
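
The scoring rule in this item can be sketched as a small function. The "length minus six" part comes from the write-up; the size of the penalty for shorter words is not stated, so `short_penalty` is an assumed placeholder, and both function names are hypothetical.

```python
from typing import Iterable, List


def word_score(word: str, short_penalty: int = 1) -> int:
    """Score one claimed word under the rule as summarized in the write-up."""
    n = len(word)
    if n >= 7:
        # From the report: seven-letter-plus words score length minus six.
        return n - 6
    # ASSUMPTION: a flat penalty. The report only says short words lost points.
    return -short_penalty


def worth_claiming(words: Iterable[str]) -> List[str]:
    """Penalty-aware filter: keep only words that would add points."""
    return [w for w in words if word_score(w) > 0]


if __name__ == "__main__":
    found = ["cat", "puzzles", "keyboards", "slide"]
    print(worth_claiming(found))
```

Under this rule, claiming every word a scan finds can lower the total, which is consistent with the report's point that serious entries filtered for length.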

Source: thinkpol.ca
More: news.google.com

Mendral said agent harnesses are more reliable when they run outside the sandbox: the LLM loop stays on the backend and calls sandbox tools over an API.
The post argues this keeps LLM API keys, user tokens, and database credentials out of ephemeral execution environments.
Mendral said this design also allows sandboxes to be suspended when idle, and that it uses Blaxel for standby resume times of about 25 ms during interactive turns.
For long-running sessions, the company said it runs each agent turn as an Inngest step so workflows can survive deploys, restarts, and instance failures.
The post describes skills and memories as virtual files backed by different systems: workspace paths go to the sandbox, while shared memory and skill namespaces map to a database.
Mendral said this avoids treating multi-user agent state as a distributed filesystem problem when several engineers share the same agent and update memory in parallel.
The design is presented as a backend architecture choice rather than a standard pattern, and the claims are not yet independently confirmed beyond Mendral's own blog post.
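
The virtual-file idea can be illustrated with a small path router. This is a sketch under stated assumptions, not Mendral's implementation: the post does not publish its path layout, so the `/workspace/`, `/memory/`, and `/skills/` prefixes and the `Route` type are hypothetical.

```python
import posixpath
from dataclasses import dataclass

# ASSUMPTION: hypothetical namespace prefixes, not taken from the post.
WORKSPACE_PREFIX = "/workspace/"
DATABASE_PREFIXES = ("/memory/", "/skills/")


@dataclass(frozen=True)
class Route:
    backend: str  # "sandbox" or "database"
    key: str      # path relative to the backend's root


def route_path(path: str) -> Route:
    """Decide which backing system serves a virtual file path."""
    # Normalize first so "/workspace/../memory/x" cannot dodge the prefix check.
    normalized = posixpath.normpath(path)
    if normalized.startswith(WORKSPACE_PREFIX):
        return Route("sandbox", normalized[len(WORKSPACE_PREFIX):])
    for prefix in DATABASE_PREFIXES:
        if normalized.startswith(prefix):
            return Route("database", normalized[1:])
    raise ValueError(f"path is outside every known namespace: {path!r}")


if __name__ == "__main__":
    print(route_path("/workspace/src/main.ts"))
    print(route_path("/memory/team/notes.md"))
```

Routing shared namespaces to a database means concurrent writes from several engineers can rely on the database's own concurrency controls instead of file locking inside a sandbox.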

Source: mendral.com
More: news.google.com

Flue presents itself as a TypeScript harness for autonomous agents, according to its site.
The framework pairs model access with sessions, skills, memory, filesystem, and sandbox controls.
Agents can run from a CLI or be bundled into an HTTP server for deployment.
Flue offers a built-in virtual sandbox and can connect to external sandboxes.
Example code shows structured skill outputs with Valibot and shell steps inside one session.
Sample models listed include anthropic/claude-sonnet-4-6 and anthropic/claude-opus-4-7.
A GitHub issue triage example is framed as 22 lines of TypeScript on the site.
The site also claims token scoping so secrets like GITHUB_TOKEN stay outside the agent sandbox.

Source: flueframework.com

DO_NOT_TRACK proposes a single environment variable, `DO_NOT_TRACK=1`, to signal that software should disable telemetry, usage reporting, crash reporting, ad tracking, and other non-essential network requests, according to the project site.
The stated goal is to replace per-tool opt-out switches with one cross-tool convention for local software and CLI workflows.
The site lists existing opt-out examples across tools including .NET, AWS SAM CLI, Azure CLI, Gatsby, Go telemetry, Google Cloud SDK, Homebrew, and Netlify CLI.
Setup examples are provided for Bash, Zsh, Fish, PowerShell, and Windows CMD so the variable can persist across terminal sessions.
For software authors, the proposal asks tools to check whether `DO_NOT_TRACK` is set to `1` and to honor it alongside current product-specific switches.
The page also recommends moving telemetry from opt-out to opt-in where possible.
The proposal references `NO_COLOR` and `FORCE_COLOR` as precedent for simple environment-variable standards, but broader ecosystem adoption has not been confirmed.
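
For tool authors, honoring the variable could look like the sketch below. The `DO_NOT_TRACK=1` check follows the proposal as described; `MYTOOL_TELEMETRY` is a hypothetical product-specific switch, and the function name is illustrative only.

```python
import os
from typing import Mapping, Optional


def telemetry_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return False when the user has opted out by either mechanism."""
    env = os.environ if env is None else env
    # Cross-tool convention from the proposal: only the value "1" opts out.
    if env.get("DO_NOT_TRACK") == "1":
        return False
    # ASSUMPTION: "MYTOOL_TELEMETRY" stands in for a tool's existing switch.
    if env.get("MYTOOL_TELEMETRY", "").lower() in {"0", "false", "off"}:
        return False
    return True


if __name__ == "__main__":
    print(telemetry_enabled({"DO_NOT_TRACK": "1"}))
    print(telemetry_enabled({}))
```

This sketch models an opt-out tool, where telemetry is on unless disabled; a tool following the page's opt-in recommendation would return False by default instead.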

Source: donottrack.sh
More: news.google.com

Jocoletter curates AI, software, and product trends for developers and builders.

#DO_NOT_TRACK #Flue #Mendral #Microsoft #MoonshotAI