AI agent harnesses and developer tooling updates #8
Today's Letter
- VS Code PR suggests default Copilot co-author footer
- Moonshot AI Kimi K2.6 tops coding puzzle benchmark
- Mendral argues agent harnesses should run outside the sandbox
- Flue, TypeScript agent harness framework unveiled
- DO_NOT_TRACK, single env var proposed for telemetry opt-out
VS Code PR suggests default Copilot co-author footer
- VS Code pull request #310226 was merged on Apr. 16 and reportedly enables AI co-author attribution by default.
- The change appears to add a "Co-Authored-By: Copilot" commit footer even to commits made without explicit Copilot use.
- The public PR page shows 2 commits and 1 changed file, but no description was provided.
- The page does not confirm exact trigger conditions or whether the footer can be disabled globally.
- Community response was sharply negative, with 372 thumbs-down reactions visible on the PR.
- If shipped as implied, commit attribution and repository history could change without clear user intent.
Source: github.com
Moonshot AI Kimi K2.6 tops coding puzzle benchmark
- Kimi K2.6 from Moonshot AI placed first in ThinkPol's Day 12 AI Coding Contest, according to the report, finishing with 22 match points and a 7-1-0 record.
- The task was a Word Gem Puzzle on 10×10 to 30×30 boards, where models had 10 seconds per round to slide tiles and claim English words formed in straight horizontal or vertical lines.
- Scoring favored longer words: words under seven letters lost points, while words of seven or more letters scored their length minus six, which pushed most serious entries to filter for longer words.
- The report says Kimi used an aggressive greedy sliding loop and posted the highest cumulative score at 77, while Xiaomi's MiMo V2-Pro finished second at 20 match points with a mostly static word-scan strategy.
- GPT-5.5 ranked third with 16 match points and Claude Opus 4.7 ranked fifth with 12; the write-up argues that models that skipped sliding tended to break down on larger 30×30 boards.
- The results are not yet independently confirmed beyond the primary source, but the benchmark highlights how protocol handling, search strategy, and penalty-aware planning can outweigh brand-name model positioning on structured coding tasks.
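The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the contest's actual scorer: the function names are hypothetical, and because the report does not say how many points a short word loses, `short_word_penalty` is a placeholder assumption.

```python
from typing import Iterable, List


def word_score(word: str, short_word_penalty: int = 1) -> int:
    """Score one claimed word under the rule summarized in the report."""
    length = len(word)
    if length >= 7:
        # Words of seven or more letters score their length minus six.
        return length - 6
    # Assumption: the size of the penalty is not stated, so it is a parameter.
    return -short_word_penalty


def filter_candidates(words: Iterable[str]) -> List[str]:
    """Keep only words long enough to score positively."""
    return [w for w in words if len(w) >= 7]


if __name__ == "__main__":
    for w in ["tile", "puzzles", "benchmark"]:
        print(w, word_score(w))
```

Under this rule a seven-letter word is worth only one point, which is consistent with the report's note that serious entries filtered for longer words rather than claiming everything they found.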
Source: thinkpol.ca
More: news.google.com
Mendral argues agent harnesses should run outside the sandbox
- Mendral said agent harnesses are more reliable outside the sandbox, where the LLM loop stays on the backend and calls sandbox tools over an API.
- The post argues this keeps LLM API keys, user tokens, and database credentials out of ephemeral execution environments.
- Mendral said running the harness outside lets sandboxes be suspended when idle, and that it uses Blaxel for standby resume times of about 25 ms during interactive turns.
- For long-running sessions, the company said it runs each agent turn as an Inngest step so workflows can survive deploys, restarts, and instance failures.
- The post describes skills and memories as virtual files backed by different systems: workspace paths go to the sandbox, while shared memory and skill namespaces map to a database.
- Mendral said this avoids treating multi-user agent state as a distributed filesystem problem when several engineers share the same agent and update memory in parallel.
- The design is presented as a backend architecture choice rather than a standard pattern, and the claims are not yet independently confirmed beyond Mendral's own blog post.
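The virtual-file idea above, where one path namespace is backed by two different systems, can be sketched as a small router. This is an illustration under stated assumptions and not Mendral's implementation: the path prefixes and the class name are hypothetical, and plain dictionaries stand in for the sandbox API and the database.

```python
from typing import Dict

# Assumption: these prefixes are invented for the example; the post does not name them.
DB_PREFIXES = ("/memory/", "/skills/")
WORKSPACE_PREFIX = "/workspace/"


class VirtualFiles:
    """Routes file-style reads and writes to one of two backing stores."""

    def __init__(self) -> None:
        # In-memory dicts stand in for the sandbox API and the database.
        self.sandbox: Dict[str, str] = {}
        self.database: Dict[str, str] = {}

    def _store_for(self, path: str) -> Dict[str, str]:
        # Shared memory and skill namespaces map to the database.
        if path.startswith(DB_PREFIXES):
            return self.database
        # Workspace paths go to the sandbox.
        if path.startswith(WORKSPACE_PREFIX):
            return self.sandbox
        raise ValueError(f"unmapped virtual path: {path}")

    def write(self, path: str, content: str) -> None:
        self._store_for(path)[path] = content

    def read(self, path: str) -> str:
        return self._store_for(path)[path]


if __name__ == "__main__":
    files = VirtualFiles()
    files.write("/workspace/app.py", "print('hi')")
    files.write("/memory/team-notes.md", "Prefer small PRs.")
    print(sorted(files.sandbox), sorted(files.database))
```

Because the agent only sees paths, the shared namespaces can rely on whatever concurrency guarantees the database provides, which is the point the post makes about avoiding a distributed filesystem.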
Source: mendral.com
More: news.google.com
Flue, TypeScript agent harness framework unveiled
- Flue presented itself as a TypeScript agent harness for autonomous agents, according to its site.
- The framework pairs model access with sessions, skills, memory, filesystem, and sandbox controls.
- Agents can run from a CLI or be bundled into an HTTP server for deployment.
- Flue offers a built-in virtual sandbox and can connect to external sandboxes.
- Example code shows structured skill outputs with Valibot and shell steps inside one session.
- Sample models listed include anthropic/claude-sonnet-4-6 and anthropic/claude-opus-4-7.
- A GitHub issue triage example is framed as 22 lines of TypeScript on the site.
- The site also claims token scoping so secrets like GITHUB_TOKEN stay outside the agent sandbox.
Source: flueframework.com
DO_NOT_TRACK, single env var proposed for telemetry opt-out
- DO_NOT_TRACK proposes a single environment variable, `DO_NOT_TRACK=1`, to signal that software should disable telemetry, usage reporting, crash reporting, ad tracking, and other non-essential network requests, according to the project site.
- The stated goal is to replace per-tool opt-out switches with one cross-tool convention for local software and CLI workflows.
- The site lists existing opt-out examples across tools including .NET, AWS SAM CLI, Azure CLI, Gatsby, Go telemetry, Google Cloud SDK, Homebrew, and Netlify CLI.
- Setup examples are provided for Bash, Zsh, Fish, PowerShell, and Windows CMD so the variable can persist across terminal sessions.
- For software authors, the proposal asks tools to check whether `DO_NOT_TRACK` is set to `1` and to honor it alongside current product-specific switches.
- The page also recommends moving telemetry from opt-out to opt-in where possible.
- The proposal references `NO_COLOR` and `FORCE_COLOR` as precedent for simple environment-variable standards, but broader ecosystem adoption is not yet officially confirmed.
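For software authors, the check the proposal asks for could look roughly like the sketch below. The function name and the product-specific variable `MYTOOL_NO_TELEMETRY` are hypothetical placeholders, and treating telemetry as enabled when no variable is set is an assumption made only to show the opt-out path.

```python
import os
from typing import Mapping, Optional


def telemetry_enabled(
    env: Optional[Mapping[str, str]] = None,
    tool_opt_out_var: str = "MYTOOL_NO_TELEMETRY",  # hypothetical per-tool switch
) -> bool:
    """Return False when the user has opted out of telemetry."""
    env = os.environ if env is None else env
    # Cross-tool convention from the proposal: DO_NOT_TRACK set to "1".
    if env.get("DO_NOT_TRACK") == "1":
        return False
    # Keep honoring the tool's own existing switch alongside it.
    if env.get(tool_opt_out_var) == "1":
        return False
    # Assumption: default-on telemetry, shown here only for the example.
    return True


if __name__ == "__main__":
    print("telemetry enabled:", telemetry_enabled())
```

A tool that follows the page's opt-in recommendation would instead return False by default and enable telemetry only after explicit consent, while still checking `DO_NOT_TRACK` first.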
Source: donottrack.sh
More: news.google.com
Jocoletter curates AI, software, and product trends for developers and builders.
#DO_NOT_TRACK #Flue #Mendral #Microsoft #MoonshotAI