Agent Stack, Copilot, and Async Inference #52
Today's Letter
- GitHub Copilot, context handling and Auto routing update
- AWS, SageMaker Async Inference adds inline payloads
- Vercel, Agent Stack and eve unveiled
- Hugging Face, agentic tooling benchmark for open models published
GitHub Copilot, context handling and Auto routing update

- On June 17, GitHub published a post on changes to GitHub Copilot's context handling and model routing.
- The update is framed as getting more useful work from each token during Copilot sessions.
- GitHub says the work focuses on reducing repeated context sent across turns.
- The post also points to cutting the overhead of repeatedly sending tool definitions and cached state.
- GitHub names Auto as the routing mode tied to the update.
- The article is published on the GitHub Blog and discusses GitHub Copilot and GitHub Copilot for VS Code.
Source: github.blog
More: augmentcode.com
AWS, SageMaker Async Inference adds inline payloads

- AWS added inline request payload support to Amazon SageMaker AI Async Inference on June 17, 2026.
- InvokeEndpointAsync now accepts a raw-bytes Body parameter, removing the need to upload input data to Amazon S3 first.
- Inline payloads are capped at 128,000 bytes and are intended for smaller requests with longer async processing times.
- Body and InputLocation are mutually exclusive, and requests that set both return a synchronous ValidationError.
- Output behavior is unchanged: inference results are still written to the configured S3 OutputLocation.
- Existing async endpoints are expected to work without model or container changes.
- The previous flow required an S3 client, input bucket, s3:PutObject permission, object naming, and stale-object cleanup.
- AWS says the feature removes one network round trip and reduces client-side code and operational overhead.
- The launch is available in 31 commercial AWS Regions, including Seoul (ICN), Tokyo (NRT), Singapore (SIN), Frankfurt (FRA), and N. Virginia (IAD).
Source: aws.amazon.com
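
The inline-payload rules above (a raw-bytes Body, a 128,000-byte cap, and Body and InputLocation being mutually exclusive) can be checked on the client before the request is sent. The following is a minimal sketch, not AWS's code: `build_async_request` and `MAX_INLINE_BYTES` are hypothetical names, and the `Body` key follows the parameter name reported in this item, so it assumes an SDK version that already exposes it.

```python
from typing import Any, Dict, Optional

# Inline cap as reported in the item above.
MAX_INLINE_BYTES = 128_000


def build_async_request(
    endpoint_name: str,
    body: Optional[bytes] = None,
    input_location: Optional[str] = None,
    content_type: str = "application/json",
) -> Dict[str, Any]:
    """Build keyword arguments for an InvokeEndpointAsync call (hypothetical helper)."""
    # Body and InputLocation are mutually exclusive, and one of them is required.
    if (body is None) == (input_location is None):
        raise ValueError("set exactly one of body or input_location")

    request: Dict[str, Any] = {
        "EndpointName": endpoint_name,
        "ContentType": content_type,
    }
    if body is not None:
        # Fail locally instead of waiting for the service to reject the call.
        if len(body) > MAX_INLINE_BYTES:
            raise ValueError(
                f"inline payload is {len(body)} bytes; cap is {MAX_INLINE_BYTES}"
            )
        request["Body"] = body  # assumed parameter name, per the announcement
    else:
        request["InputLocation"] = input_location  # existing S3-based flow
    return request


request = build_async_request("my-endpoint", body=b'{"inputs": "hello"}')
```

With boto3, the resulting dictionary would typically be passed as `client.invoke_endpoint_async(**request)` on a `sagemaker-runtime` client. Payloads over the cap would still go through the S3 upload path, and results land in the configured S3 OutputLocation either way.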
Vercel, Agent Stack and eve unveiled

- Vercel recapped Ship 2026 on June 17 and positioned its platform around building and deploying AI agents.
- Agent Stack combines AI SDK, AI Gateway, Workflow SDK, Vercel Sandbox, and Chat SDK as core building blocks.
- AI SDK provides one API across model providers, while AI Gateway routes requests across hundreds of models with failover.
- Workflow SDK adds durable runs, retries, state persistence, and observability for multi-step agent execution.
- Vercel Connect launches as a secure access layer using temporary task-scoped credentials instead of long-lived provider tokens.
- eve is a new open-source agent framework with a single-directory structure, Markdown instructions, and TypeScript tools.
- Vercel said eve includes approvals, subagents, evals, durable execution, and sandboxed compute out of the box.
- The company also expanded backend support for FastAPI, Flask, Express, and Hono, plus REST APIs, queues, cron, and MCP servers.
- Vercel Services was announced with availability starting July 1, 2026.
Source: vercel.com
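
To make "routing with failover" concrete, here is a generic sketch of the pattern in Python. It is not the AI Gateway or AI SDK API: `call_with_failover` and the provider callables are hypothetical stand-ins, and a real gateway would also handle timeouts, retries, and provider-specific errors.

```python
from typing import Callable, List, Sequence, Tuple

# Each provider is a (name, callable) pair; the callable takes a prompt
# and returns text. Both are hypothetical stand-ins for real model clients.
Provider = Tuple[str, Callable[[str], str]]


def call_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order and return (provider_name, response) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_model(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def backup_model(prompt: str) -> str:
    return f"echo: {prompt}"


used, answer = call_with_failover(
    "hello", [("primary", flaky_model), ("backup", backup_model)]
)
```

The ordered list doubles as a priority list, which keeps the routing policy visible in one place.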
Hugging Face, agentic tooling benchmark for open models published

- Hugging Face published an agent-focused benchmark for testing how open models use real tooling, using transformers as the main case study.
- The evaluation measures process cost rather than final accuracy alone, including turns, tokens, latency, failures, and how directly an agent reaches the result.
- Each task is run under three access tiers: a bare `pip install transformers` setup, a full source checkout, and a packaged Skill with curated docs and task examples.
- The benchmark is executed with the pi coding agent, with each model × revision × task run isolated as a separate Hugging Face Job on identical hardware.
- Results and traces are written to a Hugging Face Bucket to support parallel runs and high write concurrency.
- The post argues that agent-oriented tooling depends on discoverable APIs, structured documentation, and task-specific examples, not only model quality.
- Hugging Face cites earlier hf CLI work where agent use required 1.3–1.8× fewer tokens, with reductions of up to 6× in some cases.
- The article was published on June 18, 2026, and positions the harness as a way to compare model revisions and library changes before shipping large code changes.
Source: huggingface.co
More: venturebeat.com
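
The run layout described above (model × revision × task, each under three access tiers) amounts to a Cartesian product. The sketch below enumerates it under stated assumptions: the tier labels, `RunSpec`, and `build_run_matrix` are hypothetical names, the revision is treated as an opaque string, and treating each tier as its own isolated run is an assumption rather than something the post confirms.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence

# Hypothetical labels for the three access tiers named in the item above.
TIERS = ("pip-install", "source-checkout", "skill")


@dataclass(frozen=True)
class RunSpec:
    """One isolated benchmark run (hypothetical structure)."""
    model: str
    revision: str
    task: str
    tier: str


def build_run_matrix(
    models: Sequence[str],
    revisions: Sequence[str],
    tasks: Sequence[str],
    tiers: Sequence[str] = TIERS,
) -> List[RunSpec]:
    """Enumerate every model x revision x task x tier combination."""
    return [
        RunSpec(model, revision, task, tier)
        for model, revision, task, tier in product(models, revisions, tasks, tiers)
    ]


runs = build_run_matrix(
    models=["model-a", "model-b"],
    revisions=["rev-1", "rev-2"],
    tasks=["task-1", "task-2", "task-3"],
)
```

Each `RunSpec` would map to one job submission, so the size of the matrix is also the number of jobs and result records to expect.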
Jocoletter curates AI, software, and product trends for developers and builders.
#AWS #GitHub #HuggingFace #Vercel