Copilot Adoption Metrics and Inference Tools #33
Today's Letter
GitHub, Copilot metrics API adds AI adoption cohorts

- GitHub added AI adoption cohorting to the Copilot usage metrics API on May 29, 2026
- User-level reports now include a new `ai_adoption_phase` field, based on Copilot product usage across a rolling 28-day window
- Each engaged user is classified into one of four phases when a qualifying surface was used on at least two days in that window
- Phase 1 covers code completion or IDE agent mode, Phase 2 covers a single GitHub-based agent surface, and Phase 3 covers two or more agent surfaces or the GitHub Copilot app
- Enterprise- and organization-level reports now expose a `totals_by_ai_adoption_phase` array for per-phase engagement and activity metrics
- Reported metrics include engaged users, interaction averages, code generation and acceptance, lines added and deleted, pull request activity, and median time-to-merge averages
- GitHub said the cohort value includes a version field starting at `v1`, so the classification model can change without breaking historical context
- The API is available to enterprise administrators and organization owners with Copilot usage metrics access through the REST API
Source: github.blog
NVIDIA, DynoSim inference simulator introduced

- NVIDIA published DynoSim on May 29, 2026 as a workload-driven discrete-event simulator for the Dynamo LLM serving stack
- The project targets deployment tuning across backend choice, tensor parallelism, prefill/decode split, worker counts, scheduler settings, routing, KV cache behavior, autoscaling, and topology
- DynoSim composes workload replay, engine simulations, Router, Planner, and optional KV behavior on a single virtual timeline rather than using a bit-exact hardware emulation model
- Engine timing is informed by AI Configurator, while scheduler logic models backend-specific serving behavior such as vLLM preemption and SGLang radix-cache-aware admission
- NVIDIA reports a single-threaded Rust offline replay on an Apple M4 MacBook Air simulated a 23,608-request Mooncake trace in 2.41 seconds for a 60.1-minute serving window, about 1,500x faster than real time
- The stated use case is a simulate-then-verify loop that screens large numbers of serving configurations before running full GPU experiments
- In NVIDIA's Router example, KV-aware routing raised prefix cache reuse from about 0.38 to 0.44-0.45 and improved TTFT and throughput versus round-robin placement across the tested concurrency sweep
- The post positions DynoSim as a way to search Pareto tradeoffs and test serving-stack changes such as router cost functions, planner heuristics, and cache policies
Source: developer.nvidia.com
Jocoletter curates AI, software, and product trends for developers and builders.
#GitHub #NVIDIA