Copilot Adoption Metrics and Inference Tools #33

Today's Letter

  1. GitHub, Copilot metrics API adds AI adoption cohorts
  2. NVIDIA, DynoSim inference simulator introduced

GitHub, Copilot metrics API adds AI adoption cohorts

  • GitHub added AI adoption cohorting to the Copilot usage metrics API on May 29, 2026
  • User-level reports now include a new `ai_adoption_phase` field, based on Copilot product usage across a rolling 28-day window
  • Each engaged user is classified into one of four phases when a qualifying surface was used on at least two days in that window
  • Phase 1 covers code completion or IDE agent mode, Phase 2 covers a single GitHub-based agent surface, and Phase 3 covers two or more agent surfaces or the GitHub Copilot app
  • Enterprise- and organization-level reports now expose a `totals_by_ai_adoption_phase` array for per-phase engagement and activity metrics
  • Reported metrics include engaged users, interaction averages, code generation and acceptance, lines added and deleted, pull request activity, and median time-to-merge averages
  • GitHub said the cohort value includes a version field starting at `v1`, so the classification model can change without breaking historical context
  • The API is available to enterprise administrators and organization owners with Copilot usage metrics access through the REST API

Source: github.blog
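
The phase rules summarized above can be sketched in a few lines of Python. This is an illustration only, not GitHub's implementation: the surface labels (`code_completion`, `ide_agent_mode`, `coding_agent`, `code_review_agent`, `copilot_app`) are hypothetical names, and the sketch assumes the two-day threshold applies to each surface separately. The digest describes only three of the four phases, so the sketch returns `None` when none of those three applies.

```python
from datetime import date, timedelta

WINDOW_DAYS = 28      # rolling window length from the summary
MIN_ACTIVE_DAYS = 2   # a surface qualifies when used on at least two days

# Hypothetical surface labels, chosen for illustration only.
IDE_SURFACES = {"code_completion", "ide_agent_mode"}
GITHUB_AGENT_SURFACES = {"coding_agent", "code_review_agent"}
COPILOT_APP = "copilot_app"


def qualifying_surfaces(usage, as_of):
    """Return surfaces used on >= MIN_ACTIVE_DAYS distinct days in the window."""
    window_start = as_of - timedelta(days=WINDOW_DAYS - 1)
    return {
        surface
        for surface, days in usage.items()
        if len({d for d in days if window_start <= d <= as_of}) >= MIN_ACTIVE_DAYS
    }


def adoption_phase(usage, as_of):
    """Map per-surface usage dates to phase 1, 2, or 3 (None if no rule matches)."""
    active = qualifying_surfaces(usage, as_of)
    agents = active & GITHUB_AGENT_SURFACES
    if COPILOT_APP in active or len(agents) >= 2:
        return 3
    if len(agents) == 1:
        return 2
    if active & IDE_SURFACES:
        return 1
    return None  # the remaining phase is not described in the summary


example = {
    "code_completion": [date(2026, 5, 20), date(2026, 5, 21)],
    "coding_agent": [date(2026, 5, 10), date(2026, 5, 11)],
}
print(adoption_phase(example, date(2026, 5, 29)))  # 2
```

Checking the highest phase first means a user who also uses code completion is still counted in the most advanced cohort they qualify for, which is one plausible reading of the summary.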


NVIDIA, DynoSim inference simulator introduced

  • NVIDIA published DynoSim on May 29, 2026 as a workload-driven discrete-event simulator for the Dynamo LLM serving stack
  • The project targets deployment tuning across backend choice, tensor parallelism, prefill/decode split, worker counts, scheduler settings, routing, KV cache behavior, autoscaling, and topology
  • DynoSim composes workload replay, engine simulations, Router, Planner, and optional KV behavior on a single virtual timeline rather than using a bit-exact hardware emulation model
  • Engine timing is informed by AI Configurator, while scheduler logic models backend-specific serving behavior such as vLLM preemption and SGLang radix-cache-aware admission
  • NVIDIA reports that a single-threaded offline replay, written in Rust and run on an Apple M4 MacBook Air, simulated a 23,608-request Mooncake trace covering a 60.1-minute serving window in 2.41 seconds, about 1,500x faster than real time
  • The stated use case is a simulate-then-verify loop that screens large numbers of serving configurations before running full GPU experiments
  • In NVIDIA's Router example, KV-aware routing raised prefix cache reuse from about 0.38 to 0.44-0.45 and improved TTFT and throughput versus round-robin placement across the tested concurrency sweep
  • The post positions DynoSim as a way to search Pareto tradeoffs and test serving-stack changes such as router cost functions, planner heuristics, and cache policies
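
To make the "single virtual timeline" idea concrete, here is a toy discrete-event loop in Python. It is not DynoSim code (DynoSim is described as a Rust project) and it models nothing about engines, KV caches, or planners. The `simulate` function, the round-robin placement, and the fixed per-request service times are all assumptions made for illustration.

```python
import heapq
from collections import deque


def simulate(requests, num_workers):
    """Toy discrete-event simulation of FIFO workers with round-robin placement.

    requests: list of (arrival_s, service_s) tuples.
    Returns (completion time per request id, final virtual time).
    """
    events = []  # heap of (time, seq, kind, request_id, worker)
    for rid, (arrival, _) in enumerate(requests):
        heapq.heappush(events, (arrival, rid, "arrive", rid, rid % num_workers))
    seq = len(requests)  # unique tie-breaker for events scheduled later

    queues = [deque() for _ in range(num_workers)]
    busy = [False] * num_workers
    done = {}
    now = 0.0

    while events:
        # The clock jumps straight to the next event; idle time costs nothing.
        now, _, kind, rid, worker = heapq.heappop(events)
        if kind == "arrive":
            queues[worker].append(rid)
        else:  # "finish"
            done[rid] = now
            busy[worker] = False
        if not busy[worker] and queues[worker]:
            nxt = queues[worker].popleft()
            busy[worker] = True
            heapq.heappush(events, (now + requests[nxt][1], seq, "finish", nxt, worker))
            seq += 1
    return done, now


completions, end_time = simulate([(0.0, 2.0), (0.0, 1.0), (0.5, 1.0)], num_workers=2)
print(completions, end_time)
```

Because the loop only visits moments when something happens, the wall-clock cost depends on the number of events rather than on the length of the simulated window, which is the general reason such simulators can run much faster than real time.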

Source: developer.nvidia.com
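
The screening step of a simulate-then-verify loop can be sketched as a Pareto filter over simulated results. The `Result` type, the choice of TTFT and throughput as the two objectives, and the numbers in the example are illustrative assumptions, not NVIDIA data or DynoSim output.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str          # label for a serving configuration (hypothetical)
    ttft_ms: float     # time to first token, lower is better
    throughput: float  # requests per second, higher is better


def dominates(a, b):
    """True if a is at least as good as b on both metrics and better on one."""
    no_worse = a.ttft_ms <= b.ttft_ms and a.throughput >= b.throughput
    better = a.ttft_ms < b.ttft_ms or a.throughput > b.throughput
    return no_worse and better


def pareto_front(results):
    """Keep only configurations that no other configuration dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Made-up simulated results, for illustration only.
simulated = [
    Result("A", 100.0, 50.0),
    Result("B", 120.0, 60.0),
    Result("C", 130.0, 55.0),  # dominated by B
    Result("D", 100.0, 45.0),  # dominated by A
]
shortlist = pareto_front(simulated)
print([r.name for r in shortlist])  # ['A', 'B']
```

Only the shortlisted configurations would then go on to full GPU experiments, which is where the simulator's speed pays off.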


Jocoletter curates AI, software, and product trends for developers and builders.

#GitHub #NVIDIA
