Agent benchmarks and model release updates #48
Today's Letter
XiaomiMiMo, MiMo-V2.5-Pro-FP4-DFlash page posted

- A Hugging Face model page for XiaomiMiMo/MiMo-V2.5-Pro-FP4-DFlash was posted on 2026-06-08.
- The page identifies MiMo-V2.5-Pro as a Xiaomi MiMo Team model with 1T parameters.
- A 1M-token context window is stated in the retrieved source text.
- The repository name indicates an FP4 DFlash variant rather than a generic MiMo-V2.5-Pro listing.
- An MIT license is shown on the model page.
- No pricing, benchmark table, or deployment requirements are clearly provided in the retrieved source text.
Source: huggingface.co
More: mimo.xiaomi.com
NVIDIA, GB300 NVL72 tops AA-AgentPerf launch

- NVIDIA said its GB300 NVL72 posted the top launch result on Artificial Analysis' AA-AgentPerf coding benchmark.
- AA-AgentPerf measures how many concurrent coding agents a system can serve while meeting model-specific SLO targets.
- The launch benchmark uses DeepSeek-V4-Pro and reports results normalized per accelerator and per megawatt.
- SLO tiers pair an output-speed target with a P95 time-to-first-token (TTFT) target, written as speed/TTFT: 30/10s, 100/5s, and 300/3s.
- Test trajectories are based on public code-repository issues, cover 12+ programming languages, and include tool use.
- Request lengths range from 5K to 131K tokens, with an average of about 27K tokens per request.
- Tool-call latency is simulated with a shared CPU baseline using a one-second median delay across tested systems.
- NVIDIA said GB300 NVL72 delivered up to 20x higher concurrent agent throughput per megawatt than H200.
Source: developer.nvidia.com
More: blogs.nvidia.com
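For readers who want to see how a metric of this shape could be computed, here is a minimal Python sketch. It is not Artificial Analysis' or NVIDIA's harness. The three tier values come from the item above; everything else is an assumption made for illustration: the tier names, the unit of output speed (tokens per second per agent), the rule that every agent must meet the speed target, the nearest-rank P95, and all numbers in RUNS.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SloTier:
    min_output_speed: float  # assumed unit: output tokens per second per agent
    max_p95_ttft_s: float    # P95 time-to-first-token limit, in seconds


# Tier values as listed in the newsletter item; the names are hypothetical.
TIERS = {
    "relaxed": SloTier(30, 10),
    "standard": SloTier(100, 5),
    "fast": SloTier(300, 3),
}


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def meets_slo(tier, ttft_s, speeds):
    """Assumed rule: P95 TTFT within the limit and every agent at or above the speed target."""
    return p95(ttft_s) <= tier.max_p95_ttft_s and min(speeds) >= tier.min_output_speed


def max_concurrent_agents(tier, runs):
    """Largest tested concurrency level whose measurements satisfy the tier."""
    passing = [r["agents"] for r in runs if meets_slo(tier, r["ttft_s"], r["speeds"])]
    return max(passing, default=0)


def normalize(agents, accelerators, power_mw):
    """Return (agents per accelerator, agents per megawatt)."""
    return agents / accelerators, agents / power_mw


# Made-up measurements at three concurrency levels, for illustration only.
RUNS = [
    {"agents": 8, "ttft_s": [1.0, 1.5, 2.0, 2.5], "speeds": [120, 110, 105, 130]},
    {"agents": 16, "ttft_s": [2.0, 3.0, 4.0, 4.5], "speeds": [101, 104, 100, 110]},
    {"agents": 32, "ttft_s": [3.0, 5.0, 6.0, 7.0], "speeds": [80, 90, 70, 85]},
]

if __name__ == "__main__":
    for name, tier in TIERS.items():
        agents = max_concurrent_agents(tier, RUNS)
        # Hypothetical system: 8 accelerators drawing 0.5 MW.
        per_accel, per_mw = normalize(agents, accelerators=8, power_mw=0.5)
        print(name, agents, per_accel, per_mw)
```

The point of the sketch is the shape of the metric: a stricter tier admits fewer concurrent agents on the same hardware, and dividing by accelerator count or power draw is what makes differently sized systems comparable.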
Jocoletter curates AI, software, and product trends for developers and builders.
#NVIDIA #XiaomiMiMo