Claude Limits, Training Gains, API Pricing #12
Today's Letter
- Anthropic raises Claude limits, signs SpaceX compute deal
- Unsloth, NVIDIA training optimizations detailed
- DeepSeek V4 Pro pricing cut extended through 31 May
Anthropic raises Claude limits, signs SpaceX compute deal
- Anthropic said it increased usage limits for Claude Code and the Claude API, with all listed changes effective on May 6, 2026
- Claude Code five-hour rate limits were doubled for Pro, Max, Team, and seat-based Enterprise plans
- Anthropic also removed peak-hours limit reductions for Claude Code on Pro and Max accounts
- API rate limits were also raised for Claude Opus models; the full table of new limits is not reproduced here
- The company signed an agreement to use all compute capacity at SpaceX's Colossus 1 data center, adding more than 300 megawatts and over 220,000 NVIDIA GPUs within a month
- Anthropic said this new capacity will directly improve availability for Claude Pro and Claude Max subscribers
- The announcement adds to earlier compute agreements, including up to 5 GW with Amazon, 5 GW with Google and Broadcom starting in 2027, and a $30 billion Azure capacity partnership with Microsoft and NVIDIA
- Anthropic said some new capacity will be deployed internationally, including additional inference capacity in Asia and Europe for enterprise compliance and data residency needs
Source: anthropic.com
More: techzine.eu · business-standard.com · dqindia.com
Unsloth, NVIDIA training optimizations detailed

- Unsloth said its collaboration with NVIDIA made LLM training about 25% faster, with the new optimizations enabled automatically after updating Unsloth
- The post describes three changes: packed-sequence metadata caching, double-buffered async gradient checkpointing, and faster MoE routing for GPT-OSS training
- Unsloth reported that caching packed-sequence metadata improved per-batch training speed by 14.3% on Qwen3-14B QLoRA SFT, with a 43.3% gain in the forward pass and 5.8% in the backward pass
- The metadata path is reused across transformer layers, so caching avoids rebuilding masks and sequence metadata on every layer and reduces repeated synchronization overhead
- For checkpointing, Unsloth uses two buffers so activation reloads from pinned CPU memory to GPU can overlap with backward compute instead of running in a serialized copy-then-compute pattern
- Unsloth said this double-buffered checkpoint reload path produced an 8% speedup on larger dense-model runs on NVIDIA Blackwell B200 GPUs
- The company also said GPT-OSS training became 15% faster by using argsort and bincount during mixture-of-experts routing
- Unsloth said the new changes are additive to its earlier 2-5x training speedups and apply across RTX laptops, data center GPUs, and DGX Spark systems
Source: unsloth.ai
More: techflowpost.com
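
For readers curious how argsort and bincount fit into mixture-of-experts routing, here is a minimal sketch of the general idea, not Unsloth's actual kernel. It assumes each token has already been assigned one expert ID, and it uses pure-Python stand-ins for the tensor-library `argsort` and `bincount` operations. The helper `group_tokens_by_expert` and the sample data are hypothetical.

```python
from itertools import accumulate


def argsort(values):
    """Stable argsort: indices that would sort `values` in ascending order."""
    return sorted(range(len(values)), key=values.__getitem__)


def bincount(values, minlength):
    """Count how often each non-negative integer appears in `values`."""
    counts = [0] * minlength
    for v in values:
        counts[v] += 1
    return counts


def group_tokens_by_expert(expert_ids, num_experts):
    """Group token indices by expert (hypothetical helper for illustration).

    Returns `order`, the token indices sorted so that tokens routed to the
    same expert are contiguous, and `offsets`, where expert e owns the slice
    order[offsets[e]:offsets[e + 1]].
    """
    order = argsort(expert_ids)
    counts = bincount(expert_ids, num_experts)
    offsets = [0] + list(accumulate(counts))
    return order, offsets


# Six tokens routed across four experts (expert 3 receives none).
expert_ids = [2, 0, 1, 0, 2, 2]
order, offsets = group_tokens_by_expert(expert_ids, num_experts=4)

for e in range(4):
    print(e, order[offsets[e]:offsets[e + 1]])
```

Sorting once and slicing by cumulative counts lets each expert process one contiguous batch of tokens, rather than scanning the full assignment list once per expert.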
DeepSeek V4 Pro pricing cut extended through 31 May

- DeepSeek says the 75% discount on DeepSeek-V4-Pro remains in effect until 2026-05-31 15:59 UTC, according to its API pricing page
- Discounted V4 Pro rates are listed per 1M tokens at $0.003625 for cache-hit input, $0.435 for cache-miss input, and $0.87 for output
- The standard V4 Pro prices on the same page are $0.0145 for cache-hit input, $1.74 for cache-miss input, and $3.48 for output per 1M tokens, exactly four times the discounted rates, so the 75% cut appears to be already reflected in the active billing table
- DeepSeek-V4-Pro supports both thinking and non-thinking modes, with 1M context length and up to 384K maximum output
- JSON output, tool calls, chat prefix completion beta, and FIM completion beta are listed as supported features for the model
- DeepSeek also notes that cache-hit input pricing for all models was reduced to one-tenth of launch pricing, effective 2026-04-26 12:15 UTC
- The pricing page says product prices can change and recommends checking the page regularly for the latest billing terms
Source: api-docs.deepseek.com
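
As a quick check on the arithmetic above, the sketch below applies the 75% discount to the standard rates quoted in this item and estimates the cost of one request. The token counts are made up for illustration, and the function names are hypothetical rather than part of any DeepSeek SDK.

```python
from decimal import Decimal

# Standard V4 Pro rates in USD per 1M tokens, as quoted in this item.
STANDARD_USD_PER_1M = {
    "input_cache_hit": Decimal("0.0145"),
    "input_cache_miss": Decimal("1.74"),
    "output": Decimal("3.48"),
}
DISCOUNT = Decimal("0.75")  # the 75% temporary cut


def discounted_rates(standard, discount):
    """Apply a fractional discount to every rate."""
    return {name: rate * (1 - discount) for name, rate in standard.items()}


def request_cost(rates, hit_tokens, miss_tokens, output_tokens):
    """Estimate the USD cost of one request from per-1M-token rates."""
    total = (
        rates["input_cache_hit"] * hit_tokens
        + rates["input_cache_miss"] * miss_tokens
        + rates["output"] * output_tokens
    )
    return total / Decimal(1_000_000)


rates = discounted_rates(STANDARD_USD_PER_1M, DISCOUNT)
for name, rate in rates.items():
    print(name, rate)

# Hypothetical request: 200k cached input, 50k uncached input, 10k output tokens.
print(request_cost(rates, 200_000, 50_000, 10_000))
```

`Decimal` is used instead of floats so the small per-token amounts are computed exactly.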
Jocoletter curates AI, software, and product trends for developers and builders.
#Anthropic #DeepSeek #Unsloth