Artificial intelligence · June 11, 2026 · via MarkTechPost

Cohere’s ‘North Mini Code’ brings efficient AI coding to your own hardware

Cohere AI this week unveiled North Mini Code, its first developer-facing coding model designed to run efficiently on a single H100 GPU. The open-weight mixture-of-experts (MoE) model packs 30 billion total parameters but activates just three billion during each forward pass, striking a balance between capability and resource demands.

A model built for agentic workflows

North Mini Code targets three core tasks: code generation, agentic software engineering, and terminal operations. Unlike broader multimodal models, it processes text only, with a 256K-token context window and a maximum output of 64K tokens. Cohere optimized the architecture for autonomous workflows that can call tools, reason step by step, and orchestrate sub-agents, all key features for modern AI-driven development pipelines.
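To make the two token limits concrete, here is a minimal sketch of how an agent loop might budget output for a given prompt. It rests on assumptions the article does not confirm: that "K" means 1,024 tokens, and that prompt and output share the same context window. The function name output_budget is hypothetical.

```python
# Assumptions (not confirmed by the source): "K" = 1024 tokens, and the
# prompt and the generated output share one context window.
CONTEXT_WINDOW = 256 * 1024   # 256K-token context window
MAX_OUTPUT = 64 * 1024        # 64K-token maximum output


def output_budget(prompt_tokens: int,
                  context_window: int = CONTEXT_WINDOW,
                  max_output: int = MAX_OUTPUT) -> int:
    """Return how many tokens can still be generated for a prompt of this size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = context_window - prompt_tokens
    # The budget is capped by the output limit and never drops below zero.
    return max(0, min(max_output, remaining))
```

Under these assumptions, a short prompt gets the full 64K output allowance, while a prompt that nearly fills the window leaves only the remaining tokens.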

Lean design, high throughput

The model uses a decoder-only Transformer with sparse MoE layers, interleaving sliding-window and global attention in a 3:1 ratio. Its feed-forward block contains 128 experts, with eight activated per token using a sigmoid-gated router. This design keeps active compute low while preserving broad capacity. In internal benchmarks, North Mini Code delivered up to 2.8 times higher output throughput than a comparable model on identical hardware, alongside a 30% reduction in inter-token latency.
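The routing and layer layout described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Cohere's implementation: the article does not say whether the selected gate values are renormalized (this sketch renormalizes them to sum to 1), nor how the 3:1 interleaving is ordered (this sketch places one global layer after every three sliding-window layers). The names route_token and layer_pattern are hypothetical.

```python
import math
import random

NUM_EXPERTS = 128  # experts per feed-forward block (from the article)
TOP_K = 8          # experts activated per token (from the article)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route_token(logits, top_k=TOP_K):
    """Pick the top_k experts by sigmoid gate score for one token.

    Returns (expert_index, weight) pairs. Renormalizing the selected
    weights to sum to 1 is an assumption, not stated in the source.
    """
    scores = [sigmoid(z) for z in logits]
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    return [(i, scores[i] / total) for i in chosen]


def layer_pattern(num_layers, local_per_global=3):
    """Label each layer as sliding-window or global attention.

    Assumed ordering: local_per_global sliding-window layers, then one global.
    """
    return ["global" if (i + 1) % (local_per_global + 1) == 0 else "sliding"
            for i in range(num_layers)]


# Example: random router logits stand in for a real router's output.
rng = random.Random(0)
example_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
example_route = route_token(example_logits)
```

Because each token reaches only 8 of 128 experts, most feed-forward weights stay idle on any single forward pass, which is how a 30-billion-parameter model can activate roughly three billion parameters per token.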

Weights are released under Apache 2.0 on Hugging Face, with additional access via the Cohere API, Model Vault, and OpenRouter. The minimum hardware requirement is a single H100 GPU operating at FP8 precision, making it feasible for teams to self-host without large clusters. Cohere positions the release within its “sovereign AI” initiative, emphasizing control and autonomy for developers who prefer running models on their own terms.
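A back-of-envelope calculation suggests why one GPU can be enough. The sketch below estimates weight memory only, using the parameter counts reported above. The 80 GB figure is an assumption (the common H100 memory configuration), and the estimate ignores KV cache, activations, and runtime overhead, so it should not be read as a sizing guide.

```python
TOTAL_PARAMS = 30e9    # total parameters (from the article)
ACTIVE_PARAMS = 3e9    # parameters active per forward pass (from the article)
H100_MEMORY_GB = 80    # assumption: common H100 memory configuration


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


fp8_gb = weight_memory_gb(TOTAL_PARAMS, 1)    # FP8: one byte per parameter
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 2)   # 16-bit: two bytes per parameter
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"FP8 weights:    ~{fp8_gb:.0f} GB of {H100_MEMORY_GB} GB")
print(f"16-bit weights: ~{bf16_gb:.0f} GB of {H100_MEMORY_GB} GB")
print(f"Active share per token: {active_fraction:.0%}")
```

At one byte per parameter the weights come to roughly 30 GB, leaving headroom on an 80 GB card for long-context caches. At 16-bit precision the same weights would take about 60 GB.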


Source: MarkTechPost. AI-assisted editorial synthesis — TechnoExpress.
