Liquid AI Unveils Compact Models for Fast Multilingual Search

Liquid AI has introduced two compact retrieval models designed to accelerate multilingual and cross-lingual search across 11 languages. The new models, LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M, each contain 350 million parameters and are built on the LFM2.5-350M-Base architecture released earlier this year. Both are tailored for fast, efficient search in contexts such as product catalogs, FAQ knowledge bases, and support documentation.
A tale of two approaches
The two models share a common bidirectional encoder backbone but differ in how they represent text. LFM2.5-Embedding-350M functions as a dense bi-encoder, converting each document into a single vector. This design prioritizes speed and minimal storage, making it ideal for applications where efficiency is critical. In contrast, LFM2.5-ColBERT-350M adopts a late-interaction strategy, generating vector representations for each token. This method enables word-by-word matching between queries and documents, improving accuracy and generalization at the cost of a larger index. Its query length is capped at 32 tokens, and it can also rerank results from a first-stage retriever without requiring an index build.
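To make the contrast concrete, here is a minimal sketch in plain Python of the two scoring schemes. It uses toy vectors rather than real model outputs, and the function names (dense_score, maxsim_score, rerank) are illustrative, not part of any Liquid AI API. Late interaction is shown with the MaxSim operator familiar from ColBERT-style models; that the new model scores exactly this way is an assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def dense_score(query_vec, doc_vec):
    """Bi-encoder style: one vector per text, one similarity per pair."""
    return cosine(query_vec, doc_vec)

def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction style: each query token keeps its best-matching
    document token, and the per-token maxima are summed."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)

def rerank(query_tokens, candidates):
    """Reorder first-stage candidates by MaxSim. No index is needed,
    only the token vectors of the candidates themselves.
    `candidates` maps a (hypothetical) doc id to its token vectors."""
    scored = [(maxsim_score(query_tokens, toks), doc_id)
              for doc_id, toks in candidates.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

# Toy data: two query tokens, two candidate documents.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "doc_b": [[1.0, 1.0]],
}
ranking = rerank(query_tokens, candidates)
```

In this toy case doc_a ranks first because it contains an exact match for each query token, while doc_b offers only a partial match for both. The storage trade-off is also visible: the dense path would keep one vector per document, whereas the late-interaction path keeps one per token.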
From causal to bidirectional
Both models stem from the general-purpose LFM2.5-350M-Base checkpoint, which is a causal decoder. Liquid AI adapted the architecture by applying bidirectional patches: replacing the causal attention mask with a bidirectional one and removing causal constraints from the short convolutions. These changes let each token draw on both left and right context, so the model can produce the full-context representations that retrieval tasks need. Despite the architectural shift, the models retain the efficiency of the LFM2 backbone, with 17 layers (10 convolution, 6 attention, and 1 pooling or dense layer) and a context length of 32,768 tokens, though they are tuned for 512-token documents.
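The two patches can be illustrated with a small sketch in plain Python. It is not Liquid AI's code: the helper names are hypothetical, and the symmetric padding used for the non-causal convolution is an assumption, since the article does not say how the causal constraint was lifted.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j.
    Causal: only the current and earlier positions are visible."""
    return [[j <= i for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Bidirectional: every position may attend to every other."""
    return [[True] * n for _ in range(n)]

def causal_conv(x, kernel):
    """Short 1-D convolution padded on the left only, so output i
    depends on x[i] and earlier inputs."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]

def centered_conv(x, kernel):
    """Same kernel with padding split across both sides (assumed here),
    so output i also depends on later inputs."""
    k = len(kernel)
    left = (k - 1) // 2
    right = k - 1 - left
    padded = [0.0] * left + list(x) + [0.0] * right
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]

x = [1.0, 2.0, 3.0]
kernel = [1.0, 1.0, 1.0]
causal_out = causal_conv(x, kernel)      # [1.0, 3.0, 6.0]
centered_out = centered_conv(x, kernel)  # [3.0, 6.0, 5.0]
```

With the causal version, the first output sees only the first input. With the centered version it also sees the second, which is the kind of right-hand context a retrieval encoder benefits from.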
Training for multilingual performance
The models follow a three-stage training process: large-scale contrastive pretraining in English, multilingual and cross-lingual distillation from a strong teacher across all 11 languages, and final fine-tuning using hard-mined negatives. The Embedding model receives slightly more cross-lingual data, to compensate for the natural advantage that late-interaction setups have in cross-lingual retrieval. Training data combines curated internal sources with open English retrieval datasets, augmented through LLM-based translation to create multilingual and cross-lingual pairs.
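The article does not name the loss function, so the following is only a sketch of a common choice for contrastive retrieval training, the InfoNCE objective, written in plain Python. The function name, the temperature value, and the toy vectors are all assumptions made for illustration. The sketch shows why hard negatives matter: a negative that resembles the query yields a larger loss, and therefore a stronger training signal, than one that is obviously unrelated.

```python
import math

def info_nce(query, positive, negatives, temperature=0.05):
    """Contrastive loss for one query: the negative log-probability of
    the positive among the positive plus all negatives, with
    dot-product similarity scaled by a temperature."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

query = [1.0, 0.0]
positive = [1.0, 0.0]
easy_negative = [0.0, 1.0]   # unrelated to the query
hard_negative = [0.9, 0.436] # close to the query, but not the answer

loss_easy = info_nce(query, positive, [easy_negative])
loss_hard = info_nce(query, positive, [hard_negative])
```

Here loss_easy is close to zero, because the model already separates the pair, while loss_hard is noticeably larger. That gap is the usual motivation for mining hard negatives in a final fine-tuning stage.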
Liquid AI evaluated the models on multilingual retrieval using NanoBEIR and on cross-lingual open-domain QA using MKQA-11, reporting results across Arabic, German, English, Spanish, French, Italian, Japanese, Korean, Norwegian, Portuguese, and Swedish. According to the company, each model on average leads its respective class on these benchmarks.
Source: MarkTechPost. AI-assisted editorial synthesis — TechnoExpress.

