b8106
🦙 llama-cpp · View on GitHub
✨ 2 features · 🐛 5 fixes · 🔧 2 symbols
Summary
This release introduces full support for the JAIS-2 model architecture, including specific fixes for tokenizer hashing, RoPE type, and control vector support. It also notes that JAIS-2 requires F32 precision accumulators on CUDA.
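The CUDA precision note can be illustrated with a toy experiment: simulating an F16 accumulator in a long same-sign dot product. This is a pure-Python sketch using `struct`'s half-precision `'e'` format; the names `to_f16` and `dot` are illustrative helpers, not llama.cpp APIs.

```python
import struct

def to_f16(x: float) -> float:
    """Round-trip a float through IEEE half precision (simulates F16 storage)."""
    return struct.unpack('e', struct.pack('e', x))[0]

def dot(a, b, f16_accumulate: bool) -> float:
    """Dot product over F16 inputs; optionally round the running sum to F16
    after every step, mimicking an F16 accumulator in a matmul kernel."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += to_f16(x) * to_f16(y)
        if f16_accumulate:
            acc = to_f16(acc)
    return acc

# Many small same-sign terms: once the running sum grows, each new term falls
# below half an F16 ulp and the F16 accumulator stops moving entirely,
# while the F32-style accumulator stays near the true value.
a = [0.01] * 4096
b = [1.0] * 4096
f32_sum = dot(a, b, f16_accumulate=False)
f16_sum = dot(a, b, f16_accumulate=True)
```

The same effect at scale is why attention and matmul kernels for this model need F32 accumulation on CUDA.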
Migration Steps
- It is no longer necessary to override `set_vocab`.
✨ New Features
- Add support for the JAIS-2 family of Arabic-English bilingual models from Inception AI, featuring LayerNorm (no RMSNorm), ReLU² activation, separate Q/K/V projections with biases, simple MLP, RoPE embeddings, and a GPT-2 BPE tokenizer.
- Support for JAIS-2 model sizes: Jais-2-8B and Jais-2-70B.
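A minimal sketch of the two components listed above that distinguish JAIS-2 from typical Llama-style blocks: LayerNorm, which subtracts the mean (RMSNorm does not), and the ReLU² activation used in the MLP. The function names are illustrative, not taken from the codebase.

```python
import math

def layer_norm(xs, eps=1e-5):
    """LayerNorm: subtract the mean, then divide by the standard deviation.
    RMSNorm would skip the mean subtraction and divide by the RMS only."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def relu_squared(x: float) -> float:
    """ReLU² (squared ReLU) activation: max(x, 0) squared."""
    return max(x, 0.0) ** 2
```

Per-layer scale and bias parameters are omitted here for brevity; the real graph applies learned weights after normalization.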
🐛 Bug Fixes
- Run `convert_hf_to_gguf_update.py` to regenerate the jais-2 tokenizer hash.
- Use NEOX RoPE type for JAIS2.
- Remove the Q/K permutation, as NEOX RoPE does not require it.
- Enable flash attention for JAIS2 (fixed by #19115).
- Add a dedicated JAIS2 pre-tokenizer type and control vector support, including `LLAMA_VOCAB_PRE_TYPE_JAIS2` with a cascading whitespace regex and a `build_cvec` call.
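For context on the RoPE fixes above, here is a minimal sketch of NEOX-style ("rotate-half") rotary embedding: it pairs element `i` with element `i + d/2` rather than interleaving adjacent elements as GPT-J-style RoPE does, which is why no extra Q/K permutation of the converted weights is needed. This is an illustrative re-implementation under those assumptions, not llama.cpp's actual kernel.

```python
import math

def rope_neox(x, pos, theta_base=10000.0):
    """Apply NEOX-style RoPE to one head vector x at position pos.
    Rotates the pair (x[i], x[i + d/2]) by an angle pos * theta_base^(-2i/d)."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        freq = theta_base ** (-2.0 * i / d)
        angle = pos * freq
        c, s = math.cos(angle), math.sin(angle)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i] * s + x[i + half] * c
    return out
```

At position 0 every angle is zero, so the transform is the identity; at any position it is a pure rotation, so vector norms are preserved.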