# BitLlama

Pure Rust LLM inference engine with 1.58-bit ternary support and Test-Time Training
BitLlama is a pure-Rust LLM inference engine featuring 1.58-bit ternary quantization, Test-Time Training (TTT), the Soul learning system, an MCP server/client, and private RAG. It supports Llama, Gemma, Mistral, Qwen, and BitNet models. An OpenAI-compatible API server is included.
Latest version: 1.0.0

```shell
winget install --id imonoonoko.BitLlama --exact --source winget
```
| Architecture | Scope | Download | SHA256 |
|---|---|---|---|
| x64 | — | Download | 4F2A1FC7F498F43292E52F32DCF4E88B5AD4C0DFED933AAA5815102CF8D6DCA7 |
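Since the table publishes a SHA256 digest, the download can be checked before installing. A minimal Python sketch, assuming you know the path the download was saved to (the filename used below is hypothetical, not from the listing):

```python
import hashlib

# Published SHA256 for the x64 build (from the table above, lowercased).
EXPECTED_SHA256 = "4f2a1fc7f498f43292e52f32dcf4e88b5ad4c0dfed933aaa5815102cf8d6dca7"

def sha256_of(path: str) -> str:
    """Hash the file in 1 MiB chunks so large installers never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest against the published one, case-insensitively."""
    return sha256_of(path).lower() == expected.lower()

# Hypothetical usage -- substitute the actual downloaded filename:
# verify("BitLlama-x64-installer.exe")  # True if the download is intact
```

On Windows the same check can be done without Python via PowerShell's `Get-FileHash`.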
## Details

- Homepage: https://github.com/imonoonoko/Bit-TTT-Engine
- License: MIT
- Publisher: imonoonoko
- Support: https://github.com/imonoonoko/Bit-TTT-Engine/issues
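Because BitLlama exposes an OpenAI-compatible API server, a client request follows the standard chat-completions shape. A sketch in Python, with the caveat that the host, port, and model name below are assumptions (this listing does not document BitLlama's actual defaults):

```python
import json
import urllib.request

# Assumed endpoint: host, port, and route are placeholders, not documented here.
BASE_URL = "http://localhost:8080/v1/chat/completions"

# Standard OpenAI chat-completions payload; the model name is hypothetical --
# use whichever model BitLlama has loaded.
payload = {
    "model": "llama",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
}

def chat(url: str = BASE_URL, body: dict = payload) -> dict:
    """POST the request as JSON and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any existing OpenAI client library should also work by pointing its base URL at the local server.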