Tokenizer

Microsoft Corporation · Microsoft.Tokenizer

Contains TypeScript and C# implementations of a byte pair encoding (BPE) tokenizer for OpenAI LLMs. It is based on the open-source Rust implementation in OpenAI's tiktoken.
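
For readers unfamiliar with BPE, the following is a minimal sketch of the core merge loop, not this package's API. It assumes a tiny, invented merge table (toyRanks) and works on characters rather than UTF-8 bytes. The names bpeEncode, pairKey, and toyRanks are hypothetical. tiktoken-style tokenizers also pre-split the text with a regular expression and map the merged pieces to integer token IDs, which this sketch omits.

```typescript
// Toy BPE merge loop. The merge table is invented for illustration only;
// real vocabularies contain many thousands of ranked merges.

// Build a lookup key for an adjacent pair (separator cannot appear in the toy input).
function pairKey(left: string, right: string): string {
  return `${left}\u0000${right}`;
}

// Hypothetical merge ranks: a lower rank means the pair is merged earlier.
const toyRanks = new Map<string, number>([
  [pairKey("l", "l"), 0],   // "l" + "l"   -> "ll"
  [pairKey("h", "e"), 1],   // "h" + "e"   -> "he"
  [pairKey("he", "ll"), 2], // "he" + "ll" -> "hell"
]);

function bpeEncode(word: string, ranks: Map<string, number>): string[] {
  // Start from single characters (a byte-level tokenizer would start from bytes).
  let parts: string[] = Array.from(word);

  while (parts.length > 1) {
    // Find the adjacent pair with the lowest rank.
    let bestIdx = -1;
    let bestRank = Infinity;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = ranks.get(`${parts[i]}\u0000${parts[i + 1]}`);
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIdx = i;
      }
    }
    if (bestIdx === -1) break; // no known pair left to merge

    // Replace the two pieces with their concatenation.
    const merged = `${parts[bestIdx]}${parts[bestIdx + 1]}`;
    parts = [...parts.slice(0, bestIdx), merged, ...parts.slice(bestIdx + 2)];
  }
  return parts;
}

console.log(bpeEncode("hello", toyRanks)); // [ 'hell', 'o' ]
```

With this toy table, "hello" is merged in rank order: "l"+"l" first, then "h"+"e", then "he"+"ll", leaving the pieces "hell" and "o". Input with no known pairs is returned as individual characters.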

winget install --id Microsoft.Tokenizer --exact --source winget

Latest version: 1.3.3

Installer type: zip

Architecture  Scope  Download  SHA256
x64           -      Download  E0F039C6521FB7A4E3013386CD60ED64FB44AD9CAB74BBE7D744B60D1AFD5CE4

Details

Homepage: https://github.com/microsoft/Tokenizer
License: MIT

Tags

Microsoft.ML.Tokenizers, Microsoft.DeepDev.TokenizerLib, typescript, tokenization, requirescmd