openai/tiktoken
tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Builder

OpenAI
openai • ai-lab
Stars
17,776
Using upstream star count
Forks
1,425
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Dec 1, 2022
Project creation date
README Summary
tiktoken is a fast Byte Pair Encoding (BPE) tokenizer library designed for use with OpenAI's language models. It encodes text into tokens and decodes tokens back into text, with support for the different encodings used by OpenAI models. The library is optimized for performance and is commonly used to count tokens before text is sent to OpenAI's API.
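To make the Byte Pair Encoding idea in the summary concrete, here is a minimal pure-Python sketch of greedy pair merging. It is an illustration only and is not tiktoken's implementation: the helper names (`most_frequent_pair`, `merge_pair`, `toy_bpe`) are hypothetical, and the merges are derived from the input text itself, whereas a real tokenizer applies a fixed merge table learned in advance from a training corpus.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return None
    return pairs.most_common(1)[0][0]


def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def toy_bpe(text, num_merges):
    """Hypothetical helper: start from single UTF-8 bytes, then merge greedily."""
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break  # fewer than two tokens left, nothing to merge
        tokens = merge_pair(tokens, pair)
    return tokens
```

Because every token is a slice of the original bytes, joining the tokens always reproduces the input, which is the lossless property a BPE tokenizer relies on. For real token counts, the library's own `tiktoken.get_encoding(...)` and `encode(...)` calls should be used rather than this sketch.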
AI Dev Skills
Unmapped
Natural Language Processing, Tokenization, Byte Pair Encoding, Text Preprocessing, Language Model Integration
Tags
Natural Language Processing, Tokenization, Byte Pair Encoding, Text Preprocessing, Language Model Integration, Batch text processing for ML pipelines, Foundation Model Integration, Large Language Models, On-premise, Text, Self-hosted, Text preprocessing for language model applications, Cloud API, Token counting for API cost estimation, Text tokenization for OpenAI model inference, Python
Recent Activity
Updated 2 months ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality: high
- Maturity: production
Categories
Inference & Serving (Primary), NLP & Text, ML Platform & Infrastructure, Other AI / ML, MLOps & Infrastructure, Foundation Models
PM Skills
Scale & Reliability, Developer Platform
Languages
Python 100.0%
Timeline
- Project created: Dec 1, 2022
- Forked: Mar 14, 2026
- Your last push: 2 months ago
- Upstream last push: 17 days ago
- Tracked since: Feb 8, 2026