megatron.tokenizer.gpt2_tokenization#
Description
Tokenization classes for OpenAI GPT.
Classes
|
GPT-2 BPE tokenizer. Peculiarities: |
Functions
Returns list of utf-8 byte and a corresponding list of unicode strings. |
|
|
Return set of symbol pairs in a word. |