ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models

Token-free byte-to-byte language modeling at scale.