Show HN: Byte-Pair Encoding tokenizer for training LLMs on large datasets | Heykuki News