GitHub repository URL (e.g., https://github.com/jimmc414/onefilellm)
arXiv abstract URL (e.g., https://arxiv.org/abs/2401.14295)
Local folder path (e.g., C:\python\PipMyRide)
YouTube video URL (e.g., https://www.youtube.com/watch?v=KZ_NlnmPQYk)
Webpage URL (e.g., https://llm.datasette.io/en/stable/)
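As a rough illustration of how the input types listed above could be told apart, here is a minimal Python sketch. The function name `classify_source` and the returned labels are hypothetical and chosen for this example; this is not the tool's actual code.

```python
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Guess which kind of input a string is (labels are hypothetical)."""
    parsed = urlparse(source)
    # Anything that is not an http(s) URL is treated as a local folder path.
    if parsed.scheme not in ("http", "https"):
        return "local_folder"
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "github.com":
        return "github_repo"
    if host == "arxiv.org":
        return "arxiv_paper"
    if host in ("youtube.com", "youtu.be"):
        return "youtube_video"
    # Every other http(s) URL falls back to a generic webpage.
    return "webpage"


if __name__ == "__main__":
    for example in (
        "https://github.com/jimmc414/onefilellm",
        "https://arxiv.org/abs/2401.14295",
        r"C:\python\PipMyRide",
    ):
        print(example, "->", classify_source(example))
```

The sketch only looks at the URL scheme and host, so it does not check that a folder exists or that a URL is reachable.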
The tool writes the repo, web documentation, arXiv paper, or YouTube transcript to a text file and copies it to the clipboard, then displays a token count. It also writes a second file with the same content, lowercased and with stopwords removed, to reduce token usage.
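To make the lowercase and stopword step concrete, here is a small Python sketch. The stopword set, the function names `compress_text` and `rough_token_count`, and the whitespace-based count are all assumptions made for illustration; the tool itself may use a different stopword list and a real tokenizer.

```python
import re

# Tiny illustrative stopword set (an assumption, not the tool's actual list).
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "it",
     "that", "for", "on", "with", "as", "this"}
)


def compress_text(text: str) -> str:
    """Lowercase the text and drop whitespace-separated words in STOPWORDS."""
    words = re.findall(r"\S+", text.lower())
    # Words with attached punctuation (e.g. "the,") are kept as-is.
    return " ".join(word for word in words if word not in STOPWORDS)


def rough_token_count(text: str) -> int:
    """Count whitespace-separated words as a crude stand-in for LLM tokens."""
    return len(text.split())


if __name__ == "__main__":
    original = "The purpose of this tool is to HELP"
    compressed = compress_text(original)
    print(compressed)
    print(rough_token_count(original), "->", rough_token_count(compressed))
```

A word count only approximates what an LLM tokenizer would report, so treat the numbers as a relative comparison between the two files rather than an exact budget.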
The purpose of this tool is to make it easier to manually pass repos, papers, transcripts or web documentation into an LLM for inference.
To access GitHub repos, you just need to generate a GitHub personal access token as described in the readme.
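For anyone curious how such a token is typically used, here is a hedged Python sketch that builds (but does not send) an authenticated request to the GitHub REST API contents endpoint. The helper name `build_contents_request` and the `GITHUB_TOKEN` environment variable are assumptions for this example; check the readme for how the tool actually expects the token to be supplied.

```python
import os
import urllib.request


def build_contents_request(owner, repo, path="", token=None):
    """Build a GitHub contents API request; hypothetical helper, sends nothing."""
    # Assumption: the token lives in an environment variable named GITHUB_TOKEN.
    token = token or os.environ.get("GITHUB_TOKEN", "")
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Personal access tokens are passed in the Authorization header.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    request = build_contents_request("jimmc414", "onefilellm", token="dummy-token")
    print(request.full_url)
    print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would perform the actual call; unauthenticated requests also work for public repos but are subject to much lower rate limits.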
I needed this tool for myself and thought it might be of use to someone else.