What is Pathik?
• A high-performance web crawler written in Go, with Python bindings
• Uses 134x less memory than Playwright, thanks to Rod, the Go library it uses to drive the browser
• Supports proxies
• Outputs clean Markdown, making it LLM-friendly for fine-tuning & retrieval use cases
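To illustrate why Markdown output is convenient for retrieval, here is a minimal sketch of one way to split a crawled Markdown page into heading-based chunks before embedding or indexing. It does not use Pathik's API at all: the function name split_markdown_by_heading and the sample text are hypothetical, and the input is assumed to be a Markdown string you already have.

```python
import re
from typing import Dict, List


def split_markdown_by_heading(markdown: str) -> List[Dict[str, str]]:
    """Split Markdown text into chunks, one per heading section.

    Hypothetical helper for illustration; not part of Pathik.
    """
    chunks: List[Dict[str, str]] = []
    title = ""  # text before the first heading gets an empty title
    body: List[str] = []

    def flush() -> None:
        # Store the section collected so far, skipping empty ones.
        text = "\n".join(body).strip()
        if title or text:
            chunks.append({"title": title, "text": text})

    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # A new heading closes the previous section.
            flush()
            title = match.group(2).strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks


# Made-up sample standing in for a crawled page.
sample = "# Intro\nSome opening text.\n\n## Usage\nInstall it with pip.\n"
chunks = split_markdown_by_heading(sample)
for chunk in chunks:
    print(chunk["title"], "->", chunk["text"])
```

This simple version treats any line starting with "#" as a heading, so comment lines inside fenced code blocks would be split too; a real pipeline would likely use a proper Markdown parser.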
Why does this matter?
• Avoids the overhead of heavyweight browser automation
• Ideal for structured web crawling for LLMs, RAG pipelines, and knowledge extraction
• Installs with a single command, pip install pathik, and you're ready to go
Try it out
pip install pathik
Example script:
github.com/justrach/pathik/blob/main/examples/example.py
If you find it useful, drop a star: github.com/justrach/pathik
Would love feedback & contributions!