I am Paul, and I would like to get some feedback on the tool we are releasing as beta today.
3LC is an ML tool that provides detailed insights, real-time, data-centric iterative workflows for training and finetuning, and data quality improvements for your Machine Learning datasets and models. It serves as a visualizer, editor, and debugger, focusing on how models learn from the training data.
Key Features of 3LC:
• Detailed Data Analysis: 3LC enables users to dive into model performance beyond typical labeling errors. It offers the capability to analyze intricate false positives, track embedding-space dynamics, and perform interactive metrics analysis on a per-sample, per-epoch basis.
• Real-Time Visualization: Users can visualize every data point immediately after training in our Dashboard, create interactive 2D/3D plots, filter outliers in real time, and find correlations in the recorded metrics. All results are visualized and linked to the underlying original data.
• Interactive Data Editing: At its core, 3LC allows for on-the-fly modifications of training data. Users can, for example, adjust bounding boxes in image datasets or change sample weights based on their trajectories in embedding space, directly influencing subsequent training rounds.
• Seamless Integration: 3LC can integrate into existing PyTorch training scripts without significantly changing your established workflow. It operates within any system setup, be it locally on your laptop, on-prem HPC, or at your favorite cloud provider.
• Non-Intrusive Data Revisions: 3LC's data modifications are sparse and do not duplicate or relocate data. No upload of data to a SaaS solution!
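To illustrate what a sparse, non-duplicating revision can look like in general, here is a minimal Python sketch. It is not 3LC's API or implementation: the `SparseRevision` class, its `set` method, and the sample records are hypothetical names made up for this example. The idea shown is simply that edits are stored as a small overlay keyed by sample index, while the original records are never copied or changed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SparseRevision:
    """Hypothetical overlay of edits on top of an untouched base dataset.

    Illustration only; this is not how 3LC is implemented.
    """

    base: list[dict[str, Any]]
    # Only edited fields are stored, keyed by sample index.
    edits: dict[int, dict[str, Any]] = field(default_factory=dict)

    def set(self, index: int, **changes: Any) -> None:
        """Record changed fields for one sample without touching the base."""
        if not 0 <= index < len(self.base):
            raise IndexError(index)
        self.edits.setdefault(index, {}).update(changes)

    def __getitem__(self, index: int) -> dict[str, Any]:
        """Return the base record with any recorded edits applied on top."""
        return {**self.base[index], **self.edits.get(index, {})}

    def __len__(self) -> int:
        return len(self.base)


# Made-up sample records standing in for an image dataset.
base = [
    {"image": "img_000.jpg", "label": "cat", "weight": 1.0},
    {"image": "img_001.jpg", "label": "cat", "weight": 1.0},
    {"image": "img_002.jpg", "label": "dog", "weight": 1.0},
]

rev = SparseRevision(base)
rev.set(1, label="dog")   # fix a suspected labeling error
rev.set(2, weight=0.0)    # down-weight a sample for the next training round
```

Reading `rev[1]` returns the corrected label while `base[1]` still holds the original, and `rev.edits` contains only the two changed entries, so the cost of a revision scales with the number of edits rather than the size of the dataset.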
We have tried to make 3LC as minimally intrusive as possible – enabling full data-centric workflows wherever you run your training or finetuning.
Why This Matters:
We have designed 3LC to provide insights into Machine Learning workflows that are often not visible in traditional setups. The aim is, of course, to help produce more accurate ML models, but we have also seen users reduce both their model and training dataset sizes while improving accuracy.
This first release is aimed at the Computer Vision domain. However, we are hard at work on UI improvements and integrations to support LLM finetuning.
To integrate 3LC into your projects, you can start with a simple installation:
pip install 3lc
or visit https://pypi.org/project/3lc
For further details and documentation, visit https://docs.3lc.ai
We welcome feedback from the HackerNews community to help us improve and develop 3LC further.
Thank you for your time!