Features include:
- Integration by decorating existing code.
- Automatic management of concurrent (or parallel) execution.
- Easy task scheduling without dealing with low-level concurrency primitives.
- Thread safety.
- Lightweight.
- Support for Python 3.7 through 3.11.
- Asyncify I/O-bound synchronous code.
- Extend the original DAG by composing it with smaller DAGs.
We built this library by iterating a lot on user feedback: the first users were our own colleagues from the R&D team, who used it while developing the API codebase! We tried to make it as generic as possible, so that it wouldn’t be tailored only to their needs but would work for everyone. The downside of developing with a fixed user base is that we tend to focus on their requests, so having other users would be a blessing for getting other points of view. The library is pretty battle-tested, since we’ve been using it in production for almost 2 years now!
Our job is to develop APIs that require long calculations: each API call runs many heavy computer vision deep learning models encapsulated in functions, which depend on the results of other functions, and so on. The crux is that, since the functions don’t all take the same time to run, independent parts of the code could run while other parts are still waiting. As this can naturally be modelled as a DAG (directed acyclic graph), we decided to write (and open-source!) a library enabling easy parallelization of such workflows.
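To make the idea concrete, here is a minimal sketch of running a DAG of functions in parallel using only the Python standard library. It is not our library's API: `run_dag`, the task names, and the toy "models" are all hypothetical and exist only to illustrate how independent branches can run at the same time.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dag(tasks, deps, max_workers=4):
    """Run each task as soon as all of its dependencies have finished.

    tasks: name -> callable receiving its dependencies' results, in order
    deps:  name -> list of task names it depends on
    (Hypothetical helper for illustration only.)
    """
    results = {}
    pending = set(tasks)
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Submit every pending task whose dependencies are all computed.
            ready = [n for n in pending
                     if all(d in results for d in deps.get(n, []))]
            for name in ready:
                args = [results[d] for d in deps.get(name, [])]
                running[pool.submit(tasks[name], *args)] = name
                pending.remove(name)
            if not running:
                raise ValueError("cycle or missing dependency in the graph")
            # Block until at least one running task finishes, then record it.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


# Toy stand-ins for heavy models; the sleeps simulate long computations.
def load_image():
    return "image"

def detect_text(image):
    time.sleep(0.2)
    return f"text({image})"

def detect_faces(image):
    time.sleep(0.2)
    return f"faces({image})"

def merge(text, faces):
    return [text, faces]


tasks = {"load_image": load_image, "detect_text": detect_text,
         "detect_faces": detect_faces, "merge": merge}
deps = {"detect_text": ["load_image"], "detect_faces": ["load_image"],
        "merge": ["detect_text", "detect_faces"]}

results = run_dag(tasks, deps)
print(results["merge"])
```

Here `detect_text` and `detect_faces` both depend only on `load_image`, so they can run side by side, while `merge` waits for both. Writing this bookkeeping by hand for every workflow is exactly the kind of low-level work we wanted to avoid.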
Open-sourcing our work is something we had always wanted to do, and this was the perfect excuse to do it! Any criticism, feedback, or questions are absolutely welcome and encouraged (and will mean a lot!), and we’ll try to address them as precisely as possible!
Matthias & Bashir