A few friends and I have spent the past 6 years or so developing a way to write data transformation code that adapts easily to changes in logic or data elements, both upstream and downstream, without the need for major refactoring, regression testing, or re-orchestration.
We decided to open source the project about two months ago and published a CLI tool after we realized how big a task it is to take on incumbents like stored procedures, dbt, and PySpark.
It is early days for our community and we are looking to grow and engage with others to poke holes and contribute ideas!
For an overview of the concepts and why we built it, our most recent blog post does a decent job of introducing the core ideas: https://www.dataforgelabs.com/blog/introduction-dataforge-fr...
We will be publishing more content in the near future to dig into how we actually took these ideas and implemented them across the project.
Feel free to submit ideas or issues directly on GitHub!