Does anyone have experience using lightweight transforms (DuckDB) together with the S3 API? I watched this discussion between Chad and Nicolas with great interest. The combination was emphasized there, but without specific examples of how to implement it.
I’m very familiar with these APIs, so I’m excited about the possibility that I’m missing something, and I’d like to learn more. I would appreciate any feedback and examples here.
As far as I know, Palantir PD is close to shipping official lightweight DuckDB bindings. Maybe @jayad could share more concrete information.
For our use cases, I provided our devs with a common reusable component that hooks into the ‘internal’ pieces of the lightweight APIs, and I plan to move over to the official bindings once they are available!
Once that is done, and if there is interest, I can open-source the SQLFrame glue code for running existing PySpark code on DuckDB.
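Until the official bindings are out, here is a rough sketch of how DuckDB can be pointed at an S3-compatible API in general. It is not the official bindings and not the reusable component mentioned above. The `S3Settings` class, the helper names, and every endpoint, credential, and bucket value are hypothetical placeholders; what the platform's S3 API actually expects for those values needs to be checked against its documentation. The only DuckDB features assumed are the `httpfs` extension, `CREATE SECRET` with `TYPE S3`, and `read_parquet`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Settings:
    """Hypothetical connection settings; all values are placeholders."""
    endpoint: str            # host of the S3-compatible API, without scheme
    key_id: str              # access key id issued for the S3 API
    secret: str              # matching secret access key
    region: str = "us-east-1"
    url_style: str = "path"  # many non-AWS endpoints need path-style URLs


def _quote(value: str) -> str:
    """Render a Python string as a SQL string literal."""
    return "'" + value.replace("'", "''") + "'"


def secret_sql(settings: S3Settings) -> str:
    """Build the CREATE SECRET statement that configures DuckDB's S3 access."""
    return (
        "CREATE OR REPLACE SECRET s3_api ("
        f"TYPE S3, KEY_ID {_quote(settings.key_id)}, "
        f"SECRET {_quote(settings.secret)}, "
        f"REGION {_quote(settings.region)}, "
        f"ENDPOINT {_quote(settings.endpoint)}, "
        f"URL_STYLE {_quote(settings.url_style)})"
    )


def scan_sql(bucket: str, prefix: str = "") -> str:
    """Build a query that reads every Parquet file under bucket/prefix."""
    parts = [bucket] + ([prefix.strip("/")] if prefix.strip("/") else [])
    path = "s3://" + "/".join(parts) + "/**/*.parquet"
    return f"SELECT * FROM read_parquet({_quote(path)})"


def preview(settings: S3Settings, bucket: str, prefix: str = "", limit: int = 10):
    """Run the scan and return the first rows (needs duckdb and network access)."""
    import duckdb  # imported lazily so the SQL builders work without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    con.execute(secret_sql(settings))
    return con.execute(f"{scan_sql(bucket, prefix)} LIMIT {int(limit)}").fetchall()
```

Keeping the SQL builders separate from `preview` means the generated statements can be inspected or unit-tested without credentials. Only `preview` touches the network.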
Thank you for flagging this! We’ve been looking forward to this update and will be moving our internal libraries over to these streamlined APIs. Also, thank you for sharing the SQLFrame example. It’s an awesome idea for migrating transforms with little impact on the logic.