I am working on a project that gathers MLB Statcast data through a Python script. My plan is to combine all of the data in Pipeline Builder and then create a finalized dataset that I can build a dashboard on in Workshop.
Below is my current pipeline with MLB data up to 6/29/2025:
To start the project, I gathered all of the data from the MLB season up to this point and uploaded it into the dataset in the screenshot titled “Statcast Data from Openi…”. I then performed some transformations on it and joined a table to convert player IDs to names, which produced the finalized dataset at the far right of the screenshot.
How do I integrate a daily Python script that pulls the previous day's MLB Statcast data, sends it through my Pipeline Builder pipeline to be transformed, and then appends it to the bottom of the finalized dataset at the far right of the screenshot?
I already have that script, and if I run it in Google Colab it outputs a .csv, but I don’t want to have to do that manually every day.
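A minimal sketch of the daily step described above, assuming it boils down to "work out yesterday's date, fetch that single day, write one CSV". It is not the actual script: `fetch_statcast` is a hypothetical placeholder that returns dummy rows, and the dated filename is only an illustrative choice so that each run produces a separate file to append.

```python
import csv
import io
from datetime import date, timedelta


def yesterday_window(today=None):
    """Return (start, end) ISO dates covering only the day before `today`."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    return yesterday.isoformat(), yesterday.isoformat()


def fetch_statcast(start_dt, end_dt):
    """Hypothetical stand-in for the real Statcast pull.

    Returns a list of dicts, one per row. The values here are dummy data;
    replace the body with the download logic from the existing script.
    """
    return [
        {"game_date": start_dt, "batter": 1, "pitcher": 2, "events": "single"},
    ]


def daily_csv(today=None, fetch=fetch_statcast):
    """Build (filename, csv_text) containing yesterday's rows only."""
    start_dt, end_dt = yesterday_window(today)
    rows = fetch(start_dt, end_dt)
    buf = io.StringIO()
    if rows:
        # Use the keys of the first row as the CSV header.
        writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return f"statcast_{start_dt}.csv", buf.getvalue()


if __name__ == "__main__":
    name, text = daily_csv()
    print(name)
    print(text)
```

Limiting each run to a single day means the output holds only new rows, so appending it to the existing data should not duplicate anything already loaded.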
Thanks!
