Transforms input differences for build vs preview

Hi there,

I have been trying to migrate code from JupyterLab to Code Repositories, creating transform functions to run the pipeline. The transform I have created has two additional decorators: lightweight (to set the compute/memory of the workspace) and incremental (so I can append rows to a pre-existing dataframe).

However, I am running into an issue when switching between ‘build’ and ‘preview’ modes. To read data in and get the pipeline to compile successfully for ‘preview’, I need to read my input data in and then apply this pandas function:

input_df = input_df.pandas()

But this code fails when I try to build the pipeline. I need to use this logic instead:

input_df = input_df.dataframe('current')

Can anyone explain why reading inputs differs between the two modes, and preferably suggest some logic for reading in my input dataframes that works for both preview and build?

Best wishes, Mia

Heya! This is curious, because behind the scenes .dataframe() actually calls .pandas(). Can you share more about the error you are getting?

Of course, the full error I get is the following:

Traceback (most recent call last):
  File "/myproject/datasets/main.py", line 41, in run_model
    input_df= input_df.dataframe('current')
TypeError: PreviewTransformInput.dataframe() takes 1 positional argument but 2 were given

Any suggestions are greatly appreciated! :smiley:

Hey, that seems to be an error during preview and not build?

I am afraid that previewing transforms that are both lightweight and incremental is not currently supported. Previewing incremental transforms is a new feature, currently available only when using preview in VS Code (rather than in Code Repositories), and it is not yet supported for transforms with the lightweight decorator, though that is on the roadmap.
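
Until that lands, one possible stopgap is a small helper that tries the build-style call first and falls back to the preview-style call when the extra argument is rejected. This is only a sketch based on the two calls and the TypeError quoted in this thread. The helper name read_current and the two Fake classes are made up for illustration (the Fake classes just stand in for the real input objects so the snippet runs outside Foundry), and it is an assumption that the real inputs behave the same way.

```python
def read_current(transform_input):
    """Read an input with .dataframe('current'), else fall back to .pandas()."""
    try:
        # Build: the incremental input accepts a read mode such as 'current'.
        return transform_input.dataframe("current")
    except TypeError:
        # Preview: per the traceback in this thread, the preview input's
        # dataframe() rejects the extra argument, so use the call that
        # worked in preview instead.
        return transform_input.pandas()


# Hypothetical stand-ins for the two input types, NOT real library classes.
class FakeBuildInput:
    def dataframe(self, mode):
        return f"build:{mode}"


class FakePreviewInput:
    def dataframe(self):
        return "preview:dataframe"

    def pandas(self):
        return "preview:pandas"


print(read_current(FakeBuildInput()))
print(read_current(FakePreviewInput()))
```

Inside the transform you would then call input_df = read_current(input_df) in place of either line from the original post. Note that catching TypeError this broadly would also hide a genuine TypeError raised inside dataframe(), and the preview result may not reflect incremental behaviour, so treat it as a convenience for getting preview to compile rather than a fix.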