Is it:
- The largest value present in the specified column anywhere in the table?
- The largest value present in that column which was ingested during the previous run?
- Something else?
Context is that I’ve got a query that pulls in rows from the source system, where `last_update_timestamp > ?`. However, the table is large and the `update_timestamp` column has no index, so this query is very slow.
I’m hoping I can batch my initial ingest so that each run pulls in only a single month of data. I’d run it a few dozen times to load the full historical data, and once it has caught up to today I could switch it back to normal incremental behaviour.
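
To make the idea concrete, here is a rough sketch of the batching I have in mind, written as plain Python rather than against any particular ingestion tool. The table name `source_table`, the helper names, and the added upper bound on the query are all my own assumptions for illustration; I don't know yet whether the tool lets me supply a second parameter like this.

```python
from datetime import date


def month_windows(start, end):
    """Yield (lower, upper) date pairs covering start..end in calendar-month steps."""
    lower = start
    while lower < end:
        # First day of the month after `lower`.
        if lower.month == 12:
            next_month = date(lower.year + 1, 1, 1)
        else:
            next_month = date(lower.year, lower.month + 1, 1)
        # Never run past the overall end date.
        upper = min(next_month, end)
        yield lower, upper
        lower = upper


# Hypothetical query: the original filter plus an assumed upper bound,
# so each run only sees one window of data.
QUERY = (
    "SELECT * FROM source_table "
    "WHERE last_update_timestamp > ? AND last_update_timestamp <= ?"
)


def batch_params(start, end):
    """Return one (lower, upper) parameter pair per run, as ISO date strings."""
    return [(lo.isoformat(), hi.isoformat()) for lo, hi in month_windows(start, end)]
```

Each pair from `batch_params` would be bound to the two `?` placeholders for one run. Whether this works depends on the answer to my question above: if the stored value is the largest timestamp ingested in the previous run, each run's upper bound would naturally become the next run's lower bound.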