When ingesting incrementally from a source-system, I need to specify a timestamp column to avoid old already ingested data to be ingested again. In short I need to use a timestamp column as the incremental column of my incremental sync.
My Query looks like
SELECT * FROM "table" WHERE "some_timestamp_col" > ?
Which columns types are supported and will yield to a working incremental ingest ? (no duplicated rows, no error)
- A timestamp column ingested as a String as a date-time epoch-time, like
2021-10-12 00:00:00
- A timestamp column ingested as a String as an epoch timestamp representation, like
1634083200
- A timestamp column ingested as a Long
- A timestamp column ingested as a Timestamp
- A timestamp column ingested as a Date
- Other ?
When I say “ingested as”, that’s the kind of statement I have in mind:
SELECT *, EXTRACT(EPOCH FROM "some_timestamp_col") FROM "table" WHERE "some_timestamp_col" > ?;
in Postgres Syntax
To what I see in the docs, there might be cases where microseconds are dropped during the ingestion, hence generating duplicated data. In which case this can be a problem ?