Hello,
We are having trouble indexing large object types that are backed by frequently updated incremental datasets.
The incremental datasets contain duplicate primary keys, because updated rows can be ingested again from our data sources. This makes it impossible to back our object types with the incremental datasets.
Currently, our only working solution is to deduplicate the incremental datasets, which ultimately forces us to back our object types with snapshot datasets, so indexing takes multiple hours.
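To make the deduplication step concrete, here is a minimal sketch in plain Python. It is not our actual pipeline code. It assumes a hypothetical primary key column `id` and a hypothetical ordering column `updated_at`, and keeps only the most recent row per key:

```python
def latest_per_key(rows, key="id", order="updated_at"):
    """Keep one row per primary key: the one with the highest `order` value.

    `key` and `order` are hypothetical column names used for illustration.
    """
    latest = {}
    for row in rows:
        k = row[key]
        # Replace the stored row when this one is at least as recent.
        if k not in latest or row[order] >= latest[k][order]:
            latest[k] = row
    return list(latest.values())


rows = [
    {"id": 1, "updated_at": 1, "value": "a"},
    {"id": 2, "updated_at": 1, "value": "b"},
    {"id": 1, "updated_at": 2, "value": "a2"},  # updated version of id 1
]
deduped = latest_per_key(rows)
```

Because a newly ingested row can supersede a row written in an earlier run, the deduplicated output has to be rewritten rather than appended to, which is why it ends up as a snapshot.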
We tried converting the pipeline to streaming so that we could index it as a streaming object type, which allows duplicate PKs in the backing datasource, but in this case the first index job takes multiple weeks, which is not acceptable.
Has anyone found a solution to this type of issue?
Thanks!