Update: This turned out to be caused by a shared property that was configured to require values, while some of the underlying data was null.
Just recently, our team began getting changelog job errors similar to the one below, due to null values in the property. I’ve dug into the errors and the objects themselves, and the number of null values matches the number of errors.
Two questions I have about what could be causing it:
- We’re not handling null values as we should (although we haven’t changed how we create datasets in any meaningful way for the ontology).
- We recently replaced our object backing datasets with restricted views. Could that be the issue?
Is there a way to handle null values gracefully? Note the “unit” key in this instance is just a property and isn’t a primary key or title field.
Found [7608] invalid values in columns [unit] of type [String]. Please check your datasource for the following:
- Nulls inside arrays or nested arrays
- NaNs or infinite values for floats/doubles
- Misconfigured media reference data sources: confirm the data sources backing your media reference properties are correct and exhaustive
- Malformed Geohashes: geohashes should be non-empty strings that are either
  (a) in the format `latitude, longitude` (in that same order)
  (b) a valid geohash encoding of a coordinate
- Malformed Geoshapes: geoshapes can be invalid for a number of reasons, including
  - Non-conformance with the GeoJSON spec
  - Polygon self-intersection
  - Polygon consecutive coordinate duplication
  - Polygon contains unclosed rings (first point and last point must match)
  Consider running geoshape strings through the "normalize geometry" expression in Pipeline Builder to fix or drop invalid shapes prior to indexing.
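
For anyone hitting the same error, here is a minimal sketch of the check I would run upstream of the object type: count the nulls in the property column (to compare against the number in the error message) and, if the property really must require values, fill them with a placeholder before indexing. This is plain Python over a list of dicts so it runs anywhere; the sample rows, the helper names `count_nulls` and `fill_nulls`, and the `"UNKNOWN"` placeholder are all made up for illustration and are not part of any Foundry API. In a real transform the same idea would be expressed against your dataset, for example with PySpark's `isNull()` and `fillna()`.

```python
from typing import Any, Dict, List


def count_nulls(rows: List[Dict[str, Any]], column: str) -> int:
    """Count rows where the column is missing or explicitly None."""
    return sum(1 for row in rows if row.get(column) is None)


def fill_nulls(
    rows: List[Dict[str, Any]], column: str, placeholder: Any
) -> List[Dict[str, Any]]:
    """Return copies of the rows with None in the column replaced by a placeholder."""
    return [
        {**row, column: placeholder} if row.get(column) is None else dict(row)
        for row in rows
    ]


# Hypothetical sample data standing in for the backing dataset.
rows = [
    {"id": 1, "unit": "kg"},
    {"id": 2, "unit": None},
    {"id": 3},  # column missing entirely, treated the same as null
]

null_count = count_nulls(rows, "unit")
cleaned = fill_nulls(rows, "unit", "UNKNOWN")

print(null_count)
print(count_nulls(cleaned, "unit"))
```

If the count matches the number reported in the error, the other option is to leave the data alone and relax the "require values" setting on the shared property, which is what the update above points to as the root cause.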