How to get LAST UPDATED value for files from the source

In a data connection with an S3 data source, when creating a sync, I’m trying to concatenate the "LAST UPDATED" value of each file with the file name.
I used the “append timestamp to filenames” transformer, but it appends the time of the build.

Hi,

When you ingest the data, does the file modified metric show the original modification time? You can get to this by going to the dataset > details > files.

If so, you could use a board in Pipeline Builder to extract the metadata of these files and then concatenate it onto the filename.

Thanks,
Ben

Hi Ben,
Thank you for your reply.
The file modified metric shows the date of the ingestion into Foundry (build time) and not the last updated date from the source, which is what I’m looking for.

You could potentially use external transforms to query this metadata via the S3 API, either fetching just the metadata separately or ingesting the data directly there as well: https://www.palantir.com/docs/foundry/data-integration/external-transforms/
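
As a rough illustration of the S3 API route, here is a minimal Python sketch that reads each object's LastModified value from an S3 listing and builds a file name carrying that timestamp. It assumes boto3 is available and that the code runs somewhere with network access and credentials for the bucket (for example inside an external transform, whose setup is not shown). The function names, the bucket and prefix arguments, and the timestamp format are all hypothetical choices for this example, not anything Data Connection itself uses.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def name_with_last_modified(key: str, last_modified: datetime) -> str:
    """Return the file name of `key` with the source timestamp before the extension.

    Hypothetical helper: the "_YYYYMMDDTHHMMSSZ" format is just one possible choice.
    """
    path = PurePosixPath(key)
    # S3 reports LastModified as a timezone-aware datetime; normalise it to UTC.
    stamp = last_modified.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{path.stem}_{stamp}{path.suffix}"


def renamed_keys(bucket: str, prefix: str = "") -> dict[str, str]:
    """Map each S3 key under `prefix` to a file name that includes its LastModified time."""
    import boto3  # imported here so the pure helper above works without boto3 installed

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    result: dict[str, str] = {}
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no objects.
        for obj in page.get("Contents", []):
            result[obj["Key"]] = name_with_last_modified(obj["Key"], obj["LastModified"])
    return result
```

For example, an object with key reports/2024/sales.csv last modified at 2024-03-05 14:30:00 UTC would be mapped to sales_20240305T143000Z.csv. Only the last suffix counts as the extension here, so a name like archive.tar.gz would get the timestamp placed before ".gz".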

The metadata (file size and file last modified time) recorded by Data Connection for each ingested file is stored on transaction metadata (and visible from the “Custom metadata” section on the transaction details page that you can access via the dataset’s History tab). Unfortunately, we don’t currently expose public API endpoints for retrieving this metadata, so you would need to use legacy API endpoints in order to gain programmatic access to it. The following endpoints should be sufficient:

To get the list of transactions: https://your-subdomain.palantirfoundry.com/workspace/documentation/developer/api/catalog/services/CatalogService/endpoints/getReverseTransactionsInView

To get the metadata for a given transaction: https://your-subdomain.palantirfoundry.com/workspace/documentation/developer/api/metadata/services/MetadataService/endpoints/getTransactionMetadata

At least at the present time, the appropriate value to specify for the namespace argument of this second endpoint is the string __default_metadata_namespace__.

In addition to the usual caveats about the risks of using legacy API endpoints (which can change without warning at any time), using the above APIs in this case would introduce an additional dependency on undocumented implementation details of Data Connection, so while it is technically possible to do what you want, you should do so with an understanding that it is not officially supported and brings maintainability risks.