Hello,
I would like to have a last update date column in my dataset.
I don’t know if this is possible. Thanks
Hello Elodie,
I’m not sure I fully understand your question, but if you want to capture the date of the dataset build, you can add a constant column set to the current date. In PySpark, the snippet would be something like:
from pyspark.sql import functions as F
df = df.withColumn("current_date", F.current_date())
Or if you need the full timestamp:
df = df.withColumn("current_timestamp", F.current_timestamp())
I don’t know why you need that column, but keep in mind that it will increase the storage footprint of your dataset, because the value is added to every row.
Hope this helps,
Cheers
Nicolas