Currently, I have a struct as a property type in my dataset output. When I try to create an object type from it, the Ontology does not allow struct properties. I thought of decomposing the struct into multiple string properties. What is the best strategy, or what alternatives are there, in a situation like this?
Hi @ShaiBrin!
Depending on how urgently you need this feature, it may be worth waiting: Struct property types are coming to the Ontology very soon! In our documentation here you can see that we expect Struct property types to become available in December.
Definitely second that: just waiting for structs to be supported is a valid strategy!
However, the fields on structs are accessible in transform code. I have a method that decodes an integer address into seven different sub-addresses for plate and well. Here’s the method I use to pull out the fields of the struct into first-class fields:
from typing import List, Union

from pyspark.sql import Column
from pyspark.sql import functions as F


def create_well_id_columns(locus_col: Union[Column, str]) -> List[Column]:
    # Accept either a Column or a column name.
    if isinstance(locus_col, Column):
        _c = locus_col
    else:
        _c = F.col(locus_col)
    # decode_plate_locus_column (defined elsewhere, not shown here) returns a
    # struct column; each struct member becomes its own aliased column.
    return [
        decode_plate_locus_column(_c).getItem("barcode").alias("barcode"),
        decode_plate_locus_column(_c).getItem("labware_id").alias("labware_id"),
        decode_plate_locus_column(_c).getItem("well_address").alias("well_address"),
        decode_plate_locus_column(_c).getItem("well_id").alias("well_id"),
        decode_plate_locus_column(_c).getItem("well_column").alias("well_column"),
        decode_plate_locus_column(_c).getItem("well_row").alias("well_row"),
        decode_plate_locus_column(_c)
        .getItem("well_row_letter")
        .alias("well_row_letter"),
    ]
You can extract the struct members into full fields and then drop the struct column.
You can use getItem() when programmatically accessing fields. Or, if you’re using direct variable references, you can just use dot notation: table.field.struct_field.
Happy to answer other questions.
It seems that the Struct type implementation will be rather rudimentary. According to the docs, we can expect no nested structs and a maximum of 10 properties. Will the struct properties allow multiple values and edits?
I am asking since we are thinking of using the struct field to document calculation parameters when building the object type containing calculation results… kind of documenting the data that certain scenarios were based on. This could reduce the complexity that you would need in a relational data model to achieve the same thing…