smpena
Hello,
Is something like this possible in Foundry, so that I don't have to loop through a JSON object?
'''
import requests
from pyspark.sql import functions as F

# url, sc and spark are assumed to be defined elsewhere
req = requests.get(url, headers={'Accept': 'application/json', 'Authorization': 'Basic aaa'})
columns = []
gettots = req.json()
req2 = req.content.decode('utf-8')
rdd = sc.parallelize([req2])
jsonDF = spark.read.json(rdd)
df_expl = jsonDF.withColumn("explodedarray1", F.explode(jsonDF.issues))
columns.append(df_expl)
'''
Yes, this is possible in Transforms Python. Here is a minimal example that uses SparkContext.parallelize and SparkSession.read.json:
from transforms.api import transform_df, Output


@transform_df(
    Output("<output_path_or_rid>"),
)
def compute(ctx):
    spark_session = ctx.spark_session
    # Each string is one JSON record
    json_strings = [
        '{"name": "A", "age": 5}',
        '{"name": "B", "age": 7}'
    ]
    string_rdd = spark_session.sparkContext.parallelize(json_strings)
    return spark_session.read.json(string_rdd)
Regarding the special ctx parameter, see https://www.palantir.com/docs/foundry/transforms-python/transforms-python-api/#parameters-3. For details on how to make an external API call from a transform, see https://www.palantir.com/docs/foundry/data-integration/external-transforms/ or https://www.palantir.com/docs/foundry/data-integration/external-transforms-source-based/.
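Applied to the snippet in the question, the same pattern might look roughly like the sketch below. It assumes the response body is a single JSON document with a top-level issues array, as the question's code suggests. The helper names to_single_line_json and explode_issues are hypothetical and not part of any Foundry or Spark API. Re-serializing the body to one compact line is a precaution so that Spark receives exactly one record per string.

```python
import json


def to_single_line_json(body: bytes) -> str:
    """Decode a response body and re-serialize it as one compact JSON line."""
    return json.dumps(json.loads(body.decode("utf-8")))


def explode_issues(spark_session, json_line: str):
    """Parse one JSON document with Spark and emit one row per element of `issues`.

    Assumes the document has a top-level `issues` array (taken from the question).
    """
    # Imported lazily so the helper above can be used without PySpark installed.
    from pyspark.sql import functions as F

    string_rdd = spark_session.sparkContext.parallelize([json_line])
    json_df = spark_session.read.json(string_rdd)
    return json_df.withColumn("explodedarray1", F.explode(F.col("issues")))
```

Inside compute(ctx), this could be called as explode_issues(ctx.spark_session, to_single_line_json(req.content)), where req is the response obtained through one of the external transform mechanisms linked above.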