Hello,
I am trying to run multi-GPU training in Foundry and would like some input from the community. Here is what I have tried so far:
- Spark with @configure(profile=["DYNAMIC_ALLOCATION_ENABLED", "DRIVER_GPU_ENABLED"]) → One GPU gets picked up from RQ and is "activated" during the build, but from what I understand the driver shouldn't be the one doing this work in a Spark setup.
- Spark with @configure(profile=["DYNAMIC_ALLOCATION_ENABLED", "EXECUTOR_GPU_ENABLED"]) → I believe this is working as expected.
I understand that Palantir recommends Lightweight for this use case, but I cannot find any argument (something like gpu_count) to set the number of GPUs to 2.
With Lightweight and @lightweight(gpu_type="NVIDIA_A10G"), only one GPU gets picked up from RQ.
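In case it helps anyone reproduce this, here is a minimal sketch of how the number of GPUs visible to the process could be checked from inside the build. It uses only the Python standard library and assumes `nvidia-smi` is on the PATH in the container, which may not be true in every environment. The helper names `parse_gpu_list` and `visible_gpus` are made up for illustration and are not part of any Foundry API.

```python
import shutil
import subprocess


def parse_gpu_list(listing: str) -> list:
    """Return the lines of `nvidia-smi -L` output that describe a GPU."""
    return [
        line.strip()
        for line in listing.splitlines()
        if line.strip().startswith("GPU ")
    ]


def visible_gpus() -> list:
    """List the GPUs visible to this process.

    Returns an empty list if nvidia-smi is missing or fails
    (assumption: nvidia-smi is installed wherever GPUs are attached).
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "-L"],  # -L prints one line per GPU
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for gpu in gpus:
        print(gpu)
```

Calling `visible_gpus()` at the start of the transform and logging the result should show whether one or two devices are actually attached, independently of what the training framework reports.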
Also, I am not sure how to enable multi-GPU capacity in a Code Workspace / Jupyter Notebook. Is that possible?
Best regards,

