I’m encountering a resource limitation issue when attempting to run a pipeline on a dataset I’ve ingested into Foundry. The dataset is approximately 4GB in size (~30 million rows). I was previously able to process it successfully, but now I’m hitting resource constraints.
To troubleshoot, I tried reducing the dataset size in the first transform by selecting only more recent records (~10 million rows), but I still encounter the same error. The job error details are below:
Job Error:
{
  "errorCode": "INTERNAL",
  "errorName": "ModuleExitReason:RequestedResourcesExceedResourceQueueCapacity",
  "errorInstanceId": null,
  "safeArgs": {
    "sparkModuleId": "bac9183d-87ee-41d7-857e-0883dfb8ea49",
    "exitReason": "REQUESTED_RESOURCES_EXCEED_RESOURCE_QUEUE_CAPACITY",
    "exitMessage": "Module is requesting more compute resources than can ever be provided by the limits of the Resource Queue. Consider reducing the resource requirements or increasing the Resource Queue Limits."
  },
  "unsafeArgs": {}
}
Error Message:
Module died. Exit reason: REQUESTED_RESOURCES_EXCEED_RESOURCE_QUEUE_CAPACITY.
Message not helpful?
The driver running the job crashed, ran out of memory, was terminated, or otherwise became unresponsive while it was running. Try rebuilding, and if the problem persists, see logs for more information to confirm if either the driver or executor, or both, ran out of memory and try increasing driver and executor memory accordingly. The exit reason for the crash was REQUESTED_RESOURCES_EXCEED_RESOURCE_QUEUE_CAPACITY.
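For anyone triaging this, here is a minimal sketch of how the relevant fields could be pulled out of the job error payload above using only the Python standard library. It assumes the payload has been copied into a string; the function name `summarize_job_error` and the output keys are hypothetical and are not part of any Foundry API.

```python
import json

# The job error payload as reported above, pasted in as a string.
JOB_ERROR = """
{
  "errorCode": "INTERNAL",
  "errorName": "ModuleExitReason:RequestedResourcesExceedResourceQueueCapacity",
  "errorInstanceId": null,
  "safeArgs": {
    "sparkModuleId": "bac9183d-87ee-41d7-857e-0883dfb8ea49",
    "exitReason": "REQUESTED_RESOURCES_EXCEED_RESOURCE_QUEUE_CAPACITY",
    "exitMessage": "Module is requesting more compute resources than can ever be provided by the limits of the Resource Queue. Consider reducing the resource requirements or increasing the Resource Queue Limits."
  },
  "unsafeArgs": {}
}
"""


def summarize_job_error(raw: str) -> dict:
    """Return the fields most useful for triage (hypothetical helper)."""
    payload = json.loads(raw)
    # "safeArgs" carries the exit reason and message in this payload.
    safe_args = payload.get("safeArgs", {})
    return {
        "error_name": payload.get("errorName"),
        "exit_reason": safe_args.get("exitReason"),
        "exit_message": safe_args.get("exitMessage"),
        "spark_module_id": safe_args.get("sparkModuleId"),
    }


if __name__ == "__main__":
    for key, value in summarize_job_error(JOB_ERROR).items():
        print(f"{key}: {value}")
```

The exit reason and exit message are the parts that point at the Resource Queue limits rather than at an out-of-memory crash, which is why the sketch surfaces them separately from the generic error text.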
From the message, it appears that the pipeline is requesting more resources than the Resource Queue can provide. I’d appreciate any advice on how to resolve this issue. Specifically:
1. Are there ways to further optimize the pipeline or reduce resource requirements?
2. Is it possible to adjust the Resource Queue limits, and if so, how can this be done?
3. Could recent changes in cluster configuration or limits be affecting this, and is there a way to check?
4. Are there specific logs or configurations I should inspect to better understand the issue?
Thanks in advance for your help!
Tommy