I’m trying to build an external transform in Code Repositories, but the build fails. I followed the external transforms documentation to set it up and used the Pipelines template in Code Repositories to create the repository where my transform resides. The checks and build initialization succeed, but my function fails with this error:
Hi, yes, here it is below. Thanks in advance, @mai125!
from transforms.api import transform, Output
from transforms.external.systems import use_external_systems, EgressPolicy
from functions.sources import get_source
import json


@use_external_systems(
    egress=EgressPolicy('ri.magritte..source.c56607c4-e6ba-47e4-a558-d5fe7e6d2b93')
)
@transform(output=Output("ri.foundry.main.dataset.6f74a845-94a7-486a-b244-b5dae11d03ba"))
def create_dataset(egress, output):
    # Define the fixed base URL for the CVE API
    base_url = "https://services.nvd.nist.gov/rest/json/cves/2.0"
    # Access the client from the provided data source
    source = get_source("NvdCve")
    client = source.get_https_connection().get_client()
    # Initialize variables for pagination and results
    page_size = 2000  # Maximum results per page
    start_index = 0
    all_cves = []
    has_more_results = True
    while has_more_results:
        # Build query parameters for the API call
        params = {
            "startIndex": start_index,
            "resultsPerPage": page_size  # Adjust if needed by the API spec
        }
        # Make the API request
        response = client.get(base_url, params=params)
        response.raise_for_status()  # Raise exception for HTTP errors
        data = json.loads(response.text)
        # Extract vulnerabilities
        vulnerabilities = data.get("vulnerabilities", [])
        if not vulnerabilities:
            break  # Stop if no vulnerabilities are returned
        # Append all vulnerabilities to the list
        all_cves.extend(vulnerabilities)
        # Update pagination variables
        start_index += page_size
        has_more_results = start_index < data.get("totalResults", 0)
    # with output.filesystem().open('cve_data.json', 'w') as f:
    #     json.dump(all_cves, f, indent=2)
    return output.write_text(json.dumps(all_cves))
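The pagination loop in the code above can be exercised on its own, without any Foundry configuration, which helps separate code problems from setup problems. The following is a minimal sketch under stated assumptions: `fetch_all`, `fetch_page`, and `fake_fetch_page` are hypothetical names introduced here, and the fake page function stands in for the real HTTP call so nothing touches the network.

```python
from typing import Any, Callable, Dict, List


def fetch_all(fetch_page: Callable[[int, int], Dict[str, Any]],
              page_size: int = 2000) -> List[Dict[str, Any]]:
    """Collect every record from a startIndex/resultsPerPage style API.

    fetch_page(start_index, page_size) is a hypothetical callable that
    returns the decoded JSON body of one page as a dict.
    """
    results: List[Dict[str, Any]] = []
    start_index = 0
    while True:
        data = fetch_page(start_index, page_size)
        page = data.get("vulnerabilities", [])
        if not page:
            break  # An empty page means there is nothing left to read
        results.extend(page)
        start_index += page_size
        if start_index >= data.get("totalResults", 0):
            break  # Every record reported by the API has been requested
    return results


def fake_fetch_page(start_index: int, page_size: int) -> Dict[str, Any]:
    """Stand-in for the HTTP call: serves five fake records in pages."""
    records = [{"id": i} for i in range(5)]
    return {
        "totalResults": len(records),
        "vulnerabilities": records[start_index:start_index + page_size],
    }


all_cves = fetch_all(fake_fetch_page, page_size=2)
```

If this loop behaves as expected locally while the build still fails with the same error, the problem is more likely in the decorators or resource configuration than in the function body.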
Hi @mai125, I’ll be sure to try those things! I also wanted to note that when I change the body of the function to just a return statement, create_dataset still fails to build with the same error message. Do you still think the error is related to the body of the function?
Because you are using the @transform() decorator, you don’t need to return anything; just make sure you write the dataframe to the output: output.write_dataframe(all_cves)
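One caveat: in the posted code, all_cves is a plain Python list of dicts, while write_dataframe expects a DataFrame, so a conversion step is likely needed first. Here is a minimal sketch of one way to prepare rows, assuming each record is stored as a single JSON string column. The helper `to_rows`, the column name `cve_json`, and the sample records are placeholders, and the Spark calls in the trailing comments are an untested assumption about how the rows might be used inside the transform.

```python
import json
from typing import Any, Dict, List, Tuple


def to_rows(records: List[Dict[str, Any]]) -> List[Tuple[str]]:
    """Turn each record into a one-column row holding its JSON text."""
    return [(json.dumps(record, sort_keys=True),) for record in records]


# Placeholder records, not real CVE data
sample = [{"cve": {"id": "CVE-0000-0001"}}, {"cve": {"id": "CVE-0000-0002"}}]
rows = to_rows(sample)

# Inside the transform, the rows could then become a DataFrame, for example:
#   df = ctx.spark_session.createDataFrame(rows, ["cve_json"])  # assumption, not run here
#   output.write_dataframe(df)
```

Keeping each record as raw JSON text avoids having to define a schema for the nested structure up front; parsing into typed columns can happen in a downstream transform.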
Hi @mai125, thanks for the advice! Update: I’ve followed your suggestions but am still getting the same error. Even if the body of the function contains a single line of code like ‘test_var = 1’, I still get the same error, which makes me think the issue may be related to the configuration of resources rather than the code itself. What do you think?
Hi @mai125, update: it works now! The issue was with the policy RID that I was passing as the egress argument in the @use_external_systems decorator; fixing that, combined with your suggestions, made my build run successfully. Thank you so much!
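The thread does not say exactly what was wrong with the RID, but one thing worth checking in a similar situation is the RID's type component: the value passed to EgressPolicy in the posted code has the type `source`. The sketch below splits a RID into its `ri.<service>.<instance>.<type>.<locator>` parts. The helper names are hypothetical, and the expected type string `network-egress-policy` is an assumption rather than something confirmed in this thread.

```python
from typing import NamedTuple


class Rid(NamedTuple):
    service: str
    instance: str
    type: str
    locator: str


def parse_rid(rid: str) -> Rid:
    """Split a resource identifier into its four components."""
    parts = rid.split(".", 4)
    if len(parts) != 5 or parts[0] != "ri":
        raise ValueError(f"Not a resource identifier: {rid!r}")
    return Rid(*parts[1:])


def looks_like_egress_policy(rid: str) -> bool:
    # Assumption: egress policy RIDs carry the type "network-egress-policy"
    return parse_rid(rid).type == "network-egress-policy"


# The RID used in the posted code
posted = "ri.magritte..source.c56607c4-e6ba-47e4-a558-d5fe7e6d2b93"
posted_type = parse_rid(posted).type
```

For the RID in the original post, `posted_type` is `source`, so `looks_like_egress_policy(posted)` is False under the assumption above.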