I’m trying to build some datasets in Code Repository.
However, the error below occurred and the checks failed:
Internal Error: ImportError: cannot import name 'model_training' from 'main' (/scratch/standalone/RID/code-assist/contents/transforms-model-training/src/main/__init__.py)
The file where this happens is /transforms-model-training/src/main/pipeline.py
I ran the same code a few weeks ago without any error.
I suspect this is caused by a template update, since there is a pull request for upgrading templates, but I couldn’t merge that upgrade either because of the same error.
The check failure you’re seeing is most likely caused by the following.
By default, a Python code repository has the following structure:
transforms-python
  - conda_recipe
  - src
    - myproject
      - datasets
        ... code for data transforms / modeling goes here
      __init__.py
      pipeline.py
Within this folder structure, the pipeline.py file is designed to automatically discover modules recursively within the datasets folder and register module-level Transforms.
The content of pipeline.py by default looks like this:
# Contents of pipeline.py
from transforms.api import Pipeline
from myproject import datasets
my_pipeline = Pipeline()
my_pipeline.discover_transforms(datasets) # Note: This searches within the datasets folder and registers Transforms
Therefore, if you rename the default datasets folder to x, you also need to update pipeline.py to import x (from myproject import x) and pass it to the discovery call: my_pipeline.discover_transforms(x)
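To make the recursive discovery described above more concrete, here is a minimal sketch using only the Python standard library. It is not the implementation used by transforms.api; the function name discover_module_names is hypothetical, and the sketch only lists module names rather than registering Transforms. It shows why the package passed to the discovery call has to be importable under exactly the name used in pipeline.py.

```python
import pkgutil
from types import ModuleType
from typing import List


def discover_module_names(package: ModuleType) -> List[str]:
    """Return the dotted names of all modules found recursively under `package`.

    Illustration only (hypothetical helper): the real discover_transforms
    call also inspects each module for Transform objects.
    """
    names = []
    # walk_packages yields every submodule and descends into subpackages.
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        names.append(info.name)
    return sorted(names)


if __name__ == "__main__":
    # Any importable package works here; json is used only as a stand-in.
    import json

    print(discover_module_names(json))
```

If the package itself cannot be imported, for example because the folder was renamed, the import fails before any discovery happens, which matches the ImportError reported by the check.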
Regarding the check error you mentioned, it likely started after the folder was renamed from model_training to modeling. pipeline.py still imports model_training, which no longer exists under main, hence the ImportError. Updating the import and the discover_transforms call to use the current folder name, as in the following, should resolve the error and allow the check to pass:
from transforms.api import Pipeline
from main import modeling
pipeline = Pipeline()
pipeline.discover_transforms(modeling)
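To confirm which names can actually be imported from main before editing pipeline.py, a small check like the one below may help. It is a sketch that relies only on the standard library; the helper names list_importable_names and check_import_target are hypothetical, and the src/main path in the usage example is an assumption based on the paths in the error message.

```python
import pkgutil
from typing import List


def list_importable_names(package_dir: str) -> List[str]:
    """Return the modules and subpackages found directly inside package_dir.

    Folders without an __init__.py are not reported, since they are not
    regular packages.
    """
    return sorted(info.name for info in pkgutil.iter_modules([package_dir]))


def check_import_target(package_dir: str, name: str) -> bool:
    """Return True if `name` is a module or subpackage inside package_dir."""
    return name in list_importable_names(package_dir)


if __name__ == "__main__":
    # Assumed location of the package; adjust to the repository layout.
    package_dir = "src/main"
    print(list_importable_names(package_dir))
    print(check_import_target(package_dir, "model_training"))
```

If the second line prints False, the name imported in pipeline.py does not match any folder or module under main, and the import should use one of the names from the printed list.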