When I build models using scikit-learn, I can import and use foundry-sklearn-adapter.
However, I am now trying to build models using LightGBM.
Are there any libraries specifically for LightGBM?
Or should I use custom adapters from the file adapter.py in the Model Training Template?
For a model built with LightGBM, I would recommend writing your own custom adapter. I would assume that LightGBM models can be serialized and deserialized with dill (similar to pickle, but covers more cases), so you can use the DillSerializer auto serializer to handle saving and loading your model.
import palantir_models as pm
from palantir_models_serializers import DillSerializer


class LightGbmAdapter(pm.ModelAdapter):
    @pm.auto_serialize(
        model=DillSerializer(),
    )
    def __init__(self, model):  # you would pass your lightgbm model to the adapter via the init
        self.model = model

    @classmethod
    def api(cls):
        # define your api here. Below is an example API showing inputs and outputs;
        # replace the column names and types with the ones your model uses
        inputs = {
            "df_in": pm.Pandas(columns=[("input_column", str)]),
            "param_in": pm.Parameter(type=str, default="default_value"),
        }
        outputs = {
            "df_out": pm.Pandas(columns=[("output_column", str)]),
        }
        return inputs, outputs

    def predict(self, df_in, param_in):
        # this is where you would call the predict method on your lightgbm model,
        # e.g. self.model.predict(...) on the feature columns of df_in, and
        # return the result in the shape declared under outputs
        raise NotImplementedError
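Since the adapter above relies on the assumption that the model survives serialization, it may be worth checking that before saving anything. The following is a minimal sketch of such a check, not part of palantir_models: `roundtrip_preserves` is a hypothetical helper, it defaults to the standard library's pickle so it runs anywhere, and a `statistics.NormalDist` object stands in for a trained model.

```python
import pickle
from statistics import NormalDist


def roundtrip_preserves(obj, probe, dumps=pickle.dumps, loads=pickle.loads):
    """Return True if probe(obj) is unchanged after a dumps/loads round trip.

    `probe` is any function that extracts a comparable value from the object,
    for example its predictions on a fixed set of rows.
    """
    restored = loads(dumps(obj))
    return probe(obj) == probe(restored)


# Stand-in for a trained model (an assumption for illustration only):
# any serializable object with a deterministic method works here.
stand_in = NormalDist(mu=0.0, sigma=1.0)

# Compare one computed value before and after the round trip.
unchanged = roundtrip_preserves(stand_in, lambda m: m.cdf(0.5))
print(unchanged)
```

With dill and lightgbm installed, the same helper could presumably be called with `dumps=dill.dumps`, `loads=dill.loads`, and a probe such as `lambda m: m.predict(X).tolist()`, where `X` is a small sample of your feature data. Converting to a list keeps the `==` comparison a plain boolean rather than an element-wise array.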
If you haven’t already tried it, I would recommend using Code Workspaces for model training, as it provides a much more interactive experience by allowing model training in Jupyter Notebooks. That way you can debug any issues with your adapter on the fly using any of the available testing methods. For example, once you construct your adapter, you can test it using:
# this will return a named tuple with the fields in the tuple matching your
# model adapter api's output fields
my_adapter.run_inference(...) # pass the inputs here
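To make the "named tuple" shape concrete, here is a small standard-library sketch. `InferenceResult` is a hypothetical stand-in built by hand for illustration; the real object comes from palantir_models, and the list used for `df_out` is a placeholder for a DataFrame.

```python
from collections import namedtuple

# Hypothetical stand-in for the value run_inference returns: one field per
# output declared in api(), here the single "df_out" output.
InferenceResult = namedtuple("InferenceResult", ["df_out"])

# Placeholder payload; in practice this would be the output DataFrame.
result = InferenceResult(df_out=[{"output_column": "example"}])

# Fields can be read by name...
by_name = result.df_out

# ...or unpacked positionally, in the order the outputs were declared.
(by_position,) = result

print(result._fields)
```

Reading fields by name tends to be safer than unpacking, since it keeps working if more outputs are added to the API later.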
I hope this helps! Let me know if you need any more assistance!
Thank you for your answer.
I haven’t tried it yet, so I’ll do so soon.
Also, may I ask one more question?
You recommended Code Workspaces, and Jupyter Notebooks are more familiar and easier for me to use for modeling, but it sometimes costs more than using a Code Repository.
Could you tell me how to reduce costs with Code Workspaces?