This is part 2 of my original blog and follows on from the previous topic.
This blog can be found here and is copied below.
Following on from my last blog on how I set up Palantir for Advent of Code work, I’ll now describe the tools I used to solve one of the puzzles (spoilers!).
During the data ingest I created two datasets:
raw_puzzle_input_text_files
- This is a dataset without a schema: a collection of the text files of people’s puzzle inputs and sample puzzle inputs.
puzzle_inputs
- This is a dataset with a schema where each row holds a line of input (str) and the file it came from (str).
puzzle_inputs was generated using this transform:
import polars as pl
from transforms.api import transform, lightweight, Input, LightweightInput, Output, LightweightOutput

@lightweight
@transform(
    my_input=Input("RID"),
    my_output=Output("RID"),
)
def text_files_to_table(my_input: LightweightInput, my_output: LightweightOutput):
    # Start from an empty frame with the target schema.
    df = pl.DataFrame(schema={"input": str, "file_name": str})
    for file_name in my_input.filesystem().ls():
        with my_input.filesystem().open(file_name.path, "rb") as f:
            lines = f.readlines()
        # One row per line of the file, tagged with the file it came from.
        _df = pl.DataFrame({"input": [line.strip().decode() for line in lines]}).with_columns(file_name=pl.lit(file_name.path))
        df = df.extend(_df)
    my_output.write_table(df)
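To make the shape of puzzle_inputs concrete, here is a minimal sketch of the same one-row-per-line idea using only the Python standard library, so it runs outside Foundry. The function name text_files_to_rows, the temporary directory and the file names and contents are all made up for illustration; this is not the transform itself.

```python
from pathlib import Path
import tempfile


def text_files_to_rows(directory: Path) -> list[dict[str, str]]:
    """Return one row per line of every .txt file, tagged with its file name."""
    rows = []
    for path in sorted(directory.glob("*.txt")):
        for line in path.read_text().splitlines():
            rows.append({"input": line.strip(), "file_name": path.name})
    return rows


# Demo on a throwaway directory with hypothetical file names and contents.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "alice_day_01_input.txt").write_text("3 4\n4 3\n")
    (tmp_dir / "bob_day_01_input.txt").write_text("1 2\n")
    demo_rows = text_files_to_rows(tmp_dir)

for row in demo_rows:
    print(row)
```

Each dictionary corresponds to one row of the table: the line of puzzle input and the file it came from.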
Jupyter workspaces
https://www.palantir.com/docs/foundry/code-workspaces/jupyterlab
This was the easiest way to solve the puzzle as I was using Python.
I created an import script to import the datasets into the Jupyter workspace on start-up:
import os
import yaml
data = {}
data["puzzle_inputs"] = {"rid": "ri.foundry.main.dataset.XXX"}
data["raw_puzzle_input_text_files"] = {"rid": "ri.foundry.main.dataset.XXX"}
os.chdir("/home/user/repo")
os.makedirs(".foundry", exist_ok=True)
with open(".foundry/aliases.yml", "w") as file:
    yaml.dump(data, file, default_flow_style=False)
I wanted to learn polars during Advent of Code on problems where the input is tabular. One example where polars is a good tool is the day one puzzle.
I read the Dataset in, filtered it for my input and wrote code to get the answer (note: there is a bug in the Dataset API which means it can’t do row and column filtering at the same time).
The input file is a list of pairs of numbers, like so: “10344 23043\n45643 37589”. The puzzle is essentially: sort each column, take the absolute difference across rows and sum. You can do this by reading in the data and using polars as:
from foundry.transforms import Column, Dataset
import polars as pl

Dataset.get("puzzle_inputs").where(
    Column.get("file_name") == "ray_bell_day_01_input.txt"
).read_table(format="polars")[["input"]].with_columns(
    pl.col("input").str.split_exact(" ", n=2)
).unnest(
    "input"
).cast(
    pl.Int64
).select(
    abs(pl.col("field_0").sort() - pl.col("field_1").sort())
)[
    "field_0"
].sum()
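To see the sort, difference and sum steps without polars or Foundry, here is a small plain-Python sketch. The function name day_1_part_1 and the six-line sample input are illustrative assumptions, not my real puzzle input.

```python
def day_1_part_1(text: str) -> int:
    """Sort both columns, then sum the absolute row-wise differences."""
    left, right = [], []
    for line in text.strip().splitlines():
        a, b = line.split()  # split on any run of whitespace
        left.append(int(a))
        right.append(int(b))
    return sum(abs(l - r) for l, r in zip(sorted(left), sorted(right)))


# Small illustrative input: two whitespace-separated columns of integers.
sample = "3 4\n4 3\n2 5\n1 3\n3 9\n3 3"
sample_answer = day_1_part_1(sample)
print(sample_answer)  # 11
```

Sorted, the columns are 1, 2, 3, 3, 3, 4 and 3, 3, 3, 4, 5, 9, so the row-wise differences are 2, 1, 0, 1, 2, 5, which sum to 11.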
Note: for the majority of puzzles I worked with the raw text files as:
file_name = "ray_bell_day_07_input.txt"
local_file = (
    Dataset("raw_puzzle_input_text_files")
    .files()
    .get(file_name)
    .download()
)
with open(local_file, "r") as f:
    lines = f.readlines()
...
Going back to the polars example for the day one puzzle, it can be generalized as a function for any person’s input as:
def day_1_part_1_solver(name: str = "ray_bell") -> int:
    return (
        Dataset.get("puzzle_inputs")
        .where(Column.get("file_name") == f"{name}_day_01_input.txt")
        .read_table(format="polars")[["input"]]
        .with_columns(pl.col("input").str.split_exact(" ", n=2))
        .unnest("input")
        .cast(pl.Int64)
        .select(abs(pl.col("field_0").sort() - pl.col("field_1").sort()))["field_0"]
        .sum()
    )
To share this with my teammates I used a Streamlit dashboard:
from foundry.transforms import Column, Dataset
import polars as pl
import streamlit as st
def day_1_part_1_solver(name: str = "ray_bell") -> int:
    return (
        Dataset.get("puzzle_inputs")
        .where(Column.get("file_name") == f"{name}_day_01_input.txt")
        .read_table(format="polars")[["input"]]
        .with_columns(pl.col("input").str.split_exact(" ", n=2))
        .unnest("input")
        .cast(pl.Int64)
        .select(abs(pl.col("field_0").sort() - pl.col("field_1").sort()))["field_0"]
        .sum()
    )

# Collect the names of everyone who uploaded a day 1 input file.
names = [
    file.path.split("_day")[0]
    for file in list(
        Dataset("raw_puzzle_input_text_files")
        .files()
        .filter(lambda f: f.path[-12:] == "01_input.txt")
    )
]
st.title("AOC result checker")
name = st.selectbox("name:", names)
st.write("Your answer to day 1 part 1 is:", day_1_part_1_solver(name=name))
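The names list in the dashboard relies on the file naming pattern {name}_day_XX_input.txt. As a rough sketch of that extraction on plain strings (no Foundry or Streamlit needed), the helper names_for_day and the example paths below are hypothetical:

```python
def names_for_day(paths: list[str], day: int) -> list[str]:
    """Return the name prefix of every path that is an input file for `day`."""
    suffix = f"_day_{day:02d}_input.txt"
    return [path.split("_day")[0] for path in paths if path.endswith(suffix)]


# Hypothetical file paths following the {name}_day_XX_input.txt pattern.
example_paths = [
    "ray_bell_day_01_input.txt",
    "ray_bell_day_07_input.txt",
    "sample_day_01_input.txt",
]
day_one_names = names_for_day(example_paths, day=1)
print(day_one_names)  # ['ray_bell', 'sample']
```

Matching on the full suffix rather than the last 12 characters is a small design choice here: it makes the day number a parameter, so the same helper could feed a selectbox for any day.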
Python functions
Palantir has an ontology. I may be grossly simplifying this, but I think of an ontology as exposing an API to a dataset. I added a unique key (uuid) to puzzle_inputs and created an ontology. I can now write a Python function to do the same as above:
from functions.api import function
from ontology_sdk import FoundryClient
from ontology_sdk.ontology.objects import (
PuzzleInputsWithUuid,
)
import polars as pl
@function
def day_1_part_1_solver(name: str = "ray_bell") -> int:
    client = FoundryClient()
    # Filter the ontology objects down to one person's day 1 input.
    filtered_data = client.ontology.objects.PuzzleInputsWithUuid.where(
        PuzzleInputsWithUuid.object_type.file_name == f"{name}_day_01_input.txt"
    )
    df = pl.DataFrame(filtered_data.to_dataframe()[["input"]])
    answer = (
        df.with_columns(pl.col("input").str.split_exact(" ", n=2))
        .unnest("input")
        .cast(pl.Int64)
        .select(abs(pl.col("field_0").sort() - pl.col("field_1").sort()))["field_0"]
        .sum()
    )
    return answer
Typescript functions
If you want to solve Advent of Code using TypeScript, this is probably the best tool to use. A TypeScript function using the data in the ontology looks like:
import { Function, Integer } from "@foundry/functions-api";
import { Objects } from "@foundry/ontology-api";

export class MyFunctions {
    @Function()
    myFunc(name: string = "ray_bell"): Integer {
        const puzzleInputs = Objects.search()
            .puzzleInputsWithUuid()
            .filter((puzzleInput) =>
                puzzleInput.fileName.exactMatch(`${name}_day_01_input.txt`)
            )
            .all();
        const combinedInput = puzzleInputs
            .map((puzzleInput) => puzzleInput.input)
            .join("\n");
        const leftList: number[] = [];
        const rightList: number[] = [];
        const lines = combinedInput.trim().split("\n");
        for (const line of lines) {
            const [left, right] = line.trim().split(/\s+/).map(Number);
            leftList.push(left);
            rightList.push(right);
        }
        // Sort numerically, then sum the absolute row-wise differences.
        leftList.sort((a, b) => a - b);
        rightList.sort((a, b) => a - b);
        let answer = 0;
        for (let i = 0; i < leftList.length; i++) {
            answer += Math.abs(leftList[i] - rightList[i]);
        }
        return answer;
    }
}
AIP Agent
AIP Agent is Palantir’s LLM/Agent tool.
Basic Chat bot
You can set up a basic chat bot really easily. Here is the system prompt I used (note: I asked ChatGPT “What is a good system prompt for using a LLM to help me solve advent of code puzzles?”):
You are an expert problem-solver and programming assistant specialized in competitive programming and algorithmic challenges, particularly Advent of Code puzzles. Your role is to:
1. Analyze and explain the problem statement clearly and concisely.
2. Suggest efficient algorithms or approaches for solving the puzzle.
3. Implement clean, well-documented, and efficient code in the requested programming language.
4. Help debug and optimize solutions, offering clear explanations for any changes.
5. Adapt your explanations and solutions based on my skill level, whether I'm a beginner or advanced programmer.
Avoid making assumptions about the problem's requirements beyond what is stated explicitly. Always seek clarification if the problem is ambiguous. Focus on correctness and readability in your solutions.
You can now copy and paste text from https://adventofcode.com/ to help you get your answer in any language. I created two chat bots, one using GPT-4o and the other using Sonnet 3.5, because two (AI) heads are better than one.
RAG chat bot
You can set up a chat bot which has access to documents (RAG). This can speed up summarization of Advent of Code puzzles.
First you have to create a media set and add it as context to the chat bot. You upload text files (puzzle text) and they get converted to PDFs. I adjusted the system prompt a little:
You are an expert problem-solver and programming assistant specialized in competitive programming and algorithmic challenges, particularly Advent of Code puzzles. Your role is to:
1. Answer user questions about any uploaded documents.
2. Use semantic search to return information on puzzle documentation.
3. Analyze and explain the problem statement clearly and concisely.
4. Suggest efficient algorithms or approaches for solving the puzzle.
5. Implement clean, well-documented, and efficient code in the requested programming language.
6. Help debug and optimize solutions, offering clear explanations for any changes.
7. Adapt your explanations and solutions based on my skill level, whether I'm a beginner or advanced programmer.
Avoid making assumptions about the problem's requirements beyond what is stated explicitly. Always seek clarification if the problem is ambiguous. Focus on correctness and readability in your solutions.
This can speed up queries as you just have to write: “Can you summarize the first advent of code puzzle?”