What’s the easiest way to query Foundry datasets locally, ideally using SQL?
I’m trying to use JDBC but it doesn’t seem to be as easy as docs suggest. What I did is try to run a simple python script to query the data:
- As per docs, I downloaded the JDBC driver and put it in my local folder
- I created a new python env and installed JayDeBeApi
- I ran the attached scripts both with JayDeBeApi and with JPype directly
- In both cases I’m getting an error of the sort:
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0xa) at pc=0x0000000106934288, pid=21605, tid=259
#
# JRE version: (11.0.18+10) (build )
# Java VM: OpenJDK 64-Bit Server VM (11.0.18+10-LTS, mixed mode, tiered, compressed oops, g1 gc, bsd-aarch64)
# Problematic frame:
# V [libjvm.dylib+0x300288] CodeHeap::allocate(unsigned long)+0x1c8
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /<local_dir>/hs_err_pid21605.log
#
#
[1] 21605 abort python jdbc_test.py
I’m running it locally on my macos 14.7. Java info:
$ java -version
OpenJDK Runtime Environment Zulu11.62+17-CA (build 11.0.18+10-LTS)
OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode)
$ echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/zulu-11.jdk/Contents/Home
Simple python script that I tried - using JayDeBeApi:
import jaydebeapi
jdbc_driver_path = '<PATH_TO_DRIVER>/foundry-sql-jdbc-driver-3.27.0-withdep.jar'
jdbc_url = 'jdbc:foundrysql://<HOSTNAME>?Password=<TOKEN>'
query = 'SELECT * FROM "<PATH_TO_DATASET>" LIMIT 10'
# Establish a connection
conn = jaydebeapi.connect(
'com.palantir.foundry.sql.jdbc.FoundryJdbcDriver',
jdbc_url,
[],
jdbc_driver_path
)
# Create a cursor object
cursor = conn.cursor()
# Execute a query
cursor.execute(query)
# Fetch all results
results = cursor.fetchall()
# Print the results
for row in results:
print(row)
# Close the cursor and connection
cursor.close()
conn.close()
Anothe python script that I tried that uses jpype
directly:
import jpype
import jpype.imports
from jpype.types import *
jdbc_driver_path = '<PATH_TO_DRIVER>/foundry-sql-jdbc-driver-3.27.0-withdep.jar'
jdbc_url = 'jdbc:foundrysql://<HOSTNAME>?Password=<TOKEN>'
query = 'SELECT * FROM "<PATH_TO_DATASET>" LIMIT 10'
# Start the JVM
jpype.startJVM(jpype.getDefaultJVMPath(), f"-Djava.class.path={jdbc_driver_path}")
# Import the Java classes
from java.sql import DriverManager
# Establish a connection
conn = DriverManager.getConnection(jdbc_url)
# Create a statement object
stmt = conn.createStatement()
# Execute a query
rs = stmt.executeQuery(query)
# Process the result set
while rs.next():
# Assuming the table has columns 'id' and 'name'
id = rs.getInt("id")
name = rs.getString("name")
print(f"ID: {id}, Name: {name}")
# Close the result set, statement, and connection
rs.close()
stmt.close()
conn.close()
# Shutdown the JVM
jpype.shutdownJVM()
Am I making some obvious mistake when setting things up? And is JDBC really the way to go or are there easier ways?