I am trying to read a dataset using PySpark in a Jupyter Workspace, but I keep getting a Py4J Java error.
I have installed openjdk (conda), pyspark (PyPI), and the java-jdk library, but the issue persists.
Hi pratyush335,
I wanted to confirm first: have you installed openjdk from conda and pyspark from PyPI, as described in this documentation? https://www.palantir.com/docs/foundry/code-workspaces/code-workspaces-faq#can-i-use-pyspark-in-code-workspaces
Best,
calebh
I had installed openjdk from conda, and pyspark from conda as well. I have since uninstalled pyspark and reinstalled it from PyPI, then re-ran the code, but the issue persists.
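For anyone debugging the same thing, it can help to check what the notebook kernel actually sees before reinstalling again. The following is only a minimal sketch, not something from the documentation: the function name `describe_spark_environment` is made up for illustration, and it assumes the kernel running the notebook is the same environment the packages were installed into. It only reports the installed pyspark version and whether a Java runtime is discoverable; it does not say which versions are the right ones.

```python
import os
import shutil
import subprocess
from importlib import metadata


def describe_spark_environment():
    """Collect basic facts about the pyspark and Java setup visible to this kernel.

    Hypothetical helper for illustration; not part of pyspark or Foundry.
    """
    report = {}

    # Version of the pyspark distribution installed in this environment, if any.
    try:
        report["pyspark_version"] = metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        report["pyspark_version"] = None

    # A Java runtime is normally located through JAVA_HOME or a `java` binary on PATH.
    report["java_home"] = os.environ.get("JAVA_HOME")
    report["java_on_path"] = shutil.which("java")

    # `java -version` usually writes to stderr; keep only the first line.
    report["java_version"] = None
    if report["java_on_path"]:
        try:
            result = subprocess.run(
                [report["java_on_path"], "-version"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            lines = (result.stderr or result.stdout).strip().splitlines()
            report["java_version"] = lines[0] if lines else None
        except (OSError, subprocess.SubprocessError):
            pass  # Leave java_version as None if the binary cannot be run.

    return report


if __name__ == "__main__":
    for key, value in describe_spark_environment().items():
        print(f"{key}: {value}")
```

If `pyspark_version` or `java_on_path` comes back as `None`, the package is probably not installed in the environment the kernel is using, which would be worth ruling out before comparing versions.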
Solved:
The solution lies in installing the correct version of the libraries:
If you follow the documentation;
However, if you don’t install openjdk and install java-jdk instead;
Peace.