Seeking Assistance with Structured Output for LLM Processing

Dear Palantir Engineering Team,

I have developed a logic that processes data using an LLM and outputs the results in JSON format. However, I am running into an issue where the output includes unnecessary phrases, explanations, or plain-text descriptions instead of the desired structured JSON.

Here is an example of the expected output format:
{
  "example_data": "value"
}

And here is an example of the actual output, which includes unnecessary additions:

```json
{
  "example_data": "value"
}```


I am currently using few-shot prompting, but I am struggling to enforce a structured output. Could you please advise on how to ensure that the output strictly adheres to the JSON format, without additional text or explanations?

Thank you for your assistance.

Hi @Jacob_SE, is it correct that you want to avoid having code fences around the JSON response?
i.e., this thing → ```

LLMs might consider JSON with or without code fences to be valid responses, especially if they assume the output is to be displayed in markdown. To ensure the response is JSON without code fences, make it explicit that nothing else should be included. For example, add the following lines:

  • Provide your response exclusively in JSON format
  • The JSON object should be the only content in your response, with no surrounding code fences or text
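
If the model still occasionally wraps its answer in fences despite those instructions, a small post-processing step before parsing makes your logic tolerant of either form. Here is a minimal sketch in plain Python (this is not an AIP-specific API, and the function name is just illustrative):

```python
import json
import re

def parse_llm_json(raw: str) -> dict:
    """Parse an LLM response that should be JSON, tolerating stray code fences."""
    text = raw.strip()
    # Drop a leading ```json or ``` fence, if present.
    text = re.sub(r"^```(?:json)?\s*", "", text)
    # Drop a trailing ``` fence, if present.
    text = re.sub(r"\s*```$", "", text)
    return json.loads(text)

# Both fenced and unfenced responses parse to the same object.
assert parse_llm_json('{"example_data": "value"}') == {"example_data": "value"}
assert parse_llm_json('```json\n{"example_data": "value"}\n```') == {"example_data": "value"}
```

Stripping the fences before `json.loads` means a fenced and an unfenced response produce the same object, so minor prompt drift does not break downstream parsing.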

If this is AIP Logic, why wouldn't you use the output type option to select a struct and define your key-value pairs?


@yushi

Thank you for your response. While we understand that prompting can address the problem, we are seeing significant randomness in the output. Additionally, we are facing model deprecation issues with GPT-4, which has caused our logic to break.

I am wondering whether structured output tools are available, such as the one OpenAI recently launched or the one in LangChain.
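
For reference, this is the kind of interface I mean: a minimal sketch against the OpenAI Python SDK, outside of AIP (the model name, schema name, and prompt text here are just illustrative, and the exact parameter shape may differ by SDK version):

```python
from openai import OpenAI

client = OpenAI()

# Structured Outputs: the model is constrained to this JSON schema,
# so no code fences or extra prose can appear in the response.
response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the data as JSON."},
        {"role": "user", "content": "The example data value is 'value'."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "example_payload",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"example_data": {"type": "string"}},
                "required": ["example_data"],
                "additionalProperties": False,
            },
        },
    },
)

print(response.choices[0].message.content)  # e.g. {"example_data": "value"}
```

LangChain exposes a similar pattern through its `with_structured_output` helper.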

@maddyAWS

When using structured output, the AIP Logic returns null if the final output has an unexpected structure, for example raw JSON text that does not match the defined struct.