OpenAI Introduces Structured Output Feature: Enhancing AI-Generated JSON Reliability

OpenAI has added a structured output feature to its API, significantly improving the reliability of AI models in generating valid JSON. This feature not only allows developers to build stable applications more easily but also opens new possibilities for AI to generate structured data from unstructured input.

Structured Output: A Revolutionary Advancement in AI-Generated JSON

In the field of artificial intelligence, generating structured data from unstructured input has always been a core application scenario. Whether it is obtaining data through function calls, extracting structured information for data entry, or establishing multi-step agent workflows, developers are continuously exploring how to enable AI to better accomplish these tasks.

However, developers previously often had to rely on open-source tools, meticulously designed prompts, and even repeated request attempts to ensure that the AI model’s output conforms to the format required by their systems. This process was not only time-consuming and laborious but also potentially affected the stability and reliability of applications.

Breakthroughs in Structured Output

The structured output feature introduced by OpenAI addresses this issue through two key methods:

  1. Constrained Model Output: Ensures that the AI model’s output strictly adheres to the JSON schema provided by developers.
  2. Model Optimization: Trains the model to better understand and handle complex JSON schemas.

The results of this feature are impressive. In the evaluation of complex JSON schemas, the new model equipped with the structured output feature, gpt-4o-2024-08-06, achieved a score of 100%, far surpassing the less than 40% score of the previous model, gpt-4-0613.

How to Use Structured Output

OpenAI provides two ways to use structured output in the API:

1. Function Call

Developers can set strict: true in the function definition to enable structured output. This is applicable to all models supporting tools, including gpt-4-0613 and gpt-3.5-turbo-0613 and higher versions.

2. New Response Format Option

Developers can use the new json_schema option in the response_format parameter to provide a JSON Schema. This is particularly suitable for cases where the model needs to respond to users in a structured manner, rather than during tool calls.

Technical Principles of Structured Output

OpenAI has implemented structured output using a technique known as “constrained sampling” or “constrained decoding.” This method restricts the model’s output to valid token ranges that comply with the provided JSON schema, ensuring the output’s validity.

The specific implementation steps are as follows:

  1. Convert the JSON schema into a Context-Free Grammar (CFG).
  2. During the sampling process, dynamically determine the next valid token based on previously generated tokens and grammar rules.
  3. Use this list of valid tokens to mask the next sampling step, effectively reducing the probability of invalid tokens to zero.

This approach can handle not only simple JSON structures but also complex schemas involving nested or recursive data structures, which are difficult to achieve with Finite State Machine (FSM)-based methods.

Applications of Structured Output

The structured output feature opens up many new possibilities for developers. Here are some potential applications:

  1. Dynamic User Interface Generation: Generate descriptions of UI elements that conform to specific structures based on user intent.
  2. Data Extraction and Analysis: Extract structured information from unstructured text, such as to-dos and deadlines from meeting notes.
  3. Automated Report Generation: Convert raw data into structured report formats.
  4. Dialogue System Optimization: Ensure the chatbot’s responses adhere to predefined structures for easier subsequent processing.

Limitations and Considerations of Structured Output

Although the structured output feature is powerful, developers should be aware of the following points:

  • Supports only a subset of JSON schemas; see official documentation for details.
  • First-time use of a new schema may experience additional delay, but subsequent requests will respond quickly.
  • If the model chooses to reject unsafe requests, it may not adhere to the schema.
  • May not fully adhere to the schema when reaching max_tokens limit or other stop conditions.
  • Structured output cannot prevent all types of model errors, such as mathematical calculation errors.
  • Incompatible with parallel function calls; set parallel_tool_calls: false.
  • Does not conform to the Zero Data Retention (ZDR) standard.

Availability of Structured Output

The structured output feature is now fully available in the OpenAI API:

  • Structured output supporting function calls is applicable to all models supporting function calls in the API.
  • Structured output supporting response format is applicable to gpt-4o-mini and gpt-4o-2024-08-06 and their fine-tuned models.
  • Compatible with Chat Completion API, Assistant API, and Batch API.
  • Supports visual input.

Developers can start using the structured output feature by referring to OpenAI’s official documentation.

Frequently Asked Questions

  1. Q: How does the structured output feature improve the reliability of JSON generation? A: Structured output improves the reliability of JSON generation to 100% by constraining the model output and optimizing the model’s ability to understand complex JSON schemas.

  2. Q: Which OpenAI models are applicable to the structured output feature? A: Structured output applies to all models supporting function calls, including the latest gpt-4o series and gpt-3.5-turbo series.

  3. Q: Will using the structured output feature affect the API’s response speed? A: The first use of a new JSON schema may have some delay, but subsequent requests will not be affected, and the response speed will be fast.

  4. Q: What are the limitations of the structured output feature? A: Main limitations include supporting only a subset of JSON schemas, inability to prevent all types of model errors, and incompatibility with parallel function calls.

  5. Q: How can I start using the structured output feature? A: Developers can refer to OpenAI’s official documentation to learn how to enable and configure the structured output feature in API calls.

Share on:
Previous: Maximizing Gemini AI in Google Workspace on Wix: A Revolutionary Tool to Boost Small Business Efficiency
Next: Advanced Voice Mode in ChatGPT: Are You Ready for the AI Conversation Revolution?
DMflow.chat

DMflow.chat

An all-in-one chatbot integrating Facebook, Instagram, Telegram, LINE, and web platforms, supporting ChatGPT and Gemini models. Features include history retention, push notifications, marketing campaigns, and customer service transfer.