Chat models accept an array of messages as input and return a model-generated message as output (the Chat Completions format). While the chat format was designed to make multi-turn conversations easy, it is just as useful for single-turn tasks that don't involve any conversation.
A sample API call looks like this:
import openai

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
For complete API reference documentation, see the OpenAI API reference.
The primary input is the ‘messages’ parameter, which must be an array of message objects; each object has a role (“system”, “user”, or “assistant”) and content. Conversations can be as short as a single message or span many back-and-forth turns.
In most cases, a conversation begins with a system message, followed by alternating messages between the user and the assistant.
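For a task that needs no conversational context, a single user message is enough. Here is a minimal sketch, assuming the same pre-v1 openai Python library as the example above and an API key already configured in the environment:

import openai

# A single-turn request: no history is needed for one-shot tasks like translation.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Translate 'Good morning' into French."}
    ]
)
print(response["choices"][0]["message"]["content"])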
System Message
The system message helps set the assistant's behavior. For instance, you can use it to adjust the assistant's personality or give it specific rules to follow throughout the conversation. Note that the system message is optional; if it is omitted, the model will likely behave much as it would with a generic message like “You are a helpful assistant.”
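As an illustration, a system message can pin down both a persona and ground rules. The persona below is hypothetical, reusing the call shape from the example above:

# The system message sets persona and ground rules for every subsequent turn.
messages = [
    {"role": "system", "content": "You are a terse assistant. Answer in one sentence and do not speculate."},
    {"role": "user", "content": "What is the capital of Australia?"},
]
response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)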
User and Assistant Messages
User messages put forward requests or observations for the assistant to respond to. Assistant messages, on the other hand, keep track of previous responses from the assistant but can also be composed by you to demonstrate expected behavior.
Including the conversation history is crucial, particularly when the user’s instructions are in reference to previous messages. For instance, in the example provided, the final user query “Where was it played?” is only comprehensible in the context of preceding messages about the 2020 World Series. Since the models do not remember previous requests, all pertinent information must be included in the conversation history for each request. If a conversation exceeds the model’s token limit, some form of truncation will be necessary.
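One simple truncation strategy is to drop the oldest user/assistant turns until the conversation fits. The sketch below estimates token counts with the tiktoken library; the exact per-message overhead varies by model, so treat these counts as approximations rather than the API's exact accounting:

import tiktoken

def truncate_history(messages, model="gpt-3.5-turbo", max_tokens=4000):
    """Drop the oldest non-system messages until the estimated size fits."""
    enc = tiktoken.encoding_for_model(model)

    def estimate(msgs):
        # Content tokens plus a rough per-message overhead of 4 tokens.
        return sum(len(enc.encode(m["content"])) + 4 for m in msgs)

    msgs = list(messages)
    # Preserve a leading system message, if present, and trim after it.
    start = 1 if msgs and msgs[0]["role"] == "system" else 0
    while estimate(msgs) > max_tokens and len(msgs) > start + 1:
        del msgs[start]
    return msgs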
Chat Completions Response Format
An example chat completions API response looks as follows:
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "The 2020 World Series was played in Texas at Globe Life Field in Arlington.",
                "role": "assistant"
            }
        }
    ],
    "created": 1677664795,
    "id": "chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW",
    "model": "gpt-3.5-turbo-0613",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 17,
        "prompt_tokens": 57,
        "total_tokens": 74
    }
}
In Python, the assistant's reply can be extracted with response['choices'][0]['message']['content'].
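Because the API is stateless, a typical multi-turn loop appends each assistant reply to the history before sending the next user message. Continuing the earlier example as a sketch:

# Feed the reply back into the history so the next request has full context.
reply = response["choices"][0]["message"]["content"]
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Who was the series MVP?"})
response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)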
Every response includes a ‘finish_reason’. Its possible values, handled in the sketch after this list, are:
- ‘stop’: The API returned a complete message, or a message terminated by one of the stop sequences supplied via the stop parameter.
- ‘length’: The model output was cut off by the max_tokens parameter or the model's context limit.
- ‘function_call’: The model decided to call a function.
- ‘content_filter’: Content was omitted because it triggered a content filter.
- ‘null’: The API response is still in progress or incomplete.
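A minimal way to branch on finish_reason, sketched against the response shape shown above:

choice = response["choices"][0]
if choice["finish_reason"] == "length":
    # The reply was cut off; raise max_tokens or shorten the prompt and retry.
    print("Warning: response truncated at the token limit.")
elif choice["finish_reason"] == "content_filter":
    print("Warning: some content was omitted by the content filter.")
else:
    print(choice["message"]["content"])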
The model's response may include additional fields depending on the input parameters (for example, when functions are provided), as in the sketch below.
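As a sketch of the functions case: when a functions list is passed, a choice may carry a function_call object instead of plain content, and finish_reason becomes ‘function_call’. The get_weather schema here is hypothetical, for illustration only:

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "What's the weather in Arlington?"}],
    functions=[{
        "name": "get_weather",  # hypothetical function, for illustration only
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }]
)
message = response["choices"][0]["message"]
if message.get("function_call"):
    # The arguments arrive as a JSON string for your code to parse and execute.
    print(message["function_call"]["name"], message["function_call"]["arguments"])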