Tips For Instructing gpt-3.5-turbo-0301

Best practices for instructing models may change from model version to model version. The advice that follows applies to gpt-3.5-turbo-0301 and may not apply to future models.

System messages

Developers can use System message to prime the assistant with different personalities or behaviors.

Be aware that gpt-3.5-turbo-0301 does not generally pay as much attention to the system message as gpt-4-0314 or gpt-3.5-turbo-0613. Therefore, for gpt-3.5-turbo-0301, we recommend placing important instructions in the user message instead. Some developers have found success in continually moving the system message near the end of the conversation to keep the GPT model’s attention from drifting away as conversations get longer.

# An example of a system message that primes the assistant to explain concepts in great depth
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a friendly and helpful teaching assistant. You explain concepts in great depth using simple terms, and you give examples to help people learn. At the end of each explanation, you ask a question to check for understanding"},
        {"role": "user", "content": "Can you explain how fractions work?"},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"]).

Sure! Fractions are a way of representing a part of a whole. The top number of a fraction, numerator, represents how many parts of the whole we are talking about. The bottom number, denominator, represents how many equal parts the whole is divided into.

For example, if we have a pizza that is divided into 8 equal slices, and we take 3 slices, we can represent that as the fraction 3/8. The numerator is 3 because we took 3 slices, and the denominator is 8 because the pizza was divided into 8 slices.

Adding Subtract Fractions

To add or subtract fractions, we need to have a common denominator. This means that the denominators of the fractions need to be the same. To do this, we can find the least common multiple (LCM) of the denominators and then convert each fraction to an equivalent fraction with the LCM as the denominator.

If we need to multiply fractions, we simply multiply the numerators together and the denominators together. To divide fractions, we multiply the first fraction by the reciprocal of the second fraction (flip the second fraction upside down).

Now, here’s a question to check for understanding. We have a pizza that is divided into 12 equal slices. Then we take 4 slices. What is the fraction that represents how much of the pizza we took?

# An example of a system message that primes the assistant to give brief, to-the-point answers
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a laconic assistant. You reply with brief, to-the-point answers with no elaboration."},
        {"role": "user", "content": "Can you explain how fractions work?"},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"])

Fractions represent a part of a whole. They consist of a numerator (top number) and a denominator (bottom number) separated by a line. The numerator represents how many parts of the whole are being considered, while the denominator represents the total number of equal parts that make up the whole.

Counting tokens

When you submit your request, the API transforms the messages into a sequence of tokens.

The number of tokens used affects:

the cost of the request,
the time it takes to generate the response,
when the reply gets cut off from hitting the maximum token limit (4,096 for gpt-3.5-turbo or 8,192 for gpt-4).

You can use the following function to count the number of tokens that a list of messages will use.

Note that the exact way that tokens are counted from messages may change from model to model (GPT-3, GPT-4 or upcoming GPT-5). Consider the counts from the function below an estimate, not a timeless guarantee.