The tall weeds of LLM prompts

Tuesday 27th February, 2024 - Bruce Sterling

It’s interesting that the cost of prompts and platforms is becoming a major issue.

From: The Prompt newsletter

I’m seeing four patterns among folks who use LLMs in production right now:

1. GPT-4 can do many things very well, but it becomes quite expensive in production.

2. Builders turn to GPT-3.5 Turbo for cost savings, but struggle with the results.

3. They then test models from other providers like Google and Anthropic, or an open-source model, but don’t get the same results.

4. Lastly, they start considering whether to fine-tune their own model.

Let’s talk about the strategies for working through this process.

How to lower costs with GPT-4

If you have a more complex LLM workflow, and you’re using GPT-4 for every step of that workflow, the costs can start to rise.
To reduce the cost, ask yourself this question:

Which part of my workflow can I do with a secondary model?

For example, GPT-3.5 will do great with summarization, but it won’t do as well at capturing intent from a customer message. You’ll definitely need GPT-4 for that.
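The split described above can be sketched as a small routing table: each workflow step is sent to the cheapest model that handles it well. The model names and the task-to-model mapping below are illustrative assumptions, not a fixed recipe.

```python
# Hypothetical routing: cheap model for simple steps, GPT-4 for hard ones.
MODEL_FOR_TASK = {
    "summarization": "gpt-3.5-turbo",  # simple, high-volume step
    "intent_detection": "gpt-4",       # nuanced step that needs the stronger model
}

def pick_model(task: str) -> str:
    """Return the model assigned to a workflow step, defaulting to the cheap one."""
    return MODEL_FOR_TASK.get(task, "gpt-3.5-turbo")
```

The point of defaulting to the cheap model is that GPT-4 becomes the exception you opt into, not the baseline you pay for everywhere.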

Depending on the use case, you might still end up using GPT-3.5 even for more complex tasks. To make that work, you need to write prompts in the style the model was trained on.

In the next section we’ll share some best practices that can improve your output.

How to get better results with GPT-3.5

Below are some tips to make GPT-3.5 work more like GPT-4:

Use “Do” instead of “Don’t”

Separate instructions from context with ### or """

Be direct: Use “You must”, or “Your task is”…

Assign a role

Instead of just writing “conversational style”, add a sentence that follows that style, and ask the model to replicate it

Add info about the end-user

Provide the format structure of the output

Give examples (use Chain of thought prompting for more complex reasoning tasks)

Use emotion prompts like “This is very important for my career”

If your prompt is complex, split it into more prompts and chain them together (it will be easier for GPT-3.5 to follow multiple but simpler prompts)
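Several of the tips above can be combined in one template: a direct “Your task is” instruction, an assigned role, info about the end user, an explicit output format, and ### separators between instructions and context. All the wording and parameter names below are illustrative, not a prescribed format.

```python
def build_prompt(role: str, task: str, user_info: str, context: str,
                 output_format: str) -> str:
    """Assemble a GPT-3.5-friendly prompt from the pieces above."""
    return (
        f"You are {role}. Your task is to {task}.\n"
        f"The end user is {user_info}.\n"
        f"Respond in this format: {output_format}\n"
        "###\n"          # separator: instructions end here
        f"{context}\n"   # the content the model should work on
        "###"
    )

prompt = build_prompt(
    role="a customer-support assistant",
    task="summarize the message below in one sentence",
    user_info="a non-technical customer",
    context="My order #123 arrived late and the box was damaged.",
    output_format="a single plain-text sentence",
)
```

For the chaining tip, the same idea extends naturally: call `build_prompt` once per sub-task and feed each output into the next prompt’s context.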

Please refer to my original post for additional information and examples.

How to prompt Claude?

It’s really interesting to me that we default to prompting Claude the same way we prompt OpenAI’s models. In reality, Claude was trained with a completely different set of techniques and methodologies, and it responds best to prompts designed with that training in mind.

Here’s a quick rundown on how you should prompt Claude:

Use XML tags such as <document></document> to separate instructions from content

Use “Do” instead of “Don’t”, and be direct

Start the Assistant output with the first token of the expected response

Assign a role, and pass it in the Assistant output as well

Ask Claude to reason through the problem before providing the answer

Provide Examples (few-shot, chain of thought prompting)

If you’re dealing with longer documents, always ask your question at the end of the prompt

Break complex prompts into multiple prompts
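The Claude conventions above can be sketched as a prompt pair: XML tags separating the document from the question, the question placed at the end, and a prefilled Assistant turn that starts the expected response. The tag names and wording here are illustrative assumptions.

```python
document = "Quarterly revenue grew 12%, driven by the new subscription tier."
question = "What drove revenue growth?"

# User turn: role up front, XML tags to fence off content,
# and the actual question at the end (important for long documents).
user_turn = (
    "You are a financial analyst.\n"
    f"<document>\n{document}\n</document>\n"
    f"<question>\n{question}\n</question>\n"
    "Reason through the document first, then answer in one sentence."
)

# Prefill: the Assistant turn begins with the first tokens of the
# expected answer, steering the model straight into the desired format.
assistant_prefill = "As a financial analyst, I see that the main driver was"
```

Starting the Assistant message yourself is the “first token of the expected response” tip: the model continues from your prefill instead of choosing its own opening.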

If this seems a bit confusing, read my prompting guide for Claude, which includes detailed examples and instructions.

I’m currently working on a prompting guide for Gemini and Mixtral, and will share it soon.