What I’ve learned from the Orange Draft method so far
In a previous post, I wrote about the Orange Draft method that my team and I sometimes use to draft internal documents with LLMs. One of the benefits I described is getting better at editing the drafts your LLM spits out, even when you aren’t using the Orange Draft method.
Note that this is a research context, but we are not using LLMs to generate research papers directly. I'll write another post in the future about what we do and don't use LLMs for.
Here is what I’ve learned so far, from about a month of using the Orange Draft method in a social science research context:
I MUST check all citations that it adds. If I am using an LLM to help me draft something that needs proof, I will outline the argument in the prompt and provide citations, then add "if you add any claims, please cite them with peer-reviewed sources." When it adds citations, it can't seem to tell whether the source is proving something, recommending something, or citing someone else. Very often, I need to chase down the primary source and cite it instead, or adjust the language from "researchers found" to "researchers recommend."
I often soften claims. For example, it will say “X causes Y,” and I’ll change it to “X can lead to Y.” Lots of adding “may”s and “could include”s.
Last time I did this, it was using British spelling?? So now I have to look out for that, I guess.
When making recommendations, LLMs are unlikely to think about exceptions. For example, the Orange Draft method doesn't work without modification for visually impaired people using screen readers. ChatGPT did not note that or include an alternate method, so I added that information to the Orange Draft post.
Especially if the output is supposed to be persuasive, some models (especially ChatGPT o3 at the time of this writing) will add things that are not true. This is one of the biggest reasons we don’t use LLMs for high-stakes documents. For example, if you use it to help write a grant application, it could alter the details of the proposed project to make an argument stronger, but add things that you can’t or don’t plan to deliver. Yikes!
It doesn’t understand human experience. I didn’t use an LLM at all for this text. It offered vague ideas in the last post that weren’t very helpful. Where experience and stories are helpful, you’ll need to provide them.
It does have some tells.
It does use em dashes a lot. So do I, but I understand where they are useful! LLMs seem to use them for style or sentence variety, often where they do not belong.
When it is evaluating something, it likes to shake things up with these gems: "You didn't just ___, you ___." And compliments that are just observations phrased as questions? It loves that. (That's called "hypophora" if you want to look up some more examples.)
LOTS OF LISTS (but not this one)
It’s always bolding things! Cute for a blog post or an email, but very often not suitable.
Sometimes it adds a query string to the URL of sources it finds, like "http://vocabaventures.com/about?utm_source=chatgpt.com". There are a lot of cases where that could look unprofessional.
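If you end up cleaning a lot of these, you can just delete everything from the "?" onward, or, if you happen to have Python handy, something like this minimal sketch (using the example URL above; the function name is just mine) would strip the tracking tags for you:

```python
# Minimal sketch: drop utm_* tracking parameters from a URL before citing it.
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def strip_tracking(url: str) -> str:
    parts = urlparse(url)
    # Keep only the query parameters that are not utm_* tracking tags.
    clean_query = [(k, v) for k, v in parse_qsl(parts.query) if not k.startswith("utm_")]
    return urlunparse(parts._replace(query=urlencode(clean_query)))

print(strip_tracking("http://vocabaventures.com/about?utm_source=chatgpt.com"))
# -> http://vocabaventures.com/about
```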
It’s not great at motivating some types of documents. “Why it matters” seems to be difficult when it’s not extremely practical. (Unfortunately, we have that in common!)
I would love to hear what you’ve learned about prompt writing in your context! I’ll create an update post in a few months when I’ve been using the method for longer.
—
LLM use disclosure:
I created the thumbnail image for this blog post using ChatGPT o3:
“Thanks! I like that color. [responding to the color swatch it created for the last post. Yes, I know I don’t need to thank it, but there’s only so much I am willing to do to change myself 😂]. Can you generate a stack of printed documents with about 15% of the text in orange?”
Frankly, I didn’t love the result, but when the output will do, especially for image generation, I’d rather not run it again. I think the environmental concerns are inflated compared to the actual impact (another post for another time), but waste is waste.