LLM-augmented Brainstorming: How to make it work
Large Language Models (LLMs) are, from one perspective, novelty-generating machines. When studied on their own, they’ve generated more and better ideas, faster than people. But evidence from research studies on integrating AI into human brainstorming sessions is mixed at best. Some studies show more and better ideas coming from AI-assisted groups; others show decreases of more than 20%. One concerning study even showed that using AI hurt humans’ originality in later, AI-free brainstorming tasks. What is going on? And can we use AI effectively for brainstorming?
When AI excelled, it was left to its own devices and applied to a simple task (ideas for a product that college students would buy). But pretty often, we want it to do more involved ideation tasks that include some guidance from us, the context experts! And when we work together with AI, sometimes we get fewer, more predictable ideas.
Some think it’s because we approach LLMs like we use Google (ask a question, get an answer!) Normally, we’d want to have a divergence phase (think up as many wildly different ideas as you can!) and then a convergence phase (which of these ideas do we think is best?) But if we put too much confidence in the first answer or answer set we got from an answer machine, we settle into the “convergence” phase too quickly.
The same thing can happen in human-only brainstorming groups. Experts have techniques that help us stay in divergence mode. For example, asking people to think of some ideas on their own before they come to the group brainstorm can help a group maintain divergence longer, bringing in ideas from before they met to shake things up. You can also make a point of gathering diverse people at similar levels of organizational power to brainstorm together: diversity brings different perspectives, and not having your boss’s boss in the room makes it less embarrassing to blurt out an off-the-wall idea. Sometimes, people pull from a deck of cards that have people, values, or new situations printed on them to prompt new ideas in the group.
Fortunately, there are tailored techniques we can use to improve an AI-assisted brainstorm, too.
Introduce AI later. This seems to help! Let the humans get comfortable in the divergence phase. Although I haven’t seen it tested, I’d bet it’d be even better to have people think of ideas by themselves, then as a group, and only then introduce AI.
Use prompts that request divergence. “Give me 10 wildly different ideas.” Consider leaving out or even changing some context to get ideas from different domains.
Seed it with examples. Giving it a couple of your best ideas seems to improve AI-augmented brainstorming results. Perhaps seeding it with a couple very different ideas will support divergence, too!
Don’t treat AI like an oracle. Think of it as one of those brainstorming card decks that can shake things up when you draw from it. Chat back and forth with it. Use cheesy brainstorming tools with it: “answer this question like an 8-year-old/an ancient person/an alien,” “propose a solution that has no features in common with the previous one,” “give me 5 ideas that would work [on the moon/for kids/at a tiny scale].”
“More chat, less bot.” Rather than asking for more solutions, chat back and forth with it. “Imagine a conversation between [Steve Jobs and Jony Ive/Kant and Locke/five people who disagree/2 healthy adults] about this problem;” “ask me five questions that you think will help me think of some divergent ideas;” “what if we combined [this idea] with [that one];” “how would you think about this problem if you had [24 hours to implement it/an unlimited budget/the world’s experts in this area available to you]?”
Crank up the temperature. If your platform and model allow you to adjust the parameter called “temperature,” try brainstorming with the temperature up high!
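To make that knob concrete, here’s a toy sketch (not any provider’s actual sampling code) of what temperature does under the hood: the model’s raw scores for candidate tokens are divided by the temperature before being converted to probabilities, so high temperatures flatten the distribution and make surprising picks more likely, while low temperatures sharpen it toward the safest choice. The logit values below are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature divides the logits first."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens: one favorite, two long shots.
logits = [4.0, 2.0, 0.5]

for t in (0.3, 1.0, 1.8):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature {t}: {[round(p, 3) for p in probs]}")
```

At temperature 0.3 the favorite dominates almost completely; at 1.8 the long shots get a real chance, which is exactly the “more varied, less predictable” behavior you want while diverging.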
Prevent the LLM from converging. Some models tend to converge too quickly on their own! If you see this happening, you can 1) open a new chat window, which clears the context and lets you start over, or 2) warn the model about its tendency to do this and ask it not to, with something like: “I am not quite ready to make a decision yet, and I am still looking for more, different ideas. Please don’t evaluate the quality of ideas; focus on quantity.”
Have you found techniques that help you get better results out of brainstorming with AI? Please share them!
—
LLM Disclosure:
I experimented with using LLMs to help me review literature with mixed results.
First, I used my normal, boring, human methods to find some papers on AI+human brainstorming. Then, I asked ChatGPT o3:
“Can you create a document summarizing peer-reviewed research around brainstorming with LLMs? - What is the range of findings around idea quantity and quality? (For each paper that scores quality, please describe how quality was assessed: by whom? how were they related to the project? were they blind to whether AI was involved or not?) - Are there other ways that the brainstorming sessions were evaluated, how were they measured, and what were the results? - what do researchers attribute those results to? (if they are bad, why do they think LLM-augmented teams did worse? if they are good, what to they attribute that to?) - Were there cases where interventions allowed LLM augmented brainstorming sessions to improve their results? Please cite each study that you included.”
Then, after I got an answer, I added the list of citations I had found with this prompt:
”Thank you! does this include the following studies? Why or why not? If they should be included, please add them: [list]”
Its answer included only one of the studies I had found in my cursory search: the top handful of Google Scholar results for “brainstorming LLMs,” plus one I’d heard about on a podcast. This lack of overlap does NOT signal good things about the viability of this approach for reducing effort in thorough lit reviews.
Then I had it clarify some things in the document, and then asked:
”Thanks! Based on the results of what you read, can you please correct or confirm this thesis: AI on its own can generate a huge volume of ideas, but the quality isn't great/(consistent?). When humans generate with AI, they often do worse, perhaps because they don't iterate enough or build their own ideas off of AI's--they treat it like google instead of a brainstorming partner. You can fix this by 1) asking the AI several times, rather than just once. 2) seed it with example ideas 3) delay the use of the LLM until the people have had time to ideate on their own 4) use worse AI-- having but great but wildly varying answers gives the humans ideas to build off of, but they stay engaged. Please identify where this paragraph is contradicted by the evidence and link citations to the parts that are correct. Help me edit this so it is accurate.”
It handled this really well: it broke out all the claims and sorted them into “supported” and “needs to be changed.” It treated each claim separately and connected it to citations. Unfortunately, between needing to update the thesis and the lackluster results in the previous task, I decided I needed to do more of this work on my own.
So I dug in some more. I used Claude 3.7 a few times to summarize papers I wasn’t sure were relevant:
“Can you summarize the findings of this paper?”
During my lit review journey, I also asked Claude:
“I remember from college that research suggested that people would get better results from brainstorming sessions if they had independent idea generation time before they got together to brainstorm. Is that right? can you point me to related research?”
and:
“sometimes, when humans brainstorm together with an LLM, they get worse results. I think they leap to convergence before they are done with divergence. What do you think we can do to get a broad range of ideas. [sic]”
I added two additional example prompts under points 4 & 5 from the results of this query.
Whew! When I find a method that gets better results, I’ll write about it, but this was a lot!
Finally, I asked ChatGPT 4o “Can you create an image of what creativity looks and feels like?” I was curious to add the “feels like” clause, but I can’t say the result blew me away. Again, not going to spend a lot of time iterating on the image: it only shows up on the blog home page and I don’t want to be wasteful.