Key Terminology: Fundamentals

Tired of technical jargon, corporate speak, and euphemism around AI? Here’s a cheat sheet that will help you decode the hype (or a sales pitch, or a blog post, maybe even mine!).

This post will start with the basics: AI, LLMs, RLHF, and more. Please leave a comment and let me know what words I should cover in future posts like this!

Artificial Intelligence, or AI: Something we made that can appear to think, learn, and act without human instruction. It includes, but is not limited to, generative models like ChatGPT, classifiers like your junk mail filter, image recognition like reverse image search, facial recognition models that police use, and recommendation engines that online stores and streaming services use to suggest things to you. We have also tried to create synthetic minds by emulating entire brains and by teaching computers everything there is to know, one rule at a time.

Machine Learning (ML): A method of building AI where algorithms find patterns in large datasets, similar to how humans learn from examples. Instead of programming specific rules, you show the system thousands of examples and let it figure out the patterns.
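
If you’re curious what that looks like in practice, here’s a toy sketch in Python using the scikit-learn library. The messages and labels are made up for illustration; the point is that we never write a “spam rule” by hand, we just supply labeled examples.

```python
# A toy sketch of machine learning: instead of writing spam rules by hand,
# we show an algorithm labeled examples and let it find the patterns.
# (The example messages and labels are invented for illustration.)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = [
    "WIN a FREE prize now!!!",       # spam
    "Claim your cash reward today",  # spam
    "Lunch at noon tomorrow?",       # not spam
    "Here are the meeting notes",    # not spam
]
labels = ["spam", "spam", "ham", "ham"]

# Turn text into word counts, then fit a simple classifier to the examples.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(messages)
model = MultinomialNB().fit(features, labels)

# The trained model now generalizes to a message it has never seen.
new_message = vectorizer.transform(["FREE cash prize, claim now"])
print(model.predict(new_message))  # -> ['spam']
```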

Training Data: The examples used to teach an AI system. For image recognition, this might be millions of labeled photos. For chatbots, it's text from books, websites, and other sources. The quality and type of training data heavily influence what the AI can do.

Algorithm: A structured process, including contingencies. For example, when I check the mail, I get the mailbox key, open the mailbox, and, if there is mail in the box, remove the mail before closing and locking the mailbox. In computing, an algorithm is a set of instructions that tells a computer how to solve a problem or make decisions. In machine learning, algorithms are used to learn from each example in the training data.
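
To make that concrete, here is the mailbox routine written as a small Python program. The Mailbox class is invented for illustration; what matters is the fixed sequence of steps plus the contingency.

```python
# The mailbox routine from the paragraph above, written out the way a
# computer would follow it: fixed steps with one contingency
# ("if there is mail in the box").
class Mailbox:
    def __init__(self, mail):
        self.mail = list(mail)
        self.locked = True

    def unlock(self):
        self.locked = False

    def lock(self):
        self.locked = True

def check_the_mail(mailbox):
    mailbox.unlock()          # get the key, open the box
    collected = []
    if mailbox.mail:          # contingency: only if there is mail
        collected = mailbox.mail
        mailbox.mail = []     # remove the mail
    mailbox.lock()            # close and lock the box
    return collected

print(check_the_mail(Mailbox(["bill", "postcard"])))  # -> ['bill', 'postcard']
```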

“Algorithm” can be confusing in AI, because people very often refer to the resulting model as the algorithm, rather than the training process. For example, “the algorithm is showing me a lot of content about sourdough bread right now.” In fact, AI models (including those used to decide what to show you on social media or on an e-commerce site) are decidedly not rule-based. They were built using a rule-based process (an algorithm), and some aspects of the way they are implemented include rules (e.g. put sponsored posts at the top), but their power comes from the fact that their decisions are not simply algorithmic.

Generative AI: AI that creates new content (text, images, audio, video, code) based on patterns it learned from training data (rather than predicting or classifying, which are other common tasks for machine learning models). Examples include ChatGPT writing emails, DALL-E creating images, or GitHub Copilot writing code. The output is what the model predicts humans want, based on what it has seen before.

Large Language Models (LLMs): A particular, popular type of generative AI trained on massive amounts of text that can understand and generate human-like language. Examples include ChatGPT, Claude, and Gemini. They predict what word or phrase comes next based on patterns in their training, which lets them write, answer questions, and have conversations.
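
Here is a drastically simplified sketch of “predict what comes next based on patterns.” Real LLMs use neural networks over enormous datasets, not word counts over a couple of sentences, but the core task is the same.

```python
# A toy "next word" predictor: count which word tends to follow each word
# in some training text, then predict the most common continuation.
# (Real LLMs learn far richer patterns, but the prediction task is the same.)
from collections import Counter, defaultdict

training_text = (
    "the cat sat on the mat . the cat sat down . the cat chased the dog ."
).split()

next_word_counts = defaultdict(Counter)
for current, following in zip(training_text, training_text[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    # Pick the continuation seen most often in training.
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("cat"))  # -> 'sat' (seen twice, vs. 'chased' once)
```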

State-of-the-art (SOTA): Top of the line. In the case of LLMs, people will talk about the current best LLM as SOTA or identify the model that is SOTA at a particular task or benchmark. I can talk about common model benchmarks in a future post if you’re interested!

Generative Pre-trained Transformer (GPT): The technology behind OpenAI’s LLMs. "Generative" means it creates new content, "pre-trained" means it learned from massive amounts of text before you ever interact with it, and "Transformer" refers to the specific architecture that lets it understand context and relationships between words. Although this label is particular to OpenAI models (thus “ChatGPT”), it is nice to know what initialisms mean, and this one can help us understand and remember a few things about popular LLMs’ strengths and limitations:

  • The Transformer architecture is particularly good at understanding how words relate to each other across long passages of text, rather than just predicting the next word based on the immediate previous words. This makes GPTs much more useful than past chatbots, which liked to go off on weird tangents.

  • Pre-trained refers to the fact that (out of the box) a GPT has learned from (approximately) the entire internet, as it existed on the date it was trained. Pre-training is incredibly expensive, which is why most people build custom models on a foundation of existing ones, rather than training their own. Pre-training also has a cut-off date (the date of the newest information in its training data), beyond which the model does not know new information. For example, LLMs did not know who the president of the US was for a while, until major model makers decided they needed to add that information to the system prompt.
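
As a concrete illustration of “pre-trained,” here’s a minimal sketch using the open-source Hugging Face transformers library to load GPT-2 (a small, older GPT that anyone can download) and have it continue some text. No training happens here; everything the model knows was baked in before we touched it.

```python
# A minimal sketch of using a pre-trained model out of the box.
# Requires `pip install transformers`; downloads GPT-2 on first run.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("The mail carrier opened the mailbox and", max_new_tokens=15)
print(result[0]["generated_text"])
```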

Reinforcement Learning from Human Feedback (RLHF): After pre-training, models have the structure to give intelligible answers (the transformer architecture) and a ton of information (from pre-training), but they aren’t great at consistently responding to a prompt with a useful answer. During RLHF, humans rate AI outputs as good or bad, teaching the AI to produce responses people prefer.

RLHF is how we get the default answer structures (e.g. a short introductory sentence, content outlined with bullets or tables, and an offer to do the next step); it is one of the ways model makers try to make sure we don’t get offensive or illegal content; and it is also where sycophancy (the tendency of models to excessively praise users and their ideas) comes from.
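
Here’s a sketch of what a single piece of RLHF feedback might look like as data. The record format is illustrative, not any lab’s actual schema.

```python
# One hypothetical human-feedback record. Raters compare two candidate
# answers to the same prompt and mark the one they prefer; a "reward model"
# is trained to score preferred answers higher, and the LLM is then tuned
# to produce high-scoring answers.
preference_record = {
    "prompt": "Explain photosynthesis to a 10-year-old.",
    "response_a": "Photosynthesis is the process by which autotrophic "
                  "organisms convert electromagnetic radiation into...",
    "response_b": "Plants are like tiny chefs: they take sunlight, water, "
                  "and air, and cook them into food for themselves.",
    "preferred": "response_b",  # clearer for the stated audience
}

# Thousands of judgments like this teach the model which styles and tones
# people prefer, including, as a side effect, flattery if raters reward it.
print(preference_record["preferred"])
```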

System Prompt: The initial instructions given to an AI model that define its personality, capabilities, limitations, and behavior before any user interaction begins. For example, a system prompt might tell an AI to be helpful but concise, to refuse certain types of requests, or to always mention when information might be outdated. Model makers often also use system prompts to bridge major information gaps caused by training data cut-off dates (e.g. who is the US president, if the election was decided after the cut-off date).

Anthropic publicly shares Claude's system prompts so you can read exactly how they've instructed Claude to behave and what a system prompt looks like.
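
For developers, here is a minimal sketch of setting a system prompt when calling a model through an API, using OpenAI’s Python library as one example. The instructions and model name are illustrative.

```python
# A minimal sketch of passing a system prompt via OpenAI's Python library.
# Requires an API key (read from the OPENAI_API_KEY environment variable);
# the model name and instructions are examples, not recommendations.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[
        # The system prompt: rules the model sees before any user input.
        {"role": "system", "content": (
            "You are a concise assistant for a public library. "
            "Refuse requests for legal or medical advice, and say when "
            "your information might be outdated."
        )},
        # The user's actual message comes after.
        {"role": "user", "content": "What time do you close today?"},
    ],
)
print(response.choices[0].message.content)
```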

Hallucination: When a model makes up facts. This is an error type that plagues LLMs in particular. LLMs can "hallucinate" fake citations, nonexistent events, or incorrect statistics because they predict that such facts would make a great argument. The model will often admit that a fact is made up if you ask it directly after the fact, but (unlike humans) recognizing that it can lie and has lied will not reduce its propensity to lie in the future. You can reduce hallucinations by asking the LLM to include linked citations for all fact claims and examples, but it will still hallucinate. If you draft text with LLMs, you must fact-check it.

Prompt: The instruction or question you give to an AI system. Good prompts get better results. Good prompts are direct and clear, describe the format and qualities of the desired output, and offer context (information about the task, audience, examples, primary sources, etc.). Different prompting methods can get you better answers for different tasks.
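
For example, here are a vague prompt and a sharper one for the same task. The event details are invented for illustration.

```python
# Two prompts for the same task. The second follows the advice above:
# direct and clear, specifies format and qualities, and supplies context.
vague_prompt = "Write something about our fundraiser."

better_prompt = """Write a 150-word email inviting past donors to our
spring fundraiser on May 3rd at Riverside Park.

Format: warm greeting, two short paragraphs, one clear call to action.
Tone: friendly and grateful, not pushy.
Context: most recipients gave $25-$100 last year; the event is free to
attend and proceeds fund our tree-planting program."""

print(better_prompt)
```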

Bias: When AI systems produce results that support discriminatory behavior, further harmful stereotypes, or perform worse for some groups than others. Bias in AI results often reflects biases in their training data or design. For example, facial recognition and skin cancer detection models tend to work better on light skin than dark skin because there is much more data for light-skinned people. This can get confusing in conversations about AI, because “bias” also has statistical meanings that refer to neutral patterns in data, not prejudice.

Model: The AI system after it's been trained. You might say "the GPT-4 model" or "Google's image recognition model." It's the finished product that can perform tasks.

Application Programming Interface (API): How different software programs talk to each other. Many AI tools offer APIs so developers can integrate AI features into their own applications. When you hear about, for example, “a ChatGPT wrapper,” a software product that has a third-party LLM “in” it, or (often) a company “building their own LLM,” it often means that they access a state-of-the-art model through an API and build other software around it so that it can perform the specific tasks they want, with the safety guardrails that they want, and/or with a bunch of additional training data or fine-tuning custom to their contexts.

For example, a non-profit could use an API to connect a state-of-the-art LLM to their donor database. That would allow it to write better personalized emails; find donors whose recent communications seem to indicate that they are at high risk of leaving; or identify every donor with a likely interest in a particular tree-planting program: people interested in the environment, volunteerism, donating physical labor, or the particular region where the program is taking place.
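
Here’s a sketch of what that kind of “wrapper” might look like, again using OpenAI’s Python library as one example. The donor record, model name, and helper function are all hypothetical.

```python
# A sketch of the non-profit example above: a thin wrapper that connects
# an LLM (accessed through an API) to donor records. Everything here
# (data, model name, prompt) is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def draft_thank_you(donor):
    """Build a donor-specific prompt from database fields, then call the API."""
    prompt = (
        f"Write a short, warm thank-you email to {donor['name']}, who gave "
        f"${donor['last_gift']} and cares about {donor['interests']}. "
        "Mention our tree-planting program only if it matches their interests."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# One hypothetical row from the donor database:
donor = {"name": "Sam", "last_gift": 50, "interests": "environment, volunteering"}
print(draft_thank_you(donor))
```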
