Memory and Threads: Agents That Remember
Try this with Aria as she stands right now: tell her your name, then in a separate message, ask her what your name is. She won't know. Not because she's being difficult — because as far as her code is concerned, that second message is the first thing she's ever heard from you.
This article fixes that. By the end, Aria will remember an entire ongoing conversation — and you'll understand exactly what "memory" means in this context, because it's not what most beginners assume.
🟢🟡 Skill level: Beginner-to-Intermediate.
Quick Reference
When to use this: Any time a conversation needs to span more than one message — follow-up questions, ongoing context, multi-turn conversations.
Basic syntax:
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver
agent = create_agent(model="gpt-5-nano", checkpointer=InMemorySaver())
config = {"configurable": {"thread_id": "1"}}
response = agent.invoke({"messages": [...]}, config)
Common patterns:
- No checkpointer → every invoke() call is a completely fresh, isolated conversation
- Checkpointer + thread_id → conversation history persists across multiple invoke() calls that share the same thread_id
- Different thread_id values → completely separate, unrelated conversations, even using the same agent
Gotchas:
- ⚠️ InMemorySaver keeps history only in your program's memory, so it's gone the moment your script ends. It's for learning and development, not production persistence.
- ⚠️ Forgetting to reuse the same thread_id is the most common reason memory "doesn't work."
See also: Web Search: Real-Time Knowledge for Agents
What You Need to Know First
- Everything from Articles 1–3: create_agent, tools, and how an agent processes a single message
- No new API keys are needed for this article
What We'll Cover in This Article
- Why agents forget everything by default
- What a checkpointer actually does
- What a thread_id is and why it matters
- How to give Aria real memory across multiple messages
What We'll Explain Along the Way
- What "state" means in this context
- The difference between memory within a single response and memory across separate calls
Seeing the Problem: No Memory by Default
Let's prove the problem before fixing it. We'll send Aria two completely separate messages and see what happens.
# Purpose: Demonstrate that an agent has no memory between invoke() calls by default
# Context: Establishes the problem before introducing checkpointers
# Input: Two separate messages, sent as two separate invoke() calls
# Output: The second response shows Aria has no idea what was said in the first
from dotenv import load_dotenv
load_dotenv()
from langchain.agents import create_agent
from langchain.messages import HumanMessage
agent = create_agent(model="gpt-5-nano")
# First message: tell Aria something
first_message = HumanMessage(content="Hi, my name is Julie and I prefer short, direct email replies.")
response_one = agent.invoke({"messages": [first_message]})
print(response_one["messages"][-1].content)
# Second message: ask about what we just said, in a completely separate call
second_message = HumanMessage(content="What's my name, and how do I like my replies?")
response_two = agent.invoke({"messages": [second_message]})
print(response_two["messages"][-1].content)
The second response won't know Julie's name or her preference — Aria will likely say she doesn't have that information, or ask you to clarify. This isn't a bug. Each call to agent.invoke(...) is entirely self-contained: whatever messages you pass in are the only thing the agent knows about for that call. Nothing carries over automatically.
What's Actually Being Remembered?
Here's a distinction worth being precise about, because it trips people up: within a single invoke() call, the agent already has access to everything in the messages list you passed it — that's how it could see both a question and a follow-up if you included them in the same list. What's missing is memory between separate calls — separate times your code calls agent.invoke(...).
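To make the distinction concrete, here is a minimal sketch in plain Python. It does not call a real model: toy_invoke is a hypothetical stand-in for a stateless agent call, invented for this illustration, that can only answer from the messages handed to it in that one call.

```python
import re


def toy_invoke(messages: list[str]) -> str:
    """Hypothetical stand-in for a stateless agent call.

    It only sees the messages passed in for this one call.
    """
    for text in messages:
        match = re.search(r"my name is (\w+)", text, flags=re.IGNORECASE)
        if match:
            return f"Your name is {match.group(1)}."
    return "I don't know your name."


# Two separate calls: the second call never sees the first message.
toy_invoke(["Hi, my name is Julie."])
separate = toy_invoke(["What's my name?"])

# One call containing both messages: everything is visible at once.
combined = toy_invoke(["Hi, my name is Julie.", "What's my name?"])

print(separate)  # I don't know your name.
print(combined)  # Your name is Julie.
```

The function keeps nothing between calls, so the only way it can "remember" is if the earlier message is in the list it receives.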
That's what we're fixing. We need somewhere to store the growing conversation history, and a way to tell the agent which stored conversation to continue. In LangGraph (which create_agent is built on), the stored conversation is part of the agent's state, the mechanism that saves and reloads that state is called a checkpointer, and the identifier that picks out one stored conversation is the thread ID.
Think of a checkpointer like a video game's save system. Without saving, every time you start the game, you're back at the very beginning — no progress, no inventory, no history. A checkpointer is what lets you save your progress and pick up exactly where you left off. The thread ID is which save file you're loading — different thread IDs are completely separate save files, even on the same game (the same agent).
Diagram: Calls sharing the same thread ID build on a continuously saved conversation. A different thread ID starts a completely separate, unrelated conversation, even using the exact same agent.
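The save-file analogy can be sketched in plain Python. ToySaver and invoke_with_memory are hypothetical names invented for this illustration. LangGraph's real checkpointer stores richer state than a list of strings, so treat this as a mental model rather than a description of how InMemorySaver is implemented.

```python
class ToySaver:
    """Hypothetical, simplified stand-in for a checkpointer."""

    def __init__(self) -> None:
        # One saved message list per thread_id, like one save file per slot.
        self._threads: dict[str, list[str]] = {}

    def load(self, thread_id: str) -> list[str]:
        # Return a copy so callers cannot mutate the saved history by accident.
        return list(self._threads.get(thread_id, []))

    def save(self, thread_id: str, messages: list[str]) -> None:
        self._threads[thread_id] = list(messages)


def invoke_with_memory(saver: ToySaver, thread_id: str, new_message: str) -> list[str]:
    """Load the thread's history, append the new message, save, and return it."""
    history = saver.load(thread_id)
    history.append(new_message)
    saver.save(thread_id, history)
    return history


saver = ToySaver()
invoke_with_memory(saver, "1", "Hi, my name is Julie.")
thread_one = invoke_with_memory(saver, "1", "What's my name?")
thread_two = invoke_with_memory(saver, "2", "What's my name?")

print(thread_one)  # ['Hi, my name is Julie.', "What's my name?"]
print(thread_two)  # ["What's my name?"]
```

Thread "1" accumulates both messages, while thread "2" starts empty even though it uses the same saver.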
Adding Memory with a Checkpointer
LangGraph ships with InMemorySaver — a checkpointer that stores conversation state in your program's memory while it's running. It's the simplest checkpointer to start with, and exactly what we need for learning (we'll note its limitation shortly).
# Purpose: Give the agent a checkpointer so it can remember across invoke() calls
# Context: Same agent setup as before, with one addition
# Input: None yet — we'll send messages next
# Output: An agent configured to persist conversation state
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent
agent = create_agent(
    model="gpt-5-nano",
    checkpointer=InMemorySaver(),
)
A checkpointer alone isn't quite enough. We also need to tell the agent which conversation we're continuing each time we call it. That's the thread_id, which goes inside a config dictionary passed as the second argument to invoke():
# Purpose: Use a thread_id to identify which conversation to save/continue
# Context: The config object tells the checkpointer which "save file" to use
# Input: A thread_id string — any unique identifier you choose
# Output: A config dictionary to pass alongside your messages
config = {"configurable": {"thread_id": "1"}}
Watching Aria Remember
Let's repeat the exact same two-message test from earlier, but this time passing the same config (and therefore the same thread_id) to both calls.
# Purpose: Confirm memory now works across separate invoke() calls
# Context: Same two messages as the "no memory" example, now with a checkpointer
# Input: Two separate messages, both using thread_id "1"
# Output: The second response correctly recalls what was said in the first
from langchain.messages import HumanMessage
config = {"configurable": {"thread_id": "1"}}
first_message = HumanMessage(content="Hi, my name is Julie and I prefer short, direct email replies.")
response_one = agent.invoke({"messages": [first_message]}, config)
print(response_one["messages"][-1].content)
second_message = HumanMessage(content="What's my name, and how do I like my replies?")
response_two = agent.invoke({"messages": [second_message]}, config)
print(response_two["messages"][-1].content)
This time, Aria answers correctly — Julie, and short, direct replies. Nothing about your message structure changed; the only difference is the checkpointer saved the state after the first call, and reloaded it (then added the new message) on the second call, because both calls used thread_id: "1".
Now try using a different thread_id for the second call — "2" instead of "1" — and ask the same follow-up question. Aria won't know the answer, because as far as the checkpointer is concerned, thread "2" is a brand new, empty conversation that happens to share the same underlying agent.
Common Misconceptions
❌ Misconception: Memory means the agent has "learned" something permanently
Reality: A checkpointer saves the conversation history for a specific thread — it doesn't change the underlying model at all. The model is exactly the same before and after; what's different is that the full conversation, including earlier turns, gets sent along with each new message.
Why this matters: If you expect Aria to remember Julie's name in a different thread, or in a different program run entirely (with InMemorySaver), she won't — because nothing about the model itself changed. Only that one specific thread's saved history exists.
Example:
# ❌ Wrong assumption: "Aria learned Julie's name permanently"
# ✅ Correct: thread_id "1" has a saved conversation that includes Julie's name;
# any other thread_id starts with no knowledge of it
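A small plain-Python sketch of this point, under stated assumptions: toy_model is a hypothetical stateless function standing in for the model, and saved_history stands in for what a checkpointer would keep for one thread. Neither is a LangChain API. The function never changes, while the number of messages it receives grows with each turn.

```python
def toy_model(messages: list[dict]) -> dict:
    """Hypothetical stateless model: its reply depends only on its input."""
    return {"role": "ai", "content": f"I received {len(messages)} message(s)."}


saved_history: list[dict] = []  # what a checkpointer would keep for one thread
payload_sizes: list[int] = []   # how many messages the model saw on each call

for text in ["My name is Julie.", "What's my name?", "Thanks!"]:
    saved_history.append({"role": "human", "content": text})
    payload_sizes.append(len(saved_history))
    saved_history.append(toy_model(saved_history))

print(payload_sizes)  # [1, 3, 5]
```

Each call sees every earlier human and AI message plus the new one, which is why the counts go 1, 3, 5 rather than staying at 1.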
❌ Misconception: InMemorySaver keeps memory permanently
Reality: InMemorySaver stores state only in your running program's memory. The moment your script ends or restarts, that memory is gone completely.
Why this matters: This is fine for learning and local development, but it's not what you'd use for a real assistant Julie expects to remember things tomorrow. (We're flagging this now; a persistent checkpointer is outside the scope of this introductory series, but it's worth knowing this distinction exists before you build something real.)
Example:
# ❌ Wrong assumption: "InMemorySaver will remember Julie's name forever"
# ✅ Correct: InMemorySaver remembers only while this specific program is running
Troubleshooting Common Issues
Problem: Aria still doesn't remember anything, even with a checkpointer
Symptoms: You've added checkpointer=InMemorySaver(), but follow-up questions still fail like there's no memory at all.
Common Causes:
- The config (with thread_id) wasn't passed to one of the invoke() calls (most common)
- Two different thread_id values were used by accident across the calls that were supposed to be connected
- The agent was recreated (a new create_agent(...) call) between the two messages, creating a brand new InMemorySaver with nothing stored in it
Diagnostic Steps:
# Step 1: Confirm config is passed to BOTH invoke() calls
response = agent.invoke({"messages": [message]}, config) # ✅ config included
# Step 2: Print the thread_id you're actually using each time
print(config["configurable"]["thread_id"])
# Step 3: Confirm you're reusing the same agent instance, not recreating it
# (each new InMemorySaver() starts empty, so a recreated agent has no saved history)
Solution: Make sure the exact same config dictionary (or at least the same thread_id value) is passed to every invoke() call that should share history, and that you're calling invoke() on the same agent instance each time.
Prevention: Store your thread_id in a variable once, and reuse that variable everywhere, rather than retyping the string each time — a typo like "1" vs "l" will silently create two separate conversations.
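One way to follow this advice is sketched below. make_config is a hypothetical helper written for this article, not part of LangChain or LangGraph, and it assumes string thread IDs as used throughout this article. It builds the same config dictionary shown earlier from a single named constant.

```python
def make_config(thread_id: str) -> dict:
    """Build the config dict for one thread (hypothetical helper)."""
    if not isinstance(thread_id, str) or not thread_id.strip():
        raise ValueError("thread_id must be a non-empty string")
    return {"configurable": {"thread_id": thread_id}}


JULIE_THREAD_ID = "julie-1"  # defined once, reused everywhere
config = make_config(JULIE_THREAD_ID)

print(config)  # {'configurable': {'thread_id': 'julie-1'}}
```

Passing this one config object to every invoke() call that should share history removes the chance of retyping the ID differently.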
Problem: Memory disappears after restarting the script
Symptoms: Conversation history that worked fine while the script was running is gone the next time you run it.
Common Causes:
- This is expected behavior with InMemorySaver: it was never designed to persist between separate program runs
Solution: This isn't a bug to fix within this article's scope — InMemorySaver is specifically for in-process memory. Persistent, cross-restart memory requires a different kind of checkpointer, which is outside what we're covering here.
Prevention: Treat InMemorySaver as appropriate for learning and short-lived sessions, not as the final answer for a production assistant.
Check Your Understanding
Quick Quiz
- What's the difference between memory within a single invoke() call and memory across separate calls?
  Answer: Within a single call, the agent sees everything in the messages list you passed in for that call, with no checkpointer needed. Memory across separate calls requires a checkpointer plus a shared thread_id, so the conversation state is saved after one call and reloaded at the start of the next.
- If you call agent.invoke(...) twice with two completely different thread_id values, will the second call know anything from the first?
  Answer: No. Different thread_id values represent entirely separate conversations, even on the same agent. Each one starts with no saved history of its own.
- Why is InMemorySaver not suitable for a real production assistant Julie uses every day?
  Answer: Because it only stores conversation state in the running program's memory. As soon as the program stops or restarts, everything saved is lost. A real, persistent assistant needs a checkpointer that survives restarts, which is a separate topic from what's covered in this introductory article.
Hands-On Exercise
Challenge: Create two separate conversations using two different thread_id values, both with the same agent, and confirm they don't share any memory with each other.
Solution:
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent
from langchain.messages import HumanMessage
agent = create_agent(model="gpt-5-nano", checkpointer=InMemorySaver())
# Thread 1: Julie's conversation
config_julie = {"configurable": {"thread_id": "julie-1"}}
agent.invoke(
    {"messages": [HumanMessage(content="My name is Julie.")]},
    config_julie,
)
# Thread 2: a completely separate conversation, same agent
config_other = {"configurable": {"thread_id": "other-1"}}
response = agent.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config_other,
)
print(response["messages"][-1].content) # Aria won't know — this is a fresh thread
Explanation: Even though both calls use the exact same agent object, thread_id: "other-1" has no saved history at all, since it's never been used before — proving threads are fully isolated from each other.
Summary: Key Takeaways
- By default, every agent.invoke(...) call is completely isolated: nothing carries over automatically
- A checkpointer saves and reloads conversation state, the same way a save system works in a video game
- A thread_id identifies which saved conversation to continue; different IDs mean completely separate conversations
- InMemorySaver is the simplest checkpointer, but it only persists while your program is actively running
- Memory doesn't change the underlying model; it just means more conversation history gets included in each new request
- Aria can now hold a real, ongoing conversation across multiple messages
Version Information
Tested with:
- Python: >=3.10, <4.0
- langchain: >=1.1.3 (latest stable as of writing: 1.3.4)
- langgraph: >=1.0.3. InMemorySaver comes from langgraph.checkpoint.memory, an import that is new in this article
Known issues:
- ⚠️ InMemorySaver does not share state across processes: each process holds its own separate copy in memory. It's meant for single-process learning and development scenarios, not multi-server production deployments.
What's Next?
You now understand how agents remember conversations across multiple messages, and the distinction between in-call and cross-call memory.
The natural next step is Multimodal Messages: Text, Image, and Audio Input — so far, Aria has only ever seen plain text. That article covers what happens when Julie wants to send her something else entirely, like a screenshot of an email or a voice note.
References
- LangChain Academy: Introduction to LangChain (Python) (this section is inspired by and adapted from this course)
- LangGraph Docs: Persistence (official guide to checkpointers and threads)
- LangChain API Reference: create_agent (full parameter reference, including checkpointer)
- langgraph on PyPI (latest version and release history)