Memory and Threads: Agents That Remember

Try this with Aria as she stands right now: tell her your name, then in a separate message, ask her what your name is. She won't know. Not because she's being difficult — because as far as her code is concerned, that second message is the first thing she's ever heard from you.

This article fixes that. By the end, Aria will remember an entire ongoing conversation — and you'll understand exactly what "memory" means in this context, because it's not what most beginners assume.

🟢🟡 Skill level: Beginner-to-Intermediate.

Quick Reference

When to use this: Any time a conversation needs to span more than one message — follow-up questions, ongoing context, multi-turn conversations.

Basic syntax:

from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver

agent = create_agent(model="gpt-5-nano", checkpointer=InMemorySaver())

config = {"configurable": {"thread_id": "1"}}
response = agent.invoke({"messages": [...]}, config)

Common patterns:

No checkpointer → every invoke() call is a completely fresh, isolated conversation
Checkpointer + thread_id → conversation history persists across multiple invoke() calls that share the same thread_id
Different thread_id values → completely separate, unrelated conversations, even using the same agent

Gotchas:

⚠️ InMemorySaver keeps history only in your program's memory — it's gone the moment your script ends. It's for learning and development, not production persistence.
⚠️ Forgetting to reuse the same thread_id is the most common reason memory "doesn't work."

What You Need to Know First

Everything from Articles 1–3 — create_agent, tools, and how an agent processes a single message
No new API keys are needed for this article

What We'll Cover in This Article

Why agents forget everything by default
What a checkpointer actually does
What a thread_id is and why it matters
How to give Aria real memory across multiple messages

What We'll Explain Along the Way

What "state" means in this context
The difference between memory within a single response and memory across separate calls

Seeing the Problem: No Memory by Default

Let's prove the problem before fixing it. We'll send Aria two completely separate messages and see what happens.

# Purpose: Demonstrate that an agent has no memory between invoke() calls by default
# Context: Establishes the problem before introducing checkpointers
# Input: Two separate messages, sent as two separate invoke() calls
# Output: The second response shows Aria has no idea what was said in the first

from dotenv import load_dotenv
load_dotenv()

from langchain.agents import create_agent
from langchain.messages import HumanMessage

agent = create_agent(model="gpt-5-nano")

# First message: tell Aria something
first_message = HumanMessage(content="Hi, my name is Julie and I prefer short, direct email replies.")
response_one = agent.invoke({"messages": [first_message]})
print(response_one["messages"][-1].content)

# Second message: ask about what we just said, in a completely separate call
second_message = HumanMessage(content="What's my name, and how do I like my replies?")
response_two = agent.invoke({"messages": [second_message]})
print(response_two["messages"][-1].content)

The second response won't know Julie's name or her preference — Aria will likely say she doesn't have that information, or ask you to clarify. This isn't a bug. Each call to agent.invoke(...) is entirely self-contained: whatever messages you pass in are the only thing the agent knows about for that call. Nothing carries over automatically.
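The same behavior can be mimicked without any model at all. The sketch below is a toy in plain Python, not LangChain code: toy_agent is a hypothetical stand-in for a single stateless call, and its name-matching rule is an assumption made only for illustration. It answers from whatever list it receives and nothing else.

```python
import re


def toy_agent(messages):
    """Hypothetical stand-in for one stateless agent call.

    It can only 'know' what appears in the messages list it is given.
    """
    for text in messages:
        match = re.search(r"my name is (\w+)", text, re.IGNORECASE)
        if match:
            return f"Your name is {match.group(1)}."
    return "I don't know your name."


# One call, both messages in the same list: the name is visible.
same_call = toy_agent(["Hi, my name is Julie.", "What's my name?"])

# Two separate calls: the second call never sees the first message.
toy_agent(["Hi, my name is Julie."])
separate_call = toy_agent(["What's my name?"])

print(same_call)      # Your name is Julie.
print(separate_call)  # I don't know your name.
```

Putting both messages in one list works because the information is part of the input. Splitting them across two calls fails because nothing is stored in between.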

What's Actually Being Remembered?

Here's a distinction worth being precise about, because it trips people up: within a single invoke() call, the agent already has access to everything in the messages list you passed it — that's how it could see both a question and a follow-up if you included them in the same list. What's missing is memory between separate calls — separate times your code calls agent.invoke(...).

That's what we're fixing. We need somewhere to store the growing conversation history, and a way to tell the agent which stored conversation to continue. In LangGraph (which create_agent is built on), the stored conversation data is called state, the mechanism that saves and reloads it is called a checkpointer, and the identifier that picks which saved conversation to continue is the thread ID.

Think of a checkpointer like a video game's save system. Without saving, every time you start the game, you're back at the very beginning — no progress, no inventory, no history. A checkpointer is what lets you save your progress and pick up exactly where you left off. The thread ID is which save file you're loading — different thread IDs are completely separate save files, even on the same game (the same agent).

Diagram: Calls sharing the same thread ID build on a continuously saved conversation. A different thread ID starts a completely separate, unrelated conversation, even using the exact same agent.
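The save-file analogy can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not how LangGraph implements checkpointers: ToySaver and toy_invoke are hypothetical names, and the "reply" is a placeholder string rather than a real model call.

```python
class ToySaver:
    """Hypothetical save system: one message list per thread_id."""

    def __init__(self):
        self._saves = {}

    def load(self, thread_id):
        # Return a copy so callers can't mutate the saved history by accident.
        return list(self._saves.get(thread_id, []))

    def save(self, thread_id, messages):
        self._saves[thread_id] = list(messages)


def toy_invoke(saver, thread_id, new_message):
    """Load the save file, add the new turn, save again, return the history."""
    history = saver.load(thread_id)
    history.append(("human", new_message))
    history.append(("ai", f"(reply based on {len(history)} messages)"))
    saver.save(thread_id, history)
    return history


saver = ToySaver()
toy_invoke(saver, "1", "Hi, my name is Julie.")
thread_one = toy_invoke(saver, "1", "What's my name?")
thread_two = toy_invoke(saver, "2", "What's my name?")

print(len(thread_one))  # 4: two human turns plus two replies
print(len(thread_two))  # 2: thread "2" started empty
```

Thread "1" accumulates both turns because each call loads the previous save before adding to it. Thread "2" begins from an empty save, even though it shares the same saver.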

Adding Memory with a Checkpointer

LangGraph ships with InMemorySaver — a checkpointer that stores conversation state in your program's memory while it's running. It's the simplest checkpointer to start with, and exactly what we need for learning (we'll note its limitation shortly).

# Purpose: Give the agent a checkpointer so it can remember across invoke() calls
# Context: Same agent setup as before, with one addition
# Input: None yet — we'll send messages next
# Output: An agent configured to persist conversation state

from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent

agent = create_agent(
    model="gpt-5-nano",
    checkpointer=InMemorySaver(),
)

A checkpointer alone isn't quite enough: we also need to tell the agent which conversation we're continuing each time we call it. That's the job of the thread_id, which goes inside a config dictionary passed as the second argument to invoke():

# Purpose: Use a thread_id to identify which conversation to save/continue
# Context: The config object tells the checkpointer which "save file" to use
# Input: A thread_id string — any unique identifier you choose
# Output: A config dictionary to pass alongside your messages

config = {"configurable": {"thread_id": "1"}}

Watching Aria Remember

Let's repeat the exact same two-message test from earlier, but this time passing the same config (and therefore the same thread_id) to both calls.

# Purpose: Confirm memory now works across separate invoke() calls
# Context: Same two messages as the "no memory" example, now with a checkpointer
# Input: Two separate messages, both using thread_id "1"
# Output: The second response correctly recalls what was said in the first

from langchain.messages import HumanMessage

config = {"configurable": {"thread_id": "1"}}

first_message = HumanMessage(content="Hi, my name is Julie and I prefer short, direct email replies.")
response_one = agent.invoke({"messages": [first_message]}, config)
print(response_one["messages"][-1].content)

second_message = HumanMessage(content="What's my name, and how do I like my replies?")
response_two = agent.invoke({"messages": [second_message]}, config)
print(response_two["messages"][-1].content)

This time, Aria answers correctly — Julie, and short, direct replies. Nothing about your message structure changed; the only difference is the checkpointer saved the state after the first call, and reloaded it (then added the new message) on the second call, because both calls used thread_id: "1".

Now try using a different thread_id for the second call — "2" instead of "1" — and ask the same follow-up question. Aria won't know the answer, because as far as the checkpointer is concerned, thread "2" is a brand new, empty conversation that happens to share the same underlying agent.

Common Misconceptions

❌ Misconception: Memory means the agent has "learned" something permanently

Reality: A checkpointer saves the conversation history for a specific thread — it doesn't change the underlying model at all. The model is exactly the same before and after; what's different is that the full conversation, including earlier turns, gets sent along with each new message.

Why this matters: If you expect Aria to remember Julie's name in a different thread, or in a different program run entirely (with InMemorySaver), she won't — because nothing about the model itself changed. Only that one specific thread's saved history exists.

Example:

# ❌ Wrong assumption: "Aria learned Julie's name permanently"
# ✅ Correct: thread_id "1" has a saved conversation that includes Julie's name;
#    any other thread_id starts with no knowledge of it

❌ Misconception: `InMemorySaver` keeps memory permanently

Reality: InMemorySaver stores state only in your running program's memory. The moment your script ends or restarts, that memory is gone completely.

Why this matters: This is fine for learning and local development, but it's not what you'd use for a real assistant Julie expects to remember things tomorrow. (We're flagging this now; a persistent checkpointer is outside the scope of this introductory series, but it's worth knowing this distinction exists before you build something real.)

Example:

# ❌ Wrong assumption: "InMemorySaver will remember Julie's name forever"
# ✅ Correct: InMemorySaver remembers only while this specific program is running

Troubleshooting Common Issues

Problem: Aria still doesn't remember anything, even with a checkpointer

Symptoms: You've added checkpointer=InMemorySaver(), but follow-up questions still fail like there's no memory at all.

Common Causes:

The config (with thread_id) wasn't passed to one of the invoke() calls (most common)
Two different thread_id values were used by accident across the calls that were supposed to be connected
The agent was recreated (a new create_agent(...) call) between the two messages, creating a brand new InMemorySaver with nothing stored in it

Diagnostic Steps:

# Step 1: Confirm config is passed to BOTH invoke() calls
response = agent.invoke({"messages": [message]}, config)  # ✅ config included

# Step 2: Print the thread_id you're actually using each time
print(config["configurable"]["thread_id"])

# Step 3: Confirm you're reusing the same agent instance, not recreating it
# (the checkpointer lives on the agent object itself)

Solution: Make sure the exact same config dictionary (or at least the same thread_id value) is passed to every invoke() call that should share history, and that you're calling invoke() on the same agent instance each time.
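The recreated-agent cause is easy to miss, so here is a toy illustration in plain Python. ToyAgent is a hypothetical class, not part of LangChain. It assumes, like the inline checkpointer=InMemorySaver() pattern used in this article, that building a new agent also builds a new, empty store.

```python
class ToyAgent:
    """Hypothetical agent that owns an in-memory history store."""

    def __init__(self):
        # A fresh, empty store is created every time the agent is built.
        self._store = {}

    def invoke(self, message, thread_id):
        history = self._store.setdefault(thread_id, [])
        history.append(message)
        return list(history)


# Reusing one agent instance: the second call sees the first message.
agent = ToyAgent()
agent.invoke("My name is Julie.", "1")
reused = agent.invoke("What's my name?", "1")

# Rebuilding the agent between calls: same thread_id, but an empty store.
ToyAgent().invoke("My name is Julie.", "1")
rebuilt = ToyAgent().invoke("What's my name?", "1")

print(reused)   # ['My name is Julie.', "What's my name?"]
print(rebuilt)  # ["What's my name?"]
```

Both halves use the same thread_id. Only the half that reuses the agent instance keeps the earlier message.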

Prevention: Store your thread_id in a variable once, and reuse that variable everywhere, rather than retyping the string each time — a typo like "1" vs "l" will silently create two separate conversations.
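One way to follow that advice is a tiny helper that builds the config from a single named constant. The helper name thread_config and the constant JULIE_THREAD are assumptions for illustration, not part of LangChain. Only the dictionary shape comes from the examples in this article.

```python
def thread_config(thread_id):
    """Build the config dictionary for a given conversation thread."""
    return {"configurable": {"thread_id": str(thread_id)}}


# Define the thread ID once and reuse the name everywhere.
JULIE_THREAD = "julie-1"

config = thread_config(JULIE_THREAD)
print(config)  # {'configurable': {'thread_id': 'julie-1'}}
```

Every call that should share history can then pass thread_config(JULIE_THREAD), so a mistyped string can't quietly start a second conversation.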

Problem: Memory disappears after restarting the script

Symptoms: Conversation history that worked fine while the script was running is gone the next time you run it.

Common Causes:

This is expected behavior with InMemorySaver — it was never designed to persist between separate program runs

Solution: This isn't a bug to fix within this article's scope — InMemorySaver is specifically for in-process memory. Persistent, cross-restart memory requires a different kind of checkpointer, which is outside what we're covering here.

Prevention: Treat InMemorySaver as appropriate for learning and short-lived sessions, not as the final answer for a production assistant.

Check Your Understanding

Quick Quiz

What's the difference between memory within a single invoke() call and memory across separate calls?

Answer: Within a single call, the agent sees everything in the messages list you passed in for that call, with no checkpointer needed. Memory across separate calls requires a checkpointer plus a shared thread_id, so the conversation state is saved after one call and reloaded at the start of the next.

If you call agent.invoke(...) twice with two completely different thread_id values, will the second call know anything from the first?

Answer: No. Different thread_id values represent entirely separate conversations, even on the same agent. Each one starts with no saved history of its own.

Why is InMemorySaver not suitable for a real production assistant Julie uses every day?

Answer: Because it only stores conversation state in the running program's memory, so as soon as the program stops or restarts, everything saved is lost. A real, persistent assistant needs a checkpointer that survives restarts, which is a separate topic from what's covered in this introductory article.

Hands-On Exercise

Challenge: Create two separate conversations using two different thread_id values, both with the same agent, and confirm they don't share any memory with each other.

Solution:

from dotenv import load_dotenv
load_dotenv()

from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent
from langchain.messages import HumanMessage

agent = create_agent(model="gpt-5-nano", checkpointer=InMemorySaver())

# Thread 1: Julie's conversation
config_julie = {"configurable": {"thread_id": "julie-1"}}
agent.invoke(
    {"messages": [HumanMessage(content="My name is Julie.")]},
    config_julie,
)

# Thread 2: a completely separate conversation, same agent
config_other = {"configurable": {"thread_id": "other-1"}}
response = agent.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config_other,
)

print(response["messages"][-1].content)  # Aria won't know — this is a fresh thread

Explanation: Even though both calls use the exact same agent object, thread_id: "other-1" has no saved history at all, since it's never been used before — proving threads are fully isolated from each other.

Summary: Key Takeaways

By default, every agent.invoke(...) call is completely isolated — nothing carries over automatically
A checkpointer saves and reloads conversation state, the same way a save system works in a video game
A thread_id identifies which saved conversation to continue — different IDs mean completely separate conversations
InMemorySaver is the simplest checkpointer, but it only persists while your program is actively running
Memory doesn't change the underlying model — it just means more conversation history gets included in each new request
Aria can now hold a real, ongoing conversation across multiple messages

Version Information

Tested with:

Python: >=3.10, <4.0
langchain: >=1.1.3 (latest stable as of writing: 1.3.4)
langgraph: >=1.0.3 (InMemorySaver is imported from the langgraph.checkpoint.memory module, first used in this article)

Known issues:

⚠️ InMemorySaver does not share state across processes: each process keeps its own separate in-memory store. It's meant for single-process learning and development scenarios, not multi-server production deployments.

What's Next?

You now understand how agents remember conversations across multiple messages, and the distinction between in-call and cross-call memory.

The natural next step is Multimodal Messages: Text, Image, and Audio Input — so far, Aria has only ever seen plain text. That article covers what happens when Julie wants to send her something else entirely, like a screenshot of an email or a voice note.

References

LangChain Academy: Introduction to LangChain (Python) — this section is inspired by and adapted from this course
LangGraph Docs: Persistence — official guide to checkpointers and threads
LangChain API Reference: create_agent — full parameter reference, including checkpointer
langgraph on PyPI — latest version and release history
