Documentation

Get started with Aura

Aura gives your AI agent persistent memory — no embeddings, no vector database, no LLM calls for memory operations. Pure Rust, sub-millisecond recall.

Quickstart

Install

terminal
pip install aura-memory

Requires Python 3.9+. Works on Windows, macOS, Linux. No system dependencies.

Your first memory in 30 seconds

hello_memory.py
from aura import Aura

brain = Aura("./my_brain")

# Store knowledge
brain.store("Deploy to staging first. Never push straight to prod.")
brain.store("Our database is PostgreSQL. Auth uses JWT, 15-minute tokens.")
brain.store("User prefers concise answers without bullet points.")

# Recall relevant context — <1ms, no API call
results = brain.recall("deployment rules", limit=3)
for r in results:
    print(r.content)

# Output:
# Deploy to staging first. Never push straight to prod.

The brain persists to disk. The next time you run the script, it remembers everything.

Add memory to any LLM

with_claude.py
from aura import Aura
import anthropic

brain = Aura("./brain")
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def chat(message: str) -> str:
    # Pull relevant memories before each message
    memories = brain.recall(message, limit=5)
    context = "\n".join(f"- {m.content}" for m in memories)

    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        system=f"You are a helpful assistant.\n\nWhat you remember:\n{context}",
        messages=[{"role": "user", "content": message}]
    )

    brain.store(f"User asked: {message}")
    return response.content[0].text

# Teach the agent
brain.store("The user is building a FastAPI service with PostgreSQL.")
brain.store("The user prefers async endpoints.")

print(chat("What database should I use?"))
# → Answers with PostgreSQL from memory

Core Concepts

What is Aura?

Aura is a cognitive memory layer that runs alongside your AI agent. It stores what the agent learns, organizes it automatically, and surfaces the most relevant context when the agent needs to answer a question.

Unlike a vector database, Aura doesn't just store and retrieve. It forms beliefs from accumulated evidence, discovers causal patterns over time, and generates advisory hints — all without any LLM calls.

The brain

Everything starts with Aura("./path"). This creates or opens a brain at the given directory. The brain stores everything to disk — it persists between script runs, server restarts, and deployments.

from aura import Aura

brain = Aura("./my_agent_brain")
# Creates: my_agent_brain/brain.cog, beliefs.cog, etc.

Store and Recall

Two operations you'll use constantly:

# Store — write something to memory
brain.store("content", tags=["optional", "tags"])

# Recall — find relevant memories for a query
results = brain.recall("your query", limit=5)
for r in results:
    print(r.content)      # the stored text
    print(r.confidence)   # how reliable is this memory
    print(r.tags)         # associated tags

How memory becomes knowledge

Aura automatically organizes stored records into higher layers during maintenance cycles:

Records          Raw stored facts — what you wrote with store()
Beliefs          Groups of related records weighted by evidence and confidence
Concepts         Abstractions discovered across stable beliefs
Causal Patterns  Cause-effect relationships found in the memory graph
Policy Hints     Advisory guidance derived from patterns — "prefer staging deploys"

Run brain.run_maintenance() to trigger a cycle. In production, use the background daemon.
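If you're curious what the background daemon amounts to, here is a minimal sketch of a periodic maintenance loop using only the standard library. The wiring is hypothetical: `run_maintenance` stands in for `brain.run_maintenance`, and the interval is an arbitrary choice (a stub callable is used below so the snippet runs on its own).

```python
import threading
import time

def start_maintenance_daemon(run_maintenance, interval_s=300.0):
    """Call run_maintenance() every interval_s seconds on a daemon thread."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop.set() is called
        while not stop.wait(interval_s):
            run_maintenance()

    threading.Thread(target=loop, daemon=True).start()
    return stop  # call stop.set() to shut the loop down

# Stub standing in for brain.run_maintenance:
cycles = []
stop = start_maintenance_daemon(lambda: cycles.append(time.time()), interval_s=0.02)
time.sleep(0.25)
stop.set()
print(len(cycles) >= 2)  # → True
```

In real use you would pass `brain.run_maintenance` directly and keep the returned event around to stop the loop on shutdown.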

Integrations

Aura works with any LLM framework. The pattern is always the same: recall before the LLM call, store after.
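That recall-before/store-after loop can be factored into one helper that works with any provider. This is a sketch, not part of Aura's API: `brain_recall`, `brain_store`, and `llm_call` are whatever callables your setup provides, and stubs stand in for them below so the snippet is self-contained.

```python
def chat_with_memory(brain_recall, brain_store, llm_call, message, limit=5):
    """Generic memory loop: recall relevant context, call the LLM, store the turn."""
    memories = brain_recall(message, limit)
    context = "\n".join(f"- {m}" for m in memories)
    reply = llm_call(f"What you remember:\n{context}\n\nUser: {message}")
    brain_store(f"User asked: {message}")
    return reply

# Stubs standing in for brain.recall / brain.store / the provider call:
notes = ["We use PostgreSQL, not MySQL"]
stored = []
reply = chat_with_memory(
    lambda query, n: notes[:n],               # recall
    stored.append,                            # store
    lambda prompt: f"echo: {prompt.splitlines()[-1]}",  # fake LLM
    "What database should I use?",
)
print(reply)
# → echo: User: What database should I use?
```

With a real brain you would pass `brain.recall` and `brain.store` (adapting for the `Record.content` field) and your provider's completion call.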

Claude (Anthropic)

claude_agent.py
from aura import Aura
import anthropic

brain = Aura("./brain")
client = anthropic.Anthropic()

memories = brain.recall(user_message, limit=5)
context = "\n".join(f"- {m.content}" for m in memories)

response = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    system=f"Assistant with memory:\n{context}",
    messages=[{"role": "user", "content": user_message}]
)

Gemini (Google)

gemini_agent.py
from aura import Aura
import google.generativeai as genai

brain = Aura("./brain")
genai.configure(api_key="YOUR_KEY")
model = genai.GenerativeModel("gemini-2.5-flash-lite")

memories = brain.recall(user_message, limit=5)
context = "\n".join(f"- {m.content}" for m in memories)
prompt = f"Memory:\n{context}\n\nUser: {user_message}"

response = model.generate_content(prompt)
brain.store(f"User asked: {user_message}")

Ollama (local, no API key)

ollama_agent.py
from aura import Aura
import requests

brain = Aura("./brain")

memories = brain.recall(user_message, limit=5)
context = "\n".join(f"- {m.content}" for m in memories)

r = requests.post("http://localhost:11434/api/generate", json={
    "model": "gemma3n:e4b",
    "prompt": f"Memory:\n{context}\n\nUser: {user_message}",
    "stream": False
})
print(r.json()["response"])  # non-streaming responses carry the text here
brain.store(f"User asked: {user_message}")

CrewAI

crewai_agent.py
from aura import Aura
from crewai.tools import tool

brain = Aura("./brain")

@tool("remember")
def remember(content: str) -> str:
    """Store important information in long-term memory."""
    brain.store(content)
    return "Stored."

@tool("recall")
def recall_memory(query: str) -> str:
    """Search long-term memory for relevant information."""
    results = brain.recall(query, limit=5)
    return "\n".join(r.content for r in results) or "Nothing found."

Memory Layers

When you store something, you can specify how long it should persist. Aura uses four levels — memories decay naturally and are promoted based on how often they're accessed.

Level            Lifespan  Use for
Level.Working    Hours     Current session context, temporary notes
Level.Decisions  Days      Decisions made, tasks in progress
Level.Domain     Weeks     Project knowledge, team preferences
Level.Identity   Months+   Core user traits, permanent rules

from aura import Aura, Level

brain = Aura("./brain")

# Persists for months — core identity
brain.store("User is a senior backend engineer", level=Level.Identity)

# Persists for weeks — project knowledge
brain.store("We use PostgreSQL, not MySQL", level=Level.Domain)

# Persists for days — active work
brain.store("Working on auth module this week", level=Level.Decisions)

# Default — current session
brain.store("User asked about deployment just now")

API Reference

Core methods

brain.store(content, level?, tags?, namespace?)

Store a memory. Returns the created Record.

brain.recall(query, limit?, tags?, namespace?)

Recall relevant memories. Returns list of Records sorted by relevance.

brain.run_maintenance()

Run a maintenance cycle. Forms beliefs, concepts, causal patterns, policy hints from stored records.

brain.get_surfaced_policy_hints(limit?)

Get advisory hints derived from accumulated memory patterns.

brain.get_metacognitive_context()

Query the brain's own knowledge state — confidence, freshness, conflicts.

brain.recall_with_epistemic_context(query, limit?)

Recall records plus their epistemic state in one call.

Record fields

record = brain.recall("something")[0]

record.id           # unique record ID
record.content      # stored text
record.level        # memory level (Working/Decisions/Domain/Identity)
record.tags         # list of tags
record.confidence   # reliability score 0.0–1.0
record.namespace    # isolation namespace
record.created_at   # unix timestamp

MCP server

Aura includes a native MCP server — works with Claude Desktop, Cursor, VS Code, and any MCP client.

# Start MCP server
python -m aura.mcp --brain ./my_brain

# Or via CLI
aura-mcp --brain ./my_brain --port 8765
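To wire the server into an MCP client such as Claude Desktop, an entry along these lines in the client's MCP configuration (e.g. `claude_desktop_config.json`) should work — the `"aura"` key is an arbitrary name, and the brain path is whatever you pass on the command line:

```json
{
  "mcpServers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura.mcp", "--brain", "./my_brain"]
    }
  }
}
```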