Get started with Aura
Aura gives your AI agent persistent memory — no embeddings, no vector database, no LLM calls for memory operations. Pure Rust, sub-millisecond recall.
Quickstart
Install
pip install aura-memory

Requires Python 3.9+. Works on Windows, macOS, Linux. No system dependencies.
Your first memory in 30 seconds
from aura import Aura
brain = Aura("./my_brain")
# Store knowledge
brain.store("Deploy to staging first. Never push straight to prod.")
brain.store("Our database is PostgreSQL. Auth uses JWT, 15-minute tokens.")
brain.store("User prefers concise answers without bullet points.")
# Recall relevant context — <1ms, no API call
results = brain.recall("deployment rules", limit=3)
for r in results:
    print(r.content)
# Output:
# Deploy to staging first. Never push straight to prod.

The brain persists to disk. Next time you run the script, it remembers everything.
Add memory to any LLM
from aura import Aura
import anthropic
brain = Aura("./brain")
client = anthropic.Anthropic() # ANTHROPIC_API_KEY
def chat(message: str) -> str:
    # Pull relevant memories before each message
    memories = brain.recall(message, limit=5)
    context = "\n".join(f"- {m.content}" for m in memories)
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        system=f"You are a helpful assistant.\n\nWhat you remember:\n{context}",
        messages=[{"role": "user", "content": message}]
    )
    brain.store(f"User asked: {message}")
    return response.content[0].text
# Teach the agent
brain.store("The user is building a FastAPI service with PostgreSQL.")
brain.store("The user prefers async endpoints.")
print(chat("What database should I use?"))
# → Answers with PostgreSQL from memory

Core Concepts
What is Aura?
Aura is a cognitive memory layer that runs alongside your AI agent. It stores what the agent learns, organizes it automatically, and surfaces the most relevant context when the agent needs to answer a question.
Unlike a vector database, Aura doesn't just store and retrieve. It forms beliefs from accumulated evidence, discovers causal patterns over time, and generates advisory hints — all without any LLM calls.
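The store-then-distill flow can be sketched as one small helper. It assumes only the store, run_maintenance, and get_surfaced_policy_hints methods listed in the API reference below; everything else here is illustrative:

```python
def distill_hints(brain, observations, limit=3):
    """Store raw observations, run one maintenance cycle,
    then return the advisory hints the brain surfaced.

    Works with any object exposing Aura-style store(),
    run_maintenance(), and get_surfaced_policy_hints() methods.
    """
    for obs in observations:
        brain.store(obs)          # raw evidence accumulates as records
    brain.run_maintenance()       # fold records into beliefs and hints
    return brain.get_surfaced_policy_hints(limit=limit)
```

No LLM is involved at any step; maintenance is a local computation over stored records.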
The brain
Everything starts with Aura("./path"). This creates or opens a brain at the given directory. The brain stores everything to disk — it persists between script runs, server restarts, and deployments.
from aura import Aura
brain = Aura("./my_agent_brain")
# Creates: my_agent_brain/brain.cog, beliefs.cog, etc.

Store and Recall
Two operations you'll use constantly:
# Store — write something to memory
brain.store("content", tags=["optional", "tags"])
# Recall — find relevant memories for a query
results = brain.recall("your query", limit=5)
for r in results:
    print(r.content)     # the stored text
    print(r.confidence)  # how reliable this memory is
    print(r.tags)        # associated tags

How memory becomes knowledge
Aura automatically organizes stored records into higher layers during maintenance cycles. Run brain.run_maintenance() to trigger a cycle manually; in production, use the background daemon.
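For a long-running process, maintenance can be scheduled on a timer. This is a minimal sketch, not Aura's built-in daemon; the interval and the threading approach are assumptions:

```python
import threading

def start_maintenance_loop(brain, interval_s=300.0):
    """Call brain.run_maintenance() every interval_s seconds
    in a daemon thread. Returns a function that stops the loop."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop is set
        while not stop.wait(interval_s):
            brain.run_maintenance()

    threading.Thread(target=loop, daemon=True).start()
    return stop.set  # call this to stop the loop
```

The daemon thread dies with the process, so a crashed server never leaves a stray maintenance loop behind.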
Integrations
Aura works with any LLM framework. The pattern is always the same: recall before the LLM call, store after.
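That pattern can be factored into a single provider-agnostic helper. A sketch: `llm` here is any callable taking a prompt string and returning text (a wrapper you write around your SDK, not an Aura API), and the prompt layout is an assumption:

```python
def chat_with_memory(brain, llm, message, limit=5):
    """Recall relevant context, call the model, then record the exchange.

    brain: any object with Aura-style recall() and store() methods
    llm:   a callable (prompt: str) -> str wrapping your provider's SDK
    """
    # Recall before the LLM call
    memories = brain.recall(message, limit=limit)
    context = "\n".join(f"- {m.content}" for m in memories)
    reply = llm(f"What you remember:\n{context}\n\nUser: {message}")
    # Store after
    brain.store(f"User asked: {message}")
    return reply
```

The provider-specific examples below are this same shape with the `llm` call inlined.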
Claude (Anthropic)
from aura import Aura
import anthropic
brain = Aura("./brain")
client = anthropic.Anthropic()
memories = brain.recall(user_message, limit=5)
context = "\n".join(f"- {m.content}" for m in memories)
response = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=f"Assistant with memory:\n{context}",
messages=[{"role": "user", "content": user_message}]
)

Gemini (Google)
from aura import Aura
import google.generativeai as genai
brain = Aura("./brain")
genai.configure(api_key="YOUR_KEY")
model = genai.GenerativeModel("gemini-2.5-flash-lite")
memories = brain.recall(user_message, limit=5)
context = "\n".join(f"- {m.content}" for m in memories)
prompt = f"Memory:\n{context}\n\nUser: {user_message}"
response = model.generate_content(prompt)
brain.store(f"User asked: {user_message}")

Ollama (local, no API key)
from aura import Aura
import requests
brain = Aura("./brain")
memories = brain.recall(user_message, limit=5)
context = "\n".join(f"- {m.content}" for m in memories)
r = requests.post("http://localhost:11434/api/generate", json={
    "model": "gemma3n:e4b",
    "prompt": f"Memory:\n{context}\n\nUser: {user_message}",
    "stream": False
})
print(r.json()["response"])  # Ollama returns the completion in "response"
brain.store(f"User asked: {user_message}")

CrewAI
from aura import Aura
from crewai.tools import tool
brain = Aura("./brain")
@tool("remember")
def remember(content: str) -> str:
    """Store important information in long-term memory."""
    brain.store(content)
    return "Stored."

@tool("recall")
def recall_memory(query: str) -> str:
    """Search long-term memory for relevant information."""
    results = brain.recall(query, limit=5)
    return "\n".join(r.content for r in results) or "Nothing found."

Memory Layers
When you store something, you can specify how long it should persist. Aura uses four levels — memories decay naturally and promote based on how often they're accessed.
| Level | Lifespan | Use for |
|---|---|---|
| Level.Working | Hours | Current session context, temporary notes |
| Level.Decisions | Days | Decisions made, tasks in progress |
| Level.Domain | Weeks | Project knowledge, team preferences |
| Level.Identity | Months+ | Core user traits, permanent rules |
from aura import Aura, Level
brain = Aura("./brain")
# Persists for months — core identity
brain.store("User is a senior backend engineer", level=Level.Identity)
# Persists for weeks — project knowledge
brain.store("We use PostgreSQL, not MySQL", level=Level.Domain)
# Persists for days — active work
brain.store("Working on auth module this week", level=Level.Decisions)
# Default — current session
brain.store("User asked about deployment just now")

API Reference
Core methods
brain.store(content, level?, tags?, namespace?)
Store a memory. Returns the created Record.

brain.recall(query, limit?, tags?, namespace?)
Recall relevant memories. Returns a list of Records sorted by relevance.

brain.run_maintenance()
Run a maintenance cycle. Forms beliefs, concepts, causal patterns, and policy hints from stored records.

brain.get_surfaced_policy_hints(limit?)
Get advisory hints derived from accumulated memory patterns.

brain.get_metacognitive_context()
Query the brain's own knowledge state — confidence, freshness, conflicts.

brain.recall_with_epistemic_context(query, limit?)
Recall records plus their epistemic state in one call.
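The namespace parameter on store and recall is what enables per-user isolation (as in the FastAPI example below). A sketch assuming only the signatures above; the `user:{user_id}` naming scheme is a hypothetical convention, not something Aura prescribes:

```python
def store_for_user(brain, user_id, content):
    """Store a memory in one user's private namespace."""
    return brain.store(content, namespace=f"user:{user_id}")

def recall_for_user(brain, user_id, query, limit=5):
    """Recall memories scoped to one user's namespace only."""
    return brain.recall(query, limit=limit, namespace=f"user:{user_id}")
```

One brain on disk, many isolated users: recall for user A never surfaces user B's memories.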
Record fields
record = brain.recall("something")[0]
record.id # unique record ID
record.content # stored text
record.level # memory level (Working/Decisions/Domain/Identity)
record.tags # list of tags
record.confidence # reliability score 0.0–1.0
record.namespace # isolation namespace
record.created_at # unix timestamp

MCP server
Aura includes a native MCP server — works with Claude Desktop, Cursor, VS Code, and any MCP client.
# Start MCP server
python -m aura.mcp --brain ./my_brain
# Or via CLI
aura-mcp --brain ./my_brain --port 8765

Examples
Ready-to-run scripts for every major use case. Clone the repo and run directly.
basic_usage.py: store, recall, levels, tags; a full basics walkthrough
claude_sdk_agent.py: system prompt injection + tool use with Claude
gemini_aura_demo.py: cheap model + memory vs. expensive model alone
ollama_agent.py: fully local, no API key required
crewai_agent.py: multi-agent crew with persistent memory tools
langchain_agent.py: drop-in memory via prompt injection
fastapi_middleware.py: per-user memory isolation in a web API
research_bot.py: research agent that accumulates findings
maintenance_daemon.py: background brain maintenance in production
encryption.py: ChaCha20-encrypted brain at rest
Try in browser — no install needed
Open the Colab notebook and run Aura in your browser in under 2 minutes.
Open in Colab