Frequently Asked Questions About Neo4j Context Graph Decision-Aware Agent Framework
22 answers covering everything from basics to advanced usage.
// Basics
What is the difference between a decision-aware agent and a standard LLM agent?
A standard LLM agent relies on prompt instructions and retrieved context to respond. A decision-aware agent additionally stores and queries policies, rules, and prior decision precedents in a context graph, performs reference class validation, separates analysis from authority, and records full decision traces. As a result, it is better equipped to handle situations its prompt engineering never anticipated, where a standard agent tends to fall back on majority-case patterns or to hallucinate.
What is reasoning memory and why do AI agents need it?
Reasoning memory is the memory layer that stores policies, rules, and prior decision rationale — the why behind agent behaviour. Without it, agents only know facts (long-term memory) and session context (short-term memory) but cannot reason about governance or precedent. Reasoning memory enables consistent, accountable decisions and prevents agents from treating every situation as novel when established rules already apply.
What is a decision trace and how is it stored?
A decision trace is the complete recorded artefact of a decision event: what was considered, what alternatives were rejected, the full reasoning chain, the final outcome, and actions taken. It is stored as nodes and relationships in the context graph, creating precedent that future agents can query. This enables both auditability for compliance and self-improvement as the agent's corpus of precedent grows over time.
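As a rough illustration of "stored as nodes and relationships", the sketch below flattens one decision event into node and relationship tuples. The DecisionTrace class, the Option and Action labels, and the CONSIDERED, REJECTED, and TOOK_ACTION relationship names are assumptions made for this example; only RESULTED_IN appears in the schema answer further down.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionTrace:
    """Hypothetical in-memory shape of one decision event."""
    decision_id: str
    considered: list[str]
    rejected: dict[str, str]   # alternative -> reason it was rejected
    reasoning: list[str]       # ordered reasoning steps
    outcome: str
    actions: list[str] = field(default_factory=list)


def to_graph(trace: DecisionTrace):
    """Flatten a trace into (nodes, relationships) ready to be written to a graph."""
    nodes = [("Decision", trace.decision_id, {"reasoning": trace.reasoning})]
    rels = []
    for option in trace.considered:
        nodes.append(("Option", option, {}))
        rels.append((trace.decision_id, "CONSIDERED", option, {}))
    for option, reason in trace.rejected.items():
        rels.append((trace.decision_id, "REJECTED", option, {"reason": reason}))
    nodes.append(("Outcome", trace.outcome, {}))
    rels.append((trace.decision_id, "RESULTED_IN", trace.outcome, {}))
    for action in trace.actions:
        nodes.append(("Action", action, {}))
        rels.append((trace.decision_id, "TOOK_ACTION", action, {}))
    return nodes, rels
```

Keeping the rejection reason on the relationship rather than on the option node lets the same option be rejected for different reasons in different decisions.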
Is this framework suitable for low-stakes consumer chatbots?
The full seven-step workflow is designed for consequential decisions and may be over-engineered for simple consumer chatbots. However, even low-stakes agents benefit from the principles: storing soft rules in a graph (brand voice, return policies), separating analysis from action (recommend vs. execute), and tracing decisions (for analytics and improvement). Apply the framework proportionally — use the principles everywhere, but scale the workflow formality to match the stakes.
// How To
How do I define the agent's authority scope correctly?
Map every action the agent might take and classify each as authorized-autonomous, authorized-with-confirmation, or requires-escalation. Be specific: 'can place orders under $50' is better than 'can make purchases.' Encode this scope in the context graph so the decision agent can query it at the act-or-escalate checkpoint. Err on the side of narrower authority initially and expand based on recorded decision traces that demonstrate safe operation.
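A minimal sketch of the three-way classification, assuming the scope is expressed as per-action dollar limits. The SCOPE table, its action names, and its limits are invented for illustration; in the framework this data would be read from the context graph rather than hard-coded.

```python
from enum import Enum


class Authority(Enum):
    AUTONOMOUS = "authorized-autonomous"
    CONFIRM = "authorized-with-confirmation"
    ESCALATE = "requires-escalation"


# Hypothetical scope: action -> (autonomous limit, confirmation limit) in USD.
SCOPE = {
    "place_order": (50.0, 500.0),
    "issue_refund": (0.0, 100.0),   # a limit of 0 means never fully autonomous
}


def classify(action: str, amount: float) -> Authority:
    """Return the authority level for an action; unknown actions always escalate."""
    if action not in SCOPE:
        return Authority.ESCALATE
    autonomous_limit, confirm_limit = SCOPE[action]
    if amount < autonomous_limit:
        return Authority.AUTONOMOUS
    if amount < confirm_limit:
        return Authority.CONFIRM
    return Authority.ESCALATE
```

Defaulting unknown actions to escalation matches the advice to start with narrow authority and widen it only on evidence.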
How do I encode hard and soft rules into a context graph?
Hard rules (formal policies, regulatory requirements) become nodes with mandatory enforcement relationships. Soft rules (Slack guidance, team norms, informal practices) become nodes with advisory relationships. Both connect to the domains, entities, and decision types they govern. Include metadata like source, last-updated date, and override hierarchy so the agent can resolve conflicts — hard rules override soft rules, newer rules override older ones unless explicitly stated otherwise.
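One possible way to write such a rule into Neo4j is sketched below as a helper that builds a parameterized Cypher statement. The MANDATORY_FOR and ADVISORY_FOR relationship types, the DecisionType label, and the property names are assumptions for this example, not names defined by the framework. The helper only builds the query text and parameters; running it against a database is left to whichever driver you use.

```python
# Relationship types cannot be passed as Cypher parameters, so the type is
# picked from a fixed whitelist in Python. Both names are hypothetical.
REL_BY_KIND = {"hard": "MANDATORY_FOR", "soft": "ADVISORY_FOR"}


def rule_upsert(rule_id, kind, source, last_updated, decision_type):
    """Build a parameterized Cypher statement that stores one rule and what it governs."""
    rel = REL_BY_KIND[kind]   # raises KeyError for anything other than hard/soft
    query = (
        "MERGE (r:Rule {id: $id}) "
        "SET r.kind = $kind, r.source = $source, r.last_updated = date($last_updated) "
        "MERGE (t:DecisionType {name: $decision_type}) "
        f"MERGE (r)-[:{rel}]->(t)"
    )
    params = {
        "id": rule_id,
        "kind": kind,
        "source": source,
        "last_updated": last_updated,   # ISO date string, e.g. "2024-03-01"
        "decision_type": decision_type,
    }
    return query, params
```

Using MERGE instead of CREATE keeps the write idempotent, so re-importing the same policy document does not duplicate rule nodes.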
How do I implement reference class validation in practice?
Before the risk-value analysis step, the agent queries the context graph for attributes that segment the current case into subpopulations. For a patient, this might be age, medications, allergies, and genetic markers. For a financial transaction, it might be account history, transaction size, and jurisdiction. The agent then explicitly checks: does the majority-case logic still hold for this specific segment? If the segment has known exceptions or contraindications, the analysis must reflect that minority-case risk profile.
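The check described above might look like the following sketch. The KNOWN_EXCEPTIONS table and its values are made up for the patient example; a real agent would load them from the context graph. Missing attributes are treated as a reason not to assume the majority case.

```python
# Hypothetical exception table: attribute -> values that put a case in a minority segment.
KNOWN_EXCEPTIONS = {
    "allergies": {"penicillin"},
    "medications": {"warfarin"},
}


def validate_reference_class(case: dict) -> dict:
    """Check whether majority-case logic can be assumed for this case."""
    flags = []
    missing = []
    for attribute, risky_values in KNOWN_EXCEPTIONS.items():
        if attribute not in case:
            missing.append(attribute)   # cannot segment without this attribute
            continue
        hits = risky_values & set(case[attribute])
        if hits:
            flags.append((attribute, sorted(hits)))
    return {
        "majority_logic_holds": not flags and not missing,
        "minority_flags": flags,
        "missing_attributes": missing,
    }
```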
How do I set up the analysis agent and decision agent as separate components?
Create two distinct agent roles with different system prompts and tool access. The analysis agent has read access to the context graph and produces a structured JSON proposal listing options, pros, cons, and risk assessments — but has no execution tools. The decision agent receives this proposal, has access to the authority scope in the context graph, and has execution tools. This separation can be implemented as two LLM calls, two microservices, or two nodes in a LangGraph or similar orchestration framework.
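The separation can be sketched with two plain functions, shown here without any orchestration framework. The LLM call is replaced by hard-coded options, and the option names, risk scores, and max_risk threshold are assumptions for illustration. The point is structural: only the decision agent receives the tools dictionary.

```python
import json


def analysis_agent(case: dict) -> str:
    """Read-only role: returns a JSON proposal and has no execution tools.
    A real implementation would call an LLM here; the options are hard-coded."""
    proposal = {
        "case_id": case["id"],
        "options": [
            {"name": "approve", "pros": ["fast"], "cons": ["fraud exposure"], "risk": 0.4},
            {"name": "request_documents", "pros": ["lower risk"], "cons": ["slower"], "risk": 0.1},
        ],
    }
    return json.dumps(proposal)


def decision_agent(proposal_json: str, max_risk: float, tools: dict) -> dict:
    """Holds authority and execution tools; acts only within max_risk, else escalates."""
    proposal = json.loads(proposal_json)
    allowed = [
        o for o in proposal["options"]
        if o["risk"] <= max_risk and o["name"] in tools
    ]
    if not allowed:
        return {"case_id": proposal["case_id"], "status": "escalated"}
    choice = min(allowed, key=lambda o: o["risk"])
    result = tools[choice["name"]](proposal["case_id"])
    return {
        "case_id": proposal["case_id"],
        "status": "executed",
        "action": choice["name"],
        "result": result,
    }
```

Passing the proposal as a JSON string rather than a shared object keeps the boundary explicit, which makes it straightforward to move the two roles into separate services later.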
// Troubleshooting
What happens when the context graph has conflicting rules?
Conflicting rules are expected, not exceptional. Encode a conflict resolution hierarchy in the graph: regulatory rules override company policies, which override informal guidance. Newer rules override older ones unless the older rule is marked immutable. When the analysis agent detects a conflict, it must surface both rules and the conflict in its proposal rather than silently resolving it. The decision agent or human then makes the judgment call, and the resolution is traced back into the graph as precedent.
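A small sketch of conflict surfacing under the hierarchy described above. The Rule fields, tier names, and verdict values are assumptions for this example. The function returns both the conflicting rules and a suggested precedence, leaving the final call to the decision agent or a human.

```python
from dataclasses import dataclass
from datetime import date

# Lower rank wins: regulatory rules override company policy, which overrides informal guidance.
RANK = {"regulatory": 0, "company_policy": 1, "informal": 2}


@dataclass(frozen=True)
class Rule:
    rule_id: str
    tier: str
    effective: date
    verdict: str            # e.g. "allow" or "deny"
    immutable: bool = False


def detect_conflict(rules):
    """Return a proposal fragment that surfaces conflicts instead of hiding them."""
    verdicts = {r.verdict for r in rules}
    if len(verdicts) <= 1:
        return {"conflict": False, "rules": [r.rule_id for r in rules]}
    # Suggested precedence: tier first, then immutable rules, then most recent.
    ordered = sorted(
        rules,
        key=lambda r: (RANK[r.tier], not r.immutable, -r.effective.toordinal()),
    )
    return {
        "conflict": True,
        "rules": [r.rule_id for r in rules],
        "suggested_precedence": [r.rule_id for r in ordered],
    }
```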
What if the context graph has no prior decisions for a novel situation?
This is the cold-start scenario. Without prior decisions, the agent relies more heavily on hard and soft rules and its reference class validation. The analysis agent should flag the absence of precedent explicitly in its proposal, which increases the likelihood of escalation at the decision stage. The first decision made in a novel situation becomes the founding precedent — making thorough decision tracing especially important so future agents benefit from this initial case.
My agent keeps escalating everything — how do I reduce unnecessary escalations?
Excessive escalation usually means the authority scope is too narrow, the certainty thresholds are too high, or the context graph lacks sufficient precedent and rules. Audit the decision traces to find patterns: if the same type of decision is repeatedly escalated and always approved by the human, expand the agent's authority scope for that decision type. Also enrich the context graph with more rules and precedent so the agent's certainty increases. Gradual authority expansion based on traced evidence is the correct approach.
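The audit for "repeatedly escalated and always approved" can be approximated as below. The trace fields (decision_type, escalated, human_approved) and the min_cases threshold are assumptions about how traces are exported, not part of the framework.

```python
from collections import defaultdict


def expansion_candidates(traces, min_cases=5):
    """Return decision types escalated at least min_cases times
    and approved by the human every time."""
    stats = defaultdict(lambda: [0, 0])   # decision type -> [escalated, approved]
    for t in traces:
        if t["escalated"]:
            stats[t["decision_type"]][0] += 1
            stats[t["decision_type"]][1] += int(t["human_approved"])
    return sorted(
        dtype
        for dtype, (escalated, approved) in stats.items()
        if escalated >= min_cases and approved == escalated
    )
```

A single human rejection removes a decision type from the list, which keeps authority expansion conservative.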
How do I audit an existing AI agent using this framework?
Map the agent's current decision points against the seven-step workflow. Check: Does it frame local context with stakes assessment? Does it query structured rules and precedent (or just RAG documents)? Does it perform reference class validation? Is analysis separated from decision authority? Does it have an explicit act-or-escalate gate? Are decisions traced? Each gap represents a failure mode. Prioritize closing gaps where the stakes are highest — medical, financial, legal, or safety-critical decisions.
// Comparisons
How does the context graph framework compare to chain-of-thought prompting for explainability?
Chain-of-thought prompting produces reasoning that looks explanatory but is generated ad hoc — it is not grounded in stored policies or precedent, and it is not persisted for future use. The context graph framework produces explainability that is verifiable (grounded in graph-stored rules), consistent (same rules apply across sessions), and cumulative (decision traces become precedent). Chain-of-thought explains what the model thought; context graphs explain what the agent was governed by.
How does this framework compare to using a vector database for agent memory?
Vector databases store embeddings for semantic similarity search — good for retrieving relevant documents but poor at representing structured relationships between rules, policies, entities, and decisions. Context graphs in Neo4j store explicit relationships (this rule governs this decision type, this precedent applies to this entity class), enabling traversal-based reasoning rather than similarity-based retrieval. The two are complementary: use vector search for unstructured retrieval and graph queries for structured governance.
Can I combine this framework with LangGraph or CrewAI for multi-agent orchestration?
Yes, and it is recommended. LangGraph or CrewAI handles agent orchestration: routing, state management, and tool calling. The context graph framework provides the decision governance layer on top. Implement the analysis agent and decision agent as separate nodes in LangGraph or separate agents in CrewAI, with the context graph (Neo4j) as a shared tool both agents query. The orchestration framework manages flow; the context graph framework manages judgment.
// Advanced
How does Text-to-Cypher work in this framework?
Text-to-Cypher translates the agent's natural language queries into Cypher, Neo4j's graph query language, so the agent can retrieve structured data from the context graph without hard-coded queries. For example, an agent might ask 'What were the last three decisions made for similar loan applications?' and Text-to-Cypher generates the Cypher query to traverse the graph. This enables flexible, dynamic context retrieval while keeping the agent's reasoning in natural language.
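To make the translation step concrete, the sketch below builds the prompt an LLM would receive and shows one query it might plausibly return for the loan question. The schema snippet, the property names (type, made_at, outcome), and the prompt wording are assumptions for illustration, and the example query is hand-written, not actual model output.

```python
# Hypothetical schema description handed to the model.
SCHEMA = """\
(:Decision {id, type, made_at, outcome})
(:Rule {id, kind})
(:Decision)-[:GOVERNED_BY]->(:Rule)
(:Decision)-[:PRECEDED_BY]->(:Decision)
"""


def build_prompt(question: str, schema: str = SCHEMA) -> str:
    """Combine the graph schema and the agent's question into one translation prompt."""
    return (
        "Translate the question into a single read-only Cypher query.\n"
        "Use only the labels, relationship types, and properties in this schema:\n"
        f"{schema}\n"
        f"Question: {question}\n"
        "Cypher:"
    )


# A query a model might return for the loan question, with the type passed as a parameter.
EXAMPLE_CYPHER = """\
MATCH (d:Decision {type: $decision_type})
RETURN d.id AS id, d.outcome AS outcome, d.made_at AS made_at
ORDER BY d.made_at DESC
LIMIT 3
"""
```

Including the schema in the prompt is what lets the model stay within labels and properties that actually exist in the graph.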
How do I handle evolving rules and policies in the context graph?
Version every rule node with effective dates and supersession relationships. When a new rule replaces an old one, create a SUPERSEDES relationship rather than deleting the old rule, because prior decisions that referenced the old rule must remain auditable. The analysis agent should check rule currency during the global context load step and flag any prior decision used as precedent that was made under a now-superseded rule.
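The currency check can be sketched over an in-memory copy of the SUPERSEDES edges. The rule ids and the decision record shape are invented for this example; in practice the edges would come from a graph query.

```python
# Hypothetical SUPERSEDES edges: new rule id -> old rule id it replaces.
SUPERSEDES = {"refund-v3": "refund-v2", "refund-v2": "refund-v1"}


def current_version(rule_id, supersedes=SUPERSEDES):
    """Follow SUPERSEDES edges from an old rule to the rule that is in force now."""
    replaced_by = {old: new for new, old in supersedes.items()}
    seen = set()
    while rule_id in replaced_by and rule_id not in seen:
        seen.add(rule_id)              # guards against accidental cycles
        rule_id = replaced_by[rule_id]
    return rule_id


def stale_precedents(decisions, supersedes=SUPERSEDES):
    """Return ids of prior decisions governed by a rule that has since been superseded."""
    return [
        d["id"]
        for d in decisions
        if current_version(d["rule_id"], supersedes) != d["rule_id"]
    ]
```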
How do you measure the quality of a context graph for agent decision-making?
Measure coverage, currency, and consistency. Coverage: what percentage of the agent's decision types have corresponding rules and precedent in the graph? Currency: are rules up to date, or are there stale policies that no longer apply? Consistency: do prior decision traces show consistent application of rules, or are there contradictory precedents? Audit the escalation rate — a high rate may indicate coverage gaps; a zero rate may indicate missing authority checks.
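One way to turn these three measures into numbers is sketched below. The record shapes, the one-year staleness cutoff, and the definition of a contradiction (same rule and segment, different outcomes) are assumptions chosen for the example rather than definitions from the framework.

```python
from datetime import date


def graph_quality(decision_types, rules, traces, today, max_age_days=365):
    """Compute rough coverage, currency, and consistency indicators."""
    rule_types = {r["decision_type"] for r in rules}
    trace_types = {t["decision_type"] for t in traces}
    covered = [d for d in decision_types if d in rule_types and d in trace_types]
    coverage = len(covered) / len(decision_types) if decision_types else 0.0

    stale_rules = sorted(
        r["id"] for r in rules if (today - r["last_updated"]).days > max_age_days
    )

    # Consistency: the same rule applied to the same segment should give one outcome.
    outcomes = {}
    for t in traces:
        outcomes.setdefault((t["rule_id"], t["segment"]), set()).add(t["outcome"])
    contradictory = sorted(key for key, seen in outcomes.items() if len(seen) > 1)

    escalation_rate = sum(t["escalated"] for t in traces) / len(traces) if traces else 0.0
    return {
        "coverage": coverage,
        "stale_rules": stale_rules,
        "contradictory": contradictory,
        "escalation_rate": escalation_rate,
    }
```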
What graph schema should I use for the context graph in Neo4j?
Start with core node types: Agent, Decision, Rule (hard and soft), Entity (domain-specific), Precedent, Escalation, and Outcome. Key relationships include GOVERNED_BY (decision to rule), PRECEDED_BY (decision to prior decision), APPLIES_TO (rule to entity class), ESCALATED_TO (decision to human or higher agent), and RESULTED_IN (decision to outcome). Add domain-specific nodes as needed. Keep the schema expressive enough to represent governance but simple enough to query efficiently with Text-to-Cypher.
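The relationship list can be captured as a small lookup table that an ingestion step checks before writing edges. This is a sketch under assumptions: Human is an invented label for the human target of ESCALATED_TO, Outcome is taken as the target of RESULTED_IN, and the `id` property is assumed to be the unique key. The constraint statements use Neo4j 5 syntax; older versions use a different form.

```python
# Relationship type -> (source label, allowed target labels).
SCHEMA = {
    "GOVERNED_BY": ("Decision", {"Rule"}),
    "PRECEDED_BY": ("Decision", {"Decision"}),
    "APPLIES_TO": ("Rule", {"Entity"}),
    "ESCALATED_TO": ("Decision", {"Agent", "Human"}),
    "RESULTED_IN": ("Decision", {"Outcome"}),
}


def is_valid_edge(source: str, rel_type: str, target: str) -> bool:
    """True if the edge matches the schema table above."""
    if rel_type not in SCHEMA:
        return False
    expected_source, targets = SCHEMA[rel_type]
    return source == expected_source and target in targets


def uniqueness_constraints(labels):
    """Neo4j 5 style uniqueness constraints, one per label, on an assumed `id` property."""
    return [
        f"CREATE CONSTRAINT {label.lower()}_id IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.id IS UNIQUE"
        for label in labels
    ]
```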
How do I handle the 1% edge case problem in practice?
The 1% edge case problem occurs when an agent applies majority-case logic to a minority-case situation with catastrophic consequences. In practice, solve it with reference class validation: before analysis, query the context graph for attributes that distinguish the current case from the majority. If the case matches a known minority segment with a different risk profile, load the rules specific to that segment. If the case cannot be classified with confidence, escalate. Never default to the majority assumption for high-stakes decisions.
What are the infrastructure requirements for running this framework?
You need a Neo4j instance (cloud via AuraDB or self-hosted) for the context graph, an LLM provider for the analysis and decision agents, and an orchestration layer (LangGraph, CrewAI, or custom). For Text-to-Cypher, you need an LLM with access to the graph schema. Storage scales with decision volume — each decision trace adds nodes and relationships. For production, add monitoring for escalation rates, decision latency, and graph query performance.
Can this framework support real-time agent decisions or is it only for batch processing?
The framework supports real-time decisions. Neo4j graph queries typically return in milliseconds when the graph is well indexed and traversals are bounded, and the workflow steps can execute in a single LLM call chain. The main latency factors are LLM inference time (analysis and decision agents) and any human-in-the-loop escalation. For time-sensitive domains, set escalation timeouts and define fallback-safe actions (e.g., 'if no human response in 30 seconds, take the lowest-risk option and trace the timeout').
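The timeout-and-fallback pattern from the example can be sketched with the standard library alone. The option shape (name and risk), the ask_human callback, and the returned fields are assumptions for illustration; the timed_out flag is what would be written into the decision trace.

```python
import queue
import threading


def decide_with_timeout(options, ask_human, timeout_s=30.0):
    """Wait up to timeout_s for a human choice; otherwise take the lowest-risk option."""
    answers = queue.Queue(maxsize=1)
    # Ask in a background thread so a slow or absent reviewer cannot block the agent.
    threading.Thread(
        target=lambda: answers.put(ask_human(options)), daemon=True
    ).start()
    try:
        choice = answers.get(timeout=timeout_s)
        return {"action": choice, "source": "human", "timed_out": False}
    except queue.Empty:
        fallback = min(options, key=lambda o: o["risk"])["name"]
        return {"action": fallback, "source": "fallback", "timed_out": True}
```

The fallback is chosen by the lowest risk score, so the quality of that score determines how safe the timeout path really is.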