
Arvora Annotated Bibliography

Last updated: 07-05-2025


A curated selection of works guiding Arvora's architecture, spanning neuroscience, symbolic systems, graph theory, and AI. This is a living document that will grow and shrink as Arvora matures.


AI Approaches to Memory and Reasoning in Language Models

Foreword

The following works examine how large language models (LLMs) can be augmented with structured memory, graph-based reasoning, and cognitive scaffolding to overcome the limitations of stateless inference. Together, they outline a landscape of techniques--from GraphRAG and MemGPT to temporal graphs and chain-of-thought prompting--that explore how memory can be externalized, modularized, and made navigable, while also surfacing challenges around scale, accuracy, and coordination.

These papers directly and indirectly inform Arvora's evolving design: a persistent, content-addressed semantic graph for storing, traversing, and auditing knowledge with speed and integrity. While many approaches rely on the LLM to manage its own memory implicitly, Arvora emphasizes explicit, agent-controlled memory--laying the groundwork for deeper reasoning and more durable, verifiable context over time.


Liang, J., Wang, Y., Li, C., Zhu, R., Jiang, T., Gong, N., & Wang, T. (2025). GraphRAG under Fire. arXiv:2501.14050.

This work examines GraphRAG – a retrieval-augmented generation approach that organizes knowledge as graphs – under adversarial conditions. It finds that structuring context as a knowledge graph makes LLM retrieval more resilient to certain poisoning attacks compared to flat RAG techniques. GraphRAG's graph-based indexing inherently filters some malicious inputs, highlighting the robustness benefits of graph-structured memory. However, the study also uncovers new vulnerabilities unique to graph-centric systems when adversaries exploit relational links.

For Arvora's design, these insights reinforce the choice of a graph model for memory: a semantic graph can enhance the integrity and relevance of retrieved information. At the same time, Arvora's emphasis on immutability and content-addressed nodes/edges provides an extra layer of defense – every node and relationship is hashed and versioned, offering auditability that can help detect or roll back malicious modifications. In contrast to GraphRAG's dynamic graph state, Arvora’s Git-like snapshots with root hashes act as checkpoints, aligning with secure practices suggested by this paper's findings.
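To make the content-addressing idea concrete, the sketch below hashes nodes and folds their identifiers into a snapshot root. It uses std's DefaultHasher as a stand-in for a cryptographic hash such as SHA-256, and the types shown are illustrative assumptions, not Arvora's actual API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Hash)]
struct Node {
    label: String,
    payload: String,
}

// Content address: a node's identifier is derived from its contents alone.
fn content_id<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

// A snapshot's root commits to every node it contains: tamper with any node
// and the root changes, which is what makes audits and rollbacks possible.
fn root_hash(node_ids: &mut Vec<u64>) -> u64 {
    node_ids.sort_unstable(); // order-independent commitment over the node set
    let mut h = DefaultHasher::new();
    for id in node_ids.iter() {
        id.hash(&mut h);
    }
    h.finish()
}

fn main() {
    let claim = Node { label: "claim".into(), payload: "graph indexing filters some poisoned context".into() };
    let source = Node { label: "source".into(), payload: "arXiv:2501.14050".into() };
    let mut ids = vec![content_id(&claim), content_id(&source)];
    println!("snapshot root = {:016x}", root_hash(&mut ids));
}
```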

Han, H., Wang, Y., Shomer, H., Guo, K., Ding, J., … Tang, J. (2025). Retrieval-Augmented Generation with Graphs (GraphRAG). arXiv:2501.00309.

Han et al. provide a comprehensive survey of GraphRAG techniques, underscoring why graph-structured knowledge bases are valuable for augmented generation. A graph's intrinsic “nodes connected by edges” structure encodes rich, heterogeneous relationships, making it a “golden resource” for retrieval-augmented generation. This survey defines a general GraphRAG framework and reviews how various domains tailor graph-informed retrievers and generators. The broad takeaway is that linking information as a graph can give language models access to both broad context and fine-grained details in a uniform representational form.

This motivates Arvora's hybrid semantic graph architecture: by storing knowledge as a network of content-addressable nodes and edges, Arvora enables similar benefits of structured context. The survey also notes design challenges when working with diverse graph data. Arvora addresses these by remaining storage-agnostic (via a write-ahead log) and by not hardcoding domain-specific schema – the graph can flexibly represent any relational data, with optional metadata or embeddings to adapt to different use cases. In essence, the GraphRAG paradigm surveyed here validates Arvora's core idea that knowledge graphs can significantly enhance an AI's ability to utilize and navigate memory.
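A minimal sketch of the schema-light shape this implies--nodes and edges carrying optional metadata and embeddings--appears below; the field names are assumptions for illustration, not Arvora's concrete types.

```rust
use std::collections::HashMap;

struct Node {
    id: u64,                           // content hash of the payload
    payload: String,                   // domain data, not interpreted by the store
    metadata: HashMap<String, String>, // optional annotations, e.g. a source citation
    embedding: Option<Vec<f32>>,       // optional vector for similarity search
}

struct Edge {
    from: u64,
    to: u64,
    relation: String,                  // e.g. "cites", "refines", "observed_before"
    metadata: HashMap<String, String>, // optional provenance, weights, timestamps
}

fn main() {
    let n = Node {
        id: 42,
        payload: "knowledge graphs aid retrieval-augmented generation".into(),
        metadata: HashMap::from([("source".to_string(), "arXiv:2501.00309".to_string())]),
        embedding: None,
    };
    let e = Edge { from: n.id, to: 7, relation: "cites".into(), metadata: HashMap::new() };
    println!("{} -[{}]-> {}", e.from, e.relation, e.to);
}
```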

Packer, C., Wooders, S., Lin, K., Fang, V., Patil, S. G., Stoica, I., & Gonzalez, J. E. (2024). MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560.

MemGPT tackles the limitation of LLMs' fixed context windows by introducing an OS-inspired memory hierarchy. It manages “fast” vs “slow” memory tiers and uses virtual context swapping to let an LLM handle extended conversations or large documents beyond its normal context length. The system demonstrates that careful management of in-memory context (for immediate use) versus external storage (for long-term data) allows an AI to “remember, reflect, and evolve” over longer interactions.

This approach informs Arvora's Rust-native, in-memory design. Arvora keeps the working graph in memory for high-speed access (analogous to MemGPT's fast memory) while relying on a storage-agnostic write-ahead log as a persistence layer (analogous to slow memory). By decoupling immediate memory usage from durable storage, Arvora ensures fast graph operations and snapshots state to disk for durability. Furthermore, MemGPT's successes in multi-session chat and large document analysis underscore the need for structured long-term memory. Arvora's content-addressed graph goes a step further by not just swapping text in and out, but organizing knowledge into a semantic network. This design choice provides a more structured long-term memory than MemGPT's raw token storage, potentially enabling richer recall and cross-reference. Importantly, while MemGPT's memory management is orchestrated by the LLM itself, Arvora avoids any direct LLM dependency for managing the memory. Instead, the external system handles it, which can increase reliability and transparency in how information is stored and retrieved.
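The sketch below illustrates the fast/slow split under discussion: an in-memory graph that appends every operation to a write-ahead log before applying it. The Wal trait and operation shapes are hypothetical stand-ins for whatever persistence backend a deployment chooses.

```rust
use std::collections::HashMap;

enum Op {
    AddNode { id: u64, payload: String },
    AddEdge { from: u64, to: u64, relation: String },
}

trait Wal {
    fn append(&mut self, op: &Op); // durably record the operation before applying it
}

// Toy log that keeps lines in memory; a real backend could be a file, a
// database, or any other append-only sink (the store stays storage-agnostic).
struct VecWal(Vec<String>);

impl Wal for VecWal {
    fn append(&mut self, op: &Op) {
        self.0.push(match op {
            Op::AddNode { id, .. } => format!("add_node {id}"),
            Op::AddEdge { from, to, .. } => format!("add_edge {from}->{to}"),
        });
    }
}

struct InMemoryGraph {
    nodes: HashMap<u64, String>,
    edges: Vec<(u64, u64, String)>,
}

impl InMemoryGraph {
    fn apply(&mut self, op: Op, wal: &mut dyn Wal) {
        wal.append(&op); // persist to the slow tier first, then mutate the fast tier
        match op {
            Op::AddNode { id, payload } => { self.nodes.insert(id, payload); }
            Op::AddEdge { from, to, relation } => { self.edges.push((from, to, relation)); }
        }
    }
}

fn main() {
    let mut wal = VecWal(Vec::new());
    let mut graph = InMemoryGraph { nodes: HashMap::new(), edges: Vec::new() };
    graph.apply(Op::AddNode { id: 1, payload: "observation".into() }, &mut wal);
    graph.apply(Op::AddEdge { from: 1, to: 2, relation: "relates_to".into() }, &mut wal);
    println!("replayable log: {:?}", wal.0);
}
```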

Rasmussen, P., Paliychuk, P., Beauvais, T., Ryan, J., & Chalef, D. (2025). Zep: A Temporal Knowledge Graph Architecture for Agent Memory. arXiv:2501.13956.

Zep introduces a dedicated memory service for AI agents, centered on a temporal knowledge graph called Graphiti. This graph engine dynamically integrates streaming information (like ongoing conversations) with structured domain data, while preserving the temporal relationships among events. The result is a system that significantly outperforms prior long-term memory solutions (e.g. MemGPT) on deep memory retrieval benchmarks, achieving higher accuracy and an order-of-magnitude speedup in long-horizon tasks. Zep's success confirms that explicitly graph-structured memories can greatly improve an agent's ability to synthesize information over time.

In comparison to Arvora, Zep is a full-fledged service tailored for LLM agents, with Graphiti focusing on time-aware context injection. Both Zep and Arvora share the philosophy of using a knowledge graph as the backbone of memory, but they diverge in implementation and scope. Arvora is a lower-level Rust library providing an immutable, content-addressed graph store with Git-like version control, whereas Zep's Graphiti is a higher-level architecture built into an AI agent system. Notably, Arvora's design eschews direct involvement of the LLM in constructing or querying the memory graph—developers explicitly add and retrieve nodes/edges. On the other hand, Zep's pipeline is integrated with the agent's learning process (possibly using the LLM to help populate or use the graph). Arvora's choice of immutability and root-hash snapshots offers strong consistency and auditability (any memory state can be hashed and verified), in contrast to Zep's focus on live, mutable graphs for real-time use. Nonetheless, both approaches underscore a common theme: long-term agent intelligence benefits from a structured, persistent memory. Zep's enterprise-oriented results validate Arvora's core premise that a graph-based memory, augmented with temporal context and proper tooling, can dramatically enhance long-term reasoning in AI.
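As a rough illustration of what explicit, agent-controlled memory looks like in practice, the sketch below shows a developer-driven add/retrieve/snapshot flow; the Store type and its placeholder root computation are assumptions, not Arvora's published interface.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct Store {
    nodes: HashMap<u64, String>,
    edges: Vec<(u64, u64, String)>,
    snapshots: Vec<u64>, // root hashes, one per committed state
}

impl Store {
    fn add_node(&mut self, id: u64, payload: &str) {
        self.nodes.insert(id, payload.to_string());
    }
    fn add_edge(&mut self, from: u64, to: u64, relation: &str) {
        self.edges.push((from, to, relation.to_string()));
    }
    fn get(&self, id: u64) -> Option<&String> {
        self.nodes.get(&id)
    }
    fn snapshot(&mut self) -> u64 {
        // Placeholder commitment; a real store would hash the full graph state.
        let root = (self.nodes.len() as u64) ^ ((self.edges.len() as u64) << 32);
        self.snapshots.push(root);
        root
    }
}

fn main() {
    let mut store = Store::default();
    store.add_node(1, "meeting with Alice on Tuesday");
    store.add_node(2, "Alice");
    store.add_edge(1, 2, "involves");
    let root = store.snapshot(); // the calling code, not an LLM, decides what is remembered
    println!("committed root {root:x}; node 2 = {:?}", store.get(2));
}
```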

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., … Zhou, D. (2023). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903.

Wei et al. demonstrate that prompting large language models to generate step-by-step chains of reasoning substantially improves their performance on complex tasks. By providing examples of intermediate reasoning (“thoughts”), even very large models can achieve breakthroughs on math word problems and logical reasoning benchmarks. This finding highlights that LLMs do have latent capability for multi-step inference when guided properly. However, chain-of-thought prompting alone doesn't furnish the model with a long-term memory – it operates within a single query's context and dissipates afterwards.

From Arvora's design perspective, the success of chain-of-thought prompts reinforces the importance of structure in reasoning. Arvora's semantic graph can be seen as an externalized, persistent “chain-of-thought” repository: instead of relying on the LLM's transient internal state to carry reasoning, Arvora allows an agent to store intermediate conclusions, facts, or questions as graph nodes and edges. This makes the reasoning process explicit and auditable. Moreover, because Arvora's memory is versioned and content-addressable, an agent can iteratively build and refine chains of thought over multiple interactions, linking new information to prior context without the risk of the model forgetting earlier steps. In summary, while chain-of-thought prompting coaxes structured reasoning from the black box of an LLM, Arvora provides a complementary approach by giving the AI a scratchpad or brain (in graph form) that persists beyond any single prompt—combining the cognitive boost of structured reasoning with the enduring stability of an external memory.
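A small sketch of that externalized chain-of-thought idea: each reasoning step becomes a node, and each inference records edges back to its premises. The shapes here are illustrative, not a fixed schema.

```rust
struct Step {
    id: u64,
    text: String,
}

struct ReasoningChain {
    steps: Vec<Step>,
    derived_from: Vec<(u64, u64)>, // (conclusion, premise) edges
}

impl ReasoningChain {
    fn record(&mut self, id: u64, text: &str, premises: &[u64]) {
        self.steps.push(Step { id, text: text.to_string() });
        for &p in premises {
            self.derived_from.push((id, p)); // each inference is an auditable edge
        }
    }
}

fn main() {
    let mut chain = ReasoningChain { steps: Vec::new(), derived_from: Vec::new() };
    chain.record(1, "the order ships from Berlin", &[]);
    chain.record(2, "shipping takes 3 days within the EU", &[]);
    chain.record(3, "the order arrives by Friday", &[1, 2]); // conclusion links back to premises
    println!("{} steps, {} derivation edges", chain.steps.len(), chain.derived_from.len());
}
```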

Webb, T., Mondal, S. S., Wang, C., Krabach, B., & Momennejad, I. (2024). A Prefrontal Cortex-inspired Architecture for Planning in Large Language Models. arXiv:2310.00194v3.

This work introduces LLM-PFC, an architecture that directly implements neuroscience principles from prefrontal cortex (PFC) research to enhance planning capabilities in language models. The authors create specialized modules that mirror key PFC functions – conflict monitoring, state prediction, state evaluation, task decomposition, and task coordination – demonstrating that brain-inspired modular architectures can achieve up to seven-fold improvements on planning tasks compared to standard LLM approaches. By explicitly structuring cognitive functions as interacting modules rather than relying on emergent behaviors, the system shows how neuroscience can inform practical AI agent design beyond mere analogy.

For Arvora's design, this paper provides crucial validation for the principle of structured, modular memory systems inspired by neural architecture. While Arvora focuses on hippocampal-inspired memory storage through its semantic graph, Webb et al. demonstrate the complementary value of cortical-inspired processing modules. The LLM-PFC's success with explicit state tracking and evaluation modules suggests that Arvora's graph could benefit from similar specialized processing layers – for instance, dedicated modules for memory consolidation, conflict detection between stored facts, or predictive pre-fetching based on current context. The paper's emphasis on coordination between specialized components also resonates with Arvora's philosophy of separating memory storage (the graph) from processing (external algorithms), allowing each to be designed and optimized for its specific function. Together with the other papers on memory systems, this work reinforces that brain-inspired architectures – whether for memory storage or executive control – can yield substantial practical improvements in AI systems.
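One way such a specialized processing layer might look, sketched under the assumption that facts are stored as (subject, attribute, value) entries: a conflict-detection module that runs over the graph's contents but lives outside the storage layer itself.

```rust
use std::collections::HashMap;

struct Fact {
    subject: String,
    attribute: String,
    value: String,
}

trait Module {
    fn run(&self, facts: &[Fact]) -> Vec<String>; // returns findings for the agent to act on
}

struct ConflictDetector;

impl Module for ConflictDetector {
    fn run(&self, facts: &[Fact]) -> Vec<String> {
        let mut seen: HashMap<(String, String), String> = HashMap::new();
        let mut conflicts = Vec::new();
        for f in facts {
            let key = (f.subject.clone(), f.attribute.clone());
            match seen.get(&key) {
                // Same subject and attribute with a different value: flag it.
                Some(v) if v != &f.value => conflicts.push(format!(
                    "conflict: {} {} is both '{}' and '{}'", f.subject, f.attribute, v, f.value
                )),
                _ => { seen.insert(key, f.value.clone()); }
            }
        }
        conflicts
    }
}

fn main() {
    let facts = vec![
        Fact { subject: "meeting".into(), attribute: "day".into(), value: "Tuesday".into() },
        Fact { subject: "meeting".into(), attribute: "day".into(), value: "Wednesday".into() },
    ];
    for finding in ConflictDetector.run(&facts) {
        println!("{finding}");
    }
}
```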


Neuroscience and Cognitive Science Perspectives on Memory

Foreword

The following works bridge neuroscience and memory modeling in artificial systems, grounding Arvora's design in findings from hippocampal systems, cortical memory encoding, and the broader neural architecture of cognition. These studies reinforce our approach to structuring memory as a versioned, distributed, and semantically-linked graph.

Associative rewiring is a term we introduce to describe the phenomenon where repeated co-activation of memory elements leads to the strengthening or creation of shortcut links between them (e.g. compressing A -> B -> C into a direct A -> C link, or strengthening B -> C). This mimics mechanisms like Hebbian learning, memory replay, and successor representations, which enable memory systems to compress or restructure (rewire) experience over time.
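A toy sketch of this rewiring rule, under the assumption that co-activations of a two-hop path are counted and a shortcut edge is added once a threshold is crossed:

```rust
use std::collections::HashMap;

struct Graph {
    edges: Vec<(u64, u64)>,
    co_activation: HashMap<(u64, u64), u32>, // (start, end) of a traversed two-hop path
}

impl Graph {
    fn observe_path(&mut self, a: u64, c: u64, threshold: u32) {
        let count = self.co_activation.entry((a, c)).or_insert(0);
        *count += 1;
        let already_linked = self.edges.iter().any(|&(f, t)| f == a && t == c);
        if *count >= threshold && !already_linked {
            self.edges.push((a, c)); // the learned shortcut
        }
    }
}

fn main() {
    let mut g = Graph { edges: vec![(1, 2), (2, 3)], co_activation: HashMap::new() };
    for _ in 0..3 {
        g.observe_path(1, 3, 3); // repeated co-activation of the A -> B -> C route
    }
    assert!(g.edges.contains(&(1, 3)));
    println!("edges after rewiring: {:?}", g.edges);
}
```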

We also draw from the neuroscience of multi-region partitioning, where memories are stored in overlapping, distributed engrams across distinct brain areas. This provides further theoretical support for Arvora's choice of decentralized node and edge structures--allowing memory to be retrieved from subgraphs rather than monolithic records, and enabling a single node (analogous to a concept or feature) to participate in many distinct memory traces.

These ideas not only support Arvora's current immutability and symbolic graph design, but also open pathways for future features like memory compression, anticipatory rewiring and hot paths, as well as regionally-partitioned memory layers inspired by cortical modularity.


Sridhar, S., Khamaj, A., & Asthana, M. K. (2023). Cognitive neuroscience perspective on memory: overview and summary. Frontiers in Human Neuroscience, 17, 1217093.

This review outlines how human memory is organized and consolidated, offering a foundation for memory system design. It distinguishes between working memory (short-term, capacity-limited) and long-term memory (including declarative/episodic and non-declarative forms), and emphasizes how different brain regions specialize in each type (e.g. prefrontal cortex for working memory, hippocampus for initial declarative memory storage). Critically, the paper describes the process of memory consolidation: new memories are initially encoded in the hippocampus and then gradually integrated into neocortical networks for long-term storage. This systems-level consolidation – a transfer from a fast, context-rich store to a stable, generalized store – directly inspires Arvora’s architecture.

Arvora's in-memory graph layer plays an analogous role to the “hippocampal” fast store: it's optimized for quick, expressive recording of new information (with rich associations). Meanwhile, Arvora's write-ahead log and snapshot mechanism mirror the consolidation to a long-term store: periodically, the in-memory state is persisted (akin to memories solidifying in the cortex). The review also notes that the hippocampus links new memories to existing schemas, highlighting that episodic inputs are incorporated into a broader semantic framework. Arvora's semantic graph similarly allows new nodes (events, facts) to be connected to prior knowledge via edges, ensuring that fresh information is encoded in context. By aligning with these cognitive principles – a fast encoding buffer and a durable structured store – Arvora's design not only achieves technical robustness but also parallels the brain's approach to balancing immediacy versus longevity in memory.

Battaglia, F. P., & Pennartz, C. (2011). The construction of semantic memory: Grammar-based representations learned from relational episodic information. Frontiers in Computational Neuroscience, 5, 36.

Battaglia and Pennartz propose a computational theory of how semantic memory can emerge from episodic experiences. They observe that hippocampal (episodic) memory encodes autobiographical events as relational data – essentially links between items, contexts, and temporal order – while the neocortex gradually extracts generalities (semantic knowledge) from those episodes. The paper introduces a model in which episodic memories are stored as an association matrix (akin to a graph of relationships), and a probabilistic grammar is learned over this data to form semantic representations. Notably, the authors tie this process to “sleep replay,” suggesting that off-line reactivation of episodic traces trains the semantic grammar. This framework directly informs Arvora's hybrid graph approach.

In Arvora, experiences or data points can be recorded as nodes and edges (relational episodic information). Because the graph is content-addressable and versioned, it provides a clean record of “what happened” that can be replayed or analyzed by higher-level algorithms. For example, an external process (not necessarily an LLM) could traverse Arvora's graph snapshots to induce more abstract schemas or compress redundant nodes – analogous to the grammar induction described in the paper. Arvora doesn't inherently perform the Battaglia-Pennartz learning algorithm, but it furnishes the ideal substrate (an immutable, queryable repository of relations) on which such an algorithm or other high-level operations could operate. By decoupling the storage of episodes from the inference of semantics (no direct LLM dependency for creating the graph), Arvora aligns with the paper's view that semantic structures arise from iterative processing of episodic data. In short, this research provides theoretical backing for Arvora's choice to represent memory as a graph: it mirrors the brain's strategy of building a structured knowledge base from the raw material of lived experiences.
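As a rough sketch of such an offline process, the code below replays episodic snapshots (represented here as lists of subject-relation-object triples, an assumption for illustration) and promotes relations that recur often enough to semantic knowledge.

```rust
use std::collections::HashMap;

type Triple = (String, String, String); // (subject, relation, object)

// Offline "replay": scan episodic snapshots and keep any relation pattern that
// recurs with at least `min_support` occurrences as a semantic generalization.
fn induce_schemas(snapshots: &[Vec<Triple>], min_support: usize) -> Vec<Triple> {
    let mut counts: HashMap<&Triple, usize> = HashMap::new();
    for episode in snapshots {
        for triple in episode {
            *counts.entry(triple).or_insert(0) += 1;
        }
    }
    counts
        .into_iter()
        .filter(|&(_, c)| c >= min_support)
        .map(|(t, _)| t.clone()) // generalities extracted from repeated episodes
        .collect()
}

fn main() {
    let episode = |o: &str| vec![
        ("barista".to_string(), "serves".to_string(), o.to_string()),
    ];
    let snapshots = vec![episode("espresso"), episode("espresso"), episode("latte")];
    println!("{:?}", induce_schemas(&snapshots, 2)); // [("barista", "serves", "espresso")]
}
```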

Tacikowski, P., Kalender, G., Ciliberti, D., & Fried, I. (2024). Human hippocampal and entorhinal neurons encode the temporal structure of experience. Nature, 635(8037), 160–167.

This neuroscience study shows that the hippocampus doesn't merely record isolated events, but encodes the temporal and relational structure of sequences. By recording single-neuron activity in humans learning sequences of images, the authors found that hippocampal and entorhinal cells developed representations reflecting the graph structure of the experienced sequence. In fact, the neural firing patterns came to mirror the actual sequence graph (nodes and their ordered links), and even predicted upcoming items based on learned transitions. Moreover, during rest, the hippocampus replayed sequences in a time-compressed manner, reinforcing the learned graph trajectories. These findings bolster Arvora's design choice of a graph-based memory with support for temporal relationships.

In Arvora, events or observations can be linked by temporal edges, capturing not just that they occurred, but in what order or timeframe. The fact that hippocampal neurons naturally form a graph-like code for sequences suggests that Arvora's approach is neurally plausible: a memory system benefits from storing the structure of experiences, not only the content. Arvora's optional metadata on edges could be used to weight or label these temporal connections (e.g. intervals or probabilities, similar to the varying transition likelihoods the study observed in neural activity). Additionally, the idea of replay aligns with Arvora's versioned snapshots – an agent can revisit previous memory states or traverse the graph to reinforce and reorganize knowledge. By encoding “when” and “what” together in a unified graph, Arvora can support functionalities like chronological recall, timeline construction, and predictive modeling, much like the human hippocampal system supports episodic memory and imagination of future events.
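A minimal sketch of temporal edges and chronological recall follows, with illustrative field names and a timestamp that could equally be a logical clock tick.

```rust
struct TemporalEdge {
    from: u64,
    to: u64,
    observed_at: u64, // e.g. a Unix timestamp or a logical clock tick
}

// Reconstruct a timeline: order transitions by when they were observed and
// follow them outward from `start`.
fn timeline(mut edges: Vec<TemporalEdge>, start: u64) -> Vec<u64> {
    edges.sort_by_key(|e| e.observed_at);
    let mut current = start;
    let mut order = vec![start];
    for e in &edges {
        if e.from == current {
            order.push(e.to);
            current = e.to;
        }
    }
    order
}

fn main() {
    let edges = vec![
        TemporalEdge { from: 2, to: 3, observed_at: 20 }, // breakfast, then commute
        TemporalEdge { from: 1, to: 2, observed_at: 10 }, // wake up, then breakfast
    ];
    println!("{:?}", timeline(edges, 1)); // [1, 2, 3]
}
```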

Wang, J.-H., & Cui, S. (2018). Associative memory cells and their working principle in the brain. F1000Research, 7, 108.

Wang and Cui review the neural basis of associative memory – how the brain links related pieces of information across different regions. They describe evidence that “associative memory cells” are recruited via mutual synaptic connections among co-activated brain regions to integrate and retrieve associated signals. In essence, when two stimuli or concepts are experienced together, their corresponding neurons form connections (or strengthen existing ones), creating an association that can later be reactivated to recall the linked information. This principle resonates strongly with Arvora's semantic graph model.

In Arvora, nodes representing concepts or data can be explicitly connected by edges whenever a relationship is detected or learned. These edges function like synapses encoding an association: activating one node (e.g. querying or recalling it) allows traversal to related nodes, mirroring how one memory triggers another in the brain. The paper also notes that such associative networks are essential for higher cognition, including logical reasoning and planning. Arvora's inclusion of optional embeddings alongside symbolic links further strengthens its associative power – the embeddings provide a continuous measure of similarity or relatedness, which can complement discrete graph edges. This dual representation means Arvora can support both exact, logical associations (through content-addressed links) and fuzzy, similarity-based associations (through vector comparisons), akin to how the brain's memory system blends precise relational memory with graded semantic similarity. By building Arvora around a web of associations (rather than isolated entries), we ensure that recalling one piece of information can fluidly lead to relevant others, enabling richer context assembly and a more brain-like recall process for AI systems.
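The sketch below illustrates this dual recall path: exact one-hop traversal over symbolic edges combined with cosine-similarity search over optional embeddings. The similarity threshold and data shapes are assumptions for illustration.

```rust
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

struct Node {
    id: u64,
    embedding: Option<Vec<f32>>,
}

fn recall(query: u64, nodes: &[Node], edges: &[(u64, u64)], min_sim: f32) -> Vec<u64> {
    // Exact, logical associations: follow explicit edges out of the query node.
    let mut hits: Vec<u64> = edges
        .iter()
        .filter(|&&(f, _)| f == query)
        .map(|&(_, t)| t)
        .collect();
    // Graded, similarity-based associations: compare optional embeddings.
    if let Some(q) = nodes.iter().find(|n| n.id == query).and_then(|n| n.embedding.as_ref()) {
        for n in nodes {
            if n.id == query { continue; }
            if let Some(e) = &n.embedding {
                if cosine(q, e) >= min_sim {
                    hits.push(n.id);
                }
            }
        }
    }
    hits.sort_unstable();
    hits.dedup();
    hits
}

fn main() {
    let nodes = vec![
        Node { id: 1, embedding: Some(vec![1.0, 0.0]) },
        Node { id: 2, embedding: Some(vec![0.9, 0.1]) },
        Node { id: 3, embedding: Some(vec![0.0, 1.0]) },
    ];
    let edges = vec![(1, 3)];
    println!("{:?}", recall(1, &nodes, &edges, 0.8)); // [2, 3]
}
```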

Roy, D. S., Park, Y. G., Kim, M. E., Ogawa, S. K., … Tonegawa, S. (2022). Brain-wide mapping reveals that engrams for a single memory are distributed across multiple brain regions. Nature Communications, 13, 1799.

This study provides empirical evidence for the “unified engram complex” hypothesis: a single memory is not stored in one localized spot, but rather across a distributed network of neurons in different brain regions. Using a mouse fear-conditioning paradigm, the researchers identified ensembles of cells (engrams) activated by a memory in 247 disparate brain areas, all functionally connected as a memory-specific network. Reactivating multiple parts of this distributed ensemble produced a more complete memory recall than stimulating any single region. This finding vindicates a core assumption behind Arvora's design: that memory (even a single event) benefits from being represented as a network of components rather than a monolithic chunk.

In Arvora, any given “memory” – say an experience or a knowledge item – can be composed of multiple nodes (facts, context elements, sensory data) connected by edges into a subgraph. The complete recollection of that memory involves traversing and integrating that subgraph. If only one piece is retrieved, the memory is partial; when the whole network is activated, the memory is richer, mirroring the study's result that multi-ensemble reactivation yields stronger recall. Additionally, Arvora's content-addressable scheme naturally supports distributed storage: the same node (e.g., a concept like “meeting” or an entity like “Alice”) can be part of many memory subgraphs without duplication, much as a neuron can participate in multiple engrams. The distributed engram mapping also underscores why Arvora remains storage-agnostic and in-memory – a memory network might span many elements, and having them instantly accessible in one unified address space (memory) avoids the latency of piecemeal retrieval from a database. In summary, Roy et al.'s discovery that memory is inherently graph-distributed across the brain lends biological credibility to Arvora's graph-of-nodes approach to representing knowledge.
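A small sketch of subgraph-based recall, where one memory is reconstructed by walking outward from its root node and a shared node can sit inside several memories without duplication; the identifiers here are illustrative.

```rust
use std::collections::{HashSet, VecDeque};

// Breadth-first collection of everything reachable from a memory's root node.
fn recall_subgraph(root: u64, edges: &[(u64, u64)]) -> HashSet<u64> {
    let mut seen = HashSet::from([root]);
    let mut queue = VecDeque::from([root]);
    while let Some(current) = queue.pop_front() {
        for &(from, to) in edges {
            if from == current && seen.insert(to) {
                queue.push_back(to); // pull in every connected piece of the memory
            }
        }
    }
    seen
}

fn main() {
    // Node 10 ("Alice") is shared by two distinct memory subgraphs rooted at 1 and 2.
    let edges = vec![(1, 10), (1, 11), (2, 10), (2, 12)];
    println!("memory 1: {:?}", recall_subgraph(1, &edges));
    println!("memory 2: {:?}", recall_subgraph(2, &edges));
}
```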

Abdou, K., Shehata, M., Choko, K., Nishizono, H., Matsuo, M., Muramatsu, S., & Inokuchi, K. (2018). Synapse-specific representation of the identity of overlapping memory engrams. Science, 360(6394), 1227–1231.

Abdou et al. investigate how different memories can overlap in the brain's storage without losing their individual identities. They show that when two fear memories share some of the same neuron population, those memories remain distinct by virtue of synapse-specific plasticity. In their experiments, completely erasing one memory (through induced amnesia) did not erase a second memory that was linked and shared the neural ensemble – the second memory could still be recalled independently. Moreover, by potentiating or depotentiating only the synapses encoding one particular association, they could selectively affect the recall of that target memory without disrupting the other. This suggests that the “address” of a memory in the brain is a combination of which cells are involved and the specific synaptic strengths that encode that memory's unique pattern. The analogy in Arvora is the content-addressed node and edge model.

If two knowledge entries in Arvora share some nodes (i.e. overlapping content or context), they can still be distinguished by the unique set of edges and connections each entry uses. The node is like a reused neuron, and the edge (with its particular provenance or metadata) is like a synapse tagged to a specific memory. Because Arvora's edges can carry identifiers or hashes, even overlapping subgraphs can be disentangled: each relationship is an immutable object that “belongs” to a particular snapshot or context. This design guarantees that updating or removing one set of relations (akin to weakening synapses for one memory) won't inadvertently collapse another memory that merely shared some nodes. In practice, this means Arvora supports efficient re-use of knowledge (avoiding duplication of identical nodes) while preserving the integrity of individual memory traces. The paper's conclusion – that shared physical substrates can host multiple memories thanks to connection-specific encoding – provides a neuroscience blueprint for Arvora's approach to memory de-duplication and isolation via content hashes and immutable links.
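Sketched below, under the assumption that each edge carries a memory identifier as provenance: retiring one memory's associations leaves another memory that shares the same nodes untouched.

```rust
struct Edge {
    from: u64,
    to: u64,
    memory_id: u64, // provenance: which memory trace this association encodes
}

// Drop only the edges belonging to one memory; shared nodes and the other
// memory's edges are unaffected.
fn forget_memory(edges: Vec<Edge>, memory_id: u64) -> Vec<Edge> {
    edges.into_iter().filter(|e| e.memory_id != memory_id).collect()
}

fn main() {
    // Node 10 is shared by memories A (id 1) and B (id 2), like an overlapping engram cell.
    let edges = vec![
        Edge { from: 10, to: 11, memory_id: 1 },
        Edge { from: 10, to: 12, memory_id: 2 },
    ];
    let remaining = forget_memory(edges, 1); // retire memory A's associations only
    assert!(remaining.iter().all(|e| e.memory_id == 2)); // memory B stays intact
    println!("remaining edges: {}", remaining.len());
}
```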

Ekman, M., Kusch, S., & de Lange, F. P. (2023). Successor-like representation guides the prediction of future events in human visual cortex and hippocampus. eLife, 12, e78904.

Ekman and colleagues present evidence that the brain encodes experiences in a manner consistent with the successor representation from reinforcement learning – essentially a predictive map of future states. After people learned a fixed sequence of visual stimuli, seeing one item (e.g., B in a sequence A→B→C→D) would activate not only its representation but also the neural representations of upcoming items (C and D) in visual cortex, whereas preceding items (A) were not activated. The hippocampus similarly showed activity patterns reflecting how far in the sequence events were, with a gradient fading for more distant future or past elements. This suggests that the brain stores not just direct links but also a latent vector encoding of the transition structure – effectively, each state contains an imprint of likely subsequent states.

Arvora's design accounts for this kind of information by allowing optional embeddings on nodes or edges. In addition to explicit graph links (which correspond to direct connections like A→B), Arvora can store a vector representation for each node that captures its context in the broader network (akin to a successor representation embedding). For instance, a node's embedding could be tuned to encode which other nodes are frequently adjacent or sequentially related, enabling quick predictions or recommendations without traversing every edge. The brain’s use of successor-like coding validates the utility of combining symbolic and statistical memory: the symbolic graph provides exact relational knowledge, while embedding vectors summarize the “cloud” of associations in a continuous form. For Arvora, this means an agent can do both logical queries (following specific relationships) and associative retrieval (via embedding similarity) from the same memory structure. By mirroring the brain's predictive encoding – where current context points to probable futures – Arvora can support anticipatory reasoning. An AI agent using Arvora could, for example, take a current node (situation) and, via vector similarity or connected node traversal, infer likely next steps or relevant continuations, much as human memory biases us towards what usually comes next.
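A toy sketch of a successor-representation style embedding, learned from observed transitions with a standard TD-style update and then used to rank likely next nodes; the learning rate, discount, and dense-matrix layout are illustrative assumptions.

```rust
const N: usize = 4;       // number of nodes in this toy graph
const ALPHA: f32 = 0.1;   // learning rate
const GAMMA: f32 = 0.9;   // discount: how far ahead the predictive map looks

// TD-style update after observing the transition s -> s_next:
// M[s] <- M[s] + alpha * (one_hot(s) + gamma * M[s_next] - M[s])
fn update_sr(m: &mut [[f32; N]; N], s: usize, s_next: usize) {
    for j in 0..N {
        let indicator = if j == s { 1.0 } else { 0.0 };
        let target = indicator + GAMMA * m[s_next][j];
        m[s][j] += ALPHA * (target - m[s][j]);
    }
}

// Predict likely upcoming nodes by ranking the current node's embedding row.
fn likely_next(m: &[[f32; N]; N], s: usize) -> Vec<(usize, f32)> {
    let mut ranked: Vec<(usize, f32)> = (0..N).filter(|&j| j != s).map(|j| (j, m[s][j])).collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    ranked
}

fn main() {
    let mut m = [[0.0f32; N]; N];
    // Replay the fixed sequence A(0) -> B(1) -> C(2) -> D(3) many times.
    for _ in 0..200 {
        for (s, s_next) in [(0, 1), (1, 2), (2, 3)] {
            update_sr(&mut m, s, s_next);
        }
    }
    // Seeing B should now point forward to C rather than back to A.
    println!("from B: {:?}", likely_next(&m, 1));
}
```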