# Neuro-Symbolic Brain Architecture Research

**Date:** 2026-03-19  
**Context:** Varij's question about Gemini's suggestion to run a localized neural net versus an OpenMemory-like vector store approach

---

## The Article's Key Insight

The TDS article demonstrates **differentiable rule learning** — a neural network that:
1. Learns to classify (fraud detection)
2. Simultaneously extracts human-readable IF-THEN rules
3. Discovers domain-critical features (V14, one of the dataset's anonymized features) **without being told**

This is neuro-symbolic AI: statistical learning + symbolic reasoning in one system.
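
The article's exact code isn't reproduced here, but the mechanism is easy to sketch: make each IF-THEN threshold a learnable parameter and soften the hard comparison with a sigmoid so gradients can flow. A minimal PyTorch sketch of one such rule (all names illustrative, not the article's implementation):

```python
import torch
import torch.nn as nn

class DifferentiableRule(nn.Module):
    """One learnable IF-THEN rule: IF feature > threshold THEN add score."""

    def __init__(self, feature_idx: int, feature_name: str):
        super().__init__()
        self.feature_idx = feature_idx
        self.feature_name = feature_name
        self.threshold = nn.Parameter(torch.zeros(1))  # learned cut-point
        self.weight = nn.Parameter(torch.randn(1))     # learned rule strength

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sigmoid is a soft, differentiable stand-in for the hard IF;
        # the factor 10 sharpens it toward a step function.
        fires = torch.sigmoid(10.0 * (x[:, self.feature_idx] - self.threshold))
        return self.weight * fires

    def as_text(self) -> str:
        return (f"IF {self.feature_name} > {self.threshold.item():.3f} "
                f"THEN fraud_score += {self.weight.item():.3f}")
```

Training a stack of these with an ordinary classification loss yields both predictions and, via `as_text()`, readable rules keyed on whatever features turn out to matter.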

---

## The Core Question: OpenMemory vs Neural Learning

### OpenMemory/Mem0 Approach (What we partially built)
```
User Input → Embedding → Vector Store → Similarity Search → Retrieved Context → LLM
```
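
In code, that pipeline is a single, stateless forward pass. A minimal sketch (`embed` and `store` are placeholders for any embedding model and vector store, not the OpenMemory API):

```python
import numpy as np

def retrieve(query: str, embed, store, top_k: int = 5) -> list[str]:
    """Static RAG step: embed the query, rank stored vectors by cosine
    similarity, return the top documents as LLM context."""
    q = embed(query)                              # fixed embedding model
    norms = np.linalg.norm(store.vectors, axis=1) * np.linalg.norm(q)
    sims = store.vectors @ q / norms              # cosine similarity
    top = np.argsort(-sims)[:top_k]               # best matches only
    return [store.texts[i] for i in top]          # nothing here ever learns
```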

**Limitations:**
- **Static embeddings**: Doesn't learn from usage patterns
- **No relationship learning**: Flat similarity, no graph reasoning
- **Manual taxonomy**: You define what matters upfront
- **Retrieval-only**: Doesn't discover new patterns, just retrieves existing ones
- **Context window constraints**: RAG still bounded by what you retrieve

### Neural Learning Approach (What Gemini suggested)
```
User Input → Embedding → Neural Network (GNN/Transformer) → Learned Representations
                              ↑                                      │
                              └─────── Continuous Learning ─────────┘
```

**Advantages:**
- **Learns from every query**: System improves automatically (sketched below)
- **Discovers patterns**: Like V14 in fraud detection — finds what matters
- **Graph relationships**: Understands connections, not just similarities
- **Self-tuning**: Adapts to your workload without manual optimization
- **Transfer learning**: Knowledge from one domain bootstraps another
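
The key difference from the static pipeline above is that retrieval outcomes become gradients. A hedged sketch of one online update, where the relevance `model` and its optimizer are hypothetical components of the learning layer:

```python
import torch
import torch.nn.functional as F

def feedback_step(model, optimizer, query_vec, doc_vec, helpful: bool) -> float:
    """One online update: pull helpful documents toward the query,
    push unhelpful ones away. `model` scores (query, doc) relevance."""
    score = model(query_vec, doc_vec)                 # learned relevance logit
    target = torch.tensor([1.0 if helpful else 0.0])
    loss = F.binary_cross_entropy_with_logits(score, target)
    optimizer.zero_grad()
    loss.backward()                                   # gradient from usage
    optimizer.step()                                  # the system just improved
    return loss.item()
```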

---

## Why Gemini Was Right

The article proves the core principle: **neural networks can discover rules you didn't know to look for**.

In our context:
- We're building a knowledge brain with 213K+ documents
- We want agents to learn from interactions
- We want the system to surface insights, not just retrieve them
- We want continuous improvement without manual curation

A pure vector store (OpenMemory-like) requires YOU to:
- Define the embedding strategy
- Manually update knowledge
- Curate relationships
- Hope similarity search finds relevant content

A self-learning neural architecture:
- Learns which documents relate to which
- Discovers patterns across conversations
- Improves retrieval based on actual usage
- Surfaces insights you didn't search for

---

## Recommended Architecture: Hybrid Neuro-Symbolic Brain

### Layer 1: Storage (What we have)
- **PostgreSQL + pgvector**: Document embeddings, hybrid BM25+vector search (sketched below)
- **Neo4j**: Entity relationships, knowledge graph
- **DragonflyDB**: Hot cache, session context
- **SurrealDB**: Flexible document storage
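
For reference, the hybrid search in the first bullet reduces to fusing two rankings. A sketch using psycopg and pgvector's `<=>` cosine-distance operator, with Postgres `ts_rank` standing in for true BM25; the `documents(id, embedding, tsv)` schema is an assumption:

```python
import psycopg  # psycopg 3

# 50/50 fusion of vector similarity and full-text rank (weights are arbitrary).
HYBRID_SQL = """
SELECT id,
       0.5 * (1 - (embedding <=> %(qvec)s::vector))
     + 0.5 * ts_rank(tsv, plainto_tsquery('english', %(q)s)) AS score
FROM documents
ORDER BY score DESC
LIMIT 10;
"""

def hybrid_search(conn: psycopg.Connection, query: str, qvec: list[float]):
    """Run one fused vector + full-text query against the assumed schema."""
    qvec_str = "[" + ",".join(str(v) for v in qvec) + "]"  # pgvector literal
    with conn.cursor() as cur:
        cur.execute(HYBRID_SQL, {"q": query, "qvec": qvec_str})
        return cur.fetchall()
```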

### Layer 2: Self-Learning Neural Layer (What we should add)
```
┌─────────────────────────────────────────────────────────────────┐
│                    SELF-LEARNING BRAIN                          │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐ │
│  │   GNN       │    │ Attention   │    │ Rule Extractor      │ │
│  │ (Relations) │ ←→ │ (Context)   │ ←→ │ (Symbolic Logic)    │ │
│  └─────────────┘    └─────────────┘    └─────────────────────┘ │
│         ↑                  ↑                     ↑              │
│         └──────────── Continuous Learning ───────┘              │
│                    (from every agent interaction)               │
└─────────────────────────────────────────────────────────────────┘
                              ↓
           ┌──────────────────┴──────────────────┐
           │        Agent Interactions           │
           │  • Queries improve search ranking   │
           │  • Feedback strengthens connections │
           │  • Usage patterns tune embeddings   │
           └─────────────────────────────────────┘
```

### Layer 3: Agent Integration
- Agents are "smaller brains" that interact with the central neural net
- Each agent interaction is a training signal (schema sketched below)
- The neural net learns what agents find useful
- Knowledge flows: Agents → Brain → Improved Context → Better Agent Responses
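
One way to make "each agent interaction is a training signal" concrete is to pin down the signal itself. A hypothetical schema; nothing like this exists yet:

```python
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    HELPFUL = 1        # agent used the retrieved context
    NOT_HELPFUL = -1   # agent discarded it
    NEUTRAL = 0        # no explicit feedback

@dataclass
class TrainingSignal:
    """What an agent reports back to the brain after each retrieval."""
    agent_id: str
    query: str
    retrieved_doc_ids: list[str]
    outcome: Outcome
    timestamp: float   # unix time, so old signals can be decayed
```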

---

## Self-Hosted Options

### 1. RuVector (Most Promising)
**Repo:** https://github.com/ruvnet/ruvector

- **Self-learning**: GNN learns from every query
- **Graph + Vector**: Full Cypher engine + vector similarity
- **Local LLMs**: Runs models on-device (Metal, CUDA, WebGPU)
- **PostgreSQL drop-in**: 230+ SQL functions
- **Single binary**: Ships as one file
- **MIT License**: Free forever

This is exactly what Gemini was suggesting — a system that learns from usage.

### 2. PyTorch Geometric + Custom Training Loop
**Repo:** https://github.com/pyg-team/pytorch_geometric

- Build custom GNN for knowledge graph
- Train on document relationships
- Continuous learning from agent feedback
- More work but maximum flexibility
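
A minimal PyTorch Geometric model for that custom GNN might look like the following; two GCN layers refine document embeddings using graph structure, with graph construction and the training loop omitted:

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class DocGNN(torch.nn.Module):
    """Two-layer GCN: refines document embeddings via graph structure."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        # x: [num_docs, in_dim] initial embeddings (e.g., from pgvector)
        # edge_index: [2, num_edges] relationship edges (e.g., from Neo4j)
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)  # relation-aware embeddings
```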

### 3. Deep Graph Library (DGL)
**Website:** https://www.dgl.ai/

- Industry-standard GNN library
- Supports PyTorch, TensorFlow, MXNet
- Good for production graph learning

---

## Implementation Plan

### Phase 1: Foundation (Already Done)
- ✅ PostgreSQL + pgvector (embeddings)
- ✅ Neo4j (knowledge graph)
- ✅ Document ingestion pipeline
- ✅ Basic hybrid search

### Phase 2: Self-Learning Layer (NEW)
1. **Deploy RuVector** as the learning layer
   - Replace pure pgvector similarity with RuVector's GNN
   - Let it learn from query patterns
   
2. **Training Loop**
   - Every agent query = training signal
   - Every successful retrieval = positive feedback
   - Every "not helpful" = negative feedback
   
3. **Rule Extraction**
   - Like the fraud detection article
   - System learns patterns: "When searching for X, documents A, B, C are always useful" (sketched below)
   - Surfaces these as discoverable rules
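
A toy sketch of that rule-extraction step: mine co-occurrence between query topics and helpful documents from the feedback log and emit anything above a confidence threshold as a readable rule. The real system would learn these differentiably, as in the article; this counting version only illustrates the output:

```python
from collections import defaultdict

def extract_rules(log, min_support=20, min_confidence=0.9):
    """log: iterable of (query_topic, doc_id, helpful: bool) records."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for topic, doc_id, helpful in log:
        totals[(topic, doc_id)] += 1
        if helpful:
            hits[(topic, doc_id)] += 1
    rules = []
    for (topic, doc_id), total in totals.items():
        conf = hits[(topic, doc_id)] / total
        if total >= min_support and conf >= min_confidence:
            rules.append(f"WHEN query is about '{topic}' "
                         f"THEN retrieve {doc_id} (confidence {conf:.0%})")
    return rules
```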

### Phase 3: Agent Integration
1. **Agents as Learners**
   - Each agent interaction updates the neural net
   - Agents don't just query — they teach
   
2. **Federated Learning** (Optional)
   - Multiple agent nodes contribute to central brain
   - Knowledge aggregates across the fleet
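
If the optional federated step is pursued, the classic FedAvg aggregation is one line per parameter: each agent node trains locally and ships its weights, and the brain averages them. A sketch that ignores weighting by per-node data volume:

```python
import torch

def federated_average(state_dicts: list[dict]) -> dict:
    """FedAvg: average model parameters from multiple agent nodes."""
    return {
        key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        for key in state_dicts[0]
    }
```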

### Phase 4: Continuous Evolution
- System improves 24/7
- No manual curation needed
- Discovers insights from patterns
- Self-optimizes retrieval strategies

---

## Why NOT Pure OpenMemory/Mem0

| Aspect | OpenMemory | Neural Brain |
|--------|------------|--------------|
| Learning | Manual curation | Automatic from usage |
| Relationships | Flat similarity | Graph + learned edges |
| Discovery | Only what you search | Surfaces unknown patterns |
| Adaptation | Requires re-embedding | Continuous online learning |
| Scale | Linear with data | Sublinear with GNN optimization |
| Future-proof | Static architecture | Self-improving |

---

## Key Takeaway

**The article proves the principle. Gemini was right.**

A neural network can:
1. Learn from data without being told what matters
2. Discover domain-critical patterns (like V14)
3. Express learned knowledge as readable rules
4. Continuously improve with each interaction

For the knowledge brain, this means:
- Don't just store and retrieve (OpenMemory)
- Let the system LEARN what's valuable
- Agents become teachers, not just consumers
- The brain gets smarter every day

---

## Next Steps

1. **Evaluate RuVector** — Deploy locally on 10.11.12.105, test with existing docs
2. **Design Training Signal API** — How agents report useful/not useful retrievals (sketched below)
3. **Build Feedback Loop** — Every agent interaction trains the model
4. **Extract Rules** — Surface discovered patterns as readable insights
5. **Monitor & Iterate** — Track improvements over time
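
For step 2, the Training Signal API could start as a single endpoint that accepts the schema sketched in Layer 3 and enqueues it for the feedback loop. A hypothetical FastAPI version:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Feedback(BaseModel):
    agent_id: str
    query: str
    retrieved_doc_ids: list[str]
    helpful: bool

@app.post("/feedback")
async def report_feedback(signal: Feedback):
    # In the real system this would enqueue a training step (Phase 2);
    # here it only acknowledges receipt.
    return {"status": "queued", "docs": len(signal.retrieved_doc_ids)}
```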

---

*This is the architecture that won't become obsolete. It's self-improving by design.*
