Open Notebook Architecture

System Overview

Open Notebook is built as a modern web application with a clear separation between the Next.js frontend and the Python (FastAPI) backend, deployed with Docker Compose.

┌──────────────────────────────────────────────────────┐
│                    Docker Compose                    │
│                                                      │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────┐  │
│  │   Next.js    │  │   FastAPI    │  │ SurrealDB  │  │
│  │   Frontend   │──│   Backend    │──│            │  │
│  │ (port 8502)  │  │ (port 5055)  │  │(port 8000) │  │
│  └──────────────┘  └──────────────┘  └────────────┘  │
│                           │                          │
│                     ┌─────┴─────┐                    │
│                     │ LangChain │                    │
│                     │ Esperanto │                    │
│                     └─────┬─────┘                    │
│                           │                          │
│               ┌───────────┼───────────┐              │
│               │           │           │              │
│           ┌───┴───┐   ┌───┴───┐   ┌───┴───┐          │
│           │OpenAI │   │Claude │   │Ollama │  ...     │
│           └───────┘   └───────┘   └───────┘          │
└──────────────────────────────────────────────────────┘

Core Components

FastAPI Backend

The REST API is built with FastAPI and organized into routers:

  • 20 route modules covering notebooks, sources, notes, chat, search, podcasts, transformations, models, credentials, embeddings, settings, and more
  • Async/await throughout for non-blocking I/O
  • Pydantic models for request/response validation
  • Custom exception handlers mapping domain errors to HTTP status codes
  • CORS middleware for cross-origin access
  • Optional password authentication middleware
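
As an illustration of the exception-handler pattern above, here is a minimal offline sketch of mapping domain errors to HTTP status codes. The class names and the mapping are hypothetical, not the project's actual identifiers:

```python
class DomainError(Exception):
    """Base class for domain-level failures (illustrative)."""

class NotFoundError(DomainError):
    """Requested record does not exist."""

class InvalidInputError(DomainError):
    """Request failed domain validation."""

# Domain error type -> HTTP status code, mirroring what the custom
# exception handlers described above would register with FastAPI.
STATUS_MAP = {
    NotFoundError: 404,
    InvalidInputError: 422,
}

def to_http_status(exc: DomainError) -> int:
    """Resolve a domain error to an HTTP status, defaulting to 500."""
    return STATUS_MAP.get(type(exc), 500)
```

Keeping the mapping in one table means new domain errors get consistent HTTP semantics without touching each route handler.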

SurrealDB

SurrealDB serves as the primary data store, providing both document and relational capabilities:

  • Document storage for notebooks, sources, notes, transformations, and models
  • Relational references for notebook-source associations
  • Full-text search across indexed content
  • RocksDB backend for persistent storage on disk
  • Schema migrations run automatically on application startup
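
The notebook-source association can be sketched offline: SurrealDB record ids take the `table:id` form, which makes natural record links. The dataclass fields here are illustrative, not the project's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Notebook:
    id: str                      # SurrealDB-style record id, e.g. "notebook:research"
    name: str
    source_ids: list[str] = field(default_factory=list)   # record links to sources

def attach_source(notebook: Notebook, source_id: str) -> None:
    """Associate a source with a notebook (idempotent)."""
    if source_id not in notebook.source_ids:
        notebook.source_ids.append(source_id)

nb = Notebook(id="notebook:demo", name="Demo")
attach_source(nb, "source:abc123")
attach_source(nb, "source:abc123")   # second attach is a no-op
```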

LangChain Integration

AI features are powered by LangChain with the Esperanto multi-provider library:

  • LangGraph manages conversational state for chat sessions
  • Embedding models power vector search across content
  • LLM chains drive transformations, note generation, and podcast scripting
  • Prompt templates stored in the prompts/ directory
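
The vector-search step that embeddings enable can be illustrated with a dependency-free sketch. Real embeddings come from the configured embedding model; `cosine` and `top_k` here are hypothetical helpers, not the project's code:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunk texts whose embeddings are closest to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```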

Esperanto Multi-Provider Library

Esperanto provides a unified interface to 16+ AI providers:

  • Abstracts provider-specific API differences
  • Supports LLM, embedding, speech-to-text, and text-to-speech capabilities
  • Handles credential management and model discovery
  • Enables runtime provider switching without code changes
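
A toy version of the runtime-switching idea: one interface, a registry of providers keyed by name. The `LanguageModel` base class and `EchoProvider` are stand-ins for this sketch, not Esperanto's actual API:

```python
from abc import ABC, abstractmethod

class LanguageModel(ABC):
    """Unified chat interface, in the spirit of Esperanto's abstraction."""
    @abstractmethod
    def chat(self, prompt: str) -> str: ...

class EchoProvider(LanguageModel):
    """Offline stand-in for a real provider adapter."""
    def __init__(self, name: str):
        self.name = name
    def chat(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"

# Registry of provider factories; swapping providers is a config change.
PROVIDERS = {
    "openai": lambda: EchoProvider("openai"),
    "anthropic": lambda: EchoProvider("anthropic"),
    "ollama": lambda: EchoProvider("ollama"),
}

def get_model(provider: str) -> LanguageModel:
    """Resolve a provider name to a model at runtime, no code changes."""
    return PROVIDERS[provider]()
```

Callers depend only on `LanguageModel.chat`, so switching from a hosted provider to a local Ollama model is a lookup-key change.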

Next.js Frontend

The user interface is a React application built with Next.js:

  • Responsive design for desktop and tablet use
  • Real-time updates for chat and processing status
  • File upload with progress tracking
  • Audio player for podcast episodes

Data Flow

Source Ingestion

Upload/URL → Source Record Created → Processing Queue
                                         │
                              ┌──────────┼──────────┐
                              ▼          ▼          ▼
                          Text       Embedding   Metadata
                        Extraction   Generation  Extraction
                              │          │          │
                              └──────────┼──────────┘
                                         ▼
                                  Source Updated
                                  (searchable)
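
The fan-out above can be sketched with `asyncio.gather`; the three step functions here are trivial stand-ins for the real extraction, embedding, and metadata code:

```python
import asyncio

async def extract_text(raw: str) -> str:
    return raw.strip()

async def generate_embedding(text: str) -> list[float]:
    # Stand-in: real embeddings come from the configured embedding model.
    return [float(len(text))]

async def extract_metadata(raw: str) -> dict:
    return {"chars": len(raw)}

async def ingest(raw: str) -> dict:
    """Run the three processing steps concurrently, then mark searchable."""
    text, embedding, meta = await asyncio.gather(
        extract_text(raw),
        generate_embedding(raw.strip()),
        extract_metadata(raw),
    )
    return {"full_text": text, "embedding": embedding,
            "metadata": meta, "status": "searchable"}

record = asyncio.run(ingest("  Some uploaded document  "))
```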

Chat Execution

User Message → Build Context (sources + notes)
                    │
                    ▼
              LangGraph State Machine
                    │
                    ├─ Retrieve relevant context
                    ├─ Format prompt with citations
                    └─ Stream LLM response
                         │
                         ▼
                   Response with
                   source citations
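
The retrieve-and-cite steps can be sketched without LangGraph; `build_context` here uses naive word overlap in place of real embedding retrieval, and the prompt layout is illustrative:

```python
def build_context(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by shared words with the query (retrieval stand-in)."""
    words = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def format_prompt(query: str, context: list[str]) -> str:
    """Number each context chunk so the model can cite [1], [2], ..."""
    cited = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, 1))
    return f"Context:\n{cited}\n\nQuestion: {query}"
```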

Podcast Generation

Notebook Content → Episode Profile → Script Generation (LLM)
                                          │
                                          ▼
                                    Speaker Assignment
                                          │
                                          ▼
                                    Text-to-Speech
                                    (per segment)
                                          │
                                          ▼
                                    Audio Assembly
                                          │
                                          ▼
                                    Episode Record
                                    + Audio File
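
A minimal offline sketch of the per-segment pipeline: parse speaker-labeled script lines, synthesize each segment, concatenate. `synthesize` stands in for the real TTS call and returns tagged bytes instead of audio:

```python
def split_script(script: str) -> list[tuple[str, str]]:
    """Parse 'Speaker: line' rows into (speaker, text) segments."""
    segments = []
    for line in script.splitlines():
        if ":" in line:
            speaker, text = line.split(":", 1)
            segments.append((speaker.strip(), text.strip()))
    return segments

def synthesize(speaker: str, text: str) -> bytes:
    # Stand-in for a per-segment TTS call through the configured provider.
    return f"<{speaker}>{text}</{speaker}>".encode()

def assemble(script: str) -> bytes:
    """'TTS' each segment, then concatenate into one payload."""
    return b"".join(synthesize(s, t) for s, t in split_script(script))
```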

Key Design Decisions

  1. Multi-provider by default: Not locked to any single AI provider, enabling cost optimization and capability matching
  2. Async processing: Long-running operations (source ingestion, podcast generation) run asynchronously with status polling
  3. Self-hosted data: All data stays on the user's infrastructure with encrypted credential storage
  4. REST-first API: Every UI action is backed by an API endpoint for automation
  5. Docker-native: Designed for containerized deployment with persistent volumes
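
Decision 2 implies clients poll a job's status until it settles. A generic sketch of that loop; in a real client the `fetch_status` callable would wrap a GET against the job's status endpoint (endpoint path and status values are assumptions here):

```python
import time

def poll_until_done(fetch_status, interval: float = 0.0, max_attempts: int = 50) -> str:
    """Poll a status callable until the job reaches a terminal state."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status in ("done", "failed"):
            return status
        time.sleep(interval)          # back off between polls
    raise TimeoutError("job did not finish in time")
```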

File Structure

open-notebook/
├── api/               # FastAPI REST API
│   ├── main.py        # App setup, middleware, routers
│   ├── routers/       # Route handlers (20 modules)
│   ├── models.py      # Pydantic request/response models
│   └── auth.py        # Authentication middleware
├── open_notebook/     # Core library
│   ├── ai/            # AI integration (LangChain, Esperanto)
│   ├── database/      # SurrealDB operations
│   ├── domain/        # Domain models and business logic
│   ├── graphs/        # LangGraph chat and processing graphs
│   ├── podcasts/      # Podcast generation pipeline
│   └── utils/         # Shared utilities
├── frontend/          # Next.js React application
├── prompts/           # AI prompt templates
├── tests/             # Test suite
└── docker-compose.yml # Deployment configuration