Add open-notebook skill: self-hosted NotebookLM alternative (issue #56)

Implements the open-notebook skill as a comprehensive integration for the
open-source, self-hosted alternative to Google NotebookLM. Addresses the
gap created by Google not providing a public NotebookLM API.

Developed using TDD with 44 tests covering skill structure, SKILL.md
frontmatter/content, reference documentation, example scripts, API
endpoint coverage, and marketplace.json registration.

Includes:
- SKILL.md with full documentation, code examples, and provider matrix
- references/api_reference.md covering all 20+ REST API endpoint groups
- references/examples.md with complete research workflow examples
- references/configuration.md with Docker, env vars, and security setup
- references/architecture.md with system design and data flow diagrams
- scripts/ with 3 example scripts (notebook, source, chat) + test suite
- marketplace.json updated to register the new skill

Closes #56

https://claude.ai/code/session_015CqcNWNYmDF9sqxKxziXcz
Claude
2026-02-23 00:18:19 +00:00
parent f7585b7624
commit 259e01f7fd
10 changed files with 2599 additions and 0 deletions


@@ -0,0 +1,163 @@
# Open Notebook Architecture
## System Overview
Open Notebook is built as a modern Python web application with a clear separation between frontend and backend, using Docker for deployment.
```
┌─────────────────────────────────────────────────────┐
│                   Docker Compose                    │
│                                                     │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────┐  │
│  │   Next.js    │  │   FastAPI    │  │ SurrealDB │  │
│  │   Frontend   │──│   Backend    │──│           │  │
│  │ (port 8502)  │  │ (port 5055)  │  │(port 8000)│  │
│  └──────────────┘  └──────────────┘  └───────────┘  │
│                           │                         │
│                     ┌─────┴─────┐                   │
│                     │ LangChain │                   │
│                     │ Esperanto │                   │
│                     └─────┬─────┘                   │
│                           │                         │
│               ┌───────────┼───────────┐             │
│               │           │           │             │
│           ┌───┴───┐   ┌───┴───┐   ┌───┴───┐         │
│           │OpenAI │   │Claude │   │Ollama │ ...     │
│           └───────┘   └───────┘   └───────┘         │
└─────────────────────────────────────────────────────┘
```
## Core Components
### FastAPI Backend
The REST API is built with FastAPI and organized into routers:
- **20 route modules** covering notebooks, sources, notes, chat, search, podcasts, transformations, models, credentials, embeddings, settings, and more
- Async/await throughout for non-blocking I/O
- Pydantic models for request/response validation
- Custom exception handlers mapping domain errors to HTTP status codes
- CORS middleware for cross-origin access
- Optional password authentication middleware
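The exception-handler pattern above can be sketched framework-free. This is a minimal sketch with hypothetical error classes and status codes (Open Notebook's actual class names and mappings may differ); in FastAPI, a table like this would feed `app.add_exception_handler`.

```python
# Sketch of mapping domain errors to HTTP status codes, kept independent
# of any web framework. All class names and codes are illustrative
# assumptions, not Open Notebook's actual exception hierarchy.

class DomainError(Exception):
    """Base class for errors raised by the core library (hypothetical)."""

class NotebookNotFoundError(DomainError):
    """A requested notebook record does not exist (hypothetical)."""

class InvalidSourceError(DomainError):
    """A source failed validation (hypothetical)."""

# One central table deciding which HTTP status each domain error maps to.
STATUS_BY_ERROR = {
    NotebookNotFoundError: 404,
    InvalidSourceError: 422,
    DomainError: 500,  # fallback for unclassified domain errors
}

def http_status_for(exc: Exception) -> int:
    """Walk the exception's MRO so subclasses inherit their parent's status."""
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls]
    return 500  # anything outside the domain hierarchy is a server error
```

Keeping the mapping in one table means a new domain error only needs one entry here, not a new handler per error.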
### SurrealDB
SurrealDB serves as the primary data store, providing both document and relational capabilities:
- **Document storage** for notebooks, sources, notes, transformations, and models
- **Relational references** for notebook-source associations
- **Full-text search** across indexed content
- **RocksDB** backend for persistent storage on disk
- Schema migrations run automatically on application startup
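Associations like notebook-to-source are typically expressed with SurrealQL's `RELATE` statement, which creates a graph edge between two records. A minimal sketch of query builders follows; the table and edge names (`notebook`, `source`, `references`) are assumptions, not taken from Open Notebook's actual schema.

```python
# Illustrative SurrealQL builders for a notebook-source association.
# RELATE creates a graph edge between two records; the schema names
# here are assumptions for illustration only.

def relate_source(notebook_id: str, source_id: str, edge: str = "references") -> str:
    """Build a RELATE statement linking a notebook to a source."""
    return f"RELATE notebook:{notebook_id}->{edge}->source:{source_id};"

def sources_for_notebook(notebook_id: str, edge: str = "references") -> str:
    """Build a SELECT that follows the edge from a notebook to its sources."""
    return f"SELECT ->{edge}->source.* FROM notebook:{notebook_id};"
```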
### LangChain Integration
AI features are powered by LangChain with the Esperanto multi-provider library:
- **LangGraph** manages conversational state for chat sessions
- **Embedding models** power vector search across content
- **LLM chains** drive transformations, note generation, and podcast scripting
- **Prompt templates** stored in the `prompts/` directory
### Esperanto Multi-Provider Library
Esperanto provides a unified interface to 16+ AI providers:
- Abstracts provider-specific API differences
- Supports LLM, embedding, speech-to-text, and text-to-speech capabilities
- Handles credential management and model discovery
- Enables runtime provider switching without code changes
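The runtime-switching idea can be illustrated with a small registry sketch. The `Provider` type, registry functions, and provider names below are illustrative assumptions, not Esperanto's actual API.

```python
# Conceptual sketch of multi-provider dispatch: every provider exposes
# the same callable interface, so switching providers is a configuration
# change, not a code change. Not Esperanto's real API.
from typing import Callable, Dict

Provider = Callable[[str], str]  # prompt -> completion

_REGISTRY: Dict[str, Provider] = {}

def register(name: str, provider: Provider) -> None:
    _REGISTRY[name] = provider

def complete(provider_name: str, prompt: str) -> str:
    """Dispatch to whichever provider is currently configured."""
    provider = _REGISTRY.get(provider_name)
    if provider is None:
        raise ValueError(f"unknown provider: {provider_name}")
    return provider(prompt)

# Stub providers standing in for real OpenAI / Ollama clients.
register("openai", lambda prompt: f"[openai] {prompt}")
register("ollama", lambda prompt: f"[ollama] {prompt}")
```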
### Next.js Frontend
The user interface is a React application built with Next.js:
- Responsive design for desktop and tablet use
- Real-time updates for chat and processing status
- File upload with progress tracking
- Audio player for podcast episodes
## Data Flow
### Source Ingestion
```
Upload/URL → Source Record Created → Processing Queue
                                 ┌──────────┼──────────┐
                                 ▼          ▼          ▼
                               Text      Embedding  Metadata
                            Extraction Generation Extraction
                                 │          │          │
                                 └──────────┼──────────┘
                                     Source Updated
                                      (searchable)
```
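The client side of this asynchronous flow is a create-then-poll loop. Below is a sketch: the endpoint path and the `"processing"` status value are assumptions rather than the documented API, and the HTTP transport is injected (e.g. a wrapper around `requests.get`) so the loop stays testable.

```python
# Client-side shape of asynchronous source ingestion: after creating a
# source, poll its record until processing finishes. Endpoint path and
# status values are illustrative assumptions.
from typing import Callable, Dict

def wait_for_source(
    fetch: Callable[[str], Dict],
    source_id: str,
    max_polls: int = 30,
) -> Dict:
    """Poll the source record until it leaves the 'processing' state.

    A real client would sleep between polls; that is omitted here.
    """
    for _ in range(max_polls):
        record = fetch(f"/api/sources/{source_id}")
        if record.get("status") != "processing":
            return record
    raise TimeoutError(f"source {source_id} still processing after {max_polls} polls")
```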
### Chat Execution
```
User Message → Build Context (sources + notes)
               LangGraph State Machine
                 ├─ Retrieve relevant context
                 ├─ Format prompt with citations
                 └─ Stream LLM response
                    Response with
                   source citations
```
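The flow above can be mirrored in plain Python for illustration. The real implementation is a LangGraph state machine with LLM-backed retrieval; every function and field name below is a simplified assumption.

```python
# Plain-Python mirror of the chat flow: retrieve context, format a
# prompt with numbered citations, then ask the model. All names are
# illustrative, not Open Notebook's actual code.
from typing import Callable, Dict, List

def retrieve_context(query: str, sources: List[Dict]) -> List[Dict]:
    """Naive retrieval: keep sources sharing at least one word with the query."""
    words = set(query.lower().split())
    return [s for s in sources if words & set(s["text"].lower().split())]

def format_prompt(query: str, context: List[Dict]) -> str:
    """Inline each source as a numbered citation the model can refer to."""
    cites = "\n".join(f"[{i + 1}] {s['text']}" for i, s in enumerate(context))
    return f"Context:\n{cites}\n\nQuestion: {query}"

def answer(query: str, sources: List[Dict], llm: Callable[[str], str]) -> str:
    """Retrieve -> format -> respond: the three steps in the diagram."""
    return llm(format_prompt(query, retrieve_context(query, sources)))
```

In production the retrieval step would use vector search over embeddings rather than word overlap, and the response would be streamed token by token.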
### Podcast Generation
```
Notebook Content → Episode Profile → Script Generation (LLM)
                                       Speaker Assignment
                                         Text-to-Speech
                                         (per segment)
                                         Audio Assembly
                                         Episode Record
                                          + Audio File
```
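The per-segment synthesis step can be sketched as follows, with the TTS call injected; the `Segment` type and function names are illustrative, not Open Notebook's actual pipeline API.

```python
# Sketch of per-segment podcast assembly: each script line carries a
# speaker, is synthesized independently, and the audio chunks are then
# concatenated in order. `synth` stands in for a real TTS provider call.
from typing import Callable, List, Tuple

Segment = Tuple[str, str]  # (speaker, line of dialogue)

def assemble_episode(
    script: List[Segment],
    synth: Callable[[str, str], bytes],  # (speaker, line) -> audio bytes
) -> bytes:
    """Synthesize every segment with its speaker's voice, then join them."""
    return b"".join(synth(speaker, line) for speaker, line in script)
```

Synthesizing per segment is what allows different voices per speaker and lets failed segments be retried without regenerating the whole episode.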
## Key Design Decisions
1. **Multi-provider by default**: Not locked to any single AI provider, enabling cost optimization and capability matching
2. **Async processing**: Long-running operations (source ingestion, podcast generation) run asynchronously with status polling
3. **Self-hosted data**: All data stays on the user's infrastructure with encrypted credential storage
4. **REST-first API**: Every UI action is backed by an API endpoint for automation
5. **Docker-native**: Designed for containerized deployment with persistent volumes
## File Structure
```
open-notebook/
├── api/ # FastAPI REST API
│ ├── main.py # App setup, middleware, routers
│ ├── routers/ # Route handlers (20 modules)
│ ├── models.py # Pydantic request/response models
│ └── auth.py # Authentication middleware
├── open_notebook/ # Core library
│ ├── ai/ # AI integration (LangChain, Esperanto)
│ ├── database/ # SurrealDB operations
│ ├── domain/ # Domain models and business logic
│ ├── graphs/ # LangGraph chat and processing graphs
│ ├── podcasts/ # Podcast generation pipeline
│ └── utils/ # Shared utilities
├── frontend/ # Next.js React application
├── prompts/ # AI prompt templates
├── tests/ # Test suite
└── docker-compose.yml # Deployment configuration
```