mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-01-26 16:58:56 +08:00
Remove extra md files from markitdown
This commit is contained in:
@@ -1,22 +0,0 @@
MIT License

Copyright (c) Microsoft Corporation.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -1,318 +0,0 @@
|
|||||||
# MarkItDown Installation Guide
|
|
||||||
|
|
||||||
## Prerequisites
|
|
||||||
|
|
||||||
- Python 3.10 or higher
|
|
||||||
- pip package manager
|
|
||||||
- Virtual environment (recommended)
|
|
||||||
|
|
||||||
## Basic Installation
|
|
||||||
|
|
||||||
### Install All Features (Recommended)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip install 'markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
This installs support for all file formats and features.
|
|
||||||
|
|
||||||
### Install Specific Features
|
|
||||||
|
|
||||||
If you only need certain file formats, you can install specific dependencies:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# PDF support only
|
|
||||||
pip install 'markitdown[pdf]'
|
|
||||||
|
|
||||||
# Office documents
|
|
||||||
pip install 'markitdown[docx,pptx,xlsx]'
|
|
||||||
|
|
||||||
# Multiple formats
|
|
||||||
pip install 'markitdown[pdf,docx,pptx,xlsx,audio-transcription]'
|
|
||||||
```
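When scripting installs, the extras can be derived from the file types you expect to convert. The following is a minimal sketch: the `EXTRA_FOR_EXTENSION` mapping and the `pip_requirement` helper are hypothetical names, and the mapping is an assumption based on the Optional Dependencies table in this guide rather than anything exported by markitdown.

```python
# Hypothetical helper: maps file extensions to the markitdown extras named
# in this guide. The mapping is an assumption, not part of markitdown.
EXTRA_FOR_EXTENSION = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".pptx": "pptx",
    ".xlsx": "xlsx",
    ".xls": "xls",
    ".msg": "outlook",
    ".wav": "audio-transcription",
    ".mp3": "audio-transcription",
}


def pip_requirement(extensions):
    """Return a pip requirement string such as 'markitdown[docx,pdf]'."""
    extras = sorted(
        {
            EXTRA_FOR_EXTENSION[ext.lower()]
            for ext in extensions
            if ext.lower() in EXTRA_FOR_EXTENSION
        }
    )
    if not extras:
        # No known extra is needed; fall back to the base package.
        return "markitdown"
    return "markitdown[" + ",".join(extras) + "]"


print(pip_requirement([".pdf", ".DOCX", ".mp3"]))
```

The printed string can be passed to `pip install` in quotes, in the same way as the commands above.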
|
|
||||||
|
|
||||||
### Install from Source
|
|
||||||
|
|
||||||
```bash
|
|
||||||
git clone https://github.com/microsoft/markitdown.git
|
|
||||||
cd markitdown
|
|
||||||
pip install -e 'packages/markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
## Optional Dependencies
|
|
||||||
|
|
||||||
| Feature | Installation | Use Case |
|---------|--------------|----------|
| All formats | `pip install 'markitdown[all]'` | Everything |
| PDF | `pip install 'markitdown[pdf]'` | PDF documents |
| Word | `pip install 'markitdown[docx]'` | DOCX files |
| PowerPoint | `pip install 'markitdown[pptx]'` | PPTX files |
| Excel (new) | `pip install 'markitdown[xlsx]'` | XLSX files |
| Excel (old) | `pip install 'markitdown[xls]'` | XLS files |
| Outlook | `pip install 'markitdown[outlook]'` | MSG files |
| Azure DI | `pip install 'markitdown[az-doc-intel]'` | Enhanced PDF |
| Audio | `pip install 'markitdown[audio-transcription]'` | WAV/MP3 |
| YouTube | `pip install 'markitdown[youtube-transcription]'` | YouTube videos |
|
|
||||||
|
|
||||||
## System Dependencies
|
|
||||||
|
|
||||||
### OCR Support (for scanned documents and images)
|
|
||||||
|
|
||||||
#### macOS
|
|
||||||
```bash
|
|
||||||
brew install tesseract
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Ubuntu/Debian
|
|
||||||
```bash
|
|
||||||
sudo apt-get update
|
|
||||||
sudo apt-get install tesseract-ocr
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Windows
|
|
||||||
Download from: https://github.com/UB-Mannheim/tesseract/wiki
|
|
||||||
|
|
||||||
### Poppler Utils (for advanced PDF operations)
|
|
||||||
|
|
||||||
#### macOS
|
|
||||||
```bash
|
|
||||||
brew install poppler
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Ubuntu/Debian
|
|
||||||
```bash
|
|
||||||
sudo apt-get install poppler-utils
|
|
||||||
```
|
|
||||||
|
|
||||||
## Verification
|
|
||||||
|
|
||||||
Test your installation:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Check that the package can be imported
|
|
||||||
python -c "import markitdown; print('MarkItDown installed successfully')"
|
|
||||||
|
|
||||||
# Test basic conversion
|
|
||||||
echo "Test" > test.txt
|
|
||||||
markitdown test.txt
|
|
||||||
rm test.txt
|
|
||||||
```
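The same check can be run from Python and can also report the installed version. This is a sketch using only the standard library; `installed_version` is a hypothetical helper name, not something markitdown provides.

```python
from importlib import metadata, util


def installed_version(package="markitdown"):
    """Return the installed version string, or None if the package is absent."""
    if util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable but without distribution metadata (e.g. a source checkout).
        return None


version = installed_version()
print(f"markitdown {version}" if version else "markitdown is not installed")
```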
|
|
||||||
|
|
||||||
## Virtual Environment Setup
|
|
||||||
|
|
||||||
### Using venv
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Create virtual environment
|
|
||||||
python -m venv markitdown-env
|
|
||||||
|
|
||||||
# Activate (macOS/Linux)
|
|
||||||
source markitdown-env/bin/activate
|
|
||||||
|
|
||||||
# Activate (Windows)
|
|
||||||
markitdown-env\Scripts\activate
|
|
||||||
|
|
||||||
# Install
|
|
||||||
pip install 'markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
### Using conda
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Create environment
|
|
||||||
conda create -n markitdown python=3.12
|
|
||||||
|
|
||||||
# Activate
|
|
||||||
conda activate markitdown
|
|
||||||
|
|
||||||
# Install
|
|
||||||
pip install 'markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
### Using uv
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Create virtual environment
|
|
||||||
uv venv --python=3.12 .venv
|
|
||||||
|
|
||||||
# Activate
|
|
||||||
source .venv/bin/activate
|
|
||||||
|
|
||||||
# Install
|
|
||||||
uv pip install 'markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
## AI Enhancement Setup (Optional)
|
|
||||||
|
|
||||||
For AI-powered image descriptions using OpenRouter:
|
|
||||||
|
|
||||||
### OpenRouter API
|
|
||||||
|
|
||||||
OpenRouter provides unified access to multiple AI models (GPT-4, Claude, Gemini, etc.) through a single API.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Install the OpenAI SDK (used as the OpenRouter client; skip if it is already installed)
|
|
||||||
pip install openai
|
|
||||||
|
|
||||||
# Get API key from https://openrouter.ai/keys
|
|
||||||
|
|
||||||
# Set API key
|
|
||||||
export OPENROUTER_API_KEY="sk-or-v1-..."
|
|
||||||
|
|
||||||
# Add to shell profile for persistence
|
|
||||||
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.bashrc # Linux
|
|
||||||
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc # macOS
|
|
||||||
```
|
|
||||||
|
|
||||||
**Why OpenRouter?**
|
|
||||||
- Access to 100+ AI models through one API
|
|
||||||
- Choose between GPT-4, Claude, Gemini, and more
|
|
||||||
- Competitive pricing
|
|
||||||
- No vendor lock-in
|
|
||||||
- Simple OpenAI-compatible interface
|
|
||||||
|
|
||||||
**Popular Models for Image Description:**
|
|
||||||
- `anthropic/claude-sonnet-4.5` - **Recommended** - Best for scientific vision
|
|
||||||
- `anthropic/claude-opus-4.5` - Excellent technical analysis
|
|
||||||
- `openai/gpt-4o` - Good vision understanding
|
|
||||||
- `google/gemini-pro-vision` - Cost-effective option
|
|
||||||
|
|
||||||
See https://openrouter.ai/models for complete model list and pricing.
|
|
||||||
|
|
||||||
## Azure Document Intelligence Setup (Optional)
|
|
||||||
|
|
||||||
For enhanced PDF conversion:
|
|
||||||
|
|
||||||
1. Create Azure Document Intelligence resource in Azure Portal
|
|
||||||
2. Get endpoint and key
|
|
||||||
3. Set environment variables:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
export AZURE_DOCUMENT_INTELLIGENCE_KEY="your-key"
|
|
||||||
export AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT="https://your-endpoint.cognitiveservices.azure.com/"
|
|
||||||
```
|
|
||||||
|
|
||||||
## Docker Installation (Alternative)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Clone repository
|
|
||||||
git clone https://github.com/microsoft/markitdown.git
|
|
||||||
cd markitdown
|
|
||||||
|
|
||||||
# Build image
|
|
||||||
docker build -t markitdown:latest .
|
|
||||||
|
|
||||||
# Run
|
|
||||||
docker run --rm -i markitdown:latest < input.pdf > output.md
|
|
||||||
```
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
|
|
||||||
### Import Error
|
|
||||||
```
|
|
||||||
ModuleNotFoundError: No module named 'markitdown'
|
|
||||||
```
|
|
||||||
|
|
||||||
**Solution**: Ensure you're in the correct virtual environment and markitdown is installed:
|
|
||||||
```bash
|
|
||||||
pip install 'markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
### Missing Feature
|
|
||||||
```
|
|
||||||
Error: PDF conversion not supported
|
|
||||||
```
|
|
||||||
|
|
||||||
**Solution**: Install the specific feature:
|
|
||||||
```bash
|
|
||||||
pip install 'markitdown[pdf]'
|
|
||||||
```
|
|
||||||
|
|
||||||
### OCR Not Working
|
|
||||||
|
|
||||||
**Solution**: Install Tesseract OCR (see System Dependencies above)
|
|
||||||
|
|
||||||
### Permission Errors
|
|
||||||
|
|
||||||
**Solution**: Use virtual environment or install with `--user` flag:
|
|
||||||
```bash
|
|
||||||
pip install --user 'markitdown[all]'
|
|
||||||
```
|
|
||||||
|
|
||||||
## Upgrading
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Upgrade to latest version
|
|
||||||
pip install --upgrade 'markitdown[all]'
|
|
||||||
|
|
||||||
# Check version
|
|
||||||
pip show markitdown
|
|
||||||
```
|
|
||||||
|
|
||||||
## Uninstallation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip uninstall markitdown
|
|
||||||
```
|
|
||||||
|
|
||||||
## Next Steps
|
|
||||||
|
|
||||||
After installation:
|
|
||||||
1. Read `QUICK_REFERENCE.md` for basic usage
|
|
||||||
2. See `SKILL.md` for comprehensive guide
|
|
||||||
3. Try example scripts in `scripts/` directory
|
|
||||||
4. Check `assets/example_usage.md` for practical examples
|
|
||||||
|
|
||||||
## Skill Scripts Setup
|
|
||||||
|
|
||||||
To use the skill scripts:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Navigate to scripts directory
|
|
||||||
cd path/to/skills/markitdown/scripts  # replace with the skill's location on your machine
|
|
||||||
|
|
||||||
# Scripts are already executable, just run them
|
|
||||||
python batch_convert.py --help
|
|
||||||
python convert_with_ai.py --help
|
|
||||||
python convert_literature.py --help
|
|
||||||
```
|
|
||||||
|
|
||||||
## Testing Installation
|
|
||||||
|
|
||||||
Create a test file to verify everything works:
|
|
||||||
|
|
||||||
```python
# test_markitdown.py
import os

from markitdown import MarkItDown


def test_basic():
    md = MarkItDown()
    # Create a simple test file
    with open("test.txt", "w") as f:
        f.write("Hello MarkItDown!")

    # Convert it
    result = md.convert("test.txt")
    print("✓ Basic conversion works")
    print(result.text_content)

    # Cleanup
    os.remove("test.txt")


if __name__ == "__main__":
    test_basic()
```
|
|
||||||
|
|
||||||
Run it:
|
|
||||||
```bash
|
|
||||||
python test_markitdown.py
|
|
||||||
```
|
|
||||||
|
|
||||||
## Getting Help
|
|
||||||
|
|
||||||
- **Documentation**: See `SKILL.md` and `README.md`
|
|
||||||
- **GitHub Issues**: https://github.com/microsoft/markitdown/issues
|
|
||||||
- **Examples**: `assets/example_usage.md`
|
|
||||||
- **API Reference**: `references/api_reference.md`
|
|
||||||
|
|
||||||
@@ -1,359 +0,0 @@
|
|||||||
# OpenRouter Integration for MarkItDown
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
This MarkItDown skill has been configured to use **OpenRouter** instead of direct OpenAI API access. OpenRouter provides a unified API gateway to access 100+ AI models from different providers through a single, OpenAI-compatible interface.
|
|
||||||
|
|
||||||
## Why OpenRouter?
|
|
||||||
|
|
||||||
### Benefits
|
|
||||||
|
|
||||||
1. **Multiple Model Access**: Access GPT-4, Claude, Gemini, and 100+ other models through one API
|
|
||||||
2. **No Vendor Lock-in**: Switch between models without code changes
|
|
||||||
3. **Competitive Pricing**: Often better rates than going direct
|
|
||||||
4. **Simple Migration**: OpenAI-compatible API means minimal code changes
|
|
||||||
5. **Flexible Choice**: Choose the best model for each task
|
|
||||||
|
|
||||||
### Popular Models for Image Description
|
|
||||||
|
|
||||||
| Model | Provider | Use Case | Vision Support |
|-------|----------|----------|----------------|
| `anthropic/claude-sonnet-4.5` | Anthropic | **Recommended** - Best overall for scientific analysis | ✅ |
| `anthropic/claude-opus-4.5` | Anthropic | Excellent technical analysis | ✅ |
| `openai/gpt-4o` | OpenAI | Strong vision understanding | ✅ |
| `openai/gpt-4-vision` | OpenAI | GPT-4 with vision | ✅ |
| `google/gemini-pro-vision` | Google | Cost-effective option | ✅ |
|
|
||||||
|
|
||||||
See https://openrouter.ai/models for the complete list.
|
|
||||||
|
|
||||||
## Getting Started
|
|
||||||
|
|
||||||
### 1. Get an API Key
|
|
||||||
|
|
||||||
1. Visit https://openrouter.ai/keys
|
|
||||||
2. Sign up or log in
|
|
||||||
3. Create a new API key
|
|
||||||
4. Copy the key (starts with `sk-or-v1-...`)
|
|
||||||
|
|
||||||
### 2. Set Environment Variable
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Add to your environment
|
|
||||||
export OPENROUTER_API_KEY="sk-or-v1-..."
|
|
||||||
|
|
||||||
# Make it permanent
|
|
||||||
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc # macOS
|
|
||||||
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.bashrc # Linux
|
|
||||||
|
|
||||||
# Reload shell
|
|
||||||
source ~/.zshrc # or source ~/.bashrc
|
|
||||||
```
|
|
||||||
|
|
||||||
### 3. Use in Python
|
|
||||||
|
|
||||||
```python
|
|
||||||
from markitdown import MarkItDown
|
|
||||||
from openai import OpenAI
|
|
||||||
|
|
||||||
# Initialize OpenRouter client (OpenAI-compatible)
|
|
||||||
client = OpenAI(
|
|
||||||
api_key="your-openrouter-api-key", # or use env var
|
|
||||||
base_url="https://openrouter.ai/api/v1"
|
|
||||||
)
|
|
||||||
|
|
||||||
# Create MarkItDown with AI support
|
|
||||||
md = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5" # Choose your model
|
|
||||||
)
|
|
||||||
|
|
||||||
# Convert with AI-enhanced descriptions
|
|
||||||
result = md.convert("presentation.pptx")
|
|
||||||
print(result.text_content)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Using the Scripts
|
|
||||||
|
|
||||||
All skill scripts have been updated to use OpenRouter:
|
|
||||||
|
|
||||||
### convert_with_ai.py
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Set API key
|
|
||||||
export OPENROUTER_API_KEY="sk-or-v1-..."
|
|
||||||
|
|
||||||
# Convert with default model (advanced vision model)
|
|
||||||
python scripts/convert_with_ai.py paper.pdf output.md --prompt-type scientific
|
|
||||||
|
|
||||||
# Use GPT-4o as alternative
|
|
||||||
python scripts/convert_with_ai.py paper.pdf output.md \
|
|
||||||
--model openai/gpt-4o \
|
|
||||||
--prompt-type scientific
|
|
||||||
|
|
||||||
# Use Gemini Pro Vision (cost-effective)
|
|
||||||
python scripts/convert_with_ai.py slides.pptx output.md \
|
|
||||||
--model google/gemini-pro-vision \
|
|
||||||
--prompt-type presentation
|
|
||||||
|
|
||||||
# List available prompt types
|
|
||||||
python scripts/convert_with_ai.py --list-prompts
|
|
||||||
```
|
|
||||||
|
|
||||||
### Choosing the Right Model
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# For scientific papers - use advanced vision model for technical analysis
|
|
||||||
python scripts/convert_with_ai.py research.pdf output.md \
|
|
||||||
--model anthropic/claude-sonnet-4.5 \
|
|
||||||
--prompt-type scientific
|
|
||||||
|
|
||||||
# For presentations - use advanced vision model
|
|
||||||
python scripts/convert_with_ai.py slides.pptx output.md \
|
|
||||||
--model anthropic/claude-sonnet-4.5 \
|
|
||||||
--prompt-type presentation
|
|
||||||
|
|
||||||
# For data visualizations - use advanced vision model
|
|
||||||
python scripts/convert_with_ai.py charts.pdf output.md \
|
|
||||||
--model anthropic/claude-sonnet-4.5 \
|
|
||||||
--prompt-type data_viz
|
|
||||||
|
|
||||||
# For medical images - use advanced vision model for detailed analysis
|
|
||||||
python scripts/convert_with_ai.py xray.jpg output.md \
|
|
||||||
--model anthropic/claude-sonnet-4.5 \
|
|
||||||
--prompt-type medical
|
|
||||||
```
|
|
||||||
|
|
||||||
## Code Examples
|
|
||||||
|
|
||||||
### Basic Usage
|
|
||||||
|
|
||||||
```python
|
|
||||||
from markitdown import MarkItDown
|
|
||||||
from openai import OpenAI
|
|
||||||
import os
|
|
||||||
|
|
||||||
# Initialize OpenRouter client
|
|
||||||
client = OpenAI(
|
|
||||||
api_key=os.environ.get("OPENROUTER_API_KEY"),
|
|
||||||
base_url="https://openrouter.ai/api/v1"
|
|
||||||
)
|
|
||||||
|
|
||||||
# Use advanced vision model for image descriptions
|
|
||||||
md = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5"
|
|
||||||
)
|
|
||||||
|
|
||||||
result = md.convert("document.pptx")
|
|
||||||
print(result.text_content)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Switching Models Dynamically
|
|
||||||
|
|
||||||
```python
from markitdown import MarkItDown
from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],
    base_url="https://openrouter.ai/api/v1"
)

# Use different models for different file types
def convert_with_best_model(filepath):
    if filepath.endswith('.pdf'):
        # Use advanced vision model for technical PDFs
        md = MarkItDown(
            llm_client=client,
            llm_model="anthropic/claude-sonnet-4.5",
            llm_prompt="Describe scientific figures with technical precision"
        )
    elif filepath.endswith('.pptx'):
        # Use advanced vision model for presentations
        md = MarkItDown(
            llm_client=client,
            llm_model="anthropic/claude-sonnet-4.5",
            llm_prompt="Describe slide content and visual elements"
        )
    else:
        # Use advanced vision model as default
        md = MarkItDown(
            llm_client=client,
            llm_model="anthropic/claude-sonnet-4.5"
        )

    return md.convert(filepath)

# Use it
result = convert_with_best_model("paper.pdf")
```
|
|
||||||
|
|
||||||
### Custom Prompts per Model
|
|
||||||
|
|
||||||
```python
|
|
||||||
from markitdown import MarkItDown
|
|
||||||
from openai import OpenAI
|
|
||||||
|
|
||||||
client = OpenAI(
|
|
||||||
api_key="your-openrouter-api-key",
|
|
||||||
base_url="https://openrouter.ai/api/v1"
|
|
||||||
)
|
|
||||||
|
|
||||||
# Scientific analysis with advanced vision model
|
|
||||||
scientific_prompt = """
|
|
||||||
Analyze this scientific figure. Provide:
|
|
||||||
1. Type of visualization and methodology
|
|
||||||
2. Quantitative data points and trends
|
|
||||||
3. Statistical significance
|
|
||||||
4. Technical interpretation
|
|
||||||
Be precise and use scientific terminology.
|
|
||||||
"""
|
|
||||||
|
|
||||||
md_scientific = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5",
|
|
||||||
llm_prompt=scientific_prompt
|
|
||||||
)
|
|
||||||
|
|
||||||
# Visual analysis with advanced vision model
|
|
||||||
visual_prompt = """
|
|
||||||
Describe this image comprehensively:
|
|
||||||
1. Main visual elements and composition
|
|
||||||
2. Colors, layout, and design
|
|
||||||
3. Text and labels
|
|
||||||
4. Overall message
|
|
||||||
"""
|
|
||||||
|
|
||||||
md_visual = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5",
|
|
||||||
llm_prompt=visual_prompt
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Model Comparison
|
|
||||||
|
|
||||||
### For Scientific Content
|
|
||||||
|
|
||||||
**Recommended: anthropic/claude-sonnet-4.5**
|
|
||||||
- Excellent at technical analysis
|
|
||||||
- Superior reasoning capabilities
|
|
||||||
- Best at understanding scientific figures
|
|
||||||
- Most detailed and accurate explanations
|
|
||||||
- Advanced vision capabilities
|
|
||||||
|
|
||||||
**Alternative: openai/gpt-4o**
|
|
||||||
- Good vision understanding
|
|
||||||
- Fast processing
|
|
||||||
- Good at charts and graphs
|
|
||||||
|
|
||||||
### For Presentations
|
|
||||||
|
|
||||||
**Recommended: anthropic/claude-sonnet-4.5**
|
|
||||||
- Superior vision capabilities
|
|
||||||
- Excellent at understanding slide layouts
|
|
||||||
- Fast and reliable
|
|
||||||
- Best technical comprehension
|
|
||||||
|
|
||||||
### For Cost-Effectiveness
|
|
||||||
|
|
||||||
**Recommended: google/gemini-pro-vision**
|
|
||||||
- Lower cost per request
|
|
||||||
- Good quality
|
|
||||||
- Fast processing
|
|
||||||
|
|
||||||
## Pricing Considerations
|
|
||||||
|
|
||||||
OpenRouter pricing varies by model. Check current rates at https://openrouter.ai/models
|
|
||||||
|
|
||||||
**Tips for Cost Optimization:**
|
|
||||||
1. Use advanced vision models for best quality on complex scientific content
|
|
||||||
2. Use cheaper models (Gemini) for simple images
|
|
||||||
3. Batch process similar content with the same model
|
|
||||||
4. Use appropriate prompts to get better results in fewer retries
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
|
|
||||||
### API Key Issues
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Check if key is set
|
|
||||||
echo $OPENROUTER_API_KEY
|
|
||||||
|
|
||||||
# Should show: sk-or-v1-...
|
|
||||||
# If empty, set it:
|
|
||||||
export OPENROUTER_API_KEY="sk-or-v1-..."
|
|
||||||
```
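The same check can be done from Python before creating the client. This is a minimal sketch: `check_openrouter_key` is a hypothetical helper, and the `sk-or-v1-` prefix test assumes the key format described in this guide.

```python
import os


def check_openrouter_key(env=None):
    """Return (ok, message) describing the state of OPENROUTER_API_KEY."""
    env = os.environ if env is None else env
    key = env.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        return False, "OPENROUTER_API_KEY is not set"
    if not key.startswith("sk-or-v1-"):
        return False, "OPENROUTER_API_KEY does not start with 'sk-or-v1-'"
    # Never print the full key; show only the fixed prefix.
    return True, f"OPENROUTER_API_KEY is set ({key[:9]}...)"


ok, message = check_openrouter_key()
print(message)
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.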
|
|
||||||
|
|
||||||
### Model Not Found
|
|
||||||
|
|
||||||
If you get a "model not found" error, check:
|
|
||||||
1. Model name format: `provider/model-name`
|
|
||||||
2. Model availability: https://openrouter.ai/models
|
|
||||||
3. Vision support: Ensure model supports vision for image description
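The first check in the list above can be automated with a quick format test. This is only a sketch: the regular expression is an assumed approximation of the `provider/model-name` shape, not an official OpenRouter rule, and it cannot tell whether a model actually exists or supports vision.

```python
import re

# Assumed shape of a model ID: "provider/model-name" in lowercase.
MODEL_ID = re.compile(r"[a-z0-9][a-z0-9._-]*/[a-z0-9][a-z0-9._:-]*")


def looks_like_model_id(name):
    """Return True when name has a provider prefix and a model part."""
    return MODEL_ID.fullmatch(name) is not None


for candidate in ("anthropic/claude-sonnet-4.5", "gpt-4o"):
    print(candidate, looks_like_model_id(candidate))
```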
|
|
||||||
|
|
||||||
### Rate Limits
|
|
||||||
|
|
||||||
OpenRouter has rate limits. If you hit them:
|
|
||||||
1. Add delays between requests
|
|
||||||
2. Lower the `--workers` value in the batch processing scripts to reduce concurrent requests
|
|
||||||
3. Consider upgrading your OpenRouter plan
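One way to add delays between requests is exponential backoff. The sketch below is generic and uses only the standard library; `call_with_backoff` is a hypothetical helper, and it retries on any exception, which is an assumption made for brevity. In real use you would likely narrow the `except` clause to the rate-limit error raised by your client.

```python
import random
import time


def call_with_backoff(fn, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Small random jitter avoids many workers retrying in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

A conversion call can then be wrapped as `call_with_backoff(lambda: md.convert("slides.pptx"))`, where `md` is a `MarkItDown` instance created as in the examples above.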
|
|
||||||
|
|
||||||
## Migration Notes
|
|
||||||
|
|
||||||
This skill was updated from direct OpenAI API to OpenRouter. Key changes:
|
|
||||||
|
|
||||||
1. **Environment Variable**: `OPENAI_API_KEY` → `OPENROUTER_API_KEY`
|
|
||||||
2. **Client Initialization**: Added `base_url="https://openrouter.ai/api/v1"`
|
|
||||||
3. **Model Names**: `gpt-4o` → `openai/gpt-4o` (with provider prefix)
|
|
||||||
4. **Script Updates**: All scripts now use OpenRouter by default
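The first three changes can be captured in one small helper. This is a sketch under stated assumptions: `openrouter_settings` and `OPENROUTER_BASE_URL` are hypothetical names, and defaulting the provider prefix to `openai` is an illustrative choice for migrating bare OpenAI model names.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_settings(model, provider="openai", env=None):
    """Return (client keyword arguments, provider-prefixed model name)."""
    env = os.environ if env is None else env
    client_kwargs = {
        "api_key": env.get("OPENROUTER_API_KEY"),
        "base_url": OPENROUTER_BASE_URL,
    }
    # Add the provider prefix only when the name does not already have one.
    model_id = model if "/" in model else f"{provider}/{model}"
    return client_kwargs, model_id


client_kwargs, model_id = openrouter_settings("gpt-4o")
print(model_id)
```

The returned values slot into the pattern used elsewhere in this document: `OpenAI(**client_kwargs)` for the client and `llm_model=model_id` for `MarkItDown`.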
|
|
||||||
|
|
||||||
## Resources
|
|
||||||
|
|
||||||
- **OpenRouter Website**: https://openrouter.ai
|
|
||||||
- **Get API Keys**: https://openrouter.ai/keys
|
|
||||||
- **Model List**: https://openrouter.ai/models
|
|
||||||
- **Pricing**: https://openrouter.ai/models (click on model for details)
|
|
||||||
- **Documentation**: https://openrouter.ai/docs
|
|
||||||
- **Support**: https://openrouter.ai/discord
|
|
||||||
|
|
||||||
## Example Workflow
|
|
||||||
|
|
||||||
Here's a complete workflow using OpenRouter:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# 1. Set up API key
|
|
||||||
export OPENROUTER_API_KEY="sk-or-v1-your-key-here"
|
|
||||||
|
|
||||||
# 2. Convert a scientific paper with Claude
|
|
||||||
python scripts/convert_with_ai.py \
|
|
||||||
research_paper.pdf \
|
|
||||||
output.md \
|
|
||||||
--model anthropic/claude-opus-4.5 \
|
|
||||||
--prompt-type scientific
|
|
||||||
|
|
||||||
# 3. Convert presentation with GPT-4o
|
|
||||||
python scripts/convert_with_ai.py \
|
|
||||||
talk_slides.pptx \
|
|
||||||
slides.md \
|
|
||||||
--model openai/gpt-4o \
|
|
||||||
--prompt-type presentation
|
|
||||||
|
|
||||||
# 4. Batch convert a directory of images
|
|
||||||
python scripts/batch_convert.py \
|
|
||||||
images/ \
|
|
||||||
markdown_output/ \
|
|
||||||
--extensions .jpg .png
|
|
||||||
```
|
|
||||||
|
|
||||||
## Support
|
|
||||||
|
|
||||||
For OpenRouter-specific issues:
|
|
||||||
- Discord: https://openrouter.ai/discord
|
|
||||||
- Email: support@openrouter.ai
|
|
||||||
|
|
||||||
For MarkItDown skill issues:
|
|
||||||
- Check documentation in this skill directory
|
|
||||||
- Review examples in `assets/example_usage.md`
|
|
||||||
|
|
||||||
@@ -1,309 +0,0 @@
|
|||||||
# MarkItDown Quick Reference
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# All features
|
|
||||||
pip install 'markitdown[all]'
|
|
||||||
|
|
||||||
# Specific formats
|
|
||||||
pip install 'markitdown[pdf,docx,pptx,xlsx]'
|
|
||||||
```
|
|
||||||
|
|
||||||
## Basic Usage
|
|
||||||
|
|
||||||
```python
|
|
||||||
from markitdown import MarkItDown
|
|
||||||
|
|
||||||
md = MarkItDown()
|
|
||||||
result = md.convert("file.pdf")
|
|
||||||
print(result.text_content)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Command Line
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Simple conversion
|
|
||||||
markitdown input.pdf > output.md
|
|
||||||
markitdown input.pdf -o output.md
|
|
||||||
|
|
||||||
# With plugins
|
|
||||||
markitdown --use-plugins file.pdf -o output.md
|
|
||||||
```
|
|
||||||
|
|
||||||
## Common Tasks
|
|
||||||
|
|
||||||
### Convert PDF
|
|
||||||
```python
|
|
||||||
md = MarkItDown()
|
|
||||||
result = md.convert("paper.pdf")
|
|
||||||
```
|
|
||||||
|
|
||||||
### Convert with AI
|
|
||||||
```python
|
|
||||||
from openai import OpenAI
|
|
||||||
|
|
||||||
# Use OpenRouter for multiple model access
|
|
||||||
client = OpenAI(
|
|
||||||
api_key="your-openrouter-api-key",
|
|
||||||
base_url="https://openrouter.ai/api/v1"
|
|
||||||
)
|
|
||||||
|
|
||||||
md = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5" # recommended for vision
|
|
||||||
)
|
|
||||||
result = md.convert("slides.pptx")
|
|
||||||
```
|
|
||||||
|
|
||||||
### Batch Convert
|
|
||||||
```bash
|
|
||||||
python scripts/batch_convert.py input/ output/ --extensions .pdf .docx
|
|
||||||
```
|
|
||||||
|
|
||||||
### Literature Conversion
|
|
||||||
```bash
|
|
||||||
python scripts/convert_literature.py papers/ markdown/ --create-index
|
|
||||||
```
|
|
||||||
|
|
||||||
## Supported Formats
|
|
||||||
|
|
||||||
| Format | Extension | Notes |
|--------|-----------|-------|
| PDF | `.pdf` | Full text + OCR |
| Word | `.docx` | Tables, formatting |
| PowerPoint | `.pptx` | Slides + notes |
| Excel | `.xlsx`, `.xls` | Tables |
| Images | `.jpg`, `.png`, `.gif`, `.webp` | EXIF + OCR |
| Audio | `.wav`, `.mp3` | Transcription |
| HTML | `.html`, `.htm` | Clean conversion |
| Data | `.csv`, `.json`, `.xml` | Structured |
| Archives | `.zip` | Iterates contents |
| E-books | `.epub` | Full text |
| YouTube | URLs | Transcripts |
|
|
||||||
|
|
||||||
## Optional Dependencies
|
|
||||||
|
|
||||||
```bash
|
|
||||||
[all] # All features
|
|
||||||
[pdf] # PDF support
|
|
||||||
[docx] # Word documents
|
|
||||||
[pptx] # PowerPoint
|
|
||||||
[xlsx] # Excel
|
|
||||||
[xls] # Old Excel
|
|
||||||
[outlook] # Outlook messages
|
|
||||||
[az-doc-intel] # Azure Document Intelligence
|
|
||||||
[audio-transcription] # Audio files
|
|
||||||
[youtube-transcription] # YouTube videos
|
|
||||||
```
|
|
||||||
|
|
||||||
## AI-Enhanced Conversion
|
|
||||||
|
|
||||||
### Scientific Papers
|
|
||||||
```python
|
|
||||||
from openai import OpenAI
|
|
||||||
|
|
||||||
# Initialize OpenRouter client
|
|
||||||
client = OpenAI(
|
|
||||||
api_key="your-openrouter-api-key",
|
|
||||||
base_url="https://openrouter.ai/api/v1"
|
|
||||||
)
|
|
||||||
|
|
||||||
md = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5", # recommended for scientific vision
|
|
||||||
llm_prompt="Describe scientific figures with technical precision"
|
|
||||||
)
|
|
||||||
result = md.convert("paper.pdf")
|
|
||||||
```
|
|
||||||
|
|
||||||
### Custom Prompts
|
|
||||||
```python
|
|
||||||
prompt = """
|
|
||||||
Analyze this data visualization. Describe:
|
|
||||||
- Type of chart/graph
|
|
||||||
- Key trends and patterns
|
|
||||||
- Notable data points
|
|
||||||
"""
|
|
||||||
|
|
||||||
md = MarkItDown(
|
|
||||||
llm_client=client,
|
|
||||||
llm_model="anthropic/claude-sonnet-4.5",
|
|
||||||
llm_prompt=prompt
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Available Models via OpenRouter
|
|
||||||
- `anthropic/claude-sonnet-4.5` - **Recommended for scientific vision**
|
|
||||||
- `anthropic/claude-opus-4.5` - Advanced vision model
|
|
||||||
- `openai/gpt-4o` - GPT-4 Omni (vision)
|
|
||||||
- `openai/gpt-4-vision` - GPT-4 Vision
|
|
||||||
- `google/gemini-pro-vision` - Gemini Pro Vision
|
|
||||||
|
|
||||||
See https://openrouter.ai/models for full list
|
|
||||||
|
|
||||||
## Azure Document Intelligence
|
|
||||||
|
|
||||||
```python
|
|
||||||
md = MarkItDown(docintel_endpoint="https://YOUR-ENDPOINT.cognitiveservices.azure.com/")
|
|
||||||
result = md.convert("complex_layout.pdf")
|
|
||||||
```
|
|
||||||
|
|
||||||
## Batch Processing
|
|
||||||
|
|
||||||
### Python
|
|
||||||
```python
from markitdown import MarkItDown
from pathlib import Path

md = MarkItDown()
output_dir = Path("output")
output_dir.mkdir(parents=True, exist_ok=True)  # write_text fails if the directory is missing

for file in Path("input/").glob("*.pdf"):
    result = md.convert(str(file))
    output = output_dir / f"{file.stem}.md"
    output.write_text(result.text_content, encoding="utf-8")
```
|
|
||||||
|
|
||||||
### Script
|
|
||||||
```bash
|
|
||||||
# Parallel conversion
|
|
||||||
python scripts/batch_convert.py input/ output/ --workers 8
|
|
||||||
|
|
||||||
# Recursive
|
|
||||||
python scripts/batch_convert.py input/ output/ -r
|
|
||||||
```
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
```python
try:
    result = md.convert("file.pdf")
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Error: {e}")
```
|
|
||||||
|
|
||||||
## Streaming
|
|
||||||
|
|
||||||
```python
with open("large_file.pdf", "rb") as f:
    result = md.convert_stream(f, file_extension=".pdf")
```
|
|
||||||
|
|
||||||
## Common Prompts
|
|
||||||
|
|
||||||
### Scientific
|
|
||||||
```
|
|
||||||
Analyze this scientific figure. Describe:
|
|
||||||
- Type of visualization
|
|
||||||
- Key data points and trends
|
|
||||||
- Axes, labels, and legends
|
|
||||||
- Scientific significance
|
|
||||||
```
|
|
||||||
|
|
||||||
### Medical
|
|
||||||
```
|
|
||||||
Describe this medical image. Include:
|
|
||||||
- Type of imaging (X-ray, MRI, CT, etc.)
|
|
||||||
- Anatomical structures visible
|
|
||||||
- Notable findings
|
|
||||||
- Clinical relevance
|
|
||||||
```
|
|
||||||
|
|
||||||
### Data Visualization
|
|
||||||
```
|
|
||||||
Analyze this data visualization:
|
|
||||||
- Chart type
|
|
||||||
- Variables and axes
|
|
||||||
- Data ranges
|
|
||||||
- Key patterns and outliers
|
|
||||||
```
|
|
||||||
|
|
||||||
## Performance Tips
|
|
||||||
|
|
||||||
1. **Reuse instance**: Create once, use many times
|
|
||||||
2. **Parallel processing**: Use ThreadPoolExecutor for multiple files
|
|
||||||
3. **Stream large files**: Use `convert_stream()` for big files
|
|
||||||
4. **Choose right format**: Install only needed dependencies
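The first two tips can be combined as in the sketch below. `convert_many` is a hypothetical helper that accepts any callable, so it runs without markitdown installed; passing the bound `convert` method of a single `MarkItDown` instance assumes that instance is safe to share across threads, which this document does not confirm.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def convert_many(paths, convert, max_workers=4):
    """Run convert(path) for every path in parallel.

    Returns (results, errors): two dicts keyed by path.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going when one file fails
                errors[path] = exc
    return results, errors
```

A call such as `convert_many(pdf_paths, MarkItDown().convert)` creates the converter once and reuses it for every file; failures are collected per file instead of stopping the batch.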

## Environment Variables

```bash
# OpenRouter for AI-enhanced conversions
export OPENROUTER_API_KEY="sk-or-v1-..."

# Azure Document Intelligence (optional)
export AZURE_DOCUMENT_INTELLIGENCE_KEY="key..."
export AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT="https://..."
```
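
On the Python side, these variables can be read once at startup so that a missing key fails early with a clear message. This is a minimal sketch using only the standard library. `require_env` and `optional_azure_config` are hypothetical helpers, not part of MarkItDown or the bundled scripts.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running")
    return value


def optional_azure_config():
    """Return (endpoint, key) when both Azure variables are set, else None."""
    endpoint = os.environ.get("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT")
    key = os.environ.get("AZURE_DOCUMENT_INTELLIGENCE_KEY")
    return (endpoint, key) if endpoint and key else None
```

Calling `require_env("OPENROUTER_API_KEY")` before constructing the client keeps the key out of source files, and `optional_azure_config()` lets the Azure path stay optional.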

## Scripts Quick Reference

### batch_convert.py
```bash
python scripts/batch_convert.py INPUT OUTPUT [OPTIONS]

Options:
  --extensions .pdf .docx    File types to convert
  --recursive, -r            Search subdirectories
  --workers 4                Parallel workers
  --verbose, -v              Detailed output
  --plugins, -p              Enable plugins
```

### convert_with_ai.py
```bash
python scripts/convert_with_ai.py INPUT OUTPUT [OPTIONS]

Options:
  --api-key KEY          OpenRouter API key
  --model MODEL          Model name (default: anthropic/claude-sonnet-4.5)
  --prompt-type TYPE     Preset prompt (scientific, medical, etc.)
  --custom-prompt TEXT   Custom prompt
  --list-prompts         Show available prompts
```

### convert_literature.py
```bash
python scripts/convert_literature.py INPUT OUTPUT [OPTIONS]

Options:
  --organize-by-year, -y   Organize by year
  --create-index, -i       Create index file
  --recursive, -r          Search subdirectories
```

## Troubleshooting

### Missing Dependencies
```bash
pip install 'markitdown[pdf]'  # Install PDF support
```

### Binary File Error
```python
# Wrong: text mode ("r") tries to decode the bytes as text
# with open("file.pdf", "r") as f:
#     result = md.convert_stream(f, file_extension=".pdf")

# Correct: open in binary mode ("rb")
with open("file.pdf", "rb") as f:
    result = md.convert_stream(f, file_extension=".pdf")
```
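
A text-mode handle can be caught before it reaches the converter with a small guard like the one sketched below. `ensure_binary` is a hypothetical helper introduced for illustration. It relies only on the standard-library fact that files opened in text mode are instances of `io.TextIOBase`.

```python
import io


def ensure_binary(stream):
    """Return the stream unchanged, or raise if it was opened in text mode."""
    if isinstance(stream, io.TextIOBase):
        raise TypeError("stream is in text mode; reopen the file with mode 'rb'")
    return stream
```

For example, `md.convert_stream(ensure_binary(f), file_extension=".pdf")` would fail with the message above when `f` was opened with `"r"`.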

### OCR Not Working
```bash
# macOS
brew install tesseract

# Ubuntu
sudo apt-get install tesseract-ocr
```

## More Information

- **Full Documentation**: See `SKILL.md`
- **API Reference**: See `references/api_reference.md`
- **Format Details**: See `references/file_formats.md`
- **Examples**: See `assets/example_usage.md`
- **GitHub**: https://github.com/microsoft/markitdown
@@ -1,184 +0,0 @@
# MarkItDown Skill

This skill provides comprehensive support for converting various file formats to Markdown using Microsoft's MarkItDown tool.

## Overview

MarkItDown is a Python tool that converts files and office documents to Markdown format. This skill includes:

- Complete API documentation
- Format-specific conversion guides
- Utility scripts for batch processing
- AI-enhanced conversion examples
- Integration with scientific workflows

## Contents

### Main Skill File
- **SKILL.md** - Complete guide to using MarkItDown with quick start, examples, and best practices

### References
- **api_reference.md** - Detailed API documentation, class references, and method signatures
- **file_formats.md** - Format-specific details for all supported file types

### Scripts
- **batch_convert.py** - Batch convert multiple files with parallel processing
- **convert_with_ai.py** - AI-enhanced conversion with custom prompts
- **convert_literature.py** - Scientific literature conversion with metadata extraction

### Assets
- **example_usage.md** - Practical examples for common use cases

## Installation

```bash
# Install with all features
pip install 'markitdown[all]'

# Or install specific features
pip install 'markitdown[pdf,docx,pptx,xlsx]'
```

## Quick Start

```python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)
```

## Supported Formats

- **Documents**: PDF, DOCX, PPTX, XLSX, EPUB
- **Images**: JPEG, PNG, GIF, WebP (with OCR)
- **Audio**: WAV, MP3 (with transcription)
- **Web**: HTML, YouTube URLs
- **Data**: CSV, JSON, XML
- **Archives**: ZIP files

## Key Features

### 1. AI-Enhanced Conversions
Use AI models via OpenRouter to generate detailed image descriptions:

```python
from markitdown import MarkItDown
from openai import OpenAI

# OpenRouter provides access to 100+ AI models
client = OpenAI(
    api_key="your-openrouter-api-key",
    base_url="https://openrouter.ai/api/v1"
)

md = MarkItDown(
    llm_client=client,
    llm_model="anthropic/claude-sonnet-4.5"  # recommended for vision
)
result = md.convert("presentation.pptx")
```

### 2. Batch Processing
Convert multiple files efficiently:

```bash
python scripts/batch_convert.py papers/ output/ --extensions .pdf .docx
```

### 3. Scientific Literature
Convert and organize research papers:

```bash
python scripts/convert_literature.py papers/ output/ --organize-by-year --create-index
```

### 4. Azure Document Intelligence
Enhanced PDF conversion with Microsoft Document Intelligence:

```python
from markitdown import MarkItDown

md = MarkItDown(docintel_endpoint="https://YOUR-ENDPOINT.cognitiveservices.azure.com/")
result = md.convert("complex_document.pdf")
```

## Use Cases

### Literature Review
Convert research papers to Markdown for easier analysis and note-taking.

### Data Extraction
Extract tables from Excel files into Markdown format.

### Presentation Processing
Convert PowerPoint slides with AI-generated descriptions.

### Document Analysis
Process documents for LLM consumption with token-efficient Markdown.

### YouTube Transcripts
Fetch and convert YouTube video transcriptions.

## Scripts Usage

### Batch Convert
```bash
# Convert all PDFs in a directory
python scripts/batch_convert.py input_dir/ output_dir/ --extensions .pdf

# Recursive with multiple formats
python scripts/batch_convert.py docs/ markdown/ --extensions .pdf .docx .pptx -r
```

### AI-Enhanced Conversion
```bash
# Convert with AI descriptions via OpenRouter
export OPENROUTER_API_KEY="sk-or-v1-..."
python scripts/convert_with_ai.py paper.pdf output.md --prompt-type scientific

# Specify the model explicitly
python scripts/convert_with_ai.py image.png output.md --model anthropic/claude-sonnet-4.5

# Use custom prompt
python scripts/convert_with_ai.py image.png output.md --custom-prompt "Describe this diagram"
```

### Literature Conversion
```bash
# Convert papers with metadata extraction
python scripts/convert_literature.py papers/ markdown/ --organize-by-year --create-index
```

## Integration with Scientific Writer

This skill works with the Scientific Writer CLI for:
- Converting source materials for paper writing
- Processing literature for reviews
- Extracting data from various document formats
- Preparing documents for LLM analysis

## Resources

- **MarkItDown GitHub**: https://github.com/microsoft/markitdown
- **PyPI**: https://pypi.org/project/markitdown/
- **OpenRouter**: https://openrouter.ai (AI model access)
- **OpenRouter API Keys**: https://openrouter.ai/keys
- **OpenRouter Models**: https://openrouter.ai/models
- **License**: MIT

## Requirements

- Python 3.10+
- Optional dependencies based on formats needed
- OpenRouter API key (for AI-enhanced conversions); get one at https://openrouter.ai/keys
- Azure subscription (optional, for Document Intelligence)

## Examples

See `assets/example_usage.md` for comprehensive examples covering:
- Basic conversions
- Scientific workflows
- AI-enhanced processing
- Batch operations
- Error handling
- Integration patterns