mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-01-26 16:58:56 +08:00
Add Seaborn integration for enhanced statistical visualization techniques in scientific plots. Include quick start guides, common plot types, and best practices for publication-ready figures, emphasizing colorblind-safe palettes and automatic confidence intervals.
This commit is contained in:
@@ -80,6 +80,35 @@ fig, ax = plt.subplots()
|
||||
# ... your plotting code ...
|
||||
```
|
||||
|
||||
### Quick Start with Seaborn
|
||||
|
||||
For statistical plots, use seaborn with publication styling:
|
||||
|
||||
```python
|
||||
import seaborn as sns
|
||||
import matplotlib.pyplot as plt
|
||||
from style_presets import apply_publication_style
|
||||
|
||||
# Apply publication style
|
||||
apply_publication_style('default')
|
||||
sns.set_theme(style='ticks', context='paper', font_scale=1.1)
|
||||
sns.set_palette('colorblind')
|
||||
|
||||
# Create statistical comparison figure
|
||||
fig, ax = plt.subplots(figsize=(3.5, 3))
|
||||
sns.boxplot(data=df, x='treatment', y='response',
|
||||
order=['Control', 'Low', 'High'], palette='Set2', ax=ax)
|
||||
sns.stripplot(data=df, x='treatment', y='response',
|
||||
order=['Control', 'Low', 'High'],
|
||||
color='black', alpha=0.3, size=3, ax=ax)
|
||||
ax.set_ylabel('Response (μM)')
|
||||
sns.despine()
|
||||
|
||||
# Save figure
|
||||
from figure_export import save_publication_figure
|
||||
save_publication_figure(fig, 'treatment_comparison', formats=['pdf', 'png'], dpi=300)
|
||||
```
|
||||
|
||||
## Core Principles and Best Practices
|
||||
|
||||
### 1. Resolution and File Format
|
||||
@@ -204,6 +233,18 @@ See `references/matplotlib_examples.md` Example 1 for complete code.
|
||||
6. Remove unnecessary spines
|
||||
7. Save in vector format
|
||||
|
||||
**Using seaborn for automatic confidence intervals:**
|
||||
```python
|
||||
import seaborn as sns
|
||||
fig, ax = plt.subplots(figsize=(5, 3))
|
||||
sns.lineplot(data=timeseries, x='time', y='measurement',
|
||||
hue='treatment', errorbar=('ci', 95),
|
||||
markers=True, ax=ax)
|
||||
ax.set_xlabel('Time (hours)')
|
||||
ax.set_ylabel('Measurement (AU)')
|
||||
sns.despine()
|
||||
```
|
||||
|
||||
### Task 2: Create a Multi-Panel Figure
|
||||
|
||||
See `references/matplotlib_examples.md` Example 2 for complete code.
|
||||
@@ -226,6 +267,17 @@ See `references/matplotlib_examples.md` Example 4 for complete code.
|
||||
4. Set appropriate center value for diverging maps
|
||||
5. Test appearance in grayscale
|
||||
|
||||
**Using seaborn for correlation matrices:**
|
||||
```python
|
||||
import seaborn as sns
|
||||
fig, ax = plt.subplots(figsize=(5, 4))
|
||||
corr = df.corr()
|
||||
mask = np.triu(np.ones_like(corr, dtype=bool))
|
||||
sns.heatmap(corr, mask=mask, annot=True, fmt='.2f',
|
||||
cmap='RdBu_r', center=0, square=True,
|
||||
linewidths=1, cbar_kws={'shrink': 0.8}, ax=ax)
|
||||
```
|
||||
|
||||
### Task 4: Prepare Figure for Specific Journal
|
||||
|
||||
**Workflow:**
|
||||
@@ -306,17 +358,296 @@ ax.text(1.5, max_y * 1.1, '***', ha='center', fontsize=8)
|
||||
- See `references/matplotlib_examples.md` for extensive examples
|
||||
|
||||
### Seaborn
|
||||
- Built on matplotlib, inherits all matplotlib customizations
|
||||
- Good for statistical plots
|
||||
- Apply matplotlib styles first, then use seaborn
|
||||
|
||||
Seaborn provides a high-level, dataset-oriented interface for statistical graphics, built on matplotlib. It excels at creating publication-quality statistical visualizations with minimal code while maintaining full compatibility with matplotlib customization.
|
||||
|
||||
**Key advantages for scientific visualization:**
|
||||
- Automatic statistical estimation and confidence intervals
|
||||
- Built-in support for multi-panel figures (faceting)
|
||||
- Colorblind-friendly palettes by default
|
||||
- Dataset-oriented API using pandas DataFrames
|
||||
- Semantic mapping of variables to visual properties
|
||||
|
||||
#### Quick Start with Publication Style
|
||||
|
||||
Always apply matplotlib publication styles first, then configure seaborn:
|
||||
|
||||
```python
|
||||
import seaborn as sns
|
||||
import matplotlib.pyplot as plt
|
||||
from style_presets import apply_publication_style
|
||||
|
||||
# Apply publication style
|
||||
apply_publication_style('default')
|
||||
sns.set_palette(['#E69F00', '#56B4E9', '#009E73']) # Okabe-Ito colors
|
||||
|
||||
# Configure seaborn for publication
|
||||
sns.set_theme(style='ticks', context='paper', font_scale=1.1)
|
||||
sns.set_palette('colorblind') # Use colorblind-safe palette
|
||||
|
||||
# Create figure
|
||||
fig, ax = plt.subplots(figsize=(3.5, 2.5))
|
||||
sns.scatterplot(data=df, x='time', y='response',
|
||||
hue='treatment', style='condition', ax=ax)
|
||||
sns.despine() # Remove top and right spines
|
||||
```
|
||||
|
||||
#### Common Plot Types for Publications
|
||||
|
||||
**Statistical comparisons:**
|
||||
```python
|
||||
# Box plot with individual points for transparency
|
||||
fig, ax = plt.subplots(figsize=(3.5, 3))
|
||||
sns.boxplot(data=df, x='treatment', y='response',
|
||||
order=['Control', 'Low', 'High'], palette='Set2', ax=ax)
|
||||
sns.stripplot(data=df, x='treatment', y='response',
|
||||
order=['Control', 'Low', 'High'],
|
||||
color='black', alpha=0.3, size=3, ax=ax)
|
||||
ax.set_ylabel('Response (μM)')
|
||||
sns.despine()
|
||||
```
|
||||
|
||||
**Distribution analysis:**
|
||||
```python
|
||||
# Violin plot with split comparison
|
||||
fig, ax = plt.subplots(figsize=(4, 3))
|
||||
sns.violinplot(data=df, x='timepoint', y='expression',
|
||||
hue='treatment', split=True, inner='quartile', ax=ax)
|
||||
ax.set_ylabel('Gene Expression (AU)')
|
||||
sns.despine()
|
||||
```
|
||||
|
||||
**Correlation matrices:**
|
||||
```python
|
||||
# Heatmap with proper colormap and annotations
|
||||
fig, ax = plt.subplots(figsize=(5, 4))
|
||||
corr = df.corr()
|
||||
mask = np.triu(np.ones_like(corr, dtype=bool)) # Show only lower triangle
|
||||
sns.heatmap(corr, mask=mask, annot=True, fmt='.2f',
|
||||
cmap='RdBu_r', center=0, square=True,
|
||||
linewidths=1, cbar_kws={'shrink': 0.8}, ax=ax)
|
||||
plt.tight_layout()
|
||||
```
|
||||
|
||||
**Time series with confidence bands:**
|
||||
```python
|
||||
# Line plot with automatic CI calculation
|
||||
fig, ax = plt.subplots(figsize=(5, 3))
|
||||
sns.lineplot(data=timeseries, x='time', y='measurement',
|
||||
hue='treatment', style='replicate',
|
||||
errorbar=('ci', 95), markers=True, dashes=False, ax=ax)
|
||||
ax.set_xlabel('Time (hours)')
|
||||
ax.set_ylabel('Measurement (AU)')
|
||||
sns.despine()
|
||||
```
|
||||
|
||||
#### Multi-Panel Figures with Seaborn
|
||||
|
||||
**Using FacetGrid for automatic faceting:**
|
||||
```python
|
||||
# Create faceted plot
|
||||
g = sns.relplot(data=df, x='dose', y='response',
|
||||
hue='treatment', col='cell_line', row='timepoint',
|
||||
kind='line', height=2.5, aspect=1.2,
|
||||
errorbar=('ci', 95), markers=True)
|
||||
g.set_axis_labels('Dose (μM)', 'Response (AU)')
|
||||
g.set_titles('{row_name} | {col_name}')
|
||||
sns.despine()
|
||||
|
||||
# Save with correct DPI
|
||||
from figure_export import save_publication_figure
|
||||
save_publication_figure(g.figure, 'figure_facets',
|
||||
formats=['pdf', 'png'], dpi=300)
|
||||
```
|
||||
|
||||
**Combining seaborn with matplotlib subplots:**
|
||||
```python
|
||||
# Create custom multi-panel layout
|
||||
fig, axes = plt.subplots(2, 2, figsize=(7, 6))
|
||||
|
||||
# Panel A: Scatter with regression
|
||||
sns.regplot(data=df, x='predictor', y='response', ax=axes[0, 0])
|
||||
axes[0, 0].text(-0.15, 1.05, 'A', transform=axes[0, 0].transAxes,
|
||||
fontsize=10, fontweight='bold')
|
||||
|
||||
# Panel B: Distribution comparison
|
||||
sns.violinplot(data=df, x='group', y='value', ax=axes[0, 1])
|
||||
axes[0, 1].text(-0.15, 1.05, 'B', transform=axes[0, 1].transAxes,
|
||||
fontsize=10, fontweight='bold')
|
||||
|
||||
# Panel C: Heatmap
|
||||
sns.heatmap(correlation_data, cmap='viridis', ax=axes[1, 0])
|
||||
axes[1, 0].text(-0.15, 1.05, 'C', transform=axes[1, 0].transAxes,
|
||||
fontsize=10, fontweight='bold')
|
||||
|
||||
# Panel D: Time series
|
||||
sns.lineplot(data=timeseries, x='time', y='signal',
|
||||
hue='condition', ax=axes[1, 1])
|
||||
axes[1, 1].text(-0.15, 1.05, 'D', transform=axes[1, 1].transAxes,
|
||||
fontsize=10, fontweight='bold')
|
||||
|
||||
plt.tight_layout()
|
||||
sns.despine()
|
||||
```
|
||||
|
||||
#### Color Palettes for Publications
|
||||
|
||||
Seaborn includes several colorblind-safe palettes:
|
||||
|
||||
```python
|
||||
# Use built-in colorblind palette (recommended)
|
||||
sns.set_palette('colorblind')
|
||||
|
||||
# Or specify custom colorblind-safe colors (Okabe-Ito)
|
||||
okabe_ito = ['#E69F00', '#56B4E9', '#009E73', '#F0E442',
|
||||
'#0072B2', '#D55E00', '#CC79A7', '#000000']
|
||||
sns.set_palette(okabe_ito)
|
||||
|
||||
# For heatmaps and continuous data
|
||||
sns.heatmap(data, cmap='viridis') # Perceptually uniform
|
||||
sns.heatmap(corr, cmap='RdBu_r', center=0) # Diverging, centered
|
||||
```
|
||||
|
||||
#### Choosing Between Axes-Level and Figure-Level Functions
|
||||
|
||||
**Axes-level functions** (e.g., `scatterplot`, `boxplot`, `heatmap`):
|
||||
- Use when building custom multi-panel layouts
|
||||
- Accept `ax=` parameter for precise placement
|
||||
- Better integration with matplotlib subplots
|
||||
- More control over figure composition
|
||||
|
||||
```python
|
||||
fig, ax = plt.subplots(figsize=(3.5, 2.5))
|
||||
sns.scatterplot(data=df, x='x', y='y', hue='group', ax=ax)
|
||||
```
|
||||
|
||||
**Figure-level functions** (e.g., `relplot`, `catplot`, `displot`):
|
||||
- Use for automatic faceting by categorical variables
|
||||
- Create complete figures with consistent styling
|
||||
- Great for exploratory analysis
|
||||
- Use `height` and `aspect` for sizing
|
||||
|
||||
```python
|
||||
g = sns.relplot(data=df, x='x', y='y', col='category', kind='scatter')
|
||||
```
|
||||
|
||||
#### Statistical Rigor with Seaborn
|
||||
|
||||
Seaborn automatically computes and displays uncertainty:
|
||||
|
||||
```python
|
||||
# Line plot: shows mean ± 95% CI by default
|
||||
sns.lineplot(data=df, x='time', y='value', hue='treatment',
|
||||
errorbar=('ci', 95)) # Can change to 'sd', 'se', etc.
|
||||
|
||||
# Bar plot: shows mean with bootstrapped CI
|
||||
sns.barplot(data=df, x='treatment', y='response',
|
||||
errorbar=('ci', 95), capsize=0.1)
|
||||
|
||||
# Always specify error type in figure caption:
|
||||
# "Error bars represent 95% confidence intervals"
|
||||
```
|
||||
|
||||
#### Best Practices for Publication-Ready Seaborn Figures
|
||||
|
||||
1. **Always set publication theme first:**
|
||||
```python
|
||||
sns.set_theme(style='ticks', context='paper', font_scale=1.1)
|
||||
```
|
||||
|
||||
2. **Use colorblind-safe palettes:**
|
||||
```python
|
||||
sns.set_palette('colorblind')
|
||||
```
|
||||
|
||||
3. **Remove unnecessary elements:**
|
||||
```python
|
||||
sns.despine() # Remove top and right spines
|
||||
```
|
||||
|
||||
4. **Control figure size appropriately:**
|
||||
```python
|
||||
# Axes-level: use matplotlib figsize
|
||||
fig, ax = plt.subplots(figsize=(3.5, 2.5))
|
||||
|
||||
# Figure-level: use height and aspect
|
||||
g = sns.relplot(..., height=3, aspect=1.2)
|
||||
```
|
||||
|
||||
5. **Show individual data points when possible:**
|
||||
```python
|
||||
sns.boxplot(...) # Summary statistics
|
||||
sns.stripplot(..., alpha=0.3) # Individual points
|
||||
```
|
||||
|
||||
6. **Include proper labels with units:**
|
||||
```python
|
||||
ax.set_xlabel('Time (hours)')
|
||||
ax.set_ylabel('Expression (AU)')
|
||||
```
|
||||
|
||||
7. **Export at correct resolution:**
|
||||
```python
|
||||
from figure_export import save_publication_figure
|
||||
save_publication_figure(fig, 'figure_name',
|
||||
formats=['pdf', 'png'], dpi=300)
|
||||
```
|
||||
|
||||
#### Advanced Seaborn Techniques
|
||||
|
||||
**Pairwise relationships for exploratory analysis:**
|
||||
```python
|
||||
# Quick overview of all relationships
|
||||
g = sns.pairplot(data=df, hue='condition',
|
||||
vars=['gene1', 'gene2', 'gene3'],
|
||||
corner=True, diag_kind='kde', height=2)
|
||||
```
|
||||
|
||||
**Hierarchical clustering heatmap:**
|
||||
```python
|
||||
# Cluster samples and features
|
||||
g = sns.clustermap(expression_data, method='ward',
|
||||
metric='euclidean', z_score=0,
|
||||
cmap='RdBu_r', center=0,
|
||||
figsize=(10, 8),
|
||||
row_colors=condition_colors,
|
||||
cbar_kws={'label': 'Z-score'})
|
||||
```
|
||||
|
||||
**Joint distributions with marginals:**
|
||||
```python
|
||||
# Bivariate distribution with context
|
||||
g = sns.jointplot(data=df, x='gene1', y='gene2',
|
||||
hue='treatment', kind='scatter',
|
||||
height=6, ratio=4, marginal_kws={'kde': True})
|
||||
```
|
||||
|
||||
#### Common Seaborn Issues and Solutions
|
||||
|
||||
**Issue: Legend outside plot area**
|
||||
```python
|
||||
g = sns.relplot(...)
|
||||
g._legend.set_bbox_to_anchor((0.9, 0.5))
|
||||
```
|
||||
|
||||
**Issue: Overlapping labels**
|
||||
```python
|
||||
plt.xticks(rotation=45, ha='right')
|
||||
plt.tight_layout()
|
||||
```
|
||||
|
||||
**Issue: Text too small at final size**
|
||||
```python
|
||||
sns.set_context('paper', font_scale=1.2) # Increase if needed
|
||||
```
|
||||
|
||||
#### Additional Resources
|
||||
|
||||
For more detailed seaborn information, see:
|
||||
- `scientific-packages/seaborn/SKILL.md` - Comprehensive seaborn documentation
|
||||
- `scientific-packages/seaborn/references/examples.md` - Practical use cases
|
||||
- `scientific-packages/seaborn/references/function_reference.md` - Complete API reference
|
||||
- `scientific-packages/seaborn/references/objects_interface.md` - Modern declarative API
|
||||
|
||||
### Plotly
|
||||
- Interactive figures for exploration
|
||||
- Export static images for publication
|
||||
|
||||
Reference in New Issue
Block a user