diff --git a/scientific-thinking/scientific-visualization/SKILL.md b/scientific-thinking/scientific-visualization/SKILL.md index 481aa24..a3cb2a3 100644 --- a/scientific-thinking/scientific-visualization/SKILL.md +++ b/scientific-thinking/scientific-visualization/SKILL.md @@ -80,6 +80,35 @@ fig, ax = plt.subplots() # ... your plotting code ... ``` +### Quick Start with Seaborn + +For statistical plots, use seaborn with publication styling: + +```python +import seaborn as sns +import matplotlib.pyplot as plt +from style_presets import apply_publication_style + +# Apply publication style +apply_publication_style('default') +sns.set_theme(style='ticks', context='paper', font_scale=1.1) +sns.set_palette('colorblind') + +# Create statistical comparison figure +fig, ax = plt.subplots(figsize=(3.5, 3)) +sns.boxplot(data=df, x='treatment', y='response', + order=['Control', 'Low', 'High'], palette='Set2', ax=ax) +sns.stripplot(data=df, x='treatment', y='response', + order=['Control', 'Low', 'High'], + color='black', alpha=0.3, size=3, ax=ax) +ax.set_ylabel('Response (μM)') +sns.despine() + +# Save figure +from figure_export import save_publication_figure +save_publication_figure(fig, 'treatment_comparison', formats=['pdf', 'png'], dpi=300) +``` + ## Core Principles and Best Practices ### 1. Resolution and File Format @@ -204,6 +233,18 @@ See `references/matplotlib_examples.md` Example 1 for complete code. 6. Remove unnecessary spines 7. Save in vector format +**Using seaborn for automatic confidence intervals:** +```python +import seaborn as sns +fig, ax = plt.subplots(figsize=(5, 3)) +sns.lineplot(data=timeseries, x='time', y='measurement', + hue='treatment', errorbar=('ci', 95), + markers=True, ax=ax) +ax.set_xlabel('Time (hours)') +ax.set_ylabel('Measurement (AU)') +sns.despine() +``` + ### Task 2: Create a Multi-Panel Figure See `references/matplotlib_examples.md` Example 2 for complete code. @@ -226,6 +267,17 @@ See `references/matplotlib_examples.md` Example 4 for complete code. 4. Set appropriate center value for diverging maps 5. Test appearance in grayscale +**Using seaborn for correlation matrices:** +```python +import seaborn as sns +fig, ax = plt.subplots(figsize=(5, 4)) +corr = df.corr() +mask = np.triu(np.ones_like(corr, dtype=bool)) +sns.heatmap(corr, mask=mask, annot=True, fmt='.2f', + cmap='RdBu_r', center=0, square=True, + linewidths=1, cbar_kws={'shrink': 0.8}, ax=ax) +``` + ### Task 4: Prepare Figure for Specific Journal **Workflow:** @@ -306,17 +358,296 @@ ax.text(1.5, max_y * 1.1, '***', ha='center', fontsize=8) - See `references/matplotlib_examples.md` for extensive examples ### Seaborn -- Built on matplotlib, inherits all matplotlib customizations -- Good for statistical plots -- Apply matplotlib styles first, then use seaborn + +Seaborn provides a high-level, dataset-oriented interface for statistical graphics, built on matplotlib. It excels at creating publication-quality statistical visualizations with minimal code while maintaining full compatibility with matplotlib customization. + +**Key advantages for scientific visualization:** +- Automatic statistical estimation and confidence intervals +- Built-in support for multi-panel figures (faceting) +- Colorblind-friendly palettes by default +- Dataset-oriented API using pandas DataFrames +- Semantic mapping of variables to visual properties + +#### Quick Start with Publication Style + +Always apply matplotlib publication styles first, then configure seaborn: + ```python import seaborn as sns +import matplotlib.pyplot as plt from style_presets import apply_publication_style +# Apply publication style apply_publication_style('default') -sns.set_palette(['#E69F00', '#56B4E9', '#009E73']) # Okabe-Ito colors + +# Configure seaborn for publication +sns.set_theme(style='ticks', context='paper', font_scale=1.1) +sns.set_palette('colorblind') # Use colorblind-safe palette + +# Create figure +fig, ax = plt.subplots(figsize=(3.5, 2.5)) +sns.scatterplot(data=df, x='time', y='response', + hue='treatment', style='condition', ax=ax) +sns.despine() # Remove top and right spines ``` +#### Common Plot Types for Publications + +**Statistical comparisons:** +```python +# Box plot with individual points for transparency +fig, ax = plt.subplots(figsize=(3.5, 3)) +sns.boxplot(data=df, x='treatment', y='response', + order=['Control', 'Low', 'High'], palette='Set2', ax=ax) +sns.stripplot(data=df, x='treatment', y='response', + order=['Control', 'Low', 'High'], + color='black', alpha=0.3, size=3, ax=ax) +ax.set_ylabel('Response (μM)') +sns.despine() +``` + +**Distribution analysis:** +```python +# Violin plot with split comparison +fig, ax = plt.subplots(figsize=(4, 3)) +sns.violinplot(data=df, x='timepoint', y='expression', + hue='treatment', split=True, inner='quartile', ax=ax) +ax.set_ylabel('Gene Expression (AU)') +sns.despine() +``` + +**Correlation matrices:** +```python +# Heatmap with proper colormap and annotations +fig, ax = plt.subplots(figsize=(5, 4)) +corr = df.corr() +mask = np.triu(np.ones_like(corr, dtype=bool)) # Show only lower triangle +sns.heatmap(corr, mask=mask, annot=True, fmt='.2f', + cmap='RdBu_r', center=0, square=True, + linewidths=1, cbar_kws={'shrink': 0.8}, ax=ax) +plt.tight_layout() +``` + +**Time series with confidence bands:** +```python +# Line plot with automatic CI calculation +fig, ax = plt.subplots(figsize=(5, 3)) +sns.lineplot(data=timeseries, x='time', y='measurement', + hue='treatment', style='replicate', + errorbar=('ci', 95), markers=True, dashes=False, ax=ax) +ax.set_xlabel('Time (hours)') +ax.set_ylabel('Measurement (AU)') +sns.despine() +``` + +#### Multi-Panel Figures with Seaborn + +**Using FacetGrid for automatic faceting:** +```python +# Create faceted plot +g = sns.relplot(data=df, x='dose', y='response', + hue='treatment', col='cell_line', row='timepoint', + kind='line', height=2.5, aspect=1.2, + errorbar=('ci', 95), markers=True) +g.set_axis_labels('Dose (μM)', 'Response (AU)') +g.set_titles('{row_name} | {col_name}') +sns.despine() + +# Save with correct DPI +from figure_export import save_publication_figure +save_publication_figure(g.figure, 'figure_facets', + formats=['pdf', 'png'], dpi=300) +``` + +**Combining seaborn with matplotlib subplots:** +```python +# Create custom multi-panel layout +fig, axes = plt.subplots(2, 2, figsize=(7, 6)) + +# Panel A: Scatter with regression +sns.regplot(data=df, x='predictor', y='response', ax=axes[0, 0]) +axes[0, 0].text(-0.15, 1.05, 'A', transform=axes[0, 0].transAxes, + fontsize=10, fontweight='bold') + +# Panel B: Distribution comparison +sns.violinplot(data=df, x='group', y='value', ax=axes[0, 1]) +axes[0, 1].text(-0.15, 1.05, 'B', transform=axes[0, 1].transAxes, + fontsize=10, fontweight='bold') + +# Panel C: Heatmap +sns.heatmap(correlation_data, cmap='viridis', ax=axes[1, 0]) +axes[1, 0].text(-0.15, 1.05, 'C', transform=axes[1, 0].transAxes, + fontsize=10, fontweight='bold') + +# Panel D: Time series +sns.lineplot(data=timeseries, x='time', y='signal', + hue='condition', ax=axes[1, 1]) +axes[1, 1].text(-0.15, 1.05, 'D', transform=axes[1, 1].transAxes, + fontsize=10, fontweight='bold') + +plt.tight_layout() +sns.despine() +``` + +#### Color Palettes for Publications + +Seaborn includes several colorblind-safe palettes: + +```python +# Use built-in colorblind palette (recommended) +sns.set_palette('colorblind') + +# Or specify custom colorblind-safe colors (Okabe-Ito) +okabe_ito = ['#E69F00', '#56B4E9', '#009E73', '#F0E442', + '#0072B2', '#D55E00', '#CC79A7', '#000000'] +sns.set_palette(okabe_ito) + +# For heatmaps and continuous data +sns.heatmap(data, cmap='viridis') # Perceptually uniform +sns.heatmap(corr, cmap='RdBu_r', center=0) # Diverging, centered +``` + +#### Choosing Between Axes-Level and Figure-Level Functions + +**Axes-level functions** (e.g., `scatterplot`, `boxplot`, `heatmap`): +- Use when building custom multi-panel layouts +- Accept `ax=` parameter for precise placement +- Better integration with matplotlib subplots +- More control over figure composition + +```python +fig, ax = plt.subplots(figsize=(3.5, 2.5)) +sns.scatterplot(data=df, x='x', y='y', hue='group', ax=ax) +``` + +**Figure-level functions** (e.g., `relplot`, `catplot`, `displot`): +- Use for automatic faceting by categorical variables +- Create complete figures with consistent styling +- Great for exploratory analysis +- Use `height` and `aspect` for sizing + +```python +g = sns.relplot(data=df, x='x', y='y', col='category', kind='scatter') +``` + +#### Statistical Rigor with Seaborn + +Seaborn automatically computes and displays uncertainty: + +```python +# Line plot: shows mean ± 95% CI by default +sns.lineplot(data=df, x='time', y='value', hue='treatment', + errorbar=('ci', 95)) # Can change to 'sd', 'se', etc. + +# Bar plot: shows mean with bootstrapped CI +sns.barplot(data=df, x='treatment', y='response', + errorbar=('ci', 95), capsize=0.1) + +# Always specify error type in figure caption: +# "Error bars represent 95% confidence intervals" +``` + +#### Best Practices for Publication-Ready Seaborn Figures + +1. **Always set publication theme first:** + ```python + sns.set_theme(style='ticks', context='paper', font_scale=1.1) + ``` + +2. **Use colorblind-safe palettes:** + ```python + sns.set_palette('colorblind') + ``` + +3. **Remove unnecessary elements:** + ```python + sns.despine() # Remove top and right spines + ``` + +4. **Control figure size appropriately:** + ```python + # Axes-level: use matplotlib figsize + fig, ax = plt.subplots(figsize=(3.5, 2.5)) + + # Figure-level: use height and aspect + g = sns.relplot(..., height=3, aspect=1.2) + ``` + +5. **Show individual data points when possible:** + ```python + sns.boxplot(...) # Summary statistics + sns.stripplot(..., alpha=0.3) # Individual points + ``` + +6. **Include proper labels with units:** + ```python + ax.set_xlabel('Time (hours)') + ax.set_ylabel('Expression (AU)') + ``` + +7. **Export at correct resolution:** + ```python + from figure_export import save_publication_figure + save_publication_figure(fig, 'figure_name', + formats=['pdf', 'png'], dpi=300) + ``` + +#### Advanced Seaborn Techniques + +**Pairwise relationships for exploratory analysis:** +```python +# Quick overview of all relationships +g = sns.pairplot(data=df, hue='condition', + vars=['gene1', 'gene2', 'gene3'], + corner=True, diag_kind='kde', height=2) +``` + +**Hierarchical clustering heatmap:** +```python +# Cluster samples and features +g = sns.clustermap(expression_data, method='ward', + metric='euclidean', z_score=0, + cmap='RdBu_r', center=0, + figsize=(10, 8), + row_colors=condition_colors, + cbar_kws={'label': 'Z-score'}) +``` + +**Joint distributions with marginals:** +```python +# Bivariate distribution with context +g = sns.jointplot(data=df, x='gene1', y='gene2', + hue='treatment', kind='scatter', + height=6, ratio=4, marginal_kws={'kde': True}) +``` + +#### Common Seaborn Issues and Solutions + +**Issue: Legend outside plot area** +```python +g = sns.relplot(...) +g._legend.set_bbox_to_anchor((0.9, 0.5)) +``` + +**Issue: Overlapping labels** +```python +plt.xticks(rotation=45, ha='right') +plt.tight_layout() +``` + +**Issue: Text too small at final size** +```python +sns.set_context('paper', font_scale=1.2) # Increase if needed +``` + +#### Additional Resources + +For more detailed seaborn information, see: +- `scientific-packages/seaborn/SKILL.md` - Comprehensive seaborn documentation +- `scientific-packages/seaborn/references/examples.md` - Practical use cases +- `scientific-packages/seaborn/references/function_reference.md` - Complete API reference +- `scientific-packages/seaborn/references/objects_interface.md` - Modern declarative API + ### Plotly - Interactive figures for exploration - Export static images for publication