Enhance citation management and literature review guidelines

- Updated SKILL.md in citation management to include best practices for identifying seminal and high-impact papers, emphasizing citation count thresholds, venue quality tiers, and author reputation indicators.
- Expanded literature review SKILL.md to prioritize high-impact papers, detailing citation metrics, journal tiers, and author reputation assessment.
- Added comprehensive evaluation strategies for paper impact and quality in literature_search_strategies.md, including citation count significance and journal impact factor guidance.
- Improved research lookup scripts to prioritize results based on citation count, venue prestige, and author reputation, enhancing the quality of research outputs.
Vinayak Agarwal
2026-01-05 13:01:10 -08:00
parent d243a12564
commit 3439a21f57
41 changed files with 11802 additions and 61 deletions


@@ -498,16 +498,74 @@ python scripts/validate_format.py \
| **PLOS** | 300-600 dpi | TIFF, EPS | RGB |
| **IEEE** | 300+ dpi | EPS, PDF | RGB or Grayscale |
## Writing Style Guides
Beyond formatting, this skill provides comprehensive **writing style guides** that capture how papers should *read* at different venues—not just how they should look.
### Why Style Matters
The same research will read very differently when written for Nature than when written for NeurIPS:
- **Nature/Science**: Accessible to non-specialists, story-driven, broad significance
- **Cell Press**: Mechanistic depth, comprehensive data, graphical abstract required
- **Medical journals**: Patient-centered, evidence-graded, structured abstracts
- **ML conferences**: Contribution bullets, ablation studies, reproducibility focus
- **CS conferences**: Field-specific conventions, varying evaluation standards
### Available Style Guides
| Guide | Covers | Key Topics |
|-------|--------|------------|
| `venue_writing_styles.md` | Master overview | Style spectrum, quick reference |
| `nature_science_style.md` | Nature, Science, PNAS | Accessibility, story-telling, broad impact |
| `cell_press_style.md` | Cell, Neuron, Immunity | Graphical abstracts, eTOC, Highlights |
| `medical_journal_styles.md` | NEJM, Lancet, JAMA, BMJ | Structured abstracts, evidence language |
| `ml_conference_style.md` | NeurIPS, ICML, ICLR, CVPR | Contribution bullets, ablations |
| `cs_conference_style.md` | ACL, EMNLP, CHI, SIGKDD | Field-specific conventions |
| `reviewer_expectations.md` | All venues | What reviewers look for, rebuttal tips |
### Writing Examples
Concrete examples are available in `assets/examples/`:
- `nature_abstract_examples.md`: Flowing paragraph abstracts for high-impact journals
- `neurips_introduction_example.md`: ML conference intro with contribution bullets
- `cell_summary_example.md`: Cell Press Summary, Highlights, eTOC format
- `medical_structured_abstract.md`: NEJM, Lancet, JAMA structured format
### Workflow: Adapting to a Venue
1. **Identify target venue** and load the appropriate style guide
2. **Review writing conventions**: Tone, voice, abstract format, structure
3. **Check examples** for section-specific guidance
4. **Review expectations**: What do reviewers at this venue prioritize?
5. **Apply formatting**: Use the LaTeX template from `assets/`
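Step 1 of the workflow can be sketched as a simple lookup. The mapping below is copied from the "Available Style Guides" table; the names `STYLE_GUIDES` and `guide_for` are hypothetical, and falling back to the master overview for an unlisted venue is an assumption rather than documented behavior of this skill.

```python
# Venue-to-guide mapping taken from the "Available Style Guides" table.
STYLE_GUIDES = {
    "nature_science_style.md": ["Nature", "Science", "PNAS"],
    "cell_press_style.md": ["Cell", "Neuron", "Immunity"],
    "medical_journal_styles.md": ["NEJM", "Lancet", "JAMA", "BMJ"],
    "ml_conference_style.md": ["NeurIPS", "ICML", "ICLR", "CVPR"],
    "cs_conference_style.md": ["ACL", "EMNLP", "CHI", "SIGKDD"],
}

# Assumed fallback: the master overview when a venue is not listed.
DEFAULT_GUIDE = "venue_writing_styles.md"


def guide_for(venue):
    """Return the path of the style guide covering `venue` (case-insensitive)."""
    wanted = venue.strip().lower()
    for guide, venues in STYLE_GUIDES.items():
        if wanted in (v.lower() for v in venues):
            return f"references/{guide}"
    return f"references/{DEFAULT_GUIDE}"


if __name__ == "__main__":
    for name in ("NeurIPS", "lancet", "Some Other Journal"):
        print(name, "->", guide_for(name))
```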
---
## Resources
### Bundled Resources
**References** (in `references/`):
**Writing Style Guides** (in `references/`):
- `venue_writing_styles.md`: Master style overview and comparison
- `nature_science_style.md`: Nature/Science writing conventions
- `cell_press_style.md`: Cell Press journal style
- `medical_journal_styles.md`: Medical journal writing guide
- `ml_conference_style.md`: ML conference writing conventions
- `cs_conference_style.md`: CS conference writing guide
- `reviewer_expectations.md`: What reviewers look for by venue
**Formatting Requirements** (in `references/`):
- `journals_formatting.md`: Comprehensive journal formatting requirements
- `conferences_formatting.md`: Conference paper specifications
- `posters_guidelines.md`: Research poster design and sizing
- `grants_requirements.md`: Grant proposal requirements by agency
**Writing Examples** (in `assets/examples/`):
- `nature_abstract_examples.md`: High-impact journal abstract examples
- `neurips_introduction_example.md`: ML conference introduction format
- `cell_summary_example.md`: Cell Press Summary/Highlights/eTOC
- `medical_structured_abstract.md`: NEJM/Lancet/JAMA abstract format
**Templates** (in `assets/`):
- `journals/`: Journal article LaTeX templates
- `posters/`: Research poster templates


@@ -0,0 +1,247 @@
# Cell Press Summary, Highlights, and eTOC Examples
Examples of Cell Press-specific elements including Summary (abstract), Highlights, and eTOC blurb.
---
## Complete Example 1: Senescence and Aging
### Summary (150 words max)
```
Cellular senescence is a stress response that prevents damaged cell
proliferation but can drive tissue dysfunction through the senescence-
associated secretory phenotype (SASP). How senescent cells resist
apoptosis despite expressing pro-apoptotic p53 has remained unclear.
Here, we identify FOXO4 as a pivotal mediator of senescent cell viability.
FOXO4 is highly expressed in senescent cells and directly interacts with
p53, retaining it in the nucleus and preventing p53-mediated apoptosis.
A cell-permeable peptide that disrupts FOXO4-p53 interaction selectively
induces p53 nuclear exclusion and apoptosis in senescent cells without
affecting proliferating cells. In vivo, this FOXO4 peptide neutralizes
doxorubicin-induced senescent cells and restores fitness, fur density,
and renal function in naturally aged mice. These findings establish
FOXO4-mediated p53 sequestration as a senescence-specific survival
pathway and demonstrate the therapeutic potential of targeted senescent
cell elimination.
```
### Highlights (≤85 characters each)
```
• FOXO4 is selectively upregulated in senescent cells and binds p53
• FOXO4-p53 interaction retains p53 in the nucleus, preventing apoptosis
• A FOXO4-targeting peptide induces apoptosis specifically in senescent cells
• FOXO4 peptide treatment restores fitness and organ function in aged mice
```
### eTOC Blurb (30-50 words)
```
Baar et al. identify FOXO4 as a critical mediator of senescent cell survival
through p53 sequestration. A peptide disrupting FOXO4-p53 interaction
selectively eliminates senescent cells and restores tissue function in
aged mice, establishing proof-of-concept for targeted senolytic therapy.
```
### In Brief (1 sentence)
```
A FOXO4-targeting peptide selectively eliminates senescent cells by
releasing p53, restoring tissue function in aged mice.
```
---
## Complete Example 2: Genome Organization
### Summary (150 words max)
```
The three-dimensional organization of chromosomes within the nucleus
influences gene expression, DNA replication, and genome stability.
Phase separation has emerged as a potential mechanism for organizing
nuclear contents, but whether condensates can shape chromosome
structure in vivo remains unknown. Here, we show that the transcriptional
coactivator BRD4 forms liquid-like condensates at super-enhancers that
organize associated chromatin into hub structures. Optogenetic induction
of BRD4 condensates is sufficient to remodel chromosome topology and
activate transcription within minutes. Conversely, disruption of BRD4
condensates with the small molecule JQ1 dissolves chromatin hubs and
rapidly silences super-enhancer-controlled genes. Single-molecule
tracking reveals that condensate formation increases the local
concentration of transcription machinery 100-fold, explaining the
transcriptional potency of super-enhancers. These results establish
phase separation as a mechanism for chromatin organization and
transcriptional control with implications for understanding and
targeting oncogenic super-enhancers.
```
### Highlights
```
• BRD4 forms liquid condensates at super-enhancers in living cells
• BRD4 condensates organize chromatin into transcriptionally active hubs
• Optogenetic condensate induction rapidly remodels chromatin topology
• Condensates concentrate transcription machinery 100-fold locally
```
### eTOC Blurb
```
Sabari et al. demonstrate that BRD4 forms phase-separated condensates
at super-enhancers that organize chromatin into hub structures and
concentrate transcription machinery. Optogenetic manipulation reveals
that condensate formation directly drives chromatin remodeling and
transcriptional activation.
```
---
## Complete Example 3: Metabolism and Immunity
### Summary (150 words max)
```
Immune cells undergo dramatic metabolic reprogramming upon activation,
switching from oxidative phosphorylation to aerobic glycolysis. This
metabolic shift is thought to support the biosynthetic demands of
rapid proliferation, but whether specific metabolites directly regulate
immune cell function remains largely unexplored. Here, we show that
the glycolytic metabolite phosphoenolpyruvate (PEP) sustains T cell
receptor signaling by inhibiting sarco/endoplasmic reticulum Ca²⁺-ATPase
(SERCA) activity. PEP accumulates in activated T cells and directly
binds SERCA, preventing calcium reuptake and prolonging store-operated
calcium entry. Genetic or pharmacological enhancement of PEP levels
augments T cell effector function and anti-tumor immunity in vivo.
Conversely, tumor-derived lactate suppresses PEP levels and impairs
T cell calcium signaling, contributing to tumor immune evasion. These
findings reveal an unexpected signaling role for a glycolytic
intermediate and suggest metabolic strategies to enhance T cell
responses in cancer immunotherapy.
```
### Highlights
```
• Phosphoenolpyruvate (PEP) accumulates during T cell activation
• PEP directly binds and inhibits SERCA to sustain calcium signaling
• Enhancing PEP levels augments anti-tumor T cell immunity
• Tumor lactate suppresses T cell PEP levels and calcium signaling
```
### eTOC Blurb
```
Ho et al. discover that the glycolytic metabolite phosphoenolpyruvate
directly regulates T cell calcium signaling by inhibiting SERCA. This
metabolic-signaling link is exploited by tumors through lactate
secretion and offers new targets for cancer immunotherapy.
```
---
## Graphical Abstract Description Examples
### For Senescence Paper
```
"Graphical abstract for Cell paper on FOXO4 and senescence:
Left panel: Senescent cell (enlarged, irregular shape) with FOXO4 (blue
oval) binding p53 (green oval) in nucleus, preventing apoptosis. Label:
'FOXO4 sequesters p53 → Senescent cell survival'
Center panel: Same senescent cell with FOXO4 peptide (red wedge)
disrupting FOXO4-p53 interaction. p53 moves to mitochondria (orange
organelles). Label: 'FOXO4 peptide disrupts interaction'
Right panel: Senescent cell undergoing apoptosis (fragmenting). Label:
'Selective senescent cell death'
Bottom: Aged mouse (grey, hunched) → Treatment arrow → Rejuvenated mouse
(brown, active). Label: 'Restored fitness in aged mice'
Color scheme: Blue for FOXO4, green for p53, red for peptide, grey
background for cells."
```
### For Chromatin Paper
```
"Graphical abstract for Cell paper on BRD4 condensates:
Top row: Diagram showing BRD4 molecules (purple dots) clustering at
super-enhancer (yellow region on DNA strand), forming condensate
(purple droplet). Transcription factors (orange, green, blue small
circles) accumulate inside condensate.
Middle: Chromatin fibers (grey) being pulled into hub structure around
condensate. Arrow showing '100× local concentration increase'
Bottom: Two panels - Left shows 'JQ1' treatment dissolving condensate
and chromatin hub dispersing. Right shows 'Optogenetic activation'
creating new condensate with chromatin reorganization. Gene expression
indicators (up arrow, down arrow) for each condition."
```
---
## Writing Tips for Cell Elements
### Summary Tips
1. **First sentence**: Establish the biological context
2. **Second sentence**: State what was unknown (the gap)
3. **"Here, we show/identify/demonstrate"**: Clear transition to your work
4. **Middle sentences**: Key findings with mechanism
5. **Final sentence**: Significance and implications
### Highlights Tips
- **Start with a noun or verb**: "FOXO4 forms..." or "Activation of..."
- **One finding per bullet**: Don't combine multiple points
- **Be specific**: Include the protein/gene/pathway name
- **Check character count**: Strictly ≤85 characters including spaces
- **Cover different findings**: Don't repeat the same point
### eTOC Blurb Tips
- **Start with author names**: "Smith et al. show that..."
- **One or two sentences only**: Keep it punchy
- **Include the key mechanism**: Not just the finding
- **End with significance**: Why readers should care
---
## Character Counting for Highlights
Use this to check your highlights:
```
• This highlight is exactly 52 characters, with spaces
↑ Count: 52 characters ✓ (under 85)
• This highlight is getting close to the maximum allowed character limit
↑ Count: 70 characters ✓ (under 85)
• This highlight demonstrates what happens when you try to include way too much information
↑ Count: 89 characters ✗ (over 85 - need to shorten)
```
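Counting by hand is error-prone, so a short script is a safer check. The helper below is a minimal sketch: `check_highlights` and `MAX_CHARS` are illustrative names, not part of this skill's scripts, and it assumes the limit applies to the highlight text with spaces counted and the leading bullet excluded.

```python
MAX_CHARS = 85  # per-highlight limit stated above, spaces included


def check_highlights(highlights, limit=MAX_CHARS):
    """Return a (text, length, within_limit) tuple for each highlight.

    Leading bullet characters and surrounding whitespace are stripped
    before counting (an assumption; adjust if the bullet should count).
    """
    results = []
    for raw in highlights:
        text = raw.strip().lstrip("•-*").strip()
        results.append((text, len(text), len(text) <= limit))
    return results


if __name__ == "__main__":
    sample = [
        "• FOXO4 is selectively upregulated in senescent cells and binds p53",
        "• FOXO4-p53 interaction retains p53 in the nucleus, preventing apoptosis",
    ]
    for text, length, ok in check_highlights(sample):
        status = "OK" if ok else "TOO LONG"
        print(f"{length:3d} chars  {status}  {text}")
```

Running it over a draft's highlights before submission flags any line that needs shortening.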
---
## See Also
- `cell_press_style.md` - Comprehensive Cell Press writing guide
- `nature_abstract_examples.md` - Compare with Nature abstract style


@@ -0,0 +1,313 @@
# Medical Journal Structured Abstract Examples
Examples of structured abstracts for NEJM, Lancet, JAMA, and BMJ showing the labeled section format expected at medical journals.
---
## NEJM Style (250 words max)
### Example 1: Clinical Trial
```
BACKGROUND
Sodium-glucose cotransporter 2 (SGLT2) inhibitors reduce cardiovascular
events in patients with type 2 diabetes and established cardiovascular
disease. Whether these benefits extend to patients with heart failure and
reduced ejection fraction, regardless of diabetes status, is unknown.
METHODS
We randomly assigned 4,744 patients with heart failure and an ejection
fraction of 40% or less to receive dapagliflozin (10 mg once daily) or
placebo, in addition to recommended therapy. The primary outcome was a
composite of worsening heart failure (hospitalization or urgent visit
requiring intravenous therapy) or cardiovascular death.
RESULTS
Over a median of 18.2 months, the primary outcome occurred in 386 of
2,373 patients (16.3%) in the dapagliflozin group and in 502 of 2,371
patients (21.2%) in the placebo group (hazard ratio, 0.74; 95% confidence
interval [CI], 0.65 to 0.85; P<0.001). A first worsening heart failure
event occurred in 237 patients (10.0%) in the dapagliflozin group and
in 326 patients (13.7%) in the placebo group (hazard ratio, 0.70; 95%
CI, 0.59 to 0.83). Death from cardiovascular causes occurred in 227
patients (9.6%) and 273 patients (11.5%), respectively (hazard ratio,
0.82; 95% CI, 0.69 to 0.98). Effects were similar in patients with and
without diabetes. Serious adverse events were similar between groups.
CONCLUSIONS
Among patients with heart failure and a reduced ejection fraction,
dapagliflozin reduced the risk of worsening heart failure or
cardiovascular death, regardless of the presence of diabetes.
```
**Key Features**:
- Four labeled sections (BACKGROUND, METHODS, RESULTS, CONCLUSIONS)
- Background: 2 sentences (problem + gap)
- Methods: Study design, population, intervention, primary outcome
- Results: Primary outcome with HR and 95% CI, key secondary outcomes
- Conclusions: Clear, measured statement of findings
---
### Example 2: Observational Study
```
BACKGROUND
Long-term use of proton-pump inhibitors (PPIs) has been associated with
adverse outcomes in observational studies, but causality remains uncertain.
The relationship between PPI use and chronic kidney disease is unclear.
METHODS
We conducted a prospective cohort study using data from 10,482 participants
in the Atherosclerosis Risk in Communities study who were free of kidney
disease at baseline. PPI use was ascertained at baseline and follow-up
visits. The primary outcome was incident chronic kidney disease, defined
as an estimated glomerular filtration rate less than 60 ml per minute per
1.73 m² of body-surface area.
RESULTS
Over a median follow-up of 13.9 years, incident chronic kidney disease
occurred in 56.0 per 1000 person-years among PPI users and in 42.0 per
1000 person-years among non-users (adjusted hazard ratio, 1.50; 95%
confidence interval [CI], 1.14 to 1.96). The association persisted after
adjustment for potential confounders, including indication for PPI use
and baseline kidney function. Sensitivity analyses using propensity-score
matching yielded similar results. No association was observed for
histamine H2-receptor antagonist use (hazard ratio, 1.08; 95% CI, 0.87
to 1.34).
CONCLUSIONS
PPI use was associated with an increased risk of incident chronic kidney
disease in this community-based cohort. These findings warrant cautious
use of PPIs and further investigation to establish causality.
```
**Key Features**:
- Appropriate hedging for observational study ("associated with")
- Incidence rates provided (per 1000 person-years)
- Sensitivity analyses mentioned
- Negative control (H2-receptor antagonists)
- Cautious conclusion acknowledging limitation
---
## Lancet Style (300 words max)
### Example 3: Clinical Trial with Summary Box
```
BACKGROUND
Dexamethasone has been shown to reduce mortality in hospitalized patients
with COVID-19 requiring respiratory support. We aimed to evaluate whether
higher doses of corticosteroids would provide additional benefit in
patients with severe COVID-19 pneumonia.
METHODS
In this randomized, controlled, open-label trial conducted at 18 hospitals
in Brazil, we assigned patients with moderate-to-severe COVID-19 (PaO2/FiO2
≤200 mm Hg) to receive high-dose dexamethasone (20 mg once daily for 5
days, then 10 mg once daily for 5 days) or standard dexamethasone (6 mg
once daily for 10 days). The primary outcome was ventilator-free days
at 28 days.
FINDINGS
Between June 17, 2020, and September 20, 2021, we enrolled 299 patients
(151 assigned to high-dose dexamethasone and 148 to standard
dexamethasone). The mean number of ventilator-free days at 28 days was
14·2 (SD 10·8) in the high-dose group and 15·5 (SD 10·4) in the standard
group (difference, −1·3 days; 95% CI, −3·9 to 1·3; P=0·32). There was
no significant difference in 28-day mortality (high dose 35·8% vs
standard 31·8%; hazard ratio 1·16; 95% CI, 0·79 to 1·70). Hyperglycemia
requiring insulin was more frequent with high-dose dexamethasone (66·0%
vs 53·4%; P=0·027).
INTERPRETATION
In patients with moderate-to-severe COVID-19 pneumonia, high-dose
dexamethasone did not improve ventilator-free days and was associated
with increased hyperglycemia compared with standard-dose dexamethasone.
These findings do not support the use of high-dose corticosteroids in
COVID-19.
FUNDING
Ministry of Health of Brazil.
```
**Key Features**:
- Lancet uses "Findings" instead of "Results"
- Lancet uses "Interpretation" instead of "Conclusions"
- Includes funding statement in abstract
- Centered dot (·) instead of a period as the decimal mark (Lancet style)
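If a draft was written with ordinary decimal points, the switch to the centered dot can be scripted. The snippet below is a minimal sketch (the function name `to_lancet_decimals` is made up for illustration, and the dot is assumed to be U+00B7 MIDDLE DOT). It only replaces a period that sits between two digits, so sentence-ending periods are left alone, but it would also change version numbers or dotted dates, so the output still needs a read-through.

```python
import re

MIDDLE_DOT = "\u00b7"  # assumed code point for the Lancet decimal mark


def to_lancet_decimals(text):
    """Replace a period between two digits with a centered dot."""
    return re.sub(r"(?<=\d)\.(?=\d)", MIDDLE_DOT, text)


if __name__ == "__main__":
    print(to_lancet_decimals("hazard ratio 1.16; 95% CI, 0.79 to 1.70."))
```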
---
## JAMA Style (350 words max)
### Example 4: Diagnostic Study
```
IMPORTANCE
Lung cancer screening with low-dose computed tomography (CT) reduces
mortality but identifies many indeterminate pulmonary nodules, leading
to unnecessary invasive procedures. Improved risk prediction could
reduce harms while preserving benefits.
OBJECTIVE
To develop and validate a deep learning model for predicting malignancy
risk of lung nodules detected on screening CT.
DESIGN, SETTING, AND PARTICIPANTS
This retrospective cohort study included 14,851 participants with
lung nodules from the National Lung Screening Trial (NLST) for model
development and 5,402 participants from an independent multi-site
validation cohort (2016-2019). Data analysis was performed from
January to November 2022.
EXPOSURES
Deep learning model prediction of malignancy risk based on CT imaging.
MAIN OUTCOMES AND MEASURES
The primary outcome was lung cancer diagnosis within 2 years. Model
performance was assessed by area under the receiver operating
characteristic curve (AUC), sensitivity, specificity, and comparison
with radiologist assessments.
RESULTS
In the validation cohort (median age, 65 years; 57% male), 312 nodules
(5.8%) were diagnosed as lung cancer within 2 years. The deep learning
model achieved an AUC of 0.94 (95% CI, 0.92-0.96), compared with 0.85
(95% CI, 0.82-0.88) for the Lung-RADS categorization used by radiologists
(P<0.001). At 95% sensitivity, the model achieved 68% specificity compared
with 38% for Lung-RADS, corresponding to a 49% reduction in false-positive
nodules requiring follow-up. The model's performance was consistent across
subgroups defined by nodule size, location, and patient demographics.
CONCLUSIONS AND RELEVANCE
A deep learning model for lung nodule malignancy prediction outperformed
current clinical standards and could substantially reduce false-positive
findings in lung cancer screening, decreasing unnecessary surveillance
and invasive procedures.
```
**Key Features**:
- JAMA-specific sections (IMPORTANCE, OBJECTIVE, DESIGN...)
- "Importance" section required (2-3 sentences on why this matters)
- Detailed design section
- "Exposures" clearly stated
- "Main Outcomes and Measures" explicit
---
## BMJ Style (300 words max)
### Example 5: Cohort Study
```
OBJECTIVE
To examine the association between statin use and risk of Parkinson's
disease in a large population-based cohort.
DESIGN
Prospective cohort study.
SETTING
UK Biobank, 2006-2021.
PARTICIPANTS
402,251 adults aged 40-69 years without Parkinson's disease at baseline.
MAIN OUTCOME MEASURES
Incident Parkinson's disease identified through hospital admissions,
primary care records, and death certificates. Hazard ratios were
estimated using Cox regression, adjusted for age, sex, education,
smoking, alcohol, physical activity, body mass index, and comorbidities.
RESULTS
Over a median follow-up of 12.3 years, 2,841 participants developed
Parkinson's disease (incidence rate 5.7 per 10,000 person-years).
Statin use at baseline was not associated with incident Parkinson's
disease (adjusted hazard ratio 0.95, 95% confidence interval 0.87 to
1.04). Results were consistent across analyses stratified by statin
type (lipophilic vs hydrophilic), dose, and duration of use, and in
sensitivity analyses accounting for reverse causation. No protective
association was observed in analyses restricted to participants with
high cardiovascular risk or in propensity-score matched cohorts.
CONCLUSIONS
In this large prospective cohort, statin use was not associated with
reduced risk of Parkinson's disease, contrary to findings from some
previous observational studies. The null findings were robust across
multiple sensitivity analyses. These results do not support a
neuroprotective effect of statins against Parkinson's disease.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Previous observational studies have yielded inconsistent results
regarding statin use and Parkinson's disease risk.
WHAT THIS STUDY ADDS
This large prospective study with long follow-up found no evidence
that statin use protects against Parkinson's disease.
```
**Key Features**:
- BMJ uses short section headers, some with one-line entries (DESIGN, SETTING)
- Includes "What is already known" and "What this study adds" boxes
- Design, Setting, and Participants as separate sections
- Clear Main Outcome Measures section
---
## Key Differences Between Journals
| Element | NEJM | Lancet | JAMA | BMJ |
|---------|------|--------|------|-----|
| **Word limit** | 250 | 300 | 350 | 300 |
| **Results label** | RESULTS | FINDINGS | RESULTS | RESULTS |
| **Conclusions label** | CONCLUSIONS | INTERPRETATION | CONCLUSIONS AND RELEVANCE | CONCLUSIONS |
| **Unique sections** | — | Funding in abstract | IMPORTANCE | What is known/adds |
| **Decimal style** | Period (.) | Centered dot (·) | Period (.) | Period (.) |
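The table can be turned into a small pre-submission check. This is a sketch under stated assumptions: `ABSTRACT_SPECS` and `check_abstract` are hypothetical names, the limits and labels are copied from the table and examples in this file, each label is assumed to sit alone on its own line in uppercase, and labels are excluded from the word count. JAMA's EXPOSURES label is left out of the required list because it is treated here as design-dependent. Confirm against the journal's current author guidelines before relying on it.

```python
ABSTRACT_SPECS = {
    "NEJM": {
        "max_words": 250,
        "labels": ["BACKGROUND", "METHODS", "RESULTS", "CONCLUSIONS"],
    },
    "Lancet": {
        "max_words": 300,
        "labels": ["BACKGROUND", "METHODS", "FINDINGS",
                   "INTERPRETATION", "FUNDING"],
    },
    "JAMA": {
        "max_words": 350,
        "labels": ["IMPORTANCE", "OBJECTIVE",
                   "DESIGN, SETTING, AND PARTICIPANTS",
                   "MAIN OUTCOMES AND MEASURES", "RESULTS",
                   "CONCLUSIONS AND RELEVANCE"],
    },
    "BMJ": {
        "max_words": 300,
        "labels": ["OBJECTIVE", "DESIGN", "SETTING", "PARTICIPANTS",
                   "MAIN OUTCOME MEASURES", "RESULTS", "CONCLUSIONS"],
    },
}


def check_abstract(text, journal):
    """Report missing section labels and the body word count for a journal."""
    spec = ABSTRACT_SPECS[journal]
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    found = {line for line in lines if line in spec["labels"]}
    missing = [label for label in spec["labels"] if label not in found]
    # Count whitespace-separated words on every non-label line.
    words = sum(len(line.split()) for line in lines
                if line not in spec["labels"])
    return {
        "missing_labels": missing,
        "word_count": words,
        "within_limit": words <= spec["max_words"],
    }
```

For example, `check_abstract(draft, "Lancet")` would list `FINDINGS` as missing if the draft still used a `RESULTS` label.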
---
## Essential Elements for All Medical Abstracts
### Background/Context
- Disease burden or clinical problem (1 sentence)
- Knowledge gap or rationale for study (1 sentence)
### Methods
- Study design (RCT, cohort, case-control)
- Setting (number of sites, country/region)
- Participants (N, key inclusion criteria)
- Intervention or exposure
- Primary outcome with definition
### Results
- Number enrolled and analyzed
- Primary outcome with effect size and 95% CI
- Key secondary outcomes
- P-values for primary comparisons
- Adverse events (if applicable)
### Conclusions
- Clear statement of main finding
- Appropriate hedging based on study design
- Clinical implication (optional, 1 sentence)
---
## Common Mistakes in Medical Abstracts
- **Missing confidence intervals**: "HR 0.75, P=0.02" → include the 95% CI
- **Relative risk only**: Add absolute risk reduction and NNT
- **Causal language for observational studies**: "PPIs cause kidney disease" → "PPI use was associated with kidney disease"
- **Overstated conclusions**: Claims exceeding the evidence
- **Missing sample sizes**: Always include N for each group
- **Vague outcomes**: "Improved outcomes" without a specific definition
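To make the "relative risk only" point concrete, here is a worked calculation using the event counts from NEJM Example 1 (386 of 2,373 patients on dapagliflozin vs 502 of 2,371 on placebo). The function name `absolute_effect` is illustrative. This simple version uses crude proportions, rounds NNT up by convention, and ignores follow-up time and confidence intervals.

```python
import math


def absolute_effect(events_treated, n_treated, events_control, n_control):
    """Compute absolute risk reduction (ARR) and number needed to treat (NNT)."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated
    if arr <= 0:
        raise ValueError("No risk reduction; NNT is undefined.")
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "arr": arr,
        "nnt": math.ceil(1 / arr),  # round up to whole patients
    }


if __name__ == "__main__":
    result = absolute_effect(386, 2373, 502, 2371)
    print(f"ARR: {result['arr'] * 100:.1f} percentage points")
    print(f"NNT: {result['nnt']}")
```

Reporting the absolute figures next to the hazard ratio of 0.74 lets readers judge clinical importance, not just relative effect.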
---
## See Also
- `medical_journal_styles.md` - Comprehensive medical writing guide
- `venue_writing_styles.md` - Style comparison across venues


@@ -0,0 +1,213 @@
# Nature/Science Abstract Examples
Examples of well-crafted abstracts for high-impact multidisciplinary journals. These demonstrate the flowing paragraph style with broad accessibility expected at Nature, Science, and related venues.
---
## Example 1: Molecular Biology / Cell Biology
**Topic**: CRISPR gene editing discovery
```
The ability to precisely edit DNA sequences in living cells has transformed
biological research and holds promise for treating genetic diseases. However,
current genome editing tools can introduce unwanted mutations at off-target
sites, limiting their clinical potential. Here we describe prime editing, a
versatile and precise genome editing method that directly writes new genetic
information into a specified DNA site using a reverse transcriptase fused to a
CRISPR nickase. Prime editing can make all 12 types of point mutations, as
well as small insertions and deletions, with minimal off-target editing and
without requiring double-strand breaks or donor DNA templates. In human cells,
we used prime editing to correct the primary genetic causes of sickle cell
disease and Tay-Sachs disease, and to install protective mutations that
reduce risk of prion disease. Prime editing expands the scope and capabilities
of genome editing and may address approximately 89% of known human genetic
disease variants.
```
**Why this works**:
- Opens with broad significance (genetic disease treatment)
- States the problem clearly (off-target mutations)
- Describes the approach accessibly ("writes new genetic information")
- Includes specific results (all 12 point mutations, specific diseases)
- Ends with quantified impact (89% of variants)
---
## Example 2: Neuroscience
**Topic**: Memory consolidation mechanism
```
Sleep is essential for memory consolidation, yet how the sleeping brain
transforms labile memories into stable long-term representations remains
poorly understood. We used multi-site electrophysiology in freely behaving
mice to record the activity of thousands of neurons across hippocampus and
cortex during learning and subsequent sleep. We discovered that specific
neurons that encode a newly learned memory reactivate in precisely timed
sequences during slow-wave sleep, with hippocampal reactivation preceding
cortical reactivation by 10-15 milliseconds. Optogenetic disruption of this
temporal coordination impaired memory retention by 78%, whereas artificial
enhancement of the temporal relationship strengthened memories beyond normal
levels. These results reveal that the temporal ordering of hippocampal-cortical
replay is not merely correlative but causally necessary for memory
consolidation. Our findings suggest new therapeutic approaches for memory
disorders based on optimizing the temporal dynamics of sleep.
```
**Why this works**:
- Connects to well-known phenomenon (sleep and memory)
- States what was unknown
- Describes approach (multi-site recordings)
- Key finding with specific number (10-15 ms)
- Causal evidence (disruption and enhancement experiments)
- Broader implications (therapeutic approaches)
---
## Example 3: Climate Science
**Topic**: Carbon cycle feedback
```
Arctic permafrost contains approximately 1,500 billion tonnes of organic
carbon—twice the amount currently in the atmosphere. As the Arctic warms,
this carbon may be released to the atmosphere, accelerating global warming
through a positive feedback loop. However, the magnitude and timing of this
feedback remain highly uncertain because microbial decomposition rates in
thawing permafrost are poorly constrained. Here we present a 15-year
field experiment across 25 sites spanning the Arctic, tracking carbon
fluxes in warming permafrost under natural conditions. We find that
microbial respiration increases exponentially with temperature until soils
reach 3°C, then plateaus due to substrate limitation—a threshold effect
not captured by current Earth system models. Our results suggest that
permafrost carbon feedback will be 30-50% lower than current projections
during this century, providing more time to limit warming, but will
accelerate dramatically if deep permafrost begins to thaw.
```
**Why this works**:
- Opens with striking number (1,500 billion tonnes)
- Clear problem statement (feedback uncertainty)
- Specific methodology (15 years, 25 sites)
- Novel finding (threshold at 3°C)
- Implications both reassuring and cautionary
---
## Example 4: Physics / Materials Science
**Topic**: Room-temperature superconductivity
```
Superconductivity—the flow of electricity without resistance—has been
confined to extremely low temperatures since its discovery over a century
ago, limiting practical applications. The recent demonstration of
superconductivity in hydrogen-rich materials at high pressure has raised
hopes for higher transition temperatures, but achieving room-temperature
superconductivity at ambient pressure has remained elusive. Here we report
superconductivity at 21°C (294 K) in a nitrogen-doped lutetium hydride
(Lu-N-H) compound at pressures of approximately 1 GPa—nearly ambient
conditions. Electrical resistance drops to zero below the transition
temperature with a sharp transition width of 2 K, and we observe the Meissner
effect confirming bulk superconductivity. Density functional theory
calculations suggest that nitrogen incorporation stabilizes the high-symmetry
structure that enables strong electron-phonon coupling. These results
establish a pathway toward practical room-temperature superconductors.
```
**Why this works**:
- Opens with accessible explanation of significance
- Historical context (century-old limitation)
- Precise results (21°C, 1 GPa, 2 K transition width)
- Multiple lines of evidence (resistance + Meissner effect)
- Theoretical explanation briefly included
- Forward-looking conclusion
---
## Example 5: Evolution / Ecology
**Topic**: Rapid evolution in response to climate
```
Climate change is driving rapid shifts in the geographic distributions of
species, but whether organisms can adapt quickly enough to keep pace with
warming remains a critical question for biodiversity conservation. Here we
document real-time evolution in wild populations of a widespread forest tree,
Scots pine, along a 1,000 km latitudinal gradient in Scandinavia. By combining
whole-genome sequencing with phenotypic measurements across 25 common gardens,
we detect signatures of selection at 47 loci associated with cold tolerance,
phenology, and drought resistance over just 50 years—approximately
five tree generations. Alleles conferring warmer-adapted phenotypes have
increased in frequency by 4-12% across northern populations, matching
predictions from models of climate-driven selection. However, migration of
warm-adapted genotypes from the south appears limited by geographic barriers.
These results demonstrate that trees can evolve rapidly in response to
climate change but suggest that assisted gene flow may be necessary to
prevent local maladaptation.
```
**Why this works**:
- Opens with pressing question (climate adaptation)
- Specific system (Scots pine) and scale (1,000 km)
- Methods described briefly (genomics + common gardens)
- Quantitative results (47 loci, 4-12% frequency shift, 5 generations)
- Mechanism identified (limited migration)
- Conservation implications stated
---
## Common Elements Across Examples
### Structure (Implicit)
1. **Hook**: Why this matters broadly (1-2 sentences)
2. **Gap**: What was unknown or problematic (1 sentence)
3. **Approach**: What was done (1 sentence)
4. **Findings**: Key results with numbers (2-3 sentences)
5. **Significance**: Why this matters going forward (1 sentence)
### Style Features
- **Active voice**: "We discovered," "We find," "We report"
- **Specific numbers**: Exact values, not vague quantities
- **Accessible language**: Minimal jargon, explained when needed
- **Compelling opening**: Broad hook before technical details
- **Strong close**: Implications or future directions
### Word Count
- Nature: 150-200 words (examples above: 185-210 words)
- Science: ≤125 words (would need tightening)
---
## What to Avoid
**Too technical opening**:
> "The CRISPR-Cas9 system with guide RNA targeting PAM sequences..."
**Better opening**:
> "The ability to precisely edit DNA in living cells..."
---
**Vague results**:
> "Our method significantly outperformed existing approaches..."
**Better results**:
> "Our method reduced off-target editing by 78% compared to standard Cas9..."
---
**Weak significance statement**:
> "These findings may have implications for the field..."
**Better significance**:
> "These findings suggest new therapeutic approaches for memory disorders..."
---
## See Also
- `nature_science_style.md` - Comprehensive Nature/Science writing guide
- `venue_writing_styles.md` - Style comparison across venues


@@ -0,0 +1,245 @@
# NeurIPS/ICML Introduction Example
This example demonstrates the distinctive ML conference introduction structure with numbered contributions and technical precision.
---
## Full Introduction Example
**Paper Topic**: Efficient Long-Context Transformers
---
### Paragraph 1: Problem Motivation
```
Large language models (LLMs) have demonstrated remarkable capabilities in
natural language understanding, code generation, and reasoning tasks [1, 2, 3].
These capabilities scale with both model size and context length—longer
contexts enable processing of entire documents, multi-turn conversations,
and complex reasoning chains that span many steps [4, 5]. However, the
standard Transformer attention mechanism [6] has O(N²) time and memory
complexity with respect to sequence length N, creating a fundamental
bottleneck for processing long sequences. For a context window of 100K
tokens, full attention computes 10 billion pairwise scores per head, and
storing them in 32-bit floats takes 40 GB for the attention matrix alone,
making training and inference prohibitively expensive on current hardware.
```
**Key features**:
- States why this matters (LLM capabilities)
- Connects to scaling (longer contexts = better performance)
- Specific numbers (O(N²), 100K tokens, 10 billion scores, 40 GB)
- Citations to establish credibility
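Before putting numbers like these in an introduction, it is worth checking the arithmetic. The sketch below reproduces the 100K-token figures under stated assumptions: one attention head, batch size 1, 4 bytes per score (32-bit floats), and 1 GB taken as 10^9 bytes. The function name `attention_matrix_cost` is illustrative only.

```python
def attention_matrix_cost(seq_len: int, bytes_per_element: int = 4) -> tuple[int, float]:
    """Return (number of pairwise scores, memory in GB) for one N x N attention matrix.

    Assumes a single head and batch size 1; GB is taken as 10**9 bytes.
    """
    num_scores = seq_len * seq_len          # one score per (query, key) pair
    memory_gb = num_scores * bytes_per_element / 1e9
    return num_scores, memory_gb


scores, gigabytes = attention_matrix_cost(100_000)
print(f"{scores:,} scores, {gigabytes:.0f} GB")  # 10,000,000,000 scores, 40 GB
```

With 16-bit scores (`bytes_per_element=2`) the same matrix would take 20 GB, so the precision assumed should be stated alongside the number.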
---
### Paragraph 2: Limitations of Existing Approaches
```
Prior work has addressed attention efficiency through three main approaches.
Sparse attention patterns [7, 8, 9] reduce complexity to O(N√N) or O(N log N)
by restricting attention to local windows, fixed stride patterns, or learned
sparse masks. Linear attention approximations [10, 11, 12] reformulate
attention using kernel feature maps that enable O(N) computation, but
sacrifice the ability to model arbitrary pairwise interactions. Low-rank
factorizations [13, 14] approximate the attention matrix as a product of
smaller matrices, achieving efficiency at the cost of expressivity. While
these methods reduce theoretical complexity, they introduce approximation
errors that compound in deep networks, often resulting in 2-5% accuracy
degradation on long-range modeling benchmarks [15]. Perhaps more importantly,
they fundamentally change the attention mechanism, making it difficult to
apply advances in standard attention (e.g., rotary positional embeddings,
grouped-query attention) to efficient variants.
```
**Key features**:
- Organized categorization of prior work
- Complexity stated for each approach
- Limitations clearly identified
- Quantified shortcomings (2-5% degradation)
- Deeper issue identified (incompatibility with advances)
---
### Paragraph 3: Your Approach (High-Level)
```
We take a different approach: rather than approximating attention, we
accelerate exact attention by optimizing memory access patterns. Our key
observation is that on modern GPUs, attention is bottlenecked by memory
bandwidth, not compute. Reading and writing the N × N attention matrix to
and from GPU high-bandwidth memory (HBM) dominates runtime, while the GPU's
tensor cores remain underutilized. We propose LongFlash, an IO-aware exact
attention algorithm that computes attention block-by-block in fast on-chip
SRAM, never materializing the full attention matrix in HBM. By carefully
orchestrating the tiling pattern and fusing the softmax computation with
matrix multiplications, LongFlash reduces HBM accesses from O(N²) to
O(N²d/M) where d is the head dimension and M is the SRAM size, achieving
asymptotically optimal IO complexity.
```
**Key features**:
- Clear differentiation from prior work ("different approach")
- Key insight stated explicitly
- Technical mechanism explained
- Complexity improvement quantified
- Method name introduced
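
The "block-by-block" idea in this paragraph relies on a running (online) softmax: each key/value block updates a running maximum, a running normalizer, and a running weighted sum, so the full row of scores never has to be stored. The following is a minimal pure-Python sketch of that numerical trick only. It is not the (fictional) LongFlash algorithm, it does not model SRAM or HBM, and the names `naive_attention`, `blockwise_attention`, and `block_size` are assumptions made for illustration.

```python
import math


def naive_attention(q, k, v):
    """Reference version: build each full score row, then softmax and average values."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def blockwise_attention(q, k, v, block_size=2):
    """Exact attention computed one key/value block at a time with a running softmax."""
    d = len(q[0])
    dv = len(v[0])
    out = []
    for qi in q:
        running_max = -math.inf   # largest score seen so far
        running_sum = 0.0         # softmax normalizer, relative to running_max
        acc = [0.0] * dv          # unnormalized weighted sum of value rows
        for start in range(0, len(k), block_size):
            k_blk = k[start:start + block_size]
            v_blk = v[start:start + block_size]
            scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k_blk]
            new_max = max(running_max, max(scores))
            # Rescale earlier partial results to the new maximum (exp(-inf) is 0.0).
            scale = math.exp(running_max - new_max)
            weights = [math.exp(s - new_max) for s in scores]
            running_sum = running_sum * scale + sum(weights)
            acc = [a * scale + sum(w * vj[c] for w, vj in zip(weights, v_blk))
                   for c, a in enumerate(acc)]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out


q = [[0.1, 0.2], [0.3, -0.4]]
k = [[0.5, 0.1], [-0.2, 0.7], [0.9, -0.3], [0.0, 0.4], [0.6, 0.6]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 3.0], [0.5, 0.5]]
print(blockwise_attention(q, k, v, block_size=2))
```

Because the rescaling is exact, the blockwise result matches the naive one up to floating-point rounding, which is the property that lets a paper claim "exact" rather than "approximate" attention.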
---
### Paragraph 4: Contributions (CRITICAL)
```
Our contributions are as follows:
• We propose LongFlash, an IO-aware exact attention algorithm that achieves
2-4× speedup over FlashAttention [16] and up to 9× over standard PyTorch
attention on sequences from 1K to 128K tokens (Section 3).
• We provide theoretical analysis proving that LongFlash achieves optimal
IO complexity of O(N²d/M) among all algorithms that compute exact
attention, and analyze the regime where our algorithm provides maximum
benefit (Section 3.3).
• We introduce sequence parallelism techniques that enable LongFlash to
scale to sequences of 1M+ tokens across multiple GPUs with near-linear
weak scaling efficiency (Section 4).
• We demonstrate that LongFlash enables training with 8× longer contexts
on the same hardware: we train a 7B parameter model on 128K token
contexts using the same memory that previously limited us to 16K tokens
(Section 5).
• We release optimized CUDA kernels achieving 80% of theoretical peak
FLOPS on A100 and H100 GPUs, along with PyTorch and JAX bindings, at
[anonymous URL] (Section 6).
```
**Key features**:
- Numbered/bulleted format
- Each contribution is specific and quantified
- Section references for each claim
- Both methodological and empirical contributions
- Code release mentioned
- Self-contained bullets (each makes sense alone)
---
## Alternative Opening Paragraphs
### For a Methods Paper
```
Scalable optimization algorithms are fundamental to modern machine learning.
Stochastic gradient descent (SGD) and its variants [1, 2, 3] have enabled
training of models with billions of parameters on massive datasets. However,
these first-order methods exhibit slow convergence on ill-conditioned
problems, often requiring thousands of iterations to converge on tasks
where second-order methods would converge in tens of iterations [4, 5].
```
### For an Applications Paper
```
Drug discovery is a costly and time-consuming process, with the average new
drug requiring 10-15 years and $2.6 billion to develop [1]. Machine learning
offers the potential to accelerate this process by predicting molecular
properties, identifying promising candidates, and optimizing lead compounds
computationally [2, 3]. Recent successes in protein structure prediction [4]
and molecular generation [5] have demonstrated that deep learning can
capture complex chemical patterns, raising hopes for ML-driven drug discovery.
```
### For a Theory Paper
```
Understanding why deep neural networks generalize well despite having more
parameters than training examples remains one of the central puzzles of
modern machine learning [1, 2]. Classical statistical learning theory
predicts that such overparameterized models should overfit dramatically,
yet in practice, large networks trained with SGD achieve excellent test
accuracy [3]. This gap between theory and practice has motivated a rich
literature on implicit regularization [4], neural tangent kernels [5],
and feature learning [6], but a complete theoretical picture remains elusive.
```
---
## Contribution Bullet Templates
### For a New Method
```
• We propose [Method Name], a novel [type of method] that [key innovation]
achieving [performance improvement] over [baseline] on [benchmark].
```
### For Theoretical Analysis
```
• We prove that [statement], providing the first [type of result] for
[problem setting]. This resolves an open question from [prior work].
```
### For Empirical Study
```
• We conduct a comprehensive evaluation of [N] methods across [M] datasets,
revealing that [key finding] and identifying [failure mode/best practice].
```
### For Code/Data Release
```
• We release [resource name], a [description] containing [scale/scope],
available at [URL]. This enables [future work/reproducibility].
```
---
## Common Mistakes to Avoid
### Vague Contributions
**Bad**:
```
• We propose a novel method for attention
• We show our method is better than baselines
• We provide theoretical analysis
```
**Good**:
```
• We propose LongFlash, achieving 2-4× speedup over FlashAttention
• We prove LongFlash achieves optimal O(N²d/M) IO complexity
• We enable 8× longer context training on fixed hardware budget
```
### Missing Quantification
**Bad**: "Our method significantly outperforms prior work"
**Good**: "Our method improves accuracy by 3.2% on GLUE and 4.1% on SuperGLUE"
### Overlapping Bullets
**Bad**:
```
• We propose a new attention mechanism
• We introduce LongFlash attention
• Our novel attention approach...
```
(These say the same thing three times)
### Buried Contributions
**Bad**: Contribution bullets at the end of page 2
**Good**: Contribution bullets clearly visible by end of page 1
---
## See Also
- `ml_conference_style.md` - Comprehensive ML conference guide
- `venue_writing_styles.md` - Style comparison across venues


@@ -0,0 +1,483 @@
# Cell Press Writing Style Guide
Comprehensive writing guide for Cell, Neuron, Immunity, Molecular Cell, Developmental Cell, Cell Reports, and other Cell Press journals.
**Last Updated**: 2024
---
## Overview
Cell Press journals emphasize **mechanistic depth**, **rigorous experimentation**, and **biological insight**. Unlike Nature/Science, which prioritize broad accessibility, Cell papers are written for biologists who appreciate technical detail and comprehensive data.
### Key Philosophy
> "Cell papers tell a complete mechanistic story with exhaustive experimental support."
**Primary Goal**: Provide deep biological insight with extensive experimental validation that advances understanding of fundamental mechanisms.
---
## Unique Cell Press Features
Cell Press has several distinctive elements not found in other journals:
### 1. Summary (Not Abstract)
Cell uses "Summary" instead of "Abstract" - functionally similar but emphasizes synthesis.
### 2. Graphical Abstract (REQUIRED)
A visual summary appearing on the table of contents. **This is mandatory for all Cell Press journals.**
### 3. eTOC Blurb
A 30-50 word "elevator pitch" for the electronic table of contents.
### 4. Highlights
3-4 bullet points (≤85 characters each) capturing key findings.
### 5. In Brief
A one-sentence summary of the paper.
---
## Audience and Tone
### Target Reader
- Expert biologist in the relevant field
- Familiar with techniques and terminology
- Expects comprehensive data and mechanistic depth
- Values rigor and reproducibility
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **Technical** | Appropriate jargon for the field |
| **Mechanistic** | Focus on how and why, not just what |
| **Comprehensive** | Thorough exploration of the question |
| **Data-rich** | Extensive experimental support |
| **Precise** | Exact terminology and quantification |
### Voice
- **First person ("we") acceptable**: "We demonstrate that..."
- **Active voice encouraged**: "We identified..."
- **Confident but measured**: Strong claims require strong evidence
---
## Summary (Abstract)
### Style Requirements
- **150 words maximum** for Cell; varies for other Cell Press journals
- **Flowing paragraph** (not structured sections)
- **Dense with information**: Every sentence should convey key points
- **Mechanistic focus**: What was discovered and how it works
### Summary Structure
1. **Context** (1 sentence): The biological question/problem
2. **Approach** (1 sentence): What you did
3. **Key findings** (2-4 sentences): Main results with mechanism
4. **Significance** (1 sentence): What this reveals about biology
### Example Summary (Cell Style)
```
Cellular senescence is a stress response that arrests proliferation and
promotes tissue remodeling, but the mechanisms controlling senescent cell
fate remain unclear. Here, we identify the transcription factor FOXO4 as a
critical regulator of senescent cell viability. FOXO4 is highly expressed
in senescent cells and sequesters p53 away from mitochondria, preventing
apoptosis. Using a cell-penetrating peptide that disrupts FOXO4-p53
interaction, we selectively induce senescent cell apoptosis in vitro and
in vivo. Administration of this peptide to aged mice restores fitness, fur
density, and renal function. These findings reveal FOXO4-p53 as a senescence
vulnerability and establish proof-of-concept for targeted senolytic
interventions in aging.
```
---
## Graphical Abstract
### Purpose
A single-panel visual summary for the table of contents that captures the entire paper's message.
### Requirements
- **Size**: Square format, typically 1200 × 1200 pixels
- **Layout**: Clean, uncluttered
- **Content**: Show workflow, key finding, and mechanism
- **Text**: Minimal labels, large readable fonts
- **Color**: Vibrant but professional
### Design Elements
```
Typical Graphical Abstract Components:
1. Starting point (cell, organism, condition)
2. Intervention/treatment (arrows, symbols)
3. Key measurement or observation
4. Outcome/conclusion (visual representation)
5. Minimal text labels connecting elements
```
### Example Description (for schematic generation)
```
"Graphical abstract showing: Left panel - normal cells with FOXO4 (blue)
and p53 (green) separate. Center panel - senescent cells with FOXO4
binding p53, preventing apoptosis. Right panel - FOXO4 peptide disrupts
interaction, allowing p53 to reach mitochondria, triggering apoptosis.
Arrow at bottom showing aged mouse → treatment → rejuvenated mouse."
```
---
## Highlights
### Format
3-4 bullet points, each ≤85 characters (including spaces)
### Content Guidelines
- Start with an action verb or key noun
- Include specific findings
- Make each highlight standalone
- Cover different aspects of the paper
### Example Highlights
```
• FOXO4 is selectively expressed in senescent cells
• FOXO4 sequesters p53, preventing senescent cell apoptosis
• A FOXO4-targeting peptide induces selective senescent cell death
• Senolytic peptide treatment restores function in aged mice
```
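
The count and character limits above are easy to check mechanically before submission. Here is a small sketch, assuming the limit applies to the highlight text itself (without the leading bullet) and that spaces count; `check_highlights` is a hypothetical helper, not a Cell Press tool.

```python
def check_highlights(highlights, max_chars=85, min_count=3, max_count=4):
    """Return a list of human-readable problems; an empty list means the set passes."""
    problems = []
    if not min_count <= len(highlights) <= max_count:
        problems.append(
            f"expected {min_count}-{max_count} highlights, found {len(highlights)}"
        )
    for position, text in enumerate(highlights, start=1):
        length = len(text)  # spaces count toward the limit
        if length > max_chars:
            problems.append(
                f"highlight {position} is {length} characters (limit {max_chars})"
            )
    return problems


example_highlights = [
    "FOXO4 is selectively expressed in senescent cells",
    "FOXO4 sequesters p53, preventing senescent cell apoptosis",
    "A FOXO4-targeting peptide induces selective senescent cell death",
    "Senolytic peptide treatment restores function in aged mice",
]
print(check_highlights(example_highlights))  # []
```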
---
## eTOC Blurb
### Format
30-50 words for the electronic table of contents
### Writing Guidelines
- Written by authors (editors may modify)
- Start with author names or key finding
- Make it a complete, engaging sentence
- Highlight the most exciting aspect
### Example eTOC Blurb
```
Baar et al. identify FOXO4 as a vulnerability of senescent cells and
develop a peptide that induces targeted apoptosis of senescent cells.
Treatment of aged mice with this senolytic peptide restores fitness
and organ function.
```
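
Word limits such as the 30-50 word eTOC range or the 150-word Summary cap can be checked the same way. This is a rough sketch that counts whitespace-separated tokens; journal submission systems may count differently, and `count_words` and `within_word_limit` are illustrative names.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens, so 'et al.' counts as two words."""
    return len(text.split())


def within_word_limit(text: str, min_words: int = 0, max_words: int = 50) -> bool:
    """True if the word count falls inside the inclusive [min_words, max_words] range."""
    return min_words <= count_words(text) <= max_words


etoc_blurb = (
    "Baar et al. identify FOXO4 as a vulnerability of senescent cells and "
    "develop a peptide that induces targeted apoptosis of senescent cells. "
    "Treatment of aged mice with this senolytic peptide restores fitness "
    "and organ function."
)
print(count_words(etoc_blurb), within_word_limit(etoc_blurb, 30, 50))
```

For a Summary, the same helper would be called as `within_word_limit(summary_text, max_words=150)`.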
---
## Introduction
### Length and Structure
- **4-6 paragraphs** (800-1200 words)
- More comprehensive than Nature/Science
- Can include more technical detail and literature
### Paragraph-by-Paragraph Guide
**Paragraph 1: Biological Context**
- Establish the biological process or system
- Why is this important to understand?
- Set up the key players and mechanisms
**Paragraphs 2-3: State of the Field**
- Detailed review of relevant prior work
- Establish what is known mechanistically
- More comprehensive than Nature/Science
**Paragraph 4: The Gap**
- What remains unknown or controversial?
- Why is this a critical question?
- What has prevented progress?
**Paragraph 5: Your Approach**
- How did you tackle this question?
- What techniques/systems did you use?
- Why was your approach appropriate?
**Final Paragraph: Key Findings Preview**
- Brief statement of what you discovered
- How does this advance the field?
- Set up the structure of results
### Example Introduction Paragraph
```
Cellular senescence is characterized by stable cell-cycle arrest, profound
chromatin alterations, and a complex secretory phenotype known as the
senescence-associated secretory phenotype (SASP) (Coppé et al., 2008;
Rodier and Campisi, 2011). Senescent cells accumulate with age and at
sites of pathology, where they can drive tissue dysfunction through
SASP-mediated inflammation and disruption of tissue architecture (van
Deursen, 2014). The targeted elimination of senescent cells—senolysis—has
emerged as a promising therapeutic strategy, with genetic and pharmacological
approaches demonstrating benefits in mouse models of aging and age-related
disease (Baker et al., 2011, 2016; Chang et al., 2016).
```
---
## Results
### Organization
Cell papers typically have **5-8 results sections**, each with a descriptive subheading:
```
Results
├── Section 1: Discovery of the phenomenon
├── Section 2: Characterization of the mechanism
├── Section 3: Identification of molecular players
├── Section 4: Functional validation
├── Section 5: In vivo confirmation
├── Section 6: Therapeutic proof-of-concept
└── Section 7: Broader implications
```
### Subheading Style
Cell uses **declarative subheadings** stating the finding:
❌ "Analysis of FOXO4 expression" (descriptive - avoid)
✅ "FOXO4 Is Selectively Upregulated in Senescent Cells" (declarative)
### Results Writing Style
- **Comprehensive detail**: Cell expects more methodological context in Results than Nature
- **Figure-by-figure narrative**: Each major figure often corresponds to a results section
- **Statistical rigor**: All quantifications with statistics
- **Biological interpretation**: Brief interpretation is woven into Results rather than deferred entirely to the Discussion
### Example Results Paragraph
```
To identify transcription factors regulating senescent cell viability, we
performed RNA sequencing on proliferating and senescent human fibroblasts
(IMR90 cells induced to senesce by replicative exhaustion, ionizing
radiation, or oncogene-induced senescence). Differential expression
analysis revealed 47 transcription factors significantly upregulated
across all senescence modalities (FDR < 0.05, fold change > 2; Figure 1A
and Table S1). Among these, FOXO4 showed the highest and most consistent
upregulation (12.3 ± 2.1-fold; Figure 1B), a finding we confirmed by
quantitative RT-PCR (Figure 1C) and immunoblot analysis (Figure 1D).
Immunofluorescence microscopy revealed nuclear FOXO4 accumulation in
senescent but not proliferating cells (Figure 1E,F).
```
---
## Discussion
### Structure
Cell discussions are **thorough and mechanistic**:
**Paragraph 1: Summary**
- Restate key findings
- Synthesize the main message
**Paragraphs 2-4: Mechanistic Interpretation**
- Deep dive into how your findings fit with known biology
- Propose models
- Discuss molecular mechanisms in detail
**Paragraph 5: Comparison with Literature**
- How do your findings relate to prior work?
- Resolve apparent contradictions
**Paragraph 6: Implications and Applications**
- Therapeutic implications
- Broader significance
**Paragraph 7: Limitations**
- Honest assessment
- Open questions remaining
**Final Paragraph: Conclusions**
- Big-picture take-home message
- Future directions
---
## Experimental Procedures / STAR Methods
### STAR Methods Format
Cell uses a structured **STAR Methods** section:
```
RESOURCE AVAILABILITY
Lead Contact
Materials Availability
Data and Code Availability
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Cell Lines
Animals
Human Subjects
METHOD DETAILS
[Detailed protocols for each technique]
QUANTIFICATION AND STATISTICAL ANALYSIS
```
### Key Reagent Table (KEY RESOURCES TABLE)
Cell requires a comprehensive table of all key resources:
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---------------------|--------|------------|
| Antibodies | | |
| Rabbit anti-FOXO4 | Abcam | Cat#ab12345 |
| Chemicals | | |
| Doxorubicin | Sigma-Aldrich | Cat#D1515 |
| Cell Lines | | |
| IMR90 | ATCC | CCL-186 |
---
## Figures
### Figure Philosophy
Cell papers are **figure-heavy** with extensive multi-panel figures:
- **6-8 main figures** typical
- **Multi-panel format**: 6-12 panels per figure common
- **Data-dense**: Comprehensive experimental support
- **Supplemental figures**: Figures S1, S2, and so on for additional validation (Cell Press uses "Supplemental Information", not Nature's "Extended Data")
### Panel Labeling
Panels are labeled with uppercase letters: **(A)**, **(B)**, **(C)**
### Figure Legend Format
```
Figure 3. FOXO4 Sequesters p53 in the Nucleus of Senescent Cells
(A) Immunofluorescence microscopy of p53 (green) and FOXO4 (red) in
proliferating (left) and senescent (right) IMR90 cells. DAPI (blue)
marks nuclei. Scale bar, 10 μm.
(B) Quantification of nuclear p53 intensity in proliferating versus
senescent cells. Data represent mean ± SEM; n = 3 biological replicates,
>100 cells per condition. ***p < 0.001, two-tailed Student's t test.
(C and D) Co-immunoprecipitation of FOXO4 and p53 in proliferating (C)
and senescent (D) cell lysates. IgG, immunoglobulin G control.
(E) Proximity ligation assay for FOXO4-p53 interaction. Red dots indicate
interaction events. Scale bar, 10 μm.
(F) Model of FOXO4-mediated p53 sequestration in senescent cells.
See also Figure S3 and Table S2.
```
---
## References
### Citation Style
- **Author-year format**: (Smith et al., 2023) or Smith et al. (2023)
- **Multiple citations**: (Smith et al., 2020; Jones et al., 2021)
- **Two authors**: (Smith and Jones, 2023)
- **Three or more**: (Smith et al., 2023)
### Reference Format
```
Baker, D.J., Wijshake, T., Tchkonia, T., LeBrasseur, N.K., Childs, B.G.,
van de Sluis, B., Kirkland, J.L., and van Deursen, J.M. (2011). Clearance
of p16Ink4a-positive senescent cells delays ageing-associated disorders.
Nature 479, 232–236.
```
---
## Cell Press Journal Comparison
| Journal | Focus | Article Length | Figures |
|---------|-------|---------------|---------|
| **Cell** | Breakthrough biology | Long | 7-8 main + supplemental |
| **Neuron** | Neuroscience | Long | 6-8 main |
| **Immunity** | Immunology | Medium-Long | 6-7 main |
| **Molecular Cell** | Molecular mechanisms | Medium | 5-7 main |
| **Developmental Cell** | Development | Medium | 5-7 main |
| **Cell Reports** | Solid science | Medium | 4-6 main |
---
## Common Mistakes
1. **Insufficient mechanism**: Describing what happens without how
2. **Under-controlled experiments**: Missing key controls
3. **Weak phenotype validation**: Single approach instead of multiple
4. **Missing in vivo work**: Cell papers often expect animal studies
5. **Incomplete figure panels**: Not showing all relevant conditions
6. **Forgetting graphical abstract**: Required element
7. **Exceeding highlight character limits**: ≤85 characters per bullet
---
## Pre-Submission Checklist
### Required Elements
- [ ] Graphical abstract (square format)
- [ ] Highlights (3-4 bullets, ≤85 characters each)
- [ ] eTOC blurb (30-50 words)
- [ ] Summary (≤150 words)
- [ ] Key Resources Table
### Content
- [ ] Mechanistic depth throughout
- [ ] Multiple complementary approaches
- [ ] In vivo validation (if applicable)
- [ ] Declarative subheadings
- [ ] Comprehensive figure panels
### Style
- [ ] Technical precision in terminology
- [ ] Author-year citations
- [ ] Figure legends complete and standalone
- [ ] STAR Methods properly formatted
---
## See Also
- `venue_writing_styles.md` - Master style overview
- `journals_formatting.md` - Technical formatting requirements
- `nature_science_style.md` - Comparison with Nature/Science style


@@ -0,0 +1,463 @@
# CS Conference Writing Style Guide
Comprehensive writing guide for ACL, EMNLP, NAACL (NLP), CHI, CSCW (HCI), SIGKDD, WWW, SIGIR (data mining/IR), and other major CS conferences.
**Last Updated**: 2024
---
## Overview
CS conferences span diverse subfields with distinct writing cultures. This guide covers NLP, HCI, and data mining/IR venues, each with unique expectations and evaluation criteria.
---
# Part 1: NLP Conferences (ACL, EMNLP, NAACL)
## NLP Writing Philosophy
> "Strong empirical results on standard benchmarks with insightful analysis."
NLP papers balance empirical rigor with linguistic insight. Human evaluation is increasingly important alongside automatic metrics.
## Audience and Tone
### Target Reader
- NLP researchers and computational linguists
- Familiar with transformer architectures, standard benchmarks
- Expect reproducible results and error analysis
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **Task-focused** | Clear problem definition |
| **Benchmark-oriented** | Standard datasets emphasized |
| **Analysis-rich** | Error analysis, qualitative examples |
| **Reproducible** | Full implementation details |
## Abstract (NLP Style)
### Structure
- **Task/problem** (1 sentence)
- **Limitation of prior work** (1 sentence)
- **Your approach** (1-2 sentences)
- **Results on benchmarks** (2 sentences)
- **Analysis finding** (optional, 1 sentence)
### Example Abstract
```
Coreference resolution remains challenging for pronouns with distant or
ambiguous antecedents. Prior neural approaches struggle with these
difficult cases due to limited context modeling. We introduce
LongContext-Coref, a retrieval-augmented coreference model that
dynamically retrieves relevant context from document history. On the
OntoNotes 5.0 benchmark, LongContext-Coref achieves 83.4 F1, improving
over the previous state-of-the-art by 1.2 points. On the challenging
WinoBias dataset, we reduce gender bias by 34% while maintaining
accuracy. Qualitative analysis reveals that our model successfully
resolves pronouns requiring world knowledge, a known weakness of
prior approaches.
```
## NLP Paper Structure
```
├── Introduction
│ ├── Task motivation
│ ├── Prior work limitations
│ ├── Your contribution
│ └── Contribution bullets
├── Related Work
├── Method
│ ├── Problem formulation
│ ├── Model architecture
│ └── Training procedure
├── Experiments
│ ├── Datasets (with statistics)
│ ├── Baselines
│ ├── Main results
│ ├── Analysis
│ │ ├── Error analysis
│ │ ├── Ablation study
│ │ └── Qualitative examples
│ └── Human evaluation (if applicable)
├── Discussion / Limitations
└── Conclusion
```
## NLP-Specific Requirements
### Datasets
- Use **standard benchmarks**: GLUE, SQuAD, CoNLL, OntoNotes
- Report **dataset statistics**: train/dev/test sizes
- **Data preprocessing**: Document all steps
### Evaluation Metrics
- **Task-appropriate metrics**: F1, BLEU, ROUGE, accuracy
- **Statistical significance**: Paired bootstrap, p-values
- **Multiple runs**: Report mean ± std across seeds
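
As an illustration of the paired bootstrap mentioned above, the sketch below resamples test examples with replacement and reports how often system A fails to beat system B. It assumes the metric is an average of per-example scores (for example, accuracy); corpus-level metrics such as BLEU or F1 would need to be recomputed on each resample instead. The function name, the default of 10,000 resamples, and the fixed seed are choices made here for the example.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Estimate how often system A does not outperform system B under resampling.

    scores_a and scores_b hold per-example scores for the same test examples,
    in the same order. A small return value suggests A's advantage is robust.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same examples")
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    n = len(scores_a)
    failures = 0
    for _ in range(n_resamples):
        sample = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        delta = sum(scores_a[i] - scores_b[i] for i in sample)
        if delta <= 0:
            failures += 1
    return failures / n_resamples


# Per-example correctness (1 = correct) for two hypothetical systems.
system_a = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
system_b = [1, 0, 1, 0, 0, 1, 0, 1, 0, 1]
print(paired_bootstrap(system_a, system_b))
```

The pairing matters: both systems are evaluated on the same resampled examples, so per-example difficulty cancels out of the comparison.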
### Human Evaluation
Increasingly expected for generation tasks:
- **Annotator details**: Number, qualifications, agreement
- **Evaluation protocol**: Guidelines, interface, payment
- **Inter-annotator agreement**: Cohen's κ or Krippendorff's α
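
For two annotators assigning categorical labels, Cohen's κ compares observed agreement with the agreement expected by chance. A minimal sketch follows; the function name is illustrative, and with three or more annotators a multi-rater statistic such as Fleiss' κ or Krippendorff's α would be needed instead.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label sequences")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


annotator_1 = ["good", "good", "bad", "bad", "good", "bad"]
annotator_2 = ["good", "bad", "bad", "bad", "good", "good"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```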
### Example Human Evaluation Table
```
Table 3: Human Evaluation Results (100 samples, 3 annotators)
─────────────────────────────────────────────────────────────
Method | Fluency | Coherence | Factuality | Overall
─────────────────────────────────────────────────────────────
Baseline | 3.8 | 3.2 | 3.5 | 3.5
GPT-3.5 | 4.2 | 4.0 | 3.7 | 4.0
Our Method | 4.4 | 4.3 | 4.1 | 4.3
─────────────────────────────────────────────────────────────
Inter-annotator agreement (Fleiss' κ) = 0.72. Scale: 1-5 (higher is better).
```
## ACL-Specific Notes
- **ARR (ACL Rolling Review)**: Shared review system across ACL venues
- **Responsible NLP checklist**: Ethics, limitations, risks
- **Long (8 pages) vs. Short (4 pages)**: Different expectations
- **Findings papers**: Companion publication for solid papers not selected for the main conference
---
# Part 2: HCI Conferences (CHI, CSCW, UIST)
## HCI Writing Philosophy
> "Technology in service of humans—understand users first, then design and evaluate."
HCI papers are fundamentally **user-centered**. Technology novelty alone is insufficient; understanding human needs and demonstrating user benefit is essential.
## Audience and Tone
### Target Reader
- HCI researchers and practitioners
- UX designers and product developers
- Interdisciplinary (CS, psychology, design, social science)
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **User-centered** | Focus on people, not technology |
| **Design-informed** | Grounded in design thinking |
| **Empirical** | User studies provide evidence |
| **Reflective** | Consider broader implications |
## HCI Abstract
### Focus on Users and Impact
```
Video calling has become essential for remote collaboration, yet
current interfaces poorly support the peripheral awareness that makes
in-person work effective. Through formative interviews with 24 remote
workers, we identified three key challenges: difficulty gauging
colleague availability, lack of ambient presence cues, and interruption
anxiety. We designed AmbientOffice, a peripheral display system that
conveys teammate presence through subtle ambient visualizations. In a
two-week deployment study with 18 participants across three distributed
teams, AmbientOffice increased spontaneous collaboration by 40% and
reduced perceived isolation (p<0.01). Participants valued the system's
non-intrusive nature and reported feeling more connected to remote
colleagues. We discuss implications for designing ambient awareness
systems and the tension between visibility and privacy in remote work.
```
## HCI Paper Structure
### Research Through Design / Systems Papers
```
├── Introduction
│ ├── Problem in human terms
│ ├── Why technology can help
│ └── Contribution summary
├── Related Work
│ ├── Domain background
│ ├── Prior systems
│ └── Theoretical frameworks
├── Formative Work (often)
│ ├── Interviews / observations
│ └── Design requirements
├── System Design
│ ├── Design rationale
│ ├── Implementation
│ └── Interface walkthrough
├── Evaluation
│ ├── Study design
│ ├── Participants
│ ├── Procedure
│ ├── Findings (quant + qual)
│ └── Limitations
├── Discussion
│ ├── Design implications
│ ├── Generalizability
│ └── Future work
└── Conclusion
```
### Qualitative / Interview Studies
```
├── Introduction
├── Related Work
├── Methods
│ ├── Participants
│ ├── Procedure
│ ├── Data collection
│ └── Analysis method (thematic, grounded theory, etc.)
├── Findings
│ ├── Theme 1 (with quotes)
│ ├── Theme 2 (with quotes)
│ └── Theme 3 (with quotes)
├── Discussion
│ ├── Implications for design
│ ├── Implications for research
│ └── Limitations
└── Conclusion
```
## HCI-Specific Requirements
### Participant Reporting
- **Demographics**: Age, gender, relevant experience
- **Recruitment**: How and where recruited
- **Compensation**: Payment amount and type
- **IRB approval**: Ethics board statement
### Quotes in Findings
Use direct quotes to ground findings:
```
Participants valued the ambient nature of the display. As P7 described:
"It's like having a window to my teammate's office. I don't need to
actively check it, but I know they're there." This passive awareness
reduced the barrier to initiating contact.
```
### Design Implications Section
Translate findings into actionable guidance:
```
**Implication 1: Support peripheral awareness without demanding attention.**
Ambient displays should be visible in peripheral vision but not require
active monitoring. Designers should consider calm technology principles.
**Implication 2: Balance visibility with privacy.**
Users want to share presence but fear surveillance. Systems should
provide granular controls and make visibility mutual.
```
## CHI-Specific Notes
- **Contribution types**: Empirical, artifact, methodological, theoretical
- **ACM format**: `acmart` document class with `sigchi` option
- **Accessibility**: Alt text, inclusive language expected
- **Contribution statement**: Required per-author contributions
---
# Part 3: Data Mining & IR (SIGKDD, WWW, SIGIR)
## Data Mining Writing Philosophy
> "Scalable methods for real-world data with demonstrated practical impact."
Data mining papers emphasize **scalability**, **real-world applicability**, and **solid experimental methodology**.
## Audience and Tone
### Target Reader
- Data scientists and ML engineers
- Industry researchers
- Applied ML practitioners
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **Scalable** | Handle large datasets |
| **Practical** | Real-world applications |
| **Reproducible** | Datasets and code shared |
| **Industrial** | Industry datasets valued |
## KDD Abstract
### Emphasize Scale and Application
```
Fraud detection in e-commerce requires processing millions of
transactions in real-time while adapting to evolving attack patterns.
We present FraudShield, a graph neural network framework for real-time
fraud detection that scales to billion-edge transaction graphs. Unlike
prior methods that require full graph access, FraudShield uses
incremental updates with O(1) inference cost per transaction. On a
proprietary dataset of 2.3 billion transactions from a major e-commerce
platform, FraudShield achieves 94.2% precision at 80% recall,
outperforming production baselines by 12%. The system has been deployed
at [Company], processing 50K transactions per second and preventing
an estimated $400M in annual fraud losses. We release an anonymized
benchmark dataset and code.
```
## KDD Paper Structure
```
├── Introduction
│ ├── Problem and impact
│ ├── Technical challenges
│ ├── Your approach
│ └── Contributions
├── Related Work
├── Preliminaries
│ ├── Problem definition
│ └── Notation
├── Method
│ ├── Overview
│ ├── Technical components
│ └── Complexity analysis
├── Experiments
│ ├── Datasets (with scale statistics)
│ ├── Baselines
│ ├── Main results
│ ├── Scalability experiments
│ ├── Ablation study
│ └── Case study / deployment
└── Conclusion
```
## KDD-Specific Requirements
### Scalability
- **Dataset sizes**: Report number of nodes, edges, samples
- **Runtime analysis**: Wall-clock time comparisons
- **Complexity**: Time and space complexity stated
- **Scaling experiments**: Show performance vs. data size
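The scaling experiments listed above can be sketched as a small timing harness. This is a minimal illustration, assuming wall-clock time is the metric of interest; `time_scaling`, `make_input`, and the sorting workload are hypothetical stand-ins, not part of any venue requirement.

```python
import time

def time_scaling(fn, make_input, sizes, repeats=3):
    """Return [(size, best_seconds)] for fn(make_input(size)) at each size."""
    results = []
    for n in sizes:
        data = make_input(n)                      # build the input outside the timed region
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(data)
            best = min(best, time.perf_counter() - start)
        results.append((n, best))
    return results

# Hypothetical workload: sorting stands in for the method under test.
for n, seconds in time_scaling(sorted, lambda n: list(range(n, 0, -1)), [1_000, 10_000, 100_000]):
    print(f"{n:>8} items  {seconds:.6f} s")
```

Reporting the best of several repeats reduces noise from other processes; a real study would likely also report variance and hardware details.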
### Industrial Deployment
- **Case studies**: Real-world deployment stories
- **A/B tests**: Online evaluation results (if applicable)
- **Production metrics**: Business impact (if shareable)
### Example Scalability Table
```
Table 4: Scalability Comparison (runtime in seconds)
──────────────────────────────────────────────────────
Dataset     | Nodes | Edges | GCN | GraphSAGE | Ours
──────────────────────────────────────────────────────
Cora        | 2.7K  | 5.4K  | 0.3 | 0.2       | 0.1
Citeseer    | 3.3K  | 4.7K  | 0.4 | 0.3       | 0.1
PubMed      | 19.7K | 44.3K | 1.2 | 0.8       | 0.3
ogbn-arxiv  | 169K  | 1.17M | 8.4 | 4.2       | 1.6
ogbn-papers | 111M  | 1.6B  | OOM | OOM       | 42.3
──────────────────────────────────────────────────────
```
---
# Part 4: Common Elements Across CS Venues
## Writing Quality
### Clarity
- **One idea per sentence**
- **Define terms before use**
- **Use consistent notation**
### Precision
- **Exact numbers**: "23.4%" not "about 20%"
- **Clear claims**: Avoid hedging unless necessary
- **Specific comparisons**: Name the baseline
## Contribution Bullets
Used across all CS venues:
```
Our contributions are:
• We identify [problem/insight]
• We propose [method name] that [key innovation]
• We demonstrate [results] on [benchmarks]
• We release [code/data] at [URL]
```
## Reproducibility Standards
All CS venues increasingly expect:
- **Code availability**: GitHub link (anonymous for review)
- **Data availability**: Public datasets or release plans
- **Full hyperparameters**: Training details complete
- **Random seeds**: Exact values for reproduction
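Reporting exact seeds only helps if every random number generator in the pipeline is actually seeded. A minimal sketch follows, assuming an experiment that uses Python's `random` module and optionally NumPy; `set_seed` is a hypothetical helper name, and deep learning frameworks need their own seeding calls in addition.

```python
import os
import random

def set_seed(seed: int) -> None:
    """Seed the Python-level random number generators used in an experiment."""
    os.environ["PYTHONHASHSEED"] = str(seed)   # only affects subprocesses started later
    random.seed(seed)
    try:                                        # NumPy is optional in this sketch
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

set_seed(42)
first = [random.random() for _ in range(3)]
set_seed(42)
second = [random.random() for _ in range(3)]
assert first == second  # same seed, same sequence
```

Seeding alone does not guarantee bit-identical results across hardware or library versions, so the seed is best reported together with the software environment.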
## Ethics and Broader Impact
### NLP (ACL/EMNLP)
- **Limitations section**: Required
- **Responsible NLP checklist**: Ethical considerations
- **Bias analysis**: For models affecting people
### HCI (CHI)
- **IRB/Ethics approval**: Required for human subjects
- **Informed consent**: Procedure described
- **Privacy considerations**: Data handling
### KDD/WWW
- **Societal impact**: Consider misuse potential
- **Privacy preservation**: For sensitive data
- **Fairness analysis**: When applicable
---
## Venue Comparison Table
| Aspect | ACL/EMNLP | CHI | KDD/WWW | SIGIR |
|--------|-----------|-----|---------|-------|
| **Focus** | NLP tasks | User studies | Scalable ML | IR/search |
| **Evaluation** | Benchmarks + human | User studies | Large-scale exp | Datasets |
| **Theory weight** | Moderate | Low | Moderate | Moderate |
| **Industry value** | High | Medium | Very high | High |
| **Page limit** | 8 long / 4 short | 10 + refs | 9 + refs | 10 + refs |
| **Review style** | ARR | Direct | Direct | Direct |
---
## Pre-Submission Checklist
### All CS Venues
- [ ] Clear contribution statement
- [ ] Strong baselines
- [ ] Reproducibility information complete
- [ ] Correct venue template
- [ ] Anonymized (if double-blind)
### NLP-Specific
- [ ] Standard benchmark results
- [ ] Error analysis included
- [ ] Human evaluation (for generation)
- [ ] Responsible NLP checklist
### HCI-Specific
- [ ] IRB approval stated
- [ ] Participant demographics
- [ ] Direct quotes in findings
- [ ] Design implications
### Data Mining-Specific
- [ ] Scalability experiments
- [ ] Dataset size statistics
- [ ] Runtime comparisons
- [ ] Complexity analysis
---
## See Also
- `venue_writing_styles.md` - Master style overview
- `ml_conference_style.md` - NeurIPS/ICML style guide
- `conferences_formatting.md` - Technical formatting requirements
- `reviewer_expectations.md` - What CS reviewers seek


@@ -0,0 +1,535 @@
# Medical Journal Writing Style Guide
Comprehensive writing guide for NEJM, Lancet, JAMA, BMJ, Annals of Internal Medicine, and other major medical journals.
**Last Updated**: 2024
---
## Overview
Medical journals prioritize **clinical relevance**, **patient outcomes**, and **evidence-based practice**. Writing must be precise, evidence-focused, and directly applicable to clinical decision-making.
### Key Philosophy
> "Every sentence should help a clinician make better decisions for their patients."
**Primary Goal**: Communicate research findings that can improve patient care and clinical practice.
---
## Audience and Tone
### Target Reader
- Practicing physicians and clinicians
- Clinical researchers
- Healthcare policymakers
- Medical educators
- Some public health and patient advocacy readers
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **Evidence-focused** | Appropriate hedging based on study design |
| **Patient-centered** | Focus on patient outcomes, not just biomarkers |
| **Clinical** | Emphasize practical applicability |
| **Precise** | Exact numbers, confidence intervals, NNT |
| **Measured** | Claims match evidence strength |
### Voice
- **Passive voice common**: "Patients were randomized to..."
- **First person acceptable**: "We conducted a trial..."
- **Third person for patients**: "Patients" not "subjects"
---
## Abstract: Structured Format
### Overview
Most major medical journals require **structured abstracts** with labeled sections. This is one of the few venues where structured abstracts are expected.
### Standard Structure (IMRAD-based)
```
Background: [Why this study was needed - 1-2 sentences]
Methods: [Study design, setting, participants, intervention,
main outcomes - 2-4 sentences]
Results: [Primary and key secondary outcomes with statistics -
3-5 sentences]
Conclusions: [Clinical implications, with appropriate hedging -
1-2 sentences]
```
### Word Limits by Journal
| Journal | Abstract Limit |
|---------|---------------|
| NEJM | 250 words |
| Lancet | 300 words |
| JAMA | 350 words |
| BMJ | 300 words |
| Annals | 325 words |
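The limits in the table can be checked automatically before submission. The sketch below assumes a plain-text abstract and whitespace-delimited word counting, which may differ from how a given journal counts; `ABSTRACT_LIMITS` and `check_abstract` are hypothetical names that simply mirror the table.

```python
# Word limits copied from the table above; verify against current author guidelines.
ABSTRACT_LIMITS = {"NEJM": 250, "Lancet": 300, "JAMA": 350, "BMJ": 300, "Annals": 325}

def check_abstract(text: str, journal: str) -> tuple[int, int, bool]:
    """Return (word_count, limit, within_limit) for a journal in ABSTRACT_LIMITS."""
    limit = ABSTRACT_LIMITS[journal]
    count = len(text.split())                  # naive count: splits on whitespace only
    return count, limit, count <= limit

count, limit, ok = check_abstract("Background: ... Methods: ... Results: ...", "NEJM")
print(f"{count}/{limit} words, within limit: {ok}")
```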
### Example Structured Abstract (NEJM Style)
```
BACKGROUND
Type 2 diabetes is associated with increased cardiovascular risk, but
the effects of intensive glucose control on cardiovascular outcomes
remain uncertain.
METHODS
We randomly assigned 10,251 patients with type 2 diabetes and established
cardiovascular disease to receive intensive glucose-lowering therapy
(target HbA1c <6.0%) or standard therapy (target HbA1c 7.0-7.9%). The
primary outcome was a composite of nonfatal myocardial infarction,
nonfatal stroke, or death from cardiovascular causes.
RESULTS
After a median follow-up of 3.5 years, the primary outcome occurred in
352 patients (6.9%) in the intensive-therapy group and in 371 patients
(7.2%) in the standard-therapy group (hazard ratio, 0.90; 95% CI, 0.78
to 1.04; P=0.16). Severe hypoglycemia was more common with intensive
therapy (3.1% vs. 1.0%; P<0.001). All-cause mortality was similar
between groups (5.0% vs. 4.8%; hazard ratio, 1.04; 95% CI, 0.87 to 1.24).
CONCLUSIONS
In patients with type 2 diabetes and established cardiovascular disease,
intensive glucose lowering did not significantly reduce major
cardiovascular events compared with standard therapy and was associated
with increased severe hypoglycemia.
```
---
## Evidence Language
### The Cardinal Rule
**Match your language to your evidence strength.**
### Language by Study Design
| Study Design | Appropriate Language |
|-------------|---------------------|
| **Meta-analysis of RCTs** | "Treatment X reduces mortality..." |
| **Large RCT** | "Treatment X reduced mortality in this trial..." |
| **Small RCT** | "Treatment X was associated with reduced mortality..." |
| **Cohort study** | "Treatment X was associated with lower mortality..." |
| **Case-control** | "Treatment X was associated with reduced odds of death..." |
| **Cross-sectional** | "Treatment X use was associated with lower mortality..." |
| **Case series** | "These cases suggest that treatment X may..." |
| **Case report** | "This case illustrates that treatment X can..." |
### Causal Language Rules
**Never say** (unless RCT): "Treatment X prevents..." / "Treatment X causes..."
**Use for observational**: "Treatment X was associated with..." / "Treatment X was linked to..."
**Use for RCTs**: "Treatment X resulted in..." / "Treatment X reduced..."
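The causal language rules above can be turned into a rough draft check. This is only a sketch under stated assumptions: the verb list is illustrative and far from exhaustive, the design labels are hypothetical, and simple pattern matching cannot tell a causal verb from a noun such as "cardiovascular causes", so every hit still needs a human read.

```python
import re

# Designs for which the rules above permit causal verbs (assumption: randomized evidence only).
CAUSAL_OK = {"rct", "meta-analysis of rcts"}

# Illustrative, non-exhaustive list of causal phrasings taken from the examples above.
CAUSAL_PATTERN = re.compile(
    r"\b(?:prevents?|prevented|causes?|caused|results? in|resulted in)\b",
    re.IGNORECASE,
)

def flag_causal_language(sentence: str, design: str) -> list[str]:
    """Return causal phrases found in `sentence` when `design` does not support them."""
    if design.lower() in CAUSAL_OK:
        return []
    return [match.group(0).lower() for match in CAUSAL_PATTERN.finditer(sentence)]

print(flag_causal_language("Treatment X prevents stroke.", "cohort"))
print(flag_causal_language("Treatment X was associated with lower mortality.", "cohort"))
```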
### Hedging Phrases
| Certainty Level | Phrases |
|----------------|---------|
| **High** | "demonstrates," "shows," "confirms" |
| **Moderate** | "suggests," "indicates," "supports" |
| **Low** | "may," "might," "could potentially" |
| **Speculative** | "it is possible that," "one interpretation is" |
---
## Reporting Numbers
### Absolute vs. Relative Risk
**Always report both absolute and relative measures.**
**Incomplete**: "Treatment reduced mortality by 50%"
**Complete**: "Treatment reduced relative mortality by 50% (absolute risk reduction, 2.5 percentage points; number needed to treat, 40)"
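The relationship between the three measures in the complete example can be worked through in a few lines. The sketch assumes hypothetical event risks of 5.0% (control) and 2.5% (treated), chosen only because they reproduce the figures quoted above; `risk_measures` is an illustrative helper name.

```python
def risk_measures(control_risk: float, treated_risk: float) -> dict[str, float]:
    """Compute effect measures from two event risks given as proportions.

    Assumes treated_risk < control_risk, so the NNT is positive and finite.
    """
    arr = control_risk - treated_risk          # absolute risk reduction
    rrr = arr / control_risk                   # relative risk reduction
    nnt = 1 / arr                              # number needed to treat
    return {"arr": arr, "rrr": rrr, "nnt": nnt}

# Hypothetical risks: 5.0% mortality with control vs. 2.5% with treatment.
m = risk_measures(0.05, 0.025)
print(f"RRR {m['rrr']:.0%}, ARR {m['arr'] * 100:.1f} percentage points, NNT {m['nnt']:.0f}")
```

The same 50% relative reduction applied to a control risk of 0.5% would give an NNT of 400, which is why the relative figure alone is considered incomplete.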
### Confidence Intervals
**Always include 95% confidence intervals.**
❌ "The hazard ratio was 0.75"
✅ "The hazard ratio was 0.75 (95% CI, 0.62 to 0.91)"
### P-values
- Report exact P-values when possible: P=0.003
- Use P<0.001 for very small values
- Consider clinical significance alongside statistical significance
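A small formatter keeps P-value reporting consistent across a manuscript. The sketch below follows the bullets above; the choice of three decimals below 0.01 and two decimals otherwise is an assumed convention, and individual journals may specify something different.

```python
def format_p(p: float) -> str:
    """Format a P-value: exact where possible, 'P<0.001' for very small values."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P-value must lie in [0, 1]")
    if p < 0.001:
        return "P<0.001"
    # Assumed convention: three decimals below 0.01, two decimals otherwise.
    return f"P={p:.3f}" if p < 0.01 else f"P={p:.2f}"

for value in (0.0004, 0.003, 0.16):
    print(format_p(value))
```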
### Number Needed to Treat (NNT)
Include NNT for clinically important outcomes:
```
"The intervention prevented one additional death for every 40 patients
treated (NNT=40; 95% CI, 28 to 67)."
```
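One common way to obtain a confidence interval for the NNT is to invert the interval for the absolute risk reduction. The sketch below assumes a simple Wald interval and hypothetical event counts; `nnt_with_ci` is an illustrative name, and the approach only yields a simple range when the ARR interval excludes zero.

```python
import math

def nnt_with_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Return (NNT, lower, upper) by inverting a Wald interval for the ARR."""
    p_c = events_control / n_control
    p_t = events_treated / n_treated
    arr = p_c - p_t
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treated)
    arr_low, arr_high = arr - z * se, arr + z * se
    if arr_low <= 0:
        raise ValueError("ARR interval includes zero; the NNT interval is not a simple range")
    # A larger ARR means a smaller NNT, so the bounds swap when inverted.
    return 1 / arr, 1 / arr_high, 1 / arr_low

# Hypothetical trial: 100/2000 deaths with control vs. 50/2000 with treatment.
nnt, low, high = nnt_with_ci(100, 2000, 50, 2000)
print(f"NNT={nnt:.0f}; 95% CI, {low:.0f} to {high:.0f}")
```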
---
## Introduction
### Length and Structure
- **3-4 paragraphs** (500-700 words)
- Focus on clinical problem and rationale
### Paragraph Structure
**Paragraph 1: Clinical Problem**
- Burden of disease (incidence, prevalence, mortality)
- Impact on patients and healthcare system
- Why this matters clinically
```
"Type 2 diabetes affects more than 450 million adults worldwide and is
a leading cause of cardiovascular disease, renal failure, and premature
death. Despite advances in glucose-lowering therapies, patients with
diabetes continue to face a two- to four-fold increased risk of
cardiovascular events compared with the general population."
```
**Paragraph 2: Current Knowledge and Limitations**
- What treatments/approaches exist
- What evidence gaps remain
- Why more research was needed
**Paragraph 3: Rationale and Objectives**
- Why this study was conducted
- Clear statement of objectives/hypothesis
- Primary outcome stated
```
"We therefore conducted a randomized, controlled trial to evaluate
whether intensive glucose-lowering therapy, compared with standard
therapy, would reduce major cardiovascular events in patients with
type 2 diabetes and established cardiovascular disease."
```
---
## Methods
### Structure (CONSORT/STROBE Aligned)
Medical methods sections follow reporting guidelines:
```
METHODS
├── Study Design
├── Setting and Participants
│ ├── Eligibility Criteria
│ └── Recruitment
├── Randomization and Blinding (for RCTs)
├── Interventions
├── Outcomes
│ ├── Primary Outcome
│ └── Secondary Outcomes
├── Sample Size Calculation
├── Statistical Analysis
├── Ethics Approval
└── Registration
```
### Key Elements
**Eligibility Criteria**
- List inclusion and exclusion criteria explicitly
- Be specific (age ranges, disease definitions, lab values)
**Primary Outcome**
- Define precisely, including timing of assessment
- State how it was measured
**Statistical Analysis**
- Pre-specified analysis plan
- Handling of missing data
- Subgroup analyses (pre-specified vs. exploratory)
### Example Methods Paragraph
```
We enrolled adults aged 40 years or older with type 2 diabetes (defined
as HbA1c ≥6.5% or use of glucose-lowering medication) and established
cardiovascular disease (previous myocardial infarction, stroke, or
revascularization procedure). Patients were excluded if they had an
HbA1c level below 7.5% or above 11.0%, estimated glomerular filtration
rate below 30 ml per minute per 1.73 m² of body-surface area, or a
cardiovascular event within the past 30 days.
```
---
## Results
### Structure
**Opening: Participant Flow**
- Screening, enrollment, randomization, follow-up, analysis
- Reference CONSORT flow diagram
**Baseline Characteristics**
- Table 1: Baseline demographics and clinical characteristics
- Note any imbalances
**Primary Outcome**
- Report first and prominently
- Include point estimate, CI, P-value
- State clinical significance
**Secondary Outcomes**
- Report all pre-specified secondary outcomes
- Be cautious about multiple comparisons
**Adverse Events**
- Report serious adverse events systematically
- Include deaths, hospitalizations, SAEs by category
### Example Results Paragraph
```
Of 12,537 patients assessed for eligibility, 10,251 underwent
randomization: 5,128 were assigned to intensive therapy and 5,123 to
standard therapy (Figure 1). Baseline characteristics were similar
between groups (Table 1). Median follow-up was 3.5 years (interquartile
range, 2.8 to 4.2), with vital status available for 99.2% of patients.
The primary outcome occurred in 352 patients (6.9%) in the intensive-
therapy group and 371 patients (7.2%) in the standard-therapy group
(hazard ratio, 0.90; 95% confidence interval [CI], 0.78 to 1.04;
P=0.16). The absolute difference was 0.4 percentage points (95% CI,
-0.6 to 1.4). Results were consistent across pre-specified subgroups
(Figure 3).
```
---
## Discussion
### Structure
**Paragraph 1: Summary of Main Findings**
- Restate primary outcome result
- State whether hypothesis was supported
**Paragraphs 2-3: Interpretation and Context**
- How do findings compare with prior evidence?
- What mechanisms might explain findings?
- Clinical interpretation
**Paragraph 4: Strengths**
- Study design features
- Generalizability
- Completeness of follow-up
**Paragraph 5: Limitations**
- Be specific and thoughtful
- Discuss how limitations might affect interpretation
- Avoid generic statements
**Final Paragraph: Conclusions and Implications**
- Clinical implications
- Policy implications
- Future research needs
### Example Limitations Paragraph
```
Our study has several limitations. First, despite randomization, we
cannot exclude residual confounding from unmeasured factors. Second,
the open-label design may have introduced bias in outcome assessment
for subjective endpoints, though the components of the primary outcome
were objective events. Third, our findings may not generalize to patients without
established cardiovascular disease or to healthcare settings with
different resources. Fourth, the 3.5-year follow-up may have been
insufficient to detect cardiovascular benefits that emerge over
longer periods.
```
---
## Journal-Specific Requirements
### NEJM (New England Journal of Medicine)
- **Word limit**: 2,700 words (excluding abstract, references)
- **Abstract**: 250 words, structured
- **References**: ~40-50 typical
- **Figures/Tables**: 4-5 combined
- **Style**: Definitive, authoritative
- **Emphasis**: Major clinical trials, transformative research
### Lancet
- **Word limit**: 3,500 words for research articles
- **Abstract**: 300 words, structured
- **Summary box (Panel)**: Key messages highlighted
- **Research in Context**: Required section explaining contribution
- **Style**: Global health perspective valued
### JAMA (Journal of the American Medical Association)
- **Word limit**: 3,000 words for original investigations
- **Abstract**: 350 words, structured
- **Key Points box**: Required summary
- **Visual abstract**: Encouraged
- **Style**: Policy-relevant, public health focus
### BMJ (British Medical Journal)
- **Word limit**: 3,000 words for research
- **Abstract**: 300 words, structured
- **What this paper adds**: Required box
- **Strengths and limitations box**: Explicit section
- **Style**: Practical, evidence-based
### Annals of Internal Medicine
- **Word limit**: 3,000 words
- **Abstract**: 325 words, structured
- **Style**: Focused on internal medicine practice
- **Clinical Trials and Meta-analyses**: Specialty
---
## Reporting Guidelines Compliance
### CONSORT (RCTs)
**25-item checklist** including:
- Trial design, randomization, blinding
- Participant flow (diagram required)
- All outcomes with effect sizes and CIs
- Harms and adverse events
### STROBE (Observational)
**22-item checklist** for:
- Cohort, case-control, cross-sectional studies
- Setting, participants, variables, data sources
- Bias assessment, sensitivity analyses
### PRISMA (Systematic Reviews)
**27-item checklist** including:
- Search strategy
- Study selection process (diagram)
- Risk of bias assessment
- Synthesis methods
### STARD (Diagnostic Studies)
**30 items** for diagnostic accuracy studies
---
## Tables and Figures
### Table 1: Baseline Characteristics
Standard format:
```
                             Intensive Therapy   Standard Therapy
                             (N=5128)            (N=5123)
Age — yr                     63.4 ± 8.7          63.6 ± 8.5
Male sex — no. (%)           3389 (66.1)         3401 (66.4)
Body-mass index              32.1 ± 5.4          32.0 ± 5.3
HbA1c — %                    8.3 ± 1.1           8.3 ± 1.0
Duration of diabetes — yr    10.2 ± 7.8          10.1 ± 7.6
Prior MI — no. (%)           2435 (47.5)         2411 (47.1)
```
### CONSORT Flow Diagram
Required for RCTs:
```
Assessed for eligibility (n=12,537)
├─► Excluded (n=2,286)
│ ├─ Not meeting criteria (n=1,854)
│ ├─ Declined to participate (n=389)
│ └─ Other reasons (n=43)
Randomized (n=10,251)
├─► Intensive therapy (n=5,128)
│ ├─ Lost to follow-up (n=52)
│ └─ Analyzed (n=5,076)
└─► Standard therapy (n=5,123)
  ├─ Lost to follow-up (n=48)
  └─ Analyzed (n=5,075)
```
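Reviewers often check that the numbers in a flow diagram add up, so it is worth verifying the arithmetic before submission. The sketch below copies the counts from the diagram above into a hypothetical `flow` dictionary; `check_flow` is an illustrative helper, not part of the CONSORT statement.

```python
# Counts copied from the flow diagram above.
flow = {
    "assessed": 12_537,
    "excluded_total": 2_286,
    "excluded": {"Not meeting criteria": 1_854, "Declined to participate": 389, "Other reasons": 43},
    "randomized": 10_251,
    "arms": {
        "Intensive therapy": {"randomized": 5_128, "lost": 52, "analyzed": 5_076},
        "Standard therapy": {"randomized": 5_123, "lost": 48, "analyzed": 5_075},
    },
}

def check_flow(flow: dict) -> list[str]:
    """Return a list of arithmetic inconsistencies; an empty list means the counts add up."""
    problems = []
    if sum(flow["excluded"].values()) != flow["excluded_total"]:
        problems.append("exclusion reasons do not sum to the excluded total")
    if flow["assessed"] - flow["excluded_total"] != flow["randomized"]:
        problems.append("assessed - excluded != randomized")
    if sum(arm["randomized"] for arm in flow["arms"].values()) != flow["randomized"]:
        problems.append("arm sizes do not sum to the randomized total")
    for name, arm in flow["arms"].items():
        if arm["randomized"] - arm["lost"] != arm["analyzed"]:
            problems.append(f"{name}: randomized - lost != analyzed")
    return problems

print(check_flow(flow) or "All counts are consistent")
```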
### Kaplan-Meier Curves
Standard presentation:
- Survival curves with shaded confidence bands
- Number at risk table below
- Hazard ratio with 95% CI
- Log-rank P-value
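The survival curve and the number-at-risk table both come from the same calculation. The following is a minimal sketch of the Kaplan-Meier product-limit estimator on hypothetical data, with `events` equal to 1 for an observed event and 0 for censoring; it omits confidence bands and the log-rank test, which a statistics package would normally provide.

```python
def kaplan_meier(times, events):
    """Return [(time, number_at_risk, survival)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:   # group tied observation times
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk     # product-limit update
            curve.append((t, n_at_risk, survival))
        n_at_risk -= removed                       # events and censored both leave the risk set
    return curve

# Hypothetical follow-up times in months; 0 marks a censored patient.
for t, at_risk, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(f"t={t}  at risk={at_risk}  S(t)={s:.2f}")
```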
---
## Common Mistakes in Medical Writing
1. **Overclaiming causation**: Using "caused" for observational data
2. **Relative risk only**: Not reporting absolute measures
3. **Missing CIs**: Reporting point estimates without uncertainty
4. **Vague limitations**: "Our study has limitations" without specifics
5. **Ignoring negative results**: Selective reporting of outcomes
6. **Clinical significance confusion**: Statistically significant ≠ clinically meaningful
7. **Subgroup fishing**: Post-hoc subgroup analyses presented as confirmatory
8. **Missing CONSORT/STROBE items**: Incomplete reporting
---
## Pre-Submission Checklist
### Required Elements
- [ ] Structured abstract (journal-specific format)
- [ ] Trial registration number (for RCTs)
- [ ] Ethics committee approval statement
- [ ] Conflict of interest disclosures
- [ ] CONSORT/STROBE checklist completed
### Statistical Reporting
- [ ] Primary outcome reported with CI and P-value
- [ ] Absolute and relative measures included
- [ ] All pre-specified outcomes reported
- [ ] NNT calculated for significant clinical outcomes
### Evidence Language
- [ ] Claims match study design
- [ ] Appropriate hedging used
- [ ] Causal language only for RCTs
### Clinical Relevance
- [ ] Clinical implications stated
- [ ] Patient-centered outcomes emphasized
- [ ] Generalizability discussed
---
## See Also
- `venue_writing_styles.md` - Master style overview
- `journals_formatting.md` - Technical formatting requirements
- `reviewer_expectations.md` - What medical reviewers seek
- Reporting guideline resources: consort-statement.org, strobe-statement.org


@@ -0,0 +1,556 @@
# ML Conference Writing Style Guide
Comprehensive writing guide for NeurIPS, ICML, ICLR, CVPR, ECCV, ICCV, and other major machine learning and computer vision conferences.
**Last Updated**: 2024
---
## Overview
ML conferences prioritize **novelty**, **rigorous empirical evaluation**, and **reproducibility**. Papers are evaluated on clear contribution, strong baselines, comprehensive ablations, and honest discussion of limitations.
### Key Philosophy
> "Show don't tell—your experiments should demonstrate your claims, not just your prose."
**Primary Goal**: Advance the state of the art with novel methods validated through rigorous experimentation.
---
## Audience and Tone
### Target Reader
- ML researchers and practitioners
- Experts in the specific subfield
- Familiar with recent literature
- Expect technical depth and precision
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **Technical** | Dense with methodology details |
| **Precise** | Exact terminology, no ambiguity |
| **Empirical** | Claims backed by experiments |
| **Direct** | State contributions clearly |
| **Honest** | Acknowledge limitations |
### Voice
- **First person plural ("we")**: "We propose..." "Our method..."
- **Active voice**: "We introduce a novel architecture..."
- **Confident but measured**: Strong claims require strong evidence
---
## Abstract
### Style Requirements
- **Dense and numbers-focused**
- **150-250 words** (varies by venue)
- **Key results upfront**: Include specific metrics
- **Flowing paragraph** (not structured)
### Abstract Structure
1. **Problem** (1 sentence): What problem are you solving?
2. **Limitation of existing work** (1 sentence): Why current methods fall short
3. **Your approach** (1-2 sentences): What's your method?
4. **Key results** (2-3 sentences): Specific numbers on benchmarks
5. **Significance** (optional, 1 sentence): Why this matters
### Example Abstract (NeurIPS Style)
```
Transformers have achieved remarkable success in sequence modeling but
suffer from quadratic computational complexity, limiting their application
to long sequences. We introduce FlashAttention-2, an IO-aware exact
attention algorithm that achieves 2x speedup over FlashAttention and up
to 9x speedup over standard attention on sequences up to 16K tokens. Our
key insight is to reduce memory reads/writes by tiling and recomputation,
achieving optimal IO complexity. On the Long Range Arena benchmark,
FlashAttention-2 enables training with 8x longer sequences while matching
standard attention accuracy. Combined with sequence parallelism, we train
GPT-style models on sequences of 64K tokens at near-linear cost. We
release optimized CUDA kernels achieving 80% of theoretical peak FLOPS
on A100 GPUs. Code is available at [anonymous URL].
```
### Abstract Don'ts
❌ "We propose a novel method for X" (vague, no results)
❌ "Our method outperforms baselines" (no specific numbers)
❌ "This is an important problem" (self-evident claims)
✅ Include specific metrics: "achieves 94.5% accuracy, 3.2% improvement"
✅ Include scale: "on 1M samples" or "16K token sequences"
✅ Include comparison: "2x faster than previous SOTA"
---
## Introduction
### Structure (2-3 pages)
ML introductions have a distinctive structure with **numbered contributions**.
### Paragraph-by-Paragraph Guide
**Paragraph 1: Problem Motivation**
- Why is this problem important?
- What are the applications?
- Set up the technical challenge
```
"Large language models have demonstrated remarkable capabilities in
natural language understanding and generation. However, their quadratic
attention complexity presents a fundamental bottleneck for processing
long documents, multi-turn conversations, and reasoning over extended
contexts. As models scale to billions of parameters and context lengths
extend to tens of thousands of tokens, efficient attention mechanisms
become critical for practical deployment."
```
**Paragraph 2: Limitations of Existing Approaches**
- What methods exist?
- Why are they insufficient?
- Technical analysis of limitations
```
"Prior work has addressed this through sparse attention patterns,
linear attention approximations, and low-rank factorizations. While
these methods reduce theoretical complexity, they often sacrifice
accuracy, require specialized hardware, or introduce approximation
errors that compound in deep networks. Exact attention remains
preferable when computational resources permit."
```
**Paragraph 3: Your Approach (High-Level)**
- What's your key insight?
- How does your method work conceptually?
- Why should it succeed?
```
"We observe that the primary bottleneck in attention is not computation
but rather memory bandwidth—reading and writing the large N×N attention
matrix dominates runtime on modern GPUs. We propose FlashAttention-2,
which eliminates this bottleneck through a novel tiling strategy that
computes attention block-by-block without materializing the full matrix."
```
**Paragraph 4: Contribution List (CRITICAL)**
This is **mandatory and distinctive** for ML conferences:
```
Our contributions are as follows:
• We propose FlashAttention-2, an IO-aware exact attention algorithm
that achieves optimal memory complexity O(N²d/M) where M is GPU
SRAM size.
• We provide theoretical analysis showing that our algorithm achieves
2-4x fewer HBM accesses than FlashAttention on typical GPU
configurations.
• We demonstrate 2x speedup over FlashAttention and up to 9x over
standard PyTorch attention across sequence lengths from 256 to 64K
tokens.
• We show that FlashAttention-2 enables training with 8x longer
contexts on the same hardware, unlocking new capabilities for
long-range modeling.
• We release optimized CUDA kernels and PyTorch bindings at
[anonymous URL].
```
### Contribution Bullet Guidelines
| Good Contribution Bullets | Bad Contribution Bullets |
|--------------------------|-------------------------|
| Specific, quantifiable | Vague claims |
| Self-contained | Requires reading paper to understand |
| Distinct from each other | Overlapping bullets |
| Emphasize novelty | State obvious facts |
### Related Work Placement
- **In introduction**: Brief positioning (1-2 paragraphs)
- **Separate section**: Detailed comparison (at end or before conclusion)
- **Appendix**: Extended discussion if space-limited
---
## Method
### Structure (2-3 pages)
```
METHOD
├── Problem Formulation
├── Method Overview / Architecture
├── Key Technical Components
│ ├── Component 1 (with equations)
│ ├── Component 2 (with equations)
│ └── Component 3 (with equations)
├── Theoretical Analysis (if applicable)
└── Implementation Details
```
### Mathematical Notation
- **Define all notation**: "Let X ∈ ℝ^{N×d} denote the input sequence..."
- **Consistent symbols**: Same symbol means same thing throughout
- **Number important equations**: Reference by number later
### Algorithm Pseudocode
Include clear pseudocode for reproducibility:
```
Algorithm 1: FlashAttention-2 Forward Pass
─────────────────────────────────────────
Input: Q, K, V ∈ ℝ^{N×d}, block sizes B_r, B_c
Output: O ∈ ℝ^{N×d}
1: Divide Q into T_r = ⌈N/B_r⌉ blocks
2: Divide K, V into T_c = ⌈N/B_c⌉ blocks
3: Initialize O = 0, ℓ = 0, m = -∞
4: for i = 1 to T_r do
5:     Load Q_i from HBM to SRAM
6:     for j = 1 to T_c do
7:         Load K_j, V_j from HBM to SRAM
8:         Compute S_ij = Q_i K_j^T
9:         Update running max and sum
10:        Update O_i incrementally
11:    end for
12:    Write O_i to HBM
13: end for
14: return O
```
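Pseudocode like this is easier to trust when paired with a reference implementation that can be checked against the naive computation. The sketch below is a pure-Python illustration of the blockwise loop with a running max and running sum; it is not the actual CUDA kernel, it does not model HBM or SRAM, and it omits the usual 1/√d scaling because the pseudocode does. The function names and default block sizes are assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def naive_attention(Q, K, V):
    """Reference: softmax(Q K^T) V, materializing every full score row."""
    out = []
    for q in Q:
        scores = [dot(q, k) for k in K]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out

def tiled_attention(Q, K, V, block_r=2, block_c=2):
    """Same result computed tile by tile, never holding a full score row."""
    n, width = len(Q), len(V[0])
    O = [[0.0] * width for _ in range(n)]
    for i0 in range(0, n, block_r):                    # steps 4-5: one query block
        rows = list(range(i0, min(i0 + block_r, n)))
        m = [-math.inf] * len(rows)                    # running max per row
        l = [0.0] * len(rows)                          # running softmax denominator
        acc = [[0.0] * width for _ in rows]            # unnormalized output
        for j0 in range(0, len(K), block_c):           # steps 6-7: one key/value block
            cols = range(j0, min(j0 + block_c, len(K)))
            for t, r in enumerate(rows):
                s = [dot(Q[r], K[c]) for c in cols]    # step 8: scores for this tile
                m_new = max(m[t], max(s))              # step 9: update running max
                scale = math.exp(m[t] - m_new)         # rescale earlier partial results
                p = [math.exp(x - m_new) for x in s]
                l[t] = l[t] * scale + sum(p)           # step 9: update running sum
                acc[t] = [a * scale + sum(w * V[c][k] for w, c in zip(p, cols))
                          for k, a in enumerate(acc[t])]   # step 10
                m[t] = m_new
        for t, r in enumerate(rows):                   # step 12: normalize and store
            O[r] = [a / l[t] for a in acc[t]]
    return O
```

Comparing `tiled_attention` with `naive_attention` on small random inputs is a cheap way to confirm that the rescaling logic is correct before optimizing anything.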
### Architecture Diagrams
- **Clear, publication-quality figures**
- **Label all components**
- **Show data flow with arrows**
- **Use consistent visual language**
---
## Experiments
### Structure (2-3 pages)
```
EXPERIMENTS
├── Experimental Setup
│ ├── Datasets and Benchmarks
│ ├── Baselines
│ ├── Implementation Details
│ └── Evaluation Metrics
├── Main Results
│ └── Table/Figure with primary comparisons
├── Ablation Studies
│ └── Component-wise analysis
├── Analysis
│ ├── Scaling behavior
│ ├── Qualitative examples
│ └── Error analysis
└── Computational Efficiency
```
### Datasets and Benchmarks
- **Use standard benchmarks**: Establish comparability
- **Report dataset statistics**: Size, splits, preprocessing
- **Justify non-standard choices**: If using custom data, explain why
### Baselines
**Critical for acceptance.** Include:
- **Recent SOTA**: Not just old methods
- **Fair comparisons**: Same compute budget, hyperparameter tuning
- **Ablated versions**: Your method without key components
- **Strong baselines**: Don't cherry-pick weak competitors
### Main Results Table
Clear, comprehensive formatting:
```
Table 1: Results on Long Range Arena Benchmark (accuracy %)
──────────────────────────────────────────────────────────
Method         | ListOps | Text | Retrieval | Image | Path | Avg
──────────────────────────────────────────────────────────
Transformer    | 36.4    | 64.3 | 57.5      | 42.4  | 71.4 | 54.4
Performer      | 18.0    | 65.4 | 53.8      | 42.8  | 77.1 | 51.4
Linear Attn    | 16.1    | 65.9 | 53.1      | 42.3  | 75.3 | 50.5
FlashAttention | 37.1    | 64.5 | 57.8      | 42.7  | 71.2 | 54.7
FlashAttn-2    | 37.4    | 64.7 | 58.2      | 42.9  | 71.8 | 55.0
──────────────────────────────────────────────────────────
```
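Summary columns such as "Avg" are a frequent source of transcription errors, so it helps to recompute them from the per-task numbers. The sketch below copies the values from the table above into a hypothetical `rows` dictionary and flags any reported average that differs from the recomputed mean by more than rounding to one decimal allows.

```python
# Per-task accuracies copied from Table 1 above; the second element is the reported average.
rows = {
    "Transformer":    ([36.4, 64.3, 57.5, 42.4, 71.4], 54.4),
    "Performer":      ([18.0, 65.4, 53.8, 42.8, 77.1], 51.4),
    "Linear Attn":    ([16.1, 65.9, 53.1, 42.3, 75.3], 50.5),
    "FlashAttention": ([37.1, 64.5, 57.8, 42.7, 71.2], 54.7),
    "FlashAttn-2":    ([37.4, 64.7, 58.2, 42.9, 71.8], 55.0),
}

def mismatched_averages(rows, tolerance=0.05):
    """Return method names whose reported average is off by more than the rounding tolerance."""
    bad = []
    for name, (scores, reported) in rows.items():
        mean = sum(scores) / len(scores)
        if abs(mean - reported) > tolerance + 1e-9:
            bad.append(name)
    return bad

print(mismatched_averages(rows) or "All reported averages match")
```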
### Ablation Studies (MANDATORY)
Show what matters in your method:
```
Table 2: Ablation Study on FlashAttention-2 Components
──────────────────────────────────────────────────────
Variant                        | Speedup | Memory
──────────────────────────────────────────────────────
Full FlashAttention-2          | 2.0x    | 1.0x
- without sequence parallelism | 1.7x    | 1.0x
- without recomputation        | 1.3x    | 2.4x
- without block tiling         | 1.0x    | 4.0x
FlashAttention-1 (baseline)    | 1.0x    | 1.0x
──────────────────────────────────────────────────────
```
### What Ablations Should Show
- **Each component matters**: Removing it hurts performance
- **Design choices justified**: Why this architecture/hyperparameter?
- **Failure modes**: When does method not work?
- **Sensitivity analysis**: Robustness to hyperparameters
---
## Related Work
### Placement Options
1. **After Introduction**: Common in CV papers
2. **Before Conclusion**: Common in NeurIPS/ICML
3. **Appendix**: When space is tight
### Writing Style
- **Organized by theme**: Not chronological
- **Position your work**: How you differ from each line of work
- **Fair characterization**: Don't misrepresent prior work
- **Recent citations**: Include papers from the past two years (2023-2024 at the time of writing)
### Example Structure
```
**Efficient Attention Mechanisms.** Prior work on efficient attention
falls into three categories: sparse patterns (Beltagy et al., 2020;
Zaheer et al., 2020), linear approximations (Katharopoulos et al., 2020;
Choromanski et al., 2021), and low-rank factorizations (Wang et al.,
2020). Our work differs in that we focus on IO-efficient exact
attention rather than approximations.
**Memory-Efficient Training.** Gradient checkpointing (Chen et al., 2016)
and activation recomputation (Korthikanti et al., 2022) reduce memory
by trading compute. We adopt similar ideas but apply them within the
attention operator itself.
```
---
## Limitations Section
### Why It Matters
**Increasingly required** at NeurIPS, ICML, ICLR. Honest limitations:
- Show scientific maturity
- Guide future work
- Prevent overselling
### What to Include
1. **Method limitations**: When does it fail?
2. **Experimental limitations**: What wasn't tested?
3. **Scope limitations**: What's out of scope?
4. **Computational limitations**: Resource requirements
### Example Limitations Section
```
**Limitations.** While FlashAttention-2 provides substantial speedups,
several limitations remain. First, our implementation is optimized for
NVIDIA GPUs and does not support AMD or other hardware. Second, the
speedup is most pronounced for medium to long sequences; for very short
sequences (<256 tokens), the overhead of our kernel launch dominates.
Third, we focus on dense attention; extending our approach to sparse
attention patterns remains future work. Finally, our theoretical
analysis assumes specific GPU memory hierarchy parameters that may not
hold for future hardware generations.
```
---
## Reproducibility
### Reproducibility Checklist (NeurIPS/ICML)
Most ML conferences require a reproducibility checklist covering:
- [ ] Code availability
- [ ] Dataset availability
- [ ] Hyperparameters specified
- [ ] Random seeds reported
- [ ] Compute requirements stated
- [ ] Number of runs and variance reported
- [ ] Statistical significance tests
### What to Report
**Hyperparameters**:
```
"We train with Adam (β₁=0.9, β₂=0.999, ε=1e-8) and learning rate 3e-4
with linear warmup over 1000 steps and cosine decay. Batch size is 256
across 8 A100 GPUs. We train for 100K steps (approximately 24 hours)."
```
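
A schedule described in prose is easier to reproduce when the formula is also spelled out. The sketch below is one possible reading of the sentence above, assuming warmup starts from zero and the cosine decays to zero at the final step (neither is stated in the example text); `learning_rate` is an illustrative name.

```python
import math

def learning_rate(step: int, peak_lr: float = 3e-4,
                  warmup_steps: int = 1000, total_steps: int = 100_000) -> float:
    """Linear warmup to peak_lr, then cosine decay.

    Assumptions: warmup starts at 0 and the decay ends at 0.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * min(progress, 1.0)))

# Print the schedule at a few checkpoints.
for step in (0, 500, 1000, 50_500, 100_000):
    print(step, learning_rate(step))
```

Stating the warmup start value and the decay floor in the paper removes the ambiguity this sketch has to assume away.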
**Random Seeds**:
```
"All experiments are averaged over 3 random seeds (0, 1, 2) with
standard deviation reported in parentheses."
```
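
A short sketch of the "mean (standard deviation)" reporting convention, using only the Python standard library. The accuracies are hypothetical placeholders for three seeds, and `summarize` is an illustrative helper; it uses the sample standard deviation, which is worth stating explicitly in the paper.

```python
from statistics import mean, stdev

def summarize(runs: list[float]) -> str:
    """Format per-seed results as 'mean (std)' using the sample standard deviation."""
    return f"{mean(runs):.1f} ({stdev(runs):.1f})"

# Hypothetical accuracies from seeds 0, 1, 2.
accuracies = [54.8, 55.0, 55.3]
print(summarize(accuracies))
```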
**Compute**:
```
"Experiments were conducted on 8 NVIDIA A100-80GB GPUs. Total training
time was approximately 500 GPU-hours."
```
---
## Figures
### Figure Quality
- **Vector graphics preferred**: PDF, SVG
- **High resolution for rasters**: 300+ dpi
- **Readable at publication size**: Test at actual column width
- **Colorblind-accessible**: Use patterns in addition to color
### Common Figure Types
1. **Architecture diagram**: Show your method visually
2. **Performance plots**: Learning curves, scaling behavior
3. **Comparison tables**: Main results
4. **Ablation figures**: Component contributions
5. **Qualitative examples**: Input/output samples
### Figure Captions
Self-contained captions that explain:
- What is shown
- How to read the figure
- Key takeaway
---
## References
### Citation Style
- **Numbered [1]** or **author-year (Smith et al., 2023)**
- Check venue-specific requirements
- Be consistent throughout
### Reference Guidelines
- **Cite recent work**: 2022-2024 papers expected
- **Don't over-cite yourself**: Raises bias concerns
- **Cite arXiv appropriately**: Use the published version when available
- **Include all relevant prior work**: Missing citations hurt review
---
## Venue-Specific Notes
### NeurIPS
- **9 content pages** in recent years (confirm in the current call for papers) + unlimited appendix/references
- **Broader Impact** section sometimes required
- **Reproducibility checklist** mandatory
- OpenReview submission, public reviews
### ICML
- **8 pages** main + unlimited appendix/references
- Strong emphasis on **theory + experiments**
- Reproducibility statement encouraged
### ICLR
- **8-10 pages** main depending on the year (check the current call; camera-ready can exceed)
- OpenReview with **public reviews and discussion**
- Author response period is interactive
- Strong emphasis on **novelty and insight**
### CVPR/ICCV/ECCV
- **8 pages** main excluding references for CVPR/ICCV (ECCV uses the longer LNCS format; check its call)
- **Supplementary video** encouraged
- Heavy emphasis on **visual results**
- Benchmark performance critical
---
## Common Mistakes
1. **Weak baselines**: Not comparing to recent SOTA
2. **Missing ablations**: Not showing component contributions
3. **Overclaiming**: "We solve X" when you partially address X
4. **Vague contributions**: "We propose a novel method"
5. **Poor reproducibility**: Missing hyperparameters, seeds
6. **Wrong template**: Using last year's style file
7. **Anonymity violations**: Revealing identity in blind review
8. **Missing limitations**: Not acknowledging failure modes
---
## Rebuttal Tips
ML conferences have author response periods. Tips:
- **Address key concerns first**: Prioritize critical issues
- **Run requested experiments**: When feasible in time
- **Be concise**: Reviewers read many rebuttals
- **Stay professional**: Even with unfair reviews
- **Reference specific lines**: "As stated in L127..."
---
## Pre-Submission Checklist
### Content
- [ ] Clear problem motivation
- [ ] Explicit contribution list
- [ ] Complete method description
- [ ] Comprehensive experiments
- [ ] Strong baselines included
- [ ] Ablation studies present
- [ ] Limitations acknowledged
### Technical
- [ ] Correct venue style file (current year)
- [ ] Anonymized (no author names, no identifiable URLs)
- [ ] Page limit respected
- [ ] References complete
- [ ] Supplementary organized
### Reproducibility
- [ ] Hyperparameters listed
- [ ] Random seeds specified
- [ ] Compute requirements stated
- [ ] Code/data availability noted
- [ ] Reproducibility checklist completed
---
## See Also
- `venue_writing_styles.md` - Master style overview
- `conferences_formatting.md` - Technical formatting requirements
- `reviewer_expectations.md` - What ML reviewers seek


@@ -0,0 +1,405 @@
# Nature and Science Writing Style Guide
Comprehensive writing guide for Nature, Science, and related high-impact multidisciplinary journals (Nature Communications, Science Advances, PNAS).
**Last Updated**: 2024
---
## Overview
Nature and Science are the world's premier multidisciplinary scientific journals. Papers published here must appeal to scientists across all disciplines, not just specialists. This fundamentally shapes the writing style.
### Key Philosophy
> "If a structural biologist can't understand why your particle physics paper matters, it won't be published in Nature."
**Primary Goal**: Communicate groundbreaking science to an educated but non-specialist audience.
---
## Audience and Tone
### Target Reader
- PhD-level scientist in **any** field
- Familiar with scientific methodology
- **Not** an expert in your specific subfield
- Reading broadly to stay current across science
### Tone Characteristics
| Characteristic | Description |
|---------------|-------------|
| **Accessible** | Avoid jargon; explain technical concepts |
| **Engaging** | Hook the reader; tell a story |
| **Significant** | Emphasize why this matters broadly |
| **Confident** | State findings clearly (with appropriate hedging) |
| **Active** | Use active voice; first person acceptable |
### Voice
- **First person plural ("we") is encouraged**: "We discovered that..." not "It was discovered that..."
- **Active voice preferred**: "We measured..." not "Measurements were taken..."
- **Direct statements**: "Protein X controls Y" not "Protein X appears to potentially control Y"
---
## Abstract
### Style Requirements
- **Flowing paragraphs** (NOT structured with labeled sections)
- **150-200 words** for Nature; up to 250 for Nature Communications
- **No citations** in abstract
- **No abbreviations** (or define at first use if essential)
- **Self-contained**: Understandable without reading the paper
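
A quick length check can catch an over-long abstract before submission. This is a minimal sketch that counts whitespace-separated tokens; journal submission systems may count differently (for example, for hyphenated terms or numbers), and `within_limit` is an illustrative helper whose default bounds simply mirror the 150-200 word range above.

```python
def abstract_word_count(text: str) -> int:
    """Count whitespace-separated tokens; a hyphenated term counts as one word."""
    return len(text.split())

def within_limit(text: str, low: int = 150, high: int = 200) -> bool:
    """Return True if the abstract length falls inside [low, high] words."""
    return low <= abstract_word_count(text) <= high

draft = "word " * 180  # stand-in for a real abstract
print(abstract_word_count(draft), within_limit(draft))
```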
### Abstract Structure (Implicit)
Write as flowing prose covering:
1. **Context** (1-2 sentences): Why this area matters
2. **Gap/Problem** (1 sentence): What was unknown or problematic
3. **Approach** (1 sentence): What you did (briefly)
4. **Key findings** (2-3 sentences): Main results with key numbers
5. **Significance** (1-2 sentences): Why this matters, implications
### Example Abstract (Nature Style)
```
The origins of multicellular life remain one of biology's greatest mysteries.
How individual cells first cooperated to form complex organisms has been
difficult to study because the transition occurred over 600 million years ago.
Here we show that the unicellular alga Chlamydomonas reinhardtii can evolve
simple multicellular structures within 750 generations when exposed to
predation pressure. Using experimental evolution with the predator Paramecium,
we observed the emergence of stable multicellular clusters in 5 of 10
replicate populations. Genomic analysis revealed that mutations in just two
genes—encoding cell adhesion proteins—were sufficient to trigger this
transition. These results demonstrate that the evolution of multicellularity
may require fewer genetic changes than previously thought, providing insight
into one of life's major transitions.
```
### What NOT to Write
**Too technical**:
> "Using CRISPR-Cas9-mediated knockout of the CAD1 gene (encoding cadherin-1) in C. reinhardtii strain CC-125, we demonstrated that loss of CAD1 function combined with overexpression of FLA10 under control of the HSP70A/RBCS2 tandem promoter..."
**Too vague**:
> "We studied how cells can form groups. Our results are interesting and may have implications for understanding evolution."
---
## Introduction
### Length and Structure
- **3-5 paragraphs** (roughly 500-800 words)
- **Funnel structure**: Broad → Specific → Your contribution
### Paragraph-by-Paragraph Guide
**Paragraph 1: The Big Picture**
- Open with a broad, engaging statement about the field
- Establish why this area matters to science/society
- Accessible to any scientist
```
Example:
"The ability to predict protein structure from sequence alone has been a grand
challenge of biology for over 50 years. Accurate predictions would transform
drug discovery, enable understanding of disease mechanisms, and illuminate the
fundamental rules governing molecular self-assembly."
```
**Paragraph 2-3: What We Know**
- Review key prior work (selectively, not exhaustively)
- Build toward the gap you'll address
- Keep citations focused on essential papers
```
Example:
"Significant progress has been made through template-based methods that
leverage known structures of homologous proteins. However, for the estimated
30% of proteins without detectable homologs, prediction accuracy has remained
limited. Deep learning approaches have shown promise, achieving improved
accuracy on benchmark datasets, yet still fall short of experimental accuracy
for many protein families."
```
**Paragraph 4: The Gap**
- Clearly state what remains unknown or unresolved
- Frame this as an important problem
```
Example:
"Despite these advances, the fundamental question remains: can we predict
protein structure with experimental-level accuracy for proteins across all
of sequence space? This capability would democratize structural biology and
enable rapid characterization of newly discovered proteins."
```
**Final Paragraph: This Paper**
- State what you did and preview key findings
- Signal the significance of your contribution
```
Example:
"Here we present AlphaFold2, a neural network architecture that predicts
protein structure with atomic-level accuracy. In the CASP14 blind assessment,
AlphaFold2 achieved a median GDT score of 92.4, matching experimental
accuracy for most targets. We show that this system can be applied to predict
structures across entire proteomes, opening new avenues for understanding
protein function at scale."
```
### Introduction Don'ts
- ❌ Don't start with "Since ancient times..." or overly grandiose claims
- ❌ Don't provide an exhaustive literature review (save for specialist journals)
- ❌ Don't include methods or results in the introduction
- ❌ Don't use unexplained acronyms or jargon
---
## Results
### Organizational Philosophy
**Story-driven, not experiment-driven**
Organize by **finding**, not by the chronological order of experiments:
**Experiment-driven** (avoid):
> "We first performed experiment A. Next, we did experiment B. Then we conducted experiment C."
**Finding-driven** (preferred):
> "We discovered that X. To understand the mechanism, we found that Y. This led us to test whether Z, confirming our hypothesis."
### Results Writing Style
- **Past tense** for describing what was done/found
- **Present tense** for referring to figures ("Figure 2 shows...")
- **Objective but interpretive**: State findings with minimal interpretation, but provide enough context for non-specialists
- **Quantitative**: Include key numbers, statistics, effect sizes
### Example Results Paragraph
```
To test whether protein X is required for cell division, we generated
knockout cell lines using CRISPR-Cas9 (Fig. 1a). Cells lacking protein X
showed a 73% reduction in division rate compared to controls (P < 0.001,
n = 6 biological replicates; Fig. 1b). Live-cell imaging revealed that
knockout cells arrested in metaphase, with 84% showing abnormal spindle
morphology (Fig. 1c,d). These results demonstrate that protein X is
essential for proper spindle assembly and cell division.
```
### Subheadings
Use descriptive subheadings that convey findings:
**Vague**: "Protein expression analysis"
**Informative**: "Protein X is upregulated in response to stress"
---
## Discussion
### Structure (4-6 paragraphs)
**Paragraph 1: Summary of Key Findings**
- Restate main findings (don't repeat Results verbatim)
- State whether hypotheses were supported
**Paragraphs 2-3: Interpretation and Context**
- What do the findings mean?
- How do they relate to prior work?
- What mechanisms might explain the results?
**Paragraph 4: Broader Implications**
- Why does this matter beyond your specific system?
- Connections to other fields
- Potential applications
**Paragraph 5: Limitations**
- Acknowledge limitations honestly
- Be specific, not generic
**Final Paragraph: Conclusions and Future**
- Big-picture take-home message
- Brief mention of future directions
### Discussion Writing Tips
- **Lead with implications**, not caveats
- **Compare to literature constructively**: "Our findings extend the work of Smith et al. by demonstrating..."
- **Acknowledge alternative interpretations**: "An alternative explanation is that..."
- **Be honest about limitations**: Specific > generic
### Example Limitation Statement
**Generic**: "Our study has limitations that should be addressed in future work."
**Specific**: "Our analysis was limited to cultured cells, which may not fully recapitulate the tissue microenvironment. Additionally, the 48-hour observation window may miss slower-developing phenotypes."
---
## Methods
### Methods Placement in Nature
- **Brief Methods** in main text (often at the end)
- **Extended Methods** in Supplementary Information
- Must be detailed enough for reproduction
### Writing Style
- **Past tense, passive voice acceptable**: "Cells were cultured..." or "We cultured cells..."
- **Precise and reproducible**: Include concentrations, times, temperatures
- **Reference established protocols**: "Following the method of Smith et al.³..."
---
## Figures
### Figure Philosophy
Nature values **conceptual figures** alongside data:
1. **Figure 1**: Often a schematic/model showing the concept
2. **Data figures**: Clear, not cluttered
3. **Final figure**: Often a summary model
### Figure Design Principles
- **Single-column (89 mm) or double-column (183 mm)** width
- **High resolution**: 300+ dpi for photos, 1000+ dpi for line art
- **Colorblind-accessible**: Avoid red-green distinctions alone
- **Minimal chartjunk**: No 3D effects, unnecessary gridlines
- **Complete legends**: Self-explanatory without reading text
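
The width and resolution targets above translate into a minimum pixel count for raster images. A small worked calculation, assuming the image is printed at exactly the stated column width (`pixels_for_width` is an illustrative helper):

```python
import math

MM_PER_INCH = 25.4

def pixels_for_width(width_mm: float, dpi: int) -> int:
    """Minimum pixel width for a raster image printed at width_mm and dpi."""
    return math.ceil(width_mm / MM_PER_INCH * dpi)

# Single-column and double-column photos at 300 dpi, line art at 1000 dpi.
for width_mm, dpi in ((89, 300), (183, 300), (89, 1000)):
    print(f"{width_mm} mm at {dpi} dpi: {pixels_for_width(width_mm, dpi)} px")
```

Exporting at or above these pixel widths avoids upscaling when the figure is placed at its final size.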
### Figure Legend Format
```
Figure 1 | Protein X controls cell division through spindle assembly.
a, Schematic of the experimental approach. b, Quantification of cell
division rate in control (grey) and knockout (blue) cells. Data are
mean ± s.e.m., n = 6 biological replicates. ***P < 0.001, two-tailed
t-test. c,d, Representative images of spindle morphology in control (c)
and knockout (d) cells. Scale bars, 10 μm.
```
---
## References
### Citation Style
- **Numbered superscripts**: ¹, ², ¹⁻³, ¹,⁵,⁷
- **Nature format** for bibliography
### Reference Format
```
1. Watson, J. D. & Crick, F. H. C. Molecular structure of nucleic acids.
Nature 171, 737–738 (1953).
2. Smith, A. B., Jones, C. D. & Williams, E. F. Discovery of protein X.
Science 380, 123–130 (2023).
```
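
The pattern in the entries above (authors joined with commas and a final "&", title, journal, volume, page range, year) can be sketched as a small formatter. This is an illustration of the plain-text layout only: it ignores typographic details such as italic journal names and bold volume numbers, as well as any author-truncation rule, and the function names are hypothetical.

```python
def format_authors(authors: list[str]) -> str:
    """Join authors with commas and an ampersand before the last name."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + " & " + authors[-1]

def format_reference(number: int, authors: list[str], title: str, journal: str,
                     volume: int, first_page: int, last_page: int, year: int) -> str:
    """Build one numbered reference line in the layout shown above."""
    pages = f"{first_page}\u2013{last_page}"  # en dash between page numbers
    return (f"{number}. {format_authors(authors)} {title}. "
            f"{journal} {volume}, {pages} ({year}).")

print(format_reference(1, ["Watson, J. D.", "Crick, F. H. C."],
                       "Molecular structure of nucleic acids",
                       "Nature", 171, 737, 738, 1953))
```

In practice a reference manager with the journal's citation style does this job; the sketch only makes the field order explicit.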
### Citation Best Practices
- **Recent literature**: Include papers from last 2-3 years
- **Seminal papers**: Cite foundational work
- **Diverse sources**: Don't over-cite your own work
- **Primary sources**: Cite original discoveries, not reviews (when possible)
---
## Language and Style Tips
### Word Choice
| Avoid | Prefer |
|-------|--------|
| utilize | use |
| methodology | method |
| in order to | to |
| a large number of | many |
| at this point in time | now |
| has the ability to | can |
| it is interesting to note that | [delete entirely] |
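
The substitutions in the table lend themselves to a quick automated pass over a draft. A minimal sketch, assuming plain-text input and whole-phrase matching (so inflected forms such as "utilized" are not caught); `PREFERRED` and `flag_wordy` are illustrative names, and every hit still needs a human decision.

```python
import re

# Phrase -> suggested replacement, taken from the table above.
PREFERRED = {
    "utilize": "use",
    "methodology": "method",
    "in order to": "to",
    "a large number of": "many",
    "at this point in time": "now",
    "has the ability to": "can",
}

def flag_wordy(text: str) -> list[tuple[str, str]]:
    """Return (phrase, suggestion) pairs for each listed phrase found in text."""
    hits = []
    for phrase, better in PREFERRED.items():
        if re.search(rf"\b{re.escape(phrase)}\b", text, flags=re.IGNORECASE):
            hits.append((phrase, better))
    return hits

print(flag_wordy("We utilize imaging in order to track division."))
```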
### Sentence Structure
- **Vary sentence length**: Mix short and longer sentences
- **Lead with importance**: Put key information at the start
- **One idea per sentence**: Complex ideas need multiple sentences
### Paragraph Structure
- **Topic sentence first**: State the main point
- **Supporting evidence**: Data and citations
- **Transition**: Connect to next paragraph
---
## Comparison: Nature vs. Science
| Feature | Nature | Science |
|---------|--------|---------|
| Abstract length | 150-200 words | ≤125 words |
| Citation style | Numbered superscript | Numbered parentheses (1, 2) |
| Article titles in refs | Yes | No (in main refs) |
| Methods placement | End of paper or supplement | Supplement |
| Significance statement | No | No |
| Open access option | Yes | Yes |
---
## Common Rejection Reasons
1. **Not of sufficient broad interest**: Too specialized for Nature/Science
2. **Incremental advance**: Not transformative enough
3. **Overselling**: Claims not supported by data
4. **Poor accessibility**: Too technical for general audience
5. **Weak significance statement**: "So what?" unclear
6. **Insufficient novelty**: Similar findings published elsewhere
7. **Methodological concerns**: Results not convincing
---
## Pre-Submission Checklist
### Content
- [ ] Significance to broad audience clear in first paragraph
- [ ] Non-specialist can understand the abstract
- [ ] Story-driven results (not experiment-by-experiment)
- [ ] Implications emphasized in discussion
- [ ] Limitations acknowledged specifically
### Style
- [ ] Active voice predominates
- [ ] Jargon minimized or explained
- [ ] Sentences vary in length
- [ ] Paragraphs have clear topic sentences
### Technical
- [ ] Figures are high resolution
- [ ] Citations in correct format
- [ ] Word count within limits
- [ ] Line numbers included
- [ ] Double-spaced
---
## See Also
- `venue_writing_styles.md` - Master style overview
- `journals_formatting.md` - Technical formatting requirements
- `reviewer_expectations.md` - What Nature/Science reviewers seek


@@ -0,0 +1,417 @@
# Reviewer Expectations by Venue
Understanding what reviewers look for at different venues is essential for crafting successful submissions. This guide covers evaluation criteria, common rejection reasons, and how to address reviewer concerns.
**Last Updated**: 2024
---
## Overview
Reviewers at different venues prioritize different aspects. Understanding these priorities helps you:
1. Frame your contribution appropriately
2. Anticipate likely criticisms
3. Prepare effective rebuttals
4. Decide where to submit
---
## High-Impact Journals (Nature, Science, Cell)
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Broad significance** | Critical | Impact beyond the specific subfield |
| **Novelty** | Critical | First to show this or major advance |
| **Technical rigor** | High | Sound methodology, appropriate controls |
| **Clarity** | High | Accessible to non-specialists |
| **Completeness** | Moderate | Thorough but not exhaustive |
### Review Process
1. **Editorial triage**: Most papers are rejected without external review (Nature accepts only about 8% of submissions overall)
2. **Expert review**: 2-4 reviewers if sent out
3. **Cross-discipline reviewer**: Often includes non-specialist
4. **Quick turnaround**: First decision typically 2-4 weeks
### What Gets a Paper Rejected
**At Editorial Stage**:
- Findings not significant enough for broad audience
- Incremental advance over prior work
- Too specialized for the journal
- Topic doesn't fit current editorial interests
**At Review Stage**:
- Claims not supported by data
- Missing critical controls
- Alternative interpretations not addressed
- Statistical concerns
- Prior work not adequately acknowledged
- Writing inaccessible to non-specialists
### How to Address Nature/Science Reviewers
**In the paper**:
- Lead with significance in the first paragraph
- Explain why findings matter broadly
- Include controls for all major claims
- Use clear, accessible language
- Include conceptual figures
**In rebuttal**:
- Address every point (even minor ones)
- Provide new data when requested
- Acknowledge valid criticisms gracefully
- Explain significance if questioned
### Sample Reviewer Concerns and Responses
**Reviewer**: "The significance of this work is unclear to a general audience."
**Response**: "We have revised the introduction to clarify the broader significance. As now stated in paragraph 1, our findings have implications for [X] because [Y]. We have also added a discussion of how these results inform understanding of [Z] (p. 8, lines 15-28)."
---
## Medical Journals (NEJM, Lancet, JAMA)
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Clinical relevance** | Critical | Will this change practice? |
| **Methodological rigor** | Critical | CONSORT/STROBE compliance |
| **Patient outcomes** | Critical | Focus on what matters to patients |
| **Statistical validity** | High | Appropriate analysis, power |
| **Generalizability** | High | Applicability to broader populations |
### Review Process
1. **Statistical review**: Dedicated statistical reviewer common
2. **Clinical expertise**: Subspecialty experts
3. **Methodological review**: Focus on study design
4. **Multiple rounds**: Revisions often requested
### What Gets a Paper Rejected
**Major Issues**:
- Underpowered study
- Inappropriate control/comparator
- Confounding not addressed
- Selective outcome reporting
- Missing safety data
- Claims exceed evidence
**Moderate Issues**:
- Unclear generalizability
- Missing subgroup analyses
- Incomplete CONSORT/STROBE reporting
- Statistical methods not described adequately
### Sample Reviewer Concerns and Responses
**Reviewer**: "The study appears underpowered for the primary outcome. With 200 participants and an event rate of 5%, there is insufficient power to detect a clinically meaningful difference."
**Response**: "We appreciate this concern. Our power calculation (Methods, p. 5) was based on a 5% event rate in the control arm and a 50% relative reduction (to 2.5%). While the observed event rate (4.8%) was close to projected, we acknowledge the confidence interval is wide (HR 0.65, 95% CI 0.38-1.12). We have added this as a limitation (Discussion, p. 12). Importantly, the direction and magnitude of effect are consistent with the larger XYZ trial (n=5000), suggesting our findings merit confirmation in a larger study."
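
The reviewer's power concern can be checked with a back-of-the-envelope calculation. The sketch below uses the standard unpooled normal approximation for comparing two proportions; the two-sided alpha of 0.05 and 80% power are assumptions not stated in the exchange above, and `n_per_arm` is an illustrative helper rather than a substitute for a statistician's calculation.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p_control: float, p_treatment: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-arm sample size for a two-sided test of two proportions
    (unpooled normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2)

# 5% control event rate, 50% relative reduction to 2.5%.
print(n_per_arm(0.05, 0.025))
```

Under these assumptions the requirement is roughly 900 participants per arm, which shows why a 200-participant study draws the "underpowered" comment and why the response concedes the wide confidence interval.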
---
## Cell Press Journals
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Mechanistic insight** | Critical | How does this work? |
| **Depth of investigation** | Critical | Multiple approaches, comprehensive |
| **Biological significance** | High | Importance for the field |
| **Technical rigor** | High | Quantification, statistics, replication |
| **Novelty** | Moderate-High | New findings, not just confirmation |
### Review Process
1. **Extended review**: 3+ reviewers typical
2. **Revision cycles**: Multiple rounds common
3. **Comprehensive revision**: Major new experiments often requested
4. **Detailed assessment**: Figure-by-figure evaluation
### What Reviewers Expect
- **Multiple complementary approaches**: Same finding shown different ways
- **In vivo validation**: For cell biology claims
- **Rescue experiments**: For knockdown/knockout studies
- **Quantification**: Not just representative images
- **Complete figure panels**: All conditions, all controls
### Sample Reviewer Concerns and Responses
**Reviewer**: "The authors show that protein X is required for process Y using siRNA knockdown. However, a single RNAi reagent is used, and off-target effects cannot be excluded. Additional evidence is needed."
**Response**: "We agree that additional validation is important. In the revised manuscript, we now show: (1) two independent siRNAs against protein X produce identical phenotypes (new Fig. S3A-B); (2) CRISPR-Cas9 knockout cells recapitulate the phenotype (new Fig. 2D-E); and (3) expression of siRNA-resistant protein X rescues the phenotype (new Fig. 2F-G). These complementary approaches strongly support the conclusion that protein X is required for process Y."
---
## ML Conferences (NeurIPS, ICML, ICLR)
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Novelty** | Critical | New method, insight, or perspective |
| **Technical soundness** | Critical | Correct implementation, fair comparisons |
| **Significance** | High | Advances the field |
| **Experimental rigor** | High | Strong baselines, proper ablations |
| **Reproducibility** | Moderate-High | Can others replicate? |
| **Clarity** | Moderate | Well-written and organized |
### Review Process
1. **Area Chair assignment**: Grouped by topic
2. **3-4 reviewers**: With expertise in the area
3. **Author rebuttal**: Opportunity to respond
4. **Reviewer discussion**: After rebuttal
5. **AC recommendation**: Meta-review
### Scoring Dimensions
Typical NeurIPS/ICML scoring:
| Dimension | Score Range | What's Evaluated |
|-----------|-------------|------------------|
| **Soundness** | 1-4 | Technical correctness |
| **Contribution** | 1-4 | Significance of results |
| **Presentation** | 1-4 | Clarity and organization |
| **Overall** | 1-10 | Holistic assessment |
| **Confidence** | 1-5 | Reviewer expertise |
### What Gets a Paper Rejected
**Critical Issues**:
- Weak baselines or unfair comparisons
- Missing ablation studies
- Results not significantly better than SOTA
- Technical errors in method or analysis
- Overclaiming without evidence
**Moderate Issues**:
- Limited novelty over prior work
- Narrow evaluation (few datasets/tasks)
- Missing reproducibility details
- Poor presentation
- Limited analysis or insights
### Red Flags for ML Reviewers
❌ "We compare against methods from 2018" (outdated baselines)
❌ "Our method achieves 0.5% improvement" (marginal gain)
❌ "We evaluate on one dataset" (limited generalization)
❌ "Implementation details are in the supplementary" (core info missing)
❌ "We leave ablations for future work" (incomplete evaluation)
### Sample Reviewer Concerns and Responses
**Reviewer**: "The proposed method is only compared against Transformer and Performer. Recent works like FlashAttention and Longformer should be included."
**Response**: "Thank you for this suggestion. We have added comparisons to FlashAttention (Dao et al., 2022), Longformer (Beltagy et al., 2020), and BigBird (Zaheer et al., 2020). As shown in new Table 2, our method outperforms all baselines: FlashAttention (3.2% worse), Longformer (5.1% worse), and BigBird (4.8% worse). We also include a new analysis (Section 4.3) explaining why our approach is particularly effective for sequences > 16K tokens."
---
## HCI Conferences (CHI, CSCW)
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Contribution to HCI** | Critical | New design, insight, or method |
| **User-centered approach** | High | Focus on human needs |
| **Appropriate evaluation** | High | Matches claims and contribution |
| **Design rationale** | Moderate-High | Justified design decisions |
| **Implications** | Moderate | Guidance for future work |
### Contribution Types
CHI explicitly categorizes contributions:
| Type | What Reviewers Expect |
|------|----------------------|
| **Empirical** | Rigorous user study, clear findings |
| **Artifact** | Novel system/tool, evaluation of use |
| **Methodological** | New research method, validation |
| **Theoretical** | Conceptual framework, intellectual contribution |
| **Survey** | Comprehensive, well-organized coverage |
### What Gets a Paper Rejected
**Critical Issues**:
- Mismatch between claims and evaluation
- Insufficient participants for conclusions
- Missing ethical considerations (no IRB)
- Technology-focused without user insight
- Limited contribution to HCI community
**Moderate Issues**:
- Weak design rationale
- Limited generalizability
- Missing related work in HCI
- Unclear implications for practitioners
### Sample Reviewer Concerns and Responses
**Reviewer**: "The evaluation consists of a short-term lab study with 12 participants. It's unclear how this system would perform in real-world use over time."
**Response**: "We acknowledge this limitation, which we now discuss explicitly (Section 7.2). We have added a 2-week deployment study with 8 participants from our original cohort (new Section 6.3). This longitudinal data shows sustained engagement (mean usage: 4.2 times/day) and reveals additional insights about how use patterns evolve over time. However, we agree that larger and longer deployments would strengthen ecological validity."
---
## NLP Conferences (ACL, EMNLP)
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Task performance** | High | SOTA or competitive results |
| **Analysis quality** | High | Error analysis, insights |
| **Methodology** | High | Sound approach, fair comparisons |
| **Reproducibility** | High | Full details provided |
| **Novelty** | Moderate-High | New approach or insight |
### ACL Rolling Review (ARR)
Since 2022, ACL venues use a shared review system:
- Reviews transfer between venues
- Action editors manage papers
- Commitment to specific venue after review
### Responsible NLP Checklist
Reviewers check for:
- Limitations section (required)
- Risks and ethical considerations
- Compute/carbon footprint
- Bias analysis (when applicable)
- Data documentation
### Sample Reviewer Concerns and Responses
**Reviewer**: "The paper lacks analysis of failure cases. When and why does the proposed method fail?"
**Response**: "We have added Section 5.4 on error analysis. We manually examined 100 errors and categorized them into three types: (1) complex coreference chains (42%), (2) implicit references (31%), and (3) domain-specific knowledge requirements (27%). Figure 4 shows representative examples of each. This analysis reveals that our method particularly struggles with implicit references, which we discuss as a direction for future work."
---
## Data Mining (KDD, WWW)
### What Reviewers Look For
| Priority | Weight | Description |
|----------|--------|-------------|
| **Scalability** | High | Handles large datasets |
| **Practical impact** | High | Real-world applicability |
| **Experimental rigor** | High | Comprehensive evaluation |
| **Technical novelty** | Moderate-High | New method or application |
| **Reproducibility** | Moderate | Code/data availability |
### What Impresses KDD Reviewers
- Large-scale experiments (millions of samples)
- Industry deployment or A/B tests
- Efficiency comparisons (runtime, memory)
- Real datasets alongside benchmarks
- Complexity analysis (time and space)
### Sample Reviewer Concerns and Responses
**Reviewer**: "The experiments are limited to small datasets (< 100K samples). How does the method scale to industry-scale data?"
**Response**: "We have added experiments on two large-scale datasets: (1) ogbn-papers100M (111M nodes, 1.6B edges) and (2) a proprietary e-commerce graph (500M nodes, 4B edges) provided by [company]. Table 4 (new) shows our method scales near-linearly with data size, completing in 42 minutes on ogbn-papers where baselines run out of memory. Section 5.5 (new) provides detailed scalability analysis."
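
A claim such as "scales near-linearly" can be backed by a fitted scaling exponent. The sketch below estimates the slope of log(runtime) against log(dataset size) with ordinary least squares; the `scaling_exponent` helper and the measurements are hypothetical and are not results from any paper:

```python
import math


def scaling_exponent(sizes, runtimes):
    """Least-squares slope of log(runtime) against log(size).

    A slope near 1.0 indicates near-linear scaling; near 2.0, quadratic.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in runtimes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Hypothetical measurements: (number of nodes, seconds).
sizes = [1e5, 1e6, 1e7, 1e8]
runtimes = [2.0, 21.0, 230.0, 2500.0]

print(f"estimated scaling exponent: {scaling_exponent(sizes, runtimes):.2f}")
```

Quoting the fitted exponent in the rebuttal gives reviewers a number to check against the scalability table.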
---
## General Rebuttal Strategies
### Do's
- **Address every point**: Even minor issues
- **Provide evidence**: New experiments, data, or citations
- **Be specific**: Reference exact sections, lines, figures
- **Acknowledge valid criticisms**: Show you understand the concern
- **Be concise**: Reviewers read many rebuttals
- **Stay professional**: Even for unfair reviews
- **Prioritize critical issues**: Address major concerns first
### Don'ts
- **Be defensive**: Accept valid criticisms
- **Argue without evidence**: Back up claims
- **Ignore points**: Even ones you disagree with
- **Be vague**: Be specific about changes
- **Attack reviewers**: Maintain professionalism
- **Promise future work**: Do the work now if possible
### Rebuttal Template
```
We thank the reviewers for their constructive feedback. We address
the main concerns below:
**R1/R2 Concern: [Shared concern from multiple reviewers]**
[Your response with specific actions taken and references to where
changes are made in the revised manuscript]
**R1-1: [Specific point]**
[Response with evidence]
**R2-3: [Specific point]**
[Response with evidence]
We have also made the following additional improvements:
• [Improvement 1]
• [Improvement 2]
```
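
When a rebuttal covers many points, it can help to keep concerns and responses as structured data and generate the draft from them. A minimal sketch that follows the template's order (shared concerns, individual points, extra improvements); the `build_rebuttal` function and its argument layout are hypothetical:

```python
def build_rebuttal(shared, per_reviewer, improvements):
    """Assemble a rebuttal draft: shared concerns, individual points, extras."""
    lines = [
        "We thank the reviewers for their constructive feedback. We address",
        "the main concerns below:",
        "",
    ]
    # Shared concerns come first so every reviewer sees them immediately.
    for reviewers, concern, response in shared:
        lines += [f"**{'/'.join(reviewers)} Concern: {concern}**", response, ""]
    for point_id, point, response in per_reviewer:
        lines += [f"**{point_id}: {point}**", response, ""]
    if improvements:
        lines.append("We have also made the following additional improvements:")
        lines += [f"• {item}" for item in improvements]
    return "\n".join(lines)


# Placeholder content; replace with the actual concerns and responses.
draft = build_rebuttal(
    shared=[(["R1", "R2"], "[Shared concern]", "[Response with actions taken]")],
    per_reviewer=[("R1-1", "[Specific point]", "[Response with evidence]")],
    improvements=["[Improvement 1]", "[Improvement 2]"],
)
print(draft)
```

Keeping the points in a list also makes it easy to confirm that no reviewer comment was skipped.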
---
## Pre-Submission Self-Review
Before submitting, review your paper as a reviewer would:
### All Venues
- [ ] Are claims supported by evidence?
- [ ] Are baselines appropriate and recent?
- [ ] Is the contribution clearly stated?
- [ ] Are limitations acknowledged?
- [ ] Is reproducibility information complete?
### High-Impact Journals
- [ ] Is significance clear to a non-specialist?
- [ ] Are figures accessible and clear?
- [ ] Are controls adequate for claims?
### Medical Journals
- [ ] Is CONSORT/STROBE compliance complete?
- [ ] Are absolute numbers reported?
- [ ] Is clinical relevance clear?
### ML Conferences
- [ ] Are ablations comprehensive?
- [ ] Are comparisons fair?
- [ ] Is reproducibility information complete?
### HCI Conferences
- [ ] Is the user-centered perspective clear?
- [ ] Is the evaluation appropriate for claims?
- [ ] Are design implications actionable?
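
If the checklist is kept as a Markdown file alongside the manuscript, the open items can be listed automatically. A minimal sketch, assuming items use the `- [ ]` / `- [x]` syntax under `###` headings as above; the `unchecked_items` helper is hypothetical:

```python
import re

CHECKBOX = re.compile(r"^- \[( |x|X)\] (.+)$")


def unchecked_items(markdown):
    """Map each '###' heading to the checklist items under it that are still open."""
    pending = {}
    section = None
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("### "):
            section = line[4:]
            continue
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            pending.setdefault(section, []).append(match.group(2))
    return pending


sample = """### All Venues
- [x] Are claims supported by evidence?
- [ ] Are limitations acknowledged?
### ML Conferences
- [ ] Are ablations comprehensive?
"""

for heading, items in unchecked_items(sample).items():
    print(f"{heading}: {len(items)} open item(s)")
```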
---
## See Also
- `venue_writing_styles.md` - Writing style by venue
- `nature_science_style.md` - Nature/Science detailed guide
- `ml_conference_style.md` - ML conference detailed guide
- `medical_journal_styles.md` - Medical journal detailed guide

@@ -0,0 +1,321 @@
# Venue Writing Styles: Master Guide
This guide provides an overview of how writing style varies across publication venues. Understanding these differences is essential for crafting papers that read like authentic publications at each venue.
**Last Updated**: 2024
---
## The Style Spectrum
Scientific writing style exists on a spectrum from **broadly accessible** to **deeply technical**:
```
Accessible ◄─────────────────────────────────────────────► Technical
Nature/Science  PNAS    Cell      IEEE Trans   NeurIPS       Specialized
│               │       │         │            │             Journals
│               │       │         │            │             │
▼               ▼       ▼         ▼            ▼             ▼
General         Mixed   Deep      Field        Dense ML      Expert
audience        depth   biology   experts      researchers   only
```
## Quick Style Reference
| Venue Type | Audience | Tone | Voice | Abstract Style |
|------------|----------|------|-------|----------------|
| **Nature/Science** | Educated non-specialists | Accessible, engaging | Active, first-person OK | Flowing paragraphs, no jargon |
| **Cell Press** | Biologists | Mechanistic, precise | Mixed | Summary + eTOC blurb + Highlights |
| **Medical (NEJM/Lancet)** | Clinicians | Evidence-focused | Formal | Structured (Background/Methods/Results/Conclusions) |
| **PLOS/BMC** | Researchers | Standard academic | Neutral | IMRaD structured or flowing |
| **IEEE/ACM** | Engineers/CS | Technical | Passive common | Concise, technical |
| **ML Conferences** | ML researchers | Dense technical | Mixed | Numbers upfront, key results |
| **NLP Conferences** | NLP researchers | Technical | Varied | Task-focused, benchmarks |
---
## High-Impact Journals (Nature, Science, Cell)
### Core Philosophy
High-impact multidisciplinary journals prioritize **broad significance** over technical depth. The question is not "Is this technically sound?" but "Why should a scientist outside this field care?"
### Key Writing Principles
1. **Start with the big picture**: Open with why this matters to science/society
2. **Minimize jargon**: Define specialized terms; prefer common words
3. **Tell a story**: Results should flow as a narrative, not a data dump
4. **Emphasize implications**: What does this change about our understanding?
5. **Accessible figures**: Schematics and models over raw data plots
### Structural Differences
**Nature/Science** vs. **Specialized Journals**:
| Element | Nature/Science | Specialized Journal |
|---------|---------------|---------------------|
| Introduction | 3-4 paragraphs, broad → specific | Extensive literature review |
| Methods | Often in supplement or brief | Full detail in main text |
| Results | Organized by finding/story | Organized by experiment |
| Discussion | Implications first, then caveats | Detailed comparison to literature |
| Figures | Conceptual schematics valued | Raw data emphasized |
### Example: Same Finding, Different Styles
**Nature style**:
> "We discovered that protein X acts as a molecular switch controlling cell fate decisions during development, resolving a longstanding question about how stem cells choose their destiny."
**Specialized journal style**:
> "Using CRISPR-Cas9 knockout in murine embryonic stem cells (mESCs), we demonstrate that protein X (encoded by gene ABC1) regulates the expression of pluripotency factors Oct4, Sox2, and Nanog through direct promoter binding, as confirmed by ChIP-seq analysis (n=3 biological replicates, FDR < 0.05)."
---
## Medical Journals (NEJM, Lancet, JAMA, BMJ)
### Core Philosophy
Medical journals prioritize **clinical relevance** and **patient outcomes**. Every finding must connect to practice.
### Key Writing Principles
1. **Patient-centered language**: "Patients receiving treatment X" not "Treatment X subjects"
2. **Evidence strength**: Careful hedging based on study design
3. **Clinical actionability**: "So what?" for practicing physicians
4. **Absolute numbers**: Report absolute risk reduction, not just relative
5. **Structured abstracts**: Required with labeled sections
### Structured Abstract Format (Medical)
```
Background: [1-2 sentences on problem and rationale]
Methods: [Study design, setting, participants, intervention, outcomes, analysis]
Results: [Primary outcome with confidence intervals, secondary outcomes, adverse events]
Conclusions: [Clinical implications, limitations acknowledged]
```
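
A quick automated check can catch a missing or misordered label before submission. The sketch below assumes the four labels shown in the format above, each starting a line; individual journals may require different labels, and the `check_structured_abstract` helper is hypothetical:

```python
import re

REQUIRED_LABELS = ("Background", "Methods", "Results", "Conclusions")


def check_structured_abstract(abstract):
    """Return a list of problems; an empty list means all labels are present and in order."""
    problems = []
    positions = []
    for label in REQUIRED_LABELS:
        match = re.search(rf"^{label}:", abstract, flags=re.MULTILINE)
        if match is None:
            problems.append(f"missing section: {label}")
        else:
            positions.append(match.start())
    # Labels that were found must appear in the required order.
    if positions != sorted(positions):
        problems.append("sections are out of order")
    return problems


draft = """Background: [problem and rationale]
Methods: [design, setting, participants]
Conclusions: [clinical implications]"""

print(check_structured_abstract(draft))
```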
### Evidence Language Conventions
| Study Design | Appropriate Language |
|-------------|---------------------|
| RCT | "Treatment X reduced mortality by..." |
| Observational | "Treatment X was associated with reduced mortality..." |
| Case series | "These findings suggest that treatment X may..." |
| Case report | "This case illustrates that treatment X can..." |
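
The convention in this table can be turned into a rough lint for draft sentences. This is a heuristic sketch only: the verb and hedge lists are short and hypothetical, the `flag_causal_language` helper is not part of any existing tool, and a match is a prompt for human review, not a verdict:

```python
import re

# Hypothetical, deliberately short word lists; extend them for real use.
CAUSAL = re.compile(
    r"\b(caused|causes|reduced|reduces|prevented|prevents|led to)\b",
    re.IGNORECASE,
)
HEDGE = re.compile(
    r"\b(associated with|may|might|suggests?|linked to)\b",
    re.IGNORECASE,
)
DESIGNS_SUPPORTING_CAUSAL_CLAIMS = {"rct"}


def flag_causal_language(sentences, study_design):
    """Return sentences that use unhedged causal verbs for a non-randomized design."""
    if study_design.lower() in DESIGNS_SUPPORTING_CAUSAL_CLAIMS:
        return []
    return [s for s in sentences if CAUSAL.search(s) and not HEDGE.search(s)]


sentences = [
    "Treatment X reduced mortality by 20%.",
    "Treatment X was associated with reduced mortality.",
]
for sentence in flag_causal_language(sentences, "observational"):
    print(f"check wording: {sentence}")
```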
---
## ML/AI Conferences (NeurIPS, ICML, ICLR, CVPR)
### Core Philosophy
ML conferences value **novelty**, **rigorous experiments**, and **reproducibility**. The focus is on advancing the state of the art with empirical evidence.
### Key Writing Principles
1. **Contribution bullets**: Numbered list in introduction stating exactly what's new
2. **Baselines are critical**: Compare against strong, recent baselines
3. **Ablations expected**: Show what parts of your method matter
4. **Reproducibility**: Seeds, hyperparameters, compute requirements
5. **Limitations section**: Honest acknowledgment (increasingly required)
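
For the reproducibility item above, one common habit is to fix the seed and save the exact configuration with each run. A minimal standard-library sketch; the `start_run` helper, the hyperparameter names, and the output path are hypothetical:

```python
import json
import os
import random
import tempfile


def start_run(config, path):
    """Seed Python's RNG from the config and save the config for later reporting."""
    random.seed(config["seed"])
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2, sort_keys=True)


# Hypothetical hyperparameters for illustration.
config = {"seed": 0, "learning_rate": 3e-4, "batch_size": 64, "epochs": 10}
config_path = os.path.join(tempfile.gettempdir(), "run_config.json")
start_run(config, config_path)
print(f"configuration saved to {config_path}")
```

Libraries such as NumPy and PyTorch keep their own random generators, so a real training script would need to seed those separately.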
### Introduction Structure (ML Conferences)
```
[Paragraph 1: Problem motivation - why this matters]
[Paragraph 2: Limitations of existing approaches]
[Paragraph 3: Our approach at high level]
Our contributions are as follows:
• We propose [method name], a novel approach to [problem] that [key innovation].
• We provide theoretical analysis showing [guarantees/properties].
• We demonstrate state-of-the-art results on [benchmarks], improving over [baseline] by [X%].
• We release code and models at [anonymous URL for review].
```
### Abstract Style (ML Conferences)
ML abstracts are **dense and numbers-focused**:
> "We present TransformerX, a novel architecture for long-range sequence modeling that achieves O(n log n) complexity while maintaining expressivity. On the Long Range Arena benchmark, TransformerX achieves 86.2% average accuracy, outperforming Transformer (65.4%) and Performer (78.1%). On language modeling, TransformerX matches GPT-2 perplexity (18.4) using 40% fewer parameters. We provide theoretical analysis showing TransformerX can approximate any continuous sequence-to-sequence function."
### Experiment Section Expectations
1. **Datasets**: Standard benchmarks, dataset statistics
2. **Baselines**: Recent strong methods, fair comparisons
3. **Main results table**: Clear, comprehensive
4. **Ablation studies**: Remove/modify components systematically
5. **Analysis**: Error analysis, qualitative examples, failure cases
6. **Computational cost**: Training time, inference speed, memory
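
The systematic removal described under ablation studies can be enumerated up front so that no variant is forgotten. A minimal leave-one-out sketch; the component names and the `ablation_configs` helper are hypothetical:

```python
def ablation_configs(components):
    """Return (name, enabled_components) for the full method and each leave-one-out variant."""
    variants = [("full", list(components))]
    for removed in components:
        kept = [c for c in components if c != removed]
        variants.append((f"without_{removed}", kept))
    return variants


# Hypothetical components of a method.
components = ["attention", "augmentation", "pretraining"]

for name, enabled in ablation_configs(components):
    print(f"{name}: {', '.join(enabled)}")
```

Each variant then becomes one row of the ablation table, with the full method as the reference row.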
---
## CS Conferences (ACL, EMNLP, CHI, SIGKDD)
### ACL/EMNLP (NLP)
- **Task-focused**: Clear problem definition
- **Benchmark-heavy**: Standard datasets (GLUE, SQuAD, etc.)
- **Error analysis valued**: Where does it fail?
- **Human evaluation**: Often expected alongside automatic metrics
- **Ethical considerations**: Bias, fairness, environmental cost
### CHI (Human-Computer Interaction)
- **User-centered**: Focus on humans, not just technology
- **Study design details**: Participant recruitment, IRB approval
- **Qualitative accepted**: Interview studies, ethnography valid
- **Design implications**: Concrete takeaways for practitioners
- **Accessibility**: Consider diverse user populations
### SIGKDD (Data Mining)
- **Scalability emphasis**: Handle large data
- **Real-world applications**: Industry datasets valued
- **Efficiency metrics**: Time and space complexity
- **Novelty in methods or applications**: Both paths valid
---
## Adapting Between Venue Types
### Journal → ML Conference
When converting a journal paper to conference format:
1. **Condense introduction**: Remove extensive background
2. **Add contribution list**: Explicitly enumerate contributions
3. **Restructure results**: Organize as experiments, add ablations
4. **Remove separate discussion**: Integrate interpretation briefly
5. **Add reproducibility section**: Seeds, hyperparameters, code
### ML Conference → Journal
When expanding a conference paper into a journal article:
1. **Expand related work**: Comprehensive literature review
2. **Detailed methods**: Full algorithmic description
3. **More experiments**: Additional datasets, analyses
4. **Extended discussion**: Implications, limitations, future work
5. **Appendix → main text**: Move important details up
### Specialized → High-Impact Journal
When targeting Nature/Science/Cell from a specialized venue:
1. **Lead with significance**: Why does this matter broadly?
2. **Reduce jargon by roughly 80%**: Replace technical terms with plain language, and define the few that must remain
3. **Add conceptual figures**: Schematics, models, not just data
4. **Story-driven results**: Narrative flow, not experiment-by-experiment
5. **Broaden discussion**: Implications beyond the subfield
---
## Voice and Tone Guidelines
### Active vs. Passive Voice
| Venue | Preference | Example |
|-------|-----------|---------|
| Nature/Science | Active encouraged | "We discovered that..." |
| Cell | Mixed | "Our results demonstrate..." |
| Medical | Passive common | "Patients were randomized to..." |
| IEEE | Passive traditional | "The algorithm was implemented..." |
| ML Conferences | Active preferred | "We propose a method that..." |
### First Person Usage
| Venue | First Person | Example |
|-------|-------------|---------|
| Nature/Science | Yes (we) | "We show that..." |
| Cell | Yes (we) | "We found that..." |
| Medical | Sometimes | "We conducted a trial..." |
| IEEE | Less common | Prefer "This paper presents..." |
| ML Conferences | Yes (we) | "We introduce..." |
### Hedging and Certainty
| Claim Strength | Language |
|---------------|----------|
| Strong | "X causes Y" (only with causal evidence) |
| Moderate | "X is associated with Y" / "X is linked to Y" |
| Tentative | "X may contribute to Y" / "X suggests that..." |
| Speculative | "It is possible that X..." / "One interpretation is..." |
---
## Common Style Errors by Venue
### Nature/Science Submissions
❌ Too technical: "We used CRISPR-Cas9 with sgRNAs targeting exon 3..."
✅ Accessible: "Using gene-editing technology, we disabled the gene..."
❌ Dry opening: "Protein X is involved in cellular signaling..."
✅ Engaging opening: "How do cells decide their fate? We discovered that..."
### ML Conference Submissions
❌ Vague contributions: "We present a new method for X"
✅ Specific contributions: "We propose Method Y that achieves Z% improvement on benchmark W"
❌ Missing ablations: Only showing full method results
✅ Complete: Table showing contribution of each component
### Medical Journal Submissions
❌ Missing absolute numbers: "50% reduction in risk"
✅ Complete: "50% relative reduction (ARR 2.5%, NNT 40)"
❌ Causal language for observational data: "Treatment caused improvement"
✅ Appropriate: "Treatment was associated with improvement"
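
The absolute figures in the "Complete" example above follow directly from the event rates in each arm. A minimal sketch of the arithmetic, assuming hypothetical event rates of 5.0% (control) and 2.5% (treatment) chosen to match that example; the `risk_summary` helper is hypothetical:

```python
def risk_summary(control_risk, treatment_risk):
    """Compute absolute risk reduction, relative risk reduction, and number needed to treat."""
    arr = control_risk - treatment_risk  # absolute risk reduction
    rrr = arr / control_risk             # relative risk reduction
    nnt = 1 / arr                        # number needed to treat
    return {"arr": arr, "rrr": rrr, "nnt": nnt}


# Hypothetical event rates: 5.0% in the control arm, 2.5% in the treatment arm.
summary = risk_summary(0.05, 0.025)
print(
    f"RRR {summary['rrr']:.0%}, "
    f"ARR {summary['arr']:.1%}, "
    f"NNT {summary['nnt']:.0f}"
)
```

The same 50% relative reduction would correspond to a much larger NNT if the baseline risk were lower, which is why both figures are reported together.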
---
## Quick Checklist Before Submission
### All Venues
- [ ] Abstract matches venue style (flowing vs. structured)
- [ ] Voice/tone appropriate for audience
- [ ] Jargon level appropriate
- [ ] Figures match venue expectations
- [ ] Citation style correct
### High-Impact Journals (Nature/Science/Cell)
- [ ] Broad significance clear in first paragraph
- [ ] Non-specialist can understand abstract
- [ ] Story-driven results narrative
- [ ] Conceptual figures included
- [ ] Implications emphasized
### ML Conferences
- [ ] Contribution list in introduction
- [ ] Strong baselines included
- [ ] Ablation studies present
- [ ] Reproducibility information complete
- [ ] Limitations acknowledged
### Medical Journals
- [ ] Structured abstract (if required)
- [ ] Patient-centered language
- [ ] Evidence strength appropriate
- [ ] Absolute numbers reported
- [ ] CONSORT/STROBE compliance
---
## See Also
- `nature_science_style.md` - Detailed Nature/Science writing guide
- `cell_press_style.md` - Cell family journal conventions
- `medical_journal_styles.md` - NEJM, Lancet, JAMA, BMJ guide
- `ml_conference_style.md` - NeurIPS, ICML, ICLR, CVPR conventions
- `cs_conference_style.md` - ACL, CHI, SIGKDD guide
- `reviewer_expectations.md` - What reviewers look for by venue