
CS Conference Writing Style Guide

Comprehensive writing guide for ACL, EMNLP, NAACL (NLP), CHI, CSCW (HCI), SIGKDD, WWW, SIGIR (data mining/IR), and other major CS conferences.

Last Updated: 2024


Overview

CS conferences span diverse subfields with distinct writing cultures. This guide covers NLP, HCI, and data mining/IR venues, each with unique expectations and evaluation criteria.


Part 1: NLP Conferences (ACL, EMNLP, NAACL)

NLP Writing Philosophy

"Strong empirical results on standard benchmarks with insightful analysis."

NLP papers balance empirical rigor with linguistic insight. Human evaluation is increasingly important alongside automatic metrics.

Audience and Tone

Target Reader

  • NLP researchers and computational linguists
  • Familiar with transformer architectures, standard benchmarks
  • Expect reproducible results and error analysis

Tone Characteristics

| Characteristic     | Description                          |
|--------------------|--------------------------------------|
| Task-focused       | Clear problem definition             |
| Benchmark-oriented | Standard datasets emphasized         |
| Analysis-rich      | Error analysis, qualitative examples |
| Reproducible       | Full implementation details          |

Abstract (NLP Style)

Structure

  • Task/problem (1 sentence)
  • Limitation of prior work (1 sentence)
  • Your approach (1-2 sentences)
  • Results on benchmarks (2 sentences)
  • Analysis finding (optional, 1 sentence)

Example Abstract

Coreference resolution remains challenging for pronouns with distant or 
ambiguous antecedents. Prior neural approaches struggle with these 
difficult cases due to limited context modeling. We introduce 
LongContext-Coref, a retrieval-augmented coreference model that 
dynamically retrieves relevant context from document history. On the 
OntoNotes 5.0 benchmark, LongContext-Coref achieves 83.4 F1, improving 
over the previous state-of-the-art by 1.2 points. On the challenging 
WinoBias dataset, we reduce gender bias by 34% while maintaining 
accuracy. Qualitative analysis reveals that our model successfully 
resolves pronouns requiring world knowledge, a known weakness of 
prior approaches.

NLP Paper Structure

├── Introduction
│   ├── Task motivation
│   ├── Prior work limitations
│   ├── Your contribution
│   └── Contribution bullets
├── Related Work
├── Method
│   ├── Problem formulation
│   ├── Model architecture
│   └── Training procedure
├── Experiments
│   ├── Datasets (with statistics)
│   ├── Baselines
│   ├── Main results
│   ├── Analysis
│   │   ├── Error analysis
│   │   ├── Ablation study
│   │   └── Qualitative examples
│   └── Human evaluation (if applicable)
├── Discussion / Limitations
└── Conclusion

NLP-Specific Requirements

Datasets

  • Use standard benchmarks: GLUE, SQuAD, CoNLL, OntoNotes
  • Report dataset statistics: train/dev/test sizes
  • Data preprocessing: Document all steps

Evaluation Metrics

  • Task-appropriate metrics: F1, BLEU, ROUGE, accuracy
  • Statistical significance: Paired bootstrap, p-values (see the sketch below)
  • Multiple runs: Report mean ± std across seeds
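
For example, the paired bootstrap named above can be sketched in a few lines. The snippet below is illustrative only: it assumes parallel per-example scores (e.g., 0/1 accuracy) for a baseline and your system on the same test set, and `paired_bootstrap_p` is a hypothetical helper name, not a library function.

```python
# Minimal sketch of a paired bootstrap test (Koehn-style): resample the
# test set with replacement and count how often system B does no better.
import random

def paired_bootstrap_p(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Approximate p-value: fraction of resampled test sets on which
    system B's mean score does not exceed system A's."""
    rng = random.Random(seed)
    n = len(scores_a)
    no_better = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        delta = sum(scores_b[i] - scores_a[i] for i in idx) / n
        if delta <= 0:
            no_better += 1
    return no_better / n_resamples

# Hypothetical 0/1 per-example accuracy for baseline vs. our system:
baseline = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
ours     = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
print(f"p = {paired_bootstrap_p(baseline, ours):.3f}")
```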

Human Evaluation

Increasingly expected for generation tasks:

  • Annotator details: Number, qualifications, agreement
  • Evaluation protocol: Guidelines, interface, payment
  • Inter-annotator agreement: Cohen's κ or Krippendorff's α
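
Cohen's κ for two annotators is simple enough to compute directly. The sketch below uses hypothetical 1-5 ratings and a hand-rolled implementation; scikit-learn's `cohen_kappa_score` computes the same quantity.

```python
# Minimal sketch: Cohen's kappa for two annotators on categorical ratings.
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label distribution.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical 1-5 fluency ratings on ten samples:
a = [4, 4, 5, 3, 4, 2, 5, 4, 3, 4]
b = [4, 3, 5, 3, 4, 2, 5, 4, 4, 4]
print(f"kappa = {cohen_kappa(a, b):.2f}")
```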

Example Human Evaluation Table

Table 3: Human Evaluation Results (100 samples, 3 annotators)
─────────────────────────────────────────────────────────────
Method        | Fluency | Coherence | Factuality | Overall
─────────────────────────────────────────────────────────────
Baseline      |   3.8   |    3.2    |    3.5     |   3.5
GPT-3.5       |   4.2   |    4.0    |    3.7     |   4.0
Our Method    |   4.4   |    4.3    |    4.1     |   4.3
─────────────────────────────────────────────────────────────
Inter-annotator κ = 0.72. Scale: 1-5 (higher is better).

ACL-Specific Notes

  • ARR (ACL Rolling Review): Shared review system across ACL venues
  • Responsible NLP checklist: Ethics, limitations, risks
  • Long (8 pages) vs. Short (4 pages): Different expectations
  • Findings papers: Companion acceptance track for solid work below the main-conference bar

Part 2: HCI Conferences (CHI, CSCW, UIST)

HCI Writing Philosophy

"Technology in service of humans—understand users first, then design and evaluate."

HCI papers are fundamentally user-centered. Technology novelty alone is insufficient; understanding human needs and demonstrating user benefit are essential.

Audience and Tone

Target Reader

  • HCI researchers and practitioners
  • UX designers and product developers
  • Interdisciplinary (CS, psychology, design, social science)

Tone Characteristics

| Characteristic  | Description                     |
|-----------------|---------------------------------|
| User-centered   | Focus on people, not technology |
| Design-informed | Grounded in design thinking     |
| Empirical       | User studies provide evidence   |
| Reflective      | Consider broader implications   |

HCI Abstract

Focus on Users and Impact

Video calling has become essential for remote collaboration, yet 
current interfaces poorly support the peripheral awareness that makes 
in-person work effective. Through formative interviews with 24 remote 
workers, we identified three key challenges: difficulty gauging 
colleague availability, lack of ambient presence cues, and interruption 
anxiety. We designed AmbientOffice, a peripheral display system that 
conveys teammate presence through subtle ambient visualizations. In a 
two-week deployment study with 18 participants across three distributed 
teams, AmbientOffice increased spontaneous collaboration by 40% and 
reduced perceived isolation (p<0.01). Participants valued the system's 
non-intrusive nature and reported feeling more connected to remote 
colleagues. We discuss implications for designing ambient awareness 
systems and the tension between visibility and privacy in remote work.

HCI Paper Structure

Research Through Design / Systems Papers

├── Introduction
│   ├── Problem in human terms
│   ├── Why technology can help
│   └── Contribution summary
├── Related Work
│   ├── Domain background
│   ├── Prior systems
│   └── Theoretical frameworks
├── Formative Work (often)
│   ├── Interviews / observations
│   └── Design requirements
├── System Design
│   ├── Design rationale
│   ├── Implementation
│   └── Interface walkthrough
├── Evaluation
│   ├── Study design
│   ├── Participants
│   ├── Procedure
│   ├── Findings (quant + qual)
│   └── Limitations
├── Discussion
│   ├── Design implications
│   ├── Generalizability
│   └── Future work
└── Conclusion

Qualitative / Interview Studies

├── Introduction
├── Related Work
├── Methods
│   ├── Participants
│   ├── Procedure
│   ├── Data collection
│   └── Analysis method (thematic, grounded theory, etc.)
├── Findings
│   ├── Theme 1 (with quotes)
│   ├── Theme 2 (with quotes)
│   └── Theme 3 (with quotes)
├── Discussion
│   ├── Implications for design
│   ├── Implications for research
│   └── Limitations
└── Conclusion

HCI-Specific Requirements

Participant Reporting

  • Demographics: Age, gender, relevant experience
  • Recruitment: How and where recruited
  • Compensation: Payment amount and type
  • IRB approval: Ethics board statement

Quotes in Findings

Use direct quotes to ground findings:

Participants valued the ambient nature of the display. As P7 described: 
"It's like having a window to my teammate's office. I don't need to 
actively check it, but I know they're there." This passive awareness 
reduced the barrier to initiating contact.

Design Implications Section

Translate findings into actionable guidance:

**Implication 1: Support peripheral awareness without demanding attention.**
Ambient displays should be visible in peripheral vision but not require 
active monitoring. Designers should consider calm technology principles.

**Implication 2: Balance visibility with privacy.**
Users want to share presence but fear surveillance. Systems should 
provide granular controls and make visibility mutual.

CHI-Specific Notes

  • Contribution types: Empirical, artifact, methodological, theoretical
  • ACM format: acmart document class (single-column manuscript for review; sigconf for camera-ready)
  • Accessibility: Alt text, inclusive language expected
  • Contribution statement: Required per-author contributions

Part 3: Data Mining & IR (SIGKDD, WWW, SIGIR)

Data Mining Writing Philosophy

"Scalable methods for real-world data with demonstrated practical impact."

Data mining papers emphasize scalability, real-world applicability, and solid experimental methodology.

Audience and Tone

Target Reader

  • Data scientists and ML engineers
  • Industry researchers
  • Applied ML practitioners

Tone Characteristics

| Characteristic | Description              |
|----------------|--------------------------|
| Scalable       | Handle large datasets    |
| Practical      | Real-world applications  |
| Reproducible   | Datasets and code shared |
| Industrial     | Industry datasets valued |

KDD Abstract

Emphasize Scale and Application

Fraud detection in e-commerce requires processing millions of 
transactions in real-time while adapting to evolving attack patterns. 
We present FraudShield, a graph neural network framework for real-time 
fraud detection that scales to billion-edge transaction graphs. Unlike 
prior methods that require full graph access, FraudShield uses 
incremental updates with O(1) inference cost per transaction. On a 
proprietary dataset of 2.3 billion transactions from a major e-commerce 
platform, FraudShield achieves 94.2% precision at 80% recall, 
outperforming production baselines by 12%. The system has been deployed 
at [Company], processing 50K transactions per second and preventing 
an estimated $400M in annual fraud losses. We release an anonymized 
benchmark dataset and code.

KDD Paper Structure

├── Introduction
│   ├── Problem and impact
│   ├── Technical challenges
│   ├── Your approach
│   └── Contributions
├── Related Work
├── Preliminaries
│   ├── Problem definition
│   └── Notation
├── Method
│   ├── Overview
│   ├── Technical components
│   └── Complexity analysis
├── Experiments
│   ├── Datasets (with scale statistics)
│   ├── Baselines
│   ├── Main results
│   ├── Scalability experiments
│   ├── Ablation study
│   └── Case study / deployment
└── Conclusion

KDD-Specific Requirements

Scalability

  • Dataset sizes: Report number of nodes, edges, samples
  • Runtime analysis: Wall-clock time comparisons
  • Complexity: Time and space complexity stated
  • Scaling experiments: Show performance vs. data size
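
A scaling experiment can be as simple as timing the method on growing prefixes of the data. The harness below is a sketch: `time_at_scale` is a hypothetical helper, and `sorted` stands in for whatever fit-and-score function you are actually benchmarking.

```python
# Minimal sketch: wall-clock runtime vs. input size for a scalability table.
import time

def time_at_scale(fit_fn, dataset, sizes=(10_000, 100_000, 1_000_000)):
    """Time fit_fn on growing prefixes of dataset and print one row per size."""
    for n in sizes:
        start = time.perf_counter()
        fit_fn(dataset[:n])
        print(f"n={n:>9,}  runtime={time.perf_counter() - start:7.2f}s")

# Hypothetical usage with a stand-in workload (reverse-sorted integers):
time_at_scale(sorted, list(range(1_000_000, 0, -1)))
```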

Industrial Deployment

  • Case studies: Real-world deployment stories
  • A/B tests: Online evaluation results (if applicable)
  • Production metrics: Business impact (if shareable)

Example Scalability Table

Table 4: Scalability Comparison (runtime in seconds)
──────────────────────────────────────────────────────
Dataset     | Nodes  | Edges  | GCN   | GraphSAGE | Ours
──────────────────────────────────────────────────────
Cora        |  2.7K  |  5.4K  |  0.3  |    0.2    |  0.1
Citeseer    |  3.3K  |  4.7K  |  0.4  |    0.3    |  0.1
PubMed      | 19.7K  | 44.3K  |  1.2  |    0.8    |  0.3
ogbn-arxiv  | 169K   | 1.17M  |  8.4  |    4.2    |  1.6
ogbn-papers | 111M   | 1.6B   |  OOM  |   OOM     | 42.3
──────────────────────────────────────────────────────

Part 4: Common Elements Across CS Venues

Writing Quality

Clarity

  • One idea per sentence
  • Define terms before use
  • Use consistent notation

Precision

  • Exact numbers: "23.4%" not "about 20%"
  • Clear claims: Avoid hedging unless necessary
  • Specific comparisons: Name the baseline

Contribution Bullets

Used across all CS venues:

Our contributions are:
• We identify [problem/insight]
• We propose [method name] that [key innovation]
• We demonstrate [results] on [benchmarks]
• We release [code/data] at [URL]

Reproducibility Standards

All CS venues increasingly expect:

  • Code availability: GitHub link (anonymous for review)
  • Data availability: Public datasets or release plans
  • Full hyperparameters: Training details complete
  • Random seeds: Exact values for reproduction
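
A common pattern is a single seed-fixing helper called at the start of every run. The sketch below assumes a PyTorch-based setup (an assumption, not a requirement of any venue) and should be adapted to your framework.

```python
# Minimal sketch: fix all common seed sources at the start of a run.
# Assumes PyTorch; drop or swap the torch lines for other frameworks.
import os
import random

import numpy as np
import torch

def set_seed(seed: int = 42):
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade a little speed for deterministic cuDNN kernel selection.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Report the exact seed values used alongside the mean ± std results so readers can reproduce individual runs.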

Ethics and Broader Impact

NLP (ACL/EMNLP)

  • Limitations section: Required
  • Responsible NLP checklist: Ethical considerations
  • Bias analysis: For models affecting people

HCI (CHI)

  • IRB/Ethics approval: Required for human subjects
  • Informed consent: Procedure described
  • Privacy considerations: Data handling

KDD/WWW

  • Societal impact: Consider misuse potential
  • Privacy preservation: For sensitive data
  • Fairness analysis: When applicable

Venue Comparison Table

| Aspect         | ACL/EMNLP          | CHI          | KDD/WWW          | SIGIR     |
|----------------|--------------------|--------------|------------------|-----------|
| Focus          | NLP tasks          | User studies | Scalable ML      | IR/search |
| Evaluation     | Benchmarks + human | User studies | Large-scale exp. | Datasets  |
| Theory weight  | Moderate           | Low          | Moderate         | Moderate  |
| Industry value | High               | Medium       | Very high        | High      |
| Page limit     | 8 long / 4 short   | 10 + refs    | 9 + refs         | 10 + refs |
| Review style   | ARR                | Direct       | Direct           | Direct    |

Pre-Submission Checklist

All CS Venues

  • Clear contribution statement
  • Strong baselines
  • Reproducibility information complete
  • Correct venue template
  • Anonymized (if double-blind)

NLP-Specific

  • Standard benchmark results
  • Error analysis included
  • Human evaluation (for generation)
  • Responsible NLP checklist

HCI-Specific

  • IRB approval stated
  • Participant demographics
  • Direct quotes in findings
  • Design implications

Data Mining-Specific

  • Scalability experiments
  • Dataset size statistics
  • Runtime comparisons
  • Complexity analysis

See Also

  • venue_writing_styles.md - Master style overview
  • ml_conference_style.md - NeurIPS/ICML style guide
  • conferences_formatting.md - Technical formatting requirements
  • reviewer_expectations.md - What CS reviewers seek