# CS Conference Writing Style Guide

Comprehensive writing guide for ACL, EMNLP, NAACL (NLP), CHI, CSCW (HCI), SIGKDD, WWW, SIGIR (data mining/IR), and other major CS conferences.

**Last Updated**: 2024

---

## Overview

CS conferences span diverse subfields with distinct writing cultures. This guide covers NLP, HCI, and data mining/IR venues, each with its own expectations and evaluation criteria.

---

# Part 1: NLP Conferences (ACL, EMNLP, NAACL)

## NLP Writing Philosophy

> "Strong empirical results on standard benchmarks with insightful analysis."

NLP papers balance empirical rigor with linguistic insight. Human evaluation is increasingly important alongside automatic metrics.

## Audience and Tone

### Target Reader

- NLP researchers and computational linguists
- Familiar with transformer architectures and standard benchmarks
- Expect reproducible results and error analysis

### Tone Characteristics

| Characteristic | Description |
|---------------|-------------|
| **Task-focused** | Clear problem definition |
| **Benchmark-oriented** | Standard datasets emphasized |
| **Analysis-rich** | Error analysis, qualitative examples |
| **Reproducible** | Full implementation details |

## Abstract (NLP Style)

### Structure

- **Task/problem** (1 sentence)
- **Limitation of prior work** (1 sentence)
- **Your approach** (1-2 sentences)
- **Results on benchmarks** (2 sentences)
- **Analysis finding** (optional, 1 sentence)

### Example Abstract

```
Coreference resolution remains challenging for pronouns with distant or
ambiguous antecedents. Prior neural approaches struggle with these
difficult cases due to limited context modeling. We introduce
LongContext-Coref, a retrieval-augmented coreference model that
dynamically retrieves relevant context from document history. On the
OntoNotes 5.0 benchmark, LongContext-Coref achieves 83.4 F1, improving
over the previous state of the art by 1.2 points. On the challenging
WinoBias dataset, we reduce gender bias by 34% while maintaining
accuracy. Qualitative analysis reveals that our model successfully
resolves pronouns requiring world knowledge, a known weakness of prior
approaches.
```

## NLP Paper Structure

```
├── Introduction
│   ├── Task motivation
│   ├── Prior work limitations
│   ├── Your contribution
│   └── Contribution bullets
├── Related Work
├── Method
│   ├── Problem formulation
│   ├── Model architecture
│   └── Training procedure
├── Experiments
│   ├── Datasets (with statistics)
│   ├── Baselines
│   ├── Main results
│   ├── Analysis
│   │   ├── Error analysis
│   │   ├── Ablation study
│   │   └── Qualitative examples
│   └── Human evaluation (if applicable)
├── Discussion / Limitations
└── Conclusion
```

## NLP-Specific Requirements

### Datasets

- Use **standard benchmarks**: GLUE, SQuAD, CoNLL, OntoNotes
- Report **dataset statistics**: train/dev/test sizes
- **Data preprocessing**: Document all steps

### Evaluation Metrics

- **Task-appropriate metrics**: F1, BLEU, ROUGE, accuracy
- **Statistical significance**: Paired bootstrap, p-values (see the sketch below)
- **Multiple runs**: Report mean ± std across seeds
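The paired bootstrap and mean ± std points above are straightforward to script yourself. The following is a minimal sketch of a paired bootstrap resampling test, assuming you have per-example scores whose mean is the corpus-level metric (true for accuracy or sentence-level F1; for corpus-level metrics such as BLEU, recompute the metric on each resample instead of averaging). The function name and the placeholder scores are illustrative, not taken from any particular library.

```python
import numpy as np

def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Paired bootstrap resampling for two systems scored on the same test set.

    scores_a, scores_b: per-example metric values for systems A and B.
    Returns (observed_delta, p_value), where p_value estimates how often
    A fails to beat B when the test set is resampled with replacement.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    assert a.shape == b.shape, "systems must be scored on the same examples"
    n = len(a)

    observed_delta = a.mean() - b.mean()
    losses = 0
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)   # resample test examples with replacement
        if a[idx].mean() - b[idx].mean() <= 0:
            losses += 1
    return observed_delta, losses / n_resamples


# Usage: significance of our system vs. the baseline, on placeholder scores.
ours = np.random.default_rng(1).normal(0.84, 0.05, size=500)
base = np.random.default_rng(2).normal(0.82, 0.05, size=500)
delta, p = paired_bootstrap(ours, base)
print(f"delta = {delta:.3f}, p ~ {p:.4f}")
```

Reporting mean ± std across seeds is then just a matter of rerunning training with several seeds and aggregating the resulting scores.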
### Human Evaluation

Increasingly expected for generation tasks:

- **Annotator details**: Number, qualifications, agreement
- **Evaluation protocol**: Guidelines, interface, payment
- **Inter-annotator agreement**: Cohen's κ or Krippendorff's α

### Example Human Evaluation Table

```
Table 3: Human Evaluation Results (100 samples, 3 annotators)
─────────────────────────────────────────────────────────────
Method      | Fluency | Coherence | Factuality | Overall
─────────────────────────────────────────────────────────────
Baseline    |   3.8   |    3.2    |    3.5     |   3.5
GPT-3.5     |   4.2   |    4.0    |    3.7     |   4.0
Our Method  |   4.4   |    4.3    |    4.1     |   4.3
─────────────────────────────────────────────────────────────
Inter-annotator κ = 0.72. Scale: 1-5 (higher is better).
```
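If you report Cohen's κ as in the table above, it helps to be explicit about how it was computed. Here is a minimal sketch for two annotators assigning categorical labels to the same items; the toy labels are invented for illustration, and `sklearn.metrics.cohen_kappa_score` should give the same value if you prefer a library call. For three or more annotators (as in Table 3) or ordinal rating scales, Krippendorff's α is the usual alternative.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)

    # Observed agreement: fraction of items both annotators labeled identically.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n

    # Chance agreement from each annotator's marginal label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)

    return (observed - chance) / (1 - chance)


# Usage: two annotators judging 8 outputs as "good" / "bad".
a = ["good", "good", "bad", "good", "bad", "bad",  "good", "good"]
b = ["good", "bad",  "bad", "good", "bad", "good", "good", "good"]
print(f"kappa = {cohens_kappa(a, b):.2f}")
```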
## ACL-Specific Notes

- **ARR (ACL Rolling Review)**: Shared review system across ACL venues
- **Responsible NLP checklist**: Ethics, limitations, risks
- **Long (8 pages) vs. Short (4 pages)**: Different expectations
- **Findings papers**: Accepted to the Findings companion volume rather than the main conference

---

# Part 2: HCI Conferences (CHI, CSCW, UIST)

## HCI Writing Philosophy

> "Technology in service of humans: understand users first, then design and evaluate."

HCI papers are fundamentally **user-centered**. Technology novelty alone is insufficient; understanding human needs and demonstrating user benefit is essential.

## Audience and Tone

### Target Reader

- HCI researchers and practitioners
- UX designers and product developers
- Interdisciplinary (CS, psychology, design, social science)

### Tone Characteristics

| Characteristic | Description |
|---------------|-------------|
| **User-centered** | Focus on people, not technology |
| **Design-informed** | Grounded in design thinking |
| **Empirical** | User studies provide evidence |
| **Reflective** | Consider broader implications |

## HCI Abstract

### Focus on Users and Impact

```
Video calling has become essential for remote collaboration, yet
current interfaces poorly support the peripheral awareness that makes
in-person work effective. Through formative interviews with 24 remote
workers, we identified three key challenges: difficulty gauging
colleague availability, lack of ambient presence cues, and interruption
anxiety. We designed AmbientOffice, a peripheral display system that
conveys teammate presence through subtle ambient visualizations. In a
two-week deployment study with 18 participants across three distributed
teams, AmbientOffice increased spontaneous collaboration by 40% and
reduced perceived isolation (p<0.01). Participants valued the system's
non-intrusive nature and reported feeling more connected to remote
colleagues. We discuss implications for designing ambient awareness
systems and the tension between visibility and privacy in remote work.
```

## HCI Paper Structure

### Research Through Design / Systems Papers

```
├── Introduction
│   ├── Problem in human terms
│   ├── Why technology can help
│   └── Contribution summary
├── Related Work
│   ├── Domain background
│   ├── Prior systems
│   └── Theoretical frameworks
├── Formative Work (often)
│   ├── Interviews / observations
│   └── Design requirements
├── System Design
│   ├── Design rationale
│   ├── Implementation
│   └── Interface walkthrough
├── Evaluation
│   ├── Study design
│   ├── Participants
│   ├── Procedure
│   ├── Findings (quant + qual)
│   └── Limitations
├── Discussion
│   ├── Design implications
│   ├── Generalizability
│   └── Future work
└── Conclusion
```

### Qualitative / Interview Studies

```
├── Introduction
├── Related Work
├── Methods
│   ├── Participants
│   ├── Procedure
│   ├── Data collection
│   └── Analysis method (thematic, grounded theory, etc.)
├── Findings
│   ├── Theme 1 (with quotes)
│   ├── Theme 2 (with quotes)
│   └── Theme 3 (with quotes)
├── Discussion
│   ├── Implications for design
│   ├── Implications for research
│   └── Limitations
└── Conclusion
```

## HCI-Specific Requirements

### Participant Reporting

- **Demographics**: Age, gender, relevant experience
- **Recruitment**: How and where recruited
- **Compensation**: Payment amount and type
- **IRB approval**: Ethics board statement

### Quotes in Findings

Use direct quotes to ground findings:

```
Participants valued the ambient nature of the display. As P7 described:
"It's like having a window to my teammate's office. I don't need to
actively check it, but I know they're there." This passive awareness
reduced the barrier to initiating contact.
```

### Design Implications Section

Translate findings into actionable guidance:

```
**Implication 1: Support peripheral awareness without demanding
attention.** Ambient displays should be visible in peripheral vision
but not require active monitoring. Designers should consider calm
technology principles.

**Implication 2: Balance visibility with privacy.** Users want to share
presence but fear surveillance. Systems should provide granular controls
and make visibility mutual.
```

## CHI-Specific Notes

- **Contribution types**: Empirical, artifact, methodological, theoretical
- **ACM format**: `acmart` document class (historically the `sigchi` option; check the current call for the required format)
- **Accessibility**: Alt text, inclusive language expected
- **Contribution statement**: Required per-author contributions

---

# Part 3: Data Mining & IR (SIGKDD, WWW, SIGIR)

## Data Mining Writing Philosophy

> "Scalable methods for real-world data with demonstrated practical impact."

Data mining papers emphasize **scalability**, **real-world applicability**, and **solid experimental methodology**.

## Audience and Tone

### Target Reader

- Data scientists and ML engineers
- Industry researchers
- Applied ML practitioners

### Tone Characteristics

| Characteristic | Description |
|---------------|-------------|
| **Scalable** | Handle large datasets |
| **Practical** | Real-world applications |
| **Reproducible** | Datasets and code shared |
| **Industrial** | Industry datasets valued |

## KDD Abstract

### Emphasize Scale and Application

```
Fraud detection in e-commerce requires processing millions of
transactions in real time while adapting to evolving attack patterns.
We present FraudShield, a graph neural network framework for real-time
fraud detection that scales to billion-edge transaction graphs. Unlike
prior methods that require full graph access, FraudShield uses
incremental updates with O(1) inference cost per transaction. On a
proprietary dataset of 2.3 billion transactions from a major e-commerce
platform, FraudShield achieves 94.2% precision at 80% recall,
outperforming production baselines by 12%. The system has been deployed
at [Company], processing 50K transactions per second and preventing an
estimated $400M in annual fraud losses. We release an anonymized
benchmark dataset and code.
```

## KDD Paper Structure

```
├── Introduction
│   ├── Problem and impact
│   ├── Technical challenges
│   ├── Your approach
│   └── Contributions
├── Related Work
├── Preliminaries
│   ├── Problem definition
│   └── Notation
├── Method
│   ├── Overview
│   ├── Technical components
│   └── Complexity analysis
├── Experiments
│   ├── Datasets (with scale statistics)
│   ├── Baselines
│   ├── Main results
│   ├── Scalability experiments
│   ├── Ablation study
│   └── Case study / deployment
└── Conclusion
```

## KDD-Specific Requirements

### Scalability

- **Dataset sizes**: Report number of nodes, edges, samples
- **Runtime analysis**: Wall-clock time comparisons (see the timing sketch below)
- **Complexity**: Time and space complexity stated
- **Scaling experiments**: Show performance vs. data size
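Runtime numbers are easy to skew with cold caches or single measurements, so it is worth standardizing how they are collected. Below is a minimal sketch of a timing harness that warms up, repeats each run, and reports the median wall-clock time; `run_method` and the dataset sizes are placeholders for whatever method and data you are actually benchmarking.

```python
import time
from statistics import median

def time_method(run_method, data, repeats=5, warmup=1):
    """Median wall-clock seconds for run_method(data), after warm-up runs."""
    for _ in range(warmup):
        run_method(data)                      # populate caches, trigger JIT, etc.
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_method(data)
        times.append(time.perf_counter() - start)
    return median(times)


# Usage: runtime vs. data size for a stand-in method.
def run_method(data):
    return sorted(data)                       # placeholder for the real algorithm

for n in [10_000, 100_000, 1_000_000]:
    data = list(range(n, 0, -1))
    print(f"n = {n:>9,}: {time_method(run_method, data):.3f} s")
```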
### Industrial Deployment

- **Case studies**: Real-world deployment stories
- **A/B tests**: Online evaluation results (if applicable)
- **Production metrics**: Business impact (if shareable)

### Example Scalability Table

```
Table 4: Scalability Comparison (runtime in seconds)
──────────────────────────────────────────────────────────
Dataset         | Nodes  | Edges  | GCN  | GraphSAGE | Ours
──────────────────────────────────────────────────────────
Cora            |  2.7K  |  5.4K  | 0.3  |    0.2    |  0.1
Citeseer        |  3.3K  |  4.7K  | 0.4  |    0.3    |  0.1
PubMed          | 19.7K  | 44.3K  | 1.2  |    0.8    |  0.3
ogbn-arxiv      |  169K  | 1.17M  | 8.4  |    4.2    |  1.6
ogbn-papers100M |  111M  |  1.6B  | OOM  |    OOM    | 42.3
──────────────────────────────────────────────────────────
```

---

# Part 4: Common Elements Across CS Venues

## Writing Quality

### Clarity

- **One idea per sentence**
- **Define terms before use**
- **Use consistent notation**

### Precision

- **Exact numbers**: "23.4%" not "about 20%"
- **Clear claims**: Avoid hedging unless necessary
- **Specific comparisons**: Name the baseline

## Contribution Bullets

Used across all CS venues:

```
Our contributions are:
• We identify [problem/insight]
• We propose [method name] that [key innovation]
• We demonstrate [results] on [benchmarks]
• We release [code/data] at [URL]
```

## Reproducibility Standards

All CS venues increasingly expect:

- **Code availability**: GitHub link (anonymous for review)
- **Data availability**: Public datasets or release plans
- **Full hyperparameters**: Training details complete
- **Random seeds**: Exact values for reproduction (see the sketch below)
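As a concrete illustration of the hyperparameter and seed points above, here is a minimal sketch of an experiment entry point that fixes the random seeds and writes the exact configuration to disk alongside the results. The argument names and output path are placeholders, and framework-specific seeding (e.g., `torch.manual_seed`) is left commented out since this guide is framework-agnostic.

```python
import argparse
import json
import random

import numpy as np

def set_seed(seed: int) -> None:
    """Fix every random number generator the experiment touches."""
    random.seed(seed)
    np.random.seed(seed)
    # torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)  # if using PyTorch

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--lr", type=float, default=3e-5)
    parser.add_argument("--batch-size", type=int, default=32)
    args = parser.parse_args()

    set_seed(args.seed)

    # Record the exact configuration so the run can be reproduced later.
    with open(f"config_seed{args.seed}.json", "w") as f:
        json.dump(vars(args), f, indent=2)

    # ... training and evaluation go here ...

if __name__ == "__main__":
    main()
```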
## Ethics and Broader Impact

### NLP (ACL/EMNLP)

- **Limitations section**: Required
- **Responsible NLP checklist**: Ethical considerations
- **Bias analysis**: For models affecting people

### HCI (CHI)

- **IRB/Ethics approval**: Required for human subjects
- **Informed consent**: Procedure described
- **Privacy considerations**: Data handling

### KDD/WWW

- **Societal impact**: Consider misuse potential
- **Privacy preservation**: For sensitive data
- **Fairness analysis**: When applicable

---

## Venue Comparison Table

| Aspect | ACL/EMNLP | CHI | KDD/WWW | SIGIR |
|--------|-----------|-----|---------|-------|
| **Focus** | NLP tasks | User studies | Scalable ML | IR/search |
| **Evaluation** | Benchmarks + human | User studies | Large-scale exp | Datasets |
| **Theory weight** | Moderate | Low | Moderate | Moderate |
| **Industry value** | High | Medium | Very high | High |
| **Page limit** | 8 long / 4 short | 10 + refs | 9 + refs | 10 + refs |
| **Review style** | ARR | Direct | Direct | Direct |

---

## Pre-Submission Checklist

### All CS Venues

- [ ] Clear contribution statement
- [ ] Strong baselines
- [ ] Reproducibility information complete
- [ ] Correct venue template
- [ ] Anonymized (if double-blind)

### NLP-Specific

- [ ] Standard benchmark results
- [ ] Error analysis included
- [ ] Human evaluation (for generation)
- [ ] Responsible NLP checklist

### HCI-Specific

- [ ] IRB approval stated
- [ ] Participant demographics
- [ ] Direct quotes in findings
- [ ] Design implications

### Data Mining-Specific

- [ ] Scalability experiments
- [ ] Dataset size statistics
- [ ] Runtime comparisons
- [ ] Complexity analysis

---

## See Also

- `venue_writing_styles.md` - Master style overview
- `ml_conference_style.md` - NeurIPS/ICML style guide
- `conferences_formatting.md` - Technical formatting requirements
- `reviewer_expectations.md` - What CS reviewers seek