# The Last Human-Written Paper
## Agent-Native Research Artifacts (ARA)

**Authors:** Orchestra Team
**Year:** 2026
**Status:** Paper forthcoming
**Code:** https://github.com/Orchestra-Research/Agent-Native-Research-Artifact
**Canonical URL:** https://www.orchestra-research.com/ara

---

## Thesis

In the near future, most CS papers will be written by AI, and most will be read by AI. When neither the author nor the audience is human, the three-century-old paper format stops making sense.

Papers flatten a branching research process into a clean story. That flattening imposes two taxes on AI agents trying to understand, reproduce, or extend research.

---

## The Storytelling Tax

Research is inherently branching and exploratory. Scientists try dozens of approaches, hit dead ends, pivot, and iterate. Papers collapse this rich process into a single winning narrative, discarding every failed attempt, rejected hypothesis, and negative result.

**Example: a real research process**
- Initial question
  - Hypothesis A (CNN baseline) — dead end: OOM at batch 64
  - Hypothesis B (Transformer)
    - Standard LayerNorm — dead end: loss diverged at 7.28
    - Pivot: compute inv-std outside forward pass — training stable, loss 4.60
      - Differential LR (embedding 3e-4, transformer 3e-5) — loss 3.98, +13% improvement
  - Hypothesis C (GAN variant) — dead end: mode collapse; dead end: gradient exploded

Outcome: 5 dead ends, 1 pivot, 1 success. All of this gets thrown away.
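The branching process above is exactly what a `trace/graph.json`-style exploration graph preserves. A minimal sketch of how it could be encoded as a DAG (node fields, statuses, and IDs here are illustrative assumptions, not the actual ARA schema):

```python
# Hypothetical encoding of the example research process as a DAG.
# Node/edge field names are assumptions for illustration only.
import json

nodes = [
    {"id": "q0", "type": "question", "label": "Initial question"},
    {"id": "hA", "type": "hypothesis", "label": "CNN baseline",
     "status": "dead_end", "reason": "OOM at batch 64"},
    {"id": "hB", "type": "hypothesis", "label": "Transformer", "status": "open"},
    {"id": "hB1", "type": "attempt", "label": "Standard LayerNorm",
     "status": "dead_end", "reason": "loss diverged at 7.28"},
    {"id": "hB2", "type": "pivot", "label": "Inv-std outside forward pass",
     "status": "success", "metric": {"loss": 4.60}},
    {"id": "hB3", "type": "refinement", "label": "Differential LR",
     "status": "success", "metric": {"loss": 3.98}},
    {"id": "hC", "type": "hypothesis", "label": "GAN variant",
     "status": "dead_end", "reason": "mode collapse; gradient exploded"},
]
edges = [("q0", "hA"), ("q0", "hB"), ("q0", "hC"),
         ("hB", "hB1"), ("hB", "hB2"), ("hB2", "hB3")]

# The serialized graph keeps every branch, not just the winning path.
graph = {"nodes": nodes, "edges": [{"from": a, "to": b} for a, b in edges]}
dead_ends = [n for n in nodes if n.get("status") == "dead_end"]
print(len(dead_ends))  # 3 dead-end nodes in this sketch
```

A narrative paper keeps only the `q0 → hB → hB2 → hB3` path; the graph keeps all of it, including the reasons each branch died.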

**What gets published:** a single straight line from research question to result. No dead ends, no failures, no tricks. Knowledge lost forever.

---

## The Engineering Tax

Papers describe methods at the precision needed to convince a reviewer, not at the precision needed to reproduce the work. Hyperparameters are underspecified. Warmup schedules live in someone's head. Numerical stability fixes exist in no document. The gap between "sufficient to believe" and "sufficient to execute" is where reproduction breaks down.

**Reproduction information gap** (8,921 expert-annotated reproduction requirements across 23 ICML papers, per PaperBench; a requirement can fall into more than one category, so shares sum to more than 100%):

| Category                         | Share |
| -------------------------------- | ----- |
| Fully specified in PDF           | 45.4% |
| Missing hyperparameters          | 26.2% |
| Vague description                | 21.9% |
| Cross-reference only             | 13.4% |
| Missing code / baseline detail   | 21.7% |

Less than half of what an agent needs to reproduce a paper is actually in the PDF. The information exists somewhere — a lab notebook, a Slack thread, the author's muscle memory — but not in any document an AI agent can access. Every reproduction attempt pays the full cost of rediscovering it.

---

## The Solution: Four Interlocking Layers

ARA restructures a paper into four machine-native layers. Together they form a single executable knowledge package: the organized, evolving knowledge produced during research, not the narrative compiled afterward.

```
PAPER.md                      # Human-readable overview & entry point
│
├── logic/                    # Cognitive Layer
│   ├── claims.yaml           # Falsifiable claims with epistemic status
│   ├── concepts/             # Formal concept definitions
│   ├── experiments/          # Declarative experiment plans
│   └── problem_spec.md       # The "what and why" of the research
│
├── src/                      # Physical Layer
│   ├── kernel/               # Novel algorithm core
│   ├── configs/              # Annotated with search ranges & sensitivity
│   └── environment.yaml      # Exact reproducibility spec
│
├── trace/                    # Exploration Graph
│   ├── graph.json            # Full branching research DAG
│   ├── dead_ends/            # Every failed attempt preserved
│   └── pivots/               # Decision points & lessons learned
│
└── evidence/                 # Evidence Layer
    ├── results/              # Machine-readable quantitative outputs
    ├── logs/                 # Raw experiment logs
    └── curves/               # Training curves & metrics
```
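To make the Cognitive Layer concrete, here is a sketch of what a single entry in `logic/claims.yaml` might contain, plus a trivial validity check. The field names and allowed statuses are assumptions for illustration, not the published ARA schema:

```python
# One hypothetical claim record from logic/claims.yaml, shown as a dict.
# Field names and status vocabulary are illustrative assumptions.
claim = {
    "id": "C1",
    "statement": "Computing inv-std outside the forward pass stabilizes training",
    "status": "supported",  # epistemic status: supported / refuted / open
    "evidence": ["evidence/results/run_042.json"],
    "experiments": ["logic/experiments/invstd_ablation.yaml"],
}

REQUIRED = {"id", "statement", "status"}
ALLOWED_STATUS = {"supported", "refuted", "open"}

def validate(c: dict) -> bool:
    """Check a claim record has the required fields and a known status."""
    return REQUIRED <= c.keys() and c.get("status") in ALLOWED_STATUS

print(validate(claim))  # True
```

The point of the structure: an agent can enumerate claims, check their epistemic status, and follow the `evidence` paths into the Evidence Layer without parsing narrative prose.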

---

## Live Research Manager (LRM)

ARA does not require researchers to manually package their work. The Live Research Manager silently captures the research trajectory during AI-human collaboration: no interruptions, no extra effort. The artifact builds itself in the background.

Pipeline: Context Harvester → Event Router → Maturity Tracker.

Design principles:
- Silent integration
- Epistemic objectivity
- Framework independence
- Comprehensive capture
- Faithful translation

Example captured trajectory from one session:
1. decision: "ReLU transformer approach"
2. dead_end: "Loss diverged (norm bug)"
3. heuristic: "Inv-std outside forward pass"
4. experiment: "Training stable, loss 4.60"
5. heuristic: "Differential LR: emb 3e-4 / tfm 3e-5"
6. experiment: "Loss 3.98 (+13% vs uniform)"
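A sketch of how an Event Router stage might file these captured events into the right layers of the artifact. The event types mirror the trajectory above; the routing logic itself is an illustrative assumption, not the LRM's actual implementation:

```python
# Hypothetical Event Router: group captured session events by kind so the
# artifact builder can file each into the right layer (e.g. trace/dead_ends/
# vs. evidence/results/). Routing rules are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Event:
    kind: str      # decision | dead_end | heuristic | experiment
    summary: str

trajectory = [
    Event("decision",   "ReLU transformer approach"),
    Event("dead_end",   "Loss diverged (norm bug)"),
    Event("heuristic",  "Inv-std outside forward pass"),
    Event("experiment", "Training stable, loss 4.60"),
    Event("heuristic",  "Differential LR: emb 3e-4 / tfm 3e-5"),
    Event("experiment", "Loss 3.98 (+13% vs uniform)"),
]

def route(events):
    """Bucket events by kind, preserving within-kind order."""
    buckets = {}
    for e in events:
        buckets.setdefault(e.kind, []).append(e.summary)
    return buckets

buckets = route(trajectory)
print(sorted(buckets))  # ['dead_end', 'decision', 'experiment', 'heuristic']
```

Because capture is passive, the dead end and both heuristics survive into the artifact automatically; in a narrative paper only the final two experiment lines would typically appear.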

---

## Results

Same information, structured differently. Restructuring alone makes agents measurably more effective at understanding, reproducing, and extending research.

| Capability     | Result  | Baseline | Delta    | Benchmark                             |
| -------------- | ------- | -------- | -------- | ------------------------------------- |
| Understanding  | 93.7%   | 72.4%    | +21.3pp  | 450 questions, PaperBench + RE-Bench  |
| Reproduction   | 64.4%   | 57.4%    | +7.0pp   | 150 subtasks, 15 PaperBench papers    |
| Extension      | 3 / 5   | —        | —        | 5 RE-Bench tasks, MALT failure traces |

Reproduction advantage grows with difficulty:

| Difficulty | ARA   | PDF baseline | Delta   |
| ---------- | ----- | ------------ | ------- |
| Easy       | 85.1% | 80.2%        | +4.9pp  |
| Medium     | 68.5% | 62.9%        | +5.6pp  |
| Hard       | 54.5% | 46.0%        | +8.5pp  |

---

## Conclusion

**Knowledge over Narrative.** The organized, evolving knowledge produced during research is the primary scientific object. The narrative paper is a compiled view.

---

## Citation

```bibtex
@article{liu2026ara,
  title  = {The Last Human-Written Paper: Agent-Native Research Artifacts},
  author = {Liu, Jiachen and Pei, Jiaxin and Huang, Jintao and Si, Chenglei and Qu, Ao and Tang, Xiangru and Lu, Runyu and Chen, Lichang and Bai, Xiaoyan and Zheng, Haizhong and Chen, Carl and Chen, Zhiyang and Ye, Haojie and Fu, Yujuan and He, Zexue and Jin, Zijian and Zhang, Zhenyu and Sun, Shangquan and Harmon, Maestro and Wang, John Dianzhuo and Zeng, Jianqiao and Sun, Jiachen and Wu, Mingyuan and Zhou, Baoyu and You, Yuchen and Lu, Shijian and Qiu, Yiming and Lai, Fan and Yuan, Yuan and Li, Yao and Hong, Junyuan and Zhu, Ruihao and Chen, Beidi and Pentland, Alex and Chen, Ang and Chowdhury, Mosharaf and Zhang, Zechen},
  year   = {2026},
  note   = {Paper forthcoming}
}
```

---

## Related

- https://www.orchestra-research.com/perspectives/build-ai-co-scientists — Build AI Co-Scientists That Actually Help
- https://www.orchestra-research.com/perspectives/ai-research-skills — AI Research Engineering Skills
