Future of Work in the Age of LLMs

Tutorial at ACL 2026

San Diego, California 🌴

Zora Wang
Carnegie Mellon University
Yijia Shao
Stanford University
David Nguyen
Stanford University
Diyi Yang
Stanford University

Large language models (LLMs) are reshaping how we work. They can now follow complex instructions, use software, and perform tasks once thought exclusive to humans. This raises urgent questions about job displacement, human agency, and overreliance on automation.

This tutorial maps the current landscape and looks ahead: How will occupations and task requirements evolve? What roles will LLM-based systems play as capable collaborators and autonomous workers? And how do we build the infrastructure to support effective human-AI collaboration?

Schedule: July 2, 2026

Session 1 Landscape of Human Work

09:00 – 09:10 Background and overview
09:10 – 09:25 The landscape of work
09:25 – 09:40 How NLP is transforming work

Session 2 Building LLMs to Augment Work

09:40 – 09:55 LLM-based agents
09:55 – 10:10 Training LLMs for work-related tasks
10:10 – 10:20 Representative open-source platforms

Session 3 Evaluating LLMs at Work

10:50 – 11:00 Desiderata for evaluating LLMs at work
11:00 – 11:15 Existing datasets and benchmarks
11:15 – 11:30 Metric design practice

Session 4 The Future of Work with LLMs

11:30 – 11:45 Technical challenges
11:45 – 11:55 Evolving roles of human workers
11:55 – 12:00 Social impact, risks, and closing remarks

Resources

O*NET 29.1 Database
O*NET Resource Center (2024). The 29.1 release of the O*NET database, November 2024.

Which economic tasks are performed with AI? Evidence from millions of Claude conversations
Kunal Handa, Alex Tamkin, Miles McCain, et al. (2025). arXiv preprint arXiv:2503.04761.

GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
Tejal Patwardhan, Rachel Dias, Elizabeth Proehl, et al. (2025).

Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the US Workforce
Yijia Shao, Humishka Zope, Yucheng Jiang, et al. (2025). arXiv preprint arXiv:2506.06576.

How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations
Zora Wang, Yijia Shao, Omar Shaikh, et al. (2025). arXiv preprint arXiv:2510.22780.

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank F Xu, Yufan Song, Boxuan Li, et al. (2024). arXiv preprint arXiv:2412.14161.

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, et al. (2024). Advances in Neural Information Processing Systems, 37.

WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Alexandre Drouin, Maxime Gasse, Massimo Caccia, et al. (2024). arXiv preprint arXiv:2403.07718.

Agent Workflow Memory
Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, Graham Neubig (2025). ICML 2025.

Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
Cheng Yang, Xuemeng Yang, Licheng Wen, et al. (2025). arXiv preprint arXiv:2510.08002.

OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu, et al. (2025). arXiv preprint arXiv:2508.09123.

How Well Does Agent Development Reflect Real-World Work?
Zora Wang, Sanidhya Vijayvargiya, Aspen Chen, et al. (2026). arXiv preprint arXiv:2603.01203.

BibTeX

@inproceedings{future-of-work-tutorial,
    title = "Future of Work in the Age of LLMs",
    author = "Wang, Zora  and Shao, Yijia  and Nguyen, David  and Yang, Diyi",
    booktitle = "Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics",
    year = "2026",
    publisher = "Association for Computational Linguistics",
}