Future of Work in the Age of LLMs

San Diego, California 🌴

Zora Wang

Carnegie Mellon University

Yijia Shao

Stanford University

David Nguyen

Stanford University

Diyi Yang

Stanford University

Large language models (LLMs) are reshaping how we work. They can now follow complex instructions, use software, and perform tasks once thought exclusive to humans. This raises urgent questions about job displacement, human agency, and overreliance on automation.

This tutorial maps the current landscape and looks ahead: How will occupations and task requirements evolve? What roles will LLM-based systems play as capable collaborators and autonomous workers? And how do we build the infrastructure to support effective human-AI collaboration?

Schedule: July 2, 2026

Session 1 Landscape of Human Work

09:00 – 09:10 Background and overview

09:10 – 09:20 The landscape of work

09:20 – 09:30 How NLP is transforming work

Session 2 Building LLMs to Augment Work

09:30 – 09:50 LLM-based agents

09:50 – 10:05 Training LLMs for work-related tasks

10:05 – 10:20 Representative open-source platforms

Session 3 Evaluating LLMs at Work

10:50 – 11:05 Desiderata for evaluating LLMs at work

11:05 – 11:20 Existing datasets and benchmarks

11:20 – 11:40 Metric design practice

Session 4 The Future of Work with LLMs (Panel)

11:40 – 12:00 Technical challenges

12:00 – 12:15 Evolving roles of human workers

12:15 – 12:30 Social impact, risks, and closing remarks

Panelists

Stanford University

Stanford University

Stanford University

OpenAI

MIT

Resources

O*NET 29.1 Database
O*NET Resource Center (2024). The 29.1 release of the O*NET database, November 2024.

Which economic tasks are performed with AI? Evidence from millions of Claude conversations
Kunal Handa, Alex Tamkin, Miles McCain, et al. (2025). arXiv preprint arXiv:2503.04761.

GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
Tejal Patwardhan, Rachel Dias, Elizabeth Proehl, et al. (2025).

Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the US Workforce
Yijia Shao, Humishka Zope, Yucheng Jiang, et al. (2025). arXiv preprint arXiv:2506.06576.

How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations
Zora Wang, Yijia Shao, Omar Shaikh, et al. (2025). arXiv preprint arXiv:2510.22780.

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank F Xu, Yufan Song, Boxuan Li, et al. (2024). arXiv preprint arXiv:2412.14161.

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, et al. (2024). Advances in Neural Information Processing Systems, 37.

WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Alexandre Drouin, Maxime Gasse, Massimo Caccia, et al. (2024). arXiv preprint arXiv:2403.07718.

Agent Workflow Memory
Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, Graham Neubig (2025). ICML 2025.

Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
Cheng Yang, Xuemeng Yang, Licheng Wen, et al. (2025). arXiv preprint arXiv:2510.08002.

OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu, et al. (2025). arXiv preprint arXiv:2508.09123.

How Well Does Agent Development Reflect Real-World Work?
Zora Wang, Sanidhya Vijayvargiya, Aspen Chen, et al. (2026). arXiv preprint arXiv:2603.01203.

BibTeX

@inproceedings{future-of-work-tutorial,
    title = "Future of Work in the Age of LLMs",
    author = "Wang, Zora and Shao, Yijia and Nguyen, David and Yang, Diyi",
    booktitle = "Proceedings of the 65th Annual Meeting of the Association for Computational Linguistics",
    publisher = "Association for Computational Linguistics",
}

Tutorial at ACL 2026
