O*NET 29.1 Database
O*NET Resource Center (2024). The 29.1 release of the O*NET database, November 2024.
Which economic tasks are performed with AI? Evidence from millions of Claude conversations
Kunal Handa, Alex Tamkin, Miles McCain, et al. (2025). arXiv preprint arXiv:2503.04761.
GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
Tejal Patwardhan, Rachel Dias, Elizabeth Proehl, et al. (2025).
Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the US Workforce
Yijia Shao, Humishka Zope, Yucheng Jiang, et al. (2025). arXiv preprint arXiv:2506.06576.
How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations
Zora Wang, Yijia Shao, Omar Shaikh, et al. (2025). arXiv preprint arXiv:2510.22780.
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank F. Xu, Yufan Song, Boxuan Li, et al. (2024). arXiv preprint arXiv:2412.14161.
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, et al. (2024). Advances in Neural Information Processing Systems, 37.
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Alexandre Drouin, Maxime Gasse, Massimo Caccia, et al. (2024). arXiv preprint arXiv:2403.07718.
Agent Workflow Memory
Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, Graham Neubig (2025). International Conference on Machine Learning (ICML).
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
Cheng Yang, Xuemeng Yang, Licheng Wen, et al. (2025). arXiv preprint arXiv:2510.08002.
OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu, et al. (2025). arXiv preprint arXiv:2508.09123.
How Well Does Agent Development Reflect Real-World Work?
Zora Wang, Sanidhya Vijayvargiya, Aspen Chen, et al. (2026). arXiv preprint arXiv:2603.01203.