Saturday, February 1, 2025

What do we want from AI Agents? What are we likely to get?

First, I present a Facebook post by Jonathan Mayhew, who teaches at the University of Kansas, about some recent frustrations he’s had using computers. Then I present a tweet from New York Times reporter Kevin Roose about the capabilities of Operator, OpenAI’s new agent app.

What’s the likelihood that OpenAI’s Operator would have been able to solve either of Mayhew’s problems? Why or why not? And if not now, when?

Mayhew is frustrated

I thought I'd go into the office. First thing I wanted to do was print a single page of something that had been sent to me by email. I have to log on to my own computer, then open my email--which won't open for me on the first 3 attempts. So I go to the email through my browser. I have to log in again, and do a dual step authentication. Then, the very first thing I see is the attachment I have to print. Yay! Almost done. I print it, go down to the dept. office and log into my account on the printer. Push the button to print, and a blank page emerges. I go back to my own office, and this time I think I should use my normal email program, so I finally get it to open. Search for the name of the person who sent me the mail. I notice in the meantime my university has sent me five more generic messages. Find the message I want, download pdf to my desktop, open the document and print again. (I ignore the prompt to quit adobe so it can continue with its update! Grr....) I go down again to the department office, log again into the printer, push the button to print, and the page prints. Success! I was able to print a single page in 20 minutes.

I don't think my computer skills are particularly lacking, since I came up, for every obstacle, with a logical next step, but I feel, somehow, that technology should be seamless in a way that it is not. It took me about as long to download my W2 yesterday from the State of Kansas, which of course uses a different user ID and password than the normal university ones. I had to switch browsers and change my password twice before it worked. When I am obliged to change my password for the university every six months I end up in an endless loop before finally figuring out where to go. The computers in the classrooms where I teach also require authentications, log ins, the answering of irrelevant prompts; are slow to respond, awkward to navigate.

This is my beginning of the semester rant--and the semester doesn't even start until Tuesday.

Roose reports

New York Times reporter Kevin Roose has been testing OpenAI’s new Operator app. Here’s a tweet about it:

I spent the last week testing OpenAI's Operator AI agent, which can use a browser to complete tasks autonomously.

Some impressions:

• Helpful for some things, esp. discrete, well-defined tasks that only require 1-2 websites. ("Buy dog food on Amazon," "book me a haircut," etc.)
• Bad at more complex open-ended tasks, and doesn't work at all on certain websites (NYT, Reddit, YouTube)
• Mesmerizing to watch what is essentially Waymo for the web, just clicking around doing stuff on its own
• Best use: having it respond to hundreds of LinkedIn messages for me
• Worst/sketchiest use: having it fill out online surveys for cash (It made me $1.20 though.)

Right now, not a ton of utility, and too expensive ($200/month). But when these get better/cheaper, look out. A few versions from now, it's not hard to imagine AI agents doing the full workload of a remote worker.

He also links to his full column about it: How Helpful Is Operator, OpenAI’s New A.I. Agent? (Feb. 1, 2025).
