Using its own browser, it can look at a webpage and interact with it by typing, clicking, and scrolling... Operator is one of our first agents, which are AIs capable of doing work for you independently—you give it a task and it will execute it... Operator can “see” (through screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.
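The loop described above -- screenshot in, mouse/keyboard action out -- can be sketched roughly like this. This is a toy illustration only, not OpenAI's actual implementation: the vision model is replaced by a hard-coded stub, and the "browser" is a list of fake page descriptions.

```python
# Toy sketch of a screenshot-driven agent loop, in the spirit of the
# Operator description above. NOT OpenAI's implementation: the "model"
# is a hard-coded stub and the "browser" is faked as strings.

from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", "scroll", or "done"
    target: str = ""   # e.g. a UI element name (hypothetical)
    text: str = ""     # text to type, if kind == "type"

def stub_model(screenshot: str, task: str) -> Action:
    """Stand-in for the vision model: maps what it 'sees' to an action."""
    if "search box" in screenshot:
        return Action("type", target="search box", text=task)
    if "results" in screenshot:
        return Action("done")
    return Action("scroll")

def run_agent(task: str, screens: list[str]) -> list[Action]:
    """Observe (screenshot) -> decide (model) -> act, until done."""
    actions = []
    for screenshot in screens:          # each iteration = one "frame"
        action = stub_model(screenshot, task)
        actions.append(action)
        if action.kind == "done":
            break
    return actions

# Example: the fake page shows a search box, then a results page.
history = run_agent("weather in Paris",
                    ["page with search box", "page with results"])
print([a.kind for a in history])  # -> ['type', 'done']
```

The point of the sketch is the shape of the loop: no custom API integration, just repeated observe-decide-act, which is why it generalizes to arbitrary sites but also why it can fail in ways a scripted macro would not.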
Up to now, I have been delighted if I can record my actions and play them back (e.g. QuicKeys, CoScripter), or create scripts by hand that automate tasks (e.g. QuicKeys and Keyboard Maestro).
Operator promises to leap ahead and simply perform a task when you give it a description, without even having to demonstrate the task once!
This is a new and amazing world that is coming into existence.
Sounds dangerous – the functional correctness of LLM-generated code hovers at barely better than a coin toss. They model syntax, but their suppliers essentially cross their fingers and hope to get lucky, some of the time, on meaning.
"Leap ahead and simply perform" would quickly end in tears.
Experience has shown that most users can barely articulate what they want to do, never mind describe it well enough that an Operator can take over. So expect a lot of false starts and clarifications -- they'll have to explain a dozen times to avoid "having to demonstrate the task once".
And note that Operator is (currently) limited to web sites/web apps -- you won't be using it to drive your laptop anytime soon.
Not knocking it, nor their ambition -- if it can do half the things claimed it will be pretty amazing. And they've at least considered "Safety and privacy" this time round!
Understandably – questions are built from concepts, and until there's been enough experimentation to acquire the relevant concepts, a clear and relevant question is very hard to frame.
(Once the concepts are in place, the questions are often no longer needed...)
But LLMs form no concepts at all – just the statistics of word distributions.
It uses its own browser and does not operate outside of it, so it can make use of whatever functionality OpenAI has built into that browser and isn't necessarily dependent on any public APIs.
A much bigger story for OpenAI's prospects today (2025-01-27) is a big setback to the pitch which they have been making to investors – that more spending leads to better outcomes for AI models.
Nvidia's (related) share price – which had depended on an assumption of huge future demand for chips, driven by deep learning – went over a slightly alarming cliff.
With no disrespect to the OP, this thread is undeniably off-topic in "Questions & Suggestions", and if an alternative topic area is not provided, there are reasons for that. I hope the community leaders will treat this as a "one-off" exception.
What seems like magic comes with limitations that are not revealed until one uses it.
You can take a look at this article:
I'm looking forward to the day when Operator can work alongside third-party tools like Keyboard Maestro. That really makes sense, because there is no foolproof way for such automation to cover all cases (e.g. edge cases).
Perhaps people who want a less... shall we say "regulated"... response? See this (free) Guardian article. Such people might also want to review the privacy policy.
Similar applies to other services, of course. But, for some reason, certain governments that are quite happy letting (western) corporations (ab)use our data get rather twitchy when China is involved. So I suspect that the answer to "who needs OpenAI anymore" will be "those who aren't allowed to use DeepSeek" -- at least until more "acceptable" concerns start deploying the R1 model themselves.
I wouldn't want to use DeepSeek. All of its political views are approved by the Chinese Communist Party. Congress will probably debate banning it from America for being a "Chinese weapon."