First mentioned as far back as November of last year, OpenAI's very own AI agent, named Operator, could now be ready for release in the near future. Other key takeaways include:
- Biden's 2023 AI executive order gets revoked by now-President Donald Trump on his first day
- A recently published FTC report digs into the competitive risks posed by major AI-cloud partnerships
- Epoch AI's FrontierMath benchmark comes under fire after the nonprofit failed to disclose OpenAI funding in time
Join us at AI Tangle as we untangle this week's happenings in AI!
|
AI agents that take direct control of your computer or other devices are steadily gaining attention, and OpenAI's own entry, named Operator, may debut shortly. A report by The Information states that OpenAI is targeting a release as soon as this January, and code uncovered by reputable leaker and software engineer Tibor Blaho supports that notion.
What do we know of Operator and its release so far?
Operator has been known to be in the works since as early as November 2024, and Blaho's discoveries, including hidden "Toggle Operator" and "Force Quit Operator" commands in OpenAI's macOS ChatGPT client, could signal readiness for release. Early benchmarks suggest mixed performance: while it excels at web navigation on WebVoyager and beats Anthropic on OSWorld, it struggles with more specific tasks, like creating Bitcoin wallets. One area that has received particular attention is Operator's safety guardrails, designed to keep it from performing "illicit activities" on behalf of users. OpenAI co-founder Wojciech Zaremba even criticized rivals (specifically Anthropic) for unsafe agent releases in a recent X/Twitter post, highlighting Operator's careful development in a market projected to reach $47.1 billion by 2030.
|
On his first day back in office, US President Donald Trump revoked a 2023 executive order by Joe Biden that required AI developers to share safety test results for high-risk AI systems with the government before public release. Biden's original order sought to establish safety standards during a period of legislative idleness on AI guardrails and to reduce the risks posed to consumers, workers, and national security, but the Republican Party criticized the order, arguing it held back innovation. A separate Biden order supporting energy needs for AI data centers has not been repealed by Trump thus far.
The Federal Trade Commission (FTC) has recently released a report that dives into the partnerships between leading cloud service providers (CSPs) like Google, Amazon, and Microsoft and two of the largest AI firms: OpenAI and Anthropic. The report mainly concerns itself with competitive risks, such as lock-in effects, restricted access to AI resources for smaller players in the space, and CSPs gaining sensitive business insights through these partnerships. FTC Chair Lina Khan warned that these partnerships could "undermine innovation, fair markets, and competition."
Epoch AI, a nonprofit developing AI math benchmarks, faced criticism after disclosing on December 20 that it had received funding from OpenAI for its FrontierMath benchmark, one of the tests used to demo OpenAI's o3. In a LessWrong post by a contractor writing as "meemi", many contributors voiced concern about the lack of transparency, saying they were unaware of OpenAI's involvement and potential exclusive access to the benchmark, which some argue undermines the point of the demo. Epoch AI admitted to the slip-up but defended the benchmark's integrity, pointing to safeguards such as a "separate holdout set," though it has yet to independently verify OpenAI's results.
The use of AI in film editing has, once again, sparked controversy during this awards season, as two major Oscar contenders, "The Brutalist" and "Emilia Pérez," disclosed using AI voice-cloning technology to enhance performances in one way or another. In "The Brutalist," AI was used to speed up and refine the Hungarian dialogue, while in "Emilia Pérez" it was leveraged to adjust Karla Sofía Gascón's singing range. The disclosures have drawn critics and supporters alike, and there is no consensus yet on whether the use of AI will affect the results.
AI-powered search startup Perplexity has acquired Read.cv, a professional social networking platform and LinkedIn competitor; shortly after, the platform announced it would cease operations starting January 17. While Perplexity has not disclosed specific plans for Read.cv, CEO Aravind Srinivas praised the team's expertise in consumer and social experiences and expressed his enthusiasm for welcoming them on board. This marks Perplexity's third acquisition since its inception, with Read.cv joining Carbon and Spellwise on a list backed by substantial VC funding.
|
Nuelink - Nuelink helps you organize, automate, analyze, and manage your social media from one unified place, saving you time and enabling you to focus on your business while your social media runs itself.
Copyleaks - Detect plagiarism across multiple languages with Copyleaks, the all-in-one intuitive and easy-to-use platform to help create and protect original content.
Roundtable - Clean your survey responses with Roundtable, an easy-to-integrate API with behavioral tracking, dynamic clustering, and more to cut down on time spent analyzing survey data.
Gocodeo - Gocodeo is an AI-powered unit testing suite that catches pesky software bugs early in the codebase's development lifecycle.
|
The Second Wave of AI Coding (6-min read)
Ask people in tech what generative AI is currently good for, and many will point to coding; a string of startups is racing to build a second generation of models that can produce better software, believing that code is the fast track to achieving AGI.
|
What did you think of this newsletter? Let us know!