Safe Superintelligence Not Easy to Discern
Ilya Sutskever, an AI safety movement leader formerly at OpenAI, started a company and quickly raised money; clarification of the mission is coming; the AI safety movement is in transition
By John P. Desmond, Editor, AI in Business

Following the startup Safe Superintelligence should be an exercise in following the state of the AI safety industry. If so, it is difficult to discern where that industry stands.
Founder Ilya Sutskever and two associates started SSI and quickly raised $1 billion. Sutskever was a cofounder and the former chief scientist of OpenAI. He left OpenAI in May along with Jan Leike; the two had been working together on the Superalignment team inside OpenAI, exploring ways to prevent future versions of the technology from doing harm. (See Is the AI Safety Movement, Said to be Dead, Coming Back?, from AI in Business, June 7, 2024.)
Superalignment, used as one word in AI discourse, refers to the effort to align AI with human values to lead to safe, beneficial systems. “The scope, scale, complexity and widespread use of powerful AI models lead to an entirely new set of alignment challenges,” stated a recent account on the blog of DataCamp written by Arun Nanda, a developer. “Superalignment is an evolving field,” he added.
Large language models present particular alignment challenges. Nanda cites these: bias perpetuation, complex systems, problems of scale. One excerpt: “The large size of these models, the vastness of their training data and the wide variety of use cases make it challenging to predict the system’s behavior. It makes it difficult to ensure the model’s output is aligned with human values for all use cases.”
New methods are being proposed to better ensure alignment, including filtering undesirable output, bias detection and fact-checking.
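To make those methods concrete, the sketch below shows what a simple guardrail layer wrapped around a model call might look like. It is a minimal illustration, not drawn from any account cited here: the generate() stub, the blocklist and the fact table are invented assumptions standing in for a real model API and real classifiers.

```python
import re

# Hypothetical guardrail layer illustrating the three methods named above:
# output filtering, bias detection, and fact-checking. The generate() stub,
# blocklist, and fact table are invented for illustration only.

BLOCKLIST = re.compile(r"\b(build a weapon|credit card number)\b", re.IGNORECASE)
LOADED_TERMS = ("always", "never", "everyone knows")  # crude bias heuristics

def generate(prompt: str) -> str:
    """Stand-in for a real LLM API call; simply echoes the prompt."""
    return f"Model says: {prompt}"

def passes_filter(text: str) -> bool:
    """Reject output matching disallowed patterns."""
    return BLOCKLIST.search(text) is None

def bias_flags(text: str) -> list[str]:
    """Flag sweeping generalizations as candidates for human review."""
    lowered = text.lower()
    return [term for term in LOADED_TERMS if term in lowered]

def fact_check(text: str, known_facts: dict[str, str]) -> list[str]:
    """Compare simple 'X is Y' claims against a trusted lookup table."""
    issues = []
    for subject, value in re.findall(r"(\w+) is (\w+)", text):
        expected = known_facts.get(subject.lower())
        if expected and expected != value.lower():
            issues.append(f"'{subject} is {value}' (expected: {expected})")
    return issues

def guarded_generate(prompt: str) -> str:
    """Run the model, then apply each alignment check to its output."""
    text = generate(prompt)
    if not passes_filter(text):
        return "[response withheld by output filter]"
    for term in bias_flags(text):
        print(f"bias flag: '{term}' -- route to human review")
    for issue in fact_check(text, {"water": "wet"}):
        print(f"fact-check flag: {issue}")
    return text

print(guarded_generate("Everyone knows water is dry"))
```

A production system would replace each heuristic with a trained classifier or a retrieval-backed checker; the point of the sketch is only that each method operates on the model’s output after generation.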
As for explainability, most predecessor IT programs are rules-based. “You can debug the system to pinpoint which rules are responsible for a particular output,” Nanda states. “In contrast, LLMs have billions of parameters. It is impossible to track down which weights contribute to different parts of the output.”
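To see the contrast Nanda draws, consider a toy rules-based system, sketched below. The rules and sensor reading are hypothetical, but the point holds: every action carries the name of the rule that produced it, a debuggable lineage with no analogue among an LLM’s billions of weights.

```python
# Toy rules-based system illustrating the traceability Nanda contrasts with
# LLMs. The rules and the sensor reading are hypothetical, but every output
# can be traced back to the specific rule that produced it.

RULES = [
    ("high_temp", lambda r: r["temp_c"] > 30, "issue heat advisory"),
    ("low_temp", lambda r: r["temp_c"] < 0, "issue frost warning"),
]

def decide(reading: dict) -> list[tuple[str, str]]:
    """Return (rule_name, action) pairs so each action is attributable."""
    return [(name, action) for name, cond, action in RULES if cond(reading)]

for rule, action in decide({"temp_c": 34}):
    print(f"action '{action}' fired by rule '{rule}'")  # debuggable lineage
```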
Along comes “reasoning,” an “emergent property of large language models.” That means a property of a complex system that cannot be explained by the sum of its parts. So what to do? Ask the LLM to explain itself, Nanda suggests, noting that LLMs can explain their responses and reasoning through multistep problems. “Asking an LLM to explain a complex solution and manually evaluating each step can allow humans to validate the LLM’s response,” he stated.
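The pattern Nanda suggests can be sketched as a small loop: request a numbered chain of steps, then put each step in front of a human reviewer. In the sketch below, the llm_call() stub is an assumption standing in for any chat-completion API; it returns a canned arithmetic breakdown so the example runs on its own.

```python
# Sketch of the validate-by-explanation pattern: ask the model for numbered
# reasoning steps, then put each step in front of a human reviewer. The
# llm_call() stub is an assumption standing in for any chat-completion API.

def llm_call(prompt: str) -> str:
    """Stand-in for a real LLM API call; returns a canned multistep answer."""
    return ("Step 1: 17 * 24 = 17 * 20 + 17 * 4\n"
            "Step 2: 17 * 20 = 340\n"
            "Step 3: 17 * 4 = 68\n"
            "Step 4: 340 + 68 = 408")

def explain_and_review(question: str) -> list[tuple[str, bool]]:
    """Request step-by-step reasoning, then record a human verdict per step."""
    prompt = f"{question}\nExplain your reasoning, one numbered step per line."
    answer = llm_call(prompt)
    reviewed = []
    for step in answer.splitlines():
        verdict = input(f"Valid? [y/n] {step} ").strip().lower() == "y"
        reviewed.append((step, verdict))
    return reviewed

for step, ok in explain_and_review("What is 17 * 24?"):
    print("OK  " if ok else "FAIL", step)
```

In practice the human verdicts would be logged and fed back into evaluation, but the validation step Nanda describes is essentially this loop.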
What’s Up With ‘Superintelligence’?
With the launch of the Sutskever startup, the term ‘superintelligence’, like ‘superalignment’ used as one word in AI discourse, is attracting curiosity. “We know what AGI means, but no one can meaningfully describe what ‘Safe Superintelligence’ actually means,” stated Chirag Mehta, an analyst with Constellation Research. He called the direction of the company “unclear” and suggested, “It would be worth watching who they hire, who they raise money from, and who they might work with as their design partners. That would reveal more details beyond a lofty mission statement.”
He sees Safe Superintelligence as adopting a mission close to the original one of OpenAI, which on its founding in 2015 described its mission as “building safe and beneficial artificial general intelligence for the benefit of humanity.” Headquartered in Palo Alto and Tel Aviv, the new company is in a position to define the “next generation landscape” of enterprise AI.
Jan Leike, a lead safety researcher at OpenAI who resigned along with Sutskever in May, announced within weeks that he was joining the AI startup Anthropic, which is positioning itself as a competitor to OpenAI. The timing was as follows: Sutskever resigned on May 14; Leike resigned on May 15; days later, OpenAI announced that the superalignment group co-led by Leike would be dissolved.
“I’m excited to join @AnthropicAI to continue the superalignment mission,” Leike stated on X, according to an account from CNBC. “My new team will work on scalable oversight, weak-to-strong generalization, and automated alignment research.”
Backers of Anthropic include Amazon, which has committed up to $4 billion in funding. When he stepped away from OpenAI, Leike stated, “Stepping away from this job has been one of the hardest things I have ever done, because we urgently need to figure out how to steer and control AI systems much smarter than us.”
OpenAI is backed with major funding from Microsoft. On the day Leike announced his move to Anthropic, OpenAI announced that it had created a new safety and security committee led by senior executives, including CEO Sam Altman.
Anthropic was founded in 2021 by ex-OpenAI executives Dario Amodei and Daniela Amodei. The company announced Claude, its alternative to ChatGPT, in March. The company has received funding from Google, Salesforce and Zoom in addition to Amazon.
In another posting on X at the time, Leike stated, “I joined because I thought OpenAI would be the best place in the world to do this research. However, I have been disagreeing with OpenAI leadership about the company's core priorities for quite some time, until we finally reached a breaking point.”
Move to For-Profit Causing More Churn at OpenAI
The breaking point appears to be related to the effort by Altman and other OpenAI executives to transform the organization into a for-profit company, with its core business restructured as a for-profit benefit corporation that will no longer be controlled by its nonprofit board, according to an account from Reuters.
OpenAI was founded by 13 people in 2015, with a mission to create artificial general intelligence; only three of them remain after the latest departures, according to a September 25 account in The New York Times.
The transition to a for-profit company was closely followed by the departure of Mira Murati, the chief technology officer. Murati had not announced any specific career plans as of this writing. On departing OpenAI, where she had worked for seven years, she stated, “I am stepping away to create time and space for my own exploration.” She had previously worked for the augmented reality company Ultraleap (then Leap Motion) and at Tesla.
While some executives are exiting OpenAI, many more are moving in. Over the past nine months, OpenAI has more than doubled in size to over 1,700 employees, reported The New York Times. The same account stated that OpenAI is generating $3 billion in revenue but spending some $7 billion.
“We can now say goodbye to the original version of OpenAI that wanted to be unconstrained by financial obligations,” stated Jeffrey Wu, who joined the company in 2018 and worked on early models including GPT-2 and GPT-3, quoted in a recent account in Vox.
Sarah Kreps, director of the Tech Policy Institute at Cornell University, stated, “Restructuring around a core for-profit entity formalizes what outsiders have known for some time: that OpenAI is seeking to profit in an industry that has received an enormous influx of investment in the last few years.” She noted the shift is a departure from OpenAI’s founding emphasis on safety and transparency.
OpenAI remains in pursuit of artificial general intelligence, believed to be the point where AI running on computers can match the power of the human brain. It turns out that a clause in the contract between OpenAI and Microsoft states that if OpenAI builds AGI, Microsoft loses access to OpenAI’s technologies, according to an October 17 account in The New York Times.
The contract states that the OpenAI board could decide when AGI has arrived. Given that, Microsoft appears to be hedging its bets. In March, Microsoft paid $650 million to hire staff from Inflection, an OpenAI competitor, according to the recent Times account. Inflection’s former CEO and cofounder, Mustafa Suleyman, oversees a new group at Microsoft working to build AI technologies for consumers based on OpenAI software. He is working on a long-term effort to build technology that could replace what the company is getting from OpenAI, the Times reported.
Read the source articles and information in the blog of DataCamp, from Constellation Research, from CNBC, from Reuters, from a September 25 account in The New York Times, and from an October 17 account in The New York Times.