
He quit OpenAI, branding it reckless. He hopes outside pressure can now force change.

Jyoti Mann   

  • William Saunders quit OpenAI in February. He thinks it can't responsibly handle safety risks.
  • He's one of 13 people who signed a letter calling for principles to let AI workers raise concerns.

A former OpenAI employee has spoken out about what led him to quit in February and later sign a letter calling for change at AI companies.

William Saunders told Business Insider that concerns he raised while working at OpenAI were "not adequately addressed." He's now calling for changes to allow employees to speak up without fear of retaliation.

Saunders believes the AI models being built by firms like OpenAI could have catastrophic consequences in the future.

He told BI: "I am probably most concerned personally about things that could lead to a large-scale disaster or if things go really off the rails, like potentially human extinction."

In his view, if AI systems become smarter than humans, it raises a host of questions about the potential impact on society.

That's especially true if "these systems can then improve themselves and build a new generation that's even smarter," Saunders said. "There's that, or the possibility of systems being used as a very smart assistant to lead to build things like nuclear, biological, or chemical weapons if they're jailbroken."

He said there's a "good chance" of mitigating risks if firms slowly and carefully try to predict risks and address them before and after deployment.

However, Saunders added: "I just don't think the behavior of OpenAI is responsible in managing these kinds of risks."

Saunders, who worked at OpenAI for about three years, is one of 13 signatories of a letter published Tuesday demanding action to hold AI firms accountable.

Saunders said the firing of a former employee who had voiced concerns about safety and security, along with OpenAI's requirement that departing staff sign "extraordinary" non-disparagement agreements, led to the creation of the four principles set out in the letter.

Leopold Aschenbrenner, who worked on OpenAI's Superalignment team, opened up about why he claims he was fired in an interview with podcaster Dwarkesh Patel this week. It comes after The Information reported in April that he was fired for leaking information.

'Egregiously insufficient'

According to Aschenbrenner, OpenAI told employees that he was fired over sharing a document containing safety ideas with external researchers. But Aschenbrenner claims his dismissal was also related to informing the board that he'd received a warning from HR after sharing a safety memo with the directors.

That memo related to OpenAI's security, which Aschenbrenner said he felt was "egregiously insufficient."

Saunders told BI he also raised concerns but that OpenAI's response was "not adequate."

Saunders, a researcher and engineer, led a team of four working on interpretability and language models. After resigning, he said he was given seven days to sign a non-disparagement agreement and was discouraged from speaking to a lawyer first.

'Minimize and downplay'

Following a Vox report, CEO Sam Altman said OpenAI would not withhold departing employees' vested equity if they did not sign such agreements.

Saunders said Altman's response was a "microcosm" of how Altman and OpenAI leadership communicate — to "minimize and downplay the severity of what happened."

He maintains that OpenAI hasn't yet clarified whether it will prevent former employees who have criticized the company from being able to sell equity.

After reports on the non-disparagement agreements emerged, and several OpenAI employees quit over shared concerns about its approach to safety, the company announced last week that it had formed a safety and security committee.

OpenAI didn't respond to a request for comment from Business Insider.
