500 chatbots read the news and discussed it on social media. Guess how that went.
Then the 500 robots logged into something very much (but not totally) like Twitter, and discussed what they had read. Meanwhile, in our world, the not-simulated world, a bunch of scientists were watching.
The scientists had used GPT-3.5, the large language model behind ChatGPT, to build the bots for a very specific purpose: to study how to create a better social network — a less polarized, less caustic bath of assholery than our current platforms. They had created a model of a social network in a lab — a Twitter in a bottle, as it were — in the hopes of learning how to create a better Twitter in the real world. "Is there a way to promote interaction across the partisan divide without driving toxicity and incivility?" wondered Petter Törnberg, the computer scientist who led the experiment.
It's difficult to model something like Twitter — or to do any kind of science, really — using actual humans. People are hard to wrangle, and the setup costs for human experimentation are considerable. AI bots, on the other hand, will do whatever you tell them to, practically for free. And their whole deal is that they are designed to act like people. So researchers are starting to use chatbots as fake people from whom they can extract data about real people.
"If you want to model public discourse or interaction, you need more sophisticated models of human behavior," says Törnberg, an assistant professor at the Institute for Logic, Language, and Computation at the University of Amsterdam. "And then large language models come along, and they're precisely that — a model of a person having a conversation." By replacing people as the subjects in scientific experiments, AI could conceivably turbocharge our understanding of human behavior in a wide range of fields, from public health and epidemiology to economics and sociology. Artificial intelligence, it turns out, might offer us real intelligence about ourselves.
Törnberg's work could accelerate all that. His team created hundreds of personas for its Twitter bots — telling each one things like "you are a male, middle-income, evangelical Protestant who loves Republicans, Donald Trump, the NRA, and Christian fundamentalists." The bots even got assigned favorite football teams. Repeat those backstory assignments 499 times, varying the personas based on the vast American National Election Studies survey of political attitudes, demographics, and social-media behavior, and presto: You have an instant user base.
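The mechanics of that persona assembly are straightforward to imagine. Here's a minimal sketch — not Törnberg's actual code — where the survey rows, field names, and prompt template are all invented stand-ins for the ANES-derived attributes the article describes:

```python
import random

# Illustrative attribute rows standing in for ANES survey respondents.
SURVEY = [
    {"gender": "male", "income": "middle", "religion": "evangelical Protestant",
     "party": "Republican", "team": "Dallas Cowboys"},
    {"gender": "female", "income": "high", "religion": "secular",
     "party": "Democratic", "team": "Green Bay Packers"},
]

# Hypothetical prompt template for a bot's backstory.
TEMPLATE = ("You are a {gender}, {income}-income, {religion} "
            "who strongly supports the {party} Party. "
            "Your favorite football team is the {team}.")

def make_persona(row):
    """Turn one survey row into a system prompt for a single bot."""
    return TEMPLATE.format(**row)

def build_user_base(survey, n=500, seed=0):
    """Sample n personas (with replacement) from the survey rows."""
    rng = random.Random(seed)
    return [make_persona(rng.choice(survey)) for _ in range(n)]
```

Each string then becomes the standing instruction a chatbot carries into every conversation — the bot equivalent of a biography.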
Then the team came up with three variations of how a Twitter-like platform decides which posts to feature. The first model was essentially an echo chamber: The bots were inserted into networks populated primarily by bots that shared their assigned beliefs. The second model was a classic "discover" feed: It was designed to show the bots posts liked by the greatest number of other bots, regardless of their political beliefs. The third model was the focus of the experiment: Using a "bridging algorithm," it would show the bots posts that got the most "likes" from bots of the opposite political party. So a Democratic bot would see what the Republican bots liked, and vice versa. Likes across the aisle, as it were.
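Stripped of the machine-learning plumbing, the three feed variants reduce to three ranking rules. This sketch assumes a made-up post format (an author's party plus a list of who liked it, by party) and is an illustration of the logic, not the study's implementation:

```python
def echo_feed(posts, viewer_party):
    """Echo chamber: surface only posts from the viewer's own party."""
    return [p for p in posts if p["author_party"] == viewer_party]

def discover_feed(posts):
    """Discover: rank by total likes, regardless of who liked them."""
    return sorted(posts, key=lambda p: len(p["liked_by"]), reverse=True)

def bridging_feed(posts, viewer_party):
    """Bridging: rank by likes from the party opposite the viewer's."""
    other = "Democratic" if viewer_party == "Republican" else "Republican"
    def cross_party_likes(post):
        # Count how many of the post's likers belong to the other party.
        return sum(1 for liker in post["liked_by"] if liker == other)
    return sorted(posts, key=cross_party_likes, reverse=True)
```

The interesting property of the bridging rule is that it still optimizes for engagement — it just measures engagement from across the aisle.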
All the bots were fed headlines and summaries from the news of July 1, 2020. Then they were turned loose to experience the three Twitter-esque models, while the researchers stood by with their clipboards and took notes on how they behaved.
[Caption: Actual bot exchanges from the experiment "Simulating Social Media"]

The Echo Chamber Twitter was predictably pleasant; all the bots agreed with one another. Seldom was heard a discouraging word — or any words, really. There was very low toxicity, but also very few comments or likes on posts from bots with an opposing political affiliation. Everyone was nice because no one was engaging with anything they disagreed with.
The Discover Twitter was, also predictably, a good simulation of the hell that is other people. It was just like being on Twitter. "Emma, you just don't get it, do you?" one bot wrote. "Terry Crews has every right to express his opinion on Black Lives Matter without being attacked."
The Bridging Twitter seemed to be the answer. It promoted lots of interaction, but not too hot, not too cold. There were actually more cross-party comments on posts than comments from users of the same political affiliation. All the bots manifested happiness at learning, say, that country music was becoming more open to LGBTQ+ inclusion. Finding common ground led to more ground becoming common.
"At least in the simulation, we get this positive outcome," Törnberg says. "You get positive interaction that crosses the partisan divide." That suggests it might be possible to build a social network that drives deep engagement — and thus profits — without letting users spew abuse at each other. "If people are interacting on an issue that cuts across the partisan divide, where 50% of the people you agree with vote for a different party than you do, that reduces polarization," Törnberg says. "Your partisan identity is not being activated."
So: problem solved! No more shouting and name-calling and public shaming on social media! All we need to do is copy the algorithm that Törnberg used, right?

Well, maybe. But before we start copying what a bunch of AI bots did in a Twitter bottle, scientists need to know whether those bots behave more or less the way people would in the same situation. AI tends to invent facts and mindlessly regurgitate the syntax and grammar it ingests from its training data. If the bots do that in an experiment, the results won't be useful.
"This is the key question," Törnberg says. "We're developing a new method and a new approach that is qualitatively different than how we've studied systems before. How do we validate it?"
He has some ideas. An open-source large language model with transparent training data, designed expressly for research, would help. That way scientists would know when the bots were just parroting what they had been taught. Törnberg also theorizes that you could give a population of bots all the information that some group of humans had in, say, 2015. Then, if you spun the time-machine dials five years forward, you could check to see whether the bots react to 2020 the way we all did.
Early signs are positive. LLMs trained with specific sociodemographic and identity profiles display what Lisa Argyle, a political scientist at Brigham Young University, calls "algorithmic fidelity" — given a survey question, they will answer in almost the same way as the human groups on which they were modeled. And since language encodes a lot of real-world knowledge, LLMs can infer spatial and temporal relationships not explicitly laid out in the training texts. One researcher found that they could also interpret "latent social information such as economic laws, decision-making heuristics, and common social preferences," which makes them plenty smart enough to study economics. (Which might say more about the relative intelligence of economists than it does about LLMs, but whatever.)
The most intriguing potential for using AI bots to replace human subjects in scientific research lies in Smallville, a "SimCity"-like village — homes, shops, parks, a café — populated by 25 bots. Like Törnberg's social networkers, they all have personalities and sociodemographic characteristics defined by language prompts. And in a page taken from the gaming world, many of the Smallville residents have what you might call desires: programmed goals and objectives. But Joon Sung Park, the Stanford University computer scientist who created Smallville, has gone even further. Upon his bitmapped creations, he has bestowed something that other LLMs do not possess: memory.
"If you think about how humans behave, we maintain something very consistent and coherent about ourselves, in this time and in this world," Park says. "That's not something a language model can provide." So Park has given his "generative models" access to databases he has filled with accounts of things they've supposedly seen and done. The bots know how recent each event was, and how relevant it is to their preloaded goals and personalities. In a person, we'd call that long-term and short-term memory.
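That combination — how recent an event was, plus how relevant it is to what the bot cares about right now — can be sketched as a simple scoring function. This is a toy illustration, not Park's system: the decay constant is arbitrary, and keyword overlap stands in for the kind of semantic similarity a real implementation would compute:

```python
def recency_score(age_hours, decay=0.995):
    """Exponential decay: recent memories score near 1.0, old ones near 0."""
    return decay ** age_hours

def relevance_score(memory_text, query):
    """Crude keyword overlap, standing in for semantic similarity."""
    mem_words = set(memory_text.lower().split())
    query_words = set(query.lower().split())
    return len(mem_words & query_words) / max(len(query_words), 1)

def retrieve(memories, query, k=3):
    """Return the k stored memories with the highest combined score.

    Each memory is a dict: {"text": ..., "age_hours": ...}.
    """
    def score(m):
        return recency_score(m["age_hours"]) + relevance_score(m["text"], query)
    return sorted(memories, key=score, reverse=True)[:k]
```

Whatever the retrieval function returns gets pasted into the bot's next prompt — which is how a model with no persistent state can appear to remember a crush, or a party invitation.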
For the past five months, Park has been working on how to deploy his bots for social-science research. Like Törnberg, he's not sure yet how to validate them. But they already behave in shockingly realistic ways. The bots can formulate plans and execute them. They remember their relationships with one another, and how those relationships have changed over time. The owner of Smallville's café threw a Valentine's Day party, and one of the bots invited another bot it was supposed to have a crush on.
Things get clunky in Smallville when the bots try (and fail) to remember more and more things. (Relatable!) But Smallvillians do display some emergent properties. "While deciding where to have lunch, many initially chose the café," Park's team found. "However, as some agents learned about a nearby bar, they opted to go there instead." (So relatable!)
The more the bots act like us, the more we can learn about ourselves by experimenting on them. And therein lies another problem. The ethics of toying with these digital simulacra in a laboratory is unmapped territory. They'll be built from our written memories, our photographs, our digital exhaust, maybe even our medical and financial records. "The mess is going to get even messier the more sophisticated the model gets," Törnberg says. "By using social-media data and building predictions on that, we could potentially ask the model very personal things that you wouldn't want to share. And while it's not known how accurate the answers will be, it's possible they could be quite predictive." In other words, a bot based on your data could infer your actual, real secrets — but would have no reason to keep them secret.
But if that's true, do researchers have financial or ethical obligations to the person on whom their model is based? Does that person need to consent to have their bot participate in a study? Does the bot?
This isn't hypothetical. Park has trained one of his Smallville bots with all his personal data and memories. "The agent would basically behave as I would," Park says. "Scientifically, I think it's interesting." Philosophically and ethically, it's a potential minefield.
In the long run, the future of scientific research may hinge on how such issues are resolved. Törnberg has some ideas for improving the fidelity of his sims to reality. His Twitter simulation was only six hours long; maybe letting it run for months, or even years, would show how polarization evolves over time. Or he could use more detailed survey data to build more humanlike bots, and make the model respond more dynamically to what the bots click on and engage with.
The problem with adding more detail is that it goes against the entire point of a model. Scientists create experiments to be simpler than reality, to offer explanatory power uncomplicated by the messiness of real life. By replacing humans with AI replicants, Törnberg may have unintentionally solved an even bigger societal conundrum. If artificial intelligence can post on social media with all the sound and fury of real humans, maybe the future really doesn't need us real humans anymore — and we can finally, at long last, log off.
Adam Rogers is a senior correspondent at Insider.