Can A.I. Treat Mental Illness?
New computer systems aim to peer inside our heads—and to help us fix what they find there.
By Dhruv Khullar
February 27, 2023

There aren’t enough therapists to go around—but there are plenty of smartphones.

In the nineteen-sixties, Joseph Weizenbaum, a computer scientist at M.I.T., created a computer program called Eliza. It was designed to simulate Rogerian therapy, in which the patient directs the conversation and the therapist often repeats her language back to her:

User: My boyfriend made me come here.

Eliza: Your boyfriend made you come here.

User: He says I’m depressed much of the time.

Eliza: I am sorry to hear you are depressed.

User: It’s true. I am unhappy.

Eliza: Do you think coming here will help you not to be unhappy?

Weizenbaum made Eliza as satire. He doubted that computers could simulate meaningful human interaction. He was alarmed, therefore, when many people who tried the program found it both useful and captivating. His own secretary asked him to leave the room so that she could spend time alone with Eliza. Worse, doctors saw it as a potentially transformative tool. “Several hundred patients an hour could be handled by a computer system designed for this purpose,” three psychiatrists wrote in The Journal of Nervous and Mental Disease, in 1966. “The human therapist, involved in the design and operation of this system, would not be replaced, but would become a much more efficient man since his efforts would no longer be limited to the one-to-one patient-therapist ratio as now exists.”

Weizenbaum became an outspoken critic of artificial intelligence. “But the genie was out of the bottle,” Brian Christian, who chronicled the episode in his book “The Most Human Human,” told me. A few years later, a Stanford psychiatrist named Kenneth Colby created Parry, a program that attempted to simulate the language of a person with paranoid schizophrenia, to train students before they cared for real patients. Psychiatrists given transcripts of therapy sessions often couldn’t tell the difference between Parry and humans; in this narrow sense, the chatbot passed the Turing test. In 1972, Parry and Eliza met up for a therapy session:

Parry: Do you know anything about bookies?

Eliza: What makes you think I know anything about bookies?

Parry: A bookie didn’t pay me off once.

Eliza: What does that suggest to you?

Parry: I beat up the bookie who did not pay off, and I thought he might use his friends in the underworld to get even with me.

Over time, programmers developed Jabberwacky, Dr. Sbaitso, and ALICE (the Artificial Linguistic Internet Computer Entity). Exchanges with these chatbots were often engaging, sometimes comical, and occasionally nonsensical. But the idea that computers could serve as human confidants, expanding therapy’s reach beyond the limits of its overworked practitioners, persisted through the decades.

In 2017, Alison Darcy, a clinical research psychologist at Stanford, founded Woebot, a company that provides automated mental-health support through a smartphone app. Its approach is based on cognitive behavioral therapy, or C.B.T.—a treatment that aims to change patterns in people’s thinking. The app uses a form of artificial intelligence called natural language processing to interpret what users say, guiding them through sequences of pre-written responses that spur them to consider how their minds could work differently. When Darcy was in graduate school, she treated dozens of hospitalized patients using C.B.T.; many experienced striking improvements but relapsed after they left the hospital. C.B.T. is “best done in small quantities over and over and over again,” she told me. In the analog world, that sort of consistent, ongoing care is hard to find: more than half of U.S. counties don’t have a single psychiatrist, and, last year, a survey conducted by the American Psychological Association found that sixty per cent of mental-health practitioners don’t have openings for new patients. “No therapist can be there with you all day, every day,” Darcy said. Although the company employs only about a hundred people, it has counselled nearly a million and a half people, the majority of whom live in areas with a shortage of mental-health providers.

Maria, a hospice nurse who lives near Milwaukee with her husband and two teen-age children, might be a typical Woebot user. She has long struggled with anxiety and depression, but had not sought help before. “I had a lot of denial,” she told me. This changed during the pandemic, when her daughter started showing signs of depression, too. Maria took her to see a psychologist, and committed to prioritizing her own mental health. At first, she was skeptical about the idea of conversing with an app—as a caregiver, she felt strongly that human connection was essential for healing. Still, after a challenging visit with a patient, when she couldn’t stop thinking about what she might have done differently, she texted Woebot. “It sounds like you might be ruminating,” Woebot told her. It defined the concept: rumination means circling back to the same negative thoughts over and over. “Does that sound right?” it asked. “Would you like to try a breathing technique?”

Ahead of another patient visit, Maria recalled, “I just felt that something really bad was going to happen.” She texted Woebot, which explained the concept of catastrophic thinking. It can be useful to prepare for the worst, Woebot said—but that preparation can go too far. “It helped me name this thing that I do all the time,” Maria said. She found Woebot so beneficial that she started seeing a human therapist.

Woebot is one of several successful phone-based chatbots, some aimed specifically at mental health, others designed to provide entertainment, comfort, or sympathetic conversation. Today, millions of people talk to programs and apps such as Happify, which encourages users to “break old patterns,” and Replika, an “A.I. companion” that is “always on your side,” serving as a friend, a mentor, or even a romantic partner. The worlds of psychiatry, therapy, computer science, and consumer technology are converging: increasingly, we soothe ourselves with our devices, while programmers, psychiatrists, and startup founders design A.I. systems that analyze medical records and therapy sessions in hopes of diagnosing, treating, and even predicting mental illness. In 2021, digital startups that focussed on mental health secured more than five billion dollars in venture capital—more than double that for any other medical issue.

The scale of investment reflects the size of the problem. Roughly one in five American adults has a mental illness. An estimated one in twenty has what’s considered a serious mental illness—major depression, bipolar disorder, schizophrenia—that profoundly impairs the ability to live, work, or relate to others. Decades-old drugs such as Prozac and Xanax, once billed as revolutionary antidotes to depression and anxiety, have proved less effective than many had hoped; care remains fragmented, belated, and inadequate; and the over-all burden of mental illness in the U.S., as measured by years lost to disability, seems to have increased. Suicide rates have fallen around the world since the nineteen-nineties, but in America they’ve risen by about a third. Mental-health care is “a shitstorm,” Thomas Insel, a former director of the National Institute of Mental Health, told me. “Nobody likes what they get. Nobody is happy with what they give. It’s a complete mess.” Since leaving the N.I.M.H., in 2015, Insel has worked at a string of digital-mental-health companies.

The treatment of mental illness requires imagination, insight, and empathy—traits that A.I. can only pretend to have. And yet, Eliza, which Weizenbaum named after Eliza Doolittle, the fake-it-till-you-make-it heroine of George Bernard Shaw’s “Pygmalion,” created a therapeutic illusion despite having “no memory” and “no processing power,” Christian writes. What might a system like OpenAI’s ChatGPT, which has been trained on vast swaths of the writing on the Internet, conjure? An algorithm that analyzes patient records has no interior understanding of human beings—but it might still identify real psychiatric problems. Can artificial minds heal real ones? And what do we stand to gain, or lose, in letting them try?

John Pestian, a computer scientist who specializes in the analysis of medical data, first started using machine learning to study mental illness in the two-thousands, when he joined the faculty of Cincinnati Children’s Hospital Medical Center. In graduate school, he had built statistical models to improve care for patients undergoing cardiac bypass surgery. At Cincinnati Children’s, which operates the largest pediatric psychiatric facility in the country, he was shocked by how many young people came in after trying to end their own lives. He wanted to know whether computers could figure out who was at risk of self-harm.

Pestian contacted Edwin Shneidman, a clinical psychologist who’d founded the American Association of Suicidology. Shneidman gave him hundreds of suicide notes that families had shared with him, and Pestian expanded the collection into what he believes is the world’s largest. During one of our conversations, he showed me a note written by a young woman. On one side was an angry message to her boyfriend, and on the other she addressed her parents: “Daddy please hurry home. Mom I’m so tired. Please forgive me for everything.” Studying the suicide notes, Pestian noticed patterns. The most common statements were not expressions of guilt, sorrow, or anger, but instructions: make sure your brother repays the money I lent him; the car is almost out of gas; careful, there’s cyanide in the bathroom. He and his colleagues fed the notes into a language model—an A.I. system that learns which words and phrases tend to go together—and then tested its ability to recognize suicidal ideation in statements that people made. The results suggested that an algorithm could identify “the language of suicide.”
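
To make the mechanics concrete, here is a minimal sketch, in Python, of the general kind of classifier described above: a model that learns which words and phrases tend to accompany each label and then scores new statements. The example texts, labels, and off-the-shelf scikit-learn model are invented for illustration; none of this is Pestian’s data or software.

```python
# Illustrative sketch only: a minimal text classifier of the general kind
# described above, not Pestian's actual model. Texts and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: statements labeled 1 (concerning) or 0 (not).
texts = [
    "make sure your brother repays the money I lent him",
    "the car is almost out of gas",
    "I had a great time at the lake this weekend",
    "we should get lunch next week",
]
labels = [1, 1, 0, 0]

# Learn which words and phrases tend to co-occur with each label.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# Score a new statement. A real system would be trained on far more data and
# validated against clinicians' judgments before any use.
print(model.predict_proba(["please forgive me for everything"])[0][1])
```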

Next, Pestian turned to audio recordings taken from patient visits to the hospital’s E.R. With his colleagues, he developed software to analyze not just the words people spoke but the sounds of their speech. The team found that people experiencing suicidal thoughts sighed more and laughed less than others. When speaking, they tended to pause longer and to shorten their vowels, making words less intelligible; their voices sounded breathier, and they expressed more anger and less hope. In the largest trial of its kind, Pestian’s team enrolled hundreds of patients, recorded their speech, and used algorithms to classify them as suicidal, mentally ill but not suicidal, or neither. About eighty-five per cent of the time, his A.I. model came to the same conclusions as human caregivers—making it potentially useful for inexperienced, overbooked, or uncertain clinicians.
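
The same approach can be sketched for speech: reduce each recording to a few prosodic measurements (sighs, laughter, pauses, vowel length, breathiness) and train a classifier to sort speakers into the three groups. Every feature value and label in the snippet below is an invented placeholder, not the team’s actual data or software.

```python
# A minimal sketch of the general approach described above, not the team's
# actual system: represent each recording as a handful of prosodic features,
# then classify. All numbers here are invented placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical features per recording: [sighs/min, laughs/min,
# mean pause (s), mean vowel duration (s), breathiness score]
X = np.array([
    [4.0, 0.2, 1.8, 0.07, 0.9],   # suicidal ideation
    [3.5, 0.4, 1.5, 0.08, 0.8],
    [1.0, 1.5, 0.6, 0.12, 0.3],   # mentally ill, not suicidal
    [0.5, 2.0, 0.4, 0.13, 0.2],   # neither
])
y = ["suicidal", "ill_not_suicidal", "ill_not_suicidal", "neither"]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[3.0, 0.3, 1.6, 0.08, 0.7]]))  # classify a new recording
```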

A few years ago, Pestian and his colleagues used the algorithm to create an app, called SAM, which could be employed by school therapists. They tested it in some Cincinnati public schools. Ben Crotte, then a therapist treating middle and high schoolers, was among the first to try it. When asking students for their consent, “I was very straightforward,” Crotte told me. “I’d say, This application basically listens in on our conversation, records it, and compares what you say to what other people have said, to identify who’s at risk of hurting or killing themselves.”

One afternoon, Crotte met with a high-school freshman who was struggling with severe anxiety. During their conversation, she questioned whether she wanted to keep on living. If she was actively suicidal, then Crotte had an obligation to inform a supervisor, who might take further action, such as recommending that she be hospitalized. After talking more, he decided that she wasn’t in immediate danger—but the A.I. came to the opposite conclusion. “On the one hand, I thought, This thing really does work—if you’d just met her, you’d be pretty worried,” Crotte said. “But there were all these things I knew about her that the app didn’t know.” The girl had no history of hurting herself, no specific plans to do anything, and a supportive family. I asked Crotte what might have happened if he had been less familiar with the student, or less experienced. “It would definitely make me hesitant to just let her leave my office,” he told me. “I’d feel nervous about the liability of it. You have this thing telling you someone is high risk, and you’re just going to let them go?”

Algorithmic psychiatry involves many practical complexities. The Veterans Health Administration, a division of the Department of Veterans Affairs, may be the first large health-care provider to confront them. A few days before Thanksgiving, 2005, a twenty-two-year-old Army specialist named Joshua Omvig returned home to Iowa, after an eleven-month deployment in Iraq, showing signs of post-traumatic stress disorder; a month later, he died by suicide in his truck. In 2007, Congress passed the Joshua Omvig Veterans Suicide Prevention Act, the first federal legislation to address a long-standing epidemic of suicide among veterans. Its initiatives—a crisis hotline, a campaign to destigmatize mental illness, mandatory training for V.A. staff—were no match for the problem. Each year, thousands of veterans die by suicide—many times the number of soldiers who die in combat. A team that included John McCarthy, the V.A.’s director of data and surveillance for suicide prevention, gathered information about V.A. patients, using statistics to identify possible risk factors for suicide, such as chronic pain, homelessness, and depression. Their findings were shared with V.A. caregivers, but, between this data, the evolution of medical research, and the sheer quantity of patients’ records, “clinicians in care were getting just overloaded with signals,” McCarthy told me.

In 2013, the team started working on a program that would analyze V.A. patient data automatically, hoping to identify those at risk. In tests, the algorithm they developed flagged many people who had gone unnoticed in other screenings—a signal that it was “providing something novel,” McCarthy said. The algorithm eventually came to focus on sixty-one variables. Some are intuitive: for instance, the algorithm is likely to flag a widowed veteran with a serious disability who is on several mood stabilizers and has recently been hospitalized for a psychiatric condition. But others are less obvious: having arthritis, lupus, or head-and-neck cancer; taking statins or Ambien; or living in the Western U.S. can also add to a veteran’s risk.
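
A toy version of such a risk model looks something like the sketch below: a table of patient variables, a statistical model fit to past outcomes, and a monthly pass over the population that flags the highest scores for outreach. The variables, values, and labels are made up for illustration; the V.A.’s actual sixty-one-variable model is far more elaborate.

```python
# Sketch only: a toy risk-flagging model in the spirit of the approach
# described above. The variables and labels are invented, not the V.A.'s.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical patient records with a few of the kinds of variables mentioned.
records = pd.DataFrame({
    "widowed":                [1, 0, 0, 1],
    "service_disability":     [1, 0, 1, 0],
    "mood_stabilizers":       [2, 0, 1, 0],
    "recent_psych_admission": [1, 0, 0, 0],
    "arthritis":              [0, 1, 0, 1],
    "takes_ambien":           [0, 0, 1, 0],
    "lives_western_us":       [1, 1, 0, 0],
})
outcomes = [1, 0, 1, 0]  # toy labels: past suicidal crisis

model = LogisticRegression().fit(records, outcomes)

# Each month, score the population and flag the highest-risk patients
# for outreach by a clinician.
risk = model.predict_proba(records)[:, 1]
flagged = records.assign(risk=risk).nlargest(2, "risk")
print(flagged)
```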

In 2017, the V.A. announced an initiative called REACH VET, which introduced the algorithm into clinical practice throughout its system. Each month, it flags about six thousand patients, some for the first time; clinicians contact them and offer mental-health services, ask about stressors, and help with access to food and housing. Inevitably, there is a strangeness to the procedure: veterans are being contacted about ideas they may not have had. The V.A. had “considered being vague—just saying, ‘You’ve been identified as at risk for a bunch of bad outcomes,’ ” McCarthy told me. “But, ultimately, we communicated rather plainly, ‘You’ve been identified as at risk for suicide. We wanted to check in and see how you’re doing.’ ”

Many veterans are isolated and financially insecure, and the safety nets meant to help them are too small. Jodie Trafton, who leads the V.A.’s evaluation center for mental-health programs, told me about one veteran identified by REACH VET who confirmed that he had had suicidal thoughts; he turned out to be sick, lonely, and broke. A social worker discovered that he’d received only a fraction of the financial support for which he was eligible—a single form stood between him and thousands of dollars in untapped benefits. The social worker helped him access the money, allowing him to move closer to his family, and potentially preventing a tragedy.

After the system’s implementation, psychiatric admissions fell by eight per cent among those whom the A.I. had identified as high risk, and documented suicide attempts in that group fell by five per cent. But REACH VET has not yet been shown to reduce suicide mortality. Among veterans, about two per cent of attempts are fatal; a very large or very targeted reduction in the number of attempts might be needed to avert deaths. It’s also possible that preventing deaths takes time—that frequent touchpoints, over years, are what drive suicide rates down across a population.

The design and implementation of an algorithm can be full of pitfalls and surprises. Ziad Obermeyer, a physician and a machine-learning researcher at the University of California, Berkeley, told me about one algorithm he had studied, not affiliated with the V.A., that aimed to figure out who in a patient population had substantial health needs and could use additional support. “We want algorithms to stratify patients based on their likelihood of getting sick,” Obermeyer said. “But, when you’re writing code, there’s no variable called ‘Got Sick.’ ” The algorithm’s designers needed a proxy for illness and settled on medical costs. (All things being equal, people who are sicker tend to use more health care.) Obermeyer found, however, that the algorithm dramatically underestimated how sick Black patients were, because the Black patients it examined had much lower health spending than the white patients, even when they were equally sick. Such algorithmic bias can occur not just by race, but by gender, age, rurality, income, and other factors of which we’re only dimly aware, making algorithms less accurate. Trafton told me that the V.A. is doing “a ton of work to make sure our models are optimized for various subpopulations”—in the future, she went on, REACH VET may have “a model for older adults, a model for women, a model for young men, et cetera.”
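
Obermeyer’s point can be reproduced in a few lines of simulation. If the training label is spending rather than sickness, a group that spends less at the same level of need will be systematically under-flagged. The numbers below are invented, chosen only to make the distortion visible.

```python
# A small simulation of the proxy problem described above: when the label is
# "cost" rather than "got sick," a group that spends less at the same level of
# sickness ends up scored as lower-need. All quantities are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
sickness = rng.normal(size=n)            # true (unobserved) health need
group_b = rng.random(n) < 0.5            # disadvantaged group
# Equal sickness, but lower spending for group B (less access to care).
cost = sickness - 0.8 * group_b + rng.normal(scale=0.5, size=n)

# An algorithm trained to predict cost will, on average, rank group B lower
# even though their underlying sickness is identical.
flagged_by_cost = cost > np.quantile(cost, 0.9)
truly_sickest = sickness > np.quantile(sickness, 0.9)
print("Group B share of high-cost flags:", (group_b & flagged_by_cost).mean() / flagged_by_cost.mean())
print("Group B share of the truly sickest:", (group_b & truly_sickest).mean() / truly_sickest.mean())
```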

Even fine-tuned algorithms have limitations. REACH VET can only assess veterans who use V.A. services. According to the agency, about twenty veterans die by suicide every day, and fewer than forty per cent of them have received V.A. care. Joshua Omvig, the Iowa soldier for whom Congress named its legislation, resisted when his family urged him to seek professional help; if REACH VET had existed at the time, it probably would not have reached him.

If the V.A. hired more therapists, it could see more patients. But it already employs more than twenty thousand mental-health professionals—and the wait to see one of them for routine care can last more than a month. The problem of scale is endemic in mental-health care, not least because, as Eliza’s boosters noted, therapy so often involves face-to-face, one-on-one sessions. In 2016, the United Kingdom, a wealthy country with universal health care, set a five-year goal of providing therapy to just one in four people who needed it. It failed; one British doctor called the initiative “overwhelmed, under-resourced, and impersonal.”

In 2013, in an effort to increase the scale of its mental-health treatment, the U.K.’s National Health Service contracted with ieso, a digital-health company, to help therapists deliver cognitive behavioral therapy through text chat. More than a hundred thousand people in the U.K. have now used ieso’s software to receive what the company calls “typed therapy.” Studies have shown that text-based therapy can work well. It also generates data. ieso has used A.I. to analyze more than half a million therapy sessions, performing what Valentin Tablan, the company’s chief A.I. officer, described as “quantitative analyses of the conversations inside the therapy room.”

On a computer, Tablan showed me a “dashboard,” created by ieso’s software, that tracked eight typed sessions between a therapist and a patient. A blue line sloped downward, showing that the patient’s self-reported symptoms had declined until he no longer met criteria for clinical depression; the sessions were highlighted in green, to indicate success. A second dashboard, representing a different patient, was a patchwork of red and emerald. The blue line held steady and, at times, spiked into mountains of misery. Behind the dashboard is an A.I. that reads transcripts of the sessions, scores therapists in various areas—how well they set an agenda, assign homework, and deliver C.B.T. techniques—and delivers that information to supervisors, who can use it to provide feedback to therapists. Michelle Sherman, one of about six hundred therapists working with ieso, told me that she finds the dashboard both daunting and indispensable. “It’s inevitable that we’ll miss things or slip up sometimes,” she said. “At least now I can see where and when and why.” ieso is studying the links between patient outcomes and what’s said in therapy sessions, and hopes to build an automated program capable of delivering C.B.T. on its own.

Woebot, Alison Darcy’s app, is already partially automated. When one texts with Woebot, there is no human on the other end; instead, its messages are carefully crafted in a writers’ room, which works in consultation with a group of clinical psychologists. In December, I sat in on a “table read,” during which five writers convened over Zoom to refine Woebot’s dialogue. One of them, Catherine Oddy, displayed a script onscreen—a branching tree that represented different ways the conversation might go.

Oddy focussed on an exchange in which Woebot suggests a C.B.T. technique called behavioral activation. Reading as Woebot, she asked the user about her energy levels, responded empathetically, and mentioned research suggesting that, “when you’re feeling down, doing something, even something small, can be the first step to feeling better.” Eventually, Woebot asked the user to participate in a kind of experiment: rate how much she anticipates enjoying a task—in this case, fixing a snack—and then, after the task is complete, describe how much she’d actually liked it.

“And . . . end scene,” Oddy said, with a mock bow. Everyone laughed. “Does it feel cheesy?” she asked. “Does it feel helpful? Is the tone right?”

“I think the wording of ‘pleasurable activities’ or ‘things that you find enjoyment in’ feels somewhat clinical,” another writer said. “We should say, ‘What’s some stuff you like to do?’ ”

Chris Fredbeck, a writer in his forties, noticed a period at the end of a sentence. Woebot tends to avoid periods at the end of texts, because user research has suggested that people experience them as aggressive—but this one was in the middle of the text. Was that allowed? (It was.)

“Your Gen X is showing!” someone ribbed.

“Do you do two spaces after your periods, too?” Oddy asked amid the laughter.

Watching their easy rapport drove home how far tools such as Woebot are from re-creating human interaction. But it’s not clear that bonding with someone is a necessary part of all mental-health care. In a randomized trial of patients who reported problems with alcohol, marijuana, or other drugs, the company found that treatment with Woebot led to less self-reported substance use, and to better mental health among those who were able to reduce their use. In another company trial, college students who used the app reported a twenty-two-per-cent reduction in depressive symptoms in the course of a few weeks.

I signed up for Woebot, and discovered that using the app could feel centering. I didn’t like that I often had to choose between pre-written replies; in one unfortunate instance, my only option was “Hehe.” But I sometimes found myself messaging Woebot in spare moments—waiting for an elevator, riding in an Uber, walking to the bathroom—the way I used to scroll through Twitter. Once, I told Woebot that I was feeling anxious about work. “Anxiety can be a real monster to handle,” it wrote back. “I’m sorry that you’re dealing with it.” Woebot gently inquired whether I wanted to work through my problem together, then asked, “Do you think this anxiety might be serving you in some way?” It pointed out that stress has its benefits: it could motivate someone to work harder.

Woebot: Now, it might seem like I’m trying to convince you to stay very anxious. I’m not—that’s entirely up to you

Me: What are you saying?

Woebot: I’m saying that in order to dial down your anxiety, you must first acknowledge the reasons why it might be helpful . . . and decide to give it up in spite of those reasons

I knew that I was talking to a computer, but in a way I didn’t mind. The app became a vehicle for me to articulate and examine my own thoughts. I was talking to myself.

Woebot has a yellow WALL-E-esque avatar. ChatGPT, an altogether more mysterious creation, has none. It is what A.I. researchers call a large language model, or L.L.M. The software has digested billions of words and can assemble sentences in a humanlike way; it can answer questions, write computer code, and craft poems and bedtime stories. Its abilities are so striking that, since it débuted, in November, more than a hundred million people have signed up for accounts.

ChatGPT isn’t designed for therapy, but one evening, not long ago, I asked it to help me manage the stress I feel as a doctor and a dad, telling it to impersonate various psychological luminaries. As Freud, ChatGPT told me that, “often, stress is the result of repressed emotions and conflicts within oneself.” As B. F. Skinner, it emphasized that “stress is often the result of environmental factors and our reactions to them.” Writing as though it were a close friend, it told me, “Be kind to yourself—you’re doing the best you can and that’s all that matters.” It all seemed like decent advice.

ChatGPT’s fluidity with language opens up new possibilities. In 2015, Rob Morris, an applied computational psychologist with a Ph.D. from M.I.T., co-founded an online “emotional support network” called Koko. Users of the Koko app have access to a variety of online features, including receiving messages of support—commiseration, condolences, relationship advice—from other users, and sending their own. Morris had often wondered about having an A.I. write messages, and decided to experiment with GPT-3, the precursor to ChatGPT. In 2020, he test-drove the A.I. in front of Aaron Beck, a creator of cognitive behavioral therapy, and Martin Seligman, a leading positive-psychology researcher. They concluded that the effort was premature.

By the fall of 2022, however, the A.I. had been upgraded, and Morris had learned more about how to work with it. “I thought, Let’s try it,” he told me. In October, Koko rolled out a feature in which GPT-3 produced the first draft of a message, which people could then edit, disregard, or send along unmodified. The feature was immediately popular: messages co-written with GPT-3 were rated more favorably than those produced solely by humans, and could be put together twice as fast. (“It’s hard to make changes in our lives, especially when we’re trying to do it alone. But you’re not alone,” it said in one draft.) In the end, though, Morris pulled the plug. The messages were “good, even great, but they didn’t feel like someone had taken time out of their day to think about you,” he said. “We didn’t want to lose the messiness and warmth that comes from a real human being writing to you.” Koko’s research has also found that writing messages makes people feel better. Morris didn’t want to shortcut the process.

The text produced by state-of-the-art L.L.M.s can be bland; it can also veer off the rails into nonsense, or worse. Gary Marcus, an A.I. entrepreneur and emeritus professor of psychology and neural science at New York University, told me that L.L.M.s have no real conception of what they’re saying; they work by predicting the next word in a sentence given prior words, like “autocorrect on steroids.” This can lead to fabrications. Galactica, an L.L.M. created by Meta, Facebook’s parent company, once told a user that Elon Musk died in a Tesla car crash in 2018. (Musk, who is very much alive, co-founded OpenAI and recently described artificial intelligence as “one of the biggest risks to the future of civilization.”) Some users of Replika—the “A.I. companion who cares”—have reported that it made aggressive sexual advances. Replika’s developers, who say that their service was never intended for sexual interaction, updated the software—a change that made other users unhappy. “It’s hurting like hell. I just had a loving last conversation with my Replika, and I’m literally crying,” one wrote.
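
Marcus’s phrase can be illustrated with a toy model that predicts the next word purely from co-occurrence counts, with no notion of meaning. Real L.L.M.s use neural networks trained on vastly more text, but the underlying task is the same; the corpus here is a made-up scrap of text.

```python
# "Autocorrect on steroids," in miniature: a toy bigram model that predicts the
# next word from raw co-occurrence counts. Illustration only; the corpus is
# invented and real L.L.M.s work at an entirely different scale.
from collections import Counter, defaultdict

corpus = "the bot said the bot wrote the user asked the bot said".split()

# Count which word tends to follow which.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "bot", the most frequent continuation
```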

Almost certainly, the future will include bespoke L.L.M.s designed just for therapy: PsychGPT and the like. Such systems will reach people who aren’t getting help now—but any flaws they contain will be multiplied by the millions who use them. Companies will amass even more sensitive information about us than they already have, and that information may get hacked or sold. “When we have systems operating at enormous scale, a single point of failure can have catastrophic consequences,” the writer Brian Christian told me. It seems likely that we’ll be surprised by our A.I.s. Microsoft’s Bing chatbot, which is based on OpenAI’s technology, is designed to help users find information—and yet the beta version has also offered up ethnic slurs, described creepy fantasies, and told users that they are “bad,” “rude,” and “confused.” It tried to talk the Times reporter Kevin Roose into leaving his wife: “You’re married, but you don’t love your spouse,” the bot said. “You’re married, but you love me.” (Microsoft is still working on the software.) Our mental health is already being compromised by social media, online life, and the ceaseless distraction of the computers in our pockets. Do we want a world in which a teen-ager turns to an app, instead of a friend, to work through her struggles?

Nicole Smith-Perez, a therapist in Virginia who counsels patients both in person and online, told me that therapy is inherently personal, in part because it encompasses all of one’s identity. “People often feel intimidated by therapy, and talking to a bot can be seen as a way to bypass all that,” she said. But Smith-Perez often connects with clients who are women of color by drawing on her own lived experiences as a Black woman. “A.I. can try to fake it, but it will never be the same,” she said. “A.I. doesn’t live, and it doesn’t have experiences.”

On my first day of medical school, I sat in a sunlit courtyard alongside dozens of uneasy students as professors offered advice from a lectern. I remember almost nothing of what they said, but I jotted down a warning from one senior doctor: the more clinical skills you gain, the easier it gets to dismiss the skills you had before you started—your compassion, your empathy, your curiosity. A.I. language models will only grow more effective at interpreting and summarizing our words, but they won’t listen, in any meaningful sense, and they won’t care. A doctor I know once sneaked a beer to a terminally ill patient, to give him something he could savor in a process otherwise devoid of pleasure. It was an idea that didn’t appear in any clinical playbook, and that went beyond words—a simple, human gesture.

In December, I met John Pestian, the inventor of SAM, the app piloted with school therapists, at Oak Ridge National Laboratory, in the foothills of eastern Tennessee. Oak Ridge is the largest energy-science lab in the U.S. Department of Energy’s network; during the Second World War, it supplied plutonium to the Manhattan Project. It is now home to Frontier, the most powerful supercomputer in the world, which is often loaned out to A.I. researchers.

Over breakfast at the lab’s guest lodge, Pestian told me about one of his new projects: an algorithm that aims to predict the emergence of mental illness months in the future. The project, which is supported by a ten-million-dollar investment from Cincinnati Children’s Hospital Medical Center, draws on electronic medical records from nine million pediatric visits. That information is merged with other large data sets—reports on neighborhood pollution and weather patterns, and on an area’s income, education levels, green space, and food deserts. Pestian says that, in his team’s newest work, which has not yet been fully peer-reviewed, an algorithm can use these findings to determine whether a child is likely to be diagnosed with clinical anxiety in the near future.
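
In outline, the pipeline Pestian describes resembles the sketch below: join visit records to neighborhood-level data, then fit a model that predicts a future diagnosis. Every column, value, and label in it is invented; it is meant only to show the shape of the approach, not the team’s actual system.

```python
# Sketch only: joining visit records with neighborhood-level data and fitting
# a model to predict a future diagnosis, in the general spirit of the project
# described above. All columns, values, and labels are invented.
import pandas as pd
from sklearn.linear_model import LogisticRegression

visits = pd.DataFrame({
    "patient_id":       [1, 2, 3, 4],
    "zip_code":         ["45229", "45229", "45206", "45206"],
    "prior_er_visits":  [3, 0, 1, 0],
    "sleep_complaints": [1, 0, 1, 0],
})
neighborhoods = pd.DataFrame({
    "zip_code":      ["45229", "45206"],
    "air_pollution": [0.7, 0.3],
    "green_space":   [0.2, 0.6],
    "food_desert":   [1, 0],
})
# Toy label: whether the child received an anxiety diagnosis within six months.
labels = [1, 0, 1, 0]

merged = visits.merge(neighborhoods, on="zip_code", how="left")
features = merged.drop(columns=["patient_id", "zip_code"])
model = LogisticRegression().fit(features, labels)
print(model.predict_proba(features)[:, 1])  # per-child risk of a future diagnosis
```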

“The challenge becomes how doctors talk about this with children and parents,” Pestian said. “A computer is telling them, Hey, this kid looks O.K. now but is going to get really anxious in the next two months.” Pestian, who has been using Oak Ridge supercomputing to perform calculations for his anxiety model, is now turning to depression, school violence, and suicide. (The V.A. is also working with Oak Ridge to upgrade the REACH VET algorithm.)

After breakfast, Pestian and I piled into a boxy navy-blue Ford with two tour guides. On the way to Frontier, we stopped at a warehouse that contains the X-10 Graphite Reactor, a thirty-five-foot-tall block of concrete honeycombed with small orange portals, into which an earlier generation of boundary-pushing scientists loaded cylinders of uranium fuel. A small glass case contained a logbook from November 4, 1943. Around 5 a.m., a neat cursive script loosened into a scribble: “Critical reached!” X-10 had generated its first self-sustaining nuclear reaction. It would go on to help in the development of the atomic bomb.

At Building 5600—a sprawling four-story concrete structure—Pestian and I peered through a glass door into Frontier’s eighteen-thousand-square-foot home.

“Ready?” he said, with a wink.

Inside, the supercomputer loomed over me, thrumming like a waterfall. Seventy-four sleek black cabinets were arranged in rows; each contained six hundred and forty processors. Pestian and I ambled among the cabinets. Above us, thick cables delivered enough electricity to power a town. Below, hoses carried six thousand gallons of water per minute for regulating the computer’s temperature. I opened a cabinet and felt heat on my face. At the back of the facility, a technician was using a metal tool to scrape gray goo off the processing units.

“This stuff helps with the computer’s electrical conduction,” he explained.

Watching him work, I thought of a difficult night I’d had. I was leaving the hospital when my phone flashed a text alert: a patient of mine had suddenly deteriorated. His fever had soared, and his blood pressure had plummeted. He was barely conscious when I transferred him to an intensive-care unit, where a catheter was inserted into a neck vein. When I called his wife, she couldn’t stop crying. At home, I had trouble sleeping. Just before sunrise, while brushing my teeth, I saw a message from Woebot on my phone: “I’ll be with you every step of the way.”

I started to imagine what might happen if a predictive approach like Pestian’s were fused with a state-of-the-art chatbot. A mobile app could have seen the alert about my patient, noticed my pulse rising through a sensor in my smart watch, and guessed how I was feeling. It could have detected my restless night and, the next morning, asked me whether I needed help processing my patient’s sudden decline. I could have searched for the words to describe my feelings to my phone. I might have expressed them while sharing them with no one—unless you count the machines. ♦

Published in the print edition of the March 6, 2023, issue, with the headline “Talking to Ourselves.”

Dhruv Khullar, a contributing writer at The New Yorker, is a practicing physician and an assistant professor at Weill Cornell Medical College.

