The next time you get a blood test, X-ray, mammogram, or colonoscopy, there’s a good chance an artificial intelligence (AI) algorithm will interpret the results even before your doctor has seen them.

Over the course of just a few years, AI has spread rapidly into hospitals and clinics around the world. More than 1,000 health-related AI tools have been authorized for use by the U.S. Food and Drug Administration (FDA), and more than 2 in 3 physicians say they use AI to some degree, according to a recent survey by the American Medical Association.

The potential is extraordinary. AI—particularly in the form of AI agents that can reason, adapt, and act on their own—can lighten doctors’ workloads by drafting patient notes and chart summaries, support precision medicine through more targeted therapies, and flag subtle abnormalities in scans and slides that a human eye might miss. It can speed the discovery of drugs and drug targets through new processes, such as the AI-driven protein structure prediction and design that led to last year’s Nobel Prize in Chemistry. AI can give patients faster, more personalized support by scheduling appointments, answering questions, and flagging side effects. It can help match candidates to clinical trials and monitor health data in real time, alerting clinicians and patients early to prevent complications and improve outcomes.

But the promise of AI in medicine will only be realized if it is built and used responsibly. Today’s AI algorithms are powerful tools that recognize patterns, make predictions, and even make decisions. But they are not infallible, all-knowing oracles. Nor are they on the verge of matching human intelligence, despite what some evangelists of so-called artificial general intelligence suggest. A handful of recent studies reflect the possibilities but also the pitfalls, showing how medical AI tools can misdiagnose patients and how doctors’ own skills can weaken with AI.

A team at Duke University (including one of us) tested an FDA-cleared AI tool meant to detect swelling and microbleeds in the brain MRIs of patients with Alzheimer’s disease. The tool improved expert radiologists’ ability to find these subtle spots on an MRI, but it also raised false alarms, often mistaking harmless blurs for something dangerous. We concluded that the tool is helpful, but that radiologists should do a careful read of the MRI first and then use the tool as a second opinion—not the other way around.

These kinds of findings are not confined to the tool we looked at. Few hospitals independently assess the AI tools they use. Many assume that just because a tool has been cleared by the FDA, it will work in their local setting, which is not necessarily true. AI tools perform differently in different patient populations, and each has unique weaknesses. That’s why it’s essential for health systems to do their due diligence: quality-check any AI tool before implementation to ensure it will work in their local setting, and then educate clinicians on its use. In addition, both AI algorithms and the ways humans interact with them change over time, prompting former FDA commissioner Robert Califf to urge constant post-market monitoring of medical AI tools to ensure they remain reliable and safe in the real world.

In another recent study, gastroenterologists in Europe were given a new AI-assisted system for spotting polyps during colonoscopies.
Using the tool, they initially found more polyps—tiny growths that can turn into cancer—suggesting the AI was helping them spot areas they might otherwise have missed. But when the doctors returned to performing colonoscopies without the AI system, they detected fewer pre-cancerous polyps than they had before using the AI. Although it’s not clear exactly why, the study’s authors believe clinicians may have become so reliant on the AI that, in its absence, they were less focused and less able to spot these polyps. This phenomenon of “deskilling” is supported by another study, which showed that overreliance on computerized aids can make the human gaze less likely to scan peripheral visual fields. The very tool meant to sharpen medical practice had perhaps blunted it.

AI, if used uncritically, can not only propagate wrong information but also erode our very ability to fact-check it. It’s the Google Maps effect: drivers who once navigated by memory now often lack basic geographic awareness because they’re used to blindly following the voice in their car. Earlier this year, a researcher surveyed more than 600 people across diverse age groups and educational backgrounds and found that the more someone used AI tools, the weaker their critical-thinking abilities. This is known as “cognitive off-loading,” and we are only starting to understand how it relates to clinicians’ use of AI.

All of this underscores that AI in medicine, as in every field, works best when it augments the work of humans. The future of medicine isn’t about replacing health care providers with algorithms—it’s about designing tools that sharpen human judgment and amplify what we can accomplish. Doctors and other providers must be able to gauge when AI is wrong, and they must maintain the ability to work without AI tools when necessary. The way to make this happen is to build medical AI tools responsibly.

We need tools built on a different paradigm—ones that nudge providers to look again, to weigh alternatives, and to stay actively engaged. This approach is known as Intelligent Choice Architecture (ICA). With ICA, AI systems are designed to support judgment rather than supplant it. Instead of declaring “here is a bleed,” an ICA tool might highlight an area and prompt, “check this region carefully.” ICA augments the skills medicine depends on: clinical reasoning, critical thinking, and human judgment.

Apollo Hospitals, India’s largest private health system, recently began using an ICA tool to guide doctors in preventing heart attacks. A previous AI tool had provided a single heart-attack risk score for each patient. The new system gives each patient a more personalized breakdown of what that score means and what contributed to it, so the patient knows which risk factors to address. It’s an example of the kind of gentle nudging that can help doctors succeed at their jobs without overriding their autonomy.

There is a temptation to oversell AI as if it has all the answers. In medicine, we must temper these expectations to save lives. We must train medical students to work both with and without AI tools and to treat AI as a second opinion or an assistant rather than an expert with all the right answers. The future is humans and AI agents working together.

We’ve added tools to medicine before without weakening clinicians’ skills. The stethoscope amplifies the ear without replacing it.
Blood tests provide new diagnostic information without eliminating the need for a medical history or a physical exam. We should hold AI to the same standard. If a new product makes doctors less observant or less decisive, it’s not ready for prime time, or it’s being used the wrong way.

For any new medical AI, we should ask whether it makes the clinician more thoughtful or less. Does it encourage a second look, or does it invite a rubber stamp? If we commit to designing only those systems that sharpen rather than replace our abilities, we’ll get the best of both worlds, combining the extraordinary promise of AI with the critical thinking, compassion, and real-world judgment that only humans can bring.