
The Future of Medicine and AI — Professor Otsuka on Medical Consultation, Diagnosis, and the Risks of AI in Healthcare

2026-01-21 · Hamamoto

AI is rapidly advancing into healthcare — and in some cases, generative AI models are already outscoring medical students in health consultation accuracy and empathy. This article examines the real-world data, risks, and future implications of AI in clinical consultation, diagnosis, and medical education.


This is Hamamoto from TIMEWELL Inc.

AI Is Reshaping Healthcare — But Not Without Risks

In recent years, AI has evolved rapidly and begun making its transformative presence felt in medicine. In the realm of medical consultation, there are now documented cases of AI demonstrating higher accuracy and greater empathy than medical students or general practitioners — presenting both new possibilities and serious challenges for patients and healthcare professionals alike.

In a conversation with Professor Otsuka, a leading physician, he shared data showing that generative AI tools like ChatGPT have scored higher than medical students in medical judgment assessments. The discussion also covered real-world dermatology experiments and actual medical consultation examples — including an evaluation of 5,000 virtual health consultations in which AI scored an average of 67.2% and received "good answer" ratings from patients. In blind comparisons of physician responses versus AI responses, patients rated the AI as markedly superior in empathy, clarity, and politeness. This reveals the distinctive strengths that AI — trained on data from around the world — brings to both medical consultation and diagnosis.

At the same time, this rapid AI advancement carries real risks. One case discussed involved a patient who followed AI-guided health advice and developed a toxic reaction requiring hospitalization — an illustration of the dangers of AI hallucination producing dangerously incorrect guidance. There is also growing concern about "deskilling" — the risk that healthcare professionals who become overly reliant on AI may see their own diagnostic capabilities decline.

This article draws on a conversation with Professor Otsuka to examine AI's current state in healthcare, the accuracy of AI in medical consultation, its diagnostic potential, its risks, and its implications for the future of medical education.


AI in Medical Consultation: Comparison with Human Physicians

AI technology is bringing significant change to the medical consultation landscape. According to Professor Otsuka, current generative AI achieved an average score of 67.2% in an evaluation experiment using 5,000 virtual health consultation scenarios — near the threshold of practical utility for general consultation, though still short of specialist-level requirements. In some cases, AI responses scored higher than physician responses in clarity and empathy.

Behind these results lies AI's core strength: learning from enormous datasets and instantly referencing medical cases from around the world. While a physician relies on limited personal experience and memory, AI can draw on countless documented cases and statistical data to provide optimal responses. The conversation also touched on published data from authoritative internal medicine journals, which frames the broader debate about the safety and reliability of AI-powered medical consultation. Even physicians acknowledge that "over 80% accuracy is needed for specialized domains," while recognizing that around 70% accuracy may be acceptable for the kind of general internal medicine seen at community clinics.
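As a purely hypothetical illustration — the function, the threshold table, and the per-case scores below are assumptions for this sketch, not data from the conversation — the accuracy thresholds described above could be checked like this:

```python
# Hypothetical sketch: comparing mean consultation accuracy against
# domain-specific acceptability thresholds mentioned in the article.
# The numbers and structure here are illustrative assumptions.

DOMAIN_THRESHOLDS = {
    "general_internal_medicine": 0.70,  # community-clinic level
    "specialist": 0.80,                 # specialized domains
}

def meets_threshold(case_scores, domain):
    """Return (mean accuracy, acceptable?) for a list of 0-1 case scores."""
    mean_acc = sum(case_scores) / len(case_scores)
    return mean_acc, mean_acc >= DOMAIN_THRESHOLDS[domain]

# With made-up scores averaging 67.2% over 5,000 cases, the general
# internal medicine bar of 70% is narrowly missed:
mean_acc, ok = meets_threshold([0.672] * 5000, "general_internal_medicine")
print(f"mean accuracy {mean_acc:.1%}, acceptable: {ok}")
```

The point of the sketch is simply that a single headline number like 67.2% only becomes meaningful relative to a domain-specific bar — and the bar differs between community practice and specialist care.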

Evaluation criteria for medical consultation also extend beyond factual accuracy to include the quality and empathy of responses. When patients receive medical information, they assess not only the clinical accuracy but also whether the response makes them feel understood and reassured. The data showed that patients, evaluating AI responses blind, frequently rated them as "very good" — particularly for empathy. This stands in contrast to physicians who, managing high patient volumes, often struggle to maintain consistent quality due to fatigue and time pressure. AI's ability to deliver consistent quality regardless of circumstances was highlighted as a key advantage.

The risks are equally real. One case discussed involved a patient following AI health consultation advice, ingesting an unfamiliar substance ("sodium iodate") instead of ordinary salt, and developing toxic symptoms requiring hospitalization. This illustrates the risk of AI hallucination — the generation of plausible-sounding but incorrect information — in high-stakes medical contexts. Safety in AI medical consultation is therefore a critical challenge: the same empathetic presentation that builds trust can mask dangerous errors.

The evaluation data was multidimensional — covering not just accuracy rates but the quality of actual patient responses and scores from physicians reviewing AI answers. Statistical graphs in the discussion compared individual physician scores against AI performance, providing concrete evidence of AI's diagnostic potential relative to traditional practitioners.

A key tension also emerged: physician "expertise" based on direct patient observation (facial expressions, physical examination, clinical experience) versus AI's "statistical precision" based on pattern recognition across vast datasets. AI excels at typical presentations but struggles with novel or atypical cases — where the experienced clinician's flexibility remains essential.

Medical consultation also demands emotional attunement — reassurance, psychological support in difficult moments. AI's demonstrated empathy scores highly on this dimension, but in practice, the nuanced emotional care a physician provides in response to a patient's full mental state remains beyond current AI capabilities.

At present, AI's responses may appear convincing at first glance, but healthcare professionals must maintain their own critical judgment — especially for uncertain information. The consensus is that AI should be used as a decision support tool, with final judgment always resting with the physician.
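The "decision support, physician decides" principle can be made concrete with a small sketch. Everything here — the class, the function, and the escalation message — is an illustrative assumption, not a description of any real system discussed in the conversation:

```python
# Hypothetical sketch of "AI as decision support, physician makes the
# final call": an AI-drafted reply never reaches the patient without
# physician review, and uncertain drafts are escalated outright.

from dataclasses import dataclass
from typing import Optional

@dataclass
class AISuggestion:
    answer: str      # AI-drafted response to the patient
    uncertain: bool  # flagged when the model signals low confidence

def finalize_response(suggestion: AISuggestion,
                      physician_approved: bool,
                      physician_edit: Optional[str] = None) -> str:
    """The physician's judgment always overrides the AI draft."""
    if suggestion.uncertain or not physician_approved:
        # Uncertain or rejected drafts never go to the patient as-is.
        return "Escalated for direct physician response."
    return physician_edit if physician_edit is not None else suggestion.answer
```

The design choice worth noting is that the physician's approval is a required input, not an optional override — the AI output alone can never produce a patient-facing answer.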

AI in Clinical Practice: Dermatology Diagnosis Experiments

AI is being applied across clinical practice — from diagnostic support and treatment planning to medical education. In dermatology, research comparing AI diagnoses against evaluations by specialists has drawn particular attention. In one experiment, photos of patient skin lesions were evaluated simultaneously by AI and physicians, with accuracy rates quantified and compared. The resulting data showed AI's average accuracy ranking in the upper range — above many individual physicians — for common dermatological conditions. This reflects AI's strength in pattern recognition across large training datasets.

However, dermatological diagnosis requires more than simple pattern recognition. It demands consideration of each patient's individual background and the subtle nuances of their symptoms. For rare conditions or presentations that deviate from common patterns, AI's tendency to match against trained data can actually lead to misdiagnosis.

As Professor Otsuka noted: "Some papers report AI accuracy in skin diagnosis exceeding specialists — but in actual clinical settings, the individual background and subtle differences in each patient's symptoms matter enormously. A simple numerical comparison is not sufficient."

Beyond static photographs, dermatology diagnosis also requires dynamic information: palpation, the patient's own account of their experience, and observation of symptom progression over time. AI's ability to respond to this non-standardized information is limited, which is why hybrid diagnostic processes involving collaboration between AI and physicians are currently recommended. There is also a concern that AI may be recognizing pre-learned images rather than generalizing — performing well on known cases but failing to adapt to truly novel presentations.

A final, sobering reality: even a 1% error rate in medical AI can include catastrophic mistakes. If a fatally wrong treatment recommendation reaches a patient, the consequences can be life-threatening. This underscores the point that regardless of how powerful AI becomes as a medical tool, human physicians must always review the information it provides and take ultimate responsibility for patient safety. Current AI technology has not eliminated hallucination, and uncritically accepting AI output carries the risk of serious harm.

In response to these risks, the medical community needs to internalize several principles: physicians must treat AI information as reference material while maintaining their own clinical judgment, never accepting AI outputs blindly. Patients must also recognize that however empathetic or compelling an AI response may be, final decisions should always involve a healthcare professional. Medical education must ensure that students and junior physicians still accumulate real clinical experience even as AI tools become more prevalent — and educational programs must be designed specifically to prevent deskilling.

Medical Education and AI Dependency — Looking Toward the Future

As AI permeates healthcare, it introduces serious risks alongside its benefits. The most acute concern Professor Otsuka raises is that patients, acting on incorrect AI health advice, may come to rely on AI in situations where they should be seeing a physician.

One specific case was discussed: a patient followed AI instructions to consume "sodium iodate" — an unfamiliar substance — instead of ordinary salt, developed toxic symptoms, and required hospitalization. This case makes the risks of AI's influence on patients starkly visible. Patients may be swayed by the fluency and apparent empathy of AI's writing, and accept AI information uncritically in situations that genuinely require medical judgment.

AI dependency in healthcare also raises the concern of "deskilling" — or what might be called "never-skilling" — among healthcare professionals themselves. Professor Otsuka, speaking from a medical education perspective, identified this as a potentially serious problem.

The concern is that medical students and residents who rely too heavily on AI during licensing exams and clinical training may fail to develop the clinical reasoning, diagnostic capability, and written communication skills that the profession requires. As AI use increases in areas like paper writing and translation of English-language literature, there is a risk that healthcare professionals' foundational capabilities may atrophy — including diagnostic judgment itself.

Medical education requires practitioners to form their own diagnoses and treatment plans and engage directly with patients. AI can improve efficiency and information accuracy substantially. But the risk is that the clinical judgment and communication skills physicians must develop may go underdeveloped if AI handles too much.

AI adoption in medicine also opens the door to a new model of "shared decision-making" — where patients use AI-provided information to understand their own conditions and work with physicians to choose treatment approaches. This holds promise for better outcomes. But it also means that both physicians and patients must be equipped to correctly interpret and critically evaluate AI information — requiring educational infrastructure for both groups.

Going forward, medical education will need new curricula that account for both the benefits and risks of AI. Current AI systems rely on learned data, so their accuracy is high for well-documented conditions and published literature — but their limitations become apparent with novel presentations. Healthcare professionals must understand these limitations and never place excessive trust in AI results, always applying their own judgment and experience to final clinical decisions.

The medical community holds regular conferences and lectures where the latest developments in AI are shared and both benefits and risks discussed openly. Experts note that AI technology may advance dramatically every few months, meaning current evaluations and risk management approaches can quickly become outdated — making ongoing technical training an urgent priority. The deployment of AI systems must also be accompanied by clear operational guidelines — ethical and practical — to ensure healthcare professionals continue to use AI correctly as a supplementary tool, not a replacement for judgment.

At the same time, AI brings genuine new possibilities for medical education. It can substantially reduce the burden of literature review and data analysis for healthcare professionals, serving as a valuable tool for research and clinical trials. This could free physicians to focus more on patient dialogue and clinical reasoning — improving overall care quality.

The future of healthcare and medical education depends on how AI and physicians coexist and complement each other. This era demands that physicians, educators, and researchers maintain their own skills while maximizing the benefits of AI — a careful balance that must be continuously pursued.

Summary

AI's rapid evolution in healthcare is a critical theme with broad implications for patient consultation, diagnosis, and medical education. The data discussed in Professor Otsuka's conversation shows that AI can match or exceed specialist-level accuracy and empathy in some scenarios — but simultaneously carries real risks: hallucination-driven misinformation and AI dependency leading to physician deskilling.

In medical consultation, AI may provide reassurance and clarity that patients find genuinely helpful — but incorrect AI advice can cause real harm, which is why physician verification must always be the final step. In medical education, AI's efficiency gains must be balanced against the risk of eroding the skills physicians must genuinely develop — requiring AI to be positioned firmly as a supportive tool, with humans retaining ultimate decision-making authority.

Healthcare professionals, patients, and educators must work together to build the systems needed to properly integrate AI's evolution into medical practice. Doing so — while preventing misinformation and skill erosion — is the path to safer, higher-quality care.

Reference: https://www.youtube.com/watch?v=L9H3GrFIsAg

