Is "Seeing AI" the Key to Fixing Siri? Apple's Next-Gen AI Strategy and the Fusion of Camera Technology
On June 9, 2026, the technology industry's attention turns once again to Apple's Worldwide Developers Conference (WWDC). This annual event is known as the stage for the latest software updates — particularly new features in iOS and macOS — but this year carries unusually high expectations and a palpable sense of tension. At the center of it all is, without question, artificial intelligence, and the future of Siri, the voice assistant that has long underpinned Apple's ecosystem. The rapid evolution of generative AI — starting with ChatGPT — has driven AI into a new phase in recent years. Against this backdrop, the entire world is watching with bated breath to see what AI strategy Apple will pursue and how it will reinvent the user experience.

Behind those expectations, however, lies an undeniable reality: persistent frustration with Siri. The criticism that it stumbles on basic questions and lags behind rival AI agents has not let up. Apple itself has acknowledged delays to Siri upgrades, and in an unusual turn of events, the situation escalated to a class action lawsuit alleging false advertising.

In the face of this headwind, how is Apple attempting to reclaim leadership in AI? What will iOS 19 and "Apple Intelligence" look like when revealed at WWDC, and what is the path to Siri's revival? This article digs deep into the core of Apple's AI strategy — drawing on leaked information and reports — with a particular focus on the possibility that "camera technology" holds the key, examined from a business perspective.
- The Urgency of Siri's Reinvention: Hope, Disappointment, and a Class Action Lawsuit — Apple's AI Strategy on Trial
- Why Is the Vision Pro Team the Ace in the Hole for Siri's Rebuild? A Foundation for "Seeing AI"
- Apple's Ambition: How Will "Seeing AI" Change the Future of Devices? — Rumors vs. Reality
- Summary
The Urgency of Siri's Reinvention: Hope, Disappointment, and a Class Action Lawsuit
Within Apple's ecosystem, Siri has long been the centerpiece of the voice interface. It debuted alongside the iPhone, offering a preview of a hands-free, voice-first future — but its reputation has been less than stellar in recent years. While it handles simple commands like "What's the weather today?" or "Set a timer," it frequently breaks down when faced with slightly more complex questions or conversations requiring contextual understanding. Compared to Google Assistant, Amazon Alexa, and more recently AI agents with sophisticated conversational capabilities like ChatGPT, the gap is stark — and "Siri is outdated" is a sentiment that grows louder by the day.
In an attempt to turn this around, Apple hinted at last year's WWDC at a "new Siri" with deeper personalization and stronger cross-app integration. It would be able to handle more advanced tasks — linking message content to calendar data to suggest appointments, or gathering and organizing information across multiple apps. This announcement generated enormous excitement among users who had long waited for Siri to evolve. In particular, this "smarter Siri" was slated to be heavily promoted as a flagship feature of the iPhone 16; reports indicate that commercial footage showcasing the new features had already been produced.
But those expectations quickly unraveled. Apple abruptly acknowledged that the Siri upgrade required additional development time and announced a delay in delivering the promised features to the iPhone 16. This exposed the technical challenges Apple faces in an increasingly competitive AI development race and deeply disappointed the market. User frustration naturally mounted as promised features failed to materialize — and the situation escalated beyond mere disappointment into legal territory.
Last week, a class action lawsuit was filed against Apple in federal district court in San Jose, California. The complaint alleges that, in promoting iPhone 16 sales, Apple engaged in false advertising by touting AI features that were never delivered. The plaintiffs claim that Apple misrepresented unavailable features as accessible, misleading consumers in their purchase decisions, and are seeking damages. The lawsuit demonstrates that Apple's AI development delays are having serious repercussions not just technically, but for corporate credibility and marketing strategy as well.
This series of disruptions appears to have also triggered changes within Apple's organizational structure. According to Bloomberg reports, major leadership changes were made to the Siri development team. The person newly tasked with leading the Siri division is Vice President Mike Rockwell — known as the driving force behind the development and launch of Apple's first spatial computer, the Vision Pro. Additionally, members of his software team who worked on Vision Pro are reportedly joining the Siri rebuild project. This personnel decision might seem puzzling at first glance. Why would the executive behind Vision Pro — a product that has yet to achieve widespread market penetration — be entrusted with one of Apple's most critical challenges? Yet this very decision may be signaling Apple's future AI strategy and, in particular, a shift toward "seeing AI."
Why Is the Vision Pro Team the Ace in the Hole for Siri's Rebuild? A Foundation for "Seeing AI"
Facing the adversity of Siri's lagging improvement — escalating all the way to a class action lawsuit — Apple's chosen ace in the hole for the Siri rebuild is the Vision Pro development team led by Mike Rockwell. Vision Pro has attracted no shortage of "niche product" and "far from mainstream" assessments, given its approximately ¥500,000 price point and still-limited use cases. So why was the leader of such a product entrusted with the future of Siri, which matters to every iPhone user? Hidden in this appointment is an important clue for reading Apple's next-generation AI strategy. The key lies in viewing Vision Pro's essence not as "just a headset" but as "an advanced AI system."
Vision Pro is a device that pushes human-computer interaction beyond the traditional dimensions of keyboard, mouse, and voice input. At its core, it houses no fewer than 12 cameras and numerous sensors that work in concert to precisely capture and process the user's surrounding environment, gaze, and hand movements in real time. Without physical controllers, users can manipulate digital content simply by looking or making finger gestures. This is the foundational technology for realizing "spatial computing" — seamlessly fusing the physical world and digital information — and underlying it all is advanced AI that instantly analyzes and comprehends enormous volumes of visual information.
According to Bloomberg, within Apple the work of the Vision Pro group and related teams is increasingly referred to as "AI products." This is evidence that Apple itself positions Vision Pro not merely as hardware but as a system representing the pinnacle of its AI technology. In other words, Rockwell and his team possess Apple's deepest knowledge of, and most extensive track record in, cutting-edge "computer vision" (AI technology that recognizes and understands the world through cameras) and the interaction design built upon it.
From this perspective, the Vision Pro team's involvement in Siri development looks like an extremely rational decision. Apple may be moving to break through Siri's limitations by abandoning the traditional "voice-dialogue-centric" approach in favor of an AI agent with higher situational awareness that incorporates "visual information." One of the problems with current Siri is that it cannot understand the context or situation the user is in. For example, if a user is looking at a specific object and asks "What is this?", Siri has no way of knowing what it is. But if Siri could "see" what the user is looking at through the device's camera, it could answer that question accurately.
The true essence of the next-generation AI Apple is pursuing — namely "Apple Intelligence" — may not simply be about improving Siri's conversational ability. Rather, the core of its innovation may lie less in how Siri "talks" with us and more in how Siri "sees" us and the world around us. If the technology cultivated with Vision Pro — using cameras and sensors to analyze and understand the real world in real time — were incorporated into Siri, and by extension into everyday devices like the iPhone, Apple Watch, and even AirPods, Apple's AI could take a dramatic leap forward.
At WWDC, specific Siri demonstrations are expected — and how significant a role "camera-based situational awareness" plays in them deserves close attention. With the Vision Pro team's expertise infused into Siri, rather than simply speaking more intelligently, we may see an entirely new AI agent that "sees" and understands the user's situation, providing more proactive and thoughtful support. Apple appears convinced that cameras are the key to next-generation AI, and WWDC may well serve as the opening chapter of that grand plan.
Apple's Ambition: How Will "Seeing AI" Change the Future of Devices? — Rumors vs. Reality
The view that Apple is focusing on "cameras" as its next move in AI strategy is not mere speculation. Connecting the dots of rumors about Apple's unreleased projects that have leaked in recent years, a consistent direction emerges: "creating new user experiences through the fusion of cameras and AI." The Vision Pro team's integration into Siri development is one movement within this larger trend, and Apple's vision of "seeing AI" holds the potential to fundamentally change the way the devices we use every day work.
Specific projects currently rumored, combining AI and camera technology, include:
Apple Watch with a Camera: Prominent Bloomberg reporter Mark Gurman has reported that Apple is exploring ways to add a camera to Apple Watch, evolving it into a more advanced wearable AI agent. Today's Apple Watch is primarily used for health monitoring and notification checking, but with a camera, it could recognize what the user is looking at and their surrounding situation, enabling information provision and action suggestions based on that context. Possibilities include translating foreign language signs in front of you, displaying calorie information for food you're looking at, or checking exercise form and providing coaching — a far more proactive assistant function. A device on the wrist that constantly "sees" the world around you to support the user — a vision that sounds like science fiction but is becoming increasingly real.
AirPods with a Camera Sensor: Reports have also emerged of camera sensors potentially being added to AirPods, Apple's iconic earphones. Again reported by Bloomberg, the intended role appears to be more as an environmental information collection sensor than a photography camera. Analyst Ming-Chi Kuo has pointed out the possibility of it being an infrared camera, with applications including recognizing hand gestures to control music, more precisely tracking the wearer's position and orientation to improve spatial audio experiences, or prompting appropriate Apple Intelligence actions based on surrounding conditions. The aim may be to provide more immersive or situationally appropriate audio and assistant experiences by incorporating visual information in addition to what enters the ears.
Smart Doorbell with Camera: In the smart home space, Apple is also reportedly advancing product development using cameras and AI. Specifically, there are rumors of a smart doorbell featuring facial recognition capabilities — automatically unlocking for registered individuals, identifying visitors and sending notifications. According to Bloomberg, this product may not be released until the end of this year at the earliest, but it also aligns with Apple's broad vision of computers scanning and assessing the real world through cameras to take action.
These individual rumored projects might each appear to be standalone product developments, but running through all of them is a common philosophy: "AI processes visual information obtained through cameras to enhance the user experience." If there are still doubts about Apple's orientation toward "seeing AI," attention should be paid to Apple Maps' ongoing efforts. Apple uses cameras mounted on vehicles and pedestrian backpacks to collect images of roads and landscapes worldwide to improve map service accuracy. This has been underway for some time, but notably — as first reported by 9to5Mac — this month these collected images have started being used for training "Apple Intelligence" as well. Specifically, the collected image data is being used to improve generative AI models including the image generation tool "Image Playground" and the photo editing feature "Cleanup." This clearly demonstrates that Apple is now viewing the enormous visual data from the real world as a valuable resource for strengthening its AI foundations and is actively leveraging it.
Taking all of these moves together, Apple appears to be pursuing not just improved conversational ability for Siri, but the creation of more situationally relevant, more personal, and more intuitive user experiences — by giving various devices cameras as "eyes" and using AI to process the vast visual information obtained through them. At WWDC, fragments of this future vision may be concretely presented as new features in iOS 19 and Apple Intelligence.
Summary
WWDC, now just around the corner on June 9, could be an extremely significant turning point for Apple's AI strategy. Long criticized for failing to meet expectations, Siri is absorbing the expertise of the Vision Pro team and pivoting toward "seeing AI" — the question is what kind of evolution this will produce. And what innovation will that evolution bring to Apple's ecosystem as a whole, starting with iOS 19? Developers and users around the world are watching with enormous anticipation.
The AI feature suite that will likely be deployed under the "Apple Intelligence" brand holds the potential not merely to add features but to fundamentally transform how Apple devices are used. In particular, the ability to recognize surrounding conditions through cameras and provide contextually appropriate support may break through the limitations of traditional voice assistants. Apple Watch understanding what the user is looking at; AirPods recognizing gestures; iPhone performing smarter cross-app integration. If such a future becomes reality, our digital lives will become more seamless and intuitive.
Still, challenges and concerns remain. The biggest is Siri's fundamental conversational ability. No matter how much its "seeing" capability improves, if natural, smooth voice communication with users cannot be achieved, its value is greatly diminished. The current state — in which Siri cannot accurately answer even basic questions — must be fixed first, or expectations for advanced features will ring hollow. At WWDC, the critical question will be whether the personalization and cross-app integration features delayed from last year's announcement can finally be demonstrated in a practical, polished form.
Furthermore, an AI system that constantly uses cameras raises new privacy concerns. How will information about the user's surrounding environment and behavior be collected, processed, and protected? Apple has consistently positioned privacy protection as its highest priority, but for the rollout of "seeing AI," even more transparent explanations — and a system that users can trust — will be indispensable.
Will Apple be able to dispel the long-standing frustration with Siri at WWDC and once again demonstrate leadership in AI? And can it generate consumer demand for new camera-equipped hardware? While holding high expectations for the future iOS 19 and Apple Intelligence will bring, we should assess their feasibility and practicality with a clear eye. Let's watch carefully as the WWDC announcements unfold and see how the market responds. We hope that the future of AI Apple is trying to pioneer through "seeing" will exceed our expectations.
Reference: https://www.youtube.com/watch?v=PUTCZKDi4qk
TIMEWELL AI Adoption Support
TIMEWELL is a professional team supporting business transformation in the AI agent era.
Services
- ZEROCK: High-security AI agent running on domestic servers
- TIMEWELL Base: AI-native event management platform
- WARP: AI utilization talent development program
In 2026, AI is moving from "something you use" to "something you work alongside." Let's think through your AI strategy together.
