What AI Can Do Now: Chrome × Gemini, Manus, and the Latest in Robotics | Voice-Controlled Photo Editing and Robots That Learn by Watching

2026-01-21 Hamamoto

Rapidly evolving AI technologies are changing both daily life and the outlook for entire industries. From the autonomous AI agent Manus to Gemini integration in Chrome, robots that learn human movements from video, and voice-controlled photo editing tools, this article covers the latest use cases.

Hello from TIMEWELL

I'm Hamamoto from TIMEWELL Inc.

AI Technologies Are Rapidly Reshaping Our World

AI technologies are rapidly reshaping our world, changing both daily life and the outlook for entire industries. The autonomous AI agent Manus, Gemini integration in the Chrome browser, a project in which robots learn human movements from video, and voice-controlled photo editing tools are all moving from concept to reality. This article walks through specific examples drawn from several recent news items and demonstrations.

AI that can shop for a sleeping bag, register calendar events automatically, and even change your outfit in a photo is already being demonstrated, and robots that learn from first-person video could reshape the nature of work itself. We break down these trends in plain language. By the end of this article, you should understand terms like "Manus," "autonomous AI," and "Gemini integration," and have a broad picture of where AI is heading.

  • What Changes When Chrome Gets Gemini? The Moment AI Becomes Your Browser's True Companion

  • Robots That "Watch and Learn" | The Impact of First-Person Perspective Learning and Voice-Controlled Image AI

  • Manus Connects Automatically to Gmail and GitHub | The Reality of "Autonomous AI" and NVIDIA's Serious Investment

  • Summary

What Changes When Chrome Gets Gemini? The Moment AI Becomes Your Browser's True Companion

Google has announced Gemini integration for Chrome and is rolling it out gradually, with major implications for the browsing experience. The assistant can draw on the content of open tabs and the current page, interpret what the user is trying to do, and offer support in real time, which conventional browsers could not do. For example, a user who wants to buy a sleeping bag can click the Gemini icon in the upper right of Chrome and chat with the AI right on the page. The assistant compares options such as color and design across online shopping sites and suggests the best candidates, a clear departure from the search experience we have known.

In a demonstration of a sleeping bag purchase, Gemini reviewed the open browser tabs and windows and picked out the stronger options from multiple product images. The rollout is also paired with security improvements: stronger protection against phishing and viruses, and better password management, to help keep users safe.

This New Chrome Feature Does More Than Deliver Search Results

This new Chrome feature does more than deliver search results: it carries out specific actions aligned with user intent. For example, it helps compile shopping lists and compares products automatically, reducing the friction of everyday tasks. Because it operates within the browser itself, some processing is said to be handled locally, which may reduce the risk of information leakage.

A system that draws on users' browsing history, past searches, and calendar schedules to proactively organize and present needed information feels very much at home in the Google ecosystem. In practice, AI automatically assists with email and calendar management — confirming reservation details and drafting outgoing messages — dramatically reducing the manual effort required.

This system received high marks for practicality and convenience from beta testers, who remarked "this is something else." It is currently available to select users primarily in the United States as a cutting-edge trial, but for users in other markets, a future where similar AI assistants are integrated into every aspect of daily life is drawing steadily closer.

Gemini's integration, with its combination of Chrome, AI assistance, and enhanced security, also serves to strengthen Google's ecosystem. It is expected to bring more users into day-to-day collaboration with AI and to accelerate adoption in areas such as search, shopping, and business productivity. Demos already published show automatic management of multiple browser tabs and assistance with external service searches and browser operations, with final actions typically requiring user confirmation.
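
The confirm-before-acting pattern seen in those demos can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ProposedAction`, `run_with_confirmation`, and the example plan are hypothetical names invented here, and nothing below reflects how Chrome or Gemini is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProposedAction:
    """A step the assistant wants to take (hypothetical structure)."""
    description: str
    irreversible: bool  # e.g. placing an order or sending a message


def run_with_confirmation(
    actions: List[ProposedAction],
    confirm: Callable[[ProposedAction], bool],
) -> List[str]:
    """Run actions in order, asking the user before any irreversible one."""
    log = []
    for action in actions:
        if action.irreversible and not confirm(action):
            log.append(f"skipped: {action.description}")
            continue
        # A real assistant would perform the step here; this sketch only logs it.
        log.append(f"done: {action.description}")
    return log


plan = [
    ProposedAction("compare sleeping bags across open tabs", irreversible=False),
    ProposedAction("add the chosen sleeping bag to the cart", irreversible=False),
    ProposedAction("place the order", irreversible=True),
]

# Stand-in for a real prompt: the user declines every irreversible step.
result = run_with_confirmation(plan, confirm=lambda action: False)
```

With this plan, the two comparison and cart steps run on their own, while the order is skipped unless the `confirm` callback returns `True`.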

Key characteristics of Chrome's integrated AI:

  • Real-time web information analysis and conversational capability

  • Enhanced security features (anti-phishing, password management, etc.)

  • Improved productivity through integration with Google Calendar and Gmail

Gemini-powered Chrome thus goes beyond being a mere browsing tool: as an everyday companion, it has the potential to automate and streamline many tasks. The rollout is currently focused on the United States and English-speaking markets, and expansion into other markets including Japan is anticipated as the international rollout progresses. Cross-platform experiences that link devices and applications should let users move through their digital lives more seamlessly.

Furthermore, the arrival of Gemini marks a step toward "conversational search" — combining voice recognition and image analysis — that overturns conventional search engine use. This new search experience not only reduces the burden of information-seeking but also provides a more intuitive interface. The result is a browsing environment accessible to a broad range of generations, from young people to seniors, with the potential to fundamentally transform web use going forward.

In Actual Demo Footage

In actual demo footage, Gemini was seen responding to natural spoken language — reading context, not just individual words — and returning optimal suggestions. These innovative characteristics, combined with an intuitive user experience, are expected to serve as new benchmarks for the digital lifestyles of the future. The AI revolution that starts with the Chrome browser is expected to find applications in an ever-wider range of scenarios while intensifying competition across the industry as a whole.

Robots That "Watch and Learn" | The Impact of First-Person Perspective Learning and Voice-Controlled Image AI

Recent advances in robotics could dramatically change how robots are taught, by enabling them to learn directly from human actions and habits. For example, in Project Go-Big from Figure AI, robots learn directly from first-person video of humans (footage captured while wearing a headset and handling everyday tasks like laundry and dishes) and then imitate those movements. This approach could reduce reliance on conventional manual teaching, and is expected to improve operational efficiency when used alongside supplementary instructions and safety designs.

In this project, a camera and sensors mounted on the subject's head capture first-person video of everyday movements such as folding laundry, setting dishes on a table, and watering plants, and the data is then analyzed by AI. By imitating the movements of skilled humans, robots are expected to carry out tasks smoothly even in environments and on tasks they have not previously encountered.
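
The general idea of learning by imitation can be illustrated with a toy Python sketch. The article does not describe Figure AI's actual training method, so this is not it: the example below is a deliberately simple nearest-neighbor imitation policy, and the feature vectors, action labels, and the `make_imitation_policy` helper are all assumptions made up for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# One demonstration step: (observation features, action the human took).
# The two-number feature vectors are made up for illustration.
Demo = Tuple[Sequence[float], str]


def make_imitation_policy(demos: List[Demo]) -> Callable[[Sequence[float]], str]:
    """Return a policy that copies the action from the closest demonstration."""
    def policy(observation: Sequence[float]) -> str:
        _, action = min(demos, key=lambda demo: math.dist(demo[0], observation))
        return action
    return policy


demos: List[Demo] = [
    ([0.9, 0.1], "grasp towel"),
    ([0.5, 0.5], "fold towel"),
    ([0.1, 0.9], "place towel on stack"),
]

policy = make_imitation_policy(demos)
chosen = policy([0.8, 0.2])  # closest to the first demonstration
```

A real system would replace the hand-made features with representations learned from video and the lookup with a trained model, but the structure is the same: recorded human behavior in, a policy that reproduces it out.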

Genspark's voice-controlled AI photo editor, PhotoGenius, is another example of AI's potential. In a demonstration, a user simply spoke a request about a photo taken on their smartphone, "change this person's hairstyle to an afro," and the AI analyzed the image and applied the change. The subject's hair changed to an afro, and then the outfit was changed to a white suit, blurring the boundary between reality and digital manipulation.

This AI Photo Editor Offers Professional-Level Editing

This AI photo editor makes professional-level photo editing accessible to anyone without specialist software like Photoshop. The ability to accomplish complex editing work that previously required skill and time simply by giving instructions in conversational form is a genuinely innovative experience for users. In this way, the shared evolutionary direction of AI — learning natural movements and operations — can be seen across seemingly different fields like robotics and photo editing, generating great anticipation for the future.

Key elements driving the evolution of robotics and image editing technology include:

  • Natural movement learning through first-person perspective video recording

  • An editing engine that instantly reflects the user's specific instructions

  • Autonomous learning capability that reduces the need for traditional teaching processes

These Technologies Are Expected to Find Applications Across Many Fields

These technologies are expected to find applications across many fields when deployed in real-world settings — including factory automation, home care robots, and professional video editing. In robotics, they are likely to play an increasingly important role in addressing labor shortages and replacing human workers in hazardous environments.

In the actual demonstrations, footage showed a robot in a laboratory autonomously folding laundry and handling kitchen items based on first-person human video. Engineers are developing algorithms that let robots learn subtle nuances that previous teaching methods could not capture, such as fine differences in how hands move and how objects are held. These efforts mark a major turning point, shifting the robot's role from simple repetitive tasks toward sophisticated judgment and flexible responses.

Furthermore, real-time editing via voice control is expected to be useful in everyday scenarios such as checking one's appearance before a party or event. Being able to review the finished result on the spot every time you take a photo allows users to casually simulate how they look and the impression they make. This system, which combines high operability with immediacy, will be valued across a wide range of creative scenarios — from social media posts to promotional image creation.

In this way, robot learning and voice-controlled image editing both exemplify AI's ability to accurately grasp human movements and intent and respond immediately. As these technologies permeate everyday life and become a natural presence that supports our actions, a future where daily tasks are dramatically more efficient is becoming very real. For technologists, marketers, and everyday users alike, this evolution is growing not just into a tool, but into a presence that functions as a new life partner.

Manus Connects Automatically to Gmail and GitHub | The Reality of "Autonomous AI" and NVIDIA's Serious Investment

Among the Many AI Solutions in the Market

Among the many AI solutions on the market, Manus stands out as an autonomous agent. It goes beyond simply processing data: it supports users across their working day, drafting emails via Gmail and registering events in Google Calendar. For example, when a user instructs it to "draft an email to myself for tomorrow morning and add the event to my calendar at the same time," Manus gathers the relevant information and carries out both steps automatically.
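
To make the idea of one instruction fanning out into several tool actions concrete, here is a minimal Python sketch. It is not Manus's implementation and calls no real Gmail or Google Calendar API: the tool names (`gmail.draft`, `calendar.create`), the `ToolCall` structure, and the hand-written plan are all hypothetical, and the "tools" are local stand-in functions.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    tool: str   # hypothetical tool name, e.g. "gmail.draft"
    args: dict


def plan_request(today: date) -> List[ToolCall]:
    """Hand-written plan for the example request. A real agent would
    derive this from the user's sentence, typically with a language model."""
    start = datetime.combine(today + timedelta(days=1), time(9, 0))
    return [
        ToolCall("gmail.draft", {"to": "me", "subject": "Reminder for tomorrow morning"}),
        ToolCall("calendar.create", {"title": "Morning reminder", "start": start.isoformat()}),
    ]


def draft_email(args: dict) -> str:
    return f"draft to {args['to']}: {args['subject']}"


def create_event(args: dict) -> str:
    return f"event '{args['title']}' at {args['start']}"


# Registry mapping tool names to local stand-in functions (no network calls).
TOOLS: Dict[str, Callable[[dict], str]] = {
    "gmail.draft": draft_email,
    "calendar.create": create_event,
}


def execute(calls: List[ToolCall]) -> List[str]:
    """Dispatch each planned call to its registered tool, in order."""
    return [TOOLS[call.tool](call.args) for call in calls]


results = execute(plan_request(date(2026, 1, 21)))
```

Keeping the plan as plain data, separate from the functions that execute it, is one common design choice: the plan can be shown to the user or logged before anything is actually sent.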

This system manages multiple tasks that users previously handled manually, connecting them seamlessly to improve business productivity. Users can complete complex processing through voice input or simple commands, without opening each app on a smartphone or PC, and register things like a packing list before a business trip or important notes for a trip to Singapore directly to their calendar. This has greatly improved convenience for real-world use.

Manus is also expanding its use cases for engineers by integrating with GitHub for repository and code management. Combined with Google Calendar, Gmail, and other cloud services, these integrations are forming a platform that centralizes management of a user's work environment. Beyond simply entering information, users can draw on past data and cross-application links to optimize their workflows.

More recently, industry heavyweights NVIDIA and OpenAI announced a strategic partnership aimed at building next-generation AI infrastructure. Separately, NVIDIA's large-scale investment in longtime rival Intel could shift the balance of power in the market. Together, these moves are expected to advance not only conventional PC products but also AI-dedicated hardware and data centers, hinting at a future where AI is treated as fundamental infrastructure, like water or electricity.

The developments around Manus and these major companies can be summed up in the following key points:

  • Task automation through integration with Gmail and Google Calendar

  • Improved efficiency through integration with engineer-oriented tools like GitHub

  • NVIDIA and OpenAI's strategic investment signals the construction of next-generation AI infrastructure

  • The seamless integration of the entire market ecosystem enables the future digital lifestyle

These efforts point to a future where AI functions not just as a tool but as an ecosystem that supports users' entire lives. Early adopters already manage email drafting, calendar event registration, and GitHub code management from a single smartphone, and give the experience high marks for convenience and efficiency. These systems are also described as emphasizing secure data management in the cloud, with designs intended to minimize the risk of information leakage.

The Major Capital Movements of NVIDIA and OpenAI

The major capital movements by NVIDIA and OpenAI could affect not just the pace of technology development but also the global economy and entire industries. A future where AI infrastructure is as fundamental as water, electricity, and gas looks increasingly real, and governments and businesses around the world are being pushed to respond. In today's digital society, the ecosystem surrounding AI seems almost certain to function as infrastructure, and Manus is attracting attention as a forerunner of that trend.

For users, bringing tasks that were previously managed separately onto a single platform is expected to deliver significant efficiency gains in daily life as well as business. For example, gathering all the information needed for trip preparation in a single operation and having it reflected automatically in calendar and email has broad practical applications. Manus handles complex information processing and task management when the user simply speaks to it, contributing to a less stressful digital life.

Summary

This article has described how the latest AI technologies are poised to transform our lives, drawing on specific demonstrations and real-world examples. The first-person robot learning of Project Go-Big and the voice-controlled photo editing demo both point to new possibilities beyond conventional approaches. The potential of autonomous agents was also shown through examples applied to daily business tasks.

These efforts are not merely technological innovations — they represent a major turning point that will improve efficiency, convenience, and safety across every aspect of our lives. In the future, the penetration of AI is expected to outpace even the spread of smartphones, ushering in an era where every generation and industry benefits from it. The specific use cases and demonstrations from each project serve as a clear guide toward the realization of tomorrow's digital lifestyle, providing major support for users to handle daily tasks more simply and efficiently.

The Technologies We Are Witnessing Will Continue to Evolve

The technologies we are witnessing will continue to evolve and spread into other industries and areas of life. Solutions that combine AI with smart home appliances, self-driving vehicles, and even remote diagnosis in medicine are emerging across many domains. As this new era begins, it is worth watching closely, with high expectations for a safer and more convenient life. It will be important to keep following the work of companies and research institutions and to keep exploring the possibilities AI brings.

As described above, browser integration, robot learning, and Manus's seamless connectivity are becoming major forces reshaping how we live. In the years ahead, AI will go beyond being a mere support tool and become an important partner in building a richer foundation for life. We hope this article serves as a useful guide to understanding the latest technology trends.

Reference: https://www.youtube.com/watch?v=J-d0NC5dW3g
