What kind of study is "AI Apps 50"?

a16z (Andreessen Horowitz) and the fintech company Mercury analyzed the real bank transaction data of more than 200,000 startups from June through August 2025 to identify the top 50 AI-native application companies by spend. It was published on October 2, 2025. What makes it new is that it measures adoption by "money that actually moved" rather than by web traffic.

Who took the number one and number two spots in spending?

OpenAI was first and Anthropic second, with foundation model providers dominating the top of the list. Because application-layer companies (such as Lovable) themselves pay for these APIs to build their own products, part of that spending is effectively counted twice: once when a customer pays the app, and again when the app pays the model provider. Replit, in the coding space, came in third.

How can Japanese companies make use of this ranking?

Roughly 70% of the listed companies are products that an individual can adopt and then spread across a team. A realistic approach is "bottom-up adoption" — trying low-barrier areas such as meeting notes or coding assistance on the ground, confirming the impact, and only then rolling it out. It is also worth noting that budget is concentrating on copilot-style tools that blend into existing work, rather than on full automation.

What should you watch out for when reading the ranking?

Because it is limited to transactions that go through Mercury, it skews toward early-stage U.S. startups. Many of the ARR and valuation figures are self-reported, and there have been cases — 11x, Cluely, Delve — where the credibility of the numbers was called into question. The rankings are a snapshot from June to August 2025, and the situation has kept moving since publication through acquisitions and large funding rounds.

Which AI Are Startups Actually Paying For? Reading Into a16z's "AI Apps 50"

Hello, this is Ryuta Hamamoto from TIMEWELL.

"So which AI are startups actually paying for, in the end?" A study has come out that answers this question not with guesswork or vibes, but with the real money that left their bank accounts. It is "AI Apps 50: Startup Edition," compiled by a16z (Andreessen Horowitz) and the fintech firm Mercury.

Rankings of this kind are usually measured by web visitor counts — who gets searched and opened the most. That has its own value, but it is a different thing from "who is actually signing contracts and paying." This study pulled out the 50 AI services that startups are truly paying for, drawing on the transaction data of more than 200,000 Mercury customers. What I found interesting is that the services that stand out by traffic and the services where wallets are actually opening turned out to be quite far apart.

Let me lead with just three takeaways.

OpenAI is first, Anthropic second. Foundation model providers dominate spending, and Replit, in the coding space, slots in at third.
By count of listed companies, horizontal tools usable across any role make up about 60%, while vertical tools for specific roles make up about 40%. The single largest category was creative tools, with 10 companies (this is a share by number of companies, not by spend).
Two patterns showed up clearly in the numbers: "bottom-up adoption," where individuals adopt a tool and bring it into the workplace, and "the enterprise arrival of vibe coding," where coding agents have settled into production development.

What this study measured, and how

The study was led by a16z's enterprise investing team (partners Olivia Moore, Marc Andrusko, and Seema Amble). It was conducted jointly with Mercury, which provides financial services for startups. Over the three months from June through August 2025, they analyzed the spending of more than 200,000 Mercury customers and identified the top 50 companies at the AI-native application layer. It was published on October 2, 2025.

There are a few assumptions about the methodology worth knowing. The data is limited to transactions that flow through Mercury (ACH, card spend, wire transfers, and so on), so it cannot capture spend on other cards or through expense reimbursement. Companies that mainly sell cloud, GPUs, or infrastructure (such as Azure or CoreWeave) are excluded. Google's spend could not be separated between Cloud and Gemini, so it is reported as a combined figure. This is therefore not a scaled-down picture of "all startups in the world"; it is a cross-section of the wallets of early-stage U.S. startups.

What still gives it value is that it is the first time adoption has been measured by "the amount of money that actually moved." a16z itself describes it as "a real-time signal of where AI is actually working in products and workflows." Olivia Moore points out that the more pleasant a consumer-facing tool is, the more individuals start using it on their own and bring it into the workplace. Seema Amble says that "tools are proliferating, and it is not the case that each category has converged on one or two companies." Indeed, scanning the list, you get the impression that no field has settled on a winner yet.

The 60% horizontal versus 40% vertical split is, again, a ratio of the number of companies listed — not a ratio of total spend. That is easy to misread, so I want to emphasize it. Among vertical tools, the three most popular areas were customer service, sales and GTM, and recruiting and HR.
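To make the difference between a share by company count and a share by spend concrete, here is a minimal sketch in Python. The company names and dollar amounts are hypothetical values I made up for illustration; they are not figures from the report.

```python
from collections import defaultdict

# Hypothetical (company, kind, monthly_spend_usd) rows.
# These are illustrative assumptions, not data from the a16z/Mercury report.
rows = [
    ("ModelCo", "horizontal", 900_000),
    ("NotesApp", "horizontal", 40_000),
    ("DeckApp", "horizontal", 30_000),
    ("SupportBot", "vertical", 20_000),
    ("RecruitAI", "vertical", 10_000),
]


def shares(rows):
    """Return (share by number of companies, share by spend) per kind."""
    count = defaultdict(int)
    spend = defaultdict(int)
    for _, kind, usd in rows:
        count[kind] += 1
        spend[kind] += usd
    total_companies = len(rows)
    total_usd = sum(spend.values())
    count_share = {k: count[k] / total_companies for k in count}
    spend_share = {k: spend[k] / total_usd for k in spend}
    return count_share, spend_share


count_share, spend_share = shares(rows)
print(count_share)  # {'horizontal': 0.6, 'vertical': 0.4}
print(spend_share)  # {'horizontal': 0.97, 'vertical': 0.03}
```

In this made-up data the split is 60/40 by count but 97/3 by spend, which is exactly the kind of gap that makes it risky to read a count ratio as a spend ratio.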

Why are foundation models this far up the list?

OpenAI, in first place, was founded in 2015 and is headquartered in San Francisco. It provides ChatGPT, the GPT family of APIs, DALL-E, and Whisper. On March 31, 2026, it raised $122 billion in a SoftBank-led round, bringing its post-money valuation to $852 billion (per CNBC reporting). Its revenue mix is also shifting: a16z's report says that "as of last October, 75% of revenue was consumer, but recently it is closer to 50/50," while OpenAI itself stated as of March 2026 that "enterprise accounts for more than 40% of revenue." Because the figures conflict depending on the point in time, I am listing both here. Either way, it is clear that OpenAI is becoming an "infrastructure layer" that nearly every app connects to.

Anthropic, in second place, was founded by former OpenAI members and trains its Claude family of LLMs using "Constitutional AI." It has been favored as the "safe choice" by regulated industries and risk-conscious enterprises. It raised $13 billion in September 2025 in an ICONIQ-led round (a $183 billion valuation), and $30 billion in February 2026 in a GIC/Coatue-led round (a $380 billion valuation). Claude Code is driving its growth. The fact that it is the only frontier model available across all three major clouds — AWS Bedrock, Google Vertex AI, and Microsoft Azure — also lowers the friction for enterprises adopting it.

The reason both companies occupy the top is, when you think about it, simple. Their capabilities are general-purpose, so they can be embedded in any department. On top of that, other listed companies (such as Lovable, Manus, and Notion) pay for these APIs to build their own products, so the same spending is counted twice. And startups try the chat UI and the API first. Perplexity (12th, valued at over $20 billion) and Merlin (30th) also made the list as general-purpose LLM access tools, and a16z sees it this way: "Dominance in this area is not yet decided. It may not end up winner-take-all."

Vibe coding has already become a tool for production development

This is the area that personally interested me the most. a16z sums it up: "Vibe coding is no longer just a consumer trend — it has reached the workplace."

Replit, in third place, was founded in 2016, with CEO Amjad Masad. It has evolved from a browser-based IDE into an agentic development environment; the Replit Agent runs autonomously for hours and has database, authentication, and publishing built in. In September 2025 it reached a $3 billion valuation, and in March 2026 it raised $400 million to hit a $9 billion valuation. Annual revenue grew rapidly from about $10 million at the end of 2024 to $150 million by September 2025 (some estimates put it at a roughly $300 million scale by the end of 2025). This is where the strength of spending data comes through. In consumer traffic, the more front-end-oriented Lovable stands out, yet by revenue from Mercury customers, a16z states plainly that Replit was about 15 times larger than Lovable. Popularity in traffic and the enterprise wallet are two different stories.

Cursor (Anysphere), in sixth place, was founded in 2022 by four MIT alumni. It is an AI-native code editor forked from VS Code, with ARR rising from $100 million in January 2025 to over $1 billion in November, and to $2 billion by February 2026. It has even been called "the fastest B2B company ever to reach $2 billion from zero." Roughly 70% of the Fortune 1,000 are said to be part of its customer base.

Lovable, in 18th place, comes out of Stockholm, Sweden, with CEO Anton Osika. It is a no-code product that generates an entire app from a text prompt, reaching $100 million ARR about eight months after founding and $200 million four months after that. Klarna, Uber, and Zendesk are customers. That said, because it uses OpenAI and Anthropic models, inference costs squeeze its margins — which I think is the fate of no-code AI.

Others on the list include Cognition in 34th place (the autonomous AI engineer "Devin," with combined ARR of about $155 million after acquiring Windsurf) and Emergent in 48th place (agentic vibe coding for non-technical users). Behind so many of these making the list are several factors: the reasoning power of Claude and GPT has improved to the point of handling production tasks; amid an engineer shortage, development speed has become a competitive advantage; and PMs, marketers, and even founders can now build apps "in English," widening the user base all at once. Whether this area converges on a single company or four or five coexist is something no one can say yet.


Creative tools are the largest category

With 10 companies listed, the creative segment became the largest category. I think this reflects how "creation," once the domain of marketers and certain specialists, has become a horizontal capability that anyone can touch.

Freepik, in fourth place, comes out of Málaga, Spain. It was originally a stock-asset site, but after the arrival of DALL-E 2 it pivoted entirely to generative AI. Without taking a single dollar of U.S. VC money, it reached about $230 million ARR (roughly half of which is video) and rebranded to "Magnific" in April 2026. It is said to be Europe's largest generative-AI web company (by user count).

ElevenLabs, in fifth place, was founded in 2022 by two people from Poland. It provides near-human voice synthesis, voice cloning, and dubbing. It ended 2025 with over $330 million ARR, raised $500 million in February 2026 in a Sequoia-led round, and reached an $11 billion valuation. Deutsche Telekom, Revolut, and Klarna have deployed it for customer support. NTT DOCOMO Ventures is also among its strategic investors.

Others include Canva (17th, leveraging OpenAI and Google models in Magic Studio, at a $3.3 billion ARR scale), Photoroom (22nd, background removal for e-commerce sellers, profitable from early in its founding), Midjourney (28th, with zero VC money and an astonishing capital efficiency of about $3 million in revenue per employee), Descript (31st, built on the idea that editing a transcript edits the video), OpusClip (35th), CapCut (44th, under ByteDance), Arcads (47th, ad videos with AI actors), and Tavus (50th). China's KlingAI (15th), under Kuaishou, also made the list, symbolizing the rise of Chinese players in video generation that rival Western models. Looking at companies like Midjourney that produce $200–500 million ARR with 100 to 160 people, it makes you wonder how far the "small headcount, high revenue" of the AI era can go.

The speed of voice and meeting AI that "records and summarizes"

Record a meeting, transcribe it, and summarize it. The value of this barely needs explaining. You do not have to change your behavior, and it is cheap and quick to try — so purchasing decisions come fast. a16z also writes that this category "has no single winner."

The main names are Fyxer (7th, an AI assistant integrated with Gmail/Outlook/Slack/Zoom, surpassing a $17 million run rate in eight months), Retell AI (16th, a voice-agent platform that automates phone calls, processing over 50 million calls per month), happyscribe (36th, transcription in over 120 languages), PLAUD (38th, paired with card-sized recording hardware, with over 1 million units sold across 170 countries), Otter.ai (41st, auto-joining Zoom/Meet/Teams, breaking $100 million ARR in March 2025), and Read AI (49th, differentiated by meeting engagement analytics).

What this field looks like right now is that many forms coexist, from hardware (PLAUD) to software (Otter, Read) to real-time assistance. One practical note: recording and transcribing meetings always raises questions of consent from the parties involved and of each country's recording laws. Circulating recordings carelessly just because it is convenient can cause trouble later.

Automating sales and GTM, and customer support

In sales and marketing, the list includes Clay (25th, which bundles over 150 data sources to automate research and outreach, growing ARR from $1 million to $100 million in two years and spawning a new role called the "GTM engineer"), Instantly (13th, a disruptive model offering unlimited cold-email automation under a single subscription), Customer.io (14th, behavior-based multi-channel messaging), and 11x (37th, "autonomous digital workers" such as the AI SDR "Alice").

As for 11x, however, a TechCrunch investigation in March 2025 reported inflated ARR and issues with customer logos, and the founder stepped down as CEO in May of that year. The more a product calls itself an "AI employee," the more often the reality requires human management. Fact-checking before adoption is essential.

In customer support, the list includes Lorikeet (8th, which follows SOPs via an "intelligent graph" to prevent hallucination, handling Tier 2/3 cases in fintech and healthcare), Ada (40th, a no-code platform resolving over 83% of inquiries automatically), and Crisp (46th, a multi-channel shared inbox). Do you automate deeper procedural handling rather than self-service, or do you favor easy initial setup? It feels like the difference in design philosophy is laid out right there in the lineup.

Recruiting and compliance, and the "softwarization of services"

Among vertical tools, what especially caught my eye were the areas that human professionals have traditionally handled.

In recruiting and HR there is micro1 (9th, the AI recruiter "Zara," with ARR growing from $7 million to $50 million in nine months), Metaview (19th, an AI note-taker specialized for recruiting), and Applaud (43rd, a no-code platform for HR). In compliance, legal, and patents there is Delve (11th, automating SOC2 and HIPAA compliance in days), Solve Intelligence (23rd, speeding up patent drafting by 60–90%), Crosby (27th, a hybrid "AI law firm" combining AI agents and lawyers, reviewing NDAs and MSAs at a flat rate and high speed), Combinely (29th, an AI "colleague" for accountants), Serval (39th, an AI-native IT service desk), and Alma (42nd, immigration-law services).

The language a16z uses here is telling: "What used to be a services company or a consultancy is becoming a software company in the AI era." Instead of being locked into long-term contracts with a law firm or consultancy, you "hire" a flat-rate, high-speed, AI-native service. The newer the startup, the more it benefits from this nimbleness. The fact that Crosby is taking on the billable-hour model itself is, I think, the symbol of this shift.

That said, vertical tools also include some companies whose numbers are loosely verified. In April 2026, Delve was reported to have allegedly fabricated compliance audit trails and stolen IP from open-source software, and multiple outlets reported that YC removed it from its portfolio. If you are considering adoption, you should keep up with news of this kind.

Knowledge management and horizontal productivity tools

In the realm of handling internal knowledge, the list includes Notion (10th, which launched its AI agent feature Notion 3.0 in September 2025 and broke $500 million ARR, running profitably without external funding), Glean (21st, which searches across internal SaaS while respecting permissions, valued at $7.2 billion), and Manus (33rd, an autonomous agent built on Claude, acquired by Meta for over $2 billion in December 2025).

Others include Gamma (24th, an AI presentation builder reaching $50 million ARR on less than $25 million raised), Grammarly (32nd, which acquired Superhuman and expanded into an AI productivity suite), Merlin (30th, a browser extension that calls an LLM on any web page), Cluely (26th, an "invisible" assistant that surfaces answers in real time during meetings), Motion (45th, which auto-schedules tasks and meetings), and Adept (20th, an autonomous agent that operates software at the UI level, effectively acquired by Amazon). Note that in March 2026, Cluely's founder admitted on X that the $7 million ARR figure was inflated (analysis suggests the reality was about $5.2 million). I think it is about right to discount the flashier numbers.

How is this different from past rankings?

a16z separately publishes a "Top 100 Gen AI Consumer Apps" ranking based on web traffic. This "AI Apps 50" is based on actual spend. Comparing the two surfaces several interesting differences.

One is the Lovable–Replit reversal I already touched on. Even though Lovable ranks higher by traffic, Replit was about 15 times larger by enterprise spend. The presence or absence of enterprise features maps directly onto spending. Another is that 12 companies appear on both lists, and 11 of them began as consumer products and later added team features. Cluely and Midjourney still derive the majority of their revenue from consumers.

And the breakdown of the 17 vertical companies was "12 versus 5" — 12 copilot-type tools that assist humans, against 5 AI-employee-type tools that complete a task end to end. The more reliability a job demands, the more copilots with a human in the loop are still being chosen. a16z predicts that "once computer use spreads as a mode and end-to-end agents can be assembled, this ratio will tilt toward the AI-employee type," but for now my read is that spending is heading toward the steadier copilots.

Implications for Japanese startups and companies

This ranking is U.S.-heavy, but it has practical implications for those of us working in Japan as well.

First, a posture that permits bottom-up adoption. About 70% of the listed companies can be adopted by an individual without an enterprise license. Japanese companies, too, would do better to allow the flow of "the field tries it, proves the impact, and spreads it to the team" rather than "top-down, blanket deployment." MIT research has also pointed out that many AI projects ordered from the top down end up stalling.

Second, the fact that budget is concentrating on "augmenting humans." Money is heading toward copilots that blend into existing workflows and deliver value immediately, rather than toward full AI employees. So if you are going to start, it is rational to begin in low-barrier areas such as meeting notes, coding, and creative work.

Third, the double spend on foundation models. Understand the structure: when you pay an application-layer company, behind the scenes you are also paying OpenAI or Anthropic. If your API spend spikes month over month, that is the moment to think about model routing and setting caps.
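As a rough illustration of spotting a month-over-month spike in API spend, here is a minimal Python sketch. The function name `spend_spikes`, the 50% threshold, and the monthly figures are all my own assumptions for the example, not anything taken from the report.

```python
def spend_spikes(monthly_usd, threshold=0.5):
    """Return (month_index, growth) pairs where spend grew by more than
    `threshold` (0.5 means +50%) compared with the previous month.

    The 50% default is an arbitrary assumption; tune it to your own budget.
    """
    spikes = []
    for i in range(1, len(monthly_usd)):
        prev, cur = monthly_usd[i - 1], monthly_usd[i]
        if prev <= 0:
            continue  # growth is undefined without a positive baseline
        growth = (cur - prev) / prev
        if growth > threshold:
            spikes.append((i, growth))
    return spikes


# Hypothetical monthly API bills in USD.
monthly = [1000, 1100, 1200, 2400, 2500]
print(spend_spikes(monthly))  # [(3, 1.0)]
```

Here the fourth month doubles relative to the third, so it is the one flagged as the moment to look at routing and caps.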

Finally, the lineup of success stories from outside the English-speaking world. Freepik (Spain), ElevenLabs (Polish founders), Lovable (Sweden), Manus (China to Singapore), Photoroom (France). Global AI companies keep emerging from outside Silicon Valley. The same path should be open to AI-native companies from Japan.

How to go about adoption

To translate this into practice, here is what I recommend. Start by trying the low-friction categories. Meeting AI, coding, and creative tools have free or low-cost plans, and the behavioral change is small. Pilot them at the individual or team level, and measure the impact by the time saved or by CSAT. As a rough guide, roll out once you can see a few hours saved per user per week.
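One way to turn "a few hours saved per user per week" into a pilot check is sketched below in Python. The 2.0-hour threshold, the minimum of three pilot users, and the sample data are assumptions I chose for illustration; pick values that fit your own team.

```python
from statistics import mean


def ready_to_roll_out(hours_saved, min_hours=2.0, min_users=3):
    """Decide whether a pilot clears the bar for a wider rollout.

    hours_saved: dict mapping a user to a list of weekly hours saved.
    min_hours and min_users are illustrative assumptions, not guidance
    from the a16z report.
    Returns (decision, average hours saved per user per week).
    """
    # Average each user's weeks first so heavy reporters do not dominate.
    per_user = [mean(weeks) for weeks in hours_saved.values() if weeks]
    if len(per_user) < min_users:
        return False, None  # too few users to judge
    average = mean(per_user)
    return average >= min_hours, average


# Hypothetical self-reported hours saved over two pilot weeks.
pilot = {"a": [2.5, 3.0], "b": [1.5, 2.5], "c": [3.0, 3.0]}
decision, average = ready_to_roll_out(pilot)
print(decision, round(average, 2))  # True 2.58
```

If you measure by CSAT rather than time saved, the same structure works with a different metric and threshold.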

Use multiple foundation models in parallel to avoid lock-in. Match them to the use case (Claude for coding, GPT for general purposes, and so on), and put routing or caps in place if spending suddenly surges. For vertical tools, narrow down to the single role where your own headcount is most strained, put one tool into production, and expand it horizontally only once it has clearly improved things. And be wary of inflated ARR and reliability risks (cases like 11x, Cluely, and Delve). Do not skip reference checks with real customers and reliability testing in production before adopting.
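To show what "match models to the use case and put caps in place" could look like, here is a minimal routing sketch in Python. The `Router` class, the `ROUTES` table, the provider labels "claude" and "gpt", and the cap amounts are all hypothetical; they are plain labels, not real model IDs or any vendor's API.

```python
from dataclasses import dataclass, field

# Assumed use-case-to-provider mapping; the labels are placeholders.
ROUTES = {"coding": "claude", "general": "gpt"}
DEFAULT_PROVIDER = "gpt"


@dataclass
class Router:
    monthly_cap_usd: dict  # provider label -> monthly budget cap in USD
    spent_usd: dict = field(default_factory=dict)

    def choose(self, use_case, est_cost_usd):
        """Pick the preferred provider for the use case; fall back to any
        other provider that still has budget. Returns None if all are capped."""
        preferred = ROUTES.get(use_case, DEFAULT_PROVIDER)
        others = [p for p in self.monthly_cap_usd if p != preferred]
        for provider in [preferred] + others:
            spent = self.spent_usd.get(provider, 0.0)
            cap = self.monthly_cap_usd.get(provider, 0.0)
            if spent + est_cost_usd <= cap:
                self.spent_usd[provider] = spent + est_cost_usd
                return provider
        return None


router = Router(monthly_cap_usd={"claude": 10.0, "gpt": 5.0})
print(router.choose("coding", 8.0))   # claude
print(router.choose("coding", 4.0))   # gpt (claude would exceed its cap)
print(router.choose("general", 4.0))  # None (both caps would be exceeded)
```

Returning None when every cap is hit is a deliberate choice in this sketch: it forces a human decision rather than silently overspending.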

So as not to take the numbers at face value

Finally, let me be honest about the caveats when reading this study. The data is limited to transactions through Mercury, so it skews toward early-stage U.S. startups both geographically and by customer base. The "60% versus 40%" is a ratio of the number of companies, not a ratio of spend. Many of the ARR and valuation figures are self-reported or come from secondary sources, and in May 2026 TechCrunch reported that inflation — calling a run rate or CARR "ARR" — is rampant across AI startups in general.

The rankings themselves are no more than a snapshot from June to August 2025. Since publication, Manus has been acquired by Meta, ElevenLabs has reached an $11 billion valuation, Replit a $9 billion valuation, and Adept has already been absorbed into Amazon. It is also worth keeping in the back of your mind that a16z is itself an investor in OpenAI, ElevenLabs, Cursor, Replit, and others (a16z discloses this too). A ranking is not a fixed truth but a single photograph capturing "the flow of money" at one particular moment. I think that is the right distance to keep from it.

How to use this ranking in your own decisions

With tools proliferating and none of them winner-take-all, this is, conversely, a moment when you can "re-choose what suits your own company." But simply scanning the list and chasing trends will scatter your budget. What matters is to sit down and design: which of your own workflows is the bottleneck, whether a copilot or an autonomous agent is what addresses it, and how you will manage spend on foundation models.

TIMEWELL's WARP consulting walks alongside you from taking stock of where you stand on AI adoption to tool selection, internal rollout, and cost management. We often get inquiries from teams at the "I don't know where to start" or "we got stuck at the PoC" stage. If you want to use AI while handling internal knowledge safely, please also consider ZEROCK, an enterprise AI that runs on domestic servers.

"Which tool is right for my company?" "How do I rein in the API spend I'm paying twice over?" If you have questions like these, please feel free to contact us.

References

a16z (Andreessen Horowitz), "The AI Application Spending Report (AI Apps 50: Startup Edition)," October 2, 2025
a16z, "Top 100 Gen AI Consumer Apps" (6th edition, March 2026)
CNBC, TechCrunch, Forbes, Sacra reporting (funding, ARR, and valuations of individual companies)
TrendForce, official company announcements

