December 9, 2025

Dexterous Embodied Intelligence: The Promise and Challenge of Humanoids

All eyes are on humanoid systems as the next frontier of robotics. Once mostly sci-fi, they now appear in startup roadmaps and corporate strategies alike as teams explore how increasingly human-like AI can move from screens into machines. Humanoids represent an ambitious push to bring the latest in AI into embodied form, with the potential to impact everything from industrial floors to household chores.

Humanoids represent the latest chapter in a modern robotics era that traces back to the early 1970s. Companies like FANUC and KUKA developed quickly in the wake of products like Unimate, the world’s first successful industrial robot, innovating and scaling in the first innings of the industry. The foundations set in this Rigid Automation phase power manufacturing today. The space evolved steadily until the democratization of sensing and compute opened the door for high-performance Application Specific robots starting in the 2010s. Many of these systems either leveraged advanced perception and navigation to enable semi- or fully autonomous motion, or performed pick-and-place tasks with off-the-shelf robot arms. We recently entered the generation of Highly Adaptable, Generalized systems, featuring flexible and collaborative robots including offerings from NGP portfolio companies like GrayMatter Robotics and ANYbotics. Systems from all three of these generations co-exist today and are not static; they increasingly offer AI-enabled functionality.

A fourth cycle is rapidly coming into view: Dexterous Embodied Intelligence, exemplified most viscerally by humanoid systems. Advances in Gen AI-inspired foundation models and compute are enabling fast skill acquisition, generalization to new tasks, and natural language instructions, all previously out of reach. Continuous improvements in sensing and actuation are yielding better mechanical control than ever before. The form and function of humanoids have naturally captured the public imagination; for many, they represent the most tangible physical manifestation of AI. A key selling point is that, in a world designed around humans, they should in theory slot seamlessly into existing processes and infrastructure and use unmodified human tools.

But what about specialized robots? Application-specific systems are available today; they are less mechanically complex and computationally intensive, with faster ROI and lower integration risk. Purpose-built systems are often more reliable and safer for targeted applications. But in that narrowness lies the rub. Many industrial workers manage multiple tasks per day and are frequently rotated or retrained. The promise of humanoids lies in generalizability: the ability to drop into new environments and take on diverse tasks immediately. The appeal is natural. Who wouldn’t want a humanoid worker that’s “up to speed” on Day One across different job stations? That’s a factory manager’s dream.

But dreams are not easily realized. In the case of humanoids, technical challenges abound – and many of the toughest lie in the intelligence layer. Architectural approaches to cognition vary: some teams build layered stacks that pair vision-language models (VLMs) with separate action models (diffusion-based or tokenized), while others train end-to-end vision-language-action (VLA) models. For all these paradigms, training with multimodal data is the bottleneck, particularly for learning how environmental and semantic inputs convert into sequences of actions – even for tasks we may consider simple. The challenge is captured by Moravec’s Paradox (1988), which holds that what we consider the pinnacles of human differentiation – logic and reasoning – are much easier to encode into AI-driven systems than the most fundamental human perception and movement, which are the result of hundreds of millions of years of evolution and which we execute subconsciously.
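To make the architectural contrast above concrete, here is a minimal Python sketch of the two shapes a cognition stack can take. It is an illustration only: every name in it (LayeredStack, EndToEndVLA, the toy_* functions, the "dof" key) is a hypothetical placeholder rather than any vendor's or library's API, and the toy functions stand in for real models that would be learned from multimodal data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only.
Observation = Dict[str, Any]   # e.g. camera frames plus a language instruction
Action = List[float]           # e.g. one target value per joint


@dataclass
class LayeredStack:
    """A VLM proposes subgoals; a separate action model turns each into motion."""
    vlm: Callable[[Observation], List[str]]
    action_model: Callable[[Observation, str], List[Action]]

    def act(self, obs: Observation) -> List[Action]:
        actions: List[Action] = []
        for subgoal in self.vlm(obs):            # semantic planning step
            actions.extend(self.action_model(obs, subgoal))  # low-level control step
        return actions


@dataclass
class EndToEndVLA:
    """A single model maps observation and instruction directly to actions."""
    policy: Callable[[Observation], List[Action]]

    def act(self, obs: Observation) -> List[Action]:
        return self.policy(obs)


# Toy stand-ins for learned models (assumptions, not real models).
def toy_vlm(obs: Observation) -> List[str]:
    return ["reach", "grasp"]


def toy_action_model(obs: Observation, subgoal: str) -> List[Action]:
    return [[0.0] * obs["dof"]]   # one zero-valued command per subgoal


def toy_vla_policy(obs: Observation) -> List[Action]:
    return [[0.0] * obs["dof"], [0.0] * obs["dof"]]


if __name__ == "__main__":
    obs = {"instruction": "pick up the cup", "dof": 7}
    layered = LayeredStack(vlm=toy_vlm, action_model=toy_action_model)
    end_to_end = EndToEndVLA(policy=toy_vla_policy)
    print(len(layered.act(obs)), len(end_to_end.act(obs)))
```

Both variants expose the same act interface; the difference is where the boundary between semantic reasoning and motor control sits, and therefore what kind of training data each component needs.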

Simulators can help with basic tasks, but advanced behaviors still require, at a minimum, mimicking (imitation learning with haptic feedback) or teleoperation. Nothing trumps real-world data from actual operations – but humanoid field experience remains limited. As such, we expect winning players to have a head start not just in their use of the most advanced robotic foundation models, but also in their data acquisition and management strategies.

Similarly, matching anything near human dexterity remains a formidable hurdle. The human hand has roughly 17,000 complex mechanoreceptors, with highest density at the fingertips (~1,000 in each). It also has ~27 degrees of freedom – whereas most conventional robots feature 6-7 DOF, and exceeding 20 is a real achievement. The complex system of nuanced sensing and actuation across so many DOF is extremely difficult to replicate, yet it’s a key unlock for humanoid adoption. The hardware barrier is real: there is no universal hand today that reliably achieves anything close to human-parity touch or grasping across tasks. Some would argue that tried-and-true parallel grippers with suction and force/tactile sensing work well enough in many applications. But to hit the holy grail of generalizability, the hand is key.

Beyond core technical challenges in cognition and manipulation, humanoids need to prove out functional safety and whole-body control. Humanoids are heavy and strong. It’s critical to prioritize fall safety and extensively test human-robot interaction (HRI) to minimize the risk of harm. Additionally, the unit economics are unproven at scale. For lasting adoption, total humanoid cost of ownership must be comparable to conventional worker compensation at prevailing wage levels. This is where China leads: low-cost hardware. China’s commercial progress, paired with governmental support, has cast the humanoids race, from the outset, as a global competition.
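The cost-of-ownership comparison mentioned above can be sketched as simple arithmetic. The Python snippet below is one possible framing, assuming straight-line amortization of the purchase price and a flat annual operating cost; the function names, parameters, and every number in the example are hypothetical and chosen purely for illustration, not drawn from any real humanoid or wage data.

```python
def hourly_cost_of_ownership(
    purchase_price: float,
    service_life_years: float,
    annual_operating_cost: float,
    productive_hours_per_year: float,
) -> float:
    """Cost per productive hour, assuming straight-line amortization."""
    annual_capital_cost = purchase_price / service_life_years
    return (annual_capital_cost + annual_operating_cost) / productive_hours_per_year


def breakeven_purchase_price(
    hourly_labor_cost: float,
    service_life_years: float,
    annual_operating_cost: float,
    productive_hours_per_year: float,
) -> float:
    """Purchase price at which the robot's hourly cost equals the labor cost."""
    annual_labor_equivalent = hourly_labor_cost * productive_hours_per_year
    return (annual_labor_equivalent - annual_operating_cost) * service_life_years


if __name__ == "__main__":
    # Purely illustrative inputs (assumptions, not market data).
    robot_hourly = hourly_cost_of_ownership(
        purchase_price=100_000,
        service_life_years=5,
        annual_operating_cost=10_000,   # maintenance, energy, support
        productive_hours_per_year=4_000,
    )
    ceiling = breakeven_purchase_price(
        hourly_labor_cost=30,
        service_life_years=5,
        annual_operating_cost=10_000,
        productive_hours_per_year=4_000,
    )
    print(robot_hourly, ceiling)
```

This framing treats a robot hour as equivalent to a worker hour. A fuller model would also discount for lower throughput, downtime, and integration costs, each of which raises the effective hourly figure.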

Despite the myriad challenges and unanswered questions, established players are leaning in early. Auto manufacturers like BMW and Mercedes are partnering with humanoid startups, while Tesla is building in-house. Big Tech is making big bets on the model side, with NVIDIA, Microsoft, Google, Baidu, and others developing foundation models and simulation tools.

But startups are playing the most impactful role in pushing the frontiers of innovation. In our market map below, we identified over 100 companies building in the space. The market opportunity feels real, particularly for humanoid startups targeting industrials, manufacturing, and logistics where customers tell us repeatedly about the labor shortages they face. The current shortfall in skilled labor headcount in the US alone is estimated at ~600,000 and expected to exceed 2 million by 2030. Even capturing a small fraction of that gap points to a multibillion-dollar opportunity over the next five years—conservatively—against broader market projections reaching into the tens of billions by 2040.

“Killer applications” for humanoids have not yet come into sharp focus. However, given their intended versatility, the commercial aperture is wide. Instruction-following, adaptable robots that quickly pick up tasks they haven’t seen before will take robotics to a new level. When paired with robustness in the field—high uptime, graceful recovery, safety and reliability across varied environments—the result could be transformative.

We see many parallels with how autonomous driving has evolved over the past 10 years: demand for generalized intelligence, concomitant challenges in acquiring and training with multimodal data, active participation by both tech giants and startups, safety as a top priority (and bottleneck), and one essential ingredient – billions of dollars of funding. While we do not anticipate an easy road ahead, we believe the commercial opportunity will be much broader than what the AV industry targeted. The wait will be worth it.

With that in mind, we at NGP are looking to back exceptional technical founders tackling some of the most difficult challenges facing humanoid development. If you are taking an orthogonal approach to manipulation, perception and action models, data management and training, specialized hardware, systems engineering and more – we would love to hear from you. Just reach out – dexterously, of course.