
Apple trained an AI to recognize previously unseen hand gestures from wearable sensors

In the new study, Apple taught an AI model to recognize hand gestures that weren’t part of its original training dataset. Here are the details.

What is EMG?

Apple has published a new study in its Machine Learning Research blog, called EMBridge: Enhancing Gesture Generalization from EMG Signals through Cross-Modal Representation Learning. This study will be presented at the ICLR 2026 Conference in April.

In it, the researchers explain how they trained an AI model to recognize hand gestures, even when those specific hand gestures weren’t part of its original dataset.

To achieve this, they developed EMBridge, “a cross-modal representation learning framework that bridges the modality gap between EMG and pose.”

EMG, or electromyography, measures the electrical activity generated by muscles during contraction. Its practical applications span from medical diagnosis and physical therapy to prosthetic limb control.

More recently, though the field itself is far from new, EMG has been explored more widely in wearables and AR/VR systems.

Meta’s Ray-Ban Display glasses, for instance, use EMG technology in the form of what Meta calls a Neural Band, a wrist-worn device that “interprets your muscle signals to navigate Meta Ray-Ban Display’s features,” per the company’s description.

In Apple’s study, the EMG signals used for training weren’t detected by a wrist-worn device. Instead, the researchers used two datasets:

  • emg2pose: “[…] a large-scale open-source EMG dataset containing 370 hours of sEMG and synchronized hand pose data across 193 consenting users, 29 different behavioral groups that include a diverse range of discrete and continuous hand motions such as making a fist or counting to five. The hand pose labels are generated using a high-resolution motion capture system. The full dataset contains over 80 million pose labels and is of similar scale to the largest computer vision equivalents. Each user completed four recording sessions per gesture category, each with a different EMG-band placement. Each session lasted 45–120 s, during which users repeatedly performed a mix of 3–5 similar gestures or unconstrained freeform movements. We use non-overlapping 2-second windows as input sequences. EMG is instance-normalized, band-pass filtered (2–250 Hz), and notch-filtered at 60 Hz.”
  • NinaPro DB2: “We utilized two NinaPro EMG datasets for a more comprehensive evaluation of EMBridge. Specifically, Ninapro DB2 is used for pre-training, which includes paired EMG-pose data from 40 subjects. It contains 49 hand gestures (including basic finger flexions, functional grasps, and combined movements) performed by 40 healthy subjects. EMG signals are recorded from 12 electrodes placed on the forearm at a sampling rate of 2 kHz, alongside hand kinematics data captured by a data glove. For downstream gesture classification, we use NinaPro DB7, which contains data from 20 non-amputated subjects collected with the same EMG device and gesture set as DB2.”
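The emg2pose description above spells out a concrete preprocessing recipe: instance normalization, a 2–250 Hz band-pass filter, a 60 Hz notch filter, and non-overlapping 2-second windows. As a minimal sketch of what that pipeline could look like (the function name, argument layout, and filter orders here are illustrative assumptions, not Apple's actual code):

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess_emg(emg, fs=2000.0, win_s=2.0):
    """Sketch of the preprocessing described for emg2pose.
    emg: array of shape (n_samples, n_channels).
    Returns non-overlapping windows of shape (n_windows, win, n_channels)."""
    # Instance normalization: zero mean, unit variance per channel
    emg = (emg - emg.mean(axis=0)) / (emg.std(axis=0) + 1e-8)
    # Band-pass filter, 2-250 Hz (4th-order Butterworth, zero-phase)
    b, a = butter(4, [2.0, 250.0], btype="bandpass", fs=fs)
    emg = filtfilt(b, a, emg, axis=0)
    # Notch filter at 60 Hz to suppress mains interference
    b, a = iirnotch(60.0, Q=30.0, fs=fs)
    emg = filtfilt(b, a, emg, axis=0)
    # Split into non-overlapping windows of win_s seconds
    win = int(win_s * fs)
    n = emg.shape[0] // win
    return emg[: n * win].reshape(n, win, emg.shape[1])
```

At 2 kHz (the NinaPro DB2 sampling rate), a 2-second window is 4,000 samples, so 10 seconds of 12-channel EMG would yield five windows of shape (4000, 12).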

With all that said, it’s easy to see how Apple’s EMBridge could pave the way for a future Apple Watch model (or other wearables) to control devices such as Apple Vision Pro, Macs, iPhones, and other wearables, including its rumored upcoming smart glasses.

In practice, from new interaction methods to accessibility improvements, the possibilities could be significant.

Granted, the study itself obviously doesn’t mention any specific upcoming Apple products or applications, but it does state the following:

A potential practical application of our framework is wearable Human-Computer Interaction. In scenarios like VR/AR and prosthetic control applications, a wrist-worn device must continuously infer hand gestures from EMG to drive a virtual avatar or robotic hand.

What is EMBridge?

EMBridge was the researchers’ way to bridge the gap between real EMG muscle signals and structured hand pose data.

Trained using a cross-modal framework, the model was first pre-trained on EMG and hand pose data separately.

Then, the researchers aligned the two representations so the EMG encoder could learn from the pose encoder. This allowed EMBridge to learn to recognize gesture patterns from EMG signals.

Once that was done, they trained the system using masked pose reconstruction, hiding parts of the pose data and asking the model to reconstruct them using only the information extracted from EMG signals.
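The masked pose reconstruction step above amounts to hiding pose frames and scoring how well a decoder recovers them from EMG features alone. A toy sketch of that objective (the function name, linear decoder, and masking scheme are illustrative assumptions, far simpler than the paper's actual architecture):

```python
import numpy as np

def masked_pose_mse(emg_feat, pose, W, mask_ratio=0.5, seed=0):
    """Toy masked-pose-reconstruction loss.
    emg_feat: (T, d) EMG features; pose: (T, k) pose targets;
    W: (d, k) stand-in for the pose decoder.
    A random fraction of frames is 'masked', and the error is
    computed only on those hidden frames."""
    rng = np.random.default_rng(seed)
    mask = rng.random(pose.shape[0]) < mask_ratio  # True = frame is hidden
    pred = emg_feat @ W                            # reconstruct pose from EMG features
    sq_err = (pred - pose) ** 2
    # Only the masked frames contribute to the loss
    return float(sq_err[mask].mean()) if mask.any() else 0.0
```

The key design choice this illustrates: because the loss only counts hidden frames, the model cannot simply copy visible pose data and is forced to extract pose-relevant information from the EMG signal itself.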

The result, as explained by the researchers:

“To the best of our knowledge, EMBridge is the first cross-modal representation learning framework to achieve zero-shot gesture classification from wearable EMG signals, showing potential toward real-world gesture recognition on wearable devices.”

To reduce training errors caused by similar gestures being treated as negatives, the researchers taught the model to recognize when poses represent similar hand configurations, allowing it to generate soft targets for those poses instead of treating them as completely unrelated.

This helped structure the model’s representation space, improving its ability to generalize to gestures it had never seen before.
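The idea of replacing hard negatives with soft targets derived from pose similarity can be sketched as a contrastive loss whose target distribution comes from how alike the poses actually are. This is a simplified illustration under my own assumptions, not the paper's exact CASCLe formulation:

```python
import numpy as np

def _softmax(x):
    # Numerically stable row-wise softmax
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)

def soft_contrastive_loss(emg_emb, pose_emb, tau=0.1):
    """Contrastive loss with soft targets (illustrative sketch).
    Instead of one-hot targets that treat every other pose in the
    batch as a pure negative, the target distribution is derived
    from pose-pose similarity, so near-identical hand configurations
    are not pushed apart."""
    # Cosine-normalize both embedding sets
    emg = emg_emb / np.linalg.norm(emg_emb, axis=1, keepdims=True)
    pose = pose_emb / np.linalg.norm(pose_emb, axis=1, keepdims=True)
    logits = emg @ pose.T / tau                 # EMG-to-pose similarity
    targets = _softmax(pose @ pose.T / tau)     # soft targets from pose similarity
    # Cross-entropy between predicted and soft target distributions
    return float(-(targets * np.log(_softmax(logits) + 1e-12)).sum(axis=1).mean())
```

With one-hot targets, two nearly identical fists in the same batch would be penalized as if unrelated; the soft targets spread probability mass across similar poses, which is the error the researchers set out to reduce.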

The authors evaluated EMBridge on two benchmarks, emg2pose and NinaPro, and found that it consistently outperformed existing methods, particularly in zero-shot (that is, never-before-seen) gesture recognition. Importantly, it did so with only 40% of the training data.

One important limitation noted in the paper is that the model relies on datasets containing both EMG signals and synchronized hand pose data. This means its training still depends on specialized datasets that can be difficult to collect.

Still, the study is interesting, particularly at a time when EMG-based device control seems to be on the rise.

For the full technical details on EMBridge, including its Q-Former, MPRL, and CASCLe components, see the full paper on Apple's Machine Learning Research blog.



You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day.


Author

Marcus Mendes

Marcus Mendes is a Brazilian tech podcaster and journalist who has been closely following Apple since the mid-2000s.

He began covering Apple news in Brazilian media in 2012 and later broadened his focus to the wider tech industry, hosting a daily podcast for seven years.