MacRumors

macrumors bot
Original poster
Apr 12, 2001
68,189
38,974


Starting in iOS 14 and macOS Big Sur, developers will be able to add the capability to detect human body and hand poses in photos and videos to their apps using Apple's updated Vision framework, as explained in this WWDC 2020 session.

apple-vision-framework-human-body-pose-detection-jumping-jack.jpg

This functionality will allow apps to analyze the poses, movements, and gestures of people, enabling a wide variety of potential features. Apple provides some examples, including a fitness app that could automatically track the exercise a user performs, a safety-training app that could help employees use correct ergonomics, and a media-editing app that could find photos or videos based on pose similarity.
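As a rough illustration of the "pose similarity" idea, here is a minimal sketch in plain Python. It does not call the Vision framework. It assumes each pose has already been reduced to named 2D joint positions, and the joint names, the normalization, and the distance measure are all hypothetical choices made for the example, not Apple's method.

```python
from math import hypot, sqrt


def normalize(pose):
    """Center a pose on its centroid and scale it to unit RMS size.

    `pose` maps a joint name to an (x, y) pair. This layout is an
    assumption for the sketch, not the Vision framework's data model.
    """
    n = len(pose)
    cx = sum(x for x, _ in pose.values()) / n
    cy = sum(y for _, y in pose.values()) / n
    centered = {name: (x - cx, y - cy) for name, (x, y) in pose.items()}
    scale = sqrt(sum(x * x + y * y for x, y in centered.values()) / n)
    if scale == 0:
        raise ValueError("all joints coincide; pose has no size")
    return {name: (x / scale, y / scale) for name, (x, y) in centered.items()}


def pose_distance(a, b):
    """Mean per-joint distance between two normalized poses.

    Only joints present in both poses are compared. Smaller values mean
    more similar poses; 0.0 means identical up to position and scale.
    """
    shared = a.keys() & b.keys()
    if len(shared) < 2:
        raise ValueError("need at least two shared joints")
    na = normalize({name: a[name] for name in shared})
    nb = normalize({name: b[name] for name in shared})
    total = sum(
        hypot(na[name][0] - nb[name][0], na[name][1] - nb[name][1])
        for name in shared
    )
    return total / len(shared)


# Hypothetical joints: a standing pose, the same pose smaller and shifted,
# and a pose with both arms raised.
standing = {"head": (0.5, 0.9), "left_wrist": (0.3, 0.5),
            "right_wrist": (0.7, 0.5), "hip": (0.5, 0.4)}
standing_far = {name: (x * 0.5 + 0.1, y * 0.5 + 0.1)
                for name, (x, y) in standing.items()}
arms_up = {"head": (0.5, 0.9), "left_wrist": (0.3, 0.95),
           "right_wrist": (0.7, 0.95), "hip": (0.5, 0.4)}
```

Because each pose is centered and rescaled first, the same stance photographed closer or farther away compares as nearly identical, while raising the arms produces a clearly larger distance. Rotation is not handled in this sketch.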

Hand pose detection in particular promises to deliver a new form of interaction with apps. Apple's demonstration showed a person holding their thumb and index finger together and then being able to draw in an iPhone app without touching the display.
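The pinch-to-draw interaction described above can be sketched in a few lines of plain Python. This is not Vision framework code. It assumes a hand pose detector has already supplied normalized thumb-tip and index-tip positions with a confidence value for each frame. The `Joint` type and both thresholds are hypothetical values picked for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Joint:
    """A detected fingertip (hypothetical type for this sketch)."""
    x: float           # normalized horizontal position, 0..1
    y: float           # normalized vertical position, 0..1
    confidence: float  # detector confidence, 0..1


def pinch_point(
    thumb_tip: Joint,
    index_tip: Joint,
    max_distance: float = 0.05,    # assumed threshold, tune per app
    min_confidence: float = 0.3,   # assumed threshold, tune per app
) -> Optional[Tuple[float, float]]:
    """Return the midpoint of a pinch, or None if the fingers are apart
    or either fingertip was detected with low confidence."""
    if min(thumb_tip.confidence, index_tip.confidence) < min_confidence:
        return None
    distance = hypot(thumb_tip.x - index_tip.x, thumb_tip.y - index_tip.y)
    if distance > max_distance:
        return None
    return ((thumb_tip.x + index_tip.x) / 2, (thumb_tip.y + index_tip.y) / 2)


def collect_stroke(frames: Iterable[Tuple[Joint, Joint]]) -> List[Tuple[float, float]]:
    """Collect the pinch midpoints from a sequence of (thumb, index) frames.

    Frames without a pinch are skipped, so the result is the list of
    points where the "pen" was down.
    """
    stroke = []
    for thumb_tip, index_tip in frames:
        point = pinch_point(thumb_tip, index_tip)
        if point is not None:
            stroke.append(point)
    return stroke
```

In a real app the per-frame fingertip positions would come from the hand pose request in Vision, and the resulting points would be drawn on screen. A production version would likely also smooth the points and require the pinch to persist for a few frames before drawing starts.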

apple-vision-framework-hand-pose-detection.jpg

Additionally, apps could use the framework to overlay emoji or graphics on a user's hands that mirror the specific gesture, such as a peace sign.

apple-vision-framework-hand-emoji.jpg

Another example is a camera app that automatically triggers photo capture when it detects the user making a specific hand gesture in the air.

The framework is capable of detecting multiple hands or bodies in one scene, but the algorithms might not work as well with people who are wearing gloves, bent over, facing upside down, or wearing overflowing or robe-like clothing. The algorithm can also experience difficulties if a person is close to the edge of the screen or partially obstructed.

Similar functionality is already available through ARKit, but it is limited to augmented reality sessions and only works with the rear-facing camera on compatible iPhone and iPad models. With the updated Vision framework, developers have many more possibilities.

Article Link: Hand and Body Pose Detection in iOS 14 Will Provide New Ways to Interact With Your iPhone Without Touching the Display
 
apple-vision-framework-hand-pose-detection.jpg


Honestly, this seems like the kinda stuff that'd make Apple AR compelling—being able to draw in midair means you’d also be able to navigate an interface in midair with just your hands.

Using AR/VR without bringing a controller everywhere seems analogous to what set the iPhone apart from other touchscreen phones in 2007; you didn’t need a stylus.
 
Seems a lot like what Xbox was able to do with the Kinect 10 years ago

The Kinect required expensive 3D scanning hardware, which ultimately Microsoft couldn't afford and discontinued. (Kinect games even attracted an additional royalty which I recall was rumored at $10) This is all done with computer vision.
 
Does this confirm the depth mapping stuff and therefore Face ID on Big Sur? Or does this just use the regular camera?
 
Honestly, this seems like the kinda stuff that'd make Apple AR compelling—being able to draw in midair means you’d also be able to navigate an interface in midair with just your hands.

Using AR/VR without bringing a controller everywhere seems analogous to what set the iPhone apart from other touchscreen phones in 2007; you didn’t need a stylus.
You didn't, but you could still accurately interact with the phone screen and get feedback. I don't think motions in the air will feel very accurate for an interface, but it might be okay for tracking your motions or body posture, and the phone is sort of thinking for itself instead of relying on the user's input.
 
Kind of cool if we can use this data for motion capture and apply it to 3d models in C4D or Unity. Adobe has a pretty broad collection of MoCap models ready to go, but to be able to do this yourself would be fun.
 
Great! Now Siri will be able to critique your ability in bed as it happens.
 
Apple really is doing their best to pull abandoned Microsoft features into their ecosystem. Tiles from Windows Phone, now Kinect.

And yet, Apple always seems to refine the technology that other companies couldn’t execute properly. Isn’t that what makes Apple so unique in the first place?

[I’d say let’s give this technology a shot; I think it has potential if it’s implemented properly. I, for one, want to see where this leads, especially with AR.]
 
I don't think the wizarding community is going to be too happy about this...
From what I’ve read, wizards and modern technology don’t seem to go well together anyway, so Apple probably figured they could do without that particular demographic.

It doesn’t bode well, though, for my idea of an app that would judge the posture of Catholic priests during mass.
 
For motion graphics applications, this would be great if it were expanded to them. Maybe we will start to see built-in tracking functions for puppeteering a person or object in a picture or, even better, for translating performance capture from one subject to another, much like Animoji do with the front-facing camera or motion capture as we know it in films.
 
Except they didn't do it right. You notice how they don't pursue Kinect anymore?
And you think Apple will do it right? This might be useful in a very limited set of circumstances. With the regular interface, one just needs to move a finger to interact. With these new features, one needs to move a hand or the entire body. Why would anyone want to do that? Apple introduced AR support three years ago. The demos were cool, but it is hardly used.
 
The Kinect required expensive 3D scanning hardware, which ultimately Microsoft couldn't afford and discontinued. (Kinect games even attracted an additional royalty which I recall was rumored at $10) This is all done with computer vision.
This may be why the iPad has LIDAR.
 