Launched in 2010 with a $500 million marketing campaign, the Kinect painted a room in a disco ball of invisible infrared dots, mapping it in 3D space and allowing unprecedented tracking of the human body. The Kinect seemed perfect for getting gamers off the couch. Why press a button to duck when you can just duck? It also enabled handy voice commands, when they worked, like “Xbox On” to turn on the Xbox One console. It was Microsoft’s greatest attempt to blur the line between the human body and the human interface–beyond the existing limitations of keyboards, mice, and even touch screens.
In the years since, I don’t believe it’s an exaggeration to say that the Kinect has been the single most influential, or at least prescient, piece of hardware outside of the iPhone. Technologically, it was the first consumer-grade device to ship with machine learning at its core, according to Microsoft. Functionally, it’s been mimicked, too. Since 2010, Apple has introduced the Siri voice assistant, echoing the speak-to-control functions of the Kinect, and Google has started its own 3D tracking system, called Project Tango (which was founded and continues to be led by Johnny Lee, who worked on the original Kinect).
Manufacturing of the Kinect has shut down, even though the device was used for everything from experimental art to next-generation UI prototypes. It’s been a vital tool to the greater research community.
“The important thing about Kinect is it showed you could have an inexpensive depth camera. And it supported the development of thousands of applications that used depth sensing,” Levin says. He points out that it was literally Microsoft Kinect hardware that made it possible for a startup like Faceshift to exist. Faceshift built technology to perform 3D tracking of the human face accurate enough for biometric security, and Apple acquired the company to replace its thumbprint scans. To take advantage of the technology, Apple essentially built a Kinect clone right into the iPhone X, having acquired PrimeSense in 2013–the Israeli company that developed the 3D tracking technology Microsoft licensed for the first Kinect.
The Kinect may be done for gamers and researchers, but it’s not disappearing entirely.
“‘Look, if we’re spending more and more time with these [technologies], one of two things will occur,’” recounts Kipman. “Either we’re going to spend more time interacting with machines in machine ways, and dealing with what’s behind the screen. Or we’re going to have to teach machines to interact better in our world, the analog universe, and teach them to coexist.”
“I choose path two for us, as humans,” he concludes.
So tracking a human? That’s hard. But tracking environments, with all their nuances, is 10 times harder than tracking people. And tracking objects, with all their textures and variances in context? That’s 100 times harder than tracking people.
So Kipman, being what he calls a “lazy” engineer, focused on the simplest square in his matrix to solve–the 1×1 problem, as he put it: human input. That meant computers had to understand gestures and voice.
These steady improvements allowed the Kinect to be miniaturized into something small enough that we could wear it. So in 2015, Microsoft announced the HoloLens. The 10×10 problem: environmental output. That meant the HoloLens could see not just a person, but a space. And it could not just recognize this space, but allow people to output things into it–dragging and dropping holograms.
Illustration: The Rodina – the Visitor is Present (www.therodina.com)