Anyone who uses Snapchat knows its hilarious, scary, and cute filters. I recently decided to try them out and was truly fascinated by how they recognized my facial features in real time. Many people may find them silly, ridiculous, or cool, but the architecture and engineering behind these filters, or "lenses" as Snapchat calls them, is no joke.
The technology that made all this possible came from Looksery, a Ukrainian startup that Snapchat acquired back in late 2015 for a reported $150 million. Snapchat's filters tap into the rapidly growing field of computer vision (applications that use camera pixels to interpret objects and 3D space), which in layman's terms is how Facebook knows who is in your photos, how self-driving cars avoid hitting people and objects, and, of course, how you give yourself bunny ears.
Computers don't exactly "see" like the human brain does; to a computer, an image is just a grid of pixel values.
The Viola-Jones algorithm is a classic tool computers use to detect faces. The idea behind it is to look for areas of contrast: for instance, the eye sockets are darker than the forehead, and the middle of the forehead is lighter than its sides. If enough of these tests match in one area of an image, the algorithm concludes that area is a face, which is why cameras put boxes around faces.
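To make the "areas of contrast" idea concrete, here is a toy sketch (not Snapchat's actual code) of two Viola-Jones building blocks: an integral image, which lets the sum of any rectangle of pixels be computed in four lookups, and a Haar-like feature that compares the brightness of adjacent rectangles, like forehead versus eye sockets.

```python
# Toy sketch of Viola-Jones building blocks, for illustration only.

def integral_image(img):
    """Cumulative sums so any rectangle sum costs O(1). img is a 2D list."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Haar-like contrast feature: top half minus bottom half.
    Positive means the top rectangle is brighter (e.g. forehead above
    darker eye sockets)."""
    half = h // 2
    return rect_sum(ii, x, y, w, half) - rect_sum(ii, x, y + half, w, half)

# A tiny "image" that is bright on top and dark below: the feature
# responds strongly, the way it would over a forehead/eye-socket region.
img = [[9, 9], [9, 9], [1, 1], [1, 1]]
ii = integral_image(img)
contrast = two_rect_feature(ii, 0, 0, 2, 4)
```

The real detector evaluates thousands of such features at many positions and scales and combines them in a cascade, but each individual test is this cheap, which is what makes real-time detection feasible.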
However, in order to place a filter appropriately on a face, the app needs to do more than detect the face; it also has to locate facial features. According to Looksery's patents, this is made possible by a statistical model of face shape, trained on thousands of images in which people manually marked the facial features. The algorithm uses this trained model as a "template", since it is never a perfect fit, and adjusts the template's points to match those of your face. These points serve as coordinates for a mesh: a 3D mask that can rotate, move, and scale along with your face as the video data comes in for every frame. Once they've got that, they can do a lot with the face mask, like changing your face shape or eye color and triggering animations when you open your mouth or raise your eyebrows.
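The "adjust the template to your face" step can be sketched with a much-simplified 2D version: given a few template points and the matching detected points, solve for the best similarity transform (scale, rotation, translation) and use it to map any other template point, say where the bunny ears should go, onto the live face. The landmark coordinates below are made up for illustration; this is not Looksery's actual model.

```python
# Toy sketch: fit a 2D face-shape "template" to detected landmark points.
# A least-squares similarity transform has a closed form when (x, y)
# points are treated as complex numbers x + iy.

def fit_template(template, target):
    """Return a function mapping template coordinates onto the face."""
    t = [complex(x, y) for x, y in template]
    p = [complex(x, y) for x, y in target]
    mt = sum(t) / len(t)          # template centroid
    mp = sum(p) / len(p)          # detected-face centroid
    # Optimal complex factor a = scale * e^(i * rotation)
    a = sum((ti - mt).conjugate() * (pi - mp) for ti, pi in zip(t, p)) \
        / sum(abs(ti - mt) ** 2 for ti in t)

    def warp(x, y):
        q = a * (complex(x, y) - mt) + mp
        return (q.real, q.imag)
    return warp

# Template "eyes" and "nose tip" in a unit face; the detected face is
# twice as big and shifted -- the fitted warp places any extra template
# point (an ear anchor, a sticker corner) onto the live face.
template = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
detected = [(8.0, 5.0), (12.0, 5.0), (10.0, 7.0)]
warp = fit_template(template, detected)
face_center = warp(0.0, 0.0)
```

The real pipeline does this in 3D, refines the fit iteratively against image evidence, and re-solves it for every frame, which is what lets the mesh track your face as you move.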
This technology is not new, but getting it to work in real time on mobile devices is pretty recent. I have to say, as inane as these filters may seem, they are pretty impressive from a technical point of view. 🙂
Can’t wait to see the next step in Computer Vision.