COMPONENT PARTS OF MEDIATED REALITY: MACHINE VISION

As we discussed in an earlier post, the fundamental difference between an artificial intelligence and a very powerful computer is the AI’s capacity not only to learn new abilities but to teach them to itself. No human or group of humans can process the amount of information that our devices deal with on a daily basis. It is for this reason that AI has become an essential part of the growth of many up-and-coming inventions. In the mediated reality space, we have yet to determine the limits of AI’s applications, but one function it serves is machine vision.

Machine vision is what allows computers to process visual data and draw useful information from what they see. When a computer first views an image, there is no connection between the shapes in the picture and any other external reality. It is simply a series of patterns that hold no meaning for the computer. However, once the computer has processed thousands of similar images, it will have seen enough of the patterns to determine that some of them are not random, and may even correspond to something common, like the shape of a tooth or the bridge of a nose.
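
To make the "thousands of similar images" idea concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the 5x5 SHAPE, the 30% noise level, and the helper names are all invented for this example, and simple averaging stands in for the far more sophisticated models real machine vision systems use. The point is just that a pattern invisible in any single noisy image becomes obvious once enough examples pile up.

```python
import random

# Hypothetical 5x5 binary "shape" (1 = dark pixel). Invented for illustration.
SHAPE = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]


def noisy_copy(pattern, flip_prob, rng):
    """Return a copy of a binary image with each pixel flipped with probability flip_prob."""
    return [
        [1 - px if rng.random() < flip_prob else px for px in row]
        for row in pattern
    ]


def average_image(images):
    """Average each pixel position across many same-sized images."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / n for c in range(cols)]
        for r in range(rows)
    ]


def recover_pattern(images, threshold=0.5):
    """Mark a pixel as part of the pattern if it is dark in most of the images."""
    return [
        [1 if value > threshold else 0 for value in row]
        for row in average_image(images)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
images = [noisy_copy(SHAPE, 0.3, rng) for _ in range(2000)]
recovered = recover_pattern(images)

for row in recovered:
    print("".join("#" if px else "." for px in row))
```

Any one of the 2000 images has roughly a third of its pixels scrambled, yet the pixels that are dark more often than not trace out the original shape. That is the non-random regularity the paragraph above describes, in miniature.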

A normal computer could do this too, but it couldn’t teach itself to refine its understanding of these patterns or autonomously decide to re-run the program as many times as necessary to fully ascertain the details. By the end of the exercise, the AI may in fact have taught itself the difference between plants and furniture, furniture and people, some people and others, children and adults, etc.

For a more practical example, any device, head-mounted or otherwise, that has eye-tracking technology uses machine vision. Only after processing thousands upon thousands of images and videos of eye movements does such a device become capable of discerning where the user’s attention is focused or the direction in which they wish to travel.

In Augmented/Mixed Reality, these pattern recognition techniques become necessary for interactions with the “real world” as well. For instance, WallaMe is an app that allows users to leave virtual messages for other users in real places. If you are planning to leave a piece of writing or a drawing on the outside of a real building, your device will need to be able to precisely determine the contours of that building. Another example: if you want your virtual pet dragon to be walking around on your desk while you work, the machine has to know exactly where the desk ends and the air begins or the illusion is blown.
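
As a rough idea of what "knowing where the desk ends" can mean, here is a minimal Python sketch. It assumes the device has a single row of depth readings in metres (the numbers below are made up) and simply looks for the biggest jump between neighbouring readings. Real AR systems work with full 3D data and much more robust methods; the function name find_edge is hypothetical.

```python
def find_edge(row):
    """Return the index i where the jump between row[i] and row[i + 1] is largest."""
    if len(row) < 2:
        raise ValueError("need at least two readings to find an edge")
    jumps = [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]
    # Index of the largest jump; ties resolve to the first occurrence.
    return max(range(len(jumps)), key=jumps.__getitem__)


# Hypothetical depth readings in metres: desk surface close by, then open air.
depth_row = [0.61, 0.60, 0.62, 0.61, 2.40, 2.42, 2.41]
edge = find_edge(depth_row)
print(f"Desk ends after reading {edge}")
```

In this made-up row the desk surface sits about 0.6 m away and the floor beyond it about 2.4 m away, so the sketch places the edge after the fourth reading. A virtual dragon would need to stop walking there.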

Another common example is the lane correction technology and proximity detection offered as safety features in many modern cars. For instance, in the new Chevy Volt, the car will maintain or change speed for you once cruise control is engaged. When a car is in front of you, the Volt will match that car’s speed, but as soon as that car changes lanes, the Volt will speed up without you hitting the gas.
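
The cruise control behavior described above boils down to a small decision rule, sketched here in Python. This is not Chevrolet's actual implementation; the function name target_speed and the choice never to exceed the driver's set speed are assumptions made for the example, and a real system would also manage following distance and braking.

```python
from typing import Optional


def target_speed(set_speed: float, lead_speed: Optional[float]) -> float:
    """Pick the speed to aim for, given the driver's set speed and the car ahead.

    lead_speed is None when the vision system sees no car in the lane.
    """
    if lead_speed is None:
        # Lane is clear: return to the speed the driver chose.
        return set_speed
    # Car ahead: match it, but (by assumption) never go faster than the set speed.
    return min(set_speed, lead_speed)


print(target_speed(70.0, 60.0))  # slower car ahead
print(target_speed(70.0, None))  # that car has changed lanes
```

With a set speed of 70 and a car ahead doing 60, the rule settles on 60. Once the lane clears, it goes back to 70 without the driver touching the gas. Spotting the car ahead, and noticing when it leaves, is where machine vision does its work.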

So where does this technology go from here? We recently learned that there is big money in machine vision. Intel just spent 15.3 billion dollars acquiring Mobileye, an Israeli machine vision developer working on software for both conventional and autonomous vehicles. The current prediction is that driverless cars will be a 70-billion-dollar industry by 2030. Games and virtual spaces will be enhanced, and product recognition will become part of shopping from home. And the implications of this tech reach far beyond these terrestrial ambitions. Who knows what might be invented by the end of 2017, let alone within the next decade? We could be looking now at the origins of the tech that finally enables us to explore further into the depths of space than we have ever gone before. Our machines can see even better than we can, and we are embracing the idea that it is about time to let them.