Thursday, July 27, 2017

How Machine Learning enables alternative User Interfaces

The way we interact with computers and computer-connected devices has come a long way.  In the beginning, computers were number-crunching devices, and people used punch cards to load in the data to be crunched.  Then the command-line interface was introduced and became the dominant (and still highly useful) interface.  Via the keyboard, you instructed the computer to do exactly what you wanted.

Then, in the late 80s, we had the graphical user-interface revolution.  This is where the mouse, a movable pointing device, became a key part of the user interface.  A command-line interface is focused on precision.  You must spell out the instructions that you want the computer to perform in fine detail.  Over time, macros and other shortcuts were introduced to make it easier to type out frequently performed items, but the instructions were always very precise.  A mouse and GUI are very different.  You are now interacting in a three-dimensional space (even if it is only rendered in 2D with overlapping windows).  Distance became important as we would physically move items around on the screen in order to interact with them.  A mouse, although not as precise as the command line, is still a very precise device for interacting with the computer.  We just had to rely on the OS to keep track of which XY position the pointer was at, which screen element was directly underneath it, which elements were overlapping each other, etc.
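To make that bookkeeping concrete, here is a minimal sketch in Python of the kind of "hit test" a windowing system performs: given the pointer's XY position, find the topmost element underneath it.  The `Window` class, its fields and the `hit_test` function are hypothetical names made up for illustration; they are not taken from any real OS or GUI toolkit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    """A rectangular screen element (hypothetical, for illustration only)."""
    name: str
    x: int        # left edge
    y: int        # top edge
    width: int
    height: int
    z: int        # stacking order: a higher z is drawn on top

    def contains(self, px: int, py: int) -> bool:
        # The pointer is "over" the window if it falls inside the rectangle.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(windows: List[Window], px: int, py: int) -> Optional[Window]:
    """Return the topmost window under the pointer, or None if there is none."""
    hits = [w for w in windows if w.contains(px, py)]
    return max(hits, key=lambda w: w.z, default=None)


# Two overlapping windows: the dialog sits on top of the editor.
windows = [
    Window("editor", x=0, y=0, width=100, height=100, z=0),
    Window("dialog", x=50, y=50, width=100, height=100, z=1),
]
```

With these two windows, a pointer at (60, 60) lies inside both rectangles, and the stacking order decides that the dialog receives the click.  A real window manager does far more (non-rectangular shapes, transparency, nested widgets), but the idea of resolving a position to an element is the same.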

In the mid-2000s, touch interfaces became more relevant, with the 2007 iPhone being the breakaway hit that popularized them.  The touch interface is fairly mature now, but it is not 100% complete.  It is still being experimented with, for example by using pressure to enable new interactions with a device or computer.  A touch interface, although still precise, is not quite as precise as a mouse.

Finally, over the last few years machine learning technologies have evolved to the point where they are becoming part of most software systems.  In business, we mostly think of ML as performing sophisticated analytics and predictions on data from our operational systems or IoT devices.  But it is also enabling the development and use of other types of input devices to directly interact with our software applications.  The key inputs that are becoming more dominant as alternative user experiences are based on sight and sound.

Voice has currently taken the limelight as a key user-interface technology, with Apple (Siri), Google (Google Home) and Amazon (Alexa) fighting to become the dominant voice-based technology.  Wired recently had an article: "Voice Is the Next Big Platform, and Alexa Will Own It".  In it they declared:

In the coming year, the tech that powers Amazon’s assistant will become even more robust. “We’ve made huge progress in teaching Alexa to better understand you,” Amazon’s head scientist and vice president of Alexa Rohit Prasad told Backchannel earlier this month. Amazon is making more tools available to developers. In November, the company announced an improved version of its developer tools, the Amazon Skills Kit. The company also announced improvements to the Alexa Voice Kit, the set of tools that allow manufacturers to incorporate Alexa into third-party devices like cars or refrigerators.

Machine learning is the key technology backing voice systems, since they must learn and improve over time to recognize different accents, different word orders, etc.  Although Apple, Google and Amazon get the majority of the press attention, they are not the only ones providing voice interfaces, and voice can be used in surprising ways.  My company, Contextant, specializes in using machine-learning technologies to help companies improve their business processes.  Currently, we're in the process of integrating voice-recognition technology with the warehouse management system (WMS) of one of our clients to enable people in the warehouse to perform picking, inventory movement, receiving and packing operations via voice.  This frees their hands from having to worry about a keyboard or mouse.  Voice isn't perfect, but it can complement a standard GUI to help workers see what to do next, where to go inside a warehouse, etc.
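As a rough illustration of what happens after speech has been turned into text, the sketch below maps a transcript onto a warehouse operation.  It is not how our integration works; the phrasings, the `COMMAND_PATTERNS` table and the `parse_command` function are assumptions invented for this example, and a production system would lean on a trained language-understanding model rather than hand-written patterns.

```python
import re
from typing import Optional

# Hypothetical spoken phrasings for three warehouse operations.
# Each pattern runs against a lower-cased, whitespace-normalized transcript.
COMMAND_PATTERNS = [
    ("pick", re.compile(
        r"^pick (?P<qty>\d+) (?:of )?(?:item )?(?P<item>\w+) "
        r"from (?:bin )?(?P<bin>\w+)$")),
    ("move", re.compile(
        r"^move (?:item )?(?P<item>\w+) "
        r"from (?:bin )?(?P<src>\w+) to (?:bin )?(?P<dst>\w+)$")),
    ("receive", re.compile(
        r"^receive (?P<qty>\d+) (?:of )?(?:item )?(?P<item>\w+)$")),
]


def parse_command(transcript: str) -> Optional[dict]:
    """Turn a recognized utterance into an operation plus its parameters.

    Returns None when nothing matches, so the caller can ask the worker
    to repeat the command or fall back to the regular GUI.
    """
    text = " ".join(transcript.lower().split())
    for operation, pattern in COMMAND_PATTERNS:
        match = pattern.match(text)
        if match:
            # Captured values stay as lower-cased strings in this sketch.
            return {"operation": operation, **match.groupdict()}
    return None


command = parse_command("Pick 3 of item A100 from bin B7")
```

Returning `None` for anything unrecognized matters as much as the matching itself: a misheard command should never quietly become the wrong inventory movement.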

Computer vision is also an exploding machine-learning-based technology that allows for alternative user experiences.  It has come a long way in just the past couple of years, but is still not as mature or precise as the other technologies I've mentioned.  But the future is very bright for computer vision.  Our own ViznTrac service was developed out of an indoor-location tracking system to allow large facilities to keep track of where people are.  Computer vision is also a big part of home-security systems (a market area that we hope to push ViznTrac into shortly), and with the development of facial recognition, package detection and more sophisticated movement-detection algorithms it will only improve.

These alternative UX features are still evolving, and they are not as precise as traditional methods.  Voice systems can misunderstand what you say, and computer-vision systems can misinterpret what they are "seeing".  They won't replace the other forms of input (I'm using a keyboard now!) but they will continue to become more effective and complementary technologies.
