How AI and Computer Vision Speed Up Job Automation

Advanced computer vision technology could fully disrupt the market for low-qualification jobs and introduce formerly unknown automation.

Neural networks show impressive results working with image data. Today, well-trained technology outperforms the human brain when it comes to classifying millions of images or recognizing patterns in the photos taken by the Kepler telescope. As a result, AI-enabled image analysis and processing have made their way to diverse areas, far beyond photography or social media.

eBay, for example, launched a computer vision feature that lets shoppers search for products using an image instead of keywords or a description. With Image Search, a customer can simply take a picture of a product and use it to find a similar one in the marketplace.
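A common way to build this kind of visual search is to turn every image into a numeric vector (an embedding) and return the catalog items whose vectors are closest to the query's. The sketch below illustrates that general idea only; it is not eBay's implementation. The three-number vectors, the item names and the function `find_similar` are made up for illustration, and a real system would get its vectors from a trained neural network.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def find_similar(query_vec, catalog, top_k=3):
    """Rank catalog items by similarity to the query embedding."""
    scored = [
        (cosine_similarity(query_vec, vec), item_id)
        for item_id, vec in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:top_k]]


# Hypothetical embeddings; real ones have hundreds of dimensions.
CATALOG = {
    "red-sneaker": [0.9, 0.1, 0.0],
    "blue-sneaker": [0.7, 0.6, 0.1],
    "leather-sofa": [0.0, 0.2, 0.9],
}

# Pretend this vector came from the customer's photo.
query = [0.85, 0.2, 0.05]
results = find_similar(query, CATALOG, top_k=2)
print(results)
```

Here the photo's vector sits closest to the two sneakers, so they are ranked ahead of the sofa.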

In healthcare, the use of neural networks promises to seriously enhance diagnostic capabilities. Today, neural networks already perform skin cancer classification and can identify melanoma with 90% accuracy.

And then there are applications in national security. Recently, U.S. law enforcement agencies turned to Amazon Rekognition to identify suspected criminals. This advanced facial recognition system not only tracks people in real time, but also estimates their age and emotional state.

As neural networks demonstrate maturity in different areas, more brands look into the potential of image data to learn how to integrate this asset into their digital transformation strategies.

Here are the key opportunities for applying neural networks to image data that businesses could consider:

Improve targeting and personalization

B2C companies are constantly searching for ways to get closer to customers and anticipate their desires and goals. Thanks to social media and customer review platforms, it’s easy to learn what people say about products and services, but not what they see or show.

Computer vision can expand a brand’s view of its customers. Using images, companies can get a better picture of people’s preferences and interests. Those who learn to extract insights from this data will therefore stay ahead of the competition.

Startup Shoto was among the first to recognize this opportunity. The company trained neural networks to understand the pictures people take at similar events and enabled attendees to swap these photos on a peer-to-peer basis.

First, this improves customer engagement and tends to increase coverage on social media. The real value, however, lies elsewhere: the company receives unique insights into the event itself, such as which sessions are the most popular, which key speakers attract the largest audiences and who gets online engagement. This information supports data-driven decisions about the next event, from personalizing sessions for certain audiences to better targeting attendees.

Increase reach and scale up to multiple channels

Once neural networks understand image content, they can transform it into other forms of communication. In other words, they can write it down or speak it aloud, for example, through popular voice assistants. This means the reach of image content expands, depending on the purpose.

Facebook started translating images into speech to improve the experience of blind users. The company created computer vision technology that transforms image content into alt text and then tells people in English what is in the picture, using simple phrases like “outdoors, smiling people, puppies.”
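The last step of such a pipeline, turning a classifier's labels into a short phrase a screen reader can speak, can be sketched in a few lines. This is an illustration under stated assumptions, not Facebook's code: the function `build_alt_text`, the confidence threshold of 0.8, the wording of the phrase and the sample scores are all hypothetical.

```python
def build_alt_text(predictions, threshold=0.8, max_tags=3):
    """Compose a short alt text from label -> confidence predictions.

    Only labels at or above the threshold are kept, so the description
    stays conservative rather than guessing at uncertain content.
    """
    confident = [
        (score, label)
        for label, score in predictions.items()
        if score >= threshold
    ]
    confident.sort(reverse=True)  # most confident labels first
    tags = [label for _, label in confident[:max_tags]]
    if not tags:
        return "Image"  # nothing recognized with enough confidence
    return "Image may contain: " + ", ".join(tags)


# Hypothetical output of an image classifier.
predictions = {
    "outdoors": 0.97,
    "smiling people": 0.91,
    "puppies": 0.88,
    "car": 0.35,
}

alt_text = build_alt_text(predictions)
print(alt_text)
```

The low-confidence "car" label is dropped, which is the design choice that matters here: for an assistive feature, saying less is preferable to describing something that is not in the picture.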

ABBYY went a step further: its TextGrabber app recognizes text in photos taken with an ordinary smartphone camera and translates it instantly, in both online and offline modes. Offline, TextGrabber understands text written in 10 common languages. Online mode scales this up to 62 source languages and 104 target languages, translated in real time.

Automate processes, enhance efficiency and accuracy

Leveraging image data can seriously enhance customer experience and help brands reengage with their audience using existing and new channels.

For some companies, however, image data is at the core of operations that until recently could only be done manually. Today, thanks to intelligent image analytics and processing, these operations can become much more efficient.

First, computer vision can automate paperwork processes that require simple decision making, for example, copying data from paper forms to online sources.

Fintech startup Receipt Bank has built a system that targets accounting and bookkeeping operations across different businesses and automates paperwork using computer vision. The system analyzes photos of bills, invoices and receipts, extracts important information from this image data and uploads it to the customer’s software. Thus, Receipt Bank makes it possible to automate personnel-heavy routine jobs, enhance efficiency and even reduce mistakes and inaccuracies.
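To make the extraction step concrete, here is a minimal sketch of pulling a few fields out of text that an OCR engine has already read from a receipt photo. It is not Receipt Bank's method: the function `extract_receipt_fields`, the patterns, the assumption that the vendor name is the first line and the sample receipt are all hypothetical, and production systems handle far messier layouts.

```python
import re
from decimal import Decimal

# Assumed formats: an ISO date and a line that starts with the word "total".
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
TOTAL_RE = re.compile(
    r"^\s*total\b[^\d]*(\d+(?:[.,]\d{2})?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_receipt_fields(ocr_text):
    """Extract vendor, date and total from OCR'd receipt text."""
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    vendor = lines[0] if lines else None  # assumption: vendor is printed first
    date_match = DATE_RE.search(ocr_text)
    total_match = TOTAL_RE.search(ocr_text)
    total = None
    if total_match:
        # Accept a decimal comma as well as a decimal point.
        total = Decimal(total_match.group(1).replace(",", "."))
    return {
        "vendor": vendor,
        "date": date_match.group(1) if date_match else None,
        "total": total,
    }


# Made-up OCR output for illustration.
SAMPLE = """Corner Cafe
2018-06-14
Espresso      2.50
Croissant     3.20
Subtotal      5.70
TOTAL         5.70
"""

fields = extract_receipt_fields(SAMPLE)
print(fields)
```

Anchoring the pattern to the start of a line keeps "Subtotal" from being mistaken for the total, and using `Decimal` rather than a float avoids rounding surprises once the amount reaches the accounting software.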

Secondly, solutions for visual data understanding can make an impact on more sophisticated operations that, until now, only humans could perform. Recently, Google DeepMind created a system able to render 3D objects from 2D images. In other words, DeepMind trained neural networks to imagine space from different angles and build a three-dimensional environment without any human supervision.

See the process of 3D rendering explained here.

Today, designers, engineers and architects have to label every single aspect of a structure to build a relevant 3D model. An algorithm like DeepMind’s could significantly reduce the time and cost spent on this routine, thus improving organizational efficiency.

Conclusion

In most cases, today’s examples of well-trained neural networks demonstrate what this technology could do rather than what it already does. For instance, suggesting recipes based on what a smart fridge “sees” on its shelves is more of a perk than a benefit.

Further improvements in AI will bring about more self-reliant, highly accurate computer vision technology, which will fully disrupt the market for low-qualification jobs and introduce formerly unknown automation. Moreover, it will bring unprecedented efficiency to life-changing domains: diagnostics in healthcare, precision agriculture in the food industry, climate change monitoring, public safety and national security.