Data Extraction Just Got Smarter With ML: AWS Announces Textract

Anirudh VK

Amazon Web Services, the cloud computing arm of the e-commerce giant, recently launched an ML service for automated text and data extraction. The service, known as Textract, is fully cloud-hosted and managed by AWS, and allows users to parse various forms of data easily.

The service is said to be more than an optical character recognition algorithm, as it can parse tables, whole pages, forms, scans, PDFs, photos and more. It also identifies form fields and tables, which keeps each extracted value in context and makes the resulting datasets cleaner and easier to analyze.

The company states that it can process millions of document pages “accurately” in just a few hours. Results are returned as JSON and can be passed to other ML-based AWS services. According to AWS, what sets the product apart is that there is no code or template to maintain, and that no ML experience is required to operate or manage it.
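To illustrate what working with that JSON output can look like, here is a minimal Python sketch that pulls form fields (key-value pairs) out of a Textract-style response. It assumes the response is a dictionary with a `Blocks` list, where `KEY_VALUE_SET` blocks link to their words through `CHILD` relationships and to their values through `VALUE` relationships. The function names and the sample data are hypothetical and hand-written for illustration; they are not real Textract output.

```python
from typing import Dict, List


def text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the text of the WORD blocks that a block lists as CHILD."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(response: dict) -> Dict[str, str]:
    """Map each detected form key to the text of its linked value."""
    by_id = {block["Id"]: block for block in response["Blocks"]}
    fields: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_parts: List[str] = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(text_of(by_id[value_id], by_id))
        fields[text_of(block, by_id)] = " ".join(value_parts)
    return fields


# Hand-written sample in the assumed response shape (not real output).
SAMPLE_RESPONSE = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [
             {"Type": "CHILD", "Ids": ["w1", "w2"]},
             {"Type": "VALUE", "Ids": ["v1"]},
         ]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
        {"Id": "w2", "BlockType": "WORD", "Text": "No:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "12345"},
    ]
}

if __name__ == "__main__":
    print(extract_form_fields(SAMPLE_RESPONSE))
```

In practice the response would come from the service itself, for example through the `analyze_document` call of the AWS SDK for Python (boto3) with `FeatureTypes=["FORMS"]`; the sketch works on a local dictionary so that it runs without AWS credentials.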



Amazon states that it has trained Textract on “tens of millions of documents from virtually every industry”, which it says makes the service suitable for a wide range of scenarios. It can “automatically detect a document’s layout”, preserving the key elements on the page and extracting data more reliably by understanding the relationships between those elements.

Amazon is billing it as a lower-cost, easier-to-use alternative to manual data entry. As with other AWS cloud services, it is offered on a pay-as-you-go basis and accessed through APIs. Swami Sivasubramanian, Vice President, Amazon Machine Learning, stated:

“Amazon Textract makes it possible for customers to gain real meaning from their file collections, operate more efficiently, improve security compliance, automate data entry, and facilitate faster business decisions.”

Currently, the service is available in the US East (Ohio), US East (N. Virginia), US West (Oregon) and EU (Ireland) regions, with Amazon stating that further expansion will happen within the year.



Several prominent organizations have already begun using the service, including The Globe and Mail, a Canadian media outlet; the Met Office, the UK’s national weather service; and PricewaterhouseCoopers, one of the world’s biggest accounting firms. The rise of accessible ML models for data extraction might be the beginning of the end for low-level jobs such as data entry.

Copyright Analytics India Magazine Pvt Ltd
