Can GenAI Process Data Better than Humans?

Piotr Wawryka · Business · October 15, 2024 · 7 min read

Everyone makes mistakes. For individuals, they’re often positive learning experiences. For businesses, they can mean large quantities of cash down the drain.

Human error is neither rare nor avoidable.

According to Dr. Graham Edkins, the average person typically makes between three and six mistakes every hour. Considering that a single person creates about 1.7 MB of data every second, this raises the question: do you have any idea how much of your data is being entered in error?

A National Center for Biotechnology Information (NCBI) study on data evaluation found that humans have a typical error rate of 6.5%, or 650 errors per 10,000 data entries. It's important to note that this figure comes from a single-entry scenario with no distractions. As you can imagine, the rate climbs dramatically when the person is working under time pressure, multitasking, performing repetitive tasks, or processing large volumes of information.

This was exactly the problem our client was trying to solve: freeing their 30-person team from the burden of triple-checking every data entry when processing vendor data.

Manual Data Processing: A Flawed Approach

The Stakes Are High

Our client—a global telecommunications service provider—wanted to speed up their internal processes related to quoting network services from their partners. This was a major undertaking since the number of vendors across all global markets numbered in the hundreds.

That is where things got complicated. As a global leader, our client cooperates with a large network of localized telecom carriers that periodically update their prices in cost books. If there is a pricing mismatch, the price offered to end customers cuts into the company's profits or even prevents it from generating any profit at all. This is a serious risk to the company's liquidity, which is why a dedicated internal team manages it on a daily basis.

A Slow And Resource-Heavy Process

To process cost books efficiently, our client has a 30-person department whose primary goal is to keep vendor prices continuously up to date, ensuring that customers buy the services they want at the correct prices.

The typical process starts with obtaining cost files from their vendors, storing them internally, and preparing them in the required format. This is a tedious and time-consuming process. It is also prone to errors because of the many manual tasks and points where data can be processed incorrectly.

Delays Have Real Consequences

Once prepared, the files are sent to a third-party technology provider with dedicated software to process network services quoting data. The provider then returns the most current vendor quotes available in a given local market through a dedicated API.

As a result, the entire process for just one supplier typically takes at least a couple of days and sometimes as long as a few weeks. This poses a severe risk, as the company's margin can be reduced or even zeroed out in the event of a sudden change in multiple cost files from different telecom vendors.

Generative AI Changes the Narrative

Our client wanted to find out whether a generative AI model could reliably automate the reading and processing of cost files. The technology has proven effective at reducing human error, especially in database work, which in turn cuts the amount and cost of manual labor.

Preparing a PoC

To prepare a training database, we requested a sample of real, anonymized telco data to check whether a generative AI model could process it properly and output an acceptable result. Once that was confirmed, we added a dedicated Python interpreter tasked with analyzing any input file provided, which exposed the consecutive steps taken while processing each cost file.
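
The interpreter itself is specific to the client's data, but a minimal sketch of the kind of file-profiling step it performs could look like the following. The helper name and the use of pandas are our own illustration, not the production code:

```python
import pandas as pd
from pathlib import Path

def profile_cost_file(path: str) -> dict:
    """Inspect an input cost file and report its basic structure,
    so the next processing step knows what it is dealing with."""
    file = Path(path)
    profile = {"name": file.name, "format": file.suffix.lower()}

    if file.suffix.lower() in {".csv", ".xlsx", ".xls"}:
        # Read only a handful of rows to keep the analysis step cheap.
        df = (pd.read_csv(file, nrows=20) if file.suffix.lower() == ".csv"
              else pd.read_excel(file, nrows=20))
        profile["columns"] = list(df.columns)
        profile["sample_rows"] = df.head(5).to_dict(orient="records")
    else:
        # Other formats (PDFs, e-mails) would be handled by dedicated
        # readers further down the pipeline; here we only record the size.
        profile["size_bytes"] = file.stat().st_size

    return profile
```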

Taking a New Approach With GenAI

One of the biggest challenges we faced was supporting a wide variety of different input file formats. We had assumed we could simply convert any input to text and use an LLM to extract the relevant information. However, we soon realized that such a generic conversion could lead to a loss of information, especially around the spatial arrangement of the text. In turn, this would make it impossible to analyze the data accurately.

To overcome this issue, we decided to leverage an agent-based approach.

This framework allows for more complex operations. It incorporates concepts such as planning (scheduling), memory (storage), and tools (e.g., code interpreters). Instead of extracting information in a single API call, tasks are broken down into smaller steps: the LLM can identify the file format, use the most appropriate method to read and present its contents, and finally extract the relevant information in the desired format.
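
For illustration only (the actual PoC relied on an agent framework and its own tooling), the decomposition can be sketched roughly like this. The `ask_llm` wrapper, the helper names, and the choice of pdfplumber for PDFs are assumptions, not the client's implementation:

```python
from pathlib import Path

def ask_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever LLM API the agent uses."""
    raise NotImplementedError

def read_file(path: Path) -> str:
    """Pick a reader that preserves the file's structure instead of
    flattening everything to unstructured text."""
    suffix = path.suffix.lower()
    if suffix in {".csv", ".xlsx", ".xls"}:
        import pandas as pd
        sheets = ({"data": pd.read_csv(path)} if suffix == ".csv"
                  else pd.read_excel(path, sheet_name=None))
        # Markdown tables keep the spatial arrangement of rows and columns.
        return "\n\n".join(df.to_markdown(index=False) for df in sheets.values())
    if suffix == ".pdf":
        import pdfplumber
        with pdfplumber.open(path) as pdf:
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    if suffix == ".eml":
        from email import policy
        from email.parser import BytesParser
        msg = BytesParser(policy=policy.default).parsebytes(path.read_bytes())
        body = msg.get_body(preferencelist=("plain",))
        return body.get_content() if body else ""
    raise ValueError(f"Unsupported format: {suffix}")

def extract_costs(path: str) -> str:
    """Final step: ask the model to pull the relevant fields out of the
    already-structured file contents."""
    contents = read_file(Path(path))
    return ask_llm(
        "Extract vendor name, service and unit price from the cost data "
        "below and return them as CSV rows:\n\n" + contents
    )
```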

An Elegant Solution

The PoC frontend application consists of a simple screen with two buttons: one to upload a file and the other to start processing it. It works as follows:

  1. The user provides one or more input files with cost data
  2. The model analyzes the file structure and content
  3. Relevant data is extracted in a desired output format
  4. Once the outcome is displayed, the user either accepts or rejects the result
  5. If accepted, the user stores the processed data in a specified location
  6. If rejected, the user requests adjustments by re-running the model with changed prompts; this step repeats until the output is accepted (see the sketch below)
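
Here is a minimal sketch of that review loop, with a console prompt standing in for the PoC's two-button interface; `run_extraction` is a hypothetical helper that re-runs the model on the uploaded file with the current prompt:

```python
def run_extraction(path: str, prompt: str) -> str:
    """Hypothetical wrapper: re-runs the extraction agent with `prompt`."""
    raise NotImplementedError

def review_loop(path: str, base_prompt: str) -> str:
    """Show the output, then either accept it or retry with a new prompt."""
    prompt = base_prompt
    while True:
        result = run_extraction(path, prompt)
        print(result)                                        # display the outcome
        if input("Accept this output? [y/n] ").strip().lower() == "y":
            return result                                    # caller stores the data
        # On rejection, the user adjusts the prompt and the model is re-run.
        prompt = input("Adjusted prompt: ")
```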

The output cost data was compared with real network service cost data provided by the client. The model was able to parse all file formats specified by the client (PDF, CSV/Excel, email messages), proving that manual labor can be reduced or even eliminated in the long term.
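
The article does not spell out how that comparison was performed; assuming both the extracted output and the reference data are normalized to shared column names, a simple check with pandas could look like this:

```python
import pandas as pd

def find_price_mismatches(extracted_csv: str, reference_csv: str) -> pd.DataFrame:
    """Return rows where the extracted unit price differs from the reference."""
    extracted = pd.read_csv(extracted_csv)
    reference = pd.read_csv(reference_csv)
    # Assumes both files share 'vendor', 'service' and 'unit_price' columns.
    merged = extracted.merge(reference, on=["vendor", "service"],
                             suffixes=("_extracted", "_reference"))
    return merged[merged["unit_price_extracted"] != merged["unit_price_reference"]]
```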

Next Steps

The plan is to continue to improve the PoC by allowing users to change GPT prompts. We have also suggested further refinements to improve file format processing and output data visualization.

Once fully implemented, the solution will let the client instantly process multiple cost files with minimal user assistance. The entire process will be nearly fully automated, eliminating most manual errors and significantly improving processing time. In the longer term, it will also be possible to stop using the third-party technology provider currently responsible for processing those files.

Once refined, the solution has the potential to save our client millions of dollars a year in overhead costs.

Drowning in Data?

Speak with our GenAI experts to make it manageable.
