Streamlining the Entire Invoice Processing Workflow: Uber’s Vision for Automation
The American ride-sharing giant Uber is contemplating the automation of its supplier invoice processing from start to finish. While no specific timeline has been announced, the company has already laid the groundwork by integrating Generative Artificial Intelligence (GenAI) into its invoice handling pipeline.
Previously, Uber’s system relied on Robotic Process Automation (RPA) combined with business rules. This approach was sufficiently effective when the formats of incoming invoices were relatively uniform and predictable. However, as the variety and complexity of invoice formats grew, the limitations of this solution became apparent, prompting Uber to seek more advanced capabilities.
Introducing TextSense: Elevating to Large Language Models (LLMs)
To scale efficiently, Uber has adopted a new platform named TextSense. This custom-developed document processing system extends beyond just handling invoices. Its modular design allows for easy customization, including the ability to embed templates tailored to different countries.
TextSense combines traditional Optical Character Recognition (OCR), provided by Uber’s internal Vision Gateway computer vision (CV) platform, with Large Language Models (LLMs), whether generative or not, hosted on Uber’s internal Michelangelo platform. This hybrid approach enables more nuanced understanding and extraction of data from diverse document types.
Uber integrated TextSense directly into its supplier invoice workflow. Incoming documents are first stored in an object storage system. Their content can be enriched before being converted into a standardized format suitable for processing. Depending on the document type, a configuration is injected through Flipr, Uber’s dynamic configuration service, to adjust the processing parameters. The system then applies the appropriate LLM prompt, and the model extracts the relevant data from the document.
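The flow described above can be sketched roughly as follows. This is a minimal illustration and not Uber’s implementation: the `DocumentConfig` type, the `CONFIGS` table, the prompt wording, the field names, and the `fake_llm` stand-in are all assumptions made for the example. A real system would call a hosted model where `fake_llm` is used here.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DocumentConfig:
    """Hypothetical per-document-type settings (stands in for an injected config)."""
    prompt_template: str
    fields: tuple


# Assumed configuration table keyed by document type.
CONFIGS = {
    "invoice": DocumentConfig(
        prompt_template="Extract {fields} as JSON from this document:\n{text}",
        fields=("invoice_number", "total", "currency"),
    ),
}


def extract(doc_type: str, text: str, llm: Callable[[str], str]) -> dict:
    """Pick the config for the document type, build the prompt, parse the reply."""
    config = CONFIGS[doc_type]
    prompt = config.prompt_template.format(
        fields=", ".join(config.fields), text=text
    )
    reply = json.loads(llm(prompt))
    # Keep only the fields the config asks for; missing ones become None.
    return {name: reply.get(name) for name in config.fields}


def fake_llm(prompt: str) -> str:
    """Stand-in for a hosted model; returns a canned JSON answer."""
    return json.dumps(
        {"invoice_number": "INV-001", "total": "120.00", "currency": "EUR", "extra": "x"}
    )


if __name__ == "__main__":
    print(extract("invoice", "Invoice INV-001 ... Total 120.00 EUR", fake_llm))
```

Keeping the prompt and the field list in a per-type config, as assumed here, means a new document type or country template only needs a new table entry and no change to the extraction code.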
Post-processing involves applying business rules to refine the extracted information. Human reviewers then validate and approve the data before it is finally entered into Uber’s Enterprise Resource Planning (ERP) system, ensuring accuracy and compliance.
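As an illustration of this post-processing step, the sketch below applies two example business rules and attaches any violations to the record handed to a human reviewer. The rules themselves, the field names (`invoice_number`, `total`, `line_items`, `amount`), and both function names are hypothetical; the article does not describe Uber’s actual rules.

```python
from decimal import Decimal


def check_invoice(extracted: dict) -> list:
    """Return a list of rule violations; an empty list means all rules passed."""
    problems = []
    if not extracted.get("invoice_number"):
        problems.append("missing invoice number")
    # Decimal avoids floating-point rounding errors when summing amounts.
    line_sum = sum(
        Decimal(item["amount"]) for item in extracted.get("line_items", [])
    )
    if line_sum != Decimal(extracted.get("total", "0")):
        problems.append("line items do not add up to total")
    return problems


def prepare_for_review(extracted: dict) -> dict:
    """Attach rule violations so a human reviewer sees them before approval."""
    return {"data": extracted, "flags": check_invoice(extracted), "approved": False}


if __name__ == "__main__":
    record = {
        "invoice_number": "INV-001",
        "total": "30.00",
        "line_items": [{"amount": "10.00"}, {"amount": "20.00"}],
    }
    print(prepare_for_review(record))
```

In this sketch nothing is approved automatically: the rules only flag issues, and the `approved` field stays `False` until a reviewer signs off, which matches the human validation step described above.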
Invoices are ingested either through a self-service upload portal for suppliers or via email. Incoming emails are routed into a ticketing system that forwards the attachments and the email body, providing context for the AI models to interpret the documents effectively.
Preferring Proprietary Models Over Open Source Solutions
To train its underlying LLMs, Uber utilized a year’s worth of historical invoice data. This dataset was divided into two categories: structured data—such as system-entered fields—and unstructured data, like text extracted from PDFs. Approximately 90% of this data was dedicated to training, with the remainder used for testing and validation.
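A 90/10 split of this kind can be sketched in a few lines. The function name, the fixed seed, and the use of a random shuffle are assumptions for illustration; the article does not say how Uber partitioned its data.

```python
import random


def split_dataset(records: list, train_fraction: float = 0.9, seed: int = 0):
    """Shuffle a copy of the records and split it into training and held-out parts."""
    shuffled = records[:]
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train, held_out = split_dataset(list(range(1000)))
    print(len(train), len(held_out))
```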
Uber evaluated several models, including seq2seq, LLaMA 2, and Flan T5. Flan T5, an open-source encoder-decoder (text-to-text) model, achieved over 90% accuracy on header fields (general invoice information), but its accuracy dropped on other parts of the invoices. Fine-tuning helped the model recognize patterns but sometimes led to “hallucinations,” or incorrect data generation.
Interestingly, the proprietary generative model GPT-4 performed better on the substantive content of invoices and proved more adaptable overall. It handles varied document formats and extracts detailed information more reliably, which points to future ensemble approaches in which multiple models work together to improve accuracy.
The company reports several key metrics from its pilot tests and deployment:
– An overall accuracy rate of 90% across the processed invoices, reaching 99.5% for a third of them.
– A 70% reduction in average processing time.
– A 50% reduction in the manual review workload.
– Cost savings estimated between 25% and 30% compared to manual processing.
Uber aims to fully automate the scenarios in which accuracy has already reached 100%. It also plans to add a document classification layer to better categorize invoices and prioritize processing workflows.
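One way such scenarios could be identified is to group past human-reviewed extractions and keep only the groups in which every extraction was correct. The sketch below assumes this approach; the grouping key (for example a supplier or template identifier), the `min_samples` guard, and the function name are hypothetical and do not come from Uber.

```python
from collections import defaultdict


def fully_accurate_groups(history, min_samples=50):
    """Return group keys whose reviewed extractions were all correct.

    `history` is an iterable of (group_key, was_correct) pairs taken from
    past human reviews. Groups with fewer than `min_samples` reviews are
    skipped, since a perfect score on a handful of invoices proves little.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for key, was_correct in history:
        totals[key] += 1
        correct[key] += int(was_correct)
    return {k for k, n in totals.items() if n >= min_samples and correct[k] == n}


if __name__ == "__main__":
    reviews = [("supplier-a", True)] * 60 + [("supplier-b", True)] * 59 + [("supplier-b", False)]
    print(fully_accurate_groups(reviews))
```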
This move signifies Uber’s commitment to transforming invoice management through advanced AI, reducing manual intervention, and improving operational efficiency.
Published by:
Clément Bohic