The Itemize API – Behind the Scenes

Written by Hitesh Chitalia, VP of Technology at Itemize

The Itemize API solves a well-known issue in expense management: the manual entry of financial values and dates for receipts and folios. We have all been there – it is the end of the month and you need to submit your expenses. Did you keep that receipt from the restaurant on that business trip earlier this month? Where did you save that hotel folio? If you have to type all of these in, it will take you an hour!

The Itemize API (and App) unlocks the data for you. Our proprietary OCR (optical character recognition) and ML (machine learning) processing extracts the Merchant, Date and Financial Values (Grand Total, Tax and Payment Type) and returns them in 30 seconds. Whether you snap a picture of your receipt using our mobile app on iOS or Android or use our API in your application, life is made simpler. The data you need for the accounting team, accounting software or reimbursement is ready for submission.

How does Itemize perform the complex task of extracting the payment information?

It is a mix of custom OCR, ML and a processing pipeline that lives in the cloud. This is the engine that services our mobile apps and our API. When a receipt is submitted, it enters a processing queue that scales up and down with volume. We strive to make sure you get your data in short order.
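To make the idea of a queue that "scales up and down with volume" concrete, here is a minimal Python sketch of one common way to size a worker pool from queue depth. It is an illustration only, not a description of how Itemize actually scales: the function name `desired_workers`, the `receipts_per_worker` ratio and the worker limits are all assumptions made for the example.

```python
import math


def desired_workers(queue_depth: int,
                    receipts_per_worker: int = 10,  # assumed ratio, illustrative only
                    min_workers: int = 1,           # assumed floor
                    max_workers: int = 50) -> int:  # assumed ceiling
    """Return how many workers to run for the current queue depth."""
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    # One worker per `receipts_per_worker` queued receipts, rounded up.
    needed = math.ceil(queue_depth / receipts_per_worker)
    # Clamp so the pool never drops below the floor or grows past the ceiling.
    return max(min_workers, min(max_workers, needed))
```

A scheduler could call a function like this periodically and add or remove workers to match the returned value, so that a burst of receipts grows the pool and a quiet period shrinks it back to the floor.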

As the receipt flows through the pipeline, it is pre-processed to “straighten” and “clear up” the image. If the image is already straight and clear, it helps the system that much more to identify the key data points. We also enforce lower and upper limits on the image file size to ensure that what is submitted can be processed. OCR then extracts the text from the image, and the extracted text is sent to the ML system. I wish I could explain all the data science that happens here, but I would not do it justice and our lawyers would redact it. After the ML system, we collect all the information and produce the JSON response that is returned by the API. We also provide a webhooks mechanism for asynchronous integrations.
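The stages described above can be sketched in Python as follows. This is a rough outline under stated assumptions, not the Itemize implementation: the size limits, the `Extraction` field names, the function names and the shape of the JSON are all hypothetical, and the pre-processing, OCR and ML stages are passed in as placeholder callables because their internals are not described in this post. The size check is placed first here purely for simplicity.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Optional

# Hypothetical limits; the real lower and upper bounds are not stated in this post.
MIN_IMAGE_BYTES = 10 * 1024
MAX_IMAGE_BYTES = 10 * 1024 * 1024


@dataclass
class Extraction:
    """Fields named in the post; the key names are assumptions."""
    merchant: Optional[str] = None
    date: Optional[str] = None
    grand_total: Optional[str] = None
    tax: Optional[str] = None
    payment_type: Optional[str] = None


def check_size(image: bytes) -> None:
    """Reject images outside the accepted file-size range."""
    size = len(image)
    if size < MIN_IMAGE_BYTES or size > MAX_IMAGE_BYTES:
        raise ValueError(f"image size {size} bytes is outside the accepted range")


def process_receipt(image: bytes,
                    preprocess: Callable[[bytes], bytes],
                    ocr: Callable[[bytes], str],
                    extract: Callable[[str], Extraction]) -> str:
    """Run one receipt through the stages and return a JSON string."""
    check_size(image)            # enforce lower and upper size limits
    cleaned = preprocess(image)  # "straighten" and "clear up" the image
    text = ocr(cleaned)          # turn the image into text
    fields = extract(text)       # ML step: pick out the key data points
    return json.dumps(asdict(fields))
```

With stub stages plugged in, a call such as `process_receipt(image_bytes, lambda b: b, my_ocr, my_extractor)` returns a JSON object whose keys match the `Extraction` fields, with `null` for anything the extractor could not find. In an asynchronous integration, the same JSON string could be the body delivered to the caller's webhook instead of being returned directly.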

There are challenges, of course. The engine needs constant tuning. There are cases where we can fall short (as our operations team is always pointing out). This is the exciting part of being at Itemize. Our developers are smart, energized and committed to delivering an accurate extraction. We learn from documents we were unable to process and generate training and feedback for the engine. This results in a smarter engine with better overall accuracy, which equates to happier customers.

To sum up, the Itemize API is a scalable, cloud-based payment data extraction engine. We use it in our apps, and customers use it to process receipts and spare their own customers the angst of manually entering receipt data. Try it out – you can contact sales or go to our offering in the AWS Marketplace.

If you would like to know more about what I have discussed here, feel free to connect with me on Twitter (@hitesh_itemize) or email me at hchitalia@itemizecorp.com.