Itemize is a FinTech data extraction engine used by companies to transform purchase documents, such as receipts and invoices, into data for their accounting and expense tracking needs. Recognized by Gartner as a top provider in the field, Itemize harnesses leading-edge Artificial Intelligence and Machine Learning to drive processing efficiency for a range of prominent clients in financial services, including credit cards, accounting software, and expense management.
Itemize operates a cloud-based processing service that involves both fully and partially automated systems for extraction, validation, and verification. The Itemize platform supports users in over 25 countries and numerous languages.
You are a smart and skilled Software Developer eager to join a team at the heart of the Itemize IP. You have a deep understanding of how components and processes work together and communicate with each other, including database queries. You know how to design systems that avoid bottlenecks and let your algorithms scale well with increasing volumes of data. You are able to evaluate tools, algorithms, and architecture and to make recommendations, based on data-driven findings, that will help us exceed our defined service levels for information extraction accuracy.
You thrive in a fast-paced environment and transition seamlessly between assigned tasks and production issues. You are able to build upon broadly stated requirements, independently weigh alternatives, and present data-driven solutions.
You are committed to the values of teamwork, integrity, and innovation.
As a Senior Software Developer on the Core Engine at Itemize, you will enhance existing software and build new components and services in an overall system for document classification and information extraction that helps clients automate their financial workflows and accounting. You will work closely with the Chief Data Scientist and the Production and Engineering teams to execute a variety of projects. The Senior Software Developer’s responsibilities will include:
- Develop, enhance, and support new and existing components, services, and modules for document classification and information extraction, along with relevant confidence scores
- Measure and validate confidence scores through rigorous and continuous testing
- Design and implement automated testing and feedback to improve model performance and information extraction with high confidence
- Identify and acquire training and testing sets for classification and extraction models
- Maintain documentation on components, processes, and test plans
- Interact regularly with the Production team to identify ways to improve extraction, reduce bottlenecks, and increase throughput
Required Skills & Experience
- Bachelor’s Degree in Computer Science or Engineering
- 4+ years of software development experience in a professional work environment
- Database: MySQL
- Languages: Java, Python
- Services: Message Brokers, RESTful APIs
- Frameworks: Spring Boot
- Packages: Excel (or equivalent), Pandas, Jupyter notebooks
- Experience with unit testing, continuous integration, and test-driven development
- Understanding of probability, statistics, and machine learning concepts
- Self-driven and self-guided in a small and dynamic work environment
- Experience with OpenCV or equivalent for image preprocessing
- Experience working with OCR software engines such as Tesseract
- One or more ML toolkits/libraries: scikit-learn, Theano, Spark MLlib, H2O, etc.
- Familiarity with coding best practices, OOD/OOP, modular design, SOA, and systems architecture