
UiForm: Building the Next Generation of Document Processing with LLMs
For over a year, our team at Cube has been hard at work on a simple mission: make it effortless to extract structured information from documents using Large Language Models (LLMs). While analyzing our clients’ shipping papers—everything from bills of lading to invoices—we realized that many of the biggest challenges we faced were universal:
- Documents Come in Every Format Under the Sun
PDFs, Excel sheets, emails, scanned images—each demanded specialized, custom parsing logic. It quickly became a headache to maintain and scale so many one-off solutions.
- Prompt Engineering is Fragile
Even when we successfully extracted the text, using LLMs to convert it into clean, reliable data was another hurdle. Slight tweaks to the prompt could cause wildly different results. We wanted consistent output every time, without having to babysit the model.
Those two pain points—document preprocessing and prompt engineering—pop up over and over in any real-world LLM project. So instead of continuing to build the same one-off solutions in-house, we decided to open up our tools to everyone. That’s how UiForm was born.
One Platform for All Your Documents
UiForm is an API that lets you upload files of any type (PDFs, spreadsheets, emails, images) and turns them into an LLM-ready format. Rather than sinking time into writing (and rewriting) custom parsers, you can simply pass your documents to UiForm. It automatically converts them into a text-based structure, preserving enough context for the LLM to do its job accurately.
By removing the friction of dealing with messy file inputs, UiForm eliminates the first big barrier to using LLMs in production. So whether you’re analyzing invoices, legal documents, or customer support emails, you can focus on what you want to extract instead of how to extract it.
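To make the idea of an "LLM-ready format" concrete, here is a minimal sketch of format normalization using only the Python standard library. It is not UiForm's implementation or API: the function names (`to_llm_text`, `csv_to_text`, `email_to_text`) and the pipe-separated text layout are assumptions chosen for illustration, and real PDFs or scanned images would need OCR and layout handling that this sketch leaves out.

```python
import csv
import io
from email import message_from_bytes, policy
from pathlib import Path


def csv_to_text(data: bytes) -> str:
    """Render CSV rows as pipe-separated lines so column order stays visible."""
    rows = csv.reader(io.StringIO(data.decode("utf-8"), newline=""))
    return "\n".join(" | ".join(cell.strip() for cell in row) for row in rows)


def email_to_text(data: bytes) -> str:
    """Keep the sender and subject as context, followed by the plain-text body."""
    msg = message_from_bytes(data, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    content = body.get_content().strip() if body is not None else ""
    return f"From: {msg['From']}\nSubject: {msg['Subject']}\n\n{content}"


def plain_to_text(data: bytes) -> str:
    return data.decode("utf-8")


# Hypothetical registry: one converter per file extension.
CONVERTERS = {".csv": csv_to_text, ".eml": email_to_text, ".txt": plain_to_text}


def to_llm_text(filename: str, data: bytes) -> str:
    """Pick a converter from the file extension and return plain text."""
    suffix = Path(filename).suffix.lower()
    if suffix not in CONVERTERS:
        raise ValueError(f"no converter registered for {suffix!r}")
    return CONVERTERS[suffix](data)
```

Calling `to_llm_text("rates.csv", b"port,fee\nOslo, 120\n")` yields two pipe-separated lines of text. The point of the registry is that every input, whatever its origin, ends up in one text representation before any prompt is built.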
Structuring Outputs with Schema-Driven Extraction
The second big problem we solved is getting consistent, trustworthy data from LLMs. If you’ve worked with AI text models, you know how easy it is for them to veer off track—especially when you ask for a specific JSON response or a precise set of fields.
UiForm tackles this using schema-based prompt engineering. You define your target fields in a JSON schema (or through a Pydantic model), and UiForm embeds guidance for the model directly inside that schema. For example, you can specify:
- System prompts that define the LLM’s overall role or behavior,
- Field prompts that clarify exactly how to fill in each field,
- Reasoning prompts that create “thinking space” for the LLM, improving accuracy on complex data extractions.
This approach keeps your prompts structured, reduces guesswork, and makes it far more likely you’ll get clean, valid JSON in return. Prompt engineering becomes less a matter of trial and error and more a systematic, repeatable process that scales.
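The three kinds of prompts described above can be pictured as annotations on an ordinary JSON schema. The sketch below is an illustration under stated assumptions, not UiForm's actual schema format: the key names `x-system-prompt`, `x-field-prompt`, and `x-reasoning-prompt`, the invoice fields, and the helpers `build_prompt` and `check_output` are all hypothetical. It uses only the standard library and checks just required fields and basic types, not full JSON Schema semantics.

```python
import json

# Hypothetical annotation keys; the real UiForm schema keys may differ.
INVOICE_SCHEMA = {
    "type": "object",
    "x-system-prompt": "You extract data from shipping invoices.",
    "properties": {
        "reasoning": {
            "type": "string",
            "x-reasoning-prompt": "Work out which total includes tax before answering.",
        },
        "invoice_number": {
            "type": "string",
            "x-field-prompt": "The identifier printed next to 'Invoice No.'",
        },
        "total_amount": {
            "type": "number",
            "x-field-prompt": "Grand total including tax, without a currency symbol.",
        },
    },
    "required": ["invoice_number", "total_amount"],
}

# Only the JSON types used in the schema above are mapped.
JSON_TYPES = {"string": str, "number": (int, float)}


def build_prompt(schema: dict) -> str:
    """Assemble one prompt from the system prompt and the per-field hints."""
    lines = [schema.get("x-system-prompt", ""), "Fields:"]
    for name, spec in schema["properties"].items():
        hint = spec.get("x-field-prompt") or spec.get("x-reasoning-prompt", "")
        lines.append(f"- {name} ({spec['type']}): {hint}")
    return "\n".join(lines)


def check_output(raw: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the output fits the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = [f"missing field: {n}" for n in schema.get("required", []) if n not in data]
    for name, value in data.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, JSON_TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}")
    return problems
```

Because the hints live next to the fields they describe, changing a field and changing its instructions happen in the same place, and the same schema that drives the prompt can also be used to reject malformed model output.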
A Community-Driven Approach
We see UiForm as more than just a product—it’s a movement toward open, collaborative document processing.
- Open Source Prompt Engineering Framework: We’ve released our prompt engineering tools so anyone can adopt and extend them. The community can share new schema patterns, solve tricky domain-specific problems, and collectively raise the bar for document extraction.
- Knowledge Sharing: By publishing schema templates for invoices, contracts, shipping documents, and medical forms, we aim to foster a library of best practices that anyone can tap into, regardless of industry or domain.
When we say “community-driven,” we genuinely mean it: we’re building UiForm in the open and sharing our roadmap on GitHub, encouraging feedback and contributions. If you have a particular workflow or document type you’re passionate about, we want to hear from you!
What’s Next: Email Forwarding & Automatic Extraction
One of our immediate goals is to make document ingestion even simpler. A prime example of this is the email channel: many businesses still receive crucial data via forwarded emails—think receipts, shipping notices, inquiries with attached PDFs, and so on.
We’re currently working on a new feature that will let you set up dedicated email addresses that automatically route incoming messages and attachments into UiForm. Once received, UiForm can parse the text, extract the data you care about using your chosen schema, and then send that structured result wherever you need it—an internal dashboard, a CRM, or even a custom webhook.
In other words, you’ll be able to:
- Forward your emails to UiForm
- Watch as UiForm extracts the relevant fields
- Automatically push that data into your existing systems
This will drastically reduce manual data entry, especially for companies juggling high volumes of email attachments and text requests. It’s one more way we’re aiming to remove friction and let you focus on building better experiences for your customers.
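Since this feature is still in development, the following is only a rough sketch of what the ingestion step of such a flow could look like, written with Python's standard `email` package. Everything here is an assumption for illustration: the function names, the payload fields, and the sample message are hypothetical, and the actual extraction and webhook delivery are left out.

```python
import json
from email import message_from_bytes, policy
from email.message import EmailMessage


def email_to_payload(raw: bytes) -> dict:
    """Parse a raw email into a JSON-serializable summary of body and attachments."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    attachments = [
        {
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "size_bytes": len(part.get_payload(decode=True) or b""),
        }
        for part in msg.iter_attachments()
    ]
    return {
        "sender": str(msg["From"]),
        "subject": str(msg["Subject"]),
        "text": body.get_content().strip() if body is not None else "",
        "attachments": attachments,
    }


def to_webhook_body(payload: dict) -> bytes:
    """Serialize the payload as UTF-8 JSON, ready to be sent to a receiving system."""
    return json.dumps(payload).encode("utf-8")


def build_sample_email() -> bytes:
    """Create a small forwarded-style message with one fake PDF attachment."""
    msg = EmailMessage()
    msg["From"] = "ops@example.com"
    msg["To"] = "inbox@example.com"
    msg["Subject"] = "Fwd: Bill of lading"
    msg.set_content("Please process the attached document.")
    msg.add_attachment(
        b"%PDF-1.4", maintype="application", subtype="pdf", filename="bol.pdf"
    )
    return msg.as_bytes()
```

Running `email_to_payload(build_sample_email())` returns a dictionary with the sender, subject, body text, and one attachment entry. In a real pipeline, the attachment bytes would be passed on for extraction against your schema rather than only measured.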
The Bigger Picture: “Stripe for Document Processing”
UiForm’s broader ambition is to be the “Stripe of document processing.” Just as Stripe simplified online payments, we want to simplify how companies handle any piece of unstructured data.
- Seamless Integration: We’re building out SDKs (Python first, Node.js coming soon) and intuitive workflows so you can drop UiForm into your infrastructure without friction.
- Robust Scaling: Whether you’re processing a few hundred invoices a month or thousands of pages of legal documents every day, UiForm is built to scale in a reliable, cost-effective way.
- Transparency & Control: You always maintain control of your data. We simply provide the best tools to transform it into structured formats that can then plug into your existing analytics, automation, or product experiences.
Where We’re Headed
Beyond email forwarding, our public roadmap includes:
- Node.js SDK for easier integration with JavaScript-based systems.
- Fine-tuning to let you train specialized LLMs on your unique document sets.
- Prompt Optimization tools that make it even easier to refine and test prompt strategies at scale.
- Data Labeling Platform to speed up creating or validating training data for specialized use cases.
- Webhooks API for real-time notifications whenever UiForm completes the extraction of a new document.
We’re not building these features in a silo. Community input is essential. Join our Discord or follow us on Twitter/X to share your thoughts, request new features, or show off what you’re building.
Join the UiForm Community
We believe that the future of document processing should be open, collaborative, and powered by the best of modern AI. UiForm is our step toward that future. Instead of repeating the same custom parsing or prompt-engineering work, we hope teams can now stand on each other’s shoulders and push the boundaries of what’s possible with LLMs.
- Sign up for a free UiForm account and get 1,000 free requests—enough to explore how easily you can integrate it into your workflows.
- Jump into our Discord to chat directly with our team and other developers tackling similar challenges.
- Keep an eye on our GitHub for new features, open issues, and transparent conversations about our roadmap.
We’re excited about what’s coming next—from email forwarding to advanced prompt engineering, and beyond. Document processing doesn’t have to be a tedious chore. With UiForm, it becomes an elegant, scalable, and open-ended platform for whatever you want to build.
Thank you for joining us on this journey—and stay tuned for what’s next!