File conversion APIs are a mess – here’s what I learned
I’ve been researching file conversion solutions for a project, and honestly? The pricing is wild. Here’s what I discovered and why I’m thinking about building an alternative.
The recurring problem
Every web project eventually needs file conversions:
- Users want to download invoices as PDF
- Data needs exporting to Excel
- Someone uploads a DOCX that needs to become HTML
And we always face the same choices:
- Pay for an API (expensive)
- DIY with open source (time sink)
- Say no to the feature (not great)
Current market options
I spent last week analyzing the main players:
Cloudmersive
- Starts at $19.99/month for 10k calls
- $199.99/month for 100k calls
- Hidden catch: Complex conversions can use multiple API calls
- A single PDF conversion might count as 2-5 calls depending on complexity
- Enterprise pricing gets steep fast
ConvertAPI
- ~$35/month for 1,000 conversions
- ~$90/month for 5,000 conversions
- Overage penalty: ~$0.06 per extra conversion (ouch!)
- At least they’re transparent about pricing
Zamzar
- $12/month for ~1,500 conversions (50/day limit)
- $39/month for ~15,000 conversions (500/day limit)
- Daily limits instead of monthly – weird for APIs
- Unclear API vs desktop app pricing
Reality check: For a small SaaS doing 5,000 conversions/month:
- Cloudmersive: $20-100/month (depends on the complexity multiplier)
- ConvertAPI: ~$90/month + brutal overage fees
- Zamzar: $39/month (if you stay under 500/day)
And none of them make it simple.
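To make the overage math concrete, here is a rough sketch that uses only the approximate prices quoted above. The plan objects and the `monthlyCost` and `callsNeeded` helpers are names I made up for illustration, and the figures are the ones from this post, not verified rates.

```javascript
// ConvertAPI plans as quoted in this post (approximate, not verified).
const convertApiPlans = [
  { name: '1,000/month plan', monthly: 35, included: 1000 },
  { name: '5,000/month plan', monthly: 90, included: 5000 },
];
const OVERAGE_PER_CONVERSION = 0.06; // ~$0.06 per extra conversion, per the post

// Monthly cost of one plan at a given volume, including overage fees.
function monthlyCost(plan, conversions) {
  const extra = Math.max(0, conversions - plan.included);
  return plan.monthly + extra * OVERAGE_PER_CONVERSION;
}

// Billed API calls when each conversion counts as several calls.
function callsNeeded(conversions, callsPerConversion) {
  return conversions * callsPerConversion;
}

for (const plan of convertApiPlans) {
  console.log(plan.name, '$' + monthlyCost(plan, 5000).toFixed(2));
}
console.log('calls at 2x:', callsNeeded(5000, 2), 'calls at 5x:', callsNeeded(5000, 5));
```

At 5,000 conversions the smaller plan works out to $275 once overage is added, against $90 for the larger one, so picking the wrong tier costs about three times as much. With a 2-5x call multiplier, the same 5,000 conversions become 10,000 to 25,000 billed calls, which sits at or above the 10k-call tier.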
The “just use LibreOffice” trap
Everyone’s first instinct: “Just shell out to LibreOffice!”
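const { execFile } = require('child_process');
const path = require('path');

// Converts `input` to PDF inside `outDir` and resolves with the output path.
// soffice names the result after the input file, so you choose the directory, not the filename.
function convertFile(input, outDir) {
  return new Promise((resolve, reject) => {
    // execFile takes arguments as an array, so filenames never pass through a shell
    const args = ['--headless', '--convert-to', 'pdf', '--outdir', outDir, input];
    execFile('soffice', args, (error) => {
      if (error) return reject(error);
      const name = path.basename(input, path.extname(input)) + '.pdf';
      resolve(path.join(outDir, name));
    });
  });
}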
Simple, right? Here’s what Stack Overflow and GitHub issues taught me about production reality:
Common problems people report:
- Memory leaks after ~100 conversions
- Random hangs on complex documents
- Processes that won’t terminate
- Can’t handle concurrent requests well
- Different outputs on different OS versions
One developer’s solution I found shows the real complexity:
// From a GitHub issue - handling the edge cases
async function convertWithRetry(input, output, attempt = 1) {
  const timeout = setTimeout(() => {
    exec('pkill -9 soffice'); // Nuclear option
  }, 30000);
  try {
    await convert(input, output);
    clearTimeout(timeout);
  } catch (error) {
    clearTimeout(timeout);
    if (attempt < 3) {
      await new Promise(r => setTimeout(r, 1000));
      return convertWithRetry(input, output, attempt + 1);
    }
    throw error;
  }
}
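One thing that bothers me about the snippet above: `pkill -9 soffice` kills every soffice process on the box, including conversions that were doing fine. Here is a narrower sketch I would try first. It leans on the `timeout` and `killSignal` options of Node's `execFile`, and it assumes that giving each job its own profile directory via `-env:UserInstallation` keeps parallel instances from stepping on each other (worth verifying on your LibreOffice version). `buildSofficeArgs`, `convertOnce`, and `withRetry` are hypothetical names, not from any library.

```javascript
const { execFile } = require('node:child_process');
const { pathToFileURL } = require('node:url');

// Hypothetical helper: builds the soffice argument list for one conversion.
// Assumption: a separate profile directory per job isolates parallel instances.
function buildSofficeArgs(input, outDir, profileDir) {
  return [
    `-env:UserInstallation=${pathToFileURL(profileDir).href}`,
    '--headless',
    '--convert-to', 'pdf',
    '--outdir', outDir,
    input,
  ];
}

// Runs one conversion; on timeout Node kills only this child process.
function convertOnce(input, outDir, profileDir, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    execFile(
      'soffice',
      buildSofficeArgs(input, outDir, profileDir),
      { timeout: timeoutMs, killSignal: 'SIGKILL' },
      (error) => (error ? reject(error) : resolve(outDir)),
    );
  });
}

// Generic retry wrapper: calls `task` up to `attempts` times with a fixed delay.
async function withRetry(task, attempts = 3, delayMs = 1000) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise((r) => setTimeout(r, delayMs));
      }
    }
  }
  throw lastError;
}
```

Usage would look like `withRetry(() => convertOnce('invoice.docx', '/tmp/out', '/tmp/lo-job-1'))`. One caveat: depending on how LibreOffice is packaged, `soffice` may be a wrapper that launches the real worker as a separate process, so check that the timeout actually reaps it before relying on this.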
A smarter approach I’m considering
Based on my research, here’s what seems to work best:
1. Use specialized libraries where possible
Many conversions don’t need heavy tools:
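// Images: Sharp is fantastic
import sharp from 'sharp';
await sharp('input.png').jpeg().toFile('output.jpg');
// CSV/Excel: XLSX handles most cases
import * as fs from 'fs';
import * as XLSX from 'xlsx';
XLSX.set_fs(fs); // the ESM build needs fs handed in before readFile/writeFile work
const workbook = XLSX.readFile('data.xlsx');
XLSX.writeFile(workbook, 'data.csv'); // format comes from the extension; a CSV holds a single sheet
// Markdown: Marked.js
import { marked } from 'marked';
const html = marked.parse(markdownContent); // markdownContent: a Markdown string you loaded earlier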
2. Cache everything cacheable
Many apps convert the same templates repeatedly:
// Simple caching strategy
// hash() and convert() are placeholders: any content hash and any converter will do
const cache = new Map();
async function convertWithCache(file, format) {
  const key = `${hash(file)}-${format}`;
  if (cache.has(key)) {
    return cache.get(key);
  }
  const result = await convert(file, format);
  cache.set(key, result);
  return result;
}
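The version above has two gaps I would want to close before production: the Map grows forever, and two simultaneous requests for the same file both miss the cache. Here is a rough sketch of one way to handle both, assuming files arrive as Buffers and the converter is any async function you pass in. `createConversionCache` and `maxEntries` are names I invented for this example.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical factory: wraps any async convert(buffer, format) function.
function createConversionCache(convert, maxEntries = 500) {
  const cache = new Map(); // key -> Promise of the converted output

  return function convertWithCache(buffer, format) {
    // Key on the file contents, so renamed copies of a template still hit.
    const digest = createHash('sha256').update(buffer).digest('hex');
    const key = `${digest}-${format}`;

    if (cache.has(key)) {
      // Re-insert so Map order tracks recency (oldest entry first).
      const hit = cache.get(key);
      cache.delete(key);
      cache.set(key, hit);
      return hit;
    }

    // Store the promise itself so concurrent requests share one conversion.
    const pending = Promise.resolve()
      .then(() => convert(buffer, format))
      .catch((error) => {
        // Do not keep failures around; the next request should retry.
        if (cache.get(key) === pending) cache.delete(key);
        throw error;
      });
    cache.set(key, pending);

    // Evict the least recently used entry once the cache is over capacity.
    if (cache.size > maxEntries) {
      cache.delete(cache.keys().next().value);
    }
    return pending;
  };
}
```

Caching the promise rather than the result is the design choice that matters here: a burst of requests for the same invoice template triggers one conversion instead of many. An in-memory Map only helps within one process, so a multi-server setup would need a shared store instead.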
3. Hybrid approach
Use APIs only for complex stuff:
async function smartConvert(file, targetFormat) {
  // Simple formats = use libraries
  if (isSimpleConversion(file, targetFormat)) {
    return convertLocally(file, targetFormat);
  }
  // Complex stuff = use API
  return convertViaAPI(file, targetFormat);
}
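The interesting part of that function is `isSimpleConversion`, which I left undefined. One possible shape, purely as a sketch: a lookup table of source/target pairs that a local library covers. The table contents mirror the library examples earlier in this post, and `LOCAL_CONVERSIONS`, `extensionOf`, and `createSmartConvert` are hypothetical names.

```javascript
// Hypothetical routing table: "from->to" pairs a local library can handle.
const LOCAL_CONVERSIONS = new Set([
  'png->jpg',
  'xlsx->csv',
  'md->html',
]);

// Lowercase extension without the dot, or '' when the name has none.
function extensionOf(filename) {
  const dot = filename.lastIndexOf('.');
  return dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
}

function isSimpleConversion(filename, targetFormat) {
  const pair = `${extensionOf(filename)}->${targetFormat.toLowerCase()}`;
  return LOCAL_CONVERSIONS.has(pair);
}

// The two converters are injected so the router can be tested without real tools.
function createSmartConvert({ convertLocally, convertViaAPI }) {
  return async function smartConvert(filename, targetFormat) {
    if (isSimpleConversion(filename, targetFormat)) {
      return convertLocally(filename, targetFormat);
    }
    return convertViaAPI(filename, targetFormat);
  };
}
```

Routing on the file extension is the weakest assumption here, since users upload mislabeled files all the time. Sniffing the actual content type would be more robust, at the cost of reading the file first.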
Making the numbers work
For a hypothetical SaaS with 5,000 monthly conversions:
- ~70% (about 3,500) might be cacheable (same templates)
- ~25% (about 1,250) could use simple libraries
- ~5% (about 250) would need a proper API or LibreOffice
Theoretical cost: ~$20-30/month vs. $80-100/month for an API-only setup
But more importantly: no daily limits, no overage surprises, no complexity multipliers.
Why I’m thinking about building something
After all this research, I keep thinking: why isn’t there a simple, fairly priced conversion API?
What I’d want:
- One endpoint for all conversions
- Pay only for what you use
- No monthly minimums
- Handles the edge cases properly
But before I spend months building this, I need to know: do others have this problem too?
Questions for you
- How do you handle file conversions currently?
- What’s your monthly volume?
- Would you pay for a simpler solution?
I’m genuinely curious about your experiences. Every project seems to solve this differently!
P.S. If you think a simpler conversion API would be useful, I’m collecting feedback at fileconvert.dev. No product exists yet – just trying to see if this is worth building.