File conversion APIs are a mess – here’s what I learned


I’ve been researching file conversion solutions for a project, and honestly? The pricing is wild. Here’s what I discovered and why I’m thinking about building an alternative.



The recurring problem

Every web project eventually needs file conversions:

  • Users want to download invoices as PDF
  • Data needs exporting to Excel
  • Someone uploads a DOCX that needs to become HTML

And we always face the same choices:

  1. Pay for an API (expensive)
  2. DIY with open source (time sink)
  3. Say no to the feature (not great)



Current market options

I spent last week analyzing the main players:

Cloudmersive

  • Starts at $19.99/month for 10k calls
  • $199.99/month for 100k calls
  • Hidden catch: Complex conversions can use multiple API calls
  • A single PDF conversion might count as 2-5 calls depending on complexity
  • Enterprise pricing gets steep fast

ConvertAPI

  • ~$35/month for 1,000 conversions
  • ~$90/month for 5,000 conversions
  • Overage penalty: ~$0.06 per extra conversion (ouch!)
  • At least they’re transparent about pricing

Zamzar

  • $12/month for ~1,500 conversions (50/day limit)
  • $39/month for ~15,000 conversions (500/day limit)
  • Daily limits instead of monthly – weird for APIs
  • Unclear API vs desktop app pricing

Reality check: For a small SaaS doing 5,000 conversions/month:

  • Cloudmersive: $20-100 (5,000 conversions at 2-5 calls each is 10k-25k calls, so it depends on the complexity multiplier)
  • ConvertAPI: ~$90/month + brutal overage fees
  • Zamzar: $39/month (if you stay under 500/day)

And none of them make it simple.
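
To make the comparison concrete, here is the back-of-the-envelope math as a small sketch. It only uses the approximate figures quoted above; the function name and its defaults are mine, not anything these vendors publish, so check current pricing before relying on it.

```javascript
// Figures are the approximate ones quoted in this post, not official price lists.
const conversions = 5000;

// Cloudmersive: one conversion may count as 2-5 API calls.
const callsLow = conversions * 2;
const callsHigh = conversions * 5;
console.log(callsLow, callsHigh); // 10000 25000

// ConvertAPI: ~$90 covers 5,000 conversions, then ~$0.06 per extra one.
// Amounts are kept in cents to avoid floating-point rounding.
function convertApiCostCents(count, included = 5000, baseCents = 9000, overageCents = 6) {
  const extra = Math.max(0, count - included);
  return baseCents + extra * overageCents;
}
console.log(convertApiCostCents(5000) / 100); // 90
console.log(convertApiCostCents(6000) / 100); // 150

// Zamzar: the limit is per day, so the daily average matters more than the monthly total.
const averagePerDay = Math.ceil(conversions / 30);
console.log(averagePerDay); // 167
```

A busy month of 6,000 conversions on the ConvertAPI numbers lands at about $150, and a single day above 500 conversions would hit the Zamzar cap even though the monthly average is far below it.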



The “just use LibreOffice” trap

Everyone’s first instinct: “Just shell out to LibreOffice!”

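const { execFile } = require('child_process');
const path = require('path');

function convertFile(input, outDir) {
  return new Promise((resolve, reject) => {
    // execFile passes arguments directly, so file names are never parsed by a shell
    const args = ['--headless', '--convert-to', 'pdf', '--outdir', outDir, input];
    execFile('soffice', args, (error) => {
      if (error) return reject(error);
      // soffice names the result after the input file, with a .pdf extension
      const name = path.basename(input, path.extname(input)) + '.pdf';
      resolve(path.join(outDir, name));
    });
  });
}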

Simple, right? Here’s what Stack Overflow and GitHub issues taught me about production reality:

Common problems people report:

  • Memory leaks after ~100 conversions
  • Random hangs on complex documents
  • Processes that won’t terminate
  • Can’t handle concurrent requests well
  • Different outputs on different OS versions

One developer’s solution I found shows the real complexity:

// From a GitHub issue - handling the edge cases
async function convertWithRetry(input, output, attempt = 1) {
  const timeout = setTimeout(() => {
    exec('pkill -9 soffice'); // Nuclear option
  }, 30000);

  try {
    await convert(input, output);
    clearTimeout(timeout);
  } catch (error) {
    clearTimeout(timeout);
    if (attempt < 3) {
      await new Promise(r => setTimeout(r, 1000));
      return convertWithRetry(input, output, attempt + 1);
    }
    throw error;
  }
}
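
For comparison, here is a rough sketch of how I would tackle two of the reported problems (stuck processes and concurrency) without a global pkill. This is not from the GitHub issue: enqueue and convertToPdf are names I made up, the binary option is only there so the command can be swapped out, and the 30-second timeout is copied from the snippet above. The timeout and killSignal options are standard execFile options in Node.

```javascript
const { execFile } = require('child_process');

// Run jobs one at a time: each job starts only after the previous one settles.
let queue = Promise.resolve();
function enqueue(job) {
  const run = queue.then(() => job());
  queue = run.catch(() => {}); // a failed job must not block the jobs behind it
  return run;
}

// Hypothetical helper; `binary` and `timeoutMs` are assumptions of this sketch.
function convertToPdf(input, outDir, { binary = 'soffice', timeoutMs = 30000 } = {}) {
  return enqueue(() => new Promise((resolve, reject) => {
    execFile(
      binary,
      ['--headless', '--convert-to', 'pdf', '--outdir', outDir, input],
      // On timeout Node kills this child only, not every soffice on the machine.
      { timeout: timeoutMs, killSignal: 'SIGKILL' },
      (error) => (error ? reject(error) : resolve(outDir))
    );
  }));
}
```

One caveat: on some installs soffice is a launcher that starts a separate process, so killing the direct child may not be enough. Treat this as a starting point, not a fix for everything on the list.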



A smarter approach I’m considering

Based on my research, here’s what seems to work best:



1. Use specialized libraries where possible

Many conversions don’t need heavy tools:

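// Images: Sharp is fantastic
import sharp from 'sharp';
await sharp('input.png').jpeg().toFile('output.jpg');

// CSV/Excel: SheetJS (xlsx) handles most cases
import * as XLSX from 'xlsx';
import * as fs from 'fs';
XLSX.set_fs(fs); // in Node's ESM build, readFile/writeFile need fs passed in first
const workbook = XLSX.readFile('data.xlsx');
XLSX.writeFile(workbook, 'data.csv'); // format is inferred from the extension; CSV gets the first sheet

// Markdown: Marked.js
import { marked } from 'marked';
const html = marked.parse(markdownContent); // markdownContent: a string of Markdown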



2. Cache everything cacheable

Many apps convert the same templates repeatedly:

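// Simple caching strategy: key on a hash of the file contents plus the target format
import { createHash } from 'crypto';

const cache = new Map();
const hash = (buffer) => createHash('sha256').update(buffer).digest('hex');

async function convertWithCache(file, format) {
  // `file` is assumed to be a Buffer; `convert` stands for whatever converter you use
  const key = `${hash(file)}-${format}`;

  if (cache.has(key)) {
    return cache.get(key);
  }

  const result = await convert(file, format);
  cache.set(key, result);
  return result;
}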



3. Hybrid approach

Use APIs only for complex stuff:

async function smartConvert(file, targetFormat) {
  // Simple formats = use libraries
  if (isSimpleConversion(file, targetFormat)) {
    return convertLocally(file, targetFormat);
  }

  // Complex stuff = use API
  return convertViaAPI(file, targetFormat);
}
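
The snippet above leaves isSimpleConversion undefined, so here is one way it could look. The routing table is purely illustrative: LOCAL_ROUTES and its entries are my assumptions about which format pairs a local library covers, based on the libraries mentioned earlier, not a tested compatibility list.

```javascript
const path = require('path');

// Hypothetical routing table: source extension -> target formats a local library can produce.
const LOCAL_ROUTES = {
  '.png': ['jpg', 'jpeg', 'webp'],
  '.jpg': ['png', 'webp'],
  '.xlsx': ['csv'],
  '.csv': ['xlsx'],
  '.md': ['html'],
};

function isSimpleConversion(file, targetFormat) {
  const ext = path.extname(file).toLowerCase();
  const targets = LOCAL_ROUTES[ext] || [];
  return targets.includes(targetFormat.toLowerCase());
}

console.log(isSimpleConversion('logo.PNG', 'jpg'));    // true
console.log(isSimpleConversion('report.docx', 'pdf')); // false
```

Keeping the rules in a plain table means moving a format from the API path to the local path is a one-line change.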



Making the numbers work

For a hypothetical SaaS with 5,000 monthly conversions:

  • ~70% might be cacheable (same templates)
  • ~25% could use simple libraries
  • ~5% would need proper API/LibreOffice

Theoretical cost: ~$20-30/month vs $80-100/month when every conversion goes through a paid API

But more importantly: no daily limits, no overage surprises, no complexity multipliers.
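
Here is the split above as plain arithmetic. The percentages are my guesses from this post, not measurements, and the variable names are only for illustration.

```javascript
const monthly = 5000;

// Assumed shares from the list above.
const split = { cached: 0.70, library: 0.25, paid: 0.05 };

// Round because share * volume is floating-point math.
const volumes = Object.fromEntries(
  Object.entries(split).map(([name, share]) => [name, Math.round(monthly * share)])
);

console.log(volumes); // { cached: 3500, library: 1250, paid: 250 }
```

So only about 250 conversions a month would reach a paid API or LibreOffice, which is where the savings would come from if the guesses hold.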



Why I’m thinking about building something

After all this research, I keep thinking: why isn’t there a simple, fairly-priced conversion API?

What I’d want:

  • One endpoint for all conversions
  • Pay only for what you use
  • No monthly minimums
  • Handles the edge cases properly

But before I spend months building this, I need to know: do others have this problem too?



Questions for you

  1. How do you handle file conversions currently?
  2. What’s your monthly volume?
  3. Would you pay for a simpler solution?

I’m genuinely curious about your experiences. Every project seems to solve this differently!


P.S. If you think a simpler conversion API would be useful, I’m collecting feedback at fileconvert.dev. No product exists yet – just trying to see if this is worth building.


