Oryx Translation · Document Studio
A free utility that pulls Arabic text out of a PDF and into an editable Word file. Expect rough edges — broken layouts, ligature errors, and OCR mistakes are unavoidable. Polish the rest in Word. For higher accuracy, use the Mistral OCR option below — free to set up.
Tesseract (the free engine) gets Arabic text into Word, but mangles ligatures, breaks tables, and misreads many words. Mistral OCR reads Arabic at near-human accuracy. The free Experiment tier covers typical use — no credit card, just an email and phone number.
Already have a key? Click the button — your saved key auto-fills.
PyMuPDF parses every page. Pages with embedded text use the digital extraction path; image-only pages run through OCR. You don't pick — the tool decides per page.
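The per-page decision described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `MIN_TEXT_CHARS` threshold and the function names `choose_path` and `route_pages` are assumptions made for this example. Only `fitz.open` and `page.get_text("text")` are real PyMuPDF calls.

```python
from typing import List

MIN_TEXT_CHARS = 20  # hypothetical threshold; tune it for your own documents


def choose_path(extracted_text: str, min_chars: int = MIN_TEXT_CHARS) -> str:
    """Return "digital" if the page carries enough embedded text, else "ocr"."""
    # Drop all whitespace so an empty or whitespace-only text layer counts as no text.
    visible = "".join(extracted_text.split())
    return "digital" if len(visible) >= min_chars else "ocr"


def route_pages(pdf_path: str) -> List[str]:
    """Classify every page of a PDF, one decision per page."""
    import fitz  # PyMuPDF; imported lazily so choose_path works without it

    with fitz.open(pdf_path) as doc:
        return [choose_path(page.get_text("text")) for page in doc]
```

A threshold on the embedded text length is only one possible heuristic. A scanned page with a poor hidden text layer would pass this check and still need OCR.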
Digital pages go through pdf2docx — simple layouts usually survive, complex ones (multi-column, footnotes, nested tables) often break. Scanned pages run through Tesseract 5 (Arabic + English); accuracy depends heavily on scan quality.
RTL direction is tagged at section, paragraph, and run level. Common Arabic extraction errors (lam-alef ligature flips, spurious hamzas, justification tatweels) are cleaned up in post-processing. You receive a Word file you can edit further.
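Two of the cleanup steps mentioned above can be illustrated with the Python standard library alone. This is a sketch under assumptions, not the tool's real post-processor. It strips tatweels (U+0640) and folds presentation-form letters, including the lam-alef ligatures at U+FEF5 to U+FEFC, back to base letters with NFKC normalization. Spurious-hamza handling is left out because the text does not state its rules. The name `clean_arabic` is hypothetical.

```python
import re
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL, the elongation stroke used for justification

# Letters and lam-alef ligatures in Arabic Presentation Forms-B (U+FE80..U+FEFC).
_PRESENTATION_LETTERS = re.compile("[\uFE80-\uFEFC]+")


def clean_arabic(text: str) -> str:
    """Strip tatweels and fold presentation-form glyphs back to base letters."""
    text = text.replace(TATWEEL, "")
    # NFKC is applied only to presentation-form runs; other text is left untouched.
    return _PRESENTATION_LETTERS.sub(
        lambda m: unicodedata.normalize("NFKC", m.group(0)), text
    )
```

For example, `clean_arabic("\uFEFB")` yields the two-letter sequence lam (U+0644) followed by alef (U+0627), which Word can then reshape on its own.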
Direction is tagged at section, paragraph, and run level. Word usually renders Arabic correctly, but mixed-language paragraphs and unusual fonts can still misbehave — check before sending.
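For readers curious what the three tagging levels look like inside a .docx, here is a small sketch that builds the relevant WordprocessingML elements with the standard library. In that markup, `w:bidi` inside `w:sectPr` marks a right-to-left section, `w:bidi` inside `w:pPr` marks a right-to-left paragraph, and `w:rtl` inside `w:rPr` marks a right-to-left run. The helper names `w`, `rtl_paragraph`, and `rtl_section` are made up for this example, and the tool itself may well produce these elements by other means.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Qualify a tag name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def rtl_paragraph(text: str) -> ET.Element:
    """Build a paragraph tagged RTL at paragraph level and at run level."""
    p = ET.Element(w("p"))
    ET.SubElement(ET.SubElement(p, w("pPr")), w("bidi"))  # paragraph level
    r = ET.SubElement(p, w("r"))
    ET.SubElement(ET.SubElement(r, w("rPr")), w("rtl"))   # run level
    ET.SubElement(r, w("t")).text = text
    return p


def rtl_section() -> ET.Element:
    """Build section properties tagged RTL at section level."""
    sect = ET.Element(w("sectPr"))
    ET.SubElement(sect, w("bidi"))
    return sect
```

Calling `ET.tostring(rtl_paragraph("..."), encoding="unicode")` prints the fragment, which is handy when checking why a mixed-language paragraph renders in the wrong direction.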
Single-column body text usually survives. Multi-column layouts, footnotes, and complex tables often break or merge. Plan to redo non-trivial layout in Word.
Tesseract 5 reads Arabic + English. Quality varies widely with scan quality, font, and text justification. We post-process common Arabic OCR bugs (ligature flips, spurious hamzas), but expect mistakes. Use Mistral OCR for materially better results.
Files are removed from disk within 60 minutes. No accounts, no logs of file content, no third parties. Your API key is used once per request and never stored on our server.
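A retention rule like the 60-minute window above is often enforced by a periodic sweep. The sketch below shows one way that could look. It is an assumption for illustration, not a description of how this service is built. `purge_old_files` is a hypothetical name, and a real deployment would need a lower age limit or a frequent schedule so that sweep time plus file age stays within the promised window.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_SECONDS = 60 * 60  # the 60-minute window stated above


def purge_old_files(
    folder: str,
    max_age: float = MAX_AGE_SECONDS,
    now: Optional[float] = None,
) -> List[str]:
    """Delete regular files older than max_age seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(folder).iterdir():
        # Age is measured from the last modification time of the file.
        if path.is_file() and now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

The `now` parameter exists so the function can be exercised without waiting an hour.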
No signup, no email. Drop a file, get the .docx. For more pages or better Arabic accuracy, bring your own Mistral key — free to set up, takes 3 minutes.
Made by Oryx Translation — a studio that does this work daily. We know which corners can be cut and which can't. This tool removes the first 5% of any Arabic translation job. The rest is yours.