I noticed on the Arabic example they lost a space after the first letter on the third to last line, can any native speakers confirm? (I only know enough Arabic to ask dumb questions like this, curious to learn more.)
Edit: it looks like they also added a vowel mark not present in the input on the line immediately after.
Edit2: here's a picture of what I'm talking about, the before/after: https://ibb.co/v6xcPMHv
I am pretty sure it added a kasrah not present in the input on the 2nd to last line. (Not saying it's not super impressive, and also that almost certainly is the right word, but I think that still means not quite "perfect"?)
Yep, and فمِنا too, this is not just OCR, it made some post-processing corrections or "enhancements".
That could be good, but it could also be trouble the 1% chance it makes a mistake in critical documents.
And here I thought after reading the headline: finally a reliable Arabic OCR. I've never in my life found a good that does the job decently especially for a scanned document. Or is there something out there I don't know about?
Edit: it looks like they also added a vowel mark not present in the input on the line immediately after.
Edit2: here's a picture of what I'm talking about, the before/after: https://ibb.co/v6xcPMHv