The Saint-Olympe French Cursive Rhum OCR Challenge

Saint-Olympe (M.).—Notes sur la fabrication du rhum a la Martinique a la fin du XIX (Inedite).

A distiller from Martinique wrote an incredible historical document titled “Notes on the manufacture of rum in Martinique at the end of the 19th century”. It appears to be largely a reflection on the period before the Mount Pelée eruption of 1902 that decimated Saint-Pierre and killed 40,000 people. The Wikipedia details of the eruption are startling. The eruption also coincided with economic challenges for the sugar industry of Martinique that would go on to shape the surviving industry. Due to the rise of sugar beets in France, many sugar factories would shut down, reducing the domestic supply of molasses. Martinique would also see the formation of a quota system for exported rhum. What this all means is that Saint-Olympe looks back on a dramatically changed industry.

This is an unedited manuscript, so what we get is about 80 pages of clearly written cursive text. I translate French fairly well, but I cannot read cursive. Translating the text, or even making it readable to young French people, requires creating a new digital transcription. We can either convince our French & Francophile grandmothers to work on it, page by page, or some slick techie can bring in the mythic AI OCR (if AI is not full of shit).

AI OCR is not obvious off the shelf. You can google until you’re out of breath, but you will not find easy-to-use freeware, ready to go. You can burn a lot of time exploring tools like Microsoft OneNote and come up dry. What is it going to take? I’m not sure. An Adobe product? Adobe seems to punt, claiming only that they can create a PDF from an image that is then easy to process with someone else’s cursive OCR.

Do we simply need a retirement home of ten old ladies doing eight pages apiece? Probably.

I leave this challenge to someone else, and I will volunteer my time to translate the text from French and interpret it. This would make an exciting case study with a captivating backstory for anyone trying to prove a cursive OCR method. Journalists would no doubt fawn over the output as the distilling industry mines the document for insights. So why not score a delicious puff piece for your cursive OCR project?

Good luck!

