Show HN: Adventures in OCR
These past few weeks I've been working on OCRing an ancient book, a late 19th-century edition of 18th-century memoirs in French: Les Mémoires de Saint-Simon.
Saint-Simon was a courtier in Versailles during the last part of the reign of Louis XIV; his enormous memoirs (over 3 million words) are a first-hand testimony of this time and place, but are more revered today for their literary value than for their accuracy. They have had a profound influence on the most prominent French writers of the 19th and 20th centuries, including Chateaubriand, Stendhal, Hugo, Flaubert, the Goncourt brothers, Zola, and of course Proust, whose entire project was to produce a new, fictitious version of the memoirs for his time. The French National Library (Bibliothèque nationale) scanned these physical books years ago, and they're available online, but only as images and through a pretty clunky interface that makes reading quite difficult.