Advanced text features and PDF


The basic text model of PDF is quite nice. On the other hand, its basic design was a very late-80s "ASCII is everything everyone really needs, but we'll be super generous and provide up to 255 glyphs using a custom encoding that is not in use anywhere else". Most of the time (in most scripts, anyway) the source text's Unicode codepoints get mapped 1:1 to font glyphs in the final output. For an extra challenge, when several letters are rendered as a single ligature glyph such as ffi, you need to write an ActualText tag in the PDF command stream so that users who copy and paste that text get the original form with each individual letter rather than the ffi Unicode glyph.
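As a rough illustration of the ActualText idea, the sketch below builds the marked-content operators that could wrap a ligature glyph in a content stream. It is only a minimal sketch under stated assumptions: the helper name `actual_text_span` and the glyph code `0123` are hypothetical, the font is assumed to use two-byte glyph codes, and a real PDF writer also has to handle font selection, positioning, and the rest of the stream.

```python
def actual_text_span(original: str, glyph_hex: str) -> str:
    """Wrap one glyph-showing operator in a span that carries ActualText.

    original  -- the text a reader should get when copying (e.g. "ffi")
    glyph_hex -- hex code of the glyph in the current font (hypothetical here)
    """
    # PDF text strings can be written as UTF-16BE with a leading byte
    # order mark; here the bytes are emitted as a PDF hex string.
    payload = ("\ufeff" + original).encode("utf-16-be").hex().upper()
    return (
        f"/Span << /ActualText <{payload}> >> BDC\n"  # begin marked content
        f"<{glyph_hex}> Tj\n"                         # show the ligature glyph
        "EMC"                                         # end marked content
    )


# 0123 is a made-up glyph code standing in for the font's ffi ligature.
print(actual_text_span("ffi", "0123"))
```

A viewer that honors ActualText should then hand the three letters f, f, i to the clipboard instead of the single ligature character, although extraction behavior varies between viewers.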
