Google to OCR PDF docs + a bit about imaginary books

Google has announced that they will be using OCR to index scanned PDF docs; see the following links for more info: http://googlesystem.blogspot.com/2008/10/google-uses-ocr-to-index-pdf-files.html and http://arstechnica.com/news.ars/post/20081031-google-turns-on-ocr-for-scanned-pdfs.html
block quote start Google has now decided that its open-source OCRopus technology, based on software called "Tesseract" that HP developed, is up to the task of indexing scanned documents that can contain any mixture of text, images, and coffee stains. block quote end
...But now I want Google to OCR the Book of G'Kar. Although, as far as imaginary books (http://en.wikipedia.org/wiki/List_of_fictional_books) are concerned, what I really want to read is Jerzy Hacek's Dangerous Knowledge stories, so I can find out more about the Librarians Militant.
Current Location: aerye
Current Mood: wistful
Tags: accessibility-pdf, cross-referencing, googlebooks, librarians militant