I feel the pain ... NONE
of the professional OCR systems is able to live up to my high standards. not even ABBYY FineReader is able to OCR a clean PDF file 100% correctly. so ... I stumbled on this doc: ocropus-internet-archive.pdf
and I see that I am no longer alone ... big Google is thinking the same ... proprietary OCR applications suck ... the open & GPL counterparts are not in good shape either ... the net result: OCRopus
and hope can see the light again. with the interest and resources of a giant like Google, bit by bit, things will, I hope, get better in the near future ...
what sparked my attention in the ocropus-internet-archive.pdf
doc? the statement: closed source = can't be improved
hooray, and now Google sees the light, like all of us ... but ... after a lot of work at http://books.google.com
is happening, right now!
"First they ignore you, then they laugh at you, then they fight you, then you win." Gandhi
Scientific publishing corporations get nervous
open access to all human knowledge is happening at an increasing rate. no gate can keep it locked anymore ... the big O is more and more powerful and swamps everything in its way ...