Post by totedati » Tue Sep 04, 2007 21:10

i feel the pain ... NONE of the professional OCR systems is able to live up to my high standards. not even abbyy fine reader is able to ocr a clean pdf file 100% correctly. so ... i came across these docs:


and i see that i am no longer alone ... the big google is thinking the same ... proprietary ocr applications suck ... the open & gpl counterparts are also not in good shape ... the net result:


and hope can see the light again. with interest and resources from a giant like google, things will, i hope, get better bit by bit in the near future ...

what sparked my attention in the ocropus-internet-archive.pdf docs? the statement:
closed source = can't be improved

hooray, and now google sees the light, like all of us ... but ... after a lot of work in

is happening, right now!
Scientific publishing corporations get nervous

open access to all human knowledge is happening at an increasing rate. no gate can keep it locked anymore ... the big O is more and more powerful and swamps everything in its path ...
