The Naked Scientists

Optical character recognition

Offline syhprum (OP)

Optical character recognition
« on: 30/11/2012 14:23:49 »
Does anyone have any OCR experience? I have Acrobat 9.0 and SimpleOCR, but neither makes much of the attached image. I also have an old 16-bit program, Bitware, that was very good with faxes and might run on a 32-bit system; I will try it if I can find the CD.

* Capture.JPG (40.49 kB, 387x450 - viewed 834 times.)

Offline RD

Re: Optical character recognition
« Reply #1 on: 30/11/2012 14:57:57 »
You shouldn't use the JPG format for images with text, as it uses lossy compression, which degrades the image and makes the text difficult to read (for humans and OCR). Lossless compression formats like PNG, GIF or BMP should be used for images with text ...
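
As a rough illustration of that difference, here is a minimal Python sketch using only the standard library. It does not run real JPEG: `crude_lossy` is a hypothetical stand-in that simply averages neighbouring pixels, and the 8x8 bitmap is made up. The lossless half uses zlib's DEFLATE, which is the compression PNG uses internally.

```python
import zlib

# A made-up 8x8 greyscale bitmap: 0 = black ink, 255 = white paper.
row = [255, 255, 0, 0, 255, 0, 255, 255]
pixels = bytes(row * 8)

# Lossless: DEFLATE round-trips every byte exactly.
restored = zlib.decompress(zlib.compress(pixels))
lossless_ok = restored == pixels


def crude_lossy(data: bytes) -> bytes:
    """Hypothetical stand-in for lossy coding: average each pair of pixels."""
    out = bytearray()
    for i in range(0, len(data), 2):
        avg = (data[i] + data[i + 1]) // 2
        out += bytes([avg, avg])
    return bytes(out)


lossy = crude_lossy(pixels)

print("lossless round trip identical:", lossless_ok)
print("lossy result identical:       ", lossy == pixels)
print("original row:", list(pixels[:8]))
print("lossy row:   ", list(lossy[:8]))
```

The sharp black/white edge in the original row turns into mid-grey in the lossy row, and nothing in the lossy data says which pixel was black, so no later processing can put it back with certainty.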

* Capture-tweaked.png (66.35 kB, 774x900 - viewed 2537 times.)
* PDF with lossless compression of text image.pdf (69.79 kB - downloaded 290 times.)
« Last Edit: 30/11/2012 15:09:45 by RD »

Offline syhprum (OP)

Re: Optical character recognition
« Reply #2 on: 30/11/2012 17:29:19 »
This text panel was a screen grab from Facebook. I was hoping I could advance my Geek status by producing a nice clean text version with OCR. I can of course improve its visibility with Photoshop et al., but I wanted to see some OCR in action.

Offline RD

Re: Optical character recognition
« Reply #3 on: 01/12/2012 06:37:38 »
Quote from: syhprum on 30/11/2012 17:29:19
This text panel was a screen grab from Facebook. I was hoping I could advance my Geek status by producing a nice clean text version with OCR. I can of course improve its visibility with Photoshop et al., but I wanted to see some OCR in action.

When you make a screengrab you can save it in a lossless compression format like GIF or PNG rather than lossy JPG. After conversion to JPG, irreversible damage has occurred to the image data, which even Photoshop can't totally reverse.

I tried a randomly selected free online OCR service with your blurry image. Unsurprisingly, it failed miserably to convert the barely readable small text; surprisingly, it also made mistakes with the giant heading text: "Students" => "StudonW" ...

Quote from: free-online-ocr.com
by High School StudonW

1
2
3
4

S
6
7
8

9
10
11

12
http://www.free-online-ocr.com/

To be fair, that website can OCR a (GIF) screengrab of its own pages ...

* can OCR a GIF screengrab of own webpage (except ''recognit ion'').gif (15.81 kB, 720x466 - viewed 1544 times.)
« Last Edit: 01/12/2012 07:12:44 by RD »

Offline syhprum (OP)

Re: Optical character recognition
« Reply #4 on: 01/12/2012 09:18:28 »
I did of course use a lossless save (.png) when I made the original screen grab, but when I sent the test picture to the forum I had to use .jpg because of the large file size.
I am surprised that an expensive program like Adobe Acrobat 9 Pro Extended could not do better.
I am going to do further tests with my own scans. Unfortunately I cannot run the old Bitware program, as it needs Windows 2000 or some olde worlde system that I do not have set up anywhere.
   






Offline RD

Re: Optical character recognition
« Reply #5 on: 01/12/2012 10:19:29 »
Quote from: syhprum on 01/12/2012 09:18:28
I did of course use a lossless save (.png) when I made the original screen grab, but when I sent the test picture to the forum I had to use .jpg because of the large file size.

If you lower the number of colours, aka "bit-depth", with PNG or GIF you get a much smaller file size.
The default PNG setting is often full photographic colour (24-bit; even the 8-bit option gives 256 colours), which is an unnecessarily high bit-depth for a text image. Greyscale with 4 "colours" (4 shades of grey) is sufficient for text.

You may have a preset called "web optimised" to reduce the size of a PNG image file by reducing the number of colours (the size is reduced greatly: to 1/6th) ...
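
For anyone who wants to see the bit-depth effect without an image editor, here is a minimal Python sketch using only the standard library. It is not what any "web optimised" preset actually does; `png_chunk` and `grey_png` are hypothetical helpers that write a bare-bones greyscale PNG, and the 64x64 random-grey test image is a made-up stand-in for a noisy screenshot. The same picture is written once at 8 bits per pixel and once reduced to 4 shades of grey (2 bits per pixel).

```python
import random
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def grey_png(rows, bit_depth: int) -> bytes:
    """Encode rows of grey samples (0 .. 2**bit_depth - 1) as a greyscale PNG."""
    height, width = len(rows), len(rows[0])
    per_byte = 8 // bit_depth
    raw = bytearray()
    for row in rows:
        raw.append(0)  # filter type 0 (none) starts every scanline
        for i in range(0, width, per_byte):
            group = row[i:i + per_byte]
            byte = 0
            for sample in group:
                byte = (byte << bit_depth) | sample  # leftmost pixel in high bits
            byte <<= bit_depth * (per_byte - len(group))  # pad a short last group
            raw.append(byte)
    # IHDR: width, height, bit depth, colour type 0 (greyscale), then
    # compression, filter and interlace methods, all 0.
    header = struct.pack(">IIBBBBB", width, height, bit_depth, 0, 0, 0, 0)
    return (PNG_SIGNATURE
            + png_chunk(b"IHDR", header)
            + png_chunk(b"IDAT", zlib.compress(bytes(raw), 9))
            + png_chunk(b"IEND", b""))


# Made-up test image: 64x64 random grey levels (assumption, not a real screenshot).
rng = random.Random(0)
grey8 = [[rng.randrange(256) for _ in range(64)] for _ in range(64)]
grey2 = [[v >> 6 for v in row] for row in grey8]  # keep only 4 shades of grey

png8 = grey_png(grey8, 8)
png2 = grey_png(grey2, 2)
size8, size2 = len(png8), len(png2)
print("8-bit greyscale PNG:", size8, "bytes")
print("2-bit greyscale PNG:", size2, "bytes")
```

The exact saving depends on the image; a real text screenshot compresses very differently from this random test pattern, so treat the printed sizes only as a demonstration that fewer bits per pixel means less data to store.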

* ''web optimised'' PNG images are a sixth the size.gif (24.17 kB, 801x310 - viewed 1598 times.)

Offline Mazurka

Re: Optical character recognition
« Reply #6 on: 05/12/2012 09:26:42 »
Cheers RD, that is a good tip.

Offline phil2000

Re: Optical character recognition
« Reply #7 on: 27/10/2014 12:48:32 »
Hey, in case anyone comes across such a problem again: the following tool, https://www.ocrgeek.com/, is really handy for doing OCR with PDFs.
©The Naked Scientists® 2000–2017 | The Naked Scientists® and Naked Science® are registered trademarks created by Dr Chris Smith. Information presented on this website is the opinion of the individual contributors and does not reflect the general views of the administrators, editors, moderators, sponsors, Cambridge University or the public at large.