Autofile

Discussion in 'New Release and Beta Release Information' started by Graham, Apr 10, 2008.

  1. Graham

    Graham Developer Staff Member

    Build 18
    • Option to use Tesseract on the regions after the form type has been recognized. Use this on high-quality, deskewed scans. It speeds up recognition considerably.
    • Regions are now cropped before being OCR'd, and any regions that are pure white are no longer submitted.
    • Source files are now renamed and copied to the local directory before running Ghostscript/ImageMagick, as some filenames generated by PaperPort would cause an error.
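To illustrate the rename-and-copy step above, here is a minimal Python sketch. It is not Autofile's actual code. The function name `safe_local_copy`, the `autofile_` temp-directory prefix, and the set of characters treated as safe are all assumptions, since the post does not say which PaperPort filenames caused the error.

```python
import re
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def safe_local_copy(source: Path, work_dir: Optional[Path] = None) -> Path:
    """Copy `source` into a local working directory under a conservative name.

    Hypothetical helper: the original file is left untouched, and the copy
    is what would be handed to Ghostscript or ImageMagick.
    """
    if work_dir is None:
        # Assumption: any fresh local temp directory is acceptable.
        work_dir = Path(tempfile.mkdtemp(prefix="autofile_"))
    work_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: keep letters, digits, dot, dash and underscore; collapse
    # every other run of characters into a single underscore.
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", source.stem).strip("_") or "scan"
    target = work_dir / f"{stem}{source.suffix.lower()}"

    shutil.copy2(source, target)  # overwrites an existing file of that name
    return target
```

Under these assumptions a file called `Doc (3) & more.PDF` would be copied as `Doc_3_more.pdf`. Two different sources can map to the same safe name, so a real implementation would probably need a uniqueness suffix.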
  2. Jason

    Jason Developer / Handyman Staff Member

    What's the cropping process?
  3. Graham

    Graham Developer Staff Member

    It removes any partial word fragments, or junk, above and below the area of interest.
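One way the vertical crop described above could work is sketched below in Python. This is a guess at the technique, not the actual Autofile implementation. It assumes the region has already been binarized into rows of 0/1 pixels (1 = ink), and that clipped fragments show up as bands of ink touching the top or bottom edge of the box. The names `ink_bands` and `crop_vertically` are made up for illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = white


def ink_bands(profile: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each run of True."""
    bands = []
    start = None
    for i, has_ink in enumerate(profile):
        if has_ink and start is None:
            start = i
        elif not has_ink and start is not None:
            bands.append((start, i))
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def crop_vertically(region: Image) -> Optional[Image]:
    """Drop bands of ink touching the top or bottom edge, keep the rest.

    Returns None for a pure white region so the caller can skip OCR.
    """
    profile = [any(row) for row in region]  # does each row contain ink?
    bands = ink_bands(profile)
    if not bands:
        return None  # pure white: nothing worth submitting

    height = len(region)
    # Bands that touch neither edge are assumed to be complete text lines.
    interior = [(s, e) for s, e in bands if s > 0 and e < height]
    keep = interior or bands  # if everything touches an edge, keep it all
    top, bottom = keep[0][0], keep[-1][1]
    return region[top:bottom]
```

With this heuristic a taller box is safe as long as there is a white gap between the wanted line and its neighbours. If the box swallows an entire neighbouring line, that line counts as an interior band and is kept.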

  4. Jason

    Jason Developer / Handyman Staff Member

    Sorta like I depict here?

    [IMG]

    Should I make my "Zonal OCR boxes" bigger for more accuracy?
  5. Graham

    Graham Developer Staff Member

    That's the intent: to allow taller boxes so we can be sure we are getting the desired text.

    But if we get all the text of the line above, then that'll defeat the purpose.
  6. Jason

    Jason Developer / Handyman Staff Member

    Zonal OCR: a fine balance. Some differential equation should allow us to figure out the ideal size :)

  7. Graham

    Graham Developer Staff Member

    Bumped to build 19

    If the date OCR fails on Tesseract, Autofile now switches back to Web OCR for another try.

  8. Graham

    Graham Developer Staff Member

    Build 20

    Switches back to Web OCR if the name contains non-alpha characters.
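The fallback behaviour in builds 19 and 20 can be sketched roughly as below. This is only an illustration in Python. The two OCR engines are passed in as plain callables because their real interfaces are not shown in this thread, and the accepted date formats and the characters allowed in a name (letters, spaces, hyphens, apostrophes) are assumptions.

```python
from datetime import datetime
from typing import Callable

OcrEngine = Callable[[object], str]  # hypothetical: takes an image, returns text


def looks_like_date(text: str) -> bool:
    """Assumed check: the text parses under one of a few day-first formats."""
    for fmt in ("%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y"):
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def looks_like_name(text: str) -> bool:
    """Assumed check: only letters plus space, hyphen and apostrophe."""
    return bool(text) and all(ch.isalpha() or ch in " -'" for ch in text)


def ocr_with_fallback(
    image: object,
    primary: OcrEngine,
    fallback: OcrEngine,
    is_valid: Callable[[str], bool],
) -> str:
    """Try the fast engine first; retry on the other one if the result fails."""
    text = primary(image).strip()
    if is_valid(text):
        return text
    return fallback(image).strip()
```

Here `primary` would stand in for Tesseract and `fallback` for Web OCR, for example `ocr_with_fallback(region, tesseract, web_ocr, looks_like_name)`. The slower engine is only called when the quick result looks wrong, which keeps most of the speed gain.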

  9. Graham

    Graham Developer Staff Member

    Build 21

    If a rule is selected in the rules table, it will be used as
    the first rule. This is useful when you already know which document
    type you are importing. Also good for testing a particular rule.
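The "selected rule goes first" behaviour amounts to reordering the rule list before matching. A small Python sketch follows, with a made-up `Rule` type whose `keyword` field stands in for whatever Autofile really uses to recognize a form type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    keyword: str  # hypothetical: text that identifies the form type


def order_rules(rules: List[Rule], selected: Optional[Rule] = None) -> List[Rule]:
    """Put the selected rule first; keep the others in their original order."""
    if selected is None or selected not in rules:
        return list(rules)
    return [selected] + [r for r in rules if r != selected]


def match_rule(
    page_text: str, rules: List[Rule], selected: Optional[Rule] = None
) -> Optional[Rule]:
    """Return the first rule, in priority order, whose keyword is on the page."""
    lowered = page_text.lower()
    for rule in order_rules(rules, selected):
        if rule.keyword.lower() in lowered:
            return rule
    return None
```

When two rules could both match a page, the selected one wins. That is what makes the feature handy for testing a single rule.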

  10. Graham

    Graham Developer Staff Member

    Build 22

    Crops the image horizontally as well as vertically.

    This means if your scan box cuts through a letter on the left or right edge, these partial letter fragments are now discarded.
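The horizontal crop can be pictured as a column-wise scan of the region. The Python sketch below is an assumption about the approach, not the real code. It treats the region as rows of 0/1 pixels (1 = ink) and discards any run of inked columns that touches the left or right edge, on the theory that such a run is a letter the box cut through. `crop_horizontally` is a hypothetical name.

```python
from itertools import groupby
from typing import List

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = white


def crop_horizontally(region: Image) -> Image:
    """Discard runs of inked columns that touch the left or right edge."""
    width = len(region[0]) if region else 0
    # True for every column that contains at least one ink pixel.
    has_ink = [any(row[x] for row in region) for x in range(width)]

    # Collect (start, end) column runs of ink, end exclusive.
    runs, x = [], 0
    for inked, group in groupby(has_ink):
        length = len(list(group))
        if inked:
            runs.append((x, x + length))
        x += length

    # Runs touching neither edge are assumed to be whole letters.
    interior = [(s, e) for s, e in runs if s > 0 and e < width]
    keep = interior or runs  # if only edge runs exist, keep them
    if not keep:
        return region  # no ink at all; leave the region unchanged

    left, right = keep[0][0], keep[-1][1]
    return [row[left:right] for row in region]
```

A complete letter that happens to sit flush against the box edge would also be dropped by this rule, so the box should leave a little white margin around the wanted text.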

  11. Graham

    Graham Developer Staff Member

    A couple of videos:

    Creating a rule and then recognizing the file.


    And then this one showing Synapse importing the Autofile-named file.
  12. Jason

    Jason Developer / Handyman Staff Member

    Graham: Do you ever use Autofile?

    I keep dreaming I'll buy a $2000 scanner, commercial OCR software, and my daily documents will practically file themselves!
  13. Graham

    Graham Developer Staff Member

    Not anymore as pretty much everything arrives as HL7 these days.
  14. Jason

    Jason Developer / Handyman Staff Member

    So jealous.
