Autofile

Discussion in 'New Release and Beta Release Information' started by Graham, Apr 10, 2008.

  1. Graham Developer

    Build 18
    • Option to use Tesseract on the regions after the form type has been recognized. Use this on high quality deskewed scans. Will speed up the recognition quite considerably.
    • Regions are now cropped before being OCR'd and any regions that are pure white are no longer submitted.
    • Renames and copies the source files to the local directory before running ghostscript/imagemagick, as some filenames generated by Paperport would cause an error.
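    The rename-and-copy step in the last bullet could look roughly like the sketch below. This is not Autofile's actual code: the function name `copy_with_safe_name`, the set of characters treated as safe, and the `scan` fallback name are all assumptions made for illustration.

```python
import re
import shutil
from pathlib import Path


def copy_with_safe_name(source: Path, work_dir: Path) -> Path:
    """Copy `source` into `work_dir` under a name that uses only safe characters.

    Hypothetical helper: the real tool's naming scheme is not documented here.
    """
    # Collapse every run of characters outside A-Z, a-z, 0-9, '_' and '-'
    # into a single underscore, then trim underscores from both ends.
    safe_stem = re.sub(r"[^A-Za-z0-9_-]+", "_", source.stem).strip("_") or "scan"
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / f"{safe_stem}{source.suffix.lower()}"
    # Note: two different sources can map to the same safe name; a real
    # implementation would need to handle that collision.
    shutil.copy2(source, target)
    return target
```

    The external tools would then be pointed at the returned path rather than at the original file.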
  2. Jason Developer / Handyman

    [quote user="Graham"]Build 18
    • Regions are now cropped before being OCR'd

    [/quote]

    What's the cropping process?
  3. Graham Developer

    Removes any partial word fragments, or junk, above and below the area of interest.
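    One way to picture this kind of vertical cropping is a row projection: count the dark pixels in each row, find the bands of rows that contain ink, and keep only the tallest band. The sketch below assumes the region is already binarized into rows of 0 (white) and 1 (ink). The function names and the "keep the tallest band" heuristic are assumptions, not a description of what Autofile does internally.

```python
def ink_runs(profile):
    """Return (start, end) index pairs for consecutive entries greater than 0."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count and start is None:
            start = i                      # a band of ink begins
        elif not count and start is not None:
            runs.append((start, i))        # the band ended on the previous row
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # band runs to the last row
    return runs


def crop_vertical(region):
    """Keep the tallest band of inked rows; return None for a pure white region."""
    profile = [sum(row) for row in region]  # dark pixels per row
    runs = ink_runs(profile)
    if not runs:
        return None                         # nothing to OCR, skip the region
    start, end = max(runs, key=lambda run: run[1] - run[0])
    return region[start:end]
```

    Returning `None` for an all-white region also gives the caller an easy way to skip submitting it.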

  4. Jason Developer / Handyman

    Sorta like I depict here?

    [IMG]

    Should I make my "Zonal OCR boxes" bigger for more accuracy?
  5. Graham Developer

    That's the intent .. to allow taller boxes so we can be sure we are getting the desired text.

    But if we get all the text of the line above, then that'll defeat the purpose.
  6. Jason Developer / Handyman

    Zonal OCR: a fine balance. Some differential equation should allow us to figure out the ideal size :)

  7. Graham Developer

    Bumped to build 19

    If the date OCR fails on Tesseract, now switches back to Web OCR for another try.

  8. Graham Developer

    Build 20

    Switches back to Web OCR if name contains non-alpha characters.
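    The fallback behaviour from builds 19 and 20 can be sketched as "try the local engine, validate the result, retry with Web OCR if validation fails". Everything named below is hypothetical: `local_ocr` and `web_ocr` stand in for the two engines, the accepted date formats are guesses, and treating spaces as acceptable in a name is an assumption.

```python
from datetime import datetime


def parse_date(text):
    """Return a date if `text` matches one of the assumed formats, else None."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):    # assumed formats, not from the source
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def name_is_clean(text):
    """True if `text` is non-empty and holds only letters and spaces (assumption)."""
    stripped = text.strip()
    return bool(stripped) and all(ch.isalpha() or ch.isspace() for ch in stripped)


VALIDATORS = {
    "date": lambda text: parse_date(text) is not None,
    "name": name_is_clean,
}


def recognize_field(image, kind, local_ocr, web_ocr):
    """Run the local engine first; fall back to the web engine on a bad result."""
    text = local_ocr(image)
    if VALIDATORS[kind](text):
        return text, "local"
    return web_ocr(image), "web"
```

    For example, a date misread locally as `O4/1O/2008` (letter O instead of zero) fails to parse, so the region would be sent to the second engine.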

  9. Graham Developer

    Build 21

    If a rule is selected in the rules table, it will be used as the first rule. This is for when you already know which document type you are importing. It is also good for testing a particular rule.
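    Read literally, "used as the first rule" means the selected rule is tried first and the remaining rules stay available in their usual order. A minimal sketch of that ordering, with a hypothetical `order_rules` helper and made-up rule names:

```python
def order_rules(rules, selected=None):
    """Return the rules with `selected` moved to the front, if it is present."""
    if selected is None or selected not in rules:
        return list(rules)                  # no selection: keep the table order
    return [selected] + [rule for rule in rules if rule != selected]


# Example: "statement" is highlighted in the rules table.
print(order_rules(["invoice", "statement", "letter"], selected="statement"))
```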

  10. Graham Developer

    Build 22

    Crops the image horizontally as well as vertically.

    This means if your scan box cuts thru a letter on the left or right edges, these partial letter fragments are now discarded.
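    The horizontal case can be sketched with a column projection: any band of inked columns that touches the left or right edge of the box is treated as a cut-off letter and dropped. As before, this assumes a non-empty, binarized region (rows of 0 for white and 1 for ink), and the edge-touching heuristic is an illustration rather than the actual algorithm.

```python
def crop_horizontal(region):
    """Drop inked column bands that touch the left or right edge of the region.

    Returns the cropped rows, or None if nothing but edge fragments remains.
    """
    width = len(region[0])
    # Dark pixels per column.
    profile = [sum(row[x] for row in region) for x in range(width)]

    runs, start = [], None
    # The trailing 0 is a sentinel that closes a band reaching the right edge.
    for x, count in enumerate(profile + [0]):
        if count and start is None:
            start = x
        elif not count and start is not None:
            runs.append((start, x))
            start = None

    # A band starting at column 0 or ending at the last column was cut
    # by the box, so it is assumed to be a partial letter.
    inner = [(s, e) for s, e in runs if s > 0 and e < width]
    if not inner:
        return None
    left, right = inner[0][0], inner[-1][1]
    return [row[left:right] for row in region]
```

    A side effect of this heuristic is that a complete letter sitting exactly on the edge is also discarded, which is one more reason to draw the box with some margin.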

  11. Graham Developer

    A couple of videos:

    Creating a rule and then recognizing the file.


    And then this one, showing Synapse importing the Autofile-named file.
