kickaha ([personal profile] kickaha) wrote 2008-04-16 03:25 pm

Scanned files, redux

Screw the CD-Rs. Grabbed a 4GB SanDisk Cruzer Micro USB stick for $30. Removed f--silly* annoying U3 firmware. Will keep updated, and store in safety deposit box. Nice little archive, and many boxes' worth of space are freed up in the house. At the current rate, 10 full large file storage boxes will fit on that little stick.

* Just for you, gwyneira. :D

[identity profile] franktheavenger.livejournal.com 2008-04-17 01:11 pm (UTC)
That's actually a really good idea. How'd you sort the damn things, filenames etc? I think that'd be the tedious part of scanning all the paperwork, setting the filenames for ease of browsing if necessary.

[identity profile] kickaha.livejournal.com 2008-04-17 01:36 pm (UTC)
I didn't. :)

The scanner will just keep adding scans to one PDF file until you say stop. So I have one file for 2004 BellSouth statements, for instance. (Named '2004.pdf' and placed in the 'BellSouth' folder, naturally.)

MacOS X 10.5's Preview app allows you to move, rotate or delete pages in a PDF (and drag and drop pages between files), so it's really easy to do some cleanup afterwards if necessary.

And the included OCR software adds text to the PDFs... which MacOS X's Spotlight then uses to make every PDF searchable.

If later I want to find all phone statements where I called a particular number, I just hit Cmd-space anywhere to fire up Spotlight, type in 'BellSouth (123) xxx-yyyy' and it'll return a list of all statements for BellSouth where I called that number. When I open them in Preview, the search terms are already highlighted for easy bouncing between hits.
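
For anyone who'd rather script that search than type it into the menu, here's a rough sketch in Python. It assumes the `mdfind` command-line tool that ships with MacOS X and its `-onlyin` option; the function names, the `Scans` folder and the phone number are placeholders made up for illustration, not part of my actual setup.

```python
import subprocess
from pathlib import Path


def spotlight_command(query: str, folder: Path) -> list[str]:
    """Build an mdfind invocation that limits the search to one folder."""
    # -onlyin restricts results to files under `folder`.
    return ["mdfind", "-onlyin", str(folder), query]


def search_scans(query: str, folder: Path) -> list[str]:
    """Run the search (Mac only) and return the matching file paths."""
    result = subprocess.run(
        spotlight_command(query, folder),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Hypothetical usage, with a placeholder query:
# search_scans("BellSouth 555-0100", Path("Scans"))
```

Only the first function is safe to try off a Mac; the second needs Spotlight's index to have anything to return.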

Organizational filing isn't really a concern. :) I have a basic setup for just keeping the clutter to a minimum, but I'm probably not going to use folder traversal to find things - I'll just search.

[identity profile] kickaha.livejournal.com 2008-04-17 02:01 pm (UTC)
Oh, and forgot to add... this is for the massive batch scanning. For scans moving forward, the PDF engine in MacOS X is scriptable, so I'll just set up scripts to add new pages to existing files as needed. Every January (or quarterly, etc.), start a new file.
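
The "which file does this page belong in" part could look something like the sketch below. It's Python rather than the Mac's own scripting, and it only picks the target path; actually appending the pages is left out. The `archive_path` name, the `Scans` root and the `-Q2` style quarterly naming are assumptions for illustration; only the vendor-folder-plus-year layout ('BellSouth/2004.pdf') is what I really use.

```python
from datetime import date
from pathlib import Path


def archive_path(root: Path, vendor: str, when: date, quarterly: bool = False) -> Path:
    """Return the PDF that a scan dated `when` should be appended to."""
    if quarterly:
        # Months 1-3 map to Q1, 4-6 to Q2, and so on.
        quarter = (when.month - 1) // 3 + 1
        name = f"{when.year}-Q{quarter}.pdf"
    else:
        # One file per calendar year, so a new one starts every January.
        name = f"{when.year}.pdf"
    return root / vendor / name


# Hypothetical usage:
# archive_path(Path("Scans"), "BellSouth", date(2004, 3, 15))
```

A new year (or quarter) just produces a path that doesn't exist yet, which is the cue to start a fresh file.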

I know others use apps like DEVONthink Pro to do this organization - it does heuristic analysis of the words in the pages to figure out where to file them. Apparently it does a phenomenal job. (I just don't need to spend the money right now when this works fine for archiving.)

[identity profile] franktheavenger.livejournal.com 2008-04-17 07:35 pm (UTC)
Ah, so what if you don't have software to make PDFs and OCR crap? :p YOU ARE NOT HELPFUL

[identity profile] kickaha.livejournal.com 2008-04-17 11:45 pm (UTC)
1) Get a Mac.

Kidding. ;)

The software comes with the scanner, dude. Fujitsu ScanSnap. Go forth.

[identity profile] franktheavenger.livejournal.com 2008-04-18 02:26 am (UTC)
I have a damn scanner already! I'm not buying a new one. :p

[identity profile] kickaha.livejournal.com 2008-04-18 02:32 am (UTC)
Flatbed? You'll claw your eyes out before you get anything done.

Right tool for the right job. This one does double-sided, with an ADF, about 25 sheets a minute, color. It's optimized for exactly this task.

I can't imagine doing this on a scanner made for artwork.