scanning pre-PDF software manuals to PDF

edited September 2021 in Software
I'm kind of hooked on an old app called Clarion Personal Developer 2.0 that I bought new in 1991.
I've had the book for it (still in extremely good shape) for the 30 years since. The floppies are unreadable but that's why WinWorld exists.
I decided to bite the bullet and paid about $85 to have it professionally scanned using OCR with a service called Blue Leaf Book Scanning Service. I used them because their OCR is supposed to be more accurate than others...but none is perfect.
I'm happy to donate a copy of the PDF for CPD2 to this site when I get it back.
I still have a bunch of old software manuals (some of the software was purchased at thrift shops years later and included manuals). I suspect some of you also have old manuals that you wish you had a PDF copy of so you could search them.
I don't know if there's any movement on this site to scan old manuals to PDF, but that would greatly increase usability of the apps, especially those close to 30 years old...especially if you don't have a manual for them to begin with or it was lost!!!


  • Yes, we try to add manual scans with our software when possible and practical. Sharing that PDF would be much appreciated.

    It's not always easy. Scanners with document feeders can quickly scan pages removed from a 3-ring binder or spiral-bound book. But bound manuals with a spine have to have the spine cut off; otherwise one has to scan them on a flatbed one page at a time. I've got one such thick manual here that I don't want to cut up, and it has been sitting here for several years as I scan just a few pages at a time.

    I guess some people have a hard time finding scanners these days. They used to show up all the time at thrift stores. It also used to be that large offices, libraries, and copy shops would have scanners available for use.

    As for borked disks, just a reminder that floppy disks that have not been used in 30 years should not just be plopped in a floppy drive. They should be carefully inspected and possibly cleaned/treated if needed first. Otherwise they can rip themselves to shreds.
  • The BLBS service I recommended will do a non-destructive scan, but they prefer destructive as you said (because it's easier for them). It costs a tiny bit more to keep the book together as I requested, and they require a larger margin width on the pages to successfully scan the edges of the text. My reason for keeping the book together is so I can proofread against it if I want to. They claim over 99% accuracy, which could mean very few typos if it's 99.99% (one error in 10,000) or a lot if it's only 99% (one in 100).
    I'll be able to tell just by proofreading the first couple pages because the text quality is consistent throughout the book.
    If they would provide scanned pages before OCR processing those can be used to proofread as well. I'll ask them about that. Then I can also share those.

    I agree about the floppy disks. There is also a special device, one that professionals use, that acts as an enhancer to help read hard-to-read disks, and I happen to have one.
    I considered buying a refurb 3.5" and 5.25" drive instead of paying to have my drives refurbed before I attempt to use that.

    There is just a lot of perfectly good old DOS software out there that still works as well as it ever did.

    IMHO those were the days when computers were a pure joy to use and didn't involve all the various types of nasty crap we see these days with Windows 10 and with all the destructive crap on the internet.

    No forced OS updates. BBSes were fun. CompuServe could get pricey. There was a thing called 'netiquette' that is totally dead now.

    It was 'simpler' in a sense. It was nerdy as heck. It was beautiful.

    BTW, I found out about DOSBox-X, which is supposed to be an improvement over DOSBox that handles DOS apps in general rather than just games... and is supposedly better than the two flavors of vDOS designed to handle DOS apps.

    I will try that with Clarion Personal Developer 2.0

    Steve Sybesma
    Brighton, CO
  • edited September 2021
    Proofread before OCR?

    I do NOT recommend PDFs that just contain text. OCR suoks ba11s. It can NOT usually interpret symbols that old manuals commonly use (such as the return/enter arrow symbol). OCR also commonly produces tons of vomit for "figures" that may contain intentionally unreadable text.

    On WinWorld, we recommend what is sometimes called "Images on Text". That is, what you see is an actual image of the page, but if you highlight the text and select "copy", you get the OCRed text. You can both copy and search the OCRed text, but you still see the original material, so if the OCR goofed up, it does not matter too much.

    When creating a PDF this way, I usually only "proof" the first few pages so search engines can properly see the manual title.

    If you want an example, look at the Freelance manuals I scanned not that long ago:
    (I also usually clean up pages so they look nice when printed, but that is a lot of work)
  • I actually don't know their process, but your reasoning regarding symbols makes sense, and I did not anticipate that problem. So thank you for that. Now I need to find out their process. They do specialize in computer manuals and other types of manuals, however, so possibly they have had to take that into account in the past.
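
On the OCR accuracy figures discussed earlier in the thread (99% versus 99.99%), here is a rough back-of-the-envelope sketch of what the difference could mean for a whole manual. It assumes the quoted accuracy is per character and that errors are spread evenly, which the service may or may not mean. The page count and characters per page are made-up illustrative numbers, not figures for any particular manual.

```python
def expected_ocr_errors(accuracy, chars_per_page, pages):
    """Expected number of wrong characters.

    Assumes 'accuracy' is a per-character rate (an assumption; a vendor
    could just as well be quoting per-word accuracy).
    """
    error_rate = 1.0 - accuracy
    return error_rate * chars_per_page * pages


# Hypothetical manual: 300 pages at about 2,000 characters per page.
CHARS_PER_PAGE = 2000
PAGES = 300

for accuracy in (0.99, 0.999, 0.9999):
    per_page = expected_ocr_errors(accuracy, CHARS_PER_PAGE, 1)
    total = expected_ocr_errors(accuracy, CHARS_PER_PAGE, PAGES)
    print(f"{accuracy:.2%} accurate: ~{per_page:.1f} errors per page, "
          f"~{total:.0f} in the whole book")
```

Under those assumptions, 99% works out to roughly 20 wrong characters on every page, while 99.99% is about one error every five pages. That is why proofreading the first couple of pages gives a fair idea of which end of the range a scan falls on.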