There's a lot of commentary out there about imaging disks. Why add another one? Because this is what I wish I'd known when I started. This is mostly going to assume you're on Windows, though most of it can be adapted to Linux without too much trouble.
Imaging Floppies
Why?
The naive way to back up a floppy is of course to just copy the files off of it, and if the floppy is just some old school projects or similar that may well be all you need. But for actual software there are two words that throw a wrench into this: copy protection. Software from the 80s and 90s often did weird things to prevent normal software from reading the disk directly, so you couldn't just copy the disk and give the copy to your friend. Since computers of that era let you do whatever you wanted to the low-level hardware, your program could instruct the drive to seek to a sector that appeared not to exist or do other similar tricks. None of this proved more than a speedbump for the pirates, but it's a problem now if you want an equivalent copy.
Flux Imaging
The solution to this is to instead record the raw magnetic flux. The basic theory is this: a file on a floppy exists within that floppy's filesystem. The filesystem in turn is stored on the disk's sector format. Most copy protection involves changes to the sector format. That sector format is stored on the disk's raw magnetic values (transitions, really). Flux imaging just reads that magnetic data directly rather than trying to parse it into a filesystem or even a sector format, so it doesn't matter what you did to any layer above it, or even what it's formatted as. Mostly. There's a few notable exceptions later. The only real downside is the resulting files can be much larger than the formatted data, often 10-20x the size of the disk as you'd normally think about it.
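To put that 10-20x figure in perspective, here is a rough back-of-the-envelope sketch in Python. The multipliers come straight from the estimate above; the 100-disk collection is just an illustrative assumption, and real flux files will vary with the tool and how many revolutions you capture.

```python
# Rough storage estimate for flux images, using the 10-20x rule of thumb.
# The collection size in the example is an arbitrary illustrative value.

FORMATTED_BYTES_144MB = 1_474_560  # formatted capacity of a PC 1.44MB disk


def flux_size_range(formatted_bytes, disks=1, low=10, high=20):
    """Return (low, high) estimated flux storage in bytes for `disks` disks."""
    return (formatted_bytes * low * disks, formatted_bytes * high * disks)


if __name__ == "__main__":
    lo, hi = flux_size_range(FORMATTED_BYTES_144MB, disks=100)
    print(f"100 disks: roughly {lo / 1e9:.1f} to {hi / 1e9:.1f} GB of flux")
```

Even at the high end, a hundred 1.44MB disks come in under 3 GB, which is why the size overhead rarely matters in practice.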
To do this you need a floppy drive compatible with the disk(s) in question, a drive controller that can do the flux imaging, a cable to connect them, and a way to power the drive. For now I'm going to stick to the most common case of PC-style drives of either 1.44MB (3.5") or 1.2MB (5.25") and I'll cover some of the other cases later. For a floppy cable you just need to know if you need a 34pin "IDC" connector (3.5" drives) or a 34pin card edge connector (5.25" drives); the latter is more annoying to find but still not too hard. For a controller I highly recommend, and will assume you are using, a Greaseweazle. You can buy the physical hardware from various sellers on eBay or equivalents. The Greaseweazle can power a 3.5" drive itself, though you may need an uncommon adapter to connect your drive to it. 5.25" drives aren't guaranteed to work this way, but may. If you don't have the right adapter or your drive is drawing too much power then I'd recommend picking up one of those USB to IDE/SATA adapters that are all over Amazon; they almost all have a "molex" power output that you then just need to connect to your drive. You can also reuse this to read CDs and DVDs as is covered below.
The Drive(s)
But what drive to buy?
First, some terminology. I promise it'll make sense why we're going over this first when we get there.
A floppy is divided into heads (sides), tracks/cylinders (radial distance), and sectors (rotational distance). These three values allow you to pinpoint a specific location on a disk, and in fact hard drives used this same notation for some time until they grew too large for it to remain practical (if you've ever seen a reference to "CHS" addressing, possibly in an old BIOS, this is that). A floppy drive thus has the following stats that matter:
- Number of heads (1 or 2)
- Physical disk size
- Maximum tracks per inch (TPI)
- Maximum data rate (bits/s)
The drive's data rate controls how many sectors fit on a track, but this should never be an issue unless you're working with pre-PC drives. You want a drive with two heads and the highest TPI count you're going to archive. For PC 5.25" drives, 1.2MB drives are 96TPI and 360KB drives are 48TPI, so a 96TPI drive can archive 48TPI disks but not vice versa--get the 1.2MB drive. For 3.5" drives, 720KB drives can't read 1.44MB disks, but 720KB drives are fairly uncommon whereas 1.44MB drives are everywhere, so you're unlikely to encounter a 720KB drive unless you specifically seek one out. 1-head drives should only exist among the earliest PC-compatible 5.25" drives, so if you're buying a 1.2MB drive you're set. (Again, there are exceptions for other situations; see below.)
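To make the heads/tracks/sectors terminology concrete, here is a small Python sketch that derives the familiar capacities from geometry and converts a CHS address to a linear sector number. The geometries are the standard IBM PC ones with 512-byte sectors; the class and function names are my own, purely for illustration.

```python
# Capacity = heads * tracks per side * sectors per track * bytes per sector.
# Geometries are the standard IBM PC formats; names here are illustrative only.

from typing import NamedTuple


class Geometry(NamedTuple):
    heads: int
    tracks: int              # tracks per side (cylinders)
    sectors: int             # sectors per track
    sector_bytes: int = 512

    def capacity(self) -> int:
        """Total formatted capacity in bytes."""
        return self.heads * self.tracks * self.sectors * self.sector_bytes

    def chs_to_lba(self, cylinder: int, head: int, sector: int) -> int:
        """Linear sector index for a CHS address (sectors are numbered from 1)."""
        return (cylinder * self.heads + head) * self.sectors + (sector - 1)


PC_FORMATS = {
    '5.25" 360KB (48TPI)': Geometry(2, 40, 9),
    '5.25" 1.2MB (96TPI)': Geometry(2, 80, 15),
    '3.5" 720KB': Geometry(2, 80, 9),
    '3.5" 1.44MB': Geometry(2, 80, 18),
}

if __name__ == "__main__":
    for name, geo in PC_FORMATS.items():
        print(f"{name}: {geo.capacity():,} bytes")
```

Note how the 360KB and 1.2MB formats differ in both track count (40 vs 80, which is the 48TPI vs 96TPI difference) and sectors per track (9 vs 15, which is the data rate difference).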
That 1.2MB 5.25" drive, connected to your Greaseweazle, can image practically anything: IBM PC, BBC Micro, TRS-80, whatever, it'll handle it. Likewise a 1.44MB 3.5" drive can handle IBM PC disks, Mac 1.44MB disks, Amiga disks, and so on. My recommendation for 5.25" drives is to get a TEAC FD-55GFR, but to specifically look for one with a second solenoid that raises and lowers the heads. This will be at the back left corner of the drive when viewed from above and in front, and that spot will be empty if the solenoid is not present. These models seem to be much rarer; the solenoid was probably removed as a cost-saving measure, and the drives that have it don't carry a different model number, though the earlier revision FD-55GF may be more likely to have it. If you can find one of these it can improve your odds of reading marginal disks by reducing the disk's exposure to the heads when you aren't doing a read. If you can't find one of these, don't sweat it--you probably won't have the same degree of fun I had trying to archive the Studio Software floppies that formed this preference (my record was one disk that took sixteen passes to get a full read).
Drive Special Cases
Other situations to be aware of:
- Some PC 3.5" drives, like those in the IBM PS/2 line, expect power to be provided over their 34pin connector. You can identify these by their lack of 4-pin power connector. These will not work without an adapter, and connecting a regular drive to a system that provides power in this way may well destroy the drive.
- Reading 48TPI disks in a 96TPI drive requires telling Greaseweazle to step the head twice per track (--tracks=step=2).
- Writing 48TPI disks with a 96TPI drive will often result in the disk not reading properly in a 48TPI drive. If you need to write these with any regularity you will probably need to track down a 48TPI (360KB) drive, but these are less common and tend to command higher prices as a result.
- Some 8-bit systems (like the Apple II and the Commodore 64) used "flippy" disks, where you'd extract the disk and insert it upside down to use the other side. If you try to read one of these as a normal double-sided disk it won't work right due to the information being written the wrong way around, and you can't insert it upside-down into a drive not meant to handle that because the index hole the drive uses to see where the disk is in its rotation will be in the wrong position. Some specific drives can be modified to handle this; check out Greaseweazle's documentation for more info.
- Mac 400KB and 800KB 3.5" disks used an unusual sector format that also involved the disk's rotation speed changing depending on which track was being read. Some PC drives can handle this to a better or worse degree, so if you already own a 3.5" drive you may as well give it a shot. The Sony MPF920 is usually among the best at this, but you'll want to try to get one from the 90s, as they apparently got cheaper in the 00s and less reliable with these edge case disks as a part of that. The actual ideal case for these specific disks is a specific Apple drive connected to an AppleSauce controller, but each one of those is going to run you quite a bit more than the Greaseweazle setup as a whole.
- 8" disks are a whole collection of special cases. Most drives have only one head, some have a reputation for damaging disks, and many use 120VAC spindle motors and even the ones that don't generally still need +24VDC, neither of which you're going to get out of standard PC power supplies. These drives also use a different 50-pin connector which will need an adapter to connect to a Greaseweazle or similar. Explaining how to get a working 8" archival setup is beyond the scope of this page because it's so much more manual than the others, and I just don't have enough experience with it personally anyway (yet).
- Other less common physical disk formats, like 3", are also very specific. Most use an interface derived from the original Shugart one to save having to reimplement it, but you'll still need to figure out how to connect it yourself. You may be able to connect one to a Greaseweazle but you're going to need to look up how your specific drive works.
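The double-stepping case in the list above comes down to simple arithmetic: tracks on a 48TPI disk are spaced twice as far apart as the head positions of a 96TPI drive, so logical cylinder n of the disk sits at physical head position 2n. A minimal Python sketch of that mapping (the function name is hypothetical, not part of Greaseweazle):

```python
def physical_cylinder(logical_cylinder: int, step: int = 2) -> int:
    """Head position on the drive for a given cylinder of the disk.

    step=2 corresponds to a 48TPI disk in a 96TPI drive; step=1 is the
    normal case where disk and drive have the same track density.
    """
    return logical_cylinder * step


if __name__ == "__main__":
    # A 40-track (48TPI) disk uses every other position of an 80-track drive.
    positions = [physical_cylinder(c) for c in range(40)]
    print(positions[:4], "...", positions[-1])
```

This is also a hint at why writing 48TPI disks in a 96TPI drive is unreliable: the drive's head is built for tracks half as wide, so what it writes doesn't fill the track a 48TPI drive expects to read.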
Using the Greaseweazle
I'm not going to spend much time on the actual Greaseweazle CLI as its own wiki covers that pretty well. Instead I want to cover some best practices I've learned using it.
#1: Always read as flux (--raw). Always. This has three benefits. First, if the disk has some copy protection weirdness or similar, flux will capture that and allow you to rewrite it. Second, if the disk is damaged and has bad sectors, flux will preserve everything it can and potentially allow you to reconstruct the damaged data. Third, if you have to abort a read because a disk is marginal or damaged (see below) and you need to stop to clean the heads, the Greaseweazle software will not write out a partial file, but flux is written per-track so you get to keep the already completed tracks. While this will make each archived floppy much larger, floppies are small enough to start with that the size should not be a concern.
#2: When possible, read as flux with simultaneous decode (--raw --format=...). If you have decaying disks and need to worry about stopping to clean the heads, this will allow you to see when it starts encountering read errors. Reading flux by itself will not present errors as it has no idea if what it read is valid or not. Depending on the exact disk format this may not be possible, but if you can, you should.
#3: Inspect disks before reading them. Be very careful imaging disks that look unusual in any way. As of this writing there is an eBay seller who sells 3D printed disk cleaning rigs that allow you to spin a disk inside its sleeve; consider getting these if you plan to do serious archiving.
#4: If you see the rate of read errors increasing as a disk is read, stop and inspect it and the drive's heads. This can be a sign that the heads are accumulating material that is interfering with the read.
#5: If you hear any suspicious metallic noises, stop and eject the disk immediately. You may be hearing a head crash, either in progress or imminent. Clean the heads before proceeding and consider that disk high risk.
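The "rate of read errors increasing" check in #4 is something you mostly do by eye, but the idea can be sketched in Python. This is a hypothetical helper, not anything Greaseweazle provides: it takes a list of bad-sector counts per track (collected however you like) and compares the most recent few tracks against the few before them. The window size and thresholds are arbitrary assumptions you would need to tune.

```python
def errors_rising(bad_sectors_per_track, window=5, factor=2.0, min_errors=3):
    """Return True if the last `window` tracks have markedly more bad sectors
    than the `window` tracks before them.

    window, factor and min_errors are arbitrary illustrative thresholds.
    """
    if len(bad_sectors_per_track) < 2 * window:
        return False  # not enough history to compare yet
    recent = sum(bad_sectors_per_track[-window:])
    previous = sum(bad_sectors_per_track[-2 * window:-window])
    return recent >= min_errors and recent > factor * previous


if __name__ == "__main__":
    clean_then_failing = [0, 0, 0, 0, 0, 1, 1, 2, 3, 4]
    print(errors_rising(clean_then_failing))  # a rising trend like this is the cue to stop
```

A disk with a steady, constant number of bad sectors per track does not trigger this check; that pattern points to damage on the disk itself rather than heads that are fouling up as the read goes on.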
Marginal and Damaged Disks
I am not an expert on this by any means, so take this with a grain of salt. This is just what I've learned and my current understanding of the problem. This does not affect all disks, but it can affect any. If you haven't ever used a given disk before, don't trust it until you know it works.
Physically, a floppy disk consists of a plastic carrier disc which is coated with some kind of adhesive and then has a ferrous material applied to that adhesive. The ferrous layer is what is magnetized to store data. Over time however the adhesive can start to fail, and floppies are 25 years old at the youngest. This can manifest in two ways: either the adhesive weakens and the drive head magnet starts pulling the magnetic media off of the disk, or mold starts forming as the (sometimes? often?) organic adhesive becomes exposed to the outside world. The disk can also just get straight up dirty but in my experience that's the least common case.
Cleaning a dirty or moldy floppy can be dangerous as many cleaners will further damage or destroy the adhesive, and with it the magnetic data stored there. I am not going to pretend to be an expert on this. What I have been told is you should use warm distilled water, but I have no experience to know if this has its own caveats.
Disks that are shedding their magnetic media however are more insidious as they'll appear fine until you attempt to read them. As the disc passes under the read head(s) tiny bits of the ferrous material will separate from the disk and become attached to the head(s). As more of this accumulates the head will start producing read errors until it starts failing completely, and if you're very (un)lucky, enough will accumulate that the head makes physical contact with the disc. This is what's known as a head crash and it's very, very bad as it will almost certainly destroy whatever track(s) the head touched and can also damage or destroy the head. This is why I say you should always do flux-with-decode: you can see if errors start accumulating faster, which can be a sign this is happening.
Cleaning heads can be done in one of two ways. Either you get a special cleaning disk that's just some felt in a standard sleeve and then drip isopropyl alcohol (IPA) onto it and insert the disk (the Greaseweazle software has a special clean command to try to do this as well as possible), or you put some IPA onto a qtip and very, very gently run it across the head(s). The latter will require you to move one of the heads to get proper access. Make sure you're very gentle with this, as you can theoretically misalign a head doing it, though I haven't yet had that happen to me and I've run probably 200 qtips through my drives this way. As for the former, I've heard mixed reports that it can be bad for the heads when done frequently since it's more abrasive, but if done rarely it's harmless. My personal stance is cleaning disks for preventative cleaning and qtips for reactive cleaning. This also helps preserve the much rarer cleaning disks.
There is a trick done with archiving tape media that involves "baking" the tape in a moisture-free environment to try to restore the adhesive (not just a regular oven, there are specific devices for this). I've heard that this can work for floppies too, but only as a rumor-of-a-rumor; I've yet to hear of anyone doing it themselves. If you have disks that otherwise appear to be a lost cause this may be worth looking into, but it will require removing the actual disc from its protective casing so this is a destructive operation.
After the Read
Once you have your .raw files, you'll want to convert them to something more useful. Greaseweazle provides a "convert" command to do this. Most of the time you'll want the ".img" file format, though you again need to tell it which disk format to use (such as --format ibm.1200), since an img file can be anything. These among others can be loaded into 86Box. For non-PC platforms you'll probably need to look up what formats are standard there; for example Amigas use .adf files. Most of these were defined in the 90s before any kind of standard existed so there's a lot of very similar file formats out there. Greaseweazle can read almost everything and write to most of them, and even if Greaseweazle can't write to whatever format you need there's probably some other utility that can do it, probably by converting an img to whatever platform-specific format.
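Because an img file carries no format information of its own, the file size is often the only hint about what it contains. Here is a small Python sketch that maps the standard PC image sizes to Greaseweazle format names. It is only a guess by size: other platforms produce images of the same sizes, so treat the result as a starting point, and the helper names are my own.

```python
import os

# Standard PC image sizes in bytes, mapped to Greaseweazle format names.
SIZE_TO_FORMAT = {
    368_640: "ibm.360",     # 5.25" 360KB
    737_280: "ibm.720",     # 3.5" 720KB
    1_228_800: "ibm.1200",  # 5.25" 1.2MB
    1_474_560: "ibm.1440",  # 3.5" 1.44MB
}


def guess_format(size_bytes):
    """Return the likely PC format name for an image size, or None if unknown."""
    return SIZE_TO_FORMAT.get(size_bytes)


def guess_format_for_file(path):
    """Same as guess_format, but takes the path of an existing img file."""
    return guess_format(os.path.getsize(path))


if __name__ == "__main__":
    print(guess_format(1_228_800))
```

A size that isn't in the table usually means a non-PC platform, an unusual sector count, or a truncated image.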
If you just want the files, 7-Zip can open most PC disks that are stored as .img. It will have issues with disks written by DOS 1.x, which used a very different disk header, but seemingly so does everything else; I've found practically nothing that can read those aside from mounting them in actual DOS in 86Box so their contents can be copied to a "new" floppy image. Disks in other formats will probably need tools for their specific platforms.
Other Tools
There's two other pieces of software worth mentioning that perform similar or adjacent functions:
HxC is a floppy drive emulator and can present a visual view of your disk, highlighting bad reads. It can also export files from PC floppies (though it seems much pickier about disks than 7-Zip) and show you a very low level view of a disk's data from flux reads. I've found it invaluable for examining reads of marginal disks to determine which sectors I want to try again, as well as when reading disks of unknown format, as you can use it to determine the layout of a disk. For example I used it to verify a disk was 96TPI, 10 sectors, FM encoding (I haven't covered encodings here, but there's only a few and they control how data is stored as magnetic flux) and a brief visit to Wikipedia later confirmed that that's a format used by Acorn. I've also had HxC manage to recover sectors that Greaseweazle said were bad when there were only a few bitflips in the flux. On the negative side, HxC's CLI is not nearly as good as Greaseweazle's in my opinion.
FluxEngine is an alternative to Greaseweazle that performs a similar function though in my opinion it's not as intuitive. I've had some luck using it to read Mac formatted 3.5" disks as it seems to be slightly better at that, or at least better at providing feedback. Like HxC I think its CLI is more awkward to use than Greaseweazle's so I only use it when I'm already manually intervening.
Putting It All Together
Reading a floppy then consists of the following steps, at least for me:
- Read the disk as raw, specifying format to see if any errors occur during read (gw read --raw --format=...)
- Watch the read to make sure you don't need to suddenly eject the disk; this is uncommon but you don't want to damage a drive because you weren't paying attention
- Use HxC to export the flux read as an image (-conv:PNG_DISK_IMAGE)
- Convert to img (or whichever preferred format) (gw convert --format=...)
- Use 7-Zip CLI to extract disk contents
This gets you a canonical copy of the disk in raw flux, a usable copy of the disk in img, and a folder full of the files on the disk if applicable. If you ever need to write the disk back to a real floppy, write using the flux as the source data.
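The steps above can be sketched as a small Python script that builds the command line for each stage. Treat this as a sketch under assumptions rather than a tested recipe: the executable names (gw, hxcfe, 7z), the track00.0.raw naming for the flux files, the -finput:/-foutput: options for HxC and all output file names are my assumptions, so check each tool's own documentation before relying on them. Only --raw, --format, -conv:PNG_DISK_IMAGE and the gw read / gw convert commands come from the text above.

```python
import subprocess


def build_commands(name, fmt):
    """Return the command lines for one disk, each as an argument list.

    name: base name for this disk's files (hypothetical naming scheme)
    fmt:  disk format name passed to --format, e.g. "ibm.1440"
    """
    flux = f"{name}/track00.0.raw"  # assumed naming for the per-track flux files
    image = f"{name}.img"
    return [
        # 1. Read as flux, decoding at the same time so read errors are visible.
        ["gw", "read", "--raw", f"--format={fmt}", flux],
        # 2. Export a picture of the flux read with HxC (option names assumed).
        ["hxcfe", f"-finput:{flux}", "-conv:PNG_DISK_IMAGE", f"-foutput:{name}.png"],
        # 3. Convert the flux to a sector image.
        ["gw", "convert", f"--format={fmt}", flux, image],
        # 4. Extract the files from the image with 7-Zip.
        ["7z", "x", image, f"-o{name}_files"],
    ]


def run_all(name, fmt):
    """Run each step in order, stopping at the first one that fails."""
    for cmd in build_commands(name, fmt):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in build_commands("disk001", "ibm.1440"):
        print(" ".join(cmd))
```

A script like this does not replace watching the read. If anything, run the read step by hand and only automate the conversion steps that follow it.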
Don't Use Floppies
This may seem a strange note to end on, but there's a reason a decent chunk of this has been about dying floppies. Dying floppies can destroy a floppy drive, and neither is being made anymore. While the aesthetic of swapping disks is fantastic, in my opinion it's not worth the risk. Just get a Gotek emulator (but get one of the ones with the OLED displays, they're way easier to use) and flash it with FlashFloppy if it didn't come that way. It won't feel as cool, but it'll make actually using your retro system much easier and won't make you have to wonder if a 30 year old disk is going to still work this time. If you have old disks, image them now before they have even more chances to go bad, and then use those images rather than the physical disks. As of this writing a Gotek is about $30-60 depending on model and seller, and that plus a flash drive is much cheaper than replacing a damaged 5.25" drive and competitive with a 3.5" drive.
Imaging Optical Media
The "why bother doing it this way" of optical media is the same as that for floppies, really: copy protection. Just copying the files off the disk is not a guarantee you'll be able to use your backup when the original inevitably fails. Software like redumper will instead read the disk as if it was audio which permits much lower level storage. It also takes up more space, though not to nearly the degree that flux does for floppies. Most of my own CDs have wound up around twice their original size when archived this way (though on the other hand doubling the size of a CD consumes much more space than 10x of a floppy). The gold standard for this process is redump.org, who have an entire standardized process.
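Part of the size increase comes from the sector layout itself. A CD sector is 2352 bytes on the disc, which is exactly the size of an audio sector, but a standard (Mode 1) data sector exposes only 2048 of those bytes as file data; the rest is sync, header and error correction. Optional subchannel data adds another 96 bytes per sector. The Python sketch below works through that arithmetic. It only covers this per-sector overhead (about 15%), so it is not meant to account for the full size difference described above.

```python
RAW_SECTOR_BYTES = 2352   # full sector as stored on disc (same size as an audio sector)
USER_DATA_BYTES = 2048    # Mode 1 payload that is visible as file data
SUBCHANNEL_BYTES = 96     # per-sector subchannel data, if it is dumped too


def raw_size(user_bytes, with_subchannel=False):
    """Bytes needed to store `user_bytes` of Mode 1 data as raw sectors."""
    sectors = -(-user_bytes // USER_DATA_BYTES)  # ceiling division
    per_sector = RAW_SECTOR_BYTES + (SUBCHANNEL_BYTES if with_subchannel else 0)
    return sectors * per_sector


if __name__ == "__main__":
    data = 650 * 1024 * 1024  # a full 650 MiB data CD, as an example
    print(f"{data:,} bytes of data -> {raw_size(data):,} bytes raw")
```

The raw form matters because copy protection schemes tend to live in exactly the bytes that a file-level copy never sees.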
The optical disk version of this process is much more straightforward and much more annoying at the same time compared to the floppy version. To actually archive an optical disk I highly recommend you check out MPF which will walk you through the whole process (Windows only, at least as of this writing, but you can also just control redumper yourself from a CLI if you want and/or you're on Linux). You just plug in your drive and you're off, no worrying about TPI or any of that. The problem is that low level CD access is not nearly as universal as it is for floppies, so you need a drive that redumper can use for this process, and which ones can is effectively completely random. The drive needs to be able to read into the area before the start of a track, and you need to make sure that your drive is sending data in the order that redumper expects, as there's no standard, and some drives either won't send all the optional data or won't respect the commands to perform a raw read of a data track at all. Naturally, known good drives for this are very popular and thus very expensive unless you can find somebody who isn't aware that the drive they're selling is blessed in this way. I got lucky and a random Blu-Ray drive I bought at Microcenter for like $25 back in the day is one of the known good ones. There are compatible drives not on redumper's known good list which are much easier to find for sale, but you'll need to do your own research to figure out what they are.
Generally if you have issues reading a CD or DVD, either try again or try slower (all the way down to 1x if you can wait). Optical media is surprisingly resilient to damage on the bottom layer, though not immune. If your drive is generating a ton of errors on every disk you try you probably have it configured wrong--see redumper's GitHub page for what to try based on how it's failing.