Publishing for the Kindle: Strip Unwanted Code from an Old MS Word File

How to Clean up an Old MS Word File

Almost all MS Word files need at least some cleaning before being converted to the Kindle format.

If you have a truly old file that has no formatting you want to save, the best idea is to clear out everything with one of the “quick and dirty” methods described below.

If your file has italics, bold, and/or images, you may want to try less drastic methods first, with:

That will remove most of the clutter that causes an untidy-looking Kindle eBook, but preserve your italics, bold, and other formatting you properly entered with MS Word Styles.

That clutter includes, but is not limited to, extra page breaks, extra paragraph breaks, extra spaces, extra periods, extra commas, and other “user errors”.
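If you want a rough idea of how much of this clutter a file holds before you start, a short script can count it for you. The following is only a minimal Python sketch, and it assumes you have a plain-text copy of your book with one paragraph per line. The pattern names and the function name count_clutter are my own inventions for illustration, not part of any tool.

```python
import re

# Hypothetical patterns for some of the clutter listed above.
# Assumption: the text is plain text with one paragraph per line.
CLUTTER_PATTERNS = {
    "extra spaces": re.compile(r" {2,}"),
    "extra paragraph breaks": re.compile(r"\n{3,}"),
    # Exactly two periods in a row; a three-dot ellipsis is left alone.
    "extra periods": re.compile(r"(?<!\.)\.\.(?!\.)"),
    "extra commas": re.compile(r",{2,}"),
}


def count_clutter(text):
    """Return how many runs of each kind of clutter appear in the text."""
    text = text.replace("\r\n", "\n")  # treat Windows line endings like plain newlines
    return {
        name: len(pattern.findall(text))
        for name, pattern in CLUTTER_PATTERNS.items()
    }
```

A file that comes back with zeros across the board may need very little cleaning by hand; large counts suggest you have some Find and Replace work ahead of you.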

It won’t remove “MS Word bloat”, and it won’t remove trash code hiding under the surface.

There’s always a chance (though not a very good one) that your file really doesn’t have any hidden code that will mess up your Kindle file.

So… before you take the drastic measures necessary to clear out the bloat (which may also clear out code you want), consider the following:

Once you have gone through that process, and your file looks clean, it’s time to put it to the test by creating a Sample Kindle file that you can view with Kindle Previewer or on your own Kindle device.

How to Create and Download a Sample of Your Kindle Format eBook

When you preview your newly created Kindle format file, you may decide that there is no hope for it… you have to clear out all the old code.

If you don't have a lot of formatting such as italics and bold fonts, that should be your first choice... it will save much time.

But if you have a lot of formatting you want to preserve, you may be able to use a less drastic method.

There are several ways to do that, with the best one for you depending on the current condition of your own MS Word file.

The first thing to try is to simply return all formatting to "Normal".

Apply Normal Style to Entire MS Word Doc File

To do that, "select" or "highlight" your entire file with CTRL-A, then do a CTRL-SHIFT-S to display your Styles List, choose Normal from the list, and click the "Reapply" button.

If you try this little trick early in this process... before you've made changes that make Normal lose its place, it may even leave your bold and italics intact... saving you a lot of time.

If your file is in really bad shape, your best choice may be the “quick and dirty” approach, which Mark Coker of Smashwords calls the Nuclear Option:

Mark Coker's The Nuclear Option

You can take your file all the way back to plain text by copying it out to Notepad, then back into a clean MS Word page.

(This is the same thing Mark Coker calls the "Nuclear" approach.  He is more comfortable with it than I am... I prefer trying a less drastic method first.)

That will clear out “MS Word bloat”, along with all your formatting, such as italics and bold, and any special paragraph formats.

It also will clear out all your page breaks, section breaks and pictures or illustrations.

It will not take out extra paragraph returns, manual page breaks, tabs, extra spaces, and the long list of clutter that still must be removed by hand.
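Some of that leftover clutter can be cleared from the plain-text copy before you paste it back into MS Word. Here is a minimal Python sketch of the idea. It assumes one paragraph per line, and it assumes that any page breaks that survive show up as form-feed characters, which may not be true of your file. The function name tidy_plain_text is made up for this example.

```python
import re


def tidy_plain_text(text):
    """Remove tabs, form feeds, repeated spaces, and empty paragraphs."""
    text = text.replace("\r\n", "\n")  # normalize Windows line endings
    text = text.replace("\t", " ")     # tabs become single spaces
    text = text.replace("\f", "\n")    # assumption: page breaks arrive as form feeds
    cleaned = []
    for line in text.split("\n"):
        # Collapse runs of spaces, then trim spaces at either end of the paragraph.
        line = re.sub(r" {2,}", " ", line).strip()
        if line:  # skip empty paragraphs
            cleaned.append(line)
    return "\n".join(cleaned)
```

This only handles the mechanical clutter. Anything that needs judgment, such as a stray period or a paragraph that was split in the wrong place, still has to be fixed by hand.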

MS Word's Clear Formatting

A more gentle approach is the “Clear Formatting” feature of MS Word.

Clear Formatting will also clear out MS Word bloat, along with italics and bold, and other formatting you may want.

It will not, as Notepad does, clear page and section breaks, and it won’t remove your images.

Clean up an MS Word 2003 Doc File with Clear Formatting

From the Menu at the top of the screen, choose Format|Styles & Formatting

Menu|Format|Styles & Formatting|Clear Formatting

Select Styles and Formatting from the Format drop down menu, as shown in the above screen shot.

The Styles and Formatting box will open on the right side of the screen.  The way I have mine set, it takes up almost half my screen; yours may look different, depending on your own settings.

Notice the first item is Clear Formatting.

That selection will clear the formatting from anything you have selected in your text. You could clear just one word if you wanted to, but we are going to select and clear the entire file.

Do CTRL-A to select your entire file:

With your entire file “selected”, and highlighted, press Clear Formatting at the top of the Styles and Formatting box at the right of your screen.

Formatting and hidden codes are gone.

Poof! The formatting (along with most of the “bloat”) is gone!

Well… most of the formatting is gone. As you can see, the photos are still in place, so you know some code is still holding them in place.

Paragraph breaks are also still in place. That will make it easier to put things back together.

If I had a file like this with a large number of photos, I might be tempted to try to leave this code in place and test my file by creating a Sample:

How to Create and Download a Sample of Your Kindle Format eBook

That would show me whether it will work this way, so I don’t have to go through the steps of putting all the photos back in.

Depending on how good the code was to begin with, sometimes this works; sometimes it doesn’t.

But, for this lesson, we are going to clear out all the old code, so I can show you how easy it is to do, and how easy it is to put everything back together.

The Nuclear Option Step-by-Step

Next, we will look at how we can clear out even this code that’s holding the photos in place.

To do that, select all your text with CTRL-A and cut it all out with CTRL-X.

Next, open a new blank page with your little plain vanilla “Notepad” that comes with Windows, and paste your entire file into it with CTRL-V.

Save the Notepad file with some disposable name like XX or Clear Format… to make sure Notepad clears out all the code.

Look at Format on the Menu Bar of Notepad to make sure that Word Wrap is NOT checked. You want nice straight lines of text; each paragraph should be just one line.

Next, do a CTRL-A to select all the text that you have just pasted into Notepad… text that Notepad has cleaned up, then do a CTRL-C to copy it out to the clipboard.

Now open a clean blank file… I’ve named mine “A Donkey Named Minnie”... and paste your clipboard contents into it with CTRL-V.

It looks pretty pitiful and naked now, doesn’t it… with no formatting?

But now we have a good clean file with no errant code, and no bloat.
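The round trip through Notepad described above can also be approximated with a script. This is only a sketch, and it rests on a big assumption: that your file has been saved in the newer .docx format, which is a zip archive holding the text in word/document.xml. It will not read an old binary .doc file. The function name docx_to_plain_text is my own, and the sketch is meant for simple documents; text inside tables or text boxes may not come out the way you expect.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_plain_text(docx_file):
    """Return the document text, one paragraph per line, with no formatting."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Keep only the text runs; bold, italics, styles, and images are ignored.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)
```

You would call it with the path to your own file, for example docx_to_plain_text("old_book.docx") (a made-up file name), and save the result as a text file. Like the Notepad method, it throws away all your formatting and photos, so keep a copy of the original.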

Next, we will put our photos back in, set up the titles and headers we want in our final file, and even create a properly interactive Table of Contents, using MS Word.

