Et Al

Fixing a Broken Link Snafu

I had a broken link snafu in September. It was horrible. I had over 300 broken links on the website I manage at work, all of which were my own fault. That’s the worst kind of snafu, folks, but thankfully, they were fixable, and that’s what I’m going to talk about today.

First of all, let’s ask the obvious question: How did I let 300 broken links go unnoticed? Easy, really. I merged all three of my work blogs into one, so all of my posts got new URLs. Aside from ruining many of my posts’ search engine rankings, this broke all of the internal links I had created in each individual post. I thought I had fixed many of these broken links while the site was in beta, but then I changed the URL again, and the links broke once more.

Lesson Learned: When you’re importing and exporting blogs or changing file structures, don’t assume that broken links will magically fix themselves.

The broken links were on my site for a few weeks before I found the bulk of them. Many of my readers were complaining that when they clicked on links from their emails, which are sent through Feedburner, they were being sent to 404 pages. About this time, I was playing with Google Webmaster Tools, which has a broken link report and which informed me that I had over 300 broken links on my site. I went into panic mode and frantically searched for a WordPress plugin to find and fix all my broken links for me. I found the Broken Link Checker plugin and installed it, hoping it would solve the problem for me. But it didn’t. The plugin slowed down the page load time on my site to minutes and made my WP dashboard practically inaccessible because it was working overtime to find all my broken links.

I uninstalled the Broken Link Checker by accessing the site through FTP and deleting the plugin files, which fixed the page load time problem, but I was still stuck with broken links. What do I do now? I asked myself. There’s no way I can go through every post and find and fix every broken link by hand. There’s too much room for error, and I’ll miss something for sure.

Lesson Learned: There are no quick fixes or easy solutions when you have 300 broken links. Fixing the problem requires time and patience.

I didn’t let myself spiral into broken-link depression. I rolled up my sleeves and got to work. On my site, there were about 50 posts that did nothing but link to other posts and had no original content of their own. Since these posts were only useful on the day they were originally published, I reasoned that it would be easier to delete them than to fix every link in them. Once that was done, the majority of the broken links were gone, but there were still too many to risk reinstalling Broken Link Checker. I remembered that our friends at the W3C have a link validator and that this validator is built into Firefox’s Web Developer Toolbar plugin. Even though I’d have to fix all the links manually, I wouldn’t have to locate them on my own!

My plan was simple: First, I would check all my static pages. Then, I would send the W3C link validator through every blog page*, fix the links as they were identified, and double-check my work by sending the validator through the site again, only this time through category pages instead of blog pages.

*By blog page, I mean the page that lists all of the site’s blog posts in chronological order. For many blogs, this is the home page, but on my work website, it wasn’t. These pages usually have “Older Posts” and “Newer Posts” links at the top and/or bottom of every page.
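For readers who would rather script this kind of page-by-page check, here is a minimal Python sketch of the same idea. It is not the W3C validator and not the method used in this post, just an illustration built on the standard library. The names LinkCollector, extract_links, http_status, and find_broken_links are made up for this example, and it assumes that a link counts as broken when it cannot be reached or returns a status of 400 or higher.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collects href/src targets from a page, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link and media URL found in the HTML, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a missing page
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def find_broken_links(html, base_url, status_of=http_status):
    """Return (url, status) pairs for links that look broken.

    ASSUMPTION: "broken" means unreachable (None) or a status >= 400.
    """
    broken = []
    seen = set()
    for link in extract_links(html, base_url):
        if link in seen or not link.startswith(("http://", "https://")):
            continue  # skip repeats and non-web links such as mailto:
        seen.add(link)
        status = status_of(link)
        if status is None or status >= 400:
            broken.append((link, status))
    return broken
```

The status_of parameter exists so the check can be tried without touching the network: pass in any function that maps a URL to a status code. Some servers reject HEAD requests, so a real run may need to fall back to GET.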

Lesson Learned: Panicking solves nothing. Use the tools at your disposal to create a methodical plan for finding and fixing broken links and for double-checking your work.

Problem: The link validator took 10-15 minutes to check one page’s links. I don’t know why it’s so slow, but I accepted it and tweaked my plan. It turns out you can run multiple pages through the validator at once by checking each page in a separate tab or window. Bingo! At work, I have two PCs that I rarely use, so I opened Firefox on each one and had each computer validate about 15 blog pages at a time (I needed to get through about 70 blog pages). While the pages were validating, I went about my usual work day. Once validation was finished for a set, I went through the results and fixed the links. After I had gone through all the blog pages, I validated everything again using the site’s category pages. It took me about one full work day to validate twice, fix links, and do my regularly scheduled work tasks.
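The "many tabs at once" trick has a direct equivalent in code: check several pages concurrently instead of one after another. The sketch below is only an illustration in Python, with hypothetical names (blog_page_urls, check_pages_concurrently). It assumes the blog pages follow a /page/N/ URL pattern, which may not match your site, and it takes the per-page check as a parameter so that any function accepting a page URL can be plugged in.

```python
from concurrent.futures import ThreadPoolExecutor


def blog_page_urls(blog_url, page_count):
    """Build the URLs of the paginated blog pages.

    ASSUMPTION: pages live at <blog_url>/page/<n>/; adjust for your site.
    """
    base = blog_url.rstrip("/")
    return [f"{base}/page/{n}/" for n in range(1, page_count + 1)]


def check_pages_concurrently(page_urls, check_page, max_workers=15):
    """Run check_page(url) on every page in parallel threads.

    Returns a dict mapping each page URL to whatever check_page returned.
    max_workers plays the role of "how many tabs are open at once".
    """
    page_urls = list(page_urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as page_urls.
        results = list(pool.map(check_page, page_urls))
    return dict(zip(page_urls, results))
```

For roughly 70 blog pages, calling check_pages_concurrently(blog_page_urls("http://example.com/blog/", 70), my_check) would work through them 15 at a time, where example.com and my_check are placeholders. Threads suit this job because each check spends most of its time waiting on the network.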

Confident that I had found and fixed at least 99% of the broken links, I installed the Broken Link Checker again, and it worked without slowing down my site! And it found zero broken links! Problem solved!

But the work wasn’t over. Webmaster Tools still said that all those links were broken and that the links in my sitemap.xml file from those 50 posts I had deleted were now broken. Not a problem for me because I knew for certain those broken links were gone; I just needed to let Google know. First, I recreated my sitemap.xml file and resubmitted it to Google. Google didn’t remove those broken links immediately; it took a few weeks, but eventually they were unlisted.
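If you need to regenerate a sitemap by hand, the file format is simple enough to script. The following Python sketch is an assumption-laden illustration, not the tool used for this site: build_sitemap is a hypothetical name, the example.com URLs are placeholders, and it writes only the required loc element for each page while leaving out any URLs you list as removed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, removed=()):
    """Return sitemap XML (as bytes) for urls, skipping any in removed."""
    removed = set(removed)
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in removed:
            continue  # e.g. posts that were deleted
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    live = ["http://example.com/", "http://example.com/blog/a-post/"]
    gone = ["http://example.com/blog/deleted-post/"]
    with open("sitemap.xml", "wb") as handle:
        handle.write(build_sitemap(live + gone, removed=gone))
```

The resulting file still has to be uploaded and resubmitted to the search engine, as described above.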

Second, I manually told Google to remove those broken link URLs from its index using the Remove URLs tool. This took a lot of copying and pasting links from the broken link page to the remove URLs page, and it took a few weeks for Google to get rid of the majority of those URLs. Today, 33 are still on the list, and I’m not sure why. For some reason or another they remain in Google’s index; however, there are no broken links identified in Broken Link Checker in WP.

One more thing: Before I found the link problem, the website had dropped several pages in the search engine rankings. Where it had once been on the front page for some of our most important keywords, it wasn’t coming up at all. Shortly after I fixed my link snafu and submitted a sitemap, our search engine ranking returned to normal.

Lesson Learned: Search engines care about broken links. If you have too many on your site, you’re going to lose your spot in search engine rankings.

I pray that no other blogger ever has to go through such a mess. It was stressful, time-consuming, and mentally taxing. To make sure it doesn’t happen to you, here are some things to consider:

  • When importing or exporting your blog or changing your blog’s file structure, allow time to methodically check and fix all the links on your blog: internal and external links, plus image, audio, and video links.
  • Once all your links have been validated, keep them that way. Use a tool to regularly check the links on your site. Once broken links are identified, fix them as soon as possible.
  • When linking, use absolute links that include the full URL rather than relative links (../images/me.gif). This ensures that if your posts are published somewhere besides your blog (e.g., in a Feedburner subscription email), the links still work.
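To make the last tip concrete, here is a small Python sketch that turns a relative link into an absolute one using the standard library’s urljoin. The function names is_absolute and to_absolute are invented for this example, and the example.com addresses are placeholders, not real pages from my site.

```python
from urllib.parse import urljoin, urlparse


def is_absolute(link):
    """True if the link carries its own scheme and host (a full URL)."""
    parts = urlparse(link)
    return bool(parts.scheme and parts.netloc)


def to_absolute(link, page_url):
    """Resolve link against the URL of the page it appears on."""
    if is_absolute(link):
        return link
    return urljoin(page_url, link)


if __name__ == "__main__":
    page = "http://example.com/blog/2009/post/"
    print(to_absolute("../images/me.gif", page))
    # prints http://example.com/blog/2009/images/me.gif
```

A relative link only makes sense next to the page it came from, which is exactly why it breaks in an email: the mail client has no page_url to resolve it against.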

So tell me, have you ever had a broken link snafu? How did you fix it?