Migrating an old Blogger blog to WordPress

I’m an expert now at migrating old Blogger Blogs.

Google, the owner of Blogger, has decided to shut off FTP posting, which has left many people stranded. They could have improved their algorithms; instead, things are shut off.

I am now a reluctant expert at migrating Blogger blogs to WordPress. If you have a more “modern” style Blogger blog, it’s very easy to import directly from Blogger using the XMLRPC way. Just navigate to Tools → Import Blogger:

Import Blogger

Then authorize Blogger via Google:

Authorize Blogger with Google

And voila, you have access to your Blogger Blog, assuming it’s supported.

BUT IF IT FAILS, YOU NEED A PLAN B

But if it fails, as it does for some folks with older blogs, then you must try the long route. Comments and corrections are welcome on this post. Please feel free to spread the word about this post to those left hanging by Blogger.

Step 1: Export your blog to XML from inside Blogger.

You must be an administrator on the blog to have this option. While logged into Blogger, visit your Blogger Dashboard and choose settings:

Blog Dashboard

Step 2: Choose “Export”:

Blogger Settings

Step 3: Click “Download Blog”:

Blogger: Export Your Blog

The file produced is in a special Blogger export XML format. It will be named something like blog-03-02-2010.xml. This file should contain all your posts and comments. It will not contain the images you use on your blog. If you host on your own server via FTP, though, you should have all those old images.

Step 4: Convert your Blogger XML to WordPress XML (WXR Format):

WordPress can’t import this file format, so first you must convert the file to the WordPress-compatible WXR format. You can use the blogger2wordpress online tool to do this: http://blogger2wordpress.appspot.com/

blogger2wordpress.appspot.com

Step 4a. (If online conversion from Blogger XML to WordPress XML fails):
Caveat: if your blog is too large, the online tool may time out or fail. If that’s the case, you need to download the command line version of the tool and run it locally. The project is Google Blog Converters. Download the file google-blog-converters-r79.tar.gz (as of this writing it’s at version 79; that number may change).

Under MacOS, this .gz file should automatically gunzip itself and leave you with a file named google-blog-converters-r79.tar. My downloads go to a directory called Downloads in my home directory, so where I’m working is /Users/artlung/Downloads. I untar the file by double-clicking on the file in Finder or if I’m already on the command line by running tar xvf google-blog-converters-r79.tar and get a directory called google-blog-converters-r79.

Step 4.b: Open Terminal, and run:

cd Downloads

ls

You should see:

blog-03-02-2010.xml
google-blog-converters-r79

Now it’s time to run the conversion:

google-blog-converters-r79/bin/blogger2wordpress.sh blog-03-02-2010.xml > wordpress-blog-03-02-2010.xml

Assuming the conversion worked, ls should now show the following files in your Downloads directory.

blog-03-02-2010.xml
google-blog-converters-r79
wordpress-blog-03-02-2010.xml

That XML conversion is the hard part. I’ve not had it fail on me doing it that way, but if it does, you may want to check the README.txt file distributed with the blog converters tool.

Step 5: Import WordPress to WordPress:

This is covered pretty well in the document Importing Content: WordPress on the WordPress site.

Step 5a: If the file fails to import because it’s too large:

If the file is too large, then you can try overriding the various file upload limits under PHP, and WordPress: How to Import a Large WordPress XML File and Override the Default Limits

Basically, add these lines to your .htaccess file:

php_value upload_max_filesize 32M
php_value post_max_size 32M

Read more about these PHP settings (which override the settings in php.ini).

You can also try raising the memory limit constant in your wp-config.php file. Note that this sets the PHP memory available to WordPress, not the upload size itself. Just add this line:

/** Increase the PHP memory available to WordPress */
define('WP_MEMORY_LIMIT', '32M');

But I’ve had mixed success getting a shared server to respect these various limits. So, yes, here’s another step. Ugh. I know this is ugly.

Step 5b: Remove extra whitespace:

If, and only if, your XML file is almost small enough to fit, you can remove the leading whitespace from the XML file. In an editor like TextMate, you can do a regular-expression search for ^ + (a caret, a space, and a plus sign) and replace it with nothing to remove the leading spaces.
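If you would rather not open a huge file in an editor, the same search and replace can be sketched in Python. This is a minimal sketch, not part of the converter tool: the function name strip_leading_spaces is my own, and it assumes that none of your posts rely on indentation (for example inside preformatted text), because it strips leading spaces from every line, including lines inside post content.

```python
import re


def strip_leading_spaces(text: str) -> str:
    """Delete the run of spaces at the start of every line (the `^ +` search)."""
    return re.sub(r"^ +", "", text, flags=re.MULTILINE)


def strip_file(src_path: str, dst_path: str) -> None:
    """Read src_path, strip leading spaces, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        data = f.read()
    with open(dst_path, "w", encoding="utf-8") as f:
        f.write(strip_leading_spaces(data))
```

Calling strip_file("wordpress-blog-03-02-2010.xml", "wordpress-blog-stripped.xml") writes a smaller copy and leaves the original untouched, so you can compare the two before importing.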

But that may not help enough. In which case it’s much uglier still.

Step 5c: Break up the XML file into segments small enough to fit on upload:

Really, WordPress should support importing a file you can FTP into its upload space, but I can’t find a tool or hack that allows for that.

Anyway, the wordpress-blog-03-02-2010.xml we have can be broken up. To do this you need to edit the text file (XML is just a text file). The way I did it was to make multiple copies of it, named like this:

wordpress-blog-posts-0001-0075.xml
wordpress-blog-posts-0076-0150.xml
wordpress-blog-posts-0151-0230.xml

The nice thing is that for each <item> element in the WordPress WXR file, there is an element like <wp:post_id>1</wp:post_id>. The numbering starts at 1 and in my example goes to 230. So remove everything between and including <item>...</item> for the numbers not in the filename, and resave each file. So in the first example I removed the <item>s with wp:post_id values greater than or equal to 76.

Assuming each file is under your upload limit, each one can be uploaded the normal way in the WordPress Dashboard: Tools → Import → WordPress. If the files are still too large, you will have to break them up further. Be cautious with the formatting in the XML or you will lose posts.
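The hand editing in Step 5c can also be sketched as a small script. This is only a sketch under assumptions: it treats the file as plain text, assumes every post is wrapped in a bare <item>...</item> pair with nothing but whitespace between items, and the names split_wxr and write_chunks are mine, not part of any WordPress tool. It copies everything before the first <item> and everything after the last </item> into each output file, so every piece is a complete WXR document.

```python
import os
import re
from typing import List

# Assumption: items look like <item>...</item> with no attributes on the tag.
ITEM_RE = re.compile(r"<item>.*?</item>\s*", re.DOTALL)


def split_wxr(text: str, per_file: int) -> List[str]:
    """Return complete WXR documents holding at most per_file items each."""
    matches = list(ITEM_RE.finditer(text))
    if not matches:
        raise ValueError("no <item> elements found")
    header = text[: matches[0].start()]  # everything before the first <item>
    footer = text[matches[-1].end():]    # closing tags after the last </item>
    chunks = []
    for i in range(0, len(matches), per_file):
        body = "".join(m.group(0) for m in matches[i:i + per_file])
        chunks.append(header + body + footer)
    return chunks


def write_chunks(src_path: str, per_file: int = 75) -> List[str]:
    """Split src_path into numbered files next to it; return the new names."""
    with open(src_path, encoding="utf-8") as f:
        chunks = split_wxr(f.read(), per_file)
    base, ext = os.path.splitext(src_path)
    names = []
    for n, chunk in enumerate(chunks, start=1):
        name = f"{base}-part-{n:03d}{ext}"
        with open(name, "w", encoding="utf-8") as out:
            out.write(chunk)
        names.append(name)
    return names
```

For example, write_chunks("wordpress-blog-03-02-2010.xml", per_file=75) would produce files of up to 75 items each. Check the size of each piece against your upload limit and lower per_file if any are still too big.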

Now, this does not cover the ways to manage your images, converting your permalinks to a new format to ensure old links to your site work, the best way to install WordPress, how to convert a Blogger theme to WordPress, or how to ensure your archive pages work and redirect correctly. If there’s call for it, I may write those up as well.

Best of luck!

forty-one comments so far...

Wow, am I glad I found this post. Thank you for taking the time to share this solution. One question for you, I see you’ve moved some rather large blogger blogs. I’m facing the same challenge and am hours ahead thanks to you, but did you experience missing and/or duplicate comments in your blogger exports? There are a number of threads on the blogger support pages regarding this issue and I’m having the same problem.

Any thoughts?

Very interesting. You have a link on the dupe issue?

Of the blogs I have done, the largest was Fussy.org, which had ~1300 posts and ~15,000 approved comments. Everything so far has appeared to be fine, though I don’t have a line on something like an integrity check on either a) the export from Blogger, b) the conversion to WordPress, c) comparing the two exports, or d) comparing any of those to the imported WordPress blog.

That means I don’t have a solution for comprehensive testing for problems like that. I did spot checks, and so did my client, and it all appeared fine.

One thing you might do is use your existing blogger archives, then compare that to your WordPress archives if you have a theme which replicates the format of your Blogger archives.

Those’re my thoughts. What do you think?

I can see the duplicated comments right in the Blogger export (xml) file so naturally they appear when I convert to a WordPress file. This being the case I don’t think it has anything to do with the theme.

The problem seems to appear in posts older than 4 or 5 months. The most recent posts are fine.

You can view an example here (blog not live):
http://vps3204.inmotionhosting.com/~wootdog/2008/12/ready/

And here’s the original Blogger post:
http://www.wootube.net/2008/12/ready.html

I would cut/paste the same from the original Blogger export but I’m sure you can take my word for it that it’s there.

There’s an ongoing support thread in the Blogger help forums.

I’m going to sneak over to fussy.org and see if I can find any duplicate comments in her archives 😉

Those are definitely doubled! I didn’t see anything like that in my spot checks but I wouldn’t rule it out. I didn’t pay attention to counts of comments inside Blogger, perhaps I should have and then doublechecked after import.

Wondering if the date being so close to end and beginning of the year (Dec 31 / Jan 1) caused a glitch in “uniqueness” somehow. That’s pure guessing on my part though.

I wonder if there’s a tool for scanning nodes in an XML document for being identical. Something to check adjacent comment nodes for similarity.
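A rough version of such a check can be sketched with Python’s standard library. This is a sketch under assumptions, not a tested tool: it assumes the Blogger export is an Atom feed whose <entry> elements carry <id>, <published>, <author><name> and <content> children, and it treats two entries as duplicates when the published time, author name and content all match. The function name find_duplicate_entries is my own.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"  # standard Atom namespace


def find_duplicate_entries(root):
    """Group Atom entries by (published, author, content); return groups > 1."""
    seen = defaultdict(list)
    for entry in root.iter(ATOM + "entry"):
        key = (
            entry.findtext(ATOM + "published", default=""),
            entry.findtext(ATOM + "author/" + ATOM + "name", default=""),
            entry.findtext(ATOM + "content", default="").strip(),
        )
        seen[key].append(entry.findtext(ATOM + "id", default=""))
    # Keep only the keys that were seen more than once.
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


def report(path):
    """Print each group of suspected duplicates found in the file at path."""
    dups = find_duplicate_entries(ET.parse(path).getroot())
    for (published, author, _content), ids in dups.items():
        print(f"{len(ids)} copies by {author!r} at {published}: {ids}")
```

Running report("blog-03-02-2010.xml") lists suspected duplicates so you can inspect them by hand before deciding what to delete. It compares all entries, not only adjacent ones, so it may also flag legitimately repeated comments.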

Good eyes, good luck, and I’ll be curious to hear how you resolve it! Thanks for sharing!

It depends. On more recent posts, if the slugs are there and there are titles on the posts, then all you have to do is use a custom permalink structure: /%year%/%monthnum%/%postname%.html

Then do redirects for your old archives as follows using .htaccess like:

Redirect permanent /2001_10_01_archive.html http://www.example.com/2001/10/
Redirect permanent /2001_11_01_archive.html http://www.example.com/2001/11/
Redirect permanent /2001_12_01_archive.html http://www.example.com/2001/12/
Redirect permanent /2002_01_01_archive.html http://www.example.com/2002/01/
Redirect permanent /2002_02_01_archive.html http://www.example.com/2002/02/

If you have lots more, you could do Apache RedirectMatch directives.

But it depends on the archive format you were using under Blogger.
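If you have many months of archives, typing each Redirect line by hand gets tedious. Here is a small sketch that generates them, assuming your archives follow the same /YYYY_MM_01_archive.html pattern as the example above. The function name archive_redirects and the example.com base URL are placeholders.

```python
def archive_redirects(base_url, start, end):
    """Build one Redirect line per month from start to end, inclusive.

    start and end are (year, month) tuples, e.g. (2001, 10).
    """
    lines = []
    year, month = start
    while (year, month) <= end:
        lines.append(
            f"Redirect permanent /{year}_{month:02d}_01_archive.html "
            f"{base_url}/{year}/{month:02d}/"
        )
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return lines


if __name__ == "__main__":
    for line in archive_redirects("http://www.example.com", (2001, 10), (2002, 2)):
        print(line)
```

Paste the printed lines into your .htaccess file, then spot check a few old archive URLs in a browser.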

It depends on the slug. I have not come up with an easy solution yet, but I run linklint link checks to come up with the blogger urls against the existing site, then compare. In some cases there are only a few.

Blogger, for example, will remove articles like ‘a’, ‘an’ and ‘the’, and its slugs also appear to be limited to 38 characters.

So it will turn for example: A Good Day To “Swim” into good-day-to-swim (that’s an example I just made up). WordPress may accept the existing urls, but you may need to add them back.

Like I said, I haven’t found a consistent way to handle that yet, it’s a pain and laborious. There are some plugins for WordPress that can tweak the default slug for new posts, but I have not seen a way to run through existing ones.

But you can look at your existing archive filenames and then cross check those, changing the slugs inside the Edit Post window.
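To get candidate slugs for cross checking, the behavior described above can be approximated in a few lines. This is a guess at Blogger’s rules, not a documented algorithm: it assumes articles are dropped, punctuation is discarded, and the slug is cut at a word boundary so it stays within 38 characters. The name guess_blogger_slug is hypothetical, and the output should always be compared against your real archive filenames.

```python
import re

ARTICLES = {"a", "an", "the"}  # words Blogger appears to drop
MAX_LEN = 38                   # observed limit; an assumption, not documented


def guess_blogger_slug(title):
    """Approximate a Blogger-style slug for a post title."""
    words = [w for w in re.findall(r"[a-z0-9]+", title.lower())
             if w not in ARTICLES]
    slug = ""
    for word in words:
        candidate = word if not slug else slug + "-" + word
        if len(candidate) > MAX_LEN:
            break  # assumed: cut at a word boundary, not mid-word
        slug = candidate
    if not slug and words:
        slug = words[0][:MAX_LEN]  # a single very long word gets truncated
    return slug
```

For the made-up title above, guess_blogger_slug('A Good Day To “Swim”') gives good-day-to-swim. Where the guess differs from the old filename, trust the filename.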

Thanks for the tip!

/%year%/%monthnum%/%postname%.html does not seem to result in the same permalink for longer titles, as the titles are truncated with blogger. 🙁

Hi Joe, you might remember me from the WebSD mailing list seven or eight years ago.

Anyway, I exported the blog from Blogger, but all I got was the blog template. None of the posts are there. Any suggestions?

Right now the blog is at SquierPhotography.com/blog/ and the new WP blog will be at blog.SquierPhotography.com — but I’d like the new one to contain all the old posts, as well. I’ll just leave the old one alone.

Hi Adam — I do remember you! At which stage did you get no posts and how could you tell? It’s not clear from your description at what point in the process it stalled. And did you try the online process — Tools, Import, Blogger? Or is this via the XML export process?

I looked at the xml file. It was only 1.5M so I thought something weird was up. It didn’t look like it stalled — it just finished the download. I tried a few months ago to just import the Blogspot blog but it wanted me to convert it to being blogger-hosted first.

I can look into it some more to see if the WP importer has changed at all.

For a circa 2006 blog, 1.5 MB could be right. Remember your images aren’t getting pulled in there, so it’s just the text.

Of course, examining the xml file can be tricky. Look for instances of the string <item> in a WordPress XML export, or <entry> in a Blogger XML export.
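A quick way to do that count without scrolling through the file is a short Python sketch like this one. It simply counts the literal strings mentioned above; the function names are my own. Keep in mind that in a Blogger export, entries may include things other than posts (comments, for instance), so treat the numbers as a sanity check rather than an exact post count.

```python
def count_markers(text):
    """Count literal <entry> and <item> opening tags in the given text."""
    return {
        "<entry>": text.count("<entry>"),  # Blogger (Atom) export
        "<item>": text.count("<item>"),    # WordPress (WXR) export
    }


def count_markers_in_file(path):
    """Read the file at path as UTF-8 and count the markers in it."""
    with open(path, encoding="utf-8") as f:
        return count_markers(f.read())
```

If count_markers_in_file("blog-03-02-2010.xml") reports zero for both, the export really is just the template and it is worth downloading it again.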

Shoot me an email if you have any questions Adam.

Thanks for your tutorial. It worked like a charm. I had to run the converter in the Terminal, but your directions were perfect. There are a few differences (all the file names are different) but I can live with that.

Thanks again.

Does the tool work from a Windows computer? My blog is too big for the online converter it seems and I don’t have access to a Mac.

Thanks

Thank you! With the May 1 deadline for converting from FTP published Blogger blogs just around the corner, I need to figure this out soon. I’ll try that conversion.

Just got what looks like the final notice:

———- Forwarded message ———-
From: Blogger
Date: Tue, Apr 27, 2010 at 9:22 PM
Subject: URGENT: Blogger FTP publishing discontinued on May 1st
To: artlung@gmail.com

Dear FTP user:

You are receiving this e-mail because one or more of your blogs at Blogger.com are set up to publish via FTP. Earlier this year we announced a planned shut-down of FTP support on Blogger Buzz (the official Blogger blog), and that deadline of May 1st is quickly approaching. This is the second and final email reminder to migrate your FTP blog from your current URL to a Blogger-managed URL (either a Custom Domain or a Blogspot URL).

For more information on the deprecation of FTP, as well as a handful of helpful resources which will help walk you through the migration process, please take a look at our dedicated migration blog: http://blogger-ftp.blogspot.com/.

Thanks for using Blogger.

Regards,
The Blogger Team
Google
1600 Amphitheatre Parkway
Mountain View, CA 94043
—-
This e-mail is being sent to notify you of important changes to your Blogger account.

Joe, I’m helping someone who is trying to import a large Blogger blog into WordPress and running into all kinds of issues. Can you please contact me. I’d like to get an estimate from you on how much it would cost to have you help us out. (Followed your instructions here, but we’re coming up significantly short on the bringing over all the comments).

Just an FYI – I finally solved all my issues with the import. Had to run the import on that WordPress XML many many times before it was able to finish the whole file – I ended up with duplicate comments and the comment counts weren’t working. I fixed those two issues with some SQL commands in PHPMyAdmin – details here

I would be highly interested if you wrote another article about how to convert a Blogger theme into a WordPress theme. I’ve searched high and low for such a plugin but haven’t found anything as of yet. :p

I am having some issues. I am using windows and get lost in the instructions starting: Step 4.b: Open Terminal, and run:

I see that this is written for mac? So what instructions should I use for Windows. The blog I am converting is 134MB so this is my only hope.

Your help is appreciated,
Thanks
Jessica

Hey! Hoping someone can help me…I’m trying to use this conversion method using the terminal on a mac…I tried following the steps and it seemed to work, I get a new file that says WordPress but its an empty file. its 0KB. Any suggestions what I’m doing wrong?

Thank you. This was a godsend article. I had so much trouble because my blog had over 1000 posts – it was one problem after another. Google wouldn’t recognize my site so I couldn’t use the plugin, then my blog was so big I couldn’t use the APP, then I had to split it up…LOL

I felt like I had done something to piss off the blog gods. This really helped out, I appreciate it. And on top of it – you had the mac stuff. I could kiss you!

Lewis L, that’s when Step 4a kicks in. What the error means is the filesize of your export has a problem I believe. It would be worthwhile to try the downloadable tool. It’s more work, but it does solve the problem.

Joe:

Technology truly appears to not wish to cooperate with me.

When I try to unzip the converter file a message pops up which says:

“Compressed Zip Folder Error

Windows cannot open the file.

The compressed Zip Folder

C:/users/Lewis/Downloads/Google-Blog-Converters/r89.tar.gz is invalid
