My latest client launch involved converting a blog from Typepad to WordPress, including everything from content to site design. While the design was relatively trivial to rebuild on top of the Thematic framework, I would not have been able to convert the content from Typepad efficiently without this set of instructions on Foliovision’s website.
Most of it was pretty straightforward, and reading through the reader comments helped smooth out a few basic issues. I was grateful for the help, as manually copying and pasting 2½ years’ worth of posts and images was not something I was looking forward to. Still, there were a few small things that could have been deal breakers:
When using HTTrack to crawl the Typepad account and download all the images (Foliovision step #7), it was necessary to change the spider settings in the preferences dialog. Specifically, I had to change the ‘spider’ dropdown value to “no robots.txt rules”.
To run the Search Regex plugin successfully, I had to allocate more memory to PHP in WordPress, which meant adding a line to the wp-config.php file (typically a WP_MEMORY_LIMIT define).
Finally, Typepad uses image URLs that don’t contain a file type extension. WordPress choked on these and refused to display them. After trying various custom options in Search Regex, it finally hit me that there were only four or five different ending strings for the image “names”. I was able to search for those strings and replace each one with the same string plus the file type extension appended to the end.
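For anyone who would rather script that last fix than run it through Search Regex, here is a minimal Python sketch of the same idea: find image URLs that end in one of a handful of known strings and append a file extension. The ending strings and the .jpg extension in SUFFIX_TO_EXT are placeholders, not the actual values from this migration, and append_extensions is a hypothetical helper name.

```python
import re

# Placeholder ending strings and extensions (assumptions for illustration).
# Replace them with the four or five endings found in your own export.
SUFFIX_TO_EXT = {
    "-pi": ".jpg",
    "-800wi": ".jpg",
    "-500wi": ".jpg",
    "-320wi": ".jpg",
    "-popup": ".jpg",
}

# Match a double-quoted src/href value whose URL ends in a known suffix.
# The lookahead for the closing quote means URLs that already carry an
# extension are left alone, so the rewrite is safe to run twice.
_SUFFIXES = "|".join(
    re.escape(s) for s in sorted(SUFFIX_TO_EXT, key=len, reverse=True)
)
_PATTERN = re.compile(
    r'(?P<attr>(?:src|href)=")(?P<url>[^"]*?(?P<suffix>' + _SUFFIXES + r'))(?=")'
)


def append_extensions(html: str) -> str:
    """Return html with a file extension appended to extension-less image URLs."""

    def _fix(match: re.Match) -> str:
        ext = SUFFIX_TO_EXT[match.group("suffix")]
        return match.group("attr") + match.group("url") + ext

    return _PATTERN.sub(_fix, html)
```

The downloaded image files would need to be renamed to match, since the rewritten URLs only resolve if a file with that extension actually exists on the server.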
I have to confess that I didn’t do much after step 9 in Foliovision’s instructions since the actual URLs of the posts weren’t changing, just the back-end software, so permalink redirects weren’t needed. Which is fine, because as it turns out, I ended up spending more time than I cared for on the permalinks. Shortly after the initial launch, we discovered that the archive permalinks would only work if you were logged in. If you were unauthenticated, as most site visitors are, you would get WP’s 404 error.
After all the work converting the Typepad posts, my initial thought was that the permalink structure had somehow been imported incorrectly or had gotten corrupted, but trying all of the different permalink formatting options and fussing with the .htaccess file and other PHP config options only led me to a dead end.
Finally, I just opened one of the old files and republished it to see if there was something in the database that was goofy and would be fixed by publishing the file. Sure enough, it worked. Eventually I discovered that all of the old Typepad posts had been imported as quasi-drafts – the post status read “Last Modified” instead of “Published”, as all of my client’s new content indicated.
I wasn’t thinking down that route because all of the excerpts appeared in the archive pages, whether logged in or logged out. How could the excerpts be showing, as well as the full posts on the index pages, if the posts themselves weren’t published? Still haven’t figured that one out…
Fortunately, WP’s batch editing features include changing the published status, so I didn’t have to republish almost 550 posts one by one, but still, 36 pages of posts… In the end, it was good to just get it done and get everything online with WordPress. I’m looking forward to tackling the next phase of the website and building a more media-conscious blogsite.
Converting a Typepad blog to Wordpress