Cleaning Up Your WordPress Database
Problem: Difficulty importing or exporting large WordPress database files.
Over the years, one of my blogs has acquired upwards of 1,000 posts and 10,000 comments, and I’ve changed hosting companies 3 or 4 different times. Each time, I’ve experienced various degrees of hassle related to the exporting and importing of the large database files. The first time I changed hosting companies, I used the standard WordPress export functionality, resulting in an XML file upwards of 40MB. Trying to import this file via WordPress was a total nightmare, which I eventually solved by using this Python script to split the export file into about 40 smaller chunks, and uploading each one by one. Not fun!
The second time I changed hosting companies, I got a little smarter and exported the entire MySQL database via MyAdmin. This was a relatively easy and hassle-free solution, as my blog was still pretty small then. By the third time I changed hosting companies, my database file had grown to about 40MB and I couldn’t export it smoothly due to timeout constraints my hosting company imposed on MyAdmin. Enter Sequel Pro, a free database management tool for Mac OSX that easily allows one to import and export MySQL files. NOTE: initial connection attempts failed until I set the hostname to the IP address, after which things worked just fine.
By the fourth time I changed hosting companies, I noticed that Sequel Pro could no longer import my
wp_posts table because it was throwing up all sorts of confusing syntax errors regardless of modifications to the frequency of
INSERT statements. I also noticed that
wp_posts listed about four times as many posts as I actually had, accounting for about 25MB of space! Upon browsing through the exported table, I noticed that many of these posts appeared to be duplicates. That led to a Google search, which led to this article on cleaning up WordPress databases.
It turns out that WordPress saves revisions as separate posts in the database. In other words, if you edit and republish Post X seven different times, there will be seven different versions of Post X stored in your database. This might be convenient if persistent revisions are important to your style of blogging, but if not, then this is just a pot of yuck waiting to boil over. It would be helpful if WordPress provided some method of notifying the owners of large blogs about this particular problem, but, they don’t. So, after backing up my database, I anxiously ran the following query against
DELETE FROM wp_posts WHERE post_type = "revision";
This quickly cut my
wp_posts table to 25% of what it was, resulting in an export file a little over 3MB, which allowed for smooth importing, after which things were back to normal. I hope this advice helps somebody, because it took me hours of frustration to get to this point!