Couchbase 1.8.0 -> 1.8.1 upgrade and rebalance freeze issue

We’ve stumbled on a fairly important issue in Couchbase 1.8.0: during an online upgrade to 1.8.1, the rebalance process hangs. Various causes have been proposed for this problem (for instance, the existence of empty vbuckets).

I’ve found that there are also more mundane reasons why the upgrade process can fail. After 5 unsuccessful attempts at a smooth upgrade, I’ve finally found a procedure that seems to work:

(This is a procedure for an online upgrade using the Debian package: for each node in turn, you remove it and rebalance, stop and upgrade it, then add it back and rebalance again.)

  1. Back up with cbbackup.
  2. Remove the node and rebalance (this step has never failed for me).
  3. Stop couchbase-server.
  4. Make sure that epmd is no longer running (it was still running in at least 3 of my previous upgrade attempts). If it is, kill it!
  5. Wipe /opt/couchbase (this may be overkill, but at least once I HAD to do it in order to continue the upgrade).
  6. dpkg -i couchbase-server...1.8.1.deb
  7. At this point you can edit /opt/couchbase/etc/couchbase_init.d and add the line “export HOME=/tmp” at the beginning; otherwise you won’t be able to stop the server with “service couchbase-server stop”.
  8. Add the node back to the cluster and rebalance.
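
Step 4 is the easiest one to forget, so here is a minimal sketch of how it could be scripted. This is not part of the Couchbase package: the function name ensure_epmd_stopped is made up for illustration, the script assumes that pgrep and pkill (from procps) are installed, and the 5-second grace period is an arbitrary choice.

```shell
#!/bin/sh
# Sketch of step 4: make sure no epmd process survives the server stop.
# Assumptions: pgrep/pkill (procps) are available; the function name and
# the 5-second grace period are illustrative choices, not Couchbase defaults.

ensure_epmd_stopped() {
    # Nothing to do if no process is named exactly "epmd".
    if ! pgrep -x epmd >/dev/null 2>&1; then
        echo "epmd is not running"
        return 0
    fi

    echo "epmd is still running, sending SIGTERM"
    pkill -x epmd

    # Give it a few seconds to exit before forcing it.
    i=0
    while [ "$i" -lt 5 ] && pgrep -x epmd >/dev/null 2>&1; do
        sleep 1
        i=$((i + 1))
    done

    # Still there: fall back to SIGKILL.
    if pgrep -x epmd >/dev/null 2>&1; then
        echo "epmd ignored SIGTERM, sending SIGKILL"
        pkill -9 -x epmd
        sleep 1
    fi

    # Succeed only if epmd is really gone.
    ! pgrep -x epmd >/dev/null 2>&1
}
```

You would call it as root right after step 3, for example “service couchbase-server stop && ensure_epmd_stopped”, and only go on to steps 5 and 6 if it returns successfully.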