Troubleshooting cassandra: Saved cluster name XXXX != configured name YYYY

If at any time you can’t start cassandra, and the log file shows this error:

INFO [SSTableBatchOpen:3] 2012-10-31 16:51:35,669 SSTableReader.java (line 153) Opening /cassandra/data/system/LocationInfo-hd-56 (696 bytes)
ERROR [main] 2012-10-31 16:51:35,717 AbstractCassandraDaemon.java (line 173) Fatal exception during initialization
org.apache.cassandra.config.ConfigurationException: Saved cluster name XXXX != configured name YYYY
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:299)
at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:169)
at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)

The problem is a mismatch between the cluster name in the config file ($CASSANDRA_HOME/conf/cassandra.yaml) and the value LocationInfo[utf8('L')][utf8('ClusterName')] saved inside the database. To fix this we must change one or the other so that both match.
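Before changing anything, it can help to confirm which name the config file actually holds. A minimal sketch, assuming cluster_name is a top-level key in cassandra.yaml (the function name configured_cluster_name is made up for this example):

```shell
# Print the cluster_name configured in a cassandra.yaml file.
# Usage: configured_cluster_name /path/to/cassandra.yaml
configured_cluster_name() {
    # Take the first top-level "cluster_name:" line, drop the key,
    # then strip trailing whitespace and surrounding quotes.
    sed -n 's/^cluster_name:[[:space:]]*//p' "$1" | head -n 1 \
        | sed -e 's/[[:space:]]*$//' -e "s/^['\"]//" -e "s/['\"]$//"
}
```

Running configured_cluster_name $CASSANDRA_HOME/conf/cassandra.yaml prints the YYYY side of the error message, which can then be compared with the XXXX the log reports as saved.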

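Changing the config file is straightforward, but if we want to change the saved value instead, we need cassandra-cli. In the set statement below, the cluster name we want goes between the quotes of the last utf8(''):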

$CASSANDRA_HOME/bin/cassandra-cli -h localhost
use system;
set LocationInfo[utf8('L')][utf8('ClusterName')]=utf8('');
exit;

This can happen when we move data from one cluster to another (from a production environment to a non-production one, for instance), or simply when we want to change the cassandra cluster name, for whatever reason.

Basic apache cassandra installation (and some fine tune improvements)

Lately I’ve been working a lot with apache Cassandra clusters, so I’ll write some posts about it. I will start with the obvious: cassandra installation. It’s a very simple process that you could make even easier using the existing .rpm or .deb packages. But I’ll do it in a distro-independent way, so it can be useful for everybody.

1- Prerequisite: Install Java Virtual Machine

For starters, apache cassandra is a java application, so we will need the java virtual machine (jvm) to get it going. There are different jvm implementations (openjdk, for instance), but using the official Sun / Oracle version is highly recommended. We can find it at this link: http://www.java.com/en/download/linux_manual.jsp?locale=en, or we could install it using each distro’s package manager (it’s packaged at least for the main distros: redhat/centos, debian, ubuntu, etc).

2- Downloading Cassandra

Now we download Cassandra. If it’s our first installation, we probably want the latest version, which we can find on the front page of their site (it’s 1.1.6 right now). After that, we unpack it wherever we want (it should be /opt, but you might disagree about that) and soft-link the unpacked directory to “/opt/cassandra“, to make future upgrades easier.

cd /opt
wget http://apache.rediris.es/cassandra/1.1.6/apache-cassandra-1.1.6-bin.tar.gz
tar zxvf apache-cassandra-1.1.6-bin.tar.gz
ln -s apache-cassandra-1.1.6 cassandra
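The symlink is what makes later upgrades cheap: a new release is unpacked next to the old one and the link is re-pointed. A rough sketch of that step, where the function name is invented for illustration and cassandra is assumed to be stopped while the link changes:

```shell
# Re-point the "cassandra" symlink inside BASE_DIR to another unpacked release.
# Usage: switch_cassandra_version BASE_DIR RELEASE_DIR_NAME
switch_cassandra_version() {
    base=$1
    release=$2
    # Refuse to link to a release that has not been unpacked yet.
    [ -d "$base/$release" ] || { echo "No such release: $base/$release" >&2; return 1; }
    # -n replaces the link itself instead of descending into the old target.
    ln -sfn "$release" "$base/cassandra"
}
```

For example, switch_cassandra_version /opt apache-cassandra-X.Y.Z would leave /opt/cassandra pointing at the newly unpacked directory, while the old one stays in place in case we need to roll back.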

3- Managing cassandra user’s permissions

It’s a best practice not to give a service more permissions than it needs. So we will create a “cassandra” user, which will be the user running our process, and some directories that cassandra will use while running (logs, pids, etc):

adduser cassandra
mkdir /var/log/cassandra
chown -R cassandra /var/log/cassandra/
mkdir /var/run/cassandra
chown -R cassandra /var/run/cassandra/
mkdir /var/lib/cassandra
chown -R cassandra /var/lib/cassandra/
chown -R cassandra /opt/cassandra/

4- Cassandra as a system service

Now we need to start it as a system service. The official distribution does not come with an init script, but here we have a standard one:

wget http://www.tomas.cat/blog/sites/default/files/cassandra.initd -O /etc/init.d/cassandra
chmod a+x /etc/init.d/cassandra
chkconfig --add cassandra
chkconfig cassandra on

This lets us start and stop cassandra as a system service, and makes cassandra start when the server boots.
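chkconfig is the Red Hat family’s tool; Debian and Ubuntu ship update-rc.d instead. A small sketch that prints the matching registration command for whichever tool is found (the function name is invented here, and it only prints the command rather than running it):

```shell
# Print the command that would register an init script to start at boot.
# Usage: boot_enable_command SERVICE_NAME
boot_enable_command() {
    if command -v chkconfig >/dev/null 2>&1; then
        echo "chkconfig --add $1 && chkconfig $1 on"
    elif command -v update-rc.d >/dev/null 2>&1; then
        echo "update-rc.d $1 defaults"
    else
        echo "neither chkconfig nor update-rc.d found" >&2
        return 1
    fi
}
```

Calling boot_enable_command cassandra shows which of the two forms applies on the current machine.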

5- Cassandra configuration changes

Right now we have a one-node cassandra cluster. We may want to make some configuration changes, such as the RAM cassandra is allowed to use. By default the java heap is sized automatically from the server’s RAM (about half of it on small machines, a smaller share on big ones), with 100MB per CPU core for the young generation, but you can change that in the /opt/cassandra/conf/cassandra-env.sh file. You can also change the java command line options, the JMX port and so on. It’s pretty self-explanatory.
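As a sketch of such an override, the two heap variables in cassandra-env.sh can be set by hand. The values below are only an example for a hypothetical 8GB, 4-core machine; the stock script expects both variables to be set, or both left commented out, together:

```shell
# In /opt/cassandra/conf/cassandra-env.sh
# Example values only: size them for your own server.
MAX_HEAP_SIZE="4G"      # total JVM heap
HEAP_NEWSIZE="400M"     # young generation, roughly 100 MB per CPU core
```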

Another file we may be interested in is /opt/cassandra/conf/cassandra.yaml, where we can configure a lot of things, such as where data is saved, or the IP address and port we want to listen on. Examples:

  • rpc_address: address where thrift will be listening. We must put an existing IP address (it may be localhost, if we want to), or 0.0.0.0 if we want to listen on all of them
  • data_file_directories: directory where we want to store cassandra sstables
  • commitlog_directory: directory where we want to store commit_log
  • saved_caches_directory: directory where we want to store caches
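To review what a node currently has for these settings without opening the whole file, a quick sketch (the function name is hypothetical, and it assumes the keys sit uncommented at the start of a line, as in the stock file):

```shell
# Print the thrift address and storage directories set in a cassandra.yaml.
# Usage: show_storage_settings /path/to/cassandra.yaml
show_storage_settings() {
    # -A1 also prints the line after each match, because
    # data_file_directories holds its value as a list item on the next line.
    grep -E -A1 '^(rpc_address|data_file_directories|commitlog_directory|saved_caches_directory):' "$1"
}
```

For example: show_storage_settings /opt/cassandra/conf/cassandra.yaml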

In case we want to make a cluster, we will be interested in these parameters as well:

  • cluster_name: The name you want for your cluster (it must be the same for all cluster nodes and, MOST IMPORTANTLY, different between different clusters, so they don’t step on each other)
  • initial_token: this node’s token
  • listen_address: address where gossip will be listening. It can’t be localhost nor 0.0.0.0, because the rest of the nodes will try to connect to this address.
  • seeds: the server (or servers) a node should ask for the list of cluster nodes
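With the RandomPartitioner, evenly spaced tokens are usually computed as i * 2^127 / N for node i (counting from 0) in an N-node ring. A sketch of that arithmetic, assuming bc is installed (the shell’s own arithmetic cannot hold numbers this large) and using a made-up function name:

```shell
# Print the evenly spaced RandomPartitioner token for one node.
# Usage: initial_token NODE_INDEX NODE_COUNT   (NODE_INDEX starts at 0)
initial_token() {
    # bc works with arbitrary precision and does integer division by default.
    echo "$1 * (2^127) / $2" | bc
}
```

In a two-node cluster, initial_token 0 2 prints 0 and initial_token 1 2 prints 85070591730234615865843651857942052864, so each node owns half of the ring.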

6 – Fine tune

There are some things that are not enabled by default, but they improve stability and performance. I always enable them! Here they are:

6.1- Fine tune 1 – Batch-certified cassandra init script

I think the previous init script has a limitation. As cassandra is a java app, the init script just sends the “start” or “stop” signal, and it doesn’t wait for the real startup or shutdown. This is why we can’t have a REAL confirmation that cassandra has started, and if one of our batch processes depends on that, we’re screwed. This is why I added some lines that use netcat to check whether the thrift port is listening (and so whether cassandra is up and running) before returning to the shell. These are the lines:

First we extract thrift address and port from the config file:

# Take the value after the colon from the uncommented config lines
CASS_CLI_ADDRESS=$(grep '^rpc_address:' $CASS_HOME/conf/cassandra.yaml | cut -d":" -f2)
CASS_CLI_PORT=$(grep '^rpc_port:' $CASS_HOME/conf/cassandra.yaml | cut -d":" -f2)

Then we add this check to the “start” function:

nc -z -w 1 $CASS_CLI_ADDRESS $CASS_CLI_PORT
while [ $? -ne 0 ]; do
    # Give up if the log file has been silent for more than TIMEOUT_LOG minutes
    [ $(date -r $CASS_LOG +%s) -lt $(date -d "$TIMEOUT_LOG minutes ago" +%s) ] && echo "Too much time of inactivity in the log file. Aborting..." && exit 1
    # Give up if the startup began more than TIMEOUT_STARTUP minutes ago
    [ $start_date -lt $(date -d "$TIMEOUT_STARTUP minutes ago" +%s) ] && echo "It's taking too long to start. Aborting..." && exit 1
    sleep 10
    nc -z -w 1 $CASS_CLI_ADDRESS $CASS_CLI_PORT
done

It’s a little tatty, but it works. When we come back to the shell we know for sure cassandra is running. If you want the “tuned” version, you just do this:

wget http://www.tomas.cat/blog/sites/default/files/cassandra_tuned.initd -O /etc/init.d/cassandra
chmod a+x /etc/init.d/cassandra
chkconfig --add cassandra
chkconfig cassandra on

Keep in mind that this script depends on netcat, so you’ll need to install it for the script to work (“yum install nc“, “apt-get install netcat“, or whatever it is you need to do).
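The same idea can be packed into a small reusable function for batch scripts that need to wait for cassandra themselves. This is only a sketch of the technique, not the code from the init script; the function name and its arguments are assumptions of this example:

```shell
# Wait until HOST:PORT accepts connections, or give up after a timeout.
# Usage: wait_for_port HOST PORT TIMEOUT_SECONDS [INTERVAL_SECONDS]
wait_for_port() {
    host=$1
    port=$2
    timeout=$3
    interval=${4:-10}
    deadline=$(( $(date +%s) + timeout ))
    # nc -z only checks that the port is open; -w 1 limits each try to 1 second.
    until nc -z -w 1 "$host" "$port" >/dev/null 2>&1; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            echo "Port $port on $host not open after $timeout seconds" >&2
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

A batch job could then run something like wait_for_port localhost 9160 300 && ./my_batch.sh, where my_batch.sh stands for whatever depends on cassandra being up.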

6.2- Fine tune 2 – Enable Java Native Access (JNA) in cassandra

Disk access, among other things, improves a lot if cassandra can call the OS native libraries instead of going through pure java code, which is slower and more resource-consuming. We can do this by installing JNA, which acts as a bridge between java and the native libraries. Installing it is as easy as putting the .jar files in the /opt/cassandra/lib directory, and cassandra will pick them up automatically:

wget "https://github.com/twall/jna/blob/3.5.1/dist/jna.jar?raw=true" -O /opt/cassandra/lib/jna.jar
wget "https://github.com/twall/jna/blob/3.5.1/dist/platform.jar?raw=true" -O /opt/cassandra/lib/platform.jar

We can make sure cassandra is using it by taking a look at cassandra’s log file (/var/log/cassandra/system.log). If it’s not using JNA, we will find this message:

INFO [main] 2012-09-26 16:52:40,051 CLibrary.java (line 62) JNA not found. Native methods will be disabled.

If it has correctly found JNA, we’ll get this other message:

INFO [main] 2012-10-30 16:41:00,970 CLibrary.java (line 109) JNA mlockall successful
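Checking for those two messages is easy to script. A minimal sketch that greps the log for the exact strings shown above (the function name and its return codes are choices made for this example):

```shell
# Report whether a cassandra log shows JNA in use.
# Usage: jna_status /var/log/cassandra/system.log
jna_status() {
    if grep -q "JNA mlockall successful" "$1"; then
        echo "JNA enabled"
    elif grep -q "JNA not found" "$1"; then
        echo "JNA missing"
        return 1
    else
        echo "no JNA message found"
        return 2
    fi
}
```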

6.3- Fine tune 3 – Raising open files limit for cassandra

If we stress cassandra a little, it will usually need more than 1024 open files (the default maximum). To keep this from causing problems, we should edit the /etc/security/limits.conf file and raise that limit. An advisable value is 65536:

root soft nofile 65536
root hard nofile 65536
cassandra soft nofile 65536
cassandra hard nofile 65536
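These limits are applied when a session is opened, so they only show up in new logins. A small sketch to confirm the value a shell really got (the function name is made up; run it in a fresh session as the user that starts cassandra):

```shell
# Succeed if the current shell's open-files limit is at least MINIMUM.
# Usage: nofile_at_least MINIMUM
nofile_at_least() {
    current=$(ulimit -n)
    # "unlimited" always satisfies the requirement.
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$1" ]
}
```

For example: nofile_at_least 65536 || echo "open files limit is still too low". For a process that is already running, on Linux the same figure can be read from the limits file under /proc for that pid.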

6.4- Fine tune 4 – Putting a limit on Cassandra memory use

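Finally, we all know java and what its memory consumption is like. You can configure a number, but it will use more than that. Between application memory, heap memory, overhead, etc, we can find a java process configured for 8GB using 16GB or more. To keep this from happening to us, it’s highly advisable to put a limit on it. In our case, we’ve found half the server memory to be a wise number, set in the /etc/security/limits.conf file. The memlock value is expressed in KB, so the 8388608 below is 8GB, half of a 16GB server. Strictly speaking, memlock caps the memory a process may lock in RAM, which is what the JNA mlockall call does: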

root soft memlock 8388608
root hard memlock 8388608
cassandra soft memlock 8388608
cassandra hard memlock 8388608
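To work out that half-of-RAM figure for a given server, here is a small sketch assuming a Linux /proc/meminfo, which reports MemTotal in kB (the function name is just for illustration):

```shell
# Print half of the machine's RAM in KB, the unit limits.conf uses for memlock.
# Usage: half_ram_kb [MEMINFO_FILE]   (defaults to /proc/meminfo)
half_ram_kb() {
    awk '/^MemTotal:/ { print int($2 / 2) }' "${1:-/proc/meminfo}"
}
```

On a server with exactly 16GB (16777216 kB) of RAM this prints 8388608, the value used in the lines above.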

7 – Checking cassandra is up & running

And we’re done; we just need to start the daemon (“/etc/init.d/cassandra start” or “service cassandra start” or whatever). Then we can check it using nodetool (which connects to cassandra via JMX) or cassandra-cli (which connects to cassandra via thrift). The standard output will be something like this:

# /opt/cassandra/bin/nodetool -h localhost ring
Note: Ownership information does not include topology, please specify a keyspace.
Address         DC          Rack        Status State   Load            Owns                Token
127.0.0.1       datacenter1 rack1       Up     Normal  11,21 KB        100,00%             66508542233540571552076363838168202092
# /opt/cassandra/bin/cassandra-cli -h localhost
Connected to: "Test Cluster" on localhost/9160
Welcome to Cassandra CLI version 1.1.6

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [127.0.0.1]

[default@unknown] quit;
#
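For scripted checks, the ring output can be reduced to a single number. A rough sketch that counts the nodes reported as Up, assuming the column layout shown above, where Status is the fourth field (the function name is hypothetical):

```shell
# Count the nodes reported as "Up" in `nodetool ring` output read from stdin.
# Assumes Status is the 4th whitespace-separated field of each node line.
count_up_nodes() {
    awk '$4 == "Up" { n++ } END { print n + 0 }'
}
```

Usage: /opt/cassandra/bin/nodetool -h localhost ring | count_up_nodes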

And that’s it!! We now have a basic cassandra installation, with a little tuning.