Thursday, May 10, 2012

Apache Cassandra on a Raspberry Pi


One of the reasons I got hold of a Raspberry Pi (the $35 ARM-based Linux machine) was to play around with building a cluster of them for handling "Big Data". This is a real exercise in tinkering, very much a “what if” scenario. The first thing I wanted to play with was getting Apache Cassandra to run on the Pi. Cassandra is built with Java, of course, and there is no Java on the Pi out of the box. Several people suggested building OpenJDK (http://openjdk.java.net/), but I plumped for Oracle’s Java SE for Embedded, available here:


Download Java SE for Embedded 7 (ARMv6/7 Linux - Headless) and install it on the Pi. Once that is done, and with PATH and JAVA_HOME correctly set, you should be able to run java on your Pi.
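A quick sanity check of the environment can be sketched in shell. The function name check_java_home is made up for illustration. It only confirms that an executable bin/java exists under the given directory (defaulting to JAVA_HOME), not that the JVM actually runs on the Pi.

```shell
# Hypothetical helper, not part of the Java SE Embedded download:
# succeeds if DIR/bin/java exists and is executable.
check_java_home() {
    dir="${1:-$JAVA_HOME}"
    if [ -n "$dir" ] && [ -x "$dir/bin/java" ]; then
        echo "java found in $dir/bin"
        return 0
    fi
    echo "no executable java under '$dir'" >&2
    return 1
}
```

If the check passes, running java -version is the real test that the download matches the Pi's architecture.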

Next, get hold of a version of Apache Cassandra (http://cassandra.apache.org/); I was using version 1.1.0. Install it as usual. If you try to run Cassandra from the bin directory, it will fail to start with the error:

“Invalid initial eden size: -Xmn0M”

The problem is that cassandra-env.sh works out the young generation size (the -Xmn value in the error) by multiplying a per-core allowance in megabytes by the number of processor cores (line 69):

max_sensible_yg_in_mb=`expr $max_sensible_yg_per_core_in_mb "*" $system_cpu_cores`

The calculation of the number of system CPU cores (line 22 or thereabouts):

system_cpu_cores=`egrep -c 'processor([[:space:]]+):.*' /proc/cpuinfo`

is failing on the Pi (there is no processor line in /proc/cpuinfo), so the count comes out as zero. For now I’ve manually altered cassandra-env.sh to report only one core:

system_cpu_cores=1
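To see how a core count of zero ends up as the -Xmn0M error, here is a minimal sketch that can be run away from Cassandra. It is a simplification, not the real script. The sample cpuinfo text is an illustrative stand-in rather than a capture from a Pi, the function names are hypothetical, the per-core figure of 100 MB is an assumption, and the rest of the heap calculation in cassandra-env.sh is left out.

```shell
# Illustrative stand-in for a /proc/cpuinfo with no "processor : N" line
# (an assumption for this sketch, not real output from a Pi).
sample_cpuinfo='Processor : example ARM processor
Features  : example'

# Count matching lines with the same pattern cassandra-env.sh uses.
# grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
count_cores() {
    printf '%s\n' "$1" | grep -E -c 'processor([[:space:]]+):.*' || true
}

# Build the young generation flag from the core count.
young_gen_flag() {
    cores=$(count_cores "$1")
    per_core_mb=100   # assumed per-core allowance in MB
    echo "-Xmn$((per_core_mb * cores))M"
}

young_gen_flag "$sample_cpuinfo"
```

With no matching line the count is 0, the product is 0, and the flag comes out as -Xmn0M, which the JVM rejects.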

Cassandra now runs on the Pi. There must be a better way of doing this (my Linux programming is failing me for the moment), but for now I’m just tinkering, so it will do.

Next up: try to get some performance figures for the Pi running Cassandra.

Update

The correct way to fix this is to change the Linux section of cassandra-env.sh to:
            system_memory_in_mb=`free -m | awk '/Mem:/ {print $2}'`
            system_cpu_cores=`egrep -c 'processor([[:space:]]+):.*' /proc/cpuinfo`
            # the Pi has no matching line in /proc/cpuinfo, so use at least 1 core
            if [ "$system_cpu_cores" -lt "1" ]
            then
                system_cpu_cores="1"
            fi
            echo "Linux"
            echo "memory" $system_memory_in_mb
            echo "cores" $system_cpu_cores
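The same guard can be wrapped in a small function so it is easy to try against a test file before editing cassandra-env.sh. This is a sketch under assumptions. The name detect_cpu_cores and the optional file argument are hypothetical, and the extra checks for an unreadable file or a non-numeric count go beyond what the change above does.

```shell
# Hypothetical helper: count "processor : N" lines in a cpuinfo-style file,
# falling back to 1 when the file is unreadable or nothing matches.
detect_cpu_cores() {
    cpuinfo="${1:-/proc/cpuinfo}"
    cores=0
    if [ -r "$cpuinfo" ]; then
        # grep -c exits non-zero on zero matches; "|| true" keeps going
        cores=$(grep -E -c 'processor([[:space:]]+):.*' "$cpuinfo" || true)
    fi
    # Treat anything that is not a plain number as zero
    case "$cores" in
        ''|*[!0-9]*) cores=0 ;;
    esac
    if [ "$cores" -lt 1 ]; then
        cores=1
    fi
    echo "$cores"
}

detect_cpu_cores
```

Called with no argument it reads /proc/cpuinfo; passing a path lets you check the fallback against a hand-written file.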

7 comments:

  1. Could you go over how to install that java? It didn't seem to come with an installer or any instructions on where to put it/how to make it "installed"

  2. Brian, sorry for the late reply. I did it the simple way: unpack the file and copy the contents to /usr/local/bin/java. Then it's just a case of setting PATH and JAVA_HOME correctly:

    PATH=$PATH:/usr/local/bin/java/bin
    JAVA_HOME=/usr/local/bin/java
    export PATH
    export JAVA_HOME

    You'll need to update PATH in /etc/profile to get it to load when you log in.

    Hope that helps !

  3. Cassandra represents one of the most exciting uses of P2P technology to date.

  4. Hi,
    The latest version of Cassandra (1.2.1) now handles our Raspberries just fine (it seems):

    # some systems like the raspberry pi don't report cores, use at least 1
    if [ "$system_cpu_cores" -lt "1" ]
    then
        system_cpu_cores="1"
    fi

    br Svante

  5. Which distro did you install on your Pi to get this running? Soft-float Raspbian "wheezy"?

    Replies
    1. Disregard. I ended up using the JDK 1.8 (still in beta/developer release) which supports hardfloat.

    2. Which Cassandra did you use? I found that 1.2.2 needs JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark" commented out in the cassandra-env.sh file.
