How to Install Cassandra on Ubuntu

Introduction

Apache Cassandra is a popular, open-source NoSQL database. It provides high availability while handling large amounts of data. Traditional relational databases cannot handle linear scaling, seamless data distribution, and other big data requirements as efficiently as Cassandra.

A number of big players in online industries have turned to Apache Cassandra. Some of them include Netflix, Apple, Uber, and eBay.

Follow the steps listed in this guide to learn how to install Apache Cassandra on Ubuntu with the necessary packages.

Note: Learn more about Cassandra in our MongoDB vs Cassandra head-to-head comparison article.

Prerequisites

  • An Ubuntu system
  • Access to a terminal or command line
  • A user with sudo or root privileges

STEP 1: Install Packages Necessary for Apache Cassandra

Before installing Cassandra on Ubuntu, make sure you install Java OpenJDK 8 and the apt-transport-https package.

If you already have these packages installed, you can skip to STEP 2 of the guide.

Note: We used Ubuntu 20.04 to provide the examples, but the instructions apply to other Ubuntu versions as well.

Install Java OpenJDK

Apache Cassandra needs OpenJDK 8 to run on an Ubuntu system. Update your package repository first:

sudo apt update

When the process finishes, install OpenJDK 8 using the following command:

sudo apt install openjdk-8-jdk -y

When the installation completes, verify that Java was installed successfully by checking the Java version:

java -version

The output prints the Java version. For OpenJDK 8, the version string starts with 1.8.

The second number (8) represents the major version of Java.
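
To run the same check from a script, you can parse the version string. The following is a minimal sketch, not an official tool: the helper name java_major_version is hypothetical, and it assumes that legacy version strings such as 1.8.0_292 carry the major version in the second field while newer strings such as 11.0.11 carry it in the first.

```shell
# Hypothetical helper: print the Java major version for a version string.
# Legacy strings start with "1." (the major version is the second field);
# newer strings lead with the major version.
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Usage: take the quoted version from the first line of `java -version`,
# which Java prints to stderr. Skipped when Java is not installed.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | awk -F'"' 'NR==1 {print $2}')
    echo "Java major version: $(java_major_version "$version")"
fi
```

If the helper prints anything other than 8, review the OpenJDK installation before continuing.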

Install the apt-transport-https Package

Next, install the APT transport package. You need to add this package to your system to enable access to the repositories using HTTPS.

Enter this command:

sudo apt install apt-transport-https

STEP 2: Add Apache Cassandra Repository and Import GPG Key

You need to add the Apache Cassandra repository and pull the GPG key before installing the database.

Enter the command below to add the Cassandra repository to the sources list:

sudo sh -c 'echo "deb http://www.apache.org/dist/cassandra/debian 40x main" > /etc/apt/sources.list.d/cassandra.list'

The command prints no message and returns to a new prompt line.

The latest major Cassandra release at the time of writing this article is 4.0. That is why we used 40x in the command. To install an older version, for example 3.9, replace 40x with 39x.
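
The mapping from a release number to the repository name can be sketched in a few lines of shell. This is only an illustration: the helper name cassandra_repo_line is hypothetical, and it assumes the naming pattern described above (remove the dot and append x).

```shell
# Hypothetical helper: build the APT source line for a Cassandra release
# series, assuming the pattern "4.0" -> "40x" and "3.9" -> "39x".
cassandra_repo_line() {
    series="$(echo "$1" | tr -d '.')x"
    echo "deb http://www.apache.org/dist/cassandra/debian ${series} main"
}

# Print the line for the 4.0 series.
cassandra_repo_line 4.0

# To write the line to the sources list instead of printing it:
# cassandra_repo_line 4.0 | sudo tee /etc/apt/sources.list.d/cassandra.list
```

Printing the line first lets you confirm the series name before writing it to /etc/apt/sources.list.d.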

Then, use the wget command to pull the public key from the URL below:

wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -

If you entered the command and the URL correctly, the output prints OK.

Note: Pay attention to the letter case in the URL above. You need to enter the correct case and the dash at the end of the command.

STEP 3: Install Apache Cassandra

You are now ready to install Cassandra on Ubuntu.

Update the repository package list:

sudo apt update

Then, run the install command:

sudo apt install cassandra

We ran the installation on Ubuntu 20.04. The output should look similar on older versions of Ubuntu.

Note: Once the installation finishes, the Cassandra service starts automatically. Also, a user cassandra is created during the process. That user is used to run the service.

Verify Apache Cassandra Installation

Finally, to make sure the Cassandra installation process completed properly, check cluster status:

nodetool status

The letters UN at the start of a node's line stand for Up and Normal, which signals that the node is working.
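
If you want to check the state from a script rather than by eye, you can count the lines that begin with UN. The sketch below is one possible approach; the helper name count_un_nodes is hypothetical, and it assumes each node line of the nodetool status output begins with the two-letter state code.

```shell
# Hypothetical helper: count nodes reported as UN.
# Reads `nodetool status` output on stdin and counts the lines whose
# first field is exactly "UN".
count_un_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Usage: skipped when nodetool is not installed.
if command -v nodetool >/dev/null 2>&1; then
    echo "UN nodes: $(nodetool status | count_un_nodes)"
fi
```

On a single-node installation, a count of 1 matches the UN line described above.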

You can also check Cassandra status by entering:

sudo systemctl status cassandra

The output should display active (running) in green.

Commands to Start, Stop, and Restart Cassandra Service

If, for any reason, the service shows inactive after the installation, you can start it manually.

Use the following command to start Cassandra:

sudo systemctl start cassandra

Check the status of the service again. It should change to active.
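
Cassandra can take some time to respond after the service starts, so a script that starts the service may need to poll before it continues. The following is a minimal sketch of such a loop. The retry helper is hypothetical, and the attempt count and delay in the example are arbitrary values, not recommendations.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (commented out): start the service, then poll nodetool.
# sudo systemctl start cassandra
# retry 12 5 nodetool status && echo "Cassandra is responding"
```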

To restart the service, use the restart command:

sudo systemctl restart cassandra

To stop the Cassandra service, enter:

sudo systemctl stop cassandra

The status shows inactive after using the stop command.

Optional: Start Apache Cassandra Service Automatically on Boot

When you turn off or reboot your system, the Cassandra service switches to inactive.

To start Cassandra automatically after booting up, use the following command:

sudo systemctl enable cassandra

Now, if your system reboots, the Cassandra service starts automatically.

STEP 4: Configure Apache Cassandra

You may want to change the Cassandra configuration settings depending on your requirements. The default configuration is sufficient if you intend to use Cassandra on a single node. If using Cassandra in a cluster, you can customize the main settings using the cassandra.yaml file.

Note: We strongly advise creating a backup of your cassandra.yaml file if you intend to edit it. To do so, use this command:

sudo cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.backup

We used the /etc/cassandra directory as a destination for the backup, but you can change the path as you see fit.
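
If you edit the file more than once, a timestamp in the backup name keeps earlier copies from being overwritten. This is an optional variation on the command above, shown as a sketch; the helper name backup_with_timestamp and the suffix format are our own choices.

```shell
# Hypothetical helper: copy FILE to FILE.backup-YYYYMMDD-HHMMSS and print
# the path of the copy. cp -p keeps the mode and timestamps of the original.
backup_with_timestamp() {
    src=$1
    dest="${src}.backup-$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example (commented out; writing to /etc/cassandra needs root privileges):
# sudo cp -p /etc/cassandra/cassandra.yaml "/etc/cassandra/cassandra.yaml.backup-$(date +%Y%m%d-%H%M%S)"
```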

Rename Apache Cassandra Cluster

Use a text editor of your choice to open the cassandra.yaml file (we will be using nano):

sudo nano /etc/cassandra/cassandra.yaml

Find the line that starts with cluster_name. The default name is Test Cluster. Changing it is the first edit you want to make when you start working with Cassandra.

If you do not want to make more changes, exit and save the file.
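
The same edit can be made without opening an editor. The sketch below uses sed and assumes the cluster_name line starts at the beginning of the line, as it does in the default file. The helper name set_cluster_name and the name My Cluster are hypothetical. On a node that has already started once, the name stored by the node may also need to match, which this sketch does not handle.

```shell
# Hypothetical helper: set cluster_name in a cassandra.yaml-style file.
# Usage: set_cluster_name FILE NAME
# Assumes NAME contains no "|", "&", backslash, or single-quote characters.
# Uses GNU sed's in-place option (-i), which is available on Ubuntu.
set_cluster_name() {
    sed -i "s|^cluster_name:.*|cluster_name: '$2'|" "$1"
}

# Example (commented out; editing /etc/cassandra needs root privileges):
# sudo sed -i "s|^cluster_name:.*|cluster_name: 'My Cluster'|" /etc/cassandra/cassandra.yaml
```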

Add IP Addresses of Cassandra Nodes

If you are running a cluster, you must also add the IP addresses of the cluster's seed nodes to the cassandra.yaml file. Seed nodes are the contact points other nodes use to find the cluster, so the list does not have to contain every node.

Open the configuration file and, under the seed_provider section, find the seeds entry.

Add the IP addresses to the seeds entry, separated by commas.
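
The seeds value is a single quoted, comma-separated string. As a sketch of how the entry could be updated from a script, the hypothetical set_seeds helper below rewrites the value with sed. It assumes the entry is written as an indented "- seeds:" line, and the addresses in the example are placeholders from a documentation range, not real nodes.

```shell
# Hypothetical helper: replace the value of the seeds entry.
# Usage: set_seeds FILE "ADDR1,ADDR2,..."
# Matches a line of the form "<indent>- seeds: ..." and keeps the indent.
# Comment lines that mention seeds are left alone because they start with "#".
set_seeds() {
    sed -i -E "s|^([[:space:]]*-[[:space:]]*seeds:).*|\1 \"$2\"|" "$1"
}

# Example (commented out; the addresses are placeholders):
# sudo sed -i -E 's|^([[:space:]]*-[[:space:]]*seeds:).*|\1 "192.0.2.10,192.0.2.11"|' /etc/cassandra/cassandra.yaml
```

Restart the Cassandra service after changing the file so the new configuration is read.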

STEP 5: Test Cassandra Command-Line Shell

The Cassandra package comes with its own command-line shell, cqlsh. The shell uses the Cassandra Query Language (CQL) to communicate with the database.

To start a new shell, open the terminal and type:

cqlsh

A shell loads, showing the connection to the default cluster. If you changed the cluster_name parameter, the shell shows the name you defined in the configuration file. By default, cqlsh connects to localhost.
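
cqlsh can also run a single statement and exit, which is handy for a quick test from a script. The following is a minimal sketch that uses the -e option of cqlsh. The run_cql wrapper is hypothetical, and the example query reads the cluster name from the system.local table.

```shell
# Hypothetical helper: run one CQL statement non-interactively.
# Usage: run_cql "STATEMENT;"
# Returns 127 if cqlsh is not installed, otherwise the cqlsh exit status.
run_cql() {
    if ! command -v cqlsh >/dev/null 2>&1; then
        echo "cqlsh not found in PATH" >&2
        return 127
    fi
    cqlsh -e "$1"
}

# Example (commented out): print the cluster name stored on the local node.
# run_cql "SELECT cluster_name FROM system.local;"
```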

Conclusion

By following these simple steps, you should have a working Cassandra installation on your Ubuntu system.

Additionally, we showed you how to edit the most important parameters in the Cassandra configuration file. Remember to back up the configuration file before you edit it. After that, you can start using the Cassandra database.

Learn more about how to use Cassandra in our guide on how to create, drop, alter and truncate Cassandra tables.

Goran Jevtic
Goran combines his leadership skills and passion for research, writing, and technology as a Technical Writing Team Lead at phoenixNAP. Working with multiple departments and on various projects, he has developed an extraordinary understanding of cloud and virtualization technology trends and best practices.