When you are new to Cassandra, you may wonder what a keyspace is. Before we start exploring Cassandra, we need a basic idea of how keyspaces work and how they are created. And whenever you plan to create a keyspace in a production Cassandra cluster, you must know the do's and don'ts. In this blog post, we are going to discuss Cassandra keyspaces in more detail.
A keyspace is similar to a database in an RDBMS. It is an object that holds column families (tables), indexes, and user-defined types. A keyspace also defines the data replication strategy, the replication factor, and the durable writes mode for the data it holds on the nodes. Structurally, a keyspace sits at the top of the hierarchy: it contains column families, which in turn contain rows and columns.
Creating a keyspace:
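To create a keyspace in Cassandra, we specify the replication strategy together with its options (the replication factor for SimpleStrategy, or the data center names and the replica count per data center for NetworkTopologyStrategy) and, optionally, the durable writes setting. Below is the syntax to create a keyspace: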
```cql
CREATE KEYSPACE [IF NOT EXISTS] keyspace_name
  WITH REPLICATION = {
      'class' : 'SimpleStrategy', 'replication_factor' : N
    | 'class' : 'NetworkTopologyStrategy', 'dc1_name' : N [, ...]
  }
  [AND DURABLE_WRITES = true|false] ;
```
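To see how the pieces of this syntax fit together, here is a minimal sketch in Python that assembles a CREATE KEYSPACE statement from its parts. The helper name `build_create_keyspace` and its parameters are hypothetical and only for illustration; it does no validation or escaping of the keyspace name, so treat it as a teaching aid rather than something to run against a production cluster.

```python
def build_create_keyspace(name, replication, durable_writes=True, if_not_exists=False):
    """Return a CREATE KEYSPACE statement as a string (hypothetical helper).

    `replication` is a dict such as
    {'class': 'SimpleStrategy', 'replication_factor': 2} or
    {'class': 'NetworkTopologyStrategy', 'South': 3, 'North': 3}.
    """
    if "class" not in replication:
        raise ValueError("replication map needs a 'class' entry")

    def render(value):
        # Strings are single-quoted in a CQL map literal; integers are written bare.
        return f"'{value}'" if isinstance(value, str) else str(value)

    options = ", ".join(f"'{key}' : {render(val)}" for key, val in replication.items())
    exists_clause = "IF NOT EXISTS " if if_not_exists else ""
    durable = "true" if durable_writes else "false"
    return (
        f"CREATE KEYSPACE {exists_clause}{name} "
        f"WITH REPLICATION = {{ {options} }} "
        f"AND DURABLE_WRITES = {durable};"
    )


print(build_create_keyspace("portal", {"class": "SimpleStrategy", "replication_factor": 2}))
```

The replication map always carries a 'class' entry, and the remaining entries depend on the strategy that the class names.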
Replication strategy:
Replication is the technique Cassandra uses to provide high availability. Cassandra offers two main replication strategies, explained below:
- Simple Strategy
- Network Topology Strategy
If you want to know more about replication strategies, you can read this blog post.
Simple Strategy:
This is the simpler strategy: it treats the entire cluster as a single data center. It takes one parameter, the replication factor, which defines the number of copies (replicas) of each row to be maintained in the cluster.
```cql
CREATE KEYSPACE portal WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };
```
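To make the replication factor concrete, here is a toy sketch in Python of how the simple strategy picks replicas: the first replica goes to the node that owns the row's token, and the remaining replicas go to the next nodes clockwise around the ring. The ring layout, the node names, and the function `simple_strategy_replicas` are assumptions made up for this illustration, and real clusters use much larger token values.

```python
import bisect


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Pick replica nodes for a token on a toy ring.

    `ring` is a list of (token, node) pairs sorted by token. A node with
    token T owns the range (previous token, T].
    """
    tokens = [token for token, _ in ring]
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


# Hypothetical four-node ring with small tokens for readability.
ring = [(0, "node1"), (25, "node2"), (50, "node3"), (75, "node4")]
print(simple_strategy_replicas(ring, key_token=30, replication_factor=2))
# -> ['node3', 'node4']
```

With a replication factor of 2, as in the portal keyspace above, every row lives on two consecutive nodes of the ring.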
Network Topology Strategy:
This strategy is used to specify the data centers and the number of replicas to be placed in each data center.
```cql
CREATE KEYSPACE portal WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'South' : 3, 'North' : 3 };
```
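The per-data-center counts can be pictured with a small Python sketch: walk the ring clockwise from the row's token and, for each data center, keep taking nodes until that data center has the number of replicas requested. This is a deliberate simplification; the real strategy also tries to spread replicas across racks, which is ignored here. The ring, the node names, and the function `network_topology_replicas` are hypothetical.

```python
import bisect


def network_topology_replicas(ring, key_token, replicas_per_dc):
    """Pick replicas per data center on a toy ring (racks are ignored).

    `ring` is a list of (token, node, datacenter) tuples sorted by token.
    `replicas_per_dc` maps a data center name to its replica count.
    """
    tokens = [token for token, _, _ in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    chosen = {dc: [] for dc in replicas_per_dc}
    for step in range(len(ring)):
        _, node, dc = ring[(start + step) % len(ring)]
        if dc not in chosen:
            continue  # this data center holds no replicas for the keyspace
        if node not in chosen[dc] and len(chosen[dc]) < replicas_per_dc[dc]:
            chosen[dc].append(node)
    return chosen


# Hypothetical ring alternating between the 'South' and 'North' data centers.
ring = [
    (0, "s1", "South"), (10, "n1", "North"), (20, "s2", "South"),
    (30, "n2", "North"), (40, "s3", "South"), (50, "n3", "North"),
]
print(network_topology_replicas(ring, key_token=15, replicas_per_dc={"South": 2, "North": 1}))
# -> {'South': ['s2', 's3'], 'North': ['n2']}
```

Each data center is filled independently, which is why the replica counts for 'South' and 'North' can differ.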
Durable writes:
This setting has the biggest impact when creating a keyspace. When Cassandra receives a write request, the data is first written to disk in an append-only structure called the commit log. It is then written to an in-memory structure called the memtable. When the memtable reaches a certain size, its data is flushed to an on-disk structure called an SSTable.
With durable writes enabled, Cassandra writes the data to the commit log. Durable writes are enabled by default, and disabling them is not advisable in a cluster that uses the simple strategy for replication, because a node that goes down before its memtable is flushed loses those writes.
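1) DURABLE_WRITES=false: the write is not written to the commit log and goes only to the memtable.
2) DURABLE_WRITES=true: the write is written to the commit log and then to the memtable.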
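The difference between the two modes shows up when a node crashes. Below is a toy model in Python of the write path described above, with a commit log, a memtable, and SSTables. The class `ToyNode`, its methods, and the tiny memtable limit are assumptions for illustration only; a real node is far more involved.

```python
class ToyNode:
    """Toy model of one node's write path (not real Cassandra code)."""

    def __init__(self, durable_writes=True, memtable_limit=3):
        self.durable_writes = durable_writes
        self.memtable_limit = memtable_limit
        self.commit_log = []  # on disk, append-only
        self.memtable = {}    # in memory, lost on a crash
        self.sstables = []    # on disk, written at flush time

    def write(self, key, value):
        if self.durable_writes:
            self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as an SSTable; the logged entries
        # for that data are no longer needed for recovery.
        self.sstables.append(dict(self.memtable))
        self.memtable.clear()
        self.commit_log.clear()

    def crash_and_restart(self):
        # Memory is wiped; whatever is in the commit log is replayed.
        self.memtable = {}
        for key, value in self.commit_log:
            self.memtable[key] = value

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            if key in table:
                return table[key]
        return None


durable = ToyNode(durable_writes=True)
durable.write("user:1", "alice")
durable.crash_and_restart()
print(durable.read("user:1"))   # -> alice

volatile = ToyNode(durable_writes=False)
volatile.write("user:1", "alice")
volatile.crash_and_restart()
print(volatile.read("user:1"))  # -> None
```

With durable writes disabled, anything still sitting in the memtable at the time of the crash is gone on that node, and it can only be recovered from other replicas.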
That, in short, is how keyspaces in Cassandra work. I hope this gives you an idea of how easy it is to create them. If you have any questions about keyspaces in Cassandra, let me know in the comments section.
The above example has been shown in the wrong order.
durable_writes = true is the scenario when the write is duly written to commit_log and when durable_writes = false, it is not written to commit_log but the diagram above shows vice-versa.
Please correct that.
Thanks. Changed