Cassandra
CQL3
TTL
database
table creation

Create Cassandra table using cql3 with default TTL

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle large amounts of data across numerous commodity servers without a single point of failure. One of its powerful features is Time-To-Live (TTL), which allows automatic expiration of data after a specified period. This article will focus on creating a Cassandra table using CQL3, the Cassandra Query Language, and setting default TTL for data.

Understanding CQL3

CQL3 is the third generation of Cassandra's query language, designed to make database operations more SQL-like. CQL3 offers an easy-to-read syntax that simplifies these operations while providing robust features such as automatic data expiration through TTL.

Time-To-Live (TTL) in Cassandra

TTL is vital for scenarios where data relevance is temporary. For example, session data, caches, or logs might only be useful for a limited time. TTL ensures such data is automatically purged, reducing storage costs and improving performance.

Creating a Table with Default TTL

Creating a Cassandra table with a default TTL involves defining the table structure and specifying the TTL value. Here’s a step-by-step guide:

Step 1: Define Keyspace

First, ensure you have a keyspace. A keyspace is analogous to a database in SQL. To create a keyspace:

sql
CREATE KEYSPACE IF NOT EXISTS my_keyspace
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
  • SimpleStrategy is suitable for single datacenter setups.
  • replication_factor indicates the number of replicas for each piece of data.

Step 2: Create Table with Default TTL

Now, create a table with a default TTL. This is achieved using the CREATE TABLE CQL command, followed by setting the default_time_to_live option.

sql
1CREATE TABLE IF NOT EXISTS my_keyspace.users (
2    user_id UUID PRIMARY KEY,
3    username text,
4    email text,
5    created_time timestamp
6)
7WITH default_time_to_live = 86400;  -- 1 day in seconds
  • default_time_to_live specifies that each row expires after 24 hours (86400 seconds).
  • TTL is counted from the time the row is first inserted.

Step 3: Insert Data

Data inserted into this table will automatically have the defined TTL applied:

sql
INSERT INTO my_keyspace.users (user_id, username, email, created_time) 
VALUES (uuid(), 'Alice', '[email protected]', toTimestamp(now()));

Step 4: Query Data with TTL

You can check the remaining TTL for a specific row using:

sql
SELECT TTL(username) FROM my_keyspace.users WHERE user_id = some_uuid;

If TTL is 0 or null, the data does not expire.

Modifying TTL for Specific Rows

You can override the default TTL by specifying a different TTL during the insert or update operations:

sql
INSERT INTO my_keyspace.users (user_id, username, email, created_time)
VALUES (uuid(), 'Bob', '[email protected]', toTimestamp(now()))
USING TTL 3600;  -- 1 hour

Advantages & Considerations

Advantages

  • Automatic Data Management: Automatically manage and purge outdated data.
  • Reduced Manual Cleanup: Minimizes maintenance by auto-deleting irrelevant data.
  • Space Efficiency: Frees up storage and ensures only relevant data is maintained.

Considerations

  • Node System Time: TTL relies on node system time. Therefore, ensure time synchronization (e.g., using NTP).
  • No Partial Updates: TTL resets on updates, which could unintentionally extend data lifespan.
  • Data Archiving: Important data might be deleted if TTL is incorrect. Always verify TTL settings.

Summary Table: Key Points

Key FeatureDescription
CQL3Query language designed for simplicity and power
TTLTime-To-Live defines data lifespan by default
default_time_to_liveTable-wide TTL setting
TTL OverrideSpecify TTL during insert/update to customize
Data ExpiryAutomatic expiry after TTL duration
Time SynchronizationEssential for accurate TTL operations
Data Management BenefitsReduces storage costs and maintains efficiency

Conclusion

Using default TTLs in Cassandra is an effective strategy for managing data that is only relevant for a limited time. By incorporating a default TTL in your table definitions, you can seamlessly automate data expiration, help conserve resources, and ensure your database remains performant. As with all database configurations, careful planning and testing are recommended to avoid unintended data losses.


Course illustration
Course illustration

All Rights Reserved.