Make & Know Java

Posted on August 08, 2016

NoSQL: Column-Family Model

In this article, I will discuss the Column-Family data model. This data model is used in NoSQL databases; Cassandra, for example, follows it.

What Is Column-Family Data Model?

To understand a Column Family, I will take a small example, map it onto a relational database first, and then gradually move on to the NoSQL model.

Suppose I want to model the relationship between employees and departments. In an RDBMS, we would create two tables: one for Employee and another for Department.

The Employee table has a Dept_id column that stores the department ID, so Dept_id acts as a foreign key.

Table: Employee

Emp_id	Emp_name	Emp_address	Emp_sex	Dept_id
1	Shamik Mitra	1, N D Lane	M	2
2	Ajay Bose	34, CH Street	M	1
3	Ragini Sil	45, CT Road	F	2
4	Aniket Dhar	2, PL Road	M	2

Table: Department

Dept_id	Dept_name	Dept_detail
1	HR	HR Function
2	IT	Development

The problem domain is now mapped onto two tables. Suppose we want to search for

“Details of Employees Under a Department.”

We have to join Employee and Department on Dept_id and fetch the necessary information.

The query would be

SELECT * FROM Employee e, Department d WHERE e.Dept_id = d.Dept_id ORDER BY e.Emp_name;

For a small data set, this returns results very quickly, but as the data grows, performance gradually degrades. With 2 million records, the query can take a considerable amount of time.
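To make the join concrete, here is a minimal sketch that loads the two sample tables into an in-memory SQLite database and runs the query. SQLite and the helper names (build_sample_db, employees_by_department) are assumptions chosen for illustration; the article does not name a particular RDBMS.

```python
import sqlite3


def build_sample_db():
    """Create the Employee and Department tables with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Department ("
        "Dept_id INTEGER PRIMARY KEY, Dept_name TEXT, Dept_detail TEXT)"
    )
    conn.execute(
        "CREATE TABLE Employee ("
        "Emp_id INTEGER PRIMARY KEY, Emp_name TEXT, Emp_address TEXT, "
        "Emp_sex TEXT, Dept_id INTEGER REFERENCES Department(Dept_id))"
    )
    conn.executemany(
        "INSERT INTO Department VALUES (?, ?, ?)",
        [(1, "HR", "HR Function"), (2, "IT", "Development")],
    )
    conn.executemany(
        "INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Shamik Mitra", "1, N D Lane", "M", 2),
            (2, "Ajay Bose", "34, CH Street", "M", 1),
            (3, "Ragini Sil", "45, CT Road", "F", 2),
            (4, "Aniket Dhar", "2, PL Road", "M", 2),
        ],
    )
    return conn


def employees_by_department(conn):
    """Join Employee and Department on Dept_id, ordered by employee name."""
    query = """
        SELECT e.Emp_name, d.Dept_name
        FROM Employee e
        JOIN Department d ON e.Dept_id = d.Dept_id
        ORDER BY e.Emp_name
    """
    return conn.execute(query).fetchall()


rows = employees_by_department(build_sample_db())
for name, dept in rows:
    print(name, "-", dept)
```

With four rows the join is instantaneous; the slowdown discussed here only becomes visible at much larger volumes.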

Why does it take time?

To understand this, we need to know how data is stored in an RDBMS.

In an RDBMS, the columns of a single row are stored sequentially, but the rows themselves are not necessarily stored next to each other; they are distributed over the disk space. In the example above, Shamik Mitra and all of his columns might be stored sequentially at locations 00022 to 00026 on the disk, while the row for Ajay Bose might be stored at 00134 to 00138.

So, to find the employees in the same department, the RDBMS has to hop from place to place on the disk to collect their data, and that takes time.

Apart from this, another big problem in an RDBMS is its predefined schema: anything outside the schema does not fit. For example, if I want to save a hobby for an employee, I need to change the structure of the Employee table to fit this requirement.
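The schema restriction can be seen in a minimal sketch, again assuming SQLite as a stand-in RDBMS. The Hobby column and the column_names helper are hypothetical names used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Emp_id INTEGER PRIMARY KEY, Emp_name TEXT)")
conn.execute("INSERT INTO Employee VALUES (2, 'Ajay Bose')")


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


before = column_names(conn, "Employee")

# Writing to a column that is not part of the schema is rejected.
try:
    conn.execute("UPDATE Employee SET Hobby = 'Tennis' WHERE Emp_id = 2")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The table structure has to be changed before the new attribute can be stored.
conn.execute("ALTER TABLE Employee ADD COLUMN Hobby TEXT")
conn.execute("UPDATE Employee SET Hobby = 'Tennis' WHERE Emp_id = 2")
after = column_names(conn, "Employee")
hobby = conn.execute("SELECT Hobby FROM Employee WHERE Emp_id = 2").fetchone()[0]
```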

NoSQL comes into play to address these problems.

The characteristics of NoSQL are:

1. It should be schema-less.

2. Data should be stored in a distributed manner.

3. Most importantly, it stores data as aggregates; in other words, it stores the whole relationship together.

There are mainly four types of data model in NoSQL databases:

1. Key-Value pair

2. Document based

3. Column Family

4. Graph database

We will talk about Column Family.

In the Column-Family style, data is stored by column, and multiple columns together make a Column Family. At first glance it may look the same as an RDBMS table, but that is not the case.

You can think of a Column Family as being like a table. The main differences are that it is schema-less and that columns are stored sequentially, unlike an RDBMS, where rows are stored sequentially. Because it is schema-less, we can add any column related to that Column Family.

So here the values of the name column, the dept_id column, and the sex column are each stored sequentially, but the columns of a single row are stored in different locations.

According to this definition, all employee names are stored sequentially on the disk and all dept_id values are stored sequentially, but for a single row, the name and the dept_id are not stored next to each other.

One key point should be remembered: each row has one unique key within a Column Family. The same key can be used in different Column Families.

With this unique key, we can identify an employee in the Employee Column Family.
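The difference between the two layouts can be sketched with plain Python lists. This is only an in-memory illustration of the idea described above, not the on-disk format of any particular database; the function and variable names are assumptions made for the example.

```python
# Row-oriented layout: every record keeps all of its fields together.
row_store = [
    (1, "Shamik Mitra", "M", 2),
    (2, "Ajay Bose", "M", 1),
    (3, "Ragini Sil", "F", 2),
    (4, "Aniket Dhar", "M", 2),
]

# Column-oriented layout: the values of each column are kept together,
# and the same position in every list belongs to the same row.
column_store = {
    "emp_id": [1, 2, 3, 4],
    "name": ["Shamik Mitra", "Ajay Bose", "Ragini Sil", "Aniket Dhar"],
    "sex": ["M", "M", "F", "M"],
    "dept_id": [2, 1, 2, 2],
}


def names_in_department_rowwise(rows, dept_id):
    # Every full record has to be visited to inspect its dept_id field.
    return [row[1] for row in rows if row[3] == dept_id]


def names_in_department_columnwise(columns, dept_id):
    # Only the dept_id column is scanned; names are then read by position.
    positions = [i for i, d in enumerate(columns["dept_id"]) if d == dept_id]
    return [columns["name"][i] for i in positions]


rowwise = names_in_department_rowwise(row_store, 2)
columnwise = names_in_department_columnwise(column_store, 2)
```

Both functions return the same employees; the point of the sketch is which data each one has to touch to answer the question.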

The Column-Family model has three main elements:

1. Column Family: A single structure that groups Columns and Super Columns. Think of it as a table in an RDBMS.

2. Column: An ordered list of elements, that is, a tuple with a defined name and value.

3. Key: The unique identifier of a record. Different keys can have different numbers of columns, so the data can grow in an irregular way because the model is schema-less.

Two related terms are also used:

Keyspace: Defines the outermost level of organization, typically named after the application. Think of it as a database schema in an RDBMS.
Super Column: Stores a mapping between the keys of different Column Families.
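The elements listed above can be modelled, very roughly, as nested maps: a keyspace holds column families, a column family maps a row key to its columns, and each column is a name and value pair. The Keyspace and ColumnFamily classes below are hypothetical names for this sketch and are not the API of Cassandra or any other database.

```python
from collections import defaultdict


class ColumnFamily:
    """Maps a unique row key to that row's columns (name -> value)."""

    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(dict)

    def put(self, key, column, value):
        # No schema: any column name can be added to any row.
        self.rows[key][column] = value

    def get(self, key):
        # Return a copy so callers cannot change the stored row by accident.
        return dict(self.rows.get(key, {}))


class Keyspace:
    """Outermost container, comparable to a database schema in an RDBMS."""

    def __init__(self, name):
        self.name = name
        self.families = {}

    def column_family(self, name):
        if name not in self.families:
            self.families[name] = ColumnFamily(name)
        return self.families[name]


company = Keyspace("company")  # the keyspace name is an assumption
employee = company.column_family("Employee")
employee.put("Shamik Mitra", "Name", "Shamik Mitra")
employee.put("Shamik Mitra", "Address", "Nivedita Lane")
employee.put("Ajay Bose", "Name", "Ajay Bose")
employee.put("Ajay Bose", "Address", "34 CT Road")
employee.put("Ajay Bose", "Hobby", "Tennis")  # only this row has a Hobby column
```

Note that the two rows end up with different numbers of columns, which is the schema-less behaviour described under Key.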

Let’s take a look at how we can map the Employee and Department relationship in Column Families.

Data Structure:

Employee Column Family

Shamik Mitra
	Name : Shamik Mitra
	Address : Nivedita Lane

Ajay Bose
	Name : Ajay Bose
	Address : 34 CT Road
	Hobby : Tennis

Department Column Family

HR
	Name : HR
	Details : HR Function
IT
	Name : IT
	Details : Development

Mapping of Employee and Department (Super Column)

HR
	Ajay Bose
	KEY…N
IT
	Shamik Mitra

Now think about the query again: search for the employees under a department.

Because the department keys (HR, IT) are stored sequentially, finding a department within a large data set is not a problem; no hopping from place to place is required. The second step is to fetch the employees under that department.

Here we need the help of the Super Column. A NoSQL database has no concept of a foreign key, and we cannot search it by an arbitrary attribute; we can only look records up by key. So, to fetch the employees, we first need to find their keys.

The Super Column gives us the employees' keys, and with those keys we fetch the employee details.
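The two-step lookup can be sketched with plain dictionaries, using the sample data from the data structure shown earlier. The names employee_cf, department_employees, and employees_under are assumptions for this illustration only.

```python
# Employee column family: row key -> columns.
employee_cf = {
    "Shamik Mitra": {"Name": "Shamik Mitra", "Address": "Nivedita Lane"},
    "Ajay Bose": {"Name": "Ajay Bose", "Address": "34 CT Road", "Hobby": "Tennis"},
}

# Mapping (super column): department key -> keys of its employees.
department_employees = {
    "HR": ["Ajay Bose"],
    "IT": ["Shamik Mitra"],
}


def employees_under(department):
    # Step 1: look up the employee keys stored under the department key.
    keys = department_employees.get(department, [])
    # Step 2: fetch each employee's details by key; no join is involved.
    return [employee_cf[key] for key in keys if key in employee_cf]


hr_staff = employees_under("HR")
it_staff = employees_under("IT")
```

Every access here is a lookup by key, which mirrors the restriction that the data cannot be searched by an arbitrary attribute.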

