Standard for NoSQL Classification
Reading and writing are the
basic building blocks of a platform, whether it is a storage system or a
service development system such as a PC. Think of a football match: there
must be two teams, and when one is on offense, the other must be on
defense. Let's take a few examples.
The theoretical basis of
all stored-program computers in the John von Neumann architecture is a
mathematical model called the Turing machine. The model consists of the following
components:
· A
tape that is infinitely long and has an infinite number of cells, each of which
contains a symbol.
· A head that can read a symbol from a cell, write a symbol to it,
and traverse the cells to the left or right.
· A state transition table that maps
o (current state, symbol read in the current cell)
→ (next state, symbol to write in the cell, move left/right)
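The model above can be sketched directly in code. This is a minimal, illustrative interpreter, not any particular formalization; the blank symbol "_" and the example machine are assumptions for the demonstration.

```python
# A minimal sketch of the Turing machine model described above.
# The transition table maps (state, symbol) -> (next state, symbol to write, move).
def run_turing_machine(transitions, tape, state="start", head=0, max_steps=1000):
    cells = dict(enumerate(tape))          # sparse tape; missing cells hold the blank "_"
    for _ in range(max_steps):
        key = (state, cells.get(head, "_"))
        if key not in transitions:         # no matching rule: the machine halts
            break
        state, symbol, move = transitions[key]
        cells[head] = symbol               # write the symbol into the current cell
        head += 1 if move == "R" else -1   # move the head left or right
    return "".join(cells[i] for i in sorted(cells))

# Example machine: flip every bit, halting at the first blank cell.
flip = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
}
print(run_turing_machine(flip, "1011"))  # -> 0100
```

Everything the interpreter does besides looking up the transition table is reading a cell and writing a cell, which is exactly the point made next.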
Suppose a state-transition-table program is a service. Then what the
platform does is reading and writing.
Thus, read/write is the fundamental feature of any system of
any size, from the microarchitecture of a CPU to cloud services.
· Intel
Core 2 microarchitecture: In
its hardware configuration, everything except the pipelining, out-of-order
execution, and ALU exists to support reading and writing operations.
· MS Windows Azure: A
storage layer that reads and writes is also the core of a cloud system such as
Azure.
A program repeats the
process of reading, calculating, and writing: it reads stored data, processes it,
and writes the intermediate results. The same rule applies to Internet
services. The reason for this lengthy explanation of the importance of
reading and writing is to remind The Platform Magazine readers
- who already know such basics - of the core features of a
storage system such as NoSQL.
NoSQL System Standards
The following is a summary
of standards that will be used to classify the NoSQL systems, which are the
topic of our discussion.
Data model and query model
Although a Turing machine
performs read/write on a single cell of a tape, a practical system requires much
more diverse and complex read/write operations. I will
define the storage model that is the target of read/write operations as
the data model, and the commands for such operations as
the query model.
A query model can
be regarded as an API, but to emphasize that it is a reading/writing operation
for a storage model, I will use the term query, which is often used in
RDBMS. A data model and a query model are
closely related to each other, like the two sides of a coin. In a sense,
the relationship between a storage model and a query is similar to that of an
instance and a method that is defined in a class in an object-oriented
language.
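The instance/method analogy can be made concrete with a sketch. The `KeyValueStore` class below is hypothetical; it only illustrates how the stored data plays the role of the instance while the query operations play the role of its methods.

```python
# Hypothetical sketch of the data-model / query-model analogy:
# the stored data is the "instance"; the query operations are its "methods".
class KeyValueStore:
    """Data model: a mapping of keys to opaque values."""
    def __init__(self):
        self._data = {}

    # Query model: the read/write operations defined over the data model.
    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

store = KeyValueStore()
store.put("user:1", "alice")
print(store.get("user:1"))  # -> alice
```

Just as a method only makes sense for the class it belongs to, a query only makes sense for the data model it operates on - the two are defined together.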
Data distribution
Sometimes, the processing
power of a single computer is not sufficient to process a query requested by a
service. This happens because the resources that a computer can use to process
the query, such as the CPU, memory, disk, network capacity, and throughput, are
limited. There is a solution to this problem, called the scale-up approach,
but it is costly and, due to the nature of its architecture, often ineffective
when the scale of data to be processed is huge. For this reason, the
alternative scale-out approach is used to distribute the resources required to
process queries across two or more computers.
The basic principle of
distribution in a large-scale service is that the more throughput a query
requires, the more the data is partitioned, within the required level of
service. It would be great if there were a method to scale out to
infinity; however, since it is not realistic to assume that the scale-out
approach will always satisfy every scale-out requirement, we
qualify it with the condition, "within the required level of service."
Replication to increase availability
Data stored in a system may
not be readable or writable when a failure occurs in the programs or resources
that are necessary to process queries. To ensure high availability of a
service, the data and the resources for query processing must be distributed to
and stored in two or more locations. This is what we call 'replication.'
Replication models: there are three replication models, described below:
· Transactional replication: A model that replicates at the transaction level by
using the Two-Phase Commit (2PC) protocol (CUBRID RDBMS uses
transactional replication in its HA feature).
· State machine replication: A model that replicates the events that occur on the
source system to the target system by using atomic broadcast (which delivers
totally ordered messages). In this model, the target system is treated as a finite
state machine.
· Virtual Synchrony:
A model that is often used in group communication and the like. It ensures that
events that occur concurrently in a process group are delivered to all members in
the same order. (This is used for in-memory data.)
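The state machine replication model rests on one property worth making explicit: if every replica starts from the same state and applies the same totally ordered event log, all replicas converge. The sketch below assumes illustrative event names ("set", "delete"); it is not modeled on any specific system.

```python
# A sketch of state machine replication: replicas that replay an identical,
# totally ordered event log against the same initial state end up identical.
def apply_event(state, event):
    op, key, value = event
    if op == "set":
        state[key] = value
    elif op == "delete":
        state.pop(key, None)
    return state

def replay(log):
    state = {}
    for event in log:        # atomic broadcast delivers this log identically everywhere
        apply_event(state, event)
    return state

log = [("set", "x", 1), ("set", "y", 2), ("delete", "x", None)]
replica_a = replay(log)
replica_b = replay(log)
print(replica_a == replica_b)  # -> True
```

The hard part in practice is the atomic broadcast itself - guaranteeing that every replica receives the same events in the same total order - which the sketch simply assumes.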
Data consistency model for operations
RDBMS operates on the concept
of a transaction, which can be summed up by the ACID (Atomicity, Consistency,
Isolation, Durability) properties. A transaction logically groups one or more
queries; when such a transaction is applied to a DBMS, the system preserves
these properties to maintain data consistency. The NoSQL
system is no exception. We will explore how it protects data and data
consistency under user operations.
RDBMS vs. NoSQL
Before studying the
features of a variety of NoSQL systems, it would be useful to learn the
background history of the NoSQL storage system.
NoSQL as antithesis
Here is a little mnemonic
device: if NoSQL means "No SQL," it must be an anti-RDBMS system. The logic
explaining why NoSQL is the opposite of RDBMS is as follows:
· RDBMS
defines the SQL language as a data manipulation method for the relational
data model (relations and tables), and supports transactions with the ACID
properties.
· The scale-out approach, which distributes data before
processing it, is in high demand due to the high data-throughput demands of Internet-scale
services.
· The scale-out approach of existing RDBMSs must
maintain the operational integrity of the relational model, which is the core of
the RDBMS, and of transaction operations.
It is difficult to
scale out while maintaining this integrity. The same problem arises when
distributing or replicating data. So the ACID transaction
properties ensured by the DBMS, or the replication integrity model, are made less
strict before scaling out.
Scale-out of RDBMS
Scaling out is difficult in an RDBMS. If scaling out to several
thousand RDBMSs were easy, prominent database vendors such as Oracle would
have released a product or two for it a long time ago.
Now let us assume that the
tables of an RDBMS are distributed across several computers, and each piece of data is
replicated before it is stored, for high availability. First, executing a
distributed transaction while satisfying ACID is difficult at scale.
· To
satisfy the atomicity property of ACID, a distributed
transaction protocol such as the 2PC protocol must be used across all
systems that are involved in a given transaction.
· To match the isolation level among the ACID
properties, data must generally be locked. The unit of locking can
be a record, a table, or an index.
· Therefore, to satisfy the atomicity and isolation properties
in a distributed environment, all related locks must be held on each system
while the distributed transaction protocol is being processed; the higher the
service load of the system, the heavier the lock contention becomes. This is
what makes scaling out difficult.
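The structure of 2PC described above can be sketched briefly. Participant votes are simulated here; a real implementation would also acquire locks and write prepare/commit log records during phase 1, which is exactly the overhead that grows with load.

```python
# A minimal sketch of the two-phase commit (2PC) protocol.
# Each participant is a dict with a simulated vote; real participants would
# hold locks and persist their vote durably during the prepare phase.
def two_phase_commit(participants):
    # Phase 1: prepare - every participant must vote "yes" to commit.
    votes = [p["can_commit"] for p in participants]
    decision = "commit" if all(votes) else "abort"  # any "no" aborts the whole transaction
    # Phase 2: broadcast the decision to every participant.
    for p in participants:
        p["state"] = decision
    return decision

nodes = [{"can_commit": True, "state": None}, {"can_commit": True, "state": None}]
print(two_phase_commit(nodes))   # -> commit
nodes[1]["can_commit"] = False
print(two_phase_commit(nodes))   # -> abort
```

Note that between the two phases every participant sits blocked, holding its locks - the source of the lock contention the text describes.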
Another problem is that
there are limits to scaling out by replicating and distributing data.
· The
transactional replication method using 2PC has the
problem that a transaction fails and becomes unavailable when any of the
systems involved in the replication fails. In addition, performance
degrades as more systems are involved in the replication.
· As an alternative, it is possible to pass the Write-Ahead
Logging (WAL) data of a DBMS to the replication
target and have it apply the data. If we call the system where the
changes originate the master (or primary), and the system to
which the changes are applied the slave (or backup), they are
configured either as master-slave or multi-master.
· Master-slave configuration: This is the most commonly
used replication method. The speed of processing is inversely proportional to
the number of systems involved in the replication.
· Multi-master configuration: It is difficult to resolve
collisions between data writes, or to prevent them from happening, when
there are several masters. Jim Gray (who received the
Turing Award in 1998 for his contributions to database and
transaction processing) studied this issue in The Dangers of Replication and a
Solution.
Sharding by developers
Generally speaking, it is
extremely difficult to scale out while satisfying the ACID properties in the
DBMS data model. For this reason, to scale out on top of a DBMS, one must simplify
the data model itself, partition the data into N pieces, and then execute
each query within a single piece of data.
The unit of partitioned
data is called a 'shard'. The N shards are distributed and served across M
DBMSs. The DBMS does not manage shards; that is the responsibility of the
service developer.
The sharding method puts
the burden on developers and has the following difficulties:
· First,
a shard must be defined.
· Shard mapping:
o The basic storage unit of a DBMS is the table. Because a
table can contain one or more shards, there is a need to know which shard is
mapped to which database instance. The locations of shard tables must be
known to the application.
· Shard
distribution/redistribution:
o Each shard differs in its throughput
requirements and data size. As a result, a developer must add new database
instances or remove existing ones, and redistribute the shards manually. This is
a painstaking and labor-intensive process.
o The mapping information modified by the
distribution/redistribution process must be applied to the application.
o Management tasks, such as configuring replication, are
necessary when modifying data.
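The shard-mapping difficulty above boils down to the application maintaining its own routing table. The sketch below uses a stable CRC32 hash and hypothetical shard/instance names; it is an illustration of developer-managed sharding, not any particular framework.

```python
import zlib

# A sketch of developer-managed sharding: the application, not the DBMS,
# keeps the shard map and routes each key to the right database instance.
NUM_SHARDS = 8
# shard -> DB instance; instance names ("db0", "db1") are hypothetical.
shard_map = {s: f"db{s % 2}" for s in range(NUM_SHARDS)}

def route(key):
    shard = zlib.crc32(key.encode()) % NUM_SHARDS  # 1. find the key's shard
    return shard, shard_map[shard]                  # 2. look up its instance

shard, instance = route("user:42")
print(shard in shard_map and instance in ("db0", "db1"))  # -> True
```

Every redistribution problem listed above shows up here concretely: moving a shard means editing `shard_map` on every application server, and changing `NUM_SHARDS` remaps almost every key - which is the motivation for consistent hashing, discussed later.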
Approach of NoSQL
The Internet-scale data storage
system aims to scale out through data distribution, and replicates
data to ensure high availability. We have already examined why
scaling out is difficult while supporting ACID
transactions and the relational model of an RDBMS. Now, we will
look into the territory of NoSQL to see how it handles scaling out.
Simplify data model for sharding
The data closure of a query
operation becomes the default unit of distribution, and the data model is
simplified so as to create an appropriate level of shard, that is, a collection
of a certain number of data closures. The default unit of distribution is the key.
· The
simplest model reads and writes whole data items, treating each as an
immutable object. Among the key-value storages, Dynamo and Membase are
examples of this model, both of which support blob-type data models.
· A chunk, the
component unit of a file in the Google File System, has a globally unique
64-bit identifier. Byte-range operations (read/write/append) can
be performed on chunks. While the storage abstraction is presented to users as a
file, operations on a file are actually performed on chunks at the API
level.
· Some systems provide slightly more complex data models
than the blob-type immutable object.
o Key-Value DB:
A database that models data as abstract data structures (ADS), such as lists and
sets, and provides queries describing what operations can
be performed on a specific structure. Redis and Tokyo
Cabinet are the most popular.
o Document-oriented: A database that stores arbitrary, irregular objects by
using an object-representation notation, and provides queries based on the
properties of an object, such as CouchDB or MongoDB.
· A
system that supports a multi-dimensional map data model, in a row-column
family-column or row-column-timestamp structure, such as Bigtable or Cassandra.
The default queries support
reading and writing at this unit of distribution. However, a system may also support
a separate query method, such as map-reduce, in order to perform
aggregation operations across multiple distribution units. In addition,
document-oriented storage systems such as MongoDB allow users to create an index
on a specific property of a document, other than the document ID, thereby providing
additional fast access paths.
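The multi-dimensional map model mentioned above maps (row, column family, column, timestamp) to a value. The sketch below is a toy in-memory version of that shape - the row key and column names are illustrative, and real systems add sorting, sparsity, and persistence on top.

```python
import time

# A sketch of the Bigtable-style multi-dimensional map:
# (row, column family, column, timestamp) -> value.
table = {}

def put(row, family, column, value, ts=None):
    # nested dicts stand in for the sorted, sparse on-disk structure
    cell = table.setdefault(row, {}).setdefault(family, {}).setdefault(column, {})
    cell[ts if ts is not None else time.time()] = value

def get_latest(row, family, column):
    versions = table[row][family][column]
    return versions[max(versions)]          # the newest timestamp wins

put("com.example", "contents", "html", "<html>v1</html>", ts=1)
put("com.example", "contents", "html", "<html>v2</html>", ts=2)
print(get_latest("com.example", "contents", "html"))  # -> <html>v2</html>
```

The timestamp dimension is what lets such systems keep multiple versions of a cell and resolve them at read time.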
Distribution
· Hash-based:
o The key has no intrinsic meaning in this method. It distributes
data uniformly, regardless of throughput demand or data size.
o To minimize the mapping and redistribution problems
resulting from the addition or deletion of nodes, it often distributes data
based on consistent hashing.
· Index-based:
o It supports range queries by maintaining a key-based,
order-preserving index.
· Metadata
server-based:
o A method that manages the location information
of each distribution unit on a separate metadata server.
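Consistent hashing, named above as the usual fix for the remapping problem, can be sketched briefly. Node names and the MD5 ring below are assumptions for illustration; real systems typically add virtual nodes for better balance.

```python
import bisect
import hashlib

# A sketch of consistent hashing: nodes and keys are hashed onto a ring, and
# each key belongs to the first node clockwise from its position. Adding or
# removing a node remaps only the keys in that node's neighboring arc.
def h(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)   # node positions on the ring

    def lookup(self, key):
        hashes = [p[0] for p in self.points]
        i = bisect.bisect(hashes, h(key)) % len(self.points)  # wrap around the ring
        return self.points[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.lookup("user:42"))  # one of node-a / node-b / node-c
```

Contrast this with plain `hash(key) % N`, where changing N remaps nearly every key in the system.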
Making the ACID properties of queries less strict
For the consistency of
query operations, NoSQL uses less strict properties than the ACID
properties of an RDBMS. This relaxation of query properties is especially
noticeable once replication is taken into account. While there are systems
that maintain the ACID properties, like CouchDB, in many cases Consistency
or Isolation is relaxed for high availability.
We will now examine in detail how
some of these systems relax those properties.
· In GFS,
multiple clients may write to, or execute record-append operations on, a file
simultaneously. Although concurrent writes by multiple clients may leave the
same value in all replicas, the result of each client's write is "undefined"
(in the paper's terminology). The record-append operation guarantees that the
record is appended atomically at least once, in a consistent file region of
each replica.
· Dynamo ensures
view consistency for replicas and reads using a sloppy-quorum method, as
required by services that must be "always writeable." The quorum
protocol determines, among the N systems involved in the replication, the
minimum number of systems that must participate in each R (read) and W (write)
operation, in order to prevent inconsistency of information
even when the network is partitioned (R + W > N, W > N/2).
To cope with temporary system failures, Dynamo's hinted-handoff method (sloppy
quorum) allows systems not directly responsible for a key to handle queries for
it. It also has each system maintain data versions to handle
possible data inconsistency, and uses the vector-clock technique to determine
the causal order of data modifications among the systems involved in the
replication, so that each of them can automatically recognize
which data is newest (reconciled); when this cannot be determined
automatically, the conflicting versions are transmitted to the client, which
resolves them.
· Cassandra adjusts
the level of consistency by allowing users to determine how many reads and
writes must succeed among the N nodes in a replication relationship. It
provides specifiable options for writes (Zero, Any, One, Quorum, and All)
and for reads (One, Quorum, and All). It also provides consistency (view
consistency) on reads by read-repairing through timestamp-based
ordering, similar to what Dynamo does.
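The quorum condition cited above (R + W > N) can be checked by brute force: when it holds, every possible read set must overlap every possible write set, so a read always contacts at least one replica holding the latest write. This is a verification sketch, not part of any real system.

```python
from itertools import combinations

# Brute-force check of the quorum overlap property: with N replicas, does every
# R-replica read set intersect every W-replica write set?
def quorums_always_overlap(n, r, w):
    replicas = set(range(n))
    return all(set(rs) & set(ws)
               for rs in combinations(replicas, r)
               for ws in combinations(replicas, w))

print(quorums_always_overlap(3, 2, 2))  # R + W > N  -> True
print(quorums_always_overlap(3, 1, 2))  # R + W = N  -> False
```

With N = 3, R = 1, W = 2, the read set {0} and write set {1, 2} are disjoint, so a stale read is possible - exactly why the inequality must be strict.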
We will now classify the
NoSQL systems.
NoSQL Classification
The following is a
classification of commonly used NoSQL storage systems:
System    | Data Model            | Distribution Unit | Distribution Method  | Replication
GFS       | GFS                   | chunk             | metadata server      | master-slave (a)
Bigtable  | multi-dimensional map | row               | index (B+ tree like) | depends on GFS
Cassandra | multi-dimensional map | row               | hash/index (b)       | optional
MongoDB   | document (BSON)       | document          | index (c)            | master-slave
CouchDB   | document (JSON)       | document          | index (d)            | master-master
Dynamo    | blob                  | blob              | hash                 | sloppy quorum

a. A method in which the master node, unique in the whole system, lends write
authority over a chunk to one of the chunk servers (replicas) involved in its
replication. The replica that borrows this authority from the master is called
the primary. To reduce the load on the master, the primary keeps the write
authority as long as it is alive (piggybacking on heartbeat messages).
b. Consistent hashing by default; however, users can choose which Partitioner
to use through configuration (random, order-preserving, and collating
order-preserving).
c. Able to create user-defined indexes.
d. Indexes by document ID and by sequence number; the sequence number increases
with each update.
What Kind of Storage System Do I Need?
Now we will explore what
needs to be considered when answering a question like "What kind of
storage system should I use for the service I am developing?"
RDBMS is still the standard
This must be considered
before all else. RDBMS is even more important when read/write operations
require ACID-style consistency. In most cases, an RDBMS plus
additional storage systems will solve the problem of distributing the service
load.
· Read
distribution through replication.
· RDBMS scale-out technology in the OLTP domain, such as
MySQL Cluster, can also be considered.
· RDBMS + cache + storage system:
o If the performance of the RDBMS is insufficient, you
can use a caching system such as Arcus for time-consuming
queries or frequently used items. Of course, you must carefully select a
consistency model appropriate for the data being cached.
o If large data storage is required, you can also use a
separate storage system like OwFS.
· Finally,
there is the sharding technique implemented by developers - that is, if you are
willing to deal with replication, troubleshooting, and restoration, as
well as the distributing and redistributing of shards.
NoSQL
Do you find it difficult
to implement your service on an RDBMS platform? Then use NoSQL. NoSQL systems
solve the difficulties associated with replication, troubleshooting,
restoration, and shard distribution/redistribution, regardless of the type of
service. NoSQL can be an ideal choice if the data model and consistency
requirements of the service are supported by the NoSQL system.
What exactly is NoSQL?
A
NoSQL database provides a mechanism for storage and retrieval of data that uses
looser consistency models than traditional relational databases. Motivations
for this approach include simplicity of design, horizontal scaling and finer
control over availability. NoSQL databases are often highly optimized key–value
stores intended for simple retrieval and appending operations, with the goal
being significant performance benefits in terms of latency and throughput.
NoSQL databases are finding significant and growing industry use in big data
NoSQL databases are finding significant and growing industry use in big data
and real-time web applications. NoSQL systems are also referred to as "Not
only SQL" to emphasize that they do in fact allow SQL-like query languages
to be used. (Source: Wikipedia)
Why are developers considering NoSQL?
More flexible data model
The
relational model takes data and separates it into many interrelated tables that
contain rows and columns. Tables reference each other through foreign keys that
are stored in columns as well. When looking up data, the desired
information needs to be collected from many tables (often hundreds in today’s
enterprise applications) and combined before it can be provided to the
application.
When choosing a
data management system, there are three primary concerns you must balance:
consistency, availability, and partition tolerance.
Consistency
means that each client always has the same view of the data.
Availability
means that all clients can always read and write.
Partition
tolerance means that the system works well across physical network partitions.
Scalability and performance advantages
To
deal with the increase in concurrent users (Big Users) and the amount of data
(Big Data), applications and their underlying databases need to scale using one
of two choices: scale up or scale out. Scaling up implies a centralized
approach that relies on bigger and bigger servers; scaling out implies a
distributed approach that adds capacity by spreading data and load across more,
smaller servers.
List of NoSQL Databases
Wide Column Store / Column Families
·
Hadoop
/ HBase - API:
Java / any writer, Protocol: any write call, Query Method: MapReduce Java / any
exec, Replication: HDFS replication, Written in: Java.
·
Cassandra - a massively scalable,
partitioned row store with a masterless architecture, linear scale performance, no
single point of failure, and read/write support across multiple data centers &
cloud availability zones. API / Query Method: CQL and Thrift, Replication: peer-to-peer,
Written in: Java, Concurrency: tunable consistency, Misc: built-in data
compression, MapReduce support, primary/secondary indexes, security
features.
·
Hypertable - API:
Thrift (Java, PHP, Perl, Python, Ruby, etc.), Protocol: Thrift, Query Method:
HQL, native Thrift API, Replication: HDFS replication, Concurrency: MVCC,
Consistency Model: fully consistent. Misc: high-performance C++ implementation
of Google's Bigtable.
Document Store
·
MongoDB - API:
BSON, Protocol: C, Query Method: dynamic object-based language & MapReduce,
Replication: Master-Slave & Auto-Sharding, Written in: C++, Concurrency:
Update in Place.
·
Elasticsearch - API:
REST and many languages, Protocol: REST, Query Method: via JSON, Replication +
Sharding: automatic and configurable, Written in: Java.
·
Couchbase Server - API:
Memcached API + protocol (binary and ASCII), most languages, Protocol: Memcached
REST interface for cluster configuration + management, Written in: C/C++ + Erlang
(clustering), Replication: peer-to-peer, fully consistent.
Advantages and Disadvantages
Let's look into the
advantages and weaknesses of NoSQL by data model.
· Key-Value
(blob)
· Simple and fast.
· Often supports only atomic read/write at the key level. In
this case, there is no way to serialize operations on the values of several
keys as they are processed.
· Get/put is very fast when memory serves as the
storage (ideal when the data fits in memory).
· While an operation on a single key is fast, an
operation involving multiple keys may be slow because of repeated network
round trips. (There is a huge difference in speed between selecting
100,000 rows from an RDBMS table and reading a key-value pair 100,000
times; the latter is significantly slower. This difference can be perceived
even when there are only 100, or even 10, items to be processed.)
· Key-Value (structure)
· A model that provides ADTs such as lists or sets, and can
hold multiple values for a single key. Unlike the key-blob model, in which a
single value is assigned to a single key, this allows users to store
multiple pieces of data under a single key.
· Processing takes a little longer than a model that
performs a simple get/set. (However, the difference is barely noticeable;
the performance differences inherent to a data model can be overcome by
effective implementation of the system.)
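The structured key-value model above can be sketched with a list per key, loosely mimicking Redis's RPUSH/LRANGE commands. The in-memory dict here stands in for the actual server, and the key name is illustrative.

```python
# A sketch of the structured key-value model: one key holds an abstract data
# structure (here a list), in the style of Redis's RPUSH / LRANGE commands.
store = {}

def rpush(key, *values):                 # append values to the list stored at key
    store.setdefault(key, []).extend(values)
    return len(store[key])

def lrange(key, start, stop):            # read a slice (inclusive stop, Redis-style)
    return store.get(key, [])[start:stop + 1]

rpush("timeline:1", "post-a", "post-b", "post-c")
print(lrange("timeline:1", 0, 1))  # -> ['post-a', 'post-b']
```

Contrast this with the blob model, where appending one post would require reading the whole value, modifying it, and writing it back.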
· Document-oriented
· A data model that can add arbitrary properties without a
schema.
· As the name suggests, it is an optimal structure for
storing document data such as JSON or XML.
· It often maintains an order by document ID or by the value
of a property. (This enables efficient operations over a range of the
corresponding key values, and thus supports range queries.)
· The processing overhead of this model may be larger than
that of a key-value model or a key-structure model, as it must parse the data
and compute on it in memory when processing a query. (Performance will
degrade when handling large documents.)
· Multi-dimensional map
· In Bigtable, data is mapped by row-column-timestamp.
The data itself is binary.
· In Cassandra, data is mapped in the form row-column
family-column. The data itself is binary.
· Both models allow data to be modeled by structuring
the grouping of, and access methods for, data (columns). The detailed process of
modeling is beyond the scope of this document. For more information on
modeling, refer to the Cassandra homepage.
In Conclusion
If it is the mission of a
scholar to find immovable truth in the uncertain, it is the mission of a
developer to find a way to change what seems to be unchangeable.