Cassandra
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
Apache Cassandra is an open source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance.
https://docs.datastax.com/en/cql/3.3/cql/ddl/dataModelingApproach.html
Data in Cassandra is often arranged as one query per table, and the same data is repeated in many tables, a process known as denormalization. Because each query is served by a single table, queries can run faster.
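As a rough illustration of the one-query-per-table idea, the sketch below models two denormalized tables as plain Python dictionaries, with no Cassandra involved. The names users_by_id, users_by_lname and insert_user are hypothetical and exist only for this example. Every write is repeated in both structures so that each lookup reads exactly one of them.

```python
from collections import defaultdict

# Two hypothetical "tables", each shaped for exactly one query.
users_by_id = {}                     # query: fetch a user by user_id
users_by_lname = defaultdict(list)   # query: list users with a given last name


def insert_user(user_id, fname, lname):
    """Write the same row to both tables (denormalization)."""
    row = {'user_id': user_id, 'fname': fname, 'lname': lname}
    users_by_id[user_id] = row
    users_by_lname[lname].append(row)


insert_user(1745, 'john', 'smith')
insert_user(1744, 'john', 'doe')
insert_user(1746, 'john', 'smith')

# Each query reads a single structure, with no join and no full scan.
print(users_by_id[1744]['lname'])                              # doe
print(sorted(r['user_id'] for r in users_by_lname['smith']))   # [1745, 1746]
```

The cost of this layout is extra storage and extra writes; the benefit is that each read is a single lookup.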
Slackbuild
cd /tmp
wget http://slackbuilds.org/slackbuilds/14.1/system/apache-cassandra.tar.gz
tar xvzf apache-cassandra.tar.gz
cd apache-cassandra
wget http://archive.apache.org/dist/cassandra/2.0.7/apache-cassandra-2.0.7-bin.tar.gz
./apache-cassandra.SlackBuild
installpkg /tmp/apache-cassandra-2.0.7-noarch-1_SBo.tgz
Node up
cqlsh
http://wiki.apache.org/cassandra/GettingStarted
bin/cqlsh
CREATE KEYSPACE mykeyspace WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
describe keyspaces;
USE mykeyspace;
CREATE TABLE users ( user_id int PRIMARY KEY, fname text, lname text );
INSERT INTO users (user_id, fname, lname) VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id, fname, lname) VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id, fname, lname) VALUES (1746, 'john', 'smith');
describe tables;
SELECT * FROM users;
desc table users;
CREATE INDEX ON users (lname);
desc table users;
SELECT * FROM users WHERE lname = 'smith';
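The same statements can also be sent from Python. The sketch below assumes the cassandra-driver package and a node listening on 127.0.0.1. The helpers run_session and main are hypothetical names introduced here. The describe/desc commands are left out because they are cqlsh shell commands rather than CQL statements.

```python
CREATE_KEYSPACE = (
    "CREATE KEYSPACE mykeyspace WITH REPLICATION = "
    "{ 'class' : 'SimpleStrategy', 'replication_factor' : 1 }"
)

# Statements run inside the keyspace, in the same order as the cqlsh session.
STATEMENTS = [
    "CREATE TABLE users ( user_id int PRIMARY KEY, fname text, lname text )",
    "INSERT INTO users (user_id, fname, lname) VALUES (1745, 'john', 'smith')",
    "INSERT INTO users (user_id, fname, lname) VALUES (1744, 'john', 'doe')",
    "INSERT INTO users (user_id, fname, lname) VALUES (1746, 'john', 'smith')",
    "CREATE INDEX ON users (lname)",
]


def run_session(session):
    """Replay the cqlsh session and return the rows of the final query."""
    session.execute(CREATE_KEYSPACE)
    session.set_keyspace('mykeyspace')      # equivalent of USE mykeyspace
    for statement in STATEMENTS:
        session.execute(statement)
    return list(session.execute("SELECT * FROM users WHERE lname = 'smith'"))


def main():
    # Imported here so the module can be loaded without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    try:
        session = cluster.connect()
        for row in run_session(session):
            print(row.user_id, row.fname, row.lname)
    finally:
        cluster.shutdown()
```

Calling main() a second time against the same node is expected to fail, since the keyspace and table already exist; that mirrors what happens when the cqlsh statements are pasted twice.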
Python sample app
http://datastax.github.io/python-driver/getting_started.html
cd /tmp/
wget http://slackbuilds.org/slackbuilds/14.1/libraries/libev.tar.gz
tar xvzf libev.tar.gz
cd libev
wget http://dist.schmorp.de/libev/Attic/libev-4.15.tar.gz
./libev.SlackBuild
installpkg /tmp/libev-4.15-i486-2_SBo.tgz
easy_install pip # if not installed
pip install cassandra-driver
pip install blist
python3 cass.py
python3 asyncCass.py
import time
from cassandra.cluster import Cluster

def successHandler(rows):
    # Called with the result rows when the query succeeds.
    print('Received data !')
    try:
        for user_row in rows:
            print('>>> %d %s %s' % (user_row.user_id, user_row.fname, user_row.lname))
    except Exception as ex:
        print(ex)

def errorHandler(exception):
    # Called with the exception when the query fails.
    print(exception)

if __name__ == '__main__':
    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('mykeyspace')
    # execute_async returns immediately with a future; the handlers run later.
    futurex = session.execute_async('SELECT user_id , fname , lname FROM users')
    futurex.add_callbacks(successHandler, errorHandler)
    print('Wait 3 seconds ...')
    time.sleep(3)
    # Close the connections to the cluster before exiting.
    cluster.shutdown()
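The fixed time.sleep(3) in the listing above either waits too long or not long enough. One possible alternative, sketched below, is to block on a threading.Event that the callbacks set. The helper wait_for_rows is a hypothetical name, not part of the driver; it relies only on the future's add_callbacks method already used above.

```python
import threading


def wait_for_rows(future, timeout=3.0):
    """Block until the future's callback or errback fires, or until timeout."""
    done = threading.Event()
    outcome = {}

    def on_success(rows):
        outcome['rows'] = list(rows)
        done.set()

    def on_error(exc):
        outcome['error'] = exc
        done.set()

    future.add_callbacks(on_success, on_error)

    # Event.wait returns False if the timeout expired before set() was called.
    if not done.wait(timeout):
        raise TimeoutError('no response within %s seconds' % timeout)
    if 'error' in outcome:
        raise outcome['error']
    return outcome['rows']
```

With a live session this would be used as rows = wait_for_rows(session.execute_async('SELECT user_id , fname , lname FROM users')). When no callbacks are needed at all, calling result() on the future also blocks until the rows arrive or the query fails.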
Python types conversion
http://datastax.github.io/python-driver/getting_started.html
Python Type            | CQL Literal Type
None                   | NULL
bool                   | boolean
float                  | float, double
int                    | int
long                   | bigint, varint, counter
decimal.Decimal        | decimal
str, unicode           | ascii, varchar, text
buffer, bytearray      | blob
date, datetime         | timestamp
list, tuple, generator | list
set, frozenset         | set
dict, OrderedDict      | map
uuid.UUID              | timeuuid, uuid
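These conversions apply when Python values are passed as query parameters. The sketch below assumes a session already connected to mykeyspace and the users table created earlier; insert_user is a hypothetical helper. The driver uses %s as the placeholder for every parameter, whatever its Python type.

```python
INSERT_USER = "INSERT INTO users (user_id, fname, lname) VALUES (%s, %s, %s)"


def insert_user(session, user_id, fname, lname):
    """Insert one row; the driver turns each Python value into a CQL literal."""
    # int -> int, str -> text, and None -> NULL, following the table above.
    # The placeholder is always %s, even for the integer user_id.
    session.execute(INSERT_USER, (user_id, fname, lname))
```

For example, insert_user(session, 1747, 'jane', None) would store a row whose lname is NULL.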
Java types conversion
CQL3 data type | Java type
ascii          | java.lang.String
bigint         | long
blob           | java.nio.ByteBuffer
boolean        | boolean
counter        | long
decimal        | java.math.BigDecimal
double         | double
float          | float
inet           | java.net.InetAddress
int            | int
list           | java.util.List<T>
map            | java.util.Map<K, V>
set            | java.util.Set<T>
text           | java.lang.String
timestamp      | java.util.Date
timeuuid       | java.util.UUID
uuid           | java.util.UUID
varchar        | java.lang.String
varint         | java.math.BigInteger