RkBlog

Hardware, programming and astronomy tutorials and reviews.

Tokyo Cabinet and Python

Tokyo Cabinet is a scalable key-value database that can be used in high-performance applications, including ones written in Python.

Tokyo Cabinet is a key-value database library developed by Mikio Hirabayashi. It is used on mixi.jp, a very large site, among others. Tokyo Cabinet is a simple and scalable way to store data. There are also a few companion projects: Tokyo Tyrant, which is a network server for Cabinet databases, and Tokyo Dystopia, which is a full-text search library for those databases. All of the libraries are written in C and released under the LGPL license.

Installation

On Linux/Unix systems we only need a C compiler such as GCC. If the Tokyo libraries aren't in your distribution's repository, just download the sources and compile them the standard way:
./configure --prefix=/usr
make
make install
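After installing, it can help to confirm that the system loader actually sees the new shared libraries. The snippet below is only a minimal sketch: it assumes the libraries are installed under the names "tokyocabinet" and "tokyotyrant", and the helper name find_tokyo_libs is made up for this example. It uses nothing but Python's standard ctypes.util module.

```python
from __future__ import print_function
import ctypes.util


def find_tokyo_libs(names=("tokyocabinet", "tokyotyrant")):
    """Map each library name to what the loader resolves it to (or None)."""
    # find_library returns a file name or path, or None if nothing is found.
    return dict((name, ctypes.util.find_library(name)) for name in names)


if __name__ == "__main__":
    for name, path in sorted(find_tokyo_libs().items()):
        print(name, "->", path or "not found")
```

If a library shows up as "not found" right after "make install", refreshing the loader cache (for example with ldconfig on Linux) is usually the first thing to try.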

Database handling with Tyrant

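Tokyo Tyrant provides the command-line tool ttserver. The simplest way to start it is to run "ttserver" on its own, which serves an in-memory database. To make it useful, launch it as a background daemon serving a file database (-dmn runs it as a daemon, -pid names the file that records its process ID):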
ttserver -dmn -pid /tmp/ttserver.pid /tmp/my_database.tch
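Because the server detaches into the background, it is not obvious whether it really came up. A quick check, sketched below, is to try a plain TCP connection to it. This assumes the server listens on 127.0.0.1 and port 1978 (the same address the Python script in this article connects to); the function name server_is_up is just an illustration and not part of Tokyo Tyrant.

```python
import socket


def server_is_up(host="127.0.0.1", port=1978, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or no answer in time.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    print("ttserver reachable: %s" % server_is_up())
```

Note that this only tells us that some process is listening on the port, not that it is ttserver or that the database file opened correctly.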

Using Tokyo Tyrant in Python

For Python we can use the pytyrant module. Install it from its source directory:
python setup.py install
Below is a simple script showcasing the API on the "my_database.tch" database:
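import pytyrant

# Connect to the ttserver started above (1978 is its default port).
t = pytyrant.PyTyrant.open('127.0.0.1', 1978)

# The connection behaves like a dictionary.
t['__test_key__'] = 'foo'
# concat() appends to the value already stored under the key.
t.concat('__test_key__', 'bar')
print t['__test_key__']  # foobar

del t['__test_key__']

# Write 99,999 small records...
for i in range(1, 100000):
    key = 'k%s' % i
    t[key] = str(i)
    print i

print
print 'read'
# ...and read them back.
for i in range(1, 100000):
    key = 'k%s' % i
    print t[key]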
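To get a rough feel for throughput, the two loops above can be wrapped in a timer. The sketch below is not a serious benchmark and makes a few assumptions: the helper name bench is invented here, it accepts any dictionary-like object, and a plain dict stands in for the Tyrant connection so that it runs without a server. With a server running, the object returned by pytyrant.PyTyrant.open could be passed instead.

```python
from __future__ import print_function
import time


def bench(store, n=1000):
    """Write, then read back, n small records.

    Returns a (write_seconds, read_seconds) tuple. `store` only needs
    to support store[key] = value and store[key].
    """
    keys = ["k%d" % i for i in range(1, n + 1)]

    start = time.time()
    for i, key in enumerate(keys, 1):
        store[key] = str(i)
    middle = time.time()
    for key in keys:
        store[key]  # fetch and discard the value
    end = time.time()

    return middle - start, end - middle


if __name__ == "__main__":
    # A plain dict is used as a stand-in for the database connection.
    write_s, read_s = bench({}, n=1000)
    print("write: %.4fs, read: %.4fs" % (write_s, read_s))
```

Unlike the script above, this does not print every record, since console output would otherwise dominate the measured time.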
The speed of Tokyo Cabinet depends on which database type you use (there are several file-based and in-memory storage options) and on how you tune its configuration.
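The database type is picked through the name handed to Tokyo Cabinet: the file suffix selects the storage engine, the special names "*" and "+" select in-memory databases, and tuning parameters can be appended as "#name=value". The helpers below are a small illustration of that naming scheme and nothing more. The function names describe and with_tuning are made up for this example, and bnum (the bucket count of a hash database) is used only as a sample parameter.

```python
# Storage engine selected by the file suffix of the database name.
STORAGE_BY_SUFFIX = {
    ".tch": "hash database (file)",
    ".tcb": "B+ tree database (file)",
    ".tcf": "fixed-length database (file)",
    ".tct": "table database (file)",
}

# Special names that select an in-memory database.
IN_MEMORY = {
    "*": "on-memory hash database",
    "+": "on-memory B+ tree database",
}


def describe(dbname):
    """Say which storage type a database name selects."""
    base = dbname.split("#", 1)[0]  # drop any "#name=value" tuning part
    if base in IN_MEMORY:
        return IN_MEMORY[base]
    for suffix, kind in STORAGE_BY_SUFFIX.items():
        if base.endswith(suffix):
            return kind
    return "unknown"


def with_tuning(path, **params):
    """Append tuning parameters to a database name as "#name=value"."""
    parts = ["#%s=%s" % (k, v) for k, v in sorted(params.items())]
    return path + "".join(parts)


if __name__ == "__main__":
    name = with_tuning("/tmp/my_database.tch", bnum=1000000)
    print(name)            # /tmp/my_database.tch#bnum=1000000
    print(describe(name))  # hash database (file)
```

The resulting string would be given to ttserver in place of the plain file path. Which values actually help depends on the data, so they are best found by measuring.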

There is also the pytc library, which is a direct Python API for Tokyo Cabinet (with examples).

On Slideshare you can find some nice slides: "Building TweetReach with Sinatra, Tokyo Cabinet and Grackle: Austin on Rails", "Pylons + Tokyo Cabinet" and "Introduction to Tokyo Products". Plurk used Tokyo Cabinet for its LightCloud database, which is a "better" memcache replacement.


9 October 2009
