xdata, copyright (c) 2001 Chris Lightfoot
http://www.ex-parrot.com/~chris/xdata/

$Id: README,v 1.2 2001/03/25 23:31:31 chris Exp $

These programs are free software; you can redistribute and/or modify them
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.


Introduction

xdata is a set of programs for manipulating key-value data, preserving
information about their last-modification times. Unlike the unix file
system, which has a similar aim, xdata is intended to be used for large
numbers of small data. Internally, it is implemented using GDBM (the GNU
database management library), though the same interface could be used with
another back-end.

The individual tools within xdata are intended for use in shell scripts;
the supplied perl module can be used in more sophisticated programs written
in perl, and you can write C programs which link against the routines in
xdata.c.

The programs in the distribution are--

    xdget   retrieve a datum
    xdput   insert or replace a datum
    xded    run an editor on a datum
    xdls    list keys
    xdrm    remove data
    xdmv    rename a datum
    xdcp    copy a datum
    xdsync  synchronise the contents of a database with a database on
            another host, using rsh(1) or ssh(1)

For more information about the individual programs, read the man page. A
brief introduction follows:


Structure of the database

Data in xdata are selected via two parameters: path and type.

The path, which is specified by the environment variable XDPATH or by the
command-line option -p in the tools, is a colon-separated list of locations
in which xdata will look for databases. If called upon to create a
database, xdata will do so in the first-named component of the path.

The type, which is specified by the environment variable XDTYPE or by the
command-line option -t, is a string which is used to distinguish various
sorts of data from one another.

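As a rough illustration of how a colon-separated path such as XDPATH breaks
down, the shell sketch below lists the components of a path and picks out
the first one, which is where a new database would be created. The helper
names xd_path_components and xd_create_dir are invented for this example
and are not part of xdata; the directory names and the type name are
placeholders.

```sh
#!/bin/sh
# Illustration only: these helpers are hypothetical, not part of xdata.

# Print each component of a colon-separated path on its own line.
xd_path_components() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# Print the first component, i.e. where a new database would be created.
xd_create_dir() {
    printf '%s\n' "${1%%:*}"
}

XDPATH="$HOME/.xdata:/var/lib/xdata"    # example value; placeholder directories
XDTYPE=todo                             # example type name
export XDPATH XDTYPE

xd_path_components "$XDPATH"
xd_create_dir "$XDPATH"
```

With the values above, the last command prints the ~/.xdata directory,
since it is the first-named component.
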
For most purposes, it is sufficient to create ~/.xdata, with appropriately
restrictive permissions, and set XDPATH=~/.xdata. There is a free choice of
type name, but short names which are free of whitespace are probably best.

Note that GDBM has the restriction that readers and writers may not access
a database simultaneously; however, arbitrarily many readers may access a
database at once. In the future, xdata may be rewritten to use Berkeley DB
or a similar library which supports more convenient locking semantics.


Example: `to do' list

This implements a `to do' list; pending items are printed on your terminal,
and new items may be added using xded. First, the script which prints out
the list:

    #!/bin/sh
    #
    # show-todo: show TODO list from xdata.
    #

    TODOLIST=$(xdls -t todo)
    # Nothing to show if there are no pending items.
    if [ -z "$TODOLIST" ] ; then exit ; fi

    # Text width derived from the terminal width, defaulting to 55.
    # WIDTH is computed here but not used below.
    if [ -n "$COLUMNS" ] ; then
        WIDTH=$(( COLUMNS - 25 ))
    else
        WIDTH=55
    fi

    # Print each key, then its text on an indented line.
    for i in $TODOLIST ; do
        printf '%s--\n    ' "$i"
        xdget -t todo "$i"
    done

(Note that this assumes that each item has a name which is free of white
space.) This produces output like

    water-plants--
        Water plants before they die.
    finish-xdata--
        Finish writing the documentation for xdata.

You can add an item to the list with xded -n $item_name, edit an existing
item with xded $item_name, and remove an item with xdrm $item_name. Cron
entries running xdsync can keep the to do list synchronised among various
hosts.


Synchronisation

The really useful feature of xdata is synchronisation. This is based on the
existence of a last-modified time for each datum. The xdsync program
contacts a remote host, and runs xdsync in `slave' mode on that host. The
two instances of the program then agree on which keys remain in the
database.

The synchronisation rule is simply that a newer version of a key overrides
an older; this also applies to deletions, so that a key which is created on
host a, then deleted on host b, will be removed from a when synchronisation
occurs.

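The rule that the newer version wins can be sketched in shell as follows.
This is only an illustration of the comparison, not xdsync's actual
protocol or code: the function name xd_winner is invented, timestamps are
taken to be whole seconds since the epoch, and reporting a tie as "same" is
an assumption, since how equal times are handled is not described here.

```sh
#!/bin/sh
# Illustration only: xd_winner is hypothetical, not part of xdata.

# xd_winner LOCAL_TIME REMOTE_TIME
# Print which side's record should be kept: "local", "remote" or "same".
# A deletion record carries the time of deletion and is compared in
# exactly the same way as an ordinary datum.
xd_winner() {
    if [ "$1" -gt "$2" ] ; then
        echo local
    elif [ "$1" -lt "$2" ] ; then
        echo remote
    else
        echo same    # assumption: tie handling is not specified
    fi
}

# A key created on host a at time 1000 and deleted on host b at time 2000:
# seen from a, the remote deletion record is newer, so this prints "remote".
xd_winner 1000 2000
```
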
Because of the reliance on modification times, xdsync will abort if the two
hosts do not agree on the time of day to within a certain margin (by
default, one minute). (Note that this protocol is not unlike the flood-fill
approach of NNTP USENET news.)

Because xdsync runs over rsh or ssh, there is no need to run an additional
daemon to make it work.


Installation

Review the contents of the makefile; you will probably wish to change the
definitions of BINPATH and MANPATH, unless your opinion of the FHS concurs
exactly with mine. Build the programs, and install them, either using make
install, or manually. The perl interface is separate; see below.


Internals

As mentioned above, the data in xdata are stored in GDBM databases. Keys
are arbitrary, but keys beginning `$$:' are reserved for internal use; in
particular, each type will have:

    $$:metadata
        which stores information about when the database was last
        synchronised, and when it was last purged of deletion records.

    $$:sync:<host>
        which stores information about when the database was last
        synchronised against <host>.

    $$:d:<key>
        which is used to save the information that <key> has been deleted,
        so that deletions are propagated by synchronisation; such records
        are periodically purged.

So, for instance, xdrm <key> will remove <key> and create in its place
$$:d:<key>; the latter key will have its last-modification time set to the
time of deletion.

Last-modification times are stored inside each key (in the first
sizeof(time_t) bytes). Any activity which writes to a key will set the
last-modification time. Observe that, since the last-modification time is
stored in a binary format, the xdata files themselves are not portable
across architectures. However, it is safe to employ xdsync to synchronise
data across machines of disparate architectures.

Note that key names are stored with their terminating nulls. This is
probably a bad thing, but it makes the C interface slightly simpler (at the
expense of making the perl interface slightly more complicated).
It's done now, and if you program to the given interfaces it won't affect
you.


Perl interface

The perl interface defines an object of type Data::XData. See the
documentation for more information, but basically:

    use Data::XData;

    # Open the database for writing ($path and $type must already be set).
    my $xd = Data::XData->new($path, $type, 1) or die "oops";

    # What keys exist?
    my @keys = $xd->keys();

    foreach my $k (@keys) {
        print "key $k\n";

        # Tell me about this key.
        my ($time, $size) = $xd->stat($k);
        print "  size $size, last modified "
              . scalar(localtime($time)) . "\n";

        # Obtain the data for this key.
        print "  data:\n";
        print $xd->get($k);
    }

    # Change existing or insert new key.
    $xd->put("fish", "Creatures which swim around underwater", 1);

    # Delete key if it exists.
    if ($xd->exists("salmon")) {
        $xd->delete("salmon");
        print STDERR "deleted too-specific fishtype\n";
    }

To install it, go into perl/Data-XData, and do the standard

    perl Makefile.PL ; make ; make install

You could run make test, if you wanted to, but to be honest it doesn't test
anything.

Some example scripts are in perl/scripts.