===========================================
 PyTables, hierarchical datasets in Python
===========================================

:URL: http://www.pytables.org/


PyTables is a package for managing hierarchical datasets, designed to
cope efficiently with extremely large amounts of data.

It is built on top of the HDF5 library and the numarray package. It
features an object-oriented interface that, combined with C extensions
for the performance-critical parts of the code (generated using
Pyrex), makes it a fast yet extremely easy-to-use tool for
interactively saving and retrieving very large amounts of data. One
important feature of PyTables is that it optimizes memory and disk
resources so that data takes much less space (by a factor of 3 to 5,
and more if the data is compressible) than with other solutions, such
as relational or object-oriented databases.

PyTables is not designed to work as a relational database replacement,
but rather as a teammate. If you want to work with large datasets of
multidimensional data (for example, for multidimensional analysis), or
just provide a categorized structure for some portions of your
cluttered RDBS, then give PyTables a try. It works well for storing
data from data acquisition systems (DAS), simulation software, network
data monitoring systems (for example, traffic measurements of IP
packets on routers), or as a centralized repository for system logs,
to name only a few possible uses.

A table is defined as a collection of records whose values are stored
in fixed-length fields. All records have the same structure and all
values in each field have the same data type. The terms "fixed-length"
and strict "data types" seem to be quite strange requirements for an
interpreted language like Python, but they serve a useful function if
the goal is to save very large quantities of data (such as is
generated by many scientific applications, for example) in an
efficient manner that reduces demand on CPU time and I/O.
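
The benefit of fixed-length fields can be illustrated with a minimal
sketch that uses only the standard library ``struct`` module. It does
not use the PyTables API, and the record layout and the helper names
``pack_records`` and ``read_record`` are assumptions made up for this
illustration. Because every record has the same size, any record can
be located by simple arithmetic on its index:

```python
import struct

# Hypothetical record layout: a 16-byte name, a 32-bit integer and a
# 64-bit float, little-endian with no padding between fields.
RECORD = struct.Struct("<16sid")


def pack_records(rows):
    """Pack (name, count, value) tuples into one contiguous buffer."""
    return b"".join(
        RECORD.pack(name.encode("ascii"), count, value)
        for name, count, value in rows
    )


def read_record(buffer, index):
    """Read record number `index` without scanning the earlier ones."""
    name, count, value = RECORD.unpack_from(buffer, index * RECORD.size)
    # struct pads short names with NUL bytes; strip them again.
    return name.rstrip(b"\x00").decode("ascii"), count, value


rows = [("alpha", 1, 0.5), ("beta", 2, 1.5), ("gamma", 3, 2.5)]
data = pack_records(rows)
print(RECORD.size, len(data))
print(read_record(data, 2))
```

The same idea, applied to data on disk, is what lets a table with a
strict record structure be read and written with little CPU and I/O
overhead.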

There are other useful objects, such as arrays, enlargeable arrays and
variable-length arrays, that can serve different needs in your
project. Also, quite a bit of effort has been invested to make
browsing the hierarchical data structure a pleasant
experience. PyTables implements a few easy-to-use methods for
browsing. See the documentation (located in the ``doc/`` directory)
for more details.

One of the principal objectives of PyTables is to be user-friendly.
To that end, special Python features like generators, slots and
metaclasses in new-style classes have been used. In addition,
iterators have been implemented where appropriate so as to
make interactive work as productive as possible. For these
reasons, you will need to use Python 2.3 or higher (Python 2.3.5 or
better recommended) to take advantage of PyTables.
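
As a rough illustration of how slots and generators help when
browsing a hierarchy, here is a minimal sketch. The ``Node`` class and
its ``walk`` method are hypothetical and are not PyTables classes or
methods. The code is written for a current Python 3 interpreter
(``yield from`` is not available in Python 2.3):

```python
class Node:
    """Hypothetical tree node; not a PyTables class."""

    # Slots avoid a per-instance dictionary, which saves memory when
    # a tree holds many nodes.
    __slots__ = ("name", "children")

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def __iter__(self):
        # Iterating over a node yields its direct children.
        return iter(self.children)

    def walk(self, prefix=""):
        """Yield the path of this node and every descendant, depth first."""
        path = prefix + "/" + self.name if self.name else "/"
        yield path
        for child in self.children:
            # The root path is "/", so its children use an empty prefix.
            yield from child.walk("" if path == "/" else path)


tree = Node("", [Node("detector", [Node("readout")]), Node("columns")])
for path in tree.walk():
    print(path)
```

Because ``walk`` is a generator, paths are produced one at a time, so
an interactive session can start inspecting a large tree without
building the full list of nodes first.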

To compile PyTables you will need, at least, a recent version of the
HDF5 library (C flavor), the Zlib compression library and the numarray
package. If you also want support for the LZO, UCL and bzip2
compression libraries, you will need recent versions of them as well.
These compression libraries are, however, optional.

We've tested this PyTables version with HDF5 1.6.5 and numarray 1.5.1,
and you *need* to use these versions, or higher, to make use of
PyTables. Although you do not need NumPy or Numeric Python in order to
compile PyTables, they are supported; you only need a reasonably
recent version of them (>= 0.9.8 for NumPy and >= 24.x for
Numeric). PyTables has been successfully tested against NumPy 0.9.8
and Numeric 24.2.
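
Minimum-version requirements like these are easy to get wrong when
version strings are compared as text, because ``"1.10.0"`` sorts
before ``"1.6.5"``. The following sketch compares them as tuples of
integers instead. The helper names ``parse_version`` and
``meets_minimum`` are assumptions for illustration and are not part of
PyTables, and the parser only handles purely numeric dotted versions
(so not a string such as ``24.x``):

```python
def parse_version(text):
    """Turn a dotted version string such as '1.6.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions quoted in the paragraph above.
MINIMUMS = {"HDF5": "1.6.5", "NumPy": "0.9.8"}

print(meets_minimum("1.6.5", MINIMUMS["HDF5"]))
print(meets_minimum("1.6.4", MINIMUMS["HDF5"]))
print(meets_minimum("1.10.0", MINIMUMS["HDF5"]))
```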

We are using Linux on top of Intel32 as the main development platform,
but PyTables should be easy to compile/install on other UNIX machines.
This package has also been successfully compiled and tested on
FreeBSD 5.4 with Opteron64 processors, an UltraSparc platform with
Solaris 7 and Solaris 8, an SGI Origin3000 with Itanium processors
running IRIX 6.5 (using the gcc compiler), Microsoft Windows and
MacOSX (10.2, although 10.3 should work fine as well). In particular,
it has been thoroughly tested on 64-bit platforms, like Linux-64 on
top of an Intel Itanium, AMD Opteron (in 64-bit mode) or PowerPC G5
(in 64-bit mode) where all the tests pass successfully.

Nonetheless, caveat emptor: more testing is needed to achieve complete
portability. We'd appreciate input on how it compiles and installs on
other platforms.


Installation
============

The Python Distutils are used to build and install PyTables, so it is
fairly simple to get things ready to go. The following are very simple
instructions on how to proceed. More detailed instructions,
including a section on binary installation for Windows users, are
available in Chapter 2 of the User's Manual (``doc/usersguide.pdf`` or
http://www.pytables.org/moin/HowToUse).

1. First, make sure that you have HDF5 and numarray installed (you
   will need at least HDF5 1.6.5 and numarray 1.5.0). If you don't, you
   can find them at http://hdf.ncsa.uiuc.edu/HDF5 and
   http://sourceforge.net/projects/numpy/. Compile/install them.

   Optionally, consider installing the excellent LZO and UCL
   compression libraries from http://www.oberhumer.com/opensource/.
   You can also install the high-performance bzip2 compression
   library, available at http://www.bzip.org/.

2. From the main PyTables distribution directory run this command
   (plus any extra flags needed as discussed above)::

        python setup.py build_ext --inplace

3. To run the test suite, change into the test directory, set the
   PYTHONPATH environment variable to include the ``../..`` directory
   and issue the command::

        python test_all.py

   If you would like to see some verbose output from the tests simply
   add the flag ``-v`` and/or the word ``verbose`` to the command
   line. You can also run just the tests in a particular test module.
   For example::

        python test_types.py -v

   If any test does not pass, please run the failing
   test module with all verbosity enabled (use the flags ``-v`` or
   ``verbose``), and send the output back to us.

4. To install the entire PyTables Python package, change back to the
   root distribution directory and run this command as the root user
   (remember to add any extra flags needed)::

        python setup.py install
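
The test run in step 3 can also be prepared from Python. The sketch
below only builds the environment and the argument list; it does not
run anything. The helper names ``make_test_env`` and
``make_test_command`` are hypothetical, and the location of the test
directory is left to the caller because it depends on your source
tree:

```python
import os
import sys


def make_test_env(test_dir, base_env=None):
    """Return a copy of the environment with PYTHONPATH set as in step 3."""
    env = dict(os.environ if base_env is None else base_env)
    # Step 3 asks for the ``../..`` directory, relative to the test
    # directory, to be on PYTHONPATH.
    root = os.path.normpath(os.path.join(test_dir, "..", ".."))
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = root if not existing else root + os.pathsep + existing
    return env


def make_test_command(module="test_all.py", verbose=False):
    """Build the argument list for running one test module."""
    command = [sys.executable, module]
    if verbose:
        command.append("-v")
    return command


print(make_test_command("test_types.py", verbose=True)[1:])
```

The results can be passed to the standard library, for example
``subprocess.run(make_test_command(), cwd=test_dir,
env=make_test_env(test_dir))``, where ``test_dir`` is the path to the
test directory in your checkout.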


That's it!  Good luck, and let us know of any bugs, suggestions,
gripes, kudos, etc. you may have.

----

  **Enjoy data!**

  -- The PyTables Team

.. Local Variables:
.. mode: text
.. coding: utf-8
.. fill-column: 70
.. End:
