For some reason I keep finding myself dabbling in the worlds of compression and encryption. I'm not an expert in either of these areas, nor do I aspire to become one. It's just something that catches my interest from time to time.
On computers, both compression and encryption usually take bit patterns with a given meaning and translate them to other patterns intended to have the same meaning. This typically means having to read, write, and manipulate arbitrary groups of bits. To save myself from reinventing the wheel every time I played with another compression or encryption algorithm, I developed two libraries: one for bitwise file reading and writing (bitfile), and the other for manipulating arbitrary length arrays of bits (bitarray).
I originally wrote the bitarray library in ANSI C, because I used C for
my compression algorithms. However, this library was one of just a few
things that I ever wrote (I've written a lot) where I thought that I could
do a better job with C++, so I developed a C++ implementation. Of course,
somewhere along the way the Standard Template Library (STL) was added to
C++, and you can now do much of what the bit array library does with
STL facilities such as bitset.
I am publishing both of these libraries under the GNU LGPL in hopes that they will be of use to other people.
The rest of this page discusses each of my libraries.
The ANSI C bitarray library provides a collection of functions that
create and operate on arrays of bits. The ISO C++ bitarray library
provides a class with methods that perform similar functions. Modern
versions of the C++ STL provide
bitset for similar functionality. My C++ implementation
doesn't use the C++ STL.
Bitarrays may be of any size and are implemented as arrays of
unsigned char. Bit 0 of the most significant unsigned char (char 0) is
the most significant bit (msb) of the bit array. The last (non-spare)
bit of the last unsigned char is the least significant bit (lsb).
Example: an array of 20 bits (0 through 19) with 8 bit unsigned chars
requires 3 unsigned chars (0 through 2) to store all the bits, leaving
4 spare bits in the last unsigned char.
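The layout described above comes down to a little index arithmetic. The following is only a sketch of that arithmetic, not code taken from the library; the helper names (CharsNeeded, CharIndex, BitMask, TestBit) are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helpers (not part of the bitarray library) illustrating
// the msb-first layout: bit 0 is the msb of char 0.

// Number of unsigned chars needed to hold numBits bits.
constexpr std::size_t CharsNeeded(std::size_t numBits)
{
    return (numBits + CHAR_BIT - 1) / CHAR_BIT;
}

// Index of the unsigned char that holds the given bit.
constexpr std::size_t CharIndex(std::size_t bit)
{
    return bit / CHAR_BIT;
}

// Mask selecting the bit inside its unsigned char.
// The first bit of each char maps to that char's msb.
constexpr unsigned char BitMask(std::size_t bit)
{
    return static_cast<unsigned char>(
        1u << (CHAR_BIT - 1 - (bit % CHAR_BIT)));
}

// Read one bit from a raw msb-first array of numBits bits.
inline bool TestBit(const unsigned char *array, std::size_t numBits,
                    std::size_t bit)
{
    assert(bit < numBits);
    return (array[CharIndex(bit)] & BitMask(bit)) != 0;
}
```

With 8 bit unsigned chars, CharsNeeded(20) is 3, and bit 19 lives in char 2 under the mask 0x10. The four low-order bits of char 2 are the spare bits.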
The array data is contained inside a structure/class which includes a count of the number of bits in the array, and a pointer to the memory storing the array. Since arrays may be of arbitrary size, the memory storing the array is dynamically allocated on the heap.
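As a rough illustration of that arrangement, here is a minimal sketch of a class holding a bit count and a pointer to heap storage. It is an assumption about the general shape, not the library's actual class; BitArraySketch and its member names are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical class (not the library's) pairing a bit count with
// dynamically allocated storage.
class BitArraySketch
{
public:
    explicit BitArraySketch(std::size_t numBits) :
        m_numBits(numBits),
        // Trailing () value-initializes the storage, so every bit is 0.
        m_array(new unsigned char[CharCount(numBits)]())
    {
        assert(numBits > 0);
    }

    ~BitArraySketch() { delete[] m_array; }

    // Copying is disabled to keep the sketch short; a real class would
    // need a deep copy here.
    BitArraySketch(const BitArraySketch &) = delete;
    BitArraySketch &operator=(const BitArraySketch &) = delete;

    std::size_t Size() const { return m_numBits; }
    std::size_t SizeInChars() const { return CharCount(m_numBits); }
    const unsigned char *Data() const { return m_array; }

private:
    // Round up to a whole number of unsigned chars.
    static std::size_t CharCount(std::size_t numBits)
    {
        return (numBits + CHAR_BIT - 1) / CHAR_BIT;
    }

    std::size_t m_numBits;      // number of bits in the array
    unsigned char *m_array;     // heap memory storing the bits
};
```

Keeping the count in bits rather than chars is what lets an operation tell the used bits from the spare bits in the last unsigned char.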
The C++ bitarray class overloads bitwise operators (&, |, ^, ...), providing the expected results on bitarray objects. The C bitarray library provides functions (BitArrayAnd, BitArrayOr, BitArrayXor, ...) for similar functionality.
I have written the bitarray library so that functions and methods requiring multiple bit arrays (such as BitArrayAnd or &) will not do anything if they are given arrays of differing sizes to operate on.
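The size check can be sketched as follows. This is not the library's BitArrayAnd; AndInPlace is a hypothetical function working on raw arrays, and its bool return value is my own addition so that the caller can tell whether anything happened.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical function (not the library's API). ANDs src into dest
// only when both arrays hold the same number of bits.
// Returns true if the AND was performed, false if nothing was done.
bool AndInPlace(unsigned char *dest, std::size_t destBits,
                const unsigned char *src, std::size_t srcBits)
{
    assert(dest != nullptr && src != nullptr);

    if (destBits != srcBits)
    {
        return false;   // differing sizes: leave dest untouched
    }

    const std::size_t numChars = (destBits + CHAR_BIT - 1) / CHAR_BIT;

    for (std::size_t i = 0; i < numChars; ++i)
    {
        dest[i] &= src[i];
    }

    return true;
}
```

Doing nothing on a size mismatch avoids having to decide how a shorter array should be aligned against a longer one.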
With native arrays, square brackets ([]) may be used to either obtain
the value of an array element (1), or to obtain a pointer to an array
element (2):
1. if (array[index] == value) ...
2. array[index] = value;
Unfortunately I have not found a way to do anything close to this with bitarrays in C.
In C++ it's not possible to overload square brackets ([]) to behave
both ways. Consequently square brackets ([]) return a bit value and
parentheses (()) return a class that behaves as a pointer to a bit in
the array. The class returned by parentheses (()) may only be used for
assigning bit values.
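The split between a reading operator and a writing operator can be sketched with a small proxy class. This only illustrates the technique and is not the library's code; BitsSketch and BitRefSketch are hypothetical names.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical proxy standing in for "a pointer to one bit".
// Its only public operation is assignment of a bit value.
class BitRefSketch
{
public:
    BitRefSketch(unsigned char *byte, unsigned char mask) :
        m_byte(byte), m_mask(mask) {}

    void operator=(bool value)
    {
        if (value)
        {
            *m_byte |= m_mask;      // set the bit
        }
        else
        {
            *m_byte &= static_cast<unsigned char>(~m_mask);  // clear it
        }
    }

private:
    unsigned char *m_byte;  // char containing the bit
    unsigned char m_mask;   // selects the bit within that char
};

// Hypothetical bit array: [] reads a bit, () yields a write-only proxy.
class BitsSketch
{
public:
    explicit BitsSketch(std::size_t numBits) :
        m_numBits(numBits),
        m_array(new unsigned char[(numBits + CHAR_BIT - 1) / CHAR_BIT]())
    {}

    ~BitsSketch() { delete[] m_array; }
    BitsSketch(const BitsSketch &) = delete;
    BitsSketch &operator=(const BitsSketch &) = delete;

    // Read access: returns the value of the bit.
    bool operator[](std::size_t bit) const
    {
        assert(bit < m_numBits);
        return (m_array[bit / CHAR_BIT] & Mask(bit)) != 0;
    }

    // Write access: returns a proxy that can only be assigned to.
    BitRefSketch operator()(std::size_t bit)
    {
        assert(bit < m_numBits);
        return BitRefSketch(&m_array[bit / CHAR_BIT], Mask(bit));
    }

private:
    // The first bit of each char maps to that char's msb.
    static unsigned char Mask(std::size_t bit)
    {
        return static_cast<unsigned char>(
            1u << (CHAR_BIT - 1 - (bit % CHAR_BIT)));
    }

    std::size_t m_numBits;
    unsigned char *m_array;
};
```

With this sketch, bits(19) = true; sets a bit and if (bits[19]) ... tests it. The proxy is needed because a single bit has no address of its own, so there is nothing for a real pointer or reference to refer to.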
A description of each of the functions in my C bitarray library may be found here. Unfortunately, I haven't written a similar description for the C++ bitarray library. Both the C and C++ bitarray library source archives also include detailed headers preceding each function, and I have included a file named sample.[c|cpp] which demonstrates the usage of each function in the bitarray library.
All the source code that I have provided is written in strict ANSI C or ISO C++. I would expect it to build correctly on any machine with ANSI C/ISO C++ compilers. I have tested the code compiled with gcc on Linux on an Intel x86 and mingw on Windows XP.
The library includes routines intended for debugging which dump
the array contents to a display. These routines assume that
unsigned chars are 8 bits. They could easily be rewritten to support
any specific size of unsigned char, but writing the dump routines to
handle an arbitrary size unsigned char seems more difficult than it is
worth to me, especially since I only have access to machines with
8 bit unsigned chars.
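To show where the 8 bit assumption comes from, here is a sketch of a hex dump that emits exactly two hex digits per unsigned char. It is not the library's dump routine; DumpHex is a hypothetical name, and it returns a std::string (which my library avoids) purely so that the result is easy to inspect.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Two hex digits cover exactly 8 bits, so refuse to build elsewhere.
static_assert(CHAR_BIT == 8, "this dump sketch assumes 8 bit unsigned chars");

// Hypothetical helper (not the library's dump routine): formats an
// array of numBits bits as two hex digits per unsigned char, with the
// chars separated by spaces. Spare bits in the last char are included.
std::string DumpHex(const unsigned char *array, std::size_t numBits)
{
    assert(array != nullptr);

    static const char digits[] = "0123456789ABCDEF";
    const std::size_t numChars = (numBits + CHAR_BIT - 1) / CHAR_BIT;
    std::string out;

    for (std::size_t i = 0; i < numChars; ++i)
    {
        if (i != 0)
        {
            out += ' ';
        }

        out += digits[array[i] >> 4];       // high nibble
        out += digits[array[i] & 0x0F];     // low nibble
    }

    return out;
}
```

The fixed pair of nibble lookups is the part that would have to change for any other size of unsigned char.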
A repository containing the source for each bitfile library may be downloaded by clicking on the links below. My source has been released under the GNU LGPL. The source code repository is available on GitHub. I recommend that you check out the latest revision of the master branch, unless you're looking for something specific.
My latest implementations of Huffman codes provide an additional example of how to use the C version of these libraries. If you still have any questions or comments feel free to e-mail me at email@example.com.
Last updated on December 23, 2018