blist 1.1.1 is now available: http://pypi.python.org/pypi/blist/
What is blist?
The blist is a drop-in replacement for the Python list the provides better performance when modifying large lists. Python’s built-in list is a dynamically-sized array; to insert or removal an item from the beginning or middle of the list, it has to move most of the list in memory, i.e., O(n) operations. The blist uses a flexible, hybrid array/tree structure and only needs to move a small portion of items in memory, specifically using O(log n) operations.
For small lists, the blist and the built-in list have virtually identical performance.
What’s new?
The blist package provides other data structures based on the blist:
- sortedlist
- sortedset
- weaksortedlist
- weaksorteset
- sorteddict
- btuple
These additional data structures are only available in Python 2.6 or higher, as they make use of Abstract Base Classes.
The sortedlist is a list that’s always sorted. It’s iterable and indexable like a Python list, but to modify a sortedlist the same
methods you would use on a Python set (add, discard, or remove).
>>> from blist import sortedlist
>>> my_list = sortedlist([3,7,2,1])
>>> my_list
sortedlist([1, 2, 3, 7])
>>> my_list.add(5)
>>> my_list[3]
5
>>>
The sortedlist constructor takes an optional "key" argument, which may be used to change the sort order just like the sorted() function.
>>> from blist import sortedlist
>>> my_list = sortedlist([3,7,2,1], key=lambda i: -i)
sortedlist([7, 3, 2, 1]
>>>
The sortedset is a set that’s always sorted. It’s iterable and indexable like a Python list, but modified like a set. Essentially, it’s just like a sortedlist except that duplicates are ignored.
>>> from blist import sortedset
>>> my_set = sortedset([3,7,2,2])
sortedset([2, 3, 7]
>>>
The weaksortedlist and weaksortedset are weakref variations of the sortedlist and sortedset.
The sorteddict works just like a regular dict, except the keys are always sorted. The sorteddict should not be confused with Python 2.7’s OrderedDict type, which remembers the insertion order of the keys.
>>> from blist import sorteddict
>>> my_dict = sorteddict({1: 5, 6: 8, -5: 9})
>>> my_dict.keys()
[-5, 1, 6]
>>>
The btuple is a drop-in replacement for the built-in tuple. Compared to the built-in tuple, the btuple offers the following advantages:
- Constructing a btuple from a blist takes O(1) time.
- Taking a slice of a btuple takes O(n) time, where n is the size of the original tuple. The size of the slice does not matter.
>>> from blist import blist, btuple
>>> x = blist([0]) # x is a blist with one element
>>> x *= 2**29 # x is a blist with > 500 million elements
>>> y = btuple(x) # y is a btuple with > 500 million elements
Feedback
We’re eager to hear about your experiences with the blist. You can email me at daniel@stutzbachenterprises.com. Alternately, bug reports and feature requests may be reported on our bug tracker at: http://github.com/DanielStutzbach/blist/issues
How we test
In addition to the tests include in the source distribution, we perform the following to add extra rigor to our testing process:
- We use a "fuzzer": a program that randomly generates list operations, performs them using both the blist and the built-in list, and compares the results.
- We use a modified Python interpreter where we have replaced the array-based built-in list with the blist. Then, we run all of the regular Python unit tests.