blist 1.1.1 is now available: http://pypi.python.org/pypi/blist/
What is blist?
The blist is a drop-in replacement for the Python list the provides better performance when modifying large lists. Python’s built-in list is a dynamically-sized array; to insert or removal an item from the beginning or middle of the list, it has to move most of the list in memory, i.e., O(n) operations. The blist uses a flexible, hybrid array/tree structure and only needs to move a small portion of items in memory, specifically using O(log n) operations.
For small lists, the blist and the built-in list have virtually identical performance.
What’s new?
The blist package provides other data structures based on the blist:
- sortedlist
- sortedset
- weaksortedlist
- weaksorteset
- sorteddict
- btuple
These additional data structures are only available in Python 2.6 or higher, as they make use of Abstract Base Classes.
The sortedlist is a list that’s always sorted. It’s iterable and indexable like a Python list, but to modify a sortedlist the same
methods you would use on a Python set (add, discard, or remove).
>>> from blist import sortedlist >>> my_list = sortedlist([3,7,2,1]) >>> my_list sortedlist([1, 2, 3, 7]) >>> my_list.add(5) >>> my_list[3] 5 >>>
The sortedlist constructor takes an optional "key" argument, which may be used to change the sort order just like the sorted() function.
>>> from blist import sortedlist >>> my_list = sortedlist([3,7,2,1], key=lambda i: -i) sortedlist([7, 3, 2, 1] >>>
The sortedset is a set that’s always sorted. It’s iterable and indexable like a Python list, but modified like a set. Essentially, it’s just like a sortedlist except that duplicates are ignored.
>>> from blist import sortedset >>> my_set = sortedset([3,7,2,2]) sortedset([2, 3, 7] >>>
The weaksortedlist and weaksortedset are weakref variations of the sortedlist and sortedset.
The sorteddict works just like a regular dict, except the keys are always sorted. The sorteddict should not be confused with Python 2.7’s OrderedDict type, which remembers the insertion order of the keys.
>>> from blist import sorteddict >>> my_dict = sorteddict({1: 5, 6: 8, -5: 9}) >>> my_dict.keys() [-5, 1, 6] >>>
The btuple is a drop-in replacement for the built-in tuple. Compared to the built-in tuple, the btuple offers the following advantages:
- Constructing a btuple from a blist takes O(1) time.
- Taking a slice of a btuple takes O(n) time, where n is the size of the original tuple. The size of the slice does not matter.
>>> from blist import blist, btuple >>> x = blist([0]) # x is a blist with one element >>> x *= 2**29 # x is a blist with > 500 million elements >>> y = btuple(x) # y is a btuple with > 500 million elements
Feedback
We’re eager to hear about your experiences with the blist. You can email me at daniel@stutzbachenterprises.com. Alternately, bug reports and feature requests may be reported on our bug tracker at: http://github.com/DanielStutzbach/blist/issues
How we test
In addition to the tests include in the source distribution, we perform the following to add extra rigor to our testing process:
- We use a "fuzzer": a program that randomly generates list operations, performs them using both the blist and the built-in list, and compares the results.
- We use a modified Python interpreter where we have replaced the array-based built-in list with the blist. Then, we run all of the regular Python unit tests.