sets and bags (multisets) in Python
Peter Dobcsanyi
p.dobcsanyi at designtheory.org
Sat May 1 21:09:44 BST 2004
Here is my planned reply to the Python developers regarding bags. I
wrote this after a quick discussion with Leonard about what features we
would like to see for bags.
In my planned reply I make references to the "set builtin data type",
you can find the related documentation at:
http://www.python.org/dev/doc/devel/lib/types-set.html
You may also want to take a look at the "collections" module's doc of
which "bag" would be part:
http://www.python.org/dev/doc/devel/lib/module-collections.html
Please note, our online Python documentation on the internal pages does
not contain these details and/or is out of date, you should look the
development versions for the new Python release at the above URLs.
.........................................................................
Here are the bag (multiset) features we thought would be useful.
Regardless of actual implementation, let us consider a bag as a set of
ordered pairs (x, m) where x is an element of the bag and m >= 0 is the
multiplicity of x.
A = {(x, m_A), ...}
B = {(x, m_B), ...}
union A | B = { (x, m_A + m_B), ...}
intersection A & B = { (x, min(m_A, m_B), ... }
difference A - B = { (x, max(m_A - m_B, 0)), ... }
symmetric diff A ^ B = { (x, m_A + m_B - 2 * min(m_A, m_B)), ... }
A.multiplicity(x) returns a non-negative integer
special cases:
x in A iff A.multiplicity(x) > 0
x not in A iff A.multiplicity(x) == 0
len(A) returns the sum of multiplicities
A.underlying_set() return the underlying set of A
A.underlying_size() (== len(A.underlying_set())
A.no_repeats() return True if for all x in A.underlying_set()
A.multiplicity(x) == 1
A <= B iff for all x in A.underlying_set()
A.multiplicity(x) <= B.multiplicity(x)
A == B iff A<=B and B<=A
A.add(x, [m]) add x to A with multiplicity m; if m is omitted it
defaults to 1
A.remove(x, [m]) remove x from A m-times, if m is omitted it defaults
to 1; raise BagError (or whatever) if x is not present.
(I don't like KeyError for this, neither here nor in
the case of sets.)
A.discard(x, [m]) remove x from A m-times if it is present; if m is
omitted it defaults to 1
A.pop([selection_mode]) if selection_mode is omitted remove and return
an arbitrary element; raise BagError if A is empty
selection modes UNIFORM: make a uniformly distributed random
choice over the list of elements
UNDERLYING_UNIFORM: make a uniformly distributed
random choice over the list of underlying elements.
MAJORITY: the most numerous element (or
uniform among them if more than one)
MINORITY: similarly, as the name suggests
The mutable-immutable distinction: as it is with the builtin set.
xxx_update() functions, operator, non-operator variants, using iterable
arguments: similarly to the builtin set versions.
Conversions:
Most of them provided by the xxx_update() functions
A.items() return A as a list
Aside: could we have S.items() if S is a set?
Iteration:
for x in A: as it was for x in A.items():
hash(A) as it was tuple(sorted(A.items()))
(or more precisely, a recursive "deep-freezing sorting" -- ?)
...........................................................................
Note, how the xxx_update() functions and releated operators provides the
in-place modification; that is the nice "pythonic way".
-- ,
Peter Dobcsanyi
More information about the Developers
mailing list