4.4.2 SequenceMatcher Examples
This example compares two strings, considering blanks to be "junk":
>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(lambda x: x == " ",
...                     "private Thread currentThread;",
...                     "private volatile Thread currentThread;")
ratio() returns a float in [0, 1], measuring the similarity of the
sequences. As a rule of thumb, a ratio() value over 0.6 means the
sequences are close matches:
>>> print round(s.ratio(), 3)
0.866
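The figure can be cross-checked by hand. The sketch below assumes the documented definition of ratio() as 2.0*M/T, where M is the number of matched elements and T is the total number of elements in both sequences. The variable names (matches, total, manual_ratio) are illustrative only, and the code is written for Python 3, hence print() rather than the print statement used in the session above.

```python
from difflib import SequenceMatcher

a = "private Thread currentThread;"
b = "private volatile Thread currentThread;"
s = SequenceMatcher(lambda x: x == " ", a, b)

# M: number of matched elements, summed over all matching blocks
# (the trailing dummy block contributes 0).
matches = sum(size for _, _, size in s.get_matching_blocks())

# T: total number of elements in both sequences.
total = len(a) + len(b)

manual_ratio = 2.0 * matches / total
print(round(manual_ratio, 3))  # 0.866
```

Here M is 29 and T is 67, so the ratio is 58/67, which rounds to the value reported by ratio().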
If you're only interested in where the sequences match, get_matching_blocks()
is handy:
>>> for block in s.get_matching_blocks():
...     print "a[%d] and b[%d] match for %d elements" % block
a[0] and b[0] match for 8 elements
a[8] and b[17] match for 6 elements
a[14] and b[23] match for 15 elements
a[29] and b[38] match for 0 elements
Note that the last tuple returned by get_matching_blocks() is
always a dummy, (len(a), len(b), 0), and this is the only
case in which the last tuple element (number of elements matched) is 0.
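The following sketch checks both properties on the strings used above: that the final triple is the dummy, and that every other triple (i, j, n) identifies equal slices a[i:i+n] and b[j:j+n]. The names sentinel_ok and slices_equal are hypothetical, chosen for this illustration. The code is written for Python 3 and does not depend on how many real blocks a particular Python version reports.

```python
from difflib import SequenceMatcher

a = "private Thread currentThread;"
b = "private volatile Thread currentThread;"
s = SequenceMatcher(lambda x: x == " ", a, b)

blocks = s.get_matching_blocks()

# The last triple is always the dummy (len(a), len(b), 0).
sentinel_ok = tuple(blocks[-1]) == (len(a), len(b), 0)

# Every triple before the dummy has n > 0 and marks identical slices.
slices_equal = all(
    n > 0 and a[i:i + n] == b[j:j + n]
    for i, j, n in blocks[:-1]
)

print(sentinel_ok, slices_equal)  # True True
```

Because of the dummy, code that walks the blocks to find the gaps between matches can treat the end of both sequences like any other block boundary.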
If you want to know how to change the first sequence into the second, use get_opcodes():
>>> for opcode in s.get_opcodes():
...     print "%6s a[%d:%d] b[%d:%d]" % opcode
 equal a[0:8] b[0:8]
insert a[8:8] b[8:17]
 equal a[8:14] b[17:23]
 equal a[14:29] b[23:38]
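Each opcode is a 5-tuple (tag, i1, i2, j1, j2) describing how a[i1:i2] relates to b[j1:j2]. As a minimal sketch of how the opcodes can be applied, the code below rebuilds the second string from the first by copying "equal" ranges from a and taking "replace" and "insert" ranges from b. The names pieces and rebuilt are illustrative, and the code is written for Python 3.

```python
from difflib import SequenceMatcher

a = "private Thread currentThread;"
b = "private volatile Thread currentThread;"
s = SequenceMatcher(lambda x: x == " ", a, b)

pieces = []
for tag, i1, i2, j1, j2 in s.get_opcodes():
    if tag == "equal":
        # Unchanged range: copy it from the first sequence.
        pieces.append(a[i1:i2])
    elif tag in ("replace", "insert"):
        # New material comes from the second sequence.
        pieces.append(b[j1:j2])
    # "delete": a[i1:i2] is simply dropped.

rebuilt = "".join(pieces)
print(rebuilt == b)  # True
```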
See also the function get_close_matches() in this module, which
shows how simple code building on SequenceMatcher can be used to do
useful work.
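A short usage sketch of get_close_matches(), relying on its default arguments (at most 3 results, similarity cutoff 0.6). The word list is sample data chosen for illustration, and the code is written for Python 3.

```python
from difflib import get_close_matches

candidates = ["ape", "apple", "peach", "puppy"]

# Returns the candidates that score at least the cutoff against the
# word, best match first.
matches = get_close_matches("appel", candidates)
print(matches)  # ['apple', 'ape']
```

The default cutoff of 0.6 corresponds to the rule of thumb given for ratio() above.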