graphid.util.name_rectifier module

graphid.util.name_rectifier.demodata_oldnames(n_incon_groups=10, n_con_groups=2, n_per_con=5, n_per_incon=5, con_sep=4, n_empty_groups=0)

graphid.util.name_rectifier.simple_munkres(part_oldnames)

Defines a munkres problem to solve name rectification.

Notes

We create a matrix where each row represents a group of annotations in the same PCC and each column represents an original name. If there are more PCCs than original names, the columns are padded with extra values. The matrix is first initialized to negative infinity, representing impossible assignments. Then, for each column representing a padded name, we set its value to $1$, indicating that each new name could be assigned to a padded name for some small profit. Finally, let $f_{rc}$ be the number of annotations in row $r$ with an original name of $c$. Each matrix value $(r, c)$ is set to $f_{rc} + 1$ if $f_{rc} > 0$, to represent how much each new name "wants" to be labeled with a particular original name; the extra one ensures that these original names are always preferred over padded names.
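The construction described in these notes can be sketched in plain Python. This is an illustrative sketch, not the module's actual code: `build_profit_matrix` is a hypothetical helper, columns are ordered by first appearance of each original name (an assumption), and padded columns are marked with `None`.

```python
from collections import Counter

NEG_INF = float("-inf")


def build_profit_matrix(part_oldnames):
    """Hypothetical helper: return (columns, matrix) for the profit matrix."""
    # Unique original names, in order of first appearance (assumed ordering).
    old_names = list(dict.fromkeys(n for part in part_oldnames for n in part))
    # Pad with extra columns when there are more groups than original names.
    num_pad = max(len(part_oldnames) - len(old_names), 0)
    columns = old_names + [None] * num_pad
    col_index = {name: cx for cx, name in enumerate(old_names)}

    matrix = []
    for part in part_oldnames:
        # Impossible assignments start at -inf; padded columns get profit 1.
        row = [NEG_INF] * len(old_names) + [1] * num_pad
        # f_rc + 1 for every original name that occurs in this group.
        for name, freq in Counter(part).items():
            row[col_index[name]] = freq + 1
        matrix.append(row)
    return columns, matrix


part_oldnames = [[], ['b'], ['a', 'b', 'c'], ['b', 'c'], ['c', 'e', 'e']]
columns, matrix = build_profit_matrix(part_oldnames)
print(columns)
for row in matrix:
    print(row)
```

For this input the finite entries agree with the profit matrix shown in the examples of this section, which prints impossible assignments as -10 rather than negative infinity.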

Example

>>> part_oldnames = [['a', 'b'], ['b', 'c'], ['c', 'a', 'a']]
>>> new_names = simple_munkres(part_oldnames)
>>> result = ub.urepr(new_names)
>>> print(new_names)
['b', 'c', 'a']

Example

>>> part_oldnames = [[], ['a', 'a'], [],
>>>                  ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b'], ['a']]
>>> new_names = simple_munkres(part_oldnames)
>>> result = ub.urepr(new_names)
>>> print(new_names)
[None, 'a', None, 'b', None]

Example

>>> part_oldnames = [[], ['b'], ['a', 'b', 'c'], ['b', 'c'], ['c', 'e', 'e']]
>>> new_names = find_consistent_labeling(part_oldnames)
>>> result = ub.urepr(new_names)
>>> print(new_names)
['_extra_name0', 'b', 'a', 'c', 'e']

Profit Matrix

      b    a    c    e   _0
0   -10  -10  -10  -10    1
1     2  -10  -10  -10    1
2     2    2    2  -10    1
3     2  -10    2  -10    1
4   -10  -10    2    3    1
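Once a profit matrix has been built as the Notes describe, the remaining step is to choose one distinct column per row so that the total profit is maximal. The sketch below does this by exhaustive search over column permutations, which is only practical for tiny inputs; it is a stand-in for a Munkres (Hungarian) solver, not the routine this module uses. The function name `solve_assignment_bruteforce` is hypothetical.

```python
from itertools import permutations

NEG_INF = float("-inf")


def solve_assignment_bruteforce(matrix):
    """Return (columns, total): the best column per row and the total profit."""
    num_rows = len(matrix)
    num_cols = len(matrix[0]) if matrix else 0
    best_total, best_cols = NEG_INF, None
    # Try every way of giving each row its own distinct column.
    for cols in permutations(range(num_cols), num_rows):
        total = sum(matrix[rx][cx] for rx, cx in enumerate(cols))
        if total > best_total:
            best_total, best_cols = total, cols
    return best_cols, best_total


# Profit matrix for part_oldnames = [['a', 'b'], ['b', 'c'], ['c', 'a', 'a']]
# with columns ordered a, b, c and entries f_rc + 1 (or -inf when f_rc == 0).
names = ['a', 'b', 'c']
matrix = [
    [2, 2, NEG_INF],
    [NEG_INF, 2, 2],
    [3, NEG_INF, 2],
]
cols, total = solve_assignment_bruteforce(matrix)
new_names = [names[cx] for cx in cols]
print(new_names)  # ['b', 'c', 'a']
```

The result matches the first `simple_munkres` example in this section. A polynomial-time assignment solver should be preferred for real inputs, since the search here grows factorially with the number of groups.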

graphid.util.name_rectifier.find_consistent_labeling(grouped_oldnames, extra_prefix='_extra_name', verbose=False)

Solves a maximum bipartite matching problem to find a consistent name assignment that minimizes the number of annotations with different names. For each new grouping of annotations we assign

For each group of annotations we must assign them all the same name, either from

To reduce the running time

Parameters:

grouped_oldnames (list) – A group of old names where the grouping is based on new names. For instance:

Given:

aids = [1, 2, 3, 4, 5]
old_names = [0, 1, 1, 1, 0]
new_names = [0, 0, 1, 1, 0]

The grouping is

[[0, 1, 0], [1, 1]]

This lets us keep the old names in a split case, re-use existing names, and make minimal changes to current annotation names while still being consistent with the new and improved grouping.

The output will be:

[0, 1]

Meaning that all annots in the first group are assigned the name 0 and all annots in the second group are assigned the name 1.
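The grouping in the example above can be reproduced with a few lines of Python. This is a minimal sketch; `group_oldnames` is a hypothetical helper and is not part of this module. It collects the old name of each annotation under that annotation's new name, keeping groups in order of first appearance.

```python
def group_oldnames(old_names, new_names):
    """Hypothetical helper: group old names by the new name of each annotation."""
    groups = {}
    for old, new in zip(old_names, new_names):
        # Dicts preserve insertion order, so groups follow first appearance.
        groups.setdefault(new, []).append(old)
    return list(groups.values())


aids = [1, 2, 3, 4, 5]
old_names = [0, 1, 1, 1, 0]
new_names = [0, 0, 1, 1, 0]

grouped_oldnames = group_oldnames(old_names, new_names)
print(grouped_oldnames)  # [[0, 1, 0], [1, 1]]
```

The resulting list has the shape that `find_consistent_labeling` expects for its `grouped_oldnames` argument.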

References

http://stackoverflow.com/questions/1398822/assignment-problem-numpy

Example

>>> grouped_oldnames = demodata_oldnames(25, 15,  5, n_per_incon=5)
>>> new_names = find_consistent_labeling(grouped_oldnames, verbose=1)
>>> grouped_oldnames = demodata_oldnames(0, 15,  5, n_per_incon=1)
>>> new_names = find_consistent_labeling(grouped_oldnames, verbose=1)
>>> grouped_oldnames = demodata_oldnames(0, 0, 0, n_per_incon=1)
>>> new_names = find_consistent_labeling(grouped_oldnames, verbose=1)

Example

>>> # xdoctest: +REQUIRES(module:timerit)
>>> import timerit
>>> ydata = []
>>> xdata = list(range(10, 150, 50))
>>> for x in xdata:
>>>     print('x = %r' % (x,))
>>>     grouped_oldnames = demodata_oldnames(x, 15,  5, n_per_incon=5)
>>>     t = timerit.Timerit(3, verbose=1)
>>>     for timer in t:
>>>         with timer:
>>>             new_names = find_consistent_labeling(grouped_oldnames)
>>>     ydata.append(t.min())
>>> # xdoc: +REQUIRES(--show)
>>> import plottool_ibeis as pt
>>> pt.qtensure()
>>> pt.multi_plot(xdata, [ydata])
>>> util.show_if_requested()

Example

>>> grouped_oldnames = [['a', 'b', 'c'], ['b', 'c'], ['c', 'e', 'e']]
>>> new_names = find_consistent_labeling(grouped_oldnames, verbose=1)
>>> result = ub.urepr(new_names)
>>> print(new_names)
['a', 'b', 'e']

Example

>>> grouped_oldnames = [['a', 'b'], ['a', 'a', 'b'], ['a']]
>>> new_names = find_consistent_labeling(grouped_oldnames)
>>> result = ub.urepr(new_names)
>>> print(new_names)
['b', 'a', '_extra_name0']

Example

>>> grouped_oldnames = [['a', 'b'], ['e'], ['a', 'a', 'b'], [], ['a'], ['d']]
>>> new_names = find_consistent_labeling(grouped_oldnames)
>>> result = ub.urepr(new_names)
>>> print(new_names)
['b', 'e', 'a', '_extra_name0', '_extra_name1', 'd']

Example

>>> grouped_oldnames = [[], ['a', 'a'], [],
>>>                     ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'b'], ['a']]
>>> new_names = find_consistent_labeling(grouped_oldnames)
>>> result = ub.urepr(new_names)
>>> print(new_names)
['_extra_name0', 'a', '_extra_name1', 'b', '_extra_name2']
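The objective stated for `find_consistent_labeling`, minimizing the number of annotations whose name differs from its old name, can be checked by hand for a small example. The sketch below uses a hypothetical helper, `count_renamed`, which is not part of this module, and compares the labeling from one of the examples above against an alternative labeling.

```python
def count_renamed(grouped_oldnames, new_names):
    """Hypothetical helper: count annotations whose old name differs from the
    name assigned to their group."""
    return sum(
        old != new
        for group, new in zip(grouped_oldnames, new_names)
        for old in group
    )


grouped_oldnames = [['a', 'b'], ['a', 'a', 'b'], ['a']]

# Labeling shown in the example output above.
documented = ['b', 'a', '_extra_name0']
# An alternative labeling that gives 'a' to the smallest group instead.
alternative = ['b', '_extra_name0', 'a']

print(count_renamed(grouped_oldnames, documented))   # 3
print(count_renamed(grouped_oldnames, alternative))  # 4
```

Here the documented labeling renames three annotations, while the alternative renames four.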