Monday, October 27, 2008

Grouping Text Lines in Freeform Handwritten Notes

Summary


This paper addresses the problem of grouping handwritten text lines in freeform digital ink notes.

It uses a cost function to determine how strokes are grouped into lines. The cost has three likelihood terms and two prior terms.

The likelihood of a line is based on the fitted line's direction and the maximum inter-stroke distances along its x and y axes. Configuration consistency rests on a neighbour relation: two strokes are neighbours if the distance between them is below a threshold and there is no stroke in between them. The model complexity of a partitioning is the number of lines it contains.
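To make the neighbour definition concrete, here is a minimal sketch. It is not the paper's implementation: representing each stroke by a single centroid point, the name `are_neighbours`, the `max_dist` parameter, and the "in between" test (a third centroid falling inside the circle whose diameter joins the two strokes) are all my own assumptions.

```python
from math import dist


def are_neighbours(i, j, centroids, max_dist):
    """Return True if strokes i and j are neighbours under the assumed test.

    centroids: list of (x, y) points, one per stroke (an assumed representation).
    max_dist:  hypothetical distance threshold.
    """
    a, b = centroids[i], centroids[j]
    # Condition 1: the two strokes must be closer than the threshold.
    if dist(a, b) >= max_dist:
        return False
    # Condition 2: no other stroke may lie in between.
    for k, c in enumerate(centroids):
        if k in (i, j):
            continue
        # c is strictly inside the circle with diameter ab exactly when
        # the angle a-c-b is obtuse, i.e. (a - c) . (b - c) < 0.
        dot = (a[0] - c[0]) * (b[0] - c[0]) + (a[1] - c[1]) * (b[1] - c[1])
        if dot < 0:
            return False
    return True


# Three strokes on one line of text, plus one stroke far above them.
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 5.0)]
adjacent = are_neighbours(0, 1, centroids, max_dist=1.5)  # close, nothing between
blocked = are_neighbours(0, 2, centroids, max_dist=3.0)   # stroke 1 sits between
far = are_neighbours(0, 3, centroids, max_dist=3.0)       # beyond the threshold
```

The circle test is just one simple way to say "in between"; a real system would more likely work with stroke bounding boxes than with centroids.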

Optimization is done using a gradient-descent method. An initial solution is obtained from 'Temporal Grouping', which is based on the fact that most text lines are composed of temporally adjacent strokes. Then two alternative hypotheses are built: one merges two adjacent lines, and the other corrects high-configuration-energy errors. The global cost change of each hypothesis is then computed to decide which one to apply.
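The idea behind the initial solution can be illustrated with a much simpler stand-in than the paper's procedure. The sketch below assumes strokes arrive in the order they were written, represents each by a centroid, and starts a new line whenever the jump from the previous stroke exceeds a hypothetical `max_gap`. The function name and the threshold rule are my assumptions, not the author's method.

```python
from math import dist


def temporal_grouping(centroids, max_gap):
    """Group time-ordered strokes into lines by splitting at large jumps.

    centroids: list of (x, y) points in writing order (assumed representation).
    max_gap:   hypothetical threshold on the jump between consecutive strokes.
    Returns a list of groups, each a list of stroke indices.
    """
    groups = []
    for idx, point in enumerate(centroids):
        # Extend the current group while the pen stays close to the
        # previous stroke; otherwise start a new group.
        if groups and dist(centroids[idx - 1], point) <= max_gap:
            groups[-1].append(idx)
        else:
            groups.append([idx])
    return groups


# Three strokes written left to right, then two more on a second line.
strokes = [(0, 0), (1, 0), (2, 0), (0, 5), (1, 5)]
initial = temporal_grouping(strokes, max_gap=1.5)
# initial == [[0, 1, 2], [3, 4]]
```

A grouping like this only serves as a starting point; the optimization step is what repairs cases where the writer went back to add strokes to an earlier line.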

The author evaluates the algorithm with a recall metric, defined as the number of correct groupings divided by the number of labeled groupings. For perfectly labeled diagrams the recall was 0.93, and for crude diagrams it was 0.87.
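As a small worked example of the metric, the sketch below computes recall from two lists of groups. It assumes a predicted group counts as correct only if it contains exactly the same strokes as a labeled group; the summary does not say how a correct grouping is judged, so that matching rule and the name `grouping_recall` are assumptions.

```python
def grouping_recall(predicted, labeled):
    """Recall = correct groupings / labeled groupings.

    predicted, labeled: lists of groups, each a collection of stroke ids.
    A labeled group is counted as correct only when some predicted group
    has exactly the same members (an assumed matching rule).
    """
    predicted_sets = {frozenset(group) for group in predicted}
    labeled_sets = [frozenset(group) for group in labeled]
    if not labeled_sets:
        raise ValueError("labeled must contain at least one group")
    correct = sum(1 for group in labeled_sets if group in predicted_sets)
    return correct / len(labeled_sets)


# Three labeled lines; only the first one is recovered exactly.
labeled = [[0, 1, 2], [3, 4], [5, 6]]
predicted = [[0, 1, 2], [3, 4, 5], [6]]
score = grouping_recall(predicted, labeled)  # one of three groups matches
```

Exact matching is strict: a line that is off by a single stroke counts as a miss, which is worth keeping in mind when reading the reported numbers.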

Discussion

This paper discusses some nice features for grouping text, which could also be used as context features for distinguishing shape vs. text.
