Sunday, August 31, 2008

Specifying Gestures by Example

Comments

Andrew's blog

Summary


A 'gesture', as the author uses the term, is basically a stroke made with a stylus or pen on a computer screen.

In this paper the author starts by mentioning some of the earlier gesture-based recognition systems developed by various authors and teams. The problem with all of these systems is that their gesture recognizers are 'hand coded'; this code is complicated and not easy to maintain. The author then presents GRANDMA, a toolkit which shows how the need for 'hand coding' can be eliminated by automatically creating gesture recognizers from example gestures.

The author discusses the design of GDP, a gesture-based drawing program built using GRANDMA. GRANDMA uses an MVC-like system in which an input event handler may be associated with a view class. It allows the user to define a set of gestures along with examples of each; the examples are important because they capture the variance in the gesture. The user can then also define the 'semantics' of a gesture through GRANDMA's interface. The semantics window exposes three main pieces of functionality: 1) 'recog', an expression the user defines that is evaluated once a gesture is recognized; 2) 'manip', an expression evaluated at each subsequent mouse point; and 3) 'done', an expression evaluated when the mouse button is released.
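As a rough illustration of how these three semantic slots might map onto code, here is a minimal Python sketch; the class and callback names (GestureHandler, handle, etc.) are my own invention and not GRANDMA's actual API.

```python
# Hypothetical sketch of a gesture event handler with the three semantic
# slots described above; names are illustrative, not GRANDMA's real API.

class GestureHandler:
    def __init__(self, recog, manip, done):
        self.recog = recog   # evaluated once, when the gesture is classified
        self.manip = manip   # evaluated at every subsequent mouse point
        self.done = done     # evaluated when the mouse button is released

    def handle(self, gesture_class, mouse_points):
        self.recog(gesture_class)      # e.g. create a rectangle view
        for point in mouse_points:
            self.manip(point)          # e.g. drag a corner to resize it
        self.done()                    # e.g. commit the final shape


# Example wiring: a 'rectangle' gesture that creates and resizes a shape.
handler = GestureHandler(
    recog=lambda cls: print(f"recognized gesture: {cls}"),
    manip=lambda p: print(f"resize to track point {p}"),
    done=lambda: print("finished"),
)
handler.handle("rectangle", [(10, 10), (20, 25)])
```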

Gesture recognition is done in two steps: 1) statistical features are extracted from the gesture, and 2) the extracted features are classified into one of the previously defined gesture classes. Thirteen features were chosen as important and able to distinguish between different gestures. These features are the sine and cosine of the initial angle, the length and angle of the bounding box diagonal, the distance between the first and last point, the sine and cosine of the angle between the first and last point, the total length of the gesture, the total angle traversed, the sum of the absolute values of the angle at each point, the sum of the squared values of those angles, the maximum speed of the gesture, and the duration of the gesture. One setback of this feature extraction is that there are some cases where the features may be the same for totally different gestures.
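To make these features concrete, here is a small sketch that computes a handful of them (initial-angle cosine/sine, bounding box diagonal length, first-to-last distance, and total path length) from a list of (x, y) points. It is a simplified illustration of the idea, not the paper's exact formulation.

```python
import math

def some_stroke_features(points):
    """Compute a few of the thirteen features for a stroke given as (x, y) points.
    Simplified illustration only, not the paper's exact definitions."""
    (x0, y0), (x2, y2) = points[0], points[2]
    d0 = math.hypot(x2 - x0, y2 - y0) or 1.0
    cos_init = (x2 - x0) / d0                  # cosine of the initial angle
    sin_init = (y2 - y0) / d0                  # sine of the initial angle

    xs, ys = zip(*points)
    bbox_diag = math.hypot(max(xs) - min(xs),  # length of bounding box diagonal
                           max(ys) - min(ys))

    xn, yn = points[-1]
    end_dist = math.hypot(xn - x0, yn - y0)    # distance between first and last point

    total_len = sum(math.hypot(points[i + 1][0] - points[i][0],
                               points[i + 1][1] - points[i][1])
                    for i in range(len(points) - 1))  # total gesture length

    return [cos_init, sin_init, bbox_diag, end_dist, total_len]

print(some_stroke_features([(0, 0), (1, 0), (2, 1), (3, 3), (3, 5)]))
```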

Gesture classification is simple. Every feature has an assigned weight, and a weighted sum of the feature values is computed for each gesture class; the class with the highest resulting value is the one the input gesture falls into.
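A minimal sketch of this kind of linear evaluation, assuming the per-class weights have already been trained: each class gets a score w0 + sum(w_i * f_i), and the highest score wins.

```python
def classify(features, classes):
    """Pick the class whose linear evaluation of the features is largest.

    `classes` maps a class name to (w0, weights); the score for a class is
    w0 + sum(w_i * f_i).  Weights are assumed to be already trained; this
    only illustrates the evaluation step.
    """
    def score(w0, weights):
        return w0 + sum(w * f for w, f in zip(weights, features))

    return max(classes, key=lambda name: score(*classes[name]))


# Toy example with made-up weights for two gesture classes.
classes = {
    "circle":    (0.1, [0.5, -0.2, 1.3]),
    "rectangle": (-0.4, [0.9, 0.7, -0.1]),
}
print(classify([1.0, 2.0, 0.5], classes))
```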

The author also discusses training. As mentioned earlier, the user is required to input a set of example gestures for every gesture class. For each of these examples the features are extracted, and the average of the feature values is taken as the base value for that class (the paper then derives the classifier weights from these averages). The author also discusses methods by which the system can reject ambiguous gestures and outliers.
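A rough sketch of just the averaging step described above: collect the feature vectors of each class's examples and take the per-class mean. Deriving the actual classifier weights from these means is omitted here.

```python
def class_means(examples_by_class):
    """examples_by_class maps a gesture class name to a list of feature
    vectors, one per training example.  Returns the mean feature vector
    per class (only the averaging step, not the weight derivation)."""
    means = {}
    for name, examples in examples_by_class.items():
        n = len(examples)
        means[name] = [sum(col) / n for col in zip(*examples)]
    return means


print(class_means({
    "circle": [[1.0, 2.0], [1.2, 1.8]],
    "delete": [[4.0, 0.5], [3.8, 0.7]],
}))
```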

The algorithm explained by the author is fairly simple to understand, and classifiers trained with it produce good results. As the number of classes increases, the accuracy rate falls slightly, and the number of examples per class needs to be increased to achieve better results.

Discussion

The author presents an algorithm which addresses the issues with previous gesture recognition tools. Earlier tools were 'hand coded', so they were complicated and hard to maintain. GRANDMA, on the other hand, can actually be trained to recognize any gesture. The algorithm it uses is conceptually similar to fingerprint recognition algorithms, where features are first extracted from the fingerprint and then classified to match a fingerprint from the database.

There are two problems with this system.
1) Feature values can be similar for two different gestures.
2) Matching accuracy falls as the number of classes increases. (The author gives a graph of accuracy against number of examples for up to 30 classes, but I would be more interested in a graph of accuracy against number of classes for up to, say, 100 classes.)

Thursday, August 28, 2008

Introduction to Sketch Recognition

Comments

Akshay's blog

Summary


This paper discusses the history of the field of sketch recognition. One of the early breakthroughs in this field was the birth of pen-based computer interaction in 1963 with Ivan Sutherland's Sketchpad at MIT. Sketchpad was a very basic system which allowed the user to draw diagrams on a computer screen with the help of a pen. It was based on vector graphics, which was later supplanted by raster graphics because of its many advantages for digital displays.

With the advent of tablet PCs, pen-based interaction with the computer was taken to the next level. There are various manufacturers of tablet PCs, and tablet systems come in several types. Slate tablets are interacted with only through the screen. Convertibles are basically notebook computers whose screen can twist and fold over the keyboard so that the machine looks like a slate. Then there are USB-connected pen tablets which can be plugged into a system as a pen-based input device.

The input technology in tablet PCs is of two types: passive digitizers, which use only touch, and active digitizers, which detect electromagnetic signals from a special pen. Active digitizers are better in terms of precision. They also provide 'hover' functionality, just as with a mouse, where the pointer can hover over objects without actually triggering a click event.

Some software features are also unique to tablet PCs. For example, Microsoft's Tablet PC support allows the user to write with a pen on the computer screen, and the software recognizes the handwriting and translates it to text. It also provides an on-screen keyboard for users who are familiar with the QWERTY keyboard. Linux can also be installed on tablet PCs, and there is open source software available, such as 'Jarnal', for taking notes and sketching. There is also specialized software such as 'Camtasia' which can record user activity on the screen.

The tablet PC's main application is in the field of presentation and teaching. Tablet PCs can provide a more intuitive interface when preparing PowerPoint slides for lectures. They can also be used for delivering lecture material 'on the fly' or for interacting with static materials prepared in advance. Editing on a tablet PC is also fairly simple and easy compared to paper.

There can also be some disadvantages to tablet-based presentations. The presenter may not be accustomed to the tools available on the tablet, such as highlighters, or he/she may not be comfortable with the tablet itself, which can reduce the effectiveness of the presentation.

The applications of a tablet PC combined with specialized sketch recognition software can be tremendous. These techniques are already being used in some fields, such as music composition, where a user can draw the notes and the system plays the music back for the composer, and chemistry, where 'ChemPad' can be used to draw molecular structures. Sketch recognition is also being used for drawing mechanical diagrams, finite state machines, UML diagrams, and military course of action symbols.

For sketch recognition the author presents the FLUID framework, which can be used to build a sketch recognition system for any field. In this framework the instructor can draw a sample diagram and then write domain-specific information in LADDER (a domain description language). GUILD is fed these two pieces of information and can then generate a sketch recognition system with the ability to recognize sketches in that particular domain.

At the end the paper discusses two case studies. One is of a high school teacher who uses tablet PCs and a lot of sketching software to present his lectures. The other teacher uses a tablet PC for presentation but also uses polling devices for students to interact. Both teachers preferred tablet PCs because their students were more attentive in class and excited about the course.

Discussion

Ah! That was a difficult read; I think some pages were missing there.

This paper basically gives a background history of sketch recognition and then summarizes some latest milestones achieved.

The good thing about the FLUID framework is its generic nature. The ability to describe a domain for any sketch recognition system gives it a lot of power. But will its generic nature become its shortcoming when, someday, specialized frameworks come into the picture?

Wednesday, August 27, 2008

Sketchpad

Comments

Akshay's blog

Summary


Sketchpad was one of the major breakthroughs in the field of sketch recognition. Ivan E. Sutherland, a consultant at MIT's Lincoln Laboratory, is the creator of Sketchpad. In simple terms, Sketchpad is a system which enables a person to draw digital diagrams using a light pen; the resulting diagram can then be easily manipulated on the computer screen.

Considering that Sketchpad was built in 1963 with the limited resources of the time, even a simple task involved a lot of complexity. At its conception Sketchpad only supported a set of predefined shapes, for example lines, circles, and points, to be drawn on the screen, but it was designed in a way that could accommodate more shapes such as ellipses. Sketchpad also provided functionality for joining points, adding constraints, copying, merging, deleting, rotating, and magnifying diagrams. Its usefulness at the time was expected to be in topological input (such as circuit diagrams) and highly repetitive drawings.

Sketchpad also has the capability of displaying text and numbers; in Sketchpad the letters and numbers are more or less combinations of curves and lines. Sketchpad uses several recursive functions to manipulate diagrams: 1) expansion of instances, which means it is possible to have sub-pictures within sub-pictures; 2) recursive deletion, which means deleting an object also deletes all dependent objects; and 3) recursive merging, which means merging two independent objects results in all of their dependent objects becoming dependent upon the result of the merger. Sketchpad also uses recursive display to draw diagrams on the screen: for every instance to be drawn, it breaks the picture into the smaller parts that were drawn earlier, and so on, until the picture cannot be broken down into any smaller instances.
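As a rough illustration of the recursive deletion idea (deleting an object also deletes everything that depends on it), here is a small hypothetical sketch; the dependency structure is my own invention, not Sketchpad's actual representation.

```python
# Hypothetical dependency structure: each object maps to the objects that
# depend on it (e.g. a line depends on its two endpoints).
dependents = {
    "point_A": ["line_1"],
    "point_B": ["line_1"],
    "line_1":  ["label_1"],   # a text label attached to the line
    "label_1": [],
}

def recursive_delete(obj, dependents, deleted=None):
    """Delete obj and, recursively, every object that depends on it."""
    deleted = deleted if deleted is not None else set()
    if obj in deleted:
        return deleted
    deleted.add(obj)
    for child in dependents.get(obj, []):
        recursive_delete(child, dependents, deleted)
    return deleted

print(recursive_delete("point_A", dependents))   # {'point_A', 'line_1', 'label_1'}
```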

Sketchpad also lets the user define 'attachers'. Since small diagrams can be used to build a larger diagram, the smaller parts need to be connected to other parts, so the user must define attachers on the smaller parts so the pieces can be joined. Apart from the light pen, Sketchpad also uses a set of buttons which tell it when a copy operation is to be performed, when a delete operation is to be performed, and so forth. What makes Sketchpad different from the paper-and-pencil concept is that the user can define design constraints. For example, the user can define a constraint that two particular lines must be parallel; if the user had not drawn the two lines parallel to each other, the computer will adjust the diagram so that they become parallel. There is a set of constraints compiled in the form of a manual for the user, which makes it easier for the user to draw as he/she wishes by utilizing these constraints.
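A tiny sketch of the parallel-lines constraint described above, under the simplifying assumption that the second line is rotated about its midpoint until its direction matches the first; Sketchpad's real constraint satisfaction was more general, relaxing many simultaneous constraints at once.

```python
import math

def make_parallel(line_a, line_b):
    """Rotate line_b about its midpoint so it becomes parallel to line_a.
    Lines are ((x1, y1), (x2, y2)).  Simplified illustration only."""
    (ax1, ay1), (ax2, ay2) = line_a
    (bx1, by1), (bx2, by2) = line_b
    angle = math.atan2(ay2 - ay1, ax2 - ax1)       # direction of line_a
    half = math.hypot(bx2 - bx1, by2 - by1) / 2.0  # half-length of line_b
    mx, my = (bx1 + bx2) / 2.0, (by1 + by2) / 2.0  # midpoint of line_b
    dx, dy = half * math.cos(angle), half * math.sin(angle)
    return ((mx - dx, my - dy), (mx + dx, my + dy))

# The second line keeps its length and midpoint but ends up parallel to the first.
print(make_parallel(((0, 0), (4, 0)), ((1, 1), (2, 3))))
```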

Discussion

Although it has been more than four decades since Sketchpad was developed, it still proposes some very futuristic concepts. Sketchpad was developed as the first pen-based input system for computers, but its use and progress were hampered by the invention of the computer mouse around the same time. Since the mouse was cheaper to build and pen-based devices were very expensive, Sketchpad didn't get much attention from researchers.

The fault that I see in this system, though it can be justified, is that it relies a lot on user input. Possessing the system was very expensive at that time, and it seems that it was equally difficult to use.

If I had invented Sketchpad, I would definitely have gone on to the next level to make the device more user friendly and accessible to ordinary people for activities such as teaching, planning, and simulation.

Tuesday, August 26, 2008

Questionnaire. About me.


nabeel dot shahzad at yahoo dot com
1st year masters student

Why are you taking this class?
I was always intimidated by such fields in computer science. After some time I realized that if a person spends adequate time, things are not as complicated as they may seem. Basically this is a very fascinating field with plenty of uncharted areas to explore.

What experience do you bring to this class?
In terms of technical experience, I don't have any prior experience in this field. I think I have good problem-solving ability, which can be helpful when dealing with hard problems in sketch recognition, and I think I can make a difference.

What do you expect to be doing in 10 years?
Probably running my own software company, which would be doing research into producing more intuitive and easy-to-use software.

What do you think will be the next biggest technological advancement in computer science?
A computer program that could simulate a human brain.

What was your favorite course in undergrad (CS or otherwise)?
1) Data Structures
2) Algorithms
3) Digital logic design

If you could be another animal, what would it be and why?
Panda: Because they are becoming extinct :).

What is your favorite motto or slogan?
'That's the beauty of argument, if you argue correctly, you are never wrong'

What is your favorite movie?
'Scent of a woman'

Name some interesting fact about yourself
I bowl, throw and play snooker with my left hand and play table tennis, badminton and volleyball with the right hand. I dribble basketball with the left hand and shoot with the right hand.