Thursday, April 30, 2009

The VoiceBot: A Voice Controlled Robot Arm

Brandi House, Jonathan Malkin, Jeff Bilmes
Department of Electrical Engineering, University of Washington

Commented on:
Patrick Clay
Richard Russell
Sarah Gray

Individuals with motor impairments such as paraplegia and spinal cord injuries may have difficulty with daily activities. Assistive technology exists to facilitate their actions and increase their abilities. One problem facing this technology is its limited set of control options. One solution that has been developed is spoken language control using automatic speech recognition (ASR). However, it does not provide adequately for tasks that require smooth, continuous control.

Using the Vocal Joystick (VJ), the authors developed a system that uses non-speech voice recognition to control a robot arm. Two experiments were carried out. The first experiment was to manipulate a robot arm in 2D, while the second was to manipulate it in 3D.
Three control models were used for the experiments:

- Forward Kinematic (FK) Model
- Inverse-Kinematic (IK) Model
- Hybrid Model

The forward kinematic model requires the user to control each joint angle explicitly. Its advantage is that it is light on computation, but it requires more effort from the user. In the inverse kinematic model, the user positions the end effector at the desired location and the system computes the joint angles. This model requires more computation but less work for the user. The hybrid model is a combination of FK and IK: the first two joints are controlled using IK and the last one is controlled directly.
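
To make the difference between the two models concrete, here is a minimal sketch for a two-link planar arm. It is a textbook simplification and not the authors' implementation: the arm in the paper has more joints, and the function names and the link lengths `l1` and `l2` are assumptions chosen for illustration. In FK the user supplies the joint angles and the code reports where the end effector lands; in IK the user supplies the target position and the code solves for the angles.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Return the (x, y) end-effector position for the given joint angles (radians)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """Return joint angles (theta1, theta2) that place the end effector at (x, y).

    Uses the law of cosines and picks the solution with theta2 in [0, pi].
    Raises ValueError if the target is out of reach.
    """
    d2 = x * x + y * y
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0 + 1e-9:
        raise ValueError("target is out of reach")
    cos_t2 = max(-1.0, min(1.0, cos_t2))  # guard against rounding error
    theta2 = math.acos(cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2
```

The extra trigonometry in `inverse_kinematics` is the added computation the summary mentions. In exchange, the user only has to think about where the hand should go, not about individual joint angles.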


Experiment 1: Simulated Arm User Study
Objective: To test feasibility and to support the authors' hypothesis that the hybrid model would be the best choice.


Five users were invited to test the system. The test had the users control a robot arm in 2D to move a ball along the ground to four locations. Users were allowed to practice for as long as they wanted before the timed trial was conducted. Four of the users had previous experience with VJ.

The results:

Users had the most difficulty controlling the pitch. Overall, users generally preferred the inverse kinematic model, as it was the fastest in the trial for 3 users out of 5. The forward kinematic model was second, and the least favorite was the hybrid model. These results indicate that users would prefer not to have to explicitly control each joint angle. With more practice, though, users may be able to adapt to the forward kinematic model.

Experiment 2: Robotic Arm Control: The VoiceBot
The VoiceBot is a hobbyist robot that they converted to be controlled by the Vocal Joystick. The VoiceBot is controlled using two modes: position mode and orientation mode. The user can switch between these modes by using a “ck” sound.
In position mode, the arm moves according to one of three algorithms:

- Forward Kinematics (FK)
- Inverse Kinematics - Cartesian
- Inverse Kinematics - Cylindrical
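
The two IK variants differ in the coordinate system used for the target position. The sketch below illustrates that idea under assumptions of my own: the function names, the step parameters, and the choice to store the target as Cartesian `(x, y, z)` are hypothetical, and the summary does not say how the paper maps sounds to axes. A Cartesian step moves the target along straight x, y, z axes, while a cylindrical step changes the reach `r`, the rotation `theta` around the base, and the height `z`.

```python
import math


def cartesian_to_cylindrical(x, y, z):
    """Convert (x, y, z) to (r, theta, z), with theta in radians."""
    return math.hypot(x, y), math.atan2(y, x), z


def cylindrical_to_cartesian(r, theta, z):
    """Convert (r, theta, z) back to (x, y, z)."""
    return r * math.cos(theta), r * math.sin(theta), z


def step_cartesian(pos, dx=0.0, dy=0.0, dz=0.0):
    """Move the target along the straight x, y, z axes."""
    x, y, z = pos
    return x + dx, y + dy, z + dz


def step_cylindrical(pos, dr=0.0, dtheta=0.0, dz=0.0):
    """Move the target by changing reach, base rotation, and height."""
    r, theta, z = cartesian_to_cylindrical(*pos)
    return cylindrical_to_cartesian(r + dr, theta + dtheta, z + dz)
```

With `step_cylindrical`, a change in `dtheta` alone swings the target in an arc around the base at constant reach, which a Cartesian step can only approximate with combined x and y moves.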


In orientation mode, finer controls are allowed to manipulate the gripper.
In this experiment, 12 users were invited, and their task was to use the robot arm to place 2 candies onto a target spot. Each user was given two practice sessions and two timed trials, each using a different control method to complete the task. A short interview was conducted at the end.


All users were able to complete the task with both control methods. 75% of the users preferred the inverse kinematic controls over forward kinematics. Many users felt that IK was more intuitive, although it felt slower than FK and produced jerkier movements. One problem the users described was the discrete sound detection, which frustrated a few users because of the large number of false detections. Pitch control was also a significant challenge for many users. Future work could refine the control methods and improve the sound detection.


The experiments demonstrated the feasibility of this technology, and future studies could benefit from better control methods and better equipment. The results presented are the first instance of a non-verbal voice-controlled robotic arm. Further research in this area could yield a more capable system that can help individuals with motor impairments.

Tuesday, April 21, 2009

GUI - Phooey!: The Case for Text Input

By: Max Van Kleek, Michael Bernstein, David R. Karger, MIT CSAIL.
mc schraefel, Electronics and Computer Science, University of Southampton

Commented on:
Patrick Surber
Josh Meyers
Adam Griffin

Conventional methods for information entry and retrieval do not provide the perfect package of easy data entry and efficient data retrieval.
Two methods:

Free text entry - easy entry, hard retrieval
Context capture - hard entry, easier retrieval

Solution: free text entry + context capture = Jourknow
Jourknow combines free-form text entry with context capture techniques to give users a fast and easy interface for input and output.
Jourknow takes free text input, breaks the data down into structures and entities, and associates them with tags. Jourknow parses information and recognizes certain subjects such as meetings, dates and locations. All information entered into Jourknow is called "codex".
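
As a rough illustration of pulling structure out of a free-text note, here is a minimal regex-based sketch. It is not Jourknow's parser and does not use its pidgin syntax: the `#tag` and `@place` conventions, the date and time patterns, and the `parse_note` function are all hypothetical choices made for this example.

```python
import re

# Hypothetical patterns for illustration only.
DATE = re.compile(r"\b\d{1,2}/\d{1,2}(?:/\d{2,4})?\b")            # e.g. 4/21 or 4/21/2009
TIME = re.compile(r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\b", re.I)    # e.g. 3pm or 10:30 am
TAG = re.compile(r"#(\w+)")                                       # e.g. #project
PLACE = re.compile(r"@(\w+)")                                     # e.g. @stata


def parse_note(text):
    """Return the note text together with any entities recognized in it."""
    return {
        "text": text,
        "dates": DATE.findall(text),
        "times": TIME.findall(text),
        "tags": TAG.findall(text),
        "places": PLACE.findall(text),
    }


note = parse_note("meeting with Dave 4/21 3pm @stata #project")
print(note["dates"], note["times"], note["places"], note["tags"])
# prints: ['4/21'] ['3pm'] ['stata'] ['project']
```

The original text is kept alongside the extracted fields, so a note stays readable as written while the structured fields remain available for filtering and retrieval.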


Jourknow uses a simplified language called pidgin, which allows users to express things more naturally. It also uses a syntax based on Notation3, which lets users make statements to "express arbitrary structural properties and relationship among entities...".
Jourknow provides feedback on how expressions are interpreted so users know whether they need to explicitly alter any associations or correct any parses. The program also contains filtering features that allow users to efficiently find data that they previously wrote. Jourknow associates contexts with the notes that users have written. Contexts include pictures, videos and information describing the situation when the note was written. These contexts help the user recall the time when they first recorded the data. The contexts are organized chronologically, and Jourknow also breaks time down into segments such as minutes, hours, days, or morning, afternoon and night.

Initial informal tests included 5 users, and the general consensus was positive. Users commented on the text-input interface. Opinions about the tagging functionality were split, and some users wanted to be able to associate non-textual information items with notes.

Information management benefits from a rich GUI, and a good input interface is wasted if the trouble of retrieving the information outweighs its benefit. The goal of Jourknow is to provide users with a GUI that facilitates information entry and retrieval. The paper describes a design that minimizes the effort needed for text entry, along with the authors' implementation of a system that meets their criteria.

Sunday, April 19, 2009

The Inmates are Running the Asylum

Commented on:
John Zachery
Josh Meyers
Brian Salato

In Alan Cooper's "The Inmates are Running the Asylum", he talks about how the software engineers, the "inmates", are actually in control of how software is designed and developed. While I was not totally aware of this situation, I do agree with him after reading his book. The ones who actually write the code inevitably have a great deal of leverage and say in how the program will turn out, and any other roles merely act as support and guidance to them. Cooper emphasizes the difficulties users face when using today’s software, which is ill-designed and therefore suffers from similarly sub-par interfaces. From cameras to planes, Cooper presents examples of how things get complicated when a computer is involved in the equation. He insists that when something has a computer as part of its structure, it will ultimately act like a computer and have similar weaknesses. Cooper repeatedly states the importance of the role of a design engineer. A design engineer's main goal is to create a good interface that the user can enjoy and that improves their overall experience. A product’s development life cycle should include professional design engineers to lay out a detailed plan before any code is written. I very much agree with this idea. It's true that writing programs is expensive, and having to rewrite something only adds unnecessary cost. His analogy to movie making seemed to make sense. Even though each stage of development requires a different group of people, there doesn't necessarily have to be waste, as the design team can begin working on the next project while the programmers get to work on their just-finished design. He argues that it is not the best idea to have the software engineers do the design work, as that presents a conflict of interest. Good design may not be particularly hard to come up with, but it’ll involve more work on the coders’ part and they may not be willing to put in that extra effort.
Towards the second half of the book, I felt like it became a why-you-should-hire-design-engineers guide for employers. Perhaps future software would benefit if all managers and employers at software companies were to read his book. Overall, I enjoyed reading this book as it had some pretty interesting points, but I thought he tried a little too hard to push the design engineer’s importance and role onto his audience.