ABSTRACT
We describe a unique form of hands-free interaction that
can be implemented on most commodity computing platforms.
Our approach supports blowing at a laptop or computer
screen to directly control certain interactive applications.
Localization estimates are produced in real time to
determine where on the screen the person is blowing. Our
approach relies solely on a single microphone, such as
those already embedded in a standard laptop or one placed
near a computer monitor, which makes our approach very
cost-effective and easy to deploy. We show example interaction
techniques that leverage this approach.
ACM Classification: H5.2 [Information interfaces and
presentation]: User Interfaces. - Graphical user interfaces.
General terms: Design, Human Factors
Keywords: Interfaces, interaction techniques, hands-free
interaction, blowable user interfaces, localization
INTRODUCTION AND MOTIVATION
Hands-free input techniques provide quick, secondary input
options, especially when someone's hands are preoccupied
with another task. Hands-free control also
offers individuals with limited arm control the ability to
interact with a user interface. Various strategies to provide
hands-free interaction have emerged in our research community.
Typical approaches use sound or voice-based interfaces,
which focus on the verbal parts of human speech.
While this is reasonable for complicated or command-based
tasks, it is not well suited for direct, low-level controls
such as scrolling, button pressing, or selection. Other
approaches use non-speech audio for continuous, low-level
control, such as humming or whistling [2, 3, 6]. However,
verbal and non-verbal sounds still do not necessarily have
an intuitive spatial mapping for direct selection tasks, and
the stigma associated with producing loud sounds in public
places can reduce the adoption of these technologies. Other
interfaces use head or gaze tracking to infer one's intent,
but these require additional, sometimes costly, instrumentation
and may be hard to control [8].
We describe a unique form of hands-free interaction, called
BLUI (Blowable and Localized User Interaction), that can
be installed on most commodity computing platforms.
BLUI supports blowing at a laptop or computer screen to
directly control specific parts of an interactive application,
such as blowing at a button to activate it. Physically blowing
at a laptop or computer screen creates generic UI
events at specific places on the screen. These directed
events, similar to mouse events, can then directly control
certain interactive parts of an application (see Figures 1 and
2). Localization estimates are produced in real time. The
novelty of our approach is that it relies solely on a single
microphone, such as the one embedded in many laptops, which
makes it very cost-effective and easy to deploy. In addition,
users can be discreet when blowing, because our approach
relies not on the sound but on the wind generated
when blowing.
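To make the idea of directed, mouse-like events concrete, the following is a minimal sketch of how a classified screen region could be turned into an event at a screen position and matched against widgets. It assumes the screen is divided into a uniform grid with regions numbered in row-major order. The names BlowEvent, region_to_event, and hit_test are hypothetical and introduced only for illustration; they are not part of the BLUI implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlowEvent:
    """Hypothetical UI event emitted when a blow is localized to a region."""
    region: int  # grid region index, row-major, starting at 0 (assumption)
    x: int       # screen x of the region center, in pixels
    y: int       # screen y of the region center, in pixels


def region_to_event(region: int, rows: int, cols: int,
                    screen_w: int, screen_h: int) -> BlowEvent:
    """Map a classified region index to an event at that region's center."""
    if not 0 <= region < rows * cols:
        raise ValueError(f"region {region} is outside a {rows}x{cols} grid")
    row, col = divmod(region, cols)
    cell_w = screen_w / cols
    cell_h = screen_h / rows
    return BlowEvent(region, int((col + 0.5) * cell_w), int((row + 0.5) * cell_h))


def hit_test(event: BlowEvent, widgets: dict) -> list:
    """Return names of widgets whose (left, top, right, bottom) box contains the event."""
    return [name for name, (left, top, right, bottom) in widgets.items()
            if left <= event.x < right and top <= event.y < bottom]
```

For example, on a 3 X 3 grid, region_to_event(4, 3, 3, 1200, 900) places the event at the center of the middle region, and hit_test can then report which button, if any, lies under that point.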
Our input method has implications for both hands-free assistive
technology applications and entertainment applications
that want to leverage the physical blowing metaphor.
In this paper, we introduce interactive techniques that leverage
our approach. We also discuss the implementation of
this system and a preliminary performance evaluation that
characterizes accuracy and precision of the localization.
PERFORMANCE
We conducted a preliminary evaluation of the accuracy of
the BLUI localizer with three different individuals. For
each person, our setup consisted of a training/calibration
period followed by a set of 25-50 blows (depending on the
resolution) toward various regions on the screen. We ran
this test twice. To accurately determine
the ground truth and maintain consistency, individuals
wore a head-mounted laser pointer that visually indicated
where on the display they were aiming. In practice, this
is not necessary as long as the person using the interface
is the one who trained it.
Table 1: Performance of the BLUI localizer for various
resolutions (% of correctly identified regions).
Regions    9 (3 X 3)    16 (4 X 4)    25 (5 X 5)    36 (6 X 6)
Laptop     100%         96%           80%           62%
Desktop    100%         92%           82%           66%
We report the overall number of correctly classified regions
at varying resolutions for a laptop and desktop (see
Table 1). The regions were of the same size and uniformly
distributed in a grid pattern across the screen. We found
that our localization approach is very accurate for up to 16
regions and shows promising results for 25 regions. The
confusion matrix reveals that most of the misclassifications
(84%) fell in regions adjacent to the target. This is partly
because the feature set is related to the spatial arrangement
of the regions. Accuracy at higher resolutions also drops
because blown air is un-collimated and spreads in a cone,
so it can reflect off multiple regions. Moving closer to the screen
can correct this issue.
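The two quantities discussed above, the share of correctly identified regions and the share of errors that land on an adjacent region, can be computed from (target, predicted) pairs as in the sketch below. It assumes row-major region numbering and treats the eight surrounding cells as adjacent; the paper does not state which adjacency definition was used, so that choice, like the function names, is an assumption made for illustration.

```python
def grid_neighbors(a: int, b: int, cols: int) -> bool:
    """True if regions a and b are distinct and touch (8-neighborhood, an assumption)."""
    row_a, col_a = divmod(a, cols)
    row_b, col_b = divmod(b, cols)
    return a != b and abs(row_a - row_b) <= 1 and abs(col_a - col_b) <= 1


def summarize(trials, cols: int):
    """Summarize (target_region, predicted_region) pairs from one test run.

    Returns (accuracy, share of errors that landed on an adjacent region).
    """
    trials = list(trials)
    if not trials:
        raise ValueError("no trials to summarize")
    errors = [(t, p) for t, p in trials if t != p]
    accuracy = (len(trials) - len(errors)) / len(trials)
    if not errors:
        return accuracy, 0.0
    adjacent = sum(grid_neighbors(t, p, cols) for t, p in errors)
    return accuracy, adjacent / len(errors)
```

Passing the recorded blows for one resolution, with cols set to the grid width, yields one cell of Table 1 together with the corresponding adjacency share.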
IMPROVEMENTS AND FUTURE WORK
Though we saw promising results with our user interface,
there are some important considerations to improve upon in
our current design. Although we did not apply background
noise filtering, this would be necessary for outdoor and
noisy environments, unless a practical sound baffle could
be produced that insulates the microphone from ambient
noise. We can avoid false positive responses by employing
a sophisticated audio filtering scheme that differentiates
between the learned broadband wind and other noises.
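One simple feature such a filter might use is spectral flatness, which is close to zero for tonal sounds and much higher for noise-like, broadband signals. The sketch below is not the scheme we have in mind for BLUI and is not a learned model; it is a stand-in heuristic that assumes the wind signal really is broadband. The function names and the threshold value are hypothetical, and a naive DFT is used only to keep the example free of external libraries.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of a real frame (bins 1 .. N/2 - 1, DC excluded)."""
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum (0 to 1)."""
    power = [p + eps for p in power_spectrum(frame)]  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def looks_like_wind(frame, threshold=0.2):
    """Hypothetical gate: accept only frames with a flat, broadband spectrum.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return spectral_flatness(frame) >= threshold
```

A pure tone, such as a whistle, concentrates its energy in a few bins and is rejected by this gate, while a noise-like frame passes. In practice the threshold would have to be tuned on recorded blows and ambient noise.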
We used a real-time classification approach to identify discrete
regions. An analytical approach that directly models
the transfer functions can provide a more continuous input
analysis and higher resolution. We also presented some
initial performance data of our localization scheme, but an
important next step is to conduct empirical user studies to
answer interaction questions, such as the selection times for
various blow-based interfaces.
CONCLUSION
We presented a system, called BLUI, that enables blowing
at a laptop or computer screen to directly control interactive
applications. BLUI produces coarse-grained localization
estimates in real time to determine where on the screen
the person is blowing. Because our approach does not
require any additional hardware or instrumentation, it is
also cost-effective. Our results show that we can localize up to
16 regions with 96% accuracy on a laptop and 92% on a desktop.
