October 2003, Issue 62
by Sonaris Consulting, Felix Bopp, Amsterdam, The Netherlands
[formerly Music for New Media Newsletter]
You can find the online version at: http://www.sonaris.info
Music Data Mining
For the blind or deaf:
Vision Technology for the Totally Blind, Association of Sensory Substitution,
Institute for Innovative Blind Navigation
"Query by Humming"
The Intelligent Systems and Robotics Center (ISRC), Stanford Robotics Laboratory,
Robotics and Intelligent Machines Laboratory, MIT Field and Space Robotics Laboratory
Picks from IBC 2003:
The VISTA Project, SONY Broadcast & Professional Research Labs,
BBC Research & Development, Adopt-IT
Conferences & events
Music Data Mining
Dr Darrell Conklin: "Music data mining deals with the theory and methods
for discovering knowledge from music corpora. This knowledge can be in the
form of patterns for music analysis and retrieval, or statistical
models for music classification and generation. Music presents interesting
challenges for data mining and knowledge representation; it is temporal,
highly structured, multivariate, and polyphonic. Like natural language,
it has a deep structure with extensive nonadjacent temporal dependencies.
The rapid growth of music databases, and the ongoing quest to understand
and develop models for musical styles, are motivating the application and
development of music data mining."
"Music presents new, interesting and fascinating challenges for data
mining. The growing field of music data mining is relevant to data mining
researchers interested in applying their techniques to music, or to those
with a desire to gain familiarity with a new area."
Darrell Conklin's homepage:
"Music Generation from Statistical Models"
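Conklin's phrase "statistical models for music classification and generation" can be made concrete with a toy sketch (not Conklin's actual multiple-viewpoint method): a first-order Markov chain over pitches, trained on a small hypothetical corpus and then sampled to produce a new melody.

```python
import random
from collections import defaultdict

def train_markov(melodies):
    """Count first-order pitch transitions over a corpus of note sequences."""
    transitions = defaultdict(list)
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            transitions[a].append(b)      # duplicates preserve frequencies
    return transitions

def generate(transitions, start, length, seed=None):
    """Sample a new melody by random-walking the transition table."""
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length and melody[-1] in transitions:
        melody.append(rng.choice(transitions[melody[-1]]))
    return melody

# Hypothetical two-melody corpus, notes as MIDI numbers (60 = middle C).
corpus = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [60, 64, 62, 65, 64, 67, 65, 60],
]
model = train_markov(corpus)
tune = generate(model, start=60, length=8, seed=1)
```

Real systems model far richer viewpoints (duration, contour, harmony) over much larger corpora; this only illustrates the train-then-sample idea behind statistical generation.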
For tailor-made chamber music concerts,
flute and English lessons visit: www.friedajacobowitz.com
For the blind or deaf
Café Signes, unlike the many bistros sprinkling the streets of Paris,
France, is strangely quiet. The conversations, though jubilant, are silent,
conveyed mostly through sign language rather than the spoken word. Set up
with government backing, the café is designed to train people with hearing
impairments for full-time jobs. Nearly all of the 45 staff are deaf. Patrons
who are unable to read or understand sign language are given a quick tutorial:
the menu contains pictures of all the main signs needed to communicate
an order. While most enjoy the challenge of signing their orders, customers
may also wish to write their orders down. In addition to offering job
training, the café serves an even greater purpose: to foster a better
understanding of those with disabilities. Francine Daude,
a staff trainer at the cafe, has said "most deaf people and hearing people
are actually afraid of each other, and suddenly the non-deaf who come
here find themselves a little in the same situation as a deaf person.
They have to learn a whole new way to communicate. It's good to have a
link between the two, and the link is this cafe - a place where you can
drink, eat and have a good time together and begin to discover that the
person on the other side is not so different after all."
Café Signes, 33 avenue Jean Moulin, Paris, France - Métro Alésia
Vision Technology for the Totally Blind
The vOICe Learning Edition translates arbitrary video images from
a regular PC camera into sounds. This means that you can see with your
ears, whenever you want to. The vOICe Learning Edition is not a sonar
device: you can hear any visual item, including photographs, drawings,
signs, or pictograms. In fact, every visual shape gives a unique sound,
and The vOICe Learning Edition lets you hear visual perspective, doorways,
buildings, trees, furniture etc. Even color identification is included,
using human speech to turn The vOICe system into a kind of "talking camera".
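As an illustration of how an image-to-sound mapping of this kind can work (a minimal sketch, not The vOICe's actual algorithm): scan the image left to right, let vertical position set pitch and pixel brightness set loudness, and sum the resulting tones column by column.

```python
import numpy as np

def sonify(image, duration=1.0, sr=8000, fmin=200.0, fmax=2000.0):
    """Scan a grayscale image (rows x cols, values 0..1) left to right:
    each column lasts duration/cols seconds, each row is a sine tone
    (top row = highest pitch), and pixel brightness sets tone loudness."""
    rows, cols = image.shape
    freqs = np.geomspace(fmax, fmin, rows)          # top-to-bottom pitch ladder
    t = np.arange(int(sr * duration / cols)) / sr   # time axis for one column
    tones = np.sin(2 * np.pi * freqs[:, None] * t)  # one sine per row
    out = [(image[:, c][:, None] * tones).sum(axis=0) for c in range(cols)]
    audio = np.concatenate(out)
    peak = np.abs(audio).max()
    return audio / peak if peak > 0 else audio

img = np.zeros((8, 8))
img[1, :] = 1.0              # a bright horizontal line near the top of the frame
audio = sonify(img)          # one second of "seeing with the ears"
```

Under this mapping, the bright line near the top comes out as a sustained high tone, while a diagonal would be heard as a rising or falling sweep.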
Association of Sensory Substitution, Japan
Sensory substitution means using a surviving sense to stand in for a
sensory function lost to impairment. Research and development of sensory
substitution methods is vital for assisting people with sensory
disabilities, such as visual, auditory, and multiple impairments. The
Association of Sensory Substitution (ASS), Japan was established in 1975.
Institute for Innovative Blind Navigation
The mission of the Institute for Innovative Blind Navigation is to
become a global center for the study, promotion, and development of sophisticated
wayfinding technologies that have the potential for improving the efficiency
and safety of travel for blind individuals. The business of the Institute
is summarized in four goals:
1) To gather, organize, and share knowledge about wayfinding technologies
for individuals who are visually impaired
2) To design and provide training about wayfinding technologies to consumers
and to professionals
3) To identify key wayfinding technologies, evaluate the emerging tools,
communicate and consult with inventors, and report on emerging and existing
technologies
4) To provide services and jobs for blind children (students, young adults).
This goal does not exclude older blind adults from services. Also, the
term "blind" refers to visually impaired individuals as well, and it does
not exclude navigationally impaired individuals.
"Query by Humming"
From the creators of the MP3 digital music format comes another innovation
for audiophiles – a type of software that identifies a song by title and
composer based on a person humming a few bars into a microphone. The system
can also help musicians, who can hum a tune into a microphone and get back
the note structure, according to a spokesman for Germany’s Fraunhofer
Gesellschaft, an umbrella organization for 50 or so research institutes
and working groups.
Computer scientists around the globe are working on their own versions
of so-called “Query by Humming” melody recognition, and the rival products
have yet to be sorted out into one standard. They all try to solve the
problem created by text-based databases and search engines used for audio
content. With the new technology, instead of typing lyrics into a search
engine, the user hums into a microphone. The goal of a consumer humming
a tune to call up just the right song on her stereo or as a radio request
is still a few years off, however.
On the job in Germany are scientists with the Working Group for Electronic
Media Technology (AEMT) of the Fraunhofer Institute for Integrated Circuits.
Part of their work is building a database of songs that includes for each
piece a set of “metadata” on the composer, artist, category, rhythm, beat
and tempo. The Fraunhofer scientists are working on a method by which
this information is extracted automatically from the audio data and attached
in the form of an explanatory note. “In order to find the requested song,
the sound waves generated by the tune being sung are resynthesized by
the computer into a written sequence of notes,” AEMT scientist Dr. Frank
Klefenz explained to Fraunhofer magazine. “The input pitch and timing
information is essentially translated back into notes.” The system then
selects the matching piece of music from the database.
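One classic simplification of this note-matching step from Query-by-Humming research (not necessarily Fraunhofer's method) is to reduce the transcribed notes to a melodic contour - the Parsons code of up/down/repeat steps - and compare contours by edit distance, which tolerates off-key and transposed humming. A toy sketch with a hypothetical two-song database:

```python
def contour(notes):
    """Parsons code: U(p), D(own) or R(epeat) for each interval in a note list."""
    return "".join("U" if b > a else "D" if b < a else "R"
                   for a, b in zip(notes, notes[1:]))

def edit_distance(s, t):
    """Levenshtein distance between two contour strings (tolerates humming errors)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (a != b)))  # substitution
        prev = cur
    return prev[-1]

def best_match(hummed_notes, database):
    """Pick the database entry whose contour is closest to the hummed contour."""
    query = contour(hummed_notes)
    return min(database, key=lambda title: edit_distance(query, contour(database[title])))

# Hypothetical toy database of melodies as MIDI note numbers.
db = {
    "Ode to Joy":    [64, 64, 65, 67, 67, 65, 64, 62],
    "Frere Jacques": [60, 62, 64, 60, 60, 62, 64, 60],
}
hum = [52, 52, 53, 55, 55, 53, 52, 50]   # off-key and transposed, same contour
```

Here `best_match(hum, db)` still finds "Ode to Joy" because contour matching discards absolute pitch, which is exactly what an untrained singer gets wrong.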
The work is a natural progression from MP3 technology, which opened up
access via the Internet to a nearly infinite range of recordings. Now
what is needed is an easier and faster way to identify and select the
desired music. Query by Humming is not the only innovation in the works.
Fraunhofer’s “intelligent stereo” project aims to allow users to simply
state the title of the desired piece and have the stereo play it automatically;
no more shuffling through CDs or music files. And in a twist on Query
by Humming, researchers at another Fraunhofer installation, the Institute
for Integrated Circuits IIS in Erlangen, have come up with AudioID, in which
the system “listens” to a piece, registers features of the song and can
then identify it and even distinguish between different versions. The technology,
based on the open technology MPEG-7 standard, has applications in music
sales, broadcast monitoring and protection of copyright.
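A fingerprinting system like AudioID can be illustrated with a deliberately crude sketch (not Fraunhofer's MPEG-7-based algorithm): summarize each short frame of audio by its strongest spectral bin, and compare two recordings by the fraction of frames whose summaries agree. The signals below are hypothetical stand-ins for real songs.

```python
import numpy as np

def fingerprint(signal, frame=1024):
    """Toy fingerprint: the index of the strongest spectral bin in each frame."""
    n = len(signal) // frame
    frames = signal[:n * frame].reshape(n, frame) * np.hanning(frame)
    return np.abs(np.fft.rfft(frames, axis=1)).argmax(axis=1)

def similarity(fp_a, fp_b):
    """Fraction of frames whose dominant bin agrees between two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    return float(np.mean(fp_a[:n] == fp_b[:n]))

sr = 8000
t = np.arange(2 * sr) / sr
song = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 660 * t)
noisy = song + 0.1 * np.random.default_rng(0).normal(size=song.size)  # same song, degraded
other = np.sin(2 * np.pi * 880 * t)                                   # a different "song"
fp_song, fp_noisy, fp_other = fingerprint(song), fingerprint(noisy), fingerprint(other)
```

The degraded copy still fingerprints like the original while the different piece does not, which is the property that makes broadcast monitoring possible; production systems use far more robust features than a single peak bin.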
The first website implementing "Query by Humming"
The Intelligent Systems and Robotics Center (ISRC)
The Intelligent Systems and Robotics
Center (ISRC) has developed unique tools for simultaneously simulating
multiple models of a single integrated system. A future focus will be
on microsystems implementation (including microelectromechanical systems
- MEMS), military robotic systems analysis, and massively parallel robotic systems.
Stanford Robotics Laboratory
At the Stanford Robotics Laboratory, research is conducted on a number
of core topics, including manipulation, machine learning, navigation,
vision, tactile sensing, and reasoning. The goals of this research are
to make robots both autonomous and dexterous, to increase the self-sufficiency
of existing robots, and to enable robots to accomplish very delicate tasks.
Robotics and Intelligent Machines Laboratory
Microrobotics and Millirobotics
Our lab is interested in shrinking
robots to a size range where they can exploit application domains inaccessible
to conventional robots or humans. Our current focus is on flying micro-robots
(the MFI project), medical milli-robots, and micro-assembly techniques
which can be used to build micromechatronic systems, including microrobots.
Current force-reflecting teleoperators
provide information on the net force of contact, but do not provide information
about static texture, local compliance, or local shape. For dextrous manipulation,
tactile (i.e. cutaneous) feedback should be provided to the operator.
A tactile sensor array can be used to sense contact properties remotely.
To provide local shape information, an array of force generators can create
a pressure distribution on a finger tip, synthesizing an approximation
to a true contact. To implement tactile feedback, we are considering the
problems of tactile transduction, signal processing, tactile display,
and human perception. We have developed a prototype 5 by 5 pressure display
with 3 bits of resolution and are developing a 1 mm square tactile sensor
with 8 by 8 elements. One of our long-term goals is to fabricate a tactile
sensing catheter which will allow surgeons to remotely feel tissue properties
during minimally invasive surgery.
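The pipeline described above - sense contact remotely, then render it on a coarse, low-resolution display - implies a resampling and quantization step between sensor and display. A minimal sketch (the sensor frame and mapping are hypothetical; the 5 by 5 grid and 3-bit depth follow the prototype described above):

```python
import numpy as np

def to_display(pressures, out_shape=(5, 5), bits=3):
    """Resample a tactile-sensor pressure frame (values 0..1) onto the display
    grid, then quantize each cell to the display's bit depth (3 bits = 8 levels)."""
    rows = np.linspace(0, pressures.shape[0] - 1, out_shape[0]).round().astype(int)
    cols = np.linspace(0, pressures.shape[1] - 1, out_shape[1]).round().astype(int)
    sampled = pressures[np.ix_(rows, cols)]       # nearest-sample downsampling
    levels = (1 << bits) - 1                      # highest drivable pressure level
    return np.clip(np.round(sampled * levels), 0, levels).astype(int)

frame = np.zeros((8, 8))          # hypothetical 8 x 8 sensor frame
frame[3:5, 3:5] = 1.0             # a small bump pressed against the sensor
display = to_display(frame)       # 5 x 5 grid of 3-bit drive levels
```

Each display cell then drives one force generator, so the operator's fingertip feels a coarse pressure image of the remote contact.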
"An Integrated Approach to Intelligent Systems"
This multiuniversity research initiative
(University of California, Berkeley, Stanford University and Cornell)
supports distributed multi-agent control systems; hierarchies of sensing
and control; specification, verification and robustness of hybrid systems;
architectures for multi-agent hybrid systems; intelligent agents for complex
uncertain environments; and soft computing and neural networks.
MIT Field and Space Robotics Laboratory
Physics-Based Design, Planning, and Control
of Robotic Systems in Space
Future robotic systems will perform important tasks in space, in both
on-orbit operations and planetary exploration. In this research program,
we are developing new design methods and control and planning algorithms
to enable future space robotic systems to overcome their limitations and
meet challenging mission objectives. The underlying intellectual focus
of the program is to construct a set of integrated design, planning and
control techniques based on an understanding of the fundamental mechanics
of space robotic systems.
The Planning and Control of Space Robotic Systems
In November 2000, The National Space Development Agency of Japan (NASDA)
and the Field and Space Robotics Laboratory of MIT (FSRL) began a 5 year
cooperative research program on free-flying orbital robotics. The objective
of this research is to advance the state of the art of space robotics.
The research will also enhance the research and development capability
of NASDA in the fields of dynamics and control of future space robotics
systems including the Reconfigurable Brachiating Robot (RBR) and the Hyper
Orbital Service Vehicle (Hyper-OSV). The RBR and Hyper-OSV are advanced,
high-performance modular robots being developed by NASDA for use in satellite
servicing and on the Japanese Experimental Module (Kibo) of the International
Space Station. The research will be geared toward developing technologies
to be tested in flight by 2010.
On-line Terrain Characterization for Mars
On rough, unpaved terrain, a vehicle can struggle to move due to wheel
sinkage and slippage. For planetary rovers, failure to move means the
failure of all mission objectives. In this sense, predicting the
traversability of terrain accurately enough is very important. The goal
of this research is to develop on-line algorithms to predict terrain traversability.
Model-Based Control of High Speed Rough Terrain Robotic Vehicles
High-speed mobile robots have many potential applications, including military
reconnaissance and scientific exploration. Our project is focused on developing
the control and planning algorithms for high speed autonomous rough-terrain
ground vehicles. Currently, state-of-the-art high-speed autonomous rough-terrain
robotic vehicles travel at speeds of about 10 mph on relatively gentle
terrain. Our goal is to develop a system for traversing rugged, desert-type
terrain at substantially higher speeds, tolerating vehicle slip and ballistic
motion.
Picks from IBC 2003
The VISTA Project
The VISTA Project explores how visually impaired viewers, many of
whom are elderly, can get the best out of the unprecedented choice of
channels and services on digital television. It will conduct research
towards the development of a complete Virtual Human Interface for digital
television. A virtual human will be combined with voice input and output
systems to create an easy to use prototype interaction system for elderly
and visually impaired digital television viewers. VISTA will provide new
insights into the potential utility of a talking EPG for visually impaired
and elderly viewers. Results of human factors evaluations will inform
the development of optimally effective virtual human interfaces for visually
impaired and elderly viewers. Useful information regarding the appeal
of the VISTA virtual human interface to the target user groups will inform
any future decisions regarding potential commercialisation of the VISTA
system.
SONY Broadcast & Professional Research Labs
Broadcast & Professional Research Labs is an overseas research facility
for the Broadband Network Solutions Company, an operating division of Sony
Corporation. Eighty percent of funding comes from BNSC Company in Japan,
twenty percent from other sources in Europe.
Metadata means "data about data". In the Broadcast & Professional area,
usually the second 'data' means digital audio and video data, often called
'essence' data. During the lifetime of this 'essence', from conception
onwards, there is a huge quantity and variety of metadata available: such
as the detail of the location where the essence was captured; details
of who is being interviewed; details of the names of the cast... and the
list goes on and on. These metadata are extremely valuable, as explained
below, but in many cases they are not captured at the point where they
exist, so much of the value is often lost.
The Human Factors Group
How do we measure whether the performance of a system of people, networks
and broadcast equipment is optimum? Existing Human Factors tools and methods
provide an understanding of the user-system interaction. The HF group
actively investigates the application, and iterative improvement, of
these methods and techniques. In many cases this is done in partnership
with Academic organisations which have a track record in the field.
BBC Research & Development
BBC R&D is a world-leading centre for media production and broadcasting.
MixTV: Mixed reality in future production
The goal is to enhance and innovate BBC broadcast productions by using
Mixed Reality Technologies in various genres.
The MixTV system can, in real time: a) merge real and virtual elements,
b) work in a conventional studio or outdoor production, c) allow free
movement and zooming of the camera, and d) allow interaction with the
virtual elements.
How does it work?
1) Markers are tracked in real time and replaced by virtual elements
2) The virtual elements are merged with the live video image
3) Animations are triggered by bringing two markers together
4) Mask layers are generated using the alpha channel
5) Transitions from real to virtual camera and video walls
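Steps 2 and 4 above - merging virtual elements into the live image through a mask (alpha) layer - amount to standard alpha compositing. A minimal sketch with hypothetical frames (not the MixTV implementation, which also does the marker tracking that produces the mask):

```python
import numpy as np

def composite(background, foreground, alpha):
    """Per-pixel merge: out = alpha * foreground + (1 - alpha) * background,
    where alpha is the mask layer (0 = keep camera pixel, 1 = virtual pixel)."""
    a = alpha[..., None]                             # broadcast mask over RGB
    out = a * foreground.astype(float) + (1 - a) * background.astype(float)
    return out.astype(np.uint8)

live = np.full((4, 4, 3), 200, dtype=np.uint8)       # hypothetical camera frame
virtual = np.zeros((4, 4, 3), dtype=np.uint8)        # hypothetical rendered element
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                                 # region where a marker was found

frame = composite(live, virtual, mask)               # marker region replaced
```

Fractional alpha values along the mask edge are what make the merged element blend smoothly into the live video rather than showing a hard cut-out.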
Hands-on productions can use a standard PC and a web camera. These can
be productions for road shows, public spaces, web and for proof of concept.
One example is the Euro Table that shows ways of communicating financial
data to the viewers to inform and entertain.
Conventional studio or outdoor productions can use a standard PC and
any studio-quality camera. One example is the WarBoard production that
shows possible ways to communicate and analyse news of a country in conflict
to the viewers.
Adopt-IT is a dissemination project aimed at promoting good practices
identified among the RTD (Research & Technology Development) projects
funded under Key Action III of the IST Programme of the European Commission.
Some Adopt-IT projects:
WEDELMUSIC is an innovative idea to allow the distribution and sharing
of interactive music via the Internet while fully respecting publishers'
rights and protecting them from copyright violation. WEDELMUSIC allows
content distributors (publishers, archives, etc.), corporate consumers
(theatres, orchestras, music schools, libraries, music shops), and users
(students, musicians, etc.) to manage interactive multimedia music in WEDELMUSIC
XML format. WEDELMUSIC objects may include: multilingual cataloguing information,
music notation scores, audio files (e.g., WAVE, MP3, MIDI), image of music
scores (PNG, TIF, PDF, etc.), multilingual lyrics (XML), video files (MPEG,
AVI, MOV, QT, etc.), documents (DOC, HTML, XLS, PDF, PS, etc.), pictorial
images (TIFF, GIF, PNG, JPEG, BMP, etc.), animations and slide shows
(FLASH, PPT, etc.), synchronizations, etc., in any format.
Xaudio is an active service entailing the insertion of inaudible (watermark)
codes into the broadcast audio. These codes will survive broadcast, transmission
through the air between speakers and microphones, and can be extracted
in real-time by portable mobile devices such as mobile phones and Personal
Digital Assistants (PDAs). The extracted codes, which uniquely identify
both the broadcaster and the content being played, will then be used to
automatically locate interactive information and e-commerce services that
are directly relevant to the chosen audio material.
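The embed-and-correlate idea behind such watermarks can be sketched as simple spread-spectrum signaling (a toy model, not Xaudio's actual scheme): add a low-amplitude keyed pseudorandom sequence whose sign carries a bit, and detect by correlating the received audio against the same keyed sequence.

```python
import numpy as np

def embed(audio, bit, key=42, strength=0.02):
    """Add a low-amplitude keyed pseudorandom chip sequence; its sign carries the bit."""
    chip = np.sign(np.random.default_rng(key).normal(size=audio.size))
    return audio + strength * (1.0 if bit else -1.0) * chip

def detect(audio, key=42):
    """Correlate with the same keyed sequence; the correlation's sign recovers the bit."""
    chip = np.sign(np.random.default_rng(key).normal(size=audio.size))
    return 1 if float(np.dot(audio, chip)) > 0 else 0

sr = 8000
t = np.arange(sr) / sr
host = 0.5 * np.sin(2 * np.pi * 440 * t)   # hypothetical one-second host signal
marked = embed(host, bit=1)                # watermark ~28 dB below the host
```

The host signal is nearly uncorrelated with the keyed sequence, so the weak watermark term dominates the correlation; real systems additionally shape the watermark psychoacoustically to keep it inaudible and to survive the speaker-to-microphone path.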
OpenDrama will develop a novel platform to author and to deliver rich
cross-media digital objects of lyric opera and other vocal dramatic music,
such as cantatas, oratorios, masses, Lieder, society and entertainment
music, etc. The project will open this heritage to a dimension of learning,
exploring and entertainment. It will provide the digital music market
with innovative content and with a set of integrated solutions for advanced
publishing, entertainment and audio-visual on-demand services. The OpenDrama
service will feature musical streaming with plot visualisation, the real-time
display of score and libretto, 3D virtual staging, access to background
information and karaoke, within the frame of an advanced community service.
It will foster a crucial transition from today's concept of digital music
as a 'blob' of read-only media towards rich interactivity, where users
will be able to work with separately encoded and stored media streams
(e.g., orchestral accompaniment, voices, lyrics, etc.) that they can
choose to handle independently at the time of consumption. The service will
be delivered on broadband Internet through a specialised portal and on
complementary media (primarily DVD disks and, in the future, ATVEF
interactive TV, DAB radio and DVB broadcasts).
Please visit the SONARIS Conference &
Events Calendar at:
New Subscription: http://www.sonaris.info/newsletter.htm
For Advertising: newsletter@Sonaris.info
To unsubscribe: http://www.ymlp.com/unsubscribe.php?SonarisNewsletter
© 2003 Sonaris Consulting, Felix Bopp. All rights reserved. Reproduction
in whole or in part in any form or medium without written permission
is prohibited. Sonaris Consulting cannot accept responsibility for the
accuracy of information supplied herein or for any opinion expressed.