USB Synchronous Multichannel Audio Acquisition System
David Abran-Côté, Massinissa Bandou, Alexandre Béland, Gabriel Cayer, Sébastien Choquette,
Frédéric Gosselin, Francis Robitaille, Diallo Telly Kizito, François Grondin, Dominic Létourneau
University of Sherbrooke
Abstract – This paper is a technical description of an XMOS XS1-L2 based synchronous multichannel audio acquisition system designed for mobile applications, and more specifically for mobile robots. The design is made available through an Open Source Hardware license, allowing easy modification and sharing. The system includes an 8-channel USB sound card coupled with 8 active differential microphones. The sound card also provides an analog stereo output. It can be powered by either USB or an external wide-range (7-36 V) DC power supply. The electrical consumption over USB does not exceed 2.5 W, which conforms to the USB 2.0 power specifications and enables the system to be powered solely by the USB bus. Differential codecs and amplification circuitry are used, allowing operation in noisy setups. Finally, the sound card is compatible with the following operating systems: Windows, Linux and Mac OS. Audio data is transferred to the computer using the USB 2.0 High-Speed Audio Class 2.0 standard, which avoids having to design a different driver for each operating system.

Keywords – Synchronous data acquisition, artificial audition, audio codecs, sound source localization, open source, mobile robotics, differential signals, USB Audio Class 2.0, USB 2.0 High Speed, XMOS

I. CONTEXT

A. Introduction and Motivation

IntRoLab [1] is a research laboratory based in Sherbrooke, Québec, Canada. IntRoLab pursues the goal of studying, developing and integrating technologies for the design of autonomous and intelligent systems. Research activities involve software and hardware design of mobile robots, embedded systems and autonomous agents. To enhance robot perception, an artificial audition system named ManyEars [2] has been developed to locate, track and separate sound sources in real time with an array of eight microphones. This capability allows the robot to locate a person or an interesting event in its environment. Tracking the sound sources over time also lets the robot follow different speakers in motion. ManyEars is also able to separate the individual speech of multiple simultaneous speakers. Tests show that the system performs well with up to four sound sources located within seven meters.

Since the algorithm uses an array of eight microphones, a synchronous audio acquisition system with eight input channels is needed. Two major challenges are present in mobile robotics: 1) power consumption must be low and 2) physical dimensions of the system must be optimized. The goal of this project is thus to design a small, low-cost and low-power audio acquisition card with microphones. To accommodate the needs of the ManyEars algorithm, the card has eight input channels and one stereo output channel and is connected to eight preamplifier microphone cards, all powered by the main audio card.

The USB 2.0 High-Speed interface is used for data transfer because it is more commonly available and more portable than other interfaces such as FireWire or Peripheral Component Interconnect (PCI). The USB 2.0 transfer rate reaches 480 Mbits/sec, which is sufficient to transfer the raw (uncompressed) data of 8 microphones and one stereo output.

This paper presents a description of the designed system. The project specifications, global architecture, materials and software, test results and analysis are presented.

B. State of the Art

Robot artificial audition systems typically use studio sound cards, since there is no audio acquisition card with eight input channels specifically designed for robotic applications. These professional cards often provide unnecessary functionalities (sound effects, integrated mixing, optical inputs/outputs, S/PDIF, MIDI, numerous analog outputs, etc.). Moreover, they are large, expensive, and require a significant amount of power. Clearly, these devices do not meet the requirements of the current application.

Previously, sound cards like the RME MULTIFACE II (PCI) and the MOTU Ultra-Lite-mk3 Hybrid (FireWire) were used at IntRoLab with the ManyEars system. These cards cost around US $500-$1000, consume too much power (10W+, without the power consumption of the eight microphones), provide inadequate or complicated connectivity (power supply + PC interface), and have dimensions that are not suitable for robotic applications.

So far, USB synchronous audio acquisition interfaces mostly use either the USB Audio Class 1.0 standard, which is limited to two input channels, or a proprietary USB protocol. A few other systems use the USB Audio Class 2.0 standard, but these are not well suited for this application.

The audio class definition applies to all USB computer-compatible devices that have integrated functions for voice or audio manipulation.

Recently, the USB Audio Class 2.0 standard was introduced in a few audio acquisition devices [3]. The improvements over the Audio Class 1.0 standard include more channels and better sampling resolutions and frequencies. Most mobile robots use a personal computer (PC) as their central processing system. Linux is the most popular operating system on these robots because the Robot Operating System (ROS) [4] is primarily supported on it.
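As a rough sanity check of the statement that the USB 2.0 High-Speed rate of 480 Mbits/sec is sufficient for eight microphone inputs and one stereo output, the raw stream rate can be estimated as sketched below. The 48 kHz sample rate and the 32-bit sample slot are assumptions chosen for illustration, not values stated in this section, and the helper name `raw_audio_bitrate` is hypothetical. USB protocol overhead is ignored.

```python
# Minimal sketch: raw (uncompressed) PCM bit rate versus the USB 2.0
# High-Speed signaling rate quoted in the text.

USB2_HS_SIGNALING_BPS = 480_000_000  # 480 Mbits/sec, from the text

# Assumptions for illustration only (not taken from the paper section):
SAMPLE_RATE_HZ = 48_000   # assumed sample rate
BITS_PER_SAMPLE = 32      # assumed slot size (e.g. 24-bit audio padded to 32)


def raw_audio_bitrate(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Return the raw PCM bit rate in bits per second (hypothetical helper)."""
    return channels * sample_rate_hz * bits_per_sample


capture_bps = raw_audio_bitrate(8, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)   # 8 microphones
playback_bps = raw_audio_bitrate(2, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)  # stereo output
total_bps = capture_bps + playback_bps
bus_fraction = total_bps / USB2_HS_SIGNALING_BPS

print(f"capture : {capture_bps / 1e6:.3f} Mbit/s")
print(f"playback: {playback_bps / 1e6:.3f} Mbit/s")
print(f"total   : {total_bps / 1e6:.3f} Mbit/s ({bus_fraction:.1%} of 480 Mbit/s)")
```

Under these assumptions the raw streams total about 15.4 Mbit/s, roughly 3% of the nominal signaling rate, which leaves a wide margin for protocol overhead.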
Figure 1: System diagram
When the external power supply is turned on, a power multiplexer (TPS2111) automatically detects this new source and disables the USB power source. This allows hot plugging of power sources.

To maximize the signal-to-noise ratio and to obtain the 12-bit effective resolution previously specified, a high analog reference voltage is needed. This is done with a low-dropout voltage regulator (LDO) that ensures that the analog reference voltage stays at 4.3 V when the USB voltage is at its minimum (4.4 V). Consequently, when the ADCs acquire the [...]

Strategies to preserve the audio signal integrity are described in this section.

The analog signal coming from the microphones is transmitted in differential mode to the codecs. Differential mode is preferred to single-ended signaling because of its better noise immunity and because it increases the dynamic range of the analog/digital converter of the codec. Since the acquisition card is designed to operate on a robot, many external devices can induce electromagnetic interference in the transmitted signal. Twisted pairs for signal transmission distribute the interference evenly over both conductors, and a differential amplifier rejects the resulting common-mode noise, which minimizes the overall effect of electromagnetic interference.

Moreover, the amplitude of a differential signal is higher than the amplitude of a single-ended signal for the same reference voltage, which increases the dynamic range. According to the datasheet of the codec, the dynamic range goes from 102 dB (single-ended signaling) to 105 dB (differential signaling). This 3 dB gain corresponds to an increase of about 41% in amplitude ratio.

[...] divided into two sections, one for the digital parts and another one for the analog parts.

The power plane is divided into three sections, as shown in Figure 3. The first section is used for the digital parts (in red), the second one is used for the analog parts (in orange) and the third one powers the XMOS cores (in blue).
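The 41% figure quoted for the move from single-ended to differential signaling can be reproduced by converting the decibel difference to a linear amplitude ratio. The sketch below assumes the usual amplitude convention (20 log10 of a ratio) for dynamic range; the function name `db_to_amplitude_ratio` is hypothetical.

```python
# Minimal sketch: convert the 102 dB -> 105 dB dynamic-range improvement
# (values from the text) into a linear amplitude ratio.


def db_to_amplitude_ratio(delta_db: float) -> float:
    """Convert a decibel difference to a linear amplitude ratio (20*log10 convention)."""
    return 10 ** (delta_db / 20)


single_ended_db = 102.0   # from the text (codec datasheet, single-ended)
differential_db = 105.0   # from the text (codec datasheet, differential)

gain_db = differential_db - single_ended_db
ratio = db_to_amplitude_ratio(gain_db)
increase_percent = (ratio - 1) * 100

print(f"gain: {gain_db:.0f} dB, amplitude ratio: {ratio:.3f}, increase: {increase_percent:.0f}%")
```

Note that the same 3 dB expressed as a power ratio (10 log10 convention) would be a factor of about 2, so the 41% value only holds when the increase is read as an amplitude ratio.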
REFERENCES
[1] IntRoLab. (2009, August 26). [Online]. Available: http://www.introlab.gel.usherbrooke.ca
[2] ManyEars. (2010, November 19). [Online]. Available: http://www.manyears.sourceforge.net