daobilige-su/SSM_LinearArray

SSM_LinearArray

##Video showing SSM_LinearArray:

(Click on the image below to watch the YouTube video or click here)

SSM_LinearArray YouTube Video

Authors: Daobilige Su

Current version: 1.0.0

SSM_LinearArray is a real-time SLAM library for sound source mapping using an off-the-shelf robotic perception device (e.g. Kinect or PS3-Eye) that has an embedded linear microphone array. SSM_LinearArray reads raw image (Mono/RGB-D) and audio data and computes the camera trajectory, the sound source locations, and a 3D reconstruction: dense when a Kinect (RGB-D camera) is used, sparse when a PS3-Eye (monocular camera) is used.

#1. License

SSM_LinearArray is released under a GPLv3 license. For a list of all code/library dependencies (and associated licenses), please see Dependencies.md.

For a closed-source version of SSM_LinearArray for commercial purposes, please contact the authors: daobilige.su (at) student (dot) uts (dot) edu (dot) au.

#2. Prerequisites

I tested the library on Ubuntu 14.04, but it should be easy to compile on other platforms. A powerful computer (e.g. i7) will ensure real-time performance and provide more stable and accurate results.

C++11 or C++0x Compiler

The thread and chrono functionalities introduced in C++11 are used.

Pangolin

Pangolin is used for visualization and the user interface. Download and install instructions can be found at: https://github.com/stevenlovegrove/Pangolin.

OpenCV

We use OpenCV to manipulate images and features. Download and install instructions can be found at: http://opencv.org. At least version 2.4.3 is required. Tested with OpenCV 2.4.11.

Eigen3

Download and install instructions can be found at: http://eigen.tuxfamily.org. At least version 3.1.0 is required.

BLAS and LAPACK

BLAS and LAPACK libraries are required by g2o (see below). On Ubuntu:

sudo apt-get install libblas-dev
sudo apt-get install liblapack-dev

DBoW2 and g2o (Included in Thirdparty folder)

We use modified versions of the DBoW2 library to perform place recognition and of the g2o library to perform non-linear optimizations. Both modified libraries (which are BSD-licensed) are included in the Thirdparty folder. The original version of libgp is used to build the microphone array sensor model using a Gaussian Process.

ROS

ROS is needed to process the live input of the sensor or pre-recorded rosbag files.

Freenect (optional)

The Freenect library is needed for live processing of the Kinect microphone array. Once the freenect library is compiled, copy libfreenect.so into the ROS/SSM_LinearArray/lib folder. A precompiled libfreenect.so built under Ubuntu 14.04 is already included, so if it matches your OS version, it will probably work as is.

PyAudio (optional)

The PyAudio library is needed for live processing of the PS3-Eye microphone array. On Ubuntu, PyAudio can be installed by:

sudo apt-get install python-pyaudio

#3. Installation

Clone the repository:

git clone git@github.com:daobilige-su/SSM_LinearArray.git

The build.sh script is provided to build the Thirdparty libraries and SSM_LinearArray. Please make sure you have installed all required dependencies (see section 2). Execute:

cd SSM_LinearArray
chmod +x build.sh
./build.sh

This will create libSSM_LinearArray.so and libcsparse_extension in the lib folder, and other executables in the ROS/SSM_LinearArray/bin folder.

#4. Run

Running the pre-recorded data

In the case of Kinect, run the following command in a terminal:

roslaunch SSM_LinearArray freenectrosbag+ssmlineararray.launch

In the case of PS3-Eye, run the following command in a terminal:

roslaunch SSM_LinearArray ps3eyerosbag+ssmlineararray.launch

Open another terminal, go to the folder containing the recorded rosbag file, and play it with:

rosbag play XXX.bag

where XXX.bag is the recorded rosbag file.

Pre-recorded Data

An example (recorded rosbag file) of mapping 2 sound sources using Kinect can be found here (1GB).

An example (recorded rosbag file) of mapping 5 sound sources in a computer lab using Kinect can be found here (3.36GB).

An example (recorded rosbag file) of mapping 2 sound sources using PS3-Eye can be found here (2.87GB).

If you find the system is running a bit slow, decompress the rosbag files above first.

Running with live data

In the case of Kinect, the freenect firmware needs to be loaded onto the Kinect first. For details, see the audio-related instructions of the Freenect library. The firmware needs to be loaded each time the Kinect is reconnected to the PC. Then, run the following command in a terminal:

roslaunch SSM_LinearArray freenect+ssmlineararray.launch 

In the case of PS3-Eye, run the following command in a terminal:

roslaunch SSM_LinearArray ps3eye+ssmlineararray.launch 

#5. Processing your own sequences

You need to change the setting files to match the calibration of your sensor. The setting files are inside the folder ROS/SSM_LinearArray/config. The OpenCV calibration model is used for camera calibration.
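As a rough illustration of what the OpenCV calibration model encodes, the sketch below projects a 3D point in the camera frame to pixel coordinates using the pinhole intrinsics (fx, fy, cx, cy) and the radial (k1, k2, k3) and tangential (p1, p2) distortion coefficients. The function name and the numeric values are hypothetical and chosen only for illustration. The actual parameter names expected in the setting files are not shown here, so check the files in ROS/SSM_LinearArray/config.

```python
def project_point(point, fx, fy, cx, cy, k1=0.0, k2=0.0, p1=0.0, p2=0.0, k3=0.0):
    """Project a 3D point (camera frame) to pixel coordinates.

    Uses the OpenCV pinhole model with radial (k1, k2, k3) and
    tangential (p1, p2) distortion.
    """
    X, Y, Z = point
    if Z <= 0.0:
        raise ValueError("point must lie in front of the camera (Z > 0)")

    # Normalised image coordinates.
    x, y = X / Z, Y / Z
    r2 = x * x + y * y

    # Radial and tangential distortion.
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y

    # Apply the intrinsics to get pixel coordinates.
    return fx * x_d + cx, fy * y_d + cy


# Hypothetical calibration values, for illustration only (not from a real sensor).
HYPOTHETICAL_CALIBRATION = dict(fx=525.0, fy=525.0, cx=319.5, cy=239.5, k1=-0.05)

if __name__ == "__main__":
    print(project_point((0.2, -0.1, 1.5), **HYPOTHETICAL_CALIBRATION))
```

A point on the optical axis always lands on the principal point (cx, cy), which gives a quick sanity check after entering your own calibration values.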

#6. To use other sound source Direction of Arrival (DOA) estimation algorithms

In this implementation, SRP-PHAT is used for the sound source DOA estimation. The microphone array transfer function is based on the geometric locations of the microphones. For those who want better DOA estimation accuracy, a pre-recorded transfer function of the microphone array should be used instead of the geometric locations of the microphones. In this case, we recommend using HARK to estimate the DOA angle using the pre-recorded transfer function. The installation and usage of HARK can be found in the HARK online documentation.
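To make the idea of a geometry-based steering model concrete, here is a minimal sketch of a per-angle steered response for a linear array. It is not the code used in this library: it is a plain time-domain delay-and-sum steered response power with whole-sample delays and without the PHAT weighting that SRP-PHAT applies. The microphone positions, sample rate, speed of sound, and all function names are assumptions made for the example.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def arrival_delays(mic_x, angle_deg, sample_rate):
    """Far-field arrival delay of each microphone, in whole samples.

    mic_x holds microphone positions (metres) along the array axis and
    angle_deg is measured from that axis (90 degrees is broadside). A
    microphone further along the axis towards the source hears the wave
    earlier, hence the negative sign.
    """
    cos_a = math.cos(math.radians(angle_deg))
    return [round(-x * cos_a / SPEED_OF_SOUND * sample_rate) for x in mic_x]


def steered_power(signals, delays):
    """Power of the channel sum after undoing the given per-channel delays."""
    n = len(signals[0])
    power = 0.0
    for t in range(n):
        acc = 0.0
        for sig, d in zip(signals, delays):
            idx = t + d
            if 0 <= idx < n:  # samples shifted outside the buffer count as zero
                acc += sig[idx]
        power += acc * acc
    return power


def estimate_doa(signals, mic_x, sample_rate, candidates_deg):
    """Return (best_angle_deg, powers) over the candidate angles."""
    powers = [steered_power(signals, arrival_delays(mic_x, a, sample_rate))
              for a in candidates_deg]
    best = max(range(len(powers)), key=powers.__getitem__)
    return candidates_deg[best], powers


def simulate(mic_x, angle_deg, sample_rate, n_samples=2000, seed=0):
    """Simulate white noise from one far-field source reaching the array."""
    rng = random.Random(seed)
    source = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    signals = []
    for d in arrival_delays(mic_x, angle_deg, sample_rate):
        signals.append([source[t - d] if 0 <= t - d < n_samples else 0.0
                        for t in range(n_samples)])
    return signals


if __name__ == "__main__":
    mic_x = [0.0, 0.1, 0.2, 0.3]  # hypothetical 4-microphone layout (metres)
    signals = simulate(mic_x, angle_deg=60, sample_rate=16000)
    best, _ = estimate_doa(signals, mic_x, 16000, list(range(0, 181, 5)))
    print("estimated DOA (degrees):", best)
```

The list of per-angle powers is the kind of response a DOA front end produces, one value per candidate angle. A pre-recorded transfer function would replace the purely geometric `arrival_delays` model with measured responses.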

To use another sound source DOA estimation algorithm (HARK as an example), the HARK ROS node should subscribe to the topic "/microphone_array_raw", which publishes raw multi-channel audio data. Then, publish the DOA likelihood w.r.t. each angle to the topic "/srp_phat_fd_value". You can rename the output ROS topic to something more meaningful in your case.
