Tuesday 29 May 2012

In search of (low) latency

This is a followup to my investigations into playing my Songken DVD DKD files on my laptop. In an earlier post I described how to decode the DKD files into Midi or Midi+WMA files. The intent was then to build a Midi player that would show both the notes of the melody and the notes the singer was actually singing.

Well, I did all that. Java Sound has a Midi player. Java Sound has a Sampled API to handle sounds from the microphone to the loudspeaker. Java has a GUI for showing stuff. TarsosDSP by Joren Six has implemented a number of pitch detection algorithms such as YIN and they can be pulled in to give an estimate of the pitch sung. Java can convert characters from language encodings such as GB2312 to Unicode and display them so I can see Chinese and other characters.  So it's all there....
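
As a small illustration of the character conversion step, here is a minimal sketch. It assumes the JDK in use ships the optional GB2312 charset (standard desktop JDKs normally do); the class name Gb2312Decode and the sample bytes are made up for the example and are not taken from a DKD file.

```java
import java.nio.charset.Charset;

class Gb2312Decode {
    // Decode raw GB2312-encoded bytes into a Java (Unicode) String.
    // Charset.forName throws UnsupportedCharsetException if this JDK
    // does not provide GB2312.
    static String decode(byte[] raw) {
        return new String(raw, Charset.forName("GB2312"));
    }
}
```

Calling Gb2312Decode.decode on the four bytes D6 D0 CE C4 should give the two characters 中文, which can then be handed to any Swing or AWT text component with a font that covers them.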

... but latency still kills it. The Midi player introduces latency somehow into the sampled sounds, but even if you work around it - even if you just do sampled data alone - then there is still that little delay. Here are my Java source files. Maybe I will write up an explanation of what I was doing with them later. I'm going to stop work on them right now till I get the latency sorted out.
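
To make the "sampled data alone" case concrete, here is a minimal sketch of a microphone-to-speaker copy using the Java Sound Sampled API. It is not the linked source: the class name MicPassthrough, the helper bufferBytes, the 44.1 kHz 16-bit mono format and the 10 ms buffer request are all assumptions chosen for illustration. The driver is free to ignore the requested buffer size, so the sketch prints what it actually got.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.TargetDataLine;

class MicPassthrough {
    // Assumed format: 44.1 kHz, 16-bit, mono, signed, little-endian.
    static final AudioFormat FORMAT = new AudioFormat(44100f, 16, 1, true, false);

    // Bytes needed to hold `millis` of audio, rounded down to whole frames.
    static int bufferBytes(AudioFormat format, int millis) {
        int frames = (int) (format.getFrameRate() * millis / 1000);
        return frames * format.getFrameSize();
    }

    // Copy microphone input to the default output until the thread is interrupted.
    static void passthrough() throws LineUnavailableException {
        int requested = bufferBytes(FORMAT, 10); // ask for about 10 ms

        TargetDataLine mic = AudioSystem.getTargetDataLine(FORMAT);
        SourceDataLine speaker = AudioSystem.getSourceDataLine(FORMAT);
        mic.open(FORMAT, requested);
        speaker.open(FORMAT, requested);
        System.out.println("mic buffer: " + mic.getBufferSize()
                + " bytes, speaker buffer: " + speaker.getBufferSize() + " bytes");

        mic.start();
        speaker.start();
        // Read in chunks of about 5 ms; the length must be a whole number of frames.
        byte[] chunk = new byte[bufferBytes(FORMAT, 5)];
        while (!Thread.currentThread().isInterrupted()) {
            int n = mic.read(chunk, 0, chunk.length); // blocks until the chunk is filled
            if (n > 0) {
                speaker.write(chunk, 0, n);
            }
        }
        mic.close();
        speaker.close();
    }
}
```

Calling MicPassthrough.passthrough() from a main method is enough to hear the delay; the printed buffer sizes show how far the real buffers are from the ones requested.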

The standard audio system for (consumer) sound on Linux is Pulse Audio. But as Lennart Poettering explained at the Linux Audio Conference 2010, pro audio has different aims to consumer audio, and this project is closer to pro audio than consumer audio (although to think of Karaoke singers as pros is stretching it a bit :-). In consumer audio, latencies of up to 2 seconds may be permissible, while pro audio sets an upper limit of 20 milliseconds.

Java Sound is estimated to have a 50msec delay: "These measurements suggest that the latency introduced by buffers in the "Java Sound Audio Engine" is about 50 ms, independant of the sample rate." Now that's on old equipment, but it means there is an uphill struggle.
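
To put these figures in terms of buffers: the delay contributed by one buffer is just its length in frames divided by the sample rate. The sketch below does that arithmetic. The class and method names are made up for illustration, and it only accounts for a single buffer, not for whatever else the audio stack adds.

```java
class BufferLatency {
    // Time in milliseconds represented by `frames` sample frames at `sampleRate` Hz.
    static double millis(int frames, double sampleRate) {
        return 1000.0 * frames / sampleRate;
    }

    // Largest whole number of frames that fits into a latency budget of `millis`.
    static int framesFor(double millis, double sampleRate) {
        return (int) Math.floor(millis * sampleRate / 1000.0);
    }
}
```

For example, a 50 msec delay at 44.1 kHz corresponds to 2205 frames of buffering, while a 20 millisecond budget at 48 kHz leaves room for at most 960 frames across all the buffers in the chain.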

The sound quality of the builtin soundcard HDA Intel PCH (STAC92xx) on my Dell laptop is appalling. That has to be overcome too. This laptop doesn't have a microphone input, so I started looking at USB sound cards. My first attempt was with an AnPu Portable USB 3D Virtual 5.1 Audio Sound Card Adapter Blue from Dino Direct. Dino was good: delivery post-free within 2 weeks. But the card was cheap (A$4) and broke when I inadvertently yanked it out of the USB slot.


My second attempt was an XLR to USB Adapter from Swamp Industries. That was about A$20 but I got it with a microphone as well. The service was good again. Well, the card's okay for input, but the output still has to go through the onboard soundcard.


The third attempt was with a Sound Blaster X-Fi Surround 5.1 Pro at A$70. It's a USB 1.1 device (Linux still has issues with USB 2 devices, apparently).  Pulse Audio only recognises it as an input device, not as an output device, so it didn't seem to improve things.


Pulse Audio is an audio layer above Alsa (OSS was used before Alsa). Alsa could see the device fine:

$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: STAC92xx Analog [STAC92xx Analog]
  Subdevices: 0/1
  Subdevice #0: subdevice #0
card 2: Pro [SB X-Fi Surround 5.1 Pro], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
 

and

$ aplay -l
...
card 2: Pro [SB X-Fi Surround 5.1 Pro], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 2: Pro [SB X-Fi Surround 5.1 Pro], device 1: USB Audio [USB Audio #1]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
 

Now about this time I went off on what turned out to be a wild goose chase (at least so far) by looking at Jack: "JACK is [a] system for handling real-time, low latency audio (and MIDI)". Jack currently also uses Alsa. Now that looks good - but Java Sound and Jack don't play together.

Java Sound has a couple of weird bits where "obviously equivalent" things aren't. I hit this first with volume control in playing a Midi file: you can't set the volume on the default device but you can if you iterate through the devices and select the default one. Then you can set the volume on it. Huh? Thanks to Greg Donahue for solving that one. You hit similar problems trying to find the sound cards and you end up either with
  • Java Sound not playing to your default card; or
  • When you explicitly select the default card then Java Sound throws an exception saying that its PulseAudio drivers can't find it.
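
For anyone hunting for their sound cards the same way, here is a minimal sketch of listing what Java Sound thinks is installed. It is not Greg Donahue's fix and it does not touch the Midi side; the class name ListMixers and the output format are made up for illustration. It only uses AudioSystem.getMixerInfo() and asks each mixer whether it offers playback (SourceDataLine) or capture (TargetDataLine) lines.

```java
import java.util.ArrayList;
import java.util.List;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Line;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.TargetDataLine;

class ListMixers {
    // One line of text per mixer that Java Sound reports on this machine.
    static List<String> describeMixers() {
        List<String> out = new ArrayList<>();
        Line.Info playback = new Line.Info(SourceDataLine.class);
        Line.Info capture = new Line.Info(TargetDataLine.class);
        for (Mixer.Info info : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(info);
            out.add(info.getName()
                    + " | playback=" + mixer.isLineSupported(playback)
                    + " capture=" + mixer.isLineSupported(capture));
        }
        return out;
    }
}
```

Printing the result of ListMixers.describeMixers() next to the output of aplay -l makes it easier to see which cards Java Sound is missing or mislabelling.
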
So after all that, where are we?
  • Java Sound has latency problems
  • The inbuilt soundcard is crap
  • Pulse Audio can't properly find the USB soundcard
  • Java Sound uses Pulse Audio
  • Pulse Audio has latency issues
  • Jack is ignored by Java Sound
  • Alsa and Jack can find the USB soundcards
Is it possible to have latency-free sound on Linux? Well, Jack is designed for low latency, but it still has to go through the Alsa layer. Can the Alsa layer be latency-free? Not completely, but I finally figured out the following test:

    arecord -f dat -B 4 -D hw:0 | aplay -B 4 -D hw:2 -f dat -

i.e. record at DAT standard (16-bit, 48 kHz, stereo) from the builtin mike (hw:0) and play it on the USB soundcard (hw:2), with a buffer time of 4, which I took to be 4 msec (the aplay man page gives the unit of -B as microseconds, so Alsa presumably rounds the request up to the smallest buffer the hardware allows). And hey! It works! No latency that my poor ear can hear. This simple pipeline isn't perfect: any overrun introduces latency into the pipeline, but that can be handled in code by dropping samples. The buffer size can be increased and it still sounds okay - 4 was the lowest value I could take it to.
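
The idea of dropping samples on overrun can be sketched as a fixed-size ring buffer that throws away the oldest samples when the writer gets ahead of the reader, so the backlog (and hence the added delay) can never exceed the buffer's capacity. This is only an illustration of the technique, not code from my player: the class name DroppingRingBuffer is made up, and the sketch is single-threaded, so a real capture and playback pair would need synchronisation around it.

```java
class DroppingRingBuffer {
    private final short[] data;
    private int head;      // index of the oldest stored sample
    private int size;      // number of samples currently stored
    private long dropped;  // total samples discarded because of overruns

    DroppingRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        data = new short[capacity];
    }

    // Append samples; when the buffer is full, discard the oldest sample first.
    void write(short[] samples) {
        for (short s : samples) {
            if (size == data.length) {
                head = (head + 1) % data.length;
                size--;
                dropped++;
            }
            data[(head + size) % data.length] = s;
            size++;
        }
    }

    // Copy up to out.length of the oldest samples into out; returns the count copied.
    int read(short[] out) {
        int n = Math.min(out.length, size);
        for (int i = 0; i < n; i++) {
            out[i] = data[(head + i) % data.length];
        }
        head = (head + n) % data.length;
        size -= n;
        return n;
    }

    int available() {
        return size;
    }

    long droppedCount() {
        return dropped;
    }
}
```

With a capacity of a few milliseconds' worth of samples, a stall on the playback side costs a short glitch rather than a permanent extra delay.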

Conclusion:
  • the top-down approach through Java works but has latency issues
  • the bottom-up approach through Alsa handles latency
I just need to combine the two...



5 comments:

  1. I just did a crude latency test. Using my program PlayMicrophone.java which just copies microphone to speaker, I made finger-clicking sounds and picked up both my sound and the played sound on another computer recording using Audacity. The two sounds could be seen distinctly. The gap between them was consistently about 40msecs. Too high. And it gets worse when playing a Midi file at the same time.

    1. 40ms would be acceptable for my purposes. Do you mind sharing the source for PlayMicrophone.java?

    2. It's on my web site under the LinuxSound/Sampled/Java chapter https://jan.newmarch.name/LinuxSound/Sampled/JavaSound/

    3. Thank you Jan. I tried the PlayMicrophone code. I was getting stuttering so I made one small change to the SampleRate from 44100 to 8000. I measured the latency with Audacity and was seeing about 250ms on Windows 7 with JDK 8u40. Did you do anything special to get latency down to 40ms?

  2. Hi.
    I'm developing an open source java library.
    Currently, it has the following packages:
    - matrix
    - media
    - audio
    - midi
    - util
    The matrix package provides classes to perform the most common operations applied to matrices and vectors and implements functionality to solve nxn linear systems and perform LU decomposition.
    The media package provides a sound player with the capability to play audio and midi with mp3 support and features like loop and shuffle. It also has support for playing entire directories and m3u files, among other functionality.
    The audio package provides an audio player in case you don't need MIDI support.
    The MIDI package also provides a MIDI player, a MIDI file reader and writer and classes to manage MIDI data with their corresponding wrappers for the javax.sound.midi classes.
    The util package provides a useful static class to resize, get chunks of, clone and print one, two or three-dimensional arrays of any type, including the atomic ones.
    I hope that it can be useful for someone.
    You can visit:
    http://imr-lib.blogspot.com
    to download the whole code and api documentation.
    It's totally free and you don't need to subscribe or anything like that.
    Just download it and post your opinion if you want.
    Well, have a nice day.