rrrooss's comments | Hacker News

Frequency response and dynamic range are low-hanging fruit in high-res audio debates. They are not the only factors in accurate reproduction of an input wave. What is rarely considered is the temporal resolution of the format, i.e. the format's ability to describe change over time. If you plot it as a graph, bit depth is the y-axis resolution and sample rate the x-axis resolution; transient response is directly affected by this. The link below shows the advantages of hi-res audio formats inside the human hearing range. So if accuracy is the concern, high-res formats DO hold more information about the original waveform. If file size is the concern, hi-res is not applicable.

http://www.eirec.com/DPimages/digisqwvtest.jpg This is an example of the transient response of different sampling rates.


I tried to find the context for that image but there wasn't anything on the site. The image is most certainly incorrect -- or, more likely, it's showing the output of the analog electronics after the DAC.

Sampling theory says that everything in a signal below the Nyquist frequency is represented exactly. That doesn't mean that the codec or the analog electronics are capable of responding instantly at those frequencies, but it does mean that all of the content of a 1 kHz square wave below Nyquist can be perfectly captured and reconstructed from a 44.1 kHz sample rate. The image is simply incorrect.
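
As a rough illustration (my own numpy/scipy sketch, not taken from the linked image or any particular DAC): build the band-limited content of a 1 kHz square wave (its odd harmonics up to the 22.05 kHz Nyquist limit), keep only the 44.1 kHz samples, then reconstruct a 16x denser waveform from those samples alone. The reconstruction matches the band-limited waveform to rounding error.

    import numpy as np
    from scipy.signal import resample

    fs, f0 = 44100, 1000          # CD sample rate and square-wave fundamental (Hz)
    n_samp = 441                  # 10 ms window = exactly 10 periods of 1 kHz

    def bandlimited_square(t):
        # Fourier series of a square wave, keeping only odd harmonics below Nyquist
        y = np.zeros_like(t)
        for k in range(1, fs // (2 * f0) + 1, 2):
            y += (4 / np.pi) * np.sin(2 * np.pi * k * f0 * t) / k
        return y

    stored = bandlimited_square(np.arange(n_samp) / fs)     # what a 44.1 kHz file keeps
    dense_t = np.arange(n_samp * 16) / (fs * 16)            # 16x oversampled time grid
    rebuilt = resample(stored, n_samp * 16)                 # ideal (FFT-based) reconstruction

    print(np.max(np.abs(rebuilt - bandlimited_square(dense_t))))   # ~1e-12, identical up to rounding

Any ringing, overshoot, or slew visible in a scope shot comes from the filters and analog stage, not from the sample rate's ability to carry the information.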

Transient response in the real world is generally limited only by the acoustic transducer response of the system, because everything else has the ability to respond much faster than audio rates. With Pono, this means that the earbuds or headphones you use with the player will have a greater effect on the transient response than the electronics inside.


This image is from RME, a sound card manufacturer, and it shows the analog output after the DAC, so it is still in the electrical domain. What do the shortcomings of a playback system have to do with the accuracy of a recording medium? There are SO many factors that stop an audio signal's reproduction from being perfect, but the signal domain is the easiest one to improve. Transducers that have to fight the laws of physics for accuracy are the obvious weak point, but as with anything, rubbish in = rubbish out. Before you even get to the weakest point of an audio system, the transducer (the speaker), the transistors and amplification have their own shortcomings: the accuracy of an amplifier's transient response is measured by its slew rate. A square wave is impossible for a speaker cone to reproduce; in theory the rising and falling edge describes an instantaneous movement, and a speaker cone can't do this -- inertia says no. That is called distortion. As every part of the system introduces its own distortion, the accuracy gets less and less. To accept distortion at the signal level in the pursuit of accuracy is counterintuitive. But if your system can't take advantage of this extra accuracy, go ahead and use a lesser format; no sense in the extra file size. If you want a reference-quality signal, use the high-res formats.

To be clear, I'm not saying I want to listen to digitally encoded square waves; a square wave is a tool for showing the accuracy of input = output, being the hardest analog waveform to capture and reproduce. I am objecting to the idea that hi-res formats have no advantage over what we are used to as consumers. It is the content distributors that are taking advantage by saying that 24bit/96k is more expensive; the bandwidth and storage do not make this a more premium product, and you should be paying for the album, not the format. But why limit choice because the masses can't see the benefits? If you buy the album, you as the consumer should be able to choose the format. I'm not sure all this arguing stands up to "pick whatever format suits your needs." The argument should be: if I bought the album, why can't I choose the format most applicable to my playback needs/desires? And if your concern is that companies charge more for high res, that reflects poorly on the distribution company. If you think it's a waste of space, just download the lesser format. If I had the choice, I'd download it as it left the studio/mastering house.


Not sure what point you're trying to make here, but I was responding to your post implying that a 192 kHz sampling rate improves transient response. It doesn't improve it at all. While there is sampling "distortion" by way of quantization error, this error is inaudible because we're sampling at 24 bits, beyond the dynamic range of human hearing. High sample rates actually REDUCE audio quality because they capture ultrasonics that can intermodulate down into the audible range, potentially damage speakers, or pick up EMI interference.

It has been proven over and over via ABX testing that high res formats are completely indistinguishable from a 24b/44.1kHz master. And further studies have proven most audio engineers can't distinguish between lossless audio and 320kbps MP3. That's what the Xiph link elsewhere in the thread shows.


There is more to it than frequency response and dynamic range. The goal should be the most accurate reproduction of the input waveform. Here is a picture of the transient response at different sample rates: http://www.eirec.com/DPimages/digisqwvtest.jpg


There has been a lot of talk about the quantitative aspects but little about the qualitative. What does an 18 kHz square wave look like when recorded and reproduced at 44.1 kHz? How is phase affected? The phase of an audio signal reaching the ears helps you perceive distance and position. How many samples are used to represent the frequencies at the higher end of the audible range? What is the quantization error difference between 16 and 24 bits? And if the author can't hear the difference between 4-bit audio and 12-bit audio, I question what he/she is listening on. The aliasing would be HUGE.


An 18 kHz square wave sampled at 44 kHz looks like an 18 kHz sine wave: every harmonic above the fundamental is well outside the Nyquist limit and will have been thrown away by the anti-aliasing filter. And furthermore, you couldn't hear it even if it weren't. Fourier decomposition of a square wave gives a sum of odd multiples of the fundamental, so the next frequency present is 3 x 18 kHz = 54 kHz.
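
The arithmetic, spelled out (a trivial sketch of my own):

    f0, fs = 18_000, 44_100
    nyquist = fs / 2
    for k in (1, 3, 5, 7):            # square waves contain only odd harmonics
        f = k * f0
        print(f"harmonic {k}: {f / 1000:g} kHz -> "
              f"{'below Nyquist, kept' if f < nyquist else 'above Nyquist, filtered out'}")
    # harmonic 1: 18 kHz is kept; harmonics 3, 5, 7 (54, 90, 126 kHz) are all filtered out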


Okay, now actually look at the response in Matlab. As you get closer to the Nyquist frequency there are fewer samples to describe the wave, and while you get a bastardized 18k signal out, it's a far cry from what you put in. And while most are crying that you're splitting hairs, the frequency response and dynamic range arguments pale in comparison to making the exact wave you put into the encoder come out. On the right system, with the right recording, the tiniest nuance in a room reverb helps trick your brain into believing the sound is actually happening. This is not important to all listeners, but people saying that it makes no difference and is not important as a release format are thinking only of their current needs and experiences.

If you have actually heard a good recording on a good system in a good room, you will know what I'm talking about. If your listening experience has been laptop speakers, headphones and your mum and dad's mini system, then I totally agree: anything more than 320 kbps MP3s is overkill. But to say that high-res formats have no place in consumer land is to show a lack of understanding of different people and different needs. In the age of iTunes, surely you buy the album and download it in whatever format you see fit. If you think audiophiles are wackos, go get your 320s. I would like to get it as it came from the studio. The best it can be is as it came from the studio.


First off, forget about MP3; I'm not arguing for a lossy standard.

You can't get an 18 kHz square wave out of a system with 44 kHz sampling. You need at least one harmonic above the fundamental before it'll even LOOK square, and that requires a frequency response out to 54 kHz, i.e. a sampling frequency over 108 kHz. You CERTAINLY won't find one in a reverb tail, even assuming you had a generator for one in the first place (you might JUST get one from a cymbal crash, but I don't think the physics works out).

The point being, your source material can't contain an 18 kHz square wave either since it's been through a studio production system with the same antialiasing filters.

Since you know nothing about me but seem to be making assumptions anyway, here's some background. I've worked in broadcast audio; I own studio recordings in 24 bit / 192 kHz (Linn release of Mozart's Requiem, studio master series). I also own studio equipment that can actually play it. Audiophiles are, by and large, cash cows for companies with no scruples.


Good, an audio nerd. I am making the point about the square wave close to Nyquist to show the shortcomings of a format for accurately reproducing an input. Square waves in the real world are rare, but I am arguing for a format that produces the most accurate representation of the intended signal. Imagine the situation where I have my guitar cab set up with a square wave (distortion) coming out of it, and I mic it at a distance so as to capture the room and give a feeling of space. To really feel like you're there, you would want the resulting complex wave made up of the 18k direct sound from the cab and the room response; a recording medium that can't do that accurately is second rate, especially when the formats are out there. And the higher the sample rate, the further from Nyquist that 18k is, the more samples can be used to describe the resulting wave, and the more convinced my brain is that the sound is real.


...right up to about 20 kHz, whereafter YOU CAN'T HEAR THEM. Hence 44 kHz sampling.

Seriously, A/B test this, you might be surprised.

Also, if you think that's anything like a square wave coming out of a guitar speaker (or that it's even desirable in most cases), I've got a bridge to sell you. And yes, I do play.


http://www.eirec.com/DPimages/digisqwvtest.jpg

Okay, here is a picture of what I'm trying to explain, and the author of this picture used a frequency much further inside the human hearing range. This is a transient response test, I guess. My main argument is for the verbatim capture of the input wave. It will make the sound at 10k, but it isn't the same wave that went in.


> What does an 18 kHz square wave look like when recorded and reproduced at 44.1 kHz?

It looks like an 18khz sine wave, possibly with slightly reduced amplitude depending on the anti-aliasing filter rolloff and fc, but not enough to be audible (18khz isn't audible for a large portion of the population anyway).
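
For anyone who wants to see this rather than take it on faith, here is a small sketch of my own (numpy/scipy, with an idealized brick-wall filter standing in for a real anti-aliasing filter): generate an 18 kHz square wave on a fine 900 kHz grid, remove everything above 22.05 kHz, and check what spectral content is left.

    import numpy as np
    from scipy.signal import square

    fs_hi, f0, f_cut = 900_000, 18_000, 22_050    # fine "analog" grid, square freq, 44.1k Nyquist
    t = np.arange(fs_hi) / fs_hi                  # exactly 1 s, a whole number of periods
    x = square(2 * np.pi * f0 * t)                # the 18 kHz square wave before filtering

    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs_hi)
    X[freqs > f_cut] = 0                          # idealized brick-wall anti-aliasing filter

    surviving = freqs[np.abs(X) > 1e-6 * len(x)]
    print(surviving)                              # [18000.] -- a single spectral line remains,
                                                  # i.e. the filtered wave is a pure 18 kHz sinusoid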

> How is phase affected?

Probably delayed a bit by the anti-aliasing filter.

> Phase of an audio signal reaching the ears helps you perceive distance and position.

Not at 18 kHz; the wavelength is too short for your ears to notice any realistic group delay. High-frequency localization is mostly done by ILD (interaural level difference) and effects caused by ear shape.

> What is the quantization error difference between 16 and 24 bits?

Quantization error in a DAC just defines the noise floor. 16- and 24-bit DACs are usually within a few dB of each other in measured dynamic range; it's really not audible.
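
For reference, the textbook numbers behind this (my own sketch): the ideal SNR of an N-bit quantizer for a full-scale sine is about 6.02*N + 1.76 dB. Real 24-bit converters fall well short of the theoretical figure because analog noise, not bit depth, ends up setting the floor, which is why measured dynamic ranges sit much closer together than the ideal numbers suggest.

    for bits in (4, 12, 16, 24):
        print(f"{bits:>2} bits: ~{6.02 * bits + 1.76:.0f} dB theoretical dynamic range")
    #  4 bits: ~26 dB, 12 bits: ~74 dB, 16 bits: ~98 dB, 24 bits: ~146 dB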

> And if the author can't hear the difference between 4-bit audio and 12-bit audio, I question what he/she is listening on. The aliasing would be HUGE.

What does aliasing have to do with bit depth?


http://www.eirec.com/DPimages/digisqwvtest.jpg Here is an example of higher sampling rates being useful INSIDE the human hearing range; it shows transient response. Frequency response and dynamic range are low-hanging fruit in high-res audio debates. To argue that you can't hear it is a moot point when accuracy is the point: this is a reference-quality format. Temporal resolution is the next area to pursue if accuracy is the concern. If file size is the concern, then high res is not applicable. What is everyone arguing here? That if they can't hear it on their setup, it is of no use?


This test is flawed in a few ways, and the error lies in the source material. Why would there be a perceivable difference between the same inferior input fed to both encoding formats? Changing between encoding formats is a topic of its own, but the problem here is similar to taking a 640x480 compressed JPEG, putting it into a 1920x1080 lossless format and the same source into a 1280x720 lossless file, and asking: What's the difference? They both look crappy! Many people have an opinion on high-res audio formats, but there are so many places to go wrong. Audio should be given a little more credit for its immersive ability; when done right it can take you places. Something to consider is that our eyes can only see one octave of information while our ears can hear ten octaves. While these high-res formats aren't needed for every application, they do make a big difference when the source material can take advantage of them. I feel that in 2014 music should be released in the best format available from the studio, and if you want a crappier version so you can shove 5000 songs on your iWhatever, that's your choice.


It's not a case of the same inferior source material being fed to both.

When you apply negative gain to the 16-bit signal, it has less amplitude in absolute terms, but it should have all the resolution of a whole 16-bit sample in that space.

The point is that whether that 16-bit signal is represented with the lower 4 bits of a 16-bit sample, or the lower 12 bits of a 24-bit sample, you can scarcely hear anything at all, let alone a qualitative difference between the signals.
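
A rough numerical sketch of that comparison (my own construction, not the actual test from the article): attenuate a 1 kHz sine by 72 dB so it occupies only the bottom few bits of each container, quantize to 16 and to 24 bits with no dither, and compare signal-to-quantization-noise ratios.

    import numpy as np

    t = np.arange(48_000) / 48_000
    x = 0.5 * np.sin(2 * np.pi * 1000 * t) * 10 ** (-72 / 20)   # sine 72 dB below half scale

    def snr_after_quantizing(signal, bits):
        step = 2.0 ** (1 - bits)                  # quantizer step size for a [-1, 1) range
        q = np.round(signal / step) * step        # plain rounding, no dither, for simplicity
        noise = q - signal
        return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

    print(snr_after_quantizing(x, 16))   # ~20 dB above its own quantization noise
    print(snr_after_quantizing(x, 24))   # ~68 dB above its own quantization noise

The 24-bit container does keep a lower noise floor under the attenuated signal, but both versions sit roughly 72 dB below full scale, which is the point being made above: at sane playback levels you are straining to hear the signal at all, never mind a qualitative difference between the two.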

192 kHz is not a higher-quality format; it is a production format. The purpose is to reduce the cost and finality of the anti-aliasing filtering applied when sampling the signal: it is just cheaper to manufacture. There's a good reason this is the standard in professional audio: they need to buy a LOT of audio interfaces, and many studios have thousands of such inputs.

Thanks to the basic laws governing signals, we know with near certainty that not only is 192KHz overkill, but so is 48KHz, and so is 44.1. Without significant new evidence showing humans hearing signals with frequencies greater than 24KHz, you will not make any convincing argument as to why we should go with any sample rate higher than 48KHz for human listening.
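
A sketch of why (numpy/scipy, my own toy construction): synthesize a signal whose content is entirely below 20 kHz, capture it at 48 kHz and at 192 kHz, then rebuild the 192 kHz version purely from the 48 kHz samples. Nothing is lost.

    import numpy as np
    from scipy.signal import resample

    def tone(t):
        # arbitrary audible-band test content; the highest component is 19.98 kHz
        return (np.sin(2 * np.pi * 440 * t)
                + 0.3 * np.sin(2 * np.pi * 2_500 * t)
                + 0.1 * np.sin(2 * np.pi * 19_980 * t))

    dur = 0.1                                     # a whole number of periods of each tone
    x48 = tone(np.arange(int(48_000 * dur)) / 48_000)
    x192 = tone(np.arange(int(192_000 * dur)) / 192_000)

    rebuilt = resample(x48, len(x192))            # reconstruct the 192 kHz grid from 48 kHz data
    print(np.max(np.abs(rebuilt - x192)))         # ~1e-12: the 48 kHz capture lost nothing audible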

As for 24-bit, it is another production interchange format. It's there so that you don't need to stand around adjusting gain knobs on audio interfaces to get decent fidelity without clipping. With 24-bit you can just sample your audio once and, assuming it's within a reasonable range, adjust the gain in the discrete signal. There is some indication that 16-bit sampling is less than completely ideal: the very best ears in humanity (newborn ears) can distinguish about 21 bits within safe amplitude ranges, so 24-bit may make sense in an audio system for newborn babies.

I feel that in 2014, music should be released in the best format available from the studio, and if you want a crappier version so that you can shove only 100 songs on your iWhatever, that's your choice.

