The Naked Scientists

The Naked Scientists Forum

Author Topic: Choir volume: could more singers mean quieter?  (Read 12375 times)

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #25 on: 30/04/2013 23:11:37 »
No, no. Square wave is perfectly valid.

...

Randomly phased/frequency square waves approach a normal distribution with enough waves added together

Okay, so if you remove my simplifications you'd get straight to a graph showing the square root values 1, 1.41, 2, etc., but the square waves will eventually catch up with the graph when it gets to higher values. That means that when we're dealing with a million singers, we've only got 1/1000 the volume that we'd get if all their sounds were completely in phase with each other such that they added up without any cancellation. They aren't in phase though, so it looks to me as if 99.9% of the sound they're producing must be being cancelled out.

...and the sound power grows as the square of the sound pressure amplitude.
So sound power grows proportional to the number of singers, conserving energy.

Does that sound power include all the cancelled sound though? If not, there's a problem with it when you apply it to a whole lot of sine waves which are in phase with each other such that they add without cancellation, because then if you go from one sine wave with an amplitude of 1 and sound power of 1 to having a thousand sine waves with an amplitude of 1000, the sound power will be a million - that's 1000 times the amount of power put in. This is the key point that is blocking my way to understanding how the volume and power are related.
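A small simulation may help pin down the square-root law being discussed. It is only a sketch under stated assumptions: every singer is modelled as a unit-amplitude sine wave at the same frequency with an independent random phase, and "power" is taken as amplitude squared in arbitrary units. The function names are made up for illustration.

```python
import math
import random


def summed_amplitude(n_singers, rng):
    """Amplitude of n unit sine waves at one frequency, each with a random phase."""
    re = im = 0.0
    for _ in range(n_singers):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        re += math.cos(phase)
        im += math.sin(phase)
    return math.hypot(re, im)


def mean_power(n_singers, trials=2000, seed=1):
    """Average of amplitude squared over many random phase draws."""
    rng = random.Random(seed)
    total = sum(summed_amplitude(n_singers, rng) ** 2 for _ in range(trials))
    return total / trials


for n in (1, 4, 100):
    power = mean_power(n)
    # Mean power tracks N; the typical (RMS) amplitude tracks sqrt(N).
    print(n, round(power, 1), round(math.sqrt(power), 2))
```

Under these assumptions the averaged power comes out close to the number of singers, while the RMS amplitude is close to its square root.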
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
Re: Choir volume: could more singers mean quieter?
« Reply #26 on: 30/04/2013 23:24:17 »
Okay, so if you remove my simplifications you'd get straight to a graph showing the square root values 1, 1.41, 2, etc., but the square waves will eventually catch up with the graph when it gets to higher values. That means that when we're dealing with a million singers, we've only got 1/1000 the volume that we'd get if all their sounds were completely in phase with each other such that they added up without any cancellation. They aren't in phase though, so it looks to me as if 99.9% of the sound they're producing must be being cancelled out.
No, volume is best understood as power, rather than amplitude. The power is a million times more with a million singers.
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #27 on: 01/05/2013 18:34:23 »
Okay, so if you remove my simplifications you'd get straight to a graph showing the square root values 1, 1.41, 2, etc., but the square waves will eventually catch up with the graph when it gets to higher values. That means that when we're dealing with a million singers, we've only got 1/1000 the volume that we'd get if all their sounds were completely in phase with each other such that they added up without any cancellation. They aren't in phase though, so it looks to me as if 99.9% of the sound they're producing must be being cancelled out.

No, volume is best understood as power, rather than amplitude. The power is a million times more with a million singers.

It looks to me as if when the power put in is a million, the power cancelled out is nearly a million, and the power remaining in the sound is going to be a thousand. If you do the experiment with a million sine waves all perfectly aligned though, the amplitude will be a million rather than a thousand and the power in the sound will have to be a million too with no part of the sound cancelled out. If you align them with half opposing the other half, you'll put in a million units of power and get zero power in the sound, all of it being cancelled out instead (and maybe generating heat).

If you were to try to capture energy from sound, this would show how much power is really in the sound. There would be a certain amount of wobble in the wave which would carry extra power beyond the main amplitude, but I'd be surprised if that could make up the missing energy in a case where a million voices are only producing an amplitude of a thousand units.

If I'm right about this, then most sound really is cancelled out whenever there are large numbers of sound makers, but my initial mistake was to think that because the percentage cancelled out was heading for 100% it would eventually lead to silence: what would actually be happening is that the cancelled sound gets closer and closer to 100%, but it always falls short by an amount that grows in absolute terms and never declines. It's a case where we have cancellation percentages going up to 99.90% for a million singers, then 99.99990% for a trillion, but these numbers always end with something that isn't a 9 if you follow them far enough and they should never be rounded up to 100% if you're interested in the volume produced.
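The percentages quoted above can be checked with a few lines of arithmetic. This sketch assumes the in-phase amplitude is N and the random-phase amplitude is the square root of N, as in the rest of the thread; the function name is hypothetical.

```python
import math


def amplitude_fraction_cancelled(n_singers):
    """Share of the in-phase amplitude (N) that is missing when phases are random (sqrt N)."""
    return 1.0 - math.sqrt(n_singers) / n_singers


for n in (100, 10**6, 10**12):
    cancelled = 100.0 * amplitude_fraction_cancelled(n)
    # The cancelled share creeps towards 100%, yet the remaining amplitude keeps growing.
    print(n, f"{cancelled:.5f}%", "remaining amplitude:", math.sqrt(n))
```

The cancelled share approaches 100% while the remaining amplitude, the square root of N, keeps growing in absolute terms.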
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
Re: Choir volume: could more singers mean quieter?
« Reply #28 on: 01/05/2013 19:25:58 »
Well, the amplitude predominately cancels out, according to the square root law, but amplitude is not power.

You have to square amplitude to get power.

I believe that's your central mistake.

It's a bit like electrical voltage. You would think that power goes proportional to the voltage, but if you think about it, power nearly always goes as the square of the voltage, because when voltage goes up, so does the current; and it's current times voltage that is power (both DC and AC).

Similarly when you double the pressure/amplitude of the sound, the flow of the air in each vibration goes up too.
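The voltage analogy can be written out as a short sketch. It assumes a plain resistive load for the electrical case and a plane wave in air for the acoustic case; the 8 ohm load and the value of about 413 Pa·s/m for the characteristic impedance of air near 20 °C are illustrative assumptions, not figures from this thread.

```python
RHO_C_AIR = 413.0  # assumed characteristic impedance of air near 20 C, in Pa*s/m


def electrical_power(volts, ohms):
    """P = V * I with I = V / R, so power goes as the square of the voltage."""
    current = volts / ohms
    return volts * current


def plane_wave_intensity(p_rms):
    """Plane-wave sound intensity in W/m^2 from RMS pressure in Pa: I = p^2 / (rho * c)."""
    return p_rms ** 2 / RHO_C_AIR


for scale in (1, 2, 10):
    # Scaling voltage or pressure by k scales the power or intensity by k squared.
    print(scale, electrical_power(scale * 1.0, 8.0), plane_wave_intensity(scale * 0.02))
```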
« Last Edit: 01/05/2013 19:28:44 by wolfekeeper »
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #29 on: 02/05/2013 18:10:26 »
You have to square amplitude to get power.

I believe that's your central mistake.

I'd agree with that except for the problem that when you apply ten sine waves to a speaker with identical alignment, each having an amplitude of 1, the combined amplitude will be 10 rather than the square root of 10, so if you then square the amplitude 10 you get a power of 100 which is ten times as much power as you put in.
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
Re: Choir volume: could more singers mean quieter?
« Reply #30 on: 03/05/2013 21:43:33 »
This is a bit more subtle.

Basically if you're adding them electronically, there's no problem with that; you can get out more power than you put in, because you have an amplifier!

If, as a separate example, you're not adding them electronically, but you're adding them in the normal way, with all of them sending sound at you and being picked up by a microphone, then physics allows you to have more power available at one point in space. Conservation of energy only applies to the overall, total energy, and you can get nodes and antinodes that have much less or more than normal. By putting the microphone at the point where everything adds in phase, you've definitely picked an antinode.
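A quick numerical sketch of that point, assuming just two equal unit-amplitude waves of the same frequency meeting with some phase difference. The relative intensity at a point is then 2 + 2·cos(phase difference): four at a spot where they add in phase, zero where they oppose. Averaging over all phase differences (a stand-in here for averaging over positions) gives the plain sum of the two sources.

```python
import math


def intensity(phase_difference):
    """Relative intensity of two unit-amplitude waves meeting with this phase difference."""
    return 2.0 + 2.0 * math.cos(phase_difference)


steps = 3600
average = sum(intensity(2.0 * math.pi * k / steps) for k in range(steps)) / steps

# In phase, in antiphase, and averaged over every phase difference.
print(intensity(0.0), round(intensity(math.pi), 6), round(average, 6))
```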
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #31 on: 04/05/2013 20:46:07 »
This is a bit more subtle.

Basically if you're adding them electronically, there's no problem with that; you can get out more power than you put in, because you have an amplifier!

You can't hide the problem in the amplifier. If we have a speaker with ten independent sets of magnets controlling its movement, each set of magnets linked to a different input, we can have one input working on its own to produce an amplitude of 1, then add in a second (in exact alignment with the first) to produce an amplitude of 2, etc., all the way up to ten inputs generating a combined amplitude of 10. If you square the 10 to get the power, the official sound power is now ten times the amount of power that was put in.

Do the experiment again with the outputs out of alignment with each other and you'll get a combined amplitude of root 10, and when you square that you'll get the sound power in the sense that it's the amount of power put in. It looks to me though as if this isn't the real sound power because you could only tap a maximum of root 10 units of energy from the sound if you had a mechanism to capture all the power in a sound wave.
« Last Edit: 04/05/2013 20:47:48 by David Cooper »
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
Re: Choir volume: could more singers mean quieter?
« Reply #32 on: 04/05/2013 21:03:11 »
Oh well, then you've disproved conservation of energy.

;)

In reality the situations depend subtly on the differences between what you do.

If you just have microphones, each with electronic amplification, then there's basically no connection between the microphones, but if you have simple solenoid-type-magnets then the microphones can also act as loudspeakers, so that when one microphone gets pushed down by the air pressure, another will try to pop up, so it's not at all the same situation.
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #33 on: 05/05/2013 18:32:02 »
Let me simplify things further for you. Imagine a gong with ten people hitting it with their bongers (correct technical term for gong-hitting implements not known). If they all hit the gong with their bongers at exactly the same instant, the amplitude of the sound produced will be ten times as loud as if only one hits it. If they all hit it at different points in time though, with half of them hitting it when it's coming back towards them, they will cancel out a lot of the movement of the gong and it will be a lot quieter. It might be better to think of half of them hitting the gong from the other side, so if two bongers hit the gong at the same time from opposite sides, they will cancel each other out and create heat instead. If they're all deaf and blind, they will hit the gong at random times and produce an amplitude of root 10 with the sound energy supposedly being 10, but if they are able to coordinate the movement of their bongers perfectly they can generate a sustained amplitude of 10 with the sound energy supposedly being 100. It doesn't add up.
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
Re: Choir volume: could more singers mean quieter?
« Reply #34 on: 09/05/2013 15:33:54 »
Let me simplify things further for you. Imagine a gong with ten people hitting it with their bongers (correct technical term for gong-hitting implements not known). If they all hit the gong with their bongers at exactly the same instant, the amplitude of the sound produced will be ten times as loud as if only one hits it.
There will be ten times more energy in the gong.
Quote
If they all hit it at different points in time though, with half of them hitting it when it's coming back towards them, they will cancel out a lot of the movement of the gong and it will be a lot quieter.
No, that's not right. If they hit it at different times, they will add ten times more energy to the gong at different times. (To a pretty good approximation, it does depend a bit on precisely what way it's struck).
Quote
It might be better to think of half of them hitting the gong from the other side, so if two bongers hit the gong at the same time from opposite sides, they will cancel each other out and create heat instead.
Only if they hit it at EXACTLY the same time, then they will effectively not have hit the gong; the hammer will bounce back at exactly the same speed it was struck at, and no energy will be added to the gong, but in virtually any normal case, this perfect strike will not happen.
Quote
If they're all deaf and blind, they will hit the gong at random times and produce an amplitude of root 10 with the sound energy supposedly being 10, but if they are able to coordinate the movement of their bongers perfectly they can generate a sustained amplitude of 10 with the sound energy supposedly being 100. It doesn't add up.
The thing you can always hang your hat on is conservation of energy.
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #35 on: 09/05/2013 19:46:58 »
Let me simplify things further for you. Imagine a gong with ten people hitting it with their bongers (correct technical term for gong-hitting implements not known). If they all hit the gong with their bongers at exactly the same instant, the amplitude of the sound produced will be ten times as loud as if only one hits it.
There will be ten times more energy in the gong.

I'm not sure it actually works though as it's harder to push something that's already moving away from you, and ten bongers would accelerate the gong faster such that it may be hard for each bonger to transfer as much energy to it. It may be better to go back to using an example with ten sets of electromagnets.

Quote
Quote
If they all hit it at different points in time though, with half of them hitting it when it's coming back towards them, they will cancel out a lot of the movement of the gong and it will be a lot quieter.
No, that's not right. If they hit it at different times, they will add ten times more energy to the gong at different times. (To a pretty good approximation, it does depend a bit on precisely what way it's struck).

No, it's like trying to push a child on a swing when they're moving towards you - you end up absorbing energy from them instead and they swing less far afterwards. You still have to work hard to do this, and the energy must become heat.

Quote
Quote
It might be better to think of half of them hitting the gong from the other side, so if two bongers hit the gong at the same time from opposite sides, they will cancel each other out and create heat instead.
Only if they hit it at EXACTLY the same time, then they will effectively not have hit the gong; the hammer will bounce back at exactly the same speed it was struck at, and no energy will be added to the gong, but in virtually any normal case, this perfect strike will not happen.

It can happen and it's vital for science to account for that case. If you have a speaker in which the moving part contains a fixed magnet which is then made to move using electromagnets, it is possible to have ten sets of electromagnets which attempt to move the fixed magnet. If they all apply a force of 1 unit each in the same direction at the same time (which is very easy to arrange), the fixed magnet will move with an amplitude of 10 units (while a single set of electromagnets applying a force of 1 would lead to an amplitude of 1). If you make five of the electromagnets apply their force in the opposite direction, the speaker magnet won't move at all. If you let the electromagnets apply force at random times, the amplitude will be root 10.
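The three cases can be put side by side numerically. This is only a sketch: each set of electromagnets is modelled as a unit sine drive with a phase, which is an idealisation of the speaker described above, and the function name is hypothetical.

```python
import math
import random


def combined_amplitude(phases):
    """Amplitude of the sum of unit sine drives with the given phases."""
    re = sum(math.cos(p) for p in phases)
    im = sum(math.sin(p) for p in phases)
    return math.hypot(re, im)


aligned = combined_amplitude([0.0] * 10)                 # all ten pushing together
opposed = combined_amplitude([0.0] * 5 + [math.pi] * 5)  # five against five

rng = random.Random(7)
trials = 5000
mean_square = sum(
    combined_amplitude([rng.uniform(0.0, 2.0 * math.pi) for _ in range(10)]) ** 2
    for _ in range(trials)
) / trials

print(round(aligned, 6), round(opposed, 6), round(math.sqrt(mean_square), 2))
```

Under this model the aligned case gives an amplitude of 10, the opposed case gives essentially zero, and the random case gives an RMS amplitude close to root 10. The amplitude squared therefore ranges from 0 to 100, and its average over random phases sits near 10.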

You can argue that 10 units of energy is put in in one case, no energy at all is put in in another case, and root 10 units of energy in the third, but in each case the energy will have been put into the electromagnets regardless of the result. The key point though is that if you do this, you can't then argue that the energy in the sound is the square of the amplitude, because the first case would need the sound energy to be 100 units. That is the point that needs to be addressed.
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
Re: Choir volume: could more singers mean quieter?
« Reply #36 on: 09/05/2013 20:46:25 »
The thing you're continually failing to understand is that the details, really, really matter.

The way the clappers hit the gong, the shape of the clappers, the timing of the clappers, whether the particular point on the gong that it hits is moving or not.

In the normal case, where you hit the gong with ten clappers, the energy added to the gong is ten times the amount of energy, and most of it will NOT end up at the centre; the wave energy will spread out in different directions in an inverse law from each clapper, and only a small fraction ends up in the centre at all.
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #37 on: 10/05/2013 17:32:35 »
The thing you're continually failing to understand is that the details, really, really matter.

Of course the details matter, but you're chasing all manner of kleinigkeiten instead of tackling the point that really matters.

Quote
The way the clappers hit the gong, the shape of the clappers, the timing of the clappers, whether the particular point on the gong that it hits is moving or not.

Use your imagination and picture the difference between ten bongers hitting the gong at the same instant in the same direction and ten bongers hitting it with the same force but at random times such that some of them reduce movement of the gong rather than adding to it. There is a clear difference between these two cases and it will result in a different amplitude for the sound produced by the movement of the gong (though it's also important not to be confused by the sound of the bongers hitting the gong as these are additional sounds and not the main event - the bongers could actually be padded such that there is no impact sound while still transferring energy to make the gong bong).

Quote
In the normal case, where you hit the gong with ten clappers, the energy added to the gong is ten times the amount of energy, and most of it will NOT end up at the centre; the wave energy will spread out in different directions in an inverse law from each clapper, and only a small fraction ends up in the centre at all.

If you've got a real big gong, all ten bongers can hit it practically at the centre. In fact, we can even eliminate this trivial problem altogether by having them hit the exact same central point, not at the same time but at intervals matching the frequency of the gong bong such that the resulting waves match up exactly and add together without any cancellation. If they hit at random times instead, the amplitude after ten bongers have hit will be considerably less.
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #38 on: 17/08/2014 19:39:24 »
Update on the sound analysis program I was writing:-

The original way I tried to do it produced semi-useful results, but I switched to working with area instead and that worked much better, enabling me to isolate pure frequencies without any extra work to eliminate false results. The method can indeed be used as an alternative to FFT, though I don't know how well it compares in speed terms.

The method works by storing the area enclosed by the wave in alternating accumulators, with a stretch of the wave being divided up into equal-length chunks and the area of all the odd numbered chunks being added to a variable summing up the odd chunks while the even numbered chunks' areas are added to a different variable summing up the even chunks. It's a little more complicated than that because a high-frequency wave on top of a low frequency wave will often be superimposed on a steep slope which can generate false results, so you actually have to take the start and end altitudes of the chunk, average them, multiply by the length of the chunk and then subtract that from the area in order to isolate any area caused by a deviation away from a straight line. There is still some noise left over, but it is trivial.

The process would be extremely slow if it was done carelessly as there would be an enormous amount of adding up of areas for different frequencies, but it is only necessary to add results to one of the summing variables when a chunk ends and its variables need to be switched round. Each sample is simply added to a single variable used for all frequencies which keeps track of the area enclosed by the wave and its value alternates between positive and negative, and I call that variable "totality". Each frequency has its own set of variables, one of which stores the value of totality the previous time the chunk for that frequency ran out, thereby making it easy to work out the change in area since then. The previous altitude also has to be stored so that the area to be subtracted can be worked out.
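One possible reading of the method described above, written as a rough Python sketch rather than as the actual program. The assumptions are mine: chunk lengths are whole numbers of samples equal to half a period, the area is taken with the trapezoid rule, and the shared running total ("totality") is kept as a list of prefix sums so that any chunk's area is a single subtraction. The names `running_total` and `chunk_score` are hypothetical.

```python
import math


def running_total(samples):
    """Prefix sums: total[i] is the sum of samples[0:i], shared by every frequency."""
    total = [0.0]
    for s in samples:
        total.append(total[-1] + s)
    return total


def chunk_score(samples, total, chunk_len):
    """Alternating-chunk area score for one chunk length (half a period, in samples)."""
    even = odd = 0.0
    chunks = 0
    start = 0
    while start + chunk_len < len(samples):
        end = start + chunk_len
        # Trapezoid-rule area under the wave between the two chunk boundaries.
        area = total[end] - total[start] + 0.5 * (samples[end] - samples[start])
        # Area under the straight line joining the chunk's start and end altitudes.
        chord = 0.5 * (samples[start] + samples[end]) * chunk_len
        if chunks % 2 == 0:
            even += area - chord
        else:
            odd += area - chord
        chunks += 1
        start = end
    return (even - odd) / (chunks * chunk_len) if chunks else 0.0


RATE = 8000  # assumed sample rate in Hz
tone = [math.sin(2 * math.pi * 100 * k / RATE) for k in range(RATE)]
slow = [50.0 * math.sin(2 * math.pi * 5 * k / RATE) for k in range(RATE)]
mixed = [a + b for a, b in zip(tone, slow)]

chunk_len = RATE // (2 * 100)  # half a period of the 100 Hz tone
score_tone = chunk_score(tone, running_total(tone), chunk_len)
score_mixed = chunk_score(mixed, running_total(mixed), chunk_len)
print(round(score_tone, 3), round(score_mixed, 3))
```

In this sketch a well-aligned unit sine scores about 2/π (the mean height of a half lobe), and riding on a much larger 5 Hz wave barely changes that. A real implementation would also need the second, offset set of accumulators to cope with arbitrary alignment.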

Each frequency has two sets of variables rather than one so that they can analyse the wave from offset starting points, then their results are combined to work out the true amplitude (combined does not mean added - it's a little more complicated than that). I'm currently working with 8 octaves at quarter-tone resolution, but will probably switch to semitones or even whole tones to speed things up (because frequencies in between can still be worked out from that).

Each set of variables for a frequency is pointed to by a shorter entry in a buffer, and these entries are kept in expiry order - the ones that time out first get their clock reset and are put further back in the queue for next time. Sorting them into order is the slowest part of the process, but reducing the number of frequencies tested for will speed things up a lot. Another thing that will speed it up is avoiding testing all the time for the higher frequencies - they only need to be checked for occasionally to see if they're still active, so their clocks can be set to put them a long way down the queue, to the point that there's very little processing needing to be done at all.

I haven't done much work on optimising it yet though because I'm still working on phoneme recognition with both spoken and whispered sounds. Another thing I plan to do is test how good the results are by trying to compress sound in the same kind of way as MP3 does, playing back the results to see how good/poor the sound quality is, but that may take some time. I'm concentrating on speech recognition first.
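The expiry-ordered queue could be sketched with a binary heap, which keeps the soonest-expiring entry at the front without re-sorting everything. This is an illustration under assumptions of my own: a 44100 Hz sample rate, a lowest frequency of 55 Hz, and one chunk boundary per half period. Only the figures of 8 octaves at quarter-tone resolution come from the post.

```python
import heapq

SAMPLE_RATE = 44100     # assumed sample rate in Hz
BASE_HZ = 55.0          # assumed lowest analysed frequency
STEPS_PER_OCTAVE = 24   # quarter tones
OCTAVES = 8

frequencies = [BASE_HZ * 2 ** (k / STEPS_PER_OCTAVE)
               for k in range(OCTAVES * STEPS_PER_OCTAVE)]
half_periods = [SAMPLE_RATE / (2.0 * f) for f in frequencies]  # chunk length in samples


def count_chunk_ends(n_samples):
    """Count how many chunk boundaries each frequency reaches within n_samples."""
    queue = [(half_period, i) for i, half_period in enumerate(half_periods)]
    heapq.heapify(queue)
    ends = [0] * len(frequencies)
    while queue[0][0] <= n_samples:
        expiry, i = heapq.heappop(queue)
        ends[i] += 1  # the per-frequency accumulators would be updated here
        heapq.heappush(queue, (expiry + half_periods[i], i))  # reset its clock
    return ends


ends = count_chunk_ends(4000)
print(len(frequencies), ends[0], ends[-1])
```

Each pop and push costs on the order of log(192) steps, so the work is dominated by the highest frequencies, which is consistent with the idea of checking those only occasionally.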
 

Offline evan_au

  • Neilep Level Member
  • ******
  • Posts: 4130
  • Thanked: 249 times
Re: Choir volume: could more singers mean quieter?
« Reply #39 on: 18/08/2014 13:28:05 »
It seems that the new algorithm multiplies the input signal by a +/-1 square wave.

This will pick out different frequencies in the input signal, but:
  • The square wave consists of many different frequencies (f, 3f, 5f, etc), and so it will respond to many different frequencies. It may be better to multiply by a sine wave.
  • The square wave will not detect sine waves which are at 90 degrees phase to the square wave. It may be better to multiply by two square waves, at 90 degrees phase difference.
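Both bullet points can be checked with a small experiment. The sketch below assumes an 8000 Hz sample rate and a 100 Hz square wave, and measures the average of (input sine × square wave) over one second; the function names are illustrative only.

```python
import math

RATE = 8000  # assumed sample rate in Hz


def square(t, f, quadrature=False):
    """+1/-1 square wave at frequency f; the quadrature version leads by 90 degrees."""
    cycles = f * t + (0.25 if quadrature else 0.0)
    return 1.0 if cycles % 1.0 < 0.5 else -1.0


def response(signal_hz, signal_phase, square_hz, quadrature=False):
    """Mean of sine * square over one second, sampled at mid-sample instants."""
    total = 0.0
    for k in range(RATE):
        t = (k + 0.5) / RATE
        sine = math.sin(2.0 * math.pi * signal_hz * t + signal_phase)
        total += sine * square(t, square_hz, quadrature)
    return total / RATE


print(round(response(100, 0.0, 100), 3))                 # matching sine, in phase
print(round(response(300, 0.0, 100), 3))                 # third harmonic leaks through
print(round(response(200, 0.0, 100), 3))                 # even harmonic is rejected
print(round(response(100, math.pi / 2, 100), 3))         # 90 degrees off: blind spot
print(round(response(100, math.pi / 2, 100, True), 3))   # quadrature square recovers it
```

The third-harmonic response comes out at about a third of the fundamental's, and the 90 degree case reads zero until the second, quadrature square wave is used.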
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #40 on: 18/08/2014 17:39:02 »
I'm not doing anything with a square wave. The only multiplying being done is to calculate areas to subtract to eliminate errors, as explained below. I can't work out how the diagram creating thing on this forum works, so I'll have to do this with text graphics and hope it looks the same on your machine.

          _____
        -       -
      /           \
----/---------------\---------------/----
                      \           /
                        - _____ -
    a   b   c   d   a'  b'  c'  d'  a

If you imagine that as a sine wave centred on the X-axis, adding up the area from a to a' by adding all the samples in that region together will contrast greatly with the negative area recorded from a' to the next a. By subtracting the latter from the former, you end up with a substantial result. If you start at c instead and add up the area to c', you get 0, and from c' to the next c will also record 0, so alignment is important if you are to be sure of detecting a signal. However, if you use both alignments rather than just one, you can combine the results to get the whole truth out of it.

The actual alignment you use still depends on luck, but if you happen to work from b to b' and then b' to the next b while also working from d to d' and d' to the next d, you will get two scores which can be adjusted according to their relative size to get a fairly accurate representation of the actual areas enclosed by the sine wave (done quickest by looking up a table to see how much adjustment to use). So, the alignment doesn't matter - you can do everything with just two starting positions.
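To illustrate why two starting positions are enough, here is a sketch that combines the two scores as a root sum of squares. That combination is an assumption on my part (the post uses a lookup table); for a pure sine it gives the same answer whatever the alignment. The chunk length is half a period and the second set of chunks starts a quarter period later.

```python
import math


def alternating_area(samples, chunk_len, offset):
    """Mean of the chunk areas taken with alternating sign, starting at `offset`."""
    score, sign, chunks = 0.0, 1.0, 0
    start = offset
    while start + chunk_len <= len(samples):
        score += sign * sum(samples[start:start + chunk_len])
        sign = -sign
        chunks += 1
        start += chunk_len
    return score / (chunks * chunk_len)


def combined(samples, chunk_len):
    """Root-sum-square of the scores from two alignments a quarter period apart."""
    a = alternating_area(samples, chunk_len, 0)
    b = alternating_area(samples, chunk_len, chunk_len // 2)
    return math.hypot(a, b)


RATE, FREQ = 8000, 100  # assumed sample rate and test tone
for phase in (0.0, 0.7, math.pi / 2, 2.5):
    wave = [math.sin(2 * math.pi * FREQ * k / RATE + phase) for k in range(RATE)]
    print(round(phase, 2), round(combined(wave, RATE // (2 * FREQ)), 3))
```

Each line prints roughly the same value (about 2/π for a unit sine), even though the individual scores swing between zero and their maximum as the phase changes.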

Now imagine that the sine wave above is actually wandering about upon a much longer wavelength wave, so the horizontal line through the middle of it is centered not on Y=0, but Y=100. The same process can be used as before, and when we subtract the area from a' to the next a from the area from the first a to a', we get exactly the same value for the area enclosed between the sine wave and the straight line drawn through the middle of it. What I do though is subtract the area between that line and the X-axis first, even though it makes no difference to the end result in this case. The reason for subtracting this area is to cover cases where the straight(ish) line through the middle of the sine wave is running on a slant, which it will be most of the time because our small sine wave is weaving about on top of another sine wave with a much greater wavelength. That line is not entirely straight, but the errors are small. When that line is tilted, the area underneath it needs to be subtracted because it can distort the results badly otherwise. If the alternating totals for the descent start with a to a' followed by a' to the next a, followed by the next a to a', etc., then all the a to a' sections are bigger than the a' to a sections that follow, so a large false result will be building up as you go along, and this error may not be cancelled out on the way back up again: if the alignment on the way back up starts with a' to a instead of a to a', the error is doubled instead of cancelled, so a high frequency can be detected where there is no sound at that frequency at all. By subtracting the area between the centreline of the wave we're trying to detect and the X-axis, we are left with the area enclosed by that wave itself, plus a small error due to the centreline of the wave not quite being straight, but that's a small error which will not be sufficiently large to affect the result - it will just be low-level noise.
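The effect of a slope is easy to reproduce. In this sketch (same assumptions as before: trapezoid areas, chunk length of half a period, hypothetical function name) a steady descent with no oscillation at all produces a large false score unless the area under the straight line between the chunk's end points is subtracted first.

```python
import math


def alternating_score(samples, chunk_len, subtract_chord):
    """Alternating-sign sum of chunk areas, optionally minus the area under each chord."""
    score, sign, chunks = 0.0, 1.0, 0
    start = 0
    while start + chunk_len < len(samples):
        end = start + chunk_len
        # Trapezoid-rule area between the chunk boundaries.
        area = sum(samples[start:end]) + 0.5 * (samples[end] - samples[start])
        if subtract_chord:
            area -= 0.5 * (samples[start] + samples[end]) * chunk_len
        score += sign * area
        sign = -sign
        chunks += 1
        start = end
    return score / (chunks * chunk_len)


ramp = [1000.0 - 0.5 * k for k in range(801)]  # steady descent, no tone present
tone_on_ramp = [r + math.sin(2 * math.pi * k / 80) for k, r in enumerate(ramp)]

print(alternating_score(ramp, 40, False), alternating_score(ramp, 40, True))
print(round(alternating_score(tone_on_ramp, 40, False), 3),
      round(alternating_score(tone_on_ramp, 40, True), 3))
```

Without the subtraction the bare ramp scores far higher than a genuine unit tone would; with it the ramp scores zero and the tone riding on the ramp is measured at about 2/π.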

The result of using this method is that I'm getting very clean-looking data out when working with computer generated waves, actual recordings of notes from musical instruments, and from speech sounds where the base note and a variety of harmonics stand out as bright bands against a black background.
« Last Edit: 18/08/2014 19:36:21 by David Cooper »
 

Offline evan_au

  • Neilep Level Member
  • ******
  • Posts: 4130
  • Thanked: 249 times
Re: Choir volume: could more singers mean quieter?
« Reply #41 on: 18/08/2014 22:09:25 »
OK, with this second description, it seems that the new algorithm consists of:
  • "Adding up the area", which is equivalent to the mathematical operation of integration. In the audio domain, it is also equivalent to a low-pass filter, which will attenuate higher frequencies, and be better at extracting the fundamental frequency.
  • Adjusting the boundaries near the zero crossings, which is a form of interpolation
  • Detecting zero crossings, which is a way to estimate frequency.
  • Separately accumulating the positive and negative areas, and then comparing them at the end. Mathematically, this is just the same as adding them all up (comparison is a form of subtraction)
The problem with using zero crossings is that it only considers the input values when the waveform is near the zero crossings - this is why interpolation is so important. The Fourier transform (or multiplying by a square/sine wave) uses the entire waveform, and so is more sensitive; interpolation is not needed for these algorithms.

[The earlier description sounded like the input was partitioned into fixed-length sections which were accumulated separately - this is equivalent to multiplying by a square wave. The second description sounds like the positive and negative segments of the input waveform are accumulated separately.]
 

Offline alancalverd

  • Global Moderator
  • Neilep Level Member
  • *****
  • Posts: 4728
  • Thanked: 155 times
  • life is too short to drink instant coffee
Re: Choir volume: could more singers mean quieter?
« Reply #42 on: 19/08/2014 07:54:59 »
The square root summation of noise has significant sociological implications.

If you have a crowd of N people with random motivations, you only need to consistently coordinate the actions and voting of √N in order to achieve a particular objective over a long period.

This is the mathematical basis of radical politics and successful religion. Beware the patient few!
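As a toy check of the √N claim, the sketch below assumes every uncoordinated person votes yes or no by a fair coin flip, while a bloc of √N always votes yes. The numbers (10,000 people, a bloc of 100) and the function name are made up for illustration.

```python
import random


def bloc_win_rate(n_people, bloc_size, trials=2000, seed=3):
    """Fraction of votes in which 'yes' wins when bloc_size people always vote yes."""
    rng = random.Random(seed)
    n_random = n_people - bloc_size
    wins = 0
    for _ in range(trials):
        # Each random bit is one coin-flip voter: 1 = yes, 0 = no.
        yes_votes = bin(rng.getrandbits(n_random)).count("1")
        margin = (2 * yes_votes - n_random) + bloc_size
        wins += margin > 0
    return wins / trials


print(bloc_win_rate(10_000, 0), bloc_win_rate(10_000, 100))
```

In this model the random voters' net margin has a spread of about √N, so a bloc of that size wins well over half of the individual votes, though not every one.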
 
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
Re: Choir volume: could more singers mean quieter?
« Reply #43 on: 19/08/2014 19:59:21 »
"Adding up the area", which is equivalent to the mathematical operation of integration. In the audio domain, it is also equivalent to a low-pass filter, which will attenuate higher frequencies, and be better at extracting the fundamental frequency.

It works equally well at high and low frequencies, although more noise may be appearing at very high frequencies when lots of sounds at different frequencies are present. I can't yet tell if it's all noise - a lot of it probably is real signal, and it may even be that most of it is. What is certain though is that if I produce artificial waves with two frequencies present, one high and one low, both show up clearly. When looking at more natural sounds where there is more complexity in them, there is a range of harmonics showing up with musical instruments and speech, and there's a fair bit of white noise with speech.

What I'm seeing visually with speech is a strong base note with a strong harmonic an octave up, and with the sound "oo" that's practically all there is to the sound, apart from a little white noise at very high pitch caused by air being forced through a restricted opening. With "oh" there's a second harmonic coming in a fifth up (three and a half tones higher). With "aw" another strong harmonic comes in two octaves up from the base note and more come in with "ah" while the lowest harmonic weakens. Similar things happen at the low frequency end with "ee", "ay", "e" (as in "bed") and "a" (as in "cat"), but there are higher frequency components of white noise with them which distinguish them from the other set. The "ugh" vowel in "bird" seems to dampen all of the harmonics. I'm seeing enough detail to tell them apart by eye, so it should be possible to write code that can do the same. (I could tell them apart by eye with the original method too, even with false signals all over them, but the processing would have been more complicated and the extra code needed to eliminate the false signals would always have been slow - I never got round to trying it out but just put it on the shelf to get on with other work instead while waiting for the right idea to do it properly.)

Quote
Adjusting the boundaries near the zero crossings, which is a form of interpolation

Detecting zero crossings, which is a way to estimate frequency.

What I'm doing is detecting crossings not just at zero, but at any altitude where the wave repeatedly switches direction (up/down), and also detecting crossings of a steeply sloping line.

Quote
Separately accumulating the positive and negative areas, and then comparing them at the end. Mathematically, this is just the same as adding them all up (comparison is a form of subtraction)

It is indeed just adding them up - I'm adding up all the area between a sine wave of a specific frequency as it oscillates on top of a wandering line regardless of where that wandering line goes and what angle it is tilted at. Some of that recorded oscillation does not belong to the frequency I'm measuring, but it generally cancels out over a long enough stretch, causing the detection of false signals at some other frequencies, but at low enough levels for the real signals to outgun them. (I'm doing the same thing for 192 different frequencies adding up different alternating areas for each, but all done in one pass to cut the amount of processing required to a tiny fraction of what would be required if they were all counted up separately.)
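
To make the "adding up alternating areas" idea concrete, here is a minimal Python sketch. It is not the program described in this post: it handles one frequency at a time, it does not track a wandering baseline, and the names `square_wave_correlation` and `strength` are made up for illustration. It simply adds samples with a plus sign during one half of each cycle and a minus sign during the other, which is the same as multiplying by a square wave. Two offsets a quarter of a cycle apart are combined so that the result does not depend on where in the cycle the recording starts.

```python
import math

def square_wave_correlation(samples, sample_rate, freq, phase_offset=0.0):
    """Add up the signal, flipping the sign every half period of `freq`."""
    total = 0.0
    for n, x in enumerate(samples):
        # Position within the current cycle, as a fraction in [0, 1).
        cycle_pos = (n * freq / sample_rate + phase_offset) % 1.0
        sign = 1.0 if cycle_pos < 0.5 else -1.0
        total += sign * x
    return total / len(samples)

def strength(samples, sample_rate, freq):
    """Combine two offsets a quarter cycle apart (illustrative choice)."""
    a = square_wave_correlation(samples, sample_rate, freq, 0.0)
    b = square_wave_correlation(samples, sample_rate, freq, 0.25)
    return math.hypot(a, b)

if __name__ == "__main__":
    rate = 8000
    tone = [math.sin(2.0 * math.pi * 200.0 * n / rate) for n in range(rate)]
    print(strength(tone, rate, 200.0))  # roughly 2/pi for a unit sine
    print(strength(tone, rate, 317.0))  # close to zero
```

For a pure sine of amplitude A at the tested frequency the result comes out near 2A/pi, and a tone at an unrelated frequency largely cancels over a long enough stretch, which matches the cancellation described above.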

Quote
The problem with using zero crossings is that it only considers the input values when the waveform is near the zero crossings - this is why interpolation is so important.

I'm collecting the same quality of data on crossings at other altitudes and on slopes by adjusting to make the wave I'm testing for act as if it is oscillating across the X-axis at all times, and I'm doing this for all frequencies.
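
For reference, the plain zero-crossing estimate with interpolation that the quote refers to can be sketched as follows. This is only the textbook version for a wave oscillating about zero, not the extended method described in this post (crossings at other altitudes and on slopes are not handled), and `zero_crossing_frequency` is a hypothetical name.

```python
import math

def zero_crossing_frequency(samples, sample_rate):
    """Estimate frequency from upward zero crossings, linearly interpolated."""
    crossings = []
    for n in range(1, len(samples)):
        prev, cur = samples[n - 1], samples[n]
        if prev < 0.0 <= cur:
            # Fraction of a sample between index n-1 and the estimated crossing.
            frac = -prev / (cur - prev)
            crossings.append(n - 1 + frac)
    if len(crossings) < 2:
        return None  # not enough crossings to measure a period
    periods = len(crossings) - 1
    return periods * sample_rate / (crossings[-1] - crossings[0])

if __name__ == "__main__":
    rate = 8000
    tone = [math.sin(2.0 * math.pi * 440.0 * n / rate + 0.3) for n in range(rate)]
    print(zero_crossing_frequency(tone, rate))
```

The interpolation step is what places each crossing between two samples instead of snapping it to the nearest one, which matters most at high frequencies where there are only a few samples per cycle.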

Quote
The Fourier transform (or multiplying by a square/sine wave) uses the entire waveform, and so is more sensitive; interpolation is not needed for these algorithms.

If I could work out how FFT is done, I could try programming it, but I get lost as soon as "i" comes into an equation. I'm looking for a simpler way of doing things because of that, and I suspect that what I'm doing is closer to what the ear and brain do (and that they have to deal with the same noise issues). I don't have the brain's advantage of parallel processing though, so I'm looking for ways to speed up the process, such as adding up the area only once for all the different frequencies instead of doing it individually for all 192 of them - the areas only get added to the variables for individual frequencies when they switch from looking for positive areas to looking for negative areas.
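
The "i" in the Fourier formulas is mostly bookkeeping for a pair of ordinary sums, one against a cosine and one against a sine. The following Python sketch measures a single frequency that way with no complex numbers at all. It is the slow, direct Fourier sum for one frequency rather than the FFT (the FFT is a faster way of getting many such results at once), and `dft_magnitude` is a name chosen here for illustration.

```python
import math

def dft_magnitude(samples, sample_rate, freq):
    """Strength of `freq` in `samples`, using only real sine and cosine sums."""
    cos_sum = 0.0
    sin_sum = 0.0
    for n, x in enumerate(samples):
        angle = 2.0 * math.pi * freq * n / sample_rate
        cos_sum += x * math.cos(angle)  # the "real part" in the textbooks
        sin_sum += x * math.sin(angle)  # the "imaginary part" in the textbooks
    # Scaled so that a sine of amplitude A at `freq` gives roughly A.
    return 2.0 * math.hypot(cos_sum, sin_sum) / len(samples)

if __name__ == "__main__":
    rate = 8000
    tone = [0.5 * math.sin(2.0 * math.pi * 200.0 * n / rate) for n in range(rate)]
    print(dft_magnitude(tone, rate, 200.0))  # about 0.5
    print(dft_magnitude(tone, rate, 300.0))  # about 0
```

Because every sample is multiplied by a smooth sine and cosine rather than by plus or minus one, this uses the whole waveform and needs no interpolation at the boundaries.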

It is currently taking 40 times as long to analyse a sound file as the sound file takes to play (that's working on a slow Atom processor) [correction: 20 times as long, but 40x as long as an MP3 encoder running on the same machine which is doing the same job and more], but there are a number of things I can do to speed that up. I'm currently using 4 staggered offsets per frequency instead of 2, and there's no advantage in using 4 over 2. Changing to using only 2 will halve the processing time, and more because it'll speed up the queue sorts even more. I could also drop the quarter tones for another doubling in speed and I should still be able to detect sounds at those frequencies using the ones to either side. I might not even need the semitones. Alternatively, I could retain them all but switch them in and out dynamically when they're required for testing exact pitch, but I don't want to rush into writing more complex code that may not work. I could drop the whole of the top octave as it doesn't show very much, and it uses more than twice as much processor time as the octave below it, 4 times as much as the one below that, and so on all the way down. I could switch over to occasional sampling of higher frequencies instead of testing for them continuously; I'm planning to do that first as it might speed things up by as much as 100 times (while all those frequencies could still be monitored continually for activity without the same degree of precision) [though this would depend on not monitoring stretches where a sound can be assumed to continue for a time without changing]. I could also drop one of the channels and just do mono, but I programmed it to be stereo from the outset as I want to be able to detect horizontal position at some stage to help separate out different voices [though dropping stereo would hardly speed it up at all]. Anyway, by doing most of those things, it should be able to process the data in real time as it comes in from a microphone.
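
As a rough illustration of the frequency grid, 192 frequencies spaced a quarter tone apart would cover 8 octaves at 24 steps per octave. The sketch below builds such a table in Python. The 8-octave layout is inferred from the numbers in this post, and the starting frequency of 27.5 Hz and the name `quarter_tone_table` are assumptions made purely for the example.

```python
def quarter_tone_table(base_hz=27.5, count=192, steps_per_octave=24):
    """Frequencies spaced a quarter tone apart, starting at base_hz (assumed)."""
    return [base_hz * 2.0 ** (k / steps_per_octave) for k in range(count)]

if __name__ == "__main__":
    freqs = quarter_tone_table()
    print(len(freqs))            # 192 entries
    print(freqs[0], freqs[24])   # one octave apart
    print(freqs[::2][:4])        # dropping quarter tones leaves the semitones
```

Each octave up doubles every frequency in it, so any work that is done once per cycle doubles as well, which fits the observation that the top octave costs about as much as all the lower ones put together.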
I'd like to try FFT as well, but I'll see how far I can get with this approach first. I want to test the quality of the processed signal by creating artificial waves at the right strengths for all the frequencies detecting activity and then to use them to build a new sound wave that should sound something like the original - that should give me a direct demonstration of how much noise has been added, but all that really matters is that the results are clear enough to detect speech sounds correctly, and the visual evidence shows me that sufficient detail is there. It's another matter entirely though working out how to probe the data with code to interpret it - lots of the experiments I did last time failed to detect subtle differences that I could see by eye. It should be easier this time as the signal stands out much more clearly, but I still expect it to take a lot of work, and success is not guaranteed.
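
The rebuild test described above amounts to adding together one sine wave per detected frequency. A minimal Python sketch follows, assuming the analysis yields a list of (frequency, strength) pairs; that data layout and the name `resynthesise` are hypothetical. Every sine starts at phase zero here, so the rebuilt wave will not match the original waveform shape even when the frequency content is right.

```python
import math

def resynthesise(components, sample_rate, duration):
    """Build a wave from (frequency_hz, amplitude) pairs by adding sines."""
    count = int(sample_rate * duration)
    out = []
    for n in range(count):
        t = n / sample_rate
        out.append(sum(a * math.sin(2.0 * math.pi * f * t) for f, a in components))
    return out

if __name__ == "__main__":
    # A base note with a weaker harmonic an octave up (illustrative values).
    wave = resynthesise([(100.0, 1.0), (200.0, 0.5)], 8000, 0.01)
    print(len(wave), wave[20])
```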
« Last Edit: 20/08/2014 16:58:45 by David Cooper »
 

Offline evan_au

  • Neilep Level Member
  • ******
  • Posts: 4130
  • Thanked: 249 times
    • View Profile
Re: Choir volume: could more singers mean quieter?
« Reply #44 on: 19/08/2014 22:43:00 »
Quote
altitude [amplitude] where the wave repeatedly switches direction (up/down)
  • Detecting changes in direction is equivalent to the mathematical function of differentiation, or in the audio domain it is a high-pass filter.
  • But integration and differentiation are opposites of each other (an inverse function), so if you are differentiating and then integrating you will be doing a lot of processing work to end up back where you started.

In any computer program (especially doing signal processing, like this one), there are some central loops which may make up only 1% of the code, but take up a majority of the execution time. Focus in on these, and small changes can often make a large impact on processing time.
 
Have a look at a public domain MP3 encoder. This will include code for a FFT.

The MP3 encoder detects the main frequencies present, then uses a model of the human auditory system to throw away sounds which are not consciously audible to humans. This may "clean up" the signal so there is less to process - hopefully without throwing away sounds which are subconsciously audible!
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
    • View Profile
Re: Choir volume: could more singers mean quieter?
« Reply #45 on: 20/08/2014 17:20:48 »
Quote
altitude [amplitude] where the wave repeatedly switches direction (up/down)
  • Detecting changes in direction is equivalent to the mathematical function of differentiation, or in the audio domain it is a high-pass filter.
  • But integration and differentiation are opposites of each other (an inverse function), so if you are differentiating and then integrating you will be doing a lot of processing work to end up back where you started.

I'm definitely not doing and undoing anything - all I'm doing is calculating area and collecting totals with different boundaries. (I haven't been placing the boundaries with exact precision though, so the errors are highest at the highest frequencies, and that's where noise is being generated that is hiding any real signal. I need to reprogram that part of things, though the sounds in that area are higher than those that can pass through a telephone, so I could just drop them.)

Quote
In any computer program (especially doing signal processing, like this one), there are some central loops which may make up only 1% of the code, but take up a majority of the execution time. Focus in on these, and small changes can often make a large impact on processing time.

Indeed, and I know that it's my sort routine that's slowing things down the most. It was written to be lightning fast for one specific application but is very slow for most other purposes. I'm only using it for this because it's reliable, but I'll write a new one to go with this program later. By reducing the number of entries being moved down the queue at a time though, the sort speed becomes less important, to the point that a slow sort routine could be just as fast as a fast one due to the tiny amount of work being done, so I'm not going to rush into fixing that until I know if it's worth the effort.
 
Quote
Have a look at a public domain MP3 encoder. This will include code for a FFT.

I avoid looking at other people's code because it can interfere with your right to write and distribute your own, so I'll seek help with understanding FFT on a maths forum at some point instead. I'll continue with what I'm already doing first though, because I think it could be just as fast if it's done the right way, and it might even be faster.
 

Offline evan_au

  • Neilep Level Member
  • ******
  • Posts: 4130
  • Thanked: 249 times
    • View Profile
Re: Choir volume: could more singers mean quieter?
« Reply #46 on: 20/08/2014 22:33:38 »
Quote
I'm definitely not doing and undoing anything
Algorithms for integration and differentiation will look very different from each other, but that doesn't stop them being inverses of each other.

Just like the methods we are taught for doing multiplication and division "by hand" look very different from each other, but they are still inverse functions.
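
A small Python sketch of the inverse relationship for sampled signals, where differentiation becomes "subtract the previous sample" and integration becomes "keep a running total". The function names are chosen here for illustration. The two loops look nothing alike, yet applying one after the other returns the original samples.

```python
def difference(samples):
    """Discrete differentiation: each sample minus the one before it."""
    prev = 0.0
    out = []
    for x in samples:
        out.append(x - prev)
        prev = x
    return out

def running_sum(samples):
    """Discrete integration: running total of the samples."""
    total = 0.0
    out = []
    for x in samples:
        total += x
        out.append(total)
    return out

if __name__ == "__main__":
    signal = [0.0, 1.0, 4.0, 9.0, 7.5, -2.0]
    print(running_sum(difference(signal)))  # same values as signal
```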

Quote
at the highest frequencies, and that's where noise is being generated that is hiding any real signal

Looking for changes in signal direction (differentiation) emphasises high frequencies, and emphasises noise.

You could try filtering the input signal to keep it within the human speech band - there is little useful information above 7kHz.
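
One common way to do that filtering is a windowed-sinc low-pass filter. The Python sketch below is a standard textbook construction, not code from any program in this thread. The 101-tap length, the Hamming window and the function names are illustrative choices, and the plain loop is written for clarity rather than speed.

```python
import math

def lowpass_kernel(cutoff_hz, sample_rate, taps=101):
    """Windowed-sinc low-pass kernel (Hamming window), unity gain at 0 Hz."""
    fc = cutoff_hz / sample_rate      # cutoff as a fraction of the sample rate
    mid = (taps - 1) / 2.0
    kernel = []
    for i in range(taps):
        k = i - mid
        if k == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (taps - 1))
        kernel.append(ideal * window)
    gain = sum(kernel)
    return [c / gain for c in kernel]

def fir_filter(samples, kernel):
    """Direct convolution; samples before the start are treated as silence."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for j, c in enumerate(kernel):
            if n - j >= 0:
                acc += c * samples[n - j]
        out.append(acc)
    return out

if __name__ == "__main__":
    rate = 44100
    kernel = lowpass_kernel(7000.0, rate)
    hiss = [math.sin(2.0 * math.pi * 15000.0 * n / rate) for n in range(1500)]
    print(max(abs(v) for v in fir_filter(hiss, kernel)[200:]))  # near zero
```

Filtering before looking for changes in direction removes the high-frequency content that the direction test would otherwise emphasise.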
 

Offline David Cooper

  • Neilep Level Member
  • ******
  • Posts: 1505
    • View Profile
Re: Choir volume: could more singers mean quieter?
« Reply #47 on: 21/08/2014 17:55:54 »
I've rewritten the code to eliminate most of the errors that were accumulating at the highest frequency end, and now it's a lot cleaner. I can finally see dramatic differences between S (high pitched white noise in a narrow range), SH (extensive white noise covering several octaves), HL (the Welsh sound LL - there's a bit less white noise at the highest end than with S and SH), HR (similar to HL, but more white noise lower down) and KH (similar to HL, but with two patches of white noise and a gap between them, as well as less at the highest end), while F and TH are distinct from all of those but not greatly different from each other - there's just a hint of more white noise at lower pitch for F (both of these have white noise in the same place as S, but the zone stretches down twice as deep, though nothing like as deep as SH).

So, the quality's certainly good enough to work with now. The next thing to do is optimise the code to make it more practical to work with, and then I'll start writing routines to identify phonemes. After that, I'll have to build a phonetic dictionary, bat aj qlredj hxv a wj qv tajpik wurdz in fqneticlj hwitsh wil mjc dhxt tasc yzy (but I already have a way of typing words in phonetically which will make that task easy). ajv byn ywzik it fqr menj yyrz (I've been using it for many years).
 
