The Naked Scientists

The Naked Scientists Forum

Author Topic: Could we cut down bandwidth by sending sums and formulae rather than answers?  (Read 3909 times)

Offline bichoverde

  • First timers
  • *
  • Posts: 3
    • View Profile
I have been thinking if it would be possible to find a way/formula to compute a particular data stream instead of storing it all.
For instance, we could say, this stream of data is the first million of digits of Pi in binary base...
It may be very difficult to find them, but for every binary sequence there must be simple formulas to obttain them mathematically.
This way one could send a huge file having to send only a relativly small math formula...

Do you get the idea? What do you think?
Do you reckon one day some form of super-quantum-computers could harness such memory power in computing?


[MOD EDIT - PLEASE PHRASE YOUR THREAD TITLES AS QUESTIONS, IN LINE WITH OUR FORUM POLICY.]
 
« Last Edit: 12/04/2011 21:52:13 by chris »


 

Post by MarkDonovan click to view.

Offline MarkDonovan

  • First timers
  • *
  • Posts: 1
    • View Profile
Yes nice post . Keep posting here .
Thanks ....
 

Offline RD

  • Neilep Level Member
  • ******
  • Posts: 8134
  • Thanked: 53 times
    • View Profile
I have been thinking if it would be possible to find a way/formula to compute a particular data stream instead of storing it all. For instance, we could say, this stream of data is the first million of digits of Pi ...

Formulae are used to generate Pi ... http://en.wikipedia.org/wiki/Computing_%CF%80#20th_century
« Last Edit: 12/04/2011 11:49:40 by RD »
 

Offline graham.d

  • Neilep Level Member
  • ******
  • Posts: 2208
    • View Profile
Data is usually regarded, for the most part, a set of random numbers. Where it is not, it can be compressed. I think what you are saying is that irrational numbers (like PI for example) have digits or groups of digits that would obey all the usual tests for randomness (even though PI is calculable) and, therefore, if we have a set of random numbers, we should be able to find a calculable irrational number that would be able to reproduce the data.

It is an interesting concept, particularly as it can be shown that there are more irrational numbers than rational ones. However, I think that the problem is that, although PI and e are irrational numbers that have a simple and concise formula that allows their calculation, infinitely more of them do not. My gut feeling is that this would not work as a way of compressing random data but it is way beyond my ability to prove this. Information theory would say that this could not work but then it's possible Information Theory may be wrong!
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
    • View Profile
This is a form of data compression you're talking about.

Compression is really limited to removing the redundant data.

You can certainly compress things like this, but only really to the extent that the data follows the formula. And there's lots of different formulae you could use, adn the coefficients of the formula would also need to be contained in the compressed file; beyond a certain point you find you haven't saved any space!

The ultimate catch with any compression is that you can't always compress things; because otherwise you would compress the compressed file again and end up with an even shorter file; right up to the point you get down to a single bit, of zero or one. And then you could compress that!
 

Post by BioWizard click to view.

Offline BioWizard

  • Jr. Member
  • **
  • Posts: 24
  • The Naked Scientist
    • View Profile
    • ForumMagnum - All Things Rome
Although it is usually simplest to describe CW as turning on and off a carrier frequency, that isn't really a very good description from a signal point of view. Actually CW is 100% modulated AM. When the signal is turned off, its amplitude is exactly zero, and when it is turned on its amplitude is the carrier signal amplitude. So that means we can analyze a CW signal using the same techniques that are used to analyze AM signals.
Our first thought might be to evaluate a CW signal as if it were an AM signal modulated by a square wave with an amplitude of 1. That is the same as switching the carrier instantaneously on and off. As with any AM signal, we can write the resulting signal as a product of a carrier frequency and a modulating signal, in this case a square wave. But a square wave can also be represented in terms of sine functions as is shown in most basic signal processing books. In fact, a square wave is composed of an infinite series of sine waves all added together. The formula is the sum from zero to infinity of all of the odd harmonics of the modulating frequency, with a decreasing amplitude as the harmonic order increases. Now that may sound complicated, but what it means is that if we instantaneously turn on and off a carrier wave to generate CW, we end up with a signal that has an enormous bandwidth. If you listen to a signal actually generated that way, what we hear are terrible key-clicks, that represent all of the higher frequency components of the modulating square wave.
Surely that doesn't sound so good! We want a narrow band signal, but we find that simply turning on and off a carrier generates a wideband signal instead. In fact, the normal technical definition of bandwidth is the frequency range where the signal is above 30 dB of its peak value. Since the coefficients are equal to 4/Πk where k is an odd integer, the peak value is when k=1 (the fundamental modulating frequency) and the coefficients decrease with increasing values of k, the drop in magnitude of the harmonics is 30 dB < 20 log(k) or log(k) > 1.5, which implies k > 31. Recall in another section we found that the time interval for a CW element was about 1.2/W (W in words per minute) or the frequency of the symbols was about (0.8333 W). Considering that the wavelength of a modulating square wave would be equivalent to a dit and a space element, that means that the equivalent modulating square wave would have frequency of about (0.4167 W), so that the 31st harmonic would be (12.92 W) on each side of the carrier for a total bandwidth of about (26 W) Hz. That means that at 5 WPM, the bandwidth would be about 130 Hz and at 20 WPM the bandwidth would be over 500 Hz! Clearly that isn't acceptable and our experience says that it isn't usually true in practice, either.
The problem is that real electronic equipment, like radios, don't generate square waves precisely. Instead the signal rises over some time period controlled by the electronic components used. But the lesson is simple: Don't use square waves to modulate a CW signal!
 

Offline bichoverde

  • First timers
  • *
  • Posts: 3
    • View Profile
No, I don't think this is the same as compression in the sense of removing redundant data. What I'm saying is that you could find a simplified math expression for an amount of bits.
For instance, we know that PI is equal to:
3.141592653589793238462643383279502884197169... and so on

So, if we substitute each digit by 0 if the digit is even and by 1 if it is odd,
you have the following digits:

1.101110011101111010000001101011100000111101...

So as you can see you get a stream of bits...

Maybe I'm not making myself clear but I don't know how to explain better...
Anyway, it has nothing to do with redundant data...

For instance, for the contents of a DVD, you would have exactly the same bits, but you would find a formula to extract them mathematically... Damn, I'm finding it really difficult to explain... But I hope you get the idea. It is a simple concept.
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
    • View Profile
Yup. That's how data compression works.

You find a scheme that gives (more or less, or ideally exactly) the data in the file, and then you send that, along with any deviations from the scheme.

For example an MP3 encoder generates a model of the sound in terms of a mathematical psychoacoustic model of how your ears work, and then send the parameters that they measure from the music that best fit the music according to that mathematical model. And then the MP3 decoder reproduces the sound according to that model and out pops recognizable sound.

MP3 is 'lossy' because it doesn't perfectly reconstruct the sound (but is usually perfectly good enough), but other audio compression systems work in much the same way, but can do it perfectly, but give less compression.

Basically, with your PI example you're saying that everything except the parity of the digits is redundant, and you've removed them before transmission.
« Last Edit: 13/04/2011 17:21:08 by wolfekeeper »
 

Offline bichoverde

  • First timers
  • *
  • Posts: 3
    • View Profile
No, this is different from data compression.

As this site ( newbielink:http://cplus.about.com/od/introductiontoprogramming/p/compression.htm)explains [nonactive]:

How does Compression Work?: In one word: redundancy. Extra information that is unnecessary is removed. Take a look at this sentence. "Yu cn undrstnd ths, evn wth mny ltrs msng". Hopefully you could make out that it said "You can understand this, even with many letters missing".
Those missing letters are redundant in English, as you can still understand the sentence without them. In 8 bit text which has 2^8 = 256 different values, less than 80 are needed to write English.


....

What I describe has nothing to do with redundancy. That's not the point.
Maybe the example was not very good but you could think of something like PI in binary.
 
 

Offline wolfekeeper

  • Neilep Level Member
  • ******
  • Posts: 1092
  • Thanked: 11 times
    • View Profile
The thing is that something with a small Kolmogorov complexity (like pi) can be trivially compressed, but most real-world data is not low complexity like that.
 

Offline Geezer

  • Neilep Level Member
  • ******
  • Posts: 8328
  • "Vive la résistance!"
    • View Profile
This reminds me of the old story about inmates telling each other jokes from their cells. They had all been in the clink so long that they had reduced all the jokes they knew to a number.

One day a new prisoner was brought in, and he was amazed to find that when one of the imates shouted out a number, the whole place errupted in laughter. His cell mate explained to him how it worked, so he thought he'd give it a try too.

He yelled out "forty seven". Nothing! He tried again. Still nothing! He asked his cell mate what the problem was, to which his cell mate replied,

"It's all in the timing." 
 

Offline RD

  • Neilep Level Member
  • ******
  • Posts: 8134
  • Thanked: 53 times
    • View Profile
How does Compression Work?: In one word: redundancy. Extra information that is unnecessary is removed.
Take a look at this sentence. "Yu cn undrstnd ths, evn wth mny ltrs msng".

That's lossy compression, sending the formula would be like lossless compression, (no data discarded with the latter).
 

Offline Geezer

  • Neilep Level Member
  • ******
  • Posts: 8328
  • "Vive la résistance!"
    • View Profile
How does Compression Work?: In one word: redundancy. Extra information that is unnecessary is removed.
Take a look at this sentence. "Yu cn undrstnd ths, evn wth mny ltrs msng".

That's lossy compression, sending the formula would be like lossless compression, (no data discarded with the latter).


He's right! Just because I might understand it, it doesn't mean I can't misunderstand it.
 

The Naked Scientists Forum


 

SMF 2.0.10 | SMF © 2015, Simple Machines
SMFAds for Free Forums
 
Login
Login with username, password and session length