I would like to say a little bit about entropy, because lately I have heard a lot of descriptions, and most of them seem to mean the right thing but are not exactly right.
Well what is entropy?
Like Tom, I would first describe entropy as a measure of disorder. But what is order in the first place? Try to imagine order without any need for space. Can you do it?
Well, if the word "order" is not quite intuitive, I will describe entropy instead as the "amount of information in data", because that is how a computer scientist would describe it (and I am one). And here we come to the first false assumption!
- Tom and some other people say that a low entropy system contains more information than a high entropy system.
This is not true! What they mean is this:
- A low entropy system contains more USEFUL information than a high entropy one!
Why is the first sentence not right? I will show you right now:
Before you can understand the amount of information, you must forget about usefulness for a moment.
- Imagine a 1 MB block of data. That is 8388608 bits.
- Imagine that all the bits are just 0. How much information does this block contain? Well, essentially just 1 bit. You can describe the whole 8388608 bits with a single one-bit pattern; in fact, you can describe an infinite amount of data this way. The amount of information is as low as it gets, so the entropy is the lowest too! Everything is the same, nothing is different.
- Now imagine the bits in this 1 MB go 01010101…, so you can describe the whole MB with the two-bit pattern 01. The entropy grows!
- Finally, imagine the whole 8388608 bits are set so randomly that you cannot describe them with less data. If you try to compress them, you will actually need more space. This RANDOMNESS contains the highest amount of information and has the highest entropy of all. (You can check this with the sketch right after this list.)
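Here is a minimal sketch in Python, using zlib compression as a rough stand-in for "the shortest possible description" of each block (the block names and the choice of zlib are just my illustration):

```python
import os
import zlib

SIZE = 1 << 20  # 1 MB = 1048576 bytes = 8388608 bits

blocks = {
    "all zeros": bytes(SIZE),                  # 00000000...
    "01 pattern": bytes([0b01010101]) * SIZE,  # 01010101...
    "random": os.urandom(SIZE),                # noise, no pattern to exploit
}

for name, block in blocks.items():
    compressed = len(zlib.compress(block, 9))
    print(f"{name:>10}: {SIZE} -> {compressed} bytes")
```

The zero block and the 01 pattern shrink to almost nothing, while the random block stays at roughly its original size (zlib even adds a few bytes of overhead), exactly as described above.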
So high entropy means more information than low entropy!
Does this conflict with Tom's description? NO! Because now we will learn what USEFULNESS is!
With 1 MB of data you can describe a great many things! Think of all the low quality images, or songs, or programs.
Is a low entropy image better than a high entropy one? That is like asking whether a plain black or white image is better than noise! Neither of them is good…, neither of them carries USEFUL information. So what is useful? That is a big question! Everyone defines usefulness in their own way. First of all, information is only useful if you need that sort of information; if you do not need it, it will never be useful to you. I define usefulness as follows:
- The usefulness of information depends on the number of possible interpretations of that information.
Each interpretation is "built" from the same matter as the information itself (in this example, bits). It is like an instruction for how to read the information. It is a new type of information, and the amount of this new information describes the usefulness of the first type.
So at this point it seems like another kind of entropy is coming into the game! Right?
The Entropy of Interpretation.
Imagine you have an image and a monitor that displays this image. The monitor holds a huge number of interpretations! Every setting you can make (contrast, sharpness, brightness...), and every combination of them, is a new interpretation! Which setting is the most useful? Well, for most people, just the standard settings. But here you can see that the usefulness of the representation of an image depends on each individual! One person likes it brighter, another prefers more contrast or sharpness.
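As a toy sketch of what such an "interpretation" could look like (the function and the numbers are my own invention, not how a real monitor works internally): the raw pixel data stays exactly the same, and each setting reads it differently:

```python
def interpret(pixel: int, brightness: int = 0, contrast: float = 1.0) -> int:
    """One 'interpretation' of the raw data: a single monitor setting."""
    value = (pixel - 128) * contrast + 128 + brightness
    return max(0, min(255, round(value)))  # clamp to the displayable range

raw = [0, 64, 128, 192, 255]  # the same image data every time

print([interpret(p) for p in raw])                 # standard settings
print([interpret(p, brightness=30) for p in raw])  # a brighter interpretation
print([interpret(p, contrast=1.5) for p in raw])   # a higher-contrast one
```

Which of the three outputs is "best"? The data cannot answer that; only the individual viewer can.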
To understand why a low entropy of interpretation creates more useful information (this is what Tom means, by the way), you can simply imagine a translator. Google Translate, for instance.
Do you remember how bad it was many years ago? Why was it bad? Well, at the moment of its creation, Google Translate was, yeah... very dumb. No translations were given yet, so the developers had to train it!
Imagine they used a machine learning approach. For each translation, a developer could choose yes or no, for a good or bad translation. At the beginning, the translator gives a lot of possible translations, because it does not know which of them is right. After a while of training, the translator has learned that some translations are better than others, and the set of answers becomes smaller and smaller… do you see the entropy now? At the beginning there was no order! The translator gives you a completely random answer. As a user, you will completely misunderstand the information you wanted to translate. That is not useful! So the entropy is very high at the beginning. Over time the entropy sinks (there is a sketch after the next point). The lowest entropy state for a translator means:
- For each translation request, there is exactly one output, and this translation is the best of all.
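To make the shrinking visible, here is a hedged sketch (the candidate counts and probabilities are invented purely for illustration) that computes the Shannon entropy of the translator's answer distribution before and after training:

```python
import math

def shannon_entropy(probs):
    """H = -sum(p * log2(p)) over the answer probabilities, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Untrained: eight candidate translations, all equally likely.
untrained = [1 / 8] * 8
# Trained: one answer dominates, the rest are almost ruled out.
trained = [0.93] + [0.01] * 7

print(f"untrained: {shannon_entropy(untrained):.2f} bits")  # 3.00 bits
print(f"trained:   {shannon_entropy(trained):.2f} bits")    # about 0.56 bits
```

The entropy sinks as the answer set collapses toward a single best translation, but it can never quite reach zero, which brings us to the next point.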
Such a state is not actually possible. If you reached it, you would recognize that the two languages (source and target) are effectively the same. For instance, you can wrap each word of one language with this rule:
Word —> Ordw (first letter goes to the end of word)
You will create a new language, and translation from A to B is just perfect, because they are both the same. This case is redundant and not interesting. What is important to understand is that the goal of our existence is to learn to communicate with others while being different.
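Just to make that toy rule concrete, here is a tiny sketch of such a "perfect translator" (entirely my own illustration):

```python
def translate(word: str) -> str:
    """The rule from above: the first letter goes to the end of the word."""
    if not word:
        return word
    rotated = (word[1:] + word[0]).lower()
    # Keep the capitalization in place, so "Word" becomes "Ordw", not "ordW".
    return rotated.capitalize() if word[0].isupper() else rotated

print(translate("Word"))  # Ordw
```

The mapping is perfectly invertible, so "translating" tells you nothing new: the target language is just the source language in disguise.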
So there are two kinds of entropy now:
- The first entropy describes the amount of information in an information system.
- The second entropy describes the number of interpretations of the information from that system. (This one is what Tom describes.)
These are the first thoughts of my Language Theory, which describes Tom's information systems as languages. From there, it takes only a small leap to understand that every existing thing… is a language.
I understand that this is a lot of text, and it is very hard to ask Tom what he thinks about it… but maybe you can discuss it?
Thank you to everyone who made it all the way to the bottom of this post :)