Part 1 of this posting focused on the properties of data.
Information is not the same as data; rather, it is a property of histories of data and their relationship to future data. Information (measured by Shannon entropy) quantifies how uncertain an observer is before making a measurement, compared to after making it. The uncertainty arises from having made many measurements in the past, and from having learned how reliable or unreliable new observations are likely to be. The act of measurement informs the observer, reducing their uncertainty.
How does this work?
If I roll a fair die, I will be resolving uncertainty from six equally likely possible states to one actual state. In this case, I will gain about 2.6 'bits' in binary units (or 0.78 'decs' in decimal units) of information. This is the number of bits needed to represent six distinct states in data, although physical data bits only come in wholes, so we must round up to three. Unlike data bits, information bits don't need to come in wholes: they are abstractions that measure smoothly varying likelihoods or expectations, so we don't need to round up; we can be exact.
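The fair-die numbers come straight from the logarithm of the number of equally likely outcomes. A minimal sketch in Python (the variable names are my own, for illustration):

```python
import math

# For a uniform distribution over n outcomes, the information gained
# by observing one outcome is log(n), in whatever base defines the unit.
n_faces = 6
bits = math.log2(n_faces)    # binary units: about 2.58 bits
decs = math.log10(n_faces)   # decimal units: about 0.78 decs

print(f"{bits:.2f} bits, {decs:.2f} decs")  # → 2.58 bits, 0.78 decs
```

Note that 2.58 bits is a fractional quantity: it sits between the 2 data bits that can represent four states and the 3 data bits needed to represent six.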
If I am rolling a loaded die that lands on '1' five times as often as each of the other faces (a probability of 0.5 for '1', and 0.1 for each of the other five faces), I am still resolving uncertainty from six potential states to one actual state, but this time not every state is equally likely. I gain only about 2.16 bits of information, because this die is more predictable, which is exactly why I could use it to cheat in a game. I am more certain about the outcome before I ever roll the die, so the information I gain by rolling it is less.
If my loaded die always lands on '1', it is completely predictable, and I gain no information each time I roll it.
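All three dice can be checked with the same calculation: Shannon entropy, the expected information of a distribution. A minimal sketch in Python (the loaded-die probabilities here are the ones implied by the description above: 0.5 for '1' and 0.1 for each other face):

```python
import math

def entropy_bits(probs):
    """Shannon entropy H(p) = -sum(p * log2(p)), skipping zero-probability
    outcomes, which contribute nothing to the sum."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair = [1 / 6] * 6
loaded = [0.5] + [0.1] * 5          # '1' five times as likely as each other face
fully_loaded = [1.0] + [0.0] * 5    # always lands on '1'

print(f"{entropy_bits(fair):.2f}")          # → 2.58 (most uncertain)
print(f"{entropy_bits(loaded):.2f}")        # → 2.16 (more predictable)
print(f"{entropy_bits(fully_loaded):.2f}")  # → 0.00 (no information gained)
```

The fully loaded die shows the limiting case: when one outcome has probability 1, the entropy is zero, matching the intuition that a completely predictable measurement tells you nothing new.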
The author's affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE's concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author.