Probably one of the dozen or so most important publications of the twentieth century. Ironically, though, it would be Norbert Wiener's interpretation of information (exactly the negative of Shannon's) that cemented itself in the popular lexicon, because it is far more intuitive. Where Wiener posits information as negative entropy (or "order"), Shannon interprets it as representing a degree of freedom, or uncertainty. The problem with Wiener's is that the underlying epistemology carries a subjective bent, where information is equated with "meaning" or semantic content. Shannon's, meanwhile, is ultimately superior as an objective metric, though far less intuitive: under his interpretation, a uniformly random string is technically the most information-saturated construct possible, because it possesses the highest degrees of freedom.
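A quick sketch of that counterintuitive claim, using the empirical per-symbol Shannon entropy of two strings (both strings are arbitrary illustrative examples, not anything from the paper):

```python
import math
import random
from collections import Counter

def empirical_entropy(s: str) -> float:
    """Shannon entropy (bits/symbol) of the empirical symbol distribution of s."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
# A uniformly random string sits near the maximum, log2(26) ~ 4.70 bits/symbol
random_s = "".join(random.choice(alphabet) for _ in range(10_000))
# A highly structured string carries far less per symbol
repetitive_s = "abab" * 2_500

print(empirical_entropy(random_s))      # close to 4.70 bits/symbol
print(empirical_entropy(repetitive_s))  # exactly 1.0 bit/symbol (two equiprobable symbols)
```

The random string maxes out the measure precisely because no symbol is more predictable than any other, which is what "highest degrees of freedom" cashes out to.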
To say Wiener and Shannon founded information theory is like saying Lorentz and Einstein founded special relativity. I don't think physicists would consider Lorentz to have made the major advance, and in fact even after relativity was explained, some people still didn't understand it.
Wiener made important contributions to mathematics but not to information theory. He wrote a book saying that entropy is related to "information" and is maximized for a Gaussian. That's about his involvement. The paper attached goes way beyond that.
But isn't the information gained from a message necessarily subjective? It depends not just on the message itself, but on the distribution from which the message was drawn; more precisely, the distribution anticipated by the observer determines the value (information content) of the message. That is exactly how it becomes useful in practice.
This "surprisal" can be parametrized by relative entropy (Kullback–Leibler divergence): the expected surprisal under the observer's anticipated distribution exceeds the Shannon entropy by exactly the KL divergence, and reduces to the Shannon entropy only when the observer anticipates the correct distribution.
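A minimal sketch of that decomposition, with made-up true and anticipated distributions p and q:

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Expected surprisal when the true distribution is p but the observer anticipates q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]   # true source distribution (made up)
q = [0.4, 0.4, 0.2]     # observer's anticipated distribution (made up)

# Expected surprisal decomposes as H(p) + D_KL(p || q) ...
assert abs(cross_entropy(p, q) - (entropy(p) + kl_divergence(p, q))) < 1e-12
# ... so with the correct anticipated distribution, only the Shannon entropy remains.
assert abs(cross_entropy(p, p) - entropy(p)) < 1e-12
```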
I think there is a distinction to be made between information in the ontological sense and information in the pragmatic sense. The former does not have any purchase on "meaning" or "use", only a metric of informational entropy or degrees of freedom, irrespective of observation. However, the sense in which you speak of information is the pragmatic dimension, where an observer receives a datum of semantic content against a given backdrop of pre-established "meaning". This sense is problematic as an objective metric.
I'm trying to find a quote I read about this paper once. To paraphrase, not only did this paper create a new field of academic inquiry, it answered most of that new field's interesting questions, too.
> With Shannon’s startling ideas on information, it was one of the rare moments in history, an academic would later point out, “where somebody founded a field, stated all the major results, and proved most of them all pretty much at once.”
What I think is telling is how easy Shannon's paper is to read. Even today it is used pretty much as-is at many EE colleges to teach communication theory.
Another cool fact: Shannon’s master's thesis is most likely the most influential one of all time: in it, he linked Boolean algebra to electrical circuits built from switches, essentially inventing digital circuit theory.
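A toy illustration (not from the thesis itself) of the correspondence it formalized: switches in series conduct only when both are closed (Boolean AND), switches in parallel conduct when either is closed (Boolean OR), so laws of Boolean algebra become statements about equivalent circuits:

```python
from itertools import product

def series(a: bool, b: bool) -> bool:
    """Two switches in series: current flows only if both are closed (AND)."""
    return a and b

def parallel(a: bool, b: bool) -> bool:
    """Two switches in parallel: current flows if either is closed (OR)."""
    return a or b

# The distributive law of Boolean algebra implies two circuits are equivalent:
# a in series with (b parallel c)  ==  (a series b) in parallel with (a series c)
for a, b, c in product([False, True], repeat=3):
    assert series(a, parallel(b, c)) == parallel(series(a, b), series(a, c))
print("circuits equivalent for all switch settings")
```

Simplifying the Boolean expression simplifies the physical circuit, which is what made the link so consequential for circuit design.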
BTW, here's the paper by Hartley cited on the first page; it is also very readable and insightful. I found that reading it as well helped clarify some of the subtler points in Shannon's paper.
If you're interested in this, check out Cover's Information Theory textbook; the rabbit hole goes much deeper. One of the most interesting examples is that when you're betting on a random event, Shannon entropy tells you how much to bet and how quickly you can compound your wealth. Cover covers (heh) this, and the original paper is Kelly's: http://www.herrold.com/brokerage/kelly.pdf
Kelly's paper (based on this paper by Shannon) is responsible for fundamentally reshaping equity, commodity, and even sports betting markets.
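A minimal sketch of the entropy connection, for the simplest case in Kelly's paper: repeated even-money bets on an event you win with probability p. The optimal fraction to stake is 2p - 1, and the resulting growth rate of wealth is 1 - H(p) bits per bet (the numbers below are illustrative):

```python
import math

def kelly_fraction(p: float) -> float:
    """Optimal fraction of wealth to stake on an even-money bet won with probability p."""
    return max(0.0, 2 * p - 1)

def growth_rate(p: float, f: float) -> float:
    """Expected log2 growth of wealth per bet when staking fraction f."""
    return p * math.log2(1 + f) + (1 - p) * math.log2(1 - f)

def binary_entropy(p: float) -> float:
    """Shannon entropy H(p) of a biased coin, in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.6                    # illustrative win probability
f = kelly_fraction(p)      # 0.2: stake 20% of wealth each round

# Kelly's result: at the optimal fraction, the growth rate is exactly 1 - H(p),
# so the less entropy (uncertainty) in the event, the faster wealth compounds.
assert abs(growth_rate(p, f) - (1 - binary_entropy(p))) < 1e-12
```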
I highly recommend William Poundstone's book, "Fortune's Formula" as a biography of those ideas - it's almost as good as any Michael Lewis book on the subject would be.
The article was renamed "The Mathematical Theory of Communication" in the 1949 book of the same name: a small but significant title change ("A" to "The"), made once the generality of the work was recognized.
The foreword is also essential reading, even for those uninterested in the math. It's one of the best descriptions of a field of science I've ever read.
> The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning ...
I can never suppress a chuckle when I read that.
It seemed clear to me at the time when I read (a few of - but especially https://bayes.wustl.edu/etj/articles/theory.1.pdf) ET Jaynes' papers on classical thermodynamics that they were commensurable. I'm mildly surprised there is some question about that!
The alphabet is to distinguish states. But this is the case in thermodynamic entropy as well, that each state is distinguishable. What is important is quantifying the set of possible states.
In a thermodynamic sense, information input (say, of photons) is generally going to increase the entropy of a receiving system. Only certain kinds of information can reduce the entropy of the receiving system. Too much information (i.e., too many photons) can literally burn your eyes.
I don't see how to square those manifest physical effects with the immateriality of Shannon information entropy.
Is this not the problem of Maxwell's demon? If an intelligent agent observes and interferes with a particle system to reduce the system's entropy, overall entropy still increases, because the agent needs to store and process information in some physical medium, be it brain matter or silicon.
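A back-of-envelope number for that bookkeeping cost, via Landauer's principle (which descends from the modern resolution of the demon): erasing one bit of stored information dissipates at least kT ln 2 of heat. Temperature below is an illustrative choice:

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K (exact in SI since 2019)
T = 300.0            # room temperature in kelvin (illustrative)

# Landauer limit: minimum heat dissipated per erased bit.
# This is how the demon's memory ultimately pays back the entropy it removes.
landauer_limit = k_B * T * math.log(2)   # joules per bit

print(f"minimum cost to erase one bit at {T:.0f} K: {landauer_limit:.3e} J")
# roughly 2.9e-21 J per bit
```

Tiny per bit, but strictly positive, which is enough to save the second law.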
In the Shannon formulation, there isn't a notion of "too much information" -- it is all symbols.
But in reality, too many photons can burn our eyes, and a denial-of-service attack can break a website. Too much information is a reality -- because we are thermodynamic entities.