With reference to David Weinberger’s exegesis, please watch…
Back in the late 1980s and early 1990s, many of us thought the Internet would level the playing field in politics, governance, publishing and any other system entailing information exchange. Alas, I am of that age…
In the early 1980s, I completed two master’s degrees at a Connecticut college that offered a faculty “mishpokhe” rate of $5 per semester. So, for $10 apiece, I earned MSs in Library Science (LS) and Instructional Technology (IT). Such a deal…
The MLS was to organize my purportedly “world’s largest boring book collection.” The MSIT was to learn better ways of teaching with technology. I have to say that from the back of the classroom in History of Books and Print, the sound of knitting needles was deafening. But I digress…
In a course entitled Acquisitions and Organization, we learned about Zipf’s Law, defined in Wikipedia as follows.
Zipf’s Law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.
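To make the definition concrete, here is a minimal Python sketch that builds a rank/frequency table from a piece of text. The function names `rank_frequency` and `zipf_expected` are made up for illustration, and the "ideal" Zipf count assumes an exponent of exactly 1, which real corpora only approximate.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Crude tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def zipf_expected(top_count, rank):
    """Count an idealized Zipf curve (exponent 1) predicts at a given rank."""
    return top_count / rank


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    table = rank_frequency(sample)
    top_count = table[0][2]
    for rank, word, count in table:
        print(rank, word, count, round(zipf_expected(top_count, rank), 2))
```

On a large corpus, the observed counts and the idealized `top_count / rank` column tend to track each other roughly; on a toy sentence like the one above they will not.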
One way to build library collections in a given subject area is/was to do a random sample of the literature and record citations to other resources. These appear in each work’s footnotes, endnotes and bibliography. You then create a frequency table, and using Zipf’s Distribution, identify the top 5% of literature cited in the sample set. That subset constitutes a “core collection”–the “must haves” for your library.
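The procedure above can be sketched in a few lines of Python. This is an illustration only: the function name `core_collection`, the input format (a flat list of cited titles harvested from the sample), and the reading of "top 5%" as the top 5% of distinct titles by citation count are all assumptions, not the method as taught in the course.

```python
import math
from collections import Counter


def core_collection(citations, top_fraction=0.05):
    """Return the most-cited titles as (title, count) pairs.

    citations: iterable of cited titles, one entry per citation found
    in the sampled works' footnotes, endnotes and bibliographies.
    top_fraction: share of distinct titles to keep (assumed 5%).
    """
    ranked = Counter(citations).most_common()  # frequency table, highest first
    keep = max(1, math.ceil(len(ranked) * top_fraction))
    return ranked[:keep]


if __name__ == "__main__":
    # Hypothetical sample: 20 distinct titles, one of them cited heavily.
    sample = ["Title A"] * 10 + ["Title B"] * 3 + [f"Title {i}" for i in range(18)]
    print(core_collection(sample))
```

With 20 distinct titles, the top 5% is a single title, so the sketch returns only the most-cited work.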
Something similar is done in academia to “power rate” journal publications–used in the determination of faculty promotion, tenure and renewal (PTR) rewards.
It also applies to personal ads.
There you have it–a mathematical solution to a love problem.
Zipf’s Law is closely related to the Pareto Principle, defined in Wikipedia as follows:
The Pareto Principle (also known as the 80/20 rule, the law of the vital few, or the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes.
Optimists call it the 80/20 Rule. Pessimists call it the 90/10 Principle. There are numerous manifestations of the rule/principle. Here are two examples: 20% of your computer storage is used 80% of the time, and 80% of a nation’s wealth is controlled by 20% of its people.
Zipf’s Law and the Pareto Principle can both be described by a class of mathematical expressions called power functions. Hence the term “power laws.”
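For readers who want the math, a power function has the general form below. The symbols (C for a constant, alpha for the exponent, s for the Zipf exponent) are my own notation for illustration, not something taken from the course or from DW.

```latex
% General power function: y falls off as a power of x
y = C\,x^{-\alpha}, \qquad C > 0,\ \alpha > 0

% Taking logarithms gives a straight line on a log-log plot
\log y = \log C - \alpha \log x

% Zipf's Law as an instance: frequency f of the word at rank r,
% with exponent s close to 1 in many natural-language corpora
f(r) \approx \frac{C}{r^{s}}, \qquad s \approx 1
```

The straight line on log-log axes is the usual visual test for a power law, and its slope gives the exponent.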
Power laws apply not only to printed literature but also, as DW pointed out despairingly, to Internet nodes (and consequently to data/information).
Websites link to other sites in a manner analogous to citations. Thus, sadly, references on the Internet are not uniformly distributed. Links are just as hierarchical as book citations, governments, businesses, and a host of naturally occurring organizations.
There was an edition of New York Magazine dedicated to blogging back in the mid-2000s. You can access an article from that collection entitled “Linkology” at http://nymag.com/news/media/15972
The article contains a kewl “Linkology” poster (Figure 1), suitable for framing, that you can download and print.
Figure 1: The top blogs get and will continue to receive the most hits.
I’m interested in ways to jump from the long tail of the power distribution curve (Figure 2) to the top 10%.
Figure 2: The “Long Tail” of the Power Distribution.