Friday, July 03, 2015

The Scientific Richter Scale

Perhaps it's an unhealthy statistical fixation, but I tend to mull over scientific citations, those being one of the important factors in the multi-currency economics of science.  From my thoughts, observations, and wanderings on Google Scholar, the following rough interpretation of scientific impact has gelled, based on the order of magnitude of a paper's citations:
  • Order 10^0 citations: This paper is languishing in darkness and obscurity.  That doesn't mean it's a bad paper, just that nobody particularly cares about it.
  • Order 10^1 citations: Either some people have noticed this paper, or else it's part of a strong ongoing line of research and is picking up a lot of self-citations.
  • Order 10^2 citations: This paper is having a significant intellectual impact.  People have noticed it and, whether or not they are actually using the ideas within it, those ideas are having a noticeable effect on the scientific discourse.
  • Order 10^3 citations: This paper contains something that lots of people are actually finding that they need and are putting to use.  It is no coincidence that many of the "Top 100" papers identified by Nature are methodology papers (though the fact that the list omits Claude Shannon's exceedingly highly cited paper on information theory is another great example of just how shoddy citation databases are in their coverage of computing fields).
Of course, there are no sharp boundaries between levels, nothing to say that a paper with 300 citations (closer to magnitude 2) is really all that different from one with 350 citations (closer to magnitude 3). Perhaps it is best to think of this as a scientific Richter scale, measuring the intellectual impact of individual scientific events (papers).  Right now, I don't think it makes much sense to look beyond three orders of magnitude because there are so few papers out there, but then, there aren't many magnitude 8.0 earthquakes either.
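
For the curious, here is a minimal sketch of how such a magnitude could be computed, assuming it is simply the base-10 logarithm of the citation count, rounded to the nearest whole number. The function names are my own invention for illustration, and the citation counts in the loop are made up.

```python
import math


def citation_magnitude(citations: int) -> float:
    """Return log10 of the citation count (the assumed 'magnitude')."""
    if citations < 1:
        # log10 is undefined for zero; uncited papers fall below the scale.
        raise ValueError("magnitude is only defined for at least 1 citation")
    return math.log10(citations)


def nearest_magnitude(citations: int) -> int:
    """Round the magnitude to the nearest whole order of magnitude."""
    return round(citation_magnitude(citations))


if __name__ == "__main__":
    # Made-up citation counts, including the 300 vs. 350 example.
    for n in (3, 40, 300, 350, 2500):
        print(n, round(citation_magnitude(n), 2), nearest_magnitude(n))
```

Under this assumption, the dividing line between magnitude 2 and magnitude 3 sits at 10^2.5, or roughly 316 citations, which is why 300 and 350 land on opposite sides of it.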

Thinking about this in terms of the Richter scale immediately sets the complexity theorist in my brain thinking about scale-free distributions and log/log plots, and so I made a plot of my own current personal citational spectrum, according to Google Scholar.  It looks like this:
Yes, that certainly looks relatively linear on a log/log scale.  But then, so do so many things.  The linear fit points out that there is definitely a big kink in the line, so it's not a straight scale-free distribution.  That means... I have no idea.  But this was certainly a fun way to fritter away half an hour while starting my vacation and watching my daughter sleep.
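
A rough sketch of this kind of check, for anyone who wants to try it on their own numbers: it assumes the "spectrum" is citations plotted against rank (most-cited paper first), fits a straight line in log/log space by ordinary least squares, and returns the residuals so that a kink shows up as a systematic run of same-signed deviations. The function name and the sample counts are hypothetical, not my actual data.

```python
import math


def loglog_fit(citations):
    """Least-squares line through (log10 rank, log10 citations).

    Returns (slope, intercept, residuals). Papers with zero citations
    are dropped, since log10(0) is undefined. Needs at least two papers.
    """
    ranked = sorted((c for c in citations if c > 0), reverse=True)
    xs = [math.log10(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log10(c) for c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals: observed minus fitted, in log10 units.
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return slope, intercept, residuals


if __name__ == "__main__":
    # Made-up citation counts, purely for illustration.
    sample = [412, 180, 95, 60, 41, 30, 12, 9, 5, 3, 2, 1, 1]
    slope, intercept, residuals = loglog_fit(sample)
    print(f"slope={slope:.2f} intercept={intercept:.2f}")
    print([round(r, 2) for r in residuals])
```

A clean power law would give residuals scattered around zero; a kink tends to show up as residuals that sit on one side of the line for a stretch and then switch to the other.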
