Ethan Zuckerman’s online home, since 2003

(Notes from a talk given yesterday at the MIT Media Lab Members meeting.)

Albert-László Barabási, a Hungarian scientist, is one of the leading theorists on the properties of networks. As it happens, networks appear in many different branches of science, from sociology to computer science to genetics. It’s hard to pin down what type of scientist Barabási is, as a result: he’s a network scientist.

In that sense, he’s following in the steps of other great Hungarian mathematicians, including Paul Erdős and Alfréd Rényi, who conducted early work on understanding networks. In studying complex networks – friendships on Facebook, links between web pages – Erdős and Rényi began by building graphs and assuming that nodes were connected randomly. In a network like this, the number of connections each node has is predictable: they’ll order themselves in a poisson distribution with a prominent peak around the average degree. The peak has a steep falloff, which means it’s very hard to find nodes with lots of connections, or to find totally disconnected nodes.

Of course, that’s not how networks in the real world work. In a society where we were organized randomly, we’d each have similarly sized circles of friends. But in real networks, some people are largely disconnected and some are very popular. Barabási studied distribution of links on the early Web, using robots to index web pages. He discovered that the distribution of links between webpages doesn’t form a poisson distribution – instead, it’s a Pareto distribution, a power law – most pages have very few links, while a small number have a massive amound of links. Barabási terms this a “scale free” network, a network in which a small number of highly connected hubs holds the network together.

He invites us to consider the game, “Six Degrees of Kevin Bacon”. It’s not hard to find connections between actors who’ve appeared in the same films together and build chains that connect virtually everyone in Hollywood. (Marilyn Monroe is two degrees away from Kevin Bacon.) The game closely parallels a game mathematicians play, calculating Erdős numbers. If you’ve published a paper with Erdős, your number is 1, published with an Erdős collaborator and your number is 2, and so on. What both Erdős numbers and the Kevin Bacon game demonstrate is tha Hollywood and mathematics are both interconnected networks.

What’s bizarre is that similarly structured networks show up far from human interactions. Barabási shows us a network of protein interactions in metabolic networks and tells us that this networks shows a very similar scale free structure, despite the fact that there’s no human intervention in building these networks, and the network is 4 million years old.

One of the breakthroughs in modeling networks is moving away from the assumption that network size is fixed. Instead, networks continually expand – people make new friendships, webpages are created and link to older pages. We can model these networks by adding nodes and then randomly connecting them to a small existing network. Again, that’s not how networks actually work. It’s more likely that the new ties will connect to popular nodes. This idea – preferential attachment – means that networks have a powerful first-mover advantage. It’s likely that the popular nodes will get more popular, and those that are marginal will stay that way.

But this can’t be the only principle at work, otherwise we’d still be searching using AltaVista, and Google would never have arisen. We need to model networks by considering other properties like “fitness”, how desirable it is to link to a given node. Nodes with a higher fitness can overcome the first-mover effect and can always maintain a fraction of available links.

Scale networks have interesting properties when it comes to robustness. You can remove random nodes from networks, and the network will likely stay connected. That’s because the majority of nodes in the network are not highly connected. If you remove links strategically, you can disconnect a scale network easily – you simply target the hubs. Remove links at random in a scale free network and you can remove 99% of the links and stay connected – remove strategically, and you can disconnect a network by eliminating a few hubs.

Barabasi looks at some practical applications of this understanding of human networks. He considers the structure of relationships within a Hungarian corporation that was having management problems. It turns out that the best connected person in the company, the person likely to spread news and gossip through network, is a low-level manager responsible for environmental compliance. Discovering the power of an individual like this can be uncomfortable – do you fire this guy because he’s bringing everyone down? Try to cheer him up? Create someone else who is as deeply connected across the organization?

He brings us back to the Kevin Bacon example: why Kevin Bacon? Is he the best connected node in Hollywood? No, not by a long shot. Mel Blanc, the voice behind Bugs Bunny, has the most films to his credit. The rest of the top 10 in terms of film appearances are largely porn stars. Kevin Bacon doesn’t appear on a list of the most film appearances until number 870 or so. It’s a coincidence, just a clever way of understanding the highly connected network space that is Holywood.

And another next step, in a more praxical and political perspective: http://yannickrumpala.wordpress.com/2009/10/21/tracing-and-reconfiguring-networks-to-build-a-political-alternative/

Ethan,

As always, I enjoyed reading your post. I have a tiny point to pick with you first, and then will have an example of a network that I work with all the time that is quite a bit different than the network you discussed here.

First the nit: I thought that the “Six Degrees of Kevin Bacon” was in part a play using the same meter as in the movie “Six Degrees of Separation”. That, and for a while, it seemed like Kevin was the hardest working actor in Hollywood (albeit always playing minors roles).

Now for my thoughts on another type of network: I am a polymer chemist (polymers are very long molecules – imagine a 20 foot spaghetti noodles) and some of these polymers are made into networks – there are spot along the chain where a link is made to another chain. Because of the laws of chemistry, it isn’t possible to have a high number of links at any one site – 2 links is pretty much the limit except in some rare circumstances. Some sites will have only 1 link and others (usually the vast majority) will have none. An every day example of such a network is the rubber that makes up a car tire.

The other extreme for a network is a random network where two points are linked at random. That is also not what occurs in a polymer network, as the links are of a limited (microscopic) length – a polymer chain at the top of the tire cannot be linked to a chain at the bottom.

The problem I have with this, is that I was previously aware of some of the points you summarized here about scale-free networks and so I certainly understand their universality, scaling,… and yet my polymer networks don’t fit that pattern. It’s disconcerting to some degree.

The Battle of Algiers comes to mind . . .