Thursday, November 17, 2011

The social graph is...

"...neither social nor a graph..."

...provided you redefine the words "social" and "graph" to mean something other than what they mean to everyone else.

M. Ceglowski is just being deliberately obtuse, or more precisely he is taking a wild excess of rhetorical license in order to make his statements seem more profound and unconventional. For example, he writes:

We nerds love graphs because they are easy to represent in a computer and there is a vast literature on how to do useful things with them. . . . In order to model something as a graph, you have to have a clear definition of what its nodes and edges represent.

Well, that's actually bullshit. In a dynamic Bayesian network, you don't have a complete definition a priori of what nodes and edges represent. Well, you do, in that the nodes represent variables and the edges represent relationships between those variables, but the weights on the edges are learned statistically from data. An edge may represent a meaningful connection, or it may mean nothing at all. The graph precedes semantics, not vice versa. Likewise with the social graph. People are connected, and you don't necessarily know what each connection means. But it's still a graph.

The labels on the social graph's edges may be subtler and more multidimensional than the simple weights you put on Bayesian network edges. And we don't have a good handle on how to learn those labels, or even what the labels should be. However, calling for the abandonment of a useful mathematical construction in an emerging field of science because it's incomplete is something that you do when you want to convince people that you're smarter than the people working in that field. It's not something you do when you want people to become better-informed.

Ceglowski also writes that the social graph is "not social" because... well, actually, I have trouble even locating a coherent argument in that part of the essay. He seems to be confusing "social" with "sociable". The social graph is social, since it describes relationships between people. Perhaps some activity involved in digitally reifying the social graph is anti-social (note that anti-social is not the opposite of social; anti-social behaviors are social behaviors!). But that doesn't make the social graph "not social". By that standard, sociology is not a social science because sociologists spend a lot of time by themselves in libraries.

Incidentally, social scientists have been modeling social connections as graphs for decades.

Here is a short list of the valid points Ceglowski makes:

  1. FOAF relationship labels are kind of dumb and embarrassing.
  2. Manually maintaining anything other than a very coarse-grained digital reification of a social graph is a tedious chore.
  3. Making your social network and behavior the property of a company whose revenue model is not aligned with your long-term interests is a bad idea.

And here is a short list of other, non-terminological points that Ceglowski just gets wrong:

  1. Social networks do "[g]ive people something cool to do and a way to talk to each other". It turns out that sharing photos, videos, and links is one of the most broadly appealing online activities, and social networking sites seem to do this better (along some dimensions) than dedicated photo-, video-, and link-sharing sites.
  2. Judging communities by the outward-facing cultural artifacts they produce is a radically inadequate measure of value. The vast majority of communication is point-to-point, not broadcast, and the vast majority of interpersonal interactions are social grooming. Social grooming is a deep-seated primate instinct which nerds devalue at their peril. Social networks have made online social grooming far easier than their predecessors did.
  3. People on WoW, Eve Online, and 4chan have healthier social lives than people on Facebook? Really?

Note that I write all the above as someone who dislikes Facebook and is skeptical of reductive approaches to modeling social relationships. And I've been advocating* an end to proprietary social networks for years — long before I started working at Google, and in fact before Facebook was even the predominant social network. So I'm broadly sympathetic to Ceglowski's aims. But I don't like at all the way that he goes about explaining them.


*Incidentally, rereading this old post, I realize that I completely missed the possibility that the dominant social network site would simply become a huge platform for third-party applications. I guess it never occurred to me that serious companies would bet their livelihoods on being sharecroppers in the walled garden. Go figure. I could speculate that this willingness can be traced directly to the Valley vogue for building companies to flip rather than to create sustainable, decades-long sources of enduring value — if you're just holding on until your "liquidity event" then it doesn't matter that your business is built on the fickle forbearance of your platform landlord — but I'm not sure how right that is.

Monday, October 31, 2011

Occupy .* and the Iraq War

Then, as now, conservative opinion and elite bipartisan opinion were mostly contemptuous of the protesters. Well-fed, well-educated, well-salaried pundits looked on the shaggy protesters and remarked: how unsophisticated were the protesters' opinions, how disorganized their complaints! Fortunately, the nation was run by a select few who understood the harsh realities of a world where some suffering was necessary (for other people) so that the existing order could be maintained. And the march to war rolled on.

In all likelihood, the protests today will be as futile as those were. It's taken a couple hundred years, but the system of governance by elected representatives has evolved an immune system with nearly impervious defenses against street protests. Nevertheless, in a society supported by the many and operated for the few, it is perhaps useful, for aesthetic reasons if nothing else, to have some people calling attention to that fact. If you, as a critic, imagine yourself on the side of the angels in damning the protesters, then you should perhaps reconsider.

Thursday, October 13, 2011

Cynicism and libertarian ends (again)

Why M. Yglesias is a nationally acclaimed writer and I am not, exhibit #7572: a couple of years ago I wrote a somewhat convoluted post about cynicism and libertarianism, whereas today Yglesias wrote this, which is elegant, readable, and much more worth your time.

Friday, October 07, 2011

Wednesday, August 03, 2011

g++ unordered_multimap: an exercise

I discovered this randomly several weeks ago while debugging something else at work, and I thought it was worth sharing since the g++ STL is widely used.

Step 1: Save the following file as mapdemo.cpp:

#include <iostream>
#include <unordered_map>

int main() {
    typedef std::unordered_multimap<int, int> int_multimap;
    int_multimap map;
    for (int i = 0; i < 10000; ++i) {
        map.insert(int_multimap::value_type(17, i));
    }
    std::cerr << *static_cast<int*>(0);
    return 0;
}

Step 2: Compile the file:

g++ --std=c++0x -g mapdemo.cpp

Step 3: Load the compiled binary in gdb and examine the buckets:

$ gdb -silent ./a.out
Reading symbols from xxx/a.out...done.
(gdb) run
Starting program: xxx/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400b4d in main () at mapdemo.cpp:10
10     std::cerr << *static_cast<int*>(0);
(gdb) p map._M_bucket_count
$1 = 15173
(gdb) p map._M_buckets[0]
$2 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x0
(gdb) p map._M_buckets[15172]
$3 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x0
(gdb) p map._M_buckets[17]
$4 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x605080
(gdb) p *map._M_buckets[17]
$5 = {_M_v = {first = 17, second = 0}, _M_next = 0x670c90}

Yes, g++'s implementation of unordered_multimap (the standardized successor to hash_multimap, a nonstandard extension shipped with pre-C++0x libraries) uses bucket hashing, but the size of the backing array is proportional to the count of elements in the multimap, not the count of distinct keys.
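The same observation can be reproduced without gdb or any libstdc++ internals, using only the standard bucket interface (bucket_count, bucket, bucket_size). The following is a minimal sketch, not part of the original session: BucketStats and single_key_stats are hypothetical names invented for illustration, and the exact bucket count you get will depend on your library version.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical helper type for this sketch; not part of any library.
struct BucketStats {
    std::size_t elements;       // total elements stored in the multimap
    std::size_t buckets;        // size of the backing bucket array
    std::size_t in_key_bucket;  // elements in the bucket that holds key 17
};

// Inserts element_count values under the single key 17 (as in mapdemo.cpp)
// and reports the layout through the portable bucket interface.
BucketStats single_key_stats(std::size_t element_count) {
    typedef std::unordered_multimap<int, int> int_multimap;
    int_multimap map;
    for (std::size_t i = 0; i < element_count; ++i) {
        map.insert(int_multimap::value_type(17, static_cast<int>(i)));
    }
    assert(map.size() == element_count);

    BucketStats stats;
    stats.elements = map.size();
    stats.buckets = map.bucket_count();
    stats.in_key_bucket = map.bucket_size(map.bucket(17));
    return stats;
}
```

Calling single_key_stats(10000) and printing the three fields should show every element sitting in one bucket while the bucket array is still sized for the full element count. The specific figure seen in the gdb session above (15173) is an implementation detail and may differ on your machine.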

Exercise for the reader: Explain what I just did; explain why the result of step 3 is curious; and then explain why the authors might have chosen to do it this way anyway.

It occurs to me that this would have made a decent interview question if I hadn't written it up here. Oh well, I have others.