A quick note: I was reading TechCrunch earlier today when I realized that I could not reconcile their post “Twitter at scale: Will it work?” with my views on how to build a scalable application.
Nick Cubrilovic contends that “Every new Twitter user and every new connection results in an exponentially greater computational requirement.”
And yet, I fail to see the exponential quality of it all.
It looks like Nick is saying: “I have U users, posting P posts, read by F followers. Hence, if I were to draw this on paper, I would end up with an exponential slope.”
That’s odd because, as I understand Twitter’s architecture, we do indeed have U people posting P posts. (BTW, P is unknown, and as Twitter goes I’d wager that the f(P) curve would be logarithmic; but I digress.) Let’s, for simplicity’s sake, consider the total number of posts and call it X.
Now, Nick would obviously be referring to an f(F) curve. If each of the F followers had to monitor all X posts, then yes, I would expect the slope to be exponential.
But that’s not how it works: Twitter is a pull service; each Twitter client regularly asks the server(s): “Do you have anything for me?” The server replies: “No” or “Yes, these x posts.” “x”, not “X” because only relevant posts are retrieved.
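To make the exchange above concrete, here is a minimal sketch of a pull service in Python. This is a toy model, not Twitter's actual code or API: the `ToyServer` class, its method names, and the idea of polling with a "last seen" post id are all my own assumptions for illustration. The point is only that a poll returns the x relevant posts, never all X.

```python
import bisect
from collections import defaultdict


class ToyServer:
    """Hypothetical pull-style server; every name here is illustrative."""

    def __init__(self):
        self.next_id = 0
        self.ids_by_author = defaultdict(list)    # author -> increasing post ids
        self.texts_by_author = defaultdict(list)  # author -> post texts, same order
        self.following = defaultdict(set)         # reader -> authors they follow

    def follow(self, reader, author):
        self.following[reader].add(author)

    def post(self, author, text):
        self.ids_by_author[author].append(self.next_id)
        self.texts_by_author[author].append(text)
        self.next_id += 1

    def poll(self, reader, since_id):
        """Answer "Do you have anything for me?" with only the relevant new posts."""
        fresh = []
        for author in self.following[reader]:
            ids = self.ids_by_author[author]
            # Jump straight to the first post newer than since_id.
            start = bisect.bisect_right(ids, since_id)
            for i in range(start, len(ids)):
                fresh.append((ids[i], author, self.texts_by_author[author][i]))
        return sorted(fresh)


server = ToyServer()
server.follow("bob", "alice")
server.post("alice", "hello")       # relevant to bob
server.post("carol", "unrelated")   # bob does not follow carol
print(server.poll("bob", since_id=-1))  # the "Yes, these x posts" case
print(server.poll("bob", since_id=0))   # the "No" case: empty list
```

In this sketch the work done by one poll depends on how many people the reader follows and how many new posts they wrote, not on the total number of posts in the system.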
Since “F” is bound to be much bigger than “x”, and the overhead of retrieving multiple posts is very small compared to the time elapsed between two polls, it seems to me that the formula we should use here is that of a linear approximation.
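Here is the back-of-the-envelope version of that claim, as a rough sketch. The function name, the unit costs, and above all the assumption that x (posts returned per poll) stays roughly constant as the number of clients grows are mine; if x itself grew with the user base, the picture would change.

```python
def cost_per_polling_round(num_clients, posts_per_poll, poll_cost=1, per_post_cost=1):
    """Hypothetical server cost for one round in which every client polls once.

    Each client pays a fixed cost for asking, plus a small cost for each of
    the x relevant posts it gets back. All units are arbitrary.
    """
    return num_clients * (poll_cost + posts_per_poll * per_post_cost)


for clients in (1_000, 2_000, 4_000):
    print(clients, cost_per_polling_round(clients, posts_per_poll=5))
```

Under these assumptions, doubling the number of polling clients doubles the cost: a straight line, not an exponential curve.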
Just my 2 cents. My math is *very* rusty but it seems to me that Nick’s argument doesn’t hold water.