Saturday, November 01, 2008

Another Look at Open Spaces

"The engine imbalance is what caused the worm-hole in the first place. It'll happen again if we don't fix it." -- Scotty

I've been reading as much as I can about the Open Space crisis. Unlike Jack Linden, I have not "read all the comments," but I have read enough to have a reasonable sense of the trends. One post that stood out from the crowd was by Dale Innis, outlining his "I’m guessing that what happened" perspective.

I listen to Dale. Although we are likely Myers-Briggs polar opposites, he's technically very savvy, and he has the luxury of an insider's advantage when it comes to Linden Lab. Dale tells a story about what might have happened, offering an opposing view to the Linden Lab conspiracy theories and other rampant speculation. I've worked in enough walled-garden development groups to know that Dale's tale, while disheartening, is perfectly reasonable.

It's the conservation of possibilities theory: groups intent on doing great things are as likely to do great harm as great good. Okay, I made that up, but the essence is that one person can make a locally bad decision, two people can make stupid community decisions, but an unbridled group can unintentionally do far more damage globally.

Dale's story weaves through a set of probable circumstances regarding how the Open Space problem circulated around the Lab and ends here:
Someone suggests actually thinking for five minutes about how to break this to the users, but everyone’s hungry so they go to lunch instead and just announce it baldly in a blog posting, because everyone in the company who understands anything about customer relations is either out on sick leave to recover from the last crisis, or has been assigned to the “boring corporate people in suits” desk and isn’t allowed to talk to retail customers.
An interesting tale with a horrifically sad (and I hope untrue) ending, but it gave me pause. Don't get me wrong, I still think Linden Lab is emulating a modern-day John Sutter. As to the root of the problem, though, I'm guessing Dale has some good inside information or, if not, at least a hunch based on his technical skills.

So I kept reading, and I ran across a few bits of information implying that the real root of the Open Space crisis may actually be a deep-seated set of technical issues, which are combining to create the most recent decline of our Second Life experience. Open Spaces are not the root cause, but merely the catalyst that has illuminated the true problem.

You may have noticed the first problem yourself: textures seem unnaturally slow to load. Residents have become accustomed to standing around after landing in a new place and, while everyone is greeting them, returning the greeting with a simple yet well-understood "Hi everyone, waiting for rez." We've been trained to embrace our inner gray.

A few dedicated residents have worked hard to track down a problem we've learned to brush off as just "part of the deal," and their findings are collected in this JIRA report. The report was filed on August 5, 2008, shortly after the 1.21 Release Candidate upgrade. It's a worthy read, but lexo Bethune summarized it nicely here:
Wow, seems we've wandered really far down the rabbit hole on this one, considering it just started with "slow textures".
So, in summary, we've got a texture loading bottleneck, misdetection of video card memory, and rapid flushing of the cache requiring textures to be redownloaded from the asset server, which causes avatars to hit the asset server much more than they should, slowing down the asset server and causing even more requests, descending into a loop that causes a DDOS attack on the asset servers, causing massive instability, and rendering all tests of sim load on all regions unreliable (and removing justification for price hikes on Openspaces, since the load on the OS sims is due to an internal error, and not overuse).
What we have here is a pile of bugs that have come together, in a chain reaction, to ultimately cause massive problems and actually effect the pricing policy decisions of LL.
Wow. Excellent detective work, guys. Combined, this tree of bugs is probably the biggest issue in SL, and it looks like the ball's rolling. Thanks bunches to all of you. :3
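
To make the cache part of lexo's summary a little more concrete, here is a toy Python sketch of my own. It is not Linden Lab's viewer code, and every name and number in it (`count_asset_fetches`, the cache sizes, the 80-texture scene) is a made-up assumption for illustration. It only shows the general idea: if the viewer's texture cache ends up smaller than the set of textures in view (say, because video memory was misdetected), a least-recently-used cache can flush everything before it is needed again, so every redraw goes back to the asset server.

```python
from collections import OrderedDict


def count_asset_fetches(cache_slots, scene_textures, passes):
    """Count texture fetches that reach a hypothetical asset server.

    The viewer redraws the same scene `passes` times, touching every
    texture in order, with an LRU cache holding `cache_slots` textures.
    """
    cache = OrderedDict()
    fetches = 0
    for _ in range(passes):
        for texture_id in range(scene_textures):
            if texture_id in cache:
                cache.move_to_end(texture_id)  # hit: mark as recently used
                continue
            fetches += 1                       # miss: ask the asset server
            cache[texture_id] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)      # evict least recently used
    return fetches


# Assumed numbers: an 80-texture scene redrawn 10 times.
healthy = count_asset_fetches(cache_slots=100, scene_textures=80, passes=10)
thrashing = count_asset_fetches(cache_slots=40, scene_textures=80, passes=10)
print(healthy, thrashing)  # 80 800
```

In this toy model the undersized cache sends ten times as many requests to the server for exactly the same scene. It does not model the second half of the loop lexo describes, where a slower asset server triggers even more requests, but it hints at how a viewer-side bug could look like "overuse" from the server's point of view.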

In my opinion, if this is in fact the true root of the Open Space debacle, it is not all bad. Yes, lousy architecture and short-sighted workarounds are bad in the long term.

However, the good news here is that we get to see the power of a transparent issue tracking system combined with the passionate furor of residents who are a) capable and willing to help and b) infinitely resilient.

Paying, passionate and prolific content consumers and creators live here. Linden Lab, you are, in fact, sitting on a gold mine.

