How broke is it? #BugSmashing

How broke is it?

That title/comment isn’t directed at Star Trek Online. It’s context for this rather long post.

In varying roles over my career, I have heard that comment more than a few times. It’s cringeworthy. In a recent case it came from an engineer referring to a problem unconnected to what we make, but one that the media would most certainly see us as being a part of. Such is the fun of a publicly traded company.

When something works for the majority, sometimes the problems that impact the minority are overlooked. From a financial standpoint, companies make a business choice. When you look at a quality brand, that decision isn’t even a consideration. It’s something you actively work to avoid.

I’m lucky. I work for one such company.

That argument is relevant to any company that produces a product. The real question is whether they listen and can actually act.

The Letter

22 days ago, a player with significant experience in the game and an established leadership role in our community wrote an open letter to Cryptic on the state of the game.

Serious discussion on some big issues is long overdue by Snipey47a 

Without trying to reduce Snipey’s well-written letter to a single quote or idea, what I will say is that it was a watershed moment for many of us, both as customers and as community members.

This is not to say that significant efforts to address the issues identified in the letter were not already underway within the team at Cryptic. @Salami_inferno was directly addressing the issue on a number of fronts. Networking issues, code, hardware and even ISP connectivity were being reviewed in an effort to improve the visible lag issue. Changes were being made and actively communicated, but the improvements were simply not visible to many.

Changes take time, especially in the world of enterprise software. And no, that’s not an intentional pun.

The discussion that followed Snipey’s letter was for the most part sane and well written. Reddit’s discussion thread grew quickly to 200+ posts with feedback from STO, ultimately triggering one of the more ‘rollercoaster’ special episodes of The Show.

@LaughingTrendy surprises the heck out of… everyone.

In what I call a gutsy moment, Trendy held an impromptu Twitch cast and addressed the problems head on, not with platitudes and spin, but with a plan. In less than seven minutes she laid out how they were going to approach the issue of performance and lag, and how we were going to become part of the solution.

The plan at that point was still in a nascent stage, but Trendy took the time to review and understand the root causes. She visited each of the teams and helped propose a plan of action. More importantly, she stood up and, in front of some of the game’s harshest critics, said that she got our concerns, that she agreed with us, and that the company did as well.

I wasn’t the only one who thought – holy shit, can this really be happening?

The initiative would be company supported. It would be official. And it would start almost immediately. By Wednesday the first ‘fight the bugs’ event with the devs was held on Tribble, with several of the game’s top players taking part to demonstrate firsthand to the developers exactly what we’ve been experiencing. By Friday the fruits of those efforts were seen on Holodeck, and it was working.

Team Cryptic was pulling out all the stops to see this happen, although I suspect Trendy’s hammer played a role 😉

Operational Support Team 

(Patch design by Morrigan and Thomas 😉)

The formal announcement came about a week later with a Starship Troopers-inspired blog post and a forum post inviting players to join the team to fight the bugs. By this time, two special sessions between devs and testers had already taken place.

Hey everyone!

I’m really excited to announce that we’re now starting up our latest initiative: the Operational Support Team! I’ve seen a huge interest in the community to help improve the game, so the team and I have built this program. The Operational Support Team is a volunteer force that will work with our team in order to submit, review, and resolve bugs that you might find. Our goal is to stomp out the bugs!

We will be taking volunteers from the community to join the Operational Support Team. We’re looking for high-caliber members from the community who can effectively work alongside us in hunting down bugs. We have a process to assess potential candidates, and will be taking a select amount of people for the program. Members will work alongside the STO QA Team tracking down bugs by developing reproducible steps for issues. As we review the program and its effectiveness, the roster for the team will grow.

Let us know if you’re interested in joining the program or if you have any questions!


A Quick Chat with Trendy

I had a chance to chat online with Trendy for 30 minutes the day of the official announcement.

We texted about the real desire within Cryptic to tackle the issue of lag and bugs within the game. The initiative was her way to grab the bull by the horns with bug hunting. STO’s QA lead Queen Vaccine was one of the first public faces that joined the initiative.

We also texted about how this and other special projects were among the reasons she couldn’t make STLV2015.

For context, in the corporate world we call these special initiatives ‘Tiger Teams’: calling in our best minds to identify problems, propose solutions, and then act on those solutions. They can be very effective, but oftentimes it takes customers willing to help. For Star Trek Online, the players’ involvement is what will make the #BugBasher initiative viable.

Trendy: I’m getting players together who’ll be able to help coordinate with getting more detailed feedback to help the team get rid of bugs

It’s one thing to report a bug. It’s another to report it properly so it can be replicated. And it’s an entirely different experience when you have the developers experiencing it side by side with you.

Our conversation touched on the common issues with bug reporting, and how tracking down a problem isn’t as cut and dried as you might expect. Every detail, no matter how small, plays a role in the sleuthing process, and that is why they’re reaching out to experienced players for help. I asked if the testing process was more data-mining after the fact, using our experiences as a guide. In this case it turns out Cryptic is being proactive and has set up processes to monitor the issues as they happen. By experiencing the issues firsthand, monitoring the process, using reports from players, and applying traditional sleuthing, they’ve been able to make significant steps in just the past two weeks.

There isn’t necessarily an end date for this; it will be an ongoing process. Given the responses from the devs and players alike, it’s a huge step in the right direction.

The remainder of our discussion covered the process itself. While it is a public initiative, some elements of the process (how and where to report, and about what) are being kept out of the public eye. Insights learned by the testers could be misused if they fell into the wrong hands, and for that reason the qualified testers are asked for a certain level of professional discretion. That, and Trendy has a Hammer. A private forum and reporting process has been added to the company forums.

Do you believe you have what it takes to be a #BugBasher? Then add your name here.

Finding/Fixing a Bug -or- Why it is not always simple

Paraphrasing @SarcasmDetector: Perception of what’s happening is as important as the event itself.

When your day-to-day knowledge of the efforts is limited to the company forums or ESD chat (yup, that’s still a thing), your view of the situation can be colored. I know firsthand the challenges of getting the word out when the community relies as much on speculation and hearsay as it does on official forums.

T6 Constitution – you heard it here first!  <- not real.

This extends to the efforts in the fight against lag and bugs. Yesterday @tumerboy made a post on the Reddit forums using an analogy to explain the complexity of the process. Some of the responses by players were meant in jest, while others took on a more snarky tone. His post was trying to move the discussion forward. The heart of his point was that this process is complex and is going to take time and effort.

I was at Star Trek Las Vegas recently, and if you’ve ever gone to STLV, and stayed at the Rio, you know that there is a fairly long walk from the hotel to the conference center. (About 1/4 mile one way).

Each morning I would get up, dress myself, and tie my shoes. Each day, I would walk, in those shoes, to the conference center.

On most days, I had no issues. However, on one day, I got about a third of the way down to the conference center, a mere 5 minutes since I’d tied my shoes, and I noticed my shoelace had begun to untie itself.

“Hmm,” I thought, “I wonder what I did differently today, that caused my shoelace to untie. It hasn’t untied itself any other day. . .”

Perhaps I simply hadn’t pulled hard enough on the loops to set the knot. Perhaps I had left slack in the line as I tied the loops in the first place. Alas, once it had occurred, there was no way for me to track down where I had gone wrong in the first place. Nor was there any way to ensure I reproduced the issue the next day, such that I could correct my error.

I’m going to end this 1900-word post with my response from Reddit:

I get the analogy. The more complex the system, the more possible interactions, the greater the chance for unintended outcomes.

Star Trek Online is not ‘just another video game’. There are no save points for the player, no way to just ‘shut it down’ at the end of the day. As a cloud-based application, it’s a living breathing entity that has been built by and updated by many, and played with by many more.

A fix/patch isn’t always a solution. Root problems may be deeper, or have impact on other systems far beyond the understanding of everyone involved at the time it’s tested. Or something gets forgotten. A +100 instead of a +0.01. As a result we occasionally get interesting ‘unintended consequences’.

Krenim Science consoles for $500 Alex.

This isn’t an admonishment of the team/process/company, but an understanding that problems may be more complex than the average player would comprehend. It’s not black and white.

I’ve worked in enterprise software. Even with the best documentation, the most on-the-ball-brilliant-savant-level-staff, we missed stuff. All the testing in the world can’t match the real-world environment of a live server where people push the limits of the intended use every day 😉

Kudos to the new #Bugsmashing team 😉

The Lootcritter
