by 827a 7 hours ago |

by dijksterhuis 6 hours ago |

> I've felt like it's a little unfair to judge the uptime of company platforms like this; by saying "if any feature at all is down, it's all down" and then translating that into 9s for the platform.

This is definitely true.

At the same time, none of the individual services has hit 3x9 uptime in the last 90 days [0], which is their Enterprise SLA [1] ...

> "Uptime" is the percentage of total possible minutes the applicable GitHub service was available in a given calendar quarter. GitHub commits to maintain at least 99.9% Uptime for the applicable GitHub service.

[0]: https://mrshu.github.io/github-statuses/

[1]: https://github.com/customer-terms/github-online-services-sla

(may have edited to add links and stuff, can't remember, one of those days)
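To put the quoted SLA definition in concrete terms, here is a minimal sketch of how much downtime a 99.9% quarterly commitment leaves room for. The function names are made up for illustration, and the calculation assumes "total possible minutes" simply means every minute in the calendar quarter; the SLA document itself may define exclusions that this ignores.

```python
from datetime import date


def quarter_minutes(year: int, quarter: int) -> int:
    """Total minutes in a calendar quarter (1-4) of the given year."""
    start_month = 3 * (quarter - 1) + 1
    start = date(year, start_month, 1)
    # The quarter ends where the next one starts.
    if quarter == 4:
        end = date(year + 1, 1, 1)
    else:
        end = date(year, start_month + 3, 1)
    return (end - start).days * 24 * 60


def allowed_downtime_minutes(year: int, quarter: int, sla_percent: float = 99.9) -> float:
    """Downtime budget before uptime drops below sla_percent (assumed: no exclusions)."""
    return quarter_minutes(year, quarter) * (100 - sla_percent) / 100


def uptime_percent(downtime_minutes: float, year: int, quarter: int) -> float:
    """Uptime as a percentage of total possible minutes in the quarter."""
    total = quarter_minutes(year, quarter)
    return 100 * (total - downtime_minutes) / total


if __name__ == "__main__":
    budget = allowed_downtime_minutes(2025, 1)
    print(f"Q1 2025 budget at 99.9%: {budget:.1f} minutes")
```

For a 90-day quarter that works out to a little over two hours of downtime in total.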

by windward 5 hours ago |

So what happens for those enterprise customers now? Is there a meaningful fallout when these services fail to meet their SLAs?

by dijksterhuis 5 hours ago |

> If GitHub does not meet the SLA, Customer will be entitled to service credit to Customer's account ("Service Credits") based on the calculation below ("Service Credits Calculation").

The linked document in my previous comment has more detail.

by Lalabadie 4 hours ago |

It's worth adding that big (BIG!) business clients will usually negotiate the terms for going below the SLA threshold. The goal is less to be compensated if it happens, and more to incentivize the provider to never let it happen.

by drob518 2 hours ago |

Right. Basically, they give you a coupon to lower your cost of future consumption. So, you have to keep consuming the service. If you just leave, you get no rebate. Obviously, very large customers get special deals.

by lucideer 7 hours ago |

You're right that labelling any outage as "GitHub is down" is an overgeneralisation, & we should focus on bottlenecks that impact teams in a time-sensitive manner, but that isn't the case here. Their most stable service (API) has only two 9s (99.69%).

They're not even struggling to get their average to three 9s, they're struggling to get ANY service to three 9s. They're struggling to get many services to two 9s.

Copilot may be the least stable at one 9, but the services I would consider most critical (Git & Actions) are also at one 9.

by ARandomerDude 6 hours ago |

I love multiple 9s as much as the next guy but that's only 27 hours per year of downtime. For a mostly free (for me) service, I'm thankful.

by wavemode 6 hours ago |

Most people complaining about uptime aren't free users or open-source developers. It's people whose companies are enterprise GitHub customers. It's a real problem and affects productivity.

by sefrost 6 hours ago |

GitHub going down during office hours in a large enterprise has knock-on effects for hours as well. Especially if you are in a monorepo.

by skeeter2020 6 hours ago |

I'm happy to report that my one-person sysops has successfully hit nine-fives for the 20th year in a row!

by malfist 6 hours ago |

If there's only a 9 in availability, they've got a minimum downtime of 87.6 hours per year (98.99999999999999999%)
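The arithmetic behind the "nines" in this subthread can be checked with a short sketch. It assumes a 365-day year (8,760 hours) and counts nines as the number of leading 9s in the availability percentage; the helper names are illustrative only. `Decimal` is used so that values like 99.9 are not thrown off by binary floating-point rounding.

```python
from decimal import Decimal

HOURS_PER_YEAR = Decimal(365 * 24)  # 8760, assuming a non-leap year


def downtime_hours_per_year(availability_percent: str) -> Decimal:
    """Hours of downtime per year implied by an availability percentage."""
    unavailable = 1 - Decimal(availability_percent) / 100
    return HOURS_PER_YEAR * unavailable


def count_nines(availability_percent: str) -> int:
    """Number of leading nines: '99.69' -> 2, '98.5' -> 1, '99.9' -> 3."""
    unavailable = 1 - Decimal(availability_percent) / 100
    if unavailable <= 0:
        raise ValueError("100% availability has no finite number of nines")
    nines = 0
    # Each extra nine means the unavailable fraction is at most 10x smaller.
    while unavailable <= Decimal(10) ** -(nines + 1):
        nines += 1
    return nines


if __name__ == "__main__":
    for pct in ("99.69", "99", "99.9"):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% -> {count_nines(pct)} nines, {hours:.1f} h/year down")
```

Under these assumptions, 99.69% comes to roughly 27 hours a year, and anything below 99% (a single nine) comes to more than 87.6 hours a year, which matches the figures quoted in the comments.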

by lucideer 6 hours ago |

Honestly, you're right - 2̶7̵ 87+ (correction from sibling) hours per year is absolutely fine & normal for me & anything I want to run. I personally think it should be fine for everybody.

On the other hand, the baseline minimal GitHub Enterprise plan with no features (no Copilot, GHAS, etc.) runs a medium-sized company $1m+ per annum, not including pay-per-use extras like CI minutes. As an individual I'm not the target audience for that invoice, but I can envisage whoever is wanting a couple of 9s to go with it. As a treat.

by maccard 3 hours ago |

87 hours a year is about 1.7 hours a week. If that 1.7-hour window is when you need to use it, it matters a hell of a lot more than if it's 4am on a Sunday.

by toast0 5 hours ago |

Nine nines is too hard; my target is eight eights.

by calvinmorrison 7 hours ago |

ONLY TWO NINES! Meanwhile vital government services here have a whopping 25% availability.

by lucideer 6 hours ago |

Two things can be bad.

by bigfishrunning 2 hours ago |

Lemme guess, those government services are run by the lowest bidder?

by shimman 6 hours ago |

This company is part of the portfolio of a $trillion+ transnational corporation. The idea that we can't judge them, when they clearly have more resources than 99% of other companies on this planet, doesn't hold up to any scrutiny.

Why defend a company that clearly doesn't care about its customers and sees them as a money spigot to suck dry?

by thinkingtoilet 6 hours ago |

The OP clearly never says we can't judge them. He was speaking to how the uptime is measured. I'm not saying I agree or disagree with the OP, but at least address the argument he's making.

by saxonww 5 hours ago |

There's a completely reasonable comment by jamiemallers on this thread which is marked as 'dead' even after vouching. Not sure what's going on there.

by zahlman 5 hours ago |

Presumably what's going on is https://news.ycombinator.com/item?id=47340079 . It's been quite an issue lately.

by masfuerte 5 hours ago |

Take a look at his comment history.

by foobiekr 5 hours ago |

It doesn't help that almost all of the big tech companies talking about 5 9s are lying about it; "Does it respond to the API at all, even with errors? It's up!" and so on. If you spend a lot of time analyzing browser traces you see errors and failures constantly from everyone, even huge companies that brag a lot about their prowess. But it's "up" even if a shard is completely down.

The five nines tech people usually are talking about is a fiction; the only place where the measure is really real is in networking, specifically service provider networking, otherwise it's often just various ways of cleverly slicing the data to keep the status screen green. A dead giveaway is a gander at the SLAs and all the ways the SLAs are basically worthless for almost everyone in the space.

See also all of the "1 hour response time" SLAs from open source wrapper companies. Yes, in one hour they will create a case and give you a case ID. But that's not how they describe it.
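The gap between "it responded at all" and "it responded successfully" can be illustrated with a toy calculation. This is only a sketch: the request log is hypothetical, both function names are made up for this example, and real status pages may use neither definition.

```python
from typing import Optional, Sequence

# Hypothetical request log: an HTTP status code per request,
# or None when the request got no response at all (timeout, refused).
Status = Optional[int]


def liveness_availability(statuses: Sequence[Status]) -> float:
    """Generous definition: any response counts as 'up', even a 5xx error."""
    answered = sum(1 for s in statuses if s is not None)
    return answered / len(statuses)


def success_availability(statuses: Sequence[Status]) -> float:
    """Stricter definition: only responses without a server error count as 'up'."""
    ok = sum(1 for s in statuses if s is not None and s < 500)
    return ok / len(statuses)


if __name__ == "__main__":
    # Made-up sample: 90 successes, 8 server errors, 2 timeouts.
    log: list[Status] = [200] * 90 + [500] * 8 + [None] * 2
    print(f"liveness-based: {liveness_availability(log):.2%}")
    print(f"success-based:  {success_availability(log):.2%}")
```

On the same made-up log the two definitions disagree by eight percentage points, which is the kind of slicing the comment above is describing.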

by sumtechguy 2 hours ago |

That's the rub.

Once you dig into the details, what does it mean to have 5 9s? Some systems have a huge surface area of calls and views. If the main web page is down but the entire backend API is still responding fine, is that a 'down'? Well, sorta. Or what if one misc API that some users only call during onboarding is down, does that count? Well, technically yes.

It depends on your users and what path they use and what is the general path.

Then add in response times to those down items. Those are usually made up too.

by jamiemallers 6 hours ago |

[dead]
