Why did Visual Studio Marketplace go down in the Great Azure TITSUP? Ask Azure DevOps

The team behind Microsoft's Visual Studio Marketplace has issued an explanation as to why it also took the day off after Azure's weather-based wobble. In a commendable act of openness, the software giant laid out what went wrong with its Marketplace in a blog post that points a shaky finger at Azure DevOps for the outage back …

  1. Anonymous Coward
    Anonymous Coward

    S3 has scale-in issues too, but quite minor by comparison

    but unlike the M$ solution, the team has 5-10 minutes to remediate an over-active 'partition', and it will generally shard itself into N sub-keys over the course of a few hours (depending on key concentration); the bucket is throttled until the index sharding (objects are not moved or re-computed) completes.
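
    For anyone hitting the same wall: the usual client-side workaround is to spread hot key ranges across the key space yourself rather than waiting for the index to shard. A rough sketch in Python - the shard count and key layout here are my assumptions, not anything S3 documents:

      import hashlib

      NUM_SHARDS = 16  # assumed value; size to your peak request rate

      def sharded_key(natural_key):
          # Prepend a short, stable shard id so lexically adjacent keys
          # (timestamps, sequence numbers) land on different partitions.
          digest = hashlib.md5(natural_key.encode()).hexdigest()
          shard = int(digest, 16) % NUM_SHARDS
          return "{:02d}/{}".format(shard, natural_key)

      print(sharded_key("2018/09/04/artifact-000001.zip"))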

    Sounds to me like M$ has made several classic errors.

    1) using an RDBMS in the first place - this is almost always the wrong technology. You might keep the 'master' record in an RDBMS, or things like billing info, but object access should NEVER hit this tier.

    2) not being the first to eat their own dog food - still not using AVs?

    3) implementing services with SPOFs

    4) client software that has no concept of retry and back-off (see the sketch after this list)

    1-3 of these are simply ILLEGAL options when purporting to run a "cloud" service.
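
    Point 4 at least has a decades-old fix: capped exponential back-off with jitter. A minimal sketch - the exception type and tuning numbers are placeholders, not any particular SDK's API:

      import random
      import time

      class TransientError(Exception):
          # Stand-in for a throttled / 5xx response from the service.
          pass

      def with_backoff(fn, max_attempts=5, base=0.5, cap=30.0):
          # Retry fn() on transient failures, sleeping a random interval
          # up to an exponentially growing (but capped) ceiling.
          for attempt in range(max_attempts):
              try:
                  return fn()
              except TransientError:
                  if attempt == max_attempts - 1:
                      raise
                  time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))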

  2. Herby

    Definition of "The Cloud"

    Is: someone else's computer that, by the way, you have little control over, and which might fail at ANY time with absolutely no recourse for your operations.

    Or: "Good luck with that!".

    1. This post has been deleted by its author

    2. Anonymous Coward
      Anonymous Coward

      Cloud vs Banksters

      The problem is that decision makers who opt to migrate to the cloud for cost savings today get rewarded today, like banksters on early bonuses, with no clawback ever... But those with no choice but to remain suffer the most from increased Cloud costs, downtime inconvenience and security / privacy issues, just like the hassle for those who remain after firings / outsourcing etc.

      Then there are the assumptions... Cloud is like the mother of all subscription models... What happens when an organization gets so reliant on Cloud that the thumbscrews come out and the subscription charges go up? Where are your cost savings then, organizations? Are you going to unplug? Most won't be able to go back and retool old solutions. Like addicts, they'll be completely stuck...

      Overall, the assumptions Cloud clients are making here are all wrong. Cloud is a gamble, and as with gambling, the house is usually the only winner. So this won't end well for most, and that's assuming there are no leaks, breaches or cloud buckets left open, etc...

      If that happens it's going to be a nightmare for the most naive / vulnerable organizations. In the era of Banksters & CDOs it was individuals, local governments and SMEs that got creamed; the insiders had CDS insurance. Don't expect any other outcome here...

      1. Anonymous Coward
        Anonymous Coward

        Re: Cloud vs Banksters

        What happens when an organization gets so reliant on Cloud that the thumbscrews come out and the subscription charges go up? Where are your cost savings then, organizations? Are you going to unplug? Most won't be able to go back and retool old solutions. Like addicts, they'll be completely stuck...

        This is where far too many enterprises, of all sizes, are right now. Not as a result of adopting "The Cloud", but of relying on SAP, Oracle, IBM, .... The list is rather long. Trading away from one crack dealer to get an "introductory offer" from another. [Which led to undeserved bonuses, only recognised as such in hindsight.]

        1. Anonymous Coward
          Anonymous Coward

          Re: Cloud vs Banksters

          "What happens when an organization gets so reliant on Cloud"

          The 'cloud' is just another deployment target. If you can't switch deployment targets, you're tooled incorrectly. Locking yourself into services (Azure-specific APIs etc) is more of an issue, but if your code is decoupled and agnostic, switching to another provider won't be immediate or free, but it shouldn't be an exorbitant burden either.
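
          In code terms that decoupling is just an interface boundary. A toy sketch, with illustrative names - a real Azure- or S3-backed class would sit behind the same interface:

            import abc

            class BlobStore(abc.ABC):
                # The only storage surface application code may touch.
                @abc.abstractmethod
                def put(self, key, data): ...

                @abc.abstractmethod
                def get(self, key): ...

            class InMemoryStore(BlobStore):
                # Test double; a vendor-backed implementation becomes a
                # wiring change at startup, not a rewrite of the logic.
                def __init__(self):
                    self._blobs = {}

                def put(self, key, data):
                    self._blobs[key] = data

                def get(self, key):
                    return self._blobs[key]

            def publish(store, name, payload):
                # Business logic sees the abstraction, never the vendor SDK.
                store.put("extensions/" + name, payload)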

          This is 2018. Digital-aware businesses should know their estate and domain much better than in the hidebound days. And most 'big' businesses have already faced that pain enough times to have adapted. Or died trying.

    3. Anonymous Coward
      Anonymous Coward

      Re: Definition of "The Cloud"

      Tellingly, they seem to believe their own propaganda despite evidence to the contrary from their very own experience. Time and time again, single points of failure are discovered in the field rather than through failure analysis and formal verification. [Yes, you really can do FV on distributed systems if you design them correctly.]

    4. TechDrone

      Re: Definition of "The Cloud"

      It doesn't matter whether you put a system in the cloud or on a server in your shed. If you don't understand where the single points of failure are, what the likely failure modes are, and who is responsible for each aspect of the services needed to run it, and if you haven't taken steps to mitigate the likely problems, then sooner or later you are going to have an incident.

      It doesn't matter if it's a cloud provider, your in-house techies, or your hubby/missus/gardener. If YOU are not managing it properly, with the appropriate governance in place, it's YOUR problem.

  3. SVV

    When locked in means locked out

    Being dependent on a single repository for developer resources, just so that Microsoft can collect its oh-so-important customer data, is both stupid and condescending. Why do they need to know you downloaded a copy of vim? It's not really valuable information, is it? All the developer resources I use are freely available in multiple repositories worldwide.

    By being such control freaks they're just guaranteeing such disruptive situations will occur again.

  4. Claptrap314 Silver badge

    It's Magick...

    I don't care what sort of "guarantees" someone gives: there is no 100% solution to availability. Google was switching over to some model along these lines while I was there. Yeah, if your application was small enough. And if they did not have to do anything drastic with power. And if nothing too unusual blew up while they were in the middle of their work. Then, yes, you would not go down short of a datacenter-wide event.

    But for the first year (the time I was there), these "rare" events happened more than once a month.

    So I don't fault the SREs at M$ if they were nervous about switching over. I also don't fault M$ for trying this new technology (which has a strong scent of salesware) in less critical datacenters before going to more critical ones.

    I DO fault them for having one datacenter that is critical in the first place.
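
    The back-of-the-envelope arithmetic says the same thing, optimistically assuming two datacenters fail independently (which, per the above, they often don't):

      single = 0.999                     # one "critical" DC at three nines
      redundant = 1 - (1 - single) ** 2  # down only if BOTH are down at once

      print("one DC, downtime/yr:  %.2f hours" % ((1 - single) * 8760))     # ~8.76
      print("two DCs, downtime/yr: %.4f hours" % ((1 - redundant) * 8760))  # ~0.0088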
