S3 has scale-in issues too, but they are quite minor by comparison. Unlike the M$ solution, the team has 5-10 minutes to remediate an over-active 'partition', and it will generally shard itself into N sub-keys over the course of a few hours (depending on key concentration). That bucket will be throttled until the index sharding is completed; only the index is split, objects are not moved or re-computed.
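To make "key concentration" concrete: it means many hot keys sharing a common leading prefix, so they all land on the same index partition. Below is a minimal client-side sketch of spreading keys by putting a short hash-derived prefix in front of them. The function name `spread_key`, the shard count and the `NN/key` layout are all my own assumptions for illustration; this is not a description of what S3 does internally.

```python
import hashlib


def spread_key(key, shards=16):
    """Return the key with a hash-derived shard prefix (hypothetical naming scheme)."""
    # Hash the key so that lexically adjacent keys get unrelated prefixes.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    # Two hex digits of shard id, then the original key unchanged.
    return f"{shard:02x}/{key}"


if __name__ == "__main__":
    for name in ("logs/2010-01-01/0001", "logs/2010-01-01/0002"):
        print(spread_key(name))
```

The trade-off is that a plain prefix listing no longer returns keys in their natural order, so this only suits workloads that look objects up by full key.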
Sounds to me like M$ has made several classic errors:
1) using an RDBMS in the first place - this is almost always the wrong technology. You might keep the 'master' in an RDBMS, or things like billing info, but object access should NEVER hit this tier.
2) not being the first to eat their own dog food - still not using AVs?
3) implementing services with a SPOF (single point of failure)
4) client software that has no concept of retry and back-off
Items 1-3 are simply ILLEGAL options when purporting to run a "cloud" service.
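On point 4, here is a minimal sketch of what retry with exponential back-off and jitter could look like on the client side. Everything named here (`ThrottledError`, `call_with_backoff`, the default delays) is hypothetical and only for illustration; it is not any vendor's actual client API.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised when the service throttles or rejects a request."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on ThrottledError with capped exponential back-off."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            # Out of attempts: let the caller see the failure.
            if attempt == max_attempts - 1:
                raise
            # The delay ceiling doubles on each attempt, up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Jitter: sleep a random fraction of the ceiling so that many
            # clients do not all retry at the same instant.
            sleep(rng() * cap)
```

Usage would be something like `call_with_backoff(lambda: fetch_object(key))`, where `fetch_object` stands in for whatever request the client makes. The `sleep` and `rng` parameters exist only so the behaviour can be tested without real waiting.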