Would be fascinated to see your data over a period of months.
Application uptime is flaky, but what was worse were fly deploys failing for no clear reason. Sometimes layers would just hang and eventually fail; I'd run the same command an hour or two later without any changes and it would work as expected.
I'd love to make a monitoring service to deploy a basic app (i.e. run the fly deploy command) every 5 minutes and see how often those deploys fail or hang. I'd guess ~5% inexplicably fail, which is frustrating unless you've got a lot of spare time.
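For what it's worth, a rough sketch of that prober in Python could look like the following. This is only an illustration: the function names, the 10-minute timeout used to decide a deploy has "hung", and the assumption that it runs from a directory containing a deployable app are all mine, and the only fly command it relies on is the `fly deploy` mentioned above.

```python
import subprocess
import time
from collections import Counter

# Assumption: run from the directory of a small test app, using the
# same command mentioned in the comment above.
DEPLOY_CMD = ["fly", "deploy"]


def probe_once(cmd=DEPLOY_CMD, timeout_s=600):
    """Run one deploy attempt and classify it as 'ok', 'failed', or 'hung'."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The command did not finish in time; count it as a hang.
        return "hung"
    except FileNotFoundError:
        # The command is not installed or not on PATH.
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


def failure_rate(counts):
    """Fraction of attempts that did not end in 'ok'."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts["ok"]) / total


def run_prober(cmd=DEPLOY_CMD, interval_s=300, timeout_s=600, max_runs=None):
    """Probe every interval_s seconds; loops forever when max_runs is None."""
    counts = Counter()
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.monotonic()
        outcome = probe_once(cmd, timeout_s)
        counts[outcome] += 1
        runs += 1
        print(f"run {runs}: {outcome} (failure rate {failure_rate(counts):.1%})")
        if max_runs is None or runs < max_runs:
            # Sleep out the remainder of the interval, if any is left.
            elapsed = time.monotonic() - started
            time.sleep(max(0.0, interval_s - elapsed))
    return counts
```

Calling `run_prober()` starts the 5-minute loop. Separating "hung" from "failed" matters here because the two symptoms described above (layers hanging versus outright failure) may have different causes, and a real version would also want to persist results somewhere rather than just print them.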
I used to run a service that created k8s clusters on GCP for our customers. We did want to check that that functionality kept working and had a prober test it periodically. It was actually broken a lot.
Always good to monitor your dependencies if you have the time. Then when someone complains about an issue in your service, you can check your monitoring to see if your upstream services are broken. If they are, at least you know where to start debugging.
My downtimes from fly are pretty rare but generally global when they happen. In this outage we had no downtime but couldn't deploy for a few hours. I have issues with deploying about once per quarter (I deploy most days across a few apps).
If that’s the case, I suspect fly is getting a lot more reliable. I stopped using them about a year ago, so I haven’t kept up on their reliability since. Glad to hear it; it’s good for a competitive market to have many providers, and fly might have issues but hopefully has a bright future.
They are definitely getting more reliable. I was an early user and moved off them to self-hosted for quite a while because of the frequent downtime in the early days.
Their support still leaves a lot to be desired, even as someone who pays for it, but the ease of running and deploying a distributed front end keeps bringing me back.