There’s a lot of momentum behind moving to Platform-as-a-Service (PaaS) for delivery of applications on the cloud, but is there enough maturity in PaaS to deploy a mission-critical application? Here’s a story about one loyalty program provider that offers his customers SaaS solutions. This SaaS provider deployed his platform on a PaaS that offered SQL Server and ASP.NET services. The SaaS had been operating perfectly for months, handling the needs of his clientele, but inexplicably stopped working midday on a Saturday, a busy retail time period.

While the PaaS provider supposedly offered 24×7 support, they responded only to the first support ticket, and with a single response: “We really can’t identify problems within your application.” The developers did what they could to discern the problem, but all their research pointed to something changing on the server. Upon hearing this, I had a bevy of thoughts regarding the maturity of PaaS:

1. With IaaS, the customer takes responsibility for maintenance and updates to the server OS, so, hopefully, they test changes before rolling them out into production. SaaS providers could also implement changes that impact their customers, but, again, I would expect these to be tested before rolling out into production, with the ability to roll back should some unforeseen problem arise. However, PaaS providers can make changes to the platform that are all but impossible to test against every customer’s application, meaning that changes the PaaS provider rolls out could shut your application down. Moreover, they may not even be aware of all the nuances of a vendor-supplied patch, making it very difficult to recover and correct.
2. Of IaaS, PaaS and SaaS providers, the PaaS provider is the most likely to point the finger back at you before pointing at themselves. After all, other customers’ applications are up and running, so it must be your application. Hence, a PaaS provider needs to provide much more expensive help desk support than the other two service models in order to ascertain the severity of a problem and get it corrected. Additionally, PaaS providers need to provide more 1-on-1 phone support, as these issues are too complex to handle by email and trouble tickets alone.
3. IaaS, PaaS and SaaS offer progressively less visibility into the internals of the service. However, lack of control over the PaaS means that key settings affecting the performance of your application may be beyond your ability to change. Case in point: the specific problem this loyalty SaaS provider had was that the PaaS was reporting a general error and telling him to make certain changes to his application configuration environment. However, the recommended changes were already in place and were seemingly ignored by the platform. At this point, the lack of visibility meant that the PaaS service provider was the only one capable of debugging the problem, even if it had been the fault of the application, which in this case it was not.
4. This brings us to the next point: according to the PaaS provider, the problem was that an update forced the application to run under an incorrect version of .NET. Even though the developers forced the appropriate version through their control panel once problems started occurring, it seems that the control panel changes didn’t actually have any effect. Hence, the customer must have a lot of faith that the tools the PaaS provider offers actually do what they claim to do; if they don’t, the visibility issue in point #3 will once again limit any ability to return to full operating status.
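
Point 4 suggests one inexpensive defence: have the application itself confirm at startup which runtime it is actually running under, rather than trusting a control panel setting. The sketch below is illustrative only. It is written in Python rather than the ASP.NET stack in the story, and `EXPECTED_RUNTIME`, `runtime_matches` and `describe_mismatch` are hypothetical names, not part of any PaaS provider’s tooling.

```python
import sys

# Hypothetical pin: the major.minor runtime version the application was tested against.
EXPECTED_RUNTIME = (3, 11)


def runtime_matches(expected, actual=None):
    """Return True when the major.minor of the runtime equals the pinned version.

    `actual` defaults to the interpreter that is really executing this code,
    which is the point: the check does not rely on what a control panel says.
    """
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual[:2]) == tuple(expected[:2])


def describe_mismatch(expected, actual):
    """Build a log line naming both versions, suitable for quoting in a support ticket."""
    return "runtime mismatch: expected {}.{}, running {}.{}".format(
        expected[0], expected[1], actual[0], actual[1]
    )


if __name__ == "__main__":
    running = sys.version_info[:2]
    if runtime_matches(EXPECTED_RUNTIME):
        print("runtime ok: {}.{}".format(running[0], running[1]))
    else:
        print(describe_mismatch(EXPECTED_RUNTIME, running))
```

Logging the result of such a check at startup would not have prevented the outage, but it might have given the developers something concrete to quote in the support ticket.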

All this leads me to question whether PaaS is actually a viable model for businesses to rely on. IaaS gives them complete control, but they will need to learn how to architect for scale. SaaS removes all concern for having to manage the underlying architecture, and the SaaS provider will live or die based on their ability to manage the user experience. PaaS, however, is fraught with pitfalls and dangers that could cause your application to stop running at any point. Moreover, should this occur, the ability to identify and correct the problem may be so far out of your hands that only by spending an inordinate amount of time with your PaaS provider’s support personnel could the problem be corrected.

I welcome input from PaaS providers to explain how they overcome these issues and guarantee service levels to their customers when they cannot guarantee that customers’ applications are properly written, that the libraries they use will run as expected in a multi-tenant environment, or that a change they make to the platform won’t stop already-running applications. Additionally, I’d be interested in hearing answers that relate to PaaS solutions that are more than pre-defined IaaS images, such as CloudFoundry or Azure.

14 thoughts on “Thar Be Danger in That PaaS”
  1. I think what you are describing in your post is the generic fear associated with any outsourcing mechanism, whenever you leave some of your control on the table. I don’t think this is specific to the cloud.

    If you look at the AWS incident a few months back, it led to big issues for a number of companies, and while AWS’ answer wasn’t “it’s not our fault”, the outcome was pretty much the same: apps and systems were not working.

    As for PaaS, the example you are describing is inherent to any middleware layer, in the cloud or not. If you were to ask any support organization at a middleware vendor, there is a fine line between fixing an AS bug and fixing an app. At one of the leading AS vendors I know well, more than 90% of the support tickets complaining about an AS bug were never an AS bug, so the (bad) temptation to reply “check your app, our platform is doing fine” is high. But obviously vendors should resist that temptation, since that is not the kind of reply customers are paying support for.

    This probably hints at one thing: in the cloud, possibly more than anywhere else, the quality of support is a critical factor.

  2. Sorry, I disagree with your entire definition of PaaS because you’re describing a SaaS architecture developed by the end user running on IaaS they’ve provisioned from a service provider. PaaS is a concept that’s based on a SaaS application layer (that a provider specialises in) provisioned over IaaS (which the provider also specialises in) to provide a fully managed service. Anything short of a fully managed service is SaaS and IaaS under separate SLAs, which, as you rightly pointed out, leads to breaks in liability in the event of failures, and can’t possibly function as a service without breaking the PaaS SLA.

    PaaS is, as described, a Platform as a Service, and so all the fundamental layers and architectures that build the platform must be covered as part of the service, and thus any SLA.

    1. I believed my description was clear in the blog, but if not, let me clarify here. The user acquired an ASP.NET PaaS service from a PaaS provider. They cannot affect the infrastructure or OS that it runs on, and they developed a SaaS architecture on top of that PaaS. I believe that this is the goal of providing a PaaS: to deliver a scalable application platform that removes the need for developers to design the entire scalable and available architecture in order to deploy an application. However, I also believe that PaaS spans a broad range of models covering anything between pure IaaS and pure SaaS. Some consider PaaS to be a machine instance with pre-deployed applications that serve a particular purpose, but you’re still responsible for deploying on an IaaS.

  3. To further explain, let’s assume a customer wants to purchase a blogging solution from a provider but has no technical expertise; they just want to post their thoughts online.

    Typically, the provider’s solution will be an open-source/commercial SaaS blogging product provisioned across the provider’s shared hosting infrastructure (IaaS).

    In this scenario the provider manages the infrastructure and application completely independently, with little awareness of the overall effects of, for example, system updates (as you pointed out). But if an update broke the customer’s blogging solution, the provider would have to deal with fixing that breakage because, after all, they are providing that software application as a service.

    So here, because the provider is still offering both Infrastructure and Software as a Service, it continues to fully support both under its SLA.

    The same applies for PaaS, but one assumes the SaaS architecture is proprietary and layered into the infrastructure so that when system updates happen the provider understands exactly what may or may not be affected.

    If you look at WordPress for example, they offer both PaaS and SaaS solutions.
    The PaaS solution is hosted by them across their own infrastructure and scaled as needed to ensure that whether your blog gets 10 hits a day or 1 million hits, it’s still available regardless of what resources you may or may not require. WordPress operates the same SLA for both infrastructure and software because it is a PaaS concept, so if WordPress updated their backend infrastructure, they would, with any hope, have tested the change on their product code to ensure there would be no breakages, and if a breakage did happen (as it sometimes does) they would take liability and fix it.

    As for the SaaS solution, they allow end users to host the WordPress application internally or with another provider. Liabilities are separate and more complex. For example, you could host a buggy WordPress with a hosting provider and get your site hacked, but if WordPress fixed the bugs that led to your site getting hacked and you haven’t applied the update to your hosted site, the software vendor isn’t liable for that hack. That falls to whoever is responsible: the host providing the service or the software user who uploaded it to the hosted space.

  4. More often than not, PaaS will be a Proprietary service offering and where not, the Platform part of the service must be a speciality of the provider, otherwise they are simply offering SaaS over IaaS – which is not the same thing as PaaS…

  5. This exact same problem exists with desktop software and rollouts of Windows service packs. Sometimes those service packs break software that is installed and being used by users throughout an enterprise.

    Would you suggest the threat of that means we should revert to pen and paper and avoid computers?

    The rationale for the cloud is abundantly clear. Even with outages like the one you describe, the increased productivity and capacity for customer acquisition dwarf the negative impacts of a temporary outage due to a buggy rollout.

    1. You clearly do not have an understanding of the problem domain. This is not the exact same problem, since an internal IT department should know what applications are running on those desktops and what hardware assets they have, and be able to test a service pack before rolling it out. And if they don’t or can’t…well, shame on them. In a public PaaS, there is no such luxury or opportunity. Now, someone intelligently stated on a Focus.com roundtable we had last week that a private PaaS would probably not be subject to the same issues. In my opinion, that falls under the internal IT service pack rollout scenario: they should know all the applications running in that PaaS cloud and be able to test any changes in a staging environment.

  6. JP, thanks for the great post!

    I agree with Sacha that the generic fear that is associated with something outsourced is most definitely present in PaaS, and that it suffers from the normal ailments of middleware. This is to be expected in any system where arbitrary code (e.g. a guest application) and a well-known foundation co-habit memory and execution spaces. It creates a natural “fingerpointing” scenario. With PaaS, unfortunately, the situation is exacerbated by the fact that a third party is responsible for operating the entire stack. This is different from traditional middleware in that the same party is responsible for both the middleware offering AND the service, creating a poor alignment that essentially amounts to “operational lock-in.” This scary form of lock-in is the crux of the problem.

    A “deploy-anywhere PaaS” model makes huge strides in normalizing the risk for software developers like the one in the example you highlighted. Essentially, this means that the PaaS layer is independent of the operating layer. A software company can download the PaaS software and deploy it on any infrastructure, getting access to the PaaS’ primary value prop (which is not the outsourcing, but rather the developer productivity and agility as well as commoditized cloud architecture patterns) while keeping control over the raw metal or IaaS infrastructure under the hood. This helps obviate some of the control issues and would give access to both the PaaS and infrastructure in these sensitive situations.

    Furthermore, a downloadable PaaS allows any organization to deploy an instance of that PaaS layer as a service (yes, it’s redundant, but I co-opted PaaS to refer to the software layer). This gives end users freedom of choice: they have a consistent PaaS software layer to depend on, plus the ability to choose a provider, so that when service degrades or an organization fails to provide consistent service, the PaaS customer can uproot (clearly, this would be difficult in its own right, but at least options are available).

  7. JP, great post as always. Sorry for my late review. I haven’t been keeping up with Google+ like I should. I really see PaaS as the new OS, but with dramatically improved deployment characteristics, with simplified and centralized management. All good things for a rapid deploy and stable environment.
