On the surface it looks like a chicken-and-egg problem. Is cloud driving IT transformation, or is IT transformation driving cloud’s growth? The movement to cloud requires a shift in how many IT organizations approach procurement, management, security and other core functions. In some cases, public cloud offerings are forcing IT to change how it delivers services, because the business now has access to options it can acquire without IT’s participation.
The truth is that IT organizations can build and deliver private cloud offerings without changing how they operate internally. Often, these efforts result in less-than-stellar adoption by the business. Without transformation of the business and operational processes, cloud is just another silo that increases operational overhead and often “proves” the business’s point that IT is too slow and cumbersome for today’s needs.
Alternatively, changes in operational and business processes without incorporating cloud computing can still yield significant advantages for the business by lowering operational overhead, reducing risk and, most importantly, fostering a change in how IT is perceived by the business. The movement toward delivering IT-as-a-Service (ITaaS) provides certain key advantages:
- It exposes to the business the complexities of delivering a highly available, resilient and recoverable system. These costs are often masked or hidden, making IT look more expensive than other options, especially public cloud, when in fact, once the same services are added to those competing options, internal IT often turns out to have the lower total cost of ownership.
- It sets clear expectations for the consumer and ensures appropriate communication. When consumers know what to expect, they are less irritated by what they might otherwise perceive as an inability to deliver. A good analogy is waiting at an airport gate for a delayed flight: when the airline continually updates passengers, there is less anxiety and less aggression toward the airline’s staff.
- It builds trust. The biggest argument I have heard against moving to a shared-services model run by a single IT organization is a lack of trust that the organization can deliver. In businesses with multiple divisions, there is often hesitancy to allow a “corporate IT” to build and operate a shared infrastructure due to past failures and a demonstrated inability to deliver on time. In businesses where these organizations introduced governance boards and demonstrated small successes based on shifts toward ITaaS, attitudes changed quickly as the divisional IT groups recognized the benefit of leveraging economies of scale across all divisions.
- It simplifies the life of the operational staff. In businesses where IT spends 75% of its time putting out fires and can rarely take on new projects, shifting to ITaaS has simplified the operational environment, delivered greater reliability, reduced the number of individuals required to participate in troubleshooting and root cause analysis, and shifted time back to the backlog of outstanding requests. In turn, the business perceived these IT organizations as more responsive and was less inclined to seek out “shadow IT” solutions.
This list is just a fraction of the complete set of benefits of moving to ITaaS and, interestingly, none of them mentions the word cloud. Does this mean that cloud isn’t really all that important? No. It just means that the transformation toward ITaaS should be viewed as a higher priority for IT than building or moving to cloud. Indeed, in many respects, cloud success is predicated on this transformation. Certainly, cloud technologies can be applied in the data center to simplify operations, but the savings and benefits of that activity quickly plateau and are short-lived if the IT organization continues to operate in a traditional manner.
If I were a CIO or CTO today, my first action would be to develop a business plan to take to management requesting investment in transformation. This investment is critical, as the funding is above and beyond what most IT budgets today can accommodate. The additional funds would then be used to build a team, drawing on both internal and external (consulting) resources, to develop a roadmap for the transformation, design the new processes and define how to transition to them. I would also ensure that this team included financial analysts, to develop accurate cost models showing management how their investment will reduce operating costs, and marketing staff, to advertise the benefits back to the business and keep it informed of how these changes will produce a more agile organization that supports its needs.
Moving to ITaaS is not as painful or as difficult as it may seem. It does require accepting that the transition will not occur rapidly and that no one expects to “eat the elephant in one bite”. It is a very repeatable methodology that most organizations can follow with success. Finally, and perhaps most importantly, it requires that management be pragmatic about the cultures and staff that will attempt to preserve the status quo: allay the fears and concerns of these individuals through training and support, but be willing to remove the venomous few who continually stand in the way of the transformation.
I was recently reviewing some sales and marketing materials on building out Infrastructure-as-a-Service (IaaS). Part of these materials listed attributes of IaaS, one of which is multi-tenancy. Having been working lately with some large enterprise customers on private cloud, I found this attribute really got me thinking.
In most cases for the enterprises I have been working with, the private cloud is about agility and the workloads are all owned by, and related to, the same line of business. From the macro perspective, there is effectively one tenant, yet from a micro perspective (the workload), there are multiple tenants that may require some isolation for purposes of service level and resource management.
Why is this important? Multi-tenancy is expensive. It requires additional resources and overhead to manage, including encryption and key management, isolation across network, compute and storage, and additional support for tenant management. Remove this overhead and those physical resources can be devoted to workloads instead.
Further confusing the issue is that “private cloud” is used to describe both physicality and consumption models. Described through its physicality, a private cloud is typically represented as “on premises”, either in the customer’s data center or in a third-party co-location facility. However, since cloud is fundamentally a business consumption model, I prefer to describe it by who consumes it: anything that has a single tenant at the macro level is a private cloud. Moving away from describing private cloud by its physicality therefore requires that we establish both a macro and a micro representation of tenancy. That is, by recognizing at the macro level that there is only one real tenant consuming the private cloud, many of the multi-tenancy features can be minimized in favor of devoting more resources to workloads.
For enterprises operating as a shared IT services provider across multiple divisions with various legal and operational requirements for security and isolation, whether the cloud exists on premises or in a co-location scenario, secure multi-tenancy will be a requirement. However, using the existing cloud computing vocabulary, it may be more effective to identify these types of clouds as community clouds instead of private clouds, since all the tenants have a shared common interest. In this way, once again, we can describe the appropriateness of multi-tenancy by consumer instead of physicality.
With many of my clients looking to develop private clouds, success will be predicated on developing and delivering the right features and functions. Identifying the real consumer of the cloud service, and understanding the relationships between the workloads, can go a long way toward avoiding either using an axe to kill a fly or falling prey to an inadequate service-level model. By identifying a private cloud as having a single tenant at the macro consumer level and a community cloud as having multiple related tenants at the macro consumer level, we have a model that ensures we are providing the appropriate features and capabilities regardless of the physical infrastructure deployment.
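To make the macro/micro distinction concrete, here is a minimal sketch, in Python, of how a service catalog might model tenancy. The class names and fields are hypothetical illustrations of the idea, not part of any particular cloud platform:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Workload:
    """A micro-level tenant: needs service-level and resource isolation only."""
    name: str
    cpu_cores: int
    memory_gb: int
    needs_network_isolation: bool = False  # e.g., a separate network segment

@dataclass
class MacroTenant:
    """A macro-level tenant: the business line or division consuming the cloud."""
    name: str
    workloads: List[Workload] = field(default_factory=list)

def classify_cloud(macro_tenants: List[MacroTenant]) -> str:
    """Classify by consumption model, not physicality: one macro tenant means a
    private cloud (heavyweight multi-tenancy features can be dialed back); several
    related macro tenants mean a community cloud (secure multi-tenancy required)."""
    return "private" if len(macro_tenants) == 1 else "community"

# One business line with several workloads is still, at the macro level, a private cloud.
retail = MacroTenant("Retail Division", [Workload("web", 8, 32), Workload("db", 16, 128)])
print(classify_cloud([retail]))  # -> "private"
```

The point of the sketch is that the decision about heavyweight multi-tenancy features is driven by the count of macro tenants, not by where the hardware happens to sit.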
I’ve been granted an incredible opportunity. Over the past three and a half months I have gotten to lead a real-world, large-scale delivery of a cloud solution. The final solution will be delivered as Software-as-a-Service (SaaS) to the customer via an on-premises managed service. While I have developed SaaS and PaaS (Platform-as-a-Service) solutions in the past, I was fortunate enough to build those on public cloud infrastructures. This has been a rare glimpse into the “making of the sausage”, having to orchestrate everything from delivering hardware into data centers in four countries to testing and integration with the customer environment.
All I can say about this opportunity is that the phrase “it takes a village” applies well. I thought I’d share some important generalities about this type of effort. It’s important to note that this is a Global 100 company with data centers around the globe. Regardless of what the public cloud providers are telling the world, this application is not appropriate for public cloud deployment due to the volume of data traversing the network, the amount of storage required, the types of storage required (e.g. Write-Once-Read-Many), the level of integration with internal environments and the requirements for failover.
The following are some observations about deploying cloud solutions at this scale:
- Data Centers. As part of IT-as-a-Service (ITaaS) we talk a lot about convergence, software-defined data centers and general consolidation. All of this has major implications for simplifying management and lowering the total cost of ownership and operations of the data centers. However, we should not forget that it still takes a considerable amount of planning and effort to bring new infrastructure into an existing data center. The most critical consideration is that the data center is a living entity that doesn’t stop because work is going on, which means a lot of this effort occurs after hours and in maintenance windows. This particular data center freezes all changes from mid-December to mid-January to ensure that customers will not have interrupted service during a peak period that includes major holidays and end-of-year reporting, which had a significant impact on attempting to meet certain end-of-year deliverables. On-site surveys were critical to planning the placement of the equipment (four racks in total) on the floor to minimize cabling efforts and ensure our equipment was facing the right direction for hot/cold aisles. Additionally, realize that in this type of business, every country may have different rules for accessing, operating in and racking your equipment.
- Infrastructure. At the end of the day, we can do more with the hardware infrastructure architectures now available. While we leverage virtualization to take advantage of the greater compute power, it does not alleviate the requirements around planning a large-scale virtual environment that must span countries. Sometimes, it’s the smallest details that are the most difficult to work out, for example, how to manage an on-premises environment, such as this one, as a service. The difficulty here is that the network, power, cooling and so on are provided by the customer, which requires considerable effort to negotiate shared operating procedures while still attempting to commit to specific service levels. Many of today’s largest businesses do not operate their internal IT organizations with the same penalties for failure to meet a service level agreement (SLA) as they would apply to an external service provider. Hence, service providers that must rely on this foundation face many challenges and hurdles in ensuring their own service levels.
- Security. Your solution may be reviewed by the internal security team to ensure it complies with current security procedures and policies. Since this is most often not the team that procured or built the solution, you should not expect that they will be able to warn you about all the intricacies of deploying a solution for the business. The best advice here is to engage the security team early and often, starting as soon as you have completed your design. In US Federal IT, deployment usually requires that those implementing the system obtain an Authority to Operate (ATO). Quite often, medium- and large-sized businesses have a similar procedure; it’s just not spelled out so succinctly. Hence, these audits and tests can introduce unexpected expenses due to the need to modify the solution, as well as unexpected delays.
- Software. Any piece of software can be tested and operated under a modest set of assumptions. When that software must be deployed as part of a service that has to meet certain performance metrics as well as certain recovery metrics in the case of an outage, that same software can fall flat on its face. Hence, the long pole in the tent for building out a cloud solution at this scale is testing for disaster recovery and scalability. In addition to requiring time to complete, it often requires a complementary environment for disaster recovery and failover testing, which can be a significant additional cost to the project. I will also note that in a complex environment software license management can become very cumbersome. I recommend starting the license catalog early and ensuring that it is maintained throughout the project (a minimal sketch of such a catalog appears after this list).
- Data Flow. A complex cloud-based solution that integrates with existing internal systems operating on different networks across multiple countries will have to cross multiple firewalls and routers and run along paths with varying bandwidth carrying varying levels of traffic. Hence, production operation and remote management can be impacted by multiple factors, both during planning and during operation. No matter how much testing is done in a lab, the answer seemingly comes down to, “we’ll just have to see how it performs in production.” So, perhaps, a better title for this bullet might be “Stuff You’re Going To Learn Only After You Start The Engine.” Your team will most likely have a mix of personalities. Some will be okay with this, having learned from similar projects in their past; others will not be able to get past this point and will continually raise objections. Shoot the naysayer! Okay, not really, but seriously, adopt this mandate and make sure everyone on the team understands it.
- Documentation. I cannot say enough about documenting early and often. Once the train has started, it’s infinitely more difficult to catch up. Start with good, highly reviewed requirements. Review them with the customer. Convene the Architecture Review Board (ARB, described below) and have it review and sign off. This is a complex environment with a lot of interdependencies. It’s not going to be simple to change one link without affecting many others. The more changes you can avoid, the more smoothly the process of getting the system into production will go.
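As promised under the Software bullet, here is a minimal sketch of the kind of license catalog I mean, assuming Python is available for project tooling; the field names and file name are illustrative only:

```python
import csv
from dataclasses import dataclass, asdict, fields
from datetime import date
from typing import List

@dataclass
class LicenseEntry:
    """One row in the project's software license catalog."""
    product: str          # e.g., "Backup agent"
    vendor: str
    license_key_ref: str  # pointer to where the key is stored, never the key itself
    seats_or_cores: int
    expires: date
    deployed_to: str      # environment or site, e.g., "DR site, country 2"

def write_catalog(path: str, entries: List[LicenseEntry]) -> None:
    """Persist the catalog as CSV so it can be reviewed alongside other project documents."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(LicenseEntry)])
        writer.writeheader()
        for entry in entries:
            row = asdict(entry)
            row["expires"] = entry.expires.isoformat()
            writer.writerow(row)

# Start the catalog on day one and update it with every change request.
write_catalog("license_catalog.csv", [
    LicenseEntry("Backup agent", "ExampleVendor", "vault://licenses/backup", 64,
                 date(2026, 12, 31), "Primary data center"),
])
```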
Most important, and I cannot stress this enough, is building a team environment to accomplish the mission. Transforming a concept into a production-ready operational system requires a large number of people working cooperatively to address the hurdles. The solution as designed on paper will hardly ever perfectly match what is deployed in the field, for the reasons stated above. This project relies heavily on a Program Management Organization, with representatives from engineering, managed services, field services, product and executive leadership, to stay on track. Developing a sense of team within this group is critical to providing the appropriate leadership to the project as a whole. Subsequently, we also formed the Architecture Review Board (ARB) mentioned above, composed of key technical individuals covering each aspect of the solution, to address and find solutions for the major technical issues that emerged throughout the project. In this way we ensured the responses were holistic in nature, not just focused on the specific problem, and also provided alternatives that would work within the scope of the entire project.
For the first time in my life, after owning three homes, I am in the position of having to replace my air conditioning unit. The company I chose to purchase and install the new unit told me the name of the unit’s manufacturer, and I set out to read customer reviews about that manufacturer. I found one site with over 1,000 reviews, a high majority of them negative (highly unsatisfied). As I began to read through the reviews, one thing became readily apparent: those who understand heating, ventilation and air conditioning (HVAC) praised the unit, while customers sweating their … well you know what … off were highly disappointed.
Having worked on installing new HVAC systems in both of my prior homes, I was able to assess that the positive reviews were most likely correct. Frankly, the key theme that ran through the reviews was that if you select a knowledgeable and qualified installer, the unit will work perfectly, and if you select an unqualified installer, chances are you will have issues with the unit within a week of use. The reason I agreed with the positive reviewers is that I have some understanding of the components and common elements of these systems. They’re not plug-n-play. You cannot just put the new unit in place of the old unit and expect everything to work as expected. For example, most new units use a new type of refrigerant, and if you don’t properly evacuate the system, the older refrigerant will corrupt the new compressor lickety-split.
This led me to think about how many customers are dissatisfied with their IT systems and with their attempts to actualize the promises of cloud computing. Gartner Group has introduced the world to the “trough of disillusionment”, in which technology doesn’t live up to the promises of vendors, analysts and pundits. Service Oriented Architecture (SOA) is a perfect example of this trend. I developed a Platform-as-a-Service (PaaS) supporting the retail and supply-chain industries back in 2005. I leveraged a solid SOA design, and the system was incredibly agile and allowed us to develop new business services in weeks instead of months. However, for many, SOA was a complete flop. Here, as with the HVAC units, I blame the installer.
It’s very likely that we will enter the “trough of disillusionment” with regard to cloud computing as well. Many that jumped on board the public cloud train early are now beginning to see issues with costs and controls and are considering how to move off public cloud to private cloud environments. Others have jumped on the private cloud bandwagon based on promises of lower IT costs, only to learn that there is an initial investment that sometimes masks future cost savings. Moreover, those cost savings are now being shown to plateau, which doesn’t address the pressure of continually shrinking IT budgets. Once again, with all these disillusionments I blame the installer, because I’ve seen multiple customers achieve concrete benefits and goals from cloud computing that met or exceeded their expectations.
So, whether you are shopping for a new HVAC system, getting your car fixed, or selecting a strategy and direction for your IT organization, keep in mind that you get what you pay for. If you are not happy, or the system doesn’t work as planned, it might be a problematic component, but oftentimes it’s the installer!
There’s been a lot of discussion about what makes cloud computing different from other forms of computing that have come before. Some refer to the set of attributes set forth by NIST, while others rely on less precise qualifications, content simply to identify any network-accessible service as cloud, and others define cloud by applicable business models. In the past, I have written about scale as a common abstraction based upon some of these other definitions. However, more recently, I’ve come to the realization that we need to define cloud by where it’s going and not by what it is in its infancy.
Cloud computing is following in the vein of the automobile and fast-food industries. These industries introduced their first products with little to no customization and then evolved to compete on value through significant customization. The automobile industry started out offering only a black Ford Model T and today allows buyers to order a completely custom-designed car online and have it delivered to their home. Likewise, cloud computing started out as vanilla infrastructure services and is rapidly moving toward greater levels of customization. Ultimately, cloud computing will not be defined by service model monikers, but will be a complete provision, package and deliver (PPD) capability facilitating control over the type of hardware, operating systems, management systems, application platforms and applications.
When building a new home, buyers go through a process of choosing carpeting, fixtures, countertops, etc., but ultimately, their expectations are that they will be moving into a completed house and not showing up to a pile of items that they then need to further assemble themselves. This is the perspective that we should be applying to delivery of cloud computing services. Consumers should have the opportunity to select their needs from a catalog of items and then expect to receive a packaged environment that meets their business needs.
Many of today’s cloud service provider offerings either approximate raw materials that require additional refinement or deliver a pre-configured environment that meets only a subset of the overall requirements. The former approach assumes that the consumer of these services will take responsibility for crafting the completed service, inclusive of the supporting environment. The latter approach simplifies management and operations, but places restrictions on the possible uses for the cloud service. Both of these outcomes are simply a result of the current level of maturity in delivering cloud services. Eventually, the tools and technologies supporting PPD will improve, leading to the agility that epitomizes the goals of cloud computing.
Meeting the goals for PPD entails many prerequisite elements. Chief among these are automation and orchestration. Cloud service providers manage pools of resources that can be ‘carved’ up in many different ways. Due to the complexity of pricing and management, most cloud service providers limit the ways this pool is allocated. As the industry matures, service providers will get better at developing pricing algorithms and gain a greater understanding of what consumers really need. Meanwhile, we will see great improvements in cloud manager software that will facilitate easier management and allocation of resources within the pool, allowing for much more dynamic and fluid offerings. Coupled with automation and orchestration, cloud service providers will find it easier to offer consumers greater numbers of options and permutations while still balancing cost and performance.
Defining cloud computing by its nominal foundations is akin to specifying the career choice for a young child. Infrastructure, platform and software services illustrate possibilities and solve some important business problems today. However, most businesses still find cloud environments too limiting for their mission-critical applications, such as manufacturing and high-volume transactions. It won’t be long, though, before users can specify the speed of the network, the response requirements for storage, the security profile, the number and types of operating system nodes and the quality-of-service parameters the environment must operate under, among many other attributes, and have the provider provision, package and deliver the requested virtual environment. This is what we should be using as the profile by which we define cloud computing.
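As a thought experiment, a PPD-style request might look something like the sketch below. It is written in Python purely for illustration; the field names are hypothetical and do not correspond to any existing provider’s API:

```python
from dataclasses import dataclass, field

@dataclass
class PPDRequest:
    """A hypothetical provision-package-deliver request: the consumer specifies
    outcomes and constraints, not raw infrastructure parts."""
    network_bandwidth_gbps: float       # required network speed
    storage_latency_ms: float           # response requirement for storage
    storage_type: str                   # e.g., "WORM", "block", "object"
    security_profile: str               # e.g., "PCI", "internal-only"
    node_count: int                     # number of operating system nodes
    node_os: str                        # type of operating system
    qos: dict = field(default_factory=dict)  # quality-of-service parameters

    def validate(self) -> None:
        if self.node_count < 1:
            raise ValueError("at least one node is required")
        if self.network_bandwidth_gbps <= 0:
            raise ValueError("bandwidth must be positive")

# The consumer describes the environment; the provider is responsible for
# provisioning, packaging and delivering something that satisfies it.
request = PPDRequest(
    network_bandwidth_gbps=10,
    storage_latency_ms=5,
    storage_type="WORM",
    security_profile="internal-only",
    node_count=12,
    node_os="Linux",
    qos={"availability": "99.95%", "rpo_minutes": 15},
)
request.validate()
```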
As I was leaving Las Vegas after EMC World, I started thinking about the transformation of Las Vegas from a desert oasis created by organized crime into a billion-dollar industry. How did big business and Wall St. push the mob out of Vegas? Movies and television shows such as Casino, Goodfellas and The Sopranos illustrate a side of organized crime that few ever witness. Las Vegas was created by organized crime to service their syndicate with money laundering, prostitution, entertainment and schmoozing services. It worked splendidly; until it didn’t. Something changed, and it created a landslide that eventually saw Vegas’ founders ousted and replaced with a larger, more powerful and more adept landlord: mega-corporations. How they accomplished this is a lesson that clearly should be noted by IT.
Believe it or not, Governance, Risk and Compliance (GRC) was the tool of choice in ousting organized crime from Vegas. Big business and government made it an inhospitable environment for crime syndicates to operate in, at least with regard to gaming and hoteling. The first step was to impose transparency on the casinos. Because the casinos were long suspected of rigging games, local and federal government initiatives pushed for regulation of, and compliance in, gaming. This had the effect of reducing the ill-gotten gains of organized crime while lowering the risk for gamblers, since the odds were considerably more in their favor without the magnetic roulette ball or aces tucked under the blackjack table.
Unfortunately, the downside of regulation is the need for auditing and, of course, the first twenty auditors are buried in the foundations of the older hotels (just joking!). Seriously, though, increased regulation transformed gaming and hoteling, for organized crime, from stealing and laundering into operations and management, which required far more effort than organized crime’s leaders were willing to put in, leading them to abandon the gaming and hoteling businesses to legitimate companies that were interested in providing these services.
So, in the first three paragraphs, we see the use of GRC to foster transparency, which changed the rules for the current operators and transformed the environment into something hundreds of times more profitable than the original operators ever imagined. If GRC has the power to transform Las Vegas from a criminal institution into a reputable business, what do you think it could do for transforming your IT organization? Today, on average, businesses are paying upwards of 70% of their IT budgets just to keep the lights on, and that number is rising. IT as we know it grew up organically around the introduction of computers into the business and, just as with the organized crime syndicates, as that number rises the job is going to become less and less interesting to do.
We need to use GRC to transform our IT organizations so that the next incarnation can rise from the ashes of the old, just as it did in Vegas. The next generation of IT will be faster and smarter and will provide the services consumers want, with the compliance and assurances that give them comfort. After all, Vegas’ consumers didn’t go away; the audience grew and more types of consumers were added to the mix. That’s the outcome we should expect for a transformed IT organization, and the steps have already been shown to us:
- Engage governance
- Change the rules
- Ensure the rules are being followed
- Make it less desirable to continue operating in the current manner
One final note: many speak of IT transformation as a move to IT-as-a-Service (ITaaS), but thinking about this blog entry has made me realize that we already offer ITaaS; it’s just that the current services are a) not desired by the consumer, b) too costly for the consumer, or c) too slow to consume. In the early days of Vegas, the organized crime syndicates were providing services, as I mentioned in the opening paragraph. The issue was that there was a limited audience for the services they were offering. I believe that’s kind of what we see happening now with IT. By transforming IT and bringing in new operators and new rules, we change the services we deliver and how they are delivered, thus creating a much larger audience.
Once big business took over in Vegas, it started using data to reduce its risk. For example, the casinos eliminated card counters and collusion between dealers and players. They also changed the rules of the games so that the odds were in their favor. Hence, they became more profitable and were able to expand into other services, such as conventions and large-scale catering. Plus, as the economy ebbs and flows, they are positioned to respond quickly, modifying their service catalog and pricing to suit. So now big business isn’t watching its costs rise with no end in sight, but is instead focused on what is needed to continue growing and making profits. Ultimately, this is what IT needs to strive to support.
One of the leading problems plaguing IT organizations is the high cost of operations and maintenance. The industry average is roughly 70% of the IT budget, with some organizations going as high as 90%. Picking apart these costs, one often finds a stratified organization focused on narrow bands of computing with little crossover between the bands. Moreover, the political weight between layers often makes even basic collaboration across the stratified layers too risky. Hence, when problems arise, each layer attempts to solve them only with the tools at its disposal. The result is the IT equivalent of Operation Petticoat: a system wired together with chewing gum and bras.
JP’s IT Axiom #124: Design flaws at the top of the stack will highlight limitations at the bottom of the stack. Likewise, the design at the bottom of the stack impacts performance at the top of the stack.
There’s no escaping the fact that a poorly designed application will put undue burden on the operating infrastructure. A “chatty” application impacts bandwidth. Improperly designed database queries will consume memory and disk capacity. A poorly designed storage architecture will limit the I/O operations per second (IOPS) and thus limit how quickly data can be retrieved for the application. IT transformation is about moving from a stratified organization to an agile organization through the use of DevOps culture and other collaborative techniques.
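To illustrate the “chatty application” point, here is a sketch, in Python, of the classic N+1 query pattern next to its batched alternative; the table and column names are hypothetical:

```python
# Hypothetical data-access layer; 'db' stands in for any SQL connection object
# (e.g., a sqlite3 or DB-API connection).

def load_orders_chatty(db, customer_ids):
    """N+1 pattern: one round trip per customer. On a high-latency or shared
    (multi-tenant) network, the per-call overhead dominates total time."""
    orders = []
    for cid in customer_ids:
        rows = db.execute(
            "SELECT * FROM orders WHERE customer_id = ?", (cid,)
        ).fetchall()
        orders.extend(rows)
    return orders

def load_orders_batched(db, customer_ids):
    """Single round trip: the same data in one set-based query, shifting work
    to the database engine and off the network."""
    placeholders = ",".join("?" for _ in customer_ids)
    return db.execute(
        f"SELECT * FROM orders WHERE customer_id IN ({placeholders})",
        tuple(customer_ids),
    ).fetchall()
```

The batched version does the same work in one network round trip, which is exactly the kind of issue a collaborative team catches early and a stratified one papers over with more hardware.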
Short of correcting this organizational challenge, the stratified layers will attempt to correct issues using the tools at their disposal. Hence, infrastructure and operations (I&O) will scale memory, servers and storage linearly to compensate for design flaws in the application. Software engineering will add specialized code to work around limitations in the infrastructure, such as timeouts and latency. Removing the stratification in favor of collaborative teams means that issues can be rooted out and solved appropriately.
Moreover, this stratification has greater implications for delivery of private cloud services to the organization. Indeed, while many organizations focus on delivering Infrastructure-as-a-Service (IaaS) from their private cloud, it raises the question, “What is the cloud strategy for the organization?” IaaS implies that the consumer will manage their own applications in the cloud and that IT is simply the supplier of infrastructure services. I posit that this is merely an extension of the stratification of IT, with the I&O layer delivering within its swimlane. However, it misses the greater opportunity for the business as a whole, which is to deliver reliability, quality, trust and scalability for data and applications in a consistent manner.
Hence, IT organizations should focus on delivering Platform-as-a-Service (PaaS) to the business, as this will provide a consistent way to design, build, deploy and manage applications, lowering operational overhead while delivering greater overall agility. By delivering IaaS, the business loses the opportunity for this consistency, as engineering teams become responsible for building and deploying their own application runtime platforms. Even if a single vendor’s application platform is used, the variety of configurations will make the environment more difficult to support, lead to longer repair cycles and add undue complexity to operational concerns.
Private cloud computing represents a unique opportunity for the business to reduce operating overhead significantly through the three C’s: consolidation, consistency and congruence. To achieve this goal, IT needs to break down the stratified layers and form workload teams composed of members from various parts of the IT organization who together become responsible for the workload’s availability, performance and consumer experience.
I’ve recently been theorizing about a new model for IT transformation. There’s anecdotal evidence that, in general, business problems tend to change more slowly than the rate of technology innovation. Thus, we can infer that IT has focused on applying technical innovation to solve existing business problems in more effective ways, rather than using technology innovation only to solve new problems or continually evolving the solution to an existing business problem.
The figure below illustrates a generally observed pattern in IT. Existing problems move to the new platform under the guise of “technical refresh”, while technology innovation introduces new business problems to be resolved. Ultimately, what we learn is that there’s very little stickiness to legacy platform selection, and that users will eventually attempt to migrate their solution domain forward onto the latest solution set in an attempt to derive lower costs, easier support, better performance and an overall improved customer experience.
However, something interesting is occurring in this most recent migration that is best described as service-orientation. This change has a profound impact on the IT industry, which we will call the “Facebook effect” for lack of a more widely understood term. The Facebook effect is best explained as follows:
In general, small populations of Facebook users will leave the service completely and, for many others, overall time on the site may diminish; however, Facebook, as a service, is seemingly entrenched to the point where it cannot be unseated.
That is, another service will most likely not emerge offering the same functions and capabilities and driving a migration of users away from Facebook. This is because users have invested in customizing the service to the point where it would be extremely painful to recreate that investment in a completely separate service.
Over time, Facebook adds more features and the underlying performance is improved through technology innovation and continuous platform improvement, but the consumer of the Facebook service is relatively unaware of, and unaffected by, these changes. In contrast, today, enterprise IT users are very aware of underlying platform changes. For example, a migration from the client/server version of an application to the Web-based version represents a significant shift in user experience.
As IT organizations start to adopt, and more to the point perfect, a more service-oriented approach toward IT delivery, I believe they will start to experience the Facebook effect within their own domain. There will be less significant change at the service level, with more material changes continuing to occur in the platform. Moreover, users will start to invest their time in building connections with the service and automating their processes around it. This will greatly limit enterprise IT’s ability to arbitrarily change the service in a way that impacts the user. The net impact of this on the business is yet to be seen, but in general, it is clearly the change we have heard forecast for years: business taking the lead in setting information technology direction.
Application development has been moving in the direction of platform abstraction. That is, the need for developers to have detailed knowledge of the infrastructure their applications are deployed on has become less important as the application platforms they develop for have grown more sophisticated. Cloud computing is now reversing this course, at least in the short term.
Actually, platform abstraction is a bit of a misnomer, since in practice it left operations struggling to tweak the infrastructure to meet performance requirements. Additionally, most applications typically had their own dedicated hardware, allowing the hardware to be specialized for the needs of the applications deployed on it.
So, more accurately, cloud computing illustrates the flaws in the approach of pure platform abstraction and a ‘Chinese Wall’ between application development and operations, as operations now has fewer tweaks at its disposal to make an application perform in a multi-tenant environment. Hence, it is imperative that application architects begin to incorporate the impacts of operating in the cloud into their architectures. They must understand how the application will perform given the environment it will be operating in.
Impacts that application architects will need to think about in this cloud world include:
- Databases – running a highly available database in the cloud is a daunting task, especially without direct control over the storage. Environments like Amazon offer database services that deliver greater performance than you can achieve by standing up your own database on their IaaS, but there are also pitfalls.
- Software failover – applications can now implement failover far less expensively using commodity hardware. Hence, failover should be developed into the application instead of relying on the application platform or hardware infrastructure. Given that application architects have, in many cases, not focused on this use case, it will take some education and experience before this becomes common practice (see the sketch after this list).
- Virtual networking – virtual networks enable the application development team to take control of its own application’s networking infrastructure. Once again, the lack of experience here means there are likely to be many misconfigurations that impact the performance and availability of the application, in addition to introducing security flaws.
- Instrumentation, logging and monitoring – these are areas where application development teams have been pushing responsibility onto the application platforms. However, without visibility beyond the hypervisor, it is imperative that teams incorporate these capabilities back into their applications, or they may have significant trouble troubleshooting or auditing them.
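To illustrate the software failover and instrumentation points above, here is a minimal sketch in Python of application-level failover across redundant endpoints with logging built into the application itself. The endpoints and names are hypothetical; the pattern, not the specific code, is the point:

```python
import logging
import time
import urllib.request
from urllib.error import URLError

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("orders-service")

# Hypothetical redundant service endpoints running on commodity infrastructure.
ENDPOINTS = ["http://orders-a.internal", "http://orders-b.internal"]

def call_with_failover(path: str, retries_per_endpoint: int = 2, timeout: float = 2.0) -> bytes:
    """Try each endpoint in turn; the application, not the platform, owns failover."""
    for endpoint in ENDPOINTS:
        url = endpoint + path
        for attempt in range(1, retries_per_endpoint + 1):
            start = time.monotonic()
            try:
                with urllib.request.urlopen(url, timeout=timeout) as resp:
                    elapsed_ms = (time.monotonic() - start) * 1000
                    # Instrumentation lives in the application, where the hypervisor can't see.
                    log.info("success url=%s attempt=%d latency_ms=%.1f", url, attempt, elapsed_ms)
                    return resp.read()
            except URLError as exc:
                log.warning("failure url=%s attempt=%d error=%s", url, attempt, exc)
    raise RuntimeError("all endpoints exhausted")

# Usage (against real endpoints): payload = call_with_failover("/api/orders")
```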
As my famous Uncle Winthrop liked to say, “Now that I've given you a band saw, I need to teach you how to use it or you will just be wasting a lot of wood and in the worst case might lose a few fingers.”