Posted:
(Cross-posted on the Google for Work Blog)

Many businesses around the world rely on VMware datacenter virtualization solutions to virtualize their infrastructure and optimize the agility and efficiency of their data centers. Today we’re excited to announce that we are teaming up with VMware to make select Google Cloud Platform services available to VMware customers via vCloud Air, VMware’s hybrid cloud platform. We know how valuable flexibility is to a business when it chooses its total infrastructure solution, and with today’s announcement, enterprises using VMware’s datacenter virtualization solutions gain the flexibility to easily integrate Google Cloud Platform.

Businesses can now use Google Cloud Platform tools and services – including Google BigQuery and Google Cloud Storage – to increase scale, productivity, and functionality. VMware customers will benefit from the security, scalability, and price performance of Google’s public cloud, built on the same infrastructure that allows Google to return billions of search results in milliseconds, serve 6 billion hours of YouTube video per month and provide storage for 425 million Gmail users.

With Google BigQuery, Google Cloud Datastore, Google Cloud Storage, and Google Cloud DNS directly available via VMware vCloud Air, VMware customers will benefit from a single point of purchase and support for both vCloud Air and Google Cloud Platform:

  • vCloud Air customers will have access to Google Cloud Platform under their existing service contract and existing network interconnect with vCloud Air, and will simply pay for the Google Cloud Platform services they consume.
  • Google Cloud Platform services will be available under the VMware vCloud Air terms of service, and will be fully supported by VMware’s Global Support and Services (GSS) team.
  • Certain Google Cloud Platform services are also fully covered by VMware’s Business Associate Agreement (BAA) for US customers who require HIPAA-compliant cloud service.

Google Cloud Platform services will be available to VMware customers beginning later this year, so we’ll have more information very soon. In the near future, VMware is also exploring extended support for Google Cloud Platform as part of its vRealize Cloud Management Suite, a management tool for hybrid clouds.

Today’s announcement bolsters our joint value proposition to customers and builds on our strong existing relationship around Chromebooks and VMware View, as well as the recently announced Kubernetes open-source project. We look forward to welcoming VMware customers to Google Cloud Platform.

-Posted by Murali Sitaram, Managing Director, Global Partner Strategy & Alliances, Google for Work

Posted:
Today’s guest blog comes from Graham Polley, Senior Consultant for Shine Technologies, a digital consultancy in Melbourne, Australia. Shine builds custom enterprise software for companies in many industries, including online retailers, telecom providers, and energy businesses.

Wrestling with large data sets reminds me of that memorable line from Jaws when police chief Brody sees the enormous great white shark for the first time: “You’re gonna need a bigger boat”. That line pops into my head whenever we have a new project at Shine Technologies that involves processing and reporting on massive amounts of client data. Where do we get that ‘bigger boat’ we need to help businesses make sense of the billions of ad clicks, ad impressions, and other data that can guide business decisions?

Four or five years ago, without any kind of ‘bigger boat’ available, we simply couldn’t grind through terabytes of data without plenty of expensive hardware and a lot of time. We’d have to provision new servers, which could take weeks or even months, not to mention the costs of licensing and system administration. We could rarely analyze all the data at hand because it would overwhelm network resources, so we’d usually end up analyzing just 10% or 20%, which didn’t give us complete answers to client questions or provide any discernible insights.

When one of our biggest clients, a national telecommunications provider in Australia, needed to analyze a large amount of their business data in real time, we chose Google’s DoubleClick for Publishers product. We realized we could configure DoubleClick to store the data in Google Cloud Storage, and then point Google BigQuery to those files for analysis, with just a couple of clicks.
Finally, we thought, we’ve found something that can scale effortlessly, keep costs down, and (most importantly) allow us to analyze all of our client’s data as opposed to only small chunks of it. BigQuery boasts impressive speeds, is easy to use, and comes with a very short learning curve. We don’t need to provision any hardware or spin up complex Hadoop clusters, and it comes with a really nice SQL-like interface that makes it possible even for non-techy people, such as business analysts, to easily interrogate and draw insights from the data.
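To give a flavor of what that interrogation looks like, here’s a minimal sketch of running an aggregation through the BigQuery API from Python. The project ID, dataset, and query are hypothetical stand-ins, not our client’s actual schema:

```python
import httplib2
from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials

# Build an authenticated BigQuery client from application default credentials.
credentials = GoogleCredentials.get_application_default()
bigquery = build('bigquery', 'v2', http=credentials.authorize(httplib2.Http()))

# A hypothetical aggregation: impressions per site over the last day.
QUERY = """
SELECT site, COUNT(*) AS impressions
FROM [ads.impressions]
WHERE event_time > DATE_ADD(CURRENT_TIMESTAMP(), -1, 'DAY')
GROUP BY site
ORDER BY impressions DESC
"""

result = bigquery.jobs().query(
    projectId='my-project',  # hypothetical project ID
    body={'query': QUERY, 'timeoutMs': 30000}).execute()

for row in result.get('rows', []):
    print('%s: %s impressions' % (row['f'][0]['v'], row['f'][1]['v']))
```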

When the same client came to us with a particularly complex problem, we immediately knew that BigQuery had our backs. They wanted us to stream millions of ad impressions from their large portfolio of websites into a database, and generate analytics about that data using some visually compelling charts, all in real time. Using BigQuery’s streaming functionality, we started to pump the data in, which went off without a hitch, and we sat back and watched as millions of rows flowed into BigQuery. When it came to interrogating and analyzing the data, we saw consistent results in the 20-25 second range for grinding through our massive data set of 2 billion rows with relatively complex aggregation queries.
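For the curious, the streaming path boils down to repeated calls to BigQuery’s tabledata.insertAll endpoint. Here’s a rough sketch in the same vein; the table and field names are made up for illustration, and real code would batch rows and handle retries:

```python
import time
import httplib2
from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
bigquery = build('bigquery', 'v2', http=credentials.authorize(httplib2.Http()))

def stream_impressions(impressions):
    # Each row carries an insertId so BigQuery can de-duplicate retries.
    rows = [{'insertId': imp['id'],
             'json': {'site': imp['site'],
                      'ad_unit': imp['ad_unit'],
                      'event_time': imp['event_time']}}
            for imp in impressions]
    return bigquery.tabledata().insertAll(
        projectId='my-project',  # hypothetical IDs
        datasetId='ads',
        tableId='impressions',
        body={'rows': rows}).execute()

response = stream_impressions([
    {'id': 'imp-0001', 'site': 'example.com.au',
     'ad_unit': 'banner-top', 'event_time': time.time()}])
if response.get('insertErrors'):
    print('Some rows failed: %s' % response['insertErrors'])
```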

BigQuery’s streaming capability lets us analyze our client’s data instantly and empowers them with real-time insights, rather than making them wait for slower batch jobs to complete. The client can now instantly see how ad campaigns are performing, and change the ad creative or target audience on the fly in order to achieve better results.

Simply put, without BigQuery it just would not have been possible to pull this off. This is bleeding-edge technology, and the idea of doing something similar with a traditional relational database management system (RDBMS) would have been simply inconceivable.

The success of this project opened up a lot of doors for us. After we blogged about it, we received several requests from prospective clients wanting to know if we could apply the same technology to their own big data projects, and Google invited us to become a Google for Work Services partner. Our clients are continuously coming up with more ideas for driving insights from their data, and by using BigQuery we can easily keep up with them.

Big data can seem like that great white shark in Jaws: unmanageable and wild unless you have the right tools at your disposal to tame it. BigQuery has become our go-to solution for reeling in data, processing it, and discovering the value within.

Contributed by Graham Polley, Senior Consultant, Shine Technologies

Learn more about Shine Technologies and the business impact of BigQuery. Watch as BigQuery takes on Shine Technologies' 30 Billion Row, 30 Terabyte Challenge.



Posted:

Part 1 - Virtual Compute


When designing infrastructure systems, whether creating new applications or deploying existing software, it’s crucial to manage cost. Costs come from a variety of sources, and every approach to delivering infrastructure has its own tradeoffs and complexities. Cloud infrastructure systems create a whole new range of variables in these complex equations.

In addition, no two clouds are the same! Some bundle components while others offer more granular purchasing. Some bill in different time increments, and many offer a variety of payment structures, each with differing economic ramifications. How do you figure out what each costs and make a choice?

To help you work this through, we’ve created an example. Let’s look at a fairly common scenario: a mobile application with its backend in the cloud. This application shares pictures in some way and has about 5 million monthly active users. Let’s go through what instance types this application will need to meet that user-driven workload, then price out what that will cost in an average month on Google Cloud Platform and compare against Amazon Web Services.

Our example application has 4 components:

  • An API frontend that mobile devices will contact for requests and actions. This portion will consume the majority of the compute cycles.
  • A static marketing and blog front end.
  • An application layer that will process and store images as they come in or are accessed.
  • And on the back end, a Cassandra cluster to store operational metadata.

For capacity planning, we have scoped as follows:

  • The API frontend instances can respond to roughly 80 requests per second. We expect about 350 requests per second given this number of users. Therefore we should only need four regular instances for this layer.
  • The marketing front end shouldn’t need more than two instances for redundancy.
  • The application layer will need four instances for image processing and storage control.
  • The Cassandra cluster will need five instances with a higher memory footprint. Let’s assume for now that the workload is entirely static, and autoscaling isn’t being used (oh don’t worry, we’ll add that and more back in later).

Figure 1 shows our example application’s logical architecture.
To explain the nuances of cloud pricing, let’s use Google Cloud Platform and Amazon Web Services as the example cloud infrastructure providers, and start with the simplest model: on-demand. We can use the calculators that each provider offers to find correct pricing quickly.

Please note that we completed these calculations on January 12, 2015, and have included the output prices in this post. Any discrepancies are likely due to pricing or calculator changes following the publishing of this post.

Here is the output of the pricing calculators:

Google Cloud Platform estimate:
Monthly: $2610.90

Amazon Web Services estimate:
Monthly: $4201.68

Right away, it’s important to note that things don’t look equivalent: Google’s pricing is 38% lower. Why? Google includes an automatic discount called Sustained Usage Discount, which reduces the cost of long-running instances. Since we didn’t autoscale or otherwise vary our system over the course of the month, the full 30% discount applies. Even without that discount, Google’s pricing comes in at $3729.86, or 11% below Amazon’s on-demand rates. Over the course of a year, going with Google would save you just over $19,000!
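The arithmetic behind those percentages is easy to check against the calculator outputs above:

```python
gcp_before_discount = 3729.86  # GCP monthly, before Sustained Usage Discount
gcp_monthly = 2610.90          # GCP monthly, with the full 30% discount
aws_monthly = 4201.68          # AWS on-demand monthly

# A full month of steady usage earns the maximum 30% sustained-use discount.
print(round(gcp_before_discount * 0.70, 2))             # ~2610.90
# GCP lands ~38% below AWS on-demand...
print(round(1 - gcp_monthly / aws_monthly, 3))          # ~0.379
# ...and ~11% below even before the discount.
print(round(1 - gcp_before_discount / aws_monthly, 3))  # ~0.112
# Savings over a year of choosing Google:
print(round((aws_monthly - gcp_monthly) * 12, 2))       # ~19089.36
```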

Reserved Instances

Amazon Web Services has an alternate payment model, where you can make a commitment to run infrastructure for a longer period of time (either 1 or 3 years), and opt to pay some portion of the costs up front, which they call Reserved Instances. Here are the costs for our example app with Amazon’s Reserved Instance pricing:

Amazon Web Services, no-upfront, 1 year estimate:
Monthly: $2993.00

Over a one-year term with Amazon, if you commit to pay for the instance for that entire period, and you opt for the “no-upfront” option, you still end up with a 13% higher cost than making no commitment to Google.

Amazon Web Services, partial upfront, 1 year estimate:
Upfront: $18164.00
Monthly: $1093.54
Effective monthly: $2607.21

If you opt to pay over $18k up front using the “partial upfront” model, you arrive at a lower price, saving $44 (not thousands of dollars) over the course of the year.
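If the “effective monthly” figures look opaque, they simply amortize the upfront payment over the term and add the monthly fee. Here’s the check for this one-year partial-upfront option:

```python
upfront = 18164.00
monthly = 1093.54
term_months = 12

# Spread the upfront payment across the term, then add the monthly charge.
effective_monthly = upfront / term_months + monthly
print(round(effective_monthly, 2))                   # 2607.21

# Savings vs. Google's no-commitment $2610.90/month, over the year:
print(round((2610.90 - effective_monthly) * 12, 2))  # ~44
```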

Amazon Web Services, all upfront, 1 year estimate:
Upfront: $30,649.00
Monthly: $0.00
Effective monthly: $2554.08

If you choose instead to pay 100% of the yearly cost up front, you’d end up saving $681.78 over the course of the year versus Google Cloud Platform, or 2.3%. As you can see, however, the upfront payment is over $30,000!

Similarly, Amazon offers three-year options for the partial upfront and all upfront models:

Partial upfront, 3 year estimate:
Upfront: $27,585.00
Monthly: $897.90
Effective monthly: $1664.15

All upfront, 3 year estimate:
Upfront: $56,303.00
Monthly: $0.00
Effective monthly: $1563.97

If you’re willing to part with just over $56,000 for the three-year, all upfront Reserved Instance, you’d receive a 40% discount off of Google’s rate, for a total projected gap of over $37k.

However, as I’m sure you can surmise, a significant up-front commitment and payment creates several risks. The bottom line: you’re locked into a long-term pricing contract, and you risk missing out on substantial savings. Let’s look at why:
  1. Infrastructure prices will drop, whether at Google (which has happened 3 times in the last 12 months, as we've reintroduced Moore’s law to the cloud) or at Amazon (which has happened 2 times in the last 12 months). For 2014, this worked out to an average price reduction of 4.85% per month on Google Cloud Platform. Because you pay on demand, any reduction in prices is something you automatically receive on GCP.
  2. Also, don’t forget, capital is expensive! Most businesses pay a ~7% per year cost of capital, which reduces the value of these up-front purchases significantly. For this example, that adds an effective $11,823.63 to the 3-year all-upfront Reserved Instance price from Amazon, as the quick calculation below shows.
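That figure is just simple (non-compounding) interest on the capital tied up over the term:

```python
upfront = 56303.00      # 3-year, all-upfront Reserved Instance payment
cost_of_capital = 0.07  # ~7% per year
years = 3

# Value lost by locking the capital up instead of deploying it elsewhere:
print(round(upfront * cost_of_capital * years, 2))  # 11823.63
```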

So, let’s revisit that $37,689.40 gap. After adding in the cost of capital and subtracting likely instance price reductions, even at the most aggressive discount AWS offers, AWS costs $60,244.21 and Google Cloud Platform costs $57,959.57, leaving Google with a 3.9% cost advantage.

Combining conservative estimates of the basic dynamics of public cloud pricing (3% per month price reductions, 7% cost of capital), even 3-year all-upfront RIs from AWS are not cost efficient compared to on-demand Sustained Usage Discounts from Google Cloud Platform.


Flexibility

Committing to specific usage choices also creates cost risks:

  1. New instance types might make your old choices inefficient (c3 instances from AWS are substantially more cost efficient for some workloads than older m3 instances, for example).
  2. Your software might change. For example, what if you improve the efficiency of your software and reduce your infrastructure requirements by 50%? Or what if you re-platform from Windows to Linux (Reserved Instances require a commitment on OS type)? Or what if your memory needs grow, and instances need to switch from standard to high-memory variants?
  3. Your needs might change. For example, what if a new competitor arrives and takes half of your customers, reducing the load on your infrastructure by 50%?
  4. What if you picked everything right but the geography, and your app is suddenly popular in Asia or Europe?

The “on-demand” agility and flexibility of cloud computing is supposed to be a huge financial benefit, especially when your requirements change. Let’s imagine that in the second month, several of the risks above actually materialize: you move to the Asian market, resize a few instances to better match actual workload, and shrink the Cassandra cluster’s redundancy a bit, given how reliable instances with live migration are. That would look something like Figure 2.
Google Compute Engine estimate:
Monthly: $909.72

Amazon Web Services, partial upfront, 1 year estimate:
Upfront: $6350.00
Monthly: $331.42
Effective monthly: $860.59

This system costs less than half of what the original system costs, and is on an entirely different continent, but what does it cost to change your plan? At Google, very little: you pay no direct penalty for changing your infrastructure design. Your only cost is however long the two systems run simultaneously to facilitate a zero-downtime cut-over.

In stark contrast, the cost of changing the Amazon system is essentially the total loss of whatever committed funds you applied to earn the discount, plus a new requirement for upfront funds (and a new commitment!) to get an efficient price in your new configuration, on top of the above-mentioned dual-system usage (which costs more per hour...).

Let’s look at this from a cash flow perspective, not even in the worst case, but just assuming that you wanted to break-even with Google pricing on Amazon and chose the partial up front one-year Reserved Instance.

Google: Month 1 usage: $2610.90 + Month 2-13 usage: $909.72 x 12 = $13,527.54

Amazon: Month 1 commit: $18,164.00 + Month 1 usage: $1093.54 + Month 2 commit: $6350.00 + Month 2-13 usage: $331.42 x 12 = $29,584.58
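Spelled out as a quick script, those 13-month cash totals work out like this:

```python
# Google: pure on-demand, so the design change itself is free.
google_total = 2610.90 + 909.72 * 12
print(round(google_total, 2))  # 13527.54

# Amazon: the month-1 RI commitment is sunk, and the new configuration
# requires a second upfront commitment on top of the new monthly usage.
amazon_total = 18164.00 + 1093.54 + 6350.00 + 331.42 * 12
print(round(amazon_total, 2))  # 29584.58

print(round(amazon_total - google_total, 2))  # 16057.04 gap
```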

That’s a big gap, even without figuring in the cost of capital! You can see how risky those commitments can be. AWS has a service to mitigate some of that risk, an RI marketplace, which lets you attempt to sell Reserved Instance units back to other AWS customers. However, as I’m sure you can imagine, this process presents a few risks of its own:
  1. Are the RIs you’re selling for instance types that are now clearly inefficient for many workloads, and therefore not desirable to other customers?
  2. Will your RIs sell for full price, or at some discount to encourage a sale?
  3. How many buyers are there in the marketplace, and how quickly will your RIs sell, if at all?
  4. What if you didn’t start out in the US? The RI Marketplace is only available to customers with a US bank account.
One risk is a guaranteed loss: every sale on the RI marketplace comes with a 12% fee, payable to Amazon. Let’s say you have great luck and are able to sell 10 months of your original 12-month RI (they have to be sold in whole-month increments, rounding down) at full original price, which nets you back $13,320.27 after fees. Now your 13-month total is $16,083.19, so you’ve only lost $2,555.65 compared to what you would have paid using Google. But what a hassle, and how much risk did you take on? What if the RIs didn’t sell for a few months? Every month, you lose $1,332. Ouch!
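Here’s how those resale numbers fall out of the 12% fee, using the one-year partial-upfront figures from above:

```python
upfront = 18164.00  # one-year partial-upfront RI payment
fee = 0.12          # marketplace fee, payable to Amazon
months_sold = 10    # whole months remaining, rounded down

# Resale value of the unused upfront portion, less the fee:
print(round(upfront * (months_sold / 12.0) * (1 - fee), 2))  # 13320.27

# Each month the RI sits unsold forfeits one month of that value:
print(round(upfront / 12.0 * (1 - fee), 2))                  # ~1332
```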

Automatic Scaling

“But this is a backwards example,” you say, “cloud isn’t intended for this kind of static sizing; you’re supposed to be autoscaling to tightly follow load.” True! So let’s imagine that the above reflects the requirements of our steady-state load, and that we have four small peaks during the day: morning rush, lunch peak, after-work, and midnight madness, each of which pops at 10x the above workload. (Our application passes the toothbrush test!) Our backend handles these spikes fine, but our web and API tiers need to autoscale dramatically. Let’s say each of these peaks onsets very rapidly, over the course of five minutes, and lasts for 15 minutes. Note that we see systems that spike at 100x or more, so this scenario isn’t extreme!

This kind of system is pretty easy to build efficiently on Google. Instances take roughly a minute to launch, so we can easily autoscale to accommodate load, and since we bill in per-minute increments with only a 10-minute minimum, this adds just $110.77 a month to our bill. 10x peaks!

Google Compute Engine estimate:
Monthly additional: $110.77

Building this on AWS is just not as efficient. Because instances take more than 5 minutes on average to launch, we need to pre-trigger our instance boots (read: timing logic or manual maintenance). Also, AWS bills for instances in full-hour increments, so we pay for 60 minutes when we only use about 20, for each of our 4 peaks. This makes the total additional cost $341.60, and without any way to meaningfully discount such short-lived capacity via Reserved Instances, that’s a number an AWS customer can’t bring down today.

Amazon Web Services estimate:
Monthly additional: $341.60
            + instance-launch management logic (manual ops or development)
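The gap comes almost entirely from billing granularity. Here’s a sketch of the billed-minutes arithmetic, ignoring per-hour rate differences between the providers and assuming each burst instance runs about 20 minutes per peak:

```python
import math

def gce_billed_minutes(run_minutes):
    # Per-minute billing, with a 10-minute minimum charge.
    return max(10, int(math.ceil(run_minutes)))

def aws_billed_minutes(run_minutes):
    # Full-hour increments: a 20-minute burst is billed as an hour.
    return int(math.ceil(run_minutes / 60.0)) * 60

burst = 20  # minutes per peak: ~5-minute ramp plus 15 minutes at peak
print(gce_billed_minutes(burst))  # 20
print(aws_billed_minutes(burst))  # 60 -> 3x the billed time per burst
```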

While this spike example is one utilization behavior we see frequently, we also see basic diurnal (day/night) variability of anywhere from 2x to 5x in utilization on almost every customer-facing service. If that natural variation isn’t being followed via Autoscaler or other automated resource management, you are definitely leaving money on the table!

Summary

While there are many more dimensions to evaluate, hopefully this is a helpful analysis of how pricing systems differ between Google and Amazon. We’re not stopping here; look forward to more comparisons with more cloud providers and more workloads to help you understand exactly what you get for your money.

We are hyper-focused on driving cost out of cloud services, and we’re leading the way with innovations such as Sustained Usage Discounts and per-minute billing. As one of our customers, StarMaker Interactive VP of Engineering Christian F. Howes, said: “App Engine's minute-by-minute scaling and billing saves us as much as $3,000 USD per month.”

We think pricing considerations are critical for users trying to make the best decision they can about infrastructure systems design. I’d love to hear your thoughts: what matters to you in cloud pricing? What areas are confusing, hard to analyze, or hard to predict? What ideas do you have? Reach out!

-Posted by Miles Ward, Global Head of Solutions, Google Cloud Platform

Posted:
Interested in cloud computing with containers? Join us for an evening with the experts on Kubernetes, the open source container cluster orchestration platform. There will be talks, demos, a panel discussion, and refreshments sponsored by Intel.

Many contributors to Kubernetes will be attending, including Google, Red Hat, CoreOS, and others.

Time: 6:00PM-10:00PM PST
Location: San Francisco, CA

Detailed agenda coming soon. Register here.

Posted:
Today, Black Duck Software announced their annual Open Source Rookie of the Year awards. We’re very excited that two of our open source projects, Kubernetes and cAdvisor, were amongst those selected! The award recognizes the top new open source projects of the past year. Both projects center on containers and how they’re run in clusters. Kubernetes is a container cluster manager and cAdvisor analyzes the performance of running containers. Read on to learn more about these projects.


Kubernetes
Developers want to focus on writing code, and IT operations want to focus on running applications efficiently. Using Docker containers helps to define the boundaries and improve portability. Kubernetes takes that one step further and lets users deploy, manage, and orchestrate a container cluster as a single system.

Kubernetes is designed to be portable across any infrastructure, which allows application owners to deploy on laptops, servers, or cloud, including Google Cloud Platform, Amazon Web Services, and Microsoft Azure.

It lets you break applications down into small sets of containers that can be reused. It then schedules these containers onto machines and actively manages them. These can be logically grouped to make it even easier for users to manage and discover them. Kubernetes is lightweight, portable, and extensible. You can start running your own clusters today.
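To make the “small sets of containers” idea concrete, here’s a rough sketch of handing Kubernetes a single-container pod through its REST API from Python, via a local kubectl proxy. The manifest schema is still evolving ahead of 1.0, so treat the exact fields and API version as illustrative; nginx is just a stand-in image:

```python
import json
import requests  # assumes `kubectl proxy` is serving the API locally

# A minimal pod: one container, its image, and a label to group and
# discover it by.
pod = {
    'kind': 'Pod',
    'apiVersion': 'v1',  # illustrative; check the version your cluster serves
    'metadata': {'name': 'web', 'labels': {'tier': 'frontend'}},
    'spec': {
        'containers': [{
            'name': 'web',
            'image': 'nginx',  # stand-in image
            'ports': [{'containerPort': 80}],
        }],
    },
}

resp = requests.post(
    'http://localhost:8001/api/v1/namespaces/default/pods',
    headers={'Content-Type': 'application/json'},
    data=json.dumps(pod))
print('%s %s' % (resp.status_code, resp.reason))
```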


Kubernetes started about a year ago with a small group of Googlers who wanted to bring our internal cluster management concepts to the open source containers ecosystem. Drawing from Google’s 10+ years of experience running container clusters at massive scale, the group developed the first few prototypes of Kubernetes. Six months and lots of work later, the first version of Kubernetes was released as an open source project. We were all humbled and excited to see the overwhelmingly positive response the project received. Although it started as a Google project, it quickly gained owners from Red Hat and CoreOS, and many, many contributors. In November, we announced Google Container Engine, which offers a hosted Kubernetes cluster running on Google Cloud Platform. This makes it even easier to run Kubernetes by letting us manage the cluster for you.

What’s next for Kubernetes? The team and community are furiously working towards version 1.0, the first production-ready release. Expect to see a slew of improvements in user experience, reliability, and integration with other open source tools.



cAdvisor
cAdvisor analyzes the resource usage and performance characteristics of running containers. It aims to give users and automated systems a deep understanding of how their containers are performing. The information it gathers is exposed via a live-updating UI (see the screenshot below) and through an API, for processing by systems like InfluxDB and Google BigQuery. cAdvisor was released alongside Kubernetes back in June and has since become a de facto standard for monitoring Docker containers. Today, it runs on all Kubernetes clusters and can monitor any type of Linux container. cAdvisor has even become one of the most downloaded images on Docker Hub.
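As a taste of that API, here’s a small sketch that pulls the most recent machine-level sample from a locally running cAdvisor. It assumes the v1.x REST endpoint on cAdvisor’s default port 8080; the exact response fields can vary by version:

```python
import requests

# Query the root container's recent stats from a local cAdvisor instance.
info = requests.get('http://localhost:8080/api/v1.3/containers/').json()

latest = info['stats'][-1]  # most recent sample
print('cpu usage total (ns): %s' % latest['cpu']['usage']['total'])
print('memory usage (bytes): %s' % latest['memory']['usage'])
```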

Below is a screenshot of part of the cAdvisor UI showing the live-updating resource usage of a container. The screenshot shows total CPU and memory consumption over time as well as the instantaneous breakdown of memory usage.

Continuously updating view of a container's resource usage


The cAdvisor team is working to make it even easier to understand your running containers by surfacing events that let you know when your containers are not getting enough resources. Alongside these come suggestions on actions you can take to remedy the problem. Events and suggestions can be integrated into systems like Kubernetes to allow for auto-scaling, resizing, overcommitment, and quality-of-service guarantees for containers.

We’re extremely grateful to the open source community for embracing both of these projects so widely. Our aim was to address a need we saw in the open source containers community and start a dialogue around containers and how they should be run. And as we continue to collaborate with the open source community, we look forward to evolving these projects. We invite you to join us in making Kubernetes and cAdvisor better! Try them out, open issues, send patches, and start discussions. Happy hacking!

-Posted by Greg DeMichillie, Director of Product Management

Posted:
Aucor, based in Finland, designs WordPress and Drupal websites for clients. When their growing customer base needed more capacity than their private servers could manage, the company knew they needed to lighten the load by moving to the cloud.

Aucor turned to Google Cloud Platform so they could keep their focus on what they do best – designing fantastic websites – not managing servers.

The team took Google App Engine out for a test drive. Janne Jääskeläinen, CEO at Aucor, noted, “Our test site could handle over 70,000 requests per second without the users noticing a thing. Let’s put that into perspective: it’s as if every single Finn (about 5.4 million people) would have spent a good hour clicking around the site, without it crashing or even slowing down.”

With these speeds, the team was able to transition over 70 of its sites to Google App Engine in very little time. Learn more about Aucor’s story here.

-Posted by Kelly Rice, Product Marketing Manager

Posted:
In 2015, we’re introducing a monthly webinar series that takes an in-depth look at the diverse elements that help us solve complex business challenges in the cloud and nurture business growth. We’ll cover unique IT management and implementation strategies, along with the people, tools, and applications that increase impact. We’re opening it up to a live online and global forum, with the aim of fostering collaborative learning through use cases we can all relate to and real-time Q&A sessions. Our first webinar features zulily, a high-growth online retailer that leverages big data to provide a uniquely tailored product and customer experience to a mass market around the clock.

Zulily is one of the largest e-commerce companies in the United States. Its business is retail, but its DNA is in technology, using data and predictive analytics to drive decisions. As the company grows, so do the amount and complexity of its data. Zulily’s IT team realized that in order to keep up and scale properly, they had to redesign the way they process, analyze, and use big data.

Zulily transitioned to Google Cloud Platform to meet these challenges and ultimately use the big data it collected to improve the online customer experience. Join us as we take a technical deep dive into zulily’s new application infrastructure built on Google Cloud Platform. The team will share key learnings and discuss how they plan to scale their efforts and impact.

Big data experts from Google Cloud Platform and zulily will share:

  • Best practices and implementation strategies to drive value from big data using products such as Google BigQuery and Hadoop
  • How zulily uses Google Cloud Platform to improve customer experience, increase sales, and increase relevance via marketing initiatives
  • Key leadership and technical benefits and risks to be aware of as you plan, execute and optimize your big data implementation strategy across one or multiple business units

Live Webinar: zulily turns big data into a big advantage with Google Cloud Platform

  • Wednesday, January 28, 2015
  • 10:30 - 11:00 a.m. PT
  • Speakers: William Vambenepe, Lead Product Manager for Google Cloud Big Data Services and Sudhir Hasbe, Director Software Engineering for Data Services, BI and Big Data Analytics for zulily

Register here