Large companies have traditionally maintained an impressive list of batch workloads (workloads that run at night, after people have gone home for the day). These include application and database backup jobs; extract, transform, and load (ETL) jobs; disaster recovery (DR) environment checks and updates; online analytical processing (OLAP) jobs; and monthly billing updates, to name a few.
Traditionally, with on-premises data centers, these workloads have run at night so that the same hardware infrastructure that supports daytime workloads can be repurposed. This offers several advantages:
- It avoids network contention between the two types of workloads, allowing the interactive workloads to remain responsive.
- It avoids data center sprawl by using the same infrastructure for both, rather than maintaining dedicated infrastructure for interactive and batch workloads.
Things Are Different with Public Cloud
As companies move to the public cloud, they are no longer constrained by having to repurpose the same infrastructure. In fact, they can spin up and spin down new resources on demand in AWS, Azure or Google Cloud Platform (GCP), running both interactive and batch workloads whenever they want.
Network contention is also less of a concern, since the public cloud providers typically have plenty of bandwidth. The exception, of course, is when batch workloads use the same application interfaces or APIs as the interactive workloads to read and write data.
So, moving to public cloud offers a spectrum of possibilities, and you can use one or any combination of them:
- You can run batch nightly using processes similar to those in your on-premises data centers, but on separately provisioned instances/virtual machines. This probably requires the least effort to move batch to the public cloud and the least change to your DevOps processes. It saves some money by having instances sized specifically for the workloads and by letting you leverage cloud cost-savings options (e.g., reserved instances);
- You can run batch on separately provisioned instances/virtual machines, but concurrently with existing interactive workloads. This will likely require some additional work to change DevOps processes, but offers more freedom along with the benefits mentioned above. You will still need to pay attention to any application interfaces/APIs the workloads may have in common; or
- At the extreme end of the cloud adoption spectrum, you could use cloud provider platform as a service (PaaS) offerings, such as AWS Batch, Microsoft Azure Batch or GCP Cloud Dataflow, where batch is essentially treated as a “black box”. In short, these are fully managed services: you queue up input data in an S3 bucket, object blob or volume along with a job definition, the appropriate environment variables and a schedule, and you’re off to the races. The advantage of this approach is potentially faster time to implement and (maybe) lower monthly cloud costs, because the compute services run only at the times you specify. The disadvantages may be reduced operational/configuration control and the fact that these services may be totally foreign to your existing DevOps staff and processes.
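To make the PaaS option above concrete, a minimal AWS Batch job definition might look like the following sketch. The job name, container image, script, and resource sizes here are hypothetical placeholders, not values from any real environment:

```json
{
  "jobDefinitionName": "nightly-etl",
  "type": "container",
  "containerProperties": {
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/etl-runner:latest",
    "vcpus": 2,
    "memory": 4096,
    "command": ["python", "run_etl.py", "--date", "Ref::run_date"],
    "environment": [
      { "name": "OUTPUT_BUCKET", "value": "my-etl-output-bucket" }
    ]
  }
}
```

You would register this definition once, then submit jobs against it on a schedule; AWS Batch provisions the compute, runs the container, and tears everything down when the job completes, which is what makes the service feel like a black box.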
A Simple Alternative
If you are looking to minimize the impact on your DevOps processes (that is, the first two approaches mentioned above), but still save money, then parking schedules can help.
Normally, with the first two options, cron jobs are scheduled to kick off batch jobs at the appropriate times throughout the day, but the underlying instances must be running for cron to do its thing. You could put parking schedules on these resources, such that they are turned OFF for most of the day, but are turned ON just in time to allow the cron jobs to execute.
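As a sketch, suppose the batch server's crontab kicks off a nightly backup at midnight UTC (the script path and user are hypothetical placeholders); the parking schedule then only needs to keep the instance ON around that window:

```
# /etc/crontab on the batch instance (hypothetical backup script)
# m  h  dom mon dow  user   command
  0  0  *   *   *    batch  /opt/batch/db_backup.sh

# Parking schedule (conceptual): instance ON 23:00-02:00 UTC, OFF otherwise,
# so cron fires at 00:00 and the job has time to finish before shutdown.
```

The key point is that cron itself never changes; only the hours the instance is allowed to run do.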
At ParkMyCloud, we have successfully used this approach in our own infrastructure to control a batch server that performs database backups, and it provides more savings than AWS Reserved Instances would.
Let’s look at a specific example in AWS. Suppose you have an m4.large server that you use to run batch jobs. Assuming Linux pricing in us-east-1, this server costs $0.10 per hour, or about $73 per month. Suppose you have configured cron to start the batch jobs at midnight UTC and that they normally complete 1 to 1-½ hours later.
You could purchase a Reserved Instance for that server, paying either nothing upfront or all upfront, and your savings would be 38%-42%.
Or, you could apply a parking schedule where the instance is ON only from 11 pm-2 am UTC, allowing enough time for the cron jobs to start and finish. With the instance running three hours out of 24, the savings would be 87.5%, without the need for a one-year commitment. Depending on how many batch servers you run in your environment and their sizes, that could be some hefty savings.
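The arithmetic behind those numbers can be sketched in a few lines. The hourly rate and schedule are the assumptions from the example above; treat this as illustrative, not a pricing tool:

```python
# Cost comparison from the example: m4.large, Linux, us-east-1, on-demand.
HOURLY_RATE = 0.10      # USD per hour (assumed on-demand rate)
HOURS_PER_MONTH = 730   # AWS's conventional month length

# Running 24x7:
always_on = HOURLY_RATE * HOURS_PER_MONTH            # about $73/month

# Parking schedule: ON three hours a day (11 pm-2 am UTC), OFF otherwise.
on_hours_per_day = 3
parked = HOURLY_RATE * on_hours_per_day * (HOURS_PER_MONTH / 24)

savings_pct = (1 - parked / always_on) * 100         # 1 - 3/24 = 87.5%

print(f"Always on: ${always_on:.2f}/month")
print(f"Parked:    ${parked:.2f}/month")
print(f"Savings:   {savings_pct:.1f}%")
```

Note that the savings percentage depends only on the ON-hours fraction (3/24), so it holds for any instance size; the dollar amount saved scales with the hourly rate.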
Public cloud will offer you a lot of freedom and some potentially attractive cost savings as you move batch workloads from on premises. You are no longer constrained by having the same infrastructure serve two vastly different types of workloads, interactive and batch. The savings you can achieve by moving to public cloud will vary, depending on the approach you take and the provider/service you use.
About the Author: Jay Chapel is the CEO and co-founder of ParkMyCloud. After spending several years in the cloud management space, Jay saw that there was no simple solution to the problem of wasted cloud spend – which led him and co-founder Dale Wickizer to found ParkMyCloud. Before that, he spent 10+ years with Micromuse and IBM Tivoli, a provider of business infrastructure management software. After an acquisition by IBM, he led the successful sales integration and subsequent growth of the IBM Tivoli/Netcool business in Europe. He also held several regional and worldwide sales roles in Switzerland, the UK and the US. Jay earned a BA in Finance and an MBA from West Virginia.
Edited by Mandi Nowitz