Big Data Workload Simulations and Cost Estimates Now Available for EC2 Instances

December 18th, 2017 | Categories: AWS, Big Data Performance

Figuring out the right kinds of cloud machines (commonly known as instances) to run your Big Data workloads at the best price-to-performance ratio is often a lengthy trial-and-error exercise. To that end, MityLytics MiCPM© can now simulate your Spark and Hadoop (more to come) application workloads. Simulation can be on various [...]

Configuring YARN Capacity Scheduler Queues in AWS EMR

November 2nd, 2017 | Categories: AWS, Big Data Performance, EMR, Hadoop, Scheduler, Spark

By default, AWS EMR clusters are configured with a single capacity scheduler queue and can run a single job at any given time. This post explains how to create and configure multiple queues in the YARN Capacity Scheduler, either when creating a new EMR cluster or when updating an existing one. [...]
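
As a rough illustration of what a multi-queue setup can look like, the sketch below builds an EMR configuration object with the `capacity-scheduler` classification. The helper `capacity_scheduler_config` and the queue names and shares are hypothetical, chosen only for this example; the property keys are the standard YARN Capacity Scheduler ones. It is a minimal sketch, not necessarily the approach taken in the full post.

```python
import json


def capacity_scheduler_config(queues):
    """Build an EMR configuration list defining YARN Capacity Scheduler queues.

    `queues` maps a queue name to its capacity, as a percentage of the
    cluster. Sibling queues under `root` must add up to 100 percent.
    """
    total = sum(queues.values())
    if total != 100:
        raise ValueError(f"queue capacities must sum to 100, got {total}")

    # List the child queues of the root queue, in insertion order.
    props = {"yarn.scheduler.capacity.root.queues": ",".join(queues)}

    # Assign each queue its guaranteed share of cluster resources.
    for name, capacity in queues.items():
        props[f"yarn.scheduler.capacity.root.{name}.capacity"] = str(capacity)

    return [{"Classification": "capacity-scheduler", "Properties": props}]


# Hypothetical queue names and shares, for illustration only.
config = capacity_scheduler_config({"default": 50, "analytics": 30, "etl": 20})
print(json.dumps(config, indent=2))
```

The resulting JSON is in the shape EMR accepts for cluster configurations, for example through the `--configurations` option of `aws emr create-cluster`. Jobs would then be submitted to a specific queue by name.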