by Stephen P. Smith
Internet-scale applications utilizing cloud computing infrastructure demand that their architects achieve elastic scaling by stitching together a large set of computation entities.
This concept of pulling together multiple platforms to achieve a goal is not new: HPC-type applications have for decades achieved parallel performance through techniques such as data decomposition, where the decomposed datasets fit comfortably on each node of their compute grid. What is new with cloud computing is the need to get this ability from less well-decomposed applications. But numerous recent examples, such as the Google File System's “cell” concept and database “shards”, show that it is possible, as long as the designer adheres to certain simple concepts.
What is not so well understood is that, at the core, each of the application's parts, which we will call “entities” in this note, must fit “comfortably” on real hardware. This means that those applications need to effectively utilize the capability of each virtual platform that makes up their compute infrastructure without overloading those platforms to the point that they can no longer compute effectively on real-world hardware. We will examine the implications of this fact in this short note.


