I've written before about how AWS Lambda is proving to be a powerful tool for data-intensive workloads (such as model selection in machine learning). Parallelization with Lambda is as easy as executing as many functions as you need to cover the full depth and breadth of your dataset, in real time as it grows. It's like having a CPU with virtually infinite cores.
Although each Lambda function has a relatively small footprint (1.5 GB of RAM and 300 seconds of execution time), the real power comes from pulling many of them together into one large, parallel system. They are the lion bots, coming together to form Voltron.
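The shape of that parallel system is a simple scatter/gather: split the dataset into shards, launch one function per shard, and collect the partial results. Here is a minimal local sketch of that pattern, using a thread pool to stand in for concurrent Lambda invocations; `process_shard` and the shard list are hypothetical placeholders, and in production each worker would instead be a call to the Lambda invoke API.

```python
# Local sketch of the Lambda fan-out pattern. The thread pool stands in
# for concurrent Lambda invocations; `process_shard` is a hypothetical
# placeholder for the work one invocation would do on its data slice.
from concurrent.futures import ThreadPoolExecutor


def process_shard(shard):
    # Placeholder: whatever one function does with its slice of the
    # dataset (e.g., score one model candidate in a model-selection job).
    return sum(shard)


def fan_out(shards, max_workers=64):
    # Scatter: one "function" per shard. Gather: collect partial results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_shard, shards))


# Ten shards covering the integers 0..99.
shards = [list(range(i, i + 10)) for i in range(0, 100, 10)]
results = fan_out(shards)
print(sum(results))  # → 4950, the total across all shards
```

Because each shard is independent, adding more shards (or more data arriving in real time) just means launching more functions, which is what gives the "virtually infinite cores" feel.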
Eric Jonas, a postdoctoral researcher at the legendary AMPLab, had a great illustration of this, pulling 25 TFLOPS of performance out of plain-old Python functions, with over 60 GB/s read and 50 GB/s write throughput to S3. That's close to in-memory speeds, with nearly linear scaling of read and write throughput to S3.