AWS Lambda is a lesser known powerhouse for machine learning. We demonstrate how you can bundle up a large generative model (8GB) into a Docker image for a serverless deployment, using parallel model loading to improve the inference time.Its not fast, but hey it is free.