This is the second in a series of posts which describe the adventures encountered while sticking our heads even further in the clouds.
- The first article is mostly Project Introduction & Background, describes what we’re doing and why in higher level terms
- This post is about getting our static data hosted at AWS as our first steps to using the AWS cloud
- Then we’ll talk about Migrating Site Local MySQL database to AWS RDS
- And the big kahuna part 1: Establishing the Ruby on Rails Web App Environment on AWS EBS
Ok firstly let’s think about the types of static content we want to serve from AWS, and then look at how this impacts our web app and then the build and distribution workflows for the Drum Score Editor app itself.
We’ve got 2 types of static content we want served, firstly there’s the Drum Score App installer images for each platform, plus example scores and PDFs. We’re going to store these in an AWS S3 bucket primarily to reduce the size of the web app so it can be deployed in later steps using AWS Elastic Beanstalk, which has an untweakable hard maximum web app size it can deal with (500MB at last look).
Secondly there’s the Rails asset pipeline, all the static css, javascript etc that goes with an app. Some reading of the various opinions on the inter web reveals that AWS CloudFront is the CDN which caches copies of static assets closer to the user. You simply specify the origin of the files and it does it’s magic to make them appear. That origin could be the S3 bucket containing the installer images etc, or the origin could be your web site itself, or the assets can be precompiled to an S3 bucket also. We’ll look at the options and why we chose the solution we did later in this article.
Step 1 – getting the installer images into an S3 bucket
Before we can put anything in a bucket, we need to acquire that bucket. Should we just the use AWS Console as this is a one off operation, or maybe we want to have different buckets for UAT and Production separation and so should create a reusable script for their creation. Do we need that complexity? Probably not at this stage given our overall use case.
First we create a bucket for these resources, imaginatively called drumscoreportal-resources, and we upload our installer image into it using the console (or AWS command line tools) and make it public, by right clicking on the uploaded filename. Selecting properties will show the public name of the file, so to test this works we copied the link presented and pasted it into a command line and used curl to pull it down. Remember it’s a common installer we want to be available to everybody, no need to control access to it so no need to work out any permission stuff. This all just worked, so in theory we can tweak the download links in the appropriate page in the web app and off we go.
However, everybody does local development right? You really don’t want to be running up your AWS network costs by pulling the copies down when in an iterative dev/test cycle. So we need to somehow make the development environment use local copies while UAT and production use the S3 objects.
Given all we’re doing though is pulling down the objects via http, we don’t need anything more clever than a local web server and the files in a similar URL so the app can switch through Rails environment specific initialisers. The secrets.yml file is already used to separate out which hosts are used for each environment for things like the Facebook and Paypal integration. Might be impure, but popping a line in there for the resource_host and using that in the link tags might be viable.
The download link in the view then becomes
<a href="#{Rails.application.secrets.resource_host}/drumscoreportal-resources/DrumScoreEditor-2.23.dmg" class="button radius" download>Download For Mac OS X</a>
The secrets.yml entry for the environment then specifies https://s3-eu-west-1.amazonaws.com for production and UAT and http://localhost:8080 for development. Why that URL for development, well every Mac comes with python and from the directory you want to serve files from, the command below works well.
python -m SimpleHTTPServer 8080
Simple really, in summary our web app just serves up a link to the resource in the S3 bucket rather than from it’s own host. Really need a Rails guru to chime in and say what the best way to set the resource_host variable would be though. I’m sure secrets.yml isn’t meant for this!
Last thought before moving on, putting that installer image in the S3 bucket cost money, for transfer in fees, every download costs money, and we’ve made it publicly available – this worries me.
Step 2 – moving the rails assets pipeline to S3 & CloudFront
I’ve chosen not to do this at this stage. What? I thought this was an article about how to do that! Here’s the deal, if I follow the pretty simple advice out there to simply use CloudFront, e.g. https://www.happybearsoftware.com/use-cloudfront-and-the-rails-asset-pipeline-to-speed-up-your-app.html, then I’m simply running up my costs further.
Well that’s how it seems at the moment, with no control over bandwidth costs due to user behaviour or any other malevolent person choosing to do so, we’re exposed enough already. I’ll park this for now, and return when I understand better what techniques are available for controlling exposure here.
For reference, my current setup in the single VM, which hosts both the UAT and Production website in Apache virtual sites, and the database is different schemas in a single MySQL instance, costs less than a tenner a month, and has unlimited (subject to fair use) bandwidth. For sure this project is about adding resilience and scalability but as always there’s a decision about costs versus value. We don’t understand our total costs yet, perhaps there’s a way of modelling and understanding potential costs based on apache and mysql logs, but also need to understand the behaviours on costs that EBS, RDS, S3 and maybe eventually CloudFront add to this.