ARCHIVE NOTICE

My website can still be found at industrialcuriosity.com, but I have not been posting on this blog as I've been primarily focused on therightstuff.medium.com - please head over there and take a look!

Monday, 22 June 2020

A templated guide to AWS serverless development with CDK

If all you're looking for is a no-frills, quick, step-by-step guide, just scroll on down to The Guide!


The Story

Or, how I learned to let go of my laptop and embrace The Cloud...


Motivation


A while ago I outlined a project that I intend to build, one that I'm rather enthusiastic about, and after spending a few hours evaluating my options I decided that an AWS solution was the right way to go: their options are configurable and solid, their pricing is very reasonable and, as an added bonus, I get to familiarize myself with tech I'll be using for my current contract!

All good.

Once I'd made the decision to use AWS, becoming generally familiar with the products on offer was easy enough and it took me very little time to flesh out a design. I was ready to get coding!

Or so I thought...


Hurdle 1

Serverless. And possibly other frameworks, I'm pretty sure I looked at a few but it seems like forever ago. Serverless shows lots of promise, and I know people who swear by it, but it's only free up to a point and I immediately ran into strange issues with my first deployment. I found the interface surprisingly unhelpful, and it looks like once you're using it you're somewhat locked in.

Pass.


Hurdle 2

Cognito. Security first. Cognito sounds like a great solution, but once I'd gotten a handle on how it works I was severely disappointed by how limiting it is, and getting everything set up takes real developer and user effort (and I suffer enough from poor interface fatigue). After playing around with user pools and looking at various code samples, I realized that I'd rather allow users to register using their email addresses / phone numbers as second-factor authentication (mailgun and twilio are both straightforward options), or use oauth providers like Facebook, Google and GitHub. I also want to encourage my users to use strong, memorable passwords (easily enforced with zxcvbn - and why is this still a thing?!), which Cognito doesn't allow for.

You'll need to configure Lambda authorizers either way, so I really don't think Cognito adds much value.


Hurdle 3

Lambda / DynamoDB. Okay, so writing a lambda function is really easy, and guides and examples for code that reads from and writes to DynamoDB abound. Great! Except for the part where you need to test your functions before deploying them.

My first big mistake - and so far it's proved to be the most expensive one - was not understanding at this point that "testing locally" is simply not a feasible strategy for a cloud solution.


Hurdle 4

The first code I wrote for my project was effectively a lambda environment emulator to test my lambda functions. It was far from perfect, and it did take me a couple of hours to cobble together, but it did the job and I used it to successfully test lambda functions against DynamoDB running in Docker.


Hurdle 5

Lambda Layers. Why do most guides not touch on these? Why are there so few layers guides written for JavaScript? It took me a little while to get a handle on layers and build a simple script to create them from package.json files, but as far as hurdles go this was a relatively short one.


Hurdle 6

Deployment. It's nice to have code running locally, but uploading it to the cloud? Configuring API Gateway was a mixed bag of Good Interface / Bad Interface, same with the Lambda functions, and what eventually stumped me was setting up IAM to make everything play nicely together. What's the opposite of intuitive? Not counter-intuitive, in this case, as I don't feel that that word evokes nearly enough frustration.

Anyway, it became abundantly clear at that point that manual deployment of AWS configurations and components is not a viable strategy.


Hurdle 7

A coworker introduced me to two tools that could supposedly Solve All My Problems: CDK and SAM. This seemed like a worthy rabbit-hole to crawl into, but I couldn't find any examples of the two working together!

I began to build my own little framework, one that would allow me to configure my stack using CDK, synthesize the CloudFormation templates, and test locally using SAM. Piece by piece I put together this wonderful little tool, hour by hour, first handling DynamoDB, then Lambda functions, then Lambda Layers...

It was at that point that realization dawned: not only are SAM and CDK not interoperable by design, but SAM does not, in fact, provide meaningful "local" testing. Sure, you can invoke your lambda functions on your local machine, but the intention is to invoke them against your deployed cloud infrastructure. Once I got that through my head, it was revelation time: testing in the cloud is cheaper and better than testing locally.


The Guide

If you're like me, and you intend your first foray into cloud computing to be simple, yet reasonably production-ready, CDK is the easiest way forward and it's completely free (assuming you don't factor in your time figuring it out as valuable, but that's what I'm here for!).

Over the course of the past couple of weeks, I've put together the aws-cdk-js-dev-guide. It's a work in progress (next stop, lambda authenticators!), but at the time of writing this guide it's functional enough to put together a simple stack that includes Lambda Layers, DynamoDB, functions using both of those, API Gateway routes to those functions and the IAM permissions that bind them.

And that's just the tip of the CDK iceberg.


Step 1 - Tooling

It is both valuable and necessary to go through the following steps prior to creating your first CDK project:

  • Create a programmatic user in IAM with admin permissions
  • If you're using Visual Studio Code (recommended), configure the aws toolkit
  • Set up credentials with the profile ID "default" (see the sketch after this list)
  • Get your 12 digit account ID from My Account in the AWS console
  • Follow the CDK hello world tutorial
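For the credentials step above, both the AWS CLI and the VS Code toolkit read the standard shared credentials files. A minimal sketch of ~/.aws/credentials (the key values are placeholders, not real credentials):

    [default]
    aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
    aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

and ~/.aws/config for the default region:

    [default]
    region = us-east-1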


Step 2 - Creating a new CDK project

The first step to creating a CDK project is initializing it with cdk init, and a CDK project cannot be initialized if the project directory isn't empty. If you would like to use an existing project (like aws-cdk-js-dev-guide) as a template, bear in mind that you will have to rename the stack in multiple locations and it would probably be safer and easier to create a new project from scratch and copy and paste in whatever bits you need.
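For example, creating a fresh project looks something like this (cdk init supports a handful of languages; pick whichever you're most comfortable with):

    > mkdir my-new-stack && cd my-new-stack
    > cdk init app --language typescript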


Step 3 - Stack Definition

There are many viable approaches to setting up stages; mine is to replicate my entire stack for development, testing and production. If my stack wasn't entirely serverless - if it included EC2 or Fargate instances, for example - then this approach might not be feasible from a cost point of view.

Stack definitions are located in the /lib folder, and this is where the stacks and all their component relationships are configured. Here you define your stacks programmatically, creating components and connecting them.
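As a rough illustration of the shape this takes (this uses the CDK v1 module names I'm working with, and the table, function and route names are made up for the example):

    const cdk = require('@aws-cdk/core');
    const dynamodb = require('@aws-cdk/aws-dynamodb');
    const lambda = require('@aws-cdk/aws-lambda');
    const apigateway = require('@aws-cdk/aws-apigateway');

    class SampleStack extends cdk.Stack {
        constructor(scope, id, props) {
            super(scope, id, props);

            // a simple table with a string partition key
            const objectsTable = new dynamodb.Table(this, 'objects-table', {
                partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
                billingMode: dynamodb.BillingMode.PAY_PER_REQUEST
            });

            // a function that reads from and writes to the table
            const objectsHandler = new lambda.Function(this, 'objects-handler', {
                runtime: lambda.Runtime.NODEJS_12_X,
                code: lambda.Code.fromAsset('handlers/objects'),
                handler: 'index.handler',
                environment: { OBJECTS_TABLE: objectsTable.tableName }
            });

            // the IAM permissions that bind them
            objectsTable.grantReadWriteData(objectsHandler);

            // an API Gateway route to the function
            const api = new apigateway.RestApi(this, 'sample-api');
            const objects = api.root.addResource('objects');
            objects.addMethod('GET', new apigateway.LambdaIntegration(objectsHandler));
            objects.addMethod('POST', new apigateway.LambdaIntegration(objectsHandler));
        }
    }

    module.exports = { SampleStack };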



I find that the /lib folder is a good place to put your region configurations.
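Something as simple as this does the trick (the file name and the stage breakdown are just how I've chosen to illustrate it):

    // lib/regions.js
    module.exports = {
        'us-east-1': { stages: ['dev', 'test', 'prod'] },
        'eu-west-1': { stages: ['prod'] }
    };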



Once you have completed this, the code to synthesize the stack(s) is located in the /bin folder. If you intend, like I do, to deploy multiple replications of your stack, this is the place to configure that.
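A sketch of a /bin entry point that replicates the same stack per stage (the account ID and stack name are placeholders):

    const cdk = require('@aws-cdk/core');
    const { SampleStack } = require('../lib/sample-stack');

    const app = new cdk.App();
    const env = { account: '123456789012', region: 'us-east-1' };

    // one copy of the stack per stage
    for (const stage of ['dev', 'test', 'prod']) {
        new SampleStack(app, `sample-stack-${stage}`, { env });
    }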



See the AWS CDK API documentation for reference.


Step 4 - Lambda Layers

I've got much more to learn about layers at the time of writing, but my present needs are simple: the ability to make npm packages available to my lambda functions. CDK's Code.fromAsset method requires the original folder (as opposed to an archive file, which is apparently an option), and that folder needs to have the following internal structure:

.
./nodejs
./nodejs/package.json
./nodejs/node_modules/*

You can manually create this folder anywhere in your project, maintain the package.json file and remember to run npm install and npm prune every time you update it... or you can just copy in the build-layers script, maintain a package.json file in the layers/src/my-layer-name directory and run the script as part of your build process.
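I won't reproduce the actual build-layers script here, but the gist of what it needs to do per layer is roughly the following (the layers/build output path is an assumption made for the sake of the sketch):

    // for each layer: copy its package.json into a nodejs subfolder
    // and install its production dependencies there
    const fs = require('fs');
    const path = require('path');
    const { execSync } = require('child_process');

    const srcRoot = path.join(__dirname, 'layers', 'src');
    const buildRoot = path.join(__dirname, 'layers', 'build');

    for (const layerName of fs.readdirSync(srcRoot)) {
        const target = path.join(buildRoot, layerName, 'nodejs');
        fs.mkdirSync(target, { recursive: true });
        fs.copyFileSync(
            path.join(srcRoot, layerName, 'package.json'),
            path.join(target, 'package.json')
        );
        execSync('npm install --production', { cwd: target, stdio: 'inherit' });
    }

Code.fromAsset would then be pointed at the resulting layers/build/my-layer-name folder.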


Step 5 - Lambda Functions

There are a number of ways to construct lambda functions; I prefer to write mine as promises. There are two important things to note when putting together your lambda functions:

1. Error handling: if you don't handle your errors, lambda will handle them for you... and not gracefully. If you do want to implement your lambda functions as promises, as I have, try to use "resolve" to return your errors.

2. Response format: your response object MUST be in the following structure:
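The shape is the standard one for API Gateway's lambda proxy integration - note that body must be a string (the values here are placeholders):

    {
        "statusCode": 200,
        "headers": { "Content-Type": "application/json" },
        "body": "{\"result\":\"success\"}"
    }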


This example lambda demonstrates using a promise to return both success and failure responses:
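A minimal sketch of the pattern (not the repo's exact code):

    exports.handler = async (event) => {
        return new Promise((resolve) => {
            try {
                const result = { message: 'it worked' };
                resolve({
                    statusCode: 200,
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify(result)
                });
            } catch (err) {
                // resolve (rather than reject) so we stay in control of the error response
                resolve({
                    statusCode: 500,
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({ error: err.message })
                });
            }
        });
    };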



Step 6 - Deployment

There are three steps to deploying and redeploying your stacks:
  1. Build your project
    If you're using TypeScript, run
    > tsc
    If you're using layers, run the build-layers script (or perform the manual steps)
  2. Synthesize your project
    > cdk synth
  3. Deploy your stacks
    > cdk deploy stack-name
    Note: you can deploy multiple stacks simultaneously using wildcards
At the end of deployment, you will be presented with your endpoints. Unless your lambda has its own routing configured - see the sample dynamodb API examples that follow - simply make your HTTP requests to those URLs as is.


    Sample dynamodb API call examples:
  • List objects
    GET https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/prod/objects
  • Get specific object
    GET https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/prod/objects/b4e01973-053c-459d-9bc1-48eeaa37486e
  • Create a new object
    POST https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/prod/objects
  • Update an existing object
    POST https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/prod/objects/b4e01973-053c-459d-9bc1-48eeaa37486e
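For example, creating an object with curl (the endpoint and payload are placeholders):

    curl -X POST "https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/prod/objects" \
         -H "Content-Type: application/json" \
         -d '{"name": "my first object"}'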

Step 7 - Debugging

Error reports from the API Gateway tend to hide useful details. If a function is not behaving correctly or is failing, go to your CloudWatch dashboard and find the log group for the function.

Step 8 - Profit!!!

I hope you've gotten good use of this guide! If you have any comments, questions or criticism, please leave them in the comments section or create issues at https://github.com/therightstuff/aws-cdk-js-dev-guide.

Saturday, 28 March 2020

An improved (fairer) playlist shuffling algorithm

Lots of people find playlist shuffling insufficiently random for a variety of reasons, some of which have been addressed by the industry.

There's one aspect that my wife and I haven't seen, though, and that's making sure that no songs get "left behind" whenever a playlist is reshuffled, whether intentionally or by switching back-and-forth between playlists.

In an attempt to sow seeds, I've just put together an example of an improvement that can easily be applied to any of the existing shuffle algorithms.
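The linked example is the authoritative version, but the core idea can be sketched simply enough: treat the tracks that haven't been played yet as their own pool, and on every reshuffle put that pool (shuffled) ahead of the already-played tracks (also shuffled). A rough sketch in JavaScript:

    // Fisher-Yates shuffle (in place)
    function shuffle(list) {
        for (let i = list.length - 1; i > 0; i--) {
            const j = Math.floor(Math.random() * (i + 1));
            [list[i], list[j]] = [list[j], list[i]];
        }
        return list;
    }

    // reshuffle so that unplayed tracks always come before played ones
    function fairReshuffle(tracks, playedIds) {
        const unplayed = tracks.filter(track => !playedIds.has(track.id));
        const played = tracks.filter(track => playedIds.has(track.id));
        return [...shuffle(unplayed), ...shuffle(played)];
    }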

Tuesday, 24 March 2020

The value of git (re)parenting

EDIT: I have subsequently learned about git merge --no-ff and no longer re-parent. But I'll leave this up in case someone finds it useful anyway.


I'm not a command-line person, I may have grown up with it but... let's just say I've developed an allergy. I want GUIs, I want simplicity and I want visual corroboration that I'm doing what I think I'm doing. I really don't see why I need to become an expert in every tool I use before it can be functional.

This is why I adore SourceTree, and (last time I used it) GitEye. Easier, more intuitive interface, and visualizations to help you see whether what you're doing makes sense. You're less likely to make mistakes!

I'm also (and for similar reasons) a huge fan of squashed merges in Git; one commit per feature / bugfix / hotfix, and (theoretically) the ability to go through the individual commits on my own time if I ever need to. I use interactive rebasing* a lot to achieve a similar result, but getting other devs to use it responsibly is a lot more trouble than teaching them to add --squash to the merge command.

The downside is that because I've become used to interactive rebasing to squash my commits before merging, I've also become used to using regular merges and seeing a neat line on the branch graph showing me where my merges originated. Squashed commits don't give you that. And that's sad. It's also problematic - this morning I freaked out because I thought the code in a branch I'd squash-merged into wasn't where it needed to be, all because I was relying on those parenting lines and didn't immediately think to do a diff**.

* Ironically, from the command-line. It's the one thing I find less intuitive and more risky using SourceTree.

** While we're talking about branch diffs, if you're using Bitbucket don't trust the branch diff - the UI uses a three-dot diff and what you actually want is a two-dot, ie git diff branch1..branch2

So, after much surfing around the internets and learning lots more about git's inner workings than I care to, I came across an elegant little solution and have wrapped it with a bash script in the hopes that it'll be found useful. All you have to do is run this script as follows:

./add_parent.sh TARGET_COMMIT_ID NEW_PARENT_COMMIT_ID
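I'll leave the script's internals to the repository, but for the curious, a similar effect can be sketched with git replace: graft the target commit onto its existing parents plus the new one (note that replace refs are local-only unless you explicitly push them, and this isn't necessarily how the script does it):

    git replace --graft TARGET_COMMIT_ID $(git rev-parse TARGET_COMMIT_ID^@) NEW_PARENT_COMMIT_ID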

Sunday, 26 January 2020

Lessons from my first Direct-to-Print experience

The direct-to-print paperback edition of Shakespeare's Sonnets Exposed: Volume 1 is now up for review, after I finally ironed out the formatting kinks last night and finished fiddling with the cover around 2am. So now that I've had a chance to sleep on it, and re-re-re-review everything before submitting it, I have a few notes for anyone who wants to self-publish a book on Kindle.

1. Apple's Pages is a fantastic tool for a number of reasons, but what it produces by default really isn't great for reflowable epubs or paperback formats, which are both important if you want your work to be accessible, readable and attractive to your readers. It's a good place to start, though, just as Microsoft Word is, because it exports to Word and once you have a Word doc you can then import your work into Amazon's Kindle Creator.

2. I wish I'd begun with Kindle Creator, even though my intention is to publish on other platforms as well. Kindle is not only the easiest platform to get started on - and is probably the most accessible for your audience - but from a formatting perspective is effectively the lowest common denominator: it's tough to use custom fonts for Kindle publications, and I suspect they've made it so intentionally in order to standardize the reading experience.

My advice is to sort out the formatting for the ebook, publish it (see step 5), convert it to EPUB for other platforms, then tweak the formatting (and possibly content) for print publishing.

3. Don't bother adding your ISBN barcodes to the books yourself. Even if you have one issued for your paperback, it's best to use the KDP generated one for the Kindle direct-to-paperback offering and reserve any others for paperbacks where you have to add the code to the cover manually.

4. You can download cover templates here. I didn't realize that and I made my own, which in retrospect was silly.

5. After you've "published" your book, you'll have a KPF file that you can upload to KDP. From Converting KPF to EPUB format:

I recently managed to successfully convert kpf to epub format using jhowell's KFX conversion plugin for Calibre. Just install the plugin and use drag-and-drop to load your kpf file into Calibre. Then convert the kpf file to epub in the normal way using Calibre. Save your new epub to your desktop and then run Epubcheck on it to ensure that it is a valid epub (it always passes).

If you run into any issues with these suggestions, please let me know in the comments!

Monday, 20 January 2020

ISBN codes for Dummies

Step 1

Acquire ISBN codes. For South African residents this is a free service (thank you, NLSA!), and all you have to do is request them and assign them (there's really no need to pay anyone any money, simply look up contact details on their website and call or email until you reach someone).

It's important to note that not only does each format (paperback, hardcover, etc) need its own ISBN, but each e-book format (eg. epub, mobi, PDF) does as well!

IMPORTANT UPDATE: I've recently been informed that eBookFairs.com has, in addition to a really cool concept, an excellent free online barcode generator that I've now tried out. It's far less effort than what I originally published, and takes care of both steps 2 and 3 below.
(NOTE: use the full ISBN including the check digit, see below for a detailed explanation).

Step 2

Generate the actual barcode using the ISBN 13 section of this free online generator. As explained here:
Before making an ISBN barcode, the user must first apply for an ISBN number. This number should be 10 or 13 digits, for example 0-9767736-6-X or 978-0-9767736-6-5. Once the ISBN number is obtained, it should be displayed above the barcode on the book. All books published after January 1, 2007 must display the number in the new 13-digit format, which is referred to as ISBN-13. Older 10 digit numbers may be converted to 13 digits with the free ISBN conversion tool.

The last digit of the ISBN number is always a MOD 11 checksum character, represented as numbers 0 through 10. When the check character is equal to 10, the Roman numeral X is used to keep the same amount of digits in the number. Therefore, the ISBN of 0-9767736-6-X is actually 0-9767736-6 with a check digit of 10. The ISBN check digit is never encoded in the barcode.
Simply remove the hyphens (dashes) and the check digit from your ISBN, paste it into the text box and hit "refresh". I recommend changing the image settings to PNG format with 300 DPI. You can also change the colors if you wish.
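If you'd like to sanity-check a check digit yourself, the ISBN-10 calculation is easy to script. A quick JavaScript sketch that reproduces the 0-9767736-6-X example from the quote above:

    // ISBN-10 check digit: weight the first nine digits from 10 down to 2,
    // then take (11 - sum mod 11) mod 11; a result of 10 is written as X
    function isbn10CheckDigit(firstNineDigits) {
        const digits = firstNineDigits.replace(/-/g, '').split('').map(Number);
        const sum = digits.reduce((acc, digit, i) => acc + digit * (10 - i), 0);
        const check = (11 - (sum % 11)) % 11;
        return check === 10 ? 'X' : String(check);
    }

    console.log(isbn10CheckDigit('0-9767736-6')); // X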

Step 3

You now have ISBNs and their barcodes, but for a professional look you'll also want the barcode's title set in the right font. That's as simple as adding ISBN 0-9767736-6-X above the barcode image. The free generator uses the Arial font, but the more traditional choice is a monospace font.

Monday, 23 September 2019

@mysql/xdevapi joy!

after hitting a wall last night with the existing node.js mysql packages, which for some reason have been years behind integrating with mysql 8's authentication, i discovered @mysql/xdevapi. the documentation's a bit heavy and not really comprehensive, but now that i've figured out how to use it, i'm very pleased with how it operates! unfortunately, i'm going to have to roll my own migrations management, but that's a small price to pay for being able to interface with the latest mysql databases in an intelligent way.

in the interests of reducing friction for other adopters, i've rolled some sample queries into my database creation script. enjoy!

UPDATE: i've subsequently learned that table joins have not been implemented, so to perform those you'd have to use the session.sql method and use it in the same way (the same promise chaining) as the CRUD methods. seems like a serious oversight, but whatever.
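to illustrate the shape of that workaround, here's a rough sketch (the connection details and query are placeholders, and i'm using the row-callback form of execute):

    const mysqlx = require('@mysql/xdevapi');

    mysqlx.getSession({ host: 'localhost', port: 33060, user: 'me', password: 'secret', schema: 'mydb' })
        .then(session => {
            const rows = [];
            // joins aren't available through the CRUD methods, so fall back to raw SQL
            return session.sql('SELECT u.name, o.total FROM users u JOIN orders o ON o.user_id = u.id')
                .execute(row => rows.push(row))
                .then(() => {
                    console.log(rows);
                    return session.close();
                });
        })
        .catch(err => console.error(err));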

Friday, 23 August 2019

Handling emails with node.js and Mailparser

After sorting out mail forwarding and piping emails with postfix, I then needed to understand how to handle emails being POSTed to an API endpoint.

To parse an email with node.js, I recommend using Mailparser's simpleParser. I'm using express with bodyParser configured as follows:
app.use(bodyParser.json({
    limit: config.bodyLimit
}));
In your handler:
const express = require('express');
const router = express.Router();
const simpleParser = require('mailparser').simpleParser;
or
import { Router } from 'express';
import { simpleParser } from 'mailparser';
and then
router.post('/', (req, res) => {
    simpleParser(req)
        .then(parsed => {
            res.json(parsed);
        })
        .catch(err => {
            res.status(500).json(err);
        });
});
Mailparser is excellent, and documented, but the documentation assumes that we're familiar with the email format. Fortunately, oblac's example email exists for those of us who aren't!

To test, send the example email to the endpoint via curl:

curl --data-binary "@./example.eml" http://your-domain-name/api/email

or Postman (attach file to "binary").

And we're good to go!

Thursday, 22 August 2019

Mail forwarding and piping emails with Postfix for multiple domains

The past few weeks I've been learning about mail servers, and the biggest takeaway for me is that it's generally worth paying someone else to handle the headache. As usual, the obstacles to configuring a mail server correctly are primarily in the lack of useful documentation and examples, so I'm putting this down here in the hopes that it'll be helpful to like-minded constructively-lazy devs.

While I'm happily using mailgun for mail sending (after much frustration I threw in the towel trying to integrate DKIM packages with Postfix to get my outgoing emails secured) I was certain that I could at least have my mail server handle mail-forwarding for my multiple domains, and while that proved to be fairly straightforward I then tumbled down a rabbit-hole trying to get Postfix to pipe certain emails to a node.js script for processing.

    Here are the steps you'll need to take:

  • Set up your A and MX records for your domain: an A record @ pointing to the IP address of the server you're going to be receiving emails on, and an MX record with the hostname @ and the value 10 mail.your-domain-name

    If your mail server is not the same as your primary A record, simply create an additional A record mail pointing to the correct IP address.
  • sudo apt-get install postfix
    Select "Internet Site" and enter your-domain-name (fully qualified)
  • sudo vi /etc/postfix/main.cf
    • Add mail.your-domain-name to the list of mydestination values
    • Append
      virtual_alias_domains = hash:/etc/postfix/virtual_domains
      virtual_alias_maps = hash:/etc/postfix/virtual
      to the end of the file
  • sudo vi /etc/aliases
    curl_email: "|curl --data-binary @- http://your-domain-name/email"
  • sudo newaliases
  • sudo vi /etc/postfix/virtual_domains
    example.net   #domain
    example.com   #domain
    your-domain-name   #domain
    (the #domain fields suppress warnings)
  • sudo postmap /etc/postfix/virtual_domains
  • sudo vi /etc/postfix/virtual
    info@your-domain-name bob@gmail.com
    everyone@your-domain-name bob@gmail.com jim@gmail.com
    email_processor@your-domain-name curl_email@localhost
    @your-domain-name catchall@whereveryouwant.com
    ted@example.net jane@outlook.com
  • sudo postmap /etc/postfix/virtual
  • sudo /etc/init.d/postfix reload

You should be able to find your postfix logs at /var/log/mail.log. Good luck!

Monday, 5 August 2019

FreeVote: anonymous Q&A app

I can't believe something like this doesn't exist already! It's far from a polished product, but it's already pretty functional. Feel free to send me feedback, it'll help me prioritize the long list of improvements I've already come up with. This was inspired by my coworkers: every retrospective, we all write down answers to sensitive questions "anonymously" on folded papers that one team member gathers and reviews... this way we can do it truly anonymously and all see the results.

FreeVote

Saturday, 6 April 2019

Hosting my own podcast

I've been having trouble with Podcast Garden; it turns out they're a really unprofessional outfit: more than half the time their website is inaccessible, and they have poor customer service. Also, they've taken a lot more money off me than I agreed to, so I've opened a dispute with PayPal; hopefully they'll sort this all out.

In the meantime, I've spent the last couple of days migrating to a custom solution; I honestly don't know what I'd do if I wasn't a software developer. The Shakespeare's Sonnets Exposed podcast is now being hosted right here on industrial curiosity, and not only am I no longer at the mercy of other people's incompetence, it's a much cheaper solution, too!

Now I just need to find the time to make my solution easily available to the public...