## Sort Algorithms Cheatsheet

This is a summary of the key features that make up an algorithm.

## Motivation

While the concept behind each sort algorithm is easy to understand, I find it difficult to remember the key steps that define or characterise each one. Such a step may be pivotal to implementing the algorithm correctly, yet look insignificant from the macro view of its concept.

Hence, I decided that it will be better for me to jot the key pointers down.

## Quicksort

``````ruby
def quicksort array
  recursion(array, 0, array.length - 1) #IMPT
  array
end

def recursion array, start, finish
  if start < finish # IMPT
    pivot_index = partition(array, start, finish)
    recursion(array, start, pivot_index - 1) # IMPT
    recursion(array, pivot_index + 1, finish) # IMPT
  end
end

def partition array, start, finish
  pivot = array[finish]
  pivot_index = start

  (start...finish).each do |index| # IMPT
    if array[index] <= pivot # NOTE
      array[index], array[pivot_index] =
        array[pivot_index], array[index]
      pivot_index += 1
    end
  end

  array[finish], array[pivot_index] =
    array[pivot_index], array[finish]

  pivot_index
end
``````

### Key Steps

• Gist: Using a pivot value, partition the array into 2 parts that are not yet ordered internally: every value on the left is smaller than or equal to the pivot, and every value on the right is larger.
• The array is mutated.
• At each level of recursion, the pivot value lands in its final position in the array, which eventually leads to a sorted array.

### Discussions

Let’s start with the `recursion` function.

In the `recursion` function, note that the arguments are indices of the array, not the length. Keep this at the back of your mind so that you can understand when to end a loop.

Line 7 ensures the range holds at least 2 elements. A range of 0 or 1 elements is already sorted, so the recursion stops there.

In lines 9 and 10, the recursion occurs on either side of the pivot index in that iteration. Note that the pivot index does not participate in the next recursion, since it is already at where it belongs.

Now for the `partition` function.

In the `partition` function, the pivot does not participate in the reordering. Line 18 ensures the loop ends before reaching the last index, `finish`, which is the pivot, with the non-inclusive range constructor operator.

In the loop starting at line 18, we push the values smaller than or equal to the pivot to the left. Using `<` instead also works; values equal to the pivot then simply end up on the right side.

We do so by swapping them with those that are bigger than the pivot, but exist on the left of those that are smaller.

The `pivot_index` increments at each swap, so it always points at the next slot to swap into. Hence, at the end of the loop, it holds the position of the first value that is bigger than the pivot (or `finish` itself if no value is bigger). Everything on its left is smaller than or equal to the pivot.

This is where the pivot belongs in the array. We swap the pivot into that position. Ascend the throne!

The partition still holds after this last swap: all elements on the left of the pivot are smaller than or equal to it, while all elements on the right are bigger. Both sides remain unsorted.

The function returns the pivot’s position to the parent `recursion` function, which needs it to know where to split the array for the next iteration.

Lastly, let’s go back to the calling function where the initial recursion function is triggered. Make sure to pass in the last index of the array instead of its length.
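
To convince myself of what one call to `partition` leaves behind, here is a small trace on a made-up sample array. The method is the same last-element-pivot (Lomuto-style) partition as above, repeated only so that the snippet runs on its own; the sample values are my own assumption.

```ruby
# Same partition as above, repeated so the snippet is self-contained.
def partition(array, start, finish)
  pivot = array[finish]
  pivot_index = start

  (start...finish).each do |index|
    next unless array[index] <= pivot

    array[index], array[pivot_index] = array[pivot_index], array[index]
    pivot_index += 1
  end

  array[finish], array[pivot_index] = array[pivot_index], array[finish]
  pivot_index
end

sample = [7, 2, 9, 4, 5] # the pivot is the last element, 5
pivot_index = partition(sample, 0, sample.length - 1)

sample                                # => [2, 4, 5, 7, 9]
pivot_index                           # => 2, the pivot's final position
left  = sample[0...pivot_index]       # values <= pivot, not sorted in general
right = sample[(pivot_index + 1)..-1] # values > pivot, not sorted in general
```

Only the pivot is guaranteed to be in its final spot; the two sides are handed to the next level of recursion.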

## Mergesort

``````ruby
def merge_sort(list)
  return list if list.length <= 1

  mid = list.length / 2
  left = merge_sort(list[0...mid])
  right = merge_sort(list[mid...list.length])
  merge(left, right)
end

def merge(left, right)
  return right if left.empty?

  return left if right.empty?

  if left.first <= right.first
    [left.first] + merge(left[1...left.length], right)
  else
    [right.first] + merge(left, right[1...right.length])
  end
end
``````

### Key Steps

• Gist: first recursively halve the array until we are dealing with 1 element, then recursively merge the pieces back in sorted order until we get an array of the original size, which is now sorted
• A recursive function that consists of 2 parts in order: recursively split and recursively merge
• The input array is not mutated in this version: slicing with a range and `+` both build new arrays
• In lines 16 and 18, we keep taking the smaller of the first elements of the left and right arrays.
• Lines 11 and 13 take care of the case where 1 side has been used up while the other still has elements inside. Since both arrays are already sorted at every level, we can just append the whole remaining array.
• Remember the base case in line 2

Unfortunately, while this use of recursion is great, `merge` recurses once per element, so a large array can trigger a “stack level too deep” error.

We need to prepare an alternative for when the stack overflows.

``````ruby
def merge_sort(list)
  return list if list.length <= 1

  mid = list.length / 2
  left = merge_sort(list[0...mid])
  right = merge_sort(list[mid...list.length])
  merge(list, left, right)
end

def merge(array, left, right)
  left_index = 0
  right_index = 0
  index = 0

  while left_index < left.length &&
        right_index < right.length
    if left[left_index] <= right[right_index]
      array[index] = left[left_index]
      left_index += 1
    else
      array[index] = right[right_index]
      right_index += 1
    end
    index += 1
  end

  array[index...index + left.length - left_index] =
    left[left_index...left.length]
  array[index...index + right.length - right_index] =
    right[right_index...right.length]

  array
end
``````

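Lines 15 to 30 carry out the merge with a `while` loop instead of recursion, writing the merged values into the array that was passed in, so it mutates that array along the way. `merge_sort` itself still recurses, but only about log2(n) levels deep.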

## Insertion sort

### Key Steps

• Gist: insert elements one by one from the unsorted part of the array into the sorted part
• Divide the array into a sorted portion and an unsorted portion
• The sorted portion always starts as the first element, since an array of 1 element is always sorted
• The first element of the unsorted portion moves towards the front until it reaches the start of the sorted portion OR meets an element smaller than or equal to itself
• Order of the sorted portion is maintained
• The bigger elements of the sorted portion each shift one step back to make room, so the last element of the sorted portion takes the slot the inserted element vacated
• The next iteration starts on the next element, which is now the first element of the unsorted portion
• The loop mutates the array

### Discussions

• Best case is an already sorted array: each element needs a single comparison and nothing in the sorted portion has to shift, resulting in a time complexity of `O(n)`
• The worst case is a reverse sorted array, where the whole sorted portion has to shift on every iteration: the first element of the unsorted portion is always the smallest so far and needs to go to the front of the sorted portion. Time complexity is `O(n^2)`
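
The key steps above can be sketched as follows. This is my own minimal version rather than a canonical one; the method and variable names are made up for illustration.

```ruby
# Sorts the array in place and returns it.
def insertion_sort(array)
  (1...array.length).each do |unsorted_start|
    current = array[unsorted_start]
    position = unsorted_start

    # Shift the bigger elements of the sorted portion one step back.
    while position > 0 && array[position - 1] > current
      array[position] = array[position - 1]
      position -= 1
    end

    # Drop the element into the gap that opened up.
    array[position] = current
  end
  array
end

insertion_sort([5, 2, 4, 6, 1, 3]) # => [1, 2, 3, 4, 5, 6]
```

On an already sorted array the `while` condition fails immediately on every pass, which is where the best case comes from.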

## Selection sort

### Key Steps

• Gist: scan the array to find the smallest element and exclude it from the following iterations
• Swap the smallest element with the frontmost element of the unsorted part
• Scan the array in the next iteration excluding the smallest element(s) already placed
• The last remaining element must be the largest, so the outer loop only runs up to index `n - 2`

### Discussions

• Time complexity is `O(n^2)`, even for an already sorted array, because every pass still scans the whole unsorted part
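
A minimal sketch of those steps, with names of my own choosing:

```ruby
# Sorts the array in place and returns it.
def selection_sort(array)
  # The last element needs no pass of its own, so stop at index n - 2.
  (0...array.length - 1).each do |front|
    smallest = front

    # Scan the unsorted part for the index of the smallest element.
    ((front + 1)...array.length).each do |index|
      smallest = index if array[index] < array[smallest]
    end

    array[front], array[smallest] = array[smallest], array[front]
  end
  array
end

selection_sort([64, 25, 12, 22, 11]) # => [11, 12, 22, 25, 64]
```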

## Bubble sort

``````ruby
def bubble_swap array
  swap_took_place = true
  while swap_took_place
    swap_took_place = false
    (0...array.length - 1).each do |index|
      if array[index] > array[index + 1]
        array[index + 1], array[index] =
          array[index], array[index + 1]
        # increment swaps here to record
        # number of swaps that took place
        swap_took_place = true
      end
    end
  end
  array
end
``````

### Key Steps

• Gist: walk down the array swapping adjacent elements whenever the left is larger than the right, and repeat the pass until one completes without any swap, which takes at most as many passes as there are elements in the array. That final swap-free pass is what declares the array sorted.

### Discussions

• Time complexity is `O(n^2)`. Thanks to the `swap_took_place` flag, an already sorted array takes a single `O(n)` pass

## Connecting MSSQL Database Using Ruby On Rails

This documents how to connect to an MSSQL database from a Rails application. We will use FreeTDS as the main toolkit to establish the connection.

## Motivation

I came across a gig that required me to connect to an MSSQL database and extract data from it via the application that I was building in Ruby on Rails. I spent quite some time experimenting and playing with it before I managed to get it to work.

It will be good to document my steps and reasons in case I come across another such request and my memory fails me.

## Installation

While Ruby on Rails has a gem that serves as a wrapper around the FreeTDS library of files, it requires the FreeTDS binaries to be installed natively on the machine that is running the application.

This presents a number of challenges. First, the local machine used may be different for different users. Second, the operating system used in the servers and local machine may be different too.

In my case, I use macOS for my development work, and Amazon Linux for my staging and production sites.

## Installing FreeTDS on macOS

The steps listed here follow this guide closely.

First, install the libraries locally using Homebrew.

``````
brew update
brew install unixodbc freetds
``````

ODBC is an API that is meant for database access across different platforms. `unixODBC` is the driver manager that allows unix systems to connect to ODBC-capable databases.

MSSQL is one such database. However, while it uses ODBC for connection, it uses the TDS protocol on the application layer for communication. Hence, an ODBC driver manager alone is insufficient for the machine to process the data in the database. This is where `FreeTDS` comes in.

FreeTDS is a set of libraries that will do the translation and allow our application to connect to the database and retrieve the data.

## Installing FreeTDS on Amazon Linux

Credit goes to this answer on Stack Overflow. The author even gave the steps required to install the packages via Elastic Beanstalk, which is convenient for me as I also use Elastic Beanstalk for deployment.

``````
[ ! -e /home/ec2-user/freetds-1.00.86.tar.gz ] && \
  wget -nc ftp://ftp.freetds.org/pub/freetds/stable/freetds-1.00.86.tar.gz -O /home/ec2-user/freetds-1.00.86.tar.gz || \
  true
``````

The first part of the command, enclosed within a pair of square brackets, is a unix test that succeeds only if the tarball containing the necessary libraries does not yet exist in the home path of the server. In the Amazon Linux system, the home path is `/home/ec2-user` by default. Adjust accordingly if you are installing on a local linux machine.

Should the file exist, the subsequent command to download the file will not be executed due to the logical `&&` operation.

The last `||` operation with `true` ensures the whole command exits successfully, so the Elastic Beanstalk process will continue even if the file already exists. Of course, this step is not necessary if we are installing the libraries manually on our local linux machine.

``````
[ ! -e /home/ec2-user/freetds-1.00.86 ] && \
  tar -xvf /home/ec2-user/freetds-1.00.86.tar.gz -C /home/ec2-user/ || \
  true
``````

Similarly, this step checks for the presence of the extracted directory to prevent repeated and unnecessary extraction of the tarball.

``````
[ ! -e /usr/local/etc/freetds.conf ] && cd /home/ec2-user/freetds-1.00.86 && \
  sudo ./configure --prefix=/usr/local --with-tdsver=7.4 || \
  true

[ ! -e /usr/local/etc/freetds.conf ] && \
  ( cd /home/ec2-user/freetds-1.00.86 && sudo make && sudo make install ) || \
  true
``````

The next 2 commands configure the FreeTDS build and then compile and install its libraries. Installation produces the config file `freetds.conf`, which explains the checks against its existence to prevent duplicate installation operations.

## Application in Ruby on Rails

With the FreeTDS libraries installed on the machine, we can look at how to use the `tiny_tds` gem to communicate with the MSSQL database. After installing it via bundler, we can use the following commands to connect.

``````ruby
client = TinyTds::Client.new(
  host: Rails.application.credentials.dig(Rails.env.to_sym, :deltek, :host),
  port: Rails.application.credentials.dig(Rails.env.to_sym, :deltek, :port),
  database: Rails.application.credentials.dig(Rails.env.to_sym, :deltek, :database)
)
``````

Following the new practice of using the credentials file to store secrets, I have stored all the database credentials in the encrypted `credentials.yml.enc` file.

``````ruby
client.execute("
  SET ANSI_WARNINGS ON;
  SET ANSI_NULLS ON;
  SET QUOTED_IDENTIFIER ON;
  SET ANSI_NULL_DFLT_ON ON;
  SET CONCAT_NULL_YIELDS_NULL ON;
  SELECT @@OPTIONS;
").each
``````

This next snippet configures the settings of the connection. I will not pretend to understand the reasons for the settings made here. However, these are the final settings that allowed me to make the subsequent queries to the database tables. I arrived at this configuration after googling the different errors that were thrown at me while getting TDS to work.

``result = client.execute("SELECT TOP 1 * FROM SOME_TABLE").each``

This is an example of executing a SQL query. The `result` variable will be an array of hashes, where each hash represents 1 row.

``client.close``

Last but not least, make sure to close the client’s connection. This is not Active Record, which “automagically” does that for you.
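
Since the connection has to be closed by hand, one way to make that hard to forget is to wrap the query in a small helper that uses `ensure`. This is only a sketch: `with_mssql_client` is a hypothetical name of my own and not part of `tiny_tds`, and the lambda passed in is assumed to build the `TinyTds::Client` the same way as shown earlier.

```ruby
# Hypothetical helper: builds a client via `connect`, yields it to the block,
# and always closes it afterwards, even when the block raises.
def with_mssql_client(connect)
  client = connect.call
  yield client
ensure
  # `client` is still nil if connecting itself failed, hence the safe call.
  client&.close
end

# Assumed usage with tiny_tds, reusing the query from above:
#
#   rows = with_mssql_client(-> { TinyTds::Client.new(...) }) do |client|
#     client.execute("SELECT TOP 1 * FROM SOME_TABLE").each
#   end
```

The helper returns whatever the block returns, so the rows are still available after the connection is closed.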

## Data Structures Cheatsheet

This is a concise glossary of the concepts, features and applications of various data structures.

## Motivation

Due to the coronavirus outbreaks, the major lockdowns in Europe that ensued, and the stay home quarantine I have to undergo upon return to my country, I am ceasing my digital nomad life, which I have recorded in my Instagram account. So here I am, refreshing my memory on data structures as I prepare to welcome a new phase of my life.

I have a problem finding a good concise cheatsheet that can properly remind me of the concepts of all the data structures, their key features and their runtime performance for various operations. More importantly, I want it to cover when to use them and, as I am a rubyist, how they are applied in ruby.

Each section will talk about 1 data structure. It will cover the main concept behind how it is constructed, some key features that are unique to it, its best use cases, and whether there is something similar in ruby. These concepts follow the playlist on Data Structures on HackerRank’s youtube channel.

## ArrayList

An `arraylist` is a dynamic `array` that expands its capacity when it reaches its maximum. A plain `array` requires memory to be allocated up front. That means we need to establish the size of each element in the array and their total count.

Typically, when the `arraylist` reaches capacity, the library behind it allocates a larger block of memory (often double the size) and copies the elements over. Some languages also expose a method that can be called manually to reserve capacity, such as Java's `ensureCapacity`.

### Key Features

• Expands capacity when required

### Runtime

• Access: O(1) with use of known index of element in array
• Search: O(n)
• Insert
  • prepend: O(n) due to need to shift all elements
  • append: O(1) amortized, as the occasional resize costs O(n)
• Deletion: O(n) due to the search for the element and the shifting of the elements after it

### Applications

• List of items of any kind of order

### Ruby Alternative

In ruby, everything is an object. That includes arrays. Arrays in ruby are dynamic and behave like an ArrayList, as in most other dynamic languages. The array object takes care of growing its own capacity.

It is also heterogeneous, which allows different data types to exist together as elements of the same array (since all of them are objects anyway).

## Binary (Search) Tree

“Tree” most of the time refers to a binary tree. Each node in a binary tree can have a maximum of 2 child nodes. This “tree” is kind of like a linked list of objects. It is not an array.

A binary search tree (BST) additionally keeps its nodes ordered: everything in a node's left subtree is smaller than the node, and everything in its right subtree is larger. Based on this rule, a binary search can be carried out by walking down the nodes and asking one deterministic question at each step: is the value we are looking for smaller or larger than the current node? With a known sort order, each step can, probably, halve the number of nodes left to search. This results in a faster search time.

This is only “probably” achievable because it requires the BST to be balanced. If the tree is lopsided on the right side, for example, each step does not exactly halve the number of nodes to search. In the worst case, every node is the right child of the previous one and we have to comb through all of them.

There are many self-balancing trees. One of them is the AVL tree, named after its inventors. It rotates nodes whenever the tree becomes unbalanced to ensure that “the heights of the two child subtrees of any node differ by at most one”.

Duplicates are allowed in some BSTs, meaning there can be 2 nodes with the same value. Such a tree should always obey the rule that the left node is `<=` the current node. Duplicates introduce complexity in the search algorithms to determine the correct node to pick.

### Key Features

• A node and its left and right nodes have to be sorted in a specific order, either ascending or descending
• Needs to be balanced to be useful
• Traversal always visits the left node before the right node; where the current node is visited relative to them defines the 3 different methods of traversal, i.e.
  • inorder (left, current, right)
  • preorder (current, left, right)
  • postorder (left, right, current)

### Runtime

• Balanced
  • Access: O(log n)
  • Search: O(log n)
  • Insert: O(log n)
  • Deletion: O(log n)
• Imbalanced (worst case scenario)
  • Access: O(n)
  • Search: O(n)
  • Insert: O(n)
  • Deletion: O(n)

### Applications

• Database like CouchDB
• Huffman Coding Algorithm for file compression
• Generally large data with sortable characteristic, and its size should be large enough to justify the use of BST over arrays

### Ruby Alternative

There is no native implementation of BST in ruby. However, there are gems out there that implement it. RubyTree by evolve75 is my favorite as it allows for content payload to be added to each node.

BST is quite an old and established concept. Hence, these gems might appear old and unmaintained.
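
For revision purposes, a BST is also small enough to hand-roll. The sketch below is my own minimal, unbalanced version and is not taken from RubyTree or any other gem; `BstNode` and its method names are made up. Duplicates go to the left, following the `<=` rule mentioned earlier.

```ruby
# Minimal unbalanced binary search tree node.
class BstNode
  attr_reader :value
  attr_accessor :left, :right

  def initialize(value)
    @value = value
  end

  # Smaller or equal values go left, larger values go right.
  def insert(new_value)
    if new_value <= value
      if left
        left.insert(new_value)
      else
        self.left = BstNode.new(new_value)
      end
    elsif right
      right.insert(new_value)
    else
      self.right = BstNode.new(new_value)
    end
    self
  end

  # Compare the target with the current node and descend into one side only.
  def include?(target)
    return true if target == value

    child = target < value ? left : right
    child ? child.include?(target) : false
  end

  # In-order traversal: left subtree, current node, right subtree.
  def inorder
    (left ? left.inorder : []) + [value] + (right ? right.inorder : [])
  end
end

root = BstNode.new(8)
[3, 10, 1, 6, 14].each { |v| root.insert(v) }

root.include?(6) # => true
root.include?(7) # => false
root.inorder     # => [1, 3, 6, 8, 10, 14]
```

Inserting already sorted values into this class produces exactly the lopsided worst case described above, since nothing here rebalances the tree.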

## Min/Max Heap

This tree is always populated from left to right across each level. It is a min heap or a max heap depending on whether the smallest or the largest value is at the root node respectively.

After insertion, the new node is “bubbled” up to its correct position by a series of swaps with its parent node, stopping as soon as the parent is in the right order or the node reaches the root.

If the root node is deleted, the last node replaces it and is “bubbled” down to the correct position.

Because of the way the data is populated, there are no gaps in between nodes, hence this tree can be stored as an array (no need for a linked list)! One can simply use the index of a node in the array to access it, and some formulas to get the indices of its neighbouring nodes and access them as well:

• parent: (index - 1) / 2 (rounded down)
• left: 2 * index + 1
• right: 2 * index + 2

### Key Features

• Essentially an array
• Root node is always the minimum (min heap) or maximum (max heap); the opposite extreme sits in one of the leaves, which is not necessarily the last node
• Nodes in between are only ordered relative to their parent, not across siblings or levels.
• Root node is usually the one being removed in application, replaced by the last node, and bubbled to the correct position accordingly
• A min heap always picks the smaller of a node's children to swap down; a max heap picks the larger

### Runtime

• Access: O(1) for the root, or for any element with a known index in the array
• Search: O(n)
• Insert: O(log n), since the O(1) append to the end of the array is followed by bubbling up
• Deletion (of the root): O(log n)

### Applications

• Priority queues (eg. for elderly and disabled then healthy adults using weighted representation)
• Hospital queues for coronavirus victims based on age and, therefore, savableness
• Schedulers (eg task with higher priority will have higher weightage and will be bubbled to the correct position when added to the queue)
• Continuous median problem

### Ruby Alternative

There is no native heap implementation in Ruby. Gems are available.
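
That said, an array-backed min heap is short enough to write by hand. The following is a sketch under my own naming (`MinHeap`, `push`, `pop`, `peek` are illustrative, not from any gem), using the parent and child index formulas listed earlier.

```ruby
# Array-backed min heap.
class MinHeap
  def initialize
    @items = []
  end

  # Append at the end, then bubble up: O(log n).
  def push(value)
    @items << value
    bubble_up(@items.length - 1)
    self
  end

  # Removes and returns the smallest value, or nil if the heap is empty.
  def pop
    return @items.pop if @items.length <= 1

    smallest = @items.first
    @items[0] = @items.pop # the last node replaces the root
    bubble_down(0)
    smallest
  end

  def peek
    @items.first
  end

  private

  def bubble_up(index)
    while index > 0
      parent = (index - 1) / 2
      break if @items[parent] <= @items[index]

      @items[parent], @items[index] = @items[index], @items[parent]
      index = parent
    end
  end

  def bubble_down(index)
    loop do
      left = 2 * index + 1
      right = 2 * index + 2
      smallest = index
      smallest = left if left < @items.length && @items[left] < @items[smallest]
      smallest = right if right < @items.length && @items[right] < @items[smallest]
      break if smallest == index

      @items[smallest], @items[index] = @items[index], @items[smallest]
      index = smallest
    end
  end
end

heap = MinHeap.new
[5, 3, 8, 1].each { |v| heap.push(v) }
heap.pop # => 1
heap.pop # => 3
```

Flipping the two comparisons in `bubble_up` and `bubble_down` would turn this into a max heap.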

## Hash Table

Interestingly, a hash table consists of a hashing function and an array of linked lists. Together, they form a key-value datastore.

The key undergoes a hashing function to get an integer. This integer is reduced to an index in the array, which decides where to store the value corresponding to that key. The key and value are then added to the linked list that sits at that index.

The data is saved in a linked list instead of directly as an element in the array because the hashing function can produce collisions. This allows multiple values to be stored at the same index of the array, but only if their keys are different. Otherwise, the new value overwrites the old one, as hash tables do not allow duplicate keys.

It is crucial for the hashing function to have a good key distribution. This is to prevent any of the linked lists from becoming overwhelmingly long, which results in long search times hopping through the list. Murmur hash is a good hashing function for this purpose.
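
To make the mechanics concrete, here is a toy version with separate chaining. It is only a sketch with assumptions of my own: plain ruby arrays stand in for the linked lists, ruby's built-in `Object#hash` stands in for the hashing function, and `ChainedHashTable` is a made-up name.

```ruby
# Toy hash table: an array of buckets, each bucket a chain of [key, value] pairs.
class ChainedHashTable
  def initialize(bucket_count = 8)
    @buckets = Array.new(bucket_count) { [] }
  end

  def []=(key, value)
    bucket = bucket_for(key)
    pair = bucket.find { |stored_key, _| stored_key.eql?(key) }
    if pair
      pair[1] = value        # same key: overwrite the old value
    else
      bucket << [key, value] # different key, same index: chain it
    end
  end

  def [](key)
    pair = bucket_for(key).find { |stored_key, _| stored_key.eql?(key) }
    pair && pair[1]
  end

  private

  # Reduce the integer from the hashing function to an index in the array.
  def bucket_for(key)
    @buckets[key.hash % @buckets.length]
  end
end

table = ChainedHashTable.new(1) # a single bucket forces every key to collide
table[:apple] = 1
table[:banana] = 2
table[:apple] = 3

table[:apple]  # => 3
table[:banana] # => 2
table[:cherry] # => nil
```

With a single bucket every lookup walks the same chain, which is the long-list situation a well-distributed hashing function is meant to avoid.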

### Key Features

• Hash function maps keys to index of array
• Array is made up of linked lists to store data while handling collisions from the hashing
• Hash function with good distribution crucial to performance
• No order

### Runtime

• Access: O(1)
• Search: O(1)
• Insert: O(1)
• Deletion: O(1)
• All of the above are averages; with many collisions in one linked list, they degrade to O(n)

### Applications

• Anything that does not require order

### Ruby Alternative

Murmur hash seems to be used in the native ruby hash. Note that it is easily reversible. Hence while it can be used for maintaining good key distribution, it is not ideal for cryptographic purposes.

## Linked List

Each node points to the next node. The last node points to `null`. Accessing elements can be slow as the pointer needs to jump through nodes, unlike an array, which can access any element instantly via its index. The advantage of a linked list over an array is that you do not need to allocate the required memory at the start. You only use the memory that you need, without wastage. It is very space efficient.

Another advantage is its speed when prepending elements or inserting them in the middle. Unlike the array, where every element thereafter has to be shifted, the insertion itself can be done in constant time in a linked list once we hold a reference to the node at that position.

A variation, the doubly linked list, gives each node a reference to its neighbours on both sides. Its biggest advantage is that it allows traversal in both directions. The extra work of maintaining that second reference in every operation may be costly.

The last variation is the circular linked list. On a related note, there's a classic linked list question on how to detect if a linked list has a cycle (not necessarily circular). The solution is to use a fast pointer and a slow pointer to loop through the linked list until they point to the same node in a linked list with a cycle, or until the fast pointer reaches null in a linked list without one. This simple cycle detection algorithm is known as Floyd’s tortoise and hare.

Side note for me: the distance the tortoise has travelled from the start of the linked list to the location where the hare and tortoise meet is, not coincidentally but mathematically, a whole multiple of the length of the loop. Again, not coincidentally but mathematically, the distance from the start of the linked list to the start of the loop equals the distance from the location where the hare and tortoise met to the start of the loop (continuing in the direction that the tortoise was originally moving in), give or take whole laps of the loop.
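
The tortoise and hare idea can be sketched in a few lines. `ListNode` and `cycle?` are names I made up for this illustration; the node is just a `Struct` holding a value and a reference to the next node.

```ruby
# Minimal singly linked list node for the sketch.
ListNode = Struct.new(:value, :next_node)

# Returns true if following next_node from head eventually loops back.
def cycle?(head)
  slow = head
  fast = head

  while fast && fast.next_node
    slow = slow.next_node           # tortoise: one step
    fast = fast.next_node.next_node # hare: two steps
    # Compare identity, not contents, to see if both sit on the same node.
    return true if slow.equal?(fast)
  end

  false # the hare fell off the end, so there is no cycle
end

a = ListNode.new(1)
b = ListNode.new(2)
c = ListNode.new(3)
a.next_node = b
b.next_node = c
cycle?(a)       # => false

c.next_node = b # 3 now points back to 2
cycle?(a)       # => true
```

It uses constant extra memory, unlike the alternative of remembering every visited node in a set.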

### Key Features

• Head node may be `null`
• Last node will point to `null`
• Doubly linked list is another variation, where each node points to its previous and next node

### Runtime

• Access: O(n)
• Search: O(n)
• Insert
  • append/prepend: O(1), with append assuming a reference to the last node is kept
  • ordered insert: O(n)
• Deletion: O(n)

### Applications

• Anything that requires order and needs to save on memory

### Ruby Alternative

There is no native linked list in ruby. However, there are gems, and this one by spectator is still pretty active.

## Queue

Queue is a collection of data that obeys the First In First Out (FIFO) principle.

Theoretically, as traversal is not supposed to happen in a queue, I believe that it is best implemented with a linked list rather than an array. There is no resizing overhead, and no need to shift all the elements every time an element is taken out of the front of the queue.

Adding to the queue might mean having to hop through the whole linked list to add the element at the back. However, I would solve this by using a circular linked list to keep a grip on both the first and last elements, which is actually all the queue cares about. Of course, things will be different if it is a not-so-simple queue like a least recently used (LRU) implementation.

However, there are certain advantages to implementing it with arrays that we should consider. Arrays can be cached more easily as they consist of memory units adjacent to one another. On the other hand, a linked list consists of memory units scattered across the memory pool, which hurts its caching capabilities. The reason is a TODO for me when I go beyond data structures during these revision weeks.

Nonetheless, cache engines like Redis have their own implementation of a linked list (Redis List) in their cache database. I do not know if this is the same caching mechanism that is affected by the sparse memory locations of a linked list, but it is probably good to know.

### Key Features

• FIFO

### Runtime

• Insert (enqueue at the back): O(1)
• Deletion (dequeue from the front, `shift` in ruby): O(1)

### Applications

• Restaurant queues

### Ruby Alternative

Arrays are usually used as queues in ruby. Ruby, however, does have a native Queue class, which is meant for multi-threaded operations. On top of that, it has a SizedQueue class to ensure the size stays within capacity.
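
A quick sketch of both options, with made-up sample values. The first half treats a plain array as a queue; the second half uses the core `Queue` class, where `pop` blocks the calling thread until something is available.

```ruby
# Array as a simple queue: push at the back, shift from the front.
line = []
line.push("alice")
line.push("bob")
first = line.shift # => "alice"

# Thread-safe queue: the worker waits on pop until a job arrives.
jobs = Queue.new
worker = Thread.new { jobs.pop.upcase }
jobs << "resize image"
result = worker.value # => "RESIZE IMAGE"
```

`Thread#value` waits for the worker thread to finish and returns the value of its block.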

## Stacks

Stacks are like the brother of queues. The only difference is they obey the Last In First Out (LIFO) principle.

Again, as traversals are not supposed to happen, we can use a linked list for the same advantages and considerations as explained in the “Queue” section above. And instead of appending to the end of the linked list, we will prepend to it, where the head of the linked list represents the top of the stack. It will be constantly changing and is where all the action will take place.

This ensures there is no resizing overhead, as there would be with an array when the data gets too big, but the need for caching and its performance have to be taken into consideration.

Arrays are better suited to implementing a stack than a queue. This is because all the pushes and pops take place at the end of the array, unlike the queue, which needs to remove elements from the front of the array and shift all the remaining elements forward.

Two stacks can be used to implement a queue with minimal performance overhead as well.

### Key Features

• LIFO

### Runtime

• Insert (push): O(1)
• Deletion (pop): O(1)

### Applications

• Matching balanced parentheses problems
• Anagram / palindrome problems
• Backtracking in maze
• Reversals

### Ruby Alternative

No native stack in Ruby. A simple array will suffice. Linked list gems are available too.
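
The queue built from two stacks mentioned earlier can be sketched like this. `TwoStackQueue`, `@inbox` and `@outbox` are names of my own; both stacks are plain arrays that are only ever touched with `push` and `pop`.

```ruby
# FIFO queue built from two LIFO stacks.
class TwoStackQueue
  def initialize
    @inbox = []  # receives new elements
    @outbox = [] # serves elements in reversed, that is arrival, order
  end

  def enqueue(value)
    @inbox.push(value)
    self
  end

  # Refill the outbox only when it is empty. Popping the inbox reverses the
  # order, so the oldest element ends up on top of the outbox.
  def dequeue
    if @outbox.empty?
      @outbox.push(@inbox.pop) until @inbox.empty?
    end
    @outbox.pop
  end
end

queue = TwoStackQueue.new
queue.enqueue(1).enqueue(2).enqueue(3)
queue.dequeue # => 1
queue.enqueue(4)
queue.dequeue # => 2
```

Each element is moved between the stacks at most once, which is why the overhead stays small on average.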

## Graph

A graph is a generalisation of a linked list. Unlike a linked list node, which is linked only to its next element (and its previous element in a doubly linked list), a node in a graph can have links to multiple nodes, not just to the adjacent ones.

The link between nodes of a graph can carry data to give more meaning to the relationship between them. In graph terminology, this link is called an “edge” (as a matter of fact, “nodes” are termed “vertices”). An edge can be directed or undirected. Think being friends (undirected) versus being a follower (directed) between users in a social network.

2 common ways to search a graph are Depth First Search (DFS) and Breadth First Search (BFS). DFS has a weakness where it will search the full depth along one edge before moving on to the next. This translates to inefficiency if the vertex that we are searching for is close to the start but along another edge. Hence BFS is often preferred in that situation.

Typically in BFS, a queue is used to store the next vertices to search.

There can be cycles of vertices having an edge to one another, hence during the search, it is imperative to check whether a vertex has already been visited to prevent going round in circles. The exception is a Directed Acyclic Graph (DAG), which has no directed cycles.
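
Here is a minimal BFS sketch that combines the queue and the visited check. It assumes the graph is stored as an adjacency list, a hash mapping each vertex to an array of its neighbours; that storage choice, the `bfs` name and the sample graph are all my own.

```ruby
require "set"

# Breadth-first search over an adjacency list.
# Returns the vertices in the order they were visited.
def bfs(graph, start)
  visited = Set[start]
  queue = [start]
  order = []

  until queue.empty?
    vertex = queue.shift
    order << vertex

    graph.fetch(vertex, []).each do |neighbour|
      next if visited.include?(neighbour) # seen before: do not loop in cycles

      visited << neighbour
      queue.push(neighbour)
    end
  end

  order
end

# Undirected sample graph containing the cycle a - b - d - c - a.
graph = {
  a: [:b, :c],
  b: [:a, :d],
  c: [:a, :d],
  d: [:b, :c, :e],
  e: [:d]
}

bfs(graph, :a) # => [:a, :b, :c, :d, :e]
```

Without the `visited` set, the cycle in the sample graph would keep the queue from ever emptying.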

### Key Features

• Nodes are termed vertices
• Edges contain data to describe the relation between vertices
• BFS is often preferred when the target is likely to be close to the starting vertex
• Flag visited vertices in a search algorithm to prevent looping inside a cyclic relationship among vertices.

### Runtime

Time complexity of a graph depends on how the edges and vertices are stored. The optimal choice of storage depends on any prior knowledge of what the graph might look like.

### Ruby Alternative

There is no native graph data type in ruby. Gems are available.

## Set

The data structure of a set is the same as that of a hash table. The difference is that a set is not really concerned with the mapped value of a key. It just tracks whether the key is present.

This implies that there can be no duplicates in the keys, just like a hash table. And unlike a hash table, where a key can be mapped to a null value, a set has no values to nullify: a key is either present or removed.

### Key Features

• No order
• No duplicates

### Runtime

• Access: O(1)
• Search: O(1)
• Insert: O(1)
• Deletion: O(1)

### Applications

• Attendance

### Ruby Alternative

Ruby has a native data type for a set.
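
A short sketch of the attendance use case with ruby's `Set` (the names are made-up sample data):

```ruby
require "set"

attendance = Set.new
attendance << "alice"
attendance << "bob"
attendance << "alice" # duplicate, ignored

attendance.include?("bob") # => true
attendance.size            # => 2

attendance.delete("bob")
attendance.include?("bob") # => false
```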

## Rails select helper with default selected option disabled and prompt

This documents, once and for all, how to render a select tag in a Rails view with a disabled option that is preselected.

## Motivation

I find myself having to refer to Stack Overflow often for the solution because it just isn’t intuitive. It requires some sort of hacking before it can be rendered the way I need it. Often, I would not understand why the code is written that way unless I can remember the purpose and see the end goal of what it is trying to achieve.

Code that is not self-explanatory is a sign of code smell. That is so not Rails.

## The Old Way Of Coding

``````erb
<%= tweet_form.select :user_id,
  options_for_select(
    ['Please select user'] + @users.map { |user| [user.name, user.id] },
    selected: tweet_form.object.user_id || 'Please select user',
    disabled: 'Please select user'
  ),
  {},
  { class: 'form-control' }
%>
``````

The way I used to code out the list of options is to create an array that contains yet another array as shown above. In the inner array, the first value will be the label that users will see when selecting the option from the list, while the second value is the actual value to be passed to the backend.

Here is the part that raises questions. In line 3, I prepend the array of users using the `+` operator with an array containing `Please select user`. This is meant to be the first value to be selected so that it can act as a prompt for the select tag.

I use the `options_for_select` helper to make it a little easier to set the `selected` and `disabled` options. The `selected` key will select the prepended array’s value as the default value for the select tag, and the `disabled` key will set it as disabled so that users cannot choose it and send an invalid value to the backend.

Hence, without knowing the end goal, the code from line 3 to 5 will raise eyebrows. There has to be a better way and indeed there is. But first, let me touch on the remaining lines for completeness' sake.

Line 7 is for other options for the Rails `select` helper, like `include_blank` which we are not using.

Line 8 is for additional HTML attributes.

## The New Way Of Coding

Rails 6 has the `prompt` option in the Rails `select` helper to achieve exactly this purpose. Let’s compare the new way of writing that snippet.

``````<%= tweet_form.select :user_id,
  @users.map { |user| [user.name, user.id] },
  {
    selected: tweet_form.object.user_id || "",
    disabled: "",
    prompt: 'Please select user'
  },
  { class: 'form-control' }
%>``````

We are no longer using the `options_for_select` helper as we do not need its `selected` and `disabled` options for any more unsettling hackery.

In its place, we use the `prompt` key in line 6, under the `options` argument of the Rails `select` helper, to achieve the same result.

The `disabled` key here makes the prompt an unselectable option. Leave it out if your UX allows selection of a `nil` value. You may need to use the `include_blank` key here as well.

The `selected` key sets the selected value conditionally to be the prompt’s unless the form object already has a `user_id` value. This part is a little quirky; the concept of selecting the value of the form object should already be implemented by default without having to write out the conditional code as in line 4. I believe this is constructed to fit all scenarios for all UX requirements, but I can’t really be sure. I’ll check it out someday.

Note that by using prompt, the prompt will not be present as one of the options if the form object already has a value.

## Conclusion

This is a lot neater and much more like the Rails we know. There is no more questionable code and every line and option has a clear purpose.

## Handling DOM Elements From link_to remote: true Callback

This is a documentation of how to handle the response from a `link_to` `remote: true` API call and manipulate the DOM with minimal Javascript code.

## Motivation

In the past, I used JavaScript to add a `click` listener on a button element in order to make a `jquery.ajax()` API call to my Rails server.

A typical use case would be to delete a row in a list. While I can make a resource delete request to the Rails server and it will reload the page with the new list the RESTful way, this UX flow does not work out in some cases.

Hence, I had to use the `jquery.ajax()` way to work things out. I did not use the `link_to` helper with `remote: true` because I thought there was no way for me to listen for the response and react.

The only way I could listen for the callback and handle the DOM element thereafter, without having to reload the page, was to use the `success` callback of `jquery.ajax()`, or so I thought.

## The Magic

So a better way is to use the `link_to` `remote: true` helper to render out the element without fuss the Rails way. And the key step is to add a Javascript listener on the element.

``````<!-- page.html.erb -->
<div class="parent-row" data-model-id="<%= @model.id %>">
  <%= link_to 'DELETE', model_path(@model), method: :delete, class: 'delete-model', data: { confirm: 'Are you sure?' }, remote: true %>
</div>``````
``````// page.js
$(document).on('ajax:success', '.delete-model', event => {
  // rails-ujs passes [response, status, xhr] in event.detail.
  // This assumes the controller responds with JSON containing model_id.
  const [response, status, xhr] = event.detail;
  $(`.parent-row[data-model-id="${response.model_id}"]`).remove();
});``````

The listener listens for an `ajax:success` event on the element that made the API call. When the event fires, it executes the code in its callback function. In this callback function, we receive the data passed from the backend, which we can use to remove the DOM element as required.

Note that you might not want to use `$(document).on()` in a turbolinks environment, as the listener will be added every time the page changes. A particular use case is documented here.

We can add an `ajax:error` listener as well to handle errors.

This is a no-hassle method of writing code in `ruby` (well, for the rendering of the element at least). The old way I used, the `jquery.ajax()` method, requires more tear-inducing JavaScript code to conjure. For a full stack Rails developer, that is not the most welcome.

On top of that, using `Rails` helper to render out the HTML element allows us to make use of the various `Rails` helpers to supercharge our development speed.

Url route helpers parse out the actual RESTful route to call with ease. Since it is dynamically interpolated, no code change is required should there be a route change.

We can also still take advantage of `rails-ujs`, which has some handy features commonly needed for development. In the example above, I added a `data-confirm` attribute. This will trigger rails-ujs to ask for confirmation before proceeding with the request, and gracefully abort the operation should the user cancel the confirmation.

This will require proper setup with the new Rails 6 version. Check out my article on how to properly set up Rails 6 with bootstrap and, of course, integrate the new `rails-ujs` in its brand new frontend paradigm running on `webpack`.

## Conclusion

Utilizing Rails helpers as much as possible brings out the strength of the `Rails` framework even more, which is rapid development. This method of listening to remote API calls and acting on the response allows exactly this.

## AWS Lambda and API Gateway Integration With Terraform To Collect Emails

This is a documentation on creating a service that collects emails. It runs on serverless technology utilizing AWS lambda and API Gateway. It is also made easy to deploy to the cloud with infrastructure as code via Terraform in the form of a plug-and-play methodology.

## Motivation

Often, I have to make static websites that are not completely static because they require a backend to collect emails. While 3rd party services like Mailchimp and SendGrid have their own SDKs to support easy integration for email collection, we have to worry about hitting the limits of their packages and plans. This translates to stress for developers, as we have to find a solution quickly and properly. If this happens on a Friday or a weekend, which somehow is always the case because more people are surfing the net then, the stress is amplified.

For a new website, it is very hard to gauge the traffic and thus the plan required for the 3rd party service. This poses difficulties when budgeting for the project. Underutilizing the service also translates to unnecessary cost. The best kind of plan for such a website is a pay-as-you-go model, in my opinion, and that can be achieved by integrating with cloud providers like AWS.

## Technology Stack

### AWS Lambda

Enter AWS Lambda, where you only pay for what you use. You do not need to fork out money at the start of your project. Instead, you just pay for how much you use, relieving you of the worry of wasting money on resources you are not using. In fact, cost only becomes an issue if you have more than 1 million potential users signing up with their email every month, because AWS Lambda offers 1 million free requests every month before it starts charging. This is highly unlikely for a new website, which means you now have a backend for your static website for free.
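To put rough numbers on this, here is a minimal Ruby sketch of the request charge only. The free allowance is the figure quoted above; the price per million requests is an assumption for illustration, so check the current AWS pricing page before relying on it. Compute-time (duration) charges are ignored entirely.

```ruby
FREE_REQUESTS_PER_MONTH = 1_000_000 # free allowance quoted above
PRICE_PER_MILLION_USD   = 0.20      # assumed price, verify against AWS pricing

# Estimated monthly request charge in USD, ignoring duration charges.
def monthly_request_cost(requests)
  billable = [requests - FREE_REQUESTS_PER_MONTH, 0].max
  (billable / 1_000_000.0) * PRICE_PER_MILLION_USD
end

puts monthly_request_cost(50_000)    # well within the free allowance
puts monthly_request_cost(3_000_000) # only the 2 million above it are billed
```

Under these assumptions, a site would need millions of signups a month before the request charge even reaches a dollar.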

### API Gateway

For the serverless function, that is AWS Lambda, to be reachable from the Internet via an API, we need the API Gateway. It exposes the serverless function to the World Wide Web through an HTTPS endpoint, which runs over the encrypted Transport Layer Security (TLS) protocol to uphold security by default. This allows your websites to use the serverless function via API calls.

### Terraform

To set up the infrastructure, the usual way is to navigate the AWS management console, deploy the required AWS resources and link them. This can be a challenge if you are not familiar with the required configurations. Not only will this cost precious time debugging these issues, time that developers could otherwise have spent with their loved ones, but it will also lead to frustration.

While frustration is part and parcel of life as a programmer, we can also avoid it with our knowledge of code. Here is where Terraform enters the fray. It is an Infrastructure as Code tool: you write the configuration of the infrastructure once and can deploy it multiple times without having to go through the whole forest of the AWS console each time. This means you do not need to remember every single step, and you do not need to deal with surprise bugs because you forgot one of them or, worse, made a spelling error.

Programming is like magic. You write very specific instructions in arcane languages to invoke commands, and if you get it even a little bit wrong you risk unleashing demons and destroying everything.

— Diana Carrier (@artemis_134) June 23, 2018

Since the blueprint infrastructure is in code, this means we can leverage version control features with git, and work together to improve the code base along the way without fear of not being able to rollback to the previous successful configuration.

## Terraform Files

I will start off with the terraform files required to setup the infrastructure to deploy the code. Let’s start off with the place to store our emails.

## The database – AWS DynamoDB

I will store the emails collected in AWS’s own noSQL database DynamoDB. This is a fast, simply structured and schemaless storage which fits my use case very aptly.

It allows fast, simultaneous writes, so there is no fear of race conditions from a spike in the volume of signups during a PR event promoting the product and getting people to leave their emails at the website.

Since it is schemaless, we can easily add new details of the users that you would like to collect on top of their emails along the way without having to migrate and fiddle with the structure of the database. With proper metaprogramming, you do not need to touch the backend code as well, leaving only the frontend to work on adding the new text fields for data collection.

For the sake of argument, we can also use a traditional relational database management system (RDBMS) for this project. It is queried with SQL, which is a language most, if not all, developers who have ever touched a database would know. There is no need to use fancy NoSQL for this simple project. In addition, the chances of leveraging the scaling advantage of NoSQL over SQL databases are low, because you will need a lot of traffic for that to become a worry. For a new website, that is highly unlikely to happen.

However, highly influenced by the cost, I am still sticking with DynamoDB in this case. To set up an AWS RDS instance to host a managed relational database, the cheapest MySQL database already goes for around 20 USD a month, as compared to the pay-as-you-go model that DynamoDB employs. On top of that, DynamoDB has a generous amount of free usage and storage under its free tier. Unlike its RDS counterpart, this free tier does not expire 12 months after your signup but lasts forever. We will probably NOT incur any cost using DynamoDB unless the marketing for your new website is brilliant.

``````resource "aws_dynamodb_table" "main" {
name = "${var.project_name}-dynamodb_table"
billing_mode = "PROVISIONED"
read_capacity = var.dynamodb-read_capacity
write_capacity = var.dynamodb-write_capacity
hash_key = "email"

attribute {
name = "email"
type = "S"
}
}``````

Provisioning the database is the simplest. I am using Terraform variables to set the number of read and write capacity units required, as well as the table name, for robustness’ sake.

I have set the billing mode to “provisioned” for simplicity’s sake. After all, I am not expecting any insane burst of traffic for a site that is not popular. Even if there is one, maybe due to some incredible promotion at some hugely popular event, I do not expect the load to require me to scale the read and write capacities of the database. Each signup is going to be a quick write of a few bytes.

On top of that, provisioned capacity means less configuration for the permissions needed to autoscale the capacities of the database. It can take some time to configure that, and since that is outside the topic of this article, I will stick to the “provisioned” billing mode.

The `hash_key`, or “partition key” in other definitions, is analogous to the primary key in a SQL database table. It requires specific details under the `attribute` property. You can specify the `range_key`, or “sort key” here if you require, and remember to add `attribute` to describe it as well.

Other attributes that are neither the partition key nor the sort key need not have an `attribute` property in this file. You can simply write them to the database and they will register. After all, this is a schemaless database.

On top of that, it is a fully managed database, so it comes with all the goodies like backup and version maintenance to spare developers from all these chores.
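To make the schemaless point concrete, here is a minimal Ruby sketch that turns an arbitrary hash of form fields into the attribute-value shape used by DynamoDB's low-level API (`S` for strings, `N` for numbers sent as strings, `BOOL` for booleans). The helper name `to_dynamodb_item` and the sample fields are hypothetical, and this is not the code used in this project.

```ruby
# Converts a flat hash of form fields into DynamoDB's attribute-value format.
# Blank values are skipped so that empty form fields are not stored.
def to_dynamodb_item(fields)
  fields.each_with_object({}) do |(name, value), item|
    next if value.nil? || value.to_s.empty?

    item[name.to_s] =
      case value
      when Numeric     then { N: value.to_s } # numbers travel as strings
      when true, false then { BOOL: value }
      else                  { S: value.to_s }
      end
  end
end

p to_dynamodb_item({ email: 'test@test.com', first_name: 'Ann', age: 30 })
```

Only `email` needs to be declared in the Terraform file because it is the hash key; any other field produced this way can be written without touching the table definition.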

## The backend – AWS Lambda

Next is the lambda function. It is written in JavaScript using Node.js. The file below is the configuration file that sets up the infrastructure required. Let’s dive into it.

``````resource "aws_lambda_function" "main" {
filename = var.zipfile_name
function_name = var.project_name
role = aws_iam_role.main.arn
handler = "index.handler"

source_code_hash = filebase64sha256(var.zipfile_name)

runtime = "nodejs12.x"
}

resource "aws_iam_role" "main" {
name = "${var.project_name}-iam_lambda"

assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}

resource "aws_iam_policy" "main" {
name = "main"
path = "/"
description = "IAM policy for lambda to write to dynamodb table and logging"

policy = templatefile("${path.module}/lambda_policy.tmpl", { dynamodb_arn = aws_dynamodb_table.main.arn })
}

resource "aws_iam_role_policy_attachment" "main" {
role = aws_iam_role.main.name
policy_arn = aws_iam_policy.main.arn
}

resource "aws_lambda_permission" "main" {
statement_id = "AllowExecutionFromAPIGateway"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.main.function_name
principal = "apigateway.amazonaws.com"

source_arn = "${aws_api_gateway_rest_api.main.execution_arn}/*/*/*"
}``````

Terraform uses the base64-encoded SHA-256 hash of the zip file (`source_code_hash`) to detect changes in the backend code and trigger an upload. The code needs to be compressed into a zip file before this step. We will see how we can automate this process later.

This lambda function will need permission to write to the DynamoDB table. This is done using

• `aws_iam_role` to let the Lambda service assume the role, establishing trust between the 2 AWS services
• `aws_iam_policy` to give the lambda function permission to access the database resource and perform the PutItem action. Details of the policy are interpolated via a template file, which we will go through later
• `aws_iam_role_policy_attachment` to attach the `aws_iam_policy` to the `aws_iam_role` used by the lambda function
• `aws_lambda_permission` to allow `API Gateway` to integrate with the lambda function and invoke it

The template file for the `aws_iam_policy` is shown below. It lists the actions that the lambda function is permitted to perform on the specified DynamoDB table. It also contains the permissions for the lambda function to push its logs to `AWS Cloudwatch`. By the way, these logging permissions are the default permissions for a lambda function, and this template adds the DynamoDB permissions to them. Note the `dynamodb_arn` variable that is interpolated, which justifies the use of the template file instead of hardcoding the whole policy in the main terraform file, for robustness’ sake.

``````{
"Version": "2012-10-17",
"Statement": [
{
"Action": "dynamodb:PutItem",
"Resource": "${dynamodb_arn}",
"Effect": "Allow"
},
{
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:*:*",
"Effect": "Allow"
}
]
}``````

## The API layer – AWS API Gateway

The API Gateway is required to expose the lambda function to be consumed by servers and websites via a URL endpoint. The endpoint is served over HTTPS, which requires some extra configuration as documented below.

``````resource "aws_api_gateway_rest_api" "main" {
name = var.project_name
}

resource "aws_api_gateway_resource" "main" {
rest_api_id = aws_api_gateway_rest_api.main.id
parent_id = aws_api_gateway_rest_api.main.root_resource_id
path_part = "email"
}

resource "aws_api_gateway_integration" "main" {
rest_api_id = aws_api_gateway_rest_api.main.id
resource_id = aws_api_gateway_resource.main.id
http_method = aws_api_gateway_method.main.http_method
integration_http_method = aws_api_gateway_method.main.http_method
type = "AWS_PROXY"
uri = aws_lambda_function.main.invoke_arn
}

resource "aws_api_gateway_integration_response" "main" {
depends_on = [aws_api_gateway_integration.main]

rest_api_id = aws_api_gateway_rest_api.main.id
resource_id = aws_api_gateway_resource.main.id
http_method = aws_api_gateway_method.main.http_method
status_code = aws_api_gateway_method_response.main.status_code
}

resource "aws_api_gateway_method" "main" {
rest_api_id = aws_api_gateway_rest_api.main.id
resource_id = aws_api_gateway_resource.main.id
http_method = "POST"
authorization = "NONE"
}

resource "aws_api_gateway_deployment" "main" {
depends_on = [
"aws_api_gateway_integration_response.main",
"aws_api_gateway_method_response.main",
]
rest_api_id = aws_api_gateway_rest_api.main.id
}

resource "aws_api_gateway_method_settings" "main" {
rest_api_id = aws_api_gateway_rest_api.main.id
stage_name = aws_api_gateway_stage.main.stage_name

# settings not working when specifying the single method
# refer to: https://github.com/hashicorp/terraform/issues/15119
method_path = "*/*"

settings {
throttling_rate_limit = 5
throttling_burst_limit = 10
}
}

resource "aws_api_gateway_stage" "main" {
stage_name = var.stage
rest_api_id = aws_api_gateway_rest_api.main.id
deployment_id = aws_api_gateway_deployment.main.id
}

resource "aws_api_gateway_method_response" "main" {
rest_api_id = aws_api_gateway_rest_api.main.id
resource_id = aws_api_gateway_resource.main.id
http_method = aws_api_gateway_method.main.http_method
status_code = "200"
}

output "endpoint" {
value = "${aws_api_gateway_stage.main.invoke_url}${aws_api_gateway_resource.main.path}"
}``````

So let’s break it down.

The `aws_api_gateway_rest_api` represents the project in its entirety.

The `aws_api_gateway_resource` refers to each api route of this project, and there is only 1 in this case.

I have set up only 1 stage environment, `aws_api_gateway_stage`, for this project using a Terraform variable. You can set up different stages to differentiate the staging and production environments.

The `aws_api_gateway_stage` is associated with an `aws_api_gateway_method_settings` that sets the throttling rate of the API to prevent spam and overloading. For the `method_path` property, the wildcard route is used to apply to all routes instead of only the API route that was created. It makes no difference in this case, but the reason for picking this “easy” route is simply a bug. If I were to specify the exact route, which is in the form of `{resource_path}/{http_method}`, the throttling settings would not propagate. It was documented here on github but was not properly resolved. Leaving it here for now.
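To get a feel for how a rate limit and a burst limit interact, here is a minimal Ruby sketch of a token bucket, the model AWS uses to describe API Gateway throttling. It is only an illustration and not how API Gateway is implemented: the `TokenBucket` class and its injectable `clock` are hypothetical, and the numbers mirror the `throttling_rate_limit = 5` and `throttling_burst_limit = 10` settings above.

```ruby
# A token bucket: holds up to `burst` tokens and refills at `rate` tokens per second.
# Each request spends one token; requests are rejected when the bucket is empty.
class TokenBucket
  def initialize(rate:, burst:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @rate = rate.to_f
    @burst = burst.to_f
    @clock = clock
    @tokens = @burst
    @last = @clock.call
  end

  def allow?
    refill
    return false if @tokens < 1

    @tokens -= 1
    true
  end

  private

  # Adds the tokens earned since the last call, capped at the burst size.
  def refill
    now = @clock.call
    @tokens = [@burst, @tokens + (now - @last) * @rate].min
    @last = now
  end
end

# A fake clock makes the behaviour easy to follow without sleeping.
fake_time = 0.0
bucket = TokenBucket.new(rate: 5, burst: 10, clock: -> { fake_time })

p Array.new(12) { bucket.allow? } # 12 requests arriving at the same instant
fake_time = 1.0                   # one second later
p Array.new(6) { bucket.allow? }
```

In this sketch, a burst of 12 simultaneous requests lets the first 10 through and rejects the rest; one second later, 5 more tokens are available.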

The `aws_api_gateway_deployment` configures the deployment of the API. Note the `depends_on` attribute that was assigned. This explicit dependency is critical to ensure the deployment is called into effect after all the necessary resources have been provisioned.

The `aws_api_gateway_integration` configuration sets the integration to `lambda proxy` using POST HTTP method without any authorization, as specified by the `aws_api_gateway_method` configuration. Lambda proxy allows us to handle the request from the server like how we would in a typical web application backend framework. The full request object is passed to lambda function and the API Gateway plays no part in mapping any of the request parameters. The API Gateway mapping has great potential to integrate interfaces properly, but for our use case, it is not necessary. I find this article doing a great job in explaining the API Gateway features with easy to consume information and summary, like a gameshark guide book written by the half-blood prince. Do take a look to understand AWS API Gateway better.

The `aws_api_gateway_integration_response` is responsible for handling the response from the lambda function. This is where we can make changes to the headers returned from the lambda function using the `response_parameters` property, which is not used in this case. This is also the place to map and transform the response data from the backend to fit the desired data structure using the `response_templates` property.

The `aws_api_gateway_method_response` is where we can filter what response headers and data from `aws_api_gateway_integration_response` to pass on to the caller.

The transform and mapping of the headers and data from the backend (ie the lambda function) in `aws_api_gateway_integration_response` and the filter of headers and data before passing to the front end in `aws_api_gateway_method_response` is not needed in this sample application. It is just good knowledge to have. There are 2 reasons why we do not need them here.

First, in a bit, we will go through the front end, which makes an API call that is a simple request. A simple request does not require a preflight request, which is an API call made by browsers prior to the actual API call, because it is deemed safe: it uses only standard CORS-safelisted request headers. In the event that one does need a preflight request because one is not making a simple request, we will need to set up another API route that will transform the headers returned from the backend and allow the relevant headers to be passed on to the front end for this preflight request. This will allow the frontend website to satisfy the CORS policy enforced by default in modern browsers. This will mean we need to configure a new set of `aws_api_gateway_rest_api`, `aws_api_gateway_integration`, `aws_api_gateway_method`, `aws_api_gateway_integration_response`, `aws_api_gateway_method_response` just for this preflight request. Things can get complicated here, so I will leave it out of this article. If you still need to implement CORS, [this gist](https://gist.github.com/keeth/6bf8b67c82f9a085e03ecbb289a859d6) is a good reference.
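As a rough aid for telling the two cases apart, here is a simplified Ruby sketch of the "simple request" check. It is an approximation of the browser rules, not the full CORS specification: it only looks at the HTTP method, the header names and the `Content-Type` value, and the `simple_request?` helper is hypothetical.

```ruby
SIMPLE_METHODS = %w[GET HEAD POST].freeze
SAFELISTED_HEADERS = %w[accept accept-language content-language content-type].freeze
SIMPLE_CONTENT_TYPES = %w[
  application/x-www-form-urlencoded
  multipart/form-data
  text/plain
].freeze

# Returns true when a request would NOT trigger a preflight under these simplified rules.
def simple_request?(http_method, headers = {})
  return false unless SIMPLE_METHODS.include?(http_method.to_s.upcase)

  headers.all? do |name, value|
    key = name.to_s.downcase
    next false unless SAFELISTED_HEADERS.include?(key)
    next true unless key == 'content-type'

    # Ignore parameters such as "; charset=UTF-8" when comparing media types.
    media_type = value.to_s.split(';').first.to_s.strip.downcase
    SIMPLE_CONTENT_TYPES.include?(media_type)
  end
end

p simple_request?('POST', { 'Content-Type' => 'application/x-www-form-urlencoded; charset=UTF-8' })
p simple_request?('POST', { 'Content-Type' => 'application/json' })
```

This is why the sample front end later sends the JSON string with the default form content type: switching the `Content-Type` to `application/json` would turn it into a request that needs a preflight.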

Second, we are using lambda proxy integration, so the full response from the lambda will be passed to the front end and mapped automatically, provided the response from the lambda code is properly formatted. Refer to this documentation for more details on it.

At last, the `output` block will print the value of the endpoint of the API for us to integrate in our frontend.

The file below contains the details that we will need to set up Terraform and the variables we are using. The `provider`’s `region` attribute here is hardcoded, which should ideally not be the case. I have yet to figure out how to make this dynamic and robust. The names with the `todo-` prefix should be changed to fit the project.

We are using an S3 bucket as the Terraform backend to hold the state of the infrastructure provisioned by Terraform. Creation of the bucket will be automated via a script that we will go through during the section on deployment.

``````provider "aws" {
version = "~> 2.24"
region = "eu-west-1"
}

terraform {
required_version = "~> 0.12.0"
backend "s3" {
bucket = "todo-project-tfstate"
key = "terraform.tfstate"
region = "eu-west-1"
}
}

variable "project_name" {
type = string
default = "todo-project"
}

variable "region" {
type = string
default = "eu-west-1"
}

variable "stage" {
type = string
default = "todo-stage"
}

variable "zipfile_name" {
type = string
default = "todo-project.zip"
}

variable "dynamodb-read_capacity" {
type = number
default = 1
}

variable "dynamodb-write_capacity" {
type = number
default = 1
}``````

## The Application

Here is the application code, written in Node.js. It is a simple write to DynamoDB with basic error handling. It takes in only 1 parameter, the email. This code can definitely be improved by allowing more parameters to be written to the database in a dynamic way, so that the same code base can be used for a site that collects the first and last name of the user, as well as another site that collects the date of birth of the user. I will leave that as a future personal quest.

``````// Load the AWS SDK for Node.js
const AWS = require('aws-sdk');

// Set the region
AWS.config.update({region: 'eu-west-1'});

// Create the DynamoDB service object
const ddb = new AWS.DynamoDB({apiVersion: '2012-08-10'});

exports.handler = async (event) => {
  console.log(JSON.stringify(event, null, 2));
  const params = {
    TableName: 'todo-project-dynamodb_table',
    Item: {
      'email' : {S: JSON.parse(event.body).email}
    }
  };

  try {
    // Call DynamoDB once to add the item to the table
    const result = await ddb.putItem(params).promise();
    console.log("Result", result);
    const response = {
      statusCode: 204,
      headers: {
        "Access-Control-Allow-Origin" : "*",
      },
    };
    return response;
  } catch(err) {
    console.log(err);
    const response = {
      statusCode: 500,
      headers: {
        "Access-Control-Allow-Origin" : "*",
      },
      body: JSON.stringify({ error: err.message }),
    };
    return response;
  }
};``````

A thing to note here is the need to return the `Access-Control-Allow-Origin` header in the response. The response also has to follow a particular, but straightforward and common, format for the lambda proxy integration with API Gateway to work. With that format, the response is mapped properly to the API Gateway method response and returned to the frontend websites, satisfying the CORS policy enforced by modern browsers.
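The response format mentioned above boils down to a status code, a headers map and a string body. Here is a minimal Ruby sketch of a helper that builds it; Ruby is used purely for illustration (the actual function is the Node.js code above), and the `proxy_response` name and its `allowed_origin` parameter are hypothetical.

```ruby
require 'json'

# Builds a response in the shape the lambda proxy integration expects:
# a numeric statusCode, a headers hash, and an optional body that must be a string.
def proxy_response(status_code, body = nil, allowed_origin: '*')
  response = {
    statusCode: status_code,
    headers: { 'Access-Control-Allow-Origin' => allowed_origin }
  }
  response[:body] = JSON.generate(body) unless body.nil?
  response
end

puts JSON.pretty_generate(proxy_response(204))
puts JSON.pretty_generate(proxy_response(500, { error: 'something went wrong' }))
```

Keeping the CORS header in one helper like this avoids forgetting it in either the success or the error branch.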

## Deployment

I will be using 3 ruby scripts for deployment related tasks, namely `init.rb`, `apply.rb` and `destroy.rb`, and a helper service object, `get_aws_profile.rb` for the deployment process.

Let’s take a look at them.

## get_aws_profile.rb

``````# get_aws_profile.rb

class GetAwsProfile
def self.call
aws_profile = "todo-aws_profile"

begin
aws_access_key_id = `aws --profile #{aws_profile} configure get aws_access_key_id`.chomp
abort('') if aws_access_key_id.empty?

aws_secret_access_key = `aws --profile #{aws_profile} configure get aws_secret_access_key`.chomp
abort('') if aws_secret_access_key.empty?
rescue Errno::ENOENT => e
abort("Make sure you have aws cli installed. Refer to https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html for more information.")
end

p "AWS_ACCESS_KEY_ID = #{aws_access_key_id}"
p "AWS_SECRET_ACCESS_KEY = #{aws_secret_access_key}"

[aws_profile, aws_access_key_id, aws_secret_access_key]
end
end``````

This is a helper method that will get the `aws_access_key_id` and the `aws_secret_access_key` for usage in the scripts. Note that it uses the aws cli command to attain the keys. Hence, it has to be installed on your local machine prior to running. It also assumes you are using named profile to hold your credentials.

I don’t really like this setup since it requires these prerequisites. But well that can be solved again in the future.

## init.rb

The first script to run is `init.rb`.

The `init.rb` will create the S3 bucket to be used as the terraform backend. Line 20 checks for the presence of this bucket and throws an exception if the bucket does not exist. The rescue block, if triggered, will create the non-existent bucket.

The initialization process on terraform is run via its docker image.

``````require 'bundler/inline'

gemfile do
source 'https://rubygems.org'
gem 'pry'
gem 'aws-sdk-s3', '~> 1'
end

require './get_aws_profile.rb'

aws_profile, aws_access_key_id, aws_secret_access_key = GetAwsProfile.call

s3_client = Aws::S3::Client.new(
access_key_id: aws_access_key_id,
secret_access_key: aws_secret_access_key,
region: 'eu-west-1'
)

begin
s3_client.head_bucket({
bucket: 'todo-project-tfstate',
use_accelerate_endpoint: false
})
rescue StandardError
s3_client.create_bucket(
bucket: 'todo-project-tfstate',
create_bucket_configuration: {
location_constraint: 'eu-west-1'
}
)
end

response = `docker run \
--rm \
--env AWS_ACCESS_KEY_ID=#{aws_access_key_id} \
--env AWS_SECRET_ACCESS_KEY=#{aws_secret_access_key} \
-v #{Dir.pwd}:/workspace \
-w /workspace \
-it \
hashicorp/terraform:0.12.12 \
init`

puts response``````

## apply.rb

Once initialized, the next script to run is `apply.rb`.

Prior to applying the Terraform infrastructure, the backend code is packaged into a zip file. After the apply, the zip file is deleted for housekeeping.

``````require 'bundler/inline'

gemfile do
source 'https://rubygems.org'
gem 'pry'
gem 'rubyzip', '>= 1.0.0'
end

require './get_aws_profile.rb'
require 'zip'

aws_profile, aws_access_key_id, aws_secret_access_key = GetAwsProfile.call

folder = Dir.pwd
input_filenames = ['index.js']
zipfile_name = File.join(Dir.pwd, 'todo-project.zip')

File.delete(zipfile_name) if File.exist?(zipfile_name)

Zip::File.open(zipfile_name, Zip::File::CREATE) do |zipfile|
input_filenames.each do |filename|
zipfile.add(filename, File.join(folder, filename))
end
end

response = `docker run \
--rm \
--env AWS_ACCESS_KEY_ID=#{aws_access_key_id} \
--env AWS_SECRET_ACCESS_KEY=#{aws_secret_access_key} \
-v #{Dir.pwd}:/workspace \
-w /workspace \
-it \
hashicorp/terraform:0.12.12 \
apply -auto-approve`

puts response

File.delete(zipfile_name) if File.exist?(zipfile_name)``````

With this, the api is now deployed and can be called from any website. We will go through a sample front end integration in a bit.

## destroy.rb

Once you are done with the project or are in the process of debugging, the destroy script will remove all the resources deployed. It will also remove the S3 backend that was created outside of Terraform.

``````require 'bundler/inline'

gemfile do
source 'https://rubygems.org'
gem 'pry'
gem 'aws-sdk-s3', '~> 1'
end

require './get_aws_profile.rb'

aws_profile, aws_access_key_id, aws_secret_access_key = GetAwsProfile.call

response = `docker run \
--rm \
--env AWS_ACCESS_KEY_ID=#{aws_access_key_id} \
--env AWS_SECRET_ACCESS_KEY=#{aws_secret_access_key} \
-v #{Dir.pwd}:/workspace \
-w /workspace \
-it \
hashicorp/terraform:0.12.12 \
destroy -auto-approve`

puts response

s3_client = Aws::S3::Client.new(
access_key_id: aws_access_key_id,
secret_access_key: aws_secret_access_key,
region: 'eu-west-1'
)

begin
s3_client.head_bucket({
bucket: 'todo-project-tfstate',
use_accelerate_endpoint: false
})

s3_client.delete_object({
bucket:  'todo-project-tfstate',
key: 'terraform.tfstate',
})
s3_client.delete_bucket(bucket: 'todo-project-tfstate')
rescue StandardError
puts "todo-project-tfstate S3 bucket already destroyed."
end``````

## Sample Frontend Integration

``````<!DOCTYPE html>
<html>
<script
src="https://code.jquery.com/jquery-3.4.1.min.js"
integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo="
crossorigin="anonymous"></script>
<body>

<h2>HTML Forms</h2>

<form id="form">
<label for="email">Email:</label><br>
<input type="text" id="email" name="email" value="test@test.com"><br>
<input type="submit" value="Submit">
</form>

<script type="text/javascript">
$( "#form" ).submit(function(event) {
event.preventDefault();

$.ajax({
type: "POST",
url: "https://todo-endpoint.execute-api.eu-west-1.amazonaws.com/todo-stage/email",
data: JSON.stringify({
email: $('#email').val()
}),
success: function(data, textStatus, jqXHR) {
debugger
},
error: function(jqXHR, textStatus, errorThrown) {
debugger
}
});
});
</script>

</body>
</html>``````

Above is a simple HTML web page that has the email prefilled for demonstration purposes. The form submits via `jquery.ajax()` using default settings so as not to trigger the need for a preflight request.

You will see that the email is added to the DynamoDB table, and the logs of the lambda function are recorded in AWS CloudWatch.

## Conclusion

This exercise helped me understand how lambda is integrated with API Gateway, as well as the immense potential of the latter as a robust middleware. In addition, I got to understand preflight requests and CORS better, as well as the `jquery.ajax()` function.

The project is saved in this repository for future reference.

## How To Restrict File Search In Sublime Based On Project

This documents how to restrict text search to specific directories for each project you are working on in Sublime.

You may often find it annoying that a simple text search looks in folders that are not part of your source code. While you can easily flip the switch in the user settings of the Sublime Text editor, it might not be ideal if you wear multiple hats like me and work on different frameworks.

One framework’s trash may be another’s treasure. Folders that are considered junk in one framework might be important in another. And if we happen to work on these frameworks simultaneously, we would have to constantly flip the switch on and off as we jump between projects.

If you are using sublime text because the framework you are working on is simple enough to manage and you do not want the computation-heavy indexing and compilation process to be running in the background constantly, this is an article for you to boost your productivity.

## The User Setting Way

The commonly documented way of configuring your Sublime editor is to tweak the configuration options in the user settings. Using restricted file search as an example, we can simply press `CMD + ,` (assuming you are on a Mac) to call up the user settings file in the Sublime editor.

Next add this setting under the `Preferences.sublime-settings` file as shown:

``````{
...
"binary_file_patterns": [
"node_modules/*",
"public/packs/",
"public/assets/",
"public/packs-test/",
"tmp/*",
"*.jpg",
"*.jpeg",
"*.png",
"*.gif",
"*.ttf",
"*.tga",
"*.dds",
"*.ico",
"*.eot",
"*.pdf",
"*.swf",
"*.jar",
"*.zip"
]
...
}``````

The `binary_file_patterns` option instructs Sublime to treat these files as binary files. Binary files are not human-readable, hence they are not considered by the Goto Anything or Find in Files functions by default.

Lines 4 to 8 are folders that we want excluded from our search process.

The rest refer to specific files based on their extensions.

This setup is what I typically use for my Rails projects.

## The Project-specific Way

The better alternative is to apply the setting at the project level so that the settings do not overlap between projects.

Save the project as a Sublime project via Project > Save Project As…

If moving the mouse is not your thing, you can simply create a file in the root directory with the `.sublime-project` extension.

Next, add the same `binary_file_patterns` setting as shown previously under the `folders` key as shown:

``````{
"folders":
[
{
"path": ".",
"binary_file_patterns": [
...
]
}
],
}``````

Line 5, the `path` key is required. This is added by default when we save our project as a sublime project. More information on its function and purpose can be found here.
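If you prefer to create the project file without touching the mouse, a small script can write it for you. The following is only a sketch: `sublime_project_json` is a hypothetical helper, the pattern list is a shortened example, and the file name `my_app.sublime-project` is an assumption.

```ruby
require 'json'

# Hypothetical helper: builds the contents of a .sublime-project file
# with the patterns to exclude from search.
def sublime_project_json(ignored_patterns, path: '.')
  project = {
    'folders' => [
      { 'path' => path, 'binary_file_patterns' => ignored_patterns }
    ]
  }
  JSON.pretty_generate(project)
end

patterns = ['node_modules/*', 'tmp/*', '*.png', '*.zip']

# Uncomment to write the file into the project root (file name is an assumption).
# File.write('my_app.sublime-project', sublime_project_json(patterns))
puts sublime_project_json(patterns)
```

Run it from the root directory of the project so that the `"."` path points at the right folder.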

Next, close sublime. This time, open our project via the sublime project file that you have created.

Make a search now and you will realise that you are no longer searching in the folders that you have no interest in. The search is also much faster than before because the program goes through fewer files and ignores the bulkier ones.

Hallelujah!

## Housekeeping

Once we have created the `.sublime-project` file, another file with the extension `.sublime-workspace` will also be created. The latter contains user-specific data, and you will not want to share it with other developers who may be working on the same source code as you. Add this file to the `.gitignore` file to achieve this.

## Setup Bootstrap In Rails 6 With Webpacker For Development And Production

This documents how to set up `Bootstrap 4` in `Rails 6` using `Webpacker`. As the framework shifts away from `sprockets` and the asset pipeline to embrace `webpack`, the dominant way of handling frontend affairs in the `Javascript` world, we have to adapt along.
The way to set up a css framework to bootstrap your application has undergone a revamp, and this article seeks to cover the essential steps to set it up.

## Pre-requisites

This article assumes you have set up all the tools required for a typical Rails 6 application.

The main extra tool you will need as compared to previous versions of Rails is the `yarn` package manager. You can install `yarn` on your computer via various ways based on your preference and your OS.

## Setting Up Bootstrap

With the shift in paradigm of handling front end assets, we no longer install front end libraries using gems. In the past, these gems were merely wrappers around the Javascript libraries and files, which presented a number of problems.

First, the latest changes in the `Javascript` world will take some time to propagate into the Rails realm.

Second, having an intermediate wrapper increases the potential points of failure during the wrapping process.

Third, we are really dependent on the angels who are working on these wrappers. If they do not update the gems frequently, we are stuck with the old features. This can be frustrating if you are waiting for a certain bug fix or a new feature that is already available in the latest release.

To install `bootstrap`, run this command.

``yarn add bootstrap jquery popper.js``

This command will automatically install the latest `bootstrap` package in the yarn registry and add its dependency entry and version to your `package.json` file. `jQuery` and `popper.js` are libraries that `bootstrap` depends on, especially for its `Javascript` components.

## The JS And CSS Files

The main `Javascript` file, `application.js` should now reside in the `app/javascript/packs` folder. This is because `Webpacker` will now look for all the javascript files in this directory to compile. This is the default setting for `Webpacker`.

Of course, you can go ahead and change the configuration to your liking. However, keep in mind that `Rails` promotes convention over configuration. This implies that, as much as possible, methodologies and practices should follow a certain default unless a change is absolutely necessary. This has multiple advantages. My favorite one is the portability of code among fellow `Rails` developers. Developers can easily understand the flow of logic and where to find bugs because things are where they are expected to be. This cuts down the development time and cost greatly.

The `application.js` file should look like this:

``````require("@rails/ujs").start()
require("@rails/activestorage").start()
require("channels")
require("bootstrap")

// stylesheets
require("../stylesheets/main.scss")``````

The first three `require` lines are the defaults already present in the file.

The `require("bootstrap")` line adds the `Bootstrap` `Javascript` library.

The last line adds your custom stylesheet. Now, this file can be placed anywhere. In the above example, the path is relative to where the `application.js` file is. Hence, the file is placed in `app/javascript/stylesheets/main.scss` in this example.

Next, we import the `Bootstrap` stylesheet files in the main stylesheet file.

``@import "bootstrap/scss/bootstrap";``

Note that we are importing files from the `node_modules` folder, and not a `bootstrap` folder placed in the relative path of the current directory of the main stylesheet file.

Also, you do not need the `~` in front of the path to signify that it is from the `node_modules` folder, like you would usually do for other non-Rails projects using `webpack`. The tilde prefix is a convention of the `webpack` stylesheet loaders that resolves to the `node_modules` folder. While it will still work here, it is not required, as the `node_modules` folder is already configured as one of the search paths that `webpack` looks through when resolving modules.

Now, you may be wondering how the `Bootstrap` libraries will work without importing any of their dependencies, namely `popper.js` and `jQuery`. We will come to that in a minute. Before that, let’s look at the views.

## The Views

Now, we will need to add the javascript and stylesheets files into the page. Following convention in this example, we will add to the `application.html.erb` layout so that the `Bootstrap` framework can be accessed in all pages. These lines of code are added in the `head` section of the layout template.

``````<%= stylesheet_pack_tag 'application' %>
<%= javascript_pack_tag 'application' %>``````

There are a number of things that are different from the old implementation.

Line 1 adds the compiled stylesheets path that `webpacker` will compile. Note that this only happens if the `extract_css` option is set to `true` in the `webpacker.yml` file. More about this later.

As you can see, there is no more `stylesheet_link_tag`. In the past, this helper method would get files from the `public/assets` folder, into which the asset pipeline compiled stylesheets and javascript files with other added pre- and post-processing. Now, everything is done by `Webpack`.

Here is what’s happening.

`Webpack` will look at `application.js` and find the stylesheet files that are included in it. Then, using a combination of `Webpack` loaders, `Webpack` will know how to compile and translate the `scss` syntax, the url paths of assets used etc. into a `css` file that the browser can read and implement its styling.

These `Webpack` loaders are already included by the `Webpacker` and its configurations set up. However, there are many loaders out there that are not included by default. They tend to be less used conventionally and will require manual intervention from your side.

One example is using ruby code inside your `javascript` files. This requires the `rails-erb-loader` that will “teach” `Webpack` to understand the `erb` syntax. The implementation involves a number of steps, one of which is to append this loader to the `Webpack` `environment.js` configuration file. Thankfully, for this case, the community has deemed it a pretty common use case that there is, at least, a rake task that comes together with the `Webpacker` gem to set this up easily.

The compilation process mentioned above, however, is not applied in the development environment by default. This is due to the `extract_css` setting in the `webpacker.yml` file. More about this and its implications in a bit.

Note that `stylesheet_link_tag` still works for assets you place in the `app/assets` folder. However, as `Rails` moves away from the old `Sprockets` and asset pipeline convention, this is expected to become deprecated in the future.

## The Webpacker Configuration File

Lastly, we need to add the dependencies of `bootstrap`. This takes place in the `config/webpack/environment.js` file.

``````const { environment } = require('@rails/webpacker')
const webpack = require('webpack')

environment.plugins.append('Provide', new webpack.ProvidePlugin({
$: 'jquery',
jQuery: 'jquery',
Popper: ['popper.js', 'default']
}))

module.exports = environment``````

As you can see, we are utilising the `ProvidePlugin` plugin of `Webpack` to make the dependency libraries available in all the javascript packs instead of having to import them everywhere.

This is just an example of how we can import files with `Webpack` in `Rails`. And in this case, especially for `jQuery`, it makes a lot of sense as there is a high chance that we will be using it in other javascript files.

Coincidentally, this is how `jQuery` and `popper.js`, which are dependencies of the `bootstrap` library, are made available for the `bootstrap` library to use.

## The extract_css Option

There is one last point I would like to touch on. That is the `extract_css` option in the `config/webpacker.yml` file.

When set to `true`, `webpack` will compile the stylesheet files that were imported into the javascript files into external standalone stylesheets. These compiled files will then be added into the views via the `stylesheet_pack_tag` helper method as mentioned earlier.

In comparison, when set to `false`, the stylesheets are not compiled into standalone files. Instead, they are injected into the view as a blob at runtime by the relevant javascript file. This takes place only after the javascript file has been completely downloaded by the browser.

In development mode, the conventional setting for the `extract_css` option is `false`, and this has quite a significant implication on how the website will behave.

One, there might be a flash of unstyled content (FOUC) when the page loads because the `javascript` files are loaded asynchronously. This is unlike `css` files, which are blocking resources that pause the rendering of the website until the file has been downloaded. With asynchronous loading, the website continues rendering while the `javascript` file is still downloading; only after the download completes is the `css` blob computed and inserted into the html source code. If the web page renders before this occurs, its styles are not yet present, and FOUC will thus occur.

Two, the `stylesheet_pack_tag` is not needed in the development environment using the default setting. Things will seem to work fine only until it is pushed into the production environment where the `extract_css` option is set to `true`, desirably and by default.

So make sure to add the `stylesheet_pack_tag` helper, but only if your javascript is going to compile a stylesheet and your page is reliant on it. If not, you are in for a surprise when it gets pushed to production.
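To see at a glance how the option differs between environments, a short script can read the configuration and print the value for each one. This is only a sketch: the YAML below is a trimmed stand-in for `config/webpacker.yml` with the defaults described above, and `extract_css_by_environment` is a hypothetical helper. It assumes Ruby 2.6 or newer for the `aliases:` keyword.

```ruby
require 'yaml'

# Trimmed-down stand-in for config/webpacker.yml; the values are an
# assumption about a default setup, so substitute your own file's contents.
SAMPLE_WEBPACKER_YML = <<~YAML
  default: &default
    source_path: app/javascript
    extract_css: false

  development:
    <<: *default

  production:
    <<: *default
    extract_css: true
YAML

# Hypothetical helper: maps each environment name to its extract_css value.
def extract_css_by_environment(yaml_text)
  config = YAML.safe_load(yaml_text, aliases: true)
  config
    .reject { |name, _| name == 'default' }
    .transform_values { |settings| settings['extract_css'] }
end

extract_css_by_environment(SAMPLE_WEBPACKER_YML).each do |env, extracted|
  puts "#{env}: extract_css=#{extracted}"
end
```

In a real project you would pass `File.read('config/webpacker.yml')` instead of the sample string.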

## Conclusion

At this point of time, the application should be running with `Bootstrap` in place. Do test out how it will differ in the production environment as compared to development.

## JWT With Refresh Token Using Devise And Doorkeeper Without Authorization

This is a documentation on setting up the authentication system of a rails project in a primarily `API` environment.

Rails is essentially a framework for bootstrapping applications on the web environment. The support for `APIs` is thus lacking. One aspect of it is an off the shelf authentication system that can fit both the `API` and web environment on the same monolith application.

The `Devise` gem, while hugely popular and established as the de facto authentication gem in the Rails world, does not come with an authentication system fit for interaction via `APIs`. The main reason is that it relies on cookies, which are strictly a browser feature.

To overcome this, often, we have to use other gems to couple with it to leverage on its scaffolded features for user authentications.

In this article, we will use the `Doorkeeper` and `Devise` combination to provide an authentication using JSON Web Tokens (JWT), the modern day best practices for authentication via `APIs`.

But let us first understand what kind of authentication system we are building and why we choose `Doorkeeper`.

## The Example Authentication System

Now, as a disclaimer, there are many ways to setup an authentication system.

One such consideration is the `devise-jwt` gem, which serves as a direct replacement to the cookies for your `APIs`. It is simple to implement and allows you to choose from multiple strategies to expire your token. Except that it does not come with a refresh token.

This implies that the token will expire and the user will have to login again. If your application requires such security, you can consider this gem instead.

However, in this article, the authentication system that I would like to set up is one that allows users to log in via a `JWT` that will expire, and upon expiry, the front end can use the refresh token to get a new `JWT` without requiring the user to log in again. This allows the user to stay logged in without compromising security excessively.

Why do we need to ensure the `JWT` expires?

## Security Considerations Using JWT

Allowing users to stay logged in permanently is kind of the standard user flow for many applications nowadays. The easiest way to implement this is to not expire the JWT. However, that is a recipe for disaster. It is akin to passing your password around when making `API` requests. The moment it gets compromised, malicious attackers have all the time in the world to explore your account and even plan their attacks, while the users have all the time in the world to say their prayers.

We thus have to enforce expiry on the `JWT` at the very least. The way to accomplish that without forcing the user to log in again is to use a refresh token.

A refresh token stays on the local machine for the whole of its lifetime, or until the user actively logs out. This means that the access token, which is dispatched out into the wild wild west otherwise known as the Internet, can at least expire within a certain period of time. And when it expires, the front end can use the refresh token to get a new access token, allowing the user to continue the current session as though he or she is still logged in. So even if the access token gets compromised in the world beyond the walls, the potential damage is reduced.
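The expiry and refresh cycle described above can be sketched in a few lines of plain Ruby. This is purely illustrative: `TokenIssuer` is a hypothetical in-memory class, not how `Doorkeeper` stores or signs tokens, and replacing the refresh token on every refresh is a choice made for this sketch.

```ruby
require 'securerandom'

# Illustrative in-memory token issuer (hypothetical, for explanation only).
class TokenIssuer
  Token = Struct.new(:access_token, :refresh_token, :issued_at, :expires_in) do
    def expired?(now)
      now >= issued_at + expires_in
    end
  end

  def initialize(expires_in: 7200)
    @expires_in = expires_in
    @by_refresh_token = {}
  end

  # Called after a successful login.
  def issue(now: Time.now)
    token = Token.new(SecureRandom.hex(16), SecureRandom.hex(16), now, @expires_in)
    @by_refresh_token[token.refresh_token] = token
    token
  end

  # Exchanges a refresh token for a new pair. The old refresh token is
  # discarded so that it cannot be used a second time.
  def refresh(refresh_token, now: Time.now)
    old = @by_refresh_token.delete(refresh_token)
    return nil unless old

    issue(now: now)
  end
end

issuer = TokenIssuer.new(expires_in: 60)
login_time = Time.now
first = issuer.issue(now: login_time)

puts first.expired?(login_time + 30) # false: still within the 60 seconds
puts first.expired?(login_time + 61) # true: time to use the refresh token

second = issuer.refresh(first.refresh_token, now: login_time + 61)
puts second.access_token != first.access_token # true: a fresh access token
```

The short-lived access token is the only thing sent with every request, so a leaked one is useful to an attacker only until it expires.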

This mechanism is part of a standard known as OAuth. There are many libraries out there that implement it already, and it is widely adopted among many of the software products that we use, like `Google account`, `facebook` and `twitter`.

However, while this works with authenticating with these external providers, it has a crucial requirement that we do not want when implementing our own in house authentication system (I am referring to the old school email and password login). That step is the authorization step.

Some of us may have come across such a request when we try to sign up with an app via Facebook, where we are asked to grant the app access to our account information.

While this feature is absolutely essential in the `OAuth` protocol, it presents an awkwardness when we want to leverage the `OAuth` libraries to implement JWT with refresh token for our in-house authentication.

## The Awkwardness Of OAuth

Just to make sure we are on the same page, here is a summary of the points that led up to this awkwardness.

First, we need to make the tokens expire for security reasons.

Second, refresh tokens come to the rescue, and they are used in the `OAuth` protocol.

Third, unfortunately, `OAuth` requires an authorization step, which in-house authentication systems do not need.

Last, we cannot leverage the various OAuth implementations out there to implement a JWT with refresh token without having to hack these libraries and somehow sidestep the authorization step.

## Hacking Doorkeeper

The `OAuth` library that we will be using is `Doorkeeper`. Its wiki page already has a section on skipping the authorization step, which certainly signals the demand for such an implementation. However, there are some points missing from this implementation and this article will try to cover more of them. These steps are highly influenced by this blog post.

First, install `doorkeeper` and its migration files, following its instructions.

``````rails g doorkeeper:install
rails g doorkeeper:migration``````

## Changes To The Migration Files

Edit the migration file like this.

``````# frozen_string_literal: true

class CreateDoorkeeperTables < ActiveRecord::Migration[6.0]
def change
create_table :oauth_access_tokens do |t|
t.references :resource_owner, index: true
t.integer :application_id
t.text :token, null: false
t.string :refresh_token
t.integer :expires_in
t.datetime :revoked_at
t.datetime :created_at, null: false
t.string :scopes
end

# required to allow model.destroy to work
create_table :oauth_access_grants do |t|
t.references :resource_owner, null: false
t.integer :application_id
t.string   :token, null: false
t.integer  :expires_in, null: false
t.text     :redirect_uri, null: false
t.datetime :created_at, null: false
t.datetime :revoked_at
t.string   :scopes, null: false, default: ''
end

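# Ensure a valid reference to the resource owner's table
# (assumes the resource owner model is stored in a `users` table)
add_foreign_key :oauth_access_tokens, :users, column: :resource_owner_id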
end
end``````

Compared to the original generated copy of the migration file, we have removed the `oauth_applications` table, which refers to the application that we want to grant permission to in the authorization step. Since we are skipping the authorization, there is no need to have this unused table.

Next we have changed

``t.references :application, null: false``

into

``t.integer :application_id``

Since the table is no longer present, we cannot use the `references` helper, and need to resort to specifying the basic data type. We are still keeping this column in the database although we have deleted the `oauth_applications` table because `Doorkeeper` uses this attribute while running its operations. Without it, an error will occur along the lines of “`column not found`”.

In fact, we also do not need the `oauth_access_grants` table, which is the bridge between the `oauth_access_tokens` table and the `oauth_applications`. It records which token authorized which application. However, without it, an error will be thrown when destroying a user record from the database. If you do not have such a feature, feel free to remove this table as well.

Lastly, only keep the foreign key implementation on `oauth_access_tokens` and change the model name according to whatever you have named your model.

## Changes To The Initializer File

Edit the configuration in the `doorkeeper` initializer file as such:

``````# frozen_string_literal: true

Doorkeeper.configure do
...
resource_owner_from_credentials do |routes|
user = User.find_for_database_authentication(email: params[:email])
request.env['warden'].set_user(user, scope: :user, store: false)
user
end
...
use_refresh_token
...
grant_flows %w[password]
...
skip_authorization do
true
end
...
api_only
base_controller 'ActionController::API'
end``````

We are essentially following this documentation on their wiki, but with some additional content and some slight changes, to implement an authentication flow whereby the token is returned in exchange for the credentials of the resource owner, in this case the user’s email and password.

Lines 5 to 9 are the main implementation.

On line 6, we instruct `Doorkeeper` to use the `Devise` method `find_for_database_authentication` to look up the user. `Devise` relies on the underlying `warden` gem to do its authentication magic, which by default saves the user in the session. This can be a problem when we check for sessions at the controller level. More on this later. We avoid this in line 7.

On line 7, we instruct `warden` to set the user only for the request and not store it in the session, as documented here.

On line 11, uncomment `use_refresh_token` to ensure a refresh token is generated on login.

Line 13 is for older versions of `Doorkeeper`, at 2.1+. More information in the above-mentioned wiki page.

Lines 15 to 17 instruct `Doorkeeper` to skip the authorization step.

Line 19, we set the mode to `api_only`. This can help to optimize the application to a certain extent. For example, it skips forgery protection checks that are not necessary in an `API` environment, which reduces computational requirements and latency.

Line 20, I am just explicitly setting the base controller to use `ActionController::API` instead of the default `ActionController::Base`, although this should have already been implemented when the mode is set to `api_only`.
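One thing worth double-checking in the `resource_owner_from_credentials` block is the password. `find_for_database_authentication` only looks the user up; it does not compare passwords by itself. Below is a minimal sketch of how the check could be added using the `Devise` methods `valid_password?` and `active_for_authentication?`. `FakeUser` and `authenticated_user` are hypothetical names used only so that the sketch runs on its own, and the commented-out block shows roughly where the check would sit.

```ruby
# Stand-in for a Devise model, defined here only so the sketch is runnable.
FakeUser = Struct.new(:email, :password, :active) do
  def valid_password?(candidate)
    password == candidate
  end

  def active_for_authentication?
    active
  end
end

# Hypothetical helper: returns the user only when the password matches and
# the account is active, otherwise nil.
def authenticated_user(user, password)
  return nil unless user&.valid_password?(password)
  return nil unless user.active_for_authentication?

  user
end

# Inside the initializer, the block could then read roughly like this:
#
#   resource_owner_from_credentials do |_routes|
#     user = User.find_for_database_authentication(email: params[:email])
#     user = authenticated_user(user, params[:password])
#     request.env['warden'].set_user(user, scope: :user, store: false) if user
#     user
#   end

alice = FakeUser.new('user1@test.com', 'secret', true)
puts authenticated_user(alice, 'secret').email  # user1@test.com
puts authenticated_user(alice, 'wrong').inspect # nil
```

Returning `nil` from the block is what tells `Doorkeeper` that the credentials were rejected.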

## Controller Level

`Devise` comes with a helper method, `current_user` or whatever your model name is, to access the current authenticated resource. This, however, will return a `nil` value in the current implementation because the underlying method will not be working. The underlying method is, taken from the source code:

``````def current_#{mapping}
@current_#{mapping} ||= warden.authenticate(scope: :#{mapping})
end``````

With reference to this stackoverflow answer, we will modify it to look like this:

``````def current_user
@current_user ||= if doorkeeper_token
User.find(doorkeeper_token.resource_owner_id)
else
warden.authenticate(scope: :user, store: false)
end
end``````

We have essentially overwritten the default implementation by `Devise` to check for the “`current_user`” using the `doorkeeper_token` first, and fall back on the default implementation. The fallback will be useful in the event where our application will still be using the traditional login methods via a web browser. Feel free to remove it if you are not going to have any such request coming from a web browser. And of course, remember to handle the scenario of a `nil` `doorkeeper_token`.

Last but not least, implement that authorization check at the correct routes and actions in the `Doorkeeper::TokensController` via the `before_action` callback like how you would when using just `Devise` alone.

``before_action :doorkeeper_authorize!``

## Custom Controller

I personally have some custom code that I want to add to all my APIs so that when the frontend consumes my APIs, they will not be left stunned by responses having different `JSON` structure.

I keep a `response_code` and a `response_message` in all my APIs for the frontend to react accordingly and trigger the desired UX flow.

Here is how I modify my controller. Let’s start off with some modification to the `Doorkeeper` modules.

``````module Doorkeeper
module OAuth
class TokenResponse
def body
{
# copied
"access_token" => token.plaintext_token,
"token_type" => token.token_type,
"expires_in" => token.expires_in_seconds,
"refresh_token" => token.plaintext_refresh_token,
"scope" => token.scopes_string,
"created_at" => token.created_at.to_i,
# custom
response_code: 'custom.success.default',
response_message: I18n.t('custom.success.default')
}.reject { |_, value| value.blank? }
end
end
end
end``````

Here, I modify the response from `Doorkeeper` to add in my required keys. I am using I18n to handle the custom messages and prepare the application for a global audience.

Next, the error response. By default, `Doorkeeper` returns the keys `error` and `error_description`. That is different from what I want. I will overwrite it totally.

``````module Doorkeeper
module OAuth
class ErrorResponse
# overwrite, do not use default error and error_description key
def body
{
response_code: "doorkeeper.errors.messages.#{name}",
response_message: description,
state: state
}
end
end
end
end``````

`name`, `description` and `state` are accessible variables in the default class. I integrate them into my custom API response for standardization purpose.

Now the controller. There are 3 main methods: `login`, `refresh` and `logout`. Let’s go through them.

``````module Api
module V1
class TokensController < Doorkeeper::TokensController
before_action :doorkeeper_authorize!, only: [:logout]

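def login
user = User.find_for_database_authentication(email: params[:email])

case
# Assumed condition (the original `when` line is missing): unknown email or wrong password
when user.nil? || !user.valid_password?(params[:password])
response_code = 'devise.failure.invalid'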
render json: {
response_code: response_code,
response_message: I18n.t(response_code)
}, status: 400
when user&.inactive_message == :unconfirmed
response_code = 'devise.failure.unconfirmed'
render json: {
response_code: response_code,
response_message: I18n.t(response_code)
}, status: 400
when !user.active_for_authentication?
create
else
create
end
end

def refresh
create
end

def logout
# Follow doorkeeper-5.1.0 revoke method, different from the latest code on the repo on 6 Sept 2019

params[:token] = access_token

revoke_token if authorized?
response_code = 'custom.success.default'
render json: {
response_code: response_code,
response_message: I18n.t(response_code)
}, status: 200
end

private

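def access_token
# Assumed reconstruction: strip the "Bearer " prefix from the Authorization header
pattern = /^Bearer /
header = request.headers['Authorization']
header.gsub(pattern, '') if header&.match?(pattern)
end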
end
end
end``````

Firstly, I am applying the `doorkeeper_authorize!` callback on the `logout` method only as that is the only method that will require the user to be logged in.

The `login` method will largely follow what we defined in the initializer file under the `resource_owner_from_credentials` block. The modification here is to define specific error scenarios and their respective `response_code`. For those scenarios that are of no interest to me, I leave them to the catch-all case and return what is now the modified default `ErrorResponse`.

The second case in particular is specific to my project. I allow admin users to create the users, and have a flag (`created_by_admin_and_authenticated`) to differentiate them.

• `nil` means the user registered normally
• `false` means they are created by the admin user, but have yet to authenticate with the email that our server sent out to them
• `true` means they are created by admin user and have also authenticated their email address

I will force users who are created by admin users but have yet to authenticate via email to reset their password, leveraging on what `Devise` has already provided with its password module.
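Because the flag has three states, a plain truthiness check is not enough: `nil` and `false` are both falsy in Ruby but mean different things here. A minimal sketch of the check follows, where `password_reset_required?` is a hypothetical helper name and not code from my project.

```ruby
# Hypothetical helper mirroring the three states of the
# created_by_admin_and_authenticated flag described above.
def password_reset_required?(created_by_admin_and_authenticated)
  # Only an explicit false means "created by an admin, email not yet verified".
  # nil (self-registered) and true (verified) both pass through.
  created_by_admin_and_authenticated == false
end

[nil, false, true].each do |flag|
  puts "#{flag.inspect}: reset required? #{password_reset_required?(flag)}"
end
```

Writing `!flag` instead would also force normally registered users, whose flag is `nil`, through the password reset.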

Note: there is definitely much to be optimized here. For example, the `find_for_database_authentication` method is being called twice here for a successful user login, once in this custom controller and the other in the default `Doorkeeper::TokensController` `create` method.

The `refresh` method to refresh the `access_token` is practically the same as the default `create` method, but I am overriding it here because I use Apipie to add documentation to the routes. For those who are not familiar with Apipie, we declare the required parameters, headers etc. just above the `refresh` method definition to document it. Doing so also lets me rename the route to create an API that the front end developers I am working with would find more familiar.

The `logout` method makes use of the `revoke_token` method, according to its source code, to revoke the JWT.

In my application, I require my frontend to add the JWT in the `Authorization` header, based on convention, instead of as a parameter in the request body. `Doorkeeper`, on the other hand, expects the token to be present in the `params`. To overcome this, I created the custom `private` `access_token` method to extract the token from the header that the front end has placed in its requests. That token is then placed in the `params` object under the key named `token`, which is what the `revoke_token` method provided by `Doorkeeper` expects. `Doorkeeper` can then do its thing without any modification to its internal workings.

The `logout` method is required for the front end to dispose of the current access token it holds, for security purposes. I also use it to remove the users’ device tokens so that they do not receive push notifications after logging out.

``````{
"email": "user1@test.com",
"password": "<password>",
"grant_type": "password"
}``````

A login request will have these keys. In particular, the `grant_type` strategy used should be `password`.

## Conclusion

You should be able to log in with the correct credentials with the default `Doorkeeper::TokensController` and access your controllers with the correct resource, just like how you would when using `Devise` alone. Otherwise, you can inherit from it in a custom controller and customise the authentication routes, as I have demonstrated.

## How To Setup A Standard AWS VPC With Terraform

This documents how to set up a standard virtual private cloud (VPC) in AWS with the basic security configurations using Terraform.

In general, I classify the basics as having the servers and databases in the private subnets, and having a bastion server for remote access. There is definitely much room to improve from this setup and certainly much more in the realms beyond my knowledge. However, as a start, this is, at the very least, essential for a production environment.

Personally, I have an AWS Certified Solutions Architect (Associate) certificate to my name, but like most of the engineering university graduates out there who have forgotten how to do `dy/dx` or what the hell L’Hôpital’s rule is, I have all but forgotten the exact steps to recreate such an environment.

As a saving grace 😅, I should say that I do know how to set it up, just that I do not have it at the tip of my fingers. I would not get it right the first time, but given time I will eventually set it up correctly.

This is true whenever I set up an environment for a new project. Debugging the setup can be time consuming and frustrating. It is not efficient and is probably one of the key reasons why infrastructure as code (IaC) has become a trending topic in recent years.

Provisioning these infrastructures using code implies:

• version control on code and, in turn, infrastructural changes made by members of the development team
• easily reproducible infrastructures
• automation

One of the frontrunners in this industry is Terraform. All that is required are the configurations written in files ending with the “`tf`” extension placed in the same directory.
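As a minimal sketch of such a file, the block below only tells Terraform which provider and region to use. The file name `provider.tf` is an assumption for illustration; the region is the one used later in this article, and credentials are assumed to come from your environment or AWS profile.

```hcl
# provider.tf (hypothetical file name)
provider "aws" {
  region = "ap-southeast-1" # Singapore, the region used in this article
}
```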

## The VPC

Start by provisioning the VPC.

We set the `CIDR` block to provide the maximum number of private IP addresses that an AWS VPC allows. This implies that you can have up to 65,536 AWS resources in your VPC, assuming each of them requires a private IP address for communication.

``````resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16" # 65536 ip addresses

tags = {
Name = "${var.project_name}${var.env}"
}
}``````

The variables `project_name` and `env` can be placed in a separate `.tf` as long as they are in the same directory when Terraform eventually runs to apply the changes.
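A minimal sketch of what that separate file could look like. The file name `variables.tf` and the descriptions are assumptions; only the variable names come from the configuration above.

```hcl
# variables.tf (hypothetical file name)
variable "project_name" {
  type        = string
  description = "Short project identifier used in resource names and tags"
}

variable "env" {
  type        = string
  description = "Environment label appended to resource names, such as staging or production"
}
```

The values can then be supplied at apply time, for example through a `terraform.tfvars` file in the same directory, which Terraform loads automatically.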

## The Gateways

Next, we set up the Internet gateway (IGW) and NAT gateway (NGW).

The IGW allows for resources in the public subnets to communicate with the outside Internet.

The NGW does the same thing, but for the resources in the private subnets, and only for connections that they initiate. Sometimes, these resources need to download packages from the Internet for updates and so on. This is in direct conflict with the security requirements that placed them in the private subnets in the first place. The NGW balances these 2 requirements.

``````# IGW
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id

tags = {
Name = "${var.project_name}${var.env}"
}
}

resource "aws_route_table" "igw" {
vpc_id = aws_vpc.main.id

tags = {
Name = "igw-${var.project_name}${var.env}"
}
}

resource "aws_route" "igw" {
route_table_id = aws_route_table.igw.id
destination_cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}

# NGW
resource "aws_route_table" "ngw" {
vpc_id = aws_vpc.main.id

tags = {
Name = "ngw-${var.project_name}${var.env}"
}
}

resource "aws_route" "ngw" {
route_table_id = aws_route_table.ngw.id
destination_cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main.id
}

### NOTE ###
resource "aws_eip" "nat" {
vpc = true
}

resource "aws_nat_gateway" "main" {
allocation_id = aws_eip.nat.id
subnet_id = aws_subnet.public-ap-southeast-1a.id

tags = {
Name = "${var.project_name}${var.env}"
}
}``````

Both gateways need to be associated with their respective `aws_route_table` via an `aws_route` that routes out to everywhere on the Internet, as indicated by the `0.0.0.0/0` `CIDR` block.

The NGW requires some additional setup.

First, a NAT gateway requires an elastic IP address due to the way it is engineered. I will not pretend I know how it works well enough to tell you why a static IP address is required, but I do know we can easily provision one using Terraform.

This static IP address will also come in useful if your private instances need to make API calls to third-party services that require the instances’ IP address for whitelisting purposes. The outgoing requests from the private instances will bear the IP address of the NGW.
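If you need to hand that address to a third party, one option is to expose it as an output. This is a small sketch; the output name `nat_public_ip` is an assumption, while `aws_eip.nat` is the resource defined above.

```hcl
output "nat_public_ip" {
  description = "Static public IP that outbound traffic from the private subnets appears to come from"
  value       = aws_eip.nat.public_ip
}
```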

In addition, a NAT gateway needs to be placed in one of the public subnets in order to communicate with the Internet. As you can see, we have made an implicit dependency on the `aws_subnet` which we will define later. Terraform will ensure the NAT gateway is created after the subnets are set up.

## The Subnets

Now, let’s set up the subnets.

We will set up 1 public and 1 private subnet in each availability zone that the region provides. I will be using the `ap-southeast-1` (Singapore) region. That will be a total of 6 subnets to provision as there are 3 availability zones in this region.

``````#### public 1a
resource "aws_subnet" "public-ap-southeast-1a" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.100.0/24"
availability_zone_id = "apse1-az2"

tags = {
Name = "public-ap-southeast-1a-${var.project_name}${var.env}"
}
}

resource "aws_route_table_association" "public-ap-southeast-1a" {
subnet_id = aws_subnet.public-ap-southeast-1a.id
route_table_id = aws_route_table.igw.id
}

#### public 1b
resource "aws_subnet" "public-ap-southeast-1b" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.101.0/24"
availability_zone_id = "apse1-az1"

tags = {
Name = "public-ap-southeast-1b-${var.project_name}${var.env}"
}
}

resource "aws_route_table_association" "public-ap-southeast-1b" {
subnet_id = aws_subnet.public-ap-southeast-1b.id
route_table_id = aws_route_table.igw.id
}

#### public 1c
resource "aws_subnet" "public-ap-southeast-1c" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.102.0/24"
availability_zone_id = "apse1-az3"

tags = {
Name = "public-ap-southeast-1c-${var.project_name}${var.env}"
}
}

resource "aws_route_table_association" "public-ap-southeast-1c" {
subnet_id = aws_subnet.public-ap-southeast-1c.id
route_table_id = aws_route_table.igw.id
}

#### private 1a
resource "aws_subnet" "private-ap-southeast-1a" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
availability_zone_id = "apse1-az2"

tags = {
Name = "private-ap-southeast-1a-${var.project_name}${var.env}"
}
}

resource "aws_route_table_association" "private-ap-southeast-1a" {
subnet_id = aws_subnet.private-ap-southeast-1a.id
route_table_id = aws_route_table.ngw.id
}

#### private 1b
resource "aws_subnet" "private-ap-southeast-1b" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.2.0/24"
availability_zone_id = "apse1-az1"

tags = {
Name = "private-ap-southeast-1b-${var.project_name}${var.env}"
}
}

resource "aws_route_table_association" "private-ap-southeast-1b" {
subnet_id = aws_subnet.private-ap-southeast-1b.id
route_table_id = aws_route_table.ngw.id
}

#### private 1c
resource "aws_subnet" "private-ap-southeast-1c" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.3.0/24"
availability_zone_id = "apse1-az3"

tags = {
Name = "private-ap-southeast-1c-${var.project_name}${var.env}"
}
}

resource "aws_route_table_association" "private-ap-southeast-1c" {
subnet_id = aws_subnet.private-ap-southeast-1c.id
route_table_id = aws_route_table.ngw.id
}
``````

Although this is a long snippet of configuration, it is essentially the same pair of resources, a subnet and its route table association, repeated 6 times.
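The repetition could also be folded into a loop with `for_each`. The sketch below covers only the public subnets and is an alternative to the explicit resources above, not what this article uses; the local name `public_subnets` is an assumption. Note that it changes the resource addresses (for example to `aws_subnet.public["ap-southeast-1a"]`), so references elsewhere, such as the NAT gateway's `subnet_id`, would need to be updated to match.

```hcl
locals {
  # Availability zone name => CIDR block and zone ID, copied from the resources above
  public_subnets = {
    "ap-southeast-1a" = { cidr = "10.0.100.0/24", az_id = "apse1-az2" }
    "ap-southeast-1b" = { cidr = "10.0.101.0/24", az_id = "apse1-az1" }
    "ap-southeast-1c" = { cidr = "10.0.102.0/24", az_id = "apse1-az3" }
  }
}

resource "aws_subnet" "public" {
  for_each = local.public_subnets

  vpc_id               = aws_vpc.main.id
  cidr_block           = each.value.cidr
  availability_zone_id = each.value.az_id

  tags = {
    Name = "public-${each.key}-${var.project_name}${var.env}"
  }
}

# One association per subnet created above
resource "aws_route_table_association" "public" {
  for_each = aws_subnet.public

  subnet_id      = each.value.id
  route_table_id = aws_route_table.igw.id
}
```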

The public subnets are assigned the `CIDR` blocks `10.0.100.0/24`, `10.0.101.0/24` and `10.0.102.0/24` respectively. Each will have up to 256 IP addresses to house 256 AWS resources that require an IP address. Taking the first subnet as an example, its addresses will run from `10.0.100.0` to `10.0.100.255`.

The private subnets occupy the `CIDR` blocks `10.0.1.0/24`, `10.0.2.0/24` and `10.0.3.0/24` respectively.

To be exact, there will be fewer than 256 usable addresses per subnet, as AWS reserves 5 private IP addresses in every subnet. Of course, you can provision more or fewer IP addresses per subnet by choosing a different subnet mask.
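The arithmetic can be checked with Terraform's built-in `cidrsubnet`, `cidrhost` and `pow` functions, for instance in `terraform console`. The local names below are assumptions for illustration only.

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  # Add 8 bits to the /16 prefix to get a /24; the third argument is the network number.
  subnet_100 = cidrsubnet(local.vpc_cidr, 8, 100) # "10.0.100.0/24"
  subnet_1   = cidrsubnet(local.vpc_cidr, 8, 1)   # "10.0.1.0/24"

  # Last address in the 10.0.100.0/24 block
  last_address = cidrhost(local.subnet_100, 255) # "10.0.100.255"

  # 2^(32 - 24) = 256 addresses, minus the 5 that AWS reserves per subnet
  usable_addresses = pow(2, 32 - 24) - 5 # 251
}
```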

The subnets are associated with different availability zones via the `availability_zone_id` to spread out the resources across the region.

Each public subnet is also associated with the `aws_route_table` that is related to the IGW, while each private subnet is associated with the `aws_route_table` related to the NGW.

## The Database

Next, we set up the database. We will provision the database using `RDS` and place it in the private subnets for security purposes.

At this point in time, I must admit that I do not know if this is the best way to set up the database. I personally have a lot of questions on how the infrastructure will change when the application eventually scales, especially for the database. How will the database be sharded into different regions to serve a global audience? How does the database sync across the different regions? These are side quests that I will have to pursue in the future.

For now, a single instance in a private subnet.

``````resource "aws_db_instance" "main" {
allocated_storage = 20
storage_type = "gp2"
engine = "mysql"
engine_version = "5.7"
instance_class = "db.t2.micro"
identifier = "rds-${var.project_name}${var.env}"
name = "something"

skip_final_snapshot = false
# notes time of creation of rds.tf file
final_snapshot_identifier = "rds-${var.project_name}${var.env}-1573454102"

vpc_security_group_ids = [aws_security_group.rds.id]
db_subnet_group_name = aws_db_subnet_group.main.id

lifecycle {
prevent_destroy = true
}

tags = {
Name = "rds-${var.project_name}${var.env}"
}
}

resource "aws_db_subnet_group" "main" {
name = "db-private-subnets"
subnet_ids = [
aws_subnet.private-ap-southeast-1a.id,
aws_subnet.private-ap-southeast-1b.id,
aws_subnet.private-ap-southeast-1c.id
]

tags = {
Name = "subnet-group-${var.project_name}${var.env}"
}
}``````

As you can see, we can review the full configuration for the database in code instead of having to navigate around the AWS management console to complete the puzzle. We can easily tell the size of the database instance we have provisioned, as well as its credentials (OK, it is debatable whether we want to commit sensitive data in our code).

In this configuration, I ensured that the database will produce a final snapshot in the event it gets destroyed.
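One way to keep the credentials out of the committed code is to pass them in as variables. This is a hedged sketch: the variable names `db_username` and `db_password` are assumptions, and the `sensitive` flag requires Terraform 0.14 or later.

```hcl
variable "db_username" {
  type = string
}

variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
}

# Then, inside resource "aws_db_instance" "main":
#   username = var.db_username
#   password = var.db_password
```

Keep in mind that the values still end up in the Terraform state file, so the state itself needs to be stored securely.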

Access to the database will be guarded by an `aws_security_group` that will be defined later.

The database is also associated with the `aws_db_subnet_group` resource. This resource consists of all the private subnets that we provisioned. This creates an implicit dependency on these subnets, ensuring that the database will only be created after the subnets are created. This also tells AWS to place the database in the custom VPC that the subnets exist in.

I also ensured the database will not be destroyed by Terraform accidentally using the `lifecycle` configuration.

## The Bastion

The bastion server allows us to access the servers and the database instance in the private subnets. We will provision the bastion inside the public subnet.

``````resource "aws_instance" "bastion" {
ami = "ami-061eb2b23f9f8839c"
instance_type = "t2.nano"
subnet_id = aws_subnet.public-ap-southeast-1a.id
vpc_security_group_ids = [aws_security_group.bastion.id]
key_name = aws_key_pair.main.key_name

tags = {
Name = "bastion-${var.project_name}${var.env}"
}
}

resource "aws_key_pair" "main" {
key_name = "${var.project_name}-${var.env}"
public_key = "ssh-rsa something"
}

output "bastion_public_ip" {
value = aws_instance.bastion.public_ip
}``````

I am using an Ubuntu 18.04 LTS image to set up the bastion instance. Note that the AMI ID will differ from region to region, even for the same operating system. The image below shows the difference in the AMI ID between the Singapore and Tokyo regions.
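To avoid hard-coding a region-specific AMI ID, the image can instead be looked up with a data source. This is a sketch rather than what this article uses; the data source name `ubuntu` is an assumption, and the owner ID is Canonical's AWS account.

```hcl
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Then, inside resource "aws_instance" "bastion":
#   ami = data.aws_ami.ubuntu.id
```

One trade-off: with `most_recent = true`, a newly published image changes the resolved ID, and Terraform will then plan to replace the instance on the next apply.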

I will mainly use the bastion to tunnel the commands to the private subnet. Hence, there is no need for a large computation. The cheapest and smallest instance size of `t2.nano` is chosen.

It is associated with a public subnet that we created. Any subnet will work, but make sure it is public as we need to be able to connect to it.

Its security group will be defined later.

All `EC2` instances in AWS can be given an `aws_key_pair`. You can generate a custom key pair using the `ssh-keygen` command, or you can use the default ssh key on your local machine so that you can ssh into the bastion easily without having to specify the identity file each time you do so.
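As a sketch of the second option, the public key can be read from disk instead of being pasted in as a literal string. The path `~/.ssh/id_rsa.pub` is an assumption about where your default public key lives; this block would take the place of the `aws_key_pair` resource shown above.

```hcl
resource "aws_key_pair" "main" {
  key_name = "${var.project_name}-${var.env}"

  # pathexpand resolves "~" to the home directory; file reads the key contents
  public_key = file(pathexpand("~/.ssh/id_rsa.pub"))
}
```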

Then, there is the `output` block. After Terraform has completed its magic, it will output values defined in these output blocks. In this case, the public ip address of the bastion server will be shown on the terminal, making it easy for us to obtain the endpoint.
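The same mechanism works for other values you may want at hand after an apply. For example, a sketch of an output for the database endpoint; the output name `rds_endpoint` is an assumption.

```hcl
output "rds_endpoint" {
  description = "Connection endpoint of the RDS instance, in address:port format"
  value       = aws_db_instance.main.endpoint
}
```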

## The Security Groups

Lastly, the connection is not complete without setting up the security groups that guard the traffic going in and out of the resources. This was the bane of my AWS Solutions Architect journey. With the required configurations spelled out in code instead of steps in the console that exist only in memory, Terraform has helped me greatly to further understand this feature.

There are a total of 3 `aws_security_group` resources to be created, representing the bastion, the instances and the database respectively. Each of them has its own set of inbound and/or outbound rules, named “ingress” and “egress” in Terraform terms, that are configured separately.

While you can configure the inbound and outbound rules together within the resource block of the respective `aws_security_group`, I would recommend against that. This is because doing so results in tight coupling between the security groups, especially if one of the rules points to another `aws_security_group` as the source. This becomes problematic when we eventually make changes to the security groups because, for example, one may not be destroyable while another security group that depends on it is not supposed to be destroyed.

And the frustrating thing is that Terraform, or maybe the underlying AWS API, does not indicate the error. In fact, it takes forever to destroy security groups that are created this way, only to fail after making us wait for a long time, which makes debugging needlessly tedious.

There are many issues on GitHub mentioning this and related problems. This has to do with what has been termed “enforced dependencies”, which Terraform currently has no mechanism to handle.

By decoupling the `aws_security_group` and their respective `aws_security_group_rule` into separate resources, we will give Terraform and ourselves an easier time removing and making changes to the security groups in the future.
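For contrast, this is roughly what the inline style being avoided looks like. It is a sketch only: the resource name `web_server_inline` is hypothetical and is not used anywhere else in this article.

```hcl
# Inline style: the rule lives inside the security group and references
# another security group directly, which couples the two resources.
resource "aws_security_group" "web_server_inline" {
  name   = "${var.project_name}${var.env}-web-servers-inline"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 22
    to_port         = 22
    protocol        = "tcp"
    security_groups = [aws_security_group.bastion.id]
  }
}
```

The sections below express the same kind of rule as a standalone `aws_security_group_rule`, so each security group can be created or destroyed without its rules being part of the same resource.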

### Bastion

Let’s see how we can configure Terraform to set up the security of the subnets. We start off with the security group for the bastion server. We will make 3 rules for it.

``````# bastion
resource "aws_security_group" "bastion" {
name = "${var.project_name}${var.env}-bastion"
description = "For bastion server ${var.env}"
vpc_id = aws_vpc.main.id

tags = {
Name = "${var.project_name}${var.env}"
}
}

resource "aws_security_group_rule" "ssh-bastion-world" {
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
# Opening to 0.0.0.0/0 can lead to security vulnerabilities
# You may want to set a fixed ip address if you have a static ip
security_group_id = aws_security_group.bastion.id
cidr_blocks = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "ssh-bastion-web_server" {
type = "egress"
from_port = 22
to_port = 22
protocol = "tcp"
security_group_id = aws_security_group.bastion.id
source_security_group_id = aws_security_group.web_server.id
}

resource "aws_security_group_rule" "mysql-bastion-rds" {
type = "egress"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_group_id = aws_security_group.bastion.id
source_security_group_id = aws_security_group.rds.id
}``````

The first is an ingress rule to allow us to `ssh` into it from wherever we are. Of course, this is not ideal as it means anyone from anywhere can attempt to ssh into it. You should scope it to the IP address where you work from, be it your home or your office. However, in my case, as a digital nomad, the IP address that I work from changes so often as I move around that it just makes more sense to open it up to the world. I took a calculated risk here. Please don’t try this at home.
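If you do have a fixed address, a scoped version of the rule could look like the sketch below. The variable `admin_cidr` and the rule name `ssh-bastion-admin` are assumptions; this rule would replace the `ssh-bastion-world` rule above.

```hcl
variable "admin_cidr" {
  type        = string
  description = "CIDR block allowed to SSH into the bastion; use a /32 for a single address"
}

resource "aws_security_group_rule" "ssh-bastion-admin" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  security_group_id = aws_security_group.bastion.id
  cidr_blocks       = [var.admin_cidr]
}
```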

The second is an egress rule that allows the bastion instance to `ssh` into the web servers in the private subnets. The `source_security_group_id` of this rule is set to the `aws_security_group` of the web servers; for an egress rule, it acts as the destination.

The third rule is another outbound rule to allow the bastion to communicate with the database. Since I am using `mysql` as the database engine, the port used is 3306. This allows us to run database operations on the isolated database instance in the private subnet via the bastion, securely and over the correct port.

### Web Servers

Next will be the security group for your web servers. The only rule defined for it here is the ingress rule that lets the bastion `ssh` into the web servers over port 22.

``````resource "aws_security_group" "web_server" {
name = "${var.project_name}${var.env}-web-servers"
description = "For Web servers ${var.env}"
vpc_id = aws_vpc.main.id

tags = {
Name = "${var.project_name}${var.env}"
}
}

resource "aws_security_group_rule" "ssh-web_server-bastion" {
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
security_group_id = aws_security_group.web_server.id
source_security_group_id = aws_security_group.bastion.id
}``````
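One thing worth double-checking in your own setup: Terraform removes the allow-all egress rule that AWS adds to a new security group by default. With only the ingress rule above, the web servers may therefore be unable to open connections to the database or to reach the Internet through the NGW. The sketch below shows egress rules that could fill that gap; the rule names and the choice of port 443 for outbound HTTPS are assumptions, not part of the original setup.

```hcl
# Allow the web servers to connect out to the database
resource "aws_security_group_rule" "mysql-web_server-rds" {
  type                     = "egress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = aws_security_group.web_server.id
  source_security_group_id = aws_security_group.rds.id
}

# Allow the web servers to make outbound HTTPS requests via the NGW
resource "aws_security_group_rule" "https-web_server-world" {
  type              = "egress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  security_group_id = aws_security_group.web_server.id
  cidr_blocks       = ["0.0.0.0/0"]
}
```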

### RDS

Lastly, the `rds` instance. Its security group consists of 2 rules.

``````resource "aws_security_group" "rds" {
name = "rds-${var.project_name}${var.env}"
description = "For RDS ${var.env}"

vpc_id = aws_vpc.main.id
tags = {
Name = "${var.project_name}${var.env}"
}
}

resource "aws_security_group_rule" "mysql-rds-web_server" {
type = "ingress"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_group_id = aws_security_group.rds.id
source_security_group_id = aws_security_group.web_server.id
}

resource "aws_security_group_rule" "mysql-rds-bastion" {
type = "ingress"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_group_id = aws_security_group.rds.id
source_security_group_id = aws_security_group.bastion.id
}``````

The first, of course, opens up port 3306 to allow requests from the web servers to reach the database so that the application can run.

The second allows the bastion to communicate over port 3306. Previously, we defined the egress rule on the bastion server itself so that it can connect out to the `RDS` instance. This ingress rule now allows the incoming requests from the bastion server to reach the `RDS` instance instead of being blocked.

## Terraform Apply

These resources can be defined in a single Terraform file or across multiple files with the `tf` extension, as long as they are in the same directory.

If you are using `docker` to run `terraform`, you can do a volume mount of the current directory into the workspace of the `docker` container and apply the infrastructure!

## Improvements

We can harden the security of this setup further by, for example, configuring the network access control lists (NACL or network ACL). In this setup, the default network ACL allows all inbound and outbound traffic for all the resources. However, this is beyond the scope of this article.

## What’s Next

Note that I did not provision any `EC2` instances where my application will run. At this point, feel free to provision the `EC2` instances for the web servers just like the bastion server, but associate them with the private subnets instead.
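A minimal sketch of such a web server, mirroring the bastion definition. The resource name `web_server` and the instance type are assumptions; the AMI, key pair, subnet and security group are the ones defined earlier in this article.

```hcl
resource "aws_instance" "web_server" {
  ami                    = "ami-061eb2b23f9f8839c" # same Ubuntu image as the bastion
  instance_type          = "t2.micro"              # assumed size; pick what your app needs
  subnet_id              = aws_subnet.private-ap-southeast-1a.id
  vpc_security_group_ids = [aws_security_group.web_server.id]
  key_name               = aws_key_pair.main.key_name

  tags = {
    Name = "web-${var.project_name}${var.env}"
  }
}
```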

Personally, I favor AWS Elastic Beanstalk for handling the deployment. What I have done so far is only the provisioning of the infrastructure. Hence, in my case, instead of defining the `EC2` instances, I will define an Elastic Beanstalk environment to host my Rails application and configure it to use the VPC to leverage all of this security.