How To Restrict File Search In Sublime Based On Project

This is a documentation on how to restrict text search to specific directories on a per-project basis in Sublime.

You may often find it annoying that a simple text search also searches in folders that are not part of your source code. While you can easily flip the switch in the user settings of the Sublime Text editor, that might not be ideal if you wear multiple hats like me and work with different frameworks.

One framework’s trash may be another’s treasure. Folders that are considered junk in one framework might be important in another. And if we happen to work on these frameworks simultaneously, we would have to constantly flip the switch on and off as we jump between these projects.

If you are using sublime text because the framework you are working on is simple enough to manage and you do not want the computation-heavy indexing and compilation process to be running in the background constantly, this is an article for you to boost your productivity.

The User Setting Way

The commonly documented way of configuring your Sublime editor is to tweak the configuration options in the user settings. Using restricted file search as an example, we can simply press CMD + , (assuming you are on a Mac) to call up the user settings file in the Sublime editor.

Next add this setting under the Preferences.sublime-settings file as shown:

{
  ...
  "binary_file_patterns": [
    "node_modules/*",
    "public/packs/",
    "public/assets/",
    "public/packs-test/",
    "tmp/*",
    "*.jpg",
    "*.jpeg",
    "*.png",
    "*.gif",
    "*.ttf",
    "*.tga",
    "*.dds",
    "*.ico",
    "*.eot",
    "*.pdf",
    "*.swf",
    "*.jar",
    "*.zip"
  ]
  ...
}

The binary_file_patterns option will instruct Sublime to treat these files as binary files. Binary files are not readable by humans, hence they are not considered in the Goto Anything or Find in Files functions by default.

Lines 4 to 8 are folders that we want excluded from the search.

The rest refer to specific file types based on their extensions.

This setup is what I typically use for my Rails projects.

The Project-specific Way

The better alternative is to apply the setting at the project level so that we do not overlap the settings between projects.

Save the project as a sublime project via File > Save As…

If moving the mouse is not your thing, you can simply create a file in the root directory with the .sublime-project extension.

Next, add the same binary_file_patterns setting as shown previously under the folders key as shown:

{
  "folders":
  [
    {
      "path": ".",
      "binary_file_patterns": [
        ...
      ]
    }
  ],
}

Line 5, the path key is required. This is added by default when we save our project as a sublime project. More information on its function and purpose can be found here.

Next, close sublime. This time, open our project via the sublime project file that you have created.

Make a search now and you will realise that you are no longer searching in the folders that you have no interest in. The search process is also much faster than before because the program is going through fewer files, and ignoring the bulkier ones.

Hallelujah!

Housekeeping

Once we create the .sublime-project file, another file with the extension .sublime-workspace will also be created. The latter contains user-specific data, and you will not want to share it with other developers who may be working on the same source code as you. Add this file to your .gitignore file to achieve this.
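
For example, a single wildcard entry in your .gitignore will cover it:

*.sublime-workspace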

Setup Bootstrap In Rails 6 With Webpacker For Development And Production

This is a documentation on how to set up Bootstrap 4 in Rails 6 using Webpacker. As the framework shifts away from Sprockets and the asset pipeline to embrace webpack, the dominant methodology for handling frontend affairs in the Javascript world, we have to adapt along.
The way to set up a css framework to bootstrap your application has undergone a revamp, and this article seeks to cover the essential steps to do so.

Pre-requisites

This article will assume you have set up all the tools required for a typical Rails 6 application.

The main extra tool you will need as compared to previous versions of Rails is the yarn package manager. You can install yarn on your computer via various ways based on your preference and your OS.

We will not be covering it in this article.

Setting Up Bootstrap

With the shift in paradigm of handling front end assets, we no longer install front end libraries using gems. In the past, these gems were merely wrappers around the Javascript libraries and files, which presented a number of problems.

First, the latest changes in the Javascript world will take some time to propagate into the Rails realm.

Second, having an intermediate wrapper increases the potential points of failure during the wrapping process.

Third, we are really dependent on the angels who are working on these wrappers. If they do not update the gems frequently, we are stuck with the old features. This can be frustrating if you are waiting for a certain bug fix or a new feature that is already available in the latest release.

To install Bootstrap, run the command that matches your Bootstrap version:

# for Bootstrap 4
yarn add bootstrap jquery popper.js

# for Bootstrap 5
yarn add bootstrap jquery @popperjs/core

This command will automatically install the latest Bootstrap package from the yarn registry and add its dependency entry and version to your package.json file. jQuery and Popper.js are libraries that Bootstrap depends on, especially for its Javascript components.

The JS And CSS Files

The main Javascript file, application.js, should now reside in the app/javascript/packs folder. This is because Webpacker will look for all the javascript files to compile in this directory. This is the default setting for Webpacker.

Of course, you can go ahead and change the configuration to your liking. However, keep in mind that Rails promotes convention over configuration. This implies that, as much as possible, methodologies and practices should follow the defaults unless absolutely necessary. This has multiple advantages. My favorite one is the portability of code among fellow Rails developers. Developers can easily understand the flow of logic and find bugs because things are where they are expected to be. This cuts down development time and cost greatly.

The application.js file should look like this:

require("@rails/ujs").start()
require("turbolinks").start()
require("@rails/activestorage").start()
require("channels")
require("bootstrap")

// stylesheets
require("../stylesheets/main.scss")

Line 1 to 4 are the default files already present in the file.

Line 5 adds the Bootstrap Javascript library.

Line 8 adds your custom stylesheet. Now, this file can be placed anywhere. In the above example, the path is relative to where the application.js file is. Hence, the file is placed in app/javascript/stylesheets/main.scss in this example.

Next, we import the Bootstrap stylesheet files in the main stylesheet file.

@import "bootstrap/scss/bootstrap";

Note that we are importing files from the node_modules folder, and not a bootstrap folder placed in the relative path of the current directory of the main stylesheet file.

Also, you do not need the ~ in front of the path to signify that it is from the node_modules folder like you would usually do for other non-Rails projects using webpack. The tilde alias in webpack is a default webpack configuration that resolves to the node_modules folder. While it will still work here, it is not required, as the node_modules folder is already configured as part of the search paths that webpack looks through when resolving modules.
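
As a side note, since webpack’s Sass loader compiles this file, you can override Bootstrap’s Sass variables before the import to customize the theme. A minimal sketch (the color value is just an example):

// app/javascript/stylesheets/main.scss
$primary: #20c997; // hypothetical brand color override
@import "bootstrap/scss/bootstrap";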

Now, you may be wondering how the Bootstrap library will work without importing any of its dependencies, namely Popper.js and jQuery. We will come to that in a minute. Before that, let’s look at the views.

The Views

Now, we will need to add the javascript and stylesheets files into the page. Following convention in this example, we will add to the application.html.erb layout so that the Bootstrap framework can be accessed in all pages. These lines of code are added in the head section of the layout template.

<%= stylesheet_pack_tag 'application' %>
<%= javascript_pack_tag 'application', 'data-turbolinks-track': 'reload' %>

There are a  number of things that are different from the old implementation.

Line 1 adds the stylesheet that Webpacker compiles. Note that this compilation into a standalone stylesheet only happens if the extract_css option is set to true in the webpacker.yml file. More about this later.

As you can see, there is no more stylesheet_link_tag. In the past, this helper method would get files from the public/assets folder, into which the asset pipeline would compile stylesheets and javascript files with other added pre- and post-processing. Now, everything is done by Webpack.

Here’s what’s happening.

Webpack will look at application.js and find the stylesheet files that are included in it. Then, using a combination of Webpack loaders, Webpack will know how to compile and translate the scss syntax, the url paths of assets used, etc. into a css file that the browser can read and apply as styling.

These Webpack loaders are already included and configured by Webpacker. However, there are many loaders out there that are not included by default. They tend to be less conventionally used and will require manual intervention on your side.

One example is using ruby code inside your javascript files. This requires the rails-erb-loader, which will “teach” Webpack to understand the erb syntax. The implementation involves a number of steps, one of which is to append this loader to the Webpack environment.js configuration file. Thankfully, the community has deemed this a common enough use case that there is, at least, a rake task that comes with the Webpacker gem to set it up easily.
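
If I remember correctly, Webpacker ships this as an installer task that can be run like so (verify against the Webpacker version you are using):

bundle exec rails webpacker:install:erb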

The compilation process mentioned above, however, is not applied in the development environment by default. This is due to the extract_css setting in the webpacker.yml file. More about this and its implications in a bit.

Note that stylesheet_link_tag still works for assets you place in the app/assets folder. However, as Rails moves away from the old Sprockets and asset pipeline convention, this is expected to be deprecated in the future.

The Webpacker Configuration File

Lastly, we need to add the dependencies of bootstrap. This takes place in the config/webpack/environment.js file.

const { environment } = require('@rails/webpacker')
const webpack = require('webpack')

environment.plugins.append('Provide', new webpack.ProvidePlugin({
  $: 'jquery',
  jQuery: 'jquery',
  // uncomment below for bootstrap 4.x
  // Popper: ['popper.js', 'default']
  // uncomment below for bootstrap 5
  Popper: ['@popperjs/core', 'default']
}))

module.exports = environment

As you can see, we are utilising the ProvidePlugin function of Webpack to add the dependency libraries in all the javascript packs instead of having to import them everywhere.

This is just an example of how we can import files with Webpack in Rails. And in this case, especially for jQuery, it makes a lot of sense as there is a high chance that we will be using it in other javascript files.

Incidentally, this is also how jQuery and Popper.js, which are dependencies of the bootstrap library, are made available for it to use.

The extract_css Option

There is one last point I would like to touch on. That is the extract_css option in the config/webpacker.yml file.
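
For reference, the relevant parts of a default config/webpacker.yml look roughly like this (abridged sketch; your file will contain more options):

default: &default
  # ...
  extract_css: false

development:
  <<: *default

production:
  <<: *default
  extract_css: true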

When set to true, webpack will compile the stylesheet files that were imported into the javascript files into external standalone stylesheets. These compiled files will then be added into the views via the stylesheet_pack_tag helper method as mentioned earlier.

In comparison, when set to false, the stylesheets are not compiled into standalone files. Instead, they are added into the view as a blob during runtime by the relevant javascript file. This takes place only after the javascript file has been completely downloaded by the browser.

In development mode, the conventional setting for the extract_css option is false, and this has quite a significant implication on how the website will behave.

One, there might be a flash of unstyled content (FOUC) when the page loads because the javascript files are loaded asynchronously. This is unlike css files, which are blocking resources that pause the rendering of the website until they have been downloaded. With asynchronous loading, the website continues rendering while the javascript file is still downloading; only after it has been completely downloaded is the css blob computed and inserted into the html source code. If the web page renders before this occurs, the styling is not yet present, and FOUC occurs.

Two, the stylesheet_pack_tag is not needed in the development environment using the default setting. Things will seem to work fine only until it is pushed into the production environment where the extract_css option is set to true, desirably and by default.

So make sure to add the stylesheet_pack_tag helper, but only if your javascript is going to compile a stylesheet and your page is reliant on it. If not, you are in for a surprise when it gets pushed to production.

Conclusion

At this point of time, the application should be running with Bootstrap in place. Do test out how it will differ in the production environment as compared to development.

JWT With Refresh Token Using Devise And Doorkeeper Without Authorization

This is a documentation on setting up the authentication system of a rails project in a primarily API environment.

Rails is essentially a framework for bootstrapping applications on the web environment. The support for APIs is thus lacking. One missing aspect is an off-the-shelf authentication system that can fit both the API and web environments in the same monolith application.

The Devise gem, while hugely popular and established as the de facto authentication gem in the Rails world, does not come with an authentication system fit for interaction via APIs. The main reason is that it relies on cookies, which are strictly a browser feature.

To overcome this, we often have to pair it with other gems to leverage its scaffolded features for user authentication.

In this article, we will use the Doorkeeper and Devise combination to provide authentication using JSON Web Tokens (JWT), the modern-day best practice for authentication via APIs.

But let us first understand what kind of authentication system we are building and why we choose Doorkeeper.

The Example Authentication System

Now, as a disclaimer, there are many ways to setup an authentication system.

One such consideration is the devise-jwt gem, which serves as a direct replacement to the cookies for your APIs. It is simple to implement and allows you to choose from multiple strategies to expire your token. Except that it does not come with a refresh token.

This implies that the token will expire and the user will have to login again. If your application requires such security, you can consider this gem instead.

However, in this article, the authentication system that I would like to set up is one that allows users to log in via a JWT that will expire, and upon expiry, the front end can use the refresh token to get a new JWT without requiring the user to log in again. This allows the user to stay logged in without compromising security excessively.

Why do we need to ensure the JWT expires?

Security Considerations Using JWT

Allowing users to stay logged in permanently is pretty much the standard user flow for many applications nowadays. The easiest way to implement this is to not expire the JWT. However, that is a recipe for disaster. It is akin to passing your password around when making API requests. The moment it gets compromised, malicious attackers have all the time in the world to explore your account and even plan their attacks, leaving the users all the time in the world to say their prayers.

We thus have to enforce expiry on the JWT at the very least. The way to accomplish that without forcing the user to log in again is to use a refresh token.

A refresh token stays on the local machine for the whole of its lifetime, or until the user actively logs out. This ensures that the access token, which is dispatched out into the wild wild west otherwise known as the Internet, at least expires within a certain period of time. When it expires, the front end can use the refresh token to get a new access token that allows the user to continue the current session as though he or she were still logged in. So even if the access token gets compromised in the world beyond the walls, the potential damage is reduced.

This mechanism has been made into a standard known as OAuth. There are many libraries out there that implement it already, and it is widely adopted among many of the software products that we use, like Google, Facebook and Twitter.

However, while this works with authenticating with these external providers, it has a crucial requirement that we do not want when implementing our own in house authentication system (I am referring to the old school email and password login). That step is the authorization step.

Some of us may have come across such a  request when we try to sign up with an app via Facebook, as shown below:

[Image: OAuth authorization prompt when signing up via Facebook]

While this feature is absolutely essential in the OAuth protocol, it presents an awkwardness when we want to leverage the OAuth libraries to implement JWT with refresh token for our in-house authentication.

The Awkwardness Of OAuth

Just to make sure we are on the same page, here is a summary of the points that lead up to this awkwardness.

First, we need to make the tokens expire for security reasons.

Second, refresh tokens come to the rescue, and they are part of the OAuth protocol.

Third, unfortunately, OAuth requires an authorization step, which an in-house authentication system does not need.

Last, we cannot leverage the various OAuth implementations out there to implement a JWT with refresh token without hacking these libraries to somehow sidestep the authorization step.

Hacking Doorkeeper

The OAuth library that we will be using is Doorkeeper. Its wiki page already has a section on skipping the authorization step, which certainly signals the demand for such an implementation. However, there are some points missing from this implementation and this article will try to cover more of them. These steps are highly influenced by this blog post.

First, install doorkeeper and its migration files, following its instructions.

rails g doorkeeper:install
rails g doorkeeper:migration
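
These generators assume the doorkeeper gem has already been added to your Gemfile and installed:

# Gemfile
gem 'doorkeeper'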

Changes To The Migration Files

Edit the migration file like this.

# frozen_string_literal: true

class CreateDoorkeeperTables < ActiveRecord::Migration[6.0]
  def change
    create_table :oauth_access_tokens do |t|
      t.references :resource_owner, index: true
      t.integer :application_id
      t.text :token, null: false
      t.string :refresh_token
      t.integer :expires_in
      t.datetime :revoked_at
      t.datetime :created_at, null: false
      t.string :scopes
    end

    # required to allow model.destroy to work
    create_table :oauth_access_grants do |t|
      t.references :resource_owner, null: false
      t.integer :application_id
      t.string   :token, null: false
      t.integer  :expires_in, null: false
      t.text     :redirect_uri, null: false
      t.datetime :created_at, null: false
      t.datetime :revoked_at
      t.string   :scopes, null: false, default: ''
    end

    # foreign key to ensure a valid reference to the resource owner's (users) table
    add_foreign_key :oauth_access_tokens, :users, column: :resource_owner_id
  end
end

Compared to the original generated copy of the migration file, we have removed the oauth_applications table, which refers to the application that we want to grant permission to in the authorization step. Since we are skipping the authorization, there is no need to have this unused table.

Next we have changed

t.references :application, null: false

into

t.integer :application_id

Since the table is no longer present, we cannot use the references helper, and need to resort to specifying the basic data type. We are still keeping this column in the database although we have deleted the applications table, because Doorkeeper uses this attribute while running its operations. Without it, an error will occur along the lines of “column not found“.

In fact, we also do not need the oauth_access_grants table, which is the bridge between the oauth_access_tokens table and the oauth_applications. It records which token authorized which application. However, without it, an error will be thrown when destroying a user record from the database. If you do not have such a feature, feel free to remove this table as well.

Lastly, only keep the foreign key implementation on oauth_access_tokens and change the model name according to whatever you have named your model.
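
After adjusting the migration file, run the migrations as usual:

rails db:migrate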

Changes To The Initializer File

Edit the configuration in the doorkeeper initializer file as such:

# frozen_string_literal: true

Doorkeeper.configure do
  ...
  resource_owner_from_credentials do |routes|
    user = User.find_for_database_authentication(email: params[:email])
    request.env['warden'].set_user(user, scope: :user, store: false)
    user
  end
  ...
  use_refresh_token
  ...
  grant_flows %w[password]
  ...
  skip_authorization do
    true
  end
  ...
  api_only
  base_controller 'ActionController::API'
end

We are essentially following this documentation on their wiki, but with some additional content and some slight changes, to implement an authentication flow whereby the token is returned in exchange for the credentials of the resource owner, in this case the user’s email and password.

Line 5 to 9 is the main implementation.

On line 6, we are instructing Doorkeeper to use the Devise method find_for_database_authentication to authenticate the correct user. This method uses the underlying Warden gem in Devise to do its authentication magic. This, however, will save the user in the session, which can be a problem when we check for sessions at the controller level. More on this later. We undo this on line 7.

On line 7, we instruct warden to set the user only for the request and not store it in the session, as documented here.

On line 11, uncomment use_refresh_token to ensure a refresh token is generated on login.

Line 13 is needed for older versions of Doorkeeper (2.1+). More information can be found in the above-mentioned wiki page.

Line 15 to 17, we instruct Doorkeeper to skip the authorization step.

On line 19, we set the mode to api_only. This can help to optimize the application to a certain extent. For example, it skips forgery protection checks that are not necessary in an API environment, which reduces computational requirements and latency.

Line 20, I am just explicitly setting the base controller to use ActionController::API instead of the default ActionController::Base, although this should have already been implemented when the mode is set to api_only.

Controller Level

Devise comes with a helper method, current_user (or whatever your model name is), to access the currently authenticated resource. This, however, will return a nil value in the current implementation because the underlying method will not work here. The underlying method, taken from the Devise source code, is:

def current_#{mapping}
  @current_#{mapping} ||= warden.authenticate(scope: :#{mapping})
end

With reference to this stackoverflow answer, we will modify it to look like this:

def current_user
  @current_user ||= if doorkeeper_token
                      User.find(doorkeeper_token.resource_owner_id)
                    else
                      warden.authenticate(scope: :user, store: false)
                    end
end

We have essentially overwritten the default implementation by Devise to check for the “current_user” using the doorkeeper_token first, and fall back on the default implementation otherwise. The fallback will be useful in the event that our application still serves traditional logins via a web browser. Feel free to remove it if you are not going to have any such requests coming from a web browser. And of course, remember to handle the scenario of a nil doorkeeper_token.

Last but not least, implement that authorization check at the correct routes and actions in the Doorkeeper::TokensController via the before_action callback like how you would when using just Devise alone.

before_action :doorkeeper_authorize!
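
For illustration, protecting a regular API controller could look something like this (the controller, route and attribute are hypothetical; inherit from whatever base controller your API uses):

module Api
  module V1
    class ProfilesController < ApplicationController
      before_action :doorkeeper_authorize!

      # GET /api/v1/profile
      def show
        # current_user is resolved from doorkeeper_token by the override above
        render json: { email: current_user.email }
      end
    end
  end
end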

Custom Controller

I personally have some custom code that I want to add to all my APIs so that when the frontend consumes my APIs, they will not be left stunned by responses having different JSON structure.

I keep a response_code and a response_message in all my APIs for the frontend to react accordingly and trigger the desired UX flow.

Here is how I modify my controller. Let’s start off with some modification to the Doorkeeper modules.

module Doorkeeper
  module OAuth
    class TokenResponse
      def body
        {
          # copied
          "access_token" => token.plaintext_token,
          "token_type" => token.token_type,
          "expires_in" => token.expires_in_seconds,
          "refresh_token" => token.plaintext_refresh_token,
          "scope" => token.scopes_string,
          "created_at" => token.created_at.to_i,
          # custom
          response_code: 'custom.success.default',
          response_message: I18n.t('custom.success.default')
        }.reject { |_, value| value.blank? }
      end
    end
  end
end

Here, I modify the response from Doorkeeper to add in my required keys. I am using I18n to handle the custom messages and prepare the application for a global audience.

Next, the error response. By default, Doorkeeper returns the keys error and error_description. That is different from what I want. I will overwrite it totally.

module Doorkeeper
  module OAuth
    class ErrorResponse
      # overwrite, do not use default error and error_description key
      def body
        {
          response_code: "doorkeeper.errors.messages.#{name}",
          response_message: description,
          state: state
        }
      end
    end
  end
end


name, description and state are accessible variables in the default class. I integrate them into my custom API response for standardization purpose.

Now the controller. There are 3 main methods: login, refresh and logout. Let’s go through them.

module Api
  module V1
    class TokensController < Doorkeeper::TokensController
      before_action :doorkeeper_authorize!, only: [:logout]

      def login
        user = User.find_for_database_authentication(email: params[:email])

        case
        when user.nil? || !user.valid_password?(params[:password])
          response_code = 'devise.failure.invalid'
          render json: {
            response_code: response_code,
            response_message: I18n.t(response_code)
          }, status: 400
        when user&.inactive_message == :unconfirmed
          response_code = 'devise.failure.unconfirmed'
          render json: {
            response_code: response_code,
            response_message: I18n.t(response_code)
          }, status: 400
        when !user.active_for_authentication?
          create
        else
          create
        end
      end

      def refresh
        create
      end

      def logout
        # Follow doorkeeper-5.1.0 revoke method, different from the latest code on the repo on 6 Sept 2019

        params[:token] = access_token

        revoke_token if authorized?
        response_code = 'custom.success.default'
        render json: {
          response_code: response_code,
          response_message: I18n.t(response_code)
        }, status: 200
      end

      private

      def access_token
        pattern = /^Bearer /
        header = request.headers['Authorization']
        header.gsub(pattern, '') if header && header.match(pattern)
      end
    end
  end
end

Firstly, I am applying the doorkeeper_authorize! callback on the logout method only as that is the only method that will require the user to be logged in.

The login method largely follows what we defined in the initializer file under the resource_owner_from_credentials block. The modification here is to define specific error scenarios and their respective response_code. For the scenarios that are of no interest to me, I leave them to the catch-all case and return what is now the default modified ErrorResponse.

The second case in particular is specific to my project. I allow admin users to create the users, and have a flag (created_by_admin_and_authenticated) to differentiate them.

  • nil means the user registered normally
  • false means they are created by the admin user, but have yet to authenticate with the email that our server sent out to them
  • true means they are created by admin user and have also authenticated their email address

I will force users who are created by admin users but have yet to authenticate via email to reset their password, leveraging on what Devise has already provided with its password module.

Note: there is definitely much to be optimized here. For example, the find_for_database_authentication method is called twice for a successful user login, once in this custom controller and once in the default Doorkeeper::TokensController create method.

The refresh method to refresh the access_token is practically the same as the default create method, but I am overriding it here because I use ApiPie to add documentation to the routes. For those unfamiliar with ApiPie, its required parameters, headers etc. are defined just above the refresh method definition to document it. In doing so, I can also rename the route to one that the front end developers I am working with would find more familiar.

The logout method makes use of the revoke_token method, according to its source code, to revoke the JWT.

In my application, I require my frontend to add the JWT token in the Authorization header, instead of as a parameter in the request body, based on convention. Doorkeeper, on the other hand, expects the token to be present in the params. To overcome this, I created the custom private access_token method to get the token from the header that the front end has placed in their requests. That token is then placed in the params object under the key named token, as Doorkeeper expects. Doorkeeper can then do its thing without us having to modify any of its internal workings.

Since the revoke_token method provided by Doorkeeper will make use of the token key in the params, I will first use the private access_token method to extract the JWT token from the Authorization header. Then add it as the value to the token key of the params variable.

The logout method is required for the front end to dispose of the current access token they have for security purposes. I also use it to remove the users’ devices token so that they do not receive push notifications after logging out.

Login Request

{
	"email": "user1@test.com",
	"password": "user1@test.com",
	"grant_type": "password"
}


A login request will have these keys. In particular, the grant_type strategy used should be password.
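
For instance, if you mount the default Doorkeeper token endpoint at /oauth/token (your custom routes from the controller above may differ), the request could look like this:

curl -X POST http://localhost:3000/oauth/token \
  -H 'Content-Type: application/json' \
  -d '{"email": "user1@test.com", "password": "user1@test.com", "grant_type": "password"}'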

Conclusion

You should be able to log in with the correct credentials using the default Doorkeeper::TokensController and access your controllers with the correct resource, just like how you would when using Devise alone. Otherwise, you can have your custom controller inherit from it and customise the authentication routes, as I have demonstrated.

Hope this was helpful!

How To Setup A Standard AWS VPC With Terraform

This is a documentation on how to setup the standard virtual private network (VPC) in AWS with the basic security configurations using Terraform.

In general, I classify the basics as having the servers and databases in the private subnets, and having a bastion server for remote access. There is definitely much room to improve from this setup and certainly much more in the realms beyond my knowledge. However, as a start, this is, at the very least, essential for a production environment.

Personally, I have an AWS Certified Solutions Architect (Associate) certificate to my name, but like most of the engineering university graduates out there who have forgotten how to do dy/dx or what the hell L’Hôpital’s rule is, I have all but forgotten the exact steps to recreate such an environment.

[Image: AWS Certified Solutions Architect (Associate) certificate]

As a saving grace 😅, I should say that I do know how to set it up, just that I do not have it at the tip of my fingers. I would not get it right the first time, but given time I will eventually set it up correctly.

This is true whenever I set up an environment for new projects. Debugging the setup can be time consuming and frustrating. It is not efficient, and it is probably one of the key reasons why infrastructure as code (IaC) has become a trending topic in recent years.

Provisioning these infrastructures using code implies:

  • version control on code and, in turn, infrastructural changes made by members of the development team
  • easily reproducible infrastructures
  • automation

One of the frontrunners in this industry is Terraform. All that is required are the configurations written in files ending with the “tf” extension placed in the same directory.

The VPC

Start by provisioning the VPC.

We set the CIDR block to provide the maximum number of private ip addresses that an AWS VPC allows. This implies that you can have up to 65,536 AWS resources in your VPC, assuming each of them requires a private IP address for communication purposes.

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # 65536 ip addresses

  tags = {
    Name = "${var.project_name}${var.env}"
  }
}

The variables project_name and env can be placed in a separate .tf file, as long as they are in the same directory when Terraform eventually runs to apply the changes.
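
For example, a minimal variables file could look like this (the default values are placeholders):

# variables.tf
variable "project_name" {
  default = "myproject"
}

variable "env" {
  default = "staging"
}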

The Gateways

Next, we setup the Internet gateway (IGW) and NAT gateway (NGW).

The IGW allows for resources in the public subnets to communicate with the outside Internet.

The NGW does the same thing,  but for the resources in the private subnets. Sometimes, these resources need to download packages from the Internet for updates etc. This is in direct conflict with the security requirements that placed them in the private subnets in the first place. The NGW balances these 2 requirements.

# IGW
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}${var.env}"
  }
}

resource "aws_route_table" "igw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "igw-${var.project_name}${var.env}"
  }
}

resource "aws_route" "igw" {
  route_table_id = aws_route_table.igw.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id = aws_internet_gateway.main.id
}

# NGW
resource "aws_route_table" "ngw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "ngw-${var.project_name}${var.env}"
  }
}

resource "aws_route" "ngw" {
  route_table_id = aws_route_table.ngw.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id = aws_nat_gateway.main.id
}

### NOTE ###
resource "aws_eip" "nat" {
  vpc = true
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id = aws_subnet.public-ap-southeast-1a.id

  tags = {
    Name = "${var.project_name}${var.env}"
  }
}

Both gateways need to be associated to their respective aws_route_table via an aws_route that will route out to everywhere on the Internet, as indicated by the 0.0.0.0/0 CIDR block.

The NGW requires some additional setup.

First, a NAT gateway requires an elastic IP address due to the way it is engineered. I would not pretend I know its inner workings well enough to tell you why a static IP address is required, but I do know we can easily provision one using Terraform.

This static IP address will also come in useful if your private instances need to make API calls to third party sources that require the instance's ip address for whitelisting purposes. The outgoing requests from the private instances will bear the ip address of the NGW.

In addition, a NAT gateway needs to be placed in one of the public subnets in order to communicate with the Internet. As you can see, we have made an implicit dependency on the aws_subnet, which we will define later. Terraform will ensure the NAT gateway is created after the subnets are set up.

The Subnets

Now, let’s setup the subnets.

We will set up 1 public and 1 private subnet in each availability zone that the region provides. I will be using the ap-southeast-1 (Singapore) region. That will be a total of 6 subnets to provision, as there are 3 availability zones in this region.

#### public 1a
resource "aws_subnet" "public-ap-southeast-1a" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.100.0/24"
  availability_zone_id = "apse1-az2"

  tags = {
    Name = "public-ap-southeast-1a-${var.project_name}${var.env}"
  }
}

resource "aws_route_table_association" "public-ap-southeast-1a" {
  subnet_id = aws_subnet.public-ap-southeast-1a.id
  route_table_id = aws_route_table.igw.id
}

#### public 1b
resource "aws_subnet" "public-ap-southeast-1b" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.101.0/24"
  availability_zone_id = "apse1-az1"

  tags = {
    Name = "public-ap-southeast-1b-${var.project_name}${var.env}"
  }
}

resource "aws_route_table_association" "public-ap-southeast-1b" {
  subnet_id = aws_subnet.public-ap-southeast-1b.id
  route_table_id = aws_route_table.igw.id
}

#### public 1c
resource "aws_subnet" "public-ap-southeast-1c" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.102.0/24"
  availability_zone_id = "apse1-az3"

  tags = {
    Name = "public-ap-southeast-1c-${var.project_name}${var.env}"
  }
}

resource "aws_route_table_association" "public-ap-southeast-1c" {
  subnet_id = aws_subnet.public-ap-southeast-1c.id
  route_table_id = aws_route_table.igw.id
}

#### private 1a
resource "aws_subnet" "private-ap-southeast-1a" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
  availability_zone_id = "apse1-az2"

  tags = {
    Name = "private-ap-southeast-1a-${var.project_name}${var.env}"
  }
}

resource "aws_route_table_association" "private-ap-southeast-1a" {
  subnet_id = aws_subnet.private-ap-southeast-1a.id
  route_table_id = aws_route_table.ngw.id
}

#### private 1b
resource "aws_subnet" "private-ap-southeast-1b" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.2.0/24"
  availability_zone_id = "apse1-az1"

  tags = {
    Name = "private-ap-southeast-1b-${var.project_name}${var.env}"
  }
}

resource "aws_route_table_association" "private-ap-southeast-1b" {
  subnet_id = aws_subnet.private-ap-southeast-1b.id
  route_table_id = aws_route_table.ngw.id
}

#### private 1c
resource "aws_subnet" "private-ap-southeast-1c" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.3.0/24"
  availability_zone_id = "apse1-az3"

  tags = {
    Name = "private-ap-southeast-1c-${var.project_name}${var.env}"
  }
}

resource "aws_route_table_association" "private-ap-southeast-1c" {
  subnet_id = aws_subnet.private-ap-southeast-1c.id
  route_table_id = aws_route_table.ngw.id
}

Amidst this long snippet of configuration for the subnets, it is essentially a repetition of the same resource associations.

The public subnets are assigned the CIDR blocks 10.0.100.0/24, 10.0.101.0/24 and 10.0.102.0/24 respectively. Each will have up to 256 ip addresses to house 256 AWS resources that require a private ip address. Taking the first public subnet as an example, its addresses will range from 10.0.100.0 to 10.0.100.255.

The private subnets occupy the CIDR blocks 10.0.1.0/24, 10.0.2.0/24 and 10.0.3.0/24 respectively.

To be exact, there will be less than 256 addresses per subnet as some private IP addresses are reserved in every subnet. Of course, you can provision more or less ip addresses per subnet with the correct subnet masking setting.

Each subnet is associated to different availability zones via the availability_zone_id to spread out the resources across the region.

Each public subnet is also associated to the aws_route_table that is related to the IGW, while each private subnet is associated to the aws_route_table related to the NGW.

The Database

Next, we setup the database. We will provision the database using RDS and place it in the private subnets for security purpose.

At this point of time, I must admit that I do not know if this is the best way to set up the database. I personally have a lot of questions on how the infrastructure will change when the application eventually scales, especially for the database. How will the database be sharded into different regions to serve a global audience? How does the database sync across the different regions? These are side quests that I will have to pursue in the future.

For now, a single instance in a private subnet.

resource "aws_db_instance" "main" {
  allocated_storage = 20
  storage_type = "gp2"
  engine = "mysql"
  engine_version = "5.7"
  instance_class = "db.t2.micro"
  identifier = "rds-${var.project_name}${var.env}"
  name = "something"
  username = "something"
  password = "something"

  skip_final_snapshot = false
  # notes time of creation of rds.tf file
  final_snapshot_identifier = "rds-${var.project_name}${var.env}-1573454102"

  vpc_security_group_ids = [aws_security_group.rds.id]
  db_subnet_group_name = aws_db_subnet_group.main.id

  lifecycle {
    prevent_destroy = true
  }

  tags = {
    Name = "rds-${var.project_name}${var.env}"
  }
}

resource "aws_db_subnet_group" "main" {
  name = "db-private-subnets"
  subnet_ids = [
    aws_subnet.private-ap-southeast-1a.id,
    aws_subnet.private-ap-southeast-1b.id,
    aws_subnet.private-ap-southeast-1c.id
  ]

  tags = {
    Name = "subnet-group-${var.project_name}${var.env}"
  }
}

Here, we can see and review the full configuration for the database in code, as compared to having to navigate around the AWS management console to piece the puzzle together. We can easily know the size of the database instance we have provisioned as well as its credentials (ok, this is debatable if we want to commit sensitive data in our code).

In this configuration, I ensured that the database will produce a final snapshot in the event it gets destroyed.

Access to the database will be guarded by an aws_security_group that will be defined later.

The database is also associated to the aws_db_subnet_group resource. This resource consists of all the private subnets that we provisioned. This creates an implicit dependency on these subnets, ensuring that the database will only be created after the subnets are created. This also tells AWS to place the database in the custom VPC that the subnets exist in.

I also ensured the database will not be destroyed by Terraform accidentally using the lifecycle configuration.

The Bastion

The bastion server allows us to access the servers and the database instance in the private subnets. We will provision the bastion inside the public subnet.

resource "aws_instance" "bastion" {
  ami = "ami-061eb2b23f9f8839c"
  associate_public_ip_address = true
  instance_type = "t2.nano"
  subnet_id = aws_subnet.public-ap-southeast-1a.id
  vpc_security_group_ids = ["${aws_security_group.bastion.id}"]
  key_name = aws_key_pair.main.key_name

  tags = {
    Name = "bastion-${var.project_name}${var.env}"
  }
}

resource "aws_key_pair" "main" {
  key_name = "${var.project_name}-${var.env}"
  public_key = "ssh-rsa something"
}


output "bastion_public_ip" {
  value = aws_instance.bastion.public_ip
}

I am using a Ubuntu-18.04 LTS image to setup the bastion instance. Note that the AMI id will differ from region to region, even for the same operating system. The image below shows the difference in the AMI id between Singapore and Tokyo regions.

[Image: Ubuntu AMI id in the ap-southeast-1 (Singapore) region]
[Image: Ubuntu AMI id in the Tokyo region]

I will mainly use the bastion to tunnel commands to the private subnets. Hence, there is no need for much computational power. The cheapest and smallest instance size of t2.nano is chosen.

It is associated to a public subnet that we created. Any subnet will work, but make sure it is public as we need to be able to connect to it.

Its security group will be defined later.

All EC2 instances in AWS can be given an aws_key_pair. We can generate a custom key pair using the ssh-keygen command, or use the default ssh key on your local machine so that you can ssh into the bastion easily without having to specify the identity file each time you do so.

Then, there is the output block. After Terraform has completed its magic, it will output values defined in these output blocks. In this case, the public ip address of the bastion server will be shown on the terminal, making it easy for us to obtain the endpoint.
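
As an example of the tunnelling mentioned earlier, you could forward the MySQL port of the RDS instance through the bastion like this (the key path and RDS endpoint are placeholders):

ssh -i ~/.ssh/id_rsa -N -L 3306:<rds-endpoint>:3306 ubuntu@<bastion_public_ip>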

The Security Groups

Lastly, the connection is not completed without setting up the security groups that guards the traffic going in and out of the resources. This was the bane of my AWS Solution Architect journey. With the required configurations spelled out in code instead of steps in the console that exist only in the memory, Terraform has helped me greatly to further understand this feature.

There are a total of 3 aws_security_group resources to be created, representing the bastion, the instances and the database respectively. Each of them has its own set of inbound and/or outbound rules, named “ingress” and “egress” in Terraform terms, that are configured separately.

While you can configure the inbound and outbound rules together within the resource block of the respective aws_security_group, I would recommend against that. Doing so results in tight coupling between the security groups, especially if one of the aws_security_group_rule entries points to another aws_security_group as its source. This is problematic when we eventually make changes to the security groups because, for example, one cannot be destroyed since a security group that it depends on is not supposed to be destroyed.

And the frustrating thing is that Terraform, or maybe the underlying AWS API, does not indicate the error. In fact, it takes forever to destroy security groups that are created this way, only to fail after making us wait for a long time, which makes debugging needlessly tedious.

There are many issues mentioning this and related problems on Github, like this. This has to do with what has been termed “enforced dependencies”, which Terraform currently has no mechanism to handle.

By decoupling the aws_security_group and their respective aws_security_group_rule into separate resources, we will give Terraform and ourselves an easier time removing and making changes to the security groups in the future.

Bastion

Let’s see how we can get Terraform to set up the security groups. We start off with the security group for the bastion server. We will make 3 rules for it.

# bastion
resource "aws_security_group" "bastion" {
  name = "${var.project_name}${var.env}-bastion"
  description = "For bastion server ${var.env}"
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}${var.env}"
  }
}

resource "aws_security_group_rule" "ssh-bastion-world" {
  type = "ingress"
  from_port = 22
  to_port = 22
  protocol = "tcp"
  # Please restrict your ingress to only necessary IPs and ports.
  # Opening to 0.0.0.0/0 can lead to security vulnerabilities
  # You may want to set a fixed ip address if you have a static ip
  security_group_id = aws_security_group.bastion.id
  cidr_blocks = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "ssh-bastion-web_server" {
  type = "egress"
  from_port = 22
  to_port = 22
  protocol = "tcp"
  security_group_id = aws_security_group.bastion.id
  source_security_group_id = aws_security_group.web_server.id
}

resource "aws_security_group_rule" "mysql-bastion-rds" {
  type = "egress"
  from_port = 3306
  to_port = 3306
  protocol = "tcp"
  security_group_id = aws_security_group.bastion.id
  source_security_group_id = aws_security_group.rds.id
}

The first is an ingress rule to allow us to ssh into the bastion from wherever we are. Of course, this is not ideal as it means anyone from anywhere can ssh into it. We should scope it to the ip address you work from, be it your home or your office. However, in my case as a digital nomad, the ip address that I work with changes so often as I move around that it just makes more sense to open it up to the world. I took a calculated risk here. Please don’t try this at home.

The second is an egress rule that allows the bastion instance to ssh into the web servers in the private subnets. The source of this rule is set as the aws_security_group of the web servers.

The third rule is another outbound rule to allow the bastion to communicate with the database. Since I am using mysql as the database engine, the port used is 3306. This allows us to run database operations on the isolated database instance in the private subnet via the bastion over the correct port securely.

Web Servers

Next will be the security group for your web servers. The only rule that it requires is the ingress rule that lets the bastion ssh into them over port 22.

resource "aws_security_group" "web_server" {
  name = "${var.project_name}${var.env}-web-servers"
  description = "For Web servers ${var.env}"
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}${var.env}"
  }
}

resource "aws_security_group_rule" "ssh-web_server-bastion" {
  type = "ingress"
  from_port = 22
  to_port = 22
  protocol = "tcp"
  security_group_id = aws_security_group.web_server.id
  source_security_group_id = aws_security_group.bastion.id
}

RDS

Lastly, the rds instance. Its security group consists of 2 rules.

resource "aws_security_group" "rds" {
    name = "rds-${var.project_name}${var.env}"
    description = "For RDS ${var.env}"

vpc_id = aws_vpc.main.id
  tags = {
    Name = "${var.project_name}${var.env}"
  }
}

resource "aws_security_group_rule" "mysql-rds-web_server" {
  type = "ingress"
  from_port = 3306
  to_port = 3306
  protocol = "tcp"
  security_group_id = aws_security_group.rds.id
  source_security_group_id = aws_security_group.web_server.id
}

resource "aws_security_group_rule" "mysql-rds-bastion" {
  type = "ingress"
  from_port = 3306
  to_port = 3306
  protocol = "tcp"
  security_group_id = aws_security_group.rds.id
  source_security_group_id = aws_security_group.bastion.id
}

The first is of course to open up port 3306 to allow requests from the web servers to reach the database to run the application.

The second is to allow the bastion to communicate over port 3306. We defined the egress rule, applied on the bastion server itself, to connect out to the RDS instance previously. Now, this ingress rule allows the incoming requests from the bastion server to reach the RDS instance instead of being blocked off.

Terraform Apply

These resources can be defined in a single or multiple terraform files with the extension tf, as long as they are in the same directory.

If you are using docker to run terraform, you can do a volume mount of the current directory into the workspace of the docker container and apply the infrastructure!
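
For example, something along these lines should work (the image tag is only an example; pin it to the Terraform version you are using):

docker run --rm -it -v "$(pwd)":/workspace -w /workspace hashicorp/terraform:0.12.29 init
docker run --rm -it -v "$(pwd)":/workspace -w /workspace hashicorp/terraform:0.12.29 plan
docker run --rm -it -v "$(pwd)":/workspace -w /workspace hashicorp/terraform:0.12.29 apply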

Improvements

We can harden the security of this setup further by, for example, configuring the Network Access Control Lists (NACLs, or Network ACLs). In this setup, the default is to allow all inbound and outbound traffic for all the resources. However, this is beyond the scope of this article.

What’s Next

Note that I did not provision any EC2 instances where my application will run. At this point of time, you can feel free to provision the EC2 instances for the web servers just like the bastion server, but associating them with the private subnets.

For me, I favor AWS Elastic Beanstalk in handling the deployment. What I have done so far is only the provisioning of the infrastructure. Hence, in my case, instead of defining the EC2 instances, I will define an elastic beanstalk environment to host my Rails application and configure it to use the VPC to leverage on all the security.

How To Add Datatables To Webpacker In Rails

This is a documentation of using datatables with the latest version of Rails (6.0 at the time of writing) that uses webpacker as the default Javascript compiler.

I found it somewhat difficult to find documentation on integrating this in the new Rails, away from the land of Sprockets.

Hopefully this can help you and my future self when I come back to understand what I did to make my codes work instead of leaving it to God.

Datatables and custom styling

Datatables ships with its core files and some default styling packages for major CSS frameworks like Bootstrap and Foundation. Taking Bootstrap 4 as the example framework, install the packages below using yarn.

"datatables.net": "^1.10.19",
"datatables.net-bs4": "^1.10.19",

These are the latest versions at the time of writing. They will be added to the package.json file.
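
For example, running the command below will pull the latest versions and record them in package.json:

yarn add datatables.net datatables.net-bs4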

Require Datatables

In your javascript file (place it in application.js for now), require the file and initialize datatables.

require("datatables.net")
require('datatables.net-bs4')
require("datatables.net-bs4/css/dataTables.bootstrap4.min.css")

const dataTables = [];

document.addEventListener("turbolinks:load", () => {
  if (dataTables.length === 0 && $('.data-table').length !== 0) {
    $('.data-table').each((_, element) => {
      dataTables.push($(element).DataTable({
        pageLength: 50
      }));
    });
  }
});

document.addEventListener("turbolinks:before-cache", () => {
  while (dataTables.length !== 0) {
    dataTables.pop().destroy();
  }
});

Let me explain what each line does.

On line 1, we import the core datatables js file. It adds the standard search and sorting functions of datatables as well as wires up any of your custom configurations.

On line 2, we import the javascript that works with Bootstrap 4 elements. It will add elements to the web page, for example for the pagination feature, using common Bootstrap classes like row and col-*.

On line 3, we import the custom css file that is required by the datatables JavaScript functions but is not part of the default Bootstrap styling. Yes, we are importing a CSS file in a javascript file. Webpack will compile this JavaScript file into the public/packs folder and take care of loading the css into the webpage, albeit via javascript. Note that if you set the extract_css option to true in the webpacker configuration, it will instruct webpacker to compile the css into a standalone file instead of loading it as part of the Javascript code. Hence, you will need to rely on stylesheet_pack_tag to load the css file in the page for the styling to work.

Line 5 is where we declare a datatables array variable to be accessed within this module that is this script. This is a critical step for DataTables to play well with Rails in a turbolinks powered environment. The role of this variable is to store all instances of the tables that have been initialized.

The next 2 blocks of code add 2 listeners to the DOM.

The first triggers the dataTable() function on the desired elements that bear the class data-table. This sets up the pagination, search, sort etc functionalities that make datatables so powerful and simple on your table element. The event this occurs on is turbolinks:load, which is when the url changes and the page loads. Each element is initialized and stored in the dataTables array variable. The 2nd listener will reference them.

The second listener will destroy each of the dataTable instances that are stored in the namesake variable, if any are present. It is triggered during the turbolinks:before-cache event, which takes place when the page navigates away. This step is crucial to remove the elements that were added when the datatables script was evaluated, like the search bar and the pagination elements. If this is not done, there will be extra elements appearing on the webpage when the user navigates back through the browser history, as mentioned in this Github issue.

NOTE that it is important NOT to name the class of your elements as “dataTable” as they will get destroyed in the process. If that happens, when the user navigates forward and back again or vice versa, the element will not be picked up and the dataTable() function will not be executed. Kudos to Philip for his comment.
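
For reference, a table only needs your own data-table class (and not dataTable) for the script above to pick it up. A hypothetical example:

<table class="data-table">
  <thead>
    <tr><th>Name</th><th>Email</th></tr>
  </thead>
  <tbody>
    <tr><td>Luffy</td><td>luffy@example.com</td></tr>
  </tbody>
</table>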

Optimizing

= javascript_pack_tag 'custom/datatables', 'data-turbolinks-track': 'reload'
= stylesheet_pack_tag 'custom/datatables'

Not every page has a table that you would like to initialize the DataTables functionality on, so you should only require this code in pages that need it. This way, the initial load time of pages that never use it is not affected by downloading extra files.

This does mean downloading 2 files instead of 1, which can affect page speed due to making 2 requests instead of 1. However, the resulting overhead is negligible considering these javascript files are not render-blocking resources, and they can be loaded asynchronously to mitigate it further.

To do this, move the DataTables code shown earlier out of application.js and into another js file under app/javascript/packs. Webpacker will pick up this file as another entry point and compile it into a separate pack that you can add independently.

Call this pack only in the pages that require it, using the snippet shown at the start of this section.
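For reference, here is a minimal sketch of what that separate pack file could look like, assuming it is saved as app/javascript/packs/custom/datatables.js to match the pack tag above:

// app/javascript/packs/custom/datatables.js
// Pulls in the DataTables core, the Bootstrap 4 integration and its css,
// exactly as shown earlier, but compiled as its own entry point.
require("datatables.net")
require("datatables.net-bs4")
require("datatables.net-bs4/css/dataTables.bootstrap4.min.css")

// ... followed by the turbolinks:load and turbolinks:before-cache listeners from earlier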

Once again, you will find stylesheet_pack_tag useful only if you have enabled extract_css in the webpacker configuration. It will be responsible for loading the compiled (or 'extracted', in this context) css file.
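For reference, extract_css is set in config/webpacker.yml. A minimal excerpt of enabling it for production might look like this, assuming the default layout of that file:

# config/webpacker.yml (excerpt)
production:
  <<: *default
  # compile css referenced in packs into standalone files
  # that stylesheet_pack_tag can then load
  extract_css: true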

Playing Well With Turbolinks

# application.html.slim
head
  = yield :javascript_in_head
body
  = yield

# specific/page.html.slim
body
  - content_for :javascript_in_head do
    = javascript_pack_tag 'my-datatables-scripts', 'data-turbolinks-track': 'reload'

Be careful where you load this javascript file. Make sure to load it in the head html element and not the body, because turbolinks will only handle the javascript files loaded in the head.

To do this on a per-page basis, use the content_for helper.

Rails will insert the javascript file for the given page at line 3, inside the head section of the page's layout, ensuring that turbolinks performs its hooks on that javascript file as well.

Why Is Render JSON And Return Not Working In My Controllers

Recently, I stumbled upon an unexpected error when I was refactoring my code to follow the style guide of RuboCop, Ruby's very own linter.

The and/or style guide rule recommends that the logical operator && be used in place of and. When I did that, my controllers started breaking my tests.

Return JSON Did Not In Fact Return

I have a controller action that looks like this. It checks the current user and returns early with a custom json response if the condition is met, instead of continuing on to render the view file corresponding to the action.

def show
  if current_user.one_piece_fan?
    render json: {
      message: 'My nakama!'
    } and return
  end
end

This works fine, but when I changed the and to &&, this custom json was not returned. What went wrong?

&& vs and

There is a very subtle difference between the two. It is their precedence order.

In a line of code consisting of multiple operations, precedence decides what gets evaluated first. This turns out to be fairly crucial for Ruby, which has stripped itself of non-human-friendly characters like brackets and semicolons. Let's take a look at the code below.

secrets of "and" | vic-l
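In case the screenshot does not load, here is a minimal sketch of the kind of comparison it shows, reconstructed from the explanation that follows:

s = true and false   # the whole expression evaluates to false
s                    # => true, because the assignment ran first
s = true && false    # the whole expression evaluates to false
s                    # => false, s now holds the result of the && operation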

Did that surprise you?

Well, this is all because of precedence.

In the first line of code, and has a lower precedence than the assignment operator =, hence the assignment takes place before the logical AND operation with false is carried out. In other words, this is how it would actually look had Ruby still had its clothes on.

(s = true) && false

Hence, the false value returned from this line of code refers to the result of the AND operation. And when one of the operands is false, the result will be false.

multiply by 0 | vic-l

As for the third line of code, it works just like how most people would commonly interpret it. The result of the && operation is assigned to the s variable, and its resulting false value is what the variable s now holds.

Returning Early In Controllers

So back to the case of controllers.

class ApplicationController < ActionController::API
  def render_and
    render json: {
      message: 'Using "and return"'
    } and return
  end

  def render_amp
    render json: {
      message: 'Using "&& return"'
    } && return
  end

  def render_amp_with_brackets
    render(json: {
      message: 'Using "&& return" with brackets'
    }) && return
  end
end

Based on the snippet above, the actions render_and and render_amp_with_brackets will work just like you would expect. They trigger the render and return early, stopping the controller from propagating further.

As for the render_amp method, it renders the result of the && operation between the hash and the return statement. Essentially, it looks like this.

render({ ... } && return)

Since there is no eventual return in this render call, the controller propagates further and carries out its search for the view corresponding to the action.
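If you want to satisfy RuboCop and still return early, one option (a sketch of my own, not from the original example) is to lead with return instead of relying on operator precedence:

def show
  if current_user.one_piece_fan?
    # return first, so precedence no longer matters
    return render json: {
      message: 'My nakama!'
    }
  end
end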

Final Thoughts

I hope this has helped us understand our ands and &&s better!

Credits to this stackoverflow answer.

Reopen And Add Methods To Models In Ruby Gems

This is a documentation on how to add class and instance methods to models that exist in ruby gems. Often, there is a need to add methods to models that are created in ruby gems.

In a recent project that I am working on, I found a particular need for adding images to a tagging gem. The purpose of the gem is for taxonomy, and the final UI draft has a different image allocated to each of the categories (or should I say tags). We decided to have the Rails backend handle the image and tag association. Hence, the ideal way to handle this would be to modify the models in the tagging gem to hold an image as well upon creation.

Project Specifics

The tagging gem that I am using is not the more popular and senior ActsAsTaggableOn gem, but the Gutentag gem.

The reason I use the latter instead of the former is that the former did not support the new ActiveRecord 6 when I was working on the project. For example, it returned erroneous results and threw errors due to a deprecated ActiveModel method during normal usage.

The alternative I found is Gutentag. It supports Rails 6 and the contributors are actively resolving issues, keeping its issue count at 0 at the time of writing. I found it reliable and it does its main job well, which is to provide the tagging module.

The only thing it lacks for this particular project is an image to associate with each tag. Here is where I would need to hack it.

I want to add an image to each tag using ActiveStorage via the has_one_attached method, and also a custom instance method that will return the tag's name and image url.

The Rationale

The way I am doing it is to create a module that defines the relevant methods, and have the Gutentag::Tag model include this custom module during the initialization phase. This will require some workarounds because we are accessing the ActiveStorage and ActiveModel/ActiveRecord railties during the initialization phase, when these railties have not been loaded yet.

The Extension Module

Kudos to this answer on stackoverflow; define the extension module as such:

# lib/extensions/gutentag.rb
# frozen_string_literal: true

module Extensions
  module Gutentag
    extend ActiveSupport::Concern
    
    included do
      has_one_attached :image
    end

    def json_attributes
      custom_attributes = attributes.dup
      custom_attributes.delete 'created_at'
      custom_attributes.delete 'updated_at'
      custom_attributes.delete 'taggings_count'
      custom_attributes.delete 'id'

      # add image path based on the storage service used
      if Rails.env.test? || Rails.env.development?
        ActiveStorage::Current.set(host: 'http://localhost:3000') do
          custom_attributes['image'] = self.image.attached? ? self.image.service_url : nil
        end
      else
        custom_attributes['image'] = self.image.attached? ? self.image.service_url : nil
      end

      custom_attributes
    end
  end
end

I modified it slightly with the use of ActiveSupport::Concern to do it the Rails 6 way. This helps to resolve module dependencies gracefully.

In this extension, I attached an image to the module using ActiveStorage‘s has_one_attached class method, which will ultimately be applied to the Gutentag::Tag model.

I also defined the instance method json_attributes, which will return only the name and the image url of the tag when called. It is used in the api response when frontend clients retrieve the list of tags, for example.
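For illustration, assuming a tag named 'Luffy' with an image attached, calling json_attributes would return a hash along these lines (the url shown is hypothetical):

{
  "name" => "Luffy",
  "image" => "https://my-bucket.s3.amazonaws.com/luffy-image-key" # or nil if no image is attached
}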

The Initialization

The code will be added to the original Gutentag initializer file under config/initializers/gutentag.rb.

# frozen_string_literal: true

require 'extensions/gutentag'

Gutentag.normaliser = lambda { |value| value.to_s }

Rails.application.config.to_prepare do
  begin
    if ActiveRecord::Base.connection.table_exists?(:gutentag_tags)
      Gutentag::Tag.include Extensions::Gutentag
    end
  rescue ActiveRecord::NoDatabaseError
  end
end

The extension file is imported in line 3.

Line 5 is one of the original Gutentag configuration options. It is specific to my project and trivial in relation to the topic of this article. I am leaving it here to show that other Gutentag configuration changes can co-exist with this custom module of mine.

Line 10 is the main line of code to execute. It adds the module to the Gutentag::Tag model, which is defined inside the source code of the Gutentag gem. However, as you can see, it is wrapped in several layers of guarding code. Not doing so will result in errors.

Here is why.

As we are going to involve the ActiveRecord and ActiveSupport railties, which have not been initialized yet during the default Rails initialization phase, we need to ensure we run the code after they have been loaded.

Rails has 5 initialization events. The first initialization event to fire off after all railties are loaded is to_prepare, hence we place our code inside its block.

Since we are interacting with an ActiveRecord model during the initialization phase, it is possible that the table has not been created yet. In other words, the Gutentag tables migration may not have been executed, resulting in errors about the table not existing. An if conditional check is done to prevent this error.

I am not handling the else condition because, under normal circumstances after the proper migration has been executed, this will not happen. A possible scenario where it would happen is during rake tasks that create or migrate the database, neither of which uses the new methods at all.

A non-existing table is not the only thing we have to guard against when dealing with railties during the initialization process. A non-existing database is also a probable scenario, for example during the rails db:create step. Hence, we rescue the ActiveRecord::NoDatabaseError to silence the error. As this is often the only scenario that will happen, I do not handle the exception further in the rescue block.

Usage

Now we can use it in our application. For instance, I can seed some default tags with images attached to them as shown:

# db/seeds.rb
# frozen_string_literal: true

p 'Creating Tags'
[
  'Luffy',
  'Zoro',
  'Usopp',
  'Sanji',
  'Nami',
  'Chopper',
  'Robin',
  'Frankie',
  'Brooks',
].each do |name|
  tag = Gutentag::Tag.create!(name: name)
  tag.image.attach(io: File.open("#{Rails.root.join('app', 'assets', 'images')}/#{name}_avatar.jpg"), filename: "#{name}_image.jpg")
  tag.save!
end
p 'Tags created'

Then in my api response for listing the tags, I can use the json_attributes method as such:

# app/controllers/api/v1/tags_controller.rb
# frozen_string_literal: true

module Api
  module V1
    class TagsController < Api::BaseController
      def index
        @tags = Gutentag::Tag.order(:name)
      end
    end
  end
end


# app/views/api/v1/tags/index.json.jbuilder
json.tags do
  json.array! @tags do |tag|
    json.merge! tag.json_attributes
  end
end

How To Change Or Add New SSH Key for EC2

This is a documentation of how to change or add a new ssh key for your EC2 instance if you have lost, and maybe compromised, your private key.

The gist of it is to add a new key pair to the disk volume of the EC2 instance. Pretty straightforward! But how can you do that when you cannot ssh into the EC2 instance after losing the private key? You will need to attach the root volume of the EC2 instance to another temporary EC2 instance, which you can access with a new key pair, and add the new key pair to the original volume from there.

Summon the NewKeyPair!

First, create a new key pair. You can either generate a private and public key pair on your own and import the public one into the AWS console, or create it from the AWS management console and download the private key that they generated for you thereafter. Should you go for the latter, make sure your browser is not blocking the download.

Blocked download | vic-l

For the rest of the article, the new key pair will be referred to as NewKeyPair, and the old key pair LostKeyPair.

Retire The Veteran

Stop your old instance. Do not terminate!

NOTE: Your instance's root volume needs to be EBS-backed and not an instance store, as instance store volumes are ephemeral. They do not persist data after power down.

Once it has successfully stopped, you will realise that its volume remains attached. That’s EBS for you!

We will come to detaching it in a while. For now, spin up a new server.

Katon: Summon-The-New-Server-Jutsu

Launch a new server with the NewKeyPair. This is a temporary server and can be any Linux distribution.

Detach The Old Volume

In the volumes page, select the old instance volume and select Detach as shown. There should be no error unless your old instance is still in the process of shutting down.

Detach old EBS Volume | vic-l

Once it is detached, you will observe that its status has changed to available and its Attachment Information will become blank. Now it is freeee! Time to attach it to the new server and receive its new key pair.

Attach To New Instance

Attach the root volume to the new instance as shown.

Then select the device to mount on.

attach volume device | vic-l

I will set /dev/sdf as suggested. The other devices are reserved for the root volume (/dev/sda) and instance store volumes (/dev/sd[b-e]). More information on device naming in AWS EC2 can be found here.

Run the command lsblk to see the newly attached volume. Note that the linux kernel has changed the device name from sdf to xvdf, as noted in the warning callout in the image above.

lsblk | vic-l

Mounting The Volume

You will not be able to use the volume right away after attaching it without mounting it in the system. Mounting tells the EC2 instance how to access this new device through its directory tree. This requires setting up a mount point. Run the commands below.

sudo mkdir /mnt/tempvol
sudo mount /dev/xvdf1 /mnt/tempvol

These commands will mount the root of the device to the directory named /mnt/tempvol. You can change directory into the volume and see that it contains content from your old server.

From the image above, you can see that the authorized_keys file containing the old public key sits in the /home/ubuntu/.ssh directory relative to the mount point. The new public key exists in the /home/ubuntu/.ssh directory on the absolute path, which lives on the root volume of the new instance.

Adding The NewKeyPair To The Old Volume

Eventually, we want to use the new key pair to access the old server, with the content of the old volume, just like the good old times. To do that, add the NewKeyPair's public key to the .ssh folder of the old volume.

Realise that this is only possible because we attached and mounted the old volume to a new instance, which we can access thanks to its fresh setup and newly created key pair.
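Concretely, on the temporary instance, the append can be done with something like the command below, assuming an Ubuntu AMI and the /mnt/tempvol mount point from the earlier step:

sudo sh -c 'cat /home/ubuntu/.ssh/authorized_keys >> /mnt/tempvol/home/ubuntu/.ssh/authorized_keys'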

I have used the append operator, >>, instead of the overwrite operator, which is a single >. This is not strictly necessary. It is up to you to decide whether you want to keep the old key pair, depending on your situation.

If you lost your old key pair, feel free to overwrite it. There is no point hoarding it, and Marie Kondo can’t help you declutter software.

Attaching The Volume Back

Next, you can shut down your new server and attach your old volume back to the old instance. Remember, the EC2 instance will not be deleted and is still available if you chose stop instead of terminate when shutting it down initially.

attach volume back | vic-l

Attach the volume, this time, to /dev/sda1.

attach to sda1 | vic-l

The reason to attach it at /dev/sda1 is that we need to give the instance back its root volume for its boot operations. If we were to attach it to another device, you will see this error when starting the server because no root volume is detected.

error starting old instance | vic-l

Back To The Past

Now you can try to connect to your old instance after it has started up.

NOTE: you will see that the connection instructions still mention the LostKeyPair. Even if you had overwritten your ssh key pair with the new one, these outdated instructions will persist. Of course, you should connect with your NewKeyPair.

Connect to old instance again | vic-l
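In practice, the connection command would look something like this, assuming an Ubuntu AMI; substitute your own key path and public DNS:

ssh -i /path/to/NewKeyPair.pem ubuntu@<your-ec2-public-dns>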

To ascertain that you have the old public keys, head down to the ~/.ssh directory and see that the changes you made on the new instance, via attaching and mounting of the old instance's volume, have persisted.

Now, we have successfully added a new key pair to the old instance, and we can use it to ssh into the old instance from now on, even though we lost the original key pair that was used to create it.

Implementing Eslint In Sublime With Airbnb Style Guide

This is a documentation on how to setup eslinting on sublime text editor, bootstrapped with using the style guide set up by Airbnb.

Why Is Eslinting Necessary

Linting your javascript code catches syntax errors, and possibly some runtime errors, while you are coding. It helps you debug faster, putting things like a missing ; or a wrong closure out of the way of your work. No more rolling your eyes and reducing your lifespan over things that shouldn't matter.

eyeroll | vic-l

It will also be helpful if you are working together with other developers. It helps to keep the flavor of the JavaScript code the same across the code base, which can speed up development work and make it enjoyable, at the very least.

Ok, maybe a teeny weeny bit better since we are talking about JavaScript here.

javascript crying | vic-l

Why Airbnb Style Guide

Linting can also be a good teacher. We can follow style guides, or more technically eslint configurations, set up by senpais with telling experience or hailing from reputable tech companies, and when we write code that does not adhere to what they deem best practice, we can learn why and change accordingly.

Google, for instance, has its own JavaScript style guide, which comes with a shareable eslint config that we can just plug and play into our codebase.

However, I would personally suggest Airbnb's style guide because of its well-documented reasons for implementing each rule. In its style guide page, it states the reason for implementing certain rules and gives examples of good and bad code. Here is an image to better explain this point.

Airbnb javascript style guide | vic-l

This is an accumulation of a wealth of knowledge from brilliant programmers that is made easily accessible to us.

Sometimes, we may even learn some quirks of JavaScript that we never knew existed because we have not experienced the problem before. It is like time traveling to the future by leveraging on the past lessons of those before us.

Installing eslinting

Start off by installing the packages required for eslinting with Airbnb style.

npm i -D eslint \
    babel-eslint \
    eslint-config-airbnb \
    eslint-plugin-import \
    eslint-plugin-jsx-a11y \
    eslint-plugin-react \
    eslint-plugin-react-hooks

I will attempt to explain what each line does.

-D flag

The -D flag installs the packages as development dependencies for the project at hand. It makes sense to install them under the project instead of globally because the same configurations can be shared with other developers working on the same project. And we install them only as development dependencies because they are only required during development work.

eslint

The eslint package is the main package that will handle the linting on Vanilla JavaScript and Vanilla JavaScript only. It is crucial, for the case of JavaScript, to understand the significance of Vanilla JavaScript, and that will require a bit of a history lesson.

babel-eslint

JavaScript is a fairly senior language. It is one of the pioneers among computing languages, which means it inevitably made some mistakes in its syntax design. Over the years, people, or should I say geniuses, were unhappy about it and came up with more efficient ways to write it.

CoffeeScript is one example. Take a JavaScript function that looks like this:

var greet = function(name) {
  return console.log(`Hello, ${name}`);
};

CoffeeScript and its gang of caffeine addicts decided to shrink the amount of code required with the use of indentation. The same definition can be written in CoffeeScript like this:

greet = (name) ->
  console.log "Hello, #{name}"

TypeScript is another example, and its side of the camp stresses the need for JavaScript variables to be type safe, the lack of which has caused many of the JavaScript errors throughout its history. The same function defined in TypeScript looks like this:

var greet = function (name: string): void {
  console.log(`Hello ${name}`);
}

The official standard for JavaScript, however, is the ECMAScript specification, or ES for short (hope that answers the naming of the eslint package). The specification has undergone multiple improvements over the years and new standards were iterated, and Babel, the most widely used compiler for modern JavaScript, has evolved along with it. The same function can be written in modern ECMAScript, compiled with Babel, as such:

var greet = (name) => {
  console.log(`Hello ${name}`);
}

All these various compilers mean that we need a different set of linting rules depending on the compiler you are using, in order to catch the corresponding syntax errors. The Airbnb style guide uses Babel and follows the official modern JavaScript syntax. Hence we are installing the babel-eslint package.

eslint-config-airbnb

This package consists of all the rules that the engineers in Airbnb have deemed best practices and which their company follows.

eslint-plugin-import

Next is the eslint-plugin-import package, which supports linting of file imports using ECMAScript. While the previously mentioned packages watch out for syntax errors in your code, this package looks out for erroneous file imports. These errors may be due to forgetting to export modules that exist in another file, or a wrong spelling in the file name to be imported.

The Other Packages

The other packages are React specific. Airbnb uses React as the main JavaScript framework for their front end, hence their default linting config requires the remaining packages to work, namely eslint-plugin-react, eslint-plugin-react-hooks, and eslint-plugin-jsx-a11y. Without them, you will get missing dependency errors.

Note that it also requires the eslint and eslint-plugin-import packages to work, but since those are not specific to React and are good-to-have tools for development work, I explained a little more about them above.

If you are not using React but still want to use the Airbnb style guide, you will be interested in its base configuration, available as the eslint-config-airbnb-base package.

The .eslintrc

What we have done so far is only download the linting packages. To utilise them, some configuration needs to be set up, and I will be setting it up using .eslintrc. There are a number of different ways to set up the configuration.

Write the .eslintrc as such:

{
  "parser": "babel-eslint",
  "extends": "airbnb",
  "rules": {
    "camelcase": "off",
  }
}

The parser option specifies that we are going to write the latest ECMAScript syntax using Babel as the compiler, which is what the rules in the Airbnb configuration expect. More parser options can be found here.

The extends option tells the linter to use the rules set up by the Airbnb configuration. If this is not specified, no rules will be applied, and you will essentially be writing code with just the default settings of the babel-eslint parser.

The rules option allows you to overwrite rules that might not fit your workflow. I added the example for ignoring warnings on non camel case variables, which I am using right now. I do not follow the camel case practice when naming JavaScript variables because I am working with a Ruby On Rails backend. The Rails framework advocates snake case variable naming, which is different from JavaScript's convention. Hence, I decided to keep the same casing for variable naming so that it is easy to receive data from the API responses, as well as to package requests to send to the backend.

Text Editor Linter

Lastly, we need a package on the text editor you are using to read the eslint packages and configurations to highlight the relevant syntax errors accordingly. For sublime text editor, you need to install these packages.

SublimeLinter
SublimeLinter-eslint

The steps to install sublime text packages can be easily achieved with the Sublime Package Control.

Restarting Sublime Text

The last step is to restart your Sublime Text editor. Open a file with the js extension, and start writing some erroneous code to see the linting in effect!

Terraform With Docker

This is a documentation on how to use Terraform with Docker to provision cloud resources, mainly using AWS as the provider. It contains tips on certain practices that I personally deem best practices, for various reasons.

It revolves around these 3 commands:

docker run -v `pwd`:/workspace -w /workspace hashicorp/terraform:0.12.9 init
docker run -v `pwd`:/workspace -w /workspace hashicorp/terraform:0.12.9 apply
docker run -v `pwd`:/workspace -w /workspace hashicorp/terraform:0.12.9 destroy

The Terraform image comes with terraform as its entrypoint command, so we only append the subcommands init, apply and destroy respectively.

The Flags

The most straightforward way to run Terraform in Docker is to do a docker run with a volume mount connecting the directory where the terraform files are to the working directory in the docker container. Assuming the current working directory is where the files are, we can simply run the commands above.

The -v option mounts your current working directory into the container’s /workspace directory.

The -w flag creates the /workspace directory and sets it as the new working directory, overriding the terraform image's original.

To verify this, we can run the command below to see that the current working directory in the container is in fact /workspace.

docker run -v `pwd`:/workspace -w /workspace --entrypoint /bin/sh hashicorp/terraform:0.12.9 -c pwd

Over here we are overwriting the default entrypoint command of Terraform to run a shell command.

terraform init

The init command will download the necessary provider files and modules required for the execution into the working directory in the container. And due to the volume mount, the files will be reflected in the current working directory on the local machine. The files are downloaded into the .terraform folder.

It is paramount to have these files downloaded to the current directory because, on subsequent runs, they will not need to be downloaded again since they are persisted and available.

terraform apply

apply is the command to deploy the resources to the cloud.

If you do not use a Terraform backend, the tfstate file that holds all the information for the provisioning will be written to the working directory, and in turn to the current directory.

If you do use a Terraform backend, there will be no tfstate file written locally. It will be written to the backend that you specified. However, in the event that a Terraform deployment fails, you will have an errored.tfstate file written to the working directory. This errored.tfstate file is extremely important for keeping track of the state of the provisioned environment in the event of failures.

A possible failure scenario is lost connectivity. I encountered that when I was traveling in some remote areas of Brazil. The volume mount saved my life. I am pretty surprised that there is not much documentation on this in the official terraform docs. I tried googling the query below but there was no result.

site:https://www.terraform.io/ "errored.tfstate"

Without the errored.tfstate file, undesirable duplicate resources may be created on subsequent deployments. In other cases, the subsequent deployment itself might fail because resources with the same identifiers are prohibited in AWS, which would not have happened if not for the out-of-sync state.

To update the state, we can run the command

docker run -v `pwd`:/workspace -w /workspace hashicorp/terraform:0.12.9 state push errored.tfstate

Apart from the errored.tfstate file, the Terraform log file that you specify will also be written, which may be used for debugging your Terraform deployments.
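For instance, using Terraform's standard TF_LOG and TF_LOG_PATH environment variables, the log can be written into the mounted workspace so that it survives the container, along these lines:

docker run -v `pwd`:/workspace -w /workspace \
  -e TF_LOG=DEBUG -e TF_LOG_PATH=/workspace/terraform.log \
  hashicorp/terraform:0.12.9 apply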

terraform destroy

Lastly, destroy is the command to remove the resources from the cloud.