How To Remove Fingerprinting In Assets With Rails + Webpacker

In order to remove fingerprinting, that is the hash value appended to the file name of a compiled asset, be it a javascript, css or image file, you will need to configure the webpacker environment configuration files differently for each asset.

Due to the their different configurations, each method is different and I will try to explain why and how I got to this solution in this article.

TLDR

// config/webpack/environment.js

const { environment,  config } = require('@rails/webpacker')
//* REMOVE FINGERPRINT for js
environment.config.set('output.filename', 'js/[name].js')
//* REMOVE FINGERPRINT for images
const fileLoader = environment.loaders.get('file')
fileLoader.use[0].options.name = function(file) {
  if (file.includes(config.source_path)) {
    return 'media/[path][name].[ext]'
  }
  return 'media/[folder]/[name].[ext]'
}
//* REMOVE FINGERPRINT for css
const miniCssExtractPlugin = environment.plugins.get('MiniCssExtract')
miniCssExtractPlugin.options.filename = 'css/[name].css'
miniCssExtractPlugin.options.chunkFilename = 'css/[name].chunk.css'
module.exports = environment

Note that these codes are placed in the environment.js file because I wanted to apply the configuration to all my environments. If you need to apply to only 1 environment, go on to read my explanation of each step so that you understand the concept and can yield the code according to how you like and not the other way round.

Motivation

So I have a project where the identical code base needs to be hosted on 2 servers that each handles different traffic based on the requested paths.

Both servers sit behind a CDN, AWS Cloudfront in my case, and are 2 of the many origins I have set up. The CDN is configured to route, for explanation sake, paths starting with /onepiece to server 1, and the rest to server 2.

This setup poses a challenge where webpages going to server 1 will still be requesting assets, namely javascript, css and images files, that are configured to come from server 2.

The problem here is these files do not exist on server 2! This is because the fingerprint value generated by each server during compilation is different.

TODO: I need to double check on this point because wouldn’t this make caching problematic if deployment, and thus compilation, is frequent and the hash keeps changing?

This messes up the page styling and it seems that the best way is to remove the generated fingerprint during asset compilation, at least for my case.

Now I know, there is a good reason for the existence of fingerprinting, but for my particular case, I needed to get rid of it. I know what I’m doing, I hope.

So, 始ります!

Remove Fingerprint For Javascript Files

Javascript files are the easiest to have its fingerprint removed, or have their file names configured.

These are the only files the only files that webpacker can read, compile and output natively without any plugins or loaders.

Hence, changing its output name is as simple as 1 line of code as shown below.

environment.config.set('output.filename', 'js/[name].js')

We are removing the [contenthash] in the file name, from its original configuration of js/[name]-[contenthash].js, which is the variable that is telling the compiler to add a hash in the output file name.

In case you are wondering how do I get the default setting, you can do a console.log(environment.config.output.filename) and run the compile command to identify it. Of course it will be better if you can go to the source code of webpacker to confirm this but ain’t nobody got time for that.

This code can be placed in config/webpack/environment.js to standardise the configuration across all environment, or in config/webpack/production.js to execute only in the production environment.

Note that you are targeting only javascript files here. Other assets like css and images are handled differently. Hence, the extension is always js.

Some issues on Github complained about that [ext] does not work, and, I believe, is due to some misconception on how Webpacker work (Disclaimer: I don’t really know how it works either).

Of course it is not going to work here because the output of only javascript and javascript only files are configured in this step, hence there is no need for a variable to dynamically determine the extension value.

Remove Fingerprint For Images

So when do we use [ext]? Well, that’s in this section where we are configuring other assets like images, gifs and fonts (maybe even audio and video but I have not tried them), where there are multiple formats.

And this configuration is not performed on webpacker itself, but on file-loader.

file-loader is a loader that is included by default in Rails’ webpacker to handle the compilation of non javascript and css assets. We change the output of the file name here.

const { environment,  config } = require('@rails/webpacker')
...
const fileLoader = environment.loaders.get('file')
//* to see definition of name function
// console.log(fileLoader.use[0].options.name.toString());
fileLoader.use[0].options.name = function(file) {
  if (file.includes(config.source_path)) {
    return 'media/[path][name].[ext]'
  }
  return 'media/[folder]/[name].[ext]'
}

Since the file-loader is already part of the webpacker settings, we first need to get our hands on that object, and line 3 shows how it is done.

Next, we change the name option. This option is in charge of the format of the output and under the default settings of webpacker, it is a function with some conditional logic. You can see the original function using the code snippet in line 5. I basically remove the hash variable.

And yes, [ext] will work here, in file-loader.

Notice in line 7, there is this config.source_path. A little explanation about where it came from.

In the original code, it is “spelled” sourcePath. And if we were to just copy it and run our compilation code, it will throw an undefined value. So we have to understand its origins in order to provide the same object as the argument.

Basically, it is an attribute of the webpacker config object that has undergone an es6 destructuring to change from snake to camel case, and its value is populated from a yml file. You can do a search on sourcePath and source_path in the webpacker repository on Github to learn more about it.

And where do we get the config object? Back in line 1, we exported it along side environment. In the default environment.js file, this is not the default setting; only environment is imported so take note here.

With that, and understand the various placeholders of the file-loader loader, you can probably understand the logic behind the Rails webpacker team on generating the asset files, and change the configuration according to your needs.

Now on to the css files.

Remove Fingerprint For CSS

Here is yet another way to handle assets compilation.

For images, loaders were use. For css, we use plugins. The MiniCSSExtractPlugin is used by default in webpacker to minimize the final css files after they have gone through pre processes, like converting from sass, and post processes, like postcss, and output the minimized file. Naturally, here is where the output file name is configured.

const miniCssExtractPlugin = environment.plugins.get('MiniCssExtract')
miniCssExtractPlugin.options.filename = 'css/[name].css'
miniCssExtractPlugin.options.chunkFilename = 'css/[name].chunk.css'

Line 1 shows how we can get access to the plugin object that is already configured by default in webpacker.

In lines 2 and 3, we change the file name by removing the hash variable from the default value.

Some solutions that I have found mentioned they append the plugin. This still works, but it will generate both css files with and without the hash in the file name. And this is because 2 instances of MiniCSSExtractPlugin is executed in the process.

Conclusion

There is quite a lot of abstractions here for this simple configuration, and the documentation is not the best for webpacker. I guess the webpacker team is still trying to optimize webpacker and would rather focus on that than the API and the documentation. Hopefully that will change in the future!

Ruby Threads Are Concurrent, Not Parallel, Thanks To Global Interpreter Lock

This is a concept in Ruby, at least for Ruby 2 before the 2020 Christmas present by Matz and team, where threads can only run concurrently, but not in parallel.

This distinction needs to be clear.

In parallelism, multiple threads run together and doing stuff like iops, switching and allocating memory and variables, as well as making HTTP request.

On the other hand in concurrency, it is only when 1 thread becomes idle, maybe from making a HTTP request that will take at least finite amount of time to return, does another thread execute its program. This is the principle that ruby threads adhere to, at least if they are running under Ruby MRI which is the default.

The reason for this design is to prevent race conditions and uphold thread safety, at the expense of parallelism and thus performance, which is totally understandable. And the guardian of this job is none other than the Global Interpreter Lock (GIL).

Note to self: The GIL is analogous to the event loop mechanism in the Javascript engine.

To illustrate this concept better, below is a simple sinatra application that opens up 2 routes. Both executes 2 commands of sleep, but one of them is done using threads.

require 'sinatra'

get '/sleep' do
  start_time = Time.now

  2.times do
    sleep 2
  end

  elapsed_time = Time.now - start_time
  elapsed_time.to_s
end

get '/sleep_with_thread' do
  start_time = Time.now

  threads = []
  2.times do
    threads << Thread.new do
      sleep 2
    end
  end

  threads.each(&:join)

  elapsed_time = Time.now - start_time
  elapsed_time.to_s
end

As sleep is an idle operation, we would expect the GIL to release the lock on the first thread and execute the next thread. This will result in the elapsed_time for the sleep_with_thread route to be around 2 seconds, while that for the sleep route will be accumulatively at around 4 seconds, as shown in the screen shots below.

However, things will behave differently if it was not an idle command like sleep. Let’s setup 2 routes that goes to work rather than sleep!

require 'sinatra'
 get '/work' do
   start_time = Time.now
 2.times do
     string = ''
     50_000.times do
       string += '仕事!'.freeze
     end
   end
 elapsed_time = Time.now - start_time
   elapsed_time.to_s
 end

 get '/work_with_thread' do
   start_time = Time.now
 threads = []
   2.times do
     threads << Thread.new do
       string = ''
       50_000.times do
         string += '仕事!'.freeze
       end
     end
   end
 threads.each(&:join)
 elapsed_time = Time.now - start_time
   elapsed_time.to_s
 end

Note: The freeze command is a mere memory optimization and does not affect the response time in any significant manner.

The elapsed_time between the 2 operations are similar, as shown in the screenshots below.

This clearly depicts that the 2 threads does not run simultaneously, which would otherwise have resulted in the operation concluding in half the time shown.

Instead, the 2 threads are executed synchronously, one after another, because the operations involved are not idle.

This little experiment clearly defines the difference between concurrency versus parallelism, and portrays the limited capability of using threads in Ruby.

Conclusion

Hence, only spawn new threads when there is an idle operation involved, with the most practical example being making a HTTP request, in order to take advantage of threading in ruby.

Otherwise, you will be doing a move as redundant as “taking off your pants to fart”.