Listing the contents of a remote ZIP archive, without downloading the entire file

Why I needed to do this is a longer story, but this was a question I was looking for an answer to.

Initially it led me to the following SO question:

Is it possible to download just part of a ZIP archive (e.g. one file)?

Not exactly the same problem but close enough. The suggestion here is to mount the archive using HTTPFS and use normal zip tools. The part of the answer that caught my eye was this:

This way the unzip utility’s I/O calls are translated to HTTP range gets

https://stackoverflow.com/a/15321699

HTTP range requests are a clever way to get a web server to only send you parts of a file. It requires that the web server supports it though. You can check if this is the case with a simple curl command. Look for accept-ranges: bytes.

I’ve added a simple test archive, with some garbage content files, as a test subject here:

$ curl --head https://rhardih.io/wp-content/uploads/2021/04/test.zip
HTTP/2 200
date: Sun, 18 Apr 2021 14:01:29 GMT
content-type: application/zip
content-length: 51987
set-cookie: __cfduid=d959acad2190d0ddf56823b10d6793c371618754489; expires=Tue, 18-May-21 14:01:29 GMT; path=/; domain=.rhardih.io; HttpOnly; SameSite=Lax
last-modified: Sun, 18 Apr 2021 13:12:45 GMT
etag: "cb13-5c03ef80ea76d"
accept-ranges: bytes
strict-transport-security: max-age=31536000
cf-cache-status: DYNAMIC
cf-request-id: 0986e266210000d881823ae000000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
report-to: {"group":"cf-nel","endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report?s=mQ4KV6cFG5W5iRV%2FSdu5CQXBdMryWNtlCn8jA29dJC44M8Hl5ARNdhBrIKYrhLCdsT%2FbD8QN07HEYgtWDXnGyV%2BC%2BA2Vj6UTFTC6"}],"max_age":604800}
nel: {"report_to":"cf-nel","max_age":604800}
server: cloudflare
cf-ray: 641e6ce9cf77d881-CPH
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400

This got me thinking: might it be possible to construct some minimal set of requests that fetches only the part of the ZIP file containing information about its contents?

I didn’t really know anything about the ZIP file format beforehand, so this might be trivial if you’re already familiar with it, but as it turns out, ZIP files keep information about their contents in a data block at the end of the file, called the Central Directory.

This means that only this part of the archive is required in order to list out the contents.

On-disk layout of the ZIP file format, including the ZIP64 extensions.
https://commons.wikimedia.org/wiki/File:ZIP-64_Internal_Layout.svg

HTTP range requests are specified by setting a header of the form Range: bytes=<from>-<to>. That means that if we can somehow get hold of the byte offset of the Central Directory, and how many bytes it takes up in size, we can issue a range request whose response should carry only the Central Directory.

The offsets we need are both part of the End of central directory record (EOCD), another data block, which appears after the Central Directory, as the very last part of the ZIP archive. It has variable length, due to the option of including a comment as the last field of the record. If there’s no comment it should only be 22 bytes.
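
For reference, the fixed part of the EOCD record is laid out as follows (all integers stored little-endian, per the ZIP specification):

   offset  size  field
   0       4     signature (0x06054b50)
   4       2     number of this disk
   6       2     disk where the Central Directory starts
   8       2     Central Directory records on this disk
   10      2     total number of Central Directory records
   12      4     size of the Central Directory in bytes
   16      4     offset of the Central Directory from start of file
   20      2     comment length (n)
   22      n     comment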

Back to square one. We have to solve the same problem to get just the EOCD, as we have for the Central Directory. Since the EOCD is at the very end of the archive though, its to offset corresponds to the Content-Length of the file. We can get that simply by issuing a HEAD request:

$ curl --head https://rhardih.io/wp-content/uploads/2021/04/test.zip
HTTP/2 200
date: Sun, 18 Apr 2021 14:45:22 GMT
content-type: application/zip
content-length: 51987
set-cookie: __cfduid=dd56ae29f49cf9931ac1d5977926f61c01618757122; expires=Tue, 18-May-21 14:45:22 GMT; path=/; domain=.rhardih.io; HttpOnly; SameSite=Lax
last-modified: Sun, 18 Apr 2021 13:12:45 GMT
etag: "cb13-5c03ef80ea76d"
accept-ranges: bytes
strict-transport-security: max-age=31536000
cf-cache-status: DYNAMIC
cf-request-id: 09870a92ce000010c1d6269000000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
report-to: {"group":"cf-nel","endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report?s=Ko46rGYqFfKG0A2iY93XNqjK7PSIca9m9AK5iX9bfUUYr0%2BzdzjMN1IJXQ%2Fn5zjj%2B96d2%2Bnaommr%2FOUaGrzKpqyUjaeme0HGvA1z"}],"max_age":604800}
nel: {"report_to":"cf-nel","max_age":604800}
server: cloudflare
cf-ray: 641ead314d8710c1-CPH
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400

In the case of the test file, that’s 51987 bytes. So far so good, but here’s where we have to cut some corners. Since the comment part of the EOCD is variable length, we cannot know the proper from offset, so we’ll have to make a guess here, e.g. 100 bytes:

$ curl -s -O -H "Range: bytes=51887-51987" https://rhardih.io/wp-content/uploads/2021/04/test.zip
$ hexdump test.zip
0000000 7c 60 dd 2f 7c 60 50 4b 01 02 15 03 14 00 08 00
0000010 08 00 5b 79 92 52 58 64 08 f4 05 28 00 00 00 28
0000020 00 00 0e 00 0c 00 00 00 00 00 00 00 00 40 a4 81
0000030 44 a1 00 00 72 61 6e 64 6f 6d 2d 62 79 74 65 73
0000040 2e 31 55 58 08 00 ba 2f 7c 60 dd 2f 7c 60 50 4b
0000050 05 06 00 00 00 00 05 00 05 00 68 01 00 00 95 c9
0000060 00 00 00 00
0000064

Since we most likely have preceding bytes that we don’t care about, we need to scan the response until we find the EOCD signature, 0x06054b50, which is stored little-endian, i.e. as the byte sequence 50 4b 05 06. From there, extracting the offset and size of the Central Directory is straightforward. In the dump above we find an offset of 0x0000c995 and a size of 0x00000168 (or 51605 and 360 in base 10 respectively).

One more curl command to get the Central Directory:

$ curl -s -O -H "Range: bytes=51605-51987" https://rhardih.io/wp-content/uploads/2021/04/test.zip

Notice I’m including the EOCD here, but that’s just so we can use zipinfo on the file. Strictly, the to offset should be 51964 (51605 + 360 - 1), since range offsets are inclusive.

Here’s a zipinfo of the original file:

$ zipinfo test.zip
Archive:  test.zip
Zip file size: 51987 bytes, number of entries: 5
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.3
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.4
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.5
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.2
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.1
5 files, 51200 bytes uncompressed, 51225 bytes compressed:  0.0%

And here it is of the stripped one:

$ zipinfo test.zip
Archive:  test.zip
Zip file size: 382 bytes, number of entries: 5
error [test.zip]:  missing 51605 bytes in zipfile
  (attempting to process anyway)
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.3
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.4
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.5
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.2
-rw-r--r--  2.1 unx    10240 bX defN 21-Apr-18 15:10 random-bytes.1
5 files, 51200 bytes uncompressed, 51225 bytes compressed:  0.0%

Ruby implementation

A bunch of curl commands is all well and good, but in my case I actually needed it as part of another script, which was written in Ruby.

Here’s a utility function which does basically the same thing as above and returns a list of filenames:

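A minimal sketch, assuming a plain, non-ZIP64 archive whose comment fits within the tail guess (the function name is made up for illustration):

require "net/http"
require "uri"

def remote_zip_filenames(url, tail_guess = 100)
  uri = URI(url)

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    # HEAD request to learn the total size of the archive.
    length = http.head(uri.request_uri)["Content-Length"].to_i

    # Fetch the last tail_guess bytes, which should contain the EOCD.
    tail = http.get(uri.request_uri,
                    "Range" => "bytes=#{length - tail_guess}-#{length - 1}").body.b

    # Scan backwards for the EOCD signature, 0x06054b50, stored
    # little-endian in the file as the bytes "PK\x05\x06".
    eocd = tail.rindex("PK\x05\x06".b) or raise "EOCD not found"

    # The Central Directory size and offset are 32-bit little-endian
    # integers at offsets 12 and 16 of the EOCD record.
    cd_size, cd_offset = tail[eocd + 12, 8].unpack("VV")

    # Fetch just the Central Directory.
    cd = http.get(uri.request_uri,
                  "Range" => "bytes=#{cd_offset}-#{cd_offset + cd_size - 1}").body.b

    # Walk the central directory file headers ("PK\x01\x02") and
    # collect the filenames.
    names = []
    pos = 0
    while cd[pos, 4] == "PK\x01\x02".b
      name_len, extra_len, comment_len = cd[pos + 28, 6].unpack("vvv")
      names << cd[pos + 46, name_len]
      pos += 46 + name_len + extra_len + comment_len
    end
    names
  end
end

puts remote_zip_filenames("https://rhardih.io/wp-content/uploads/2021/04/test.zip")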

Obviously this whole dance might be a bit of an over-complication for smaller ZIP files, where you might as well just download the whole thing and use normal tools to list out the contents, but for very large archives maybe there’s something to this trick after all.

If you know of a better or easier way to accomplish this task, feel free to leave a comment or ping me on Twitter.

Over and out.

Addendum

After posting this, it was pointed out to me that the initial HEAD request is redundant, since the Range header actually supports indexing relative to the end of the file.

I had a hunch this should be supported, but as it wasn’t part of one of the examples on the MDN page, I overlooked it.

In section 2.1 Byte Ranges of RFC 7233, the format is clearly specified:

   A client can request the last N bytes of the selected representation
   using a suffix-byte-range-spec.

     suffix-byte-range-spec = "-" suffix-length
     suffix-length = 1*DIGIT

This means we can start right from the initial GET request and just specify a range for the last 100 bytes:

$ curl -s -O -H "Range: bytes=-100" https://rhardih.io/wp-content/uploads/2021/04/test.zip

Here’s the updated Ruby script to match:
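
The only part that changes is how the tail is fetched; the HEAD request goes away and the first GET uses a suffix range instead (a sketch of the relevant lines):

# Same as before, minus the HEAD request; the tail is now fetched
# with a suffix range relative to the end of the file.
tail = http.get(uri.request_uri, "Range" => "bytes=-#{tail_guess}").body.b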

Migrating from LastPass to pass

I’ve been using LastPass as a password manager since 2012 and even paid for the Premium subscription, so I could access it from my phone. That specific feature later turned free, and has recently turned paid again. It’s not that I’m unhappy with LastPass per se, or that I wouldn’t want to pay for the service again. I’ve just had my eye on alternatives with a bit more control for a while, e.g. self-hosted Bitwarden. I never did get around to doing it however. It just seemed like a little too much effort for little gain.

That being said, ever since a colleague introduced me to The standard unix password manager, pass, I’ve started to use it alongside LastPass and think it’s a brilliant tool. For that reason, I’ve been wanting to do a setup based on it, to replace LastPass entirely.

Overall pass mostly offers the same core feature as LastPass, in the sense that it provides a means to store a collection of passwords and notes in an encrypted manner. LastPass has a few more bells and whistles though. To name a few I’ve found useful:

  • Automatic synchronisation between devices.
  • Access from mobile.
  • Autofill in the browser.

Export / Import

Luckily getting passwords out of LastPass is very easy. The extension provides a direct file download from Account Options > Advanced > Export > LastPass CSV File.

The question is then, how to move the contents of this file into the pass store at ~/.password-store.

There’s no support for importing arbitrary CSV files in pass itself, so my first thought was to write a small script that would go through the file line by line and issue a corresponding pass insert command for each entry.
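
Something along the lines of this hypothetical sketch (the column names are assumptions based on the export’s header row):

require "csv"
require "open3"

CSV.foreach("lastpass_export.csv", headers: true) do |row|
  name  = "#{row["grouping"]}/#{row["name"]}"
  entry = "#{row["password"]}\nurl: #{row["url"]}\nuser: #{row["username"]}\n"
  # `pass insert -m` reads a multi-line entry from stdin.
  Open3.capture2("pass", "insert", "-m", name, stdin_data: entry)
end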

Luckily, before doing that, I came to my senses and looked for existing solutions first.

On the homepage of pass there’s a link to an extension for this very problem, pass-import, and it supports importing from a LastPass vault. It’s not written by the author of pass though, so the question is whether it can be trusted.

Deciding to feed the entirety of your password portfolio to some tool you’ve found online shouldn’t be done lightly, but the fact that it is linked from the official homepage, and appears to have some activity and approval on GitHub, does make it seem trustworthy.

I did look a bit through its source as it appears on GitHub, but didn’t really know what specifically to look for. Also I didn’t find anything that smelled outright like, “send this file over the internet”. For that reason, and the above, I decided to throw caution to the wind and give it a go.

Installing the extension

My install of pass comes from Homebrew, but unfortunately only the pass-otp and pass-git-helper extensions seem to be available as formulae:

$ brew search pass
==> Formulae
gopass              lastpass-cli        pass-otp ✔          passwdqc
gopass-jsonapi      pass ✔              passenger           wifi-password
keepassc            pass-git-helper     passpie

Bummer… It is available from pip though:

pip3 install pass-import

This way of installing it doesn’t integrate with pass in the normal way, however:

$ pass import
Error: import is not in the password store.

Not to worry, it is possible to invoke the extension directly instead:

$ pimport pass lastpass_export.csv --out ~/.password-store
 (*) Importing passwords from lastpass to pass
  .  Passwords imported from: /Users/rene/lastpass_export.csv
  .  Passwords exported to: /Users/rene/.password-store
  .  Number of password imported: 392

Marvellous! All of my LastPass vault is now in the pass store.

Missing features

Most important of all is probably synchronising the store between different machines, so let’s take a look at that first.

LastPass synchronises its vault via its own servers. With pass you could achieve something similar by putting your store on Dropbox or Google Drive for instance, but there’s another alternative I think is even better. There’s an option in pass to manage your store as a Git repository, and as such, it allows pushing and pulling from a central Git server.

Now, the point of this whole exercise is to have a solution that provides a lot more control, and even though everything in the pass store is encrypted, pushing it all to e.g. GitHub defeats the purpose somewhat.

Gitea

I’ve opted to self-host an instance of Gitea and use it to host the Git repository for my pass store. UI-wise, it’s a lightweight clone of GitHub, so it appears very familiar. I won’t get into the specifics of how to set it up here, but the official docs are quite good on that front anyway; Installation with Docker.

The man page for pass has a very good example of migrating the store to use Git in the “EXTENDED GIT EXAMPLE” section.
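
In short, it boils down to something like this (the remote URL is made up for illustration):

$ pass git init
$ pass git remote add origin git@git.example.com:rene/password-store.git
$ pass git push -u origin master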

Whether you choose to self-host or use another service for Git, after you’ve initialised the repository through pass and done an initial push, everything should show up in your remote. Encrypted of course.

Cloning the repository on a new machine and copying over your GPG key is basically all that’s needed. Here’s an example moving a GPG key from one machine to another:

hostA $ gpg --list-keys # Find the key-id you need
hostA $ gpg --export-secret-keys key-id > secret.gpg

# Copy secret.gpg to hostB

hostB $ gpg --import secret.gpg

You might need to trust the key on the new machine, but from then on, synchronising additions and removals from the store is just a git push/pull away.
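
Setting the trust is done interactively (at the prompt, choose “5 = I trust ultimately” and quit):

hostB $ gpg --edit-key key-id
gpg> trust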

What about mobile?

I’m an Android user, so for me it turns out this is an easy addition. You only need two apps:

Password Store handles the git interactions and browsing of the password store, whilst OpenKeychain allows you to import your GPG key to your phone.

That part was definitely easier than expected, but obviously this is assuming your phone can reach your git server. I put my Gitea installation on a publicly resolvable hostname, so this was not a problem in my case. Using a VPN client on the phone might also be an option, if you don’t want anything public facing.

The Password Store app doesn’t seem to do autofill, but to be honest, the way overlays work in Android is kind of annoying anyway. I don’t suspect I’ll miss it. A bit of copy pasting is fine by me.

Edit 21/03/21: The Password Store app does support autofill! I just didn’t see it in the input field drop-downs alongside LastPass when I did my initial testing as it is not enabled by default. My apologies for the misinformation.

There’s a radio toggle in Settings > Enable Autofill, and enabling it will add Password Store as a system-wide Auto-fill service:

Password Store in the Auto-fill service settings

It works similarly to the way LastPass worked, where you can invoke it from an empty input or password field in other apps.

Wrap up

As of today LastPass is gone from my phone, but I’ll keep the browser extension on the desktop around for a little while. Just as a backup.

On the subject of browser extensions, there is browserpass-extension. It requires a background service to interface with pass, which is to be expected, but also a little messy, so for now I’ll see if I can get by without it.


Benchmark numbers for Tesseract on Android

Here’s an interesting find I came upon recently.

In the process of moving an Android app project of mine, Camverter, off the now ageing version r10e of the Android NDK and onto version r18b, I had to rebuild some of its dependencies as well, in order to maintain standard library compatibility.

In r18b, GCC has been removed in favour of Clang, so I decided I wanted to gauge any performance differences that change might have incurred. One such dependency is Tesseract, which is a main part of the OCR pipeline of the app.

In this post we’ll be looking at how it performs, with versions built by both compilers.

Build

To build an Android compatible shared library of Tesseract, I’m using a homegrown build system based on Docker, aptly named Building for Android with Docker, or bad for short. I won’t go into details about that project here, but feel free to check it out nonetheless.

The Dockerfiles I’ve used for each version of the NDK are linked here:

Aside from using different NDKs and a few minor changes to the configure flags, they’re pretty much alike.

To note, in both cases the optimisation level was set to -O2:

root@3495122c4fa2:/tesseract-3.05.02# grep CXXFLAGS Makefile
CXXFLAGS = -g -O2 -std=c++11

Testing

To benchmark the performance of Tesseract, I’ve added a very simple test that runs OCR on the same 640×480 black and white test image, ten times in a row [1], and then outputs the average running time:

Test Image

Full source of the test available here.

Test devices

Currently the range of my own personal device lab only extends to the following two devices. It’s not extensive, but given their relative difference in chipset, I think they’ll provide an adequately varied performance insight:

  • Sony Xperia XZ1 Compact (Snapdragon 835 running Android 9).
  • Samsung Galaxy S5 G900F (Snapdragon 801 running Android 6.0.1).

Results

In the chart below, it’s apparent that there’s a significant performance increase to be gained by switching from GCC to Clang.

This roughly translates into a 28.3% reduction in execution time for the Samsung, and a whopping 38.7% for the Sony. Thank you very much, Sir Clang!

Graph of execution times for Tesseract v3.05.02

Bonus

Additionally, I decided to run a similar test for version 4.0.0 (non-LSTM) of Tesseract as well. The test source and Dockerfile for building v4.0.0 are likewise available in the bad repository.

In this instance however, I simply couldn’t get a successful build with r10e, hence I only have numbers for Clang.

Graph of execution times for Tesseract v4.0.0

Once again, there’s a handy performance increase to be had.

Comparison

Execution time reduction for v4.0.0 (r18b / Clang), compared to v3.05.02:

        r10e     r18b
S5      39.2%    15.3%
XZ1     50.5%    19.3%

That’s a 2x increase in performance in the case of the Sony, going from v3.05.02 compiled with GCC to v4.0.0 compiled with Clang.

That is pretty awesome, and I’m sure users of the app will welcome the reduced battery drain.

Footnotes

  1. I believe I got this image originally from the Tesseract project itself, but I’ve failed to find the source.

Behind the scenes of shell IO redirection

In the day-to-day toil on a command line, it can be easy to overlook the complexities behind many of the constructs you use all the time.

In a POSIX shell, one such construct is the ability to pipe between commands, as well as redirect their input and output, with <, > and |.

Let’s stop and smell the roses, and ask: how does this actually work?

As an example, have you ever wondered, what happens under the hood, when you write a command like this?

cat foo.txt > bar.txt

That’s what we’ll take a look at in this post.

dtruss

In order for us to look into the belly of the beast, so to speak, we’ll need a tool to monitor system calls for a given process.

Since I’m doing this on an OS X system, the tool of choice is dtruss, a DTrace version of truss. On Linux strace can be used instead.

If you’re not interested in trying this out for yourself, skip on ahead to the Inspection section.

Preflight checklist

By default dtruss doesn’t work, because of the System Integrity Protection (SIP) security feature of OS X. If you try to attach to a running process, you’ll get this error message from dtrace initially:

$ sudo dtruss -f -p 43334
dtrace: system integrity protection is on, some features will not be available

And then the log will be filled with dtrace errors like this, as soon as the process makes any system calls:

dtrace: error on enabled probe ID 2633 (ID 265: syscall::ioctl:return): invalid user access in action #5 at DIF offset 0

In order to work around this problem, it’s possible to disable SIP for dtrace exclusively. Reboot OS X in recovery mode and enter the following command in a terminal:

csrutil enable --without dtrace

You’ll see the following warning message:

This is an unsupported configuration, likely to break in the future and leave your machine in an unknown state.

That’s ok for now. Restoring the default configuration later can be done with:

csrutil enable

Reboot to normal mode again and open a terminal.

Noise reduction

To reduce the amount of unrelated events in the output from dtruss, it’s a good idea to run commands in a minimal environment without various hooks and other modern shell niceties.

Starting up e.g. a new instance of bash, without inheriting the parent environment or loading a profile or rc, can be done like so:

env -i bash --noprofile --norc

Take-off

In the minimal bash instance just started, get the process ID of the shell:

bash-3.2$ echo $$
529

Now we’re ready to start monitoring. Open up a separate shell; note this doesn’t have to be minimal like above. Start up dtruss, attaching it to the bash process:

$ sudo dtruss -p 529 -f

The -f here makes sure any forked children are followed as well. If all went well, you’ll see this header appear:

PID/THRD SYSCALL(args) = return

Now we’re ready to issue our command with output redirection.

Run

I’m using the following small test file in this example, but any file will do really:
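
For example, something as simple as:

$ echo "Hello, redirection" > foo.txt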

Back in our minimal bash shell, we’ll issue this simple command, redirecting stdout to the file bar.txt:

cat foo.txt > bar.txt

Now let’s take a look at what dtruss has picked up.

Inspection

After running the command, we should see a lot of stuff in the log output from dtruss.

The full output I got from dtruss can be found in this gist. For a better overview, I created a filtered version with irrelevant system calls omitted:

grep -v -E "ioctl|sigaction|sigprocmask|stat64|mprotect" dtruss.log > dtruss.short.log

Here’s the shortened version:

Target file

Quickly skimming the log reveals that we’re looking at two different process ID / thread ID pairs, namely 1436/0x5b3d on lines 1-5 and 36-39, as well as 1458/0x5c1d from 6 to 35.

The reason for this is that the shell utilises a fork-exec approach for running program binaries, e.g. cat, or anything that isn’t a shell builtin really.

The way it works is by the parent process, in this case 1436, calling fork. This makes a copy of the current process and continues execution in both, albeit with some important differences.

In the child, fork returns with a value of zero, and in the parent it returns the process ID of the forked child. That’s how it is determined which of the two will subsequently transform into a new process, through one of the exec family of system calls. In this case the dtrace probe is unable to properly trace it, but on line 21 we see an error for an execve call, so that is most likely the one used here.

From line 6, the log output is coming from the child process. The first lines of interest here are 11-13. Let’s look at them one at a time.

On line 11, we can see an open system call for the file bar.txt returning successfully with a file descriptor value of 3, or 0x3 if you will.

Next on line 12, there is a dup2 call, with the descriptor value for bar.txt and then 0x1.

The man page for dup2 is somewhat awkwardly worded, but in short, this means “change whatever file descriptor 0x1 is pointing to, to whatever file descriptor 0x3 is pointing to”.

We already know 0x3 is a descriptor for bar.txt, but what about 0x1?

In POSIX any process has three standard streams made available by the host system, stdin, stdout and stderr, which by definition have the values 0, 1 and 2.

That means the dup2 call effectively changes the descriptor for stdout to point to the same thing as the descriptor for bar.txt. This is relevant, since cat reads files and writes them to the standard output.

On line 13 there is a close call on the descriptor for bar.txt. Now this may seem weird, since no data has actually been written to the file yet, but keep in mind this is only releasing the file descriptor. It doesn’t do anything to the file itself. Remember the descriptor for stdout now points to bar.txt, so the new descriptor is no longer needed and can just as well be made available to the system again.

Source file

The next lines of interest are 29-33.

On line 29, we again see another open call, but this time for foo.txt. Since the descriptor 0x3 was released on line 13, it is the first one available and is reused here.

On lines 30-31 we see a read call on descriptor 0x3, which puts the content of foo.txt into memory, followed by a write on the stdout descriptor. Remembering that stdout now points to bar.txt, we can assert that the content of foo.txt has been written to bar.txt.

With lines 32-33, a final read on the descriptor of foo.txt returns zero, which indicates end-of-file, followed by an immediate close.

On line 35, the last event from the child process closes stdin, with a call to close_nocancel.

Finally, on line 36, we see control return to the parent process with wait4, which waits for the child process to finish.

After this the log trace ends and the command is done.

Recap

So, to come full circle, when you enter a command like this:

cat foo.txt > bar.txt

What really happens behind the scenes, is the following:

  1. A child process is spawned from the current process.
    1. The child process is transformed to a new process for cat via an exec type call.
    2. bar.txt is opened for writing, creating a new file descriptor.
    3. The file descriptor for stdout is made to point to bar.txt.
    4. The new descriptor is closed.
    5. foo.txt is opened for reading, creating a new file descriptor.
    6. A read to memory from the new descriptor of foo.txt is done.
    7. A write from memory to the descriptor of stdout is done.
    8. The new descriptor of foo.txt is closed.
    9. The descriptor of stdout is closed.
  2. Parent process waits for child to finish.
  3. Done.

It’s not all magic, but pretty close.
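
To get a feel for the mechanics, here’s the same sequence approximated in Ruby, as a rough sketch; IO#reopen performs the dup2 for us:

# Sketch of the shell's fork-exec dance for `cat foo.txt > bar.txt`.
pid = Process.fork do
  # Child: open bar.txt for writing; this takes the first free
  # descriptor (3, if only stdin, stdout and stderr are open).
  out = File.open("bar.txt", File::WRONLY | File::CREAT | File::TRUNC)
  $stdout.reopen(out)            # dup2: point descriptor 1 at bar.txt
  out.close                      # release the extra descriptor again
  Process.exec("cat", "foo.txt") # replace the child image with cat
end
Process.wait(pid)                # parent: wait for the child to finish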

Further reading

References

Ruby on Rails Gotcha: Asynchronous loading of Javascript in development mode

Everyone knows that you shouldn’t block page rendering by synchronously loading a big chunk of Javascript in the head of your page, right? Hence you might be tempted to change the default Javascript include tag, from this:

<%= javascript_include_tag 'application' %>

To this:

<%= javascript_include_tag 'application', async: true %>

Which makes perfect sense when serving all Javascript in one big file, as is the case in production, meaning everything is defined at the same time. What about development though?

Well, in development Rails is kind enough to let you work on individual Javascript files, which means it will recompile only what’s needed when a single file is changed. To this effect, each file is included separately via its own script tag in the header. E.g.:

<script src="/assets/jquery-87424--.js?body=1"></script>
<script src="/assets/jquery_ujs-e27bd--.js?body=1"></script>
<script src="/assets/turbolinks-da8dd--.js?body=1"></script>
<script src="/assets/somepage-b57f2--.js?body=1"></script>
<script src="/assets/application-628b3--.js?body=1"></script>

* Tags intentionally shortened in example.

There is a subtlety here that is quite important. All the scripts are loaded synchronously, one after the other, in the order they appear in the application.js manifest, as shown below. This means we’re guaranteed that jQuery, etc. is available once we get to our own scripts.
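
A typical manifest might look something like this (Sprockets require directives):

//= require jquery
//= require jquery_ujs
//= require turbolinks
//= require_tree .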

Now consider the same scripts, but with async=true:

<script src="/assets/jquery-87424--.js?body=1" async="async"></script>
<script src="/assets/jquery_ujs-e27bd--.js?body=1" async="async"></script>
<script src="/assets/turbolinks-da8dd--.js?body=1" async="async"></script>
<script src="/assets/somepage-e23b4--.js?body=1" async="async"></script>
<script src="/assets/application-628b3--.js?body=1" async="async"></script>

Since all scripts in this case are loaded asynchronously, all previous guarantees are now lost, and we’ll very likely start seeing errors like this:

Uncaught ReferenceError: $ is not defined

Oops!

The fix is simple though: Don’t load Javascript assets asynchronously in development mode!

Here’s one way of doing it:

<%= javascript_include_tag 'application', async: Rails.env.production? %>

Happy hacking!