As expected, serving a local npm repository is a lot less straightforward than serving one for pip and Python. The npm registry uses the CouchDB
NoSQL server as a backend to resolve dependencies and serve resource metadata. The npm
CLI tool expects a corresponding JSON response to requests that cite the name (and not the path) of the package.
Let's ask the registry [1] for the package ftp, and have a look at the response (excerpt):
$ curl -X GET https://registry.npmjs.org/ftp | jq
{
  "_id": "ftp",
  "_rev": "113-89fe76508a7ece41b4c9a157114f966f",
  "name": "ftp",
  "description": "An FTP client module for node.js",
  "dist-tags": {
    "latest": "0.3.10"
  },
  "versions": {
    "0.3.10": {
      "name": "ftp",
      "version": "0.3.10",
      "author": {
        "name": "Brian White",
        "email": "mscdex@mscdex.net"
      },
      "description": "An FTP client module for node.js",
      "main": "./lib/connection",
      "engines": {
        "node": ">=0.8.0"
      },
      "dependencies": {
        "xregexp": "2.0.0",
        "readable-stream": "1.1.x"
      },
      "scripts": {
        "test": "node test/test.js"
      },
      "keywords": [
        "ftp",
        "client",
        "transfer"
      ],
      "licenses": [
        {
          "type": "MIT",
          "url": "http://github.com/mscdex/node-ftp/raw/master/LICENSE"
        }
      ],
      "repository": {
        "type": "git",
        "url": "http://github.com/mscdex/node-ftp.git"
      },
      "bugs": {
        "url": "https://github.com/mscdex/node-ftp/issues"
      },
      "homepage": "https://github.com/mscdex/node-ftp",
      "_id": "ftp@0.3.10",
      "_shasum": "9197d861ad8142f3e63d5a83bfe4c59f7330885d",
      "_from": "https://github.com/mscdex/node-ftp/tarball/v0.3.10",
      "_resolved": "https://github.com/mscdex/node-ftp/tarball/v0.3.10",
      "_npmVersion": "1.4.28",
      "_npmUser": {
        "name": "mscdex",
        "email": "mscdex@mscdex.net"
      },
      "maintainers": [
        {
          "name": "mscdex",
          "email": "mscdex@mscdex.net"
        }
      ],
      "dist": {
        "shasum": "9197d861ad8142f3e63d5a83bfe4c59f7330885d",
        "tarball": "https://registry.npmjs.org/ftp/-/ftp-0.3.10.tgz"
      },
      "directories": {}
    }
  },
[...]
That looks a lot like embellished contents of the basic package.json set up by npm init, except it's wrapped in a versions object. It also explicitly defines an absolute URL to a tarball under the dist key. [2]
How low can you go
It seems that all that's needed is a JSON index page served at the package-name subpath of the repository path, together with the tarball itself. We check this by putting together an utterly pointless package, foobarbarbar:
foobarbarbar/index.js
module.exports = {
  'smth': smth,
}

function smth() {
  console.log('foo');
}
foobarbarbar/package.json
{
  "name": "foobarbarbar",
  "version": "1.0.0",
  "description": "Foo repo",
  "main": "index.js",
  "author": "Foo Bar",
  "license": "GPL3"
}
Then make a tarball from those two files named foobarbarbar-1.0.0.tgz and take the sha1 sum of it.
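The packaging step can be sketched as follows. The file contents are repeated here so the snippet is self-contained, and the contents are placed under a top-level package/ directory, which is the layout npm's own tarballs use. Note that your sha1 sum will differ from the one used further down, since tar output isn't byte-identical across setups:

```shell
# Recreate the two package files under a "package/" top-level directory.
mkdir -p package
cat > package/index.js <<'EOF'
module.exports = {
  'smth': smth,
}

function smth() {
  console.log('foo');
}
EOF
cat > package/package.json <<'EOF'
{
  "name": "foobarbarbar",
  "version": "1.0.0",
  "description": "Foo repo",
  "main": "index.js",
  "author": "Foo Bar",
  "license": "GPL3"
}
EOF
# Pack them up and record the checksum that goes into the "shasum" field.
tar -czf foobarbarbar-1.0.0.tgz package/
sha1sum foobarbarbar-1.0.0.tgz
```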
Remembering our virtual interface setup from the pip example, we stick it behind our web server (here under an npm subpath) and add our minimal version JSON wrapper:
package.json
{
  "name": "foobarbarbar",
  "versions": {
    "1.0.0": {
      "name": "foobarbarbar",
      "version": "1.0.0",
      "description": "Foo repo",
      "main": "index.js",
      "author": "Foo Bar",
      "license": "GPL3",
      "dist": {
        "shasum": "2ccd68498ef5f2bfa00f0e1e59f44686fdb296ee",
        "tarball": "http://10.1.2.1/npm/foobarbarbar/foobarbarbar-1.0.0.tgz"
      }
    }
  }
}
Making introductions
The central trick here is to serve a json document as the "directory index" of the HTTP server. It's useful to remind ourselves at this point that we are not setting up a registry but a repository that contains provisions for a locked or frozen dependency graph. The former would surely need some of the CouchDB magic to resolve dependencies. Our assumption is that the latter can theoretically be realized using static files.
In other words: just as with the previous Python repository example, we don't try to handle dependency resolution, but merely serve the actual package files after dependencies have been resolved.
With Apache Web Server, using the package.json as the directory index is as easy as:
<Directory "/srv/http/npm">
DirectoryIndex package.json
</Directory>
Adjust the Directory path as needed to match the local setup.
Make sure that the tarball and the package.json above can be found in the foobarbarbar subfolder of the above Directory directive path:
$ ls -l /srv/http/npm/foobarbarbar/
-rw-r--r-- 1 root root 351 May 24 18:38 foobarbarbar-1.0.0.tgz
-rw-r--r-- 1 root root 380 May 25 09:36 package.json
Set the registry entry in your ~/.npmrc as follows:
registry=http://10.1.2.1/npm
Then restart the Apache server and give it a go:
$ npm install --verbose foobarbarbar
npm verb cli [ '/usr/bin/node', '/usr/bin/npm', 'install', '--verbose', 'foobarbarbar' ]
npm info using npm@7.13.0
npm info using node@v16.1.0
[...]
npm http fetch GET 200 http://10.1.2.1/npm/foobarbarbar/ 6ms
[...]
npm http fetch GET 200 http://10.1.2.1/npm/foobarbarbar/foobarbarbar-1.0.0.tgz 25ms
[...]
added 2 packages in 545ms
npm timing command:install Completed in 70ms
[...]
Cache dodging
Seriously, npm --help leaves a lot to be desired. The online npm install documentation doesn't yield any more clues as to whether there is an option for ignoring the local cache like pip's --no-cache-dir. This is a major pain in the ass, as you have to remember to keep your ~/.npm folder clean between each install attempt. Otherwise, changes you make to the repository won't be picked up by npm install.
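In practice, the quickest route I know of is to nuke the cache directory between attempts; _cacache is where npm (version 5 and later) keeps fetched metadata and tarballs:

```shell
# Wipe npm's content cache so repository changes are actually re-fetched.
rm -rf ~/.npm/_cacache
# Built-in equivalent (npm >= 5 refuses to run it without --force):
#   npm cache clean --force
```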
While testing, it pays to use a folder fairly close to the filesystem root, like a subfolder in /tmp. That way, you won't be thrown off by some stray node_modules folder somewhere down the tree.
Think locally, act locally
Serving packages from the global registry locally apparently comes down to this:
- Pull the package.json versions list from a proper registry.
- Get the tarball.
- Transform the absolute package URL to point at our host instead. (sigh)
- (optional) Prune all the versions and metadata that are not needed (anything that's not in our minimal foobarbarbar example above).
Obviously, this is an annoyingly large amount of work to code up parsing and retrieval for.
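That said, the URL-transformation step at least yields to jq, which we already used above to inspect the registry response. A minimal sketch, assuming our host serves the package under http://10.1.2.1/npm/foobarbarbar/ and using an inlined upstream index for illustration (in practice this would be the document fetched from the registry):

```shell
# Registry-style index with an upstream tarball URL, inlined for illustration.
cat > upstream.json <<'EOF'
{
  "name": "foobarbarbar",
  "versions": {
    "1.0.0": {
      "name": "foobarbarbar",
      "version": "1.0.0",
      "dist": {
        "shasum": "2ccd68498ef5f2bfa00f0e1e59f44686fdb296ee",
        "tarball": "https://registry.npmjs.org/foobarbarbar/-/foobarbarbar-1.0.0.tgz"
      }
    }
  }
}
EOF
# Rewrite each version's dist.tarball to point at our host,
# keeping only the filename from the upstream URL.
jq '.versions |= with_entries(
      .value.dist.tarball |= ("http://10.1.2.1/npm/foobarbarbar/" + (split("/") | last))
    )' upstream.json > package.json
grep tarball package.json
```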
Incidentally, there is a very nice tool called verdaccio which gets us most of the way there. [3] Its storage directory [4] uses exactly the directory structure we need. After retrieving a package collection using it as a proxy, getting the files in place is merely a case of copying [5] the files from the storage directory to the corresponding webserver directory.
Looking at the package.json versions wrapper saved by verdaccio, however, we see that it still preserves the absolute URLs of the upstream registry. Sadly, we are still left with doing the third task ourselves.
As parsing JSON is one of my least favorite things in the world, I'll spare you a full-blown implementation of that last step. For now it's sufficient that we understand the minimum required for manually setting up and serving a static, local, offline npm repository. Regardless of the pain involved.
Making it personal
Whether we massage the JSON ourselves or lazily resort to verdaccio, the final step we need to take is the same: setting the registry URL in the npm configuration of the Docker image.
This is merely a case of setting the registry URL in the npmrc inside the container.
$ docker run -it [...] npm config ls -l | grep etc/npmrc
globalconfig = "/usr/etc/npmrc"
To get at our manually provided foobarbarbar package from before, the Dockerfile will be (using Arch Linux):
[...]
RUN pacman -S --noconfirm nodejs npm
RUN mkdir -vp /usr/etc
RUN echo "registry=http://10.1.2.1/npm" > /usr/etc/npmrc
WORKDIR /root
RUN npm install foobarbarbar
[1] This was the registry address in my ~/.npmrc; I don't recall whether it was there by default or whether I put it in myself.
[2] I have tried using a relative path instead, both with and without a leading /, but in either case the install errors out complaining that the lock file is "corrupt." Whether the schema allows a base-URL setting I do not know, as the schema documentation doesn't seem to be readily available.
[3] In fact, verdaccio by itself solves the particular problem that we're trying to solve: providing an offline repository alternative. The task at hand, however, is to understand how to cope without using other tools than what can be considered base infrastructure (i.e. a web server).
[4] Should be /etc/verdaccio/storage; see /etc/verdaccio/config.yaml
[5] An alternative approach would be to set the storage path to point to the same folder the web server is serving. However, since we need to mess with the tarball paths anyway, copying the files out is just as convenient.