man bytes gnu - The NFT token id as URI

Let's consider an NFT that works like a badge for participating in development of a software project.

This token is awarded as a proof that the task was completed.

To make things more fun, each NFT should have some unique, immutable content attached to it.

In other words, the properties of this token, once set, should never change.

Nor should they disappear.

So how do we refer to the artwork asset within the token standard?

It was acceptable at the time

The ERC721 standard is not explicit about where the assets that belong with the NFT can be discovered and resolved.

At the time when the standard was adopted by the Ethereum community, several options had been weighed: "[...] Alternatives considered: put all metadata for each asset on the blockchain (too expensive), use URL templates to query metadata parts (URL templates do not work with all URL schemes, especially P2P URLs), multiaddr network address (not mature enough)." Furthermore, they "[...] considered an NFT representing ownership of a house, in this case metadata about the house (image, occupants, etc.) can naturally change." [EIP721]

A "changing house" doesn't sound quite like what we need. And anyway, if we stick a good old web2 URI in there, it will end up on the great bonfire of dead links before long.

Image, schmimage

To be honest, I find the presumption in the optional EIP721 metadata structure to be surprisingly short-sighted. It specifically defines the asset as an image, and at the same time it presupposes that only a single asset file will be used.

We may want to add multiple sources, so this is another obstacle for us.

So how to get around this, while still playing nice with existing implementations out there? Two ideas come to mind:

Embed a thumbnail as a preview of the artwork using a base64 data URI [1] in the metadata. Stick name and description on it, and the schema is still fulfilled.
Extend the structure with a list of attachments that our application layer knows about. Of course, each of these can have the same format as above.

In other words:

{
        "name": "Foo",
        "description": "Foo image",
        "image": "data:image/gif;base64,R0lGODlhDgARAKEDAAAAAOjqAPP1APDw8CH+EUNyZWF0ZWQgd2l0aCBHSU1QACwAAAAADgARAAACO5wHqXdrClocodbUaGg34qoBHaJtl7VYIfqBZxepb5C6NBvbIlyucO6jPUIK1CMiSCYvDIVS4GhCJoYCADs=",
        "attachments": [
                {
                        "name": "Bar",
                        "description": "Bar image",
                        "image": "data:image/gif;base64,"
                },
                {
                        "name": "Baz",
                        "description": "Baz image",
                        "image": "data:image/gif;base64,"
                }
        ]
}
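For illustration, here is a minimal Python sketch of how such a structure could be assembled from raw image bytes. The helper names data_uri and make_reference are my own inventions for this example, and the thumbnail bytes are a truncated placeholder rather than a real GIF.

```python
import base64
import json

def data_uri(content: bytes, mime: str = "image/gif") -> str:
    """Encode raw bytes as a base64 data URI (RFC 2397)."""
    return "data:" + mime + ";base64," + base64.b64encode(content).decode("ascii")

def make_reference(name, description, thumbnail, attachments=()):
    """Build the metadata structure.

    'attachments' is the application-level extension described above;
    each entry is a (name, description, content) tuple.
    """
    ref = {
        "name": name,
        "description": description,
        "image": data_uri(thumbnail),
    }
    if attachments:
        ref["attachments"] = [
            {"name": n, "description": d, "image": data_uri(c)}
            for (n, d, c) in attachments
        ]
    return ref

# Placeholder content only: b"GIF89a" is just the GIF magic header.
ref = make_reference(
    "Foo", "Foo image", b"GIF89a",
    [("Bar", "Bar image", b""), ("Baz", "Baz image", b"")],
)
print(json.dumps(ref, indent=8))
```

Clients that only know the plain EIP721 metadata schema should simply ignore the extra attachments key.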

Mirror, mirror

Since the asset reference shouldn't change, we can refer to it by its fingerprint or content address. If we define that the resource can be looked up over HTTP with that fingerprint as its basename, then we are free to define and modify the list of mirrors for that resource at any point in time. The application layer would simply try the endpoints one after another.

We take the sha2-256 [2] of the asset reference (the json file above, free of evil whitespace and newlines):

$ cat reference.json | jq -c -j | sha256sum | awk '{ print $1; }'
3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
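The same step could be sketched in Python roughly as follows. Note the assumption: json.dumps with compact separators is not guaranteed to emit byte-for-byte what jq -c -j emits for every input (escaping and number formatting can differ), so an application has to settle on one serializer and stick with it. The function name reference_digest is made up for this example.

```python
import hashlib
import json

def reference_digest(reference: dict) -> str:
    """Serialize without insignificant whitespace, then return the sha2-256 hex digest."""
    compact = json.dumps(reference, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(compact.encode("utf-8")).hexdigest()

# Whitespace in the source file does not affect the digest.
a = json.loads('{ "name": "Foo",\n  "description": "Foo image" }')
b = json.loads('{"name":"Foo","description":"Foo image"}')
print(reference_digest(a) == reference_digest(b))
```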

Imagine we had a mirror list of https://foo.com and https://bar.com/baz/. Then our application would try these urls in sequence, stopping at the first that returns a valid result:

https://foo.com/3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
https://bar.com/baz/3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551

Once we receive the content, all we have to do is hash it ourselves and verify that the sum matches the basename of the URI. If it doesn't, the result is of course not valid and we continue down the list, appropriately banning the mischievous server and then thoroughly harassing its admin.
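A rough Python sketch of that lookup loop, using only the standard library. The names resolve and http_fetch are hypothetical, and the banning and harassing parts are left as an exercise. The fetch function is passed in as a parameter so that the loop can be exercised without a network.

```python
import hashlib
import urllib.request

def http_fetch(url: str) -> bytes:
    """Fetch a URL over HTTP(S); raises an OSError subclass on failure."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()

def resolve(digest_hex, mirrors, fetch=http_fetch):
    """Try each mirror in order and return the first content whose
    sha2-256 digest matches the requested basename, or None."""
    for mirror in mirrors:
        url = mirror.rstrip("/") + "/" + digest_hex
        try:
            content = fetch(url)
        except OSError:
            continue  # mirror unreachable or resource missing, try the next one
        if hashlib.sha256(content).hexdigest() == digest_hex:
            return content
        # digest mismatch: the mirror served something else, keep looking
    return None
```

With the mirror list from above, a call would look like resolve('3fdf...9551', ['https://foo.com', 'https://bar.com/baz/']).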

Cast away

Since our fingerprint is 32 bytes, it fits exactly inside the tokenId (uint256). Let's decide to use big-endian numbers when converting (I find them easier to make sense of). In that case our hash from the reference turns into this modest number:

# python3
>>> hx = bytes.fromhex('3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551')
>>> int.from_bytes(hx, byteorder='big')
28891040728719892888467057134569335350980764617882743994259054993630416573777

As long as we're composing the evm inputs ourselves, we don't really have to worry about the integer representation in this particular case. But since the interface is defined as an integer type, and other mortals may be using higher level interfaces, we have to be explicit about our choice.
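Being explicit could mean shipping a pair of helpers like the following sketch, so that nobody has to guess the byte order. The function names are hypothetical. The one detail worth noting is that the integer must be padded back to 32 bytes, otherwise digests with leading zero bytes would come out too short.

```python
def digest_to_token_id(digest_hex: str) -> int:
    """Interpret a 32 byte hex digest as a big-endian unsigned integer."""
    raw = bytes.fromhex(digest_hex)
    if len(raw) != 32:
        raise ValueError("expected a 32 byte digest")
    return int.from_bytes(raw, byteorder="big")

def token_id_to_digest(token_id: int) -> str:
    """Convert a uint256 token id back to a zero-padded 64 character hex digest."""
    return token_id.to_bytes(32, byteorder="big").hex()

digest = "3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551"
print(token_id_to_digest(digest_to_token_id(digest)) == digest)
```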

Welcoming mint

Assume we have a method mintTo(address _recipient, uint256 _tokenId) on our NFT contract. The Solidity signature of that method is edb20b7e [3]. If I were to mint to myself then the input to the contract would be:

edb20b7e000000000000000000000000185cbce7650ff7ad3b587e26b2877d95568805e33fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551

Broken down:

signature:             edb20b7e
address, zero-padded:  000000000000000000000000185cbce7650ff7ad3b587e26b2877d95568805e3
token id:              3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
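The breakdown above can be reproduced with a few lines of Python. This is only a sketch for this one method: the signature bytes are copied from the text rather than computed (keccak256 is not in the Python standard library), and encode_mint_to is a name I made up.

```python
SIGNATURE = "edb20b7e"  # taken from the text above, not computed here

def encode_mint_to(recipient: str, token_id: int) -> str:
    """Concatenate signature, zero-padded address and big-endian token id as hex."""
    address_hex = recipient[2:] if recipient.startswith("0x") else recipient
    address = bytes.fromhex(address_hex)
    if len(address) != 20:
        raise ValueError("expected a 20 byte address")
    padded_address = address.rjust(32, b"\x00")  # left-pad to one 32 byte word
    token_word = token_id.to_bytes(32, byteorder="big")
    return SIGNATURE + padded_address.hex() + token_word.hex()

token_id = int("3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551", 16)
print(encode_mint_to("0x185cbce7650ff7ad3b587e26b2877d95568805e3", token_id))
```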

The corresponding web3.js code would look like:

const c = new web3.eth.Contract([...], '0x...');
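// The token id is passed as a string, since a javascript number cannot hold a uint256 without losing precision.
// Minting changes state, so the method is sent as a transaction rather than invoked with call().
c.methods.mintTo('0x185cbce7650ff7ad3b587e26b2877d95568805e3', '28891040728719892888467057134569335350980764617882743994259054993630416573777').send({from: '0x185cbce7650ff7ad3b587e26b2877d95568805e3'});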

To satisfy the tokenURI method, we can generate a string that's prefixed with sha256 as a "scheme" [4]. A bit of (unoptimized) solidity helps us out here:

contract NFT {

        uint256[] token;
        mapping(uint256 => uint256) tokenIndex; // map token id to master token array index position

        [...]

        // view, not pure: the function reads the token array from storage
        function tokenURI(uint256 _tokenId) public view returns(string memory) {
                bytes32 token_bytes;
                bytes memory out;
                uint8 t;
                uint256 c;

                token_bytes = bytes32(token[tokenIndex[_tokenId]]);

                out = new bytes(64 + 7);
                out[0] = "s";
                out[1] = "h";
                out[2] = "a";
                out[3] = "2";
                out[4] = "5";
                out[5] = "6";
                out[6] = ":";

                c = 7;
                for (uint256 i = 0; i < 32; i++) {
                        // high nibble to ascii hex
                        t = (uint8(token_bytes[i]) & 0xf0) >> 4;
                        if (t < 10) {
                                out[c] = bytes1(t + 0x30);
                        } else {
                                out[c] = bytes1(t + 0x57);
                        }
                        // low nibble to ascii hex
                        t = uint8(token_bytes[i]) & 0x0f;
                        if (t < 10) {
                                out[c+1] = bytes1(t + 0x30);
                        } else {
                                out[c+1] = bytes1(t + 0x57);
                        }
                        c += 2;
                }
                return string(out);
        }
}

This will return sha256:3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551 for tokenId 3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551 as input, provided that the tokenId actually exists. That may seem a bit useless at first, but consider the scenario where we want to interface with other NFTs as well. Or perhaps we are implementing a contract that optionally can support a static web2 URI in storage. By doing it this way, all bases are covered.
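To check the expected output off-chain, the nibble-by-nibble hex encoding can be mirrored in Python. This is a deliberately literal sketch of the same arithmetic (0x30 is ascii "0", and 0x57 plus 10 is ascii "a"), not something you would write in Python otherwise. The function name token_uri is only for this example.

```python
def token_uri(token_id: int) -> str:
    """Mimic the contract: 'sha256:' followed by the lowercase hex of the 32 byte id."""
    data = token_id.to_bytes(32, byteorder="big")
    out = bytearray(b"sha256:")
    for byte in data:
        # high nibble first, then low nibble
        for nibble in ((byte & 0xF0) >> 4, byte & 0x0F):
            if nibble < 10:
                out.append(nibble + 0x30)  # '0'..'9'
            else:
                out.append(nibble + 0x57)  # 'a'..'f'
    return out.decode("ascii")

token_id = int("3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551", 16)
print(token_uri(token_id))
```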

Decentralized identifiers

Even better would be to add redundancy with autonomous decentralized storage. However, networks like Swarm and IPFS use their own hashing recipes. That means that for every network referenced, we'd have to define an alternative in our reference structure.

Referencing the canonical sha256 as well as the Swarmhash for the same item could then look like this [5]:

{
        "name": "Foo",
        "description": "Foo image",
        "image": "data:image/gif;base64,R0lGODlhDgARAKEDAAAAAOjqAPP1APDw8CH+EUNyZWF0ZWQgd2l0aCBHSU1QACwAAAAADgARAAACO5wHqXdrClocodbUaGg34qoBHaJtl7VYIfqBZxepb5C6NBvbIlyucO6jPUIK1CMiSCYvDIVS4GhCJoYCADs=",
        "alternatives": [
                {
                        "name": "Foo",
                        "description": "Foo image",
                        "image": "bzz:4b9149ee4550f2d786f9ba6584b79a30ee14ae05ff6e84a0f7c7561a14e3b779"
                },
                {
                        "name": "Foo",
                        "description": "Foo image",
                        "image": "sha256:d036a4ce7f929b632256225b2bebd81bdd558d3bfe3d96faae61db664708c16f"
                }
        ]
}
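On the application side, each alternative could be handled according to its scheme. The sketch below only knows how to verify the sha256 scheme. Verifying bzz would need Swarm's own hashing, which I don't attempt here, so it is reported as unsupported instead. The function name matches is hypothetical.

```python
import hashlib

def matches(uri: str, content: bytes):
    """Check content against a '<scheme>:<digest>' reference.

    Returns True or False when the scheme can be verified locally,
    and None when the scheme is not supported by this sketch.
    """
    scheme, _, digest = uri.partition(":")
    if scheme == "sha256":
        return hashlib.sha256(content).hexdigest() == digest.lower()
    return None  # e.g. bzz: requires the Swarm hash, not implemented here

content = b"example media content"
uri = "sha256:" + hashlib.sha256(content).hexdigest()
print(matches(uri, content))
```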

[1] Yes, they are valid URIs actually: https://www.rfc-archive.org/getrfc.php?rfc=2397

[2] Likely it would be prudent to start using the official sha3 instead of sha2 these days, also because the sha2 hash is not a native opcode in the evm. But neither is sha3. The keccak256 that Ethereum uses, and that the EVM has built in, is a precursor to the keccak published as the official sha3. Still, keccak256 and sha3 are used interchangeably in opcode lists (and previously in Solidity too). This has caused me a fair bit of confusion, I might add. Apart from the name being ambiguous, keccak256 tooling is also less common in the wild. Therefore sha2 seems like a safer bet for our experiments. It's not broken yet, after all.

[3] The hex result of keccak256("mintTo(address,uint256)")

[4] Data URI is of no use here, because the hash itself is just nondescript binary data. Luckily <scheme>:<path> is still a valid URI.

[5] Here the hashes represent the media content itself, not the reference. That's why the sha256 one is different than before.

[EIP721] https://eips.ethereum.org/EIPS/eip-721