Let's consider an NFT that works like a badge for participating in development of a software project.
This token is awarded as a proof that the task was completed.
To make things more fun, each NFT should have some unique, immutable content attached to it.
In other words, the properties of this token, once set, should never change.
Nor should they disappear.
So how do we refer to the artwork asset within the token standard?
It was acceptable at the time
The ERC721 standard is not explicit about where the assets that belong with the NFT can be discovered and resolved.
At the time when the standard was adopted by the Ethereum community, there were multiple "[...] Alternatives considered: put all metadata for each asset on the blockchain (too expensive), use URL templates to query metadata parts (URL templates do not work with all URL schemes, especially P2P URLs), multiaddr network address (not mature enough)." Furthermore, they "[...] considered an NFT representing ownership of a house, in this case metadata about the house (image, occupants, etc.) can naturally change." [EIP721]
A "changing house" doesn't sound quite like what we need. And anyway, if we stick a good old web2 URI in there, it will end up on the great bonfire of dead links before long.
Image, schmimage
To be honest, I find the presumption in the optional EIP721 metadata structure to be surprisingly short-sighted. It specifically defines the asset as an image, and at the same time it presupposes that only a single asset file will be used.
We may want to add multiple sources, so this is another obstacle for us.
So how to get around this, while still playing nice with existing implementations out there? Two ideas come to mind:
- Embed a thumbnail as a preview of the artwork using a base64 data URI [1] in the metadata. Stick name and description on it, and the schema is still fulfilled.
- Extend the structure with a list of attachments that our application layer knows about. Of course, each of these can have the same format as above.
In other words:
{
	"name": "Foo",
	"description": "Foo image",
	"image": "data:image/gif;base64,R0lGODlhDgARAKEDAAAAAOjqAPP1APDw8CH+EUNyZWF0ZWQgd2l0aCBHSU1QACwAAAAADgARAAACO5wHqXdrClocodbUaGg34qoBHaJtl7VYIfqBZxepb5C6NBvbIlyucO6jPUIK1CMiSCYvDIVS4GhCJoYCADs=",
	"attachments": [
		{
			"name": "Bar",
			"description": "Bar image",
			"image": "data:image/gif;base64,"
		},
		{
			"name": "Baz",
			"description": "Baz image",
			"image": "data:image/gif;base64,"
		}
	]
}
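As a rough illustration, a structure like the one above could be assembled with a few lines of Python. This is only a sketch: the helper names (data_uri, entry, build_reference) are made up for this example, and the thumbnail bytes are placeholders rather than real image data.

```python
import base64
import json


def data_uri(content: bytes, mime: str = "image/gif") -> str:
    """Encode raw bytes as a base64 data URI (RFC 2397)."""
    payload = base64.b64encode(content).decode("ascii")
    return "data:{};base64,{}".format(mime, payload)


def entry(name: str, description: str, thumbnail: bytes) -> dict:
    """One name/description/image triple, as the EIP721 metadata schema expects."""
    return {"name": name, "description": description, "image": data_uri(thumbnail)}


def build_reference(name, description, thumbnail, attachments=()):
    """Top-level entry plus the application-specific 'attachments' list."""
    ref = entry(name, description, thumbnail)
    ref["attachments"] = [entry(*a) for a in attachments]
    return ref


# Placeholder content: b"GIF89a" is only the GIF magic header, not a full image.
ref = build_reference(
    "Foo", "Foo image", b"GIF89a",
    [("Bar", "Bar image", b""), ("Baz", "Baz image", b"")],
)
print(json.dumps(ref, indent=2))
```

Clients that only know the plain EIP721 schema would still find name, description and image at the top level, and simply ignore the attachments key.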
Mirror, mirror
Since the asset reference shouldn't change, we can refer to it by its fingerprint or content address. If we define that the resource can be looked up over HTTP with that fingerprint as its basename, then we are free to define and modify the list of mirrors for that resource at any point in time. The application layer would simply try the endpoints one after another.
We take the sha2-256 [2] of the asset reference (the json file above, free of evil whitespace and newlines):
$ cat reference.json | jq -c -j | sha256sum | awk '{ print $1; }'
3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
Imagine we had a mirror list of https://foo.com and https://bar.com/baz/. Then our application would try these urls in sequence, stopping at the first that returns a valid result:
https://foo.com/3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
https://bar.com/baz/3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
Once we receive the content, all we have to do is hash it ourselves and verify that the sum matches the basename of the URI. If it doesn't, the result is of course not valid and we continue down the list, appropriately banning the mischievous server, then thoroughly harassing its admin.
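The lookup-and-verify loop described above might be sketched like this in Python. The function names (http_get, resolve) and the injectable fetch parameter are assumptions made for this example; banning servers and harassing admins is left out.

```python
import hashlib
import urllib.request
from typing import Callable, Iterable, Optional


def http_get(url: str) -> bytes:
    """Fetch a URL over HTTP(S) and return the raw body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def resolve(fingerprint: str, mirrors: Iterable[str],
            fetch: Callable[[str], bytes] = http_get) -> Optional[bytes]:
    """Try each mirror in order; return the first body whose sha256 matches."""
    for mirror in mirrors:
        url = mirror.rstrip("/") + "/" + fingerprint
        try:
            content = fetch(url)
        except Exception:
            # Unreachable or failing mirror: move on to the next one.
            continue
        if hashlib.sha256(content).hexdigest() == fingerprint:
            return content
        # Hash mismatch: the mirror served something else, keep looking.
    return None


# Offline demonstration: a dict stands in for the network.
good = b'{"name":"Foo"}'
digest = hashlib.sha256(good).hexdigest()
fake_web = {
    "https://foo.com/" + digest: b"tampered",
    "https://bar.com/baz/" + digest: good,
}
result = resolve(digest, ["https://foo.com", "https://bar.com/baz/"], fake_web.__getitem__)
print(result == good)
```

Because the content is verified against the fingerprint, the mirrors themselves do not need to be trusted. They only need to be reachable.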
Cast away
Since our fingerprint is 32 bytes, it fits exactly inside the tokenId (uint256). Let's decide to use big-endian numbers when converting (I find them easier to make sense of). In that case our hash from the reference turns into this modest number:
# python3
>>> hx = bytes.fromhex('3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551')
>>> int.from_bytes(hx, byteorder='big')
28891040728719892888467057134569335350980764617882743994259054993630416573777
As long as we're composing the evm inputs ourselves, we don't really have to worry about the integer representation in this particular case. But since the interface is defined as an integer type, and other mortals may be using higher level interfaces, we have to be explicit about our choice.
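To make that choice explicit in application code, both directions of the conversion could be wrapped in small helpers. A minimal sketch, with function names invented for this example:

```python
def fingerprint_to_token_id(hex_digest: str) -> int:
    """Interpret a 32 byte hex digest as a big-endian unsigned integer."""
    raw = bytes.fromhex(hex_digest)
    if len(raw) != 32:
        raise ValueError("expected a 32 byte digest")
    return int.from_bytes(raw, byteorder="big")


def token_id_to_fingerprint(token_id: int) -> str:
    """Convert a token id back to hex, left-padded with zeros to 32 bytes."""
    return token_id.to_bytes(32, byteorder="big").hex()


fingerprint = "3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551"
token_id = fingerprint_to_token_id(fingerprint)
print(token_id_to_fingerprint(token_id) == fingerprint)
```

The fixed 32 byte width matters on the way back: a digest that happens to start with zero bytes would otherwise come out shorter than 64 hex characters and no longer match the basename on the mirrors.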
Welcoming mint
Assume we have a method mintTo(address _recipient, uint256 _tokenId) on our NFT contract. The Solidity signature of that method is edb20b7e [3]. If I were to mint to myself then the input to the contract would be:
edb20b7e000000000000000000000000185cbce7650ff7ad3b587e26b2877d95568805e33fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
Broken down:
signature: edb20b7e
address, zero-padded: 000000000000000000000000185cbce7650ff7ad3b587e26b2877d95568805e3
token id: 3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551
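The breakdown above can be reproduced with plain byte concatenation. In this sketch the function name encode_mint_to is made up, and the four byte signature is passed in as given in the text rather than computed, since Python's standard library has no keccak256 (hashlib.sha3_256 is the official sha3, which produces a different digest).

```python
def encode_mint_to(signature_hex: str, recipient: str, token_id: int) -> str:
    """Concatenate the method signature with two 32 byte, big-endian arguments."""
    signature = bytes.fromhex(signature_hex)
    if len(signature) != 4:
        raise ValueError("method signature must be 4 bytes")
    if recipient.startswith("0x"):
        recipient = recipient[2:]
    address = bytes.fromhex(recipient)
    if len(address) != 20:
        raise ValueError("address must be 20 bytes")
    # Static arguments are each left-padded with zeros to a full 32 byte word.
    padded_address = address.rjust(32, b"\x00")
    padded_token_id = token_id.to_bytes(32, byteorder="big")
    return (signature + padded_address + padded_token_id).hex()


calldata = encode_mint_to(
    "edb20b7e",
    "0x185cbce7650ff7ad3b587e26b2877d95568805e3",
    int("3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551", 16),
)
print(calldata)
```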
The corresponding web3.js code would look like:
const c = new web3.eth.Contract([...], '0x...');
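// The token id is passed as a string, since a number literal this large loses precision in JavaScript.
// send() is used rather than call(), because minting changes state and needs a transaction.
c.methods.mintTo('0x185cbce7650ff7ad3b587e26b2877d95568805e3', '28891040728719892888467057134569335350980764617882743994259054993630416573777').send({from: '0x185cbce7650ff7ad3b587e26b2877d95568805e3'});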
To satisfy the tokenURI method, we can generate a string that's prefixed with sha256 as a "scheme" [4]. A bit of (unoptimized) Solidity helps us out here:
contract NFT {
	uint256[] token;
	// map token id to master token array index position
	mapping(uint256 => uint256) tokenIndex;

	[...]

	// view rather than pure, since the token array is read from storage
	function tokenURI(uint256 _tokenId) public view returns(string memory) {
		bytes32 token_bytes;
		bytes memory out;
		uint8 t;
		uint256 c;

		token_bytes = bytes32(token[tokenIndex[_tokenId]]);

		// 7 bytes for the "sha256:" prefix, 64 for the hex digest
		out = new bytes(64 + 7);
		out[0] = "s";
		out[1] = "h";
		out[2] = "a";
		out[3] = "2";
		out[4] = "5";
		out[5] = "6";
		out[6] = ":";

		c = 7;
		for (uint256 i = 0; i < 32; i++) {
			// high nibble: 0x30 is "0", 0x57 + 10 is "a"
			t = (uint8(token_bytes[i]) & 0xf0) >> 4;
			if (t < 10) {
				out[c] = bytes1(t + 0x30);
			} else {
				out[c] = bytes1(t + 0x57);
			}
			// low nibble
			t = uint8(token_bytes[i]) & 0x0f;
			if (t < 10) {
				out[c+1] = bytes1(t + 0x30);
			} else {
				out[c+1] = bytes1(t + 0x57);
			}
			c += 2;
		}
		return string(out);
	}
}
This will return sha256:3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551 for tokenId 3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551 as input, provided that the tokenId actually exists. That may seem a bit useless at first, but consider the scenario where we want to interface with other NFTs as well. Or perhaps we are implementing a contract that optionally can support a static web2 URI in storage. By doing it this way, all bases are covered.
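On the client side, the value returned by tokenURI could then be dispatched on its scheme. The sketch below assumes a hypothetical helper candidate_urls and reuses the mirror list from earlier. A sha256 URI expands to one URL per mirror, while anything else (such as a static web2 URI) is passed through untouched.

```python
import string
from typing import List


def candidate_urls(token_uri: str, mirrors: List[str]) -> List[str]:
    """Expand a tokenURI result into the URLs the application should try."""
    scheme, separator, path = token_uri.partition(":")
    if not separator:
        raise ValueError("not a URI: " + token_uri)
    if scheme == "sha256":
        # The path must be a 32 byte digest in hex.
        if len(path) != 64 or any(ch not in string.hexdigits for ch in path):
            raise ValueError("malformed sha256 digest")
        return [mirror.rstrip("/") + "/" + path.lower() for mirror in mirrors]
    # Any other scheme is returned unchanged for the caller to handle.
    return [token_uri]


mirrors = ["https://foo.com", "https://bar.com/baz/"]
uri = "sha256:3fdfbfe3b510b69f90cd92618e4c1ec76cf8b9c330bc2da1922acda8f84f9551"
for url in candidate_urls(uri, mirrors):
    print(url)
```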
Decentralized identifiers
Even better would be to add redundancy with autonomous decentralized storage. However, networks like Swarm and IPFS use their own hashing recipes. That means that for every network referenced, we'd have to define an alternative in our reference structure.
Referencing the canonical sha256 as well as the Swarmhash for the same item could then look like this [5]:
{
	"name": "Foo",
	"description": "Foo image",
	"image": "data:image/gif;base64,R0lGODlhDgARAKEDAAAAAOjqAPP1APDw8CH+EUNyZWF0ZWQgd2l0aCBHSU1QACwAAAAADgARAAACO5wHqXdrClocodbUaGg34qoBHaJtl7VYIfqBZxepb5C6NBvbIlyucO6jPUIK1CMiSCYvDIVS4GhCJoYCADs=",
	"alternatives": [
		{
			"name": "Foo",
			"description": "Foo image",
			"image": "bzz:4b9149ee4550f2d786f9ba6584b79a30ee14ae05ff6e84a0f7c7561a14e3b779"
		},
		{
			"name": "Foo",
			"description": "Foo image",
			"image": "sha256:d036a4ce7f929b632256225b2bebd81bdd558d3bfe3d96faae61db664708c16f"
		}
	]
}
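An application could group the alternatives by scheme and then pick whichever networks it is able to talk to. A minimal sketch, assuming the structure above and a made-up helper name; the thumbnail data URI is shortened here for brevity.

```python
from typing import Dict, List


def alternatives_by_scheme(reference: dict) -> Dict[str, List[str]]:
    """Group the digests in 'alternatives' by their URI scheme."""
    grouped: Dict[str, List[str]] = {}
    for alternative in reference.get("alternatives", []):
        scheme, separator, digest = alternative["image"].partition(":")
        if separator:
            grouped.setdefault(scheme, []).append(digest)
    return grouped


reference = {
    "name": "Foo",
    "description": "Foo image",
    "image": "data:image/gif;base64,",  # thumbnail omitted in this sketch
    "alternatives": [
        {"name": "Foo", "description": "Foo image",
         "image": "bzz:4b9149ee4550f2d786f9ba6584b79a30ee14ae05ff6e84a0f7c7561a14e3b779"},
        {"name": "Foo", "description": "Foo image",
         "image": "sha256:d036a4ce7f929b632256225b2bebd81bdd558d3bfe3d96faae61db664708c16f"},
    ],
}
grouped = alternatives_by_scheme(reference)
print(sorted(grouped))
```

The sha256 entries can go through the HTTP mirror lookup, while bzz entries would be handed to whatever Swarm client the application has available.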
[1] Yes, they are valid URIs actually: https://www.rfc-archive.org/getrfc.php?rfc=2397
[2] Likely it would be prudent to start using the official sha3 instead of sha2 these days, also because the sha2 hash is not a native opcode in the evm (it is only available as a precompiled contract). But neither is sha3. The keccak256 that Ethereum uses, and which the EVM provides, is a precursor to the keccak published as the official sha3. Still, keccak256 and sha3 are used interchangeably in opcode lists (and previously in Solidity too). This has caused me quite a fair bit of confusion, I might add. Apart from it being ambiguous, keccak256 tooling is also less common in the wild. Therefore sha2 seems like a safer bet for our experiments. It's not broken yet, after all.
[3] The first four bytes of the hex result of keccak256("mintTo(address,uint256)")
[4] Data URI is of no use here, because the hash itself is just nondescript binary data. Luckily <scheme>:<path> is still a valid URI.
[5] Here the hashes represent the media content itself, not the reference. That's why the sha256 one is different from before.
[EIP721] https://eips.ethereum.org/EIPS/eip-721