Contributing a Schema Parser
A guide on how to contribute a schema parser.
We'll learn how to build a parser for NFTs that follow the OpenSea Metadata Standard and go over important considerations.
Step 1: Determine the Type of Parser
Before implementing your parser, familiarize yourself with the BaseParser, CollectionParser, and SchemaParser base classes.
A parser will be one of the following:
CollectionParser: Runs based on a token's contract address.
SchemaParser: Runs based on the shape of the token's metadata.
Since we're building a parser for NFTs that follow the OpenSea Metadata Standard, we'll use a SchemaParser.
Schema parsers are used for metadata standards that will be used across many different NFT collections.
Step 2: Define the Selection Criteria
The next step is to define your parser's selection criteria. This tells the pipeline which tokens to run your parser on.
For a schema parser, you'll need to override the should_parse_token() method of BaseParser to implement custom selection logic based on the shape of the metadata.
For instance, if the new metadata schema contains a unique field, checking for the existence of that field would qualify as selection criteria.
In this case, the presence of either background_color or youtube_url is used as the criterion for deciding whether an NFT follows the OpenSea Metadata Standard.
def should_parse_token(self, token: Token, raw_data: Optional[dict], *args, **kwargs) -> bool:
    """Return whether or not this schema parser should parse a given token.

    Args:
        token (Token): the token whose metadata needs to be parsed.
        raw_data (dict): raw data returned from token uri.

    Returns:
        bool: whether or not this parser handles the token.
    """
    return (
        raw_data is not None
        and isinstance(raw_data, dict)
        and (raw_data.get("background_color") is not None or raw_data.get("youtube_url") is not None)
    )
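To see how this check behaves on different inputs, here is a minimal stand-alone sketch. The function name `looks_like_opensea_metadata` and the constant `OPENSEA_MARKER_FIELDS` are hypothetical and exist only for illustration; the logic mirrors the method above without depending on the library's classes.

```python
from typing import Any, Optional

# Hypothetical names for illustration; these are not part of the library.
OPENSEA_MARKER_FIELDS = ("background_color", "youtube_url")


def looks_like_opensea_metadata(raw_data: Optional[Any]) -> bool:
    """Return True when raw_data is a dict carrying at least one marker field."""
    if not isinstance(raw_data, dict):
        # Covers None as well as lists, strings, and other non-dict payloads.
        return False
    return any(raw_data.get(field) is not None for field in OPENSEA_MARKER_FIELDS)


song_a_day = {
    "name": "Obligatory Song About the iPhone 5",
    "youtube_url": "https://www.youtube.com/watch?v=xBOaUo1GT0g",
}
unrelated = {"name": "Some token", "image": "ipfs://example"}

print(looks_like_opensea_metadata(song_a_day))
print(looks_like_opensea_metadata(unrelated))
print(looks_like_opensea_metadata(None))
```

Note that a field set to an empty string still counts as present, because the check only compares against None.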
Step 3: Write the Parsing Implementation
If the token uri is not passed in as part of the input, the pipeline will attempt to fetch it from the tokenURI(uint256) function on the contract. Otherwise, it is expected that the parser will construct the token uri.
Let's use the Song a Day collection as an example:
Token(
    chain_identifier="ETHEREUM-MAINNET",
    collection_address="0x19b703f65aa7e1e775bd06c2aa0d0d08c80f1c45",
    token_id=1351,
)
If we pass it into the parser, we'll get the following uri: ipfs://Qmb9X7yBk5iKzgnS5pfAfReFT8FVaLcf7cJGxxUFWzRMyk/1351, which returns metadata for token #1351.
Once we have the token uri, we can use the Fetcher to fetch the raw JSON data from it.
By default, the parser is initialized with a Fetcher instance with an IPFS adapter.
This should return the following data from the IPFS Gateway:
{
  "name": "Obligatory Song About the iPhone 5",
  "description": "A new song, everyday, forever. Song A Day is an ever-growing collection of unique songs created by Jonathan Mann, starting January 1st, 2009. Each NFT is a 1:1 representation of that days song, and grants access to SongADAO, the orgninzation that controls all the rights and revenue to the songs. Own a piece of the collection to help govern the future of music. http://songaday.world",
  "token_id": 1351,
  "image": "ipfs://QmX2ZdS13khEYqpC8Jz4nm7Ub3He3g5Ws22z3QhunC2k58/1351",
  "animation_url": "ipfs://QmVHjFbGEqXfYuoPpJR4vmRacGM29KR5UenqbidJex8muB/1351",
  "external_url": "https://songaday.world/song/1351",
  "youtube_url": "https://www.youtube.com/watch?v=xBOaUo1GT0g",
  "attributes": [
    { "trait_type": "Date", "value": "2012-09-12" },
    { "trait_type": "Location", "value": "Brooklyn Studio" },
    { "trait_type": "Topic", "value": "Apple" },
    { "trait_type": "Instrument", "value": "Electric Guitar" },
    { "trait_type": "Mood", "value": "Bored" },
    { "trait_type": "Beard", "value": "Shadow" },
    { "trait_type": "Genre", "value": "Rock" },
    { "trait_type": "Style", "value": "Fun" },
    { "trait_type": "Length", "value": "0:47" },
    { "trait_type": "Key", "value": "E" },
    { "trait_type": "Tempo", "value": "90" },
    { "trait_type": "Song A Day", "value": "1351" },
    { "trait_type": "Year", "value": "2012" },
    { "trait_type": "Instrument", "value": "Bass" },
    { "trait_type": "Instrument", "value": "Drums" },
    { "trait_type": "Style", "value": "Catchy" },
    { "trait_type": "Proper Noun", "value": "iPhone" }
  ]
}
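As an aside, fetching from an IPFS gateway comes down to rewriting the ipfs:// uri as an HTTP URL. The sketch below shows one way to do that rewrite. The helper name `ipfs_to_gateway_url` and the choice of the public `https://ipfs.io/ipfs/` gateway are assumptions for illustration; they are not taken from the library's Fetcher or its IPFS adapter, which may resolve uris differently.

```python
# Assumption: a public HTTP gateway; swap in whichever gateway you actually use.
DEFAULT_GATEWAY = "https://ipfs.io/ipfs/"


def ipfs_to_gateway_url(uri: str, gateway: str = DEFAULT_GATEWAY) -> str:
    """Rewrite an ipfs:// uri as an HTTP gateway URL; return other uris unchanged."""
    prefix = "ipfs://"
    if not uri.startswith(prefix):
        return uri
    path = uri[len(prefix):]
    # Some uris are written as ipfs://ipfs/<cid>; drop the duplicated segment.
    if path.startswith("ipfs/"):
        path = path[len("ipfs/"):]
    return gateway.rstrip("/") + "/" + path


print(ipfs_to_gateway_url("ipfs://Qmb9X7yBk5iKzgnS5pfAfReFT8FVaLcf7cJGxxUFWzRMyk/1351"))
```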
The next step is to convert the metadata into the standardized metadata format.
Each field in the new metadata format should either map to a field in the standardized metadata format or be added as a MetadataField under the additional_fields property.
In the case of Song a Day, the metadata format has the following fields:
{
  "name": "NFT name",
  "description": "More info about the NFT",
  "attributes": "Custom traits",
  "image": "Image media asset",
  "animation_url": "Media asset for videos",
  "external_url": "External linking",
  "youtube_url": "Video hosted on YouTube"
}
Each of these fields can be mapped into the standard metadata format:
Standard Metadata Field | New Metadata Field |
---|---|
token | |
raw_data | |
standard | |
attributes | attributes |
name | name |
description | description |
mime_type | |
image | image |
content | animation_url |
additional_fields | external_url, youtube_url |
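The mapping in the table can be sketched with plain dictionaries before involving any library types. The names `DIRECT_FIELDS`, `EXTRA_FIELDS`, and `map_fields` are hypothetical, and the output is an ordinary dict rather than the library's Metadata object; the point is only to show which raw field feeds which standardized field.

```python
from typing import Any

# Standardized field name -> raw field name, following the table above.
DIRECT_FIELDS = {
    "attributes": "attributes",
    "name": "name",
    "description": "description",
    "image": "image",
    "content": "animation_url",
}
# Raw fields with no dedicated slot; they are carried over as additional fields.
EXTRA_FIELDS = ("external_url", "youtube_url")


def map_fields(raw_data: dict[str, Any]) -> dict[str, Any]:
    """Map raw metadata onto standardized field names (illustrative only)."""
    standardized = {std: raw_data.get(raw) for std, raw in DIRECT_FIELDS.items()}
    standardized["additional_fields"] = {
        name: raw_data[name] for name in EXTRA_FIELDS if raw_data.get(name) is not None
    }
    return standardized


example = map_fields(
    {
        "name": "Obligatory Song About the iPhone 5",
        "animation_url": "ipfs://QmVHjFbGEqXfYuoPpJR4vmRacGM29KR5UenqbidJex8muB/1351",
        "youtube_url": "https://www.youtube.com/watch?v=xBOaUo1GT0g",
    }
)
print(example["content"])
print(example["additional_fields"])
```

Fields that are missing from the raw data come through as None, and absent extra fields are simply left out.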
Code used for parsing collections that use the OpenSea Metadata Standard:
def parse_metadata(self, token: Token, raw_data: dict, *args, **kwargs) -> Optional[Metadata]:
    """Given a token and raw data returned from the token uri, return a normalized Metadata object.

    Args:
        token (Token): token to process metadata for.
        raw_data (dict): raw data returned from token uri.

    Returns:
        Optional[Metadata]: normalized metadata object, if successfully parsed.
    """
    # Start from the mime type of the token uri itself.
    mime, _ = self.fetcher.fetch_mime_type_and_size(token.uri)

    attributes = [self.parse_attribute(attribute) for attribute in raw_data.get("attributes", [])]

    image = None
    image_uri = raw_data.get("image") or raw_data.get("image_data")
    if image_uri:
        image_mime, image_size = self.fetcher.fetch_mime_type_and_size(image_uri)
        image = MediaDetails(size=image_size, uri=image_uri, mime_type=image_mime)

    content = None
    content_uri = raw_data.get("animation_url")
    if content_uri:
        content_mime, content_size = self.fetcher.fetch_mime_type_and_size(content_uri)
        content = MediaDetails(uri=content_uri, size=content_size, mime_type=content_mime)

    # Prefer the image's mime type, then the content's, over the token uri's.
    if image and image.mime_type:
        mime = image.mime_type
    if content and content.mime_type:
        mime = content.mime_type

    return Metadata(
        token=token,
        raw_data=raw_data,
        attributes=attributes,
        name=raw_data.get("name"),
        description=raw_data.get("description"),
        mime_type=mime,
        image=image,
        content=content,
        additional_fields=self.parse_additional_fields(raw_data),
    )
# Helper Functions
def parse_attribute(self, attribute_dict: dict) -> Attribute:
    return Attribute(
        trait_type=attribute_dict.get("trait_type"),
        value=attribute_dict.get("value"),
        display_type=attribute_dict.get("display_type"),
    )

def parse_additional_fields(self, raw_data: dict) -> list[MetadataField]:
    additional_fields = []
    if (external_url := raw_data.get("external_url")) is not None:
        additional_fields.append(
            MetadataField(
                field_name="external_url",
                type=MetadataFieldType.TEXT,
                description="This is the URL that will appear below the asset's image on OpenSea "
                "and will allow users to leave OpenSea and view the item on your site.",
                value=external_url,
            )
        )
    if (background_color := raw_data.get("background_color")) is not None:
        additional_fields.append(
            MetadataField(
                field_name="background_color",
                type=MetadataFieldType.TEXT,
                description="Background color of the item on OpenSea. Must be a six-character "
                "hexadecimal without a pre-pended #.",
                value=background_color,
            )
        )
    if (animation_url := raw_data.get("animation_url")) is not None:
        additional_fields.append(
            MetadataField(
                field_name="animation_url",
                type=MetadataFieldType.TEXT,
                description="A URL to a multi-media attachment for the item.",
                value=animation_url,
            )
        )
    if (youtube_url := raw_data.get("youtube_url")) is not None:
        additional_fields.append(
            MetadataField(
                field_name="youtube_url",
                type=MetadataFieldType.TEXT,
                description="A URL to a YouTube video.",
                value=youtube_url,
            )
        )
    return additional_fields
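One detail in parse_metadata() that is easy to miss is how the top-level mime_type is chosen: the content's mime type takes precedence over the image's, which in turn takes precedence over the token uri's. The stand-alone sketch below isolates that rule; the function name `resolve_mime_type` is hypothetical and used only for illustration.

```python
from typing import Optional


def resolve_mime_type(
    token_mime: Optional[str],
    image_mime: Optional[str],
    content_mime: Optional[str],
) -> Optional[str]:
    """Pick the top-level mime type: content, then image, then the token uri."""
    mime = token_mime
    if image_mime:
        mime = image_mime
    if content_mime:
        # Checked last, so it overrides the image's mime type when both exist.
        mime = content_mime
    return mime


print(resolve_mime_type("application/json", "image/png", None))
print(resolve_mime_type("application/json", "image/png", "video/mp4"))
```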
Step 4: Registering a Parser
After writing your custom metadata parser implementation, you'll want to register it with the ParserRegistry.
The ParserRegistry tracks all parsers and is used by the metadata pipeline to know which parsers to run by default.
Note: in order to have the parser be registered, you'll also need to import it in offchain/metadata/parsers/__init__.py.
If you're developing locally, you still need to import the ParserRegistry to register your parser. The parser must be registered in order for it to be run by default in the MetadataPipeline.
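The general idea behind a registry of this kind can be sketched with a class decorator. Everything below is a stand-in written for illustration: `SketchParserRegistry`, its `register` and `get_all` methods, and `MySchemaParser` are hypothetical names, and the library's ParserRegistry may expose a different interface. The sketch does show why the import matters: registration runs when the module defining the class is imported, so a parser module that is never imported is never registered.

```python
from typing import Type


class SketchParserRegistry:
    """Illustrative registry; a stand-in, not the library's ParserRegistry."""

    _parsers: list[Type] = []

    @classmethod
    def register(cls, parser_cls: Type) -> Type:
        """Class decorator that records parser_cls and returns it unchanged."""
        if parser_cls not in cls._parsers:
            cls._parsers.append(parser_cls)
        return parser_cls

    @classmethod
    def get_all(cls) -> list[Type]:
        """Return a copy so callers cannot mutate the registry by accident."""
        return list(cls._parsers)


# Registration happens as a side effect of defining (importing) the class.
@SketchParserRegistry.register
class MySchemaParser:
    pass


print([parser.__name__ for parser in SketchParserRegistry.get_all()])
```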
Step 5: Testing the Parser
Lastly, you'll want to write tests to verify that your parser works as expected. At minimum, the should_parse_token() and parse_metadata() functions should be tested because the pipeline will call those directly.
It's important to verify that the should_parse_token() function returns True if and only if a token is meant to be parsed by that parser.
Given a token, parse_metadata() should normalize the raw data into the standardized metadata format. Since making network requests can be flaky, it's preferable to mock the data that would be returned by the server that hosts the metadata information.
def test_opensea_parser_should_parse_token(self, raw_crypto_coven_metadata):
    fetcher = MetadataFetcher()
    ipfs_adapter = IPFSAdapter()
    fetcher.register_adapter(ipfs_adapter, "ipfs://")
    parser = OpenseaParser(fetcher=fetcher)
    assert parser.should_parse_token(token=self.token, raw_data=raw_crypto_coven_metadata) is True
def test_opensea_parser_parses_metadata(self, raw_crypto_coven_metadata):
    fetcher = MetadataFetcher()
    ipfs_adapter = IPFSAdapter()
    fetcher.register_adapter(ipfs_adapter, "ipfs://")
    # Mock the network call so the test does not depend on a live server.
    fetcher.fetch_mime_type_and_size = MagicMock(return_value=("application/json", 3095))
    parser = OpenseaParser(fetcher=fetcher)
    metadata = parser.parse_metadata(token=self.token, raw_data=raw_crypto_coven_metadata)
    assert metadata == Metadata(
        token=self.token,
        raw_data=raw_crypto_coven_metadata,
        standard=None,
        attributes=[
            Attribute(trait_type="Background", value="Sepia", display_type=None),
            Attribute(trait_type="Skin Tone", value="Dawn", display_type=None),
            Attribute(trait_type="Body Shape", value="Lithe", display_type=None),
            Attribute(trait_type="Top", value="Sheer Top (Black)", display_type=None),
            Attribute(
                trait_type="Eyebrows",
                value="Medium Flat (Black)",
                display_type=None,
            ),
            Attribute(trait_type="Eye Style", value="Nyx", display_type=None),
            Attribute(trait_type="Eye Color", value="Cloud", display_type=None),
            Attribute(trait_type="Mouth", value="Nyx (Mocha)", display_type=None),
            Attribute(trait_type="Hair (Front)", value="Nyx", display_type=None),
            Attribute(trait_type="Hair (Back)", value="Nyx Long", display_type=None),
            Attribute(trait_type="Hair Color", value="Steel", display_type=None),
            Attribute(trait_type="Hat", value="Witch (Black)", display_type=None),
            Attribute(
                trait_type="Necklace",
                value="Moon Necklace (Silver)",
                display_type=None,
            ),
            Attribute(
                trait_type="Archetype of Power",
                value="Witch of Woe",
                display_type=None,
            ),
            Attribute(trait_type="Sun Sign", value="Taurus", display_type=None),
            Attribute(trait_type="Moon Sign", value="Aquarius", display_type=None),
            Attribute(trait_type="Rising Sign", value="Capricorn", display_type=None),
            Attribute(trait_type="Will", value="9", display_type="number"),
            Attribute(trait_type="Wisdom", value="9", display_type="number"),
            Attribute(trait_type="Wonder", value="9", display_type="number"),
            Attribute(trait_type="Woe", value="10", display_type="number"),
            Attribute(trait_type="Wit", value="9", display_type="number"),
            Attribute(trait_type="Wiles", value="9", display_type="number"),
        ],
        name="nyx",
        description="You are a WITCH of the highest order. You are borne of chaos that gives the night shape. Your magic spawns from primordial darkness. You are called oracle by those wise enough to listen. ALL THEOLOGY STEMS FROM THE TERROR OF THE FIRMAMENT!",
        mime_type="application/json",
        image=MediaDetails(
            size=3095,
            sha256=None,
            uri="https://cryptocoven.s3.amazonaws.com/nyx.png",
            mime_type="application/json",
        ),
        content=None,
        additional_fields=[
            MetadataField(
                field_name="external_url",
                type=MetadataFieldType.TEXT,
                description="This is the URL that will appear below the asset's image on OpenSea and will allow users to leave OpenSea and view the item on your site.",
                value="https://www.cryptocoven.xyz/witches/1",
            ),
            MetadataField(
                field_name="background_color",
                type=MetadataFieldType.TEXT,
                description="Background color of the item on OpenSea. Must be a six-character hexadecimal without a pre-pended #.",
                value="",
            ),
        ],
    )
In addition to testing your parser, you'll need to verify that the parser has been registered and added to the pipeline correctly. The tests in tests/metadata/registries/test_parser_registry.py should fail if they are not updated to include your new parser class.
OpenSea Schema Code
View the full source code for the OpenSea schema parser here.