
Downsides of Crawling Data with Infura and Alchemy

Weekly updates 08.18

This was a busy week. Here are our updates:

> We’ve been working on the reliability of our product. Before last week, we only used our own nodes on each chain that we support. Last week, we added QuickNode as a backup. If our nodes fail for some reason, we now fall back to QuickNode infrastructure.

If you would like to see how we implemented our fallback strategy, that code is free and open source just like everything else we build – Moonstream Node Balancer.
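For readers who want the gist without opening the repo, here is a minimal sketch of the fallback idea. It is not the Moonstream Node Balancer's actual code: `call_with_fallback`, `AllProvidersFailed`, and the provider callables are hypothetical names used only for illustration, and each provider is assumed to be a function that takes a JSON-RPC method and params and either returns a result or raises.

```python
from typing import Any, Callable, List, Sequence

# A provider is any callable that performs one JSON-RPC call (hypothetical shape).
Provider = Callable[[str, list], Any]


class AllProvidersFailed(Exception):
    """Raised when every provider in the list raised an error."""


def call_with_fallback(providers: Sequence[Provider], method: str, params: list) -> Any:
    """Try providers in order and return the first successful result."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(method, params)
        except Exception as exc:  # a real balancer would be more selective here
            errors.append(exc)
    raise AllProvidersFailed(errors)
```

Putting our own nodes first in the list and the backup last gives the ordering described above: the backup is only contacted when the earlier providers raise.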

> Last week we added support for Ethereum, Gnosis Chain, and Polygon Mumbai testnet on Moonstream engine API. As we increase the scope of our operations on Ethereum mainnet, understanding how much gas our contracts use up has become very important.

This week, we completed the first draft of our gas profiler. The pull request is up and we are reviewing it now. 

If you are curious about this, we’d be happy to teach you how to use it in our Discord.

Work on the NFT v2 dataset is still ongoing. We found some subtle data completeness issues resulting from our use of Infura and Alchemy when crawling NFT and ERC20 events from the Ethereum and Polygon blockchains.

Because we run production operations through our own nodes, we prefer to run crawls on external nodes so as not to put an additional burden on our infrastructure (which our customers use for timely analytics and for security). To this end, we usually use Infura and Alchemy when we are just gathering data for public datasets.

The NFT v2 dataset is meant to contain all Ethereum and Polygon ERC721 events and all the ERC20 events on those blockchains which occurred in the same transactions as ERC721 events.
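The selection rule can be written down in a few lines. This is only a sketch of the rule as stated, not our crawler code: it assumes events have already been decoded into dictionaries with hypothetical `standard` and `transaction_hash` keys.

```python
from typing import Dict, Iterable, List


def select_nft_v2_events(events: Iterable[Dict]) -> List[Dict]:
    """Keep every ERC721 event, plus ERC20 events sharing a transaction with one."""
    events = list(events)
    # Transactions that contain at least one ERC721 event.
    erc721_txs = {e["transaction_hash"] for e in events if e["standard"] == "erc721"}
    return [
        e
        for e in events
        if e["standard"] == "erc721"
        or (e["standard"] == "erc20" and e["transaction_hash"] in erc721_txs)
    ]
```

Note that the rule needs every ERC721 event in a transaction to be known before the ERC20 events can be filtered, which is one reason missing events are so damaging.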

Since the period we are crawling covers the peak of the bull market as well as the market crash, we have to crawl some pretty wild transactions.

The way we crawl events using eth_getLogs, we don’t have much control over paging through the logs on a transaction-by-transaction (and log index-by-log index) basis. We have to go block by block.
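To make the limitation concrete, here is roughly what a single-block eth_getLogs request looks like. The helper name `build_get_logs_request` is hypothetical, and the sketch only builds the JSON-RPC payload without sending it. The filter object accepts a block range (or block hash), an address, and topics, but nothing that selects a transaction or a log index, so one block is the smallest page we can ask for.

```python
from typing import Dict, List, Optional

# keccak256("Transfer(address,address,uint256)"), shared by ERC20 and ERC721 Transfer events.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def build_get_logs_request(
    block_number: int,
    topics: Optional[List[str]] = None,
    request_id: int = 1,
) -> Dict:
    """Build a JSON-RPC payload asking for the logs of exactly one block."""
    block_hex = hex(block_number)  # block numbers are hex-encoded quantities
    log_filter: Dict = {"fromBlock": block_hex, "toBlock": block_hex}
    if topics is not None:
        log_filter["topics"] = topics
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [log_filter],
    }
```

Calling `build_get_logs_request(15_000_000, [TRANSFER_TOPIC])` produces a payload that can be POSTed to any JSON-RPC endpoint.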

Infura throws an error when we try to crawl over 10,000 events from a given block. Honestly, this is exemplary behavior for an API. Alchemy, on the other hand, quietly returns a maximum of 10,000 events without notifying us that it skipped any additional events in the response. This was causing our crawlers to write incomplete data into the dataset. Silently dropping data is one of the worst things an API can do to its consumers!
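One defensive pattern against silent truncation is to treat any response that reaches the cap as suspect. The sketch below is an illustration, not our crawler: `get_logs_checked`, `PossiblyTruncated`, and the `fetch` callable are hypothetical names, and the cap of 10,000 is simply the number we observed. Multi-block ranges are split in half and retried. A single block that hits the cap cannot be split further, so it is reported loudly instead of being written to the dataset.

```python
from typing import Callable, Dict, List

LOG_CAP = 10_000  # assumption: the provider returns at most this many logs

# fetch(from_block, to_block) returns the logs the provider gave us (hypothetical shape).
Fetch = Callable[[int, int], List[Dict]]


class PossiblyTruncated(Exception):
    """A single block returned LOG_CAP or more logs, so events may be missing."""


def get_logs_checked(fetch: Fetch, from_block: int, to_block: int) -> List[Dict]:
    """Fetch logs for an inclusive block range, never trusting a capped response."""
    logs = fetch(from_block, to_block)
    if len(logs) < LOG_CAP:
        return logs  # below the cap, so nothing was cut off
    if from_block == to_block:
        raise PossiblyTruncated(f"block {from_block} returned {len(logs)} logs")
    # The response may have been cut off: split the range and fetch each half.
    mid = (from_block + to_block) // 2
    return get_logs_checked(fetch, from_block, mid) + get_logs_checked(fetch, mid + 1, to_block)
```

The check is conservative: a block with exactly 10,000 real events is also flagged, which is a cheaper mistake than writing incomplete data.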

We are fixing this by switching our crawls over to QuickNode because, as mentioned, we have to crawl some pretty wild transactions.

We just wanted Infura and Alchemy to be aware of this experience using their APIs. We are happy to discuss further in case anyone from their teams would like to speak with us.

> Reputation system updates.

Last week, we presented a document explaining how we plan to manage reputation for members of Moonstream Discord.

Some of you provided us with very valuable feedback on our proposal. Shout out to Daniel Tedesco, Moonstream DAO member – namupta on Discord, Ronen Kirsh, and the Game7 DAO community.

We decided to write a whitepaper presenting the motivation and technology behind our reputation system, in case other DAOs would also like to make use of it. You can find it here.

Based on the feedback that we received, we are adding specializations to Moonstream DAO reputation badges – research, programming, and data science.

Initially, we were planning for Discord to be the main interface to our reputation system for members of Moonstream DAO. We are now expanding this to:

  • GitHub bots which automatically reward programmers with experience points for contributions to our codebase
  • a website (on Moonstream player portals) where DAO members can upgrade their badges using experience points

All these changes are documented in the Reputation whitepaper changelist.

We would love to have your feedback on:

  1. Moonstream DAO reputation: Are the points and perks now better aligned with how you would like to engage with Moonstream DAO? 
  2. White paper: Was it easy to understand? How could we make it clearer or easier to read?
  3. Demo: What would you like to see in a demo of our reputation system?

> Our reputation technology is built using our on-chain crafting features. The Crafting whitepaper is in progress, here’s a little sneak peek:

Crafting whitepaper

> This week we discussed web3 gaming promotion tactics that are not ideal on Twitter. You can read about it here.

> I (Moonstream’s main writer) was nominated for HackerNoon Contributor of the Year – NFT in the Noonies 2022. If you like my posts, you can show some love and support by voting for me here (I’m Daria N).

> We also started rolling out our logo update on all of our accounts. Our Twitter, Discord, and Medium all have the brand new logo – what do you think of it?

> Now we have an official links channel on our Discord where you can find links to our GitHub repos, social media, and other useful places.

And as always, join our Discord server to get all the latest news! Thank you for reading~