Visualizing Potential Fraud on the Ethereum Blockchain Using Network Graphs


Beyond the eye-popping prices of monkey-themed images, the underlying technology of NFTs offers companies a new way to monetize their digital presence directly. Major brands like Adidas, the NBA, and TIME have already begun experimenting with NFTs to explore these revenue streams—and we are still in the early stages of this trend.

As data practitioners, we can deliver valuable insights into these new business models, especially since all transactions are publicly accessible on the blockchain. This article introduces a starter project that uses Python to access and analyze blockchain data and to flag potential fraud.

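In this post, we’ll cover:

  1. Understanding blockchain data
  2. What NFTs are
  3. Why network graphs suit blockchain data
  4. Extracting data from the Ethereum blockchain
  5. Preparing the data and creating a network graph
  6. Visualizing potential wash trading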

We’ll also include a Frequently Asked Questions section to clarify common points of confusion.


Understanding Blockchain Data

Amid the media buzz around meme coins and million-dollar pixel art, there lies a transformative technology: the blockchain.

At its core, a blockchain is a cryptographically secured, decentralized, and immutable digital ledger. Unlike a traditional bank ledger, a blockchain is maintained by a distributed network of computers that must collectively agree on the validity of each new entry. This keeps the ledger transparent and makes tampering with past entries extremely difficult.
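To make the tamper-resistance idea concrete, here is a toy hash-linked ledger. It is only a sketch: it leaves out consensus, signatures, and everything else a real network does, and the helper names (`block_hash`, `build_chain`, `is_valid`) are hypothetical, not part of Ethereum or any library.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(entries):
    """Build a list of blocks, each linked to its predecessor by hash."""
    chain = []
    prev_hash = "0" * 64  # placeholder hash for the first block
    for index, data in enumerate(entries):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = "0" * 64
    for block in chain:
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Made-up ledger entries for illustration only.
ledger = build_chain(["A pays B 1 ETH", "B pays C 2 ETH", "C pays A 1 ETH"])
```

Editing the data of any earlier block changes its hash, so `is_valid(ledger)` fails unless every later block is rewritten as well. That is the property the distributed network is designed to make impractical.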

For a deeper dive into Ethereum data, consider exploring dedicated Ethereum analytics resources.

One of the most powerful features of blockchain technology is that all data—including transaction logs and metadata—is public and accessible. This openness allows analysts to study transaction behaviors, track asset movement, and identify unusual patterns.


What Are NFTs?

NFT stands for Non-Fungible Token. It is a type of cryptographic asset on a blockchain that represents ownership of a unique digital or physical item. For example, while an ounce of gold is fungible (interchangeable with any other ounce), the original Mona Lisa is non-fungible—there is only one.

Contrary to popular belief, NFTs are not just digital art. They can represent ownership of a wide range of assets, including music, virtual real estate, and collectibles. In this article, we focus on the Bored Ape Yacht Club (BAYC), one of the most well-known NFT art projects.

For a beginner-friendly introduction to NFTs, video explainers from trusted educational sources can be very helpful.


Why Use Network Graphs for Blockchain Data?

Network graphs are an intuitive way to represent relational data. They consist of nodes (entities such as wallet addresses) and edges (connections or transactions between those nodes). Both nodes and edges can store metadata such as timestamps, transaction values, and asset IDs.

Blockchain transactions are inherently relational—every transaction has a from address, a to address, and associated metadata. This makes network graphs a natural fit for analyzing and visualizing transaction behavior.
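As a small illustration of that mapping, the sketch below stores three made-up transfers in a NetworkX `MultiDiGraph`, a directed graph that allows several edges between the same pair of wallets. The wallet names and attribute names (`asset_id`, `price_eth`, `timestamp`) are invented for the example.

```python
import networkx as nx

# Hypothetical wallets and transfers; the attribute names are illustrative.
G = nx.MultiDiGraph()
G.add_edge("wallet_a", "wallet_b", asset_id=1, price_eth=10.0, timestamp="2021-06-01")
G.add_edge("wallet_b", "wallet_c", asset_id=1, price_eth=12.5, timestamp="2021-06-03")
G.add_edge("wallet_a", "wallet_b", asset_id=2, price_eth=3.0, timestamp="2021-06-04")

# Nodes can carry metadata too, such as a display label.
G.nodes["wallet_a"]["label"] = "Wallet A"

# Each edge is one transaction: sender, receiver, and its metadata.
for sender, receiver, attrs in G.edges(data=True):
    print(sender, "->", receiver, attrs)
```

A multigraph is used here because two wallets can trade with each other more than once; a plain `DiGraph` would keep only one edge per ordered pair.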

In this tutorial, we use network graphs to detect wash trading: a form of market manipulation where a trader buys and sells an asset through multiple accounts to create artificial trading volume or inflate the price.

According to industry reports, wash trading is a significant issue in NFT markets, with millions of dollars in estimated profits from such activities in recent years.


Extracting Data from the Ethereum Blockchain

Although blockchain data is public, accessing and preparing it for analysis can be challenging. Common methods range from running your own node to querying a third-party API.

For this project, we use a16z's NFT Analyst Starter Pack, which provides a convenient wrapper around the Alchemy API and outputs easy-to-use CSV files for specified NFT contracts.


Preparing Data and Creating a Network Graph

The NFT Analyst Starter Pack generates three CSV files for the BAYC project:

  1. BAYC Metadata: Information about individual NFTs, identified by a unique asset_id.
  2. BAYC Sales: Transaction logs including buyer, seller, sale price, and transaction hash.
  3. BAYC Transfers: Data on asset transfers, including those without financial transactions.
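A minimal sketch of loading and combining the sales and transfers files with pandas might look like the following. The column names and sample rows are assumptions made for illustration (only `asset_id` is named above), and in-memory strings stand in for the real CSV files, so check your own file headers before reusing it.

```python
import io

import pandas as pd

# Stand-ins for the starter pack's CSV output. Column names are assumptions;
# check the headers of your own files before reusing this.
sales_csv = io.StringIO(
    "transaction_hash,asset_id,seller,buyer,sale_price_eth\n"
    "0xaaa,8099,0x111,0x222,50.0\n"
    "0xbbb,8099,0x222,0x333,95.5\n"
    "0xccc,1234,0x444,0x555,70.0\n"
)
transfers_csv = io.StringIO(
    "transaction_hash,asset_id,from_address,to_address\n"
    "0xaaa,8099,0x111,0x222\n"
    "0xbbb,8099,0x222,0x333\n"
    "0xccc,1234,0x444,0x555\n"
    "0xddd,8099,0x333,0x111\n"
)

# Read hashes and addresses as strings so they are never treated as numbers.
sales = pd.read_csv(
    sales_csv, dtype={"transaction_hash": str, "seller": str, "buyer": str}
)
transfers = pd.read_csv(
    transfers_csv,
    dtype={"transaction_hash": str, "from_address": str, "to_address": str},
)

# Left-join so transfers without a matching sale are kept (price is NaN).
activity = transfers.merge(
    sales[["transaction_hash", "asset_id", "sale_price_eth"]],
    on=["transaction_hash", "asset_id"],
    how="left",
)
activity["is_sale"] = activity["sale_price_eth"].notna()
print(activity)
```

Keeping transfers that have no sale attached matters for this kind of analysis, because an asset can move between wallets without any payment being recorded.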

Key data preparation steps include:

Once the data is cleaned, we use the NetworkX package in Python to construct a network graph. The from_pandas_edgelist function allows us to build a graph from a DataFrame by specifying source and target columns (representing from and to addresses) and attaching metadata to each edge.
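The call described above can be sketched as follows, assuming a cleaned DataFrame whose address columns are named `from_address` and `to_address`. Those names and the sample rows are hypothetical; `create_using=nx.MultiDiGraph` is one reasonable choice for keeping direction and repeated trades, not necessarily the setting used in the original project.

```python
import networkx as nx
import pandas as pd

# Hypothetical cleaned transfers; column names are illustrative.
transfers = pd.DataFrame(
    {
        "from_address": ["0x111", "0x222", "0x333", "0x111"],
        "to_address": ["0x222", "0x333", "0x111", "0x222"],
        "asset_id": [8099, 8099, 8099, 42],
        "sale_price_eth": [50.0, 95.5, 120.0, 3.0],
    }
)

# One edge per row; the listed columns become edge attributes.
G = nx.from_pandas_edgelist(
    transfers,
    source="from_address",
    target="to_address",
    edge_attr=["asset_id", "sale_price_eth"],
    create_using=nx.MultiDiGraph,
)

print(G.number_of_nodes(), "wallets,", G.number_of_edges(), "transfers")
```

With the default `create_using` (an undirected `Graph`), the direction of each transfer and any repeated trades between the same two wallets would be lost.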

With over 40,000 transactions in the full dataset, visualizing the entire graph is impractical. Instead, we focus on a specific asset to tell a clear and compelling data story.


Visualizing Potential Wash Trading

To demonstrate the process, we examine BAYC token #8099, which has been publicly flagged as potentially involved in wash trading.

We follow these steps:

  1. Filter the dataset to include only transactions involving asset #8099.
  2. Anonymize wallet addresses by renaming them to alphabetical labels (e.g., Wallet A, Wallet B).
  3. Use NetworkX to generate a directed graph from the transaction data.
  4. Visualize the graph with labels, edge arrows, and a structured layout.
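The four steps above can be sketched end to end as shown below. All addresses, prices, and column names are made up, and `draw_graph` is a hypothetical helper that assumes matplotlib is installed. The lettering scheme handles at most 26 wallets, and a `DiGraph` keeps only one edge per ordered wallet pair.

```python
import string

import networkx as nx
import pandas as pd

# Hypothetical transfers table; addresses, prices, and column names are made up.
transfers = pd.DataFrame(
    {
        "asset_id": [8099, 8099, 8099, 1234],
        "from_address": ["0x111", "0x222", "0x333", "0x444"],
        "to_address": ["0x222", "0x333", "0x111", "0x555"],
        "sale_price_eth": [50.0, 95.5, 120.0, 70.0],
    }
)

# Step 1: keep only the asset of interest.
asset = transfers[transfers["asset_id"] == 8099].copy()

# Step 2: map each address to "Wallet A", "Wallet B", ... in order of appearance.
addresses = pd.unique(asset[["from_address", "to_address"]].values.ravel())
labels = {
    addr: f"Wallet {string.ascii_uppercase[i]}" for i, addr in enumerate(addresses)
}
asset["from_label"] = asset["from_address"].map(labels)
asset["to_label"] = asset["to_address"].map(labels)

# Step 3: build a directed graph with the sale price on each edge.
G = nx.from_pandas_edgelist(
    asset,
    source="from_label",
    target="to_label",
    edge_attr="sale_price_eth",
    create_using=nx.DiGraph,
)


def draw_graph(graph, path="asset_graph.png"):
    """Step 4: draw labels, arrows, and edge prices using a circular layout."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    pos = nx.circular_layout(graph)
    nx.draw_networkx(graph, pos, with_labels=True, arrows=True, node_size=2500)
    edge_labels = nx.get_edge_attributes(graph, "sale_price_eth")
    nx.draw_networkx_edge_labels(graph, pos, edge_labels=edge_labels)
    plt.axis("off")
    plt.savefig(path, bbox_inches="tight")
    plt.close()
```

Calling `draw_graph(G)` writes the figure to `asset_graph.png`. A circular layout is used because it makes round trips between a handful of wallets easy to spot.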

The resulting graph reveals a circular pattern of transactions between a small set of wallets, accompanied by a sharp increase in sale price. While this does not conclusively prove wash trading, it highlights a pattern worthy of further investigation.
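A circular pattern can also be flagged programmatically. The sketch below uses NetworkX's `simple_cycles` on made-up trades to list closed loops of wallets; this is one possible heuristic rather than the method used for the analysis above, and a cycle on its own is a red flag, not proof.

```python
import networkx as nx

# Hypothetical trades for one token: (seller, buyer, price in ETH).
trades = [
    ("Wallet A", "Wallet B", 50.0),
    ("Wallet B", "Wallet C", 95.5),
    ("Wallet C", "Wallet A", 120.0),
    ("Wallet A", "Wallet D", 300.0),
]

G = nx.DiGraph()
for seller, buyer, price in trades:
    G.add_edge(seller, buyer, price_eth=price)

# simple_cycles yields each closed loop of wallets once, as a list of nodes.
cycles = [cycle for cycle in nx.simple_cycles(G) if len(cycle) >= 2]

for cycle in cycles:
    print("Round trip between:", sorted(cycle))
```

Here the token leaves Wallet A and returns to it through Wallets B and C, so those three wallets are reported, while Wallet D, which only receives the token, is not.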

By examining the transaction history between the involved wallets on Etherscan, analysts can gather additional evidence to assess whether market manipulation occurred.


Frequently Asked Questions

What is wash trading?
Wash trading is a form of market manipulation where an individual or group trades an asset with themselves to create false activity and manipulate perceived value.

Can network graphs prove fraud?
No. Network graphs help visualize transaction patterns and identify red flags. Further investigation using tools like Etherscan is needed to validate suspicions.

Is all blockchain data public?
Yes, on public blockchains like Ethereum, all transaction data is accessible. However, wallet owners are pseudonymous by default.

Do I need to run a node to analyze blockchain data?
Not necessarily. Tools like the NFT Analyst Starter Pack simplify data access, though running your own node offers the highest level of data autonomy.

What other NFT projects can I analyze this way?
This method can be applied to any NFT project on Ethereum. You only need the project’s contract address to begin extracting data.

How can I learn more about blockchain analytics?
Join online communities focused on Web3 and blockchain data science. Many open-source tools and educational resources are available for beginners.


Next Steps

This tutorial introduced how to access, prepare, and visualize Ethereum NFT data using Python and network analysis. If you’re interested in diving deeper, consider repeating the analysis on other tokens or on another Ethereum NFT project.

To learn more about real-time analytics tools and advanced methods, you can explore additional strategies here.


Disclaimer

This content is for educational purposes only and is not financial advice. The author does not hold any financial position in the NFTs analyzed. This analysis highlights potential red flags for further investigation and does not constitute proof of fraud. Always protect your private keys and recovery phrases—never share them with anyone.