Ethereum Swarm $BZZ Valuation

Butian
Jun 20, 2021

This investment memo was written a few months ago. The market, and things in general, have changed a lot since then. Swarm announced the Bee mainnet soft launch on 13 June 2021, followed by the BZZ public token sale on 14 June 2021 via CoinList. As all BZZ tokens (yes, the total token supply) will be released on June 21st, I thought it might be a good time to validate and revisit the Swarm investment thesis.

TL;DR

Conclusion

  • Decentralized storage projects are highly valued by the market. Current solutions are still at an early, cutting-edge stage and will coexist rather than compete directly with one another
  • Swarm is fundamentally different from the other interplanetary file storage and transfer systems — it is not a decentralized version of a Dropbox for file upload and retrieval, but rather a base layer decentralized storage utility that empowers Web3 digital and social constructs
  • Swarm’s integration with Ethereum blockchain, protocols, and domain name resolutions is a sought-after solution for boosting Web3 network interoperability

VALUATION

Prominent Distributed Store Coin Market Cap (USD)

Table 1 Coin Market Cap

As of 2020, Amazon AWS held a market share of about 30% and an enterprise value of $400B[1], implying a valuation of roughly $1.3T ($400B / 0.30) for the entire cloud storage market.

We discount Filecoin’s fully diluted valuation by half, to $137B, to build in some buffer against a future drop in the coin price. On that basis, Filecoin represents roughly 10% of the $1.3T cloud storage market, which we regard as an underestimate of its valuation, considering that (1) the cost of distributed storage is less than 1% of the cost of storing the same amount of data on AWS and (2) distributed storage is not just a replacement for cloud storage but a step up to Web3 interoperability. Let’s work from that underestimate to get a conservative forecast of Swarm’s valuation.

We then assume Swarm will capture 5% of Filecoin’s market share once Swarm Live 1.0 has passed the early adoption phase. This is a fairly safe assumption, given that Swarm and IPFS+Filecoin are the two solutions the dev community most often mentions as interchangeable.

The valuation of Swarm on a fully diluted basis would be $27.3B. A total supply of 62.5M BZZ tokens implies a token price of 219U. To build some buffer into the equation, we also assume an adverse market condition in which the crypto market valuation shrinks by a factor of 10. In this scenario, BZZ tokens would be priced at 21.9U.

The table below compares three scenarios: Swarm’s market share at 10% of Filecoin’s (Scenario A), at 50% of Filecoin’s (Scenario B), and equal to AWS’s (Scenario C). The BZZ token price is listed for each scenario, under current market conditions and in a bear market respectively.

Table 2 Price analysis by scenario
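The scenario arithmetic can be sketched in a few lines. This is a minimal illustration, not a reproduction of Table 2: it assumes the scenario shares are applied to the discounted $137B Filecoin figure, reads "same as AWS" as the $400B enterprise value, and uses a function name (token_price) that is purely illustrative.

```python
# Inputs quoted in the text; names are illustrative.
FILECOIN_DISCOUNTED_FDV = 137e9   # USD, Filecoin valuation after the 50% discount
AWS_ENTERPRISE_VALUE = 400e9      # USD, from [1]
BZZ_TOTAL_SUPPLY = 62.5e6         # tokens
BEAR_MARKET_FACTOR = 10           # crypto valuations shrink by a factor of 10


def token_price(network_valuation, supply=BZZ_TOTAL_SUPPLY):
    """Fully diluted price per token: valuation divided by total supply."""
    return network_valuation / supply


# Assumption: each scenario share is taken of the discounted Filecoin figure.
scenarios = {
    "A (10% of Filecoin)": 0.10 * FILECOIN_DISCOUNTED_FDV,
    "B (50% of Filecoin)": 0.50 * FILECOIN_DISCOUNTED_FDV,
    "C (same as AWS)": AWS_ENTERPRISE_VALUE,
}

for name, valuation in scenarios.items():
    price = token_price(valuation)
    bear = price / BEAR_MARKET_FACTOR
    print(f"Scenario {name}: {price:,.1f}U now, {bear:,.1f}U in a bear market")
```

Under these assumptions Scenario A works out to about 219U per token, and about 21.9U in the bear case.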

OVERVIEW

Introduction

Swarm is a system of peer-to-peer networked nodes that together form a decentralized storage platform and content distribution service. It is a native base-layer service of the Ethereum Web3 stack that aims to provide a decentralized and redundant store for DApp code, user data, blockchain data, and state data, enabling services such as node-to-node messaging, media streaming, decentralized database services, scalable state-channel infrastructure for decentralized service economies, and more.

Vision

To become the operating system of the re-decentralized internet, providing a scalable and self-sustaining infrastructure for a supply-chain economy of data.

System Advantage

  • zero downtime
  • resistant to distributed denial-of-service (DDoS) attacks
  • fault-tolerant
  • censorship-resistant
  • self-sustaining due to a built-in incentive system

Current Stage

Since its inception in 2015, Swarm has rolled out 5 major releases, with Swarm Live released in Nov 2020, the latest version, v0.6, announced on May 7th, 2021, and the v1.0 release scheduled for later this year. Bee Dashboard, a web application for operating Bee nodes (Bee is the name of the Swarm client implemented in Go), was released on April 29th, 2021. There are currently over 50K[2] Swarm Bee nodes up and running.

INCENTIVES SYSTEM

Cryptoeconomics

Swarm set out to build an incentive structure for a self-sustaining system, achieved via its built-in incentive system enforced through smart contracts[3] on the Ethereum blockchain, enabling zero-cost hosting. For Swarm to properly function as a decentralized peer-to-peer storage and communication infrastructure, there must be network participants who:

  • Contribute bandwidth for incoming and outgoing requests
  • Provide storage for users to upload and retrieve data
  • Forward incoming requests to peers who can fulfill them if they cannot serve the request themselves

Swarm introduces its own incentive system to ensure correct network behavior by rewarding nodes for serving the functions above. At a high level, the Swarm system has 1) bandwidth incentives and 2) storage incentives, because bandwidth and storage are the two most important resources in a distributed file system.

Accounting

Swarm’s incentive system is implemented using the Swarm Accounting Protocol (SWAP), which relies on the simple tit-for-tat p2p accounting model:

  • A node gets rewarded for serving resources
  • A node gets charged for requesting resources

Conceptually, when a node requests data from a peer and the peer delivers it, the peer credits an amount equal to the price of that message in its local database, at the index representing the node it delivered to. The node, on receiving the message (data) from the peer, debits itself the same amount in its local database, at the index of the peer. To avoid insufficient-funds errors and roll-backs after a failed operation, Swarm first reads the database to check whether there are enough funds. If there are, the requested operation (send/receive) is performed, and only if the operation succeeds is the accounting entry persisted.
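The check, perform, persist order described above can be sketched as follows. This is a minimal illustration, not Swarm's implementation: the Ledger class, the max_debt limit, and the callables passed in are all hypothetical, and a plain dict stands in for the node's local database.

```python
class InsufficientFunds(Exception):
    """Raised when an entry would push a balance past the allowed debt."""


class Ledger:
    """One node's local accounting database, indexed by peer (a dict here)."""

    def __init__(self, max_debt):
        self.max_debt = max_debt      # hypothetical funds limit
        self.balances = {}            # peer id -> balance (negative = we owe)

    def _record(self, peer, delta, operation):
        new_balance = self.balances.get(peer, 0) + delta
        if new_balance < -self.max_debt:      # 1. read and check funds first
            raise InsufficientFunds(peer)
        operation()                           # 2. perform the send/receive
        self.balances[peer] = new_balance     # 3. persist only on success

    def credit(self, peer, price, send):
        """We delivered data to `peer`, so the peer now owes us `price`."""
        self._record(peer, +price, send)

    def debit(self, peer, price, receive):
        """We received data from `peer`, so we now owe the peer `price`."""
        self._record(peer, -price, receive)


# The peer delivers a message priced at 5 to the node; each side books it locally.
node, peer = Ledger(max_debt=100), Ledger(max_debt=100)
peer.credit("node", 5, send=lambda: None)
node.debit("peer", 5, receive=lambda: None)
print(peer.balances, node.balances)
```

Because the entry is written last, an exception raised by the send or receive step leaves both balances untouched, so no roll-back is needed.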

BZZ Token

BZZ starts out as an ERC20 token. A multi-chain token will be made possible for EVM-based blockchains, as Swarm works to be blockchain agnostic although it was first designed to work with Ethereum. The token will be used as a means of deposit and payment for SWAP and postage stamps (a spam protection feature[4]). According to Swarm’s Medium post[5], part of the BZZ tokens allocated to fundraising are reserved for a public sale near the 1.0 release.

Fraud/Cheating Defense

There are two threshold parameters, which allow the mutual accounting balance to oscillate within a defined range. The payment threshold is the level at which a node, once its balance relative to a peer goes below it, should trigger a payment to that peer. The payment is initiated by the debtor node. The disconnect threshold is the level at which a node, once the balance relative to a peer goes above it, can disconnect that peer. This is monitored by the creditor. The distance between the two thresholds is designed so that normal variance in the balances between peers does not cause a disconnect (|disconnect threshold| > |payment threshold|).

One possible fraud scenario is that a peer modifies its local database to show a higher balance. However, the mutual accounting mechanism ensures that such a modification has only a local effect, as the fraudulent node cannot trigger or force other peers to send cheques to it. When such a discrepancy arises, the fraudulent node is disconnected from the network and loses funds up to the disconnect threshold amount.

Another scenario is that a peer can be freeriding by consuming resources up to the disconnect threshold. At this point, if there is no settlement, the peer simply gets disconnected.

Although the threshold mechanism does not prevent fraud, it significantly limits malicious behavior in the system, leaving peers with no sustainable economic benefit from it.
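The two thresholds and the freerider case can be illustrated with a small sketch. The threshold values, the price per chunk, and the function names are assumptions chosen for illustration; only the relation |disconnect threshold| > |payment threshold| comes from the text.

```python
PAYMENT_THRESHOLD = 100      # hypothetical value
DISCONNECT_THRESHOLD = 150   # hypothetical value, larger than the payment threshold


def debtor_action(balance):
    """Debtor's view: balance is negative when this node owes the peer."""
    return "send_cheque" if balance <= -PAYMENT_THRESHOLD else "none"


def creditor_action(owed_by_peer):
    """Creditor's view: how much the peer currently owes this node."""
    return "disconnect" if owed_by_peer >= DISCONNECT_THRESHOLD else "keep"


def serve_freerider(price_per_chunk=10):
    """Serve a peer that never settles, until the creditor disconnects it."""
    owed = 0
    while creditor_action(owed) == "keep":
        owed += price_per_chunk   # one more chunk served on credit
    return owed


print(debtor_action(-120))   # an honest debtor settles once past the payment threshold
print(serve_freerider())     # total a freerider can consume before being cut off
```

The creditor's loss is capped at roughly the disconnect threshold (plus at most one chunk price), which is why freeriding yields no sustained benefit.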

Ecosystem

There are three actors in Swarm’s universe, namely providers of storage, developers, and users whose function and value-add are self-explanatory, and of course the more people the better. What’s not obvious is how to achieve viral growth and massive adoption. Some good signs of generating traction for Swarm are:

  • 50K+ Swarm Client nodes up and running as of April 30th, 2021, with scheduled airdrop events further boosting Bee node engagement
  • Grants eliciting developer engagement with 26 ongoing projects covering a variety of domains ranging from social DApps to digital wallets
  • Early inclusion in Ethereum’s fully decentralized web roadmap and its ecosystem since 2014[6], with extensive Web3 research and technical expertise; first-mover effect in the sphere of social collaboration and digital lives

ISSUES FOR CONSIDERATION

Ethereum

Swarm directly inherits Ethereum’s technology design. For example, the identity layer, network layer, and consensus layer of Swarm are the same as Ethereum’s. While Swarm benefits from Ethereum’s large ecosystem, secure and live network, and reliable funding sources, scalability is a big issue. In the design of Swarm, a chain of contracts is configured to maintain the basic operations. These contracts increase the data size of the blockchain to the point where Swarm is hard to operate as a full blockchain ledger. Ethereum developers have reportedly been working on Swarm as an off-chain storage solution[7]. Other off-chain solutions such as Lightning Network and Plasma allow participants to execute transactions off-chain, so that a large portion of on-chain transactions and smart contracts can be offloaded from the main chain. Integrating such off-chain techniques will thus bring new solutions to the storage policy of future distributed file systems.

Privacy

Although uploaded data can be encrypted, the data content stored in the network is accessible by every peer. Besides, under Swarm’s design, transactions that record developments of a peer can be easily collected. User information can be revealed through graph analysis of transactions, and a client can be identified through the peers it directly connects to. In short, transactions stored on the blockchain behind distributed file systems are publicly visible. Solutions to these issues are organized around access control and peer anonymity.

Scalability & Performance

Scalability and performance are yet to be tested and should be continuously monitored using Quality of Service metrics.

COMPETITIVE LANDSCAPE

Decentralized file storage addresses the need for information democracy, data privacy, and digital sovereignty. The decentralized storage industry has the following prominent players, compared in the table below, with the IPFS+Filecoin pair being the most similar to Swarm and the much more mature project.

To summarize, decentralized storage projects are highly valued by the market. Current solutions are still at an early, cutting-edge stage and come in different flavors in terms of functionality, vision, and community served. The players will coexist and complement one another rather than compete directly. In terms of product and service offering, Swarm is fundamentally different from the other interplanetary file storage and transfer systems. Its vision is not to build a decentralized version of Dropbox for file upload and retrieval, nor to perpetually store a version of a static-state file that is of value on its own, but rather to provide a scalable and self-sustaining infrastructure for a supply-chain economy of data. Swarm’s core storage component is an immutable content address rooted in its system architecture design. Its integration with the Ethereum blockchain, protocols, and domain name resolution makes it a sought-after solution for boosting Web3 network interoperability. Swarm is not an end in itself, but a base-layer decentralized storage utility that empowers digital and social constructs on top of the infrastructure.

Table 3 Comparative Analysis

Appendix A — TECHNICAL SUMMARY

Base-layer Infrastructure

Swarm’s base-layer infrastructure enables peer-to-peer contribution of resources to each other, allowing data to be traded between nodes. The main components that make up the decentralized storage system are:

  • Chunks: A chunk is a fixed-size data blob (4 KB max) and the basic unit of storage in Swarm’s DISC, keyed by its unique address. Pieces of data are stored and retrieved in chunks. Chunks are immutable, i.e., there is no replace/update operation on chunks. There are two types of chunks. (1) Content addressed chunk: a chunk is content addressed if the chunk content determines the chunk address. The address usually represents a fingerprint or digest of the data using some hash function. The default content addressed chunk in Swarm uses the Binary Merkle Tree hash algorithm with Keccak256 as the base hash to determine its address. (2) Single owner chunk: a special type of chunk whose integrity is given by the association of its payload with an identifier, attested by the signature of its owner. The identifier and the owner’s account determine the chunk address
  • Reference: A unique identifier of a file that allows clients to retrieve and access the content. Swarm’s storage API allows both encrypted and unencrypted chunk references. Users indicate on upload whether they want encryption. For unencrypted content, the file reference is the cryptographic hash of the data and serves as its content address. This reference is a 32-byte hash, serialized as 64 hex characters. For an encrypted file, the reference has two equal-length components: the first 32 bytes are the content address of the encrypted asset and the second 32 bytes are the decryption key, 64 bytes altogether, serialized as 128 hex characters
  • Manifest: A data structure that defines a mapping between arbitrary paths and files to handle collections. It forms the basis for representing collections, indexes, and routing tables, allowing Swarm to host websites and offer URL-based addressing. The BZZ URL scheme assumes that the content referenced in the domain is a manifest and renders the content entry whose path matches the one in the request path. Manifests can also be mapped to a filesystem directory tree, which allows for uploading and downloading directories. Finally, manifests can also be considered indexes, so they can be used to implement a simple key-value store or, alternatively, a database index. This offers the functionality of virtual hosting, storing entire directories, Web3 websites, or primitive data structures; it is analogous to Web2, with centralized hosting taken out of the equation.
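The chunk and reference sizes above can be made concrete with a short sketch. Note the substitution: SHA-256 from the Python standard library stands in for Swarm's Binary Merkle Tree hash over Keccak256, so the addresses produced here are not real Swarm addresses. All function names are illustrative.

```python
import hashlib
import os
from typing import List, Optional

CHUNK_SIZE = 4096  # 4 KB maximum chunk size, as stated above


def split_into_chunks(data: bytes) -> List[bytes]:
    """Cut a blob into chunks of at most CHUNK_SIZE bytes."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def content_address(chunk: bytes) -> bytes:
    """32-byte address derived from the chunk content.

    Stand-in only: SHA-256 replaces Swarm's BMT hash with Keccak256.
    """
    return hashlib.sha256(chunk).digest()


def serialize_reference(address: bytes, decryption_key: Optional[bytes] = None) -> str:
    """Hex form of a reference: address alone, or address plus decryption key."""
    return (address + (decryption_key or b"")).hex()


chunks = split_into_chunks(os.urandom(10_000))
address = content_address(chunks[0])
plain_ref = serialize_reference(address)
encrypted_ref = serialize_reference(address, decryption_key=os.urandom(32))
print(len(chunks), len(plain_ref), len(encrypted_ref))
```

A 10,000-byte blob yields three chunks; an unencrypted reference serializes to 64 hex characters and an encrypted one to 128.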

Decentralized Data Storage

When content is uploaded to Swarm, it is chopped into chunks, each of which can be accessed at the address derived from its content by the chunk hash. The references to the data chunks are themselves packed into chunks. The resulting Merkle tree hash structure allows for integrity-protected random access into large files.
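A minimal sketch of how packing references into chunks yields a single root reference is shown below. It assumes 4 KB chunks and 32-byte references (so one intermediate chunk holds 128 references) and again uses SHA-256 as a stand-in for Swarm's actual chunk hash; it illustrates the tree shape, not the real Swarm hash.

```python
import hashlib

CHUNK_SIZE = 4096
REF_SIZE = 32
BRANCHES = CHUNK_SIZE // REF_SIZE   # 128 references fit into one chunk


def chunk_hash(payload: bytes) -> bytes:
    """Stand-in chunk hash (SHA-256, not Swarm's BMT/Keccak256)."""
    return hashlib.sha256(payload).digest()


def root_reference(data: bytes) -> bytes:
    """Hash the data chunks, then pack references into chunks until one remains."""
    level = [chunk_hash(data[i:i + CHUNK_SIZE])
             for i in range(0, len(data), CHUNK_SIZE)] or [chunk_hash(b"")]
    while len(level) > 1:
        # Each intermediate chunk is the concatenation of up to 128 references.
        level = [chunk_hash(b"".join(level[i:i + BRANCHES]))
                 for i in range(0, len(level), BRANCHES)]
    return level[0]


original = bytes(CHUNK_SIZE * 200)             # 200 data chunks, two tree levels
tampered = original[:-1] + b"\x01"             # flip the very last byte
print(root_reference(original).hex())
print(root_reference(original) == root_reference(tampered))
```

Changing any byte changes the hash of its chunk and of every intermediate chunk above it, so the root reference protects the integrity of the whole file while each chunk stays individually addressable.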

Swarm’s decentralized infrastructure for storage and communication (DISC) does not keep a list of where files are to be found. Instead, it stores pieces of the file itself directly with the closest node(s). As illustrated in the figure below, in step 1, downloader node D uses forwarding Kademlia[8] routing to request the chunk from a storer node S in the neighborhood of the chunk address. In step 2, the chunk is delivered back along the same route, using the same forwarding steps in reverse. The setup guarantees that any node, uploader or requester, can reach any other node. The closest node not only serves information about the content but also hosts the data, resulting in an auto-scaling elastic cloud and maximum resource utilization.

Table 4 Swarm DISC
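The two steps of forwarding Kademlia can be sketched with a toy network. The 4-bit addresses, the peer table, and the greedy forwarding rule below are illustrative assumptions: each node hands the request to whichever connected peer is closest to the chunk address by XOR distance, and stops when no peer is closer than itself.

```python
def forward_route(start, chunk_address, peers):
    """Greedy forwarding: hop to the connected peer closest (XOR) to the chunk."""
    route = [start]
    current = start
    while True:
        candidates = peers[current] | {current}
        closest = min(candidates, key=lambda node: node ^ chunk_address)
        if closest == current:        # nobody closer: this node is the storer S
            return route
        route.append(closest)
        current = closest


# Toy 4-bit overlay; every node knows only its direct peers.
peers = {
    0b0001: {0b0100, 0b1000},
    0b0100: {0b0001},
    0b1000: {0b0001, 0b1100},
    0b1100: {0b1000, 0b1110},
    0b1110: {0b1100},
}

request_route = forward_route(start=0b0001, chunk_address=0b1111, peers=peers)
delivery_route = list(reversed(request_route))   # step 2: same hops, backwards
print([format(n, "04b") for n in request_route])
print([format(n, "04b") for n in delivery_route])
```

The downloader never needs a lookup table for the chunk: every hop strictly reduces the XOR distance to the chunk address, so the request converges on the neighborhood that stores it.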

Swarm implements a distributed preimage archive (DPA), as illustrated in the figure below. After a blob is received by a Swarm node, the node splits it into smaller, equal-sized chunks of data and distributes those chunks among different nodes, which automatically sync the data according to each chunk’s timestamp. The DPA chooses which nodes get to store which chunks. Nodes in the same part of the address space (bins 0, 1…31 in the figure) store related chunks.

Table 5 Swarm’s DPA
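One way to picture the bins is as counts of shared leading address bits. The sketch below assumes that a bin index is the proximity order between two addresses (the number of leading bits they have in common), capped at the highest bin shown in the figure; that interpretation, the cap, and the function names are assumptions made for illustration.

```python
MAX_BIN = 31   # highest bin index shown in the figure (cap is an assumption)


def proximity_order(a: bytes, b: bytes) -> int:
    """Number of leading bits that two equal-length addresses share."""
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            # Leading zero bits of diff within this byte are still shared.
            return i * 8 + (8 - diff.bit_length())
    return len(a) * 8


def bin_index(node_address: bytes, chunk_address: bytes) -> int:
    """Bin a chunk falls into, seen from a node (assumed: capped proximity order)."""
    return min(proximity_order(node_address, chunk_address), MAX_BIN)


node = bytes([0b11110000, 0x00])
print(bin_index(node, bytes([0b01110000, 0x00])))   # differs in the first bit
print(bin_index(node, bytes([0b11111000, 0x00])))   # first four bits match
```

A higher bin means the chunk address is closer to the node, which is what places related chunks on nodes in the same part of the address space.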

Data Storage Architecture

There are 4 layers of data units:

  • Message: communication between peers is organized in protocols as logical units under a unique name that defines one or more streamers. It is the P2P RLPx network layer; messages are relevant for the DevP2P wire protocols, which are discussed further in the next section
  • Chunk: fixed-size data unit of storage in the distributed preimage archive, as discussed above
  • File: the smallest unit that is associated with a mime-type and not guaranteed to have integrity unless it is complete. This is the smallest unit semantic to the user, basically a file on a filesystem
  • Collection: a mapping to file system directory tree. Given trivial routing conventions, a URL can be mapped to files in a standardized way, allowing manifests to mimic site maps/routing tables. As a result, Swarm is able to act as a web server, a virtual cloud hosting service

As illustrated in the figure below, the actual storage layer of Swarm consists of two main components: the LocalStore and the NetStore. The LocalStore is composed of an in-memory fast cache (Memstore) and a persistent disk storage (DBStore). The NetStore extends the LocalStore to a distributed storage of Swarm and implements the DPA.

The FileStore is the local interface for storage and retrieval of files. When a file is handed to the FileStore for storage, it chunks the document into a Merkle hash tree and hands its root key back to the caller. This key can later be used to retrieve the document in question in part or whole.

Finally, the FileStore takes the Swarm hash and uses the NetStore to retrieve the root chunk of the document for the user.

Table 6 Storage Layer
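The layering described above can be sketched as two small classes. This is an illustration only: plain dicts stand in for the in-memory cache and the on-disk database, and fetch_from_peers is a hypothetical callable representing retrieval over the network.

```python
class LocalStore:
    """MemStore cache in front of a DBStore (a dict stands in for the disk)."""

    def __init__(self):
        self.mem_store = {}   # fast in-memory cache
        self.db_store = {}    # persistent storage stand-in

    def put(self, key, chunk):
        self.db_store[key] = chunk
        self.mem_store[key] = chunk

    def get(self, key):
        if key in self.mem_store:
            return self.mem_store[key]
        chunk = self.db_store.get(key)
        if chunk is not None:
            self.mem_store[key] = chunk   # warm the cache on a disk hit
        return chunk


class NetStore:
    """Extends the LocalStore: falls back to the network on a local miss."""

    def __init__(self, local_store, fetch_from_peers):
        self.local_store = local_store
        self.fetch_from_peers = fetch_from_peers   # hypothetical network call

    def get(self, key):
        chunk = self.local_store.get(key)
        if chunk is None:
            chunk = self.fetch_from_peers(key)
            if chunk is not None:
                self.local_store.put(key, chunk)   # keep a local copy
        return chunk


remote_chunks = {"abc": b"chunk data"}
network_requests = []


def fetch(key):
    network_requests.append(key)
    return remote_chunks.get(key)


store = NetStore(LocalStore(), fetch)
store.get("abc")
store.get("abc")
print(network_requests)   # the second read is served locally
```

Only the first read of a chunk touches the network; later reads are answered from the LocalStore.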

Network Layer

Swarm relies on the Ethereum P2P network, which comprises three different protocols:

  • RLPx (Recursive Length Prefix) for node discovery and secure data transmission
  • DevP2P for node session establishment and message exchange
  • Ethereum subprotocol

DevP2P is inspired by libP2P and has security properties that are beneficial to Swarm. After discovering each other through RLPx, Swarm nodes establish TCP connections and send HELLO messages, including NodeId, listening port, and other attributes, based on DevP2P. Sessions then start to transmit data packets. Thanks to the Ethereum ecosystem, Swarm has a large number of long-term nodes, which support the robustness and stability of the Swarm system.

To tie things together, the low-level network component of Swarm provides the infrastructure of a distributed immutable store of chunks. Hence, files and collections can be represented in Swarm using chunk references. The root manifests serve as the entry point to virtually host sites on Swarm. Swarm supports domain name resolution using the Ethereum Name Service (ENS): the domain resolves to a reference to the root manifest, and URL paths map to manifest entries based on their path. When the HTTP API serves a URL, the following steps are performed (figure below):

1. Request — domain name resolution: Swarm resolves the host part to a reference to a root manifest

2. Respond — manifest traversal: recursively traverse embedded manifests along the path matching the URL path to arrive at a manifest entry

3. Resolve — serving the file: the file referenced in the manifest entry is retrieved

4. Render — displaying the page: file is rendered in the browser with headers (notably content type) taken from the metadata of manifest entry

Table 7 Swarm’s data rendering
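The four steps can be sketched end to end with in-memory stand-ins. Everything named here is hypothetical: NAME_REGISTRY plays the role of ENS, STORE plays the role of the chunk store, and manifests are plain dicts mapping a path prefix to an entry. It is not Swarm's manifest format.

```python
# Hypothetical stand-ins: a name registry (ENS role) and a reference -> content store.
NAME_REGISTRY = {"example.eth": "ref-root"}
STORE = {
    "ref-root": {   # root manifest: path prefix -> entry
        "index.html": {"ref": "ref-index", "content_type": "text/html"},
        "img/": {"ref": "ref-img-manifest", "content_type": None},
    },
    "ref-img-manifest": {   # embedded manifest for the img/ directory
        "logo.png": {"ref": "ref-logo", "content_type": "image/png"},
    },
    "ref-index": b"<h1>hello</h1>",
    "ref-logo": b"\x89PNG",
}


def serve(host, path):
    manifest = STORE[NAME_REGISTRY[host]]           # 1. domain name resolution
    remaining = path.lstrip("/")
    while True:                                     # 2. manifest traversal
        match = next(((prefix, entry) for prefix, entry in manifest.items()
                      if remaining.startswith(prefix)), None)
        if match is None:
            raise KeyError(path)
        prefix, entry = match
        remaining = remaining[len(prefix):]
        target = STORE[entry["ref"]]                # 3. retrieve what the entry points to
        if isinstance(target, dict):                # embedded manifest: keep walking
            manifest = target
            continue
        if remaining:
            raise KeyError(path)
        return entry["content_type"], target        # 4. headers come from the entry


print(serve("example.eth", "/index.html"))
print(serve("example.eth", "/img/logo.png"))
```

The content type travels with the manifest entry rather than with the file bytes, which is what lets the browser render the response correctly in step 4.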

In summary, Swarm’s key differentiator compared to existing solutions is the ability to host content on a peer-to-peer storage network at an immutable content address. Swarm is Ethereum-native in the sense that it is built on the Ethereum blockchain, integrating the DevP2P multi-protocol network, ENS, and other Ethereum DApp-compatible features. Its robust technical infrastructure provides a building block for delivering on its promise of creating a decentralized storage and communication service.

Appendix B — COMPETITORS

IPFS and Filecoin

The InterPlanetary File System (IPFS) is a protocol started by Protocol Labs to create a new way to serve information on the web. Filecoin and IPFS are complementary protocols for storing and sharing data on the distributed web. Both systems are free, open-source, and share many building blocks, including data representation formats (IPLD) and network communication protocols (libp2p). While interacting with IPFS does not require using Filecoin, all Filecoin nodes are IPFS nodes under the hood, and (with some manual configuration) can connect to and fetch IPLD-formatted data from other IPFS nodes using libp2p. However, Filecoin nodes don’t join or participate in the public IPFS distributed hash table (DHT)[9]. Filecoin did its ICO in Sep 2017, raising over $250M backed by institutional investors such as Y Combinator, Naval Ravikant, Andreessen Horowitz, Union Square Ventures, Sequoia, and Winklevoss Capital.

IPFS serves information based on what it is as opposed to where it is (i.e., its IP address). With the routing algorithms, users can choose where they get their content from, with privacy settings to define which peers/nodes they trust to receive files from. It is great for getting started with content addressing for all sorts of distributed web applications. In the majority of these cases[10]:

  • Data is provided by the user’s own nodes. Otherwise, users must rely on other peers to store data voluntarily/altruistically, or on a centralized pinning service
  • Centralized IPFS pinning services must be trusted to do their job. IPFS brings no built-in provisions to verify that data is being stored and correctly provided by the pinning service
  • Popular content is more easily accessible. Popular content (with many providers) naturally becomes faster/easier to retrieve in IPFS, which is great when there are external incentives to sync and store data in multiple nodes, and for a situation where strong social contracts can be used to ensure the content remains hosted and maintained long-term

Filecoin builds on the content addressing of IPFS to add longer-term data persistence using cryptoeconomic incentives. With Filecoin:

  • Clients make storage deals with miners to store data. The network verifies that the miners are correctly storing the data. Small payments are made regularly for the duration of the storage deal
  • Miners that do not honor the storage deal are penalized
  • Content retrieval might be offered by storage miners directly, or by specialized retrieval miners. The user requesting the data pays for this service
  • Filecoin excels at storing large amounts of data for long periods of time

Many solutions combine the two networks to get the best of both worlds: IPFS for content addressing & data discovery, and Filecoin for longer-term persistence. To achieve this, services like Powergate back up data on the Filecoin network while also ensuring content is discoverable in the IPFS Public DHT. Data is constantly available and can be retrieved quickly, while also making sure that it is safely and verifiably backed up on the Filecoin network over time.

Sia

Sia aims to leverage underutilized hard drive capacity around the world to create a data storage marketplace that is more efficient and cheaper than current solutions. Siacoin (SC) is the native utility token of Sia. New Siacoin is introduced as mining rewards through the Sia blockchain’s proof-of-work mining algorithm. The current market cap of Siacoin is $2B USD (as of May 9th, 2021). According to its whitepaper, the long-term goal of Sia is to compete with existing storage solutions. It sees itself as being in direct competition with major cloud storage providers such as Amazon, Google, and Microsoft. Because of its decentralized nature, Sia is able to offer competitive storage rates. Sia is an early project that started back in 2013 at HackMIT and officially launched in 2015.

The Sia software divides files into 30 segments before uploading, each targeted for distribution to hosts across the world. This distribution assures that no one host represents a single point of failure and reinforces overall network uptime and redundancy.

File segments are created using a technology called Reed-Solomon erasure coding, commonly used in CDs and DVDs. Erasure coding allows Sia to divide files in a redundant manner, where any 10 of 30 segments can fully recover a user’s files. This means that if 20 out of 30 hosts go offline, a Sia user is still able to download their files. Before leaving a renter’s computer, each file segment is encrypted. This ensures that hosts only store encrypted segments of user data. Sia uses the Threefish[11] algorithm, an open-source secure encryption standard.
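The effect of the 10-of-30 scheme on availability can be worked through numerically. The sketch below is a back-of-the-envelope model, not part of Sia: it assumes hosts fail independently and that each is online with the same probability, and the 90% uptime figure is hypothetical.

```python
from math import comb


def availability(total=30, needed=10, host_uptime=0.9):
    """Probability that at least `needed` of `total` hosts are online.

    Assumes independent hosts with identical uptime (a simplification).
    """
    return sum(
        comb(total, online) * host_uptime ** online * (1 - host_uptime) ** (total - online)
        for online in range(needed, total + 1)
    )


redundancy_factor = 30 / 10   # three times the original file size is stored
print(redundancy_factor)
print(availability(host_uptime=0.9))    # file stays recoverable almost surely
print(availability(needed=30, host_uptime=0.9))   # versus needing all 30 hosts
```

The price of tolerating 20 lost hosts is storing three times the original data, which the comparison against the "all 30 hosts required" case makes visible.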

Using the Sia blockchain, renters form file contracts with hosts. These contracts set pricing, uptime commitments, and other aspects of the relationship between the renters and the hosts. File contracts are a type of smart contract. They allow us to create cryptographic service level agreements (SLAs) that are stored on the Sia blockchain. Renters use Siacoin to buy storage capacity from hosts, while hosts deposit Siacoin into each file contract as collateral.

Storj Labs

Another decentralized storage project built on the Ethereum network, Storj (pronounced “storage”) is an open-source cloud storage platform. It uses a decentralized network of nodes to host user data. The platform also secures hosted data using advanced encryption. Storj was first introduced to the world as a concept in a white paper published in December 2014. It was to be a decentralized peer-to-peer encrypted cloud storage platform. Two years later, an updated white paper was published, describing a decentralized network that connects users who need cloud storage space with those who have hard drive space to sell. The platform was launched in late 2018. The current market cap of Storj is $1B USD[12].

Storj Labs Inc. uses its Tardigrade software installed on node computers to create and secure user data. The system is also peer-to-peer encrypted. Each node only receives a random fragment of a whole file with decryption keys split among each node and the host, making it almost impossible to hack.

Node operators get rewarded for hosting data as well as confirming the safety and retention of the hosted files randomly in a process known in the crypto world as mining (PoW). The Storj token is used for this purpose. Individuals or organizations who want to store their data on the network provide the Storj tokens paid to nodes.

Arweave

Arweave is a decentralized storage network that seeks to offer a platform for the indefinite storage of data. Describing itself as “a collectively owned hard drive that never forgets,” the network primarily hosts “the permaweb”, a permanent, decentralized web with several community-driven applications and platforms. The Arweave network uses a native cryptocurrency, AR, to pay “miners” to indefinitely store the network’s information. The current market cap of AR is $1.6B USD[13].

According to its yellow paper, Arweave seeks to ensure the “collective ability to store and share information between individuals and across time to new generations.” In order to accomplish this goal, its flagship permaweb is built on top of Arweave’s “blockweave,” a variation of blockchain technology in which each block is linked to both the one immediately prior and also a random earlier one. Arweave says this incentivizes miners to store more data because they need to be able to access random previous blocks to add new ones and receive rewards.
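The incentive created by the second link can be illustrated with a toy weave. This sketch is not Arweave's protocol: the rule for picking the earlier block (previous block hash modulo the current height), the block fields, and the function names are all assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block):
    """Hash of a block's JSON form (illustrative, not Arweave's format)."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def recall_height(prev_hash, height):
    """Assumed selection rule: the previous hash picks a pseudo-random earlier block."""
    return int(prev_hash, 16) % height


def mine(weave, data, stored_heights):
    """Append a block only if the miner holds the randomly selected earlier block."""
    prev_hash = block_hash(weave[-1])
    recall = recall_height(prev_hash, len(weave))
    if recall not in stored_heights:
        return None   # miner cannot prove access to the earlier block
    block = {
        "height": len(weave),
        "prev": prev_hash,                     # link to the block immediately prior
        "recall": block_hash(weave[recall]),   # link to a random earlier block
        "data": data,
    }
    weave.append(block)
    return block


weave = [{"height": 0, "prev": None, "recall": None, "data": "genesis"}]
full_archive_miner = mine(weave, "first entry", stored_heights={0})
empty_miner = mine(weave, "second entry", stored_heights=set())
print(full_archive_miner is not None, empty_miner is None)
```

A miner that keeps more of the history is eligible for more blocks, since it cannot predict which earlier block the next one will require.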

Arweave is focused on building a sustainable ecosystem around the network. In June 2020, it unveiled “profit-sharing tokens,” which allow developers to receive dividends when network transaction fees are generated from their application, and it hosts incubators to support the building of permaweb-based apps. The project also works with startups through its “Boost” program, offering free storage and access to the Arweave team and industry investors.

In March 2020, Arweave announced that it had received $8.3 million in funding from Andreessen Horowitz, Union Square Ventures, and Coinbase Ventures. This followed an earlier November 2019 investment also from Andreessen Horowitz and Union Square Ventures, as well as Multicoin Capital.

Appendix C — REFERENCES

[1] https://www.forbes.com/sites/greatspeculations/2019/02/28/how-much-is-amazon-web-services-worth-on-a-standalone-basis/?sh=46c4c2b9bbb7

[2] https://twitter.com/ethswarm/status/1388106106694742017; https://beenodes.live/

[3] https://github.com/ethersphere/swap-swear-and-swindle

[4] https://swarm-guide.readthedocs.io/en/latest/incentivization.html#spam-protection-postage-stamps

[5] https://medium.com/ethereum-swarm/swarm-secures-funds-for-mainnet-release-2992d453fa09

[6] https://blog.ethereum.org/2014/08/18/building-decentralized-web/

[7] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9031420

[8] Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. Wikipedia: https://en.wikipedia.org/wiki/Kademlia

[9] https://medium.com/bitfwd/what-is-decentralised-storage-ipfs-filecoin-sia-storj-swarm-5509e476995f

[10] https://docs.filecoin.io/about-filecoin/ipfs-and-filecoin/#using-ipfs-and-filecoin

[11] https://en.wikipedia.org/wiki/threefish

[12] https://coinmarketcap.com/currencies/storj/

[13] https://coinmarketcap.com/currencies/arweave/
