Comments on the Digital Asset Anti-Money Laundering Act of 2022

The Honorable Elizabeth Warren
United States Senate
309 Hart Senate Office Building
Washington, DC 20510

December 22nd, 2022

Dear Senator Warren:

Senate Bill “Digital Asset Anti-Money Laundering Act of 2022” (DAAML) proposes regulating various agents in the digital asset ecosystem with the goal of preventing the use of distributed ledger (“blockchain”) technologies in money laundering, financing terrorism, and illegal drug trafficking [1]. Former CIA Director Michael Morell described blockchain analysis as a “highly effective crime fighting and intelligence gathering tool” and the Bitcoin ledger as an “underutilized forensic tool” (page 3, Morell et al.) [2]. The report cited a currently serving official at the CFTC (Commodity Futures Trading Commission) who stated that it “is easier for law enforcement to trace illicit activity using Bitcoin than it is to trace cross-border illegal activity using traditional banking transactions, and far easier than cash transactions” (page 5, Morell et al.) [2].

Bitcoin is an example of a distributed ledger technology that was designed and is primarily used for transferring value. However, numerous other distributed ledgers, most notably the Ethereum blockchain, have their origins in decentralized computing. My analysis of the Ethereum blockchain demonstrated that 49.9% of all transactions entail transfer of funds*. The Ethereum blockchain, like many other distributed ledgers, is used for manifold purposes beyond simply transferring value, including voting, decentralized autonomous organizations, litigation, intellectual property attribution, and proof of data authenticity, to name a few applications [3, 4, 5, 6].

Distributed ledger jargon suffers from misnomers that can easily confuse non-technical audiences who approach the subject from a financial perspective while ignoring blockchain’s numerous non-financial applications. For example, Ethereum applications have historically been referred to as “smart contracts”, although they generally have nothing to do with contracts in a legal sense. A “smart contract” is a misnomer for a software program that is stored and executed on a distributed computer known as “Ethereum”. Another example of jargon that is often misinterpreted by non-technical audiences is the term “transaction”.

Most non-technical readers interpret the word “transaction” in a financial sense, similar to a credit card transaction or a commercial interaction in which money is exchanged for a good or service. However, a “transaction” in the context of distributed ledger technology should be interpreted in the broader and more accurate technical sense of a state change. For instance, deleting a record, uploading a cryptographic hash, saving a string such as “Hello, World!”, or updating a database record on the Ethereum blockchain all constitute “transactions”. That is, a “transaction” results in a state change on the Ethereum computer and may or may not entail transfer of value from a sender to a recipient.
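As a rough illustration of this distinction (not the code used for the analysis cited in this letter), the short Python sketch below separates transactions that move ether from those that only change state. The sample records are invented; the field names `value` and `input` mirror those reported by Ethereum clients, and the test looks only at the transaction's top-level `value` field, which is an assumption about how “transfer of value” is measured.

```python
# Illustrative sketch only: the sample transactions below are invented,
# not taken from the Ethereum ledger.

def transfers_value(tx: dict) -> bool:
    """Return True if the transaction moves a non-zero amount of ether (in wei)."""
    return tx["value"] > 0

sample_transactions = [
    # A plain payment: non-zero value, no call data.
    {"value": 10**18, "input": "0x"},
    # A state change only: zero value; the call data is "Hello, World!" in hex.
    {"value": 0, "input": "0x48656c6c6f2c20576f726c6421"},
    # Another state change that moves no ether (arbitrary placeholder data).
    {"value": 0, "input": "0xdeadbeef"},
]

value_count = sum(transfers_value(tx) for tx in sample_transactions)
print(f"{value_count} of {len(sample_transactions)} sample transactions transfer value")
```

All three records are “transactions” in the technical sense, yet only the first resembles a payment.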

Section 3 of DAAML (“Digital Asset Rulemakings”) states: “The Financial Crimes Enforcement Network shall promulgate a rule classifying custodial and unhosted wallet providers, cryptocurrency miners, validators, or other nodes who may act to validate or secure third-party transactions, independent network participants, including MEV searchers, and other validators with control over network protocols as money service businesses.” Preventing money laundering and countering terrorism are obviously worthy goals. However, the measures proposed toward these noble goals are, in my view, misguided.

In proof-of-stake (PoS) blockchain networks such as Ethereum, validators are agents that run peer-to-peer software and use their own “staked” funds to cryptographically validate transactions. To rephrase this generally, validators run peer-to-peer client software that maintains the integrity of the network and keeps it online. Analogously, proof-of-work (PoW) blockchain networks such as Bitcoin (and previously, Ethereum, before a network upgrade that occurred on 15 September 2022 that changed Ethereum from a PoW to a PoS system) rely on miners to incorporate transactions into a distributed ledger.

Legislation that fails to distinguish between financial and non-financial uses of blockchain networks threatens not only the entire blockchain industry but also the very technologies that were developed to promote transparent, democratic, and censorship-resistant computing. Anyone with a modern computer and an internet connection can download an entire distributed ledger and then interrogate and interact with it using open-source client software. Anyone who does so can fall under the definitions of “independent network participant” and “validator”. Classifying these individuals as “money service businesses”, along with the regulatory and reporting burden this entails, is unreasonably onerous, in my opinion.

Beyond the stated goals of preventing the financing of terrorism, money laundering, and illicit drug trafficking, legislation that regulates the digital asset ecosystem should also protect consumers from abuse by financial institutions. FTX is neither the first nor the last financial institution, crypto-based or otherwise, to collapse, and these corporate failures should highlight a key value proposition of digital assets: the fact that they can be managed by individuals without having to place trust in third parties that may or may not deserve their customers’ trust. The classification system put forth in DAAML would largely target innocent individuals while failing to focus regulation where it ought to be focused: on corporations that lost or otherwise gambled away funds of customers who relied on these companies to safeguard their digital assets.

Classifying miners, validators, and independent network participants as “money service businesses” would be analogous to classifying individuals who run file-sharing clients (for example, BitTorrent) as cloud hosting services. Clearly, a college student running a peer-to-peer node from their dorm is incomparable to Google Cloud or Amazon Web Services. DAAML would severely discourage and unreasonably burden anyone wishing to run free, open-source software with a multitude of non-financial uses, whether or not these parties are involved in verifying transactions that entail transfer of value. DAAML fails to recognize the fact that roughly half of the transactions on the Ethereum blockchain involve no transfer of value whatsoever. The bill lacks an accurate appreciation of the term “transaction” in the technical sense in which it is used in blockchain discourse and is biased by interpretation of the word in a classical financial sense.

Every single transaction in the Ethereum blockchain can be scrutinized with a few lines of code. The tools I used to conduct my analyses are based on open-source utilities such as the Go-Ethereum client and the Python Web3 library [7, 8]. Rich and detailed information about the Ethereum blockchain can be obtained by anyone without the need for proprietary APIs. I encourage lawmakers to conduct their own analyses of blockchain ledgers so that emerging legislation can more effectively protect consumers and counter the financing of terrorism, illegal drug trafficking, and money laundering – without destroying the democratic, decentralized foundations of these technologies.

In particular, I urge you to develop a more nuanced definition of “money service business” that does not target the miners, validators, and independent network participants who serve as the foundation of blockchain networks and the keepers of their decentralized integrity. I hope that the arguments made above demonstrate that classifying these participants as money service businesses would be a mistake.

Thank you for your consideration.

Respectfully yours,

Omar Metwally, M.D.

* My initial analysis of all transactions from the past 10 days yielded a figure of 43%, and this is the number I cited in my original letter to Senator Warren. In a follow-up study, I analyzed all transactions from every 100th block on the Ethereum blockchain beginning with block 16237072 and ending with block 1388368 and calculated that 49.9% of Ethereum transactions entailed value transfer. The ratio of value-containing transactions varies widely from block to block. Access to greater computing resources would enable a more detailed study, and I invite anyone interested in this research question to conduct their own analysis.
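For readers who wish to repeat a study of this kind, the sampling approach described in the footnote can be sketched in Python roughly as follows. This is a minimal sketch and not the original analysis code: `fetch_transactions` is a hypothetical function supplied by the reader, and `fake_fetch` is an invented stand-in so that the example runs without a network connection. With the Python Web3 library [8], such a function could presumably wrap `w3.eth.get_block(number, full_transactions=True)` against one's own node.

```python
from typing import Callable, Iterable

def sampled_block_numbers(newest: int, oldest: int, step: int = 100) -> Iterable[int]:
    """Every `step`-th block number, walking backwards from newest to oldest."""
    return range(newest, oldest - 1, -step)

def value_transfer_fraction(
    block_numbers: Iterable[int],
    fetch_transactions: Callable[[int], list],
) -> float:
    """Fraction of transactions in the sampled blocks whose `value` is non-zero."""
    total = 0
    with_value = 0
    for number in block_numbers:
        for tx in fetch_transactions(number):
            total += 1
            if tx["value"] > 0:
                with_value += 1
    return with_value / total if total else 0.0

# Hypothetical stand-in for a node query; the data it returns is invented.
def fake_fetch(number: int) -> list:
    return [{"value": 0}, {"value": 5}] if number % 200 == 0 else [{"value": 0}]

blocks = sampled_block_numbers(newest=1000, oldest=500, step=100)
fraction = value_transfer_fraction(blocks, fake_fetch)
print(f"{fraction:.1%} of sampled transactions transfer value")
```

Because the ratio varies widely from block to block, a smaller `step` (more sampled blocks) should give a more stable estimate at the cost of more computing time.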

References

1. “Digital Asset Anti-Money Laundering Act of 2022”. https://www.warren.senate.gov/imo/media/doc/DAAML%20Act%20of%202022.pdf. Accessed 21 December 2022.

2. “An Analysis of Bitcoin’s Use in Illicit Finance” by Michael Morell, Josh Kirshner and Thomas Schoenberger. 6 April 2021. https://cryptoforinnovation.org/resources/Analysis_of_Bitcoin_in_Illicit_Finance.pdf. Accessed 21 December 2022.

3. “What in the Ethereum application ecosystem excites me” by Vitalik Buterin. 5 December 2022. https://vitalik.ca/general/2022/12/05/excited.html. Accessed 21 December 2022.

4. “How cryptography and peer-to-peer networks contribute value to society” by Omar Metwally. 13 March 2022. https://omarmetwally.blog/2022/03/13/how-cryptography-and-peer-to-peer-networks-contribute-value-to-society/. Accessed 21 December 2022.

5. “Great Explorers” by Omar Metwally. 16 September 2018. https://omarmetwally.blog/2018/09/16/great-explorers/. Accessed 21 December 2022.

6. Maestro Ethereum application by Akram Alsamarae and Omar Metwally. National Science Foundation Grant 1937914. https://maestro.analog.earth

7. https://github.com/ethereum/go-ethereum

8. https://github.com/ethereum/web3.py

How cryptography and peer-to-peer networks contribute value to society

By: Omar Metwally, M.D.

3/13/2022

Objective:

To illustrate the utility of cryptography and peer-to-peer networking in protecting the authenticity, integrity, and availability of information.

Image: snowflake macro photograph (https://en.wikipedia.org/wiki/Snowflake#/media/File:Snowflake_macro_photography_1.jpg)

1. Information is the useful synthesis of data.

Our email inboxes, phones, and hard drives are constantly filling up with data; however, collecting, organizing, and archiving the useful nuggets of information in an ocean of junk requires time, money, and energy. The number of useful emails in my inboxes is a small fraction of the total number of emails, which are mostly spam. I don’t pay for extra storage out of principle. Why fund a company whose spam filters are more likely to block important emails than spam? Why perpetuate the problem?

The same goes for the high-resolution photos that take up so much storage on my phone and hard disk: most of these photographs do not deserve the 2+ MB they each occupy on my phone and PC. I’ll commonly snap a photo of a beautiful landscape, a critter I encounter on a walk, or something I need to remember for a short period of time (for example, where I parked). Backing up every photo and video on my phone seems wasteful considering that, as with my email inbox, only a small proportion are media that I actually want to preserve. The alternative, however, would be to manually go through each of my inboxes and every photo I take on my phone and make a conscious decision whether to keep or delete each file. This latter strategy often proves far too time-intensive to pursue on a consistent basis.

2. Data that exists in only one location is as good as gone.

I once asked a colleague how he backs up his digital information. “I’ve never needed to back up my data,” he answered. This is a fallacy. Every possible failure of a digital system will eventually and inevitably occur. Hard disks fail all the time. People accidentally delete and lose files. Important bits of information drown in oceans of spam and junk, to the extent that locating them becomes practically impossible. Networked systems get hacked. People lose or upgrade their phones and change platforms, only to realize years later that they never backed up their old Android or iPhone which is now resting in a landfill.

Preserving information in a way that facilitates future retrieval requires:

– a consistent schema for organizing files and directories

– multiple physical (e.g. HDDs and SSDs) and cloud-based storage systems

– a consistent version control schema

– consistency in backing up information to each of these media

In other words, if you really cherish your data, you need to be organized, anticipate what can (and inevitably will) go wrong, and back up consistently. If it’s important information, chances are you’ll also want to encrypt your disks in a way that prevents unauthorized parties from accessing the data, without accidentally losing access to your own data.

3. Cryptography is arguably one of the most useful and powerful technologies in modern-day computing.

Modern cryptography is the basis for digital tools that protect the authenticity and integrity of information. While information ends up in the wrong hands all the time, encryption ensures that only the intended recipient can “unlock” the information. To lay people, “encryption” may conjure messaging apps designed to protect one’s privacy. However, another compelling use case of cryptography, which may be unknown to lay computer users, is to mathematically prove the authenticity of digital information. Algorithms such as SHA256 [https://csrc.nist.gov/glossary/term/SHA_256] can generate a mathematically unique string of numbers and letters, which can serve as a “fingerprint” for a file’s authenticity. Altering even a single letter in a document changes this cryptographic fingerprint.
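To see this fingerprinting behavior first-hand, here is a minimal sketch using Python’s standard `hashlib` module. The helper name `fingerprint` and the two sample strings are my own choices for illustration.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Return the SHA-256 digest of `text` as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

original = "Hello, World!"
altered = "Hello, World."  # a single character changed

print(fingerprint(original))
print(fingerprint(altered))
print("Same fingerprint?", fingerprint(original) == fingerprint(altered))
```

The two printed digests bear no visible resemblance to each other, even though the inputs differ by only one character.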

Just as no two individuals have the same fingerprint, non-identical files yield distinct cryptographic hashes. For instance, an attorney who needs to ensure the authenticity of a collection of evidence can use a cryptographic hashing algorithm such as SHA256 to prove beyond a doubt that the data do indeed represent what the attorney claims they do. However, it’s important to note that these hashing algorithms do not preserve the actual data to which they refer. It is still up to the attorney to back up the evidence in a secure and redundant manner. Furthermore, the attorney must ensure that each backup is identical. Although a small discrepancy may or may not be consequential in court (for instance, accidentally adding a space, period, or comma may or may not alter the interpreted meaning of a document), the cryptographic hash will be altered, negating the utility of the hashing algorithm.

4. Distributing and decentralizing information is a key value proposition of blockchain networks

Encryption and hashing preceded cryptocurrencies. Hash functions, which are defined by the National Institute of Standards and Technology, are generally free to use and are accessible via command line on any computer. Arguably the biggest value proposition of blockchain networks, on a technical level, is their capacity to add verifiable and tamper-proof timestamps to cryptographic hashes, by propagating a verifiable and identical chronological database across numerous peers around the world. Being able to reliably exchange information with thousands of computers across the world, spanning many different geographic areas, yields redundancy that would be implausible to replicate by entrusting any one party to create thousands of backups, spread them around the world, ensure that they can be accessed reliably, and also ensure the integrity of the original information. In reality, governments restrict access to online content all the time. People in affected locations can use tools such as VPNs to try to circumvent these limitations, but as long as a critical number of nodes is online, the information will not be lost, even if it is inaccessible from a certain geographic region due to inability to run a p2p client.

Cryptocurrencies create financial incentives for people to volunteer hard disk space, bandwidth, their time, skills, computing resources, and energy to contribute to a peer-to-peer network. Rather than relying on one party to ensure the integrity, authenticity, and availability of data (which is typically hosted in a relatively small number of geographic locations), blockchains spread that responsibility across many peers; they are essentially distributed databases (also known as “distributed ledgers” when used in the context of exchanging digital value).

5. Ensuring information availability is another value proposition of blockchain networks

I have been experimenting with IPFS (“InterPlanetary File System” [https://ipfs.io/]), a peer-to-peer file-sharing network, since 2017. Each byte stored directly on a blockchain network is relatively expensive. While all blockchains are peer-to-peer networks, not all peer-to-peer networks are blockchains. IPFS, an example of a peer-to-peer network that is not a blockchain, allows users to easily upload directories and files to the network, where they are relayed from node to node. IPFS itself is free to use; that is, there is no subscription fee to cover hosting costs because volunteers around the world share in hosting the data. However, this utopian dream of “share everything, preserve everything” ignores the reality of the cost of hosting data. Bandwidth, disk space, processing power, and electricity cost money. Data hosted on IPFS can be “pinned” using a 3rd-party service, but this crosses the line of decentralization and places trust in a 3rd-party service to ensure the persistence of these data. Furthermore, it’s unclear to me why a 3rd-party service would volunteer their resources freely without charging a hosting fee.

Filecoin is a cryptocurrency developed by the creators of IPFS (Protocol Labs) which aims to solve this missing economic incentive. The Filecoin protocol aims to incentivize miners (people with a lot of computing power and storage capacity) to host others’ data by rewarding them with the Filecoin cryptocurrency in exchange for running software that can mathematically prove that the hosted data (1) exist on their hard drive(s), and (2) can be retrieved by the party that is paying Filecoin in exchange for their data to be hosted.

I downloaded the Filecoin client (“Lotus”) and spent several days running IPFS and Lotus in parallel in order to see if hosting a 113 MB file on Filecoin was a better alternative to using traditional cloud servers, and also to learn about the economics of the Filecoin ecosystem. I provide here my impressions of this limited experience without a recommendation for or against any cryptocurrency.

It took me a few hours to sync the Filecoin mainnet to completion. I had to download a snapshot of the chain in order to sync, and I could not locate a SHA256 checksum of the snapshot used to sync. I was unable to sync by connecting to peers directly. Relying on snapshots hosted on a centralized server without published checksums is never best practice, because without a checksum there is no way to verify the authenticity or integrity of what one is downloading.
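Had a checksum been published, verifying the download would have taken only a few lines. The sketch below shows the general idea in Python; the file name and contents are throwaway stand-ins for a real snapshot, and the “published” checksum is computed locally purely so that the example runs on its own.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large snapshots need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare a file's SHA-256 digest with the checksum its distributor published."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())

# Demonstration with a temporary file standing in for a downloaded snapshot.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "snapshot.bin"  # hypothetical file name
    snapshot.write_bytes(b"example snapshot contents")
    published = hashlib.sha256(b"example snapshot contents").hexdigest()
    intact = matches_published_checksum(snapshot, published)
    tampered = matches_published_checksum(snapshot, "0" * 64)

print("Matches published checksum:", intact)
print("Matches a wrong checksum:", tampered)
```

In practice the published checksum should come from a channel that is independent of the server hosting the snapshot; otherwise anyone who can alter the file can alter the checksum too.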

The Slack channels used by the Filecoin community are active, and I received timely answers to my questions from knowledgeable contributors. Once the Filecoin chain was synced, I proceeded to upload a 113 MB file using its IPFS hash (that is, the file was already uploaded to IPFS, and I used the IPFS hash to point to the data). The process of uploading data generally entails (1) identifying storage providers (miners) who are willing and able to host one’s data; (2) uploading the data to the storage providers; and (3) paying a transaction fee to upload the data. These transactions are referred to as “deals” and can range from 180 to 540 days in duration. Miners can specify parameters such as the minimum and maximum file size they are willing to host, duration of hosting, and their cost per Gigabyte per time period (in the case of Filecoin, per 30-second epoch). Retrieving data involves a separate set of processes, but I haven’t yet made it that far.
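To get a feel for per-epoch pricing, the following back-of-the-envelope Python sketch converts a deal’s duration into 30-second epochs and multiplies through. It is a simplification under stated assumptions: the asking price is hypothetical, 113 MB is treated as 0.113 GB, and gas fees and any minimum deal sizes are ignored.

```python
from decimal import Decimal

EPOCH_SECONDS = 30  # epoch length mentioned above

def epochs_in_days(days: int) -> int:
    """Number of 30-second epochs in the given number of days."""
    return days * 24 * 60 * 60 // EPOCH_SECONDS

def deal_cost(price_per_gb_per_epoch: Decimal, size_gb: Decimal, days: int) -> Decimal:
    """Simplified storage cost: price x size x number of epochs (gas excluded)."""
    return price_per_gb_per_epoch * size_gb * epochs_in_days(days)

print(epochs_in_days(180))  # shortest deal duration mentioned above
print(epochs_in_days(540))  # longest deal duration mentioned above

# Hypothetical asking price, chosen only to exercise the function.
example_price = Decimal("0.0000001")  # FIL per GB per epoch (assumed)
print(deal_cost(example_price, Decimal("0.113"), 180), "FIL")
```

A 180-day deal spans 518,400 epochs, so even a tiny per-epoch price is multiplied several hundred thousand times over the life of a deal.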

In Filecoin, miners host others’ data, which may or may not be encrypted. This is a potential legal gray area because miners generally don’t know what they’re hosting, and miners are often located in jurisdictions separate from the party seeking hosting services. Deals can be arranged on a Slack channel or third-party reputation marketplaces, but rarely does one know whom exactly they’re dealing with. What happens if a party is uploading content that is illegal in their jurisdiction? Or perhaps legal in their jurisdiction but forbidden in the miner’s jurisdiction?

The process of trying to host data on Filecoin is far more complex than using traditional cloud servers. The average person is unlikely to succeed without a strong commitment to the steep learning curve involved in using these command-line tools. Some of the complexities can theoretically be simplified using third-party services, but this can potentially negate the advantages of using an incentivized p2p network in the first place.

The Filecoin protocol incentivizes miners to contribute their computing resources (and time) to host others’ data by rewarding them for reliably hosting the data and financially punishing them, when they fail to do so, by deducting penalties from the collateral they have to put up. Due to the relatively early stage of development of these tools, Filecoin documentation recommends making multiple deals with up to 10 different miners to ensure the availability of one’s data, in case one or more miners do not make good on their deal.

On my first attempt to upload a 113 MB file, the “deal” failed for unclear reasons, despite my attempts to troubleshoot the Lotus client’s behavior with the help of technical support volunteers. My starting balance was one Filecoin (1 FIL). Here are some numbers central to the (failed) transaction:

Initial wallet balance: 1 FIL

Cost of hosting 113 MB file with a particular miner for 180 days: 0.01296 FIL ($0.225504, at an exchange rate of $17.4 per FIL on March 12th, 2022).

Wallet balance after the escrow funds were returned to my wallet (i.e. after the deal failed):

0.993353443699298233 FIL

Difference between initial and final wallet balance = amount of “gas” burned (network transaction fees):

0.006646556300701767 FIL

Therefore, the gas burned amounts to 51.285% of the originally proposed cost of hosting the file (0.01296 FIL). In other words, 0.006646556300701767 FIL / 0.01296 FIL = 0.5128515664121734

While the amount of burned gas may seem trivial, it equals more than half (51.285%) of the proposed hosting cost, and it was spent on a deal that failed! If the goal is to establish 10 deals with 10 different miners, then the cost of gas associated with failed deals can quickly add up.
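The arithmetic can be checked, and extended to several failed deals, with a few lines of Python. The figures are the ones reported above; the assumption that every failed deal would burn the same amount of gas as this one is mine and is only meant to give a sense of scale.

```python
from decimal import Decimal

proposed_cost = Decimal("0.01296")            # FIL, quoted hosting cost for 180 days
gas_burned = Decimal("0.006646556300701767")  # FIL, fees lost on the failed deal
usd_per_fil = Decimal("17.4")                 # exchange rate quoted for March 12th, 2022

gas_share = gas_burned / proposed_cost
print(f"Gas as a share of the proposed cost: {gas_share:.3%}")
print(f"Gas burned in USD: {gas_burned * usd_per_fil:.4f}")

def gas_lost_to_failures(failed_deals: int, gas_per_failure: Decimal = gas_burned) -> Decimal:
    """Total gas burned if each failed deal costs the same as this one (an assumption)."""
    return failed_deals * gas_per_failure

print(gas_lost_to_failures(10), "FIL if ten deals failed in the same way")
```

Using `Decimal` rather than floating-point numbers keeps all 18 decimal places of the FIL amounts exact.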

6. Mathematical proof of data availability may or may not be necessary

There are certainly cases in which it’s necessary to prove mathematically not just the integrity and authenticity of data (for example, using hashing functions such as SHA256), but also the availability of the data. Filecoin aims to mathematically prove both the existence and availability of data hosted on a peer-to-peer network while incentivizing miners to uphold deals with parties who need data hosted. However, there are also many instances where a SHA256 checksum uploaded to a blockchain with an immutable timestamp is more than sufficient. In this latter case, the responsibility of organizing, archiving, and maintaining identical copies of these data falls upon the party willing to pay for the weight of this proof. As mentioned above, there are instances where entrusting miners to store and deliver content may be undesirable for legal reasons, privacy, or simply the need to trust that at least one miner with whom one conducts a deal will uphold their end of the deal.

In conclusion, cryptography and peer-to-peer networking are powerful technologies that can help protect the integrity of information and ensure its persistence. Various blockchain networks use financial incentives in different ways to provide a variety of value propositions to network participants. Clearly understanding one’s goals as they relate to information preservation/exchange, and clearly understanding each network’s value proposition, is key to making good investments of one’s time and resources.