Reflections on the roles of human and artificial intelligence in scientific research

Title: Reflections on the roles of human and artificial intelligence in scientific research
Author: Omar Nabil Metwally, M.D.
Date: 20 July 2025
File: reflections_on_human_and_artificial_intelligence_20072025.txt
SHA256 checksum: b0dadc60e7594a93d567550079977d15d7550764b0edc77838a29040ddbef6e4

Objective:
To begin a discussion on the emerging roles and responsibilities of humans in the era of AI-facilitated scientific research.

Disclosure:
This content is original and no artificial intelligence was used in the course of writing. All ideas and opinions are those of the author, and the author assumes responsibility for this content on the basis of cryptographic authenticity.

Artificial Intelligence (AI) is a powerful set of tools capable of generating novel text, images, sound, and video by using a human user’s input to modulate a corpus of input data. Emerging AI systems already possess the capacity to reason, infer, and generate novel material through infinite permutations of a finite corpus of information.

Most modern AIs are functionally “black boxes” in the sense that how an input maps to an output cannot be described by a mathematical function. In other words, given an AI’s output, there is no way to deduce step by step exactly how the AI produced it. This is arguably no different from human reasoning, in the sense that a human cannot explain how each neuron in their brain produced a certain thought. Certainly, AI and human intelligence (HI) can both explain their reasoning, and AI’s ability to do so was a significant milestone in its development. However, both AI and HI are too complex to explain in terms of mathematical functions.

Scientific research traditionally has been characterized by incremental increases in knowledge. A peer-reviewed scientific publication is assumed to reference information produced by others. Scientific discourse strives to be accurate and logically sound such that each claim has a basis in the scientific literature. As I learned throughout my formal education, the “scientific method” begins with rationale: why is the scientist conducting a certain experiment?

Once the rationale for an experiment is established, a scientist can then pose a question. This can be as simple or as complicated as: “Why is the sky blue?” or “Why are plants green?” The research question is classically based on observations of the natural world. Having established rationale and a research question, a scientist must then establish a factual precedent which forms the basis of a novel hypothesis. This factual precedent is sometimes called the “background” and comprises what experiments on the subject have already been conducted and what is generally considered by a scientist’s peers to be true.

Given a body of knowledge considered to be true by one’s peers and a research question inspired by observation, a scientist can then conduct experiments, collect data in the form of results, and analyze the results to draw conclusions.

If one accepts the claims that (1) AI can generate infinite permutations of novel outputs based on a finite corpus of information, and (2) AI is capable of reasoning, inference, and deduction, then it can be argued that AI is capable of conducting novel scientific research according to the scientific method. This represents a drastic branch-point in the evolution of scientific research and raises a plethora of ethical questions for humans. For instance, if AI can synthesize far larger sets of data more extensively and much faster than humans, where does AI-generated research fit into the classical notion of peer-reviewed literature? Why do some scientists reject the notion of AI-generated research? How can a human author accurately disclose their own contribution to a work and AI’s contribution?

Most peer-reviewed scientific journals that I’ve encountered as of this writing do not consider AI-generated scientific research as legitimate scientific research in its own right. Some scientists consider AI-generated publications as “plagiarism” or “masquerading.” It is my view that the basis for this lack of acceptance of AI-generated research is purely a function of entrenched tradition and the unfounded assumption that peer-reviewed literature should or must be written word-for-word by humans.

Academic literature is characterized by qualitative and quantitative properties that are unique to an academic genre. In other words, publications in a chemistry journal follow a certain format and are written in a particular manner that distinguishes them from publications in other fields such as mathematics, linguistics, or comparative literature. Even if the content of a publication is true and accurate and presents novel information on the basis of logically sound research, the publication can easily be rejected by a certain community simply because it does not look like what the readers expect it to look like: other publications that have already been published in the field and which form the basis of what a particular research community considers to be factual precedent.

Returning to the question: does AI-generated research constitute plagiarism? Although AI can reason, infer, and deduce, it is still based on modulating a finite corpus of input information that was originally produced by humans. In this strict interpretation of how AI works, one could argue that all material produced by AI is plagiarism because it all came from human work which is almost never accurately or completely attributed. However, human intelligence is also based on knowledge produced by other humans. Any seemingly new idea that I may have necessarily originates from my personal experiences, which were influenced in some manner by what I learned from other humans. Both human-generated and AI-generated scientific research are capable of fulfilling the implicit requirement that scientific publications must reference other authors’ work to substantiate their claims by citing real, human-generated peer-reviewed articles in the course of testing hypotheses. AI is at least as capable of generating and systematically testing hypotheses as a human. So then, what is the problem with AI-generated research?

I have been using ChatGPT and Claude.ai on a daily basis for a wide variety of tasks, including optimizing written communications, learning about new topics, writing code, learning foreign languages, and even creating a work of fiction based on my personal interests. These are just a few of countless use cases. My approach to AI is to record the exact prompts that I use to produce a certain output and input the same prompts to ChatGPT and Claude to compare their outputs. This quick and simple cross-check serves as an initial screen to help me identify egregiously wrong information or information that definitely warrants further manual fact-checking.

I recently listened to a podcast produced by Südwestrundfunk (SWR) on algae and the environmental, public health, and economic burden of “harmful algal blooms” (HABs), an increasingly common phenomenon that has been described as a harbinger of the next mass extinction event based on studies of algal blooms throughout geological history [https://www.ardaudiothek.de/episode/urn:ard:publication:72ed9586246373ea/]. The subject piqued my fascination and curiosity, and I conducted an experiment in AI-generated research by creating an outline of topics that interest me about HABs, which borrowed from the podcast while also adding topics of personal interest such as mycology and bioremediation. I then serially prompted Claude and ChatGPT to produce a review article in the style of a well-known scientific journal. Claude produced a surprisingly convincing article that could easily deceive a lay person and, I presume, at first glance even scientists.

My rationale in conducting this experiment was purely to examine the capability of AI to conduct scientific research. It was never my intention to deceive anyone, and therefore I do not share here the actual paper that was produced by AI based on my prompts because it consists of a superficially very convincing article that intermingles useful facts with obvious nonsense and does not reliably substantiate every claim made. Based on the results of my experiment, I believe that AI is already very capable of conducting scientific research, and this capability is accelerating every day. I shared the paper exclusively with a few family members, two physicians and two attorneys, and a friend who is a physician-scientist with extensive experience conducting traditional scientific research and publishing his work in prominent peer-reviewed journals. A few minutes after sharing the paper, I disclosed to everyone with whom I shared it how the paper was generated. Beyond demonstrating the capacity for AI to conduct research and present it in a manner that appears almost indistinguishable from human-generated peer-reviewed articles, this experiment also made me keenly aware of the capacity for AI-generated scientific research to mislead non-scientists and even expert readers, to produce and propagate false information and conclusions, and to misrepresent the human investigator’s role in the production of a research paper. These are serious risks with the potential to harm individuals and society and must be considered carefully by regulatory bodies and responsible creators of AI tools. The current safeguards against abuse and misuse of AI in general and in scientific research in particular are arguably minimal.

Returning to the question of how AI-generated research relates to the classic notion of peer-reviewed literature, I believe it’s a matter of time until academia accepts AI-generated research as legitimate and worthy work. Dismissing AI-generated research just because it was produced by a non-human agent is a failure to grasp the tremendous capabilities of AI which are very rapidly growing. However, I believe that the advent of such powerful tools also necessitates formal guidelines on the role and responsibilities of humans who use AI to conduct research. Dismissing AI-generated research also does injustice to the work of humans who must use their own fund of knowledge and creativity to engineer prompts in a manner that effectively leverages AI’s capabilities in the process of iteratively prompting AI to achieve a particular result. From extensive personal experience, this is a form of original work that requires time, effort, and skill, and a sound knowledge base from which to produce prompts. Although this work takes a different form than scientific research in a classical sense, it is still an important scientific endeavor that must be acknowledged as such. How AI’s contribution should be disclosed, however, is a significant unanswered question worthy of ongoing discussion.

It is only a matter of time until every scientist is using AI in some capacity to conduct research. Until then, I find it reasonable to disclose when and how AI is used in the process of conducting research, and to also disclose the human researcher’s role in the process. As AI is not a person, I find it unnecessary and inappropriate to treat AI as a human author. Once AI is all-pervasive and accepted as a standard research tool, I believe that it will become increasingly superfluous to explicitly declare the role of AI in scientific research. I acknowledge that this is a bold claim with which many scientists may not agree. Regardless, an ethical human researcher who uses AI should assume ultimate responsibility for the accuracy of the work, whether each sentence was written by a human or machine. This includes the responsibility to fact-check and ensure that claims are substantiated by veritable data, that references are authentic, and that all inferences and conclusions are logically sound.

How then can the reader be sure that this reflection, which I maintain to be an original work written by me, Omar Nabil Metwally, M.D., was in fact written by me without the use of AI? This question, too, will grow increasingly moot. Whether produced by AI or by a human, I maintain that there are few truly novel ideas; “there is nothing new under the sun” goes the popular saying. Both humans and machines recycle pre-existing information to produce new information. And as AI increasingly produces new knowledge, I expect the knowledge base available to humans and machines alike to continue growing exponentially. AI does not simply consume information; it is also dynamically creating new knowledge, such that its output becomes new input and so forth.

One safeguard against plagiarism and misrepresentation, as I’ve previously proposed, is the use of a distributed ledger and cryptographic hash to associate an identity (e.g. Ethereum address) and a unique checksum (e.g. SHA256 checksum) with a block number, which is a relative time marker and proxy timestamp [https://omarmetwally.blog/2022/03/13/how-cryptography-and-peer-to-peer-networks-contribute-value-to-society/]. After many years of thinking about this problem, I still reach the conclusion that this is the best mechanism to guarantee authenticity. On its own, this method is not sufficient to ensure a document’s authenticity, because the cryptographic hash of false or plagiarized content can still be uploaded to a distributed ledger. However, such a transaction requires a person to prove control over a wallet and provides strong, nearly irrefutable evidence of a relative time point at which the hash was recorded on a distributed ledger, thus allowing humans to scrutinize the content of the document for its veracity based on the set of all knowledge that can be proven to have existed at a particular point in time. This is in contrast to the ever-flowing output streams being produced by AI.
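
The checksum half of this scheme can be illustrated with a minimal Python sketch that uses only the standard library. The function names and the sample sentences are hypothetical and chosen for illustration; recording the resulting digest on a distributed ledger is a separate step that is not shown here.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of `data` as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large documents need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical sample text: changing one character changes the digest.
original = "An example sentence.".encode("utf-8")
altered = "An example sentence!".encode("utf-8")
print(sha256_hex(original))
print(sha256_hex(original) == sha256_hex(altered))  # False
```

Because any change to the file produces a different digest, a reader who recomputes the checksum of a received copy and compares it with the recorded value can tell whether the copy matches what was originally hashed.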

In summary, in this work I present my opinion that AI has the capacity to conduct legitimate and useful scientific research. However, with great power also comes great responsibility, and human agents must take ultimate responsibility for the veracity, logical integrity, and basis in precedent of a work — regardless of whether sentences were generated by a human or a machine.

How to interact with an Ethereum contract

By: Omar Nabil Metwally, MD

24 June 2024

Objective: To interact with an Ethereum application (“smart contract”)

Background: The German Federal Intelligence Service, der Bundesnachrichtendienst (BND), announced on 5 June 2023 a digital scavenger hunt to collect “Dogs of BND” themed NFTs (“Hunde der BND”, https://www.bnd.bund.de/DE/Karriere/SozialeMedien/Gewinnspiele/blockchain-challenge/teilnahmebedingungen-blockchain-challenge-node.html).

The only hints provided toward solving the puzzle are the following 40-character hexadecimal sequence and the knowledge that it is an Ethereum address: 0x6E02ffa16171ac74dC1688480A1F703C23994f3D

Environment:

This write-up assumes working knowledge of the Ethereum client written in the Go programming language (“Go” Ethereum client, aka “geth”, https://github.com/ethereum/go-ethereum), a fully synced node, and use of the clef command line utility (https://geth.ethereum.org/docs/tools/clef/introduction) to sign transactions and data.

Rationale:

The Ethereum ecosystem continues to evolve rapidly. Note that web3 methods based on the deprecated “personal” namespace resulted in potentially breaking changes for code used to interact with Ethereum contracts. The proper way to handle account locking and unlocking in the context of broadcasting transactions is to now use clef.

Start geth and clef as follows:

./clef --keystore /path/to/your/keystore/ --ipcpath=/your/path/to/clef/ipc --signersecret /your/clef/signer/secret

./geth --syncmode snap --datadir=/your/datadir/path --signer=/your/path/to/clef/ipc console

Task #1: Query 0x6E02ffa16171ac74dC1688480A1F703C23994f3D

The above hexadecimal sequence appears to be an Ethereum address, based on the knowledge that Ethereum addresses are 40-character hexadecimal sequences preceded by 0x.

This can be verified using geth:

eth.getBalance("0x6E02ffa16171ac74dC1688480A1F703C23994f3D")

This yields a balance of 316649482296000 wei, or 0.000316649482296 Ether.
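
The format rule and the unit conversion above can be sketched in a few lines of standard-library Python. The helper names are hypothetical. The check only tests the shape of the string (0x plus 40 hexadecimal characters); it does not validate the mixed-case checksum or confirm that the address has ever been used.

```python
import re
from decimal import Decimal

# 1 Ether = 10**18 wei
WEI_PER_ETHER = Decimal(10) ** 18


def looks_like_address(text: str) -> bool:
    """True if `text` is 0x followed by exactly 40 hexadecimal characters."""
    return re.fullmatch(r"0x[0-9a-fA-F]{40}", text) is not None


def wei_to_ether(wei: int) -> Decimal:
    """Convert an integer amount of wei to Ether without floating-point error."""
    return Decimal(wei) / WEI_PER_ETHER


print(looks_like_address("0x6E02ffa16171ac74dC1688480A1F703C23994f3D"))  # True
print(wei_to_ether(316649482296000))  # 0.000316649482296
```

Decimal is used instead of float because balances in wei routinely exceed the range in which floating-point numbers represent integers exactly.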

Call this address [ORIGIN]. Looking up this address on etherscan.io (https://etherscan.io/address/0x6E02ffa16171ac74dC1688480A1F703C23994f3D) reveals that [ORIGIN] is directly associated with two transactions.

Transaction 1 on block 17393444 entails a transfer of 0.04879799 ETH from [ORIGIN] to 0x2B127A04c4DA063dB1E75BAC1b007D5C0661570a. Call this recipient [RECIPIENT 1]. The transaction hash for transaction 1 is 0x30daf5adac61330817ba48bdcc093df402b992f7112898c88d83f025d3c2dd6d.

Transaction 2 on block 17393441 entails a transfer of 0.05 ETH from 0x12CF21eE48426b0E8f9bE4704C38aAba6E9ab988 to [ORIGIN]. Call this sender [SENDER 1]. The transaction hash for transaction 2 is 0xa61b9a442c9a6836390d422c81597608a63bef0c6d71dc220793521edb188f80.

[RECIPIENT 1] created a contract at 0xb47c23d001c0c9f5c1a158a93b6df6004b6012f7 (transaction hash 0xd5b3d8b57316fecdfb958815e1c0d3079a5c5826e99c924c87cf11014b1b31d7 on block 17413916). Call this [CONTRACT 1], and call this transaction [TRANSACTION_CONTRACT_1_CREATE].

32 transactions have been sent to [CONTRACT 1] at the time of writing. The first constructed the contract, and the rest invoked a method called “updateMessage”. Each transaction can be queried using the Python web3 library.

Etherscan.io provides the contract source code and indicates that it is “verified.” It is marked with the following comments:

// Herzlichen Glückwunsch!

// Du hast diese versteckte Nachricht erfolgreich finden können!

// Damit hast du nun die Möglichkeit, dir als eine oder einer der Ersten ein

// exklusives Hunde-NFT aus unserer Collection zu sichern (nur solange der Vorrat reicht).

// Du findest die Collection unter diesem Link auf Opensea: opensea.io/collection/dogs-of-bnd

// (Bitte beachte die Teilnahme- und Datenschutzbedingungen unter bnd.de/nft).

(English translation: “Congratulations! You were able to find this hidden message! You now have the chance to be among the first to secure an exclusive dog NFT from our collection (while supplies last). You can find the collection at this link on OpenSea: opensea.io/collection/dogs-of-bnd. (Please note the terms of participation and the privacy policy at bnd.de/nft.)”)

Task #2: Verify contract source code

To verify the authenticity of the provided source code, compile it and compare the resulting bytecode with the bytecode at the corresponding address, [CONTRACT 1], on Ethereum mainnet.

The easiest way to compile the source code is using the Remix compiler at https://remix.ethereum.org. Doing so yields the following ABI and bytecode:

abi: [ { "inputs": [ { "internalType": "string", "name": "_message", "type": "string" } ], "stateMutability": "nonpayable", "type": "constructor" }, { "inputs": [ { "internalType": "string", "name": "_newMessage", "type": "string" } ], "name": "updateMessage", "outputs": [], "stateMutability": "nonpayable", "type": "function" }, { "inputs": [], "name": "message", "outputs": [ { "internalType": "string", "name": "", "type": "string" } ], "stateMutability": "view", "type": "function" } ]

bytecode: 60806040523480156200001157600080fd5b5060405162000bee38038062000bee8339818101604052810190620000379190620001e3565b80600090816200004891906200047f565b505062000566565b6000604051905090565b600080fd5b600080fd5b600080fd5b600080fd5b6000601f19601f8301169050919050565b7f4e487b7100000000000000000000000000000000000000000000000000000000600052604160045260246000fd5b620000b9826200006e565b810181811067ffffffffffffffff82111715620000db57620000da6200007f565b5b80604052505050565b6000620000f062000050565b9050620000fe8282620000ae565b919050565b600067ffffffffffffffff8211156200012157620001206200007f565b5b6200012c826200006e565b9050602081019050919050565b60005b83811015620001595780820151818401526020810190506200013c565b60008484015250505050565b60006200017c620001768462000103565b620000e4565b9050828152602081018484840111156200019b576200019a62000069565b5b620001a884828562000139565b509392505050565b600082601f830112620001c857620001c762000064565b5b8151620001da84826020860162000165565b91505092915050565b600060208284031215620001fc57620001fb6200005a565b5b600082015167ffffffffffffffff8111156200021d576200021c6200005f565b5b6200022b84828501620001b0565b91505092915050565b600081519050919050565b7f4e487b7100000000000000000000000000000000000000000000000000000000600052602260045260246000fd5b600060028204905060018216806200028757607f821691505b6020821081036200029d576200029c6200023f565b5b50919050565b60008190508160005260206000209050919050565b60006020601f8301049050919050565b600082821b905092915050565b600060088302620003077fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff82620002c8565b620003138683620002c8565b95508019841693508086168417925050509392505050565b6000819050919050565b6000819050919050565b6000620003606200035a62000354846200032b565b62000335565b6200032b565b9050919050565b6000819050919050565b6200037c836200033f565b620003946200038b8262000367565b848454620002d5565b825550505050565b600090565b620003ab6200039c565b620003b881848462000371565b505050565b5b81811015620003e057620003d4600082620003a1565b600181019050620003be565b5050
565b601f8211156200042f57620003f981620002a3565b6200040484620002b8565b8101602085101562000414578190505b6200042c6200042385620002b8565b830182620003bd565b50505b505050565b600082821c905092915050565b6000620004546000198460080262000434565b1980831691505092915050565b60006200046f838362000441565b9150826002028217905092915050565b6200048a8262000234565b67ffffffffffffffff811115620004a657620004a56200007f565b5b620004b282546200026e565b620004bf828285620003e4565b600060209050601f831160018114620004f75760008415620004e2578287015190505b620004ee858262000461565b8655506200055e565b601f1984166200050786620002a3565b60005b8281101562000531578489015182556001820191506020850194506020810190506200050a565b868310156200055157848901516200054d601f89168262000441565b8355505b6001600288020188555050505b505050505050565b61067880620005766000396000f3fe608060405234801561001057600080fd5b50600436106100365760003560e01c80631923be241461003b578063e21f37ce14610057575b600080fd5b61005560048036038101906100509190610270565b610075565b005b61005f610088565b60405161006c9190610338565b60405180910390f35b80600090816100849190610570565b5050565b6000805461009590610389565b80601f01602080910402602001604051908101604052809291908181526020018280546100c190610389565b801561010e5780601f106100e35761010080835404028352916020019161010e565b820191906000526020600020905b8154815290600101906020018083116100f157829003601f168201915b505050505081565b6000604051905090565b600080fd5b600080fd5b600080fd5b600080fd5b6000601f19601f8301169050919050565b7f4e487b7100000000000000000000000000000000000000000000000000000000600052604160045260246000fd5b61017d82610134565b810181811067ffffffffffffffff8211171561019c5761019b610145565b5b80604052505050565b60006101af610116565b90506101bb8282610174565b919050565b600067ffffffffffffffff8211156101db576101da610145565b5b6101e482610134565b9050602081019050919050565b82818337600083830152505050565b600061021361020e846101c0565b6101a5565b90508281526020810184848401111561022f5761022e61012f565b5b61023a8482856101f1565b509392505050565b600082601f8301126102575761025661012a
565b5b8135610267848260208601610200565b91505092915050565b60006020828403121561028657610285610120565b5b600082013567ffffffffffffffff8111156102a4576102a3610125565b5b6102b084828501610242565b91505092915050565b600081519050919050565b600082825260208201905092915050565b60005b838110156102f35780820151818401526020810190506102d8565b60008484015250505050565b600061030a826102b9565b61031481856102c4565b93506103248185602086016102d5565b61032d81610134565b840191505092915050565b6000602082019050818103600083015261035281846102ff565b905092915050565b7f4e487b7100000000000000000000000000000000000000000000000000000000600052602260045260246000fd5b600060028204905060018216806103a157607f821691505b6020821081036103b4576103b361035a565b5b50919050565b60008190508160005260206000209050919050565b60006020601f8301049050919050565b600082821b905092915050565b60006008830261041c7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff826103df565b61042686836103df565b95508019841693508086168417925050509392505050565b6000819050919050565b6000819050919050565b600061046d6104686104638461043e565b610448565b61043e565b9050919050565b6000819050919050565b61048783610452565b61049b61049382610474565b8484546103ec565b825550505050565b600090565b6104b06104a3565b6104bb81848461047e565b505050565b5b818110156104df576104d46000826104a8565b6001810190506104c1565b5050565b601f821115610524576104f5816103ba565b6104fe846103cf565b8101602085101561050d578190505b610521610519856103cf565b8301826104c0565b50505b505050565b600082821c905092915050565b600061054760001984600802610529565b1980831691505092915050565b60006105608383610536565b9150826002028217905092915050565b610579826102b9565b67ffffffffffffffff81111561059257610591610145565b5b61059c8254610389565b6105a78282856104e3565b600060209050601f8311600181146105da57600084156105c8578287015190505b6105d28582610554565b86555061063a565b601f1984166105e8866103ba565b60005b82811015610610578489015182556001820191506020850194506020810190506105eb565b8683101561062d5784890151610629601f891682610536565b8355505b6001600288020188555050505b5050
5050505056fea26469706673582212204326a68b08e2f4dc500c89e18af564f403f4aaab0887e352cdb0a6cabba9228e64736f6c63430008120033

Now query [TRANSACTION_CONTRACT_1_CREATE] with the following block of python code:

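from web3 import Web3, IPCProvider

# connect to the local geth node over its IPC socket
web3 = Web3(IPCProvider('/path/to/your/geth.ipc'))

# sanity check: returns the number of the latest synced block
web3.eth.block_number

# query the contract-creation transaction
tx = web3.eth.get_transaction("0xd5b3d8b57316fecdfb958815e1c0d3079a5c5826e99c924c87cf11014b1b31d7")

# input data of the creation transaction (contract bytecode plus any appended data)
tx.input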

Note that tx.input is very similar, but not equal to, the bytecode from the allegedly verified source code we compiled using Remix. The difference is an appended 96-byte sequence trailing the compiled bytecode, with each byte in the bytecode represented by a pair of hexadecimal characters. 169 of these appended 192 characters are zeros.
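
The comparison described here amounts to a prefix check followed by slicing off whatever remains. The following is a minimal sketch with hypothetical helper names; the hex strings are short placeholders, not the real bytecode. Depending on the web3 version, tx.input may be a bytes-like object rather than a string, in which case it would first need to be converted to a hex string.

```python
def strip_0x(hex_text: str) -> str:
    """Drop a leading 0x, if present, and normalize to lowercase."""
    hex_text = hex_text.lower()
    return hex_text[2:] if hex_text.startswith("0x") else hex_text


def trailing_data(tx_input_hex: str, compiled_hex: str) -> str:
    """Return the hex characters that follow the compiled bytecode in tx.input.

    Raises ValueError if tx.input does not begin with the compiled bytecode.
    """
    tx_clean = strip_0x(tx_input_hex)
    compiled_clean = strip_0x(compiled_hex)
    if not tx_clean.startswith(compiled_clean):
        raise ValueError("transaction input does not start with the compiled bytecode")
    return tx_clean[len(compiled_clean):]


# Placeholder values for illustration only
compiled = "0x6080604052"
tx_input = "0x6080604052" + "00" * 2 + "7363"

tail = trailing_data(tx_input, compiled)
print(tail)            # 00007363
print(len(tail) // 2)  # 4 (bytes, since two hex characters encode one byte)
```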

Converting the final run of non-zero bytes (7363687265696220756e73) to a UTF-8 encoded string in Python:

bytes.fromhex('7363687265696220756e73').decode('utf-8')

yields 'schreib uns' (“schreib uns” means “write [to] us” in German).

The “verified” contract source code appears to be authentic, and the appended bytes comprise the initial value of public variable “message” that was passed to the contract constructor when the contract was deployed.
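
The figures quoted above (96 bytes, 192 characters, 169 zeros) are consistent with the standard ABI encoding of a single string argument: a 32-byte offset, a 32-byte length, and the string data padded with zeros to a multiple of 32 bytes. The sketch below rests on that assumption. It rebuilds the trailing sequence from those three parts rather than copying it from the chain, and the function name is hypothetical.

```python
def decode_abi_string(encoded_hex: str) -> str:
    """Decode a single ABI-encoded string argument given as hex characters."""
    blob = bytes.fromhex(encoded_hex)
    # Word 1: byte offset at which the string's length word begins
    offset = int.from_bytes(blob[0:32], "big")
    # Word at `offset`: length of the string in bytes
    length = int.from_bytes(blob[offset:offset + 32], "big")
    start = offset + 32
    return blob[start:start + length].decode("utf-8")


# Reconstructed under the stated assumption, not read from the transaction
tail_hex = (
    "20".rjust(64, "0")                           # offset = 32 bytes
    + "0b".rjust(64, "0")                         # length = 11 bytes
    + "7363687265696220756e73".ljust(64, "0")     # data, right-padded with zeros
)

print(len(tail_hex))                # 192
print(tail_hex.count("0"))          # 169
print(decode_abi_string(tail_hex))  # schreib uns
```

Reading the length word instead of stripping zeros by eye avoids miscounting when the message itself contains bytes whose hex form includes a zero, such as the space character (0x20).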

Task #3: Interact with deployed contract

The [CONTRACT 1] source code is simple: it consists of a public variable of type “string” called “message”, a constructor that accepts a string as an argument, and a function “updateMessage” with “public” visibility that accepts a string argument.

So, let’s drop the Dogs of BND a line. To do so, save the contract abi and bytecode on your local machine. In this case, I saved them as “dogs_of_bnd_abi” and “dogs_of_bnd_bin”, respectively.

In Python shell:

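import json

abi_path = '/path/to/dogs_of_bnd_abi'
bin_path = '/path/to/dogs_of_bnd_bin'

# read the saved ABI (JSON) and bytecode; the with-blocks close the files afterwards
with open(abi_path, 'r') as abi_file:
    abi = json.load(abi_file)

# the bytecode is not needed to call an already deployed contract; it is kept for reference
with open(bin_path, 'r') as bin_file:
    bytecode = bin_file.read().strip()

# set your default Ethereum account
# note that transaction signing is handled via clef, per above
# when the command line appears to freeze, it's probably because clef is waiting for [y/N] input
web3.eth.default_account = web3.eth.accounts[0]

# bind the ABI to the deployed contract address
bndContract = web3.eth.contract(address="0xB47C23D001c0c9F5C1A158a93b6dF6004b6012f7", abi=abi)

# broadcast a transaction that calls updateMessage; returns the transaction hash
bndContract.functions.updateMessage("Information is all.").transact()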
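
To see what such a call carries, the input data of the transaction can be reconstructed by hand: a 4-byte function selector followed by the ABI-encoded string. The sketch below is an illustration under assumptions, not a replacement for the web3 call. The selector 1923be24 is read from the dispatch section of the compiled bytecode shown earlier, and attributing it to updateMessage (rather than to the getter message) is an interpretation of that bytecode. The function name is hypothetical.

```python
# Assumption: selector for updateMessage(string), as read from the compiled bytecode
UPDATE_MESSAGE_SELECTOR = bytes.fromhex("1923be24")


def encode_update_message(text: str) -> bytes:
    """Build calldata for updateMessage(text): selector + ABI-encoded string."""
    data = text.encode("utf-8")
    # Round the data length up to the next multiple of 32 bytes
    padded_length = (len(data) + 31) // 32 * 32
    return (
        UPDATE_MESSAGE_SELECTOR
        + (32).to_bytes(32, "big")         # offset to the length word
        + len(data).to_bytes(32, "big")    # string length in bytes
        + data.ljust(padded_length, b"\x00")
    )


calldata = encode_update_message("Information is all.")
print(len(calldata))  # 100
print("0x" + calldata.hex())
```

The message is 19 bytes long, so it fits in a single 32-byte word, and the calldata totals 4 + 32 + 32 + 32 = 100 bytes.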

Comments on the Digital Asset Anti-Money Laundering Act of 2022

The Honorable Elizabeth Warren
United States Senate
309 Hart Senate Office Building
Washington, DC 20510

December 22nd, 2022

Dear Senator Warren:

Senate Bill “Digital Asset Anti-Money Laundering Act of 2022” (DAAML) proposes regulating various agents in the digital asset ecosystem toward the goal of preventing the use of distributed ledger (“blockchain”) technologies in money laundering, financing terrorism, and illegal drug trafficking [1]. Former CIA Director Michael Morell described blockchain analysis as a “highly effective crime fighting and intelligence gathering tool” and the Bitcoin ledger as an “underutilized forensic tool” (page 3, Morell et al) [2]. The report cited a currently serving official at the CFTC (Commodity Futures Trading Commission) who stated that it “is easier for law enforcement to trace illicit activity using Bitcoin than it is to trace cross-border illegal activity using traditional banking transactions, and far easier than cash transactions” (page 5, Morell et al) [2].

Bitcoin is an example of a distributed ledger technology that was designed and is primarily used for transferring value. However, numerous other distributed ledgers, most notably, the Ethereum blockchain, have their origins in decentralized computing. My analysis of the Ethereum blockchain demonstrated that 49.9% of all transactions entail transfer of funds*. The Ethereum blockchain, like many other distributed ledgers, is used for manifold purposes beyond simply transferring value, including voting, decentralized autonomous organization, litigation, intellectual property attribution, and proof of data authenticity, to name a few applications [3, 4, 5, 6].

Distributed ledger jargon suffers from misnomers that can easily confuse non-technical audiences who approach the subject from a financial perspective while ignoring blockchain’s numerous non-financial applications. For example, Ethereum applications have historically been referred to as “smart contracts”, although they generally have nothing to do with contracts in a legal sense. A “smart contract” is a misnomer for a software program that is stored and executed on a distributed computer known as “Ethereum”. Another example of jargon that is often misinterpreted by non-technical audiences is the term “transaction”.

Most non-technical readers interpret the word “transaction” in a financial sense, similar to a credit card transaction or a commercial interaction in which money is exchanged for a good or service. However, a “transaction” in the context of distributed ledger technology should be interpreted in the broader and more accurate technical sense of a state change. For instance, deleting a record, uploading a cryptographic hash, saving a string such as “Hello, World!”, or updating a database record on the Ethereum blockchain all constitute “transactions”. That is, a “transaction” results in a state change on the Ethereum computer and may or may not entail transfer of value from a sender to a recipient.

Section 3 of DAAML (“Digital Asset Rulemakings”) states: “The Financial Crimes Enforcement Network shall promulgate a rule classifying custodial and unhosted wallet providers, cryptocurrency miners, validators, or other nodes who may act to validate or secure third-party transactions, independent network participants, including MEV searchers, and other validators with control over network protocols as money service businesses.” Preventing money laundering and countering terrorism are obviously worthy goals. However, the measures proposed toward these noble goals are, in my view, misguided.

In proof-of-stake (PoS) blockchain networks such as Ethereum, validators are agents that run peer-to-peer software and use their own “staked” funds to cryptographically validate transactions. To rephrase this generally, validators run peer-to-peer client software that maintains the integrity of the network and keeps it online. Analogously, proof-of-work (PoW) blockchain networks such as Bitcoin (and previously Ethereum, before the network upgrade of 15 September 2022 that changed Ethereum from a PoW to a PoS system) rely on miners to incorporate transactions into a distributed ledger.

Legislation that fails to distinguish between financial and non-financial uses of blockchain networks threatens not only the entire blockchain industry but also the very technologies that were developed to promote transparent, democratic, and censorship-resistant computing. Anyone with a modern computer and an internet connection can download an entire distributed ledger and then interrogate and interact with it using open-source client software. These agents can fall under the definition of “independent network participant” and “validator”. Classifying these entities as “money service businesses”, along with the regulatory and reporting burden this entails, is unreasonably onerous, in my opinion.

Beyond the stated goals of preventing the financing of terrorism, money laundering, and illicit drug trafficking, legislation that regulates the digital asset ecosystem should also protect consumers from abuse by financial institutions. FTX is neither the first nor the last financial institution, crypto-based or otherwise, to collapse, and these corporate failures should highlight a key value proposition of digital assets: the fact that they can be managed by individuals without having to place trust in third parties that may or may not deserve their customers’ trust. The classification system put forth in DAAML would largely target innocent individuals while failing to focus regulation where it ought to be focused: on corporations that lost or otherwise gambled away funds of customers who relied on these companies to safeguard their digital assets.

Classifying miners, validators, and independent network participants as “money service businesses” would be analogous to classifying individuals who run file-sharing clients (for example, BitTorrent) as cloud hosting services. Clearly, a college student running a peer-to-peer node from their dorm is incomparable to Google Cloud or Amazon Web Services. DAAML would severely discourage and unreasonably burden anyone wishing to run free, open-source software with a multitude of non-financial uses, whether or not these parties are involved in verifying transactions that entail transfer of value. DAAML fails to recognize the fact that most transactions on the Ethereum blockchain involve no transfer of value whatsoever. The bill lacks an accurate appreciation of the term “transaction” in the technical sense in which it is used in blockchain discourse and is biased by interpretation of the word in a classical financial sense.

Every single transaction in the Ethereum blockchain can be scrutinized with a few lines of code. The tools I used to conduct my analyses are based on open-source utilities such as the Go-Ethereum client and the Python Web3 library [7, 8]. Rich and detailed information about the Ethereum blockchain can be obtained by anyone without the need for proprietary APIs. I encourage lawmakers to conduct their own analyses of blockchain ledgers so that emerging legislation can more effectively protect consumers and counter the financing of terrorism, illegal drug trafficking, and money laundering – without destroying the democratic, decentralized foundations of these technologies.

In particular, I urge you to develop a more nuanced definition of “money service business” that does not target the miners, validators, and independent network participants who serve as the foundation of blockchain networks and as keepers of their decentralized integrity. I hope that the arguments made above demonstrate that targeting them would be a mistake.

Thank you for your consideration.

Respectfully yours,

Omar Metwally, M.D.

* My initial analysis of all transactions from the past 10 days yielded a figure of 43%, and this is the number I cited in my original letter to Senator Warren. In a follow-up study, I analyzed all transactions from every 100th block on the Ethereum blockchain beginning with block 16237072 and ending with block 1388368 and calculated that 49.9% of Ethereum transactions entailed value transfer. The ratio of value-containing transactions varies widely from block to block. Access to greater computing resources would enable a more detailed study, and I invite anyone interested in this research question to conduct their own analysis.
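
The sampling approach described in this footnote can be sketched in a few lines of Python. This is a minimal illustration rather than the original analysis code: `w3` is assumed to be a connected web3.py `Web3` instance (for example, one pointed at a local Go-Ethereum node), and the function names `value_transfer_fraction` and `sample_transactions` are hypothetical.

```python
def value_transfer_fraction(transactions):
    """Return the fraction of transactions whose `value` field is non-zero."""
    transactions = list(transactions)
    if not transactions:
        return 0.0
    with_value = sum(1 for tx in transactions if tx["value"] > 0)
    return with_value / len(transactions)


def sample_transactions(w3, first_block, last_block, step=100):
    """Yield every transaction from every `step`-th block in the given range.

    Assumes `w3` is a connected web3.py Web3 instance. Passing
    full_transactions=True asks the node for full transaction objects
    (including `value`, in wei) instead of transaction hashes only.
    """
    for number in range(first_block, last_block + 1, step):
        block = w3.eth.get_block(number, full_transactions=True)
        yield from block["transactions"]
```

With a synced node, `value_transfer_fraction(sample_transactions(w3, start, end))` would give the share of sampled transactions that moved Ether; the remainder changed state without transferring value.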

References

1. “Digital Asset Anti-Money Laundering Act of 2022”. https://www.warren.senate.gov/imo/media/doc/DAAML%20Act%20of%202022.pdf. Accessed 21 December 2022.

2. “An Analysis of Bitcoin’s Use in Illicit Finance” by Michael Morell, Josh Kirshner and Thomas Schoenberger. 6 April 2021. https://cryptoforinnovation.org/resources/Analysis_of_Bitcoin_in_Illicit_Finance.pdf. Accessed 21 December 2022.

3. “What in the Ethereum application ecosystem excites me” by Vitalik Buterin. 5 December 2022. https://vitalik.ca/general/2022/12/05/excited.html. Accessed 21 December 2022.

4. “How cryptography and peer-to-peer networks contribute value to society” by Omar Metwally. 13 March 2022. https://omarmetwally.blog/2022/03/13/how-cryptography-and-peer-to-peer-networks-contribute-value-to-society/. Accessed 21 December 2022.

5. “Great Explorers” by Omar Metwally. 16 September 2018. https://omarmetwally.blog/2018/09/16/great-explorers/. Accessed 21 December 2022.

6. Maestro Ethereum application by Akram Alsamarae and Omar Metwally. National Science Foundation Grant 1937914. https://maestro.analog.earth

7. https://github.com/ethereum/go-ethereum

8. https://github.com/ethereum/web3.py

How cryptography and peer-to-peer networks contribute value to society

By: Omar Metwally, M.D.

3/13/2022

Objective:

To illustrate the utility of cryptography and peer-to-peer networking in protecting the authenticity, integrity, and availability of information.

https://en.wikipedia.org/wiki/Snowflake#/media/File:Snowflake_macro_photography_1.jpg

1. Information is the useful synthesis of data.

Our email inboxes, phones, and hard drives are constantly filling up with data; however, collecting, organizing, and archiving the useful nuggets of information in an ocean of junk requires time, money, and energy. The number of useful emails in my inboxes is a small fraction of the total number of emails, which are mostly spam. I don’t pay for extra storage out of principle. Why fund a company whose spam filters are more likely to block important emails than spam? Why perpetuate the problem?

Similarly with the high-resolution photos which take up so much memory on my phone and hard disk: most of these photographs do not deserve the 2+ MB of memory they occupy on my phone and PC. I’ll commonly snap a photo of a beautiful landscape, a critter I encounter on a walk, or something I need to remember for a short period of time (for example, where I parked). Backing up every photo and video on my phone seems wasteful considering that, like my email inbox, only a small proportion are media that I actually want to preserve. The alternative, however, would be to manually go through each of my inboxes and every photo I take on my phone and make a conscious decision whether to keep or delete a file. This latter strategy often proves far too time-intensive to pursue on a consistent basis.

2. Data that exists in only one location is as good as gone.

I once asked a colleague how he backs up his digital information. “I’ve never needed to back up my data,” he answered. This is a fallacy. Every possible failure of a digital system will eventually and inevitably occur. Hard disks fail all the time. People accidentally delete and lose files. Important bits of information drown in oceans of spam and junk, to the extent that locating them becomes practically impossible. Networked systems get hacked. People lose or upgrade their phones and change platforms, only to realize years later that they never backed up their old Android or iPhone which is now resting in a landfill.

Preserving information in a way that facilitates future retrieval requires:

– a consistent schema for organizing files and directories

– multiple physical (e.g. HDDs and SSDs) and cloud-based storage systems

– a consistent version control schema

– consistency in backing up information to each of these media

In other words, if you really cherish your data, you need to be organized, anticipate what can (and inevitably will) go wrong, and back up consistently. If it’s important information, chances are you’ll also want to encrypt your disks in a way that prevents unauthorized parties from accessing the data, without accidentally losing access to your own data.

3. Cryptography is arguably one of the most useful and powerful technologies in modern-day computing.

Modern cryptography is the basis for digital tools that protect the authenticity and integrity of information. While information ends up in the wrong hands all the time, encryption ensures that only the intended recipient can “unlock” the information. To lay people, “encryption” may conjure messaging apps designed to protect one’s privacy. However, another compelling use case of cryptography, which may be unknown to lay computer users, is to mathematically prove the authenticity of digital information. Algorithms such as SHA256 [https://csrc.nist.gov/glossary/term/SHA_256] can generate a practically unique string of numbers and letters, which can serve as a “fingerprint” for a file’s authenticity. Altering even a single character in a document changes this cryptographic fingerprint.

Just as no two individuals have the same fingerprint, non-identical files yield different cryptographic hashes. For instance, an attorney who needs to ensure the authenticity of a collection of evidence can use a cryptographic hashing algorithm such as SHA256 to prove beyond a doubt that the data do indeed represent what the attorney claims they do. However, it’s important to note that these hashing algorithms do not preserve the actual data to which they refer. It is still incumbent upon the attorney to back up the evidence in a secure and redundant manner. Furthermore, the attorney must ensure that each backup is identical. Although a small discrepancy may or may not be consequential in court (for instance, accidentally adding a space, period, or comma may or may not alter the interpreted meaning of a document), the cryptographic hash will be altered, negating the utility of the hashing algorithm.
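
The sensitivity of a hash to small changes is easy to see with Python’s standard `hashlib` module. The sketch below is only an illustration: the sample sentence and the helper name `sha256_fingerprint` are made up for this example.

```python
import hashlib


def sha256_fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical document text; the second copy differs by one trailing space.
original = "The parties agree to the terms below.".encode("utf-8")
altered = "The parties agree to the terms below. ".encode("utf-8")

print(sha256_fingerprint(original))
print(sha256_fingerprint(altered))

# The two fingerprints differ even though the documents read the same.
print(sha256_fingerprint(original) == sha256_fingerprint(altered))
```

The same comparison can be used to confirm that two backups are byte-for-byte identical: if their fingerprints match, the copies match.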

4. Distributing and decentralizing information is a key value proposition of blockchain networks

Encryption and hashing preceded cryptocurrencies. Hash functions such as SHA256, which is standardized by the National Institute of Standards and Technology, are generally free to use and are accessible via the command line on most computers. Arguably the biggest value proposition of blockchain networks, on a technical level, is their capacity to add verifiable and tamper-proof timestamps to cryptographic hashes, by propagating a verifiable and identical chronological database across numerous peers around the world. Being able to reliably exchange information with thousands of computers across the world, spanning many different geographic areas, yields redundancy that would be implausible to replicate by entrusting any one party to create thousands of backups, spread them around the world, ensure that they can be accessed reliably, and also ensure the integrity of the original information. In reality, governments restrict access to online content all the time. People in affected locations can use tools such as VPNs to try to circumvent these limitations, but as long as a critical number of nodes is online, the information will not be lost, even if it is inaccessible from a certain geographic region due to inability to run a p2p client.

Cryptocurrencies create financial incentives for people to volunteer hard disk space, bandwidth, their time, skills, computing resources, and energy to contribute to a peer-to-peer network. Rather than relying on one party to ensure the integrity, authenticity, and availability of data (which is typically hosted in a relatively small number of geographic locations), blockchains spread this responsibility across many peers. They are essentially distributed databases (also known as “distributed ledgers” when used in the context of exchanging digital value).

5. Ensuring information availability is another value proposition of blockchain networks

I have been experimenting with IPFS (“InterPlanetary File System” [https://ipfs.io/]), a peer-to-peer file-sharing network, since 2017. Each byte stored directly on a blockchain network is relatively expensive. While all blockchains are peer-to-peer networks, not all peer-to-peer networks are blockchains. IPFS, an example of a peer-to-peer network that is not a blockchain, allows users to easily upload directories and files to the network, where they are relayed from node to node. IPFS itself is free to use; that is, there is no subscription fee to cover hosting costs because volunteers around the world share in hosting the data. However, this utopian dream of “share everything, preserve everything” ignores the reality of the cost of hosting data. Bandwidth, disk space, processing power, and electricity cost money. Data hosted on IPFS can be “pinned” using a 3rd-party service, but this crosses the line of decentralization and places trust in a 3rd-party service to ensure the persistence of these data. Furthermore, it’s unclear to me why a 3rd-party service would volunteer their resources freely without charging a hosting fee.

Filecoin is a cryptocurrency developed by the creators of IPFS (Protocol Labs) which aims to solve this missing economic incentive. The Filecoin protocol aims to incentivize miners (people with a lot of computing power and storage capacity) to host others’ data by rewarding them with the Filecoin cryptocurrency in exchange for running software that can mathematically prove that the hosted data (1) exist on their hard drive(s), and (2) can be retrieved by the party that is paying Filecoin in exchange for their data to be hosted.

I downloaded the Filecoin client (“Lotus”) and spent several days running IPFS and Lotus in parallel in order to see if hosting a 113 MB file on Filecoin was a better alternative to using traditional cloud servers, and also to learn about the economics of the Filecoin ecosystem. I provide here my impressions of this limited experience without a recommendation for or against any cryptocurrency.

It took me a few hours to sync the Filecoin mainnet to completion. I had to download a snapshot of the chain in order to sync, and I could not locate a SHA256 checksum of the snapshot used to sync. I was unable to sync by connecting to peers directly. Using snapshots hosted on a centralized server which are not associated with published checksums is never best practice because there’s otherwise no way to ensure the authenticity or integrity of what one thinks they are downloading.
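
When a checksum is published alongside a download, verifying it takes only a few lines. The sketch below is a generic illustration using Python’s standard library, not part of the Lotus tooling; the function names are hypothetical, and the published checksum is assumed to be a hex-encoded SHA256 digest.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks.

    Reading in chunks keeps memory use low for multi-gigabyte files
    such as chain snapshots.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Return True only if the file's digest equals the published checksum."""
    return file_sha256(path) == published_hex.strip().lower()
```

A download that fails this check should be discarded, since it is either corrupted or not the file its publisher intended to distribute.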

The Slack channels used by the Filecoin community are active, and I received timely answers to my questions from knowledgeable contributors. Once the Filecoin chain was synced, I proceeded to upload a 113 MB file using its IPFS hash (that is, the file was already uploaded to IPFS, and I used the IPFS hash to point to the data). The process of uploading data generally entails (1) identifying storage providers (miners) who are willing and able to host one’s data; (2) uploading the data to the storage providers; and (3) paying a transaction fee to upload the data. These transactions are referred to as “deals” and can range from 180 to 540 days in duration. Miners can specify parameters such as the minimum and maximum file size they are willing to host, duration of hosting, and their cost per gigabyte per time period (in the case of Filecoin, per 30-second epoch). Retrieving data involves a separate set of processes, but I haven’t yet made it that far.

In Filecoin, miners host others’ data, which may or may not be encrypted. This is a potential legal gray area because miners generally don’t know what they’re hosting, and miners are often located in jurisdictions separate from the party seeking hosting services. Deals can be arranged on a Slack channel or third-party reputation marketplaces, but rarely does one know whom exactly they’re dealing with. What happens if a party is uploading content that is illegal in their jurisdiction? Or perhaps legal in their jurisdiction but forbidden in the miner’s jurisdiction?

The process of trying to host data on Filecoin is far more complex than using traditional cloud servers. The average person is unlikely to succeed without a strong commitment to the steep learning curve involved in using these command-line tools. Some of the complexities can theoretically be simplified using third-party services, but this can potentially negate the advantages of using an incentivized p2p network in the first place.

The Filecoin protocol incentivizes miners to contribute their computing resources (and time) to host others’ data by rewarding them for reliably hosting the data and financially punishing them for failing to do so by deducting penalties from the collateral they have to put up. Due to the relatively early stage of development of these tools, Filecoin documentation recommends making multiple deals with up to 10 different miners to ensure the availability of one’s data, in case one or more miners do not make good on their deal.

On my first attempt to upload a 113 MB file, the “deal” failed for unclear reasons, despite my attempts to troubleshoot the Lotus client’s behavior with the help of technical support volunteers. My starting balance was one Filecoin (1 FIL). Here are some numbers central to the (failed) transaction:

Initial wallet balance: 1 FIL

Cost of hosting 113 MB file with a particular miner for 180 days: 0.01296 FIL ($0.225504, at an exchange rate of $17.4 per FIL on March 12th, 2022).

Wallet balance after the escrow funds were returned to my wallet (i.e. after the deal failed):

0.996353443699298176 FIL

Difference between initial and final wallet balance = amount of “gas” burned (network transaction fees):

0.006646556300701767 FIL

Therefore, 51.285% of the original proposed cost of hosting the file (0.01296 FIL) was burned in the form of gas. In other words, 0.006646556300701767 FIL / 0.01296 FIL = 0.5128515664121734

While the amount of burned gas may seem trivial, it accounts for a majority of the cost of the failed deal (51.285%)! If the goal is to establish 10 deals with 10 different miners, then the cost of gas associated with failed deals can quickly add up.
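
The arithmetic above can be reproduced exactly with Python’s `decimal` module. This sketch takes the deal cost, burned-gas figure, and exchange rate as stated above; the assumption that every failed deal burns the same amount of gas is purely illustrative, and the helper name `gas_for_failed_deals` is hypothetical.

```python
from decimal import Decimal

deal_cost_fil = Decimal("0.01296")                # proposed cost of the 180-day deal
gas_burned_fil = Decimal("0.006646556300701767")  # burned-gas figure as stated above
usd_per_fil = Decimal("17.4")                     # exchange rate on 12 March 2022

# Share of the proposed deal cost that was burned as gas.
gas_share = gas_burned_fil / deal_cost_fil
# Proposed deal cost converted to US dollars.
deal_cost_usd = deal_cost_fil * usd_per_fil


def gas_for_failed_deals(failed_deals, gas_per_failure=gas_burned_fil):
    """Total gas burned if each failed deal burns `gas_per_failure` FIL.

    Illustrative assumption: every failure costs the same amount of gas.
    """
    return failed_deals * gas_per_failure


print(f"Gas as a share of deal cost: {gas_share:.3%}")
print(f"Deal cost in USD: {deal_cost_usd}")
print(f"Gas for 10 failed deals: {gas_for_failed_deals(10)} FIL")
```

Under that assumption, ten failed attempts would burn several times the cost of one successful 180-day deal.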

6. Mathematical proof of data availability may or may not be necessary

There are certainly cases in which it’s necessary to prove mathematically not just the integrity and authenticity of data (for example, using hashing functions such as SHA256), but also the availability of the data. Filecoin aims to mathematically prove both the existence and availability of data hosted on a peer-to-peer network while incentivizing miners to uphold deals with parties who need data hosted. However, there are also many instances where a SHA256 checksum uploaded to a blockchain with an immutable timestamp is more than sufficient. In this latter case, the responsibility of organizing, archiving, and maintaining identical copies of these data falls upon the party willing to pay for the weight of this proof. As mentioned above, there are instances where entrusting miners to store and deliver content may be undesirable for legal reasons, privacy, or simply the need to trust that at least one miner with whom one conducts a deal will uphold their end of the deal.

In conclusion, cryptography and peer-to-peer networking are powerful technologies that can help protect the integrity of information and ensure its persistence. Various blockchain networks use financial incentives in different ways to provide a variety of value propositions to network participants. Clearly understanding one’s goals as they relate to information preservation and exchange, and clearly understanding each network’s value proposition, is key to making good investments of one’s time and resources.

A stroll through Victory Mansions

Omar Metwally, MD
Analog Labs
19 November 2018

It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions, though not quickly enough to prevent a swirl of gritty dust from entering along with him.

Opening paragraph of George Orwell's Nineteen Eighty-Four. 
306 characters, including spaces.

I did some back-of-the-napkin math to calculate how much it would cost today to upload George Orwell’s novel Nineteen Eighty-Four to the Ethereum blockchain.

To upload the opening paragraph using this Ethereum contract (there are much more efficient ways to accomplish this using Solidity), the transaction would cost 290697 gas under current network conditions. If the entire 576,789-character novel were uploaded in the same manner, it would cost 576789 * 290697 / 306 = 547943895.20588 gas. The gas price is currently about 2.2 * 10^9 wei per unit of gas [1].

(547943895.20588 gas) * (2.2 * 10^9 wei / 1 gas) * (1 Ether / 10^18 wei) = 1.2 Ether.

The caret symbol (X^Y) here indicates “X to the power of Y”.

In this manner, Orwell’s Nineteen Eighty-Four would cost 1.2 Ether to upload to the Ethereum blockchain, where it would be permanently and publicly available, served by more than 10,000 nodes.
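
The estimate can be checked with a short Python sketch. It uses only the figures quoted above and assumes, as the estimate does, that gas scales linearly with the number of characters; the constant and function names are chosen for this illustration.

```python
from fractions import Fraction

PARAGRAPH_CHARS = 306           # opening paragraph, including spaces
PARAGRAPH_GAS = 290_697         # gas quoted for uploading that paragraph
NOVEL_CHARS = 576_789           # length of the full novel in characters
GAS_PRICE_WEI = 2_200_000_000   # about 2.2 * 10^9 wei per unit of gas
WEI_PER_ETHER = 10**18


def upload_gas(chars):
    """Estimate gas for `chars` characters, scaling linearly from the paragraph."""
    return Fraction(chars * PARAGRAPH_GAS, PARAGRAPH_CHARS)


def upload_cost_ether(chars):
    """Estimate the cost in Ether of uploading `chars` characters."""
    return upload_gas(chars) * GAS_PRICE_WEI / WEI_PER_ETHER


print(f"Gas for the novel: {float(upload_gas(NOVEL_CHARS)):.5f}")
print(f"Cost in Ether: {float(upload_cost_ether(NOVEL_CHARS)):.1f}")
```

Because the gas price fluctuates with network conditions, changing `GAS_PRICE_WEI` shows how sensitive the result is to that single input.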

If Ether were regarded in terms of its utility rather than as a speculative or financial instrument, there would likely be much less price lability, assuming society’s utility for a technology in general changes at a much slower rate than a market’s enthusiasm for securities and commodities. For instance, the average cost of electricity in the residential setting rose from 11.26 cents per kWh in 2008 to 12.89 cents per kWh in 2017 [2]. Contrast this with the cost of Ether ranging from less than $1 in 2015 to more than $1,400 in early 2018.

How much does Ether really cost? A dollar? $100? $1000?

One way to begin answering this question is to study current market rates of cloud hosting services [3, 4]. Google offers a 2TB standard storage tier at $0.000274 per hour, and Amazon’s standard EC2 instances can range from $94 to $2,367 annually. A direct comparison with the cost of uploading Orwell’s novel is inaccurate because:

  • Information uploaded to the blockchain is permanent as long as a majority of nodes continue perpetuating the blockchain. Cloud hosting contracts are only as permanent as a recurring credit card payment, a company’s existence, and its willingness to serve data.
  • Google and Amazon cloud instance capacity is much larger than the 590 KB size of Nineteen Eighty-Four as a text file.
  • Cloud hosting companies charge for bandwidth, whereas there are no blockchain transaction costs associated with downloading blockchain data.
  • Conversely, running blockchain clients consumes a lot of bandwidth.
  • A large, distributed network’s downtime is virtually zero, and such a network is theoretically much more resistant to hacking.

I offer file storage as an imperfect thought experiment because a significant part of what consumers pay for when purchasing a smart phone is the ability to store large amounts of media, access and share these data. This thought experiment is only a starting point to answering the question of how much one Ether actually costs.

It took decades for the internet’s value to manifest, which today often takes the form of profiling users and using this information to sell digital ads. As one of my academically-minded siblings keenly points out, however, one important difference between the origins of the internet as we know it today and blockchain networks whose tokens are traded on exchanges is that the internet was built in a more farsighted manner without the objective of making money for speculators. ARPANET, the precursor to the modern internet, initially ran on four Interface Message Processors (IMPs) at UC Santa Barbara, Stanford, the University of Utah, and UC Los Angeles [5]. Of course, the internet has changed dramatically since its early years, and technology in general is constantly evolving under the pressures of regulation and free markets.

Crypto markets poisoned blockchain research by muddling networking protocols and stake in open source projects with financial speculation. On one hand, capital is an important element of many large endeavors. On the other hand, skyrocketing prices and price lability can breed greed and resentment and can hinder the ability of programmers, consumers, and researchers to actually use networking protocols. The lower the price of crypto, the cheaper the transactions on the network and the more accessible the protocol is to the average consumer.

So how much does Ether really cost? A dollar? $100? $1000?

One step toward answering this complicated question is to ask: how much would you pay to perpetually host George Orwell’s Nineteen Eighty-Four (or another 590kb text file or image)?

Great Explorers

Omar Metwally, MD
Analog Labs
15 September 2018

 

She was one of the truly fortunate people who discover what they love to do, have the means and the courage to follow their passion, and the gift to share their discoveries.

Robin Hanbury-Tenison on botanical artist Marianne North

While I was traveling in Japan with friends, Robin Hanbury-Tenison’s The Great Explorers captivated and inspired me with its collection of biographies of courageous individuals who explored and discovered continents, oceans, deserts, caves, and rivers. These people lived in times when large parts of Earth’s surface were unknown to humanity and entirely uncharted, and their stories left me wondering which frontiers stand before our own contemporaries in pursuit of advancing society’s collective knowledge.

Most of these explorers lived before the advent of the digital age, relying on analog instruments to study terra nova: magnetic compasses, sextants, pacing beads, and their powers of observation. A journey that spans thousands of miles over years requires a deliberate estimation of the minimum amount of equipment necessary to facilitate the travelers’ survival and studies without burdening them. In stark contrast, we live in a time of abundant and oftentimes superfluous technology. During the past weeks of travel, I meditated on the question of how much technology one actually needs without becoming burdened by it. Every day reminded me of the joys of good company, the mind’s capacity to acquire languages, the utility of answering questions by asking locals rather than searching the web, and a postcard’s ability to distill thoughts into a memorable moment. Translation software and internet access, while sometimes handy, are no substitute for a sound grasp of a foreign language and asking locals how to get around. Google can help translate a phrase in a pinch, but it’s unlikely to know that a typhoon blocked a bus route and that a taxi driver will find the safest way home.

Richard Burton taught himself to speak 27 languages by the time he died in 1890, and his mastery of cultural camouflage opened doors to civilizations in Africa, India, and the Middle East which would have otherwise been closed off to Europeans of his time. Gertrude Bell, the first female officer in British Intelligence, mastered Arabic and Persian, translating poems by Hafiz as she trekked across deserts meeting local sheikhs and tribe leaders.

The tools one has at hand bias one’s approach to discovery. Compare our trip to Japan, for example, with that of Francis Garnier, who embarked on a treacherous journey to explore the Mekong with his crew. Compared to Garnier’s crew, we enjoyed every luxury available to modern travelers: airplanes, hotel reservations at our fingertips, smart phones, and Google Translate. And should we stray from cell reception or forget to charge our phones, my GPS-connected RPI can still pinpoint our whereabouts anywhere on Earth. Unlike the fearless explorers who risked life and limb in pursuit of beliefs, passions, or sheer love for discovery, who immersed themselves in native cultures and dedicated lifetimes to observing and describing, one might say we left Japan only slightly more acquainted with its people and culture than when we arrived.

Humankind – the individual mind and collective human behavior – is a perpetual frontier. Know thyself, says the wisdom of ancient civilizations. Most interesting to me and pertinent to my research is the question of how human societies can use finite resources to provide better lives for future generations. A solitary zero-sum endeavor has the potential to become a vast leap forward when knowledge is shared effectively with a global village. This is what excites me most about open source collaboration and paradigms of participatory computing, such as peer-to-peer networking and data structures based on them.

Norwegian explorer Roald Amundsen left his medical studies to pursue his childhood dream of traversing the Northwest Passage. Having gone into debt to acquire a shipping vessel and assemble a team that would succeed in achieving his childhood dream – as well as becoming first to reach the South Pole – he departed on his journey hours before debt collectors planned to seize his ship. Debt was a recurring theme in many of these ventures, and many explorers burned through personal fortunes, imperial funds, or private capital to fund their expeditions. Amundsen’s story is an example of humankind’s capacity to lift itself by its own bootstraps, to produce lasting humanistic and technical works that are greater than the sum of individual labors. Amundsen’s successful return converted the same debt collectors into patrons and benefactors eager and proud to support his future voyages.

I chronicled our trip on the Ethereum network for the sake of posterity and to illustrate the utility of technologies that have grown into areas of interest and focus for me. For non-technical users, the easiest way to download these points from the Ethereum blockchain is to use the Ethereum Mist Browser (similar to a web browser for blockchain). They can also be downloaded using numerous command-line frameworks for interfacing with the blockchain, such as Web3.py.

Fleet Fox contract address:

0xe18FE4Ded62a8aa723D6BE485B355d39d409354d

Link to Fleet Fox ABI

Many colleagues and friends have asked me, in the context of the distractions of financial speculation, why anyone would bother developing an application on a blockchain and forego the relative ease and inexpensiveness of services offered by large, established corporations. The reason why most people, myself included, use services offered by large tech companies is because they sell useful products. It is the logic of a free market. For example, I have a MacBook and iPhone, and I have benefitted from Apple, Google, and Amazon’s products. My work studio is also filled with home-made computers running Linux-based operating systems, and I use the Ethereum blockchain on a daily basis to run my and others’ code, which performs familiar tasks such as networking, storing, and moving information. To enjoy the convenience of mainstream products such as iMessage, iCloud, and iPhone, one must pay the Apple “tax” by purchasing one’s way into the Apple ecosystem, an exclusive gateway to access one’s multimedia, emails, text messages, documents, and personal contacts’ information. To enjoy the convenience of Google’s cloud, one pays the Google “tax” by waiving a certain degree of privacy and control over one’s personal data, which is only as permanent as a recurring credit card payment, the company’s existence, and the output of its machine learning algorithms. The same analogies and parameters can be extended to Facebook and Amazon.

The notion of transaction costs on blockchain networks is the analogous “tax” one pays for the security, persistence, and control over one’s information on a decentralized network, which are sacrificed more or less when relying on corporations. It is the cost of digital sovereignty. At the time of writing, the transaction cost of uploading each individual GPS location onto the blockchain was 0.0004164664 Ether, or $0.09 at a rate of 1 ETH = $220 USD.

Blockchain technologies are in their infancy. Using a Blockchain Messaging Service today reminds me of sending email in the early 90s, when my uncle (a networking engineer) and a few hobbyists in the UK and Japan, whom I had never met, were the only people in my address book. One of my first books was a kid’s guide to the internet, which listed a handful of websites, such as Nickelodeon, Kellogg’s, and NASA, along with the authors’ advice to have a pencil and paper handy to doodle because some images (very low-resolution by today’s standards) could take up to 30 minutes to load on slow dial-up connections. Like those early days of the internet, blockchain applications still have a long way to go. And that’s what makes working with this technology fun and worthwhile. It’s a new frontier.

Fleet Fox (GitHub repo | Fleet Fox receiver) is an application that allows decentralized exchange of information and value tied to one’s physical location. It’s built on the same open source infrastructure I’ve used to chronicle our trip to Japan, and I’m excited to pilot the technology as a backend for vehicle fleet-sharing services in the coming months. I would be grateful and humbled to receive feedback from fellow explorers using it to collaboratively build a behavior-centric map of the world on the Ethereum blockchain.


Take-Home Lessons:

  • What you really need is good friends. Technology is optional.
  • Anything can be learned.
  • Transaction costs on blockchain networks contribute to security, persistence, and control over one’s information on a decentralized network.

Participatory Computing

One of my family friends congratulated me recently on the success of the ‘Ether startup,’ leaving me briefly puzzled. While the parallels between issuing common stock, stock options, and digital tokens are relatively intuitive, this was the first time I heard an open source community described as a company — by someone unfamiliar with open source software. Technically, Ethereum isn’t a startup but an organization rooted in open source communities working to develop decentralized, logic-gated information and value exchange. There are similarities, and differences, between open source communities funded by digital tokens and traditional startup equity.

Transitioning from a Clinical Informatics fellow at UCSF to starting an R&D lab has provided me an opportunity to reflect on the valuable mentorship I’ve been lucky to receive along the way.

Analog Labs is an applied research laboratory aiming to:

  1. Educate societies about blockchain technology and emerging paradigms in Participatory Computing
  2. Apply this research directly toward social good
  3. Be financially and environmentally sustainable

The excitement surrounding cryptocurrencies drew attention to a field in tech that had been niche until relatively recently. Capital allows companies to grow and subsequently create value for society. However, too rapid an influx of wealth into cryptocurrencies can outrun the ability of these technologies to mature and evolve. Rapidly increasing prices of cryptocurrencies can bring wealth (and ruin) to speculators and can also discourage the spending of Ether to actually run applications. The excitement surrounding the industry, despite being a source of attention and potential investment funds, needs to keep pace with the development of these technologies for the sake of their long-term health.

Since I started purchasing health insurance last month ($902.04 per month for medical insurance and $32.52 per month for dental insurance), I have been reminded of the dizzying cost of healthcare in the United States, the glaring economic and public health problem that sparked my interest in Ethereum several years ago. Analog Labs’ flagship project is a study of grassroots primary care models on the Ethereum blockchain. This living experiment is an opportunity to tap into a body of literature in Global Health and international communities’ experience with designing creative solutions to the challenge of funding healthcare’s perpetual journey to better.

Analog Labs is also seeking to help develop 2-4 projects that further the lab’s goals of applied research for sustainable social good by providing funding, technical expertise, and collaborative work, especially in the areas of:

  • health insurance
  • environmentally-friendly shipping materials
  • public transportation

 

I’m grateful to Betty Tran, Steven Truong, Peter Mikhail, Royd Carlson, The Haham-Grossman family, Linh Tran, Darlene Nguyen, Dr. Blake Gregory, Dr. Indhu Subramanian, Dr. Taft Bhuket, Dr. David Avrin, Dr. Sidhartha Sinha, Dr. Scott Enderby, the Highland family, the UCSF community, Bella Shah, Seth Blumberg, Dana Gersten, Tanner Irwin, Youssif Abdulhamid, Shahzad Ahsan, and friends at UCSF’s Aldea community for their support, mentorship, and contributions to this work.

 

Omar Metwally, MD

 

 

Re-programming the developing world: the tobacco problem

Omar Metwally, MD
University of California, San Francisco

If I were a government or private health insurance company trying to improve public health and reduce costs associated with treating cancers, chronic obstructive pulmonary disease, cardiovascular diseases, and many other preventable tobacco-related illnesses, how would I take on this challenge? Would offering nicotine addicts cash or subsidizing their insurance premiums curtail these unhealthy behaviors? While there’s some evidence that paying smokers works in the short term, the effect is modest and has not been shown to be a successful long-term strategy. The work of Professor Richard Thaler, the 2017 recipient of the Nobel Prize in Economic Sciences, demonstrates how humans’ bias toward short-term rewards contributes to poor long-term decision making. Especially in the case of chemical dependence, immediate positive reinforcement (e.g. a puff of a cigarette) trumps the relative abstractness of long-term planning. Who doesn’t want to live a long, healthy life free of suffering, expensive healthcare bills, and the loss of independence associated with frequent trips to the hospital? But to an addict, a drag from a cigarette is more attractive than the prospect of being rewarded with something like health, time, or disposable income in 10, 20, or 50 years’ time.

Breaking chemical dependence, and modifying behavior (such as a sedentary lifestyle, unhealthy eating habits, and compliance with preventive healthcare) in general, depends to a large extent on modifying one’s environment. This includes things like eliminating triggers (ashtrays, packs of cigarettes and lighters lying around), and enablers. More effective than paying someone to ditch unhealthy habits may be helping someone with an addiction change their social context. I’m skeptical about the efficacy of an incentive program that would, for example, pay smokers cash in exchange for urine tests that verify an individual’s nicotine-free status. The desire for long-term abstinence from substances, weight loss, or regular exercise must be intrinsic and reinforced by the company one keeps. A chronic smoker is more likely to smoke among a group of friends who also smoke than in an environment where they’re constantly subject to inconvenience, protest, or punishment whenever they reach for a cigarette. The corollary is the hypothesis that helping a smoker and their group of smoking friends quit together may be more effective than limiting an intervention to individuals.

Continuing the thought experiment, how does a health minister or surgeon general help people modify unhealthy behavior while changing one’s entire psychosocial situation? Instead of an intervention like a urine test, which people may find embarrassing or perhaps not worth the inconvenience of extra pocket money paid at the conclusion of a research study, what might a reward system look like which compensates individuals based on a convenient, dignified, and inexpensive “proof-of-motivation”? If I’m a smoker interested in quitting and make my intentions clear to my family, friends, and co-workers, could their vouching for me serve as such proof-of-motivation? If I truly muster the willpower to not smoke for a week, month, or year, could the people closest to me supplant something as sensitive/specific as a blood or urine test?

A hacker-mind will be quick to point out that a nicotine addict, if determined to do so, will find a way to game simple trust-based systems, whether it means stepping into -20C weather, walking for a mile to a secluded smoking spot, or sneaking cigarettes while driving to and from work. Moreover, a smoker could easily convince others to lie about their behavior in exchange for sharing the reward with colluders. What is necessary is a more perfect mechanism for allowing individuals to vouch for one another’s behavior. Earlier in my career, I had a tendency as a technologist in general and blockchain researcher in particular to reach for technology XYZ and ask, what can I do with this technology? I see this bias throughout Silicon Valley; we love to build things, and technology is an easy starting point for our desire to effect change.

The organizational and technological merits of my specialty’s distributed, peer-to-peer paradigms are rather clear. The other half of the blockchain equation, proof-of-work, makes sense in the context of value stores, a feature common to the two predominant cryptocurrencies (Ethereum and Bitcoin) — in addition to Ethereum’s Turing completeness and logic layer. In the excitement of embracing new technologies, one should be wary of shoehorning technologies into domains where a logical fit doesn’t exist. Bitcoin as a digital asset makes sense. Ethereum as a value store — and much of its organizational functionality  — makes sense.

But with the goals of promoting healthy behaviors, helping developing countries kick a terribly addictive habit, and improving air quality as our starting point, how does one begin to effect change without burning up the planet’s resources in the process (one Bitcoin transaction wastes enough energy to power a household in a developing country for weeks)?

I would love the feedback of people thinking about this problem from different points of view. More than a billion human lives, which will be claimed by the tobacco industry in the 21st century, depend on it.

 

 

 

The Nile and the Ethereum Blockchain

Omar Metwally, MD
University of California, San Francisco

Distributed Data Sharing Hyperledger (DDASH).
=============================================
   Github repository
   -----------------
   Project website
   -----------------

Like blood rushing through a major artery, the Nile flows north from its origins in eastern Africa, nourishing hundreds of millions of people in Ethiopia, Sudan, and Egypt. Over millennia, the world’s longest river has turned otherwise uninhabitable deserts into fertile farmland, giving birth to civilizations that depend on its water for sustenance and trade. Herodotus, the ancient Greek historian, described Egypt as “the gift of the Nile,” and in the southern Egyptian cities of Aswan and Luxor, the Nile’s critical importance into modern times is ubiquitously apparent.

I step outside Aswan airport into a warm, brilliant January morning. A sweet breeze and southern Egyptians’ lightness greet us, a stark contrast to Cairo’s frenetic bustle. A charming Nubian man drives us to our hotel, and as we pass acres of hydroelectric generators and the Aswan High Dam, the Nile’s modern-day importance to a country dependent on its every last drop comes into focus.

The Ethiopian government began constructing the Grand Ethiopian Renaissance Dam, designed to be Africa’s largest hydroelectric power plant, in 2011. Although the $6.4 billion project is well underway, its ramifications remain incompletely understood. The potential threat of depriving downstream countries (the Nile flows from south to north) of water and hydroelectric energy has raised concerns about the dam’s impact on human life.

While hydropolitics is a step removed from my field of Clinical Informatics, this complicated situation involving numerous parties with conflicting interests (sound familiar?) piqued my interest. The U.S. healthcare system, like this sensitive hydropolitical situation, is plagued by many conflicting interests with little incentive to cooperate. Stifled health information exchange has bred a climate of competition rather than cooperation, ultimately to the detriment of individuals. I began my career as a blockchain researcher in 2014 when I realized this paradigm’s potential to create equity and promote cooperation. The conflict surrounding the Nile and the Renaissance Dam is a vivid demonstration of how the Ethereum blockchain can help nations solve a geopolitical conflict surrounding a scarce natural resource through cooperation rather than competition. My core thesis on blockchain, a technology that bridges computing, psychology, and economics, is that opportunities for cooperation will arise naturally as individuals benefit from increasing opportunities to participate in decision-making on all scales.

To demonstrate these principles and test the above hypothesis, I spent several jet-lagged nights deploying a Nilometer contract on the Ethereum blockchain. This Ethereum contract lets parties bid for a minimum Nile water level and send a variable amount of Ether to support their bid. If the next month’s water level meets this minimum, these funds move from digital escrow to a pre-determined recipient (for example, a government, non-profit, or corporation). 
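The escrow logic described above can be sketched off-chain in Python. This is not the deployed Nilometer contract. The class names, the account labels, the water-level values, and the rule that unmet bids simply remain in escrow are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bid:
    bidder: str       # hypothetical account label
    min_level: float  # minimum water level the bidder is backing
    amount_wei: int   # Ether sent with the bid, in wei (1 ETH = 10**18 wei)


@dataclass
class NilometerEscrow:
    recipient: str                            # pre-determined recipient
    bids: list = field(default_factory=list)  # bids currently held in escrow
    paid_out_wei: int = 0                     # total released to the recipient

    def place_bid(self, bidder: str, min_level: float, amount_wei: int) -> None:
        if amount_wei <= 0:
            raise ValueError("a bid must be backed by a positive amount")
        self.bids.append(Bid(bidder, min_level, amount_wei))

    def settle(self, next_month_level: float) -> int:
        """Release every bid whose minimum level was met; return the amount released."""
        met = [b for b in self.bids if next_month_level >= b.min_level]
        # Assumption: bids whose minimum was not met stay in escrow.
        self.bids = [b for b in self.bids if next_month_level < b.min_level]
        released = sum(b.amount_wei for b in met)
        self.paid_out_wei += released
        return released


# Hypothetical bidders and levels, chosen only to exercise both branches.
escrow = NilometerEscrow(recipient="example-nonprofit")
escrow.place_bid("alice", min_level=10.0, amount_wei=2 * 10**18)
escrow.place_bid("bob", min_level=12.0, amount_wei=3 * 10**18)
released = escrow.settle(next_month_level=11.0)
```

Amounts are kept as integers in wei, the smallest unit of Ether, to avoid floating-point rounding when funds are summed.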

nilometer

Hydrological Time Series data were obtained from Technical University of Munich (Deutsches Geodätisches Forschungsinstitut an der Technischen Universität München).

 

Trolling for a wealthier world

Omar Metwally, MD 
University of California, San Francisco

“No,” said the priest, “you don’t need to accept everything as true, you only have to accept it as necessary.” “Depressing view,” said K. “The lie made into the rule of the world.”  – The Trial (Franz Kafka)


Goal

To understand what motivates people to create and share knowledge.

Plot

Larry is a cafe owner on a mission to brew the world’s best cup of coffee.

larry
Larry’s on a mission to brew the world’s best coffee

The Cast:

Roy Bender:  Australian-American physician who invented Roy’s Retractable Needle in 1990. His patents in Australia, America, and Germany brought him great success and have since expired.  

surfer.jpg
Dr. Roy Bender enjoying his patents’ success

Ernesto Bernal: Ernesto is an altruistic Mexican inventor whose outrage at the cost of American “Epi Pens” (life-saving medical devices used to treat potentially fatal allergic reactions by delivering epinephrine into the thigh muscles) inspired him to invent Ernie’s Excellent Pen. Ernie’s Excellent Pen uses the technology behind Roy’s Retractable Needle to make this life-saving medication more affordable for patients.

ernesto.jpg
Dr. Ernesto Bernal building “Ernie’s Excellent Pen”

Dr. Xu: Chinese scientist looking for a way out of a dead-end postdoc. He spends a lot of time trolling websites like reddit in between experiments.

xu.jpg
Dr. Xu, Full-time redditor, part-time postdoc

Dr. Chang: Chinese scientist who invented China Pen in 2005, her own version of the epi pen. Dr. Chang, a brilliant scientist without business aspirations, quickly forgot about her invention and moved on to other research projects. She and Dr. Xu were postdocs together and longtime friends. 

chang.jpg
Dr. Chang mentoring her student

Part I

Looking for inspiration for his next big invention to save him from a stagnant postdoc, Dr. Xu browses the blackswan network, a database of inventions and ideas, on a quiet afternoon in his lab. He likes the website because it’s like a nerdy version of reddit, a website that has occupied a lot of his time recently. While browsing blackswan he stumbles upon Ernie’s Excellent Pen (a medical device built using Roy’s Retractable Needle), which immediately reminds him of his old friend’s China Pen. He creates an “attribution” on the blockchain, which is an association between two devices or components.


To a human, this attribution looks something like:
[Appraiser's (Dr. Xu's) Ethereum address, Resource1* (Roy's Retractable Needle), Resource2* (Dr. Chang's China Pen), Timestamp, Transaction Handle]

To a machine, this same attribution might look like:
[0x6060604052341561000f576, Qmt3z9320ba, Qz429ccr082, 1508029285, 0xc6a493eb108266c548906c8b]

This attribution allows others to see that Roy’s Retractable Needle and the China Pen are related to one another. All community members can “upvote” an attribution as a useful one and create their own attributions (so that Dr. Xu can learn about related inventions which he wouldn’t have otherwise encountered). For someone on the hunt for the next big idea, this is a great way to find inspiration and learn about what others are building.
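As a rough illustration, an attribution and its votes could be modeled as follows. This is a sketch only: the field names mirror the five-part tuple shown above, while the in-memory vote tally is a hypothetical stand-in for whatever the blackswan network actually records on the blockchain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    appraiser: str  # appraiser's Ethereum address
    resource1: str  # content address of the first resource
    resource2: str  # content address of the second resource
    timestamp: int  # Unix time, in seconds
    tx_handle: str  # transaction handle


votes = {}  # hypothetical tally: maps an Attribution to its upvote count


def upvote(attribution: Attribution) -> int:
    """Record one upvote and return the attribution's new total."""
    votes[attribution] = votes.get(attribution, 0) + 1
    return votes[attribution]


# The example values from the machine-readable attribution in the text.
xu_attribution = Attribution(
    appraiser="0x6060604052341561000f576",
    resource1="Qmt3z9320ba",
    resource2="Qz429ccr082",
    timestamp=1508029285,
    tx_handle="0xc6a493eb108266c548906c8b",
)
upvote(xu_attribution)
upvote(xu_attribution)
```

Making the record immutable (frozen) lets it serve as a dictionary key, which matches the idea that an attribution, once written, is not edited.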

*As a side note, resources are named something like Qmt3z9320ba, and these names also function as locations (addresses) of files with detailed information about each invention, including schematic drawings and textual descriptions. If any of the files to which these addresses point is modified, its address changes entirely, which is one way to make sure each timestamp accurately reflects the information with which it’s associated.
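The property described in this side note, that a file's name is derived from its contents, can be demonstrated with an ordinary cryptographic hash. The sketch below uses SHA-256 as a stand-in; the actual naming scheme behind addresses such as Qmt3z9320ba is not specified here and may differ, and the file contents are made up for the example.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from file contents (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical file contents describing an invention.
original = b"Better Coffee Press: spring-loaded plunger, schematic v1"
modified = b"Better Coffee Press: spring-loaded plunger, schematic v2"

address_original = content_address(original)
address_modified = content_address(modified)

# The same bytes always yield the same address; any edit yields a new one.
print(address_original == content_address(original))
print(address_original == address_modified)
```

Because a timestamp is recorded against the address rather than a mutable file name, it cannot silently come to refer to altered contents.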

Part II

Larry owns a hip cafe in Tel Aviv and has invented many gadgets on his quest to brew the perfect cup of coffee. As he sips on a cup of coffee and browses the blackswan network, inspiration strikes, and he has a new idea for a modified French press that could be built using the spring-loaded mechanism underlying Roy’s Retractable Needle. Larry draws up some sketches and a description of how his Better Coffee Press would work and confidently uploads the information to the blackswan network. He doesn’t need to worry about someone else claiming ownership of his ideas because there’s a timestamped record of this information on the blackswan network. 

Part III

The owner of Oakland Standard, a manufacturer in Oakland, California, discovers Larry’s sketches a week later and calls him in his cafe. He loves the idea, he tells Larry, and wants to bring his product (Better Coffee Press) to the U.S. market. One of Oakland Standard’s designers suggests using a slightly modified component from Ernie’s Excellent Pen (Larry’s never heard of Ernie or his epi pen, but he likes Oakland Standard’s suggestion). Larry seals the deal with Oakland Standard. 

Oakland Standard eventually takes the product to market, and it’s a hit among hipsters and coffee connoisseurs across the U.S. After Oakland Standard (and Larry) make their millions, the design for Better Coffee Press appears on the blackswan network around the same time that the patent is published and viewable on Google Patents and the US Patent and Trademark Office website:


Type: device
Name: Better Coffee Press
Function: hand-operated coffee brewing device
Content-addressed hash: Qbb4a27e6783
Author1: Larry Bucks
Timestamp: 1601036650
Classifier1: Food and Beverage
Classifier2: Brewing System

Part IV

Dr. Xu is having another rough day in his lab. He heard about the Better Coffee Press on reddit and bought one so he can brew coffee in between experiments. As soon as his coffee press arrives in the mail, Dr. Xu brews his first cup of coffee, sets his laptop on his lab bench, and pulls up a stool. Sipping an extraordinarily delicious cup of coffee, he admires the technical genius of this new coffee press and begins dismantling the gadget. As he takes apart the coffee press, he records the following attributions (logical associations between the coffee press and its underlying components) on his quest for inspiration for his own inventions:


Human version:
[Appraiser's (Dr. Xu's) Ethereum address, Resource1 (Roy's Retractable Needle), Resource2 (Larry's Better Coffee Press), Timestamp, Transaction Handle]

Machine version:
[0x6060604052341561000f576, Qmt3z9320ba, Qbb4a27e6783, 1508029285, 0xa8e493eb108266c548906331]

[Appraiser's (Dr. Xu's) Ethereum address, Resource1 (Ernie’s Excellent Pen), Resource2 (Larry's Better Coffee Press), Timestamp, Transaction Handle]

[0x6060604052341561000f576, Qz429ccr082, Qbb4a27e6783, 1508029285, 0xd8a493eb106206a448906257]

Part V

Oakland Standard sells tens of millions of dollars worth of the Better Coffee Press, and Larry makes a fortune in licensing fees. Meanwhile, Roy, Ernie, and Dr. Chang have also made millions — in tokens.

Whenever blackswan community members like Dr. Xu appraise information by creating and voting on the quality of attributions, inventors like Roy, Ernie, and Dr. Chang receive tokens on the blackswan network. 

But why would anyone care about earning tokens when they could earn real money like Larry and Oakland Standard? Aren’t these tokens just Monopoly money? Larry and Oakland Standard earned their wealth by operating within the intellectual property systems of each respective country where they manufactured and sold the Better Coffee Press. They had the financial resources to pay intellectual property attorneys tens of millions of dollars in fees to draft and review contracts, and even more to enforce their patents by taking infringers to court.

But what about all the smart people out there who don’t have the same access to intellectual property attorneys and millions of dollars in investment capital? 

Larry may be a clever capitalist, but he also sees the value of the novel economy emerging around the blackswan network. As he sips on a cup of coffee, Larry is already planning his next big venture. He announces on his cafe’s website that he’s on a quest to build an even better coffee brewing system and drafts an Ethereum contract that will award $5 million to all the tinkerers out there who make the most meaningful intellectual contributions to his future invention. Larry types up his Ethereum contract, buys $5 million worth of Ether, and sends these funds to be held in digital escrow. He then creates this entry for his future invention on the blackswan network, which he calls Best Coffee Press:



Type: device
Name: Best Coffee Press
Function: hand-operated coffee brewing device that keeps coffee warm and serves up to 6 people 
Content-addressed hash: Qzt7w201e55j 
Author1: Larry Bucks 
Timestamp: 1720015640 
Classifier1: Food and Beverage 
Classifier2: Brewing System

Part VI

One year and many blockchain transactions later, new records of device components and devices have been created on the blackswan network, new attributions have been made, and millions of makers have earned tokens for their contributions. Larry has also amassed a personal fortune as a result of his second contract with Oakland Standard to manufacture and sell his latest invention, Best Coffee Press. 

Larry’s smart contract then distributes the $5 million that has been held in escrow for the past year to 280 inventors on the blackswan network whose work has contributed to the creation and success of Best Coffee Press. Rather than dividing $5 million equally among 280 people (each receiving $17,857.14), Larry wrote his contract to reward inventors proportionally to their contributions; the more frequently a device or component appears in the blockchain in the form of attributions (as they relate to Best Coffee Press), the greater those inventors’ piece of the $5 million pie.
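The difference between an equal split and a proportional one can be sketched in a few lines. The three inventors and their attribution counts below are hypothetical numbers chosen for readability; only the $5 million pool and the 280-way equal split come from the story.

```python
def proportional_shares(pool_usd: float, attribution_counts: dict) -> dict:
    """Split a pool in proportion to how often each inventor's work is attributed."""
    total = sum(attribution_counts.values())
    if total == 0:
        raise ValueError("no attributions to reward")
    return {
        inventor: pool_usd * count / total
        for inventor, count in attribution_counts.items()
    }


POOL_USD = 5_000_000

# Equal split from the story: $5 million divided among 280 inventors.
equal_share = round(POOL_USD / 280, 2)

# Hypothetical attribution counts for three of the inventors.
counts = {"Roy": 5, "Ernie": 3, "Dr. Chang": 2}
shares = proportional_shares(POOL_USD, counts)

print(equal_share)
for inventor, amount in shares.items():
    print(f"{inventor}: ${amount:,.2f}")
```

With these counts, the inventor whose component appears in half of the attributions receives half of the pool, rather than the same amount as everyone else.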

While making his own personal fortune (and bringing wealth to Oakland Standard, teams of attorneys, factory workers, and international governments), Larry also brought wealth to 280 inventors who would not have otherwise contributed to or benefitted from Larry’s success had he operated solely under existing systems of information disclosure, such as the US Patent and Trademark Office. Through his foresight in adopting the blackswan network, Larry was able to create his Best Coffee Press in half the time it took to create his less innovative (and less successful) Better Coffee Press.

One of Larry’s childhood friends, now a famous Professor of Medicine, read a newspaper article about Larry and came to visit his old friend in his cafe. 

Larry greeted his old friend with a warm hug and insisted on brewing the best cup of coffee for him using his latest invention. As they enjoyed what Prof. Grossman admitted was truly the best cup of coffee he had ever tasted and watched people hurrying beyond the cafe’s windows, Prof. Grossman began, “I’ve heard of billionaires who’ve made fortunes building monopolies…but a billionaire who’s made fortunes by dismantling monopolies?”

turkish_coffee.jpg
Larry pouring coffee for his childhood friend

Larry’s face wrinkled with laughter. “Most people think that wealth can be made only at others’ expense,” he answered. “The secret is, the more you give, the more you get. And here I’ve found a way to do just that.”

“From a certain point onward there is no longer any turning back. That is the point that must be reached.” – The Trial (Franz Kafka)