There is a zen to hard work that leaves one too weary to think deeply about anything. I’ve spent the past week working 14-16 hour days, caring for patients and not writing code. It was a typical week on the medical wards. These are days of fasting, days that leave me spiritually satisfied but intellectually starved.
But today I’m rested and at peace with my thoughts for the first time in a long time. A question that has been whirling about in my mind’s undercurrents for years now resurfaces, as it does on days like these, bobbing up and down, restlessly spinning on its way downstream. Cool mist sweeps over San Francisco; my French press stands at the edge of my desk, and the smoky taste of dark coffee lingers on my tongue.
I sift through the bold, curly lines my Uniball pen leaves on my Moleskine’s thick pages in pursuit of meaning. This act is mechanically easier when I write on paper than when I type on my PC, but the search is just as fruitless. I comb through words and brush them off the notebook’s lined pages until I’m staring at a blank page, and I start to make out an image of a cold, overcast day in October 2012. I’m reading Wittgenstein’s Tractatus Logico-Philosophicus in the original German on the balcony of my Berlin apartment. A woman next to me sips her coffee and lights a cigarette. All around us Berlin is perpetually becoming but never being [1]. I blink, and a new skyscraper appears. The young woman puts out her cigarette, and passengers exit a new train station that wasn’t there moments ago.
Wittgenstein’s words overtook me like a hallucinogen, profoundly changing the way I would think and perceive the world thereafter. The zen-like opening line, “Die Welt ist alles, was der Fall ist” (“The world is everything that happens to be the case”), leads into crisp deductive reasoning that uses logic to piece together a Weltbild as sound and beautiful as a diamond. The truth is in logic, I thought, that unadulterated fabric holding the world together: free, unbiased, untainted by language, loyal to no school of thought and no civilization of the Occident or Orient. And so I felt, for the subsequent years, that I had stumbled upon something extraordinary. Pull the fabric here, and this happens. Pull it there, and that will happen.
My faith in logic and love for words led me to the discipline of “natural language processing”, a term I grow to dislike the more experienced I become in the field. Nearly every day for the past year, I have spent a few hours dipping my bucket in the endless ether, collecting data and running calculations. The results themselves are scientifically interesting, but the more data I have, the more removed I feel from that original goal of understanding semantics: to hold the word “coffee”, squeeze it between my fingers, and watch the dark drops of meaning stain my pages, drops whose coarse texture I can feel between my fingers, drops whose bitterness I can taste on my tongue, smear across the page, and say: “Here it is! Here is the meaning of the word!”
Five years into my pursuit of meaning and I catch myself in a free-fall, grasping for “it” but reaching only that logical fabric connecting words with one another. I can hardly even make out the individual words. Dangling from the fabric holding together “cool” and “mist”, my fingers cramp, my muscles ache…I can’t hold on any longer…
…and catch myself on the fabric connecting “sweeps” and the prepositional phrase “over San Francisco.” My fingers slip, and I fall again…This continues, again and again and again, until I begin to wonder, in my exhausted delirium, whether words themselves are entirely devoid of meaning. Does meaning lie in associations between words rather than the words themselves? To know “Omar Metwally” the hacker, the physician, Twitter handle “osmode” is not to know Omar Metwally at all. But to know Omar Metwally, the son of Moustafa Metwally, the husband of Marwa El-Hamidi, the father of Ismail, the neighbor of Evgeniy, is to begin to know him — as a node in a web of inter-relationships, and it is these inter-relationships, I believe, which correspond to “meaning” as we understand it.
I’ve read Kafka’s The Metamorphosis at least a dozen times in English and German [2]. When meaninglessness overwhelmed me, I turned to Kafka’s writing for its rich, layered meaning, each sentence woven to the preceding and succeeding ones by time (the few seconds it takes to read each flowing sentence) and space (their arrangement on the page). In pursuit of meaning, I unravel The Metamorphosis: I split the story into sentences, break their spatial and temporal semantic bonds, and reconnect the sentences based on lexical similarity (that is, how many words they share) [3,4]. Kafka must be rolling in his grave now; forgive me for the sake of this thought experiment.
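For readers who want to try a similar unraveling, here is a minimal Python sketch. It is not the code used for this experiment; it only illustrates the idea of linking sentences by the number of words they share. The function names, the naive sentence splitter, and the tiny `COMMON_WORDS` set (a stand-in for a list of the 500 most common English words) are all assumptions made for illustration.

```python
import re
from itertools import combinations

# Hypothetical stand-in for a list of the 500 most common English words.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in",
                "his", "he", "was", "it", "that", "this"}

def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def content_words(sentence):
    """Return the set of lowercase words in a sentence, minus common words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return {w for w in words if w not in COMMON_WORDS}

def shared_word_edges(sentences, min_shared=1):
    """Link every pair of sentences that shares at least min_shared words.

    Returns (i, j, n_shared) tuples, strongest links first.
    """
    bags = [content_words(s) for s in sentences]
    edges = []
    for i, j in combinations(range(len(sentences)), 2):
        n_shared = len(bags[i] & bags[j])
        if n_shared >= min_shared:
            edges.append((i, j, n_shared))
    return sorted(edges, key=lambda e: -e[2])

if __name__ == "__main__":
    sample = "Gregor woke in his room. The room was cold. His sister played the violin."
    sentences = split_sentences(sample)
    for i, j, n in shared_word_edges(sentences):
        print(n, "|", sentences[i], "<->", sentences[j])
```

The order of the sentences on the page plays no part here: two sentences are neighbors only if they share vocabulary, which is exactly what severs the story’s bonds of time and space.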
What new meaning, if any, does this text now have? Certainly not the literary grace it once carried; gone is the melancholic apartment Gregor Samsa shared with his family until the day he woke to find himself a cockroach. Gone are his angry boss, his family, and his miserable job as a traveling salesman, as Gregor the cockroach observed them from the cold walls of his former home. The above looks more like a story written by a search engine. In place of that sad apartment, which Gregor’s family rented out to strangers to support themselves (now that their son-turned-cockroach had become unable to help the family pay off its debt), is an ugly, urban mess: apartment buildings filled with people who don’t know each other and don’t want to know each other, buildings connected by fiber optic cables, high-speed rails, and crowded streets.
The result is far from meaningless, but it certainly lacks the meaning the story once carried, the meaning that still exists in the crumbling yet very much living pages on my bookshelf.
My attention wanders across the bookshelf, to a 3-ring notebook from my first-year linguistics seminar on discourse, which I had the privilege of attending with Professor John Swales, a pioneer of the field (and one of the most cultured Englishmen I have ever met). My semester project was “An interdisciplinary examination of textbook interactivity,” in which I analyzed the grammars of history, calculus, and chemistry textbooks to understand how grammatical structure correlates with a textbook’s perceived interactivity. I smile at the memory of spending Thanksgiving break during my first college semester at the University of Michigan circling second-person pronouns and manually counting words in textbooks. If I were to repeat the project in 2016 rather than 2003, I would probably write a Python script to do the task in a few seconds.
But there was something romantic about holing myself up in my apartment, watching that winter’s first snowfall, and circling words, as there was about the “natural language processing” that Professor Swales pioneered. He would cringe if he ever heard me describe his work as NLP, and in fact, his work on discourse is too artistic and not quantitative enough to be called NLP [5]. Yet it is precisely because he is neither a machine learning practitioner nor a computer scientist that his work is so far-reaching in the linguistics community. He is the proverbial Englishman at the polo club who has traveled so much that his ears can recognize any Arabic dialect and poke fun at linguistic nuances that go over most of our heads.
My search for meaning continues somewhere between the statistical methods currently in vogue and Professor Swales’ softer, almost literary approach. Quantifying linguistics helps us identify patterns and test hypotheses, but sacrificing art on the altar of computation, as I hope this essay illustrates, can devolve into meaninglessness if we are not careful.
[1] These are Schopenhauer’s words. He described the perpetually becoming but never being world (“die immer werdende aber nie seiende Welt”) in his Die Welt als Wille und Vorstellung.
[2] The English version of The Metamorphosis used for this experiment is from Project Gutenberg.
[3] The 500 most common English words (such as “this”, “and”, “a”, “the”, …) are excluded here.
[4] Email me for my code. I’m happy to share it.
[5] NLP and computational linguistics are different but overlapping fields, and linguistics itself is a very broad discipline. I will not get into that here but simply acknowledge these facts.