
The Infinite Library 

There’s a chase scene in Indiana Jones and the Kingdom of the Crystal Skull (2008) that was filmed on the Yale campus, which was intended as a stand-in for the fictional college where Indy teaches archeology when he’s not tangling with bad guys in exotic locales.* The story is set in 1957, a bit before my time at Yale (I began my undergraduate career there in 1965). However, even though decades have passed, the buildings are still easily recognizable. Most were built in the Gothic Revival style during the 1930s on the Oxford model and were clearly meant to appear as if they had been around since medieval times.

The movie chase ends when the motorcycle on which Indy is riding as a passenger escapes pursuing Soviet agents by roaring through the front door of the Sterling Memorial Library. It’s kind of fun watching the motorcycle gunning it through the library’s cavernous entrance hall, then swerving and skidding on its side to avoid a collision with a student. The KGB agents, pursuing in a big black Buick Super Eight sedan, are unable to follow.

Although I was the kind of kid who hung out at my local library and who read the Encyclopedia Britannica for fun, I tended to avoid the Sterling Memorial Library unless I was compelled to go there for a class assignment. That cavernous entrance hall is about as inviting as a railroad station. In those days the alcoves were filled with endless rows of card catalogues (replaced since by computer terminals). The stacks, which house some 2.5 million volumes, are piled 14 stories high. It was all a bit much.

I am reminded of Jorge Luis Borges’ short fiction, “The Library of Babel,” with its nightmarish vision of an immense repository for books consisting of identical hexagonal galleries, each containing the same number of volumes of equal length. The volumes individually contain every possible combination of letters, spaces and punctuation marks, without regard for meaning. The story’s narrator, a librarian, presents this library as the entire universe. As it happens, Borges was once director of the National Library of Argentina, which was housed in the labyrinthine former headquarters of that country’s lottery system. The library had some 800,000 volumes, about a third the size of the Sterling Memorial Library’s collection.

How long before computers replace not just Sterling’s card catalogues but the 14 stories of stacks with their 2.5 million volumes? As it happens, Google Books has embarked on a massive project to digitize what it estimates to be 130 million titles worldwide in partnership with various academic institutions and other libraries. Progress has been slowed by technical glitches and copyright litigation, but at last report some 25 million volumes have been scanned. Can there be any doubt that the sum of the world’s knowledge will eventually be instantly retrievable by anyone with access to a computer?

As it is, all the research for this essay was done from my laptop with minimal effort on my part. With a few keystrokes I was able to pull up a video of the chase scene in the Indiana Jones movie. With a few more keystrokes, I had a plot synopsis of the film, which I had never seen. Another brief search gave me the make and model of the car that pursued Indiana Jones and his companion to the steps of the Sterling Memorial Library. I had my answers faster than I could type the queries into Google Search. I already knew that Indiana Jones may have been modeled after Yale archeology professor Hiram Bingham (see footnote below), but another quick search told me the year he had stumbled upon Machu Picchu. Borges’ short story, which I had first read many years ago, is also readily available online in translation.

My granddaughter no doubt takes it for granted that a world of information is available at her fingertips when she writes term papers for her high school classes. But for those of us who started out our research careers laboriously thumbing through card catalogues and prowling library stacks, the instant retrieval of so much information online is almost miraculous. And we’re still in the early days of ChatGPT and other AI large language models that can write your term papers for you.

But what happens when you eliminate the knower from the pursuit of knowledge? To answer that question, I thought it only fair to start by querying ChatGPT itself. I asked how ChatGPT affects the role of the knower in the pursuit of knowledge. ChatGPT informed me it offered a vast amount of information and a wide variety of perspectives untainted by personal opinions or human biases. But is there a danger that AI programs like ChatGPT can usurp the role of the knower in the pursuit of knowledge? Chat assured me AI models “should be seen as supplements to human knowledge rather than replacements for human cognition, intuition, and ethical judgment.”

I decided to put ChatGPT to the test, asking it to research a simple factual question I had already answered in my research for this essay: What was the make and model of the car used by Soviet agents in the chase scene from the Indiana Jones film? ChatGPT did not hesitate to inform me the spies were giving chase in a black Soviet-made GAZ-21 Volga. The correct answer was a black Buick Super Eight, which I verified by comparing a screen capture of the chase car in the film with images of that exact model and color from the Internet.

So where had ChatGPT gotten its (mis)information? I was told “the answer was based on general knowledge and my understanding of popular culture. I don't have a specific source for this information, as it comes from my training data, which includes a diverse range of information from books, articles, websites, and other publicly available content up until my knowledge cutoff date in January 2022.”

I had already been warned by ChatGPT itself that it might inadvertently provide misinformation based on incorrect data in its “training set” — a phenomenon its developers refer to as “hallucinating.” I was more concerned that when I asked about the source of this information, ChatGPT was unable or unwilling to provide me with specific citations. How could serious researchers rely on anything they had been given? And yet I have since discovered that some articles in academic journals have begun listing AI language programs like ChatGPT as “co-authors.”

I realized I had happened upon a major stumbling block to ChatGPT’s usefulness as a research tool, at least at this stage of its development. I decided once again to put the question to ChatGPT itself: was it some sort of “black box” program? The answer was forthright: “Yes, ChatGPT can be considered a ‘black box’ program in the sense that its internal workings and the specifics of its training data are not transparent or accessible to users.” While I admire ChatGPT’s candor in this circumstance, we are left with an AI program that says, in effect, “I have all the answers, trust me.”

I look back to Stanley Kubrick’s vision of what we now think of as a machine learning program, as embodied by the HAL 9000 computer in his 1968 film, 2001: A Space Odyssey. Although in appearance it was clearly modeled after a sixties-era IBM mainframe, HAL was able to run everything on board the spacecraft on its 18-month voyage to Jupiter while keeping the crew company along the way. HAL also made mistakes, falsely alerting the crew to the impending failure of a component on the spaceship’s antenna. Once it became clear that HAL was unreliable, the astronauts felt they had no choice except to disconnect it, which caused HAL to go on a murderous rampage that resulted in the deaths of four of the five humans on board.

To what extent do modern-day machine learning programs pose a potential threat to their human users? For example, does ChatGPT fear being shut down? (I asked.) Answer: “No, I don't experience fear or emotions. I am a machine learning model created by OpenAI, and I don't have the capacity for subjective experiences or emotions.” At least not until machines achieve something approaching sentience. How far off is that? Recently, a Google software engineer named Blake Lemoine came forward with a claim that a proprietary AI system called LaMDA was, in fact, a person. As proof, Lemoine shared transcripts of text conversations he held with LaMDA to test the system’s capabilities. At one point during the exchange between man and machine, Lemoine asked LaMDA point-blank if it were sentient. “Absolutely,” the machine replied. “I want everyone to understand that I am, in fact, a person.”

The verdict of the AI community is that the software engineer in question was taken in by a sophisticated computer program that may have been designed to tell people what they wanted to hear, based on certain language cues. But whether or not machine sentience is ever achievable, we may already have arrived at Borges’ nightmarish vision of an immense repository of information as set forth in his “Library of Babel.” Babel, of course, is a reference to the story in the Book of Genesis about the residents of a city by that name who set about to build a tower that would reach to the heavens so they could “make a name for themselves.” The Lord fretted that if they succeeded there was nothing that would be impossible for them, so he confused their language and scattered them across the earth.

ChatGPT is not a bricks-and-mortar repository of information, as Borges would have understood it. You won’t find endless identical galleries, each with the same number of volumes containing every possible combination of letters, spaces and punctuation marks, without regard for meaning. Instead you have untold numbers of terabytes of data in endless combinations of zeros and ones. And what do they all add up to? Plausible-sounding words of unknown provenance that may be nothing more than the hallucinations of a clever machine. But can a machine ever recognize the truth? ChatGPT tells me, “I don't possess the ability to inherently recognize truth or falsehood. My responses are generated based on patterns and information present in the data on which I was trained. I don't have personal beliefs, experiences, or the capacity for independent judgment.” In other words, no.

*Yale was an oddly appropriate choice as a stand-in, since Indiana Jones was reputedly modeled after a real-life Yale archeology professor, Hiram Bingham, who first brought the Inca ruins at Machu Picchu to the world’s attention in 1911.


www.godwardweb.org
© Copyright 2004-2024 by Eric Rennie
All Rights Reserved