When ChatGPT was released in fall 2022, it brought the concept of generative artificial intelligence (AI) and large language models (LLMs) to the mainstream with a bang. But for Rebecca Hwa, now professor and department chair of computer science at the George Washington University School of Engineering and Applied Science, the discourse was finally catching up with her interests. Hwa has been researching natural language processing (NLP), the mechanism by which computer programs interpret and replicate human language, for decades.
“I have been interested for a long time in the question of how we make computers understand the way people talk—the ability to communicate to a computer as if it were another person, which used to take a bit of imagination because it sounded a little bit sci-fi,” Hwa said with a laugh. “But these days everybody's doing it.”
When Hwa took programming classes in high school in the late 1980s, computer literacy was extremely niche. “You had to be really interested,” she said. “It's kind of like joining the band or something.” At her undergraduate institution, where “computer science” didn’t yet exist as a major, Hwa majored in computer engineering. But her interests always tended toward the linguistic side of the field, dealing with programming and algorithms, rather than the mechanical engineering aspects. As an analytical thinker, she said, “the logical thinking, one step after another, really appealed to me.”
And, Hwa said, learning computer programming was an excuse to learn another language. She speaks multiple (human) languages, including Chinese—her first language—and French. Syntax and semantics fascinated her; she remembers being interested in the sentence diagrams she learned in high school English, which laid out a key for how language encodes information.
“In a way, learning a programming language is much easier because it's very universal,” Hwa said. “Once you understand the principle, it's not hard to learn a new programming language, whereas it's really hard to learn a new foreign language in the human world.”
In graduate school, Hwa’s research interests evolved toward the questions that would make her an expert in large language models long before most people had ever heard of them. She was interested in questions of compositionality—how everything from a computer program to a piece of music is readable not only as a whole, but as a string or assemblage of smaller sequences structured in rational ways. Certain of these smaller phrases, motifs or subprograms are predictably found together; others never are. With a large enough data set, these patterns of assembly become clear.
“Underneath the sequence that we actually see is all of this structure that we don't see,” Hwa said. The question of interest then becomes whether the researcher can parse out these invisible elements of the whole and use them to predict what a similar program, story or piece of music would look like.
This is what LLMs are designed to achieve—output that statistically predicts how words and phrases tend to be associated in the creation of meaning. While it’s tempting to write that such programs, widely referred to as artificial intelligence or AI, “understand” the data they are fed or the content they produce, Hwa clarifies that what they actually do is “behave in a way that exhibits intelligence to a human audience.” That’s an important distinction to keep in mind as AI becomes a topic of mainstream discussion.
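The statistical idea behind that distinction can be shown with a toy model far simpler than any LLM. The sketch below is purely illustrative (it is not Hwa's work, and real models use vastly larger networks and datasets): it counts which word most often follows another in a tiny corpus, then "predicts" the next word from those counts alone, with no understanding involved.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus; a real model trains on billions of words.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word (a bigram model).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Return the word that most frequently follows `word` in the corpus."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

# "cat" follows "the" more often than any other word in this corpus.
print(predict_next("the"))
```

The program "behaves in a way that exhibits intelligence" only in the narrow sense Hwa describes: its output reflects patterns of association in the data, not any grasp of what a cat or a mat is.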
“When people say ‘AI,’ sometimes they just mean ‘ChatGPT,’” Hwa said. “But in my mind it's just a language model which actually has no particular intelligence, per se. So that's probably, for me, the biggest misconception.”
Some of the problems that have arisen with generative AI have been predictable for years, Hwa said: “It’s not surprising that people would use it to finish their homework.” But others, like thorny questions of intellectual property, have developed in complex ways that may need whole new fields of study to address.
Still, Hwa is excited about the potential of these programs, the avenues of research they open and the swiftness with which they’ve evolved. “A lot of the problems that I thought would take my lifetime and many people's lifetimes to achieve have suddenly improved and are now kind of doable,” she said.
As an example, Hwa cited “coreference resolution,” a linguistic concept that has to do with reconciling all the different ways a text could refer to the same subject. If a paper uses the word “he” in one paragraph to refer to a person named in a previous paragraph, for instance, that pronoun’s reference might be clear to a human reader, but researchers thought it would be a stumbling block for computers.
“That was considered difficult because you need to understand essentially everything that was discussed surrounding the relevant people until you see this word ‘he,’” Hwa said. “But for whatever it’s worth, language models do pretty well with that. If you give them enough context, they tend to get it right.”
With these programs’ capabilities expanding rapidly as successive models are trained to perform better and better, Hwa said, new questions will continually arise. But she hopes the potential improvements to human life will outweigh the challenges the technology poses.
“NLP is exciting because before we may have had a perception that the things we could work on were limited due to technical limitations,” Hwa said. “But now the technical limitation is arguably kind of gone. So we can go way, way further.”