Using AI to Study Vaccine-Hesitant Language: Interview with Dr. Dennis Tenen

Your AI Injection

Is AI an ally or adversary? Get Your AI Injection and learn how to transform your business by responsibly injecting artificial intelligence into your projects. Our host Deep Dhillon, long-term AI practitioner and founder of Xyonix.com, interviews successful AI practitioners and domain experts to better understand how AI is affecting the world. AI has been described as a morally agnostic tool that can be used to make the world better, or harm it irrevocably. Join us as we discuss the ethics of AI, including both its astounding promise and sizable societal challenges. We dig in deep and discuss state-of-the-art techniques with a particular focus on how these capabilities are used to transform organizations, making them more efficient, impactful, and successful. Need help injecting AI into your business? Reach out to us at www.xyonix.com.

Using AI to Study Vaccine-Hesitant Language: Interview with Dr. Dennis Tenen

August 20, 2021 • Season 1 • Episode 8

This week we had the chance to talk with Dr. Dennis Tenen, an associate professor of English and Comparative Lit at Columbia University. Dennis has been studying vaccine-hesitant language online, a project that is leveraging AI to better understand the various social and cultural explanations for vaccine-hesitancy.

We chatted with Dennis about his unique background in the intersection of AI and literature, and about the machine learning that is helping bring his research to life.

To learn more about Dennis' work, check out:

https://denten.plaintext.in/

Automated Transcript

Deep: Hi there, I'm Deep Dhillon. Welcome to the Your AI Injection podcast, where we discuss state-of-the-art techniques in artificial intelligence with a focus on how these capabilities are used to transform organizations, making them more efficient, impactful, and successful.

Welcome to this week's episode of Your AI Injection. This week, we're joined by Dr. Dennis Tenen, associate professor of English and comparative lit at Columbia, with a unique background teaching classes like Literature in the Age of AI. He's the author of the book Plain Text: The Poetics of Computation, and has been studying vaccine-hesitant language online, a project that's leveraging AI to better understand anti-vaxxers by making a data-backed map of their myriad positions.

Okay, great to have you on the podcast. We're going to spend most of the time talking about the anti-vax situation, of course, but this project started before the pandemic, as I understand it. So maybe tell us a little bit about the motivation behind the project. How did you get the idea, how did it first develop, and how has the reality of the actual pandemic affected things?

Dr. Tenen: All right. Well, thanks for having me on. Yeah, so I started working with Rishi Goyal, who is a professor of literature but also a practicing emergency room physician, about a year before COVID, so several years ago now. When we met, we initially spoke about measles, because at the time the United States was in danger of losing its measles protection status, which to us seemed completely insane. There was something in the national discourse that was reversing decades of measles vaccination. The word we prefer is vaccine hesitant rather than anti-vax; that's the more expansive category, I would say. So, vaccine hesitancy around measles. When we started looking at online conversation on measles, what we saw right away is the extent to which it's not necessarily the facts that are driving the conversation, but all sorts of other kinds of rhetoric. So, for example, metaphorical rhetoric, and there was affect and emotion and all sorts of things that feed into someone's belief and therefore their decision to vaccinate themselves or their children. And that's what was fascinating to us: this is usually packaged or discussed as a conversation about facts, right? Like, I'm just going to give someone more facts and they'll go, oh, right, I didn't know this stuff, and therefore... And it's not so simple. We wanted to really untangle the conversation around vaccination to find other things that are extra-factual, right?

Deep: Like a layer of extra factual sort of linguistic stuff.

Dr. Tenen: Yes.

Deep: And then COVID actually happens and you have a much bigger problem than the measles. Was there a pivot or something, or did the template sort of still apply?

Dr. Tenen: No. I mean, I guess it's not really a pivot, but when COVID happened, I think for us, just being scholars, there was that feeling of, how can we contribute to this question? And I think what was really gratifying is that when we started talking about vaccine hesitancy, there was not a vaccine yet. So we applied for this presidential grant from Columbia World Projects, from Lee Bollinger's office. And at the time all the feedback we got was like, why are you thinking about this? There's no vaccine and it's not coming for maybe two, five years. It's not a problem. But already at the time we were thinking, well, no, this is not a new thing. There are already people gearing up, and they're thinking about vaccination in all these interesting ways. If you remember, this was still during Trump's presidency, and early on a lot of the vaccine hesitancy came from the left. Right? So a lot of the conversation was: if there is going to be vaccination under Trump, it's going to be rushed, it's going to be unscientific, it's going to be used for political means. And then it flipped, kind of during the election, it flipped to...

Deep: Similar arguments, but applied on the other side. Yeah.

Dr. Tenen: Right, right, right. So yeah, I wouldn't say it's a pivot, but it was definitely an opportunity. We wanted to do something where we could contribute to this problem. And there were interesting conversation partners that we had. So for example, early on we spoke with, and we're still in touch with, a team at Merck Pharmaceuticals, who at the time were one of the developers of a vaccine. It was a team that included data scientists and public health people, not officials but public health researchers, who were early on thinking: okay, if we have a vaccine, how do we distribute it, and what potential problems in distribution, including these ideological problems, could happen? And they were a good sounding board for our early thinking about this project.

Deep: So walk us through, what's the high-level thesis that you're working with? As I understand it, there's data, so maybe walk us through what that is and how you got there. And then there's this idea that we've talked about of mapping out the vaccine hesitancy communities, if you will, and perspectives. How is that different than maybe the simplified narrative that we're hearing in the news? And then once you've made that map, what happens with it?

Dr. Tenen: Right. So first of all, let me also tell you a little bit more about where we're coming from. Like, what is the mental model of vaccine hesitancy here? One of our influences that your listeners might be interested in was Susan Sontag. I'm not sure if you're familiar with Sontag's work, but she was a literary scholar still active in the eighties. She wrote this great piece called Illness as Metaphor, which later came out as a book. She mostly spoke about tuberculosis and AIDS, actually; at the time there was just this moment of thinking about the AIDS crisis. And her point, and we were influenced by this, is to say that the way we speak about tuberculosis, the way we speak about AIDS, ends up structuring the way we treat it. Right? A good common example: if you declare a war on drugs, you end up, like, shooting drugs, putting them in jail. So it's that metaphor of war. A bunch of things about war get transferred into the domain of medicine and the domain of addiction treatment. Right? And so Illness as Metaphor was saying that we need policy reconfiguration, but that to reconfigure policy we first need to unfold this metaphorical logic, and I think she did a good job in looking at that metaphorical logic. So similarly, our first question was: where is the discourse on vaccination happening? And there are two parts to this. One part is public health. It's the official discourse that's top down, right? It's coming from the public health offices, and it's a known thing. That's what you hear from official sources. But then we were doing the field work, initially on online bulletin boards, online chat forums.
So these are old school, you know, BBS kinds of things, which are sometimes really active. There are all sorts of parenting forums, for fathers, for moms, natural healing for children, with just thousands of messages where people in an unedited way are discussing their feelings about vaccination. And then from there, we started also looking at YouTube. YouTube comments are an amazing source of natural discourse about medicine. Facebook groups, and increasingly today, Telegram groups. What we saw with the filtering of COVID-related content on Facebook, on Twitter, and on other platforms is that a lot of people went to Telegram as a place which is not in any way filtered. And other places like Parler, for example; that was a big move, where a bunch of people left Reddit and went to Parler.

Deep: In response to the crackdown by Twitter and Facebook and the quote unquote proper social media companies. There's just been a large political backlash against them allowing a lot of disinformation, and so now they've been trying to take steps. And so those who love disinformation find new places to go.

Dr. Tenen: Exactly. Yeah. But, you know, this is a weird thing, and I wrote about this elsewhere: asking these large multinational corporations to do the right thing is, I think, a paradoxical political impulse, because ethics and corporate governance don't necessarily always align. What's good for our society and what's good for a corporate interest may not be aligned, and they have to fulfill the logic of their business.

Deep: Yeah. I mean, if you look at a Facebook or Twitter, they're in the business of monetizing engagement. And if you're monetizing engagement, then your two levers are: get more engagement, or get more money per engagement, both of which lead to encouraging more conversation. The more primitive parts of people's brains are just naturally more attracted to fear, problematic stuff, screaming and yelling, all kinds of things that in the large sense, quote, make the world a bad place. And they're generally not as attracted to things that in the large sense are factual, credible, all that. So, you know, Zuckerberg, these guys at Facebook, they certainly pay lip service to trying to crack down on the problem. But we know that's just not how they're incentivized according to their current business models. They make more money by slinging more ads, and they can sling more ads when there are more people talking.

Dr. Tenen: Yeah. And I think a little bit of the public's confusion comes from the fact that we treat these spaces as public space. You know, we publicly discuss things on YouTube, but YouTube is not truly public space. Right? And so the things that we can reasonably expect in public space don't necessarily hold true. But anyway, that's a different point. The point here is that these are places where you can have a glimpse into actual conversation about vaccine hesitancy. There's so much projection happening, and people have a lot of expectations of what vaccine hesitancy discourse may look like, or what the loudest voices sound like on those platforms. But as researchers, my initial impulse is to say, okay, that may be true, but is it actually true? Can we go out and ask, one, where is this conversation happening, and two, what is the texture of that conversation? Right? And so we began initially just by observing, just by being there, looking at specific sites of discourse. And things changed rapidly in the year from when the vaccine came out and then the following year. So last year, as you mentioned, the sites changed, the conversation migrated, and there were new political movements and new sites of conversation that emerged very quickly. And so at first we were observing this by hand, right? It's fieldwork, it's really talking to people. I've gone to a few anti-vaccine protests.

Deep: I was just in Iceland, actually, yesterday, and I went to an anti-vaxxer thing just out of curiosity. They're not only in the U.S.

Dr. Tenen: That's the thing: the movements we are monitoring are holding giant protests in London, in various places in Australia. Yeah. And this is the part where you're really trying to understand what that movement is all about, and coming to it with a blank slate, without presuppositions. Right? I think, as a good researcher, I want to just come and say, okay, what does the data actually tell me? What are the key motivators behind this rapidly growing movement? And so a lot of this field work is the thing which primes our intuitions about the data-driven work. Once you begin to collect these snippets of conversation, these samples of conversation, in the millions, and we have millions of examples of individual statements about the vaccine, we can then use those initial handcrafted insights and intuitions to begin to think about, okay, how do we take this big thing, this multitude of voices, and what patterns do we see in this data? And that's the approach to this question: to take something as complex as vaccine hesitancy, something that to many people looks like one thing, just right-wingers doing right-wing disinformation. But when you start doing field work, when you start collecting data, you see that in fact it's many things, not one thing. Right? And what we're seeing is that there are very different places and different rhetorics and different logics that are unfolding and coming together in an interesting way.

Deep: You can say different personas, if you will. That was one of the things that struck me. I think it was yesterday, or a recent New York Times article, where they went through and were talking about the quote reasons for an anti-vax position. But one of the things that struck me about your approach, or what you're after, or what you're finding, is that there are personas. There's the Orthodox Jewish community that has its set of concerns and perspectives, and that's maybe very distinct from, you know, the left. There's a political lens, then there are cultural lenses. You see in the UK, for example, that within some of the immigrant communities there's a real concern that this is, quote, a plot to track us and take us out of the country or whatever.

Dr. Tenen: Yeah. Yeah, exactly. So an immigrant or migrant worker who's worried about just being documented and getting on the radar of some official, that's a very different concern than somebody who is talking about their, whatever, First, Second, or other Amendment rights, versus somebody who is of Jewish faith, or a Muslim, who's worried about the presence of halal or kosher ingredients, which is actually necessitated by the logic of their practice. Coming from a more historical humanities background, we're well aware that with hesitancy in the Indigenous American community, well, there's a history there that again is completely not related to the new right. And those are all very different, but superficially they can look the same.

Deep: You're listening to Your AI Injection, brought to you by xyonix.com. That's x-y-o-n-i-x.com. Check out our website for more content, or if you need help injecting AI into your organization.

Let me see if I can characterize what you're doing. So let's capture the actual live social media content, pull it in, throw it on a timeline. So we have statements happening over time, and then let's study it. We can do a lot of stuff from a machine learning vantage, looking at terminology drift over time. We can do some unstructured analysis; we can dig into that in a second. But once you do that, your goal is to make a map, if you will, of all these different personas slash positions, vantages that are present. And then there's this missing piece that I'm presuming you want to go after, and our listeners are kind of waiting to hear: what do you do with that? Once you've got this persona map, you can go to your public health agencies, perhaps, and start having really targeted messaging to these communities via these social media platforms, taking out ads, or whatever. Is that the general plan or approach?

Dr. Tenen: Yes, no, that's exactly right. So when you have that map, and I think persona is a good one, and here's another metaphor, having a map of vaccine hesitancy, right? You can tell that there are different clusters or different neighborhoods that are driven by different concerns, and linguistically they're driven by different sorts of key words and key concepts, right? So once you have that map, we're then working with CDC officials, and increasingly vaccination campaigns look like modern advertising campaigns. They're using Google AdSense, they're using Facebook, all the kinds of tools that allow you to really layer your public and target the messages. And our approach is to say, rather than present one message to everybody, that message usually being, please get vaccinated, it's actually safe, which is the projection, the projection being that the person doesn't know that it's safe, we can really target the Orthodox Jewish community, right? Or somebody who's searching, is the vaccine...

Deep: Only the things they care about, targeted for their kind of persona. I mean, I love the persona-based approach, because if you think about it from a marketing or advertising vantage, that's exactly what you have to do. You have to know who you're talking to. A lot of companies, even when they're designing products, do this thing where you take pictures of people and the designers will have them posted all over. And they've got a particular one, like a thirty-year-old young woman, and she has a name, Amy or whatever. Understanding that person in a lot of depth gives you much more ability to target them when you actually create, you know, your video ad or whatever your advertising vehicle might be.

Dr. Tenen: Yeah. I mean, I think for each one of these neighborhoods, we can talk about a cluster of concepts that really seems to drive that type of hesitancy. Right? So religious hesitancy is being driven by a very different conceptual cluster than political hesitancy, than hesitancy around natural parenting, or than hesitancy that, in your example, is people worried about their documentation status. Right? And so by knowing that conceptual cluster, we can then target the message and really try to address the deep reasons behind the hesitancy itself.

Deep: So let's maybe go back a little bit and get into a little bit more of the grit. Walk us through exactly what the data is. How do you get it? How do you decide which sources to go after? Are you going after a random sample, or are you actively hunting for anti-vaxxer, or sorry, vaccine hesitancy content? Because inevitably that's going to... tell me about that first, what the data is.

Dr. Tenen: Yeah. So we divided our data collection effort into two distinct teams, or two parts. One is the biased corpus, or biased data sample, and then there is an unbiased data sample. The biased data sample is active groups that we've been monitoring by hand, and they describe themselves as vaccine hesitant or anti-vax in some way. So they're labeled in a sense, right? They're self-labeled as anti-vaxxers. There are probably a couple of dozen of these super active sites. There were Facebook groups, there were bulletin boards. So that's one path. And just as a matter of methodology, I view everything as a sample; you can never get everything of something. Right? And it feels like a significant sample of vaccine-hesitant conversation online.

Deep: But the key here is that they might not have, non-vaccine hesitant people in the conversations. Like these could be echo chambers also.

Dr. Tenen: They're echo chambers, yeah. And that's the problem with the biased approach. So for example, we're actually seeing this: it just so happens that there is an over-representation of groups from Canada. So a lot of the really top driving concepts have to do with... we had to look them up, like, who is this person? Oh, a Canadian politician. So this is a problem not with the vaccine discourse; it's a problem in our sample. You have to really understand the sample biases. But that's why it's biased, right? It's both the positive and the negative of the biased sample. The nice part is that we know for sure that it is people who, in the metadata of forming their group, told us already that they're going to be discussing vaccine hesitancy.

Deep: So you have those two teams. One of them is going after the biased data, and the other one is going after the random sample, or the non-biased approach.

Dr. Tenen: Yes. What does that mean? The non-biased approach is just to take any conversation on YouTube, for example, that has the keyword vaccine, or any sort of cognate of vaccination, and then just pull that in. The idea here is that you're casting the net wide and you're going to catch vaccine discourse that has to do with vaccine hesitancy. But you're also, for example, going to catch a bunch of news stories and maybe actual videos from the various centers for disease control. Right? So it's just catching a bit of everything, anything that has the word vaccine. And YouTube is a rich ground, because you can imagine, whatever, a Fox News or a CNN story on vaccination, and then a bunch of people jumping into the comments. Some of them are saying, please get your children vaccinated, or, here's a phone number. And then other people are saying, this is a conspiracy. So we're getting much more of a hodgepodge of language there, which is the unbiased approach. And I would say that constitutes a significant sample of just rhetoric around vaccination, period, not necessarily vaccine-hesitant rhetoric. And I do think that by putting those two datasets together, by pitting them against each other, we can begin, in a descriptive way, to start talking about, okay, what is the difference between people talking about vaccination in a negative way versus in a more neutral way, and so on.
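The two-sample idea described above can be sketched in a few lines of Python. This is a minimal illustration and not the project's actual pipeline: the regular expression, the function names, and the toy posts are all assumptions made for the example. It keeps any text that mentions "vaccine" or a cognate, then ranks words by how much more often they appear in the self-labeled (biased) sample than in the wide-net (unbiased) sample, using add-one smoothing so unseen words do not divide by zero.

```python
import re
from collections import Counter

# Hypothetical keyword pattern: "vaccine", "vaccination", "vaccinated",
# "vax", "anti-vaxxer", and similar cognates.
VACCINE_PATTERN = re.compile(r"\b(?:anti-?)?(?:vaccin|vax)\w*", re.IGNORECASE)


def mentions_vaccine(text):
    """Wide-net filter: keep any text containing a vaccine cognate."""
    return VACCINE_PATTERN.search(text) is not None


def word_counts(texts):
    """Count lowercase word tokens across a list of texts."""
    counts = Counter(
        word for text in texts for word in re.findall(r"[a-z']+", text.lower())
    )
    return counts, sum(counts.values())


def overrepresented(biased, unbiased, top_n=5):
    """Words whose smoothed rate is highest in `biased` relative to `unbiased`."""
    b_counts, b_total = word_counts(biased)
    u_counts, u_total = word_counts(unbiased)
    vocab = set(b_counts) | set(u_counts)
    size = len(vocab)
    ratios = {}
    for word in vocab:
        b_rate = (b_counts[word] + 1) / (b_total + size)  # add-one smoothing
        u_rate = (u_counts[word] + 1) / (u_total + size)
        ratios[word] = b_rate / u_rate
    # Sort by ratio (descending), breaking ties alphabetically.
    return sorted(ratios, key=lambda w: (-ratios[w], w))[:top_n]


# Toy, made-up posts standing in for the two samples.
biased_sample = ["freedom freedom my choice", "vaccine freedom"]
unbiased_sample = ["vaccine clinic hours", "vaccine appointment phone number"]
print(overrepresented(biased_sample, unbiased_sample, top_n=3))
```

A simple rate ratio like this is only one of several ways to contrast two corpora; it is used here because it is easy to read, not because the source names it.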

Deep: So let's talk a little bit about the analysis then. Now you've got this dataset; what does your analysis look like? Maybe let's even talk about it from a community vantage. Who's doing the analysis? Are they grad students of yours? Are you making the dataset public, so other people are able to do analysis as well? And then what kinds of analyses or classes of analysis are you seeing happening there? There's a temporal lens that one could look at this through. There's the persona lens, or even just direct individuals, where there are a number of studies that are tracking back a lot of the anti-vax sources to a handful or two of folks. And then there's the content-only approach. And even within the content-only approach, there's stuff that you could do that's very grammatical in nature, where you're trying to go after who's saying what about whom, or what, versus keyword based, topic based. So, yeah.

Dr. Tenen: Well, yeah. So for our methodology, I've always preferred to begin in an exploratory way. Right? So just a simple question. You have a bag of, like, millions of words. Once you remove the common stop words, you start looking at what the most common marked words are in this dataset. Right? And that already tells you something; again, it primes your intuition. You're like, oh, the word freedom comes up a lot. Of course, a lot of the top words are what you expect, right? Like side effect or something. But you start getting things you expect and also some things that you don't expect. Let's call these the drivers, the engines of discourse, right? These really marked linguistic markers. So I think the initial analytical step is exploratory. Another thing that we've done early on is unsupervised topic modeling. And unsupervised topic modeling is fascinating because it's a taxonomy, right? It's a way to divide up the world, but without human intervention. The computer just puts things in piles. You can say how many piles, how many buckets: give me 12 buckets. And it puts things in 12 buckets that are very different, but for different reasons. So some of the buckets are, like, whenever people share a link. That's a very particular sort of discursive act, right? Like, yeah, check out... So the words check out would be very high. And so when you look at this kind of alien computer way of looking at the world, with some of this stuff you're like, oh yeah, I get it. That cluster is just people sharing things.
But then again, for example, it does very well in separating American from Canadian politics because of the markers, right? It's specific names of local governors that are separate. So this is all exploratory stuff that then layers on top of our intuitions that came through months and years of observing the communities, where we are already expecting these clusters. We're expecting natural healing, for example, to be a topic of conversation. It's alternative medicine, as opposed to whatever other types of medicine. Or parenting in general, being a subset of the conversation that has to do with children. So we're trying to look for patterns in the data in an exploratory way, and then layer that with the things that we also observed in more of a field work, ethnographic way. So that's the first step. And then the second step is to actually begin to organize those intuitions into a formal taxonomy, into a formal set of categories, to say, okay, what we are seeing here really is religious hesitancy as distinct from political hesitancy, as distinct from undocumented immigrant hesitancy. Those categories begin to emerge, and then we are labeling samples. We're using graduate students and other faculty researchers. We have kind of an interesting team. There are several librarians who do data science, and there are undergraduates who are super interested. We have somebody who's really interested in Black American hesitancy. In New York, the largest hesitant group is young Black males. And that also doesn't fit into this narrative of right-wing propaganda, right-wing misinformation. Yeah.
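The first exploratory step described above, removing stop words and looking at the most frequent remaining words, might look roughly like the following Python sketch. The stop-word list is deliberately tiny and the posts are made up; both are assumptions for illustration, and a real analysis would use a much fuller list and millions of statements.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption); real lists are far longer.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "is", "it", "my", "me",
    "i", "in", "this", "that", "for", "about", "are", "we",
}


def top_words(posts, n=3):
    """Return the n most frequent non-stop-word tokens with their counts."""
    counts = Counter()
    for post in posts:
        for word in re.findall(r"[a-z']+", post.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up example posts.
posts = [
    "It is about my freedom to choose",
    "Freedom and the side effects worry me",
    "Check out this link about side effects",
]
print(top_words(posts))
```

Raw counts like these only prime intuition, as the interview puts it; the topic-modeling step that sorts text into a chosen number of buckets is a separate technique and is not shown here.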

Deep: I mean, the media likes to... I hate saying the media likes to, I shouldn't say that, but there tends to be a reductionism that happens in the kind of formal discussion that most of us consume via media. And that tends to be this political lens. A lot of times this is because of what's going to move the news and bring in the most clicks. Right now it's hot to have political lenses, but you're talking about other lenses: there's this historical cultural lens, and then there's this religious lens, maybe, you know? Yeah.

Dr. Tenen: Yeah. So the lens, and note again that these metaphors that we're using, lens, persona, cluster, right, or topic, all of those are an attempt to say, okay, the thing that you think is one monolithic thing actually contains, right, contains neighborhoods, clusters, all of them. And these metaphors translate into quantitative methods. So neighborhood detection or topic modeling, those are all ways to try to get at that intuition, right? That it's not one big neighborhood, it's multiple neighborhoods. And they are occupied by different personas. And here are the key things that drive their particular version of this one thing we call vaccine hesitancy. Right. And more or less that's the approach. But again, I really think that when we classify things, right, it's like yes or no, or some percent, like an 80% chance that this is anti-vaccine, right? Like it's all categories, but then the conversation is much more fluid than that. Right? Like the conversation, even the neutral conversation, is sort of all over the place. So for example, how do you tease out the kind of historical hesitancy that we spoke about, right, among the indigenous, that has to do with history, the history of colonialism and the usage of vaccination for nonmedical, for political and for war purposes? How do you tease that out from the noise of political hesitancy, right? In terms of a sample, that might really be drowned out just by virtue of the kind of people that tend to participate. So that's a specific problem. But what is the problem? It's like, I see that category. I know its history, right? I've met people, I've read their reasoning. Right. But then I'm saying, okay, how does that get the glory? Can I pick up that signal in this giant data set? Is that signal there? How do I best hone that signal to actually isolate that particular logic? And that's where classification, I mean, it's like the art of classification, right? That's where you have to be kind of clever about really trying to get down to the things that matter to you as a researcher.
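
To make the idea of "some percent chance" concrete, here is a minimal sketch of a text classifier that returns a probability for each hesitancy category. It is an assumption for illustration, not the team's method: it uses a multinomial naive Bayes model with add-one smoothing, the category names echo the ones mentioned in the interview, and the labeled samples are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of samples
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict_proba(self, text):
        """Return {label: probability}; words never seen in training are ignored."""
        total_docs = sum(self.doc_counts.values())
        words = [w for w in tokenize(text) if w in self.vocab]
        log_scores = {}
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalize in log space so the probabilities sum to one.
        top = max(log_scores.values())
        exps = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exps.values())
        return {label: value / norm for label, value in exps.items()}


# Hypothetical labeled samples, invented for illustration only.
labeled = [
    ("the governor cannot mandate this it is my freedom", "political"),
    ("my body my choice no government mandate", "political"),
    ("our faith teaches us to trust god for healing", "religious"),
    ("our community remembers how medicine was used against us in the past", "historical"),
]

model = NaiveBayes().fit(labeled)
probs = model.predict_proba("government mandate takes away my freedom")
for label, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{label}: {p:.2f}")
```

The class prior in this sketch shows one way a small category can be drowned out: with few labeled samples, a rare category starts at a disadvantage before any words are considered, which is why sampling and labeling choices matter as much as the model.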

Deep: Perhaps you're not sure whether AI can really transform your business. Maybe you don't know what it means to inject AI into your business. Maybe you need some help actually building models. Check us out at xyonix.com. That's x-y-o-n-i-x.com. Maybe we can help.

So I know it's still kind of early days as far as actually doing the analysis on the data for you and your team, but have you found anything so far that's maybe unexpected or particularly interesting?

Dr. Tenen: Yeah. So, you know, one of the biggest surprises of late has been this, I mean, one way to say it is it's appropriation of language. So in the mainstream, sort of very loud political hesitancy, we hear slogans and rhetoric that have to do with things like "my body, my choice." "My body, my choice" is a slogan from, you know, the Me Too movement, it's from reproductive rights.

Deep: Yeah. From the left, pro-choice, et cetera.

Dr. Tenen: Right. The pro-choice. And really, and we see people repeat this all the time, it's that we are not anti, so this is like a direct quote: "We're not anti-vaxxers, we're pro-choice." Yeah. So "we're not anti children, we're pro-choice."

Deep: The appropriation technique, that's such an interesting thing, because it happens on both sides. Like both sides will appropriate language from the other. I don't know if that's sort of the process of finding maybe the point of least friction that gets through to the other side in a way or something. I don't know what it is, but yeah.

Dr. Tenen: But look, there are at least two ways of thinking about it. One way of thinking about it is that it's completely cynical appropriation, right? Like, okay, it's just people, they know it was effective. So they're using rhetoric that succeeded, and that maybe they themselves hate. It could be like that. But from another perspective, okay, so this is where, let's look at the keywords: choice, right, freedom. There is, you know, my choice, freedom, right, my right to choose, my freedom to choose. So choice, freedom, right, that cluster of ideas. That's a very particular, like, you can right away recognize a particular political theory behind that. Right. It comes from a very Enlightenment-era, you know, Hobbes and Locke and Mill. Right. It's thinking of freedoms and rights and choice, and personal choice. And it's less so, for example, a rhetoric driven by community responsibility or something like that. Right? Though when we say public health, it's the public. So this is all about individual choice. And so now, as someone, like, I have to teach political theory at Columbia as well, so then right away I'm thinking, oh, wow, that's very interesting. Is this an American thing? Is this an English-language, kind of Scottish Enlightenment thing, to really consider your relationship to children or to vaccines or to public health through the lens of personal choice? What I'm saying is there is a non-cynical, actual question there that is super complex and doesn't fall neatly into, you know, political parties. Right. So it's weird. I mean, there's just weirdness there that I'd like to explore just by thinking about it, then by writing about it.

Deep: So this isn't maybe directly along the vaccination hesitancy conversation topic, but I have to ask you this. You have a very unique background for somebody who's doing text mining and natural language understanding. And for our listeners' benefit, I want to ask you about this, but first, one of Dennis's courses here, I'm just going to read a little bit of your course description, because I find your perspective very unique, for somebody who came up through the machine learning world. "In this course, we will consider the long history of literature composed with, for, and by machines. Our reading list will start with Ramon Llull, 13th-century combinatorial mystic, Leibniz, Bacon, Swift. We'll read plot robots instrumental to the writing of Hollywood scripts and pulp fiction in the twenties, avant-garde poetry of Dada, computer-generated love letters written by..." I mean, this is fascinating to me. This must be one of the most popular courses at Columbia, at least for a geek like me who's into natural language. How does this work? Like how does somebody coming up from the comparative lit world wind up doing this work? How does that happen?

Dr. Tenen: I mean, a lot of it is an accident of history. So my undergraduate degree was in political theory and in comparative literature. So I was one of these continental philosophy nerds, but because I didn't have a lot of money, I put myself through school by working, and I was working as a software engineer early on. So when I graduated as an undergrad, I just happened to have, you know, almost five to eight years of experience in a very weird niche: I was designing sites for mobile devices early on, like for Nokia phones, you know? So it was software engineering of a very weird ilk that then turned out to be quite important. And one of my first jobs out of college was to work for Microsoft. So I started out at WebTV, which was, yup, internet for your TV. And Microsoft bought them and we worked on Windows XP Media Center. So I ended up in the OS. So I'm really a software engineer and not so much a computer scientist, just professionally. But then when I went back to school for graduate work, I just went back to school. I was like, I want to go back to reading philosophy and literature. And I really didn't think the two worlds aligned in any way. So I wasn't very smart about it, I have to tell you. But then when I did go to graduate school and when I finished my coursework, it sort of started to make sense. I was like, why did I leave the software engineering world? And it was partly because I felt this gap. Like the intellectual stuff I was interested in, there was not a good connection. There was not a good book, like a history book that I read, that connected, for example, political theory and the design of operating systems. But it turns out that they're connected. And I think building that bridge was at first on a very personal level, just for my own wellbeing, right? Just to be a more whole person. But then when I started doing that, I really started finding that the links are not just tenuous, you know, thin links. Some of the major ideas in computer science are truly connected to, you know, the writing of poetry or to theology, for example. Right. You mentioned Ramon Llull. I mean, there were mystics and monks that were trying to get to the truth of God through this combinatorial, brute-force sort of composition, and it influenced people like Leibniz. It influenced, you know, binary calculus, binary algebra, binary notation. Right. The point is, you start looking at it and you're like, no, these things are actually connected. The history of computer science has its roots in rhetoric, in poetics. And I thought, you know, if it's interesting to me, there must be other people. I mean, it's fascinating. Yeah.

Deep: It's definitely fascinating to me. I mean, I literally have goosebumps hearing you say this because, you know, when I think about our audience and the people that I work with, you know, we help a lot of companies apply machine learning and AI, and they're all coming from very specific and disparate domains. It can be anything from construction to biology to history. There are lots of totally different places where we tend to think about it as AI going into these realms and bringing efficiencies and leveraging this power of pattern recognition towards utility. But you're sort of saying something very unique to me, which is like, no, these ideas are just returning home in a way to, you know, literature at least, or philosophy. And I don't know, I have never thought of it in that way.

Dr. Tenen: I think it enriches our practice. I mean, as a software engineer and as a user of computers, it enriches my practice to think of these tools and techniques and algorithms in that way, right? In that more expansive, philosophical way. So I find that gratifying, and it's also just a gap in our education. I think it goes both ways. I'd like to teach philosophy and literary theory for software engineers. And I'd like my humanities students to be able to code and to be able to reason about these black boxes, you know, like, hey, this thing is autocompleting your email using something called Markov chains. Well, that seems like complete, right, science fiction sort of magic. But you pick up the original paper, and Markov chains were an analysis of Pushkin's poetry. Right. And now it is used to autocomplete your emails, right? That's something that I'd like to dwell on, and that's a little piece of history that I'd like to discuss and kind of preserve.
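
The Markov chain idea mentioned here is small enough to sketch in a few lines. This is a toy, first-order, word-level chain, offered as an assumption for illustration; it is not how any particular email product works, and the training text is invented. The model counts which word follows which, then suggests the most frequent follower.

```python
from collections import Counter, defaultdict


def build_chain(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    chain = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        chain[current][following] += 1
    return chain


def suggest(chain, word):
    """Return the most frequent follower of `word`, or None if it was never followed."""
    followers = chain.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, invented for illustration only.
corpus = "thank you for your email thank you for your time thank you very much"
chain = build_chain(corpus)

suggestion = suggest(chain, "you")
print(f"after 'you', suggest: {suggestion}")
```

The same counting works on letters instead of words; only the unit changes, not the idea of predicting the next item from the current one.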

Deep: Yeah. I mean, as an engineer, I think many of us got into the field because we just looked around us and were fascinated by the world that we were looking at. I remember back then I just had a very rudimentary understanding of how things worked. I just wanted to know. I remember back when the iPhone first came out, somebody was doing a presentation off their iPhone on a projection screen, and somebody made the comment, like, that's an awful lot of technology to put that image up on the screen. And, you know, for a lot of us who got into engineering, at least, that was one of the motivators, just to figure out how does all this stuff work? And I remember going all the way down to the electron level and back up was quite a journey, but you do wind up having a much deeper appreciation for these things. Anyway, we can talk about that.

Dr. Tenen: Yeah. Yeah. I'm very happy. Yeah. You know, the deeper you go into that rabbit hole, it's also the technology, you know. So for my first book, which is called Plain Text, Stanford University Press, I really got into how magnetic storage works, and it's wild, you know, it's quantum phenomena. And also at some point it is so wild that the language is entirely metaphorical, you know. So you pick up the engineering papers and the patents around flash storage, and it was so fascinating to see how much, because they're trying to describe realities that we don't have natural language for. So there are these moats and electrons sort of digging tunnels and, you know, all this. And I don't know if it demystified things, but certainly, you know, then you do simple things, like save a file, and you're like, man, I just pressed the button and all kinds of cool things happen. And I think it's important to retain that sense of, you know, that sense of like...

Deep: Just to bring it back. How do you think your kind of unique background affects your ability and perspective on understanding this vaccine hesitancy problem? Like, you know, maybe somebody who's coming up through the more traditional data science-y worlds might come up with one set of conclusions, but how do you think your background and approach are different and maybe yield different findings?

Dr. Tenen: Yeah. So, I mean, I would say two things make our approach different. One is this emphasis on metaphorical thinking, an emphasis on rhetoric. This is where the facts are not enough, right? There is that metaphorical thinking that structures. I teach a class on metaphor, and one of the biggest punchlines of the class is that metaphors structure our experience in the world. Right. And so really looking at how that structuring actually happens. Right. And that comes out the VR. So that means I'm doing language analysis and I'm doing, you know, computer science-y stuff, but with the tools of metaphor analysis and with an eye toward metaphor, right? That's one. And the second thing is more, I would say, when we speak to public health officials, when we speak to physicians, actually being driven by data and being very empirical and saying, okay, yes, maybe you are right that people don't know the facts about vaccination, but is that true? Can we get a data set? Can we get the corpus? And can we look? So I think those are the two things that really separate this project. It is a data-driven project. It is an empirical project that looks at actual vaccine hesitancy in the wild rather than projections of it. So starting there, but then also using computational tools, but also literary analytical tools. It's analyzing language in its complexity, and it has a historical component, and it has, right, you have to have models to do the work. You have to have models of computational work. So.

Deep: We're almost out of time here. I want to just ask you one last question. If we can jump ahead, you know, three to five years, and let's assume that you got the most wildly successful output of your project, what would it look like? What would the output be? What, how would the world be different?

Dr. Tenen: Yeah. Well, I think making sure that the money that goes into vaccination campaigns, and, you know, that question is going to be with us forever. It has been with us, right? It's not just COVID. Measles. So really making those campaigns more effective by using more targeted language. I think that's something that we can do, you know, in the way we discussed. So that's one. And then in the more speculative realm, you know, I think the powers on the side of misinformation, whether they're state-sponsored powers or just commercial powers, they use all sorts of clever, you know, so for example, the use of bots, right? The use of bots that are liking things, or they're just trolling. So once we understand the various personas, the various neighborhoods of hesitancy, then using algorithmic tools to actually generate pro-vaccination responses. Right. I think using AI to really participate in that national conversation and improve it in some ways, and using those tools for the good, for the benefit of the public, is something that I also think we can use this data and this kind of approach to do. And that's maybe a bit more speculative, a bit more out there, but possible.

Deep: Okay, cool. That's all for this week. Thanks so much, Dennis, for being here, and as always, thanks to our listeners for tuning in. If you like what you're hearing, give us some love on your podcast platform in the form of a rating or review. That's all for now, until next time.

That's all for this episode, I'm Deep Dhillon, your host saying check back soon for your next AI injection. In the meantime, if you need help injecting AI into your business, reach out to us at xyonix dot com. That's x-y-o-n-i-x.com. Whether it's text, audio, video, or other business data, we help all kinds of organizations like yours automatically find and operationalize transformative insights.
