EJB Talks Podcast

Jim Samuel

EJB Talks: Public Informatics Spotlight–from Data Analytics to Artificial Intelligence Impacts

February 8, 2023

This week on EJBTalks, Stuart Shapiro welcomes Professor Jim Samuel, Executive Director of the school’s Master of Public Informatics program. Talking about his professional path from architecture to the business and finance sector, Professor Samuel explains how the global financial crisis inspired him to take a deeper look at the Big Data phenomenon. They also discuss the synergy between studying informatics and Bloustein School’s other disciplines. Jim concludes the episode by taking Stuart and listeners on a quick but deep dive into two current hot topics, Artificial Intelligence and ChatGPT. He explains their meanings, uses, dangers, and what we need to be thinking about as these changes infuse society. Tune in for this and more!

Stuart Shapiro
Welcome to EJB Talks. I’m Stuart Shapiro, the Interim Dean of the Bloustein School, and the purpose of this podcast is to highlight the work my colleagues and our alumni in the fields of policy, planning, health, and today, informatics, are doing.

We’re spending this, our eighth season, speaking with new faculty at the Bloustein School. We hired 10 people in the past year, in a wide array of fields, as the season will show. Today we’re cheating a little bit and speaking to someone who was hired a little bit more than a year ago, the Director of our Public Informatics Program, Professor Jim Samuel. Welcome to the podcast, Jim.

Jim Samuel
Good morning, Stuart.

Stuart Shapiro
So let’s start sort of with a question about you and your background, your origin. How did you get interested in questions about big data and AI?

Jim Samuel
That’s been a very interesting journey. So my past is quite eclectic. I started off as an architect and then decided to delve into the world of business and finance. And it was while working with one of the top 12 banks in the world that I began to realize the growing emphasis on data, the whole power of data, and the use of technology to manage data. Those were the days just post my MBA, I was working with Excel models. And then one day, the realization dawned on me (this was before the financial crisis, the global financial crisis of 2007-2008) that the kind of data that we’re dealing with goes way beyond the kind of technologies that I was using at that time, it goes way beyond Excel modeling. So that’s when I began to actually take a deeper look at what the big data phenomenon is all about.

And the more I studied it, the more I was attracted to it and drawn into it. And it opened up a whole new world for me. So that was kind of the starting point of my journey in the world of informatics and artificial intelligence. And very soon, I also realized, once I did a deep dive into the world of big data, that the future does not just belong to large quantities of data, but it belongs to high-quality data, coupled with algorithmic technologies that we’re kind of beginning to call artificial intelligence, artificial intelligences, so on and so forth.

Stuart Shapiro
Gotcha. That’s great. I mean, I’ve been fascinated with those issues, too. But I never really moved beyond Excel. So I’m still stuck there. So glad that we have you with us.

So you joined us as the first permanent Director of our Public Informatics Program. And public informatics is not a common field. It’s not a common degree. I’m interested to sort of get your perspective on what and when you’re talking to students who might be interested in a degree, what is public informatics?

Jim Samuel
I’m going to set the stage with a couple of thoughts and then answer the question, what is public informatics?

Stuart Shapiro
Fair enough.

Jim Samuel
So initially, data science became quite popular. And the idea of data science is that it teaches you methods on how to work with data. Then we had the whole idea of analytics and business analytics. And again, the idea was methods to work with data, going beyond traditional datasets and traditional statistics, working with new types and varieties of data, and using new forms of data visualization to discover insights and so forth. And then we have informatics, which … while there is a lot of overlap between statistics, data science, analytics, and informatics, each of these words represents some unique value, you know, they reflect some unique value. So, informatics refers to not just analytics, not just the use of statistics and data science methods, but also domain specialization. So usually informatics is associated with some words such as health informatics, where the implication is we are working with healthcare data and advanced methods to discover insights, which will help us take appropriate decisions in the healthcare domain.

So, informatics, therefore, refers to the next step: going beyond data science, going a little bit beyond analytics, getting domain specialization and, in some sense, the higher use of technologies and technological capabilities to manage big data and smart data. And that brings us to public informatics, where we are taking this whole concept of using statistics, data science, and analytics, and specializing in domains which are of public interest.

So while traditional healthcare informatics relates to… could even be used by private hospitals, for example, when we speak about healthcare from a public interest perspective, then we are talking about public health issues. It could be public finance, it could be public education. Just a few days ago, I asked ChatGPT this question. I asked it a number of times in different ways, so I got a lot of output. But the first thing that it listed in response to the first question, and that struck me as interesting, is the analysis of social media data to understand public opinion.

Stuart Shapiro
Oh, that’s interesting.

Jim Samuel
And I thought… I found that quite interesting. And I agree with that thought process, because although social media data was initially used for a wide variety of purposes, including business analytics, really, social media data reflects public opinion. And so when we analyze that with the intention of mining public opinion on a topic of public interest, such as public response to COVID-19… In one of the early papers I wrote, my research showed that initially, on social media, people were actually joking about COVID-19. They were being humorous, they were downplaying the risks. And then somewhere around February, towards the end of February and March of 2020, we find that humor changed to seriousness and seriousness changed to fear.

Stuart Shapiro
Right.

Jim Samuel
And we actually plotted the sentiment, the fear sentiment, it was a steep increase in just a matter of two weeks. It just shifted.
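
As a rough illustration of the kind of analysis described here, the sketch below scores short posts against a tiny fear lexicon and averages the scores by week. It is not the method used in the research Jim mentions: the word list `FEAR_WORDS`, the helper names `fear_score` and `weekly_fear`, and the sample posts are all hypothetical, made up for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical mini-lexicon for illustration only; real studies rely on
# validated sentiment lexicons or trained models.
FEAR_WORDS = {"afraid", "scared", "fear", "panic", "worried", "terrified"}


def fear_score(text: str) -> float:
    """Return the share of words in the text that appear in the fear lexicon."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in FEAR_WORDS for w in words) / len(words)


def weekly_fear(posts):
    """Average the fear score per ISO (year, week) for (date, text) pairs."""
    buckets = defaultdict(list)
    for day, text in posts:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(fear_score(text))
    return {week: sum(s) / len(s) for week, s in sorted(buckets.items())}


# Made-up example posts, not real data.
SAMPLE_POSTS = [
    (date(2020, 2, 10), "lol this virus thing is just a joke"),
    (date(2020, 2, 12), "not worried at all"),
    (date(2020, 3, 9), "I am scared and worried about my family"),
    (date(2020, 3, 11), "panic at the stores, everyone is afraid"),
]

trend = weekly_fear(SAMPLE_POSTS)
for week, score in trend.items():
    print(week, round(score, 3))
```

Plotting the weekly averages over time would give a fear curve of the sort described above; with real data, the lexicon and the aggregation window are the main choices to revisit.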

Stuart Shapiro
Fascinating. Fascinating, I remember going through those stages, myself there.

So you’ve been the MPI director for about 18 months now, if I’m counting right. And I’m wondering how your view on this new discipline, this new area, has evolved in your time in the position, between interacting with colleagues and interacting with students. What’s changed in the way you think about it, if anything?

Jim Samuel
That’s a good question, Stuart. And, number one, I believe it has changed and it has changed a lot. So for me the past one and a half years, it’s been a great learning experience. I spoke to a lot of the faculty here at Bloustein, and to students, and I learned a lot from them. From Clint Andrews, and so many others who have interacted, but also the students. One thing I found about Bloustein students is that they’re very passionate about their work.

So I had doctoral students, for example, one of them was studying used EVs (electric vehicles). She was very passionate about her work. And as I spoke to different students, I realized, number one, that the breadth of public informatics is much greater than what I had envisaged when I came into the program. The second thing that I realized is that the value creation potential is also far greater, as I saw students apply it to projects, as I saw the ideas that were being generated. And if someone were to put a net present value on the implementation of some of these ideas, it would be huge. So yes, it’s been a steep learning curve.

Stuart Shapiro
And I don’t think that’s surprising, given the nature of the field and, and sort of where we were when we got started.

All right, you’ve already mentioned ChatGPT. So we would be remiss to listeners if we didn’t spend at least part of our time talking about ChatGPT and AI and such. And I know I’ve seen you talk about this before, but I want to start with your definition of artificial intelligence.

Jim Samuel
Artificial Intelligence is essentially a set of technologies, including the math, the stats, the programming, and the hardware that goes into it, which aims to mimic human intelligence, components of human intelligence, specifically, cognition and logic. So on one side, we have cognition, which means the ability to see, the ability to hear, and now there’s even an attempt to program the ability to smell. And the logical side is where we are talking about finding things, addition, subtraction, mathematical operations, and so on and so forth.

So this new set of technologies, which are able to mimic these human intelligence capabilities is what we are calling artificial intelligence. Generally, within the field, there’s an understanding that there’s general artificial intelligence, which is artificial intelligence that has integrated capabilities like human intelligence. We are very far away from developing meaningful general artificial intelligence. What we do have is our narrow artificial intelligence, which means we have a computer vision application, which can identify more human faces, and tag them by their name than what a single human being could meaningfully remember. This means you can train the application to recognize 100 million faces, while a human being would find that difficult, especially in the short period of time that we are able to train the AI.

Stuart Shapiro
Yeah, well, that’s different, I think, from what a lot of people think about… a lot of our views of AI are formed by popular culture, right? We think about the Terminator, or we think about Her, another movie with a guy who falls in love with an artificial voice, if you will. So this is definitely different.

I want to sort of think and try to put this together with what we talked about before with public informatics. And think a little bit about what applications of AI are most likely to make a difference in public welfare, whether that be policy, urban planning, or health. You can pick one or two examples in any field and I think it’ll be of interest to people.

Jim Samuel
Sure. I’m going to kind of take a minute to talk about ChatGPT, and then travel into this question, because I think the two are connected. So ChatGPT is essentially a pre-trained model. It’s a generative pre-trained transformer. That’s what the GPT stands for. So the reason I mentioned that is that there are many GPTs. ChatGPT, to the best of my understanding, is based on GPT-3, which is a very large model with a significant number of parameters. I think something like 1.6 billion parameters, or so. I have worked with text generation, it’s very easy to write a program in Python. And you can base it on GPT-3. You can base it on some other large language model elements, and it is able to generate text.

Two things. One is I’m very impressed with ChatGPT. ChatGPT is very useful. And I think academia and the world, in general, should take a more friendly and positive approach towards ChatGPT. It’s very useful, and it’s going to be very useful. But along with that, the caveat that we need to remember is, there is no artificial intelligence that can understand human meaning the way humans do. So though ChatGPT is impressive, while we use it, we should never forget that it does not understand meaning the way humans do; it is simply producing output in a very mechanical, probabilistic fashion.
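
To make "mechanical, probabilistic" output concrete, here is a minimal sketch of text generation in Python. It deliberately swaps in a toy bigram model (each next word is drawn at random from the words seen after the current one) and is not GPT-3 or anything close to how ChatGPT is built. The corpus, the function names `build_bigram_model` and `generate`, and the seed are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def build_bigram_model(text):
    """Map each word to the list of words observed immediately after it."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model


def generate(model, start, length, seed=0):
    """Produce up to `length` words by repeatedly sampling a next word at random."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    output = [start]
    for _ in range(length - 1):
        candidates = model.get(output[-1])
        if not candidates:  # no word was ever seen after this one
            break
        output.append(rng.choice(candidates))
    return " ".join(output)


# Tiny made-up corpus; a real language model is trained on vastly more text.
CORPUS = "the data is big and the data is noisy and the model is small"

model = build_bigram_model(CORPUS)
sample = generate(model, "the", 8)
print(sample)
```

The generator never checks whether a sentence is true or sensible; it only follows word-to-word frequencies, which is the point of the caveat above, scaled down to a dozen words.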

Having said that, now that we have experienced the power of ChatGPT, there have been a number of powerful technologies that have been released over the past few years. In fact, according to me, generative adversarial networks were a greater breakthrough than what has been illustrated through ChatGPT. But that’s just my opinion, I’m not stating it as an objective fact. The reason I mention it is that apart from ChatGPT there are a number of technologies that have crossed certain thresholds, and now they’re going to be very useful. There’s a lot of value-creation potential.

So when we talk about taking these AIs and using them for public benefit, now it’s a matter of ideation and establishing the appropriate policy framework, so that there is a controlled release of these technologies for public benefit. Most AIs are currently controlled by large corporations or some specialized companies such as OpenAI. For these technologies to be fully… for these technologies to realize their full potential in public service, we need these technologies to be open source. Otherwise, there are certain risks that are going to be associated with these. One is a certain class or a certain group of people controlling what these AIs do and how they interact with people; the data that is generated from them is also controlled.

Unless it becomes open source, the education process for the public is not complete. It’s incomplete. And so it’s like you’re dealing with illiterate masses. They’re able to use ChatGPT, but they don’t know what data it’s been trained on. We have an idea, but really, we know that it’s about 40 terabytes of data, and we know that it’s data up to 2021, approximately. So we have some metadata information about ChatGPT, but we don’t know what exact data it’s been trained on, so we don’t know objectively what the limitations are. Through experience and by interacting with ChatGPT, we’re kind of reverse engineering and understanding what the limitations are.

Stuart Shapiro
And what are the implications of that? What does that mean for the way the world might be different?

Jim Samuel
Number one, artificial intelligence is here to stay. So this is not a technology that we’re going to be able to roll back. We have to move ahead with it. Number two, unlike other technologies, where the technology would be released, and then the lawmakers would get to policymaking and building the laws over time, the speed at which AI is progressing is very different. We cannot use the same philosophy or the same mindset to first let the technology be released, then look at how the technology is performing and decide what the policy should be. We need to front-end it.

This means policy itself has to have a strong research component where we say, Okay, what kinds of technologies do we anticipate being released in 2023, in 2025? And what policies can be put in place now so that companies have appropriate guidance, so that when they release these AIs, there’s risk management, there’s transparency, there’s openness, and there’s fairness and equity in place? Otherwise, the technology is released, laws and policies are being developed, and by the time the laws and policies are developed, that AI technology is already outdated, and they have released the next model, or the next technology.

Stuart Shapiro
Well, let’s look at the flip side, you mentioned risk management, what kind of risks are we talking about?

Jim Samuel
There are different categories of risks. One category of risk is control by companies or a few powerful people who control these AIs. So there is the… there are a number of risks that come in that bucket of just controlling people for individual profit, or some other idiosyncratic whim or fancy of, let’s say, a person who owns an AI, a strong AI.

The other category of risk is, it’s related to public education. It’s related to really educating people about AI. I think that AI education should be mandatory at different levels in different ways. But everyone should understand the philosophy and the concept of artificial intelligence. So when an uneducated user or an uneducated public tends to use a technology like ChatGPT, if they’re not trained, and if they don’t understand what AI is, they can set wrong expectations. And they will begin to depend on these AIs more than they should. And that can lead to very dangerous scenarios. For example, in the past, it’s shown that AI has given some really terrible and life-threatening advice.

Of course, those were in experimental settings. So it was used as a joke, but it could happen in the real world with a future chatbot. This ChatGPT is based on GPT-3, and I’m sure within the next couple of years, we’re going to see other companies come out with their own versions and variations of GPT. One of these bots, if they get the wrong input, if they get the wrong stimulus words or phrases, could give out output that’s completely unexpected. Even AI scientists cannot predict what the output will be like. It’s an unpredictable set of words that will be thrown out by probabilistic associations. The user could misinterpret it and just think that that’s correct, while actually, it’s very wrong, leading to wrong decisions and different kinds of crises.

Stuart Shapiro
Right. Fascinating. Jim, I could probably keep you on for an hour asking questions about this, but for our listeners’ sake, I’m going to wrap it up here. Thank you so much for coming on, it’s really been educational.

Jim Samuel
Likewise Stuart. My pleasure.

Stuart Shapiro
Also a big thank you to our team, Amy Cobb and Karyn Olsen. We’ll see you next week with another talk from another expert at the Bloustein School. Until then, stay safe.
