How AI is redefining coding

The Georgian Impact Podcast | AI, ML & More

Episode 3 | February 16, 2024 | 00:24:30

Hosted By

Jon Prial

Show Notes

Sometimes it’s hard to know where to start when it comes to generative AI. It’s not too hyperbolic to say that many different aspects of a business have the potential to be affected by this new technology. Today, we’re going to talk about something that’s behind the scenes for most people, although hopefully not this audience. It’s coding.

On this episode of the Georgian Impact Podcast, we dive into the world of generative AI and its impact on coding, testing, and product design with guest Rodrigo Ceballos. Rodrigo is a Machine Learning Engineer at Georgian who brings firsthand experience and expertise, shedding light on the transformative power of AI in the tech industry and exploring the exciting possibilities brought about by the fusion of human creativity and AI technology.

You’ll Hear About:

Who is Rodrigo Ceballos?

With over six years of experience in AI research and engineering, Rodrigo Ceballos is a dedicated Machine Learning Engineer at Georgian. In his current role, Rodrigo collaborates with portfolio companies to implement solutions in computer vision, natural language processing and generative AI. Before joining Georgian, he served as an AI Research Engineer at PAIGE.ai, where he played a pivotal role in developing PaigeProstate, the world's first FDA-approved AI-assisted pathology diagnostic tool.


Episode Transcript

Jon Prial: [00:00:00] The material and information presented in this podcast is for discussion and general informational purposes only and is not intended to be and should not be construed as legal, business, tax, investment advice, or other professional advice. The material and information does not constitute a recommendation, offer, solicitation, or invitation to the sale of any securities, financial instruments, investments, or other services, including any securities of any investment fund or other entity managed or advised directly or indirectly by Georgian or any of its affiliates. The views and opinions expressed by any guests are their own views and do not reflect the opinions of Georgian.

Sometimes when it comes to generative AI, I just don't know where to start. I'm not a fan of being too hyperbolic, but I'm comfortable saying that many different aspects of a business have the potential to be affected by this new technology. Today, we're going to talk about something that's behind the scenes for most people, although hopefully not this audience. It's [00:01:00] coding. Yes, people can chat with ChatGPT. Yes, images can be generated. And I'm sure you've seen GenAI write cool stuff, in really specific styles no less. I mean, hey, I just heard a Charles Dickens-like urchin talk about a rocket to another planet. It was impressive. But what's impressed me more is when I saw the use of generative AI to automate working with a spreadsheet. Now that's coding, but that's not what I consider coding-coding. It turns out that generative AI can also do the work that many programmers do. Now that's a productivity tool, and something that every business leader needs to understand. Our recent podcasts have shown how much of an impact GenAI has on the products being developed by companies. But today, we're going to be talking about how writing code, testing code, and yes, even product design can be affected by generative AI. Eye-opening to say the least. I'm Jon Prial, and welcome to Georgian's Impact Podcast. [00:02:00] With me today is Rodrigo Ceballos, a Machine Learning Engineer on Georgian's AI team. Rodrigo, tell us a bit about what you do at Georgian.

Rodrigo Ceballos: Hi, Jon. I usually spend my time working with our portfolio companies on projects that range anywhere from a company that is just starting to use AI, and needs someone to help them build a team and come up with the first few products that make sense for exploring AI as a business, all the way to mature AI-first companies that have ML or AI teams and either need an extra pair of hands or are venturing into a space of AI that they're maybe not so familiar with, where we can bring in our team. The other part is around our pipeline for investment: [00:03:00] finding companies that are related to AI, and showing why it might benefit a company to have us as an investor and have access to my team as a resource.

Jon Prial: Just to put a little more perspective on what I talked about in my introduction, and to repeat it slightly differently: I think it's fair to say that over time, more and more power has been put into the hands of end users. I think back to office productivity, word processors, charts, and spreadsheets, but now it's moved to automating and simplifying the building of websites, running marketing campaigns, and so much more. And to me, the net of it is that the power of productivity was moving directly into the hands of end users, but now we're talking about developers. The things that technical teams have been doing over the years have expanded, maybe even siloed, from just writing code to managing databases and building AI models, but I'm not sure that's the same thing as those end-user examples I listed. So Rodrigo, from your perspective, what's going on now?

Rodrigo Ceballos: I don't think this is necessarily that new. Software engineers have basically [00:04:00] been automating themselves for as long as the field has existed, right? It's as basic as: we started writing Assembly, and at some point we realized that we were repeating ourselves a lot. Can we come up with a slightly higher-level, more abstract language that simplifies some of these things we keep repeating over and over again, to make our lives easier? To some degree, that is automating the job as well. But I think that progression from low-level to high-level, abstract thinking has now reached a new era where that language is English, or natural language generally. We're getting to a transition phase where we no longer write Python, which is, to be honest, quite high-level already compared to what we were writing 10 or 20 years ago, and instead we're transitioning into writing English. So in that sense, large language models have expanded the range of people who can directly interface with computers and create [00:05:00] applications, write programs, write code. That's really what's at heart here.

Jon Prial: Then how has your job as a coder changed in terms of your use of GenAI? What do you do now?

Rodrigo Ceballos: I think I'm in a bit of a special position. I'm fortunate enough that my job is, to some degree, to stay up to date with all these trends. So I use it every day. I use ChatGPT and other language models multiple times a day. How it has affected my day-to-day is that I can now automate a lot of menial tasks that I used to have to do myself. For example, if I have a very clear idea of a function that I want to write, that does something very specific and isn't too complicated, most of the time ChatGPT's code interpreter, or advanced data analysis as they've renamed it [00:06:00] now, which is one of the beta features of GPT-4, will basically write that function perfectly from scratch. It has a Python interpreter in it as well, so it can actually write the function and then run it to check that it's correct. And most of the time, for that kind of use case, this is really helpful. I also have a few plugins in my VS Code environment that let me, for example, highlight a function and ask an LLM to write me a unit test for it. Sometimes this is not a comprehensive test, but it usually lets me test at least some part of the function and automate some of the testing process as well.
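As a rough illustration of that generate-then-check workflow, here is a minimal Python sketch. The ask_llm helper, the prompt wording, and the example function are hypothetical stand-ins, not any specific product's API:

```python
# A rough sketch of the "write the function, then run it to check" loop
# described above. ask_llm() is a hypothetical stand-in for a real
# chat-completion call; it is assumed to return Python source code.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM provider of choice")

def generate_and_verify(spec: str, name: str, tests: list, retries: int = 3):
    """Ask the model for a function, exec it, and keep it only if it
    passes the supplied (args, expected) test cases."""
    prompt = f"Write a Python function named {name}. Spec: {spec}. Return only code."
    for _ in range(retries):
        source = ask_llm(prompt)
        namespace = {}
        try:
            exec(source, namespace)            # define the generated function
            fn = namespace[name]
            if all(fn(*args) == expected for args, expected in tests):
                return fn                      # every check passed
        except Exception as err:
            # Feed the failure back so the next attempt can fix it.
            prompt += f"\nThe previous attempt failed with: {err}. Fix it."
    raise RuntimeError("no generated candidate passed the checks")

# Hypothetical usage: generate_and_verify(
#     "return the nth Fibonacci number", "fib", [((0,), 0), ((1,), 1), ((10,), 55)])
```

The same loop covers the unit-test case Rodrigo mentions: swap the spec for a request to write tests for a highlighted function, and run the returned tests instead of comparing outputs.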
Jon Prial: One of the changes that has happened over the years is more apps, more interconnectivity, things in the cloud; companies opened up their software to interact better with other applications, through APIs and the like. In my naive pre-GenAI thinking, that would always be a challenge: how to get the coding and parameters right. Has this changed now?

Rodrigo Ceballos: [00:07:00] I think so. I don't think there's yet a fully general solution where anyone can just plug into any kind of API and have it work. There's still a little bit of work to do, but LLMs have now been successfully used many, many times as an intermediate layer, a translation layer if you will, between either two pieces of code, two libraries that don't necessarily talk to each other, or even a user and a library. The way this works at a high level is that you can have something called an agent, essentially a piece of software that uses large language models for decision-making, and pair that agent with a tool, which in this case can be something like a wrapper around an API. What this allows the agent to do is that you can ask it, "Hey, can you please change the name of my user?" and it will look inside its wrapper, find the API call that changes the user's name, [00:08:00] find the documentation for that call, write the API call that would change your name to what you asked for, and then execute that function, right? So in that sense, you can use agents, and LLMs more generally, as a translation layer between different pieces of software.

Jon Prial: I'd like to talk about some simple hypothetical cases where GenAI can help. You've talked about AI code generation, you've talked about agents; in my mind, there's this old thing called scripting, and clearly there's a lot more that you can do to automate. Help me understand that automation against all of these other apps. Just take me a little deeper into what we were just talking about.

Rodrigo Ceballos: There are many ways to do this. One of the most common right now is to use a library called LangChain. LangChain, at first, might seem a little unnecessary, because it's not writing the models and it's not, in and of itself, doing any of this work. But the real [00:09:00] beauty of LangChain is that a lot of good engineers have sat down and come up with a way to organize all of the different components that need to fit together to make these applications work, and they've made it into an open-source library that anyone can use, with a huge community. That's what I've been using most of the time for these kinds of projects.
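To make the agent-and-tool idea concrete, here is a toy sketch of the dispatch loop, with the same hypothetical ask_llm helper and a one-entry tool registry standing in for a real API wrapper:

```python
# Toy sketch of the agent-plus-tool pattern described above: the LLM picks
# an API call from a registry of described tools, and the agent executes it.
import json

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM provider of choice")

TOOLS = {
    "change_user_name": {
        "description": "Rename a user. Args: user_id (int), new_name (str).",
        "func": lambda user_id, new_name: f"user {user_id} renamed to {new_name}",
    },
}

def run_agent(request: str) -> str:
    # Show the model what it can call, and ask it to pick one.
    catalog = "\n".join(f"- {name}: {t['description']}" for name, t in TOOLS.items())
    decision = ask_llm(
        f"Available API calls:\n{catalog}\n"
        f"User request: {request}\n"
        'Reply with JSON only: {"tool": ..., "args": {...}}'
    )
    choice = json.loads(decision)
    return TOOLS[choice["tool"]]["func"](**choice["args"])

# run_agent("Hey, can you please change the name of my user?") would have the
# model select change_user_name and fill in the arguments it extracted.
```

Frameworks like LangChain package this kind of loop, along with prompt templates, tool abstractions, and retries, so teams don't have to hand-roll it.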
Jon Prial: So let's talk about, I don't know, let me call it this new development process. Can you walk me through an example to help me understand this better? What type of request might you create an agent for?

Rodrigo Ceballos: One example of what you can build with this, and it goes to show how quickly this technology is advancing that what I'm about to describe is probably now pretty trivial to do yourself, or even to find something online that already does it, is a wrapper we built around one of our internal databases at Georgian. We built an agent that could write Python and had [00:11:00] an archive of information about that database, essentially schemas and semantic information: this is the kind of data in this database, this column contains this kind of data and follows these properties, things like that. We combined both of those into an agent that allowed us to query the database using natural language. You could ask it, "Hey, please get me the latest 10 entries in this database that satisfy this criteria." That would get translated first into Python. It would use a tool to execute that Python and make sure it was correct. It would then check against that wealth of data it had, the archive, to make sure it was correctly interpreting the request and the output from the original run of that Python code. It would update the Python code, run it again, and then give you back the query results. You can chain together these models and interleave [00:12:00] them with other operations, like "run Python code, query a database," and build pretty complex pipelines that really expand the capabilities of what large language models can do.

Jon Prial: It's fascinating to me because, you know, in the old days, you'd open up a database and teach somebody SQL, perhaps, and that's gone. Clearly, you're delivering much more power to an end user. From a development side of things, was this easier and faster than teaching end users SQL? And how long did it take?

Rodrigo Ceballos: Well, this was pretty fast. We did this during an internal hackathon that lasted two days. So it was a full two days of programming to get this to work correctly, but it has no comparison to what it would have taken to teach someone SQL. I should add a disclaimer here, however: this does not yet completely replace knowledge of SQL. In fact, another example is that we got a SQL statement that had 500 [00:13:00] lines, which I didn't believe was possible, but it runs and it does something useful. However, it's obviously very hard to debug because it's so large, and when I tried using LLMs on that kind of statement, they were not able to handle it correctly. There is still a lot of use in knowing these technologies, but I think that as we move forward, those needs will get less and less.
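Here is a minimal sketch of that natural-language query loop, assuming a SQLite database and the same hypothetical ask_llm helper; the schema dump plays the role of the "archive" of semantic information, and execution errors are fed back to the model for a retry:

```python
# Sketch of the natural-language database query loop described above,
# using SQLite and a hypothetical ask_llm() stand-in for a model call.
import sqlite3

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM provider of choice")

def nl_query(conn: sqlite3.Connection, request: str, retries: int = 3):
    # Ground the model in the schema pulled from SQLite's catalog.
    schema = "\n".join(
        row[0]
        for row in conn.execute("SELECT sql FROM sqlite_master WHERE sql IS NOT NULL")
    )
    prompt = (
        f"Database schema:\n{schema}\n"
        f"Request: {request}\n"
        "Reply with a single SQL query and nothing else."
    )
    for _ in range(retries):
        sql = ask_llm(prompt)
        try:
            return conn.execute(sql).fetchall()  # run the generated query
        except sqlite3.Error as err:
            # Feed the error back so the model can correct its own query.
            prompt += f"\nThat query failed with: {err}. Please fix it."
    raise RuntimeError("no generated query succeeded")

# nl_query(conn, "Get me the latest 10 entries that satisfy this criteria")
```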
Jon Prial: You mentioned earlier that you generate some code and it would be kind of automatically tested, but let's talk generally about the testing cycle through a development process, unit testing. Can you ask it to give you a whole pile of unit tests? Talk to me about how GenAI might help you test the product that's now been coded.

Rodrigo Ceballos: I think there's a concept here that we've been using internally that helps us walk through the stages one could go through to get such a system to work. We call it the "crawl, walk, run" strategy. [00:14:00] The idea is that crawl is the simplest thing you can do. So yes, you could copy and paste a function or a set of functions into ChatGPT and come up with a clever prompt that explains to the LLM what a good unit test is in Python, and the series of mental steps one should take in writing a comprehensive suite of unit tests for the function. It's just a single prompt, and you could automate that. I'll skip ahead to what I think is probably the solution to that problem that could be built today, and this is an opinion, but I think an agent would be powerful in this case. You could give that agent the ability, one, to run those unit tests. Two, you could have that agent running on your code base and learning from things like actual mistakes and errors that happen in [00:15:00] production. Remember earlier I mentioned this idea of an archive of information the agent had access to? That archive could include things like logs and past errors that have happened in the code base, and the agent could use that to be more targeted in the way it writes tests, for example, prioritizing systematic issues in the code base that you can see if you go through the history of issues.

Jon Prial: In terms of the work that you do, and obviously there's so much more efficiency that's shown up here, do we need to think about what's generated and trust? Does that put more responsibility on you, like another layer of validation that you need to do?

Rodrigo Ceballos: Yeah, I think that's a very common concern, and rightly so. The classic story [00:16:00] of someone trying something like ChatGPT a few months ago was that they would ask it some question that was really meant for Google, a factual question about something, and GPT would get it wrong, and they would say, "Oh, this thing doesn't work!" and leave the website and never go back. LLMs are prone to what's called hallucination. They're not built to store factual information, and so you should not trust them to recall such information accurately. Now, what these networks are good for is reasoning, especially when given templates on how to reason, following steps. The way I think of it is, there's a set of tasks that language models can do today almost perfectly, every time, and a lot of the interesting tasks we want them to do are a little bit harder than that, a bit above that band. One way you can still use them is to break down that harder [00:17:00] task into smaller subtasks that fall back into the band of problems they can do. The LLMs can then do those subtasks, and you put it all back together to answer the main question. So for me, a lot of the trust component when using language models is to build in some guardrails around these LLMs. Give them access to data they can use to validate their own answers. You can do things like ask different LLMs the same question and only be confident it's mostly correct if all of them answer the same thing. There are all these techniques you can use to minimize the chances of mistakes.
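That last guardrail, accepting an answer only when several models agree, is straightforward to sketch; the ask_llm helper is again a hypothetical stand-in, here parameterized by model name:

```python
# Sketch of the agreement guardrail: ask several models the same question
# and trust the answer only if enough of them agree.
from collections import Counter

def ask_llm(model: str, prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM providers of choice")

def consensus_answer(prompt: str, models: list, quorum: float = 1.0):
    answers = [ask_llm(m, prompt).strip().lower() for m in models]
    best, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= quorum:  # quorum=1.0 requires unanimity
        return best
    return None  # no agreement: treat the output as untrusted

# consensus_answer("What HTTP status code means 'not found'?",
#                  ["model-a", "model-b", "model-c"])
```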
Jon Prial: And it does sound like it's a [00:18:00] different layer that needs to be worked on, so the development process does look different, which is fine; things do change. When I put my market analyst hat on: it was not that many years ago we were talking about no-code and low-code tools as a way to democratize coding. Is this different? Is it a replacement, or do they sit in different worlds?

Rodrigo Ceballos: No-code tools basically encompass everything that is no code, and so I think the tools that were built for you to, for example, analyze CSV files or other data sources without code, just by clicking around, this can replace a lot of those tools. This is basically what LLMs do. As I explained, because they serve as this intermediate layer between natural language and code, they are a no-code solution, in a sense, because anything you can express in natural language could, in theory, be translated by one of these systems into code.

Jon Prial: So how do [00:19:00] we ensure that all the knowledge that you've got isn't lost in this new world? I'm thinking about how, if you were using some web design tool, all the key security aspects would have been built into that website. But if someone doesn't know that, guardrails or not, how do we make sure we're doing the right things regarding the breadth of safety that's required? And is that within the LLM, or is it on the shoulders of the programmers?

Rodrigo Ceballos: Yeah. I would not advise anyone to use LLMs to build a website from scratch and assume it's going to be secure from attacks. LLMs are there, I think right now, to automate the things they can, and security of that form is not something they should be trusted with just yet. They can be complementary. One of the advantages of language models [00:20:00] is that they can adapt to unforeseen situations reasonably well, especially when given the right tools to do so, and I can see them being very helpful as part of a security system, but not really the whole thing. And to some degree, not everything should be replaced by an LLM. As I mentioned before, many of the agents I've described also use tools like Python and databases, right? Not everything is an LLM, and even if you use generative AI to generate some of the security code, you still want to be putting those guardrails in the code itself and getting professionals who know what they're doing to add those things to something like a website.

Jon Prial: So we've talked about coding, we've talked about testing and some aspects of deployment. I want to go back to the beginning of the cycle and talk a little bit about product design. What do you think about how that might affect [00:21:00] programmers?

Rodrigo Ceballos: I have a funny story about this that happened the other day. Like all good software engineers, I love a good card game, and for a long time I've toyed with the idea of making my own card game. As I mentioned earlier, I basically have ChatGPT give things a shot and see if it can do them. So when I was thinking about a new card game, I thought, well, why don't I use ChatGPT to help me brainstorm what a good game could be? I gave it some very high-level ideas about what I wanted in this game. For example, I said I wanted the game to last less than 20 minutes; I wanted it to be a one-v-one combat-style game, a deck builder, something like that. I told it what I wanted the theme to be, that I wanted it to be about aliens, and a few other descriptions. And it was surprisingly good at coming up with suggestions and ideas, but also a framework for how to [00:22:00] think about the game design process. Granted, I did ask it to do that first, and I would say that is key. Prompting is a skill that I think we will all eventually need to develop. So I basically asked, "First, think through what the pieces of this game are, the components we need to think through, and then give me some ideas about what those could be given my requirements." I'd say, though, that it is a very interactive process, because ChatGPT is good at blurting out a large quantity of ideas, out of which maybe 10 to 20 percent might actually be really good. And that's where I come in.
And so I think that it wasn't that the LLM could do this whole game design by itself, but the combination of me as a filter and a guide, an architect essentially, thinking through, okay, how would these ideas work in practice, how do they relate to the vision I have for the game, trying to guide it toward what I wanted, but also letting myself be surprised [00:23:00] by some of the things it came up with that I hadn't thought of before. That interaction is what's really powerful, and it's how these models can be best utilized for brainstorming and design.

Jon Prial: That's fascinating. You mentioned the test cases could be quite broad, and I get the sense they can address things that you didn't think of. And here we are talking about improving and enhancing as a team, you and this generative AI system, helping with the aspect of human creativity. I always thought it can only do what it knows, but it seems it can do more. Whether you're writing code or creating a product design, are you getting elements of creativity that maybe someone else could not have thought of?

Rodrigo Ceballos: There's a philosophical question here, you know: are there any original thoughts, or is everything basically recycled to some degree or another? If the answer is yes, then LLMs are, in [00:24:00] theory, just as creative as humans, because they've seen a non-trivial percentage of the internet, and they are reusing those ideas, recombining them in different ways to give you potentially creative or novel outputs. At the same time, though, will an LLM ever come up with a paradigm-shifting idea, something that completely breaks with whole schools of thought that have come before? Of that I'm a little more skeptical, at least for now. One other point to bring up here in terms of creativity is that there have been a few preliminary studies showing that when you use language models in this way, you do tend to actually be less creative. You are offloading that responsibility to the model and assuming that role I mentioned earlier of a critic or a guide. That's something I think we need to be very careful about, to make [00:25:00] sure that we as a species are not losing one of the few talents we still have that is not completely automatable, which in my opinion is creativity.

Jon Prial: So a best practice, I believe, is pair programming. It sounds to me like you could pair up with ChatGPT, a generative AI solution, and be even more effective than if you were paired with a human. We surely don't want to pair two computers together, though maybe that'll happen sometime in the future. Does that make sense to you? Is this kind of a new vision of pair programming?

Rodrigo Ceballos: That's how I feel when I work with an LLM on programming tasks. For those who don't know, in pair programming you usually have a driver, the person writing the code, and a navigator, the person who, because they're not in the weeds writing the code and concentrating on that, can have a slightly laggy, higher-level view of what is being built and has a little more time to think about the consequences of some of the decisions being [00:26:00] made while the code is written. And I think, in the way we're using it now, the LLM is the driver and you are the navigator. I think that's a great analogy for what it feels like to pair program with an AI.

Jon Prial: So for our audience here, if you've got a programming team, which hopefully all of our listeners do, there are tools to be leveraged to make you more efficient, to get more things done faster. I think the key is to see what you have and get started. Rodrigo, thanks so much for your time. It's been a pleasure.

Rodrigo Ceballos: Thank you, Jon. Thanks for having me.

Other Episodes

Episode 87

November 25, 2019 00:26:31

Episode 87: Trust and the Data Scientist's Dilemma

Is it unethical if your company doesn’t do its due diligence on avoiding bias? In this episode, Jon Prial talks with Kathryn Hume, the...


Episode 102

November 25, 2019 00:18:41

Episode 102: Even with AI, Great Is the Enemy of Good Enough

We’re all tempted by the newest, shiniest technology when looking for a solution to a problem. But often the best results come from proven...


Episode 2

February 07, 2020 00:27:17

Episode 115: Ghost Work: the Hidden Workers of AI

The Gig economy. We know what that means. Outsourcing jobs. We know what that means. Working remotely. We know what that means. But what...
