ESI Interviews

Ep 34: How to Train Your Generative AI with Priceline CTO Marty Brodbeck

Guest
Marty Brodbeck
February 14, 2024
32 MIN

On the 34th episode of Enterprise Software Innovators, host Evan Reiser (Abnormal Security) talks with Marty Brodbeck, CTO of Priceline. Priceline is an online travel agency that enables users to book hotels, flights, and rental cars across 400 airlines and 300,000 hotels in over 200 countries worldwide. Marty shares thoughts on organizational frameworks to implement AI effectively, underestimated use cases for generative AI, and how Priceline’s new chatbot, Penny, uses the latest generative AI capabilities to transform the customer experience.

Quick hits from Marty:

On the future of AI in business: “I firmly believe that... a lot of the biggest innovations [in the enterprise] are going to be the application of machine learning and generative AI to scale out infrastructures.” 

On the iterative development of AI applications: “These large language models don't work right out of the box. There's a corpus of information that they get trained on that isn't necessarily accurate, so you have to complement the capabilities you're going to get from a large language model with your own data.”

On AI for personalization and customer interaction: “Number one, it's the iterative nature in which we built our prompts for Penny and tested those to get it to a point where it was conversion positive... And two, being able to iterate on prompts to get them right so that Penny is responding in the appropriate way that's valuable for our consumers, those are the two big things that we learned."

Recent Book Recommendation: Going Infinite by Michael Lewis

Episode Transcript

Evan Reiser: Hi there, and welcome to Enterprise Software Innovators, a show where top tech executives share how they innovate at scale. In each episode, enterprise CIOs share how they've applied exciting new technologies, and what they've learned along the way. I'm Evan Reiser, the CEO and founder of Abnormal Security.

Saam Motamedi: I'm Saam Motamedi, a general partner at Greylock Partners.

Evan: Today on the show, we’re bringing you a conversation with Marty Brodbeck, CTO of Priceline. Priceline is an online travel agency that enables users to book hotels, flights and rental cars across 400 airlines and 300,000 hotels in over 200 countries around the world.

In this conversation, Marty shares organizational frameworks to implement AI effectively, underestimated use cases for generative AI, and how Priceline’s new chatbot Penny uses the latest generative AI capabilities to transform the customer experience.

So Marty, thanks so much for joining us. To start, do you mind giving the audience a brief overview of your career? Maybe how you ended up in your role at Priceline? 

Marty Brodbeck: Yeah, sure. So I was going to be a lobbyist, and I had an interest in technology and computers and was doing some programming on the side before I graduated.

And then I landed a job on Wall Street in investment banking tech, working as a C++ programmer doing mortgage-backed securities design and development for Nomura Securities. Went on to do electronic trading platforms for Deutsche Bank. You know, was designing equity trading platforms at Merrill Lynch.

Then I totally pivoted my career. I went, uh, to go work at a pharmaceutical company called Pfizer. Was their chief architect for four years and their chief technology officer for another four and a half to five years. And then I pivoted again, went into consumer products, was the chief technology officer for Diageo.

Did a stint at Pearson as their Chief Technology Officer. Uh, went to work for a startup company in the background check business that was sold to Goldman Sachs. And then was the Chief Technology Officer at Shutterstock for close to three years. And now I'm the CTO of Priceline, and I've been here close to five.

Evan: Well, you've had a very storied career at many notable companies. Um, I feel like everyone listening probably, uh, knows what, uh, Priceline is, but do you mind just kind of giving an overview for maybe the small percentage of people that, that don't know kind of what, what Priceline does and, you know, kind of how you help out your customers?

Marty: Yeah. Priceline is an online travel agency and, uh, we provide a technology platform for our customers to find the best travel deals across flights, hotels, rental car, and packages. So our job is to provide consumers with innovations in the product space for them to maximize the value in deals and cost savings they can get by buying our products.

Evan: Is there any like particular, um, technology initiatives or some upcoming features that are using, you know, novel technology that you just feel particularly excited about or maybe proud of? 

Marty: Yeah, we launched a product in the summer called Penny, which is our AI chatbot, and we've embedded Penny across our checkout flow, our post-booking and customer care capabilities, and also we're now building out some trip planning capabilities. Penny's a generative AI component that we've built out, but it also leverages reinforcement learning. It also leverages capabilities in high performance computing, such as caching. And it's very intuitive and learns the more it interacts with our customers and consumers across customer care and our checkout flow, and we'll be launching something in the trip planning space in the next couple months.

Evan: So I know Priceline probably has had a longer history in machine learning AI than probably most people realize, right? And it is, you know, I imagine it is a, it's kind of always been a big data company, right? Consuming all these different data points from, you know, different kind of travel partners as well as, you know, customers.

But as you've started to use some of these new generative AI technologies, has that required kind of a shift in the technology strategy, architecture, or even the makeup of the team? Like, how have you thought about preparing and enabling the organization to best utilize some of these new technologies?

Marty: Yeah, it's a great question. I think when we were, when we were moving, uh, to the cloud a couple years ago and we were refactoring all of our applications to be 12 factor running on Docker and Kubernetes, we also took a hard look at how we need to refactor our data to be, what I would say, a real time company, and in doing so we put a lot of investment into really redoing our data pipeline and how we capture customer information and for it to be real time.

So, that was like a foundational capability for us to get to a point where we could do generative AI and advance a lot of our machine learning capabilities because a lot of the data that feeds Penny, and is also feeding, uh, other parts of our generative AI capabilities, is fundamentally built on getting real time information from our customers and our consumers. So that was a very big deal. 

Two is building out a customer data platform where we had insights into what people were searching, booking, and calling in about from products that they bought. So building a customer data platform that gave us insights into our customers around that, and enabling that information to be real time, was paramount to get us to a point of launching a lot of our generative AI capabilities. Because without that, you can't really learn at a very fast trajectory.

And also, we do a ton of AB testing, so getting information in real time about how any of these generative AI capabilities are impacting our conversion rates was paramount for us to be able to move pretty quickly in this space.

Evan: You know, most conversations about AI, people get very excited about the algorithms, or the kind of training techniques, or the applications of where AI is actually used.

But I imagine for your business, like a lot, some of the key foundational work, right, is getting the data architecture and the data structures, you know, correct. Like when you, when you think about some of the progress you've made, is there something that stands out as kind of disproportionately important from a, you know, architecture perspective that has enabled you to launch these type of capabilities like Penny?

Marty: I think it's really two things. I think number one, it's the iterative nature in which we built our prompts for Penny and tested those to get it to a point where it was conversion positive, rather than having a negative impact on our consumers. Two, I think the big piece is our customer data platform that understands who the customer is, which enabled us to do any kind of real-time personalization and recommendation based on who that person was. I think those two pieces have been paramount in terms of getting generative AI right for us. Because you want to have context about who the customer or consumer is. And two, being able to iterate on prompts to get them right, so that Penny is responding in the appropriate way that's valuable for our consumers. Those are the two big things that we learned.

I think the third is, you know, these large language models don't work right out of the box. There's a corpus of information that they get trained on that isn't necessarily accurate, so you have to complement the capabilities you're going to get from a large language model with your own data, and reduce the hallucinations that can happen when you marry the two together.
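Grounding a model in your own data, as Marty describes, is often done with retrieval augmentation: fetch the most relevant internal documents and prepend them to the prompt so the model answers from known facts rather than its training corpus alone. A minimal sketch; the scoring function, document snippets, and prompt wording are all illustrative assumptions, not Priceline's actual implementation:

```python
# Minimal retrieval-augmented prompt assembly: ground an LLM answer
# in your own documents instead of relying on its training corpus.

def score(query: str, doc: str) -> int:
    """Crude relevance score: count shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_grounded_prompt(query: str, docs: list[str], top_k: int = 2) -> str:
    """Rank docs by relevance and prepend the top_k as context."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(f"- {d}" for d in ranked[:top_k])
    return (
        "Answer using ONLY the context below. If the context does not "
        "cover the question, say you don't know.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical internal knowledge snippets.
docs = [
    "Cancellation fee for Express Deals: non-refundable after booking.",
    "Hotel check-in time is typically 3pm local time.",
    "Rental cars require a credit card in the driver's name.",
]
prompt = build_grounded_prompt("What is the cancellation fee for Express Deals?", docs)
```

In production this word-overlap scorer would be replaced by embedding similarity over a vector store, but the shape of the technique is the same: retrieve, then constrain the model to the retrieved context.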

Evan: I mean, the first thing is like, it sounds like you're pretty clear on what the objective function is, right? You're trying to maximize, you know, conversions, right? As a kind of proxy for like, is it actually helpful for people finding the right travel offers? The right, the right kind of, you know, packages for them.

But then there's additional layer of fine tuning that you're doing, right? Using kind of historic queries and prompts or like, you know, how do you actually go through that fine tuning process to make sure it's actually effective in, in the context of like, we're using that as an application? 

Marty: Yeah, so it goes back to our, you know, in general, you know, it goes back to our AB testing strategy.

So, like when we launch any feature, whether it's generative AI or not, we have a metric that we use called NIPBD, which is how is this feature driving net new incremental bookings per day? And so that's the premise. Like, anything that we do, it either has to drive more value and revenue for the company, or it has to be, you know, cutting out some operational cost or, you know, making us faster, smarter, better. So that's the premise. 

And so when we were originally designing Penny, you know, we looked at, okay, let's launch this bot with the prompt. And then we were seeing, as our AB testing works, that the first couple of iterations of the prompt were actually impeding customers from actually buying our products.

And so you keep on learning what the customers are asking for and where we failed, through logs that are captured through Penny, and then you start feeding those logs into Penny to make her smarter. To the point where you're iterating the prompt again and again and again, till you get to what I would say is a neutral NIPBD, where okay, this prompt is now not impacting bookings in a negative way. How do I iterate it more to get to the point where it's positive? And so we had to go through like 20 iterations of this, but I think the key is really two things. Number one, to have really good cycle times from a CI/CD deployment perspective, where you can turn out a feature and push it out really quickly and do blue-green testing and AB testing. That's the first thing.

Second thing is this notion of reinforcement learning, where you're looking at the logs of how your generative AI components are interacting with the customers, figuring out what they're asking for and where you're failing, and then applying that back into the models to make them smarter.

So for us, you know, when we were storing that log information in something like Splunk, we were getting insights into what consumers were asking for with Penny, how it was converting and not converting, and just having that log information with some level of analysis was hugely beneficial for us.
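The loop Marty describes, iterating a prompt until its NIPBD goes from negative through neutral to positive, can be sketched as a simple decision rule. NIPBD is the metric named in the conversation; the function names, numbers, and noise threshold below are illustrative assumptions:

```python
def nipbd(variant_bookings_per_day: float, control_bookings_per_day: float) -> float:
    """Net incremental bookings per day (NIPBD) of a prompt variant vs. control."""
    return variant_bookings_per_day - control_bookings_per_day

def decision(delta: float, noise_floor: float = 5.0) -> str:
    """Classify a prompt variant by its NIPBD delta:
    clearly negative -> iterate again, within noise -> neutral
    (not hurting bookings; keep refining), clearly positive -> ship."""
    if delta < -noise_floor:
        return "iterate"   # prompt is impeding bookings
    if delta <= noise_floor:
        return "neutral"   # no longer hurting; push toward positive
    return "ship"          # conversion positive

# A variant that adds ~12 bookings/day over control clears the noise floor.
outcome = decision(nipbd(1012.0, 1000.0))
```

A real A/B platform would add significance testing over the booking counts before declaring a winner; this only captures the iterate/neutral/ship shape of the loop.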

Evan: When you talk about, like, the prompts, presumably there's some, like, prompts coming from the user, but it sounds like you're also using some sort of context from that customer data platform about their preferences for, I don't know, luxury versus budget, or maybe even travel destinations. Like, are you, you're going to weave that into, like, the, the prompt or there's kind of like, um.

Marty: Yeah, the way I would describe it, it's like, it's almost like object oriented programming. Where you have this universal prompt that is how you want a generic conversation to happen with your end user. And then based on who the end user is, there's several contexts for the travel identity of that user.

They could be traveling alone. They could be traveling with their significant other, or they could be traveling as a family. So then the prompt expands into those three different dimensions. And then those three different dimensions are, are funneled into, Okay, so what is this user booked in the past? What is, what have they booked as a single traveler or traveling with a companion or what have they traveled as a family? And then from there, you go one step further down the prompt chain, well, if this person was traveling as a family member, what are some of the things that they booked in the past or searched around that profile?

So you start with essentially a master class of the prompt, and then the prompt gets fed down into the different personas of the traveler that could be interacting with Penny. And then underneath that, there are several other prompts. So it completely expands, it could expand to infinity.

You're constantly iterating these things, but that's actually a very good thing because what it enables you is to have one to one personalization with the customer, which is what you want. And so that's the way that we've approached it.
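The layered, "object oriented" prompt structure Marty outlines, a universal base prompt specialized by traveler persona and then by that user's history, can be sketched like this. The three personas come from the conversation; the prompt text and helper names are invented for illustration:

```python
# Universal base prompt, specialized per traveler persona, then per user.
BASE_PROMPT = "You are Penny, a helpful travel assistant. Be concise."

PERSONA_PROMPTS = {
    "solo": "The user usually travels alone; favor flexible, budget-friendly options.",
    "companion": "The user travels with a partner; favor romantic getaways.",
    "family": "The user travels as a family; favor kid-friendly hotels.",
}

def build_prompt(persona: str, past_bookings: list[str]) -> str:
    """Compose base prompt -> persona layer -> this user's booking history."""
    layers = [BASE_PROMPT, PERSONA_PROMPTS[persona]]
    if past_bookings:
        layers.append("Past bookings: " + ", ".join(past_bookings))
    return "\n".join(layers)

prompt = build_prompt("family", ["beach resort in Orlando", "SUV rental"])
```

Each layer only adds to the one above it, which is what keeps the expansion manageable: new personas or context sources slot in as new layers rather than new top-level prompts.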

Evan: How do you actually operationalize that so it works for a much larger product organization, right? Do you have part of the team working on the agent-instructions part of the prompt, and a separate one working on the personalized context?

What have you found successful in enabling a larger team to coordinate and collaborate, you know, building out more complex cases of applied AI?

Marty: Yeah, I mean, I think the way to keep it simple and to scale it is to keep the team small. I mean, when we were building out Penny, the core team was, you know, five developers, one programmer, one product manager, and one quality person.

And so actually, to scale it out, you go small to go big, in other words, because the iterations of the prompt have to happen so quickly to turn it around. That was the first thing. The second thing is, the team was very good at having an established architecture and pattern for scaling out the prompts, similar to what I was talking about with object-oriented programming.

The same thing applies when they were developing this: there's a certain level where you need to outline how the prompt should function for a particular persona, as we call them, which is a single traveler, a person traveling with a companion, and then someone traveling with a family.

Now, on the flip side, what makes this problem interesting is the products that they could buy. Based on the prompt, there are different vectors of what the product could be, whether that's a romantic getaway hotel, a hotel in the mountains, a beach hotel. I mean, all these things factor into basically infinite personalization. And I think that's what, quite honestly, consumers are demanding.

You know, personalization capabilities, I remember back in the old e-commerce days, were very static. Now we're getting to a point with all these capabilities, whether it be personalized medicine, personalized products, or personalized clothing, where we have the ability to actually personalize and recommend tailored products right within the experience.

You know, and all the infrastructure stuff that we talked about before with cloud and with real time data. And now with like vectors and what we're doing on the prompts, it makes it quite easy to scale. And quite frankly, you don't need to have large teams in order to figure these things out. 

Evan: Priceline has always been ahead of the curve when it comes to understanding users, doing experimentation, building personalized offers and personalized content, so I imagine there's a lot of infrastructure you've built up that you're now building on top of to enable that velocity of innovation and product development. Is there any advice you have for your peers out there about the fundamental building blocks that are really important to put in place, that will enable future high-speed product development using some of these new AI technologies?

Marty: Yeah, I think there's really four that we've focused on over the last five years.

When I started with the company about five years ago, we were in data centers. We, you know, weren't focusing enough on developer productivity, I would say. And, you know, we weren't, we weren't as fast as we are now, which is really, you know, a credit to, to the team. So I think the first thing is, as a CTO, I, we're obsessed with developer productivity.

We look at our software development life cycle as one of the most mission-critical processes in the company, and we're constantly looking at how we can make developers faster. So, you know, we've been religious about making investments in that space. Those could be anything from automated functional testing to now looking at how to use generative AI to do auto-completion of code and to review key pull requests going into our build cycles. So that's the first thing. Two is, we made a huge investment in cloud. We refactored all of our applications to be microservices-based, running as a 12-factor application architecture pattern on Docker and Kubernetes.

The reason why that's important is it isolates the team's ability to focus on a particular part of our business and actually churn out features and functions quickly, without having any interdependencies on other teams. I mean, one of the biggest things I always hear is, oh, we have a monolithic code base, it's very hard to do releases because there are so many interdependencies. Well, at some point you're going to have to modernize. It sounds like a no-brainer, but you know, 70 to 80 percent of tech budgets are still spent on maintenance and support. So at some point you've got to bite the bullet and modernize, which we did.

Three is, you know, real time data is critical, particularly for an e-commerce company that has 50 million users on the platform per month. You know, understanding what they're doing, what they're buying, what they're searching is paramount for us to drive new product features and innovations.

And then fourth thing is we have a culture of failing fast. We built our own AB testing platform called SETI, and that is really the glue for us that drives all of the testing that we do around product features. So I would say those are the four big things for us that we have focused on over the last several years in order to go much faster.

Evan: You mentioned earlier, it sounds like you might be using AI to improve, like, the operations of the business. I think you mentioned using Copilot or some kind of code completion tools. What has worked for you, right? Have you found examples where you've been able to incorporate some new technologies, whether it's AI or otherwise, into that software development lifecycle?

And are there areas where you've seen, kind of, success, where it's been quite productive? 

Marty: Yeah. On the developer productivity side, we're partnering with a company called Codeium that is really a large language model for, you know, software development. So it does code completion and auto-generation of code.

So that's been something that we've been rolling out, and we have a goal for next year where we want to make our software developers 20 percent more productive. And for me, generative AI is not going to eliminate the need for a developer. Quite frankly, it's the opposite: it's going to make that developer much more efficient and effective, where you can have the generative AI components work on support and maintenance or bug fixes, and you can really have your developers focusing on large value-added features. So that's the first piece.

Second piece: we're also looking at how generative AI can automate QA and quality assurance. We have an organization called SDETs, which are software development engineers in test, and we want to make those folks more efficient and effective. So how can we use generative AI to automate functional testing, integration testing, user testing? We're spending a lot of time on how we can use generative AI to automate capabilities in that space. And then there's still a lot of innovation on the infrastructure side. I know everyone's talking about, you know, generative AI, but there are sophisticated ways in which you can make the end user experience faster.

Like, how do you do edge caching for APIs or data when we're serving people in 150 markets? Anytime that we can make our website faster is an innovation. And then we're spending a lot of time in the server-driven UI space, looking at how we can start working off of one code base across iOS, Android, and desktop, you know, to standardize our core customer journey across those three channels, and then have our native apps teams focusing on value-added features that are really native to the device, as well as for our desktop folks. And in the Kubernetes management space, it's still very early innings in terms of how to do auto scaling and how you shift those responsibilities into the hands of the developers versus having traditional DevOps teams.

One of the things I've learned in my career is that when you start creating these specialized groups like DevOps or quality assurance, it speaks to an area within your company that's not fully automated yet, where you're deploying people. But at the same time, it's an area where you most likely will figure out a way to automate over time, which has become the case here: there's a company called Komodor that we're using that allows the developers to manage much more of the Kubernetes infrastructure than the DevOps team.

And that, that doesn't mean the DevOps team goes away. It just means that they can now work on more high value added tasks than managing your clusters or your helm charts to do auto scaling. 

One of the things that's so basic with these technologies, and I don't think a lot of people are focusing on it. Everyone's so focused on the cool or the sexy use cases, but mean time to discover and mean time to resolve in the data center is a huge deal for revenue, and having generative AI make a huge impact on that, I think we're still very early on. You know, you have a corpus of information. In our case, we have 20 years of information on how the website has performed and run, and you have all this eventing information. Your typical APM vendors, like New Relic or Splunk, quite frankly don't have offerings in this space yet.

There are still large infrastructures that rely on humans to determine what the issue was and how to resolve it. And so, you know, a basic use case where you could use generative AI is where you just need to, you know, set up a pod again, or shift scale, or reboot this piece of infrastructure, or: I see you have a memory leak in your database, you might want to just recycle this machine. All those things are still somewhat manual.
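The "low hanging fruit" Marty is pointing at, recognizing a known incident signature in the logs and mapping it to a known remediation, can be sketched as a simple lookup. The signatures and remediation actions below are illustrative, not an actual runbook:

```python
from typing import Optional

# Known incident signatures mapped to known remediations (illustrative).
REMEDIATIONS = [
    ("oomkilled", "restart pod with a higher memory limit"),
    ("memory leak", "recycle the database machine"),
    ("connection pool exhausted", "scale out replicas"),
]

def suggest_remediation(log_line: str) -> Optional[str]:
    """Return the first remediation whose signature appears in the log line,
    or None so an unknown issue escalates to a human."""
    line = log_line.lower()
    for signature, action in REMEDIATIONS:
        if signature in line:
            return action
    return None

action = suggest_remediation("pod api-7f2c OOMKilled after traffic spike")
```

A generative model trained on years of root-cause analyses would generalize beyond exact signature matches, but the fallback to a human on unknown issues is the part worth keeping either way.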

And those use cases are, in my mind, just like the low hanging fruit that's not being addressed at all. Like, customer care is another area where we're spending a lot of time where, you know, humans are still involved in answering questions to a certain extent, whereas, you know, when now we're starting to roll out Penny, and having Penny handle as many customer care cases as possible until it gets to a point where it's just so complicated that they can't do it.

Creating marketing content is another where, you know, you have humans creating all this content. You could have the robots actually, or the generative AI capabilities doing it. 

Evan: And for a lot of these things, like the kind of the bottom 25 percent of sophistication, like can clearly be done, right? Even in the, um, infrastructure debugging, right?

I'm sure all of us have seen an error message at one point, typed it into Google, found the first Stack Overflow result, and, like, started our investigation there. Right? Certainly there's some better generative AI version. It's not going to solve every problem, right? But for, again, the 20th percentile of the most common things, right?

If you could aggregate all your historic data and kind of cross company data, right? You could probably find some patterns, right? When you see these four events fire, right? 80 percent of the time it's because of this symptom. 

Marty: Exactly. You know, we have a large corpus of like we do these root cause analysis meetings every time there's like a critical issue and we have such a huge corpus of information that you can start leveraging, you know, that information to do a lot of infrastructure automation.

I firmly believe that, you know, in the next couple of years, a lot of the biggest innovations from the infrastructure side are going to be the application of machine learning and generative AI to scale out infrastructures. And everyone's focused over here on, like, the sexy chat bot or information mining, but at the end of the day, we're an e-commerce company, we run a large technology platform, and we want to get better and faster and more efficient in operating and running that platform. And that's where I feel there's still a lot of room for, you know, generative AI and these technologies to play a major role.

Evan: Is there an application of AI that you feel like most people underestimate, right?

One where you're bullish on the opportunity for impact, that you think other people don't fully appreciate?

Marty: The two things that I see. I think number one is the call center agent or customer care side, which is a huge part of our operating costs.

There's nothing smarter than a learning bot that, you know, can go through all the calls that have ever happened within your company, and marry that with all the information around how a product actually operates or functions, or all the fees. No human can ever understand all of that. So, you know, there's a huge opportunity on the customer service, customer care side of the equation.

I think it's bigger than the conversion side of our business, where we can get a lot of value add, and people are still talking about using this to create things, when generative AI can really automate things. So I think the customer service, customer care side is a huge untapped space. And then two, I think the other one is, you know, modernization of technology infrastructures.

I'll give you a great use case. When I was at Shutterstock, we had a huge problem in moving from a Perl stack to a modern React stack. And the problem was reverse engineering 20 years of code to actually operate in React. And so, you would have to hire a consultant or people. These large language models can figure out how to refactor your code better than a human can.

So accelerating the modernization of legacy codebases is another area where there's not a lot of focus, but there's a huge opportunity. You know, I'm a very data-driven person, and if you look at large Fortune 500 companies, still a large portion of the tech budget is on support and maintenance.

They can't figure out a way to unlock those dollars to put it towards innovation. And I think generative AI and AI in general is a great opportunity for those companies to unlock their maintenance and support budgets and, you know, shift those to innovation by applying these capabilities. 

Evan: That's a great technology leadership insight. Okay, we got, um, I'm probably super over time already. So, um, let's just kind of switch to a quick, uh, lightning round. So, you know, for the last five minutes, I'd like to do a couple of just, you know, just kind of shorter, punchier answers. So looking for like the, the one tweet version, I know these are going to be very hard questions to answer in one tweet, so feel free to, you know, take your, take your time. 

Okay. First question, how should companies measure the success of a CTO? 

Marty: Employee engagement, that's the first thing, the ability for a CTO to drive increased revenue and drive bottom line EBITDA growth, that's the second thing. And the amount of features and functions that they can actually churn out that drive value added insights for customers.

Evan: What is one piece of advice you wish someone told you when you first became a, you know, CTO? 

Marty: The key to being a great CTO is to have a high level of emotional intelligence because every engineer is completely different and not being able to connect with people on a personal or an emotional level leaves you with the inability to inspire, motivate, and have frank conversations around performance.

Evan: That's probably good, good advice for any leadership role at any level. So maybe switching gears to their personal side, is there a book you've read that's had a major impact on you or your leadership or how you think about leading organizations? 

Marty: The one book that I'm reading right now, Going Infinite from Michael Lewis, about FTX and Sam Bankman-Fried, has made me think a lot about emotional intelligence and, you know, what it takes to actually run a very successful engineering and technology organization. I recommend anybody to read it, because it's very engrossing, and there are a lot of leadership insights in how Sam Bankman-Fried, you know, operated as a person, and seeing some, you know, flaws in his overall leadership style early on was very insightful.

Evan: Okay, final question, which I know is probably a tough one. Um, I know it'd be hard to do in the one tweet format, but um, what do you believe will be true about technology's future impact in the world that most people would consider science fiction today? 

Marty: The ability to calculate space travel in a way that is not impeded by the limitations of our existing infrastructure today.

So, um, you know, space exploration to Mars, and, you know, traversing galaxies, I think, while it may sound science fictiony today, by the time I die that will potentially be a reality.

Two, the auto-generation of body parts and limbs, and being able to artificially grow hearts, arms, legs, eyes. While that may sound like science fiction today, I think that'll be a reality in the near future as well.

And then three, I think climate change, you know, is a reality. I think there are going to be millions of people displaced in the next decade just from, you know, water levels increasing and rising. And, you know, I think technology is going to be a great enabler for combating climate change in the foreseeable future. So, you know, I'm super excited about that as well, but cautiously optimistic.

Evan: Well, the, the, uh, astrophysicist and science fiction nerd in me, uh, has to resist asking follow ups here because we're out of time. But, um, Marty, I did really appreciate you taking time to join us today and I'm looking forward to chatting again soon.

Marty: Yeah, thank you. I really enjoyed it, Evan. So thanks.

Evan: That was Marty Brodbeck, CTO of Priceline.

Saam: Thanks for listening to the Enterprise Software Innovators podcast. I’m Saam Motamedi, a general partner at Greylock Partners.

Evan: And I’m Evan Reiser, the CEO and founder of Abnormal Security. Please be sure to subscribe, so you never miss an episode. You can find more great lessons from technology leaders and other enterprise software experts at enterprisesoftware.blog.

Saam: This show is produced by Luke Reiser and Josh Meer. See you next time!