Your AI Injection
Is AI an ally or adversary? Get Your AI Injection and learn how to transform your business by responsibly injecting artificial intelligence into your projects. Our host Deep Dhillon, long-term AI practitioner and founder of Xyonix.com, interviews successful AI practitioners and domain experts to better understand how AI is affecting the world. AI has been described as a morally agnostic tool that can be used to make the world better, or harm it irrevocably. Join us as we discuss the ethics of AI, including both its astounding promise and sizable societal challenges. We dig in deep and discuss state of the art techniques with a particular focus on how these capabilities are used to transform organizations, making them more efficient, impactful, and successful. Need help injecting AI into your business? Reach out to us at www.xyonix.com.
Expert Tips for AI Implementation and Data Strategy with Paul Lewis
In this episode of "Your AI Injection," host Deep Dhillon chats with Paul Lewis, CTO of Pythian, to explore the frontier of AI innovation and the practical challenges that enterprises face when implementing AI at scale. The two discuss the evolving needs of businesses as they transition from basic data management to advanced AI applications, and Paul talks about how Pythian helps organizations refine their AI ideas, prioritize projects with the highest ROI, and navigate the complexities of integrating AI with existing systems. The discussion also covers the importance of grounding AI initiatives in solid data foundations, the strategic benefits of utilizing embedded AI features in commercial software, and the critical steps to ensuring that AI projects are innovative, sustainable, and scalable.
Learn more about Paul here: https://www.linkedin.com/in/paullewiscto/
and Pythian here: https://www.linkedin.com/company/pythian/
Learn more about implementing AI in your business:
[Automated Transcript]
Deep: Hello, I'm Deep Dhillon, your host, and today on Your AI Injection, we'll be exploring AI strategy and data management with Paul Lewis, CTO at Pythian. Paul brings over 30 years of experience in the technology sector with a focus on helping organizations navigate the challenges of AI innovation. Paul, thank you so much for coming on the show.
Paul: Thank you. Now that you say 30 years, it makes me sound very old.
Deep: I've been around for about that time. I just say decades. Why don't we get started? Can you maybe tell us what is a typical customer for you? Why do they call you in? What does the interaction look like? And are they coming to you with an explicit AI need? And maybe if so, how is that articulated?
Paul: Sure. Customers come to us for either data needs or AI needs.
Like, it all depends on the situation. If it's a data need, it's either: I have a traditional database and I don't have the skill set to manage that, or I have a severe performance or availability problem and I need support for that, or, um, I've been asked to do 30 things and I only have enough people to do five of those things.
So they need some scale, right? On the AI front, it's mostly: I'm well read on the topic, I have 400 ideas in my head, help me narrow it down to the five that make the most sense. And then potentially help me implement the first one through N, and as I implement one through N, I'll build my own skill set.
Deep: So would you say your typical customer already has a data warehouse in place that existed, maybe evolved during the, quote, big data era, and they started aggregating, you know, across all their different projects?
Paul: I would say 50-50. So 50 percent of the customers were more mature, in that they would have a data warehouse or data lake.
It might be a homegrown one, so it might be a traditional database, an Oracle database that they're renaming a data warehouse, or they might have implemented Snowflake and Databricks or a native implementation of BigQuery, things like that. The other 50 percent, you know, as a percentage, IT is small, right?
They might have 2 percent or 1 percent or half a percent, and therefore, you know, a billion-dollar organization with a dozen people in IT. So they're not really that advanced in the actual data management or analytics world. And then we need to effectively start from scratch, right? Create a data platform for them, and from that, start to create value.
Deep: Got it. And like on the machine learning AI side, are they typically customers where this is their very first AI project, or do they already have data scientists in house and maybe they don't have access to them for this project,
Paul: Same split. The bigger the organization, the more likely they have a full-blown data team, and they might also have a full-blown data science team.
Yeah, they have multiple models in production. They're using it, because we do a lot of CPG and retail, for a lot of basket analysis as an example. And then the other half, a lot of their data work happens outside of IT. Maybe it's the marketing team doing marketing analytics on a customer data platform versus an enterprise data platform, right?
So they really only have the business analysis skill set versus the data analysis skill set, comparatively. So they're looking for us to say: I have a bunch of what we perceive to be data-centric problems, or even generative-centric problems. Can you match the tools to the problem?
Deep: How often are these kind of AI problems core to their product lines? And how often is it more like, hey, we just have a complicated set of customer analysis that needs to be done, and it's not necessarily viewed as core IP to the core product lines?
Paul: It's a brilliant question, because the math is pretty consistent every time I do a workshop.
So it's usually: let's whiteboard 400 interesting ideas. 350 of them are analytics, dashboards, reporting, right? Things I can do with traditional tooling; they just didn't know they could do that. Right? Uh, of the remaining 50, 30 of them would be traditional machine learning, right? Segmentation, classification, you know, multimodal implementations if they needed to.
And then the last 10 are generative in some way, but five or more might not actually have an ROI, right? They are complex. They don't have the data. Uh, they don't have an appropriate, you know, defined LLM that would be helpful for them, or it requires too much grounding, or, um, they have intellectual property problems.
They can't use their customer data, as an example. Therefore we narrow down to the five that remain. And of those five, they're either enterprise search or kind of a chat-with-my-data process, uh, which we can easily implement with existing tools, or use embedded functions of the tools that they already have.
So we take that big list of 400. We can solve all of those 400 with them, but when it comes down to a generative implementation, it narrows down pretty quickly.
Deep: So maybe we start with the problem formulation. Well, actually, let's even back up before that. How do these people even know about you and come to you? If they have 400 AI ideas, that sort of implies they know something about AI, so how did they get there, and then how did they call you? And then maybe let's walk through: what's your actual process for finding out and identifying those AI opportunity candidates, and how do you help them think through which ones to pursue and which ones to keep?
Paul: Sure, sure. So our brand is very, very data centric. Pythian is based on Pythia, who was the high priestess of the Oracle of Apollo. We were an Oracle services company, right? That was the origins of the company. So we've been doing this for 27 years-ish. Our skill set is managing the actual core implementation.
So we manage 25,000 databases and 2,000 environments for 400 customers. That's a broad appreciation for the main environment of our customers. From that, since we operate their core environment, we know that we can create insights from those core environments. So they'll come to us for the data warehouse, data lake, analytics side.
And then from the analytics side, they extend that into the machine learning slash AI side. That's kind of the progression we take. It tends to be IT-centric versus business-centric. However, what we have found in the AI world is that most of the news they're reading and most of the invention that they're experiencing is on the consumer side, right?
Because everybody's had a chance to download and use ChatGPT, as an example. They might even subscribe to the $10-a-month version. So they know what's possible on the consumer side. And then everybody from the chairman down is saying, thou shalt implement this coolness in my organization, so let's figure out how we can do that. And part B of your question was: well, how do we get those interesting 400 ideas?
They are well read on fun, interesting use cases. They also have a series of problems they've yet to solve from an analytics perspective, and they group them all into one big bucket, which sometimes they call AI. What we do with them is say, well, let's put them on the board. Let's put them in those actual buckets.
Right? So describe to me this problem. This sounds like a dashboard. So let's put this in the analytics bucket. Uh, what you're telling me sounds like you're trying to draw a best fit line. Right? Let's put this in the machine learning bucket. What you're describing to me is the creation of an answer or a change to the customer journey to support, you know, an augmented interface.
That sounds very AI centric, so let's put it in that bucket. And then we look at the most difficult bucket, which is the AI one, because they tend not to have the skill set, and say, Let's put some simple ROIs on here, right? What's, what, what do you have? What do you own? What tool sets do you need? What kind of skill sets exist or don't exist?
And then let's find that happy ground of things you already know, tools you might already have access to, and skills you already might have. From those, we find the two or three we could actually implement in a relatively small period of time.
Deep: And tell me, like, what's your engagement look like during this phase, this kind of discovery phase of the project?
You know, is this thing elapsing over a couple of weeks, and, uh, the engagement is structured to produce a plan, you know, of what you're going to work on? Or is the relationship already kind of inked, and this is just a new project coming in on top of the database services work that you've already been doing?
Paul: It tends to be incremental to the work that we have, but certainly we get AI as sort of entrance work. Think of it as a four-stage process, and it usually follows these four stages. Stage one: education. So I will even personally come in and we'll do a two-hour what's real, what's not real, what's true, what's not true, what's fact, what's fiction, right?
Especially on generative AI, but AI in general.
Deep: yeah.
Paul: And because I do some academic work on the AI side, I can bring the academic perspective in too sometimes, which helps with the math. So we'll start there. That will lead to, let's say, the one-day workshop that will include the 400-use-case assessment. So we'll whiteboard those interesting ideas that you already have, and invent some new ones in the room.
That audience tends to be the same as the first audience. The first audience is pretty broad. I've had founders, I've had CEOs, I've had CMOs, CFOs, because there's a broad appreciation for its value. The second one, double clicking, tends to be the next level, right? The VP of finance versus the CFO, uh, the head of supply chain versus the COO, the PMM versus the CMO, as an example, because they'll have the interesting ideas. Because the outcome of that workshop is not only a sub-selection of the use cases, but then some sort of plan to say: if we're going to implement the first three, here's what a roadmap would look like. And then the very next step is the implementation of the first one, for the most part. And in a 12-week process, we will do a deeper dive on what that is.
We'll figure out what those data sources need to be. We'll figure out what models make sense. We'll ground, fine-tune, you know, do everything we need to, to appreciate whether this model is a good one or not, or valuable or not. Um, and then we'll help them with the application development if it requires pre-processing and post-processing.
So some implementations require a pre-processing step: is this a question I should be answering? And then post-processing: is this an appropriate answer to give? Which moves some of these projects away from the BI team into software engineering, because it becomes a software engineering project. In fact, when I talk to, let's say, a hundred CIOs and CTOs on a quarterly basis, they tell me their biggest mistake in AI or generative AI was giving it to the CDO, the data officer, when they should have been giving it all this time to the application engineering team, because it truly is an application engineering project,
not just an implementation of a dashboard.
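[Illustration: a minimal Python sketch of the pre-processing ("is this a question I should be answering?") and post-processing ("is this an appropriate answer to give?") gates Paul describes. The topic list, blocked terms, and the generate stub are hypothetical placeholders, not Pythian's actual implementation.]

```python
# Wrap any LLM call with a pre-check on the question and a post-check on the answer.
from typing import Callable

ALLOWED_TOPICS = {"vacation", "policy", "benefits"}    # hypothetical in-scope domain
BLOCKED_ANSWER_TERMS = {"ssn", "salary", "password"}   # hypothetical answer filter

def answer_with_guardrails(question: str, generate: Callable[[str], str]) -> str:
    # Pre-processing: is this a question we should be answering at all?
    if not any(topic in question.lower() for topic in ALLOWED_TOPICS):
        return "That question is outside the scope of this assistant."

    draft = generate(question)

    # Post-processing: is this an appropriate answer to give back to the user?
    if any(term in draft.lower() for term in BLOCKED_ANSWER_TERMS):
        return "I can't share that information. Please contact HR directly."
    return draft

if __name__ == "__main__":
    # Stand-in for a real model call, just to make the flow runnable.
    fake_llm = lambda q: "You have 20 vacation days per year."
    print(answer_with_guardrails("How many vacation days do I get?", fake_llm))
    print(answer_with_guardrails("Write a poem about pirates.", fake_llm))
```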
Deep: Is there a question in your customer's mind about whether this is an appropriate thing to have go outside of the company versus like how do they make that decision of developing in house versus out of house?
Paul: Definitely existing clients are typically okay because they're already using this.
Deep: You already have a long-term relationship with them. But is there an added sensitivity, because there's generally a perception that, you know, AI projects are core to the intellectual property of the business?
Paul: And it is rare to have the first few use cases be externally accessible. Very rare. In fact, it's just as rare to have the first few use cases go outside of embedded features
versus built features. In fact, one of the biggest educational exercises in the first workshop is: do you realize the vast majority of the implementation will be an embedded feature? Copilot within Microsoft 365, or Gemini within Workspace, or Copilot within GitHub, right? Or even in their core systems: Joule within SAP, right,
or Einstein AI in Salesforce. These are the, um, I'm going to use the word easier, but I don't really mean easy, right? These are the easier implementations of gen AI, because they already have access to your data. They're already part of the core environment. I don't need to build another portal, as an example.
Uh, they already comply with my regulatory and security guidelines. So all of the potential risks, in many ways, are solved. But even more importantly, it is likely a use case that is already based on the flow that you're using, right? So if part of the selling process is to build a forecast plan, and there's a feature in Einstein AI to build a forecast plan, then you just have to click the button in the exact same spot and process you'd click that button anyway. I don't have to invent anything, I don't have to create a change management problem.
Versus building something, which I have far more control over, and it might actually be, you know, more specific to my business. But I now have to put it somewhere. I have to make it available for people to use, and it may or may not be outside of their actual day-to-day lives. Right. Which means it's a change management problem.
Deep: Tell me a little bit more about how you're interacting with the teams. Like, are you interacting with the product team that already has the data, has a UX experience, whether it's an internal tool or an external application, and your team is sort of embedded with them? Or is it something else, where it's more like an IT group
that's servicing a bunch of teams in-house?
Paul: It tends to be an internal IT team, and we tend to convince them that these projects will only be successful if you create a fusion team, an implementation team that crosses the boundaries. And while they might have those other teams for, let's say, their ERP implementation, it becomes far more relevant in an AI implementation that has more risks associated with it.
Right, because I need a better appreciation of the business and my sort of customer contractual boundaries to know whether this is an appropriate use case or not. So that's kind of the first step. And then, um, beyond the sort of academic exercise, they generally don't have the expertise and skill set to build these things, or they might not even have the sort of prompt engineering skill set to know how to implement an embedded feature.
Right. So there's a lot of champion challenger trial and error with the features in a sandbox before they're even comfortable turning on the feature in production.
Deep: Got it. So your folks end up being that fusion team that you're describing, largely, but they must be paired up with somebody that can at least help navigate the social hierarchy.
Paul: Yeah, there'll be some IT, a marketing folk. The use case tends to be department specific, right? So there'll be a marketing use case, an operations use case, a customer service use case, even a finance use case. So one of those departments tends to be the lead on the use case, not the lead on the, uh, practice, but the lead on the use case.
One of the important things I do in the initial workshop is what I generally refer to as zooming out. One could easily be well read on AI and immediately think applications and agents, right? You immediately think these are cool things I can build with the tooling that's available. What you don't think about is that AI is much more of an organizational capability thing, right?
So if I zoom out and look at the entire organization, I'm saying, well, I've got bigger buckets here, right? I have an educational bucket. Maybe I need to look at Coursera and say, you know what? Let's create learning paths from the chairman of the board down to the administrative staff to help them understand the world of prompts.
Yes, they've been using ChatGPT, but they've been mostly using it as a displacement for Google, right? It's just been search results; they just see it in paragraph form, right? Well, how does the distinct difference between consumer and enterprise make me think about risks? That's the sort of learning plan, right?
The learning path. The next bucket is the build stuff, but it's not just go ahead and start building an interesting agent. It's: do I have the tooling available? Has IT built a bunch of demonstrations so that people can figure out how to actually use this tooling? Have we thought about the tools that we use in IT for things like application development, IDEs, right?
How do we enable those kinds of features within the build side? How do we do code conversion, as an example, or code optimization that's part of the usage of embedded features? And then finally, how do I give them an understanding of all the core software that they have that has embedded features? They use Salesforce.
They use Atlassian, they use Jira, they use Tableau, right? They all have embedded features, so let's enable those embedded features, including their productivity tools. So I kind of say
Deep: Embedded features, you mean in their internal dashboards, their...
Paul: Their internal, their commercial software that they already buy. Enabling those features is still an AI function
I have to do, right? In fact, it might solve some of those use cases, because they're already much closer to their day-to-day lives.
Deep: So part of your process must be to take an inventory of all of those existing stacks that they have various data in, and maybe understanding how they interact with their teams or something.
Paul: Correct. In fact, I use that AI capabilities map, that one slide. I've actually put together the Pythian version of it to say: this is actually Pythian's map of how we use AI internally, from education to embedded. And I use that as the starting point. I'm going to erase Pythian, I'm going to insert your logo, and say, here's the map.
So let's describe your organization based on this map. Let's take out Atlassian, let's put in ServiceNow. Let's take out, uh, Tableau, let's put in Power BI. So it becomes incredibly obvious that you have invested in a bunch of core systems, and we're going to put in those core systems. We're going to put in the tooling you're more likely to use, right?
Maybe you're a Vertex AI versus, you know, a GPT implementation. Great, let's plug those in. Or you're not a Coursera subscriber, you build your own education, right? Great, let's plug that in. And that's kind of where the zoom-out starts. Let's have this conversation by plotting what your capabilities look like.
Deep: So would you say that you're helping these IT orgs build demonstrative applications that they then encourage their other teams to go off and embed into their actual applications, or...?
Paul: I would say yes, I'm encouraging IT to look organizationally and to enable, not just create agents.
Right? So enabling means put as much effort into ensuring that the development tools exist for AI as you do building an actual use case. Because I can't just turn on the tool, Vertex AI, and hope people are going to use it. I've got to turn on that tool. I've got to build a demo of using that tool, and then teach developers in my organization to use that tool, right?
That has to occur. I can't just hope people understand, right? So that's sort of part one. And then, uh, we help them implement a few actual use cases, either embedded or built, so that they can see the whole end-to-end process, right? What it takes to go from interesting idea to implementable use case.
Deep: And so I imagine like, I mean, a lot of big orgs already have a, you know, a bunch of teams, some subset of which already are integrating some AI based features into their, into their work. So I imagine you're interacting with them and trying to like, maybe elevate the visibility of their approaches and successes, something like that.
Paul: Add it to the list, help them change the priority a bit. Sometimes they like the external opinion, right? If we're talking about an embedded feature in their viz tool, I can help them say, you know, the semantic layer is more valuable to you than, you know, a more dynamic dashboard. We help them with better-ROI implementations because we've seen so many of them.
Deep: Got it. What are some of the big questions they ask you when you're in there? Let's say after you've gone through this discovery process, after you kind of have maybe a stack ranking of their AI, I call them opportunity candidates, maybe after you understand a little bit about their team and their tooling and how they work. At that point, you know, what are some of the core things they're asking about?
You know, is it behavior analytics for their customer data? Like, what kind of questions do they ask here?
Paul: The big top use cases are clearly distinct for each department, right? But if I look at marketing, they're looking to say: what's the propensity to buy? Or basket analysis to say, uh, what's the next best product they're going to purchase based on what they've purchased in the past? Or even things like predicting, like in a grocery store chain, they might say: based on what they're buying now, predict what they plan on making tonight for dinner.
And once I can make that prediction, look in their basket to see if they're missing a product. And if they're missing a product, ping them to say, hey, there's 10 cents off flour, because we noticed you're missing flour. Those kinds of customer journey changes that they want to be able to do in real time,
it's very valuable if they're using, let's say, a mobile app or online to buy their groceries, right? Because then we can see this in real time. So those are the big use cases, right? Um, in finance, it's about forecasting. In supply chain, it could be as simple as determining which truck I should send to which address based on how big the package might be.
The package might be a letter. The package might be 30 cases of flour, right? Or apple juice, right? That makes a big difference in what the truck might be, and then it might make a difference in what truck you can use and what path to fill the truck. Because the goal of the logistics company is to fill the truck.
The goal of that company is not to send empty trucks all over the place, right? As an example. So there's lots of those internal use cases that are just as beneficial as the external use cases.
Deep: Maybe walk us through like, what does it look like from a, from your team standpoint, like what skills interact at what stage of the process?
So, you know, when you first meet up with them and you're assessing the AI opportunity candidates, who on your team is there and what's their kind of general, you know, sort of skill set background? And then when you're getting down to actually building out a recommendation model or some time series forecasting for finance, what does that team makeup look like? And maybe put it on a timeline for me so I can understand.
Paul: Sure. So I have a field CTO team. So myself and others on my team, who are ex-CIOs, ex-CTOs, um, have a mastery in AI strategy or technology strategy, maybe even had a consulting background.
We start that conversation. So it's us doing the academic workshop, the educational workshop, to say: what you thought it was, because you used a lot of ChatGPT, isn't really what it is in an enterprise sense, right? In fact, we do a lot on the difference between consumer and enterprise. We have that conversation.
The risks that exist in the enterprise are greater than the risks that exist in the consumer space. So we'll have a field CTO team with high-order, high academic mastery in some of the topics, right, that we'll bring to the table. That goes all the way to the workshop where we do the 400 use cases and narrow it down to the five.
Deep: And will that be one of them alone, out meeting with the customer, or is somebody else with them?
Paul: There's likely a sales team there, a customer engineer, somebody who has a business analysis background, who are reading the room, getting an understanding of priority, and, you know, documenting where they need to document. Sure. Certainly. Yeah. It can't be a single person doing that kind of stuff.
Deep: And so they're doing that one-day workshop, and the output is... are they following up after with a plan or something?
Paul: Yes, so the output of that workshop is the whiteboard of the 400.
It's the plan to implement the first five, in many ways, um, and an offering to actually implement the first two. Which leads us to the next project, which is implementing the first two to three. That team makeup is different, though there is consistency. So the CE, as an example, and the solution architect would be consistent amongst the first two.
But now the team makeup on the implementation side is people from our data science team. These would be actual data scientists, who likely have expertise in the industry, right? They might have a retail or CPG background, a healthcare or manufacturing background. So they already have the nomenclature in their head, right?
They already have an appreciation. We also have a data engineering team because the vast majority of our use cases require data from the core systems, which may or may not be available in, let's say, a knowledge base or a data warehouse. All right, so we've got to engineer something to support them. We tend to also have an infrastructure team, mostly because at least half of our clients don't have a, let's say a cloud based enterprise data platform.
And so we're starting from scratch in many ways. So not only do we have to do the integration side, the data pipeline side, but we also have to have the destination side, the data lake side. Or, or just a simple knowledge base, right? But that has to occur. And then we have a BA effectively assigned to it so that they can be the product owner, right?
In the agile methodology, to deliver on the particular use case for the customer. It tends to be somewhere between six to 12 weeks, depending on what exists. So if there is no knowledge base or enterprise data platform, it takes 12 weeks. If some of that prereq technology already exists, then it's much more like six weeks, right?
If we don't have hurdles of data accessibility or data governance, it gets done much quicker, comparatively.
Deep: And are they done-done, or is the team kind of assigned for a long-term track relationship, and are they maintaining the systems? Or is it more like a knowledge transfer: build a first implementation, get it in place, bring the client's people up to speed on how to maintain and maybe develop it further? Like, what does the long arc of the partnership look like?
Paul: We do a lot of knowledge transfer, because the goal of most of these companies is to build their own internal skill set. Yeah. What we tend to see is we'll do the first two, three, and they'll have obtained a skill set to do, let's say, the next three, but they're rarely satisfied with the next three.
They tend to come back and say, you know what, we've been asked for 20. So we need team B, right? We need somebody to oversee, we need somebody to appreciate the ops, the x-ops version of this. So we tend to manage the pipelines and help them be plan B, or sorry, team B or team C, building sort of parallel use cases.
So a great example would be, you know, very big e commerce environment where, you know, over 20 years, they built a monolith application with a monolith database. So they needed to move into services oriented. They wanted microservice application deployment. Therefore they needed microservices database deployment.
Um, and they calculated that to be a 47 year journey. So they went to Pythian and said, how can we make this more like a two year journey versus a 47 year journey? So we helped them with a generative AI code conversion tool set. Said, okay, well, what we're going to do is help you with the monolith code conversion to microservices in a different language.
And the SQL and schema code conversion on the monolith database to, uh, services-oriented database implementations. So instead of hand-coding all of that, it's a relatively large modernization of the application set, from data center even to cloud. So not only is it an architectural change, but it's actually a destination change.
We implemented all of the AI tooling to make that happen. And then, of course, we taught them what that tooling looks like. We still manage that because it's complex and it has to learn, it has to grow, it has to get better over time, right? The conversion needs to be better. Arguably, even the tooling changes over time because models are better, right?
So we don't really get involved in the actual code based implementation. We get involved in the AI automation of the code. It's a nuance in words, but it's a pretty dramatic difference in how we interact. And the reason why they would come to us is because we manage that monolithic database. Right. So we have a pretty keen understanding of the data model, and therefore we have a pretty good understanding of what a services schema looks like as compared to the monolith schema.
Like, that's our expertise. And based on that, we used our AI knowledge to say, well, converting X to Y requires this kind of automation.
Deep: One of the things that this makes me think of is, I mean, you've even mentioned that you see common solution templates, if you will, kind of repeatedly. Do you have a process for developing your own platform or your own code to like help reduce the time for future engagements? And if so, maybe walk us through like, how does that look?
Paul: So we have a delivery team. Think of a team that, you know, is billable, is assigned to projects for customers. And then within my team, under the CTO office, we have a software engineering team, an IP engineering team. The purpose of the IP engineering team is to build reusable software for either project or delivery acceleration or delivery efficiency.
So imagine there's a set of IP, a set of software we build for the purpose of making the managed service better, more efficient. We can manage more hosts per DBA, as an example, by creating software. And then I have a set of intellectual property to make 12-week projects two-week projects: reusable code.
That's both use-case based, i.e., I've seen this marketing use case many times, so I have built this code and I will reuse it in other circumstances, and things like Terraform code or Python code for data pipelines that I know will always be true, because lots of people use Salesforce and lots of people put Salesforce data in BigQuery.
So I've already pre-built those pipelines, that Python code, and I license them to my customers perpetually, so that they have access to that code and they can modify it as they need to. So we have a separate team that's watching for code that we built and creates sort of reusable assets for them.
Most of our—
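[Illustration: a hedged sketch of the kind of reusable Salesforce-to-BigQuery pipeline code Paul mentions, using the simple-salesforce and google-cloud-bigquery client libraries. The object, dataset, and credential values are hypothetical placeholders, not Pythian's licensed IP.]

```python
# Pull records from a Salesforce object and append them into a BigQuery table.
from simple_salesforce import Salesforce          # pip install simple-salesforce
from google.cloud import bigquery                  # pip install google-cloud-bigquery

def sync_accounts(sf: Salesforce, bq: bigquery.Client, table_id: str) -> int:
    # Pull a slice of the Account object from Salesforce.
    records = sf.query_all("SELECT Id, Name, Industry FROM Account")["records"]
    rows = [{k: v for k, v in r.items() if k != "attributes"} for r in records]

    # Append the rows into BigQuery; schema autodetection keeps the sketch simple.
    job = bq.load_table_from_json(
        rows,
        table_id,
        job_config=bigquery.LoadJobConfig(
            autodetect=True, write_disposition="WRITE_APPEND"
        ),
    )
    job.result()  # wait for the load job to finish
    return len(rows)

if __name__ == "__main__":
    # Placeholder credentials and names; supply real ones to run.
    sf = Salesforce(username="user@example.com", password="...", security_token="...")
    bq = bigquery.Client(project="my-project")
    print(sync_accounts(sf, bq, "my-project.crm.accounts"))
```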
Deep: I'm sorry, um, when you say they're watching, like maybe I want to double click on that a little bit. Are they embedded with your implementation teams? So they know, or are they like literally just looking at a GitHub repo and seeing a bunch of stuff coming in and asking questions and, and do they have relationships with those, the folks that are on the front lines building?
Paul: We have a formal architectural review board, and that formal architectural review board is looking for software that we develop. Know that a good portion of what we do is database management; in many respects, scripting, right?
Deep: Yeah.
Paul: I'm less concerned about the scripting. I'm more interested in code that we build outside of the scripting environment.
All of those ideas and concepts, for a customer project or our own project, go to the architectural review board. That is a combination of delivery, what we call IPCs, which is our highest-order architect, and our software engineering architects together. They come to the conclusion whether this asset is reusable or not.
If it's reusable, it becomes an IP project to build versus a delivery project to build. And that way, even the effort to build the software comes out of the customer project and moves into the IP team.
Deep: And then they'll rebuild it from scratch or something, with more of a platform in mind?
Paul: Correct, for reusability in mind.
Yeah, for availability in mind, for performance in mind, for multi-tenant implementations, right? All of those abilities that one wouldn't otherwise implement. What's important to know, which is, I think, unique to Pythian, is that the vast majority of the work we do with customers, anything we build, is owned IP-wise by us,
and we license anything we build to our customers to use perpetually.
Deep: Well, can we dig into that a little bit? Because, as somebody who also owns a consulting company, that's easier said than done. How do you actually get them to agree to that? Especially when you're dealing with potentially vulnerable or sensitive stuff that they could easily view as being central to their IP.
Paul: Uh, we have not really had a lot of pushback in all fairness because they're used to us building a lot of automation as part and parcel of their managed service. They are comfortable that we protect their core database assets, and therefore we will protect their interests in intellectual property.
But the important part there is they know that if we take that IP that they find interesting and put it into the IP team, that's now a cost reduction to them. They now don't have to pay for that, right? Because it's now being built as part of the Pythian platform, we'll call it. And therefore what they would have paid to implement this use case is much smaller now.
So they see real-time value by not having to pay for IP that we would deem as reusable. The interesting point there is most of our customers say: I don't think the software, I don't think the code, is the secret sauce. The secret sauce is the combination of the code and the data we have. You're not taking the data, Pythian.
You're only building the code. So the secret sauce is the combination; that's what makes it valuable to me as a customer.
Deep: And do they need to be involved in all the decisions? When your IP team pulls something into a platform, are your clients having to approve those on a one-off basis, or do you have like a generic agreement?
Paul: Generic agreement.
Deep: And so there's just got to be a lot of trust there. I imagine that you're not going to be pulling things that are kind of core to their product or something.
Paul: Correct. Correct. Uh, in fairness, they will do audits of code. Especially from a security perspective, right? So if we're building code and deploying in their environments, obviously they would be concerned over that.
Uh, but since we manage their databases, as an example, uh, we already have agents that are running in their data center, in their hosts, in their cloud, that are event collectors. And they've already done assessments on that code, right? So they're generally comfortable with things we build, deployed in their environment already.
But yeah, there is IP protection for them and us. There's reps and warrants for the safety and health of those environments. They're generally comfortable with our builds, and, you know, they test, right? The reality is we rarely just put it straight into production. We put it into a testing environment.
They test it. They're the ones that promote it to their production environment. So they're satisfied when they're ready to go, right? So there's, there's a safety net, not unlike any other IT project.
Deep: I guess I suppose the reason they're kind of invested is they know that you're also pulling things in from other clients, and therefore they're getting kind of an increased efficiency benefit.
And it's almost kind of a communal sense where they have to kind of contribute to get that efficiency.
Paul: And they know that we're not pulling in their data, right? While we might build an interesting, you know, product basket assessment service, they know that we're not accessing their data to support that; we're deploying it in their environment, and that service is accessing their data.
So we, we don't share their data amongst customers cause we don't even really see their production data. Right. We're not even in that side of that equation for the most part. Yes. We manage their databases, but you rarely actually have to look in a database to manage the database.
Deep: Right. And then that platform that you build is that, do you open source that, or is that your own license that you just give back to them?
Paul: It's our own license. Yeah. We don't open source.
Deep: Got it.
Paul: That being said, we manage 34 different database technologies. A good chunk of that is the open source set of technologies in that world. We absolutely contribute back to those open source projects.
Deep: Got it. Another thing I was thinking might be fun to double click on is, so you've mentioned some generative AI projects, you know, and I think you also mentioned
this idea of, um, I'm kind of reading between the lines, but I'm thinking like a chatbot with sensitive private data, something like that. Maybe walk us through an example or two on that use case. It sounds like your clients, like almost everyone out there, their execs interact with GPT-4 or Anthropic's Claude or whatever,
and they're extremely impressed with what they see. And then they go to their own website and they see this really dumb customer service bot from, you know, two or three years ago. And so people are asked internally, like, what the heck, how can we close this gap? Maybe walk us through the evolution of one of those projects.
Like, you start with a GPT-4 interaction, and then do you start pulling models in-house, you know, maybe using Llama 3 or something, to deal with privacy issues? Or are you going to Microsoft's hosted version of GPT-4, or is it something else? Walk us through some of the challenges that appear there.
Paul: The vast majority of the time, they will implement, I'm going to use it colloquially, chat with my data, right? They will implement those use cases internal-first, right? And the obvious easy one, right, is internal knowledge bases. So it might be their Confluence environment, right? Or their SharePoint environment or their drives,
right, SharePoint drives or Google Drives, right? They have access to all those knowledge bases, and you would connect an internal chat-with-my-data to that, right? So, uh, what are my vacation days, and what's the policy to support that? And it'll create a paragraph to support that answer, right? Uh, the reason why they do those use cases is because they want to try out accessibility to the information and accuracy of the response.
And what they generally find is it's decently accurate, and they're impressed when a specific question is asked; however, it's a little too generic in the response for the policy. In other words, it just repeats the policy that exists on the Confluence page. They are less enthused by things like IT helpdesk chats, where, hey, I need a password reset, or this system happens to be down, and they'd use a Slack message to IT support, as an example.
Most of those automated interactions are, uh, tell me more about this thing because I don't understand the question, or, did you really mean x, y, and z instead of what you actually asked? So they get very frustrated with that interface, because a human would have a much better appreciation of the question and get to the answer much quicker.
Deep: And why is that? Like, are they not using GPT-4 level capability, and are they not uploading the source content? Because if they are, then...
Paul: Then they're using shorthand in the prompt, which the model wasn't grounded or fine-tuned for, and there's no RAG implementation. But that's kind of...
Deep: Oh, I see. That's what I was going to ask.
Paul: So therefore, if you're using ACT, and ACT means something to you,
Deep: Right.
Paul: but it doesn't mean anything to the model, then it's a lot of: tell me more about the question you're asking, right?
Deep: So, yeah, they might not be mining the history of the conversation to, like, resolve kind of a teased-out latest question. That's a common error that we see.
Paul: So commonly we see: you didn't give it enough knowledge bases, right?
You gave it one when you should have given it all of them, as an example. Another common problem is that you didn't ground it, or you didn't give it the data dictionary of the company, right? You didn't implement pre-processing or post-processing; you let it ask inappropriate questions. The best example I have of that is The Weather Network.
Uh, The Weather Network, there is a, you know, GPT implementation on the mobile app, and you can absolutely ask it right now, if you download the app: do I need to wear a raincoat in New York next week? 100%, and it'll give you a relatively accurate answer based on what it knows about the weather. But I could also ask it to
tell me the origins of Pythian and respond in the voice of Hulk Hogan, and it will, right? So there's no limiting factor on both the question and the response.
Deep: to put the guardrails around the, uh, yeah.
Paul: Correct. And the same thing happens in internal use cases. They're asking either too-complex or inappropriate questions, and then they're frustrated that no responses are provided,
because we haven't trained the model on information that doesn't exist, or you only gave it two knowledge base sources and the other four may have answered the question, right? But you never gave it access to that content. So you can see the user frustration very, very quickly. Trying out chat-with-my-data in those circumstances is needed for them to perfect what they think is important once they start to create a changed customer journey, because the vast majority of customers we talk to have said to me: our customers like the way they interact with us.
Don't make them change. And I'll give you the easiest example. We have a logistics company that said: our customers call our 1-800 number and say, um, I have a package to pick up. They have to determine what truck to send, right? Whether it's just a courier, because it's a package, or it's 50 skids of apple juice.
What they do not want to do is ask another question of that customer. Even though it seems logical to us IT folks to say, well, just ask them how big the package is, they don't want to do that, because they do not want to change the customer behavior. They want the journey to be the same, but the automation to be better.
So now we need to implement an AI that doesn't have any more information from the source and says: based on the past behavior of this customer, or the past behavior of a variety of customers on this route in this city, what kind of truck should I be sending? With a much better appreciation of probability, just as an example, right?
So we get a lot of: do the internal use case, because we think it will have bumps in the road based on either lack of grounding, lack of RAG, lack of fine-tuning, or lack of access to appropriate data sources. Perfect that, and then from those learnings, implement an enhanced customer journey that doesn't change the customer interaction.
And I know that sounds weird to us technologists, but that is absolutely the demand.
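[Illustration: a toy sketch of the grounding point Paul makes, retrieving from all available knowledge bases rather than just one before the model answers. The documents, keyword scoring, and generate stub are illustrative assumptions, not a production RAG stack.]

```python
# A toy "chat with my data" grounding flow: search every knowledge base,
# then build a grounded prompt from whatever was retrieved.
from typing import Callable, Dict, List

KNOWLEDGE_BASES: Dict[str, List[str]] = {
    "confluence": ["Vacation policy: full-time staff accrue 20 days per year."],
    "sharepoint": ["Expense policy: submit receipts within 30 days of travel."],
    "drive":      ["IT policy: password resets are self-serve via the portal."],
}

def retrieve(question: str, top_k: int = 2) -> List[str]:
    # Naive keyword-overlap scoring across every knowledge base, not just one.
    q_terms = set(question.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc)
              for docs in KNOWLEDGE_BASES.values() for doc in docs]
    return [doc for score, doc in sorted(scored, reverse=True)[:top_k] if score > 0]

def grounded_answer(question: str, generate: Callable[[str], str]) -> str:
    context = retrieve(question)
    if not context:
        return "I couldn't find that in the knowledge bases I have access to."
    prompt = ("Answer using only this context:\n" + "\n".join(context)
              + f"\n\nQuestion: {question}")
    return generate(prompt)

if __name__ == "__main__":
    echo_llm = lambda p: p.splitlines()[1]   # stand-in: parrot the top document
    print(grounded_answer("How many vacation days do I get?", echo_llm))
```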
Deep: So do you find that they're implementing kind of robust efficacy assessment strategies around bot responses? Or are you seeing kind of more naive implementations, like, you know, the Chevy dealer whose customer service agent, um, you know, sold a Tahoe for a buck, or the Air Canada thing that offered a full refund for somebody?
Like, are you seeing just a complete lack of robust efficacy assessment in place?
Paul: I think when they first deployed, they were cowboying it. And now they've seen a lot of bad implementations, and now they're either not doing it, or being much more cognizant of the pre-processing and post-processing. Pre-processing says: is this an appropriate question to ask? And then immediately respond with, that's not in the domain of this chat. And then post-processing says: I know the answer is seven. It can't sometimes be six, sometimes be seven, sometimes be eight.
The answer is always seven. Which is why I always talk about prompt engineering being very different in consumer versus enterprise. In consumer, you can think of prompt engineering as asking more detailed questions to get a more precise answer. In the enterprise, where in many ways the answer is only ever seven, prompt engineering is saying: what are the hundred different questions that will always give an answer of seven? That's the distinct difference.
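[Illustration: Paul's "the answer is always seven" framing as a tiny consistency harness, checking that many phrasings of the same enterprise question all return the one canonical answer. The phrasings, expected value, and ask stub are hypothetical.]

```python
# Enumerate the ways a question gets asked and verify the answer never drifts.
PHRASINGS = [
    "How many regional warehouses do we operate?",
    "Number of warehouses?",
    "How many DCs do we have right now?",
]
EXPECTED = "7"

def ask(question: str) -> str:
    # Stand-in for the real question-answering system under test.
    q = question.lower()
    return "7" if "warehouse" in q or "dc" in q else "unknown"

def run_consistency_check() -> None:
    failures = [q for q in PHRASINGS if ask(q) != EXPECTED]
    if failures:
        print(f"{len(failures)} phrasing(s) drifted from '{EXPECTED}':", failures)
    else:
        print(f"All {len(PHRASINGS)} phrasings returned '{EXPECTED}'.")

if __name__ == "__main__":
    run_consistency_check()
```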
Deep: I mean, I imagine in a lot of these use cases you're not actually using the generative AI to generate the response; you're using it to generate the SQL, or whatever the internal query language is, to get the results. And then maybe, you know, it puts some lipstick on how that's shown to the user. Is that fair to say?
Paul: It's fair to say, which is why we consider this generally a software engineering project, right?
Because it's not just: apply the model and hope the answer is good, right? It's: be prescriptive in the question, the processing, and the answer, so that you don't find yourself in liability.
Deep: Yeah, that makes sense.
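[Illustration: a minimal sketch of the pattern Deep describes, where the model generates the query rather than the final answer, and ordinary code executes it deterministically. The schema, data, and text_to_sql stub are illustrative assumptions.]

```python
# Use the model for query generation only; execute and phrase the result in code.
import sqlite3

def text_to_sql(question: str) -> str:
    # Stand-in for an LLM that translates a question into SQL over a known schema.
    return "SELECT COUNT(*) FROM orders WHERE status = 'open'"

def answer(question: str, conn: sqlite3.Connection) -> str:
    sql = text_to_sql(question)
    (count,) = conn.execute(sql).fetchone()          # deterministic execution
    return f"You currently have {count} open orders."  # light 'lipstick' on the result

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, "open"), (2, "closed"), (3, "open")])
    print(answer("How many open orders do we have?", conn))
```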
Deep: I'm going to switch gears a little bit, because we're running out of time and I think I just have maybe two more questions. So my first question is: if you're speaking to a bunch of orgs out there about whether to do something in-house or whether to use an AI consultant like yourselves, what would the case look like to use an outside consultant?
Paul: I use two words that resonate the most, and it's expertise versus experience. I 100 percent agree that you as a customer can have the same expertise I do. You can attend the same courses, get the same degrees, be as well read as I am. But what you don't have is the hundred-times experience that we have.
We have failed. We have succeeded. We've seen what is needed and what's not needed. That experience is what you need for a successful AI implementation. That's where we win. So we have the combination of: yes, we're well read, and yes, we have experts, but also we've done this hundreds of times, and therefore we know what's more likely to work in those circumstances.
And that's the win, because with anything new, as you know, you're guessing in a customer site, right? You're hoping this is true. And what they're reading is a lot of bad news about people guessing, and they're saying, well, we can't do this anymore. We need some expertise, we need some experience in here to help.
That's kind of the big one.
Deep: Yeah, I use something similar. I usually say we've been down all the dead-end alleys, sometimes, you know, occasionally even twice, because they didn't look quite like the one we were down before. And, you know, when you've done that for a while and you've solved a wide array of problems, you also just have ideas. I think, you know, if an internal team has really been spending maybe years on a really specific problem, they're not necessarily seeing
50 time series projects alongside 50 generative AI projects, and then all of a sudden you put it all together. Due to the fact that you've sort of been there and seen it, you tend to have a really different perspective in some ways, you know.
Paul: And they rarely zoom out, right? They get really focused on the use case, really focused on a set of tools they've already bought into,
that they already believe to be valuable, right? They want an open-source implementation and think less about risk and impact, right? So you gave a great example just before, when you said we're probably not just applying the, you know, the model out of the box. They are. We are saying the model out of the box is inappropriate:
let's use regular development to do some of the work, and use the model to create the paragraph answer, right? Like, let's think differently versus, you know, simplicity.
Deep: Awesome. So my final question is if you have to look out five, kind of 10 years into the future, maybe with respect to your business and your clients, like what does the landscape look like?
Like, you know, how is it different than today?
Paul: I think there are going to be far more models, far more augmented uses in sort of day-to-day life. I think there is only a little bit of implementation of, let's say, productivity-tool generative AI today; I think that'll be far more prevalent five years from now. I think once the costs come down, then we get over the hump: when it's $30 per user per month, it's cheap at 50 people and expensive at a hundred thousand people.
Uh, once that gets down to pennies or dollars, then you're going to see much wider use. Now it's not just a couple of people in your organization, it's all the people in your organization. And emails will be drafted, documents get drafted, slideware gets drafted, lots of things get drafted. And I always talk about AI being less about a volume problem and more about a velocity problem.
So the goal isn't to draft more documents. The goal is to draft the document you plan to draft, faster, so that you can get on with the other giant
Deep: list of stuff to do.
Paul: Yeah. Yeah, so I think that'll be far more prevalent because of the amount of embedded automation that will exist.
That'll just be out of the box, right? I will start with: draft this slide based on this strategy document, and the slide will be drafted, and then I will just make modifications. That will be the norm. You won't open up a blank PowerPoint anymore. It just won't happen.
Deep: Interesting. Yeah, I tend to, I tend to agree.
I mean, I tend to think people overdo the loss-of-jobs argument in a way. I mean, I feel like we're not talking about jobs, we're talking about tasks being automated, largely, and humans have a really long list. I mean, I've never really been in a company where there was just a finite, somewhat stable list of tasks to be done.
Like, there's usually way more to be done than what you can handle at any given point in time, and hence all of the, you know, energy around stack ranking. And it seems to me like the ML/AI systems are just going to make smaller and smaller groups of people more and more impactful. I mean, there will definitely be jobs lost, but I think the nature of what a job is and the tasks it requires will sort of evolve up, if you will.
Paul: Yep. Yep. 100 percent agree. Yep. There won't be less people. They'll just be faster at what they're doing.