This week, our data scientists discuss how AI-driven body pose estimation technology can be used by athletes to improve sports performance.
Too often, athletes face injuries that could have been prevented by better technique and body positioning. When running, the impact at each step, overtraining, and poor form can put undue strain on muscles, joints, and tendons, and the cumulative effect can lead to serious injury. Getting professional feedback from physical trainers to prevent these injuries can be costly, tedious, and ineffective.
After facing a major injury while training for a marathon, one of Xyonix's own data scientists created a machine-learning algorithm that relies on body pose estimation technology to assess and correct his running form. The algorithm measures the motions of key body parts from videos of the runner to detect inefficiencies such as asymmetrical technique or poor cadence.
If you're interested in reading about this topic, visit our blog article here:
DEEP: Welcome to this week's episode of the Your AI Injection podcast. I'm Deep Dhillon. I'm a data scientist and co-founder at Xyonix, and your host. This week, we've got Bill Constantine and Carsten back, our regulars, to talk about how we can use AI to improve sports efficiency. So with that, Bill, why don't you kick us off? I'm kind of rewinding a couple of years. I think it was you that got really hot to trot on this idea. Tell us a little bit about the background that you had, because I think we've all really connected with this topic over time.
BILL: Yeah, this was a really, really fun article to write and put together, and it came on the heels of some technology that Carsten had shown to me as a computer vision guy. It marries artificial intelligence with sports, and I have a passion for both, so it was really cool. The backstory is that for some reason I'm a masochist and sign myself up for marathons every now and again to keep from being really fat and getting out of shape. So I've run quite a few marathons over time, and during one in particular, around the Seattle Marathon, I had really horked up my Achilles and was quite injured for some time. I was absolutely convinced that it was my running form that was the cause of it. When you look at the typical person that runs a marathon and trains for a marathon, they will on average run some enormous number of steps. I think I estimated at one point something like 1.25 million steps. And if you have bad form, you can imagine that that's just a lot of extra stress on your joints and so forth. So there was this idea in my head that I could be a more efficient runner. Coupled with that was a little bit of an interesting thing going on. I have a daughter who played kind of elite soccer, and she had done so for years, and I literally had people come up to me after the soccer game or during the game and say, "Is that your daughter there?" And I'd say, "Yeah." And they'd say, "Man, she's just beautiful to watch run." And it's absolutely true. She would just glide across the field, and when I would go running with her, it seemed like it was just effortless. I was like, what is she doing that I just can't quite grasp?
DEEP: So yeah, I think that was when you started bringing up new terms that I hadn't really thought about, stuff like asymmetric gait and stride length, a lot of details in the running lexicon.
BILL: Yeah, when you're looking to optimize in a marathon, and for me, I'm not super fast, you're looking to optimize the energy that you have to make yourself the most efficient. So there are things that you try to do. There is this notion in the running community, for example, that if you run at a cadence, which is the number of steps per minute, of somewhere around 180, you know, 180 to 185, then that's sort of optimally efficient. That came from analyzing, by the way, Olympic-style runners and so forth. Symmetry in motion is, of course, important, so that you're not overburdening one side or the other. And I don't know if you've ever watched these Olympic runners, now that I mention it, but they're just flying around the track, and yet their heads' vertical oscillation is pretty limited. They do seem to be putting all of their energy into forward motion. So in terms of things that I'm thinking about as a runner, I'm thinking, okay, I've got to get my cadence up, which is where I usually kind of suffer. I'm 6 foot tall and I'm usually slightly lower than that, maybe in the 160s. Symmetry in motion is not something I really thought about until this exercise that we're going to talk about. And then vertical head oscillation: you can kind of feel yourself going up and down too much, because you're standing too upright, you're not leaning enough. You're spending too much time pushing upwards instead of forwards, et cetera. So there are those measures and quite a few others, but those are the things I tend to focus on.
DEEP: Yeah, I mean, I've just got this image...
CARSTEN: I had on
DEEP: And everybody's seen this episode of Friends. You know the character, the tall blonde, out running? I'm imagining that this is most likely not what you're after. Like that crazy spastic run, the arms flailing and the...
BILL: Head bobbing all over the place. Yeah.
DEEP: So with that, let's take a turn to the AI side. What does artificial intelligence have to do at all with an activity like running and improving efficiency? Carsten, you want to give us a little bit of background on what we mean when we say body pose estimation, what it means to take just a video signal, and how it can give us a decent starting point for trying to help with some of Bill's problems here on the running side?
CARSTEN: Sure. Yeah. So from my perspective, I actually came to this from a completely different angle. One of my hobbies is computer game development. And if you look at animation in computer games, one of the things is that if you're not a professional animator, it's really hard to animate your own characters. But there's this thing called motion capture, right? Traditionally that was done either with sensors that people would have in a specially made suit, or with markers on the body. So they would basically take visual markers, little tennis balls, and glue them to whatever you're wearing, at the ankles, at the knees, at the elbows, so key points around your body. And then there would be like a trifecta of cameras that would just observe the actor, and based on the multiple camera angles, they would reconstruct your motion in 3D. And then, when computer vision made some huge strides due to deep neural networks, it was actually possible to do this key point detection simply with convolutional neural networks, and that's kind of how pose estimation works. So we have these computer models that are trained basically to detect key points on the human body, like the knees, the elbows, the feet, key points on the face, the eyes, mouth, nose, et cetera. And the first stage of these networks really does exactly that. Given a video image, it generates a heat map of these key points, right? And then you can basically assume that these key points have a certain structure, because they form a human skeleton. So you have some constraints, and given that, you can really resolve: okay, that must be the left knee, that must be the hips, that must be the elbow, et cetera. And the technology just simply advanced from there, and these days we can do this in real time on our cell phones.
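For readers following along, here is a minimal sketch of the first stage Carsten describes: turning one heat map per key point into a pixel location. It is an illustration only, not the model discussed in the episode. The toy heat maps, the `min_score` threshold, and the function names are assumptions, and the skeleton-constraint step that real pose estimators apply afterwards is left out.

```python
from typing import Dict, List, Optional, Tuple

Heatmap = List[List[float]]  # rows of per-pixel confidence scores in [0, 1]


def strongest_peak(heatmap: Heatmap, min_score: float) -> Optional[Tuple[int, int, float]]:
    """Return (x, y, score) of the highest-scoring pixel, or None if it is below min_score."""
    best: Optional[Tuple[int, int, float]] = None
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if best is None or score > best[2]:
                best = (x, y, score)
    if best is None or best[2] < min_score:
        return None  # the key point is treated as not visible in this frame
    return best


def keypoints_from_heatmaps(
    heatmaps: Dict[str, Heatmap], min_score: float = 0.3
) -> Dict[str, Optional[Tuple[int, int, float]]]:
    """Pick one location per named key point (hypothetical names, one heat map each)."""
    return {name: strongest_peak(hm, min_score) for name, hm in heatmaps.items()}


# Toy 3x3 heat maps standing in for real network output (made-up numbers).
heatmaps = {
    "left_knee": [[0.0, 0.1, 0.0], [0.1, 0.9, 0.2], [0.0, 0.1, 0.0]],
    "left_ankle": [[0.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.2, 0.8]],
    "nose": [[0.05, 0.1, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.0]],
}
keypoints = keypoints_from_heatmaps(heatmaps)
for name, location in keypoints.items():
    print(name, location)
```

In this toy input the nose heat map never rises above the threshold, so it comes back as `None`, which is roughly what happens when a body part is hidden from the camera.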
DEEP: Yeah, I think the first time you see these skeletons moving around, based off of just a video signal, for those of us who think about what you can do with data, you sort of light up and you start thinking about tracking these joints individually, getting their natural signal patterns out, and then from that maybe walking up and starting to do something like trying to measure some of these pieces that, Bill, you were alluding to. Bill, you want to talk us through: what is your starting point with one of these skeleton outputs? And what are the problems that you're seeing in going from that to this kind of kinematics analysis that ultimately answers a question like, what's your vertical head oscillation while you're running, or what's your stride length, or whether your left ankle maxes out a little bit lower than your right ankle? What is that kind of analysis you would do, and from what starting point?
BILL: Yeah, that's a great question. I will answer that. I just want to take a brief moment to say how far we've come with this technology, in the sense that it's now mobile and available to all people, essentially. Because when you think about what Carsten was saying, the ability to hook yourself up with sensors and monitor your activity is typically leveraged by professional sports teams and college teams, not necessarily the average Joe out there running. So...
DEEP: Yeah, I remember that. This might be dating me a little bit, but I think it was like 1995, the first time I went to SIGGRAPH, which, for those who don't know, is this massive graphics convention where the whole animation industry is at. And I remember being there and just being wowed by these contraptions instrumenting these actors and actresses. They're in all kinds of motion, and meanwhile you've got these 3D puppets running around. I think this was even pre-Pixar, back then. That obviously is an expensive proposition, to instrument every elbow, every shoulder, every knee, every wrist, every ankle, and so you have a different set of business realizations that can come from that expense. But now we're talking about, I think Apple now has body pose extraction in iOS, if I'm not mistaken. So yeah, total agreement that we've come a long way there.
BILL: Game changer. And when you think about this particular problem, let's go back to my daughter and me, and think about how this is going to play out. I basically said to her one day, "Look, let me show you this pose estimation stuff that Carsten showed to me," and she's like, "Oh yeah, that's pretty cool." I said, "Grab your running stuff, because we're going to go out to the park, and what we're going to do is I'm going to prove to the world that you're a better runner than I." She's like, "Okay. Well, I already know that." I said, "Yeah, that's right. I know that you know that, but I'm actually a scientist, Emma. I want to prove it to myself, and also maybe I'll find something new." So how I set up the problem is I did constrain the problem to be pretty easy peasy, lemon squeezy. I had my wife basically take my iPhone and have it aimed at a fence that I ran in front of, in mostly an orthogonal, 90-degree view. And then essentially I just ran across the screen quite a few times and had her record me, and then separately record my daughter. Then I took that video and I used pose estimation to identify, frame by frame, those key points on the body. As Carsten said, it's a constrained relationship: hopefully your nose is somewhere close to your ear, and your elbow is somewhere not close to your ankle, and so forth, so you can develop these skeletons that are overlaid on the body as you go about your motion. So the first order of business is, well, I had these metrics in mind that I mentioned before. I want to see what my metrics are in terms of cadence, symmetry in motion, and my vertical oscillation. And I want to map from these key points that I see on every image to these metrics. Well, how do I do that?
Well, the first thing you have to do is have some reference in the image for going from the number of pixels that you see, say from my ankle to my nose. I know how tall I am and I know that distance, so if I'm standing straight up I can get a pretty good estimate of how the number of pixels identified in the image scales to my actual height. So prior to doing this, we did kind of a normalization thing where we just stood there, at the depth from the camera that we were going to be running at, and got that measurement so that I could make that transformation. That allows us to go from the pixel domain to the real-world coordinate domain. Now, once you have those, you can form these things called kinematics. For example, I can take the position of things frame by frame over time, and I can convert the position of, say, my elbow or my nose into velocity. So what you're actually pulling out are signals that they call time series, which are a series of positions over time for the different parts of your body, and you can form things like the vertical velocity component of my nose. I use that as a proxy, for example, for the vertical oscillation of my head. I basically formed positional and velocity coordinates, and now you have those in place. The next thing is, you fill in the blanks for whatever measure you're trying to accomplish. The vertical oscillation of the head was pretty easy: I just found my peak amplitude going up and down in that waveform and used that as a comparison to Emma. The symmetry in motion thing was a bit interesting. I just based it on an observation from the videos, which we have in our article online. By the way, you can see that at xyonix.com, that's X-Y-O-N-I-X dot com, and you're going to want to look up "Using AI to Improve Sports Performance" there as one of our blogs, and you can see this video in action.
But I noticed that my right ankle was going up much higher than my left, while Emma's was perfectly level. Her stride's vertical oscillation was essentially the same on her right as on her left. So I measured the deviation between the peak amplitudes of the right and left as a means of measuring symmetry. And then cadence is something that you back out: it's how many times you land your feet in one minute. That's something you just have to pull out; that's like a time series analysis problem. So that's a lot of words to say there are these basic steps. It's not necessarily simple; you've got to clean up the data a bit. You basically pull out the time series, then you analyze the time series to get the metrics, and now you have those metrics, those measures of your performance. And the spoiler alert here is that it turns out that Emma was better than I was on all of these metrics. I proved to myself and the world that she was better than I was. All along she knew this, but so that was, that was...
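The steps Bill walks through (calibrate pixels to metres, build time series, take velocities, then read off head oscillation, ankle symmetry, and cadence) can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the code behind the article: the 30 fps frame rate, the 1.60 m ankle-to-nose calibration distance, the synthetic key point tracks, and the use of ankle-lift peaks as a stand-in for foot strikes are all made up for the example.

```python
import math
from typing import List


def metres_per_pixel(known_distance_m: float, pixel_distance: float) -> float:
    """Scale factor from a standing calibration shot at the running depth."""
    return known_distance_m / pixel_distance


def heights_in_metres(y_px: List[float], ground_y_px: float, scale: float) -> List[float]:
    """Convert image rows to height above ground; image y grows downward."""
    return [(ground_y_px - y) * scale for y in y_px]


def velocity(series: List[float], fps: float) -> List[float]:
    """Frame-to-frame finite difference, in units per second."""
    return [(b - a) * fps for a, b in zip(series, series[1:])]


def peak_to_peak(series: List[float]) -> float:
    """Crude amplitude measure: highest point minus lowest point."""
    return max(series) - min(series)


def count_peaks(series: List[float], min_height: float) -> int:
    """Count local maxima above min_height (one per stride of that foot)."""
    return sum(
        1
        for i in range(1, len(series) - 1)
        if series[i] > min_height
        and series[i] > series[i - 1]
        and series[i] >= series[i + 1]
    )


# --- Synthetic stand-in for pose estimation output (assumed values) ---
FPS = 30.0
GROUND_Y_PX = 900.0
scale = metres_per_pixel(1.60, 800.0)  # hypothetical calibration: 1.60 m spans 800 px


def synthetic_ankle_px(lift_m: float, phase: float) -> List[float]:
    """Four seconds of ankle rows: resting at 5 cm, lifting lift_m higher each stride."""
    return [
        GROUND_Y_PX - (0.05 + lift_m * max(0.0, math.sin(3.0 * math.pi * i / FPS + phase))) / scale
        for i in range(120)
    ]


left_px = synthetic_ankle_px(0.10, 0.0)
right_px = synthetic_ankle_px(0.14, math.pi)  # right ankle lifts higher, half a cycle later
nose_px = [GROUND_Y_PX - (1.60 + 0.04 * math.sin(6.0 * math.pi * i / FPS)) / scale for i in range(120)]

# --- The analysis itself ---
left = heights_in_metres(left_px, GROUND_Y_PX, scale)
right = heights_in_metres(right_px, GROUND_Y_PX, scale)
nose = heights_in_metres(nose_px, GROUND_Y_PX, scale)

nose_vy = velocity(nose, FPS)              # vertical velocity of the nose
head_oscillation_m = peak_to_peak(nose)    # proxy for vertical head oscillation
asymmetry_m = abs(max(right) - max(left))  # difference in peak ankle lift
steps = count_peaks(left, 0.08) + count_peaks(right, 0.08)
cadence_spm = steps * 60.0 * FPS / len(left)

print(f"head oscillation: {head_oscillation_m:.3f} m")
print(f"ankle asymmetry:  {asymmetry_m:.3f} m")
print(f"cadence:          {cadence_spm:.0f} steps/min")
```

On this synthetic input the cadence works out to 180 steps per minute and the ankle asymmetry to 4 cm. Real key point tracks are noisier, so a smoothing step before peak counting (the "clean up the data" part Bill mentions) would normally come first.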
DEEP: Interesting. Yeah. So I remember when you did that first analysis and we looked at those results, we were like, this is a very powerful concept here: take a phone in someone's pocket, point it at a person, and get these historically expensive analysis results for improving sports efficiency. But I also started thinking a lot about it from a sports medicine vantage. What happened for me, and this is like the story of Xyonix, fathers and daughters, is that my daughter, who's a pretty intense soccer player, had torn her ACL right around this time. She was about 14 at the time. She tears her ACL, she gets the surgery, she goes through it, and now she had a good nine months from when she could kind of walk and get moving to when she wanted to get back on the soccer field. And I was at her physician's, her orthopedic surgeon's clinic. We had really intentionally chosen this physician, because my daughter did not want to get her ACL repaired by somebody that didn't know how to get you back on the field and into sports in an intense way. So this physician is located underneath Husky Stadium, and you walk in and it's a really dramatic setting. She focuses on not just Husky athletes but other pro sports too. I think the Seahawks and some other folks go down there. So I'm in there and I'm chatting with her, and I looked and I said, "Hey, I noticed you've got this room that's just tricked out with all of these sensors, all kinds of stuff. Do your patients all go in there?" And she's like, "I wish all my patients went in there. That's only for the UW football team. The players get to go in that special room," which is, you know, expensive.
And I started looking in, poking my head around in there. Meanwhile, every once in a while I'm at my daughter's physical therapy, and I'm looking around and I'm like, this whole place just seems like it's stuck in the 70s. You get the Xerox photocopy that gives you your exercises, and you have this person that's watching you do your exercises, but there's a one-to-one relationship. There's just got to be some way to capture more of what's going on with those elite football players in that elite room to get them rehabbed, rather than just this slow process of physical therapy. And can we start to maybe do a different activity? So not just, let's say, the running efficiency, but can we now assess other things, because walking gait was another important metric. There was all kinds of stuff that the PT was looking at, like, hey, can she keep her knee super straight while she's trying to re-strengthen, or while she's doing these lateral exercises? So we started expanding our thinking a little bit on how we could extend this activity analysis. And I think the soccer stuff kind of peaked when we said, well, hey, let's pick another problem, very different from running, but can we do something where we can actually map things into real-world coordinates? I think that's when we started gathering video footage of somebody kicking a penalty kick, and we started tracking everything on the goalie, the keeper. So maybe you want to walk us through some of the differences between the keeper and the running, and some of the similarities. What are all the things that we could do as we started walking up the stack there?
BILL: Yeah, definitely. We both had that soccer connection through our daughters, and it was thinking about, how do you take this really cool thing that we just did with running and apply it to some other area? We have this mobile lab, essentially, as you indicated, not confined to the University of Washington center but out there in the real world. Well, we can take this out there and say, what if we, just for example, were looking at the efficiency of a goalie? What does it mean to be a good goalie and not a good goalie? Obviously, a goalie who blocks all of the balls coming his way is a great goalie, right? No matter what. But how does that great goalie achieve this? When I looked at these runners, observationally it's very easy to see that these people are just much different than you are, and you want to capture a bit of that magic. And I think this kind of system allows that to happen, where we can take goalies that are really, really good at what they do, record them, and start to get an assessment of the kinematics that they display. Maybe some physical properties too: is a good goalie necessarily tall, like he's got to be six foot ten, or are goalies a bit shorter? Are they also efficient? How can I, maybe...
DEEP: Maybe she's got a better vertical jump, or, yes, maybe she's, you know, you've got the penalty kick, because we were studying it in the context of the penalty...
BILL: Yeah, maybe
DEEP: She's committing a little bit earlier, you know, because you can watch it frame by frame. One of the things that we did during this period is we talked to a lot of folks who actually do this for a living. And I remember talking to a gym, like a rehab, kind of a performance optimization center. I can't remember exactly where they were; I think it's somewhere in the South. But these folks would sit down and take this high frame rate photography of elite athletes. So these are like the Michael Jordans of the world, not even like a U-Dub Husky university football player, but the elite of the elite. This group would study the exemplars, like the Michael Jordans, and basically come up with really specific differences between what they were doing and what the athlete aspiring to be the next elite athlete was doing in their studio. They were studying and looking at a lot of the kinds of things that we're seeing and able to isolate from the skeleton and the kinematics extraction on top of that. So, back to the penalty kick scenario: what were some of those things that you might be looking at if you're studying an elite athlete, or somebody who wants to be?
BILL: Yes, certainly. With that process of having the camera mounted behind the person taking the penalty kick, we have the sense of timing, the reaction time we can gather. We know when the ball, for example, has left the turf, by way of some other types of signal and image processing magic that we can do, so we can start the clock essentially as soon as that ball leaves the surface. Now how does that map to the goalie's movement? Is he also leaning in a particular direction when that comes about, and what are his vertical leap capabilities? That's funny that you mentioned that, because I think in the video that we processed, the thing that struck me visually was that our goalie didn't have much of a vertical leap, and I think he ended up being scored on there. So could you take that information? I mean, certainly that might be something you can just observe; you don't need to have special pose estimation to be able to figure that out. But you might be able to optimize his vertical leap, like how far he's squatting down.
DEEP: It's funny that you mention that about vertical leap. Because we had a pro football team, a director of their analytics, that we were working with, and one of the things that I didn't realize happens with pro sports teams is similar to when we were talking about the video gaming, where these actors and actresses are instrumented with these really expensive instruments. Players are usually instrumented, particularly pro players; there's a lot more budget for that. But one of the things that happens with these pro players is they walk through the locker room and they get these quick assessments, and one of the assessments they get is their vertical leap. They use a classic force plate, which I actually didn't know about before digging in with this sports team analyst. They basically stand on this plate, the plate has sensors in just the floor, there under their feet, they jump up, and it can measure their acceleration or deceleration. It can even measure the contact with the surface. But what it can't do is have a full view of the body. It can't comment on the knees, the shoulders, the wrists, the head. And so one of the things these folks were interested in was, hey, sometimes these pro athletes just want to get back on the field. They might have a really subtle issue with the recovery from an ankle or a knee injury. And we want to be able to tell that they're kind of schmoozing us, that they're just trying to get past the force plate, because we're like, "No, you're going back to PT," or whatever. Yeah.
And so we were setting up the cameras, because one of the advantages in that context is, and one of the problems I'd like us to talk about a little bit at some point is the camera positioning, because that is a challenge. In this particular context it's not, because every day the force plate's in the same place. All 80, or, I actually don't know the number of players on an NFL team roster, will come through, and you've got like 20, 30 seconds with each player. They get on this plate, they jump. But having the cameras right there, waist-high, pointed right at it, being able to get the full activity analysis that we're able to do, suddenly became a game changer for these folks. So, yeah, so...
BILL: Back then we had orthogonal views. We had the front view, a side view, maybe a top view, I think. I don't quite recall, but you're right, we had these different views.
DEEP: Yeah, the key is that you can instrument. So this is one of the challenges with this stuff, right? Where do you put the cameras? With the keeper scenario, you can imagine you're in training, you've got a camera right by the keeper. You know the distance from the PK ball placement point to the goal, you know the dimensions of the goal, so you can map everything into real, actual coordinates. Talk to us a little bit about that process, because I know that was an exciting moment for us, when we were able to go from, oh, your head's just moving around X pixels, to, hey, we know exactly what the acceleration of your left wrist is, going for a crossover reach to a ball in the upper-right part of the goal.
BILL: Yeah, now you're reminding me, because it has been some time since I thought about that. One of the things we did was also to monitor the velocity of the wrists and the hands. First of all, to get to your question specifically, I made an assumption to make the problem a little less difficult in terms of these coordinates. I assumed that the goalkeeper was jumping within the plane of the goal. So if you were to basically cover the goal with a giant sheet, he was jumping along the plane of that sheet. And we do know the standard width and height of those goalposts. So we could use that as, again, another means of mapping from the pixel domain, how many pixels we can count in the horizontal direction and the vertical direction, to what that means in the real world for that person. Because there's a perspective, obviously, right? We are all familiar with perspective when we look at railroad tracks: they converge in the distance to a single point. So we have to take that perspective into our calculations when we talk about real-world coordinates. And of course, the person kicking the ball is very close to the camera. He's part of that perspective, so he appears much larger than the person in the background. So you have to take perspective into account. But yeah, that was a really cool, very interesting thing. I was just dying to maybe go off to my favorite MLS team and show this type of analysis and see the types of things that they look at, because all of these professional athletes are definitely miked up and spiked up, you know, with this stuff. So...
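One common way to do the mapping Bill describes, under his assumption that the keeper moves within the plane of the goal, is a planar homography fitted to the four goal corners. The sketch below is a generic illustration of that technique, not the team's actual code: the corner pixel coordinates, wrist positions, and 60 fps frame rate are invented, and the goal size is the standard 7.32 m by 2.44 m.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

GOAL_WIDTH_M = 7.32   # standard full-size goal, inside post to inside post
GOAL_HEIGHT_M = 2.44  # ground to underside of the crossbar


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system with Gauss-Jordan elimination and partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(pixels: Sequence[Point], world: Sequence[Point]) -> List[float]:
    """Fit the 8 homography parameters from exactly four pixel/world point pairs."""
    rows: List[List[float]] = []
    rhs: List[float] = []
    for (x, y), (wx, wy) in zip(pixels, world):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -wx * x, -wx * y])
        rhs.append(wx)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -wy * x, -wy * y])
        rhs.append(wy)
    return solve_linear(rows, rhs)


def to_goal_plane(h: Sequence[float], pixel: Point) -> Point:
    """Map a pixel to metres in the goal plane; only valid for points in that plane."""
    x, y = pixel
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical pixel positions of the goal corners as seen from behind the kicker:
# bottom-left, bottom-right, top-right, top-left.
goal_px = [(100.0, 500.0), (540.0, 520.0), (530.0, 330.0), (110.0, 300.0)]
goal_m = [(0.0, 0.0), (GOAL_WIDTH_M, 0.0), (GOAL_WIDTH_M, GOAL_HEIGHT_M), (0.0, GOAL_HEIGHT_M)]
H = fit_homography(goal_px, goal_m)

# Wrist key point in two consecutive frames (made-up values), filmed at an assumed 60 fps.
FPS = 60.0
wrist_a = to_goal_plane(H, (300.0, 420.0))
wrist_b = to_goal_plane(H, (318.0, 405.0))
wrist_speed_mps = math.dist(wrist_a, wrist_b) * FPS
print(f"wrist speed: {wrist_speed_mps:.2f} m/s")
```

The kicker stands well in front of the goal plane, so this mapping does not apply to him, which is the perspective issue Bill raises.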
DEEP: Another thing that happened is that at one point we were working with a couple of ex-physicians who had created this company where they were doing something that's very straightforward for AI, and Carsten, I want you to explain this a little bit in just a second here. One of the things they were doing is they would take the outlines of a Major League Baseball pitcher and painstakingly cut the outlines of the pitcher out, reassemble a video of the pitch during games for these pitchers, and then present that to the pitching coaches of Major League Baseball teams and other teams. One of the things that caught our eye was that one of the reasons they do that, apparently, is that it's really valuable to take away all the visually distracting information, like the fans waving stuff and all of the other motion on the field, and get down to just the actual motion. And this goes back to what we were saying before about this other elite athlete performance assessor, being able to study it almost frame by frame. So we thought that with the same penalty kick scenario we would do something similar. And again, part of the goal here is to walk down from professional athletes and all of the expense of custom sensors, where it's not cost-effective even to have someone painstakingly cut things out frame by frame. So we tried to do that automatically, thinking we can push some of this capability all the way down to amateur athletics, like our daughters are involved with.
Carsten, you want to talk a little bit about what kinds of models are capable of cutting out a person's contour shape in a video, and how we typically go about using that and applying it in this kind of soccer scenario?
CARSTEN: Yeah, so that's a process called semantic segmentation, and it's kind of related to object detection. In the early days we would do object detection by drawing rectangles around, let's say, the ball or the player, and we would train these models to detect basically where in the image the ball is, or where in the image the player is. And then, once again, with the advent of deep convolutional networks, we took that a step further, and we're not fitting rectangles around the object but the concrete outlines. These models are then actually able to not just detect where the object is but really detect the outline. There are a couple of problems with it. One problem is that it's really expensive to train these models, because you don't just have to mark key points or rectangles in your training data; you literally have to have somebody very carefully trace the outlines of the objects in the training data for these networks. So unless you're working with something that is well known or has been well explored, like people or common objects, if you want to do a custom model, it can be very, very expensive to generate the training sets that you need. And the other problem is that it's not perfect. When you do this with video, you will notice that there's a little bit of jitter. So basically the outline is not really totally exact; there's a couple of pixels' difference, usually. And so you have these blobs that shake a little bit when you do this, but in general it works really, really well.
DEEP: Yeah, that is something you notice when you look at the semantic segmentation. It feels like it's going to be a ways before we get those perfect, all the way down to the pixel level, but the trajectory of the improvements that we're seeing is clearly in place.
BILL: Hey, I have a question. Carsten, I think it's nice to point out something that you mentioned there. Let's say I am fascinated with all this technology but I want to apply it to my horse or my dog. Can I take one of these models trained on humans and apply it to those? I mean, I kind of know the answer to this, but I wanted you to speak to it.
CARSTEN: With less effort than doing it from scratch, you can. Yeah, you can do something that's called transfer learning. So let's say you have a model that has been trained to recognize people and cars, but you want to apply it to something the model has never seen before, like dogs. Then you can actually use the base network that was trained on people. It has learned a lot of the rudimentary elements that images and contours and shapes are built out of. Then you provide a smaller training set with your custom objects and fine-tune that network towards your custom object, and it will perform really well, because it has learned these elementary visual elements beforehand. So you need much less training data, but you still need to supply a couple hundred samples, and then hopefully you can recognize your own objects.
BILL: That's really cool. So we can transfer this to other domains, and that's what it takes. That's an aspect I've always loved about working with images, this transfer learning concept, where somebody out there spent a lot of time and effort to build a deep learning convolutional neural network and took the time to create all that wonderful training data, but we can actually use that and apply it to other domains with just a little bit more effort.
DEEP: Very interesting. Yeah.
DEEP: I mean, one of the questions I'll throw out there is: where can you use some of this body pose estimation outside of the sports arena? One of the cases that comes to mind is that we've been working with a client that's automatically trying to assess autism spectrum disorders in infants, for example. The body pose models that we're using to get the skeletons on full adults don't necessarily work quite as well on these cute, pudgy little creatures. Carsten, do you want to talk about some of the challenges in just getting the skeleton itself out effectively in some of these scenarios where it's a baby, it's a horse, it's a dog, it's something different?
CARSTEN: Oh yeah. So, as Deep mentioned, the whole principle behind these techniques is key point detection, right? And the nice thing about human bodies is that the key points you need to model the skeleton are often very distinct: the knees, the ankles, the feet, et cetera. Depending on where you want to transfer and apply that, the key points might not be so distinct. Like you said, babies are a little bit more pudgy, and it's not so clear when they're crawling around where the elbow is. So the model has a harder time learning these key points, especially if they're ambiguous. And so depending on what your constraints are and how distinctive the key points are, it might be difficult to follow the same principles to train such models. An example that we did was that we applied this to surgical instruments in laparoscopic surgeries. We were interested in the same kind of thing: how the surgeon is moving these robotic instruments, with what accuracy, with what speed, whether they're careful or not careful, whether they perform their suture incorrectly. So this was about motion analysis, activity analysis, based on instrument movement, and we did the same thing. We followed the same principles. We trained key point detectors on very important key points in these instruments. But often you would have a long shaft of the instrument that has no distinguishing or standout sections; one part looked like the other. At that point it becomes really, really difficult to train key point models.
DEEP: Yeah, I think maybe we'll talk a little bit about that. So we got pretty inspired here, between Bill's initial foray into running, digging in on the soccer side, helping out with this autism project, and some of the surgical stuff that we started. We ended up actually putting together an activity analysis platform, where we said, look, somebody's probably going to have a device, probably an iPhone or an Android device. They can capture the video, send it up, get the body pose extracted either on device or up in the cloud, and then get the kinematics kinds of analysis that Bill was describing earlier, and get that back down onto the device to generate a report or present some sort of specialized feedback. We actually went ahead and put that together, and so it's now pretty straightforward for us to map a new activity into this infrastructure, and that's an exciting outcome. I know there are a lot of folks out there who have been scratching their heads on what they could do with body pose. One of the challenges, though, is that you have some of these occlusion issues, where something gets occluded, and you also have challenges around camera placement. Do you guys want to chat a little bit about some of those problems and mitigation strategies?
BILL: Carsten, I'd like to go first on this one. It's kind of obvious, and I'm going to speak to something in terms of time series analysis. Sometimes these key point detectors are errant: suddenly the right elbow becomes the left elbow, and so forth. Within one individual frame, which is essentially one point in the time series, you can't really do much about that. But when you collect the points across time, you'll see an obvious blip. For example, my vertical head position went from being, say, six feet off the ground to all of a sudden 20 feet off the ground, or three inches off the ground, for just one point. So there are time series analysis approaches by which you can detect these obvious errors and simply fill them in with something you think is reasonable. So in terms of corrections, you can make them with some sort of visualization of these key points across time and fix them accordingly. Okay, now Carsten, you go.
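As a rough illustration of the time series check Bill describes, the sketch below flags single-frame blips in one keypoint coordinate and fills them in with the median of the neighbouring frames. The function name `fix_blips`, the window size, and the jump threshold are hypothetical choices for this example; it is not the method used in the original running analysis.

```python
from statistics import median


def fix_blips(track, window=5, max_jump=1.0):
    """Flag and repair isolated spikes in a 1-D keypoint track.

    track    -- one value per video frame (e.g. head height in feet)
    window   -- odd number of frames in the neighbourhood (assumed value)
    max_jump -- largest believable deviation from the neighbours (assumed)

    Returns (cleaned_track, flagged_indices).
    """
    half = window // 2
    cleaned = list(track)
    flagged = []
    for i, value in enumerate(track):
        lo = max(0, i - half)
        hi = min(len(track), i + half + 1)
        # Compare each sample with its neighbours, excluding itself.
        neighbours = [track[j] for j in range(lo, hi) if j != i]
        local = median(neighbours)
        if abs(value - local) > max_jump:
            cleaned[i] = local  # fill in with something reasonable
            flagged.append(i)
    return cleaned, flagged


# Head height per frame: about six feet, with one frame jumping to 20 feet.
head_height_ft = [6.0, 6.1, 5.9, 20.0, 6.0, 6.1, 5.9]
cleaned, flagged = fix_blips(head_height_ft)
print(flagged)  # the index of the 20-foot frame
print(cleaned)
```

A median is used here because a single wild value barely moves it; a run of several bad frames in a row would need a wider window or a different approach.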
CARSTEN: With almost all data, you just have to have these sanity checks, and that's right, just like you mentioned. The other thing that's really important is calibration, and you talked about that a little bit earlier. If you really want to derive physical measurements, like velocity or movement speeds, you have to have some reference in your image. And so the camera becomes really important. You need to have some reference object or points in your image. For example, in your running demo, Bill, the fact that you guys were running in front of the fence is great, because you can measure the fence, and you can measure not just relative velocities but also absolute velocity in your scene, because you have that static thing in your scene. So if you set up an experiment, you always want to have something like that in your view, in your scene. You also want to know the camera model, what's the focal length, et cetera, so you can make really good sense of those physical objects, your reference objects. Because without that, all you can measure is relative motion, right? You could measure how fast your hand is moving in relation to your spine, and you can derive some good stuff from that, but you won't get any absolute numbers.
BILL: Yeah, that's a good call, Carsten. Because I remember when we did the football analysis, we did have a lot of those references as well.
DEEP: See, we definitely needed that. Yeah.
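To make the calibration point concrete, here is a minimal sketch of turning pixel motion into an absolute speed using a reference object of known size, such as the fence. It assumes the runner moves in the same plane as the reference object, that this plane is parallel to the image plane, and that lens distortion can be ignored. The function names and all numbers are made up for illustration.

```python
def metres_per_pixel(ref_length_m, ref_length_px):
    """Scale factor from a reference object of known real-world length."""
    if ref_length_px <= 0:
        raise ValueError("reference length in pixels must be positive")
    return ref_length_m / ref_length_px


def horizontal_speed(xs_px, fps, scale_m_per_px):
    """Average horizontal speed in m/s from per-frame pixel positions.

    xs_px -- horizontal pixel position of one keypoint (e.g. the hip)
             in consecutive frames
    fps   -- frame rate of the video
    """
    if len(xs_px) < 2:
        raise ValueError("need at least two frames")
    elapsed_s = (len(xs_px) - 1) / fps
    distance_m = abs(xs_px[-1] - xs_px[0]) * scale_m_per_px
    return distance_m / elapsed_s


# Hypothetical numbers: a 2.0 m fence panel spans 400 pixels in the image.
scale = metres_per_pixel(ref_length_m=2.0, ref_length_px=400)

# Hip position over four consecutive frames of a 30 fps video.
hip_x_px = [100, 120, 140, 160]
print(horizontal_speed(hip_x_px, fps=30, scale_m_per_px=scale))  # about 3 m/s
```

Without the reference length, the same pixel positions could only be compared with each other, which is the relative-motion case Carsten describes.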
CARSTEN: And the other thing, as Deep mentioned, is occlusion, and that is really important. The running demo worked really well because you're essentially moving in a plane. And the same thing is true with the goalie, if you ignore the fact that he might actually jump forward or backward. But the moment you have motion that is not in a plane orthogonal to your camera's viewing direction, you need more cameras. You basically need multiple views in order to reconstruct the motion in 3D space, and then things get more complicated.
BILL: What I always thought was fascinating was this occlusion, when it came to how well, visually, these models perform pose estimation. I've seen a lot of examples where people are dancing in a group, for example, and they naturally turn their heads one way or the other, or turn their bodies. But the models really are mapping where the locations of these key points are in a three-dimensional space, and they tend to do fairly well. So tell me a little bit about what that might have involved, Carsten, from a training perspective. Maybe it's the data, or do you have to do something special with the models for that? How does that work?
CARSTEN: You know, if it is really a two-dimensional image and the reconstruction is from one single scene, then it's just a visual effect. There's no way to reconstruct 3D information unless you have cameras that also give you that information. There are some cameras out there that do that, for example the Kinect, right? So if I have depth information in addition to my RGB information, I can derive these 3D locations. If I don't have that, then it remains a guessing game. And because humans and human skeletal motion are very constrained, you can play a pretty good guessing game, right? Because arms don't twist certain ways, and so you can optimize things that way.
BILL: That's what I was wondering: whether it is, as you say, essentially a guess, whether a person actually labeled these joints for training, or whether they brought in some sort of physical constraints, knowing that they're dealing with human beings. Maybe it's a combo of those two.
CARSTEN: To solve the real problem: stereoscopic vision. At least two cameras, ideally more, so you don't have to deal with occlusion effects.
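For the two-camera case Carsten mentions, one textbook way to recover a 3D position is the rectified-stereo relation: depth equals focal length times baseline divided by disparity. The sketch below applies it to a single keypoint. It assumes two identical, rectified cameras (parallel optical axes, aligned image rows) and that the same keypoint has already been matched in both views; the function name and every number are hypothetical.

```python
def triangulate_rectified(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """3-D position of one keypoint from a rectified stereo pair.

    u_left, u_right -- horizontal pixel coordinate in the left/right image
    v               -- vertical pixel coordinate (same row in both images)
    focal_px        -- focal length expressed in pixels
    baseline_m      -- distance between the two camera centres in metres
    cx, cy          -- principal point (image centre) in pixels

    Returns (x, y, z) in metres in the left camera's coordinate frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth along the viewing direction
    x = (u_left - cx) * z / focal_px        # sideways offset
    y = (v - cy) * z / focal_px             # vertical offset (image-down positive)
    return x, y, z


# Hypothetical setup: cameras 0.2 m apart, 1000 px focal length,
# principal point at the centre of a 1280 x 720 image.
knee = triangulate_rectified(u_left=700, u_right=650, v=400,
                             focal_px=1000, baseline_m=0.2, cx=640, cy=360)
print(knee)  # roughly 0.24 m right, 0.16 m down, 4 m away
```

Because depth depends on one over the disparity, a one-pixel keypoint error matters more the farther the athlete is from the cameras, which is one reason additional views help.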
DEEP: Yeah, all right. Bill, one final question before we wrap it up here: did it work? So you got this analysis for your running. You found out your head's bobbing around too much and one of your ankles was not moving up as much as the other. What did you do about it? And are you a better runner than your daughter now?
CARSTEN: I'll add this: Bill switched sports.
BILL: I started gambling and playing golf. I mean, it seemed to be the natural outcome of this, right? Well, I will tell you something: it did help me. I'm fortunate enough to have a very supportive family. When I go on these massive training runs, I usually don't want to lug around a bunch of water. I'm spoiled, right? I'm a prima donna runner. I don't want to lug all my stuff around, so I beg and ask my wife or my kids, would you ride alongside me on your bike and maybe give me some water and so forth? But having actually seen myself in the video and been able to quantify how my right leg in particular was asymmetric from my left leg, I had my wife take further video of me. And what's really great about this is that you want to do this type of stuff when you're tired. When you're not tired, when you're just starting out the run, you tend to do great with your form, right? When you get tired, you break down. So basically the task that I asked her to accomplish was: I want you to take some video of me and also just visually assess me, like ten miles into the run, because that's when I'm going to be tired and that's when I start messing up. And so without seeing this video, without doing this analysis, I would not have detected this issue. I just didn't feel it, Deep, when I was out there running. And so it allowed me to hone in on it and quantify how bad it was, and then measure it later and see that I was doing better when I was tired. So that definitely helped me.
DEEP: Awesome. Yeah. All right. That's fantastic.
DEEP: That's all, folks. Thanks, Bill and Carsten, for a robust and fun conversation. That's the end of this episode on improving sports efficiency by using AI. As I mentioned at the beginning, we do have some articles on this topic, actually quite a bit of content on related topics, on our website. So if any of our listeners are interested in reading more, go to xyonix.com, that's x-y-o-n-i-x.com. You can check out our articles or you can go to one of our body pose solutions. Thanks, everybody, for sticking around, and we will see you next time on Your AI Injection.