**00:00** Good morning, I hope I can still say good morning. Good morning everybody, and welcome to this presentation on the elimination of hidden black boxes in machine learning models. My name is Ingo Mierswa, and I’m the founder of RapidMiner.

**00:11** RapidMiner is one of the leading platforms in data science and machine learning. I’ve been doing this stuff for 20 years and I’ve seen it all. Today I would like to share a few things with you. First of all, there’s a description of the problem of understandability, or rather the lack of understandability, in machine learning. More importantly, I would like to share a couple of pieces of advice on how we can overcome this. But the most important thing is that I’ll introduce the concept of hidden black boxes to you. Unfortunately, that’s something most people probably aren’t even aware exists. The good news is that it’s actually easy to fix, and you have to fix it.

**00:46** So stay tuned, we will end on the hidden black box. But before we get there, let’s phrase the whole problem in a slightly different way. This is a quote I took from Gartner’s Magic Quadrant for data science and machine learning platforms research piece: 60% of all models which are intended to be put into production are never actually operationalized. This is just an astounding number!

**01:12** Think about this. It’s not 60% of all models, but of the models which actually should have been put into production. And 60% of them never see the light of day. What does this really mean? A model which is not put into production provides zero value. Why bother building it in the first place?

**01:29** So, this is a huge number and it really annoys me, because creating those models is a lot of work and then nobody’s using them. So what are the main reasons for this? Well, Gartner cites a couple of reasons. The first reason is a bit technical and it kind of makes sense.

**01:43** The first reason is, well, maybe there’s not a good set of tools to support the operationalization. And I would agree, especially if you’re coming from a more coding-centric approach to data science with R and Python. That is all well and good, but it is, in fact, a little bit problematic to take those models and put them into production. I would argue, though, that if you’re smart enough to build a data science and machine learning model using, let’s say, Python, you’re probably also smart enough to overcome those technical hurdles somehow. So I think the other reasons, which are less technical, are more important.

**02:12** The first thing is that, unfortunately, data science is very complicated. There’s a lot of math involved, and a lot of specialist Ph.D. statisticians who all get put somewhere in the basement and then go and create some models; they often do this a bit in a silo. They’re a bit disconnected from the rest of the organization, and that’s too bad, because what this typically means is that the stakeholders in the business are not really part of the model creation process. They don’t really understand what this model means.

**02:38** All they get back is complex mathematical formulas, and then, well, good luck. And that in particular really leads to this lack of understandability and, as a consequence, to a lack of trust. I personally think this is the biggest issue we have in data science today.

**02:54** So, if I’m not trusting what I’m getting back as a result, how could I really disrupt my business? How can I change my business processes using machine learning models if I hardly understand what this thing is doing in the first place? It’s like hiring somebody to solve a problem and forcing this person to speak a different language. I don’t understand anything.

**03:12** Okay, sure: do you understand what I want you to do? Some answer… and you do not understand anything. Are you feeling good about that? Probably not. So how do we really solve this problem of lack of trust in machine learning? I would like to give you an example.

**03:29** I’m going to use a couple of different data sets in this presentation to illustrate the different techniques to overcome this huge problem, and the first data set is literally very close to our heart. It’s a data set about heart disease. In general, cardiovascular diseases are the number one cause of death in most countries in the world. So, it’s definitely a serious problem.

**03:49** And why is this important? Why would it be a good idea to have a machine learning model which can be used to predict if somebody is more likely to suffer from heart disease down the road? Well, because you still have some time. You still have the chance to change your lifestyle, maybe your diet, or take some medication. But if it’s too late, well, it’s too late. So maybe it would be a good idea to create this model, so let’s try to do that. This is the data set, and it’s not important that you understand all the details, but please still follow along a little bit.

**04:16** So, on the left side here, those blue-white columns, this is the data, the data columns describing our patients. We also call those attributes or features or variables. We have things like the age of the patient, or gender, or whether there’s a certain type of chest pain, or what the resting blood pressure is, for example. So we have those columns on the left, and then we have 303 rows; those are our patients.

**04:40** So, for every single patient, like the first row here, there’s a 63-year-old male with a typical angina chest pain type and so on. And then there’s this column here on the right, this orange one. This is what we call in machine learning a target, or also a label. This is what we want to predict. I mean, just for the few of you who have really never built a machine learning model, this is kind of a machine learning 101 for you now as well.

**05:02** So, the goal here is that we take the information in those blue-white columns here on the left and create some mathematical model or formula which can be used to predict the outcome on the right, in the orange column, or at least something close to it. So, you could create, for example, rules like: well, if the age is higher than 40, and you’re a woman, and you have a resting blood pressure of 300, yeah, you’re very likely to suffer from heart disease soon. I mean, I guess very soon.
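A rule like the one just described can be written down directly as code. The sketch below is illustrative only: the field names (`age`, `sex`, `resting_bp`) are assumptions, and the thresholds come from the spoken example, not from any real diagnostic rule.

```python
def rule_predicts_heart_disease(patient):
    """Hand-written rule from the spoken example; thresholds are illustrative only."""
    return (
        patient["age"] > 40
        and patient["sex"] == "female"
        and patient["resting_bp"] >= 300
    )


# Two made-up patients (hypothetical values, not rows from the data set).
patient_a = {"age": 54, "sex": "female", "resting_bp": 300}
patient_b = {"age": 63, "sex": "male", "resting_bp": 145}

print(rule_predicts_heart_disease(patient_a))
print(rule_predicts_heart_disease(patient_b))
```

A model of this form is easy to read because every condition is visible, which is exactly the property that gets lost with more complex models.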

**05:29** So that is the idea: come up with this kind of rule. A rule would be simple because we can understand it, but we will see in a second that life is not always that simple. So, let’s actually build some models together. And obviously, since I like it a lot (but I might be a little biased), I’m going to use… give me one second… I’m going to use RapidMiner for this. In fact, I’m going to use RapidMiner Auto Model because it allows me to build, I don’t know, ten different models in the next twenty seconds or so.

**06:03** This is the same data set we have seen before, just loaded into RapidMiner. The only difference here is that the column we want to predict is the one you see on the left, so we just select this, and now RapidMiner goes and creates a couple of columns. Just ignore the next two steps; all I’m doing here is basically telling RapidMiner what columns to use. Everything which is green is fine, so that’s good. And now RapidMiner suggests a couple of machine learning models: simple models like linear models, the second from the top, or a decision tree further down here, but also more complex models like gradient boosted trees, deep learning, stuff like that.

**06:36** I could turn on automatic feature selection here on the left; I just don’t do this in the interest of time. But keep in mind that whenever RapidMiner is trying a different model here, we’re not just calculating one. Let’s say for the decision tree, we try different algorithms and we try different parameter settings in order to find the best model for you. And if you turn on automatic feature engineering on the left, it can easily end up that we create something between 100 and 250,000 models for this one data set. That might take an hour or so.

**07:02** So, let’s maybe speed it up a little bit by not turning it on. In this case we should actually be done in a couple of seconds. So, the first results are coming in. The first model we can calculate here is the so-called Naïve Bayes model; it has an 86% accuracy. Then the linear model, 88%, or this regression, 83%, and so on. The decision tree doesn’t really look very good: 62%.

**07:24** Well, let’s actually have a look at this model, because decision trees, in general, are something people say they can understand, and if I look at this model here now, I would say, yeah, most of us will be able to follow along. I mean, this model is only using two of the input columns. There’s ST depression, whatever that means. I hope there’s no doctor in the room, because I might actually say wrong things about patients here. Correct me if I’m wrong. Then there’s also the maximum heart rate.

**07:48** But this whole model is maybe too simplistic. Yeah, it’s easy to understand, but would I trust my life to this model? No, probably not. I don’t know, if that’s all doctors are doing, that’s probably too simple. Let’s try a different model, though.

**08:04** If you have a look at this table here, now that all the models are done, you see that the linear model is actually also pretty good, and gradient boosted trees as well. The problem with gradient boosted trees, though, is that although they are still tree-based, you now end up with dozens or hundreds of those trees, so they are not really easy to understand. We will come back to them a little later.

**08:24** But linear models are something many people believe they can understand. Well, at least you can sort the coefficients here to see which coefficients have the biggest influence, like the max heart rate, or whether the gender has some influence. Or if I look at the other end of the spectrum, well, this asymptomatic chest pain type seems to be important.

**08:46** But now let’s be really honest here, as a group. So, we can see what is important. But who of you really knows how this model actually creates a prediction? I mean truly knows. We all believe we understand linear models, but what is the formula? Feel free to raise your hands, not so many at the same time. One, thank you, two.

**09:12** That’s the problem. It seems to be simple, but actually that model, which performs very well, is no longer that simple. And this is not always the case, but in general it is a bit of a rule in machine learning that the better the models are, the more complex they get. Even a linear model can already be too difficult to fully understand, and let’s not even try to understand neural networks or other stuff.
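To make the question “what is the formula?” concrete, here is a minimal sketch of how one common kind of linear classifier, logistic regression, turns coefficients into a prediction. The talk does not say which linear model was used, so this is an assumption, and the intercept, coefficient values and feature names below are invented for illustration.

```python
import math

# Hypothetical values for illustration; not the coefficients shown in the talk.
INTERCEPT = -1.0
COEFFICIENTS = {
    "max_heart_rate": -0.02,
    "is_male": 0.8,
    "asymptomatic_chest_pain": 1.5,
}


def predict_probability(patient):
    """Weighted sum of the inputs, squashed into a probability by the logistic function."""
    score = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-score))


patient = {"max_heart_rate": 143, "is_male": 0, "asymptomatic_chest_pain": 1}
print(round(predict_probability(patient), 3))
```

Even in this small form, the effect of one coefficient on the final probability depends on all the other inputs, which is part of why sorted coefficients alone do not tell the full story.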

**09:32** So, that is the situation we are in. The good models are too complex, so we can no longer understand them, and the bad models, which are not performing very well, are easy, but we don’t trust them because they’re not good.

**09:45** So, we would like to get better models, but the problem is that all I see is that I put some input in on the left side and I get some output out on the right side. The inner workings stay hidden from me. I don’t understand them. That’s why we call those models black boxes: because we can’t really understand how they work.

**10:01** So, how can we overcome this problem? It is actually quite simple. Here are three pieces of advice, and I will explain all three of them to you now. The first one is that I really believe you should focus much less on prediction errors alone. Unfortunately, most data scientists are really trained to do exactly that: let’s build more accurate models. And there’s nothing wrong with this per se, but I think you should also balance this with the understandability of the models, and I will show you a different data set and some examples on the next couple of slides.

**10:28** Another idea, which is unfortunately underused, is to take the model and put it into some simulation framework so you can actually play around with the model and see how it works. More importantly, you can confirm your prior knowledge. Whatever you believe you know about your domain, you can confirm whether the model behaves like you would expect it to behave. Again, an example will come later. And then if you do this at a larger scale, this leads to a technique we call local explanations, and that will become clearer a little bit later as well.

**10:53** So, let’s start with number one here: focus less on prediction errors. I would like to introduce a different data set because it’s even simpler than the heart disease data set. In this one we have only one input dimension, X on the x-axis, and a second dimension, Y on the y-axis. And the goal for the machine learning model here is to predict the value of Y given any value of X. This is what we call a regression problem, as opposed to the classification problem which we have seen with the heart disease data, where we had two classes, “yes” and “no”.

**11:20** So, a quick example: if I know the value for X is “-5”, then the model should say something around 10k as the outcome. Okay, pretty simple. So, obviously this is a nonlinear problem. There’s also some noise involved, and there’s this strange little peak here at the center. So I’m not expecting that the linear model is going to do a good job, but nevertheless, let’s try. If I put the linear model on this data set, the red curve here, well, we can see a couple of things.

**11:46** First of all, there’s not a lot of noise, which is good, but it just didn’t get the shape at all. And as a consequence, well, I said let’s not focus too much on the prediction errors, but at least let’s calculate them. The error rate of that model is horrible. It’s a 39% relative error, so not really good. Well, if a linear model is not good, let’s maybe try a nonlinear one, for example this decision tree.
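As a rough illustration of why a straight line struggles on such a shape, the sketch below fits ordinary least squares to a synthetic V-shaped data set and computes a relative error. The data are made up, and the error definition (mean of |prediction − actual| / |actual|) is an assumption about what “relative error” means here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def relative_error(predictions, actuals):
    """Mean absolute deviation relative to the actual value (assumed definition)."""
    return sum(abs(p - a) / abs(a) for p, a in zip(predictions, actuals)) / len(actuals)


# Synthetic V-shaped data: symmetric around x = 0, so no straight line can follow it.
xs = list(range(-10, 11))
ys = [1000 * abs(x) + 5000 for x in xs]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
error = relative_error(predictions, ys)
print(f"slope={slope:.3f} intercept={intercept:.1f} relative error={error:.1%}")
```

On this symmetric toy data the best straight line is flat, so the model effectively predicts the average everywhere, which is the “didn’t get the shape at all” situation.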

**12:04** They can also be used for regression problems like this one, and then they are called regression trees (just for information). You can see, well, it’s getting the overall shape, which is good, but there is a bit of noise here. You see those little red dots? They’re all moving around; that doesn’t make a lot of sense. And it definitely struggles with the peak here at the center. I mean, what are those flat lines here? And then there’s just this random blob up there. It just didn’t really fully understand the data.

**12:30** But the really good news is that it’s doing a much better job, so the error rate drops down to only 6%. So, I mentioned gradient boosted trees before. Boosting in general is a technique to learn from previous mistakes. You start with one tree like this one, and then you build another tree focusing more on the areas where the first tree made more mistakes. So, in this case, for example, more on this area here. We really focus on this local area there. And then, after you have built the second tree, you build another one which tries to focus on the mistakes of the previous two, and so on.

**12:56** So, you stack them all up on top of each other; that’s the basic idea of boosting and also of gradient boosted trees. So, if I do this, look: a small difference, but an important one. Less noise, it got the peak as well, and the error rate went down to 4%. Fantastic! So, I think we nailed it. Good model here, right? Clear case as well.
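The “learn from previous mistakes” idea can be sketched in a few lines. This is a deliberately simplified version, assuming squared error and one-split trees (stumps) on a single input; it is not RapidMiner’s implementation, and all names and parameter values are illustrative.

```python
def fit_stump(xs, residuals):
    """One-split regression tree: pick the threshold with the lowest squared error."""
    best = None
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds, learning_rate=0.5):
    """Each new stump is fitted to the errors left over by all previous stumps."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * predict_stump(stump, x)
            for p, x in zip(predictions, xs)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic nonlinear data for illustration.
xs = [i / 2 for i in range(-20, 21)]
ys = [x * x for x in xs]

for rounds in (0, 1, 50):
    model = boost(xs, ys, rounds)
    print(rounds, round(mean_squared_error(model, xs, ys), 2))
```

The training error shrinks as stumps are stacked, but the final model is a sum of many small trees that each depend on the ones before, which is where the readability goes.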

**13:15** We have three models here. Gradient boosted trees is clearly the best performing model, with only a 4% error rate. Then we have the decision tree with 6%, and the linear model with 39%. So, it should be totally obvious to us that we go with the gradient boosted trees, with the smallest error. Let’s put it into production, let’s make some money here… Not so fast, let’s actually have a look at the model itself.

**13:35** So, this is the linear model, and we can all understand that. It’s the prediction where Y is calculated as 817 times X plus 17 thousand something. And while this model was not particularly good, at least we can learn something from it. I could, for example, learn that if I increase the value of X by 2, I increase the value of Y by roughly 1,600. So at least I learned something. I can get some information out of this.
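The reading of the coefficient can be written out explicitly. Here $b$ stands for the intercept, which the talk only gives as “17 thousand something”, so it is left unspecified; it cancels out of the difference anyway:

$$
\hat{y}(x) = 817\,x + b, \qquad \hat{y}(x + 2) - \hat{y}(x) = 817 \cdot 2 = 1634 \approx 1600
$$

In other words, every increase of X by 2 units shifts the prediction by about 1,600, no matter where on the x-axis you start.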

**14:02** But decision trees are supposed to be understandable as well, right? And they are so much better, with a 6% error rate, so let’s have a look. This is the tree. In case you wonder what this white blob is: no, this is the actual tree for this data set. And since you can hardly read anything of this, let’s just zoom into this area on the right here.

**14:19** That is just one tiny part of the tree. Look at all those little splits for X: if X is larger than 16, then check if X is smaller than 16.7; if it’s then larger than 16.5, then it’s supposed to be 295. Is there anything I can learn from that tree? No. I mean, look at the whole tree; there is nothing for me to learn from it. I don’t understand how this tree is working, and I don’t trust this thing.

**14:40** So yeah, it has a much lower error rate, but it’s hard to understand. And with gradient boosted trees, it’s even worse, because now I don’t have a single tree, I have 500 of them! And they all depend on each other. Forget it. I’ve been doing data science for 20 years, and there’s no chance on earth I could understand anything of that or learn from it. So, what do we do then?

**15:02** I would like to use this linear model, but it’s not good enough, and the gradient boosted trees are good, but I don’t understand them. So maybe there are other ideas. Here’s the second part of this piece of advice. This is another model, the red line I created. And this looks pretty good, right? I mean, hardly any noise, it’s got the overall shape, it’s got the peak. It’s not really perfect there on the far right end, though, but overall it looks good. What did I do?

**15:23** Did I use some neural networks, deep learning, or just voodoo and magic? Nope, it was a linear model. But I used automatic feature engineering, also part of RapidMiner Auto Model, to create a couple of additional columns based on X: things like the absolute value of X, the squared value of X, stuff of that nature. If I do this, I can fit the linear model on those additional columns, like five or six of them, and that model actually gets the whole shape here very well, with a pretty good error rate.

**15:50** But look at this model. Now I can read it: it’s 10,000 times the absolute value of X, plus 7,500 times X times X (the absolute value of X, squared), and then 700 divided by the absolute value of X, which by the way explains the peak in the center. So let’s just assume this would be data from some physical process or manufacturing process. I, as an engineer, for example, could look at this as a domain expert and say, “Yeah, that might make sense.” I can see that, oh, it squares, interesting, so I learned something. Or I can say, “Yeah, I kind of guessed it was squared, but I didn’t know that this would be the coefficient.”
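The effect of engineered columns can be shown on a toy case. The sketch below assumes synthetic V-shaped data and a single engineered column, `abs(x)`; the model in the talk used several such columns, so this is a simplification, not a reconstruction of it.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def relative_error(predictions, actuals):
    """Mean absolute deviation relative to the actual value (assumed definition)."""
    return sum(abs(p - a) / abs(a) for p, a in zip(predictions, actuals)) / len(actuals)


# Synthetic data for illustration only.
xs = list(range(-10, 11))
ys = [1000 * abs(x) + 5000 for x in xs]

# Linear model on the raw column.
raw_slope, raw_intercept = fit_line(xs, ys)
raw_error = relative_error([raw_slope * x + raw_intercept for x in xs], ys)

# The same linear model on an engineered column, |x|.
abs_xs = [abs(x) for x in xs]
eng_slope, eng_intercept = fit_line(abs_xs, ys)
eng_error = relative_error([eng_slope * a + eng_intercept for a in abs_xs], ys)

print(f"raw:        relative error {raw_error:.1%}")
print(f"engineered: y = {eng_slope:.0f} * |x| + {eng_intercept:.0f}, relative error {eng_error:.1%}")
```

The model type has not changed, only its inputs, so the result is still a formula a domain expert can read and challenge.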

**16:21** So, I trust this thing because I understand it. The takeaway here really is: let’s not just look for the best model in terms of accuracy, but actually balance this with understandability. You can be a bit creative, and if you are, you can get very, very far with even very simple model types like this linear model. I would absolutely sacrifice this 1% of error rate and use this model, which is much easier to understand, to actually instill and create some trust.

**16:48** So, that is the first section: don’t focus too much on prediction errors. Let’s move on to the second one, which is about using the idea of simulation around machine learning models. Well, what do I mean by this? Let’s just assume we don’t have a linear model but something we don’t understand at all. It’s truly a black box. We have no chance of understanding how it works. But we do understand that if we feed in some input here on the left, we can expect some output. And we have some knowledge about our world.

**17:17** So, for example, I could be a doctor. Let’s go back to the heart disease data here. I could say I have this 54-year-old woman with asymptomatic chest pain, with a colored vessel rate of 0.7 (whatever that means), a max heart rate of 143, and other information. And I, as a doctor, could for example say, “You know what, too bad, but I bet it’s very likely that this person is really going to suffer from heart disease,” and so my expectation is, yeah, it should be a heart disease case.

**17:44** So, I can feed something into the model and I can see how the model reacts, and then I can check if this matches my prior expectation. And if I do this for 10 to 15 cases, I might build enough trust to say, “Yeah, that model behaves pretty much like another doctor.” I like that, that’s good. So, this is now going to be a little hard for us, because I assume there are no doctors in the room.

**18:04** But there’s one thing we believe we know about heart disease. I mean, collectively, all of us. And that’s that high levels of cholesterol are not a good idea. So, let’s try and use our model to check this hypothesis: if you increase the cholesterol level, that should also increase the risk of heart disease. So, let’s go back here, and let’s go with just the first one.

**18:28** There’s a Naive Bayes model, which is not super hard to understand, but most people probably wouldn’t be able to understand it. So here we have this simulator. On the left side you have the inputs for the model, and we can play around with this in a second, and on the right side you see what the model is going to say and how likely this outcome is going to be. So, for these inputs, which are basically your average patient, good news: for average patients, the likelihood of suffering from heart disease is only 14%, and it’s 86% for not suffering from it. And at the bottom here you see the main reasons for this particular decision, but we will come back to this a little bit later.

**18:58** So, here we have our cholesterol level, and let’s play around a little bit with this. When I move this around, you see on the left side how the current value, the orange bar, behaves compared to all the values we know from our training data. If I bring this down now by moving it to the left side, pay attention to the blue bar on the right. It actually drops the likelihood of heart disease a little bit. But if I move it up, let’s bring it up, and further up, you see that this actually has some impact on the likelihood of heart disease; it does indeed go up. And if I crank it all the way up to the right, I have pretty much a dead patient.

**19:33** So, again, we are not doctors, but we believe that this is something we know about heart disease, and we checked it with this model, and it behaves exactly like we would expect it to behave. So even if I don’t understand the model, at least it helps me to understand how the model works and whether it behaves kind of like a human being. That can be helpful, sometimes at least, to instill or create this level of trust. But at the same time, I think this kind of simulation is also a fantastic tool for using machine learning models in the first place.
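The same kind of check can be scripted: hold a baseline patient fixed, sweep one input, and see whether the prediction moves the way prior knowledge says it should. In the sketch below, `black_box_risk` is a made-up stand-in for an opaque model (its coefficients are invented), and the feature names and baseline values are assumptions.

```python
import math


def black_box_risk(patient):
    """Stand-in for an opaque model returning P(heart disease); coefficients are invented."""
    score = (
        -7.0
        + 0.03 * patient["age"]
        + 0.012 * patient["cholesterol"]
        + 0.01 * patient["resting_bp"]
    )
    return 1.0 / (1.0 + math.exp(-score))


def sweep_feature(model, baseline, feature, values):
    """Vary one input while keeping all other inputs at their baseline values."""
    return [(value, model({**baseline, feature: value})) for value in values]


def is_non_decreasing(results):
    """True if the predicted risk never drops as the swept input increases."""
    risks = [risk for _, risk in results]
    return all(a <= b for a, b in zip(risks, risks[1:]))


baseline = {"age": 54, "cholesterol": 240, "resting_bp": 130}
results = sweep_feature(black_box_risk, baseline, "cholesterol", range(150, 451, 50))

for value, risk in results:
    print(f"cholesterol={value}: risk={risk:.2f}")
print("matches expectation:", is_non_decreasing(results))
```

A check like this only needs the model’s inputs and outputs, so it works for any model type, and a failed expectation is a useful prompt to look more closely before trusting the model.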

**20:01** So, yeah, machine learning is great if you automate decision making. But in many cases, you actually want to offer something to a user, let’s say an analyst, so they can play around and see what happens. Well, let’s double the price for this product; what is this going to do to my conversion rate? Well, let’s reduce our service level for this group of customers; what about our churn rate?

**20:23** So, when you use a machine learning model to figure out what the best course of action is to get to the desired outcome, simulation is a fantastic tool to achieve that, next to also creating this level of trust. So then, OK, we did this playing around with one input dimension here; what about the others?

**20:41** So, what is really driving a single decision? In order to figure that out, we can do this a bit at scale. And I’m using a third data set now to explain this whole idea. Here we have a two-dimensional data set, X and Y, and it’s a classification problem with two classes, green and blue, and it looks simple for human beings. I guess it’s kind of a checkerboard pattern here. But a linear model, for example, would fail with that data set. I mean, there’s no straight line I could put into this plot to separate green from blue points.

**21:15** All right, so let’s go with our good old friend the decision tree again, and this is the tree for this data set. And I guess most of us would think, well, that already looks more complex than you expected it to look. The reason is that we humans just look at one corner, like this one here, and we just say, well, if the value here is higher than that and smaller than that, it is really easy to explain why this is a green corner. But the tree needs to cover all the cases at the same time, and that often makes models more complex, because they really need to look at all the data points at the same time.

**21:46** So, let’s also focus now on just one data point, let’s say this one here. Obviously, I hope, this is a green data point. So, if I ask you now, out of the two dimensions X and Y, which one is more important to explain why this is a green data point? Is the value of X more important or is the value of Y more important? What do you think?… You said X? Thank you for playing, this is wrong. (But still, thanks.)

**22:19** Y is more important. Why is that? Because if you change Y a little bit, if you go a little bit up, it actually changes the class. So, it has more impact on the decision of whether it’s green or blue than X. If you go to the left or right, it really doesn’t matter a lot; it always stays green. If I look, for example, at this data point here on the left side, it’s exactly the opposite. Here, if I move the point up and down, it has pretty much no influence on the decision of whether it’s a green data point. But if I go a little bit to the left, it turns the class over to blue. So, in this case here, X is more important.

**22:49** So, let’s go back to the first point. Now, we can actually do this at scale, and that’s the same thing as the simulation. If I want to explain how the model is coming to a conclusion for this particular data point, what I can do is just generate more data points in the neighborhood of this data point, artificially.

**23:06** I ask the model what the prediction is for those data points, and then I check, for each of the potential input dimensions, which of those dimensions has more influence on the prediction for this data point. And we could clearly see in this case here that Y has much more influence in this neighborhood than X, so Y is contributing more to this prediction; it’s supporting the prediction more strongly than X. And this idea, as simple as it might look, is actually not that old. There is an algorithm called LIME (Local Interpretable Model-Agnostic Explanations), and it was only published in 2016.
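The neighborhood idea can be sketched on a checkerboard-like toy problem. Note that this is a simplified one-dimension-at-a-time perturbation check, not LIME itself (LIME fits a weighted local surrogate model), and it is not RapidMiner’s variant; the function names, radius and sample count are all assumptions.

```python
import math
import random


def checkerboard(x, y):
    """Toy black box: unit squares alternate between 'green' and 'blue'."""
    return "green" if (math.floor(x) + math.floor(y)) % 2 == 0 else "blue"


def local_influence(model, point, radius=0.3, samples=2000, seed=0):
    """Per dimension: share of nearby points whose prediction differs from the original."""
    rng = random.Random(seed)
    base_prediction = model(*point)
    influence = []
    for dim in range(len(point)):
        flips = 0
        for _ in range(samples):
            neighbor = list(point)
            # Perturb only this dimension; all others stay at the original value.
            neighbor[dim] += rng.uniform(-radius, radius)
            if model(*neighbor) != base_prediction:
                flips += 1
        influence.append(flips / samples)
    return influence


near_top_edge = (0.5, 0.9)   # small changes in Y cross into the next square
near_left_edge = (0.1, 0.5)  # small changes in X cross into the next square

print("near top edge  (X, Y):", local_influence(checkerboard, near_top_edge))
print("near left edge (X, Y):", local_influence(checkerboard, near_left_edge))
```

For the first point, only perturbations of Y ever flip the class, and for the second point only perturbations of X do, matching the two examples from the slide.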

**23:42** And you actually saw that in our simulation before: all the green bars mean they are supporting a prediction like “yes”, the red bars mean they are contradicting it, and input dimensions which are not even in this plot don’t add anything at all. In RapidMiner we actually created a different variant of LIME, which works on both numerical and categorical data, for classification and regression problems, and also in real time. For that reason, I can play around like I did with the cholesterol level. Right now it’s actually supporting the prediction of “yes”, but if I bring it down again, it becomes less important; it kind of drops out, and you can see how the other bars are moving around.

**24:18** So being able to calculate this in real time is pretty important for the simulation. But you can also do this for all the data points at the same time. Here you create all the predictions, and dark green means it’s supporting the current prediction and dark red means it’s contradicting the current prediction. So, with this little technique, it’s really an easy tool to also explain why a model comes to a certain outcome.

**24:41** So, here we have it… the three pieces of advice on how to overcome this black box nature of machine learning models. But there’s one more thing. And that’s this concept of hidden black boxes. The problem really is this: let’s say you saw those models for heart disease, they have like 85, 87, 88% accuracy, and you like that (by the way, doctors have 82). You have that. You have been playing with the simulation, and you created a linear model which is easy to understand. You have really built enough trust, and you want to put it in production.

**25:20** But there’s one thing you don’t know. You have no idea how I actually created this model. Yeah, I used RapidMiner Auto Model, I pressed five buttons, but what did I do to the data? How did I optimize the model? How did I validate the model? You don’t have any idea what I did.

**25:34** For all we know right now, this model could have been the output of some random process or, potentially more dangerous, a faulty one. Maybe I was leaking some information: the data which I want to validate the model on, to see how well it performs, I was accidentally leaking into the training data, which is an absolute no-go in data science. So how would you know? I pressed a couple of buttons. I mean, I love RapidMiner, but do you trust me? Am I that trustworthy? I don’t know. (I am.)

**26:02** So, I actually have another model for you; let me show it to you first. I go back into a different part of RapidMiner, this process designer here. I have the same data here, but then I do some magic inside of this box. I’m not showing you exactly what I’m doing, but like in Auto Model, it’s the same data as before. I’m running this whole thing and I will show you the performance. So, we have roughly 94% accuracy, so that’s much better than the other models, and it’s a linear model again using only a few coefficients, so this is fantastic. We have a better model with a higher accuracy; for comparison, the best models have been around 80%, 87%, 88% before. And higher understandability, thanks to being a linear regression model.

**26:39** This is fantastic! I am not a doctor, but if I were, I would go into production with this, use it on a thousand patients, and I just killed a hundred of them. That’s bad news. What happened? Well, if I now show you what I did (and I’m not going through the details because I don’t have the time, but if you’re interested, I can discuss it with you later), I made a couple of mistakes.

**27:00** I didn’t correctly validate the impact of my data preparation, which is something that unfortunately many, even trained, data scientists are doing wrong, and pretty much all automated machine learning products are doing wrong as well. I then did feature engineering, but unfortunately, when I did this, I only optimized for accuracy and I was using information I shouldn’t have been using. And then I made a third, smaller mistake while I was validating the model. So, the interesting thing is, if I fix all those problems (the same process as before, but with the problems fixed) and I run this again, this takes a second, but what we see now is that the model we thought had 94% accuracy in reality has 93, sorry, 83%, so it’s roughly 10 percentage points less. That’s why I said if I use this model on a thousand patients, I could just kill 10% of them.
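The talk does not show the actual mistakes, so the following is only a generic illustration of one common way data preparation can leak validation information: fitting a preprocessing step (here, standardization) on all rows instead of on the training rows only. The numbers and function names are made up.

```python
import statistics


def fit_scaler(values):
    """Learn standardization parameters (mean, population standard deviation)."""
    return statistics.mean(values), statistics.pstdev(values)


def transform(values, scaler):
    mean, std = scaler
    return [(v - mean) / std for v in values]


# Made-up resting blood pressure values; the validation rows are unusually high.
train = [120, 130, 125, 140, 135]
validation = [180, 190]

# Leaky: the scaler has already "seen" the validation rows.
leaky_scaler = fit_scaler(train + validation)

# Correct: preparation is fitted on training rows only, then applied to validation rows.
clean_scaler = fit_scaler(train)

print("leaky:", [round(v, 2) for v in transform(validation, leaky_scaler)])
print("clean:", [round(v, 2) for v in transform(validation, clean_scaler)])
```

With the leaky scaler, the unusual validation rows look far less unusual than they would in production, so a performance estimate built on top of it tends to be too optimistic. Keeping every fitted preparation step inside the validation loop is the usual remedy.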

**27:43** And that is not acceptable, obviously! But that’s what is happening. People trust those model building products and they just say “okay, fine”, but you don’t know how the model was created. This is what we call the hidden black box. You need to be able at least to see all the details. How was the data prepared? How was the model optimized? Was the data preparation actually correctly validated as well?

**28:08** Most automatic machine learning tools are not doing this, and most citizen data scientists don’t have enough experience to really question those results. So, this hidden black box is never acceptable. You can play around with simulation all day long, you can see all those accuracy values, but this is the dangerous thing. If you validate it incorrectly, the models will actually look “better”, and the only point in time you will find out that it fails is after you put it into production. Again, not acceptable, never.

**28:32** Obviously, the solution is pretty simple. Even if you create a model like this, just have a functionality that lets you open a process like that. Even if you are a citizen data scientist who’s not able to check all of this, maybe somebody else in the organization can do this for you, like a quality assurance check. So you can be on the safe side that you’re not making stupid mistakes.

**28:53** So, a couple of key takeaways here. First one: reduce your focus on errors; just don’t try to optimize only for them. Balance it with understandability from the beginning. Explain models. You should spend as much time on explaining models as on creating them. If you actually spend more time creating them, then you’re over-focusing on prediction errors to begin with. You should really make sure that people can follow along and that business stakeholders really understand what’s going on, because otherwise the model will never go into production. And what’s the point of having a 0.1% more accurate model which is not in production? The value of that model is zero, and not higher. So that is a no-go.

**29:30** And the last thing, which is an even bigger no-go: you have to avoid this hidden black box. We at RapidMiner love automated machine learning because it makes data scientists more productive and it makes it possible for citizen data scientists to build machine learning models in the first place. It’s fantastic! But only if you’re not allowing the introduction of hidden black boxes, because otherwise it’s worthless. So make sure that you’re doing it right there.

**29:52** We try our best to do it right, and we’re really grateful to all our users and customers, but also to Gartner for recognizing us for that: for six years in a row now, we have made it into the Leaders quadrant of this Magic Quadrant for machine learning and data science platforms. That’s a fantastic result! We are really grateful for that, and I’m also very, very grateful for your attention today. Thank you very much, and visit us at our booth.