It’s easy to identify, but the labels are asymmetrical; that doesn’t make a difference for your analysis purposes. Good. So now, jumping in. Again, matplotlib, pandas, NumPy and seaborn are imported as usual, the standard ones. Then you read the churn data file from wherever your data is sitting. Then you look at what your data looks like. You have state and account length. Account length is how long that account has been active. I’m assuming this is in months; it is not stated there, so let’s take it as months for simplicity’s sake. So it is how many months the account has been active. Then there is the area code.
The area code tells you which area this customer belongs to, and then there is the phone number. Then there is the international plan and the rest of these things. All the data that we touched upon is there. You see that churn is False or True, and essentially that is what we want to find out. The same dataset can be used for your basic statistical analysis, and the same dataset can be used with machine learning algorithms for prediction.
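The loading and first-look step described above might be sketched as follows. The actual file name and location are not given in the session, so a tiny in-memory CSV with invented rows stands in for the churn file. The column names follow the ones mentioned in the talk, but their exact spelling and capitalization are assumptions.

```python
import io

import pandas as pd

# Hypothetical stand-in for the churn file: a few invented rows.
# With the real file this would be a plain pd.read_csv(<path to the churn file>).
csv_text = """State,Account length,Area code,International plan,Voice mail plan,Total day minutes,Churn
KS,128,415,No,Yes,265.1,False
OH,107,415,No,Yes,161.6,False
NJ,137,415,No,No,243.4,False
OK,75,408,Yes,No,166.7,True
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())                   # first rows, to see what the data looks like
print(df.shape)                    # (number of rows, number of columns)
print(df["Churn"].value_counts())  # how many False / True labels there are
```

Looking at `head()` and the label counts first is a cheap way to confirm the columns were parsed as expected before any plotting.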
It’s a classification problem, so you can use it for that purpose also. Okay, now, I want to find out the histogram of the time spent on day calls: how many people are spending time on day calls, and what is the distribution? So basically I’m creating a histogram. plt is basically matplotlib, so plt.hist is calling the matplotlib function hist. You are passing Total day minutes, and you are clearly saying 10 bins. If you don’t specify the bins, it falls back on a default: matplotlib’s hist uses 10 bins by default, while some other plotting functions choose the number of bins by a formula in the backend.
If we specify 10 bins to be created, it creates exactly 10 bins: one, two, three, four, five, six, seven, eight, nine, ten. So that creates 10 bins, essentially, and it disperses the data across these values. Then I’m putting on the labels. On the x-axis I’m just putting a label of Total day minutes. On the y-axis you are plotting the total number of customers, because it is just a frequency count of the customers by the Total day minutes column. You’ll see what this distribution looks like. It’s a bell curve. It’s not 100% a bell curve, essentially because you see there is more dominance on this side compared to this side, right?
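The histogram call described above can be sketched like this. The data here is randomly generated and purely illustrative, and the column name "Total day minutes" is assumed to match the churn file.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Invented values standing in for the "Total day minutes" column.
rng = np.random.default_rng(0)
df = pd.DataFrame({"Total day minutes": rng.normal(180, 50, size=500).clip(min=0)})

# bins=10 asks for exactly ten equal-width bins over the data range.
counts, bin_edges, _ = plt.hist(df["Total day minutes"], bins=10)
plt.xlabel("Total day minutes")
plt.ylabel("Number of customers")

print(len(counts))     # number of bins
print(len(bin_edges))  # number of bin edges (one more than the bins)
```

In a notebook the `matplotlib.use("Agg")` line can be dropped so the figure is displayed inline.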
There is higher domination on this side. If it were a proper bell curve, this and this would have been at the same level, and this and this would have been at the same level. But it is close to a bell curve; this is as close to a bell curve as you can get with real data. All right, so that’s the first inference we’re getting. There is another thing in seaborn that we saw last time, which is called distplot. If you create the histogram with that one, you can see a line also. You can use that as an exercise.
Use the distplot in seaborn and create the plot for the histogram part. With that one you will get the histogram plus the trend line for it, so you can see how the line is actually looking. So now, how do you figure out the time spent on day calls for the churned and the non-churned? It is the same histogram, but you are breaking it by churn. So you are creating a separate histogram for people who have not churned versus people who have churned. You see the distribution: the majority of the people lie in this bin.
Anyone who is talking about 150 minutes or so, somewhere around this region, in terms of the number of minutes they are spending in the day, is essentially unlikely to churn. But if we look at the total day minutes for those people who have actually churned, you can see that it is a flatter distribution, spread all across. You only see a little spike in the center; otherwise it is more like a flat rate across all minutes. That’s the first inference.
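Splitting the histogram by churn, as described above, could be sketched as follows. The data is randomly generated with no relationship between minutes and churn, so it will not reproduce the shapes seen in the session; only the plotting pattern is the point. Column names are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Invented data: minutes and a churn flag, generated independently.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Total day minutes": rng.normal(180, 50, size=500).clip(min=0),
    "Churn": rng.random(500) < 0.15,
})

# One histogram per churn value, side by side on a shared x-axis.
fig, axes = plt.subplots(1, 2, sharex=True, figsize=(10, 4))
for ax, (label, group) in zip(axes, df.groupby("Churn")):
    ax.hist(group["Total day minutes"], bins=10)
    ax.set_title(f"Churn = {label}")
    ax.set_xlabel("Total day minutes")
axes[0].set_ylabel("Number of customers")
```

Sharing the x-axis keeps the two panels directly comparable, which is what lets you see whether one group's distribution is flatter than the other's.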
If I want to find out how many people are actually using the voice mail plan, there is the basic value counts. You can use the value_counts function to get how many Yes and how many No values there are, that is, how many people have opted for the voice mail plan. Here, 922 people opted. Then you use a countplot, which is similar to value_counts, essentially, but the interesting piece is that you get it graphically, and also broken down by a particular variable that you want. Here I want to take the people who have used a voice mail plan and break it by churn. It’s like a cross tab that you are doing and putting visually. It shows the number of people who actually have a voice mail plan.
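A minimal sketch of the value_counts and countplot steps described above, using a handful of invented rows. The column names "Voice mail plan" and "Churn" are assumed to match the churn file.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Invented rows, just enough to show the two calls.
df = pd.DataFrame({
    "Voice mail plan": ["Yes", "No", "No", "Yes", "No", "No"],
    "Churn": [False, False, True, False, True, False],
})

# Tabular count of each category.
counts = df["Voice mail plan"].value_counts()
print(counts)

# The same counts as bars, with each bar split by the churn flag.
ax = sns.countplot(data=df, x="Voice mail plan", hue="Churn")
```

The `hue` argument is what turns the plain count into the visual cross tab mentioned in the talk.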
Among them, you see how many people have churned and how many have not churned. It is the same here for the people who said No. Then this is about creating a box plot. We have seen a box plot with outliers earlier, across 10 or 12 different years. Here we are creating it for people who have an international plan activated. So you have International plan Yes and International plan No, and I’m breaking it by area code. So here are the area codes: 420, 440, 460, 480. There were only three areas, I suppose.
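One way the box plot described above might be drawn is sketched below. The exact arguments used in the session are not recoverable from the recording, so the choice of International plan on the x-axis and Area code on the y-axis is an assumption, and the rows are invented.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Invented rows; area codes are the three values named in the talk.
df = pd.DataFrame({
    "International plan": ["Yes", "No", "No", "Yes", "No", "Yes", "No", "No"],
    "Area code": [408, 415, 415, 510, 408, 415, 415, 510],
})

# One box per international-plan value, showing the spread of area codes.
ax = sns.boxplot(data=df, x="International plan", y="Area code")
```

Because the area code is a label rather than a measurement, the y-axis ticks are just a numeric scale chosen by the plotting library, which is why values other than the three real codes appear on it.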
If I remember correctly, yes: 408, 415 and 510. Only three areas are spread across, but this is the scale it is providing. The distribution sits like this: the majority of the people who have an international plan are across all area codes, so there is no distinction, while the people who don’t have one are only falling in this area. Okay, so in this one we created a visual cross tab. This is about creating a visual cross tab, and that’s where countplot and the hue can help you. Now this is about actually creating a cross tab. I want to create a cross tab between area code and voice mail plan, between two variables. So I simply call pd, that is pandas, dot crosstab, giving the two variables you want. The cross tab is the number of people who have a voice mail plan in each of the areas. You get those numbers and you can interpret that data. Then, how to pivot information using Python for categorical values.
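The cross tab between area code and voice mail plan described above can be sketched with `pd.crosstab`. The rows are invented and the column names are assumed to match the churn file.

```python
import pandas as pd

# Invented rows; area codes are the three values named in the talk.
df = pd.DataFrame({
    "Area code": [408, 415, 510, 415, 408, 510, 415],
    "Voice mail plan": ["Yes", "No", "No", "Yes", "No", "No", "Yes"],
})

# Rows are area codes, columns are the voice mail plan values,
# and each cell is the number of customers in that combination.
table = pd.crosstab(df["Area code"], df["Voice mail plan"])
print(table)
```

Combinations that never occur in the data are filled with 0 rather than left missing, so the table is always rectangular.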
There is a pivot table function that you essentially call. When you do a pivot, you are not looking at just one value; you create multiple layers. So here we are trying to do it by area code as well as voice mail plan. For each of the areas, I want to know the people who have a voice mail plan and the people who don’t have a voice mail plan, so Yes and No. On top of that, we’ll figure out the international plan they have, so Yes, No, Yes, No. So I’m adding one more layer in here. We took all the columns, so you are looking at International plan and these other columns, each of the things you are looking at for the whole dataset. It is every column in the dataset by area code. For 408, you are looking at the people having a voice mail plan.
These are the people who don’t have a voice mail plan, and these are the people who have a voice mail plan. This column represents International plan, Yes or No. For the voice mail plan, if they don’t have it, it is written as No; if they have it, it is written as Yes. So it tells you how many people have that value for the voice mail plan and, for that particular set of people, how each of these variables is coming out, the distribution. It’s basically creating a pivot table.
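A layered pivot of the kind described above might look like the following sketch. The session does not make clear which value columns or which aggregation were used, so the single value column "Total day minutes" and the mean aggregation are assumptions, and the rows are invented.

```python
import pandas as pd

# Invented rows with the three categorical columns and one numeric column.
df = pd.DataFrame({
    "Area code": [408, 408, 415, 415, 510, 510],
    "Voice mail plan": ["Yes", "No", "Yes", "Yes", "No", "No"],
    "International plan": ["No", "No", "Yes", "No", "No", "Yes"],
    "Total day minutes": [100.0, 200.0, 150.0, 250.0, 120.0, 180.0],
})

# Two row layers (area code, then voice mail plan) and one column layer
# (international plan); each cell is the mean of the value column.
pivot = pd.pivot_table(
    df,
    index=["Area code", "Voice mail plan"],
    columns="International plan",
    values="Total day minutes",
    aggfunc="mean",
)
print(pivot)
```

Cells for combinations that do not occur in the data come out as NaN, which is the main visible difference from the count-based cross tab.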