Excel for Data Analytics - Full Course for Beginners

Luke Barousse
🧮 Course Problems & Certificate 👉 https://lukeb.co/excel 📁 Course Files* & Projects 👉 https://lu...
Video Transcript:
data nerds welcome to this full course tutorial on Excel for data analytics this is the course I wish I would have had when I first started as a data analyst you're going to be working right alongside me as we master how to use a spreadsheet starting with the basics of functions charts and tables working our way up to our first portfolio project we'll then shift gears into advanced features like pivot tables Power Query and Power Pivot ultimately building our second and final project analyzing real world data now to master this tool we're not going to
go straight for 11 hours instead we're going to break it down into 10 to 20 minute lessons during this we'll have exercises for you to learn while doing not just watching followed by practice problems to reinforce your newly learned skills now Excel is the most popular spreadsheet tool in the world it's estimated to have over 1 billion users that's one in eight people in the world and for data nerds it's one of the most popular skills for data analysts coming only behind SQL oh and the same can be said for business analysts in this Tool's
popularity truth be told Excel was one of the only skills that I knew when I landed my first role in data analytics but it was able to handle everything thrown at me and so I've been cataloging over the years all of the most important features to perform data analytics and I compiled it in this course and this video is for absolute beginners you don't need any analytic or spreadsheet experience we'll be starting with the first half on the basic chapters which will build up your knowledge on the fundamentals with covering which versions of excel you
can use for the course along with installing it then we'll get you familiar with how to manipulate a spreadsheet from there we'll shift into practical exercises analyzing data using formulas and functions and then visualizing it using common charts and statistical analysis at the end of the basics chapters we'll put your skills to the test to build an interactive dashboard to predict one's salary based on job and location for the second half of the course we're going to ramp up our learnings diving into advanced analytical features focusing on using pivot tables and add-ins to
dive quickly into data insights we'll learn Power Query to connect to a variety of data sets and perform ETL or extract transform and load finally we'll learn data modeling with Power Pivot and perform advanced calculations with the DAX language by the end of the advanced chapters we'll have built a full data analytics project analyzing the data science job market you'll be able to share this and the previous project in order to showcase your experience with analyzing data in Excel now I'm a big believer in open-sourcing education so this course and all the
content required to complete the course is completely free I not only get you set up with Excel but I also provide all the different Excel workbooks and sheets needed to complete this course with this you'll get access to the data sets needed to make those final projects and even how to share them now unfortunately the AdSense Revenue alone from this course isn't enough in order to support all the different costs associated with building this so I have an option for those that want to support and help out for those that purchase my supporter resources you're
going to get access to a lot of features that are going to help speed up your learning all provided through this custom dashboard to track your progress you'll get guided practice problems to perform after each lesson that will not only provide the solution but also walk you through how to get it if you get stuck along the way you'll have access to a community of others in order to jump in and comment and ask for help additionally you'll be getting my step-by-step instructions that walk through each of the lessons as I perform it and
finally when you complete the course I'll email you a certificate of completion that you can upload to LinkedIn now one quick shout out before we jump in and that's to Kelly Adams she helped me plan out a lot of the different lessons for this course along with being the brains behind a lot of the different practice problems and frankly if I didn't have help I probably couldn't have completed this course so before we go any further with what we need to and actually diving into this course we need to First understand what is Excel and
where the heck it came from so in order to understand this we need to go back oh a little too far back ah just right ancient Babylon when we used to trade livestock like it was crypto now it's during this time that we started recordkeeping and we didn't have paper so we used Stone and we partition it into rows and columns during the time of the Romans they began to perfect this even further with accounting eventually we get some advancements in technology we start getting this on paper this is when the term spreadsheets gets the
introduction this maintained that familiar row and column format in order to catalog different things spread across different sheets spread sheet fast forward to the 1900s and we packed rooms full of underpaid people in order to maintain and keep track of all the different transactions on paper spreadsheets with the advent of computers in the late '70s we started to see our first spreadsheet software VisiCalc and Lotus 1-2-3 then our boy here decided to revolutionize the world a little bit okay not with that but with this I'm Bill Gates chairman of Microsoft in this video you're
going to see the future since its launch in 1985 it's been wreaking havoc in the spreadsheet software community dominating market share and to continue to dominate over the years Microsoft has added more and more features it initially started out to where you'd only be using it for the cells of entering different formulas and performing quick calculations along with getting different charts and analysis shortly thereafter it was upgraded with pivot tables and that's my secret weapon to quickly analyzing data as I no longer have to remember which comes first in INDEX and MATCH now VBA or
Visual Basic for Applications was included in the mid-90s and it's a programming language in order for you to automate tasks in Microsoft applications now we're not going to waste any time in this course learning VBA frankly I feel it's outdated you should learn Python instead and there's newer tools that actually automate the process of data analysis like Power Query this was first introduced in 2010 and then rebranded to Get and Transform and then rebranded again to Power Query sort of similar to what Google does with renaming products anyway this bad boy is like washing down a
couple caffeine pills with a shot of espresso it can ingest and clean so much data in the blink of an eye hardcore data nerds call this ETL or extract transform and load Power Pivot was also introduced around the same time as Power Query and it's like putting your spreadsheets on steroids this allows us to perform data modeling on data sets greater than a million rows greater than what Excel can actually hold in a spreadsheet and combined with the power of DAX or Data Analysis Expressions we can supercharge our calculations fast forward to today and there's been two other
major features added to Excel Copilot which is basically ChatGPT inside of Microsoft Excel and Python in Excel which is basically Python inside of Excel anyway Copilot is great wait that's a lie so I do believe AI chatbots are great at helping us out when we get stuck but I don't want you to rely on that to actually learn this technology of Excel and for Python in Excel you need to know well Python if you don't know this yet it's completely useless now with all these features it can make it seem like Excel is overwhelming which
I completely get but when you focus on the basics and work from there I think it makes it a lot easier to learn it's also why this course is almost 11 hours long all right enough with the history lesson let's actually get into the course material and what you're going to need for this also we're going to be going over what data set or what data we're going to be analyzing for the project for this with the link provided below you can navigate to this which is the GitHub repo that has all the different
folders and files needed to take the course now don't worry if you're not familiar with GitHub we're going to walk through this this pane here basically outlines all the different folders that you have access to and if I navigate in something like resources I can see I have a data sets folder images folder and even a problems folder so for those that purchase the course practice problems you have access to the problems inside of here and they're broken down by chapter along with the lesson in addition to that resources folder you can see numbered here
we have each of those eight chapters and if we navigate into something like spreadsheets intro we have a workbook for each one of the lessons so if you want to download this file you just navigate to it click the three dots and click download but I have an alternate method coming up in a bit inside the workbooks I provide a blank template for you to go through and actually fill in and we'll be getting to what's in this final sheet of actually being filled in now as we move into the advanced chapters they're going to have something
like the data sheet or you're going to use the data from the data sheets in order to do different operations and we'll put those in different sheets as well so how do we get these files well the easiest way is to come up here to this code and go to download zip with the file downloaded all you need is to unzip it and then from there it has all the different folders with the appropriate workbooks inside of them now after going through a lesson I then have practice problems for those that purchase the course perks
to go through here's the course dashboard that you'll get access to that breaks it all down for the problems based on the chapter itself and then by the lesson and inside of each of these lessons are multiple different problems for you to go through and work the other perk that you'll receive with those practice problems is the course notes these break down the concepts in a similar format of all the different chapters and lessons here's the one on Excel install which is going to be what we're covering next but it provides all the different background
on all the different material that we'll be covering and it's in the same format that I'm covering it in the video so you can follow right along just as a reminder there's no requirement to purchase these practice problems or course notes it just helps support me anyway what are we actually going to be covering in this data analysis that we're going to be doing inside of Excel well you're going to be taking the role of a job seeker in exploring what are some of the top paying roles along with skills of data nerds for this we're
going to use the data from my app datanerd.tech that has collected to this point up to 3 million jobs it tells based on a job title and also on a location what are the top skills and it not only tells us the salary of these skills for a particular job but also the salaries of the jobs themselves now the main data set we're going to be using for the majority of this course is this one here inside the data sets folder of data job salary all this data set includes over 30,000 job postings
from 2023 and it includes a wealth of information such as company name salary and location as we go through these examples I'm going to be doing it from the perspective of a data analyst which is the top job in the data set but as shown here there's a lot of different other job titles that you can check out and use as well so feel free to deviate additionally I'll be primarily focusing on the United States but there's a lot of different countries in there as well so feel free to plug in your home country and
analyze this instead now with any course you're probably going to get stuck along the way and so how do you get help for this well I don't recommend just jumping into the comment section and waiting for somebody to help you out instead I recommend using a chatbot like ChatGPT in it you can provide whatever error you're seeing and it will help you out and guide you along the way on what to do and there's other great options as well such as Gemini or even Claude so feel free to use whichever one you're most
comfortable with all right if you haven't done so already it's your turn now to go in and download that GitHub repo with all the different workbooks needed for this course in the next lesson we're going to be getting into installing Excel and mainly understanding what are the different versions that you can actually get with Excel and which one you need for the course with it I'll see you there let's now actually get into working with Excel so in this lesson we're going to be going through how to actually install Excel onto your computer assuming
you don't have it but before we get to that for those that maybe have Excel or an older version of Excel or have different computers we're going to actually go through what are the preliminary requirements you need to have or set up in order to be able to have the Excel you need for this course now here's a breakdown of the different chapters within this course that is the rows here and then for the columns are the different Microsoft products that you can get in order to have Excel now if you're running Excel on
a Windows machine either through Microsoft 365 Microsoft Office at home and student or even an older version of excel up to about 2010 you're going to be fine with completing all the different course content however if you have the Mac version or Mac operating system and Excel is installed directly on that operating system you're not going to be able to complete the Advanced chapter specifically on power query and on power pivot along with the project and it's similar as well for Microsoft 365 online as you won't also be able to complete the Advanced Data analysis
section now if you have any of these first three versions of Excel installed on your computer you can skip to the next lesson if you want I'm just going to be going through before the install process a breakdown of each of these different versions so you understand your options and what you can get so let's get into breaking down all these different versions available first up is Microsoft 365 now with Microsoft 365 you're going to get a host of different Microsoft applications not only Excel but also things like Word PowerPoint and even Outlook and there's two major plans
I'm going to recommend for this either the family plan which allows you to give out these keys for these different services to up to six people or a personal plan which allows you to give it to well yourself now I do want to call out that if you're a college student or maybe you work for a big Corporation you may have access to a free Microsoft 365 plan so if you're in college check with your college and if you're working for a business check for your business if you have access to this so you don't
have to pay money for it but regardless of that if money is an issue Microsoft 365 family offers this free one-month trial which I think you can complete this course within a month so technically you could do this for free if you don't want to get charged you will need to actually cancel before the end of that 30 days and at that point you'll still have Microsoft Excel installed on your computer just everything will be in view only mode you won't actually be able to edit any of the different spreadsheets that we've operated on during
this course let's now move into Microsoft Office Home and Student now this bad boy is the alternate recommendation I'm going to give you if you don't want to pay for a Microsoft 365 subscription this is only a one-time purchase and it gives you keys to Microsoft Office so you can install all the different Microsoft products of Excel Word and PowerPoint onto your computer for the low low price of $150 similar to the Microsoft 365 subscription this will not only work on a Windows machine but it will also work on a Mac machine let's now move
to this last option because it's sort of in the bundle of it of Microsoft 365 online now this version of Microsoft 365 is completely free but there's sort of a catch to this here I am on my web browser logged into Microsoft 365 online and I have access to all the different apps within the browser including something like Excel so we can go to it now this version looks very similar to the version that you can actually install on your Windows or Mac machine there are limitations like I discussed before about Power Query and
Power Pivot so you're going to be limited if you're trying to follow along in this course when we get to those advanced chapters also the layout on the web browser version of this app is much different from the one that's installed on your computer so I'm not going to be providing any support in this course on how to actually navigate this so you're going to have to figure that out yourself so we've discussed everything except for these Mac versions of Microsoft 365 and Office so here's a quick recap of all the different features and cost of
the three major versions of Microsoft that you can get in order to get Excel on your computer for this personally I'm using the Microsoft 365 family plan because it includes all the different features that I need and it also I save cost because I'm splitting with my brother who now that I think of it is actually paying for it but it provides everything that I need and so it's the one I'm recommending for this course now before we get into the install I want to briefly show what are the differences between using Mac with Excel
installed versus Windows with Excel installed on it anyway here's Excel installed on my Windows operating system and Excel on this operating system is in my opinion the flagship product from Microsoft so they're investing all of their effort and resources into designing this application to make it the best possible and then from there Excel online and then Excel for Mac are really just copycats of this anyway the two main differences and the problems I've run into in the past that Excel for Mac doesn't have are in this data tab I have a lot of different data
sources I can choose from and that's specifically related to our power query lesson and then finally it has power pivot which is just completely non-existent on Excel for Mac now here I am on a Mac machine and we can see that it looks very similar to before but there's a lot of limitations that we're going to find with this specifically going back to that power query not a lot of different sources you can choose from and then yeah Power pivot is just completely non-existent you may be like Luke I have a Mac machine what do
I need to do in order to have the most premier version of Excel to use for this well for that I recommend installing a virtual machine and virtual machines like Parallels shown here allow you to host a different operating system on your Mac machine this Windows example that I was showing earlier if I actually expand it out you can see in the background here I'm running this on a Mac machine and I have full capabilities and I'm able to carry on running Windows on this now I've been paying for and using Parallels over the past 3
years and I can tell you the support and the offers from it are perfectly fine and I love using it now personally I'm using the Parallels Desktop Pro Edition but you can get by with just using the standard edition now they also have this one-time purchase that you could do which is 129 but it doesn't get any further updates and I really like how it actually updates and fixes any bugs that I may run into now the other reason why I like Parallels is because it has this coherence mode I have this blue little icon that
I can click up at the top to go into coherence mode and then wait for it it allows me to access any of those windows inside of my Windows virtual machine inside of Mac so here is Excel running right here inside my Mac and this is not only limited to Microsoft Excel but also products like Power BI which I'm using pretty frequently as a data analyst I can also run this in coherence mode but enough about that now that we got that out of the way let's actually get into installing Excel on your Windows
machine or on your Windows virtual machine so the first thing we need to do is navigate over to microsoft.com and I'm going to click up here to Microsoft 365 we're going to be going through setting up the free 30-day version so I'm going to click this of try for free and from there start my one month trial it's going to ask me to sync my data I'm going to assume you don't have it I'm also going to assume you don't have an account so we're going to create one I'm going to put in my email address
and then from there create a password after providing some personal information you're going to need to verify your email with the code they send you now to be clear this is the Microsoft 365 family plan which after that 1 month trial is going to be charging you $99 every year so if you're just one person and you're trying to switch to the personal plan after this you'll need to do that at the end or near the end of those 30 days from there like any company they're going to ask for some payment methods I'm
going to just go ahead with PayPal PayPal's all set go ahead and do more paperwork of adding a billing address and with that I can start trial and pay later so now that I'm logged in I want to install the desktop app so it gives me access to it right here it's going to go ahead and begin this it's going to ask if I want to allow this app to make changes to my device yeah I trust them so it only took a few minutes and all the different Microsoft 365 office apps were installed so I just come
down to the search bar down here type in Excel let's pop it open make sure it's working and in order to get started you need to sign in in order to verify that it's your subscription so I put in my email and password and already forgot my password now I'm resetting my password and now I'm all set up all right and we got to agree to some lawyer talk of accepting licensing agreements at this point I'm pretty worn out of going through this process so I'm just going to click through everything I'm not going to send
any optional data personally I don't like to do that I don't want to personalize right now and it looks like I'm finally done all right I'm into it and now that we're into Excel we can see up here it should have your name or your account that you're going into and go in here into the blank workbook all right so that basically concludes this lesson on installing Excel I do want to show real quick how easy it is to actually cancel your membership should you want to go about just getting the free version or the
free 30-day trial and you want to cancel it before any if I go back to my account I can go in here to manage subscriptions and here I'm inside my Microsoft account which tells me I'm subscribed to Microsoft 365 family I can share it with up to zero to five people and for that I just click on it and I can copy a link and provide it to whoever I want to share it with we're going to cancel it so we can go to manage subscriptions right here and all we got to do is click
cancel subscriptions it's going to have me confirm that I do want to cancel this family plan makes me scroll all the way to the bottom after showing me all these different prices that I could get instead and I'm going to say yeah I don't want my subscription and as I'm filming this on August 27th it basically says hey you still have access to this for 30 days until September 26th so I still technically have access to it so if you haven't done it already it's your turn to now go and install Microsoft Excel using one of the
options that I've shown here in the next chapter we're going to get into a spreadsheets intro to get you familiar with how to actually use all the different functionality or graphical user interface or GUI of Excel with that see you in the next one welcome to this chapter on an intro to spreadsheets and this chapter has three different lessons in order to understand what we're covering in those three different lessons we need to explore some vocabulary with it so let's jump into Excel for this lesson we're going to be focusing on worksheets and that is basically
as you can see this tab here called sheet one that is how to manipulate these different cells within this worksheet or also known as a sheet in the next lesson we're going to be going into workbooks so workbooks basically captures either one sheet like this one sheet one if I add another one sheet two so it encapsulates multiple different sheets within this program of Excel and then finally in the third lesson of this chapter we're going to be moving into the ribbon which is up here at the top and has a bunch of different functionality
to extend into those spreadsheets along with using this file tab up here that has a whole bunch of features within it as well now this chapter was designed for those that may not have experience with using Microsoft Excel before so if you don't fall in that category as in you've used excel in your job and you're pretty familiar with all those different features I just shown you can feel free to skip this chapter and then move into the next one on functions along with all those different practice problems but if you're not comfortable with that
stick around we're going to get into it all right so the first thing you need to do is open up that first Excel workbook in the files you should have downloaded from GitHub named 1_Worksheets inside of here I have an original sheet that allows you to actually go in and fill in everything we're going to be doing and manipulating during the course of this lesson then if you get lost along the way or want to peek ahead to see what we're actually going to do you can actually scroll over here or select the final
sheet to see that now I want to make this as big as possible for you to see so I'm going to go ahead and close out this ribbon up here and you can just do that by double clicking on any one of these different items up here and then from there I also want to zoom in so I'm going to come down here to the bottom right and I'm going to just zoom in to about 200% and scroll on over now inside the spreadsheet it has all these different cells and it's organized in a manner
where it has rows and the rows are labeled with numbers 1 2 3 all the way down to about a million and then we have the columns and the columns are alphabetical and they all go all the way to where they start duplicating where they'll put another letter in front of the other and it'll go all the way through xfd so let's practice some data entry here I have a table we're going to be filling in for this lesson basically has all the different skills associated with it and then I want you to actually go
through while we're going through this and you don't have to provide the values I do you can if you want we're going to be filling it in based on our difficulty when we made have started it or level and then filling out some other self formulas as we go so we're going to start first with Excel and then the difficulty so I'm going to select right here and I can see which cell is selected because it's sort of highlighted here on this B and also two but also right up here next to this formula bar
I just call that formula bar we can see that we're calling out the name of B2 so anytime we reference any cells it first references the column letter and then the row number so in this case I'm selected in C7 so I'm going to go ahead and give this a number I'm going to say four for myself as you notice I I just put it right in the Box alternatively I can also select the cell I want to go to and then come up here into the formula bar press what I want so I want
five for Python and go from there whenever I press enter it then goes down to the next cell so technically I could just go through and enter this all in using my keyboard and I don't have to click or move manipulate at all except to select the cell that wanted so those were all numerical values when we move into the skill known on whether we know it or not we want to put in whether it's known or not we want to put true or false this is known as a Boolean value so typing in something
like true I can see when I press enter it actually updates to be all caps for this Tru so it recognizes the data type of this as Boolean now if you're taking this course you probably don't know Excel so we're going to put in false instead now say I want to update the rest of these for false false I can yeah go through and actually type it up or I can select this lower right hand corner of cell C2 and now I can drag these values down and it will autofill it in now autofills not
just limited to Boolean values let's say I had something like Luke I could put that here and just drag it down it's going to fill in Luke all the way through here a cool feature about Excel is say I have something like one and then two I could select both of these cells and then when I drag it down it's going to actually fill in three or four now autofill can also throw you off especially for dates so let's say we're filling in when we're starting Excel which is we'll put in for the today's date
in my case it's August 27 20124 I'm going to go ahead press enter to save that in it automatically updates to this formatting here in America if in Europe you may see the month in a different location anyway if I select this and actually drag down what you'll see is is it will do that auto fill in but it's not going to keep that same day per it assumes we want to increment by one day now specifically with dates if I want to change the format I can actually come up here and I'll expand out
this home ribbon again and right now it's Rec recognizing that the number is of date and for date I have a few different options I can do short date which is shown here or even something like long date I can also go even further which we'll explore as we get further into this course into this more number formats and date actually has a whole bunch of other different options that we can choose from but for right now we're just going to keep it this simple date format and I'm going to click okay now assuming you
haven't started any of these I'm going to go ahead and actually just select all the different cells that I want and if you were to press delete it's only going to delete that top cell and that's sort of annoying because I want to delete all these different cells instead what I'm going to do if I'm on a Windows machine I'm G to press delete or in my case I'm using a Mac Windows VM I'm press function delete and it's going to delete all the different content right I'm also going to go ahead while I'm here
delete all that different content down there we don't need it now we're going to move on to level type of diet we're going to put into this is text so in the case of excel you're probably a beginner so I'll put in beginner and then if I want to I can go through and fill out different levels for each of these so python Advanced RBI Advanced and so on for all these now one thing to notice real quick is for the date it does specify in here under this home ribbon that it is a date
but all these other ones it just characterizes as general which is perfectly fine now for these other options down here let's go ahead and say I wanted to put in beginner for all the rest of these I can't necessarily drag and drop this but what I can do is I can actually copy it specifically I could right click the cell and come up here and copy it but I don't recommend that also over here on the home menu they have an option as well to copy or even cut something so I can select something like copy
as well and it's going to put these marching ants as they call it around the cell to tell you that hey it's actually selected and then if I wanted to paste it I go ahead and select down here and I could paste it down below that's not what we want to do I don't like going through and actually selecting all these different buttons I want to minimize it as much as possible and I want to use shortcuts so in order to stop these marching ants I can go ahead and press escape and I'll select the
cell that I want to copy and from there I'll press Ctrl+C and that copies it and then I can go ahead and paste it below by selecting the cell that I want and pressing Ctrl+V now you'll be noticing that when I'm going through this I have these shortcuts appearing right here next to me on the screen so you'll be able to follow along as well as I'm using these shortcuts the other option is I could cut this so I could press Ctrl+X and then paste it in here with Ctrl+V but this
is going to go ahead and take this value out of here we don't want to necessarily do that so I'll just copy this again Ctrl+C and then paste it right above here Ctrl+V shortcuts are going to be a big timesaver and we're going to be using them a lot throughout this course in order to save you time and having to go back to your mouse in order to manipulate it and select the different cells all right so let's step this up a notch and we're now going to get into using formulas and
formulas are entered whenever we go into a cell like difficulty here which we want to be on a 1 to 10 scale and we denote formulas by an equal sign and in this case we want the difficulty to be on a 10-point scale basically transitioning from that 5-point scale so we need to multiply it by two so we could do something like equals 4 * 2 and I press enter and it's going to give me as I expect eight but I actually don't recommend hardcoding values that are already inside of Excel here specifically this
four so instead of this I'm going to remove this and I can either type in the cell coordinates of the cell so I could type in B2 and as you notice it's highlighting it the B2 text is blue and the cell B2 is also highlighted in blue alternatively I can have an equal sign here and just go over and actually select it as well whenever I press enter it's going to go ahead and say hey it's four now that I'm referencing that four I want to say that this is B2 * 2 pressing enter we
have 8 once again we're going to use that power of autofill so I can select that cell of F2 and now drag it down and what's going to be pretty interesting about this is the two as denoted in the formula bar and actually whenever I click into it as well the two remains the same but autofill automatically knows to adjust the cell coordinates in the formula for the next cell down based on how I did that autofill just to show this as well I could say hey let's set this equal to B6 right below it and
then if I were to drag this over it's going to then put in C6 D6 E6 then F6 so pretty cool I'm going to go ahead and delete this now the last column we're going to be filling in is skill and level we'll also be using a formula for this and we'll set this equal to this skill thing and also this level so I'll start by putting in an equal sign and then it's not on the screen right now but I know it's in B or sorry A2 and I can see that selected by scrolling
over here now how am I going to get in that F2 well I can do an ampersand now and from there I'll put in F2 and it has this selected as well pressing enter ended up in the wrong one sorry about that should have been E2 and now I have Excel beginner but there's no space in between there this is sort of hard to read so what I can do is actually manipulate this to include another ampersand and then in between this I'm going to put quotes and this is hey insert this text
character in between it specifically I want to have a space then a dash and then another space and then press enter now if I tried to do this without the quotes if I just did this and pressed enter I'm going to get a typo in my formula warning you have to actually put those quotes around it to show that it's text and it's trying to correct it for some minus sign I don't really like how it's doing it oh my gosh it's freaking out now anyway I put the quotes back in there pressing enter boom we have
it and like before I'm going to just do autofill to fill all those in so let's zoom out a little bit because we're now going to be working with ranges which are collections of cells now if you notice whenever I select in this case I'm selecting B2 it says B2 up at the top but if I go to select more of this it will actually call out 5R or five rows by 2C or two columns and then when I let go it just goes back to B2 anyway ranges are a selection
of multiple cells so if I come over here to I1 and put in an equal sign and then if I want to say copy this entire range I can go ahead and select this all so it's saying it's A1 colon G6 so start at the upper left hand corner of A1 and end at the bottom right hand corner of G6 now this is pretty cool there's a new feature of Excel of dynamic ranges it's going to go ahead and fill this in there's only one formula in here of that A1 through G6 but you see that has this
shadow border around here that's showing that this dynamic range is now filling in for all these different things and if we look at the formula bar it's sort of grayed out here too only at the very beginning does it show that A1 and G6 and then you could manipulate it so if I wanted to I could change it to G5 and it would just go down by a row now we're not limited to just that we could in fact select an entire column so in this case I'll put an equal sign and let's say
I want to do the full column of column A right here I can select up here A it's going to select all the way down and if we go over to the formula bar itself we can see that it's saying A colon A that means all the contents of column A are going to be included in this and from there it's putting in a copy of all these different things and then when there's not a value in it because it's a copy similar to over here for these dates of zero we're going to see zero
in all these different values all the way down now similarly I can also do a copy of a row or multiple rows so in this case if I wanted to do rows five and six I could press enter I'm going to get an error with this though and that has to do with this Q column right here that we copied into here so I'm going to go ahead and delete that real quick get rid of it and now we have those rows five and six duplicated below along with that shadow around
it and all there now these ranges are going to save us a lot of time later so I'm going to go ahead and delete this right now I don't want any of that as later on when we get into actually using functions within formulas I can use something like the average function put in a range in here so it selects all of it and then get the average of it in this case now one last thing to note on this before we wrap up here on how to save this is you may have noticed that
this date over here is a number and that's because that's how Excel stores dates within this spreadsheet right here so if I actually click on it and go back up to home right now it's storing it under the format of general so if I were to make this into an actual date we can see that it is in fact 8/27/2024 now just some fun little trivia if I were to put in the number one and transition it to a date so coming up here and selecting date that first date starts at
January 1st 1900 and then the numbers move on from there all right last thing we need to do is now save the work that you just completed with this you can do this multiple different ways we can come up here to the top of your Excel workbook right here and click save you can also as shown use Ctrl+S alternatively you can come over here to the file menu and then come on down to save or save as and then if you wanted to you can specify the location where you actually want to
save your file and save it there now you do have the option which I highly recommend if you're working with real world files to turn on this autosave feature the one caveat to this is that your files have to be stored on OneDrive right now with the plan that I have I can store about one terabyte of files on there so if you'd like to do that feel free to transition your files there I'm not going to and I won't have autosave on for this but for very
important files definitely do have autosave set up all right for those that have purchased the practice problems and notes you have some practice problems to go through and get even more familiar with manipulating cells inside of a spreadsheet after that we're going to be going into manipulating a workbook with that see you in the next one all right we're going to be continuing on with this spreadsheets intro focusing now on workbooks so previously we were focusing on worksheets which are a sheet inside of a workbook now we're going to be focusing on manipulating and moving
data between workbooks now for this I don't want you immediately jumping into that 2_Workbooks Excel file this really just has all the answers in it it doesn't have really what we need for it instead we're going to be starting with a new workbook and importing in some data so specifically if we go into this folder of 0_Resources into data sets we have this one Excel file called data job salary monthly now this is similar to the data that we're going to be using for the remainder of the course we're actually going to
use another Excel sheet but this one here is pretty neat because it's broken up by months into different sheets so all the job postings for January are in this sheet called Jan and so on for February and so on for March so what we're going to be doing in this lesson is we want to just evaluate the January data and move that into a new workbook so to get a new workbook as easily as possible we're going to come over here to the file menu I'm just going to go to new and click blank workbook now
here I have that new workbook right now it's titled Book2 because it hasn't been saved anyway going back to that file menu just to show you I have different options to get a new workbook so we went into new and just selected a blank workbook also we could use this home tab and select a blank workbook from that it also has a bunch of different tutorials you can check out also we have this open tab right here which allows you on the left hand side to select a location like this
PC or even browse different locations in your file system but frankly I'm using more often than not over here on the right hand side this right here which shows a past history of Excel files I've worked with so I can go through and actually select an Excel file pretty easily we're going to explore more about this file menu in a bit let's get moving some data first now before we get into copying this data into the new workbook itself I want to actually just copy it within its own workbook so if we notice
some controls down here at the bottom we have all the different sheets if we want to add another sheet which I want to copy it to I'm just going to add this in right here and I'm going to call this Jan copy press enter and that's the new sheet and I named that by just double clicking in there which allows me to edit it in addition I can right click it and I can do things like rename it and that will do the same thing now there's also some controls around here you notice there's some arrows
right here and what that does is just scroll all the way over or incrementally over so I can see all the different sheets in this case there's more sheets than I'd actually see in one view then we have the scroll bar over on the right hand side this is actually just controlling the scroll area within our new sheet of Jan copy so previously we saw how we can copy ranges using a formula in this case I'm entering equal to and then I'm just going to select this range right here press enter and I
can get it inserted in and then actually looking at the formula it's just equal to J1 colon P8 and this has its range right there all right so I want to get the contents into this sheet so I'm going to start by putting an equal sign and then I go over to that Jan sheet and when I go over here you're going to notice that now next to that equal sign I have Jan the name of the sheet and an exclamation point this is identifying the sheet and I want all these different items
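As a rough sketch of the reference nomenclature described here (a sheet name, an exclamation point, then the cell range), the following Python snippet splits a reference such as Jan!A1:P2 into its parts and counts the rows and columns it covers. The function names are made up for illustration, and this is not how Excel itself parses references; it assumes a simple sheet name with no spaces, no dollar signs, and no workbook brackets.

```python
import re


def column_to_index(letters: str) -> int:
    """Convert a column label such as 'A' or 'AA' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_reference(ref: str):
    """Split 'Sheet!A1:P2' into (sheet name or None, row count, column count).

    Hypothetical helper for illustration only; handles plain ranges.
    """
    match = re.fullmatch(r"(?:(\w+)!)?([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", ref)
    if match is None:
        raise ValueError(f"unsupported reference: {ref}")
    sheet, col1, row1, col2, row2 = match.groups()
    n_rows = abs(int(row2) - int(row1)) + 1
    n_cols = abs(column_to_index(col2) - column_to_index(col1)) + 1
    return sheet, n_rows, n_cols


if __name__ == "__main__":
    # A range on another sheet, then a range on the current sheet.
    print(parse_reference("Jan!A1:P2"))
    print(parse_reference("A1:G6"))
```

A reference without a sheet name, like A1:G6, simply points at the sheet the formula lives in, which is why the sheet part comes back empty in that case.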
so as I go to select it all you can see that it's updating in the formula bar right now I have A1 through P2 selected but I actually want to select everything in this sheet and we're at about 3,000 rows and right now I'm only at about 500 of those this is going to take forever so I don't recommend necessarily doing this type of method to try to select all your data so I'm going to go ahead and escape out of this and go back to where we were at the Jan copy instead once again I'm
going to press that equal sign go back to that Jan sheet right up in the formula bar once again I can see that it has the Jan and the exclamation point I'm going to select A1 to start with and I'm going to press the shortcut Ctrl+Shift and then the right arrow key and now all the top row is selected from here I'm going to continue to hold Ctrl+Shift and press the down arrow and it's going to select all the different rows so as we can see up here A1 to P3103 scrolling down
we don't have any more data now all I have to do is press enter and I did this to basically show the nomenclature so now we're not only selecting a range but we're also selecting a range from a different sheet and this is how Excel does the nomenclature or the formula necessary to make this work and once again this is a dynamic range appearing inside of here but we really want to put it inside of here into this new workbook so what I'm going to do is I'm going to actually delete this sheet right
here because we don't need this copy sheet in here I don't want to actually manipulate my data at all going to right click it and select delete it's going to prompt me that hey you're going to permanently delete a sheet do you want to continue yeah I want to continue now once again I'm going to go back to that original blank sheet that we have I want to put it into here so I'm actually going to name this one Jan and then we'll call this one formula because technically it was a formula not
a copy I don't know why I did copy before anyway back into A1 once again I'll press that equal sign and then going back to that other workbook I will select the first cell in there which is actually A1 and now we can see in the formula bar which is actually the formula bar of the other workbook which is sort of strange that we have inside of brackets the Excel file name then the sheet that we're in and then the actual cell
range of A1 we have dollar signs around this this locks the references of it which we're going to go into more detail on but the main thing to understand is this has A1 selected right now but we want to select all this data so that shortcut of Ctrl+Shift+Right selects all the different columns and then Ctrl+Shift+Down okay it's all selected I'm going to go ahead and press enter and it's going to take me back to my original workbook that I was trying to work with now that was using formulas to copy
this data we're going to explore two more options the second one is going to be somewhat familiar using copy and paste so I'm going to create this new sheet I'm going to call it Jan copy and paste from here I'm going to go back to our original data that we have and since we're at the bottom of the sheet I'm just going to select the bottom right hand corner and press Ctrl+Shift+Left now if you noticed it went and stopped at this blank cell right here which isn't a big deal I'll press it one
more time it'll go to the next cell over that actually has a value in it and then once again it's going to go all the way to the end at A3103 so basically if there's any blanks while you're trying to do this it's going to stop at those values there okay and then from there I'm going to press Ctrl+Shift+Up and as we're seeing it's going to stop at every different blank cell along the way this is going to take forever unfortunately I don't recommend you actually do that ever again instead start up
at the top left and do the Ctrl+Shift over to the right and then all the way down in order to select all the cells now like we did before we want to copy it I could either use this up at the top in the home ribbon right here I could actually select copy or the shortcut which I'm going to recommend of Ctrl+C and from there going back into our new workbook selecting cell A1 and then using Ctrl+V and pasting all this data in now moving on to the third example which is
actually the one I recommend you do anytime you need to move sheets of data basically in both of those previous approaches you could end up missing some data when moving it over so I don't really recommend doing that instead I would come down here to the Jan sheet right click it and select move or copy so we have this new window that pops up and it has to book and right now it has this Excel file selected of data job salary monthly we don't want to move to that we want to move to Book2 we could also
move to a new book but Book2 is open that's what we've been working in that's what we're going to move to okay we can see we have the different sheets that we've already made in there and it says in this dialog this is where you want to put this before this sheet and we want it at the end so we'll select move to end now we don't want to take this sheet Jan out of here we just want a copy of it so we're going to select this create a copy and then click okay now Jan
has moved over here but I do want to actually differentiate this so I'm going to put move or copy now in the next lesson we're going to be exploring more about the ribbon but right now we're going to be exploring more about the file menu also known as backstage view we've gone through this home new and open we also have this here for share this is available well if you're sharing it via OneDrive this makes it super easy to share with your co-workers we're not going to go into a lot of detail but this
is a great option if you're working in OneDrive and you want to actually collaborate with other co-workers you can work on Excel files at the same time moving down the list here we also have get add-ins and we're going to be actually looking at different add-ins we can use in the advanced chapters whenever we get to that so we'll be working with some add-ins then next up is info which has over here on the right hand side some key metadata about our Excel file itself then if we want to get into actually protecting our
workbook which we're going to cover in a few chapters down the road you can get into actually doing that the only other thing that I find myself doing from time to time in this section is version history once again this requires you to be using OneDrive for it but you could go back and revert to a previous version that you worked with so it's great for that now moving into save or even save as since we haven't saved yet they're both the same right here I'm going to go ahead and save this
but I don't want to save this on OneDrive personal I'm just going to save this on my desktop so I'll come and select desktop and then I'll name this two workbooks and save it now beyond save as we also have things like print which I really don't find myself doing too often you should be sending an electronic version export if I wanted a PDF version of something and then finally close as well same thing as this X up here just to X out of it and there's two more areas down here that I want
to call out and one is account and that allows you to actually see behind the scenes of what's going on with your Microsoft account and this is generic to all the different Microsoft products that you have so not just Microsoft Excel as you can see from my information I'm actually inside the Microsoft 365 Insider program so I get a lot of access to Insider features and get to experiment with new stuff before other people do anyway this is where you want to come anytime you want to make sure that you have your Microsoft products up
to date I have automatic updates available so even if I check to update now it's going to tell me hey I'm up to date the other thing to note on this is the different office themes that you have on this I'm actually going to change this right now to use system settings which on my Mac I use dark theme so it's going to go to that last two options are hiding down here behind more I have feedback so if I wanted to give feedback on this product I'd probably go to something like X or Twitter
instead and then finally options we'll be getting to options later on in this but this allows much more advanced features that we can actually go in and customize using this menu especially whenever we get into add-ins we're going to be doing that from here all right so now you've become an expert at how to manipulate different spreadsheets or sheets along with moving them between different workbooks in the next lesson we're going to be going into this ribbon up here and actually exploring everything a little bit further and getting a sneak peek into each
one of these for those that purchased the practice problems and course notes you have some practice problems to go through now and experiment working with different workbooks with that see you in the next one where we get into the ribbon see you there all right in this final lesson of the spreadsheets intro we're going to be getting into the ribbon inside of Excel and better understanding what all the different tabs are and what their capabilities are by doing some simple exercises for this we're going to continue to be analyzing that January data set that we worked
with in the last lesson and we're going to get into actually performing some data analysis with it so for this lesson you can open and use that ribbon menu Excel file which I have right here and all the data that we're going to be working with or that January data is in this data tab along with all the examples in all the different tabs but I don't need this I'm not going to work with this so I'm going to close this out instead I'm going to be working off where we left off last time in
that two workbooks file where we actually moved over that January data set now quick disclaimer for any of these files that you're opening up if you're noticing the security warning that automatic update of links has been disabled you can go ahead and just enable the content and then click right here on do not ask me again for network files and select yes because I want to make it a trusted document now if you're getting any of these errors that the file has been moved renamed or deleted mainly because you have it in a different location than what
I had it here's actually the address of the file that I'm using I open it up anyway this is the actual address of where the file is anyway you can come down here and select these three dots on the file in question and just select change source go into browse and then from there select where the actual file itself is so in this case it's looking for that data set file of data job salary monthly I'm going to select it select okay and then it's prompting me now that this linked workbook hasn't
been refreshed so I want to go ahead and refresh it and it's going to update it all right close out of this now anyway that was all silly because I'm going to go ahead and delete this formula one right here and also this copy and paste tab right here we only want to keep the move or copy which is the actual sheet that we moved over that has all the data for this lesson okay I'm just going to rename the sheet data so let's dive into this home tab and this thing has a lot to do
with formatting the text and how things appear within the spreadsheet for example I can select all these top rows right here so basically A1 all the way to P1 I can change this font size to something like 12 for the fill color or the background color I can change it to something like a light gray right now it looks like it's already bold I could turn it off or turn it back on inspecting all these different columns I can see that some of it is hidden especially here this date column I can see inside of
here this is the actual value but whenever we actually look at it from afar it has these pound signs so double clicking on the edge of that H column right here it actually expands out to the width it needs to be you can actually do this for all the columns by just selecting all of them and then double clicking that last one and then that expands them all the way we can see that that last column is well super long so it has all the different skills typically these titles up
the top I'd like to keep centered so that way I know that it's a title but I could move it to either side also I can move it up or down if I wanted to but we'll leave it right there in the center as well getting into the number formatting itself I can actually go and select something like job posted date it's going to select that whole column if I wanted to I can turn this into a date so in our case I want to do a short date now other columns I would
want to format are these salary year average and also salary hour average so besides just clicking here I can also just select that hey I want to use this accounting number format and it's going to automatically put two decimal places at the end since we're in the hundred thousands I don't really care about those so I'm actually going to remove them by saying decrease decimal I'm going to do that twice now for something like salary hour average I'm going to also convert this to a currency but these may have
two decimal places of values included in it so I'm going to leave it now for the styles and cells portion we're going to be getting into this more especially into conditional formatting in the spreadsheets advanced chapter in chapter 4 so we'll save that for then the next thing I want to do is get into this editing section and this is a pretty powerful feature we can actually sort and filter our data if we wanted to so what I'm going to do is actually select all these cells from P1 all the way to A1 and then
come in here inside of editing select sort and filter and apply this filter so let's actually get into filtering this data specifically I'm wanting to investigate data analyst jobs in the United States and specifically full-time jobs we're going to be looking at the salary data for this so I want to filter it down for it so I'm going to select here I'm going to unclick select all and select data analyst and now it's going to filter for all the different data analyst roles there's nothing else in there additionally for that job schedule type
I want to be looking at full-time roles only I don't want to include any other ones so I'll select full-time for the country I don't want to be skewed by any other countries I live in the United States so I'm going to then select United States and then finally I only want to look at the yearly salary data so I can actually come over here to the salary rate and select here I only want to look at the year data okay so now this has everything in it that I want we're
going to get to analyzing and visualizing this in a second before that I want to talk about two other features add-ins which we talked about before on how you access them through the file menu you can also get to add-ins via this and finally analyze data which in my opinion isn't that strong of a feature this tab uses a little bit of artificial intelligence behind the scenes for you to investigate so it'll actually provide you different visualizations that you could make out of your data or you can even go as far as asking a question
about maybe you want to see hey the distribution of salary rate or something like that all you have to do is come down here and then insert the chart that you want I'm going to close out of this now we can see that we've made this salary distribution that we maybe want to visualize overall though I find that this analyze data is pretty hit or miss so I'm not using it very often now the insert tab is where I spend the second most of my time after the home tab they
conveniently put them in the correct order there's three major use cases that I'm using out of this in chapter 4 on the advanced use of spreadsheets we're going to be going into tables and then in chapter 5 we're going to be going into pivot tables but even closer than that in chapter 3 we're going to be going in depth on how to use these charts but let's get a sneak peek into this specifically remember we filtered this table down to data analyst jobs in the United States and specifically full-time roles we want to visualize
this salary year average column so with column M selected I come up here to recommended charts and it's going to give me a visualization of some well recommended charts now there's only four here I can also select this other tab up here of all charts and actually try to see hey what would this look like maybe in a pie chart or a bar chart anyway I want this in a histogram which we're going to go into more detail on how to read later all I have to do is just come in here double click
it it'll insert it in now notice how whenever this was created we now have new tabs appear inside of here specifically with this selected we have this chart design and format tab if I select off of it those tabs disappear and select it again they reappear this tab allows me to dive in and actually further customize these visualizations to how I want them to appear I can even move them to let's say a new sheet and I can title this something like histogram and then move it the charts don't always necessarily appear just like that
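A histogram like this one groups the salary values into ranges (bins) and counts how many values fall into each range. As a minimal sketch of that idea in Python, assuming made-up salary values and a bin width of 25,000 (neither comes from the course data, and Excel chooses its own bin widths when it builds the chart):

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up yearly salaries, for illustration only (not from the course files).
    salaries = [62000, 75000, 78000, 90000, 91000, 95000, 120000]
    for start, count in histogram_counts(salaries, 25000).items():
        # One '#' per salary in the bin, like a bar turned on its side.
        print(f"{start:>7}-{start + 24999}: {'#' * count}")
```

The tallest bar marks the range where most salaries sit, which is the main thing to read off a histogram.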
let's actually do a deeper analysis to see what are the different job title short values available I want to clear all these different filters on here so I'm going to come back up here with this one row selected come into editing sort and filter and I'm going to say hey clear all the different filters now selecting column A going into insert and into recommended charts it's recommending this clustered bar chart which is actually what I want to view so double clicking on this this provides me a breakdown of all the different counts of the different
job titles within our data set and we can see things like data scientist engineer and analyst are some of the highest amount of job postings in this data set now unlike our histogram example this actually provides this data in a pivot table which we're going to be going into in the pivot table chapter which allows me to further manipulate the data so say I want to actually sort this I could right click the values right here and click hey sort smallest to largest and then closing out this pivot table tab right here I can actually
see what is the highest amount of job compared to the lowest which is cloud engineer now there's remaining tabs we're going to be going through hopefully rapid fire in order to cover these as I find I'm using these less frequently than these other tabs that we previously talked about the draw tab allows you to well draw on your spreadsheet so I can just write on it if I wanted to but I don't really find myself doing that except for maybe when I'm building dashboards besides that use case it's pretty rare if I want to undo
this drawing right here I can come up here and click undo or I can press Ctrl+Z and it'll remove it page layout tab is great if you're having to print out any data for those co-workers that are living in the past and don't know how to accept things digitally you can do everything from adjusting your page layout to adjusting the scale that you're actually viewing things at now personally I find myself more using these sheet options right here so if I go to this job count tab right here if I wanted to I could
turn off the grid lines on here as you can see it got white on the background I really like that now if I wanted to make sure I had actual grid lines around my table I come back to the Home tab and from here I can select borders and from there I want to put all borders on there so now I look like I have this table right here along with my graph super fancy next up is formulas this is where you need to go if you can't remember a function that maybe you want
to use if it's a text function you come in here select something like text you can scroll through and actually see even a description of the different functions that are available so in this case replace it tells you hey replace this part of a text string with a different text string depending on what version of excel you have in the newer ones you'll have this insert python to insert python functions and then finally they have more advanced features with maintaining and updating and formatting your different formulas and functions which we'll be diving into in
the next chapter now besides the home and insert tab the data tab is the next tab that I find myself using all the time in chapter 7 we'll be diving into Power query and we're going to be focusing heavily on this get and transform data and also queries and connections and then in chapter 8 when we get to power pivot we're going to be going into managing our data model with power pivot in chapter 4 we're going to be going into this forecasting and we're also going to be adding in some extra add-ins that are going
to appear in this data tab now I sort of skipped over the data types and sort and filter because we saw them on the Home tab they're just conveniently located here in bigger format for you to use also all right this tab on review is probably the least likely for me to actually use I can actually go through and check things like spelling and add comments or even protect my sheet besides that I'm not finding I'm using that very often view tab is similar to the review tab in that I'm only using it a little bit more
you can change the format of how you actually want to view things but mainly I'm finding myself using this the most of freeze panes let's say you see I'm scrolling down here and I don't know what the job or what the headers are right here so going over to this view tab I can actually come in here to freeze panes and select freeze top row or even freeze First Column so in this case that top row actually stays up there and I really like it like that now let's say I want to freeze both
the top row and that First Column there's not really a selection for that so here's what you can do you can come over here to freeze panes and select unfreeze panes and then select something like a cell like B2 that means I want everything above this and to the left of it to freeze so now when I select freeze panes this upper or top row is actually Frozen and then the actual First Column is Frozen as well all right final tab is help and I'll be honest I think this is pretty useless if I get
stuck with anything along the way I'm finding myself navigating to something like ChatGPT and it's helping me a lot quicker than trying to navigate through this help box that it provides and I'm already getting an error message with even accessing it so you can see how often I even use it then now we've been doing a lot of manual clicking with using the ribbon and I think a good resource that goes with this is shortcuts so if you come inside of the resources folder we have an Excel file here called Excel shortcuts and what
this has in it is a list of all the different shortcuts that I find myself using anytime I'm inside of excel so it's worth having all of these I'm not going to lie committed to memory it looks like a long list but I'm telling you by the end of this you're going to have all of these basically committed to memory they're going to be a timesaver now although I shed on people that print out stuff this would be something that I do recommend actually printing out and having next to you so that way you can reference
really quickly while going through this course all right now I know we moved fast through that but we're really going to be diving into as I called out during this lesson all of these different tabs even more as we advance through all the different chapters that was more of a sneak peek into what you're going to be exposed to coming up in this course all right for those that purchased the practice problems you have some problems to go through and actually experiment more with the tabs in the next chapter we're going to be jumping
into functions and also more specifically formulas in order to build them out and perform data analysis on that data science job posting data set with that I'll see you in the next one all right welcome to this chapter on formulas and functions in this lesson we're going to be focusing specifically on going a deep dive and understanding formulas then in all the follow-on lessons we're going to spend the majority of our time working on functions for that we'll be exploring the entire function Library focusing on the key functions within this library that I find
that I'm using time and time again in data analytics so what are we going to be doing in this lesson well we're going to be focusing on a fictitious data set we're going to keep it small in order for us to get more familiar with operating with formulas and operating on this data set specifically by the end of this we're going to be able to input into this worksheet a number of years of experience or total salary and be able to see whether these jobs meet those conditions specifically whether they meet both of
those conditions so for this you can follow along by opening that formulas intro workbook in this workbook we'll be staying in this data sheet right here all the different answers when we get to the math operators comparison operators or cell referencing are shown via that sheet but we'll just be sticking with data for now first is math operators and as shown by this table here you can use a variety of different symbols to conduct the different multiplication subtraction division operations that you want to do so let's dive into testing some of these out we're going
to be filling in each of these columns that correlate with the associated job title as we go through this so the first one's going to be experience pretty simple right we talked about before in order to reference another cell we would use an equal sign and then from there we can either type or select a cell I'm going to recommend just typing it to make it go faster C3 it's highlighted blue because that's the cell that's highlighted then we'll be using the autofill feature of this to fill in all the cells below and we notice
that it updates to here this one's equal to C12 which correlates to this one right to the left of it so let's calculate our total salary and this is going to be taking our annual salary in column D and adding it to our bonus Max in column e so we can do this by specifying D3 plus E3 and from there pressing enter once again to autofill it I select that cell that I want and drag it on down now if I want to calculate what is the rate of bonus or the bonus rate that
is going to be the bonus divided by that salary so in this case E3 / D3 once again going to use autofill drag and drop it all the way down now for all these values I don't like what it's formatted as right now I'm actually going to change this to a percentage and I want to see one decimal place so I'll press this one to expand out one now anytime I do any type of mathematical operation in Excel I always want to try to confirm it that it's correct I did the operation correctly so in
the case of this bonus rate I can do this by confirming what we got for total salary previously so if we took that bonus rate which we want to confirm right so we're going to take that and multiply it times our annual salary right so that should give us that bonus right there then if we wanted to like we said we want to confirm total salary right here so I can just add in that we want to also add in that annual salary itself and we do have that total salary right here to
actually confirm what's going on dragging it down and doing an autofill all these values look like they correlate to what it should be for total salary so I feel we calculated the bonus rate correctly now going back into the formula itself you can see we have multiple operations in here how do we know whether multiplication addition subtraction what comes first well really if you know the order of operations it really is the same here here are the different operators listed in their order of Precedence exponentiation comes first multiplication and division are second then addition and subtraction are
third it's Then followed by concatenation which we did in one of the previous lessons followed by the comparison operators which we're about to get to so with that segue here we are comparison operators for this you probably are familiar with the first three the last three are something that get a little bit more complicated whenever you have a greater than or equal to less than or equal to or in this case a not equal to so previously I just sort of did a cursory check to make sure this confirmed total salary column equals this
other total salary column but imagine you have hundreds of thousands of rows how can we actually compare this and find these values well what we can do is we can say hey is G3 equal to I3 this looks a little bit confusing right CU you have two equal signs in there but everything to the right of the first equal sign it's basically a comparison and from there it either ends up as a true or a false and we can drag and autofill this in and everything is true similarly if we want to find something like is
the bonus Max greater than the annual salary we can do hey is bonus Max at E3 greater than that at D3 and as is typical of any data science job none of these really exceed that at all all right now that we're familiar with math operators and also comparison operators let's dive deeper into cell referencing and we've been doing this previously whenever we reference another cell like A2 but we're going to add a little twist to this I'm going to go ahead and hide some of these columns that way we clear up the Clutter going
to hide column F by right clicking it and selecting hide then I'm also going to select all the columns H through k and also hide them want everything to appear on the same sheet so we're going to be referencing this table down here for this portion of the exercise and this is potentially goals that you may have when you're trying to land a job you may know how many years of experience or you should know how many years of experience you have along with a goal total salary that you want to achieve and so
we're going to be building out formulas with this in order to be able to find out which of these jobs actually meet our conditions of the expected years of experience and total salary so for this we'll go with that I have five years of experience then I'm looking at $90,000 the first thing we want to calculate in column L is whether it meets our experience so for this we'll say hey is C15 right here less than or equal to the value right here in our experience and as expected five is less than or equal to
basically equal to 5 it's true now we're going to run into a problem now when we try to autofill this if I try to autofill this down I'm getting this one is false and then these all are true but I would expect especially this AI specialist at three it would be false and so let's actually inspect this well as we can see from this this is referencing well c23 which is way down here but it's still referencing the correct C11 right here the problem is we didn't really want this value up here this C15 to actually
change whenever we went to do the autofill down below it so what we can do here is provide a fixed reference of that cell in order to do this we're going to insert those dollar signs that we saw previously before the column and then also the row so in this case I have C locked and I have 15 locked now the formula itself doesn't change at all but now when I drag and drop this down all of these are updating correctly as expected AI specialist is going to be false whenever I actually click on it
to inspect it it's still referencing that C15 C11 next we're going to move on to column M of seeing if it meets our salary requirements so for this one we'll be seeing hey is the salary or total salary in G3 greater than or equal to our total salary down here of 90,000 now we already know we need to lock c16 of this 90,000 because we're going to be autofilling it down I can manually type in the dollar signs but a shortcut to this is just pressing F4 if you're on a Mac you'll need to press
function F4 anyway this locks this in so now whenever I drag and drop this down as expected the only other one that's less than 90,000 is this data analyst role right here now I want to play with this just a little bit more so we talked about this right here putting a dollar sign in front of the column and then a dollar sign in front of the row is a fixed reference they also have what is called a mixed reference so I'm going to go ahead and put my cursor right there next to G3 I'm going
to press F4 and it's going to do the absolute reference but if I press it one more time it's going to do a mixed reference if you notice there's only a dollar sign in front of the three or if I press it again there's only a dollar sign in front of the G now technically this is going to work just fine because we're going to now lock this G column for this but it's going to allow the three to update so I'm going to show you this now by actually dragging and dropping this down and
from there inspecting that last cell contents we can see that that g is locked as expected but it moved down now instead of locking just the column we could also lock the rows so I could also change up c16 now instead and lock the row of c16 cuz we're going to still stay in that c column right there pressing enter now autofill we don't have to just go down we can also go up so inspecting it the result didn't really change by only locking the row of 16 so let's wrap this all up by
actually finding out which of these actually meet both of our conditions of 5 years and 90,000 well it turns out that behind the scenes true is equal to 1 and false is equal to 0 so if I actually were to take this and add this true to this true right here we should get two autofilling it all the way down we have 2 1 2 1 so this basically confirms that hey zero yeah false is zero because 0 plus 0 is zero now I recommend instead we're going to be going through and doing L3 * M3 so
that way only when both of these are true will it return a one and now in order to get a true or false back on whether it meets both we can select that N3 and see hey is it equal to one type over there equal to one and it evaluates to true so now I'm going to go ahead and just hide these columns so we can actually see this a little bit better but we can find values in here that meet our conditions of the 90,000 and 5 years and let's say we're doing job searching
and it lasts over a year and we have to change this to six this will automatically update the formulas that we've used here as shown here so that's our intro to formulas and for me the hardest thing to wrap my head around when I was first tackling this was around absolute and mixed references so we have some practice problems for those that purchased the course practice problems in order to go through and test this out and understand what happens whenever you lock the row or lock the column all right and after that we'll next be
diving into an intro into functions which I'll be covering for the remainder of this chapter with that see you in the next one for this lesson we're going to be focusing on an intro into functions specifically we're going to be going over all the different functions that we're going to be deep diving within this chapter itself along with some common problems you may run into and errors and how to troubleshoot it to do this we'll be continuing on from that data set that we used in the last lesson specifically we'll be calculating things like averages
and counts and how many jobs actually meet our goals and we'll be using functions for this so you can continue working in that workbook that you had from last time or open this function intros workbook in this function intros workbook I've gone ahead and moved our job goals over here to columns R and S and then added in this bottom portion right here for the averages and total counts really you can do and manipulate as you want so why use functions let's look at a couple quick examples on the importance of these things let's say we
wanted to get the average of each one of these Columns of experience annual salary and bonus Max previously we know we can actually reference each one of these cells to calculate the average if we wanted to do that we would have to actually add up all the values so I have to go through select C3 C4 all the way down to C12 and we would need to divide it by that total number of 1 2 3 4 5 6 7 8 9 10 in that case we'd get the average also my count of that 10 wasn't
necessarily perfect so I don't really recommend doing this but anyway nonetheless we can actually do autofill to calculate the other averages as well as it automatically updates the referencing correctly to it but I don't recommend doing that instead I recommend using functions specifically we can use something like the average function as soon as I start typing a function an a in this case all the functions that start with a pop up if I wanted to well I do know I want average right here I can select it it provides a brief statement of
what it's actually going to do and then I can doubleclick it to insert it below here it actually specifies what's going on with this function here and specifically it prompts me to hey provide in these numbers now I could select these number by number as we can see that there's in Brackets here this number two that means it's an optional parameter but instead what we'll do is we'll just provide a range providing it from C3 all the way to C12 in that case I got 5.3 similar to above and then dragging this over we can
get all the other values as well as a quick example also previously we had made this sort of convoluted formula in order to calculate whether we met both conditions of meeting our experience and also our salary which were specified over here well there's actually a function for that and it's called the and function and what it takes for its arguments are logical values so it can take a logical one for the first parameter I can specify L3 and then for the second parameter I can specify M3 and notice how this second parameter now highlights
or becomes more bold as I put it in so you can keep track of where you are in the formula anyway I'm going to close the parenthesis press enter and it evaluates to True dragging it all down these should match these other ones and yeah this is definitely something I'd use over these formulas that I've used before so let's dive into this formulas tab more and understand the capabilities that we're going to be carrying out the next lessons in this chapter the most powerful of these especially for those new to excel is this insert function
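For anyone who wants to check this logic outside of Excel, here is a minimal Python sketch of what the average function and the and function are doing. The experience values are made up for illustration and are not the workbook's data, and `excel_and` is a hypothetical helper name, not anything Excel provides.

```python
from statistics import mean

# Hypothetical experience values standing in for a range like C3:C12
experience = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Manual approach: add every cell, then divide by a hand-counted total of 10
manual_average = sum(experience) / 10

# Function approach, like AVERAGE(C3:C12): the count comes from the range itself,
# so it stays correct if rows are added or removed
function_average = mean(experience)


def excel_and(meets_experience: bool, meets_salary: bool) -> bool:
    """Like AND(L3, M3): true only when both logical values are true."""
    return meets_experience and meets_salary


# The earlier multiplication trick gives 1 in exactly the same case
both_as_number = int(True) * int(True)
```

The point of the function version is that nobody has to count rows by hand, which is where the manual formula tends to go wrong.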
anytime you're looking for a function and maybe can't recall the name and you're not sure what it even starts with you can put something in here so say I wanted maybe the average I can type in average and then everything that basically calculates a different average off of it even if they're closely related like this rank average will pop up in here along with a description below explaining it if you've used a function recently you can come in here under recently used and I frequently find myself just going back to this in order to select
something I may have used recently now in the next seven lessons we're going to be diving into each one of these all the way through it from logical and text to look up and also math and trig now one note we won't be going into detail on these financial functions because I find they're sort of nuanced but we will be going into all the different ones that I'm using on a daily basis as a data analyst that aren't specific to financial applications so let's get into understanding the basics about functions by calculating these different counts
and especially counts around whether any of these jobs meet our goals for this I know I want to use a count function so I'm going to go to this insert function I'm going to type in count now there's a bunch of different ones that pop up count itself just counts the number of cells in a Range that contain numbers it has to have numbers in it if I wanted to do something more around text I would say hey count the number of cells in range that are not empty I could even do something conversely
of counting the number of blank cells for us we want to actually do count so as we showed before I'm just going to come in type count it's going to prompt me that I need to at least put at minimum a value and I want to count all these cells here so using autofill to fill it over we can see that all the different values are 10 nothing really spectacular here but now let's get into a pretty unique use case of count so in this scenario that I'm trying to calculate in cell c16
I'm trying to find out how many jobs above here in these 10 right here how many meet our goal of less than or equal to 5 years and I want to count the number of these so I know I want a type of count I can go into insert function I know it's here inside these different statistical functions specifically I have these different counts right here and I'm going to scroll over this count if right here and it's going to provide me a description it says Hey counts the number of cells Within range that meet
the given condition and that's what we want to do we want to meet a condition of a certain amount of experience now it provides this box in order to help me input in these values so for the range here what I can do is specify hey I want to count inside of here if they meet a certain criteria and just going back to that range right here we can see that it already input all those different values into an array-like object okay so the criteria right now is next I want to put something in here
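The different count functions described above can be sketched in Python like this. It is only a rough illustration: the cell values are made up, `None` stands in for an empty cell, and the variable names are hypothetical. The count-non-empty and count-blank descriptions correspond to Excel's COUNTA and COUNTBLANK functions.

```python
# Hypothetical cells from one column; None stands for an empty cell
cells = [5, "n/a", None, 3, 7, None, "senior", 5]

# COUNT: cells that contain numbers
count_numbers = sum(1 for c in cells if isinstance(c, (int, float)))

# COUNTA: cells that are not empty (text counts too)
count_non_empty = sum(1 for c in cells if c is not None)

# COUNTBLANK: cells that are empty
count_blank = sum(1 for c in cells if c is None)

# COUNTIF(range, 5): cells that meet the condition, here "equal to 5"
count_equal_five = sum(1 for c in cells if c == 5)
```

With a plain number as the criteria, the condition is an exact match, so only cells that are exactly 5 get counted.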
I can also press this box and it'll make it disappear and I want to compare it to this experience but I want it to be less than or equal to five so I can press enter to accept it but the problem is it's going to evaluate whether five is in any of these cells here and right now we see that there are two I'm going to go ahead and close up so we can see this better right now we can see that there's two fives in here that's not what we want we want to see everything
that is less than or equal to 5 so instead what we need to put in here is less than or equal to 5 now I'm going to press enter and we're going to get an error this is pretty common whenever you are manipulating different formulas and you have in this case I have this less than or equal to right here so Excel is confused by this what we need to do is actually put quotation marks around this which basically sort of makes it into a string or text if you will but now it knows hey I
want you to look for less than or equal to 5 I want you to evaluate this entire thing pressing enter bam we have six values here that are less than or equal to five now similarly I can drag this over because we want to also do this for salary but I don't want to do less than or equal to five I want to do greater than or equal to 90,000 and in this case we have nine cuz we only have one that's less than this but as you'll find out in this course I don't
like hardcoding values into my formulas in this case I have five inside of here but I'm already having five right here what happens if I want to change this maybe to say something like three well it's not going to actually update these values right here so I'm going to go ahead and actually change that back to five and we're going to make another formula that actually fixes this so I want to drag these down but we actually didn't lock either one of these cells and it will cause errors if we do so I'll just
select right next to it press F4 next to C3 I'll do the same with F4 in this cell as well all right now I'll take this and I'll drag this down so now let's actually fix this to be more Dynamic we don't want it to be less than or equal to this five right now what we can do is use that ampersand operator and then from there put in a reference to S3 which contains our five pressing enter bam we got six same thing here I can delete that 90,000 put in an
ampersand and then from there we're going to be basically mashing it together with that 90,000 and it evaluates now when we change this experience to say something like two we can see that it actually updates appropriately to see that oh only one job meets this requirement so pretty cool I'm going to change that back to five now frequently you're going to run into errors with your formulas let's say I wanted to divide one by zero not a good thing that we need to do anyway I'm going to get this error right now you
can notice it because it has this green triangle on the upper left hand corner but also it starts with this hashtag and it's saying hey you have a divide by zero error I can even come down into here and it tells me even more on this provides help on this or if I wanted to even ignore it now in this sheet of this workbook I have a bunch of different errors in here that you may run into from time to time and we're going to be running into these errors as we go through
the rest of this chapter so if you get stuck along the way while we're going through this I feel like this is a good reference for you to maybe save somewhere in order to understand what is going on with the different errors you may encounter now the biggest time saver I've found with any of these errors is using some sort of chatbot specifically me I'm going to go to something like ChatGPT or even Claude they're going to be able to provide really quick help in understanding what an error is and what I need to
do to fix it all right so now it's your turn to dive into and test out this intro into functions and play with them and experience some of the errors of your own after that we'll be diving into logical functions a major type of function that you need to be aware of with that I'll see you in the next one now that we have the basics down on formulas and also functions we're going to be moving into one of the most important types of functions to know logical ones the most popular of these are an
if condition basically looking at something and then providing a response based on it so for this analysis we're going to be jumping into our data science job salary data set but we're only going to focus on the first 20 rows of it here and in the next few lessons as well as I don't want to overwhelm you with all the data just yet now for the final results we're going to be doing two major things the first is determining within this list of jobs whether they meet our conditions of finding the job we want
of a data analyst or business analyst and we'll mark it not desired or role desired additionally we're going to do a common practice in analytics of bucketing basically taking those salaries and depending on the amount value putting it into a certain bucket for us we're going to be looking at whether they have salary data in this data set or more specifically if they are greater than our goal of 85,000 so why are these logical functions needed well let's jump into that last data set real quick and simplify how we can actually use these as a quick
example previously in this P column we were evaluating whether they met both of our conditions of experience and salary we can use an if statement in order to clarify this so I can specifically call out with an if statement saying if and it has The Logical test that we want to actually evaluate so I'm going to put in P3 in this case as it's going to return true or false and then from there the next value in there is value if true which is what do we want to return if it is true well that our goal
is met and then if it's not met we want to have well not met okay and then this whenever we drag this down will provide not met or goal met depending on if this is true or false and so that's the power of these if statements in helping us actually provide this value so that was just a quick example of if let's actually jump into some more examples so you get more familiar with how to use this so here we are in this data set and I don't need all the columns of this data set
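As a rough picture of what that if statement is doing, here is a small Python sketch. The `excel_if` helper and the true/false values are made up for illustration; they just stand in for the logical test in column P and the two return values.

```python
def excel_if(logical_test: bool, value_if_true: str, value_if_false: str) -> str:
    """Mirror of IF(logical_test, value_if_true, value_if_false)."""
    return value_if_true if logical_test else value_if_false


# Hypothetical TRUE/FALSE results standing in for the cells in column P
meets_both = [True, False, True]

# Same idea as autofilling the IF formula down the column
labels = [excel_if(flag, "Goal met", "Not met") for flag in meets_both]
```

Each row gets one of the two labels depending only on whether its logical test came back true or false.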
so I'm just going to select the columns that I don't need I'm going to select B through G and then hide it additionally I'm not going to need I or J so I'll hide these as well so our first goal is to identify whether these jobs meet our conditions of either a data analyst or a business analyst we're going to start simple by just finding out which one is a data analyst first and then which one is a business analyst and meets those conditions so once again we'll start with that if condition and for this we're going
to put in that logical test remember from the previous example we need to have it return either true or false so we're wanting to check whether senior data engineer in A2 is equal to data analyst in K1 now we're going to be autofilling this down so we need to make sure that the A2 we're fine with it actually adjusting as necessary K1 we want it to lock at least lock on the row value of one then if it's true we'll be role desired and if it's not it's not desired as expected senior data engineer is not
desired let's drag this all the way down and just double checking it we see that the data analyst roles are role desired okay so I can drag this over now and just to double check it shifted over to B2 but it's selecting the right one of L1 so actually what I'm going to do is I'm going to delete this go back up here I'm sort of a perfectionist I'm going to end up locking that a value so it stays in that a column none of my values are going to change here and then
when I actually drag it over I can check that okay A2 is the correct one selected to compare it to business analyst in L1 okay then I'm going to autofill all the way down looks like there's only two business analyst roles here so now how can we identify that it meets both of those conditions both data analyst and a business analyst well we're going to do one approach first and it's called a nested if statement and it's not really the approach I'm going to recommend but it's something that you should be aware of
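The row and column locking used above can be sketched in Python as a tiny autofill simulator. This is only an illustration of the rule that the dollar-signed part of a reference stays put while the rest shifts; `autofill`, `col_to_num`, and `num_to_col` are hypothetical helper names, not Excel features.

```python
import re


def col_to_num(col: str) -> int:
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n: int) -> str:
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def autofill(ref: str, rows_down: int = 0, cols_right: int = 0) -> str:
    """Shift a reference the way dragging the fill handle would; '$' parts stay fixed."""
    m = re.fullmatch(r"(\$?)([A-Z]+)(\$?)(\d+)", ref)
    if m is None:
        raise ValueError(f"not a cell reference: {ref}")
    col_lock, col, row_lock, row = m.groups()
    if not col_lock:  # column is relative, so it moves when filling across
        col = num_to_col(col_to_num(col) + cols_right)
    if not row_lock:  # row is relative, so it moves when filling down
        row = str(int(row) + rows_down)
    return f"{col_lock}{col}{row_lock}{row}"


# Filling down one row: the job title row moves, the header row stays on 1
down = (autofill("$A2", rows_down=1), autofill("K$1", rows_down=1))

# Filling right one column: the title stays in column A, the header moves K -> L
right = (autofill("$A2", cols_right=1), autofill("K$1", cols_right=1))
```

Without any dollar signs both parts move, which is how an unlocked C15 ends up as C23 after being dragged down eight rows.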
so what I'm going to do is I'm going to select cell K2 I'm going to go ahead and copy this formula plugging it in here we have it here and making sure that it operates correctly yep it does so how does this nested IF statement work well we're going to still evaluate our first condition is the first role evaluated as data analyst does it meet that if it is we want to mark it as role desired now we get into what happens if it's not a data analyst well now we want to check if
it's a business analyst so I'm going to close out of this and what we can do is I'm going to take this business analyst formula right here everything up to the IF and I'm going to go back in here and I'm going to drop it in right here inside of the value if false so it's a nested IF statement an IF inside of another IF so now if this first condition isn't true it will go into the nested IF statement and start checking this condition is now the senior data
engineer equal to business analyst if it is it's role desired if not it's not desired so let's now drag and drop this all the way down I'm going to expand this out a little bit and now we can see if it's data analyst we get role desired along now with if it's business analyst also role desired but I'm not a fan of nested IFs as they're hard to read instead I like using the functions of AND and OR and these should be a little bit familiar because we saw them from the intro lessons that we did previously
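As a rough sketch of the nested IF just described, assuming the same layout as before (job title in column A, Data Analyst in K1, Business Analyst in L1, and result text that may be spelled differently in the workbook), the second IF sits in the value-if-false slot of the first:

```excel
=IF($A2=K$1, "Role Desired", IF($A2=L$1, "Role Desired", "Not Desired"))
```

For comparison, the same test can be expressed with OR as the logical test, which is the easier-to-read style the lesson prefers:

```excel
=IF(OR($A2=K$1, $A2=L$1), "Role Desired", "Not Desired")
```

Both versions return the same result for every row; the OR form just keeps the two conditions side by side instead of burying one inside the other.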
with AND it evaluates whether both conditions are true so in this case I'll put in condition one of B3 and then condition two of C3 and both conditions are true so it evaluates as true dragging this all down in all the following cases they're not true for both conditions so therefore it evaluates as false with OR it checks whether condition one or condition two is true and then will return true so inputting the conditions of B3 and C3 one of the conditions here is true actually both are dragging it down I expect
yeah the second and third rows are also true whereas in the final one both are false so therefore it is false so let's run the same AND and OR logic that we've run before in order to determine which one we actually use so in this one we're checking whether both of these jobs of data analyst and business analyst are equal to this one here senior data engineer as expected false and what should we expect for all of these all of them are false because none of these are going to be both data analyst and business analyst so
as you can probably guess OR is probably going to be the one that's going to work for us we're evaluating whether either data analyst or business analyst is going to match up to that value of senior data engineer in this case we're getting those true values for data analyst and true values for business analyst so now we're going to put that OR function inside of that IF for the logical test and from there we can determine whether it's role desired or not desired dragging it all down all of it's matching as expected okay I'm going to go
ahead and hide these rows so now what happens if we don't want to just evaluate for a true or false condition basically we want to evaluate for multiple different conditions well that's going to be something that comes up if you ever need to bucket data which we're going to be doing with salaries now for this first one we're going to just use a simple IF statement we want to determine whether a salary is greater than 85,000 or if it's not we want to just specify that the salary is low so for this we're going to
be evaluating if H2 we're going to go ahead and lock that H column is greater than that 85,000 which we'll lock completely then we want to say the salary is greater than 85,000 conversely if it doesn't meet this we want to say that the salary is low I'm going to expand this out a little bit and then we're going to drag this down as expected we have the values returning for those above 85,000 and then this one at 35,000 is marked as low now the problem we're running into and why we
need multiple conditions is this one says the salary is low but there's actually no data there we need to specify in these conditions that well there's no data so for this we can use an IFS formula and what happens with this is you provide a logical test and then a value if true and that's just the first one we can then provide another logical test and the value if true so the first thing I'm going to test is if there is no value there I'm going to go ahead and lock that H column as well and when I'm
looking for a blank I'm just going to put in two quotation marks signifying that it's blank and the value if true is no data okay put another comma we can see now we're on to logical test number two the next thing we want to test is if it's greater than 85,000 so we'll select H2 again locking that H and we want to see if it's greater than or equal to that 85,000 which we'll lock if it is we want to return back that salary is greater than 85K and finally we're on to the
final logical test and basically we want all the remaining values to pass this condition so instead of providing hey salary less than 85,000 we're just going to pass in TRUE because we want it to always be true and we would expect this to catch any values that are between 0 and 85,000 so like before we're going to specify salary low running this we're going to expand this out and then drag this down we have where there's no data it returns no data where the salary is less than 85,000 it returns salary low and then whenever it's greater than 85
the correct results now IFS functions are one of the more complex functions to work with so you do need some practice with this for those that purchased the course practice problems you have some now to go into and actually try this out manipulate and better understand how to work with this with that in the next one we're going to be jumping into my next favorite type of functions math functions which are heavily used in data analytics all right with that I'll see you in the next one now in this lesson we're going to be using
math functions and also some statistical functions in order to perform EDA or exploratory data analysis on our job posting data set and for this we're going to be focusing on the five major functions of COUNT SUM AVERAGE and also MIN and MAX and we're not only going to focus on the core versions such as just COUNT but also the IF and IFS versions so they have multiple different versions that we're going to get to now for our analysis we're going to be diving into the full data set of the data science job postings which has
over 30,000 different job postings and in it we're going to be specifically diving into data jobs that are in the United States for data analyst and we're going to be able to use these different functions that incorporate IF and IFS in order to fine-tune what we're looking for one quick note you're not limited to using United States and data analyst you can use the scenario that you're in of what country you're in and what job title you're most interested in instead so we're going to be filling out this table right here and
we're going to start on row three focusing on those COUNT functions first now the data set is actually much larger than these three columns I'll actually unhide between A through K but we're not using any of these columns in between here so I'm just hiding them and making it easier for us to work with for this we're going to focus on the core function of only COUNT and we're going to be looking at those that have all the yearly salary data in it as you can see over here that there's missing blanks in
here so we don't want to count those that are missing anyway what I'm going to do here is select column M and as you know it selects the range of M:M and then from there press enter so what we're finding is that around 22,000 jobs out of these 30,000 we're going to find out have salary data and how do I know about that 30,000 well let's actually see instead we can use a COUNTA function which stands for count all and it counts the number of cells in a range
that are not empty specifically I want to capture those in the job title short column right here so I'll do A:A running this we get to see that it's around 32,000 jobs one technical note before we continue since we're doing the columns themselves in this case with COUNTA it's also counting that column header so if we want to be exactly accurate which in this case I just need roundabout numbers technically we would want to go in and say subtract one
to get what the actual value is but frankly I'm just trying to look at general numbers right now I'm not too concerned about being one or two off so now let's dive into analyzing this further based on my needs focusing specifically on the United States first so we're going to find those that have in the job country here United States and for this we're going to use the COUNTIF function and this counts the number of cells within a range that meet the given condition so you provide a range in this case we're going to
provide the range of that column K and then the criteria itself we want to filter for United States which I conveniently typed above so we'll select it right there I'm also going to lock it by pressing F4 and then running this we get that about 25,000 jobs contain United States so now let's evaluate those data analyst jobs using that same thing of COUNTIF once again we provide the range in this case we're looking at that job title short column and for this we want to look for data analyst locking this cell we get about 9,600
jobs for data analyst now next up we're going to be using COUNTIFS specifically we're doing this because we want to find jobs that contain not only data analyst but also that they're from the United States now we can't just add these two counts together because once we add them up we see that's even greater than all the jobs there that's not what we actually want we want conditions like here on row 16 where it's a data analyst and United States whereas something here on row 223 where it's
a data analyst in another country that's not going to meet our condition so we wouldn't count it so using COUNTIFS this counts the number of cells specified by a given set of conditions or criteria for this we need to specify a range and then the criteria first we'll focus on the range of A for job title short and we're looking to match that of data analyst which I'll lock by pressing F4 then now we're moving on to criteria range number two where for this one we're looking at job country now and for that we want
to look for the criteria of United States locking this with F4 closing this with a parenthesis and then running it we get around 8,000 jobs and this makes sense right because it would be less than those 9,000 or so data analysts because some of these aren't going to be from the United States now with how this is flowing we could actually make a visualization out of this data right here so going into insert and then recommended charts we have here a funnel chart so I'm going to go ahead and insert that in and this basically shows the funnel if
you will of jobs we have we started with almost 32,000 jobs and we got towards the end of the jobs that we actually care about US and data analyst at around 8,000 I'll go ahead and move this off to the side for now all right next moving into the SUM function and the core one itself of actual SUM it's pretty simple we're obviously going to be using the salary year average column for this because we want to sum up the numbers in them and I'll put in that column of M and we
get the sum of values there now unlike COUNT where COUNT has a COUNTA or count all where we're trying to find if there's blanks or not that's not really applicable in SUM and AVERAGE and also in MIN or MAX so I'm actually going to go ahead and just gray these out because we're not going to need them now moving into SUMIF which adds the cells specified by a given condition or criteria this one is a little bit more complex than what we dealt with with COUNTIF because we first want to provide the range that
we're going to be evaluating for a certain criteria which in our case the range we want to evaluate is job country because we're evaluating for if it contains United States which I'll lock with F4 but we're not summing the countries because that's a text column so we have to provide this sum range which is column M similarly once again we can do that SUMIF looking for data analyst so in this case we're going to be looking at column A to evaluate if it has data analyst in it and then from there the sum
range once again is going to be that column M now the SUMIFS similar to that COUNTIFS adds the cells specified by a given set of conditions or criteria for this one we provide the sum range first so it gets a little bit confusing you've got to make sure that you're actually reading the formulas in this case we're going to use M because that's the sum range we want to use and then we're first going to evaluate for that job title short that column A which we're going to evaluate for data analyst and then
we'll evaluate for the job country evaluating for United States closing the parentheses and running this bam as expected this value is less than that of the data analyst now moving into the last three of AVERAGE MIN and MAX which I think are actually more valuable than that SUM one we did I'm not going to walk through actually typing all of these in because now you have familiarity with how I did the SUM which follows the same pattern for AVERAGE MIN and MAX feel free if you want to go through and type
it out on your own to get more experience doing it but overall I think there are some very unique insights from this analysis we did in it we can see that salaries in the United States are around 125,000 whereas data analyst is only around 93,000 and specifically US data analyst is around 94,000 so data analysts in general have lower salaries than the other jobs in the data science industry as far as MIN and MAX go we're having as low as 25,000 but we're having as high as well at least for a data
analyst up to 650,000 and apparently there's a job in here around $960,000 and you may be wondering what jobs correlate to this $15,000 or $960,000 well we're going to be diving into that further when we get to the lookup functions one last note on errors before we go I find the most common error with these functions is a #VALUE! error and that usually occurs whenever in this case we had column A selected initially for criteria range number one let's say we accidentally selected multiple different columns for this obviously we're not trying to evaluate all
the different columns we only want to evaluate one column for that criteria of if data analyst falls in it anyway when I run this I get a #VALUE! error this is a common one that I see come up time and time again so anytime you're going through any of these or the practice problems themselves make sure you're investigating to see that you've actually input the correct ranges to evaluate because that's commonly causing those #VALUE! errors all right with that you have some practice problems to dive into and next we'll be diving into
even more statistical functions in order to really dive into how deep you can go with EDA or exploratory data analysis all right with that I'll see you in the next one we're now going to be taking this up a notch shifting gears from focusing on math functions now to statistical functions for this we're going to be using our job posting data set and analyzing the salaries in it specifically looking at common statistical functions like median standard deviation and even quartiles once we have the basics we're going to shift into an actual analysis looking at what
is the average salary of different job titles and we'll even get a sneak peek of visualizing it for this lesson you can start by opening this statistical functions workbook we're going to be starting by filling in this table here on the different statistical functions and we're still working with that data set we did previously if you noticed I've hidden a lot of the columns that we won't be using for this so we've done a few of these different types of functions already let's go ahead and fill these in
for count we'll be using the COUNT function specifically on that M column of salary year average and like before we have around 22,000 values for average we'll be doing the same on that M column we find that's around 123,000 for MIN we'll also run this on the M column and that's around 15,000 for MAX that's going to be around 960,000 so let's move on to our first true statistical function we're actually going to go into this to see what it does and that's MEDIAN it returns the median or the number in the middle of
the set of given numbers so let's go ahead and type that out MEDIAN and in there we need to specify a number or numbers or we can specify a range we're just going to keep it simple right now to actually show what this function is doing it's selecting the middle of the numbers so I'm just going to select these top three numbers right now and what I expect this function to do is basically in the set of numbers given provide the middle number so it should provide us 140,000 which is the center number of
these three but we don't care about the center of just three values we care about the center of basically all of our different values so I'm going to place the entire M column into it and that is around 115,000 now why is this average higher than this median well let's actually visualize it I'm going to select this M column and go to the insert tab going to histograms I'm going to insert a histogram and what this is showing is the distribution of salaries from 15,000 all the way to 950,000 the bottom x-axis is a little confusing to
read but it's basically a range so in this case 87,000 to 93,000 how many counts of salaries are falling in between that and that's how large the bar is next to it anyway getting back to that original question why is the average higher than the median itself if you recall from the definition a median is the middle number in our list but our average however is taking all the different values and well averaging them out and as we can see from it we have a large amount of salaries around well $100,000 but we do
have some up here that are getting close to a million dollars these outliers are causing us to have a higher average so basically those values that are near 960,000 are dragging that average way higher so that's why I prefer to use something like the median when I can in order to analyze these salaries because it's not skewed by these outlier salaries that are just something that you're probably not going to get all right next up is standard deviation and for this you have two options STDEV.P and STDEV.S the P
stands for population and the S stands for sample this data set is around 30,000 salaries and there's way more than 30,000 data science jobs available so that's a sample of the actual population so we're going to be using STDEV.S and for this we can insert a range into it so what does this value actually mean well if we had something like a normal distribution which our salary data is somewhat close to we'll find that one standard deviation from something like the average has in this case right here 34,000 so if we went
above and below the average by one standard deviation around 68% which is a heck of a lot of data is within this one standard deviation so in our case if I was to take the average and then subtract this standard deviation along with taking the average and then adding the standard deviation around 70% of the salaries are going to be between 75,000 and 170,000 but what if we wanted to be more precise about finding say something like where does 50% of the data actually fall well we can use quartiles in this case specifically calculating the first
and third quartile here's a graph that I did from my Python course which when you get done with this course feel free to check out but anyway it looks at the salary distribution of data analysts in the United States and has this histogram right here very similar to what we plotted previously in Excel but in it I'm able to plot out where quartile 1 starts and then where quartile 3 starts so between these quartile 1 and quartile 3 marker lines 50% of the data falls here with this red dotted line being the
median again so let's actually get to calculating this so if we want to do something like the quartile we're going to see that there's a few different functions available for this we have exclusive and inclusive we're going to do inclusive first and then I'll show the exclusive after to basically show how it's different so this takes two arguments the first is the array so I'll put in that range of the M column and then lastly it takes the quart and we have one for the first quartile two for the median three for the third quartile
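To make the two arguments concrete, here is a small sketch of the inclusive quartile function, assuming the salaries are in column M as in the lesson. The quart argument is typed directly as a number here for readability rather than taken from a helper cell.

```excel
=QUARTILE.INC(M:M, 1)
=QUARTILE.INC(M:M, 2)
=QUARTILE.INC(M:M, 3)
```

The first line returns the first quartile, the second returns the same value as MEDIAN(M:M), and the third returns the third quartile. QUARTILE.INC also accepts 0 and 4 as the quart argument, which return the minimum and maximum of the range.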
anyway I have these values over in the U column so I'll just select that and use that for this and for the second quartile we're seeing that basically as I just said that's also equal to the median now I'm going to go ahead and get rid of these MIN and MAX because we can also get those with our quartile function and I'm going to go ahead and drag and drop this up and then also below so what we can see from this with this first and third quartile is that around 50% of the data
falls between 90,000 and 150,000 so frankly when it comes to using quartiles like here and standard deviation I find myself gravitating more towards quartiles anyway what about that other quartile function specifically that one around exclusive values well once again I can select the array that we're going to use we're going to use M and then finally the quartile itself now notice for this one it doesn't have a value of 0 and 4 that you can put in for the MIN and MAX it's exclusive so it excludes those outliers basically of the MIN
and MAX so specifying that column next to it when I actually drag this down we can see that the MIN and MAX aren't provided in this but it's the same values for that first second and third quartile if you notice here we get this #NUM! error and as we inspected when going through this formula zero and four were not available to actually input into the formula so any time you're inputting things into a formula that don't necessarily exist you're going to get this #NUM! error all right the last function to investigate is the mode
and this returns a vertical array of the most frequently occurring or repetitive values in an array in our case we'll once again provide column M and surprisingly we find that 90,000 is one of the most repetitive values if we go back to that histogram we plotted earlier we can see that the largest bar right here with a value of 1,925 occurrences occurs between 87,000 and 93,000 so this makes sense on the 90,000 being the mode so let's get into some data analysis now by actually ranking the average salaries of these different job titles I'm
going to go ahead and hide the columns V through R now in order to rank the salaries of the different job titles I have this list here for you where you need to first calculate the average salaries of each of these job titles so for this we're going to be using as last time AVERAGEIF first we need to specify the range that we're going to be basically running that IF on not the values but the range of the job titles next we need to provide the criteria for this we'll provide that of
data analyst which is in W2 and then finally the actual average range of column M dragging this all the way down we have our different averages for all the job titles one note real quick in future lessons we're going to be jumping into using median to evaluate these job titles because personally I like that more but that's a slightly more complex problem so we're going to stick with simple for now anyway with these averages we can now actually rank them using the RANK function and this returns the rank of a number in a list of numbers its size relative to
other values in the list so first we need to put in the number that we want to rank in this case we want to do that of data analyst and then from there they have the ref or the reference array in this case we're going to provide it right here from X2 all the way down to X11 now I can change this from descending to ascending but I'm going to keep it how it is now I'm going to drag and drop all the rest of these and we have a little bit of an issue because
we have repeating numbers right here it's obviously because I didn't lock my cells appropriately so selecting this range that I want to actually lock and pressing F4 I'll go ahead and lock that and then we'll drag and drop this again hopefully this works this time and boom we have all these ranked from highest to lowest we can see business analysts are some of the lowest data analysts not far behind and senior data scientist has the highest I'm going to take this one step further I'm going to highlight everything from job title down to the bottom salary
for software engineers I'm going to go into insert here and go to recommended charts and basically the first one that pops up this clustered bar chart I'm going to insert in and I can just change the title up here by double clicking in here and I'll put average salary of data science jobs and there we have some data analysis actually viewing these one minor touch to this I really don't like how these are unordered right now so I can actually go up here select these three titles right here and then under the Home tab
select that I want to actually filter it and then order this rank from well we'll say largest to smallest one note you may not have been able to see it but it actually rearranged the data inside of our data set that's not a big deal for me I'm not caring too much but that is something that will be affected whenever you do this anyway with this we can see things like senior roles are getting paid the most and things like analysts are sometimes getting paid the least compared to these all right you now have some practice
problems to go into and practice your skills with these statistical functions after that we're going to be jumping in the next lesson into arrays which is a super powerful feature sort of new to Excel in the past few years all right with that I'll see you in the next one we're going to be now shifting gears and jumping into a more advanced topic of arrays and with arrays what you can do is by typing a formula in a single cell we can use this to fill in cells below it or cells to the side of
it all with one single formula so we're going to be slowly working up through an easy then a medium and then a hard problem on how to use these first up with the easy one we're going to go through and basically identify all the unique job titles and then go through and actually sort them alphabetically using arrays next we're going to move into our medium problem of calculating the median salary if you recall back to our last lesson we were calculating the average salary based on a job title well we can use arrays
to calculate the median and then finally for one of the hardest problems we're going to get into actually looking at based on the month how many different jobs were submitted during that month and for this we'll be using the SUMPRODUCT formula and a combination of other ones using arrays for this we'll be using the arrays formula Excel workbook now before we jump into those problems we need to first understand that there's actually two different types of arrays we're going to start with the first one of modern dynamic arrays which we've seen before and with this
what we can do is using a formula we can specify a range and then whenever we press enter with B2 to B5 it's going to actually fill in with all these we can see that it's modern or dynamic because it has this shadow around the edge if I select any of the other ones and not the core one where the formula is actually highlighted the other ones are grayed out taking this a step further with array multiplication we can actually go in and take this column one of A2 to A5 and multiply it
times B2 to B5 anyway in this sequence you can see that it goes down 1 * 1 is 1 whereas 4 * 4 is well 16 anyway that's modern dynamic arrays for classic arrays let's say we want to do the same thing in this case well we're going to have to go about it a little bit differently we need to select all the cells that we want to fill in first this is a very key concept to get right first then from there we can start entering our formula so I put in equals in this case
we want to do the same array multiplication I'll take A2 to A5 times B2 to B5 now whenever I am done with this and I want to actually execute this I don't just press enter I have to press control shift enter and then it fills in the array notice it's not grayed out around the edges like this with a shadow this one does not do that and all of the different formulas are now filled in below this and you'll notice that there are curly brackets around this this was used prior to around 2020
and so you may come into contact with Excel spreadsheets that have this and if you don't know about it if you come in here and say you want to like mess with this formula and you press enter you're going to get an error message but now let's say we have some additional values in it we'll say we'll add five to the bottom of each of these if I wanted to adjust this array if I came in here and then changed this to six for both the bottom and the top and pressed control shift enter it's
only going to adjust the ones that were previously selected so now if I want to include this bottom row right here for modern dynamic arrays it's pretty easy I can just come in here and adjust this to six and this is done however for classical arrays or classic arrays not classical I have to actually select all these different cells and then go in and actually enter the formula that I want to enter if I try and press enter it's going to give me an error message and I realize okay I have to press control shift
enter and it'll actually fill in anyway the main point of this is classic arrays are a mess we're going to be focusing on modern or dynamic arrays for the remainder of this course but you need to be aware of classic arrays in case you encounter them in the wild so jumping into our data analysis we're going to be working with the data set that we've been focusing on before and I've hidden any columns that I don't feel are relevant for our future analysis that we're going to do anyway the first thing we're going to do
is find the unique job titles and for this we can use the UNIQUE function and this returns the unique values from a range or array so the first thing we need to do is actually put in the array itself I don't want to actually select this column A because I don't want this job title short header to appear so I'm going to select A2 and then press control shift down to select all the way to the bottom I'm going to close this parenthesis and we have all the different unique titles in there now I want to get
the sorted job titles out of this so as you can guess we're going to use the SORT function and for this all we really need to do is specify the array now if you notice from this one whenever I went ahead and selected it it specifies that R2 pound and that basically says that hey there's an array formula inside of R2 and we want to extract all the contents of that using R2 pound and so that's going to work to be able to provide us all those values and then it's going to sort it in
this case we have it sorted in alphabetical order one thing I haven't called out both times is these are once again dynamic or modern arrays you can see that gray box around each of these but just to show this also works by specifying R2 to R11 it's going to provide us the exact same results but I really like the shorthand nomenclature of the R2 hashtag sign now we're going to get into calculating the median salary and if you recall back to our last lesson on statistical functions we went through and calculated the average salary for
each of these job titles using an average if function but as we discussed last time when comparing something like the average to the median the average in this data set is slightly higher due to those outliers of those high salaries around 960,000 so we want to use median so what are we going to be eventually calculating well that's this table right here where we sorted our job titles themselves and then we go into actually calculating the median salary based on these different job titles from our data set now there's a pretty
complex formula going into here so because of this we're actually going to break it down step by step going through each column explaining how this actual process works in order for us to get to this final value for this we're going to be doing it for data analyst only as we can see the final value we're going to get to is 90,000 which is over here in our final results so I'm going to go ahead and delete this to actually start with now we need to look for two separate conditions the first one
we need to look for is do the job titles here actually match up to this value here of data analyst and this provides Boolean values back where we get to this value down here of true as expected in row 16 we have data analyst now if we scroll down further we can see that our next data analyst job doesn't have a salary for it these types of things will throw off our final median function that we're going to actually be calculating and so we need to basically filter it out as well so with the
salary data set selected I'm going to then go through and filter this basically not equal to a blank value and as expected we're getting false values for these blank ones now similar to what we saw in the intro to arrays where we were multiplying different arrays together we're going to do the same thing here with these Boolean values for this I'm taking that formula and wrapping it in parentheses it needs to be in parentheses in order to execute properly for the first condition that it matches data analyst and then the second condition that the salary can't be
blank whenever we multiply these two Boolean values together we get returned back either a zero or one and the only way we get a one back is if both these values are true which is the condition we want to meet now for zero or one values we can actually see if we did an if statement here if we did a logical test of zero what is it going to return whether true or false so for zero it returns false and for one we'd expect it to return true anyway we don't want to necessarily return true in this
case we want to return the salary that corresponds to that row in the data set so I'm going to go ahead and delete this so for this we're going to start with that if function itself then I want to place all the different contents that we saw in that previous V column now we want to return the salary which are these contents right here so that'll be our value if true and then if false we just want it to be false which we can just leave blank so now scrolling down we can see that
we have nothing but those values for data analyst scroll over just to confirm 129 yep data analyst all right the last step we need to go ahead and put inside of our median formula all those contents that we had before that entire if statement itself to evaluate so with that array it's going to basically find all those salaries for data analyst and it's going to return back the median salary now this also works for other functions so let's say we wanted to use the mode we want to use a mode if condition they don't have
this available so we could just plug this inside of mode and then running this we can see that well the most common value for data analyst is apparently also the median of 90,000 so going back to our data sheet let's actually go through and step by step calculate it for each of these different sorted unique job titles that we did previously and we're going to be building this step by step how I'd normally build a formula so the first thing we're going to look for is the job titles themselves do they match to that business analyst so selecting column
A2 and then control shift down to select all the contents of the column we want to see if that's equal to this business analyst row right here and now remember we're going to be dragging these down to do an autofill so we need to be particular about how we lock these cells specifically we do need to lock these values right here and just for safe measure I'm going to lock the column of this one okay pressing enter all right we have our array back looking for business analyst and we can see that it's working by
what we see down here in row 84 so let's actually do that array multiplication by now filtering out salary that doesn't have values or blanks so we're going to put another set of parentheses next to it we'll put in our salary data and that it's not equal to blank running this we confirm that the first value of business analyst that has a salary has a one now we need to wrap this all inside of an if to basically return instead of that one we want it to return the salary itself so for the value of
true I'm going to put in the selection of the salary yearly running this we confirm this is again correct looking at row 180 almost done just now need to wrap this all inside of a median function and Bam 85,000 and hopefully we actually locked all the cells properly dragging it Down Bam looks like we got all our things and we slightly messed up our formatting here so I'm going to go ahead and put a thick outside border on again to make that right again all right so that's how you basically transform any function in Excel
that doesn't have that you know count if or average if function or capability into other functions now moving into probably the most complex example that we're going to be using not only in this lesson but probably in the entire course so if you get around this you're going to be good to go for the rest of the course anyway what we're trying to look at here is the count of job postings based on the month that it was posted in and we're going to be using the sum product function for this now sum product is not
anything that you should be afraid of basically whenever we were doing the intro and we were talking about array multiplication we went through line by line and we have our values of 1 4 9 16 and 25 line by line well if we were to do the sum product of the values in column A along with the values in column B we're going to get 55 which when we look at the sum of these values here we can see that it is 55 so it's a sum of the
product of the arrays so getting back to our example that we're going to be solving we're trying to aggregate it by these names of these months if we actually scroll over to the data set itself the job posted date is in a date time format so similar to the last example I'm going to be walking you through column by column by column on how we get to this final value that we're going to be eventually putting into our table here to thus calculate these values for the counts per month so we go ahead and clear
these cells to start and we're going to start first by extracting out the month from this job posted date column so for this we can use the text function which we're sort of jumping ahead on because we'll be doing text functions in upcoming lessons but it's a good little sneak peek anyway we can plug in here something like a date time value and then from there we want to output what is the format text well I know that if we do three M's it's going to provide me the shorthand month of this additionally
if I do four M's it's going to provide me the longhand month of this and there's a host of different format codes that you can provide shown by this table here when I'm looking it up in something like Perplexity that says that hey if I provided a double Y it's going to provide the two-digit year and so on for other values you can look this up in something like ChatGPT anyway getting back to this example itself I want to actually autofill this all the way through it's not around any other
columns that I can actually autofill all the way down and I don't want to sit here and drag it all the way so what I can do is select the column itself and then when it has these basically four arrows I can then drag it where I want I'm going to drag it right next to here and then now actually autofill it all the way down now that I have it complete I'm just going to select this column again make sure I have those four arrows again and drag it back to the column it needs to be
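
Written out, the TEXT calls described above might look like the sketch below. The cell H2 is an assumption (the job posted date column is referred to as H later in the lesson); the format codes are the ones mentioned in the walkthrough.

```excel
=TEXT(H2, "mmm")
=TEXT(H2, "mmmm")
=TEXT(H2, "yy")
```

For a date in January 2023 these would return Jan, January, and 23 respectively.
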
now seeing what we did here you're probably like Luke can't you use something like a count if in order to calculate the months now using this and you'd be correct with that remember back for the count ifs we can provide a criteria range in this case we're going to provide it column V and then for the criteria itself we'll provide the actual month and then actually dragging and dropping this all the way down once again my formatting got messed up so I'm going to put that thick outside border back on there anyway these values here
for what we're going to get finally are the same and so you really could stop this lesson right here and if you want to do this of creating a new column and then just using count ifs you can do that but this is a lesson on arrays so we're going to get more complex with this on how to use the arrays in order to actually calculate this without having to create these extra columns so I'm going to go ahead and hide this cuz we're not going to use it so before we can actually start summing
up we need to get an array of all the values that we'll say equal to January so we'll start by creating that text function it's going to be slightly different from before cuz we're going to be making it out of an array we want to actually select all the values from H2 all the way down to the bottom we want to then go ahead and lock it we want it to be evaluating for that long month name so four lowercase M's and then I want to check if it's equal to in our case we're looking for
January we'll look up here at this U2 or U1 I got a typo up there update that to U1 anyway we now have okay that this value is true right here and we can tell from row 11 that this is in January it is true so it's working out just fine so now if I tried to actually run a sum product which is what we're finally trying to do on all the contents of this array itself we'll do W2 hashtag we're going to get back zero because this isn't in the format that we want
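
A sketch of the two formulas at this point in the walkthrough. The end row 1000 is a placeholder (the real data runs further down), and W2 is assumed to be the cell holding the array formula, as implied by the W2# reference.

```excel
=TEXT($H$2:$H$1000, "mmmm") = U1
=SUMPRODUCT(W2#)
```

The first formula spills a column of TRUE and FALSE values. The second returns 0 because SUMPRODUCT treats non-numeric entries, including TRUE and FALSE, as zeros.
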
we actually need to convert this unfortunately although on the back end a false is zero and a true is one the actual functions themselves can't actually calculate with it so we can do this by basically converting it and the first thing we can do actually is just we'll put one negative sign and then I'll put in that W2 hashtag and what this does is it negates the Boolean values so basically true which is normally a one it negates it and makes it negative one and zero well a negative zero is still zero anyway we need to actually apply two negative signs cuz
we don't want it to be negative one we want it to be positive one so doing this one more time we now have positive ones in there so now we are using sum product because sum product I feel is better with arrays but in this case where it's a single array we could actually just use sum itself I did want to show that and we get that value of 3,102 which correlates to what I expect as the value but we're going to use sum product because as you'll find out in future lessons
we're actually going to be modifying it even further what's inside of here and so we need this sum product in order to do those anyway we get the same value of 3,102 so going back to our data tab let's actually calculate this fully for all of these different values walking through it step by step as we did previously we're going to start with our text function and we want to look at that job posted date column I'm going to go ahead and lock all those cells it's very important for this because we're going to be dragging and
dropping that down and remember for the format text of this we want it to be four lowercase M's and in this we're checking whether it's equal to this value here of V2 which is January and I'm going to go ahead and actually lock just that column pressing enter to make sure it goes correctly yep we got a true value here for our row 15 value first thing we want to do is do that double negation for which we need to actually wrap this whole formula itself in parentheses in order to get our zero and one
values and then finally we're going to wrap this all once again in sum product putting that closing parenthesis on there pressing enter we get 3,102 and then doing autofill all the way down we have all our values once again the format is messed up so I'm going to put that thick outside border on now the other reason why we're using sum product in this case is because in older versions of excel before we had these modern dynamic arrays sum is not going to be able to work over arrays and you actually have to use sum product so
this allows us also to have a safe way to calculate using arrays and then give it to people that may be archaic and have older versions of excel all right it's your turn now to jump into some practice problems to get more familiar with working with arrays inside of formulas in the next lesson we're going to be getting into probably I think one of the most funnest types of functions lookup functions like vlookup and X look up and things like that which are super helpful for data analysis all right with that I'll see you in
the next one lookup functions are one of the most I'd say funnest functions whenever you're learning to be a freak in the sheets specifically we're going to be focusing on three different lookup functions vlookup H lookup and X lookup the V in V lookup stands for vertical the H in H lookup stands for horizontal and the X in X lookup well they just wanted to be different in order to learn about these functions we're going to be performing some data analysis and if you recall back from our math and statistical functions lessons we found out what the median
Min and Max salaries were but for the things like the Min and Max what were those different job postings that correlated to that well based on the structure of our data set we can use the vlookup and also X lookup functions in order to find this out now because of the structure of our data we're going to have to do something different in order to implement H lookups and for this we're going to be able to get out or extract out horizontal type data we're going to basically transpose it into a vertical format using H
lookup but if there's anything you remember from this lesson it's that of X lookup this one is the most dynamic and flexible in how it can be used and we're going to be doing a final example using this in order to bucket our salary data set allowing us to categorize it into different ranges and whether it has data or not all using X lookup for this we can start using the lookup functions workbook we have two main tabs in this data and data_2 data is where we're going to start first for this section
on vlookups so for this we're going to be using that job posting data set I've hidden any unnecessary columns and we're going to be filling in this table right here so what I'm trying to do with this is fill in based on this min as you can see the formula for min the formula for max and the formula for median where we actually calculate this from the salary year average column we want to then extract out based on these values the company name the job title associated with it and then the country associated with it
so we're going to start with V lookup first and V lookup looks in a vertical type format specifically it says it looks for a value in the leftmost column of a table and then returns a value in the same Row from a column you specify so for the first value of this we want to provide the lookup value in this case we want to look up 15,000 from that salary year average column then from there we need to provide the table array now remember for this it needs to be the leftmost column of the table
and we want to get columns M and O I'm going to select column o because if we start at M and try to go down it's going to mess up cuz there's blank in it so I'm just going to do control shift over and then control shift down to select all the data and then change this a column to M instead the next thing we need to specify is the column index number and right now we're in column M so that would be the First Column so MN o we're in the third column you can
imagine if we have a buttload of columns what kind of problems you're going to run into so we'll get to that when we get to it okay now they have a range lookup we're going to leave that blank for the time being we're just going to execute this formula as is and for this we're getting an NA error if we actually click into it it's a value not available error and why is that well if we actually go back to that vlookup function in the definition that it provides for it the last statement is by default the
table must be sorted in ascending order right now our salary values are not sorted so it's having issues going through it and actually finding that 15,000 because it's unsorted anyway we're not going to actually sort that table that's going to be too much work we can actually now go into that fourth parameter of range lookup and instead of doing an approximate match which was the default we're going to do an exact match by providing false in that case we find that Net2Source Inc is the company name of the job with 15,000 now I
want to autofill for this but we need to actually lock some cells real quick so I'm going to lock this right now by pressing F4 then from there we'll drag it down now one thing to note on vlookup X lookup and also H lookup this is just going to return the first value so in this case of this 115,000 it says it's Volt Technical Resources however if I do a control F of 115,000 we'll find that yes it's at row 19 for Volt Technical Resources the first one that it provides but it's also in row 42 with
Northrop Grumman so it's only providing that first match now what happens if we wanted to next get things like the job title or the country itself well if I were to put in the first two values the lookup value and then the table array what will we put for the column index number remember in vlookup the leftmost column of the table itself is what we're going to be looking up but however columns A and K are even well more left of that table so unfortunately we can't use vlookup for this but we will be using
X lookup for this that's why I'm going to recommend it over vlookup but I think you guys should start at the basics first however before we get into that we're going to now shift gears and cover H lookup in order to look up values in a horizontally oriented table in this case this is horizontally oriented because we have things like the months across the horizon if you will and then from the column standpoint we have the job titles of the different ones of data analyst senior data analyst and so on
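
Looking back at the VLOOKUP steps above, the finished formula might look something like this sketch. The literal 15000 stands in for the cell reference used on screen, and the end row 1000 is a placeholder; columns M through O and the column index of 3 follow the walkthrough.

```excel
=VLOOKUP(15000, $M$2:$O$1000, 3, FALSE)
```

The fourth argument FALSE asks for an exact match, which avoids the #N/A error that the default approximate match gave on the unsorted salary column. The table array is locked with dollar signs so it stays put when the formula is dragged down.
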
now the data in this table is calculated using the data from the data tab in order to get the counts of months and you've previously seen this in the last lesson where we went in that hard example of sum product where we now go through and do some array multiplication in order to find out the different counts for the job titles based on a month anyway for this H lookup we want to look up based on a month what is the associated job count for a specific job type so let's say we want to
just look at that may column well we can put in h lookup and this looks for a value in the top row of a table or array of values and returns the value in the same column from a row you specify so it only selects from that top row for this we provide a lookup value in this case let's say we're looking up January then from there we provide the table array itself we can go and just select this data now technically I could select all this data because it's just going to go to
the column associated with this so that we included column A doesn't really matter then from there we want the row index number what value do we want from this January do we want data analyst senior data analyst senior data scientist so we can just count down what we want we'll start with data analyst first so we'll put in that's the second row in this so let's try to enter this and for this we get 753 which if you go back to this we're doing Jan A1 through M7 and then the second one so why
are we getting 753 well once again this has to do with the range lookup we're doing an approximate match similar to vlookup it expects that these values for that top row are in this case in alphabetical order in order to perform that approximate match these aren't in alphabetical order they're actually in chronological order so instead we need to specify false now running it we get the correct value of 982 now we can also apply this to a situation where maybe we want to transpose these values into this new table that we have here on month
and count and then up here I'm going to also just specify what we're looking at we're going to look at data analyst now with our H lookup we're going to be providing that lookup value the table array and then the row index number say if we wanted to go in here and instead of data analyst we wanted to look at data engineer instead how can we get this to update well we can use another function for this specifically we can use the match function for this and this returns the relative position of an item in an array
that matches a specified value in a specified order so in this case I want to look up data engineer in the array from A2 to A7 it's providing me a one cuz it's doing the approximate match again once again they're not in alphabetical order so I have to specify exact match using zero okay and now I get data engineer in the fifth place I'm also going to move this column over and make this a little bit bigger going back into that H lookup that we're going to use for this we're going to provide
that lookup value which we want to actually lock by pressing F4 then we're going to provide the table once again I said you can select that A column if you want or not we're going to lock all these values as well because we'll be dragging it down from there we'll be providing the row index number which we've calculated right here in this P based on that match that we're performing we want to lock this as well and as far as the range lookup well we want to do exact match running this we get an NA error because
I was silly and the lookup value we want to actually do is for the month of January not the data engineer actual lookup confusing this with h lookup sorry about that so we'll put in O3 for this instead and then running it and now we're getting back 236 which is not the data engineer value we're one off and this has to do with how we did our match up here which specified A2 to A7 basically we're counting down from the second one where in h lookup we included all the way up to that first row
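
The mismatch can be seen by writing the two pieces out together. This is only a sketch: the table range A1:M7 and the MATCH range A2:A7 come from the walkthrough, while the literal lookup values "Data Engineer" and "Jan" stand in for the cell references used on screen.

```excel
=MATCH("Data Engineer", $A$2:$A$7, 0)
=HLOOKUP("Jan", $A$1:$M$7, MATCH("Data Engineer", $A$2:$A$7, 0), FALSE)
```

MATCH returns a position relative to A2, but HLOOKUP's row index counts from the top row of its table (row 1), so the value comes back from one row too high.
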
so this is just a simple fix by changing this one up here to A1 and now our values update appropriately and then I can go ahead and just drag and drop this all the way down and once again going to get into some troubleshooting because this is all the same values and that's because I fully locked this actual month number and instead I wanted to press F4 and only lock the column of O now finally getting to the final answer we have it and we can confirm this that data Engineers should have 396 on the
December value that's correct and now we can do things like this where I can go in and say hey instead I want to look at data analyst and it will update for this instead now once again with h lookup we run into issues like vlookup if there's values above this top row I can't really think of any applications where that's applicable but it is a limitation anyway this is why we're going to be shifting to the next topic and that is using xlookup to now based on these salaries that we were previously
trying to identify identifying a job title and a country associated with it so what is the definition of xlookup well this searches a range or an array for a match and returns the corresponding item from a second range or array by default an exact match is used that's pretty awesome considering all the issues we ran into with h lookup and V lookup anyway instead of using a single table we're going to be using multiple ranges for this let's get into it first we're going to provide the lookup value which in this case is 15,000 and
then we want to provide the lookup array so we need to select this entire M column here for what we want to actually look up but we have these blanks in here so I'm going to just do a trick of selecting the O column selecting all the way down and then from here I'm going to just go in and actually change these values to M instead now we want this to remain the same so I'm going to press F4 to actually lock this now that was our lookup array now we want to get into what
return array or where we want to actually look to see and that's to the left of this in this job title short column these arrays have to match up in where you're selecting them so in this case I selected over here in the second row I need to do the same for the job title then from there pressing control shift down I select all of them once again I'm going to lock all of these by pressing F4 now let's close the parentheses and go ahead and execute it we can see that
data engineer is the lowest paid salary with this 15,000 now we can also add in this default parameter in case you can't find a value you can put not found but in our case we made or we calculated this minmax and median from our data set so technically this isn't really necessary anyway let's see what the other job titles are for these Max and median looks like it's data scientist and then the data engineer for the median which is that first one that appears right over here in row 19 now doing the same for the
country I'm going to go ahead and just copy and paste that formula in that we had from the other cell and I'm going to just adjust this now to use column K instead of column A for the actual return array okay with that updated press enter and we can see Brazil has the lowest one and what is the highest one United States and also the median United States all right we're going to crank this up a notch and now we're going to jump into actually bucketing our salary using X lookup specifically I want to use
this table that I've created in order to properly categorize different values based on this so in this case we have this value of 140,000 it's going to fall into our bucket of 125,000 to 200,000 there's no data in this one so I want to say no data this one's greater than 200,000 so I want to say greater than 200,000 so for this we're going to be creating a new column column Q and we're going to call it salary year bucket I'm going to go ahead also and hide this column O for the time being we don't
really need this for this now technically you already have the requisite knowledge in order to bucket it I could put in a nested IF function similar to below and it has 1 2 3 four five if you will nested ifs to go through and basically check each of the different values as it's going through in order to bucket it appropriately in this case it correctly categorizes it and then if I wanted to I can drag and drop it all the way down but now this sheet is filled with all of these nested if statements this
is really going to slow your spreadsheet down so I don't recommend doing this also building something like this you've now hardcoded your values into it and what if you want to change this later you'd have to update all your formulas it's a mess don't recommend doing it so I'm going to select all this control shift down and then just delete it all instead we're going to be using X lookup for this specifically we need to provide the lookup value which is going to be the same one that we did before that
M2 and then we want to look up the lookup array now I conveniently made this table here that it's providing values at if you will the higher end of the bucket so we're not going to necessarily do an exact match for this we'll get to that in a second anyway now we want to look at what do we want to return the return array which is on the left side of this table that's the values I actually want to return back in that column value if not found is not necessarily applicable so now getting into
how we're actually going to match based on these salary buckets based on these values highlighted in this T column right here well we need to do not exact match we need to do exact match or next larger item and this is the value of one basically in this case of this 128,050 it's going to look for initially an exact match of 128,050 and it's going to see that nothing's there so then it's going to look for the next larger item which is that 200,000 so therefore it's going to return as we're going to
find out the 125,000 to 200,000 now I can try to drag and drop this down but I'm going to run into errors because I didn't lock my formulas correctly so I need to go back in lock that s column with F4 and lock that t column with F4 and then I'm just going to autofill all the way down and now we have all of our different job postings bucketed into these different salaries so instead I wanted to go through and actually change this to be 150k and then match this to 150,000 I go do it
and it would update appropriately I also need to update this column as well but now it all updates and it's in one single location so this is really the power of using that X lookup over the ifs in order to perform this type of bucketing all right you've now got some practice problems to go through and get more familiar with using these different lookup functions as I said before make sure you're prioritizing understanding that X lookup it's the most powerful but the one caveat to X lookup is that it was introduced around 2020 so anybody
using once again an archaic version of excel from before this year they're going to have compatibility issues using this so that's why you need to also be familiar with that V lookup and also H lookup you're going to encounter them in the wild all right with that I'll see you in the next one where we're jumping into text functions now I know this is a course on data analysis but text functions are actually imperative for performing analysis on text data and for this we're going to be working in this lesson on a data set
of job applicants and we're going to take it a step further using text functions in order to analyze specifically for our final analysis we have information on the different skills that each one of these job applicants knows so we're going to be able to perform an analysis to see what are the most common skills from these applicants but before we get to that final analysis we first need to beef up our knowledge we're going to focus on three main areas the first one is text combination we're going to be working to combine different columns into
a single column from there we'll move into the second one of text extraction being able to extract multiple values out of a single column and finally in the third one performing some sort of text search in order to also extract out in this case we're going to be extracting out the state name from an address that contains a city state and zip code so for this you can start by opening up the text functions workbook and in the data tab we have this data set which you haven't seen before it's only about 20
rows and includes a list of job applicants now we're not using the full data science job posting data set because a lot of the examples we're going to do in this would basically bog down your Excel spreadsheet so especially how we're going to be implementing these it's really meant to be used for smaller data sets you may be like Luke what happens if I have a bigger data set and need to clean up the text well that's where power query comes in which we'll be covering in the advanced chapters so stick around for that anyway
moving into text combination we want to target these columns right here F and G we want to combine them into one line to have a single address so I'm going to go ahead and hide this column H for the time being we're going to be putting that full address in column J and this one's pretty simple all we're going to do is TEXTJOIN which concatenates a list or range of text strings using a delimiter the first thing I need to specify is the delimiter how am I going to separate that street and the
city state all I want to do is a space so I'll do that enclosing it in double quotes next is ignore empty basically if there was an empty cell in here it would just ignore this and it's not going to input multiple different spaces between it it'll just ignore it so in that case we're just going to put in true the final one is text and you could do text 1 and then comma and then text 2 but that's really verbose I don't really like doing that instead I'm just going
to select the range of F2 to G2 now we can see that the address is fully concatenated and we can drag it on down and it works for all of it now the opposite of combination is extraction which we're going to get into next and in this case we're just going to use a single column and extract out multiple values in this case we have this full name column we want to extract out the first name and the last name go ahead and hide these other columns we're not using in this case we're going to
specify the TEXTSPLIT and it says it splits text into rows or columns using delimiters so we'll first start by specifying the text which is B2 in this case and then the column delimiter which in our case is going to be that space once again we're going to use that double quotes for that space and then end double quotes and then this is going to be a dynamic array and it has these two values here now dragging this all down we see that it fills in for all these different names now we just split
text there also could be cases where we maybe want to extract out a certain amount of values or a certain amount of text from a column in this case we also have our application ID number which is a combination of letters and numbers but as you can see from this there's some values in here that are actually repeating sometimes we want to refer to the shorthand of this and let's say we only want to get the last three digits of the applicant ID because we know that's always different well in this case we can specify the
RIGHT function and it returns the specified number of characters from the end of the text string we specify the text itself and then the number of characters in this case we can just say three and it's going to provide back that 548 we could also just change that to include all the text numbers in case this number gets bigger than that and then go ahead and drag it all the way down now one last one before we get into actually performing that analysis we want to go through and extract out the state
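For anyone who thinks in code, the three operations covered so far (joining columns with a delimiter, splitting a name on a space, and taking the last few characters of an ID) can be sketched in Python. This is only a rough analogy to TEXTJOIN, TEXTSPLIT, and RIGHT, not how Excel implements them. The helper names and the sample street, address, name, and ID are made up for illustration.

```python
def text_join(delimiter, ignore_empty, *parts):
    """Rough analog of TEXTJOIN for plain strings (hypothetical helper)."""
    if ignore_empty:
        # Drop empty parts so they do not produce doubled delimiters.
        parts = [p for p in parts if p != ""]
    return delimiter.join(parts)


def right(text, num_chars):
    """Rough analog of RIGHT: the last num_chars characters of text."""
    return text[-num_chars:] if num_chars > 0 else ""


# Made-up sample values standing in for the street, city/state/zip,
# full name, and applicant ID columns.
street = "123 Main St"
city_state_zip = "San Jose, CA, 95112"
full_name = "Jane Doe"
applicant_id = "APP2023548"

full_address = text_join(" ", True, street, city_state_zip)
first_name, last_name = full_name.split(" ")  # like splitting on a space
short_id = right(applicant_id, 3)

print(full_address)
print(first_name, last_name)
print(short_id)
```

With ignore_empty set to True, a blank part is skipped rather than leaving two spaces in a row, which is the behavior described for the ignore empty argument.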
from this city state and zip and as you notice from all these they have a common format in that the city has a comma and then the state starts and then there's another comma following that so we're going to be using those basically as delimiters if you will in order to identify where we should extract out this state value this two-letter value so the approach we're going to use for this is we're going to find the location first of that first comma and space before
this then next we'll find where it actually ends and then finally using those values we'll actually extract out that state abbreviation using the MID function so the first thing we need to do is find that comma with the FIND function and this returns the starting position of one text string within another text string so in this I'm going to specify that the find text we're going to be looking for is the comma itself and we're going to be looking at within text obviously G2 now we also need to find the second comma in this we can use
that FIND function again specifically we're finding that comma specifying that within text of G2 and now we have the optional third parameter of start number we want to start from nine which is the first one we found and running this we get nine now the problem here is because we're starting at the exact position where the comma actually is that's why we're getting back that value of nine we actually need to start slightly past nine but anyway we'll fix that in a bit instead let's actually get into extracting out or at least trying
to extract out that CA from this value and then we'll fix that issue in cell R2 so for this we're going to be using the MID function which similar to that RIGHT function returns the characters from the middle of a text string given a starting position and length so in this case we want to extract out of G2 and we'll provide it the start number of well what's the value in Q2 and the number of characters and for right now we know we want to extract out two so we're going to
put in two now we're running into issues we're only getting back a comma if you will and if we actually make this longer to actually zoom in on here we get comma space CA now when providing four and this has to do with right here this value on the start number isn't correct this nine right here is exactly at the comma we need to actually specify for that start number where the C is and these are all two characters over so I'm going to come in here and I'm just going to modify this shortly and
add two to this this is also going to fix our previous one that we had when finding the second comma which is now 13 because the search now starts past the first comma and then finally that MID is fixed we can change this now back to two now you know me I don't like hardcoding values into something like this and really what we're doing here is we're adding two based on the length of the comma and then the space after it so there's two characters in there so I'll use the length of those two characters instead and this is still that 11 value that we saw before similarly
inside of our MID function I don't like doing this two here because states maybe could be more than two characters so I don't want to hold it necessarily to that so instead I'm going to do R2 minus Q2 which in our case is going to be two and we have California all right now we can take all the different values actually drag it on down and we get all of our states extracted from this all right diving into our final analysis we're actually going to combine all of these different functions we just learned about specifically with this data
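The find, offset, find again, then extract approach for the state can be sketched in Python. This is a rough analogy under stated assumptions, not Excel's implementation: `find` and `mid` are hypothetical helpers that mimic the 1-based positions FIND and MID use, and the sample address is made up, chosen so the commas land at positions 9 and 13 as in the lesson.

```python
def find(find_text, within_text, start_num=1):
    """1-based position of find_text, searching from start_num (like FIND).

    Raises ValueError when nothing is found, where Excel would show an error.
    """
    return within_text.index(find_text, start_num - 1) + 1


def mid(text, start_num, num_chars):
    """num_chars characters of text starting at 1-based start_num (like MID)."""
    return text[start_num - 1 : start_num - 1 + num_chars]


SEPARATOR = ", "                        # comma plus the space after it
address = "San Jose, CA, 95112"         # made-up sample row

first_comma = find(",", address)                 # position of the first comma
state_start = first_comma + len(SEPARATOR)       # skip the comma and the space
state_end = find(",", address, state_start)      # second comma, searched past the first
state = mid(address, state_start, state_end - state_start)

print(first_comma, state_start, state_end, state)
```

Using the separator's length and the difference between the two positions avoids hardcoding the number two in either place, so a longer state value would still be extracted whole.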
set we have this column H right here and it's a list of different skills that each one of these job applicants have we want to combine this and aggregate this in order to analyze the most common skills for this we're going to have to walk through four different steps in order to get this into our final visualization that we can actually visualize and see here so I'm going to go ahead and clear all these values so we can get started actually doing this the first thing I want to do is actually combine all of these
values into a single long text string and we're already having the separator of a comma and space between each skill so we're going to use that same separator to continue separating this so using text join we're going to first specify the delimiter of that comma and a space it's asking if I want to ignore those hidden or empty cells I do and then finally we need to provide the actual text itself so we'll go down through and select in our data tab H2 to h21 going back up into the formula bar closing this parenthesis and
then pressing an enter look I have a like slight typo in here I need to actually put double quotes around both of them you can't mix double and single quotes now we have this super long list uh that has all of our different skills in it it looks like it's properly delimited now that we have all these values in one cell we can then use the text split function to now separate this into different cells because we're going to want to then move into transposing it next and for this once again the delimiter we're using
is that comma and space running this we have all the different values separated out by different cells so now almost there we need to get into making a table right here basically having skills in the left hand column and then the counts of those skills from what's above here so first thing we need to do is get the unique values of this but if we just run unique on that row six we're going to run into an issue to where it actually goes out to the right and actually doesn't get the unique values for all
these so the first thing we need to do is actually transpose which is moving it from horizontal to vertical that row six okay so it's now up and down all the way now in this case we want to run the UNIQUE function on this to extract out all those unique values and scrolling down looks like we have all the unique values it does have a zero because we did that whole row six and so when we get to these empty cells if we keep on scrolling over here it records as zero I'm fine with that
for the time being and we'll continue last thing we have to do is use basically a COUNTIF to count these different skills based on how often they appear in this row six so I'll type in COUNTIF we need to specify the range first and we'll do row six I want it to stay there because we're going to autofill down so I'm going to F4 that and then from there specify the criteria which is going to be the skill in this case Kafka okay so three values for that one and
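The chain of steps for the skills analysis (join every cell into one string, split it back apart on the comma and space, take the unique values, then count each one) can be sketched in Python. This is a loose analogy to the TEXTJOIN, TEXTSPLIT, UNIQUE, and COUNTIF steps, and the three skill cells below are made-up sample data, not the course workbook.

```python
# Made-up sample of the skills column: one comma-separated string per applicant.
skill_cells = [
    "Python, SQL, Excel",
    "SQL, Kafka",
    "Excel, SQL, Databricks",
]

DELIMITER = ", "

# Step 1: join every non-empty cell into one long string.
joined = DELIMITER.join(cell for cell in skill_cells if cell)

# Step 2: split that string back into individual skills.
skills = joined.split(DELIMITER)

# Step 3: unique skills, kept in first-seen order.
unique_skills = list(dict.fromkeys(skills))

# Step 4: count how often each unique skill appears in the full list.
counts = {skill: skills.count(skill) for skill in unique_skills}

for skill, count in counts.items():
    print(skill, count)
```

Because the split list here contains no empty cells, there is no stray zero entry of the kind the whole-row reference produces in the sheet.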
then dragging this all the way down bam got this all filled in all right the last thing we need to do is actually visualize this cuz we want to visualize these skill counts select the area that we want we're going to go in and insert in under recommended charts you can do a bar chart but I'm more a fan of horizontal bar charts especially when we have text values and we need to be able to see all the different names so I'm going to have to expand that out a bit and I'm going to change
this title up here just to something like skill count of applicants and bam now we can see some trends out of this that a lot of people are claiming to have experience with Databricks which is unusually high that's probably something I want to investigate for this but a good little thing that we actually can analyze and see from this analysis that we did one minor note I would normally go through and actually sort this from high to low and you can definitely do this you'd have to copy and paste the values over you
wouldn't be able to use these values right here and sort and filter them because we're using the modern or dynamic array to find these unique values so that's definitely an option if you want to do and I definitely would recommend you do something like that before sharing some sort of visualization like this all right you now got some practice problems to go through and get more familiar with these text functions which like I said are imperative for data analysis in the next lesson we're going to be moving into our last one in this chapter
on formulas and functions on date and time functions with that I'll see you in the next one all right saving the shortest lesson for last we're going to be focusing on date and time functions and for this we're going to be using that same data set from that last lesson which is about 20 rows of job applicants now similar to text functions we're not using that full data science job data set that we've been using previously because I find it's not common to really use these date and time functions on a large set of data
because it's going to slow down your sheets so that's why we're using this smaller data set for this once again if we needed to actually clean up date and time stuff we're going to use something like power query which we're going to be getting to in the advanced chapter anyway we're going to be focusing on two main types of functions first up are date functions which are going to be able to extract out things like month day and year and then from there we're going to transition into time functions extracting things out like hour minutes and
seconds finally we're going to move into that final analysis looking at what is the time that is most likely for applicants to apply to jobs for this we're going to be using the date and time functions workbook and we'll be working in this data sheet for this filling in certain values as we go through this I'm going to go ahead and hide some of these unnecessary columns so we have more space to work with this anyway jumping right in if we want to calculate what the month is we have something like the month number putting
that in that's D2 similarly we can get the day by using something like DAY and once again providing it D2 then finally something like YEAR we can provide D2 we get 2023 now if I wanted to only extract the date out of this date time I can use this DATE function it returns the number that represents the date in Microsoft Excel date time code got it so we need to provide the year first then month and then from there day boom and analyzing this we
see it is February 14 2023 one quick refresher on how Excel stores those datetime objects so right now it's in the number format of date if I change this back to General it's going to shift to this number and if we recall this is how it stores the values if we start at something like one converting it to a short date we can see that it starts at January 1st 1900 now if you're working with dates before 1900 let's say we put in something like negative 1 and I convert it here to a date it's
going to just provide all these different pound signs here there's a few different workarounds for that that's beyond the scope of this course main thing to understand is how it's actually stored within Excel anyway I'm going to convert this back into a date and for each of these I want to actually fill in the values all the way down bam all right close up this home ribbon all right next up is today say we needed today's date well I can put in the TODAY function this actually takes no arguments and will provide us
the date I'm filming this on September the 3rd now the last common function that I find myself using all the time is when I want to calculate the days since something happened in this case we want to find out how many days has it been since they have applied to the job so we can use the DATEDIF function for this now the one thing to note with this is as I'm typing it in if I type in just date there's no DATEDIF in there there's no documentation that Excel natively actually includes for
you to use this so this is like a function you just have to know about anyway it takes three parameters basically the start date that we want to start from the reference date that we want to basically subtract from this which is today we want to actually go ahead and lock this I'm going to lock this with F4 and we want to provide this in the format of days which we provide this text character of D and this tells us it's been about 567 days since Valentine's Day in 2023 anyway updating all these cells for
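The date pieces and the days-since calculation can be sketched in Python as a sanity check. This is only an analogy to MONTH, DAY, YEAR, and DATEDIF with the day unit. The filming year is an assumption: the lesson only says September the 3rd, and 2024 is inferred because it reproduces the 567 days quoted. The helper name is hypothetical.

```python
from datetime import date

applied = date(2023, 2, 14)   # the application date from the lesson
filmed = date(2024, 9, 3)     # assumption: year inferred, only the day and month are stated

# Like MONTH, DAY, and YEAR on the same cell.
month, day, year = applied.month, applied.day, applied.year


def days_between(start, end):
    """Whole days from start to end (hypothetical helper, like the day unit of DATEDIF)."""
    return (end - start).days


elapsed = days_between(applied, filmed)

print(month, day, year)
print(elapsed)
```

In the sheet the second date comes from the TODAY function, so the result changes every day; the fixed date here just pins down one value to check against.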
this we now have this data shifting gears into our time functions as we can expect a lot of these are going to be the same for hour we use the HOUR function minute has a function as well as second this doesn't really show seconds here but we can see up here it is actually included in the data similar to the DATE function for TIME we have to provide three parameters of hour minute and then also second drag and drop this all the way down we can see that yep it's correlating correctly one note for the
hour that we previously calculated this is in military time or if you're in Europe you also do it this way anyway I really like this for an analysis purpose especially when we get into analyzing it now conversely for time and also date you could use the TEXT function which we previously saw when we were extracting out the month out of date times by providing a value and then the format text which we're going to say in this case is just hh:mm if I wanted that AM PM format not
that military time format I can just add in here AM/PM and it converts it appropriately dragging this all down and then filling it in we get it now moving into that final analysis we want to analyze when are these job postings happening by hour of day the first thing we need to actually do is get a column here of the hours in the day so we can do some sort of like COUNTIF on it in order to calculate that so for this I'm going to use the SEQUENCE function and I want 24 rows
with it columns I'm going to leave blank and I want to start at one and it's going to fill down from 1 all the way to 24 and now we need to run a COUNTIF basically for each one of these conditions run down this list basically matching to see what is the hour for these things so I have it hidden but I'm going to go ahead and make a column again for hour and I'll put in here HOUR and unlike last time I'm actually just going to put the whole range in here and it's going to
provide it back to me in a modern array now with this I can actually now use this in the COUNTIF we want to first provide it a range which is our modern array so it's going to be I2# and then a criteria for the hour we want to search for we want to search for that one in A2 from here we want to fill it all in and we have some reference errors because we didn't lock our cells specifically we didn't lock this cell right here this I2 so I'm going to press F4 on that to
actually lock that then dragging it all the way down we have it okay our last portion of this is actually visualizing this so we're going to go in select all that data go to insert go to recommended charts and I'm more of a fan of column charts with this type of data so I'm going to go ahead and put this in and I'm going to change this to job postings per hour and Bam now from this we're seeing that basically people are applying during normal working hours and apparently they're waiting until the end of the
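The hour-of-day tally (pull the hour out of every timestamp, build a list of hours, count matches for each) can be sketched in Python. This is a rough analogy to HOUR, SEQUENCE, and COUNTIF, and the four timestamps are made up for illustration rather than taken from the workbook.

```python
from datetime import datetime

# Made-up application timestamps (assumption: not the course data).
applied_at = [
    datetime(2023, 2, 14, 9, 15),
    datetime(2023, 2, 15, 16, 40),
    datetime(2023, 2, 16, 16, 5),
    datetime(2023, 2, 17, 0, 30),
]

# Like HOUR over the whole column: values run from 0 to 23.
hours = [ts.hour for ts in applied_at]

# Like a 24-row sequence starting at 1: the values 1 through 24.
hour_list = list(range(1, 25))

# Like a COUNTIF for each entry in the hour list.
counts = {h: hours.count(h) for h in hour_list}

# Hours present in the data that the 1-24 list never looks for.
missed = [h for h in hours if h not in hour_list]

print(counts[9], counts[16])
print(missed)
```

One thing the sketch makes visible: hours come back as 0 to 23, so a list that runs 1 to 24 would leave an application in the midnight hour uncounted. Starting the list at 0 would likely avoid that, if the data has such rows.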
day to actually submit their job applications maybe to get in before a deadline or something all right this is the last lesson on functions and formulas in the next chapter we're going to be moving deeper into understanding how to actually make these different visualizations I've only been showing you a sneak peek at it right now to get you familiar with how to easily create it but we're going to go into a lot greater detail coming up next now we spent almost nine lessons on these functions and it's because I feel functions are one
of the most important things to understand about Excel because it also transfers to other portions specifically we're going to be learning more about the DAX language in the advanced chapter and we're going to apply a lot of our knowledge that we already know about these Excel functions to DAX functions they're very similar anyway you got some practice problems to go through and work in order to understand better how to use these datetime functions and from there we'll get into that chart chapter with that I'll see you in the next one welcome to this chapter on
charts and as much as I love using something like Python a programming language for making visualizations I feel that Excel has some capabilities built into it that allow it to basically exceed any programming language in the customization that you can do to charts that we'll be finding out in this chapter for this chapter we have four lessons this lesson right here is an intro to charts so we're going to be focusing on understanding the basics of using charts and specifically looking at three types of charts specifically line charts pie charts and bar or column charts
so technically that's four in the second lesson we're going to move into more advanced charts such as scatter plots and also map charts along with understanding more advanced customizations that we can do to these charts in the third lesson we're going to go hard in the paint in order to understand statistical charts specifically histograms and then also box and whisker charts which are imperative to understand statistical distributions of our data we'll finally wrap this all up with a final lesson focusing on sparklines which basically allow us to put charts inside of individual cells in Excel
pretty neat all right for this lesson we're going to be using the charts intro workbook first thing to understand is terminology Microsoft refers to all these different visualizations diagrams plots whatever you want to call it they refer to it as a chart basically they want to use a safe term that encompasses all the different types of visualizations we can build with this so you may hear me from time to time call this a plot or visualization I basically mean a chart anyway why do we use charts well looking at these six examples here we can see some different
characteristics about this data that we're looking at but what if we looked at just the core data itself which is this table right here looking at what is the number of job postings per month if we look at this visually we're not able to see necessarily what is the highest month and also what is the lowest month I mean you can figure out eventually but it's not easy to spot and that's why charts are so powerful and so I have a variety of visualizations here in order to Showcase that same table that we were just
looking at in basically a variety of different forms here even have a few below here down below it but we need to understand which chart to use because let's say we wanted to use this pie chart here is that actually a good chart to use to visualize this or instead should we be using something like this line chart to better show a trend over time while also showing a magnitude of difference anyway as we go through this lesson I'm going to be calling out when you should use certain charts as best practice along with my
recommended tips for how to customize it to show them best so for our first chart as I hinted to we're going to be making this job posting count into a line chart and this is the chart I'd use typically for any time series like data as it's great at showing a trend over time and how it's connected so how do we do this well we're going to select all the data here all the way from A1 down to B13 come up into insert and we're going to dive into each one of these charts individually but
I would encourage you to actually just start with recommended charts I really jump to it every time I use it anyway first thing it has two tabs here recommended charts and all charts recommended charts usually provides a lot of good tips that you could potentially use for different charts sometimes however I do find that I want a particular chart and it's not here and that's when I'm going to go to this all charts tab and frankly it provides a lot more control while allowing you to actually visualize your different data in our case I
know I want a line chart on this but now I can go in and actually plot it with markers or even change it into a 3D line chart highly don't recommend this we're going to be sticking to a line chart for this and I'm going to go ahead and click okay I'm not going to lie this chart is getting us 90% of the way there now if you notice for this when we clicked on the chart we have certain values highlighted here basically this purple outline is showing that this is the X values right here
and then the blue coordinates right here are showing the actual values themselves and then conveniently they put the job posting count which is highlighted in Orange as the title we'll be jumping into how to customize this area in the advanced section but that's in the next lesson now for those new to charts there's a bunch of different elements and I can come up here and I can click this plus icon right here and it shows all the different elements on here I can use the checkbox to control whether I want to include the axes or
not in this case I do want to include it and then I can even fine-tune it further to select which one I'm talking about am I talking about the horizontal or am I talking about the vertical just going through these in rapid fashion axis titles allow us to provide titles for the X and Y axis the chart title shown above I can remove it or keep it on if I want to include data labels I can do this along with controlling what position of them I want to go with I could also include something
like a data table below but personally I find this is sometimes sensory overload I don't really use that much next are error bars for data then grid lines whether I want to have horizontal vertical some minor ones or some other minor ones a legend if there's more than one data series I probably want this a trend line which we'll be adding in a little bit and then up and down bars which are going to show whether the data goes up or down based on each set but not really necessarily applicable to this one now I find
this plus icon is where I go most of the time but I could also go to this chart design tab up here and it has this box of add chart element and basically you can go through and adjust all the different ones along with showing a more visual indication of what's going on here here showing those up down bars to actually see what they actually look like you can also use this quick layouts to quickly try out different themes that Excel has so I do find myself from time to time using this so this
chart is almost done all I do want to do first is change the title and I usually like to either provide some sort of snippet of information from it or ask a question that I want the reader of this graph to understand or take away from this chart so I can put in something like how did jobs Trend in 2023 so it also tells what year what's going on here and it asks them to look at hey what is the trend going on here which it looks like we have a peak up in January and
a peak up in August now I try to minimize the amount of axis titles on here because like in the month's case that's pretty self-explanatory however the number in the y-axis is not so self-explanatory so in that case I would want to include it in this case give it a representative name of counts of jobs the last thing I want to do with this is just add a trend line and there's multiple different options for this we can do linear exponential a linear forecast where it actually goes into the future and then even a
two period moving average which is pretty neat I'm going to just stick with the basic one right now of linear and Bam that's our first chart so let's move in the next one now if we go back to our original data set in the data tab we have a column here on job no degree mention and basically this column right here includes whether there's a mention of a degree in a job posting so in this case where we have two different values we're trying to determine what are the proportions of each a way to compare
this we could either compare this in like a bar or column chart but I feel a better one for this is a pie chart so I've gone through and calculated a count of the jobs with a no degree mention along with those that have a mention of a degree I calculated the total and then from that I calculated their individual percentages now I'm not going to just select all the data here because I don't want to plot all of it I'm going to select the first two values here of A2 A3 press control and then
also select C2 to C3 then from here now I'm going to go insert those recommended charts looks like we've got two bar and column charts come up but the one we're going to be using for this is a pie chart so I'm going to go ahead and insert that in now personally I'm not a fan of this layout here so I'm going to come up into chart design into quick layouts and I'm going to just experiment with different ones looking at them and frankly I like this one right here actually where we've
removed the legend and put the actual values themselves along with their titles inside the pie chart itself to make it super simple to see which one is which now Excel sometimes gets crazy with the colors I actually don't recommend using a lot of different colors because it could be very confusing for viewers on where to look personally I want to highlight more of the no degree mentioned so I'm going to use this single color palette right here or this monochromatic color palette right here that has these different shades of blue and I feel the
eye is going to go more to the darker blue now with each of these labels here I can actually select it I double clicked it over time I can actually drag it and drop it and move it around where I want it to be I would probably want it to be more over here I want the degree mentioned to be stacked basically I want them opposite of each other now you may have noticed I can't really read this text right here and even this text is hard to read as well so what I can do
is I'll just click outside real quick and clicking back in I'm going to double click and this is going to bring up the format data labels if double clicking isn't working you can just select it go into the format tab up here and select format selection anyway there's a lot to unpack in this pane and we'll be unpacking it as we go along this entire chapter but the main thing to understand is they have label options and text options we want to adjust the text options and this has things like text fill and outline text
effects and then also the text box for this we're going to use the text fill and outline specifically this drop down here of text fill we want to change the color so we want to change it to white now if you notice only one of these changed and that's because I only had one of the boxes selected so I'll actually click out of this double click back into this and then make sure both of these are actually selected go back into text options go into text fill and then change this color and then it's going
to change both of these colors now I'm fine with this text now but let's say I wanted to customize further the percentage here maybe I want to include one more decimal place clicking on the box itself I can now have this option for label options and then under well label options again I can scroll all the way down or I can actually cover this up and then unhide this number I can change the number formatting itself in this case I do want to still do a percentage and then maybe I want to do one decimal
place personally I think there's a little bit too much data so we're just going to keep it with the zero all right that's the final customization the last thing we want to do is just add a title and I want a very compelling title what do they want to look at for this I want them to understand what jobs mention a degree and now with this we have a pretty great visual indication of that about one of jobs have no degree mention in them which personally I think that's a pretty high percentage and
hopefully gets higher so we have data similar to our first chart that basically explains how many counts of jobs for the different job titles now this isn't chronological so I don't necessarily recommend using something like a line chart for this that's why we're going to be making column and bar charts for this also let me explain the difference between the two anyway I'm using the formulas that we previously have covered you can dive into it if you want to basically using UNIQUE and then also a COUNTIF formula in order to count each one of these
in the data tab anyway if I actually go to graph these by selecting all these things go to insert and recommended charts here it provides the recommended charts and we're going to start with a column chart first I start with this one first because we're already running into problems with how long these labels are we can see that we have these three ellipses here basically telling us that the rest of the name is hidden here so not all the names are shown here the other problem that we're getting into with this column chart named after
the fact that it looks like columns is that it's not in an organized manner I would expect to see it high to low to make it easier to compare values to each other and also how they rank so we'll go ahead and delete this bad boy anyway this table is organized based on this UNIQUE function which doesn't necessarily put things in the correct order and I won't be able to actually go through and filter it or sort it appropriately so below this I made a different table where I basically use SORT to sort these
values from above by their job count in descending order now since it's in this order I could actually select a few less of these remember how it was cut off last time I could select only the top six go into here go into recommended charts once again and put in our clustered column chart now this one I can play around with and as you see as I expand it out I can actually see all the different names here but once again I'm not a fan of this column chart I'm not going to be using
it for this case instead we're going to try out a bar chart so selecting all this data to show the power of these bar charts and then coming in I can put in that bar chart now I do like this one better because all the titles are organized and they're right off to the side and so this is a much easier read now I'm really nitpicky with my charts and the problem now is I don't like the order that this is in what happens is Excel starts plotting these although
it's in descending order in our table as shown over here it's going to be plotting them starting at this zero axis up here and then plotting from there so technically we don't even want it like this instead what I can do is reverse the sort order here I'm just controlling it by using either one or negative one in that sort order portion anyway with this order now we can finally get into the final bar chart that we want to actually put in and I'm just going to skip these recommended charts come up here
into the column and then the bar charts we want this one inserted in and I'm also going to zoom out some now this is more in line with what I want let's actually clean up this visualization to identify what we want I'm actually more curious about what are the top jobs in data science so that's what we'll name it additionally I feel the titles are pretty self-explanatory based on that title but I would need something for the x-axis down here so we'll add an axis title calling this count of job postings now with this question I'm
asking of what are the top jobs in data science I'm not really feeling like we need to include things like machine learning engineers software engineers cloud engineers or business analysts how could I actually adjust this well one way is I could control what areas are highlighted over here and I could actually drag this and change this to whichever ones I want but I'm not necessarily going to recommend that instead I'm going to select our data make sure all the columns are selected themselves right-click it and then go to select data and this new
window is going to pop up here this tells us a lot of great things about our visualization first is the chart data range it tells us we've selected from A25 to B35 so we could change that here if we wanted to the next thing is the two windows down here of the legend entries and the horizontal axis so this controls our job count I'm going to scroll this over here we could just remove job count but it's not going to do anything this one right here the axis labels is mainly what we can control so I know
I want data analyst and all the way down to senior data analyst I can actually go through and deselect business analyst machine learning engineer software engineer and cloud engineer and then click okay and it will remove them from this visualization while still keeping this data here so I can easily go back and add or remove job titles as necessary and now we have our final visualization earlier I did go through and actually delete the chart and start over but you do have this option in the chart design tab of change chart type and
it allows you to basically go through and try out different ones if I wanted to go back to that column chart I could and it would show me an example of what it looks like now there is one last thing that I want to format on this I do find it a little difficult to read exactly what the amount of job postings is here so I'm going to add data labels to this we have a couple different options we can do inside end which I can't read at all inside base outside end which I'm
more for and then also a data callout that's just too much there we're going to do outside end now with these numbers I don't like the level of detail I don't need it down to the single or the ones digit place to tell what it is instead I would rather it show something like 9.6K or 9.6 thousand so we can actually format that so double clicking on one of those labels this format pane is going to pop up again and for this I'm going to go under label options and then label options again and finally
number and for this instead of any one of these I'm going to use a custom type now I have a few of these already built into here and so they may not pop up for you but this is actually a sneak peek this is actually what we want but if you don't have this popping up right now what you can do is actually go in in this case I'll just show a different value what we're going to first say is how we want this formatted with how many decimal places so
I want all the values before the decimal place then a decimal place and then I only want in this case let's go with two places after the decimal place and then from there I want a K on the end so basically to show this as a thousand so I'm going to put a comma and then a K inside double quotes and I'm going to click add okay so now this changes it to the double digits with the K explaining that this is the thousands whenever I add that comma before the K it automatically does the math to
basically divide that by a thousand and transfer this to K instead of the thousands anyway I'm going to go with the original one I had of only one decimal place and bam that's our final visualization and we can see from this that we have a lot of insights into understanding that more junior roles like data analysts data scientists and data engineers are more prevalent than the senior roles and that luckily it seems like there's a lot more data analyst roles than data scientists and data engineers all right you now have some practice problems to
go through and get more familiar with those four major types of visualizations that frankly I feel I'm using on a daily basis anytime I'm making visualizations so don't think that they're just too plain or too simple they're really powerful in explaining data in the next lesson we're going to be jumping into not only more advanced charts but even more advanced customization so with that I'll see you in that one we're going to crank this up a notch and get into some more advanced visualizations specifically on this we're going to be doing a deeper dive
into the pay of different jobs not only based on the different job titles but also based on where a job is located using things like a map chart and so for all these charts also we're going to be looking into how we can further get into deeper customization of these so Scatter Plots are great at comparing two numerical values in our data set we have these two columns here one on the salary year average and the other on the salary hour average just as a background on why it's called average at the end of these
sometimes job postings have a range of salary and so I took the average of the min and max and hence I named this average anyway we have yearly salary data and we have hourly salary data what I did next is get the unique values of the job titles and then from there using that basically modified MEDIAN IF function got the yearly median salaries and then the hourly median salaries so because we have these two numerical values to compare basically we want to see if there's a trend correlated between the two because well there is
we're going to find out I'm going to go ahead and select these all then from there go into insert and we can come into charts I know I want a scatter plot and if we go to insert it in can't see cuz it's hidden behind here well we'll just go ahead and show it this isn't necessarily showing us what I want us to show with this it's basically showing hey this is the yearly data up here in the blue and then this is the hourly data since hourly data it's super low it didn't work out
how I wanted to by selecting all the data like we've previously been doing instead I'm going to go ahead and delete this what we're going to do is we're only going to select basically this B and C column of data once again we're going to try again inserting that scatter plot and at this point it's actually working correctly as we want it unfortunately we can't tell there's no basically like data labels for this to understand what are the different job titles associated with it even with the graph we can see that it's only highlighting this
also the incorrect title's up here it's not just hourly median salary we're going to fix all this anyway the first thing that I want to clean up is actually the selection of data right now we can see these numbers are overlapping down here also it goes all the way down to this zero axis on both the X and Y I want to change that so I'm going to double click this x-axis and the format axis pane pops up and we can see that we have bounds here 0 to 180,000 I can see that there's no
values under about 75,000 so I'm going to go ahead and put that in for the minimum and press enter so it's going to update this way now I want to do the same thing for the y-axis I'll just double click it and this one didn't necessarily go where I wanted it to go I wanted to actually change the values here so we can go under axis options under axis options again and under axis options again we can change this minimum and maximum I'm going to change it to looks like there's nothing above 20 or below
25 so we're going to go with that now even with this change in the formatting of the values here the minimum I can still see that there's overlap here so I want to update this similar to last time basically cut it off at the thousands place and put a K at the end so under axis options axis options again I'm going to close this drop down of axis options also instead we're going to go to number for this we want a custom type and I do have some values in here but we're just going
to go if you don't have them in here we're going to add a new one specifically with this I want to show a dollar sign at the front and I don't want any decimal places whatsoever so I'm just going to put a zero in there and then from there like last time I want to format this in the thousands place so I'm going to put a comma and then double quotes around the K which signifies I want to format this in the thousands place I'm going to go ahead and
click add and now this is much more readable not so much sensory overload for our y-axis I don't care at all about this decimal place right here so going back into numbers again I can just format the decimal places as zero and I'll just leave this one as an accounting category now which one's yearly and which one's hourly salary well we need to include actual axis titles for this so I'll go ahead and enable that and then for this we're going to do something a little bit different I'm going to select this
y-axis title and instead of actually typing values in I want to actually use the column header right here so I'm going to come up into the formula bar type an equals sign I'm going to select C1 and then press enter and now this updates for that column header I can do the same thing here for the x-axis title selecting it then from there going to the formula bar putting an equals sign and selecting cell B1 and pressing enter for the title we don't want that hourly median salary we're really trying to find out what jobs have the
highest pay and we can basically tell it from this all right so let's actually finally get to adding data labels to this and we can see what data labels are actually available by scrolling over the different options here we're going to just go with above for the time being then I'm going to close on out of this and I'm going to select the data labels themselves and format data labels should pop up if it doesn't you can also go about doing it by right-clicking this and going to format data labels anyway for this I don't
want to actually show the X or the Y value for this anyway I made it disappear by actually closing out of that so actually I'm going to have to add those data labels again anyway going back into it under label options label options then label options again I'm going to leave that y value selected for right now but what I want to do now is provide the job title itself right next to the data point so I can do this option here so label contains value from cells and it's going to ask
me to select the data label range and so now this is when I'm going to select all of these different job titles here and press okay so now we have these values from cells I no longer want these y values and I do want to include these leader lines because we're going to be actually dragging these around because as you can see some of these values are overlapping now also I'm noticing that this is really busy right now with all this text and stuff so I'm actually going to remove the grid lines for the time
being actually for the remainder of this cuz I don't feel like it really needs the grid lines in general and now I have a little bit less sensory overload so I can go through and actually clean up where a lot of these different job titles are located by just selecting it and then dragging it and you notice we had that leader line selected so I have arrows or basically lines going to each of these ones to signify which one is which so now I'm dragging these all over so that way they're basically more
representative of what I want sometimes if I drag off of this and drag maybe the whole chart itself and make a mistake I just press Ctrl+Z and it reverts it back to where I'm going and then I just continue on to selecting the box that I want and moving it anyway this is pretty neat now I could actually go in if I wanted to and add a trend line to this and basically it shows for an increase in that yearly salary I expect the same with the hourly data in this case I don't find it as
useful so I'm going to just leave that off but in general it is pretty neat to see the trend that's going on with this that senior data engineers although they're underpaid compared to senior data scientists in yearly salary you could get the hookup if instead you look for an hourly gig in order to get a little bit higher pay a similar dynamic happens between business analysts and data analysts so if you're a data analyst and you're looking for a job maybe on Upwork maybe you should advertise as a business analyst instead
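As a rough illustration of the custom number formatting used on the chart labels and axes in this lesson, here is a minimal Python sketch of what a format code with a comma before a quoted K appears to do: scale the value down by a thousand, round it, and tack the K on the end. The function name `format_thousands` and its parameters are made up for this example, and negative values are not handled the way Excel would display them.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_thousands(value, decimals=0, prefix=""):
    """Approximate an Excel custom format such as 0.0,"K" or $0,"K"."""
    # The comma in the format code is what scales the number by 1,000
    scaled = Decimal(str(value)) / Decimal(1000)
    # Build the rounding step: 1 for no decimals, 0.1 for one, 0.01 for two
    quantum = Decimal(1).scaleb(-decimals)
    rounded = scaled.quantize(quantum, rounding=ROUND_HALF_UP)
    # The quoted "K" is just literal text added after the number
    return f"{prefix}{rounded}K"


print(format_thousands(9600, decimals=1))     # bar chart data label
print(format_thousands(75000, prefix="$"))    # axis minimum with a dollar sign
print(format_thousands(180000, prefix="$"))   # axis maximum with a dollar sign
```

The cell values themselves never change in Excel; only the displayed text does, which is why the chart still plots the full numbers.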
all right going back to our data set itself we have another column in here I want to investigate and that's specifically around the country it's called job country basically where the job is located at and I like to visualize these types of things well on a map to actually see how it affects others so I've made this table here under the map chart tab where we have all the different countries in the data set then from there we use a COUNTIF to determine how many counts for each of the countries and then our
modified MEDIAN IF in order to determine what the median salary is in each of these countries I've also had to wrap this one in an IFERROR because for some of these if there's no values it throws an error and I didn't want that popping up in the chart so I had it disappear or make it basically a blank value if it does have an error anyway let's get into visualizing this we're going to first just visualize what are the counts of these different jobs based on the country so I'm going to select column A
and B go to insert and then maps and go to this map chart now you may have a pop-up warning that comes up during this that says data needed to create your map chart will be sent to Bing and I'm fine with sending this data to Bing you should be fine too with it so feel free to accept this then you shouldn't get this pop-up anymore anyway this chart's pretty neat because it goes and shows we have a heavy concentration of jobs basically from the United States for my job scraper I'm heavily aggregating jobs
from this country compared to other countries sorry other countries out there but I am still nonetheless collecting from other countries like the US has 25,000 India is around 580 for this one I'm going to change the title to where are most jobs in Luke's data set from this is not to say the United States has more jobs than other countries this is just how my data set is and how I extracted the data so I don't want you to come up with the wrong conclusions from this now the visualization that I really care about is comparing these
countries to the median salary so holding control I select column A and then C I'm going to do recommended charts from this cuz I'm having problems using the maps one anyway I see that it has the filled map here I'm going to select okay and I have all the data filled in all right with this visualization we can now dive in we can see that we have a range of these median salaries from over 157,000 down to 30,000 with a country like China having around 68,000 and then over in Africa we have Algeria at 45,000 so
looks like we have a lower salary in the African continent over in North America and also South America pretty high salaries along with Australia as well anyway pretty cool visualization we were able to generate out of this I mean I love data and I just love this visual with this I'm going to change the title to what are the top paying countries now the last thing is a minor point sometimes if you're going ahead and actually moving maybe columns around you'll notice that my visualization is also moving as well and this can wreak havoc especially
whenever you've made your dashboard or made your chart a certain size and then move columns around and it messes everything up we can fix this so I'm going to go ahead and Ctrl+Z both of those column moves to get it back to where I had it previously and then from there I'm just going to double click on the chart itself go under chart options and once again this like resizing one here and going under properties right now it's selected under move and size with cells we don't want to do that basically we don't want to
move or size with the cells so I'm going to select that now closing out of this whenever I go to adjust the column size it's not going to adjust the visualization at all this is much more of what I want also one last note on this I do have a filter currently applied to this data set specifically if I go into it it's a custom filter and I wanted to make sure that I had basically removed any NA values so I put hey I want values where the median salary is greater than zero and less
than 200,000 so if I go ahead and clear this filter we can see that we have some other values up here basically Russia's up here at 300,000 for a median salary and if we actually go in and investigate Russia we'll see that they only have around four jobs with salary data listed so I feel like this salary is more of an outlier than anything so that's why I'm applying this filter of 0 to 200,000 applying this filter again we get our final visualization now you could also play around with this and filter it based on the number
of counts to make sure you have values that are above a certain count that's also an option and probably maybe even a better option as well all right it's your turn now to dive into those practice problems to try out some different advanced visualizations along with some advanced customization with that in the next lesson we're going to be diving deeper into understanding how to use statistical analysis specifically box and whisker charts and also histograms and how to read them with that see you in the next one this lesson is going to be focused on actually
visualizing a lot of the functions that we used in that statistical functions lesson where we're looking visually at things like the median and quartiles specifically we're going to do a refresher on histograms we've seen it a few times already but we're going to dive into further understanding how salaries are distributed specifically for a target audience of data analysts in the United States you can feel free to do whoever you want and then from there based on the limitations of it only being able to visualize one job
title we're going to shift gears to looking at box and whisker charts and these are great at also showing statistical distributions like a histogram but we can take it a step further and compare different values specifically in this case we're going to compare them across the different job titles on how they're distributed now box and whisker charts probably aren't a chart that you're familiar with or most people are familiar with so we're going to go through a review and break them down to understand those concepts we talked about previously about median and
quartiles and where they fall into this for this we're going to be using the charts statistics workbook specifically we're going to be starting in this data Tab and for all this we're going to be analyzing salary data in this video we're going to be focusing specifically though on that yearly salary data so let's actually go back into breaking down how to read a histogram we go back into insert recommended charts and then from there select histogram and insert in the histogram I don't like where it is right now I'm actually going to move this chart
into a new sheet now quick refresher on histograms each one of these bars represents a count of values within a range so in this case there's 920 values between the range of oh my gosh so hard to read 75,000 to 81,000 and as we're noting by this we have a large number over here it gets even out to 960,000 this would be called a skewed right distribution now this is different from a column chart because this data down here on the x-axis is basically continuous data when one bin stops so this first bin of 15,000
to 21,000 the next bin picks up now the first problem with this histogram is this is for all salary data specifically all job titles across all countries I want to actually fine-tune this to look at my specific use case of data analysts in the United States so you can come here into the histogram 2 tab and I have the four columns of interest that I want to use from the data tab and I already have the filters applied but if you want to you can come in here and actually select to clear these filters and
I'll just select it here from that Home tab then from there I'm going to go through and select data analyst roles that are full-time only that are in the United States and then finally I don't want any of these blank values here so I'm going to uncheck this value here for blanks now I will say filtering this data did take some time to actually do so don't be alarmed if this takes more than 10 or 15 seconds all right so back in let's actually make a histogram with this data we'll go into insert from here I'm
going to insert in a histogram now once again this distribution is like the last one skewed right and we have a heavy amount of outliers right here even out to this one value around 370,000 I don't think this provides a lot of value instead I want to actually focus more on this actual distribution and not actually on this portion out here where we have just outliers anyway I'm going to come in here into our filters up here insert a number filter and that it's less than 300,000 click okay all right this is looking a
lot more readable in which we can actually see the x-axis now each one of these bars right here or what you would see in like a column chart are called the bins and they're all equally spaced but we can control the width of each one of those bins that they encompass specifically I can double click on the chart to bring up that pane to the right selecting the x-axis I can then go into axis options and then once again axis options we can go into something right now we're noticing that the bins are
automatically determined we can actually change this bin width I'm going to change this to something like 15,000 notice that it is bigger in this case the bins are bigger than they were previously you can feel free to test different options if you will I feel if you go too small in the case let's say we went down to 1,000 it just gets too noisy and also you can't necessarily see the distribution as well so really you just have to play around with it until you get to what you want to find as far as the axis
goes this is sensory overload for me way too many zeros in here so I'm going to remove this selecting the x-axis we can see that it has format axis now I can go under number and once again we can go in our custom type none of the ones that I've previously done are here sometimes it pops up sometimes it doesn't we're going to go ahead and just put in we want the dollar sign zero and then formatted with the K value basically removing all those thousands zeros and I'm going to
go ahead and click add all right this is a lot more readable to actually see what those different ranges are and from there I'm going to change the title to how much do data analysts in the United States make probably also best practice here to add a title on the y-axis for count of jobs and bam now we have this final visualization shown on our histogram we can see that a lot of the salaries are more around the range of 85,000 to 100,000 with 70,000 to 85,000 coming up next so this is really
visually great at showing where I can expect to have a salary as a starting data analyst but now what if we want to analyze multiple different job titles what we're going to eventually get to is this box plot here where we're plotting it for all the different job titles we'll be able to actually compare different values across each other but before we get to that we need to first understand how to read a box plot also sometimes I call it a box plot but it's also known as a box and whiskers chart anyway I made this
visualization here you don't have to do it there's a bunch of customization along with it the main purpose of this is to demonstrate or help understand how to read a box and whiskers chart so I took our data that we previously were analyzing for data analysts in the United States it was a full-time role along with all the salary data and then like we previously did I calculated things like the min first quartile median average third quartile and max just ignore this portion right here it was used to build this visualization right here
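To make those summary values a bit more concrete, here is a small Python sketch of the numbers a box and whisker chart is drawn from. The salaries are hypothetical and not taken from the course data set, and the 1.5 times IQR cutoff for outliers is an assumption based on the common convention, so treat this as a sketch rather than exactly what Excel does under the hood.

```python
from statistics import mean, quantiles


def box_plot_summary(salaries):
    """Return the values a box and whisker chart is built from."""
    data = sorted(salaries)
    # method="inclusive" uses the same interpolation rule as Excel's QUARTILE.INC
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: the middle 50% of the data
    # Assumption: points more than 1.5 x IQR beyond the box count as outliers
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "min": data[0], "q1": q1, "median": median, "mean": mean(data),
        "q3": q3, "max": data[-1], "iqr": iqr,
        # Whiskers stop at the last points that are not outliers
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical yearly salaries for illustration only
sample = [60000, 75000, 82000, 90000, 90000, 98000, 110000, 125000, 250000]
summary = box_plot_summary(sample)
for name, value in summary.items():
    print(name, value)
```

With one very high salary in the list the mean lands above the median, and the top whisker stops short of the true max, which is the same pattern described for the chart in this lesson.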
anyway I tried as best as possible to line up this histogram where we have the x-axis going from 25,000 to 285,000 with the box and whiskers chart I made below it from 25,000 to 285,000 so the box itself signifies what data nerds call the interquartile range basically all the values between Q1 or quartile 1 and quartile 3 had a typo there got to fix that anyway that's why it was so important that previously we calculated that first quartile and third quartile and if you remember from that they're quartiles so 50% of the data falls
within this box and if we look up and were to draw imaginary lines into our histogram we can see that about 50% of the data does fall within this next up inside of here is a line that is for the median in this case our median is 90,000 and then we have our average of 95,000 which as we discussed previously the average is going to be higher here because we have things all the way out here called outliers basically dragging that average higher and outliers are signified by these dots outside of the whiskers
themselves these whiskers are the lines and the lines themselves extend to the minimum and the maximum and these are just relative mins and maxes they're not necessarily the true min and max anyway so that's a box and whisker chart and frankly by themselves I don't think they're really great but when you pair them with other categorical values I find them super interesting so let's actually build this visualization so you can come over to this box plot 2 tab and I have our data inside of it none of it is filtered it has all the different job
titles and all their associated salaries for this I'm going to select column M and then also holding control I'm going to select column A then from there go into insert and go to recommended charts and from there look at the box and whiskers chart which looks like it's already pulling it up for us so let's pop this bad boy in now one drawback of these box and whisker charts in Excel is unlike that last box plot that I made I custom made this in order to make it appear in this horizontal fashion you can't actually
do that you can only have the option to have them vertical up and down anyway this is pretty close to what we want to get the main problem I'm noticing right now is we have outliers up to 1.2 million and with the data around 100,000 to 150,000 it's really hard to actually look into those boxes so I'm going to change this y-value scale double clicking on the y-axis I'm going to change the maximum to 300,000 additionally since we're here I'm going to change that number formatting to use that 0K value then also I'm
finding with this color it's a little hard to actually see these x's in here so under series options selecting fill and line then fill I'm going to change this color to more of a lighter blue okay and that's definitely easier to read I'm going to add a vertical axis title of salary USD I'm also going to bold it all to make it a little bit more readable and then from there change that chart title to what are the top paying jobs in data science all right getting into actually analyzing this and getting insights from it now one drawback
out of this is there's not an easy way to sort these values right here right now I'd normally put them high to low I'd probably put them high to low based on median salary but they've been put into this graph based on the order that they first appear over here in column A and that's when they pop up so that's the order so technically I could go through and sort this column alphabetically but that's going to take a little bit too much time if you want to do that feel free to try that out anyway
it looks like roles like machine learning engineers and also software engineers have a pretty large interquartile range or where that middle 50% of the data falls so there's basically a wide range of salaries you could find with that whereas data nerds data scientists data analysts and data engineers have a tighter band also as expected those data analysts and business analysts have some of the lowest median salaries where something like the data engineers and the senior roles have even higher median salaries overall this is pretty great for going in and comparing values I
would probably work with this more to fine tune it to only have a couple of job titles in it and for that we can use something like slicers which we'll be covering in an upcoming chapter well the next chapter when we get into advanced techniques in Excel so we'll be able to customize this further once you have that knowledge all right you now have some practice problems to go through and get more familiar with those histograms and also box and whisker charts in the next lesson which is a quick one we're going to be
moving into sparklines which is the final lesson in this chart overview with that I'll see you in the next one moving into this last lesson on charts focusing on sparklines sparklines are basically ways to insert mini charts into a cell that summarize the data that's next to it if your data is coming in a horizontal form similar to this table you probably have the possibility of considering inserting a sparkline we're going to be going through how to make them but also customizing them all right for this we're going to be using the
sparklines workbook for this we have like usual our data tab and then our original tab that calculates data off it and for this data set we're just looking at what are the counts of the different job titles based on month so this is basically horizontally oriented this is great for a sparkline so how are we going to do this well we'll go ahead and select the data only so C4 to N10 then come up into the insert tab then right here we have this section on sparklines we can insert a line column or a
win loss we'll just start with column and it fills in for the data range C4 to N10 but it wants us to choose where you want the sparklines to be placed so the location range and click this arrow here and then from there actually drag it next to it I'll close this arrow back and click okay anyway I wanted to demonstrate that bar chart because it's not really that great for here remember anytime we're doing continuous data in this case we're doing that monthly data I'm going to want to use something
like a line chart instead so I can easily change it by coming up here selecting all of our different data selecting that sparkline tab and then just changing it I can change it to something like win loss which no doesn't really suit this data or to line which is what we really want from this now getting into the customization of this personally I like blue so we're going to stick with the blue color but we could change the color if we want to and the other thing we can change is the marker color right now we don't
have any markers on it we can actually change which markers are on right here in the show section right here so I can select the high point right now it's going to highlight all of them red low point also red negative points there's no negative points you can also do the first point which I don't really find much value in or last point and then finally the markers option itself which just puts a marker on every single one of them I really like this high point and this low point and we can customize this the
high points I would really want to call out to be a green color right now it's this green that's sort of hard to see so I'm going to change it to something a little bit darker and bam we can see that one a little better the red for the low point I'm going to keep as is and the last thing is all this data has basically a grid around it I'm just going to add that in real quick by selecting all the cells come up into home into the borders I'm going to put in all
borders around it then it looks like I have a double line right here for this lower one so I'll insert this bottom double border and then finally I'm going to put a thick border around this all bam we have our final visualization there now I can go through and see things like okay with data analyst and other analyst roles we saw spikes in January but with things like data engineers we didn't see a spike however all the job titles ran into a similar problem where apparently they ran out of budget and the least amount of jobs were
posted in November and December so this is a pretty cool feature to show some quick snapshots about the data you're looking at all right you now have some practice problems to go through and basically practice making some of these sparklines we're going to next be jumping into the next chapter it's our final chapter of the basic section and it's going to be focusing on advanced features inside spreadsheets such as tables formatting and how to collaborate with others it's our last section before we build our first project so with that I'll see you in the next chapter
on advanced spreadsheets data nerds welcome to this last chapter in the basic section focusing on advanced features in spreadsheets this is the last chapter we're going to be covering before we get into our first project and this chapter is broken into three different lessons this one right here is going to be on tables how to use tables how to use things like slicers and how to manipulate them second lesson is on formatting not just on making cells look pretty but developing conditional formatting rules in order to highlight cells according to well a certain rule pretty interesting
feature within Excel and the third lesson is on collaboration for a project we're going to be making a dashboard and so we need to enact certain measures in order to protect it and prevent people from going in and messing it up and so we're going to go over a lot of features in order to set it up properly anyway back to this lesson what are we going to be doing for it well first we're going to start out by using a smaller subset of our data set basically 15 rows and creating your first table we're
going to be manipulating it using custom formulas that we really haven't seen before along with using some other ones that we have seen before in order to calculate totals subtotals and aggregates by the end of this lesson we're going to be building a mini dashboard to analyze that histogram that we talked about in our previous lessons specifically we're going to add slicers to it in order to be able to filter down and look at a subset of data that we're most interested in and that all couldn't be done without the help of tables for this
lesson we're going to be using the tables workbook in chapter 4 for this you're going to start in the tables intro original sheet and then the final one's going to be what we're going to eventually get to all these are going to be labeled similarly with the original and final and we're all going to be working with the original it should look like the final when you get done with this so let's dive into creating our first table first thing you have to do is make sure that we're selected somewhere in here we don't necessarily
need to select the full table but just somewhere in here from there we'll go into the insert tab and we'll insert a table also notice that we can use the shortcut control T for this so I'm going to do that instead and for this it automatically pinpoints the rightmost cell and the bottommost cell and we need to make sure we have this check mark enabled of my table has headers because we have well column headers and bam we just made our first table this lesson's over but seriously let's actually get into exploring this
table design tab that now appears anytime you have the table selected if I click off of it it disappears anyway we're going to first look at the table name and I like to have a table name that's easy to reference so I'm going to just name it something like jobs it's going to come in handy naming it something simple whenever we're making formulas later for this now we'll get to this section in a little bit on tools and external table data but I want to move over to the style options you can play around with
some of these options here where you can highlight the first column or you can highlight the last column it has a lot of different formatting options with it but what I really like is this color formatting if I'm not really liking the color that it's given to me I just come over here and select a new one so we'll get back to table design in a bit but what's really the benefit of this table well one thing is you can easily add data to a table and it will autofill let me show you let's say I wanted
to add a new column with a salary year average copy whenever I enter this new column name and press enter it automatically fills this in the skills are sort of covering this up right now sorry about that and I can make this a little bit bigger but you can see we have salary year average copy now included within this table and I can verify that it's included also in this table by if I go to resize table it will say that now it goes to K16 now for this I just want
to copy the results of the salary year average column over here in H so what I'm going to do is press equal to and I'm just going to select the cell over here of H2 now this is what I was talking about whenever I said tables have their own unique formulas what it's going and doing here is it's referencing the salary year average column which is this portion right here and then it's also using this at symbol to basically refer to the value in the same row so for K2 that's H2 anyway
when I go ahead and press enter watch what happens we actually fill in all the different values of this so if I were to actually double click into this one down here we still have that same syntax of we're selecting the salary year average column and we're using that at symbol to get the one that corresponds in that same row now let's dive deeper into these different formulas we can use for this table so I'm going to come over here into column N and for this remember we named our table jobs so I'm just
going to type in jobs and I have two tables in here one called job and one called jobs you'll only have one popping up in here anyway it automatically pops up so I'm going to select jobs and now whenever I do this I'm going to press enter it's using our modern dynamic arrays basically to fill in all the data that we have over here inside of our table so pretty unique in how we can reference this now what happens if we wanted to also include the column headers up at the top well I can type in jobs and
then from there I'm going to add a square bracket and we have a few options popping up right now it looks like it's just the column titles but if we scroll down we have these values here with hashtags in them specifically I want the column headers so I'm going to put hashtag headers I'm going to put a close bracket on this and then press enter and now we have the column headers across the top now that's a little bit too much work having to do two different formulas for this if instead I wanted to
do jobs and then square bracket and see the options available I can see I have an all a data only a headers and a totals row totals row we're going to get to in a little bit so we'll do the all for now and if I go ahead and press enter bam we now have our column headers and also the data itself but what happens if you want to just access certain columns well I thought you'd never ask well once again I can type in something like jobs put the square bracket and
then we have a list of different columns available let's do the salary year average and do a close bracket once again this is going to provide the data values only if we wanted to include the specific header for this I once again need to put in jobs and this time I'm going to need to specify not only the headers so I need to put this in its own square brackets but I'm also going to have to do a comma put another set of square brackets and put salary year average within its own brackets so it's almost like a
list of items if you're familiar with python this would be like a list anyway we have the headers in brackets and we have salary year average in brackets pressing enter we get salary year average up at the top now honestly an easier way to do this all is to well use that all command or hashtag all but it has to be put within its own square brackets then from there a comma and then we want to say hey the subset only that we're providing for this is salary year average close that bracket and then close
the entire brackets for jobs now from there when we run it we get the salary year average header along with all the column values at any time if you forget that it's not that big of a deal as you can just go through and put an equal sign and like we did previously I could just highlight well not that our salary year average column and look it automatically populates with that same formula as above here and when I press enter boom it pops up there so don't think you have to memorize these formulas that I just
went over but what do all these formulas actually provide any value for well let's look at a use case let's say I wanted to identify jobs that whenever we looked at the skills we could find out if they contained the skill of Excel or not so I'm going to create this new column over here and call it Excel and for this we're going to be using the search function which we need to provide what text we want to actually find conveniently I put it in the column header so I'll go ahead and just select
it and it automatically populates the formula for this then from that we need to go to the next parameter of within text we're trying to look at that job skills column it puts that at symbol at the front of job skills to basically signify look at that row then from there I'm going to go ahead and close the parentheses and press enter so for that search function it provides the numerical location of Excel in here Excel is 36 characters deep into this so I'm just going to modify this cuz I don't really care about the
number of that I'm going to use the is number function which checks if it's a number and then returns true or false in this case we have true values so we know that for these rows if they contain Excel they'll have true so that's how I find myself using these different formulas and understanding how to actually manipulate them anyway let's get into our next step let's say we wanted to include some sort of totals row in order to maybe calculate median salary how many job postings there were etc so
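Since Python came up a moment ago, here is a rough Python analogue of that search plus is number pattern, just to show the logic. It is a sketch under assumptions: the helper names `search` and `has_skill` and the sample skills string are hypothetical, and it only mimics the case-insensitive, 1-based position lookup (Excel's own function also supports wildcards, which this ignores).

```python
def search(find_text, within_text):
    """Return the 1-based, case-insensitive position of find_text, or None.

    Excel's search function returns an error when nothing is found; None
    plays that role in this sketch.
    """
    position = within_text.lower().find(find_text.lower())
    return position + 1 if position >= 0 else None


def has_skill(skill, skills_text):
    """Mimic wrapping search in is number: True when the skill appears."""
    return search(skill, skills_text) is not None


# Hypothetical skills text, not a row from the course data.
skills = "sql, EXCEL, python"
print(search("Excel", skills))
print(has_skill("Excel", skills))
print(has_skill("tableau", skills))
```

The point is the same as in the sheet: the position itself is not interesting, only whether one came back.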
we'll go into this table design tab and I'm going to select the total row and now down here in row 17 we have total written down here along with a bunch of well blank values except for all the way to the right looks like it puts us the number of 15 which is the count of these now going over to that salary year average column I can basically select this totals row right here and you notice a drop down appears right here from here we can select some basic statistics average count min max variance
go ahead and select average that's the average of this column right here so pretty neat I'd go through and do that if I wanted to do other columns as well now you can also go into here and select more functions and then like we said we want to calculate median on this salary we could go ahead and select this function of median but I'm actually going to recommend another approach you see if we double click inside of here we actually see that this totals row is using a function specifically the subtotal function so let's actually
build this out from scratch without selecting it luckily we have the salary year average copy column over here so I'm going to go in and I'm going to type in subtotal and it returns a subtotal in a list or database first is the function number what do we want it to actually do and this has even more values available to it that you can actually select from and perform on this so in this case let's say I wanted to find out what the max value is I would plug this in it would be 104 and
then for the reference for this well we're just going to select this salary year average copy column it's automatically transformed into this special syntax and then add a closing parenthesis and press enter and so now we have the max salary which looking at this is true but if we go back into this and actually inspect what values are available in this function number we can see that median is not available in here so what are we going to do well there's another function and no it's not the median function that I recommend instead of using
subtotal and for this one we're going to use the aggregate function and this returns an aggregate in a list or database it's similarly designed where it has a function number but with this one we have a lot more options including things like quartile and stuff like that anyway it has median available as number 12 now the second parameter on options allows us to select a host of options no pun intended for how we want to actually perform this aggregate basically do we want to maybe ignore hidden rows or do we want
to ignore error values in my case I don't really want to ignore anything so I'm just going to do number four and then finally we need to insert the array or the column itself in this case we want salary year average closing the parentheses on this and pressing enter we get our median value of 94,000 now depending on how fast your computer is you're going to run into some limitations here I have the table limits original tab which is the next one we're going to be working with in this portion of the lesson it
has around well 32,000 rows which is what's in the data set anyway we're going to run into some limitations as I'm going to show I'm going to encourage you to just watch me do this and then from there basically decide if you think you have a strong enough computer or not to continue on to do this but if you have a pretty basically slow computer I wouldn't necessarily follow along with this anyway I'm going to convert this into a table by selecting any portion in here pressing control T it selected all the different values
and my table has headers so now we've converted this into a table and one of the benefits we haven't really discussed yet is the ability to actually filter data because it automatically provides this filter up at the top now I'm going to go ahead and filter this down based on a data analyst job title and when I go through and actually select this to just select data analyst and press okay it runs pretty quickly but I have run into problems in the past especially working with smaller computers where it takes a while to do
this I'm working with about 24 GB of RAM on this virtual machine so if you're at something like 8 or even 4 I'm going to highly recommend that you not perform this exercise so moving to this last exercise of this lesson I've gone ahead and condensed down this data set you can go into histogram original and it's our previous data set I basically shortened it down to these four columns and limited it to only positions that have a salary year average value listed basically if there are blanks I removed those rows so it's about 20,000 rows anyway this
is what we're going to be manipulating for this this shouldn't lock up your computer if you have basically a computer with less RAM and we're going to convert this into a table first pressing control T it selects all the values on here and I press okay so now we have a table now also in this sheet you may have noticed hopefully that it's been on the screen I have this histogram here which is basically aggregating the data from this data column on salary year average anyway we're going to be manipulating this further we want to
basically make this into a dashboard so we can go through and maybe filter for different job titles different job schedule types or different job countries and it can be mildly inconvenient to come up here and actually select this arrow and then go through and select the values you want that's why slicers are great so with our table selected I'm going to go into table design and then from there under tools I'm going to go to insert slicer we're going to be entering in both a job title short job schedule type and a job country slicer so all
three are here now I'm going to go ahead and position them make them look a lot neater all right got them cleaned up and then from there I can go ahead and actually select the slicer and if you notice this slicer tab pops up conveniently labeled this slicer has a caption on it or a title as well and I can just rename it basically to a more visually appealing title in this case I want to call it job title and then it updates here for job title I'm going to do the same for the
other two updating them to schedule type and then also country now by default this slicer and all the slicers have all the values selected so if I wanted to go in to actually select a value I could do something like well we want to look at data analyst I just select data analyst it's going to clear all those other ones and then only select data analyst as you notice it took a second for it to actually load that's why with this 20,000 rows of data even that's a little high for tables I recommend it
around 10,000 if you're using tables anyway we have it filtered down to data analyst I could also do it down to full-time along with filtering it for basically I want to do United States if you notice these values are grayed out that means there's no country basically available with the current selections that I have of data analyst and full-time so that's what that means there but I can go into that for United States selecting it and bam we now have our final basically visualization but what happens if I want to maybe look at
multiple different values what if I maybe want to look at both data analyst and business analyst well in that case you want to select this box up here and it allows multi-select and so I enable it and now I can go through and select something like business analyst and this provides both those values along with if I wanted to look at full-time and also part-time I could enable the multi-select on this schedule type and select part-time and bam now we have multiple values selected for this along with the United States and this makes
the dashboards that you're building a lot more interactive and a little bit fun to play around with and to visualize the different data all right we have some practice problems for you now to go through and dive into not only creating tables and manipulating them but also adding and playing with slicers as well with that we'll see you in the next lesson we're going to be jumping into formatting specifically conditional formatting so see you there in this lesson we're going to be focusing on formatting and not just cell formatting where we're going through and adding borders and
colors but also conditional formatting where a cell's formatting basically its highlighting will update dynamically based on a value in the first example we're going to focus on cell formatting specifically we're going to go back to that table that we've worked with previously that does a count of data science jobs by month anyway we're going to go through and actually format it using all the different functions we can in order to make it look pretty like I made it from there we're going to move into our first conditional formatting example where we're going to look at
basically highlighting based on a job title those that are basically high and those that are low highlighting them appropriately green or red and then in our final example we're going to move on besides using color scales to also using things like data bars and also icon sets to make it look a lot more dynamic we're also going to go over best practices on what not to do cuz sometimes you can go overboard in how much you're actually coloring a table and you can make it a little distracting and ultimately not meet your goal for
this lesson we'll be working with our formatting workbook in chapter 4 as usual all the data is located in the data tab and we'll be starting with the underscore original of each of these sheets and then in this case for format original we'll have what it should look like in format final for the cell formatting we're going to be using this format original sheet and we're going to be focused on this home tab here so I'm actually going to leave it expanded and for this we're going to make this to where well what
this table looks like by going through and actually formatting using all the different features in here so the first thing we need to do is highlight it all and actually remove the formatting so with it all selected I can go to editing and then clear and I can either clear all which is what I don't want to do I want to do clear formats and bam now we have an ugly table that doesn't really make a lot of sense now previously we were messing with tables so I could highlight from B3 to O10 and make
this into a table by coming up here to format as table basically selecting the color that I want saying that it has headers and allowing it to update that's definitely an option but I'm not necessarily a fan of this so I'm going to clear this by pressing control Z now an underused feature of formatting is this cell styles option right here so I'm going to go ahead and select the months up here basically the titles and for cell styles they actually have a lot of pretty unique formatting you can see happening in the background
so in this case I'm going to try out heading two which is pretty neat because it makes it bold slightly bigger and it puts a little line underneath it I could do something also where I highlight all the rows over here and then make this into maybe heading three and then all these values in here are calculations so technically I could just highlight this all and for the cell styles I could come up to the top here and select hey this is a calculation and this is not a bad looking table
but not necessarily all I want to do so I'm going to just remove this all instead I'm going to start with my months I'm going to make them bold and also add a light gray background I'm going to do the same thing over here for the values in my rows and then from here we're going to get the actual column grid lines put in I'm going to only select C3 all the way down to O10 I'm going to show you why and I'm going to add an all borders so this is neat it
adds all borders to it what I'm going to also add to this which will add a little bit of flair to it is a thick outside border so now we've got a thick outside border around all of this and I'm going to do the same with this one of an all borders and then a thick outside border now it did remove that thick outside border that I had on this line between B and C so I'm actually going to go ahead and put that back in by just clicking it next thing I want to do
is format these with a comma so I'm going to come up here and well add a comma and then unfortunately it adds this space in here and makes this table bigger than what you can see now I'm going to first remove the decimal places and then in order to fix this I'm going to highlight all the different columns through here to January and just double click on one of them to make them slightly smaller anyway it's still not fitting completely on here and I want this to fit within the view here so I'm just
going to select this all and I'm actually going to make these values slightly smaller and I'm not liking the positioning of these it looks like it's lower now that I made this smaller so I'm going to actually center this and do a middle align to basically move it up slightly all right my OCD is now satisfied all right now this is looking good now the last thing we want to do is add a title to this basically describe what is this table that we're looking at and I want to insert this in up on the
top row but I basically want it centered over this table so what I can do is highlight from b11 and from there select up here for merge and also I want to center because I want my text centered in this and from there I put in hey this is the data science job count tracker and for the cell style I'll make this heading one now let's get into conditionally formatting this table and specifically I want to say if I'm looking at data analyst I want to be able to look across here and see which
ones are the highs and the lows right now I have these sparklines and I can see that based on the green and red for the highs and lows but I want to actually be able to see this in this table right here and so underneath the home tab we have this conditional formatting available we're going to focus on these three right here first and that is data bars and you can see if I put it in it's basically looking like a you know like a bar chart color scales allows us to do well different
color formatting with it and then an icon set basically allows us to put in a nice looking icon and we're going to stick with simple for now we're going to do color scales right now I have C4 through N4 selected I'm going to go ahead and select this green to red which is not bad if we're looking at this this is doing exactly what I want I want August which is the highest to be highlighted green to attract my eyes to it and then I want the red to be November and December cuz at least I
want to attract attention to it but we want to highlight the entire table here so if I were to actually select the entire table if you will from C4 all the way down to N10 go into conditional formatting color scales and do the same thing you're going to notice it basically does these bands but it does this entire table all formatted together and this is not what we necessarily want of course the total row is going to be the highest I want to look through each row and actually see where I should be actually looking
so anytime we need to clear or mess with any rules we come into conditional formatting and go to clear rules you have clear from selected cells or entire sheet we're just going to do the entire sheet then we're going to go back to where we were before of selecting just the data analyst values going into conditional formatting color scales and I'm going to go to this green white red I actually want to try to limit how many colors I use two is enough so I'm going to go green white red and I really like
this one better now I don't need to necessarily go through once again of selecting senior data analyst doing this again what I would do instead is I'm going to select data analyst here and then come into this home menu up here and you notice this paintbrush this is a format painter in the instructions it basically says select the content with with the format you like click format painter and then select something else to automatically apply the formatting so from here I can just paint my formatting on unfortunately this doesn't have a shortcut so I have
to go do go back up every single time it removes their marching ants reselect the format painter and go through and select it but now we have this formatted how I want it where I can look at a certain Row in this case I look at data analyst see what some of the highest are and Senior data Engineers I can see how they contrast to the other job titles additionally which going be jump into a little bit more later is we can go into manage rules and we can see the current conditional formatting appli right
now I have show formatting rules for current selection I've selected the top cell right up here so there's no conditional formatting if I were to change this to just this worksheet I can then if I expand this down I can see how this type of formatting of the red white green applies to each of the different cells and if I needed to actually control what cells are actually selected I could do that I could have also gone through instead of doing that copy formatting and pasting I could have done a duplicate rule
and modified the rule as well but I decided to do it my way instead anyway this is where you need to go anytime you need to manage conditional formatting we click okay let's crank this up a notch and get into using some more advanced functionality with conditional formatting here we have a new table you haven't seen before basically it has all the different job titles the counts of those jobs aggregated from our data sheet the median salary what is their work from home percentage or likelihood based on the jobs and then finally I have this
job rank right here which basically uses these cells that are hidden right here that if we actually expand it out goes through and normalizes the values so in this case for the job count it normalizes it between zero and one so this job count of 90 is the highest so it gets a value of one whereas the lowest gets a value of zero anyway I did this for all the different values and then from there provide a certain weighting factor of like 0453 and 0.15 in order to weight it appropriately this is all my bias and
how I wanted to actually do it so feel free to adjust it to what you want anyway we have this final job rank in order to assess based on these three values and this is commonly done especially in like KPIs and stuff like that so we're going to be making like icons for this column so let's get into formatting our first column we're going to do job count first and for this one I want to have data bars so I'm going to come down into conditional formatting into data bars and we'll add these data bars
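As a rough sketch of what those hidden helper columns might contain: min-max normalization rescales each column to the 0 to 1 range, and the job rank is then a weighted sum of the normalized values. The layout here is assumed purely for illustration (raw values in columns B to D, rows 3 to 12, normalized helper columns F to H, weights typed into K1:K3); the actual workbook may be arranged differently, and the weights are whatever you decide on.

```excel
F3:  =(B3-MIN(B$3:B$12))/(MAX(B$3:B$12)-MIN(B$3:B$12))
I3:  =$K$1*F3+$K$2*G3+$K$3*H3
```

The first formula can be filled down and then across into G and H, since only the rows are locked. The lowest value in each column comes out as 0 and the highest as 1. Keeping the weights in their own cells rather than typing them into the formula makes it easy to adjust the weighting later.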
right here I like the bars in this case because we're dealing with a count and we can really see especially data analysts scientists and engineers they really make up the majority of the data here so it really draws your attention to it next up is median salary we're going to do similar to last time and use a color scale we're just going to do this first one right here where green is the highest salary and red is the lowest and then one more we're going to do that work from home we're also going to do it
in a color scale but for this one let's actually do a different color go into more rules and in this case we have this new formatting rule window right here I have a two color scale and let's just say I want to do one color I'm going to do white for the lowest value and then we'll do like purple for the highest value anyway this is all basically to show a point this is becoming entirely too visually distracting if you were to give this to somebody else or a stakeholder where are they supposed to look
and actually organize their thoughts on where they should potentially pursue a job right now I'm thoroughly confused looking at this so let's clean this up a bit and for this I want to make it to where I'm maintaining a solid color across so that way you know like hey if this color is darker or there's more of this color I should be looking there so in this case for this job count we're going to just clean it up slightly for the data bars I'm going to make this the gradient appearance cuz
then I feel we can see the numbers better and it's not too visually distracting for the median salary really my goal with this is to find jobs that are let's say greater than 100,000 so let's actually just make a rule that highlights those jobs that are greater than this value in this case I'm going to come to conditional formatting and enter a new rule this new formatting rule popup comes up once again and we have select a rule type this allows us to do things like format all cells based on the value format
only top or bottom ranked values format only values that are above or below average I personally like this one of use a formula to determine which cells to format and in this case I'm going to click this formula box right here you can just select the first item in the selection so I'll select D3 it's going to go through and actually do all of these don't worry we'll see and for that we want to highlight those that are greater than 100,000 and press enter and then
right now it doesn't have any format set so I'm going to click format and we can control a whole host of things such as the fill border font and the number formatting itself but we're going to stick with that blue theme I'm going to just come down in here and I'm going to just select this blue color right here and click okay and then okay again now you notice my formatting is not appearing that's because we have multiple formatting rules applied to a cell which you can do so in order to fix this we need
to come into manage rules and as we see we have both of these applied to it so I actually need to select this one and I need to delete this rule and click apply and then okay now we're running into our second issue and I slightly misled you earlier when I said that D3 works if we go back into manage rules and we see our formula right here I'm going to double click it we don't actually want to provide an absolute reference to D3 because then it's going to evaluate all those cells based
on $D$3 instead we want it to be D3 without the dollar signs so it's not an absolute reference and therefore whenever I click okay and okay again bam now it knows appropriately to check the actual cell that it's looking at within the range on whether to highlight it or not moving on to the work from home we're going to keep this similar although it's not going to be purple we're going to change this to blue instead so going into manage rules we have the actual color right here selected I'm going to just go in
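To see why the dollar signs matter, compare the two versions of the rule formula. This sketch assumes the rule applies to a range whose first cell is D3 (for example D3:D12; the end row is a guess). Excel evaluates a conditional formatting formula relative to the top-left cell of the range the rule applies to.

```excel
=$D$3>100000
=D3>100000
```

With the first version every cell in the range is formatted according to D3 alone, so the whole column is either highlighted or not. With the second version the reference shifts row by row, so D4 is tested against D4, D5 against D5, and so on.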
and change this to this color that we used previously and click okay and then okay as well so that way it applies it all right the last thing is the job rank itself and for this we're going to be using icon sets specifically I like this one over here under ratings but this becomes a little bit overwhelming when we have this rating and also the number next to it so we can actually remove this number in the column we go back into manage rules we can double click on that icon set rule
and we can even further customize when these stars are appearing but I'm going to just go ahead and get to this portion where it says show icon only this allows us to only show the icon so after applying this bam it's now showing only the icon I want that icon centered both vertically and also horizontally so bam now whenever I look at this I can see especially since it's all one color my eyes really gravitate to well data scientists and data engineers based on this full star rating and more of the blue being in this
region and that's what I would hope people would go to or gravitate to as well when they're looking at it one quick note in this conditional formatting lesson we didn't cover these highlight cells rules where you highlight greater than or less than or the top bottom rules where you can highlight the top 10% or top 10 you can also adjust that number anyway I find myself more often using custom rules instead by coming in here into new rule and then actually fine-tuning what I want to do so with the practice problems I'd
really dive into actually relying on using these types of options instead and so as I hinted you have some practice problems now to go through and really practice how to do formatting and also more specifically conditional formatting in the next lesson we're going to move into collaboration and covering how to actually protect your workbooks and your worksheets so that way whenever you share these with co-workers or friends they don't go through and actually mess them up all right with that I'll see you in the next one welcome to this last lesson in
spreadsheets advanced before we jump into our project and this lesson itself is on collaboration which sounds sort of cheesy but in order to demonstrate what we're actually going to be learning in this lesson we need to actually fast forward a little bit and jump into our project so I'm going to open up the salary dashboard which is located under project One dashboard so here's the dashboard that we're going to build in it there are three boxes that you can go through and select this is going to be using data validation which we're going to
be learning about in this lesson but it allows you to basically standardize the inputs that we want somebody to actually select in order to get the results and it prevents them from putting in values that maybe don't exist and then breaking our dashboard so for each of these job title country and type we have an associated visualization for each showing the salary by job title the salary by region and then also salary by job type finally at the bottom I have some I call them KPI cards basically outlining certain characteristics or certain indications of
the median salary what is the top job platform and then what is the count of jobs but I can come in here and select something like maybe I wanted to look at business analyst and it's going to filter down based on this telling me what their median salary is that LinkedIn is probably the best place to go to for this what are the different types of roles available and what's available in the job database so the other feature we're going to be going through besides this data validation process that we can do right here
is actually protecting your sheets which you can find here underneath review under protect but anyway if you try to move these cells around you're not able to at all so we're going to be able to design this dashboard in a way that other co-workers won't be able to destroy it additionally if you notice down here at the bottom there's only one sheet in here there's actually other sheets if I go to unhide here I'll just unhide one of them we'll just unhide data there's other sheets inside of here but if they're
not applicable to my co-workers or stakeholders I don't need to show them so I can hide them so that's another feature we're going to go over in this all right nothing be yaen let's actually get into this lesson for this we're going to be using the collaboration workbook in chapter 4 now we're going to be building out these three sheets as we go along and as a sneak peek in this first example we're going to be building out this little portion right here this is going to be basically preparing us for our project so a lot
of this work is going to be put to good use anyway we're going to be building the simple one right here I'm going to zoom in where we have based on the job title we can go through and select it so senior data engineer it's going to pop up with our median salary so that's what we're going to be building with this and specifically we're going to be using this feature of data validation so I'm going to create a new sheet to start with because I don't want to start with the answer right there I'm going to just
call it calculator I'm going to put in job title here and then median salary below I'm also going to bold these by pressing control B and then next to it in column C is where we're actually going to put the actual control for this now we need to get a list of job titles to put in this so I'm going to create a new sheet and call it validation and basically what I'm going to do with this is create a sheet of all of the different job titles available specifically I'm going to say
this is going to be from the column job title short and in order to get the unique values of it we're going to be using well the unique function we need to provide it an array so I'm going to come back over here go down to cell A2 use control shift to select all the way down close the parenthesis press enter okay so now we have all of our different values I'm going to expand this out I'm also going to zoom in a little bit now whenever I do this drop-down menu I want it in some sort of
order specifically I want it in probably what is the highest count value I want that appearing at the top and those that are less likely down at the bottom so what I'm going to do is actually just copy this value right here because this is what we actually want to use what we want to do is a COUNTIFS we want to count based on a condition for the criteria range we're going to be providing that job title short column from our table and then for the criteria we're going to be selecting right next to it
A2 bam there we'll just autofill it all the way down and then finally we want to now sort it by this so I'll use job title short sorted from there we'll use the sort function to then sort this by the second column position in descending order so bam this is more like I want I want those data analysts scientists engineers at the top and the senior roles and so on cloud engineers are below so we now have this list available that we want to use for data validation speaking of which I'm going to go
back to the calculator tab that I made and for this we're going to go to the data tab specifically under data tools they have this selection available where data validation actually is and now this is going to allow us to well customize it right now the data validation for this cell is any value I can place any value into it I could limit it to a whole number I could limit it to decimals a list a date a time a bunch of things we're going to limit it to basically a list of values
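Put together, the validation list built a moment ago comes from three formulas along these lines. This is a sketch under assumptions: the table is called jobs as in the lesson, the column is spelled job_title_short, there are ten unique titles landing in A2:A11, and the sorted copy goes in D2. Adjust the names and ranges to match your own sheet.

```excel
A2:  =UNIQUE(jobs[job_title_short])
B2:  =COUNTIFS(jobs[job_title_short],A2)
D2:  =SORT(A2:B11,2,-1)
```

B2 is filled down next to each title. In SORT, the 2 means sort by the second column (the counts) and the -1 means descending order, so the most common titles end up at the top of the drop-down.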
and we need to provide a source for this so for the source we're going to go in and select the validation tab that we just made and I'm going to select all the different jobs right here and then press enter from here I'm going to accept this and press okay now as you can see we have this little drop down right next to it and I have different selections actually available like data engineer if I were to go into here because I have this set to data validation if I was going to
put in something like data nerd which isn't available and press enter it says this value doesn't match the data validation restrictions defined for this cell therefore I have to go in and retry so only values within there are going to be able to work in this so now let's actually get into calculating that median salary and for this we're going to create a new sheet similar to this median salary sheet we're going to call this one salary wrong spot need to actually enter it down here and call this one salary throw this all
the way over first I need the names of job title short and all that kind of good stuff so what I'll do is I'll come over to our validation tab and I've typed an equal sign already I'm going to select these cells right here press enter so now they're all appearing here now I'm going to calculate the median salary for all these jobs I know our calculator or dashboard has only one value that it's calculating at a time but in the dashboard we're going to build we're actually going to build a graph with all these median
salaries so we just need to calculate now all the median salaries and then basically look up using data validation and also an XLOOKUP what the median salary is going to be here so for this we're going to be using the median function and specifically we're going to be using an if inside of it because a median if function isn't available we first want to check does the job title here of data analyst meet our condition of the job title short so I'm going to type in the table itself of jobs and then the column of
job title short close bracket and set it equal to A2 then I'm going to close the parentheses on this and actually we need to wrap all this in parentheses because we have to check multiple different conditions we're going to do some array multiplication the other thing we have to check is that the values are not blank or not equal to zero so once again I'll put in jobs again and we're going to be using that salary year average column and we want to make sure that it doesn't equal zero and so that's
the condition we're checking for and so now what do we want to return if true well we want to return the salary so we'll do jobs and then salary year average I'll then close the brackets on that then we need to close one parenthesis I can see a red parenthesis still and then a final black parenthesis now I'm good press enter looks like I got it right on the first try let's actually drag this down boom this is pretty nice so now we have all the median salaries for these different job titles I'm also going
to take this a step further of actually sorting this by the median salary because I know I'm going to be actually visualizing this in the project's lesson so we'll go ahead and sort this as well sorting it on the second index in descending order so now we need to provide the value in this case data engineer is selected we need to provide based on this value the median salary and I want to just calculate it over here just in case I need to go back to it so for this I want basically 125,000 to
appear right here in G2 so I'm going to provide an XLOOKUP and the first thing is this lookup value right we're going to look up the data engineer in this now I'm not going to use a cell reference of going over here of selecting this cell of data engineer which is calculator C2 I'm actually going to escape out of this we're going to stop this right here I want to go back to this instead because I'm going to be referencing these cells specifically well this one right here a lot I'm going
to just rename this from C2 to title so right now I can see that it is named title so going back over to that salary tab again now we can perform our XLOOKUP and for the lookup value we're trying to look up the title for the lookup array we're looking through these job titles right here and then for the return array the actual salary values so now we're getting that data engineer value of 125,000 similarly I also want to name this cell as well I'm going to name this one median salary pressing enter
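Written out, the formulas from this part of the lesson look roughly like the following. The column spelling salary_year_avg, the ranges A2:B11, D2:D11 and E2:E11, and the name median_salary are assumptions for illustration (Excel names cannot contain spaces, so some variant of the spoken name is needed); title is the name given to calculator cell C2 above.

```excel
B2:  =MEDIAN(IF((jobs[job_title_short]=A2)*(jobs[salary_year_avg]<>0),jobs[salary_year_avg]))
D2:  =SORT(A2:B11,2,-1)
G2:  =XLOOKUP(title,D2:D11,E2:E11)
```

Multiplying the two conditions acts as a logical AND across the whole column: a row only contributes its salary when both tests are true, otherwise IF returns FALSE, which MEDIAN ignores. In Excel versions without dynamic arrays the MEDIAN(IF(...)) formula has to be confirmed with Ctrl+Shift+Enter. Once G2 is named, the calculator sheet can simply use =median_salary.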
boom locks it in so now when I come back over to my calculator tab I can just put in here equal to median salary I'm also going to go through and format this to make this look better so just playing around with this I can see that I can put in something like senior data analyst and then the associated median salary is going to come up with it but let's say now I want to give this to a coworker right how can I prevent them from going in and potentially you know entering
in this cell and then breaking it well we can come up here to review and in this case we're going to select this of protect sheet now the first thing you can do you can set a password to unprotect sheet I'm not going to put a password but say you wanted to put one you could and then we have these options for what users are allowed to do whether that's select locked cells or select unlocked cells we're just going to leave both of these checked for the time being click okay and now well
one we can see that underneath protect here it now says instead of protect sheet it says unprotect sheet whenever I go through this and say I want to change any value whatsoever I can't change it so it's good because the numbers can't change or the median title can't change but now I can't change the job title which is a little bit of a pain so unfortunately Excel doesn't necessarily make this the easiest I'm going to start over again and just click unprotect sheet and what we want to do is we're going to select all
the cells in here so with all the cells selected I'm going to press control and unselect C2 then right clicking it I'm going to go into format cells now under this protection tab right here we're going to notice we have options for locked and hidden we want to actually be able to lock all the cells except for C2 we don't want to hide any so we're not going to adjust that right now but now we're going to have the ability to adjust whether it's locked or not this doesn't actually change anything right now so if
I go into here yes I locked those certain cells but if I were to type into here it's still going to allow it to be changed so now what I can do is go into protect sheet and previously we had both of these selected of select locked cells and select unlocked cells and in this case because we locked all the cells except for C2 we only want to allow people to select the unlocked cell of C2 so I'm going to uncheck select locked cells click okay and now I can't click anywhere else except for where I've set
up that data validation in this cell and I can still change it and it will manipulate the value now we could also go through and protect the workbook itself I don't necessarily use this as much instead what I would want to do in this case is actually hide all these other sheets with the exception of this calculator and so I can do this by right clicking a tab and selecting hide so I'm going to go through and actually hide all of them so now we have everything as shown by this tab down
here of calculator we have every tab hidden except for that and if I wanted a sheet to reappear I would just right click it click unhide and then it's going to allow me to select which sheet I want to unhide and if I do want to make it to where a user can't go in and necessarily unhide sheets well I can go in here and select protect workbook once again I can enter a password if I wanted to I'm going to just set this up but now when I come down
here to right click it there's no option to hide or unhide a sheet so the entire workbook is now protected so I'm not going to lie that was definitely an advanced intro into data validation and also protecting your workbooks but I promise it's going to be put to great use whenever we're building this project which we'll get to next now we do have some practice problems for you to go through and just test out all these different features and with that we'll be jumping into the next lesson and actually building this data science salary dashboard
with that I'll see you in that one all right let's now dive in and build our first project with Excel which is this data science salary dashboard this project is going to combine everything that we've used and learned up to this point from formulas and functions to charts and then even to data validation we're going to start first by looking at the dashboard itself you can just go to the project One dashboard folder and open the salary dashboard workbook now in this right now you're only going to see one sheet and as you try to click
around you're not going to be able to do anything so as a refresher if you want to actually dive in and see what's going on behind the scenes you'll need to first if you want to actually touch any of these points go into the review tab and click unprotect sheet then you'll be able to investigate how I name certain cells and whatnot additionally if you want to investigate any of the worksheets that I worked on you'll need to go into unhide and select the appropriate sheet that you want to well unhide so for this we're
going to be building it out section by section specifically we're going to start up at the top building these data validation drop-down menus then from there we'll go into building the different graphs associated with it and then finally we'll end up with these KPI cards now powering each one of these major topics I've built individual sheets so for things like jobs I have all the jobs along with any key information to then build the visualizations in it so here is basically the table that I made in order to show the graphic right here
similarly for country I have all the different countries and then their associated median salaries and I use that to not only make the drop down but also make the graph same thing for type and then finally for platform anyway that's just a quick overview to make sure that you're familiar with how we're going to be working through this but let's actually dive into it for this I recommend picking up where we left off in the last lesson on collaboration we did a lot of work for that so we're going to use this workbook first
thing I'm going to do once this is open I'm going to go in and actually save it as this final dashboard and I recommend that during this you're saving this pretty frequently so we don't lose progress first thing I'm going to do is start moving this around I basically know where I want to put these different titles of these drop downs and then where I want to put the drop downs we're not going to be using median salary for a little bit so I'm just going to take that control X it and place it down at
the bottom then take the job title put it in C3 and then move the data validation to right below that we'll fix all the formatting when we get to it later on okay so we have the job title now the next thing we need to jump into is country and we'll be putting that right under this portion right here for this I'm going to create a new sheet and call this country with all these sheets I want to have them pretty much similar to what the title is above it so in this case here
where we had median salary it's actually the titles we had named it salary in the previous lesson so let's go ahead and just name this title anyway going back to that country tab similar to the title tab if you see we first grabbed the names of the job titles from there and then calculated the median salaries for each we're going to be doing something similar in the country tab with first putting in the country names and then from there putting in that median salary but I want to keep a similar format as
in this title case remember we actually pulled this from the data validation tab which we're pulling here so I want to keep this consistent anytime we're creating anything for those drop downs we're going to make it here in this data validation tab so I'm going to create a column here called job country and then in this I want to get the unique values from our data set specifically that jobs table it's still named jobs and of that the column job country go ahead and close the brackets and then close parentheses and now we
have all of these different countries not sure why but this is bolded I'm going to go ahead and remove that anyway I want this in a sorted format I'm not going to necessarily sort it by count like we did here with the job titles I'm just going to sort it in alphabetical order so I'm going to use the sort function and I'm just going to identify that we want to use G2 hashtag and bam now we have all of this I'll also name this appropriately of job country sorted so now we have our list we can
go back into here and actually put in the country for the data validation portion we do that by going to the data tab selecting data validation and for the values we want to provide a list to this and for the source we go back to that data validation tab close this out and we basically want to select all these values here so I'll just do control shift down pressing enter we now have all the criteria for this I'm going to go and click okay and I get this error message that there's a problem with
this formula for some reason I guess when I moved back it added this extra sheet in here I'm not too sure about this extra data I can't even select in here anyway just make sure there's only one sheet there and it's going to work fine country is now in here I can select something like Argentina the next value that we're going to be looking at is the job type so part-time full-time whatnot with this although we're not going to use it yet I'm going to create a new sheet and call it type and also move that to the
end but now we want to get the unique values of job schedule type so I'll put in the column here of job schedule type and then from there we want to get once again the unique values for this we're using the jobs table specifically that job schedule type column and bam now you will notice from this one it's a little bit this needs some data clean up with it there's a lot of values in here like sometimes it has combined values like full-time part-time and internship and whatnot we really I'm actually
going to expand this column out we really just want the single values from this so something like full-time contractor part-time internship and then also temp work so the first thing I'm noticing about the ones we want to remove is that they contain the word "and" so we'll first identify those that contain "and" and we do this using the search function which is a text function to find text specifically we're looking for that keyword of "and" for within text we want to just look through the whole array so we'll put in J2 hashtag and I got
a little error message I need to make sure I use double quotes for the text itself and running this now I have basically number values for where the "and" is located and it looks like yeah it looks like we're good on everything with the exception of the zero which we'll get to in a little bit okay so we need to convert this into basically boolean values because we're going to end up using this to pull out what we want using a filter function so we're going to wrap this in the ISNUMBER and
we're going to get false or true and whatnot anyway all right so now we have false or true the last thing we need to do well not the last thing second to last thing is use the filter function and in this we provide the array so in this case it's going to be J2 hashtag and then for what we want to include it's this other array that we just did so I'm going to go ahead and close this and see what we get returned back and we're returning now only the values that have
"and" in it we actually wanted to do the opposite of that right we want the values that don't have an "and" so in order to do that we're going to fix this entire statement right here for the include portion we're going to wrap it in a giant NOT to turn everything around add an extra parenthesis on the end bam now we have full-time contractor part-time we got the zero in there internship and temp work we just need to remove this zero out of it so we just need to modify once again this right here this
portion of this include we're going to do some array multiplication basically once again looking through and making sure no values equal zero so I'm going to do a multiplication do an opening and closing parenthesis and basically we're just checking whether J2 hashtag is not equal to zero let's go ahead and enter this boom now we have it down to the values that we want for this I'm going to name this appropriately job schedule type sorted also for some reason this is in this column we're going to move it over looks like we're off by one
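The formula built up over the last few steps can be sketched stage by stage as follows. J2 as the location of the unique list comes from the lesson; the column spelling job_schedule_type is an assumption, and in practice each stage replaces the previous one in the same cell rather than living in its own cell.

```excel
J2:      =UNIQUE(jobs[job_schedule_type])
Step 1:  =SEARCH("and",J2#)
Step 2:  =ISNUMBER(SEARCH("and",J2#))
Step 3:  =FILTER(J2#,ISNUMBER(SEARCH("and",J2#)))
Step 4:  =FILTER(J2#,NOT(ISNUMBER(SEARCH("and",J2#))))
Final:   =FILTER(J2#,NOT(ISNUMBER(SEARCH("and",J2#)))*(J2#<>0))
```

SEARCH returns a position when it finds the text and an error when it does not, which is why ISNUMBER is needed to turn the result into TRUE or FALSE. One thing to watch for: SEARCH is not case-sensitive and matches anywhere in the text, so a value that merely contains the letters "and" inside a longer word would be dropped as well.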
spacing anyway now we need to go back to our basic calculator tab and we need to enter data validation in this portion to make sure we can select the right type so I'm going to select data validation once again allow values of list and then for the actual source itself we'll go to that data validation tab select all these values in here press enter and enter okay so now we have the type in here so all of our data validation portions are now built next thing up is moving into building the three different charts here we're actually going
to start with the country chart because it's the easiest and as a sneak peek of what data is actually needed for this I can go to the country tab inside my final salary dashboard and all we really need to do is for each country calculate the median salary and then throw it into a map graph so back to our Excel worksheet first thing we need to do is get that list of countries and remember we already have that so I'll put an equal sign it's inside of our data validation here with these sorted values I want all
these values here so I'm going to do H2 hashtag press enter we have them all so let's actually start developing the formula for building this out we're just going to calculate first the median salary for that country and then also remember in the past we've had to filter out any values that basically equal zero so for that if condition for the logical test we're going to have to do array multiplication and for our first array we're going to be checking for the job country right so we do
that jobs table and specifically that job country column and we want to make sure that it's equal to basically A2 in this case the country right next to it additionally we want to check that there's non-zero values and so we're going to be checking the salary year average column and making sure that it's not equal to zero so now moving on to the value if true we basically want to use the salary year average column value if false not applicable here go ahead and close this looks like we have a typo it went ahead and
added that extra parenthesis and we have a median salary now and go ahead and copy that all the way down now this is great but remember in our if I go here back to the basic calculator tab we also want to not only filter for a specific country but also we're going to need to filter for a job title and also for a job type so we need to include not necessarily the country because we're doing it for each country but we need to include the job title and the type now in order to
add that this formula is going to get a lot longer and it's now getting hard to read so I want to actually operate in this formula bar if you press control shift U it expands it out and then from there you can actually change it to the desired length that you want so what I'm going to do now is actually break this into new lines on a PC you're going to press Alt Enter on the Mac I'm pressing option return anyway I've gone ahead and broken this
into different lines I've also inserted some spaces in there to basically put in some indentation so I can read it better don't have to necessarily do that but now I feel like this is much more readable for my eyes go ahead and execute this and bam we have all the results and if I do a drag and drop all the way down all the other ones are updated as well so the first thing we need to add to this is to check for the job title itself so I'm going to put a multiplication there go to the
next line pressing Alt Enter and for this I want to check jobs specifically I want to check that job title short column and whether it's equal to basically title remember we created title so I'm going to go ahead and press enter and it looks like we have a typo because I forgot to insert a parenthesis at the end press enter looks like I misspelled the actual table itself my bad press enter again now I'm getting this name error right here and that's because of this title that we're using if we go back to that basic
calculator and select that cell C4 right here it's named title_ex and I can inspect the different names assigned to cells by going to formulas defined names and then the name manager now I started directly with this workbook before we actually created all these variables here so what we'll do is this I'm going to go ahead and actually just delete this title_ex that was just an example that's why it says ex then from there I'm going to just rename it I'm going to select the cell itself of C4 and I'm going to change it
back to title okay now it's title here back to the country tab uh we have this updated for the title it's actually appearing now no name error and I'll go ahead and drag it all the way down there's going to be a lot less values for this cuz we're further filtering this so I'm seeing some num errors that's as expected all right the last condition we need to now take into account is this type right here and we haven't named this cell already so I'm selecting K4 and I've come up here and I'm going to
select type and now I've renamed that as type so we can finish this formula off I'm going to do a multiplication sign start a new line by pressing Alt Enter then do open and closing parentheses for this we want to check if the job schedule type column is equal to type okay I'm going to go ahead and press enter for this looks like we have a value I expect a few more even filtered from here okay not a lot now one note on this this formula is perfectly fine for checking the job schedule
type I'm going to make it slightly better and actually slightly more correct if I go over to that data validation tab I'm going to press uh control shift U to actually close that formula bar if you remember from our job schedule types yeah we narrowed it down to this list but actually the true list is this so what we actually need to do is check if a value is in here so in our case we want to check whether the type is in here so if we select part-time we will also match on
this job type here where it says full-time part-time or this one here where it says full-time part-time temp work and we can do that using the search function so we can find something like part-time within text of right here and it's going to give us back a number and then if it's not there if I were to actually drag it down to something like third column it's not there it's going to get a value error so I'm going to come back into this and expand out the formula bar and I'm going to
change this formula right here to basically get that condition remember we want to use the search function we want to find the text of the type which is that variable that we have for the job type and we'll be searching the job schedule type column now remember this is going to return back a number of the position if it's there so we're going to need to wrap this all in an ISNUMBER function and then put closing parentheses so I'm going to autofill this all the way down again and it doesn't look like any values
at least in view actually changed underneath this formula bar for right now so I'm going to go ahead and hide it and then for this when we go to plot it we actually need to remove these num values from here so in order to do this I'll create this new one called job country filter and we're going to be using the well filter function and for this we need to include the array so everything from here downwards pressing control shift down to select that and then what do we want to actually include
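As a recap of the formula built up to this point, here is a minimal sketch of the median salary for one country. The table name jobs and the named cells title and type come from the walkthrough; the column names job_country, salary_year_avg, job_title_short and job_schedule_type are assumed spellings, so adjust them to match your own workbook.

```excel
=MEDIAN(
    IF(
        (jobs[job_country]=A2) *
        (jobs[salary_year_avg]<>0) *
        (jobs[job_title_short]=title) *
        ISNUMBER(SEARCH(type, jobs[job_schedule_type])),
        jobs[salary_year_avg]
    )
)
```

Each parenthesized test gives an array of TRUE and FALSE values, and multiplying them together acts like an AND. SEARCH returns a position number when type is found inside the schedule text and an error otherwise, so ISNUMBER turns that into TRUE or FALSE. When no rows pass every test MEDIAN has no numbers to work with and returns a #NUM! error.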
well we want to check to include anything in that B column so ISNUMBER we're going to check those values are equal to a number so I entered in that B column then as well all right let's go ahead and run this and it looks like it has all of our values I don't like the order I'd rather it sorted this is just my preference I'd rather the numerical values be sorted so I'm going to wrap this all in a sort function and this is the array we're applying to it we want to sort
it on the second index and for this we wanted to put it in we'll say descending order and well Puerto Rico has some of the highest jobs may have to move there and okay we're going to get into applying this now I want to make sure that we have the maximum amount of values present there's a lot of countries missing that I know are available so I'm going to just select the most basic job possible to make sure that we have all the jobs that can appear so we'll just select data analyst
United States full-time okay now we can go about selecting column D and E and then inserting in our map now I don't want this here so I'm actually going to grab this map and then come over here and put it in I'm only going to do some minor cleanup right now I'm going to remove the chart title and also legend but we now have this chart map available for countries that shows the median salary one quick note you are going to have this sort of warning right here if I click on it and it says
hey we plotted 74% of the locations from the data with high confidence basically some of the countries in there couldn't align properly in my opinion it picked out a lot of the major countries so I'm really fine with that I'm fine if it didn't identify all of them 74 is good enough back to the final dashboard so we made this country map right here now we need to make these other two one thing to call out with this which I don't think I've called out before if we notice whenever we select a job so in
this case I'll select data scientist it makes that bar a darker color blue that way your eyes go towards it and then you can compare it to the other ones so how did I do this well if I go to my jobs tab my final jobs tab what I'm doing here is I have all the median salaries which we calculated already in ours but I added this over here basically we have data scientist selected right now so I have one column without the value appearing and then one column
with it appearing in and then what we'll do from there is just some basically manipulation of the graph to make it to where in this case data scientist appears so going back to our worksheet of our fancy Dancy dashboard we have so far going to go to that title sheet remember we already did all this portion of the last section first thing we do is well we need to do some cleanup we need to get rid of this name error also we are going to create those extra columns right here for basically what job title
selected but we need to more importantly if I expand out the formula bar we need to update this median salary similar to what we do with job type to not only take into account the job title but also the country and the job schedule type so I'm all for not repeating our work I'm going to go back over to the country tab select the median salary and I'm going to basically just copy all that portion that's in there anyway I'm going to escape out of that come back into the job title tab select
B2 and I'll go ahead and just press uh Alt Enter insert all that in and then now I just want to clean this up we do want this country which we're going to have to fix but we don't need these middle two right here that we already basically have specifically with the job country though so remember this thing's calculating the median salary based on the job title selected in this column here column A so this A2 is going to work here previously we were doing the same thing with country we don't need to do
country anymore we need to actually put in a variable of country which we haven't created yet so I'm just going to enter country in it's going to give me an error this name error I'm going to come back over to the basic calculator tab select this and then rename G4 to Country press enter come back to the title tab we're no longer getting that name error looks like it's executing just right I'm going to go ahead and drag it all the way down and we do have an error in my formula I have this comma
right here this is supposed to actually be an array right this whole thing is supposed to be um an array so now let's try it again press enter okay $90,000 for data analyst in the United States I know that's true and now we're filling it in for all the rest okay so we have what we need I'm going to close out the formula bar and remember we want to basically in one column if it has the word data analyst we want to not include it and then another one we want to only include that one so
we're going to use an if for this so if this value which we're going to go ahead and lock the column is not equal to the title then we're going to basically display those results which I'm going to lock the column for this otherwise I just want it to display an NA and not a value okay going to go ahead and enter this and it is data analyst so it's not going to appear there but it will appear for all the rest of these and so I locked those columns so I can just drag this over
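A sketch of the two helper columns described here, assuming the sorted job titles sit in column D, their median salaries in column E, and the helper columns are F and G (the exact columns are an assumption, as is using the NA() function to produce the not-applicable value). The named cell title is the one created earlier in the walkthrough.

```excel
F2: =IF($D2<>title, $E2, NA())
G2: =IF($D2=title,  $E2, NA())
```

Column F holds the salary for every job title except the selected one and column G holds it only for the selected one. Plotted as two separate series, the selected bar can take its own color, since a chart typically leaves a #N/A point empty.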
now with this other one I want to do the opposite basically if it's equal to title I want it to appear and then I'll drag and drop it all the way down so these are the values I want to plot so I'm going to select D2 to D11 then holding control also select these values right here go in and insert recommended charts and first one up is actually the one that I want so we'll go ahead and insert that so I'll take this chart and also move that right here into the basic calculator tab with
this one once again I don't want a chart title and I don't want a legend the other thing are the values the horizontal values down here I'm going to go ahead and double click on that scroll down here all the way to number and we're going to do that custom formatting that we've done previously if it's not appearing uh feel free to type the code in but we're going to use this to basically format it with the dollar sign in the front and then also the K for the thousands place all right the last
thing is you know I don't like to use a lot of different colors in this so making sure the graph is selected go to chart design and then into chart colors right now it's set under colorful which I think is an awful default value I'm going to come down here and select not this monochromatic palette 4 or 5 sorry but the monochromatic palette 12 and that's because now data analyst will be the darkest blue the other ones will be light so that way my eyes go to that one instead so now what we just did with
the job title we need to repeat it for job type so a lot of copying and pasting so we're going to move a lot faster with this one because we've done most of this before for this we're going to be entering in the type sheet and I'm going to go ahead and pull all those things in from data validation tab now we need to get the median salaries for that I'm just going to come back over to the title sheet come into here and actually just copy this entire formula then expanding this out
with control shift U pasting this in here now we need to just change this up slightly so for the job title we need to actually use the job title whereas conversely for the job type we no longer want to use type we want to use what's available in A2 pressing enter we get our value for full-time $90,000 for data analyst that's correct and then drag it on down I'm going to go ahead and close this formula bar and for this I'm going to use uh similar to what we did in that country sheet
in where we not only filter the data to make sure we include is numbers but also we sorted it and that's because sometimes these values sometimes we may not have values and we go back to this type tab sometimes there may not be a certain job schedule type so I'm going to go ahead and paste this in now it is working I know there will always be five values so I'm going to actually change this to B6 here and also B6 here and press enter now I also realized I made a mistake earlier whenever I
went to the title sheet this is only doing the sort function and we may have a condition where in certain countries they don't have all these different job titles available so we need to do a similar filter here as well so I'm going to paste that formula into here and then adjust it because I know there's always 10 job titles so it's going to go down to 11 in this case and 11 here we go ahead and run that there's going to be no change the one issue though is in this case if I go
back to that basic calculator it doesn't do it in the order that I want so going back to that title sheet I'm going to change that sorting value from a negative one to a one so that way it goes in basically ascending order and I need to do the same thing here as well in the type sheet where it's also in ascending order cuz we're going to be making the same graph all right similar to last time if the value is selected I want it to be highlighted so we need to
make those same columns again so if this is not equal to the type I want the value to appear and it'll be NA because right now full-time is selected dragging it over and then adjusting it for equal instead and then dragging it down I do want it to appear if it's full-time now I'm going to select D2 to D6 and then these values in F and G once again we're going to go to insert recommended charts I don't like these clustered columns I prefer a clustered bar chart so I'm going to take this and then put
it in here make similar formatting changes as well of removing the title and then also the legend updating the x-axis by going into number and changing the format to a custom format to using the K value instead and then finally the actual color itself by going to that monochromatic color palette 12 so bam now we have a lot of this made so I can go through now and select say data scientist it will update for selecting data scientist and then you see all these other values update as well I can also
select the different type um part-time in this case and then the values still remain the same it just changes the bar that it's selected to all right the last major thing before we get into formatting we're going to make these three kpi cards one is for the median salary the next is for the top job platform and then finally on the job count itself for how many counts of jobs for all of these now one quick thing Excel doesn't necessarily have kpi cards like if you use something like Power BI or Looker they provide cards
to this we're going to do some sort of backdoor approach if you will to make this into a kpi card basically I'm going to insert in a text box and we're going to put a cell equal to it you'll see what we're going to do with it but the main point is these values this value itself is not as you can see it's a rectangle it's not in a Cell per se but it is calculated within the workbook anyway what we're going to be doing I don't need this down here this median salary what we
did from the last lesson I'm gonna go ahead and delete this but the first we want to calculate is that median salary and we basically have it already and I'm going to calculate it right here in this column of I2 and for this we're just going to use a simple XLOOKUP and the value we want to look up is based on the job title selected so title and the lookup array is this array right here and then the final return array is right next to it there's a missing value right now because Cloud Engineers
is not available in the current selection so make sure you're selecting the full values and we're going to go ahead and close it but we have now the median salary so I'm going to actually rename this I2 cell to median salary and then going back into our basic calculator tab remember I'm not going to insert it into a cell in here but instead we go into insert and then illustrations and I'm just going to insert a simple old text box I'll drag it right there now the thing is I don't want to type inside
of here what I'm actually going to do is I'm going to select the box itself so you no longer have that blinking cursor in there come up into the formula bar up here type in equal to median salary and bam now if you notice it copied the formatting that we previously have right here as a cluster number looking at right there it copied the same formatting that we're using here in I2 so what I'm going to do is just go in here and change this formatting to a currency with zero decimal places and then once we have
this value actually updated go back to basic calculator we can see boom looks a lot nicer we'll adjust the formatting as far as the size and stuff in a little bit after we calculate all the other ones the next one from our final dashboard is the top job platform so we've only calculated things associated with the job title the job country and the job type so we need to make a new sheet and we'll rename it platform and technically the column name is job via and for this we need to get the unique values of
the job via column now for this one we're trying to get the top job platform so we're not necessarily doing that based on what is the top median salary on this I just want where are the most jobs actually located so we're going to be doing a count using control shift U to expand the formula bar we've been using this median with this if array in it we've already built this out so we're going to use this I'm going to go ahead and copy it by pressing control C coming
over to platform and then pasting it in with control V okay and instead of median we're going to use count and the only other thing we need to update on this since we stole it from the job country page is we need to update the job country to be well country and we need to check one more condition so we need to add to this array I'm going to press uh Alt Enter to create a new line and we want to check that job via is equal to in this case A2 and we go ahead and
press enter looks like 10 were available for via ZipRecruiter and then it calculates all the way down now remember our data set also has hourly data in there as well so technically if you wanted to which I'm going to I'm going to remove this condition right here that we're checking that it's not equal to zero basically it's also going to include if there's a job that has an hourly salary included so I'm going to go ahead and backspace out of that press enter and then from there drag and drop it down
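Putting the pieces together, the platform count might look like the sketch below, with the salary_year_avg<>0 test already removed as just described. The named cells country, title and type follow the walkthrough and A2 holds the platform name; the column names (including job_via) are assumed spellings.

```excel
=COUNT(
    IF(
        (jobs[job_country]=country) *
        (jobs[job_title_short]=title) *
        ISNUMBER(SEARCH(type, jobs[job_schedule_type])) *
        (jobs[job_via]=A2),
        jobs[salary_year_avg]
    )
)
```

COUNT tallies the numeric values that IF lets through, so a row that passes all four tests contributes to the platform's total as long as IF returns a number for it.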
I can see we added a few more values because of this I'm going to close this formula bar with control shift U all right so now I need to sort these values basically from high to low selecting all the values using control shift down the sort index we want to use the second index and we want to put this one in descending order cuz we want the highest one up at the top and for this it looks like Snagajob is the highest anyway uh this is what we want this first one actually appearing in our kpi
card but if you notice all of these have via in front of it so what I'm going to use is a text function of substitute which replaces existing text with a new text and for our text in D2 the old text that I want to replace is via with a space and the new text is just a blank value so Snagajob is now up the top this is what I want to be known as we're going to rename this variable to platform then we do the same thing on our dashboard of inserting a text
value and for this I'm going to select it and say that it's equal to platform all right so Snagajob and for this one this one is well somewhat simple but in our data validation tab in the very beginning in the last lesson we were calculating the count and we were calculating a generic count of all of them so we need to once again modify this because we want the count based on our three conditions here so what I'm going to do is just basically steal it from what we did previously go
into that B2 cell in the platform sheet go ahead and copy this all and then in here I'm going to expand this formula out I'm going to go ahead and replace that in B2 with this now a few modifications we can make to this we're no longer checking the job via column we're not trying to check that for the count that was specific to where we stole that from so I'm going to delete that and also this uh multiplication point and then this is checking all of the things selected of country title and type
we're wanting to check the count of a certain title so instead of having title we'll put in A2 pressing enter we have a lower value because of the current filters and then we'll fill it all the way down closing the formula bar out we now want to get the count for whatever is selected so I'm going to go to an empty column over here right here and we're going to be doing an XLOOKUP again the lookup value is what is the title that we're using the lookup array is we'll use this
one right here and then for as far as the return array right next to it pressing enter boom get a value of 537 now just to be safe in case there aren't any results like say it was zero or something or not applicable it's going to be basically not applicable I do want to include if not found I'm going to enter in no results and I'm going to do the same thing underneath the title sheet for where we calculated the median salary put for no results so I'm going to go ahead we want to get that
count in there so we insert that illustration again for us we're going to insert a text box and that text box is going to be equal to count which I don't think we actually named yet so I actually need to escape out of this go back to the data validation tab rename this count and then from there with the text box selected I'm going to put that equal to count now for each one of these text boxes I need to go through and actually as you can see we have a text box for
the value but I actually want to use a shape basically background to tell us what we're actually performing or calculation that this kpi is showing so I'm going to come in here to insert illustrations for shapes we're going to keep it actually we'll say a rectangle this time and then we'll go ahead and draw it now for the shape format itself I'm going to go to this one right here basically a blue around with white on the front and with these shapes you can still put in text in here so I can put in something
like median salary and I can open up the Home tab and I can actually customize this further so I can make this bold I can put in the center I actually want Center top and I'm going to make this slightly bigger by 20 point also I'm noticing this box is a green outline I don't really like that I'd rather a blue outline so we have that now okay so how do we get that number if you notice the number is no long it's hidden behind here we can do a couple different ways but I'm just
going to right-click this object and then under shape format you can go to send backwards specifically I want to send all the way to the back now getting into the actual text box itself if you notice there's a little bit of a box around it I don't really like that I'm also going to expand it all the way to the edges I'm going to format this one as well to be centered bold and then we're going to make the font much bigger on this and I'm going to like I talked about
remove that shape outline right now it has a a light one I'm going to say no outline okay so now it looks like a kpi card copying this I'm going to then make two more and for each of these I'm going to send them back to the back name appropriately to top job platform and job count for this I'm going to just copy this text box here that has the median salary in it and I just want to copy the formatting to the other ones as well so we can conveniently use this paintbrush this format
painter and I'll select this one it disappeared I have to reselect it and I'll also select this one if you notice the names are cutting off so it's really important that you extend it all the way over same thing with the job count as well now we're getting into the format portion of actually just doing some final touches on here I don't like grid lines so under view tab I'm going to select remove grid lines for each of these charts I don't really like those outlines I want it just to sort of blend in to
make it look like it's there so for the shape outline I'm going to change each of them to no outline up in our data validation point I want to make the spacing right I'm also going to make these titles slightly bigger for the dropdowns themselves I want them to basically pop out so I'm going to change this formatting I'm going to go to the cell Styles and I really like this one of input because it sort of calls your eyes to what you need to go to I'm going to make this G column slightly bigger
and then shift the type over some the other thing I want to do is add a title up here at the top for what this dashboard actually does so I'm going to select cells B1 through L1 I'm going to do merge and center and I'm going to change this to data science salary calculator along with going to the cell style we'll do heading one for right now I want that to still be slightly bigger okay now we're going to start moving stuff around but I want to get in it's like its final form that
I'm going to give to colleagues and co-workers and I'm going to give it with the Home tab closed and also with if I view this can remove headings so it removed the column headers the A and the B and then the row numbers as well so it looks like everything's updated correctly one minor thing this job count I want to make sure after I select full-time I saw that the formatting of the thousands with the comma is not there so going back into that data validation tab I'm going to select this go to home
make it a comma and remove all the decimal places okay looking good all right now we need to get this set up to give to colleagues I don't want them to have all these other tabs or all these other sheets so I'm going to go through and actually just hide the ones that aren't applicable for them additionally the sheet of basic calculator doesn't really make sense anymore cuz that was for that first lesson I'm going to actually rename this to salary calculator now colleagues could still potentially go in and they could mess up these formulas
and so we need to now protect our worksheet and we only want them to be able to manipulate these three cells so we're going to be going through protecting the sheet but we need to actually recall that we have to pick what cells that we want to lock right we need to select all the cells and I preemptively told you to hide the headings you need to go back into view and show the headings again cuz we need to be able to select this triangle in the upper left hand corner to select all the different
cells and then from there holding control unselect these three cells and then from there we're going to right click in there go to format cells under protection and we want to make in that case that they are locked or basically we are going to be able to lock them conversely we need to escape out of this and now select the three cells that we want to unlock right click go to format cells and for these we want to make sure that they are not checked for this so basically unlocked whenever we go ahead and protect
the sheet so now whenever I go into review go to protect sheet I want to be able to select unlocked cells once again if you want to enter a password you can I'm going to click okay so now I can't click anywhere else except for where we have our data validation so I can go through and select things like data scientist and Turkey now I'm just going to add that last final touch of removing the headings bam we have our dashboard now I promise last last thing before we go I'm noticing and you're probably noticing
as well if you're going through and manipulating these values in this case let's go to data analyst from the previously selected data scientist this is me talking in real time I want to show how long it takes to load and it takes forever to load why is it doing this this is not good for stakeholders they're going to get annoyed if it takes this long I'm going to go ahead and unhide some of our sheets specifically that platform one now these formulas that we're using um the array formulas to calculate these values it's F so
in this platforms one we have like oh my gosh in this case we have close to 200 oh no it's like slowing down even going through this we're executing this hundreds of times in here whereas if I compare it to something like the title sheet we're only running this you know 10 times which I feel isn't that big but if we're running this formula hundreds of times it's going to slow down this sheet so I have a quick fix for this and it involves we're not going to especially for this sheet here platform sheets
we're not going to use this um array multiplication in order to calculate this instead we're going to use a COUNTIFS the first thing we're going to do is check that the job via is equal to the criteria one of A2 so basically job platform is what it says it is from there we'll check the job title short column to make sure it matches up with title we'll check the job country is equal to country and then finally we're going to check that the job schedule type is equal to type and then we're going to go
ahead and execute this and then we're going to autofill it all the way down notice that 1490 it's actually going to go down slightly to 1426 and that's because we've now changed this condition inside of this COUNTIFS specifically if I go back to that title sheet you remember whenever we matched for this we did a really in-depth search so if any job schedule type contained those keywords we matched to it now we're only matching it if it exactly matches but since this job platform is just providing it's not providing a numerical value it's providing
what is the top value I don't think the top value is going to change that much so I don't think we're being inaccurate about this if we change this formula anyway going back to the actual dashboard itself now whenever I change this from data analyst to data scientist it is much faster so now I'm going to go ahead and hide those sheets and we are done so that was a heck of a lot of work so in the next lesson we're going to be getting into how you can actually go through and share this dashboard specifically for
those that have a Microsoft subscription you can use something like Microsoft online because it has all these features that we have within here and host it there for others to use additionally we're going to get into my recommended method of sharing any of your projects and that's via LinkedIn now just a heads up we will be getting into git and GitHub after project 2 at the very end of this course and during that portion we'll talk about how to share not only project 2 but also this project here but that's more complicated and I really
want to focus on Excel so with that we're going to be shifting in the next lesson to quickly share it and then moving into the advanced chapter all right with that I'll see you in the next [Music] one first up congratulations on completing your first project in Excel and building this salary dashboard been nothing short of your hard work and you shouldn't let that hard work go unnoticed so in this lesson we're going to be going over different methods you could go about actually sharing this project to your social network and to others to help
out in the job search or future employment now if you were just learning these skills for fun and you had no intent of getting a new job or increasing your pay in your current job then you can feel free to skip this and go to the next chapter on pivot tables so there's a few different ways you can go about sharing the work that you did we're not going to dive deep into any of these we're going to look at these more at a high level before jumping into one of the options first up is a
portfolio website here I have lukebarousse.com and if I wanted to I could come inside of here and edit it and include my project here along with what I did for others to see another option even if you don't have a big following on YouTube is you could actually go in and record and describe what you did within your dashboard and host it somewhere like YouTube now for both those options you may be like Luke how do I actually share my Excel file that I actually went through well that's where we run into a little
bit of issues as yes we created this Excel file right here but how do you actually go about sharing it with others to see your work well one option for this is actually hosting your file online via something like OneDrive which if you're paying for a subscription of Microsoft service you have access to OneDrive and you can host your dashboard online all I need to do is navigate to onedrive.live.com go to this add new and files upload from there select my file that I actually want to upload online and then
we can go to it and our file is actually uploaded here which we can actually go through and select something like data scientists and it will actually calculate based on the changes we make to it now one note the country chart inside of excel online doesn't work but I have a fix for it and mainly it's to just remove it you go into the review tab under protection and go to manage protection and then you turn off sheet protection then from there you can delete it next all you need to do is just take those
charts and actually extend them over so that way they take up that extra space and then once you're complete with that turn back on the sheet protection and now you can go about actually sharing this so here I'm coming into share and you can add an email if you want or if you just want to share it in general with a link you can come down here and fine-tune the control of a link to provide in this case I'm selecting that I'm going to share with anyone they can edit it you could make it view but
then they can't change the dropdowns so I recommend that you still leave it on edit you could set an expiration and even password and then from there click apply and now you have a link to your dashboard that works even if you don't have a Microsoft account so here I am in incognito mode within my browser so I'm not signed in at all and I can actually go in and access this dashboard and go through and select something and it updates in real time and because I got that sheet protection on they can't go through
and change anything except for these dropdowns don't believe me you can check out my project via the link below but what happens if we want to not only maybe share our file but also write up what we did the work we did with this and all the different skills that we used well that's the case of using something like GitHub GitHub provides a location to store Excel files like shown here along with giving you the ability to go through and perform a write up detailing all the different work that you did now if you wanted
to see this you could just navigate over to my project where you downloaded all these files from on GitHub navigate into that project one dashboard and in here has our Excel file and also this README which then appears actually underneath here and details all the different work that we did for this now getting this project onto GitHub if you're not familiar with GitHub is fairly complex we're actually going to be saving this for after project 2 and in that case navigating back to the project itself we'll not only be uploading project one we'll also
be uploading project two as well so after we finish the last chapter chapter 8 on power pivot we'll be getting into all of this and you'll be learning more about git GitHub and how to manage a project now from what I found working in data science it's that the best way to share your work and your project and potentially collaborate with others is to use something like LinkedIn a social media platform for networking in order to share your project specifically here I am on my profile right here and if we scroll on down they have a
section in your profile to basically show all your different projects that you've worked on and contributed to and adding a project is super simple all I've got to do is click this plus icon include a description in my case I was trying to help out job seekers investigate salaries for their desired jobs put in a few skills up to five of Microsoft Excel data analysis or Excel dashboards now for media they do have options to add a link or media in the case of the media it doesn't support Excel files and then if you try to
insert your OneDrive link I ran into errors so I find the best way to actually just share the link is to post it inside of the description from there specify when you started and stopped on this project anybody that contributed to this or anything that it is associated with and then from there click save the other option that I recommend is actually just going in and making a post here I just write up a short little description of what you did with your project and then if you want include something like an image or
even something like a gif which shows an overview of the project and then probably the most important thing is actually sharing that link to your one drive online you can also Post in the comments and not include in the description it's really up to you anyway go through there and then post so bam that's how you share your project as a reminder we will be going into greater detail into how to share both this project and also the second project on GitHub using git and also use things like markdown in order to write about your
project but that'll be included after we go through all of the different Excel content just wanted to have a quick way of you going through and actually sharing what you've done so far cuz I know you're probably excited and proud of it all right in the next videos we're going to be shifting gears into the advanced chapters starting off first with pivot tables with that I'll see you in there all right welcome to the advanced chapter and because we're getting into the advanced section you know it's time for a new flannel and with this
Advanced chapter we're going to be focusing on a few core topics that I think are going to make your life a lot easier specifically we're going to focus on things like pivot tables power query and also power pivot all of these are great at automating my Excel workflows to make it a lot easier to do repetitive analytics that my boss may come back to me again and again for instead of with something like a formula where I have to go through and make and copy and paste that formula all over again and rerun that whole analysis
these Advanced chapters are going to make your life a lot easier anyway in this chapter we're going to be focused on pivot tables this lesson specifically will be getting an intro into pivot tables how to make them how to manipulate them how to even read them in the next lesson we'll be going into advanced pivot tables looking at things like grouping and even aggregating such as getting percentages of grand totals and whatnot and then the final lesson in this chapter is on pivot charts which allows us to basically take what we have in our pivot
tables and convert it into a usable chart hence the name pivot chart all right so let's actually get into it and understanding why these pivot tables are so important so in the basics chapter we made this table right here which uses hardcoded values for the different job titles along with the different months and then from there uses formulas specifically SUMPRODUCT along with some array calculations in order to calculate how many job counts per month this is cool and all but what happens if we wanted to add another job title so say we have like
some like business analyst or we have software developer we'd have to actually manipulate and update all these different formulas that we have here well here's that same table but in a pivot table and by its name that's what they're great at they're great at pivoting and thus aggregating data based on certain values and whatnot so what if we want to add more job titles to this well I can just come in here similar to how we manipulate a table select this filter dropdown and then go from there and select things like oh I want to
include something like a business analyst and then the data automatically updates for this no readjusting formulas makes it super simple I can even take this table a step further and if I wanted to I can actually filter by the job country in this case I'm filtering by the United States and we now have these values makes it super simple anyway we're getting ahead of ourselves we actually need to get into creating our first pivot table all right so for the advanced chapters it's going to be a little bit different for what files you're going to
use for this the final results of this lesson will be in the lesson title of pivot table intro but what I want you to do whenever you're going through or following me along in this lesson is actually revert back to the previous file of the last lesson in this case or the first lesson so we don't have one so I have this one called zero of just pivot tables that's the one you want to start with so in this case pivot tables itself just has the data tab of the data we want to work with
and this sheet of the table that we've been familiar with in Basics chapter which by the end of this we're going to make a pivot table out of and when out of I mean actually of the core data itself anyway for the actual pivot table intro this will have also those similar tabs but then also the lesson itself will have all the different work that we've actually done to complete what we need to do so feel free to just have both of these up during a lesson so that way you can consult back and forth
in case you get lost all right so let's get into our first pivot table we're going to be using the data that we've previously been using of all the salary data for those job titles anyway if I go into the insert tab up here in the top left hand corner I have pivot tables but I also have recommended pivot tables if I don't have an analysis in mind I could come into recommended pivot tables a pane's going to appear on the right hand side and notice here that it actually selected the data range I know
that's the data range and it goes through and provides some recommended different pivot tables that you could put into here whether you put it into a new sheet or an existing sheet but I know what analysis I want to do specifically I want to do a count of the different job titles so data engineer I want to find the counts of this senior data analyst and so on right now it's not providing any of that I don't typically find that any time with recommended pivot tables that it provides me what I want so I don't
find myself using that often instead I go directly into pivot tables right here and then we have three options but what we're really going to focus on for this lesson and this chapter is from table or range I'm selected inside of A4 right now but it automatically knows that this is the data range all the way down to the bottom the other thing it says is choose where you want the pivot table to be placed you can either do a new worksheet or you can do inside the existing worksheet but you have to specify a location we don't
want that I typically like it in a new worksheet to keep my analysis in one standard location the last thing it asked is whether you want to analyze multiple tables specifically add this to the data model we're going to be going into Data models very heavily in the power pivot chapter or chapter eight or last chapter this is a super powerful feature when you have multiple tables you need to combine it we're not doing it in this lesson or in this chapter so we're going to leave it unchecked so now I'm in this new sheet
that I'm going to rename to job count and I'm also going to move it over here to the end anyway this pivot table this pivot table 2 that it's calling it there's nothing in it right now and you notice there's a few things that popped up first is the pivot table analyze tab which is available with this and also the design tab we'll be going into these in some upcoming examples that we're going to get into we're however going to be focusing on for this example on the job count I'm going to close
this out on this pivot table Fields pane right here now the layout of this you may see it's somewhat different is we have the columns over here on the left so if you remember the job title short column job title column job location and then these fields on the right hand side are things for like filters rows columns or values so I can take the job title short column put into something like the rows and get basically all the values in the rows now your layout may be a little bit different if you come up
and select the tools icon right here you may be under this fields section and areas section stacked which has the fields down here on the bottom I personally don't really like this because look how short my column titles are so I like having them like this instead anyway I think we understand this columns area right here but I don't think we understand these filters rows columns and values so let's explore this by calculating the counts of these different job titles now anytime I add something to the rows or any of these columns I can either
remove it by grabbing it and pulling it off notice they have the x mark on it or similarly I can also just come in here and uncheck the check mark box that's more applicable especially if I have it in multiple different panes and want to remove it completely makes it simple besides rows we also have columns and so instead of the job titles being in rows they're in the different columns I don't really like this too much I typically find myself using rows so we're trying to calculate what is the count of these job title
shorts so I'm just going to take that job title short again and put it into the values and it automatically aggregates this by counts of that but what happens if I don't want to do that count aggregation well one way is to come back into that values right here and I'm going to just click it not right click it just normal click it and then go into value field settings and this pop-up is going to come up first up is the name of the column itself I actually don't like this for a name I'm
just going to rename this to job count under here under the summarize values by tab you can select a lot of different aggregation methods we're going to stay with count you can also change how you show values as basically if we wanted to do a percentage of some total or not we're going to be jumping into that in the advanced lesson so stand by for that the last thing to note with this is the number format so I can come in here and actually select in our case we have values in the thousands so I like to use
a thousands separator along with zero decimal places and then clicking okay to apply this all it updates the formatting and the name so we've gone over rows columns and values what happens if we want to then filter let's say for only United States jobs well I could drag something like the job country column into filters and right now it's selecting all you see this pane come up right here and from there I can actually go through and select something like the United States click okay and now the values as you can see
they reduced and are only United States values other types of filtering I can do I can filter the row itself so if I wanted to I could select the different job titles that I want to appear in this and click apply I could also do something where let's say I wanted only job titles so we're going to do a label filter and jobs that contain the word data so I could just type in here data and whenever I filter it I get all the different jobs that contain data similarly I could also filter by this
job count here and that's by the values filter so I'm going to remove this label filters to start with and we can go back in here in the values filter and we could do something like hey we want to get jobs that are only greater than let's see here Cloud Engineers 33 I don't want to see that anymore I get to greater than 100 and it filters down but we're not going to use any filters right now so I'm going to one clear this filter for the table and then also remove this filter from filtering
for the United States so let's get into taking this analysis a step further and we're going to want to now analyze the average salary of these different job titles while we're going through this we're also going to be exploring the pivot table analyze tab so a quick tour of this tab first up over here on the left is pivot tables if I wanted to I could go through and rename this I'd probably name this typically something similar to what my sheet name itself is in this case I named it job count additionally inside of here we
have options which allows us to do a lot of detailed control of how we're building our pivot tables it's a very advanced feature I don't find myself going into it quite often unless I need to fine-tune the functionality of it active field so that tells us basically what's the active field grouping is something we're going to go into in the next lesson where we actually go in and perform grouping of different job titles slicers and timelines we're going to be going into in the last lesson on pivot charts in order to basically use these slicers and timelines to
filter data section is used to control our data so I can click something like refresh or refresh all it's going to refresh the data that we have so in this case remember business analyst is around 1,001 so if I go back to our data itself and I find this entry on business analyst and then let's say that that's not correct and I delete that out of there whenever I come back to this table itself it still says 1,001 what I have to do is well we've updated the data so I have to well refresh
it now that I refreshed it it's down to 1,000 I actually don't want to remove that entry so I'm going to just press Ctrl+Z and bring that right back and then also click refresh to make sure it's up to date if I want to change the data source or maybe the range I could go into something like this of change data source actions allow us to clear select and even move a pivot table for calculations they have things like calculated fields and items but we're going to get into measures and I feel they're way
more powerful so we're not going to cover this much the last thing to cover with this is over here on the right hand side is the show sometimes whenever you're navigating you'll click into your pivot table and that pivot table Fields pane won't pop up you can also pan it on and off by clicking this field list and if you didn't want something like row labels at the top you could just remove the field headers as well so getting into that actual analysis we want to analyze the salary year average what is the average value
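For readers who think in code, the pivot table built so far (job title short in rows with a count in values) plus the average this step asks for can be sketched in Python. This is only an illustration under stated assumptions: the sample rows, the field names job_title_short and salary_year_avg, and the helper summarize_by_title are all hypothetical, and blank salaries are skipped when averaging, which is how Excel's average treats empty cells.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real data sheet has tens of thousands of postings.
rows = [
    {"job_title_short": "Data Analyst", "salary_year_avg": 90000},
    {"job_title_short": "Data Analyst", "salary_year_avg": None},  # posting with no salary
    {"job_title_short": "Data Analyst", "salary_year_avg": 110000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 130000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 150000},
]

def summarize_by_title(rows):
    """Return {title: (job_count, average_salary)}, like a pivot with two value fields."""
    counts = defaultdict(int)
    salaries = defaultdict(list)
    for row in rows:
        title = row["job_title_short"]
        counts[title] += 1  # count every posting for the title
        if row["salary_year_avg"] is not None:
            salaries[title].append(row["salary_year_avg"])  # blanks are left out of the average
    return {
        title: (counts[title], mean(salaries[title]) if salaries[title] else None)
        for title in counts
    }

summary = summarize_by_title(rows)
for title, (job_count, avg_salary) in summary.items():
    print(title, job_count, avg_salary)
```

The point of the sketch is that the pivot table does this grouping for you: dragging a field into rows picks the groups, and each field dragged into values picks one aggregation per group.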
now I can't see all the different values selected in here so I'm actually going to go ahead and close this pane up here to have a bigger view anyway what it did was it did a sum of the salary year average we don't really want that we want to go to average and I'll change this column name to average yearly salary now if you've been following along since the basics chapter you probably know that I prefer performing a median for this salary data over an average but if you actually go through
this there's no median value for this that doesn't mean you can't do median in pivot tables you actually can you can actually do even more advanced stuff which we're going to get to in chapter 8 on power pivot but for now we're just going to stick to only performing average for this I'm going to click okay so the formatting on this is all jacked up and we could go into that field settings and adjust that or I can actually go in as long as I have all the values selected here I can select hey I
want to convert this to a currency and that I don't want any decimal places and it's going to format all the values and I feel this is a little bit easier because now actually if you go back and in exploring the value field settings inside of number format it actually applied this custom formatting for me so it knows to apply that since I applied it to all the values that were visible now since this is so easy I could also do something like get the average of the hourly salary once again it's doing the sum
of that and I don't want that I want the average itself and I can change that column Name by just going in here and typing in average hourly salary inspecting the value field setting it also updates inside of here and I'm going to go ahead and adjust the formatting as well changes to a currency with two decimal places so let's get into actually cleaning how this table looks up and we can go and do this by going into the design tab now I'm going to start over here on the right in pivot table Styles and
we can actually change what it may look like in this case I sort of like this one right here the simplistic look I can also change things like column headers which I like the formatting on it or whether I want banded rows or banded columns in my case I kind of like the banded rows we'll go with that last portion is around the layout if you notice down here we have this grand total over here this is a grand total based on well the column values it's adding up all the values in the column so
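The grand total row described above follows whatever aggregation each value field uses, and a small Python sketch can make that concrete. The sample postings below are made up for illustration. For a counted field the grand total is the sum of the group counts; for an averaged field Excel averages across all the underlying rows, which is generally not the same number as averaging the group averages.

```python
from statistics import mean

# Hypothetical (job title, yearly salary) pairs standing in for the data sheet.
postings = [
    ("Data Analyst", 80000),
    ("Data Analyst", 90000),
    ("Data Analyst", 100000),
    ("Data Scientist", 150000),
]

# Group salaries by job title, like putting the title in the rows area.
groups = {}
for title, salary in postings:
    groups.setdefault(title, []).append(salary)

group_counts = {title: len(salaries) for title, salaries in groups.items()}
group_averages = {title: mean(salaries) for title, salaries in groups.items()}

# Grand total for the count column: the counts simply add up.
grand_total_count = sum(group_counts.values())

# Grand total for the average column: average over every underlying row.
grand_total_average = mean([salary for _, salary in postings])

# For contrast only: averaging the group averages gives a different number.
average_of_averages = mean(list(group_averages.values()))

print(grand_total_count, grand_total_average, average_of_averages)
```

With this sample the grand total average is pulled toward the data analyst rows because there are three of them, while the average of averages weights both titles equally.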
this is on for the column so if I wanted to turn it off for rows and columns I could come up here and actually do that I kind of like this so we're going to leave it on I could also turn it on for the rows and columns but in this case because we're doing different aggregation methods so a count here and an average here it's not necessarily going to do anything over here for the row grand total whereas for something like the columns grand total it knows that hey for a job count I probably
need the total count for the average I probably need an average and that's what it does for both of these there's some additional ones up here on adjusting the report layout adjusting for blank rows and then also subtotals we'll be exploring that as we go along as we build out more complex pivot tables so let's now get into that final analysis and we're going to be creating basically this pivot table that we did previously with formulas and functions so what we'll need to do or think of right we're going to need the job title short
in the rows and we're going to need the month the job posted months in the columns and then we'll need to aggregate this by count for the values now I can navigate back to the data Tab and once again go to insert pivot table if you notice here it says from table or range so that's the really good thing if we actually convert this to a table we'll now be able to once we do this press okay and rename this to something like jobs now we can really be anywhere in this workbook in this case
I created a new sheet I go hey insert from table or range specifically I want to do a table of jobs and we want to do this existing worksheet in A1 and all the values from that jobs table are now here so we know we need the job title short along the rows but then we need the job posted month across the top which right now we have a date we could put the date into the columns but we get this error message hey you cannot place a field that has more than well 16,000 different values
for it so we're not going to do that also before we forget I'm going to rename the sheet to monthly count anyway we need a monthly value here so what we're going to have to do is the good thing about the table itself is now that we've created this as a table I know next to this job posted date column I want to insert a column called job posted month and for this we'll just use that TEXT function that we already know using the value of job posted date and then for the format we know we
want three lowercase M to get the month itself it's going to fill all the way down okay so now we have job posted month going back to our pivot table itself remember we're not going to see job posted month in here until we actually go back into pivot table analyze and click refresh now job posted month is inside of here and conveniently it's also in the correct order now this thing is completely blank right now we need to actually add what values we want so I'm going to drag job title short into values and
it's going to do a count notice here we do have column values which go up and down and then the row values itself so we can see what the count of business analyst is around 101 I'm not really a fan of these things that say row and column labels so I'm going to toggle off field headers to make this look a little bit better and I'm also going to change the name of this to monthly job count so bam this is looking good and we compare it to our basically non-pivot table just
to make sure that our values are correct we can see we have 982 for data analyst come over to data analyst we have 982 all right the last thing we want to do is actually filter this down and better sort our values specifically I'm curious about roles in the United States so I'm going to drag that job country over here and select United States from here to apply to it additionally I care about the most important jobs at the top and the least important at the bottom mainly by this grand total right here and so
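The monthly count table built above (job title short in rows, job posted month in columns, a count in values, and a job country filter) can be sketched in Python as a rough equivalent. Everything named here is an assumption for illustration: the sample rows, the MONTH_ABBR list standing in for the three lowercase M text format, and the helper monthly_job_count.

```python
from collections import Counter
from datetime import date

# Stand-in for the job posted month column, i.e. the "mmm" text format of the date.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Hypothetical rows; field names mirror the columns mentioned in the lesson.
jobs = [
    {"job_title_short": "Data Analyst", "job_country": "United States", "job_posted_date": date(2023, 1, 5)},
    {"job_title_short": "Data Analyst", "job_country": "United States", "job_posted_date": date(2023, 1, 20)},
    {"job_title_short": "Data Analyst", "job_country": "Turkey", "job_posted_date": date(2023, 1, 9)},
    {"job_title_short": "Data Engineer", "job_country": "United States", "job_posted_date": date(2023, 2, 14)},
]

def monthly_job_count(jobs, country=None):
    """Count postings per (job title, month); country=None means no filter applied."""
    counts = Counter()
    for job in jobs:
        if country is not None and job["job_country"] != country:
            continue  # same effect as picking one country in the filters area
        month = MONTH_ABBR[job["job_posted_date"].month - 1]
        counts[(job["job_title_short"], month)] += 1
    return counts

all_countries = monthly_job_count(jobs)
us_only = monthly_job_count(jobs, country="United States")
print(all_countries[("Data Analyst", "Jan")], us_only[("Data Analyst", "Jan")])
```

Each key in the result corresponds to one cell of the pivot table: the title picks the row and the month picks the column.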
what I can do is I can sort it by the grand total but if you notice I removed that filter button right here whenever I actually removed the field headers so I can also go in instead right-click the value inside of grand total and I can say sort from in our case largest to smallest so I feel like that makes it a lot more convenient to sort additionally I'm noticing the formatting isn't correct for this I'm going to put in that comma separator and then remove the two decimal places similarly not only
did we sort by the grand total let's say I only wanted maybe the top six of these right here I could rightclick any of these job titles right here and then go into filter in this case I'm going to go top 10 instead I'm going to select top six press okay now that we have this all sorted I can once again go into that design tab change the grand totals we're going to turn it on for columns only and Bam now we have basically the same pivot table that we had before with our values or
using formulas but instead now with pivot tables and this is a lot more customizable all right all right it's your turn now to get your hands dirty with some practice problems and exploring how to make some different pivot tables in the next lesson we're going to go deeper with pivot tables looking at things like grouping hierarchy and how we can show different values as with that I'll see you in the next one so let's get into some Advanced pivot table features and for this lesson and actually for everything in advanced chapter we're going to be
sticking with that salary data set of over 30,000 rows in order to actually analyze for this so I'm not going to be calling it out really any further into other lessons or chapters the first thing we're going to focus on is hierarchy which allows us to look at things like we want to aggregate not only the job title itself but also by the country so what job titles are within a country and then look at specific values there for say like the salary next we're going to move into grouping focusing first on automatic grouping basically
using that job posted date column to automatically aggregate by year month and whatnot and from there we'll then shift into some manual grouping we'll be able to create groups of different job titles and basically break out whether we want to look at maybe senior roles such as senior data analyst senior data engineers and compare them to just normal data nerd roles such as data analyst or data Engineers with this we're also going to dive deep into understanding a deeper method to analyze maybe percentages of totals or percentages of grand totals when analyzing these type of
groups for this you can continue working with that workbook you were working with on the last lesson if you did everything there or you can just open the pivot table intro for this lesson once again as a reminder the solution is going to be in pivot table advanced we don't want to open that just yet because it could mess up what we're doing here so we have four different sheets that we created with this I only really care about the data tab right now so I'm actually going
to select all these other ones by holding control and then right clicking to hide them so let's actually look at what a hierarchy actually creates I'm going to go in and insert a pivot table from table or range remember we're using that table of jobs you should have named the table that in order for this to work and we're going to insert it in a new worksheet I'm going to move this over and I'm also going to call this sheet hierarchy so for this we want to look at the salaries for job titles in
a certain country so we're going to start by dragging that job country over to rows and right now there's no hierarchy but if I drag job title short into the rows as well when we close this tab up here we can see that now we have two values in here and now that we have values underneath here we've created a hierarchy so Albania is basically the parent or the top of this and then we have data analyst data scientist senior data scientist notice there's only three values here and that's because in Albania sort of a
smaller country they only have three types of jobs there at least in the data set now we want to look at salary for this so I'm going to drag the salary year average into the values it's going to do a sum once again going into value field settings I'm going to change this to average rename the title to salary year average and then change the number format to currency with zero decimal places pressing okay for all this bam now I'm also curious by this how many jobs we actually have with a salary value this is just
sort of an add-on so I'm going to drag that salary year average over going into the value field settings I'm going to do a count of this and we'll call this job count click okay so now we get more of a relative idea of how many jobs there are so in Albania we have well only five job postings so now I want to get into actually seeing what countries have the highest pay now as a refresher you can come in here and select the dropdown and we could either select how we're going to filter the row
labels or filter the value labels but remember we want to sort them and right now this only has the option to sort A to Z or Z to A for those row labels instead I can just click making sure I'm clicking the salary year average because that's what I care about I can right-click it and from there go to sort in this case sort largest to smallest and what it did is it sorted the values within each of these but it still kept the countries in alphabetical order instead what I can do
is select this cell for the countries because I want to sort the countries highest to lowest and then I can sort largest to smallest as well now this is pretty neat because now we can see things like Belarus Russia Bahamas I got to go down there have some of the highest salaries by country and then what those are based on the different job titles there now sometimes I find reading this somewhat difficult in this manner that it's laid out here I'm going to show you how you can actually change this so going back into
the design tab remember we had this report layout that we sort of breezed over last time right now it's in this show in compact form we can actually change this to something like show in outline form and it will basically shift this over and have this hierarchy basically in two separate columns it also makes it a little bit easier to sort another method is show in tabular form so now it basically crunches it up and I actually like this one even better and it's still breaking out the job country and
job title short into two different columns but now it's actually aggregated to fewer lines so I can actually see more data on here now this is definitely a form that I'd like if I want to hand this over to a boss and I could even convert this further with this repeat all item labels option so now I could if I wanted to actually copy and paste this into its own table and analyze further at least now I have like Bahamas with the software engineer or
senior data engineer anyway you may have noticed there are some blank values in here and that's because it has an associated hourly salary but not yearly what I need to do is actually apply a value filter because it's a value so I come in here and click the drop-down go to value filters and then maybe put something like greater than we'll put zero and now those values will disappear the next analysis we're going to do is a count by the job month but we're not going to use this job posted month column that
we created in the last lesson instead we're going to use automatic grouping for this so we'll go ahead and insert a pivot table we'll insert it into a new sheet and we'll call this group automatic I'll go ahead and move that to the very end okay so what I'm going to do is I'm going to take the job posted date remember it's a bunch of dates and I'm going to throw it into the rows and this is going to get into some aggregation it's going to take a little bit to load my computer's not even loaded
yet but it's about 15 seconds later and it is now available if you notice now we have this hierarchy of this grouping and I can now dive into in this case January and then 1-Jan here and then diving in further we can dive into specific times of job postings going to go ahead and close this up if we actually investigate over here we can see that after I dragged that job posted date over it basically created months days and then the date itself which is actually a date time but anyway
three different values in its own hierarchy with this automatic grouping and so now I can go in and do something like drag the job title short into here to get the job count I'm going to change this to job count also go in and actually adjust the formatting but now whenever we actually go into each one of these hierarchies and look in we can see how many job postings were happening on a daily basis and how many were happening at a certain date time so now let's say I wanted to dive deeper into understanding maybe why
July had such a high number compared to all the other months I could double click it or I can just right-click it and go to show details this is going to show well the details and if we actually go over to the job posted date column it's going to have all the values for July inside of here so this is a pretty unique way to get into diving deep and showing the details of what data is being used to perform these aggregations and also double check your work now we're going to get into
manual grouping specifically we're going to create this where we actually go through and aggregate based on the job titles themselves assigning each into well a group so put data analysts data scientists and data engineers into data nerds senior roles into senior data nerds and then these guys into other data nerds so we're going to create a pivot table for this go in and select okay using the jobs table and we're going to be grouping the job title short so I'll drag that into the rows for the time being and we'll just start by grouping just the
data nerds so I'm going to just select one of these and then hold down control and then also select data engineer and then also data scientist then I'm going to right-click it and select group the other way I could also do this is go into pivot table analyze and select group selection the next ones I want to group are the senior roles so I'm going to just select all the different senior roles conveniently they're all right next to each other then I'm going to right-click it and go to group so now they're their own group the
only thing left is getting the rest of these I'm actually going to have to control-select these and then these as well and then from there we'll group that for group one I'm just going to select it come up to the formula bar and type in data nerds rename group two to senior data nerds and then group three to other data nerds also going to zoom in a little bit to get a little bit closer now that we have all these grouped let's actually dive into performing a deeper analysis on this to look
at what percentages these make up of all the job titles and also of their respective groups specifically dragging the job title short over here into the values we're looking at the count and then what percentages those counts make up anyway I'm going to change this to job count along with going through and updating the formatting to use a comma and no decimal places so for this I still wanted to use that basically count of the job title short so with these counts we're going
to do the percentages so I'm still going to use that job title short column going to drag it into the values it did a count but now let's actually go inside of the value field settings remember we've got that show values as tab and inside of here we can have different values percent of grand total percent of column total percent of row total we're just going to go percent of grand total to start press okay and bam this is now showing us the percent of the grand total now I'm not liking how this is
ordered right now I'm actually going to sort this selecting one of the values inside of the job count and sorting it from largest to smallest and then also I want to do the actual grand total itself sort largest to smallest anyway we updated this to percent of grand total we need to update the title to specify percent of grand total and so we can see that data nerds for their parent are taking up about 76% almost three quarters of the jobs are that and individually we can see that data analysts are nearly 30%
of that whereas when we get down to the other data nerds they're only taking up a very small percentage now what happens if we want to see so what is data analyst of the actual parent or what is the cloud engineer of the parent other data nerds well I can drag that job title short into the values again it's aggregating by count but I can go in and this time I'm actually just going to right-click it and we can have this show values as I'm going to use that instead we can do that percent of grand
total but instead I'm going to come down here to percent of parent total and in our case it's asking us what is the parent now we haven't gone over this but it actually created that grouping as job title short 2 so I'm going to pick that and click okay we don't want to do the job title short that's not the parent job title short 2 is and you can actually see it down here job title short 2 which it created inside the rows but anyway getting back to the parent now it's showing the percent that it
makes up of the parent and then obviously the parent is at 100% so I'm going to rename this one percent of parent now we just looked at percent of grand total and percent of parent but the show values as has a lot of different other ones you can also do in here if I wanted to I can even do something like rank largest to smallest once again it's asking us do we want to rank part of the parent or part of the job title short I want to rank part of job title short and it
will show its individual rankings underneath each from highest to lowest I'm going to go ahead and undo this I don't necessarily want to keep that one more note before we go for those that purchased the course practice problems and notes I also go into calculated items and fields and have its own little worksheet for you to follow along and try out calculated items and fields I didn't necessarily include it in this lesson because I felt that it wasn't a very powerful feature I like using measures instead which we're going to cover
in the power pivot chapter but if you're interested in it I have content on it in our notes and those calculated fields and items are underneath that pivot table analyze tab in here on calculated fields and items we're not going to be covering it outside of those notes that you can follow along and do your own self-study with all right you have some practice problems now go through and get more familiar with these advanced features in pivot tables because in the next lesson we're going to be diving into actually making charts out of
these pivot tables using pivot charts with that see you in the next one moving now into pivot charts so we did a lot of work already in analyzing things with pivot tables we're going to take it now to the next level with pivot charts specifically we're going to be looking at first what is the average salary by job title next we'll be looking at which job has the highest percent of demand and then finally we'll be looking at how jobs are trending over time we're going to be building all these charts using pivot
tables additionally we're going to include the features of slicers and also timelines based on what chart we're using in order to be able to filter down and more easily make our graphs more interactive as usual in the advanced chapters I want you starting with the Excel workbook from the last lesson so pivot table advanced and if you want to see the examples or the final answer you could go to pivot charts for this we're not going to be using the hierarchy or that show details tab so I'm going to go ahead and hide
those so let's create this first chart to analyze what is the top paying job in data science for this I'm going to just create a new pivot table using that jobs table and we're going to be aggregating by job title short in the rows and then the salary year average in the values and for this we want to summarize values we don't want to do the sum we're going to do the average I did this by right clicking it but we do need to have these all formatted correctly in currency with no decimal places
and I'll update the title as well to average yearly salary so in order to insert this pivot chart we're going to go to the insert tab and we're going to come here to pivot chart there's only one option right now because we're selected on a pivot table and that's a pivot chart itself so I'll go ahead and insert it with this there's no recommended charts but I know I want a column chart so we're going to go with that and pivot charts aren't that different from regular charts I can come up here select this
plus sign I can remove things like the legend I don't really need that and then I can change things like the title by just double clicking this to something like what is the top paying job in data science now you may notice these pivot charts are a little bit different as they have these field buttons on here that basically allow you to with the chart itself go in and filter it this is really convenient if say this chart was on a different page anyway I want to have these salaries sorted from highest to lowest
so I can come into here and you know we can go sort A to Z or Z to A and you can change it around we want to actually sort from highest to lowest so I can come in here under more sort options and I can change this from the job title short column to that average yearly salary column and we want it to be descending and we'll click okay so bam now we have our salaries ordered from high to low with our values if you don't like these field buttons right here you can come
in and right click it and go to hide all field buttons if you want but if you want to get them back you have to come back underneath the pivot chart analyze tab and select field buttons and unselect this hide all the next thing to analyze is which job has the highest percentage of demand we're going to use that percentage of grand total column from before and we're going to be adding a little twist with this one as we're going to be also building in some slicers so we can slice the data for what we want so
back inside the worksheet you should be working in so we want the percent of grand total only so I'm going to move out count percent of parent and also that rank count to only have what we want next we'll move into getting a pivot chart built for this and once again I'm going to be using that column chart I'll go ahead and insert that I'm going to rename this to which job has the highest percentage once again I don't really care about that legend now I want my basically target audience whoever I give this to to
have control to be able to select which group they can filter for whether that's data nerds senior data nerds or other data nerds so in order to control that I'm going to first zoom out we're going to insert some slicers for this so if we come into the pivot chart analyze tab with this chart selected I'm going to go into insert slicer and we're going to do it for remember that group from last time is actually job title short 2 and we're going to filter this one also by country I'm
going to click okay they're going to pop up here on top of this I don't really like this I'm going to drag it over and I'm going to fix the formatting real quick so now with these slicers I can make it a lot easier for somebody using this to come in and say hey I only want to look at data nerds or I want to look at other data nerds and see what their appropriate percentage is when you click on a slicer you will notice that this slicer tab comes up there's some different formatting options
the one thing that I do find myself changing is the appropriate label or the slicer caption in this case I would rename this one to something like job group and then for the job country I would just rename this to country and you can see they update appropriately here for it as a refresher if you wanted to select multiple different options I would select this multi select right here and then with that enabled I can then select data nerds and also senior data nerds the last visualization we're going to be building with this
is a line chart looking at how jobs are trending over time using that previous pivot table we made on the job count for this one we're going to be using a timeline filter to be able to select down to maybe a certain quarter or month so back in the workbook that we're working in I'm in this group automatic sheet we want to create a pivot chart so I go to insert and into pivot chart and for this one we want a line so I'm going to go ahead and insert that I'm going to give
this an appropriate title of how are jobs trending over time additionally I'm going to remove that legend and I want to add a trend line to it now you'll notice on this one for the actual field buttons you have multiple different ones here remember it did that automatic grouping in the last lesson so you have not only months to filter by but days and also that job posted date so a lot more values here now to add a timeline for this I'm going to go up to pivot chart analyze and I'm going to go into insert
timeline there's only one value that's going to be available for this job posted date and right now if I expand this all the way out we have all the different months that are available I'm going to go ahead and close this up right below it so if I wanted to filter by a specific month I could be like hey I want from February to November in this case October I actually need to select February I'm holding down my key for this and then dragging to November anyway I can also change this with this filter not only
months but also quarters and even something like years I typically analyze things in quarters so we're going to do it in that manner and I'm also going to shift it up here to the right hand side similar to slicers if I have the timeline selected I can come up here and actually change the name in this case I'm going to change it to date I could also change things like formatting or even things like color now one thing to note with this with what I have selected here it's only going to filter what I
have the chart set up to in other words the pivot chart that was selected when I actually created the timeline so let's say I came into here and I wanted to look at in our case just data nerds and then also go into looking at the counts themselves this isn't necessarily going to update for that those slicers aren't connected to other charts but you can change it to do that so in this case I could select something like the pivot table itself going into pivot table analyze and then here under filters where you can create things like
slicers and timelines which we did in the pivot chart anyway they have this thing called filter connections and I'm going to expand this out so we can actually see it and right now it's saying that for pivot table 3 as we can see up here I probably need to give these better names only the date is actually connected to this if I wanted to connect the other ones such as country or job group I'd have to select them and press okay now I don't know if you noticed that but it actually adjusted these values
they actually decreased because I have fewer values selected here whereas if I actually select more or all of these this is going to increase the values anyway that's sort of hard to see let's actually show this with sheet one which actually should be named something like top paying jobs and in this case I can go into pivot chart analyze into filter connections and this is going to show us based on pivot table 7 which is this one right here I should have renamed these there's no different slicers or timelines attached to it so I
can actually select all of these and apply it to this one and now when I go to our grouping right here right so we had all of them selected if I want to just look at data nerds here so I can see the percentages of data analyst data engineer and data scientist I can see what their salaries are for it and then also I can see their counts for those as well so this is definitely a useful feature if you're looking to link charts or specifically pivot tables that are not necessarily connected all right now
it's your turn to get more familiar with using pivot charts we have some practice problems for you to go through and actually understand more about how to use them with that in the next lesson we're going to be jumping into well the next chapter on advanced data analysis and using some pretty unique and pretty complicated features in order to analyze data so with that I'll see you in that one welcome to this chapter on advanced data analysis and this entire chapter is really focused on using add-ins which are basically programs that people have built to incorporate
into Excel to do very unique and specific tasks because of that going from lesson to lesson the lessons aren't necessarily going to build on each other every lesson is going to be sort of its own unique learning journey about a specific feature or features to start with this lesson we're looking at just enabling the add-ins and looking at some basic ones such as what if analysis and we're going to get more into it in a second but we're going to be focused on looking at if we had
three different job offers which one should we actually take in the next lesson we're going to be continuing on with what if analysis focusing on data tables and this shows us how values are going to be changing based on one or multiple variables and then finally the third lesson is on an add-in called Analysis ToolPak that provides us access to a lot of different statistical analyses where we can just easily select what type of analysis we want to perform and it does all the analysis for us and provides it in a sheet pretty neat anyway
getting into this lesson we're going to start by first enabling these add-ins so that way you have them and then from there we're going to move into our first somewhat simple example forecasting what's going to happen in the future specifically we're going to look at our past job postings and try to predict what's going to happen in the future from there we're going to be moving into what if analysis and for this we're going to have a scenario where we have three job offers and we're trying to find what is the optimal one
we're going to use things like scenario manager to go through and automatically calculate what it should be for those three different job offers and then let's say we need to actually negotiate one of those job offers and we want to match another we can use solver or goal seek and both of these have their own unique features that we're going to dive into to allow us to adjust what we could potentially negotiate for better job offers one quick reminder on which versions of Excel will support this chapter on advanced data analysis all of them
will with the exception of Microsoft online it doesn't have the ability to add in these specific add-ins but if you're on Mac or the Windows version you're going to be completely fine so for this we're going to be working inside of the analysis addins workbook I know I said previously you need to work with the previous workbook from the previous lesson but this chapter in general doesn't build on anything it has everything you need within the workbook so you're going to be fine with this anyway we just need two sheets from this forecast original and what
if analysis all the others are just the results that we're going to be getting and feel free to go through and select the sheets that we're not using so these four in this case and hide them so that way we only have the two sheets of forecast original and what if analysis so before we enable add-ins I think you need to know what exactly Excel add-ins are here I am in Perplexity AI and I asked the question and it goes on to specify what it is by saying that it basically interacts with Excel objects and data
and it will add custom ribbon buttons or menu items and thus provide custom functions now this is a little technical but there are three different types of add-ins they have web Excel and COM add-ins today we're going to be importing in Excel add-ins which are actually created using something like VBA anyway the most popular Excel add-ins are things like solver power pivot power query which you don't necessarily have to add in unless it's not included and then also things like Analysis ToolPak which we're going to get to in that third lesson all right enough on
the history lesson let's actually get into enabling your add-ins if you go to the data tab right now you'll probably see that you have this forecast section so you do have what if analysis available but you don't have anything else over here right now it's well usually blank but we're going to add to it so I'm going to go into file and then from there it's hidden but under more I'm going to go to options on the menu on the left hand side I'm going to go into add-ins and this menu right here
tells you what your active application add-ins are right now I have no active application add-ins and then your inactive application add-ins so I do have access to all these different ones right here so I want to enable them specifically I want this Analysis ToolPak and then well the one we're going to use in this lesson solver so on manage I have Excel add-ins that's the one that I want to actually use for this I'm going to click go and now we need to enable which ones we're going to use so Analysis ToolPak for
the third lesson and solver for this one from there I'm going to click okay and now over here on the right hand side we have an analysis section pop up with data analysis which is the Analysis ToolPak and then solver which is the solver add-in so let's actually get into forecasting specifically looking at what we expect job postings to be next year and right here in the forecast original sheet I have date and then also the job count and this is all the data for 2023 anyway this example is going to
show the custom features that we really can do with some of these add-ins and also built-in features so I can select the date and job count columns and then for this we're going to go into the forecast section and specifically to forecast sheet in this it plots in blue the values that we currently have for basically 2023 and then from there it plots into the future using this orange I can toggle this between a line chart and also a column chart but I'm not really finding the column chart that useful it's time series
data so I'm going to go back to that line chart the other major thing I control is the forecast end date so if I wanted to only do maybe two months I could change this instead to end in March additionally I have hidden underneath this drop down of options the ability to go in and actually change other things like confidence interval and seasonality and things like that right now seasonality is set up to basically detect automatically and as you notice in this data it goes up and down up and down up and down
it has a seasonality to it basically every single week there's more postings during the week and on the weekend there are fewer as expected so this seasonality is carried out into the predicted data as you can see here because the orange still goes up and down anyway going to close this this is great I'm going to click create in this new sheet it automatically has this popup here that says this table contains a copy of your data with additional forecasted values at the end you can manually edit the forecasting formulas in the
sheet or return to the original data to create a different forecast worksheet okay great got it I'm going to zoom out a little bit and what this table did is it still kept that date and job count columns but it also built out three other columns if I scroll all the way down showing what the forecast would be a lower confidence band and then an upper confidence band and then looking at the actual chart that it provides we can see this where this darker orange color is what the forecast is this is the
upper band and then this is the lower band anyway that's pretty cool that I could generate this all by just clicking a single button of forecast sheet all right now we're going to move into what if analysis and when we click this what if analysis we have three different things here we have scenario manager goal seek and data table for this one we're going to start with scenario manager but let's first go over what the data is here in the sheet that we're basically trying to calculate first let's focus on these columns B and
C this is if you will a dashboard or calculator that I built so I can put into here a base salary a bonus rate and then an annual raise amount and it will calculate it so let's say our base salary is 12,000 I can put that into here assuming the same 10% and 1.5% it's going to automatically update for this over here on the right hand side in E through H we have three different job offers that we received and they consist of the base salary the bonus rate and the annual raise underneath
here this fourth or fifth row if you will these are constraints that we're going to use later on I would just ignore this right now so what's going on down here in the result cells well what we're doing is we're calculating what the expected salary is for year zero all the way to year four and then from there we're actually getting a total so in this case this is summing up all these values right here so why am I doing four years why am I doing a total for these four years well the Bureau
of Labor Statistics basically estimates that the average tenure at a company is about four years so the idea with this calculator that I've made is that we're able to calculate based on a job offer we receive what we would expect if we were to stay at the basically average amount or median amount of time that a normal person stays at a job instead of just looking at the first year because sometimes things like bonuses and annual raises may actually push us into higher salaries even though the base salary is lower than another salary
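To make the calculator idea concrete, here is a minimal Python sketch of the same kind of projection. The workbook's exact formulas aren't spelled out, so this assumes the bonus is paid as a percentage of that year's salary and the raise compounds once per year; the function name `projected_salaries` and the input values are illustrative only.

```python
def projected_salaries(base, bonus_rate, raise_rate, years=5):
    """Return pay (salary plus bonus) for year 0 through year `years - 1`.

    Assumption: bonus is a fraction of that year's salary, and the
    annual raise is applied to the previous year's salary.
    """
    pay_by_year = []
    salary = base
    for _ in range(years):
        pay_by_year.append(salary * (1 + bonus_rate))  # salary plus bonus
        salary *= 1 + raise_rate                        # apply next year's raise
    return pay_by_year


# Illustrative offer: 100,000 base, 10% bonus, 1.5% annual raise
yearly = projected_salaries(base=100_000, bonus_rate=0.10, raise_rate=0.015)
total = sum(yearly)

for year, pay in enumerate(yearly):
    print(f"year {year}: {pay:,.0f}")
print(f"total: {total:,.0f}")
```

Swapping in another offer's base, bonus rate and raise gives a like-for-like total, which is the comparison the calculator is meant to support.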
so it basically helps calculate this out and even the playing field for these three jobs that we're trying to compare anyway you can go through if you want to and understand what formulas are going on behind the scenes here but basically I'm just taking into account these three parameters right here and then every year basically starting with the previous year's salary and then adjusting it for the annual raise and then giving it its appropriate bonus so as expected because there's an annual raise on each one of these the salaries are going up so with that
what is going on here do I need to actually go through and actually put in every single one of those jobs so I'll put in job one and get the 566,000 and then now do the second job and third job no I can use scenario manager for this so going into what if analysis I select scenario manager and we're going to add three different scenarios so I'm going to come up here and select add this scenario name we're going to call it job one next we're going to move into what we're going to use for
the changing cells and I've labeled these basically or made these into an input format we're going to select these three right here so C3 through C5 we'll leave the comment as is protection as prevent changes and go to okay now it's going to ask us what values we want to use for each in this case I use 100,000 10% and 1.5% it's already pre-filled in from there I'm going to click okay now we need to add job two for this I'm going to leave changing cells the same this one I'm going to change
to 80,000 15% and then change this bottom one to 1.2% then finally we need to add that job three one of the last steps we need to do is now go into summary right here and for this we need to figure out what we want to actually have it provide for us in our case we want the result cells of C9 through C14 to be provided from there we click okay and bam we're going to get this scenario summary sheet that goes through in detail based on job one job two and job three for the
values that we input into it and from there it's going to tell us what year zero is year 1 2 3 all the way down to the total salary now one thing to note is you see these names of base bonus raise year zero and then total salary if I go back to what if analysis I've actually gone through already for you and actually named these so in this case I'm selecting year zero it's named year zero and total salary if I were to use things that were maybe not named it would just provide the
cell so if we're using the values here it would just provide F6 and in that case we would have seen F6 here also back in the scenario summary you may not have ever seen this before but Excel allows this sort of grouping if you will to basically manipulate the sheets and what values are hidden or potentially shown here anyway pretty unique feature that you may or may not have seen before all right moving on to goal seek let's say we have the scenario now where we got the job offer for job one
in this case but we want to try to match that of job three specifically if I go back to that scenario summary sheet we can see that job one is at around 566,000 but job three is at 640,000 we'll say we have some insider information that human resources told us hey we can't adjust the base or the bonus but we can adjust the raise what raise you get every year and so you could potentially ask for a higher raise what raise would you need to basically put into here to get equal to that job three
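The question here (which raise makes the total hit a target) is a one-variable root-finding problem, so it can be sketched outside Excel too. The sketch below uses plain bisection, which is a swapped-in technique and not a claim about how Excel's goal seek works internally. The salary model is the same assumption as before (base compounds by the raise, bonus paid on each year's base), and the names `total_salary` and `goal_seek` are made up for illustration.

```python
def total_salary(base, bonus_rate, raise_rate, years=5):
    """Assumed model: total pay for year 0 through year `years - 1`."""
    return sum(
        base * (1 + raise_rate) ** t * (1 + bonus_rate) for t in range(years)
    )


def goal_seek(func, target, low, high, tol=1e-9, max_iter=200):
    """Bisection search for x in [low, high] with func(x) == target.

    Assumes func is increasing on the interval.
    """
    if not (func(low) <= target <= func(high)):
        raise ValueError("target is not reachable within [low, high]")
    for _ in range(max_iter):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return (low + high) / 2


# Hold job one's base and bonus fixed, vary only the annual raise
needed_raise = goal_seek(
    lambda r: total_salary(100_000, 0.10, r), 640_000, low=0.0, high=0.50
)
print(f"{needed_raise:.1%}")  # about 7.6%
```

Like goal seek, this changes exactly one input; if the target is out of reach within the search range it raises an error instead of returning a wrong answer.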
so the first thing I'm going to do is go in and make sure that we have inside of our inputs the actual numbers for job one so 100,000 10% and 1.5% for the annual raise now I could go through there so I type 1.7% and then 1.8% and just keep on going up until I actually find what it is or instead we can just actually use this goal seek and for this we're going to be setting a cell specifically cell C14 to that 640,000 that we want to get to and we need
to provide what cell we're going to actually change in this case we're going to change cell C5 which is the annual raise now for this we can only change one cell we're going to be able to change multiple in the next example but not in this one with goal seek so from there I'll go ahead and click okay and bam it automatically goes through I don't know if you saw that it stepped through it and it went up to 7.6% and that's what we'll need in order to get to that 640,000 and it even provides
a nice little dialogue box saying that hey it did find a solution sometimes you may put a goal in that's not achievable and in that case it would tell you so 7.6% is a pretty high raise let's say we get further information from HR saying hey we can actually change not only the annual raise but also your bonus we still have the same scenario you can't change the base salary it needs to stay at 100,000 for that first year so we have multiple parameters now that are changing this is when we're going to shift
from using this goal seek now over to solver one thing before we start we need to actually reset these values in here I'm going to change this back to 1.5% both of these tools change the values in place so you want to reset it before you go so opening up solver I'm going to set the objective as before that C14 of that total salary and we want to get it to a value of 640,000 and we want to do this by like we said we can change two things in this case the bonus and the annual raise
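One thing worth seeing before running a tool on two changing cells: with two free inputs and a single target there is generally more than one combination that works. The sketch below is not solver and does not use its algorithm; it simply fixes a few candidate bonus rates and uses bisection to find the raise that reaches 640,000 for each. The salary model and the names `total_salary` and `raise_for_target` are assumptions made for illustration.

```python
def total_salary(base, bonus_rate, raise_rate, years=5):
    """Assumed model: total pay for year 0 through year `years - 1`."""
    return sum(
        base * (1 + raise_rate) ** t * (1 + bonus_rate) for t in range(years)
    )


def raise_for_target(bonus_rate, target=640_000, base=100_000,
                     low=0.0, high=0.50, tol=1e-9):
    """Bisection on the raise for a fixed bonus rate.

    Assumes the target lies between the totals at `low` and `high`.
    """
    while high - low > tol:
        mid = (low + high) / 2
        if total_salary(base, bonus_rate, mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Each bonus rate pairs with a different raise that reaches the same total
pairs = [(b, raise_for_target(b)) for b in (0.10, 0.15, 0.20, 0.25)]
for bonus, needed_raise in pairs:
    print(f"bonus {bonus:.0%} -> raise {needed_raise:.2%}")
```

Since a higher bonus needs a smaller raise to reach the same total, any single answer a tool reports is one point on that trade-off, which is why constraints are useful for narrowing it down.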
we can also add constraints which we'll do in a second after we just run through this one but I want to actually just go through and solve it first and the last thing we need to look at is select a solving method we're going to just leave it here I really like this GRG nonlinear we'll leave it at that for the time being and we'll go ahead and click solve now for this it says solver found a solution all constraints and optimality conditions are satisfied as we can see it increased the bonus and then also the
annual raise and we got to that 640,000 inside of this popup box we can have it output certain reports so I'm going to just hold control and select multiple different reports along with clicking this for outline reports and it's going to actually print to different sheets and from there click okay anyway the most important of these three different reports that it gave to us I feel is the answer report it basically tells us hey what were the original values put in for the bonus and raise and then what are the final values in order to get to
that final value of 640,000 they also have these two other reports one on sensitivity analysis and the other one evaluating the limits which we're going to get to but these I don't find as important so now with solver we found that we can change more than one input now we can also specify constraints if I come back up to solver and it says hey in this dialogue box subject to the constraints right now the annual raise is sort of low still at 2.1% but that bonus skyrocketed it was previously at
10% and it went all the way up to 23% so we could actually put some constraints in by clicking add and we'll say hey the bonus we're not going to let that exceed 15% we'll click add for that and then for the next one we don't want the annual raise to exceed we'll say 4% and we'll click okay remember I did name these cells so that's why it pops up automatically as bonus and raise makes it super easy whenever you name cells all right let's go ahead and click solve so look at this solver could
not find a feasible solution with these constraints it basically maxed out the bonus and maxed out that annual raise and we didn't get to that 640,000 so what I can do is I can return to the solver parameters dialogue click okay and in this case I'll change the bonus to we'll say 20% now and then for the raise we'll change this to 5% click okay and then try to solve again and we found a solution we have 17% and 4.4% and for this I'm going to output the answer report I'll click outline reports to export it click
okay close this out and then go to the report we can see what our final values are along with how we got to our 640,000 final value all right you got some practice problems to now go through and try these different features out of scenario manager and goal seek and also solver and I think once you play around with them more you can find out which one is more applicable to which scenario with that I'll see you in the next section where we're going to be going deeper into what if analysis specifically on data
tables one of my favorite features of what if analysis with that see you there let's now get wrapped up on what if analysis by focusing on data tables we're going to be focusing on building one input and also two input data tables for the first one on one input we're going to be continuing on with that exercise from last lesson looking into that job offer one and seeing how we could change the annual raise in order to affect the different salaries out to our 4-year point and mainly the total salary at this point and from there we're
going to shift into building two input data tables where we're not only analyzing that annual raise increase but also a change in the bonus rate to see what the different salaries are for that final total amount of those four years so for this lesson and also for this chapter we're going to be starting with the actual workbook of the name of the chapter in this case data tables and we're going to be working in this original sheet but I want to jump into that one input to basically show you what we're going to be building we're
going to be inputting into here the annual raise percentage we're going to put it in increments of 0.5% and then along the top in the row we're going to be inputting the values from over here these values right here across the top and then the data table itself is going to fill this in with the expected result so in this case year three at a 2% increase in raise it's going to get around 116,000 we also do some coloring at the end the data tables don't do that that's done with
conditional formatting so here we are back in the original sheet first thing we need to do is get the annual raise put up here remember we want to go in we'll say 0.5% increments so I'll do zero 0.5% and then for the rest of these I'll just drag them down I end up messing up the formatting so I'm just going to clear the borders and then put a border back around the outside now for the salaries I want that to be for what year zero then year 1 I'll drag this on over for these and then we'll
put a total so for this I want to enter in that year zero we're going to be doing this for all the different values right there I'm going to go ahead and put it in if you notice it has this line through it and I actually click it and then from here whenever I look into it it provides the error of stale value you may or may not see this but I'm going to show you how to fix this if you are experiencing this you can go into file and then under more into options and what
happens is under formulas my workbook calculation went from basically automatic to manual where under manual if I look at this little icon right here they can be manually calculated by pressing F9 or going to formulas calculate now anyway there's nothing wrong with having automatic calculations that's actually what I want all the time somehow my thing switched into this manual if yours does switch it back to automatic click okay bam we're good to go and we'll continue on now it's important for up here at the top that we have them equal to the formulas here
because this is what's going to be ultimately getting changed and manipulated so I wouldn't want to go through and actually manually fill this in with a 110,000 it needs to be connected to the formula that actually is getting calculated so building our data table now I'm going to select this entire range right here E3 all the way down to K12 go to the data Tab and select data table now this provides us two inputs a row input cell and a column input cell we're only doing a one input data table so we only need to
fill in one of these specifically we're looking for the input either into the row or the input into the column in this case we're going to be subbing in this column this E column right here we're going to be subbing it into the formula here and it wants to know what is the input cell for in this case the column so I'm going to go ahead and select it it's C5 I'm going to go ahead and click okay and it's going to automatically fill it in now what's unique about this is I could also go
in here if I wanted to and maybe change this to something like 10% and it will update this entire data table with that new value I'm actually going to change that back to 3% but pretty unique anyway if I wanted to I can come in also and conditionally format it I'm only going to select year 0 through four and I'm going to do a white to green and then for the total I'm going to do its own because it's almost in its own bracket here right it's a sum
of all those different values so I'm also going to do the same thing of the white to green and then you know me I don't really really like green so I'm going to go ahead and select this and I'm going to end up changing this by going into manage rules and conditional formatting selecting on this one adjusting the color to Blue and also selecting this one and changing this one to Blue as well click apply and then okay and Bam so with that example complete let's move into a two input data table and let's look
at the final example for this for this we're going to have as we had before the annual rays in the column but this time we're going to have the bonus up on that top row and for this we're going to be calculating as we click here it's going to be calculating c14 which is the total salary we're not going to be calculating that 0 1 through 4 anymore and it's going to go through and calculate it for all of these different scenarios if you will all right to do this I'm going to go back to
that original sheet I'm going to actually duplicate this by saying copy it create a copy and click okay so now we have original 2 so I'm going to rename original to one input and then rename original 2 to two input now for this one I'm going to end up just clearing the contents from here I'll go to editing clear and I'll just select clear contents and now thinking about it I want to also clear any of the formatting that's in here cuz we're going to be doing something different with it I can go into
clear rules I can go clear rules from entire sheet all right so we have the raise in the rows and now we need the actual bonus in the columns for this we'll go from 0% to 5% and I need to actually change this formatting to actually be a percentage and then drag this all the way through along with fixing this formatting so now a two input data table is a little bit different in that we need in the upper left hand corner what we actually want to calculate whereas the one input we did across
in our case we did across the rows in this case we just want to have it in the upper left hand corner there it is I sort of grayed it out you can make it a little bit darker if you want to but I would just want to make it known that hey we're not necessarily using it so similarly we're actually going to get into creating it we're going to select the entire data table go to the data tab what if analysis data table for the row input cell so this row up here what
are we wanting to substitute these values into well we want to sub it into the bonus and then similarly for the column input same as last time that's the annual raise so we're going to want to sub that into C5 I'm going to go ahead and click okay so I'm going to dress this up a little bit I'm going to bold the header right here also I'm going to merge and center this all so we can put inside of here bonus and then finally I'm going to conditionally format it like we did last time using that white
to green and then changing that green to a blue to get it more of what I want so bam now we have a two input table and we can see what it's going to be across all these things also with this if you remember from our last lesson right we were looking at finding what is the value we'd want to be to get around 640,000 now we have a few different values we can actually look at for this and we can tell from this well we're going to need to be above a bonus rate of
15% to even be considered to get up to 640,000 so sometimes I like this visually better than going in and doing something like goal seek or even things like solver because now I have multiple different variables I can look at and analyze and try to adjust on my own all right so you now have some practice problems to go through and get familiar with data tables I found when I first started with data tables I got really confused on the row input and also the column input cells but really understanding how those are being applied into
the original formula helps you figure that out all right with that I'll see you in the next one where we're going into the analysis ToolPak and diving into a lot of different statistical analysis you can do with Excel so with that see you there all right this is the last lesson in this chapter on advanced analysis and specifically we're going to be focusing on that analysis ToolPak add-in now this add-in is packed full of features and I can make a whole tutorial just on this add-in alone but we're only going to be
focusing on four core things that it does that I use from time to time on our job posting salary data set of over 30,000 rows first we're going to look at how we can get descriptive statistics of something like a salary column so we don't have to go through and use formulas to get all the different statistics for it second we're going to investigate how to make histograms but these are with a little bit of a twist in that I feel like they're more customizable than the previous histograms we can make third we'll get
into ranking and assigning a percentile for our salary data so we can understand where it actually ranks for percentiles and then finally we're going to be moving into looking at a moving average if you remember our job posting data set had all this seasonality in it it basically went up and down a lot depending on when it was posted during the week well we can remove those fluctuations with a moving average for this we're going to be working in the analysis ToolPak workbook and all the answers are in there so you can feel free to
go ahead and actually select all the different sheets in here and go ahead and hide them so we only have the data tab in there and we'll be working with this so as a refresher this is the data analysis ToolPak you should have gone through in that first lesson and actually enabled it by going into options into the add-ins itself and it should now be under the active add-ins if you didn't do that remember all you have to do is just go into here and select it all right so let's open
this bad boy up and if I click that data analysis it's going to pop up here and this dialogue box allows us to select like I said from a variety of different tests that we can actually perform there's a lot of different statistical tests in here such as regression and sampling and then even things like correlation covariance and whatnot so let's start with the one that I find myself using the most and that's descriptive statistics when I want to perform EDA or exploratory data analysis this is the first thing I want to do now the thing
about this is we need to provide a column that has numerical values in it so we could do the date column but what we're going to do is we're going to provide the salary year average column go ahead and press enter for this we do have labels in the first row so I need to click this here for output options we want to go to a new worksheet so that's what we'll leave for this and with this we do want the summary statistics you can go in and also specify things like confidence level and
the kth largest and kth smallest but we're going to leave those default for the time being and click okay now it's popped up in this new sheet called Sheet1 and diving into it I'm actually going to expand this out and then format all these numbers real quick so that's much more readable so now we have all the key statistics from it we don't have to go through and calculate a formula for mean median mode standard deviation the minimum maximum sum whatnot all right next up is histogram and previously remember we could just select something
like the M column go into insert here and actually insert a histogram now the one problem I have with this is the formatting of the rows or the X values down here it basically provides this range this is a lot of data right there and it's really hard to format this so let's look at an alternate option for this using the data analysis ToolPak specifically we're going to come in here to histogram for the input range once again I'm going to go ahead and just select that column M press enter it does have labels
for bin range I'm going to leave it empty I'm not going to specify a width of the histogram or the bin I'm going to leave it just default for the output I'm going to leave it as the new worksheet ply I don't want either of these the Pareto or the cumulative percentage instead I just want the chart output of this press okay and here we have the histogram it's honestly not too special it's a little hard to read based on the size of these bins as you can see basically the difference between these is around it
looks like they're doing basically a thousand increments so the increments are way too small we need to adjust the bin anyway the one good thing is along this x-axis it's only one value now so a lot easier to read so now let's go in and adjust that bin size so if I go back to data analysis into histogram and click okay for the bin range it wants me to actually put in a range or a selection so we need to actually pre-fill out what range or bins we want for this so I'm going to copy
this header up here cuz we're going to keep the bin and frequency start a new sheet paste it in here and I want to go in we'll say 50,000 increments so 0 50,000 and I want it to go to basically 400,000 so now going into data analysis again histogram opening it back up still has the input range selected correctly now for the bin range I'll select A2 to A10 for the output range I basically want it to be inside of this sheet so I'm going to select up here on D1 we'll just start
there and we want a chart output on this page okay I'll click okay and I'm getting this error message that the input range must contain at least one data point right now this column M is not referring back to the correct sheet it needs to look at so actually I'm going to select right here you can see it selected that other sheet I actually want to select the M column of the data tab now we'll press okay so now I love this although when it output this I didn't apparently need to do this frequency thing I got
confused anyway we can actually go in and format this to remove the legend and then update the axis title for salary and then we'll update this one for frequency anyway I really like this because now look at this control we were able to limit it not to go past 400,000 and have all these outliers and everything else that is past 400,000 is put into this basically more value anyway this is my preferred method for making histograms especially whenever I need to control that x-axis next up is rank and percentile and with this one we're
going to be doing a rank and percentile of that salary year average column once again now this one depending on the size of your computer may take a while it may even crash your computer so if you're concerned that this is not going to be able to be performed on your computer don't run it just look at my example and understand what you get out of it anyway I selected rank and percentile and then for the input range once again I'll select that column M and then we'll output it to a new worksheet ply and I can do
something like even name it in this case calling it rank and percentile for the sheet that it's going to go to so clicking okay it says rank and percentile input range contains non-numeric data basically I forgot to click this of labels in the first row clicking again it's thinking how long is it going to take all right so Excel just crashed on me maybe that wasn't a great idea let's try that again using rank and percentile this time instead of selecting the whole column I think because it had some blank values especially down to a million
rows it sort of crashed it instead what I'm going to do is I'm going to just select A1 and then select down all the way to the bottom I don't know why it changed it over to column F but the main point of me doing this is that way we select column M and also I need to remove this A1 at the beginning okay and also need to update this to be starting at the second cell and we're going to try this again I gave it the name of rank percentile I didn't have the labels in
first row selected because we're going from the second cell how long is it going to take this time all right so that was a lot quicker this time and we have now in this rank and percentile sheet our actual data it did take about a minute to do so once again if you have a computer that's not necessarily that fast don't try this at home all right so some key statistics about this it provides a point which is the row number itself and then from there what is the value that's the column the
rank and then the percentile what's cool about this is because it provided the point we could do something like the index function where you provide it an array and then the row number in this case that's the row number so if I wanted to find out what the job title is I could select column B and then from there for the row number go back to rank and percentile and select this value right here then close parenthesis press enter looks like it's a clinical NLP data scientist and I can actually autofill this all the way down anyway let's
make sure this is actually correct okay yeah just double checking the row number at 25589 is clinical NLP data scientist so we have it correct anyway I could go through now and I did this for the job title itself but you could imagine you could pull out things like the job country job title short all sorts of other key information and get this in a list of what its rank is along with its percentile our last feature to look at is moving average and this is what we're going to be calculating here the blue line
already is data we already have of what are the job postings over time but that orange line is the moving average we can use this analysis ToolPak in order to calculate this and as you can see it removes a lot of these weekly fluctuations if you will from it and makes it a lot more basically readable to see where the actual peaks and the troughs are now in order to do this I can't necessarily just put in that job posted date into it I have to actually get a count of the
dates and also what are the counts of the job postings per date so we need to create a pivot table so we go in insert pivot table from table we're going to do it from this table which is named jobs and we're going to insert it into a new sheet similar to before we're going to put that job posted date into the rows and I'm actually going to take out you can see it aggregated by month I'm going to take out the month from there so it does it by days and now I'm going to throw into
the values here it's going to do a count so I'm just going to change this to job count and we can actually visualize this by itself by going to insert pivot charts inserting in a pivot chart we want a line and that's what we saw before with our blue line that showed how it basically went through time all right so I'm going to go ahead and delete this chart because we're going to be making it and once again we're going to that data tab into data analysis and we're going to be performing moving
average for the input range I'm going to select B4 and then select all the way to the bottom now this grand total went into it so actually I'm going to back up one and change this to 368 we didn't select any labels in the first row so I'm going to leave that blank in the interval I'm going to just set it to something like seven for the time being for the output range I want it to go right next to my chart so I'm going to copy this above and paste it below and change these
B's into C's so it's C values right next to it and we want a chart output along with standard errors I'm going to go ahead and click okay now this chart is not correct um we made a little bit of a mistake but I did want to show you real quick this moving average we can see that it starts 7 days later right here and so that's what's happening in this C column here that's the actual moving average and then the actual error itself is right next to it it's pretty consistent around 30 to 40
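To make the "starts 7 days later" behaviour concrete, here is a minimal sketch of a trailing moving average in Python. It assumes each output is the plain mean of the current value and the preceding values in the window, with no value produced until a full window is available. The daily counts below are made-up numbers for illustration, not the course data, and `moving_average` is a hypothetical helper name.

```python
def moving_average(values, interval):
    """Trailing moving average; None until a full window is available."""
    result = []
    for i in range(len(values)):
        if i + 1 < interval:
            result.append(None)  # not enough history yet
        else:
            window = values[i + 1 - interval : i + 1]
            result.append(sum(window) / interval)
    return result


# Hypothetical daily job posting counts with a weekend dip
daily_counts = [120, 135, 128, 140, 150, 60, 55,
                125, 138, 130, 142, 149, 58, 52]
smoothed = moving_average(daily_counts, interval=7)

for count, avg in zip(daily_counts, smoothed):
    print(count, "n/a" if avg is None else round(avg, 1))
```

With a 7-day interval every window contains exactly one of each weekday, so the weekly dip averages out; a longer interval such as 21 smooths further at the cost of a later start.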
anyway we need to fix this we need to take this entire value if you will and move it out of a pivot chart so I'm going to select this all the way down to the bottom and copy it then inside of a new sheet I'm going to come in and paste it it looks like it's pasting with the pivot table formatting so I'm going to paste the values only and change this to job date so let's try this again using data analysis going to moving average for the input range we're going
to select B2 and then all the way down to the bottom remember this has a grand total so I actually need to change that to minus one for the interval I'm going to adjust it a little bit I'm going to actually change this now to a 21-day moving average and then for the output range this actually needs to be adjusted to match what the input range is but for column C anyway go ahead leave everything else checked click okay and bam now we have blue and also orange if you will for the
actual and the forecast now one thing I'm noticing with this chart is well the markers are pretty heinous they're clogging up this chart so what I can do is select something like this orange line right here I can right-click it go to format data series and then here underneath this fill and line go into markers and then for the marker options just basically do none we just want to have a line instead additionally we can just go ahead and click that blue line and for the markers there we can do none
as well okay sensory overload's gone now looks a lot more readable with the exception of down here for some reason it didn't pick up the dates on mine and we can adjust that by right clicking that and going to select data underneath the horizontal axis labels I'm going to go ahead and edit this I'm going to select from A2 all the way down minus one we don't want the grand total click okay that changed the names let's see if that updated the chart and bam it did now I'm going to do some minor cleanup I'm going
to remove that legend from there and that looks a lot better so now we have a graph of our moving average of the job postings and as we sort of suspected in August we had a peak along with January seemed sort of high then went down a little bit but then up again in August so we see a lot more trends and then tapering out towards the end of the year all right now it's your turn to go through and practice with those practice problems and explore some of these features in the analysis ToolPak
add-in with that we're going to be wrapping up this chapter and in the next one we're going to be jumping into power query which I'm super excited about to see how to clean up our data and load it in in the format that we want easily all right with that see you there welcome to this chapter on power query and no pun intended but this is one of the most powerful tools within Excel it allows us to perform ETL processes or extract transform and load which is just some fancy data engineering talk for connecting to a data
source and loading it in after you clean it up anyway in this chapter we have five lessons specifically in this one we're going to have an intro to power query what it's all about how to actually connect to a data source in the next one we'll be moving into the power query editor and we'll be covering that for three lessons in order to go in how to actually clean up your data and get it prepared to a format that you want in the last lesson we'll be diving into the M language which is powering power
query don't worry we're not going to do any in-depth coding or anything like that just want you to have some familiarity with you so we have more experience with using power query so what's this lesson about well in order to understand that we have to understand is what is power query and here on Microsoft's learning platform they have this fancy Dancy diagram that basically shows this what power query does it allows us to connect to different data sources it could be something like a database a text file or even something on the cloud from there
power query will then pipe it in to a bunch of different products they have and we're going to be using it for Microsoft Excel but it's also famously also in powerbi now if you have a Windows version of excel power query is going to work just fine on the Mac versions it is available however it's very limited so a lot of the stuff we're going to do within this lesson you're not going to be able to do and also Microsoft online is just completely not available so as a reminder power query is an ETL tool
or extract transform load and we can connect to a data source such as this here's a Wikipedia page on the list of S&P 500 companies and it has all the 500 companies that are part of the S&P 500 anyway let's say I want this table I could go through and try I mean as you can see I'm trying to select it right now and it's like selecting the whole page it's a whole mess if I'm trying to get this but we can actually use power query to extract all these components out all I
have to do is go in and provide the web address of this which I know is located right here I'll then select which of the tables I want out of the web page which is this one right here and then I just load it in and here it is in our workbook now don't worry I sort of ran through that example real quick we're going to go more in depth and detail in the last example in this lesson but I just wanted to show the power of this and how we can actually get data even
from online into our workbook so easily so why do we need to use power query well we're going to find that out as we go along but I'm going to give you the tidbits right now one it automates the ETL process so I don't have to do that annoying task of going to a sheet and copying it over every time I get new data I can just get power query to do it for me additionally sometimes I may make mistakes when I'm copying and pasting sheets over so with power query I have reproducibility and then
finally with this I'm now able to bring data in that potentially exceeds that 1 million row limit of Excel and we'll show how we can deal with that in a bit so let's actually get into performing our first example of loading in a simple data set specifically from another Excel sheet like I talked about at the beginning of the advanced chapters you're not going to be able to actually work inside of the workbooks that I have given so in this case power query intro has the final results but I don't want you working in that I'll
tell you what workbooks you need to be working with as we go through this also if you open it you're probably getting the security warning that external data connections have been disabled and we'll get to troubleshooting that at the end so instead we're going to be starting with a new blank workbook I'm going to navigate over here to the data tab this is where power query is located specifically under this get and transform data it doesn't really say power query but that's where power query is hidden now anytime I'm importing any data I typically go to
this get data and then from there I navigate down deeper depending on whether it's from a file a database from fabric and power platforms or even from other sources they also have smaller icons right next to it that you can hover over and it basically highlights okay this is from web and then this is from a table or range and whatnot we're going to be going over multiple examples in this video so don't worry if you're not following along with which data sources you can actually import I think you'll have a
good idea by the end of this so what are we going to import first well if you navigate into our course folder under resources under data sets and then data jobs monthly we have Excel files for every single month we're going to start by just importing one Excel file and then in the next exercise we'll go into how to import all these at once anyway we're going to start simple first with just this Excel file so for this I'm going to go to get data and it's from a file specifically it's from an
Excel workbook inside the course folder I'm going to then navigate to the data set going to resources data sets monthly and then select that January data set and click import with power query you're going to find that it has this navigator window pop up and from there it will show you what it's actually importing in this case January data jobs the Excel file and then whether it has one or multiple sheets they will appear there underneath it whenever I select sheet one it then shows me on the right hand side a snapshot or a
preview of all the different data in there it doesn't show all the columns but a snapshot of it at the bottom there's a few options to load or load to and then also transform we're going to keep it simple for the time being and we're just going to load so I'll go ahead and click it so we just imported in this data set from another worksheet it's already in its own table and because it also was sheet one it's naming the sheet sheet one parentheses 2 to signify it's the second one so congratulations we
just completed our first ETL process of actually extracting transforming and loading an Excel workbook into another workbook so we loaded this table in but how do we actually go about using it well in this portion we're going to be demonstrating how we can manipulate it with a pivot table and how to basically control all our different queries if you notice we had over on the right hand side this queries and connections pane now if it's not popping up you can go up here to the data tab and then you see queries and connections you can toggle
it on and off by clicking this button power query sets up these queries and in this case it named it sheet one after the sheet one in that workbook that we imported in and if we hover over it we can get some details about the columns when it was last refreshed its load status and even the data source now connections over here on the right right now we have zero connections that's actually what's controlled by power pivot which we're going to be going over in the next chapter on power pivot
but anyway back to the queries themselves right now we see with sheet one that 3,000 rows are loaded and if necessary we can go through and refresh the data set and it shows it loaded the data and 3,000 rows are loaded again pretty quick so let's actually get into manipulating this well it says that 3,000 rows are loaded but I can actually go in and delete this tab and it's going to give you this warning that it's going to permanently delete the sheet do you want to continue yes I do and whenever I do that since the
data is no longer loaded it now displays that it's connection only so we can actually change where we load our data to if you will and I can get to this by right clicking it and then going into here and we'll be exploring all these other options as we go through but I only want to focus right now on this load to and they have a few different options in here let's actually explore them right now it has only create connection so right now it only has a connection if we go back to that table
and click okay it once again loads it into that table if we want to actually get into a pivot table we'll select this one pivot table report we can also do a pivot chart and it asks whether we want to put it in the existing worksheet or a new worksheet and then finally it has add this data to the data model you've seen this one before and once again we're going to be going over data models more in depth in chapter eight on power pivot so we're not going to be enabling this checkbox just yet
anyway I want it in the existing worksheet I don't need that table there so I'm going to click okay and it says hey there's possible data loss because we're going to be basically getting rid of that table and replacing it with a pivot table do I want to continue yeah and now like we did before in the pivot table chapter we're now using a pivot table and so we can put in things like job title short and analyze it for the count of different jobs that it has within it there's no change whatsoever in everything we learned
in pivot tables it's still the same application that we're using here so now let's actually get into importing multiple different Excel files we're going to specifically be importing all 12 of these of January through December this time I go into the data tab under get data and we want to get it from a file but I'm not going to select an Excel workbook instead what I'm going to do is select a folder because all those Excel files are in the same folder inside my course I'll navigate into resources data sets and then I'm going
to select the folder itself and select open now you may notice the navigator window looks a little bit different and that's because now it contains the metadata of these Excel files itself such as the name date accessed date modified date created and whatnot and with this one like before we have that load and load to along with transform data but we're just going to go into combining this data set so I'm going to go ahead and click that and specifically we're going to use combine and load now we navigate to a window we're more familiar with of combine files
and what this is doing is showing how it's going to actually combine the files and we need to make sure that they're all the same format if I actually click sheet one for which the sample file it's looking at is the first file this is what it looks like and we know this already because we looked at the January file anyway if you wanted to you could also change this to a specific file I'm fine with just using first file and selecting a sheet if I was having errors I would do skip
files with errors but I'm not worried about that just yet I'm going to go ahead and click okay and bam now we once again have that table loaded into here and this has all the data so I expect it to have around 30,000 results similar to what we've been working with before and it looks like it does and if you notice we have this new column right here of source name which tells which Excel file each of these comes from and just doing a cursory check it looks like all the different months are in there
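To make the combine step a bit more concrete, here is a minimal Python sketch of the same idea: read every file in a folder, append the rows, and tag each row with the file it came from, much like the source name column above. This is only an illustration and not what power query runs internally. It assumes CSV files instead of Excel workbooks so it can stay within the standard library, and the folder contents, file names, and column names (`job_title_short`, `job_country`) are made up for the demo.

```python
import csv
import tempfile
from pathlib import Path


def combine_folder(folder: Path) -> list[dict]:
    """Append the rows of every CSV in `folder`, tagging each row with its file name."""
    combined = []
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Mirror the extra source name column added when combining files.
                combined.append({"Source.Name": path.name, **row})
    return combined


def make_demo_folder() -> Path:
    """Create a throwaway folder with two tiny monthly files (hypothetical rows)."""
    folder = Path(tempfile.mkdtemp())
    samples = {
        "01_jan.csv": [("Data Analyst", "United States"), ("Data Engineer", "Germany")],
        "02_feb.csv": [("Data Scientist", "India")],
    }
    for name, rows in samples.items():
        with (folder / name).open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["job_title_short", "job_country"])
            writer.writerows(rows)
    return folder


demo_rows = combine_folder(make_demo_folder())
for row in demo_rows:
    print(row)
```

Dropping another monthly file into the folder and calling `combine_folder` again picks it up automatically, which is the same reason a refresh pulls in new files in the workbook.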
now onto this queries and connections pane up here I'm going to actually make this smaller so I can actually see it all previously we only had our sheet one query but now we also have this data jobs monthly query and with that up here at the top because we're connecting multiple different files we have these helper queries that were created during the process so you can hover over these and basically see that hey it used the September file as a sample and this is the steps it took or this is what the sample file actually
looks like anyway I'm not too concerned with those helper queries right there or with anything underneath this transform from files I mainly care about what's under those other queries so we have sheet one and data jobs monthly speaking of which sheet one is a really bad name for this so I'm going to rename this to data jobs January I'll also rename the sheet so just to prove with the data jobs monthly that we actually imported it all in we're going to go in and load to and we're going to do this time a pivot chart
going to go ahead and click okay we're doing the existing sheet with the table I don't care if I get rid of that table so I'll click okay and similar to before I'm going to put that job title short in we're going to put it in the axis or the rows and then we're going to want a count of that as well and then I'll just organize this in descending order based on the count of job title short so bam we now connected with power query to multiple Excel files and imported them in at once I hope you
realize that now this unlocks a lot of potential because say you get January of next year's data you could just put it into this folder here and then all you need to do is go back into the data tab click refresh and it's going to go through and refresh all that data set and pull those new numbers in all right in this example you're not going to follow along I just want to show the power of power query okay the pun's getting old by now anyway I have this CSV file or comma separated values basically
it has commas separating everything I'm looking at this in VS Code don't worry about any of this stuff like I said you're not doing it the main point is to show this data set itself right here it's starting at the top row of one and if I scroll all the way down we get to the last entry and that's 2.7 million jobs that I have here in this data set we can actually import this into Excel now if you recall if you scroll all the way down to the bottom of Excel it only includes
about 1 million rows so how the heck are we going to do this with power query so this is a CSV file I'm going to go to data get data from file specifically it's a text or CSV and I'm going to import in this data jobs large file that I have reminder again you don't have access to this file it's just too big to even get onto GitHub so that's why this is a demo only this is the data set itself so I'm going to go in and actually load it now this
is taking a little bit of time as you can see it's loading around 100,000 rows as it goes through also it has three errors now in here this usually appears whenever it has a row of data that doesn't necessarily make sense for what it's supposed to import so it alerts you there's an error so the 2.7 million rows are loaded but I get this error message the query returned more data than will fit on a worksheet remember it automatically by default tries to load it into a table in Excel and it's telling me
that hey it's not going to fit so I'll click okay now it's still going to try to load that table but it's going to cut it off at that 1 million row limit but this doesn't mean we can't analyze it if I hover over this query it reminds me that the results of this query are too large to be loaded to the specified location worksheets have a limit of 1 million rows sure instead what I'm going to do is go and load to and I'm going to load to a pivot table click okay and it's going to
warn me again about the table loss yeah I know so once it loaded and it took like a hot minute to do that I can actually go through and now analyze these 2.7 million rows so if I do something like put the job posted date into the rows and we also want to get a count of this so I'm going to put the job posted date also into the values so we get this count anyway reformatting it with commas to actually be able to read this now we can see that we did actually get in
2.7 million different data points for this and as a side note this is all the data that I've collected since I started in 2022 doing this so there's a lot of different jobs so Excel is not necessarily limited to just analyzing 1 million rows of data all right so let's finally get into that last example of importing in this list of S&P 500 companies feel free you don't necessarily have to do this table from Wikipedia but I'll drop a link below on where this table is located and you can use that if you want so
I copied the web address of that table then I'm going to come in here and select this one of from web you can do basic or advanced with Wikipedia it's perfectly fine to do the basic version putting in that URL clicking okay we get into that navigator window and there's actually multiple tables inside of here one is the list of 500 companies and the second one is a list of companies that have been added and also removed from there they also just have random tables in there as well just because on the internet you're going
to have random tables like this one of main menu contents tools appearance not applicable anyway we want to do table one I'm going to go ahead and click load and now that we have it in here anytime we do this we probably need to rename it appropriately from something like table one to S&P 500 in this case and bam scrolling down we can see that we have should be 500 oh a little bit more than 500 apparently the list has been updated to include a little bit more I don't know why that is but got
all the data nonetheless now quick note on if you want to actually navigate into any of the files and see what I've done whenever you go to open it so in this case I want to open power query intro I'm going to open it up you're going to get this warning of external data connections have been disabled do you want to enable content in this case yes you want to enable all that now the problem you now may also have is that it may give you a warning that your data source settings aren't correct and what
do I mean by that if I go into data and then under get data we're going to see this thing here for data source settings and it's managing settings for your data sources anyway you're going to see these locations here these are file locations of the data sets and they reference the files that are on my computer that's not going to be the same for your computer it's probably going to be in a different location with a different name so here I know that this is the data jobs monthly folder if I wanted to actually
go in and update it with the actual location for where it is I would go down here select change source and then from there select browse to navigate to it you're going to once again navigate to your course of excel. analytics into resources data sets and then there's that data jobs monthly click open and okay and then it's going to update you're going to have to do that for this file and all the files within power query and also power pivot because your file locations are not the same as my file locations then after
you do all that it should go through and refresh but if it doesn't you can manually refresh it underneath the data tab by clicking refresh all the last item to call out is the options menu we're going to be going into the query editor in the next video so we're going to save that for that one anyway query options has a lot of advanced details in controlling power query in this case of showing the query peek when hovering on a query in the queries task pane that's sort of annoying to me it pops up every
now and then I'm going to go ahead and unclick it but they also have different behaviors you can control for data load for the power query editor the security privacy and even diagnostics so feel free to go through this and navigate and see what is available to actually customize with this I'm going to go ahead and confirm my changes with okay and now whenever I go to the queries and connections and actually hover over something like data jobs Jan it doesn't just pop up on the screen and sort of catch me off guard so I sort
of like that all right we now got some practice problems for you to go through and get more familiar with performing basic ETL with power query and loading in some different data sources with it with that we'll see you in the next one where we're going to get into the power query editor anyway nothing to be intimidated by as a lot of the core principles we've learned already in Excel are going to be applied to this new window so you're going to pick it right up all right with that I'll see you in the
next one in this lesson we're going to be continuing on with power query focusing specifically on getting you introduced to this power query editor and in order to facilitate this we're going to be walking through actually importing and cleaning up our data science job posting data set that has over 30,000 rows of data we're going to be automating a lot of the steps using power query that previously we had to use functions and formulas for so it's going to be saving us a lot of time in order to actually automate this
data ingest for this we're going to be starting out with a blank workbook so I know we do have this power query editor workbook but I don't want you actually editing from that that's more for a reference now if you do open this file in order to reference it as we go along remember you're going to have issues or an error saying hey data source isn't there remember you need to go in and actually select where this data set is so under the data tab get data and then under data source settings you're going
to need to update this link or this address right here of where you're actually accessing the data jobs salary all Excel file this is my location not yours got to update it anyway like I said we're not going to be using this so I'm going to open up a new workbook and like before we're going to be importing in that data set so we'll go to get data from file from Excel workbook you'll navigate to the course itself under resources under data sets and then we're going to be using this data jobs salary all Microsoft
Excel file go ahead and import this in we're going to select that sheet one and this time instead of doing load or even the load to we're actually going to go into transform data and this is now going to pop open the power query editor and this is where all the magic happens behind the scenes in order to get our data cleaned up so let's go over a quick overview of the window itself it's laid out very similar to Excel up at the top we have a ribbon with four different tabs of home transform
add column and view we'll be walking through each one of these as we go through this lesson underneath here on the left hand side we have which query we have selected once we're building multiple queries they'll start popping up underneath each other we can close this if we want and make more room right here in the middle is what the current step or what the current status is of our data set now yours may look a little bit different right now specifically I have this column distribution enabled underneath the view tab which I'm
going to go into more in a second but anyway it basically outlines all the different columns or where we're at with the data set itself before we finally load it in now right above this area is a formula bar once again similar to the Excel UI and this has all the code in the M language for this current step if you will of actually cleaning up this data set and you're like step like what step well over here on the right hand side we have our query settings and in it we
have the name of our query and then we have the applied steps this lists all the different transformations that we've walked through so just a brief walkthrough the first step is source and if I look at the formula bar basically what it's doing is it's connecting to that Excel file with the file path that it has in the next step of navigation it's basically selecting hey out of that Excel file actually select sheet one from there to actually load in then from there we can see that the headers are actually in the first row and
not up at the top so the next or third step is to promote the headers up to the top and then finally the last step is change type it actually goes through and assigns for each of these what data type it is so in this case job title short it assigns to type text whereas something like job posted date it assigns to type number which needs to be a date and we're going to fix that in a little bit down at the bottom there's a few statistics on this specifically it talks about 16 columns and over
999 rows and it tells you when the last preview was downloaded anyway if I just wanted to stop here with this data transformation if you will I would just come up into home go into close and load we're just going to do close and load to and in this case I'm just going to put it in a pivot table specifically analyzing job title short and how many counts we have of this we can see totaling it all up we have around 32,000 anyway that's a quick overview let's actually get into exploring each
one of those tabs in the power query editor so we're going to go back to data get data and from there you can just select this launch power query editor similarly you can also use a shortcut of just alt F12 I'm on a Mac so I have to press option but actually launching this up boom it has it with just a shortcut anytime you launch it it may be grayed out here so we need to make sure that we go in and actually select a query that we want to analyze and transform for this overview
we're going to start with the view tab because mainly I want to get into how we can actually use the power query editor for EDA and thus save us a lot of time of actually having to analyze it in the Excel spreadsheets itself instead we can do it right here so going through this first thing is you can toggle on and off the formula bar I always leave the formula bar on so I don't know why that's an option next is the data preview I can change the font type I can also check the column
quality so this is telling us if there would be a potential error in here or in this case of job location if there's empty values you typically have error values whenever the data type isn't being understood correctly so in this case job title short is text everything in there is a text column if I were to change this to number and press enter to run I'm going to get errors all the way through here because well that was text and it can't convert text to numbers also not sure why but it should say 100% error
but it's not anyway they also have this green bar up at the top and you can use this that's what I actually prefer so I'm going to unclick the column quality in the view tab because you can actually look up here and see and then also toggle it so in this case for salary year average it looks like 60% of them are valid and 40% are empty now remember the data set is around 30,000 or 32,000 rows but it's only profiling as it says down here on the bottom column profiling based
on the top 1,000 rows so that's all we're seeing right here if I wanted to see all of the data itself now depending on how big it is we may not want to do this I can select this at the bottom and choose column profiling based on entire data set and it's going to reload back into here not sure how long it's going to take now going over I can see there's 22,000 data points for the salary year whereas 10,000 are empty the other thing that you may have enabled by now is
that column distribution to be able to see what is the breakdown of distinct and also unique values investigating what distinct and unique actually mean I went back to the job title short looks like now it's actually picking up on all the different errors I'm going to actually change this back we don't want this to be number for job title short we're going to change this back to text and I'm also going to refresh the preview by going to that home tab basically refreshing it to get it all cleaned up anyway if we recall
from our previous analysis there's 10 different job titles such as senior data scientist data engineer and whatnot and so that is the 10 distinct values distinct counts every different value once even when it repeats like right in here in rows six and seven where data engineer appears more than once now if we go over to something like job country they have 111 distinct so meaning 111 different countries appear in the column and only 12 unique meaning 12 of those countries appear in just one row all right the last thing in data preview is column profile and this is
pretty neat right now I have the job title short column selected it provides one on the left hand side key statistics about the column and then two it actually shows a breakdown of the value distribution of it so this is really helpful in performing EDA so I can easily go in and even see something like job country and see how United States has the majority of the values and then how the different other countries fall underneath that now this takes up a lot of room
and sort of valuable real estate so I find myself toggling this column profile on and off all right last few sections in this view tab go to column if you have a large data set with a ton of columns you can just come down here select the column you want to go to and then it will navigate you to it parameters this is beyond the scope of this course we're not going to be enabling parameters or even using them so we'll call this N/A next is the advanced editor which allows us access to basically
the behind the scenes of our M language which we're going to be breaking down further in an upcoming lesson so we're going to save that but you can also access that from the home menu in advanced editor as well lastly is query dependencies whenever it gets into complicated ways that you're actually building your different queries and how they're connected to each other this is going to come in handy in this case it's showing that hey we connected to that Excel file on my MacBook and we loaded it into a pivot table
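Going back to the column distribution numbers from a moment ago, the difference between distinct and unique can be sketched in a few lines of Python. This is only an illustration of the two counts as described in this lesson, not power query's own code, and the sample job titles are made up: distinct is how many different values show up in the column, and unique is how many of those values show up exactly once.

```python
from collections import Counter


def column_distribution(values):
    """Return (distinct, unique) counts for a column of values."""
    counts = Counter(values)
    distinct = len(counts)  # every different value counted once
    unique = sum(1 for n in counts.values() if n == 1)  # values that appear in only one row
    return distinct, unique


# Hypothetical column: two titles repeat, two appear only once.
sample = [
    "Data Engineer",
    "Data Engineer",
    "Data Analyst",
    "Data Scientist",
    "Data Analyst",
    "Cloud Engineer",
]
distinct, unique = column_distribution(sample)
print(f"{distinct} distinct, {unique} unique")  # 4 distinct, 2 unique
```

Read this way, the job country column having 111 distinct and 12 unique means 111 different countries appear, and 12 of them appear in only a single row.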
next up is query settings I'm actually going to go ahead and close out this queries pane over here anyway with the query settings we can actually change the name of the query if we want to in this case it's named sheet one I don't really like that I'm going to name it something like jobs and I know it has salary data in it so I'm going to add salary down here on the applied steps like we mentioned this is a step by step walkthrough of each of the individual steps that power query has taken
to actually clean up our data set now one thing I will call out in this if I need to modify anything so in this case if I wanted to modify the data source here I could come inside of here into the formula bar and edit it but if you're not familiar with the formula bar or comfortable using it I would encourage you to instead click this settings icon over here on the right hand side and then typically a window will pop up and allow you to edit it so I could technically
change the location of this or change what type of file it is the same for navigation as well I can basically pull back up that navigation window that I had before and change the sheet I wanted to for the change type this doesn't really have a gear icon next to it for us to edit so we're about to go through and actually change it but if we inspect the job posted date we'll see that here one it has it underneath the type number but then actually looking at the column itself it's a number value because
remember Excel stores dates as number values behind the scenes well we could convert this to a date by typing in date here but you may not be comfortable doing that just yet anyway with that that's a great segue into the home tab and how we can actually change something like a data type with the home tab we've already seen a lot of things already right we saw the close and load to we also saw that I can go through and actually refresh my query to make sure that it's fully loaded and up
to date if I have multiple queries I can not only do this refresh preview I can go to this refresh all and it does a refresh of all queries we've already seen advanced editor before properties just allows us to actually go in and change the name of this query if you want to and manage is more advanced we'll be diving into that in a little bit similar to under the view tab with go to column we also have this option of choose columns and go to column we can also just actually select a column if you will
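As mentioned a moment ago, job posted date shows up as a number because Excel stores dates as number values behind the scenes. A rough Python sketch of what that number means is below. It assumes the default 1900 date system used on Windows, where the whole part of the number is days counted from a base date of 1899-12-30 and the decimal part is the fraction of the day, and it only holds for dates after February 1900. The function name is made up for illustration, and this is not how power query itself does the conversion.

```python
from datetime import datetime, timedelta

# Assumed base date for Excel's default 1900 date system (valid for serials after Feb 1900).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel date serial number to a Python datetime."""
    # Whole days and the fractional time of day are both carried by `days`.
    return EXCEL_EPOCH + timedelta(days=serial)


print(excel_serial_to_datetime(45292))    # 2024-01-01 00:00:00
print(excel_serial_to_datetime(45292.5))  # 2024-01-01 12:00:00
```

The decimal part is why the column in this data set is a better fit for a date time type than a plain date.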
so if I wanted to actually select job post to date or even more than that I can just do that and it's going to select it and it's going to actually remove all the other columns so which is not what we want to do which brings us a good point if we want to get mid rid of a step all we have to do is come over to the applied steps and there's a red x mark that will appear over any step that you do so I'm just going to go ahead and click X here
and it's going to remove anything that I've done moving on to remove columns which I think is pretty self-explanatory if you want to remove a column you just select it and you select remove column additionally if I want to remove all other columns so in this case job title let's say I want to keep that I could select remove all other columns and it would do that I want to cancel this step so I'll click X similarly to remove columns we have well keep rows and also remove rows and then we have options for also
sorting our values if we want to sort them from a to z or Z to A depending on a column so back to job posted date maybe I wanted them in numerical order I could just click A to Z and it would go through and actually sort it anyway I don't really want to do this I'm going to clear this step as well this brings us actually into what we want to do which is change this job posted date to a date time and that's what we're going to use underneath this transform section
in the Home tab right now this data type as I'm selecting this job posted date it shows that it's a decimal number if I go to something like search location it changes to text so what I want to do is change this data type of decimal number to specifically a date time because that's what we have in here we have date and time now this popup is going to come up if you're doing this underneath the step that has changed type already what it's noticing is that the selected column has an existing type conversion would you
like to replace the existing conversion or basically preserve that as a number and add a separate step I'm just going to go ahead we're going to do replace current but I just want to show what it looks like of adding another step in this case I converted it in this step to a number and then the next step I converted it to a date time I don't like having a bunch of steps I want to make this as concise as possible so I'm going to clear that step instead and instead this time whenever we go
through it and select date time I'm going to say hey replace current now underneath here it updated that job posted date type to date time and it's all within one step love this similarly to that date time I also want to convert the salary year average and the salary hour average columns right now they're decimal numbers and there's nothing wrong with that but I actually have the option to change it to something like a currency in this case once again I want to replace the current step for that I'm going to do the same
for salary hour average and change that to a currency as well with replace current covering briefly these other sections in the Home tab first up is merge and append we're going to be covering an entire lesson on this and how we can actually take different Excel files and different queries and combine them together with this manage parameters is outside the scope of this course I don't find myself ever really doing this so not something we need to worry about data source settings similar to what we saw outside of the power query editor in Excel basically the
same popup is going to come here to allow you to change where your data source is and then down here at the very end if we have wanted to put in a new query I wouldn't necessarily have to back out of the power query editor I could just come in here and select a new source a file or database or other source and then work through actually importing it in in a query sometimes I find myself also using this one of enter data say I had a simple table that I wanted to input into Power
query to have I could go through and just create that table all right next up is the transform Tab and this one I feel is maybe actually although it looks like a lot of options it's probably one of the most simplest as you can see we have things like text column number column date and time columns structured columns basically if we have a data type of this we're going to go to you can go to if I have a number column I want to go to this and see what things I could do to it
such as statistics or rounding or I could even get information out of it such as whether it's even or odd I also have this section on any column that basically applies to any column this allows us to one like we saw in the Home tab actually convert the data type of something but also do even more advanced Transformations such as pivoting and unpivoting columns which we're going to be diving deeper into in the next lesson on Advanced Transformations and finally we have this section on tables which just does
more generic things to this data set such as if I wanted to actually go through and count the rows on this I could and I find out I have 32,000 different rows on this anyway I actually want to transform a column of this specifically this job via column as you notice from here all these different job platforms have via and then a space right at the beginning I want to actually remove that so in order to do this I make sure that one the job via column is selected I notice up here in
the any columns it has the data type of text now there are a few options in underneath the text column section for like splitting columns I could split it by this half and then delete that via but I find actually the easiest way to do this is just go through this replace values and we're not going to do replace errors we're going to just do replace values itself and we find a value in here in this case we want to find VIA with a space and we want to replace it with well nothing if I
wanted to go into advanced options I have a few different selections available but neither of these is applicable so we're going to just go ahead and click okay and Bam now we have these job platforms cleaned up now we've been going through this and keeping the names of these steps the same but sometimes I like to be more descriptive when it's not a generic step now it named this new step replaced values I may actually do that a few times and I want to be able to whenever I go back to this actually
be able to identify what steps did what in this case change type promoted headers navigation Source those are all usually only done once so I know what they mean however for this one I don't know so I'm going to right click it and go to rename and I'll say this is replaced via in job via which is much more descriptive in my opinion all right only one more tab to cover and that is the add column tab with transform we transformed a current column with add column we're adding an additional column similar to transform
it has these options for text number and also date and time so very familiar features with this so let's say I wanted to extract the month and the year out of the job posted date column basically I want a column for month and I want a column for year anyway previously we learned with that transform tab if I were to come into here under date time and then select something like month it's going to transform this column so it's going to get rid of the contents of the job posted date which is not necessarily what I
want I want a new column so I'm going to actually get rid of this step so with add column what this does is with that job posted date column selected I select date in this case I want month I could do start of month end of month day of month whatever I just want the month itself and then inserted month is pretty descriptive I however don't like the name of this so I could come in here this is an option and change it I double click on this and name this job posted month and then press
enter now with this I'm going to get a renamed columns step here so now I have two steps this month was inserted and then we renamed the column I would encourage you to minimize the amount of steps you have because these queries can get quite long in this case I'm going to delete this renamed columns step and go back to this inserted month if we actually read this you don't actually need to understand much of what's going on in here but I can see basically that we have this month in quotation marks and this is
named month so I basically can reason that this is probably the new column title of this so instead of using month I'm just going to edit this in the formula bar to job posted month then I'm going to click at the end and press enter and now all within one step I inserted that month and renamed it as well if you're not comfortable doing that feel free to go through that next step of actually double clicking this and actually changing it but I would encourage you if you can actually try to mess around with the
formula if you make a mistake it's pretty simple to just X out of that step and then redo it again so there's no harm to your actual data set now similarly if I wanted to create that job posted year column I could just go through here select year whether I want start year end of year year itself once again it inserts year and then I would want to change the name of this and change this to job posted year and then click enter and Bam now we have it I don't actually need this all these
are from 2023 so this is not going to provide any useful data for me so I'm actually going to delete this step all right I want to do one last transformation before we actually load this and go to actually visualize this so we have our salary year average column and I also want to compare this to the salary hour average column but this is on a yearly basis and this is on an hourly basis what we could do is do a conversion to our salary hour average column to get it to an equal
value or comparable value to our yearly value meaning we could take the number of hours in a year multiply it times this value and from there get what would be the expected yearly salary for this hourly data so I could do this via the transform tab right going into that number column under standard we want to actually multiply and then there are 2080 working hours in a year for a 40-hour work week I could go through and actually do that and that's going to update this column itself but remember we probably want its own
column so I'm not going to use that instead we'll go to add column with this hour average column selected select standard multiply put in those hours of 2080 and then click okay once again I'm going to rename this I can see that this multiplication column is titled this via in this step right here so I'm going to rename it to salary hour adjusted and in this case I'm going to also rename this step to adjusted hourly salary to yearly now I'm sort of a stickler for keeping my data set in order right now I have
this job posted month and it's pretty far away from my job posted date I would actually want to move it right next to it so there's a couple options I can do to move it I can select the column and then come up here to the transform Tab and move it left right to beginning or to end or I can actually just take it and then drag it and this is taking forever it's like watching paint dry but I find where I want it boom plant it in and then it inserted the step
of reordered columns I'm going to do the same thing with salary hour adjusted and put it right next to salary hour average and both of these are done with one step of reordered columns so I'm fine with that so now let's actually get into analyzing this specifically I want to be able to analyze and compare this salary hour adjusted column that we just created to the salary year average so going back to home I'm going to close and load this in we have this previous analysis that we did before doing EDA on the jobs but I actually
want to create my own from scratch all right so back on sheet one we can see our queries and connections specifically that data job salary remember under the data tab you can go in and toggle on queries and connections anyway we want to go to insert I want to analyze that hourly adjusted salary so I'm going to come in to create a pivot chart we can also do pivot chart and pivot table at the same time anyway when this pops up for pivot table or pivot chart we're not going to select a
table or range because this is a power query connection if you will we're going to use this external data source and we're going to say choose connection what connection do we want to use for this specifically I want to use that data job salary so go ahead and click that and open and we're going to insert it into the existing worksheet so now the pivot table is set up for us before going forward one quick note you may be tempted say if we went back to jobs EDA to right-click this and then go load to
and let's say hey I wanted to create a new pivot chart well the problem is it's going to then get rid of this pivot table that we previously created so if you want to keep this you don't want to actually do that back to the pivot table itself you'll notice now that we have these queries and connections too but you can toggle between the two over here on the right hand side anyway what I want to compare is that salary hour adjusted to that salary year average right now it's doing sums
we don't want that so we're going to go to value field settings and we're going to do average here we're eventually going to do median I promise you but we're going to stick with average for the time being I'll adjust both of these to be average then I'm not really liking the formatting here I know we adjusted it as currency back in power query but this is the one data type that I find doesn't actually carry through as the correct format when you import it into Excel so you do
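For anyone who thinks better in code, here is a minimal Python sketch of the hourly-to-yearly adjustment and the per-title averages being compared here. It is only an illustration of the arithmetic, not what Power Query or the pivot table runs internally. The sample rows and field names are made up for this sketch; only the 2080 hours figure comes from the lesson.

```python
from statistics import mean

HOURS_PER_YEAR = 2080  # 40 hours/week * 52 weeks, the figure used in the lesson

# Hypothetical sample rows; the field names only mirror the lesson's columns.
rows = [
    {"job_title_short": "Data Analyst", "salary_year_avg": 90000.0, "salary_hour_avg": None},
    {"job_title_short": "Data Analyst", "salary_year_avg": 100000.0, "salary_hour_avg": None},
    {"job_title_short": "Data Analyst", "salary_year_avg": None, "salary_hour_avg": 30.0},
    {"job_title_short": "Data Engineer", "salary_year_avg": 120000.0, "salary_hour_avg": None},
    {"job_title_short": "Data Engineer", "salary_year_avg": None, "salary_hour_avg": 50.0},
]


def add_hour_adjusted(rows):
    """Return copies of the rows with a new salary_hour_adjusted field."""
    adjusted = []
    for row in rows:
        hourly = row["salary_hour_avg"]
        new_row = dict(row)
        # Blank hourly values stay blank instead of becoming zero.
        new_row["salary_hour_adjusted"] = None if hourly is None else hourly * HOURS_PER_YEAR
        adjusted.append(new_row)
    return adjusted


def average_by_title(rows, column):
    """Average one column per job title, skipping blank values."""
    groups = {}
    for row in rows:
        if row[column] is not None:
            groups.setdefault(row["job_title_short"], []).append(row[column])
    return {title: mean(values) for title, values in groups.items()}


adjusted_rows = add_hour_adjusted(rows)
yearly_avg = average_by_title(adjusted_rows, "salary_year_avg")
hourly_adjusted_avg = average_by_title(adjusted_rows, "salary_hour_adjusted")
print(yearly_avg)
print(hourly_adjusted_avg)
```

Adding the adjusted value as a new field rather than overwriting the hourly one mirrors the choice in the lesson of using add column instead of transform, so the original hourly figures stay available.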
need to go back still and actually convert it into the correct thing anyway we're seeing that the hourly salary is much less than the yearly salary and moving this over we can also see this via visualization this doesn't really show as much I would rather look at this when compared to job type so I'm going to go ahead and grab job title short and throw it into the axis now closing out of this and then closing out of this on the side we can now get a better view of this I'm not liking the format
of this pivot chart specifically I'm going to go in here to design under change chart type and change this to a bar chart I feel like it's going to be easier to read yeah it's a lot easier to read also for these visualizations I'm going to right-click this and I'm going to say hide all field buttons to make this easier to view and I'm going to go ahead and stick The Legend at the bottom okay we're off to a good start other things I want to do to clean this up is oh my goodness this
is so long I'm going to change these column titles to hourly adjusted salary and then yearly salary additionally I want to sort this a little bit better specifically from high to low so under sort options more sort options I'm going to go into sorting this as ascending based on the yearly salary from high to low sorry that's actually descending selecting yearly salary clicking okay no it was right the first time it's ascending okay this is looking good you know also I don't like having different colors I like actually going with a consistent theme so
going into design change colors I'll change this to this monochromatic palette 8 and Bam we now have our final visualization where we used power query to basically ingest all our data in clean it up create this new column of hourly adjusted salary perform an analysis in Excel to average it and we can see that consistently the hourly salary is well below that of the yearly salary so I guess it pays to have a salary job all right we have some practice problems for you to now go through and test out all these different features and
get more familiar with the power query editor in the next lesson we're going to be going into advanced Transformations and Diving deeper specifically in analyzing skills and using power query to actually clean it up so where we can actually analyze skills with that see you in that one all right welcome to this lesson we're going to continue on with power query specifically focusing on using more advanced Transformations and for this we're actually going to get into analyzing those skills and being able to put them on a graph and actually visualize what are the top skills
of data nerds now if you recall way back in the functions and formulas chapter when we went over text functions we did a little bit of text cleanup to clean up this column and then plot it but we were only able to do that with around 20 rows now with the power of power query we're actually going to be able to clean up all these values and be able to visualize it for all 30,000 job post so let's jump in if you want to you can continue on from that worksheet that we used in the
previous lesson and just make sure that you do go through and actually save it before you continue on however if you got lost along the way or you just don't have that file anymore feel free to use the file from the last lesson of power query editor once again you don't want to be using the actual one for this lesson cuz that has the final results we're going to want to work with the previous one and this has all the different work that we did it also has some additional analysis whenever I looked at
plotting it over time to see how the yearly versus hourly salary compared anyway let's get into editing this and we can get to the power query editor by going up to get data launch power query editor or pressing alt F12 once it loads I need to click on the query that I actually want to look at and I'm going to close this or minimize this the first thing that I want to do is add an index column on this data set because in general whenever you have a source data set or a fact table
like this is you want to have an index associated with it yeah these row numbers are good but that's not good enough and we'll be using it more in the power pivot chapter but it's good practice to start it now so moving over to the add column tab I'm going to go to index column it allows us to start from either zero or one I'm a coder so I like from zero now Pro tip I want this index at the front now I could go to to transform and then move and then move this to
the beginning but remember we did this reordered columns right here so what I'm actually going to do is take this added index put it before reordered columns now that the reordered columns is right there whenever I select this index and move this over to beginning it's going to be included as part of this step of all of our column reordering so I don't have once again multiple different reordered columns all right in order to clean up this job skills column we're going to end up putting these skills right now they're separated by
comma inside of this list we're going to be breaking them up into their own individual rows and because we're breaking this up into different rows this now is going to put for this one row here this is going to make 1 2 3 4 5 6 7 this is going to make seven different rows of data this is going to mess up anytime we want to analyze anything because imagine if you have like salary data it's then going to appear seven times so the main point of explaining that is we want a new
query to actually populate and actually break these skills out into their own separate rows so in order to create a query or another query right now we have queries one to create another query from this we have two options and that's underneath Home tab they have manage and we can either delete a query which we're not going to do we can either duplicate it or reference it I can also get to this by just right-clicking the query and it also has these of duplicate and reference let's actually look at both of those starting with duplicate
first so I've created my duplicate query and as you can see it basically has a duplicate of the original query nothing really has changed from it now this is cool if I want to walk through all the different steps again and I wanted to have it in this new query but I actually like this other option so I'm going to go to data job salary go down and select reference okay this query this one named three is referencing data job salary and it only has one applied step if we look at
the applied step all it is doing is referencing the data jobs salary so this first query right now and populating it for us and this is really good because say now I make changes to the original query such as say I want to go through and I don't want any more of the hourly data in here I only want the yearly data so I filter down to only have the yearly data so now it's filtered these rows for the yearly data don't worry we're actually not going to do this I'm going to delete this
step but anyway if I go to that referenced query the one with the three at the end this one only has year values in it this I can verify is 100% yearly by looking at either the column distribution or the column profile everything is year anyway we don't actually want to do that step so I'm going to go back to this original query clear the filtered rows and once again it's going to just clean this back up to have two distinct values checking the salary rate yearly and also hourly okay so we like the
reference for our case cuz we may make changes to the original one so I'm going to delete this number two because remember that was the duplicate and we're going to keep the number three one which was the reference we're also going to be doing all our alterations on the skills on this one so I'm going to rename this one data jobs skills so with this new query data jobs skills let's actually get into cleaning up this column of data of job skills specifically we're going to be separating this into each of these
skills into the new rows by this comma delimiter but we need to remove a few things from this specifically this has brackets around it and it also has single quotes we don't need any of that we need to remove it so going to that transform tab we're going to go into replace values and we've done this before so for the value to find I'm going to just start with the first square bracket we want to replace with nothing I'm going to click okay additionally we want to replace the other bracket as well replace it with a
blank and then finally we want to replace that single quote as well also I'm going to just rename these all next thing we going to do is actually split these columns on this delimiter of a comma so under transform we can go here to split column it has a few different options by delimiter number of characters by positions we can go to by delimiter I'm going to select that for this we're going to use a comma delimiter because there's multiple different options you could potentially use for this we want to split at not just the
leftmost but we want to split at each occurrence there's no quote characters in here we removed all the quote characters so I'm going to click none and then click okay so now we just split these skills into let's see how many different columns we have here looks like we have up to 24 skills for all these different skills that we have so now what we need to do to get all of these if you will skills within a single column we need to unpivot them but the one issue right now so I have all these
skills right here but we also have all these other columns right here I don't really care about all of them I want to mainly just analyze job title short and index so what I'm going to do to make this easier because I need to basically select which columns I want to remove or which ones I don't want to remove in this case so what I'm going to do is go back to source and this one is from before we actually broke up the job skills so I'm going to
select job skills hold down control and then from there select job title short and also index and then underneath the Home tab we're going to go to remove columns and what we're going to do is remove other columns basically going to keep those three columns that we have now we are doing this in the applied steps after that first step of source so it's asking hey do we want to insert this step yes we do and so now we've limited it down to those three columns and Bam now whenever we go down here down to that
last step of change type we can see that we have all our different job skills and then over on the right hand side we have our index and our job title short and I don't really like the order of this I'm actually going to go back to reorder this over here I'm going to just take these columns and then put them in this order of index job title short and job skills so now we actually get into unpivoting these job skills columns basically making all these job skills into one column so I'm going to
select instead of selecting all the job skills columns I'm actually going to select the opposite holding control select the index and job title short and I'm going to go into the transform tab into unpivot columns and for this one once again we're going to use the other we want to unpivot other columns and go ahead and do this all right so what did we do here we now have these new columns of attribute and value attribute if we go back that is just the name of the column that was created previously and then the value is
what was in the cell itself and that's filled with all the skills so personally I don't really care for use of this attribute so I'm going to go ahead and just remove this column by right clicking and selecting it additionally I'm going to go back up here and I don't want this to be named value so I can go in and inspect this under unpivot other columns I can see in here that it renames these columns attribute and value in this case I don't want to be value like I said I want to be job
skills clicking enter boom renamed it to job skills and then in here it is job skills now one thing that's bothering me real quick before we continue on to actually visualizing this data is this column here typically I like to name things something like job underscore whatever it is in this case index I want to name job_id but if you recall we created this back in this data jobs salary query specifically here under the step of added index I want to change this from index as we've done before going in and renaming it
to job ID however whenever I do this press enter this is going to break my queries and this is going to happen to you anytime you're manipulating it so I think we need to get familiar with it so if I go to the next step of reorder columns we're going to have this expression error the column index of the table wasn't found duh because we named it job ID in the previous step instead of index but this step is still the same so what I can do is come in here change index to job ID
press enter and Bam that updates but then now going to data job skills we're going to have the same thing you're going to notice with this one right the column index of the table wasn't found so same error message what you can do is click go to error and it's going to go to the first occurrence of that error this is trying to reference index but if you recall if we go to the first step of source we expect it to be called job ID now because we renamed it right
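To see why a rename in an early step breaks every later step that still uses the old name, here is a small Python sketch that treats applied steps as functions run in order. It is a toy model, not Power Query itself; the helper names and the two sample rows are hypothetical.

```python
# Hypothetical source rows standing in for the job postings table.
source = [
    {"job_title_short": "Data Analyst"},
    {"job_title_short": "Data Engineer"},
]


def add_index(rows, name):
    """Like an 'added index' step: prepend a zero-based index column."""
    return [{name: i, **row} for i, row in enumerate(rows)]


def reorder(rows, order):
    """Like a 'reordered columns' step: it refers to columns by name,
    so a missing name raises KeyError (Power Query reports that the
    column of the table wasn't found)."""
    return [{col: row[col] for col in order} for row in rows]


def run_query(index_name, name_used_in_reorder):
    """Run the two steps in order, as the applied steps list would."""
    with_index = add_index(source, index_name)
    return reorder(with_index, [name_used_in_reorder, "job_title_short"])


# Original query: both steps agree on the name "index".
original = run_query("index", "index")

# Rename only in the first step: the later step still asks for "index".
try:
    run_query("job_id", "index")
    broken = None
except KeyError as err:
    broken = str(err)

# Fix: update the later step to the new name as well.
fixed = run_query("job_id", "job_id")
print(broken)
print(fixed[0])
```

The same reasoning applies to a referenced query: it starts from the output of the original query, so any of its steps that mention the old column name need the same update.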
so I'm going to change this to job ID and then scrolling through the applied steps to see whenever we get to our next error if there is an error and that's unpivot other columns specifically they have job title short and index I don't want index here I want job ID and now bam now we have it cleaned so I should have done that job ID but that was actually good troubleshooting to walk through that you may encounter so let's actually get into visualizing this so we're going to go to home and we're going to close
and we're going to close and load now it's popping up as a table but we actually want to analyze this I don't really care to have it as a table so I'm going to right click it and I'm going click load to specifically we're going to go to a pivot chart and we'll insert in the existing worksheet because we're going to get rid of that data yes there's going to be possible data loss we understand that so I'm going move this chart off to the side select inside the pivot table and we want to analyze
the job skills so I'm going to take the job skills put them in rows and then the job skills also in the values to to count up the values then also I'm going to sort them I want to sound them from high to low so I went to more sort options um we're doing a descending order count of job skills so now there's a ton of different skills in here but want you to inspect this if you notice one these skills have sometimes have spaces in the front of them basically we didn't do a full
cleanup of this so that's why we have python twice in here is cuz this one has a space of it so opening up the power query editor by playing by pressing alt F12 so underneath the data job skills query I'm going to go ahead and we want to do a text transformation specifically if we look underneath this underneath for format we can change this to lower case upload case capitalize each word we're going to do trim which removes leading and trailing white space from each of the cells in the selected cell from there we'll go
back to home close and load this and now it's going to be reloading the data and those duplicate values are now going to be removed now there's a lot of skills here so I really only want to see the top 10 so I'm going to put a filter on here go into value filters and top one specifically want to see the top 10 items by count of job skills also I'm going to rename this to skill count and because these are text values down here I'm actually going to change this from a column chart going
to change chart type into a bar chart instead clicking okay boom and then with this obviously it's not sorted from high to low that's how I want actually to sort it so I'm going to go in here back underneath our more sort options Chang this from descending to ascending and the good thing about this is we still have that top 10 filter on it so it's still going to apply this and have the top 10 values on there first last little clean up I'm going to hide all field buttons I'm going to get rid of
this Legend right here and and then I'm going to rename this to what are the top skills of data nerds now let's say that I'm frequently referencing the top 10 skills as we have right here and instead of having to populate this every single time I want to actually create a own or create a query for this so opening power query going to alt F12 I could do the same analysis inside of power query query and get this into its own table to be reused but for this I don't want to use this data job
skills query instead like we did before I'm going to create a new query we're not going to duplicate this instead we're going to reference it so now it's Unique and distinct and I'll rename this data jobs skill count because we're get the top 10 and their Associated count so in order to do this analysis to find what is the count of all these different skills we want to do a group buy and it's right here under transform form under that Home tab and I can do group by which group rows in the table based on
the values in the currently selected column we're going to be forming a basic Group by we're using that job skills column I could change it to another column if I wanted to and that new column name is going to be skill count operation we're going to be counting the rows we could do any other type of aggregation as well if we had numerical data we could do average median min max whatnot go ahead and click okay so we've done this aggregation now the next thing is I just want to get the top 10 values but
before doing that I need to actually sort this in descending order right now looking at the numbers this isn't necessarily sorted even though it looks like it is so clicking the arrow up at the top I'm just going to say hey sort descending and then we want the top 10 values so underneath the Home tab under keep rows I'm going to select keep top rows and it's going to prompt me for how many rows I want to keep 10 in this case I want the 10 values and now from here
all I got to do is close and load this into its own separate query and bam here we have it and so if I needed to reference the top 10 skills any time all I would have to do is just reference this query and I wouldn't have to like we did last time go through this full analysis so power query is really great at automating some repetitive analysis and having it just ready for you all right last little cleanup if we look at these skill names they're not formatted correctly specifically if I look at
something like SQL I expect it to be all capital letters and for python I expect it to be capitalized at the beginning so we're going to go through and actually fix this so that way whenever we present our data to someone it doesn't look like a hot mess so opening up the power query editor by pressing alt F12 we're going to go into the data jobs skills query specifically on that last step and we want to alter the job skills column so the first thing I want to do with this text cleanup the easiest
thing looking at this is we just need to capitalize the first letter of every single word and then from there we'll go through and actually fine-tune it so in the case of SQL we capitalize all letters we'll have to put in a special case for this anyway if you recall from before we have that transform format and they have this capitalize each word we're going to do that the next thing though the more complicated one is we're going to go into add column and we're going to add a conditional column so what we're going to do is
go through we're going to keep the name of custom for the column because since we're adding a column we're going to have to go and delete this job skills column once we create this new one I don't want to name it job skills right now because that would clash with the existing column anyway what we want to do is we want to select the column that we want so if job skills equals in this case we expect it to equal something like Sql after that capitalize step we want the output to equal SQL then if we want to add more conditions
or clauses to it we go to add clause once again I'm going to select job skills and I'm going to put something like power bi it had a lowercase i at the end I want the BI in power bi to be fully capitalized I also went through and added some other ones such as AWS GCP NoSQL and SAS most of these just needed to be fully capitalized except for the NoSQL one then what do we want it to be if it's not any of these conditions well we'll add this else clause and we want
it to be basically the results of an entire column we want it to be whatever is already in the job skills column I'm going to go ahead and click okay so now we have this cleaned up data set as well with nice looking names now if you're going through and finding anything in here that you want to clean up feel free to add to that conditional column statement those are just the ones I'm going to go with for now anyway because we added this new column and I don't really know
an easy way to do this without actually creating this new column we need to now go ahead and remove job skills and rename custom so going to the Home tab I'm going to remove columns I'm going to remove the one that's selected and I'm going to rename custom to job skills and conveniently because we're using that same name and just replacing it if I go to the data jobs skill count that one because it references this one will also get updated and all those values in there are updated as well anyway let's go ahead and
close and load and inspect this is our previous pivot table and pivot chart that we analyzed it's now going through and loading all the data and now we have it updated with all that correct formatting for those different data points one last thing before we go this is generic these are the top skills of all data nerds and that is using the data job skills query which has the job title short column in it so we can actually visualize this for a certain job by going into pivot chart analyze I'm going to go into insert
slicer specifically we're going to look at job title short I'm going to put it over here and then as usual I'm going to rename it real quick to job title and now let's say we want to analyze something like data analyst we can see that SQL is the top skill but Excel is in second place followed by python Tableau and SAS what about for business analysts very similar in that SQL is on top and then Excel is in that second place so really unique in showing the importance of Excel within these skills and pretty meta that we
used Excel to find this out all right now it's your turn to give it a shot you have some practice problems to go through and get more familiar with doing these advanced transformations specifically pivoting unpivoting and then also group by all right with that I'll see you in the next one we're going to be diving into appending and merging queries specifically going to be doing this with that skill query that we did previously all right see you there let's now get into how to perform appends and also merges and so the first portion of
this lesson the easiest portion in my opinion is going to be append specifically going back to that Excel workbook where we had all those different sheets for the months of the year and the job postings on each because all these data sets are of the same format and have the same columns we're going to be able to append all these together and get what is our final data set of around 30,000 rows if you recall each month had around 3,000 postings so that's how we get to that value from there the primary focus
of this lesson will then shift to merge for this we're going to be combining our two queries that we built previously one of which was our original data set so we titled that one data jobs salary and then that new query that we created in the last lesson on the skills so data job skills we're going to be merging those two together and this will allow us to do some pretty interesting analysis specifically once we've merged those we'll be able to see based on a skill what is the expected salary and we're going to build
a visualization for that for the top 10 skills now merge unlike append is a more complex operation mainly because there's a lot of different types of merges specifically there's six types of merges in power query alone so we're going to be walking through each one of those so you understand the differences and know which one to use when for this first append example we're going to be using this data job salary monthly data set and just as a refresher this contains a sheet for each month in this case I have the January sheet selected down here and this
has all the January data which is around 3,100 rows and we have each one of the months for the year here anyway let's use power query to append all these together because before you knew about this you'd have to go through and actually copy and paste all these different rows right here and then put it into a new sheet doing this 12 times is a hot mess so since this is only a simple example that we're not going to use later on I recommend just opening up a new workbook for this now
coming into the data tab I can come down to get data and they do have this option right here for combine queries with merge and also append but this is for appending two queries from within this workbook it's basically assuming you've already imported it in so instead what we need to do is actually go to from file and actually start our first query of connecting to that Excel workbook with all those different sheets navigating to the course folder underneath resources data sets and then here down on data job salary monthly I'll select that select import in
the navigator we can see all the different sheets that are available we want to actually enable this select multiple items option and then go through and select all the items with all these selected we're going to then shift into not just loading it we want to actually go into the power query editor so I'm going to select transform data and it's going to start by loading each one of those sheets and it's going to name each one of the queries respectively after those sheets with power query editor launched we can see over here
in the left hand pane all 12 of those queries for each of the months so these are all their own separate queries because of that we need to now move into actually appending them and making them one final query that we can actually load into Excel so underneath the Home tab under combine for append queries they have append queries and append queries as new with the January query selected I'm going to go to append queries and for this I can say either do two tables and specify the table I
want to do or three or more tables we're going to do three or more because we want to do all of them with them all selected I'll now go through and click okay to append now this inserted a step of appended queries inside of that January query so now that January query is all those different data sets so I just want to verify that I got all the data in here right now if we scroll down well I'm just going to show it right here we're only showing column profiling based on the top 1,000 rows the fastest way to actually
find this out is just go to the transform tab and go to count rows which tells me there's 36,000 rows which is a few thousand too many and if I go back into the appended query step and actually look into it I can see in the formula bar we have August in here twice because I accidentally selected it twice so I'll go ahead and delete it and then look at the counted rows step which is now actually what I expect the value to be around 32,000 anyway that was just to count the rows additionally I don't want the append
query to be inside of that January query so I'm going to delete this step as well instead with the January query selected I'll go back to that home tab's append queries and then select append queries as new this is going to create a completely new query once again we want to do three or more tables this time I'm going to hold control and select all of them and then move them over at once to make sure we don't have duplicates this time so this now starts a new query right now it's called Append1 I would probably
name it something like data jobs all and then pressing enter it then loads in here but look at these queries right now we have 13 queries and I want to organize these a little bit better so we can actually group these specifically we can group these monthly ones I selected April and then holding control selected all the other monthly queries as well then right clicked and I'm going to select this option to move to group we need to have a new group and I'll call this real uniquely data jobs
monthly and click okay so now we have these two folders one with data jobs monthly I'm going to close that down and then there's other queries which we've seen before and there's one query inside of it data jobs all this cleans it up also you may get this notice up here that the preview may be up to 33 days old feel free to refresh it if you're getting that it should have no effect on your data then if we wanted to we could go through and actually analyze this by pressing close and load to I
prematurely selected close and load I recommend you select close and load to nonetheless I'll go to the data jobs all query and go to load to specifically I want to look at a pivot table I know there's going to be some data loss because it's going to remove the data in the sheet and then I can inspect that job posted date specifically for the count dragging job posted date into the rows and then also dragging job posted date into values and once again this is why we double check it this time it looks like
I accidentally imported in January twice as we can see the total is 35,000 anyway opening up that power query editor going to the data jobs all query and updating it to remove that second January that I should have caught from before and then closing and loading it and now it should refresh and update for these correct values boom so now it's actually aligned with what I expect to see this is why we always double check any type of query or analysis you do this double check of the work is going to save your butt
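
Here is a minimal sketch of that append written in the M language that power query generates behind the scenes. The three tiny month tables and the column names (job_id, job_title_short) are made-up stand-ins for the real monthly sheets, so treat this as an illustration rather than the exact query from the course. It stacks the tables with Table.Combine and then repeats the row count double check from above.

```powerquery
let
    // Hypothetical stand-ins for the monthly sheet queries (the real ones hold about 3,000 rows each)
    Jan = #table({"job_id", "job_title_short"}, {{1, "Data Analyst"}, {2, "Data Engineer"}}),
    Feb = #table({"job_id", "job_title_short"}, {{3, "Business Analyst"}}),
    Mar = #table({"job_id", "job_title_short"}, {{4, "Data Scientist"}, {5, "Data Analyst"}}),

    // Append: stack tables that share the same columns into one table
    Appended = Table.Combine({Jan, Feb, Mar}),

    // Double check: the appended row count should equal the sum of the parts
    ExpectedRows = Table.RowCount(Jan) + Table.RowCount(Feb) + Table.RowCount(Mar),
    RowsMatch = Table.RowCount(Appended) = ExpectedRows,

    // Listing a table twice by accident inflates the count, which is how the duplicate gets caught
    WithDuplicate = Table.Combine({Jan, Jan, Feb, Mar}),
    DuplicateDetected = Table.RowCount(WithDuplicate) > ExpectedRows
in
    [Rows = Table.RowCount(Appended), RowsMatch = RowsMatch, DuplicateDetected = DuplicateDetected]
```

Pasting this into a blank query's advanced editor should return a small record, with both checks coming back true for this toy data.
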
all right let's now get into the bulk of this lesson moving into merge for this feel free to continue working with that workbook that you were working with in the last lesson if you didn't happen to save it or you got lost you can use the advanced transform workbook from the last lesson that'll pick right back up where we left off and then as usual the append and the merge workbooks are the final examples that you're going to see at the end of this which specifically for append you've already seen so let's actually get
into merging those queries for this I want to press alt F12 and right now we have three queries in here the data job salary which is basically like our fact table this includes all of our data going into transform and count rows we have as expected around 32,000 data points I'm going to go ahead and delete that step similarly we have this data jobs skills which has all of our skills in it let's see how many rows are in this by going up to transform and to count rows and this has 167,000 now it's important
to understand these numbers because we're going to need them whenever we actually get into the joins to see when we have missing or more data so I'm going to go ahead and delete the step of counted rows as well we don't need it then we have also this final query of data job skills count this was made as an example only we're not going to use this any further so I'm actually going to go ahead and just delete this to minimize my queries it's going to ask if
I'm sure I want to delete it yep so let's get into merging these queries I have data job salary selected and coming up to the Home tab under merge queries we're going to have merge queries and merge queries as new like we learned with append queries and append queries as new we're going to want a new query so that way we still have these source queries so I'm going to go merge queries as new with this the merge window pops up and it says select the tables and matching columns to create a merged table
specifically we want to go with the data jobs salary and we want to merge it on the job ID that's why we created that a few lessons ago we're trying to connect to the data jobs skills also on that job ID now down here underneath this there's a join kind and there's six different options for this left outer right outer full outer inner left anti and right anti now Kelly put together this fancy chart that shows visually what is happening with these merges and we're going to be walking through all of these briefly in
order to understand which type of join you should be choosing depending on which scenario you're in as a quick overview these circles signify the two different tables so in this case table A and table B and the shaded blue area shows what portion of the contents from those tables will be included in the final table first up is a left outer join and with this join what's showing here is that all rows from table A will be included in the final table and then from that center portion right there where A and B overlap
this signifies that it's only going to keep items from table B that match with table A so what does it actually mean so if we go here into join kind and select left outer what we get told next to this check mark is the selection matches 29,000 of 32,000 rows from the first table so what are those missing jobs well basically there's some jobs that don't have a skill now this isn't necessarily a bad thing although we're not going to go with this join this could
be an option we could use I'm going to click okay to load it in so right now we have it under this query called Merge1 and as you can see it's not repeating any job IDs basically we have the original data jobs salary table and then if we scroll all the way to the right we have the data job skills over here and if you see each one of these items is a table if I click on it and expand it to see hey what's in this table we can see that for this one the
job posting or job ID of 10,001 this is the table associated with it so I'm going to go ahead and actually delete this step and go back so what we could do is expand it out and there's this icon up in the top right hand corner I'm going to go ahead and click it and it's going to ask me how I want to basically expand it out and in this case I already have the job ID I already have job title short so I'll expand only job skills so now seeing
how these skills are broken out I can actually scroll all the way over and see that now the 10,001 ID is duplicated multiple times and if I actually look at the number of rows within this new data set we have 170,000 rows now technically this merge has exactly what we want but we still need to go through those other merge examples to understand them so we're going to show them as well now for this I want to go back to that merge window and I'm going to click the settings icon I need to
get rid of the expand step since we're going to be trying out different types of merges so I'm going to X out and then go in here and click the gear icon now it's popping back up we did left outer next thing we're going to look at is right outer for right outer this takes all of the rows out of table B and then from there any rows in table A that match are included now this one when we look down here it says hey the selection matches 167,000 of 167,000 rows from the second table if
you recall back from that left outer we had 170,000 so 3,000 higher why is that well that table A or data job salary has 3,000 rows in here that don't have any skills listed hence why this is 3,000 less this provides a similar type of merge to what we did before where we need to actually go over to that data job skills and expand it out selecting the job skills column and with this table we can just check that we have 167,000 rows which bam we confirm all right I'm going to get rid of these two steps
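
The left outer and right outer merges above can be sketched in M roughly like this. The two small tables and their column names are hypothetical miniatures of data jobs salary and data job skills, not the real data. The point is just to show how the nested table column gets expanded and why the left outer result ends up with more rows than the right outer one.

```powerquery
let
    // Hypothetical miniature versions of the two queries; column names are assumptions
    Jobs = #table(
        {"job_id", "job_title_short"},
        {{10001, "Data Analyst"}, {10002, "Data Engineer"}, {10003, "Business Analyst"}}
    ),
    Skills = #table(
        {"job_id", "job_skills"},
        {{10001, "SQL"}, {10001, "Excel"}, {10002, "Python"}}
    ),

    // Left outer: keep every row of Jobs; matching Skills rows arrive as a nested table per job
    MergedLeft = Table.NestedJoin(Jobs, {"job_id"}, Skills, {"job_id"}, "data_jobs_skills", JoinKind.LeftOuter),

    // Expanding only job_skills repeats a job once per skill; job 10003 keeps one row with a null skill
    ExpandedLeft = Table.ExpandTableColumn(MergedLeft, "data_jobs_skills", {"job_skills"}),

    // Right outer: keep every row of Skills instead, so the job with no skills drops out
    MergedRight = Table.NestedJoin(Jobs, {"job_id"}, Skills, {"job_id"}, "data_jobs_skills", JoinKind.RightOuter),
    ExpandedRight = Table.ExpandTableColumn(MergedRight, "data_jobs_skills", {"job_skills"})
in
    [LeftOuterRows = Table.RowCount(ExpandedLeft), RightOuterRows = Table.RowCount(ExpandedRight)]
```

With this toy data the left outer count should come out one row higher than the right outer count, which mirrors the 170,000 versus 167,000 difference caused by the jobs that have no skills.
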
we're going to move into the next merge next is inner join and this provides only matching rows from table A and matching rows from table B so depending how you're joining there could be missing data from both A and B for this one it's saying hey the selection matches about 29,000 of 32,000 rows from the first table which is what we expect and then basically all of the rows from the second table so this one if I actually go into it and then expand out those data job skills looking only at the job skills column with
it expanded out actually counting the rows we have once again 167,000 so missing those 3,000 jobs that don't have skills next is left anti and in this case it checks which rows don't have a match and returns those specifically for table A whichever rows don't have a match in table B it's going to return so in this case it says the selection excludes 29,000 out of the 32,000 when I go to load it I get the rows from table A or data jobs salary and it still has the data job skills column but
actually if I look in here we should be keeping only rows that don't match or don't have a value specifically there shouldn't be anything inside this table that I'm clicking on and as expected they're null values because it doesn't have skills so exiting out of navigation going back to source counting these rows we can see that we have 3,000 jobs basically with no skills for right anti this gets rows from the right table that do not have matches in the left table and for this with right anti selected the selection excludes 167,000 out
of 167,000 rows from the second table so basically everything from this table is excluded we're not going to walk through this in power query because this is also not what we want the final one that we're going to actually use is a full outer join this takes all rows from table A and all rows from table B and if there's a match it will join those two if there's no matches it's still going to return them in the table it will just be a null value for where it doesn't match up and this
says the selection matches 29,000 of 32,000 rows from the first table and all the rows from the second table loading this in once again we have data job skills we need to expand out and we only want to expand out those job skills and then from there just going to do a double check I'm going to do count rows and this has 170,000 rows in it so similar to our left outer we could have done either of these these are the two that we could want but I'm going to stick with this
one the full outer because I have all the work here anyway I'm going to close out the step and I think that's a great example of how sometimes there may be multiple joins that fit the need it's important that you go through and actually count the rows and understand the data set to figure out which one you need to use and for what purpose anyway one thing I glossed over real quick going back to source and that gear icon is that right underneath the join kind they have use fuzzy matching to perform the merge
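
Stepping back to that advice about counting rows for a moment, here is a rough M sketch that runs the same merge with all six join kinds and counts the rows after expanding. The tiny tables, the column names (including salary_year_avg) and the helper function CountRows are all assumptions made up for illustration.

```powerquery
let
    // Hypothetical miniature tables: job 3 has no skills, and every skill row has a matching job
    Jobs = #table({"job_id", "salary_year_avg"}, {{1, 90000}, {2, 120000}, {3, 85000}}),
    Skills = #table({"job_id", "job_skills"}, {{1, "SQL"}, {1, "Excel"}, {2, "Python"}}),

    // Helper (made up for this sketch): merge on job_id with the given join kind,
    // expand only the job_skills column, and count the resulting rows
    CountRows = (kind) =>
        let
            Merged = Table.NestedJoin(Jobs, {"job_id"}, Skills, {"job_id"}, "skills", kind),
            Expanded = Table.ExpandTableColumn(Merged, "skills", {"job_skills"})
        in
            Table.RowCount(Expanded)
in
    [
        LeftOuter = CountRows(JoinKind.LeftOuter),
        RightOuter = CountRows(JoinKind.RightOuter),
        Inner = CountRows(JoinKind.Inner),
        FullOuter = CountRows(JoinKind.FullOuter),
        LeftAnti = CountRows(JoinKind.LeftAnti),
        RightAnti = CountRows(JoinKind.RightAnti)
    ]
```

For this toy data the pattern should match the walkthrough: left outer and full outer agree with each other, inner and right outer come out one row lower, left anti returns only the job with no skills, and right anti returns nothing.
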
right now we're doing basically exact matching as the job ID of 10,001 we're matching up exactly with the 10,001 from the other table fuzzy matching allows you to connect two tables that have basically non-exact matches so in this case we have table A with a student ID and a student's name and only their first name but then in table B we have the student name full so first and last name and the grade with the fuzzy matching we could merge table A and B based on that student name first column and the student name full
column now what happens if we get to where we have multiple students with similar first names it's going to create a hot mess so I don't always recommend using this unless you know the data and you know you're not going to cause complications with it so that was a quick overview of the different joins within power query if you want a more in-depth tutorial for how this is done then you can check out my SQL tutorial where I go through it with all the different SQL analysis that we do in that course and break
it down step by step I'll include a link to that video right here for you to go and see it all right so we have the final table that we actually want for this remember this does have duplicate values in it so you have to keep that in mind anytime you're doing analysis I'm going to rename this as data jobs merged one last thing before close and load we have this job skills column which is sort of redundant right now because we actually have the data job skills dot job skills column with the actual skills itself so
I need to get rid of this column I actually want to do this right after the source step before we even break this out so I'm going to select job skills and select remove columns it's going to ask if I want to insert the step which I do and then after we remove the columns we go into expanding it out and because we did it in that order I can actually come in here and instead of adding a rename step I can just rename it via the formula inside of the expanded step and just
change it to job skills and bam now I only added one step instead of two all right go ahead now we're going to close and load to I'm going to want a pivot table and also pivot chart so I'm going to select the pivot chart option here and underneath queries and connections it's going to show that it's loading this in here under data jobs merged so let me show you what we're going to be creating with this I want to build this visualization that's showing what is the salary of the top 10 skills top 10 skills
by count for data nerds and this is a combo chart we're going to have not only the salary or the average salary for a skill but also for this line portion we're going to have the associated count or how many jobs the skill appears in all right so I'm going to go ahead and move this pivot chart out of the way and select the pivot table remember we want to use the job skills we're going to be analyzing that so I'm going to throw it in the rows the first thing
I'm going to look at is the easiest which is the count of these job skills and I'm going to rename this to job count along with changing the value field settings going to number format I want to change the number specifically I want to use a thousands separator with zero decimal places I'll go ahead and press okay so we have a count now we want the average salary so I'm going to take salary year average drag it into the values right now it's doing a sum so I'll go into value field settings select average and then
for number format we're going to do currency with zero decimal places click okay and okay again and I'm going to change this one to average salary and then specify the units of USD all right so now X-ing out of this and X-ing out of this now our pivot chart is sort of all jacked up well it is jacked up mainly it's trying to plot this as like a dual column chart and that's not what we want so we're going to change this design of it going to design change chart type I'm going to go over
to combo and then underneath here for the combo for the job count I want that to be a line so I'm going to go up here and select line and for the average salary I actually want that to be the column now I want the job count on a secondary axis I don't want it on the same axis as the salary itself because they're just not proportional I'm going to go ahead and click okay I want to clean this up a little bit further by removing the legend and then also right clicking here and hiding all field
buttons on this okay now there's still too many skills on here remember we want the top 10 skills so going into the pivot table itself I'm going to come up into the filter into value filters and top 10 we're going to do top 10 items by job count all right this is getting a lot more readable now because I have the top 10 by job count I want to order this from high to low by salary so I'm going to go to more sort options and we're going to do descending on average salary I'll
click okay and bam now we're getting somewhere so we're seeing things like spark and AWS have the highest and Excel did make the top 10 so it's on there at 100,000 other things I'm going to change selecting on this pivot chart is the actual design itself you know how I am about colors so we're going to change the colors I'm going to use this monochromatic palette 8 I want the line to be a lighter color than the actual bars themselves I'm going to go ahead and add axis titles for primary vertical and secondary vertical
for this I'm going to just select the box go into the formula bar and say hey for this one make it equal to average yearly salary for this one selecting the box going into the formula bar pressing equal I'm going to make it equal to job count I'm also going to add a title to this I'm going to title this what is the salary of the top 10 skills of data nerds and remember this is for all data nerds and the great thing about joining these tables is that
now we not only get salary data but we can get job title information so I'm going to add a slicer now by going into pivot chart analyze insert slicer adding in that job title short I'm going to move that out of the way now I'm going to go to slicer I'm going to rename this to a more friendly title of job title and now let's actually look at it for data analyst so with this it looks like python is among the highest Excel still makes that top 10 for data analysts at 86,000 it's also if
we look at this it's the second most important skill behind SQL which has a value of 96,000 let's see what it is for a business analyst once again SQL and Excel are two of the highest and for business analysts Excel is paying 87,000 so bam we just showed the power of well append but also more specifically merge we can now take this analysis to another level relating skills to other data points from our main fact table or that data jobs salary table that has all of the data in it so now you have some practice
problems to go through and get more familiar with using both append and also merge after that we'll be jumping into the last lesson of power query focusing on the M language as I warned at the beginning don't worry if you don't have coding experience or anything like that we're going to be taking it nice and easy and you're going to be able to follow along and figure it out pretty easily we're going to be doing some final prep before we finally send this data set on over to power pivot which we're going to cover
in the next chapter all right with that I'll see you there welcome to this final lesson on the M language and we're going to be going into some pretty advanced techniques and understanding how to read and better utilize the M language in building your power query queries anyway nothing that we actually go through and do in this lesson is going to be used to build on our project so if any time you're not following along or you're not able to do anything don't worry too much nothing's actually going to be used it's more to inform you
about the M language so you get more familiar with it as a disclaimer you will not be an expert on M language you will not be able to code in M language after this mainly you'll just be able to look at it understand what's going on and make slight adjustments if necessary feel free to continue working in that workbook that you've been using previously where in the last lesson we looked at the top 10 skills and what the salary is for them however if you got lost or weren't able
to follow along or are just starting over feel free to use this merge workbook once again don't use that M language one that one's going to be what is going to be done at the end of this lesson so what are we going to be covering in this lesson well if you open up the power query editor we can navigate into it we're going to be covering three main things first is the advanced editor actually walking through a previous query and understanding how to read it and then from there under the add column tab we're going
to go into these different examples on creating custom columns and also custom functions so what exactly is this M language well if we dive into the documentation we can see that the power query engine uses a scripting language behind the scenes for all power query transformations the power query M formula language also known as M so although we're doing all these edits inside of this power query editor behind the scenes if we navigate to something like the advanced editor it's actually using this M language right here to carry out all the transformations and it goes on to
say if you want to do advanced transformations using the power query engine you can use the advanced editor to access the script of the query and modify it as you want it even goes on to discuss that if you're not finding what you need in the actual GUI or the graphical user interface of the power query editor you can use the M language editing it in the advanced editor for this so let's go into breaking down this M language more by going to that data jobs merged and entering the advanced editor and we're going
to be just breaking down this simple query right here up here on the right hand side there's a few different display options I'm going to do this render whitespace basically it shows me the indentation that's going on here right now I'm seeing that there's four spaces in here anyway the key thing here is we first have this let keyword and then the in keyword the let begins basically the definition block if you will this whole portion right here for defining different variables and specifically different tasks if we look we have things like source expanded
data job skills sorted rows removed columns renamed columns if I go ahead and move this over to the right those applied steps are the same thing those are the variables themselves I currently have enable word wrap enabled and I'm not liking the format and how it looks so I'm going to go ahead and unclick that finally we have the in keyword and then this displays the final value that we want to appear for our query so in this case we want the final value of renamed columns or the last applied step to be what appears now
this advanced editor I'm going to expand it back out again is also a syntax checker so in this case let's say I deleted this quotation mark at the end of this renamed columns one it's going to give me these red squiggly lines to say that hey there's something wrong here and two it's going to actually give you an error of invalid identifier and so we would know that we probably need to fix this so we're not going to be breaking down much more of the formulas here but I do want you to
spot two main things from this the first thing is this column names are always put in quotes in here and conveniently they're also highlighted in here so if you needed to do any changes to column names or see what's happening that's one quick way to identify it the next thing is this every step that is taken refers to the previous one what do I mean by this so this first step is assigned the variable of source and I know it's assigned this variable because it has an equal sign right next to it
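
If it helps to see the let ... in idea outside of Power Query, here's a rough Python analogy (not M itself): each applied step is a variable assigned with an equals sign, each step takes the previous step as its input, and the last line hands back whichever step you name. The rows, column names, and helper functions are made up for illustration.

```python
# Rough Python analogy of an M "let ... in" block (illustrative only, not M syntax).
# The rows, column names, and helper functions are invented for this sketch.

def sort_rows(table, column):
    """Return a new table (list of dicts) sorted by one column."""
    return sorted(table, key=lambda row: row[column])


def remove_columns(table, columns):
    """Return a new table without the listed columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in table]


def rename_columns(table, mapping):
    """Return a new table with columns renamed according to mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in table]


def run_query():
    # "let": every step is a variable, assigned with an equals sign
    source = [
        {"job_id": 2, "title": "Data Engineer", "tmp": "x"},
        {"job_id": 1, "title": "Data Analyst", "tmp": "y"},
    ]
    sorted_rows = sort_rows(source, "job_id")                # uses source
    removed_columns = remove_columns(sorted_rows, {"tmp"})   # uses sorted_rows
    renamed_columns = rename_columns(removed_columns, {"title": "job_title"})
    # "in": the value of the whole query is the step named here
    return renamed_columns


result = run_query()
print(result)
```

Returning a different step name at the end, say sorted_rows, would make that earlier step the output of the query, which is the same role the in line plays in the advanced editor.
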
and then whenever we go to the next line of expanded data job skills inside this function of Table.ExpandTableColumn it references source and if I hover over it I can see that it's giving me the same formula for source which is right above it so basically it's plugging right into it similarly this expanded data job skills is going to be located in the next one below it on sorted rows and it's going to be the first value in here for this Table.Sort and if you're curious about what these different functions are doing
you can just hover over it as well in this case Table.Sort sorts the table using one or more column names and comparison criteria and it tells us via the syntax inside the parentheses that the first parameter is table so it takes that previous variable which is a table anyway one minor last thing about this if you notice these variables have a hashtag and then double quotes on each side and that's because they have white space in the actual names that we're using for this in the case of
source there's no white space it's only one value with no white space so it doesn't need to have this around it anyway why am I yapping about all this stuff because you need to understand this M language anyway we're going to actually recreate this data jobs merged query I'm going to select it all press ctrl C to copy it then from there I'm going to close out of it we're going to now create a new query so underneath the home tab I'm going to go to new source and then under that other sources and
I'm just going to go into blank query okay right now this is completely blank but I can go into that advanced editor of query 1 and it has the let and in still and obviously nothing going on here what I can do is just highlight this all and then using ctrl V paste all of that other query into this now when I press done it goes through and actually creates that same exact query from data jobs merged now I could have gone through and right clicked data jobs merged and clicked duplicate but this is more
to show that you can actually go in copy queries or copy portions of queries and then paste them into other ones which we're going to do in a little bit so let's get into more of learning about the M language by actually cleaning up this query 1 that we just created by using this column from examples first thing though I do want to rename this query 1 this is the one we'll be working with for the remainder of this lesson and I'm going to call it data jobs clean because that's what we're going to
do we're going to clean it up so we have four major tasks that we're going to do with this the first is for job schedule type I just want to extract the first value out of here that's full-time in this case additionally we're going to be using the date and datetime columns to extract the year and also the hour of the job postings and then finally we're going to do some data cleanup on this job title column that frankly is a mess specifically we're going to clean up job titles that have text like remote with parentheses
around it anyway let's start with this first one of this job schedule type if I go into view and then look at the column profile it looks like we have that full-time contractor part-time and whatnot but we have a lot of combinations of full-time and part-time contractor and temp work full-time part-time and internship I basically want to go through and just extract out what is the first value that appears in here so in the case of this full-time and part-time I just want to extract full-time and for contractor and temp work only contractor so under add column
and then column from examples we'll do from selection and this appears at the top add column from examples enter sample values to create a new column ctrl enter to apply so I'll first go by entering full-time and it's already picking it up I'm just going to type it in first okay and then I'm going to scroll down and in this case I'm going to put in hey I want full-time for this one this is the example remember so now it's cleaning that up let's scroll down further to see if it's done this fully for even
more okay it's getting the first of these and you might think that this is correct but the problem we're running into now is if we go down to this one where it says contractor it only says contracto and just looking at the formula this is the formula it's generated so far it's doing Text.Start and nine I don't really know too much what's going on here but I'm assuming that it's taking the first nine characters that's not what I want so inside this contractor one I'm going to type in contractor with an R so that way
it hopefully fixes this so this is good and now it has Text.BeforeDelimiter and a space so I'm going to go ahead and click okay to load this in so let's scroll down to just inspect it to make sure that we have this correct and an easier way instead of scrolling down and trying to find something I can just use this drop down right here and look in here and it looks like we're good except we now have a comma here specifically I have a full-time and then a full-time comma so what's going
on here well for values that have more than two so three they actually insert a comma in there and when we inspect our formula opening up the formula bar here it's only checking for a space so the easiest way to fix this is actually just like we did before we're pretty familiar with it let's go to the transform tab and then under replace values we want to go to replace values specifically we want to find commas and we want to replace them with a blank bam so now pulling down that drop down we don't have
multiple different full-times we just have that single one without the comma we have what we want all right we're going to rename this and I could just go ahead and double click this and rename it but I'm actually going to do something first I see that I have the step already for renamed columns so I'm going to take that and I'm going to drag it to the bottom now with renamed columns as the last step I'll then rename it to job schedule type first press enter and then it inserts it into that current step
as we can see from here because we're now familiar with it and we don't have multiple renamed columns steps in there and then finally you know how I get about column ordering this job schedule type first I want it next to the job schedule type so I'm going to drag this on over here see how long it takes and we've moved it over and we now have this new step of reordered columns all right let's look at some other quick examples for column from examples for this we're going to be using the job posted date for
this using column from examples I'm going to select from selection now with some of these things whenever I type in this box I want to get let's say the year in this case if I were to just type in a one it would pop up with all these different options that we can do and so this provides a lot of different options for example if I wanted to do the month I could do that and pressing enter it's going to copy it all the way down that's not what I wanted
in this case though I'm going to double click it again type 2023 and scrolling down and looking through this there's this option here of year from job posted date so we're going to go with that then press enter and looking at the transform we can see what is the M language code that it used for this it used the Date.Year function putting in job posted date this is what we want we'll click okay you know how I am with naming so we're not going to keep this named year so I'm going to modify this M
language to be job posted year with that renamed let's actually move over to our other example extracting out the hour for this we're going to be using that job posted datetime column column from examples from selection in this case I want the hour out of it so I'm just going to put something like nine and we can see that we also have this here for hour from job posted datetime I want that one press enter again inspecting the M language formula it's extracting the hour out of this one I'm good with it
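
For anyone who wants to see the same idea outside of Power Query, here's a small Python sketch of what those two generated columns are doing: pulling the year out of a date and the hour out of a date-time. The sample timestamps and the column names job_posted_datetime, job_posted_year, and job_posted_hour are assumptions for illustration, and this is Python rather than the M that column from examples writes for you.

```python
from datetime import datetime

# Made-up sample rows; the column name is an assumption for this sketch.
rows = [
    {"job_posted_datetime": datetime(2023, 6, 16, 9, 5, 0)},
    {"job_posted_datetime": datetime(2023, 1, 14, 13, 18, 7)},
]


def add_date_parts(table):
    """Return a new table with year and hour columns added to every row."""
    out = []
    for row in table:
        posted = row["job_posted_datetime"]
        new_row = dict(row)  # copy so the original table is left untouched
        new_row["job_posted_year"] = posted.year
        new_row["job_posted_hour"] = posted.hour
        out.append(new_row)
    return out


with_parts = add_date_parts(rows)
for row in with_parts:
    print(row["job_posted_year"], row["job_posted_hour"])
```

Like an applied step, the function returns a new table and leaves the previous one alone, so removing the step gets you back to the earlier state.
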
I'm also seeing the other values are updating correctly I'll click okay and we have our new column called hour which you know me we're going to fix this and update hour to job posted hour press enter all right now we got it so you're probably like look I already know how to go to something like the transform tab and already extract out that information using these functions that we used before well that was mainly a primer for this next example we're going to be doing and that's that with this job title column there's some job
titles in here that have a lot of sort of frivolous information that we don't need like in this case supervisor information technology specialist and then in parentheses it has associate director I don't need anything in parentheses similarly for this senior data engineer I don't need this remote in here so let's select this job title go into add column column from examples and from selection for this first one with the associate director I'm going to select it so it appears below and then just highlight what I want press ctrl C and then paste it in
here then scrolling through here to do a cursory check I'm seeing that senior data engineer remote is in here I could select it and copy this down here another option is I just go in here double click it since it's now populating and delete out that remote press enter and it looks like it's doing this it's getting the text before the delimiter for job title specifically before the parenthesis and it looks like in this case of university grad data scientist PhD only now hiring it removed all that okay so this is now doing what we want click
okay and I don't want to call this column text before delimiter I want to call this job title clean pressing enter all right so last thing I want to now clean up these columns and you know how I get I want the year and hour to be next to the datetime and the job title clean to be next to the job title I could drag and drop these but I'm going to show you something else this reordered columns step we're going to be modifying the M language for this and I don't want reordered columns to
appear more than once so I'm going to take it once again and drag it to the very end now what I can do is take and modify this M language that we have in here now if we actually inspect this reordered columns it may do this or may not but in my case it didn't add anything after job skills it basically lets any new columns just fall towards the end so after this job skills all these other columns aren't included which is not a big deal so what I want to do is I want to move
this year and hour to near job posted date and job posted month so I'll enter inside of here and put in job posted year and also job posted hour making sure we're putting commas after both of those then I'm going to run this to make sure there's no issues with it and it looks like it moved it over inspecting it next to job posted date we have our month and also year and hour all right the last one is this job title clean and I want this to be right after job title so I'll go ahead
and put that in right here making sure to put a comma after that and then from there pressing this check mark up here to move it inspecting over we have job title clean right next to it our next thing to look at is custom column we'll go ahead and actually just select this and whenever we pull this up this tells us this allows us to add a column that's computed from the other columns it provides a box to basically put in the new column name but right here this is where we put in the custom column
formula or the M language to maybe clean it up now let's start with something simple let's say I just wanted to repeat the job ID column I would come over here select job ID click insert and it's going to put it in notice that the column name itself is inside of brackets and I'm going to name this job ID repeat down at the bottom it's telling me that no syntax errors have been detected I'll click okay and then I get this new step for added custom and we can see hey it's job ID repeat scrolling over yep
it repeated it if I want to go back in to edit it I'll press that settings icon and it's going to pull this back up so let's do something a little bit more complex now and it's going to involve the salary year average column and that salary hour adjusted column go ahead and cancel out of this what I want is to create a new column that if there's a salary year average value it will basically be in that new column and then if there's a salary hour adjusted value it will be in that column instead
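
Before getting to the formula itself, here's a minimal Python sketch of the logic being described for the new column: take salary year average when it has a value, otherwise fall back to salary hour adjusted. Python's None stands in for null, and the column names and sample numbers are assumptions for illustration, not the course data or the M formula.

```python
def combine_salary(row):
    """Use the yearly average when present, otherwise the adjusted hourly value."""
    year_avg = row["salary_year_avg"]
    if year_avg is not None:
        return year_avg
    return row["salary_hour_adjusted"]


# Made-up rows: exactly one of the two salary columns is filled in per row.
rows = [
    {"salary_year_avg": 140000.0, "salary_hour_adjusted": None},
    {"salary_year_avg": None, "salary_hour_adjusted": 82000.0},
]

# Add the combined column to a copy of every row.
combined = [dict(row, salary_year_combined=combine_salary(row)) for row in rows]
for row in combined:
    print(row["salary_year_combined"])
```

The check is written as "is not None" rather than a plain truthiness test so that a legitimate value of zero would still be kept.
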
just as a forewarning anytime salary year average is null there's always a value for salary hour adjusted and vice versa so like I said we're not going to become coding experts with this so I recommend making use of chatbots like ChatGPT Gemini or whatnot lots of free options available out there anyway we have this prompt of generate a power query formula for a custom column I'm building make the column salary year average if it's not blank otherwise it is salary hour adjusted now it's giving me the entire M language right this is what
we'd be providing to the advanced editor providing that previous step name what column we're using everything like that I really care about this formula right here specifically everything after the each I'm going to copy this from if all the way to the end that's the actual code right here going back to the custom column I'm going to delete that job ID out of there I want to make sure that there's an equal sign still there and I'm going to paste this in and you can see from this this is just basically an if formula it's doing
if salary year average is not equal to null then salary year average else salary hour adjusted down at the bottom we can see that no syntax errors have been detected so I'm going to go ahead and click okay so bam we now have this I did leave that job ID repeat name here so we're going to actually change that and rename it to salary year combined and then click the check mark in order to rerun that formula to update the column and you know I like to have my steps in order so I'm
going to grab reordered columns and I'm going to drag it to the very end and for this one I'm just going to drag salary year combined over to right after salary hour adjusted so now scrolling down just to double check it it looks like we got 140,000 here 140,000 82,000 82,000 there so the formula filled out correctly so let's get into our final task so we've been working on this data jobs clean data set we made this salary year combined which is pretty useful actually what happens now if we want it in something
like data jobs merged what do we need to do to actually add it into here because we have everything we need for it specifically we have that salary year average and we have the salary hour adjusted columns well we could recreate it in here going through all those steps creating that if statement or we could just copy it out of the advanced editor and bring it in here so I'm going to go back to data jobs clean and then under home advanced editor I'm going to go and find the step that's in here specifically it
was this one of added custom and I'm going to copy it because I can see that hey it has the salary year combined in it I'm going to select it all the way to the end and I'm going to copy it by pressing ctrl C okay go and close out of this one and then move over to data jobs merged go into the advanced editor and I want to insert it in right at the end so I'm going to go to the end of this let block going to press enter and then
from there press ctrl + V to paste it in now I'm already getting an error message and it's saying hey token comma expected basically and it's not getting it if I scroll over I can see these squiggly lines right here basically if we look there's commas after every one of these variable definitions so I need to come up here and put a comma in there next is this a comma cannot precede an in so if we scroll over we can see this is red highlighted probably wrong to have a comma here
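
To make those two comma rules concrete, here's a small Python sketch that assembles the text of a let block from a list of steps: a comma goes after every step except the last one, and the line after in names the final step. This is only a text-building illustration under some assumptions: the table name jobs, the step expressions, and the column names are hypothetical, and the rule for when a step name needs the hashtag-and-quotes wrapper is simplified.

```python
import re


def quote_step(name):
    """Wrap a step name in #"..." unless it is a plain identifier (simplified rule)."""
    if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        return name
    return '#"' + name + '"'


def build_let_block(steps):
    """Build query text from an ordered list of (step_name, expression_text) pairs."""
    lines = ["let"]
    for index, (name, expression) in enumerate(steps):
        is_last = index == len(steps) - 1
        comma = "" if is_last else ","  # no comma on the step right before "in"
        lines.append(f"    {quote_step(name)} = {expression}{comma}")
    lines.append("in")
    lines.append(f"    {quote_step(steps[-1][0])}")  # the query returns the last step
    return "\n".join(lines)


# Hypothetical steps; each expression names the step before it as its first input.
steps = [
    ("Source", 'Excel.CurrentWorkbook(){[Name="jobs"]}[Content]'),
    ("Renamed Columns", 'Table.RenameColumns(Source, {{"title", "job_title"}})'),
    (
        "Added Custom",
        'Table.AddColumn(#"Renamed Columns", "salary_year_combined", '
        "each if [salary_year_avg] <> null then [salary_year_avg] "
        "else [salary_hour_adjusted])",
    ),
]

query = build_let_block(steps)
print(query)
```

Appending a fourth pair to the list moves the missing comma down to the new last step, which is the same adjustment you make by hand when pasting a step at the end of a let block.
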
so we'll get rid of it now we're not done it's going to say there's no syntax errors but we didn't complete this remember each step has got to reference the previous step name in it so even though it says no syntax errors if I try to click done and go to load it I'm basically getting an error I can see this by this error bar at the top of each one of these columns also there's only one applied step and it's calling it data job
merged which is the actual query title itself but we need to fix this query and actually get it back to where it had multiple different applied steps so I'm going to go back to the advanced editor we're going to show what we did wrong here and that has to do with remember what we had before where with something like removed columns you reference the previous step in it so in this case removed columns right there well renamed columns is the last one we had I'm going to go ahead and copy this with ctrl C but yet
we have in here inserted text before delimiter 1 which is not correct so I'm going to select all of that and replace it by pressing ctrl + V so we have the renamed columns now one other thing we have to do this last statement the in portion needs to be referencing that last variable of added custom so I'm going to go ahead and copy this with ctrl C and then paste it in with ctrl V click done and now scrolling all the way over we can see that we have that salary year combined column that we created
in the last query it's at the end we do need to move it over but it's in there nonetheless so it helps to understand these queries now one quick thing before we go we've gone through basically every single thing in this chapter on power query up to this point with the exception of this invoke custom function this basically invokes a custom function defined in the file for each row of this table this is more advanced and beyond the scope of this course we're not going to be covering it but it is available for you to dive
into say you're doing a lot of different imports and you need to automate the imports that you do this would be a path you would go down but for beginners like us I'm going to say stay away from it for the time being so this now wraps up the M language and that was really a crash course in understanding how to use it by no means do you need to be a professional or an expert coder in the M language if you got lost at any point along the way nothing to feel ashamed about
this is a pretty complex topic if you would like to learn more I do recommend this book which is M is for data monkey it's a good little read talking about not only power query but also how to manipulate the M language I'll include a link in the description below anyway power query in my opinion is one of the most important features and most powerful tools within Excel and also Power BI and so it's worth your time investing in learning it and so this all culminates and we've now finished covering power query in this chapter
in the next chapter we're going to be jumping into power pivot and that's going to be jumping into actual data modeling but before that for those that purchased the course practice problems you have some practice problems to go through and get more familiar with that M language before proceeding forward all right with that see you in the next one welcome to this chapter on power pivot and this chapter consists of four different lessons where we're going to go through an intro to power pivot and over the window that it actually provides then from there looking into DAX or
data analysis expressions which is a formula language very similar to Excel formulas but before we actually jump into this lesson and go over what we're going for in it we're going to focus on what exactly is power pivot so here I am in Excel and this is meant for me to just go through and quickly explain what is the power of power pivot I know that pun is getting sort of old by now but it really is powerful if you're curious about looking at it it's in the workbook of power pivot intro part one part
two is what we're going to be using for the actual lesson so in power query in the last chapter we ended up cleaning up our data set to have these two main tables first is data job salary which has the complete data set on all the data science job postings and then data job skills which is unique to the skills for a job we also created a data jobs merged table but that table is actually going to be well it's pretty much obsolete and power pivot is going to help replace that and for good reason so
what exactly is power pivot well it's an add-in we're going to get to adding it in and it has a few different features that you can use within it such as accessing the data model adding measures KPIs and whatnot this lesson is going to be going over this tab as a quick refresher power pivot is going to be available in basically any version of Excel for Windows past 2010 but it's not available at all in either the Mac version or the online version of Excel so you won't be able to do this chapter if you have those
versions or the final project anyway the core portion of power pivot is actually managing a data model and what's a data model well a data model defines how data is basically structured stored and also related in this case we have the data job salary table right here and we have the data job skills table what we can do with power pivot besides modeling these tables and showing how they're structured is the more important thing of creating a relationship in this case I created a relationship between the job ID of data job salary and that of
data job skills and because I created this relationship I can look at things like the job title short column and see how many jobs it has with it but I can also query across a table over to the job skills and see how many skills it has with it in fact let's actually do that real quick here I have my data model itself I have my two tables which are shown anyway I can look at things like what is the count of the different job titles themselves I'm going to do that on job ID and like
we've done plenty of times before here's the job count with a little cleanup of the actual text here but now with power pivot I can actually reach across to that other table of data job skills and drag the job skills into here and this is telling us obviously the count of the skills based on the job title pretty cool that we can reach across the tables and do this now the other cool thing that power pivot unlocks is DAX or data analysis expressions recall previously that we were using the average of the salaries and
like we learned way back earlier in this Excel course we actually prefer a median salary but unfortunately looking at the value field settings window here there is no option to actually pick median from this and that's where DAX comes to the rescue with this I can go to something like the power pivot tab and now create a measure which is where you actually insert your DAX and I can create a new one called median salary and we're going to be using this DAX formula in this case I'm going to use the median formula
very similar to the Excel formula and I can do it on the entire salary year average column here I'm going to format it real quick and then press enter anyway bam now because of the power of DAX we have the ability to get the median salary and DAX can do some pretty complicated calculations so in the case here we have this job count and count of skills and we want to see what were the skills per job specifically in this case what is something like C2 / B2 and then dragging
all the way down and filling it for all these this provides a much better analysis of what's going on with these values of counts and skills here when we get this proportionality we can create this with measures as shown in this final pivot table that we're going to be creating in the third lesson of this chapter so in summary power pivot provides us the opportunity to now model our data which one allows us to create relationships and two unlocks these measures that we can create using DAX all right so let's
get into this lesson and what we're going to be focused on well first thing is we're going to enable the power pivot add-in and then from there actually get into data modeling or modeling our data that we imported through power query after we have everything set up with our data model we're going to then move into performing our first analysis analyzing based on a job title how many different skills they have associated with it like I said we'll eventually get to that skills per job in an upcoming lesson so for this you can
continue to work in that workbook that we were working with in the last chapter on power query we're going to continue to work on that because we want to use those queries that we built if you got lost along the way and just want to start back up we're going to be starting from that M language workbook back in the power query chapter as a reminder these lesson workbooks are the completed workbooks at the end of the lesson specifically for this lesson part one was just that intro part two is what will
be done at the end of this lesson anyway here I am in the M language workbook we need to get into enabling power pivot right now you probably don't see power pivot up at the top of the tabs so I'm going to go into file and then go down to options from here I'm going to select add-ins like we did before and instead of Excel add-ins we're actually going to be using those COM add-ins I'm going to click go and they have three different ones available data streamer power map and power pivot we want power pivot
I'm going to go ahead and click okay now power pivot should appear up at the top all the way on the right hand side and should look something like this quick little overview of this tab manage here pops up the power pivot window which we're going to be doing a deep dive on in the next lesson we're going to use it a little bit in this lesson but anyway that's one way you can actually access it you can also go to the data tab and then here under data tools you should see it also and you'll
be able to manage your data model and once again it will pop up the window additionally on this tab you have the ability to create measures and KPIs which we're going to be diving deep into in the third and fourth lessons if you have a table within your worksheets you can add it to your data model you can also go about detecting relationships although I don't find that this feature works that well and then finally they have settings and settings I don't really touch that much nor does it have much control here so let's actually get
into importing some data into our data model we're going to do a simple example first here I created a new sheet made three columns of ID name salary and then different values associated with it one way I can add to the data model if I have data in a table is to use this feature of add to data model in this case my table has headers I'll go ahead and continue and then it will pop open power pivot a similar environment to Excel exists here but I can't actually edit any numbers in here this
is just how you're modeling your data if I needed to actually edit it I'd have to go back to the sheets and like I said this isn't a method I typically use I typically have bigger data sets not located in tables so I'm going to go ahead and right click this down at the bottom this table name of table two click delete it's going to say hey are you sure you want to delete this table and bam it's gone all right so now there's nothing in our data model right now here we are still inside the power
pivot window and if you've noticed in the home tab right here it has the option to get external data they have options for you to actually connect directly with power pivot to things like a SQL server or Microsoft Access you could also get it from some sort of data feed and then this option would probably be more useful in that it has a lot of different sources you could use such as other Excel files text files such as CSVs and whatnot now you may be asking yourself I'm going to close out of this
power pivot why would I import with that when we just went through power query to get data which one should I use and when well it's very important to remember the purpose of the tool that you're using power query is an ETL tool extract transform and load we did a lot of transformations with our data set and so that's really the power of power query and then it loads it in power pivot's strength is not in ETL or data cleaning instead it's in data modeling creating these relationships and DAX now you
may be tempted to come inside of existing connections and try to connect to specifically that salary and skills and if we went through like in the salary case and tried to click open we're going to get an error message and I'll be honest this is really confusing because we have these workbook connections why isn't this working well it really just comes down to naming conventions and the fact that power query connections are not the same as power pivot connections but we have a fix for this we just need to exit out of the power
pivot window here inside of queries and connections remember you can get to that by going to the data tab and going to queries and connections we can go to something like data job salary which right now is a connection only right click it and go to load to right now it's only under only create connection but we need to check this check box of add this data to the data model I'm going to click okay it's going to go through this process of loading the data and now it talks about the rows that are loaded but mainly
if I go to the connection it has this new connection now of this workbook data model which if I go to and actually open up or manage our data model we can see that it's inside of here we have this basically sheet for the table itself of data job salary inside power pivot inside the data model now we do need to get that other pivot table or other table into there as well so I'm going to go to queries data job skills right click this load to and also add this to the data model okay
it says 167,000 rows are loaded and for connections it's still only going to be one connection because we only have one data model in this case and now when I go to manage the data model I have basically two sheets down here or two tables and now we have the data job skills in here anyway I want to do some cleanup real quick I'm going to clean up for power pivot because this data jobs merged and this data jobs cleaned are going to be very confusing like I said we're not using these mainly for the
fact that we have duplicate values in here for senior data scientists in this case and then for the salaries and so if we don't manipulate this in a correct manner we're going to get the wrong results so we're just going to get rid of these so for data jobs merged I'm going to right click and select delete and it's going to say hey are you sure you want to delete data jobs merged yes I do and then I'm going to do the same thing with data jobs cleaned right click it and select delete also if you have
these tabs down here for data jobs cleaned or merged you can go ahead and delete those as well with our models now cleaned up let's actually get into going over really briefly this power pivot window with this we have three main tabs of home design and advanced for advanced we're not going to go into a lot of things inside of this if any at all it's beyond the scope of the course we're going to be focusing mostly on the home and the design tab so with this tab we've already gone over get external data but
we can do things like refresh our data if we know that it's updated in power query generate pivot tables and pivot charts based on our data model itself change the formatting of a particular column in this case it's recognizing it as text if we go to the data jobs salary data we can actually scroll over and see that for the salary year average column it knows that it's a currency we did a lot of this cleanup right in power query and setting these different data types so this saves a lot of steps here in power pivot
if it wasn't done now we have options for displaying the table below so we can actually sort it we can filter it or sort by a certain column they also provide options to find a specific value within here and then these features for calculations I don't find myself using that much as far as the autosum anyway over on the right the most important thing I find is that it allows you to toggle between the different views of your data set so right now this is the data View and if I scroll over here this is the diagram
View and this is going to show our two different tables side by side I'm going to move them over and actually expand this one out to show all the different columns and then the data job skills now back on that data view clicking that we have data view but also below this we have this calculation area which I can toggle on and off calculation areas are where we're going to be storing our different measures that we build with DAX and so they'll be appearing underneath here if we have any hidden columns we'll be able
to toggle them on and off right now I don't have any hidden columns now one thing to note with this data cleanup some of that we did before with formatting stuff some of it's going to be quite limited you may not be able to do it like in this case so data job skills has this job title short column and actually if we look at the data jobs salary data set we have the same repeated column in it so in data job skills this job title short right here is unnecessary now I could right-click it
and try to delete the column and it asks me if I want to delete it but it's going to tell me it's not going to be able to do it because it was created by a query i.e. through Power query and instead I should actually update it through Power query which I would actually argue is best practice anyway so I could exit out of power pivot launch power query by pressing alt F12 then go into the data jobs skills query and if I want I can just select this column and select remove columns but you know how
I am I like to actually clean up the applied steps because depending on how large your power query query is it could take a long time to load a column and then unload it unnecessarily so if I go to this remove other columns step that's the first time that it appears in it I can remove this by deleting it out of there then pressing enter we may get an error message we may not I'm not sure going to the last step in here I notice there's an error that a column of the table wasn't found specifically here it's appearing
job title short in here so I can go ahead and delete job title short along with that comma and Bam we now have this Final Table just leaner from those two steps I'm going to go ahead and close and load this and now going back in to look at our data model in power pivot I can see that it updated for data job skills all right moving into this design tab within power pivot this has a few different options within it for adding columns freezing columns just messing with the columns they also have
different options for creating calculations concerning columns we'll be getting into calculated columns more in the next lesson so stay tuned for that the main thing that we're actually going to be doing in this portion of the video is actually setting up relationships and for that we could go about creating a relationship here and right now I have data job skills and I could relate it with the job ID by pulling the drop down to the data jobs salary table on that job ID now that's a way I can do it I'm actually not going
to do it this way I actually prefer going to the diagram View and then from there just dragging and dropping the job IDs across each other and then it establishes this connection which we can see through this line through here now there's a few different things that we need to notice from this line here one this Arrow it's going to come back to bite us in the butt later and that's that the Arrow only allows data flow in One Direction and by data flow I mean filtering if I try to filter something in the data job
skills table this arrow is only pointing in One Direction I won't be able to filter it back we'll encounter those problems in a little bit and we'll talk about strategies how to actually offset it the other thing to note with this relationship here is you notice right here it says one and over here it says star in this case this is a one to many relationship and what does this mean well going to our data view for data job salary we only have one unique ID for each job whereas in the data jobs skills we
have multiple different job IDs or many job IDs now if we only had one job ID in there and we actually looked at that diagram view for this relationship we'd have a one to one relationship but we have multiple skills in there so that's not possible now it's also possible to have basically an asterisk to asterisk or many to many relationship but that causes a mess slows down your data model and I don't recommend it so you should typically see either a one to one or a one to many last little wrap up before we actually
analyze and use this relationship we have the options for table properties which we're not going to be able to look at because this was created via power query for this connection and then we have options to create date tables underneath calendars which we're going to be exploring in an upcoming lesson and like always you have an undo and redo anyway let's actually get into analyzing and putting this actual relationship to the test so what we're going to do is inside the Home tab go to pivot table because we want to create a pivot table
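Here is a minimal Python sketch of the one to many idea described above. It is only an analogy, not how Power Pivot works internally, and the tiny tables, column names and values are made up for illustration. Job ID is unique on the salary side and repeats on the skills side, and following that key lets you count skills per job title.

```python
from collections import Counter

# Hypothetical miniature tables; names and values are illustrative assumptions,
# not taken from the course files.
job_salary = [  # "one" side: each job_id appears exactly once
    {"job_id": 1, "job_title_short": "Data Analyst"},
    {"job_id": 2, "job_title_short": "Data Engineer"},
    {"job_id": 3, "job_title_short": "Data Scientist"},
]
job_skills = [  # "many" side: a job_id repeats once per skill
    {"job_id": 1, "job_skills": "sql"},
    {"job_id": 1, "job_skills": "excel"},
    {"job_id": 2, "job_skills": "sql"},
    {"job_id": 2, "job_skills": "python"},
    {"job_id": 2, "job_skills": "aws"},
    {"job_id": 3, "job_skills": "python"},
]


def relationship_kind(left, right, key):
    """Classify a relationship by checking whether the key is unique on each side."""
    left_unique = len({r[key] for r in left}) == len(left)
    right_unique = len({r[key] for r in right}) == len(right)
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique or right_unique:
        return "one-to-many"
    return "many-to-many"


def skills_per_title(one_side, many_side):
    """Follow job_id from each skill row back to its job title and count rows."""
    title_by_id = {r["job_id"]: r["job_title_short"] for r in one_side}
    return Counter(title_by_id[r["job_id"]] for r in many_side)


print(relationship_kind(job_salary, job_skills, "job_id"))
print(skills_per_title(job_salary, job_skills))
```

The count per title only makes sense because every skill row can be traced back to exactly one job, which is what the one side of the relationship guarantees.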
with this we're going to insert a pivot table and we'll have it insert into a new worksheet selecting inside the pivot table it's not having the field list come up so I'll select it under pivot table analyze anyway we want to query across this table to show the power of the relationships so what I'm going to do is from the data jobs salary table I'm going to take that job title short throw it into the rows and then from there I'm going to come down to the data jobs skills table and I'm going to throw
the job skills into the values it should be performing a count and then I'm going to organize this real quick from largest to smallest and it looks like data Engineers have the most so this is pretty neat we're able now to query across tables going back into that power pivot window this connection allows us to do that I'm going to just show you something real quick by clicking this relationship right clicking it and deleting it I want to delete from model and I want to show you how these values are basically going to change inside
our pivot table basically they're all going to be the same value and that's how you know that your relationship is not set up correctly whenever you have multiple repeating values and you expect them not to be anyway sometimes you'll see this popup come up of relationships between tables may be needed autodetect and sometimes it works sometimes it doesn't in this case it looked like it worked so we're going to go with it and just double-checking it in power pivot it is set up correctly so for
this final analysis we're going to be looking at building this visualization right here analyzing what are the top skills of data nerds we're basically remaking what we did in the power query chapter now that we have that updated data model anyway we're going to build this out to see the skill counts for each of these and also provide filters for job country so back inside the workbook that we were previously working with if you remember we did make that sort of similar visualization that I talked about however if I go to data
and actually refresh the data it's going to give me this error message because once again we deleted data jobs merged anyway I thought this was actually going to go away it didn't it is not what we want we're going to delete this one and then we're going to do a little bit of cleanup so that one that we created the job analysis on I'm going to actually just rename that quick to job analysis and then now in this new sheet we're going to name this one skill job analysis anyway let's insert
a pivot table in here so we go to insert pivot table and now what we have the option for is from data model and it's asking if I want to put it in the existing worksheet yes I do remember we want to analyze the skills and specifically how many counts they have associated with it or how many jobs they have associated with it so I'm going to put the skills into the rows and then from there I want to count how many jobs are associated with it so I'm just going to drag that job ID into
the values right now it's doing a sum I'm going to click on it go to Value field settings change this to count now you may be like Luke could we use the job skills count and we can which has the same exact values but actually closing this out and taking out job skills you're probably more interested in why can't I use something like the job ID from the data job salary table well if I drag that over and then I change this value field setting to count and click okay you notice it says 32672 which is coincidentally
the same number of rows of that data set and this gets into the point of filter Direction what do I mean by that let's go back to the data model itself looking at it in diagram view remember the arrow is pointed towards the data job skill table right now I have job skills in the rows and I'm trying to filter for data job salary based on the count of the job IDs but the arrow doesn't flow in that direction we can't do it now in something like Power BI you can actually right-click this edit the relationship
and change the direction that's not possible within Excel unfortunately anyway we're going to be using DAX to fix this in the future for the time being we're just going to go about using in this case for this analysis the same values in the same table I'm going to remove this other job ID from the other table anyway we're going to sort these values from largest to smallest then additionally I only want to show the top 10 skills so I'll go to Value filters and then top 10 dot dot dot top 10 items by count of
job ID is what I want and so now we have the values we want to visualize I'll go in and actually insert a pivot chart for this I like the bar because it makes it easier to read the different skills that it has right there and I'm realizing now the sort order is actually backwards in this I want it from smallest to largest I'm also going to right click and hide all field buttons we're also going to be adding axis titles for the primary horizontal and then removing that Legend
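The filter direction problem from a moment ago, where every skill showed the same 32672, can be sketched in Python. This is a rough model under made-up assumptions (toy tables, hypothetical function names), not the way the Power Pivot engine is implemented. The point is only that a filter on the skills table reaches rows in the skills table, but never reaches the salary table.

```python
# Hypothetical toy tables; names and values are illustrative assumptions.
job_salary = [{"job_id": 1}, {"job_id": 2}, {"job_id": 3}]
job_skills = [
    {"job_id": 1, "job_skills": "sql"},
    {"job_id": 2, "job_skills": "sql"},
    {"job_id": 2, "job_skills": "python"},
]


def count_from_skills_table(skill):
    """Skill on rows, job_id counted in the SAME table: the row filter applies."""
    return sum(1 for r in job_skills if r["job_skills"] == skill)


def count_from_salary_table(skill):
    """Skill on rows, job_id counted in the salary table.

    In this model the skill filter cannot travel from the skills table to the
    salary table, so every pivot row sees the full, unfiltered salary table.
    """
    _ = skill  # deliberately ignored: the filter does not flow this way
    return len(job_salary)


for skill in ("sql", "python"):
    print(skill, count_from_skills_table(skill), count_from_salary_table(skill))
```

Seeing the total row count of a table repeated on every pivot row is the symptom to watch for.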
we'll update this title to what are the top skills of data nerds and then the y-axis is self-explanatory but for the x-axis we'll label this skill count in job postings okay the last thing we need to do now is actually add some slicers to this so we can actually control it better so selecting the table itself going to insert slicers I'm going to select the job title short and also we want job country right here with each of these slicers I'm going to rename them also this one job title short I'm going to rename
to job title and then job country I'm going to rename to Country now when I go through I can actually select something like data analyst and it will filter down and actually see the associated skills I could also do something like look at those in the United States specifically for their counts and we see that SQL Excel and Tableau are the three top skills now you may be scratching your head on like okay I thought we were trying earlier to actually aggregate something in the pivot table and it didn't work well remember this arrow
is pointing to the filter Direction so in our case we have a job title short slicer because this arrow is in the direction back to the data job skills table we can filter in that direction but we cannot conversely filter in the other direction that's why we can't get the counts from these tables little confusing I know but I promise you we will work it out as we go through this entire chapter in power pivot so bam we just completed our first analysis for our final project we have a few more analyses coming up in the next
lessons you do have some practice problems though to go through and get yourself more familiar with power pivot and understanding what's going on with these relationships the one to many and whatnot all right with that I'll see you in the next one in which we're going to do a deeper dive on looking into that power pivot window with that I'll see you there all right let's now dive further into Power pivot and we're going to be focusing on the power pivot window for this we're going to be looking at some major aspects of it for this we're
going to get into using a little bit of DAX to create our first measure and with those measures we're also going to be exploring the difference between implicit and explicit measures don't worry we'll cover that in a bit from there we're going to move into a feature that's related to measures called calculated columns and it's going to allow us to inside of our data model create different values such as in this case we can actually create a date column from our date time value the last thing we'll explore are date tables which power pivot gives us with
a click of a button and allows us to connect these date tables to our original data source and then filter it by a lot of different date values and so we'll wrap this all up with a final analysis where we're looking at job postings based on a day of week using this date table anyway jumping into Excel for this we're not going to be using any of the work that we've done previously instead we're going to open up a completely new workbook and be working out of this instead and the reason is
all the work that we're going to be doing within this lesson we're not going to be carrying it on to our project that we're building this lesson is more to get us more familiar with the powers of power pivot oh gosh this pun's killing me and so we'll eventually incorporate some of the stuff into our final project but like I said we're going to be starting with a blank notebook or workbook for this as always if you want to see what the results are at the end of this lesson you can just
go to Power pivot window and it will have it all right so let's actually get some data into here to start working with and like I said we're not going to use power query at all for this we're going to use power pivot so I'm going to go to manage to open up the power pivot data model and we want to get this external data specifically we want to get that Excel workbook that we've been working with of data jobs salary all so underneath the Home tab I'm going to go to get external data
and it's going to be from other sources if we scroll all the way down we could look at how we can import it from different databases or whatnot we're going to be doing it from an Excel file then from there we're going to browse the connections navigating into that data set folder I'm going to select data jobs salary all it prompts me if I want to use the first row as column headers I do if I wanted to I could go in and test the connection to make sure it's going to succeed and it does so we'll go
from there to next it sees that it has one sheet within the workbook that's the one that I want I'll click finish next it'll go through the import looks like it completed it has a success got 32,000 rows I'll click close now let's go through and actually clean this data set up using power pivot now I know in the last lesson I talked about hey we're using power query for ETL and that's true but let's say you have a quick data set you need to connect to and model quickly in that case you would do
some of the stuff that I'm going to do here in order to quickly model it if I wanted to rename it I'd come down to this basically sheet tab down here it's called sheet one after where it's at I'll rename it and we'll keep a similar naming Convention of data jobs salary go ahead and click enter so let's say for this quick analysis that we're trying to do in this lesson I'm trying to analyze only the yearly salary data I don't care about the hourly salary data and I don't even want the data
entries in here well I can get rid of that salary hour average column by just deleting it by right clicking it it's asking me if I want to delete it yes and now there's still blank values in here right so I need to get rid of these salary rate values that are equal to hour so I'm going to click the filter here unclick next to hour and click okay so now we have that out the other thing I want to do is actually clean up the format of the salary and I'm going to change
that instead to a currency and this talks about how the data is going to be changed in how it's stored yeah I don't really care about that no data that I care about will be lost it'll all be here still and then I'm going to reduce the decimal places by two the other thing I can do if I wanted to is actually sort this based on that job posted date I could come up here and sort from newest to oldest and then it's in order sorry I actually want it oldest to newest got confused on that
one so bam just did some quick clean up to our data set and now we're ready to proceed forward so let's actually get into building our first measure or measures specifically I want to analyze this to understand what is the number of jobs in here and then also what is the average and then also more importantly the median salary well there's a few different ways we can do this we're going to do this first within this power pivot window so in order to do this first I want to
do a count so we're going to just run this on this job title short column and here underneath on the Home tab under calculations we have this Auto sum I don't frequently use this I use it every now and then but I can run things on this like count or distinct count I'm going to do count in this case and this is going to create our first measure down here remember down below this area is our calculation area I can toggle it on and off by clicking calculation area up here anyway I can also make
this column slightly bigger and what's cool about this keep on scrolling over is now it tells us the name of this measure count of job title short and that there's 22,000 remember there's normally around 30,000 but because we've taken out that hourly data we're down to 22,000 now I can also edit this measure if you notice it appears right up here similarly they have a formula bar in power pivot and to the left hand side it tells you what is actually selected job title short column and then the actual measure itself in here now one
quick note there is basically a colon and then an equal sign that's how we're going to know that we're doing measures and we'll get to calculated columns in a little bit and it will only be the equal sign but this is Microsoft's way of signifying that we're using a measure so that way you don't confuse it with anything else anyway I can edit the actual title in this case and I can change this to something more descriptive like job count pressing enter it now runs it and it's a lot shorter additionally if I
want to actually format it I can have the measure selected come up here select comma and then it formats it with the comma and then I don't want two decimal places I'll go ahead and remove them next let's get into analyzing that salary column with this once again we can click this I could use that auto sum and do something like average here clicking average and below it it generates that average of salary year average 123,000 and I can change it if I want to average salary but if I wanted to calculate something like the
median instead I would have to actually manually type out this calculation so selecting right below average salary and then coming into the formula bar I can type in something like median salary remember we want to create a measure so it's going to be a colon and then an equal to and then for this we want to use the median function now a lot of these functions that are Dax functions are very similar to what we use in Excel so they have a lot of different similarities but with this like we talked about before this allows
us to now put in basically an entire column into it and then perform that entire aggregation on it in this case I want to do it all on salary year average making sure I put a close parenthesis to close out that function and Bam right next to average salary we have this median salary now which needs to be formatted so I'll format it as English United States USD and remove the decimal places now what happens if I didn't enter that colon equal sign so here I am selected below median salary we'll go ahead and
paste in that formula and we'll delete that colon I haven't run this yet now I'm going to press enter and as you notice by this it's not actually calculating a value it actually just converts this to text so this is not what we want that's why we have to do the colon equal to sign for entering in the formula bar there so with measures it's important to understand implicit versus explicit measures so let's close out the power pivot window and actually get into exploring these different measures by creating a pivot table
of that median salary we just created so I'm going to go to insert pivot table from data model we're going to insert it into the existing sheet here we have our table of data job salary I'm going to analyze the salary based on the job title short column so I'll put job title short into the rows and then look if we scroll down at the very bottom you'll notice that the measures that we created have this f of x basically it shows us an equation so we know that it is a measure so I can take these measures this median
salary and in this case drag it into the values and now unlike power pivot where it did it on that whole column we're now filtering down to do it by well the appropriate job titles now we could also do something like drag the job count into the values as well and actually see the job count there now both of these measures are explicit measures because we explicitly defined them we defined what job count is and what median salary is so what is an implicit measure well you actually created this before so in regards to
that job count we're doing a count of the job title short column if I were to drag that down into here you can see it says count of job title short this is an implicit measure these are great for quick short analysis as we demonstrated before you can quickly throw something in and generate it and you didn't even know you were using measures and you were similarly with the salary year average if I drag that in down here and change this up to actually perform an average moving to average from there that
was also an implicit measure so I think you get the point but we're going to see the power of this as we go through this when we start to make newer measures that are actually going to use our explicit measures specifically we're going to be using our job count in other calculations and so these explicit measures are going to save our butt and save us so much time and ensure we're doing the correct calculations so let's get into our first calculated column and we're going to be going back into the power pivot window for this
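To make the explicit measure idea a bit more concrete, here is a small Python analogy. It is not DAX and not how Power Pivot evaluates anything internally, and the rows, field names and the `pivot` helper are all hypothetical. Each measure is defined once by name and then re-evaluated on whatever subset of rows the pivot table hands it.

```python
from statistics import median

# Hypothetical rows; titles and salaries are invented for illustration.
rows = [
    {"job_title_short": "Data Analyst", "salary_year_avg": 90000},
    {"job_title_short": "Data Analyst", "salary_year_avg": 100000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 120000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 130000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 170000},
]


# "Explicit measures": named once, reusable in any later calculation.
def job_count(table):
    return len(table)


def median_salary(table):
    return median(r["salary_year_avg"] for r in table)


def pivot(table, row_field, measures):
    """Evaluate each measure once per distinct value of row_field."""
    result = {}
    for value in sorted({r[row_field] for r in table}):
        subset = [r for r in table if r[row_field] == value]
        result[value] = {name: fn(subset) for name, fn in measures.items()}
    return result


measures = {"Job Count": job_count, "Median Salary": median_salary}
print(median_salary(rows))  # whole column, like the calculation area
print(pivot(rows, "job_title_short", measures))  # per job title, like the pivot table
```

Because `job_count` has a name, a later calculation can call it directly instead of redoing the count, which is the time saver mentioned above.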
we're going to be creating a column that will convert the salary year average values into Euro values so there's a couple ways we can do this or add these columns we can go under design and right here under columns we can click add to add a column additionally with that unselected and selecting back into the table again you see this add column up here we can just go right in and add a column I feel that's actually easier anyway in this case in order to get the Euros value of what it is for salary year average
we need to multiply by a conversion rate so inside of here I'm going to put the equal sign and we see it's popping up here in the formula bar from there I'm going to use the value in salary year average I just selected one of the values and it popped right in then from there similar to how we wrote formulas before I'm going to put times 0.9 enter now notice for this one I didn't use the colon equal sign because this is not a measure it's a calculated column and it still knew that this was
a currency although I don't like that it has two decimal places so I'll remove them and to me it knows it's a currency but it doesn't know that it's a Euro so I'm actually going to convert it over to Euro and then remove the two decimal places additionally I'm going to rename this from calculated column one to salary year Euro you can identify calculated columns because Normal columns are green the calculated columns are black also if I go to the diagram view we can see that well you can't really tell that we have the
calculated column salary year Euro but you can see your different measures that you've created all right so back to the data view even though we have this calculated column we could also create a measure on this calculated column clicking in the box below here and then typing in here I could do something like median salary Euro and then put in that median function for salary year Euro and then close the parenthesis and Bam now we have it I'm going to spread it out to actually see we have the value of € 103 now going back into
here we can take this and actually if we wanted to we could put the salary year euro into there that column it's going to aggregate it appropriately right now it's doing a sum so if I wanted to I could get an average of these values or we could actually go to that measure that we created that explicit measure throw it in here and we get the explicit value of the median salary Euro all right so let's shift our focus on this analysis let's say we wanted to analyze more around the date specifically the
days of the week for when job postings are occurring well let's go back into Power pivot and manage to open up the power pivot window right now investigating the diagram view of our data model we only have one table in here data jobs salary well if we go under the design tab as talked about in the last lesson we can actually create a date table I could also potentially mark this table of data job salary as a date table but it's not a date table so we actually need to create one and you'll see what it looks like after
that and with that I did click new on this anyway it created this new table called calendar and inspecting all of the different values in here well let's actually just get out of this view let's actually go to the data view which is pretty cool with what it created it created it based on the dates it knew what was in our original table so from the first of 2023 all the way to the last day of 2023 and with this it has a year column month day of week and day of
week number so a lot of great values from it now we need to actually connect these two there's no relationship between the two if we go to that data jobs salary so selecting it here we only have this job posted date column which is a date and a time so we need only a date so because this column is named inappropriately I'm going to change it to job posted date time so now let's create that new column with that job posted date time this time though instead of clicking add column we're going to
go to insert function and this is pretty neat because it allows us to actually look under different things in this case we want sort of a text function and we can look and explore different ones specifically I know we want this one format converts a value to text in the specified number format so I'm going to click okay and it automatically fills it in with this colon and equal sign of format equal to from there I'll select the job posted date time column that's the value and then what do we want for the format
well I know we want it in the format of basically the year first then two months or two M's and then two D's for month and day in order to match close that double quote because that's the actual format we're using that's all we need so we'll close the parentheses and press enter and then I'm going to take this calculated column one drag it over here and then I can see that it did convert it correctly so I'm also going to go now and rename this appropriately to job posted date press enter so now let's create
a relationship between the two remember we can go to that diagram view or I can use this of create relationship go to calendar to match on the date itself let's see what it looks like in that actual diagram view we always want to inspect it to make sure we have this right one to many or one to one anytime we have many to many you need to start questioning it depending on what the data is anyway we now have a relationship established with this so let's actually get into analyzing this with our calendar based on
this day of the week and seeing what is the proportion that they're turning out during the week for job postings so closing out the power pivot window I'm going to go in and create a new sheet from there I'm going to go insert pivot table from data model we're going to do it in the existing worksheet underneath calendar Underneath more Fields I'm going to drag in day of week into the rows so it has Sunday all the way to Saturday then from there remember we created that job count already so I'm going to
take that and drag that into the values so looking at this I can see that I think our relationship is not set up properly cuz we have basically the blanks at 32,000 I think I know what's going on with this let's go back into the power pivot window in calendar when we select the date it's of the data type date it also has this format of date and time I don't think that really matters too much but if we go into Data job salary and we go to that job posted date because we use
that format function right now the data type is text we need it to be of date and this now looks a lot more similar to what it does on the calendar now when I close out of this bam all the values pop up here so don't forget about your data types and making sure they match within the data model so let's actually visualize this by inserting a pivot chart and Bam we get this bad boy which we'll rename to when are most jobs posted during the week and it looks like we have well
Saturday and Sunday are the lowest obviously during the week it's the highest with basically a higher amount on Wednesday so pretty cool analysis that we were able to do based on the day of the week we didn't have to create any additional things and additionally we can evaluate based on this calendar table created we can do other analysis such as by the year month day of the week and whatnot all right so that's a brief intro into measures and also calculated columns don't worry too much if you're not feeling too confident with them just
yet for one you have some practice problems to go through to get more familiar with it but the next two lessons will be on Dax and Dax advanced in order to explore different formulas that you can also use inside of your measures and also calculated columns all right with that I'll see you in the next one where we're getting into Dax see you there welcome to this lesson on Dax or Data Analysis Expressions we've used it a few times before in the previous lesson but now we're going to go
much more in depth and actually understanding the basics of it now as we've learned Dax can be used within measures or even calculated columns for the purpose of what we're going through in the project we're not going to create any calculated columns but we will be using it for measures for this we're going to be focusing on three major types of functions in this lesson specifically around aggregation statistics and also filter these functions you're going to notice are very similar to your Excel functions that we did back in Chapter 2 so a lot of
those similarities and concepts we've learned already are going to be able to be applied to this so we'll be able to move pretty quick now we're going to be answering two major questions regarding our final project the first one involves calculating the number of skills required per job title we're going to use Dax in order to calculate this and then we're even going to go on to actually graph this to show how it correlates with median salary spoiler alert the more skills you have the higher median salary you can expect from there we're going to
go into a deeper analysis of salary specifically looking at the median salary and specifically being able to compare it from your home country to the US and also non us countries so we're going to use filter function in order to be able to view these things within a pivot table now jumping right into Excel for this you can continue working in the Excel file that you have from that first lesson on power pivot intro where we created this visualization right here which analyzes top skills of data nerds and has some filters for job title and
Country if you don't happen to have that file anymore or you got lost along the way you can just use the power pivot intro part two file and you can start from there now if you're loading it via the power pivot intro part two file you're going to have two sheets in there one skill job analysis and then also the skill analysis we're not actually going to be using the skill analysis so you can feel free to delete this or conversely if you're working from the files that you've been building up during this and didn't
necessarily load from the power pivot intro part two file you may have multiple tabs in there once again I only care about this skill jobs analysis this is what we're going to keep for the final project the job analysis and also this other one that we created back in the power query lesson we're actually going to be recreating with power pivot so both of these I can just delete or anything else you have in there you can feel free to delete after holding control and selecting both of those
I'm just going to delete them all right so we're going to be looking at aggregation functions first conveniently Microsoft has some documentation around the Dax functions and also statements that they have so I'm going to dive right into the link that's provided on the screen underneath Dax functions specifically I'm going to go into the aggregation functions they have this page here on aggregation functions overview and it shows a lot of the different functions they have for this average count Max Min sum let's look at count real quick count is pretty simple all we're going to
do is use the following syntax count and inside of it you provide a column and for this it says Hey the column that contains the values to be counted so pretty simple function to use similarly we have distinct count which has the similar syntax of you provide distinct count and the column that contains the values to be counted and it will return the number of distinct values in a column we're going to use this so what we're going to be calculating with those functions that we just went over is trying to find out how many
skills per job we're going to first go through based on a job title and find not only the skill count but also the job count and then we're going to take both these values and divide them to get the skills per job so I'm going to create a new sheet for this and inside of here I'm going to insert a pivot table from our data model we're going to do it in the existing worksheet for the rows we're going to go through the data job salary table and we're going to put that job title
short into the rows and then now we need the skill count remember we could go in and create an implicit measure by throwing job skills in the values we want an explicit measure because we're actually going to be using the skill count in a later calculation to find that skills per job anyway how do we do this well we can not only create a measure by going to power pivot and underneath here going to new measure you can also just select in here which table you want to use in this case
I'm doing a skill count so I want to contain it in the data jobs skills table doesn't really matter which table I'll put it in but I just go by my memory of which one I'm going to know to go look at for which in there it auto selects that table of data jobs skills the measure name is going to be skill count and then for the formula itself we want to do a count of the job skills column from the job skills table make sure it's not from the job salary table okay I'm going
to put a closing parenthesis on this and then for this we do want to format it to use a thousands separator and zero decimal places click okay and now in the data job skills table we have this explicit measure can drag it right next to it same values are getting created as the implicit measure so I'm going to take out that implicit measure next thing you want to calculate is that job count we're going to be counting it based on the distinct values of the job ID so I'm going to go to add measure we're going to call
this one job count and we'll do a distinct count of the job ID column and for this one we want to make sure that we're actually doing it from the data jobs salary table because this has all the job IDs in it once again we're going to format as a number with a thousands separator and click okay and then at the bottom the measure is going to appear I'm going to drag it into here so now we want to get how many skills per job
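As a rough sketch, the two explicit measures just created might look like the following in Dax. The table and column names (data_jobs_skills, data_jobs_salary, job_skills, job_id) are assumptions based on how they are spoken in the lesson, so swap in whatever names appear in your own data model.

```dax
-- Assumed table and column names; adjust to match your data model.

-- Counts the rows in the skills table that have a non-blank skill value.
Skill Count := COUNT ( data_jobs_skills[job_skills] )

-- Counts each job only once, using the salary table that holds all job IDs.
Job Count := DISTINCTCOUNT ( data_jobs_salary[job_id] )
```

In the measure dialog the name and the formula go in separate boxes, so only the part after the := would be typed into the formula field.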
so we want to take the skill count measure and divide it by the job count measure which table this goes in doesn't really matter too much because it uses both of them but I'm going to put this in the data jobs skills table I'm going to call this skills per job now what's great about these explicit measures that we just created is I can go hey I want to do this skill count and I want to divide it by the job count and it's right there so you don't have to necessarily write it out every single time okay I
want to do a count of the job skills column and then divided by a count of the job ID column which actually needs to be a distinct count anyway this is where we run into errors that's why the explicit measures are so great all right so I have skill count divided by job count I'm going to format it as a number and I want one decimal place for this go ahead and click okay and then we're going to add this skills per job to here now I'm actually going to recommend although we
just used the division sign I'm going to actually recommend this divide function which is a math function and what you would do in this case is you would provide divide and you list a numerator and a denominator and the reason why I like this is because it catches any error specifically it performs Division and returns alternate result or blank on division by zero so we're not going to necessarily error out if we have a division by zero issue and you can actually provide as shown down here
in the alternate result the value returned When division by zero results in an error so you could actually catch that anyway so I'm going to go back into that skills per job and I'm going to go to edit measure I'm going to change this to divide specify the first and second parameter with a comma and then click okay overall no real change here but just a best practice to know about so now with this skills per job I want to actually get into comparing this to median salary this is what we're going
to be building right here we're going to be comparing it to median salary and then graphing it in a scatter chart in order to see how these different job titles correlate to each other so first to know what the final analysis is going to be of this I'm going to rename this sheet appropriately specifically calling it salary vs skills and this pivot table here we don't need necessarily the skill count or the job count we just need the skills per job okay we're going to calculate now the median salary and median is a statistical
function which is found underneath here but there's a lot of different options underneath here such as median finding the different percentiles like we did back in the formulas looking at things like standard deviation and whatnot so a lot of good statistical functions that you have access to via Dax so for this measure I'm just going to come up here to power pivot go under measures and select new measure I do want this in the data job salary table and we're going to call this median salary for this we're going to be using the median
function and we need to provide it a column specifically that salary year average value for formatting we're going to format it as a currency with zero decimal places since it's a salary so now we have median salary here I actually want it to appear on the x-axis so I'm going to throw it over here in the First Column so now we have the median salary and skills per job I'm just going to sort these from highest to lowest to see if I can see visually if there's anything going on with a
correlation right now I am seeing some higher skills with a higher salary but let's actually visualize this so I'm going to insert pivot chart and select pivot chart for this we want to enter a scatter plot and if you remember back from our charts lecture we're going to have issues with this you can't create this chart with the data inside the pivot table it doesn't natively support creating Scatter Plots kind of annoying if you ask me anyway let's X out of this and for this what we're going to do is we're just going
to set this area starting up here we're going to set it equal to this entire table right here I'm not going to capture the grand total at the bottom because we're not going to be plotting that now with these values I'm going to select the contents in that this column f and g and then from there go insert a scatter plot specifically this one right here I can see it already looks pretty good you can't actually add the data labels in whenever you create this chart we actually have to go about doing that somewhat manually
specifically we have to select the data points and then right-click and we have to select add data labels okay now it's giving us values which actually correlate to the skills per job it's not what we want we want to include the job title we're going to add that so what we're going to do is select one of those values and just right-click it and then from there select format data labels then the pane's going to open up on the right hand side and it should pop you up underneath label options label
options then this label options and right now we have this y value selected that's not what we want we want value from cell and it says Hey select the data label range what we want is right here all the way going down it's hidden behind here I'm going to sort of guess but I know it goes down to E11 click okay and scrolling it over bam we got all those data labels on there now all right so now we need to clean this bad boy up because well it's a hot mess that is all up
in the upper right hand quadrant labels are overlapping we're going to fix all of this first thing is I'm going to correct the axes so I'm going to click on the x axis and it should go immediately to this minimum under axis options and I can see the first value begins around 80,000 so I'm going to change this to that and press enter okay similarly I'm going to select the y-axis and if it doesn't go to it it should be under axis options inside that format axis Pane
and I'm going to select this first value that I want to go to is three I'll leave the default of nine there next thing is we need some axis labels for the y axis we'll call this average skills requested for the x-axis we'll call this median salary and we'll specify the units of USD speaking of which this is not formatted correctly for how we want the numbers so under that format axis pane under axis options and under axis options again under number we can go to the custom option specifically you should have this type
hopefully appearing if not you can just enter it into this format code below and then press enter all right the last two things to do is rename the title naming it do more skills equal more money for data nerds which from this chart it looks like it does and we can actually confirm this if we want by adding a trend line now there's different options here for trend lines we've got linear exponential linear forecast I feel linear best meets this need here also like the coloring aspect of it so we're going to
go with that all right the last thing to do is just fix some of these names on here so right now we have the data labels appearing to the right of the data point and in cases where it's close so senior data scientist it's too close to the edge and so it's just sort of over the top of it anyway what you can do is actually select it twice so click it twice then you can drag and drop it and it should have these arrows or these connectors that connect the name to where
it goes to all right so now we have our final visualization and I'd say it's not too bad some things I'm noticing about this some correlation if you notice yes we do see the average skills requested are going up with the salary but I mean you can pretty much see the dividing line those jobs that end in engineer versus analyst or scientist are requesting more skills but yet have sort of a similar pay to their data analyst or scientist counterparts so I don't know I guess it kind
of pays to be a data analyst and not a data engineer don't tell my data engineer friends I said that all right last analysis we're going to get into is using filters to actually aggregate so in this case right here we're showing what we're going to get to the final thing of based on a job title short value what is the median salary in this first column for the us then what is the median salary for non us and then finally that final column of median salary what is the median salary of in this case
the selected country is Argentina it's filtered down basically I call this a filter function we're going to go over but we're going to be figuring out how to prevent filters from affecting a visualization so we can get the values we want so we're going to create a new sheet and I'm going to call this salary analysis like before we're going to insert a pivot table from our data model insert it into this new sheet and we're going to be putting that job title short into the rows now we're obviously with this
going to be calculating median salary so I'm going to go ahead and just drag that into the values to start getting those median salaries additionally we're going to want to include a Slicer in here so based on the job country so I'm going to insert slicer on job country click okay and then with this we can actually see if we select something like Argentina it's going to filter down to what it is or what the salary median salary is in Argentina but remember we're trying to add two columns to this so we can compare these
values of something like Argentina to us salaries and maybe non us salaries so basically countries outside the US anyway we're going to be using filter functions for this and fair warning on this it says it here the filter and value functions in Dax are some of the most complex and powerful and differ greatly from Excel functions so there's going to be a little bit of complexity here in understanding this and for this filter function we're going to be using this one on calculate and what it does is it evaluates an expression in a modified filter context
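A minimal sketch of the pattern this part of the lesson builds toward, assuming a Median Salary measure already exists and that the country column lives at data_jobs_salary[job_country] (both names are assumptions, so use the ones in your own data model):

```dax
-- General shape: CALCULATE ( <expression>, <filter1>, <filter2>, ... )
-- The filter arguments are optional.

-- Median salary for US postings only. The filter on job_country replaces
-- any slicer selection on that column, so this value stays fixed.
Median Salary US :=
CALCULATE ( [Median Salary], data_jobs_salary[job_country] = "United States" )

-- Median salary for every country except the US, using <> for "not equal to".
Median Salary Non US :=
CALCULATE ( [Median Salary], data_jobs_salary[job_country] <> "United States" )
```

The text in quotes has to match the value stored in the column exactly, otherwise the filter matches nothing and the measure comes back blank.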
calculate is pretty simple in my opinion first you provide an expression so such as hey perform a count of this column or a median of this column from there you provide a filter or filters and as it states below here filters can be Boolean filter Expressions table filter or filter modification functions main thing is here we're going to use things like logical operators in order to compare this to maybe a certain value we're going to expect so let's jump into creating our first one with median salary evaluating for median salary in the United States so
I want to create this measure inside of our data job salary table and for this we're going to call it median salary us we're going to be using the calculate function for this and inside of here we're going to insert an expression so in our case the expression is the median of the salary year average column and what we're going to do actually I'm just going to leave this as is cuz filter is optional we can tell filter is optional based on the square brackets around it I'm going to
just close out this calculate function change this to a format of currency with zero decimal places and then from there take that median salary us and actually drag it onto here so right now calculate is working by calculating the median salary and there's no filters applied to it so pretty simple so let's go in and actually edit this measure now remember we have an explicit measure of median salary so I actually don't even need to Define it like I did here I can actually just call out median salary in this case clicking okay still
the same value going back in and actually editing it we now want to apply a filter specifically for this filter we want to make sure that the job country column is equal to United States so I'm going to type in job country and we can use logical operators so I'm going to use an equal sign right next to this and I'm going to specify United States need to make sure it's spelled exactly right I know it's that via the column okay so now we're going to leave everything else as is click okay and Bam now
it has the median salary filtered by the US and I can confirm this by scrolling down to the United States clicking United States and seeing that these values are the same but no matter what I actually click the United States median salary is going to stay the same additionally if you noticed here when I click on something like the US Virgin Islands wouldn't mind moving there they only have four job titles available so because of that it just filters this table down to only show those four that are applicable along with their applicable
salaries and median salary in the US so now let's calculate the median salary for non us countries and actually see how they differ so come into data job salary select add measure for this we're going to be using non us values once again we want to use that calculate function on the median salary measure that we created and for this one we're still evaluating the job country but we want it not equal to so we're going to use basically a less than and greater than sign right next to each other to say not equal to
and we'll say United States we're going to format this as a currency with zero decimal places click okay and then add this bad boy to the values and I want to actually see a country with more job postings in it so we'll go to something like Australia and now something like Australia we can see one comparing us to non Us in general the US well except for data Engineers yeah it looks like only data Engineers are the lowest one in another country everything else is higher in the US but now we can with this one
compare hey what does it look like something like Australia compared to us and non us countries so super useful in actually filtering down providing the right context for what we want to look at so as a data analyst median salary is around 100,000 Which is higher than us and also any other non us median salary so may have to move to Australia one last clean up right quick slicer itself I don't like it to say job country we're going to name this to country all right now wrap up the analysis for this all right so
you now have some practice problems to go through and test out these different Dax functions that we just went through along with some others now in this lesson we just did some basic Dax in the next one we're going to be moving into some more advanced Dax features that I do find myself using from time to time but overall most of the stuff we applied in this lesson I use day to day all right with that see you in the next lesson we'll be wrapping up basically our final question in our project and be done with
our project see you there all right welcome to the last lesson in this course where we're going to be going over more advanced Dax specifically we're going to be focusing more in depth on filter and also relationship type functions these are going to be needed by our data model in order to calculate what is the median salary for an Associated skill if you remember back to a few lessons ago we had relationship issues I know I feel that with them only being able to filter tables in certain directions and we're
going to be able to see that and fix that in this lesson so in this lesson you can start with the workbook from the last lesson or if you got lost along the way you can go into the Dax intro workbook now let's do a quick overview of where we're at with which analysis we've done for this project we've identified what are the top skills of data nerds along with different filters to filter for whatever our interest is in my case I'm looking for data analyst in the United States and I can see
that SQL Excel and Tableau are some of the highest additionally we've zoomed out a little bit and been able to identify based on job titles where our job title of Interest Falls compared to others and how many skills it requires for data analysts it's right above business analysts and based on the number of skills it looks like it's appropriately rewarded for the median salary and then final thing we did was be able to analyze additionally Based on data analyst we can look at different countries and compare it not only in that country
but to within the US and outside the US so a lot of good stuff related to well data analyst that position and analyzing the salary but what about skills well we haven't done that yet we're going to get into actually analyzing in this first portion analyzing what is the expected median salary based on one of the top 10 skills we did this back in the power query lesson but now we have this new data model we need to recalculate it anyway we're going to run into some issues with the data model as we're going to
find out additionally we're going to be calculating the skill likelihood instead of skill count basically finding the percentage of a skill in a job posting this is somewhat complex so this portion here will be optional and you'll be able to use job count instead if you don't want to follow along with this skill likelihood anyway back in your workbook whether you started from that uh Dax intro or you're continuing on with from the last lesson we're going to create this new sheet for this and for this we're going to name this skill salary analysis as
usual we're going to go in and insert in a pivot table from our data model so we can get into analyzing the skills going click okay insert it in and so for this I want to analyze what is the median salary for a skill so if I drag the job skills from the data jobs skills table into the rows we have all the different skills pop up underneath here and then if we went up here and then tried to drag or we will be dragging in the median salary into here all these values are going
to be the same additionally we get this popup right here that relationships between tables may be needed basically we're running into an issue with our data model even if I click autodetect it's going to tell me no new relationships are found so what's going on here well let's actually analyze our data model by going to manage and then inside of here go into diagram view so the error resides with the filtering direction remember this Arrow right here signifies which way we can actually filter our data so in our case we have job skills
which is over here in the data job skills table and we're trying to find the median salary the problem is we're basing that off of that salary year average value that's in the data jobs salary table and based on the direction of this Arrow we cannot flow in the opposite direction this is what we call one-way or single filtering now unfortunately Excel doesn't support bidirectional filtering however in things like Power BI you can actually go in and change it from single filtering to both or bidirectional filtering kind of makes me wish I
was in Power BI right now so back in Excel we can't actually control this via here and actually click it to change this to bidirectional filters we can only control the relationship itself but we can use Dax to fix this now in order to fix this relationship we actually have relationship functions inside of Dax specifically we're going to use this cross filter function with this function you put inside of cross filter the column names so in our case we can specify basically the job ID from job salary and the job ID from data job
skills and then from there we specify the direction under the parameters here we can go into what we can provide for direction we can either provide none basically no filtering across the relationship both which is what we want filters on either side or one way which is what we have already we're not going to use this you can also control left filters right or right filters left with the one way we're also not messing with that we want both now this cross filter remember is a filter function so we need to use this in an appropriate formula that
we already know calculate in order to filter so I'm going to x out of this box right here cuz that's not applicable what we're going to do is I'm going to calculate median salary or a new median salary if you will inside of the data jobs skills table and because it's uh going to use the same name but we're going to keep it in a different table it'll be perfectly fine and then for this remember we want to use still calculate we want to have an expression in here in our case we want to calculate
what is the median salary and we'll just use the explicit measure that we already defined then from there we'll get into the filter one of what we want to actually filter we want to provide for this cross filter and for this we're going to specify the job ID of one table along with the job ID of the other table then for the filter type we're going to use both okay I'm going to go ahead and close this now we're calculating median salary so I want this formatted as a currency with zero decimal places I'm going
go ahead and click okay and have an error in my formula should have known that by the X I need to actually put a closing parenthesis on here and I lied to you a measure or column with the name median salary already exists okay I thought we could do that silly me so we'll name it median salary skills go ahead and click okay okay now I'm going to drag this into the values and we can actually see with this one now that the associated median salaries are actually there and it's not all that 115,000 which
is basically the median of the entire data set so I'm going to go ahead and remove this other median salary out of here and from there we're going to also drag skill count into here I just want to look at the top 10 most common skills in this case so I'm going to go up here into our filter and go to our value filters for top 10 dot dot dot we want the top 10 items by in this case skill count and then from there based on these top 10 skills I'm going to sort it
from largest to smallest but like usual this is no good unless we actually analyze for the country and also for the job title so if I actually go back into that skill jobs analysis I can just select these two slicers right there pressing control then copy and paste them into here now you may notice whenever I'm clicking this this is not affecting this pivot table right here so we can actually inspect this by going to the slicer and going to report connections right now this slicer is only affecting the skill
job analysis tab so this one right here in our case for this job title we actually want to affect it on this page here of skill salary analysis which is right down here click okay looks like the salary is updated also we want to do the same thing for Country adjusting the report connections for this as well and selecting this one right here for underneath the sheet of skill salary analysis clicking okay bam it updated as well so now looking at the top skill of data analyst in the United States which I'm pretty familiar with
I can see things like python Oracle and Tableau are top three Excel does make the list and it's the second to last at 84,000 now with this I do want a visualization with it specifically I want a combo chart showing this so I'm going to go into insert pivot chart pivot chart and for this go down to combo for this I want the median salary to be the main focus and then for the skill count we're going to put that on a secondary axis because right now it's just way too low if we keep it on
the same axis and this has the format that I want right here go ahead and click okay I'm going to hide all the field buttons on the chart I'm going to add a primary vertical and also a secondary vertical axis along with a chart title and then for the legend itself I'm going to click it and then right click it and go to format Legend and for this it should go under Legend options I'm going to unclick this of show the legend without overlapping the chart and I'm just going to move it
up here so not bad I don't necessarily want this orange line right here for the skill count I don't really feel like a line is best to signify the count instead what I'm going to do is select the line and if the format data series doesn't appear you can also just right click it go to format data series and then underneath fill and line they have line but also marker for the line we're going to go no line and then for the marker we're actually going to change the marker options to built in we'll change
it to this square is going to be fine or we can change it to a diamond we'll make it slightly bigger and I don't really like the color so I'm going to go into design and change the color to this monochromatic palette 8 nope never mind not that one I meant monochromatic palette one I want the bar charts to be more visually popping than the actual markers themselves I change the title to what's the pay of the top 10 skills and then change the primary axis to median salary USD and the other one to
job count closing this out and then making some room over here for the actual visualization itself so now we have our visualization that we want that looks at this and be able to show us what are the top 10 skills for data analyst and their Associated pay now one last thing for this regarding slicers I want to actually make it to where they're connected between the charts so right now I have it to where this basically this one for skill salary analysis tab if I go over to the skill job analysis tab select business analyst
it will change then go back to skill salary analysis it updated to business analyst anyway I wanted if we change a slicer to make sure that it changes on the appropriate sheets so the job title slicer is only on these two sheets actually that one's perfectly fine but the one we actually have concerns with now is the country specifically on this one I'm selected on the United States the skill job analysis one it's also on the United States and updates appropriately but then if we look in the salary analysis that one's on Australia
it's not updating appropriately so we need to go to slicer report connections and we're going to be putting the country one on all the different sheets so I'm going to go ahead and select all the sheets for this I'm going to do the same for skill salary analysis country slicer which it looks like it updated along for the skill job analysis so what I'm going to do is actually copy this now and put this into the salary versus skills because we're controlling it on this page as well and so now whatever I select something
like maybe United Kingdom it will update appropriately and update on other sheets as well anyway one quick note because we moved those titles around that one time sometimes it's not going to match up exactly how we had it before if you recall I'm going to go ahead and select all we set up these text boxes in order to view them whenever basically all countries were selected so that is one of the issues about dragging and dropping those titles and making them stick to a certain location it messes up your filters whenever you want
to filter down for something like the United States so this wraps up basically our four major analyses that we did now I'm going to take it a step further this portion will be completely optional and that's this right now we're using skill count in order to look at what is you know the skill count in this case for data analyst in we'll do United States we see that SQL is around 4,500 and that Excel is around 3500 but what does that actually mean well if we go to the future file of what we're
going to get to we're actually going to be calculating a skill likelihood instead which in this case is looking at what is the proportion of a skill compared to all the different jobs that are available for data analysts in the United States and so that 4500 and almost 3500 is equal to well greater than 50% for SQL and about 40% for Excel so that makes in my mind a lot clearer how important that skill is over a count in that you probably should be learning SQL and Excel as a data analyst so back in our sheet
where we're actually calculating with the job count how do we calculate this well let's actually get to moving this over to here go back into our pivot table itself and if we throw up the job count you may get this relationships between tables may be needed don't worry about it too much now these values are all stagnant based on some issues with the filter direction but that actually comes to our advantage because for our filter right here specifically data analyst in the United States the amount of jobs that there actually are is 8339 if I actually remove
both of these filters we would expect it to be the total rows of the column which is 32672 so coincidentally this is actually doing what we need we just need to get a percentage of these two values and that can be done pretty easy so let's open the show field list and actually get into creating this measure we're going to create in the data job skill table we'll call this skill likelihood and what this will do is take skill count and divide it by job count but remember we probably want to use the divide function
for this so putting in skill count and then job count now there's no option to format this as a percentage unfortunately so I'm going to go ahead and click okay from there I'm going to drag the skill likelihood into the values and go through and format this appropriately selecting that it's a percentage and then with this I'm going to select a value that I know what it should be of data analyst in the United States and with those values selected I can see that Excel is at 41% which I know that's what
it is and SQL is at 53% for these values so bam we have this skill likelihood now we can now go in and remove these other two columns of skill count and job count and then from here actually move this graph back over and unfortunately with the adjusting to it we actually have to fix this and turn this back into a combo chart so we're going to design change chart type into combo select for the skill likelihood we want this to be on the secondary axis click okay go back to format data series remove the
line and then change the marker option to be built in and to be that diamond at 6 point and then finally update that secondary axis to basically say it's skill likelihood and Bam now we have this final visualization now there's one more that we actually do need to clean up and that's this one right here what are the top skills of data nerds right now we're doing a count of the job ID an implicit measure which you know how I feel about that we should use an explicit measure specifically we're using skill likelihood instead of
that and remove that count of job postings once again I need to actually format this as a percentage so going to home change it to a percentage and then from there clicking in it and sorting from smallest to largest and Bam for this one data analyst in the United States once again we can actually see visually what are the top skills for this so now we just updated both of these charts to have a more representative understanding of what's going on with the data all right so you should be super proud of what we
just accomplished in this project going through both power query and power pivot and actually diving deep to understand some key statistics about top paying skills and also top skills you should be targeting depending on what job you're pursuing and what country you're in now we do have some practice problems go through and test out some of these more advanced functions specifically this cross filter function that we went over then after that in the next lesson we're going to be getting into how we can actually go about sharing this project for those that purchased the course
practice problems and also certificate you can now go through and complete that end of course survey and you'll be rewarded this course certificate now if you didn't do this it's not too late for you to go in and purchase the course so that way you get this course certificate all you got to do is go in and take that end of course survey and you'll get it all right congratulations on your work so far see you in the next one all right congratulations again for finishing that last project in this video and the next video which are the
last two videos of this entire course they're going to be focused on how to actually go through and share your projects in my recommended way specifically we're going to be sharing this on GitHub so that way others can see it here I am on GitHub and also if you didn't notice there where you actually downloaded all those Excel files at the beginning of this course anyway inside of here is where I'm hosting my different projects and you've gone through and probably seen this but you may not have clicked on something like the project One dashboard
and in this case yeah I have the Excel file but that read me in there displays below this and this is what we're actually going to be doing in the next two videos to set this up and then create this read me and this allows you to detail all the different skills that you used along with detailing all the different analysis that you did while going through this now that was Project one project two is going to follow a similar method and that it has the Excel file and the readme and then in the readme
itself it details all the different work that we did in it so you may be like Luke why the heck am I going to be using GitHub in order to share this project I'm not familiar with it I don't know how to use GitHub at all why am I going to waste my time with it well I think it's useful not only in Excel but also other Technologies specifically programming here I have my SQL project for my SQL course and this is where I host my SQL code and all the different analysis that I
did for it and similarly for my python course and the project we created that I also hosted on GitHub and detailed all the different steps that we did along with all the different uh python files associated with it so moral of the story is I think GitHub's a great tool to use in order to share your work not only in Excel but also other tools now if you recall from Project one we walk through the steps to quickly share your project on OneDrive if you had it accessible via like a paid Microsoft subscription and
this provided a method to go through and share if you go up here and actually copy the link a usable link for others whether they have Excel or not to actually go in and then manipulate your dashboards that you have so you may be wondering why the heck are we not doing this with this second Excel file that we created with all of our analysis and then sharing it via this method well if you recall back to this handy dandy table of the different Microsoft versions and the different skills or basically Technologies within Excel that
it uses Microsoft online which is where we hosted that first project doesn't have the capabilities of power query or power pivot because of that I could go through the process of adding the second project to this which it's this file right here I'll open it up then actually investigating it well it does if you investigate all the different sheets does go through and actually show the analysis that we did but if you actually get into manipulating it like in this case let's say I wanted to see what are the top skills of data analyst you're
going to get this popup right here that says this workbook contains external data connections or bi features that are not supported basically power pivot and power query aren't supported it can't actually query the data it's just showing the basic last snapshot of the data right here and you can't manipulate it so in this case Microsoft online becomes pretty useless so that's why I'm recommending sharing it via GitHub as you can share all the associated files with this if somebody want to they could come in here and download it along with going through and actually detailing
what you actually did so basically controlling the story line and sharing what the different analysis or insights that you actually found now this what you're reading right now is a read me and it requires understanding markdown and how to write in markdown so we're going to be covering that more in depth in the next video when we get into markdown and creating the read me this video is going to be primarily focused on just getting this project into GitHub so what are we going to be doing for this well we have five major steps we
need to get through the first thing is installing git which is the core technology used behind GitHub we'll explain more in a bit second and third we'll be going through actually setting up our GitHub account and then installing GitHub desktop to then manage with Git our different folders and projects and then fourth and fifth we'll be basically initializing the repository which is a fancy term for a folder and from there getting that folder repository onto GitHub to then share so before we install it what the heck is git well similar to how they have
track changes in stuff like Word and PowerPoint git does this it's a Version Control System it tracks changes in not only files but also code and because of all this it allows you also to collaborate with others when working on a project git is the core technology behind managing all these different things going on on your own local computer and then whenever you make any of these changes GitHub is where it keeps track of these final changes if you will and then displays it for the world to see and also pull those changes
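For anyone curious what git is doing underneath, here is a minimal command line sketch of the same idea: a folder becomes a repository, and each commit records a snapshot of the files in it. The course itself uses GitHub Desktop rather than these commands, and the temporary folder, the file name notes.txt, and the placeholder name and email below are assumptions made up purely for illustration.

```shell
# Illustrative only: track two revisions of a file in a throwaway folder.
set -e
repo="$(mktemp -d)"        # temporary folder standing in for a project folder
cd "$repo"

git init --quiet           # creates the hidden .git folder that stores every revision

# Placeholder identity for this demo only; normally your git setup supplies this.
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt                       # stage the new file
git commit --quiet -m "Add notes"       # first snapshot

echo "second draft" >> notes.txt
git add notes.txt                       # stage the edited file
git commit --quiet -m "Update notes"    # second snapshot

git log --oneline          # lists both snapshots, newest first
ls -a                      # the .git folder sits next to notes.txt
```

Both commits stay on the local machine here; nothing reaches GitHub until the repository is published or pushed.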
so here's my Excel data analytics course right here on GitHub and I have the same folders or repository on my own local computer now there's actually hidden folders or git folders in here managing this and I can do a shortcut on Mac of command shift period to show that but anyway I wanted to mainly show this of this .git folder in here and this thing I don't necessarily touch this at all or work inside of it this .git folder contains all the different revisions and tracks all the different changes within my project so in order to
get this .git folder inside your project and then also get it into GitHub we need to actually install git so navigate over to the git website into their downloads select your operating system of choice whether Mac OS or Windows I'm on a Windows machine right here and from there I'm going to select the 64-bit version for Windows and click here to download once downloaded I'm going to open the file it asks do I want to allow this to make changes to my device yes I do and then it's going to walk you through the setup process for
git all of these things are going to be left as default so feel free to just go through and select it all after I've left all the default settings as is and selected that it then gets into the actual install itself looks like it installed properly we'll go ahead and click finish we can confirm it's installed by opening something like terminal and you should have a terminal app installed this is just confirming it you don't necessarily have to do this anyway mine opens in a Powershell and you can just type something like git and it
shouldn't give you an error message it should instead give you how you could go about using git via the command line in terminal don't worry don't be afraid of this we're not going to be using git via the command line although I may need to make a separate course on that instead we're going to be using GitHub desktop to manage git so in order to use GitHub you need to have an account if you already have an account you can feel free to just sign right on in but if you don't go through the whole
process of entering your email providing your different credentials and then getting logged in once logged in it should direct you to your homepage if it doesn't you can come up here to this icon at the top and from there just select your profile I would go through at this point and actually customize your profile specifically adding a picture your name a little description and any social media links over here on the right hand side of on my homepage I have some different pinned repositories because you just set it up you probably have none but this
is where we're going to be putting your Excel project when you're complete so that way if people navigate to your profile they can see it now that we have this account we need to actually get our project or our repository onto GitHub but unfortunately there's not really an easy method I've found with actually using the UI from the website to do this and that's mainly because there's a lot of technical things going behind the scenes and managing git instead I'm going to recommend downloading github's application to install on your computer they have it for both
Mac and windows navigate to this link here and for this I'm going to go ahead and just download the 64-bit version of this application this one's a lot easier to install than git from here once we have it downloaded I'm going to open the file the installer should open this window for you to next sign into GitHub once you've entered your credentials for GitHub you'll use this to configure git and for this you're going to basically say hey I want to use my GitHub account name and email address to manage all this and click
finish now it should navigate you to the let's get started screen anyway it has methods for you to go through and create a tutorial repository if you want to we're not going to be doing that and it has some different options for this that you can also select via the file menu such as a new repository add local repository or clone repository we're going to be creating a new repository and as a reminder a repository it's basically a fancy name for a folder but it's a way for us to maintain and collect all of our different files
and whatnot for what we're using in our project so for this we need to give it a name so I'm going to give it something descriptive like Excel project data analytics and for description I'll just give the simple one of my project demonstrating my Excel skills for the local path we need to actually point it to the folder that has this so mine is inside my documents folder and real quick inside that folder itself right now I would expect you to have the project one and project two I'm also going to be putting all
the different files that I have for the different Excel workbooks that we work through in the lesson if you don't have them don't feel like you need it the main important thing is that you have both project one and project 2 in there and I have them conveniently located in different folders inside of here navigating out of that so I can select this Excel project. analytics folder I'm going to select this folder it's going to ask if I want to initialize this repository with a read me I do as far as the git ignore
I'll put none and license none as well and we'll create the repository so now you're going to be navigated to this screen here which is basically the default screen of GitHub desktop it allows you to select different repositories right now I have only the Excel project analytics one it allows you to select different branches we're going to stay on one shifting to another branch is beyond the scope of this course then up here at the top it has something like publish repository which we want to do but one quick thing real quick I can actually
investigate what files are going to be pushed up to GitHub by going here into history and right now it's just one I selected that box for read me so the readme is in there and the other one's just .gitattributes the other ones aren't in there and I'm doing this on a Windows machine well if I navigate back to the folder that contains my project so here I have Excel project. analytics which I selected from GitHub desktop whenever I go into it it actually created another folder inside of it and that has
the .gitattributes and read me that it's talking about now I've done this on both Windows and Mac and Mac doesn't cause this issue of putting another folder inside your other folder so for Mac users you may not have this problem so completely ignore this but for Windows users this is a problem because this right here is the project or the folder that's going to get uploaded to GitHub so what we need to do is take all the contents of this by selecting it all and just pressing control A to select it all and then dragging
it into that folder so a little confusing but if we go back to the documents we have our Excel project. analytics folder then inside of that we have our GitHub repo and then now navigating back into GitHub desktop I go over here and I see changes we have 85 different files and folders within there it's actually picking up on all those different files that I have in there once again if you're on a Mac you may not see this because it's already in there in history and you can see it's actually within
this portion of the GUI anyway the thing now is if we go ahead and publish this repository to GitHub it's only going to have what's inside of our history right now under this what we're calling a commit and a commit is a snapshot of your repository at the time that you're basically committing it so we need to do a commit in order to get all these different changes into a repository cuz technically right now they're in an area called a staging area or the working area anyway we need to provide a summary that's required
and I'm going to add something simple like add all Excel files doesn't need to be super descriptive and from there I'm going to click commit to main now if I go into history I have this initial commit that it did but then that add all Excel files it's going to then have in all those different Excel files that I added into it so now that our local repository on your machine is up to date we need to then publish this repository to GitHub and we can either click this button or this button here for
this we're going to keep the same name and description that we have before we don't want to keep this code private so we're going to uncheck that box and then from there we're going to click publish repository so my repository has quite a bit of Excel files and the memory size of it is pretty large so it is taking a little bit of time to do this so now we've completed pushing our local repository to our remote repository on GitHub so inside of GitHub I can navigate up here to the right hand side and I
go to your repositories and here it is the Excel project data analytics that we made public and it's all in here so now somebody can come in here and see our different work in this case our project One dashboard is inside of here we have our Excel file in there and Bam we've set up git and also GitHub and that was a push so now we need to demonstrate what is a pull and so in order to do that a pull request we need to actually make changes on our remote repository so that on GitHub
and then pull it into our local repository so here's what we can do for that I'm going to just go in and we created this readme markdown file upon creation because we selected that checkbox you can actually come in here and edit this read me by clicking the edit file button and I'm just going to come in here and I'm just going to say hey I added this on github.com adding it in the bottom now we're going to go into markdown formats and stuff as you can see we have this hashtag here we're
going to go over all that in the next lesson but anyway I made these changes here so we need to like we did on our local repository when making a change we need to commit those changes here and conveniently it just gives us a commit message of update read me confirm the correct email and it commits directly to the main branch we're just staying on that branch we're not shifting for this course at all from there I'm going to commit changes so now if I go back into the project itself scroll on down to see
the read me I can see that I have I added this on GitHub whereas on my local machine if I go in to look at the readme markdown it doesn't have that addition that I added to the readme file so we need to pull those changes going back to the GitHub desktop app I'm going to come up here and you notice that it says fetch origin this isn't going to do anything this is just going to fetch origin basically the main branch and pull it in this isn't going to make any changes to your file it's
just going to update it of what's on GitHub and we can see based on this that we have basically one change here by this one and this down arrow and so in order to get these changes we need to pull the origin pull it and so I'm just going to click it to pull and now when we go into the history we now have this new one of update read me we can see that this readme has this addition because it's in green of I added this on github.com and then inspecting this in the readme
itself it now updated to say hey I added this on github.com so bam we just demonstrated how to push and also pull from our local repository and machine to our remote repository so now that we have GitHub and git all set up we now need to get in to actually building out those readmes and explaining what we did in our project and demonstrating those skills that we gained in this course so that's what we'll be doing in the next lesson if you're getting stuck at any point during the way I highly recommend that you take
advantage of something like ChatGPT or even Gemini or whatnot and actually paste in your error code and it will help you with troubleshooting it it's a lot quicker than posting a comment in here saying that you had an issue all right with that see you in the next one we're getting into the readme see you there welcome to the last video in this course and in this we're going to be going over how we're going to actually document all the different work that you did for project one and for project two we're going to
be putting this into our markdown file or our read me and then from there getting it onto GitHub and then finally going through how to share it on LinkedIn so right now navigating to our GitHub repo with our project in it you should have at least two folders in there one for your project One dashboard and one for your project 2 if you have your other folders for all the work that you did for all the other lessons in this course that's awesome too but not required mainly just have your project work in there anyway we
have this read me for the entire project itself and right now it's pretty Bare Bones and if we navigate into that project One dashboard right now you should only have a file in there specifically that Excel file but we need also a readme in here as well so we can add a description of what we did in that dashboard similarly project 2 doesn't have a read me as well now we have demonstrated in that last lesson how we can actually go into something like the readme and then from there edit it inside of
your web browser by just clicking this edit this file icon it shows not only the edits for you to actually go through and maybe type something but also the preview itself of what the file is going to look like don't worry we're going to be going over markdown syntax in a little bit but anyway that's how we're going to be doing all these different changes to the files for this I'm not going to do these changes I'm actually going to cancel these changes now an alternate option to making edits to something like a readme
is using a text editor or IDE integrated development environment such as something like Visual Studio Code which is completely free and I have it launched here it's an app that I use in order to edit and manage my different files I can also go through if I'm editing the read me itself I can type inside of here and edit it but also during that I can actually go in and view what's going on with the actual read me itself off to the side while I'm typing here in this other window
anyway I just want to make you aware of this that is an option for you to go through but it does take some experience with knowing how to use vs code setting this all up so based on the complexity we've already built up already we're going to stick to just editing our readmes inside of github.com so before we get into building our project readmes we need to understand some syntax here specifically if you notice this Excel project analytics is capitalized and everything else is lowercased and if we actually go in and edit the file we
can see that we have this hashtag at the front which translates this into a heading so they have special characters that you can actually use in front or around text to manipulate text and the team that created markdown conveniently created this cheat sheet which I'll link here and it shows all the different methods that you can use to actually manipulate and make different things happen inside your markdown file so let's actually look at a few here I have a heading one heading two and heading three denoted by how many hashtags and a space and then
if I preview this heading one heading two and heading three next we can either bold or italicize text by surrounding it with either double asterisks or single asterisks and the final results right here is bold and italicized notice how the bold text and italicized text are on the same line it's important that before you go to a new line you actually put two spaces at the end now that I have that in there it will actually shift it to the next line we can also do things like an ordered list or an unordered list which would be like
bullet points and it conveniently indents that and makes it look a lot nicer we can also surround something by a back tick which is located up at the top of your keyboard or you could do triple back ticks at the top and bottom for if you have multiple lines of code and if we actually go to preview this we can see that the single line of code was just surrounded whereas a multiline creates this entire coding block the final two worth mentioning are links and also images for the link for the text that you wanted
to appear for the link you'll put in square brackets and then for the hyperlink itself you're going to put that inside a parentheses right next to it and then actually changing this to a real world example of something like google.com if I go to preview and then I click this link it's going to ask me if I want to leave site and go to Google I'm not going to do it because it's going to mess up all my changes but you get the point for images is very similar but the text you provide in the
square brackets is just your alternate text so whenever you scroll over it the text displays and then from there is the actual image location however this isn't an actual image location so I have this error message that goes on with this alt text hence this broken file you're going to notice that if any of your files for your images are broken anyway github.com actually makes it pretty easy to get images in in this case I have a gif of the dashboard you could also use an image file but all I have to do
is take it and drag it into here and if you notice it automatically formatted it with alt text and then the actual link location itself so saving the file itself and it puts that exclamation point at the front signifying that it's an image or in this case GIF if I go to preview scrolling down we can see that we have our image once again you need to put spaces after that other one to make sure that you're not having it all in the same line but you get the point anyway let's actually get into creating
this README that's on the homepage if you will of our actual project and the main point of this one is I want people to be navigated to the appropriate project depending on what they're looking for so I went ahead and put in some text already for how I want to break this down I'll shift over to preview and I'm going to have a title such as My Excel Data Analytics Projects from there we're going to have the salary dashboard project and the salary analysis right now the image that I have for the dashboard
is in the wrong location so I'll actually shift that up now I went ahead and added the images also for our salary analysis while cleaning up where the salary dashboard is for which I included only two graphs here but I just want to give a sneak peek of what's going to be inside of those other READMEs that we're about to build out now you may be wondering how the heck do I get screenshots of graphs in my different dashboards well depending on if you're using Mac or Windows they have software installed already and so these shortcuts should
work for you in order to perform your appropriate screen capture I primarily use on a Mac Command + Shift + 4 to select a certain area and it allows me to basically just hover over something and snapshot it this same thing can be done on a Windows machine you're just going to press Windows + Shift + S so I went through also and just added a quick description to each section I'm going to go into preview because it's a little bit easier to read there anyway underneath this I just detail hey this contains all my Excel files to follow
along in my case with my free course of Excel for data analytics I would word it differently for you to say that you're actually providing all your different project work in this repository additionally I provide a short description for the first project and then also a short description for the second project make sure in this case you actually are putting blank lines after those lines so you don't have those images overlay on top of it now the last thing I would do as you see here I link to my course but I think more importantly what you need
to do is actually link to the appropriate files within this repository so people can quickly get to the salary dashboard or the salary analysis and so I'm going to add this link connecting to that appropriate project by first adding this text of check out my work here and then inside parentheses I'm going to list the folder of project one dashboard you have to make sure you spell it exactly like the folder that is inside of your repository or the link's not going to work I'm going to do the same with the project two dashboard
as well and going to preview I can see it's all there I probably want some space in between these and so I'll just put an extra enter in there okay that's good enough I'm going to get into committing the changes the message is update my README that sounds good I'm going to commit them so now on the home folder of our repository of Excel Project Data Analytics scrolling down I have my README here it tells me about it and then for the salary dashboard it says hey check out my work here when I click on it
it navigates me into this folder for the salary dashboard which you need to now create a README for also it's just good practice to make sure that you check that the other link works as well and in this case it didn't it's a good thing we checked it I had project 2 dashboard and instead it was actually project 2 analysis I'm going to commit the changes and then now when I actually try it out bam it navigates me to the right location so now you have the basics to go through you understand markdown enough
to edit it I'm going to walk through how I built out the project one README and also the project 2 README so that way you have some understanding of what you should do going forward with the project one I recommend including a picture of the dashboard to start and then a brief intro detailing why you wanted to do this project underneath this make sure you include a link to the file itself which is conveniently right here and then inside of here detailing the different skills that you used with building this is really important
for job seekers that way if a recruiter comes and looks at this they see what skills you used in this and then from there I talk about the data set itself talking about what we were trying to get or extract out of the data so basically all the foundation they need in the introduction portion this portion I recommend keeping in a similar format the next portion you can feel free to go about however you want specifically I go into the dashboard build breaking it down into three main areas of focus first is
the charts themselves I highlight the different median salaries of all of the different job titles I go into some insights from that I also talk about the country map and the insights from this as well next after charts I move into functions and formulas detailing one of the key functions that we used using MEDIAN and then an IF statement in order to build out an array formula so not only breaking it down but also explaining what insights we're able to get with this formula and then the third skill I talk about is data validation talking
about why it's used with a GIF of how it's actually applicable or how it's actually visually seen in Excel and then finally I just wrap it up with a conclusion so to recap for the first project you need an intro statement describing what you're doing and why you did it and what skills you used then from there on the build itself explaining what you actually built how you used those skills and what insights you got out of it and then finally wrap it up with a conclusion for the second project mine is very similarly formatted
in that I have an introduction Excel skills used the data set and then since this one was primarily focused on analysis I included the four questions that we went through and actually answered for our analysis so then with the template of these four questions I broke each one of those down primarily focusing on one what skill did I use to help answer that question and then two what analysis insights did I get out of answering that question I repeat the same thing for the second question specifying the skills that we use
for this and then the analysis or what insights we got out of it after going through questions three and four we then get to our final thing of a conclusion of what you actually learned and the insights you extracted from this so it's really good to put all this stuff in it but I wouldn't be overwhelmed and think you need to include everything in it think about a job recruiter themselves they don't have a lot of time so keeping it as short and to the point as possible is going to be best for you once you're done
actually going through and building out your repo with all its associated READMEs it's time to get into actually sharing this on social media via LinkedIn I recommend the same approach that we used back in project one of listing this down in your projects section by going through and actually clicking the add icon and adding the projects if you did go through and actually add that salary dashboard already I would just focus this one on the salary analysis so I'd put in something like a name of the data science job analysis a description add any
appropriate skills there's a ton of different skills you can actually select for what you used I would focus primarily on these of Microsoft Excel Power Query data modeling ETL and pivot tables for the media in this case I would include a link to your repo and paste it on into here and click add it will then provide this snapshot thumbnail of what's going on here and a title I like it all I'll click apply now if you recall back from that first project we tried to provide the link of that OneDrive link for Excel and
it didn't work so if you have that project on LinkedIn I would go through and also attach this link as well to that so that way they know how to navigate to it finally select your start and stop date and any contributors associated with it which I don't have in this case and then from there save it the last thing I recommend doing is making a post telling others about your project so they can come in and see it in it I would definitely include something like a link and feel free to tag Kelly
or myself in it I love checking out your projects and seeing the different work that you've done for it so once again congratulations on finishing this course it's been nothing short of your hard work Excel was the first skill or main skill that I learned in helping me land my first data analytics opportunity so I feel the same can go for you as well now after taking a short break when you're ready to get back into learning more skills I do have a SQL course that I recommend you take as you've learned from analyzing this
data Excel and SQL are two of the top skills of data analysts so it pays to know it and you can basically learn it in a weekend all right with that I'll see you then either in the next video or in the next course see you there
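The lesson above catches a broken README link by clicking it: the link pointed at a project 2 dashboard folder when the folder was actually the project 2 analysis. As a rough illustration only, and not something shown in the course, the same check could be scripted. The Python sketch below scans README text for Markdown links (`[text](target)`) and images (`![alt](target)`) and reports relative targets that do not exist on disk. The function names, the folder names `1_Salary_Dashboard` and `2_Salary_Analysis`, and the image path are all hypothetical stand-ins, and the pattern assumes targets contain no spaces.

```python
import re
import tempfile
from pathlib import Path

# Matches both [text](target) and ![alt](target); group 1 is "!" for images.
# Assumption: link targets contain no whitespace.
LINK_PATTERN = re.compile(r"(!?)\[([^\]]*)\]\(([^)\s]+)\)")


def find_broken_links(readme_text: str, repo_root: Path) -> list[str]:
    """Return relative link targets in readme_text that do not exist under repo_root."""
    broken = []
    for _bang, _label, target in LINK_PATTERN.findall(readme_text):
        # External URLs and in-page anchors cannot be checked on disk, so skip them.
        if target.startswith(("http://", "https://", "#", "mailto:")):
            continue
        if not (repo_root / target).exists():
            broken.append(target)
    return broken


def demo() -> list[str]:
    """Build a throwaway repo layout and check a sample README against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical folder names standing in for the real project folders.
        (root / "1_Salary_Dashboard").mkdir()
        (root / "2_Salary_Analysis").mkdir()
        readme = "\n".join([
            "# My Excel Projects",
            "",
            "Check out my work [here](1_Salary_Dashboard).",
            "",
            "Check out my work [here](2_Salary_Dashboard).",  # wrong folder name
            "",
            "![Dashboard preview](images/dashboard.gif)",  # file never created
            "",
            "Course link: [Excel course](https://lukeb.co/excel)",
        ])
        return find_broken_links(readme, root)


if __name__ == "__main__":
    print(demo())
```

Running it against a local clone would mean reading the real `README.md` and passing the clone's folder as `repo_root`. Clicking each link on GitHub, as in the lesson, remains the simplest check for a small repository.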