In this course, you will learn how to use Python to automate everyday tasks such as creating an Excel report, sending text messages, extracting tables from websites, interacting with websites, and more. You will learn how to use a few Python automation libraries over a series of projects. Frank Andrade created this course. Frank is a data scientist and an experienced teacher. Make sure to check the description for a link to the code and an automation cheat sheet Frank created for this course.

Okay, now let's see how to extract tables from a website in Python. We can use the pandas library to extract tables from websites like Wikipedia. Here I have a list of The Simpsons episodes on Wikipedia, and we can extract all the tables that you can see here. We only need the pandas library, and I'm going to show you how to do this right now. I'm using Jupyter Notebook, and to install pandas you first have to write pip install pandas. In Jupyter Notebook you have to put an exclamation mark in front of the command (!pip install pandas), but if you're using PyCharm or the terminal, just write pip install pandas.

All right, once you install pandas, we have to import it, so we write import pandas as pd, where pd represents pandas. Then, to extract tables from websites, we only have to write pd.read_html(). Inside the parentheses we write the link of the website, so here I open quotes and paste the link of this website, which I copied before. This returns a list, and this list is going to have multiple tables.
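
As a rough sketch of the call just described, assuming pandas and an HTML parser such as lxml are installed: the snippet below passes pd.read_html() a small made-up HTML string through StringIO instead of a live URL, so it runs offline. The table contents and the variable names html and tables are illustrative assumptions, not the real Wikipedia page.

```python
from io import StringIO

import pandas as pd

# Made-up page with two tables; a real call would pass the page URL instead.
html = """
<table>
  <tr><th>Season</th><th>Episodes</th></tr>
  <tr><td>1</td><td>10</td></tr>
  <tr><td>2</td><td>12</td></tr>
</table>
<table>
  <tr><th>No.</th><th>Title</th></tr>
  <tr><td>1</td><td>Episode A</td></tr>
  <tr><td>2</td><td>Episode B</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(StringIO(html))

print(len(tables))  # how many tables were found
print(tables[1])    # the second table (Python lists are zero-indexed)
```

With a real page, the only change would be passing the URL string as the first argument.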
Because the website has not just one but around 20 tables, we're going to get around 20 tables in a list. I'm going to name this list simpsons. Now I run this, and we get these tables inside the list. First, let's see how many tables are on this website, so here I write len(simpsons), run it, and we have 23 tables. Let's get only the first table, or actually the second, because I think the second is the one that corresponds to the first season. As you can see here, we have the second table, which corresponds to the first season of The Simpsons. We have episodes one, two, three, and so on, and if we go to the website, we can verify that this is the same data: here is season one, and you can see that it's the same data. So we successfully extracted all the information in this table.

Now let's perform some basic web scraping with pandas. Web scraping consists of extracting data from websites. Instead of doing it manually, we can automate it
with some web scraping techniques. In this video, we're going to extract CSV files from a URL using only pandas. Here is the target website we're going to scrape. This website contains data about football matches from different leagues, so here you can see a lot of links. I'm going to choose the first one, which says England football results, and here we see data about the Premier League and the other leagues that England has. If I want to download one of these files, I have to click on it, and as you can see, I downloaded the CSV file of the first one listed here. This one corresponds to the 2021/22 season of the Premier League. Instead of manually downloading each file, we can use a specific pandas method to read these files from the internet, and by using a for loop we can automate this and download all the files listed here instead of clicking on them one by one. There are a lot of them.
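
A minimal sketch of the for-loop idea, assuming the pandas method in question is pd.read_csv: the helper name read_many, the dictionary sources, and the sample rows are all made up for illustration. In-memory StringIO objects stand in for the links so the example runs offline; in practice each value would be the copied link address of one CSV file.

```python
from io import StringIO

import pandas as pd


def read_many(sources):
    """Read every CSV source (URL, path, or file-like object) into a DataFrame."""
    frames = {}
    for name, source in sources.items():
        frames[name] = pd.read_csv(source)
    return frames


# Hypothetical stand-ins for two downloadable season files.
sources = {
    "season_a": StringIO("HomeTeam,AwayTeam,FTHG,FTAG\nTeam A,Team B,2,0\n"),
    "season_b": StringIO(
        "HomeTeam,AwayTeam,FTHG,FTAG\nTeam C,Team D,1,1\nTeam D,Team C,0,3\n"
    ),
}

frames = read_many(sources)
for name, frame in frames.items():
    print(name, len(frame), "rows")
```

Keeping the frames in a dictionary keyed by name is just one design choice; a list would work equally well if the names don't matter.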
We can download them with just pandas and a for loop in Python, so let's do it. First, I'm going to show you how to extract data from a single CSV file on this website. To do that, we use the read_csv method, so we write pd.read_csv and open parentheses. We've used this method before, but back then we read data that was in the folder where we were working, that is, the folder where this Jupyter Notebook file was located. In this case, we're not going to read anything from our computer; we're going to read data that is on a website. So instead of writing the path of a file on your computer, we're going to write the link of the file. Each file here has a link, and to download it we have to make a request to that link. Let me show you: I right-click the file and copy the link address.
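
To make the path-versus-link point concrete, here is a small sketch under stated assumptions: it writes a tiny made-up CSV to a temporary folder and reads it by path, and the comment marks where a link would go instead. The last lines preview the column renaming that this walkthrough performs a little later on the loaded data; the column name FTAG is assumed by analogy with the FTHG column mentioned there.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Write a tiny made-up CSV to disk so the example runs offline.
csv_text = "Date,HomeTeam,AwayTeam,FTHG,FTAG\n13/08/2021,Team A,Team B,2,0\n"
csv_path = Path(tempfile.mkdtemp()) / "sample.csv"
csv_path.write_text(csv_text)

# Local file: pass its path. Remote file: pass the copied link address
# (a string starting with http:// or https://) in the same position.
df = pd.read_csv(csv_path)

# Rename two columns; inplace=True updates df itself instead of returning a copy.
df.rename(columns={"FTHG": "home_goals", "FTAG": "away_goals"}, inplace=True)

print(df.columns.tolist())
```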
I'm going to paste it here in the browser and press Enter, and let's see what happens. As you can see, instead of going to a web page, the browser downloaded the file. This means that this link contains the data we want to extract, so we're going to use this link. I copy it again, making sure I use copy link address, then go back to the notebook, open quotes, and paste the link. This is the link we want to extract from because it ends in .csv, which means it's a CSV file, and as you might remember, we read CSV files with the read_csv method. That's everything you have to do to read this CSV file that is stored on this website. Now I run this, and let's see the results. As you can see, all the data was read with read_csv and loaded successfully. Now I'm going to assign this to a new variable called df_premier21. This belongs to the 2021/22 season, because the date column says 2021, and it's the Premier League, because the teams belong to the Premier League. You may know that if you're familiar with this competition, but if you're not, it doesn't matter. So I assign this data frame to the variable, press Ctrl+Enter, and display the data frame that we already saw. Now I'm going to rename some columns, because some column names aren't
so obvious, and you may struggle to understand what a column means. I'm going to rename some of them quickly, which also lets us practice the rename method. I write the name of the data frame, then .rename, open parentheses, and since we want to change the columns, I write columns equal to a dictionary. Let's say we want to change only these two columns, so I'm going to tell you what they mean. Each key is the name of the column we want to change, followed by a colon and then the value, which is the new name; after a comma comes the second key-value pair. The first column, FTHG, stands for full-time home goals, meaning all the goals scored by the home team, so I'm going to replace this name with home_goals. The second stands for full-time away goals, so I'm going to write away_goals. Those are my new names. To update the column names in the data frame itself, I write inplace equal to True. Now I run this and display the updated data frame: this column is now named home_goals and this one away_goals. And that's it. In this video, we extracted a single CSV file from a URL with pandas.

All right, now I'm going to show you how to extract tables from PDFs. Here I have a PDF, and we're going to
extract this table that you can see here so we can get it in another format; for example, I'm going to export this table to CSV. I'm going to leave this PDF in the description so you can extract the table from it too. To extract tables from PDFs, we have to install a library called Camelot, so we open up the terminal and write pip install camelot-py. But before you install this library, you have to install two dependencies. The first is tk, so you write pip install tk, and the other is Ghostscript, so you write pip install ghostscript. These are the two requirements before you install Camelot. Once you have these libraries, we import Camelot by writing import camelot, and now we can start using the library. To read a PDF, we use camelot.read_pdf, and here I write the name of the PDF, which is foo.pdf, and then I specify which page I want; in this case, the first page. You can add other parameters here, for example the flavor parameter, which is the parsing method to use. It's set to lattice by default, but if you fail to extract the table from your PDF, you can set it to stream, which may work better. I'm going to leave it at the default value, and I'm going to assign the result to a variable named tables. Now let's see what's inside
this tables variable. I print tables, and we see that we have a TableList with n equal to 1, which means there is only one table on page one. That's exactly what we have: there is only one table in our PDF. Great, now let's export this table to a CSV file. To do that, we write tables.export and then the name of the CSV file we want to export to, so I write foo.csv. This is just a name I chose. Then I set the format to csv and set compress equal to True. Now, in case you have many tables, you may want to pick one table in particular. In my case I only have one table, so I pick the first one using tables[0], which represents the first table, followed by .to_csv with foo.csv inside the parentheses. With this, we export the first table to a CSV file.
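
The Camelot steps above can be sketched as one small function. This is a hedged illustration, not the course's exact code: the function name export_first_table and its parameter names are made up, and it assumes camelot-py and its dependencies are installed and that the PDF contains a table Camelot can detect.

```python
def export_first_table(pdf_path, csv_path, pages="1", flavor="lattice"):
    """Read the tables on the given page(s) and write the first one to CSV."""
    # Imported here so this sketch can be loaded even where camelot is missing.
    import camelot

    # flavor="lattice" is the default parsing method; "stream" is the alternative.
    tables = camelot.read_pdf(pdf_path, pages=pages, flavor=flavor)
    if tables.n == 0:
        raise ValueError(f"no tables found on page(s) {pages} of {pdf_path}")

    # tables[0] is the first detected table; to_csv writes it to disk.
    tables[0].to_csv(csv_path)
    return tables[0].df  # the same table as a pandas DataFrame
```

With the file names used in this lesson, the call would be export_first_table("foo.pdf", "foo.csv"). If nothing is detected, retrying with flavor="stream" is the fallback mentioned above.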
We only have to run this, and now in my working directory I have a new file named foo.csv. Let me open it: as you can see, the data is here in CSV format, and we can verify that it's the same data I had before in the PDF. It's exactly the same, but now in CSV format. And that's how you extract a table from a PDF.

In this video, we'll learn some HTML basics that will help us when scraping a website. HTML stands for Hypertext Markup Language and is one of the most basic tools used to build websites. It defines the meaning and structure of web content. Although we're not going to build a website in this course, it's important for us to get familiar with the basics of HTML. This will help us understand the code behind a website, and as a result we will be able to find the best approach to scrape a website and get the data we want.

Now let's review the HTML markup syntax. The first element we see on screen, in purple, is a tag. Tags are keywords hidden within a web page that define how a web browser must format and display the content. Most tags have two parts, an opening and a closing part. For example, h1 is the opening tag and h1 with a slash is the closing tag. Note that the closing tag has the same text as the opening tag but with an additional forward slash character. A few tags, like the img tag, which stands for image, are an exception to this rule because they don't need a closing tag. There are plenty of tag names, and we'll review the essential tags for web scraping a bit later.

The second element we see is the tag attribute. Attributes allow us to customize a tag and are defined within the opening tag. For example, the h1 tag contains an attribute named class. Attributes are often assigned a value using the equal sign; the attribute value in this example is title. Finally, we have the affected content, which is what we usually see on a website. It can be text, for example the title of an article. It's called that because it's affected by the tag and the attributes defined.
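
To see the three pieces (tag, attribute, affected content) pulled apart, here is a small sketch using Python's built-in html.parser module. The class name ElementPrinter and the sample markup are made up for illustration; the img tag is included to show a tag with no closing part.

```python
from html.parser import HTMLParser


class ElementPrinter(HTMLParser):
    """Collect the pieces of each element: tag, attributes, and text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs from the opening tag.
        self.events.append(("open", tag, dict(attrs)))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.events.append(("text", data.strip()))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


parser = ElementPrinter()
parser.feed('<h1 class="title">Titanic (1997)</h1><img src="poster.png">')

for event in parser.events:
    print(event)
```

The h1 produces an open, a text, and a close event, while the img produces only an open event.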
All of this is known as an HTML element or node. Great, so far we've learned some of the basics of HTML. Now it's time to see some of the most important tag names for web scraping, starting with some of the most common ones. The first is the head tag. This represents the head section and is used mostly for metadata. Then we have the body tag, which establishes the body of an HTML document. The third HTML tag you see on the screen is the header tag, not to be confused with the head tag. The header tag typically contains introductory content and layout that goes at the top of the page.

Let's see these three tags in action on a website that we're going to scrape later in this course. First, let's inspect the website: right-click and select Inspect. After this, you should get the HTML code of the website. Scroll up and go to the head tag, as you can see on the screen. This is the head, which doesn't show up as anything on the page, but if you go to the body, you'll see that the whole website is highlighted in blue, and if you select the header, you'll see a section of the website highlighted in blue, which is where the logo goes.

Next we have the article tag. This tag is new in HTML5 and can be used to contain blog entries, posts, and so on. Then we have the p tag, which holds a paragraph in an article, and then the h1, h2, and h3 tags, which are heading levels: h1 is the level-one heading, the headline or title of a page, h2 is the level-two heading, the subtitle of a page, and so on. On this website we can see the article tag, the h1 tag, and the p tag: the first contains the whole article, h1 contains the title of the movie, and the p tag contains the plot of the movie.

The following tags are ones you're going to see really often when scraping websites, and you should at least recognize them to find the best way to scrape a website. First there is the div tag, which is a division, a kind of generic container. Then there is the nav tag, which
is used for specifying a navigational region within a document, like the pagination bar. Then we have the li tag, which represents a list item within an ordered list or an unordered list. Then we have the a tag, a for anchor, also known as a hyperlink or simply a link. To make an actual link with the a tag, we use the href attribute. For example, on this website we see many movies listed. We see that they are inside a ul tag, which contains an a tag, the anchor, and inside it is the li tag, the list item. As we can see, the a tag has the href attribute, which contains the link that redirects to another page. So now you can see how they relate to each other in this HTML document.

Let's see the last six tags I listed here. First we have the button tag, which specifies a button that can be clicked; it's commonly used with forms. Then we have the table tag, which is used for making tables in an HTML page. Then there is the td tag, which stands for table data and represents a data cell within a table. Then we have the tr tag, which stands for table row and defines a row of cells in a table. Then we have the ul tag, which is an unordered list; it's used with the li tag to make an unordered list. To see these last tags in action, let's check this website that contains data inside rows of a table. As you can see, we have the tr tag, which represents a row, and the td tag, which represents each element of data. Then I scroll up and get the table tag, which is the whole table that contains the group of data.

The last tag I want to show you is the iframe tag. This one makes it possible to embed another page within a page. In HTML5, this is known as nested browsing. Iframes sometimes make web scraping a little tricky, because you have to switch from one frame to another, but that's not the case on all websites, so you have to recognize whether the website you are scraping has multiple iframes or everything
is contained in just one iframe.

Before we start writing some code, let's review some HTML basics. First, here's the website we're going to analyze. I chose this website for its simplicity, so you can successfully scrape your first website. The website has three main sections: the title, the plot description, and the transcript. Now let's have a look at its HTML code. Here I wrote a short version of the HTML code of the website, which contains transcripts of thousands of movies. If you've never seen HTML code before, you will still recognize some text elements like the title, description, and transcript. In this case we're looking at the transcript of the movie Titanic, which is why that title is there. Besides those elements, something you will always see in HTML code are tags. Tags are those words surrounded by angle brackets, and there are opening and closing tags. For example, in this HTML code we have four tags: article, h1, p, and div. Each of them represents a node, and we'll see them in detail in the following tree structure. The tree structure will help us come up with effective ways to scrape a website and will also be the foundation for XPath, which we'll use later in this course.

Now let's start building this tree. The first element of the tree is the root; in this case, the root is the article element. As we've seen in the HTML code, this article tag has a class attribute with the value main-article. The root element contains the h1 tag, which in turn contains the text Titanic, the title of the movie. Another element inside the root is the p element, which has an attribute and a text: the attribute value is plot and the text is the plot description. Finally, we have the div element, which also has an attribute, with the value full-script, and text with the transcript of the movie.

All the rectangles you see on this tree are called nodes. Every node has exactly one parent, except the root; the h1 node's parent is the article node. Siblings are nodes with the same parent. An element node can have zero, one, or several children, but attribute and text nodes have no children. For example, p and div each have two child nodes, an attribute and a text, but no child elements. Now, something very important in this tree
is the set of attributes, because they will determine the approach to take to scrape a website. In this example you see only classes, like main-article, plot, and full-script; however, you can also find ids and others.

In this video, we'll learn how XPath works. XPath is key to easily learning web scraping with Selenium and Scrapy, so let's start. XPath stands for XML Path Language. It's a query language for selecting nodes from an XML document, but we can also use it to select elements from HTML pages. Another way to select elements from websites is using CSS selectors. CSS stands for Cascading Style Sheets, a language that was mainly created to style HTML web pages. CSS selectors and XPath have some similarities in their syntax; however, in this course we'll mainly see how to select elements with XPath because of its simplicity.

Now let's review the XPath syntax. With XPath, we can select an element by writing a double slash followed by the element name. This is the most basic way to locate an element or node. For example, if you want to select all the h1 elements in an HTML page, you write //h1. The double slash is a special character in XPath that means pick a matching node located at any level within the XML document. We're going to see the meaning of this and other special characters in detail in the next video. We can also select an element based on its position by using square brackets. Let's imagine that there are two h1 tags in the page: you pick the first with the XPath expression //h1[1], and the second with //h1[2].

We can also specify attributes by adding square brackets and writing the attribute name inside. This is the standard form of an XPath expression: it starts with a double slash followed by the tag name, then square brackets containing the at sign (@) and the attribute name, like class, id, and so on, then the equal sign, and then the attribute value inside quotation marks. This is how we build an average XPath expression. We can also use functions to find specific values. For example, the starts-with function searches for text at the beginning of a value, while the contains function searches for text included anywhere inside it. In this example I included the contains function inside the XPath expression. As you can see, we use parentheses to wrap the attribute name and the attribute value, and we separate them with a comma instead of the equal sign. You can also use the and and or logical operators in XPath expressions. They should be placed inside the square brackets, and every time you use operators, wrap each condition in parentheses. We'll
see how functions and operators work in action using a small piece of HTML code in the next video.

Now it's time to test some XPath expressions with the small HTML code we've been using so far. I'm going to leave the HTML code, along with the link to this website, in the description so you can play with it and understand how XPath works. As you can see, this is a short HTML version of the Titanic transcript page we've seen before, and it's going to help us review what we've learned about XPath so far. There is the title, the plot description, and a small transcript I wrote, and you can test XPath expressions in that bar. For example, I write a double slash and then the tag name, h1. The result includes the h1, which is the title, Titanic (1997). If I want the text, I can just add a slash and then text followed by parentheses: //h1/text(). This is one way to get the text, but there are many ways to get text with Selenium or with Scrapy, and I'm going to show you how to do it later.

Now let's test some other XPath expressions. Let's try p, the paragraph tag: I write //p and get two results, because I included two plot elements. Now let's try //div and get the text inside it: we get the transcript of this movie, which here is just a short phrase. Back to //p: I get two results because I have two plots, so if I want to specify one element in particular, I have to open square brackets and write a number inside. For example, if I write 1, it means the first p element, which is the first plot, the one that contains "84 years later", and if I write 2, it's the second plot, the one that contains the text "in the end".

Now let's use div to test the average XPath expression. I write the at sign, then class, then open quotes and write full-script. This is the XPath expression of the transcript element: the div tag, the class attribute name, and the full-script attribute value, as we've seen before. If I want the text, I just add a slash and text with parentheses, and we get the text. Now let's delete all the text values and leave just the structural symbols of the XPath expression, like the double slash, the quotes, the square brackets, and the at sign. This is the skeleton of the XPath expression.
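
The same expressions can be tried outside the browser. This is a hedged sketch that assumes the third-party lxml library is installed (the lesson itself uses an online playground and, later, Selenium); the HTML below is a made-up stand-in for the small Titanic page, and the hyphenated class names are assumptions. It also exercises the contains function and the logical operators from the syntax overview.

```python
from lxml import html

# Made-up stand-in for the small Titanic page used in the lesson.
page = html.fromstring("""
<html><body>
<article class="main-article">
  <h1>Titanic (1997)</h1>
  <p class="plot">84 years later</p>
  <p class="plot2">In the end</p>
  <div class="full-script">Sample transcript line</div>
</article>
</body></html>
""")

title = page.xpath("//h1/text()")          # text of every h1
first_plot = page.xpath("//p[1]/text()")   # first p (XPath counts from 1)
second_plot = page.xpath("//p[2]/text()")  # second p
script = page.xpath('//div[@class="full-script"]/text()')

# Logical operators and functions go inside the square brackets.
either = page.xpath('//p[(@class="plot") or (@class="plot2")]')
both = page.xpath('//p[(@class="plot") and (@class="plot2")]')
has_plot = page.xpath('//p[contains(@class, "plot")]')

print(title, first_plot, second_plot, script)
print(len(either), len(both), len(has_plot))
```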
Now I fill in the p tag, class, and the value plot. As you can see, I get one element, because only one element meets these conditions. Now let's test some logical operators. To do so, I wrap the condition in parentheses, copy it, write or, and paste the condition again with plot2 as the value. I'm saying: locate an element that has the class plot or the class plot2, either one or the other. As you can see, I get two results, which tells us that both elements satisfy this condition. If I write and instead of or, I get no results, because there is no element that satisfies both conditions.

Now let's try the contains function. I delete one condition so that just one remains, and inside the square brackets I write the word contains. As you can see, it says that the function should have at least two arguments, so I replace the equal sign with a comma. Now I get two results, because both elements contain the word plot inside the attribute value. The contains function still works if I change the XPath expression: let's change the p tag to div and the attribute value plot to script. Now I get the transcript element, which is the full-script one. And that's how you use the contains function with XPath expressions. Now it's your turn to test any XPath you want with this HTML code.

Now it's time to see some special characters that
will come in handy when writing XPath. This lesson will be extremely useful for building robust XPath expressions that efficiently locate any element we want. First we have the single slash character. This one selects the children of the node set on the left side of the character. In contrast, the double slash specifies that the matching node set can be located at any level within the document. For example, if we write //article, we match article nodes at any level of the HTML tree; in this case, the root is the only one that matches. If I then add /h1, I get the immediate h1 child, in this case the movie title. If I add /p, I get the two p elements, and if I add /div, I get the transcript. On the other hand, if we want to get the text, we should write text followed by parentheses. However, if we add it to //article with only a single slash, we won't get the text we want, because that text is not a direct child node of article. What we have to do is first write the element that contains the text: for example, if we write /h1 before /text(), we get the title. You can also get the text using //text(), because the double slash matches nodes located at any level within the document.

Then we have the dot character. This specifies that the current context should be used as the reference, so it refers to the present node. Then we have the double dot, which refers to the parent node. Sometimes it's difficult to find an element on a website, but if you find one of its children, you can use this special character to reach the parent. Now I have //h1, which matches the title. Let's see what happens if I add a single slash and a period: as you can see, I still get the same element, because the period returns the current node. However, if I write the period twice, the parent node is returned; in this case the parent node of h1 is article. Unfortunately, in this XPath playground page you can't see it clearly, but if we go to the actual page, inspect it, press Ctrl+F to test XPath, and write the same expression, first //h1 and then /.., you will see that only the article element, which is the parent node, is highlighted in green.

Now we have the asterisk character, a wildcard that selects all elements or attributes regardless of their names. As an exercise, try to guess what these characters mean together; you can pause the video to figure it out if you want. Together, these characters select all the child nodes of the current context, which means that all the child elements are matched. So this is the opposite of the double dot, which gives you the parent node.
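
A small sketch of these special characters using only Python's standard library. ElementTree supports just a subset of XPath, so paths searched from an element start with a dot and text is read from the .text attribute rather than with text(); the markup and variable names below are made up for illustration.

```python
import xml.etree.ElementTree as ET

body = ET.fromstring("""
<body>
  <article class="main-article">
    <h1>Titanic (1997)</h1>
    <p class="plot">First plot line</p>
    <p class="plot2">Second plot line</p>
    <div class="full-script">Transcript sample</div>
  </article>
</body>
""")

# "//" matches at any depth; ElementTree needs the leading "." for this.
article = body.find(".//article")

# "/" steps down exactly one level: direct children only.
title = body.find(".//article/h1")

# "." is the current node, so "./p" means "p children of this node".
plots = article.findall("./p")

# ".." climbs from the matched node to its parent.
parent_of_title = body.find(".//h1/..")

# "*" is a wildcard: every child element, whatever its tag name.
children = body.findall(".//article/*")

print(title.text)
print([p.get("class") for p in plots])
print(parent_of_title.tag)
print([child.tag for child in children])
```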
Now let's test the asterisk character. If I write //article and then /*, I get all the nodes inside article: as you can see, there is the h1 node, the p nodes, and the div node that contains the transcript. So we obtain all the elements inside. If I now add /text(), I get all the text inside these elements. Finally, we have some characters we've already seen. First there is the at sign, which selects an attribute. Then we have parentheses, which are used for grouping an XPath expression, and the last one is square brackets with a number inside, which indicates that the node at position n should be selected. And that's it. You just learned many special characters; use them wisely to build better XPath expressions.

To scrape websites with Selenium, you first have to download ChromeDriver and install Selenium. To download ChromeDriver, just go to chromedriver.chromium.org, and in the downloads section you will find this page. Here you will find the current releases of ChromeDriver, and you have to choose the one that corresponds to the version of Google Chrome you're using. To find out which version is right for you, click on the three-dots button in the upper right corner and, inside Help, click on About Google Chrome. Here you will see the version of Google Chrome that you have installed. In my case I have version 92, so I go back to the other tab and choose ChromeDriver 92, but in your case you have to choose the one that corresponds to your Chrome version. I click on ChromeDriver 92, and then you see options for different operating systems: Linux, Mac, and Windows. In my case I have a Mac, so I download the mac64 file. I click on it and the download starts. Once the file is downloaded, unzip it and remember the path where you're leaving this file, because we're going to use this path later.

Now it's time to install Selenium. You can install it inside PyCharm: just open PyCharm, go to the terminal located at the bottom, click on it, and write pip
install selenium, then press Enter, and you'll have Selenium installed. Another way to install Selenium is from the terminal or the command prompt. In my case I open the terminal; once you have it open, activate your virtual environment, write pip install selenium, and press Enter, and Selenium will be installed on your machine. And that's it.

In this video, I'm going to show you how to automate this website so we can extract the titles and subtitles that you can see here in the news. We're going to extract the title and subtitle of each card, so we don't have to visit this website every time we want to read the news; instead, we only check our txt file, see the titles and subtitles, and decide which news is the most interesting for us. Let's go to PyCharm, and the first thing we're going to do is create a driver. A driver allows us to interact with this website through Selenium. First we have to import webdriver, so we write from selenium import webdriver.
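
As a hedged preview of where this setup is heading, the sketch below wraps the Selenium 4 driver creation that the lesson builds up step by step. The function name open_site and its parameter names are made up for illustration, and it assumes Selenium 4, Google Chrome, and a matching ChromeDriver download are available.

```python
def open_site(website, chromedriver_path):
    """Start Chrome through Selenium 4 and load the given page."""
    # Imported inside the function so the sketch loads without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Selenium 4 takes the ChromeDriver location through a Service object.
    service = Service(executable_path=chromedriver_path)
    driver = webdriver.Chrome(service=service)

    driver.get(website)  # opens a Chrome window controlled by Selenium
    return driver
```

A call would look like driver = open_site(website, path), with website holding the page address and path the location of the unzipped ChromeDriver file; calling driver.quit() afterwards closes the browser.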
that's the first thing we do and after that we have to make another import so we have to write from selenium.webdriver.chrome that service import service so in selenium 4 we have to make this extra import which is something we didn't need in selenium 3 but for selenium 4 which we're using right now we need this extra import so let's continue this and now let's define the website and the path that we're using so first the website that we're going to automate in this case is this website i chose which is all about football and i
just copy and paste it so here's the website and now the path so the path is where you downloaded that chrome driver so in the previous video you downloaded a chrome driver and you have to copy and paste the path of that file so just copy and paste it so i'm just gonna paste the path of my chrome driver here and now let's define the driver so i write driver equal to webdriver dot chrome and open parenthesis in previous versions of selenium that was enough to create a driver but on selenium 4 we have to
do an extra step which is creating a service so here i'm gonna write service equal to and here use this service that i imported here so i just copy and paste and now i have to write executable path equal to path so this executable underscore path argument is equal to the path that we defined here and after this we have to define the service parameter inside this chrome uh this scroll method and equal to service so basically it's first we define a service uh we say executable path equal to path and then we uh define
here the service parameter and we set it equal to service which is the one that we defined before and now we can open the website by writing driver.get and inside parentheses we write website so driver.get website and if we run this we're gonna open chromedriver or just the chrome browser as you can see here here it says that chrome is being controlled by automated test software and this is not the browser i had before the one i opened before that i showed you before is here so this is the one i open
manually with chrome but this one is the one that selenium opened by itself so automatically so we're gonna automate this website and we're gonna do that in the next video all right in this video i'm gonna show you how to extract the title and subtitle of this news so first to extract the title and the subtitle and all the data in any website that you visit you have to right click on any blank space and select the inspect option so i select inspect and now we have these developer tools
open so this tab that we just opened is developer tools and it contains all the html elements that are behind this website so you can see here that we have many elements and they represent the elements in the website so now i'm gonna click on this option or this button on the left the one that is here and i'm gonna select a specific element that i want so here let's select this element that is here on the right that i'm hovering on right now so i'm gonna click
on this element that is here on the left and then i'm gonna drag to this element that is here so this is a card of a news item and there are many so i'm going to choose this one in particular just to work with this so i click and now we have this element in blue so it selected this in the developer tools and this indicates that this is the html element that is behind this card and as you can see every card contains a title and a subtitle and this one here too contains a title
and a subtitle and this one the big one here also has a title and a subtitle and now let's check out other elements here in the developer tools so we have now this element in blue but we can go up and see this div element so before we had this a element but here we have now the div element and this one represents i think the folder of this news and if we go up we can see let's say a small container that only contains the title and subtitle and if we go up we see the
container that only contains the image of the news so basically that's how it works and also here this is the element let's say the parent element and this one contains the whole thing so i'm gonna use this one in particular to locate this card and not only this one but all these cards because they have the same pattern so i'm gonna build an xpath that locates this element and to do that we have to press ctrl f here in developer tools and once we press this we have this option that allows us
to find an element by its xpath so here we have to build the xpath which basically consists of a tag an attribute and the attribute value and now let's create this xpath so i'm going to copy the attribute value because it's a bit long and i might forget and now to create this xpath i only have to write double slash then the name of the tag which is div then open square brackets and here i write this at sign and then the name of the attribute which is class then equal to double quotes or
single quotes and now i paste the value that i copied before and as you can see this element is now highlighted in green now this is great we have the whole card selected but on second thought i think it's much better if we only select the text because here now we have not only the text but we have the image of the news but we only want the text because that's what we're going to extract so to do that i'm going down here and this one represents the image as you can see here
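As a rough illustration of the double slash, tag, square brackets, at sign, attribute and value pattern described above, here is a minimal sketch that runs on a made-up page using only Python's standard library. The markup and the class names (`card`, `card-image`, `card-text`) are hypothetical and are not the real site's values. Note also that `xml.etree.ElementTree` supports only a subset of XPath and expects a leading dot (`.//`), so the expression differs slightly from what you would type in developer tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two news cards, each with an image block and a text block.
PAGE = """
<html><body>
  <div class="card">
    <div class="card-image"><img src="a.png" /></div>
    <div class="card-text"><a href="/news/1"><h2>First title</h2><p>First subtitle</p></a></div>
  </div>
  <div class="card">
    <div class="card-image"><img src="b.png" /></div>
    <div class="card-text"><a href="/news/2"><h2>Second title</h2><p>Second subtitle</p></a></div>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# tag = div, attribute = class, value = "card": matches each whole card.
cards = root.findall(".//div[@class='card']")

# Swapping in a different attribute value narrows the match to the text-only block.
text_blocks = root.findall(".//div[@class='card-text']")

print(len(cards), len(text_blocks))
```

Changing only the attribute value is the same move made in developer tools here: the shape of the expression stays the same while the match shrinks from the whole card to its text.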
and i'm going down one more time and this element represents only the text so i'm going to replace the value of this attribute and i'm going to paste it here so we only obtain that element and not the whole thing so here i paste it and now i have this element which is well first we have this one because this is the first element on the website and then we have this one so this is the one we were working with and well we also have this one and this one and by the
way here in this option we can navigate through all the elements that were found so as you can see this xpath matches not only this element but also all the news that you can see in this website so let's go back to this one and as you can see here we only have the title and the subtitle so only the text great now let's copy this xpath and let's go back to pycharm and what i'm going to do is paste this xpath and now i'm going to write driver dot and here i'm
going to use a method called find_element so i choose this one and this has two parameters by and also value which i just wrote here so by and value the first one is the method that we're using to locate this element in this case it's xpath because we just built the xpath as you can see here and the value is the value of the xpath so i'm just going to copy and paste and well it's here but as you can see there is a conflict because we use double quotes here for
this xpath and also here for the value so there's a conflict so i'm gonna press ctrl z and i'm gonna use single quotes to avoid that conflict so now i paste and we have this so this is great now we have this element and with this we are supposed to get the element that is here so the text title and subtitle here there are different methods that you can use you can also use the id or the class or any other attribute but we're going to use mostly xpath in this course great now before
we move on here we have to modify this method so we shouldn't use find_element because if we just use find_element we're gonna get only the first element in this website so as you can see here on the bottom there are 50 elements that were found with this xpath so we can navigate here with these arrows from the first element to the last element and if we use only find_element in singular we're gonna get only the first element in this list so this one the one that you see
right now so this one here that's the only element that we're gonna get and we don't want that we want all the elements in this website that were found so to do that we have to use the find_elements method so in plural with the s so now we're gonna get all the elements that are found with this xpath so great now we have this and i'm going to set this equal to containers so this is my variable and now let's move on here and now that we have this
xpath i'm going to show you something else that we can do so let's go back to the element that we had before which is this one and great now we have this element which is in green and we have let's say the whole container but we want the title and subtitle separately so we don't want to extract them together but in different variables so what we have to do is just to open this element here we can click here and open the elements that are inside this
element so to see the children of this element so we open this one and we see that there is this a element and we still don't get the title and the subtitle separately so we click one more time here and now we have this h2 and this p and we can see that this h2 represents the title and the p represents the subtitle with this we can get the subtitle and the title separately so we only have to build the xpath to get to these elements and to build this xpath we only have to
add this slash which says hey give me the immediate node and we only have to write a because the a is the immediate node that follows this div so we have to write a and as you can see here we have this a right after this div so it's like a sequence we have first the div and we have now the a now we write this slash again and now we have to get to this h2 because this one represents the title so we write slash and now one more time we write
h2 so this is basically the sequence first is div then is a and then is h2 and we use this single slash because we want to get only the immediate node so everything is perfect so div a and h2 and if we navigate here with these arrows we can see that we get all the titles in this website so everything is perfect so let's just copy this xpath and now let's paste it here on pycharm so i'm gonna paste it and i'm gonna comment this out okay now to get to this title
element i'm gonna use driver and i'm gonna use find_element so now again we write by and we write value and now we write again xpath and now we paste the xpath that we just built so it's here i paste it so now i write single quotes to avoid any conflict and we have this so as you can see this xpath and the previous one we built have something in common which is this one that i'm selecting right now so this is the same as this and there is nothing wrong if we leave this as
it is because it's correct but we can improve the syntax of this code so what we can do here is to reuse this containers variable so what i'm going to do is i'm going to loop through this list by the way i'm not sure if i mentioned this before but the difference between find_element and find_elements is that find_element returns a single element but find_elements returns a list and a list is an iterable so we can iterate through a list so we're using find_elements and this returns a list so containers is
a list that we can iterate through so we write for container in containers colon and now we can put this inside that for loop so now we can use this container variable and we can put it here instead of driver so we can replace driver with this so we paste this one and now we have container.find_element and basically this new let's say this new syntax is reusing this container variable so now instead of writing all of this xpath we can just use the dot sign and replace all the xpath that we had
before so this dot represents this thing and we can use this dot because we're using here also the container reference so this container is used as a reference for this new xpath so this is like a new syntax that you can use to avoid writing a long xpath as we had before great now let's find the xpath for the subtitle so here we have the xpath for the title which was here and now let's find the xpath for the subtitle so as you can see here the element for the subtitle is
the one here that is inside the p tag so this p tag represents the subtitle and to get to this p element we only have to delete this h2 and only write p so as you can see now we have all the subtitles so the subtitle of this news now the next one and the next one and this one and this one too so it's great now we have all the subtitles of all the news listed in this website so now we only have to copy and paste it here comment this out and i'm gonna
duplicate this line of code with ctrl d and now we have this so instead of writing the whole thing we can just delete this and write p and let's compare this so first we have this and this is represented by the dot which is by the way this one which is exactly the same so then we have a which is here slash a and then slash p so this is the equivalent of this so great now we can delete this and now we have the subtitle element so far so good now let's go
back to the website and i want to show you something here we have right now the subtitle element as you can see here but what happens if we only want the text because right now we have this whole element selected in green as you can see it's all this element with the tag name with the attribute and also this thing but we only want the text which should be in white like this one but it's in green right now because it's selected so what if we only want the text and not the whole element
so if we only want to extract the text which is in white here and is in black in the website what we have to do is to use the text attribute so let's go back to pycharm and to get to the text we only have to write .text and this returns only the text of this element so we have this whole element that is in green and with .text we only get the text inside this element so we copy this one and we use it also for the subtitle great now let's give a
name to this expression so i'm gonna write title equal to this and now subtitle equal to this so with this we have the variable names and we can move on all right now the next element that we're gonna get is the link of each news so here we have each card and let's go here again to the card we saw we have this card for example and what we want to do is to get the link so we go directly to the website that contains this news and as you can see the link is here
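The title and subtitle lookup built over the last few steps could be sketched roughly as follows. The function name and the `container_xpath` parameter are my own additions for illustration. The real class value is not reproduced here, so you would pass in whatever XPath you built in developer tools. The string `"xpath"` given to `by` is the value that Selenium's `By.XPATH` constant stands for.

```python
def extract_titles_and_subtitles(driver, container_xpath):
    """Return a list of (title, subtitle) text pairs, one per matching container.

    `driver` is an already-created Selenium WebDriver. `container_xpath` is a
    placeholder for the XPath of the text-only block of each card.
    """
    pairs = []
    # find_elements (plural) returns a list with every match on the page.
    containers = driver.find_elements(by="xpath", value=container_xpath)
    for container in containers:
        # The leading dot makes the path relative to this container,
        # so "./a/h2" plays the role of "<container_xpath>/a/h2".
        title = container.find_element(by="xpath", value="./a/h2").text
        subtitle = container.find_element(by="xpath", value="./a/p").text
        pairs.append((title, subtitle))
    return pairs
```

Inside the loop the singular `find_element` is enough, because each `container` is a single element and the relative path is expected to match one node inside it.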
it's inside the href this is the link so i'm gonna just click on this link and now let's wait and as you can see here we have the news that corresponds to that card so it's all the details about this news and now i go back here and well that's how we get to the link and now we verified that that's the right link that we want so now we have to build the xpath that gets that link so i'm gonna delete this slash a and slash p and now we have only this
so to get to this href or to this link we have to add the slash a that we had before and this is the element that represents let's say this whole description and if we want to get to that link we only have to use the href attribute and there's a method that we can use in python for that so let's just copy this and let's go back to pycharm so i'm going to use again the container as a reference and again find_element and here i'm going to use by and again we're
going to use xpath as the method and we're going to write value and here single quotes and the xpath so we don't need to use this part of the xpath because we already have the container as a context and okay now we can get only the href and to do that instead of writing .text for example as we did before we only have to use the get_attribute method so we write get_attribute and inside we write the name of the attribute which is href as you can see here the name of
the attribute that contains the link is href so we write href and with this we get the link so we don't get the whole element but we get only the link of this element so now i'm going to define this as link equal to and that's it great now one little detail that i want to tell you here is that usually the links are inside an href attribute this is something that you're gonna see quite often and also the href is inside an a tag so you will usually see that the links are inside an href attribute
and also inside an a tag so that's a typical element that you will see often so the next time we want to get to a link just look for the href or look for the a tag and you will get to the link fast so now back to pycharm and with this we have all the elements so another detail that i want to mention is that here we're using find_element and not find_elements because here in this for loop we're iterating through each element of the list so containers is a list but container is an
element so here there is only a single element per iteration so we only have to use find_element because it's a single element great in this video we learned how to locate all the elements that we wanted in the website in the next video we're gonna see how to export all these elements to a csv file and we're gonna use pandas for that all right in this video we're gonna see how to export all the elements that we extracted to a csv file so first what we have to do is to create empty lists so i'm
going to create three empty lists the first one is going to be titles and this one represents this title element and now i'm going to duplicate this and the second one is going to be subtitles in plural and the third one is going to be links so i write links and these are my three lists now what we have to do is append each element to a list in this for loop so in each iteration we're going to append an element to this list so we only have to write here titles and use the append
method so this append method allows us to append each element to the list and we only have to write the variable that we want to append so it's like this titles.append and inside parentheses the element now i duplicate this and i'm gonna use subtitles and i'm gonna follow the same logic so subtitles.append and subtitle so this one is the list and this one is the element and one more time now links the list .append and the element which is link so this is what we have and this is how we append
each element to the list in every iteration great at this point all the lists are filled with the elements from each iteration so what we have to do now is build a data frame with these lists and to create a data frame we have to use a library named pandas so i'm gonna write here import pandas as pd and probably you have to install this library so go ahead and click on terminal and here just write pip install pandas then press enter and you just have to wait some seconds to install this library
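Before moving on to pandas, here is a sketch of how the link lookup and the three lists fit together in the loop. `collect_columns` is a hypothetical helper name. It takes the `containers` list as input, and the `./a` paths assume the same div, a, then h2 and p structure that was seen in developer tools.

```python
def collect_columns(containers):
    """Build three parallel lists (titles, subtitles, links) from the containers."""
    titles, subtitles, links = [], [], []
    for container in containers:
        title = container.find_element(by="xpath", value="./a/h2").text
        subtitle = container.find_element(by="xpath", value="./a/p").text
        # get_attribute returns the value of the named attribute, not the element.
        link = container.find_element(by="xpath", value="./a").get_attribute("href")
        # One append per list on every iteration keeps the lists the same length.
        titles.append(title)
        subtitles.append(subtitle)
        links.append(link)
    return titles, subtitles, links
```

Keeping the three lists the same length matters for the next step, since a data frame built from a dictionary of lists needs every column to have the same number of rows.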
i already have this library installed so i don't have to wait for anything so once you have pandas installed you can write this import pandas as pd and now we can use the pandas library with this pd so i write pd and now i can use the DataFrame method and i can open these curly braces and inside i can use a dictionary so to show you this much better i'm gonna create a dictionary so you don't get confused so first i'm gonna write dictionary underscore or
just my dictionary so my_dict and open here a dictionary so as you might know a dictionary consists of a key and a value so the key is the one that is here on the left and the value is the one that is on the right so the value i'm gonna set it here to my variable which is titles so our dictionary is going to have three keys and three values so the first value is titles and i'm going to name this as titles too there is nothing wrong if you set the name
of the key as the name of the variable but if you want you can change it for example i can write only title you don't have to write it the same way you can change it a little bit or you can change it completely you can choose so now i'm gonna write the second element which is the subtitles list so here are subtitles and i'm gonna paste these subtitles and i'm gonna write just in singular subtitle that's the name of my key and now the third one is links so here links this is the name
of the value by the way the name of the value obviously has to be the same as the name of the list we want here and here i'm gonna write the name of the key which is link which i can choose right now great now we can use this dictionary to create a data frame so here i copy the name of that dictionary and i put it inside parentheses and with this we can create a data frame using this dictionary so just a recap right now first we have these lists here
empty lists we fill these empty lists with all the elements here in the loop and then now that we have all the elements inside these lists what we have to do is to create a dictionary so this dictionary has a key and a value and the name of the key we can choose but the name of the value is the name of the list so with this we have the dictionary and we created this dictionary to create a data frame so we're going to write pd.DataFrame and inside we write
my_dict and with this we have this data frame so i'm going to name this data frame and i'm going to set it equal to df_headlines so we have this data frame and now we can easily export this to a csv file using df_headlines.to_csv so we're exporting to a csv file but there are other options like json and also excel and html and different options but i'm going to choose csv as you can see here and now i'm going to give a name to this file
and it's going to be headline.csv and with this the file is going to be exported with this name so great now to end all of this we have to use driver.quit so after we extract all this information and we export all of it to this csv file the driver is going to be closed with this .quit great now let's test this out so we right click and choose run so i run this script and let's see what happens so first python is going to open this driver through selenium
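The export step on its own can be sketched like this, with placeholder data standing in for the lists that the loop fills. The sample titles, subtitles and `example.com` links are invented for illustration, and the file is written to a temporary folder rather than the working directory so the sketch leaves nothing behind in your project.

```python
import os
import tempfile

import pandas as pd

# Placeholder data standing in for the lists filled inside the loop.
titles = ["First title", "Second title"]
subtitles = ["First subtitle", "Second subtitle"]
links = ["https://example.com/news/1", "https://example.com/news/2"]

# Keys become the column names and can be chosen freely; values are the lists.
my_dict = {"title": titles, "subtitle": subtitles, "link": links}
df_headlines = pd.DataFrame(my_dict)

# Export to CSV; a temporary folder is used here only to keep the sketch tidy.
output_path = os.path.join(tempfile.mkdtemp(), "headline.csv")
df_headlines.to_csv(output_path)
```

By default `to_csv` also writes the row index as the first column. Passing `index=False` would leave it out if you only want the three named columns.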
and now it extracted the information and it closed the driver so the script was successfully executed and now let's check if we have this headline.csv here in our working directory so here i open this and we can see that there is a file named headline.csv which is this one so we were working in this folder named tutorial and this one is the file we just generated so we double click on it and we can see that we have the title the subtitle and the links so now let's verify if this is correct the first title
is best of the best and the subtitle says who makes our man city versus real madrid combined 11 and all of this so let's check if that's the same here and yeah it says best of the best and who makes our man city versus real madrid and all of that so everything was successfully extracted and we also have the links so we can visit any news that is interesting for us we can go directly to the website and we can check all the news here without being distracted by all these images and all
the ads on the website great that's it for this video in the next video i'm gonna show you how to do all of this in the background so you don't have to see that driver opened every time you run the script but all will be run in headless mode so far we automated this website and extracted the titles and subtitles that you see here and we did this with selenium so every time we run this code we open this browser automatically and extract all the data that's great but in this video i'm going to show
you how to do all of this without opening this browser with selenium but doing this in headless mode so we don't have to watch selenium open this browser and do all the automation but we can just do it in the background and we can let selenium do that in headless mode so let's go back to pycharm and here we're going to add some lines of code to activate headless mode so to do that first we write here from selenium.webdriver.chrome.options import Options this Options is going to help us modify the default behavior of selenium
so now we add here some lines of code and first i'm going to write here headless mode so we know that this is the part that is going to change so here first we initiate an instance of the Options class so i write Options parentheses and here i set this equal to options so this is just a rule of thumb you have to initiate an instance and this is my options object as you can see i gave it the same name as this class this is just like a convention you
use the same name but here in lowercase and this is my options object so now i write options and to turn on headless mode we have to use the headless parameter so i write headless and this by default is set to false so we don't use headless mode when we run this script because it's false but if we want to turn on headless mode we have to set it equal to true so it's equal to true now and now we have to change this webdriver.Chrome so in addition to this service parameter
we're going to add another parameter and this time it's going to be options so this options has all the default parameters that selenium works with so selenium has some default behavior and we can change it here with the options parameter so i'm going to set options equal to options and there it is so as you can see i use these variable names just to make my life easier you can set any other name for this options variable you can set it like options one two three and you can write here options one two three
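A minimal sketch of a headless driver along these lines is shown below, assuming Selenium 4 and a downloaded chromedriver. The function name `build_headless_driver` and the `driver_path` parameter are hypothetical. One caveat to check against your installed versions: as far as I know, later Selenium 4 releases deprecated and then dropped the `headless` property used in the video, so this sketch passes Chrome's `--headless=new` argument instead, and older Chrome versions may need plain `--headless`.

```python
def build_headless_driver(driver_path):
    """Create a Chrome driver that runs without a visible browser window.

    `driver_path` is a placeholder for the location of your chromedriver file.
    """
    # Imported inside the function so that defining it does not require Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    # Same intent as `options.headless = True` in the video (assumed equivalent).
    options.add_argument("--headless=new")

    # Selenium 4 takes the chromedriver location through a Service object.
    service = Service(executable_path=driver_path)
    return webdriver.Chrome(service=service, options=options)
```

You would call it as `driver = build_headless_driver(path)`, then use `driver.get(website)` and `driver.quit()` exactly as in the script with a visible browser.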
it doesn't matter i just name my variables like this to make my life easier here too service equal to service you can use a different name for service here but i just name it service in lowercase all right now something you need to know is that we can make different changes to the default behavior of selenium but one of the most popular is headless mode later in the course we'll see other things that we can add to this options object so we can modify the behavior of selenium even further so now let's continue with
this with these changes everything is ready to run this script in headless mode but i'm gonna make a little change here in the end so instead of naming this csv file as headline i'm going to name it headline-headless so we know that this comes from this script which is the one we're using with headless mode so headline dash headless and now it's ready let's run this script and see the results so let's wait a couple of seconds and the difference with the previous script is that here we're not going to
open a browser automatically with selenium as we did in the previous video but all is going to be done in headless mode so in the background here we got a message that everything ran successfully and now i'm gonna close this one and see if we have this file headline-headless so i go here and i check here it says headline hyphen headless and we have all this data so it's the title the subtitle and the link again but everything was done with headless mode so let's verify the first title and subtitle and here it
says pep idol watch guardiola's reaction to ben simmons penalty so let's go here and this is the first section that we are working with and yeah we got the same pep idol guardiola's reaction to so great you can verify if all that news was properly extracted i did it myself and everything is working fine so that's it for this video in this video we learned how to automate this website and extract all this data in headless mode all right in this video i'm going to show you how to schedule this script to
run every day or at any time of the day you want so for example this news we can schedule to run every morning so every time we turn on the computer you'll have the csv file with all the data automatically extracted so you don't have to do it yourself but you can schedule when that's going to be executed so to do that first we have to convert our py file this is a py file because this is python so the extension is .py and we have to turn this py file or convert this py
file to an executable so we have to do that first but before we do that we have to make some modifications to our script so we have to prepare our script to work properly before we turn our py file into an executable so to do that we're gonna do some imports so we write from datetime import datetime this is the first the second is import os this helps us interact with the operating system for example we can create folders and other things and the last one is sys i didn't explain to you what datetime
is this helps us manipulate the date and time so for example we can extract the hour or the day when the script is executed so that's it that's all the modules or libraries we have to import and now let's make some changes so first we have to use the os module and get access to the path attribute and then write dirname and use sys.executable what we're doing here is getting the path of the executable that we're going to create so far we have a py file but we're doing this for the
executable that we're going to create in the next video so with this we're saying get the path of that executable that we're going to create and here we only get access to the path this is the attribute of the os module and well this is the directory name and let's give this a name i'm going to name this application_path so my application is my executable file that we're going to create and this is the path so we know what's the path of this executable we do all of this because when we work with
executable files it's a bit messy to work with paths sometimes you don't know where the file that you have here goes so right now we export all this data to this csv file and this csv file is exported to our working directory this is fine because we're working with a py file but when we work with an executable file things get a bit messy and we don't know sometimes where the file is actually exported and with this we're going to be able to export a file to the same folder where our executable file
is going to be located so it's going to be in the same folder and we're not going to have any problem with the path so that's why we do this next we use datetime to customize the name of our file so right now it's just headline-headless.csv but if we run this every morning we're not gonna know which file corresponds to which day so for example if it's a saturday on the weekend and we didn't check the news from monday to friday we wouldn't know if
this file is from monday or tuesday or wednesday and so on so just to customize the name of the file that is going to be exported we have to create a variable that indicates the day when this script or this executable is executed so now to do that we write datetime.now parentheses and this is equal to well i'm just going to name it now and with this we have the date right now according to my machine or my computer so now we have to use a method
called strftime this stands for string format time and basically what we're going to do is to get the time in a string format so here this now variable is in time format and we're going to convert this time data into a string so we can customize and format this time so here inside you have to write the format that is used by this strftime method i didn't explain the syntax or the format of this but it's quite simple i'm going to this website which is right here
which is named after the method strftime you can go visit this website and you can check the format that i was talking about so for example what i want to create here is a format that is let's say month day and year so i'm gonna use this %m i just copy and paste it so this was month with the m then d so i'm going to extract this where is the d it's here so i just copy and paste %d and finally the
year which is going to be %Y so this is the year so with this we have month day and year as i showed you here so this is the format that we have to use you can just copy here and paste it as i did now i'm going to give this a name and i'm going to name it well just as simple as this month_day_year and i'm going to put this here on the right so with this we have this variable and this variable is going to help us
customize our file name so now let's go here let's go to this section and i'm going to use an f-string to customize this string so i only write f in front of this string and now to add a variable or to concatenate a variable i have to use the curly braces so with curly braces we can concatenate a variable so i'm going to copy the name of the variable and i'm going to paste it here so i paste it and actually i'm going to change this name i want it to be just headline
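The date formatting and the f-string can be tried on their own with a fixed date, which makes the result easy to check. The helper name `dated_file_name` is hypothetical, and the format string assumes a four-digit year (`%Y`), matching the month, day and year order described above.

```python
from datetime import datetime


def dated_file_name(now):
    """Return a CSV file name that embeds the date as month, day, year."""
    # %m = two-digit month, %d = two-digit day, %Y = four-digit year.
    month_day_year = now.strftime("%m%d%Y")
    # Curly braces inside an f-string insert the variable's value.
    return f"headline-{month_day_year}.csv"


# A fixed date is used here so the output is predictable.
print(dated_file_name(datetime(2022, 1, 4)))  # headline-01042022.csv
```

In the script itself you would pass `datetime.now()` instead of a fixed date, so each run produces a file name for the current day.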
so headline hyphen and this so with this we're going to get something like headline hyphen tuesday or well in this case numbers so 01042022 something like that just following this format month day and year so with this we get this csv file with the date that we want and now what we have to do is to use this application path because i told you that paths are a bit messy when we work with executables so we're going to include this path
here and an easy way to do this is just adding these curly braces for a variable and just pasting this and adding the slash and as you can see this is the typical format we use for a path right we use this slash and that's how we uh put two paths together so we put this path and this and we get the whole path but this is not a good uh a good practice because this slash is also a bit messy between operating systems sometimes uh different operating systems use like different slash so for example
mac os i think uses this forward slash but windows i believe uses this backwards slash so it's like a bit messy you can find some problems if you use this slash so what we do is work with the os module to avoid this slash so we use um os and we use the join method to avoid writing this slash ourself so what we do is write os that path that join and here we have to add the path we want to concatenate so before i do this i'm going to create a name for this um
first i'm going to delete this because i'm not going to do this you can do it it's going to work fine but you have to be careful with the slash so i'm not gonna do this just to follow a best practice and what i'm gonna do is i copy this and now i create a variable file underscore name just to be more organized you don't have to do this i just like to be organized so file name and now i copy file name and i put it here so this is the file name and now
to concatenate this i'm gonna write this and concatenate it with application underscore path so basically this is the same we are concatenating this application path with this file name so we're not using this slash anymore or this backward slash anymore but we're letting this os path join taking care of it so we don't run into any issue so now this is ready and i'm gonna name this as final underscore path so this is my final path and this is what we're going to work with so i delete this by mistake but everything is fine now
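The path-joining step above can be sketched roughly like this. The folder used for `application_path` is a hypothetical placeholder; in the video that variable is computed earlier in the script.

```python
import os

# Assumption: hypothetical folder; the real application_path is computed earlier in the script.
application_path = os.path.join("home", "user", "tutorial")
file_name = "headline-01042022.csv"

# os.path.join inserts the separator that is correct for the current operating system,
# so no slash has to be typed by hand.
final_path = os.path.join(application_path, file_name)
print(final_path)
```

On macOS or Linux the parts are joined with a forward slash, on Windows with a backslash, which is the reason for not hard-coding the separator inside the f-string.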
so now i add this final_path so i copy this and paste it so this csv is going to be sent to this path this path is where the executable is going to be located so we avoid any issue with the executable and paths and that's it with this everything is ready to convert our py file to an executable file and then run this executable file at any time we want and we're gonna do that in the next video okay in this video i'm going to show you how to schedule this script so you can
run it every morning or every day you want at any time you want so the first thing we have to do is to open up the terminal i have the terminal here in pycharm on the bottom so i just click here terminal and here what we're going to do is first install a library called pyinstaller so pyinstaller is going to help us convert a py file so our python files to executables so the first step to schedule this script is to convert it to an executable file and once we have the
executable we can schedule this executable to be run anytime we want so to install this library we only write pip install pyinstaller so you just have to write this press enter and wait a little bit so here i got a message that i have this library already installed but it's going to take you a couple of seconds to install this library okay once this is ready we just clear this up and now we write the following command so we write pyinstaller double dash and then we write onefile so this is the
command and then we have to write the name of this py file so this script in my case i named it news-headlines so i just have to write this name and i'm gonna press tab and as you can see here i have this script and i'm just gonna write news-headlines and one little detail i forgot to mention is that you have to be located in this folder so in the folder where this py file is located so you have to do that on the
terminal here we're right now on the terminal and to do that first i'm going to copy this so i don't lose this command and to do that you have to know how to navigate on the terminal and on the terminal we navigate with this command cd which stands for change directory so if you are not in the folder where your py file is this is not going to work so you have to use cd and then for example i'm going to use cd to go to the previous folder and right now i'm in another folder
you can see here this is the parent folder of tutorial which is the folder where my py file is and i can verify it because if i do cd i can see the folders that are inside this one so i write cd and then press tab and you can see all the folders and one of them is the tutorial folder as you can see here so to go into this folder i only have to write cd and then write tutorial so i have this i press enter and as you can see i'm not anymore in
the football headlines folder but i'm right now in the tutorial folder and this tutorial folder happens to have this script that i need so this is the script which i named news-headlines so perfect so now i'm gonna clear this and now i'm gonna delete cd and i'm gonna use the previous command i had which i copied yeah it's this one and well it says pyinstaller double dash onefile and the name of my py file so this is ready i'm gonna press
enter to convert this py file to an executable so i press enter and now we have to wait some seconds great i get the message that building the executable was completed successfully so now to verify this was successful we have to go to the folder where this is located so i'm going here to the left and here this is my tutorial folder and in your folder you will see these two folders one is called build and the other is dist and your executable is in the dist folder so this is my dist folder and this is
my python script but now it's an executable so on a mac what you can do to test this executable which i highly recommend is to double click on it and just after you double click you'll see that the executable will be run but sometimes when it is the first time you open this you won't see this option to open with the terminal because right now my operating system recognizes that this should be opened with the terminal but sometimes it doesn't know so what you have to do is to help it so you have to
right click and click on open with and sometimes you will see the option terminal but if you don't see it you just click on other and then you have to locate the terminal so you just click here all applications then scroll all the way down go to utilities and here it should be the terminal so here i'm going to open with the terminal i didn't have to do it but i just did it to show you how this works so now i have this executable i just run this executable by double clicking on it
and here well the executable apparently is running and we can verify if this is successful by going here to pycharm or actually we can see right here so we can see here in the folder and we see that we have a new file and this file is named headline dash and here is today's date so 04 28 2022 and this is the format that we used here for the csv file month day and year so we verify that this file was successfully created so we have here the file and if we check the content
we'll see that there is the title and also the subtitle and also the link well we verify that everything is working fine and i highly recommend you to verify that the executable was successfully created because once we schedule this you'll have to wait until the moment that you scheduled to verify if this is working fine so it's much better if you test it out right now as i did some seconds ago now we're going to schedule when this is going to be run so now i open up a new terminal so i click on new window
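For anyone who prefers to drive the build from Python rather than typing in the terminal, here is a rough sketch of the same steps. The helper names `build_command`, `expected_executable` and `build` are made up for this illustration and are not part of PyInstaller; only the `pyinstaller --onefile` command and the `dist` folder come from the video.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script_name):
    """Return the PyInstaller command from the video as an argument list."""
    return ["pyinstaller", "--onefile", script_name]


def expected_executable(project_dir, script_name):
    """Path where the one-file executable should appear (inside dist/)."""
    stem = Path(script_name).stem
    # On Windows the executable gets an .exe suffix; on macOS and Linux it has none.
    suffix = ".exe" if sys.platform.startswith("win") else ""
    return Path(project_dir) / "dist" / (stem + suffix)


def build(project_dir, script_name):
    """Run PyInstaller from the script's own folder, mirroring the cd step."""
    subprocess.run(build_command(script_name), cwd=project_dir, check=True)
    return expected_executable(project_dir, script_name)


print(build_command("news-headlines.py"))
print(expected_executable("tutorial", "news-headlines.py"))
```

Calling `build("tutorial", "news-headlines.py")` would actually run the build, which assumes PyInstaller is installed and the script exists in that folder.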
and here to schedule this executable we have to open crontab so we write crontab then dash and then e we press enter and we're going to get this window and here we have to write a command to schedule this executable and the command has the following format to show you the format i'm going to this website which is called crontab guru and here you can build a part of the command so here you can know the syntax used by crontab so here i previously tested that syntax so i'm just gonna
leave it as the default parameter so here you see five asterisks and the first represents the minute the second the hour the third the day of the month the fourth the month and the fifth the day of the week and here let's say we want to schedule this executable to be run at nine in the morning every day so if we want that we only delete this and write here zero because the minute is zero and the hour is nine so at nine in the morning so this hour is
from 0 to 23 so we have to write 9 and this indicates it is 9 in the morning so it's saying at 09:00 and here we can also see when is the next time this is supposed to run so it's the 28th which is the date today then the 29th then the 30th then the first of the next month and the second of the next month so here we can verify which days this is supposed to run so this is really good because you have to be very careful when you create this
command because by mistake you can make it run every minute or run every hour and you might damage the performance of your computer so be very careful and check the expression that you're creating here so i'm gonna copy this and now i'm gonna paste it here okay this is the first part of my command and the second is the path of that executable file so here we have to go to my folder and we have to drag this file to the terminal to get that path so here first i'm gonna press
the i key to go to insert mode so here i'm in insert mode and i can type anything as you can see here so we're in insert mode and here i'm gonna paste this and i'm gonna drag this file so i drag just to get the path so that's how macos works so here we have this path and this is the second part of this command so first we have to write the time we want this to run and then we have to paste that path of this executable
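To make the two-part layout of the crontab entry concrete, here is a small sketch that assembles the line in Python. The `cron_line` helper and the path are hypothetical, used only for illustration; the five-field schedule followed by the command is the format described above.

```python
def cron_line(minute, hour, command):
    """Build a crontab entry that runs `command` every day at hour:minute."""
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # Fields: minute, hour, day of month, month, day of week (* means "every").
    return f"{minute} {hour} * * * {command}"


# Hypothetical path; in the video it comes from dragging the file into the terminal.
entry = cron_line(0, 9, "/path/to/tutorial/dist/news-headlines")
print(entry)
```

The resulting line is what would be typed into the editor that `crontab -e` opens.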
so everything is ready here and what i'm gonna do is i'm gonna save this command and to do that we have to press the escape key so i press escape and now we have to type this colon that you see on the bottom and now w and q so w stands for write and q stands for quit so we want to write this and then quit crontab now i press enter and we're going to get this window and we have to press ok to give permission so i press ok and now this was
successfully created and to verify this we have to write the command crontab dash l so we're going to list all the commands that were created so i press enter and now you see that one of the commands created is the one we just created a second ago so this is the command we created and with this we're telling our operating system that we want to run this executable at nine in the morning every day so i'm not gonna wait until nine in the morning because it's gonna take too long and we already tested
out this executable file manually so we know that this is working just fine so you can schedule this to be run earlier on your computer and verify it yourself or you can test this out manually as i showed you before and that's it for this video in this video we learned how to schedule an executable file to be run at any time you want okay in this video i'm going to show you how to create pivot tables using python so here we have the typical sales data that we have to work with in microsoft excel
so we usually have to create pivot tables using excel but now we're going to create a pivot table using python in this case we're going to create a pivot table that tells us how much people spend in each product line so we're going to divide by gender male and female and we're going to see how much each gender spends in each product line so we're gonna do this with python and first we have to read this excel file so i'm going here and we have to use pandas as usual so i import pandas as pd and
then to read this excel file we have to use read_excel so this is an excel file because here we have the xlsx extension so that's why i'm using read_excel and not read_csv so now i just have to write supermarket_sales.xlsx and well this is the name of the file and i'm going to set a name to this so df equal to and with this we have the data frame so we read this excel file and we put this inside a data frame so with this we're reading
this excel file so now if you want we can print this so i print this data frame and now we'll see the result and well here i didn't write the name correctly so i didn't write supermarket so i write supermarket now and we have here the data frame and as you can see we got here the same data that is on the left but now it is here printed in python so now to make a pivot table what we have to do is use the pivot_table method so what we have to do
is write df.pivot_table and then we have to indicate the index the columns and the values for this pivot table great and now i'm going to show you the columns that we're going to work with so these are the columns in yellow so the gender column the product line column and the total column so i'm going to select these columns so we only see these three columns in our data frame so to do that first we have to write df which is the name of our data frame and then write double square
brackets to select multiple columns so to select one column we write just a pair of square brackets and if we want to select multiple columns we have to write double square brackets so now we indicate the columns that we want to select so in this case it is the gender column then the product line column so i just copy this one and finally the total column so total so with this we have the three columns that we wanted so now i can set df equal to this and now my data frame will
have only these three columns so if i print this you'll see that we only have three columns and these are the columns that we selected so this is a good practice when you only want to focus on some columns and you don't want to see all the columns that are in the file great now let's continue with our pivot table so i have here the pivot table and we have to define which are the index the columns and the values so first what we have to do is remember the goal of this pivot table so
the goal is to see how much each gender spent in each product line so if that's our goal probably we want the gender in the index and we want the product line in the columns and also we want to sum the amount of money spent on each product so now let's define the parameters first the index so index is going to be the gender then the columns is going to be the product line so i set here product line then we have the values and as i said before this is going to be the
amount of money so this is going to be total and finally we have the aggregate function that we want to apply so in this case we have to write aggfunc and this is the operation that we want to apply so in this case i want to sum okay with this our pivot table is ready and now i only have to set this equal to and i'm going to name this pivot_table so that's the name of my pivot table and well now i'm just going to put this here and well this is ready
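The column selection and pivot table steps above can be sketched as follows. The small DataFrame is made-up data standing in for supermarket_sales.xlsx, and the exact column names (`Gender`, `Product line`, `Total`) are an assumption based on how the columns are described in the video.

```python
import pandas as pd

# Assumption: a tiny made-up dataset with the three columns used in the video.
# With the real file this would be: df = pd.read_excel("supermarket_sales.xlsx")
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Female", "Male"],
    "Product line": ["Health and beauty", "Health and beauty",
                     "Electronic accessories", "Health and beauty",
                     "Electronic accessories"],
    "Total": [100.0, 50.0, 30.0, 20.0, 70.5],
})

# Double square brackets select several columns and return a DataFrame.
df = df[["Gender", "Product line", "Total"]]

# Gender goes in the index, product lines in the columns, and Total is summed.
pivot_table = df.pivot_table(index="Gender", columns="Product line",
                             values="Total", aggfunc="sum")
print(pivot_table)
```

Each cell of the result holds the sum of `Total` for one gender and one product line.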
and now i'm going to print this pivot table so you can see its content so print pivot_table so now i run this and let's see the result so here we have our pivot table and as we can see we successfully built this pivot table so in the index we have the gender male and female then in the columns we have the product line so there are different product lines electronic accessories sports and travel and so on and finally in the values we have the numeric data that was in the total column so here
we have this numeric data and well we use the sum as our aggregate function so we sum the values in each category and now to see the content of this pivot table much better i'm going to export this pivot table to an excel file so what i'm going to do is delete this and here i'm gonna export this pivot table using to_excel and then i'm gonna set a name for this excel file so i'm gonna set it equal to pivot_table.xlsx and we can also set the name
of the sheet in this case i'm gonna name the sheet as report so this is the name of my sheet and then we can also set in which row this data is supposed to be exported so we only have to add the startrow parameter and in this case i'm going to export this in row number four and with this this pivot table is going to be exported in an xlsx file and the first sheet is going to be named report and this is going to be exported in row number four and by the way
you can also round the numbers inside this pivot table so you only have to write here .round(0) so with this you round the numbers inside this pivot table so now i run this so now i should have this file in my working directory and since i don't have microsoft excel on my computer i'm going to open this in google sheets okay i just opened this file and well it is named pivot_table it's an excel file and i just opened this in google sheets and well this pivot table was exported in the
row that we specified and we see that the sheet is named report here on the bottom and that's it we successfully created this pivot table with python and we exported this into an excel file in the previous video we used this sales data to create this pivot table with python and now we're going to use this pivot table to create a bar chart and to do that we're going to use a library called openpyxl so first we go here and we open the terminal and we install openpyxl so we only have
to write pip install openpyxl so now we press enter and with this we're going to install this library and this library is going to help us do things we would do in microsoft excel like creating charts or summing values in columns and more things so now let's import this library so i write from openpyxl import and the first thing we're going to import is load_workbook and we're going to use this load_workbook to read our excel file so the name of my excel file is pivot_table so
now i use this load_workbook and here i write the name of the file so i write pivot_table.xlsx and now i set this equal to wb which stands for workbook and now i'm going to select the sheet i'm going to work with so to do that i have to write wb and then open square brackets and then we have to write the name of the sheet in this case i'm going to use the sheet report which is here so the name is report and now i only have to write
report and this is the name of my sheet so i set this equal to sheet all right with this we have the workbook and we also have the sheet and now we can use these two variables to manipulate our file okay now what we're going to do is select the active rows and columns that are in our sheet so here we have some active rows and columns and this is determined by the cells where our pivot table is located so here we see that our pivot table is located between a5 and g7 so we have
to locate the minimum row the minimum column the maximum row and the maximum column so let's do this so i go here and now we use wb and then we have to use the active attribute so i write active and first let's locate the minimum column so i write .min_column so now i duplicate this and now i do the same but now with the maximum column so now max_column then min_row and then max_row so now i set this equal to and i'm gonna set just names that are equal to
this so first min_column then max_column and so on so now with the minimum row and finally with the maximum row so now we have these four variables and now i'm gonna print these four variables so you can understand much better what they mean so first the minimum column and well then maximum column then minimum row and maximum row these four variables are going to be useful when we make our bar chart so we get one seven five and seven so for minimum column we get one and for maximum column seven so let's go
and check here so minimum column one so that's correct because the minimum column is a which represents one and the maximum column where our pivot table is located is g so it's one two three four five six seven so yeah maximum column seven so minimum one and maximum seven so that's correct and then we have minimum row and maximum row so five and seven so now let's check here and the minimum row is five and the maximum is seven so this is based on where our pivot table is located right now because this is the only element
that is in our sheet so that's why this is our reference all right now that we have these minimum and maximum values we're going to use these values to create our bar chart so first i delete this and now i'm going to import a bar chart from openpyxl so i write from openpyxl.chart import BarChart and now i'm going to instantiate a bar chart object so i write BarChart and open parentheses and this is equal to barchart so now i have this bar chart object and now we
have to do this first we have to import Reference and this reference is going to have the minimum and maximum values so i write here comma Reference so i'm going to use this Reference and then i'm going to put inside parentheses the minimum and maximum values okay so first let's indicate the sheet we're working in so first sheet and then we have to indicate the minimum and maximum values so here i write min_col and then i write min_column and i do the same for the others so i only
have to duplicate and add commas so here i add commas but we have to change here to max_col and then to min_row and finally to max_row and with this we have the four parameters and now i just have to change the values so min_column then max_column then min_row and finally max_row so okay now this is the reference but now we have to split the reference in two so first we have the data this is the data and then we have the categories this is the categories so we
have to create two references so let's start with the data so first we have the data and i'm going to set this equal to data so i write equal to data and well now i'm just going to format this properly and then i'm going to explain to you how this works so first i'm going to put this in one line so you can see it much better and okay now we have to make a little change to this reference and i'm going to explain to you why so here we have minimum column maximum column minimum row and
maximum row but these are the minimum and maximum values of the whole pivot table so here we have the whole pivot table and the minimum value as i told you before is a so one but for the data the minimum value is b so number two and the maximum value is still g so seven so the minimum column is two and the maximum is seven so here what we have to do is write minimum column plus one so here i have to add plus one and as i told you before the maximum column is going
to be the same because it shares the same maximum column with the pivot table so it's the same maximum column and then in the rows the minimum and maximum row you can see that the data in yellow has the same minimum and maximum row compared with the pivot table so it's the same so right now we're analyzing the area in yellow and it has the same minimum and maximum row compared to the whole pivot table okay so the only thing that changed was the minimum column because it starts in b so here we added minimum
column plus one okay now let's do the same but now for the categories so now i write categories and here i'm going to delete this and let's analyze so the categories are in green so the minimum column is still a that's correct but the maximum column is not g anymore but is still a so here what we have to do is write in maximum column the value of minimum column so the minimum and the maximum column are going to be min_column okay so now let's see that minimum and maximum row so
the minimum row is not five anymore but it's six and the maximum row is seven so here what we have to do is add minimum row plus one because minimum row is the reference of the whole pivot table so five plus one six so here i go and add plus one and the maximum row is going to be still seven so it's going to be the same it shares the maximum row with the whole pivot table so it remains the same and here i will leave it as max_row and with this we have
the references for our data and the categories just one little detail you need to know here i highlighted these two areas in yellow and in green so you can know which are the data and which are the categories but this is the same concept that you will follow in any pivot table so the categories are gonna be always on the left here and this doesn't include the header so not this one and the data does include the header so in the data we include the numeric data which is right here and also we include the
headers so you have to follow the same concept all right now let's create a bar chart so now that i have the references i'm gonna use this bar chart object so i just copy barchart and now i'm gonna use a method called add_data and we already have the data so the data is here in my data variable so i only have to add data and then we have to indicate where we want to create this bar chart so i'm going to create this in the cell b12 and well with this our bar
chart is going to be created in b12 all right now let's add a title to this bar chart and then let's add a style so first let's add the title so i write barchart.title and we set this equal to and we can write anything we want for our title in this case i'm gonna write sales by product line because this is what our bar chart is about so sales by product line and then i'm gonna change the style so i'm gonna write .style and this is basically the style that our bar
chart is going to have so when we create charts in microsoft excel we have different styles so different colors and different shapes and we can select here only with numbers so for example we can use style number one style number three or style number five so for now we can only guess the number and then we can see the results when we export this bar chart all right now let's save the result so i write wb which stands for workbook and then save so .save and i'm going to export this as
barchart.xlsx so with this we're gonna save all the results in this excel file so now what i'm gonna do is run all of this and see the results so i run this well we got a message of success but before i open the file here i made a little mistake here i forgot to add the categories so we created categories but we didn't add them to the bar chart so here i'm gonna write here barchart.set_categories and inside i'm gonna put these categories that we created so i put this inside
and here also this shouldn't be here this b12 shouldn't be here sorry this should be in another method so first we add data and we add categories to our bar chart and then once this is done what we have to do is write sheet.add_chart and then we put our bar chart here so i'm gonna write barchart and then i'm gonna set the cell where we want this bar chart to be added so i set this equal to b12 and this bar chart is going to be added in b12
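Putting the reference and chart steps together, a minimal sketch might look like this. It assumes openpyxl is installed, and it builds a small in-memory sheet with placeholder numbers instead of loading pivot_table.xlsx, so the header labels and values are made up; the layout (header on row 5, data on rows 6 and 7, columns A to G) mirrors what the video describes.

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

# Assumption: an in-memory sheet stands in for load_workbook("pivot_table.xlsx").
wb = Workbook()
sheet = wb.active
sheet.title = "Report"
rows = [
    ["Gender", "A", "B", "C", "D", "E", "F"],   # placeholder product-line headers
    ["Female", 10, 20, 30, 40, 50, 60],
    ["Male", 15, 25, 35, 45, 55, 65],
]
for r, row in enumerate(rows, start=5):          # the table starts on row 5
    for c, value in enumerate(row, start=1):
        sheet.cell(row=r, column=c, value=value)

min_column, max_column = sheet.min_column, sheet.max_column
min_row, max_row = sheet.min_row, sheet.max_row

# Data: columns B..G, including the header row.
data = Reference(sheet, min_col=min_column + 1, max_col=max_column,
                 min_row=min_row, max_row=max_row)
# Categories: column A only, without the header row.
categories = Reference(sheet, min_col=min_column, max_col=min_column,
                       min_row=min_row + 1, max_row=max_row)

barchart = BarChart()
barchart.add_data(data, titles_from_data=True)   # first row of data names each series
barchart.set_categories(categories)
barchart.title = "Sales by product line"
barchart.style = 5
sheet.add_chart(barchart, "B12")                 # the anchor cell belongs to add_chart
# wb.save("barchart.xlsx")  # uncomment to write the file
```

Because the references are derived from the sheet's own minimum and maximum rows and columns, the same code keeps working if the pivot table grows.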
just a recap first we add the data and the categories to our bar chart and then we use the add_chart method to indicate where we want to add this bar chart okay now i'm going to add just one more parameter here so i'm going to add titles_from_data and here equal to true so we have the titles from data in this bar chart so with this our bar chart is ready so now i can run this and see the results so i now run this and well we got this
message of success and now i'm gonna open this excel file so we can see the results all right here i open the barchart file that we created before and here we have the pivot table and also we have the bar chart so this bar chart corresponds to the data in our pivot table so we created this using the data here and also the categories here on the left and as you can see we got here the title that we set here so sales by product line we also got this style number five so these
are the colors and the format we have and well we did all of this using openpyxl and one of the coolest things we did when we wrote this code is that no matter how many rows this pivot table has our code is gonna work because here we use minimum and maximum values here so if this pivot table has more rows and more columns this is still going to work because we use references and that's it for this video in this video we learned how to create bar charts using openpyxl okay in
the previous video we learned how to create a bar chart so we used this pivot table to create this bar chart that you see here and now we're gonna see how to create formulas like this one so for example we're gonna see how to sum these two values but now we're gonna do this with python so instead of writing this formula manually we're gonna do this here with python all right first i'm going to show you the easiest way to create a formula in this spreadsheet using python so here i'm going to first write this
formula so i write sum and well we have this formula and now i'm gonna copy this formula so the easiest way to create this formula is just writing this in python so first i'm gonna paste it here and i'm gonna put this in quotes and well i'm gonna here set this cell which is b8 equal to this formula so here i write sheet so first of course you have to read this excel file so we have this barchart.xlsx which is the name of my file that we created before in the previous
video and then we have to select the sheet so we have to select this report sheet this is the report sheet that we have here and well we named this as sheet so now we select sheet and we select the cell we want to work with so in this case b8 and then we write this so sheet b8 and this equal to this expression so this is basically the same as doing this so we have b8 here and we write this equal to this formula so it's basically the same so we have this but
now in code and now to complete this i'm going to set a style for this cell so i only have to write the name of this cell and then .style i'm going to set this equal to currency so the format is going to be currency and well we're going to have here a dollar sign i think and well we're going to have the sum and the sum is going to be in currency format so okay now let's run this and let's see the results so first i'm going to delete this because i don't want
to do this manually so i delete this and now let's see the result we should have the same but now with python so i run this and well we see the result but actually we didn't save this file so i'm gonna write here wb just to make sure everything was saved and .save and i'm going to export this in another file so this is going to be called report.xlsx so now i run this and let's see the results so now we have this file and well let's open this file all right i
have the file here opened and as you can see we have here the sum that we did before so here is the formula and here's the value and we have here the dollar sign because we set this cell with the currency style so great so with this we could make this sum using python but we did this for only one cell and what if now we want to do this sum for all the cells that are here so probably we want to calculate the sum for all the cells
that are here, so let's do that. The simplest thing you can do is duplicate this code, and for example for C8 you just change the B to C everywhere, so the range becomes C6 to C7. That works, but I'm going to show you a better way to do this with a for loop. I delete the duplicate and comment this out. Instead of doing this manually for every column you have,
we can use a for loop. First we're going to use the references that we had before: the minimum and maximum columns and rows. I'm going to loop through the columns (not the rows), because we're going to sum the columns between B and G. So I need the columns in this range, from B to G, and to get that we have to use the minimum column plus one: the minimum
column is A, and plus one gives B, and then we need the maximum column. Minimum column plus one up to the maximum column is basically the range where our data is; as you might remember from previous videos, this was in yellow and it belongs to our data, so B to G. These are our two references, and we need to loop through this range. To do that, I'm going to use the range function, so I write range, open parenthesis, and then
we have to write the minimum value, which is min_column + 1, and then the maximum value. When we use range, we always have to write the maximum value plus one, because range stops one before the end value. It's kind of tricky, so I'm going to show you. First I write the for loop, for i in range, and to begin with I loop over, let's say, 1 to 10, so you can see what I'm talking about.
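The behavior of range described here can be checked with a short sketch. The values of min_column and max_column (1 and 7) are assumed from the sheet used in this video, where the table spans columns A to G.

```python
# range(start, stop) includes start but stops one before stop
one_to_nine = list(range(1, 10))
print(one_to_nine)

# Adding 1 to the end value includes it in the result
one_to_ten = list(range(1, 10 + 1))
print(one_to_ten)

# Assumed values for the example sheet: column A is 1 and column G is 7
min_column = 1
max_column = 7

# Skip the first column (A) and include the last data column (G)
data_columns = list(range(min_column + 1, max_column + 1))
print(data_columns)
```

The last list holds the column numbers that the loop in this video iterates over.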
Then I print i. I'm not going to save the file here; I just want to show you how this works. When we print a range from 1 to 10, we get 1, 2, 3 up to 9, so we get the end value minus one. That's why I said we have to add one: if we write plus one here, we actually get the number 10. So I delete this test, and I use max_column + 1 because I want to include the
maximum column. Now I print i and we can see the results: we get 2 through 7, and that's correct, because we want B through G, where B is column 2 and G is column 7. Great, we have the column numbers. Now I'm going to import something called get_column_letter, and to do that we write from openpyxl.utils import get_column_letter.
This helps me get the letter of a column from its number. To show you, I write get_column_letter and put i inside the parentheses, and I print both i and this result so we can compare the values. I run this, and as you can see, 2 gives column B, 3 gives column C, and 7 gives column G. So this gets the letter based on the
number: if we give it 7 it returns G, and if we give it 2 it returns B. That's what get_column_letter does, and we're going to use these letters to build the formula. I copy and paste the earlier code here so we can see better what we're going to do. Instead of writing B8 directly, I turn the string into an f-string and open curly braces to insert a variable, so instead of writing B I'm going to use a letter. I delete this and
set this expression equal to letter, so my variable is letter = get_column_letter(i). This gives us the letter, for example B, and instead of writing B here I write letter, so we have the letter and then the number. Now I do the same in the formula: I make it an f-string, and instead of writing B, I open curly braces in both places and write letter inside each. With this we have
the letter in our formula, and we can do the same here: instead of B, once more, letter. That part is done. Finally, we have to change the numbers, so instead of writing 8 we use a reference, in this case the maximum and minimum rows. For example, here we had B8, which is the maximum row plus one, because the sum formula is always located one cell below our pivot table. So we only have to take
the last row of our pivot table, that is, the maximum row, and add one, and this guarantees that we get this cell, B8: row 7 plus 1 is 8, so we get row 8. Now let's write this: instead of 8 we write the maximum row plus one, and the same goes here, maximum row plus one. As you might expect, the formula is always going to be located in the maximum row plus one.
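Put together, the loop described in this part might look like the sketch below. It builds a small stand-in sheet in memory instead of loading the real workbook, so the cell values and label text are made up for illustration; the layout (header in row 5, data in rows 6 and 7, columns A to G) follows the example sheet. The summed range starts one row below the header row and ends at the last data row.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

# Stand-in for the report sheet: header in row 5, data in rows 6-7, columns A-G
wb = Workbook()
sheet = wb.active
sheet.title = "Report"
for row in range(5, 8):
    for col in range(1, 8):
        if row == 5 or col == 1:
            value = f"label {row}-{col}"   # made-up header/category text
        else:
            value = row * col              # made-up numbers
        sheet.cell(row=row, column=col, value=value)

# Capture the bounds once, before the totals row is written
min_column, max_column = sheet.min_column, sheet.max_column
min_row, max_row = sheet.min_row, sheet.max_row

for i in range(min_column + 1, max_column + 1):
    letter = get_column_letter(i)          # 2 -> "B", ..., 7 -> "G"
    total_cell = f"{letter}{max_row + 1}"  # one row below the table, e.g. "B8"
    sheet[total_cell] = f"=SUM({letter}{min_row + 1}:{letter}{max_row})"
    sheet[total_cell].style = "Currency"   # built-in named style
    print(total_cell, sheet[total_cell].value)
```

The bounds are read before the loop because writing the totals row would otherwise increase the sheet's maximum row.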
Then we have to edit the numbers 6 and 7, and for this we take a different approach. We always sum the range where the data is, in this case B6 to B7. The start of that range is always the minimum row plus one, because the minimum row holds the headers and the data comes right after the headers. The maximum row is where the range ends; for example, this one ends in
B7, which is exactly where our pivot table ends. So we have the minimum row plus one, and then the maximum row, and that is our range. Let's do this here: instead of 6 I write the minimum row plus one, and instead of 7 I write the maximum row. This holds as long as your pivot table has this format, which is the standard format for a pivot table: you always have this header, you always have these categories, and always the
data is here, so this is usually going to work. Now that we have this, I first comment out the earlier print, because I don't want to print i anymore, and instead I print the formula to check that we did all of this correctly. I run this, and we get SUM of B6 to B7, which is what we had before, but now we also have C6 to C7, D6 to D7, and so on until
G6 to G7. These are the formulas that will go in the sheet, and that's great because we built them with a for loop instead of manually, one by one. Now that we've verified this is correct, I comment out the print, uncomment the code that writes to the sheet, and delete what I'm not going to use. All right, now that everything is ready, I uncomment wb.save to create the report, so I run this and let's see what
happens. I run it, and now I have the file in my working directory, so I open it to see the results. All right, I just opened the file, and as you can see, we have all the formulas that we created with the for loop in Python. As I mentioned before, the code we wrote is going to work as long as the pivot table has this format. You can still add more columns, or you can add
more rows, but the format has to be something like this. That's it for this video, in which we learned how to create formulas in our spreadsheet using Python. All right, so far we've learned how to create our pivot table, how to create formulas for multiple columns, and how to create the bar chart using Python. Now we're going to complete the report by adding a title and a subtitle and changing the font, so let's do this. I go here to the right, and first we have to import
load_workbook from openpyxl. We read the Excel file, which I named report.xlsx, and then we select the first sheet, which is the Report sheet. Once we have this, we can access the cells, so I write sheet and open square brackets. I want to put the title in A1, so I write 'A1' and then assign the title, which is Sales Report. This is my title, and then
I write a subtitle in A2, which is going to be the name of the month, so I write January. So we have the title, Sales Report, and then the month in A2. Okay, now let's edit the font, starting with the title. I write .font, but first we have to import Font from openpyxl, so I write from openpyxl.styles import Font. Now I can use this Font class: I only have to write Font, and then we
can choose the font we want; in this case I choose Arial. Then we can select bold, for example, so I set bold equal to True and the title is going to be in bold. Then we can set the size: I write size equal to, and you can choose any size you want, so I choose 20. We can do the same for the subtitle, so I write sheet['A2'].font, which is the font attribute.
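As a minimal sketch of the title and subtitle steps, the snippet below uses a new in-memory workbook rather than loading report.xlsx, which is an assumption made so the example runs on its own. The subtitle font settings (Arial, bold, size 10) are the ones chosen a little later in this video.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

# In the video the workbook comes from load_workbook('report.xlsx');
# an empty in-memory workbook is used here so the sketch is self-contained
wb = Workbook()
sheet = wb.active

sheet["A1"] = "Sales Report"   # title
sheet["A2"] = "January"        # subtitle: the month of the report

# Font takes the font name plus optional settings such as bold and size
sheet["A1"].font = Font(name="Arial", bold=True, size=20)
sheet["A2"].font = Font(name="Arial", bold=True, size=10)

# wb.save("report_january.xlsx")  # uncomment to write the file to disk
```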
Then we can change it using the Font class we imported, and again I set the font to Arial. Here you can choose any font that you know from Microsoft Excel; I don't remember that many fonts, so I'm just choosing Arial. Then I write bold equal to True, and the size is a bit smaller this time, 10. After this, as usual, we have to save the results, so we write wb.save, and I set the name to report_january.xlsx, so this is
going to be my file. Now I only have to run this, and I open the file to see the results. All right, I just opened report_january, and here we have the title, Sales Report, in bold with size 20, as you can see. We also have the subtitle, January, again in bold, but now with size 10. So we successfully created the title and subtitle and edited the font, and that's it. Now feel free to go to the code and
hover over the Font class so you can see all the parameters you can use, for example strike, color, scheme, size, bold, italic, and so on. All right, in this video we're going to put all the pieces together and convert the pivot table into an Excel report. We're going to use the three scripts that we created in the previous videos: this one added a chart, this one added formulas, and this one added a title and a subtitle. I put these three scripts together
in this script, which I named pivot_to_report. It's basically the same code; I just made a small change and added a variable named month. Here we can set the month of the year, and this is also going to be the month of the report. Since I added this, I changed the name of the file that is going to be exported: instead of hard-coding report_january as we had before, I introduced these
curly braces to insert the month variable. So we can set the month to, for example, March or December, and the Excel file will be named accordingly, report_march or report_december. I also set the subtitle to month, so the report shows the name of the month. With that, our script is done. It's basically the same as what we have in scripts number 2, number 3, and number 4, but now with the month variable. The input is going to be our pivot
table, and the output is going to be a report with the formulas, the bar chart, and the title and subtitle. Now let's run this. We get a success message, so let's see the result: here we have the file, named report_february, with the title, the subtitle, the formulas for each column, and the bar chart. So we successfully converted the pivot table into this report, and we could
do all of this by putting together the scripts that we wrote in the previous videos. That's it for this video; in the following video I'm going to show you how to convert this script into an executable file so you can automate all of this even more. All right, in this video I'm going to show you how to convert this .py file into an executable. Right now our script is a .py file, and we're going to convert it into an executable so we can automate all of this. The first thing we have
to do before converting the .py file into an executable is import two modules: os and sys. Now we have to create a path for our executable, and to do that we write os.path.dirname and pass it sys.executable. This represents the folder where our executable is going to be located, and I set it equal to application_path. We need to do this because when
we convert a script into an executable, the paths become a bit messy, so we need to specify exactly where our executable is located. That's why we do this. All right, now instead of writing only the name of the file as we did before, we have to join the file name with the path that we have here. So I create an input path: I write input_path and we set
this equal to os.path.join, joining the application path with the name of our file. The result is the input path, and this input path is what we read in load_workbook. We need to make sure that pivot_table.xlsx is located in the same folder as our executable. I'll do this later: when I create the executable file, I'm going to put this Excel file inside the
same folder, and you have to do the same. Okay, now that we have the input path, I'm also going to create an output path. I scroll all the way down, and instead of writing only the name of the file, I write output_path and set it equal to os.path.join, again joining the application path with the name of the file. With this we have the output path.
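The path handling described here can be sketched as follows. The file name pivot_table.xlsx and the report file name pattern come from this course; the fixed value of month is only a placeholder. Note that sys.executable is the packaged program only after the script has been converted; when the code runs as a plain script, it points to the Python interpreter instead, so application_path is then the interpreter's folder.

```python
import os
import sys

# Folder that contains the running executable
application_path = os.path.dirname(sys.executable)

month = "february"  # placeholder value for illustration

# The input file is expected next to the executable
input_path = os.path.join(application_path, "pivot_table.xlsx")
# The report is written to the same folder
output_path = os.path.join(application_path, f"report_{month}.xlsx")

print(input_path)
print(output_path)
```

In the full script, input_path would be passed to load_workbook and output_path to wb.save.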
I pass output_path to wb.save, and with this the report file is exported to the same folder where our executable is located. So the input and the output are both in the same folder as the executable, which is something to keep in mind. All right, to customize our automation even more, I'm going to ask for the name of the month. Instead of hard-coding the name of the month, what I'm
going to do is write month equal to and use the input function. So instead of editing the name of the month in the code, every time we run the executable it asks for the name of the month, with the prompt Introduce month. This is very cool because every time you run the executable you can change the month: it's not always going to be February, it can be March, December, July, any month you want, and all the changes are
going to be reflected in the code: wherever month is used here, it's going to be the same month that you typed. I delete this, and with that, the script is done. Now we can convert the .py file into an executable, because we already set the input and output paths. We open the terminal, where we can run the command that converts the .py file into an executable. To do that, we first have to go to the folder where our script
is located. In my case the script is named 6.py-to-exe.py and it's located in the tutorial folder, as you can see here, so in the terminal I also have to be in the tutorial folder. Once you are in the folder where your script is located, you run the following command: pyinstaller, then --onefile, and then the name of your script, in my case 6.py-to-exe.py.
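For reference, the build step can be written out as below. This is a sketch that assembles the same command in Python instead of typing it in the terminal, which is a substitution made here for illustration; build_command, build_executable, and the script name my_script.py are hypothetical. It relies on PyInstaller also being runnable as python -m PyInstaller, which does the same thing as the pyinstaller command.

```python
import shlex
import subprocess
import sys


def build_command(script_name):
    """Return the argument list for a one-file PyInstaller build."""
    python = sys.executable or "python"
    return [python, "-m", "PyInstaller", "--onefile", script_name]


def build_executable(script_name):
    """Run the build; requires PyInstaller (pip install pyinstaller)."""
    subprocess.run(build_command(script_name), check=True)


# Show the command without running it (my_script.py is a placeholder name)
print(shlex.join(build_command("my_script.py")))
```

Calling build_executable from the folder that contains the script would start the actual build.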
This is the name of my script. In case you don't have PyInstaller installed, you can write pip install pyinstaller, but you probably have it already because we did this step many times in this course. Once this is done, you only have to press Enter, and an executable is created in a folder named dist. PyInstaller creates two folders, a build folder and a dist folder, and the executable is located inside the
dist folder. We get a success message, so let's check: in the left panel we see the two folders, build and dist, and inside dist we have the executable. Now I open this folder; it's the same folder in a different view, and you can see the executable. To run it, we first need to put pivot_table.xlsx inside this folder. We put this file inside
this folder because, as I mentioned before, we need it in the same folder as the executable. Now that the file is there, we can run the executable, and to do this on a Mac I open it with Terminal. Now I have the terminal, and as you might remember, I added an input to my code, so it's asking me for the month: it says Introduce month. What I have to do here is just
write the month, so let's say March. I press Enter, and as you can see, the script was executed: we have a new file named report_march.xlsx, and it contains the report with the title, the subtitle, the formulas, and the bar chart. I could open it here, but I don't have Microsoft Excel, so it would open with the Numbers app on my Mac. Instead, I'm going to open it in
Google Sheets. Here is report_march, and you can see that everything was created successfully: we have the formulas, the title, the subtitle with the name of the month I entered, which is March, and the bar chart. That's it, we successfully automated our Excel report. In this video I'm going to show you how to send messages in WhatsApp using Python, so you can send a message to anyone you want at any time you want. Let's get started.
The first method to send messages in WhatsApp using Python is a library named pywhatkit, which lets us send WhatsApp messages easily. First I open a terminal, here in PyCharm, and write pip install pywhatkit, which is the name of the library. Keep in mind that I'm installing this in a virtual environment, and I highly recommend you do the same, because this library has a lot of dependencies and you want to avoid conflicts. So
install it in a virtual environment. I hit Enter and it installs. In my case the library is already installed, so I get the message Requirement already satisfied, but for you it will probably take a few seconds or even a minute. All right, now that we have the library installed, I close the terminal and import it, so I write import pywhatkit. To send a message, I write pywhatkit.sendwhatmsg, and inside I'm
going to use these parameters: the first one is the phone number, as you can see here, the second is the message, the third is the time in hours, and the fourth is the time in minutes. For the phone number, I write the number with the country code, so a plus sign and the country code (for example, in America I think it's +1) followed by the number. The second parameter is the message you want to send, and I just write Test. Then comes the time in hours.
Here I write the time; I think it's 7:17 in the morning here, so I write 7 for the hour, and for the minutes I write 21, so the message goes out at 7:21. As for the phone number, I'm not going to type it here; I'm going to hide it because I don't want to get any messages. So, just for the sake of this video, I define a variable named phone_number, and I write input with the prompt Enter phone number. Every time I run this script, it's going to ask me for the phone
number and I'll type it in. I delete the placeholder and paste the variable in its place. You don't need to define this variable; you can write the phone number directly here. All right, before running this script, keep in mind that you need to log in to WhatsApp manually in the browser you use. So go to your browser, Google Chrome or Safari, and log in to WhatsApp using the QR code before running this code. I already did this, so I'm just going to run the code and try it out.
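A minimal sketch of the call described above is shown below. The helper names check_schedule and send_test_message are hypothetical, and the phone number is a made-up placeholder. The hour and the minute are separate integers, so 7:21 is passed as 7 and 21; the hour is assumed to use the 24-hour clock. pywhatkit is imported inside the function because the library drives the desktop browser and may fail to import on a machine without a display.

```python
def check_schedule(phone_number, hour, minute):
    """Validate the values before scheduling (hypothetical helper)."""
    if not phone_number.startswith("+"):
        raise ValueError("include the country code, for example +1...")
    if not 0 <= hour <= 23 or not 0 <= minute <= 59:
        raise ValueError("hour must be 0-23 and minute 0-59")
    return phone_number, hour, minute


def send_test_message(phone_number, hour, minute, message="Test"):
    """Schedule a WhatsApp message; WhatsApp must already be logged in in the browser."""
    import pywhatkit  # imported lazily, see the note above

    phone_number, hour, minute = check_schedule(phone_number, hour, minute)
    # Arguments in order: phone number, message, hour, minute
    pywhatkit.sendwhatmsg(phone_number, message, hour, minute)


# Example call (not executed here): send at 7:21
# send_test_message("+10000000000", 7, 21)
print(check_schedule("+10000000000", 7, 21))
```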
I check the time, and it's 7:19 now; I want to get the result faster, so I set the time to 7:20. Now I run this. First I need to enter the phone number, so I paste it, and it says that in four seconds WhatsApp will open and after 15 seconds it will deliver the message. Now WhatsApp is opening, and it typed my message, as you can see at the bottom of the screen. Let's wait 15 seconds and it's going to send the message. As
you can see, it just sent the message, which is Test, without any problem. Now I go back to PyCharm, close this, and show you other things you can do with this library. You can add more parameters to this function, and you can even close the window that was opened. Earlier we opened this WhatsApp window, but as you can see, the window is still there. We can add a parameter to the call to indicate that we want the window to close
after we send the message. To do that, I copy and paste the previous call, and we add two or three more parameters. The first is wait_time; as you can see here, it is 15 by default, and it's the number of seconds to wait before the message is delivered. I'm going to leave it at 15 seconds; it's not so important. Next is tab_close, which indicates whether we want
to close the window or not. It's set to False by default, and we set it to True. Finally we have close_time, the number of seconds to wait until the tab is closed, and I set it to 2 seconds. Basically, what we're saying here is that we want the tab to close after the message is delivered, waiting only two seconds. Let's run this code. First I change the time, because it's not 7:21
anymore; I set it to 7:25 and run. I paste the number, and it says that in 18 seconds WhatsApp will open and after 15 seconds the message will be delivered. These 15 seconds are the wait_time we set here; it's the default and we could change it, but that's not important for what we're doing right now. Another WhatsApp window opened, and it says Test again, as you can see. Now let's wait 15 seconds for the message to be delivered, just a couple of seconds
more. Here is the Test message, and the window, or rather the tab, was closed, as you can see. Great. What I've shown you so far is how to send messages to contacts, but we can also send messages to groups, so let me show you how. To send messages to a group that you're part of, we only need the group's ID. Every group in WhatsApp has an ID, and you can get it by going to the group you want and opening the Group info section. Then you have to tap
on the Invite via link option and copy that link. I already have the link of the group I'm going to use for this video, and if you didn't follow that, I'm going to leave a guide that you can use to get the ID of the group you want to test. First I define a variable like we did before with phone_number, but in this case I name it group_id, with the prompt Enter group id. This isn't necessary; it's just for the sake of this video. Now we use pywhatkit again,
and to send a message to a group we have to use, I think it's this one, sendwhatmsg_to_group. The parameters are similar, but instead of the phone number we use the ID of the group, so the first argument is the ID. The second is the message, Test group. The third is the hour, in my case 7 because it's 7 in the morning, and I set the minutes to 31. So it's basically the
same; the only thing that changes is the ID. I copy group_id and paste it here, and now we can test this out. I right-click and run the script, and it asks for my phone number because I didn't comment out that line. Give me a second: I comment it out and run the script again, and now it's asking for the group ID. I paste the group ID and press Enter, and it says that in 49 seconds WhatsApp will open and then the
message will be delivered. I'm going to cut the video and come back when the automation starts. Now it's working: it found the group, so let's wait for the message to be typed at the bottom. As you can see, it says Test group, and the Test group message was delivered to this group. Great, we just learned how to send messages in WhatsApp to contacts and to groups using pywhatkit.
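The two variations covered in this part, closing the tab after sending and sending to a group, can be sketched as below. The function names and the example invite link are hypothetical. The sketch assumes that the group ID is the last part of the invite link, which is worth checking against the pywhatkit documentation, and pywhatkit is imported inside the functions because it drives the desktop browser.

```python
# Options from the video: default wait, then close the tab 2 seconds after sending
CLOSE_TAB_OPTIONS = {"wait_time": 15, "tab_close": True, "close_time": 2}


def group_id_from_link(invite_link):
    """Return the last path segment of an invite link (assumed to be the group ID)."""
    return invite_link.rstrip("/").split("/")[-1]


def send_and_close(phone_number, message, hour, minute):
    """Send to a contact and close the browser tab afterwards."""
    import pywhatkit

    pywhatkit.sendwhatmsg(phone_number, message, hour, minute, **CLOSE_TAB_OPTIONS)


def send_to_group(group_id, message, hour, minute):
    """Send to a group identified by its ID."""
    import pywhatkit

    pywhatkit.sendwhatmsg_to_group(group_id, message, hour, minute)


# Made-up invite link for illustration
print(group_id_from_link("https://chat.whatsapp.com/AbCdEfGh123"))
```

A call such as send_to_group(group_id, "Test group", 7, 31) would then mirror the group example from the video.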