Over the next 40 minutes I'm going to present everything you need to know about Microsoft Fabric, Microsoft's new end-to-end data and analytics platform. The key learning outcomes for this course are that you walk away with a solid understanding of what Fabric is, the problems it solves, and how it solves them. But more than just the what, why, and how of Fabric, what I really want to share with you is some of my passion and excitement for Microsoft Fabric, so you leave with an appreciation for the huge opportunity that Fabric represents for your organization and for you personally in your own career.

We'll begin by telling the story of Houston Electric, a fictional electrical-goods e-commerce company that is struggling with its existing data infrastructure and workflows. Through this case study we'll explore the problems that Fabric was built to solve. We'll then dive into the big vision behind Fabric: I'll show you exactly how Microsoft Fabric solves these typical industry problems, and you'll begin to see why I believe Fabric is such a transformational technology.
Now, if you want to truly understand Fabric, there are some fundamental concepts you'll need to learn, so in part two of the course we'll cover what I believe are the most important ones: OneLake, the seven experiences of Fabric, and the four compute engines. We'll then look at the seven experiences in more detail, so you gain a proper appreciation for what exists and how you can use it. To round off the course, I'll show you how you can get started with Fabric yourself, and signpost some of my favorite resources for learning Fabric that I think can really help you on your own journey.

So, if you're wondering, "Is this course really for me? Should I invest the time to sit through it?", well, the great thing about Fabric is that it's an end-to-end, unified solution, which means it caters for every data persona in your organization. If you're working in any of these roles currently, maybe outside of Fabric, then this presentation will be highly relevant and interesting for you.
And if you want to be working in one of these roles within Fabric in the future, this course will also be a great foundation. Either way, no knowledge of Fabric is assumed. I'll keep code to a minimum and avoid too much in-the-weeds technical detail; this is quite a high-level course, though where a short code sketch helps make a concept concrete, I'll include one. If you want more detailed explanations and fuller code examples, feel free to look at my other videos on YouTube, where there's lots of that, and I'll be using plenty of practical examples throughout to solidify your understanding of the platform.

I've spent over 200 hours preparing this course, so I believe it's well worth your while to stick around. On top of that, I've been using Fabric nearly every day since it was released, so I've got a good understanding of how it works, and I've been a data professional for the last eight-plus years, working across Power BI, data engineering, data science, data warehousing, and real-time analytics, which, as you're about to see, covers nearly every aspect of Fabric's functionality. All that effort, knowledge, and experience has been condensed into this introduction-to-Fabric course, which I'm releasing for free on this YouTube channel to give back to the community that has helped me so much on my own journey.

As well as on YouTube, this course is hosted for free in my online community, built especially for people like you to learn Microsoft Fabric even faster. I recommend you follow the link in the description, because alongside the video lessons you also get access to the course notes and links to the further resources I'll be mentioning throughout, and you can use the community features to ask me, and everyone else there, any questions you might have about the course and Microsoft Fabric generally. It's completely free, so there's nothing to lose. So, without further ado, let's begin.

To understand why Fabric is such a transformative technology, it helps to properly understand the problems it was built to solve, and to illustrate this I want to tell you the story of Houston Electric. Houston Electric is a fictional electrical-goods online retailer based in the US, but it ships electrical parts around the world. The company has grown to over 400 employees to service growing demand, and it prides itself on being innovative and using data to make better decisions.
In 2015, Houston Electric hired its first Chief Data Officer, who launched a digital transformation program to make better use of technology to drive business growth. Central to this strategy was the establishment of a central data department, now with more than 50 employees, and the company has invested in a number of cloud technologies across Azure and Amazon Web Services, and has started using Power BI. Looking at what the company uses today, we can see that it's split into the following main departments, at least from a technical point of view. Each department has a number of data technologies it uses to store, manage, and analyze its data, and if you're not familiar with all of the logos on screen, don't worry: the point is that the company's data landscape has grown organically over the last eight years with little strategic vision. This is a common situation that a lot of companies find themselves in. Each department uses the tools it prefers, and each department relies on data from other departments to do its job.

For example, the customer success team manages the order reviews database, which keeps track of all the reviews that customers leave about products. Let's look at a particular workflow that uses this order reviews database to see how it works and how the company functions. When a customer purchases a product, they leave a review on the website, and the review data is stored in an Azure SQL database managed by the customer success team. The data engineering team has built a data pipeline that copies this data every morning into a data lake for the data scientists to analyze. The data scientists then take this review data and perform sentiment analysis on it, basically to gauge whether customers like or hate certain products based on the text they write, and these sentiments are then written into a data lake container. Finally, the BI team has created a Power BI report that communicates these findings back to the product teams to help them improve the products going forward.

Now, they've got this workflow to a point where it's working, but it was a pain to set up and it's a pain to maintain, and the data for just this one workflow is scattered across four different locations in four different data formats, some of which are proprietary, meaning you can't easily access the data from other tools. And this is just one workflow: the company operates many data processing pipelines like this, copying data between departments, storage locations, and products. The complexity of the data landscape has grown so much that maintaining these existing systems and processes has become an almost full-time job.
So, to analyze what might be going wrong, let's map out the tools the company is currently using under four broad buckets: data ingestion, data storage, data engineering and data science, and business intelligence. Now let's assess the current situation against the nine considerations listed down the left-hand side, and to do that we're going to speak to a number of key employees in the company.

If you speak to the Chief Data Officer, he'll tell you that the data architecture has grown organically and now there are data silos all over the place: "I'm the Chief Data Officer, and I don't want to be the chief integration officer." Instead of focusing his time on how the company can generate more value from data, his role has become focused on system integration, which is a massive headache for him, for his team, and for the company. So on our visual map we can plot blocks representing all the different systems that create data silos.

Then, if we speak to the data engineering lead, he tells us: "We're maintaining hundreds of pipelines copying data between lots of different data stores for different departments, and it's messy." To get around all these silos of data within and between departments and data products, the data engineering team has been flat out building data pipelines that copy data from here to there, and now they're in too deep: the pipelines fail regularly because there's just too much to manage and maintain within a small team. On our diagram we can now see all the copies between each system and product, and how this causes confusion, a maintenance burden, and data quality problems too.

And it's not just the data engineering team this causes issues for. Imagine being a Power BI developer: how on earth would you know which datasets to connect to within the organization, and how can you know whether they're up to date and whether you can trust them? The data science team faces similar issues. They mention: "We have data scattered in so many formats in so many places, and it takes me days to get clean datasets just to begin an analysis, and even then I don't know if I can trust the data." On top of that, the sheer number of different systems creates a cognitive overhead for this team. They added: "I had to learn the intricacies of many different data technologies, and each one is different. We had a new starter, and it took three months to upskill them in all the different data platforms we use." So let's add those problems to our visual. The data scientists point out that there's no uniformity in data formats across the organization, which makes it difficult to get everything into the common format needed to begin an analysis.
On top of that, they have to work every day across five, six, maybe seven data products, each with its own learning curve, user interface, and user experience, and this is inefficient, especially for new joiners. The different colored blocks here show how the user experience differs in each of the tools the company uses.

Next, the IT director isn't happy either. He says: "We're using too many systems, all with different security profiles and requirements to keep data at rest and data in transit secure, and it's a nightmare." And it's true: it's not just the people on the front end of these tools who find it difficult. Spare a thought for the IT director and his team, responsible for managing access to each of these tools independently, securing them and the data stored within them, governing the data, and then monitoring and maintaining the systems generally. And as if that wasn't bad enough, he also gets the finance director on the phone every month to complain about the Azure bill. He added: "I dread getting our Azure bill every month. It's so unpredictable, and sometimes scary. Each data product has its own pricing structure, so it's difficult to predict how much we'll be charged month to month." So now we add billing and licensing to our visual, and we can see that each product has a different licensing and billing structure, which makes it really difficult for companies to manage.

And finally, there's the poor old dashboard user. They don't know much about all the palaver going on behind the scenes, but they can sense something isn't right. They say: "I'm not sure I can trust the data being presented to me here; it doesn't always reflect reality." And they're right to be nervous. All of this complexity means the data teams have failed to really get a grip on data governance and data quality, and the dashboard users at the end of the process are losing trust. And when they lose trust in the visualizations and the data being presented to them, what's the point of any of it?

So let's just pause there for a second. It was at this point that the head of Houston Electric, smart guy that he is, was watching the Learn Microsoft Fabric YouTube channel (wow, what a coincidence!). He learned about Microsoft Fabric, and he knew it could help solve the problems they were experiencing as a business.
And he was right. You see, within Microsoft Fabric we have a family of data products available to us which loosely fall under the four buckets we were looking at before: data ingestion, storage, data engineering and data science, and business intelligence. Now, looked at in isolation, these tools serve a similar function to the tools the company was using previously, and a common myth you sometimes hear about Microsoft Fabric is that it's just a remarketing exercise, a new badge on existing technology. But this viewpoint fundamentally misunderstands that Fabric has been built from the ground up to address all of the problems we saw in the previous slides. It's not just a marketing exercise but a complete rethink and re-architecture of how data is managed in your organization. Let's look at what we mean by that in more detail.

In Fabric, all of your company data resides in one place, and it's called OneLake, a fundamental concept we'll explore in more detail a little later on. OneLake eradicates data silos: data across all these different products is actually stored in one place. And because of this, it also eradicates the need to create multiple copies of a dataset. A fundamental principle in Fabric is that a dataset should only ever exist in one place, and then be referenced throughout Fabric using a clever feature called shortcuts. So all your organization's data is stored in OneLake, and in OneLake all data is stored in the same format, called Delta Parquet, which is an open-standards format. This solves so many of the problems we highlighted previously, most importantly the data integration problem: data scientists, data engineers, and data analysts might all be using different tools, maybe different languages that they're familiar with, but under the hood they're all working on the same data, in the same format, and they're not wasting hours and days trying to get that data into a suitable format before they can begin their analysis.
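To make that one-copy principle concrete, here's a minimal PySpark sketch. The workspace ("Sales") and lakehouse ("CustomerLakehouse") names are hypothetical placeholders, it assumes you're in a Fabric notebook where the `spark` session is already available, and the abfss:// path follows OneLake's documented addressing pattern.

```python
# A minimal sketch: two different teams read the SAME physical Delta
# table from OneLake, with no copies and no export/import steps.
# Workspace and lakehouse names below are hypothetical placeholders.
from pyspark.sql import functions as F

reviews_path = (
    "abfss://Sales@onelake.dfs.fabric.microsoft.com/"
    "CustomerLakehouse.Lakehouse/Tables/order_reviews"
)

# The data scientist loads the reviews for sentiment analysis...
reviews = spark.read.format("delta").load(reviews_path)

# ...while the analyst aggregates the very same table: same bytes on disk.
avg_scores = (
    reviews.groupBy("product_id")
           .agg(F.avg("review_score").alias("avg_score"))
)
avg_scores.show(5)
```

Because the table is ordinary Delta Parquet in OneLake, any engine that speaks Delta can read it; nothing about the example is specific to one team's tooling.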
Next, Fabric provides a unified user experience. We heard previously that people were sick of having to log into multiple different platforms, each with a different look and feel. In Fabric, you log in via a web portal, and the user experience is designed to feel similar to Microsoft 365: you log in once, you can navigate to any Fabric experience, and each experience has a similar look and feel. This means you can spend more time thinking about how you're going to get value from your data, and less time worrying about how to navigate the application.

For people administering Fabric, there are many benefits too. For starters, access control and security are drastically simplified: in Fabric there's one access control method and one security model applied across all tooling and experiences, and access to resources is principally managed through workspaces, which are just collections of Fabric items grouped around a particular security boundary you want to set up. And because all of our data lives in OneLake, data governance and discoverability become much easier tasks; in fact, Fabric has many features built in for governing your data.

Next up is a really important one: Fabric comes with a single built-in Monitoring Hub, which monitors all Fabric activity across your experiences, so you only have one place to look if you want to monitor all of your data pipelines, notebook runs, and any other processes you're running within Fabric.

And last but not least, billing and licensing in Fabric are unified, meaning that when you purchase some Fabric capacity for your organization, you immediately get access to all of the features and items; there's no longer a separate license or billing structure for each of the different products listed above. There are different levels of capacity you can buy depending on the intensity of your usage, and something to bear in mind is that if your company is currently paying for Power BI Premium capacity, you get an F64 Fabric capacity, which is a lot of capacity, included for free. This means you can create and use any Fabric items today without paying anything extra. Each of the things I've mentioned here is a topic in itself that I could go into in a lot more detail on, but if you go through to the community link and open the classroom tab (I'll leave a link in the description), you'll find links to many more resources where you can learn about each of these specific Fabric features.
OK, so now we're starting to understand the power of Fabric. You've heard about OneLake, the experiences, and the compute engines, and how powerful these things are when they work together, so let's look at how it all works in more detail.

In Fabric there are seven experiences, and together they cover the whole end-to-end data analytics workflow a company might face. An experience is just a logical grouping of tools that makes sense for specific personas; for example, in the Data Engineering experience you'll find easy access to the tools a data engineer would use frequently. In Fabric you'll also find three main data stores, places where you can create and manage data: the data warehouse, the lakehouse, and the KQL database. The important point is that any tabular data you create in any of these stores is, under the hood, automatically stored in OneLake, and crucially, all data stored in OneLake uses the Delta Parquet format, which, as I said previously, is an open-standards format.

To interact with data in any of these stores, we can write T-SQL scripts in the data warehouse, we can write Python, R, or Scala in notebooks in the Data Engineering and Data Science experiences, or we can write KQL in the Real-Time Analytics experience. There are also plenty of low-code and no-code options for beginners in each of these experiences, and Microsoft Copilot is now tightly integrated into Fabric, so you don't need to worry if you don't know much coding.
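As a quick illustration of that multi-language idea, here's what querying a lakehouse table from a notebook might look like. The table and columns are hypothetical, `spark` comes pre-configured in a Fabric notebook, and the trailing comments are rough (not verbatim) sketches of how the same question could be asked of the same Delta files in T-SQL or KQL.

```python
# Hypothetical lakehouse table "order_reviews"; in a Fabric notebook the
# Spark session ("spark") is already available.
recent = spark.sql("""
    SELECT product_id, review_text, review_score
    FROM order_reviews
    WHERE review_date >= date_sub(current_date(), 7)
""")
recent.show(10)

# Roughly the same question, asked of the same underlying Delta files:
#   T-SQL (data warehouse / SQL endpoint):
#     SELECT product_id, review_text, review_score
#     FROM order_reviews WHERE review_date >= DATEADD(day, -7, GETDATE());
#   KQL (KQL database):
#     order_reviews | where review_date >= ago(7d)
```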
Fabric is really built to cater for everyone, from people with no coding experience all the way through to professional developers, and Microsoft is constantly adding more of these features to make Fabric less intimidating, so beginners and entry-level people can get up and running quickly.

Now, the ability to run scripts in many different languages against the underlying Delta Parquet format of OneLake is made possible by the four compute engines in Fabric. These compute engines act like integrators: the user writes some code in a language they're familiar with, the relevant engine converts it into a query against the underlying Delta tables in OneLake, and the data is returned to the user in the experience they're currently using. Once data is stored in OneLake, it's directly accessible by all the other engines without needing any import or export, and all the compute engines have been fully optimized to work with Delta Parquet as their native format. As well as data stores and scripting tools, there are many more items available to us in each Fabric experience.
So now that we have a good understanding of how the overall architecture works, let's review each experience one by one, starting with Data Factory.

The Data Factory experience in Fabric is primarily focused on moving and transforming your data, and a core use case is using Data Factory to get new data into Fabric, perhaps from an external API or by connecting to one of your organizational systems. You could describe the Data Factory experience as a set of tools to help you with extract, transform, and load (ETL) functions, and it's built for enterprise scale too, so if you have a lot of data or a high frequency of refresh, that shouldn't be a problem.

The Fabric items you can create within this experience are, first, the dataflow. The dataflow basically allows citizen developers to connect to more than 300 data sources to bring data into Fabric and transform it using the familiar low-code/no-code Power Query interface; if you're a Power BI developer or you've used Excel, this will feel very familiar. The data can then be written into one of the Fabric data stores, like a lakehouse or a data warehouse. We also have the data pipeline, which is an orchestration tool used to trigger different data processing workflows, normally on a schedule, for example triggering the run of a Fabric notebook or a stored procedure in your data warehouse, and pipelines can also be used to bring data into Fabric. This set of tools provides similar functionality to existing tools you might be familiar with, like Azure Data Factory, Synapse pipelines, and Power BI dataflows (Gen1), and the main personas using the Data Factory tools would be data engineers, analytics engineers, and Power BI developers.
Now, the Data Warehouse experience consists, unsurprisingly, of the data warehouse, which provides a familiar transactional data warehouse solution with tables, schemas, views, stored procedures, and all that good stuff, and it's obviously queryable using T-SQL. It also provides features to make it more accessible for citizen analysts, so there are visual scripting and low/no-code solutions built into this experience as well. And it's lake-centric, meaning that under the hood it's a highly scalable architecture: it's not a traditional SQL Server, and although you can use T-SQL to query it, it's actually built on top of a completely different engine, called the Polaris engine.

So the Fabric item you can create within this experience is, as mentioned, a transactional data warehouse built on top of the Polaris engine for scalability, where users can create tables, schemas, views, stored procedures, functions, all this good stuff. It provides a similar, although not identical, set of functionality to some existing tools: it's similar in a way to SQL Server or an Azure SQL database because it lets you interact with it using T-SQL, but it's more similar to Synapse SQL serverless or dedicated pools because it uses the same underlying Polaris engine, and if you're using tools outside the Microsoft ecosystem, it's comparable to Snowflake (obviously there are lots of other tools I could mention; these are just a selection). The main personas using the Data Warehouse experience would be database administrators, data engineers, and data analysts, those kinds of people.
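To give a flavor of that T-SQL endpoint, here's a hedged sketch of querying a Fabric warehouse from Python. The server and database names are placeholders you'd replace with the warehouse's real SQL connection string from the Fabric portal, the table is hypothetical, and it assumes the Microsoft ODBC Driver 18 is installed.

```python
# A hedged sketch of querying a Fabric warehouse over its T-SQL endpoint.
# Server/database values are placeholders copied from the Fabric portal.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=YOUR_ENDPOINT.datawarehouse.fabric.microsoft.com;"  # placeholder
    "Database=SalesWarehouse;"                                  # placeholder
    "Authentication=ActiveDirectoryInteractive;"
)

# Plain T-SQL, just as you'd write against any SQL database
cursor = conn.cursor()
cursor.execute("""
    SELECT TOP 10 product_id, AVG(review_score) AS avg_score
    FROM dbo.order_reviews
    GROUP BY product_id
    ORDER BY avg_score DESC;
""")
for row in cursor.fetchall():
    print(row.product_id, row.avg_score)
conn.close()
```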
Next up, we have the Data Engineering experience. Data engineering in Microsoft Fabric enables users to design, build, and maintain the infrastructure and systems that enable their organization to collect, store, process, and analyze large volumes of data. The Fabric items you can create in this experience are, first, the lakehouse, which allows you to store and manage unstructured data (files) and convert it into structured data (lakehouse tables). We do that using notebooks, among other tools: the notebook is where a user can write and run scripts in a variety of languages to perform data engineering tasks like cleaning or validating your data, or whatever you like, really. The languages available to you are Python, R, and Scala, and the notebook is built on top of Apache Spark, a big-data processing framework commonly used in the data industry. We also have the Spark job definition, which is a set of instructions, typically written in Python with PySpark, that defines how to execute a job on the Spark cluster; this is more for advanced users who want a bit more control over how their data is processed by the Spark engine.
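Here's a minimal sketch of the kind of notebook code that's typical here, assuming a default lakehouse is attached to the notebook (so relative Files/ paths resolve) and using hypothetical file paths and column names.

```python
# A minimal data engineering sketch for a Fabric notebook: read raw CSVs
# from the lakehouse Files area, lightly clean them, and save the result
# as a Delta table. Paths and column names are hypothetical.
from pyspark.sql import functions as F

raw = (
    spark.read
         .option("header", "true")
         .csv("Files/raw/order_reviews/*.csv")
)

cleaned = (
    raw.dropDuplicates(["review_id"])
       .filter(F.col("review_text").isNotNull())
       .withColumn("review_score", F.col("review_score").cast("int"))
)

# saveAsTable lands the result in OneLake as Delta Parquet, where every
# other Fabric engine can immediately read it.
cleaned.write.format("delta").mode("overwrite").saveAsTable("order_reviews")
```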
Now, this set of data engineering tools provides broadly similar functionality to tools you might be familiar with, like ADLS (Azure Data Lake Storage), Databricks, or Snowflake, and obviously the main personas using the Data Engineering experience would be data engineers and analytics engineers.
Next up, we have the Data Science experience, which provides a complete set of tools to support the entire data science workflow within an organization, right the way through from data exploration, preparation, cleansing, and experimentation to modeling, model scoring, and serving your predictive insights within a Power BI report.

The Fabric items you can create in this experience are, again, the notebook: a key tool for data scientists, who use it to explore data through code, typically Python or R, for data exploration, running experiments, training machine learning models, and the other things data scientists do. Another item in this experience is the experiment. When a data scientist is training a machine learning model, they'll typically run a lot of experiments to optimize it, and the experiment item in Fabric provides functionality to track each of these iterations, logging things like the parameters being used, the code version, and the evaluation metrics for that particular run. Experiments use MLflow, which is basically the industry standard for this kind of logging of machine learning model training and experimentation. Finally, we have machine learning models: during the experimentation phase, specific versions of machine learning models can be registered using MLflow, and Fabric provides functionality for managing and reviewing those versions through the ML model item.
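Since experiments are built on MLflow, a short sketch of the standard MLflow tracking calls shows what's being logged under the hood. The experiment name and the synthetic data here are purely illustrative.

```python
# A hedged sketch of experiment tracking with MLflow, the library the
# Fabric experiment item is built on. Names and data are illustrative.
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; in practice this would come from a lakehouse table
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("review-sentiment-experiment")  # hypothetical name

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X_train, y_train)

    # Each run logs its parameters and metrics so iterations are comparable
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 5)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Logging the model is what allows a version to be registered and reviewed
    mlflow.sklearn.log_model(model, "model")
```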
You'll find similar functionality here to other tools like Azure Machine Learning, Synapse notebooks, and Databricks notebooks, and the main persona using the Data Science experience is, not surprisingly, the data scientist.

The Real-Time Analytics experience in Fabric provides a set of tools to ingest, manage, and analyze real-time event data. Event data is a quite different paradigm in data analytics, requiring a different mindset and a different set of tooling. The Fabric items you can create within this experience are, first, the KQL database, which is a data store for your streaming datasets. It's built on top of the KQL (Kusto Query Language) engine, and if you're familiar with Azure Data Explorer, used for things like logging, that's the same engine being used in the KQL database within Fabric. We also have eventstreams, a no-code tool for registering streaming datasets, processing them, and then routing them to various destinations in Fabric. And we have the KQL queryset, which basically allows you to query data in a KQL database using the Kusto Query Language. This set of tools provides similar functionality to existing tools; as I mentioned, it's very similar to Azure Data Explorer, although some things, like the eventstream, are completely new, built on top of the same engine. As for personas, it could be a data engineer or an analytics engineer, or, if you have them, real-time or IoT (Internet of Things) engineers, or perhaps security engineers, because this kind of tooling is used a lot for security event logging and tracking.
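For a taste of KQL from Python, here's a hedged sketch using the azure-kusto-data package. The query URI, database, and table are placeholders, and it assumes you're already signed in via the Azure CLI.

```python
# A hedged sketch of querying a Fabric KQL database from Python with the
# azure-kusto-data package. The query URI (copied from the KQL database's
# details page), database name, and table are placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://YOUR_CLUSTER.kusto.fabric.microsoft.com"  # placeholder URI
)
client = KustoClient(kcsb)

# KQL reads naturally as a pipeline of operators over event data
query = """
fridge_telemetry
| where timestamp > ago(1h)
| summarize avg_temp = avg(temperature) by bin(timestamp, 5m)
| order by timestamp desc
"""

response = client.execute("SensorDB", query)  # database name is hypothetical
for row in response.primary_results[0]:
    print(row["timestamp"], row["avg_temp"])
```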
Power BI is Microsoft's business intelligence solution, which allows you to create reports presenting visual insights to business users. Within Fabric there are quite a lot of items you can create under this experience; I'm going to talk about two of the main ones. The first is the report: the report is basically a business-facing business intelligence artifact, and you use it to visualize business data and insights and present them back to your users and your business. When you publish a report, you also get an associated semantic model; these are what used to be called datasets in Power BI until not too long ago. The semantic model is the collective term for all the things that make up the back end of a Power BI report: your tables, your relationships, your DAX measures, and now things like calculation groups as well. This experience provides a set of tools similar to, say, Tableau or Looker; obviously it's not exactly the same, there are lots of differences, but it's that same family of business intelligence tools if you're coming from a different platform. The main users and personas here are business users, not necessarily data professionals but consumers of your reports, plus Power BI developers and BI or data analysts, all of whom will be using Power BI.
Now, the final experience I want to touch on is Data Activator. Data Activator is still in preview and it's a very new experience, but it's a no-code experience in Fabric for automatically taking actions, for example running a Power Automate flow, when patterns or conditions are detected in changing data, and that could be data in a Power BI report or in an eventstream, as we've just seen in the Real-Time Analytics experience. In Data Activator you create what's called a reflex. A reflex lets you define a specific data point to track, maybe the temperature of a fridge in an IoT system, and a condition, so that whenever the temperature of your fridge rises above 4°C (or whatever that is in Fahrenheit) it triggers an action. The action can take a variety of forms: a Power Automate flow, an email notification, an HTTP request to some other Azure Function, and so on. But basically, it monitors your data in real time, and when the data meets some condition, it performs an action.

Data Activator is, I'd say, quite unique in the data analytics market; I don't think many companies are doing a similar thing, at least at this scale. You might have used tools like Power Automate or Azure Functions, which have a similar if-this-then-that paradigm, but in Azure Functions you usually have to create that logic yourself, and Power Automate isn't really built for enterprise-scale data processing; those are the closest comparisons, although Data Activator is unique in its own way. And the people who'll be using this tool are, again, business users, because this is an entirely no-code experience, built so people in the business can set these things up without you having to do it for them. So if someone in marketing really wants to track social media impressions and be notified when they rise above a certain level, say, when one of your posts is going viral, they don't want to sit reading the Power BI report day after day, hour after hour; they want to set a threshold of, say, a thousand impressions, be alerted the moment it's crossed, and then do something about it. That's why Data Activator is so powerful: people in the business can set these alerts up for themselves in a no-code way.
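Data Activator itself is entirely no-code, so to be clear, the following is only a conceptual Python sketch of the reflex pattern it implements (watch a value, test a condition, fire an action), not Data Activator's actual API.

```python
# Conceptual sketch of a "reflex": monitor a data point, test a
# condition, trigger an action. None of this is Data Activator code;
# in the product all of this is configured visually.
import random
import time

THRESHOLD_CELSIUS = 4.0  # the condition: fridge temperature above 4 degrees C

def read_fridge_temperature() -> float:
    """Stand-in for a real IoT event stream."""
    return 3.0 + random.random() * 2.0

def send_alert(temp: float) -> None:
    """Stand-in for the action: email, Teams message, Power Automate flow..."""
    print(f"ALERT: fridge at {temp:.1f} degrees C, above {THRESHOLD_CELSIUS}")

for _ in range(10):  # in reality this monitoring runs continuously
    temp = read_fridge_temperature()
    if temp > THRESHOLD_CELSIUS:
        send_alert(temp)
    time.sleep(1)
```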
So, we've covered a lot of ground, and there's probably some information overload going on inside your head, so let's just pause here for today. Before we finish, I want to show you how you can get started using Fabric, and point you to lots of links and resources that I find really helpful for learning Fabric and that I think will help you on your journey as well.
OK, so here I am inside the Learn Microsoft Fabric community. If you haven't been here before, I recommend you join; we've got 15 members currently, but that's likely to grow a lot, as I haven't even advertised it yet, so follow the link in the description. I answer any questions you've got, you can ask me or other people in the community, and generally we have lots of discussions about Fabric, so if you want to learn Fabric, I recommend you go here.

Inside, I've got classroom content called "Introduction to Microsoft Fabric", and within this course you'll find all the notes on what I've just been describing. At the bottom there's a "getting started" link, where I describe two methods for getting started with Microsoft Fabric. The first is to follow through to the Microsoft documentation to set up a free trial; this is good if you have admin privileges within your existing company or Microsoft tenant. If you're just someone at home who wants to set up their own Fabric instance, especially with your own admin privileges, there's a way to do this via the M365 developer account. I've linked to a blog post on datawitches.com, and it's a really good post because it shows you how to set up an M365 developer account. It's a little bit long-winded, but it basically gets you a Fabric free trial with admin privileges, so you can use it to do whatever you want for 60 days, and the post tells you exactly how. I recommend following those instructions to get your Fabric sandbox environment set up.

On top of that, I've also got links to more resources. Some of these are provided by Microsoft; in their introduction-to-Fabric presentation they list these resources, so at the top are all the Microsoft ones: links to the documentation, the ebook, Microsoft Learn Fabric modules, and some end-to-end scenarios.
There's also Fabric Notes: if you're more of a visual learner, these are like revision cards for Microsoft Fabric that you might find useful.

Beyond that, I've listed some personal recommendations of resources I think are good to learn from. One of the best, which I use quite regularly, is the Azure Synapse Analytics YouTube channel. They've rebranded, let's say, and are producing a lot of content on Microsoft Fabric; in particular there's a series called Fabric Espresso, and what's good about it is that they interview people on the Fabric product teams about how their particular product or a new feature works, so I definitely recommend going there. Next up is the Advancing Analytics YouTube channel, which is really good, particularly if you're interested in learning more about lakehouses and lakehouse principles; Simon there has a lot of experience in building and designing lakehouses, and with the Spark engine in particular, so if you're looking to learn that kind of thing, I'd recommend Advancing Analytics. We've also got KratosBI, run by Chris Wagner, which is an awesome place to learn more about Fabric; every Friday they have a live stream (in the mornings for me; around lunchtime if you're in the US), and it's just a really good resource for learning about Fabric. Next we have Tales from the Field, which is a regular discussion and round-up of what's happening in the Fabric data community, so if you don't have much time to get your head around all the news, events, and ways people are working with Fabric, I recommend Tales from the Field; it's a great bunch of guys, all working for Microsoft and with clients on figuring this stuff out, so there are some good insights in that one as well. I'd also recommend fabric.guru, a blog run by Sandeep Pawar, which is a really good resource for helping you understand a wide range of topics; Sandeep especially loves digging into the more technical stuff, figuring out how these systems can integrate, and working out cool new workflows that are available to us in Fabric.
He documents them really well and in a lot of detail, so I definitely recommend checking that one out. Another one is Data Mozart, focusing mainly on Power BI but now producing a lot more content on Fabric, particularly recently around the DP-600, one of the certification exams; it's currently the only Fabric certification exam, and it allows you to become a certified Fabric Analytics Engineer, so if you're interested in that, I definitely recommend checking out Data Mozart. And the final one is a blog by Sam Debruyn, basically a collection of articles about Fabric; I think he previously focused on Azure and the Microsoft data stack in general, but some of the best blog posts about Fabric that I've read have come from Sam, so if you like reading blog posts about Fabric, I definitely recommend checking this one out as well.

If you've made it this far, I want to say thank you very much for watching, and I'd love it if you left a fire emoji in the comments just to let me know you made it all the way to the end. If you have any questions, I recommend you join our Learn Microsoft Fabric community and ask away; the link is in the description, or down here.