In this episode, Brick and Caleb orient us to Microsoft's new data platform, Fabric, which was launched into public preview earlier this year. They discuss its key components, including its relationship with Data Engineering, Data Factory, Data Science, and Power BI. Additionally, they explore the future implications of integrating AI into data management.
Blue Margin increases enterprise value for PE-backed, mid-market companies by building and managing their data platforms. Our strategy, proven with over 250 companies to-date, expands multiples through data transformation, as presented in our book, The Dashboard Effect.
Brick Thompson: Welcome to The Dashboard Effect. I'm Brick
00:00:06
Thompson.
00:00:06
Caleb Ochs: I'm Caleb Ochs.
00:00:08
Brick Thompson: Caleb, two episodes ago, we talked about
00:00:10
answering the question, what is Power BI? And I thought we might
00:00:14
do the same thing today answering the question about
00:00:17
what is Fabric, Fabric being the new platform that Microsoft
00:00:21
launched back in May, into public beta, basically, the new
00:00:26
data platform out on Azure.
00:00:26
Caleb Ochs: Yeah, yeah, it'll be good. I mean, there are a lot of
00:00:30
components to Power BI, and kind of similarly, Power BI is just
00:00:34
one of the components of Fabric.
00:00:37
So yeah, there's a lot to cover.
00:00:37
But I think we can cover some of the high level to give people a
00:00:41
broad understanding of what they can expect there. And I
00:00:44
know you've been doing some training classes, I'm excited to
00:00:47
hear what you've been learning.
00:00:52
Brick Thompson: Let's see what happened. Well, I've been doing the Fabric
00:00:55
for Dummies courses out on Microsoft, it's good stuff. So
00:00:59
at a very high level, Fabric is the branding and packaging
00:01:04
around Microsoft's data platform.
00:01:04
Previous to that, I think, probably Synapse would have been
00:01:08
what it was. And so it's becoming Fabric, and really
00:01:13
includes all of the pieces that you need to have a
00:01:17
data environment, a BI environment, things like data
00:01:22
warehouse, data lakehouse, Power BI data models, all of
00:01:25
that stuff, including machine learning, and so on. So within
00:01:30
Fabric, they've done a really good job of integrating all of
00:01:33
these pieces. And it's pretty fascinating. So maybe
00:01:37
we'll start with data engineering. So what is data
00:01:42
engineering in Fabric?
00:01:45
Caleb Ochs: So I think the way they do it, you might have to
00:01:48
correct me here, since it's a little more fresh for you. So
00:01:52
the data engineering piece, it's going to be your
00:01:57
data warehouse.
00:01:58
Brick Thompson: Right, right.
00:01:58
Yeah, it's the data warehouse or data lakehouse.
00:02:03
Caleb Ochs: Gotcha. Yeah. So it's going to be where you're
00:02:05
going to do like, you're gonna pull data, you're gonna land it
00:02:08
somewhere, you might do some cleansing, you might do some
00:02:11
transformations of it. But you're not necessarily doing
00:02:15
like machine learning at this point.
00:02:17
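Editor's note: the pull, land, cleanse, and transform steps Caleb describes can be sketched roughly as below. This is a minimal illustration in plain Python, not Fabric code. The sample data, column names, and the function names `land`, `cleanse`, and `transform` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical raw extract from a source system (made-up data).
RAW_CSV = """customer_id,region,amount
1, west ,100.50
2,EAST,
3,West,42
"""

def land(raw_text):
    """Land the raw rows unchanged, as a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def cleanse(rows):
    """Trim whitespace, normalize region casing, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows that are missing the amount
        cleaned.append({
            "customer_id": int(row["customer_id"]),
            "region": row["region"].strip().lower(),
            "amount": float(amount),
        })
    return cleaned

def transform(rows):
    """Aggregate the total amount per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

landed = land(RAW_CSV)
totals = transform(cleanse(landed))
```

Keeping the landed rows untouched and cleansing in a separate step means the raw copy stays available if the cleansing rules change later.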
Brick Thompson: Right. Yeah. So the engineering is really
00:02:20
about setting up a place where your data is going to live. So
00:02:23
it's something called OneLake.
00:02:23
In fact, I don't want to get too in the weeds here. But Microsoft
00:02:26
has moved away from a true traditional SQL Server. Even if
00:02:30
you're working in SQL, you're going to have all of your data
00:02:34
stored in a data lake, a data lakehouse. So it's really creating a
00:02:37
data lakehouse, and then all of the tables and the data models
00:02:42
and various things that would exist within that. So another
00:02:46
section in Fabric, or another part of it, is called Data
00:02:53
Factory. So it has been around for quite a long time on
00:02:56
Azure. What is Data Factory?
00:03:00
And how are you going to use that in Fabric?
00:03:02
Caleb Ochs: Sure. So Data Factory is used to move data,
00:03:05
you can do a lot of stuff. But typically, you connect to
00:03:08
your source, and then you dump the data somewhere; in this
00:03:12
case, it will be in OneLake.
00:03:12
So Data Factory, the way that we're seeing it, is that you'll
00:03:19
use some of the data engineering pieces, like
00:03:23
the Spark notebooks, and things like that. And then you'll use
00:03:27
Data Factory as kind of your job orchestrator and scheduling, and
00:03:30
that type of thing. And like making sure things don't fail
00:03:34
and all that fun stuff. That's where Data Factory will fit in.
00:03:38
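Editor's note: the orchestrator role Caleb describes (run steps in order and deal with failures) can be sketched as below. This is a toy in plain Python under stated assumptions, not the Data Factory API. `run_pipeline`, `copy_from_source`, and `run_notebook` are hypothetical names invented for the illustration.

```python
import time

def run_pipeline(steps, max_attempts=3, delay_seconds=0.0):
    """Run named steps in order, retrying each failed step up to max_attempts.

    Returns a log of (step_name, attempt, status) tuples. Stops at the first
    step that exhausts its attempts, so later steps never run on bad input.
    """
    log = []
    for name, func in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                func()
                log.append((name, attempt, "ok"))
                break
            except Exception:
                log.append((name, attempt, "failed"))
                if attempt == max_attempts:
                    return log
                time.sleep(delay_seconds)
    return log

# Hypothetical steps: the copy step fails once, then succeeds.
calls = {"copy": 0}

def copy_from_source():
    calls["copy"] += 1
    if calls["copy"] == 1:
        raise ConnectionError("source temporarily unavailable")

def run_notebook():
    pass  # stand-in for a Spark notebook that does the transformations

log = run_pipeline([
    ("copy_from_source", copy_from_source),
    ("run_notebook", run_notebook),
])
```

Here the copy step is retried once and the notebook step only runs after the copy succeeds, which is the dependency ordering an orchestrator is there to enforce.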
Brick Thompson: Got it. All right. Another piece is data
00:03:41
science, I think of this primarily as machine learning
00:03:44
stuff. So building tools that can help with predictive
00:03:48
analysis and so on. Predicting which customers might churn,
00:03:52
that type of thing is really well integrated into the
00:03:55
ecosystem. And in fact, it's gotten so easy to do that stuff.
00:03:59
It's amazing. Easy, but you still gotta sort of know what you're
00:04:02
doing. But you know, five years ago, six years ago, that machine
00:04:05
learning stuff, you really needed a data science degree to
00:04:09
even approach it. And now regular people can actually
00:04:12
start using it.
00:04:13
Caleb Ochs: Yeah, I mean, that's kind of the theme
00:04:15
of Fabric, right? Just make it easy. And machine learning
00:04:19
is one of those pieces.
00:04:19
And I think what's exciting about it for down the road is
00:04:22
that this AI, you know, AI Studio is another really cool thing
00:04:28
that Microsoft's doing, I could see that being integrated right
00:04:31
into Fabric. And so you're just, you're answering questions about
00:04:34
your data, as you're processing it. That would be pretty sweet.
00:04:39
Brick Thompson: Yeah, I would imagine we're gonna see some
00:04:42
really cool stuff over the next few months. And then obviously,
00:04:45
another huge piece of this is Power BI. So Power BI is now
00:04:49
included as part of Fabric. It's part of the Fabric ecosystem, and it
00:04:53
makes it really easy to create what used to be
00:04:58
Analysis Services cubes.
00:04:58
Now you're creating data sources within Fabric, using Power BI to
00:05:02
help you do the modeling of the tables and creation of DAX,
00:05:08
and all of that stuff. Yeah,
00:05:12
Caleb Ochs: I mean, Direct Lake is what they're
00:05:15
calling it, from Power BI directly to your OneLake.
00:05:19
You go through some of the things like the data warehouse,
00:05:23
what they're calling the warehouse and the lakehouse.
00:05:26
But you can model it right inside of the web UI inside
00:05:30
of Fabric. You can even write DAX in there. And then the
00:05:34
coolest part is that it doesn't actually pull the data into
00:05:37
Power BI. It just stays there, right, in your warehouse or
00:05:41
lakehouse. And Power BI just transacts against it. And it's
00:05:45
lightning fast.
00:05:46
Brick Thompson: It's amazing how fast. I don't know quite how
00:05:48
they're getting that performance. It's amazing. It's
00:05:50
awesome. It really is. Alright, so you already mentioned
00:05:53
another piece, data warehouse. I have a little bit of
00:05:56
confusion around what they're calling data warehouse now. In
00:05:58
the old days, that would have been sort of a Kimball model, SQL
00:06:01
Server-based something. How are you seeing that in Fabric?
00:06:04
Caleb Ochs: Yeah, so the way that they've laid it out is
00:06:08
warehouse is going to be your SQL engine. So this is really
00:06:12
the only difference. So warehouse, you get to write SQL
00:06:15
against it. And you can do some transformations, or creating new
00:06:18
tables using SQL, T-SQL. The lakehouse is going to be Spark.
00:06:24
So that's going to use PySpark, or, you know, Python as you
00:06:28
write things there. And that's really the only
00:06:31
difference. Inside of Fabric, as you're dealing with
00:06:34
them, they look pretty much the same, right? Right. It's like,
00:06:37
oh, there's your tables, you can create your relationships, you
00:06:40
can do all the things that you can, like between the two of
00:06:42
them. The difference is the engine that's processing the
00:06:45
data.
00:06:45
Brick Thompson: And all of the data is sitting natively in
00:06:48
Delta Parquet files in OneLake, but it's just basically how you
00:06:52
interact with it. How you model and that
00:06:54
Caleb Ochs: type of thing. Yeah, yeah. And those Delta Parquet
00:06:56
files are really cool.
00:06:57
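Editor's note: the idea that the warehouse and the lakehouse differ mainly in the engine you use against the same stored data can be illustrated roughly as below. This sketch makes loud assumptions: Python's built-in `sqlite3` stands in for a SQL engine (it is not T-SQL), and a plain Python loop stands in for Spark-style code. The table name, rows, and function names are invented for the example.

```python
import sqlite3

# Made-up rows standing in for one table that is stored once.
ROWS = [("west", 100.0), ("east", 40.0), ("west", 60.0)]

def totals_with_sql(rows):
    """SQL-style access: sqlite3 is only a stand-in for a SQL engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    )
    result = dict(cursor.fetchall())
    conn.close()
    return result

def totals_with_code(rows):
    """Code-style access: a plain loop standing in for Spark-style logic."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals

sql_totals = totals_with_sql(ROWS)
code_totals = totals_with_code(ROWS)
```

Both paths produce the same totals from the same rows; only the way the question is asked changes, which is the point made in the conversation.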
Brick Thompson: Yeah, those are really cool. There's a thing, it
00:07:00
was called Time Machine, something like that. Where, yeah,
00:07:04
when changes happen to the tables, the Delta Parquet
00:07:08
files, you can actually now go in and say I want to, I want to
00:07:11
see exactly the state of this based on some date and some time
00:07:15
in the past. And, you know, we've we've built data
00:07:18
warehouses for years that allowed you to do that. But it
00:07:21
took a lot of careful modeling and writing of DAX to be able to
00:07:25
do that. Well.
00:07:26
Caleb Ochs: Yeah, right. The typical scenario there is, I
00:07:29
need to keep data as of like month end or something.
00:07:33
I don't want to change that, even though in the source
00:07:36
transactional system, it might. You want to be able to say, no,
00:07:39
this is what we reported on January 31. That's going to be
00:07:42
what we want to report on forever now. Right?
00:07:44
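Editor's note: the "state of the table as of some date and time" behavior discussed here (Delta Lake generally calls it time travel) can be sketched as below. This is a toy model in plain Python, not the Delta Lake or Fabric API. `VersionedTable`, `commit`, and `as_of` are hypothetical names, and storing a full copy of the rows per version is an assumption made for brevity rather than a description of how Delta tables are stored.

```python
from datetime import datetime

class VersionedTable:
    """Toy append-only table history; each commit stores a full snapshot."""

    def __init__(self):
        self._versions = []  # list of (commit_time, rows) in commit order

    def commit(self, commit_time, rows):
        """Record a new version of the table at commit_time."""
        if self._versions and commit_time < self._versions[-1][0]:
            raise ValueError("commits must be in time order")
        self._versions.append((commit_time, list(rows)))

    def as_of(self, when):
        """Return the latest snapshot committed at or before `when`."""
        snapshot = None
        for commit_time, rows in self._versions:
            if commit_time <= when:
                snapshot = rows
            else:
                break
        if snapshot is None:
            raise LookupError("no version exists at or before that time")
        return list(snapshot)

table = VersionedTable()
table.commit(datetime(2023, 1, 31, 23, 0), [("invoice-1", 500)])
table.commit(datetime(2023, 2, 10, 9, 0), [("invoice-1", 450)])  # later correction

# What was reported at month end stays retrievable after the correction.
month_end = table.as_of(datetime(2023, 1, 31, 23, 59))
latest = table.as_of(datetime(2023, 3, 1))
```

The month-end query returns the original 500 even though the source was later corrected to 450, which matches the January 31 reporting scenario in the conversation.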
Brick Thompson: So, you know, if you're building a historical
00:07:47
report, you might say, alright, I want every month end, and now
00:07:50
it's very simple to go back and get that without having to have
00:07:54
set it up beforehand to write those out to a table somewhere,
00:07:57
right? And built in, there's so many great features coming
00:08:01
with this. And it's still in preview. Do you know, I can't
00:08:04
remember the date that they may be going GA on this?
00:08:06
Caleb Ochs: I don't think they've announced it yet. You
00:08:09
know, it's it's interesting. I was reading this post a couple
00:08:14
of weeks ago about someone saying like, you know, when is
00:08:16
this going to be generally available? And I'm having a hard
00:08:19
time getting people to buy into it, since it's public preview.
00:08:23
And, you know, someone replied, like, well, like, what
00:08:26
functionally, does it not do well for you? And they didn't
00:08:29
really have anything to say. So really where the
00:08:33
conversation went was, well, you know, just because there's a
00:08:36
label on it means you're not going to use it, even though
00:08:39
functionally it meets all your specs. But if it didn't have a
00:08:42
label, and functionally it met all your specs,
00:08:44
you'd be able to, right? Yeah, so it's kind of an interesting
00:08:48
thing. Really, the public preview right now
00:08:51
is just a label. There are some nuances kind of on the
00:08:53
edges. I think those are going to be there, even when it's
00:08:56
generally available, there's still going to be some bugs and
00:08:59
kinks getting worked out, right, just like there were with Power
00:09:02
BI. But public preview right now seems usable.
00:09:06
Brick Thompson: Yeah, I agree.
00:09:06
So there may be some of this that I got wrong, or that we got
00:09:09
wrong. But that's sort of a high level overview of the pieces of
00:09:13
Fabric as we see them. There are so many details and cool
00:09:17
features and things you can do and things that we'll all be
00:09:20
learning about how to optimize and sort of do as best practices
00:09:23
in this. I'm really excited about it. It's cool.
00:09:27
Caleb Ochs: Yeah, I think it's gonna be great. I mean, it just
00:09:29
fits in really well with where Microsoft's headed with
00:09:31
everything. And the AI stuff is gonna be there, and Fabric is
00:09:38
gonna enable a lot of it, right?
00:09:38
And it's gonna be pretty cool.
00:09:40
Brick Thompson: Yeah, yeah. As they said during the keynote
00:09:44
back in May, data is the fuel that powers AI. And clearly the
00:09:50
company Microsoft is really setting itself up to provide
00:09:52
that sort of conduit, that place to have the data and
00:09:56
manage the data so that you can enable your AI really well.
00:09:59
Caleb Ochs: Alright man, it's gonna be sweet.
00:10:01
Brick Thompson: Alright. Thanks, Caleb.
00:10:02
Caleb Ochs: Thank you.