SQL Server 2017: Demos, architecture and everything else you need to know

Coming up we take a look at SQL Server 2017 which brings the mission-critical performance security
and availability of SQL Server to Linux and containers. We’re going to show
you how the SQL Server that you know now runs on Linux and the new platform
abstraction layer and how well that performs. We’ll also show you Active
Directory authentication on Linux and the high availability options as well as
failover between Windows and Linux platforms. And stay tuned too, as we’ll
show you support for graph analysis to identify new relationships in your
data and more. Microsoft Mechanics I’m joined by Travis Wright from the
SQL engineering team, welcome. Yeah, thanks for having me back. So back on the Mechanics last November you launched SQL Server 2017 in preview. Now that the product is complete, what’s new? The main focus for SQL Server of 2017
is to bring SQL Server to the Linux platform and containers. At the same time we continue our cloud first development model. So all the new features we’ve been
adding to Azure SQL database recently, those will now show up in SQL Server. Things like SQL Graph. And since the preview in November we’ve
been adding a bunch of enterprise capabilities to SQL Server on Linux. That includes high availability, security features including Active Directory
authentication, and we’ve achieved performance parity between SQL Server on Windows and Linux. So, I bet if you’re watching this you’re probably
wondering did you have to actually rewrite SQL Server in order to bring
it to the Linux platform? Yeah fortunately no,
we’ve been actually trying to work on optimizing SQL Server and more
directly talk to the hardware for a while now anyway. So if you think back to prior to 2005, SQL Server like other apps leverage
Windows for hardware resource management. Then in SQL Server 2005 because we
wanted to have better control over the hardware resources to improve performance and scalability. we brought the hardware resource management into
SQL Server user mode process so that we could better control that. And that’s what we call the SQL OS layer. And that’s responsible for
processor scheduling, memory management, Network and disk i/o. And at that point SQL Server
really no longer had a dependency on Windows for Hardware resource management. A few years ago Microsoft research identified about 50 application binary
interfaces or ABI’s that all of the Windows API surface area sits on top of. And it just narrows down to those 50 ABI’s. So to bring SQL Server to linux, we took a small number of Windows Binaries that SQL Server actually interacts with. And we then brought that together with
the SQL OS layer and packaged that together and brought it to Linux. And that what we call the SQL platform
abstraction layer. And then as part of the SQL platform abstraction layer, we intercept those ABI’s that are in Windows and we remap them onto the ABI’s of the host operating system we are running on. And that’s what we call a host extension. And today we have one for Linux and
we have one for Windows. And the result is that regardless of which
operating system SQL Server is actually running on, it thinks that it’s running on top of Windows. Right, another benefit here is that you
didn’t have to rewrite SQL and there’s not really any emulation running . Yeah exactly it’s the same code that we’ve
always had and same high quality. And the other benefit is that as
we introduce new features and as we fix bugs, that’ll show up on Windows
and Linux at the same time. An example of that is SQL graph
which we’ll see later. That was actually developed by developers running SQL Server on Windows but it just works on SQL Server on Linux. And then we optimize the
code pass through the SQL PAL so that we get the same great performance we’ve
always had with SQL server whether it’s running on Windows or Linux. So can we see then how the two
platforms compare in performance? Yeah definitely, let’s take a look. So here I’ve got a couple of virtual
machines running Azure. One is running Red Hat Enterprise Linux and
the other one is Windows. They’re ds-11 v2 VMs with two cores on
14 gigs of memory. Over here in my terminal
window I have a bench marking application that we use to put some load on the
server and measure the performance. So let’s kick this off in both of these windows here. The one on the left is targeted this
VM running Windows. And the other is the VM running RHEL. So you can see that the connections
start spinning up here. And once all the connections are
created then it’ll start measuring some transaction performance of both doing some inserts as well as some selects. And here you can see the milliseconds
of how long it takes to do a particular transaction. And the rate of transactions per second. You can see that across
SQL Server on Windows and Linux here the performance is
approximately the same. Right, so let’s just give it a couple more seconds here And now we can see that the result here is 8,914
transactions within that time frame. And over here with SQL server on RHEL it’s 9,229. So within a couple of percentage points. And you can run this test all day long. It’s kind of fun actually to do
a comparison between the two. And sometimes the SQL Server on Windows
will win or sometimes SQL Server on Linux will win. But the performance is
always within a couple percentage points. So pretty fair race. Then this is a basic performance demo. Any other benchmarks that we have? Yeah of course we want to do more sophisticated and and official benchmarking testing. And so to do that
we run TPC benchmarks. That’s sort of the industry standard. So in fact we actually just posted recently a new world record for the one
terabyte data warehousing benchmark with SQL Server running on Red Hat
Enterprise Linux. And as you can see here we actually beat the
previous world record holder which was SQL Server 2016 on Windows . And so what you can kind of infer from this is that SQL server 2017 is actually
faster than SQL server 2016. And that’s through some improvements that
we’ve made in the query processing engine. So not a platform to platform thing, more of a year of progress on SQL Server? Right, so the demo shows you how SQL Server on Windows is comparable to SQL Server on Linux in terms of performance. And then this benchmark shows you how SQL Server
2017 is actually faster than SQL Server 2016. So the SQL platform distraction layer
really gets you a long way in terms of where you need to be. But I know there are some things that
we like to take for granted or sometimes take for granted windows such as Active Directory based authentication, or clustering services. And those aren’t the same in Linux, so how do you solve for those? Yeah for Active Directory authentication, once your SQL Server
is on a Linux host that is domain joined you can connect to it just like you can
on any other SQL Server. So let’s take a look at that. Here I am on a Windows PC and
I’ve got SQL Server management studio open. And I will just connect to a
database engine here. And here we’ll just connect to RHEL1 which is a RHEL based instance of SQL Server. And you can see the here that I’m connecting with my
Active Directory authentication credentials. Just hit connect and I
connect to it just like I would any other SQL Server that’s on SQL Server on Windows. How about clustering? For clustering we leverage pacemaker which is a cluster manager in Linux world. And that’s basically the same as Windows
Server cluster services. So let’s take a look at what that looks like here in
SQL Server management studio. We have support for always-on availability
groups as well as normal failover clustering with shared disk. So here we’ll open up the always-on
availability group dashboard. And we can see that we have an always-on
availability group for this server and there’s three RHEL based
nodes in this always-on availability group. And then over here I have a simple
application that just connects to the cluster name for that cluster. And just outputs the name of the cluster node that it’s currently connected to. So now to simulate SQL Server going
down for some reason. Let’s just go over here and
stop the SQL server service. And we’ll give it a few seconds here. That will eventually kill off the
connections that are coming in here. And now we can see
that it’s trying to reconnect. Okay, so now we can see that the connections are
starting to come back online. You can see these are now connected to Rhel2 which was now taken over as the primary node within that availability group. So out of personal curiosity, this probably doesn’t work, but can I have an always-on availability group between Windows and Linux host? Yeah actually you can, we’ve
had that question quite a few times. We don’t recommend it for an HA scenario
because there’s no cluster manager that spans Windows and Linux yet. But it’s great for a couple different scenarios. One is where you have a need for like an
OS level redundancy. So imagine a scenario where you have a always-on
availability group that spans Windows and Linux. And there are some zero day
vulnerability you have to patch on Windows And so you would maybe failover
your entire work load to your Linux side and patch your Windows and then bring it back. Another scenario is migration. Where you have a workload running on SQL
Server on Windows today in an always-on availability group. You add a few Linux nodes to that cluster, replicate the databases over. All this happens online right? All your applications continue to work. And then when you’re ready you just failover. And that’s how you can migrate with basically just a few seconds of downtime. And that would work if I’m going to Linux
or back to Windows? Yeah, either way. We just seen a lot of great capabilities coming to Linux, but are there any new capabilities that are coming to both Windows and Linux? Yeah, there’s really two main things. The first is graph support which we’ve added. And then also support for Python for our machine learning services. Today people need to typically have a dedicated database for running a graph database. Something like Neo4j. What we’ve done with SQL Graph is we’ve actually
brought graph database capabilities into SQL Server in SQL Server 2017. And that helps you kind of get those inferences out of the many relationships that you have in a graph type of data model. So I’m going to jump back to my Mac here and show you what that looks like. So here I’m going to fire up a container running on my macbook. And here I’ve got Visual Studio code open and I’m going to connect to it using
the new SQL Server extension. We can see that we’re now connected to the master database in my container. So let’s just create a quick database here that we’re going to store some graph data in. To create a node table in SQL Server it’s just like creating any other table
in SQL Server except that we have this as node syntax here. So I’ve created some tables here to store some information about my Factory. Now I want to create some edge tables. And you can see the creating edge tables
is also the same as creating any other type of table in SQL Server except we have this as edge syntax here. So now let’s insert some data into these tables. And you’ll notice just by looking at these queries it looks just like inserting data
into any other SQL Server table. Except that with these edge tables we’re
inserting the node ID for each end of the edge into the table. We’re just using T-SQL, here? Yeah, just T-SQL. The syntax is essentially the same except for a few new keywords that we’ve added to the language. Now that we’ve got some data in there we can actually start to write some queries which go and look at this graph data. And you can see here that the graph query
syntax makes it much easier to write a query as opposed to having to write a
big long set of join statements. And you don’t have to worry about what the
difference is between an inner and an outer join. The syntax here is really just much easier to use. So one of the benefits of this is that
you get your graph data right next to your traditional relational data. And you can join between the two different data sets as opposed to you know having your graph
data off in some separate database and trying to figure out how to bring that
together with your relational DB. The other advantage is that you get all the
power of SQL Server with your graph data. So if you want to have some of the
advanced features for security that SQL Server has you can do that. Or, as we’ll see here I can create a
clustered columnstore index on the machine table here. And that improves the performance of
your queries by 30-100 times by using our new column store feature. That is an in-memory technology that also compresses the data on your disk to save you a lot of space. That’s a huge performance boost. But you also mentioned Python support
is also available now on SQL Server 2017 for building intelligent data models. Right, so in SQL Server 2016 we
introduced support for R. And now in SQL Server 2017 we’ve added Python to our machine learning services. And those are the two main languages the data scientists use to analyze data. And the interesting thing that we’ve done here is we’ve actually put R and Python in database so you can actually
execute these R and Python scripts where the data is at instead of having to pull the data out
and process it on a different server. So in SQL Server 2017 we also added the ability to offload these scripts to
GPUs for even faster performance. And that’s available with SQL Server on
Windows today and we’ll be adding it to SQL Server
on Linux in the future. A really great, fast tour of SQL Server 2017 including the new Linux capabilities. But, if you’re watching at home where
can folks go to learn more? There’s so much more we didn’t even
have a chance to talk about today like running SQL Server on Windows
or Linux VMs and Azure. So go check out more at the link below. Thanks Travis, and of course keep
watching Microsoft Mechanics for the latest tech updates across Microsoft. We’ll see you next time. Microsoft Mechanics www.microsoft.com/mechanics


Add a Comment

Your email address will not be published. Required fields are marked *