20 Jun 2024

Azure SQL Database vs Azure SQL Data Warehouse

 Case

As we slowly move from on-premises data warehouses with Microsoft SQL Server to cloud data warehouses in Microsoft Azure, we need to know more about the various options in Azure. You probably already used an Azure SQL Database, but Microsoft also introduced Azure SQL Data Warehouse. What are the differences between these two databases?
Azure SQL DB vs Azure SQL DW














Solution
Back in 2013, Microsoft introduced Azure SQL Database which has its origin in the on-premises Microsoft SQL Server. In 2015 (however public availability was in July 2016) Microsoft added SQL Data Warehouse to the Azure cloud portfolio which has its origin in the on-premises Microsoft Analytics Platform System (APS). This was a Parallel Data Warehouse (PDW) combined with Massively Parallel Processing (MPP) technology and included standard hardware. It is the 'big brother' of SQL Server, but with a slightly different purpose.

In this post we will briefly describe the differences between these two Microsoft Azure Services, but first Microsofts own definitions:
  • Azure SQL Database is a relational database-as-a service using the Microsoft SQL Server Engine (more);
  • Azure SQL Data Warehouse is a massively parallel processing (MPP) cloud-based, scale-out, relational database capable of processing massive volumes of data (more);

Differences
1) Purpose: OLAP vs OLTP
Although both Azure SQL DB and Azure SQL DW are cloud based systems for hosting data, their purpose is different. The biggest difference is that SQL DB is specifically for Online Transaction Processing (OLTP). This means operational data with a lot of short transactions like INSERT, UPDATE and DELETE by multiple people and/or processes. The data is most often highly normalized stored in many tables.

On the other hand SQL DW is specifically for Online Analytical Processing (OLAP) for data warehouses. This means consolidation data with a lower volume, but more complex queries. The data is most often stored de-normalized with fewer tables using a star or snowflake schema.

2) Achitecture
In order to make the differences more clear a quick preview of the architecture of Azure SQL Data Warehouse, where you see a whole collection of Azure SQL Databases and separated storage. The maximum number of compute notes at the moment is 60.
    Architecture SQL DW: Decouples (Blob) storage from compute (SQL DB)





















    3) Storage size
    The current size limit of an Azure SQL Database is 4TB, but it has been getting bigger over the past few years and will probably end up around 10TB in the near future. On the other hand we have the Azure SQL Data Warehouse which has no storage limit at all (only the limit of your wallet), because the storage is separated from the compute.

    3) Pricing
    The pricing is also quite different. Where Azure SQL DB starts with €4,20 a month, Azure SQL DW starts around €900,- a month excluding the cost of storage which is included in SQL DB. The storage costs for Azure SQL DW are around €125,- per TB per month. And the maximum costs of a single SQL DB is around €13500,- where SQL DW ends around a massive €57000,- (excl. storage). But when you take a look at the architecture above, it should be no surprise that SQL DW is more expensive than SQL DB, because it consists of multiple SQL DBs.

    However, SQL DW has one big trick up its sleeve that SQL DB hasn't: you can pause it completely and then you only pay for storage. If you start your SQL DW with your ETL job and pause it right after you processed your Azure Analysis Services then you only need it a small percentage of the month.

    Note: prices are from July 2017

    4) DTU vs DWU
    SQL DB has 15 different pricing tiers which specify the number of Database Transaction Units  (DTU) and the storage size/type:
    - Basic
    - Standard (S0, S1, S2, S3)
    - Premium (P1, P2, P3, P4, P6, P11, P15)
    - Premium RS (PRS1, PRS2, PRS4, PRS6)
    Basic has only 5 DTUs and the highest number of DTUs is, at the time of writing, 4000.
    The term DTU is a bit vague. It is a mysterious combination of RAM, CPU and read-write rates, but basically if you want to double the performance of your current database you just need to double the number of DTU's for your database.

    SQL DW has 12 different pricing tiers and uses Data Warehouse Units (DWU) to specify the performance level.
    - DWU100, 200, 300, 400, 500, 600, 1000, 1200, 1500, 2000, 3000, 6000
    The term DWU is a little less vague, because if you divide that number by 100 you have the number of compute nodes available for that pricing tier. On the other hand the exact combination of CPU, memory and IOPS per compute note is unknown.

    Because both services have a different purpose it is a bit strange to compare the hardware, but according to this MSDN blog post 1 DWU is approximately 7,5 DTU.

    But there is also some similarity: for both services you can use the same script to change the pricing tier on the fly to either give the performance a real boost when needed or the save money in the quiet hours.

    5) Concurrent Connection
    Although SQL DW is a collection of SQL Databases the maximum number of concurrent connections is much lower than with SQL DB. SQL DW has a maximum of 1024 active connections where SQL DB can handle 6400 concurrent logins and 30000 concurrent sessions. This means that in the exceptional case where you have over a thousand active users for your dashboard you probably should consider SQL DB to host the data instead of SQL DW.
    For more details see the SQL DB Recource Limitations and SQL DW Recource Limitations.

    6) Concurrent Queries
    Besides the maximum connections, the number of concurrent queries is also much lower. SQL DW can execute a maximum of 32 queries at one moment where SQL DB can have 6400 concurrent workers (requests). This is where you see the differences between OLTP and OLAP.
    For more details see the SQL DB Recource Limitations and SQL DW Recource Limitations.

    7) PolyBase
    Azure SQL Data Warehouse supports PolyBase. This technology allows you to access data outside the database with regular Transact SQL. It can for example use a file in an Azure Blob Storage container as a (external) table. Other options are importing and exporting data from Hadoop or Azure Data Lake Store. Although SQL Server 2016 also supports PolyBase, Azure SQL Database does not (yet?) support it.

    8) Query language differences
    Although SQL DW uses SQL DB in the background there are a few minor differences when quering or creating tables:
    - SQL DW cannot use cross databases queries. So all your data should be in the same database.
    - SQL DW can use IDENTITY, but only for INT or BIGINT. Moreover the IDENTITY column cannot be used as part of the distribution key.
    - Also see this SQL DW list of unsupported table features.

    9) Replication
    SQL DB supports active geo-replication. This enables you to configure up to four readable secondary databases in the same or different location. SQL DW does not support active geo-replication, only Azure Storage replication. However this is not a live, readable, synchronized copy of your database! It's more like a backup.

    10) In Memory OLTP tables
    SQL DB supports in-memory OLTP. SQL DW is OLAP and does not support it.

    11) Always encrypted
    SQL DB supports Always Encrypted to protect sensitive data. SQL DW does not support it.

    Conclusion
    Although Azure SQL DB looks much cheaper on a monthly basis, this doesn't mean you should always choose SQL DB by default. One big advantage is that you can pause SQL DW. For example a stage database is only used during the ETL process. Why should you always have this database up an running? Or why should your datamart stay online 24*7 if your end users only use Analysis Services to browse the data.

    On the other hand the original purpose of both services is different (OLTP vs OLAP), but this doesn't mean you should always use Azure SQL DW for your data warehouses. Depending on several factors like data size, complexity, required up-time and budget, Azure SQL DB could also host your data warehouse. You could even mix them in your project. For example Stage and Historical/Persistent stage in Azure SQL DW and your Datamart in Azure SQL DB.



    Please leave a comment if you know more significant differences that are worth mentioning.

    How to Make Link Server in SQL Server

     Creating a linked server in SQL Server allows you to connect to and query external data sources directly from your SQL Server instance. This can include other SQL Server instances, databases like Oracle, MySQL, or even non-relational data sources. Below is a guide on how to create a linked server in SQL Server.

    Step-by-Step Guide

    1. Open SQL Server Management Studio (SSMS):

      • Launch SSMS and connect to your SQL Server instance.
    2. Open the Object Explorer:

      • In the Object Explorer, expand the node for the SQL Server instance where you want to create the linked server.
    3. Navigate to Linked Servers:

      • Expand the Server Objects node.
      • Right-click on Linked Servers and select New Linked Server....
    4. Configure the Linked Server:

      • In the New Linked Server dialog, you need to provide the following details:

      • General Page:

        • Linked server: Provide a name for the linked server.
        • Server type: Choose the appropriate server type.
          • If connecting to another SQL Server, select SQL Server.
          • For other data sources, select Other data source.
        • Provider: Select the appropriate OLE DB provider for the data source.
        • Product name: This can be a descriptive name of the data source (optional for SQL Server).
        • Data source: The name or network address of the server to which you are linking.
        • Provider string: Connection string information (if needed).
        • Location: For some providers, you may need to specify the location.
        • Catalog: The default database/catalog to use.
      • Security Page:

        • Configure the security settings to control how the local server will connect to the linked server.
        • You can choose to:
          • Not be made: Local logins cannot access the linked server.
          • Be made without using a security context: Connect using a self-mapping mechanism.
          • Be made using the login's current security context: Use the current login's security context.
          • Be made using this security context: Specify a remote login and password to use for the connection.
      • Server Options Page:

        • Configure additional server options if needed (e.g., collation compatibility, data access settings, etc.).
    5. Create the Linked Server:

      • Click OK to create the linked server. If everything is configured correctly, the linked server should appear under the Linked Servers node in Object Explorer.

    Example Using T-SQL

    Alternatively, you can create a linked server using T-SQL. Below is an example for linking to another SQL Server instance:

    sql
    EXEC sp_addlinkedserver @server=N'MyLinkedServer', @srvproduct=N'SQL Server'; EXEC sp_addlinkedsrvlogin @rmtsrvname = N'MyLinkedServer', @useself = N'False', @locallogin = NULL, @rmtuser = 'RemoteUser', @rmtpassword = 'RemotePassword';
    • Replace 'MyLinkedServer' with the name you want for your linked server.
    • Replace 'RemoteUser' and 'RemotePassword' with the appropriate credentials for the remote server.

    For non-SQL Server data sources, the T-SQL might look like this:

    sql

    EXEC sp_addlinkedserver @server = N'MyOracleServer', @provider = N'OraOLEDB.Oracle', @datasrc = N'OracleServerName'; EXEC sp_addlinkedsrvlogin @rmtsrvname = N'MyOracleServer', @useself = N'False', @locallogin = NULL, @rmtuser = 'oracle_user', @rmtpassword = 'oracle_password';
    • Replace 'MyOracleServer' with the name for your linked server.
    • Replace 'OraOLEDB.Oracle' with the appropriate OLE DB provider for Oracle.
    • Replace 'OracleServerName' with the network name of your Oracle server.
    • Replace 'oracle_user' and 'oracle_password' with the appropriate Oracle credentials.

    Testing the Linked Server

    To test the linked server, you can run a simple

    SELECT * FROM MyLinkedServer.DatabaseName.SchemaName.TableName;

    Replace MyLinkedServer, DatabaseName, SchemaName, and TableName with the appropriate names for your setup.

    Creating a linked server in SQL Server provides a powerful way to access and query data from diverse sources, enabling greater flexibility and integration in your data management workflows.

    Union Budget 2024-25 – All You Need to Know

      Union Budget 2024-25 – All You Need to Know The Union Budget is a yearly financial plan presented by the Finance Minister of India for the...