PolyBase: Azure SQL DB cannot import blob storage - azure-sql-database

Is it true that Azure SQL Database cannot import from blob storage? (SQL DW can, and so can a stand-alone instance.)
According to this document, it cannot. But the document is from 2018. Have things changed since then?

Azure SQL Database does not have PolyBase, but it does have BULK INSERT, e.g.
BULK INSERT Product
FROM 'data/product.dat'
WITH ( DATA_SOURCE = 'MyAzureBlobStorageAccount');
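Note that the DATA_SOURCE named in BULK INSERT must be an external data source of TYPE = BLOB_STORAGE created beforehand; a minimal sketch, assuming access via a SAS token (account, container, and secret are placeholders):
-- Requires an existing database master key; the SAS token is a placeholder.
CREATE DATABASE SCOPED CREDENTIAL MyBlobCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token, without the leading ?>';

CREATE EXTERNAL DATA SOURCE MyAzureBlobStorageAccount
WITH
(
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<account>.blob.core.windows.net/<container>',
    CREDENTIAL = MyBlobCredential
);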
See this question for more details and an example:
Create a table in Azure SQL Database from Blob Storage
Main page:
https://learn.microsoft.com/en-us/archive/blogs/sqlserverstorageengine/loading-files-from-azure-blob-storage-into-azure-sql-database

It is true. PolyBase is not part of Azure SQL DB, and the document in your question is the latest.

Related

Can we import a SQL bacpac file into an existing DB

How do I import a SQL bacpac file into an existing DB? I can import it into a new DB, but I am not able to import it into an existing one.
As far as I know, Azure SQL Database doesn't support importing a BACPAC file into an existing database.
Whether for Azure SQL Database or SQL Server, the documentation always mentions a new database.
You can reference these documents:
Azure SQL Database: Import BACPAC into a new database.
SQL Server: Import a BACPAC File to Create a New User Database.
But there are several methods you can use to copy all the data from your source database to the existing Azure SQL database.
One of them is to export all your database tables and views to Azure Blob storage and then import those files into your existing Azure SQL database with SSMS. I did this successfully.
You can follow my steps.
Export Data to Blob Storage:
Import Data from Blob Storage:
Use Import Data; its operation is just the opposite of Export Data.
It has the advantage that I can import the data into my existing Azure SQL database whether or not it already contains data.
Hope this helps.
Another convenient way is to use the Import Data wizard, importing from the local server to the Azure server directly, without blob storage as an intermediary.
Just select the needed tables and set specific settings if needed (identity insert and so on).
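If you script the copy yourself instead of using the wizard, the "identity insert" setting corresponds to T-SQL's IDENTITY_INSERT; a minimal sketch with hypothetical table names:
-- Hypothetical tables; lets the copy supply explicit values for an IDENTITY column.
SET IDENTITY_INSERT dbo.Product ON;

INSERT INTO dbo.Product (ProductID, Name)
SELECT ProductID, Name
FROM dbo.Product_staging;  -- hypothetical staging table holding the imported rows

SET IDENTITY_INSERT dbo.Product OFF;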

Azure SQL External table of Azure Table storage data

Is it possible to create an external table in Azure SQL of the data residing in Azure Table storage?
The answer is no.
I am currently facing a similar issue and this is my research so far:
Azure SQL Database doesn't allow Azure Table Storage as an external data source.
Sources:
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-data-source-transact-sql?view=sql-server-2017
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-file-format-transact-sql?view=sql-server-2017
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql?view=sql-server-2017
Reason:
The possible data source scenarios are copying from Hadoop (Data Lake/Hive, ...), Blob (text files, CSV), or an RDBMS (another SQL Server). Azure Table Storage is not listed.
The possible external data formats are only variations of text files/Hadoop: Delimited Text, Hive RCFile, Hive ORC, Parquet.
Note - even copying from blob in JSON format requires implementing a custom data format.
Workaround:
Create a copy pipeline with Azure Data Factory.
Create a copy function/script with Azure Functions using C# and manually transfer the data.
Yes, there are a couple of options. Please see the following:
CREATE EXTERNAL TABLE (Transact-SQL)
APPLIES TO: SQL Server (starting with 2016), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse
Creates an external table for PolyBase, or Elastic Database queries. Depending on the scenario, the syntax differs significantly. An external table created for PolyBase cannot be used for Elastic Database queries. Similarly, an external table created for Elastic Database queries cannot be used for PolyBase, etc.
CREATE EXTERNAL DATA SOURCE (Transact-SQL)
APPLIES TO: SQL Server (starting with 2016), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse
Creates an external data source for PolyBase, or Elastic Database queries. Depending on the scenario, the syntax differs significantly. An external data source created for PolyBase cannot be used for Elastic Database queries. Similarly, an external data source created for Elastic Database queries cannot be used for PolyBase, etc.
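For illustration, here is a minimal elastic-query sketch of the kind Azure SQL Database does support (server, database, credential, and table names are all hypothetical):
-- Elastic query: external data source of TYPE = RDBMS pointing at another Azure SQL database.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL ElasticCred
WITH IDENTITY = '<remote user>', SECRET = '<remote password>';

CREATE EXTERNAL DATA SOURCE RemoteDb
WITH
(
    TYPE = RDBMS,
    LOCATION = '<server>.database.windows.net',
    DATABASE_NAME = '<remote database>',
    CREDENTIAL = ElasticCred
);

-- Assumes a table dbo.RemoteOrders with this shape exists in the remote database.
CREATE EXTERNAL TABLE dbo.RemoteOrders
(
    OrderId INT,
    Amount DECIMAL(10, 2)
)
WITH (DATA_SOURCE = RemoteDb);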
What is your use case?

External table in Blob Storage in Azure SQL (Not Azure SQL DW)

Here is my script which I am trying to run in Azure SQL Database:
CREATE DATABASE SCOPED CREDENTIAL some_cred WITH IDENTITY = 'user1',
SECRET = '<Key of Blob Storage container>';

CREATE EXTERNAL DATA SOURCE TEST
WITH
(
TYPE = BLOB_STORAGE,
LOCATION = 'wasbs://<containername>@<accountname>.blob.core.windows.net',
CREDENTIAL = some_cred
);
CREATE EXTERNAL TABLE dbo.test
(
val VARCHAR(255)
)
WITH
(
DATA_SOURCE = TEST
)
I am getting the following error:
External tables are not supported with the provided data source type.
My goal is to create an external table over blob storage so that a Hive query in HDInsight references the same blob. The table needs to be managed through Azure SQL. What's wrong with this script?
Azure SQL Database does have the ability to load files stored in Blob Storage, but only via the BULK INSERT and OPENROWSET language features. See here for more information.
BULK INSERT dbo.test
FROM 'data/yourFile.txt'
WITH ( DATA_SOURCE = 'YourAzureBlobStorageAccount');
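For completeness, OPENROWSET can read from the same external data source; a minimal sketch that returns an entire file as a single value (the file path is a placeholder):
-- Reads the whole blob as one value; the single column is named BulkColumn.
SELECT BulkColumn
FROM OPENROWSET(
    BULK 'data/yourFile.txt',
    DATA_SOURCE = 'YourAzureBlobStorageAccount',
    SINGLE_CLOB
) AS FileContents;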
The way you have scripted it is more like an external table using PolyBase, which is only available in SQL Server 2016 and Azure SQL Data Warehouse at this time.
I'm thinking external tables in Azure SQL Database can only be used for cross-database querying (elastic queries), so they can't use an external data source whose type is BLOB_STORAGE.

How to ensure faster response time using Transact-SQL in Azure SQL DW when I combine data from SQL and non-relational data in Azure Blob storage?

What should I do to ensure optimal query performance using Transact-SQL in Azure SQL Data Warehouse while combining data sets from SQL and non-relational data in Azure Blob storage? Any inputs would be greatly appreciated.
The best practice is to load data from Azure Blob Storage into SQL Data Warehouse instead of attempting interactive queries over that data.
The reason is that when you run a query against your data residing in Azure Blob Storage (via an external table), SQL Data Warehouse (under the covers) imports all the data from Azure Blob Storage into SQL Data Warehouse temporary tables to process the query. So even if you run a SELECT TOP 1 query on your external table, the entire dataset for that table will be imported temporarily to process the query.
As a result, if you know that you will be querying the external data frequently, it is recommended that you explicitly load the data into SQL Data Warehouse permanently using a CREATE TABLE AS SELECT command as shown in the document: https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-load-with-polybase/.
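For example, a minimal CTAS sketch, assuming an external table named ext.Sales already exists (table names and the distribution choice are illustrative):
-- Materializes the external (blob-backed) data as a permanent distributed table.
CREATE TABLE dbo.Sales
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    CLUSTERED COLUMNSTORE INDEX
)
AS SELECT * FROM ext.Sales;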
As a best practice, break your Azure Storage data into files of no more than 1 GB when possible for parallel processing with SQL Data Warehouse. More information about how to configure PolyBase in SQL Data Warehouse to load data from Azure Blob storage is here: https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-load-with-polybase/
Let me know if that helps!

Upload Google Cloud SQL backup to Bigquery

I have had trouble trying to move a Google Cloud SQL database to BigQuery. I have exported the database backup from Cloud SQL to Cloud Storage, but when trying to import this into BigQuery, I get the error: 'Not found: URI' for gs://bucket-name/file-name
Is what I'm trying to do even possible? I'm hoping to somehow directly upload the Cloud SQL data to BigQuery. It's a large table (>27GB) and I have been having a lot of connection issues with Cloud SQL, so exporting as CSV or JSON isn't the best option.
BigQuery doesn't support the MySQL backup format, so the best route forward is to generate CSV or JSON from the Cloud SQL database and persist those files into Cloud Storage.
More information on importing data can be found in the BigQuery documentation.
You can use a BigQuery Cloud SQL federated query to copy a Cloud SQL table into BigQuery. You can do it with one BigQuery SQL statement. For example, the following SQL copies the MySQL table sales_20191002 to the BigQuery table demo.sales_20191002.
INSERT demo.sales_20191002 (column1, column2, etc..)
SELECT *
FROM EXTERNAL_QUERY(
    "project.us.connection",
    "SELECT * FROM sales_20191002;");
EXTERNAL_QUERY("connection", "foreign SQL") executes the "foreign SQL" in the Cloud SQL database specified by "connection" and returns the result to BigQuery. The "foreign SQL" is written in the source database's SQL dialect (MySQL or PostgreSQL).
Before running the above SQL query, you need to create a BigQuery connection that points to your Cloud SQL database.
To copy the whole Cloud SQL database, you may want to write a script to iterate all tables and copy them in a loop.
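If it helps, here is a hedged sketch of the first half of that loop, reusing the hypothetical connection above to list the source tables (the schema name is a placeholder):
-- Lists the MySQL tables you would then copy one by one with INSERT ... SELECT EXTERNAL_QUERY.
SELECT table_name
FROM EXTERNAL_QUERY(
    "project.us.connection",
    "SELECT table_name FROM information_schema.tables WHERE table_schema = '<your database>';");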