How can I create a dynamic blob linked service? - sql

I want to access multiple storage accounts. I want one SQL table to be uploaded to one blob and the other SQL table to the other, and I have to use just one pipeline for this. How can I create a dynamic blob linked service?

Each storage account requires its own connection string.
One way to do this would be to create a service (for example an Azure Function) that your pipeline calls. In the function there is logic to place the correct table in the correct storage account.
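A minimal sketch of what the routing logic inside such a function could look like, assuming the pipeline passes in the exported table data and a table name (all names, containers, and connection strings below are hypothetical placeholders):

    # Minimal sketch (hypothetical names): route each table's export to its own storage account.
    import azure.functions as func
    from azure.storage.blob import BlobServiceClient

    # Each SQL table maps to the connection string of its target storage account.
    TABLE_TO_CONNECTION_STRING = {
        "SalesTable": "<connection-string-for-storage-account-1>",
        "CustomerTable": "<connection-string-for-storage-account-2>",
    }

    def main(req: func.HttpRequest) -> func.HttpResponse:
        table_name = req.params.get("table")
        payload = req.get_body()  # the exported table data the pipeline passes in, e.g. CSV

        conn_str = TABLE_TO_CONNECTION_STRING.get(table_name)
        if conn_str is None:
            return func.HttpResponse(f"Unknown table: {table_name}", status_code=400)

        # Upload the data as a blob in the storage account mapped to this table.
        blob_service = BlobServiceClient.from_connection_string(conn_str)
        blob_client = blob_service.get_blob_client(container="exports", blob=f"{table_name}.csv")
        blob_client.upload_blob(payload, overwrite=True)
        return func.HttpResponse(f"Uploaded {table_name}", status_code=200)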

Related

Oracle Cloud to Azure Cloud storage

We have a requirement to move data from Oracle Cloud storage to Azure Cloud storage.
The requirement is basically to move data from an Oracle ADW database (hosted on Oracle cloud) to Snowflake database (hosted on Azure).
Since the data volume in the tables is huge (some with 60 million+ records), we do not wish to use any ETL tool and instead want to set up a pipeline as below.
Oracle ADW database -> Store data in Oracle storage --> Move data to Azure Cloud storage -> Load into Snowflake using snowpipe or similar snowflake utilities.
How should I go about this implementation?
Also, please share your views on whether we can use Oracle FastConnect and Azure ExpressRoute to directly pull data from Oracle Cloud onto Snowflake (or into Azure storage).
I am looking for the same thing with the simplest method from Oracle (on premises, but could be cloud) into Snowflake. It looks like data must be exported or dropped to external tables, shifted to Azure Blob storage (like AWS S3), then pushed into Snowflake using COPY INTO - basically copying on-disk external tables. This is what Snowpipe does:
"Snowpipe copies the files into a queue, from which they are loaded into the target table in a continuous, serverless fashion based on parameters defined in a specified pipe object. The following table indicates the cloud storage service support for automated Snowpipe from Snowflake accounts hosted on each cloud platform:"
It's been a while since I have worked with this. The other option is GoldenGate, which was not expensive the last time I looked into it:
https://www.snowflake.com/blog/continuous-data-replication-into-snowflake-with-oracle-goldengate/
Easy, simple, fast. Any better ideas would be appreciated.

How to use output of Azure Data Factory Web Activity in next copy activity?

I have an ADF Web activity from which I'm getting metadata as an output. I want to copy this metadata into an Azure Postgres DB. How do I use the Web activity output as the source for the next copy activity?
According to this answer, I think we can use two Web activities: the second stores the output of your first Web activity.
Use the @activity('Web1').output.Response expression in the second Web activity to save the output as a blob in the container. Then we can use a Copy activity to copy this blob into the Azure Postgres DB.
Since I do not have permission to set role permissions, I did not test this. I think this solution is feasible.
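Outside of ADF, the second Web activity is conceptually just an HTTP PUT of the first activity's response to the Blob storage REST endpoint. A rough Python equivalent of that call, with a placeholder payload, container, and SAS token:

    # Rough equivalent of the second Web activity: PUT the first activity's
    # output to blob storage. URL, container, and SAS token are placeholders.
    import json
    import requests

    web1_output = {"Response": {"rows": 42, "status": "ok"}}  # stand-in for @activity('Web1').output

    blob_url = ("https://mystorageaccount.blob.core.windows.net"
                "/metadata/web1-output.json?<sas-token>")

    resp = requests.put(
        blob_url,
        data=json.dumps(web1_output["Response"]),
        headers={"x-ms-blob-type": "BlockBlob", "Content-Type": "application/json"},
    )
    resp.raise_for_status()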

Calculate Hashes in Azure Data Factory

We have a requirement where we want to copy the files and folders from on premise to the Azure Blob Storage. Before copying the files I want to calculate the hashes and put that in a file at the source location.
We want this to be done using Azure Data Factory. I am not finding any option in Azure Data Factory to calculate the hashes for file-system type objects. I am able to find the hash for a blob once it has landed at the destination.
Can someone guide me on how this can be achieved?
You need to use data flows in Data Factory to transform the data.
In a mapping data flow you can add a column using a Derived Column transformation with an expression that uses, for example, the md5() or sha2() function to produce a hash.
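For reference, md5() and sha2() in the data flow expression language return standard hex digests, so a quick Python illustration of the same digest functions looks like this (the input string is just a stand-in; how a data flow concatenates multiple columns before hashing may differ):

    # Illustration of the kind of digests the data flow md5()/sha2(256, ...) functions return.
    import hashlib

    value = "some,row,content"  # example input; in a data flow this would be column data
    print(hashlib.md5(value.encode("utf-8")).hexdigest())     # cf. md5(...)
    print(hashlib.sha256(value.encode("utf-8")).hexdigest())  # cf. sha2(256, ...)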

How do I create a BigQuery dataset out of another BigQuery dataset?

I need to understand the below:
1.) How does one BigQuery dataset connect to another BigQuery dataset, apply some logic, and create a third? For example, if I have an ETL tool like DataStage and some data has been uploaded for us to consume in the form of a BigQuery dataset, how do I design the job (in DataStage or using any other technology) so that the source is one BQ dataset and the target is another BQ dataset?
2.) I want my input to be a BigQuery view, run some logic on that view, and then load the result into another BigQuery view.
3.) What technology is used to connect one BigQuery dataset to another - is it HTTPS or something else?
Thanks
If you have a large amount of data to process (many GB), you should do the transformation of the data directly in the BigQuery database. It would be very slow to extract all the data, run it through something locally, and send it back. You don't need any outside technology to make one view depend on another view, besides access to the relevant data.
The ideal job design will be an SQL query that BigQuery can process. If you are trying to link tables/views across different projects, then the source BQ table must be listed in fully-qualified form, projectName.datasetName.tableName, in the FROM clauses of the SQL query. Project names are globally unique in Google Cloud.
Permissions to access the data must be set up correctly. BQ provides fine-grained control over who can access, and it is in the BQ documentation. You can also enable public access to all BQ users if that is appropriate.
Once you have that SQL query, you can create a new view by sending your SQL to Google BigQuery either through the command line (the bq tool), the web console, or an API.
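As an illustration, the same view creation through the Python client library; every project, dataset, and table name here is a placeholder:

    # Minimal sketch (placeholder names): create a view in one project from a
    # query over a table in another project.
    from google.cloud import bigquery

    client = bigquery.Client(project="target-project")

    view = bigquery.Table("target-project.reporting.daily_summary_view")
    view.view_query = """
        SELECT customer_id, SUM(amount) AS total_amount
        FROM `source-project.sales.transactions`  -- fully qualified: project.dataset.table
        GROUP BY customer_id
    """
    client.create_table(view)  # creates the view in the target dataset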
1) You can use the BigQuery Connector in DataStage to read from and write to BigQuery.
2) BigQuery uses namespaces in the format project.dataset.table to access tables across projects. This allows you to manipulate your data in GCP as if it were in the same database.
To manipulate your data you can use DML or standard SQL.
To execute your queries you can use the GCP web console or client libraries such as Python or Java.
3) BigQuery is a RESTful web service and uses HTTPS.
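For example, a standard SQL / DML statement that reads from one project and writes into another can be submitted through the Python client library, which talks to that same HTTPS REST API under the hood (all names below are placeholders):

    # Minimal sketch (placeholder names): cross-project DML via the BigQuery Python client.
    from google.cloud import bigquery

    client = bigquery.Client(project="target-project")

    job = client.query("""
        INSERT INTO `target-project.reporting.daily_totals` (day, total)
        SELECT DATE(created_at), SUM(amount)
        FROM `source-project.sales.transactions`
        GROUP BY DATE(created_at)
    """)
    job.result()  # wait for the job to finish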

My Azure storage unique namespace

I'm trying to move some tables from SQL to Azure Table Storage.
I created an MVC Website with the default authentication. I successfully connected it to my Azure SQL database. Now I want to use the table storage for authentication too, instead of the SQL database.
The problem is, I cannot find my storage account's unique namespace. What, where is that namespace?
Thanks!
Looking at a table URL, for example 'http://myaccountname.table.core.windows.net/mytable', the 'myaccountname' part is the name of your account. Storage account names must be between 3 and 24 characters in length and may contain numbers and lowercase letters only. The storage account name must be unique across the Azure service. A list of the storage accounts you own, and more information about them, can be found in the Azure Portal.
More information on authentication for tables can be found here and here. Manipulating and authenticating access to your tables are features built into the storage client libraries, which are available in a variety of languages. Since you mention MVC, you might want to check out the .NET storage library.
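The question is about an MVC (.NET) app, but the endpoint format is the same from any language; a minimal Python sketch, with a placeholder account name and key:

    # Minimal sketch (placeholder credentials): the account name is the unique
    # namespace that forms the table endpoint.
    from azure.core.credentials import AzureNamedKeyCredential
    from azure.data.tables import TableServiceClient

    account_name = "myaccountname"  # the unique namespace
    endpoint = f"https://{account_name}.table.core.windows.net"

    service = TableServiceClient(
        endpoint=endpoint,
        credential=AzureNamedKeyCredential(account_name, "<account-key>"),
    )
    table_client = service.create_table_if_not_exists("users")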