Azure Data Lake - Creating U-SQL External Data Source from connection string

How do I create a U-SQL data source with a connection string?
This is my attempt; it fails when setting the CREDENTIAL parameter.
CREATE DATA SOURCE MyAzureSQLDBDataSource
FROM AZURESQLDB
WITH
(
PROVIDER_STRING = "Database=mySampleDB;",
CREDENTIAL = "Server=mySampleDB.database.windows.net;User ID=myUser;Password=myPasswd",
REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)
);
Error message:
error External0: E_CSC_USER_INVALIDDATASOURCEOPTIONVALUE: Invalid value '"Server=mySampleDB.database.windows.net;User ID=myUser;Password=myPasswd"' for data source option 'CREDENTIAL'.
Description:
The only valid values for 'CREDENTIAL' are identifiers or two-part identifiers.
Resolution:
Use a valid value.

Have you reviewed CREATE DATA SOURCE (U-SQL)?

To expand on David's answer:
Since U-SQL scripts are stored at least temporarily in the cluster, we cannot allow secrets to be included in the script. Instead, you need to create the credential in the metadata via an Azure PowerShell command (or the SDK) and then refer to that credential in the CREATE DATA SOURCE statement. The documentation link David provided contains some examples.
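For illustration, a minimal sketch of what the script side might then look like. The credential name MySQLDBCredential and the provider string are placeholders, and the assumption is that the credential was created beforehand in the same U-SQL database (for example with the Azure PowerShell cmdlet New-AzureRmDataLakeAnalyticsCatalogCredential, which captures the server, user and password so they never appear in the script):
CREATE DATA SOURCE MyAzureSQLDBDataSource
FROM AZURESQLDB
WITH
(
    PROVIDER_STRING = "Database=mySampleDB;Trusted_Connection=False;Encrypt=True",
    CREDENTIAL = MySQLDBCredential,
    REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)
);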

Related

Connect a Bigquery function with Cloud functions

I am trying to create a function in BigQuery which calls a Cloud Function:
CREATE OR REPLACE FUNCTION `DATASET.XXXXX`(user_id int64, corp_id STRING) RETURNS STRING
REMOTE WITH CONNECTION `myPROJECTID.REGION.MY_CONNECTION`
OPTIONS (
endpoint = 'https://XXXX.cloudfunctions.net/XXXXX'
)
I previously created the connection in the BigQuery shell, but I get one of the following errors. Does anyone know what is wrong?
Keyword REMOTE is not supported at [2:1]
or
Not found: Connection my-connection
Your project must be allowlisted. It's a private preview (I asked 2 months ago, still nothing...).

How to read ABAP code using a java client

I have a requirement where I need to read ABAP code written by SAP developers. I want to write my own client in Java or Python that can integrate with the SAP system and fetch the ABAP code for me.
My understanding is that ABAP code is stored in the SAP database (HANA, MySQL, etc.). Is there a way, provided by SAP, to read that code the way we can in Git/SVN?
I've used the RFC calls RPY_FUNCTIONMODULE_READ and RPY_FUNCTIONMODULE_READ_NEW through the Perl NWRFC wrapper/library to retrieve ABAP code.
You can access the code with the techniques below:
Using SAP Connectors via RFC (RFC_READ_TABLE)
Using a SOAP web service with the same function (RFC_READ_TABLE)
Using custom web services built on existing functions that read reports, function modules, etc.
You can use either Java or Python for RFC; a GitHub repo (PyRFC) already exists for Python (see the sketch after the note at the end of this answer).
If you choose to read directly from the database tables, you need to know the structure of the saved data; it has its own mechanism for OO objects. Daniel Berlin tried to implement a binary parser in C++ in the sap-reposrc-decompressor project. Never forget that this source format depends on the SAP version.
I think using the ADT (ABAP Development Tools) plugin is a good option for up-to-date systems. An Eclipse plugin for ADT already exists, but ADT is not available on old systems.
If you are planning to use your solution on an older system (after 7.01), you can build your own solution with abapGit and custom web services.
NOTE: Keep in mind that reports and data elements (variables, tables, types) are saved in separate tables. Dynpro objects (screens, etc.) and Smart Forms are hard things to decompile.
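A minimal Python sketch using the PyRFC library, tying the RFC options above together. The connection parameters are placeholders, and the exact shape of the RPY_FUNCTIONMODULE_READ result can vary between releases, so treat this as a starting point rather than tested code:
from pyrfc import Connection

# Placeholder connection data - replace with your SAP system's values.
conn = Connection(ashost="sap-host", sysnr="00", client="100",
                  user="RFC_USER", passwd="secret")

# Read a function module's source (result structure may differ by release).
fm = conn.call("RPY_FUNCTIONMODULE_READ", FUNCTIONNAME="RFC_READ_TABLE")
for line in fm.get("SOURCE", []):
    print(line)

# Generic table access with RFC_READ_TABLE, e.g. listing programs in a package via TADIR.
tadir = conn.call("RFC_READ_TABLE",
                  QUERY_TABLE="TADIR",
                  DELIMITER="|",
                  OPTIONS=[{"TEXT": "OBJECT = 'PROG' AND DEVCLASS = 'ZMY_PACKAGE'"}],
                  ROWCOUNT=10)
for row in tadir["DATA"]:
    print(row["WA"])

conn.close()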
Before you re-invent the wheel, take a look at:
abapGit: https://docs.abapgit.org/
or the old SAPLink: https://wiki.scn.sap.com/wiki/display/ABAP/SAPlink.
If you want JUST the source code, you could expose a very simple REST service/endpoint in SAP.
This service would just read the raw code and return it as plain text.
Any ABAPer could create this for you.
BUT this is the raw source only. There is much more to a complete development, which is why tools like abapGit exist.
In SICF, create a new endpoint / service.
E.g. ZCODE_MONKEY, with the class below as an example.
Now activate the service.
Call the endpoint
http://server:PORT/zcode_monkey?name=ZCODE_MONKEY
Sample implementation:
CLASS zcode_monkey DEFINITION
  PUBLIC
  CREATE PUBLIC.

  PUBLIC SECTION.
    INTERFACES: if_http_extension.
ENDCLASS.

CLASS zcode_monkey IMPLEMENTATION.

  METHOD if_http_extension~handle_request.

    DATA: lo_src     TYPE REF TO cl_oo_source,
          l_name     TYPE string,
          l_repname  TYPE c LENGTH 30,
          l_clskey   TYPE seoclskey,
          l_source   TYPE rswsourcet,
          resultcode TYPE string.

    FIELD-SYMBOLS: <line> TYPE LINE OF rswsourcet.

    l_name    = server->request->get_form_field( name = 'NAME' ).
    l_clskey  = l_name.
    l_repname = l_name.

    CREATE OBJECT lo_src
      EXPORTING
        clskey = l_clskey
      EXCEPTIONS
        class_not_existing = 1
        OTHERS             = 2.

    IF sy-subrc <> 0.
      READ REPORT l_repname INTO l_source.
    ELSE.
      lo_src->read( ).
      lo_src->if_oo_clif_source~get_source( IMPORTING source = l_source ).
    ENDIF.

    LOOP AT l_source ASSIGNING <line>.
      CONCATENATE resultcode
                  cl_abap_char_utilities=>cr_lf
                  <line>
             INTO resultcode RESPECTING BLANKS. " always show respect ;)
    ENDLOOP.

    server->response->set_content_type( content_type = 'text/plain' ).
    server->response->set_cdata( EXPORTING data = resultcode ).
    server->response->set_status(
      EXPORTING
        code   = 200
        reason = 'this is a 3.50 piece of code. Dont ask...its a demo' ).

  ENDMETHOD.

ENDCLASS.

How to upload images to a SQL database in Azure?

I've seen some queries to upload the images from a file, but I get this error message:
Cannot bulk load because the file could not be opened
I went to the file's Properties > Security settings to give SQL access, but I couldn't find an option to grant the permission. Considering this is Azure from Microsoft, how do I give access to my files so I can execute the query? I'm using OPENROWSET and this is my code.
INSERT INTO FOTOS_EMPLEADOS
values (1,'HOLA', (SELECT * FROM OPENROWSET(BULK 'C:\Users.jpg', SINGLE_BLOB) as T1))
If there is a mistake in the code, or a better way to do it, please let me know.
TIA
Azure SQL Database doesn't support loading files from an on-premises computer.
Please reference OPENROWSET (Transact-SQL).
If you want to do this, you need to upload the images to Blob Storage first:
Please see Importing into a table from a file stored on Azure Blob storage:
--> Optional - a MASTER KEY is not required if a DATABASE SCOPED CREDENTIAL is not required because the blob is configured for public (anonymous) access!
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourStrongPassword1';
GO
--> Optional - a DATABASE SCOPED CREDENTIAL is not required because the blob is configured for public (anonymous) access!
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z***************';
-- NOTE: Make sure that you don't have a leading ? in SAS token, and
-- that you have at least read permission on the object that should be loaded srt=o&sp=r, and
-- that expiration period is valid (all dates are in UTC time)
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://****************.blob.core.windows.net/curriculum'
, CREDENTIAL= MyAzureBlobStorageCredential --> CREDENTIAL is not required if a blob is configured for public (anonymous) access!
);
INSERT INTO achievements WITH (TABLOCK) (id, description)
SELECT * FROM OPENROWSET(
BULK 'csv/achievements.csv',
DATA_SOURCE = 'MyAzureBlobStorage',
FORMAT ='CSV',
FORMATFILE='csv/achievements-c.xml',
FORMATFILE_DATA_SOURCE = 'MyAzureBlobStorage'
) AS DataFile;
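Applied to the image scenario from the question, a sketch along the same lines (assuming the .jpg has been uploaded to the container referenced by the data source; the blob name below is a placeholder):
INSERT INTO FOTOS_EMPLEADOS
VALUES (1, 'HOLA',
        (SELECT BulkColumn
         FROM OPENROWSET(
                  BULK 'empleado1.jpg',
                  DATA_SOURCE = 'MyAzureBlobStorage',
                  SINGLE_BLOB) AS T1));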
Hope this helps.

U-SQL error in Azure Data Lake Analytics

I'm trying to execute a simple pipeline in Azure Data Lake Analytics, but I'm having some trouble with U-SQL. I was wondering if someone could lend a helping hand.
My Query:
DECLARE @log_file string = "/datalake/valores.tsv";
DECLARE @summary_file string = "/datalake/output.tsv";
@log = EXTRACT valor string from @log_file USING Extractors.Tsv();
@summary = select sum(int.valor) as somavalor from @log;
OUTPUT @summary
TO @summary_file USING Outputters.Tsv();
Error: (posted as a screenshot; the error text is not included)
Other general doubts:
1. When I deploy a new pipeline to ADF, sometimes it appears in the activity window and sometimes it doesn't; I don't get the logic. (I'm using the OneTime pipeline mode.)
2. Is there a better way to create a new pipeline (other than manipulating raw JSON files)?
3. Is there any U-SQL parser? What is the easiest way to test my query?
Thanks a lot.
U-SQL is case-sensitive, so your U-SQL should look more like this:
DECLARE @log_file string = "/datalake/valores.tsv";
DECLARE @summary_file string = "/datalake/output.tsv";
@log =
    EXTRACT valor int
    FROM @log_file
    USING Extractors.Tsv();
@summary =
    SELECT SUM(valor) AS somavalor
    FROM @log;
OUTPUT @summary
TO @summary_file USING Outputters.Tsv();
I have assumed your input file has only a single column of type int.
Use Visual Studio U-SQL projects or the VS Code U-SQL add-in to ensure you write valid U-SQL. You can also submit U-SQL jobs via the portal.

Processing Event Hub Capture AVRO files with Azure Data Lake Analytics

I'm attempting to extract data from Avro files produced by Event Hub Capture. In most cases this works flawlessly, but certain files are causing me problems. When I run the following U-SQL job:
USE DATABASE Metrics;
USE SCHEMA dbo;
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
REFERENCE ASSEMBLY [Avro];
REFERENCE ASSEMBLY [log4net];
USING Microsoft.Analytics.Samples.Formats.ApacheAvro;
USING Microsoft.Analytics.Samples.Formats.Json;
USING System.Text;
//DECLARE @input string = "adl://mydatalakestore.azuredatalakestore.net/event-hub-capture/v3/{date:yyyy}/{date:MM}/{date:dd}/{date:HH}/{filename}";
DECLARE @input string = "adl://mydatalakestore.azuredatalakestore.net/event-hub-capture/v3/2018/01/16/19/rcpt-metrics-us-es-eh-metrics-v3-us-0-35-36.avro";
@eventHubArchiveRecords =
EXTRACT Body byte[],
date DateTime,
filename System.String
FROM @input
USING new AvroExtractor(@"
{
""type"":""record"",
""name"":""EventData"",
""namespace"":""Microsoft.ServiceBus.Messaging"",
""fields"":[
{""name"":""SequenceNumber"",""type"":""long""},
{""name"":""Offset"",""type"":""string""},
{""name"":""EnqueuedTimeUtc"",""type"":""string""},
{""name"":""SystemProperties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},
{""name"":""Properties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},
{""name"":""Body"",""type"":[""null"",""bytes""]}
]
}
");
@json =
SELECT Encoding.UTF8.GetString(Body) AS json
FROM @eventHubArchiveRecords;
OUTPUT @json
TO "/outputs/Avro/testjson.csv"
USING Outputters.Csv(outputHeader : true, quoting : true);
I get the following error:
Unhandled exception from user code: "The given key was not present in the dictionary."
An unhandled exception from user code has been reported when invoking the method 'Extract' on the user type 'Microsoft.Analytics.Samples.Formats.ApacheAvro.AvroExtractor'
Am I correct in assuming the problem is within the AVRO file produced by Event Hub Capture, or is there something wrong with my code?
The "key not present" error is referring to the fields in your EXTRACT statement: it's not finding the date and filename fields. I removed those fields and your script ran correctly in my ADLA instance.
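In other words, a sketch of the change described above, keeping only the Body column (the Avro schema string stays exactly as in the question). Presumably those two columns only exist as virtual columns when the commented-out {date:...}/{filename} file-set pattern is used for @input, not when a single literal file path is given:
@eventHubArchiveRecords =
    EXTRACT Body byte[]
    FROM @input
    USING new AvroExtractor(@"...same EventData schema as in the question...");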
The current implementation only supports primitive types, not the complex types of the Avro specification, at the moment.
You have to build and use an extractor based on Apache Avro rather than the sample extractor provided by MS.
We went down the same path.
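For reference, a rough C# sketch of what such a custom extractor could look like, using the Apache Avro library for .NET together with the U-SQL UDO interfaces. The class name and the single Body column are assumptions for illustration, not the code the answerer actually used:
using System.Collections.Generic;
using Avro.File;
using Avro.Generic;
using Microsoft.Analytics.Interfaces;

// Avro container files cannot be split arbitrarily, so process each file atomically.
[SqlUserDefinedExtractor(AtomicFileProcessing = true)]
public class GenericAvroExtractor : IExtractor
{
    public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
    {
        // Open the whole Avro container file and emit one row per record.
        using (var reader = DataFileReader<GenericRecord>.OpenReader(input.BaseStream))
        {
            while (reader.HasNext())
            {
                GenericRecord record = reader.Next();
                object body;
                record.TryGetValue("Body", out body);
                output.Set<byte[]>("Body", body as byte[]);
                yield return output.AsReadOnly();
            }
        }
    }
}
After compiling and registering the assembly in the U-SQL database (CREATE ASSEMBLY / REFERENCE ASSEMBLY), the EXTRACT statement would reference USING new GenericAvroExtractor() instead of the sample AvroExtractor.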