I've seen some queries for uploading images from a file, but I get this error message:
Cannot bulk load because the file could not be opened
I went to the Properties > Security options of the file to give SQL access, but I couldn't find the option to grant the permission. Considering this is Azure from Microsoft, how do I grant access to my files so I can execute the query? I'm using OPENROWSET and this is my code:
INSERT INTO FOTOS_EMPLEADOS
VALUES (1, 'HOLA', (SELECT * FROM OPENROWSET(BULK 'C:\Users.jpg', SINGLE_BLOB) AS T1))
If there is a mistake in the code, or another way to do it, please let me know.
TIA
Azure SQL Database doesn't support loading files from an on-premises computer.
Please reference OPENROWSET (Transact-SQL):
If you want to do this, you need to upload the images to Azure Blob Storage first:
Please see Importing into a table from a file stored on Azure Blob storage:
--> Optional - a MASTER KEY is not required if a DATABASE SCOPED CREDENTIAL is not required because the blob is configured for public (anonymous) access!
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourStrongPassword1';
GO
--> Optional - a DATABASE SCOPED CREDENTIAL is not required because the blob is configured for public (anonymous) access!
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z***************';
-- NOTE: Make sure that you don't have a leading ? in SAS token, and
-- that you have at least read permission on the object that should be loaded srt=o&sp=r, and
-- that expiration period is valid (all dates are in UTC time)
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
       LOCATION = 'https://****************.blob.core.windows.net/curriculum',
       CREDENTIAL = MyAzureBlobStorageCredential --> CREDENTIAL is not required if a blob is configured for public (anonymous) access!
);
INSERT INTO achievements WITH (TABLOCK) (id, description)
SELECT * FROM OPENROWSET(
    BULK 'csv/achievements.csv',
    DATA_SOURCE = 'MyAzureBlobStorage',
    FORMAT = 'CSV',
    FORMATFILE = 'csv/achievements-c.xml',
    FORMATFILE_DATA_SOURCE = 'MyAzureBlobStorage'
) AS DataFile;
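As a side note, if the image still needs to get into the container in the first place, one option is to upload it with the azure-storage-blob Python SDK. A minimal sketch, where the connection string, blob name, and local path are placeholders and only the 'curriculum' container name is reused from the example above:
from azure.storage.blob import BlobServiceClient

# All values below are placeholders: use your own storage connection string
# and a real local path to the image you want to load.
conn_str = "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"
service = BlobServiceClient.from_connection_string(conn_str)

# Upload the local image into the 'curriculum' container referenced by the external data source above.
blob = service.get_blob_client(container="curriculum", blob="fotos/empleado1.jpg")
with open("empleado1.jpg", "rb") as data:
    blob.upload_blob(data, overwrite=True)
Once the file is in the container, the same OPENROWSET pattern works for a single image by using SINGLE_BLOB instead of the CSV options.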
Hope this helps.
Related
I have a new error using Azure ML, maybe due to the Ubuntu upgrade to 22.04 that I did yesterday.
I have an Azure ML workspace created through the portal, and I can access it without any issue with the Python SDK:
from azureml.core import Workspace
ws = Workspace.from_config("config/config.json")
ws.get_details()
output
{'id': '/subscriptions/XXXXX/resourceGroups/gr_louis/providers/Microsoft.MachineLearningServices/workspaces/azml_lk',
'name': 'azml_lk',
'identity': {'principal_id': 'XXXXX',
'tenant_id': 'XXXXX',
'type': 'SystemAssigned'},
'location': 'westeurope',
'type': 'Microsoft.MachineLearningServices/workspaces',
'tags': {},
'sku': 'Basic',
'workspaceid': 'XXXXX',
'sdkTelemetryAppInsightsKey': 'XXXXX',
'description': '',
'friendlyName': 'azml_lk',
'keyVault': '/subscriptions/XXXXX/resourceGroups/gr_louis/providers/Microsoft.Keyvault/vaults/azmllkXXXXX',
'applicationInsights': '/subscriptions/XXXXX/resourceGroups/gr_louis/providers/Microsoft.insights/components/azmllkXXXXX',
'storageAccount': '/subscriptions/XXXXX/resourceGroups/gr_louis/providers/Microsoft.Storage/storageAccounts/azmllkXXXXX',
'hbiWorkspace': False,
'provisioningState': 'Succeeded',
'discoveryUrl': 'https://westeurope.api.azureml.ms/discovery',
'notebookInfo': {'fqdn': 'ml-azmllk-westeurope-XXXXX.westeurope.notebooks.azure.net',
'resource_id': 'XXXXX'},
'v1LegacyMode': False}
I then use this workspace ws to upload a file (or a directory) to Azure Blob Storage like so
from azureml.core import Dataset
ds = ws.get_default_datastore()
Dataset.File.upload_directory(
src_dir="./data",
target=ds,
pattern="*dataset1.csv",
overwrite=True,
show_progress=True
)
which again works fine and outputs
Validating arguments.
Arguments validated.
Uploading file to /
Filtering files with pattern matching *dataset1.csv
Uploading an estimated of 1 files
Uploading ./data/dataset1.csv
Uploaded ./data/dataset1.csv, 1 files out of an estimated total of 1
Uploaded 1 files
Creating new dataset
{
"source": [
"('workspaceblobstore', '//')"
],
"definition": [
"GetDatastoreFiles"
]
}
My file is indeed uploaded to Blob Storage, and I can see it either in the Azure portal or in Azure ML Studio (ml.azure.com).
The error comes up when I try to create a Tabular dataset from the uploaded file. The following code doesn't work:
from azureml.core import Dataset
data1 = Dataset.Tabular.from_delimited_files(
path=[(ds, "dataset1.csv")]
)
and it gives me the error:
ExecutionError:
Error Code: ScriptExecution.DatastoreResolution.Unexpected
Failed Step: XXXXXX
Error Message: ScriptExecutionException was caused by DatastoreResolutionException.
DatastoreResolutionException was caused by UnexpectedException.
Unexpected failure making request to fetching info for Datastore 'workspaceblobstore' in subscription: 'XXXXXX', resource group: 'gr_louis', workspace: 'azml_lk'. Using base service url: https://westeurope.experiments.azureml.net. HResult: 0x80131501.
The SSL connection could not be established, see inner exception.
| session_id=XXXXXX
After some research, I assumed it might be due to the OpenSSL version (which is now 1.1.1), but I am not sure and I certainly don't know how to fix it... any ideas?
According to the documentation, there is no direct procedure to convert a file dataset into a tabular dataset. Instead, we can create a workspace, which creates two storage options (blob storage, which is the default, and file storage). SSL is then taken care of by the workspace.
We can create a datastore in the workspace and connect it to the blob storage.
Follow this procedure:
Create a workspace.
If we want, we can create a dataset.
We can create it from local files or from a datastore.
To choose a datastore, we first need to have a file in that datastore.
Go to Datastores and click Create dataset. Observe that the name is workspaceblobstore (default).
Fill in the details and make sure the dataset type is Tabular.
For the path, we supply the local file path; under "Select or create a datastore" we can check that the default storage is shown as blob.
After uploading, we can see the name in this section as a datastore-backed tabular dataset.
In the workspace you created, check whether public network access is Disabled or Enabled. If it is disabled, access is refused and the SSL connection cannot be established. After enabling it, use the same procedure as above.
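If you prefer to stay in the Python SDK rather than ML Studio, the original call should work once public network access is enabled. A minimal sketch, where the registered dataset name is just an example:
from azureml.core import Dataset, Workspace

ws = Workspace.from_config("config/config.json")
ds = ws.get_default_datastore()

# With public network access enabled on the workspace/storage, the datastore lookup should succeed.
data1 = Dataset.Tabular.from_delimited_files(path=[(ds, "dataset1.csv")])

# Optional: register it so it also shows up in ML Studio (the name is an example).
data1 = data1.register(workspace=ws, name="dataset1", create_new_version=True)
print(data1.to_pandas_dataframe().head())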
We are trying to copy a DataFrame to a specific Teradata database, and the script is not accepting the schema_name parameter. Copying to the user database used in the logon command works, but when I try to override the default by specifying a database name in copy_to_sql, it fails.
from teradataml import *
from teradataml.dataframe.copy_to import copy_to_sql
create_context(host="ipaddrr", username="uname", password="pwd")
df = DataFrame.from_query("select top 10 * from dbc.tables;")
copy_to_sql(df=df, table_name='Tab', schema_name='DB_Name', if_exists='replace')
Error: TeradataMlException: [Teradata][teradataml](TDML_2007) Invalid value(s) 'DB_Name' passed to argument 'schema_name', should be: A valid database/schema name..
Do you have a database / user named DB_Name? If not, try creating the database first and then running your copy script:
CREATE DATABASE DB_NAME FROM <parent_DB> AS PERMANENT = 1000000000;
I don't think the utilities / packages will typically create a database for you on the fly, since it can be a more involved operation (locks, space allocation, etc.) than creating a table.
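For completeness, a minimal sketch of doing both steps from Python; it assumes a teradataml version where get_context() returns a SQLAlchemy engine whose execute() can run DDL, and the parent database and PERM size are placeholders:
from teradataml import create_context, get_context, DataFrame
from teradataml.dataframe.copy_to import copy_to_sql

create_context(host="ipaddrr", username="uname", password="pwd")

# Create the target database first; <parent_DB> and the PERM size are placeholders.
get_context().execute("CREATE DATABASE DB_Name FROM <parent_DB> AS PERMANENT = 1000000000;")

# With the database in place, schema_name should now resolve.
df = DataFrame.from_query("select top 10 * from dbc.tables;")
copy_to_sql(df=df, table_name="Tab", schema_name="DB_Name", if_exists="replace")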
I'm trying to execute a simple pipeline in Azure Data Lake Analytics, but I'm having some trouble with U-SQL. I was wondering if someone could lend a helping hand.
My query:
DECLARE @log_file string = "/datalake/valores.tsv";
DECLARE @summary_file string = "/datalake/output.tsv";
@log = EXTRACT valor string from @log_file USING Extractors.Tsv();
@summary = select sum(int.valor) as somavalor from @log;
OUTPUT @summary
TO @summary_file USING Outputters.Tsv();
Error: (error output not shown)
Other general doubts:
1. When I deploy a new pipeline to ADF, sometimes it doesn't appear in the activity window and sometimes it does. I don't understand the logic. (I'm using the OneTime pipeline mode.)
2. Is there a better way to create a new pipeline (other than manipulating raw JSON files)?
3. Is there any U-SQL parser? What is the easiest way to test my query?
Thanks a lot.
U-SQL is case-sensitive, so your script should look more like this:
DECLARE @log_file string = "/datalake/valores.tsv";
DECLARE @summary_file string = "/datalake/output.tsv";
@log =
    EXTRACT valor int
    FROM @log_file
    USING Extractors.Tsv();
@summary =
    SELECT SUM(valor) AS somavalor
    FROM @log;
OUTPUT @summary
TO @summary_file USING Outputters.Tsv();
I have assumed your input file has only a single column of type int.
Use Visual Studio U-SQL projects or the VS Code U-SQL add-in to ensure you write valid U-SQL. You can also submit U-SQL jobs via the portal.
How do I create a U-SQL data source with a connection string?
This is my attempt; it has an issue with setting the CREDENTIAL parameter.
CREATE DATA SOURCE MyAzureSQLDBDataSource
FROM AZURESQLDB
WITH
(
    PROVIDER_STRING = "Database=mySampleDB;",
    CREDENTIAL = "Server=mySampleDB.database.windows.net;User ID=myUser;Password=myPasswd",
    REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)
);
Error message:
error External0: E_CSC_USER_INVALIDDATASOURCEOPTIONVALUE: Invalid value '"Server=mySampleDB.database.windows.net;User ID=myUser;Password=myPasswd"' for data source option 'CREDENTIAL'.
Description:
The only valid values for 'CREDENTIAL' are identifiers or two-part identifiers.
Resolution:
Use a valid value.
Have you reviewed CREATE DATA SOURCE (U-SQL)?
To expand on David's answer: since U-SQL scripts are stored, at least temporarily, in the cluster, secrets cannot be included in the script itself. Instead, you need to create the credential in the metadata via an Azure PowerShell command (or the SDK) and then refer to that credential by name in the CREATE DATA SOURCE statement. The documentation link provided by David contains some examples.
I'm trying to upload an image to SQL Server from a Linux (Raspbian) environment using Python. So far I was able to connect to SQL Server and create a table, and I'm using pyodbc.
#! /usr/bin/env python
import pyodbc
from PIL import Image
dsn = 'nicedcn'
user = 'myid'
password = 'mypass'
database = 'myDB'
con_string = 'DSN=%s;UID=%s;PWD=%s;DATABASE=%s;' % (dsn, user, password, database)
cnxn = pyodbc.connect(con_string)
cursor = cnxn.cursor()
string = "CREATE TABLE Database2([image name] varchar(20), [image] image)"
cursor.execute(string)
cnxn.commit()
This part ran without any error. That means I have successfully created the table, right? Or is there any issue?
I try to upload the image this way:
image12 = Image.open('new1.jpg')
cursor.execute("insert into Database1([image name], [image]) values (?, ?)",
               'new1', image12)
cnxn.commit()
I get the error on this part, and it is a pyodbc.ProgrammingError:
('Invalid Parameter type. param-index=1 param-type=instance', 'HY105')
Please tell me another way, or the proper way, to upload an image via pyodbc to a database.
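For what it's worth, the HY105 error is raised because a PIL Image instance is being bound as a query parameter; pyodbc can only bind basic types such as strings and bytes. A minimal sketch of one way around it, reusing the connection, cursor, and table names from the snippets above and reading the raw file bytes instead:
# Read the raw JPEG bytes instead of handing pyodbc a PIL Image object.
with open('new1.jpg', 'rb') as f:
    img_bytes = f.read()

cursor.execute("insert into Database1([image name], [image]) values (?, ?)",
               'new1', pyodbc.Binary(img_bytes))
cnxn.commit()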