Insert records into Spark SQL table - sql

I have created a Spark SQL table like the one below through Azure Databricks:
create table sample1(price double)
The actual file has data like 'abc' instead of a double value.
When the string value 'abc' is inserted into the double column, it is accepted as NULL without any failure. My concern is: why are we not getting any error? I want a failure message in this case.
Please let me know if I'm missing something. I want to disable the implicit conversion of datatypes.
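For what it's worth, Spark 3.0 and later have an ANSI compliance mode that makes invalid casts fail at runtime instead of silently producing NULL; a minimal sketch, assuming Spark 3.0+:
-- Enable ANSI mode for the session
SET spark.sql.ansi.enabled = true;
-- An invalid cast now raises an error instead of quietly returning NULL:
SELECT CAST('abc' AS DOUBLE);
-- Likewise, inserting 'abc' into the double column is rejected
-- rather than being stored as NULL:
INSERT INTO sample1 VALUES ('abc');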

Related

SQL Query in Azure Dataflow does not work when using parameter value in where clause

I use an Azure Data Factory pipeline.
Within that pipeline I use two activities:
A Lookup to get a date value.
This is the output:
"firstRow": {
"Date": "2022-10-26T00:00:00Z"
A data flow which gets the date from the Lookup in step 1 and uses it in the source options SQL query, in the WHERE clause.
This is the query:
"SELECT ProductID ,ProductName ,SupplierID,CategoryID ,QuantityPerUnit ,UnitPrice ,UnitsInStock,UnitsOnOrder,ReorderLevel,Discontinued,LastModifiedDate FROM Noordwind.Products where LastModifiedDate >= '{$DS_LastPipeLineRunDate}'"
When I fill in the parameter by hand with, for example, '2022-10-26', it works great, but when I let the parameter get its value from the Lookup in step 1, the data flow fails.
Error message:
{"message":"Job failed due to reason: Converting to a date or time failed due to an invalid character. Details:null","failureType":"UserError","target":"Products","errorCode":"DF-Executor-Conversion"}
This is the parameter in the pipeline view, with the data flow selected:
I have tried all kinds of casts on the date, but nothing has worked.
Can you help me?
UPDATE:
After a question from Rakesh:
This is the activity parameter:
#activity('LookupLastPipelineRunDate').output.firstRow
I have reproduced the above and got the results below.
This is my source sample data from a SQL database.
For the demo, I have used a Set variable activity for the date and given it a sample date like below.
I created a string parameter and passed this variable's value to it.
In your case, pass the Lookup's firstRow output date; see the expression below.
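The value passed to the string parameter would need to drill into the Date field; something like this pipeline expression should work (the activity name is taken from your question, and formatDateTime trims the timestamp down to a date, so adjust the format if needed):
@formatDateTime(activity('LookupLastPipelineRunDate').output.firstRow.Date, 'yyyy-MM-dd')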
I have used the below data flow expression in the query of the data flow source and got the desired result:
concat('select * from dbo.table1 where d1 >=','\'',$date_value,'\'')
Result in the target SQL table:
I have created a Set variable activity.
The first pipeline still returns the right date.
I even converted it to datetime, just to be sure.
I can create a variable with type string.
Code:
#activity('LookupLastPipelineRunDate').output.firstRow
Regardless of the Set variable activity failing, it looks like the date arrives nicely as an input to the Set variable activity.
And still I get an error:
When I read this error message, it says that you can't put a date in a string variable. But I can only choose string, Boolean, and array, so there is no better option.
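Perhaps the value needs an explicit conversion to string inside the expression itself, something like the line below, though I'm not sure this is the right approach:
@string(activity('LookupLastPipelineRunDate').output.firstRow.Date)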
I also reviewed a website about this issue.
Therefore, I have altered the table which contains the source data that I use in the data flow.
I deleted the column LastModifiedDate, because it had datatype datetime, and created the same column again with datatype datetime2.
I did this because I read that datetime2 has fewer problems with conversions.
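In T-SQL terms, what I did was roughly the following (a sketch; Noordwind.Products is the table from my query above):
ALTER TABLE Noordwind.Products DROP COLUMN LastModifiedDate;
ALTER TABLE Noordwind.Products ADD LastModifiedDate datetime2;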

Column Datatypes Issue in sql server 2019 when Import Flatfile using SSIS

I have a column in a flat file containing values like 2021-12-15T02:40:39+01:00.
When I tried to insert into a table whose column datatype is datetime2,
it threw this error:
The data conversion for column "Mycol" returned status value 2 and status text
"The value could not be converted because of a potential loss of data.".
What would be the best datatype for such values?
It seems the problem is two-fold here. One, the destination column for your value should be datetimeoffset(0), and two, SSIS doesn't support the format yyyy-MM-ddThh:mm:ss for a DT_DBTIMESTAMPOFFSET; the T causes it problems.
Therefore I suggest that you define the column, MyCol, in your Flat File Connection as a DT_STR. Then, in your data flow task, use a Derived Column transformation which replaces MyCol, using the following expression to replace the T with a space ( ):
(DT_DBTIMESTAMPOFFSET,0) (REPLACE(Mycol,"T"," "))
This will then cause the correct data type and value to be inserted into the database.
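For reference, a quick T-SQL check that the cleaned-up value fits datetimeoffset(0), using the sample value from the question:
DECLARE @v datetimeoffset(0) = '2021-12-15 02:40:39 +01:00';
SELECT @v;  -- returns 2021-12-15 02:40:39 +01:00, offset preserved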

Data Type Conversion causing error in SSIS Package

I have an SSIS package with a step that creates a temp table using an Execute SQL Task.
Inside this query I have a CASE statement that looks something like this:
CAST(CASE
    WHEN billing_address LIKE '%DONOTUSE%' THEN 1
    WHEN billing_address LIKE '%DONTUSE%' THEN 1
    ELSE 0
END AS nvarchar) DoNotUseAccounts
I have an UPDATE statement in a different Execute SQL Task that looks like this:
UPDATE #StatementAccounts
SET Issues = Issues + ' - Do Not Use Account'
WHERE Product IN ('prod1','prod2','prod3','prod4')
AND DoNotUseCustomer = 1
When executing the package, I receive the error: "Error: String or binary data would be truncated."
Am I using the wrong data type?
Does the Update statement need to be cast/converted as well?
Any guidance would be helpful.
I have tried using the datatypes int and numeric, and casting the UPDATE statement as an int as well.
You have one of two possible issues here.
1) You have an explicit CREATE TABLE #StatementAccounts statement where you're defining Issues as NVARCHAR either with no length specified (in which case it's one character) or with a length that's too small to accommodate the additional characters you're trying to append with your UPDATE statement.
FIX: Make the declaration at least LEN(' - Do Not Use Account') characters longer.
2) Much more likely from the sound of things, you're using a SELECT...INTO #StatementAccounts statement and letting SQL Server define your data types for you. In this case, it's setting Issues to be just big enough to accommodate the largest value in that initial statement.
FIX: Issue an explicit CREATE TABLE #StatementAccounts statement and declare appropriately sized data types, then change the SELECT...INTO to an INSERT INTO.
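A minimal sketch of that second fix, with hypothetical column names and sizes since the question doesn't show the full table definition:
CREATE TABLE #StatementAccounts
(
    Product nvarchar(50),
    Issues nvarchar(500),            -- wide enough for the appended ' - Do Not Use Account' text
    DoNotUseAccounts nvarchar(30)
);
-- ...then load it with INSERT INTO #StatementAccounts SELECT ... instead of SELECT ... INTO.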

Replacing Null Values Using Derived Column in SSIS

I have a data flow task which picks up data from a non-Unicode flat file and loads it into a SQL Server table destination.
I'm using a Derived Column task to replace NULL values in a date column with the string "1900-01-01". The destination table column is a varchar data type.
I'm using this SSIS expression: (DT_STR,10,1252)REPLACENULL(dateColumn,"1900-01-01"). The task executes successfully, but I still see NULLs instead of the "1900-01-01" string at the destination.
Why is this? I've tried replacing the column and adding a new column, but whatever I do I still see NULLs and not the replacement string. I can see my new derived column in the Advanced Editor, so I can see no reason why this isn't working. Any help would be most welcome.
If your source is non-Unicode, why are you using DT_STR? varchar is already a non-Unicode data type. You should just be able to do it with:
REPLACENULL(dateColumn,"1900-01-01")
Also, did you put in a lookup transformation to update the column? Have you made sure the right keys are being looked up and updated?

Integration Services double to string

I'm using Integration Services to load data from an Excel file into a SQL Server table. When I try to send a number stored as a double (DT_R8) into a database column where data is stored as varchar(50), I get odd rounding.
For example, consider the data in the first row, first column of the image above. The original value is 31.35, but as a string it's stored as shown below.
I already tried using a Derived Column transformation to cast to string before exporting to SQL, and I also added a ROUND(x, 5), but I get the same result.
How can I solve this problem given that I can't change SQL column data type?
The only working solution was changing the input type from double (DT_R8) to currency (DT_CY). It seems that the rounding performed on a double (DT_R8) makes it difficult to use when parsing is somehow involved in the export process.
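If the source type itself can't be changed, a Derived Column expression that routes the value through DT_CY before casting to a string might achieve the same effect (a sketch; MyDoubleColumn is a placeholder name):
(DT_STR,50,1252)((DT_CY)MyDoubleColumn)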