Partial Loading Due to " in data - Snowflake Issue

I haven't been able to find anything that describes this issue I am having, although, I am sure many have had this problem. It may be as simple as forcing pre-processing in Python before loading the data in.
I am trying to load data from S3 into Snowflake tables. I am seeing errors such as:
Numeric value '' is not recognized
Timestamp '' is not recognized
In the table definitions, these columns are set to DEFAULT NULL, so if there are NULL values here it should be able to handle them. I opened the files in Python to check on these columns and, sure enough, some of the rows (the exact number throwing an error in Snowflake) are NaNs.
Is there a way to correct for this in Snowflake?

Good chance you need to add something to your COPY INTO statement to get this to execute correctly. Try this parameter in your format options:
NULL_IF = ('NaN')
If you have more than just NaN values (like actual strings of 'NULL'), then you can add those to the list in the () above.
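For example, the format options can go inline in the COPY INTO itself; a minimal sketch, assuming a hypothetical stage and table (the FIELD_OPTIONALLY_ENCLOSED_BY option also covers quoted fields, which the question title hints at):
COPY INTO my_schema.my_table
FROM @my_s3_stage/path/
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' NULL_IF = ('NaN', 'NULL', ''));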

If you are having issues loading data into tables (from any source) and are experiencing a similar issue to the one described above, such that the error is telling you *datatype* '' is not recognized, then you will need to follow these instructions:
Go into the FILE_FORMAT you are using through the DATABASES tab
Select the FILE_FORMAT and click EDIT in the tool bar
Click on Show SQL on the bottom left window that appears, copy the statement
Paste the statement into a worksheet and alter the NULL_IF statement as follows
NULL_IF = ('\\N','');
Snowflake doesn't seem to recognize a completely empty value by default, so you need to add it as an option!
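The statement copied from Show SQL is typically a CREATE OR REPLACE FILE FORMAT; a minimal sketch of the altered version, with a placeholder format name:
CREATE OR REPLACE FILE FORMAT my_db.my_schema.my_csv_format
  TYPE = 'CSV'
  NULL_IF = ('\\N', '');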

How do I edit the BigQuery Connector for Excel .iqy file to have the SQL statement already in it instead of relying on input from an Excel cell?

I'm having an issue similar to bigquery excel connector - query larger than 256 char
However, I AM referencing a cell range and get the result:
"WARNING Request failed: Error. Unable to execute query. 400 { code : 400, errors : [ { domain : global, location : query, locationType : other, message : 1.593 - 1.593: No query found., reason : invalidQuery } ], message : 1.593 - 1.593: No query found. }"
Perhaps I'm "splitting" the query incorrectly? I assumed each cell only needed to be less than 256 characters, and it would just concatenate subsequent cells in the range specified to the end of the string in the preceding cells.
Every help document I've found shows simple SQL statements, and I can run simple ones, but the query I really need to work has a SELECT statement in the WHERE clause for a field. I've tried joining the table referenced in the WHERE clause to see if that makes the statement simpler and more easily recognizable as a query, but no luck.
I've tried opening the .iqy file (that BigQuery originally had me download) in Notepad to see if I could just input the query there, but I can't find any documentation on the syntax of these files, so when I load it into Excel it still prompts for the query to be entered.
The final result doesn't need to have the query read from a cell reference, in fact, if it could all just be in the .iqy file, that would be most preferable: less chance of users mucking up the data.
Make sure to URL encode the parameters (query, project and key) in the .iqy file. Use an online tool like https://meyerweb.com/eric/tools/dencoder
If you had the .iqy loaded already in Excel before making the above changes, you must delete the query definition: go into Properties, uncheck Save query definition, then connect to the .iqy again.
Not sure what the max size for q (the query) is for https://bigquery-connector.appspot.com, but I recommend using a BigQuery VIEW instead (see the sketch after this list):
it hides the SQL plumbing and hence reduces the size of the SQL passed to the API. URL encoding the query can then be as easy as replacing spaces with +
you can tune/change the view definition in BigQuery without having to rollout a new .iqy to your users
implement some sort of row-level security using CURRENT_USER()...
But that's another topic!
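As a rough sketch of the VIEW approach in today's standard SQL DDL (only mydataset.myview and FiscalYear come from the example below; the sales table and other columns are invented):
CREATE OR REPLACE VIEW mydataset.myview AS
SELECT FiscalYear, Region, SUM(Amount) AS TotalAmount
FROM mydataset.sales
GROUP BY FiscalYear, Region;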
Finally, coming back to the .iqy, you can combine and embed parameters in the query like so:
q=select+*+from+mydataset.myview+where+FiscalYear=["Year", "Enter a year:"]&p=myproject&k=myURLencodedKey

Can't convert String to Numeric/Decimal in SSIS

I have five or six OLE DB Sources with a String [DT_STR] column, with a length of 500 and 1252 (Latin) as the code page.
The values in the column look like 0,08 or 0,10, etc. As you can see, the decimal separator is a comma.
All of the sources are the same except one, which uses a POINT as the decimal separator. For that one it works when I set the Data Type in the advanced editor of the OLE DB Source. For one of the comma-separated sources it also works if I set the Data Type in the advanced editor of the OLE DB Source. BUT the weird thing is that it isn't working with the other sources, although they are the same (separated with a comma).
I tested Numeric(18,2) and decimal(2).
Another attempt to solve the problem with a Data Conversion task and/or a Derived Column task also failed.
I'm using SQL Server 2008 R2
I'm slowly starting to think SSIS is fooling me :)
Has anyone an idea?
/// EDIT
Here are two screenshots: one where it is working, and one where it isn't.
I would not set the Data Type in the Advanced Editor of the OLE DB Source. I would convert the data in the SQL code of the OLE DB Source, or in a Script Transformation, e.g. using Decimal.TryParse, which would populate a new column.
SSIS is unbelievably fussy over data types, and trying to mess with its internals is not productive.
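For instance, the conversion can be done in the source query itself; a rough sketch, where the Price column and dbo.SourceTable are made up and REPLACE handles the comma decimal separator:
SELECT CASE
           WHEN LTRIM(RTRIM(Price)) = '' THEN NULL                  -- blanks become NULL
           ELSE CAST(REPLACE(Price, ',', '.') AS NUMERIC(18, 2))    -- '0,08' becomes 0.08
       END AS Price
FROM dbo.SourceTable;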
Check whether there are any spaces between the commas; if there are, SSIS will throw an error trying to convert the blank space to a number. A blank space is not the same as nothing between the separators.
Redirect the error rows and output the data to a file. Then you can examine the data that is being rejected by SSIS and determine why it's causing the error.
Reasons for the error
1) NULLs are not properly handled, either in the destination database or during SSIS package creation. It is quite possible that the source contains NULL data but the destination does not accept it, leading to the error above.
2) The data types of the source and destination do not match. For example, the source column has varchar data and the destination column has an int data type. This can easily generate the error above. Certain data types will automatically convert to another data type without complaint, but incompatible data types will generate a "The value could not be converted because of a potential loss of data." error.
The issue arises when there is an unhandled space or null. I have worked around it using the conditional (ternary) operator, which checks the length:
LEN(TRIM([Column Name])) >= 1 ? (DT_NUMERIC,38,8)[Column Name] : 0

DB2 LOAD Modifier - GeneratedOverride or IdentityOverride

I am performing a DB2 load, and I am struggling to understand the impact of using GeneratedOverride over IdentityOverride. When I run the following command:
db2 load from tab123.ixf of ixf replace into application.table_abc
All rows are rejected, with the following error being the culprit:
SQL3550W The field value in row row-number and column column-number is not NULL, but the target column has been defined as GENERATED ALWAYS.
So to try and step around this, I executed:
db2 load from tab123.ixf of ixf modified by identityoverride replace into application.table_abc
But this immediately returned this error:
SQL3526N The modifier clause "IDENTITY OVERRIDE" is inconsistent with the current load command. Reason code: "3".
From checking the reason code I see that the issue is "Generated or identity related file type modifiers have been specified but the target table contains no such columns" ... but the SQL3550W error seems to imply that the columns are GENERATED ALWAYS!
The only way I can get these rows to commit to the table is to run:
db2 load from tab123.ixf of ixf modified by generatedoverride replace into application.table_abc
Can anyone enlighten me as to why I am receiving the SQL3526N error, or what the implications of running generatedoverride are?
Thanks for sticking with me..
Generated columns are not necessarily identity columns; apparently that's the case in your situation. Check the CREATE TABLE syntax to see what other ways there are to generate column values.
By using the GENERATEDOVERRIDE option during the load you are obviously replacing (overriding) the generated values with those from the input file.
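For illustration, a hypothetical table (all names invented) where id is an identity column and total_amount is a non-identity generated column; as far as I understand, identityoverride only relates to the former, while generatedoverride covers generated columns like the latter:
CREATE TABLE application.example_tab (
    id           INTEGER      NOT NULL GENERATED ALWAYS AS IDENTITY,           -- identity column
    net_amount   DECIMAL(9,2) NOT NULL,
    tax_amount   DECIMAL(9,2) NOT NULL,
    total_amount DECIMAL(9,2) GENERATED ALWAYS AS (net_amount + tax_amount)    -- expression-generated column
)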

Informix: Update Statement Error In WebSphere

I am trying to run this UPDATE statement, but Informix doesn't allow me to.
I have a table, named ITEMS and below I have selected some records from it.
SELECT SHORT_SKU, ITEMS."STYLE" FROM ITEMS;
SHORT_SKU STYLE
--------- -----
01846173 null
01811752 null
01811748 null
Trying to run the UPDATE statement below, Informix says syntax error.
UPDATE ITEMS SET ITEMS."STYLE" = 'M' WHERE SHORT_SKU = '01846173';
^ syntax error here
Then I changed (as below) and got "Column (style) not found in any table in the query (or SLV is undefined)."
UPDATE ITEMS SET STYLE = 'M' WHERE SHORT_SKU = '01846173';
How do I update the "STYLE" field?
UPDATE 1
I made a change to one of the WAS data source's custom properties, ifxDELIMIDENT. Originally it was blank, so I changed it to true and restarted WAS. After that I couldn't log in to our application. SQLExceptions were thrown by WAS, but I was not able to see the stack trace because WAS truncated the last few lines. After changing the property back to blank, I was able to log in to our application again.
I tried another approach, which was to write a Java class that updates the ITEMMST.STYLE column. I executed this from a shell script. In the shell script, I defined and exported the variable DELIMIDENT with the value 'Y'. But I am still getting 'Syntax error'.
UPDATE 2
I managed to update the column. This is done by adding the 'DELIMIDENT=Y' property at the end of the connection string which will be passed to the DriverManager object when opening the database connection.
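For reference, a JDBC connection URL with that property tacked on might look like this (host, port, database, and server names are placeholders):
jdbc:informix-sqli://dbhost.example.com:1526/mydb:INFORMIXSERVER=my_ifx_server;DELIMIDENT=Y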
But, this won't work for our web application because it uses the WebSphere data source to create the db connection. It would be super if there's a way to set this property in the Informix environment itself.
Try:
UPDATE ITEMS SET "STYLE" = 'M' WHERE SHORT_SKU = '01846173';
It must be that STYLE is a reserved word so you must double-quote it to refer to the column. But standard UPDATE syntax doesn't allow you to prefix column names with the table name in the SET clause (since you can only be updating the columns of one table: the table mentioned in the UPDATE).
The right syntax would be:
UPDATE ITEMS SET STYLE = 'M' WHERE SHORT_SKU = '01846173';
as stated in the IBM documentation, but as STYLE is a reserved word I guess you're getting problems; read the IBM recommendation on this.
Anyway, you may find a workaround at this link; otherwise you may consider changing the column name.
I am not aware of STYLE being a keyword in Informix (but I haven't gone to look for it). However, you can usually use keywords as column names etc without much problem.
If you must quote it, you need to set the DELIMIDENT environment variable - the value doesn't matter, but use DELIMIDENT=1 for concreteness. This enables SQL standard 'delimited identifiers', where double quotes surround identifiers (column names, table names, etc) and single quotes surround strings. (Normally, you can use either single quotes or double quotes around strings.)
One other point: if you use delimited identifiers, they also become case-sensitive (whereas normally, identifiers are case-insensitive). So you need to know how the STYLE column is stored in the system catalog. In most databases, they will be in lower-case. There's an outside chance that in a MODE ANSI database, they are stored in upper-case (but it is a while since I looked to make sure).
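So, with delimited identifiers enabled, if the catalog stores the names in lower case, the quoted identifier has to match that case exactly, for example:
UPDATE items SET "style" = 'M' WHERE short_sku = '01846173';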
Use this query:
UPDATE ITEMS SET ITEMS.STYLE = 'M' WHERE SHORT_SKU = '01846173';
I think double quotes are not required for the column name.
Updated Answer 1:
Error description:
-217 Column column-name not found in any table in the query
(or SLV is undefined).
The name appears in the select list or WHERE clause of this query but is
not defined in a table and does not appear as a statement local variable
(SLV) definition. Check that the column name or SLV name and the names of
the selected tables are spelled as you intended.
If all names are spelled correctly, you are not using the right tables,
the database has been changed, or you have not defined the SLV. If the
name not found is a reference to a column, that column might have been
renamed or dropped. If the name not found represents an SLV and you
defined the SLV in the statement, make sure that the SLV definition
appears before all other references to that SLV name.
This error message can also appear during the execution of an ALTER TABLE
statement when the engine tries to update views that depend on the table.
More info link
Updated Answer 2:
If it is not possible to change the column name, then get more information about SLVs.
You can refer to the following links for a description and usage of SLVs:
link1
link2
link3
There are 2 solutions for this.
Set the Informix JDBC data source 'ifxDELIMIDENT' property to 'true'
Rename the affected table columns
For the 1st option, we had a problem after setting the data source property to 'true': suddenly none of our queries worked. After much troubleshooting, we found out that setting the 'ifxDELIMIDENT' property to 'true' also makes Informix identifiers case sensitive. In our Java code we reference all the column names in uppercase (Example: resultSet.getString("STYLE")), but the table column names in Informix are lowercase (Example: 'style'). This was why, after changing this property, we were not able to log in to our application. Unfortunately this behavior was not documented anywhere in IBM's Info Centre nor on the internet.
We opted for the 2nd option which involved changing the affected column names to another column name (Example: Changed 'STYLE' to 'ITEM_STYLE').

Import Package Error - Cannot Convert between Unicode and Non Unicode String Data Type

I have made a dtsx package on my computer using SQL Server 2008. It imports data from a semicolon-delimited CSV file into a table where all of the field types are NVARCHAR(MAX).
It works on my computer, but it needs to run on the clients server. Whenever they create the same package with the same csv file and destination table, they receive the error above.
We have gone through the creation of the package step by step, and everything seems OK. The mappings are all correct, but when they run the package in the last step, they receive this error. They are using SQL Server 2005.
Can anyone advise where to begin looking for this problem?
The problem of converting from any non-unicode source to a unicode SQL Server table can be solved by:
add a Data Conversion transformation step to your Data Flow
open the Data Conversion and select Unicode for each data type that applies
take note of the Output Alias of each applicable column (they are named Copy Of [original column name] by default)
now, in the Destination step, click on Mappings
change all of your input mappings to come from the aliased columns in the previous step (this is the step that is easily overlooked and will leave you wondering why you are still getting the same errors)
At some point, you're trying to convert an nvarchar column to a varchar column (or vice-versa).
Moreover, why is everything (supposedly) nvarchar(max)? That's a code smell if I ever saw one. Are you aware of how SQL Server stores those columns? The rows hold pointers to where the column data is actually stored, since it doesn't fit within the 8 KB pages.
Non-Unicode string data types:
Use STR for text file and VARCHAR for SQL Server columns.
Unicode string data types:
Use W_STR for text file and NVARCHAR for SQL Server columns.
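In table terms, the pairing looks roughly like this (a made-up example table):
CREATE TABLE dbo.ImportTarget (
    AnsiColumn    VARCHAR(50),   -- pairs with DT_STR in the flat file source
    UnicodeColumn NVARCHAR(50)   -- pairs with DT_WSTR in the flat file source
);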
The problem is that your data types do not match, so there could be a loss of data during the conversion.
Two solutions:
1- if the type of the target column is [nvarchar], it should be changed to [varchar]
2- Add a "Derived Column" component to the SSIS package and add a new column with the following expression:
(DT_WSTR, «length») [ColumnName]
Length is the length of the column in the target table and ColumnName is the name of the column in the target table.
Finally, in the mapping step, you should use this newly added column instead of the original column.
Not sure if this is a best practice with SSIS but sometimes I find their tools are a bit clunky when you want to do this type of activity.
Instead of using their components, you can convert the data within your query.
Instead of doing
SELECT myField = myNvarchar20Field
FROM myTable
You could do
SELECT myField = CONVERT(VARCHAR(20),myNvarchar20Field)
FROM myTable
This is a solution that uses the IDE to fix it:
Add a Data Conversion item to your data flow.
Double-click on the Data Conversion item and set the applicable columns to a Unicode string type.
Now double-click on the DB Destination item, click on Mapping, and ensure that your input column is actually the one coming from Copy of [your column name], which is in fact the Data Conversion output, NOT the DB Source output (be careful here).
And that's it ... save and run.
Mike, I had the same problem with SSIS in SQL Server 2005...
Apparently, the DataFlowDestination object will always attempt to validate the data coming in as Unicode.
Go to that object, open the Advanced Editor, and on the Component Properties pane change the "ValidateExternalMetaData" property to False. Now, go to the Input and Output Properties pane, Destination Input, External Columns, and set each column's Data Type and Length to match the database table it's going to. When you close that editor, those column changes will be saved and not validated over, and it will work.
Follow the steps below to avoid the "cannot convert between unicode and non-unicode string data types" error:
i) Add the Data Conversion transformation to your data flow.
ii) Open the Data Conversion and select the [string DT_STR] data type.
iii) Then go to the destination, and select Mappings.
iv) Change your input column to the "Copy of" column.
Go to the registry configuration of the client and change the language setting.
For Oracle, go to HKLM\SOFTWARE\ORACLE\KEY_ORACLIENT...HOME\NLS_LANG and change it to the appropriate language.
The Data Conversion task is time-consuming if there are 50-plus columns! I found a fix for this at the link below:
http://rdc.codeplex.com/releases/view/48420
However, it does not seem to work for versions above 2008, so this is how I had to work around the problem:
*Open the .DTSX file in Notepad++. Choose the language as XML
*Go to the <DTS:FlatFileColumns> tag. Select all items within this tag
*Find the string **DTS:DataType="129"** and replace it with **DTS:DataType="130"**
*Save the .DTSX file.
*Open the project again in Visual Studio BIDS
*Double-click on the Source task. You will get the message
the metadata of the following output columns does not match the metadata of the external columns with which the output columns are associated:
...
Do you want to replace the metadata of the output columns with the metadata of the external columns?
*Now click Yes. We are done!
Resolved - to the original ask:
I've seen this before. Easiest way to fix (don't need all those data conversion steps as ALL of the meta data is available from the source connection):
Delete the OLE DB Source & OLE DB Destinations
Make sure Delayed Validation is FALSE (you can set it to True later)
Recreate the OLE DB Source with your query, etc.
Verify in the Advanced Editor that all of the output data column types are correct
Recreate your OLE DB Destination, map, create new table (or remap to existing) and you'll see that SSIS got all the data types correct (same as source).
So much easier than the stuff above.
Not sure if this is still a problem but I found this simple solution:
Right-Click Ole DB Source
Select 'Edit'
Select Input and Output Properties Tab
Under "Inputs and Outputs", Expand "Ole DB Source Output" External Columns and Output Columns
In Output columns, select offending field, on the right-hand panel ensure Data Type Property matches that of the field in External Columns properties
Hope this was clear and easy to follow
Sometimes we get this error when we select a static string as a field in the source query/view/procedure and the destination field's data type is Unicode.
Below is the issue I faced:
I used a script at the source that selected the static values without the N prefix, and got the error message Column "CATEGORY" cannot convert between Unicode and non-Unicode string data types.
Resolution:
I tried multiple options but none worked for me. Then I prefixed the static value with N to make it Unicode, as below:
SELECT N'STUDENT DETAIL' CATEGORY, NAME, DATEOFBIRTH FROM STUDENTS
UNION
SELECT N'FACULTY DETAIL' CATEGORY, NAME, DATEOFBIRTH FROM FACULTY
If anyone is still experiencing this issue, I found that it related to a difference in Oracle Client versions.
I have posted my full experience and solution here: https://stackoverflow.com/a/43806765/923177
1. Add a Data Conversion tool from the toolbox.
2. Open it; it shows all the columns from the Excel source. Convert them to the desired output type, and take note of the Output Alias of each applicable column (they are named Copy of [original column name] by default).
3. Now, in the Destination step, click on Mappings.
I changed ValidateExternalMetadata=False for each transformation task. It worked for me.