SQL Server 2012 - SSIS - Problems querying strings from Unicode-only cultures - sql

I am new to dealing with languages in SQL Server and this forum...
The following query:
SELECT [CultureCode], [Target]
FROM [Str].[dbo].[LatestReversal]
where [Target] = N''
CultureCode | Target
am-ET | ማዕከላዊ የብራዚል የቀን ብርሃን ጊዜ
am-ET | ፓስፊክ የቀን ብርሃን ጊዜ
...
Expected:
The query in the code snippet runs against a table that forms the first step of an ETL process. I would expect it to return only rows where Target is an empty string.
Result:
However, I get back 900+ rows that have a value in the Target field. All of these strings are from Unicode-only cultures, i.e. Windows does not have a specific code page for them. Can someone explain why this is happening?
The CSV file was UTF-8, and I have also tried Unicode format.
The SSIS job uses NText for the Source and Destination connection data types (no conversion)
The Target field in the DB is nvarchar(Max)
Later in the process I try to process a cube allowing for failures. The exact same strings fail the load process.
Any help appreciated, even pointers on other special handling I would need to apply for these cultures.
Cheers,
Seamus

You can use the IS NULL predicate to retrieve only the rows where the Target field has no value:
SELECT [CultureCode], [Target]
FROM [Str].[dbo].[LatestReversal]
where [Target] IS NULL;
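That filters out NULLs, but it does not explain why = N'' matched those rows in the first place. A likely explanation (an assumption on my part, since it depends on your database collation): characters that have no defined sort weight in older collations are ignored during comparisons, so strings from Unicode-only cultures can compare equal to the empty string. A binary collation compares raw code points, so you can test this against the table from the question:
-- If the default collation lacks sort weights for these characters,
-- this binary comparison should return only truly empty strings.
SELECT [CultureCode], [Target]
FROM [Str].[dbo].[LatestReversal]
WHERE [Target] = N'' COLLATE Latin1_General_BIN2;
If the 900+ rows disappear under the binary collation, the missing sort weights in the default collation are the culprit.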

Related

Search and Replace a partial string / substring in mssql tables

I was tasked with moving an installation of Orchard CMS to a different server and domain. All the content (page content, menu structure, links, etc.) is stored in an MSSQL database. The good part: When moving the physical files of the Orchard installation to the new server, the database will stay the same, no need to migrate it. The bad thing: There are lots and lots of absolute URLs scattered all over the pages and menus.
I have isolated / pinned down the tables and fields in which the URLs occur, but I lack the (MS)SQL experience/knowledge to do a "search - replace". So I come here for help. (I have tried exporting the tables to .sql files, doing a search-replace in a text editor, and then re-importing the .sql files into the database, but ran into several syntax errors... so I need to do this the "SQL way".)
To give an example:
The table Common_BodyPartRecord has the field Text of type ntext that contains HTML content. I need to find every occurrence of the partial string /oldserver.com/foo/ and replace it with /newserver.org/bar/. There can be multiple occurrences of the pattern within the same table entry.
(In total I have 5 patterns that will need replacing, all partial string / substrings of urls, domains/paths, etc.)
I usually do frontend stuff and came to this assignment by chance. I used MySQL back in the day when I was playing around with PHP-related stuff, but never got past the basics of SQL - it would be helpful if you could keep your explanations more or less newbie-friendly.
The SQL Server version is 9.0.4053 (SQL Server 2005); I have access to the database via Microsoft SQL Server Management Studio 12.
Any help is highly appreciated!
You can't manipulate the NTEXT data type directly, but you can CAST it to NVARCHAR(MAX), use the REPLACE function to perform the string replacement, then CAST it back to NTEXT. This can all be done in a single UPDATE statement.
UPDATE MyTable
SET MyColumn = CAST(REPLACE(CAST(MyColumn AS NVARCHAR(MAX)), N'/oldserver.com/foo/', N'/newserver.org/bar/') AS NTEXT)
WHERE CAST(MyColumn AS NVARCHAR(MAX)) LIKE N'%/oldserver.com/foo/%'
The WHERE clause in the UPDATE statement above is used to prevent SQL Server from making non-changes, i.e. if the value does not need to be changed then there is no need to update it to itself.
The CAST function is used to change the data type of a value. NTEXT is a legacy data type used for storing large character values; NVARCHAR(MAX) is a newer and more versatile data type for storing large character values. The REPLACE function cannot operate on NTEXT values, hence the need to CAST to NVARCHAR(MAX) first, do the replace, then CAST back to NTEXT afterwards.
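Applied to the table and field named in the question, the statement would look something like this (a sketch; repeat once per pattern, and test on a backup first):
-- Only rows that actually contain the old URL fragment are updated.
UPDATE Common_BodyPartRecord
SET [Text] = CAST(REPLACE(CAST([Text] AS NVARCHAR(MAX)), N'/oldserver.com/foo/', N'/newserver.org/bar/') AS NTEXT)
WHERE CAST([Text] AS NVARCHAR(MAX)) LIKE N'%/oldserver.com/foo/%';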

Can't convert String to Numeric/Decimal in SSIS

I have five or six OLE DB Sources with a String[DT_STR], with a length of 500 and 1252 (Latin) as Code Page.
The format of the column is like 0,08 or 0,10, etc. As you can see, the decimal separator is a comma.
All of the sources are equal except one. That one source uses a POINT as the separator, and it works when I set the data type in the Advanced Editor of the OLE DB Source. On another source (comma-separated) it also works if I set the data type in the Advanced Editor. BUT the weird thing is that it isn't working for the other sources, although they are the same (separated with a comma).
I tested Numeric(18,2) and decimal(2).
Another attempt to solve the problem with a Data Conversion task and/or a Derived Column task also failed.
I'm using SQL Server 2008 R2
Slowly, I think SSIS is fooling me :)
Has anyone an idea?
/// EDIT
Here are two screenshots:
Is working:
click
Isn't working:
click
I would not set the Data Type in the Advanced Editor of the OLE DB Source. I would convert the data in the SQL code of the OLE DB Source, or in a Script Transformation, e.g. using Decimal.TryParse, which would populate a new column.
SSIS is unbelievably fussy over data types, and trying to mess with its internals is not productive.
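A minimal sketch of the SQL-side approach (the table and column names here are made up; REPLACE, NULLIF, and LTRIM/RTRIM are standard T-SQL and work on 2008 R2):
-- Trim whitespace, turn blanks into NULL, swap the comma separator for
-- a point, and cast, so SSIS receives a DECIMAL instead of a string.
SELECT CAST(REPLACE(NULLIF(LTRIM(RTRIM(Amount)), ''), ',', '.') AS DECIMAL(18, 2)) AS Amount
FROM dbo.SourceTable;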
Check whether there are any spaces between the commas; SSIS may be throwing the error while trying to convert a blank space to a number. A blank space is not the same as nothing between the separators.
Redirect the error rows and output the data to a file. Then you can examine the data that is being rejected by SSIS and determine why it's causing the error.
Reasons for the error:
1) NULLs are not handled properly, either in the destination database or during SSIS package creation. It is quite possible that the source contains NULL data but the destination does not accept NULLs, leading to the error above.
2) The data types of the source and destination do not match. For example, the source column contains varchar data while the destination column has an int data type. This can easily generate the error above. Certain data types convert automatically to another data type without generating an error, but incompatible data types generate the error "The value could not be converted because of a potential loss of data."
The issue arises when there is an unhandled space or NULL. I worked around it using the conditional (ternary) operator, which checks the length:
LEN(TRIM([Column Name])) >= 1 ? (DT_NUMERIC,38,8)[Column Name] : 0

Using dynamic SQL in an OLE DB source in SSIS 2012

I have a stored proc as the SQL command text, which is getting passed a parameter that contains a table name. The proc then returns data from that table. I cannot call the table directly as the OLE DB source because some business logic needs to happen to the result set in the proc. In SQL 2008 this worked fine. In an upgraded 2012 package I get "The metadata could not be determined because ... contains dynamic SQL. Consider using the WITH RESULT SETS clause to explicitly describe the result set."
The problem is I cannot define the field names in the proc because the table name that gets passed as a parameter can be a different value, and the resulting fields can be different every time. Has anybody encountered this problem or have any ideas? I've tried all sorts of things with dynamic SQL using "dm_exec_describe_first_result_set", temp tables, and CTEs that contain WITH RESULT SETS, but it doesn't work in SSIS 2012; same error. Context is a problem with a lot of the dynamic SQL approaches.
This is the latest thing I tried, with no luck:
DECLARE @sql VARCHAR(MAX)
SET @sql = 'SELECT * FROM ' + @dataTableName
DECLARE @listStr VARCHAR(MAX)
SELECT @listStr = COALESCE(@listStr + ',', '') + [name] + ' ' + system_type_name
FROM sys.dm_exec_describe_first_result_set(@sql, NULL, 1)
EXEC('EXEC(''SELECT * FROM myDataTable'') WITH RESULT SETS ((' + @listStr + '))')
So I ask out of kindness, but why on God's green earth are you using an SSIS Data Flow task to handle dynamic source data like this?
The reason you're running into trouble is because you're perverting every purpose of an SSIS Data flow task:
to extract a known source with known metadata that can be statically typed and cached in design-time
to run through a known process with straightforward (and ideally asynchronous) transformations
to take that transformed data and load it into a known destination also with known metadata
It's fine to have parameterized data sources that bring back different data. But to have them bring back entirely different metadata each time with no congruity between the different sets is, frankly, ridiculous, and I'm not entirely sure I want to know how you handled all your column metadata in the working 2008 package.
This is why it wants you to add a WITH RESULT SETS clause to the SSIS query - so it can generate some metadata. It doesn't do this at runtime - it can't! It has to have a known set of columns (because it aliases them all into compiled variables anyway) to work with. It expects the same columns every time it runs that Data Flow Task - the exact same columns, down to the names, the types, and the constraints.
Which leads to one (terrible, terrible) solution - just stick all the data into a temporary table with Column1, Column2 ... ColumnN and then use the same variable you're using as the table name parameter to conditionally branch your code and do whatever you want with the columns.
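A sketch of that workaround, with hypothetical names and the (strong) assumption that every source table exposes exactly three columns convertible to NVARCHAR:
DECLARE @dataTableName SYSNAME = N'myDataTable';
-- Fixed-shape staging table: the shape never changes, so SSIS sees
-- stable metadata no matter which source table is loaded.
CREATE TABLE #Staging
(
    Column1 NVARCHAR(MAX) NULL,
    Column2 NVARCHAR(MAX) NULL,
    Column3 NVARCHAR(MAX) NULL
);
DECLARE @sql NVARCHAR(MAX) = N'INSERT INTO #Staging SELECT * FROM ' + QUOTENAME(@dataTableName);
EXEC sys.sp_executesql @sql;
-- Downstream, branch on @dataTableName to decide what each column means.
SELECT Column1, Column2, Column3 FROM #Staging;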
Another more sane solution would be to create a data flow task for each of your source tables, and use your parameter in a precedence constraint to just pick which data flow task should run.
For a solution this poorly tailored for an out-of-the-box ETL, you should also highly consider just rolling your own in C# or a script task instead of the Data Flow Task provided by SSIS.
In short, please don't do this. Think of the children (packages)!
I've used CozyRoc Dynamic DataFlow Plus to achieve this.
Using configuration tables to build the SQL Select statements, I have a single SSIS package that loads data from Oracle and Sybase (or any OLEDB source) to MS SQL. Some of the result sets are in the millions of rows and performance is excellent.
Instead of writing a new package every time a new table is needed, this can be configured in minutes and run on the pre-tested and robust existing package.
Without it I would have been up for writing hundreds of packages.
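For illustration, a configuration table of that sort might look like this (a hypothetical schema, not CozyRoc's actual format):
-- One row per extraction; the generic package reads these at runtime
-- to build its SELECT statements.
CREATE TABLE dbo.ExtractConfig
(
    SourceName  SYSNAME       NOT NULL PRIMARY KEY,
    SourceQuery NVARCHAR(MAX) NOT NULL, -- SELECT sent to Oracle/Sybase
    TargetTable SYSNAME       NOT NULL  -- destination table on MS SQL
);
INSERT INTO dbo.ExtractConfig (SourceName, SourceQuery, TargetTable)
VALUES (N'OracleOrders', N'SELECT * FROM ORDERS', N'dbo.Staging_Orders');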

Import Package Error - Cannot Convert between Unicode and Non Unicode String Data Type

I have made a dtsx package on my computer using SQL Server 2008. It imports data from a semicolon-delimited CSV file into a table where all of the field types are NVARCHAR(MAX).
It works on my computer, but it needs to run on the client's server. Whenever they create the same package with the same CSV file and destination table, they receive the error above.
We have gone through the creation of the package step by step, and everything seems OK. The mappings are all correct, but when they run the package in the last step, they receive this error. They are using SQL Server 2005.
Can anyone advise where to begin looking for this problem?
The problem of converting from any non-unicode source to a unicode SQL Server table can be solved by:
add a Data Conversion transformation step to your Data Flow
open the Data Conversion and select Unicode for each data type that applies
take note of the Output Alias of each applicable column (they are named Copy Of [original column name] by default)
now, in the Destination step, click on Mappings
change all of your input mappings to come from the aliased columns in the previous step (this is the step that is easily overlooked and will leave you wondering why you are still getting the same errors)
At some point, you're trying to convert an nvarchar column to a varchar column (or vice-versa).
Moreover, why is everything (supposedly) nvarchar(max)? That's a code smell if I ever saw one. Are you aware of how SQL Server stores those columns? Values too large to fit in the 8 KB data page are stored off-row, with the row holding a pointer to them.
Non-Unicode string data types:
Use DT_STR for the text file columns and VARCHAR for the SQL Server columns.
Unicode string data types:
Use DT_WSTR for the text file columns and NVARCHAR for the SQL Server columns.
The problem is that your data types do not match, so there could be a loss of data during the conversion.
Two solutions:
1- If the type of the target column is [nvarchar], change it to [varchar].
2- Add a "Derived Column" component to the SSIS package and add a new column with the following expression:
(DT_WSTR, «length») [ColumnName]
Length is the length of the column in the target table and ColumnName is the name of the column in the target table. For example, for an NVARCHAR(50) target column the expression would be (DT_WSTR, 50)[ColumnName].
Finally, in the mapping step, use this newly added column instead of the original column.
Not sure if this is a best practice with SSIS but sometimes I find their tools are a bit clunky when you want to do this type of activity.
Instead of using their components you can convert the data within your query
Instead of doing
SELECT myField = myNvarchar20Field
FROM myTable
You could do
SELECT myField = CONVERT(VARCHAR(20),myNvarchar20Field)
FROM myTable
This is a solution that uses the IDE to fix it:
Add a Data Conversion item to your data flow.
Double-click on the Data Conversion item and set the conversion for each affected column.
Now double-click on the DB Destination item, click on Mappings, and ensure that your input column is actually the one coming from Copy of [your column name], which is the Data Conversion output, NOT the DB Source output (be careful here). (Screenshots are in the original post.)
And that's it... save and run.
Mike, I had the same problem with SSIS in SQL Server 2005...
Apparently, the DataFlowDestination object will always attempt to validate the incoming data as Unicode. Go to that object, Advanced Editor, Component Properties pane, and change the "ValidateExternalMetaData" property to False. Now go to the Input and Output Properties pane, Destination Input, External Columns, and set each column's Data Type and Length to match the database table it is going to. When you close that editor, those column changes will be saved rather than validated away, and it will work.
Follow the steps below to avoid the "cannot convert between unicode and non-unicode string data types" error:
i) Add the Data Conversion transformation to your data flow.
ii) Open the Data Conversion and select the [string DT_STR] data type.
iii) Then go to the destination and select Mappings.
iv) Change your input mapping to use the Copy of [column name] output.
Go to the client's registry configuration and change the LANG.
For Oracle, go to HKLM\SOFTWARE\ORACLE\KEY_ORACLIENT...HOME\NLS_LANG and change it to the appropriate language.
The DTS Data Conversion task is time-consuming if there are 50-plus columns! I found a fix for this at the link below:
http://rdc.codeplex.com/releases/view/48420
However, it does not seem to work for versions above 2008, so this is how I had to work around the problem:
*Open the .DTSX file in Notepad++ and choose XML as the language.
*Go to the <DTS:FlatFileColumns> tag and select all items within this tag.
*Find the string DTS:DataType="129" and replace it with DTS:DataType="130" (129 is the OLE DB type code for DT_STR, 130 for DT_WSTR).
*Save the .DTSX file.
*Open the project again in Visual Studio BIDS.
*Double-click on the Source task. You will get the message:
the metadata of the following output columns does not match the metadata of the external columns with which the output columns are associated:
...
Do you want to replace the metadata of the output columns with the metadata of the external columns?
*Now click Yes. We are done!
Resolved - to the original ask:
I've seen this before. The easiest way to fix it (you don't need all those data conversion steps, as all of the metadata is available from the source connection):
Delete the OLE DB Source & OLE DB Destinations
Make sure Delayed Validation is FALSE (you can set it to True later)
Recreate the OLE DB Source with your query, etc.
Verify in the Advanced Editor that all of the output data column types are correct
Recreate your OLE DB Destination, map, create new table (or remap to existing) and you'll see that SSIS got all the data types correct (same as source).
So much easier than the stuff above.
Not sure if this is still a problem but I found this simple solution:
Right-Click Ole DB Source
Select 'Edit'
Select Input and Output Properties Tab
Under "Inputs and Outputs", Expand "Ole DB Source Output" External Columns and Output Columns
In Output columns, select offending field, on the right-hand panel ensure Data Type Property matches that of the field in External Columns properties
Hope this was clear and easy to follow
Sometimes we get this error when we select a static string literal as a field in the source query/view/procedure and the destination field's data type is Unicode.
Below is the issue I faced:
I used a script like the one below (without the N prefix) at the source
and got the error message "Column "CATEGORY" cannot convert between Unicode and non-Unicode string data types." (screenshot in the original post).
Resolution:
I tried multiple options, but none worked for me. Then I prefixed the static value with N to make it Unicode, as below:
SELECT N'STUDENT DETAIL' CATEGORY, NAME, DATEOFBIRTH FROM STUDENTS
UNION
SELECT N'FACULTY DETAIL' CATEGORY, NAME, DATEOFBIRTH FROM FACULTY
If anyone is still experiencing this issue, I found that it was related to a difference in Oracle Client versions.
I have posted my full experience and solution here: https://stackoverflow.com/a/43806765/923177
1. Add a Data Conversion tool from the toolbox.
2. Open it; it shows all columns from the Excel file. Convert each to the desired output, and take note of the Output Alias of each applicable column (they are named Copy of [original column name] by default).
3. Now, in the destination step, click on Mappings.
I changed ValidateExternalMetadata=False for each transformation task. It worked for me.

Manually inserting varbinary data into SQL Server

We have a SQL Server table for user settings. Originally the settings were domain objects which had been serialized as XML into the table, but we recently began serializing them as binary.
However, as part of our deployment process we statically pre-populate the table with predefined settings for our users. Originally, this was as simple as copying the XML from a customized database and pasting it into an INSERT statement that was run after the database was built. However, since we've moved to storing the settings as binary data, we can't get this to work.
How can we extract binary data from a varbinary column in SQL Server and paste it into a static INSERT script? We only want to use SQL for this, we don't want to use any utilities.
Thanks in advance,
Jeremy
You may find it easier to store a template value in a config table somewhere, then read it into a variable and use that variable to fill your inserts:
DECLARE @v varbinary(1000)
SELECT @v = templatesettings FROM configtable
INSERT INTO usertable VALUES(name, @v, ....)
From SQL Server 2008 onwards you can use Tasks > Generate Scripts and choose to include data. That gives you INSERT statements for all rows in a table which you can modify as needed.
Here are the steps for SQL 2008. Note that in SQL 2008 R2 the "Script Data" option is called "Types of data to script" instead.
I presume you're OK with utilities like Query Analyzer/Management Studio?
You can just copy and paste the binary value returned by your SELECT statement (make sure that you are returning sufficient data), and prefix it with 0x in your script.
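For example, the resulting static INSERT would look something like this (the table, column, and hex value are made up):
-- The 0x literal is pasted straight from the SELECT output.
INSERT INTO dbo.UserSettings (UserName, Settings)
VALUES (N'default-template', 0x0001000000FFFFFFFF01000000);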
If I understand you correctly, you want to generate a static script from your data. If so, consider performing a query on the old data that concatenates strings to form the SQL statements you'll want in the script.
First, figure out what you want the scripted result to look like. Note that you'll need to think of the values you're inserting as constants. For example:
INSERT INTO NewTable VALUES ('value1', 'value2')
Now, create a query for the old data that just gets the values you'll want to move, like this:
SELECT value1, value2
FROM OldTable
Finally, update your query's SELECT statement to produce a single concatenated string in the form of the output you previously defined:
SELECT 'INSERT INTO NewTable VALUES (''' + value1 + ''', ''' + value2 + ''')'
FROM OldTable
It's a convoluted way to do business, but it gets the job done. You'll need close attention to detail. It allows a small (but confusing) query to quickly output very large numbers of static DML statements.
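For the varbinary column from the question, CONVERT with style 1 (available from SQL Server 2008 onwards) renders the value as a 0x-prefixed hex string, so the generated INSERT embeds the binary literal directly (the column names here are guesses):
-- Style 1 produces '0x...' for binary values, ready to paste into a script.
SELECT 'INSERT INTO NewTable VALUES (N''' + name + ''', ' + CONVERT(VARCHAR(MAX), settings, 1) + ')'
FROM OldTable;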
David M's suggestion of using the 0x prefix works, but I had to add an extra 0 at the end of the varbinary data that I was trying to insert.
See the Stack Overflow entry below for the issue with the additional 0 that gets added when converting to varbinary or saving to a varbinary column:
Insert hex string value to sql server image field is appending extra 0