XML data type in SQL Server? - sql

I have some unstructured data that describes the settings for each device. For example, for Device 1:
<ChartSetting xmlns="ChartSetting1">
<ChartSizeX>132.6</ChartSizeX>
<ChartSizeY>132.6</ChartSizeY>
<ChartType>Native</ChartType>
<BarSizeX>90</BarSizeX>
<BarSizeY>6</BarSizeY>
<DistToBar>34.8</DistToBar>
<DistToFirstLineY>17.5</DistToFirstLineY>
<MarkerDistance>120</MarkerDistance>
<DistToFirstField>18.5</DistToFirstField>
<PatchSizeX>7.5</PatchSizeX>
<PatchSizeY>9</PatchSizeY>
</ChartSetting>
However, the settings for Device 2 are different:
<ChartSetting xmlns="ChartSetting2">
<PatchGap>1</PatchGap>
<PatchSize>5</PatchSize>
</ChartSetting>
The XML data is not used for querying. It will be passed to the .NET application, which eventually sends it to the device driver (through C++ code).
We only have four types of device, so the number of different settings is limited.
Can this kind of data be stored as typed XML that adheres to some schema? Storing it as varchar will be a nightmare if invalid settings get stored.
I know an XML schema collection can contain multiple schemas, but can a value be required to conform to only one of them?
Is there an easy way to create a schema?
Should I be using untyped XML instead?

You could create a DTD for the four different device types and validate your XML fragments with those DTDs, but I'd suggest you do that processing OUTSIDE of SQL Server.
If you are not going to query the XML, there isn't any reason to store it as the XML data type.
Answers to your questions:
Typed XML: you'll have to use a specific type per column (see http://msdn.microsoft.com/en-us/library/ms184277.aspx).
Create a schema: check out http://msdn.microsoft.com/en-us/library/ms176009.aspx and try something like XMLSpy if it gets too complex. Your snippets looked small, so I should think you can manage it with Notepad/++.
Untyped XML: I'd use untyped XML and validate it when it gets stored and/or retrieved.

1. I think it's possible: if your schema collection contains 4 different schemas, the related column can contain XML that satisfies any one of the schemas in the collection.
2. I believe the easiest way is to let SQL Server create a schema for you. First create tables ChartSetting1 and ChartSetting2 with the desired columns (this is not strictly necessary, since you can write a SELECT that returns all the columns, but using tables is easier), then:
DECLARE @mySchema NVARCHAR(MAX);
SET @mySchema = N'';
SET @mySchema = @mySchema +
(
SELECT * FROM ChartSetting1
FOR XML AUTO, ELEMENTS, XMLSCHEMA('ChartSetting1')
);
SET @mySchema = @mySchema +
(
SELECT * FROM ChartSetting2
FOR XML AUTO, ELEMENTS, XMLSCHEMA('ChartSetting2')
);
CREATE XML SCHEMA COLLECTION [name] AS @mySchema;
-- we don't need these tables any more, so drop them
DROP TABLE ChartSetting1, ChartSetting2;
3. It depends. I'd say that if you need to constrain the XML data beyond what schema validation can express, then it makes sense to use untyped XML. If schema validation covers all aspects of data validation, why not use it?
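A minimal sketch of the typed-column approach described above, assuming a hand-written XSD for the Device 2 settings. The collection name ChartSettingSchemas, the table dbo.DeviceSetting and the choice of xs:decimal are illustrative assumptions, not taken from the question.

```sql
-- Hypothetical collection name; the XSD below is hand-written for Device 2 only.
CREATE XML SCHEMA COLLECTION ChartSettingSchemas AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="ChartSetting2"
             elementFormDefault="qualified">
    <xs:element name="ChartSetting">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="PatchGap"  type="xs:decimal"/>
          <xs:element name="PatchSize" type="xs:decimal"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>';
GO

-- Schemas for the other device types could be added later with
-- ALTER XML SCHEMA COLLECTION ChartSettingSchemas ADD N'<xs:schema ...>';

-- Hypothetical table: the column is typed against the whole collection,
-- so a value must validate against one of the schemas in it.
CREATE TABLE dbo.DeviceSetting
(
    DeviceId int NOT NULL PRIMARY KEY,
    Setting  xml(DOCUMENT ChartSettingSchemas) NOT NULL
);
GO

-- Accepted: matches the ChartSetting2 schema.
INSERT INTO dbo.DeviceSetting (DeviceId, Setting)
VALUES (2, N'<ChartSetting xmlns="ChartSetting2"><PatchGap>1</PatchGap><PatchSize>5</PatchSize></ChartSetting>');
```

With a typed column, a document containing an undeclared element or a non-numeric PatchSize should be rejected at insert time, which is the safeguard the question asks for.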

I think it is impossible to create a single schema that both XML documents conform to.
Many XML editors can create XSD from XML, e.g.: Visual Studio
Typed XML vs. Untyped XML

XML shouldn't really be stored in a database as XML. XML is more of a transport medium than data. As you say, it's big and hard to query. You should store your data as data (varchar, int, etc.) and then use FOR XML in PATH mode to return it to you in the format you want.

Related

tSqlt - Unit Testing Procedures which output XML using Schema Collections

We are currently developing SQL stored procedures for an integration project which requires XML payloads, which we have implemented using Schema Collections to ensure strict adherence to the defined schema.
We want to write unit tests for these stored procedures, but given that the output is XML, I am not sure how (or if) I can do this with the assert stored procedures in tSQLt without having to convert everything into a table or table variable.
Can someone please point me in the right direction: can it be done, and if not, how should I approach creating a new assert stored procedure for XML payloads?
Or would we save a lot of time and heartache by simply converting the expected and actual results from XML into tables in each test?
Thanks
If you are dealing with the situation that the order of nodes might change, then converting the XML to tables might be your best bet.
If the order of nodes and attributes is fixed in your output, you could convert the XML to an NVARCHAR(MAX) and then use tSQLt.AssertEqualsString.
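The string-comparison approach could be sketched as follows, assuming tSQLt is installed. The test class name PayloadTests and the sample payload are made up for illustration; in a real test the actual value would come from the stored procedure under test.

```sql
EXEC tSQLt.NewTestClass 'PayloadTests';   -- hypothetical test class
GO

CREATE PROCEDURE PayloadTests.[test payload matches expected XML]
AS
BEGIN
    DECLARE @expected xml, @actual xml;

    SET @expected = N'<Order><Id>1</Id></Order>';

    -- Stand-in for the procedure under test: build the payload inline.
    SET @actual = (SELECT 1 AS Id FOR XML PATH('Order'), TYPE);

    -- Casting both sides from xml to nvarchar(max) puts them through the
    -- same serialization, so insignificant whitespace does not matter.
    DECLARE @expectedString nvarchar(max), @actualString nvarchar(max);
    SET @expectedString = CONVERT(nvarchar(max), @expected);
    SET @actualString   = CONVERT(nvarchar(max), @actual);

    EXEC tSQLt.AssertEqualsString
         @Expected = @expectedString,
         @Actual   = @actualString;
END;
GO

EXEC tSQLt.Run 'PayloadTests';
```

This only holds up when node and attribute order is deterministic; otherwise shredding both documents into tables remains the safer comparison.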

What is the simplest way to convert table Rows into XML and viceversa

I need to develop a SQL Server procedure that converts table rows into XML and passes it to another stored procedure on a different server through a linked server. On that server I have to take this XML data as input and convert it back into table rows. What is the simplest and most time-efficient way to do this?
It is not possible to use the XML datatype as a parameter in stored procedure on a linked server. You have to use nvarchar(max) and convert to XML in the stored procedure.
To create the XML you should use FOR XML (SQL Server). RAW and AUTO are easy, and if you need more control you can use PATH. I would stay away from EXPLICIT.
To shred the XML on the receiving side you should use nodes() Method (xml Data Type) and value() Method (xml Data Type).
Given that you are using SQL Server, you need only append FOR XML to your query.
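The round trip described above might look like this in outline. The table variable and its Id and Name columns are placeholders, and the linked-server call itself is left out; the nvarchar(max) variable stands in for the parameter that would cross the server boundary.

```sql
-- Sending side: sample rows in a table variable (placeholder data).
DECLARE @rows TABLE (Id int, Name nvarchar(50));
INSERT INTO @rows (Id, Name) VALUES (1, N'alpha');
INSERT INTO @rows (Id, Name) VALUES (2, N'beta');

-- Serialize the rows with FOR XML PATH, then convert to nvarchar(max),
-- because an xml parameter cannot be passed to a linked-server procedure.
DECLARE @payload nvarchar(max);
SET @payload = CONVERT(nvarchar(max),
    (SELECT Id, Name
     FROM @rows
     FOR XML PATH('Row'), ROOT('Rows'), TYPE));

-- Receiving side: convert back to xml and shred with nodes() and value().
DECLARE @x xml;
SET @x = CONVERT(xml, @payload);

SELECT r.value('(Id/text())[1]',   'int')          AS Id,
       r.value('(Name/text())[1]', 'nvarchar(50)') AS Name
FROM @x.nodes('/Rows/Row') AS t(r);
```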

Using dynamic SQL in an OLE DB source in SSIS 2012

I have a stored proc as the SQL command text, which is getting passed a parameter that contains a table name. The proc then returns data from that table. I cannot call the table directly as the OLE DB source because some business logic needs to happen to the result set in the proc. In SQL 2008 this worked fine. In an upgraded 2012 package I get "The metadata could not be determined because ... contains dynamic SQL. Consider using the WITH RESULT SETS clause to explicitly describe the result set."
The problem is I cannot define the field names in the proc because the table name that gets passed as a parameter can be a different value and the resulting fields can be different every time. Anybody encounter this problem or have any ideas? I've tried all sorts of things with dynamic SQL using "dm_exec_describe_first_result_set", temp tables and CTEs that contains WITH RESULT SETS, but it doesn't work in SSIS 2012, same error. Context is a problem with a lot of the dynamic SQL approaches.
This is latest thing I tried, with no luck:
DECLARE @sql VARCHAR(MAX)
SET @sql = 'SELECT * FROM ' + @dataTableName
DECLARE @listStr VARCHAR(MAX)
SELECT @listStr = COALESCE(@listStr +',','') + [name] + ' ' + system_type_name FROM sys.dm_exec_describe_first_result_set(@sql, NULL, 1)
exec('exec(''SELECT * FROM myDataTable'') WITH RESULT SETS ((' + @listStr + '))')
So I ask out of kindness, but why on God's green earth are you using an SSIS Data Flow task to handle dynamic source data like this?
The reason you're running into trouble is because you're perverting every purpose of an SSIS Data flow task:
to extract a known source with known metadata that can be statically typed and cached in design-time
to run through a known process with straightforward (and ideally asynchronous) transformations
to take that transformed data and load it into a known destination also with known metadata
It's fine to have parameterized data sources that bring back different data. But to have them bring back entirely different metadata each time with no congruity between the different sets is, frankly, ridiculous, and I'm not entirely sure I want to know how you handled all your column metadata in the working 2008 package.
This is why it wants you to add a WITH RESULT SETS clause to the SSIS query: so it can generate some metadata. It doesn't do this at runtime - it can't! It has to have a known set of columns (because it aliases them all into compiled variables anyway) to work with. It expects the same columns every time it runs that Data Flow Task - the exact same columns, down to the names, the types, and the constraints.
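For reference, this is roughly what a fixed-shape WITH RESULT SETS call looks like. The procedure name dbo.GetTableData and the column list (Id, Name) are assumptions for illustration, and the sketch only helps when every table passed in really has that shape.

```sql
-- Hypothetical procedure that selects from a table chosen at runtime.
CREATE PROCEDURE dbo.GetTableData
    @dataTableName sysname
AS
BEGIN
    DECLARE @sql nvarchar(max) =
        N'SELECT * FROM ' + QUOTENAME(@dataTableName);
    EXEC sys.sp_executesql @sql;
END;
GO

-- The caller declares the result shape up front, which is the metadata
-- SSIS needs at design time. Execution fails if the actual result set
-- cannot be matched to the declared columns.
EXEC dbo.GetTableData @dataTableName = N'myDataTable'
WITH RESULT SETS
((
    Id   int,
    Name nvarchar(50)
));
```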
Which leads to one (terrible, terrible) solution - just stick all the data into a temporary table with Column1, Column2 ... ColumnN and then use the same variable you're using as the table name parameter to conditionally branch your code and do whatever you want with the columns.
Another more sane solution would be to create a data flow task for each of your source tables, and use your parameter in a precedence constraint to just pick which data flow task should run.
For a solution this poorly tailored for an out-of-the-box ETL, you should also highly consider just rolling your own in C# or a script task instead of the Data Flow Task provided by SSIS.
In short, please don't do this. Think of the children (packages)!
I've used CozyRoc Dynamic DataFlow Plus to achieve this.
Using configuration tables to build the SQL Select statements, I have a single SSIS package that loads data from Oracle and Sybase (or any OLEDB source) to MS SQL. Some of the result sets are in the millions of rows and performance is excellent.
Instead of writing a new package every time a new table is needed, this can be configured in minutes and run on the pre-tested and robust existing package.
Without it I would have been up for writing hundreds of packages.

XML data in ntext to xml data type

I'm a rookie when it comes to XML data.
I currently have a database with XML data stored in an ntext column, and this database appears to be taking up too much space. I'm trying to improve the way the data is stored in order to reduce the overall size of the DB.
The way I see it, I have two options:
nvarchar(max)
xml
I will need to test the two options above by importing some data into those columns.
The problem I am having is that the XML data in the ntext column is currently stored as UTF-8. In order to import it into the XML data type column, will I need to CAST/CONVERT the data to UTF-16?
Is this right?
Storing your data as datatype XML has two major benefits:
since it's stored as native XML, you can do things like XQuery on top of it
since it's stored as XML, it's stored in an optimized (tokenized) way, taking up less space than the equivalent nvarchar(max) column would
To convert your existing NTEXT column, just do a CAST on it.
What results do you get from this:
SELECT
id, ntextColumn, CAST(ntextColumn AS XML)
FROM
dbo.YourTable
I am almost sure this will work - just like that. SQL Server doesn't support UTF-8 - so your data even in the ntext column is most likely not really stored as UTF-8 (it's already been converted to SQL Server's Unicode - UCS-2/UTF-16) so I don't see any issue with converting this to datatype XML, really.

How to parse a text attribute into an XML attribute in SQL Server 2005?

I have a table which contains an attribute called custom_fields that stores well-formed XML:
<Root>
...
<TotalMontoSoles></TotalMontoSoles>
...
</Root>
But this attribute is not stored as an xml data type; instead it's stored as text. What I need to do is set the TotalMontoSoles value, and I was trying to accomplish that by using the modify method from XML DML, but I keep getting a
Error SQL: Explicit conversion from data type xml to text is not allowed.
error when I try to cast the column into a xml type:
DECLARE @custom_fields xml
SET @custom_fields = (SELECT CAST(custom_fields as XML) FROM UPLOAD_HEADER_TEMPORAL)
SET @custom_fields.modify('...')
What am I doing wrong? Is there any other way I could accomplish this?
UPDATE:
Maybe it's important to point out that what I'm trying to do here is create a procedure, and I'm getting this error at compilation time.
Text column data types cannot be converted (cast) to XML. You can (should) use one of the varchar types though. Microsoft will be removing the text (and image) data type at some point in the future.
http://msdn.microsoft.com/en-us/library/ms187993%28v=sql.90%29.aspx
ntext, text, and image data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead.
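A minimal sketch of going through varchar(max) in both directions, as suggested above. The table variable stands in for UPLOAD_HEADER_TEMPORAL, and the id column and the value 123.45 are invented for the example.

```sql
-- Stand-in for UPLOAD_HEADER_TEMPORAL; the id column is hypothetical.
DECLARE @header TABLE (id int, custom_fields text);
INSERT INTO @header (id, custom_fields)
VALUES (1, '<Root><TotalMontoSoles></TotalMontoSoles></Root>');

-- text -> varchar(max) -> xml
DECLARE @custom_fields xml;
SELECT @custom_fields = CAST(CAST(custom_fields AS varchar(max)) AS xml)
FROM @header
WHERE id = 1;

-- The element is empty, so there is no text node to replace;
-- insert a new text node instead.
SET @custom_fields.modify(
    'insert text{"123.45"} into (/Root/TotalMontoSoles)[1]');

-- xml -> varchar(max), which converts implicitly to the text column.
-- A direct xml -> text conversion is what raises the reported error.
UPDATE @header
SET custom_fields = CAST(@custom_fields AS varchar(max))
WHERE id = 1;

SELECT id, CAST(custom_fields AS varchar(max)) AS custom_fields
FROM @header;
```

If the element already holds a value, a replace value of expression on its text() node would be the matching XML DML form instead of insert.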