Build XML Off of Common Table Expression - sql

I am using a CTE to recurse through data stored in a self-referencing (parent/child) table. The trouble is that I cannot figure out how to use "FOR XML" to build the desired XML output. I have a Table of Contents table that I am recursing, and I want to use that data to generate the XML.
Here is an example of what the data is similar to:
ID|TOC_ID|TOC_SECTION|TOC_DESCRIPTON|PARENT_ID
1|I|Chapter|My Test Chapter|-1
2|A|Section|My Test Section|1
3|1|SubSection|My SubSection|2
I want to be able to spit out the data like so:
XML Attributes:
ID = Appended values from the TOC_ID field
value = value from TOC_Section field
<FilterData>
<Filter id="I" value="Chapter">
<Description>My Test Chapter</Description>
<Filter id="I_A" value="Section">
<Description>My Test Section</Description>
<Filter id="I_A_1" value="SubSection">
<Description>My Test SubSection</Description>
</Filter>
</Filter>
</Filter>
</FilterData>
Not sure how I can take the CTE data and produce a similar format to the above. When the data is in separate tables it isn't too difficult to build this type of output.
As always appreciate the input.
Thanks,
S

You may get some mileage from Recursive Hierarchies to XML in Christian Wade's blog - it all looks mighty painful to me!

Check this out Will (Not sure you are still following)....this does have a 32 level max, but that should still work fine for my stuff...can't see going deeper than that. Found this on another forum:
CREATE TABLE tree ( id INT, name VARCHAR(5), parent_id INT )
GO
INSERT INTO tree VALUES ( 1, 'N1', NULL )
INSERT INTO tree VALUES ( 3, 'N4', 1 )
INSERT INTO tree VALUES ( 4, 'N10', 3 )
INSERT INTO tree VALUES ( 5, 'N7', 3 )
GO
CREATE FUNCTION dbo.treeList(@parent_id int)
RETURNS XML
WITH RETURNS NULL ON NULL INPUT
BEGIN RETURN
(SELECT id as "@id", name as "@name",
CASE WHEN parent_id=@parent_id
THEN dbo.treeList(id)
END
FROM dbo.tree WHERE parent_id=@parent_id
FOR XML PATH('tree'), TYPE)
END
GO
SELECT id AS "@id", name AS "@name",
CASE WHEN id=1
THEN dbo.treeList(id)
END
FROM tree
WHERE id=1
FOR XML PATH('tree'), TYPE
Now isn't that nice and simple?
Customised from the great example over on http://msdn.microsoft.com/en-us/library/ms345137.aspx
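For anyone who wants to see the recursion on its own, outside the database, here is a minimal Python sketch applied to the sample rows from the question. It is not the T-SQL solution above, only an illustration of the same idea: each node emits itself, then recurses into its children, carrying the appended id path along. The function name build_filter_xml and the tuple layout are my own assumptions, and it assumes the parent links contain no cycles.

```python
import xml.etree.ElementTree as ET

# Sample rows from the question:
# (ID, TOC_ID, TOC_SECTION, TOC_DESCRIPTION, PARENT_ID); -1 marks a root row.
ROWS = [
    (1, "I", "Chapter", "My Test Chapter", -1),
    (2, "A", "Section", "My Test Section", 1),
    (3, "1", "SubSection", "My SubSection", 2),
]


def build_filter_xml(rows, root_parent=-1):
    """Build nested <Filter> elements from parent/child rows (hypothetical helper)."""
    # Index rows by parent id so each level can look up its children directly.
    children = {}
    for row in rows:
        children.setdefault(row[4], []).append(row)

    def add_children(parent_elem, parent_id, prefix):
        for row_id, toc_id, section, description, _parent in children.get(parent_id, []):
            # The id attribute is the chain of TOC_ID values joined by "_".
            path = toc_id if not prefix else prefix + "_" + toc_id
            elem = ET.SubElement(parent_elem, "Filter", {"id": path, "value": section})
            ET.SubElement(elem, "Description").text = description
            add_children(elem, row_id, path)

    root = ET.Element("FilterData")
    add_children(root, root_parent, "")
    return root


print(ET.tostring(build_filter_xml(ROWS), encoding="unicode"))
```

The nesting depth here is bounded only by Python's recursion limit, whereas the recursive SQL function above is capped at 32 levels as mentioned.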

Related

how can i create sql table for complex structure

I need to create a table in the database and query it so that the frontend receives something like:
self-delivery reception: 10:00-13:00
deadline for admission: 19:00
deadline: 12:00
order acceptance:
{morning: 12:00-13:00}{evening: 16:00-17:00}
We have a structure:
Title string
Value string
Intervals []*Intervals
Which refers in one of the fields to another structure.
Structure:
Title string
Value string
I have created 3 tables:
table of usual values
INSERT INTO delivery_conditions(title, val)
VALUES ('self-delivery reception', '10:00-13:00'),
('deadline for admission', '19:00'),
('deadline', '12:00');
table links to another structure
INSERT INTO delivery_intervals(title, val)
VALUES ('order acceptance', 1);
table for different structure
INSERT INTO customers.intervals_settings(title, val)
VALUES ('morning', '12:00-13:00'),
('evening', '16:00-17:00');
I also wrote a query in which I combined the fields of the third table with the json object
SELECT
sfc.title,
sfc.val,
json_build_object(
'title', sfi.title,
'val', sfi.val
) AS interval
FROM delivery_conditions sfc
JOIN delivery_intervals sfi ON sfi.conditions_id = sfc.id;
As far as I understand, my approach is not correct, and I would be very grateful to anyone who can explain how to handle such situations.

SQL Server - matching attributes query

SQL Server Gurus ...
Currently using MS SQL Server 2016
I know Joe Celko and all SQL purists are squirming at the thought of using bitmasks, but I have a use case in which I need to query for all widgets that contain a set of given attributes.
Each widget may contain several hundred attributes.
The attributes of a widget are either present or not (1 = present, 0 = not present).
One way I thought to do this is via bitmasks: the attributes to be found (a bitmask) could be ANDed with the attributes of each widget to find matches in a single operation. For example, the widgets table might be:
widgets table:
widget_uid Uniqueidentifier
attributes BigInt
SELECT widget_uid
FROM widgets
WHERE ( attributes & bitmask ) = bitmask;
The problem is that using a BigInt for the attributes limits the number of attributes to 64 (a widget can have several hundred attributes). I could group the attributes in chunks of 64 bits, i.e.:
widgets table:
widget_uid Uniqueidentifier
attributes0 BigInt -- Attributes 0-63
attributes1 BigInt -- Attributes 64-127
attributes2 BigInt -- Attributes 128-191
SELECT widget_uid
FROM widgets
WHERE ( attributes0 & bitmask0 ) = bitmask0
AND ( attributes1 & bitmask1 ) = bitmask1
AND ( attributes2 & bitmask2 ) = bitmask2
... but was wondering if anyone has come up with a solution for bit operations using bitmasks with greater than 64 bits – or if other (more efficient?) solutions would exist?
In the use case, the widgets table does contain other columns, but I am only concerned with the attributes matching portion of the query at the moment.
Any and all ideas are welcome - would be interested in knowing how others tackle this particular problem.
Thanks in advance.
We had a similar use case, on a significantly large data set. This was for an e-commerce site with products and attributes. Our case was a bit more complex than here, where we had any possible number of attributes and then values assigned to those attributes. e.g. Color - Red/Green/Blue, Size - S/M/L etc.
We found that association tables with good indexing were the key in our case. While this may not be an option for you, we found this to be the optimal solution for a dynamic data set.
I can code you up an example if you feel it will be helpful.
Edited to add example:
DROP TABLE IF EXISTS #Widgets
DROP TABLE IF EXISTS #Attributes
DROP TABLE IF EXISTS #WidgetAttributes
CREATE TABLE #Widgets (widget_UID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, Name NVARCHAR(255))
CREATE TABLE #Attributes (Attribute_UID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, Name NVARCHAR(255))
CREATE TABLE #WidgetAttributes (widget_UID UNIQUEIDENTIFIER,Attribute_UID UNIQUEIDENTIFIER)
CREATE NONCLUSTERED INDEX ix_WidgetAttribute ON #WidgetAttributes (Attribute_UID) INCLUDE (widget_UID)
INSERT INTO #Widgets (widget_UID, Name) values
( '{c63bea73-2331-4698-82c9-f71845ab8601}', N'Widget 1' ),
( '{a0865b8f-606b-4273-9207-39a8a26016c4}', N'Widget 2' ),
( '{211fe27e-ab98-4b61-83a3-3d006d66db5a}', N'Widget 3' )
INSERT INTO #Attributes (Attribute_UID, Name)
VALUES
( '{99354dc0-d0b2-4919-a887-edf115eeb1bd}', N'Height' ),
( '{136bbe4c-497d-472f-a905-670e4a7805d0}', N'Width' ),
( '{f006f950-30d1-453e-8e09-4f7d140fa3cb}', N'Depth' ),
( '{0d190639-677f-4b75-8d36-1bdac00de132}', N'Colour' )
-- Set links
-- Widget 1 All attributes
-- Widget 2 Height Width
-- Widget 3 Colour
INSERT INTO #WidgetAttributes (widget_UID, Attribute_UID)
SELECT '{c63bea73-2331-4698-82c9-f71845ab8601}',Attribute_UID FROM #Attributes
UNION ALL
SELECT TOP (2) '{a0865b8f-606b-4273-9207-39a8a26016c4}',Attribute_UID FROM #Attributes WHERE Name<> 'Colour'
UNION ALL
SELECT '{211fe27e-ab98-4b61-83a3-3d006d66db5a}',Attribute_UID FROM #Attributes WHERE Name = 'Colour'
-- @SearchAttributes to hold list of attributes you are trying to find
DECLARE @SearchAttributes TABLE (Attribute_UID UNIQUEIDENTIFIER)
INSERT INTO @SearchAttributes
SELECT Attribute_UID FROM #Attributes WHERE Name<> 'Colour'
;WITH cte AS (
SELECT WA.widget_UID, COUNT(1) AttributesPresent FROM #WidgetAttributes WA
JOIN @SearchAttributes SA ON SA.Attribute_UID = WA.Attribute_UID
GROUP BY WA.widget_UID
)
SELECT cte.AttributesPresent
, W.widget_UID
, W.Name
FROM cte
JOIN #Widgets W ON W.widget_UID = cte.widget_UID
ORDER BY cte.AttributesPresent DESC
Gives an output of:
AttributesPresent widget_UID Name
----------------- ------------------------------------ ----------
3 C63BEA73-2331-4698-82C9-F71845AB8601 Widget 1
2 A0865B8F-606B-4273-9207-39A8A26016C4 Widget 2
We used an approach of counting how many attributes were present for each so we not only had the option of "exact match" but also "closest fit".
Using bitmasks in a database is the wrong approach. Even if you somehow manage to make it work, you will not be able to use indexes to speed up execution.
Use the standard solution; this is a standard situation. There is a standard M:N relationship between Widgets and Attributes (both should be tables, of course). Add another table that assigns Attributes to Widgets; you can call it WidgetAttributes.
It will have 3 columns: Id, WidgetId, AttributeId
Then you can, for example, simply get the list of Widgets that have a given Attribute:
select w.*
from Widgets w
inner join WidgetAttributes wa on wa.WidgetId = w.Id
inner join Attributes a on a.Id = wa.AttributeId
where a.AttributeName='xxx'
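The query above finds widgets with one attribute. For the original question (widgets that contain every attribute in a given set), the usual extension is to filter on the wanted attributes, group per widget, and keep only the groups whose match count equals the size of the set. Below is a small self-contained sketch of that idea using Python's sqlite3 module so it can be run anywhere; the SQL itself is plain enough that it should carry over to SQL Server largely unchanged, though I have not run it there. The integer keys, the sample rows, and the helper name widgets_with_all are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Widgets (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Attributes (Id INTEGER PRIMARY KEY, AttributeName TEXT);
CREATE TABLE WidgetAttributes (
    WidgetId INTEGER REFERENCES Widgets(Id),
    AttributeId INTEGER REFERENCES Attributes(Id),
    PRIMARY KEY (WidgetId, AttributeId)
);
INSERT INTO Widgets VALUES (1, 'Widget 1'), (2, 'Widget 2'), (3, 'Widget 3');
INSERT INTO Attributes VALUES (1, 'Height'), (2, 'Width'), (3, 'Depth'), (4, 'Colour');
-- Widget 1: all attributes, Widget 2: Height and Width, Widget 3: Colour
INSERT INTO WidgetAttributes VALUES (1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(3,4);
""")


def widgets_with_all(conn, wanted):
    """Return names of widgets that have every attribute named in `wanted`."""
    wanted = sorted(set(wanted))  # de-duplicate so the count comparison is valid
    if not wanted:
        return []  # assumption: an empty search set matches nothing
    placeholders = ",".join("?" for _ in wanted)
    sql = f"""
        SELECT w.Name
        FROM Widgets w
        JOIN WidgetAttributes wa ON wa.WidgetId = w.Id
        JOIN Attributes a ON a.Id = wa.AttributeId
        WHERE a.AttributeName IN ({placeholders})
        GROUP BY w.Id, w.Name
        HAVING COUNT(DISTINCT a.Id) = ?
        ORDER BY w.Id
    """
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


print(widgets_with_all(conn, ["Height", "Width"]))
```

Dropping the HAVING clause and ordering by the count instead gives the "closest fit" ranking described in the other answer.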

SQL: Insert a new unique row based off previous row

I'm working with Deltek Vision ERP software trying to create a custom solution to provide a list of deliverables for projects. I've created a custom grid in the Project area of Vision that has a table in SQL structured like this:
SELECT TOP (1000) [WBS1]
,[WBS2]
,[WBS3]
,[Seq]
,[CreateUser]
,[CreateDate]
,[ModUser]
,[ModDate]
,[CustDeliverables]
,[CustDueDate]
,[CustCompletionDate]
FROM [Vision_Prod].[dbo].[Projects_Deliverables]
The table has three user entered fields, which are the last three columns listed above.
What I'm trying to accomplish is have deliverables set at WBS2 level also roll up to WBS1, so basically what needs to happen is that any time a record is created with a value in WBS2 the record is duplicated but the duplicate has no value in WBS2.
I've setup a workflow in Vision so that when someone enters a deliverable into the grid on a phase it kicks off a stored procedure to accomplish this. The problem is the Seq field. This is a unique identifier the system is assigning when a record is created. When my stored procedure fires I'm getting an error that the sequence has to be included in the record.
This is the stored procedure I'm using:
INSERT INTO [Vision_Prod].[dbo].[Projects_Deliverables] (WBS1, WBS2, WBS3, Seq, CreateUser, ModUser, ModDate, CreateDate, CustDueDate, CustCompletionDate)
SELECT WBS1, '', '', (NEXT VALUE FOR Seq), CreateUser, ModUser, ModDate, CreateDate, CustDueDate, CustCompletionDate
FROM Projects_Deliverables
WHERE Projects_Deliverables.WBS2 IS NOT NULL AND Projects_Deliverables.WBS1 = @WBS1
If anyone can help me figure out how to get the system to assign a new sequence when a record is created in this way that would be most appreciated.
You could create a dummy table using VALUES to generate an extra row and then filter it out when not needed. Something like this (not tested):
SELECT TOP (1000)
[WBS1]
,CASE c WHEN 2 THEN NULL ELSE [WBS2] END AS [WBS2]
,[WBS3]
,[Seq]
,[CreateUser]
,[CreateDate]
,[ModUser]
,[ModDate]
,[CustDeliverables]
,[CustDueDate]
,[CustCompletionDate]
FROM [Vision_Prod].[dbo].[Projects_Deliverables]
CROSS JOIN (VALUES (1), (2)) t(c)
WHERE (WBS2 IS NULL AND c < 2)
OR WBS2 IS NOT NULL
This was the solution:
REPLACE(CAST(NEWID() AS VARCHAR(36)), '-', '')

Using TSQL to generate XML

I am trying to generate some XML using T-SQL and am having some trouble: I was able to traverse 2 levels but am getting stuck at 3. In the example I have posted there are two purchase orders. The first purchase order has only one part, with one release for that part. The second purchase order has 2 parts, each with its own release. At the end I should have 2 XML documents that traverse 3 levels: release/part/purchase order. I think I am getting confused when linking to the next level up, as well as with the GROUP BY. The example only shows 2 parts, but there can be many parts per purchase order and only ever one release per part. Any help would be much appreciated.
DECLARE @sdv AS TABLE
(
PID INT IDENTITY(1,1) PRIMARY KEY,
PurchaseOrder varchar(20),
PartNo int,
Release int
)
INSERT INTO @sdv values ('PURC123', 111, 1)
INSERT INTO @sdv values ('PURC333', 222, 1)
INSERT INTO @sdv values ('PURC333', 333, 1)
select
PurchaseOrder,
(
select
PartNo,
(
select
Release
from
@sdv a3
where
a3.Release = a2.Release and
a3.PartNo = a2.PartNo and
a3.PurchaseOrder = a1.PurchaseOrder
for xml path('Release'), type
)
from
@sdv a2
where
a2.PartNo = a1.PartNo
group by a2.PartNo
for xml path('part'), type
) from @sdv a1
group by PurchaseOrder
for xml path('Purchase')
I am going to shoot you an answer so you can play around with it. I am around for a little bit longer today, but will check back tomorrow.
For this example I just used attributes, as I can read them a bit more easily. I did my grouping using a CTE and assumed the relationship between a part and a release was one to many (I could be wrong there, but I need info from you). I was also a bit hazy about the levels needing to go three deep. If we are talking about iterations of parts (versions), then you could simply add a @Release to the second level. Note: If you don't want attributes, just remove the @ before each of the tags.
I would highly recommend being very diligent about your code format whenever doing hierarchical XML work. It's easy to get lost.
CODE
DECLARE @sdv AS TABLE
(
PID INT IDENTITY(1,1) PRIMARY KEY,
PurchaseOrder varchar(20),
PartNo int,
Release int
)
INSERT INTO @sdv values ('PURC123', 111, 1)
INSERT INTO @sdv values ('PURC333', 222, 1)
INSERT INTO @sdv values ('PURC333', 333, 1)
;WITH PO AS (
SELECT DISTINCT PurchaseOrder
FROM @sdv
)
SELECT L1.PurchaseOrder AS "@PurchseOrder",
( SELECT PartNo AS "@PartNo",
( SELECT DISTINCT L3.Release AS "@Release"
FROM @sdv AS L3
WHERE L2.PartNo = L3.PartNo
ORDER BY L3.Release ASC
FOR XML PATH('RELEASE'),TYPE
)
FROM @sdv AS L2
WHERE L2.PurchaseOrder = L1.PurchaseOrder
FOR XML PATH('PART'), TYPE
)
FROM PO AS L1
FOR XML PATH('PurchaseOrder'), ROOT('PurchaseOrders')
XML OUTPUT
<PurchaseOrders>
<PurchaseOrder PurchseOrder="PURC123">
<PART PartNo="111">
<RELEASE Release="1" />
</PART>
</PurchaseOrder>
<PurchaseOrder PurchseOrder="PURC333">
<PART PartNo="222">
<RELEASE Release="1" />
</PART>
<PART PartNo="333">
<RELEASE Release="1" />
</PART>
</PurchaseOrder>
</PurchaseOrders>

Importing multiple xml files, filter, then output to different xml format in SSIS

I have a task that requires me to pull in a set of related XML files, pick out a subset of records from these files, transform some columns, and export to a single XML file using a different format.
I'm new to SSIS, and so far I've managed to import two of the XML files (for simplicity, starting with just two files).
The first file, which we can call "Item", contains some basic metadata, including an ID that identifies related records in the second file, "Milestones". I filter my "valid records" using a Lookup transformation in my data flow, so now I have the valid Item IDs needed to fetch the records. I funnel these valid IDs (along with the rest of the columns from Item.xml) through a Sort, then into a Merge Join.
The second file is structured with 2 outputs: one containing two columns (ItemID and RowID), and a second containing all of the Milestone-related data plus a RowID. I put these through an inner merge join on RowID, so I have the ItemID alongside the milestone data. Then I do a full outer merge join on both files, using ItemID.
This gives me data sort of like this:
ItemID[1] - MilestoneData[2]
ItemID[1] - MilestoneData[3]
ItemID[1] - MilestoneData[4]
ItemID[1] - MilestoneData[5]
ItemID[2] - MilestoneData[6]
ItemID[2] - MilestoneData[7]
ItemID[2] - MilestoneData[8]
I can put this data through derived column transformations to create the columns of data I actually need, but I can't see how to structure this in a relational way/normalize it into a different xml format.
The idea is to output something like:
<item id="1">
<Milestone id="2">
<Milestone />
<Milestone id="3">
<Milestone />
</item>
Can anyone point me in the right direction?
UPDATE:
A bit more detailed picture of what I have, and what I'd like to achieve:
Item.xml:
<Items>
<Item ItemID="1">
<Title>
Data
</Title>
</Item>
<Item ItemID="2">
...
</Item>
...
</Items>
Milestone.xml:
<Milestones>
<Item ItemID="2">
<MS id="3">
<MS_DATA>
Data
</MS_DATA>
</MS>
<MS id="4">
<MS_DATA>
Data
</MS_DATA>
</MS>
</Item>
<Item ItemID="3">
<MS id="5">
<MS_DATA>
Data
</MS_DATA>
</MS>
</Item>
</Milestones>
The way it's presented in SSIS when I use an XML Source is not entirely intuitive: the Item rows and the MS rows are two separate outputs. I had to run these through a join in order to get the Milestones that correspond to specific Items. No problem there. I then run the result through a full outer join with the items, so I get a flattened table with multiple rows containing the same data for an Item and different data for each MS. Basically I get what I tried to show in my table: lots of redundant Item data for each unique MilestoneData.
In the end it has to look similar to:
<NewItems>
<myNewItem ItemID="2">
<SomeDataDirectlyFromItem>
e.g. Title
</SomeDataDirectlyFromItem>
<DataConstructedFromMultipleColumnsInItem>
<MyMilestones>
<MS_DATA_TRANSFORMED MSID="3">
data
</MS_DATA_TRANSFORMED>
<MS_DATA_TRANSFORMED MSID="4">
data
</MS_DATA_TRANSFORMED>
</MyMilestones>
</DataConstructedFromMultipleColumnsInItem>
</myNewItem>
<myNewItem ItemID="3">
<SomeDataDirectlyFromItem>
e.g. Title
</SomeDataDirectlyFromItem>
<DataConstructedFromMultipleColumnsInItem>
<MyMilestones>
<MS_DATA_TRANSFORMED MSID="5">
data
</MS_DATA_TRANSFORMED>
</MyMilestones>
</DataConstructedFromMultipleColumnsInItem>
</myNewItem>
<myNewItem ItemID="4">
<SomeDataDirectlyFromItem>
e.g. Title
</SomeDataDirectlyFromItem>
<DataConstructedFromMultipleColumnsInItem>
<MyMilestones></MyMilestones>
</DataConstructedFromMultipleColumnsInItem>
</myNewItem>
</NewItems>
I would try to handle this using a Script Component of type Transformation. As you are new to SSIS, I assume you haven't used this before. So basically you
define the input columns your component will expect (e.g. a column input_xml containing ItemID[1] - MilestoneData[2]; ...)
use C# to write the logic that cuts the rows apart and sticks them together
define the output columns your component will use to deliver the transformed row
You will face the problem that one row will probably be used twice in the end, e.g.
ItemID[1] - MilestoneData[2]
will result in
<item id="1">
<Milestone id="2">
I have done something pretty similar using Pentaho Kettle, even without a script component in which you define your own logic. But I guess SSIS lacks suitable tasks here.
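To make the "cut and stick together" logic a bit more concrete, here is a rough sketch of the grouping step. It is written in Python rather than the C# a Script Component would need, purely to show the shape of the logic: sort the flattened rows by ItemID, start a new item element whenever the ItemID changes, and append one milestone element per row. The row layout, the sample values, and the function name rows_to_xml are assumptions for illustration; the element names are taken from the desired output in the question.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical flattened rows after the merge joins: (ItemID, Title, MSID, MS data).
# MSID is None for an item that has no milestones (the full outer join case).
FLAT_ROWS = [
    (2, "Title 2", 3, "data 3"),
    (2, "Title 2", 4, "data 4"),
    (3, "Title 3", 5, "data 5"),
    (4, "Title 4", None, None),
]


def rows_to_xml(rows):
    """Collapse repeated item rows into one element per item with nested milestones."""
    root = ET.Element("NewItems")
    ordered = sorted(rows, key=lambda r: r[0])  # groupby needs rows sorted by the key
    for item_id, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        item = ET.SubElement(root, "myNewItem", {"ItemID": str(item_id)})
        # Item-level data is identical on every row of the group, so take the first.
        ET.SubElement(item, "SomeDataDirectlyFromItem").text = group[0][1]
        wrapper = ET.SubElement(item, "DataConstructedFromMultipleColumnsInItem")
        milestones = ET.SubElement(wrapper, "MyMilestones")
        for _item_id, _title, ms_id, ms_data in group:
            if ms_id is None:
                continue  # item without milestones: leave MyMilestones empty
            ms = ET.SubElement(milestones, "MS_DATA_TRANSFORMED", {"MSID": str(ms_id)})
            ms.text = ms_data
    return root


print(ET.tostring(rows_to_xml(FLAT_ROWS), encoding="unicode"))
```

In an SSIS Script Component the same pattern would likely need an asynchronous output, since many input rows collapse into fewer output rows.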
How about importing the XML into relational tables ( eg in tempdb ) then using FOR XML PATH to reconstruct the XML? FOR XML PATH offers a high degree of control over how you want the XML to look. A very simple example below:
CREATE TABLE #items ( itemId INT PRIMARY KEY, title VARCHAR(50) NULL )
CREATE TABLE #milestones ( itemId INT, msId INT, msData VARCHAR(50) NOT NULL, PRIMARY KEY ( itemId, msId ) )
GO
DECLARE @itemsXML XML
SELECT @itemsXML = x.y
FROM OPENROWSET( BULK 'c:\temp\items.xml', SINGLE_CLOB ) x(y)
INSERT INTO #items ( itemId, title )
SELECT
i.c.value('@ItemID', 'INT' ),
i.c.value('(Title/text())[1]', 'VARCHAR(50)' )
FROM @itemsXML.nodes('Items/Item') i(c)
GO
DECLARE @milestoneXML XML
SELECT @milestoneXML = x.y
FROM OPENROWSET( BULK 'c:\temp\milestone.xml', SINGLE_CLOB ) x(y)
-- Note: the [1] paths below read only the first MS element of each Item
INSERT INTO #milestones ( itemId, msId, msData )
SELECT
i.c.value('@ItemID', 'INT' ),
i.c.value('(MS/@id)[1]', 'INT' ) msId,
i.c.value('(MS/MS_DATA/text())[1]', 'VARCHAR(50)' ) msData
FROM @milestoneXML.nodes('Milestones/Item') i(c)
GO
SELECT
i.itemId AS "@ItemID"
FROM #items i
INNER JOIN #milestones ms ON i.itemId = ms.itemId
FOR XML PATH('myNewItem'), ROOT('NewItems'), TYPE
DROP TABLE #items
DROP TABLE #milestones