Updating SQL table from XML - sql

I am using InfoPath 2007 to send out a survey (it is not connected to SharePoint or a DataBase). The file I will get back is an XML file. Every place there is an answer block, it has its own unique id (aka field name).
Now, I have a SQL Server Database (2007?) with a table "Responses". Its columns are: AnswerID(unique PK), QuestionID (FK) (which is the unique id (field name), and Answer. The QuestionID is already populated with the unique id (field name). There are more than 300 records for QuestionID.
What I need to be able to do is reach into the XML file, find the QuestionID (field name), grab the data for that field name, and then put the data into the DB column "Answer" that matches the field name in the QuestionID column.
Is there an easy/medium way to do this mapping/updating with the least amount of chance of error?
NOTE: I tried to use the DB import XML data wizard, the information breaks out into an unmanageable number of tables.

You can shred the XML into rows and columns and then use that to update your table.
Here is a little example of what you can do.
create table Responses(QuestionID varchar(10), Answer varchar(10))
insert into Responses values('Q1', null)
insert into Responses values('Q2', null)
insert into Responses values('Q3', null)
declare #xml xml
set #xml =
'<root>
<question ID="Q1">Answer1</question>
<question ID="Q2">Answer2</question>
<question ID="Q3">Answer3</question>
</root>'
;with cte as
(
select
r.n.value('#ID', 'varchar(10)') as QuestionID,
r.n.value('.', 'varchar(10)') as Answer
from #xml.nodes('/root/*') as r(n)
)
update R
set Answer = C.Answer
from Responses as R
inner join cte as C
on R.QuestionID = C.QuestionID
select *
from Responses
Result:
QuestionID Answer
---------- ----------
Q1 Answer1
Q2 Answer2
Q3 Answer3
The XML I used most certainly does not look anything like what you have but it should give you a hint of what you can do. If you post a sample of your XML file, table structure and expected result/output you can probably get a more precise answer.
Edit
declare #xml xml =
'<?xml version="1.0" encoding="UTF-8"?>
<my:myFields xmlns:my="xx.com" xml:lang="en-us">
<my:group1>
<my:group2>
<my:field1>Im an analyst.</my:field1>
<my:group3>
<my:group4>
<my:field2>1</my:field2>
<my:field3>I click the mouse.</my:field3>
</my:group4>
<my:group4>
<my:field2>2</my:field2>
<my:field3>I type on the keyboard.</my:field3>
</my:group4>
</my:group3>
</my:group2>
<my:group2>
<my:field1>Im a stay at home mom.</my:field1>
<my:group3>
<my:group4>
<my:field2>1</my:field2>
<my:field3>I Cook.</my:field3>
</my:group4>
<my:group4>
<my:field2>2</my:field2>
<my:field3>I clean.</my:field3>
</my:group4>
</my:group3>
</my:group2>
</my:group1>
</my:myFields>'
;with xmlnamespaces('xx.com' as my)
select
T.N.value('../../my:field1[1]', 'varchar(50)') as Field1,
T.N.value('my:field2[1]', 'varchar(50)') as Field2,
T.N.value('my:field3[1]', 'varchar(50)') as Field3
from #xml.nodes('my:myFields/my:group1/my:group2/my:group3/my:group4') as T(N)
Result:
Field1 Field2 Field3
Im an analyst. 1 I click the mouse.
Im an analyst. 2 I type on the keyboard.
Im a stay at home mom. 1 I Cook.
Im a stay at home mom. 2 I clean.

Related

Search using LIKE operator with multiple dynamic values accepting both full and partial text match

Is there any way to do multiple term search in a column using like operator dynamically in SQL Server? Like below
SELECT ID
FROM table
WHERE
Company LIKE '%goog%' OR
Company LIKE '%micros%' OR
Company LIKE '%amazon%'
For example: input values "goog; micro; amazon;" (input value should auto split by delimiter ';' and check the text exist in the table) means that Search term 'goog' or 'micros' or 'amazon' from company column, if exists return.
Table - sample data:
ID Company
------------------------------------------
1 Google; Microsoft;
2 oracle; microsoft; apple; walmart; tesla
3 amazon; apple;
4 google;
5 tesla;
6 amazon;
Basically, The above query should return the results as like below,
Desired results:
ID
-----
1
2
4
6
Is it possible to achieve in SQL Server by splitting, then search in query? I look forward to an experts answer.
If you pass in a table valued parameter, you can join on that.
So for example
CREATE TYPE StringList AS TABLE (str varchar(100));
DECLARE #tmp StringList;
INSERT #tmp (str)
VALUES
('%goog%'),
('%micros%'),
('%amazon%');
SELECT t.ID
FROM table t
WHERE EXISTS (SELECT 1
FROM #tmp tmp
WHERE t.Company LIKE tmp.str);
The one issue with this is that someone could write le; Mic and still get a result.
Strictly speaking, your table design is flawed, because you are storing multiple different items in the same column. You really should have this normalized into rows, so every Company is a separate row. Then your code would look like this:
SELECT t.ID
FROM table t
JOIN #tmp tmp ON t.Company LIKE tmp.str
GROUP BY t.ID
You can simulate it by splitting your string
SELECT t.ID
FROM table t
WHERE EXISTS (SELECT 1
FROM STRING_SPLIT(t.Company) s
JOIN #tmp tmp ON s.value LIKE tmp.str);

Extract data matching a condition inside an XML array in SQL server

Considering this simple table id (int), name (varchar), customFields (xml)
customFields contains an XML representation of a C# Dictionary. E.g :
<dictionary>
<item>
<key><int>1</int></key>
<value><string>Audi</string></value>
</item>
<item>
<key><int>2</int></key>
<value><string>Red</string></value>
</item>
</dictionary>
How do I select all rows of my table with 3 columns: id, name and 1 column that is the content of the string tag where the value of the int tag is equal 1.
The closest to the result I managed to get is:
SELECT id, name, C.value('(value/string/text())[1]', 'varchar(max)')
FROM myTable
OUTER APPLY customFields.nodes('/dictionary/item') N(C)
WHERE (customFields IS NULL OR C.value('(key/int/text())[1]', 'int') = 1)
Problem is, if xml doesn't have a int tag = 1 the row is not returned at all (I would still like to get a row with id, name and NULL in the 3rd column)
I've created a table the same as yours and this query worked fine:
select id, name,
(select C.value('(value/string/text())[1]','varchar(MAX)')
from xmlTable inr outer apply
customField.nodes('/dictionary/item') N(C)
where
C.value('(key/int/text())[1]','int') = 1
and inr.id = ou.id) as xmlVal
from xmlTable ou
Here is my result:
The reason why your query didn't worked is because it first selects values of "value/string" for all rows from "myTable" and then filters them. Therefore, the null values appear only on empty fields, and on the other fields (which contains any xml value), the "value/string/text()" is displayed. This is what your query without where clause returns:
id,name,xml
1 lol NULL
2 rotfl Audi
2 rotfl Red
3 troll Red

how to extract data from xml having multiple nodes which are same under a parent node into a table in sql

I have a xml like
<list of roleids>
<roleid> 1 </roleid>
<roleid> 2 </roleid>
<roleid> 3 </roleid>
<roleid> 4 </roleid>
</list of Roleids>
I want the roleid values in a table with column RoleID
I have tried with the below query but i am not getting the expected output
declare #ListOfRoleID XML
set #ListOfRoleID =
'<list>
<roleid>1</roleid>
<roleid>2</roleid>
</list>'
DECLARE #RoleIDList TABLE (RoleID int)
INSERT INTO #RoleIDList(RoleID)
SELECT X1.value('(roleid)[1]','INT') as roleid
from #ListOfRoleID.nodes('/list') as ListOfRoleID(x)
CROSS APPLY #RoleIDList.nodes('/list') AS ListOfRoleID1 (X1)
You want to project .nodes('/list/roleid'), which will create one row per roleid. Then extract the value. No need for cross apply. See sqlfiddle:
declare #ListOfRoleID XML = '<list>
<roleid>1</roleid>
<roleid>2</roleid>
<roleid>3</roleid>
</list>';
SELECT x.value('.','INT') as roleid
from #ListOfRoleID.nodes('/list/roleid') as ListOfRoleID(x);

Importing multiple xml files, filter, then output to different xml format in SSIS

I have a task that requires me to pull in a set of xml files, which are all related, then pick out a subset of records from these files, transform some columns, and export to a a single xml file using a different format.
I'm new to SSIS, and so far I've managed to first import two of the xml files (for simplicity, starting with just two files).
The first file we can call "Item", containing some basic metadata, amongst those an ID, which is used to identify related records in the second file "Milestones". I filter my "valid records" using a lookup transformation in my dataflow - now I have the valid Item ID's to fetch the records I need. I funnel these valid ID's (along with the rest of the columns from Item.xml through a Sort, then into a merge join.
The second file is structured with 2 outputs, one containing two columns (ItemID and RowID). The second containing all of the Milestone related data plus a RowID. I put these through a inner merge join, based on RowID, so I have the ItemID in with the milestone data. Then I do a full outer join merge join on both files, using ItemID.
This gives me data sort of like this:
ItemID[1] - MilestoneData[2]
ItemID[1] - MilestoneData[3]
ItemID[1] - MilestoneData[4]
ItemID[1] - MilestoneData[5]
ItemID[2] - MilestoneData[6]
ItemID[2] - MilestoneData[7]
ItemID[2] - MilestoneData[8]
I can put this data through derived column transformations to create the columns of data I actually need, but I can't see how to structure this in a relational way/normalize it into a different xml format.
The idea is to output something like:
<item id="1">
<Milestone id="2">
<Milestone />
<Milestone id="3">
<Milestone />
</item>
Can anyone point me in the right direction?
UPDATE:
A bit more detailed picture of what I have, and what I'd like to achieve:
Item.xml:
<Items>
<Item ItemID="1">
<Title>
Data
</Title>
</Item>
<Item ItemID="2">
...
</Item>
...
</Items>
Milestone.xml:
<Milestones>
<Item ItemID="2">
<MS id="3">
<MS_DATA>
Data
</MS_DATA>
</MS>
<MS id="4">
<MS_DATA>
Data
</MS_DATA>
</MS>
</Item>
<Item ItemID="3">
<MS id="5">
<MS_DATA>
Data
</MS_DATA>
</MS>
</item>
</Milestones>
The way it's presented in SSIS when I use XML source, is not entirely intuitive, meaning the Item rows and the MS rows are two seperate outputs. I had to run these through a join in order to get the Milestones that corresponds to specific Items. No problem here, then run it through a full outer join with the items, so I get a flattened table with multiple rows containing obviously the same data for an Item and with different data for the MS. Basically I get what I tried to show in my table, lots of redundant Item data, for each unique MilestoneData.
In the end it has to look similar to:
<NewItems>
<myNewItem ItemID="2">
<SomeDataDirectlyFromItem>
e.g. Title
</SomeDataDirectlyFromItem>
<DataConstructedFromMultipleColumnsInItem>
<MyMilestones>
<MS_DATA_TRANSFORMED MSID="3">
data
</MS_DATA_TRANSFORMED>
<MS_DATA_TRANSFORMED MSID="4">
data
</MS_DATA_TRANSFORMED>
</MyMilestones>
</DataConstructedFromMultipleColumnsInItem>
<myNewItem ItemID="3">
<SomeDataDirectlyFromItem>
e.g. Title
</SomeDataDirectlyFromItem>
<DataConstructedFromMultipleColumnsInItem>
<MyMilestones>
<MS_DATA_TRANSFORMED MSID="5">
data
</MS_DATA_TRANSFORMED>
</MyMilestones>
</DataConstructedFromMultipleColumnsInItem>
</myNewItem>
<myNewItem ItemID="4">
<SomeDataDirectlyFromItem>
e.g. Title
</SomeDataDirectlyFromItem>
<DataConstructedFromMultipleColumnsInItem>
<MyMilestones></MyMilestones>
</DataConstructedFromMultipleColumnsInItem>
</myNewItem>
</NewItems>
I would try to handle this using a script component with the component type transformation. As you are new to ssis, i assume you haven't used this before. So basically you
define input columns, your component will expect (i.e. column input_xml containing ItemID[1] - MilestoneData[2];...
use c# to create a logic which cuts and sticks together
define output columns your component will use to deliver the transformed row
You will face the problem that one row will probably be used two times in the end, like i.e.
ItemID[1] - MilestoneData[2]
will result in
<item id="1">
<Milestone id="2">
I have done something pretty similar using Pentaho kettle, even without using something like a script component in which you define own logic. But i guess ssis has a lack of tasks here.
How about importing the XML into relational tables ( eg in tempdb ) then using FOR XML PATH to reconstruct the XML? FOR XML PATH offers a high degree of control over how you want the XML to look. A very simple example below:
CREATE TABLE #items ( itemId INT PRIMARY KEY, title VARCHAR(50) NULL )
CREATE TABLE #milestones ( itemId INT, msId INT, msData VARCHAR(50) NOT NULL, PRIMARY KEY ( itemId, msId ) )
GO
DECLARE #itemsXML XML
SELECT #itemsXML = x.y
FROM OPENROWSET( BULK 'c:\temp\items.xml', SINGLE_CLOB ) x(y)
INSERT INTO #items ( itemId, title )
SELECT
i.c.value('#ItemID', 'INT' ),
i.c.value('(Title/text())[1]', 'VARCHAR(50)' )
FROM #itemsXML.nodes('Items/Item') i(c)
GO
DECLARE #milestoneXML XML
SELECT #milestoneXML = x.y
FROM OPENROWSET( BULK 'c:\temp\milestone.xml', SINGLE_CLOB ) x(y)
INSERT INTO #milestones ( itemId, msId, msData )
SELECT
i.c.value('#ItemID', 'INT' ),
i.c.value('(MS/#id)[1]', 'VARCHAR(50)' ) msId,
i.c.value('(MS/MS_DATA/text())[1]', 'VARCHAR(50)' ) msData
FROM #milestoneXML.nodes('Milestones/Item') i(c)
GO
SELECT
i.itemId AS "#ItemID"
FROM #items i
INNER JOIN #milestones ms ON i.itemId = ms.itemId
FOR XML PATH('myNewItem'), ROOT('NewItems'), TYPE
DROP TABLE #items
DROP TABLE #milestones

Return multiple values in one column within a main query

I am trying to find Relative information from a table and return those results (along with other unrelated results) in one row as part of a larger query.
I already tried using this example, modified for my data.
How to return multiple values in one column (T-SQL)?
But I cannot get it to work. It will not pull any data (I am sure it is is user[me] error).
If I query the table directly using a TempTable, I can get the results correctly.
DECLARE #res NVARCHAR(100)
SET #res = ''
CREATE TABLE #tempResult ( item nvarchar(100) )
INSERT INTO #tempResult
SELECT Relation AS item
FROM tblNextOfKin
WHERE ID ='xxx' AND Address ='yyy'
ORDER BY Relation
SELECT #res = #res + item + ', ' from #tempResult
SELECT substring(#res,1,len(#res)-1) as Result
DROP TABLE #tempResult
Note the WHERE line above, xxx and yyy would vary based on the input criteria for the function. but since you cannot use TempTables in a function... I am stuck.
The relevant fields in the table I am trying to query are as follows.
tblNextOfKin
ID - varchar(12)
Name - varchar(60)
Relation - varchar(30)
Address - varchar(100)
I hope this makes enough sense... I saw on another post an expression that fits.
My SQL-fu is not so good.
Once I get a working function, I will place it into the main query for the SSIS package I am working on which is pulling data from many other tables.
I can provide more details if needed, but the site said to keep it simple, and I tried to do so.
Thanks !!!
Follow-up (because when I added a comment to the reponse below, I could not edit formatting)
I need to be able to get results from different columns.
ID Name Relation Address
1, Mike, SON, 100 Main St.
1, Sara, DAU, 100 Main St.
2, Tim , SON, 123 South St.
Both the first two people live at the same address, so if I query for ID='1' and Address='100 Main St.' I need the results to look something like...
"DAU, SON"
Mysql has GROUP_CONCAT
SELECT GROUP_CONCAT(Relation ORDER BY Relation SEPARATOR ', ') AS item
FROM tblNextOfKin
WHERE ID ='xxx' AND Address ='yyy'
You can do it for the whole table with
SELECT ID, Address, GROUP_CONCAT(Relation ORDER BY Relation SEPARATOR ', ') AS item
FROM tblNextOfKin
GROUP BY ID, Address
(assuming ID is not unique)
note: this is usually bad practice as an intermediate step, this is acceptable only as final formatting for presentation (otherwise you will end up ungrouping it which will be pain)
I think you need something like this (SQL Server):
SELECT stuff((select ',' +Relation
FROM tblNextOfKin a
WHERE ID ='xxx' AND Address ='yyy'
ORDER BY Relation
FOR XML path('')),1,1,'') AS res;