Parse OPENXML and get a name-value collection for a given node - SQL

I have an XML document that contains some custom fields whose names I won't know in advance. I want to generate a SELECT statement that lists the contents in a name-value style.
All the examples I have found so far require me to know the names of the nodes.
For example:
declare @idoc int
declare @doc nvarchar(max);
set @doc = '<user>
<additionalfields>
<Account__Manager>Fred Dibner</Account__Manager>
<First__Aider>St Johns Ambulance</First__Aider>
</additionalfields>
</user>'
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc;
SELECT * FROM OPENXML (@idoc, 'user/additionalfields/',1)
Is it possible to achieve this?

Well, I found the answer after a fair amount more experimenting. (Incidentally, the double-underscore replace is due to the output format of some of the database field names.)
SELECT replace(name,'__',' ') as name, value
FROM OPENXML (@idoc, '/user/additionalfields/*',1)
WITH (
name nvarchar(4000) '@mp:localname',
value nvarchar(4000) './text()'
)
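The same name-value listing can also be sketched without OPENXML, using the xml data type's .nodes() and .value() methods. This is a minimal sketch, not the original poster's method: it assumes the document is held in an xml variable (here called @x, a name chosen for illustration) and uses the XQuery function local-name() to read each element's name.

```sql
-- Sample document taken from the question, held in an xml variable.
DECLARE @x xml = N'<user>
<additionalfields>
<Account__Manager>Fred Dibner</Account__Manager>
<First__Aider>St Johns Ambulance</First__Aider>
</additionalfields>
</user>';

-- One row per child element of additionalfields, whatever its name is.
SELECT REPLACE(f.value('local-name(.)', 'nvarchar(4000)'), '__', ' ') AS name,
       f.value('(./text())[1]', 'nvarchar(4000)')                     AS value
FROM @x.nodes('/user/additionalfields/*') AS t(f);
```

With this form there is no document handle to clean up. With the OPENXML version, the handle from sp_xml_preparedocument should be released with sp_xml_removedocument once the query is done.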

Can't expand JSON file in SQL Server using OPENJSON beyond 1st level

I am working with JSON that has been inserted into a SQL table, and I have been trying to expand the dataset. So far I have been unable to expand beyond a single level.
The data looks like this in the database. A single record with JSON.
I have been able to expand the data with the following query:
DECLARE @json NVARCHAR(MAX);
SET @json = (Select [JSON] FROM TableLocation)
SELECT *
FROM OPENJSON (@json)
I have confirmed that all the records are there; however, I haven't been able to expand it beyond this level. Most of the documentation I have found online doesn't cover the case where the hierarchy is blank. Any assistance would be great.
I have tried referencing the ID column (or any other column), but if I do, I get a column of NULLs.
DECLARE @json NVARCHAR(MAX);
SET @json = (Select [JSON] FROM TableLocation)
SELECT *
FROM OPENJSON (@json)
WITH (
ID nvarchar(4000) '$.id')
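The question does not show the JSON itself, so the actual cause of the NULL column cannot be read off the text. One common situation is that the records sit one level down, in which case the WITH clause is applied to the outer object and finds no matching property. The sketch below assumes a made-up payload with the records in an array under a hypothetical "records" key; the key names and values are invented for illustration only. Property names in JSON paths are case-sensitive, so '$.id' and '$.ID' are different paths.

```sql
-- Hypothetical payload: the real structure is not shown in the question.
DECLARE @json nvarchar(max) =
    N'{"records":[{"id":"a1","detail":{"city":"Leeds"}},
                  {"id":"b2","detail":{"city":"York"}}]}';

-- WITH applied at the root: the outer object has no "id" property,
-- so this returns a single row with ID = NULL.
SELECT *
FROM OPENJSON(@json)
WITH (ID nvarchar(4000) '$.id');

-- Passing a path as the second argument starts at the array instead,
-- giving one row per array element; nested values use a longer path.
SELECT *
FROM OPENJSON(@json, '$.records')
WITH (
    ID   nvarchar(4000) '$.id',
    City nvarchar(100)  '$.detail.city'
);
```

OPENJSON requires database compatibility level 130 or higher.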

Extracting a single value from a JSON array in SQL Server

I am using MS SQL Server to get a search result in JSON format. Only one row is ever returned in my use case, but the service was designed as a search tool that can return more than one value, hence the array. The issue I am having is extracting the id value from the array that is returned.
JSON @response (array):
{"hits":[{"id":1320172,"email":"xyz@domain.eu","first_name":"IMA","last_name":"TESTERTOO","created":"2018-12-12T11:52:58+00:00","roles":["Learner"],"status":true}],"total":1}
I have tried a number of things, but I can't seem to get the path right.
SET @MyUserid = JSON_QUERY(@Reponse, '$.hits[0].id')
SET @MyUserid =JSON_VALUE(@Reponse,'$.hits[0].id')
SET @MyUserid = JSON_QUERY(@Reponse, '$.id')
In most examples I have found, the JSON is not a single-line array, so I feel like I am missing something there. I'm inexperienced in working with JSON, so any help would be greatly appreciated.
You can try this:
DECLARE @json NVARCHAR(MAX)=
N'{"hits":[{"id":1320172,"email":"xyz@domain.eu","first_name":"IMA","last_name":"TESTERTOO","created":"2018-12-12T11:52:58+00:00","roles":["Learner"],"status":true}],"total":1}';
--This will return just the one selected value
SELECT JSON_VALUE(@json,'$.hits[0].id')
--This will return everything:
SELECT A.total
,B.*
FROM OPENJSON(@json)
WITH(hits nvarchar(max) AS JSON, total int) A
CROSS APPLY OPENJSON(A.hits)
WITH(id int
,email nvarchar(max)
,first_name nvarchar(max)
,last_name nvarchar(max)
,created nvarchar(max)
,roles nvarchar(max) AS JSON
,[status] bit) B
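The three attempts in the question differ in which function is used as well as in the path. A short sketch of the difference, using a trimmed copy of the question's JSON (the column aliases are chosen for illustration): JSON_VALUE returns scalars, JSON_QUERY returns objects and arrays, and in the default lax mode each returns NULL when pointed at the other kind or at a path that does not exist.

```sql
DECLARE @json nvarchar(max) =
    N'{"hits":[{"id":1320172,"roles":["Learner"],"status":true}],"total":1}';

SELECT
    JSON_VALUE(@json, '$.hits[0].id')    AS value_of_id,     -- 1320172 (scalar, correct function)
    JSON_QUERY(@json, '$.hits[0].id')    AS query_of_id,     -- NULL: id is a scalar, not an object or array
    JSON_QUERY(@json, '$.hits[0].roles') AS query_of_roles,  -- ["Learner"] (array fragment)
    JSON_VALUE(@json, '$.hits[0].roles') AS value_of_roles,  -- NULL: roles is an array, not a scalar
    JSON_VALUE(@json, '$.id')            AS value_root_id;   -- NULL: there is no id at the root
```

So of the three attempts, only JSON_VALUE with the path '$.hits[0].id' addresses a scalar that exists.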

Want to get the email information from XML, but getting an error

CREATE TABLE XMLTABLE(id int IDENTITY PRIMARY KEY,XML_DATA XML,DATE DATETIME);
go
INSERT INTO XMLTABLE(XML_DATA,DATE)
SELECT CONVERT(XML,BULKCOLUMN)AS DATA,getdate()
FROM OPENROWSET(BULK 'c:\Demo.xml',SINGLE_BLOB)AS x
go
DECLARE @XML AS XML
DECLARE @OUPT AS INT
DECLARE @SQL NVARCHAR (MAX)
SELECT @XML= XML_DATA FROM XMLTABLE
EXEC sp_xml_preparedocument @OUPT OUTPUT,@XML,'<root xmlns:d="http://abc" xmlns:ns2="http://def" />'
SELECT EMAILR
FROM OPENXML(@OUPT,'d:ns2:FORM/ns2:Form1/ns2:Part/ns2:Part1/ns2:Ba')
WITH
(EMAILR [VARCHAR](100) 'ns2:EmailAddress')
EXEC sp_xml_removedocument @OUPT
go
Demo.xml contains:
<ns2:FORM xmlns="http://abc" xmlns:ns2="http://def">
<ns2:Form1>
<ns2:Part>
<ns2:Part1>
<ns2:Ba>
<ns2:EmailA>Hello@YAHOO.COM</ns2:EmailA> ...
Error: Msg 6603, Level 16, State 2, Line 6 XML parsing error: Expected token 'eof' found ':'.
d:ns2-->:<--FORM/ns2:Form1/ns2:Part/ns2:Part1/ns2:Ba
The approach with the sp_xml_... procedures and FROM OPENXML is outdated!
You should rather use the current XML methods .nodes(), .value(), .query() and .modify().
Your XML example is not complete, nor is it valid; I had to change it a bit to make it work. You'll probably have to adapt the XPath (at least Part1 is missing).
DECLARE @xml XML=
'<ns2:FORM xmlns="http://abc" xmlns:ns2="http://def">
<ns2:Form1>
<ns2:Part>
<ns2:Ba>
<ns2:EmailA>Hello@YAHOO.COM</ns2:EmailA>
</ns2:Ba>
</ns2:Part>
</ns2:Form1>
</ns2:FORM> ';
This is the secure way, with namespaces and the full path:
WITH XMLNAMESPACES(DEFAULT 'http://abc'
,'http://def' AS ns2)
SELECT @xml.value('(/ns2:FORM/ns2:Form1/ns2:Part/ns2:Ba/ns2:EmailA)[1]','nvarchar(max)');
And this is the lazy approach:
SELECT @xml.value('(//*:EmailA)[1]','nvarchar(max)')
You should, however, prefer the full approach. The more specific the path you give, the better and faster the result.
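If the file can contain more than one ns2:Ba element, .nodes() can return one row per element, which is roughly what the OPENXML query in the question was aiming for. A minimal sketch under that assumption, reusing the shortened XML from this answer (the variable name @demo and the column alias are chosen for illustration):

```sql
DECLARE @demo xml =
N'<ns2:FORM xmlns="http://abc" xmlns:ns2="http://def">
<ns2:Form1>
<ns2:Part>
<ns2:Ba>
<ns2:EmailA>Hello@YAHOO.COM</ns2:EmailA>
</ns2:Ba>
</ns2:Part>
</ns2:Form1>
</ns2:FORM>';

-- Every element in the path carries the ns2 prefix, so only that
-- namespace needs to be declared for this query.
WITH XMLNAMESPACES ('http://def' AS ns2)
SELECT ba.value('(ns2:EmailA/text())[1]', 'nvarchar(100)') AS EmailR
FROM @demo.nodes('/ns2:FORM/ns2:Form1/ns2:Part/ns2:Ba') AS t(ba);
```

As for the original error: an XPath step takes at most one prefix, so 'd:ns2:FORM' cannot be parsed, which is where the message points with "-->:<--".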

How to download a webpage and parse in SQL

I am simply trying to download a webpage and store it in an accessible format in SQL Server 2012. I have resorted to using dynamic SQL, but perhaps there is a cleaner, easier way to do this. I have been able to successfully download the htm files to my local drive using the below code, but I am having difficulty working with the html itself. I am trying to convert the webpage to XML and parse from there, but I think I am not addressing the HTML to XML conversion properly.
I get the following error, "Parsing XML with internal subset DTDs not allowed. Use CONVERT with style option 2 to enable limited internal subset DTD support"
DECLARE @URL NVARCHAR(500);
DECLARE @Ticker NVARCHAR(10)
DECLARE @DynamicTickerNumber INT
SET @DynamicTickerNumber = 1
CREATE TABLE Parsed_HTML(
[Date] DATETIME
,[Ticker] VarChar (8)
,[NodeName] VarChar (50)
,[Value] NVARCHAR (50));
WHILE @DynamicTickerNumber <= 2
BEGIN
SET @Ticker = (SELECT [Ticker] FROM [Unique Tickers Yahoo] WHERE [Unique Tickers Yahoo].[Ticker Number]= @DynamicTickerNumber)
SET @URL ='http://finance.yahoo.com/q/ks?s=' + @Ticker + '+Key+Statistics'
DECLARE @cmd NVARCHAR(250);
DECLARE @tOutput TABLE(data NVARCHAR(100));
DECLARE @file NVARCHAR(MAX);
SET @file='D:\Ressources\Execution Model\Execution Model for SQL\DB Temp\quoteYahooHTML.htm'
SET @cmd ='powershell "(new-object System.Net.WebClient).DownloadFile('''+@URL+''','''+@file+''')"'
EXEC master.dbo.xp_cmdshell @cmd, no_output
CREATE TABLE XmlImportTest
(
xmlFileName VARCHAR(300),
xml_data xml
);
DECLARE @xmlFileName VARCHAR(300)
SELECT @xmlFileName = 'D:\Ressources\Execution Model\Execution Model for SQL\DB Temp\quoteYahooHTML.htm'
EXEC('
INSERT INTO XmlImportTest(xmlFileName, xml_data)
SELECT ''' + @xmlFileName + ''', xmlData
FROM
(
SELECT *
FROM OPENROWSET (BULK ''' + @xmlFileName + ''' , SINGLE_BLOB) AS XMLDATA
) AS FileImport (XMLDATA)
')
DECLARE @x XML;
DECLARE @string VARCHAR(MAX);
SET @x = (SELECT xml_data FROM XmlImportTest)
SET @string = CONVERT(VARCHAR(MAX), @x, 1);
INSERT INTO [Parsed_HTML] ([NodeName], [Value])
SELECT [NodeName], [Value] FROM dbo.XMLTable(@string)
--above references XMLTable Parsing function that works consistently
END
Unfortunately this needs to be run within the confines of SQL Server, and my understanding is that the HTML Agility Pack is not immediately compatible. I also notice that the intermediate table, XmlImportTest, never gets populated, so this is likely not a function of malformed HTML.
Short answer: don't.
SQL is very good for some things, but for downloading and parsing HTML it's a terrible choice. In your example you're using PowerShell to download the file, so why not parse the HTML in PowerShell too? Then you could write the parsed data into something like a CSV file and load that using OPENROWSET.
Another option, still not using SQL but a bit more within SQL Server, might be to use a .NET stored procedure via SQL CLR.
As a few of the comments point out, if you could guarantee the HTML was well-formed XML then you could use the SQL XML functionality to parse it, but web pages are rarely well-formed XML, so this would be a risky choice.
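For completeness, the error message quoted in the question refers to the style argument of CONVERT. The sketch below shows only what that looks like. It assumes input that is already well-formed XML apart from carrying an internal DTD subset; the sample string and variable names are made up, and a real web page will usually still fail to parse for the reasons given above.

```sql
-- Made-up, well-formed sample with an internal DTD subset.
DECLARE @page nvarchar(max) =
    N'<!DOCTYPE html [<!ENTITY co "Example Corp">]>' +
    N'<html><body><p>&co;</p></body></html>';

-- Style 2 enables limited internal DTD subset processing.
-- A plain CAST(@page AS xml) would raise the error quoted in the question.
DECLARE @parsed xml = CONVERT(xml, @page, 2);

SELECT @parsed.value('(/html/body/p/text())[1]', 'nvarchar(100)') AS first_paragraph;
```

When loading from a file, the same style argument can be applied to the BulkColumn value returned by OPENROWSET(BULK ..., SINGLE_BLOB).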

Dynamic Declare statements SQL Server

I'm using a varchar(MAX) value for the text, but as I build up the huge SQL statement, it cuts the ending off.
Is there any way I can create a dynamic DECLARE statement that I can then join together with others when executing the SQL?
e.g. something like:
DECLARE @sSQLLeft + Convert(varchar(4),@index) varchar(MAX)
varchar(max) holds up to about 2 GB. Are you sure it cuts the ending off, or is it just that when you print it, only the first few hundred characters are displayed?
To view long text in SSMS without it getting truncated, you can use this trick:
SELECT @dynsql AS [processing-instruction(x)] FOR XML PATH('')
DECLARE @query VARCHAR(MAX)
DECLARE @query2 VARCHAR(MAX)
-- Do whatever
EXEC (@query + @query2)
EDIT:
Martin Smith is quite right. It is possible that your query is only cut off when printed. Another possible reason for the cut is a NULL value in a variable or column that is concatenated into your query, which makes the rest of the query NULL.
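The NULL case mentioned in the edit can be reproduced in a few lines. A minimal sketch with hypothetical variable names, assuming the default CONCAT_NULL_YIELDS_NULL setting (ON) and SQL Server 2012 or later for CONCAT:

```sql
DECLARE @part1 varchar(max) = 'SELECT 1 AS a';
DECLARE @part2 varchar(max);  -- never assigned, so it is NULL

SELECT
    @part1 + @part2             AS plus_result,    -- NULL: one NULL operand wipes out the whole string
    @part1 + ISNULL(@part2, '') AS isnull_result,  -- 'SELECT 1 AS a'
    CONCAT(@part1, @part2)      AS concat_result;  -- 'SELECT 1 AS a' (CONCAT treats NULL as an empty string)
```

Wrapping each optional fragment in ISNULL, or building the statement with CONCAT, keeps a single unset variable from silently turning the dynamic SQL into NULL.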