Incorrect Results in my SQL select Query when parsing XML text column - sql

Hi all I have a table that holds my business Id's and it is varchar(255) data type
I also have a separate table that stores an XML structured document in a text data type column when the business gets approved by a lender (it stores the companys information etc).
I am trying to return all business ID's that are NOT approved by a lender, the only way i can know this is if the business ID does not exist in the XML.
I cannot join on any tables as i do not have any relational data, but i am trying to subquery it.
Any ideas? here is what i have
Select bus_id
From dbo.tbl_business
Where bus_id Not In (
Select Cast(company_xml_info As Varchar(Max))
From tbl_company_reports
Where Cast(company_xml_info As Varchar(Max)) Is Not Null
And company_xml_info Like '%Business id="' + bus_id + '"%'
And company_xml_info Is Not Null
And company_xml_current_status = 'Approved'
)

Here is an example mark of something similar you can do. This should run fine in SQL Management Studio 2008 and up:
DECLARE #Data TABLE (BusinessId VARCHAR(8))
INSERT INTO #Data (BusinessId) VALUES ('A68'),('A69'),('A70');
DECLARE #CompanyXml TABLE (company_xml_info VARCHAR(MAX));
INSERT INTO #CompanyXml (company_xml_info ) VALUES ('<CompanyInfo>
<Businesses>
<Business id="A68">
<Businessceo>Test</Businessceo>
</Business>
</Businesses>
</CompanyInfo>')
,('<CompanyInfo>
<Businesses>
<Business id="A70">
<Businessceo>Test2</Businessceo>
</Business>
</Businesses>
</CompanyInfo>')
--Data as is
Select *
From #Data
--example of your code as is
SELECT *
From #CompanyXml
--exclusionary listing
SELECT *
From #Data
EXCEPT
--the secret of this is part 1 casting it to xml. Then you extend that with '.value'. That wants a structure to get to the Id.
--I wrap that in ()'s then say the first instance of that [1] as in theory you could have more instances and do very complex parsing.
--Then it needs a type of sql to transform this value into
SELECT CAST(company_xml_info AS XML).value('(CompanyInfo/Businesses/Business/#id)[1]', 'varchar(8)')
From #CompanyXml
Update 6-29-17
If you have something that has repeat elements in a tree structure of your XML, I prefer the 'nodes' method of repeating them and then you do not have to worry about using a first. You merely need to iterate through what you have from the use of the 'nodes' syntax and get a value like so
DECLARE #X XML = '<CompanyInfo><Businesses><Business id="C1405"/><Business id="C1408"/><Business id="C1408"/></Businesses> </CompanyInfo>'
SELECT
x.query('.')
, x.value('#id', 'varchar(8)')
FROM #X.nodes('/CompanyInfo/Businesses/Business') AS y(x)

Related

Checking if field contains multiple string in sql server

I am working on a sql database which will provide with data some grid. The grid will enable filtering, sorting and paging but also there is a strict requirement that users can enter free text to a text input above the grid for example
'Engine 1001 Requi' and that the result will contain only rows which in some columns contain all the pieces of the text. So one column may contain Engine, other column may contain 1001 and some other will contain Requi.
I created a technical column (let's call it myTechnicalColumn) in the table (let's call it myTable) which will be updated each time someone inserts or updates a row and it will contain all the values of all the columns combined and separated with space.
Now to use it with entity framework I decided to use a table valued function which accepts one parameter #searchQuery and it will handle it like this:
CREATE FUNCTION myFunctionName(#searchText NVARCHAR(MAX))
RETURNS #Result TABLE
( ... here come columns )
AS
BEGIN
DECLARE #searchToken TokenType
INSERT INTO #searchToken(token) SELECT value FROM STRING_SPLIT(#searchText,' ')
DECLARE #searchTextLength INT
SET #searchTextLength = (SELECT COUNT(*) FROM #searchToken)
INSERT INTO #Result
SELECT
... here come columns
FROM myTable
WHERE (SELECT COUNT(*) FROM #searchToken WHERE CHARINDEX(token, myTechnicalColumn) > 0) = #searchTextLength
RETURN;
END
Of course the solution works fine but it's kinda slow. Any hints how to improve its efficiency?
You can use an inline Table Valued Function, which should be quite a lot faster.
This would be a direct translation of your current code
CREATE FUNCTION myFunctionName(#searchText NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
(
WITH searchText AS (
SELECT value token
FROM STRING_SPLIT(#searchText,' ') s(token)
)
SELECT
... here come columns
FROM myTable t
WHERE (
SELECT COUNT(*)
FROM searchText
WHERE CHARINDEX(s.token, t.myTechnicalColumn) > 0
) = (SELECT COUNT(*) FROM searchText)
);
GO
You are using a form of query called Relational Division Without Remainder and there are other ways to cut this cake:
CREATE FUNCTION myFunctionName(#searchText NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
(
WITH searchText AS (
SELECT value token
FROM STRING_SPLIT(#searchText,' ') s(token)
)
SELECT
... here come columns
FROM myTable t
WHERE NOT EXISTS (
SELECT 1
FROM searchText
WHERE CHARINDEX(s.token, t.myTechnicalColumn) = 0
)
);
GO
This may be faster or slower depending on a number of factors, you need to test.
Since there is no data to test, i am not sure if the following will solve your issue:
-- Replace the last INSERT portion
INSERT INTO #Result
SELECT
... here come columns
FROM myTable T
JOIN #searchToken S ON CHARINDEX(S.token, T.myTechnicalColumn) > 0

How to display XML column data

I have a table column consist with the XML files. I want to read XML data and display it.
I come up with the following code. But it read only one row in the column
want to display other XML data also
declare #xml xml
select #xml = event_data_XML from #temp
SELECT * FROM (
SELECT
CAST(f.x.query('data(#name)') as varchar(150)) as data_name,
CAST(f.x.query('data(value)') as varchar(150)) as data_value
FROM #xml.nodes('/event') as t(n)
CROSS APPLY t.n.nodes('data') as f(x)) X
PIVOT (MAX(data_value) FOR data_name IN (NTDomainName, DatabaseName, ServerName)) as pvt
Output should be like this(NTDomainName, DatabaseName, ServerName are xml data)
There are a bunch of ways you could do this. I'll show you a way I think you'd find easiest.
To start, here's a table with a little test data:
CREATE TABLE dbo.stuff (
id int identity (1,1) primary key
, event_data_xml xml
, create_date datetime default(getdate())
, is_active bit default(1)
);
INSERT INTO dbo.stuff (event_data_xml)
VALUES ('<event name="thing" package="as">something</event>')
INSERT INTO dbo.stuff (event_data_xml)
VALUES ('<event name="otherthing" package="as">something else</event>')
---All records
SELECT * FROM dbo.[stuff];
Make sense so far? Here's the query I'd use if I wanted to mix XML data and column data:
---Parsed up
SELECT event_data_xml.value('/event[1]', 'nvarchar(max)') AS [parsed element #text]
, event_data_xml.value('/event[1]/#name', 'nvarchar(max)') AS [parsed attribute value]
, create_date --column from table
FROM dbo.stuff
WHERE is_active = 1;
Using the value() function on the XML column passing in an xpath to what I want to display and SQL Server data type for how I want it returned.
Just make sure you're selecting a single value with your xpath expression.

How to manipulate comma-separated list in SQL Server

I have a list of values such as
1,2,3,4...
that will be passed into my SQL query.
I need to have these values stored in a table variable. So essentially I need something like this:
declare #t (num int)
insert into #t values (1),(2),(3),(4)...
Is it possible to do that formatting in SQL Server? (turning 1,2,3,4... into (1),(2),(3),(4)...
Note: I can not change what those values look like before they get to my SQL script; I'm stuck with that list. also it may not always be 4 values; it could 1 or more.
Edit to show what values look like: under normal circumstances, this is how it would work:
select t.pk
from a_table t
where t.pk in (#place_holder#)
#placeholder# is just a literal place holder. when some one would run the report, #placeholder# is replaced with the literal values from the filter of that report:
select t.pk
from a_table t
where t.pk in (1,2,3,4) -- or whatever the user selects
t.pk is an int
note: doing
declare #t as table (
num int
)
insert into #t values (#Placeholder#)
does not work.
Your description is a bit ridicuolus, but you might give this a try:
Whatever you mean with this
I see what your trying to say; but if I type out '#placeholder#' in the script, I'll end up with '1','2','3','4' and not '1,2,3,4'
I assume this is a string with numbers, each number between single qoutes, separated with a comma:
DECLARE #passedIn VARCHAR(100)='''1'',''2'',''3'',''4'',''5'',''6'',''7''';
SELECT #passedIn; -->: '1','2','3','4','5','6','7'
Now the variable #passedIn holds exactly what you are talking about
I'll use a dynamic SQL-Statement to insert this in a temp-table (declared table variable would not work here...)
CREATE TABLE #tmpTable(ID INT);
DECLARE #cmd VARCHAR(MAX)=
'INSERT INTO #tmpTable(ID) VALUES (' + REPLACE(SUBSTRING(#passedIn,2,LEN(#passedIn)-2),''',''','),(') + ');';
EXEC (#cmd);
SELECT * FROM #tmpTable;
GO
DROP TABLE #tmpTable;
UPDATE 1: no dynamic SQL necessary, all ad-hoc...
You can get the list of numbers as derived table in a CTE easily.
This can be used in a following statement like WHERE SomeID IN(SELECT ID FROM MyIDs) (similar to this: dynamic IN section )
WITH MyIDs(ID) AS
(
SELECT A.B.value('.','int') AS ID
FROM
(
SELECT CAST('<x>' + REPLACE(SUBSTRING(#passedIn,2,LEN(#passedIn)-2),''',''','</x><x>') + '</x>' AS XML) AS AsXml
) as tbl
CROSS APPLY tbl.AsXml.nodes('/x') AS A(B)
)
SELECT * FROM MyIDs
UPDATE 2:
And to answer your question exactly:
With this following the CTE
insert into #t(num)
SELECT ID FROM MyIDs
... you would actually get your declared table variable filled - if you need it later...

SQL How to Split One Column into Multiple Variable Columns

I am working on MSSQL, trying to split one string column into multiple columns. The string column has numbers separated by semicolons, like:
190230943204;190234443204;
However, some rows have more numbers than others, so in the database you can have
190230943204;190234443204;
121340944534;340212343204;134530943204
I've seen some solutions for splitting one column into a specific number of columns, but not variable columns. The columns that have less data (2 series of strings separated by commas instead of 3) will have nulls in the third place.
Ideas? Let me know if I must clarify anything.
Splitting this data into separate columns is a very good start (coma-separated values are an heresy). However, a "variable number of properties" should typically be modeled as a one-to-many relationship.
CREATE TABLE main_entity (
id INT PRIMARY KEY,
other_fields INT
);
CREATE TABLE entity_properties (
main_entity_id INT PRIMARY KEY,
property_value INT,
FOREIGN KEY (main_entity_id) REFERENCES main_entity(id)
);
entity_properties.main_entity_id is a foreign key to main_entity.id.
Congratulations, you are on the right path, this is called normalisation. You are about to reach the First Normal Form.
Beweare, however, these properties should have a sensibly similar nature (ie. all phone numbers, or addresses, etc.). Do not to fall into the dark side (a.k.a. the Entity-Attribute-Value anti-pattern), and be tempted to throw all properties into the same table. If you can identify several types of attributes, store each type in a separate table.
If these are all fixed length strings (as in the question), then you can do the work fairly simply (at least relative to other solutions):
select substring(col, 1+13*(n-1), 12) as val
from t join
(select 1 as n union all select union all select 3
) n
on len(t.col) <= 13*n.n
This is a useful hack if all the entries are the same size (not so easy if they are of different sizes). Do, however, think about the data structure because semi-colon (or comma) separated list is not a very good data structure.
IF I were you, I would create a simple function that is dividing values separated with ';' like this:
IF EXISTS (SELECT * FROM sysobjects WHERE id = object_id(N'fn_Split_List') AND xtype IN (N'FN', N'IF', N'TF'))
BEGIN
DROP FUNCTION [dbo].[fn_Split_List]
END
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fn_Split_List](#List NVARCHAR(512))
RETURNS #ResultRowset TABLE ( [Value] NVARCHAR(128) PRIMARY KEY)
AS
BEGIN
DECLARE #XML xml = N'<r><![CDATA[' + REPLACE(#List, ';', ']]></r><r><![CDATA[') + ']]></r>'
INSERT INTO #ResultRowset ([Value])
SELECT DISTINCT RTRIM(LTRIM(Tbl.Col.value('.', 'NVARCHAR(128)')))
FROM #xml.nodes('//r') Tbl(Col)
RETURN
END
GO
Than simply called in this way:
SET NOCOUNT ON
GO
DECLARE #RawData TABLE( [Value] NVARCHAR(256))
INSERT INTO #RawData ([Value] )
VALUES ('1111111;22222222')
,('3333333;113113131')
,('776767676')
,('89332131;313131312;54545353')
SELECT SL.[Value]
FROM #RawData AS RD
CROSS APPLY [fn_Split_List] ([Value]) as SL
SET NOCOUNT OFF
GO
The result is as the follow:
Value
1111111
22222222
113113131
3333333
776767676
313131312
54545353
89332131
Anyway, the logic in the function is not complicated, so you can easily put it anywhere you need.
Note: There is not limitations of how many values you will have separated with ';', but there are length limitation in the function that you can set to NVARCHAR(MAX) if you need.
EDIT:
As I can see, there are some rows in your example that will caused the function to return empty strings. For example:
number;number;
will return:
number
number
'' (empty string)
To clear them, just add the following where clause to the statement above like this:
SELECT SL.[Value]
FROM #RawData AS RD
CROSS APPLY [fn_Split_List] ([Value]) as SL
WHERE LEN(SL.[Value]) > 0

how to get values inside an xml column, when it's of type nvarchar

My question is similar to this one: Choose a XML node in SQL Server based on max value of a child element
except that my column is NOT of type XML, it's of type nvarchar(max).
I want to extract the XML node values from a column that looks like this:
<Data>
<el1>1234</el1>
<el2>Something</el2>
</Data>
How can I extract the values '1234' and 'Something' ?
doing a convert and using the col.nodes is not working.
CONVERT(XML, table1.col1).value('(/Data/el1)[1]','int') as 'xcol1',
After that, I would like to do a compare value of el1 (1234) with another column, and update update el1 as is. Right now I'm trying to just rebuild the XML when passing the update:
ie
Update table set col1 ='<Data><el1>'+#col2+'</el1><el2>???</el2>
You've got to tell SQL Server the number of the node you're after, like:
(/Data/el1)[1]
^^^
Full example:
declare #t table (id int, col1 varchar(max))
insert #t values (1, '<Data><el1>1234</el1><el2>Something</el2></Data>')
select CAST(col1 as xml).value('(/Data/el1)[1]', 'int')
from #t
-->
1234
SQL Server provides a modify function to change XML columns. But I think you can only use it on columns with the xml type. Here's an example:
declare #q table (id int, col1 xml)
insert #q values (1, '<Data><el1>1234</el1><el2>Something</el2></Data>')
update #q
set col1.modify('replace value of (/Data/el1/text())[1] with "5678"')
select *
from #q
-->
<Data><el1>5678</el1><el2>Something</el2></Data>
At the end of the day, SQL Server's XML support makes simple things very hard. If you value maintainability, you're better off processing XML on the client side.