Filter SQL SELECT by XML node - sql

There is a single SQL table with an xml-type field:
| EntityId | EntityType | Xml column                 |
------------------------------------------------------
| 1        | Employee   | `<productId>1</productId>` |
| 1        | Product    | `<name>apple</name>`       |
| 7        | Shop       | `<country>...</country>`   |
------------------------------------------------------
What I need is to filter table rows by an XML node value:
SELECT * WHERE (EntityId='1' AND EntityType='Employee')
OR ( EntityId=SomeFuncToGetXmlFieldByNodeName('productId') )
Can you point me to how to write SomeFuncToGetXmlFieldByNodeName(fieldName)?

Looks like you want a function like this.
CREATE FUNCTION [dbo].[SomeFuncToGetXmlFieldByNodeName]
(
@NodeName nvarchar(100),
@XML xml
)
RETURNS nvarchar(max)
AS
BEGIN
RETURN @XML.value('(*[local-name(.) = sql:variable("@NodeName")]/text())[1]', 'nvarchar(max)')
END
It takes a node name and some XML as parameters and returns the value of that node.
Use the function like this:
select T.EntityId,
T.EntityType,
T.[Xml column]
from YourTable as T
where T.EntityID = 1 and
T.EntityType = 'Employee' or
T.EntityId = dbo.SomeFuncToGetXmlFieldByNodeName('productId', T.[Xml column])
Instead of the above, I recommend trying a query that avoids the scalar-valued function and uses the exist() method of the xml data type:
select T.EntityId,
T.EntityType,
T.[Xml column]
from YourTable as T
where T.EntityID = 1 and
T.EntityType = 'Employee' or
T.[Xml column].exist('/productId[. = sql:column("T.EntityID")]') = 1
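To sanity-check the filter logic outside the database, here is a minimal Python sketch of the same predicate. It is not SQL Server code: the row tuples copy the question's sample table, and the helper names `node_text` and `matches` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample rows mirroring the question's table: (EntityId, EntityType, xml text).
rows = [
    (1, "Employee", "<productId>1</productId>"),
    (1, "Product", "<name>apple</name>"),
    (7, "Shop", "<country>...</country>"),
]

def node_text(xml_text, node_name):
    """Return the text of the root element if it is named node_name, else None."""
    root = ET.fromstring(xml_text)
    return root.text if root.tag == node_name else None

def matches(row):
    """Same condition as the WHERE clause: fixed id/type, or productId equals EntityId."""
    entity_id, entity_type, xml_text = row
    product_id = node_text(xml_text, "productId")
    return (entity_id == 1 and entity_type == "Employee") or (
        product_id is not None and product_id == str(entity_id)
    )

selected = [row for row in rows if matches(row)]
```

With the sample data only the Employee row is selected, because it is the only row that has a productId node at all.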

I think you're looking for the documentation!
The "query" method is likely what you'll need. See examples in the linked article.

Related

Concatenate multiple rows to form one single row in SQL Server?

Overview
I need to build a description field that describes an entity. The data I am working with has the property description split for each individual key in my table. Below is an example of what the data looks like:
+------------+--------------------+----------+
| Key | Desc | Order_Id |
+------------+--------------------+----------+
| 5962417474 | Big Yellow Door | 14775 |
| 5962417474 | Orange Windows | 14776 |
| 5962417474 | Blue Triangle Roof | 14777 |
+------------+--------------------+----------+
Originally, I wrote a query using an aggregate function like so:
SELECT
[P].[KEY],
CONCAT (MIN([P].[Desc]), + ' ' + MAX([P].[Desc])) [PROPERTY_DESCRIPTION]
FROM [dbo].[PROP_DESC] [P]
WHERE [P].[KEY] = '5962417474'
GROUP BY [P].[KEY];
This worked great for two row entries but then I realized what if I have multiple records for a property description? So I wrote the following query to check if I had multiple property descriptions:
SELECT
[P].[KEY], COUNT([P].[KEY])
FROM [dbo].[PROP_DESC] [P]
GROUP BY [P].[KEY]
HAVING COUNT(*) > 2; -- Returns one record which is the above table result.
This gave me back a record with three descriptions, so my original query will not work. How can I tackle this problem when a key has more than two descriptions?
Desired Output
+------------+---------------------------------------------------+----------+
| Key | Desc | Order_Id |
+------------+---------------------------------------------------+----------+
| 5962417474 | Big Yellow Door Orange Windows Blue Triangle Roof | 14775 |
+------------+---------------------------------------------------+----------+
It depends on which SQL dialect you're using, but you'll want some kind of group-concat / array-agg function. E.g.:
SELECT
[Key],
STRING_AGG([Desc], ', ')
FROM [dbo].[PROP_DESC]
GROUP BY [Key];
I solved my problem with the following query. It is for anyone with the same problem who does not have access to STRING_AGG, which was introduced in SQL Server 2017:
SELECT
[P].[KEY],
[PROPERTY_DESCRIPTION] = STUFF((
SELECT ' ' + [P2].[DESC]
FROM [dbo].[PROP_DESC] [P2]
WHERE [P].[KEY] = [P2].[KEY]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, '')
FROM [dbo].[PROP_DESC] [P]
WHERE [P].[KEY] = '5962417474'
GROUP BY [P].[KEY]
There are many ways to do it in SQL Server. Below is one way:
SELECT [Key]
,STUFF((SELECT '| ' + CAST([Desc] AS VARCHAR(MAX)) [text()]
FROM PROP_DESC
WHERE [Key] = t.[Key]
FOR XML PATH(''), TYPE)
.value('.','NVARCHAR(MAX)'),1,2,'') prop_desc
FROM PROP_DESC t
GROUP BY [Key]
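For checking the expected output independently of the SQL Server version, the grouping logic can be sketched in a few lines of Python. This is only an illustration: the function name `concat_descriptions` is hypothetical, and ordering the pieces by Order_Id is an assumption taken from the desired output, since none of the queries above specify an order.

```python
from collections import defaultdict

# (Key, Desc, Order_Id) rows copied from the question's sample data.
rows = [
    ("5962417474", "Big Yellow Door", 14775),
    ("5962417474", "Orange Windows", 14776),
    ("5962417474", "Blue Triangle Roof", 14777),
]

def concat_descriptions(rows, sep=" "):
    """Join all descriptions per key, ordered by Order_Id (an assumed ordering)."""
    grouped = defaultdict(list)
    for key, desc, _order_id in sorted(rows, key=lambda r: (r[0], r[2])):
        grouped[key].append(desc)
    return {key: sep.join(descs) for key, descs in grouped.items()}

result = concat_descriptions(rows)
```

If the order of the concatenated pieces matters in SQL as well, it probably needs to be stated explicitly in the query rather than left to the engine.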

SQL Server : extract domain and params from 1 million rows into temp table

I have just over a million rows of URLs in one column. The column name is [url] and the table name is redirects.
I'm running SQL Server 2014.
I need a way to extract the sub domain for each url into a new column in a temp table.
Ideally, at the same time, I would select the distinct param names from the query string into another column and the param values into a third column.
My main concern is performance: I don't want to lock up the server while looping through a million rows.
I would be happy to run 3 queries to get the results if it makes more sense
Examples of the column data:
https://www.google.com/ads/ga-audiences?v=1&aip=1&t=sr&_r=4&tid=UA-9999999-1&cid=9999107657.199999834&jid=472999996&_v=j66&z=1963999907
https://track.kspring.com/livin-like-a-star#pid=370&cid=6546&sid=front
So I end up with 3 columns in a temp table
URL | Param | Qstring
------------------+-------+----------
www.google.com | v | 1
www.google.com | aip | 1
www.google.com | t | dc
www.google.com | tid | UA-1666666-1
www.google.com | jid | 472999996
track.kspring.com | pid | 370
track.kspring.com | cid | 6546
track.kspring.com | sid | front
I've been looking at some examples to extract the domain name from a string but I don't have much experience with regex or string manipulation.
This is the kind of processing at which .NET CLR functions excel. Just use Uri and parse away, from a CLR table-valued function (so that you can output more than one column in a single call).
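The parsing step itself is small once a real URL parser is available. As a rough sketch of the idea in Python rather than .NET (so not a CLR function, just the same approach), using only the standard library; the function name `split_url` is made up, and falling back to the fragment is an assumption based on the second sample URL, which carries its parameters after '#':

```python
from urllib.parse import urlsplit, parse_qsl

def split_url(url):
    """Yield (host, param, value) for each parameter in the query string or fragment."""
    parts = urlsplit(url)
    # Use the query string if present, otherwise the fragment (text after '#').
    raw = parts.query or parts.fragment
    for name, value in parse_qsl(raw, keep_blank_values=True):
        yield parts.hostname, name, value

sample = "https://track.kspring.com/livin-like-a-star#pid=370&cid=6546&sid=front"
triples = list(split_url(sample))
```

Each triple corresponds to one row of the desired temp table (URL, Param, Qstring).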
Grab a copy of NGrams8K and you can do this:
-- sample data
declare @table table ([url] varchar(8000));
insert @table values
('https://www.google.com/ads/ga-audiences?v=1&aip=1&t=sr&_r=4&tid=UA-9999999-1&cid=9999107657.199999834&jid=472999996&_v=j66&z=1963999907'),
('https://track.kspring.com/livin-like-a-star#pid=370&cid=6546&sid=front');
declare @delimiter varchar(20) = '%[#?;]%'; -- customizable parameter for parsing parameter values
-- solution
select
[url] = substring([url], a1.startPos, a2.aLen-a1.startPos),
[param] = substring(item, 1, charindex('=', split.item)-1),
qString = substring(item, charindex('=', split.item)+1, 8000)
from @table t
cross apply (values (charindex('//',[url])+2)) a1(startPos)
cross apply (values (charindex('/',[url],a1.startPos))) a2(aLen)
cross apply
(
select split.item
from (values (len(substring([url], a2.aLen,8000)), 1)) as l(s,d)
cross apply
( select -(l.d) union all
select ng.position
from dbo.NGrams8k(substring([url], a2.aLen,8000), l.d) as ng
where token LIKE @delimiter
) as d(p)
cross apply (values(replace(substring(substring([url], a2.aLen,8000), d.p+l.d,
isnull(nullif(patindex('%'+@delimiter+'%',
substring(substring([url], a2.aLen,8000), d.p+l.d, l.s)),0)-1, l.s+l.d)),
'&amp',''))) split(item)
where split.item like '%=%'
) split(item);
Results
url param qString
------------------- ------- ---------------------------------
www.google.com v 1
www.google.com aip 1
www.google.com t sr
www.google.com _r 4
www.google.com tid UA-9999999-1
www.google.com cid 9999107657.199999834
www.google.com jid 472999996
www.google.com _v j66
www.google.com z 1963999907
track.kspring.com pid 370
track.kspring.com cid 6546
track.kspring.com sid front

SQL join on returns none values

// EDIT: Found the problem. I changed all column types from text to varchar. Now it works fine.
I have a table called "sounds" which looks like this:
rowID type int(11) | userID type text | name type text| soundfile type text
1 | "mod001:02" | "Jimmy" | "music/song.mp3"
and a table called "soundlist" which looks like this:
soundID type int(11) | name type text | soundfile type text | used type tinyint(1)
1 | "topSong" | "music/song.mp3" | 1
My problem is that when I run this query
SELECT *
FROM sounds
INNER JOIN soundlist
ON STRCMP(soundlist.soundfile, sounds.soundfile) = 0
WHERE STRCMP(sounds.userID, "mod001:02") = 0;
I get an empty result!
My goal is to set "soundlist.used" to 0. I only have "sounds.userID" given.
I'm currently using this query:
UPDATE soundlist
INNER JOIN sounds
ON STRCMP(sounds.userID, "mod001:02") = 0
SET soundlist.used = 0
WHERE STRCMP(soundlist.soundfile, sounds.soundfile) = 0;
You can use a nested query:
UPDATE soundlist
set soundlist.used=0
where soundfile IN ( -- using IN keyword instead of = if you want to update multiple entries
select sounds.soundfile
from sounds
where sounds.rowID=1 -- or any other condition
);
I am assuming that rowID is an INT.
And if you want to go even further and avoid comparing strings altogether, why not use foreign keys?
Leave sounds as it is:
rowID | userID | name | soundfile
1 | "mod001:02" | "Jimmy" | "music/song.mp3"
And modify soundlist to reference sounds:
soundID | name | soundId | used
1 | "topSong" | 1 | 1
Your query :
SELECT *
FROM sounds
INNER JOIN soundlist
ON STRCMP(soundlist.soundfile, sounds.soundfile) = 0
WHERE STRCMP(sounds.userID, "mod001:02") = 0;
would become
SELECT *
FROM sounds s
INNER JOIN soundlist l
ON s.rowId=l.soundId
where STRCMP(s.userID, "mod001:02") = 0;
This saves you one STRCMP.
Consider using indexes on the varchar columns used in conditions. It is faster, and the queries are sometimes easier to read (s.userID = "mod001:02" is more straightforward).
Edited: This will update sounds.userid to "0" where soundlist.used is "1":
UPDATE sounds
INNER JOIN soundlist ON
sounds.soundfile = soundlist.soundfile
SET sounds.userid = "0"
WHERE soundlist.used = "1"
If instead you want sounds.userid to equal soundlist.used:
UPDATE sounds
INNER JOIN soundlist ON
sounds.soundfile = soundlist.soundfile
SET sounds.userid = soundlist.used
The problem is that you use the text data type. If I use varchar, the first query returns the desired result set.

Counting occurrences in a table

Let's say I want to count the total number of occurrences of a tag contained within a string in a column, and display that total next to every row containing that tag in a new column beside it. For example, if I have:
Name | Home Address | Special ID
==================================
Frank | 152414 | aTRF342
Jane | 4342342 | rRFC432
Mary | 423432 | xTRF353
James | 32111111 | tLZQ399
May | 4302443 | 3TRF322
How would I count the occurrences of special tags like 'TRF', 'RFC', or 'LZQ' so the table looks like this:
Name | Home Address | Special ID | Occurrences
================================================
Frank | 152414 | aTRF342 | 3
Jane | 4342342 | rRFC432 | 1
Mary | 423432 | xTRF353 | 3
James | 32111111 | tLZQ399 | 1
May | 4302443 | 3TRF322 | 3
Currently using Access 2007. Is this even possible using a SQL query?
Using Access 2007, I stored your sample data in a table named tblUser1384831. The query below returns this result set.
Name Home Address Special ID special_tag Occurrences
---- ------------ ---------- ----------- -----------
Frank 152414 aTRF342 TRF 3
Jane 4342342 rRFC432 RFC 1
Mary 423432 xTRF353 TRF 3
James 32111111 tLZQ399 LZQ 1
May 4302443 3TRF322 TRF 3
Although your question has a vba tag, you don't need to use a VBA procedure for this. You can do it with SQL and the Mid() function.
SELECT
base.[Name],
base.[Home Address],
base.[Special ID],
base.special_tag,
tag_count.Occurrences
FROM
(
SELECT
[Name],
[Home Address],
[Special ID],
Mid([Special ID],2,3) AS special_tag
FROM tblUser1384831
) AS base
INNER JOIN
(
SELECT
Mid([Special ID],2,3) AS special_tag,
Count(*) AS Occurrences
FROM tblUser1384831
GROUP BY Mid([Special ID],2,3)
) AS tag_count
ON base.special_tag = tag_count.special_tag;
You would have to GROUP BY the substring of Special ID. In MS Access, you can read about how to compute substrings here.
The problem in your case is that the data in the Special ID column may not follow a standard pattern that is easy to extract with the substring function. You might need to use regular expressions to extract such values, and then apply the GROUP BY to them.
With MSSQL, Oracle, PostgreSQL you would be able to declare a stored procedure (example CLR function in MS SQL Server) that would do this for you. Not sure with MS Access.
You can do something like this:
select Name, [Home Address], [Special ID],
(select count(*) from [your table] where [Special ID] = RemoveNonAlphaCharacters([Special ID]) ) as Occurrences
from [your table]
Auxiliary function (taken from this link):
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
While PatIndex('%[^a-z]%', @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex('%[^a-z]%', @Temp), 1, '')
Return @Temp
End
Let's say your first table is called 'table_with_string'.
The following code shows the occurrences based on the first 3 characters of the string in the Special ID column, since it is not clear exactly how you are passing the string to match:
select tws.Name,tws.HomeAddress,tws.SpecialID,str_count.Occurrences from
table_with_string tws
left join
(select SpecialID,count(*) from table_with_string where specialID like(substring
(specialid,0,3))
group by specialId) as str_count(id,Occurrences)
on str_count.id=tws.SpecialID
I would suggest doing this explicitly as a join, so you are clear on how it works:
select tws.Name, tws.HomeAddress, tws.SpecialID, str_count.Occurrences
from table_with_string tws
join
(
select substring(SpecialID, 2, 3) as code, count(*) as Occurrences
from table_with_string
group by substring(SpecialID, 2, 3)
) str_count
on str_count.code = substring(tws.SpecialID, 2, 3)
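To see the join-on-a-derived-count pattern run end to end, here is a small self-contained sketch using Python's built-in sqlite3 module instead of Access or SQL Server. The table name `people` and its column names are made up for the demo, and SQLite's `substr` stands in for `Mid` / `substring`; it assumes, as the answers above do, that the tag always sits at characters 2 to 4 of the Special ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, home_address TEXT, special_id TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Frank", "152414", "aTRF342"),
        ("Jane", "4342342", "rRFC432"),
        ("Mary", "423432", "xTRF353"),
        ("James", "32111111", "tLZQ399"),
        ("May", "4302443", "3TRF322"),
    ],
)

# Count rows per 3-character tag, then join the counts back onto every row.
query = """
SELECT p.name, p.special_id, c.occurrences
FROM people AS p
JOIN (
    SELECT substr(special_id, 2, 3) AS code, COUNT(*) AS occurrences
    FROM people
    GROUP BY substr(special_id, 2, 3)
) AS c ON c.code = substr(p.special_id, 2, 3)
ORDER BY p.rowid
"""
counts = {name: n for name, _special_id, n in conn.execute(query)}
```

The derived table computes one count per tag, and the join repeats that count on each row that shares the tag, which is what the desired Occurrences column asks for.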

How do I Pivot on an XML column's attributes in T-SQL

I need to perform a pivot on an XML column in a table, where the XML contains multiple elements with a number of attributes. The attributes in each element are always the same; however, the number of elements will vary. Let me give an example...
FormEntryId | FormXML | DateCreated
====================================================================================
1 |<Root> | 10/15/2009
| <Form> |
| <FormData FieldName="Username" FieldValue="stevem" /> |
| <FormData FieldName="FirstName" FieldValue="Steve" /> |
| <FormData FieldName="LastName" FieldValue="Mesa" /> |
| </Form> |
|</Root> |
| |
------------------------------------------------------------------------------------
2 |<Root> | 10/16/2009
| <Form> |
| <FormData FieldName="Username" FieldValue="bobs" /> |
| <FormData FieldName="FirstName" FieldValue="Bob" /> |
| <FormData FieldName="LastName" FieldValue="Suggs" /> |
| <FormData FieldName="NewField" FieldValue="test" /> |
| </Form> |
|</Root> |
I need to wind up with a result set that has one column for each distinct FieldName attribute value (in this example, Username, FirstName, LastName, and NewField), with the corresponding FieldValue attribute as the value. The results for the example above would look like:
FormEntryId | Username | FirstName | LastName | NewField | DateCreated
======================================================================
1 | stevem | Steve | Mesa | NULL | 10/15/2009
----------------------------------------------------------------------
2 | bobs | Bob | Suggs | test | 10/16/2009
I've figured out a way to accomplish this with static columns
SELECT
FormEntryId,
FormXML.value('/Root[1]/Form[1]/FormData[@FieldName="Username"][1]/@FieldValue','varchar(max)') AS Username,
FormXML.value('/Root[1]/Form[1]/FormData[@FieldName="FirstName"][1]/@FieldValue','varchar(max)') AS FirstName,
FormXML.value('/Root[1]/Form[1]/FormData[@FieldName="LastName"][1]/@FieldValue','varchar(max)') AS LastName,
FormXML.value('/Root[1]/Form[1]/FormData[@FieldName="NewField"][1]/@FieldValue','varchar(max)') AS NewField,
DateCreated
FROM FormEntry
However I would like to see if there's a method to have the columns be dynamic based on the distinct set of "FieldName" attribute values.
Have a look at this dynamic pivot and more recently this one - you basically need to be able to SELECT DISTINCT FieldName to use this technique to build your query dynamically.
Here's the full answer for your particular problem (note that there is a column order weakness when generating the list from the distinct attributes in knowing what order the columns should appear):
DECLARE @template AS varchar(MAX)
SET @template = 'SELECT
FormEntryId
,{@col_list}
,DateCreated
FROM FormEntry'
DECLARE @col_template AS varchar(MAX)
SET @col_template = 'FormXML.value(''/Root[1]/Form[1]/FormData[@FieldName="{FieldName}"][1]/@FieldValue'',''varchar(max)'') AS {FieldName}'
DECLARE @col_list AS varchar(MAX)
;WITH FieldNames AS (
SELECT DISTINCT FieldName
FROM FormEntry
CROSS APPLY (
SELECT X.FieldName.value('@FieldName', 'varchar(255)')
FROM FormXML.nodes('/Root[1]/Form[1]/FormData') AS X(FieldName)
) AS Y (FieldName)
)
SELECT @col_list = COALESCE(@col_list + ',', '') + REPLACE(@col_template, '{FieldName}', FieldName)
FROM FieldNames
DECLARE @sql AS varchar(MAX)
SET @sql = REPLACE(@template, '{@col_list}', @col_list)
EXEC (@sql)
Dynamic pivot isn't built into the language for good reason. It would be necessary to scan the entire table containing potential column names before the structure of the result were known. As a result, the table structure of the dynamic pivot statement would be unknown before run time. This creates many problems regarding parsing and interpretation of language.
If you decide to implement dynamic pivot on your own, watch out for SQL injection opportunities. Be sure to apply QUOTENAME or equivalent to the values you plan to use as column names in your result. Also consider what result you want if the number of distinct values in your source that will become column names exceeds the allowed number of columns of a result set.
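To make the quoting advice concrete, here is a tiny Python sketch of the bracket-quoting rule that QUOTENAME applies by default: wrap the name in square brackets and double any closing bracket inside it. The function `quote_name` is a hypothetical stand-in written for illustration, not the T-SQL function itself, and it ignores QUOTENAME's length limit.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any ']' so it cannot end the quoting early."""
    return "[" + name.replace("]", "]]") + "]"

# A FieldName value taken from data may contain anything, including SQL.
field_names = ["Username", "FirstName", "bad]; DROP TABLE FormEntry;--"]
aliases = [quote_name(n) for n in field_names]
```

Used as a column alias, the hostile value stays a single (odd-looking) identifier instead of terminating the alias and starting a new statement. The same value embedded in the XPath string literal would still need its own escaping, which this sketch does not cover.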