SQL Server: how to compare two tables

I have a problem comparing two tables in SQL Server.
I have a first table [Table1] with a text column where I store my content, and a second table [Table2] with a column of keywords.
I want to compare all my keywords against my content and get a list of keywords with the number of occurrences of each in the content.

Which version of SQL Server? On SQL Server 2008 you can use sys.dm_fts_parser (probably after casting from text to nvarchar(max)):
WITH Table1 AS
(
SELECT 1 AS Id, N'how now brown cow' AS txt UNION ALL
SELECT 2, N'she sells sea shells upon the sea shore' UNION ALL
SELECT 3, N'red lorry yellow lorry' UNION ALL
SELECT 4, N'the quick brown fox jumped over the lazy dog'
),
Table2 AS
(
SELECT 'lorry' as keyword UNION ALL
SELECT 'yellow' as keyword UNION ALL
SELECT 'brown' as keyword
)
SELECT Table1.id,display_term, COUNT(*) As Cnt
FROM Table1
CROSS APPLY sys.dm_fts_parser('"' + REPLACE(txt,'"','""') + '"', 1033, 0,0)
JOIN Table2 t2 ON t2.keyword=display_term
WHERE TXT IS NOT NULL
GROUP BY Table1.id,display_term
ORDER BY Cnt DESC
Returns
id          display_term                   Cnt
----------- ------------------------------ -----------
3           lorry                          2
3           yellow                         1
4           brown                          1
1           brown                          1

This returns a list of IDs from Table1 (id int, txt ntext) paired with the keywords from Table2 (kwd nvarchar(255)) that occur in the ntext field. Counting the number of occurrences is trickier: you will have to write a UDF, preferably a CLR one, to get it.
I define a word as anything delimited by a space or opening parenthesis on the left and by a space, closing parenthesis, comma, dot or semicolon on the right. You can add more delimiters, e.g. single or double quotes.
Select Table1.id, Table2.kwd
From Table1
Cross Join Table2
Where patindex(N'%[ (]'+Table2.kwd+N'[ ,.;)]%',N' '+cast(Table1.txt as nvarchar(max))+N' ')>0
Order by id, kwd
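To make the occurrence count concrete, the counting logic can be sketched outside the database. The Python below is only an illustration under assumptions: the function name count_keywords and the tokenisation rule (runs of letters, digits and underscores, compared case-insensitively) are mine and not part of either answer, and SQL Server's full-text word breaker may split text differently.

```python
import re
from collections import Counter


def count_keywords(rows, keywords):
    """Return {(row_id, keyword): occurrences} for whole-word matches.

    rows is an iterable of (id, text) pairs; text may be None.
    Matching is case-insensitive (an assumption of this sketch).
    """
    wanted = {k.lower() for k in keywords}
    counts = Counter()
    for row_id, text in rows:
        if text is None:
            continue  # same effect as WHERE txt IS NOT NULL
        # Split the text into words and count only those that are keywords.
        for word in re.findall(r"\w+", text.lower()):
            if word in wanted:
                counts[(row_id, word)] += 1
    return dict(counts)


sample_rows = [
    (1, "how now brown cow"),
    (2, "she sells sea shells upon the sea shore"),
    (3, "red lorry yellow lorry"),
    (4, "the quick brown fox jumped over the lazy dog"),
]
print(count_keywords(sample_rows, ["lorry", "yellow", "brown"]))
```

For the sample data this yields the same counts as the sys.dm_fts_parser query: lorry twice and yellow once in row 3, brown once in rows 1 and 4.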

Replace first two characters with three characters SQL Oracle

I have a question.
I want to join two tables (Table1 and Table2) on the custID column.
However, for the join to work I need to edit the values in Table1's custID column by removing the first two characters ('CC') and left-padding the remainder with 0s so the final value is 8 digits long.
So if Table1 had a custID value of CC34054, it would need to be converted to 00034054 for the join to find that value in Table2.custID. If the custID value in Table1 was CC3356, it would need to be converted to 00003356 for the join to match.
I've made some tables below to illustrate what I mean.
Table1
CustID
CC34054
CC3356
CC87901
Table2
CustID
00034054
00003356
00087901
I hope this makes sense. thanks!
One option is to replace CC with an empty string and apply LPAD function to fill up to 8 characters with zeros; do it either in JOIN or - possibly - by updating table1 (so join would then look simpler, just on a.custid = b.custid).
Sample data:
with
  table1 (custid, name) as
    (select 'CC34054', 'Little' from dual union all
     select 'CC3356' , 'Scott' from dual
    ),
  table2 (custid, surname) as
    (select '00034054', 'Foot' from dual union all
     select '00003356', 'Tiger' from dual
    )
Query:
select b.custid, a.name, b.surname
from table1 a join table2 b on
  lpad(replace(a.custid, 'CC', ''), 8, '0') = b.custid;

CUSTID   NAME   SURNAME
-------- ------ ----------
00034054 Little Foot
00003356 Scott  Tiger
Another way is to use the SubStr() function to remove the first two characters (whether they are 'CC' or anything else) and then Lpad the result to a length of 8 with '0' characters:
Select b.CUSTID
From tbl_1 a
Inner Join tbl_2 b on (Lpad(SubStr(a.CUSTID, 3), 8, '0') = b.CUSTID)
Regards...
I think there is no need to replace or pad the leading characters; instead, trim both CustID columns, provided the data format is fixed throughout the whole table, such as
SELECT *
FROM Table1 t1
JOIN Table2 t2
ON LTRIM(t1.CustID,'C') = LTRIM(t2.CustID,'0')
Single characters are sufficient as the second arguments here.
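The three join conditions suggested in this thread normalise the key in slightly different ways. The Python sketch below mirrors each one so the differences can be checked on sample values; the function names are hypothetical, and the value CC0345 in the usage note is an invented edge case, not data from the question.

```python
def pad_after_replace(cust_id: str) -> str:
    # Mirrors LPAD(REPLACE(custid, 'CC', ''), 8, '0').
    # Like REPLACE, this removes every 'CC', not only a leading one.
    # Unlike LPAD, rjust never truncates values longer than 8 characters.
    return cust_id.replace("CC", "").rjust(8, "0")


def pad_after_substr(cust_id: str) -> str:
    # Mirrors LPAD(SUBSTR(custid, 3), 8, '0'): drop the first two
    # characters whatever they are, then left-pad with zeros.
    return cust_id[2:].rjust(8, "0")


def trimmed_equal(table1_id: str, table2_id: str) -> bool:
    # Mirrors LTRIM(t1.CustID, 'C') = LTRIM(t2.CustID, '0').
    return table1_id.lstrip("C") == table2_id.lstrip("0")


for raw, padded in [("CC34054", "00034054"), ("CC3356", "00003356")]:
    print(raw, pad_after_replace(raw), pad_after_substr(raw),
          trimmed_equal(raw, padded))
```

All three agree on the question's data. They can diverge if the digits after the prefix themselves start with a zero: with the invented value CC0345, both padding variants give 00000345, but the trimmed comparison sees 0345 on one side and 345 on the other and does not match.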

Combine Query Result into single row

I have a SQL query that returns data like this:
Name | City
-------------------
Frank | London
Sebastian | New York
I want to merge that result into a single row and column like this:
Frank;London;Sebastian;New York
How can I write a query that does this? Thanks in advance.
By default SSMS prints results in grid format. One option is to print the results as text instead: use Results to Text on the toolbar or press Ctrl+T. This prints tab-delimited results by default rather than the semicolon-delimited results you want. You can change that under Query -> Query Options -> Results -> Text -> Output Format, or use Ctrl+H in a text editor (like Notepad) to find and replace all tabs with semicolons.
You can do:
select stuff ( (select ';'+t1.col2
                from table t cross apply
                     ( values (name), (city) ) t1 (col2)
                for xml path ('')
               ), 1, 1, ''
             ) ;
Maybe this one for Oracle?
WITH tmp AS
(
SELECT 'Frank' Name, 'London' City FROM dual
UNION
SELECT 'Sebastian', 'New York' FROM dual
)
SELECT LISTAGG(name||';'||city, ';') WITHIN GROUP(ORDER BY null) FROM tmp
In SQL Server that should do the work:
SELECT ';'+rtrim(Name)+';'+rtrim(City)
FROM Table
FOR XML PATH('')
Oracle does not have the FOR XML PATH('') syntax, but you can concatenate a field like this, for example:
SELECT ';' || WM_CONCAT(name||';'||City) AS Result
FROM table
Note that WM_CONCAT is undocumented and is not available in Oracle 12c, but you can use LISTAGG instead:
SELECT LISTAGG(Name||';'||City,';') WITHIN GROUP(ORDER BY aColumn DESC) FROM TABLE
cheers
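As a reference for what the queries in this thread are expected to produce, the target string can be written down in a few lines of Python. This is only a sketch of the desired result, not a replacement for the SQL; the name flatten_rows is hypothetical.

```python
def flatten_rows(rows, sep=";"):
    """Join every column of every row, in row order, into one string."""
    return sep.join(value for row in rows for value in row)


rows = [("Frank", "London"), ("Sebastian", "New York")]
print(flatten_rows(rows))
```

A candidate query is correct when it returns exactly this string, with no leading separator, no extra spaces and no reordering of the values.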

Comma separated column ids should show values or text(with function or query)

I have a table like this
Foreign table:
select * from table1
ID......NameIds
-------------------
1 ......1, 2 (comma-separated values)
Primary table(table2)
ID Name
-------------------
1 Cleo
2 Smith
I want to show table1 like this (I need a SQL function or query for it):
ID......NameIds
-------------------
1........Cleo, smith (show text/Name instead of values)
As stated in the comments, you should really rethink your table design, but it was interesting enough to try to write a query for it:
SELECT T1.ID, NameID, Name
INTO #Temporary
FROM #Table1 AS T1
CROSS APPLY (
SELECT CAST(('<X>' + REPLACE(T1.NameIDs, ',', '</X><X>') + '</X>') AS XML)
) AS X(XmlData)
CROSS APPLY (
SELECT NameID.value('.', 'INT')
FROM XmlData.nodes('X') AS T(NameID)
) AS T(NameID)
INNER JOIN #Table2 AS T2
ON T2.ID = T.NameID
SELECT ID, STUFF(T.Names, 1, 1, '') AS Names
FROM #Table1 AS T1
CROSS APPLY (
SELECT ',' + Name
FROM #Temporary AS T
WHERE T.ID = T1.ID
ORDER BY T.NameID
FOR XML PATH('')
) AS T(Names)
Result:
ID Names
--------------
1 Cleo,Smith
It splits your comma-separated list into rows, joins them to Table2 on the name IDs, and then concatenates the names again, so it is not efficient.
It's probably not the best way to do this, but it works.
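The split, join and concatenate steps are easier to see outside SQL. The following Python sketch mirrors them under assumptions: the function name names_for_ids and the dictionary lookup standing in for Table2 are mine, and IDs with no match are silently dropped, as the INNER JOIN would do.

```python
def names_for_ids(name_ids: str, lookup: dict) -> str:
    """Turn a comma-separated ID list such as '1, 2' into 'Cleo,Smith'."""
    # Step 1: split the list into integer IDs (int() ignores the spaces).
    ids = [int(part) for part in name_ids.split(",") if part.strip()]
    # Step 2: keep only IDs present in the lookup, ordered by ID,
    # like the INNER JOIN followed by ORDER BY NameID.
    matched = [lookup[i] for i in sorted(ids) if i in lookup]
    # Step 3: concatenate the names again.
    return ",".join(matched)


table2 = {1: "Cleo", 2: "Smith"}
print(names_for_ids("1, 2", table2))
```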

Combining data from 2 tables in to 1 dynamic query

I have two tables:
table 1
id item itemType
-----------------------
1 book1 1
2 book2 1
3 laptop1 2
table 2
id itemId name value
------------------------------------------
1 1 author enid blyton
2 1 title five 1
3 2 author enid blyton
4 2 title five 2
5 3 cpu i7-940
6 3 ram 4 GB
7 3 vcard nvidia quadro
When I query with filter itemType = 1, the result should be:
query 1
id item author title
--------------------------------------------------------
1 book1 enid blyton five 1
2 book2 enid blyton five 2
and with filter itemType = 2
query 2
id item cpu ram vcard
----------------------------------------------
1 laptop1 i7-940 4 GB nvidia quadro
and without filter
query 3
id item author title cpu ram vcard
---------------------------------------------------------------------------
1 book1 enid blyton five 1
2 book2 enid blyton five 2
1 laptop1 i7-940 4 GB nvidia quadro
The reason I use table 2 is that the parameters of each itemType are created on the fly, so it is not possible to have a fixed table like the one in query 3.
At the moment I solve this in C# by rebuilding the table programmatically (using a lot of LINQ calls). With small tables (1K rows in table 1 and 10K rows in table 2) the performance was good, but table 1 now has more than 100K rows and table 2 more than 1M rows, and the performance is very poor.
Is there a way to solve this problem with a SQL query?
Not exactly dynamic, but if your names are all known up front, you can use PIVOT to retrieve your data.
PIVOT rotates a table-valued expression by turning the unique values
from one column in the expression into multiple columns in the output,
and performs aggregations where they are required on any remaining
column values that are wanted in the final output.
SQL Statement
SELECT t1.Id
, t1.item
, t2.author
, t2.title
, t2.cpu
, t2.ram
, t2.vcard
FROM table1 t1
INNER JOIN (
SELECT *
FROM (
SELECT itemId
, name
, value
FROM table2
) s
PIVOT (
MAX(Value)
FOR name IN (title, author, cpu, ram, vcard)
) p
) t2 ON t2.itemId = t1.Id
Test script
;WITH table1 (id, item, itemtype) AS (
SELECT 1, 'book1', 1
UNION ALL SELECT 2, 'book2', 1
UNION ALL SELECT 3, 'laptop1', 2
)
, table2 (id, itemId, name, value) AS (
SELECT 1, 1, 'author', 'enid blyton'
UNION ALL SELECT 2, 1, 'title', 'five 1'
UNION ALL SELECT 3, 2, 'author', 'enid blyton'
UNION ALL SELECT 4, 2, 'title', 'five 2'
UNION ALL SELECT 5, 3, 'cpu', 'i7 940'
UNION ALL SELECT 6, 3, 'ram', '4 GB'
UNION ALL SELECT 7, 3, 'vcard', 'nvidia quadro'
)
SELECT t1.Id
, t1.item
, t2.author
, t2.title
, t2.cpu
, t2.ram
, t2.vcard
FROM table1 t1
INNER JOIN (
SELECT *
FROM (
SELECT itemId
, name
, value
FROM table2
) s
PIVOT (
MAX(Value)
FOR name IN (title, author, cpu, ram, vcard)
) p
) t2 ON t2.itemId = t1.Id
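To show what the PIVOT is doing to the rows, here is a small Python sketch of the same transformation. It is an illustration under assumptions, not a suggestion to move the work out of the database: the function name pivot_attributes is hypothetical, and missing attributes come back as None where SQL would return NULL.

```python
def pivot_attributes(items, attributes, columns):
    """Rotate (id, itemId, name, value) rows into one row per item."""
    # Keep the largest value per (itemId, name), as MAX(value) does.
    values = {}
    for _attr_id, item_id, name, value in attributes:
        key = (item_id, name)
        if key not in values or value > values[key]:
            values[key] = value
    items_with_attributes = {item_id for item_id, _name in values}

    result = []
    for item_id, item, _item_type in items:
        if item_id not in items_with_attributes:
            continue  # INNER JOIN drops items that have no attribute rows
        row = tuple(values.get((item_id, column)) for column in columns)
        result.append((item_id, item) + row)
    return result


table1 = [(1, "book1", 1), (2, "book2", 1), (3, "laptop1", 2)]
table2 = [
    (1, 1, "author", "enid blyton"), (2, 1, "title", "five 1"),
    (3, 2, "author", "enid blyton"), (4, 2, "title", "five 2"),
    (5, 3, "cpu", "i7-940"), (6, 3, "ram", "4 GB"),
    (7, 3, "vcard", "nvidia quadro"),
]
for row in pivot_attributes(table1, table2,
                            ["author", "title", "cpu", "ram", "vcard"]):
    print(row)
```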
I suggest running a query to return all possible names from table2 for the specified itemtype, like so:
select distinct name
from table2 t2
where exists (select null
from table1 t1
where t1.itemtype = @itemtype and
t1.id = t2.itemId)
In C#, concatenate the names into a single comma-separated string, then construct a new query string similar to Lieven's answer, like so:
SELECT t1.item
, t2.*
FROM table1 t1
INNER JOIN (SELECT *
FROM (SELECT itemId,
name,
value
FROM table2) s
PIVOT (MAX(Value)
FOR name IN (/*insert names string here*/)) p
) t2 ON t2.itemId = t1.Id
WHERE t1.itemtype = @itemtype;
(with the names string replacing the comment inside the brackets).
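The string-building step could look roughly like the sketch below. The answer suggests doing this in C#; Python is used here purely for illustration, and the helper names quote_name and build_pivot_query are hypothetical. The sketch assumes the names come from the DISTINCT query above and bracket-quotes each one, doubling any closing bracket as T-SQL's QUOTENAME does, so that an unusual attribute name cannot break the statement. The @itemtype value stays a query parameter.

```python
def quote_name(name: str) -> str:
    """Bracket-quote a column name, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def build_pivot_query(names) -> str:
    """Build the PIVOT query text for the given attribute names."""
    if not names:
        raise ValueError("at least one attribute name is required")
    column_list = ", ".join(quote_name(n) for n in names)
    return (
        "SELECT t1.item, t2.*\n"
        "FROM table1 t1\n"
        "INNER JOIN (SELECT *\n"
        "            FROM (SELECT itemId, name, value FROM table2) s\n"
        "            PIVOT (MAX(value) FOR name IN (" + column_list + ")) p\n"
        "           ) t2 ON t2.itemId = t1.id\n"
        "WHERE t1.itemtype = @itemtype;"
    )


print(build_pivot_query(["author", "title"]))
```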
Incidentally, if possible, I suggest separating the names from Table 2 into a separate lookup table, like so:
name_table
----------
name_id
name
itemtype
- this would mean that the first query would only have to query a small lookup table rather than all of table 2; it could also be used for consistency in name values at data entry.

How do I select rows in table (A) sharing the same foreign key (itemId) where multiple rows in table have the values in table B

Sorry about the title, I'm not sure how to describe this without an example. I am trying to implement faceting of attributes in SQL Server 2008.
I have 2 tables. itemAttributes and facetParameters
Assume the following values in itemAttributes
id, itemId, name, value
---------------------------------------
1 1 keywords example1
2 1 keywords example2
3 2 color red
4 2 keywords example1
5 2 keywords example2
6 3 keywords example2
7 3 color red
8 3 color blue
Assume the following values in facetParameters
name value
----------------------
keywords example1
color red
I need to retrieve the (optional: distinct) itemIds where a given itemId has rows that contain all the values in facetParameters.
e.g. given the rows in facetParameters the query should return itemId 2. At the moment I would be using this inside a CTE; however, since CTEs do not support a number of features, I can work around that if there is no solution that works inside a CTE.
I have done a fair bit of SQL over the years, but this one has really stumped me, and the shame is that I keep thinking the answer must be simple.
You could join both tables, and use a having clause to ensure that all items match:
select ia.itemid
from @itemAttributes ia
inner join @facetParameters fp
on ia.name = fp.name
and ia.value = fp.value
group by ia.itemid
having count(distinct fp.name) =
(
select count(*) from @facetParameters
)
The count in the having clause assumes that the name uniquely identifies a row in the facetParameters table. If it doesn't, add an identity column to facetParameters, and use count(distinct id_column) instead of count(distinct fp.name).
Here's code to create the data set in the question:
declare @itemAttributes table (id int, itemId int,
name varchar(max), value varchar(max))
insert into @itemAttributes
select 1,1,'keywords','example1'
union all select 2,1,'keywords','example2'
union all select 3,2,'color','red'
union all select 4,2,'keywords','example1'
union all select 5,2,'keywords','example2'
union all select 6,3,'keywords','example2'
union all select 7,3,'color','red'
union all select 8,3,'color','blue'
declare @facetParameters table (name varchar(max), value varchar(max))
insert into @facetParameters
select 'keywords','example1'
union all select 'color','red'
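The join-and-HAVING approach can also be tried without a SQL Server instance. The sketch below runs an equivalent query against an in-memory SQLite database from Python; this is a swapped-in engine for demonstration only, with ordinary tables in place of table variables, a hypothetical function name, and an ORDER BY added so the output order is fixed.

```python
import sqlite3


def items_matching_all_facets(item_attributes, facet_parameters):
    """Return itemIds that have a matching row for every facet parameter."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE itemAttributes "
        "(id INTEGER, itemId INTEGER, name TEXT, value TEXT)"
    )
    conn.execute("CREATE TABLE facetParameters (name TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO itemAttributes VALUES (?, ?, ?, ?)", item_attributes
    )
    conn.executemany(
        "INSERT INTO facetParameters VALUES (?, ?)", facet_parameters
    )
    # An item qualifies when the number of distinct facet names it matches
    # equals the total number of facet parameters.
    rows = conn.execute(
        """
        SELECT ia.itemId
        FROM itemAttributes ia
        INNER JOIN facetParameters fp
                ON ia.name = fp.name AND ia.value = fp.value
        GROUP BY ia.itemId
        HAVING COUNT(DISTINCT fp.name) = (SELECT COUNT(*) FROM facetParameters)
        ORDER BY ia.itemId
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


attributes = [
    (1, 1, "keywords", "example1"), (2, 1, "keywords", "example2"),
    (3, 2, "color", "red"), (4, 2, "keywords", "example1"),
    (5, 2, "keywords", "example2"), (6, 3, "keywords", "example2"),
    (7, 3, "color", "red"), (8, 3, "color", "blue"),
]
facets = [("keywords", "example1"), ("color", "red")]
print(items_matching_all_facets(attributes, facets))
```

With the question's data, only itemId 2 matches both facet parameters, which is the result the question asks for.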