Merge 2 tables without repeat data - sql

I need to merge the below tables.
Table 1
UserID |TopicName
1 |Topic1
1 |Topic2
2 |Topic1
2 |Topic2
2 |Topic3
Table2
UserID |Levelname
1 |level1
1 |level2
1 |level3
1 |level4
2 |level1
Output
UserID |TopicName|LevelName
1 |Topic1 |Level1
1 |Topic2 |Level2
1 | |Level3
1 | |Level4

It looks like you want to match rows that have the same userid, based on their position. You could enumerate the rows in a subquery, then left join.
For this to work consistently, you need a column that defines the position of each row in each table. Nothing in your data shows such a column, so I ordered by the other column of each table, but you might want to change that.
select t2.userid, t1.topicname, t2.levelname
from (
    -- number the levels of each user
    select t2.*, row_number() over(partition by userid order by levelname) rn
    from table2 t2
) t2
left join (
    -- number the topics of each user
    select t1.*, row_number() over(partition by userid order by topicname) rn
    from table1 t1
) t1 on t1.userid = t2.userid and t1.rn = t2.rn
You can add a where clause at the end of the query to filter on a given userid, as shown in your expected results:
where t2.userid = 1
If there might be "missing" rows in both tables, then a full join is a better pick:
select coalesce(t1.userid, t2.userid) userid, t1.topicname, t2.levelname
from (
    select t2.*, row_number() over(partition by userid order by levelname) rn
    from table2 t2
) t2
full join (
    select t1.*, row_number() over(partition by userid order by topicname) rn
    from table1 t1
) t1 on t1.userid = t2.userid and t1.rn = t2.rn
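To check the positional join against the sample data from the question, here is a minimal sketch that runs the left join variant in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for whatever database you actually use (an assumption; it needs SQLite 3.25 or later for window functions), and the extra ORDER BY is added just to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (userid INTEGER, topicname TEXT);
    CREATE TABLE table2 (userid INTEGER, levelname TEXT);
    INSERT INTO table1 VALUES (1,'Topic1'),(1,'Topic2'),
                              (2,'Topic1'),(2,'Topic2'),(2,'Topic3');
    INSERT INTO table2 VALUES (1,'level1'),(1,'level2'),(1,'level3'),
                              (1,'level4'),(2,'level1');
""")

# Number the rows of each table per user, then pair them up by position.
rows = conn.execute("""
    SELECT t2.userid, t1.topicname, t2.levelname
    FROM (
        SELECT t2.*, row_number() OVER (PARTITION BY userid ORDER BY levelname) AS rn
        FROM table2 t2
    ) t2
    LEFT JOIN (
        SELECT t1.*, row_number() OVER (PARTITION BY userid ORDER BY topicname) AS rn
        FROM table1 t1
    ) t1 ON t1.userid = t2.userid AND t1.rn = t2.rn
    WHERE t2.userid = 1
    ORDER BY t2.rn
""").fetchall()

for row in rows:
    print(row)
```

For user 1 this pairs Topic1 and Topic2 with the first two levels and leaves the topic empty (None) for level3 and level4, which matches the expected output in the question.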

Related

comparing sum of 2 column values in one table with another column value in second table sql server

I have two tables...
table1 ( id, item, price ) values:
id | item | price
-------------
10 | book | 20
20 | copy | 30
30 | pen | 10
table2 ( id, item, price ) values:
id | item | price
-------------
10 | book | 20
10 | book | 30
Now, if I do not have a record with id 10, item book and price (20+30) in table1, then I want to insert that row, with the sum (20+30) as its price, into a new table ...
If I haven't misunderstood your requirements, this should do it:
SELECT T2.ID, T2.ITEM,T2.SUMPRICE FROM
(SELECT ID, ITEM, SUM(PRICE) AS SUMPRICE FROM table1 GROUP BY ID, ITEM) AS T1
INNER JOIN
(SELECT ID, ITEM, SUM(PRICE) AS SUMPRICE FROM table2 GROUP BY ID, ITEM) AS T2
ON T1.ID = T2.id AND T1.item = T2.item WHERE T1.SUMPRICE <> T2.SUMPRICE
If your 3rd table is already created you could just use an INSERT INTO SELECT statement. Otherwise you could use a SELECT INTO like this:
SELECT T2.ID, T2.ITEM, T2.SUMPRICE AS price INTO table3 FROM
(SELECT ID, ITEM, SUM(PRICE) AS SUMPRICE FROM table1 GROUP BY ID, ITEM) AS T1
INNER JOIN
(SELECT ID, ITEM, SUM(PRICE) AS SUMPRICE FROM table2 GROUP BY ID, ITEM) AS T2
ON T1.ID = T2.id AND T1.item = T2.item WHERE T1.SUMPRICE <> T2.SUMPRICE
Hope it helps!
EDIT 1
In case you want to get all the unmatched rows, that is:
Rows from table 2 where the id, item or price don't simultaneously match a given row in table 1
Rows from table 2 where none of their columns simultaneously match a given row in table 1
You could use the EXCEPT operator, for example:
(SELECT ID, ITEM, SUM(PRICE) AS SUMPRICE FROM table2 GROUP BY ID, ITEM)
EXCEPT
(SELECT ID, ITEM, SUM(PRICE) AS SUMPRICE FROM table1 GROUP BY ID, ITEM)
This returns:
ID ITEM SUMPRICE
---- -------------------- -----------
10 book 50
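The EXCEPT approach can be tried end to end with the sample rows from the question. This is a rough sketch using Python's sqlite3 module with an in-memory database as a stand-in for SQL Server (an assumption). Two things differ from the SQL Server syntax above: SQLite does not accept parentheses around the operands of EXCEPT, and table3 is created up front and filled with INSERT INTO ... SELECT instead of SELECT INTO.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, item TEXT, price INTEGER);
    CREATE TABLE table2 (id INTEGER, item TEXT, price INTEGER);
    CREATE TABLE table3 (id INTEGER, item TEXT, price INTEGER);
    INSERT INTO table1 VALUES (10,'book',20),(20,'copy',30),(30,'pen',10);
    INSERT INTO table2 VALUES (10,'book',20),(10,'book',30);
""")

# Summed rows of table2 that have no identical (id, item, sum) row in table1.
conn.execute("""
    INSERT INTO table3 (id, item, price)
    SELECT id, item, SUM(price) FROM table2 GROUP BY id, item
    EXCEPT
    SELECT id, item, SUM(price) FROM table1 GROUP BY id, item
""")

table3_rows = conn.execute("SELECT id, item, price FROM table3").fetchall()
print(table3_rows)
```

With this data only the (10, book, 50) row ends up in table3, because table1 holds book at 20 rather than 50.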
Try the following
SELECT T2.id,T2.item,T2.Price as Table2_Price,T1.Price as Table1_Price FROM (
SELECT id,Item,Sum(Price) as Price
FROM Table2
Group BY id,Item ) AS T2 LEFT JOIN Table1 AS T1
ON T1.Item = T2.Item and T1.ID = T2.ID
After reading your comments (when the sum of these two does not equal the one in the other table, it should return the sum from table 2), you are asking for the following logic:
SELECT T2.id,T2.Item, CASE WHEN T1.Price IS NULL THEN T2.Price
WHEN T2.Price <> T1.Price THEN T2.Price
ELSE T1.Price END as Price FROM (
SELECT id,Item,Sum(Price) as Price
FROM Table2
Group BY id,Item ) AS T2 LEFT JOIN Table1 AS T1
ON T1.Item = T2.Item and T1.ID = T2.ID
But this logic isn't useful: if Sum(Table2_price) <> Table1_Price you select Sum(Table2_price), and if Sum(Table2_price) = Table1_Price you select Table1_Price, which is the same value. So you always end up choosing Sum(Table2_Price).

SQL Query - ID Join, Just Duplicates

I am working with Oracle SQL. I have two tables. One has ItemID and DatePurchased and the other has ItemID, CustomerID. I'm trying to join the tables so that I can see only those customers with multiple items.
In other words, if I had:
TABLE 1
ItemID---DatePurchased
1 MAR15
2 JUN10
3 APR02
and
TABLE 2
ItemID---CustomerID
1 1
2 1
3 2
I would want this returned:
TABLE 3
ItemID--DatePurchased--CustomerID
1 MAR15 1
2 JUN10 1
(Customer 2 is left out because he only has one item (ItemID=3)).
Any ideas on how to do this in SQL?
select ItemID, DatePurchased, CustomerID
from
(
select
T1.ItemID, T1.DatePurchased, T2.CustomerID,
count(*) over (partition by T2.CustomerId) as ItemCnt
from TABLE2 T2
join TABLE1 T1 on T1.ItemID = T2.ItemID
) dt
where ItemCnt > 1
select
    T2.ItemID, T2.CustomerID, T1.DatePurchased
from TABLE2 T2  -- Oracle does not accept AS before a table alias
inner join TABLE1 T1 on T1.ItemID = T2.ItemID
where
    T2.CustomerID in
    (
        -- customers that own more than one item
        select TT.CustomerID
        from TABLE2 TT
        group by TT.CustomerID
        having count(*) > 1
    )
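As a quick sanity check, the two approaches (a windowed count and an IN subquery with HAVING) can be run side by side on the sample rows from the question. The sketch below uses Python's sqlite3 module with an in-memory SQLite database instead of Oracle (an assumption), stores the dates as plain text, and adds an ORDER BY only so the two results can be compared.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ItemID INTEGER, DatePurchased TEXT);
    CREATE TABLE table2 (ItemID INTEGER, CustomerID INTEGER);
    INSERT INTO table1 VALUES (1,'MAR15'),(2,'JUN10'),(3,'APR02');
    INSERT INTO table2 VALUES (1,1),(2,1),(3,2);
""")

# Variant 1: count items per customer with a window function, then filter.
window_rows = conn.execute("""
    SELECT ItemID, DatePurchased, CustomerID
    FROM (
        SELECT t1.ItemID, t1.DatePurchased, t2.CustomerID,
               COUNT(*) OVER (PARTITION BY t2.CustomerID) AS ItemCnt
        FROM table2 t2
        JOIN table1 t1 ON t1.ItemID = t2.ItemID
    ) dt
    WHERE ItemCnt > 1
    ORDER BY ItemID
""").fetchall()

# Variant 2: keep only customers returned by a GROUP BY ... HAVING subquery.
in_rows = conn.execute("""
    SELECT t2.ItemID, t1.DatePurchased, t2.CustomerID
    FROM table2 t2
    JOIN table1 t1 ON t1.ItemID = t2.ItemID
    WHERE t2.CustomerID IN (
        SELECT tt.CustomerID FROM table2 tt
        GROUP BY tt.CustomerID HAVING COUNT(*) > 1
    )
    ORDER BY t2.ItemID
""").fetchall()

print(window_rows)
print(in_rows)
```

Both variants return the two items of customer 1 and drop customer 2, who has a single item.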

Counting word occurrence with SQL query

I have two tables.
Table1:
ID SENTENCE
1 The shoes are good shoes.
2 There is a tree.
3 This is nice, nice, nice!
Table2:
ID WORD
1 The
1 shoes
1 are
1 good
1 shoes
2 There
2 is
2 a
2 tree
3 This
3 is
3 nice
3 nice
3 nice
I need to count the occurrence of each word in every sentence from Table1. If any word occurs more than once (>1), then count it else skip it. In the end the resulting table should look like this:
ID SENTENCE CNT
1 The shoes are good shoes. 2
2 There is a tree.
3 This is nice, nice, nice! 3
You can use count() over():
select distinct t1.id,
t1.sentence,
coalesce(t2.cnt, 0) cnt
from table1 t1
left join
(
select t1.id,
t1.sentence,
t2.word,
count(t2.word) over(partition by t1.id, t2.word) cnt
from table1 t1
left join table2 t2
on t1.id = t2.id
) t2
on t1.id = t2.id
and t2.cnt > 1
order by t1.id
Or you can just use count():
select t1.id,
t1.sentence,
coalesce(t2.cnt, 0) cnt
from table1 t1
left join
(
select t1.id,
t1.sentence,
t2.word,
count(t2.word) cnt
from table1 t1
left join table2 t2
on t1.id = t2.id
group by t1.id, t1.sentence, t2.word
having count(t2.word) > 1
) t2
on t1.id = t2.id
order by t1.id
select t1.id, t1.sentence,
coalesce(t2.cnt,0) as counts
from table1 t1
left join
(select id, word, count(id) cnt
from table2
group by id, word
having count(id) > 1)t2
on t1.id = t2.id
order by t1.id
;
| ID | SENTENCE | COUNTS |
-------------------------------------------
| 1 | The shoes are good shoes. | 2 |
| 2 | There is a tree. | 0 |
| 3 | This is nice, nice, nice! | 3 |
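The grouped-count approach can be reproduced locally with the sample rows from the question. This is a small sketch using Python's sqlite3 module and an in-memory SQLite database (an assumption; the original demo ran elsewhere). COUNT(*) is used in place of COUNT(id), which is equivalent here because id is never NULL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the demo environment (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, sentence TEXT);
    CREATE TABLE table2 (id INTEGER, word TEXT);
    INSERT INTO table1 VALUES
        (1,'The shoes are good shoes.'),
        (2,'There is a tree.'),
        (3,'This is nice, nice, nice!');
    INSERT INTO table2 VALUES
        (1,'The'),(1,'shoes'),(1,'are'),(1,'good'),(1,'shoes'),
        (2,'There'),(2,'is'),(2,'a'),(2,'tree'),
        (3,'This'),(3,'is'),(3,'nice'),(3,'nice'),(3,'nice');
""")

# Count each word per sentence, keep only repeated words, then attach the
# counts to every sentence (0 when no word repeats).
counts = conn.execute("""
    SELECT t1.id, t1.sentence, COALESCE(t2.cnt, 0) AS counts
    FROM table1 t1
    LEFT JOIN (
        SELECT id, word, COUNT(*) AS cnt
        FROM table2
        GROUP BY id, word
        HAVING COUNT(*) > 1
    ) t2 ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

for row in counts:
    print(row)
```

Note that a sentence with two different repeated words would produce one output row per repeated word, since the join is on id only.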
SELECT table1.id, table1.sentence, COUNT(word) AS cnt
FROM table2
JOIN table1 ON table1.id = table2.id
GROUP BY table1.id, table1.sentence, table2.word
HAVING COUNT(word) > 1
My answer is for mysql, I am verifying now that it works in sql as well
There are many join examples, so, I will add only word count examples:
Select REGEXP_COUNT('The shoes are good shoes.', ' ')+1 words_count
From dual
/
WORDS_COUNT
-----------
5
SELECT id
, LISTAGG(word, ' ') WITHIN GROUP (ORDER BY id, word) AS words
, count(*) word_cnt
FROM your_table2
GROUP BY id
/
ID WORDS WORD_CNT
---------------------------------------
1 The are good shoes shoes 5
2 There a is tree 4
3 This is nice nice nice 5

How do I remove duplicates in paging

table1 & table2:
table1 & table2 http://aftabfarda.parsfile.com/1.png
SELECT *
FROM (SELECT DISTINCT dbo.tb1.ID, dbo.tb1.name, ROW_NUMBER() OVER (ORDER BY tb1.id DESC) AS row
FROM dbo.tb1 INNER JOIN
dbo.tb2 ON dbo.tb1.ID = dbo.tb2.id_tb1) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY id DESC
Result:
Result... http://aftabfarda.parsfile.com/3.png
(id 11 Repeated 3 times)
How can I have this output:
ID name row
-- ------ ---
11 user11 1
10 user10 2
9 user9 3
8 user8 4
7 user7 5
6 user6 6
5 user5 7
You could apply distinct before row_number using a subquery:
select *
from (
    select row_number() over (order by sub_dist.ID desc) as row
    , *
    from (
        -- remove the duplicates produced by the join first
        select distinct t1.ID
        , t1.name
        from dbo.tb1 as t1
        join dbo.tb2 as t2
        on t1.ID = t2.id_tb1
    ) as sub_dist
) as sub_with_rn
where row between 1 and 7
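To see why the order of DISTINCT and ROW_NUMBER matters, here is a sketch that runs both shapes of the query through Python's sqlite3 module. The table contents are made up for illustration (the original data is only available as images): tb1 holds users 1 to 11 and tb2 references user 11 three times. SQLite stands in for SQL Server, and the row alias is renamed to rn to stay clear of keyword issues; both are assumptions.

```python
import sqlite3

# In-memory SQLite database with hypothetical data (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (ID INTEGER, name TEXT);
    CREATE TABLE tb2 (id_tb1 INTEGER);
""")
conn.executemany("INSERT INTO tb1 VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 12)])
# User 11 is referenced three times, users 4..10 once each.
conn.executemany("INSERT INTO tb2 VALUES (?)",
                 [(11,), (11,), (11,)] + [(i,) for i in range(4, 11)])

# Row numbers are assigned before DISTINCT runs, so every duplicate gets its
# own number and DISTINCT has nothing left to remove.
naive_ids = [r[0] for r in conn.execute("""
    SELECT ID, rn FROM (
        SELECT DISTINCT tb1.ID, tb1.name,
               row_number() OVER (ORDER BY tb1.ID DESC) AS rn
        FROM tb1 JOIN tb2 ON tb1.ID = tb2.id_tb1
    ) a
    WHERE rn BETWEEN 1 AND 7
    ORDER BY rn
""")]

# De-duplicate in an inner subquery first, then number the remaining rows.
fixed_ids = [r[0] for r in conn.execute("""
    SELECT ID, rn FROM (
        SELECT sub_dist.ID, sub_dist.name,
               row_number() OVER (ORDER BY sub_dist.ID DESC) AS rn
        FROM (
            SELECT DISTINCT tb1.ID, tb1.name
            FROM tb1 JOIN tb2 ON tb1.ID = tb2.id_tb1
        ) sub_dist
    ) sub_with_rn
    WHERE rn BETWEEN 1 AND 7
    ORDER BY rn
""")]

print(naive_ids)
print(fixed_ids)
```

With this data the first page of the naive query contains ID 11 three times, while the de-duplicated version returns IDs 11 down to 5 once each.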
As an alternative to @Andomar's suggestion, you could use DENSE_RANK instead of ROW_NUMBER and rank the rows first (in the subquery), then apply DISTINCT (in the outer query):
SELECT DISTINCT
ID,
name,
row
FROM (
SELECT
t1.ID,
t1.name,
DENSE_RANK() OVER (ORDER BY t1.ID DESC) AS row
FROM dbo.tb1 t1
INNER JOIN dbo.tb2 t2 ON t1.ID = t2.id_tb1
) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY ID DESC
Similar, but not quite the same, although both might boil down to the same query plan, I'm just not sure. Worth testing, I think.
And, of course, you could also try a semi-join instead of a proper join, in the form of either IN or EXISTS, to prevent duplicates in the first place:
SELECT
ID,
name,
row
FROM (
SELECT
ID,
name,
ROW_NUMBER() OVER (ORDER BY ID DESC) AS row
FROM dbo.tb1
WHERE ID IN (SELECT id_tb1 FROM dbo.tb2)
/* Or:
WHERE EXISTS (
SELECT *
FROM dbo.tb2
WHERE id_tb1 = dbo.tb1.ID
)
*/
) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY ID DESC

SQL: 3 self-joins and then join them together

I have 2 tables to join in a specific way. I think my query is right, but not sure.
select t1.userID, t3.Answer Unit, t5.Answer Demo
FROM
table1 t1
inner join (select * from table2) t3 ON t1.userID = t3.userID
inner join (select * from table2) t5 ON t1.userID = t5.userID
where
NOT EXISTS (SELECT * FROM table1 t2 WHERE t2.userID = t1.userID AND t2.date > t1.date)
and NOT EXISTS (SELECT * FROM table2 t4 WHERE t4.userID = t3.userID and t4.counter > t3.counter)
and NOT EXISTS (SELECT * FROM table2 t6 WHERE t6.userID = t5.userID and t6.counter > t5.counter)
and t1.date_submitted >'1/1/2009'
and t3.question = Unit
and t5.question = Demo
order by
t1.userID
From table1 I want distinct userID where date > 1/1/2009
table1
userID Date
1 1/2/2009
1 1/2/2009
2 1/2/2009
3 1/2/2009
4 1/1/2008
So The result I want from table1 should be this:
userID
1
2
3
I then want to join this on userID with table2, which looks like this:
table2
userID question answer counter
1 Unit A 1
1 Demo x 1
1 Prod 100 1
2 Unit B 1
2 Demo Y 1
3 Prod 100 1
4 Unit A 1
1 Unit B 2
1 Demo x 2
1 Prod 100 2
2 Unit B 2
2 Demo Z 2
3 Prod 100 2
4 Unit A 2
I want to join table1 with table2 with this result:
userID Unit Demo
1 B X
2 B Z
In other words,
select distinct userID from table2 where question = Unit for the highest counter
and then
select distinct userID from table2 where question = Demo for the highest counter.
I think what I've done is 3 self-joins then joined those 3 together.
Do you think it's right?
SELECT du.userID, unit.answer, demo.answer
FROM (
    SELECT DISTINCT userID
    FROM table1
    WHERE date > '1/1/2009'
) du
LEFT JOIN
    table2 unit
ON (unit.userID, unit.question, unit.counter) IN
    (
    -- latest 'Unit' row of this user
    SELECT du.userID, 'Unit', MAX(td.counter)
    FROM table2 td
    WHERE td.userID = du.userID
    AND td.question = 'Unit'
    )
LEFT JOIN
    table2 demo
ON (demo.userID, demo.question, demo.counter) IN
    (
    -- latest 'Demo' row of this user
    SELECT du.userID, 'Demo', MAX(td.counter)
    FROM table2 td
    WHERE td.userID = du.userID
    AND td.question = 'Demo'
    )
Having an index on table2 (userID, question, counter) will greatly improve this query.
Since you mentioned SQL Server 2005, the following will be easier and more efficient:
SELECT du.userID,
    (
    -- answer with the highest counter for 'Unit'
    SELECT TOP 1 answer
    FROM table2 ti
    WHERE ti.userID = du.userID
    AND ti.question = 'Unit'
    ORDER BY
        counter DESC
    ) AS unit_answer,
    (
    -- answer with the highest counter for 'Demo'
    SELECT TOP 1 answer
    FROM table2 ti
    WHERE ti.userID = du.userID
    AND ti.question = 'Demo'
    ORDER BY
        counter DESC
    ) AS demo_answer
FROM (
    SELECT DISTINCT userID
    FROM table1
    WHERE date > '1/1/2009'
) du
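The correlated-subquery approach can be checked against the sample rows from the question. This sketch uses Python's sqlite3 module with an in-memory SQLite database instead of SQL Server 2005 (an assumption), so TOP 1 becomes ORDER BY ... LIMIT 1, and the dates are stored as ISO text so that string comparison orders them correctly.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (userID INTEGER, date TEXT);
    CREATE TABLE table2 (userID INTEGER, question TEXT, answer TEXT, counter INTEGER);
    INSERT INTO table1 VALUES
        (1,'2009-01-02'),(1,'2009-01-02'),(2,'2009-01-02'),
        (3,'2009-01-02'),(4,'2008-01-01');
    INSERT INTO table2 VALUES
        (1,'Unit','A',1),(1,'Demo','x',1),(1,'Prod','100',1),
        (2,'Unit','B',1),(2,'Demo','Y',1),(3,'Prod','100',1),(4,'Unit','A',1),
        (1,'Unit','B',2),(1,'Demo','x',2),(1,'Prod','100',2),
        (2,'Unit','B',2),(2,'Demo','Z',2),(3,'Prod','100',2),(4,'Unit','A',2);
""")

# For each recent user, pick the answer with the highest counter per question.
latest = conn.execute("""
    SELECT du.userID,
           (SELECT answer FROM table2 ti
            WHERE ti.userID = du.userID AND ti.question = 'Unit'
            ORDER BY counter DESC LIMIT 1) AS unit_answer,
           (SELECT answer FROM table2 ti
            WHERE ti.userID = du.userID AND ti.question = 'Demo'
            ORDER BY counter DESC LIMIT 1) AS demo_answer
    FROM (
        SELECT DISTINCT userID
        FROM table1
        WHERE date > '2009-01-01'
    ) du
    ORDER BY du.userID
""").fetchall()

for row in latest:
    print(row)
```

User 3 is returned with empty (None) answers because that user only has Prod rows; add a filter on the two answer columns if such users should be dropped, as in the expected output of the question.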
To aggregate:
SELECT answer, COUNT(*)
FROM (
SELECT DISTINCT userID
FROM table1
WHERE date > '1/1/2009'
) du
JOIN table2 t2
ON t2.userID = du.userID
AND t2.question = 'Unit'
GROUP BY
answer