How should I query these tables? - sql

That's the database I have:
This is the (first) Offer-table with articles and the respective ID:
This is the (second) Bid-Table with the offered articles:
I have to query the article numbers (IDs) that have received the same number of bids.
So I want to output this:
ID1 ID2 Number_of_Orders
1 2 2
1 5 2
2 5 2
I tried to join it into inline views:
SELECT DISTINCT * FROM
(SELECT BID.ID as ID1 FROM OFFER
INNER JOIN BID ON OFFER.ID=BID.ID
GROUP BY BID.ID) v1,
(SELECT BID.ID as ID2 FROM OFFER
INNER JOIN BID ON OFFER.ID=BID.ID
GROUP BY BID.ID) v2,
(SELECT COUNT(GID) as NUMBER_OF_ORDERS FROM BID
INNER JOIN OFFER ON OFFER.ID=BID.ID
GROUP BY BID.ID
) v3;
but I do not know how to output the two IDs under the condition that they have the same number of orders (bids).

You seem to want to count the bids for each ID, and then do a self-join on that result to find matches:
with cte (id, number_of_bids) as (
select id, count(*)
from bid
group by id
)
select c1.id as id1, c2.id as id2, c1.number_of_bids
from cte c1
join cte c2
on c2.number_of_bids = c1.number_of_bids
and c2.id > c1.id
order by id1, id2;
ID1 ID2 NUMBER_OF_BIDS
---------- ---------- --------------
1 2 2
1 5 2
2 5 2
The CTE just gets the number of bids for each ID with simple aggregation. (You could do it with inline views instead of a CTE, but you'd be counting them twice, once in each inline view.)
Then the main query joins that CTE to itself on the aggregated number_of_bids being equal, and also on the second ID being higher than the first, which eliminates duplicates. Without that condition you'd see a row where ID1 was 5 and ID2 was 2, i.e. the reverse of the last of the three rows you want (and the same for the other two), plus each ID/count matched with itself.
You don't need to join to the offer table, because you aren't using any data from it.
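To try the self-join on concrete rows, here is a minimal sketch using Python's built-in sqlite3 module. The bid rows below are hypothetical (the original tables were not posted as text), and the gid column name is only assumed from the question's COUNT(GID):

```python
import sqlite3

# Hypothetical sample data: one row per bid, id is the article being bid on.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bid (gid INTEGER PRIMARY KEY, id INTEGER)")
conn.executemany(
    "INSERT INTO bid (id) VALUES (?)",
    [(1,), (1,), (2,), (2,), (3,), (5,), (5,)],
)

query = """
WITH cte (id, number_of_bids) AS (
    SELECT id, COUNT(*) FROM bid GROUP BY id
)
SELECT c1.id AS id1, c2.id AS id2, c1.number_of_bids
FROM cte c1
JOIN cte c2
  ON c2.number_of_bids = c1.number_of_bids
 AND c2.id > c1.id          -- keep each pair once, drop self-matches
ORDER BY id1, id2
"""
pairs = conn.execute(query).fetchall()
print(pairs)
```

With this data, articles 1, 2 and 5 have two bids each and article 3 has one, so only the three pairs among 1, 2 and 5 come back.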

You can simply inner join these two tables and add a condition such as table1.bidPrice = table2.bidPrice.

Related

Exclude one item with different correlated value in the next column SQL

I have two tables:
acc_num ser_code
1 A
1 B
1 C
2 C
2 D
and the second one is:
ser_code value
A 5
B 8
C 10
D 15
I want to exclude all the accounts that have a service code with a value of 10 or 15.
Because my data set is huge, I want to use NOT EXISTS, but it only excludes the matching combination of acc_num and ser_code.
I want to exclude the acc_num together with all of its ser_codes, because one of its ser_codes meets my criteria.
I used:
select acc_num, ser_code
from table1
where NOT EXISTS (select 1
FROM table2 where acc_num = acc_num and value in (10, 15))
The output of the above code is:
acc_num ser_code
1 A
1 B
The desired output would be empty.
Here you are:
select t1.acc_num,t1.ser_code from table1 t1, table2 t2
where (t1.ser_code=t2.ser_code and t2.value not in (10,15))
and t1.acc_num not in
(
select t3.acc_num from table1 t3,table2 t4
where t1.acc_num=t3.acc_num and t3.ser_code=t4.ser_code
and t4.value in (10,15)
) ;
This can be achieved in many ways, but NOT EXISTS is the best option here. The problem with your query is that acc_num 1 also has ser_codes whose value is not 10 or 15, so you still get A and B in the result.
To overcome that, you must pull acc_num inside the sub-query.
Query 1 (using NOT EXISTS):
As you can see in the query below, I have included acc_num inside the sub-query so that the filter works properly:
SELECT DISTINCT a.acc_num, a.ser_code
FROM one as a
WHERE NOT EXISTS
(
SELECT DISTINCT one.acc_num
FROM two
INNER JOIN one
ON one.ser_code=two.ser_code
WHERE value IN (10,15) AND a.acc_num=one.acc_num
)
Query 2 (using LEFT JOIN):
NOT EXISTS is often confusing by nature (though it is very fast), so a LEFT JOIN can be used instead (more expensive than NOT EXISTS):
SELECT DISTINCT a.acc_num, a.ser_code
FROM one as a
LEFT JOIN
(
SELECT DISTINCT one.acc_num
FROM two
INNER JOIN one
ON one.ser_code=two.ser_code
WHERE value IN (10,15)
) b
ON a.acc_num=b.acc_num
WHERE b.acc_num IS NULL
Query 3 (using NOT IN):
NOT IN also achieves this with an easy-to-read query, but it is more expensive than both of the above methods:
SELECT DISTINCT a.acc_num, a.ser_code
FROM one as a
WHERE a.acc_num NOT IN
(
SELECT DISTINCT one.acc_num
FROM two
INNER JOIN one
ON one.ser_code=two.ser_code
WHERE value IN (10,15)
)
All three yield the same result. I would prefer to go with NOT EXISTS.
See demo with time consumption in db<>fiddle
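The difference between filtering per row and filtering per account can be checked with a small sketch using Python's built-in sqlite3 module. It uses the two tables from the question (named one and two, as in the queries above) plus a hypothetical account 3 that only has ser_code A, added so that the result is not empty. The row-level query is an assumed reconstruction of what the question's attempt effectively did, not the asker's exact code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE one (acc_num INTEGER, ser_code TEXT);
CREATE TABLE two (ser_code TEXT, value INTEGER);
-- Rows from the question, plus hypothetical account 3 (illustration only).
INSERT INTO one VALUES (1,'A'),(1,'B'),(1,'C'),(2,'C'),(2,'D'),(3,'A');
INSERT INTO two VALUES ('A',5),('B',8),('C',10),('D',15);
""")

# Row-level filter: correlates on ser_code, so only the offending rows go.
row_level = conn.execute("""
    SELECT a.acc_num, a.ser_code
    FROM one AS a
    WHERE NOT EXISTS (
        SELECT 1 FROM two AS t
        WHERE t.ser_code = a.ser_code AND t.value IN (10, 15)
    )
    ORDER BY a.acc_num, a.ser_code
""").fetchall()

# Account-level filter: correlates on acc_num, so the whole account goes.
account_level = conn.execute("""
    SELECT DISTINCT a.acc_num, a.ser_code
    FROM one AS a
    WHERE NOT EXISTS (
        SELECT 1
        FROM one AS o
        JOIN two AS t ON t.ser_code = o.ser_code
        WHERE t.value IN (10, 15) AND o.acc_num = a.acc_num
    )
    ORDER BY a.acc_num, a.ser_code
""").fetchall()

print(row_level)
print(account_level)
```

The row-level version still returns A and B for account 1, while the account-level version keeps only the hypothetical account 3.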

Count the Same Columns in Two Different Table

I am looking for a way to count for the same column in two different tables.
So I have two tables, table1 and table2. They both have the column "category". I want to find a way to count category for these two tables and show as the result below.
I know how to do this individually by
select category, count(category) as cnt from table1
group by category
order by cnt desc;
select category, count(category) as cnt from table2
group by category
order by cnt desc;
Not sure how to combine the two into one.
The expected result should be like below. Please note there are some "category" in table1 but not in table2 or vice versa, for example category c and d.
table1 table2
a 4 2
b 4 3
c 3
d 4
One method is full join:
select coalesce(t1c.category, t2c.category) as category,
t1c.t1_cnt, t2c.t2_cnt
from (select category, count(*) as t1_cnt
from table1
group by category
) t1c full join
(select category, count(*) as t2_cnt
from table2
group by category
) t2c
on t1c.category = t2c.category;
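You need to be very careful to aggregate before doing the join; otherwise the join pairs every matching row from one table with every matching row from the other and inflates the counts.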
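Why the order matters can be seen in a small sketch with Python's built-in sqlite3 module. The rows are hypothetical (four a rows in table1, two in table2, matching the expected result for category a), and it uses a plain inner join rather than a full join, because only the row multiplication is being illustrated and older SQLite versions lack FULL JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (category TEXT);
CREATE TABLE table2 (category TEXT);
INSERT INTO table1 VALUES ('a'),('a'),('a'),('a');
INSERT INTO table2 VALUES ('a'),('a');
""")

# Join first: every table1 row pairs with every table2 row of the same
# category, so 4 * 2 = 8 joined rows are counted on both sides.
join_first = conn.execute("""
    SELECT t1.category, COUNT(t1.category), COUNT(t2.category)
    FROM table1 t1
    JOIN table2 t2 ON t1.category = t2.category
    GROUP BY t1.category
""").fetchall()

# Aggregate first: each subquery yields one row per category, so the join
# cannot multiply anything and the counts stay correct.
agg_first = conn.execute("""
    SELECT t1c.category, t1c.t1_cnt, t2c.t2_cnt
    FROM (SELECT category, COUNT(*) AS t1_cnt
          FROM table1 GROUP BY category) t1c
    JOIN (SELECT category, COUNT(*) AS t2_cnt
          FROM table2 GROUP BY category) t2c
      ON t1c.category = t2c.category
""").fetchall()

print(join_first)
print(agg_first)
```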

SQL Find most rows that match between two tables

I am using SQL Server 2012, and I have two tables like the following:
Table1 and Table2 both have many groups, indicated by the group column. The name of a group may match in both tables, but it may not. What is important is finding the group in Table2 that has the most members matching the members of a group in Table1.
I first tried doing this with a VLOOKUP, but the problem is that VLOOKUP pulls the first entry in the Group column that has a match, not the group with the most matches. Below, VLOOKUP would pull BBB, but the correct result is CCC.
Ties may occur. There might be more than one group in Table2 that matches Table1 with the same number of members, so the best thing may be to count the number of matches, but there are thousands of groups, so it's not ideal to sort and sift through a column of counts. I need something like a CASE statement where, if there is a MAX(match), Table1 would show the group name with MAX(match) in a derived column BestMatch. Ideally the column would display all the groups in Table2 that have MAX(match), which may be one or more, perhaps comma separated.
If that is not possible, the column could just say "tie" and I could look for the tie myself. In that case, the word "tie" should repeat beside every member that matches, so I will know which groups to look at, which accounts match, and how many matched.
We really could do with some expected output to help clarify the question.
If I understand you correctly however, this query will get you close to the results you require:
;with cte as
( SELECT t1a.[group] AS Group1
, t2a.[Group] AS Group2
, RANK() OVER(PARTITION BY t1a.[group]
ORDER BY COUNT(t2a.[Group]) DESC) AS MatchRank
FROM Table1 t1a
JOIN Table2 t2a
ON t1a.member = t2a.member
GROUP BY t1a.[group], t2a.[Group])
SELECT *
FROM cte
WHERE MatchRank=1
The query doesn't identify ties, but it will display any tied results...
If you are new to common table expressions (the ;with statement), there is a useful description here.
select *
from Table1 t1
outer apply
(
select top 1 t2.[Group]
from Table2 t2
where t2.Member = t1.Member
group by t2.[Group]
order by count(*) desc
) m
It may not be the most elegant solution but I think it could do the work:
select *
from
(select t1.[group] as t1group, t1.member, t2.[group] as t2group
from Table1 t1 inner join Table2 t2 on t1.member = t2.member)a
where member = (select max(t1.member)
from Table1 t1 inner join Table2 t2 on t1.member = t2.member)
If two rows from Table2 match the maximum members in Table1, both results are displayed.
PS: an example of your desired results would have been helpful.
Count member matches per group pair and rank them so the group pairs with the highest match count get rank #1. Once you found these, you can select the related records from table1 and table2.
select t1.grp, t1.member, t2.grp
from t1
join
(
select
t1.grp as grp1,
t2.grp as grp2,
rank() over (order by count(*) desc) as rnk
from t1
join t2 on t2.member = t1.member
group by t1.grp, t2.grp
) grps on grps.rnk = 1 and grps.grp1 = t1.grp
left join t2 on t2.grp = grps.grp2 and t2.member = t1.member
order by t1.grp, t1.member, t2.grp;
This gives you ties in separate rows, e.g. for AAA having four different members (123,456,789,555) with two matches both in CCC and DDD:
grp1 member grp2
AAA 123 CCC
AAA 123 DDD
AAA 456 CCC
AAA 789
AAA 555 DDD
If you want one row per grp1 and member with all matching grp2 in a string then you need some clumsy STUFF trick in SQL Server as far as I am aware. Look up "GROUP_CONCAT in SQL Server" to find the technique needed.

Chaining joins in SQL based on dynamic table

The title may not be accurate for the question but here goes! I have the following table:
id1 id2 status
1 2 a
2 3 b
3 4 c
6 7 d
7 8 e
8 9 f
9 10 g
I would like to get the first id1 and last status based on a dynamic chain joining, meaning that the result table will be:
id final_status
1 c
6 g
Logically, I want to construct the following arrays based on joining the table to itself:
id1 chained_ids chained_status
1 [2,3,4] [a,b,c]
6 [7,8,9,10] [d,e,f,g]
Then grab the last element of the chained_status list.
If we kept joining this table to itself on id1 = id2, we would eventually end up with single rows containing these results. The problem is that the number of joins is not constant (a single id may be chained many or few times). There is always a 1 to 1 mapping of id1 to id2.
Thanks in advance! This can be done in either T-SQL or Hive (if someone has a clever map-reduce solution).
You can do this with a recursive CTE:
;WITH My_CTE AS
(
SELECT
id1,
id2,
status,
1 AS lvl
FROM
My_Table T1
WHERE
NOT EXISTS
(
SELECT *
FROM My_Table T2
WHERE T2.id2 = T1.id1
)
UNION ALL
SELECT
CTE.id1,
T3.id2,
T3.status,
CTE.lvl + 1
FROM
My_CTE CTE
INNER JOIN My_Table T3 ON T3.id1 = CTE.id2
)
SELECT
CTE.id1,
CTE.status
FROM
My_CTE CTE
INNER JOIN (SELECT id1, MAX(lvl) AS max_lvl FROM My_CTE GROUP BY id1) M ON
M.id1 = CTE.id1 AND
M.max_lvl = CTE.lvl
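The recursive CTE can be run against the sample rows from the question with Python's built-in sqlite3 module. This is a sketch in SQLite's dialect, which requires WITH RECURSIVE where T-SQL uses plain WITH, and the table name my_table is taken from the answer's placeholder:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id1 INTEGER, id2 INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 2, "a"), (2, 3, "b"), (3, 4, "c"),
     (6, 7, "d"), (7, 8, "e"), (8, 9, "f"), (9, 10, "g")],
)

query = """
WITH RECURSIVE my_cte (id1, id2, status, lvl) AS (
    -- Anchor: rows whose id1 is never another row's id2 start a chain.
    SELECT t1.id1, t1.id2, t1.status, 1
    FROM my_table t1
    WHERE NOT EXISTS (SELECT 1 FROM my_table t2 WHERE t2.id2 = t1.id1)
    UNION ALL
    -- Step: follow the chain one link, keeping the chain's first id1.
    SELECT cte.id1, t3.id2, t3.status, cte.lvl + 1
    FROM my_cte cte
    JOIN my_table t3 ON t3.id1 = cte.id2
)
SELECT cte.id1, cte.status
FROM my_cte cte
JOIN (SELECT id1, MAX(lvl) AS max_lvl FROM my_cte GROUP BY id1) m
  ON m.id1 = cte.id1 AND m.max_lvl = cte.lvl
ORDER BY cte.id1
"""
chains = conn.execute(query).fetchall()
print(chains)
```

The deepest level per starting id1 is the end of its chain, which gives status c for id 1 and g for id 6, as in the desired result.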

SQL join without losing rows

I have 2 tables with the same schema of userID, category, count. I need a query to sum the count of each userID/category pair. Sometimes a pair will exist in one table and not the other. I'm having trouble doing a join without losing the rows where a userID/category pair only exists in 1 table. This is what I'm trying (without success):
select a.user, a.category, count=a.count+b.count
from #temp1 a join #temp2 b
on a.user = b.user and a.category = b.category
Example:
Input:
user category count
id1 catB 3
id2 catG 9
id3 catW 17
user category count
id1 catB 1
id2 catM 5
id3 catW 13
Desired Output:
user category count
id1 catB 4
id2 catG 9
id2 catM 5
id3 catW 30
Update: "count" is not the actual column name. I just used it for the sake of this example, and I forgot it's a reserved word.
You need to:
Use a full outer join so you don't drop rows present in one table and not the other
Coalesce counts prior to addition, because 0 + NULL = NULL
Also, because COUNT is a reserved word, I would recommend escaping it.
So, using all of these guidelines, your query becomes:
SELECT COALESCE(a.user, b.user) AS user,
COALESCE(a.category, b.category) AS category,
COALESCE(a.[count],0) + COALESCE(b.[count],0) AS [count]
FROM #temp1 AS a
FULL OUTER JOIN #temp2 AS b
ON a.user = b.user AND
a.category = b.category
One way to approach this is with a full outer join:
select coalesce(a.user, b.user) as user,
coalesce(a.category, b.category) as category,
coalesce(a.count, 0) + coalesce(b.count, 0) as "count"
from #temp1 a full outer join
#temp2 b
on a.user = b.user and
a.category = b.category;
When using full outer join, you have to be careful because the key fields can be NULL when there is a match in only one table. As a result, the select tends to have a lot of coalesce()s (or similar constructs).
Another way is using a union all query with aggregation:
select "user", category, SUM("count") as "count"
from ((select "user", category, "count"
from #temp1
) union all
(select "user", category, "count"
from #temp2
)
) t
group by "user", category
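The union all variant can be checked against the example input with Python's built-in sqlite3 module. In this sketch the temp tables are ordinary tables named temp1 and temp2 (an assumption, since #temp names are specific to SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temp1 ("user" TEXT, category TEXT, "count" INTEGER);
CREATE TABLE temp2 ("user" TEXT, category TEXT, "count" INTEGER);
INSERT INTO temp1 VALUES ('id1','catB',3),('id2','catG',9),('id3','catW',17);
INSERT INTO temp2 VALUES ('id1','catB',1),('id2','catM',5),('id3','catW',13);
""")

# Stack both tables, then sum per user/category pair. Pairs that exist in
# only one table survive because nothing is joined away.
query = """
SELECT "user", category, SUM("count") AS total
FROM (
    SELECT "user", category, "count" FROM temp1
    UNION ALL
    SELECT "user", category, "count" FROM temp2
) t
GROUP BY "user", category
ORDER BY "user", category
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Because no join is involved, this form needs no coalesce() calls and works on engines that lack a full outer join.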