Join and compare 2 queries of 2 tables - sql

This is probably a trivial question for many here, but I am not used to writing subqueries and joins, so I hope someone can help.
I have two tables: new_road and old_road.
These two queries sum up the length of the roads belonging to a specific road number.
SELECT new_road.nummer, SUM(new_road.length) FROM road_table.road AS new_road GROUP BY new_road.nummer
SELECT old_road.nummer, SUM(ST_length(old_road.geom)) FROM old_road_table.old_road GROUP BY old_road.nummer
I wish to have a result table where these two queries are joined so I can compare the new and old summed length for each road number.
Like
old.nummer old.length new.nummer new.length
2345 10.3 2345 10.5
2346 578.2 2346 600
2347 54.2 NULL NULL
NULL NULL 2546 32.2
I think some version of an outer join is needed, because there will be road numbers in the old_road table that do not exist in the new_road table, and I would like to see them too.
Appreciate any advice
Edit:
Following the advice below, I came up with this:
SELECT * FROM
(SELECT new_road.nummer, SUM(new_road.length) FROM road_table.road AS new_road GROUP BY new_road.nummer) new_table
FULL OUTER JOIN
(SELECT old_road.nummer, SUM(ST_length(old_road.geom)) FROM old_road_table.old_road GROUP BY old_road.nummer) old_table
ON new_road.nummer = old_road.nummer
But each time I run it I get a missing FROM-clause entry error. When I run each subquery individually, they work. I have cross-checked with the documentation and it looks OK to me, but clearly I am missing something here.

Consider using a FULL OUTER JOIN
This is not the exact output you requested but you don't need to display the nummer twice.
SELECT
COALESCE(new_road.nummer, old_road.nummer) AS nummer,
new_road.length AS new_length,
old_road.length AS old_length
FROM (
SELECT new_road.nummer
,SUM(new_road.length) AS length
FROM road_table.road AS new_road
GROUP BY new_road.nummer
) new_road
FULL OUTER JOIN (
SELECT old_road.nummer
,SUM(ST_length(old_road.geom)) AS length
FROM old_road_table.old_road
GROUP BY old_road.nummer
) old_road ON
old_road.nummer = new_road.nummer

The following query should do it. I didn't run it, but the basic idea is that the result of a query is itself a table that you can query again.
Select * FROM (SELECT new_road.nummer, SUM(new_road.length) AS length FROM road_table.road AS new_road GROUP BY new_road.nummer) table1 JOIN (SELECT old_road.nummer, SUM(ST_length(old_road.geom)) AS length FROM old_road_table.old_road GROUP BY old_road.nummer) table2 ON table1.nummer = table2.nummer

The tricky bit here is that you want to make sure you include all of the keys from both lists. My favorite way to do this kind of thing is:
select * from (
SELECT distinct new_road.nummer as nummer from road_table.road AS new_road
union
SELECT distinct old_road.nummer as nummer FROM old_road_table.old_road
) allkeys
left join
(
SELECT new_road.nummer as nummer, SUM(new_road.length) as nlen
FROM road_table.road AS new_road GROUP BY new_road.nummer
) n
on allkeys.nummer = n.nummer
left join
(
SELECT old_road.nummer as nummer, SUM(ST_length(old_road.geom)) as olen
FROM old_road_table.old_road GROUP BY old_road.nummer
) o
on allkeys.nummer = o.nummer
The first subquery builds a list of all keys, then you join to both of your queries from there. There's nothing wrong with an outer join, but I find this easier to manage if you have to include 3 or more tables. If you had to include another table it would just be one more union in allkeys and one more left join to that table.
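The all-keys approach can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table layout and sample rows here are assumptions for illustration only: both tables get a plain numeric length column (SQLite has no ST_length), and the values are made up to resemble the example in the question.

```python
import sqlite3

# In-memory database with hypothetical sample data (not the asker's real schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE new_road (nummer INTEGER, length REAL);
    CREATE TABLE old_road (nummer INTEGER, length REAL);
    INSERT INTO new_road VALUES (2345, 4.0), (2345, 6.5), (2346, 600.0), (2546, 32.25);
    INSERT INTO old_road VALUES (2345, 10.25), (2346, 578.0), (2347, 54.5);
""")

query = """
    SELECT allkeys.nummer, o.olen, n.nlen
    FROM (
        -- UNION already removes duplicates, so every road number appears once
        SELECT nummer FROM new_road
        UNION
        SELECT nummer FROM old_road
    ) allkeys
    LEFT JOIN (
        SELECT nummer, SUM(length) AS nlen FROM new_road GROUP BY nummer
    ) n ON allkeys.nummer = n.nummer
    LEFT JOIN (
        SELECT nummer, SUM(length) AS olen FROM old_road GROUP BY nummer
    ) o ON allkeys.nummer = o.nummer
    ORDER BY allkeys.nummer
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (nummer, old summed length, new summed length); None means no match
```

Road numbers that exist in only one table show None (SQL NULL) on the other side, which is the behaviour the question asks for.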

Related

How to get a result set containing the absence of a value?

Scenario: I have a table with four columns: District_Number, District_name, Data_Collection_Week, enrollments. Each week we get data, but sometimes we do not.
Task: My supervisor wants me to produce a query that tells us which districts did not submit data for a given week.
What I have tried is below, but I cannot get a NULL value for the districts that did not submit in a given week.
SELECT DISTINCT DistrictNumber, DistrictName, DataCollectionWeek
into #test4
FROM EDW_REQUESTS.INSTRUCTION_DELIVERY_ENROLLMENT_2021
order by DistrictNumber, DataCollectionWeek asc
select DISTINCT DataCollectionWeek
into #test5
from EDW_REQUESTS.INSTRUCTION_DELIVERY_ENROLLMENT_2021
order by DataCollectionWeek
select b.DistrictNumber, b.DistrictName, b.DataCollectionWeek
from #test5 a left outer join #test4 b on (a.DataCollectionWeek = b.DataCollectionWeek)
order by b.DistrictNumber, b.DataCollectionWeek asc
One option uses a cross join of two select distinct subqueries to generate all possible combinations of districts and weeks, and then not exists to identify those that are not available in the table:
select d.districtnumber, w.datacollectionweek
from (select distinct districtnumber from edw_requests.instruction_delivery_enrollment_2021) d
cross join (select distinct datacollectionweek from edw_requests.instruction_delivery_enrollment_2021) w
where not exists (
select 1
from edw_requests.instruction_delivery_enrollment_2021 i
where i.districtnumber = d.districtnumber and i.datacollectionweek = w.datacollectionweek
)
This would be simpler (and much more efficient) if you had reference tables to store the districts and weeks: you would then use them directly instead of the select distinct subqueries.
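The cross join plus not exists pattern can be checked on a small data set. This is only a sketch using Python's sqlite3 module; the table name enrollment and the sample rows are hypothetical stand-ins for the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for edw_requests.instruction_delivery_enrollment_2021
    CREATE TABLE enrollment (
        districtnumber INTEGER,
        datacollectionweek INTEGER,
        enrollments INTEGER
    );
    INSERT INTO enrollment VALUES
        (1, 1, 100), (1, 2, 110), (1, 3, 120),  -- district 1 submitted every week
        (2, 1, 50),  (2, 3, 55),                -- district 2 skipped week 2
        (3, 2, 70);                             -- district 3 only submitted week 2
""")

query = """
    SELECT d.districtnumber, w.datacollectionweek
    FROM (SELECT DISTINCT districtnumber FROM enrollment) d
    CROSS JOIN (SELECT DISTINCT datacollectionweek FROM enrollment) w
    WHERE NOT EXISTS (
        SELECT 1
        FROM enrollment i
        WHERE i.districtnumber = d.districtnumber
          AND i.datacollectionweek = w.datacollectionweek
    )
    ORDER BY d.districtnumber, w.datacollectionweek
"""

missing = conn.execute(query).fetchall()
print(missing)  # each tuple is a (district, week) pair with no submission
```

Note that a week in which no district submitted at all never appears in the distinct list of weeks, so it cannot be reported; that is one more reason to keep separate tables of districts and weeks.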

SUM a column count from two tables

I have this simple unioned query in SQL Server 2014 where I am getting counts of rows from each table and then trying to add a TOTAL row at the bottom that will SUM the counts from both tables. I believe the problem is that the LEFT OUTER JOIN in the last union seems to sum only the totals from the first table.
SELECT A.TEST_CODE, B.DIVISION, COUNT(*)
FROM ALL_USERS B, SIGMA_TEST A
WHERE B.DOMID = A.DOMID
GROUP BY A.TEST_CODE, B.DIVISION
UNION
SELECT E.TEST_CODE, F.DIVISION, COUNT(*)
FROM BETA_TEST E, ALL_USERS F
WHERE E.DOMID = F.DOMID
GROUP BY E.TEST_CODE, F.DIVISION
UNION
SELECT 'TOTAL', '', COUNT(*)
FROM (SIGMA_TEST A LEFT OUTER JOIN BETA_TEST E ON A.DOMID = E.DOMID)
Here is a sample of the results I am getting:
I would expect the TOTAL row to display a result of 6 (2+1+3=6)
I would like to avoid using a Common Table Expression (CTE) if possible. Thanks in advance!
Since you are counting users with matching DOMIDs in the first two statements, the final statement also needs to include the ALL_USERS table. The final statement should be:
SELECT 'TOTAL', '', COUNT(*)
FROM ALL_USERS G LEFT OUTER JOIN
SIGMA_TEST H ON G.DOMID = H.DOMID
LEFT OUTER JOIN BETA_TEST I ON I.DOMID = G.DOMID
WHERE (H.TEST_CODE IS NOT NULL OR I.TEST_CODE IS NOT NULL)
I would consider doing a UNION ALL first then COUNT:
SELECT COALESCE(TEST_CODE, 'TOTAL'),
DIVISION,
COUNT(*)
FROM (
SELECT A.TEST_CODE, B.DIVISION
FROM ALL_USERS B
INNER JOIN SIGMA_TEST A ON B.DOMID = A.DOMID
UNION ALL
SELECT E.TEST_CODE, F.DIVISION
FROM BETA_TEST E
INNER JOIN ALL_USERS F ON E.DOMID = F.DOMID ) AS T
GROUP BY GROUPING SETS ((TEST_CODE, DIVISION ), ())
Using GROUPING SETS you can easily get the total, so there is no need to add a third subquery.
Note: I assume you want just one count per (TEST_CODE, DIVISION). Otherwise you also have to group on the source table, as in @Gareth's answer.
I think you can achieve this with a single query. It seems your test tables have similar structures, so you can union them together and join to ALL_USERS. Finally, you can use GROUPING SETS to get the total:
SELECT ISNULL(T.TEST_CODE, 'TOTAL') AS TEST_CODE,
ISNULL(U.DIVISION, '') AS DIVISION,
COUNT(*)
FROM ALL_USERS AS U
INNER JOIN
( SELECT DOMID, TEST_CODE, 'SIGMA' AS SOURCETABLE
FROM SIGMA_TEST
UNION ALL
SELECT DOMID, TEST_CODE, 'BETA' AS SOURCETABLE
FROM BETA_TEST
) AS T
ON T.DOMID = U.DOMID
GROUP BY GROUPING SETS ((T.TEST_CODE, U.DIVISION, T.SOURCETABLE), ());
As an aside, the implicit join syntax you are using was replaced over a quarter of a century ago in ANSI 92. It is not wrong, but there seems to be little reason to continue to use it, especially when you are mixing and matching with explicit outer joins and implicit inner joins. Anyone else that might read your SQL will certainly appreciate consistency.
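The "UNION ALL first, then COUNT" idea can be tried with Python's sqlite3 module. SQLite does not support GROUPING SETS, so this sketch swaps in a second query over the same combined rows to produce the TOTAL line; it is not the SQL Server syntax from the answers. The column layout and sample rows are assumptions, chosen so that the group counts are 2, 1 and 3 as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data
    CREATE TABLE ALL_USERS (DOMID INTEGER, DIVISION TEXT);
    CREATE TABLE SIGMA_TEST (DOMID INTEGER, TEST_CODE TEXT);
    CREATE TABLE BETA_TEST (DOMID INTEGER, TEST_CODE TEXT);
    INSERT INTO ALL_USERS VALUES (1, 'East'), (2, 'East'), (3, 'West'), (4, 'West'), (5, 'West');
    INSERT INTO SIGMA_TEST VALUES (1, 'S1'), (2, 'S1'), (3, 'S2');
    INSERT INTO BETA_TEST VALUES (3, 'B1'), (4, 'B1'), (5, 'B1');
""")

# One row per test taken by a known user, from both test tables.
combined = """
    SELECT a.TEST_CODE, b.DIVISION
    FROM SIGMA_TEST a INNER JOIN ALL_USERS b ON b.DOMID = a.DOMID
    UNION ALL
    SELECT e.TEST_CODE, f.DIVISION
    FROM BETA_TEST e INNER JOIN ALL_USERS f ON f.DOMID = e.DOMID
"""

grouped = conn.execute(
    f"SELECT TEST_CODE, DIVISION, COUNT(*) FROM ({combined}) t "
    "GROUP BY TEST_CODE, DIVISION ORDER BY TEST_CODE, DIVISION"
).fetchall()

# The total is counted over the same combined rows, so it always matches the groups.
total = conn.execute(f"SELECT 'TOTAL', '', COUNT(*) FROM ({combined}) t").fetchone()

rows = grouped + [total]
for row in rows:
    print(row)
```

Because the total is computed from the same derived table as the grouped counts, it cannot drift from them the way the LEFT OUTER JOIN version in the question does.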

Comparing two sum function in where clause

I want to check whether the number of likes users received on all their personal pictures is at least twice as large as the number of likes received on the group pictures in which they are tagged.
If a user is not tagged in any group photo but is tagged in a personal picture that has received at least one like, that user should be returned.
My question is:
How can I compare two SUM results,
where one sum is returned by the nested query and compared with the outer query?
Can I store the sum in an auxiliary variable and compare against it?
Thanks to anyone who helps :)
Select distinct UIP.userID
From tblUserInPersonalPic UIP
where sum(UIP.numOfLikes) over (Partition by UIP.userID)*0.5 >
(Select distinct U.userID, sum(P.numOfLikes) over (Partition by U.userID)
From tblgroupPictures P left outer join
tblUserInGroupPic U On P.picNum=U.picNum
group by U.userID,P.numOfLikes,P.picNum)
It's kinda hard to know for sure, and of course I can't test my answer,
but I think you can do it with a couple of left joins, group by and having:
SELECT Personal.UserId
FROM tblUserInPersonalPic Personal
LEFT JOIN tblUserInGroupPic UserInGroup ON Personal.userID = UserInGroup.userID
LEFT JOIN tblgroupPictures GroupPictures ON UserInGroup.picNum = GroupPictures.picNum
GROUP BY Personal.userID
HAVING SUM(GroupPictures.numOfLikes) * 2 < SUM(Personal.numOfLikes)
Please note: When posting SQL questions it's always best to provide sample data as DDL + DML (create table + insert into statements) and desired results, so that whoever answers you can test the answer before posting it.
Try using two CTEs (pseudo code). Also note that DISTINCT in the second query will not work as intended, since you are returning two columns, so I changed it below so that you can get that column as well.
;with t1
as
(
select userid,sum(col1) as sum1
from
tbl1
group by userid
)
,t2
as
(
select userid,sum(anothercol) as sum2
from tbl2
group by userid
)
select t1.userid,t1.sum1,t2.sum2
from
t1
join
t2
on t1.userid=t2.userid
where t1.sum1>t2.sum2
You can't use window functions in a where clause. Define it in a subquery:
select *
from (
select sum(...) over (...) as Sum1
, OtherColumn
from YourTable
) sub
where Sum1 < (...your subquery...)
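Putting the advice above together, one way to compare two sums is to compute each sum in its own subquery and compare them in the outer WHERE clause. The sketch below uses Python's sqlite3 module with the table and column names from the question; the sample rows are made up, and the handling of users without group pictures is one reading of the requirement, not a confirmed rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data
    CREATE TABLE tblUserInPersonalPic (userID INTEGER, numOfLikes INTEGER);
    CREATE TABLE tblgroupPictures (picNum INTEGER, numOfLikes INTEGER);
    CREATE TABLE tblUserInGroupPic (userID INTEGER, picNum INTEGER);
    INSERT INTO tblUserInPersonalPic VALUES (1, 10), (1, 6), (2, 3), (3, 5), (4, 0);
    INSERT INTO tblgroupPictures VALUES (100, 4), (101, 3), (102, 2);
    INSERT INTO tblUserInGroupPic VALUES (1, 100), (1, 101), (2, 102), (4, 100);
""")

query = """
    SELECT p.userID
    FROM (
        -- one row per user: total likes on personal pictures
        SELECT userID, SUM(numOfLikes) AS personal_likes
        FROM tblUserInPersonalPic
        GROUP BY userID
    ) p
    LEFT JOIN (
        -- one row per user: total likes on group pictures they are tagged in
        SELECT u.userID, SUM(g.numOfLikes) AS group_likes
        FROM tblUserInGroupPic u
        JOIN tblgroupPictures g ON g.picNum = u.picNum
        GROUP BY u.userID
    ) grp ON grp.userID = p.userID
    WHERE (grp.group_likes IS NOT NULL AND p.personal_likes >= 2 * grp.group_likes)
       OR (grp.group_likes IS NULL AND p.personal_likes >= 1)  -- no group pictures
    ORDER BY p.userID
"""

users = conn.execute(query).fetchall()
print(users)
```

Aggregating each side before joining avoids the row multiplication that happens when personal pictures and group pictures are joined first and summed afterwards.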

SQL join count and select query

I have two tables: one is a list of 'gangs' and one is a list of 'gang_members'. The gang_members.gang_id refers to the gang.id they are in. I know how to count all the members in one gang, but I need to join the following queries into one:
SELECT * FROM gangs LIMIT 8
SELECT count(gang_id) FROM gangs_members WHERE gang_id = <GANG ID>
I think this is possible. I could do it in a loop while going through the gangs, but that would be inefficient.
SELECT A.*, B.RC
FROM gangs A
LEFT JOIN (SELECT gang_id, COUNT(*) AS RC FROM gangs_members GROUP BY gang_id) B ON A.id=B.gang_id
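A minimal sketch of this join, runnable with Python's sqlite3 module. The name and member columns and the sample rows are hypothetical; COALESCE and LIMIT 8 are added here so that gangs without members show a count of 0 and only eight gangs are returned, as in the original first query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data
    CREATE TABLE gangs (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE gangs_members (gang_id INTEGER, member TEXT);
    INSERT INTO gangs VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO gangs_members VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")

query = """
    SELECT g.id, g.name, COALESCE(c.member_count, 0) AS member_count
    FROM gangs g
    LEFT JOIN (
        -- one row per gang that has at least one member
        SELECT gang_id, COUNT(*) AS member_count
        FROM gangs_members
        GROUP BY gang_id
    ) c ON c.gang_id = g.id
    ORDER BY g.id
    LIMIT 8
"""

gang_counts = conn.execute(query).fetchall()
print(gang_counts)  # (id, name, member count) per gang
```

The LEFT JOIN keeps gang 'C' in the result even though it has no rows in gangs_members.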
Probably something like this
SELECT count(gang_id)
FROM gangs_members
WHERE gang_id IN (SELECT id FROM gangs LIMIT 8)

Opposite of UNION SQL Query

I have 2 tables :
interests (storing the interest ID and name)
person_interests(storing the person_id and interest_id)
How do I select all the interests that a particular person has not selected?
I have tried the following SQL Query and am still not getting the desired result
SELECT *
FROM interests LEFT JOIN person_interests
ON interests.id=person_interests.person_id
WHERE person_interests.id IS NULL
AND person_id=66;
Use NOT EXISTS
SELECT *
FROM interests
WHERE NOT EXISTS (
SELECT person_interests.interest_id
FROM person_interests
WHERE person_id = 66
AND interests.id = person_interests.interest_id
)
SELECT * from interests
WHERE id NOT IN
(SELECT interest_id FROM person_interests WHERE person_id=66)
There are a couple things going on.
First, I think you have an error in your join. Shouldn't it be interests.id=person_interests.interest_id instead of interests.id=person_interests.person_id?
That aside, I still don't think you would be getting the desired result because your person_id filter is on the RIGHT side of your LEFT OUTER join, thus turning it back into an inner join. There are several ways to solve this. Here's what I would probably do:
SELECT interests.*
FROM interests LEFT JOIN person_interests
ON interests.id=person_interests.interest_id
AND person_interests.person_id=66
WHERE person_interests.interest_id IS NULL;
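Both the NOT EXISTS form and a LEFT JOIN with the person filter placed in the ON clause can be compared side by side. This is a small sketch with Python's sqlite3 module; the name column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data
    CREATE TABLE interests (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person_interests (person_id INTEGER, interest_id INTEGER);
    INSERT INTO interests VALUES (1, 'chess'), (2, 'hiking'), (3, 'cooking');
    INSERT INTO person_interests VALUES (66, 1), (67, 2), (67, 3);
""")

not_exists_rows = conn.execute("""
    SELECT i.id, i.name
    FROM interests i
    WHERE NOT EXISTS (
        SELECT 1
        FROM person_interests pi
        WHERE pi.person_id = 66
          AND pi.interest_id = i.id
    )
    ORDER BY i.id
""").fetchall()

anti_join_rows = conn.execute("""
    SELECT i.id, i.name
    FROM interests i
    LEFT JOIN person_interests pi
        ON pi.interest_id = i.id
       AND pi.person_id = 66          -- filter in ON keeps unmatched interests
    WHERE pi.interest_id IS NULL      -- keep only interests with no match
    ORDER BY i.id
""").fetchall()

print(not_exists_rows)
print(anti_join_rows)
```

Moving the person_id condition into WHERE instead of ON would discard the unmatched rows, which is why the query in the question returns nothing useful.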