How to include US states with zero counts() in Oracle SQL? - sql

I'm struggling to craft a query for an Oracle db that will return "0" for each US state that is not represented in the results of this query:
SELECT TBL_STATES.STATE_ABBR as State, count(tbl_stations.station_state) AS Observations
FROM TBL_STATES
LEFT JOIN TBL_STATIONS ON TBL_STATES.STATE_ABBR = TBL_STATIONS.STATION_STATE
LEFT JOIN TBL_OBSERVATIONS ON TBL_STATIONS.STATION_ID = TBL_OBSERVATIONS.STATION_ID
WHERE EXTRACT(year FROM TBL_OBSERVATIONS.OBSERVATION_DATE)='2015'
AND EXTRACT(month FROM TBL_OBSERVATIONS.OBSERVATION_DATE)='8'
GROUP BY STATE_ABBR
ORDER BY STATE_ABBR
The query currently returns a count for each state in which an observation has been made, as shown here:
STATE | OBSERVATIONS
AZ | 131
CA | 30
CO | 9
FL | 6
...and so on.
What I'd like to see is a count for every entry in TBL_STATES (which contains records for all 50 states + DC & PR):
STATE | OBSERVATIONS
AK | 0
AL | 0
AR | 0
AZ | 131
CA | 30
CO | 9
CT | 0
DC | 0
DE | 0
FL | 6
...etc.
I've also attempted variations of NVL(count(TBL_STATES.STATE_ABBR),0) without success.
What the heck am I missing here?

Just move the date conditions to the ON clause:
SELECT TBL_STATES.STATE_ABBR as State, count(TBL_OBSERVATIONS.STATION_ID) AS Observations
FROM TBL_STATES LEFT JOIN
TBL_STATIONS
ON TBL_STATES.STATE_ABBR = TBL_STATIONS.STATION_STATE LEFT JOIN
TBL_OBSERVATIONS
ON TBL_STATIONS.STATION_ID = TBL_OBSERVATIONS.STATION_ID AND
EXTRACT(year FROM TBL_OBSERVATIONS.OBSERVATION_DATE) = 2015 AND
EXTRACT(month FROM TBL_OBSERVATIONS.OBSERVATION_DATE) = 8
GROUP BY STATE_ABBR
ORDER BY STATE_ABBR;
Because you seem to want to count observations, I also fixed the COUNT() to be from the observations table. This is important when the left join does not find a match.
I am a big fan of using table aliases. Also, direct date comparisons can be more efficient than extract(). So, I would recommend:
SELECT st.STATE_ABBR as State, count(o.STATION_ID) AS Observations
FROM TBL_STATES st LEFT JOIN
TBL_STATIONS sta
ON st.STATE_ABBR = sta.STATION_STATE LEFT JOIN
TBL_OBSERVATIONS o
ON sta.STATION_ID = o.STATION_ID AND
o.OBSERVATION_DATE >= DATE '2015-08-01' AND
o.OBSERVATION_DATE < DATE '2015-09-01'
GROUP BY st.STATE_ABBR
ORDER BY st.STATE_ABBR;

Change the WHERE to AND, moving the date criteria into the join. The WHERE clause negates the LEFT JOIN, effectively making it an inner join.
You could also use a subquery to limit the observations first and then join to it. You should probably also count occurrences of observation_date instead of station_state, since the year isn't in tbl_stations:
SELECT TBL_STATES.STATE_ABBR as State, count(B.OBSERVATION_DATE) AS Observations
FROM TBL_STATES
LEFT JOIN TBL_STATIONS ON TBL_STATES.STATE_ABBR = TBL_STATIONS.STATION_STATE
LEFT JOIN (SELECT * FROM TBL_OBSERVATIONS
WHERE EXTRACT(year FROM TBL_OBSERVATIONS.OBSERVATION_DATE)='2015'
AND EXTRACT(month FROM TBL_OBSERVATIONS.OBSERVATION_DATE)='8') B
ON TBL_STATIONS.STATION_ID = B.STATION_ID
GROUP BY STATE_ABBR
ORDER BY STATE_ABBR
Logically, the left joins return NULL observation columns for states without observations, but the WHERE clause then eliminates those rows: with no observation date, EXTRACT returns NULL, the comparison NULL = 2015 is never true, and the rows are excluded, negating the left join.
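To make the WHERE versus ON difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The three-state data set and the lowercase table names are made up for illustration, and the dates are stored as ISO strings because SQLite has no DATE type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_states (state_abbr TEXT PRIMARY KEY);
CREATE TABLE tbl_stations (station_id INTEGER PRIMARY KEY, station_state TEXT);
CREATE TABLE tbl_observations (station_id INTEGER, observation_date TEXT);
-- Hypothetical sample data: AK has no station, CA has no August observation.
INSERT INTO tbl_states VALUES ('AK'), ('AZ'), ('CA');
INSERT INTO tbl_stations VALUES (1, 'AZ'), (2, 'CA');
INSERT INTO tbl_observations VALUES
    (1, '2015-08-03'), (1, '2015-08-17'), (2, '2015-07-30');
""")

# Date filter in WHERE: rows with NULL observation columns are discarded.
filter_in_where = """
SELECT st.state_abbr, COUNT(o.station_id)
FROM tbl_states st
LEFT JOIN tbl_stations sta ON st.state_abbr = sta.station_state
LEFT JOIN tbl_observations o ON sta.station_id = o.station_id
WHERE o.observation_date >= '2015-08-01'
  AND o.observation_date < '2015-09-01'
GROUP BY st.state_abbr
ORDER BY st.state_abbr
"""

# Date filter in ON: unmatched states survive and COUNT() sees only NULLs.
filter_in_on = """
SELECT st.state_abbr, COUNT(o.station_id)
FROM tbl_states st
LEFT JOIN tbl_stations sta ON st.state_abbr = sta.station_state
LEFT JOIN tbl_observations o
       ON sta.station_id = o.station_id
      AND o.observation_date >= '2015-08-01'
      AND o.observation_date < '2015-09-01'
GROUP BY st.state_abbr
ORDER BY st.state_abbr
"""

where_rows = conn.execute(filter_in_where).fetchall()
on_rows = conn.execute(filter_in_on).fetchall()
print(where_rows)  # [('AZ', 2)]
print(on_rows)     # [('AK', 0), ('AZ', 2), ('CA', 0)]
```

Counting o.station_id rather than a column from the states table is what turns the unmatched rows into zeros, since COUNT(column) ignores NULLs.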

Related

Select MAX and RIGHT OUTER JOIN

I have a platform to extract data from SQL tables, and so far all queries were generated by a simple drag-and-drop tool. Now I am trying to change a query manually, but it's not working as expected...
Can you take a look?
Query delivered by generator:
SELECT
repo.MAT.MAT_A_COD,
inventory.INV.MRP_RQMT_DT,
SUM(inventory.INV.MRP_AVL_QTY)
FROM
repo.MAT RIGHT OUTER JOIN inventory.INV ON (inventory.INV.MRP_MAT_A_FK=repo.MAT.MAT_A_PK)
WHERE
( inventory.INV.MRP_COMPANY_COD IN ('01','02') )
GROUP BY
1,
2
Results:
Material A | 2020.01.01 | 100
Material A | 2020.01.02 | 200
Material A | 2020.01.03 | 300
Material B | 2020.01.01 | 10
Material B | 2020.01.02 | 0
What I am looking for: only values for the latest date for each material.
Material A | 2020.01.03 | 300
Material B | 2020.01.02 | 0
I tried with MAX(inventory.INV.MRP_RQMT_DT), but no success. Any help is appreciated!
You can try the below -
SELECT
repo.MAT.MAT_A_COD,
inventory.INV.MRP_RQMT_DT,
SUM(inventory.INV.MRP_AVL_QTY)
FROM
repo.MAT RIGHT OUTER JOIN inventory.INV ON inventory.INV.MRP_MAT_A_FK=repo.MAT.MAT_A_PK
WHERE
inventory.INV.MRP_COMPANY_COD IN ('01','02') and inventory.INV.MRP_RQMT_DT=(select max(inv1.MRP_RQMT_DT) from inventory.INV inv1 where inv1.MRP_MAT_A_FK=inventory.INV.MRP_MAT_A_FK)
GROUP BY 1, 2
You did not specify the database engine, but the RANK window function works in most major ones (I will use T-SQL syntax). Order by the date descending so that rank 1 is the latest date for each material.
SELECT * FROM (
SELECT
repo.MAT.MAT_A_COD,
inventory.INV.MRP_RQMT_DT,
SUM(inventory.INV.MRP_AVL_QTY) AS MRP_AVL_QTY,
RANK () OVER (PARTITION BY repo.MAT.MAT_A_COD ORDER BY inventory.INV.MRP_RQMT_DT DESC) rn
FROM repo.MAT RIGHT OUTER JOIN inventory.INV ON (inventory.INV.MRP_MAT_A_FK=repo.MAT.MAT_A_PK)
WHERE inventory.INV.MRP_COMPANY_COD IN ('01','02')
GROUP BY repo.MAT.MAT_A_COD, inventory.INV.MRP_RQMT_DT
) ranked
WHERE rn = 1
You can use window functions:
SELECT m.MAT_A_COD, i.MRP_RQMT_DT,
SUM(i.MRP_AVL_QTY)
FROM repo.MAT m LEFT JOIN
(SELECT i.*,
MAX(MRP_RQMT_DT) OVER (PARTITION BY MRP_MAT_A_FK) as max_MRP_RQMT_DT
FROM inventory.INV i
) i
ON i.MRP_MAT_A_FK = m.MAT_A_PK AND
i.MRP_RQMT_DT = i.max_MRP_RQMT_DT
WHERE i.MRP_COMPANY_COD IN ('01', '02')
GROUP BY 1, 2;
Note other changes to the query:
Table aliases make the query easier to write and to read.
An outer join does not seem necessary at all. But if you do use one, it is probably on the MAT table, not the inventory table.
If you use an outer join, you should try to take the columns from the table where you are keeping all the rows -- the first table in a LEFT JOIN. I don't recommend RIGHT JOINs in general.
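As a rough check of the "latest date per material" idea, the sketch below runs a RANK() query against the sample rows from the question using Python's sqlite3 module. It assumes SQLite 3.25 or newer (window function support), uses simplified table names (mat, inv) instead of the schema-qualified originals, and leaves out the company filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mat (mat_a_pk INTEGER PRIMARY KEY, mat_a_cod TEXT);
CREATE TABLE inv (mrp_mat_a_fk INTEGER, mrp_rqmt_dt TEXT, mrp_avl_qty INTEGER);
INSERT INTO mat VALUES (1, 'Material A'), (2, 'Material B');
-- Same figures as the question's result set, with ISO-formatted dates.
INSERT INTO inv VALUES
    (1, '2020-01-01', 100), (1, '2020-01-02', 200), (1, '2020-01-03', 300),
    (2, '2020-01-01', 10), (2, '2020-01-02', 0);
""")

# Rank the per-date totals of each material, newest date first,
# then keep only rank 1.
latest_per_material = """
SELECT mat_a_cod, mrp_rqmt_dt, total_qty
FROM (
    SELECT m.mat_a_cod,
           i.mrp_rqmt_dt,
           SUM(i.mrp_avl_qty) AS total_qty,
           RANK() OVER (PARTITION BY m.mat_a_cod
                        ORDER BY i.mrp_rqmt_dt DESC) AS rn
    FROM inv i
    JOIN mat m ON m.mat_a_pk = i.mrp_mat_a_fk
    GROUP BY m.mat_a_cod, i.mrp_rqmt_dt
) ranked
WHERE rn = 1
ORDER BY mat_a_cod
"""

rows = conn.execute(latest_per_material).fetchall()
print(rows)
# [('Material A', '2020-01-03', 300), ('Material B', '2020-01-02', 0)]
```

The descending sort order is the part that matters: with the default ascending order, rank 1 would be the earliest date instead.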

How to join the results of two queries in just one table grouped by YEAR and MONTH?

I have two tables, materials_students and components_students. Both of them have a finished_at column. materials_students also has a component_student_id column.
I need to count the number of components_students and materials_students (where finished_at is not NULL), extract the month and year from finished_at, group the result by month and year, and show it in just one table, like this:
Materials | Components | Month | Year
45 | 3 | 1 | 2019
37 | 6 | 2 | 2019
63 | 8 | 3 | 2019
I know how to do this for one table only, but don't know how to combine the results into just one table.
Below is how I did it for one table:
FROM materials_students
LEFT JOIN students ON materials_students.student_id = students.id
LEFT JOIN company_profiles ON students.company_profile_id = company_profiles.id
LEFT JOIN companies ON company_profiles.company_id = companies.id
WHERE materials_students.finished_at IS NOT NULL
GROUP BY YEAR, MONTH
ORDER BY YEAR, MONTH
Thanks!
The best approach is to build a subquery for each table, then join them.
select
ISNULL(M.yy, C.yy) [yy],
ISNULL(M.mm, C.mm) [mm],
ISNULL(number_material_students, 0) [number_material_students],
ISNULL(number_components_students, 0) [number_component_students]
from (
SELECT
year(materials_students.finished_at) yy,
month(materials_students.finished_at) mm,
count(*) number_material_students
FROM materials_students
LEFT JOIN students ON materials_students.student_id = students.id
LEFT JOIN company_profiles ON students.company_profile_id = company_profiles.id
LEFT JOIN companies ON company_profiles.company_id = companies.id
WHERE materials_students.finished_at IS NOT NULL
GROUP BY year(materials_students.finished_at), month(materials_students.finished_at)
) M
full outer join (
SELECT
year(components_students.finished_at) yy,
month(components_students.finished_at) mm,
count(*) number_components_students
FROM components_students
LEFT JOIN students ON components_students.student_id = students.id
LEFT JOIN company_profiles ON students.company_profile_id = company_profiles.id
LEFT JOIN companies ON company_profiles.company_id = companies.id
WHERE components_students.finished_at IS NOT NULL
GROUP BY year(components_students.finished_at), month(components_students.finished_at)
) C
ON C.yy = M.yy AND C.mm = M.mm
ORDER BY 1, 2
I used a FULL OUTER JOIN between the subqueries because some year/month combinations may appear only in materials and not in components, and vice versa.
To retrieve the year I use the ISNULL() function, so if the year is not filled in by the materials subquery, it is taken from the components subquery. The same reasoning applies to all the other resulting columns.
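The same "one row per year/month, zero when one side is missing" result can be sketched without FULL OUTER JOIN, which older SQLite builds do not support. The example below is a swapped-in technique, not the answer's query: it stacks the two tables with UNION ALL and sums per month. It runs on Python's sqlite3 module, the sample rows are invented, and the joins to students and companies are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE materials_students (id INTEGER PRIMARY KEY, finished_at TEXT);
CREATE TABLE components_students (id INTEGER PRIMARY KEY, finished_at TEXT);
-- Hypothetical data: February has only materials, March only components.
INSERT INTO materials_students (finished_at) VALUES
    ('2019-01-05'), ('2019-01-20'), ('2019-02-11'), (NULL);
INSERT INTO components_students (finished_at) VALUES
    ('2019-01-09'), ('2019-03-02'), (NULL);
""")

# Each finished row contributes 1 to its own column and 0 to the other,
# so a month present in only one table still produces a row.
query = """
SELECT SUM(materials) AS materials,
       SUM(components) AS components,
       mm, yy
FROM (
    SELECT 1 AS materials, 0 AS components,
           CAST(strftime('%m', finished_at) AS INTEGER) AS mm,
           CAST(strftime('%Y', finished_at) AS INTEGER) AS yy
    FROM materials_students
    WHERE finished_at IS NOT NULL
    UNION ALL
    SELECT 0, 1,
           CAST(strftime('%m', finished_at) AS INTEGER),
           CAST(strftime('%Y', finished_at) AS INTEGER)
    FROM components_students
    WHERE finished_at IS NOT NULL
) AS finished
GROUP BY yy, mm
ORDER BY yy, mm
"""

rows = conn.execute(query).fetchall()
print(rows)  # [(2, 1, 1, 2019), (1, 0, 2, 2019), (0, 1, 3, 2019)]
```

On SQL Server the FULL OUTER JOIN form works as written in the answer; the UNION ALL form is mainly useful where full joins are unavailable.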

Full outer join not bringing all results SQL-Server

I have an issue trying to retrieve all results from a join. I have set up a similar scenario in SQL fiddle and it works but in SQL Server it doesn't. I want to bring results for everything if they're either invoiced or shipped.
The result I am getting in SQL Server is:
| No | Order1 | Shipdate | No | Order1 | InvDate |
|-----|--------|----------|--------|--------|----------|
| 111 | 222 | 17-01-18 | 111 | 222 | 24-01-18 |
| 222 | 333 | 18-01-18 | 222 | 333 | 24-01-18 |
Even if I change the join to a full outer join or a right join, I still get this result.
I would have thought that a full outer join would bring all the results back regardless of matches, but it doesn't.
What am I missing to get the full outer result? Thanks
sql fiddle - http://sqlfiddle.com/#!18/89943/1
This is your query:
SELECT S.No, s.Order1, s.Shipdate, i.No, i.Order1, i.InvDate
FROM Ship S LEFT JOIN
Invoice I
ON s.No=i.No AND s.Order1 = i.Order1
WHERE S.Person = 1;
Changing the LEFT JOIN to FULL JOIN doesn't change anything. The WHERE clause turns the FULL JOIN back into a LEFT JOIN, because rows that exist only in Invoice have NULL in every Ship column, including S.Person, and so fail the WHERE condition.
The FULL OUTER JOIN query can be set up as follows:
SELECT
S.No, s.Order1, s.Shipdate,
i.No, i.Order1, i.InvDate
FROM Ship S
FULL JOIN Invoice I
ON s.No = i.No AND
s.Order1=i.Order1
-- WHERE S.Person = 1
Without the WHERE clause, this returns the unmatched rows from both tables as well as the matches.
But adding the following filter to the WHERE clause makes the FULL OUTER JOIN produce exactly the same result as a LEFT OUTER JOIN:
WHERE S.Person = 1
The equivalent LEFT JOIN:
SELECT
S.No, s.Order1, s.Shipdate,
i.No, i.Order1, i.InvDate
FROM Ship S
LEFT JOIN Invoice I
ON s.No = i.No AND
s.Order1=i.Order1
WHERE S.Person = 1
The FULL OUTER JOIN first pairs the rows that meet the ON criteria (S.[No]=I.[No] AND S.Order1=I.Order1) and adds the unmatched rows from both tables, with NULLs for the other side. The WHERE clause then filters that combined result, so WHERE S.Person=1 discards every row that exists only in Invoice.
So it is necessary to test I.Person=1 as well as S.Person=1, combined with OR, to keep rows that have been invoiced but not shipped as well as rows that have been shipped but not invoiced (if any).
SELECT S.[No],S.Order1,S.Shipdate,I.No,I.Order1,I.InvDate
FROM Ship S FULL OUTER JOIN Invoice I
ON S.[No]=I.[No] AND S.Order1=I.Order1
WHERE S.Person=1 OR I.Person=1
I am assuming you are looking only for Person 1's data.
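The effect of the two WHERE clauses can be sketched with Python's sqlite3 module. Several things here are assumptions: Invoice is given a Person column (as the query above requires), the No column is renamed doc_no, the rows are invented, and the full outer join is emulated with LEFT JOIN plus UNION ALL so the example also runs on SQLite builds that lack FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ship (doc_no INTEGER, order1 INTEGER, person INTEGER);
CREATE TABLE invoice (doc_no INTEGER, order1 INTEGER, person INTEGER);
-- Hypothetical data: 333 is shipped but not invoiced, 555 is invoiced only.
INSERT INTO ship VALUES (111, 222, 1), (333, 444, 1);
INSERT INTO invoice VALUES (111, 222, 1), (555, 666, 1);

-- Emulated FULL OUTER JOIN: all ship rows with their matches,
-- plus the invoice rows that have no matching ship row.
CREATE VIEW full_join AS
SELECT s.doc_no AS s_no, s.person AS s_person,
       i.doc_no AS i_no, i.person AS i_person
FROM ship s
LEFT JOIN invoice i ON s.doc_no = i.doc_no AND s.order1 = i.order1
UNION ALL
SELECT NULL, NULL, i.doc_no, i.person
FROM invoice i
WHERE NOT EXISTS (SELECT 1 FROM ship s
                  WHERE s.doc_no = i.doc_no AND s.order1 = i.order1);
""")

# Filtering on the ship side only removes the invoice-only row.
ship_only_filter = conn.execute("""
SELECT s_no, i_no FROM full_join
WHERE s_person = 1
ORDER BY COALESCE(s_no, i_no)
""").fetchall()

# Testing either side keeps the unmatched rows from both tables.
either_side_filter = conn.execute("""
SELECT s_no, i_no FROM full_join
WHERE s_person = 1 OR i_person = 1
ORDER BY COALESCE(s_no, i_no)
""").fetchall()

print(ship_only_filter)    # [(111, 111), (333, None)]
print(either_side_filter)  # [(111, 111), (333, None), (None, 555)]
```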
You can use cross join to get the result for this.
Code
SELECT S.No, s.Order1, s.Shipdate, i.No, i.Order1, i.InvDate
FROM Ship S CROSS JOIN
Invoice I
WHERE S.Person = 1;

Select all from max date

Good morning,
I am writing a SQL query for the latest metal prices with the latest date they were put into the database. Example table below:
ID Date Created
1 01/01/01 01:01
2 01/01/01 01:02
3 01/01/01 01:03
4 01/01/01 01:04
1 02/01/01 01:01
2 02/01/01 01:02
So from this I want the following result:
ID Date Created
1 02/01/01 01:01
2 02/01/01 01:02
When I run the query below, it just gives me the last one entered into the database, so from the above example it would be ID 2, DateCreated 02/01/01 01:02. The query I am using is below:
SELECT mp.MetalSourceID, ROUND(mp.PriceInPounds,2),
mp.UnitPrice, mp.HighUnitPrice, mp.PreviousUnitPrice,
mp.PreviousHighUnitPrice, ms.MetalSourceName,
ms.UnitBasis, cu.Currency
FROM tblMetalPrice AS mp
INNER JOIN tblMetalSource AS ms
ON tblMetalPrice.MetalSourceID = tblMetalSource.MetalSourceID
INNER JOIN tblCurrency AS cu
ON tblMetalSource.CurrencyID = tblCurrency.CurrencyID
WHERE DateCreated = (SELECT MAX (DateCreated) FROM tblMetalPrice)
GROUP BY mp.MetalSourceID;
Could anyone please help? It's driving me crazy not knowing, and my brain is dead this Friday morning.
Thanks
Use a correlated subquery for the where clause:
WHERE DateCreated = (SELECT MAX(DateCreated) FROM tblMetalPrice mp2 WHERE mp2.id = mp.id)
You can join on a subquery, and I don't think you'll need the group by, or indeed the where clause (because that's handled by the join).
SELECT mp.MetalSourceID,
ROUND(mp.PriceInPounds,2),
mp.UnitPrice,
mp.HighUnitPrice,
mp.PreviousUnitPrice,
mp.PreviousHighUnitPrice,
ms.MetalSourceName,
ms.UnitBasis,
cu.Currency
FROM tblMetalPrice AS mp
INNER JOIN tblMetalSource AS ms
ON mp.MetalSourceID = ms.MetalSourceID
INNER JOIN tblCurrency AS cu
ON ms.CurrencyID = cu.CurrencyID
INNER JOIN (SELECT ID,MAX(DateCreated) AS maxdate FROM tblMetalPrice GROUP BY ID) AS md
ON md.ID = mp.ID
AND md.maxdate = mp.DateCreated
with maxDates as
(select max(datecreated) maxd, ids grp , count(1) members from s_tableA group by ids having count(1) > 1)
select ids, datecreated from s_tableA,maxDates
where maxd = datecreated and ids = grp;
This query will give your desired result. Correlated subqueries tend to consume a lot of processing time, because the inner query may have to be re-evaluated for each row of the outer query.
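For comparison, here is a small sketch showing that the correlated subquery and the join against a grouped subquery pick the same "latest row per ID". It uses Python's sqlite3 module with an invented prices table (the column names and values are hypothetical, not the asker's schema) and says nothing about relative performance, which depends on the engine and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prices (id INTEGER, date_created TEXT, price REAL);
-- Hypothetical data: two IDs, each with an older and a newer price.
INSERT INTO prices VALUES
    (1, '2001-01-01 01:01', 10.0), (2, '2001-01-01 01:02', 20.0),
    (1, '2001-01-02 01:01', 11.0), (2, '2001-01-02 01:02', 21.0);
""")

# Correlated subquery: the inner MAX() is tied to the outer row's id.
correlated = """
SELECT p.id, p.date_created, p.price
FROM prices p
WHERE p.date_created = (SELECT MAX(p2.date_created)
                        FROM prices p2
                        WHERE p2.id = p.id)
ORDER BY p.id
"""

# Join on a derived table that holds one max date per id.
joined = """
SELECT p.id, p.date_created, p.price
FROM prices p
JOIN (SELECT id, MAX(date_created) AS maxdate
      FROM prices
      GROUP BY id) md
  ON md.id = p.id AND md.maxdate = p.date_created
ORDER BY p.id
"""

correlated_rows = conn.execute(correlated).fetchall()
joined_rows = conn.execute(joined).fetchall()
print(correlated_rows)
# [(1, '2001-01-02 01:01', 11.0), (2, '2001-01-02 01:02', 21.0)]
print(correlated_rows == joined_rows)  # True
```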

3 joins and where clause together

I have 3 tables
bl_main (bl_id UNIQUE, bl_area)
bl_details (bl_id UNIQUE, name)
bl_data(bl_id, month, paper_tons, bottles_tons)
bl_id is not unique in the last table. There will be multiple rows of same bl_id.
I am trying to retrieve data in the following way
bl_id | name | bl_area | sum(paper_tons) | sum (bottles_tons) | paper_tons | bottles_tons
sum(paper_tons) should return the sum of all the paper tons for the same bl_id like Jan to December.
Using the query below I am able to retrieve all the data correctly, except that the result contains multiple occurrences of each bl_id (from the bl_data table).
SELECT bl_main.bl_id,name,bl_area,sums.SummedPaper, sums.SummedBottles,paper_tons,bottles_tons
FROM bl_main
JOIN bl_details ON
bl_main.bl_id= bl_details.bl_id
left outer JOIN bl_data ON
bl_data.bl_id= bl_main.bl_id
left outer JOIN (
SELECT bl_id, SUM(Paper_tons) As SummedPaper, SUM(bottle_tons) As SummedBottles
FROM bl_data
GROUP by bl_id) sums ON sums.bl_id = bl_main.bl_id
I want to retrieve each bl_id only once, without repetition, and the row should be the one with the max month rather than all the months for the same bl_id.
For ex:
INCORRECT
**0601** University Hall 75.76 17051 1356 4040 1154 **11**
**0601** University Hall 75.76 17051 1356 9190 101 **12**
**0605** UIC Student 22.86 3331 14799 0 356 **8**
CORRECT
**0601** University Hall 75.76 17051 1356 9190 101 **12**
**0605** UIC Student 22.86 3331 14799 0 356 **8**
I know I can get the max value using
WHERE Month = (SELECT MAX(Month)
but where exactly should I add this in the query, and should I change the join definition?
Any help is highly appreciated as I am new to SQL. Thanks in advance.
You have two tables that probably should be combined into one (bl_main and bl_details). But putting that aside, what you need is a self-join subquery to select the row with the max month. Something like the following (untested):
SELECT bl_main.bl_id, bl_details.name, bl_main.bl_area, sums.sum_paper_tons,
sums.sum_bottles_tons, maxmonth.paper_tons, maxmonth.bottles_tons
FROM bl_main
INNER JOIN bl_details ON bl_main.bl_id = bl_details.bl_id
LEFT OUTER JOIN (SELECT bl_id, SUM(paper_tons) AS sum_paper_tons,
SUM(bottles_tons) AS sum_bottles_tons
FROM bl_data
GROUP BY bl_id) sums ON bl_main.bl_id = sums.bl_id
LEFT OUTER JOIN (SELECT data2.bl_id, data2.paper_tons, data2.bottles_tons
FROM bl_data data2
INNER JOIN (SELECT bl_id, MAX(month) AS max_month
FROM bl_data
GROUP BY bl_id) m
ON m.bl_id = data2.bl_id
AND m.max_month = data2.month) maxmonth
ON bl_main.bl_id = maxmonth.bl_id
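The query above is marked untested, so here is a minimal runnable sketch of the same shape (one derived table for the sums, one for the max-month row) using Python's sqlite3 module. The monthly figures are loosely based on the question's example; the aliases and the extra month column in the output are my own additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bl_main (bl_id TEXT PRIMARY KEY, bl_area REAL);
CREATE TABLE bl_details (bl_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE bl_data (bl_id TEXT, month INTEGER,
                      paper_tons INTEGER, bottles_tons INTEGER);
INSERT INTO bl_main VALUES ('0601', 75.76), ('0605', 22.86);
INSERT INTO bl_details VALUES ('0601', 'University Hall'), ('0605', 'UIC Student');
INSERT INTO bl_data VALUES
    ('0601', 11, 4040, 1154), ('0601', 12, 9190, 101), ('0605', 8, 0, 356);
""")

# "sums" has one row per bl_id with the totals; "latest" has one row per
# bl_id holding the figures of its highest month.
query = """
SELECT mn.bl_id, d.name, mn.bl_area,
       sums.sum_paper, sums.sum_bottles,
       latest.paper_tons, latest.bottles_tons, latest.month
FROM bl_main mn
JOIN bl_details d ON d.bl_id = mn.bl_id
LEFT JOIN (SELECT bl_id,
                  SUM(paper_tons) AS sum_paper,
                  SUM(bottles_tons) AS sum_bottles
           FROM bl_data
           GROUP BY bl_id) sums
       ON sums.bl_id = mn.bl_id
LEFT JOIN (SELECT x.bl_id, x.month, x.paper_tons, x.bottles_tons
           FROM bl_data x
           JOIN (SELECT bl_id, MAX(month) AS max_month
                 FROM bl_data
                 GROUP BY bl_id) m
             ON m.bl_id = x.bl_id AND m.max_month = x.month) latest
       ON latest.bl_id = mn.bl_id
ORDER BY mn.bl_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('0601', 'University Hall', 75.76, 13230, 1255, 9190, 101, 12)
# ('0605', 'UIC Student', 22.86, 0, 356, 0, 356, 8)
```

Because both derived tables return at most one row per bl_id, the outer query no longer repeats a bl_id for every month.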
You can join the table containing the month against itself, using a subquery of the form:
Select *
From mytable m
Inner Join (Select max(Month) as Month, myId
From mytable
Group By myId) mnth
On mnth.myId = m.myId and mnth.Month = m.Month
Your JOIN clause
left outer JOIN bl_data ON
bl_data.bl_id= bl_main.bl_id
does not specify which month to select for the data you are displaying with paper_tons and bottles_tons.
You could update that JOIN to only contain the max month, and this should limit the entries, like so:
left outer JOIN (SELECT bl_id, MAX(Month) as MaxMonth from bl_data GROUP BY bl_id) as mx
ON mx.bl_id = bl_main.bl_id
left outer JOIN bl_data ON
bl_data.bl_id = bl_main.bl_id AND bl_data.Month = mx.MaxMonth
I think this query is what you are looking for
SELECT bl_main.bl_id,name, bl_area, sums.SummedPaper, sums.SummedBottles, paper_tons, bottles_tons
FROM bl_main
JOIN bl_details ON bl_main.bl_id= bl_details.bl_id
left outer JOIN bl_data ON bl_data.bl_id= bl_main.bl_id
left outer JOIN
(
SELECT bl_id, month, SUM(Paper_tons) As SummedPaper, SUM(bottle_tons) As SummedBottles
FROM bl_data
WHERE month in
(SELECT MAX(month) FROM bl_data GROUP BY bl_id)
GROUP BY bl_id, month
) sums ON sums.bl_id = bl_main.bl_id
I wanted to just add a comment to the answer lc gave, but I don't have 50 reputation points yet. Here is a link to an article that I believe covers this question and explains why the solution lc gave is correct.
http://www.sqlteam.com/article/how-to-use-group-by-with-distinct-aggregates-and-derived-tables