How to find highest count of result set using multiple tables in SQL (Oracle)

I have four tables. Here are the skeletons...
ACADEMIC_TBL
academic_id
academic_name
AFFILIATION_TBL
academic_id*
institution_id*
joined_date
leave_date
INSTITUTION_TBL
institution_id
institution_name
REVIEW_TBL
academic_id*
institution_id*
date_posted
review_score
Using these tables, I need to find the academic (displaying their name, not ID) with the highest number of reviews, along with the name (not ID) of the institution they are currently affiliated with. I imagine this will need several subqueries, but I'm having trouble figuring out how to structure it.

This will work:
SELECT at.academic_name,
it.institution_name,
MAX(rt.review_score) AS max_score
FROM academic_tbl at,
affiliation_tbl afft,
institution_tbl it,
review_tbl rt
WHERE at.academic_id = afft.academic_id
AND afft.institution_id = it.institution_id
AND afft.academic_id = rt.academic_id
GROUP BY at.academic_name, it.institution_name

You need an aggregated query that JOINs all 4 tables to count how many reviews were performed by each academic.
Query :
SELECT
inst.institution_name,
aca.academic_name,
COUNT(*) AS review_count
FROM
academic_tbl aca
INNER JOIN affiliation_tbl aff ON aff.academic_id = aca.academic_id
INNER JOIN institution_tbl inst ON inst.institution_id = aff.institution_id
INNER JOIN review_tbl rev ON rev.academic_id = aca.academic_id AND rev.institution_id = aff.institution_id
GROUP BY
inst.institution_name,
aca.academic_name,
inst.institution_id,
aca.academic_id
NB:
I added the academic and institution IDs to the GROUP BY clause so that academics or institutions that happen to share a name are not (wrongly) grouped together.
If the same academic performed reviews for different institutions, you will get one row for each academic/institution pair, which, if I understood you correctly, is what you want.
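To go from per-academic counts to the single top academic and their current institution, one option is to count reviews per academic first and then join the affiliation that is still open. The sketch below runs against an in-memory SQLite database rather than Oracle, uses made-up rows, and assumes that a NULL leave_date marks the current affiliation (the question does not say how "current" is recorded).

```python
import sqlite3

# Toy data; every row below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE academic_tbl (academic_id INTEGER PRIMARY KEY, academic_name TEXT);
CREATE TABLE institution_tbl (institution_id INTEGER PRIMARY KEY, institution_name TEXT);
CREATE TABLE affiliation_tbl (academic_id INTEGER, institution_id INTEGER,
                              joined_date TEXT, leave_date TEXT);
CREATE TABLE review_tbl (academic_id INTEGER, institution_id INTEGER,
                         date_posted TEXT, review_score INTEGER);
INSERT INTO academic_tbl VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO institution_tbl VALUES (10, 'Uni A'), (20, 'Uni B');
-- Ada moved from Uni A to Uni B; Grace is still at Uni A.
INSERT INTO affiliation_tbl VALUES
  (1, 10, '2015-01-01', '2018-12-31'),
  (1, 20, '2019-01-01', NULL),
  (2, 10, '2016-01-01', NULL);
INSERT INTO review_tbl VALUES
  (1, 10, '2016-05-01', 4), (1, 10, '2017-05-01', 5), (1, 20, '2020-05-01', 3),
  (2, 10, '2017-06-01', 5);
""")

top_academics = conn.execute("""
WITH review_counts AS (
    -- One row per academic, counting reviews across all institutions.
    SELECT academic_id, COUNT(*) AS review_count
    FROM review_tbl
    GROUP BY academic_id
)
SELECT aca.academic_name, inst.institution_name, rc.review_count
FROM review_counts rc
JOIN academic_tbl aca ON aca.academic_id = rc.academic_id
JOIN affiliation_tbl aff ON aff.academic_id = rc.academic_id
                        AND aff.leave_date IS NULL  -- assumption: NULL means "current"
JOIN institution_tbl inst ON inst.institution_id = aff.institution_id
WHERE rc.review_count = (SELECT MAX(review_count) FROM review_counts)
""").fetchall()

print(top_academics)
```

Comparing against MAX(review_count) keeps ties, so more than one academic can come back if several share the top count.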

Try this one:
select
inst.institution_name
, aca.academic_name
from
academic_tbl aca
, institution_tbl inst
, affiliation_tbl aff
, review_tbl rev
, (
select
max(rt.review_score) max_score
from
review_tbl rt
, affiliation_tbl aff_inn
where
rt.date_posted >= aff_inn.joined_date
and rt.date_posted <= aff_inn.leave_date
and rt.academic_id = aff_inn.academic_id
and rt.institution_id = aff_inn.institution_id
)
agg
where
aca.academic_id = aff.academic_id
and inst.institution_id = aff.institution_id
and aff.institution_id = rev.institution_id
and aff.academic_id = rev.academic_id
and rev.date_posted >= aff.joined_date
and rev.date_posted <= aff.leave_date
and rev.review_score = agg.max_score
;
It might return more than one academic if several share the same maximum score.

Related

Joining multiple CTEs

I am working on a database of a large retail store.
I have to query data from multiple tables to get numbers such as revenue, raw proceeds and compare different time periods.
Most of it is quite easy but I was struggling to work out a way of joining multiple CTEs.
I made a fiddle so you know what I am talking about.
I simplified the structure a lot and left out quite a few columns in the subqueries because they do not matter in this case.
As you can see every row in every table has country and brand in it.
The final query has to be grouped by those.
What I first tried was to FULL JOIN all the tables, but that didn't work in some cases as you can see here: SQLfiddle #1. Note the two last rows which did not group correctly.
Select Coalesce(incoming.country, revenue.country, revcompare.country,
openord.country) As country,
Coalesce(incoming.brand, revenue.brand, revcompare.brand,
openord.brand) As brand,
incoming.OrdersNet,
openord.OpenOrdersNet,
revenue.Revenue,
revenue.RawProceeds,
revcompare.RevenueCompare,
revcompare.RawProceedsCompare
From incoming
Full Join openord On openord.country = incoming.country And
openord.brand = incoming.brand
Full Join revenue On revenue.country = incoming.country And
revenue.brand = incoming.brand
Full Join revcompare On revcompare.country = incoming.country And
revcompare.brand = incoming.brand
Group By incoming.OrdersNet,
openord.OpenOrdersNet,
revenue.Revenue,
revenue.RawProceeds,
revcompare.RevenueCompare,
revcompare.RawProceedsCompare,
incoming.country,
revenue.country,
openord.country,
revcompare.country,
incoming.brand,
revenue.brand,
revcompare.brand,
openord.brand
Order By country,
brand
I then rewrote the query keeping all the CTEs. I added another CTE (basis) which UNIONs all the possible country and brand combinations and left joined on that one.
Now it works fine (check it out here -> SQLfiddle #2) but it just seems so complicated. Isn't there an easier way to achieve this? The only thing I probably won't be able to change are the CTEs as in real life they are way more complex.
WITH basis AS (
SELECT Country, Brand FROM incoming
UNION
SELECT Country, Brand FROM openord
UNION
SELECT Country, Brand FROM revenue
UNION
SELECT Country, Brand FROM revcompare
)
SELECT
basis.Country,
basis.Brand,
incoming.OrdersNet,
openord.OpenOrdersNet,
revenue.Revenue,
revenue.RawProceeds,
revcompare.RevenueCompare,
revcompare.RawProceedsCompare
FROM basis
LEFT JOIN incoming On incoming.Country = basis.Country AND incoming.Brand = basis.Brand
LEFT JOIN openord On openord.Country = basis.Country AND openord.Brand = basis.Brand
LEFT JOIN revenue On revenue.Country = basis.Country AND revenue.Brand = basis.Brand
LEFT JOIN revcompare On revcompare.Country = basis.Country AND revcompare.Brand = basis.Brand
Thank you all for your help!
Since you only work with two tables, orders and rev, consider conditional aggregation: move the WHERE conditions into CASE logic so that a single aggregate query does the work. Also, consider a single CTE holding all possible country/brand pairs and LEFT JOIN the two tables to it.
WITH cb AS (
SELECT Country, Brand FROM orders
UNION
SELECT Country, Brand FROM rev
)
SELECT cb.Country
, cb.Brand
, SUM(o.netprice) AS OrdersNet
, SUM(CASE
WHEN o.isopen = 1
THEN o.netprice
END) AS OpenOrdersNet
, SUM(CASE
WHEN r.bdate BETWEEN '2020-12-01' AND '2020-12-31'
THEN r.netprice
END) AS Revenue
, SUM(CASE
WHEN r.bdate BETWEEN '2020-12-01' AND '2020-12-31'
THEN r.rpro
END) AS RawProceeds
, SUM(CASE
WHEN r.bdate BETWEEN '2020-11-01' AND '2020-11-30'
THEN r.netprice
END) AS RevenueCompare
, SUM(CASE
WHEN r.bdate BETWEEN '2020-11-01' AND '2020-11-30'
THEN r.rpro
END) AS RawProceedsCompare
FROM cb
LEFT JOIN orders o
ON cb.Country = o.Country
AND cb.Brand = o.Brand
LEFT JOIN rev r
ON cb.Country = r.Country
AND cb.Brand = r.Brand
GROUP BY cb.Country
, cb.Brand
SQL Fiddle
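One thing worth checking with this pattern: if orders and rev can each hold several rows for the same country/brand pair, joining both to cb in one pass pairs every order row with every rev row, which inflates the sums. A possible workaround is to aggregate each table on its own before joining. The sketch below uses SQLite with made-up rows and only a subset of the columns; the CTE names o and r are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (country TEXT, brand TEXT, netprice REAL, isopen INTEGER);
CREATE TABLE rev (country TEXT, brand TEXT, bdate TEXT, netprice REAL, rpro REAL);
-- Two order rows and two revenue rows for the same country/brand pair.
INSERT INTO orders VALUES ('DE', 'X', 100, 0), ('DE', 'X', 50, 1);
INSERT INTO rev VALUES ('DE', 'X', '2020-12-05', 10, 1), ('DE', 'X', '2020-12-20', 20, 2);
""")

CB = """
WITH cb AS (
    SELECT country, brand FROM orders
    UNION
    SELECT country, brand FROM rev
)
"""

# Joining both detail tables at once yields 2 x 2 = 4 rows for DE/X,
# so every order is counted twice.
joined_directly = conn.execute(CB + """
SELECT cb.country, cb.brand, SUM(o.netprice) AS OrdersNet
FROM cb
LEFT JOIN orders o ON o.country = cb.country AND o.brand = cb.brand
LEFT JOIN rev r ON r.country = cb.country AND r.brand = cb.brand
GROUP BY cb.country, cb.brand
""").fetchall()

# Aggregating each table to one row per pair first avoids the multiplication.
pre_aggregated = conn.execute(CB + """,
o AS (
    SELECT country, brand,
           SUM(netprice) AS OrdersNet,
           SUM(CASE WHEN isopen = 1 THEN netprice END) AS OpenOrdersNet
    FROM orders
    GROUP BY country, brand
),
r AS (
    SELECT country, brand,
           SUM(CASE WHEN bdate BETWEEN '2020-12-01' AND '2020-12-31'
                    THEN netprice END) AS Revenue
    FROM rev
    GROUP BY country, brand
)
SELECT cb.country, cb.brand, o.OrdersNet, o.OpenOrdersNet, r.Revenue
FROM cb
LEFT JOIN o ON o.country = cb.country AND o.brand = cb.brand
LEFT JOIN r ON r.country = cb.country AND r.brand = cb.brand
""").fetchall()

print(joined_directly)
print(pre_aggregated)
```

If each table is guaranteed to have at most one row per pair, the single-pass version is fine as written.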

How to use 1 SQL query related to date and time in order to compare value difference

SELECT c.category, a.treatment_id, MAX(a.counts - b.counts) AS ReviewDifference
FROM
(SELECT treatment_id, COUNT(treatment_id) AS counts
FROM review
WHERE DATE(review.created) BETWEEN DATE(TIMESTAMP'2016-01-01 00:00:00.0') AND DATE(TIMESTAMP'2016-12-31 23:59:59.999')
GROUP BY treatment_id) a
LEFT JOIN
(SELECT treatment_id, COUNT(treatment_id) AS counts
FROM review
WHERE DATE(review.created) BETWEEN DATE(TIMESTAMP'2015-01-01 00:00:00.0') AND DATE(TIMESTAMP'2015-12-31 23:59:59.999')
GROUP BY treatment_id) b
ON a.treatment_id = b.treatment_id
LEFT JOIN
(SELECT t.treatment_category AS category, r.treatment_id AS number
FROM treatment t
LEFT JOIN review r
ON t.treatment_id = r.treatment_id
GROUP BY category, number) c
ON b.treatment_id = c.number
GROUP BY a.treatment_id, c.category
ORDER BY ReviewDifference DESC
LIMIT 1;
I need some hints, or a simpler query, for the following question, since it involves dates and times. Thank you.
What treatment category has seen the biggest increase in reviews from 2015 to 2016?
Please see below for the tables.
I have provided my code snippet above, and I would like to find a simpler and cleaner way to write it.
SELECT t.treatment_id, t.treatment_name,
COUNT(CASE WHEN YEAR(r.created) = 2016 THEN r.review_id END)
- COUNT(CASE WHEN YEAR(r.created) = 2015 THEN r.review_id END) AS review_count
FROM treatments t
JOIN reviews r
ON t.treatment_id = r.treatment_id
GROUP BY t.treatment_id, t.treatment_name
ORDER BY review_count DESC
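The question asks about treatment categories rather than individual treatments, so the same conditional-aggregation idea can be grouped one level higher. This is a minimal sketch in SQLite with made-up rows: strftime('%Y', ...) stands in for YEAR(), and the table and column names (treatment, review, treatment_category, created) are assumptions taken from the question's query, since the table definitions are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE treatment (treatment_id INTEGER PRIMARY KEY, treatment_category TEXT);
CREATE TABLE review (review_id INTEGER PRIMARY KEY, treatment_id INTEGER, created TEXT);
INSERT INTO treatment VALUES (1, 'Dental'), (2, 'Dental'), (3, 'Skin');
-- Dental: 1 review in 2015, 4 in 2016. Skin: 2 in 2015, 3 in 2016.
INSERT INTO review (treatment_id, created) VALUES
  (1, '2015-03-01 10:00:00'),
  (3, '2015-04-01 10:00:00'), (3, '2015-09-01 10:00:00'),
  (1, '2016-01-15 10:00:00'), (1, '2016-02-15 10:00:00'),
  (2, '2016-03-15 10:00:00'), (2, '2016-04-15 10:00:00'),
  (3, '2016-05-15 10:00:00'), (3, '2016-06-15 10:00:00'), (3, '2016-07-15 10:00:00');
""")

biggest_increase = conn.execute("""
SELECT t.treatment_category,
       -- COUNT ignores NULLs, so each CASE counts only the rows of one year.
       COUNT(CASE WHEN strftime('%Y', r.created) = '2016' THEN r.review_id END)
     - COUNT(CASE WHEN strftime('%Y', r.created) = '2015' THEN r.review_id END)
       AS review_increase
FROM treatment t
JOIN review r ON r.treatment_id = t.treatment_id
GROUP BY t.treatment_category
ORDER BY review_increase DESC
LIMIT 1
""").fetchone()

print(biggest_increase)
```

LIMIT 1 returns a single row even when two categories tie for the largest increase; which one comes back in that case is not defined.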

By store (number), list the maximum number of store visits made by a customer

The three tables that I'm linking are item_scan_fact, member_dimension and store_dimension. So far this is what I have:
SELECT
store_dimension.store_number,
member_dimension.member_number
COUNT (item_scan_fact.visit_number) AS NumVisits
FROM
member_dimension,
item_scan_fact
INNER JOIN store_dimension
ON item_scan_fact.member_key = member_dimension.member_key
AND item_scan_fact.store_key = store_dimension.store_key
GROUP BY
store_dimension.store_number,
member_dimension.member_number, NumVisits;
On the surface, this appears solvable with a couple of common table expressions.
Does this help point you in the right direction?
WITH s1 -- JJAUSSI: Find the visit_number_count by member_key and store_key
AS
(SELECT isf.member_key
,isf.store_key
-- JJAUSSI: DISTINCT resolves a potential 1:N (one to many) relationship here
,COUNT( DISTINCT isf.visit_number) AS visit_number_count
FROM item_scan_fact isf
GROUP BY isf.member_key
,isf.store_key),
s2 -- JJAUSSI: Find the visit_number_count_max by member_key
AS
(SELECT s1.member_key
,MAX(s1.visit_number_count) AS visit_number_count_max
FROM s1
GROUP BY s1.member_key)
-- JJAUSSI: Use this version to see the list of store_key values
-- that have the visit_number_count_max value. This has the potential
-- to be a 1:N relationship.
SELECT s1.member_key
,md.member_number
,s1.store_key
,sd.store_number
,s1.visit_number_count
FROM s2 INNER JOIN s1
ON s2.member_key = s1.member_key
AND s2.visit_number_count_max = s1.visit_number_count
INNER JOIN store_dimension sd
ON sd.store_key = s1.store_key
INNER JOIN member_dimension md
ON md.member_key = s1.member_key;
If this is what you were going for...congratulations! On to the next query!
If you ultimately are after a single store_key response for each member_key (basically you want to determine the member_key's "primary" store_key) then an additional step is probably needed (depending on your data).
Here are some ideas:
Evaluate the member_key based on some other summable facet of item_scan_fact (like total price paid?)
If you consider all store_key values of equal merit that have the same visit_number_count_max value for a given member_key, just choose a store_key with MAX or MIN
You would seem to want:
SELECT member_number, MAX(NumVisits)
FROM (SELECT sd.store_number, md.member_number,
COUNT(*) AS NumVisits
FROM member_dimension md JOIN
item_scan_fact isf
ON md.member_key = isf.member_key JOIN
store_dimension sd
ON isf.store_key = sd.store_key
GROUP BY sd.store_number, md.member_number
) sm
GROUP BY member_number;
If you want to return both the max and the matching customer number, you can apply a Teradata SQL extension, QUALIFY:
SELECT sd.store_number, md.member_number,
COUNT(*) AS NumVisits
FROM member_dimension md JOIN
item_scan_fact isf
ON md.member_key = isf.member_key JOIN
store_dimension sd
ON isf.store_key = sd.store_key
GROUP BY sd.store_number, md.member_number
QUALIFY
rank() -- might return multiple rows with the same max, ROW_NUMBER a single row
over (partition by sd.store_number
order by NumVisits desc) = 1
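On engines without QUALIFY, the same filter can usually be written by computing the rank in an inner query and filtering on it outside. A rough sketch in SQLite (window functions need SQLite 3.25 or later) with made-up rows follows; the CTE names visits and ranked are illustrative, and COUNT(DISTINCT visit_number) is used on the assumption that item_scan_fact holds one row per scanned item rather than one per visit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member_dimension (member_key INTEGER PRIMARY KEY, member_number INTEGER);
CREATE TABLE store_dimension (store_key INTEGER PRIMARY KEY, store_number INTEGER);
CREATE TABLE item_scan_fact (member_key INTEGER, store_key INTEGER, visit_number INTEGER);
INSERT INTO member_dimension VALUES (1, 9001), (2, 9002);
INSERT INTO store_dimension VALUES (1, 101), (2, 102);
-- Store 101: member 9001 makes 2 visits (3 scanned items), member 9002 makes 1.
-- Store 102: member 9002 makes 3 visits, member 9001 makes 1.
INSERT INTO item_scan_fact VALUES
  (1, 1, 1), (1, 1, 1), (1, 1, 2), (2, 1, 3),
  (2, 2, 4), (2, 2, 5), (2, 2, 6), (1, 2, 7);
""")

top_visitors = conn.execute("""
WITH visits AS (
    SELECT sd.store_number, md.member_number,
           COUNT(DISTINCT isf.visit_number) AS NumVisits
    FROM item_scan_fact isf
    JOIN member_dimension md ON md.member_key = isf.member_key
    JOIN store_dimension sd ON sd.store_key = isf.store_key
    GROUP BY sd.store_number, md.member_number
),
ranked AS (
    SELECT store_number, member_number, NumVisits,
           -- RANK keeps ties; ROW_NUMBER would keep exactly one row per store.
           RANK() OVER (PARTITION BY store_number ORDER BY NumVisits DESC) AS rnk
    FROM visits
)
SELECT store_number, member_number, NumVisits
FROM ranked
WHERE rnk = 1
ORDER BY store_number
""").fetchall()

print(top_visitors)
```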

SQL Server Query for Many to Many Relationship

I have the following Many to many relationship (See the picture below) in my SQL server.
In most cases there are 2 rows in table tblWavelengths related to the table tblSensors (in some cases only 1, and in extreme cases there can be 20 rows).
I made the following simple query to retrieve the data from those 3 tables :
select W.DateTimeID,S.SensorName,S.SensorType,W.Channel,W.PeakNr,W.Wavelength
from tblWavelengths as W
Left Join tblSensorWavelengths as SW on W.tblWavelengthID = SW.WavelengthID
Left Join tblSensors as S on SW.SensorID = S.SensorID
order by W.DateTimeID
After running this query I got the following results :
Here comes my problem. I want to write a query which keeps only those sensors (SensorName) which, at a given moment in time (DateTimeID), have two rows (two different wavelengths) in the tblWavelengths table. So, for example, I want the results without the 77902/001 sensor, because it has only one row (one wavelength) related to tblSensors at a given moment in time.
You could use a windowed function to find out the number of wavelengths for each sensorname/datetimeid combination:
WITH Data AS
( SELECT W.DateTimeID,
S.SensorName,
S.SensorType,
W.Channel,
W.PeakNr,
W.Wavelength,
[Wcount] = COUNT(*) OVER(PARTITION BY S.SensorName, W.DateTimeID)
from tblWavelengths as W
LEFT JOIN tblSensorWavelengths as SW
ON W.tblWavelengthID = SW.WavelengthID
LEFT JOIN tblSensors as S
ON SW.SensorID = S.SensorID
)
SELECT DateTimeID, SensorName, SensorType, Channel, PeakNr, WaveLength
FROM Data
WHERE Wcount = 2
ORDER BY DateTimeID;
ADDENDUM
As an afterthought, I realised that you might have two results for one sensor at the same time with the same wavelength. That would return 2 records without there being two different wavelengths. Since windowed functions don't support the use of DISTINCT, an alternative is below:
WITH Data AS
( SELECT W.DateTimeID,
S.SensorName,
S.SensorType,
W.Channel,
W.PeakNr,
W.Wavelength,
W.tblWaveLengthID
from tblWavelengths as W
LEFT JOIN tblSensorWavelengths as SW
ON W.tblWavelengthID = SW.WavelengthID
LEFT JOIN tblSensors as S
ON SW.SensorID = S.SensorID
)
SELECT d.DateTimeID, d.SensorName, d.SensorType, d.Channel, d.PeakNr, d.WaveLength
FROM Data d
INNER JOIN
( SELECT DateTimeID, SensorName
FROM Data
GROUP BY DateTimeID, SensorName
HAVING COUNT(DISTINCT tblWaveLengthID) = 2
) t
ON t.DateTimeID = d.DateTimeID
AND t.SensorName = d.SensorName
ORDER BY d.DateTimeID;
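The difference between counting rows and counting distinct IDs only shows up when a join produces the same wavelength row more than once. The sketch below, in SQLite with made-up rows, assumes one way this could happen: a duplicated link row in tblSensorWavelengths for the invented sensor 'S-B'. The sensor names other than 77902/001 and all values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblSensors (SensorID INTEGER PRIMARY KEY, SensorName TEXT, SensorType TEXT);
CREATE TABLE tblWavelengths (tblWavelengthID INTEGER PRIMARY KEY, DateTimeID INTEGER,
                             Channel INTEGER, PeakNr INTEGER, Wavelength REAL);
CREATE TABLE tblSensorWavelengths (SensorID INTEGER, WavelengthID INTEGER);
INSERT INTO tblSensors VALUES (1, 'S-A', 'strain'), (2, 'S-B', 'strain'),
                              (3, '77902/001', 'temp');
INSERT INTO tblWavelengths VALUES
  (1, 100, 1, 1, 1530.1), (2, 100, 1, 2, 1535.2),
  (3, 100, 2, 1, 1540.3), (4, 100, 3, 1, 1550.4);
-- S-A has two wavelengths; S-B has one wavelength linked twice; 77902/001 has one.
INSERT INTO tblSensorWavelengths VALUES (1, 1), (1, 2), (2, 3), (2, 3), (3, 4);
""")

JOINED = """
FROM tblWavelengths W
JOIN tblSensorWavelengths SW ON W.tblWavelengthID = SW.WavelengthID
JOIN tblSensors S ON SW.SensorID = S.SensorID
GROUP BY W.DateTimeID, S.SensorName
"""

# Counting joined rows also lets the duplicated S-B link through.
by_row_count = [row[0] for row in conn.execute(
    "SELECT S.SensorName" + JOINED + "HAVING COUNT(*) = 2 ORDER BY S.SensorName")]

# Counting distinct wavelength IDs keeps only sensors with two different rows.
by_distinct_id = [row[0] for row in conn.execute(
    "SELECT S.SensorName" + JOINED +
    "HAVING COUNT(DISTINCT W.tblWavelengthID) = 2 ORDER BY S.SensorName")]

print(by_row_count)
print(by_distinct_id)
```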

Using Count() and Sum() correctly in SQL?

Ok, so I hope I can explain this question well enough, because I feel like this is going to be a tough one.
I have two tables I'm working with today. These look like:
#pset table (PersonID int, SystemID int, EntitlementID int, TargetID int)
#Connector table (TargetName varchar(10), fConnector bit)
The first table stores records that tell me: this person has this system, which is composed of these entitlements, which have these targets. A little complicated, but stay with me. The second stores the TargetName and whether or not that target has a connector in my not-so-theoretical system.
What I'm trying to do is merge these two tables so that I can see the target flag for each row in #pset. This will help me later as you'll see.
If each entitlement in a system has a connector to the target (the flag is true for all of them), then I'd like to know.
All the others should go into a different table.
This is what I tried to do, but it didn't work. I need to know where I went wrong. Hopefully someone with more experience than me will be able to answer.
-- If the count(123) = 10 (ten rows with SystemID = 123) and the sum = 10, cool.
select pset.*, conn.fConnector from #pset pset
inner join vuTargets vt
on vt.TargetID = pset.TargetID
inner join #conn conn
on conn.TargetName = vt.TargetName
group by ProfileID, SystemRoleID, EntitlementID, TargetID, fConnector
having count(SystemID) = sum(cast(fConnector as int))
order by ProfileID
and
-- If the count(123) = 10 (ten rows with SystemID = 123) and the sum <> 10
select pset.*, conn.fConnector from #pset pset
inner join vuTargets vt
on vt.TargetID = pset.TargetID
inner join #conn conn
on conn.TargetName = vt.TargetName
group by ProfileID, SystemRoleID, EntitlementID, TargetID, fConnector
having count(SystemID) <> sum(cast(fConnector as int))
order by ProfileID
Unfortunately, these do not work :(
Edit
Here is a screenshot showing the problem. Notice ProfileID 1599 has a SystemID of 1126567, but one of the entitlements doesn't have a connector! How can I get both of these rows into the second query? (above)
Your basic problem is that you're trying to roll up to two different record sets.
The initial set (the SELECT and GROUP BY clauses) is saying that you want one record for every difference in the set [ProfileId, SystemId, EntitlementId, TargetId, fConnector].
The second set (the HAVING clause) is saying that you want, for every row in the initial set, to compare its COUNT of records with the SUM of the connections. However, because you've asked for grouping down to the individual flag, this has the effect of getting a single row for each flag (assuming 1-to-1 relationships). Effectively, you're saying: 'Hey, if this target has a connection? Yeah, I want it'.
What you appear to want is a roll-up to the SystemId value. To do that, you will need to change your SELECT and GROUP BY clauses to include only the set [ProfileId, SystemId]. This will return only those rows (keyed by profile and system) that have all targets 'connected'. You will not be able to see the individual entitlements, targets, and whether they are connected (you will be able to infer that they will all be/not be connected, however).
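The effect of the grouping level can be seen on a tiny data set. The sketch below uses SQLite and a single, already-joined table named pset_flags, which is a hypothetical stand-in for the result of joining #pset, vuTargets and #conn. Profile 1599 and system 1126567 come from the question; the remaining IDs are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pset_flags (ProfileID INTEGER, SystemID INTEGER,
                         EntitlementID INTEGER, TargetID INTEGER, fConnector INTEGER);
-- Profile 1599: one connected and one unconnected entitlement in the same system.
-- Profile 1600: both entitlements connected.
INSERT INTO pset_flags VALUES
  (1599, 1126567, 1, 11, 1),
  (1599, 1126567, 2, 12, 0),
  (1600, 42, 3, 13, 1),
  (1600, 42, 4, 14, 1);
""")

# Grouping down to the flag: every group is a single row, so the HAVING test
# only checks "is this row's flag 1?" and the mixed system still appears.
grouped_per_row = conn.execute("""
SELECT ProfileID, SystemID, EntitlementID
FROM pset_flags
GROUP BY ProfileID, SystemID, EntitlementID, TargetID, fConnector
HAVING COUNT(SystemID) = SUM(fConnector)
ORDER BY ProfileID, EntitlementID
""").fetchall()

# Rolling up to the system: the HAVING test now spans all of its entitlements.
grouped_per_system = conn.execute("""
SELECT ProfileID, SystemID
FROM pset_flags
GROUP BY ProfileID, SystemID
HAVING COUNT(*) = SUM(fConnector)
ORDER BY ProfileID
""").fetchall()

print(grouped_per_row)
print(grouped_per_system)
```

Swapping = for <> in the second query gives the complementary set, here only the 1599 system.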
EDIT:
In the interests of full disclosure, here is how you'd get something similar to your original results set, where it lists all EntitlementIds and TargetIds:
WITH all_connections as (SELECT pset.ProfileId, pset.SystemRoleId
FROM #pset pset
INNER JOIN vuTargets vt
ON vt.TargetId = pset.TargetId
INNER JOIN #conn conn
ON conn.TargetName = vt.TargetName
GROUP BY pset.ProfileId, pset.SystemRoleId
HAVING COUNT(pset.SystemRoleId)
= SUM(CAST(fConnector as INT)))
SELECT pset.*
FROM #pset pset
JOIN all_connections conn
ON conn.ProfileId = pset.ProfileId
AND conn.SystemRoleId = pset.SystemRoleId
This should get you a listing, down to the TargetId, of ProfileId/SystemRoleId keys where all EntitlementIds and TargetIds have a connection (or, flip the CTE = to <> for those where not all do).
Edit: fixed my original queries, updated the description as well
You can split this up: first find the TargetIDs that have an fConnector of 0. Then find the PersonID, SystemID pairs that have any target equal to the ones you found. Then select the relevant data: (this finds the PersonID, SystemID pair where at least one entitlement does not have a connector to the target)
with abc as (
select distinct PersonID, SystemID
from pset P
where TargetID in (
select TargetID
from vuTargets V join connector C on V.TargetName = C.TargetName
where C.fConnector = 0
)
)
select P.PersonID, P.SystemID, P.EntitlementID, P.TargetID, C.fConnector
from pset P
join abc on ((P.PersonID = abc.PersonID) and (P.SystemID = abc.SystemID))
join vuTargets V on P.TargetID = V.TargetID
join connector C on V.TargetName = C.TargetName
The query to find the PersonID, SystemID pairs where all entitlements have a connector to the target is similar:
with abc as (
select PersonID, SystemID
from pset P
where TargetID in (
select TargetID
from vuTargets V join connector C on V.TargetName = C.TargetName
where C.fConnector = 0
)
)
select P.PersonID, P.SystemID, P.EntitlementID, P.TargetID, C.fConnector
from
pset P
join vuTargets V on P.TargetID = V.TargetID
join connector C on V.TargetName = C.TargetName
where not exists (
select 1
from abc
where abc.PersonID = P.PersonID and abc.SystemID = P.SystemID
)
The difference is in how the temp table is used: the first query joins to it on equality, while this one excludes its pairs with NOT EXISTS. (Joining on <> instead would match each row against every other pair in abc, so it would not exclude anything reliably.) This is very similar to zero's answer, but doesn't use counts or sums.