Non null value associated with least value in other column - sql

My data is like this:
Desired output:
I have tried using following SQL:
CASE
WHEN (MINDAY_DIFF > 0) AND (MINDAY_DIFF IS NOT NULL)
THEN FIRST_VALUE(BP_MED) OVER (PARTITION BY ID ORDER BY MINDAY_DIFF ASC)
END AS DRUG
This returns NULL.
I also tried
CASE
WHEN (MINDAY_DIFF > 0)
THEN BP_MED
ELSE NULL
END AS DRUG
It returns both non-null values of BP_MED.
I also tried NVL but that didn't work either.
Since it is in Netezza. There are fewer solutions online. Please help.

The concept here is the following working inside out:
We didn't always have analytics to make things easier: So not knowing Netezza I took a more... antiquated approach. There may be better/more efficient ways; which I would look for given a place to play around with; but I think this would work in most any RDBMS as I tried to avoid any RDBMS Specific aspects unless we're dealing with pre left join supported RDBMS.
Find the Min ID and the day difference for that ID in a result set (MinAndID)
LEFT Join back to the baseSet to get all possible values specifically to get BP_Med
Then Join back to base Table to ensure we get ALL records and then populate only the BP_Med which links to the minDay_Diff. Since it's a left join only 1 record per ID should return.
UNTESTED:
SELECT A.*, DesiredDrug.BP_MED
FROM TABLE A
LEFT JOIN (SELECT ID, Fill_Date, BP_MED, MinDay_Diff
FROM TABLE BaseSet
INNER JOIN (SELECT MIN(MINDAY_DIFF) MDD, ID
FROM TABLE
WHERE ID is not null
GROUP BY ID) MinAndID
on BaseSet.ID = MinAndID.ID
and BaseSet.MinDay_Diff = MinAndID.MDD) DesiredDrug
on A.ID = DesiredDrug
and A.Fill_Date = DesiredDrug.Fill_Date
and A.BP_Med= DesiredDrug.BP_Med
and A.MinDay_Diff = DesiredDrug.MinDay_Diff

Related

Complex SQL View with Joins & Where clause

My SQL skill level is pretty basic. I have certainly written some general queries and done some very generic views. But once we get into joins, I am choking to get the results that I want, in the view I am creating.
I feel like I am almost there. Just can't get the final piece
SELECT dbo.ics_supplies.supplies_id,
dbo.ics_supplies.old_itemid,
dbo.ics_supplies.itemdescription,
dbo.ics_supplies.onhand,
dbo.ics_supplies.reorderlevel,
dbo.ics_supplies.reorderamt,
dbo.ics_supplies.unitmeasure,
dbo.ics_supplies.supplylocation,
dbo.ics_supplies.invtype,
dbo.ics_supplies.discontinued,
dbo.ics_supplies.supply,
dbo.ics_transactions.requsitionnumber,
dbo.ics_transactions.openclosed,
dbo.ics_transactions.transtype,
dbo.ics_transactions.originaldate
FROM dbo.ics_supplies
LEFT OUTER JOIN dbo.ics_orders
ON dbo.ics_supplies.supplies_id = dbo.ics_orders.suppliesid
LEFT OUTER JOIN dbo.ics_transactions
ON dbo.ics_orders.requisitionnumber =
dbo.ics_transactions.requsitionnumber
WHERE ( dbo.ics_transactions.transtype = 'PO' )
When I don't include the WHERE clause, I get 17,000+ records in my view. That is not correct. It's doing this because we are matching on a 1 to many table. Supplies table is 12,000 records. There should always be 12,000 records. Never more. Never less.
The pieces that I am missing are:
I only need ONE matching record from the ICS_Transactions Table. Ideally, the one that I want is the most current 'ICS_Transactions.OriginalDate'.
I only want the ICS_Transactions Table fields to populate IF ICS_Transacions.Type = 'PO'. Otherwise, these fields should remain null.
Sample code or anything would help a lot. I have done a lot of research on joins and it's still very confusing to get what I need for results.
EDIT/Update
I feel as if I asked my question in the wrong way, or didn't give a good overall view of what I am asking. For that, I apologize. I am still very new to SQL, but trying hard.
ICS_Supplies Table has 12,810 records
ICS_Orders Table has 3,666 records
ICS_Transaction Table has 4,701 records
In short, I expect to see a result of 12,810 records. No more and no less. I am trying to create a View of ALL records from the ICS_Supplies table.
Not all records in Supply Table are in Orders and or Transaction Table. But still, I want to see all 12,810 records, regardless.
My users have requested that IF any of these supplies have an open PO (ICS_Transactions.OpenClosed = 'Open' and ICS_Transactions.InvType = 'PO') Then, I also want to see additional fields from ICS_Transactions (ICS_Transactions.OpenClosed, ICS_Transactions.InvType, ICS_Transactions.OriginalDate, ICS_Transactions.RequsitionNumber).
If there are no open PO's for supply record, then these additional fields should be blank/null (regardless to what data is in these added fields, they should display null if they don't meet the criteria).
The ICS_Orders Table is nly needed to hop from the ICS_Supplies to the ICS_Transactions (I first, need to obtain the Requisition Number from the Orders field, if there is one).
I am sorry if I am not doing a good job to explain this. Please ask if you need clarification.
Here's a simplified version of Ross Bush's answer (It removes a join from the CTE to keep things more focussed, speed things up, and cut down the code).
;WITH
ordered_ics_transactions AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY requisitionnumber
ORDER BY originaldate DESC
)
AS seq_id
FROM
dbo.ics_transactions
)
SELECT
s.supplies_id, s.old_itemid,
s.itemdescription, s.onhand,
s.reorderlevel, s.reorderamt,
s.unitmeasure, s.supplylocation,
s.invtype, s.discontinued,
s.supply,
t.requsitionnumber, t.openclosed,
t.transtype, t.originaldate
FROM
dbo.ics_supplies AS s
LEFT OUTER JOIN
dbo.ics_orders AS o
ON o.supplies_id = s.suppliesid
LEFT OUTER JOIN
ordered_ics_transactions AS t
ON t.requisitionnumber = o.requisitionnumber
AND t.transtype = 'PO'
AND t.seq_id = 1
This will only join the most recent transaction record for each requisitionnumber, and only if it has transtype = 'PO'
IF you want to reverse that (joining only transaction records that have transtype = 'PO', and of those only the most recent one), then move the transtype = 'PO' filter to be a WHERE clause inside the ordered_ics_transactions CTE.
You can possibly work with the query below to get what you need.
1. I only need ONE matching record from the ICS_Transactions Table. Ideally, the one that I want is the most current 'ICS_Transactions.OriginalDate'.
I would solve this by creating a CTE with all the ICS_Transaction fields needed in the query, rank-ordered by OPriginalDate, partitioned by suppliesid.
2. I only want the ICS_Transactions Table fields to populate IF ICS_Transacions.Type = 'PO'. Otherwise, these fields should remain null.
If you move the condition from the WHERE clause to the LEFT JOIN then ICS_Transactions not matching the criteria will be peeled and replaced with null values with the rest of the query records.
;
WITH ReqNumberRanked AS
(
SELECT
dbo.ICS_Orders.SuppliesID,
dbo.ICS_Transactions.RequisitionNumber,
dbo.ICS_Transactions.TransType,
dbo.ICS_Transactions.OriginalDate,
dbo.ICS_Transactions.OpenClosed,
RequisitionNumberRankReversed = RANK() OVER(PARTITION BY dbo.ICS_Orders.SuppliesID, dbo.ICS_Transactions.RequisitionNumber ORDER BY dbo.ICS_Transactions.OriginalDate DESC)
FROM
dbo.ICS_Orders
LEFT OUTER JOIN dbo.ICS_Transactions ON dbo.ICS_Orders.RequisitionNumber = dbo.ICS_Transactions.RequsitionNumber
)
SELECT
dbo.ICS_Supplies.Supplies_ID, dbo.ICS_Supplies.Old_ItemID,
dbo.ICS_Supplies.ItemDescription, dbo.ICS_Supplies.OnHand,
dbo.ICS_Supplies.ReorderLevel, dbo.ICS_Supplies.ReorderAmt,
dbo.ICS_Supplies.UnitMeasure,
dbo.ICS_Supplies.SupplyLocation, dbo.ICS_Supplies.InvType,
dbo.ICS_Supplies.Discontinued, dbo.ICS_Supplies.Supply,
ReqNumberRanked.RequsitionNumber,
ReqNumberRanked.OpenClosed,
ReqNumberRanked.TransType,
ReqNumberRanked.OriginalDate
FROM
dbo.ICS_Supplies
LEFT OUTER JOIN dbo.ICS_Orders ON dbo.ICS_Supplies.Supplies_ID = dbo.ICS_Orders.SuppliesID
LEFT OUTER JOIN ReqNumberRanked ON ReqNumberRanked.RequisitionNumber = dbo.ICS_Transactions.RequsitionNumber
AND (ReqNumberRanked.TransType = 'PO')
AND ReqNumberRanked.RequisitionNumberRankReversed = 1

Compare value of one field in one table to the total sum of one column in another table

I'm having trouble with executing a query to compare where the value of a column in one table is not equal to the sum of another column in a different table. Below is the query I have been trying to execute:
select id.invoice_no,sum(id.bank_charges),
from db2apps.invoice_d id
inner join db2apps.invoice_h ih on (id.invoice_no = ih.invoice_no)
group by id.invoice_no
having coalesce(sum(id.bank_charges), 0) != ih.tax_value
with ur;
I tried with joining on the tables, the group by having format, etc and have had no luck. I really want to select id.invoice_no, ih.tax_value, and sum(id.bank_charges) in the result set, and also grab the data where the sum(id.bank_charges) is not equal to the value of ih.tax_value. Any help would be appreciated.
Perhaps this solves your problem:
select ih.invoice_no, ih.tax_value, sum(id.bank_charges)
from db2apps.invoice_h ih left join
db2apps.invoice_d id
on id.invoice_no = ih.invoice_no
group by ih.invoice_no, ih.tax_value
having coalesce(sum(id.bank_charges), 0) <> ih.tax_value;
The most logical way is probably to SUM the invoice detail first.
SELECT IH.INVOICE_NO
, IH.TAX_VALUE
FROM
DB2APPS.INVOICE_H IH
JOIN
( SELECT INVOICE_NO
, COALESCE(SUM(BANK_CHARGES),0) AS BANK_CHARGES
FROM
DB2APPS.INVOICE_D
GROUP BY
INVOICE_NO
) ID
ON
ID.INVOICE_NO = IH.INVOICE_NO
WHERE
ID.BANK_CHARGE <> IH.TAX_VALUE
Generally, you never need to use HAVING in SQL and often your code will be clearer and easier to follow if you do avoid using it (even if it it sometimes a bit longer).
P.S. you can remove the COALESCE if BANK_CHARGES is NOT NULL.

Need to make SQL subquery more efficient

I have a table that contains all the pupils.
I need to look through my registered table and find all students and see what their current status is.
If it's reg = y then include this in the search, however student may change from y to n so I need it to be the most recent using start_date to determine the most recent reg status.
The next step is that if n, then don't pass it through. However if latest reg is = y then search the pupil table, using pupilnumber; if that pupil number is in the pupils table then add to count.
Select Count(*)
From Pupils Partition(Pupils_01)
Where Pupilnumber in (Select t1.pupilnumber
From registered t1
Where T1.Start_Date = (Select Max(T2.Start_Date)
From registered T2
Where T2.Pupilnumber = T1.Pupilnumber)
And T1.reg = 'N');
This query works, but it is very slow as there are several records in the pupils table.
Just wondering if there is any way of making it more efficient
Worrying about query performance but not indexing your tables is, well, looking for a kind word here... ummm... daft. That's the whole point of indexes. Any variation on the query is going to be much slower than it needs to be.
I'd guess that using analytic functions would be the most efficient approach since it avoids the need to hit the table twice.
SELECT COUNT(*)
FROM( SELECT pupilnumber,
startDate,
reg,
rank() over (partition by pupilnumber order by startDate desc) rnk
FROM registered )
WHERE rnk = 1
AND reg = 'Y'
You can look execution plan for this query. It will show you high cost operations. If you see table scan in execution plan you should index them. Also you can try "exists" instead of "in".
This query MIGHT be more efficient for you and hope at a minimum you have indexes per "pupilnumber" in the respective tables.
To clarify what I am doing, the first inner query is a join between the registered table and the pupil which pre-qualifies that they DO Exist in the pupil table... You can always re-add the "partition" reference if that helps. From that, it is grabbing both the pupil AND their max date so it is not doing a correlated subquery for every student... get all students and their max date first...
THEN, join that result to the registration table... again by the pupil AND the max date being the same and qualify the final registration status as YES. This should give you the count you need.
select
count(*) as RegisteredPupils
from
( select
t2.pupilnumber,
max( t2.Start_Date ) as MostRecentReg
from
registered t2
join Pupils p
on t2.pupilnumber = p.pupilnumber
group by
t2.pupilnumber ) as MaxPerPupil
JOIN registered t1
on MaxPerPupil.pupilNumber = t1.pupilNumber
AND MaxPerPupil.MostRecentRec = t1.Start_Date
AND t1.Reg = 'Y'
Note: If you have multiple records in the registration table, such as a person taking multiple classes registered on the same date, then you COULD get a false count. If that might be the case, you could change from
COUNT(*)
to
COUNT( DISTINCT T1.PupilNumber )

Getting way more results than expected in SQL left join query

My code is such:
SELECT COUNT(*)
FROM earned_dollars a
LEFT JOIN product_reference b ON a.product_code = b.product_code
WHERE a.activity_year = '2015'
I'm trying to match two tables based on their product codes. I would expect the same number of results back from this as total records in table a (with a year of 2015). But for some reason I'm getting close to 3 million.
Table a has about 40,000,000 records and table b has 2000. When I run this statement without the join I get 2,500,000 results, so I would expect this even with the left join, but somehow I'm getting 300,000,000. Any ideas? I even refered to the diagram in this post.
it means either your left join is using only part of foreign key, which causes row multiplication, or there are simply duplicate rows in the joined table.
use COUNT(DISTINCT a.product_code)
What is the question are are trying to answer with the tsql?
instead of select count(*) try select a.product_code, b.product_code. That will show you which records match and which don't.
Should also add a where b.product_code is not null. That should exclude the records that don't match.
b is the parent table and a is the child table? try a right join instead.
Or use the table's unique identifier, i.e.
SELECT COUNT(a.earned_dollars_id)
Not sure what your datamodel looks like and how it is structured, but i'm guessing you only care about earned_dollars?
SELECT COUNT(*)
FROM earned_dollars a
WHERE a.activity_year = '2015'
and exists (select 1 from product_reference b ON a.product_code = b.product_code)

SQL JOIN limit results to rows where specific value does not exist

I am joining two tables using SQL. I'm joining a table which contains charter flight information and a table which contains the crew assigned. In my results, I only want to display the rows that only have a value of "Pilot" in the the crew table and not "Copilot" or both.
SELECT * FROM TABLE_A JOIN TABLE_B ON (TABLE_A.Value = TABLE_B.Value) WHERE TABLE_A.OtherValue = 'Pilot'
This is off the top of my head, so some syntax may be off. The main point is the WHERE clause. You can specify the value that you are looking for in the column (in your case you are looking for Pilot).
EDIT: To prevent a value you can do something like WHERE TABLE.VALUE != 'Copilot' != may need to be written as <> depending on the what SQL it is.
EDIT2: My SQL-Server is throwing a hissy and not connecting, so this is also entirely off the top of my head and I think it's a bit of a hack-job, but I think it'll do the job. :)
SELECT [CHARTER].*, COUNT(*) as Tally FROM [CHARTER] JOIN [CREW] ON ([CHARTER].[CHAR_TRIP] = [CREW].[CHAR_TRIP]) WHERE [CREW].[CREW_JOB] = 'PILOT' OR [CREW].[CREW_JOB] = 'COPILOT' GROUP BY [CHARTER].* HAVING Tally = 1
This assume that all flights have a pilot, but not all flights have a co-pilot. To get the exact display you want, you might have to use it as a sub-query (to remove the Tally column).
SELECT *
FROM charter ch
JOIN crew cr ON ch.char_trip = cr.char_trip
WHERE NOT EXISTS(SELECT *
FROM crew cr2
WHERE cr2.char_trip = ch.char_trip
AND cr2.crew_job != 'PILOT')
I think that should do the trick. The join to the crew table in line 3 is optional, and only if you need results from that table. The NOT EXISTS anti-join is what evaluates all crew for a given trip and checks for any that are not pilots.
You should really help us out here with the schema for us to provide you with a decent query. I think the most important thing here is how do you determine who is a pilot and/or copilot and how do you relate each person to the flight.
I think something like this might help:
SELECT * FROM Charter C
INNER JOIN Crew ON (Charter.CHAR_TRIP = Crew.CHAR_TRIP)
WHERE Crew.Crew_Job = 'PILOT' AND (SELECT COUNT(*) FROM Charter
INNER JOIN Crew ON (Charter.CHAR_TRIP = Crew.CHAR_TRIP)
WHERE Crew.Crew_Job = 'CoPilot'
AND Charter.Chart_Trip = C.ChartTrip) = 0
Although this might not be the cleanest solution.. it should do the work.