Manipulating SQL table - sql

I have a table, the structure of which I have simplified to the smaller table below.
I want to manipulate the dataset below into the following form:
The new dataset will contain a single record for each case of DC, with a yes/no flag indicating if the NatureOfTumour has changed from DC to IN, and the time taken to change from DC to IN if applicable.
The change from DC to IN will be considered only if location has remained the same i.e. only those records should be considered where NatureOfTumour has changed from DC to IN and the location remained the same. ItemNo is the unique ID.
On a community member's advice I have also pasted the table as text below, cleaned up as best I could. The last column, "Gen", is empty. ItemNo is the unique ID. Copying the text below into Excel and running text-to-columns (separated by spaces) should give you the original table in a readable format. Sorry, I can't think of a better way to paste the table here.
ItemNo DateOfTest NatureOfTumour Location Centre Gen
2345 07/2006 DC P S-224
2345 12/2006 IN P S-224
2342 05/2004 DC Q B-266
3878 06/2006 DC P S-224
3878 05/2005 DC Q S-224
5678 09/2000 IN P S-224
5597 10/2001 DC P B-266
5597 01/1999 IN Q B-266

Try this. The LEAD function looks at the next row based on groups of ItemNo ordered by DateOfTest.
WITH abc AS (
SELECT
ItemNo
,DateOfTest
,NatureOfTumour
,Location
,Centre
,LEAD(NatureOfTumour) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) as FutureNature
,LEAD(Location) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) as FutureLocation
,LEAD(DateOfTest) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) as FutureDateOfTest
FROM test_results
)
SELECT
ItemNo
,DateOfTest
,NatureOfTumour
,CASE
WHEN FutureNature = 'IN'
AND FutureLocation = Location
THEN 'Yes'
ELSE 'No'
END AS State_Change
,FutureDateOfTest - DateOfTest as Date_Diff
,Location
,Centre
from abc
WHERE NatureOfTumour = 'DC'
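For anyone who wants to try the LEAD approach on the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. It makes a few assumptions that are not in the question: the table is called test_results as in the query above, the MM/YYYY dates are rewritten as YYYY-MM-01 text so that they sort chronologically, and the elapsed time is reported in days through SQLite's julianday() (other engines have their own date arithmetic). Unlike the query above, it only reports a time difference when the change actually qualifies.

```python
import sqlite3

# Sample rows from the question. MM/YYYY is rewritten as YYYY-MM-01 (an
# assumption) so that text ordering matches chronological ordering.
ROWS = [
    (2345, "2006-07-01", "DC", "P", "S-224"),
    (2345, "2006-12-01", "IN", "P", "S-224"),
    (2342, "2004-05-01", "DC", "Q", "B-266"),
    (3878, "2006-06-01", "DC", "P", "S-224"),
    (3878, "2005-05-01", "DC", "Q", "S-224"),
    (5678, "2000-09-01", "IN", "P", "S-224"),
    (5597, "2001-10-01", "DC", "P", "B-266"),
    (5597, "1999-01-01", "IN", "Q", "B-266"),
]

QUERY = """
WITH abc AS (
    SELECT ItemNo, DateOfTest, NatureOfTumour, Location,
           LEAD(NatureOfTumour) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureNature,
           LEAD(Location)       OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureLocation,
           LEAD(DateOfTest)     OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureDateOfTest
    FROM test_results
)
SELECT ItemNo, DateOfTest,
       CASE WHEN FutureNature = 'IN' AND FutureLocation = Location
            THEN 'Yes' ELSE 'No' END AS State_Change,
       -- only report the elapsed days when the change qualifies
       CASE WHEN FutureNature = 'IN' AND FutureLocation = Location
            THEN CAST(julianday(FutureDateOfTest) - julianday(DateOfTest) AS INTEGER)
       END AS Days_To_Change
FROM abc
WHERE NatureOfTumour = 'DC'
ORDER BY ItemNo, DateOfTest
"""


def dc_to_in_changes(rows=ROWS):
    """Load the sample rows into an in-memory database and run the LEAD query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_results (ItemNo INTEGER, DateOfTest TEXT, "
        "NatureOfTumour TEXT, Location TEXT, Centre TEXT)"
    )
    conn.executemany("INSERT INTO test_results VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in dc_to_in_changes():
        print(row)
```

Note that LEAD only looks at the very next test for the same ItemNo, so a DC row followed by another DC row and then an IN row would be flagged 'No' for the first DC row. Whether that is the intended rule is something the question does not settle.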

You need a self join. Something along these lines:
SELECT
d.ItemNo,
i.DateOfTest - d.DateOfTest AS datediff,
d.Location,
d.Centre,
d.Gen
FROM
(
SELECT
*
FROM demo
WHERE NatureOfTumour = 'DC'
) d
INNER JOIN
(
SELECT
*
FROM demo
WHERE NatureOfTumour = 'IN'
) i ON d.ItemNo = i.ItemNo
AND d.Location = i.Location;

If I understood your question correctly, you could try this (let me know if it works).
If you want only the rows that changed (GEN_NEW = 'Y') in the output, change LEFT JOIN to INNER JOIN.
SELECT A.ITEMNO, A.DATEOFTEST, A.NATUREOFTUMOUR, A.LOCATION
, CASE WHEN B.NATUREOFTUMOUR='IN' AND A.LOCATION = B.LOCATION THEN 'Y' ELSE 'N' END AS GEN_NEW
, CASE WHEN B.NATUREOFTUMOUR='IN' AND A.LOCATION = B.LOCATION THEN B.DATEOFTEST-A.DATEOFTEST END AS TIME_PASS
FROM TE A
LEFT JOIN TE B ON A.ITEMNO=B.ITEMNO AND B.NATUREOFTUMOUR<>'DC' AND A.DATEOFTEST < B.DATEOFTEST
WHERE A.NATUREOFTUMOUR='DC'
Or, with the location check moved into the join (I couldn't tell from your question which one you need):
SELECT A.ITEMNO, A.DATEOFTEST, A.NATUREOFTUMOUR, A.LOCATION
, CASE WHEN B.NATUREOFTUMOUR='IN' THEN 'Y' ELSE 'N' END AS GEN_NEW
, CASE WHEN B.NATUREOFTUMOUR='IN' THEN B.DATEOFTEST-A.DATEOFTEST END AS TIME_PASS
FROM TE A
LEFT JOIN TE B ON A.ITEMNO=B.ITEMNO AND B.NATUREOFTUMOUR<>'DC' AND A.DATEOFTEST < B.DATEOFTEST AND A.LOCATION = B.LOCATION
WHERE A.NATUREOFTUMOUR='DC'
Output
ITEMNO  DATEOFTEST  NATUREOFTUMOUR  LOCATION  GEN_NEW  TIME_PASS
2345    01.07.2006  DC              P         Y        153
2342    01.06.2006  DC              Q         N        NULL
5597    01.10.2001  DC              P         N        NULL
3878    01.05.2005  DC              Q         N        NULL
3878    01.06.2006  DC              P         N        NULL

Related

Multiple joins chose latest record by date

I am trying to link four tables (3 of them key to this question). I need to pull the latest payment type used from T16. T16 links to T17 via headerid, which links to A10 via pledgeid.
I have tried this a bunch of different ways. The code below is giving me the latest date for each payment type, but what I really want is just the last payment type.
SELECT DISTINCT
A10.RecordId
,A10.AccountNumber
,A01.FamilyId
,a01.FamilyMemberType
,A10.PledgeCode --Child Number
,A10.OriginalPledgeId
,A10.PledgeId
,A01.FirstName
,A01.LastName
,A10.PledgeStatus
,A10.AmountPerGift
,A10.PledgeFrequency
,t16.PaymentType
FROM
A10_AccountPledges A10
LEFT JOIN
A01_AccountMaster A01 ON a01.AccountNumber = a10.AccountNumber
LEFT JOIN
T17_RecurringDonations T17 ON T17.PledgeId = A10.PledgeId
LEFT JOIN
T16_RecurringTransactionHeaders T16 ON T16.HeaderId = T17.HeaderId
INNER JOIN
(SELECT
T17.pledgeID
,MAX(T16.LastUsedDate) as lastdate
FROM
T17_RecurringDonations T17
LEFT JOIN
T16_RecurringTransactionHeaders T16 ON T16.HeaderId = T17.HeaderId
GROUP BY
T17.pledgeID) pm ON pm.PledgeId = A10.PledgeId --and pm.lastdate = T16.LastUsedDate
WHERE
A01.[Status] = 'A'
AND a10.PledgeId = 398353 --test case
You can do this like below:
select top 1 *
from (
YOUR QUERY HERE
) t -- a derived table needs an alias in SQL Server
order by lastdate desc;
Notes:
YOUR QUERY HERE is a placeholder which will hold your whole query as a subquery
you need to add the selection of lastdate to your subselect in order to make sure you can order by lastdate desc
EDIT
As @spaindc explained, this idea was applied, resulting in
SELECT distinct
A10.AccountNumber
,A01.FamilyId
,a01.FamilyMemberType
,A10.PledgeCode --Child Number
,A10.OriginalPledgeId
,A10.PledgeId
,A01.FirstName
,A01.LastName
,A10.PledgeStatus
,A10.PledgeFrequency
,(SELECT top 1 T16.PaymentType
FROM T16_RecurringTransactionHeaders T16, T17_RecurringDonations T17
where T16.HeaderId = T17.HeaderId
and T17.PledgeId = A10.PledgeId
order by t16.LastUsedDate desc
) as PaymentType
FROM A10_AccountPledges A10
left join A01_AccountMaster A01 on a01.AccountNumber = a10.AccountNumber
where A01.[Status] = 'A'
Thanks to Lajos for the correct answer. Here is the final code for reference.
SELECT distinct
A10.AccountNumber
,A01.FamilyId
,a01.FamilyMemberType
,A10.PledgeCode --Child Number
,A10.OriginalPledgeId
,A10.PledgeId
,A01.FirstName
,A01.LastName
,A10.PledgeStatus
,A10.PledgeFrequency
,(SELECT top 1 T16.PaymentType
FROM T16_RecurringTransactionHeaders T16, T17_RecurringDonations T17
where T16.HeaderId = T17.HeaderId
and T17.PledgeId = A10.PledgeId
order by t16.LastUsedDate desc
) as PaymentType
FROM A10_AccountPledges A10
left join A01_AccountMaster A01 on a01.AccountNumber = a10.AccountNumber
where A01.[Status] = 'A'
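The correlated "top 1 ... order by ... desc" subquery can be tried on a toy schema. This is only a sketch under stated assumptions: the tables pledges, donations and headers and their columns are made-up stand-ins for A10, T17 and T16, the rows are invented, and it runs on SQLite through Python's sqlite3 module, where LIMIT 1 plays the role of SQL Server's TOP 1.

```python
import sqlite3

SCHEMA_AND_DATA = """
-- hypothetical stand-ins for A10 (pledges), T17 (donations), T16 (headers)
CREATE TABLE pledges   (pledge_id INTEGER, account TEXT);
CREATE TABLE donations (pledge_id INTEGER, header_id INTEGER);
CREATE TABLE headers   (header_id INTEGER, payment_type TEXT, last_used TEXT);

INSERT INTO pledges   VALUES (1, 'A'), (2, 'B');
INSERT INTO donations VALUES (1, 10), (1, 11);
INSERT INTO headers   VALUES (10, 'Check',       '2019-03-01'),
                             (11, 'Credit Card', '2021-06-15');
"""

QUERY = """
SELECT p.pledge_id,
       -- correlated subquery: newest header for this pledge only
       (SELECT h.payment_type
        FROM headers h
        JOIN donations d ON d.header_id = h.header_id
        WHERE d.pledge_id = p.pledge_id
        ORDER BY h.last_used DESC
        LIMIT 1) AS payment_type
FROM pledges p
ORDER BY p.pledge_id
"""


def latest_payment_types():
    """Return one (pledge_id, latest payment type) row per pledge."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(latest_payment_types())
```

Because the subquery sits in the select list, a pledge without any donation still appears once, with a NULL payment type, and a pledge with several headers is never duplicated.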

Selecting Rows That Have One Value but Not Another

I need to get rows from a join of two tables that have one value in a column (1047) but do not have another value (1403).
These are the tables and the query:
select a.job, a.date, b.group from log a inner join active_tmp b
on a.jobno=b.jobno and a.no=b.no where b.list = 'N'
AND LOGDATE = TO_CHAR(TRUNC(SYSDATE),'YYYYMMDD')
and a.job not like 'HOUSE%'
and a.job not like 'CAR%' and (errorCode=1047 and errorCode<>1403);
LOG
JOB DATE LOGDATE JOBNO NO errorCode
MAM 20220123 20220125 33 22 1047
MAM 20220123 20220125 33 22 1403
DAD 20220122 20220125 11 99 1047
MAM 20220122 20220125 33 22 0323
DAD 20220122 20220125 11 99 0444
ACTIVE_TMP
JOB JOBNO NO GROUP LIST
MAM 33 22 LAPTOP N
MAM 33 22 LAPTOP N
DAD 11 99 KEY N
But I get:
MAM,20220123,LAPTOP
DAD,20220122,KEY
I need:
DAD,20220122,KEY
Because MAM has both codes (1047 and 1403).
To rephrase, I think you mean "I want to return matching rows that have error code 1047 but for which the same values of jobno, no, list do not have a corresponding row with error code 1403"
This part is redundant:
AND (errorCode = 1047 AND errorCode <> 1403);
If you are saying errorCode must be 1047, you are also saying it is not equal to 1403.
I think you want to select some rows into some result set, then check that there's not another row that disqualifies one of the selected rows from the final result.
So,
SELECT a.job,
a.date,
b.group
FROM _log a
INNER JOIN _active_tmp b
ON a.jobno = b.jobno
AND a.no = b.no
WHERE b.list = 'N'
AND LOGDATE = TO_CHAR(CURRENT_TIMESTAMP,'YYYYMMDD')
AND a.job NOT LIKE 'HOUSE%'
AND a.job NOT LIKE 'CAR%'
AND a.errorCode = 1047
AND NOT EXISTS (SELECT 1
FROM _log c
INNER JOIN _active_tmp d
ON c.jobno = d.jobno
AND c.no = d.no
WHERE a.job = c.job
AND a.date = c.date
AND b.group = d.group
AND c.errorCode = 1403)
We select the rows that satisfy the join and have error code 1047 then subtract from that set those rows that also satisfy the join but have error code 1403. You could possibly make this more terse using CTE or a temp table, but this works too.
Note I had to change a few things to make it work in my engine (Postgres), so you may have to change a few things back to Oracle.
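Here is a small way to check the NOT EXISTS idea against the sample rows from the question, using Python's sqlite3 module. Treat it as a sketch with some assumptions: the columns date, no, group and errorCode are renamed to job_date, seq_no, grp and error_code to stay clear of reserved words, the LOGDATE and LIKE filters are left out because they do not affect the sample, and the subquery correlates on jobno and seq_no only, which is a simplification of the query above.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE log (job TEXT, job_date TEXT, logdate TEXT,
                  jobno INTEGER, seq_no INTEGER, error_code INTEGER);
CREATE TABLE active_tmp (job TEXT, jobno INTEGER, seq_no INTEGER,
                         grp TEXT, list TEXT);

INSERT INTO log VALUES
    ('MAM', '20220123', '20220125', 33, 22, 1047),
    ('MAM', '20220123', '20220125', 33, 22, 1403),
    ('DAD', '20220122', '20220125', 11, 99, 1047),
    ('MAM', '20220122', '20220125', 33, 22, 323),
    ('DAD', '20220122', '20220125', 11, 99, 444);

INSERT INTO active_tmp VALUES
    ('MAM', 33, 22, 'LAPTOP', 'N'),
    ('MAM', 33, 22, 'LAPTOP', 'N'),
    ('DAD', 11, 99, 'KEY',    'N');
"""

QUERY = """
SELECT DISTINCT a.job, a.job_date, b.grp
FROM log a
JOIN active_tmp b ON a.jobno = b.jobno AND a.seq_no = b.seq_no
WHERE b.list = 'N'
  AND a.error_code = 1047
  -- drop the row if the same job also logged a 1403
  AND NOT EXISTS (SELECT 1
                  FROM log c
                  WHERE c.jobno = a.jobno
                    AND c.seq_no = a.seq_no
                    AND c.error_code = 1403)
"""


def jobs_with_1047_but_not_1403():
    """Rows with error 1047 whose job has no matching 1403 row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(jobs_with_1047_but_not_1403())
```

On the sample data only the DAD row survives, since MAM has a 1403 entry for the same jobno and seq_no.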
You need to change the error code logic: identify which JOB values have error code 1403, then exclude those values.
select distinct a.job, a.date, b.[group] from LOG a inner join active_tmp b
on a.jobno=b.jobno and a.no=b.no where b.list = 'N'
AND LOGDATE = TO_CHAR(TRUNC(SYSDATE),'YYYYMMDD')
and a.job not like 'HOUSE%'
and a.job not like 'CAR%' and a.job not in (select JOB from log where errorCode in(1403));

Can't order query correctly

A while ago I asked for help writing a LEFT JOIN that filters in a particular way, so that the desired value ends up in the first row:
Need to retrieve table's last inserted/updated record with some exclusions
The problem now is that in many cases the data gets mixed. The scenario is that the same table holds 2 values that we need to place in different columns. The PO_ID is unique, but it can have 1 or more values in the other tables, and in this particular case 1 PO_ID has 3 SHIP_ID_CUS values. We only need 1 row per PO_ID (no duplicates), which is why we used MAX() and GROUP BY.
Here is the piece of the code that I think causes the issue.
select
z.po_id,
max(scdc.ship_id) as ship_id_cdc,
max(lscdc.ship_evnt_cd) as last_event_cdc,
max(lscdc.ship_evnt_tms) as event_tms_cdc,
max(scus.SHIP_ID) as ship_id_cus,
max(lscus.ship_evnt_cd) as last_event_cus,
max(lscus.ship_evnt_tms) as event_tms_cus
from TABLE.A z
left join (select distinct po_id, iltc.ship_id, s.ship_to_loc_code from TABLE.B iltc inner join TABLE.C s on iltc.ship_id=s.ship_id and iltc.ship_to_loc_code=s.ship_to_loc_code and s.ship_to_ctry<>' ') AS A ON z.po_id = a.po_id
left JOIN TABLE.C scus ON A.SHIP_ID = scus.SHIP_ID AND A.SHIP_TO_LOC_CODE = scus.SHIP_TO_LOC_CODE and scus.loc_type = 'CUS' AND DAYS(scus.shipment_tms)+10 >= DAYS(z.ship_tms)
left JOIN TABLE.C scdc ON A.SHIP_ID = scdc.SHIP_ID AND A.SHIP_TO_LOC_CODE = scdc.SHIP_TO_LOC_CODE and scdc.loc_type = 'CDC' AND DAYS(scdc.shipment_tms)+10 >= DAYS(z.ship_tms)
left join
( select ship_id_856, ship_to_loc_cd856, ship_evnt_cd, ship_evnt_tms, carr_tracking_num, event_srv_lvl
, row_number() over(partition by ship_id order by updt_job_tms desc) as RN
FROM TABLE.D
WHERE LEFT(ship_evnt_cd, 1) <> '9') lscus
ON lscus.ship_id_856=scus.ship_id and scus.ship_to_loc_code=lscus.ship_to_loc_cd856 and lscus.rn = 1
left join
( select ship_id_856, ship_to_loc_cd856, ship_evnt_cd, ship_evnt_tms, carr_tracking_num, event_srv_lvl
, row_number() over(partition by ship_id order by updt_job_tms desc) as RN
FROM TABLE.D
WHERE LEFT(ship_evnt_cd, 1) <> '9') lscdc
ON lscdc.ship_id_856=scdc.ship_id and lscdc.ship_to_loc_cd856=scdc.ship_to_loc_code and lscdc.rn = 1
WHERE
z.po_id = 'T1DLDC'
GROUP BY z.po_id
By searching that condition we get the following result
The problem is that if we search directly on the TABLE.D, the last event that we need (with last update record tms) is another one (X1) and somehow the date is incorrect.
What is even more weird, is that if we search for the ship_id_cus on the original query, we get the correct code but still with a wrong date...
WHERE
--z.po_id = 'T1DLDC'
scus.ship_id = 'D30980'
GROUP BY z.po_id
I tried other logic changes like modifying the left joins to search on a subquery.
left JOIN ( select * from TABLE.C order by updt_job_tms desc) scus ON A.SHIP_ID = scus.SHIP_ID AND A.SHIP_TO_LOC_CODE = scus.SHIP_TO_LOC_CODE and scus.loc_type = 'CUS' AND DAYS(scus.shipment_tms)+10 >= DAYS(z.ship_tms)
But this is also giving the same exact results by searching either by po_id or ship_id_cus
Any ideas or comment will be much appreciated.
Thanks
------------------------------------UPDATE-----------------------------------
Adding the result of the LEFT JOIN with row_number() over (partition by ...), including all the ship_id_cus values for that po_id and all the event codes with their tms. None match here.
Based on all this, it should be the last ship_id_cus with the X1 event/tms. If we also exclude the codes starting with 9, we would get the following result.
(I am not ordering by ship_id_cus here; as described above, that did not work either, at least the way I implemented it.)
If you have a table: TBL1
ID      APPROVED  APPROVER  DATE_APPROVED
======  ========  ========  =============
ABC     Y         JOE       2019-01-13
ABC     N         ZACK      2018-12-23
ABC     N         SUE       2019-02-23
And you do SQL:
SELECT ID, MAX(APPROVED) AS APPROVAL
,MAX(APPROVER) AS APPROVED_BY , MAX(DATE_APPROVED) AS APPROVED_ON
FROM TBL1 GROUP BY ID
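you will get this result:
ID      APPROVAL  APPROVED_BY  APPROVED_ON
======  ========  ===========  =============
ABC     Y         ZACK         2019-02-23
This is exactly what the query asks for, because each MAX() is evaluated independently per column, but it is NOT what you want: the three values come from three different rows.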
Try the following:
SELECT T1.ID, T1.APPROVED, T1.APPROVER, T1.DATE_APPROVED
FROM TBL1 AS T1
INNER JOIN (SELECT ID, MAX(DATE_APPROVED) AS APPROVED_ON
FROM TBL1 GROUP BY ID
) AS T2
ON T1.ID =T2.ID
AND T1.DATE_APPROVED = T2.APPROVED_ON
Result:
ID      APPROVED  APPROVER  DATE_APPROVED
======  ========  ========  =============
ABC     N         SUE       2019-02-23
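Both TBL1 queries can be run side by side to see the difference. The sketch below uses Python's sqlite3 module with the three sample rows from this answer; the function names are made up for the example, and nothing here is specific to the original poster's engine.

```python
import sqlite3

ROWS = [
    ("ABC", "Y", "JOE", "2019-01-13"),
    ("ABC", "N", "ZACK", "2018-12-23"),
    ("ABC", "N", "SUE", "2019-02-23"),
]

# Each MAX() is computed on its own column, so values from different rows mix.
PER_COLUMN_MAX = """
SELECT ID, MAX(APPROVED), MAX(APPROVER), MAX(DATE_APPROVED)
FROM TBL1
GROUP BY ID
"""

# Find the latest date per ID first, then join back to fetch that whole row.
LATEST_ROW = """
SELECT T1.ID, T1.APPROVED, T1.APPROVER, T1.DATE_APPROVED
FROM TBL1 AS T1
INNER JOIN (SELECT ID, MAX(DATE_APPROVED) AS APPROVED_ON
            FROM TBL1
            GROUP BY ID) AS T2
        ON T1.ID = T2.ID
       AND T1.DATE_APPROVED = T2.APPROVED_ON
"""


def run(query, rows=ROWS):
    """Load the sample rows into an in-memory TBL1 and run the given query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TBL1 (ID TEXT, APPROVED TEXT, APPROVER TEXT, DATE_APPROVED TEXT)"
    )
    conn.executemany("INSERT INTO TBL1 VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run(PER_COLUMN_MAX))  # mixed row
    print(run(LATEST_ROW))      # the actual latest row
```

One caveat with the join-back form: if two rows for the same ID share the latest DATE_APPROVED, both are returned, so a tie-breaker is needed when exactly one row per ID is required.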

aggregating nested SQL statements to fewer columns

I am trying to aggregate my data and group it with respect to SKU's and the cluster ID associated with that SKU.
My current output brings back roughly 40,000 rows (5 SKU's * 8,000 Stores) however I want just 35.
My code:
SELECT DISTINCT E.*
FROM ALC_ITEM_SOURCE P
RIGHT JOIN
(
SELECT D.* ,SUM(L.ALLOCATED_QTY) AS TOTAL_ALLOCATED
FROM ALC_ITEM_LOC L
RIGHT JOIN
(
SELECT C.*
FROM STORE S,
(
SELECT A.*, B.LOCATION AS STORE_NUMBER
FROM FDT_MAP_CLUSTER_LOCATION B,
(
SELECT DISTINCT SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
from fdt_maptool_sas_data ss
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988
)
AND SS.ORDER_NO IS NOT NULL
AND ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.WORKSHEET_ID, SS.SKU
)A
WHERE B.CLUSTER_ID = A.ALLOC_CLUSTER_ID
AND B.LOCATION_TYPE = 'S'
)C
WHERE S.STORE = C.STORE_NUMBER
AND S.STORE_CLOSE_DATE IS NULL
AND S.DISTRICT NOT IN (997, 998, 999)
AND S.STORE_OPEN_DATE <= SYSDATE
)D
ON L.ITEM_ID = D.SKU
AND L.LOCATION_ID = D.STORE_NUMBER
GROUP BY D.ALLOC_CLUSTER_ID, D.ALLOC_CLUSTER_NAME, D.SKU, D.STORE_NUMBER
)E
ON P.ITEM_ID = E.SKU
AND P.SOURCE_TYPE <> 4
AND P.RELEASE_DATE > '01-FEB-2018'
My desired result would contain:
SKU Cluster_ID Total_allocated Count(stores)
1000989 1AA STORES 258 200
1000989 2A STORES 78 600
1000989 B STORES 36 500
1000989 C STORES 114 100
1000989 D STORES 144 1222
1000989 E STORES 168 600
1000989 F STORES 60 501
Which is taking a sum of total allocated per store per cluster ID.
As you can see each SKU has a grade (AA-F), I would want to repeat this 5 times since I have 5 SKU's.
Basically I am asking how can I aggregate my data up to look like the above table from the 40,000 rows it is now.
Any help is appreciated!
To keep your SQL neat, avoid putting join conditions in the WHERE clause; use explicit JOIN ... ON instead.
Also, I don't think you need the ALC_ITEM_SOURCE table, since the query never actually uses it.
You could try this version, or at least use it as a starting point:
select SS.ALLOC_CLUSTER_ID,SS.ALLOC_CLUSTER_NAME,SS.SKU,SUM (L.ALLOCATED_QTY) as total_allocated,count(b.location) as store_count
FROM fdt_maptool_sas_data ss
inner join
FDT_MAP_CLUSTER_LOCATION b on B.CLUSTER_ID = ss.ALLOC_CLUSTER_ID AND B.LOCATION_TYPE = 'S'
inner join store s on S.STORE = b.location AND S.STORE_CLOSE_DATE IS NULL AND S.DISTRICT NOT IN (997, 998, 999) AND S.STORE_OPEN_DATE <= SYSDATE
left outer join ALC_ITEM_LOC L on L.ITEM_ID = ss.SKU AND L.LOCATION_ID = b.location
WHERE SS.SKU IN (1099866,
1099896,
1000898,
1000960,
1000988)
AND SS.ORDER_NO IS NOT NULL
AND SS.ALLOC_CLUSTER_NAME NOT LIKE '%DC Cluster%'
-- one output row per cluster and SKU
GROUP BY SS.ALLOC_CLUSTER_ID, SS.ALLOC_CLUSTER_NAME, SS.SKU
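To illustrate the shape of the aggregation being asked for (one row per SKU and cluster, with a summed quantity and a store count), here is a toy sketch using Python's sqlite3 module. Everything in it is an assumption made for illustration: the tables sku_cluster, cluster_store and alloc, their columns, and the rows are invented and much simpler than the real schema.

```python
import sqlite3

SCHEMA_AND_DATA = """
-- hypothetical, simplified tables
CREATE TABLE sku_cluster   (sku INTEGER, cluster TEXT);
CREATE TABLE cluster_store (cluster TEXT, store INTEGER);
CREATE TABLE alloc         (sku INTEGER, store INTEGER, qty INTEGER);

INSERT INTO sku_cluster   VALUES (1000988, 'A'), (1000988, 'B');
INSERT INTO cluster_store VALUES ('A', 1), ('A', 2), ('B', 3), ('B', 4);
INSERT INTO alloc         VALUES (1000988, 1, 10), (1000988, 1, 5),
                                 (1000988, 2, 7),  (1000988, 3, 4);
"""

QUERY = """
SELECT sc.sku,
       sc.cluster,
       COALESCE(SUM(a.qty), 0)  AS total_allocated,
       -- DISTINCT: a store with several allocation rows is counted once
       COUNT(DISTINCT cs.store) AS store_count
FROM sku_cluster sc
JOIN cluster_store cs ON cs.cluster = sc.cluster
LEFT JOIN alloc a ON a.sku = sc.sku AND a.store = cs.store
GROUP BY sc.sku, sc.cluster
ORDER BY sc.sku, sc.cluster
"""


def allocation_by_cluster():
    """Return (sku, cluster, total_allocated, store_count) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in allocation_by_cluster():
        print(row)
```

The point of the sketch is that the store column is left out of GROUP BY, so the stores collapse into a count. Grouping by store as well is what keeps one row per store in the output.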

SQL Query including pivot

I have written a query that includes a LEFT OUTER JOIN and a PIVOT.
However, I keep getting an error about an incorrect column.
SELECT * FROM
(select Max(datetimestamp)as datetimestamp, currentSet, tGroup_id from tPhos_Line_Operator
group by currentSet, tGroup_id)T
LEFT OUTER JOIN
(SELECT PO.tGroup_id AS G_ID, PO.CurrentSet AS cr,gP.tTest_id AS Header,convert(float,Po.Results) as Results from tPhos_Line_Operator PO
inner join tPhos_Line_Parameter pp
on PO.tPhos_Line_Parameter_id = PP.id
INNER JOIN tGroup_Parameter GP
on pp.tGroup_Parameter_id = gp.id
where PP.tPhosline_id=134)P
on T.tGroup_id = P.G_ID
AND T.CurrentSet = p.cr
PIVOT ( MAX(p.Results) For Header IN ([4],[23],[24])) AS pvt
Does anyone know how to get the DateTimeStamp together with the pivoted record?
That would mean I only have 4 columns in this case.
Currently I have to use SELECT * FROM.
Sorry, I'm still new to writing queries.
Thanks in advance.
sample data could go here:
sample of expected result:
dateTime | currentset | tGroup_id | G_ID | cr | 4 | 23 | 24 |
2015-03-11 07:00:24.313 1 69 69 1 8.36 10 14.4
2015-03-12 00:31:58.257 2 69 69 2 9.12 8 14.4
I am making a guess. It appears that you want to "pivot" some results so you get to see these side by side instead of across multiple rows.
While PIVOT has been added into many SQL implementations, it is not the only way to achieve pivoted data nor is it always the best or easiest way to do it. Here is an old fashioned "pivot" that uses a set of case expressions and 'GROUP BY':
SELECT
PO.tGroup_id AS G_ID
, PO.CurrentSet AS cr
, MAX( datetimestamp ) AS datetimestamp
, MAX( CASE WHEN gP.tTest_id = 4 THEN CONVERT(float, Po.Results) END ) AS Results4
, MAX( CASE WHEN gP.tTest_id = 23 THEN CONVERT(float, Po.Results) END ) AS Results23
, MAX( CASE WHEN gP.tTest_id = 24 THEN CONVERT(float, Po.Results) END ) AS Results24
FROM tPhos_Line_Operator PO
INNER JOIN tPhos_Line_Parameter pp ON PO.tPhos_Line_Parameter_id = PP.id
INNER JOIN tGroup_Parameter GP ON pp.tGroup_Parameter_id = gp.id
WHERE PP.tPhosline_id = 134
GROUP BY
PO.currentSet
, PO.tGroup_id
Because you haven't supplied sample data I don't know the details but hopefully you can bend this to suit your data.
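Since no sample data was supplied, here is a self-contained sketch of the same CASE-based pivot on invented rows, run through Python's sqlite3 module. The single flattened table results and its columns are assumptions standing in for the three joined tables, and the values are loosely modelled on the expected output in the question.

```python
import sqlite3

# Invented rows: (group_id, current_set, test_id, result, ts)
ROWS = [
    (69, 1, 4, 8.36, "2015-03-11 06:58:00"),
    (69, 1, 23, 10.0, "2015-03-11 06:59:00"),
    (69, 1, 24, 14.4, "2015-03-11 07:00:24"),
    (69, 2, 4, 9.12, "2015-03-12 00:30:00"),
    (69, 2, 23, 8.0, "2015-03-12 00:31:00"),
    (69, 2, 24, 14.4, "2015-03-12 00:31:58"),
]

QUERY = """
SELECT group_id,
       current_set,
       MAX(ts) AS datetimestamp,
       -- each CASE keeps only one test's value; MAX() collapses the NULLs
       MAX(CASE WHEN test_id = 4  THEN result END) AS Results4,
       MAX(CASE WHEN test_id = 23 THEN result END) AS Results23,
       MAX(CASE WHEN test_id = 24 THEN result END) AS Results24
FROM results
GROUP BY group_id, current_set
ORDER BY group_id, current_set
"""


def pivot_results(rows=ROWS):
    """Pivot test results into one row per (group_id, current_set)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (group_id INTEGER, current_set INTEGER, "
        "test_id INTEGER, result REAL, ts TEXT)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_results():
        print(row)
```

Because the latest timestamp and the pivoted values come out of the same GROUP BY, no separate join to a "max datetimestamp" subquery is needed with this form.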
This is an alternative approach using the PIVOT operator but this also relies on using MAX() OVER(). One complication for using the PIVOT operator here is that you require a maximum datetime value as well as pivoted rows which is complex using the inbuilt pivot operator. I believe that complexity can be overcome by the MaxDateTime column seen below:
SELECT
MaxDateTime , CR, G_ID, [4], [23], [24]
FROM (
SELECT
PO.tGroup_id AS G_ID
, PO.CurrentSet AS cr
, gP.tTest_id AS Header
, MAX( PO.datetimestamp ) OVER (PARTITION BY PO.tGroup_id, PO.CurrentSet) AS MaxDateTime -- one value per group/set, so PIVOT yields a single row for each
, CONVERT( float, Po.Results ) AS Results
FROM tPhos_Line_Operator PO
INNER JOIN tPhos_Line_Parameter pp ON PO.tPhos_Line_Parameter_id = PP.id
INNER JOIN tGroup_Parameter GP ON pp.tGroup_Parameter_id = gp.id
WHERE PP.tPhosline_id = 134
) AS source_tbl
PIVOT (MAX( Results ) FOR Header IN ([4], [23], [24])
) AS pvt_tbl
;
Yes, your existing query is wrong.
I have tried to correct it.
;With CTE as
(
select Max(datetimestamp)as datetimestamp, currentSet, tGroup_id
from tPhos_Line_Operator
group by currentSet, tGroup_id
)
Select * from
( SELECT PO.tGroup_id AS G_ID, PO.CurrentSet AS cr,gP.tTest_id AS Header
,convert(float,Po.Results) as Results
,T.*
from tPhos_Line_Operator PO
inner join tPhos_Line_Parameter pp
on PO.tPhos_Line_Parameter_id = PP.id
INNER JOIN tGroup_Parameter GP
on pp.tGroup_Parameter_id = gp.id
left join CTE T on T.tGroup_id = PO.tGroup_id
AND T.CurrentSet = PO.CurrentSet
where PP.tPhosline_id=134)tbl
PIVOT ( MAX(Results) For Header IN ([4],[23],[24])) AS pvt