How do I combine subquery rows into one column in Oracle?

I'm working with FileNet. I'm trying to get the folders that a document may be filed in to appear in one column of the record set delimited with semicolons. This was the layout previously decided on and I am tasked with making Oracle do it. Here's what I have for a query so far:
SELECT d1.F_DOCNUMBER,
d1.F_DOCCLASSNUMBER,
d1.F_ENTRYDATE,
d1.F_ARCHIVEDATE,
d1.F_RETENTBASE,
d1.F_RETENTDISP,
d1.F_RETENTOFFSET,
d1.F_PAGES,
d1.F_DOCTYPE,
d1.F_DOCFORMAT,
d1.A32 AS CERT_NUM,
d1.A35 AS DOC_TYPE,
d1.A36 AS BATCH_KEY,
d1.A37 AS FIELD_REP_CODE,
d1.A38 AS EFFECTIVE_DATE,
d1.A39 AS VOUCH_NUM_HIGH,
d1.A40 AS VOUCH_NUM_LOW,
f1.Folders
FROM doctaba d1
LEFT JOIN (SELECT SUBSTR (SYS_CONNECT_BY_PATH (F_FOLDERNAME , ';'), 2) Folders
FROM (SELECT fc2.F_DOCNUMBER, f2.F_FOLDERNAME, ROW_NUMBER () OVER (ORDER BY f2.F_FOLDERNAME) rn, COUNT (*) OVER () cnt
FROM folder_contents fc2
INNER JOIN folder f2
ON f2.F_FOLDERNUMBER = fc2.F_FOLDERNUMBER
WHERE fc2.F_DOCNUMBER = d1.F_DOCNUMBER)
WHERE rn = cnt
START WITH rn = 1
CONNECT BY rn = PRIOR rn + 1) f1
ON d1.F_DOCNUMBER = f1.F_DOCNUMBER
WHERE d1.F_DOCTYPE IS NULL
AND d1.F_DOCNUMBER >= 107777
AND d1.F_DOCNUMBER <= 305791
ORDER BY d1.F_DOCNUMBER;
The problem is that d1.F_DOCNUMBER is being marked as an invalid identifier. I read on some forums that Oracle may not let that column identifier work multiple query levels down. Anyone have some suggestions on how to make this work? Thanks!
EDIT:
Here's my original query that just includes the folder values in rows.
SELECT doctaba.F_DOCNUMBER,
doctaba.F_DOCCLASSNUMBER,
doctaba.F_ENTRYDATE,
doctaba.F_ARCHIVEDATE,
doctaba.F_RETENTBASE,
doctaba.F_RETENTDISP,
doctaba.F_RETENTOFFSET,
doctaba.F_PAGES,
doctaba.F_DOCTYPE,
doctaba.F_DOCFORMAT,
doctaba.A32 AS CERT_NUM,
doctaba.A35 AS DOC_TYPE,
doctaba.A36 AS BATCH_KEY,
doctaba.A37 AS FIELD_REP_CODE,
doctaba.A38 AS EFFECTIVE_DATE,
doctaba.A39 AS VOUCH_NUM_HIGH,
doctaba.A40 AS VOUCH_NUM_LOW,
folder.F_FOLDERNAME
FROM doctaba
LEFT JOIN folder_contents
ON doctaba.F_DOCNUMBER = folder_contents.F_DOCNUMBER
INNER JOIN folder
ON folder.F_FOLDERNUMBER = folder_contents.F_FOLDERNUMBER
WHERE doctaba.F_DOCTYPE IS NULL
AND doctaba.F_DOCNUMBER >= 107777
AND doctaba.F_DOCNUMBER <= 17208174
ORDER BY doctaba.F_DOCNUMBER;

In this case, you are lucky. You are only getting one value from the subquery, so you can just make it a correlated subquery in the select clause:
SELECT . . .
(SELECT SUBSTR(SYS_CONNECT_BY_PATH (F_FOLDERNAME , ';'), 2) as Folders
FROM (SELECT fc2.F_DOCNUMBER, f2.F_FOLDERNAME,
ROW_NUMBER () OVER (ORDER BY f2.F_FOLDERNAME) rn,
COUNT (*) OVER () cnt
FROM folder_contents fc2 INNER JOIN
folder f2
ON f2.F_FOLDERNUMBER = fc2.F_FOLDERNUMBER
WHERE fc2.F_DOCNUMBER = d1.F_DOCNUMBER
)
WHERE rn = cnt
START WITH rn = 1
CONNECT BY rn = PRIOR rn + 1
) as Folders
FROM doctaba d1
WHERE d1.F_DOCTYPE IS NULL AND
d1.F_DOCNUMBER >= 107777 AND
d1.F_DOCNUMBER <= 305791
ORDER BY d1.F_DOCNUMBER;
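To see the shape of the result without an Oracle instance, here is a minimal sketch of the same idea (a correlated scalar subquery in the select list) run on SQLite through Python's sqlite3 module. It swaps the SYS_CONNECT_BY_PATH trick for SQLite's group_concat aggregate, and the three tiny tables and their rows are made-up stand-ins for the FileNet tables. On Oracle 11g Release 2 and later, LISTAGG(F_FOLDERNAME, ';') WITHIN GROUP (ORDER BY F_FOLDERNAME) should play the role that group_concat plays here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature versions of the FileNet tables, for illustration only.
conn.executescript("""
    CREATE TABLE doctaba (F_DOCNUMBER INTEGER PRIMARY KEY);
    CREATE TABLE folder (F_FOLDERNUMBER INTEGER PRIMARY KEY, F_FOLDERNAME TEXT);
    CREATE TABLE folder_contents (F_DOCNUMBER INTEGER, F_FOLDERNUMBER INTEGER);
    INSERT INTO doctaba VALUES (1), (2), (3);
    INSERT INTO folder VALUES (10, 'Claims'), (20, 'Archive');
    INSERT INTO folder_contents VALUES (1, 10), (1, 20), (2, 20);
""")

# The subquery references d1 only one level down, so the correlation is legal.
rows = conn.execute("""
    SELECT d1.F_DOCNUMBER,
           (SELECT group_concat(f2.F_FOLDERNAME, ';')
              FROM folder_contents fc2
              JOIN folder f2 ON f2.F_FOLDERNUMBER = fc2.F_FOLDERNUMBER
             WHERE fc2.F_DOCNUMBER = d1.F_DOCNUMBER) AS Folders
      FROM doctaba d1
     ORDER BY d1.F_DOCNUMBER
""").fetchall()

# group_concat does not promise an order, so sort the names before comparing.
folders_by_doc = {doc: sorted(names.split(";")) if names else []
                  for doc, names in rows}
print(folders_by_doc)
```

A document that is not filed in any folder comes back with NULL in the Folders column, which matches what the LEFT JOIN in the question was aiming for.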

Related

Update failing while using dense_rank and row_number

Here is the sample data from the two tables, put together for easy reference.
The upper table, [Outbound].[dbo].[Encounter_Out_P], has the columns "277CA_FILENAME", "277CA_FILENAME2", "277CA_FILENAME3" and "277CA_FILENAME4", which are currently NULL. With its rows sorted by File_Submitted_DT ascending, I want those columns updated with the "277FileId" values from the lower table, [Outbound].[dbo].[Encounter_Out_277_P], sorted by EDIFECSProcessDate ascending. Thanks in advance.
Here is my code:
WITH
cte_2771 AS (
SELECT
"277CA_FILENAME",
File_Submitted_DT,
TRN02_PatientControlNumber
FROM (
SELECT
I."277CA_FILENAME",
I.File_Submitted_DT,
#cte_277.TRN02_PatientControlNumber
,dense_rank() OVER(PARTITION BY #cte_277.TRN02_PatientControlNumber ORDER BY ABS(DATEDIFF(MINUTE, i.File_Submitted_DT, #cte_277.EDIFECSProcessDate)) ASC) rw1
FROM
[Outbound].[dbo].[Encounter_Out_P] I
INNER JOIN #cte_277 ON I.EncounterID = #cte_277.TRN02_PatientControlNumber
--WHERE EncounterID = 'AP230120920712808806'
)t
WHERE
t.rw1 = 1
)
,
cte_2772 AS (
SELECT
"277FileId"
,1 + ((ROW_NUMBER() OVER(PARTITION BY TRN02_PatientControlNumber ORDER BY EDIFECSProcessDate,File_Submitted_DT ASC ) - 1) % 4)rw2
,TRN02_PatientControlNumber
FROM (
SELECT DISTINCT
p."277FileId",
p.EDIFECSProcessDate,
p.TRN02_PatientControlNumber
,p.ID
,cte_2771.File_Submitted_DT
FROM [Outbound].[dbo].[Encounter_Out_277_P] p
INNER JOIN cte_2771 ON cte_2771.File_Submitted_DT < p.EDIFECSProcessDate
WHERE
p.TRN02_PatientControlNumber = cte_2771.TRN02_PatientControlNumber
) t
)
UPDATE cte_2771
SET "277CA_FILENAME" =
COALESCE(cte_2771."277CA_FILENAME", cte_2772."277FileId" )
FROM cte_2771 INNER JOIN cte_2772
ON cte_2772.TRN02_PatientControlNumber = cte_2771.TRN02_PatientControlNumber
WHERE cte_2772.rw2 = 1
I want the output to look like the upper part below (again put together for easy reference).
Notes:
I have posted the code for "277CA_FILENAME" only, since the other three columns use the same code with the final condition changed to WHERE cte_2772.rw2 = 2, 3 or 4 respectively.
If I uncomment the --WHERE EncounterID = 'AP230120920712808806' line in cte_2771, it works perfectly. If I leave it commented out and run for the entire load, one row is updated correctly and the other gets NULL.

SQL - ROW_NUMBER that is used in a multi-condition LEFT JOIN

Two tables store different properties for each product: CTI_ROUTING_VIEW and ORD_MACH_OPS
They are both organized by SPEC_NO > MACH_SEQ_NO, but the format of the sequence number is different in each table, so it can't be used for a JOIN. ORD_MACH_OPS has MACHINE and PASS_NO, meaning that if a product goes through the same machine twice, the row with the higher SEQ_NO will be PASS_NO 2, 3, etc. CTI_ROUTING_VIEW does not offer PASS_NO, but I can achieve the desired result with:
SELECT TOP (1000) [SPEC_NO]
,[SPEC_PART_NO]
,[MACH_NO]
,[MACH_SEQ_NO]
,[BLANK_WID]
,[BLANK_LEN]
,[NO_OUT_WID]
,[NO_OUT_LEN]
,[SU_MINUTES]
,[RUN_SPEED]
,[NO_COLORS]
,[PRINTDIEID]
,[CUTDIEID]
,ROW_NUMBER() OVER (PARTITION BY MACH_NO ORDER BY MACH_SEQ_NO) as PASS_NO
FROM [CREATIVE].[dbo].[CTI_ROUTING_VIEW]
I would think that I could use this artificial PASS_NO as a JOIN condition, but I can't seem to get it to come through. This is my first time using ROW_NUMBER() so I'm just wondering if I'm doing something wrong in the JOIN syntax.
SELECT rOrd.[SPEC_NO]
,rOrd.[MACH_SEQ_NO]
,rOrd.[WAS_REROUTED]
,rOrd.[NO_OUT]
,rOrd.[PART_COMP_FLG]
,rOrd.[SCHED_START]
,rOrd.[SCHED_STOP]
,rOrd.[MACH_REROUTE_FLG]
,rOrd.[MACH_DESCR]
,rOrd.REPLACED_MACH_NO
,rOrd.MACH_NO
,rOrd.PASS_NO
,rWip.MAX_TRX_DATETIME
,ISNULL(rWip.NET_FG_SUM*rOrd.NO_OUT,0) as NET_FG_SUM
,CASE
WHEN rCti.BLANK_WID IS NULL then 'N//A'
ELSE CONCAT(rCti.BLANK_WID, ' X ', rCti.BLANK_LEN)
END AS SIZE
,ISNULL(rCti.PRINTDIEID,'N//A') as PRINTDIEID
,ISNULL(rCti.CUTDIEID, 'N//A') as CUTDIEID
,rStyle.DESCR as STYLE
,ISNULL(rCti.NO_COLORS, 0) as NO_COLORS
,CAST(CONCAT(rOrd.ORDER_NO,'-',rOrd.ORDER_PART_NO) as varchar) as ORD_MACH_KEY
FROM [CREATIVE].[dbo].[ORD_MACH_OPS] as rOrd
LEFT JOIN (SELECT DISTINCT
[SPEC_NO]
,[SPEC_PART_NO]
,[MACH_NO]
,MACH_SEQ_NO
,[BLANK_WID]
,[BLANK_LEN]
,[NO_COLORS]
,[PRINTDIEID]
,[CUTDIEID]
,ROW_NUMBER() OVER (PARTITION BY MACH_NO ORDER BY MACH_SEQ_NO) as PASS_NO
FROM [CREATIVE].[dbo].[CTI_ROUTING_VIEW]) as rCti
ON rCti.SPEC_NO = rOrd.SPEC_NO
and rCti.MACH_NO =
CASE
WHEN rOrd.REPLACED_MACH_NO is null then rOrd.MACH_NO
ELSE rOrd.REPLACED_MACH_NO
END
and rCti.PASS_NO = rOrd.PASS_NO
LEFT JOIN INVENTORY_ITEM_TAB as rTab
ON rTab.SPEC_NO = rOrd.SPEC_NO
LEFT JOIN STYLE_DESCRIPTION as rStyle
ON rStyle.DESCR_CD = rTab.STYLE_CD
LEFT JOIN (
SELECT
JOB_NUMBER
,FORM_NO
,TRX_ORIG_MACH_NO
,PASS_NO
,SUM(GROSS_FG_QTY-WASTE_QTY) as NET_FG_SUM
,MAX(TRX_DATETIME) as MAX_TRX_DATETIME
FROM WIP_MACH_OPS
WHERE GROSS_FG_QTY <> 0
GROUP BY JOB_NUMBER, FORM_NO, TRX_ORIG_MACH_NO, PASS_NO) as rWip
ON rWip.JOB_NUMBER = rOrd.ORDER_NO
and rWip.FORM_NO = rOrd.ORDER_PART_NO
and rWip.TRX_ORIG_MACH_NO = rOrd.MACH_NO
and rWip.PASS_NO = rOrd.PASS_NO
WHERE rOrd.SCHED_START > DATEADD(DAY, -20, GETDATE())
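I fixed it by adding SPEC_NO as a second partition column, so that the pass number restarts for each spec instead of counting across every spec that uses the same machine: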
ROW_NUMBER() OVER (PARTITION BY SPEC_NO, MACH_NO ORDER BY MACH_SEQ_NO) as PASS_NO
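The difference between the two partitions can be checked on a small made-up data set. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than SQL Server, and the routing table and its five rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical routing rows: spec A visits machine 100 twice, spec B twice as well.
conn.executescript("""
    CREATE TABLE routing (SPEC_NO TEXT, MACH_NO INTEGER, MACH_SEQ_NO INTEGER);
    INSERT INTO routing VALUES
        ('A', 100, 1), ('A', 200, 2), ('A', 100, 3),
        ('B', 100, 1), ('B', 100, 2);
""")

# Same query, parameterised only by the PARTITION BY column list.
query = """
    SELECT SPEC_NO, MACH_NO, MACH_SEQ_NO,
           ROW_NUMBER() OVER (PARTITION BY {cols} ORDER BY MACH_SEQ_NO) AS PASS_NO
      FROM routing
     ORDER BY SPEC_NO, MACH_SEQ_NO
"""
by_machine = conn.execute(query.format(cols="MACH_NO")).fetchall()
by_spec_and_machine = conn.execute(query.format(cols="SPEC_NO, MACH_NO")).fetchall()

# Partitioning by machine alone numbers all four machine-100 rows together.
print(max(row[3] for row in by_machine))
for row in by_spec_and_machine:
    print(row)
```

With MACH_NO alone, machine 100 gets pass numbers up to 4 because both specs share one partition, so the join on PASS_NO rarely lines up. With SPEC_NO added, each spec gets its own 1, 2 sequence per machine.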

How to select a single row for each unique ID

SQL novice here learning on the job, still a greenhorn. I have a problem I don't know how to overcome. Using IBM Netezza and Aginity Workbench.
My current query tries to return one row per case number based on when a task was created, keeping only the row with the newest task. This gets me about 85% of the way there. The issue is that sometimes multiple tasks are created on the same day.
I would like to use Task Followup Date as a tie-breaker, so that only the newest row is kept when there are multiple rows with the same Case Number. I posted an example of what my current code outputs and what I would like it to output.
Current code
SELECT
A.PS_CASE_ID AS Case_Number
,D.CASE_TASK_TYPE_NM AS Task
,C.TASK_CRTE_TMS
,C.TASK_FLWUP_DT AS Task_Followup_Date
FROM VW_CC_CASE A
INNER JOIN VW_CASE_TASK C ON (A.CASE_ID = C.CASE_ID)
INNER JOIN VW_CASE_TASK_TYPE D ON (C.CASE_TASK_TYPE_ID = D.CASE_TASK_TYPE_ID)
INNER JOIN ADMIN.VW_RSN_CTGY B ON (A.RSN_CTGY_ID = B.RSN_CTGY_ID)
WHERE
(A.PS_Z_SPSR_ID LIKE '%EFT' OR A.PS_Z_SPSR_ID LIKE '%CRDT')
AND CAST(A.CASE_CRTE_TMS AS DATE) >= '2020-01-01'
AND B.RSN_CTGY_NM = 'Chargeback Initiation'
AND CAST(C.TASK_CRTE_TMS AS DATE) = (SELECT MAX(CAST(C2.TASK_CRTE_TMS AS DATE)) from VW_CASE_TASK C2 WHERE C2.CASE_ID = C.CASE_ID)
GROUP BY
A.PS_CASE_ID
,D.CASE_TASK_TYPE_NM
,C.TASK_CRTE_TMS
,C.TASK_FLWUP_DT
Current output
Desired output
You could use ROW_NUMBER here:
WITH cte AS (
SELECT DISTINCT A.PS_CASE_ID AS Case_Number, D.CASE_TASK_TYPE_NM AS Task,
C.TASK_CRTE_TMS, C.TASK_FLWUP_DT AS Task_Followup_Date,
ROW_NUMBER() OVER (PARTITION BY A.PS_CASE_ID ORDER BY C.TASK_FLWUP_DT DESC) rn
FROM VW_CC_CASE A
INNER JOIN VW_CASE_TASK C ON A.CASE_ID = C.CASE_ID
INNER JOIN VW_CASE_TASK_TYPE D ON C.CASE_TASK_TYPE_ID = D.CASE_TASK_TYPE_ID
INNER JOIN ADMIN.VW_RSN_CTGY B ON A.RSN_CTGY_ID = B.RSN_CTGY_ID
WHERE (A.PS_Z_SPSR_ID LIKE '%EFT' OR A.PS_Z_SPSR_ID LIKE '%CRDT') AND
CAST(A.CASE_CRTE_TMS AS DATE) >= '2020-01-01' AND
B.RSN_CTGY_NM = 'Chargeback Initiation' AND
CAST(C.TASK_CRTE_TMS AS DATE) = (SELECT MAX(CAST(C2.TASK_CRTE_TMS AS DATE))
FROM VW_CASE_TASK C2
WHERE C2.CASE_ID = C.CASE_ID)
)
SELECT
Case_Number,
Task,
TASK_CRTE_TMS,
Task_Followup_Date
FROM cte
WHERE rn = 1;
One method uses window functions:
with cte as (
< your query here >
)
select x.*
from (select cte.*,
row_number() over (partition by case_number, Task_Followup_Date
order by TASK_CRTE_TMS asc
) as seqnum
from cte
) x
where seqnum = 1;
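As a small runnable check of the tie-breaking idea, the sketch below orders each case's tasks by create day and then by follow-up date, and keeps the first row. It is one reading of the requirement, not a drop-in replacement for either answer: it runs on SQLite through Python's sqlite3 module rather than Netezza, and the tasks table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-joined task rows; case C1 has two tasks created the same day.
conn.executescript("""
    CREATE TABLE tasks (Case_Number TEXT, Task TEXT,
                        TASK_CRTE_TMS TEXT, Task_Followup_Date TEXT);
    INSERT INTO tasks VALUES
        ('C1', 'Review', '2020-03-01 09:00:00', '2020-03-10'),
        ('C1', 'Call',   '2020-03-05 08:00:00', '2020-03-12'),
        ('C1', 'Letter', '2020-03-05 15:00:00', '2020-03-20'),
        ('C2', 'Review', '2020-04-01 10:00:00', '2020-04-15');
""")

# Newest create day first; the follow-up date breaks ties within that day.
rows = conn.execute("""
    SELECT Case_Number, Task, TASK_CRTE_TMS, Task_Followup_Date
      FROM (SELECT t.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY Case_Number
                       ORDER BY DATE(TASK_CRTE_TMS) DESC,
                                Task_Followup_Date DESC
                   ) AS rn
              FROM tasks t)
     WHERE rn = 1
     ORDER BY Case_Number
""").fetchall()

for row in rows:
    print(row)
```

Because ROW_NUMBER assigns exactly one row the value 1 in each partition, the outer filter returns a single row per case even when several tasks share a create day.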

Return calculated column values for SELECT DISTINCT query

I have the following table in SQLite:
_id|token|status|timestamp|mood|eta|name|calc_eta
__________________________________________________________________________
168|iqmC.3aHMBGbl|ok|1516625084498|50|-4154|Sample Name|1516625533082
169|iqmC.3aHMBGbl|ok|1516625084498|50|-4214|Sample Name|1516625533108
170|iqmC.3aHMBGbl|ok|1516625084498|50|-4274|Sample Name|1516625533414
171|iqmC.3aHMBGbl|ok|1516625084498|50|-4334|Sample Name|1516625533160
172|iqmC.3aHMBGbl|ok|1516625084498|50|-4394|Sample Name|1516625533680
173|iqmC.3aHMBGbl|ok|1516625084498|50|-4420|Sample Name|1516625533068
174|iqmC.3aHMBGbl|ok|1516625084498|50|-4428|Sample Name|1516625533482
175|iqmC.3aHMBGbl|ok|1516625084498|50|-4483|Sample Name|1516625533155
176|iqmC.3aHMBGbl|ok|1516625084498|50|-4543|Sample Name|1516625533148
177|TFbintkHMBw4H|ok|1516630122485|50|2526|Sample Name|1516632672019
178|TFbintkHMBw4H|ok|1516630122485|50|2520|Sample Name|1516632671903
179|TFbintkHMBw4H|ok|1516630122485|50|2460|Sample Name|1516632672321
180|TFbintkHMBw4H|ok|1516630122485|50|2344|Sample Name|1516632672859
181|TFbintkHMBw4H|ok|1516630122485|50|2336|Sample Name|1516632671939
182|TFbintkHMBw4H|ok|1516630122485|50|2281|Sample Name|1516632672802
183|TFbintkHMBw4H|ok|1516630122485|50|2220|Sample Name|1516632671828
184|TFbintkHMBw4H|ok|1516630122485|50|2161|Sample Name|1516632672625
I'm trying to come up with a query that gives me the difference between the two newest calc_eta values (newest based on the auto-increment _id) for each distinct token value.
So in this case the result should be:
iqmC.3aHMBGbl|-7
TFbintkHMBw4H|797
I got this far with the SQL but it is not providing the calculated value for each distinct token currently and I'm not sure how to go further.
SELECT DISTINCT token,
(SELECT calc_eta
FROM DATA s
WHERE
(SELECT count(*)
FROM DATA f
WHERE f.token = s.token
AND f._id >= s._id) <= 1) -
(SELECT calc_eta
FROM
(SELECT calc_eta,
MIN(_id)
FROM DATA s
WHERE
(SELECT count(*)
FROM DATA f
WHERE f.token = s.token
AND f._id >= s._id) <= 2)) AS delay
FROM DATA;
In most SQL dialects, you would use window functions such as lag():
select d.*,
(calc_eta - prev_calc_eta) as diff
from (select d.*,
lag(calc_eta) over (partition by token order by _id) as prev_calc_eta,
row_number() over (partition by token order by _id desc) as seqnum
from data d
) d
where seqnum = 1;
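SQLite itself supports window functions from version 3.25 onward, so the lag() approach can be tried directly through Python's sqlite3 module, assuming the bundled SQLite is recent enough. The sketch below loads only the two newest rows per token from the question's sample data and keeps just the three columns the calculation needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The two newest rows per token from the question, reduced to the needed columns.
conn.executescript("""
    CREATE TABLE data (_id INTEGER PRIMARY KEY, token TEXT, calc_eta INTEGER);
    INSERT INTO data VALUES
        (175, 'iqmC.3aHMBGbl', 1516625533155),
        (176, 'iqmC.3aHMBGbl', 1516625533148),
        (183, 'TFbintkHMBw4H', 1516632671828),
        (184, 'TFbintkHMBw4H', 1516632672625);
""")

# LAG fetches the previous row's calc_eta; seqnum = 1 keeps the newest row per token.
rows = conn.execute("""
    SELECT token, calc_eta - prev_calc_eta AS diff
      FROM (SELECT token, calc_eta,
                   LAG(calc_eta) OVER (PARTITION BY token ORDER BY _id) AS prev_calc_eta,
                   ROW_NUMBER() OVER (PARTITION BY token ORDER BY _id DESC) AS seqnum
              FROM data)
     WHERE seqnum = 1
""").fetchall()

diff_by_token = dict(rows)
print(diff_by_token)
```

The differences come out as -7 and 797, matching the result the question asks for.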

SQL ROW_NUMBER with INNER JOIN

I need to use ROW_NUMBER() in the following Query to return rows 5 to 10 of the result. Can anyone please show me what I need to do? I've been trying to no avail. If anyone can help I'd really appreciate it.
SELECT *
FROM villa_data
INNER JOIN villa_prices
ON villa_prices.starRating = villa_data.starRating
WHERE villa_data.capacity >= 3
AND villa_data.bedrooms >= 1
AND villa_prices.period = 'lowSeason'
ORDER BY villa_prices.price,
villa_data.bedrooms,
villa_data.capacity
You need to stick it in a table expression to filter on ROW_NUMBER. You won't be able to use * as it will complain about the column name starRating appearing more than once so will need to list out the required columns explicitly. This is better practice anyway.
WITH CTE AS
(
SELECT /*TODO: List column names*/
ROW_NUMBER()
OVER (ORDER BY villa_prices.price,
villa_data.bedrooms,
villa_data.capacity) AS RN
FROM villa_data
INNER JOIN villa_prices
ON villa_prices.starRating = villa_data.starRating
WHERE villa_data.capacity >= 3
AND villa_data.bedrooms >= 1
AND villa_prices.period = 'lowSeason'
)
SELECT /*TODO: List column names*/
FROM CTE
WHERE RN BETWEEN 5 AND 10
ORDER BY RN
You can use a WITH clause. Please try the following:
WITH t AS
(
SELECT villa_data.starRating,
villa_data.capacity,
villa_data.bedrooms,
villa_prices.period,
villa_prices.price,
ROW_NUMBER() OVER (ORDER BY villa_prices.price,
villa_data.bedrooms,
villa_data.capacity ) AS 'RowNumber'
FROM villa_data
INNER JOIN villa_prices
ON villa_prices.starRating = villa_data.starRating
WHERE villa_data.capacity >= 3
AND villa_data.bedrooms >= 1
AND villa_prices.period = 'lowSeason'
)
SELECT *
FROM t
WHERE RowNumber BETWEEN 5 AND 10;
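The paging pattern in both answers can be checked on a toy table. This is a minimal sketch run on SQLite through Python's sqlite3 module; the single villa table with twelve made-up rows stands in for the joined villa_data and villa_prices result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined result: twelve villas with distinct prices.
conn.execute("CREATE TABLE villa (name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO villa VALUES (?, ?)",
    [(f"villa{i:02d}", 100 * i) for i in range(1, 13)],
)

# Number the rows inside the CTE, then filter on that number in the outer query.
rows = conn.execute("""
    WITH numbered AS (
        SELECT name, price,
               ROW_NUMBER() OVER (ORDER BY price) AS rn
          FROM villa
    )
    SELECT name, price, rn
      FROM numbered
     WHERE rn BETWEEN 5 AND 10
     ORDER BY rn
""").fetchall()

for row in rows:
    print(row)
```

Note that BETWEEN 5 AND 10 is inclusive at both ends, so it returns six rows. The filter has to live in the outer query because a window function cannot be referenced in the WHERE clause of the SELECT that computes it.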