Oracle SQL: Show entries from component tables once apiece - sql

My objective is to produce a dataset that shows a boatload of data from, in total, just shy of 50 tables, all in the same Oracle SQL database schema. Each table except the first consists of, as far as the report I'm building cares, two elements:
A foreign-key identifier that matches a row on the first table
A date
There may be many rows on one of these tables corresponding to one case, and it will NOT be the same number of rows from table to table.
My objective is to have each row in the first table show up as many times as needed to display all the results from the other tables once. So, something like this (except on a lot more tables):
CASE_FILE_ID INITIATED_DATE INSPECTION_DATE PAYMENT_DATE ACTION_DATE
------------ -------------- --------------- ------------ -----------
1000         10-JUL-1986    14-JUL-1987     10-JUL-1986
1000                        14-JUL-1988     10-JUL-1987
1000                        14-JUL-1989     10-JUL-1988
1000                                        10-JUL-1989
My current SQL code (shrunk down to five tables, but the rest all follow the same format as T1-T4):
SELECT DISTINCT
A.CASE_FILE_ID,
T1.DATE AS INITIATED_DATE,
T2.DATE AS INSPECTION_DATE,
T3.DATE AS PAYMENT_DATE,
T4.DATE AS ACTION_DATE
FROM
RECORDS.CASE_FILE A
LEFT OUTER JOIN RECORDS.INITIATE T1 ON A.CASE_FILE_ID = T1.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.INSPECTION T2 ON A.CASE_FILE_ID = T2.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.PAYMENT T3 ON A.CASE_FILE_ID = T3.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.ACTION T4 ON A.CASE_FILE_ID = T4.CASE_FILE_ID
ORDER BY
A.CASE_FILE_ID
The problem is, the output this produces results in distinct combinations; so in the above example (where I added a 'WHERE' clause of A.CASE_FILE_ID = '1000'), instead of four rows for case 1000, it'd show twelve (1 Initiated Date * 3 Inspection Dates * 4 Payment Dates = 12 rows). Suffice it to say, as the number of tables increases, this would get very prohibitive in both display and runtime, very quickly.
What is the best way to get an output loosely akin to the ideal above, where any one date is only shown once? Failing that, is there a way to get it to only show as many lines for one CASE_FILE as it needs to show all the dates, even if some dates repeat within that?

There isn't a good way, but there are two ways. One method involves subqueries for each table and complex outer joins. The second involves subqueries and union all. Let's go with that one:
SELECT CASE_FILE_ID,
MAX(INITIATED_DATE) as INITIATED_DATE,
MAX(INSPECTION_DATE) as INSPECTION_DATE,
MAX(PAYMENT_DATE) as PAYMENT_DATE,
MAX(ACTION_DATE) as ACTION_DATE
-- one all-NULL row per case, so cases with no child rows still appear
FROM ((SELECT A.CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
1 as seqnum
FROM RECORDS.CASE_FILE A
) UNION ALL
-- each child table fills only its own column and numbers its rows per case
(SELECT CASE_FILE_ID, DATE as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INITIATE
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, DATE as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INSPECTION
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
DATE as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.PAYMENT
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, DATE as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.ACTION
)
) a
-- rows sharing a case and a sequence number collapse into one output row
GROUP BY CASE_FILE_ID, seqnum
ORDER BY CASE_FILE_ID, seqnum;
Hmmm, a closely related solution is easier to maintain:
SELECT CASE_FILE_ID,
MAX(CASE WHEN type = 'INITIATED' THEN DATE END) as INITIATED_DATE,
MAX(CASE WHEN type = 'INSPECTION' THEN DATE END) as INSPECTION_DATE,
MAX(CASE WHEN type = 'PAYMENT' THEN DATE END) as PAYMENT_DATE,
MAX(CASE WHEN type = 'ACTION' THEN DATE END) as ACTION_DATE
-- DATE stands in for the real date column name, as in the question
FROM ((SELECT A.CASE_FILE_ID, NULL as TYPE, NULL as DATE,
1 as seqnum
FROM RECORDS.CASE_FILE A
) UNION ALL
-- each child table contributes (case, type label, date, per-case row number)
(SELECT CASE_FILE_ID, 'INITIATED', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INITIATE
) UNION ALL
(SELECT CASE_FILE_ID, 'INSPECTION', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INSPECTION
) UNION ALL
(SELECT CASE_FILE_ID, 'PAYMENT', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.PAYMENT
) UNION ALL
(SELECT CASE_FILE_ID, 'ACTION', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.ACTION
)
) a
GROUP BY CASE_FILE_ID, seqnum
ORDER BY CASE_FILE_ID, seqnum;
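To see the UNION ALL plus ROW_NUMBER() idea run end to end, here is a minimal sketch in Python using the standard sqlite3 module rather than Oracle. The lowercase table names, the dt column, and the ISO date strings are assumptions made for the example; the data mirrors case 1000 from the question (one initiated date, three inspection dates, four payment dates). SQLite does not accept parentheses around the UNION ALL branches, so they are dropped here.

```python
import sqlite3

# Hypothetical in-memory tables mirroring case 1000 from the question.
# Dates are stored as ISO strings so MAX() and ORDER BY sort chronologically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE case_file  (case_file_id INTEGER);
CREATE TABLE initiate   (case_file_id INTEGER, dt TEXT);
CREATE TABLE inspection (case_file_id INTEGER, dt TEXT);
CREATE TABLE payment    (case_file_id INTEGER, dt TEXT);
INSERT INTO case_file  VALUES (1000);
INSERT INTO initiate   VALUES (1000, '1986-07-10');
INSERT INTO inspection VALUES (1000, '1987-07-14'), (1000, '1988-07-14'),
                              (1000, '1989-07-14');
INSERT INTO payment    VALUES (1000, '1986-07-10'), (1000, '1987-07-10'),
                              (1000, '1988-07-10'), (1000, '1989-07-10');
""")

query = """
SELECT case_file_id,
       MAX(CASE WHEN type = 'INITIATED'  THEN dt END) AS initiated_date,
       MAX(CASE WHEN type = 'INSPECTION' THEN dt END) AS inspection_date,
       MAX(CASE WHEN type = 'PAYMENT'    THEN dt END) AS payment_date
FROM (
    -- one placeholder row per case so cases without child rows still appear
    SELECT case_file_id, NULL AS type, NULL AS dt, 1 AS seqnum FROM case_file
    UNION ALL
    SELECT case_file_id, 'INITIATED', dt,
           ROW_NUMBER() OVER (PARTITION BY case_file_id ORDER BY dt)
    FROM initiate
    UNION ALL
    SELECT case_file_id, 'INSPECTION', dt,
           ROW_NUMBER() OVER (PARTITION BY case_file_id ORDER BY dt)
    FROM inspection
    UNION ALL
    SELECT case_file_id, 'PAYMENT', dt,
           ROW_NUMBER() OVER (PARTITION BY case_file_id ORDER BY dt)
    FROM payment
) u
GROUP BY case_file_id, seqnum
ORDER BY case_file_id, seqnum
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The case comes back as four rows, the size of its largest child table, instead of the twelve combinations produced by the chained outer joins.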

Related

SQL - find row with closest date but different column value

I'm new to SQL and would appreciate some help.
I have a table TAB, and for each item B in it I need to find the item A with the closest date. In this case that is the A at 02.09.2021 04:25:30.
Date                 Item
07.09.2021 05:02:05  A
06.09.2021 05:01:02  A
05.09.2021 05:00:02  A
04.09.2021 04:59:01  A
03.09.2021 04:58:03  A
02.09.2021 04:56:55  A
02.09.2021 04:33:56  B
02.09.2021 04:25:30  A
WITH CTE(DATE, ITEM) AS
(
SELECT '20210907 05:02:05', 'A' UNION ALL
SELECT '20210906 05:01:02', 'A' UNION ALL
SELECT '20210905 05:00:02', 'A' UNION ALL
SELECT '20210904 04:59:01', 'A' UNION ALL
SELECT '20210903 04:58:03', 'A' UNION ALL
SELECT '20210902 04:56:55', 'A' UNION ALL
SELECT '20210902 04:33:56', 'B' UNION ALL
SELECT '20210902 04:25:30', 'A'
)
SELECT
CAST(C.DATE AS DATETIME) X_DATE, C.ITEM, Q.CLOSEST
FROM CTE AS C
OUTER APPLY
(
-- latest 'A' row that precedes the current 'B' row
SELECT TOP 1 CAST(X.DATE AS DATETIME) CLOSEST
FROM CTE AS X
WHERE X.ITEM = 'A' AND CAST(X.DATE AS DATETIME) < CAST(C.DATE AS DATETIME)
ORDER BY CAST(X.DATE AS DATETIME) DESC
) Q
WHERE C.ITEM = 'B'
You can use the OUTER APPLY approach shown in the query above. It only looks at A rows earlier than the B row.
Note also that the datetime column (DATE) is written in an ISO-compliant form, so the CAST to DATETIME is unambiguous.
Your data has only two columns. If you want only the closest A timestamp, then the fastest way is probably window functions:
select t.*,
(case when prev_a_date is null then next_a_date
when next_a_date is null then prev_a_date
when datediff(second, prev_a_date, date) <= datediff(second, date, next_a_date) then prev_a_date
else next_a_date
end) as a_date
from (select t.*,
max(case when item = 'A' then date end) over (order by date) as prev_a_date,
min(case when item = 'A' then date end) over (order by date desc) as next_a_date
from t
) t
where item = 'B';
This uses seconds to measure the time difference, but you can use a smaller unit if appropriate.
You can also do this using apply if you have more columns from the "A" rows that you want:
select b.*, a.*
from t b outer apply
(select top (1) ta.*
from t ta
where ta.item = 'A'
order by abs(datediff(second, ta.date, b.date))
) a
where b.item = 'B';

oracle sql get transactions between the period

I have 3 tables in oracle sql namely investor, share and transaction.
I am trying to get new investors invested in any shares for a certain period. As they are the new investor, there should not be a transaction in the transaction table for that investor against that share prior to the search period.
For the transaction table with the following records:
Id TranDt InvCode ShareCode
1 2020-01-01 00:00:00.000 inv1 S1
2 2019-04-01 00:00:00.000 inv1 S1
3 2020-04-01 00:00:00.000 inv1 S1
4 2021-03-06 11:50:20.560 inv2 S2
5 2020-04-01 00:00:00.000 inv3 S1
For the search period between 2020-01-01 and 2020-05-01, I should get the output as
5 2020-04-01 00:00:00.000 inv3 S1
Though there are transactions for inv1 in the table for that period, there is also a transaction prior to the search period, so that shouldn't be included as it's not considered as new investor within the search period.
The query below works, but it takes ages to return results when called from C# code, which leads to timeouts. Is there anything we can do to make it return results more quickly?
WITH
INVESTORS AS
(
SELECT I.INVCODE FROM INVESTOR I WHERE I.CLOSED IS NULL
),
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
),
SHARES_IN_PERIOD AS
(
SELECT DISTINCT
T.INVCODE,
T.SHARECODE,
T.TYPE
FROM TRANSACTION T
JOIN INVESTORS I ON T.INVCODE = I.INVCODE
JOIN SHARES S ON T.SHARECODE = S.SHARECODE
WHERE T.TRANDT >= :startDate AND T.TRANDT <= :endDate
),
PREVIOUS_SHARES AS
(
SELECT DISTINCT
T.INVCODE,
T.SHARECODE,
T.TYPE
FROM TRANSACTION T
JOIN INVESTORS I ON T.INVCODE = I.INVCODE
JOIN SHARES S ON T.SHARECODE = S.SHARECODE
WHERE T.TRANDT < :startDate
)
SELECT
DISTINCT
SP.INVCODE AS InvestorCode,
SP.SHARECODE AS ShareCode,
SP.TYPE AS ShareType
FROM SHARES_IN_PERIOD SP
WHERE (SP.INVCODE, SP.SHARECODE, SP.TYPE) NOT IN
(
SELECT
PS.INVCODE,
PS.SHARECODE,
PS.TYPE
FROM PREVIOUS_SHARES PS
)
Following the suggestion from @Gordon Linoff, I tried the options below (for all the shares I need), but they take a long time too. The transaction table has over 32 million rows.
1.
WITH
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
)
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
join shares s on s.sharecode = t.sharecode
where seqnum = 1 and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
2.
WITH
INVESTORS AS
(
SELECT I.INVCODE FROM INVESTOR I WHERE I.CLOSED IS NULL
),
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
)
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
join investors i on i.invcode = t.invcode
join shares s on s.sharecode = t.sharecode
where seqnum = 1 and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
3.
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
where seqnum = 1 and
t.sharecode IN (SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL)
and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
If you want to know if the first record in transactions for a share is during a period, you can use window functions:
select t.*
from (select t.*,
row_number() over (partition by invcode, sharecode order by trandt) as seqnum
from transactions t
) t
where seqnum = 1 and
t.sharecode = :sharecode and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
For performance with this code, you want an index on transactions(invcode, sharecode, trandt).
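As a quick check of the "first transaction falls inside the period" logic, here is a small sketch in Python using the standard sqlite3 module instead of Oracle. It loads the five sample rows from the question; the bind parameter names start_date and end_date are assumptions for the example, and timestamps are kept as ISO strings so plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (id INTEGER, trandt TEXT, invcode TEXT, sharecode TEXT);
INSERT INTO transactions VALUES
  (1, '2020-01-01 00:00:00.000', 'inv1', 'S1'),
  (2, '2019-04-01 00:00:00.000', 'inv1', 'S1'),
  (3, '2020-04-01 00:00:00.000', 'inv1', 'S1'),
  (4, '2021-03-06 11:50:20.560', 'inv2', 'S2'),
  (5, '2020-04-01 00:00:00.000', 'inv3', 'S1');
""")

# seqnum = 1 marks the earliest transaction of each (investor, share) pair;
# the pair counts as new only if that earliest transaction is in the period.
new_investors = conn.execute("""
SELECT id, invcode, sharecode
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY invcode, sharecode
                                ORDER BY trandt) AS seqnum
      FROM transactions t) t
WHERE seqnum = 1
  AND trandt >= :start_date
  AND trandt <  :end_date
""", {"start_date": "2020-01-01", "end_date": "2020-05-01"}).fetchall()

print(new_investors)
```

inv1 is excluded because its earliest S1 transaction is from 2019, and inv2 because its only transaction is after the period, which leaves the inv3 row the question expects.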

I need to write a query to mark previous record as “Not eligible” if a new record comes in within 30 days with same POS Order ID

I need to write a query that finds records whose POS_ORDER_ID appears again in the table within 30 days as a new record with status 'Canceled' or 'Discontinued', and marks the previous record for that POS_ORDER_ID as not eligible.
Table columns:
POS_ORDER_ID,
Status,
Order_date,
Error_description
A query containing MAX() and ROW_NUMBER() analytic functions might help you such as :
with t as
(
select t.*,
row_number() over (partition by pos_order_id order by Order_date desc ) as rn,
max(Order_date) over (partition by pos_order_id) as mx
from tab t -- your original table
)
select pos_order_id, Status, Order_date, Error_description,
case when rn >1
and t.status in ('Canceled','Discontinued')
and mx - t.Order_date <= 30
then
'Not eligible'
end as "Extra Status"
from t
Please use the queries below.
Select and validate:
select POS_ORDER_ID, Status, Order_date, Error_description,
row_number() over(partition by POS_ORDER_ID order by Order_date desc) as rnk
from table_name;
Update query:
merge into table_name t1
using
(select row_id, POS_ORDER_ID, Status, Order_date, Error_description,
row_number() over(partition by POS_ORDER_ID order by Order_date desc) as rnk
from table_name) t2
on (t1.POS_ORDER_ID = t2.POS_ORDER_ID and t1.row_id = t2.row_id)
when matched then
update
set
-- assumption: Status is the column to update; the original statement omitted the target column
t1.Status = case when t2.rnk = 1 then 'Canceled' else 'Not Eligible' end;

How to get the validity date range of a price from individual daily prices in SQL

I have some prices for the month of January.
Date,Price
1,100
2,100
3,115
4,120
5,120
6,100
7,100
8,120
9,120
10,120
Now, the output I need is a non-overlapping date range for each price.
price,from,To
100,1,2
115,3,3
120,4,5
100,6,7
120,8,10
I need to do this using SQL only.
For now, if I simply group by and take min and max dates, I get the below, which is an overlapping range:
price,from,to
100,1,7
115,3,3
120,4,10
This is a gaps-and-islands problem. The simplest solution is the difference of row numbers:
select price, min(date), max(date)
from (select t.*,
row_number() over (order by date) as seqnum,
row_number() over (partition by price order by date) as seqnum2
from t
) t
group by price, (seqnum - seqnum2)
order by min(date);
Why this works is a little hard to explain. But if you look at the results of the subquery, you will see how the adjacent rows are identified by the difference in the two values.
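Looking at the subquery results is easy to do on the sample data. The following is a small sketch in Python with the standard sqlite3 module; the table name prices and the column name day are assumptions for the example. It first lists the difference seqnum - seqnum2 for every row, then runs the grouping query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prices (day INTEGER, price INTEGER);
INSERT INTO prices VALUES
  (1, 100), (2, 100), (3, 115), (4, 120), (5, 120),
  (6, 100), (7, 100), (8, 120), (9, 120), (10, 120);
""")

# Step 1: the two row numbers and their difference, row by row.
inner = conn.execute("""
SELECT day, price,
       ROW_NUMBER() OVER (ORDER BY day) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY price ORDER BY day) AS seqnum2
FROM prices
ORDER BY day
""").fetchall()
diffs = [(day, price, seqnum - seqnum2) for day, price, seqnum, seqnum2 in inner]
for row in diffs:
    print(row)

# Step 2: group by price and that difference to get one row per island.
ranges = conn.execute("""
SELECT price, MIN(day) AS from_day, MAX(day) AS to_day
FROM (SELECT p.*,
             ROW_NUMBER() OVER (ORDER BY day) AS seqnum,
             ROW_NUMBER() OVER (PARTITION BY price ORDER BY day) AS seqnum2
      FROM prices p) t
GROUP BY price, (seqnum - seqnum2)
ORDER BY from_day
""").fetchall()
print(ranges)
```

Within an unbroken run of one price, both row numbers increase by one per row, so their difference stays constant. When another price interrupts the run, seqnum keeps counting while seqnum2 for the first price pauses, so the difference jumps. On this data, price 100 has difference 0 on days 1-2 and 3 on days 6-7. Price 120 also has difference 3 on days 4-5, which is why price must be part of the GROUP BY as well.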
SELECT Lag.price,Lag.[date] AS [From], MIN(Lead.[date]-Lag.[date])+Lag.[date] AS [to]
FROM
(
SELECT [date],[Price]
FROM
(
SELECT [date],[Price],LAG(Price) OVER (ORDER BY DATE,Price) AS LagID FROM #table1 A
)B
WHERE CASE WHEN Price <> ISNULL(LagID,1) THEN 1 ELSE 0 END = 1
)Lag
JOIN
(
SELECT [date],[Price]
FROM
(
SELECT [date],Price,LEAD(Price) OVER (ORDER BY DATE,Price) AS LeadID FROM [#table1] A
)B
WHERE CASE WHEN Price <> ISNULL(LeadID,1) THEN 1 ELSE 0 END = 1
)Lead
ON Lag.[Price] = Lead.[Price]
WHERE Lead.[date]-Lag.[date] >= 0
GROUP BY Lag.[date],Lag.[price]
ORDER BY Lag.[date]
Another method using ROWS UNBOUNDED PRECEDING
SELECT price, MIN([date]) AS [from], [end_date] AS [To]
FROM
(
SELECT *, MIN([abc]) OVER (ORDER BY DATE DESC ROWS UNBOUNDED PRECEDING ) end_date
FROM
(
SELECT *, CASE WHEN price = next_price THEN NULL ELSE DATE END AS abc
FROM
(
SELECT a.* , b.[date] AS next_date, b.price AS next_price
FROM #table1 a
LEFT JOIN #table1 b
ON a.[date] = b.[date]-1
)AA
)BB
)CC
GROUP BY price, end_date

SQL - values from two rows into new two rows

I have a query that gives the summed quantity of items on working days. On weekends and holidays the quantity and item values are empty.
On those empty days I would like to show the last known quantity and item.
My query is like this:
`select a.dt,b.zaliha as quantity,b.artikal as item
from
(select to_date('01-01-2017', 'DD-MM-YYYY') + rownum -1 dt
from dual
connect by level <= to_date(sysdate) - to_date('01-01-2017', 'DD-MM-YYYY') + 1
order by 1)a
LEFT OUTER JOIN
(select kolicina,sum(kolicina)over(partition by artikal order by datum_do) as zaliha,datum_do,artikal
from
(select sum(vv.kolicinaulaz-vv.kolicinaizlaz)kolicina,vz.datum as datum_do,vv.artikal
from vlpzaglavlja vz, vlpvarijante vv
where vz.id=vv.vlpzaglavlje
and vz.orgjed='01006'
and vv.skladiste='01006'
and vv.artikal in (3069,6402)
group by vz.datum,vv.artikal
order by vv.artikal,vz.datum asc)
order by artikal,datum_do asc)b
on a.dt=b.datum_do
where a.dt between to_date('12102017','ddmmyyyy') and to_date('16102017','ddmmyyyy')
order by a.dt`
and my output is like this:
and I want this:
In short, if quantity is null use lag(... ignore nulls) and coalesce or nvl:
select dt, item,
nvl(quantity, lag(quantity ignore nulls) over (partition by item order by dt))
from t
order by dt, item
Here is the full query, I cannot test it, but it is something like:
with t as (
select a.dt, b.zaliha as quantity, b.artikal as item
from (
select date '2017-10-10' + rownum - 1 dt
from dual
connect by date '2017-10-10' + rownum - 1 <= date '2017-10-16' ) a
left join (
select kolicina, datum_do, artikal,
sum(kolicina) over(partition by artikal order by datum_do) as zaliha
from (
select sum(vv.kolicinaulaz-vv.kolicinaizlaz) kolicina,
vz.datum as datum_do, vv.artikal
from vlpzaglavlja vz
join vlpvarijante vv on vz.id = vv.vlpzaglavlje
where vz.orgjed = '01006' and vv.skladiste='01006'
and vv.artikal in (3069,6402)
group by vz.datum, vv.artikal)) b
on a.dt = b.datum_do)
select *
from (
select dt, item,
nvl(quantity, lag(quantity ignore nulls)
over (partition by item order by dt)) qty
from t)
where dt >= date '2017-10-12'
order by dt, item
There are several issues in your query, major and minor:
In the date generator (subquery a) you select dates over a long period, January to September, then join them with the main tables, sum the data, and only then select a small part. Why not filter the dates first?
to_date(sysdate) is unnecessary: sysdate is already a date.
Use ANSI joins.
Do not use order by in subqueries; it has no effect there, only the final ordering matters.
Use date literals when defining dates; they are more readable.