I have a table:
table1
id e_id e_nm e_value line_num
59 BHT03-01 Ref ID 04/18/1820 4
59 BHT03-02 38 4
59 BHT03-03 10 4
59 ABC03-01 Ref ID 04/18/1820 4
59 ABC03-02 38 4
59 ABC03-03 10 4
60 BHT03-01 Ref ID 05/09/1820 4
60 BHT03-02 52 4
60 BHT03-03 43 4
I need to concatenate each BHT03-01, BHT03-02 and BHT03-03 separated by : into 1 BHT03-01 for each id and line_num.
All other e_id apart from BHT03-01 should be unaffected.
Here is the output:
table1
id e_id e_nm e_value line_num
59 BHT03-01 Ref ID 04/18/1820:38:10 4
59 BHT03-02 38 4
59 BHT03-03 10 4
59 ABC03-01 Ref ID 04/18/1820 4
59 ABC03-02 38 4
59 ABC03-03 10 4
60 BHT03-01 Ref ID 05/09/1820:52:43 4
60 BHT03-02 52 4
60 BHT03-03 43 4
Once I get this table, I also have to drop all the rows with BHT03-02, BHT03-03.
How can I do it in Oracle SQL?
Given that the updated value in the e_value column is actually derived data, I suggest just creating a computed column when you select, maybe in a view:
SELECT
    id,
    e_id,
    e_nm,
    CASE WHEN e_id = 'BHT03-01'
         THEN LISTAGG(CASE WHEN e_id LIKE 'BHT03-%' THEN e_value END, ':')
                  WITHIN GROUP (ORDER BY e_id)
                  OVER (PARTITION BY id, line_num)
         ELSE e_value END AS e_value,
    line_num
FROM yourTable
ORDER BY
    id,
    e_id;
The CASE expression targets only the BHT03-01 row, which holds the date in the e_value column. For that row it displays a colon-separated concatenation of the BHT03 values in the same id and line_num group; the inner CASE passes NULL for every other e_id, and LISTAGG ignores NULLs, so the ABC03 rows stay out of the list. Every other row just repeats the e_value which is already there.
The following statement will do the update and delete in one go:
MERGE INTO table1 tgt
USING (SELECT ID,
e_id,
line_num,
listagg(e_value, ':') WITHIN GROUP (ORDER BY e_id) OVER (PARTITION BY id, line_num) e_value,
CASE WHEN e_id IN ('BHT03-02', 'BHT03-03') THEN 'Y' ELSE 'N' END del
FROM table1
WHERE e_id IN ('BHT03-01', 'BHT03-02', 'BHT03-03')) src
ON (tgt.id = src.id AND tgt.line_num = src.line_num AND tgt.e_id = src.e_id) -- or whatever the unique identifiers are for the table1 rows
WHEN MATCHED THEN
UPDATE SET tgt.e_value = src.e_value
DELETE WHERE src.del = 'Y';
And here's a demo of it working.
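To make the intended result of the update-plus-delete easier to check, here is a minimal Python sketch of the same logic outside the database. It is only an illustration, not the MERGE itself: the rows list, the merge_bht helper and the use of None for the blank e_nm cells are my own assumptions based on the sample table.

```python
from collections import defaultdict

# Rows mirror the sample table1: (id, e_id, e_nm, e_value, line_num).
# Blank e_nm cells are represented as None (an assumption about the data).
rows = [
    (59, "BHT03-01", "Ref ID", "04/18/1820", 4),
    (59, "BHT03-02", None, "38", 4),
    (59, "BHT03-03", None, "10", 4),
    (59, "ABC03-01", "Ref ID", "04/18/1820", 4),
    (59, "ABC03-02", None, "38", 4),
    (59, "ABC03-03", None, "10", 4),
    (60, "BHT03-01", "Ref ID", "05/09/1820", 4),
    (60, "BHT03-02", None, "52", 4),
    (60, "BHT03-03", None, "43", 4),
]

BHT_IDS = ("BHT03-01", "BHT03-02", "BHT03-03")


def merge_bht(rows):
    """Fold BHT03-02/-03 values into BHT03-01 and drop the folded rows."""
    # Collect the BHT03 values per (id, line_num), keyed by e_id for ordering.
    parts = defaultdict(dict)
    for id_, e_id, _e_nm, e_value, line_num in rows:
        if e_id in BHT_IDS:
            parts[(id_, line_num)][e_id] = e_value

    result = []
    for id_, e_id, e_nm, e_value, line_num in rows:
        if e_id in ("BHT03-02", "BHT03-03"):
            continue  # corresponds to DELETE WHERE src.del = 'Y'
        if e_id == "BHT03-01":
            group = parts[(id_, line_num)]
            # Same ordering as WITHIN GROUP (ORDER BY e_id).
            e_value = ":".join(group[k] for k in sorted(group))
        result.append((id_, e_id, e_nm, e_value, line_num))
    return result


merged = merge_bht(rows)
for row in merged:
    print(row)
```

The ABC03 rows pass through untouched, and only one BHT03-01 row per id and line_num remains.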
I have two tables
accounts table
account_id location_id
1 11
1 12
2 21
2 22
Events_table
location_id events_id event_date
11 e1 2022/03/04
11 e3 2022/03/05
12 e2 2022/03/10
21 e5 2022/04/10
21 e2 2022/04/09
The result I expected is to get only latest event_id for location with respect to account
Result Expected:
account_id location_id events_id event_date
1 11 e3 2022/03/05
1 12 e2 2022/03/10
2 21 e5 2022/04/10
Use:
with cte as
( select *,
row_number() over(partition by location_id order by event_date desc ) row_num
from Events
) select a.account_id,
a.location_id,
cte.events_id,
cte.event_date
from accounts a
inner join cte on cte.location_id=a.location_id
where cte.row_num=1;
Demo
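If you want to try the ROW_NUMBER approach without a database server, the sketch below runs an equivalent query against an in-memory SQLite database from Python. This is an assumption-laden stand-in: it needs SQLite 3.25 or later for window functions, the table is named events here, and the dates are stored as text in YYYY/MM/DD form so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (account_id INTEGER, location_id INTEGER);
    CREATE TABLE events (location_id INTEGER, events_id TEXT, event_date TEXT);
    INSERT INTO accounts VALUES (1, 11), (1, 12), (2, 21), (2, 22);
    INSERT INTO events VALUES
        (11, 'e1', '2022/03/04'), (11, 'e3', '2022/03/05'),
        (12, 'e2', '2022/03/10'),
        (21, 'e5', '2022/04/10'), (21, 'e2', '2022/04/09');
""")

# Number the events of each location from newest to oldest, then keep row 1.
latest = conn.execute("""
    WITH cte AS (
        SELECT e.*,
               ROW_NUMBER() OVER (PARTITION BY location_id
                                  ORDER BY event_date DESC) AS row_num
        FROM events e
    )
    SELECT a.account_id, a.location_id, cte.events_id, cte.event_date
    FROM accounts a
    INNER JOIN cte ON cte.location_id = a.location_id
    WHERE cte.row_num = 1
    ORDER BY a.account_id, a.location_id
""").fetchall()

for row in latest:
    print(row)
```

Location 22 has no events, so the inner join leaves it out, which matches the expected result.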
Problem statement is to calculate median from a table that has two columns. One specifying a number and the other column specifying the frequency of the number.
For e.g.
Table "Numbers":
Num Freq
1 3
2 3
This median needs to be found for the flattened array with values:
1,1,1,2,2,2
Query:
with ct1 as
(select num,frequency, sum(frequency) over(order by num) as sf from numbers o)
select case when count(num) over(order by num) = 1 then num
when count(num) over (order by num) > 1 then sum(num)/2 end median
from ct1 b where sf <= (select max(sf)/2 from ct1) or (sf-frequency) <= (select max(sf)/2 from ct1)
Is it not possible to use count(num) over(order by num) as the condition in the case statement?
Find the relevant row (or two rows) based on the accumulated frequencies, and take the average of num.
The example and Fiddle also show the computations leading to the result.
If you already know that num is unique, rowid can be removed from the ORDER BY clauses.
with
t1 as
(
select t.*
,nvl(sum(freq) over (order by num,rowid rows between unbounded preceding and 1 preceding),0) as freq_acc_sum_1
,sum(freq) over (order by num, rowid) as freq_acc_sum_2
,sum(freq) over () as freq_sum
from t
)
select t1.*
,case
when freq_sum/2 between freq_acc_sum_1 and freq_acc_sum_2
then 'V'
end as relevant_record
from t1
order by num, rowid
Fiddle
Example:
ID NUM FREQ FREQ_ACC_SUM_1 FREQ_ACC_SUM_2 FREQ_SUM RELEVANT_RECORD
7 8 1 0 1 18
5 10 1 1 2 18
1 29 3 2 5 18
6 31 1 5 6 18
3 33 2 6 8 18
4 41 1 8 9 18 V
9 49 2 9 11 18 V
2 52 1 11 12 18
8 56 3 12 15 18
10 92 3 15 18 18
MEDIAN
45
Fiddle for 1M records
You can find the one (or two) middle value(s) and then average:
SELECT AVG(num) AS median
FROM (
SELECT num,
freq,
SUM(freq) OVER (ORDER BY num) AS cum_freq,
(SUM(freq) OVER () + 1)/2 AS median_freq
FROM table_name
)
WHERE cum_freq - freq < median_freq
AND median_freq < cum_freq + 1
Or, expand the values using a LATERAL join to a hierarchical query and then use the MEDIAN function:
SELECT MEDIAN(num) AS median
FROM table_name t
CROSS JOIN LATERAL (
SELECT LEVEL
FROM DUAL
WHERE freq > 0
CONNECT BY LEVEL <= freq
)
Which, for the sample data:
CREATE TABLE table_name (Num, Freq) AS
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 3 FROM DUAL;
Outputs:
MEDIAN
1.5
(Note: For your sample data, there are 6 items, an even number, so the MEDIAN will be halfway between the values of the 3rd and 4th items; so halfway between 1 and 2 = 1.5.)
db<>fiddle here
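For checking any of the queries above against small data sets, a plain Python sketch of the cumulative-frequency idea can help. It is not a translation of a specific query; weighted_median is a hypothetical helper name, and it assumes non-negative integer frequencies.

```python
def weighted_median(pairs):
    """Median of the flattened data described by (num, freq) pairs."""
    pairs = sorted((n, f) for n, f in pairs if f > 0)
    total = sum(f for _, f in pairs)
    if total == 0:
        raise ValueError("no data")
    # 1-based positions of the middle item(s); they coincide when total is odd.
    lo_pos, hi_pos = (total + 1) // 2, (total + 2) // 2
    lo_val = hi_val = None
    cum = 0
    for num, freq in pairs:
        cum += freq  # running total, like SUM(freq) OVER (ORDER BY num)
        if lo_val is None and cum >= lo_pos:
            lo_val = num
        if cum >= hi_pos:
            hi_val = num
            break
    return (lo_val + hi_val) / 2


# Sample data from the question: flattened values 1,1,1,2,2,2.
print(weighted_median([(1, 3), (2, 3)]))
```

The function never expands the data, so it stays cheap even when the frequencies are large.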
I am attempting to remove transactions that have been reversed from a table. The table has Account, Date, Amount and Row. If a transaction has been reversed, the Account will match and the Amounts will be inverses of each other.
Example Table
Account Date Amount Row
12 1/1/18 45 72 -- Case 1
12 1/2/18 50 73
12 1/2/18 -50 74
12 1/3/18 52 75
15 1/1/18 51 76 -- Case 2
15 1/2/18 51 77
15 1/2/18 -51 78
15 1/2/18 51 79
18 1/2/18 50 80 -- Case 3
18 1/2/18 50 81
18 1/2/18 -50 82
18 1/2/18 -50 83
18 1/3/18 50 84
18 1/3/18 50 85
20 1/1/18 57 88 -- Case 4
20 1/2/18 57 89
20 1/4/18 -57 90
20 1/5/18 57 91
Desired Results Table
Account Date Amount Row
12 1/1/18 45 72 -- Case 1
12 1/3/18 52 75
15 1/1/18 51 76 -- Case 2
15 1/2/18 51 79
18 1/3/18 50 84 -- Case 3
18 1/3/18 50 85
20 1/1/18 57 88 -- Case 4
20 1/5/18 57 91
Removing all instances of inverse transactions does not work when there are multiple transactions when all other columns are the same. My attempt was to count all duplicate transactions, count of all inverse duplicate transactions, subtracting those to get the number of rows I needed from each transactions group. I was going to pull the first X rows but found in most cases I want the last X rows of each group, or even a mix (the first and last in Case 2).
I either need a method of removing pairs from the original table, or working from what I have so far, a method of distinguishing which transactions to pull.
Code so far:
--adding row Numbers
with a as (
select
account a,
date d,
amount f,
row_number() over(order by account, date) r
from table),
--counting Duplicates
b as (
select a.a, a.f, Dups
from a join (
select a, f, count(*) Dups
from a
group by a.a, a.f
having count(*)>1
) b
on a.a=b.a and
b.f=a.f
where a.f>0
),
--counting inverse duplicates
c as (
select a.a, a.f, InvDups
from a join (
select a, f, count(*) InvDups
from a
group by a.a, a.f
having count(*)>1
) b
on a.a=b.a and
-b.f=a.f
where a.f>0
),
--combining c and d to get desired number of rows of each transaction group
d as (
select
b.a, b.f, dups, InvDups, Dups-InvDups TotalDups
from b join c
on b.a=c.a and
b.f=c.f
),
--getting the number of rows from the beginning of each transaction group
select d.a, d.d, d.f
from
(select
a, d, f, row_number() over (group by a, d, f) r2
from a) e
join d
on e.a=d.a and
TotalDups<=r2
You can try this.
SELECT T_P.* FROM
( SELECT *, ROW_NUMBER() OVER(PARTITION BY Account, Amount ORDER BY [Row] ) RN from #MyTable WHere Amount > 0 ) T_P
LEFT JOIN
( SELECT *, ROW_NUMBER() OVER(PARTITION BY Account, Amount ORDER BY [Row] ) RN from #MyTable WHere Amount < 0 ) T_N
ON T_P.Account = T_N.Account
AND T_P.Amount = ABS(T_N.Amount)
AND T_P.RN = T_N.RN
WHERE
T_N.Account IS NULL
The following handles your three cases:
with t as (
select t.*,
row_number() over (partition by account, date, amount order by row) as seqnum
from table t
)
select t.*
from t
where not exists (select 1
from t t2
where t2.account = t.account and t2.date = t.date and
t2.amount = -t.amount and t2.seqnum = t.seqnum
);
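As a rough illustration of the seqnum pairing in the query above, here is the same idea in Python: number the rows within each (account, date, amount) group, then drop every row that has a same-numbered row with the negated amount. The function and variable names are hypothetical, and, like the query, this only pairs reversals that share the same date, so it does not cover Case 4.

```python
from collections import defaultdict

# (account, date, amount, row) for Cases 1-3 of the example table.
transactions = [
    (12, "1/1/18", 45, 72), (12, "1/2/18", 50, 73),
    (12, "1/2/18", -50, 74), (12, "1/3/18", 52, 75),
    (15, "1/1/18", 51, 76), (15, "1/2/18", 51, 77),
    (15, "1/2/18", -51, 78), (15, "1/2/18", 51, 79),
    (18, "1/2/18", 50, 80), (18, "1/2/18", 50, 81),
    (18, "1/2/18", -50, 82), (18, "1/2/18", -50, 83),
    (18, "1/3/18", 50, 84), (18, "1/3/18", 50, 85),
]


def drop_reversed(transactions):
    """Remove rows whose k-th occurrence has a k-th opposite-amount twin."""
    # seqnum: position within (account, date, amount), ordered by row.
    counters = defaultdict(int)
    numbered = set()
    keyed = []
    for account, date, amount, row in sorted(transactions, key=lambda t: t[3]):
        counters[(account, date, amount)] += 1
        key = (account, date, amount, counters[(account, date, amount)])
        numbered.add(key)
        keyed.append((key, (account, date, amount, row)))
    # Keep a row only if no row exists with the negated amount and same seqnum.
    return [
        txn for (account, date, amount, seq), txn in keyed
        if (account, date, -amount, seq) not in numbered
    ]


kept = drop_reversed(transactions)
print([row for *_, row in kept])
```

For Case 2 this cancels rows 77 and 78 and keeps 76 and 79, in line with the desired results table.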
Use This
;WITH CTE
AS
(
SELECT
[Row]
FROM YourTable YT
WHERE Amount > 0
AND EXISTS
(
SELECT 1 FROM YourTable WHERE Account = YT.Account
AND [Date] = YT.[Date]
AND (Amount+YT.Amount)=0
)
UNION ALL
SELECT
[Row]
FROM YourTable YT
WHERE Amount < 0
AND EXISTS
(
SELECT 1 FROM YourTable WHERE Account = YT.Account
AND [Date] = YT.[Date]
AND (Amount+YT.Amount)=0
)
)
SELECT * FROM YourTable
WHERE EXISTS
(
SELECT 1 FROM CTE WHERE [Row] = YourTable.[Row]
)
I have a compound primary key where the single parts are potentially random. They aren't in any particular order and one can be unique or they can be all the same.
I do not care which row I get. This is like "Just pick one from each group".
My table:
KeyPart1 KeyPart2 KeyPart3 colA colB colD
11 21 39 d1
11 22 39 d2
12 21 39 d2
12 22 39 d3
13 21 38 d3
13 22 38 d5
Now what I want is to get for each entry in colD one row, I do not care which one.
KeyPart1 KeyPart2 KeyPart3 colA colB colD
11 21 39 d1
12 21 39 d2
12 22 39 d3
13 22 38 d5
To get rows that are unique by colD, you will have to decide which of the other column values will be discarded. Here, within the over clause, I have used partition by colD, which provides the wanted uniqueness by that column, but the order by is arbitrary and you may want to change it to suit your needs.
select
d.*
from (
select
t.*
, row_number() over (partition by t.colD
order by t.KeyPart1,t.KeyPart2,t.KeyPart3) as rn
from yourtable t
) d
where d.rn = 1;
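The effect of rn = 1 can be pictured with a few lines of Python: sort by the key parts, then keep the first row seen for each colD value. This is only a sketch of the idea; the tuple layout and the one_per_group name are my own, and colA/colB are left out because they are empty in the sample.

```python
# (KeyPart1, KeyPart2, KeyPart3, colD) from the sample table; colA/colB omitted.
table = [
    (11, 21, 39, "d1"), (11, 22, 39, "d2"), (12, 21, 39, "d2"),
    (12, 22, 39, "d3"), (13, 21, 38, "d3"), (13, 22, 38, "d5"),
]


def one_per_group(rows, group_index=3):
    """Keep the first row of each group after sorting by the key parts."""
    picked = {}
    for row in sorted(rows, key=lambda r: r[:3]):  # ORDER BY KeyPart1..3
        picked.setdefault(row[group_index], row)   # rn = 1 within colD
    return list(picked.values())


chosen = one_per_group(table)
for row in chosen:
    print(row)
```

With this ordering d2 comes from the (11, 22, 39) row rather than (12, 21, 39) as in the question's sample output; since any row per group is acceptable, both are valid answers.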
The following should work in almost any version of DB2:
select t.*
from (select t.*,
row_number() over (partition by KeyPart1, KeyPart2
order by KeyPart1
) as seqnum
from t
) t
where seqnum = 1;
If you only care about column d, and the first two key parts, then you can use group by:
select KeyPart1, KeyPart2, min(colD)
from t
group by KeyPart1, KeyPart2;
Change 'order by' if necessary
with D as (
select distinct colD from yourtable
)
select Y.* from D
inner join lateral
(
select * from yourtable X
where X.colD = D.colD
order by X.KeyPart1, X.KeyPart2, X.KeyPart3
fetch first 1 row only
) Y on 1=1
I'm working on the following query and table
SELECT dd.actual_date, dd.week_number_overall, sf.branch_id, AVG(sf.overtarget_qnt) AS targetreach
FROM sales_fact sf, date_dim dd
WHERE dd.date_id = sf.date_id
AND dd.week_number_overall BETWEEN 88-2 AND 88
AND sf.branch_id = 1
GROUP BY dd.actual_date, branch_id, dd.week_number_overall
ORDER BY dd.actual_date ASC;
ACTUAL_DATE WEEK_NUMBER_OVERALL BRANCH_ID TARGETREACH
----------- ------------------- ---------- -----------
13/08/14 86 1 -11
14/08/14 86 1 12
15/08/14 86 1 11.8
16/08/14 86 1 1.4
17/08/14 86 1 -0.2
19/08/14 86 1 7.2
20/08/14 87 1 16.6
21/08/14 87 1 -1.4
22/08/14 87 1 14.4
23/08/14 87 1 2.8
24/08/14 87 1 18
26/08/14 87 1 13.4
27/08/14 88 1 -1.8
28/08/14 88 1 10.6
29/08/14 88 1 7.2
30/08/14 88 1 14
31/08/14 88 1 9.6
02/09/14 88 1 -3.2
The "TargetReach" column shows whether the target has been reached or not.
A negative value means the target wasn't reached on that day.
How can I calculate the number of rows with a positive value for this query?
The result should look something like:
TOTAL_POSITIVE_TARGET_REACH WEEK_NUMBER_OVERALL
--------------------------- ------------------
13 88
I have tried to use CASE but still not working right.
Thanks a lot.
You want to use conditional aggregation:
with t as (
<your query here>
)
select week_number_overall, sum(case when targetreach > 0 then 1 else 0 end)
from t
group by week_number_overall;
However, I would rewrite your original query to use proper join syntax. Then the query would look like:
SELECT week_number_overall,
SUM(CASE WHEN targetreach > 0 THEN 1 ELSE 0 END)
FROM (SELECT dd.actual_date, dd.week_number_overall, sf.branch_id, AVG(sf.overtarget_qnt) AS targetreach
FROM sales_fact sf JOIN
date_dim dd
ON dd.date_id = sf.date_id
WHERE dd.week_number_overall BETWEEN 88-2 AND 88 AND sf.branch_id = 1
GROUP BY dd.actual_date, branch_id, dd.week_number_overall
) t
GROUP BY week_number_overall
ORDER BY week_number_overall;
The difference between a CTE (the first solution) and a subquery is (in this case) just a matter of preference.
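As a quick sanity check of the conditional aggregation, the sketch below loads the daily averages from the question's output into an in-memory SQLite table and runs the same SUM(CASE ...) pattern from Python. The table name t and the column alias positive_days are assumptions made for the example; the real query would sit on top of the sales_fact/date_dim join instead.

```python
import sqlite3

# (week_number_overall, targetreach) taken from the query output in the question.
daily = [
    (86, -11), (86, 12), (86, 11.8), (86, 1.4), (86, -0.2), (86, 7.2),
    (87, 16.6), (87, -1.4), (87, 14.4), (87, 2.8), (87, 18), (87, 13.4),
    (88, -1.8), (88, 10.6), (88, 7.2), (88, 14), (88, 9.6), (88, -3.2),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (week_number_overall INTEGER, targetreach REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?)", daily)

# Each positive day contributes 1 to the sum, every other day contributes 0.
per_week = conn.execute("""
    SELECT week_number_overall,
           SUM(CASE WHEN targetreach > 0 THEN 1 ELSE 0 END) AS positive_days
    FROM t
    GROUP BY week_number_overall
    ORDER BY week_number_overall
""").fetchall()

total_positive = sum(n for _, n in per_week)
print(per_week, total_positive)
```

Grouping by week gives one count per week; dropping the GROUP BY (or summing the counts, as done here) gives the single overall total across weeks 86 to 88.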
SELECT WEEK_NUMBER_OVERALL, COUNT(*) TOTAL_POSITIVE_TARGET_REACH
FROM (your original query)
WHERE TARGETREACH >= 0
GROUP BY WEEK_NUMBER_OVERALL
select sum( decode( sign( TARGETREACH ) , -1 , 0 , 0 , 0 , 1 , 1 ) )
from ( "your query here" );
Use HAVING Clause
SELECT dd.actual_date, dd.week_number_overall, sf.branch_id, AVG(sf.overtarget_qnt) AS targetreach
FROM sales_fact sf, date_dim dd
WHERE dd.date_id = sf.date_id
AND dd.week_number_overall BETWEEN 88-2 AND 88
AND sf.branch_id = 1
GROUP BY dd.actual_date, branch_id, dd.week_number_overall
HAVING AVG(sf.overtarget_qnt)>0
ORDER BY dd.actual_date ASC;
Using decode() and sign(), you can get both the positive count and the negative count:
drop table test;
create table test (
key number(5),
value number(5));
insert into test values ( 1, -9 );
insert into test values ( 2, -8 );
insert into test values ( 3, 10 );
insert into test values ( 4, 12 );
insert into test values ( 5, -9 );
insert into test values ( 6, 8 );
insert into test values ( 7, 51 );
commit;
select sig , count ( sig ) from
(
select key, ( (decode( sign( value ) , -1 , '-ve' , 0 , 'zero' , 1 , '+ve' ) ) ) sig
from test
)
group by sig
SIG COUNT(SIG)
---- ----------------------
+ve 4
-ve 3