ORACLE SQL : SUM in first row with condition

I'm struggling to obtain the following result in a query. Here is my table:
Line_num   Line_typ   Cost
1000       6          0
2000       7          5000
3000       7          3000
4000       7          2000
5000       6          0
6000       9          3000
7000       7          2000
8000       1          2000
What I want as a result is this:
Line_num   Line_typ   Cost
1000       6          10000 (0+5000+3000+2000)
5000       6          5000 (0+3000+2000)
8000       1          2000
Basically, I want to display only the rows with line_typ in (6, 1), but sum the cost column of all the other lines in between.
Thank you for your ideas and help !!
Ivan

Tim Biegeleisen's answer is excellent, but it needs some intermediate results and Oracle-specific features. The reason can be seen when one writes the query in minimal SQL:
Select the rows with line_typ in (1, 6), ordered by line_num.
For each such row, sum the cost over a second pass of the table for rows whose line_num n is not smaller, but
only where (third pass) there is no row with line_typ in (1, 6) between the current row and n.
Basic complexity is O(N³); with indexes, a bit less (an index sketch follows the query).
So:
select h.line_num, h.line_typ,
       (select sum(cost)
          from tbl g
         where g.line_num >= h.line_num
           and not exists (select *
                             from tbl
                            where line_num > h.line_num
                              and line_typ in (1, 6)
                              and line_num <= g.line_num)
       ) as cost
  from tbl h
 where h.line_typ in (1, 6)
 order by h.line_num
(Given for non-Oracle searchers.)
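Regarding "with indices a bit less": a composite index covering all three columns would let both the SUM subquery and the NOT EXISTS probe be answered from the index alone. A minimal sketch (the index name is my own choice, not from the original answer):
create index tbl_num_typ_cost_ix on tbl (line_num, line_typ, cost);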

This is a play on a gaps and islands problem. Each island begins upon encountering a Line_typ value of 1 or 6, and ends right before the record containing the next 1 or 6 value. We can use an analytic SUM() over a flag to find the groups, then report the sums of cost (a sketch of the intermediate flag and group values follows the query below).
WITH cte AS (
    SELECT t.*, CASE WHEN Line_typ IN (1, 6) THEN 1 ELSE 0 END AS flag
    FROM yourTable t
),
cte2 AS (
    SELECT t.*, SUM(flag) OVER (ORDER BY Line_num) AS grp
    FROM cte t
),
cte3 AS (
    SELECT t.*, MIN(Line_num) OVER (PARTITION BY grp) AS Min_Line_num
    FROM cte2 t
)
SELECT MIN(Line_num) AS Line_num,
       MAX(CASE WHEN Line_num = Min_Line_num THEN Line_typ END) AS Line_typ,
       SUM(Cost) AS Cost
FROM cte3 t
GROUP BY grp
ORDER BY MIN(t.Line_num);
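To see how the islands are formed, here is a minimal inspection sketch of my own (reusing the first two CTEs above and assuming the same yourTable data) that exposes the flag and running group number per row:
WITH cte AS (
    SELECT t.*, CASE WHEN Line_typ IN (1, 6) THEN 1 ELSE 0 END AS flag
    FROM yourTable t
),
cte2 AS (
    SELECT t.*, SUM(flag) OVER (ORDER BY Line_num) AS grp
    FROM cte t
)
SELECT Line_num, Line_typ, Cost, flag, grp
FROM cte2
ORDER BY Line_num;
-- For the sample data, flag comes out as 1,0,0,0,1,0,0,1 and grp as 1,1,1,1,2,2,2,3,
-- so every Line_typ 1/6 row opens a new group, which the final GROUP BY grp then sums.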

From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row pattern matching:
SELECT *
FROM   table_name
MATCH_RECOGNIZE(
  ORDER BY line_num
  MEASURES
    FIRST(line_num) AS line_num,
    FIRST(line_typ) AS line_typ,
    SUM(cost)       AS total_cost
  PATTERN (match_type other_type*)
  DEFINE
    match_type AS line_typ IN (6, 1),
    other_type AS line_typ NOT IN (6, 1)
)
Which, for the sample data:
CREATE TABLE table_name (line_num, line_typ, cost) AS
SELECT 1000, 6, 0 FROM DUAL UNION ALL
SELECT 2000, 7, 5000 FROM DUAL UNION ALL
SELECT 3000, 7, 3000 FROM DUAL UNION ALL
SELECT 4000, 7, 2000 FROM DUAL UNION ALL
SELECT 5000, 6, 0 FROM DUAL UNION ALL
SELECT 6000, 9, 3000 FROM DUAL UNION ALL
SELECT 7000, 7, 2000 FROM DUAL UNION ALL
SELECT 8000, 1, 2000 FROM DUAL;
Outputs:
LINE_NUM   LINE_TYP   TOTAL_COST
    1000          6        10000
    5000          6         5000
    8000          1         2000

Related

How to group items by rows

I wanted to group the number of shops, but I am not sure what the syntax is to create a group that does not exist in the table. I want the output to be like this:
Group | Number of items
1 | XXX
2 | XXX
Group 1 would have the shops with fewer than 10 items, while group 2 would have those with more than 10. I have the data for the number of items, but I need to create the group number, and I am not sure how. Thank you in advance.
The way I have tried:
SELECT
case when b.item_stock < 10 then count(a.shopid) else null end as Group_1,
case when b.item_stock >= 10 or b.item_stock < 100 then count(a.shopid) else null end as Group_2
FROM `table_a` a
left join `table_b` b
on a.id= b.id
where registration_time between "2017-01-01" and "2017-05-31"
group by b.item_stock
LIMIT 1000
Below is the BigQuery way of doing this
select 'group_' || range_bucket(item_stock, [0, 10]) as group_id,
count(*) as number_of_items
from your_table
group by group_id
If applied to dummy data like:
with your_table as (
select 'ID001' shop_id, 40 item_stock union all
select 'ID002', 20 union all
select 'ID003', 30 union all
select 'ID004', 9 union all
select 'ID005', 44 union all
select 'ID006', 22 union all
select 'ID007', 28 union all
select 'ID008', 35 union all
select 'ID009', 20 union all
select 'ID010', 4 union all
select 'ID011', 5 union all
select 'ID012', 45 union all
select 'ID013', 29 union all
select 'ID014', 8 union all
select 'ID015', 40 union all
select 'ID016', 26 union all
select 'ID017', 31 union all
select 'ID018', 48 union all
select 'ID019', 45 union all
select 'ID020', 13
)
the output is:
group_id   number_of_items
group_1    4
group_2    16
The benefit of this solution is that it is easily extended to any number of ranges just by adding more boundaries to the range_bucket function -
for example: range_bucket(item_stock, [0, 10, 50, 100, 1000])
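For instance, a sketch of that extended version against the same dummy data (the extra boundaries are purely illustrative, and the bucket number is cast to a string explicitly here):
select concat('group_', cast(range_bucket(item_stock, [0, 10, 50, 100, 1000]) as string)) as group_id,
       count(*) as number_of_items
from your_table
group by group_id
order by group_id
-- With these boundaries, stocks of 0-9 land in group_1, 10-49 in group_2, 50-99 in group_3,
-- 100-999 in group_4, and 1000 or more in group_5 (a negative stock would fall into group_0).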
From the example you've shared, you were close to solving this one; you just need to tweak your case statement.
The case statement in your query is splitting the groups into two separate columns, whereas you need these groups in one column with the totals to the right.
Consider the below change to your select statement.
case when b.item_stock < 10 then "Group_1"
when b.item_stock >= 10 then "Group_2" else null end as Groups,
count(a.shop_id) as total
Schema (MySQL v5.7)
CREATE TABLE id (
`shop_id` VARCHAR(5),
`item_stock` INTEGER
);
INSERT INTO id
(`shop_id`, `item_stock`)
VALUES
('ID001', '40'),
('ID002', '20'),
('ID003', '30'),
('ID004', '9'),
('ID005', '44'),
('ID006', '22'),
('ID007', '28'),
('ID008', '35'),
('ID009', '20'),
('ID010', '4'),
('ID011', '5'),
('ID012', '45'),
('ID013', '29'),
('ID014', '8'),
('ID015', '40'),
('ID016', '26'),
('ID017', '31'),
('ID018', '48'),
('ID019', '45'),
('ID020', '13');
Query #1
SELECT
case when item_stock < 10 then "Group_1"
when item_stock >= 10 then "Group_2" else null end as Groups,
count(shop_id) as total
FROM id group by 1;
Groups    total
Group_1   4
Group_2   16
Tom

Analytic Function: ROW_NUMBER( )

I have a table "Invoice"
id integer Primary key
customer_id Integer
total Number (*,2)
The query is to display all customer_id, total, and a running serial number for each customer, with the alias name 'SNO'. The records should be displayed in ascending order based on customer_id and then by SNO.
Hints:
Analytic Function: ROW_NUMBER( )
Analytic Clause: query_partition_clause and order_by_clause.
I wrote the below query:
Select customer_id,
total,
ROW_NUMBER( ) OVER (PARTITION BY customer_id ORDER BY customer_id ASC) AS "SNO"
from invoice;
But the result is failing. What is it that I am missing? Also, what is meant by "the records should be displayed in ascending order based on the customer_id and then by SNO"?
The result I am getting is as below:
CUSTOMER_ID TOTAL SNO
1 70000 1
2 250000 1
2 560000 2
3 200000 1
3 45000 2
4 475000 1
5 50000 1
5 10000 2
6 600000 1
6 90000 2
Expected result is :
CUSTOMER_ID TOTAL SNO
1 70000 1
2 250000 1
2 560000 2
3 45000 1
3 200000 2
4 475000 1
5 10000 1
5 50000 2
6 600000 1
6 90000 2
TOTAL Column data is not matching.
You're close; you probably need to order the row_number by id (assuming it's ascending based on time):
Select customer_id,
total,
ROW_NUMBER( ) OVER (PARTITION BY customer_id ORDER BY id ASC) AS "SNO"
from invoice
order by customer_id, "SNO" -- should be the default anyway (but there's no guarantee)
I have not found any order by clause in your query. Another issue: in which order do you want to generate SNO, by id or by total? That will impact your ordering.
with cte as
(
select 1 cid, 70000 total from dual
union all
select 2, 250000 from dual
union all
select 2, 560000 from dual
union all
select 3, 200000 from dual
union all
select 3, 45000 from dual
union all
select 4, 475000 from dual
union all
select 5, 50000 from dual
union all
select 5, 10000 from dual
union all
select 6, 600000 from dual
union all
select 6, 90000 from dual
)
Select cid, total,
       ROW_NUMBER() OVER (PARTITION BY cid ORDER BY total) AS "SNO"
from cte
order by cid, SNO

SQL oracle group list number

Please help me: group list number
A new group starts when the values descend. You can find the rows where a group starts using lag(). Then do a cumulative sum:
select t.*,
       1 + sum(case when prev_col2 < col2 then 0 else 1 end) over (order by col1) as grp
from (select t.*,
             lag(col2) over (order by col1) as prev_col2
      from t
     ) t;
In Oracle 12.1 and above, this is a simple application of the match_recognize clause:
with
inputs ( column1, column2 ) as (
select 1, 1000 from dual union all
select 2, 2000 from dual union all
select 3, 3000 from dual union all
select 4, 6000 from dual union all
select 5, 7500 from dual union all
select 6, 0 from dual union all
select 7, 500 from dual union all
select 8, 600 from dual union all
select 9, 900 from dual union all
select 10, 2300 from dual union all
select 11, 4700 from dual union all
select 12, 40 from dual union all
select 13, 1000 from dual union all
select 14, 2000 from dual union all
select 15, 4000 from dual
)
-- End of simulated inputs (not part of the solution).
-- SQL query begins BELOW THIS LINE. Use actual table and column names.
select column1, column2, column3
from inputs
match_recognize(
order by column1
measures match_number() as column3
all rows per match
pattern ( a b* )
define b as column2 >= prev(column2)
)
order by column1 -- If needed.
;
OUTPUT:
COLUMN1 COLUMN2 COLUMN3
---------- ---------- ----------
1 1000 1
2 2000 1
3 3000 1
4 6000 1
5 7500 1
6 0 2
7 500 2
8 600 2
9 900 2
10 2300 2
11 4700 2
12 40 3
13 1000 3
14 2000 3
15 4000 3
You can use a window function to mark the points where column_2 restarts and a cumulative sum to get the desired result:
Select column_1,
Column_2,
Sum(flag) over (order by column_1) as column_3
From (
Select t.*,
Case when column_2 < lag(column_2,1,0) over (order by column_1) then 1 else 0 end as flag
From your_table t
) t;

Split a column into multiple columns

select distinct account_num from account order by account_num;
The above query gave the below result
account_num
1
2
4
7
12
18
24
37
45
59
I want to split the account_num column into tuples of three account_nums, like (1,2,4); (7,12,18); (24,37,45); (59). The last tuple has only one entry because there are no more account_nums left. Now I want a query to output the min and max of each tuple. (Please observe that the max of one tuple is less than the min of the next tuple.) The desired output is shown below:
1 4
7 18
24 45
59 59
Edit: I have explained my requirement in the best way I could
You can use the example below as a starting point; it is based only on the information you have provided so far. For further documentation, you can consult Oracle's analytic functions docs:
with src as ( -- create the source data
  select 1 col from dual union
  select 2 from dual union
  select 4 from dual union
  select 7 from dual union
  select 12 from dual union
  select 18 from dual union
  select 24 from dual union
  select 37 from dual union
  select 45 from dual union
  select 59 from dual
)
select
  col,
  decode(col_2, 0, max_col, col_2) col_2 -- for the last row, fall back to the maximum value
from (
  select
    col,
    lead(col, 2, 0) over (order by col) col_2, -- the value from two rows ahead
    max(col) over () max_col,                  -- the max value, used for the last row in the result
    rownum rn                                  -- the row number, used to pick the start of each group
  from src
) where mod(rn - 1, 3) = 0 -- keep only the first row of every group of three
This is another solution.
SELECT *
  FROM (SELECT DISTINCT MIN(val) over(PARTITION BY gr) min_,
                        MAX(val) over(PARTITION BY gr) max_
          FROM (SELECT val,
                       decode(trunc(rn / 3), rn / 3, rn / 3, ceil(rn / 3)) gr
                  FROM (SELECT val,
                               row_number() over(ORDER BY val) rn
                          FROM (SELECT DISTINCT account_num val FROM account ORDER BY account_num))))
 ORDER BY min_
UPDATED
Solution without analytic function.
SELECT MIN(val) min_,
       MAX(val) max_
  FROM (SELECT val,
               ceil(rn / 3) gr
          FROM (SELECT val,
                       rownum rn
                  FROM A_DEL_ME))
 GROUP BY gr
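One caveat of my own: rownum here reflects whatever order the rows happen to come back in, so if the groups of three must follow ascending val, a sketch with an explicit ordering subquery (against the same assumed table) would be:
SELECT MIN(val) min_,
       MAX(val) max_
  FROM (SELECT val,
               ceil(rn / 3) gr
          FROM (SELECT val,
                       rownum rn
                  FROM (SELECT val FROM A_DEL_ME ORDER BY val)))
 GROUP BY gr
 ORDER BY min_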
Please add more information on what you want to do. What is the connection between account_number 1 and number 4, 7 and 18? Is there any? If not, why would you want to split this into two columns and what is the rule for splitting it?
With what you have posted, you could do something like this:
select 1 as account_num, 4 as account_num1 from dual
union all select 7 as account_num, 18 as account_num1 from dual
...
and so on, but I don't see the use for this.

SQL Grouping by Ranges

I have a data set that has timestamped entries over various sets of groups.
Timestamp -- Group -- Value
---------------------------
1 -- A -- 10
2 -- A -- 20
3 -- B -- 15
4 -- B -- 25
5 -- C -- 5
6 -- A -- 5
7 -- A -- 10
I want to sum these values by the Group field, but parsed as it appears in the data. For example, the above data would result in the following output:
Group -- Sum
A -- 30
B -- 40
C -- 5
A -- 15
I do not want this, which is all I've been able to come up with on my own so far:
Group -- Sum
A -- 45
B -- 40
C -- 5
Using Oracle 11g, this is what I've hobbled together so far. I know that this is wrong, but I'm hoping I'm at least on the right track with RANK(). In the real data, entries with the same group could be 2 timestamps apart, or 100; there could be one entry in a group, or 100 consecutive. It does not matter, I need them separated.
WITH SUB_Q AS
(SELECT K_ID
, GRP
, VAL
-- GET THE RANK FROM TIMESTAMP TO SEPARATE GROUPS WITH SAME NAME
, RANK() OVER(PARTITION BY K_ID ORDER BY TMSTAMP) AS RNK
FROM MY_TABLE
WHERE K_ID = 123)
SELECT T1.K_ID
, T1.GRP
, SUM(CASE
WHEN T1.GRP = T2.GRP THEN
T1.VAL
ELSE
0
END) AS TOTAL_VALUE
FROM SUB_Q T1 -- MAIN VALUE
INNER JOIN SUB_Q T2 -- TIMSTAMP AFTER
ON T1.K_ID = T2.K_ID
AND T1.RNK = T2.RNK - 1
GROUP BY T1.K_ID
, T1.GRP
Is it possible to group in this way? How would I go about doing this?
I approach this problem by defining a group which is the difference of two row_number() values:
select group, sum(value)
from (select t.*,
             (row_number() over (order by timestamp) -
              row_number() over (partition by group order by timestamp)
             ) as grp
      from my_table t
     ) t
group by group, grp
order by min(timestamp);
The difference of two row numbers is constant for adjacent values.
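To illustrate with a sketch of my own (run against the TEST table created in the next answer, which uses quoted column names), exposing both row numbers and their difference:
select "Timestamp", "Group", Value,
       row_number() over (order by "Timestamp") as rn_overall,
       row_number() over (partition by "Group" order by "Timestamp") as rn_in_group,
       row_number() over (order by "Timestamp")
         - row_number() over (partition by "Group" order by "Timestamp") as grp
from TEST
order by "Timestamp";
-- grp comes out as 0, 0, 2, 2, 4, 3, 3: constant within each run of equal Group values,
-- which is why grouping by the pair (Group, grp) separates the A-A, B-B, C and A-A islands.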
A solution using LAG and windowed analytic functions:
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( "Timestamp", "Group", Value ) AS
SELECT 1, 'A', 10 FROM DUAL
UNION ALL SELECT 2, 'A', 20 FROM DUAL
UNION ALL SELECT 3, 'B', 15 FROM DUAL
UNION ALL SELECT 4, 'B', 25 FROM DUAL
UNION ALL SELECT 5, 'C', 5 FROM DUAL
UNION ALL SELECT 6, 'A', 5 FROM DUAL
UNION ALL SELECT 7, 'A', 10 FROM DUAL;
Query 1:
WITH changes AS (
SELECT t.*,
CASE WHEN LAG( "Group" ) OVER ( ORDER BY "Timestamp" ) = "Group" THEN 0 ELSE 1 END AS hasChangedGroup
FROM TEST t
),
groups AS (
SELECT "Group",
VALUE,
SUM( hasChangedGroup ) OVER ( ORDER BY "Timestamp" ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS grp
FROM changes
)
SELECT "Group",
SUM( VALUE )
FROM Groups
GROUP BY "Group", grp
ORDER BY grp
Results:
| Group | SUM(VALUE) |
|-------|------------|
| A | 30 |
| B | 40 |
| C | 5 |
| A | 15 |
This is the typical "start_of_group" problem (see here: https://timurakhmadeev.wordpress.com/2013/07/21/start_of_group/)
In your case, it would be as follows:
with t as (
select 1 timestamp, 'A' grp, 10 value from dual union all
select 2, 'A', 20 from dual union all
select 3, 'B', 15 from dual union all
select 4, 'B', 25 from dual union all
select 5, 'C', 5 from dual union all
select 6, 'A', 5 from dual union all
select 7, 'A', 10 from dual
)
select min(timestamp), grp, sum(value) sum_value
from (
      select t.*,
             sum(start_of_group) over (order by timestamp) grp_id
      from (
            select t.*,
                   case when grp = lag(grp) over (order by timestamp) then 0 else 1 end start_of_group
            from t
           ) t
     )
group by grp_id, grp
order by min(timestamp);