How to divide the obtained sum of data - SQL

I need to get the sum of the records matching the key; the result of it is my TOTALQUANTITY.
Then I need to split it into packages; the maximum package size is 5.
So when I get a result (TOTALQUANTITY) equal to 13, I should get something like this:
package | ofpackages | totalquantity | quantityofpackage
1       | 3          | 13            | 5
2       | 3          | 13            | 5
3       | 3          | 13            | 3
my attempt:
SELECT
count(*) as TOTALQUANTITY,
get_token(data1) data1,
get_token(data2) data2,
get_token(data3) data3,
floor((row_number() over (partition BY get_token(data2) order by get_token(data2) ) - 1 ) / 5) package
FROM
--my working code
WHERE
--my working conditions
GROUP BY
get_token(data1),
get_token(data2),
get_token(data3)
ORDER BY
get_token(data2)
TOTALQUANTITY gives the correct value, but the package unfortunately doesn't :(
How should it be? How can I get the remaining values?

Use MOD() to determine the remainder: if it is zero, output the division; otherwise add the package amount to the total quantity, then divide and round.
Using a derived table makes this a little easier, I believe:
SELECT
TOTALQUANTITY,
get_token(data1) data1,
get_token(data2) data2,
get_token(data3) data3,
package,
case when MOD(TOTALQUANTITY,package) = 0
then TOTALQUANTITY / package
else round((TOTALQUANTITY + package)/package,0)
end as quantityofpackage
FROM (
SELECT
count(*) as TOTALQUANTITY,
get_token(data1) data1,
get_token(data2) data2,
get_token(data3) data3,
floor((row_number() over (partition BY get_token(data2) order by get_token(data2) ) - 1 ) / 5) package
FROM
--my working code
WHERE
--my working conditions
GROUP BY
get_token(data1),
get_token(data2),
get_token(data3)
) derived
ORDER BY
data2
Note you don't specify which DBMS you are using, so I have assumed it supports MOD(); if not, it will have an equivalent, e.g. % in MS SQL Server.
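The arithmetic behind this answer is just ceiling division plus a remainder. As a sanity check, here is a minimal Python sketch of the intended result (the helper name `split_into_packages` is mine, not part of either query):

```python
import math

def split_into_packages(total_quantity, max_size=5):
    """Split total_quantity into packages of at most max_size items.

    Returns (package, ofpackages, quantityofpackage) tuples, matching
    the columns in the question's expected output.
    """
    num_packages = math.ceil(total_quantity / max_size)
    result = []
    for package in range(1, num_packages + 1):
        # quantity left after the earlier, full packages
        remaining = total_quantity - (package - 1) * max_size
        result.append((package, num_packages, min(max_size, remaining)))
    return result
```

For TOTALQUANTITY = 13 this yields the rows (1, 3, 5), (2, 3, 5), (3, 3, 3) from the question.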

I'm not sure from the example code how that's supposed to work. Suppose there's a table which contains packages and quantities.
drop table if exists #package_quantities;
go
create table #package_quantities(
package int not null,
quantity int not null);
go
insert #package_quantities(package, quantity) values
(10, 10),
(10, 1),
(10, 2),
(20, 5),
(20, 9),
(30, 12),
(40, 20);
To generate new rows the query uses a tally TVF called fnNumbers. The CTE pack_cte sums quantities for each package. The query then splits the packages into bundles with a maximum quantity of 5. Each subpackage has a unique sequence number 1, 2, 3...
fnNumbers tvf
create function [dbo].[fnNumbers](
@zero_or_one bit,
@n bigint)
returns table with schemabinding as return
with n(n) as (select null from (values (1),(2),(3),(4)) n(n))
select 0 n where @zero_or_one = 0
union all
select top(@n) row_number() over(order by (select null)) n
from n na, n nb, n nc, n nd, n ne, n nf, n ng, n nh,
n ni, n nj, n nk, n nl, n nm, n np, n nq, n nr;
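The trick in fnNumbers is that cross-joining a 4-row set 16 times gives up to 4**16 candidate rows, and TOP(@n) stops the expansion at exactly the count requested. Functionally it just returns a number sequence; a hedged Python equivalent (`fn_numbers` is a hypothetical stand-in, not the TVF itself):

```python
def fn_numbers(zero_or_one, n):
    """Sketch of fnNumbers: optionally prepend 0, then emit 1..n.

    The SQL version materializes rows by cross-joining a 4-row set
    16 times (up to 4**16 rows) and capping the scan with TOP(n);
    the observable output is just this sequence.
    """
    capacity = 4 ** 16            # capacity of the 16-way cross join
    head = [] if zero_or_one else [0]
    return head + list(range(1, min(n, capacity) + 1))
```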
Query
with pack_cte(package, sum_quantity) as (
select package, sum(quantity)
from #package_quantities
group by package)
select p.package, p.sum_quantity, p_calc.num_packages, fn.n subpackage,
case when p_calc.num_packages<>fn.n or (p_calc.num_packages=fn.n and p.sum_quantity%5=0)
then 5 else p.sum_quantity%5 end quantityofpackage
from pack_cte p
cross apply (select ceiling(p.sum_quantity/5.0) num_packages) p_calc
cross apply fnNumbers(1, p_calc.num_packages) fn;
Output
package sum_quantity num_packages subpackage quantityofpackage
10 13 3 1 5
10 13 3 2 5
10 13 3 3 3
20 14 3 1 5
20 14 3 2 5
20 14 3 3 4
30 12 3 1 5
30 12 3 2 5
30 12 3 3 2
40 20 4 1 5
40 20 4 2 5
40 20 4 3 5
40 20 4 4 5
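The bundle rule this query implements (every bundle holds 5 items except possibly the last, which holds the remainder unless the total divides evenly) can be mirrored procedurally. A Python sketch with a hypothetical `bundle` helper, only to illustrate the arithmetic:

```python
import math

def bundle(package_sums, max_size=5):
    """package_sums: {package: sum_quantity}.

    Emit (package, sum_quantity, num_packages, subpackage,
    quantityofpackage) rows: each bundle holds max_size items except
    possibly the last, which holds the remainder unless the total
    divides evenly.
    """
    rows = []
    for package, total in sorted(package_sums.items()):
        num = math.ceil(total / max_size)
        for sub in range(1, num + 1):
            last = (sub == num)
            qty = total % max_size if last and total % max_size else max_size
            rows.append((package, total, num, sub, qty))
    return rows
```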


Is it possible to use an aggregate function over partition by as a case condition in SQL?

The problem statement is to calculate the median from a table that has two columns: one specifying a number and the other specifying the frequency of that number.
For example, table "Numbers":
Num  Freq
1    3
2    3
The median needs to be found for the flattened array with values:
1,1,1,2,2,2
Query:
with ct1 as
(select num,frequency, sum(frequency) over(order by num) as sf from numbers o)
select case when count(num) over(order by num) = 1 then num
when count(num) over (order by num) > 1 then sum(num)/2 end median
from ct1 b where sf <= (select max(sf)/2 from ct1) or (sf-frequency) <= (select max(sf)/2 from ct1)
Is it not possible to use count(num) over(order by num) as the condition in the case statement?
Find the relevant row / 2 rows based on the accumulated frequencies, and take the average of num.
The example and Fiddle will also show you the computations leading to the result.
If you already know that num is unique, rowid can be removed from the ORDER BY clauses.
with
t1 as
(
select t.*
,nvl(sum(freq) over (order by num,rowid rows between unbounded preceding and 1 preceding),0) as freq_acc_sum_1
,sum(freq) over (order by num, rowid) as freq_acc_sum_2
,sum(freq) over () as freq_sum
from t
)
select t1.*
,case
when freq_sum/2 between freq_acc_sum_1 and freq_acc_sum_2
then 'V'
end as relevant_record
from t1
order by num, rowid
Fiddle
Example:

ID  NUM  FREQ  FREQ_ACC_SUM_1  FREQ_ACC_SUM_2  FREQ_SUM  RELEVANT_RECORD
7   8    1     0               1               18
5   10   1     1               2               18
1   29   3     2               5               18
6   31   1     5               6               18
3   33   2     6               8               18
4   41   1     8               9               18        V
9   49   2     9               11              18        V
2   52   1     11              12              18
8   56   3     12              15              18
10  92   3     15              18              18

MEDIAN: 45
Fiddle for 1M records
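The relevant-record condition above (half the total frequency lying between the exclusive and inclusive running sums) is easy to restate procedurally. A Python sketch, with `weighted_median` as a hypothetical name:

```python
def weighted_median(rows):
    """rows: (num, freq) pairs.

    A row is relevant when half the total frequency falls between its
    exclusive running sum (freq_acc_sum_1) and its inclusive running
    sum (freq_acc_sum_2); the median is the average of num over the
    relevant rows.
    """
    rows = sorted(rows)
    total = sum(freq for _, freq in rows)
    relevant, acc = [], 0
    for num, freq in rows:
        acc_1, acc_2 = acc, acc + freq   # running sums before / after this row
        if acc_1 <= total / 2 <= acc_2:
            relevant.append(num)
        acc = acc_2
    return sum(relevant) / len(relevant)
```

Run against the example data it marks 41 and 49 and returns 45, matching the table.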
You can find the one (or two) middle value(s) and then average:
SELECT AVG(num) AS median
FROM (
SELECT num,
freq,
SUM(freq) OVER (ORDER BY num) AS cum_freq,
(SUM(freq) OVER () + 1)/2 AS median_freq
FROM table_name
)
WHERE cum_freq - freq < median_freq
AND median_freq < cum_freq + 1
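The cum_freq / median_freq filter translates almost line for line into procedural code. A Python sketch (`median_from_freq` is hypothetical; division is kept non-integer, as in Oracle):

```python
def median_from_freq(rows):
    """rows: (num, freq) pairs.

    Keep rows whose cumulative-frequency window covers the middle
    position(s) -- the cum_freq / median_freq filter from the query --
    then average num over the kept rows.
    """
    rows = sorted(rows)
    total = sum(freq for _, freq in rows)
    median_freq = (total + 1) / 2       # non-integer division, as in Oracle
    picked, cum_freq = [], 0
    for num, freq in rows:
        cum_freq += freq
        if cum_freq - freq < median_freq < cum_freq + 1:
            picked.append(num)
    return sum(picked) / len(picked)
```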
Or, expand the values using a LATERAL join to a hierarchical query and then use the MEDIAN function:
SELECT MEDIAN(num) AS median
FROM table_name t
CROSS JOIN LATERAL (
SELECT LEVEL
FROM DUAL
WHERE freq > 0
CONNECT BY LEVEL <= freq
)
Which, for the sample data:
CREATE TABLE table_name (Num, Freq) AS
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 3 FROM DUAL;
Outputs:
MEDIAN
1.5
(Note: for your sample data there are 6 items, an even number, so the MEDIAN will be halfway between the values of the 3rd and 4th items; halfway between 1 and 2 = 1.5.)
db<>fiddle here

Get running number series on the basis of one column value

I have a table like
SELECT str AS company, item#, Qty
FROM temp_on_hand
WHERE qty > 2
ORDER BY Item# ASC
The output of that query is:
company item#  Qty
1       746    3
5       9526   1
1       14096  1
2       14096  2
3       14095  2
I want to generate a new item# (with the addition of '-0001' to the current item#) on the basis of the Qty column, i.e. if the Qty column has value 3 for company 1 then the query should return three rows like:
company NewItem# Item# Qty
1 746-00001 746 3
1 746-00002 746 3
1 746-00003 746 3
5 9526-00001 9526 1
1 14096-00001 14096 1
2 14096-00002 14096 2
2 14096-00003 14096 2
3 14095-00001 14095 3
3 14095-00002 14095 3
3 14095-00003 14095 3
...
The table structure is like this:
CREATE TABLE temp_on_hand(str INT, item# INT,Qty INT)
INSERT INTO temp_on_hand VALUES (1, 746, 3)
INSERT INTO temp_on_hand VALUES (5, 9526, 1)
INSERT INTO temp_on_hand VALUES (1, 14096, 1)
INSERT INTO temp_on_hand VALUES (2, 14096, 2)
INSERT INTO temp_on_hand VALUES (3, 14095, 2)
ALTER TABLE temp_on_hand ADD new_item# VARCHAR
similarly for upcoming values.
Thanks in advance
You can join to a numbers table.
You can use a real one, but I will use Itzik Ben-Gan's on-the-fly tally table (it's actually better as an inline table-valued function).
EDIT: According to your comments, you don't actually need the numbering from Nums; you want a fresh overall numbering, so you can just select from L1:
WITH
L0 AS ( SELECT 1 AS c
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1)) AS D(c) ),
L1 AS ( SELECT 1 AS c FROM L0 A, L0 B ) -- add more cross joins for more rows
SELECT
t.str AS company,
CONCAT(t.item#, FORMAT(ROW_NUMBER() OVER (ORDER BY t.item# ASC), '-0000')) AS NewItem#,
t.item#,
t.Qty
FROM temp_on_hand t
CROSS APPLY(
SELECT TOP (t.Qty) c
FROM L1
) n
WHERE t.qty > 2
ORDER BY t.Item# ASC;
db<>fiddle
The key to good performance using a numbers-table-based approach is to make sure the row expansion is constrained by a row goal, i.e. SELECT TOP(n); without a row goal the full Cartesian product is used. Also, the FORMAT function is known to be slow.
You could try something like this.
[EDIT]: The sequence assigned to the NewItem# does not reset per item#; it keeps increasing across all rows.
drop TABLE if exists #temp_on_hand;
go
CREATE TABLE #temp_on_hand(str INT, item# INT,Qty INT)
INSERT INTO #temp_on_hand VALUES
(1, 746, 3),
(5, 9526, 1),
(1, 14096, 1),
(2, 14096, 2),
(3, 14095, 3);
with
l as (select 1 n from (values (1),(1),(1),(1),(1),(1),(1),(1)) as v(n))
select *, concat_ws('-', item#,
right('00000'+cast(row_number() over (order by (select null)) as varchar(5)), 5)) NewItem#
from #temp_on_hand toh
cross apply (select top (toh.Qty) 1 n
from l l1, l l2,l l3, l l4) tally;
str item# Qty n NewItem#
1 746 3 1 746-00001
1 746 3 1 746-00002
1 746 3 1 746-00003
5 9526 1 1 9526-00004
1 14096 1 1 14096-00005
2 14096 2 1 14096-00006
2 14096 2 1 14096-00007
3 14095 3 1 14095-00008
3 14095 3 1 14095-00009
3 14095 3 1 14095-00010
A recursive CTE is a simple method:
WITH cte as (
SELECT str AS company, item#, Qty, 1 as n
FROM temp_on_hand
WHERE qty > 2
UNION ALL
SELECT company, item#, Qty, n + 1
FROM cte
WHERE n < Qty
)
SELECT company, CONCAT(item#, '-', format(n, '0000')) as newitem#, item#, qty
FROM cte;
Note that if qty exceeds 100, you will also need option (maxrecursion 0).
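What the recursion does is simply repeat each row Qty times with a counter n. A Python sketch of the same expansion (`expand_rows` is a hypothetical helper; the qty > 2 filter is omitted):

```python
def expand_rows(rows):
    """rows: (company, item, qty) triples.

    Repeat each row qty times with a counter n = 1..qty, and build
    newitem# as the item number plus a zero-padded n -- the same
    expansion the recursive CTE performs.
    """
    out = []
    for company, item, qty in rows:
        for n in range(1, qty + 1):
            out.append((company, f"{item}-{n:04d}", item, qty))
    return out
```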
EDIT:
If you want the numbering within a company, you can use window functions and a cumulative sum:
WITH cte as (
SELECT str AS company, item#, Qty, 1 as n,
SUM(qty) OVER (PARTITION BY str ORDER BY item#) - qty as start_qty
FROM temp_on_hand
WHERE qty > 2
UNION ALL
SELECT company, item#, Qty, n + 1, start_qty
FROM cte
WHERE n < Qty
)
SELECT company,
CONCAT(item#, '-', format(n + start_qty, '0000')) as newitem#, item#, qty
FROM cte;

Break up running sum into maximum group size / length

I am trying to break up a running (ordered) sum into groups of a max value. When I implement the following example logic...
IF OBJECT_ID(N'tempdb..#t') IS NOT NULL DROP TABLE #t
SELECT TOP (ABS(CHECKSUM(NewId())) % 1000) ROW_NUMBER() OVER (ORDER BY name) AS ID,
LEFT(CAST(NEWID() AS NVARCHAR(100)),ABS(CHECKSUM(NewId())) % 30) AS Description
INTO #t
FROM sys.objects
DECLARE @maxGroupSize INT
SET @maxGroupSize = 100
;WITH t AS (
SELECT
*,
LEN(Description) AS DescriptionLength,
SUM(LEN(Description)) OVER (/*PARTITION BY N/A */ ORDER BY ID) AS [RunningLength],
SUM(LEN(Description)) OVER (/*PARTITION BY N/A */ ORDER BY ID)/@maxGroupSize AS GroupID
FROM #t
)
SELECT *, SUM(DescriptionLength) OVER (PARTITION BY GroupID) AS SumOfGroup
FROM t
ORDER BY GroupID, ID
I am getting groups that are larger than the maximum group size (length) of 100.
A recursive common table expression (rcte) would be one way to resolve this.
Sample data
Limited set of fixed sample data.
create table data
(
id int,
description nvarchar(20)
);
insert into data (id, description) values
( 1, 'qmlsdkjfqmsldk'),
( 2, 'mldskjf'),
( 3, 'qmsdlfkqjsdm'),
( 4, 'fmqlsdkfq'),
( 5, 'qdsfqsdfqq'),
( 6, 'mds'),
( 7, 'qmsldfkqsjdmfqlkj'),
( 8, 'qdmsl'),
( 9, 'mqlskfjqmlkd'),
(10, 'qsdqfdddffd');
Solution
For every recursion step, evaluate (r.group_running_length + len(d.description) <= @group_max_length) in a case expression to decide whether the previous group must be extended or a new group must be started.
The group target size is set to 40 to better fit the sample data.
declare @group_max_length int = 40;
with rcte as
(
select d.id,
d.description,
len(d.description) as description_length,
len(d.description) as running_length,
1 as group_id,
len(d.description) as group_running_length
from data d
where d.id = 1
union all
select d.id,
d.description,
len(d.description),
r.running_length + len(d.description),
case
when r.group_running_length + len(d.description) <= @group_max_length
then r.group_id
else r.group_id + 1
end,
case
when r.group_running_length + len(d.description) <= @group_max_length
then r.group_running_length + len(d.description)
else len(d.description)
end
from rcte r
join data d
on d.id = r.id + 1
)
select r.id,
r.description,
r.description_length,
r.running_length,
r.group_id,
r.group_running_length,
gs.group_sum
from rcte r
cross apply ( select max(r2.group_running_length) as group_sum
from rcte r2
where r2.group_id = r.group_id ) gs -- group sum
order by r.id;
Result
Contains both the running group length as well as the group sum for every row.
id description description_length running_length group_id group_running_length group_sum
-- ---------------- ------------------ -------------- -------- -------------------- ---------
1 qmlsdkjfqmsldk 14 14 1 14 33
2 mldskjf 7 21 1 21 33
3 qmsdlfkqjsdm 12 33 1 33 33
4 fmqlsdkfq 9 42 2 9 39
5 qdsfqsdfqq 10 52 2 19 39
6 mds 3 55 2 22 39
7 qmsldfkqsjdmfqlkj 17 72 2 39 39
8 qdmsl 5 77 3 5 28
9 mqlskfjqmlkd 12 89 3 17 28
10 qsdqfdddffd 11 100 3 28 28
Fiddle to see things in action (includes random data version).
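The recursion implements a greedy scan: extend the current group while the running length stays within the limit, otherwise start a new group at the current row. A Python sketch of that rule (`group_by_max_length` is a hypothetical name):

```python
def group_by_max_length(descriptions, max_length=40):
    """Greedy grouping: extend the current group while the summed
    description length stays within max_length, otherwise start a
    new group at the current row (matches the rcte's case logic).

    Returns (description, group_id, group_running_length) per row.
    """
    group_id, group_len = 1, 0
    out = []
    for d in descriptions:
        if group_len + len(d) <= max_length:
            group_len += len(d)
        else:
            group_id, group_len = group_id + 1, len(d)
        out.append((d, group_id, group_len))
    return out
```

On the sample data this reproduces the group_id column 1,1,1,2,2,2,2,3,3,3 from the result above.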

Grouping results based on time diff in SQL

I have results like this
TimeDiffMin | OrdersCount
10 | 2
12 | 5
09 | 6
20 | 15
27 | 11
I would like the following
TimeDiffMin | OrdersCount
05 | 0
10 | 8
15 | 5
20 | 15
25 | 0
30 | 11
So you can see that I want a grouping for every 5 minutes, showing the total order count in each 5-minute window, e.g. 0-5 minutes 0 orders, 5-10 minutes 8 orders.
Any help would be appreciated.
current query:
SELECT TimeDifferenceInMinutes, count(OrderId) NumberOfOrders FROM (
SELECT AO.OrderID, AO.OrderDate, AON.CreatedDate AS CancelledDate, DATEDIFF(minute, AO.OrderDate, AON.CreatedDate) AS TimeDifferenceInMinutes
FROM
(SELECT OrderID, OrderDate FROM AC_Orders) AO
JOIN
(SELECT OrderID, CreatedDate FROM AC_OrderNotes WHERE Comment LIKE '%has been cancelled.') AON
ON AO.OrderID = AON.OrderID
WHERE DATEDIFF(minute, AO.OrderDate, AON.CreatedDate) <= 100 AND AO.OrderDate >= '2016-12-01'
) AS Temp1
GROUP BY TimeDifferenceInMinutes
Now, if you are open to a TVF:
I use this UDF to create dynamic date/time ranges. You supply the range and the increment.
Declare @YourTable table (TimeDiffMin int,OrdersCount int)
Insert Into @YourTable values
(10, 2),
(12, 5),
(09, 6),
(20,15),
(27,11)
Select TimeDiffMin = cast(R2 as int)
,OrdersCount = isnull(sum(OrdersCount),0)
From (Select R1=RetVal,R2=RetVal+5 From [dbo].[udf-Range-Number](0,25,5)) A
Left Join (
-- Your Complicated Query
Select * From @YourTable
) B on TimeDiffMin >= R1 and TimeDiffMin<R2
Group By R1,R2
Order By 1
Returns
TimeDiffMin OrdersCount
5 0
10 6
15 7
20 0
25 15
30 11
The UDF if interested
CREATE FUNCTION [dbo].[udf-Range-Number] (@R1 money,@R2 money,@Incr money)
Returns Table
Return (
with cte0(M) As (Select cast((@R2-@R1)/@Incr as int)),
cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (Select M from cte0) Row_Number() over (Order By (Select NULL)) From cte1 a,cte1 b,cte1 c,cte1 d,cte1 e,cte1 f,cte1 g,cte1 h )
Select RetSeq=1,RetVal=@R1 Union All Select N+1,(N*@Incr)+@R1
From cte2
)
-- Max 100 million observations
-- Select * from [dbo].[udf-Range-Number](0,4,0.25)
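Functionally, the UDF emits @R1, @R1+@Incr, @R1+2*@Incr, ... up to @R2. A Python sketch of the same sequence (`range_number` is a hypothetical stand-in for the TVF):

```python
def range_number(r1, r2, incr):
    """Emit r1, r1 + incr, r1 + 2*incr, ... up to r2 inclusive,
    mirroring the RetVal column of udf-Range-Number."""
    count = int((r2 - r1) / incr)       # same cast the cte0 CTE performs
    return [r1 + n * incr for n in range(count + 1)]
```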
You can do this by using a derived table to first build up your time-difference windows and then joining from it to sum up all the orders that fall within each window.
declare @t table(TimeDiffMin int
,OrdersCount int
);
insert into @t values
(10, 2)
,(12, 5)
,(09, 6)
,(20,15)
,(27,11);
declare @Increment int = 5; -- Set your desired time windows here.
with n(n)
as
( -- Select 10 rows to start with:
select n from(values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) as n(n)
),n2 as
( -- CROSS APPLY these 10 rows to get 10*10=100 rows we can use to generate incrementing ROW_NUMBERs. Use more CROSS APPLYs to get more rows:
select (row_number() over (order by (select 1))-1) * @Increment as StartMin
,(row_number() over (order by (select 1))) * @Increment as EndMin
from n -- 10 rows
cross apply n n2 -- 100 rows
--cross apply n n3 -- 1000 rows
--cross apply n n4 -- 10000 rows
)
select m.EndMin as TimeDiffMin
,isnull(sum(t.OrdersCount),0) as OrdersCount
from n2 as m
left join @t t
on(t.TimeDiffMin >= m.StartMin
and t.TimeDiffMin < m.EndMin
)
where m.EndMin <= 30 -- Filter as required
group by m.EndMin
order by m.EndMin
Query result:
TimeDiffMin OrdersCount
5 0
10 6
15 7
20 0
25 15
30 11
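Both answers build the same thing: a zero-filled axis of 5-minute windows keyed by the window end, left-joined to the order counts. A Python sketch of that bucketing (`bucket_orders` is a hypothetical helper):

```python
def bucket_orders(rows, increment=5, max_min=30):
    """rows: (time_diff_min, orders_count) pairs.

    Build a zero-filled axis of [start, start + increment) windows
    keyed by the window end, then sum the counts falling into each
    window -- the shape both answers produce with a left join.
    """
    buckets = {end: 0 for end in range(increment, max_min + 1, increment)}
    for diff, count in rows:
        end = (diff // increment + 1) * increment   # window containing diff
        if end in buckets:
            buckets[end] += count
    return sorted(buckets.items())
```

For the sample data it returns (5, 0), (10, 6), (15, 7), (20, 0), (25, 15), (30, 11), matching both answers' output.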

How to get change points in an Oracle select query?

How can I select change points from this data set
1 0
2 0
3 0
4 100
5 100
6 100
7 100
8 0
9 0
10 0
11 100
12 100
13 0
14 0
15 0
I want this result
4 7 100
11 12 100
This query, based on the analytic functions lag() and lead(), gives the expected output:
select id, nid, point
from (
select id, point, p1, lead(id) over (order by id) nid
from (
select id, point,
decode(lag(point) over (order by id), point, 0, 1) p1,
decode(lead(point) over (order by id), point, 0, 2) p2
from test)
where p1<>0 or p2<>0)
where p1=1 and point<>0
SQLFiddle
Edit: You may want to change line 3 in case there is only one row for a changing point:
...
select id, point, p1,
case when p1=1 and p2=2 then id else lead(id) over (order by id) end nid
...
It would be simple to use the ROW_NUMBER analytic function together with MIN and MAX.
This is a frequently asked question about finding intervals/series of consecutive values while skipping the gaps. I like the name given to it, the Tabibitosan method, by Aketi Jyuuzou.
For example,
SQL> SELECT MIN(A),
2 MAX(A),
3 b
4 FROM
5 ( SELECT a,b, a-Row_Number() over(order by a) AS rn FROM t WHERE b <> 0
6 )
7 GROUP BY rn,
8 b
9 ORDER BY MIN(a);
MIN(A) MAX(A) B
---------- ---------- ----------
4 7 100
11 12 100
SQL>
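The Tabibitosan trick works because, within a run of consecutive ids, the id minus its rank among the non-zero rows is constant, so that difference (together with the value, matching GROUP BY rn, b) serves as a group key. A Python sketch (`change_points` is a hypothetical name):

```python
def change_points(rows):
    """rows: (id, value) pairs ordered by id.

    Group consecutive non-zero rows: id minus the row's rank among
    the non-zero rows is constant within a run, so (that difference,
    value) is the group key, as in GROUP BY rn, b. Emit
    (first_id, last_id, value) per run.
    """
    nonzero = [(i, v) for i, v in rows if v != 0]
    groups = {}
    for rank, (i, v) in enumerate(nonzero):
        groups.setdefault((i - rank, v), []).append(i)   # Tabibitosan key
    return sorted((ids[0], ids[-1], v) for (_, v), ids in groups.items())
```

For the question's data this yields (4, 7, 100) and (11, 12, 100), the expected result.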