Calculating average cost using WITH RECURSIVE SQL (Postgres 9.1) - sql

The closest thing I found by searching this site is this:
Inventory Average Cost Calculation in SQL
Unfortunately it is Oracle-specific, since it uses the MODEL clause.
So let's begin.
There are two tables:
- one that holds inventory transactions, and
- one that holds the latest inventory valuation
I am trying to build an inventory valuation report as of a certain date, using the average costing method.
Doing it the normal way, calculating from the beginning up to that date, gives a response time that grows with the amount of history.
Imagine calculating over five years' worth of data (and thousands of different inventory items).
It would take a considerable amount of time (and my company is not Silicon Valley grade, meaning a 2-core CPU and only 8 GB of RAM),
so I am calculating it backwards: from the latest (current) valuation, backtracking to that specific date.
(Every month the accounting department checks the data, so the calculation only ever deals with one month's worth of data,
which means consistent, unchanging performance.)
I have merged the two tables into one in the script below:
create table test3 ( rn integer, amt numeric, qty integer, oqty integer);
insert into test3 (rn,amt,qty,oqty) values (0,2260038.16765793,8,0);
insert into test3 (rn,amt,qty,oqty) values (1,1647727.2727,3,0);
insert into test3 (rn,amt,qty,oqty) values (2,2489654.75326715,0,1);
insert into test3 (rn,amt,qty,oqty) values (3,2489654.75326715,0,1);
insert into test3 (rn,amt,qty,oqty) values (4,1875443.6364,1,0);
insert into test3 (rn,amt,qty,oqty) values (5,1647727.2727,3,0);
insert into test3 (rn,amt,qty,oqty) values (6,3012987.01302857,0,1);
insert into test3 (rn,amt,qty,oqty) values (7,3012987.01302857,0,1);
select * from test3; -- already sorted descending, so rn=1 is the newest transaction
rn amt qty oqty
0 2260038.168 8 0 --> this is the current average
1 1647727.273 3 0
2 2489654.753 0 1
3 2489654.753 0 1
4 1875443.636 1 0
5 1647727.273 3 0
6 3012987.013 0 1
7 3012987.013 0 1
with recursive
runsum (id,amt,qty,oqty,sqty,avg) as
(select data.id, data.amt, data.qty, data.oqty, data.sqty, data.avg
from (
select rn as id,amt,qty, oqty,
sum(case when rn=0 then qty else
case when oqty=0 then qty*-1
else oqty end end) over (order by rn) as sqty, lag(amt) over (order by rn) as avg
from test3 ) data
),
trans (id,amt,qty,oqty,sqty,prevavg,avg) as
(select id,amt,qty,oqty, sqty,avg,avg
from runsum
union
select runsum.id,trans.amt,trans.qty, trans.oqty, trans.sqty, lag(trans.avg) over (order by 1),
case when runsum.sqty=0 then runsum.amt else
((trans.prevavg*(runsum.sqty+trans.qty))-(runsum.amt*trans.qty)+(trans.prevavg*trans.oqty))/(runsum.sqty+trans.oqty)
end
from runsum join trans using (id))
select *
from trans
where prevavg is null and avg is not null
order by id;
The result is supposed to be like this
rn amt qty oqty sum avg
1 1647727.273 3 0 5 2627424.705
2 2489654.753 0 1 6 2627424.705
3 2489654.753 0 1 7 2627424.705
4 1875443.636 1 0 6 2752754.883
5 1647727.273 3 0 3 3857782.493
6 3012987.013 0 1 4 3857782.493
7 3012987.013 0 1 5 3857782.493
but instead I get this
id amt qty oqty sqty avg
1 1647727.273 3 0 5 2627424.705
2 2489654.753 0 1 6 2627424.705
3 2489654.753 0 1 7 2627424.705
5 1647727.273 3 0 3 3607122.137 --> id=4 is missing, which throws off the calculation, and id=6 in turn disappears too
7 3012987.013 0 1 5 3607122.137
I am flabbergasted.
Where is the mistake?
Thank you for your kind help.
EDITED
Average costing method, backtracking (given the current average, calculate the average as of the last transaction, and so on back through the nth transaction):
Avg(n) = ((Avg(n-1) * (Cum Qty(n) + In Qty(n))) - (In Amount(n) * In Qty(n)) + (Avg(n-1) * Out Qty(n))) / (Cum Qty(n) + Out Qty(n))
The cumulative quantity for backtracking subtracts incoming quantities and adds outgoing ones.
So if the current quantity is 8 and the previous transaction was an incoming quantity of 3, the cumulative quantity for that transaction is 5.
To calculate the average one transaction before the last, we plug the current average into that transaction's calculation.
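As a cross-check on the backtracking formula, here is a minimal Python sketch (not part of the original query) that walks the test3 sample rows one transaction at a time. The name backtrack_averages and the tuple layout are illustrative assumptions; the zero-quantity fallback mirrors the CASE expression in the query, and the divisor follows the query (cumulative quantity plus out quantity).

```python
# Rows copied from test3: (rn, amt, in_qty, out_qty), newest first.
# rn=0 is not a transaction: it holds the current average and on-hand quantity.
ROWS = [
    (0, 2260038.16765793, 8, 0),
    (1, 1647727.2727, 3, 0),
    (2, 2489654.75326715, 0, 1),
    (3, 2489654.75326715, 0, 1),
    (4, 1875443.6364, 1, 0),
    (5, 1647727.2727, 3, 0),
    (6, 3012987.01302857, 0, 1),
    (7, 3012987.01302857, 0, 1),
]


def backtrack_averages(rows):
    """Return [(rn, cum_qty, avg)], walking backwards from the current average."""
    _, avg, cum_qty, _ = rows[0]
    result = []
    for rn, amt, in_qty, out_qty in rows[1:]:
        # Undo the transaction: receipts are subtracted, issues are added back.
        cum_qty = cum_qty - in_qty + out_qty
        if cum_qty == 0:
            avg = amt  # same fallback as the CASE expression in the query
        else:
            avg = (avg * (cum_qty + in_qty) - amt * in_qty + avg * out_qty) / (
                cum_qty + out_qty
            )
        result.append((rn, cum_qty, avg))
    return result


for rn, cum_qty, avg in backtrack_averages(ROWS):
    print(rn, cum_qty, round(avg, 3))
```

Each step depends only on the previous step's average, which is why the calculation is naturally sequential.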
CURRENT ANSWER, with @kordirko's help
with recursive
runsum (id,amt,qty,oqty,sqty,avg) as
(select data.id, data.amt, data.qty, data.oqty, data.sqty, data.avg
from (
select rn as id,amt,qty, oqty,
sum(case when rn=0 then qty else
case when oqty=0 then qty*-1
else oqty end end) over (order by rn) as sqty, lag(amt) over (order by rn) as avg
from test3 ) data
),
counter (maximum) as
(select count(rn)
from test3
),
trans (n, id,amt,qty,oqty,sqty,prevavg,avg) as
(select 0 n, id,amt,qty,oqty, sqty,avg,avg
from runsum
union
select trans.n+1, runsum.id,trans.amt,trans.qty, trans.oqty, trans.sqty,
lag(trans.avg) over (order by 1),
case when runsum.sqty=0 then runsum.amt else
((trans.prevavg*(runsum.sqty+trans.qty))-(runsum.amt*trans.qty)+(trans.prevavg*trans.oqty))/(runsum.sqty+trans.oqty)
end
from runsum join trans using (id)
where trans.n<(select maximum*2 from counter))
select *
from trans
where prevavg is null and avg is not null
order by id;

This is probably not the "best" answer to your question, but while struggling with this tricky problem I stumbled, just by accident, on an ugly workaround :).
with recursive
trans (n, id, amt, qty, oqty, sqty, prevavg, avg) as (
select 0 n, id, amt, qty, oqty, sqty, avg, avg
from runsum
union
select trans.n + 1, runsum.id, trans.amt, trans.qty, trans.oqty, trans.sqty,
lag(trans.avg) over (order by 1),
case when runsum.sqty=0 then runsum.amt
else
((trans.prevavg *(runsum.sqty+trans.qty))-(runsum.amt*trans.qty)+(trans.prevavg*trans.oqty))/(runsum.sqty+trans.oqty)
end
from runsum
join trans using (id)
where trans.n < 20
)
select *
from trans
where prevavg is null and avg is not null
order by id;
The source of the problem seems to be the UNION clause in the recursive query.
See http://www.postgresql.org/docs/8.4/static/queries-with.html
The documentation says that with UNION (as opposed to UNION ALL), rows that duplicate any previous result row are discarded while the recursive query is evaluated.
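To make the difference concrete, here is a small Python sketch that mimics how a recursive query accumulates rows under UNION versus UNION ALL. It is a simplified model, not PostgreSQL's implementation; evaluate_recursive, the step function and the iteration cap are all hypothetical names chosen for illustration.

```python
def evaluate_recursive(seed, step, distinct, max_iterations=10):
    """Apply `step` to the working table repeatedly, like WITH RECURSIVE does."""
    result = []

    def keep(rows):
        if not distinct:
            return list(rows)  # UNION ALL keeps every row
        fresh = []
        for row in rows:
            # UNION drops rows that duplicate any earlier result row.
            if row not in result and row not in fresh:
                fresh.append(row)
        return fresh

    working = keep(seed)
    result.extend(working)
    for _ in range(max_iterations):  # safety cap, similar to "where trans.n < 20"
        working = keep(step(row) for row in working)
        if not working:
            break
        result.extend(working)
    return result


def step(row):
    # Hypothetical recursive term: the value cycles 1 -> 2 -> 4 -> 1 -> ...
    return ((row[0] * 2) % 7,)


with_union = evaluate_recursive([(1,)], step, distinct=True)
with_union_all = evaluate_recursive([(1,)], step, distinct=False, max_iterations=5)
print(with_union)
print(with_union_all)
```

Under UNION the chain stops as soon as a row repeats an earlier one. That may be why adding the n counter column helps in the workaround: it makes every generated row distinct, so none is discarded.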

Related

BigQuery SQL - Create New Column Based on the Max Value from Multiple Columns

I have a table that contains info about customers and the amount of each type of food they purchased. I want to create a new column holding the type of food they purchased most frequently. Is there an efficient way to do this?
I tried using CASE WHEN with one-to-one comparisons, but it got very tedious.
Sample data:
Cust_ID apple_type1 apple_type2 apple_type3 apple_type4 apple_type5 apple_type6
1 2 0 0 3 6 1
2 0 0 0 1 0 1
3 4 2 1 1 0 1
4 5 5 5 0 0 0
5 0 0 0 0 0 0
--WANT
Cust_ID freq_apple_type_buy
1 type5
2 type4 and type6
3 type1
4 type1 and type2 and type3
5 unknown
Consider below approach
select Cust_ID, if(count(1) = any_value(all_count), 'unknown', string_agg(type, ' and ')) freq_apple_type_buy
from (
select *, count(1) over(partition by Cust_ID) all_count
from (
select Cust_ID, replace(arr[offset(0)], 'apple_', '') type,cast(arr[offset(1)] as int64) value
from data t,
unnest(split(translate(to_json_string((select as struct * except(Cust_ID) from unnest([t]))), '{}"', ''))) kv,
unnest([struct(split(kv, ':') as arr)])
)
where true qualify 1 = rank() over(partition by Cust_ID order by value desc)
)
group by Cust_ID
if applied to sample data in your question - output is
This uses UNPIVOT to turn your columns into rows. It then uses RANK() to assign each row a rank, which means that if multiple rows tie on quantity, they share the same rank.
It then selects only the products with rank = 1 (possibly multiple rows, if multiple products are tied for first place).
WITH
normalised_and_ranked AS
(
SELECT
cust_id,
product,
qty,
RANK() OVER (PARTITION BY cust_id ORDER BY qty DESC) AS product_rank,
ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY qty DESC) AS product_row
FROM
yourData
UNPIVOT(
qty FOR product IN (apple_type1, apple_type2, apple_type3, apple_type4, apple_type5, apple_type6)
)
)
SELECT
cust_id,
CASE WHEN qty = 0 THEN NULL ELSE product END AS product,
CASE WHEN qty = 0 THEN NULL ELSE qty END AS qty
FROM
normalised_and_ranked
WHERE
(product_rank = 1 AND qty > 0)
OR
(product_row = 1)
Edit: fudge added to ensure row of nulls returned if all qty are 0.
(Normally I'd just not return a row for such customers.)
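For readers who want to check the expected result outside BigQuery, here is a minimal Python sketch of the same idea: find the top quantity per customer and keep every type tied for it. The names TYPES, ROWS and most_frequent_types are illustrative, and treating "all quantities are zero" as 'unknown' is an assumption taken from the desired output.

```python
TYPES = ["type1", "type2", "type3", "type4", "type5", "type6"]

# Sample data from the question: Cust_ID -> quantities for apple_type1..apple_type6.
ROWS = {
    1: [2, 0, 0, 3, 6, 1],
    2: [0, 0, 0, 1, 0, 1],
    3: [4, 2, 1, 1, 0, 1],
    4: [5, 5, 5, 0, 0, 0],
    5: [0, 0, 0, 0, 0, 0],
}


def most_frequent_types(quantities):
    """Join the names of all types tied for the highest quantity."""
    top = max(quantities)
    if top == 0:
        return "unknown"  # assumption: nothing purchased means no favourite
    return " and ".join(t for t, q in zip(TYPES, quantities) if q == top)


RESULT = {cust: most_frequent_types(q) for cust, q in ROWS.items()}
for cust, label in RESULT.items():
    print(cust, label)
```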

Can I count number of SUBSEQUENT rows with values larger than current row?

Row Input Output Output Explanation
1 14.93 6 6 because input value on rows 2 to 7 are smaller than row 1
2 9.74 0 0 because input value on row 3 is larger than row 2
3 12.89 0 0 because input value on row 4 is larger than row 3
4 13.09 2 2 because input value on rows 5 to 6 are smaller than row 4
5 7.84 0 0 because input value on row 6 is larger than row 5
6 12.81 0 0 because input value on row 7 is larger than row 6
7 13.15 0 0 because input value on row 8 is larger than row 7
8 18.15 0 0 because input value in row 8 is last in series
Please can you help me with defining the SQL server code for the logic in the table?
I have tried a number of different approaches including recursive CTEs, CAST, LEAD… OVER..., etc. My SQL skills are not up to this challenge, which seems to be easy to describe in words, but difficult to code!
Please note that the logic in the last row is different from the rest.
MAX output value should be 244.
declare @t table
(
Row int,
Input decimal(5,2)
);
insert into @t(Row, Input)
values
(1, 14.93),
(2, 9.74),
(3, 12.89),
(4, 13.09),
(5, 7.84),
(6, 12.81),
(7, 13.15),
(8, 18.15);
select *,
case
when lead(a.Input) over(order by a.Row) < a.Input then
(
select count(*) - count(xyz)
from
(
select case when b.Input < a.Input then null else b.Input end as xyz
from @t as b
where b.Row > a.Row
) as c
)
else 0
end as Output
from @t as a;
I don't think this can easily be done with window functions. We need to iterate for each original row, while keeping track of the original value.
I would use a recursive query here:
with
data as (select t.*, row_number() over(order by row) rn from mytable t),
cte as (
select row, rn, input, 0 as output from data
union all
select c.row, d.rn, c.input, c.output + 1
from cte c
inner join data d on d.rn = c.rn + 1 and d.input < c.input
)
select input, max(output) as output
from cte
group by row, input
order by row
For each row, the logic is to iteratively check the following rows. If the following value is smaller than the one on the original row, we increment the output counter; if it is not, the recursion stops for that row. Then all that is left to do is keep the greatest counter per original row.
Demo on DB Fiddle:
input | output
----: | -----:
14.93 | 6
9.74 | 0
12.89 | 0
13.09 | 2
7.84 | 0
12.81 | 0
13.15 | 0
18.15 | 0
You can do this with apply:
with t as (
select t.*, row_number() over (order by row) as seqnum,
1 + count(*) over () as cnt
from mytable t
)
select t.*, coalesce(coalesce(t2.min_seqnum, t.cnt) - t.seqnum - 1, 0) as output
from t outer apply
(select min(t2.seqnum) as min_seqnum
from t t2
where t2.row > t.row and t2.input > t.input
) t2
order by row;
The idea is to find the next row that is bigger than the current row. The slight complication (why cnt is needed) is in case there is no larger row.
Here is a db<>fiddle.
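The "find the next larger row" idea can also be sketched outside SQL. The Python below is an illustration only, using a stack of indices that are still waiting for a larger value; count_following_smaller is a hypothetical name. It assumes, as the apply query does, that when no later row is larger, all remaining rows are counted.

```python
def count_following_smaller(values):
    """For each position, count the rows before the next strictly larger value."""
    n = len(values)
    output = [0] * n
    waiting = []  # indices whose next larger value has not been seen yet
    for j, value in enumerate(values):
        while waiting and values[waiting[-1]] < value:
            i = waiting.pop()
            output[i] = j - i - 1  # rows strictly between i and its next larger row
        waiting.append(j)
    for i in waiting:
        output[i] = n - i - 1  # no larger row follows: count everything after i
    return output


INPUTS = [14.93, 9.74, 12.89, 13.09, 7.84, 12.81, 13.15, 18.15]
OUTPUTS = count_following_smaller(INPUTS)
print(OUTPUTS)
```

Each index is pushed and popped at most once, so this runs in a single pass over the data.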
You can use sub-query as follows:
WITH CTE AS
(SELECT T.*,
ROW_NUMBER() OVER (ORDER BY ROW) AS RN
FROM YOUR_TABLE T)
SELECT C.ROW, C.INPUT,
COALESCE((SELECT MIN(CC.RN) - C.RN - 1
FROM CTE CC
WHERE CC.INPUT > C.INPUT AND CC.RN > C.RN)
, 0) AS OUTPUT
FROM CTE C;

T-SQL Select all combinations of ranges that meet aggregate criteria

Problem restated per comments
Say we have the following integer id's and counts...
id count
1 0
2 10
3 0
4 0
5 0
6 1
7 9
8 0
We also have a variable @id_range int.
Given a value for @id_range, how can we select all combinations of id ranges, without using while loops or cursors, that meet the following criteria?
1) No two ranges in a combination can overlap (min and max of each range are inclusive)
2) sum(count) for a combination of ranges must equal sum(count) of the initial data set (20 in this case)
3) Only include ranges where sum(count) > 0
The simplest case would be when @id_range = max(id) - min(id), or 7 given the above data. In this case, there's only one solution:
minId maxId count
---------------------
1 8 20
But if @id_range = 1 for example, there would be 4 possible solutions:
Solution 1:
minId maxId count
---------------------
1 2 10
5 6 1
7 8 9
Solution 2:
minId maxId count
---------------------
1 2 10
6 7 10
Solution 3:
minId maxId count
---------------------
2 3 10
5 6 1
7 8 9
Solution 4:
minId maxId count
---------------------
2 3 10
6 7 10
The end goal is to identify which solutions have the fewest number of ranges (solutions 2 and 4 in the above example, where @id_range = 1).
This solution does not list all possible combinations; it just tries to group the data into the smallest possible number of rows.
Hopefully it covers all possible scenarios.
-- create the sample table
declare @sample table
(
id int,
[count] int
)
-- insert some sample data
insert into @sample select 1, 0
insert into @sample select 2, 10
insert into @sample select 3, 0
insert into @sample select 4, 0
insert into @sample select 5, 0
insert into @sample select 6, 1
insert into @sample select 7, 9
insert into @sample select 8, 0
-- the @id_range
declare @id_range int = 1
-- the query
; with
cte as
(
-- this cte identified those rows with count > 0 and group them together
-- sign(0) gives 0, sign(+value) gives 1
-- basically it is same as case when [count] > 0 then 1 else 0 end
select *,
grp = row_number() over (order by id)
- dense_rank() over(order by sign([count]), id)
from @sample
),
cte2 as
(
-- for each grp in cte, assign a sub group (grp2). each sub group
-- contains @id_range number of rows
select *,
grp2 = (row_number() over (partition by grp order by id) - 1)
/ (@id_range + 1)
from cte
where count > 0
)
select MinId = min(id),
MaxId = min(id) + @id_range,
[count] = sum(count)
from cte2
group by grp, grp2
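For comparison, here is a Python sketch of a plain greedy covering, which is a different technique from the query above: walk the ids with count > 0 in order and open a new range only when the current id falls outside the last range. The name fewest_ranges is hypothetical. Like the query, it reports MaxId as MinId plus the range width, so MaxId can exceed the largest id in the data.

```python
def fewest_ranges(counts, id_range):
    """counts: {id: count}. Return [(min_id, max_id, total)] covering all ids with count > 0."""
    ranges = []
    for i in sorted(k for k, c in counts.items() if c > 0):
        if ranges and i <= ranges[-1][1]:
            # Still inside the last range: add this id's count to it.
            lo, hi, total = ranges[-1]
            ranges[-1] = (lo, hi, total + counts[i])
        else:
            # Outside every range so far: start a new one at this id.
            ranges.append((i, i + id_range, counts[i]))
    return ranges


SAMPLE = {1: 0, 2: 10, 3: 0, 4: 0, 5: 0, 6: 1, 7: 9, 8: 0}
print(fewest_ranges(SAMPLE, 1))
```

With the sample data and a range width of 1 this yields the two ranges listed as Solution 4 in the question.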

SQL: How to select some (but not all) records in a query multiple times

We have a bunch of records and we assign a random number to each record whose value is between 1 and the total number of records in the following manner:
SELECT personID, ROW_NUMBER()
OVER(ORDER BY NEWID()) as RowNumber
FROM folks
Easy like pie. Let's assume that a LOWER (edit: NOT higher, sorry!) number is better for Customer's purposes, and that they like how the 'random' element here works. Trouble is, customer now says 'some people are special and we want them to get three chances, and then save their best result as their number.'
Since we don't hand out numbers serially but all at once, the approach here seems to be to select special people three times in this query, and then grab their lowest row number.
This is similar to, but one step more involved than this question (and others like it):
Select Records multiple times from table
I don't want to select ALL records three times; but I do want to do everything in one go; that is, I can't assign special people numbers, and then assign everyone else numbers - it has to be one query.
How would I construct a JOIN (and/or a CTE) to model this, assuming we can rely on a field like isSpecial = 1 on each record?
How would I then grab the 'lowest number' (i.e. first row_number appearance of that record) from the result in my SELECT statement?
Platform: Microsoft SQL 2012
SAMPLE DATA (including isSpecial in the output query just for demonstration's sake) - also, we want the minimum number here for business purposes, not the maximum
personID isSpecial
1 1
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
Current output:
SELECT personID, isSpecial, ROW_NUMBER()
OVER(ORDER BY NEWID()) as RowNumber
FROM folks
personID RowNumber isSpecial
8 1 0
2 2 0
10 3 0
1 4 1
9 5 0
3 6 0
4 7 0
6 8 0
5 9 0
7 10 0
DESIRED OUTPUT:
personID MinRowNumber isSpecial rowNumber1 rowNumber2 rowNumber3
8 1 0 1
2 2 0 2
1 3 1 4 7 3
9 5 0 5
3 6 0 6
6 8 0 8
5 9 0 9
7 10 0 10
4 11 0 11
10 12 0 12
You could do this using a tally table and some aggregation. Something along these lines.
WITH
cteTally(N) AS (select n from (values (1),(2),(3))dt(n))
select personID
, MAX(RowNumber)
from
(
SELECT personID
, ROW_NUMBER() OVER(ORDER BY NEWID()) as RowNumber
FROM folks f
join cteTally t on t.N <= case when f.IsSpecial = 1 then 3 else 1 end
) x
group by x.personID
--EDIT--
You stated you might want all rows not just the MAX one. Here is how you could do that.
WITH
cteTally(N) AS (select n from (values (1),(2),(3))dt(n))
SELECT personID
, ROW_NUMBER() OVER(ORDER BY NEWID()) as RowNumber
FROM folks f
join cteTally t on t.N <= case when f.IsSpecial = 1 then 3 else 1 end
I think you can use the UNION approach, but only apply NEWID() once:
create table folks (personID int, isSpecial int)
insert into folks values (1,1);
insert into folks values (2,0);
insert into folks values (3,0);
insert into folks values (4,0);
insert into folks values (5,0);
insert into folks values (6,0);
insert into folks values (7,0);
insert into folks values (8,0);
insert into folks values (9,0);
insert into folks values (10,0);
select * from folks;
select
personID,
min(rownumber) as min_rownumber
from
(SELECT
personID,
ROW_NUMBER() OVER(ORDER BY NEWID()) as RowNumber
FROM
(select personID from folks
union all
select personID from folks where isSpecial = 1
union all
select personID from folks where isSpecial = 1) u
) r
group by
personID
A correct way to solve the task is this.
Suppose we have O ordinary people plus S special people. Each ordinary person has one chance, and each special person has 3 chances. We should generate O + S*3 random numbers evenly distributed in the range [1 .. O+S*3], then order all people according to the numbers they got. Special people will appear 3 times in this ordered list; ordinary people will appear only once.
Here is the query that does it. The code for creating the table with sample data is shown below, in my first variant. CTE_Numbers is just a table with three numbers. If you want to give special people a different number of chances, alter this query. CTE lists all ordinary people once plus all special people three times. CTE_rn assigns a random number to each row, so each special person gets three random numbers. As each special person has three rows in CTE_rn, the final query groups by PersonID and keeps only one row per special person, with the minimum number. To get a better understanding of how it works, examine the intermediate results of CTE_rn.
WITH
CTE_Numbers
AS
(
SELECT Number
FROM (VALUES (1),(2),(3)) AS N(Number)
)
,CTE
AS
(
-- list ordinary people only once
SELECT PersonID,IsSpecial
FROM @T
WHERE IsSpecial = 0
UNION ALL
-- list each special person three times
SELECT PersonID,IsSpecial
FROM @T CROSS JOIN CTE_Numbers
WHERE IsSpecial = 1
)
,CTE_rn
AS
(
SELECT
PersonID,IsSpecial
,ROW_NUMBER() OVER(ORDER BY CRYPT_GEN_RANDOM(4)) AS rn
FROM CTE
)
SELECT
PersonID,IsSpecial
,MIN(rn) AS FinalRank
FROM CTE_rn
GROUP BY PersonID,IsSpecial
ORDER BY FinalRank;
result
PersonID IsSpecial FinalRank
9 0 1
2 0 2
1 1 3
10 0 4
8 0 5
5 0 6
3 0 7
7 0 9
4 0 10
6 0 12
Note how FinalRank has values from 1 to 12 (not 10) and values 8 and 11 are not shown. The special person had them: the special person got random numbers 3, 8 and 11, and the final result contains only the minimum of these three.
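The same "one ticket per chance" scheme can be simulated in a few lines of Python. This is only a sketch of the idea, not a translation of the query: assign_ranks and the seeded generator are assumptions for illustration, and random.shuffle stands in for ordering by CRYPT_GEN_RANDOM.

```python
import random


def assign_ranks(people, chances=3, rng=None):
    """people: [(person_id, is_special)]. Return [(person_id, final_rank)] sorted by rank."""
    rng = rng or random.Random()
    tickets = []
    for person_id, is_special in people:
        # Special people get `chances` tickets, everyone else gets one.
        tickets.extend([person_id] * (chances if is_special else 1))
    rng.shuffle(tickets)  # a ticket's position in the shuffled list is its random number
    best = {}
    for position, person_id in enumerate(tickets, start=1):
        best.setdefault(person_id, position)  # the first position seen is the minimum
    return sorted(best.items(), key=lambda item: item[1])


PEOPLE = [(1, True)] + [(person_id, False) for person_id in range(2, 11)]
RANKS = assign_ranks(PEOPLE, rng=random.Random(42))
for person_id, rank in RANKS:
    print(person_id, rank)
```

With one special person among ten, there are 12 tickets, so ranks run from 1 to 12 with two values unused, as in the result above.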
The first variant. It works, but results are skewed.
Very straightforward. Generate random row numbers three times, join them together, and for ordinary people pick the result of the first run; for special people pick the minimum of the three runs.
Nobody promised any particular distribution of random numbers for NEWID, so you'd better not use it in this case. In this example I used CRYPT_GEN_RANDOM.
I put the same query for getting random numbers in three separate CTEs, rather than using the same CTE in the join, to make sure that it is calculated three times. If you use a single CTE, the server may be smart enough to calculate the random numbers only once rather than three times, and this is not what we need here. We do need 30 calls to CRYPT_GEN_RANDOM here.
DECLARE @T TABLE (PersonID int, IsSpecial bit);
INSERT INTO @T(PersonID, IsSpecial) VALUES
(1 , 1),
(2 , 0),
(3 , 0),
(4 , 0),
(5 , 0),
(6 , 0),
(7 , 0),
(8 , 0),
(9 , 0),
(10, 0);
WITH
CTE1
AS
(
SELECT PersonID, IsSpecial,
ROW_NUMBER() OVER(ORDER BY CRYPT_GEN_RANDOM(4)) AS rn
FROM @T
)
,CTE2
AS
(
SELECT PersonID, IsSpecial,
ROW_NUMBER() OVER(ORDER BY CRYPT_GEN_RANDOM(4)) AS rn
FROM @T
)
,CTE3
AS
(
SELECT PersonID, IsSpecial,
ROW_NUMBER() OVER(ORDER BY CRYPT_GEN_RANDOM(4)) AS rn
FROM @T
)
,CTE_All
AS
(
SELECT
CTE1.PersonID
,CTE1.IsSpecial
,CTE1.rn AS rn1
,CTE2.rn AS rn2
,CTE3.rn AS rn3
,CA.MinRN
FROM
CTE1
INNER JOIN CTE2 ON CTE2.PersonID = CTE1.PersonID
INNER JOIN CTE3 ON CTE3.PersonID = CTE1.PersonID
CROSS APPLY
(
SELECT MIN(A.rn) AS MinRN
FROM (VALUES (CTE1.rn), (CTE2.rn), (CTE3.rn)) AS A(rn)
) AS CA
)
SELECT
PersonID
,IsSpecial
,CASE WHEN IsSpecial = 0
THEN rn1 -- a person is not special, he gets random rank from the first run only
ELSE MinRN -- a special person, he gets a rank that is minimum of three runs
END AS FinalRank
,rn1
,rn2
,rn3
,MinRN
FROM CTE_All
ORDER BY FinalRank;
result set
PersonID IsSpecial FinalRank rn1 rn2 rn3 MinRN
8 0 1 1 1 1 1
6 0 2 2 7 2 2
5 0 3 3 5 6 3
1 1 3 9 3 4 3
4 0 4 4 6 3 3
7 0 5 5 9 10 5
3 0 6 6 8 9 6
2 0 7 7 2 8 2
10 0 8 8 10 5 5
9 0 10 10 4 7 4
You can see that special people can (by chance) get the same rank as ordinary people. You can favor special people further and make sure that they appear before ordinary people in this case. Just alter ORDER BY to be ORDER BY FinalRank, IsSpecial DESC.
How about using UNION?
SELECT personID, ROW_NUMBER()
OVER(ORDER BY NEWID()) as RowNumber
FROM folks
WHERE isSpecial = 0
UNION ALL
SELECT personID, MAX(RowNumber)
FROM (
SELECT personID, ROW_NUMBER()
OVER(ORDER BY NEWID()) as RowNumber
FROM folks
WHERE isSpecial = 1
UNION ALL
SELECT personID, ROW_NUMBER()
OVER(ORDER BY NEWID()) as RowNumber
FROM folks
WHERE isSpecial = 1
UNION ALL
SELECT personID, ROW_NUMBER()
OVER(ORDER BY NEWID()) as RowNumber
FROM folks
WHERE isSpecial = 1
) s -- a derived table needs an alias in SQL Server
GROUP BY personID

Generating order statistics grouped by order total

Hopefully I can explain this correctly. I have a table of line orders (each line order consists of quantity of item and the price, there are other fields but I left those out.)
table 'orderitems':
orderid | quantity | price
1 | 1 | 1.5000
1 | 2 | 3.22
2 | 1 | 9.99
3 | 4 | 0.44
3 | 2 | 15.99
So to get order total I would run
SELECT SUM(Quantity * price) AS total
FROM OrderItems
GROUP BY OrderID
However, I would like to get a count of all total orders under $1 (just provide a count).
My end result I would like would be able to define ranges:
under $1, $1 - $3, 3-5, 5-10, 10-15, 15.. etc;
and my data to look like so (hopefully):
tunder1 | t1to3 | t3to5 | t5to10 | etc
10 | 500 | 123 | 5633 |
So that I can present a piechart breakdown of customer orders on our eCommerce site.
Now I can run individual SQL queries to get this, but I would like to know what the most efficient 'single sql query' would be. I am using MS SQL Server.
Currently I can run a single query like so to get under $1 total:
SELECT COUNT(total) AS tunder1
FROM (SELECT SUM(Quantity * price) AS total
FROM OrderItems
GROUP BY OrderID) AS a
WHERE (total < 1)
How can I optimize this? Thanks in advance!
select
count(case when total < 1 then 1 end) tunder1,
count(case when total >= 1 and total < 3 then 1 end) t1to3,
count(case when total >= 3 and total < 5 then 1 end) t3to5,
...
from
(
select sum(quantity * price) as total
from orderitems group by orderid
) as t; -- SQL Server requires an alias on a derived table
you need to use HAVING for filtering grouped values.
try this:
DECLARE @YourTable table (OrderID int, Quantity int, Price decimal)
INSERT INTO @YourTable VALUES (1,1,1.5000)
INSERT INTO @YourTable VALUES (1,2,3.22)
INSERT INTO @YourTable VALUES (2,1,9.99)
INSERT INTO @YourTable VALUES (3,4,0.44)
INSERT INTO @YourTable VALUES (3,2,15.99)
SELECT
SUM(CASE WHEN TotalCost<1 THEN 1 ELSE 0 END) AS tunder1
,SUM(CASE WHEN TotalCost>=1 AND TotalCost<3 THEN 1 ELSE 0 END) AS t1to3
,SUM(CASE WHEN TotalCost>=3 AND TotalCost<5 THEN 1 ELSE 0 END) AS t3to5
,SUM(CASE WHEN TotalCost>=5 THEN 1 ELSE 0 END) AS t5andup
FROM (SELECT
SUM(quantity * price) AS TotalCost
FROM @YourTable
GROUP BY OrderID
) dt
OUTPUT:
tunder1 t1to3 t3to5 t5andup
----------- ----------- ----------- -----------
0 0 0 3
(1 row(s) affected)
WITH orders (orderid, quantity, price) AS
(
SELECT 1, 1, 1.5
UNION ALL
SELECT 1, 2, 3.22
UNION ALL
SELECT 2, 1, 9.99
UNION ALL
SELECT 3, 4, 0.44
UNION ALL
SELECT 4, 2, 15.99
),
ranges (bound) AS
(
SELECT 1
UNION ALL
SELECT 3
UNION ALL
SELECT 5
UNION ALL
SELECT 10
UNION ALL
SELECT 15
),
rr AS
(
SELECT bound, ROW_NUMBER() OVER (ORDER BY bound) AS rn
FROM ranges
),
r AS
(
SELECT COALESCE(rf.rn, 0) AS rn, COALESCE(rf.bound, 0) AS f,
rt.bound AS t
FROM rr rf
FULL JOIN
rr rt
ON rt.rn = rf.rn + 1
)
SELECT rn, f, t, COUNT(*) AS cnt
FROM r
JOIN (
SELECT SUM(quantity * price) AS total
FROM orders
GROUP BY
orderid
) o
ON total >= f
AND total < COALESCE(t, 10000000)
GROUP BY
rn, t, f
Output:
rn f t cnt
1 1 3 1
3 5 10 2
5 15 NULL 1
That is, 1 order from $1 to $3, 2 orders from $5 to $10, and 1 order over $15.
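The range lookup itself can be sketched in Python with a sorted list of bounds and a binary search. This is an illustration only, run on the five orderitems rows from the question (so three orders, not the four in the CTE above); bucket_counts, BOUNDS and LABELS are hypothetical names, and the bounds are the ones suggested in the question.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from decimal import Decimal

# (orderid, quantity, price) rows from the question's orderitems table.
ORDER_ITEMS = [
    (1, 1, "1.5000"),
    (1, 2, "3.22"),
    (2, 1, "9.99"),
    (3, 4, "0.44"),
    (3, 2, "15.99"),
]
BOUNDS = [Decimal(b) for b in (1, 3, 5, 10, 15)]
LABELS = ["under 1", "1 to 3", "3 to 5", "5 to 10", "10 to 15", "15 and up"]


def bucket_counts(items):
    """Sum quantity * price per order, then count orders per range."""
    totals = defaultdict(Decimal)
    for order_id, quantity, price in items:
        totals[order_id] += quantity * Decimal(price)
    # bisect_right puts a total equal to a bound into the range that starts there,
    # matching "total >= f and total < t".
    counts = Counter(LABELS[bisect_right(BOUNDS, total)] for total in totals.values())
    return {label: counts[label] for label in LABELS}


COUNTS = bucket_counts(ORDER_ITEMS)
print(COUNTS)
```

Keeping the bounds in one list means a new range only requires a new bound and label, rather than another CASE branch.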