Generating order statistics grouped by order total - sql

Hopefully I can explain this correctly. I have a table of order line items (each line consists of an item quantity and a price; there are other fields, but I left those out).
table 'orderitems':
orderid | quantity | price
1 | 1 | 1.5000
1 | 2 | 3.22
2 | 1 | 9.99
3 | 4 | 0.44
3 | 2 | 15.99
So to get order total I would run
SELECT SUM(Quantity * price) AS total
FROM OrderItems
GROUP BY OrderID
However, I would like to get a count of all orders whose total is under $1 (just a count).
As an end result I would like to be able to define ranges:
under $1, $1 - $3, $3 - $5, $5 - $10, $10 - $15, $15 and up, etc.;
and have my data look like this (hopefully):
tunder1 | t1to3 | t3to5 | t5to10 | etc
10 | 500 | 123 | 5633 |
So that I can present a piechart breakdown of customer orders on our eCommerce site.
Now I can run individual SQL queries to get this, but I would like to know what the most efficient 'single sql query' would be. I am using MS SQL Server.
Currently I can run a single query like so to get under $1 total:
SELECT COUNT(total) AS tunder1
FROM (SELECT SUM(Quantity * price) AS total
FROM OrderItems
GROUP BY OrderID) AS a
WHERE (total < 1)
How can I optimize this? Thanks in advance!

select
count(case when total < 1 then 1 end) tunder1,
count(case when total >= 1 and total < 3 then 1 end) t1to3,
count(case when total >= 3 and total < 5 then 1 end) t3to5,
...
from
(
select sum(quantity * price) as total
from orderitems group by orderid
) as totals;
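The conditional-count pattern above can be checked against the sample rows from the question. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so the example is runnable anywhere), and the bucket names t5to10, t10to15 and t15up are illustrative extensions of the naming in the question.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orderitems (orderid INTEGER, quantity INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO orderitems VALUES (?, ?, ?)",
    [(1, 1, 1.5), (1, 2, 3.22), (2, 1, 9.99), (3, 4, 0.44), (3, 2, 15.99)],
)

# Each CASE yields NULL outside its bucket, and COUNT ignores NULLs,
# so one pass over the per-order totals fills every bucket.
bucket_counts = conn.execute("""
    SELECT
        COUNT(CASE WHEN total < 1 THEN 1 END)                 AS tunder1,
        COUNT(CASE WHEN total >= 1  AND total < 3  THEN 1 END) AS t1to3,
        COUNT(CASE WHEN total >= 3  AND total < 5  THEN 1 END) AS t3to5,
        COUNT(CASE WHEN total >= 5  AND total < 10 THEN 1 END) AS t5to10,
        COUNT(CASE WHEN total >= 10 AND total < 15 THEN 1 END) AS t10to15,
        COUNT(CASE WHEN total >= 15 THEN 1 END)                AS t15up
    FROM (
        SELECT SUM(quantity * price) AS total
        FROM orderitems
        GROUP BY orderid
    ) AS totals
""").fetchone()

print(bucket_counts)  # (0, 0, 0, 2, 0, 1)
```

The three sample orders total 7.94, 9.99 and 33.74, so two land in the $5 to $10 bucket and one in the $15-and-up bucket.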

You need to use HAVING to filter on grouped values.

Try this:
DECLARE @YourTable table (OrderID int, Quantity int, Price decimal(10,4))
INSERT INTO @YourTable VALUES (1,1,1.5000)
INSERT INTO @YourTable VALUES (1,2,3.22)
INSERT INTO @YourTable VALUES (2,1,9.99)
INSERT INTO @YourTable VALUES (3,4,0.44)
INSERT INTO @YourTable VALUES (3,2,15.99)
SELECT
SUM(CASE WHEN TotalCost<1 THEN 1 ELSE 0 END) AS tunder1
,SUM(CASE WHEN TotalCost>=1 AND TotalCost<3 THEN 1 ELSE 0 END) AS t1to3
,SUM(CASE WHEN TotalCost>=3 AND TotalCost<5 THEN 1 ELSE 0 END) AS t3to5
,SUM(CASE WHEN TotalCost>=5 THEN 1 ELSE 0 END) AS t5andup
FROM (SELECT
SUM(quantity * price) AS TotalCost
FROM @YourTable
GROUP BY OrderID
) dt
OUTPUT:
tunder1 t1to3 t3to5 t5andup
----------- ----------- ----------- -----------
0 0 0 3
(1 row(s) affected)

WITH orders (orderid, quantity, price) AS
(
SELECT 1, 1, 1.5
UNION ALL
SELECT 1, 2, 3.22
UNION ALL
SELECT 2, 1, 9.99
UNION ALL
SELECT 3, 4, 0.44
UNION ALL
SELECT 4, 2, 15.99
),
ranges (bound) AS
(
SELECT 1
UNION ALL
SELECT 3
UNION ALL
SELECT 5
UNION ALL
SELECT 10
UNION ALL
SELECT 15
),
rr AS
(
SELECT bound, ROW_NUMBER() OVER (ORDER BY bound) AS rn
FROM ranges
),
r AS
(
SELECT COALESCE(rf.rn, 0) AS rn, COALESCE(rf.bound, 0) AS f,
rt.bound AS t
FROM rr rf
FULL JOIN
rr rt
ON rt.rn = rf.rn + 1
)
SELECT rn, f, t, COUNT(*) AS cnt
FROM r
JOIN (
SELECT SUM(quantity * price) AS total
FROM orders
GROUP BY
orderid
) o
ON total >= f
AND total < COALESCE(t, 10000000)
GROUP BY
rn, t, f
Output:
rn f t cnt
1 1 3 1
3 5 10 2
5 15 NULL 1
That is, 1 order from $1 to $3, 2 orders from $5 to $10, and 1 order of more than $15.
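The range-table idea above can also be sketched with explicit lower and upper bounds stored per row. This is a simplified variant, not the query from the answer: the ranges table with lo and hi columns is a hypothetical layout, SQLite (via Python's sqlite3) stands in for SQL Server, and a LEFT JOIN is used so that empty buckets still appear with a count of 0. The data is the same as in the answer above (note the last line belongs to order 4 there).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderid INTEGER, quantity INTEGER, price REAL);
    INSERT INTO orders VALUES (1, 1, 1.5), (1, 2, 3.22), (2, 1, 9.99),
                              (3, 4, 0.44), (4, 2, 15.99);

    -- Hypothetical bounds table: hi is NULL for the open-ended last bucket.
    CREATE TABLE ranges (lo INTEGER, hi INTEGER);
    INSERT INTO ranges VALUES (0, 1), (1, 3), (3, 5), (5, 10), (10, 15), (15, NULL);
""")

# LEFT JOIN keeps buckets with no orders; COUNT(o.total) counts only matches.
range_counts = conn.execute("""
    SELECT r.lo, r.hi, COUNT(o.total) AS cnt
    FROM ranges r
    LEFT JOIN (
        SELECT SUM(quantity * price) AS total
        FROM orders
        GROUP BY orderid
    ) o ON o.total >= r.lo AND (r.hi IS NULL OR o.total < r.hi)
    GROUP BY r.lo, r.hi
    ORDER BY r.lo
""").fetchall()

for lo, hi, cnt in range_counts:
    print(lo, hi, cnt)
```

Changing the buckets then only means editing rows in the bounds table, not rewriting the query.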

Related

BigQuery SQL - Create New Column Based on the Max Value from Multiple Columns

I have a table that contains info about customers and the amounts they have purchased of each type of food. I want to create a new column holding the type of food each customer has purchased most frequently. Is there an efficient way to do this?
I tried using CASE WHEN with one-to-one comparisons, but it got very tedious.
Sample data:
Cust_ID | apple_type1 | apple_type2 | apple_type3 | apple_type4 | apple_type5 | apple_type6
1 | 2 | 0 | 0 | 3 | 6 | 1
2 | 0 | 0 | 0 | 1 | 0 | 1
3 | 4 | 2 | 1 | 1 | 0 | 1
4 | 5 | 5 | 5 | 0 | 0 | 0
5 | 0 | 0 | 0 | 0 | 0 | 0
--WANT
Cust_ID | freq_apple_type_buy
1 | type5
2 | type4 and type6
3 | type1
4 | type1 and type2 and type3
5 | unknown
Consider below approach
select Cust_ID, if(count(1) = any_value(all_count), 'unknown', string_agg(type, ' and ')) freq_apple_type_buy
from (
select *, count(1) over(partition by Cust_ID) all_count
from (
select Cust_ID, replace(arr[offset(0)], 'apple_', '') type,cast(arr[offset(1)] as int64) value
from data t,
unnest(split(translate(to_json_string((select as struct * except(Cust_ID) from unnest([t]))), '{}"', ''))) kv,
unnest([struct(split(kv, ':') as arr)])
)
where true qualify 1 = rank() over(partition by Cust_ID order by value desc)
)
group by Cust_ID
if applied to sample data in your question - output is
This uses UNPIVOT to turn your columns into rows. It then uses RANK() to assign each row a rank, so rows that are tied on quantity share the same rank.
Finally, it selects only the products with rank = 1 (possibly multiple rows, if multiple products are tied for first place).
WITH
normalised_and_ranked AS
(
SELECT
cust_id,
product,
qty,
RANK() OVER (PARTITION BY cust_id ORDER BY qty DESC) AS product_rank,
ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY qty DESC) AS product_row
FROM
yourData
UNPIVOT(
qty FOR product IN (apple_type1, apple_type2, apple_type3, apple_type4, apple_type5, apple_type6)
)
)
SELECT
cust_id,
CASE WHEN qty = 0 THEN NULL ELSE product END AS product,
CASE WHEN qty = 0 THEN NULL ELSE qty END AS qty
FROM
normalised_and_ranked
WHERE
(product_rank = 1 AND qty > 0)
OR
(product_row = 1)
Edit: fudge added to ensure row of nulls returned if all qty are 0.
(Normally I'd just not return a row for such customers.)
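The unpivot-then-keep-the-top-rows idea can be tried on the question's sample data. The sketch below is an approximation under stated assumptions: it runs on SQLite through Python's sqlite3 rather than BigQuery, emulates UNPIVOT with a generated UNION ALL, uses a join against the per-customer maximum instead of RANK(), and assembles the "type4 and type6" strings in Python. The CTE names u and m are illustrative only.

```python
import sqlite3

cols = [f"apple_type{i}" for i in range(1, 7)]

conn = sqlite3.connect(":memory:")
conn.execute(
    f"CREATE TABLE data (cust_id INTEGER, {', '.join(c + ' INTEGER' for c in cols)})"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 2, 0, 0, 3, 6, 1),
        (2, 0, 0, 0, 1, 0, 1),
        (3, 4, 2, 1, 1, 0, 1),
        (4, 5, 5, 5, 0, 0, 0),
        (5, 0, 0, 0, 0, 0, 0),
    ],
)

# Emulate UNPIVOT: one SELECT per column, glued together with UNION ALL.
unpivoted = " UNION ALL ".join(
    f"SELECT cust_id, '{c}' AS product, {c} AS qty FROM data" for c in cols
)

# Keep every product that ties for the customer's maximum, ignoring all-zero rows.
top_rows = conn.execute(f"""
    WITH u AS ({unpivoted}),
         m AS (SELECT cust_id, MAX(qty) AS max_qty FROM u GROUP BY cust_id)
    SELECT u.cust_id, u.product
    FROM u
    JOIN m ON m.cust_id = u.cust_id AND m.max_qty = u.qty
    WHERE u.qty > 0
    ORDER BY u.cust_id, u.product
""").fetchall()

# Customers with no surviving row (all quantities 0) stay 'unknown'.
result = {cust_id: "unknown" for (cust_id,) in conn.execute("SELECT cust_id FROM data")}
tops = {}
for cust_id, product in top_rows:
    tops.setdefault(cust_id, []).append(product.replace("apple_", ""))
for cust_id, types in tops.items():
    result[cust_id] = " and ".join(types)

print(result)
```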

How to return all records from table A , if any one of the column has a specific value in oracle sql?

Below is the sample data
If I pass a lot name as a parameter, I want to return the employees who have an item count greater than 0 in that specific lot: not just the one matching record, but all the records of those employees.
Table A
Empid lotname itemcount
1 A 1
1 B 1
2 B 0
3 B 1
3 C 0
Parameter - B
Result :
Empid lotname itemcount
1 A 1
1 B 1
3 B 1
3 C 0
Because employees 3 and 1 have a count in lot B, all of their lot details should be returned.
select data.* from A data,
(select Empid,count(lotname)
from A
group by Empid
having count(lotname)>1) MulLotEmp
where data.lotname='B'
and data.Empid=MulLotEmp.Empid;
Check whether this query solves your problem. I first created an inner table for your first requirement (employees with multiple lots), then joined it to the actual table with a condition on the input lot name.
If I understand correctly, you want all "1" and then only "0" if there is no "1".
One method is:
select a.*
from a
where itemcount = 1 or
not exists (select 1 from a a2 where a2.empid = a.empid and a2.itemcount = 1);
In Oracle, you can use the MAX analytic function:
SELECT Empid,
lotname,
itemcount
FROM (
SELECT t.*,
MAX( itemcount ) OVER ( PARTITION BY Empid ) AS max_itemcount
FROM table_name t
)
WHERE max_itemcount = 1;
So, for your sample data:
CREATE TABLE table_name ( Empid, lotname, itemcount ) AS
SELECT 1, 'A', 1 FROM DUAL UNION ALL
SELECT 1, 'B', 1 FROM DUAL UNION ALL
SELECT 2, 'B', 0 FROM DUAL UNION ALL
SELECT 3, 'B', 1 FROM DUAL UNION ALL
SELECT 3, 'C', 0 FROM DUAL;
This outputs:
EMPID | LOTNAME | ITEMCOUNT
----: | :------ | --------:
1 | A | 1
1 | B | 1
3 | B | 1
3 | C | 0
The analytic function
sum(case when LOTNAME = 'B' /* parameter */ then ITEMCOUNT end) over (partition by EMPID) as lot_itemcnt
calculates for each employee the total number of items in the selected lot.
Feel free to pass the lot as a bind variable, e.g.
sum(case when LOTNAME = ? /* parameter */ then ITEMCOUNT end) over (partition by EMPID) as lot_itemcnt
The whole query is then as follows:
with cust as (
select
EMPID, LOTNAME, ITEMCOUNT,
sum(case when LOTNAME = 'B' /* parameter */ then ITEMCOUNT end) over (partition by EMPID) as lot_itemcnt
from tab)
select
EMPID, LOTNAME, ITEMCOUNT
from cust
where lot_itemcnt >= 1;
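For a quick cross-check of the requirement, the same filter can be written as a semi-join with a bound lot parameter. This is a sketch only: it uses SQLite through Python's sqlite3 instead of Oracle, swaps the analytic sum for an IN subquery, and the helper name rows_for_lot is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (empid INTEGER, lotname TEXT, itemcount INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [(1, "A", 1), (1, "B", 1), (2, "B", 0), (3, "B", 1), (3, "C", 0)],
)

def rows_for_lot(lot):
    """Return every row of each employee with itemcount > 0 in the given lot."""
    return conn.execute(
        """
        SELECT empid, lotname, itemcount
        FROM a
        WHERE empid IN (
            SELECT empid FROM a WHERE lotname = ? AND itemcount > 0
        )
        ORDER BY empid, lotname
        """,
        (lot,),
    ).fetchall()

print(rows_for_lot("B"))  # [(1, 'A', 1), (1, 'B', 1), (3, 'B', 1), (3, 'C', 0)]
print(rows_for_lot("C"))  # [] because employee 3 has itemcount 0 in lot C
```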

Finding orders where products of both types are present

Consider below table tbl:
ordernr productId productType
1 12 A
2 15 B
2 13 C
2 12 A
3 15 B
3 12 A
3 11 D
How can I get only the rows of orders in which products of both productTypes B and C are present?
The desired output should be below because products of both type B and C are present in the order:
2 15 B
2 13 C
2 12 A
It might be more efficient to use exists twice:
select t.*
from mytable t
where
exists (select 1 from mytable t1 where t1.ordernr = t.ordernr and t1.productType = 'B')
and exists (select 1 from mytable t1 where t1.ordernr = t.ordernr and t1.productType = 'C')
This query would take advantage of an index on (ordernr, productType).
One method is using a CTE to get the counts and then filter using those in the outer query:
WITH CTE AS(
SELECT ordernr,
productId,
productType,
COUNT(CASE productType WHEN 'B' THEN 1 END) OVER (PARTITION BY ordernr) AS BCount,
COUNT(CASE productType WHEN 'C' THEN 1 END) OVER (PARTITION BY ordernr) AS CCount
FROM dbo.YourTable)
SELECT ordernr,
productId,
productType
FROM CTE
WHERE BCount > 0
AND CCount > 0;
You can get all the ordernrs that you need with this query:
select ordernr
from tablename
where productType in ('B', 'C')
group by ordernr
having count(distinct productType) = 2
So you can use it with the operator in:
select * from tablename
where ordernr in (
select ordernr
from tablename
where productType in ('B', 'C')
group by ordernr
having count(distinct productType) = 2
)
See the demo.
Results:
> ordernr | productId | productType
> ------: | --------: | :----------
> 2 | 15 | B
> 2 | 13 | C
> 2 | 12 | A
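Both the HAVING COUNT(DISTINCT ...) form and the double-EXISTS form can be run side by side on the sample table to confirm they select the same rows. A minimal sketch, assuming SQLite (via Python's sqlite3) as the engine and tbl as the table name from the question; the ORDER BY is added only to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ordernr INTEGER, productId INTEGER, productType TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, 12, "A"), (2, 15, "B"), (2, 13, "C"), (2, 12, "A"),
     (3, 15, "B"), (3, 12, "A"), (3, 11, "D")],
)

# Form 1: find orders containing both types, then return all their rows.
via_having = conn.execute("""
    SELECT ordernr, productId, productType
    FROM tbl
    WHERE ordernr IN (
        SELECT ordernr
        FROM tbl
        WHERE productType IN ('B', 'C')
        GROUP BY ordernr
        HAVING COUNT(DISTINCT productType) = 2
    )
    ORDER BY ordernr, productId DESC
""").fetchall()

# Form 2: one EXISTS probe per required type.
via_exists = conn.execute("""
    SELECT t.ordernr, t.productId, t.productType
    FROM tbl t
    WHERE EXISTS (SELECT 1 FROM tbl t1
                  WHERE t1.ordernr = t.ordernr AND t1.productType = 'B')
      AND EXISTS (SELECT 1 FROM tbl t1
                  WHERE t1.ordernr = t.ordernr AND t1.productType = 'C')
    ORDER BY t.ordernr, t.productId DESC
""").fetchall()

print(via_having)  # [(2, 15, 'B'), (2, 13, 'C'), (2, 12, 'A')]
```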

SQL query to search first rows until sum = value and skip big value that can exceed the value

I have a table
id | amount
---+--------
1 | 500
2 | 300
3 | 750
4 | 200
5 | 500
I want to select rows ascending until the sum is 1000 or until all rows are searched (and skip a big value (750) that can exceed 1000).
How can I write a query that returns rows like the ones below?
Thanks for the help.
id | amount
---+--------
1 | 500
2 | 300
4 | 200
I think that you need a common table expression for this.
The idea is to do a cumulative sum that skips the rows that would cause the sum to go above 1000 (aliased sm in the CTE), and to flag the records to skip (aliased keep in the CTE). Then the outer query just filters on the flag.
with recursive cte as (
select
id,
amount,
case when amount > 1000 then 0 else amount end sm,
case when amount > 1000 then 0 else 1 end keep
from mytable
where id = 1
union all
select
t.id,
t.amount,
case when c.sm + t.amount > 1000 then c.sm else c.sm + t.amount end,
case when c.sm + t.amount > 1000 then 0 else 1 end
from cte c
inner join mytable t on t.id = c.id + 1
)
select id, amount from cte where keep = 1 order by id
Demo on DB Fiddle:
id | amount
-: | -----:
1 | 500
2 | 300
4 | 200
You should get the expected result using a recursive common table expression, with something like this:
with RECURSIVE yourtableOrdered as (
select row_number() over (order by id) row_num, id, val
from (values (1, 500), (2, 300), (3, 750), (4, 200), (5, 500)) V (id, val)
),
lineSum as (
select row_num, id, val,
case when val <= 1000 then val else 0 end totalSum,
case when val <= 1000 then true else false end InResult
from yourtableOrdered
where row_num = 1
union all
select y.row_num, y.id, y.val,
case when previousLine.totalSum + y.val <= 1000 then previousLine.totalSum + y.val else previousLine.totalSum end totalSum,
case when previousLine.totalSum + y.val <= 1000 then true else false end InResult
from yourtableOrdered y
inner join lineSum previousLine
on y.row_num = previousLine.row_num + 1
),
yourExpectedResult as (
select * from lineSum where InResult = true
)
select * from yourExpectedResult
see a working sample in
http://sqlfiddle.com/#!17/2cbcf/1/0
Use a cumulative sum:
select t.*
from (select t.*,
sum(amount) over (order by id) as running_amount
from t
) t
where running_amount - amount < 1000;

SQL: Get multiple line entries linked to one item?

I have a table:
ID | ITEMID | STATUS | TYPE
1 | 123 | 5 | 1
2 | 123 | 4 | 2
3 | 123 | 5 | 3
4 | 125 | 3 | 1
5 | 125 | 5 | 3
Any item can have 0 to many entries in this table. I need a query that will tell me whether an ITEM has all its entries in a status of either 5 or 4. For example, for the data above, I would like to end up with the result:
ITEMID | REQUIREMENTS_MET
123 | TRUE --> true because all statuses are either 5 or 4
125 | FALSE --> false because it has a status of 3 and a status of 5.
If the 3 was a 4 or 5, then this would be true
What would be even better is something like this:
ITEMID | MET_REQUIREMENTS | NOT_MET_REQUIREMENTS
123 | 3 | 0
125 | 1 | 1
Any idea how to write a query for that?
Fast, short, simple:
SELECT itemid
,count(status = 4 OR status = 5 OR NULL) AS met_requirements
,count(status < 4 OR status > 5 OR NULL) AS not_met_requirements
FROM tbl
GROUP BY itemid
ORDER BY itemid;
Assuming all columns to be integer NOT NULL.
Builds on basic boolean logic:
TRUE OR NULL yields TRUE
FALSE OR NULL yields NULL
And NULL is not counted by count().
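The count(... OR NULL) trick can be checked on the question's rows. The sketch below is only an approximation of the original setting: it runs on SQLite through Python's sqlite3, where comparisons evaluate to 1 or 0 rather than to a dedicated boolean type, but the three-valued OR behaves the same way for this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, itemid INTEGER, status INTEGER, type INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(1, 123, 5, 1), (2, 123, 4, 2), (3, 123, 5, 3), (4, 125, 3, 1), (5, 125, 5, 3)],
)

# TRUE OR NULL stays true; FALSE OR NULL collapses to NULL (None in Python).
truth_table = conn.execute("SELECT 1 OR NULL, 0 OR NULL").fetchone()
print(truth_table)  # (1, None)

# count() skips NULLs, so each expression counts only the rows where it is true.
counts = conn.execute("""
    SELECT itemid,
           count(status = 4 OR status = 5 OR NULL) AS met_requirements,
           count(status < 4 OR status > 5 OR NULL) AS not_met_requirements
    FROM tbl
    GROUP BY itemid
    ORDER BY itemid
""").fetchall()
print(counts)  # [(123, 3, 0), (125, 1, 1)]
```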
SELECT a.ITEMID FROM (SELECT ITEMID, MIN(STATUS) AS MINSTATUS, MAX(STATUS) AS MAXSTATUS FROM TABLE_NAME GROUP BY ITEMID) AS a
WHERE a.MINSTATUS >= 4 AND a.MAXSTATUS <= 5
One way of doing this would be
SELECT t1.itemid, NOT EXISTS(SELECT 1
FROM mytable t2
WHERE itemid=t1.itemid
AND status NOT IN (4, 5)) AS requirements_met
FROM mytable t1
GROUP BY t1.itemid
UPDATE: for your updated requirement, you can have something like:
SELECT itemid,
sum(CASE WHEN status IN (4, 5) THEN 1 ELSE 0 END) as met_requirements,
sum(CASE WHEN status IN (4, 5) THEN 0 ELSE 1 END) as not_met_requirements
FROM mytable
GROUP BY itemid
simple one:
select
"ITEMID",
case
when min("STATUS") in (4, 5) and max("STATUS") in (4, 5) then 'True'
else 'False'
end as requirements_met
from table1
group by "ITEMID"
better one:
select
"ITEMID",
sum(case when "STATUS" in (4, 5) then 1 else 0 end) as MET_REQUIREMENTS,
sum(case when "STATUS" in (4, 5) then 0 else 1 end) as NOT_MET_REQUIREMENTS
from table1
group by "ITEMID";
WITH dom AS (
SELECT DISTINCT item_id FROM items
)
, yes AS ( SELECT item_id, COUNT(*) AS good_count FROM items WHERE status IN (4,5) GROUP BY item_id
)
, no AS ( SELECT item_id, COUNT(*) AS bad_count FROM items WHERE status NOT IN (4,5) GROUP BY item_id
)
SELECT d.item_id
, COALESCE(y.good_count,0) AS good_count
, COALESCE(n.bad_count,0) AS bad_count
FROM dom d
LEFT JOIN yes y ON y.item_id = d.item_id
LEFT JOIN no n ON n.item_id = d.item_id
;
Can be done with an outer join, too:
WITH yes AS ( SELECT item_id, COUNT(*) AS good_count FROM items WHERE status IN (4,5) GROUP BY item_id)
, no AS ( SELECT item_id, COUNT(*) AS bad_count FROM items WHERE status NOT IN (4,5) GROUP BY item_id)
SELECT COALESCE(y.item_id, n.item_id) AS item_id
, COALESCE(y.good_count,0) AS good_count
, COALESCE(n.bad_count,0) AS bad_count
FROM yes y
FULL JOIN no n ON n.item_id = y.item_id
;
Never mind, it was actually easy to do:
select ITEM_ID ,
sum (case when STATUS >= 3 then 1 else 0 end ) as met_requirements,
sum (case when STATUS < 3 then 1 else 0 end ) as not_met_requirements
from TABLE as d
group by ITEM_ID