How to use DISTINCT and SUM together in Oracle?

For example my table contains the following data:
ID price
-------------
1 10
1 10
1 20
2 20
2 20
3 30
3 30
4 5
4 5
4 15
Given the example above, the expected output is:
ID price
-------------
1 30
2 20
3 30
4 20
-----------
ID 100
How do I write this query in Oracle? First sum(distinct price) grouped by id, then sum all of those per-id prices.

I would be very careful with a data structure like this. First, check that all ids have exactly one price:
select id
from table t
group by id
having count(distinct price) > 1;
I think the safest method is to extract a particular price for each id (say the maximum) and then do the aggregation:
select sum(price)
from (select id, max(price) as price
from table t
group by id
) t;
Then, go fix your data so you don't have a repeated additive dimension. There should be a table with one row per id and price (or perhaps with duplicates but controlled by effective and end dates).
The data is messed up; you should not assume that the price is the same on all rows for a given id. You need to check that every time you use the fields, until you fix the data.

first sum(distinct price) group by id then sum(all price)
Looking at your desired output, it seems you also need the final sum (similar to ROLLUP); however, ROLLUP won't directly work in your case.
If you want to format your output in exactly the way you have posted your desired output, i.e. with a header for the last row of total sum, then you could set the PAGESIZE in SQL*Plus.
Using UNION ALL
For example,
SQL> set pagesize 7
SQL> WITH DATA AS(
2 SELECT ID, SUM(DISTINCT price) AS price
3 FROM t
4 GROUP BY id
5 )
6 SELECT to_char(ID) id, price FROM DATA
7 UNION ALL
8 SELECT 'ID' id, sum(price) FROM DATA
9 ORDER BY ID
10 /
ID PRICE
--- ----------
1 30
2 20
3 30
4 20
ID PRICE
--- ----------
ID 100
SQL>
So, you have an additional row at the end with the total SUM of price.
Using ROLLUP
Alternatively, you could use ROLLUP to get the total sum as follows:
SQL> set pagesize 7
SQL> WITH DATA AS
2 ( SELECT ID, SUM(DISTINCT price) AS price FROM t GROUP BY id
3 )
4 SELECT ID, SUM(price) price
5 FROM DATA
6 GROUP BY ROLLUP(id);
ID PRICE
---------- ----------
1 30
2 20
3 30
4 20
ID PRICE
---------- ----------
100
SQL>

First do the DISTINCT and then a ROLLUP
SELECT ID, SUM(price) -- sum of the distinct prices
FROM
(
SELECT DISTINCT ID, price -- distinct prices per ID
FROM tab
) dt
GROUP BY ROLLUP(ID) -- two levels of aggregation, per ID and total sum
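If the id column can itself be NULL, the total row produced by ROLLUP is hard to tell apart from a genuine NULL group. A minimal sketch, assuming the question's sample data inlined as a CTE named t (the CTE and the id_label alias are illustrative, not from the original posts), uses GROUPING(id) to label the total row explicitly:

```sql
-- Sample data from the question, inlined so the query is self-contained.
WITH t (id, price) AS (
  SELECT 1, 10 FROM dual UNION ALL
  SELECT 1, 10 FROM dual UNION ALL
  SELECT 1, 20 FROM dual UNION ALL
  SELECT 2, 20 FROM dual UNION ALL
  SELECT 2, 20 FROM dual UNION ALL
  SELECT 3, 30 FROM dual UNION ALL
  SELECT 3, 30 FROM dual UNION ALL
  SELECT 4, 5  FROM dual UNION ALL
  SELECT 4, 5  FROM dual UNION ALL
  SELECT 4, 15 FROM dual
)
SELECT CASE WHEN GROUPING(id) = 1      -- 1 only on the super-aggregate row
            THEN 'Total'
            ELSE TO_CHAR(id)
       END        AS id_label,
       SUM(price) AS price
FROM  (SELECT DISTINCT id, price FROM t)  -- one row per distinct (id, price)
GROUP BY ROLLUP (id)
ORDER BY GROUPING(id), id;                -- detail rows first, total last
```

With the sample data this should give 30, 20, 30 and 20 for ids 1 to 4, followed by a Total row of 100.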

SELECT ID,SUM(price) as price
FROM
(SELECT ID,price
FROM TableName
GROUP BY ID,price) T
GROUP BY ID
Explanation:
The inner query selects the distinct prices for each id,
i.e.,
ID price
-------------
1 10
1 20
2 20
3 30
4 5
4 15
Then the outer query will select SUM of those prices for each id.
Final Result :
ID price
----------
1 30
2 20
3 30
4 20

SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE MYTABLE ( ID, price ) AS
SELECT 1, 10 FROM DUAL
UNION ALL SELECT 1, 10 FROM DUAL
UNION ALL SELECT 1, 20 FROM DUAL
UNION ALL SELECT 2, 20 FROM DUAL
UNION ALL SELECT 2, 20 FROM DUAL
UNION ALL SELECT 3, 30 FROM DUAL
UNION ALL SELECT 3, 30 FROM DUAL
UNION ALL SELECT 4, 5 FROM DUAL
UNION ALL SELECT 4, 5 FROM DUAL
UNION ALL SELECT 4, 15 FROM DUAL;
Query 1:
SELECT COALESCE( TO_CHAR(ID), 'ID' ) AS ID,
SUM( PRICE ) AS PRICE
FROM ( SELECT DISTINCT ID, PRICE FROM MYTABLE )
GROUP BY ROLLUP ( ID )
ORDER BY ID
Results:
| ID | PRICE |
|----|-------|
| 1 | 30 |
| 2 | 20 |
| 3 | 30 |
| 4 | 20 |
| ID | 100 |

Related

ORACLE SQL: Resetting a counting sequence when ID ends for the next one

I am trying to restart a position counter whenever the customer ID changes.
The result should look something like this:
CustomerID Product PosNr
1 Banana 1
1 Papaya 2
1 Apple 3
2 Laptop 1
2 Keyboard 2
I hope it is clear what I mean.
The PosNr should reset for another Customer.
Can I set something like this up while inserting the values into the table, or in some other way?
Use the row_number analytic function with appropriate partitioning.
SQL> with test (customerid, product) as
2 (select 1, 'banana' from dual union all
3 select 1, 'papaya' from dual union all
4 select 1, 'apple' from dual union all
5 select 2, 'laptop' from dual union all
6 select 2, 'keyboard' from dual
7 )
8 select customerid, product,
9 row_number() over (partition by customerid order by product) posnr
10 from test
11 /
CUSTOMERID PRODUCT POSNR
---------- -------- ----------
1 apple 1
1 banana 2
1 papaya 3
2 keyboard 1
2 laptop 2
SQL>
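The answer above numbers the products alphabetically, so the positions differ from the ones in the question. To reproduce the question's order, row_number() needs a column that records insertion order. A small sketch, assuming a hypothetical line_id column (for example populated from a sequence or identity column; it is not part of the original table):

```sql
-- line_id is a hypothetical column that records the order of insertion.
WITH order_lines (line_id, customerid, product) AS (
  SELECT 1, 1, 'Banana'   FROM dual UNION ALL
  SELECT 2, 1, 'Papaya'   FROM dual UNION ALL
  SELECT 3, 1, 'Apple'    FROM dual UNION ALL
  SELECT 4, 2, 'Laptop'   FROM dual UNION ALL
  SELECT 5, 2, 'Keyboard' FROM dual
)
SELECT customerid,
       product,
       -- numbering restarts at 1 for every customerid
       ROW_NUMBER() OVER (PARTITION BY customerid ORDER BY line_id) AS posnr
FROM   order_lines
ORDER  BY customerid, posnr;
```

Computing PosNr at query time (or in a view) avoids having to renumber stored values when rows are deleted.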

How to use SUM DISTINCT when the order has the same qty of items

I'm working on a query to show the total number of orders sent and the quantity of items sent in a day. Because of the many joins, I get duplicate rows. It looks like this:
DispatchDate Order Qty
2019-07-02 1 2
2019-07-02 1 2
2019-07-02 1 2
2019-07-02 2 2
2019-07-02 2 2
2019-07-02 2 2
2019-07-02 3 5
2019-07-02 3 5
2019-07-02 3 5
I'm using this query:
SELECT DispatchDate, COUNT(DISTINCT Order), SUM(DISTINCT Qty)
FROM TABLE1
GROUP BY DispatchDate
Obviously, on this date there are 3 orders with a total of 9 items.
However, the query is returning:
3 orders and 7 items
I don't have a clue how to resolve this issue. How can I sum the quantity once per order, instead of simply removing duplicate values from a single column the way SUM(DISTINCT ...) does?
Could do a CTE
with cte1 as (
SELECT Order AS Order
, DispatchDate
, MAX(QTY) as QTY
FROM TABLE1
GROUP BY Order
, DispatchDate
)
SELECT DispatchDate
, COUNT(Order)
, SUM(Qty)
FROM cte1
GROUP BY DispatchDate
You have major problems with your data model, if the data is stored this way. If this is the case, you need a table with one row per order.
If this is the result of a query, you can probably fix the underlying query so you are not getting duplicates.
If you need to work with the data in this format, then extract a single row for each group. I think that row_number() is quite appropriate for this purpose:
select count(*), sum(qty)
from (select t.*, row_number() over (partition by dispatchdate, corder order by corder) as seqnum
from t
) t
where seqnum = 1
First, you should avoid multiplying the rows when joining, for example by using LEFT JOIN instead of JOIN. But, given where we are:
SELECT DispatchDate, sum( Qty)
FROM (
SELECT distinct DispatchDate, Order, Qty
FROM TABLE1 )T
GROUP BY DispatchDate
You typed SUM(DISTINCT Qty), which sums the distinct values of Qty, that is 2 and 5. That is 7, isn't it?
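The difference between the two approaches can be seen side by side. This is a sketch on a shortened copy of the sample data, with the order column renamed to corder (an assumption made here because ORDER is a reserved word in Oracle) and illustrative method labels:

```sql
WITH table1 (dispatchdate, corder, qty) AS (
  -- each order appears twice, imitating the join duplicates
  SELECT DATE '2019-07-02', 1, 2 FROM dual UNION ALL
  SELECT DATE '2019-07-02', 1, 2 FROM dual UNION ALL
  SELECT DATE '2019-07-02', 2, 2 FROM dual UNION ALL
  SELECT DATE '2019-07-02', 2, 2 FROM dual UNION ALL
  SELECT DATE '2019-07-02', 3, 5 FROM dual UNION ALL
  SELECT DATE '2019-07-02', 3, 5 FROM dual
),
dedup AS (
  -- one row per (dispatchdate, corder, qty)
  SELECT DISTINCT dispatchdate, corder, qty FROM table1
)
-- DISTINCT inside SUM only sees the values 2 and 5
SELECT 'SUM(DISTINCT qty) on raw rows' AS method, SUM(DISTINCT qty) AS items
FROM   table1
UNION ALL
-- after de-duplicating whole rows, every order contributes once: 2 + 2 + 5
SELECT 'SUM(qty) on de-duplicated rows', SUM(qty)
FROM   dedup;
```

The first row should report 7 and the second 9, which is the total the question expects.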
Due to the lots of joins I have duplicate rows.
IMHO, you should fix your primary data first. The Qty column is probably a function of the unique (DispatchDate, Order) combination. Remove the duplicates in the primary data source and ensure there cannot be different Qty values for two rows with the same DispatchDate and Order. Then go back to your task and you'll find your SQL much simpler. No offense to the other answers, but they just mask the mess in the primary data source and are unclear about which Qty to choose for a duplicated (DispatchDate, Order) pair (some take the max, some the sum).
Try this:
SELECT DispatchDate, COUNT(DISTINCT Order), SUM(DISTINCT Qty)
FROM TABLE1
GROUP BY DispatchDate, Order
I think you need the sum of distinct quantities per dispatch date and order.
How about this? Check comments within the code.
(I renamed the order column to corder; order can't be used as an identifier).
SQL> WITH test (dispatchdate, corder, qty)
2 -- your sample data
3 AS (SELECT DATE '2019-07-02', 1, 2 FROM DUAL UNION ALL
4 SELECT DATE '2019-07-02', 1, 2 FROM DUAL UNION ALL
5 SELECT DATE '2019-07-02', 1, 2 FROM DUAL UNION ALL
6 --
7 SELECT DATE '2019-07-02', 2, 2 FROM DUAL UNION ALL
8 SELECT DATE '2019-07-02', 2, 2 FROM DUAL UNION ALL
9 SELECT DATE '2019-07-02', 2, 2 FROM DUAL UNION ALL
10 --
11 SELECT DATE '2019-07-02', 3, 5 FROM DUAL UNION ALL
12 SELECT DATE '2019-07-02', 3, 5 FROM DUAL UNION ALL
13 SELECT DATE '2019-07-02', 3, 5 FROM DUAL),
14 -- compute sum of distinct qty per BOTH dispatchdate AND corder
15 temp
16 AS ( SELECT t1.dispatchdate,
17 t1.corder,
18 SUM (DISTINCT t1.qty) qty
19 FROM test t1
20 GROUP BY t1.dispatchdate,
21 t1.corder
22 )
23 -- the final result is then simple
24 SELECT t.dispatchdate,
25 COUNT (*) cnt,
26 SUM (qty) qty
27 FROM temp t
28 GROUP BY t.dispatchdate;
DISPATCHDA CNT QTY
---------- ---------- ----------
02.07.2019 3 9
SQL>

Create table from loop output Oracle SQL

I need to pull a random sample from a table of ~5 million observations based on 175 demographic options. The demographic table is something like this form:
1 40 4%
2 30 3%
3 30 3%
- -
174 2 .02%
175 1 .01%
Basically I need this same demographic breakdown randomly sampled from the 5M row table. For each demographic I need a sample of the same one from the larger table but with 5x the number of observations (example: for demographic 1 I want a random sample of 200).
SELECT *
FROM (
SELECT *
FROM my_table
ORDER BY
dbms_random.value
)
WHERE rownum <= 100;
I've used this syntax before to get a random sample but is there any way I can modify this as a loop and substitute variable names from existing tables? I'll try to encapsulate the logic I need in pseudocode:
for (each demographic_COLUMN in TABLE1)
select random(5*num_obs_COLUMN in TABLE1) from ID_COLUMN in TABLE2
/*somehow join the results of each step in the loop into one giant column of IDs */
You could join your tables (assuming the 1-175 demographic value exists in both, or there is an equivalent column to join on), something like:
select id
from (
select d.demographic, d.percentage, t.id,
row_number() over (partition by d.demographic order by dbms_random.value) as rn
from demographics d
join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage
Each row in the main table is given a random pseudo-row-number within its demographic (via the analytic row_number()). The outer query then uses the relevant percentage to select how many of those randomly-ordered rows for each demographic to return.
I'm not sure I've understood how you're actually picking exactly how many of each you want, so that probably needs to be adjusted.
Demo with a smaller sample in a CTE, and matching smaller match condition:
-- CTEs for sample data
with my_table (id, demographic) as (
select level, mod(level, 175) + 1 from dual connect by level <= 175000
),
demographics (demographic, percentage, str) as (
select 1, 40, '4%' from dual
union all select 2, 30, '3%' from dual
union all select 3, 30, '3%' from dual
-- ...
union all select 174, 2, '.02%' from dual
union all select 175, 1, '.01%' from dual
)
-- actual query
select demographic, percentage, id, rn
from (
select d.demographic, d.percentage, t.id,
row_number() over (partition by d.demographic order by dbms_random.value) as rn
from demographics d
join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage;
DEMOGRAPHIC PERCENTAGE ID RN
----------- ---------- ---------- ----------
1 40 94150 1
1 40 36925 2
1 40 154000 3
1 40 82425 4
...
1 40 154350 199
1 40 126175 200
2 30 36051 1
2 30 1051 2
2 30 100451 3
2 30 18026 149
2 30 151726 150
3 30 125302 1
3 30 152252 2
3 30 114452 3
...
3 30 104652 149
3 30 70527 150
174 2 35698 1
174 2 67548 2
174 2 114798 3
...
174 2 70698 9
174 2 30973 10
175 1 139649 1
175 1 156974 2
175 1 145774 3
175 1 97124 4
175 1 40074 5
(you only need the ID, but I'm including the other columns for context); or more succinctly:
with my_table (id, demographic) as (
select level, mod(level, 175) + 1 from dual connect by level <= 175000
),
demographics (demographic, percentage, str) as (
select 1, 40, '4%' from dual
union all select 2, 30, '3%' from dual
union all select 3, 30, '3%' from dual
-- ...
union all select 174, 2, '.02%' from dual
union all select 175, 1, '.01%' from dual
)
select demographic, percentage, count(id) as ids, min(id) as min_id, max(id) as max_id
from (
select d.demographic, d.percentage, t.id,
row_number() over (partition by d.demographic order by dbms_random.value) as rn
from demographics d
join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage
group by demographic, percentage
order by demographic;
DEMOGRAPHIC PERCENTAGE IDS MIN_ID MAX_ID
----------- ---------- ---------- ---------- ----------
1 40 200 175 174825
2 30 150 1 174126
3 30 150 2452 174477
174 2 10 23448 146648
175 1 5 19074 118649
db<>fiddle
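Since the title asks for a table built from the result, the same query can be wrapped in CREATE TABLE ... AS SELECT instead of a loop. A sketch, assuming the demographics and my_table names used in the answer above exist as real tables, and a hypothetical target table called sample_ids:

```sql
-- sample_ids is a hypothetical name for the table holding the sampled IDs.
CREATE TABLE sample_ids AS
SELECT id
FROM (
  SELECT t.id,
         d.percentage,
         -- random position of each row within its demographic
         ROW_NUMBER() OVER (PARTITION BY d.demographic
                            ORDER BY dbms_random.value) AS rn
  FROM   demographics d
  JOIN   my_table t ON t.demographic = d.demographic
)
WHERE rn <= 5 * percentage;  -- keep 5x the listed count per demographic
```

Because the ordering is random, each run produces a different sample; storing the result in a table keeps one particular draw.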

How can I use the self join to find equal values within a group?

I am looking to find instances of GROUPID where all price values are 0. The following is a simplified version of what I am looking at
--------------------------------
| Groupid | Price | Customer|
--------------------------------
| 001 | 9 | 4 |
| 001 | 0 | 4 |
| 002 | 4 | 4 |
| 002 | 4 | 4 |
| 003 | 0 | 4 |
| 003 | 0 | 4 |
| 004 | 4 | 4 |
| 004 | 7 | 4 |
--------------------------------
I am attempting to use the following query to find all GROUPID where both PRICE values for that particular group = 0.
SELECT * FROM MYTABLE WHERE GROUPID IN
(SELECT TB1.GROUPID FROM MYTABLE TB1 JOIN MYTABLE TB2 ON TB1.GROUPID = TB2.GROUPID
AND TB1.PRICE = 0 AND TB2.PRICE = 0)
and CUSTOMER = 4
ORDER BY GROUPID;
This query returns:
| Groupid | Price | Customer|
--------------------------------
| 001 | 9 | 4 |
| 001 | 0 | 4 |
| 003 | 0 | 4 |
| 003 | 0 | 4 |
--------------------------------
In my case, I only need it to return GROUPID 003.
I'd also like to ask for assistance in modifying the query to return all non 0 equal PRICE values within a groupid. It doesn't have to be in the same query as above. For example the return would be:
| Groupid | Price | Customer|
--------------------------------
| 002 | 4 | 4 |
| 002 | 4 | 4 |
Any help would be appreciated. Thank you for your time.
If all the prices are zero, then look at the minimum and maximum price for the groupid:
select groupid
from mytable t
group by groupid
having min(price) = 0 and max(price) = 0;
I should point out that no self-join is required for this.
Try this:
SELECT * FROM MYTABLE m1 where Price = 0
and not exists (
select 1
from MYTABLE m2
where m2.Groupid = m1.Groupid
and m2.Price <> 0
)
You can list the rows whose groupid has no rows with a non-zero price:
select groupid, price, customer from mytable t
where not exists (
select 1 from mytable where groupid = t.groupid
and price != 0
);
I would do it by counting the number of distinct prices over each group (and I'm assuming that each group will have the same customer) and then filtering on the price(s) you're interested in.
For example, for prices that are 0 for all rows in each groupid+customer:
WITH mytable AS (SELECT '001' groupid, 9 price, 4 customer FROM dual UNION ALL
SELECT '001' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '002' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '002' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '003' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '003' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '004' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '004' groupid, 7 price, 4 customer FROM dual)
SELECT groupid,
customer,
price
FROM (SELECT groupid,
customer,
price,
COUNT(DISTINCT price) OVER (PARTITION BY groupid, customer) num_distinct_prices
FROM mytable)
WHERE num_distinct_prices = 1
AND price = 0;
GROUPID CUSTOMER PRICE
------- ---------- ----------
003 4 0
003 4 0
Just change the and price = 0 to and price != 0 if you want the groups which have the same non-zero price for all rows. Or simply remove that predicate altogether.
EDIT
Gordon's is the best solution for the first part:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING (MAX(price) = 0 and MIN(price) = 0)
And for the second part:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING MIN(price) <> 0 AND (MAX(price) = MIN(price))
My original one:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING SUM(Price) =0
This assumes, there are no negative prices.
To the second part of your question:
SELECT groupid
FROM mytable t
WHERE Price > 0
GROUP BY GroupID, Price
HAVING COUNT(price) > 1
In your sample data, you have only one customer. I assume if you had more than one customer, you would still want to return the rows where the groupid has the same price, across all rows and all customers. If so, you could use the query below. It is almost the same as Boneist's - I just use min(price) and max(price) instead of count(distinct), and I don't include customer in partition by.
If the price may be NULL, it will be ignored in the computation of max price and min price; if all the NON-NULL prices are equal for a groupid, all the rows for that group will be returned. If price can be NULL and this is NOT the desired behavior, that can be changed easily - but the OP will have to clarify.
The query below retrieves all the cases when there is a single price for the groupid. To retrieve only the groups where the price is 0 (an additional condition), add and price = 0 to the WHERE clause of the outer query. I added more test data to illustrate some of the cases the query covers.
with
test_data ( groupid, price, customer ) as (
select '001', 9, 4 from dual union all
select '001', 0, 4 from dual union all
select '002', 4, 4 from dual union all
select '002', 4, 4 from dual union all
select '003', 0, 4 from dual union all
select '003', 0, 4 from dual union all
select '004', 4, 4 from dual union all
select '004', 7, 4 from dual union all
select '002', 4, 8 from dual union all
select '005', 2, 8 from dual union all
select '005', null, 8 from dual
),
prep ( groupid, price, customer, min_price, max_price) as (
select groupid, price, customer,
min(price) over (partition by groupid),
max(price) over (partition by groupid)
from test_data
)
select groupid, price, customer
from prep
where min_price = max_price
;
GROUPID PRICE CUSTOMER
------- --------- ---------
002 4 8
002 4 4
002 4 4
003 0 4
003 0 4
005 8
005 2 8
This may be what you want:
SELECT * FROM MYTABLE
WHERE GROUPID NOT IN (
SELECT GROUPID
FROM MYTABLE
WHERE Price <> 0)
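For the other query, just change the last line. Note that this returns every group containing no zero price (002 and 004 in the sample data), whether or not its prices are all equal: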
SELECT * FROM MYTABLE
WHERE GROUPID NOT IN (
SELECT GROUPID
FROM MYTABLE
WHERE Price = 0)
I would do it very similarly to what Gordon posted
SELECT groupId
FROM MyTable
GROUP BY groupId
HAVING SUM(price) = 0

Show multiple rows from single row data based on column values

I have the data below:
ID DATE COMPLIANCE ISBREAKREDUCED ISSECONDMEALBREAKREDUCED
1208240 4/12/2015 2 1 1
How do I use it in my base query to show separate rows for the columns ISBREAKREDUCED and ISSECONDMEALBREAKREDUCED if their values are 1?
The join condition for this table is based on the id and date fields, which is why it selects a single row. How can I split this row into multiple rows in the output?
I think you are looking for a typical UNPIVOT query.
You could do it as:
SQL> WITH DATA AS(
2 SELECT 1208240 ID, 1 ISBREAKREDUCED, 1 ISSECONDMEALBREAKREDUCED FROM dual UNION ALL
3 SELECT 1208241 ID, 2 ISBREAKREDUCED, 3 ISSECONDMEALBREAKREDUCED FROM dual
4 )
5 SELECT *
6 FROM
7 (SELECT *
8 FROM DATA UNPIVOT (ISSECONDMEALBREAKREDUCED FOR ISBREAKREDUCED IN (ISBREAKREDUCED AS 1, ISSECONDMEALBREAKREDUCED AS 1))
9 )
10 WHERE ISBREAKREDUCED =1
11 AND ISSECONDMEALBREAKREDUCED =1
12 /
ID ISBREAKREDUCED ISSECONDMEALBREAKREDUCED
---------- -------------- ------------------------
1208240 1 1
1208240 1 1
SQL>
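A variant of the same idea that keeps the name of the source column may be easier to read. This is only a sketch: flag_name and flag_value are illustrative column names, and the second sample row (ID 1208241) is invented here to show a row where only one flag is set:

```sql
WITH data (id, isbreakreduced, issecondmealbreakreduced) AS (
  SELECT 1208240, 1, 1 FROM dual UNION ALL
  SELECT 1208241, 0, 1 FROM dual   -- invented row: only the second flag is 1
)
SELECT id, flag_name, flag_value
FROM   data
UNPIVOT (
  flag_value                -- holds the value of the unpivoted column
  FOR flag_name IN (        -- holds a label for the column it came from
    isbreakreduced           AS 'ISBREAKREDUCED',
    issecondmealbreakreduced AS 'ISSECONDMEALBREAKREDUCED'
  )
)
WHERE  flag_value = 1;      -- one output row per flag that is set
```

For the sample above this should return two rows for ID 1208240 and one row for ID 1208241.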