Count frequency for values in a column - sql

I'm trying to count occurrences of values in a specific column.
cus_id prod_id income
100 10 90
100 10 80
100 20 110
122 20 9
122 30 10
When doing the query, I would like to receive something like this:
cus_id count(prod_id = 10) (prod_id = 20) (prod_id = 30) sum(income)
100 2 1 0 280
122 0 1 1 19
At the moment my initial approach is this:
select cus_id, prod_id, count(prod_id), sum(income) from t group by 1,2
Any insights would be highly appreciated. Thanks in advance!

Oracle SQL
with t (cus_id, prod_id, income) as (
select 100, 10, 90 from dual union all
select 100, 10, 80 from dual union all
select 100, 20, 110 from dual union all
select 122, 20, 9 from dual union all
select 122, 30, 10 from dual)
select
cus_id,
count(case when prod_id = 10 then income end) cnt_prod_10,
count(case when prod_id = 20 then income end) cnt_prod_20,
count(case when prod_id = 30 then income end) cnt_prod_30,
sum(income) sum_income
from t
group by cus_id;
CUS_ID CNT_PROD_10 CNT_PROD_20 CNT_PROD_30 SUM_INCOME
---------- ----------- ----------- ----------- ----------
122 0 1 1 19
100 2 1 0 280
SQL>
https://dbfiddle.uk/XZs56Hks
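The counting part relies on COUNT ignoring NULLs: a CASE with no ELSE branch yields NULL for non-matching rows, so only the matching rows are counted. A minimal sketch of an equivalent form that sums a 1/0 flag instead and uses SUM for the income total the question asks for; the aliases cnt_prod_* and total_income are illustrative:

```sql
with t (cus_id, prod_id, income) as (
  select 100, 10, 90 from dual union all
  select 100, 10, 80 from dual union all
  select 100, 20, 110 from dual union all
  select 122, 20, 9 from dual union all
  select 122, 30, 10 from dual
)
select cus_id,
       -- each matching row contributes 1, every other row 0
       sum(case when prod_id = 10 then 1 else 0 end) as cnt_prod_10,
       sum(case when prod_id = 20 then 1 else 0 end) as cnt_prod_20,
       sum(case when prod_id = 30 then 1 else 0 end) as cnt_prod_30,
       sum(income) as total_income
from t
group by cus_id
order by cus_id;
```

For the sample rows this gives 2, 1, 0, 280 for cus_id 100 and 0, 1, 1, 19 for cus_id 122, matching the requested layout.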

Related

ORACLE SQL, I don't know how to use SUM() here

Table TRANSACTION:
TRANS_VALUE USER_ID TRANS_TYPE_ID
10 1 2
5 2 1
15 1 1
20 2 2
10 1 2
5 1 2
15 3 1
20 3 1
I need to get to this:
USER SUM(TRANS_TYPE_1) SUM(TRANS_TYPE_2)
1 15 25
2 5 20
3 35 NULL
Can someone help me?
I tried this, but it doesn't give the result I need:
SELECT
user_id AS "USER",
SUM(trans_value)
FROM
TRANSACTION
WHERE
trans_value = 1
GROUP BY
user_id
ORDER BY 1;
Use conditional aggregation:
SELECT user_id,
SUM(CASE trans_type_id WHEN 1 THEN trans_value END) AS sum_trans_type_1,
SUM(CASE trans_type_id WHEN 2 THEN trans_value END) AS sum_trans_type_2
FROM transaction
GROUP BY user_id
or PIVOT:
SELECT *
FROM transaction
PIVOT (
SUM(trans_value)
FOR trans_type_id IN (
1 AS sum_trans_type_1,
2 AS sum_trans_type_2
)
)
Which, for the sample data:
CREATE TABLE transaction (TRANS_VALUE, USER_ID, TRANS_TYPE_ID) AS
SELECT 10, 1, 2 FROM DUAL UNION ALL
SELECT 5, 2, 1 FROM DUAL UNION ALL
SELECT 15, 1, 1 FROM DUAL UNION ALL
SELECT 20, 2, 2 FROM DUAL UNION ALL
SELECT 10, 1, 2 FROM DUAL UNION ALL
SELECT 5, 1, 2 FROM DUAL UNION ALL
SELECT 15, 3, 1 FROM DUAL UNION ALL
SELECT 20, 3, 1 FROM DUAL;
Both output:
USER_ID SUM_TRANS_TYPE_1 SUM_TRANS_TYPE_2
1 15 25
2 5 20
3 35 null

Check Distinct value Present in the group

I have a table with multiple POS values per purchase, and I need to find the purchase IDs (PID) whose group contains only wallet and nothing else.
For example, here PIDs 4 and 5 have only wallet; the rest have other values as well. So wallet_flag should be 1 for them in the output.
I tried to use a window function but could not get the result. Can you please suggest something?
select PID
,POS
, SUM(CASE WHEN POS='bwallet' THEN 1 ELSE 0 END ) OVER(PARTITION BY PID) as FLAG
from PAYMENTS
where "status" = 'SUCCESS'
Here's one option:
Sample data:
with test (pid, pos, amount) as
  (select 1, 'wallet', 10 from dual union all
   select 1, 'BT' , 10 from dual union all
   select 1, 'Cash' , 10 from dual union all
   select 2, 'BT' , 50 from dual union all
   select 3, 'Cash' , 24 from dual union all
   select 3, 'BT' , 12 from dual union all
   select 4, 'wallet', 100 from dual union all
   select 5, 'wallet', 20 from dual union all
   select 5, 'wallet', 100 from dual
  ),
Query begins here; cnt will be 0 if there's only "wallet" per PID:
temp as
  (select pid,
          sum(case when pos = 'wallet' then 0 else 1 end) cnt
   from test
   group by pid
  )
select a.pid, a.pos, a.amount,
       case when b.cnt = 0 then 1 else 0 end wallet_flag
from test a join temp b on a.pid = b.pid
order by a.pid;
PID POS AMOUNT WALLET_FLAG
---------- ------ ---------- -----------
1 wallet 10 0
1 BT 10 0
1 Cash 10 0
2 BT 50 0
3 Cash 24 0
3 BT 12 0
4 wallet 100 1
5 wallet 20 1
5 wallet 100 1
9 rows selected.
SQL>
SELECT
your_table.*,
MIN(
CASE pos
WHEN 'wallet' THEN 1
ELSE 0
END
)
OVER (
PARTITION BY pid
)
AS wallet_flag
from
your_table
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=e05a7863b9f4d912dcdf5ced5ec1c1b2
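The window query above works because the CASE yields 1 for wallet rows and 0 for anything else, so the minimum over a PID is 1 only when every row in that PID is wallet. If only the matching purchase IDs are needed rather than a flag on every row, a GROUP BY with HAVING is one option. A minimal sketch, assuming a reduced version of the sample rows and that pos is never NULL (a NULL pos would not be counted as a non-wallet row here):

```sql
with test (pid, pos, amount) as (
  select 1, 'wallet', 10 from dual union all
  select 1, 'BT', 10 from dual union all
  select 2, 'BT', 50 from dual union all
  select 4, 'wallet', 100 from dual union all
  select 5, 'wallet', 20 from dual union all
  select 5, 'wallet', 100 from dual
)
select pid
from test
group by pid
-- keep a PID only when it has no non-wallet rows
having count(case when pos <> 'wallet' then 1 end) = 0
order by pid;
```

For these rows the query returns PIDs 4 and 5.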

oracle - group by - no aggregation

I have an Oracle table where ref_id is the reference ID, the flag field is the type of data, and ORN is the order of the data within each ref_id:
ref_id data ORN flag
1 100 0 0
1 200 1 0
1 300 2 0
1 400 3 0
1 110 0 1
1 210 1 1
1 150 0 2
1 250 1 2
1 350 2 2
1 450 3 2
2 500 0 0
2 600 1 0
2 700 2 0
2 800 3 0
2 120 0 1
2 220 1 1
2 320 1 1
2 420 1 1
2 170 0 2
2 270 1 2
2 370 2 2
2 470 3 2
I need to group the data so as to get the last data for flag 0 and the last data for flag 2 for each ref_id,
so the new table will look something like this:
ref_id data_1 data_2
1 400 450
2 800 470
Any hints on how to accomplish this without using loops?
You can use an analytic function and GROUP BY as follows:
SELECT REF_ID,
MAX(CASE WHEN FLAG = 0 THEN DATA END) AS DATA_0,
MAX(CASE WHEN FLAG = 2 THEN DATA END) AS DATA_2
FROM
(
SELECT REF_ID, DATA, ORN, FLAG,
ROW_NUMBER() OVER (PARTITION BY REF_ID, FLAG ORDER BY ORN DESC) AS RN
FROM YOUR_TABLE
WHERE FLAG IN (0,2)
)
WHERE RN = 1
GROUP BY REF_ID
Alternatively, use a two-step approach: first (in the CTE) select only the values of the DATA column that correspond to the last ORN within each REF_ID and FLAG.
Note that if the ORN is not unique, you may get more than one row, potentially with different values.
In the next step, simply aggregate on REF_ID. I'm using the MAX function, i.e. this will get the highest value of DATA in case of ties.
If the combination of REF_ID, FLAG and ORN is unique (primary key), you may use MIN and MAX interchangeably, but it is good to know that they will give different results if duplicates are allowed.
with agg as (
select
REF_ID,FLAG, DATA, ORN,
case when flag = 0 and ORN = max(ORN) over (partition by REF_ID, FLAG) then data end as data_0,
case when flag = 2 and ORN = max(ORN) over (partition by REF_ID, FLAG) then data end as data_2
from tab
)
select REF_ID,
max(data_0) as data_0,
max(data_2) as data_2
from agg
group by REF_ID
order by 1;
Here is the result of the CTE:
    REF_ID       FLAG       DATA        ORN     DATA_0     DATA_2
---------- ---------- ---------- ---------- ---------- ----------
         1          0        100          0
         1          0        200          1
         1          0        300          2
         1          0        400          3        400
         1          1        110          0
         1          1        210          1
         1          2        150          0
         1          2        250          1
         1          2        350          2
         1          2        450          3                   450
...
and the result of the final query
REF_ID DATA_0 DATA_2
---------- ---------- ----------
1 400 450
2 800 470
You may use the FIRST/LAST aggregate functions for this purpose.
https://docs.oracle.com/database/121/SQLRF/functions074.htm#SQLRF00641
https://docs.oracle.com/database/121/SQLRF/functions095.htm#SQLRF00653
with t (ref_id,data,ORN,flag) as (
select 1, 100, 0, 0 from dual union all
select 1, 200, 1, 0 from dual union all
select 1, 300, 2, 0 from dual union all
select 1, 400, 3, 0 from dual union all
select 1, 110, 0, 1 from dual union all
select 1, 210, 1, 1 from dual union all
select 1, 150, 0, 2 from dual union all
select 1, 250, 1, 2 from dual union all
select 1, 350, 2, 2 from dual union all
select 1, 450, 3, 2 from dual union all
select 2, 500, 0, 0 from dual union all
select 2, 600, 1, 0 from dual union all
select 2, 700, 2, 0 from dual union all
select 2, 800, 3, 0 from dual union all
select 2, 120, 0, 1 from dual union all
select 2, 220, 1, 1 from dual union all
select 2, 320, 1, 1 from dual union all
select 2, 420, 1, 1 from dual union all
select 2, 170, 0, 2 from dual union all
select 2, 270, 1, 2 from dual union all
select 2, 370, 2, 2 from dual union all
select 2, 470, 3, 2 from dual
)
select
ref_id
, max(decode(flag, 0, data)) keep (dense_rank last order by decode(flag, 0, 100, 50), orn ) x
, max(decode(flag, 2, data)) keep (dense_rank last order by decode(flag, 2, 100, 50), orn ) y
-- or
, min(decode(flag, 0, data)) keep (dense_rank first order by decode(flag, 0, 50, 100), orn desc) xx
, min(decode(flag, 2, data)) keep (dense_rank first order by decode(flag, 2, 50, 100), orn desc) yy
from t
group by ref_id
REF_ID X Y XX YY
---------- ---------- ---------- ---------- ----------
1 400 450 400 450
2 800 470 800 470
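In the query above, the DECODE in each ORDER BY pushes the rows of the wanted flag to the end (or the start) of the ordering, so KEEP (DENSE_RANK LAST ...) aggregates only over the row with the highest ORN for that flag. A sketch of the same idea written with CASE and NULLS FIRST instead of the 100/50 sentinel values; the aliases data_0 and data_2 and the reduced sample are illustrative:

```sql
with t (ref_id, data, orn, flag) as (
  select 1, 300, 2, 0 from dual union all
  select 1, 400, 3, 0 from dual union all
  select 1, 210, 1, 1 from dual union all
  select 1, 350, 2, 2 from dual union all
  select 1, 450, 3, 2 from dual union all
  select 2, 700, 2, 0 from dual union all
  select 2, 800, 3, 0 from dual union all
  select 2, 470, 3, 2 from dual
)
select ref_id,
       -- rows of other flags get a NULL sort key and sort first,
       -- so LAST picks the highest ORN among rows of the wanted flag
       max(case when flag = 0 then data end)
         keep (dense_rank last
               order by case when flag = 0 then orn end nulls first) as data_0,
       max(case when flag = 2 then data end)
         keep (dense_rank last
               order by case when flag = 2 then orn end nulls first) as data_2
from t
group by ref_id
order by ref_id;
```

For this reduced sample the query should return 400 and 450 for ref_id 1, and 800 and 470 for ref_id 2. If a ref_id has no rows for a flag, the corresponding column comes back NULL.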

Create table from loop output Oracle SQL

I need to pull a random sample from a table of ~5 million observations based on 175 demographic options. The demographic table looks something like this:
1 40 4%
2 30 3%
3 30 3%
- -
174 2 .02%
175 1 .01%
Basically I need this same demographic breakdown randomly sampled from the 5M row table. For each demographic I need a sample of the same one from the larger table but with 5x the number of observations (example: for demographic 1 I want a random sample of 200).
SELECT *
FROM (
SELECT *
FROM my_table
ORDER BY
dbms_random.value
)
WHERE rownum <= 100;
I've used this syntax before to get a random sample but is there any way I can modify this as a loop and substitute variable names from existing tables? I'll try to encapsulate the logic I need in pseudocode:
for (each demographic_COLUMN in TABLE1)
select random(5*num_obs_COLUMN in TABLE1) from ID_COLUMN in TABLE2
/*somehow join the results of each step in the loop into one giant column of IDs */
You could join your tables (assuming the 1-175 demographic value exists in both, or there is an equivalent column to join on), something like:
select id
from (
select d.demographic, d.percentage, t.id,
row_number() over (partition by d.demographic order by dbms_random.value) as rn
from demographics d
join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage
Each row in the main table is given a random pseudo-row-number within its demographic (via the analytic row_number()). The outer query then uses the relevant percentage to select how many of those randomly-ordered rows for each demographic to return.
I'm not sure I've understood how you're actually picking exactly how many of each you want, so that probably needs to be adjusted.
Demo with a smaller sample in a CTE, and matching smaller match condition:
-- CTEs for sample data
with my_table (id, demographic) as (
select level, mod(level, 175) + 1 from dual connect by level <= 175000
),
demographics (demographic, percentage, str) as (
select 1, 40, '4%' from dual
union all select 2, 30, '3%' from dual
union all select 3, 30, '3%' from dual
-- ...
union all select 174, 2, '.02%' from dual
union all select 175, 1, '.01%' from dual
)
-- actual query
select demographic, percentage, id, rn
from (
select d.demographic, d.percentage, t.id,
row_number() over (partition by d.demographic order by dbms_random.value) as rn
from demographics d
join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage;
DEMOGRAPHIC PERCENTAGE ID RN
----------- ---------- ---------- ----------
1 40 94150 1
1 40 36925 2
1 40 154000 3
1 40 82425 4
...
1 40 154350 199
1 40 126175 200
2 30 36051 1
2 30 1051 2
2 30 100451 3
...
2 30 18026 149
2 30 151726 150
3 30 125302 1
3 30 152252 2
3 30 114452 3
...
3 30 104652 149
3 30 70527 150
174 2 35698 1
174 2 67548 2
174 2 114798 3
...
174 2 70698 9
174 2 30973 10
175 1 139649 1
175 1 156974 2
175 1 145774 3
175 1 97124 4
175 1 40074 5
(you only need the ID, but I'm including the other columns for context); or more succinctly:
with my_table (id, demographic) as (
select level, mod(level, 175) + 1 from dual connect by level <= 175000
),
demographics (demographic, percentage, str) as (
select 1, 40, '4%' from dual
union all select 2, 30, '3%' from dual
union all select 3, 30, '3%' from dual
-- ...
union all select 174, 2, '.02%' from dual
union all select 175, 1, '.01%' from dual
)
select demographic, percentage, count(id) as ids, min(id) as min_id, max(id) as max_id
from (
select d.demographic, d.percentage, t.id,
row_number() over (partition by d.demographic order by dbms_random.value) as rn
from demographics d
join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage
group by demographic, percentage
order by demographic;
DEMOGRAPHIC PERCENTAGE IDS MIN_ID MAX_ID
----------- ---------- ---------- ---------- ----------
1 40 200 175 174825
2 30 150 1 174126
3 30 150 2452 174477
174 2 10 23448 146648
175 1 5 19074 118649
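Since the goal in the question is a table of sampled IDs rather than a result set, the sampling query can be wrapped in CREATE TABLE ... AS SELECT, with no loop needed. A minimal sketch, assuming the demographics and my_table tables exist with the columns used above; the table name sample_ids is hypothetical:

```sql
-- sample_ids is a hypothetical name for the output table
create table sample_ids as
select id
from (
  select d.percentage, t.id,
         -- random position of each row within its demographic
         row_number() over (partition by d.demographic
                            order by dbms_random.value) as rn
  from demographics d
  join my_table t on t.demographic = d.demographic
)
where rn <= 5 * percentage;
```

Each run draws a different sample because of dbms_random.value, so the table would need to be dropped and recreated (or truncated and reloaded) to resample.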

How to sum two different fields from two tables with one field is common

I have two tables Sales and Charges.
Tables having data as:
'Sales' 'Charges'
SID F_AMT SID C_AMT
1 100 1 10
1 100 1 10
1 100 1 20
1 200 2 20
2 200 2 10
2 300 3 20
4 300 3 30
4 300 3 10
4 300 5 20
4 200 5 10
I want the output as below:
SID Total_Fees Total_charges
1 500 40
2 500 30
3 0 60
4 1100 0
5 0 30
Assuming you want to do it for the whole tables this is the simplest approach:
Select Sid
, Sum(f_amt) as total_fees
, Sum(c_amt) as total_charges
From ( select sid, f_amt, 0 as c_amt
From sales
Union all
select sid, 0 as f_amt, c_amt
From charges
)
Group by sid
Use full join and nvl():
select sid, nvl(sum(f_amt), 0) fees, nvl(sum(c_amt), 0) charges
from sales s
full join charges c using (sid)
group by sid
order by sid
Demo:
with sales(sid, f_amt) as (
select 1, 100 from dual union all select 1, 100 from dual union all
select 1, 100 from dual union all select 1, 200 from dual union all
select 2, 200 from dual union all select 2, 300 from dual union all
select 4, 300 from dual union all select 4, 300 from dual union all
select 4, 300 from dual union all select 4, 200 from dual ),
charges (sid, c_amt) as (
select 1, 10 from dual union all select 1, 10 from dual union all
select 1, 20 from dual union all select 2, 20 from dual union all
select 2, 10 from dual union all select 3, 20 from dual union all
select 3, 30 from dual union all select 3, 10 from dual union all
select 5, 20 from dual union all select 5, 10 from dual )
select sid, nvl(sum(f_amt), 0) fees, nvl(sum(c_amt), 0) charges
from sales s
full join charges c using (sid)
group by sid
order by sid
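Output (note that the totals for SID 1 and 2 are inflated compared with the requested 500/40 and 500/30, because the join pairs every sales row with every charges row for the same sid):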
SID FEES CHARGES
------ ---------- ----------
1 1500 160
2 1000 60
3 0 60
4 1100 0
5 0 30
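Joining the raw tables pairs every sales row with every charges row sharing a sid, which multiplies the sums whenever a sid has several rows in both tables. A sketch of one way around that, aggregating each table first and then joining the per-sid totals; the CTE names s and c and the column aliases are illustrative:

```sql
with sales (sid, f_amt) as (
  select 1, 100 from dual union all select 1, 100 from dual union all
  select 1, 100 from dual union all select 1, 200 from dual union all
  select 2, 200 from dual union all select 2, 300 from dual union all
  select 4, 300 from dual union all select 4, 300 from dual union all
  select 4, 300 from dual union all select 4, 200 from dual ),
charges (sid, c_amt) as (
  select 1, 10 from dual union all select 1, 10 from dual union all
  select 1, 20 from dual union all select 2, 20 from dual union all
  select 2, 10 from dual union all select 3, 20 from dual union all
  select 3, 30 from dual union all select 3, 10 from dual union all
  select 5, 20 from dual union all select 5, 10 from dual ),
-- one row per sid in each aggregate, so the join cannot multiply rows
s as (select sid, sum(f_amt) as total_fees from sales group by sid),
c as (select sid, sum(c_amt) as total_charges from charges group by sid)
select sid,
       nvl(s.total_fees, 0) as total_fees,
       nvl(c.total_charges, 0) as total_charges
from s
full join c using (sid)
order by sid;
```

With the sample data this yields 500/40, 500/30, 0/60, 1100/0 and 0/30 for SID 1 to 5, which is the output requested in the question.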
You could use conditional aggregation:
SELECT SID,
COALESCE(SUM(CASE WHEN t=1 THEN AMT END),0) AS Total_Fees,
COALESCE(SUM(CASE WHEN t=2 THEN AMT END),0) AS Total_Charges
FROM (SELECT SID, F_AMT AS AMT, 1 AS t
FROM Sales
UNION ALL
SELECT SID, C_AMT AS AMT, 2 AS t
FROM Charges) sub
GROUP BY SID
ORDER BY SID;