Oracle Sql: Obtain a Sum of a Group, if Subgroup condition met - sql

I have a dataset from which I am trying to obtain a summed value for each group, if a subgroup within each group meets a certain condition. I am not sure if this is possible, or if I am approaching this problem incorrectly.
My data is structured as follows:
+----+-------------+---------+-------+
| ID | Transaction | Product | Value |
+----+-------------+---------+-------+
| 1 | A | 0 | 10 |
| 1 | A | 1 | 15 |
| 1 | A | 2 | 20 |
| 1 | B | 1 | 5 |
| 1 | B | 2 | 10 |
+----+-------------+---------+-------+
In this example I want to obtain the sum of values by the ID column, if a transaction does not contain any products labeled 0. In the above described scenario, all values related to Transaction A would be excluded because Product 0 was purchased. With the outcome being:
+----+-------------+
| ID | Sum of Value|
+----+-------------+
| 1 | 15 |
+----+-------------+
This process would repeat for multiple IDs with each ID only containing the sum of values if the transaction does not contain product 0.

Hmmm . . . one method is to use not exists for the filtering:
select id, sum(value)
from t
where not exists (select 1
from t t2
where t2.id = t.id and t2.transaction = t.transaction and
t2.product = 0
)
group by id;

You do not need a correlated subquery with not exists.
Just use group by.
with s (id, transaction, product, value) as (
select 1, 'A', 0, 10 from dual union all
select 1, 'A', 1, 15 from dual union all
select 1, 'A', 2, 20 from dual union all
select 1, 'B', 1, 5 from dual union all
select 1, 'B', 2, 10 from dual)
select id, sum(sum_value) as sum_value
from
(select id, transaction,
sum(value) as sum_value
from s
group by id, transaction
having count(decode(product, 0, 1)) = 0
)
group by id;
ID SUM_VALUE
---------- ----------
1 15
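For anyone who wants to check that the two approaches agree, here is a minimal sketch that runs both against the sample data. It uses Python's built-in sqlite3 module as a stand-in for Oracle, which is an assumption made only so the example is easy to run: the column is renamed to txn because TRANSACTION is a keyword in SQLite, and DECODE is swapped for a CASE expression since SQLite has no DECODE.

```python
import sqlite3

# Sample rows from the question: (id, transaction, product, value).
rows = [(1, "A", 0, 10), (1, "A", 1, 15), (1, "A", 2, 20), (1, "B", 1, 5), (1, "B", 2, 10)]

conn = sqlite3.connect(":memory:")
# "txn" is a hypothetical column name standing in for "transaction".
conn.execute("CREATE TABLE t (id INTEGER, txn TEXT, product INTEGER, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Approach 1: drop every row whose transaction contains a product 0 row.
NOT_EXISTS_SQL = """
SELECT id, SUM(value)
FROM t
WHERE NOT EXISTS (SELECT 1
                  FROM t t2
                  WHERE t2.id = t.id AND t2.txn = t.txn AND t2.product = 0)
GROUP BY id
"""

# Approach 2: aggregate per transaction, keep transactions with no product 0,
# then sum the surviving transactions per id.
HAVING_SQL = """
SELECT id, SUM(sum_value)
FROM (SELECT id, txn, SUM(value) AS sum_value
      FROM t
      GROUP BY id, txn
      HAVING SUM(CASE WHEN product = 0 THEN 1 ELSE 0 END) = 0) AS per_txn
GROUP BY id
"""

not_exists_result = conn.execute(NOT_EXISTS_SQL).fetchall()
having_result = conn.execute(HAVING_SQL).fetchall()
print(not_exists_result, having_result)
```

Both queries should return a single row for ID 1 with a sum of 15, since only Transaction B survives the filter.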

Related

Find top parent of child, multiple levels

ENTRY TABLE
__________________
| ID | PARENT_ID |
| 1 | null |
| 2 | 1 |
| 3 | 2 |
| 4 | null |
| 5 | 4 |
| 6 | 5 |
...
I make copies of the entries in some cases and they are connected by parent ID.
Each entry can have one copy:
THIS WONT HAPPEN
__________________
| ID | PARENT_ID |
| 1 | null |
| 2 | 1 |
| 3 | 1 |
...
Sometimes I need to take a copy and query for its top-level parent. I need to find the top parent entries for all the entries I search for.
For example, if I query for the parents of ID 6 and 3, I would get ID 4 and 1.
If I query for the parents of ID 5 and 2, I would get ID 4 and 1.
But also, if I query for ID 5 and 1, it should return ID 4 and 1, because the entry with ID 1 is already the top parent itself.
I don't know where to begin since I don't know how to recursively query in such case.
Can anyone point me in the right direction?
I know that the query below will just return the child elements (ID 6 and 3), but I honestly don't know where to go from here.
I am using OracleSQL by the way.
SELECT * FROM entry WHERE id IN (6, 3);
You can use a hierarchical query and CONNECT_BY_ROOT.
Either starting at the root of the hierarchy and working down:
SELECT id,
CONNECT_BY_ROOT(id) AS root_id
FROM entry
WHERE id IN (6, 3)
START WITH parent_id IS NULL
CONNECT BY PRIOR id = parent_id;
Or, from the entry back up to the root:
SELECT CONNECT_BY_ROOT(id) AS id,
id AS root_id
FROM entry
WHERE parent_id IS NULL
START WITH id IN (6, 3)
CONNECT BY PRIOR parent_id = id;
Which, for the sample data:
CREATE TABLE entry( id, parent_id ) AS
SELECT 1, NULL FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 3, 2 FROM DUAL UNION ALL
SELECT 4, NULL FROM DUAL UNION ALL
SELECT 5, 4 FROM DUAL UNION ALL
SELECT 6, 5 FROM DUAL UNION ALL
SELECT 7, 6 FROM DUAL
Both output:
ID ROOT_ID
-- -------
3 1
6 4
db<>fiddle here
You can use a recursive CTE to walk the graph and find the initial parent. For example:
with
n (starting_id, current_id, parent_id, v) as (
select id, id, parent_id, 0 from entry where id in (6, 3)
union all
select n.starting_id, e.id, e.parent_id, n.v - 1
from n
join entry e on e.id = n.parent_id
)
select starting_id, current_id as initial_id
from (
select n.*, row_number() over(partition by starting_id order by v) as rn
from n
) x
where rn = 1
Result:
STARTING_ID INITIAL_ID
------------ ----------
3 1
6 4
See running example at db<>fiddle.
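The case from the question where one of the searched IDs is already a top parent (IDs 5 and 1) can be checked with a small sketch like the one below. It is not the Oracle query above: it assumes SQLite through Python's sqlite3 module, and it picks the top row by filtering on parent_id IS NULL instead of using row_number(). The CTE name walk and the column name top_id are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [(1, None), (2, 1), (3, 2), (4, None), (5, 4), (6, 5), (7, 6)],
)

# Start at each searched id and follow parent_id upwards one step at a time.
# The row whose parent_id is NULL is the top of that chain.
TOP_PARENT_SQL = """
WITH RECURSIVE walk (starting_id, current_id, parent_id) AS (
    SELECT id, id, parent_id FROM entry WHERE id IN (5, 1)
    UNION ALL
    SELECT w.starting_id, e.id, e.parent_id
    FROM walk w
    JOIN entry e ON e.id = w.parent_id
)
SELECT starting_id, current_id AS top_id
FROM walk
WHERE parent_id IS NULL
ORDER BY starting_id
"""

top_parents = conn.execute(TOP_PARENT_SQL).fetchall()
print(top_parents)  # [(1, 1), (5, 4)]
```

ID 1 maps to itself because its own starting row already has a NULL parent, so no special case is needed for entries that are top parents.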

Possible to use a column name in a UDF in SQL?

I have a query in which a series of steps is repeated constantly over different columns, for example:
SELECT DISTINCT
MAX (
CASE
WHEN table_2."GRP1_MINIMUM_DATE" <= cohort."ANCHOR_DATE" THEN 1
ELSE 0
END)
OVER (PARTITION BY cohort."USER_ID")
AS "GRP1_MINIMUM_DATE",
MAX (
CASE
WHEN table_2."GRP2_MINIMUM_DATE" <= cohort."ANCHOR_DATE" THEN 1
ELSE 0
END)
OVER (PARTITION BY cohort."USER_ID")
AS "GRP2_MINIMUM_DATE"
FROM INPUT_COHORT cohort
LEFT JOIN INVOLVE_EVER table_2 ON cohort."USER_ID" = table_2."USER_ID"
I was considering writing a function to accomplish this as doing so would save on space in my query. I have been reading a bit about UDF in SQL but don't yet understand if it is possible to pass a column name in as a parameter (i.e. simply switch out "GRP1_MINIMUM_DATE" for "GRP2_MINIMUM_DATE" etc.). What I would like is a query which looks like this
SELECT DISTINCT
FUNCTION(table_2."GRP1_MINIMUM_DATE") AS "GRP1_MINIMUM_DATE",
FUNCTION(table_2."GRP2_MINIMUM_DATE") AS "GRP2_MINIMUM_DATE",
FUNCTION(table_2."GRP3_MINIMUM_DATE") AS "GRP3_MINIMUM_DATE",
FUNCTION(table_2."GRP4_MINIMUM_DATE") AS "GRP4_MINIMUM_DATE"
FROM INPUT_COHORT cohort
LEFT JOIN INVOLVE_EVER table_2 ON cohort."USER_ID" = table_2."USER_ID"
Can anyone tell me if this is possible/point me to some resource that might help me out here?
Thanks!
There is no direct way to do this, as @Tejash already stated, but it looks like your database model is not ideal: it would be better to have a table that has USER_ID and GRP_ID as keys and then MINIMUM_DATE as a separate field.
Without changing the table structure, you can use an UNPIVOT query to mimic this design:
WITH INVOLVE_EVER(USER_ID, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE)
AS (SELECT 1, SYSDATE, SYSDATE, SYSDATE, SYSDATE FROM dual UNION ALL
SELECT 2, SYSDATE-1, SYSDATE-2, SYSDATE-3, SYSDATE-4 FROM dual)
SELECT *
FROM INVOLVE_EVER
unpivot ( minimum_date FOR grp_id IN ( GRP1_MINIMUM_DATE AS 1, GRP2_MINIMUM_DATE AS 2, GRP3_MINIMUM_DATE AS 3, GRP4_MINIMUM_DATE AS 4))
Result:
| USER_ID | GRP_ID | MINIMUM_DATE |
|---------|--------|--------------|
| 1 | 1 | 09/09/19 |
| 1 | 2 | 09/09/19 |
| 1 | 3 | 09/09/19 |
| 1 | 4 | 09/09/19 |
| 2 | 1 | 09/08/19 |
| 2 | 2 | 09/07/19 |
| 2 | 3 | 09/06/19 |
| 2 | 4 | 09/05/19 |
With this you can write your query without further code duplication and, if needed, use the PIVOT syntax to get one line per USER_ID.
The final query could then look like this:
WITH INVOLVE_EVER(USER_ID, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE)
AS (SELECT 1, SYSDATE, SYSDATE, SYSDATE, SYSDATE FROM dual UNION ALL
SELECT 2, SYSDATE-1, SYSDATE-2, SYSDATE-3, SYSDATE-4 FROM dual)
, INPUT_COHORT(USER_ID, ANCHOR_DATE)
AS (SELECT 1, SYSDATE-1 FROM dual UNION ALL
SELECT 2, SYSDATE-2 FROM dual UNION ALL
SELECT 3, SYSDATE-3 FROM dual)
-- Above is sample data; the query starts here:
, unpiv AS (SELECT *
FROM INVOLVE_EVER
unpivot ( minimum_date FOR grp_id IN ( GRP1_MINIMUM_DATE AS 1, GRP2_MINIMUM_DATE AS 2, GRP3_MINIMUM_DATE AS 3, GRP4_MINIMUM_DATE AS 4)))
SELECT qcsj_c000000001000000 user_id, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE
FROM INPUT_COHORT cohort
LEFT JOIN unpiv table_2
ON cohort.USER_ID = table_2.USER_ID
pivot (MAX(CASE WHEN minimum_date <= cohort."ANCHOR_DATE" THEN 1 ELSE 0 END) AS MINIMUM_DATE
FOR grp_id IN (1 AS GRP1,2 AS GRP2,3 AS GRP3,4 AS GRP4))
Result:
| USER_ID | GRP1_MINIMUM_DATE | GRP2_MINIMUM_DATE | GRP3_MINIMUM_DATE | GRP4_MINIMUM_DATE |
|---------|-------------------|-------------------|-------------------|-------------------|
| 3 | | | | |
| 1 | 0 | 0 | 0 | 0 |
| 2 | 0 | 1 | 1 | 1 |
This way you only have to write your calculation logic once (see line starting with pivot).
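To make the idea concrete outside of SQL, here is a rough Python sketch of the same unpivot-then-apply-one-rule approach. It is only an illustration under stated assumptions: TODAY is a fixed stand-in for SYSDATE, the data mirrors the sample CTEs above, and the function names unpivot and flag_groups are invented for the example.

```python
from datetime import date, timedelta

TODAY = date(2019, 9, 9)  # assumed fixed stand-in for SYSDATE
GROUP_COLUMNS = [
    "GRP1_MINIMUM_DATE",
    "GRP2_MINIMUM_DATE",
    "GRP3_MINIMUM_DATE",
    "GRP4_MINIMUM_DATE",
]

# Same sample data as the INVOLVE_EVER and INPUT_COHORT CTEs.
involve_ever = [
    {"USER_ID": 1, **{col: TODAY for col in GROUP_COLUMNS}},
    {"USER_ID": 2, **{col: TODAY - timedelta(days=i + 1) for i, col in enumerate(GROUP_COLUMNS)}},
]
input_cohort = {
    1: TODAY - timedelta(days=1),
    2: TODAY - timedelta(days=2),
    3: TODAY - timedelta(days=3),
}


def unpivot(rows):
    """Turn one wide row per user into (user_id, column, minimum_date) rows."""
    for row in rows:
        for col in GROUP_COLUMNS:
            if row.get(col) is not None:  # UNPIVOT excludes NULLs by default
                yield row["USER_ID"], col, row[col]


def flag_groups(cohort, wide_rows):
    """Apply the single '<= ANCHOR_DATE' rule to every group column."""
    flags = {user_id: {} for user_id in cohort}
    for user_id, col, minimum_date in unpivot(wide_rows):
        if user_id in cohort:
            hit = 1 if minimum_date <= cohort[user_id] else 0
            flags[user_id][col] = max(flags[user_id].get(col, 0), hit)
    return flags


flags = flag_groups(input_cohort, involve_ever)
for user_id, per_group in flags.items():
    print(user_id, per_group)
```

The comparison against the anchor date appears exactly once, in flag_groups, which is the same benefit the pivot line gives in the SQL version. User 3 ends up with no flags, matching the empty row in the result above.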

How to create a query with all of dependencies in hierarchical organization?

I've been trying hard to create a query to see all dependencies in a hierarchical organization, but the only thing I have achieved is retrieving the parent dependency. I have attached an image to show what I need.
Thanks for any clue you can give me.
This is the code I have tried with the production table.
WITH CTE AS
(SELECT
H1.systemuserid,
H1.pes_aprobadorid,
H1.yomifullname,
H1.internalemailaddress
FROM [dbo].[ext_systemuser] H1
WHERE H1.pes_aprobadorid is null
UNION ALL
SELECT
H2.systemuserid,
H2.pes_aprobadorid,
H2.yomifullname,
H2.internalemailaddress
FROM [dbo].[ext_systemuser] H2
INNER JOIN CTE c ON h2.pes_aprobadorid=c.systemuserid)
SELECT *
FROM CTE
OPTION (MAXRECURSION 1000)
You are almost there with your query. You just have to include all rows as a starting point. Also, the join should be cte.parent_id = ext.user_id and not the other way round. I've done an example query in Postgres, but you should be able to adapt it easily to your DBMS.
with recursive st_units as (
select 0 as id, NULL as pid, 'Director' as nm
union all select 1, 0, 'Department 1'
union all select 2, 0, 'Department 2'
union all select 3, 1, 'Unit 1'
union all select 4, 3, 'Unit 1.1'
),
cte AS
(
SELECT id, pid, cast(nm as text) as path, 1 as lvl
FROM st_units
UNION ALL
SELECT c.id, u.pid, cast(path || '->' || u.nm as text), lvl + 1
FROM st_units as u
INNER JOIN cte as c on c.pid = u.id
)
SELECT id, pid, path, lvl
FROM cte
ORDER BY lvl, id
id | pid | path | lvl
-: | ---: | :--------------------------------------- | --:
0 | null | Director | 1
1 | 0 | Department 1 | 1
2 | 0 | Department 2 | 1
3 | 1 | Unit 1 | 1
4 | 3 | Unit 1.1 | 1
1 | null | Department 1->Director | 2
2 | null | Department 2->Director | 2
3 | 0 | Unit 1->Department 1 | 2
4 | 1 | Unit 1.1->Unit 1 | 2
3 | null | Unit 1->Department 1->Director | 3
4 | 0 | Unit 1.1->Unit 1->Department 1 | 3
4 | null | Unit 1.1->Unit 1->Department 1->Director | 4
db<>fiddle here
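If it helps to see what the recursive part is doing, the same walk can be sketched in plain Python. This is only an illustration of the logic on the sample units above, not a replacement for the SQL. The function name ancestor_paths is made up, and the seen set is an extra guard I added on the assumption that real data might contain a cycle in the parent chain, which would otherwise make the walk run forever.

```python
# id -> (parent id, name), same rows as the st_units sample data.
units = {
    0: (None, "Director"),
    1: (0, "Department 1"),
    2: (0, "Department 2"),
    3: (1, "Unit 1"),
    4: (3, "Unit 1.1"),
}


def ancestor_paths(units):
    """Return (id, pid, path, lvl) rows: one per unit plus one per ancestor step."""
    rows = []
    for unit_id, (parent_id, name) in units.items():
        path, level = name, 1
        seen = {unit_id}  # assumption: stop if the chain loops back on itself
        rows.append((unit_id, parent_id, path, level))
        while parent_id is not None and parent_id not in seen:
            seen.add(parent_id)
            next_parent, parent_name = units[parent_id]
            path = f"{path}->{parent_name}"
            level += 1
            parent_id = next_parent
            rows.append((unit_id, parent_id, path, level))
    # Same ordering as the query: by level, then by id.
    return sorted(rows, key=lambda row: (row[3], row[0]))


paths = ancestor_paths(units)
for row in paths:
    print(row)
```

Every unit starts its own chain (that is the "include all rows as a starting point" part), and each step replaces the current parent with that parent's parent, which is what the c.pid = u.id join does.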
I've arrived at this code, which works, but when I use a hierarchy table of more than 1800 rows the query never finishes.
With cte AS
(select systemuserid, systemuserid as pes_aprobadorid, internalemailaddress, yomifullname
from #TestTable
union all
SELECT c.systemuserid, u.pes_aprobadorid, u.internalemailaddress, u.yomifullname
FROM #TestTable as u
INNER JOIN cte as c on c.pes_aprobadorid = u.systemuserid
)
select distinct * from cte
where pes_aprobadorid is not null
OPTION (MAXRECURSION 0)

To count a column based on another column's repeating(same) entry

I want to create a report of calls based on the number of weeks from the last call and on the call group.
The actual data looks like this, with call id, date of call and call group:
callid | Date | Group
----------------------------
1 | 6-1-18 | a1
2 | 6-1-18 | a2
3 | 7-1-18 | a3
4 | 8-1-18 | a1
5 | 9-1-18 | a2
6 | 9-1-18 | a4
The expected output displays the number of calls for each call group by the number of weeks from the last call:
week | |
from | |
last |Group|Group
call | a1 | a2
--------------------
1 | 2 | 2 ->number of calls
2 | - | -
3 | 1 | -
4 | 2 | -
5 | - | 3
6 | - | -
Can anyone please suggest a solution for this?
Although the data you provided is a very small set and not rich enough to cover all cases, here is SQL that calculates the number of weeks between each call and the last call within a group, and counts the number of calls for each group for each week difference.
with your_table as (
select 1 as "callid", to_date('6-1-18','mm-dd-rr') as "date", 'a1' as "group" from dual
union select 2, to_date('6-1-18','mm-dd-rr'), 'a2' from dual
union select 3, to_date('7-1-18','mm-dd-rr'), 'a3' from dual
union select 4, to_date('8-1-18','mm-dd-rr'), 'a1' from dual
union select 5, to_date('9-1-18','mm-dd-rr'), 'a2' from dual
union select 6, to_date('9-1-18','mm-dd-rr'), 'a4' from dual
),
data1 as (
select t.*, max(t."date") over (partition by t."group") last_call_dt from your_table t
),
data2 as (select t.*, round((last_call_dt-t."date")/7,0) as weeks_diff from data1 t)
select * from (
select t.weeks_diff, t."callid", t."group" from data2 t
)
PIVOT
(
COUNT("callid")
FOR "group" IN ('a1', 'a2', 'a3','a4')
)
order by weeks_diff
To try it out with your table, just make the following change:
with your_table as (select * from my_table), ....
let me know how it goes :)
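In case it is useful for checking the numbers by hand, here is a small Python sketch of the same calculation on the six rows from the question. It assumes the dates are month-day-year (so 6-1-18 is June 1, 2018), which is only a guess at the intended format, and the function name weekly_counts is invented for the example.

```python
from collections import Counter
from datetime import datetime

# (callid, date, group) rows from the question.
calls = [
    (1, "6-1-18", "a1"),
    (2, "6-1-18", "a2"),
    (3, "7-1-18", "a3"),
    (4, "8-1-18", "a1"),
    (5, "9-1-18", "a2"),
    (6, "9-1-18", "a4"),
]


def weekly_counts(rows, date_format="%m-%d-%y"):
    """Count calls per (weeks since the group's last call, group)."""
    parsed = [(datetime.strptime(d, date_format).date(), g) for _, d, g in rows]

    # Latest call date per group, like max(...) over (partition by group).
    last_call = {}
    for call_date, group in parsed:
        last_call[group] = max(last_call.get(group, call_date), call_date)

    counts = Counter()
    for call_date, group in parsed:
        # A whole number of days divided by 7 never lands exactly on .5,
        # so Python's round() and Oracle's ROUND agree here.
        weeks_diff = round((last_call[group] - call_date).days / 7)
        counts[(weeks_diff, group)] += 1
    return counts


counts = weekly_counts(calls)
for (weeks_diff, group), n in sorted(counts.items()):
    print(weeks_diff, group, n)
```

With this reading of the dates, each group has one call at week 0 (its own last call), a1 has one call 9 weeks earlier and a2 has one call 13 weeks earlier.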

Using CASE WHEN with an unknown number of categories [duplicate]

I have a table like the following:
| user_id | product_purchased |
-------------------------------
| 111 | A |
| 111 | B |
| 222 | B |
| 222 | B |
| 333 | C |
| 444 | A |
I want to pivot the table to have user ids as rows and counts of each product purchased by the user as columns.
So for the above table, this would look like:
| user_id | product A | product B | product C |
-----------------------------------------------
| 111 | 1 | 1 | 0 |
| 222 | 0 | 2 | 0 |
| 333 | 0 | 0 | 1 |
| 444 | 1 | 0 | 0 |
I know this can be done manually using countif statements:
#standardsql
select user_id,
countif(product_purchased = 'A') as product_A,
countif(product_purchased = 'B') as product_B
-- ...and so on for every other product
from `project.dataset.your_table`
group by user_id
However, in reality the table has too many possible products to make it feasible to write all of the options out manually. Is there a way to do this pivoting in a more automated and elegant way?
"in reality the table has too many possible products to make it feasible to write all of the options out manually"
Below is for BigQuery Standard SQL
You can do this in two steps. First, build the pivot query dynamically by running the query below:
#standardSQL
SELECT CONCAT('SELECT user_id, ',
STRING_AGG(
CONCAT('COUNTIF(product_purchased = "', product_purchased, '") AS product_', product_purchased)
),
' FROM `project.dataset.your_table` GROUP BY user_id')
FROM (
SELECT product_purchased
FROM `project.dataset.your_table`
GROUP BY product_purchased
)
As a result you will get a string representing the query that you need to run to get the desired result.
As an example, applied to the dummy data from your question:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 111 user_id, 'A' product_purchased UNION ALL
SELECT 111, 'B' UNION ALL
SELECT 222, 'B' UNION ALL
SELECT 222, 'B' UNION ALL
SELECT 333, 'C' UNION ALL
SELECT 444, 'A'
)
SELECT CONCAT('SELECT user_id, ',
STRING_AGG(
CONCAT('COUNTIF(product_purchased = "', product_purchased, '") AS product_', product_purchased)
),
' FROM `project.dataset.your_table` GROUP BY user_id')
FROM (
SELECT product_purchased
FROM `project.dataset.your_table`
GROUP BY product_purchased
)
you will get the query below (formatted here for readability):
SELECT
user_id,
COUNTIF(product_purchased = "A") AS product_A,
COUNTIF(product_purchased = "B") AS product_B,
COUNTIF(product_purchased = "C") AS product_C
FROM `project.dataset.your_table`
GROUP BY user_id
Now you can just run this to get the desired result without manual coding.
Again, run against the dummy data from your question:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 111 user_id, 'A' product_purchased UNION ALL
SELECT 111, 'B' UNION ALL
SELECT 222, 'B' UNION ALL
SELECT 222, 'B' UNION ALL
SELECT 333, 'C' UNION ALL
SELECT 444, 'A'
)
SELECT
user_id,
COUNTIF(product_purchased = "A") AS product_A,
COUNTIF(product_purchased = "B") AS product_B,
COUNTIF(product_purchased = "C") AS product_C
FROM `project.dataset.your_table`
GROUP BY user_id
-- ORDER BY user_id
you get the expected result:
Row user_id product_A product_B product_C
1 111 1 1 0
2 222 0 2 0
3 333 0 0 1
4 444 1 0 0
"Is there a way to do this pivoting in a more automated and elegant way?"
You can easily automate the above using any client of your choice.
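As a rough idea of what the client-side part could look like, here is a Python sketch that builds the same pivot query string from a list of product values. It deliberately stops at producing the string: fetching the distinct products and running the result is left to whatever client you use. The function name build_pivot_query and the identifier check are my own additions, not part of BigQuery.

```python
import re


def build_pivot_query(products, table="project.dataset.your_table"):
    """Build a COUNTIF-per-product pivot query for the given product values."""
    columns = []
    for product in sorted(set(products)):
        # Assumption: product values are used as part of a column alias,
        # so only letters, digits and underscores are accepted here.
        if not re.fullmatch(r"[A-Za-z0-9_]+", product):
            raise ValueError(f"unsafe product name for a column alias: {product!r}")
        columns.append(
            f'COUNTIF(product_purchased = "{product}") AS product_{product}'
        )
    return f"SELECT user_id, {', '.join(columns)} FROM `{table}` GROUP BY user_id"


# Product values as they appear in the sample table (duplicates are fine).
query = build_pivot_query(["A", "B", "B", "B", "C", "A"])
print(query)
```

Sorting the distinct products keeps the column order stable between runs, which the STRING_AGG version does not guarantee unless you add an ORDER BY inside it.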