I have this query:
select pivot_table.*
from (
Select STATUS,USER_TYPE
FROM TRANSACTIONS tr
join TRANSACTION_STATUS_CODES sc on sc.id = tr.user_type
join TRANSACTION_USER_TYPES ut on ut.id=tr.user_type
WHERE Tr.User_Type between 1 and 5
And tr.status!=1
AND Tr.Update_Date BETWEEN TO_DATE('2022-01-01 00:00:00', 'yyyy-mm-dd HH24:MI:SS')
AND TO_DATE('2022-11-13 23:59:59', 'yyyy-mm-dd HH24:MI:SS')
) t
pivot(
count(user_type)
FOR user_type IN (1,2,3,5)
) pivot_table;
Which gives:
status   1   2   3   5
2        3   0   0   0
4       13   0   0   0
5        1   0   0   0
3        5   0   0   1
0        4   0   0   8
Wanted result:
status                    1   2   3   5   total
2                         3   0   0   0   3
4                        13   0   0   0   13
5                         1   0   0   0   1
3                         5   0   0   1   6
0                         4   0   0   8   12
sum of statuses 2,4,5    17   0   0   0   17
sum of all statuses      26   0   0   0   35
I have tried adding:
Select STATUS,USER_TYPE,
count(user_type) as records,
sum(user_type) over (partition by status) as total
and in the end:
pivot ( sum (records) for user_type in (1,2,3,5)) pivot_table
but logically I am still not there.
You want to use GROUP BY CUBE with conditional aggregation and filters on the grouping sets:
SELECT CASE GROUPING_ID(status, CASE WHEN status IN (2, 4, 5) THEN 1 ELSE 0 END)
WHEN 0
THEN TO_CHAR(status)
WHEN 2
THEN 'SUB-TOTAL'
ELSE 'TOTAL'
END AS status,
COUNT(CASE user_type WHEN 1 THEN 1 END) AS "1",
COUNT(CASE user_type WHEN 2 THEN 1 END) AS "2",
COUNT(CASE user_type WHEN 3 THEN 1 END) AS "3",
COUNT(CASE user_type WHEN 5 THEN 1 END) AS "5",
COUNT(*) AS total
FROM table_name
GROUP BY CUBE(status, CASE WHEN status IN (2, 4, 5) THEN 1 ELSE 0 END)
HAVING GROUPING_ID(status, CASE WHEN status IN (2, 4, 5) THEN 1 ELSE 0 END) IN (0, 3)
OR ( GROUPING_ID(status, CASE WHEN status IN (2, 4, 5) THEN 1 ELSE 0 END) = 2
AND CASE WHEN status IN (2, 4, 5) THEN 1 ELSE 0 END = 1 )
Which, for the sample data (representing the output of your many joined tables):
CREATE TABLE table_name (status, user_type) AS
SELECT 2, 1 FROM DUAL CONNECT BY LEVEL <= 3 UNION ALL
SELECT 4, 1 FROM DUAL CONNECT BY LEVEL <= 13 UNION ALL
SELECT 5, 1 FROM DUAL CONNECT BY LEVEL <= 1 UNION ALL
SELECT 3, 1 FROM DUAL CONNECT BY LEVEL <= 5 UNION ALL
SELECT 3, 5 FROM DUAL CONNECT BY LEVEL <= 1 UNION ALL
SELECT 0, 1 FROM DUAL CONNECT BY LEVEL <= 4 UNION ALL
SELECT 0, 5 FROM DUAL CONNECT BY LEVEL <= 8;
Outputs:
STATUS      1   2   3   5   TOTAL
0           4   0   0   8   12
3           5   0   0   1   6
2           3   0   0   0   3
4          13   0   0   0   13
5           1   0   0   0   1
SUB-TOTAL  17   0   0   0   17
TOTAL      26   0   0   9   35
You can change the string literals to match the row titles for your desired output.
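As a cross-check of the expected rows, here is a minimal Python sketch using the standard-library sqlite3 module. SQLite has no GROUP BY CUBE, so this deliberately swaps in three separate grouped queries (per status, statuses 2/4/5 only, and everything) instead of reproducing the Oracle query. The table name and labels come from the sample above; the `COUNTS` helper string is only an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (status INTEGER, user_type INTEGER)")

# Same sample data as above: (status, user_type, number of identical rows)
sample = [(2, 1, 3), (4, 1, 13), (5, 1, 1), (3, 1, 5), (3, 5, 1), (0, 1, 4), (0, 5, 8)]
for status, user_type, copies in sample:
    conn.executemany("INSERT INTO table_name VALUES (?, ?)", [(status, user_type)] * copies)

# Conditional aggregation: in SQLite a comparison evaluates to 1 or 0, so SUM counts matches.
COUNTS = ("SUM(user_type = 1), SUM(user_type = 2), "
          "SUM(user_type = 3), SUM(user_type = 5), COUNT(*)")

per_status = conn.execute(
    f"SELECT CAST(status AS TEXT), {COUNTS} FROM table_name GROUP BY status ORDER BY status"
).fetchall()
sub_total = conn.execute(
    f"SELECT 'SUB-TOTAL', {COUNTS} FROM table_name WHERE status IN (2, 4, 5)"
).fetchall()
grand_total = conn.execute(f"SELECT 'TOTAL', {COUNTS} FROM table_name").fetchall()

rows = per_status + sub_total + grand_total
for row in rows:
    print(row)
```

The last two rows printed should agree with the SUB-TOTAL and TOTAL rows of the output above; the per-status rows come out sorted by status rather than in the order Oracle happened to return them.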
One option is to use the MODEL clause, like this:
Select
*
From
(
Select Cast(p.STATUS as VARCHAR2(30)) "STATUS", p."1", p."2", p."3", p."5"
From pivot_table p Union All
Select 'Sum of Statuses 2, 4, 5', Null, Null, Null, Null From Dual Union All
Select 'Sum of All Statuses', Null, Null, Null, Null From Dual
)
MODEL
Dimension By (STATUS)
Measures ("1", "2", "3", "5", 0 "TOTAL")
Rules
(
"1"['Sum of All Statuses'] = Sum("1")[ANY], -- Grand Total of Column 1 = Sum(1) for ANY Status (Dimension)
"2"['Sum of All Statuses'] = Sum("2")[ANY],
"3"['Sum of All Statuses'] = Sum("3")[ANY],
"5"['Sum of All Statuses'] = Sum("5")[ANY],
--
"1"['Sum of Statuses 2, 4, 5'] = Sum("1")[STATUS IN('2', '4', '5')], -- SubTotal of Column 1 = Sum(1) for Status IN 2, 4, 5
"2"['Sum of Statuses 2, 4, 5'] = Sum("2")[STATUS IN('2', '4', '5')],
"3"['Sum of Statuses 2, 4, 5'] = Sum("3")[STATUS IN('2', '4', '5')],
"5"['Sum of Statuses 2, 4, 5'] = Sum("5")[STATUS IN('2', '4', '5')],
--
TOTAL[ANY] = "1"[CV()] + "2"[CV()] + "3"[CV()] + "5"[CV()] -- for each row do the addition of columns with the same Current Value 'CV()' of Status
)
With the data from your question:
WITH
pivot_table AS
(
Select 2 "STATUS", 3 "1", 0 "2", 0 "3", 0 "5" From Dual Union All
Select 4 "STATUS", 13 "1", 0 "2", 0 "3", 0 "5" From Dual Union All
Select 5 "STATUS", 1 "1", 0 "2", 0 "3", 0 "5" From Dual Union All
Select 3 "STATUS", 5 "1", 0 "2", 0 "3", 1 "5" From Dual Union All
Select 0 "STATUS", 4 "1", 0 "2", 0 "3", 8 "5" From Dual
)
The result should be:
STATUS                    1   2   3   5   TOTAL
2                         3   0   0   0   3
4                        13   0   0   0   13
5                         1   0   0   0   1
3                         5   0   0   1   6
0                         4   0   0   8   12
Sum of Statuses 2, 4, 5  17   0   0   0   17
Sum of All Statuses      26   0   0   9   35
I need help to do a count based on a date condition.
I have a DB similar to the following:
MainDB
ID  report_date  traffic_v  traffic_ul  traffic_dl
a   1/12/2021    0          0           100
a   2/12/2021    0          0           100
a   3/12/2021    100        0           100
a   4/12/2021    100        0           100
b   1/12/2021    0          100         100
b   2/12/2021    0          0           0
b   3/12/2021    0          100         0
b   4/12/2021    100        100         0
I need to find the records whose traffic is zero, for which I have this query:
SELECT
ID AS SECTOR,
SUM(TRAFFIC) TRAFICO_VOZ,
SUM(TRAFFIC_DL_G) + SUM(TRAFFIC_DL_E) TRAFFIC_DL,
SUM(TRAFFIC_UL_G) + SUM(TRAFFIC_UL_E) TRAFFIC_UL
FROM
MainDB
GROUP BY ID
HAVING SUM(TRAFFIC) = 0
OR (SUM(TRAFFIC_DL_G) + SUM(TRAFFIC_DL_E)) = 0
OR (SUM(TRAFFIC_UL_G) + SUM(TRAFFIC_UL_E)) = 0
But I also need to count, going backwards from the current date, how many consecutive days the value has been zero.
Only the most recent run of zeros should be counted.
So you should get the following result:
Expected result
ID  traffic_v  count_v  traffic_ul  count_ul  traffic_dl  count_dl
a   200        0        0           4         400         0
b   100        0        200         0         0           3
I do not know how to write the condition so that it detects the date on which the zero records began and counts the days from then until the current date.
Whenever a record is non-zero, the count must restart.
The database is updated daily.
The sums are displayed correctly by the query above, since I only care about the zero data.
I tried SUM with CASE, but it counts from the earliest date that has a zero, regardless of any non-zero records in between.
You can use a MODEL clause:
SELECT id,
count_traffic_v,
sum_traffic_v,
count_traffic_ul,
sum_traffic_ul,
count_traffic_dl,
sum_traffic_dl
FROM (
SELECT *
FROM (
SELECT m.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY report_date DESC) AS rn
FROM mainDB m
)
MODEL
PARTITION BY (id)
DIMENSION BY (report_date)
MEASURES (
rn,
traffic_v,
0 AS count_traffic_v,
0 AS sum_traffic_v,
traffic_ul,
0 AS count_traffic_ul,
0 AS sum_traffic_ul,
traffic_dl,
0 AS count_traffic_dl,
0 AS sum_traffic_dl
)
RULES AUTOMATIC ORDER (
count_traffic_v[report_date] = CASE traffic_v[cv()]
WHEN 0
THEN COALESCE(count_traffic_v[cv() - 1] + 1, 1)
ELSE 0
END,
sum_traffic_v[report_date] = CASE traffic_v[cv()]
WHEN 0
THEN 0
ELSE COALESCE(sum_traffic_v[cv() - 1], 0) + traffic_v[cv()]
END,
count_traffic_ul[report_date] = CASE traffic_ul[cv()]
WHEN 0
THEN COALESCE(count_traffic_ul[cv() - 1] + 1, 1)
ELSE 0
END,
sum_traffic_ul[report_date] = CASE traffic_ul[cv()]
WHEN 0
THEN 0
ELSE COALESCE(sum_traffic_ul[cv() - 1], 0) + traffic_ul[cv()]
END,
count_traffic_dl[report_date] = CASE traffic_dl[cv()]
WHEN 0
THEN COALESCE(count_traffic_dl[cv() - 1] + 1, 1)
ELSE 0
END,
sum_traffic_dl[report_date] = CASE traffic_dl[cv()]
WHEN 0
THEN 0
ELSE COALESCE(sum_traffic_dl[cv() - 1], 0) + traffic_dl[cv()]
END
)
)
WHERE rn = 1;
Which, for the sample data:
CREATE TABLE maindb (ID, report_date, traffic_v, traffic_ul, traffic_dl) AS
SELECT 'a', DATE '2021-12-01', 0, 0, 100 FROM DUAL UNION ALL
SELECT 'a', DATE '2021-12-02', 0, 0, 100 FROM DUAL UNION ALL
SELECT 'a', DATE '2021-12-03', 100, 0, 100 FROM DUAL UNION ALL
SELECT 'a', DATE '2021-12-04', 100, 0, 100 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-01', 0, 100, 100 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-02', 0, 0, 0 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-03', 0, 100, 0 FROM DUAL UNION ALL
SELECT 'b', DATE '2021-12-04', 100, 100, 0 FROM DUAL;
Outputs:
ID  COUNT_TRAFFIC_V  SUM_TRAFFIC_V  COUNT_TRAFFIC_UL  SUM_TRAFFIC_UL  COUNT_TRAFFIC_DL  SUM_TRAFFIC_DL
a   0                200            4                 0               0                 400
b   0                100            0                 200             3                 0
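To make the reset logic of those rules easier to follow, here is a rough pure-Python sketch of the same idea: walk each ID's rows from oldest to newest, and for every measure either extend the zero streak (resetting the running sum) or reset the streak (extending the running sum). It assumes exactly one row per ID per day with no gaps; the names `ROWS`, `MEASURES` and `summarize` are made up for this illustration.

```python
from datetime import date

# Sample rows from the question: (id, report_date, traffic_v, traffic_ul, traffic_dl)
ROWS = [
    ("a", date(2021, 12, 1), 0, 0, 100),
    ("a", date(2021, 12, 2), 0, 0, 100),
    ("a", date(2021, 12, 3), 100, 0, 100),
    ("a", date(2021, 12, 4), 100, 0, 100),
    ("b", date(2021, 12, 1), 0, 100, 100),
    ("b", date(2021, 12, 2), 0, 0, 0),
    ("b", date(2021, 12, 3), 0, 100, 0),
    ("b", date(2021, 12, 4), 100, 100, 0),
]
MEASURES = ("traffic_v", "traffic_ul", "traffic_dl")


def summarize(rows):
    """Return {id: {measure: (zero_streak, sum_since_last_zero)}} as of the latest row."""
    state = {}
    # Process each id's rows in chronological order (assumes no missing days).
    for row_id, _day, *values in sorted(rows, key=lambda r: (r[0], r[1])):
        per_id = state.setdefault(row_id, {name: (0, 0) for name in MEASURES})
        for name, value in zip(MEASURES, values):
            count, total = per_id[name]
            if value == 0:
                per_id[name] = (count + 1, 0)      # extend streak, reset sum
            else:
                per_id[name] = (0, total + value)  # reset streak, extend sum
    return state


summary = summarize(ROWS)
for row_id, per_id in summary.items():
    print(row_id, per_id)
```

Each tuple is (count, sum) for the latest date, which corresponds to the `rn = 1` row kept by the query above.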
I have a table that looks like below:
ID|Date |X| Flag |
1 |1/1/16|2| 0
2 |1/1/16|0| 0
3 |1/1/16|0| 0
1 |2/1/16|0| 0
2 |2/1/16|1| 0
3 |2/1/16|2| 0
1 |3/1/16|2| 0
2 |3/1/16|1| 0
3 |3/1/16|2| 0
I'm trying to make it so that flag is populated if X=2 in the PREVIOUS month. As such, it should look like this:
ID|Date |X| Flag |
1 |1/1/16|2| 0
2 |1/1/16|0| 0
3 |1/1/16|0| 0
1 |2/1/16|2| 1
2 |2/1/16|1| 0
3 |2/1/16|2| 0
1 |3/1/16|2| 1
2 |3/1/16|1| 0
3 |3/1/16|2| 1
I use this in SQL:
-- Copy the source rows into a working table
SELECT ID, date, X, flag INTO Work_Table FROM t;

-- Add the previous month's X for each ID
SELECT ID, date, X, flag,
       LAG(X) OVER (PARTITION BY ID ORDER BY date ASC) AS Prev
INTO Flag_table
FROM Work_Table;

-- Flag rows whose previous month had X = 2
UPDATE [dbo].[Flag_table]
SET flag = 1
WHERE prev = '2';

-- Copy the flag back to the original table
UPDATE t
SET t.flag = [dbo].[Flag_table].flag
FROM t
JOIN [dbo].[Flag_table]
  ON t.ID = [dbo].[Flag_table].ID AND t.date = [dbo].[Flag_table].date;
However, I cannot do this in BigQuery. Any ideas?
Below is for BigQuery Standard SQL
#standardSQL
SELECT id, dt, x,
IF(LAG(x = 2) OVER(PARTITION BY id ORDER BY dt), 1, 0) flag
FROM `project.dataset.work_table`
You can test / play with it using dummy data from your question as
#standardSQL
WITH `project.dataset.work_table` AS (
SELECT 1 id, '1/1/16' dt, 2 x, 0 flag UNION ALL
SELECT 2, '1/1/16', 0, 0 UNION ALL
SELECT 3, '1/1/16', 0, 0 UNION ALL
SELECT 1, '2/1/16', 0, 0 UNION ALL
SELECT 2, '2/1/16', 1, 0 UNION ALL
SELECT 3, '2/1/16', 2, 0 UNION ALL
SELECT 1, '3/1/16', 2, 0 UNION ALL
SELECT 2, '3/1/16', 1, 0 UNION ALL
SELECT 3, '3/1/16', 2, 0
)
SELECT id, dt, x,
IF(LAG(x = 2) OVER(PARTITION BY id ORDER BY dt), 1, 0) flag
FROM `project.dataset.work_table`
ORDER BY dt, id
with result as
Row id dt x flag
1 1 1/1/16 2 0
2 2 1/1/16 0 0
3 3 1/1/16 0 0
4 1 2/1/16 0 1
5 2 2/1/16 1 0
6 3 2/1/16 2 0
7 1 3/1/16 2 0
8 2 3/1/16 1 0
9 3 3/1/16 2 1
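If you want to try the LAG idea without a BigQuery project, the following is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). It is not BigQuery syntax: `IF(...)` is replaced by `COALESCE`, and the dates are stored in ISO form, on the assumption that the question's `1/1/16` style values mean month/day/year. ISO strings sort chronologically, which `'1/1/16'` style strings would stop doing once month 10 appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_table (id INTEGER, dt TEXT, x INTEGER)")
conn.executemany(
    "INSERT INTO work_table VALUES (?, ?, ?)",
    [
        (1, "2016-01-01", 2), (2, "2016-01-01", 0), (3, "2016-01-01", 0),
        (1, "2016-02-01", 0), (2, "2016-02-01", 1), (3, "2016-02-01", 2),
        (1, "2016-03-01", 2), (2, "2016-03-01", 1), (3, "2016-03-01", 2),
    ],
)

# LAG(x = 2) looks at the previous row of the same id; it is NULL for the
# first row of each id, which COALESCE turns into 0.
rows = conn.execute(
    """
    SELECT id, dt, x,
           COALESCE(LAG(x = 2) OVER (PARTITION BY id ORDER BY dt), 0) AS flag
    FROM work_table
    ORDER BY dt, id
    """
).fetchall()

for row in rows:
    print(row)
```

The flag column should line up with the result shown above: only id 1 in February and id 3 in March are flagged.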
I have a query which returns many columns which are either 1 or 0 depending on a user's interaction with many points of a website. My data looks like this:
UserID Variable_1 Variable_2 Variable_3 Variable_4 Variable_5
User 1 1 0 1 0 0
User 2 0 0 1 0 0
User 3 0 0 0 0 1
User 4 0 1 1 1 1
User 5 1 0 0 0 1
Each variable is defined with its own line of code like:
MAX(IF(LOWER(hits_product.productbrand) LIKE "Variable_1",1,0)) AS Variable_1,
I'd like to have one column that sums up all the variables per user, which looks like this:
UserID Total Variable_1 Variable_2 Variable_3 Variable_4 Variable_5
User 1 2 1 0 1 0 0
User 2 3 1 1 1 0 0
User 3 0 0 0 0 0 0
User 4 5 1 1 1 1 1
User 5 3 1 0 1 0 1
What is the most elegant way to achieve this?
Even though a simple COUNT(DISTINCT) happens to suffice for the OP's particular case, I still wanted to answer the original question: how to sum all numeric columns into one Total without depending on the number or names of those columns.
Below is for BigQuery Standard SQL
#standardSQL
SELECT
UserID,
( SELECT SUM(CAST(value AS INT64))
FROM UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r':(\d+),?')) value
) Total,
* EXCEPT(UserID)
FROM t
This can be tested / played with using dummy data from question
#standardSQL
WITH t AS (
SELECT 'User 1' UserID, 1 Variable_1, 0 Variable_2, 1 Variable_3, 0 Variable_4, 0 Variable_5 UNION ALL
SELECT 'User 2', 1, 1, 1, 0, 0 UNION ALL
SELECT 'User 3', 0, 0, 0, 0, 0 UNION ALL
SELECT 'User 4', 1, 1, 1, 1, 1 UNION ALL
SELECT 'User 5', 1, 0, 1, 0, 1
)
SELECT
UserID,
( SELECT SUM(CAST(value AS INT64))
FROM UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r':(\d+),?')) value
) Total,
* EXCEPT(UserID)
FROM t
ORDER BY UserID
result is
Row UserID Total Variable_1 Variable_2 Variable_3 Variable_4 Variable_5
1 User 1 2 1 0 1 0 0
2 User 2 3 1 1 1 0 0
3 User 3 0 0 0 0 0 0
4 User 4 5 1 1 1 1 1
5 User 5 3 1 0 1 0 1
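The trick here is to serialize the whole row to JSON and pull out every bare integer value with a regular expression. A rough Python analogue of that idea is sketched below; it assumes the JSON is compact (no space after the colon), which is what the regex expects, and `row_total` is just a name chosen for this illustration.

```python
import json
import re

# Dummy data from the question, one dict per row
rows = [
    {"UserID": "User 1", "Variable_1": 1, "Variable_2": 0, "Variable_3": 1, "Variable_4": 0, "Variable_5": 0},
    {"UserID": "User 2", "Variable_1": 1, "Variable_2": 1, "Variable_3": 1, "Variable_4": 0, "Variable_5": 0},
    {"UserID": "User 3", "Variable_1": 0, "Variable_2": 0, "Variable_3": 0, "Variable_4": 0, "Variable_5": 0},
    {"UserID": "User 4", "Variable_1": 1, "Variable_2": 1, "Variable_3": 1, "Variable_4": 1, "Variable_5": 1},
    {"UserID": "User 5", "Variable_1": 1, "Variable_2": 0, "Variable_3": 1, "Variable_4": 0, "Variable_5": 1},
]


def row_total(row):
    """Sum every unquoted integer value found in the row's compact JSON form."""
    # Compact separators: json.dumps would otherwise emit ': ' and the regex would miss it.
    text = json.dumps(row, separators=(",", ":"))
    # String values are quoted ("UserID":"User 1"), so ':' is followed by '"' and is skipped.
    return sum(int(value) for value in re.findall(r":(\d+),?", text))


totals = {row["UserID"]: row_total(row) for row in rows}
print(totals)
```

One caveat worth keeping in mind with this approach: a string value that itself contains a colon followed by digits (for example `"User:1"`) would also be matched, so it is only safe when the non-numeric columns cannot look like that.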
A simple method uses a subquery or CTE:
select t.*, (v1 + v2 + v3 . . . ) as total
from (<your query here>
) t;
Not knowing what the data looks like, it is quite possible that count(distinct hits_product.productbrand) would also do the trick.
How about folding the separate variable columns into one repeated 'variables' column of key/value records, where the key is the variable name and the value is a number? That can greatly simplify the calculation.
I have three tables as shown below.
TABLE1 : tb_subject
subject_id subject_name
1 English
2 Maths
3 Science
Table2 : tb_student
subject_id student_id
1 AA
1 BB
2 CC
3 DD
3 EE
Table3 : tb_student_score
student_id score conducted_month_number
AA 20 2
BB 30 3
CC 50 4
AA 80 4
DD 50 6
BB 10 2
EE 40 3
Result should be
conducted_month_number SUM(subject_id1) SUM(subject_id2) SUM(subject_id3)
1 0 0 0
2 30 0 0
3 30 0 40
4 80 50 0
5 0 0 0
6 0 0 60
7 0 0 0
8 0 0 0
9 0 0 0
10 0 0 0
11 0 0 0
12 0 0 0
How do I write a select query for this? Can the month numbers that are not stored in the table also be included, as in the output above?
You should be able to use case when to sum for each subject individually:
SELECT conducted_month_number,
SUM(CASE b.subject_id WHEN 1 THEN a.score ELSE 0 END) AS English,
SUM(CASE b.subject_id WHEN 2 THEN a.score ELSE 0 END) AS Maths,
SUM(CASE b.subject_id WHEN 3 THEN a.score ELSE 0 END) AS Science
FROM tb_student_score AS a
JOIN tb_student AS b ON b.student_id = a.student_id
GROUP BY conducted_month_number
ORDER BY conducted_month_number;
However, this alone will not ensure you have results for values of conducted_month_number that don't exist - if this is an issue, you could simply create a dummy student with a score of 0 for each month.
Edit: I noticed some comments posted around the same time I submitted my answer - if you want the number of summation columns to be variable based on the values of rows in the tb_subject table, you will not find the relational model of SQL to be well suited for that task. However, you can easily go back and update your query to include any new subjects you may add later on.
I added dummy rows for months 1 to 12 using UNION, then grouped by month to calculate the total scores.
Try this:
Select conducted_month_number ,
sum(case when subject_id=1 then score else 0 end) as sum_subject_id1,
sum(case when subject_id=2 then score else 0 end) as sum_subject_id2,
sum(case when subject_id=3 then score else 0 end) as sum_subject_id3
from
(
Select a.conducted_month_number ,subject_id,score
from
tb_student_score a
inner join
tb_student b
on a.student_id=b.student_id
union
select 1,' ',0 from tb_student_score
union
select 2,' ',0 from tb_student_score
union
select 3,' ',0 from tb_student_score
union
select 4,' ',0 from tb_student_score
union
select 5,' ',0 from tb_student_score
union
select 6,' ',0 from tb_student_score
union
select 7,' ',0 from tb_student_score
union
select 8,' ',0 from tb_student_score
union
select 9,' ',0 from tb_student_score
union
select 10,' ',0 from tb_student_score
union
select 11,' ',0 from tb_student_score
union
select 12,' ',0 from tb_student_score
)a
group by conducted_month_number
My Output
conducted_month_number sum_subject_id1 sum_subject_id2 sum_subject_id3
1 0 0 0
2 30 0 0
3 30 0 40
4 80 50 0
5 0 0 0
6 0 0 50
7 0 0 0
8 0 0 0
9 0 0 0
10 0 0 0
11 0 0 0
12 0 0 0
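An alternative to listing twelve UNION branches is to generate the month numbers and LEFT JOIN the scores onto them. The sketch below tries that with Python's built-in sqlite3 module and a recursive CTE; this is a swapped-in technique, not what either answer above uses, and whether your database supports `WITH RECURSIVE` is an assumption you would need to check. The CTE name `months` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tb_student (subject_id INTEGER, student_id TEXT);
    CREATE TABLE tb_student_score (student_id TEXT, score INTEGER, conducted_month_number INTEGER);
    INSERT INTO tb_student VALUES (1,'AA'),(1,'BB'),(2,'CC'),(3,'DD'),(3,'EE');
    INSERT INTO tb_student_score VALUES
        ('AA',20,2),('BB',30,3),('CC',50,4),('AA',80,4),('DD',50,6),('BB',10,2),('EE',40,3);
    """
)

query = """
WITH RECURSIVE months(n) AS (
    SELECT 1 UNION ALL SELECT n + 1 FROM months WHERE n < 12
)
SELECT m.n AS conducted_month_number,
       COALESCE(SUM(CASE st.subject_id WHEN 1 THEN sc.score END), 0) AS sum_subject_id1,
       COALESCE(SUM(CASE st.subject_id WHEN 2 THEN sc.score END), 0) AS sum_subject_id2,
       COALESCE(SUM(CASE st.subject_id WHEN 3 THEN sc.score END), 0) AS sum_subject_id3
FROM months AS m
LEFT JOIN tb_student_score AS sc ON sc.conducted_month_number = m.n
LEFT JOIN tb_student AS st ON st.student_id = sc.student_id
GROUP BY m.n
ORDER BY m.n
"""
# Months without any score still appear, because months is the left side of the joins.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample tables this should produce one row for each of the twelve months, with zeros wherever no score was recorded.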
I have data in the below format
g_name amt flag
g1 0 0
g1 0 0
g1 10 1
g1 0 0
g1 15 2
g1 0 0
and I would require in the below format
n1 should take the amt from the row where flag hits 1 and retain it until the end; similarly, n2 should take the amt from the row where flag hits 2 and retain it until the end. Please help me do this with window functions, without needing joins.
g_name amt flag n1 n2
g1 0 0 0 0
g1 0 0 0 0
g1 10 1 10 0
g1 0 0 10 0
g1 15 2 10 15
g1 0 0 10 15
I added a column for ordering - change as needed. I also added a few more rows with a different g_name, presumably this must be done "by g_name".
This is a good test case for the first_value() analytic function. It has the ability to ignore nulls - so we make the amt NULL when flag is not 1 (or 2, etc.) and then apply first_value() with the proper PARTITION BY and ORDER BY clauses.
with
test_data ( id, g_name, amt, flag ) as (
select 1, 'g1', 0, 0 from dual union all
select 2, 'g1', 0, 0 from dual union all
select 3, 'g1', 10, 1 from dual union all
select 4, 'g1', 0, 0 from dual union all
select 5, 'g1', 15, 2 from dual union all
select 6, 'g1', 0, 0 from dual union all
select 1, 'g2', 0, 0 from dual union all
select 2, 'g2', 4, 1 from dual union all
select 3, 'g2', 3, 2 from dual union all
select 4, 'g2', 0, 0 from dual
)
-- end of test data; solution (SQL query) begins below this line
select id, g_name, amt, flag,
coalesce (first_value(case when flag = 1 then amt end ignore nulls)
over (partition by g_name order by id), 0) as n1,
coalesce (first_value(case when flag = 2 then amt end ignore nulls)
over (partition by g_name order by id), 0) as n2
from test_data
order by g_name, id
;
ID G_NAME AMT FLAG N1 N2
--- ------ ---------- ---------- ---------- ----------
1 g1 0 0 0 0
2 g1 0 0 0 0
3 g1 10 1 10 0
4 g1 0 0 10 0
5 g1 15 2 10 15
6 g1 0 0 10 15
1 g2 0 0 0 0
2 g2 4 1 4 0
3 g2 3 2 4 3
4 g2 0 0 4 3
SQL tables represent unordered sets. There is no ordering, unless a column specifies that ordering. Let me assume that such a column exists.
If so, you can do this with analytic functions:
select t.*,
max(case when flag = 1 then amt else 0 end) over (order by ??) as n1,
max(case when flag = 2 then amt else 0 end) over (order by ??) as n2
from t;
The ?? specifies the ordering.
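For a quick check of this running-MAX variant, here is a sketch with Python's built-in sqlite3 module (SQLite 3.25+ for window functions). It assumes an `id` column supplies the ordering and adds `PARTITION BY g_name`, neither of which is in the query above; the rows are the test data from the first_value answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, g_name TEXT, amt INTEGER, flag INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "g1", 0, 0), (2, "g1", 0, 0), (3, "g1", 10, 1),
        (4, "g1", 0, 0), (5, "g1", 15, 2), (6, "g1", 0, 0),
        (1, "g2", 0, 0), (2, "g2", 4, 1), (3, "g2", 3, 2), (4, "g2", 0, 0),
    ],
)

# With ORDER BY in the window, the default frame runs from the first row of the
# partition up to the current row, so MAX acts as a "carry forward" once set.
rows = conn.execute(
    """
    SELECT id, g_name, amt, flag,
           MAX(CASE WHEN flag = 1 THEN amt ELSE 0 END)
               OVER (PARTITION BY g_name ORDER BY id) AS n1,
           MAX(CASE WHEN flag = 2 THEN amt ELSE 0 END)
               OVER (PARTITION BY g_name ORDER BY id) AS n2
    FROM t
    ORDER BY g_name, id
    """
).fetchall()

for row in rows:
    print(row)
```

Note that a running MAX only matches the first_value approach when each flag value occurs at most once per group and amt is non-negative; with several flag = 1 rows it keeps the largest amt rather than the first.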