Inserting closing rows into a table while keeping the date column - SQL

Following this question.
My table
id sum type date
1 3 -1 2017-02-02
1 6 -1 2017-02-04
1 -6 2 2017-02-01
1 -3 1 2017-02-09
1 3 -1 2017-02-17
1 6 -1 2017-02-05
This query finds the users who meet the conditions and, for each of them, returns as many rows as the occurrences value, with some columns modified.
with t as(
select id
, -abs (sum) as sum
, sum (case when type = -1 then 1 else -1 end) as occurrences
--, collect_list(date) as time_col
from table
group by id, abs(sum)
having sum (case when type = -1 then 1 else -1 end) > 15
)
select t.id
, t.sum
, 2 as type
from t
lateral view explode (split (space (cast (occurrences as int) - 1),' ')) e
-- lateral view explode(time_col) time_table as time_key;
The problem is that I need every row to carry one date from the list. I tried adding , collect_list(date) as time_col and then
lateral view explode(time_col) time_table as time_key;
but this just returned all possible combinations. I could probably use a join (would that work?), but I wonder whether that is really necessary.
In the end these rows
1 3 -1 2017-02-17
1 6 -1 2017-02-05
would transform into
1 -3 2 2017-02-17
1 -6 2 2017-02-05

select val_id
,-val_sum as val_sum
,2 as val_type
,val_date
from (select val_id
,val_sum
,val_type
,val_date
,sum (case when val_type = -1 then 1 else -1 end) over
(
partition by val_id,-abs (val_sum)
) as occurrences
,row_number () over
(
partition by val_id,val_sum
order by val_date desc
) as rn
from mytable
) t
where val_type = -1
and rn <= occurrences
and occurrences > 15
;
Execution results (without and occurrences > 15)
+--------+---------+----------+------------+
| val_id | val_sum | val_type | val_date |
+--------+---------+----------+------------+
| 1 | -3 | 2 | 2017-02-17 |
+--------+---------+----------+------------+
| 1 | -6 | 2 | 2017-02-05 |
+--------+---------+----------+------------+
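
The window-function approach above can be tried outside Hive. The following is only a sketch: it assumes SQLite 3.25 or later (for window functions) accessed through Python's sqlite3 module, reuses the sample rows from the question, and replaces the threshold of 15 with a hypothetical MIN_OCCURRENCES of 0 so that the small sample produces output.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (val_id INTEGER, val_sum INTEGER, val_type INTEGER, val_date TEXT);
INSERT INTO mytable VALUES
    (1,  3, -1, '2017-02-02'),
    (1,  6, -1, '2017-02-04'),
    (1, -6,  2, '2017-02-01'),
    (1, -3,  1, '2017-02-09'),
    (1,  3, -1, '2017-02-17'),
    (1,  6, -1, '2017-02-05');
""")

MIN_OCCURRENCES = 0  # hypothetical stand-in for the 15 used in the question

closing_rows = conn.execute("""
SELECT val_id, -val_sum AS val_sum, 2 AS val_type, val_date
FROM (SELECT val_id, val_sum, val_type, val_date,
             -- net number of unmatched type -1 rows per (id, |sum|) group
             SUM(CASE WHEN val_type = -1 THEN 1 ELSE -1 END)
                 OVER (PARTITION BY val_id, ABS(val_sum)) AS occurrences,
             -- newest rows first within each (id, sum) group
             ROW_NUMBER()
                 OVER (PARTITION BY val_id, val_sum ORDER BY val_date DESC) AS rn
      FROM mytable) AS t
WHERE val_type = -1
  AND rn <= occurrences
  AND occurrences > ?
ORDER BY val_date DESC
""", (MIN_OCCURRENCES,)).fetchall()

for row in closing_rows:
    print(row)
```

Because each output row comes from a real input row, the date stays attached to it and no explode or join is needed.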

Related

SQL COUNT of zero values in multiple columns

I am calculating how many zeros appear in a series of columns for each ID.
Example Table:
ID hour1 hour2 hour3
1 2 10 0
2 0 0 0
3 0 24 0
I think it would look something like this, but obviously it doesn't work
SELECT ID, COUNT(CASE WHEN(
FROM (VALUES (hour1) , (hour2) , (hour3))
AS VALUE (v)) AS ZERO_HOURS
Desired output:
ID ZERO_HOURS
1 1
2 3
3 2
One method is:
select t.id, h.num_zeros
from t cross apply
(select count(*) as num_zeros
from (values (hour1), (hour2), (hour3)) v(h)
where h = 0
) h;
Of course a case expression is not so hard either:
select t.id,
       (case when hour1 = 0 then 1 else 0 end +
        case when hour2 = 0 then 1 else 0 end +
        case when hour3 = 0 then 1 else 0 end
       ) as num_zeros
from t;
Or, if there are no negative or NULL values:
select t.id,
       (1 - sign(hour1)) + (1 - sign(hour2)) + (1 - sign(hour3)) as num_zeros
from t;
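The case-expression variant is portable enough to check quickly. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and a table named t (the name used in the answer) filled with the sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, hour1 INTEGER, hour2 INTEGER, hour3 INTEGER);
INSERT INTO t (id, hour1, hour2, hour3) VALUES
    (1, 2, 10, 0),
    (2, 0,  0, 0),
    (3, 0, 24, 0);
""")

# One CASE per column: each contributes 1 when the column is zero.
zero_counts = conn.execute("""
SELECT id,
       (CASE WHEN hour1 = 0 THEN 1 ELSE 0 END +
        CASE WHEN hour2 = 0 THEN 1 ELSE 0 END +
        CASE WHEN hour3 = 0 THEN 1 ELSE 0 END) AS zero_hours
FROM t
ORDER BY id
""").fetchall()

print(zero_counts)
```

A NULL hour simply contributes 0 here, because the comparison is not true and the ELSE branch applies.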
Please try the following solution.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, hour1 INT, hour2 INT, hour3 INT);
INSERT INTO @tbl (hour1, hour2, hour3) VALUES
(2, 10, 0),
(0, 0 , 0),
(0, 24, 0);
-- DDL and sample data population, end
SELECT ID
, c.value('count(/root/*[./text()="0"])','INT') AS ZERO_HOURS
FROM @tbl
CROSS APPLY (
SELECT hour1, hour2, hour3
FOR XML PATH(''), TYPE, ROOT('root')) AS t(c);
Output
+----+------------+
| ID | ZERO_HOURS |
+----+------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 2 |
+----+------------+

Count Visits by Source for 30 Day Prior Period for each Purchase

I have a table that logs website activity with the following Columns and Data
ID Date Source Revenue
1 2013-10-01 A 0
2 2013-10-01 A 0
3 2013-10-01 B 10
1 2013-10-02 A 40
4 2013-10-03 B 0
3 2013-10-03 B 0
4 2013-10-04 A 10
I am trying to create a table that takes each transaction (Revenue > 0) and counts all of the visits by source in individual columns for the last 30 days. It should look something like this.
ID Date Source Revenue Count_A Count_B
3 2013-10-01 B 10 0 1
1 2013-10-02 A 40 2 0
4 2013-10-04 A 10 1 1
I have tried using a subquery for each of these columns, but the Counts are way off and I am not sure why.
Select ID,
Date,
Source,
Revenue,
(SELECT Count(*)
FROM table t2
WHERE t2.Date between t.Date-30 and t.Date and Source = 'A') AS Count_A,
(SELECT Count(*)
FROM table t3
WHERE t3.Date between t.Date-30 and t.Date and Source = 'B') AS Count_B
FROM table t
Where Revenue > 0
Order By WMEID
I am using Microsoft SQL Server.
Use a lateral join:
Select l.*, l2.*
from logs l outer apply
(select sum(case when l2.source = 'A' then 1 else 0 end) as count_a,
sum(case when l2.source = 'B' then 1 else 0 end) as count_b
from logs l2
where l2.id = l.id and
l2.date >= dateadd(day, -30, l.date) and
l2.date <= l.date
) l2
where l.Revenue > 0
order By l.WMEID;
I think the issue with your approach is that you are not matching the ids.
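To see the effect of matching on the id, here is a sketch that swaps OUTER APPLY for a plain self-join with conditional aggregation, run on SQLite through Python's sqlite3 module. The table name logs comes from the answer; date(l.date, '-30 day') is SQLite's counterpart to DATEADD and is an assumption of this sketch, as is storing dates as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (id INTEGER, date TEXT, source TEXT, revenue INTEGER);
INSERT INTO logs VALUES
    (1, '2013-10-01', 'A',  0),
    (2, '2013-10-01', 'A',  0),
    (3, '2013-10-01', 'B', 10),
    (1, '2013-10-02', 'A', 40),
    (4, '2013-10-03', 'B',  0),
    (3, '2013-10-03', 'B',  0),
    (4, '2013-10-04', 'A', 10);
""")

# For each purchase row l, count the same visitor's rows l2 in the
# 30 days up to and including the purchase date, split by source.
visit_counts = conn.execute("""
SELECT l.id, l.date, l.source, l.revenue,
       SUM(CASE WHEN l2.source = 'A' THEN 1 ELSE 0 END) AS count_a,
       SUM(CASE WHEN l2.source = 'B' THEN 1 ELSE 0 END) AS count_b
FROM logs AS l
JOIN logs AS l2
  ON l2.id = l.id
 AND l2.date BETWEEN date(l.date, '-30 day') AND l.date
WHERE l.revenue > 0
GROUP BY l.id, l.date, l.source, l.revenue
ORDER BY l.date
""").fetchall()

for row in visit_counts:
    print(row)
```

On the sample data this reproduces the desired output from the question. Without the l2.id = l.id condition, every visitor's rows would be counted for every purchase.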
Your counts are off because your sub-selects aren't correlated to the outer query, so the totals are coming up independent of the other data in the row. Also, there's no GROUP BY in the sub-selects, so you're getting a total table count. And I'm not so sure about that date logic.
You could correct all this by adding the correlation to each sub-select (WHERE...t2.ID = t.ID AND t2.Date = t.Date, etc) and including an appropriate GROUP BY clause for each of those queries. But that's rather a lot of typing, hard to maintain, and hard to read. It will also probably generate multiple table scans, so it could become a performance issue.
Instead, I'd opt for conditional aggregation:
Select t.ID,
t.Date,
t.Source,
SUM(t.Revenue) AS Revenue,
SUM(CASE WHEN t.Source = 'A' THEN 1 ELSE 0 END) AS Count_A,
SUM(CASE WHEN t.Source = 'B' THEN 1 ELSE 0 END) AS Count_B
FROM mytable t
Where Revenue > 0
AND t.Date >= DATEADD(DAY, -30, CAST(GETDATE() AS date))
AND t.Date < CAST(GETDATE() AS date)
GROUP BY
t.ID,
t.Date,
t.Source
Order By t.Date
Results (Based on the structure in the question, not the data):
+----+------------+--------+---------+---------+---------+
| ID | Date | Source | Revenue | Count_A | Count_B |
+----+------------+--------+---------+---------+---------+
| 3 | 2020-05-01 | B | 60 | 0 | 2 |
| 1 | 2020-05-02 | A | 40 | 1 | 0 |
| 4 | 2020-05-04 | A | 10 | 1 | 0 |
+----+------------+--------+---------+---------+---------+

Distinct Conditional Counting to Avoid Overlap

Consider this table:
[Table1]
------------------------
| Person_ID | Yes | No |
|-----------|-----|----|
| 1 | 1 | 0 |
|-----------|-----|----|
| 1 | 1 | 0 |
|-----------|-----|----|
| 2 | 0 | 1 |
|-----------|-----|----|
| 2 | 0 | 1 |
|-----------|-----|----|
| 3 | 1 | 0 |
|-----------|-----|----|
| 3 | 1 | 0 |
|-----------|-----|----|
| 3 | 0 | 1 |
|-----------|-----|----|
| 3 | 1 | 0 |
------------------------
I need a distinct count on Person_ID to get the number of people that are marked Yes and No. However, if someone has a single instance of No, they should be counted as a No and not be included in the Yes count no matter how many Yes they have.
My first thought was to try something similar to:
select count(distinct (case when Yes = 1 then Person_ID else null end)) Yes_People
, count(distinct (case when No = 1 then Person_ID else null end)) No_People
from Table1
but this will result in 3 being counted in both the Yes and No counts.
My desired output would be:
--------------------------
| Yes_People | No_People |
|------------|-----------|
| 1 | 2 |
--------------------------
I'm hoping to avoid the performance hit from having to evaluate a subquery against each row but if it has to be the way to go I will accept that.
Aggregate first at the person level and then overall:
select sum(yes_only) as yes_only,
       sum(1 - yes_only) as no
from (select person_id,
             (case when max(yes) = min(yes) and max(yes) = 1
                   then 1
                   else 0
              end) as yes_only
      from t
      group by person_id
     ) t
You can first group them by the person.
Then the CASE for the Yes people can have a not No condition.
SELECT
COUNT(CASE WHEN No = 0 AND Yes = 1 THEN Person_ID END) AS Yes_People,
COUNT(CASE WHEN No = 1 THEN Person_ID END) AS No_People
FROM
(
select Person_ID
, MAX(Yes) as Yes
, MAX(No) as No
FROM Table1
GROUP BY Person_ID
) q
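The two-level aggregation can be checked against the sample table. This is a sketch on SQLite through Python's sqlite3 module; the column names yes_flag and no_flag are hypothetical renames of Yes and No, chosen only to stay clear of keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (person_id INTEGER, yes_flag INTEGER, no_flag INTEGER);
INSERT INTO table1 VALUES
    (1, 1, 0), (1, 1, 0),
    (2, 0, 1), (2, 0, 1),
    (3, 1, 0), (3, 1, 0), (3, 0, 1), (3, 1, 0);
""")

# Inner query: one row per person with "ever yes" / "ever no" flags.
# Outer query: a person counts as Yes only if they never had a No.
yes_people, no_people = conn.execute("""
SELECT SUM(CASE WHEN has_no = 0 AND has_yes = 1 THEN 1 ELSE 0 END) AS yes_people,
       SUM(CASE WHEN has_no = 1 THEN 1 ELSE 0 END) AS no_people
FROM (SELECT person_id,
             MAX(yes_flag) AS has_yes,
             MAX(no_flag)  AS has_no
      FROM table1
      GROUP BY person_id) AS per_person
""").fetchone()

print(yes_people, no_people)
```

The subquery here is a single grouped pass over the table, not a per-row lookup, so it avoids the cost the question was worried about.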
You could use a window function to rank the rows for a single person_id to prioritize a 'No' over a 'Yes', but that will require a subquery:
select count(case when yes = 1 then 1 end) as yes_count,
       count(case when no = 1 then 1 end) as no_count
from (
    select person_id, yes, no,
           row_number() over (partition by person_id order by no desc, yes desc) as rn
    from table1
) t
where rn = 1
The inner subquery plus the where filter will get you a single row per person_id, giving priority to the 'no' records.
This of course assumes yes/no are mutually exclusive, and if that's true, you should probably change the model to a single field.
I think you need to pre-check every person with a window function:
with t as (select 1 p_id, 1 yes, 0 no from dual
union all select 1 p_id, 1 yes, 0 no from dual
union all select 2 p_id, 0 yes, 1 no from dual
union all select 2 p_id, 0 yes, 1 no from dual
union all select 3 p_id, 1 yes, 0 no from dual
union all select 3 p_id, 0 yes, 1 no from dual
union all select 3 p_id, 1 yes, 0 no from dual)
, chk as (
select max(no) over (partition by p_id) n
, max(yes) over (partition by p_id) y
, p_id
from t)
-- select * from chk;
select count(distinct decode(y-n,1,p_id,null )) yes_people
, count(distinct decode(n,1,p_id,null )) no_people
from chk
group by 1;
You can use conditional aggregation as follows:
with table1 as (select 1 PERSON_ID, 1 yes, 0 no from dual
union all select 1 PERSON_ID, 1 yes, 0 no from dual
union all select 2 PERSON_ID, 0 yes, 1 no from dual
union all select 2 PERSON_ID, 0 yes, 1 no from dual
union all select 3 PERSON_ID, 1 yes, 0 no from dual
union all select 3 PERSON_ID, 0 yes, 1 no from dual
union all select 3 PERSON_ID, 1 yes, 0 no from dual)
SELECT
SUM(CASE WHEN NOS = 0 AND YES > 0 THEN 1 END) YES_PEOPLE,
SUM(CASE WHEN NOS > 0 THEN 1 END) NO_PEOPLE
FROM
(
SELECT
SUM(NO) NOS,
PERSON_ID,
SUM(YES) YES
FROM TABLE1
GROUP BY PERSON_ID
);
Output:
YES_PEOPLE NO_PEOPLE
---------- ----------
         1         2
Cheers!!

PIVOT values from two columns to multiple columns

Table: Sample
ID Day Status MS
----------------------------
1 1 0 10
1 2 0 20
1 3 1 15
2 3 1 3
2 30 0 5
2 31 0 6
Expected Result:
ID Day1 Day2 Day3....Day30 Day31 Status1 Status2 Status3...Status30 Status31
---------------------------------------------------------------------------------------
1 10 20 15 NULL NULL 0 0 1 NULL NULL
2 NULL NULL 3 5 6 NULL NULL 1 0 0
I want to get the MS and Status value for each day from 1 to 31 for each ID.
I have used PIVOT to get the below result.
Result:
ID Day1 Day2 Day3....Day30 Day31
-------------------------------------
1 10 20 15 NULL NULL
2 NULL NULL 3 5 6
Query:
SELECT
ID
,[1] AS Day1
,[2] AS Day2
,[3] AS Day3
.
.
.
,[30] AS Day30
,[31] AS Day31
FROM
(
SELECT
ID
,[Day]
,MS
FROM
Sample
) AS A
PIVOT
(
MIN(MS)
FOR [Day] IN([1],[2],[3],...[30],[31])
) AS pvtTable
How can I merge the Status column into the result?
Try this. Use a second PIVOT to transpose the Status column, then apply an aggregate (MAX or MIN) to each column in the select list and group by ID to collapse the rows into one per ID.
CREATE TABLE #est
(ID INT,[Day] INT,[Status] INT,MS INT)
INSERT #est
VALUES (1,1,0,10),(1,2,0,20),(1,3,1,15 ),
(2,3,1,3),(2,30,0,5),(2,31,0,6)
SELECT ID,
Max([Day1]) [Day1],
Max([Day2]) [Day2],
Max([Day3]) [Day3],
Max([Day30]) [Day30],
Max([Day31]) [Day31],
Max([status1]) [status1],
Max([status2]) [status2],
Max([status3]) [status3],
Max([status30])[status30],
Max([status31])[status31]
FROM (SELECT Id,
'status' + CONVERT(VARCHAR(30), Day) col_stat,
'Day' + CONVERT(VARCHAR(30), Day) Col_Day,
[status],
ms
FROM #est) a
PIVOT (Min([ms])
FOR Col_Day IN([Day1],[Day2],[Day3],[Day30],[Day31])) piv
PIVOT (Min([status]) FOR col_stat IN ([status1],[status2],[status3],[status30],[status31])) piv1
GROUP BY id
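
As an alternative to chaining two PIVOTs, the same shape can be produced with conditional aggregation, which is a different technique from the one in the answer above. The sketch below assumes SQLite through Python's sqlite3 module and builds the repetitive column list in Python; only days 1, 2, 3, 30 and 31 are listed to match the shortened example, and the table name sample is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (id INTEGER, day INTEGER, status INTEGER, ms INTEGER);
INSERT INTO sample VALUES
    (1, 1, 0, 10), (1, 2, 0, 20), (1, 3, 1, 15),
    (2, 3, 1, 3), (2, 30, 0, 5), (2, 31, 0, 6);
""")

# Days to pivot. These are trusted integer literals, so formatting them
# into the SQL text is safe; never do this with user-supplied values.
days = [1, 2, 3, 30, 31]
ms_cols = [f"MIN(CASE WHEN day = {d} THEN ms END) AS Day{d}" for d in days]
status_cols = [f"MIN(CASE WHEN day = {d} THEN status END) AS Status{d}" for d in days]

query = (
    "SELECT id, " + ", ".join(ms_cols + status_cols) +
    " FROM sample GROUP BY id ORDER BY id"
)
pivoted = conn.execute(query).fetchall()

for row in pivoted:
    print(row)
```

Each CASE without an ELSE yields NULL for non-matching days, so days with no row come back as NULL (None in Python), matching the expected result in the question.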

How to group sums by weekday in MySQL?

I have a table like this:
id | timestamp | type
-----------------------
1 2010-11-20 A
2 2010-11-20 A
3 2010-11-20 B
4 2010-11-21 A
5 2010-11-21 C
6 2010-11-27 B
and I need to count the rows for each type, grouped by weekday; like this:
weekday | A | B | C
--------------------------
5 2 2 0 -- the B column equals 2 because nov 20 and nov 27 are saturday
6 1 0 1
What would be the simplest solution for this?
I don't mind using views, variables, subqueries, etc.
Use:
SELECT WEEKDAY(t.timestamp) AS weekday,
SUM(CASE WHEN t.type = 'A' THEN 1 ELSE 0 END) AS a,
SUM(CASE WHEN t.type = 'B' THEN 1 ELSE 0 END) AS b,
SUM(CASE WHEN t.type = 'C' THEN 1 ELSE 0 END) AS c
FROM YOUR_TABLE t
GROUP BY WEEKDAY(t.timestamp)
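
For a quick check without a MySQL server, the same conditional aggregation can be run on SQLite through Python's sqlite3 module. This is only a sketch: SQLite has no WEEKDAY(), so the expression (strftime('%w', ...) + 6) % 7 is used here to mimic MySQL's numbering (0 = Monday ... 6 = Sunday), and the table name events is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, timestamp TEXT, type TEXT);
INSERT INTO events VALUES
    (1, '2010-11-20', 'A'),
    (2, '2010-11-20', 'A'),
    (3, '2010-11-20', 'B'),
    (4, '2010-11-21', 'A'),
    (5, '2010-11-21', 'C'),
    (6, '2010-11-27', 'B');
""")

# strftime('%w') returns 0 for Sunday through 6 for Saturday; shifting by 6
# modulo 7 converts that to MySQL's WEEKDAY() convention (Monday = 0).
by_weekday = conn.execute("""
SELECT (CAST(strftime('%w', timestamp) AS INTEGER) + 6) % 7 AS weekday,
       SUM(CASE WHEN type = 'A' THEN 1 ELSE 0 END) AS a,
       SUM(CASE WHEN type = 'B' THEN 1 ELSE 0 END) AS b,
       SUM(CASE WHEN type = 'C' THEN 1 ELSE 0 END) AS c
FROM events
GROUP BY weekday
ORDER BY weekday
""").fetchall()

for row in by_weekday:
    print(row)
```

Weekdays with no rows at all do not appear in the result; listing all seven would need a join against a table of weekday numbers.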