SQLite query - Limit occurrence of value

SQLite query - Limit occurrence of value - sql

I have a query that return this result. How can i limit the occurrence of a value from the 4th column.
19 1 _BOURC01 1
20 1 _BOURC01 3 2019-11-18
20 1 _BOURC01 3 2017-01-02
21 1 _BOURC01 6
22 1 _BOURC01 10
23 1 _BOURC01 13 2016-06-06
24 1 _BOURC01 21 2016-09-19
My Query:
SELECT "_44_SpeakerSpeech"."id" AS "id", "_44_SpeakerSpeech"."active" AS "active", "_44_SpeakerSpeech"."id_speaker" AS "id_speaker", "_44_SpeakerSpeech"."Speech" AS "Speech", "34 Program Weekend"."date" AS "date"
FROM "_44_SpeakerSpeech"
LEFT JOIN "_34_programWeekend" "34 Program Weekend" ON "_44_SpeakerSpeech"."Speech" = "34 Program Weekend"."theme_id"
WHERE "id_speaker" = "_BOURC01"
ORDER BY id_speaker, Speech, date DESC
Thanks

I think this is what you want here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY s.id, s.active, s.id_speaker, s.Speech
ORDER BY p.date DESC) rn
FROM "_44_SpeakerSpeech" s
LEFT JOIN "_34_programWeekend" p ON s.Speech = p.theme_id
WHERE s.id_speaker = '_BOURC01'
)
SELECT id, active, id_speaker, Speech, date
FROM cte
WHERE rn = 1;
This logic assumes that when two or more records all have the same columns values (excluding the date), you want to retain only the latest record.

Related

Group items from the first time + certain time period

I want to group orders from the same customer if they happen within 10 minutes of the first order, then find the next first order and group them and so on.
Ex:
Customer group orders
6 1 3
2 4,5
3 8
7 1 9,10
2 11,12
3 13
id customer time
3 6 2021-05-12 12:14:22.000000
4 6 2021-05-12 12:24:24.000000
5 6 2021-05-12 12:29:16.000000
8 6 2021-05-12 13:01:40.000000
9 7 2021-05-14 12:13:11.000000
10 7 2021-05-14 12:20:01.000000
11 7 2021-05-14 12:45:00.000000
12 7 2021-05-14 12:48:41.000000
13 7 2021-05-14 12:58:16.000000
18 9 2021-05-18 12:22:13.000000
25 15 2021-05-18 13:44:02.000000
26 16 2021-05-17 09:39:02.000000
27 16 2021-05-18 19:38:43.000000
28 17 2021-05-18 15:40:02.000000
29 18 2021-05-19 15:32:53.000000
30 18 2021-05-19 15:45:56.000000
31 18 2021-05-19 16:29:09.000000
34 15 2021-05-24 15:45:14.000000
35 15 2021-05-24 15:45:14.000000
36 19 2021-05-24 17:14:53.000000
Here is what I have currently, I think that it is currently not grouping by customer when case when d.StartTime > dateadd(minute, 10, c.first_time) so it compares StartTime of all orders for all customers.
with
data as (select Customer,StartTime,Id, row_number() over(partition by Customer order by StartTime) rn from orders t),
cte as (
select d.*, StartTime as first_time
from data d
where rn = 1
union all
select d.*,
case when d.StartTime > dateadd(minute, 10, c.first_time)
then d.StartTime
else c.first_time
end
from cte c
inner join data d on d.rn = c.rn + 1
)
select c.*, dense_rank() over(partition by Customer order by first_time) grp
from cte c;'
I have two databases (MySQL & SQL Server) having similar schema so either would work for me.

Try the following on SQL Server:
SELECT customer,
ROW_NUMBER() OVER (PARTITION BY customer ORDER BY grp) AS group_no,
STRING_AGG(id, ',') AS orders
FROM
(
SELECT id,customer, [time],
(DATEDIFF(SECOND, MIN([time]) OVER (PARTITION BY CUSTOMER), [time])/60)/10 grp
FROM orders
) T
GROUP BY customer, grp
ORDER BY customer
See a demo.
According to your posted requirement, you are trying to divide the period between the first order date and the last order date into groups (or let's say time frames) each one is 10 minutes long.
What I did in this query: for each customer order, find the difference between the order date and the minimum date (first customer order date) in seconds and then divide it by 10 to get it's time frame number. i.e. for a difference = 599s the frame number = 599/60 =9m /10 = 0. for a difference = 620s the frame number = 620/60 =10m /10 = 1.
After defining the correct groups/time frames for each order you can simply use the STRING_AGG function to get the desired output. Noting that the STRING_AGG function applies to SQL Server 2017 (14.x) and later.

Snowflake SQL - Count Distinct Users within descending time interval

I want to count the distinct amount of users over the last 60 days, and then, count the distinct amount of users over the last 59 days, and so on and so forth.
Ideally, the output would look like this (TARGET OUTPUT)
Day Distinct Users
60 200
59 200
58 188
57 185
56 180
[...] [...]
where 60 days is the max total possible distinct users, and then 59 would have a little less and so on and so forth.
my query looks like this.
select
count(distinct (case when datediff(day,DATE,current_date) <= 60 then USER_ID end)) as day_60,
count(distinct (case when datediff(day,DATE,current_date) <= 59 then USER_ID end)) as day_59,
count(distinct (case when datediff(day,DATE,current_date) <= 58 then USER_ID end)) as day_58
FROM Table
The issue with my query is that This outputs the data by column instead of by rows (like shown below) AND, most importantly, I have to write out this logic 60x for each of the 60 days.
Current Output:
Day_60 Day_59 Day_58
209 207 207
Is it possible to write the SQL in a way that creates the target as shown initially above?

Using below data in CTE format -
with data_cte(dates,userid) as
(select * from values
('2022-05-01'::date,'UID1'),
('2022-05-01'::date,'UID2'),
('2022-05-02'::date,'UID1'),
('2022-05-02'::date,'UID2'),
('2022-05-03'::date,'UID1'),
('2022-05-03'::date,'UID2'),
('2022-05-03'::date,'UID3'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID2'),
('2022-05-04'::date,'UID3'),
('2022-05-04'::date,'UID4'),
('2022-05-05'::date,'UID1'),
('2022-05-06'::date,'UID1'),
('2022-05-07'::date,'UID1'),
('2022-05-07'::date,'UID2'),
('2022-05-08'::date,'UID1')
)
Query to get all dates and count and distinct counts -
select dates,count(userid) cnt, count(distinct userid) cnt_d
from data_cte
group by dates;
DATES
CNT
CNT_D
2022-05-01
2
2
2022-05-02
2
2
2022-05-03
3
3
2022-05-04
5
4
2022-05-05
1
1
2022-05-06
1
1
2022-05-08
1
1
2022-05-07
2
2
Query to get difference of date from current date
select dates,datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates;
DATES
DDIFF
CNT
CNT_D
2022-05-01
45
2
2
2022-05-02
44
2
2
2022-05-03
43
3
3
2022-05-04
42
5
4
2022-05-05
41
1
1
2022-05-06
40
1
1
2022-05-08
38
1
1
2022-05-07
39
2
2
Get records with date difference beyond a certain range only -
include clause having
select datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates
having ddiff<=43;
DDIFF
CNT
CNT_D
43
3
3
42
5
4
41
1
1
39
2
2
38
1
1
40
1
1
If you need to prefix 'day' to each date diff count, you can
add and outer query to previously fetched data-set and add the needed prefix to the date diff column as following -
I am using CTE syntax, but you may use sub-query given you will select from table -
,cte_1 as (
select datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates
having ddiff<=43)
select 'day_'||to_char(ddiff) days,
cnt,
cnt_d
from cte_1;
DAYS
CNT
CNT_D
day_43
3
3
day_42
5
4
day_41
1
1
day_39
2
2
day_38
1
1
day_40
1
1
Updated the answer to get distinct user count for number of days range.
A clause can be included in the final query to limit to number of days needed.
with data_cte(dates,userid) as
(select * from values
('2022-05-01'::date,'UID1'),
('2022-05-01'::date,'UID2'),
('2022-05-02'::date,'UID1'),
('2022-05-02'::date,'UID2'),
('2022-05-03'::date,'UID5'),
('2022-05-03'::date,'UID2'),
('2022-05-03'::date,'UID3'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID6'),
('2022-05-04'::date,'UID2'),
('2022-05-04'::date,'UID3'),
('2022-05-04'::date,'UID4'),
('2022-05-05'::date,'UID7'),
('2022-05-06'::date,'UID1'),
('2022-05-07'::date,'UID8'),
('2022-05-07'::date,'UID2'),
('2022-05-08'::date,'UID9')
),cte_1 as
(select datediff(day,dates,current_date()) ddiff,userid
from data_cte), cte_2 as
(select distinct ddiff from cte_1 )
select cte_2.ddiff,
(select count(distinct userid)
from cte_1 where cte_1.ddiff <= cte_2.ddiff) cnt
from cte_2
order by cte_2.ddiff desc
DDIFF
CNT
47
9
46
9
45
9
44
8
43
5
42
4
41
3
40
1

You can do unpivot after getting your current output.
sample one.
select
*
from (
select
209 Day_60,
207 Day_59,
207 Day_58
)unpivot ( cnt for days in (Day_60,Day_59,Day_58));

View and complex query count distinct locations employee stayed in SQL

I have a view which looks like this view_1:
id Office Begin_dt Last_dt Days
1 Office1 2019-09-02 2019-09-08 6
1 Office2 2019-09-09 2019-09-30 21
1 Office1 2019-10-01 2019-10-31 30
5 Office3 2017-10-01 2017-10-16 15
5 Office2 2017-10-17 2017-10-30 13
5 Office2 2017-11-01 2017-11-31 30
I want to find the office where employee stayed for max time and also the number of Distinct Office locations he stayed in.
Expected output
id Max_time_in_Office Days Distinct_office_locations
1 Office1 36 2
5 Office2 43 2
So id 1 spends 6 and 30, overall 36 days in office 1. Max time is spent in office 1 by him. Distinct locations are 2.
id 5 spends 13 and 30 , 43 days in office. Max time is spent in office 2. Distinct locations are 2.
Code tried
select v.*
from (select v.id, v.office, sum(days) as Max_time_in_Office, count(Office) as Distinct_office_locations,
rank() over (partition by id order by sum(days) desc) as seqnum
from view_1 v
group by id, office
) v
where seqnum = 1;
Output obtained
id Max_time_in_Office Days Distinct_office_locations
1 Office1 36 1
5 Office2 43 1
So I am getting wrong output. Can someone pls help

Close. You want a window function:
select v.*
from (select v.id, v.office, sum(days) as Max_time_in_Office,
count(*) over (partition by id) as Distinct_office_locations,
rank() over (partition by id order by sum(days) desc) as seqnum
from view_1 v
group by id, office
) v
where seqnum = 1;
Basically the window function is counting the number of rows returned after the aggregation -- and there is one row per office.

You could use the apply operator to achieve that:
select V.Id,
T.Max_Time_Office,
T.Days,
Distinct_office_locations = count(distinct V.Office)
from view_1 V
Cross apply
(
Select top 1 Id,
Max_Time_Office = Office,
Days = sum(Days)
From view_1 VG
where V.Id = VG.Id
group by VG.Id, VG.Office
order by sum(Days) desc
) T
group by V.Id, T.Max_Time_Office, T.Days
Basically, you are getting the Office with most days in the order by sum(Days) desc inside the Cross apply, and using that in the outer expression. I then just did a count(distinct V.Office) to get the distinct offices.

Current record with group by function

Trying to get userid recent aggregate value for session_id.
(session_id 3 has two records, recent agg value is 80.00
session_id 4 has four records, recent agg value is 95.00
session_id 6 has three records, recent agg value is 72.00
Table:session_agg
id session_id userid agg date
-- ---------- ------ ----- -------
1 3 11 60.00 1573561586
4 3 11 80.00 1573561586
6 4 11 35.00 1573561749
7 4 11 50.00 1573561751
8 4 11 70.00 1573561912
10 4 11 95.00 1573561921
11 6 14 40.00 1573561945
12 6 14 67.00 1573561967
13 6 14 72.00 1573561978
select id, session_id, userid, agg, date from session_agg
WHERE date IN (select MAX(date) from session_agg GROUP BY session_id) AND
userid = 11

If you want to stick with your current approach, then you need to correlate the session_id in the subquery which checks for the max date for each session:
SELECT id, session_id, userid, add, date
FROM session_agg sa1
WHERE
date = (SELECT MAX(date) FROM session_agg sa2 WHERE sa2.session_id = sa1.session_id) AND
userid = 11;
But, if your version of SQL supports analytic functions, ROW_NUMBER is an easier way to do this:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY date DESC) rn
FROM session_agg
)
SELECT id, session_id, userid, add, date
FROM cte
WHERE rn = 1;

select latest record for each battery using SQL with count

BatteryId TimeStamp Temprature
1 2017-02-13 12:16:14.000 23
1 2016-02-13 12:13:14.000 21
1 2015-01-13 12:16:14.000 19
2 2017-02-11 12:16:14.000 22
2 2016-02-13 12:16:14.000 16
3 2017-02-13 11:16:14.000 12
3 2016-02-13 12:15:14.000 25
I have table with multiple records for each battery as above
following sql query is returning latest record for each battery
SELECT * FROM (SELECT BatteryId, Timestamp, Temperature
ROW_NUMBER() OVER(PARTITION BY BatteryId ORDER BY timestamp DESC)
AS N FROM tblBattery) AS TT WHERE N = 1
as
BatteryId TimeStamp Temprature
1 2017-02-13 12:16:14.000 23
2 2017-02-11 12:16:14.000 22
3 2017-02-13 11:16:14.000 12
How I can add Count for each BatteryId, Here is what I need
BatteryId TimeStamp Temprature Count
1 2017-02-13 12:16:14.000 23 3
2 2017-02-11 12:16:14.000 22 2
3 2017-02-13 11:16:14.000 12 2

Use the count window function.
SELECT * FROM
(SELECT BatteryId, Timestamp, Temperature,
ROW_NUMBER() OVER(PARTITION BY BatteryId ORDER BY timestamp DESC) AS N,
COUNT(*) OVER(PARTITION BY BatteryId) as Cnt
FROM tblBattery) TT
WHERE N = 1

Hoping, i understood your problem correctly.
Please check if below query can help you.
SELECT *
FROM
(SELECT BatteryId,
TIMESTAMP,
Temperature , ROW_NUMBER() OVER(PARTITION BY BatteryId ORDER BY TIMESTAMP DESC) AS N ,
COUNT(0) OVER(PARTITION BY BatteryId ) CNT
FROM tblBattery
) AS TT
WHERE N = 1;

Add a sub query before you perform the PARTITION BY
SELECT *
FROM (SELECT
BatteryId
,Timestamp
,Temperature
,Count
,ROW_NUMBER() OVER(PARTITION BY BatteryId ORDER BY timestamp DESC) AS N
FROM (SELECT *, COUNT(BatteryId) As Count FROM tblBattery GROUP BY BatteryId)) AS TT WHERE N = 1
This should solve your issue.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQLite query - Limit occurrence of value - sql

Related

Group items from the first time + certain time period

Snowflake SQL - Count Distinct Users within descending time interval

View and complex query count distinct locations employee stayed in SQL

Current record with group by function

select latest record for each battery using SQL with count

Categories

Resources