Find users with consecutive logins (SQL query)

I have a table with two columns: customer_id and login_date.
For each day, if a customer has logged in, there will be 1 entry in the table for that customer.
Customer_id login_date
--------------------
1 31-Dec-2018
2 31-Dec-2018
3 31-Dec-2018
1 1-Jan-2019
2 1-Jan-2019
3 2-Jan-2019
2 2-Jan-2019
3 3-Jan-2019
3 4-Jan-2019
I need to get the ids of customers who have logged in for at least 3 consecutive days.
Expected output is like below.
Customer_id
------
2
3
So far I have achieved this using the query below.
select customer_id from (
    select *,
        case when (lag(login_date, 1) over (partition by customer_id order by login_date)) = dateadd(day, -1, login_date) then 1 else 0 end second_day,
        case when (lag(login_date, 2) over (partition by customer_id order by login_date)) = dateadd(day, -2, login_date) then 1 else 0 end third_day
    from login_history
) a
where a.second_day = 1 and a.third_day = 1;
But if I have to get customers with 5 consecutive logins, I have to keep adding lag columns.
Is there any better way to get this done?

You can use lag() or lead():
select distinct customer_id
from (select t.*,
             lag(login_date, 2) over (partition by customer_id order by login_date) as prev2_login_date
      from t
     ) t
where prev2_login_date = dateadd(day, -2, login_date);
This looks at the login date two rows behind. If it is two days before the current day -- then voila! Three days in a row. This uses the fact that you do not have duplicate dates for a given customer.
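The same check extends to any streak length without adding columns: look back N-1 rows and test whether the date found there is exactly N-1 days earlier. The following is a minimal sketch, assuming SQLite 3.25 or later (for window functions) driven from Python instead of the SQL Server dialect used above. The helper name customers_with_streak is hypothetical, and SQLite's date(..., '-k days') stands in for dateadd.

```python
import sqlite3

# Sample rows from the question, with dates stored as ISO-8601 text.
ROWS = [
    (1, "2018-12-31"), (2, "2018-12-31"), (3, "2018-12-31"),
    (1, "2019-01-01"), (2, "2019-01-01"),
    (3, "2019-01-02"), (2, "2019-01-02"),
    (3, "2019-01-03"), (3, "2019-01-04"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE login_history (customer_id INTEGER, login_date TEXT)")
conn.executemany("INSERT INTO login_history VALUES (?, ?)", ROWS)


def customers_with_streak(conn, n):
    """Return ids of customers with at least n consecutive login days (n >= 2)."""
    offset = int(n) - 1  # rows (and days) to look back; int() keeps the SQL safe
    sql = f"""
        SELECT DISTINCT customer_id
        FROM (SELECT customer_id,
                     login_date,
                     LAG(login_date, {offset}) OVER (
                         PARTITION BY customer_id ORDER BY login_date
                     ) AS prev_login_date
              FROM login_history)
        WHERE prev_login_date = date(login_date, '-{offset} days')
        ORDER BY customer_id
    """
    return [row[0] for row in conn.execute(sql)]


print(customers_with_streak(conn, 3))  # [2, 3]
```

As in the answer, this relies on there being at most one row per customer per day; with duplicate dates, the row N-1 positions back would no longer be N-1 days back.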

Related

Oracle SQL LAG() function results in duplicate rows

I have a very simple query that results in two rows:
SELECT DISTINCT
id,
trunc(start_date) start_date
FROM example.table
WHERE ID = 1
This results in the following rows:
id start_date
1 7/1/2012
1 9/1/2016
I want to add a column that simply shows the previous date for each row. So I'm using the following:
SELECT DISTINCT id,
Trunc(start_date) start_date,
Lag(start_date, 1)
over (
ORDER BY start_date) pdate
FROM example.table
WHERE id = 1
However, when I do this, I get four rows instead of two:
id start_date pdate
1 7/1/2012 NULL
1 7/1/2012 7/1/2012
1 9/1/2016 7/1/2012
1 9/1/2016 9/1/2012
If I change the offset to 2 or 3 the results remain the same. If I change the offset to 0, I get two rows again but of course now the start_date == pdate.
I can't figure out what's going on.
Use an explicit GROUP BY instead:
SELECT id, trunc(start_date) as start_date,
LAG(trunc(start_date)) OVER (PARTITION BY id ORDER BY trunc(start_date))
FROM example.table
WHERE ID = 1
GROUP BY id, trunc(start_date)
The reason is the order of evaluation within a SQL statement: LAG is computed before DISTINCT is applied, so every underlying row gets its own pdate value and DISTINCT can no longer collapse the rows.
You actually want to run LAG after DISTINCT, so the query should be:
WITH t1 AS (
SELECT DISTINCT id, trunc(start_date) start_date
FROM example.table
WHERE ID = 1
)
SELECT t1.*, LAG(start_date, 1) OVER (ORDER BY start_date) pdate
FROM t1
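The effect can be reproduced on a small scale. The sketch below assumes SQLite (3.25 or later) driven from Python rather than Oracle, with date() standing in for trunc(). The table name example_table and the four timestamped rows are hypothetical, since the question does not show the underlying data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_table (id INTEGER, start_date TEXT)")
# Hypothetical data: two rows per calendar day, differing only in time of day.
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?)",
    [
        (1, "2012-07-01 08:00:00"),
        (1, "2012-07-01 17:30:00"),
        (1, "2016-09-01 09:00:00"),
        (1, "2016-09-01 12:00:00"),
    ],
)

# LAG sees all four underlying rows; DISTINCT is applied afterwards.
naive = conn.execute(
    """
    SELECT DISTINCT id,
           date(start_date) AS start_day,
           LAG(date(start_date)) OVER (ORDER BY start_date) AS pdate
    FROM example_table
    WHERE id = 1
    ORDER BY start_day, pdate
    """
).fetchall()

# De-duplicate first, then compute LAG over the two remaining rows.
fixed = conn.execute(
    """
    WITH t1 AS (
        SELECT DISTINCT id, date(start_date) AS start_day
        FROM example_table
        WHERE id = 1
    )
    SELECT id, start_day, LAG(start_day) OVER (ORDER BY start_day) AS pdate
    FROM t1
    ORDER BY start_day
    """
).fetchall()

print(len(naive))  # 4
print(fixed)       # [(1, '2012-07-01', None), (1, '2016-09-01', '2012-07-01')]
```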

Oracle SQL - create select statement which will retrieve every end date of the month but compare the values the day or two after the end of month

Let's say I have a customer table.
customer_name || date || amount
-----------------------------------
A 31-OCT-20 100
A 01-NOV-20 100
A 02-NOV-20 200
B 31-OCT-20 300
B 01-NOV-20 325
B 02-NOV-20 350
I need to create a select statement which will retrieve every end date of the month and compare the values for the amounts respective to the day or two after. If the amount for the day or two is different from the end date of that month, display the recent changed amount.
Example 1 - Retrieve customer A for 31-OCT-20, compare to 01-NOV-20 and 02-NOV-20, output 200 for the amount.
Example 2 - Retrieve customer B for 31-OCT-20, compare to 01-NOV-20 and 02-NOV-20, output 350 for the amount.
Hmmm . . .
select t.*,
(case when next_amount <> amount or next2_amount <> amount
then greatest(next_amount, next2_amount)
else next_amount
end) as imputed_next_2_days
from (select t.*,
lead(amount) over (partition by customer_name order by date) as next_amount,
lead(amount, 2) over (partition by customer_name order by date) as next2_amount
from t
) t
where date = last_day(date);
You can use the following query:
Select * from
(Select t.*,
row_number() over (partition by customer_name order by dt desc) as rn
From your_table t
Where extract(day from t.dt + 2) between 2 and 4)
Where rn = 1
Tip of the day: don't use Oracle reserved keywords (such as DATE) as column names.

How to find the share of clients who "outflow" every month? (SQLite or Oracle)

The CLIENTS table contains a monthly snapshot of the bank's clients who made any transactions in the given month. Attributes: report_month and client_id.
We say that a client "outflows" from the bank in month N if the client is active in month N (present in the CLIENTS table) and inactive in months N + 1, N + 2, and N + 3.
How to find the share of clients who "outflow" every month?
Table looks like:
report_month client_id
2020-01-01 0023
2020-03-01 0125
...
You can do this with window functions and a window frame. In standard SQL, this would look like:
select report_month, sum(case when cnt = 0 then 1 else 0 end) as outflow
from (
select t.*,
count(*) over(
partition by client_id
order by report_month
range between interval '1' month following and interval '3' month following
) cnt
from mytable t
) t
group by report_month
This assumes that report_month is of a date-like datatype, and that each customer has 0 or 1 record per report_month. If a customer may appear more than once in a month, you would change the outer conditional sum() to:
count(distinct case when cnt = 0 then client_id end) as outflow
In SQLite, which has limited support for date arithmetic, it is a bit more complicated. If you can live with an approximation of month periods, you could do something like this:
select report_month, sum(case when cnt = 0 then 1 else 0 end) as outflow
from (
select t.*,
count(*) over(
partition by client_id
order by julianday(report_month)
range between 28 following and 92 following
) cnt
from mytable t
) t
group by report_month
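The queries above return a count per month, while the question asks for a share. One way to get the share in SQLite without approximating month lengths is sketched below. It swaps the window frame for a correlated NOT EXISTS and uses SQLite's date(..., '+3 months') modifier. It assumes report_month is stored as ISO text for the first day of the month and that each client has at most one row per month; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (report_month TEXT, client_id TEXT)")
# Hypothetical snapshot: at most one row per client per month.
conn.executemany(
    "INSERT INTO clients VALUES (?, ?)",
    [
        ("2020-01-01", "A"), ("2020-01-01", "B"), ("2020-01-01", "C"),
        ("2020-02-01", "A"),
        ("2020-03-01", "A"),
        ("2020-04-01", "A"),
        ("2020-05-01", "B"),
    ],
)

SQL = """
SELECT report_month, AVG(is_outflow) AS outflow_share
FROM (
    SELECT c.report_month,
           -- 1 when the client has no row in months N+1 .. N+3, else 0
           NOT EXISTS (
               SELECT 1
               FROM clients AS nxt
               WHERE nxt.client_id = c.client_id
                 AND nxt.report_month > c.report_month
                 AND nxt.report_month <= date(c.report_month, '+3 months')
           ) AS is_outflow
    FROM clients AS c
)
GROUP BY report_month
ORDER BY report_month
"""

shares = dict(conn.execute(SQL).fetchall())
for month, share in shares.items():
    print(month, round(share, 3))
```

In this sample, two of the three January clients have no activity in February through April, so January's share is 2/3. The last three months of any dataset will look inflated, because the following months have not been observed yet.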

How to get next to last MAX date with join

I'm working on a query in MS SQL to show items that have been in our "service center" more than once within the last 30 days. The data needed is from the service order prior to the most recent service order, based on serial number. So if an item has been received in the last 30 days, check to see if it was received at a previous time within the last 30 days.
ServiceOrders table: CustID, ItemID, DateReceived
ItemMaster table: CustID, ItemID, SerialNumber
I can get DateReceived items by using
ServiceOrders.DateReceived >= DATEADD(month,-1,GETDATE())
I could load service orders from the past month into a temp table, and then query against that to get the prior service order, but that doesn't sound like the best plan. Any ideas on an efficient way to get the previous service orders?
Example data
ServiceOrders table:
CustID ItemID DateReceived
1 2 9/26/2016
1 2 9/05/2016
1 2 1/15/2015
5 6 9/20/2016
7 6 9/02/2016
ItemMaster table:
CustID ItemID SerialNumber
1 2 8675309
5 6 101
7 6 101
So in the above example, SerialNumber 8675309 and 101 have been received more than once in the last 30 days. I need the data from ServiceOrders and ItemMaster for the DateReceived 9/05/2016 and 09/02/2016 records (the second most recent within 30 days). There are other fields in both tables, but they're simplified here. CustID won't necessarily stay the same from date to date, as the item can be transferred. SerialNumber is the key.
Filter last month's orders into a common table expression (cte) and number them in descending date order. Then select the items with more than one occurrence into cte2, and join the two CTEs, selecting the second row.
;With cte as(
Select row_number() over(PARTITION by ItemID order by DateReceived desc) as RowNum, *
from ServiceOrders
where DateReceived >= DateAdd(Month, -1, Getdate())
), cte2 as(
Select
ItemID From cte
Group by ItemID
Having count(*)>1
)
select b.*, c.SerialNumber from cte2 as a
left join cte as b on a.ItemID= b.ItemID and b.RowNum=2
left join ItemMaster as c on b.ItemID=c.ItemID and b.CustID=c.CustID
SELECT *, COUNT(ServiceOrders.DateReceived) as DateCount FROM TABLE
Paste the result into Excel.
Filter the last column to show values of 2 or more.
Might be a quick solution for you.
Get the items which have been received more than once in the last 30 days. Then use row_number to number the rows based on the descending order of date_received. Finally, get the rows whose row_number is 2 (the date before the latest date in the last 30 days).
If serialnumber is needed in the output, just join the itemmaster table to the final resultset.
WITH morethanone
AS (SELECT
so.itemid
FROM serviceorders so
JOIN itemmaster i
ON i.itemid = so.itemid
GROUP BY so.itemid
HAVING COUNT(CASE WHEN DATEDIFF(dd, so.datereceived, GETDATE()) <= 30 THEN 1 END)>1
)
SELECT
custid,
itemid,
datereceived
FROM (SELECT
*,
ROW_NUMBER() OVER (PARTITION BY itemid ORDER BY datereceived DESC) rn
FROM serviceorders
WHERE itemid IN (SELECT itemid FROM morethanone)
AND DATEDIFF(dd, datereceived, GETDATE()) <= 30
) x
WHERE rn = 2
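Since the question states that SerialNumber is the key and that CustID can change when an item is transferred, a variant worth considering is to partition by SerialNumber instead of ItemID. The sketch below runs that variant against the sample data, assuming SQLite (3.25 or later) driven from Python rather than SQL Server. The fixed :today parameter is a stand-in for GETDATE() so that the 2016 rows fall inside the 30-day window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ServiceOrders (CustID INTEGER, ItemID INTEGER, DateReceived TEXT);
    CREATE TABLE ItemMaster (CustID INTEGER, ItemID INTEGER, SerialNumber TEXT);
    INSERT INTO ServiceOrders VALUES
        (1, 2, '2016-09-26'), (1, 2, '2016-09-05'), (1, 2, '2015-01-15'),
        (5, 6, '2016-09-20'), (7, 6, '2016-09-02');
    INSERT INTO ItemMaster VALUES
        (1, 2, '8675309'), (5, 6, '101'), (7, 6, '101');
    """
)

SQL = """
SELECT SerialNumber, CustID, ItemID, DateReceived
FROM (
    SELECT im.SerialNumber, so.CustID, so.ItemID, so.DateReceived,
           ROW_NUMBER() OVER (
               PARTITION BY im.SerialNumber ORDER BY so.DateReceived DESC
           ) AS rn
    FROM ServiceOrders AS so
    JOIN ItemMaster AS im
      ON im.CustID = so.CustID AND im.ItemID = so.ItemID
    WHERE so.DateReceived >= date(:today, '-30 days')
)
WHERE rn = 2
ORDER BY DateReceived
"""

# A fixed "today" keeps the 2016 sample data inside the 30-day window.
previous_orders = conn.execute(SQL, {"today": "2016-09-30"}).fetchall()
print(previous_orders)
# [('101', 7, 6, '2016-09-02'), ('8675309', 1, 2, '2016-09-05')]
```

A row numbered 2 can only exist when a serial number has at least two orders in the window, so the separate "more than one" step is not needed in this form.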

Get the first occurrence of the result in each specified group

I have this query in SQL Server 2012:
select sum(user_number),
sum(media_number),
month_name from (
select TOP 100
count(distinct a.answer_group_guid) as 'user_number',
count(distinct a.media_guid) as 'media_number',
datename(mm,answer_datetime) as 'month_name' ,year(answer_datetime) as 'year'
from
tb_answers as a
left outer join
tb_media as m ON m.user_guid = 'userguid' and m.media_guid=a.media_guid
where
m.user_guid = 'userguid'
group by concat(year(answer_datetime),'',month(answer_datetime)),datename(mm,answer_datetime),year(answer_datetime)
order by year(answer_datetime) desc) as aa
group by month_name,year
order by month_name desc,year desc;
It returns this result:
user_number media_number month_name
5 1 September
2 1 October
1 1 October
1 1 August
But I need only the first occurrence of the October month, like this:
user_number media_number month_name
5 1 September
2 1 October
1 1 August
You simply need to use a ranking function like ROW_NUMBER(). Use it to number the records, partitioning by month_name, and select only the records numbered 1 in each partition.
Add this to the select list of your query:
ROW_NUMBER() OVER(PARTITION BY month_name ORDER By XXX) as RowNumber
This numbers the rows that share the same month_name with consecutive numbers, starting at 1, in the order specified by XXX.
NOTE: specify the order in XXX to decide which of the month rows is number one and will be returned by the query.
Then select from the resulting query, filtering by RowNumber = 1:
SELECT Q.user_number, Q.media_number, Q.month_name
FROM (
    -- your query + RowNumber
) Q
WHERE Q.RowNumber = 1
NOTE: if you need some ordering in your result, you'll have to move the ORDER BY out of the subselect, and write it beside the WHERE Q.RowNumber=1
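The pattern above can be sketched end to end as follows, assuming SQLite (3.25 or later) driven from Python rather than SQL Server 2012. The monthly_stats table holds the four result rows from the question. The year values, and the choice of year DESC for XXX, are hypothetical, since the question does not say which October row should win.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_stats "
    "(user_number INTEGER, media_number INTEGER, month_name TEXT, year INTEGER)"
)
# The four result rows from the question; the year values are hypothetical.
conn.executemany(
    "INSERT INTO monthly_stats VALUES (?, ?, ?, ?)",
    [
        (5, 1, "September", 2016),
        (2, 1, "October", 2016),
        (1, 1, "October", 2015),
        (1, 1, "August", 2016),
    ],
)

SQL = """
SELECT Q.user_number, Q.media_number, Q.month_name
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY month_name ORDER BY year DESC
           ) AS RowNumber
    FROM monthly_stats AS s
) AS Q
WHERE Q.RowNumber = 1
ORDER BY Q.user_number DESC
"""

first_per_month = conn.execute(SQL).fetchall()
print(first_per_month)
# [(5, 1, 'September'), (2, 1, 'October'), (1, 1, 'August')]
```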