Select entries where Date Difference not higher than 5 days - sql

I am looking for a SQL statement that returns all entries whose date is no more than 5 days apart from another entry in this table.
Example:
ID | Date
1 | 16.10.14 00:00:00
2 | 14.10.14 00:00:00
3 | 09.09.14 00:00:00
4 | 13.10.14 00:00:00
5 | 06.07.14 00:00:00
6 | 09.01.14 00:00:00
7 | 10.01.14 00:00:00
8 | 14.05.14 00:00:00
Expected Output:
ID | Date
1 | 16.10.14 00:00:00
2 | 14.10.14 00:00:00
4 | 13.10.14 00:00:00
6 | 09.01.14 00:00:00
7 | 10.01.14 00:00:00
8 | 14.01.14 00:00:00
EDIT:
In fact, all I need is a way to compute a difference over the Date datatype. That's why I can't even show my attempts: I'm missing the keyword.
Never mind, I will still try.
It should be something like this:
select * from example m where m.Date not more apart than 5 days from another entry in the Table

Applying the - operator to two dates returns their difference in days, so you can use the exists operator to construct your query. Each row has to be excluded from matching itself, since its difference to itself is always 0:
SELECT *
FROM my_table o
WHERE EXISTS (SELECT *
              FROM my_table i
              WHERE i.id <> o.id
                AND ABS (o.my_date - i.my_date) <= 5)
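To try the idea out without an Oracle instance, here is a minimal sketch that runs an equivalent query in SQLite through Python's sqlite3 module. Several things are my assumptions rather than part of the answer: the ISO date strings, the julianday() call (SQLite has no date subtraction operator), and the i.id <> o.id condition, which keeps a row from matching itself at a distance of 0 days.

```python
import sqlite3

# Sample rows from the question, with the dates rewritten as ISO strings.
rows = [
    (1, "2014-10-16"), (2, "2014-10-14"), (3, "2014-09-09"), (4, "2014-10-13"),
    (5, "2014-07-06"), (6, "2014-01-09"), (7, "2014-01-10"), (8, "2014-05-14"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, my_date TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# julianday() converts a date to a day number, so the subtraction is in days.
# The id comparison stops every row from trivially matching itself.
query = """
SELECT o.id
FROM my_table o
WHERE EXISTS (SELECT 1
              FROM my_table i
              WHERE i.id <> o.id
                AND ABS(julianday(o.my_date) - julianday(i.my_date)) <= 5)
ORDER BY o.id
"""
close_ids = [row[0] for row in conn.execute(query)]
print(close_ids)  # [1, 2, 4, 6, 7]
```

With the sample input as posted, ID 8 (14.05.14) has no neighbour within 5 days, so it is not returned.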

Related

Create a count of consecutive days when condition is met psql

I need to create a query that counts the consecutive days in the date data.
Using this table as sample:
id_used | ref_date
---------+---------------------
1 | 2021-02-01 00:00:00
1 | 2021-09-01 00:00:00
1 | 2021-09-02 00:00:00
1 | 2021-09-03 00:00:00
The result should be 3 (the last 3 rows).
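To pin down what is being counted, here is a small Python sketch. It assumes that "consecutive days" means the longest run of dates that are each exactly one day after the previous one. That is my reading of the sample, and the function name is made up for illustration.

```python
from datetime import date, timedelta

def longest_streak(dates):
    """Return the length of the longest run of consecutive calendar days."""
    best = run = 0
    prev = None
    for day in sorted(set(dates)):
        # Extend the run if this day directly follows the previous one,
        # otherwise start a new run of length 1.
        if prev is not None and day - prev == timedelta(days=1):
            run += 1
        else:
            run = 1
        best = max(best, run)
        prev = day
    return best

# ref_date values for id_used = 1 from the sample table.
sample = [date(2021, 2, 1), date(2021, 9, 1), date(2021, 9, 2), date(2021, 9, 3)]
streak = longest_streak(sample)
print(streak)  # 3
```

In SQL the same grouping is usually expressed as a gaps-and-islands query, but the loop above is only meant to make the expected result concrete.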

Insert a row for each month in the range [duplicate]

I want to transform this table in Oracle
+----+------------+------------+
| N | Start | End |
+----+------------+------------+
| 1 | 2018-01-01 | 2018-05-31 |
| 1 | 2018-01-01 | 2018-06-31 |
+----+------------+------------+
into the following. As silly as it looks, I need to insert one row for each month in the range, for each row in the first table:
+----+------------+
| N | month |
+----+------------+
| 1 | 2018-01-01 |
| 1 | 2018-01-01 |
| 1 | 2018-02-01 |
| 1 | 2018-02-01 |
| 1 | 2018-03-01 |
| 1 | 2018-03-01 |
| 1 | 2018-04-01 |
| 1 | 2018-04-01 |
| 1 | 2018-05-01 |
| 1 | 2018-05-01 |
| 1 | 2018-06-01 |
+----+------------+
I have been trying to follow "SQL: Generate Record Per Month In Date Range", but I haven't had any luck figuring out how to get the result I want.
Thanks for helping.
My best guess is that you want to show every beginning of a month that falls in the interval from start to end in your table.
create table t1 as
select date'2018-01-01' start_d, date'2018-05-31' end_d from dual union all
select date'2018-01-01' start_d, date'2018-06-30' end_d from dual;
with cal as
(select add_months(date'2018-01-01', rownum-1) month_d
from dual connect by level <= 12)
select cal.month_d from cal
join t1 on cal.month_d between t1.start_d and t1.end_d
order by 1;
MONTH_D
-------------------
01.01.2018 00:00:00
01.01.2018 00:00:00
01.02.2018 00:00:00
01.02.2018 00:00:00
01.03.2018 00:00:00
01.03.2018 00:00:00
01.04.2018 00:00:00
01.04.2018 00:00:00
01.05.2018 00:00:00
01.05.2018 00:00:00
01.06.2018 00:00:00
So there is probably a cut & paste error in your expectation for January.
Some other points:
Do not use a reserved word such as start as a column name.
Use the DATE type to store dates, to avoid invalid entries such as 2018-06-31.
You can use a recursive CTE. For example:
with
n (s, e, cur) as (
select s, e, s from t
union all
select s, e, add_months(cur, 1)
from n
where add_months(cur, 1) < e
)
select cur from n;
Result:
CUR
---------
01-JAN-18
01-JAN-18
01-FEB-18
01-FEB-18
01-MAR-18
01-MAR-18
01-APR-18
01-APR-18
01-MAY-18
01-MAY-18
01-JUN-18
See running example at db<>fiddle.
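For comparison, the month expansion that both answers perform can be sketched in plain Python. This is only an illustration under my own assumptions: the helper name month_starts is made up, and the second range ends on 2018-06-30 because 2018-06-31 is not a valid date.

```python
from datetime import date

def month_starts(start, end):
    """List every first-of-month date that lies between start and end."""
    months = []
    cur = date(start.year, start.month, 1)
    while cur <= end:
        # A range that starts mid-month does not contain that month's first day.
        if cur >= start:
            months.append(cur)
        # Step to the first day of the next month.
        if cur.month == 12:
            cur = date(cur.year + 1, 1, 1)
        else:
            cur = date(cur.year, cur.month + 1, 1)
    return months

ranges = [
    (date(2018, 1, 1), date(2018, 5, 31)),
    (date(2018, 1, 1), date(2018, 6, 30)),
]
all_months = sorted(m for start, end in ranges for m in month_starts(start, end))
print(len(all_months))  # 11
```

The two ranges contribute 5 and 6 rows, which matches the 11 rows in both query results above.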

Optimizing results for query with WHERE EXISTS clause

I have this table in postgres:
id | id_datetime | longitude | latitude
--------+---------------------+---------------------+--------------------
639438 | 2018-02-20 18:00:00 | -122.3880011217841 | 37.75538988423265
639439 | 2018-02-20 20:30:00 | -122.38756878451498 | 37.760550220844614
639440 | 2018-02-20 20:05:00 | -122.39640513677658 | 37.76130039041195
639441 | 2018-02-24 10:00:00 | -122.45819139221014 | 37.724317534370066
639442 | 2018-02-10 09:00:00 | -122.44693382058489 | 37.77000760474354
I want an output with all the distinct IDs that have at least one other ID within the preceding 15 minutes and within 1000 meters (geographic distance).
My table has more than 100K rows. So, I'm currently trying with the following query which works but takes too long. Are there any steps I can take to optimize this?
SELECT DISTINCT
x.id
FROM table x
WHERE EXISTS(
SELECT
1
FROM table t
WHERE t.id <> x.id
AND (t.id_datetime between x.id_datetime - interval '15 minutes' AND x.id_datetime)
AND (ST_Distance((geography(ST_MakePoint(x.longitude, x.latitude))),
geography(ST_MakePoint(t.longitude, t.latitude)) ) <= 1000)
)
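To make the condition concrete, here is a plain Python sketch of the same filter. It sorts by time first, so each row is only compared against rows inside its time window, and then checks the distance. The haversine formula with a spherical Earth radius of 6371 km is my stand-in for ST_Distance on geography and will differ slightly from PostGIS. Both function names are made up.

```python
import math
from datetime import datetime, timedelta

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def ids_with_recent_neighbor(rows, window=timedelta(minutes=15), radius_m=1000.0):
    """Return ids that have another row within the time window before them and within radius_m."""
    rows = sorted(rows, key=lambda r: r[1])
    found, start = [], 0
    for i, (rid, ts, lon, lat) in enumerate(rows):
        # Drop rows that are older than the window; start never passes i.
        while rows[start][1] < ts - window:
            start += 1
        j = start
        while j < len(rows) and rows[j][1] <= ts:
            if j != i and haversine_m(lon, lat, rows[j][2], rows[j][3]) <= radius_m:
                found.append(rid)
                break
            j += 1
    return found

# The five sample rows from the question.
sample = [
    (639438, datetime(2018, 2, 20, 18, 0), -122.3880011217841, 37.75538988423265),
    (639439, datetime(2018, 2, 20, 20, 30), -122.38756878451498, 37.760550220844614),
    (639440, datetime(2018, 2, 20, 20, 5), -122.39640513677658, 37.76130039041195),
    (639441, datetime(2018, 2, 24, 10, 0), -122.45819139221014, 37.724317534370066),
    (639442, datetime(2018, 2, 10, 9, 0), -122.44693382058489, 37.77000760474354),
]
matches = ids_with_recent_neighbor(sample)
```

On these five rows no pair is within 15 minutes, so matches is empty. Widening the window to 30 minutes picks up 639439, which is roughly 780 m from 639440. Within PostGIS, an index-aware predicate such as ST_DWithin on geography values is the commonly suggested replacement for ST_Distance(...) <= 1000, though I have not measured it on this table.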

BIGQUERY: How to query for a rolling monthly user active/churn

So I have a website with news articles and I'm trying to calculate 4 user types for each month. The user types are:
1. New User: A user who registers (their first article view) in the current month and viewed an article in the current month.
2. Retained User: A New User from the previous month OR a user who viewed an article in the previous month and in the current month.
3. Churned User: A New User or Retained User from the previous month who has not viewed an article in the current month OR a Churned User from the previous month.
4. Resurrected User: A Churned User from the previous month who has viewed an article in the current month.
**User Table A - Unique User Article Views**
- Current month = 2019-04-01 00:00:00 UTC
| user_id | viewed_at |
------------------------------------------
| 4 | 2019-04-01 00:00:00 UTC |
| 3 | 2019-04-01 00:00:00 UTC |
| 2 | 2019-04-01 00:00:00 UTC |
| 1 | 2019-03-01 00:00:00 UTC |
| 3 | 2019-03-01 00:00:00 UTC |
| 2 | 2019-02-01 00:00:00 UTC |
| 1 | 2019-02-01 00:00:00 UTC |
| 1 | 2019-01-01 00:00:00 UTC |
The table above outlines the following user types:
2019-01-01
* User 1: New
2019-02-01
* User 1: Retained
* User 2: New
2019-03-01
* User 1: Retained
* User 2: Churned
* User 3: New
2019-04-01
* User 1: Churned
* User 2: Resurrected
* User 3: Retained
* User 4: New
My desired table COUNTS the distinct user_id for each user type in each month.
| month_viewed_at | ut_new | ut_retained | ut_churned | ut_resurrected
------------------------------------------------------------------------------------
| 2019-04-01 00:00:00 UTC | 1 | 1 | 1 | 1
| 2019-03-01 00:00:00 UTC | 1 | 1 | 1 | 0
| 2019-02-01 00:00:00 UTC | 1 | 1 | 0 | 0
| 2019-01-01 00:00:00 UTC | 1 | 0 | 0 | 0
I simply am not sure where to start
I hope you read all my comments and actually tried something yourself, but as I don't see any update, I suppose you are still stuck here, so here we go ...
The query below is for BigQuery Standard SQL and should give you a direction:
#standardSQL
WITH temp1 AS (
SELECT user_id,
FORMAT_DATE('%Y-%m', DATE(viewed_at)) month_viewed_at,
DATE_DIFF(DATE(viewed_at), '2000-01-01', MONTH) pos,
DATE_DIFF(DATE(MIN(viewed_at) OVER(PARTITION BY user_id)), '2000-01-01', MONTH) first_pos
FROM `project.dataset.table`
), temp2 AS (
SELECT *, pos = first_pos AS new_user
FROM temp1
), temp3 AS (
SELECT *, LAST_VALUE(new_user) OVER(win) OR pos - 1 = LAST_VALUE(pos) OVER(win) AS retained_user
FROM temp2
WINDOW win AS (PARTITION BY user_id ORDER BY pos RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING)
)
SELECT month_viewed_at,
COUNTIF(new_user) AS new_users,
COUNTIF(retained_user) AS retained_users
FROM temp3
GROUP BY month_viewed_at
-- ORDER BY month_viewed_at DESC
Applied to your sample data, the result is:
Row month_viewed_at new_users retained_users
1 2019-04 1 1
2 2019-03 1 1
3 2019-02 1 1
4 2019-01 1 0
In temp1 we prepare the data: we format viewed_at for presentation in the output, and we also convert it to a consecutive month number counted from an arbitrary date (2000-01-01), so that we can use an analytic function with RANGE as opposed to ROWS.
In temp2 we simply identify new users, and in temp3 retained users.
I think this can be a good start, so I am leaving the rest for you.
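For the two types the query does not cover (churned and resurrected), it may help to walk the question's four definitions month by month outside SQL first. The Python sketch below does that on the sample data. It is not a BigQuery solution: the function name is made up, and it assumes every month in the range has at least one view, so that "previous month" is simply the previous entry in the sorted month list.

```python
from collections import defaultdict

# (user_id, month) pairs taken from the question's sample table.
views = [
    (4, "2019-04"), (3, "2019-04"), (2, "2019-04"), (1, "2019-03"),
    (3, "2019-03"), (2, "2019-02"), (1, "2019-02"), (1, "2019-01"),
]

def classify_by_month(views):
    """Count new / retained / churned / resurrected users for each month."""
    active = defaultdict(set)
    for user, month in views:
        active[month].add(user)

    previous = {}  # user -> type assigned in the previous month
    result = {}
    for month in sorted(active):
        current = {}
        for user in active[month]:
            before = previous.get(user)
            if before is None:
                current[user] = "new"          # first month with a view
            elif before == "churned":
                current[user] = "resurrected"  # came back after churning
            else:
                current[user] = "retained"     # active last month and now
        for user in previous:
            if user not in active[month]:
                current[user] = "churned"      # known user with no view this month
        counts = {"new": 0, "retained": 0, "churned": 0, "resurrected": 0}
        for kind in current.values():
            counts[kind] += 1
        result[month] = counts
        previous = current
    return result

user_types = classify_by_month(views)
print(user_types["2019-04"])
```

On the sample data this reproduces the desired table from the question, including one churned and one resurrected user in 2019-04.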

how to get second max date in postgres sql

I have the following situation, where I need to get several values between two invoice dates.
The query returns data based on invoices; what I need now is, for some values, to fetch data between this invoice's date and the previous invoice's date.
Approaches already tried:
1) A subquery would easily solve this, but I have to do it for 4-5 columns and it's a 15 GB database, so that's not feasible.
2) If I do it like this:
left join (select inv.date, inv.actno from invoice inv) as invo on invo.actno = act.id and invo.date < inv.date
then it returns all rows earlier than that date, but I need only the single row immediately before the main invoice date.
3) We cannot get the second max value in a subquery in the FROM clause, because the outer invoice is not grouped, so it might be the max, the middle, or the least.
4) We cannot pass values from another table into a subquery of a joined table.
Example:
create table inv (id serial ,date timestamp without time zone);
insert into inv (date) values('2017-01-31 00:00:00'),('2017-01-30 00:00:00'),('2017-01-29 00:00:00'),('2017-01-28 00:00:00'),('2017-01-27 00:00:00');
select * from inv;
id | date
----+---------------------
1 | 2017-01-31 00:00:00
2 | 2017-01-30 00:00:00
3 | 2017-01-29 00:00:00
4 | 2017-01-28 00:00:00
5 | 2017-01-27 00:00:00
(5 rows)
I need this
id |date |date | id
1 | 2017-01-31 00:00:00 | 2017-01-30 00:00:00 | 2
2 | 2017-01-30 00:00:00 | 2017-01-29 00:00:00 | 3
3 | 2017-01-29 00:00:00 | 2017-01-28 00:00:00 | 4
4 | 2017-01-28 00:00:00 | 2017-01-27 00:00:00 | 5
5 | 2017-01-27 00:00:00 |
I can't use a subquery in the SELECT list, as the database is big and I need to do this for 4-5 columns.
UPDATE 1
I need this from the same table, but using it twice in the FROM clause: my requirement is that I need several values joined from the invoice table, and then there are 4-5 columns in which I need things like the sum of the amount paid between the previous invoice and this one.
So I can take both invoice dates in a subquery and get the data between them.
UPDATE 2
lag will not solve this
select i.id,i.date, lag(date) over (order by date) from inv i order by id ;
id | date | lag
----+---------------------+---------------------
1 | 2017-01-31 00:00:00 | 2017-01-30 00:00:00
2 | 2017-01-30 00:00:00 | 2017-01-29 00:00:00
3 | 2017-01-29 00:00:00 | 2017-01-28 00:00:00
4 | 2017-01-28 00:00:00 | 2017-01-27 00:00:00
5 | 2017-01-27 00:00:00 |
(5 rows)
Time: 0.480 ms
test=# select i.id,i.date, lag(date) over (order by date) from inv i where id=2 order by id ;
id | date | lag
----+---------------------+-----
2 | 2017-01-30 00:00:00 |
(1 row)
Time: 0.525 ms
test=# select i.id,i.date, lag(date) over (order by date) from inv i where id in (2,3) order by id ;
id | date | lag
----+---------------------+---------------------
2 | 2017-01-30 00:00:00 | 2017-01-29 00:00:00
3 | 2017-01-29 00:00:00 |
lag is calculated only over the rows that the query returns; it is bounded by that query. See here: id 3 has a lag value, but the query could not return it because the WHERE clause filtered out the row it depends on. Something needs to be done in a left join so that the lag date can be taken from the same table by referencing it again in the FROM clause. Thanks again, buddy.
Like here?:
t=# select date as d1,
lag(date) over (order by date)
from inv
order by 1 desc;
d1 | lag
---------------------+---------------------
2017-01-31 00:00:00 | 2017-01-30 00:00:00
2017-01-30 00:00:00 | 2017-01-29 00:00:00
2017-01-29 00:00:00 | 2017-01-28 00:00:00
2017-01-28 00:00:00 | 2017-01-27 00:00:00
2017-01-27 00:00:00 |
(5 rows)
Time: 1.416 ms
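On the objection in UPDATE 2 that lag loses its value once the query filters rows: the window function is evaluated after WHERE, so one possible way around it is to compute lag over the unfiltered table in a derived table and filter outside it. The sketch below shows this on the sample data, using SQLite through Python's sqlite3 as a stand-in for PostgreSQL. It needs SQLite 3.25 or later for window functions, and the alias with_prev is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inv (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO inv (date) VALUES (?)",
    [("2017-01-31 00:00:00",), ("2017-01-30 00:00:00",), ("2017-01-29 00:00:00",),
     ("2017-01-28 00:00:00",), ("2017-01-27 00:00:00",)],
)

# lag() runs over every invoice inside the derived table;
# the id filter is applied afterwards, so no previous date is lost.
query = """
SELECT id, date, prev_date
FROM (SELECT id, date, LAG(date) OVER (ORDER BY date) AS prev_date
      FROM inv) AS with_prev
WHERE id IN (2, 3)
ORDER BY id
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

Here id 3 keeps its previous date 2017-01-28 even though id 4 is not in the filter. Whether this is fast enough on a 15 GB table is a separate question that the sketch does not answer.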