Using the LEAD function and a subtraction in one SQL query - sql

I want to use the LEAD function and do a calculation (a subtraction) in ONE select statement, instead of using a nested select statement to achieve the result.
Below is my current SQL statement, which does not work. It should compute the difference between CurrentDate and the next date and show it in its own column:
SELECT
ID,
CURRENTDATE,
LEAD(CURRENTDATE,1) OVER ( PARTITION BY ID ORDER BY ID, CURRENTDATE ) NX_DATE,
NX_DATE - CURRENTDATE AS DATE_DIFF
FROM TABLEA
Expected results should be:
ID CurrentDate NxDate DateDiff

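You can use datediff(interval, date1, date2) in an outer query. A column alias such as NX_DATE cannot be referenced in the same SELECT list that defines it, so compute it in a derived table first (if you need today's date rather than a stored column, use getdate()):
Select *, DATEDIFF(day, t1.CURRENTDATE, t1.NX_DATE) AS DATE_DIFF from (
SELECT
ID, CURRENTDATE,
LEAD(CURRENTDATE,1) OVER ( PARTITION BY ID ORDER BY ID, CURRENTDATE ) NX_DATE
FROM TABLEA) t1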

You can try the query below, which repeats the LEAD expression inside DATEDIFF so that no nested select is needed:
SELECT
ID, CURRENTDATE,
LEAD(CURRENTDATE,1) OVER ( PARTITION BY ID ORDER BY ID, CURRENTDATE ) NX_DATE,
datediff(day,CURRENTDATE,LEAD(CURRENTDATE,1) OVER ( PARTITION BY ID ORDER BY ID, CURRENTDATE)) AS DATE_DIFF
from tablename
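The single-select approach can be tried end to end. The following is a minimal sketch, not the asker's environment: it uses Python's built-in sqlite3 module as a stand-in engine (assuming the bundled SQLite is 3.25 or newer, which window functions require), a hypothetical tablea with made-up rows, and julianday() subtraction in place of SQL Server's DATEDIFF.

```python
import sqlite3

# In-memory stand-in for TABLEA; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablea (id INTEGER, currentdate TEXT)")
conn.executemany(
    "INSERT INTO tablea VALUES (?, ?)",
    [(1, "2020-01-01"), (1, "2020-01-05"), (1, "2020-01-12"), (2, "2020-02-01")],
)

# The LEAD expression is repeated inside the difference because the alias
# nx_date cannot be referenced in the SELECT list that defines it.
query = """
SELECT id,
       currentdate,
       LEAD(currentdate) OVER (PARTITION BY id ORDER BY currentdate) AS nx_date,
       CAST(julianday(LEAD(currentdate) OVER (PARTITION BY id ORDER BY currentdate))
            - julianday(currentdate) AS INTEGER) AS date_diff
FROM tablea
ORDER BY id, currentdate
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last row of each ID has no following row, so both nx_date and date_diff come back as NULL (None in Python).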

Related

How to use max(Date) and Distinct in Oracle DB. I would like to get data inserted very last within a day

I am looking for a way to fetch data by the latest date within the same day, per UserId.
UserId,Value1,Date
1, 2030,2020-09-07 10:58:58
1, 2020,2020-09-07 05:58:28
1, 2050,2020-09-08 19:58:28
2, 3000,2020-09-07 10:58:18
2, 2001,2020-09-06 10:58:55
3, 2400,2020-09-08 10:28:53
4, 2400,2020-09-07 13:28:53
e.g
where Date >= trunc(TO_DATE('20200907','YYYYMMDD')) and Date < trunc(TO_DATE('20200908','YYYYMMDD'))
Ideal Result
UserId,Value
1,2050
2,3000
4,2400
select UserId, value
What should I use: max(Date), distinct UserId, or group by UserId?
If value is the only column you want, then you could use keep:
select userid, max(value1) keep(dense_rank last order by dt) value1
from mytable
where dt >= date '2020-09-07' and dt < date '2020-09-08'
group by userid
order by userid
Notes:
this uses the standard date literal syntax rather than to_date() to build the dates
date is a reserved word in Oracle, hence not a good choice for a column name; I renamed it to dt in the query.
If you want more columns in the resultset, then filtering with window functions is more appropriate:
select t.*
from (
select t.*, row_number() over(partition by userid order by dt desc) rn
from mytable t
where dt >= date '2020-09-07' and dt < date '2020-09-08'
) t
where rn = 1
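The row_number() filter can be checked against the sample rows from the question. This is only a sketch under stated assumptions: Python's built-in sqlite3 module stands in for Oracle (SQLite 3.25 or newer is assumed), timestamps are stored as ISO text so plain string comparison replaces Oracle's date literals, and the table name mytable and column dt follow the renaming used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (userid INTEGER, value1 INTEGER, dt TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, 2030, "2020-09-07 10:58:58"),
    (1, 2020, "2020-09-07 05:58:28"),
    (1, 2050, "2020-09-08 19:58:28"),
    (2, 3000, "2020-09-07 10:58:18"),
    (2, 2001, "2020-09-06 10:58:55"),
    (3, 2400, "2020-09-08 10:28:53"),
    (4, 2400, "2020-09-07 13:28:53"),
])

# Rank each user's rows for the chosen day, newest first, and keep rank 1.
query = """
SELECT userid, value1
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY userid ORDER BY dt DESC) AS rn
    FROM mytable t
    WHERE dt >= '2020-09-07' AND dt < '2020-09-08'
) AS ranked
WHERE rn = 1
ORDER BY userid
"""
latest = conn.execute(query).fetchall()
print(latest)  # [(1, 2030), (2, 3000), (4, 2400)]
```

With these rows and the 2020-09-07 filter, user 1 resolves to 2030, the last value inserted on that day; user 3 has no rows on that day and is absent from the result.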

Select latest 30 dates for each unique ID

This is a sample data file
The data contains unique IDs with different latitudes and longitudes at multiple timestamps. I would like to select the rows for the latest 30 days of coordinates for each unique ID. Please help me with how to write the query. This data is in a Hive table.
Regards,
Akshay
According to your example above (where there are no current-year dates for id=2,3), you can number the dates for each id in descending order using the window function ROW_NUMBER(), then keep just the latest 30 values:
--get all rows for each id where num<=30 (the latest 30 dates for each id)
select * from
(
--number the dates for each id in descending order
select *, row_number()over(partition by ID order by DATE desc)num from Table
)X
where num<=30
If you need only unique dates (ignoring the time part) for each id, you can try this query:
select * from
(
--numbering date for each id
select *, row_number()over(partition by ID order by new_date desc)num
from
(
-- remove duplicates using distinct
select distinct ID,cast(DATE as date)new_date from Table
)X
)Y
where num<=30
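The distinct-then-rank pattern above can be exercised on a small data set. The sketch below is an illustration only: it runs on Python's built-in sqlite3 module rather than Hive (SQLite 3.25 or newer assumed), the coords table and its rows are hypothetical, and the limit is set to 2 instead of 30 so the effect is visible with a handful of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coords (id INTEGER, ts TEXT, lat REAL, lon REAL)")
conn.executemany("INSERT INTO coords VALUES (?, ?, ?, ?)", [
    (1, "2019-03-01 08:00:00", 12.90, 77.60),
    (1, "2019-03-01 17:30:00", 12.91, 77.61),  # same day as the row above
    (1, "2019-03-02 09:00:00", 12.92, 77.62),
    (1, "2019-03-03 09:00:00", 12.93, 77.63),
    (2, "2018-12-30 10:00:00", 19.07, 72.87),
])

N = 2  # would be 30 for the question as asked

# Inner query: one row per (id, calendar day). Middle query: number the
# days per id, newest first. Outer query: keep the newest N days.
query = """
SELECT id, day
FROM (
    SELECT id, day,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY day DESC) AS num
    FROM (SELECT DISTINCT id, date(ts) AS day FROM coords) AS d
) AS ranked
WHERE num <= ?
ORDER BY id, day DESC
"""
rows = conn.execute(query, (N,)).fetchall()
print(rows)
```

Because the ranking is per ID, an ID whose data is old (id 2 here) still returns its own latest days, which a filter relative to today's date would not do.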
In Oracle this will be:
SELECT * FROM TEST_DATE1
WHERE DATEUPDT > SYSDATE - 30;
In SQL Server, the same filter can be written as:
select * from MyTable
where
[Date]>=dateadd(d, -30, getdate());
To group by ID and perform aggregation, use something like this:
select ID,
count(*) row_count,
max(Latitude) max_lat,
max(Longitude) max_long
from MyTable
where
[Date]>=dateadd(d, -30, getdate())
group by ID;

Segregate the row based on the date time column per each month

I have the following table in a SQL Server database.
The format of the start date is MM/DD/YYYY.
I need the result to look like the following table.
Based on the start date column, each record should be split into one row per month for the period between the start date and the end date.
You can use a recursive CTE:
with cte as (
select id, startdate as dte, enddate
from t
union all
select id,
dateadd(day, 1, eomonth(dte)),
enddate
from cte
where eomonth(dte) < enddate
)
select id, dte,
lead(dte, 1, enddate) over (partition by id order by dte)
from cte;
Thank you Gordon Linoff
Using CTE I have got the following result
My code
WITH cte
AS (SELECT 1 AS id,
Cast('2010-01-20' AS DATE) AS trg,
Cast('2010-01-20' AS DATE) AS strt_dte,
Cast('2010-03-15' AS DATE) AS end_dte
UNION ALL
SELECT id,
Dateadd(day, 1, Eomonth (trg)),
strt_dte,
end_dte
FROM cte
WHERE Eomonth(trg) < end_dte)
SELECT id,
trg,
strt_dte,
end_dte,
Lead (trg, 1, end_dte)
OVER (
partition BY id
ORDER BY trg) AS lead_result
FROM cte
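For readers without a SQL Server instance, the same month-splitting recursion can be sketched on Python's built-in sqlite3 module. The assumptions are labeled here and in the comments: SQLite has no EOMONTH or DATEADD, so date(trg, 'start of month', '+1 month') is used to jump to the first day of the next month, and COALESCE around LEAD plays the role of the three-argument LEAD(trg, 1, end_dte). SQLite 3.25 or newer is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Anchor row: the period 2010-01-20 .. 2010-03-15 from the post above.
# Recursive step: move to the first day of the following month while that
# day is still inside the period (equivalent to EOMONTH(trg) < end_dte).
query = """
WITH RECURSIVE cte(id, trg, end_dte) AS (
    SELECT 1, '2010-01-20', '2010-03-15'
    UNION ALL
    SELECT id, date(trg, 'start of month', '+1 month'), end_dte
    FROM cte
    WHERE date(trg, 'start of month', '+1 month') <= end_dte
)
SELECT id,
       trg AS period_start,
       -- the last segment has no next row, so fall back to end_dte
       COALESCE(LEAD(trg) OVER (PARTITION BY id ORDER BY trg), end_dte) AS period_end
FROM cte
ORDER BY id, trg
"""
periods = conn.execute(query).fetchall()
for p in periods:
    print(p)
```

Each output row is one monthly segment; period_end is the start of the next segment, or the overall end date for the final one.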

how to calculate difference between dates in BigQuery

I have a table named Employees with Columns: PersonID, Name, StartDate. I want to calculate 1) difference in days between the newest and oldest employee and 2) the longest period of time (in days) without any new hires. I have tried to use DATEDIFF, however the dates are in a single column and I'm not sure what other method I should use. Any help would be greatly appreciated
Below is for BigQuery Standard SQL
#standardSQL
SELECT
SUM(days_before_next_hire) AS days_between_newest_and_oldest_employee,
MAX(days_before_next_hire) - 1 AS longest_period_without_new_hire
FROM (
SELECT
DATE_DIFF(
StartDate,
LAG(StartDate) OVER(ORDER BY StartDate),
DAY
) days_before_next_hire
FROM `project.dataset.your_table`
)
You can test and play with the above using dummy data, as in the example below:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT DATE '2019-01-01' StartDate UNION ALL
SELECT '2019-01-03' StartDate UNION ALL
SELECT '2019-01-13' StartDate
)
SELECT
SUM(days_before_next_hire) AS days_between_newest_and_oldest_employee,
MAX(days_before_next_hire) - 1 AS longest_period_without_new_hire
FROM (
SELECT
DATE_DIFF(
StartDate,
LAG(StartDate) OVER(ORDER BY StartDate),
DAY
) days_before_next_hire
FROM `project.dataset.your_table`
)
with result
Row days_between_newest_and_oldest_employee longest_period_without_new_hire
1 12 9
Note the use of -1 in calculating longest_period_without_new_hire: whether to apply this adjustment is up to you and depends on how you prefer to count gaps.
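The same two figures can be reproduced locally with the three dummy dates from the example. This is a sketch under assumptions, not BigQuery itself: it uses Python's built-in sqlite3 module (SQLite 3.25 or newer), a hypothetical employees table with invented names, and julianday() subtraction in place of DATE_DIFF. The gap is reported without the -1 adjustment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (person_id INTEGER, name TEXT, start_date TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    (1, "Ann", "2019-01-01"),
    (2, "Bob", "2019-01-03"),
    (3, "Cy", "2019-01-13"),
])

# 1) Days between the oldest and the newest start date.
span = conn.execute("""
SELECT CAST(julianday(MAX(start_date)) - julianday(MIN(start_date)) AS INTEGER)
FROM employees
""").fetchone()[0]

# 2) Longest gap between consecutive hires: LAG fetches the previous
#    start date in date order; the first row has none and yields NULL.
longest_gap = conn.execute("""
SELECT MAX(gap)
FROM (
    SELECT CAST(julianday(start_date)
                - julianday(LAG(start_date) OVER (ORDER BY start_date)) AS INTEGER) AS gap
    FROM employees
) AS gaps
""").fetchone()[0]

print(span, longest_gap)  # 12 10
```

Subtracting 1 from longest_gap gives 9, matching the longest_period_without_new_hire value in the result shown above.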
1) difference in days between the newest and oldest record
WITH table AS (
SELECT DATE(created_at) date, *
FROM `githubarchive.day.201901*`
WHERE _table_suffix<'2'
AND repo.name = 'google/bazel-common'
AND type='ForkEvent'
)
SELECT DATE_DIFF(MAX(date), MIN(date), DAY) max_minus_min
FROM table
2) the longest period of time (in days) without any new records
WITH table AS (
SELECT DATE(created_at) date, *
FROM `githubarchive.day.201901*`
WHERE _table_suffix<'2'
AND repo.name = 'google/bazel-common'
AND type='ForkEvent'
)
SELECT MAX(diff) max_diff
FROM (
SELECT DATE_DIFF(date, LAG(date) OVER(ORDER BY date), DAY) diff
FROM table
)

SQL Server: Attempting to output a count with a date

I am trying to write a statement and just a bit puzzled what is the best way to put it together. So I am doing a UNION on a number of tables and then from there I want to produce as the output a count for the UserID within that day.
So I will have numerous tables union such as:
Order ID, USERID, DATE, Task Completed.
UNION
Order ID, USERID, DATE, Task Completed
etc
Above is layout of the table which will have 4 tables union together with same names.
The output I want from the statement is a count for each USERID of the rows that occurred within the last 24 hours.
So output should be:
USERID--- COUNT OUTPUT-- DATE
I was attempting a WHERE clause, but I think the output is not exactly what I am after. Can anyone point me in the right direction, and is there an alternative to the union? Maybe a join would be a better alternative. Any help would be appreciated.
I will eventually then put this into a SSRS report, so it gets updated daily.
You can try this:
select USERID, count(*) as [COUNT], cast(DATE as date) as [DATE]
from
(select USERID, DATE From SomeTable1
union all
select USERID, DATE From SomeTable2
....
) t
where DATE <= GETDATE() AND DATE >= DATEADD(hh, -24, GETDATE())
group by USERID, cast(DATE as date)
First, you should use union all rather than union. Second, you need to aggregate and use count(distinct) to get what you want. So, the query you want is something like:
select count(distinct userid)
from ((select date, userid
from table1
where date >= '2015-05-26'
) union all
(select date, userid
from table2
where date >= '2015-05-26'
) union all
(select date, userid
from table3
where date >= '2015-05-26'
)
) du
Note that this hardcodes the date. In SQL Server, you would do something like:
date >= cast(getdate() - 1 as date)
And in MySQL
date >= date_sub(curdate(), interval 1 day)
EDIT:
I read the question as wanting a single day. It is easy enough to extend to all days:
select cast(date as date) as dte, count(distinct userid)
from ((select date, userid
from table1
) union all
(select date, userid
from table2
) union all
(select date, userid
from table3
)
) du
group by cast(date as date)
order by dte;
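To see how union all interacts with the two kinds of count, here is a small runnable sketch. It is an illustration under assumptions: Python's built-in sqlite3 module replaces SQL Server, only two hypothetical tables are combined, the column is named dt, and date(dt) stands in for cast(date as date).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("table1", "table2"):
    conn.execute(f"CREATE TABLE {name} (userid INTEGER, dt TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (1, "2015-05-26 08:00:00"),
    (2, "2015-05-26 09:00:00"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [
    (1, "2015-05-26 08:00:00"),  # identical to a row in table1
    (3, "2015-05-27 10:00:00"),
])

# UNION ALL keeps the identical row from both tables, so COUNT(*) sees it
# twice, while COUNT(DISTINCT userid) counts each user once per day.
query = """
SELECT date(dt) AS dte,
       COUNT(DISTINCT userid) AS users,
       COUNT(*) AS row_count
FROM (
    SELECT dt, userid FROM table1
    UNION ALL
    SELECT dt, userid FROM table2
) AS du
GROUP BY date(dt)
ORDER BY dte
"""
per_day = conn.execute(query).fetchall()
print(per_day)  # [('2015-05-26', 2, 3), ('2015-05-27', 1, 1)]
```

With plain UNION the duplicated row would be collapsed before counting, so row_count for 2015-05-26 would drop to 2; which behavior is wanted depends on whether repeated rows across tables are real events.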
For even more readability, you could use a CTE:
;WITH cte_CTEName AS(
SELECT UserID, Date, [Task Completed] FROM Table1
UNION
SELECT UserID, Date, [Task Completed] FROM Table2
etc
)
SELECT COUNT(UserID) AS [Count] FROM cte_CTEName
WHERE Date <= GETDATE() AND Date >= DATEADD(hh, -24, GETDATE())
I think this is what you are trying to achieve...
Select
UserID,
Date,
Count(1)
from
(Select *
from table1
Union All
Select *
from table2
Union All
Select *
from table3
Union All
Select *
from table4
) a
Group by
Userid,
Date