How to use max(Date) and Distinct in Oracle DB. I would like to get the data inserted last within a day

I am looking for a way to fetch, for each UserId, the row with the latest date within a given day.
UserId,Value1,Date
1, 2030,2020-09-07 10:58:58
1, 2020,2020-09-07 05:58:28
1, 2050,2020-09-08 19:58:28
2, 3000,2020-09-07 10:58:18
2, 2001,2020-09-06 10:58:55
3, 2400,2020-09-08 10:28:53
4, 2400,2020-09-07 13:28:53
e.g
where Date >= trunc(TO_DATE('20200907','YYYYMMDD')) and Date < trunc(TO_DATE('20200908','YYYYMMDD'))
Ideal Result
UserId,Value
1,2030
2,3000
4,2400
select UserId, value
What should I use ?
max(Date) ? Distinct userId ? Group by userId?

If value is the only column you want, then you could use keep:
select userid, max(value1) keep(dense_rank last order by dt) value1
from mytable
where dt >= date '2020-09-07' and dt < date '2020-09-08'
group by userid
order by userid
Notes:
this uses the standard date literal syntax rather than to_date() to build literal dates
date is a reserved word in Oracle, hence not a good choice for a column name; I renamed it to dt in the query.
If you want more columns in the resultset, then filtering with window functions is more appropriate:
select t.*
from (
select t.*, row_number() over(partition by userid order by dt desc) rn
from mytable t
where dt >= date '2020-09-07' and dt < date '2020-09-08'
) t
where rn = 1
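To try the row_number() approach without an Oracle instance, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The table name mytable and column dt follow the answer above; storing the timestamps as text and passing the day bounds as parameters are assumptions made for the sketch, not part of the original answer. With the sample rows as listed in the question, the latest row for user 1 on 2020-09-07 carries the value 2030.

```python
import sqlite3

# Sample rows copied from the question. SQLite has no DATE type, so the
# timestamps are stored as ISO-8601 text, which sorts chronologically.
ROWS = [
    (1, 2030, "2020-09-07 10:58:58"),
    (1, 2020, "2020-09-07 05:58:28"),
    (1, 2050, "2020-09-08 19:58:28"),
    (2, 3000, "2020-09-07 10:58:18"),
    (2, 2001, "2020-09-06 10:58:55"),
    (3, 2400, "2020-09-08 10:28:53"),
    (4, 2400, "2020-09-07 13:28:53"),
]

QUERY = """
select userid, value1
from (
    select t.*,
           row_number() over (partition by userid order by dt desc) as rn
    from mytable t
    where dt >= ? and dt < ?
) ranked
where rn = 1
order by userid
"""


def latest_per_user(day_start, day_end):
    """Return (userid, value1) of the newest row per user in [day_start, day_end)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (userid integer, value1 integer, dt text)")
    conn.executemany("insert into mytable values (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, (day_start, day_end)).fetchall()
    conn.close()
    return rows


result = latest_per_user("2020-09-07", "2020-09-08")
print(result)
```

User 3 does not appear because that user has no rows on the requested day.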

Related

Select latest 30 dates for each unique ID

This is a sample data file
The data contains unique IDs with different latitudes and longitudes at multiple timestamps. I would like to select the rows for the latest 30 days of coordinates for each unique ID. Please help me with how to write the query. This data is in a Hive table.
Regards,
Akshay
According to your example above (where ids 2 and 3 have no dates in the current year), you can number the dates for each id in descending order using the window function ROW_NUMBER(), then keep the latest 30 rows:
--get all rows for each id where num<=30 (the latest 30 dates for each id)
select * from
(
--numbering each date for each id order by descending
select *, row_number()over(partition by ID order by DATE desc)num from Table
)X
where num<=30
If you need only the unique dates (ignoring the time part) for each id, you can try this query:
select * from
(
--numbering date for each id
select *, row_number()over(partition by ID order by new_date desc)num
from
(
-- remove duplicates using distinct
select distinct ID,cast(DATE as date)new_date from Table
)X
)Y
where num<=30
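The distinct-dates variant above can be checked on a small data set. The following is only a sketch, run against SQLite from Python rather than Hive; the table name points, the column ts, and the limit of 2 instead of 30 are hypothetical choices made to keep the example short.

```python
import sqlite3

QUERY = """
select id, day
from (
    select id, day,
           row_number() over (partition by id order by day desc) as num
    from (
        -- collapse timestamps to calendar days and drop duplicates
        select distinct id, date(ts) as day from points
    ) d
) ranked
where num <= ?
order by id, day desc
"""


def latest_days(rows, n):
    """Return the n most recent distinct calendar days for every id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table points (id integer, ts text)")
    conn.executemany("insert into points values (?, ?)", rows)
    out = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return out


sample = [
    (1, "2020-01-01 08:00:00"),
    (1, "2020-01-02 09:00:00"),
    (1, "2020-01-03 10:00:00"),
    (1, "2020-01-03 18:30:00"),  # same day as the previous row
    (2, "2019-05-01 12:00:00"),
]
latest_two = latest_days(sample, 2)
print(latest_two)
```

Note that this keeps the latest n dates that exist for each id, so an id whose data is old (id 2 here) is still returned, unlike a filter relative to the current date.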
In Oracle this will be:
SELECT * FROM TEST_DATE1
WHERE DATEUPDT > SYSDATE - 30;
In SQL Server, a filter on the last 30 days looks like this:
select * from MyTable
where
[Date]>=dateadd(d, -30, getdate());
To group by ID and perform aggregation, something like this
select ID,
count(*) row_count,
max(Latitude) max_lat,
max(Longitude) max_long
from MyTable
where
[Date]>=dateadd(d, -30, getdate())
group by ID;

Postgres DB query to get the count, and first and last ids by date in a single query

I have the following db structure.
table
-----
id (uuids)
date (TIMESTAMP)
I want to write a query in postgres (actually cockroachdb which uses the postgres engine, so postgres query should be fine).
The query should return a count of the records between 2 dates, the id of the record with the latest date, and the id of the record with the earliest date within that range.
So the query should return the following:
count, id(of the earliest record in the range), id (of the latest record in the range)
thanks.
You can use row_number() twice, then conditional aggregation:
select
no_records,
min(id) filter(where rn_asc = 1) first_id,
max(id) filter(where rn_desc = 1) last_id
from (
select
id,
count(*) over() no_records,
row_number() over(order by date asc) rn_asc,
row_number() over(order by date desc) rn_desc
from mytable
where date >= ? and date < ?
) t
where 1 in (rn_asc, rn_desc)
group by no_records
The question marks represent the (inclusive) start and (exclusive) end of the date interval.
Of course, if ids are always increasing, simple aggregation is sufficient:
select count(*), min(id) first_id, max(id) last_id
from mytable
where date >= ? and date < ?
Unfortunately, Postgres doesn't support first_value() as an aggregation function. One method is to use arrays:
select count(*),
(array_agg(id order by date asc))[1] as first_id,
(array_agg(id order by date desc))[1] as last_id
from t
where date >= ? and date <= ?
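As a rough check of the "row_number() twice plus conditional aggregation" idea, the sketch below runs it on SQLite from Python. It is not the Postgres or CockroachDB query itself: the table events, the column created, and the short string ids are hypothetical, and a CASE expression stands in for the FILTER clause so the sketch also runs on older SQLite builds.

```python
import sqlite3

QUERY = """
select no_records,
       max(case when rn_asc = 1 then id end) as first_id,
       max(case when rn_desc = 1 then id end) as last_id
from (
    select id,
           count(*) over () as no_records,
           row_number() over (order by created asc) as rn_asc,
           row_number() over (order by created desc) as rn_desc
    from events
    where created >= ? and created < ?
) t
where rn_asc = 1 or rn_desc = 1
group by no_records
"""


def count_first_last(rows, start, end):
    """Return [(count, earliest_id, latest_id)] for rows with start <= created < end."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table events (id text, created text)")
    conn.executemany("insert into events values (?, ?)", rows)
    out = conn.execute(QUERY, (start, end)).fetchall()
    conn.close()
    return out


sample = [
    ("c", "2021-01-01"),
    ("a", "2021-01-05"),
    ("b", "2021-01-03"),
    ("z", "2021-02-01"),  # outside the half-open range used below
]
summary = count_first_last(sample, "2021-01-01", "2021-02-01")
print(summary)
```

The ids are deliberately not in date order, which is the case where plain min(id) and max(id) would give the wrong answer.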

Segregate the row based on the date time column per each month

I have the following table in a SQL Server database.
The format of the start date is MM/DD/YYYY.
I need the result to look like the following table.
Based on the start date column, each record should be split into one row per month for the period between start date and end date.
You can use a recursive CTE:
with cte as (
select id, startdate as dte, enddate
from t
union all
select id,
dateadd(day, 1, eomonth(dte)),
enddate
from cte
where eomonth(dte) < enddate
)
select id, dte,
lead(dte, 1, enddate) over (partition by id order by dte)
from cte;
Thank you Gordon Linoff
Using CTE I have got the following result
My code
WITH cte
AS (SELECT 1 AS id,
Cast('2010-01-20' AS DATE) AS trg,
Cast('2010-01-20' AS DATE) AS strt_dte,
Cast('2010-03-15' AS DATE) AS end_dte
UNION ALL
SELECT id,
Dateadd(day, 1, Eomonth (trg)),
strt_dte,
end_dte
FROM cte
WHERE Eomonth(trg) < end_dte)
SELECT id,
trg,
strt_dte,
end_dte,
Lead (trg, 1, end_dte)
OVER (
partition BY id
ORDER BY trg) AS lead_result
FROM cte
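The month split above can also be reproduced outside SQL Server. The sketch below runs the same recursive idea on SQLite from Python; since SQLite has no eomonth(), it uses date(..., 'start of month', '+1 month') as a stand-in, and the table t with columns id, startdate and enddate is assumed from the first answer.

```python
import sqlite3

QUERY = """
with recursive cte(id, dte, enddate) as (
    select id, startdate, enddate from t
    union all
    select id,
           date(dte, 'start of month', '+1 month'),  -- first day of next month
           enddate
    from cte
    -- keep going while the current month ends before the end date
    where date(dte, 'start of month', '+1 month', '-1 day') < enddate
)
select id, dte,
       lead(dte, 1, enddate) over (partition by id order by dte) as next_dte
from cte
order by id, dte
"""


def split_by_month(rows):
    """Split each (id, startdate, enddate) row into one segment per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, startdate text, enddate text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    out = conn.execute(QUERY).fetchall()
    conn.close()
    return out


segments = split_by_month([(1, "2010-01-20", "2010-03-15")])
print(segments)
```

Each output row carries the segment start and the start of the following segment; the last segment falls back to the end date through the third argument of lead().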

How to replace the loop in MsSQL?

For example
If I want to check every day of last week:
select count(ID) from DB where date < '2019/07/01'
select count(ID) from DB where date < '2019/07/02'
select count(ID) from DB where date < '2019/07/03'
...
select count(ID) from DB where date < '2019/07/08'
and get a result like
0701 10
0702 15
0703 23
...
0707 45
How can I do this without a loop, in one query?
You can generate the dates using a recursive CTE (or other method) and then run the query:
with dates as (
select convert(date, '2019-07-01') as dte union all
select dateadd(day, 1, dte)
from dates
where dte < '2019-07-08'
)
select d.dte,
(select count(*) from DB where DB.date < d.dte)
from dates d;
More efficient, though, is a cumulative sum:
select d.*
from (select date, count(*) as cnt, sum(count(*)) over (order by date) as running_cnt
from db
group by date
) d
where d.date >= '2019-07-01' and d.date < '2019-07-09';
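The generated-calendar approach from this answer can be tried on a few rows. This is a sketch on SQLite from Python, not SQL Server syntax; the column name created (in place of date) and the four sample rows are hypothetical.

```python
import sqlite3

QUERY = """
with recursive dates(dte) as (
    select ?
    union all
    select date(dte, '+1 day') from dates where dte < ?
)
select d.dte,
       (select count(*) from db where db.created < d.dte) as cnt
from dates d
order by d.dte
"""


def counts_before_each_day(rows, first_day, last_day):
    """For every day in [first_day, last_day], count rows created before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table db (id integer, created text)")
    conn.executemany("insert into db values (?, ?)", rows)
    out = conn.execute(QUERY, (first_day, last_day)).fetchall()
    conn.close()
    return out


sample = [
    (1, "2019-06-30"),
    (2, "2019-07-01"),
    (3, "2019-07-01"),
    (4, "2019-07-03"),
]
daily = counts_before_each_day(sample, "2019-07-01", "2019-07-04")
print(daily)
```

Days without any rows (2019-07-02 here) still appear in the output, because the dates come from the calendar CTE rather than from the data.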
Are you just counting the number by day?
Something like
SELECT MONTH(date), DAY(date), COUNT(ID)
FROM DB
GROUP BY MONTH(date), DAY(date);
(assuming date is a DATE or DATETIME)
Do it with a window count. The frame range between current row and current row selects exactly the rows of the current day.
select distinct date, count(1) over (order by Date) - count(1) over (order by Date range between current row and current row)
from DB
where date between '2019-07-01' and '2019-07-08';
I assume the date column is of type DATE.

how to calculate difference between dates in BigQuery

I have a table named Employees with Columns: PersonID, Name, StartDate. I want to calculate 1) difference in days between the newest and oldest employee and 2) the longest period of time (in days) without any new hires. I have tried to use DATEDIFF, however the dates are in a single column and I'm not sure what other method I should use. Any help would be greatly appreciated
Below is for BigQuery Standard SQL
#standardSQL
SELECT
SUM(days_before_next_hire) AS days_between_newest_and_oldest_employee,
MAX(days_before_next_hire) - 1 AS longest_period_without_new_hire
FROM (
SELECT
DATE_DIFF(
StartDate,
LAG(StartDate) OVER(ORDER BY StartDate),
DAY
) days_before_next_hire
FROM `project.dataset.your_table`
)
You can test and play with the above using dummy data, as in the example below
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT DATE '2019-01-01' StartDate UNION ALL
SELECT '2019-01-03' StartDate UNION ALL
SELECT '2019-01-13' StartDate
)
SELECT
SUM(days_before_next_hire) AS days_between_newest_and_oldest_employee,
MAX(days_before_next_hire) - 1 AS longest_period_without_new_hire
FROM (
SELECT
DATE_DIFF(
StartDate,
LAG(StartDate) OVER(ORDER BY StartDate),
DAY
) days_before_next_hire
FROM `project.dataset.your_table`
)
with result
Row days_between_newest_and_oldest_employee longest_period_without_new_hire
1 12 9
Note the use of -1 in calculating longest_period_without_new_hire: whether to apply this adjustment is up to you and depends on how you prefer to count gaps
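For readers without a BigQuery project at hand, the LAG-based calculation can be approximated locally. The sketch below uses SQLite from Python, with julianday() differences standing in for DATE_DIFF(..., DAY); the table name employees is hypothetical and the three dates are the dummy data from the example above.

```python
import sqlite3

QUERY = """
select sum(gap) as days_between_newest_and_oldest_employee,
       max(gap) - 1 as longest_period_without_new_hire
from (
    -- days between each hire and the previous one; NULL for the first hire
    select cast(julianday(startdate)
                - julianday(lag(startdate) over (order by startdate))
                as integer) as gap
    from employees
)
"""


def hire_gaps(start_dates):
    """Return (total span in days, longest gap in days minus one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table employees (startdate text)")
    conn.executemany("insert into employees values (?)", [(d,) for d in start_dates])
    row = conn.execute(QUERY).fetchone()
    conn.close()
    return row


gaps = hire_gaps(["2019-01-01", "2019-01-03", "2019-01-13"])
print(gaps)
```

Summing the consecutive gaps gives the same total as subtracting the oldest start date from the newest, which is why a single pass yields both numbers.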
1) difference in days between the newest and oldest record
WITH table AS (
SELECT DATE(created_at) date, *
FROM `githubarchive.day.201901*`
WHERE _table_suffix<'2'
AND repo.name = 'google/bazel-common'
AND type='ForkEvent'
)
SELECT DATE_DIFF(MAX(date), MIN(date), DAY) max_minus_min
FROM table
2) the longest period of time (in days) without any new records
WITH table AS (
SELECT DATE(created_at) date, *
FROM `githubarchive.day.201901*`
WHERE _table_suffix<'2'
AND repo.name = 'google/bazel-common'
AND type='ForkEvent'
)
SELECT MAX(diff) max_diff
FROM (
SELECT DATE_DIFF(date, LAG(date) OVER(ORDER BY date), DAY) diff
FROM table
)