I have a table in Db2 called myTable.
It has several columns:
a | b | date1 | date2
---------------------------------------------
1 abc <null> 2014-09-02
2 aax 2015-12-30 2016-09-02
2 bax 2015-10-20 <null>
2 ayx 2014-12-10 2016-02-12
As seen from values above, date1 and date2 can have null values as well.
How can I get the max of date1 and date2 taken together?
That is, the output of the query should be 2016-09-02, as that is the max date of all the dates present in date1 and date2.
I am using Db2-9.
Thanks for reading!
How about using a UNION query:
SELECT MAX(t.newDate)
FROM
(
SELECT date1 AS newDate
FROM myTable
UNION
SELECT date2 AS newDate
FROM myTable
) t
Another option:
SELECT CASE WHEN t.date1 > t.date2 OR t.date2 IS NULL THEN t.date1 ELSE t.date2 END
FROM
(
SELECT (SELECT MAX(date1) FROM myTable) AS date1,
(SELECT MAX(date2) FROM myTable) AS date2
FROM SYSIBM.SYSDUMMY1
) t
MAX() is an interesting beast...
It's available as both a scalar function and an aggregate one.
So all you really need is
select max(max(coalesce(date1,'0001-01-01')
,coalesce(date2,'0001-01-01')
)
)
from mytable
The outer MAX() is the aggregate version, the inner is the scalar one.
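If you want to try the two ideas above without a Db2 instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in (an assumption on my part, not part of the question); it happens to share the aggregate/scalar MAX() split, and the dates are stored as ISO text so string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (a INTEGER, b TEXT, date1 TEXT, date2 TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, "abc", None, "2014-09-02"),
        (2, "aax", "2015-12-30", "2016-09-02"),
        (2, "bax", "2015-10-20", None),
        (2, "ayx", "2014-12-10", "2016-02-12"),
    ],
)

# UNION approach: stack both columns into one, then take a single aggregate MAX.
# The aggregate ignores NULLs, so no COALESCE is needed.
union_max = conn.execute(
    """
    SELECT MAX(newDate) FROM (
        SELECT date1 AS newDate FROM myTable
        UNION
        SELECT date2 AS newDate FROM myTable
    )
    """
).fetchone()[0]

# Scalar-inside-aggregate approach: the inner two-argument MAX is scalar and
# would return NULL if either argument were NULL, hence the COALESCE.
nested_max = conn.execute(
    """
    SELECT MAX(MAX(COALESCE(date1, '0001-01-01'),
                   COALESCE(date2, '0001-01-01')))
    FROM myTable
    """
).fetchone()[0]

print(union_max, nested_max)
```

Both queries should agree on the sample rows from the question.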
I have the following data set:
I want to create a new column that sums the last 7 days of sales. The query result should look like the following:
Please help.
Thanks!
In standard SQL, you would use a window function -- assuming you have data for each day:
select t.*,
sum(sales) over (partition by itemid order by date rows between 6 preceding and current row) as sales_7
from t;
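A quick way to see what that window frame does is to run it against a toy table. This is only a sketch: the table layout (itemid, sale_date, sales) and the data are made up, since the original data set isn't shown, and it uses Python's sqlite3 module (SQLite 3.25 or later for window functions) rather than your actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per item per day, sales equal to the day number.
conn.execute("CREATE TABLE t (itemid INTEGER, sale_date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, f"2019-10-{day:02d}", day) for day in range(1, 11)],
)

# The frame covers the current row plus the 6 rows before it, per item.
sales_7 = [
    row[0]
    for row in conn.execute(
        """
        SELECT SUM(sales) OVER (
                   PARTITION BY itemid
                   ORDER BY sale_date
                   ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
               ) AS sales_7
        FROM t
        ORDER BY itemid, sale_date
        """
    )
]
print(sales_7)
```

Note the frame counts rows, not days, which is why this only equals "last 7 days" when every day has exactly one row per item.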
Use the sum() aggregate function with group by:
select country,itemid,year,monthnumber,week, sum(sales) as sales_last_7days from your_table
where date>=DATEADD(day, -7, getdate()) and date< getdate()
group by country,itemid,year,monthnumber,week
with window:
select (list other columns here), sum(sum(sales)) over
(partition by week
order by date
rows between 6 preceding and current row)
from table
group by date, week;
Note that adding week to the group by doesn't change the grouping, because each date belongs to exactly one week, but it has to be there so that week can be used in the window.
It seems you are working with SQL Server. If so, you can use apply:
select t.*, t1.[last7day]
from table t outer apply
(select sum(t1.sales) as [last7day]
from table t1
where t.itemid = t1.itemid and
t1.date between dateadd(day, -6, t.date) and t.date
) t1;
If you don't have exactly one day for each row, for example if you have a list of transactions...
The below example completely confused me the first time I saw it, so I've tried to comment as much as I can to explain what's happening.
Suppose we have a table tbl with date column dt and amount column amt, and for each date in tbl we want to return a rolling sum of the amount from the current day and the past 6 days.
select distinct -- see note after code on what this distinct is doing.
dt
, ( -- Has to be in brackets to denote we're returning 1 value per row.
-- for each row of T1:
select sum(T2.amt) -- the sum of amounts in T2. The where clause will restrict which rows in T2 will be summed.
from tbl T2
where T2.dt between T1.dt - 6 and T1.dt -- for each row in T1, give me all rows in T2 where the date is between 6 days before this T1 row's date and T1 row's date, giving us our rolling sum
-- WARNING: CHECK YOUR VERSION OF SQL FOR HOW TO SUBTRACT DAYS FROM A DATE, I'VE MADE IT (T1.dt - 6) FOR SIMPLICITY
-- we don't need a group by, because we're returning one value for each row in T1
)
from tbl T1
We have a main version of tbl, aliased T1. We then have a secondary table, aliased T2. For each row in T1, we're going to ask for a set of rows in T2 that we're going to sum before giving it to our main query.
To understand what's happening, run the code without the distinct. You'll notice that we have the same number of rows as in tbl, because the T2 statement is happening for every row in T1.
Notes:
If you have any days for which no rows exist in your table you will not get a calculation for this day. To be certain this doesn't happen, join your table to a table containing a distinct list of consecutive dates, and use this as your date column.
If you have nulls in your amount column the calculation will still work, but if the rolling window contains only nulls you will get null instead of 0 as your result. If that troubles you, convert all your nulls to zeros before (or after) you use the query.
The beginning of the period will have a 'ramp up'. But this would be the same whatever method you use to do a rolling sum. If it bothers you, don't return the first 6 days.
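The first two notes (missing days and null results) can be handled together by driving the query from a calendar of consecutive dates and wrapping the sum in coalesce. Here is a rough sketch of that idea using Python's sqlite3 module; the recursive calendar CTE, the date range and the sample rows are my own assumptions for illustration, and SQLite's date(x, '-6 days') stands in for whatever date arithmetic your database uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (dt TEXT, amt INTEGER)")
# Transactions only exist on the first three days; the rest are gaps.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("2019-10-01", 1), ("2019-10-01", 10), ("2019-10-02", 4), ("2019-10-03", 3)],
)

rolling = conn.execute(
    """
    WITH RECURSIVE calendar(dt) AS (
        -- hypothetical calendar: every day from 2019-10-01 to 2019-10-10
        SELECT '2019-10-01'
        UNION ALL
        SELECT date(dt, '+1 day') FROM calendar WHERE dt < '2019-10-10'
    )
    SELECT c.dt,
           COALESCE((SELECT SUM(t.amt)
                     FROM tbl t
                     WHERE t.dt BETWEEN date(c.dt, '-6 days') AND c.dt), 0)
               AS past_7_days_amt
    FROM calendar c
    ORDER BY c.dt
    """
).fetchall()

for dt, amt in rolling:
    print(dt, amt)
```

Because the outer query selects from the calendar rather than from tbl, every day gets a row and no distinct is needed.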
Finally a worked example if you're playing along at home using SQL Server:
with tbl as (
-- a list of transactions from 1.10.2019 to 14.10.2019
select cast('2019-10-01' as date) dt, 1 amt
union select cast('2019-10-02' as date), 4
union select cast('2019-10-01' as date), 10
union select cast('2019-10-03' as date), 3
union select cast('2019-10-04' as date), 20
union select cast('2019-10-04' as date), 2
union select cast('2019-10-04' as date), 12
union select cast('2019-10-04' as date), 17
union select cast('2019-10-05' as date), null -- a whole week of null values because we all had the week off... I hope this data wasn't important
union select cast('2019-10-06' as date), null
union select cast('2019-10-07' as date), null
union select cast('2019-10-08' as date), null
union select cast('2019-10-09' as date), null
union select cast('2019-10-10' as date), null
union select cast('2019-10-10' as date), null
union select cast('2019-10-10' as date), null
union select cast('2019-10-11' as date), null
union select cast('2019-10-12' as date), 1
union select cast('2019-10-12' as date), 1
union select cast('2019-10-12' as date), 1
union select cast('2019-10-12' as date), 1
union select cast('2019-10-12' as date), 1
union select cast('2019-10-12' as date), 1
union select cast('2019-10-13' as date), 2
union select cast('2019-10-14' as date), 1000
)
select distinct
a.dt
, (
select sum(b.amt)
from tbl b
where b.dt between dateadd(dd, -6, a.dt) and a.dt
) past_7_days_amt
from tbl a
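Returns the following. Note that union (unlike union all) removes duplicate rows, so the six identical (2019-10-12, 1) rows in the CTE are counted only once: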
+------------+-----------------+
| dt | past_7_days_amt |
+------------+-----------------+
| 2019-10-01 | 11 |
| 2019-10-02 | 15 |
| 2019-10-03 | 18 |
| 2019-10-04 | 69 |
| 2019-10-05 | 69 |
| 2019-10-06 | 69 |
| 2019-10-07 | 69 |
| 2019-10-08 | 58 |
| 2019-10-09 | 54 |
| 2019-10-10 | 51 |
| 2019-10-11 | NULL |
| 2019-10-12 | 1 |
| 2019-10-13 | 3 |
| 2019-10-14 | 1003 |
+------------+-----------------+
I've a table containing a date column.
ID | Date
----|-----------
1 | 2000-01-01
2 | 2000-02-01
3 | 2000-02-01
4 | 2000-03-01
I need a select that returns for each row, the ID, the Date and the smallest date (of all dates in the table) that is larger than the current date.
ID | Date | Next date
----+------------+------------
1 | 2000-01-01 | 2000-02-01
2 | 2000-02-01 | 2000-03-01
3 | 2000-02-01 | 2000-03-01
4 | 2000-03-01 | (NULL)
My first approach was
SELECT id, date, LEAD (date, 1) OVER (ORDER BY date NULLS LAST) AS next_date
FROM t
But this only works if the values in column DATE are unique.
Any ideas?
You could use an analytic function with a windowing clause. lead() doesn't support a windowing clause, so you need to use one that does, like min() or first_value():
FIRST_VALUE ("Date")
OVER (ORDER BY "Date" RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
The default windowing clause is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which would give all your rows the same value of 2000-01-01. Using a ROWS window would run into the same problem you're having with lead() and duplicate dates: ID 2 would still get 2000-02-01, and ID 4 would get 2000-03-01 instead of null if you used ROWS BETWEEN CURRENT ROW... rather than 1 FOLLOWING.
Demo using this range:
with t (ID, "Date") as (
select 1, date '2000-01-01' from dual
union all select 2, date '2000-02-01' from dual
union all select 3, date '2000-02-01' from dual
union all select 4, date '2000-03-01' from dual
)
select id, "Date", FIRST_VALUE ("Date") OVER (ORDER BY "Date"
RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS next_date
FROM t;
ID Date NEXT_DATE
---------- ---------- ----------
1 2000-01-01 2000-02-01
2 2000-02-01 2000-03-01
3 2000-02-01 2000-03-01
4 2000-03-01
Only rows where the date value is higher than the current row are considered. And this still only has to hit the table once.
(I've put "Date" in double-quotes because date is a reserved word; from your sample data it looks like a quoted identifier, but it isn't quoted in your query, so it's probably just got a more sensible name really...)
select t1.*, (select min(t2.date) from t t2 where t2.date > t1.date) as next_date
from t t1
The above code is for SQL Server.
To answer my own question. ;-) (Just to show another option to people stumbling across this post)
Another solution would be using a subselect:
SELECT t.id,
t.date,
(SELECT MIN (t2.date)
FROM t t2
WHERE t2.date > t.date)
AS next_date
FROM t;
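For anyone who wants to check the subselect against the sample rows, here is a small sketch with Python's sqlite3 module. SQLite is just a convenient stand-in, and I've renamed the date column to dt (my choice, to sidestep the reserved-word issue mentioned above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "2000-01-01"), (2, "2000-02-01"), (3, "2000-02-01"), (4, "2000-03-01")],
)

# For each row, look up the smallest date that is strictly greater.
# The strict '>' is what makes the duplicate dates (IDs 2 and 3) behave.
next_dates = conn.execute(
    """
    SELECT t.id,
           t.dt,
           (SELECT MIN(t2.dt) FROM t AS t2 WHERE t2.dt > t.dt) AS next_date
    FROM t
    ORDER BY t.id
    """
).fetchall()

for row in next_dates:
    print(row)
```

The last row gets NULL (None in Python) because MIN over an empty set is NULL.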
One approach would be to create a CTE containing the distinct dates and their immediate lead values. Then, join this CTE to your original table on the date to get the final result.
WITH cte AS (
SELECT t.date,
LEAD(t.date, 1) OVER (ORDER BY t.date NULLS LAST) AS next_date
FROM (SELECT DISTINCT date FROM yourTable) t
)
SELECT
t1.ID,
t1.date,
t2.next_date
FROM yourTable t1
INNER JOIN cte t2
ON t1.date = t2.date
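Here is a rough way to try the CTE idea on the sample data, using Python's sqlite3 module as a stand-in (window functions need SQLite 3.25 or later). The column is called dt here, which is my own naming rather than anything from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, "2000-01-01"), (2, "2000-02-01"), (3, "2000-02-01"), (4, "2000-03-01")],
)

# LEAD runs over the de-duplicated dates, so a duplicated date can never be
# its own "next" date; the join then fans the result back out to every row.
joined = conn.execute(
    """
    WITH cte AS (
        SELECT d.dt, LEAD(d.dt) OVER (ORDER BY d.dt) AS next_date
        FROM (SELECT DISTINCT dt FROM yourTable) AS d
    )
    SELECT t1.id, t1.dt, cte.next_date
    FROM yourTable AS t1
    JOIN cte ON cte.dt = t1.dt
    ORDER BY t1.id
    """
).fetchall()

for row in joined:
    print(row)
```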
Here is another approach, without "distinct":
select
ted.id,
ted.date_col,
(select
min(ted2.date_col)
from
test_date_v ted2
where
ted2.id != ted.id and
ted2.date_col > ted.date_col) next_date_col
from
test_date_v ted;
I'm using SQL Server 2005. From the tbl_temp table below, I would like to add an EndDate column based on the next row's StartDate minus 1 day, until there's a change in the AID and UID combination. This calculated EndDate will go to the row above it as the EndDate. The last row of each AID and UID group will get the system date as its EndDate. The table has to be ordered by AID, UID, StartDate. Thanks for the help.
-- tbl_temp
AID UID StartDate
1 1 2013-02-20
2 1 2013-02-06
1 1 2013-02-21
1 1 2013-02-27
1 2 2013-02-02
1 2 2013-02-04
-- Result needed
AID UID StartDate EndDate
1 1 2013-02-20 2013-02-20
1 1 2013-02-21 2013-02-26
1 1 2013-02-27 sysdate
1 2 2013-02-02 2013-02-03
1 2 2013-02-04 sysdate
2 1 2013-02-06 sysdate
The easiest way to do this is with a correlated subquery:
select t.*,
(select top 1 dateadd(day, -1, startDate )
from tbl_temp t2
where t2.aid = t.aid and
t2.uid = t.uid and
t2.startdate > t.startdate
order by t2.startdate
) as endDate
from tbl_temp t
To fall back to the current date when there is no later row, use isnull():
select t.*,
isnull((select top 1 dateadd(day, -1, startDate )
from tbl_temp t2
where t2.aid = t.aid and
t2.uid = t.uid and
t2.startdate > t.startdate
order by t2.startdate
), getdate()
) as endDate
from tbl_temp t
Normally, I would recommend coalesce() over isnull(). However, there is a bug in some versions of SQL Server where it evaluates the first argument twice. Normally, this doesn't make a difference, but with a subquery it does.
And finally, the use of sysdate makes me think of Oracle. The same approach will work there too.
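If it helps to see the logic run end to end, here is a minimal sketch of the same correlated-subquery idea in Python's sqlite3 module. It is not SQL Server syntax: MIN() replaces TOP 1 ... ORDER BY, SQLite's date(x, '-1 day') replaces dateadd(), and TODAY is a hypothetical fixed value standing in for getdate() so the result is reproducible.

```python
import sqlite3

TODAY = "2013-03-01"  # hypothetical stand-in for getdate() / sysdate

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_temp (AID INTEGER, UID INTEGER, StartDate TEXT)")
conn.executemany(
    "INSERT INTO tbl_temp VALUES (?, ?, ?)",
    [
        (1, 1, "2013-02-20"), (2, 1, "2013-02-06"), (1, 1, "2013-02-21"),
        (1, 1, "2013-02-27"), (1, 2, "2013-02-02"), (1, 2, "2013-02-04"),
    ],
)

# Next StartDate within the same (AID, UID) group, minus one day.
# date(NULL, '-1 day') is NULL, so COALESCE supplies TODAY for the last row.
end_dates = conn.execute(
    """
    SELECT t.AID, t.UID, t.StartDate,
           COALESCE(
               date((SELECT MIN(t2.StartDate)
                     FROM tbl_temp AS t2
                     WHERE t2.AID = t.AID
                       AND t2.UID = t.UID
                       AND t2.StartDate > t.StartDate),
                    '-1 day'),
               ?
           ) AS EndDate
    FROM tbl_temp AS t
    ORDER BY t.AID, t.UID, t.StartDate
    """,
    (TODAY,),
).fetchall()

for row in end_dates:
    print(row)
```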
;WITH x AS
(
SELECT AID, UID, StartDate,
ROW_NUMBER() OVER(PARTITION BY AID, UID ORDER BY StartDate) AS rn
FROM tbl_temp
)
SELECT x1.AID, x1.UID, x1.StartDate,
COALESCE(DATEADD(day,-1,x2.StartDate), CAST(getdate() AS date)) AS EndDate
FROM x x1
LEFT OUTER JOIN x x2 ON x2.AID = x1.AID AND x2.UID = x1.UID
AND x2.rn = x1.rn + 1
ORDER BY x1.AID, x1.UID, x1.StartDate
This is MyTable structure
MyTable : ID, Date
Select ID, Date from MyTable
ID Date
50 2013-01-01 00:00:00.000
51 2013-01-02 00:00:00.000
52 2013-01-02 00:00:00.000
I need to get the result like this.
ID Date
50 2013-01-01 00:00:00.000
50 2013-01-02 00:00:00.000
50 2013-01-03 00:00:00.000
51 2013-01-02 00:00:00.000
51 2013-01-03 00:00:00.000
52 2013-01-03 00:00:00.000
How do i get the results ?
It seems like you want to get a list of dates. If so, you can use a recursive CTE to generate it:
;with data(id, date) as
(
select id, date
from mytable
union all
select id, dateadd(day, 1, date)
from data
where dateadd(day, 1, date) <= '1/3/2013'
)
select *
from data
order by id
This will generate the list of dates for each ID between the date stored in your table and the end date that you provide. In my example the end date is 1/3/2013 (January 3, 2013).
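For a version you can run without SQL Server, here is a small sketch of the same recursive expansion using Python's sqlite3 module. The column is named dt and the end date is passed as a parameter; both are my own choices for the example, and SQLite's date(x, '+1 day') stands in for dateadd().

```python
import sqlite3

END_DATE = "2013-01-03"  # the end date you provide

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(50, "2013-01-01"), (51, "2013-01-02"), (52, "2013-01-02")],
)

# Anchor member: each ID's own date. Recursive member: add one day at a
# time until the next day would pass END_DATE.
expanded = conn.execute(
    """
    WITH RECURSIVE data(id, dt) AS (
        SELECT id, dt FROM mytable
        UNION ALL
        SELECT id, date(dt, '+1 day')
        FROM data
        WHERE date(dt, '+1 day') <= ?
    )
    SELECT id, dt FROM data ORDER BY id, dt
    """,
    (END_DATE,),
).fetchall()

for row in expanded:
    print(row)
```

Each ID keeps its own starting date, so ID 52 produces both 2013-01-02 and 2013-01-03 here.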
I think this will work a bit faster ONLY if there is a CLUSTERED INDEX on the ID column, else bluefeet's solution is fantastic!
SELECT T1.ID
,Date = DATEADD(DAY, -ROW_NUMBER() OVER (PARTITION BY T1.ID ORDER BY (SELECT NULL))+1, '2013-01-03')
FROM MyTable T1
JOIN MyTable T2 ON T1.ID <= T2.ID
I have a table with many IDs and many dates associated with each ID, and even a few IDs with no date. For each ID and date combination, I want to select the ID, date, and the next largest date also associated with that same ID, or null as next date if none exists.
Sample Table:
ID Date
1 5/1/10
1 6/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10
4 4/15/10
4
Desired Output:
ID Date Next_Date
1 5/1/10 6/1/10
1 6/1/10 7/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10 4/15/10
4 4/15/10
SELECT
mytable.id,
mytable.date,
(
SELECT
MIN(mytablemin.date)
FROM mytable AS mytablemin
WHERE mytablemin.date > mytable.date
AND mytable.id = mytablemin.id
) AS NextDate
FROM mytable
This has been tested on SQL Server 2008 R2 (but it should work on other DBMSs) and produces the following output:
id date NextDate
----------- ----------------------- -----------------------
1 2010-05-01 00:00:00.000 2010-06-01 00:00:00.000
1 2010-06-01 00:00:00.000 2010-07-01 00:00:00.000
1 2010-07-01 00:00:00.000 NULL
2 2010-06-15 00:00:00.000 NULL
3 2010-08-15 00:00:00.000 NULL
3 2010-08-15 00:00:00.000 NULL
4 2010-04-01 00:00:00.000 2010-04-15 00:00:00.000
4 2010-04-15 00:00:00.000 NULL
4 NULL NULL
Update 1:
For those that are interested, I've compared the performance of the two variants in SQL Server 2008 R2 (one uses MIN aggregate and the other uses TOP 1 with an ORDER BY):
Without an index on the date column, the MIN version had a cost of 0.0187916 and the TOP/ORDER BY version had a cost of 0.115073 so the MIN version was "better".
With an index on the date column, they performed identically.
Note that this was testing with just these 9 records so the results could be (very) spurious...
Update 2:
The results hold for 10,000 uniformly distributed random records. The TOP/ORDER BY query takes so long to run at 100,000 records I had to cancel it and give up.
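If you want to sanity-check the MIN variant against the sample table from the question, here is a minimal sketch using Python's sqlite3 module. SQLite is a stand-in for SQL Server here, the date column is called dt (my naming), and the dates are rewritten as ISO text so they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "2010-05-01"), (1, "2010-06-01"), (1, "2010-07-01"),
        (2, "2010-06-15"),
        (3, "2010-08-15"), (3, "2010-08-15"),
        (4, "2010-04-01"), (4, "2010-04-15"), (4, None),
    ],
)

# The id = id condition keeps the lookup inside the same ID; without it,
# the next date would be taken from the whole table.
per_id_next = conn.execute(
    """
    SELECT mytable.id,
           mytable.dt,
           (SELECT MIN(mytablemin.dt)
            FROM mytable AS mytablemin
            WHERE mytablemin.dt > mytable.dt
              AND mytablemin.id = mytable.id) AS NextDate
    FROM mytable
    ORDER BY mytable.rowid  -- keep insertion order for readability
    """
).fetchall()

for row in per_id_next:
    print(row)
```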
If your database is Oracle, you can use the lead() and lag() analytic functions:
SELECT id, date,
LEAD(date) OVER (PARTITION BY id ORDER BY date NULLS LAST) AS next_date
FROM your_table
ORDER BY id;
SELECT
id,
date,
( SELECT t1.date
FROM mytable t1
WHERE t1.id = t2.id
AND t1.date > t2.date
ORDER BY t1.date LIMIT 1 ) AS next_date
FROM mytable t2
I think a self JOIN would be faster than a subselect.
WITH dates AS (
SELECT 1 AS ID, '2010-05-01' AS Date
UNION ALL SELECT 1, '2010-06-01'
UNION ALL SELECT 1, '2010-07-01'
UNION ALL SELECT 2, '2010-06-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 4, '2010-04-01'
UNION ALL SELECT 4, '2010-04-15'
UNION ALL SELECT 4, ''
)
SELECT
dates.ID,
dates.Date,
nextDates.Date AS Next_Date
FROM
dates
LEFT JOIN
dates nextDates
ON nextDates.ID = dates.ID
AND nextDates.Date > dates.Date
LEFT JOIN
dates noLower
ON noLower.ID = nextDates.ID
AND noLower.Date < nextDates.Date
AND noLower.Date > dates.Date
WHERE
dates.Date > 0
AND noLower.ID IS NULL
https://www.db-fiddle.com/f/4sWRLt2hxjik5HqiJ21ez8/1
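The double self-join above can also be tried locally. The following is only a sketch using Python's sqlite3 module: the table and alias names (dates, d, nxt, mid) are mine, the missing date is stored as NULL instead of an empty string, and the filter is written as IS NOT NULL rather than Date > 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (id INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO dates VALUES (?, ?)",
    [
        (1, "2010-05-01"), (1, "2010-06-01"), (1, "2010-07-01"),
        (2, "2010-06-15"),
        (3, "2010-08-15"), (3, "2010-08-15"),
        (4, "2010-04-01"), (4, "2010-04-15"), (4, None),
    ],
)

# nxt: every later date for the same ID.
# mid: any date strictly between the current one and nxt.
# Keeping only rows where no such mid exists leaves the nearest later date.
anti_join = conn.execute(
    """
    SELECT d.id, d.dt, nxt.dt AS next_date
    FROM dates AS d
    LEFT JOIN dates AS nxt
           ON nxt.id = d.id AND nxt.dt > d.dt
    LEFT JOIN dates AS mid
           ON mid.id = d.id AND mid.dt > d.dt AND mid.dt < nxt.dt
    WHERE d.dt IS NOT NULL
      AND mid.id IS NULL
    ORDER BY d.id, d.dt
    """
).fetchall()

for row in anti_join:
    print(row)
```

Whether this beats the subselect will depend on your indexes and data volume, so it is worth comparing query plans on real data.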