I have two date columns Date A and Date B.
I need to select the greater (most recent) of Date A + 42 Days and Date B.
What is the best way to approach this?
You can use a simple CASE statement:
SELECT A, B, CASE WHEN DATEADD(DAY, 42, A) > B THEN DATEADD(DAY, 42, A) ELSE B END AS A42ORB
FROM t
There are other ways, depending on the SQL Server version, for example:
SELECT A, B, CA.C
FROM t
CROSS APPLY (
SELECT MAX(V) AS C
FROM (VALUES
(DATEADD(DAY, 42, A)),
(B)
) AS VA(V)
) AS CA
Or:
SELECT A, B, CASE WHEN C > B THEN C ELSE B END
FROM t
CROSS APPLY (SELECT DATEADD(DAY, 42, A)) AS CA(C)
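For anyone who wants to try the CASE approach without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration: the table t and its sample rows are made up, and SQLite's date(A, '+42 days') stands in for DATEADD(DAY, 42, A).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, B TEXT)")  # ISO dates stored as text (assumption)
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2024-01-01", "2024-03-01"),  # B is later than A + 42 days
        ("2024-01-01", "2024-01-15"),  # A + 42 days is later than B
    ],
)

# Same shape as the CASE expression above, with SQLite date arithmetic.
rows = conn.execute(
    """
    SELECT A, B,
           CASE WHEN date(A, '+42 days') > B
                THEN date(A, '+42 days')
                ELSE B
           END AS A42ORB
    FROM t
    ORDER BY B DESC
    """
).fetchall()

for row in rows:
    print(row)
```

ISO-formatted date strings compare correctly as text, which is why the plain > comparison works in this sketch.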
For the latest date use MAX(); to find A + 42 days use DATEADD(). If you can give us the table structure and expected result, we can help you better.
Here is an example:
SELECT MAX([YourDateColumn])
FROM YourTable
WHERE [YourDateColumn] BETWEEN B AND DATEADD(DAY,42,A)
I'd give this a go, hope it helps!
SELECT
MAX([a])
FROM
(SELECT DATEADD(DD,42,SoCreateDate) [a]
FROM UNIQUESOID
UNION ALL
SELECT SOSUBMISSIONDATE
FROM UNIQUESOID
) [x]
To explain what this script does: it builds a combined set of data (using UNION ALL) so that the two dates are in the same column, then uses SELECT MAX(...) to pick the highest value from that combined set.
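As a rough check of what the UNION ALL approach returns, here is a small sketch with Python's sqlite3 module. The two sample rows are invented, and date(..., '+42 days') is SQLite's stand-in for DATEADD. Note that, written this way, the query yields one overall maximum for the whole table rather than one value per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UNIQUESOID (SoCreateDate TEXT, SOSUBMISSIONDATE TEXT)")
conn.executemany(
    "INSERT INTO UNIQUESOID VALUES (?, ?)",
    [("2024-01-01", "2024-03-01"), ("2024-02-01", "2024-02-10")],  # made-up rows
)

# Stack both date expressions into one column, then take the maximum.
overall = conn.execute(
    """
    SELECT MAX(a) FROM (
        SELECT date(SoCreateDate, '+42 days') AS a FROM UNIQUESOID
        UNION ALL
        SELECT SOSUBMISSIONDATE FROM UNIQUESOID
    ) AS x
    """
).fetchone()[0]

print(overall)  # a single value for the whole table
```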
I'm not sure your question was well understood by the others, so let me propose this solution:
SELECT
IIF(DATEADD(DAY,42,t0.A) > t0.B, DATEADD(DAY,42,t0.A), t0.B) AS MaxDate
FROM
YourTable AS t0
WARNING: If your dates can be NULL, you have to handle that case in your IIF.
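To make the NULL warning concrete, here is a hedged sketch in Python's sqlite3 module, with CASE standing in for IIF and date(A, '+42 days') for DATEADD. The policy chosen here (when one side is NULL, the other side wins) is an assumption; the right behaviour depends on your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (A TEXT, B TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("2024-01-01", None), (None, "2024-01-15"), (None, None)],
)

# Unguarded: a comparison with NULL is unknown, so the ELSE branch is taken.
naive = [r[0] for r in conn.execute(
    """
    SELECT CASE WHEN date(A, '+42 days') > B
                THEN date(A, '+42 days') ELSE B END
    FROM YourTable ORDER BY rowid
    """
)]

# Guarded: fall back to A + 42 days when B is NULL (assumed policy).
guarded = [r[0] for r in conn.execute(
    """
    SELECT CASE WHEN B IS NULL OR date(A, '+42 days') > B
                THEN date(A, '+42 days') ELSE B END
    FROM YourTable ORDER BY rowid
    """
)]

print(naive)
print(guarded)
```

In the unguarded version the first row loses its only known date, which is the situation the warning is about.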
Does a shorthand exist that allows you to compare multiple columns against the same condition in the WHERE clause?
SELECT *
FROM [Table]
WHERE [Date1] BETWEEN x AND y
OR [Date2] BETWEEN x AND y
OR [Date3] BETWEEN x AND y
OR [Date4] BETWEEN x AND y
It's not the end of the world to copy and paste this condition and replace [Date x] with each column, but it sure isn't fun.
You can also write the query like this (in SQL Server 2008 or later):
SELECT * FROM [Table]
WHERE EXISTS (
SELECT *
FROM (VALUES (Date1),(Date2),(Date3),(Date4)) v (TheDate)
WHERE TheDate BETWEEN x AND y
)
However, I don't see any benefit in doing so (in terms of performance or readability).
Of course, things would be different if you need to write Date1=x OR Date2=x OR Date3=x OR Date4=x, because in this case you can simply write x IN (Date1, Date2, Date3, Date4).
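The IN shorthand for the equality case can be tried out with Python's sqlite3 module. This is just a sketch: the table T, its Id column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (Id INTEGER, Date1 TEXT, Date2 TEXT, Date3 TEXT, Date4 TEXT)"
)
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01", "2024-02-01", "2024-03-01", "2024-04-01"),
        (2, "2024-05-01", "2024-06-01", "2024-07-01", "2024-08-01"),
    ],
)

x = "2024-03-01"
# One value tested against several columns at once.
ids = [r[0] for r in conn.execute(
    "SELECT Id FROM T WHERE ? IN (Date1, Date2, Date3, Date4) ORDER BY Id", (x,)
)]
print(ids)
```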
You could use cross apply and values, but the result is even more cumbersome than the code you have right now:
SELECT *
FROM [Table]
CROSS APPLY
(
SELECT MIN([Date]) As MinDate,
MAX([Date]) As MaxDate
FROM (VALUES ([Date1]), ([Date2]), ([Date3]), ([Date4])) VALS([Date])
) AS D -- SQL Server requires an alias on the applied subquery
WHERE MinDate <= y
AND MaxDate >= x
AND x <= y
That being said, I agree with Sean Lange's comment: it seems the table structure is ill-designed, and all these date values should be in a different table, referenced by this table in a one-to-many relationship.
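A minimal sketch of the one-to-many design suggested above, using Python's sqlite3 module. The table and column names (Item, ItemDate, TheDate) are hypothetical. With the dates in their own table, the range condition is written only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Item (Id INTEGER PRIMARY KEY);
    CREATE TABLE ItemDate (
        ItemId  INTEGER REFERENCES Item(Id),
        TheDate TEXT
    );
    INSERT INTO Item VALUES (1), (2);
    INSERT INTO ItemDate VALUES
        (1, '2024-01-01'), (1, '2024-01-10'),
        (2, '2024-01-05'), (2, '2024-02-01');
    """
)

x, y = "2024-01-04", "2024-01-06"
# Items having at least one date inside [x, y].
matching = [r[0] for r in conn.execute(
    """
    SELECT DISTINCT i.Id
    FROM Item AS i
    JOIN ItemDate AS d ON d.ItemId = i.Id
    WHERE d.TheDate BETWEEN ? AND ?
    ORDER BY i.Id
    """,
    (x, y),
)]
print(matching)
```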
I'm building a quick csv from a mysql table with a query like:
select DATE(date),count(date) from table group by DATE(date) order by date asc;
and just dumping them to a file in Perl with a loop:
while(my($date,$sum) = $sth->fetchrow) {
print CSV "$date,$sum\n"
}
There are date gaps in the data, though:
| 2008-08-05 | 4 |
| 2008-08-07 | 23 |
I would like to pad the data to fill in the missing days with zero-count entries to end up with:
| 2008-08-05 | 4 |
| 2008-08-06 | 0 |
| 2008-08-07 | 23 |
I slapped together a really awkward (and almost certainly buggy) workaround with an array of days-per-month and some math, but there has to be something more straightforward either on the mysql or perl side.
Any genius ideas, or slaps in the face for why I am being so dumb?
I ended up going with a stored procedure which generated a temp table for the date range in question for a couple of reasons:
I know the date range I'll be looking for every time
The server in question unfortunately was not one that I could install Perl modules on at the moment, and it was decrepit enough that it didn't have anything remotely Date::-y installed
The Perl Date/DateTime-iterating answers were also very good; I wish I could select multiple answers!
When you need something like that on server side, you usually create a table which contains all possible dates between two points in time, and then left join this table with query results. Something like this:
delimiter //
create procedure sp1(d1 date, d2 date)
begin
    declare d date;
    create temporary table foo (d date not null);
    -- fill foo with one row per day from d1 to d2
    set d = d1;
    while d <= d2 do
        insert into foo (d) values (d);
        set d = date_add(d, interval 1 day);
    end while;
    -- `table` is the placeholder name from the question
    select foo.d, count(`table`.date)
    from foo left join `table` on foo.d = date(`table`.date)
    group by foo.d order by foo.d asc;
    drop temporary table foo;
end //
delimiter ;
In this particular case it would be better to put a little check on the client side: if the current date is not the previous date + 1, emit the missing rows.
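The client-side check can be sketched as follows. This is a Python illustration rather than the Perl used in the question, and the function name pad_missing_days is made up; it assumes the rows arrive sorted by date, as they do with the ORDER BY in the original query.

```python
from datetime import date, timedelta


def pad_missing_days(rows):
    """Yield (day, count) pairs, inserting (day, 0) for every skipped day.

    rows must be sorted by day in ascending order.
    """
    previous = None
    for day, count in rows:
        if previous is not None:
            gap = previous + timedelta(days=1)
            # Emit zero-count rows until we reach the current row's date.
            while gap < day:
                yield gap, 0
                gap += timedelta(days=1)
        yield day, count
        previous = day


rows = [(date(2008, 8, 5), 4), (date(2008, 8, 7), 23)]
padded = list(pad_missing_days(rows))
for day, count in padded:
    print(f"{day.isoformat()},{count}")
```

Because it is a generator, it never needs to hold more than one row in memory.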
When I had to deal with this problem, to fill in missing dates I actually created a reference table that just contained all dates I'm interested in and joined the data table on the date field. It's crude, but it works.
SELECT DATE(r.date),count(d.date)
FROM dates AS r
LEFT JOIN table AS d ON d.date = r.date
GROUP BY DATE(r.date)
ORDER BY r.date ASC;
As for output, I'd just use SELECT INTO OUTFILE instead of generating the CSV by hand. Leaves us free from worrying about escaping special characters as well.
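The reference-table join from this answer can be tried with Python's sqlite3 module. This is a sketch under assumptions: the names dates.day and data.created are hypothetical stand-ins for the tables in the answer, and the sample rows mirror the counts from the question.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (day TEXT PRIMARY KEY)")  # reference table
conn.execute("CREATE TABLE data (created TEXT)")           # the real data

# Fill the reference table with every day of interest.
start = date(2008, 8, 5)
conn.executemany(
    "INSERT INTO dates VALUES (?)",
    [((start + timedelta(days=i)).isoformat(),) for i in range(3)],
)
conn.executemany(
    "INSERT INTO data VALUES (?)",
    [("2008-08-05 09:00:00",)] * 4 + [("2008-08-07 12:30:00",)] * 23,
)

# LEFT JOIN keeps days with no data; COUNT over the joined column gives 0.
counts = conn.execute(
    """
    SELECT r.day, COUNT(d.created)
    FROM dates AS r
    LEFT JOIN data AS d ON date(d.created) = r.day
    GROUP BY r.day
    ORDER BY r.day ASC
    """
).fetchall()
print(counts)
```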
You're not dumb; inserting the empty date values isn't something that MySQL does. I do this in Perl with a two-step process. First, load all of the data from the query into a hash organised by date. Then, I create a Date::EzDate object and increment it by day, so...
my $current_date = Date::EzDate->new();
$current_date->{'default'} = '{YEAR}-{MONTH NUMBER BASE 1}-{DAY OF MONTH}';
while ($current_date <= $final_date)
{
print "$current_date\t|\t$hash_o_data{$current_date}\n"; # EzDate provides for automatic stringification in the format specified in 'default'
$current_date++;
}
where $final_date is another EzDate object or a string containing the end of your date range.
EzDate isn't on CPAN right now, but you can probably find another perl mod that will do date compares and provide a date incrementor.
You could use a DateTime object:
use DateTime;
my $dt;
while ( my ($date, $sum) = $sth->fetchrow ) {
if (defined $dt) {
print CSV $dt->ymd . ",0\n" while $dt->add(days => 1)->ymd lt $date;
}
else {
my ($y, $m, $d) = split /-/, $date;
$dt = DateTime->new(year => $y, month => $m, day => $d);
}
print CSV "$date,$sum\n";
}
What the above code does is keep the last printed date stored in a
DateTime object $dt; when the current date is more than one day
in the future, it increments $dt by one day (printing a zero-count line to
CSV each time) until it is the same as the current date.
This way you don't need extra tables, and don't need to fetch all your
rows in advance.
I hope you will figure out the rest.
select * from (
select date_add('2003-01-01 00:00:00.000', INTERVAL n5.num*10000+n4.num*1000+n3.num*100+n2.num*10+n1.num DAY ) as date from
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n1,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n2,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n3,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n4,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n5
) a
where date >'2011-01-02 00:00:00.000' and date < NOW()
order by date
With
select n3.num*100+n2.num*10+n1.num as date
you get a column of numbers from 0 to max(n3)*100+max(n2)*10+max(n1).
For example, if n3 only ran from 0 to 3, the SELECT would return the numbers 0 to 399, i.e. 400 records (dates in the calendar). The full query above uses five digit tables of 0 to 9, so it covers offsets of 0 to 99999 days from the start date.
You can tune your dynamic calendar by limiting it, for example to the range from the min(date) you have up to now().
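To see how the digit tables combine into a run of dates, here is a reduced sketch (three digit positions instead of five) using Python's sqlite3 module. The digits CTE and SQLite's date() modifier are substitutions for the UNION ALL subqueries and MySQL's date_add.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three copies of the digits 0-9 cross-joined give the numbers 0..999,
# each of which becomes a day offset from the start date.
sql = """
WITH digits(num) AS (
    VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
)
SELECT date('2003-01-01',
            '+' || (n3.num * 100 + n2.num * 10 + n1.num) || ' days') AS day
FROM digits AS n1, digits AS n2, digits AS n3
ORDER BY day
"""
days = [r[0] for r in conn.execute(sql)]

print(len(days), days[0], days[-1])
```

Each extra digit table multiplies the covered range by ten, which is why the full query with five of them reaches far beyond any realistic reporting window.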
Since you don't know where the gaps are, and yet you want all the values (presumably) from the first date in your list to the last one, do something like:
use DateTime;
use DateTime::Format::Strptime qw(strptime);
my @row = $sth->fetchrow;
my $countdate = strptime("%Y-%m-%d", $row[0]);
my $thisdate = strptime("%Y-%m-%d", $row[0]);
while (@row) {
    # keep looping countdate until it hits the next db row date
    if (DateTime->compare($countdate, $thisdate) == -1) {
        # counter not reached next date yet
        print CSV $countdate->ymd . ",0\n";
        $countdate->add( days => 1 );
        next;
    }
    # countdate is equal to next row's date, so print that instead
    print CSV $thisdate->ymd . ",$row[1]\n";
    # fetch the next row; stop when the result set is exhausted
    @row = $sth->fetchrow;
    last unless @row;
    # increase both
    $thisdate = strptime("%Y-%m-%d", $row[0]);
    $countdate->add( days => 1 );
}
Hmm, that turned out to be more complicated than I thought it would be.. I hope it makes sense!
I think the simplest general solution to the problem would be to create an Ordinal table with the highest number of rows that you need (in your case 31*3 = 93).
CREATE TABLE IF NOT EXISTS `Ordinal` (
`n` int(10) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (`n`)
);
INSERT INTO `Ordinal` (`n`)
VALUES (NULL), (NULL), (NULL); #etc
Next, do a LEFT JOIN from Ordinal onto your data. Here's a simple case, getting every day in the last week:
SELECT CURDATE() - INTERVAL `n` DAY AS `day`
FROM `Ordinal` WHERE `n` <= 7
ORDER BY `n` ASC
The two things you would need to change about this are the starting point and the interval. I have used SET @var = 'value' syntax for clarity.
SET @end = CURDATE() - INTERVAL DAY(CURDATE()) DAY;
SET @begin = @end - INTERVAL 3 MONTH;
SET @period = DATEDIFF(@end, @begin);
SELECT @begin + INTERVAL (`n` + 1) DAY AS `date`
FROM `Ordinal` WHERE `n` < @period
ORDER BY `n` ASC;
So the final code would look something like this, if you were joining to get the number of messages per day over the last three months:
SELECT COUNT(`msg`.`id`) AS `message_count`, `ord`.`date` FROM (
SELECT ((CURDATE() - INTERVAL DAY(CURDATE()) DAY) - INTERVAL 3 MONTH) + INTERVAL (`n` + 1) DAY AS `date`
FROM `Ordinal`
WHERE `n` < (DATEDIFF((CURDATE() - INTERVAL DAY(CURDATE()) DAY), ((CURDATE() - INTERVAL DAY(CURDATE()) DAY) - INTERVAL 3 MONTH)))
ORDER BY `n` ASC
) AS `ord`
LEFT JOIN `Message` AS `msg`
ON `ord`.`date` = `msg`.`date`
GROUP BY `ord`.`date`
Tips and Comments:
Probably the hardest part of your query was determining the number of days to use when limiting Ordinal. By comparison, transforming that integer sequence into dates was easy.
You can use Ordinal for all of your uninterrupted-sequence needs. Just make sure it contains more rows than your longest sequence.
You can use multiple queries on Ordinal for multiple sequences, for example listing every weekday (1-5) for the past seven (1-7) weeks.
You could make it faster by storing dates in your Ordinal table, but it would be less flexible. This way you only need one Ordinal table, no matter how many times you use it. Still, if the speed is worth it, try the INSERT INTO ... SELECT syntax.
Use a Perl module to do the date calculations, such as the recommended DateTime, or Time::Piece (core since 5.10). Just increment the date and print the date with a 0 until it matches the current row's date.
I don't know if this would work, but how about if you created a new table which contained all the possible dates (that might be the problem with this idea, if the range of dates is going to change unpredictably...) and then do a left join on the two tables? I guess it's a crazy solution if there are a vast number of possible dates, or no way to predict the first and last date, but if the range of dates is either fixed or easy to work out, then this might work.
Suppose you have a table T(A) with only positive integers allowed, like:
1,1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18
In the above example, the result is 10. We always can use ORDER BY and DISTINCT to sort and remove duplicates. However, to find the lowest integer not in the list, I came up with the following SQL query:
select list.x + 1
from (select x from (select distinct a as x from T order by a)) as list, T
where list.x + 1 not in T limit 1;
My idea is to start a counter at 1 and check whether that counter is in the list: if it is not, return it; otherwise increment and look again. However, I have to start that counter at 1, and then increment. The query works in most cases, but there are some corner cases, such as a missing 1. How can I accomplish that in SQL, or should I go in a completely different direction to solve this problem?
Because SQL works on sets, the intermediate SELECT DISTINCT a AS x FROM t ORDER BY a is redundant.
The basic technique of looking for a gap in a column of integers is to find where the current entry plus 1 does not exist. This requires a self-join of some sort.
Your query is not far off, but I think it can be simplified to:
SELECT MIN(a) + 1
FROM t
WHERE a + 1 NOT IN (SELECT a FROM t)
The NOT IN acts as a sort of self-join. This won't produce anything from an empty table, but should be OK otherwise.
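Here is a small sketch that runs this query through Python's sqlite3 module against the sample list from the question and two edge cases. The helper name lowest_missing is made up for the illustration.

```python
import sqlite3


def lowest_missing(values):
    """Run the MIN(a) + 1 query against a throwaway table holding values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    return conn.execute(
        "SELECT MIN(a) + 1 FROM t WHERE a + 1 NOT IN (SELECT a FROM t)"
    ).fetchone()[0]


sample = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(lowest_missing(sample))  # gap after 9
print(lowest_missing([2, 3]))  # a missing 1 is not detected
print(lowest_missing([]))      # empty table yields NULL (None in Python)
```

The second call illustrates the corner case from the question: because the query only looks upward from existing values, a missing 1 goes unnoticed.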
SQL Fiddle
select min(y.a) as a
from
t x
right join
(
select a + 1 as a from t
union
select 1
) y on y.a = x.a
where x.a is null
It will work even in an empty table
SELECT min(t.a) - 1
FROM t
LEFT JOIN t t1 ON t1.a = t.a - 1
WHERE t1.a IS NULL
AND t.a > 1; -- exclude 0
This finds the smallest number greater than 1, where the next-smaller number is not in the same table. That missing number is returned.
This works even for a missing 1. There are multiple answers checking in the opposite direction. All of them would fail with a missing 1.
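A quick sketch of this backward-looking query, run through Python's sqlite3 module (the helper name is hypothetical). It also shows one case worth knowing about: when the table has no gap at all, the query returns NULL, so a caller would have to fall back to max + 1.

```python
import sqlite3


def lowest_missing_backward(values):
    """Find the smallest a > 1 whose predecessor is absent; return a - 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    return conn.execute(
        """
        SELECT min(t.a) - 1
        FROM t
        LEFT JOIN t t1 ON t1.a = t.a - 1
        WHERE t1.a IS NULL
          AND t.a > 1
        """
    ).fetchone()[0]


sample = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(lowest_missing_backward(sample))     # gap before 11
print(lowest_missing_backward([2, 3]))     # missing 1 is found
print(lowest_missing_backward([1, 2, 3]))  # no gap: NULL (None in Python)
```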
SQL Fiddle.
You can do the following, although you may also want to define a range - in which case you might need a couple of UNIONs
SELECT x.id+1
FROM my_table x
LEFT
JOIN my_table y
ON x.id+1 = y.id
WHERE y.id IS NULL
ORDER
BY x.id LIMIT 1;
You can always create a table with all of the numbers from 1 to X and then join that table with the table you are comparing. Then just find the TOP value in your SELECT statement that isn't present in the table you are comparing
SELECT TOP 1 table_with_all_numbers.number, table_with_missing_numbers.number
FROM table_with_all_numbers
LEFT JOIN table_with_missing_numbers
ON table_with_missing_numbers.number = table_with_all_numbers.number
WHERE table_with_missing_numbers.number IS NULL
ORDER BY table_with_all_numbers.number ASC;
In SQLite 3.8.3 or later, you can use a recursive common table expression to create a counter.
Here, we stop counting when we find a value not in the table:
WITH RECURSIVE counter(c) AS (
SELECT 1
UNION ALL
SELECT c + 1 FROM counter WHERE c IN t)
SELECT max(c) FROM counter;
(This works for an empty table or a missing 1.)
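Since Python ships with SQLite, the recursive counter is easy to try. The sketch below wraps the query in a hypothetical helper and runs it on the sample list plus the edge cases mentioned; it assumes the bundled SQLite is 3.8.3 or later.

```python
import sqlite3


def lowest_missing_cte(values):
    """Count upward from 1 for as long as the counter value exists in t."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    return conn.execute(
        """
        WITH RECURSIVE counter(c) AS (
            SELECT 1
            UNION ALL
            SELECT c + 1 FROM counter WHERE c IN t)
        SELECT max(c) FROM counter
        """
    ).fetchone()[0]


sample = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
for values in (sample, [2, 3], [], [1, 2, 3]):
    print(lowest_missing_cte(values))
```

The last value the counter reaches is the first one not found in the table, so max(c) is the answer in every case, including a table with no gaps.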
This query ranks (starting from rank 1) each distinct number in ascending order and selects the lowest rank that's less than its number. If no rank is lower than its number (i.e. there are no gaps in the table) the query returns the max number + 1.
select coalesce(min(number),1) from (
select min(cnt) number
from (
select
number,
(select count(*) from (select distinct number from numbers) b where b.number <= a.number) as cnt
from (select distinct number from numbers) a
) t1 where number > cnt
union
select max(number) + 1 number from numbers
) t1
http://sqlfiddle.com/#!7/720cc/3
Just another method, using EXCEPT this time:
SELECT a + 1 AS missing FROM T
EXCEPT
SELECT a FROM T
ORDER BY missing
LIMIT 1;
Can I write something like the query below? It is not giving proper output in WinSQL/Teradata:
with
a (x) as ( select 1 ),
b (y) as ( select * from a )
select * from b
Do you really need to use CTEs for this particular solution when derived tables would work as well:
SELECT B.*
FROM (SELECT A.*
FROM (SELECT 1 AS Col1) A
) B;
That being said, I believe multiple CTEs are available in Teradata 14.10 or 15. I believe support for a single CTE and the WITH clause were introduced in Teradata 12 or 13.
Define the dependent CTE first and then the parent, like this, and it will work. Why is it like that? Teradata likes people to play with it longer and spend more time with it, making it feel important.
with
"b" (y) as ( select * from "a" ),
"a" (x) as ( select '1' )
select * from b
Given a table in SQL-Server like:
Id INTEGER
A VARCHAR(50)
B VARCHAR(50)
-- Some other columns
with no index on A or B, I wish to find rows where a unique combination of A and B occurs more than once.
I'm using the query
SELECT A+B, Count(A+B) FROM MyTable
GROUP BY A+B
HAVING COUNT(A+B) > 1
First Question
Is there a more time-efficient way to do this? (I cannot add indices to the database)
Second Question
When I attempt to gain some formatting of the output by including a , in the concatenation:
SELECT A+','+B, Count(A+','+B) FROM MyTable
GROUP BY A+','+B
HAVING COUNT(A+','+B) > 1
The query fails with the error
Column 'MyDB.dbo.MyTable.A' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
with a similar error for Column B.
How can I format the output to separate the two columns?
It would seem more natural to me to write:
SELECT A, B, Count(*) FROM MyTable
GROUP BY A, B
HAVING COUNT(*) > 1
And it's the most efficient way of doing it (and so is the query in the question).
Similarly to the above query, you can rewrite your second query:
SELECT A + ',' + B, Count(*) FROM MyTable
GROUP BY A, B
HAVING COUNT(*) > 1
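One further reason to group on the two columns rather than on their concatenation is that different pairs can concatenate to the same string. The sketch below shows this with Python's sqlite3 module; the sample rows are invented, and SQLite's || operator stands in for T-SQL's + on strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER, A TEXT, B TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "ab", "c"),  # concatenates to 'abc'
        (2, "a", "bc"),  # also 'abc', but a different (A, B) pair
        (3, "x", "y"),
        (4, "x", "y"),   # a genuine duplicate of row 3
    ],
)

# Grouping on the concatenation reports a false duplicate for 'abc'.
by_concat = conn.execute(
    "SELECT A || B, COUNT(*) FROM MyTable "
    "GROUP BY A || B HAVING COUNT(*) > 1 ORDER BY 1"
).fetchall()

# Grouping on the columns themselves reports only the real duplicate.
by_columns = conn.execute(
    "SELECT A || ',' || B, COUNT(*) FROM MyTable "
    "GROUP BY A, B HAVING COUNT(*) > 1 ORDER BY 1"
).fetchall()

print(by_concat)
print(by_columns)
```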