SQL: earliest date from set of date fields

I have a series of dates associated with a unique identifier in a table. For example:
1 | 1999-04-01 | 0000-00-00 | 0000-00-00 | 0000-00-00 | 2008-12-01 |
2 | 1999-04-06 | 2000-04-01 | 0000-00-00 | 0000-00-00 | 2010-04-03 |
3 | 1999-01-09 | 0000-00-00 | 0000-00-00 | 0000-00-00 | 2007-09-03 |
4 | 1999-01-01 | 0000-00-00 | 1997-01-01 | 0000-00-00 | 2002-01-04 |
Is there a way to select the earliest date from the predefined list of DATE fields using a straightforward SQL command?
So the expected output would be:
1 | 1999-04-01
2 | 1999-04-06
3 | 1999-01-09
4 | 1997-01-01
I am guessing this is not possible but I wanted to ask and make sure. My current solution in mind involves putting all the dates in a temporary table and then using that to get the MIN()
thanks
Edit: The problem with using LEAST() as stated is that the new behaviour is to return NULL if any of the columns is NULL. In a series of dates like the dataset in question, any date might be NULL. I would like to obtain the earliest actual date from the set of dates.
SOLUTION: Used a combination of LEAST() and IF() in order to filter out NULL dates.
SELECT LEAST( IF(date1=0,NOW(),date1), IF(date2=0,NOW(),date2), [...] );
Lessons learnt: a) COALESCE does not treat '0000-00-00' as a NULL date, b) LEAST will return '0000-00-00' as the smallest value - I would guess this is due to internal integer comparison(?)

select id, least(date_col_a, date_col_b, date_col_c) from table
Update, treating '0000-00-00' as a missing date:
select id, least (
case when date_col_a = '0000-00-00' then now() + interval 100 year else date_col_a end,
case when date_col_b = '0000-00-00' then now() + interval 100 year else date_col_b end) from table

Actually, you can do it like below, or use a large CASE structure, or use least(date1, date2, dateN), but with that NULL could end up as the minimum value...
select rowid, min(dt)
from
( select rowid, date1 as dt from table
union all
select rowid, date2 from table
union all
select rowid, date3 from table
/* and so on */
) all_dates
group by rowid;
HTH
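The UNION ALL approach above can be tried out with a small sketch. It assumes SQLite through Python's sqlite3 module rather than MySQL, a hypothetical table named events with three date columns, and dates stored as ISO text so that they sort correctly as strings. NULLIF turns the '0000-00-00' placeholder into NULL, which the aggregate MIN then skips.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, date1 TEXT, date2 TEXT, date3 TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1, "1999-04-01", "0000-00-00", "2008-12-01"),
        (2, "1999-04-06", "2000-04-01", None),
        (4, "1999-01-01", "1997-01-01", "2002-01-04"),
    ],
)

# Unpivot the date columns into one column, then take MIN per id.
# The aggregate MIN ignores NULLs, so zero dates are mapped to NULL first.
unpivot_query = """
SELECT id, MIN(d) AS earliest
FROM (
    SELECT id, NULLIF(date1, '0000-00-00') AS d FROM events
    UNION ALL
    SELECT id, NULLIF(date2, '0000-00-00') FROM events
    UNION ALL
    SELECT id, NULLIF(date3, '0000-00-00') FROM events
) AS all_dates
GROUP BY id
ORDER BY id
"""
earliest_by_id = conn.execute(unpivot_query).fetchall()
print(earliest_by_id)
```

Unlike the scalar least(), the aggregate never needs a sentinel date, at the cost of scanning the table once per date column.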

select
id,
least(coalesce(date1, '9999-12-31'), ....)
from
table
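As a rough illustration of the sentinel idea, here is a sketch that again assumes SQLite via Python's sqlite3 (where the multi-argument min() plays the role of MySQL's LEAST and also returns NULL if any argument is NULL). The table events, its columns and the helper clean() are hypothetical. Both NULL and '0000-00-00' are replaced by the sentinel '9999-12-31', and the sentinel is mapped back to NULL when a row has no real date at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, date1 TEXT, date2 TEXT, date3 TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1, "1999-04-01", "0000-00-00", "2008-12-01"),
        (2, "1999-04-06", "2000-04-01", None),
        (3, "0000-00-00", None, "0000-00-00"),
        (4, "1999-01-01", "1997-01-01", "2002-01-04"),
    ],
)

SENTINEL = "9999-12-31"  # assumed to be later than any real date in the data


def clean(column):
    """SQL fragment that maps NULL and '0000-00-00' to the sentinel date."""
    return f"COALESCE(NULLIF({column}, '0000-00-00'), '{SENTINEL}')"


least_query = f"""
SELECT id,
       NULLIF(min({clean('date1')}, {clean('date2')}, {clean('date3')}),
              '{SENTINEL}') AS earliest
FROM events
ORDER BY id
"""
earliest_dates = conn.execute(least_query).fetchall()
print(earliest_dates)
```

The outer NULLIF is a design choice: without it, a row with no usable dates would report the sentinel itself as its earliest date.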

Related

sum time with specific delimiter

Right now I have a problem summing time values based on a specific condition.
I have to add up the work time per activity date, but only for rows whose approval status on that activity date is Approve.
For example, I have something like this:
-----------------------------------------------
| Activity Date | ApprovalStatus | WorkTime |
-----------------------------------------------
| 2017-01-06 | Rejected | 01:00:00 |
-----------------------------------------------
| 2017-01-06 | Approve | 03:00:00 |
-----------------------------------------------
| 2017-01-06 | Waiting | 02:00:00 |
-----------------------------------------------
| 2017-01-06 | Approve | 01:00:00 |
-----------------------------------------------
In this example, only the rows with status Approve should be summed, so the expected result is 04:00:00, as shown below.
-----------------------------------------------
| Activity Date | ApprovalStatus | WorkTime |
-----------------------------------------------
| 2017-01-06 | Approved | 04:00:00 |
-----------------------------------------------
Is there any way to solve this problem?
PS: I am using SQL Server 2014. Hope you can help me, thank you!!
Try like below
Schema:
SELECT * INTO #TAB FROM(
SELECT '2017-01-06' AS Activity_Date
, 'Rejected' AS ApprovalStatus
, '01:00:00' AS WorkTime
UNION ALL
SELECT '2017-01-06' , 'Approve' , '03:00:00'
UNION ALL
SELECT '2017-01-06' , 'Waiting' , '02:00:00'
UNION ALL
SELECT '2017-01-06' , 'Approve' , '01:00:00'
)A
Now sum the hours, grouping by the date:
SELECT [Activity_Date]
,CAST(DATEADD(HH,SUM( DATEDIFF(HH,'00:00:00',WorkTime)),'00:00:00') AS TIME(0))
FROM #TAB
WHERE ApprovalStatus='Approve'
GROUP BY [Activity_Date]
Result:
+---------------+------------------+
| Activity_Date | (No column name) |
+---------------+------------------+
| 2017-01-06 | 04:00:00 |
+---------------+------------------+
UPDATE :
The SUM function only accepts exact numeric or approximate numeric data types. It won't accept date or time data types for summation.
This is documented in SUM (Transact-SQL) on the Microsoft website:
SUM ( [ ALL | DISTINCT ] expression )
expression
Is a constant, column, or function, and any combination of
arithmetic, bitwise, and string operators. expression is an expression
of the exact numeric or approximate numeric data type category, except
for the bit data type. Aggregate functions and subqueries are not
permitted.
So you have to write your own logic to get the sum of time values. The query below calculates the sum of the time with millisecond precision.
SELECT [Activity_Date]
,CAST(DATEADD(ms, SUM(DATEDIFF(ms, '00:00:00.000', WorkTime)), '00:00:00.000') as time(0))
FROM #TAB2
WHERE ApprovalStatus='Approve'
GROUP BY [Activity_Date]
You can filter the records by ApprovalStatus and do a summation on worktime by grouping it by activity date.
Use this, if you want to add only the hour part.
SELECT SUM(DATEDIFF(HH,'00:00:00',WorkTime)) AS [TotalWorktime]
FROM [YourTable]
WHERE ApprovalStatus = 'Approve'
GROUP BY [Activity Date]
OR
Use this if you want to add even the minutes part.
SELECT SUM(DATEDIFF(MINUTE,'0:00:00',CONVERT(TIME,WorkTime)))/60 + (SUM(DATEDIFF(MINUTE,'0:00:00',CONVERT(TIME,WorkTime)))%60)/100.0 AS [TotalWorktime]
FROM [YourTable]
WHERE ApprovalStatus = 'Approve'
GROUP BY [Activity Date]
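The common pattern in these answers is: filter on the status, convert each time to a number, sum, then convert back. A minimal sketch of that pattern follows, assuming SQLite via Python's sqlite3 instead of SQL Server, with hypothetical table and column names. The conversion uses seconds, and the formatting back to HH:MM:SS is done in Python by the hypothetical helper format_hms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (activity_date TEXT, approval_status TEXT, work_time TEXT)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        ("2017-01-06", "Rejected", "01:00:00"),
        ("2017-01-06", "Approve", "03:00:00"),
        ("2017-01-06", "Waiting", "02:00:00"),
        ("2017-01-06", "Approve", "01:00:00"),
    ],
)

# strftime('%s', ...) yields seconds since 1970-01-01, so prefixing the
# time with that date turns 'HH:MM:SS' into a plain number of seconds.
rows = conn.execute(
    """
    SELECT activity_date,
           SUM(CAST(strftime('%s', '1970-01-01 ' || work_time) AS INTEGER))
    FROM activity
    WHERE approval_status = 'Approve'
    GROUP BY activity_date
    """
).fetchall()


def format_hms(total_seconds):
    """Format a number of seconds as HH:MM:SS (hours may exceed 23)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


totals = [(day, format_hms(seconds)) for day, seconds in rows]
print(totals)
```

Formatting outside the database sidesteps one limitation of casting back to a time type: a total of 24 hours or more cannot be represented as a time of day.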

Oracle: How to query a datetime column for both not-null and null values?

I want to query a datetime column so that rows with both not-null and null values are considered.
My current query only handles the not-null values, but I want to handle both.
Query:
select l.com_code,
l.p_code,
to_char(l.effdate,'dd/mm/yyyy') effdate,to_char(l.expdate,'dd/mm/yyyy') expdate
from RATE_BILL l
where ( to_date('02/06/2016','dd/mm/yyyy') <= to_date(l.effdate,'dd/mm/yyyy')
or to_date('02/06/2016','dd/mm/yyyy') <= to_date(l.expdate,'dd/mm/yyyy') )
Data Sample
com_code | p_code | effdate | expdate
A | TEST01 | 01/01/2016 | 31/05/2016
A | Test01 | 01/06/2016 |
Query Result:
com_code | p_code | effdate | expdate
A | TEST01 | 01/01/2016 | 31/05/2016
A | Test01 | 01/06/2016 |
A null expdate should be treated as '31/12/9998', although it is stored as null in the DB.
When the query date '02/06/2016' falls between effdate and expdate, the result should be:
com_code | p_code | effdate | expdate
A | Test01 | 01/06/2016 |
But when the where clause is
where ( to_date('31/05/2016','dd/mm/yyyy') <= to_date(l.effdate,'dd/mm/yyyy') or to_date('31/05/2016','dd/mm/yyyy') <= to_date(l.expdate,'dd/mm/yyyy') )
the result should be:
A | TEST01 | 01/01/2016 | 31/05/2016
A | Test01 | 01/06/2016 |
Values Datetime is "Now Datetime"
First of all, I must admit that I am not sure I understand your question from its wording [no offence intended]. Feel free to comment if this answer does not address your needs.
The where condition of a query is built on the columns of the table/view and their sql data types. There is no need to convert datetime columns to the datetime data type.
Moreover, it is potentially harmful here since it implies an implicit conversion:
date column
-> char /* implicit, default format */
-> date /* explicit format;
in general will differ from the format the argument
string follows
*/
So change the where condition to:
where to_date('02/06/2016','dd/mm/yyyy') <= l.effdate
or to_date('02/06/2016','dd/mm/yyyy') <= l.expdate
To cater for null values, complement the where condition with a 'sufficiently large' datetime to compare against in case of null values in the db columns:
where to_date('02/06/2016','dd/mm/yyyy') <= nvl(l.effdate, to_date('31/12/9998','dd/mm/yyyy'))
or to_date('02/06/2016','dd/mm/yyyy') <= nvl(l.expdate, to_date('31/12/9998','dd/mm/yyyy'))
You are free to use different cutoff dates. For example, you might wish to use expdate from rate_bill when it is not null and the current datetime otherwise:
where to_date('02/06/2016','dd/mm/yyyy') <= nvl(l.effdate, to_date('31/12/9998','dd/mm/yyyy'))
or to_date('02/06/2016','dd/mm/yyyy') <= nvl(l.expdate, sysdate)
I don't understand the details of your problem, but I think you have a problem with the comparison of null values.
A comparison against NULL never evaluates to true, so rows with null values are dropped by the where condition. To select these rows, you should explicitly check for null (l.expdate is null),
e.g.
-- select with expdate <= today or with no expdate
select *
from RATE_BILL l
where l.expdate is null or
l.expdate <= trunc(sysdate)
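The NULL behaviour described in both answers can be checked with a short sketch. It assumes SQLite via Python's sqlite3 rather than Oracle, so COALESCE stands in for nvl and dates are stored as ISO text; the table layout follows the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rate_bill (com_code TEXT, p_code TEXT, effdate TEXT, expdate TEXT)"
)
conn.executemany(
    "INSERT INTO rate_bill VALUES (?, ?, ?, ?)",
    [
        ("A", "TEST01", "2016-01-01", "2016-05-31"),
        ("A", "Test01", "2016-06-01", None),  # open-ended: no expiry date
    ],
)

query_date = "2016-06-02"

# A plain comparison with NULL is never true, so the open-ended row is lost.
plain = conn.execute(
    "SELECT p_code FROM rate_bill WHERE ? <= expdate", (query_date,)
).fetchall()

# Option 1: substitute a far-future date for NULL before comparing.
with_default = conn.execute(
    "SELECT p_code FROM rate_bill WHERE ? <= COALESCE(expdate, '9998-12-31')",
    (query_date,),
).fetchall()

# Option 2: test for NULL explicitly.
explicit_check = conn.execute(
    "SELECT p_code FROM rate_bill WHERE expdate IS NULL OR ? <= expdate",
    (query_date,),
).fetchall()

print(plain, with_default, explicit_check)
```

Both options return the open-ended row; the explicit IS NULL test avoids depending on a magic cutoff date.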

SQL : Getting data as well as count from a single table for a month

I am working on a SQL query where I have a rather huge data-set. I have the table data as mentioned below.
Existing table :
+---------+------+------------+
| id(!PK) | name | Date       |
+---------+------+------------+
| 1       | abc  | 21.03.2015 |
| 1       | def  | 22.04.2015 |
| 1       | ajk  | 22.03.2015 |
| 3       | ghi  | 23.03.2015 |
+---------+------+------------+
What I am looking for is an insert query into an empty table. The condition is like this :
Insert in an empty table where id is common, count of names common to an id for march.
Output for above table would be like
+---------+-------+------------+
| some_id | count | Date       |
+---------+-------+------------+
| 1       | 2     | 21.03.2015 |
| 3       | 1     | 23.03.2015 |
+---------+-------+------------+
All I have is :
insert into empty_table values (some_id,count,date)
select id,count(*),date from existing_table where id=1;
Unfortunately above basic query doesn't suit this complex requirement.
Any suggestions or ideas? Thank you.
Updated query
insert into empty_table
select id,count(*),min(date)
from existing_table where
date >= '2015-03-01' and
date < '2015-04-01'
group by id;
Seems you want the number of unique names per id:
insert into empty_table
select id
,count(distinct name)
,min(date)
from existing_table
where date >= DATE '2015-03-01'
and date < DATE '2015-04-01'
group by id;
If I understand correctly, you just need a date condition:
insert into empty_table(some_id, count, date)
select id, count(*), min(date)
from existing_table
where id = 1 and
date >= date '2015-03-01' and
date < date '2015-04-01'
group by id;
Note: the list after the table name contains the columns being inserted. There is no values keyword when using insert . . . select.
insert into empty_table
select id, count(*) as mycnt, min(date) as mydate
from existing_table
group by id, year_month(date);
Here year_month(date) is a placeholder: replace it with whatever function your RDBMS provides for extracting the year and month from a date. You did not say which RDBMS and version you use, and date handling varies widely between them.
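To make the year-and-month grouping concrete, here is a sketch that assumes SQLite via Python's sqlite3, where strftime('%Y-%m', ...) extracts the year and month. The column name entry_date, the target table monthly_counts and its columns are hypothetical, and dates are assumed to be stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE existing_table (id INTEGER, name TEXT, entry_date TEXT)")
conn.execute(
    "CREATE TABLE monthly_counts "
    "(some_id INTEGER, month TEXT, name_count INTEGER, first_date TEXT)"
)
conn.executemany(
    "INSERT INTO existing_table VALUES (?, ?, ?)",
    [
        (1, "abc", "2015-03-21"),
        (1, "def", "2015-04-22"),
        (1, "ajk", "2015-03-22"),
        (3, "ghi", "2015-03-23"),
    ],
)

# INSERT ... SELECT takes no VALUES keyword; one output row per id and month.
conn.execute(
    """
    INSERT INTO monthly_counts (some_id, month, name_count, first_date)
    SELECT id, strftime('%Y-%m', entry_date), COUNT(*), MIN(entry_date)
    FROM existing_table
    GROUP BY id, strftime('%Y-%m', entry_date)
    """
)

monthly = conn.execute(
    "SELECT some_id, month, name_count, first_date "
    "FROM monthly_counts ORDER BY some_id, month"
).fetchall()
print(monthly)
```

Adding a condition such as month = '2015-03' when reading from monthly_counts gives the March-only result from the question.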

SQL Query Compare values in per 15 minutes and display the result per hour

I have a table with 2 columns. UTCTime and Values.
The UTCTime is in 15 mins increment. I want a query that would compare the value to the previous value in one hour span and display a value between 0 and 4 depends on if the values are constant. In other words there is an entry for every 15 minute increment and the value can be constant so I just need to check each value to the previous one per hour.
For example
+---------|-------+
| UTCTime | Value |
------------------|
| 12:00 | 18.2 |
| 12:15 | 87.3 |
| 12:30 | 55.91 |
| 12:45 | 55.91 |
| 1:00 | 37.3 |
| 1:15 | 47.3 |
| 1:30 | 47.3 |
| 1:45 | 47.3 |
| 2:00 | 37.3 |
+---------|-------+
In this case, I just want a query that would compare the 12:45 value to the 12:30 value, the 12:30 value to the 12:15 value, and so on. Since we are comparing within a one-hour span only, the count of constant values must be between 0 and 4 (0 means there are no constant values, 1 means there is one, as in the 12:00 hour of the example above).
The query should display:
+----------+----------------+
| UTCTime | ConstantValues |
----------------------------|
| 12:00 | 1 |
| 1:00 | 2 |
+----------|----------------+
I just wanted to mention that I am new to SQL programming.
Thank you.
Below is the query you need and a working solution. Note: I changed the timeframe to 24 hrs.
;with SourceData(HourTime, Value, RowNum)
as
(
select
datepart(hh, UTCTime) HourTime,
Value,
row_number() over (partition by datepart(hh, UTCTime) order by UTCTime) RowNum
from foo
union
select
datepart(hh, UTCTime) - 1 HourTime,
Value,
5
from foo
where datepart(mi, UTCTime) = 0
)
select cast(A.HourTime as varchar) + ':00' UTCTime, sum(case when A.Value = B.Value then 1 else 0 end) ConstantValues
from SourceData A
inner join SourceData B on A.HourTime = B.HourTime and
(B.RowNum = (A.RowNum - 1))
group by cast(A.HourTime as varchar) + ':00'
select SUBSTRING_INDEX(UTCTime,':',1) as time,value, count(*)-1 as total
from foo group by value,time having total >= 1;
Mine isn't much different from Vasanth's, same idea different approach.
The idea is that you need a self-join to carry it out simply. You could also use the LEAD() function to look at rows ahead of your current row, but in this case that would require a big case statement to cover every outcome.
;WITH T
AS (
SELECT a.UTCTime,b.VALUE,ROW_NUMBER() OVER(PARTITION BY a.UTCTime ORDER BY b.UTCTime DESC)'RowRank'
FROM (SELECT *
FROM #Table1
WHERE DATEPART(MINUTE,UTCTime) = 0
)a
JOIN #Table1 b
ON b.UTCTIME BETWEEN a.UTCTIME AND DATEADD(hour,1,a.UTCTIME)
)
SELECT T.UTCTime, SUM(CASE WHEN T.Value = T2.Value THEN 1 ELSE 0 END)
FROM T
JOIN T T2
ON T.UTCTime = T2.UTCTime
AND T.RowRank = T2.RowRank -1
GROUP BY T.UTCTime
If you run the portion inside the ;WITH T AS ( ) you'll see that it gets us the hour we're looking at and the values in order by time. That is used in the final SELECT below it, which joins the CTE to itself and compares each row with the next row (hence the RowRank - 1 in the JOIN).
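For comparison, a window-function variant is sketched here. It assumes SQLite 3.25 or later via Python's sqlite3 (window functions are not available in older versions), a hypothetical table named readings, and full timestamps with 1:00 written as 13:00. Each reading is paired with its predecessor through LAG(), and the comparison is counted under the hour in which the earlier reading falls, so the 13:00 reading still counts toward the 12:00 hour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (utc_time TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2020-01-01 12:00:00", 18.2),
        ("2020-01-01 12:15:00", 87.3),
        ("2020-01-01 12:30:00", 55.91),
        ("2020-01-01 12:45:00", 55.91),
        ("2020-01-01 13:00:00", 37.3),
        ("2020-01-01 13:15:00", 47.3),
        ("2020-01-01 13:30:00", 47.3),
        ("2020-01-01 13:45:00", 47.3),
        ("2020-01-01 14:00:00", 37.3),
    ],
)

lag_query = """
WITH paired AS (
    SELECT value,
           LAG(value)    OVER (ORDER BY utc_time) AS prev_value,
           LAG(utc_time) OVER (ORDER BY utc_time) AS prev_time
    FROM readings
)
SELECT strftime('%H:00', prev_time) AS hour_start,
       SUM(value = prev_value)      AS constant_values
FROM paired
WHERE prev_time IS NOT NULL        -- the first reading has no predecessor
GROUP BY hour_start
ORDER BY hour_start
"""
constant_counts = conn.execute(lag_query).fetchall()
print(constant_counts)
```

The sketch relies on there being exactly one reading every 15 minutes; gaps in the data would need an extra check that prev_time is 15 minutes before the current reading.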

Is it possible to temporarily duplicate and modify rows on the fly in an SQL SELECT query?

I've just received a new data source for my application which inserts data into a Derby database only when it changes. Normally, missing data is fine - I'm drawing a line chart with the data (value over time), and I'd just draw a line between the two points, interpolating the expected value at any given point. The problem is that missing data in this case means "the value has not changed, so draw a flat line," and the graph would be incorrect if I did this.
There are two ways I could fix this: I could create a new class that handles missing data differently (which could be difficult due to the way prefuse, the drawing library I'm using, handles drawing), or I could duplicate the rows, leaving the y value the same while changing the x value in each row. I could do this in the Java that bridges the database and the renderer, or I could modify the SQL.
My question is, given a result set like the one below:
+-------+---------------------+
| value | received |
+-------+---------------------+
| 7 | 2000-01-01 08:00:00 |
| 10 | 2000-01-01 08:00:05 |
| 11 | 2000-01-01 08:00:07 |
| 2 | 2000-01-01 08:00:13 |
| 4 | 2000-01-01 08:00:16 |
+-------+---------------------+
Assuming I query it at 8:00:20, how can I make it look like the following using SQL? Basically, I'm duplicating the row for every second until it's already taken. received is, for all intents and purposes, unique (it's not, but it will be due to the WHERE clause in the query).
+-------+---------------------+
| value | received |
+-------+---------------------+
| 7 | 2000-01-01 08:00:00 |
| 7 | 2000-01-01 08:00:01 |
| 7 | 2000-01-01 08:00:02 |
| 7 | 2000-01-01 08:00:03 |
| 7 | 2000-01-01 08:00:04 |
| 10 | 2000-01-01 08:00:05 |
| 10 | 2000-01-01 08:00:06 |
| 11 | 2000-01-01 08:00:07 |
| 11 | 2000-01-01 08:00:08 |
| 11 | 2000-01-01 08:00:09 |
| 11 | 2000-01-01 08:00:10 |
| 11 | 2000-01-01 08:00:11 |
| 11 | 2000-01-01 08:00:12 |
| 2 | 2000-01-01 08:00:13 |
| 2 | 2000-01-01 08:00:14 |
| 2 | 2000-01-01 08:00:15 |
| 4 | 2000-01-01 08:00:16 |
| 4 | 2000-01-01 08:00:17 |
| 4 | 2000-01-01 08:00:18 |
| 4 | 2000-01-01 08:00:19 |
| 4 | 2000-01-01 08:00:20 |
+-------+---------------------+
Thanks for your help.
Due to the set based nature of SQL, there's no simple way to do this. I have used two solution strategies:
a) use a cycle to go from the initial to end date time and for each step get the value, and insert that into a temp table
b) generate a table (normal or temporary) with the 1 minute increments, adding the base date time to this table you can generate the steps.
Example of approach b) (SQL Server version)
Let's assume we will never query more than 24 hours of data. We create a table intervals that has a dttm field with the minute count for each step. That table must be populated beforehand.
select dateadd(minute,dttm,'2000-01-01 08:00') received,
(select top 1 value from table where received <=
dateadd(minute,dttm,'2000-01-01 08:00')
order by received desc) value
from intervals
It seems like in this case you really don't need to generate all of these datapoints. Would it be correct to generate the following instead? If it's drawing a straight line, you don't need to generate a data point for each second, just two for each datapoint: one at the current time, and one right before the next time. The query below subtracts 5 ms from the next time, but you could make it a full second if you need it.
+-------+---------------------+
| value | received |
+-------+---------------------+
| 7 | 2000-01-01 08:00:00 |
| 7 | 2000-01-01 08:00:04 |
| 10 | 2000-01-01 08:00:05 |
| 10 | 2000-01-01 08:00:06 |
| 11 | 2000-01-01 08:00:07 |
| 11 | 2000-01-01 08:00:12 |
| 2 | 2000-01-01 08:00:13 |
| 2 | 2000-01-01 08:00:15 |
| 4 | 2000-01-01 08:00:16 |
| 4 | 2000-01-01 08:00:20 |
+-------+---------------------+
If that's the case, then you can do the following:
SELECT * FROM
(SELECT * from TimeTable as t1
UNION
SELECT t2.value, dateadd(ms, -5, t2.received)
from ( Select t3.value, (select top 1 t4.received
from TimeTable t4
where t4.received > t3.received
order by t4.received asc) as received
from TimeTable t3) as t2
UNION
SELECT top 1 t6.value, GETDATE()
from TimeTable t6
order by t6.received desc
) as t5
where received IS NOT NULL
order by t5.received
The big advantage of this is that it is a set based solution and will be much faster than any iterative approach.
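The same two-points-per-step idea can be sketched for experimentation, assuming SQLite via Python's sqlite3 rather than SQL Server, a hypothetical table named data, a full second (not 5 ms) as the offset, and a fixed query time in place of GETDATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (value INTEGER, received TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        (7, "2000-01-01 08:00:00"),
        (10, "2000-01-01 08:00:05"),
        (11, "2000-01-01 08:00:07"),
        (2, "2000-01-01 08:00:13"),
        (4, "2000-01-01 08:00:16"),
    ],
)

query_time = "2000-01-01 08:00:20"  # stands in for the current time

step_query = """
SELECT value, received FROM data              -- the real data points
UNION ALL
SELECT a.value,                               -- same value, one second
       datetime(MIN(b.received), '-1 second') -- before the next point
FROM data AS a
JOIN data AS b ON b.received > a.received
GROUP BY a.received, a.value
UNION ALL
SELECT value, ?                               -- latest value at query time
FROM (SELECT value FROM data ORDER BY received DESC LIMIT 1)
ORDER BY received
"""
step_points = conn.execute(step_query, (query_time,)).fetchall()
print(step_points)
```

With this data the result has ten rows, matching the table shown in this answer; readings that arrive in consecutive seconds would produce duplicate timestamps and need extra handling.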
You could just walk a cursor, keep vars for the last value & time returned, and if the current one is more than a second ahead, loop one second at a time using the previous value and the new time until you get to the current row's time.
Trying to do this in SQL would be painful, and if you went and created the missing data, you would possibly have to add a column to track real / interpolated data points.
Better would be to have a table for each axial value you want to have on the graph, and then either join to it or even just put the data field there and update that record when/if values arrive.
The "missing values" problem is quite extensive, so I suggest you have a solid policy.
One thing that will happen is that you will have multiple adjacent slots with missing values.
This would be much easier if you could transform it into OLAP data.
Create a simple table that has all the minutes (warning, will run for a while):
Create Table Minutes(Value DateTime Not Null)
Go
Declare @D DateTime
Set @D = '1/1/2000'
While (Year(@D) < 2002)
Begin
Insert Into Minutes(Value) Values(@D)
Set @D = DateAdd(Minute, 1, @D)
End
Go
Create Clustered Index IX_Minutes On Minutes(Value)
Go
You can then use it somewhat like this:
Select
Received = Minutes.Value,
Value = (Select Top 1 Data.Value
From Data
Where Data.Received <= Minutes.Value
Order By Data.Received Desc)
From
Minutes
Where
Minutes.Value Between @Start And @End
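A compact variant of the helper-table approach is sketched below. It assumes SQLite via Python's sqlite3, a hypothetical table named data, and one-second steps as in the question. Instead of a permanent Minutes table, a recursive common table expression generates the timestamps on the fly, and a correlated subquery picks the most recent known value for each of them. Whether Derby supports an equivalent construct is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (value INTEGER, received TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        (7, "2000-01-01 08:00:00"),
        (10, "2000-01-01 08:00:05"),
        (11, "2000-01-01 08:00:07"),
        (2, "2000-01-01 08:00:13"),
        (4, "2000-01-01 08:00:16"),
    ],
)

fill_query = """
WITH RECURSIVE ticks(ts) AS (
    SELECT ?                                   -- start of the interval
    UNION ALL
    SELECT datetime(ts, '+1 second') FROM ticks WHERE ts < ?
)
SELECT (SELECT d.value
        FROM data AS d
        WHERE d.received <= ticks.ts           -- last value known at ts
        ORDER BY d.received DESC
        LIMIT 1) AS value,
       ts AS received
FROM ticks
ORDER BY ts
"""
filled = conn.execute(
    fill_query, ("2000-01-01 08:00:00", "2000-01-01 08:00:20")
).fetchall()
print(len(filled), filled[:3])
```

The result has one row per second from 08:00:00 to 08:00:20, each carrying the last value received up to that second.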
I would recommend against solving this in SQL/the database due to the set based nature of it.
Also you are dealing with seconds here so I guess you could end up with a lot of rows, with the same repeated data, that would have to be transfered from the database to you application.
One way to handle this is to left join your data against a table that contains all of the received values. Then, when there is no value for that row, you calculate what the projected value should be based on the previous and next actual values you have.
You didn't say what database platform you are using. In SQL Server, I would create a User Defined Function that accepts a start datetime and end datetime value. It would return a table value with all of the received values you need.
I have simulated it below, which runs in SQL Server. The subselect aliased r is what would actually get returned by the user defined function.
select r.received,
isnull(d.value,(select top 1 data.value from data where data.received < r.received order by data.received desc)) as x
from (
select cast('2000-01-01 08:00:00' as datetime) received
union all
select cast('2000-01-01 08:00:01' as datetime)
union all
select cast('2000-01-01 08:00:02' as datetime)
union all
select cast('2000-01-01 08:00:03' as datetime)
union all
select cast('2000-01-01 08:00:04' as datetime)
union all
select cast('2000-01-01 08:00:05' as datetime)
union all
select cast('2000-01-01 08:00:06' as datetime)
union all
select cast('2000-01-01 08:00:07' as datetime)
union all
select cast('2000-01-01 08:00:08' as datetime)
union all
select cast('2000-01-01 08:00:09' as datetime)
union all
select cast('2000-01-01 08:00:10' as datetime)
union all
select cast('2000-01-01 08:00:11' as datetime)
union all
select cast('2000-01-01 08:00:12' as datetime)
union all
select cast('2000-01-01 08:00:13' as datetime)
union all
select cast('2000-01-01 08:00:14' as datetime)
union all
select cast('2000-01-01 08:00:15' as datetime)
union all
select cast('2000-01-01 08:00:16' as datetime)
union all
select cast('2000-01-01 08:00:17' as datetime)
union all
select cast('2000-01-01 08:00:18' as datetime)
union all
select cast('2000-01-01 08:00:19' as datetime)
union all
select cast('2000-01-01 08:00:20' as datetime)
) r
left outer join Data d on r.received = d.received
If you were in SQL Server, then this would be a good start. I am not sure how close Apache's Derby is to sql.
Usage: EXEC ElaboratedData '2000-01-01 08:00:00','2000-01-01 08:00:20'
CREATE PROCEDURE [dbo].[ElaboratedData]
@StartDate DATETIME,
@EndDate DATETIME
AS
--if not a valid interval, just quit
IF @EndDate<=@StartDate BEGIN
SELECT 0;
RETURN;
END;
/*
create a temp table w/the same structure as the real table.
--*/
CREATE TABLE #SecondIntervals(TSTAMP DATETIME, DATAPT INT);
/*
For each second in the interval, check to see if we have a known value.
If we do, then use that. If not, make one up.
--*/
DECLARE @CurrentSecond DATETIME;
SET @CurrentSecond = @StartDate;
WHILE @CurrentSecond <= @EndDate BEGIN
DECLARE @KnownValue INT;
--reset on every pass: a SELECT that matches no row leaves the variable unchanged
SET @KnownValue = NULL;
SELECT @KnownValue=DATAPT
FROM TESTME
WHERE TSTAMP = @CurrentSecond;
IF (@KnownValue IS NULL) BEGIN
--ok, we have to make up a fake value
DECLARE @MadeUpValue INT;
/*
*******Put whatever logic you want to make up a fake value here
--*/
SET @MadeUpValue = 99;
INSERT INTO #SecondIntervals(
TSTAMP
,DATAPT
)
VALUES(
@CurrentSecond
,@MadeUpValue
);
END; --if we had to make up a value
--step by exactly one second (avoids rounding drift from float arithmetic)
SET @CurrentSecond = DATEADD(second, 1, @CurrentSecond);
END; --while looking thru our values
--finally, return our generated values + real values
SELECT TSTAMP, DATAPT FROM #SecondIntervals
UNION ALL
SELECT TSTAMP, DATAPT FROM TESTME
ORDER BY TSTAMP;
GO
As just an idea, you might want to check out Anthony Molinaro's SQL Cookbook, chapter 9. He has a recipe, "Filling in Missing Dates" (check out pages 278-281), that discusses primarily what you are trying to do. It requires some sort of sequential handling, either via a helper table or doing the query recursively. While he doesn't have examples for Derby directly, I suspect you could probably adapt them to your problem (particularly the PostgreSQL or MySQL one, it seems somewhat platform agnostic).