'Summing' a date field in SQL - any ideas?

I'm creating an application that is essentially an integrity check between two databases - one is MSSQL and one is an old provider Btrieve. As part of the requirements all columns for every table need to be compared to ensure the data matches. Currently we loop through each table, get the basic count of the table in both DBs, and then delve into the columns. For numeric fields we do a simple SUM, and for text fields we sum up the length of the column for every row. If these match in both DBs, it's a good indicator the data has migrated across correctly.
This all works fine, but I need to develop something similar for datetime fields. Obviously we can't really SUM these fields, so I'm wondering if anyone has ideas on the best way to approach this. I was thinking maybe the seconds since a certain date but the number will be huge.
Any other ideas? Thanks!

The most straightforward answer to me would be to convert the date or datetime fields to integers with the same format. YYYYMMDD or YYYYMMDDHHmmss work just fine as long as your formats use leading zeroes. In SQL Server, you can do something like:
SELECT SUM(CAST(REPLACE(REPLACE(REPLACE(CONVERT(VARCHAR(20),DateTimeColumn,120),' ',''),':',''),'-','') AS BIGINT)) .....
Alternately, you can convert them to either the number of days from a given date ('1970-01-01'), or the number of seconds from a given date ('1970-01-01 00:00:00') if you use time.
-- cast to BIGINT so the running total does not overflow INT on large tables
SELECT SUM(CAST(DATEDIFF(DAY,'19700101',DateColumn) AS BIGINT)) ....
I'm not familiar enough with Btrieve to know what kinds of functions are available for formatting dates, however.
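If the Btrieve side lacks equivalent formatting functions, the same two checksums could be computed in application code after fetching the rows. This is only a minimal Python sketch, assuming the column values have already been read as `datetime` objects; the function names `digits_checksum` and `days_checksum` are hypothetical.

```python
from datetime import datetime

def digits_checksum(values):
    """Sum each datetime rendered as the integer YYYYMMDDHHMMSS."""
    return sum(int(v.strftime("%Y%m%d%H%M%S")) for v in values)

def days_checksum(values, epoch=datetime(1970, 1, 1)):
    """Sum the whole days elapsed between the epoch and each datetime."""
    return sum((v - epoch).days for v in values)

rows = [datetime(2016, 1, 4, 12, 30, 0), datetime(1999, 12, 31, 23, 59, 59)]
print(digits_checksum(rows))  # 40151335358959
print(days_checksum(rows))    # 27760
```

Python integers do not overflow, so either total can be compared directly with the BIGINT sum produced on the SQL Server side.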

Along the same lines as the numeric check, you can use EXCEPT to compare the per-date row counts of the two tables. For the old source, you can generate the SELECT statement in Excel or in the native database and bring the results into SQL Server. For demonstration purposes, the example below uses two tables.
IF EXISTS (SELECT * FROM sys.objects
WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[DateCompareOld]') AND
TYPE IN (N'U'))
DROP TABLE [dbo].[DateCompareOld]
GO
CREATE TABLE dbo.DateCompareOld
(
AsOf DATETIME
)
INSERT INTO DateCompareOld
SELECT '01/01/2016' UNION ALL
SELECT '01/01/2016' UNION ALL
SELECT '01/01/2016' UNION ALL
SELECT '01/02/2016' UNION ALL
SELECT '01/02/2016' UNION ALL
SELECT '01/02/2016'
IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[DateCompareNew]') AND TYPE IN (N'U'))
DROP TABLE [dbo].[DateCompareNew]
GO
CREATE TABLE dbo.DateCompareNew
(
AsOf DATETIME
)
INSERT INTO DateCompareNew
SELECT '01/01/2016' UNION ALL
SELECT '01/01/2016' UNION ALL
SELECT '01/01/2016' UNION ALL
SELECT '01/02/2016' UNION ALL
SELECT '01/02/2016' UNION ALL
SELECT '01/02/2016'
SELECT AsOf,COUNT(*) AsOfCount
FROM DateCompareOld
GROUP BY AsOf
Except
SELECT AsOf,COUNT(*) AsOfCount
FROM DateCompareNew
GROUP BY AsOf
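Note that EXCEPT only reports rows from the first query that are missing from the second, so a date that exists only in the new table would not show up. If the per-date counts are pulled into application code instead, both directions can be checked at once. A rough Python sketch of that idea, assuming the dates have been fetched as plain values; `count_mismatches` is a hypothetical helper name.

```python
from collections import Counter

def count_mismatches(old_values, new_values):
    """Return {value: (old_count, new_count)} for every value whose counts differ."""
    old_counts, new_counts = Counter(old_values), Counter(new_values)
    keys = old_counts.keys() | new_counts.keys()
    return {k: (old_counts[k], new_counts[k])
            for k in keys if old_counts[k] != new_counts[k]}

old = ["2016-01-01"] * 3 + ["2016-01-02"] * 3
new = ["2016-01-01"] * 3 + ["2016-01-02"] * 2 + ["2016-01-03"]
print(count_mismatches(old, new))
```

An empty result means every distinct date occurs the same number of times on both sides.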

Unless the date range used by rows in the database is extreme (like dates of astronomical stars being born and dying), it should be just as valid to convert the dates to an integer. This can be done any of several ways and is slightly database-specific, but converting 2016-01-04 to 20,160,104 is going to work fine.
SQL Server can also expose a numeric form of its internal representation, for example with CAST(date_field AS FLOAT) on a datetime column. But this can also be done in a way that does not depend on any internal representation, like
datediff(day, 'January 1, 1901', date_field)
if keeping track of days is sufficient, or
datediff(second, 'January 1, 1901', date_field)
if keeping track of seconds is needed.
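One caveat with the seconds variant: SQL Server's DATEDIFF returns a 32-bit INT, which holds a difference of only about 68 years in seconds, so a 1901 base date overflows for recent dates. Choosing a base date closer to the data, or using DATEDIFF_BIG on SQL Server 2016 and later, avoids that. The arithmetic can be checked with a short Python sketch (the helper `seconds_since` is just for illustration):

```python
from datetime import datetime

INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold

def seconds_since(base, value):
    """Whole seconds between two naive datetimes."""
    return int((value - base).total_seconds())

sample = datetime(2016, 1, 4)
from_1901 = seconds_since(datetime(1901, 1, 1), sample)
from_2000 = seconds_since(datetime(2000, 1, 1), sample)

print(from_1901 > INT32_MAX)  # True: the 1901 base does not fit in 32 bits
print(from_2000 > INT32_MAX)  # False: a base closer to the data does
```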

This may not be much help, but it might be something:
declare @d1 datetime; set @d1 = '2016-01-05 12:09'
declare @d2 datetime; set @d2 = '1970-04-05 07:09'
declare @d3 datetime; set @d3 = '1999-12-12 23:05'
declare @d4 datetime; set @d4 = '1999-12-12 23:06'
declare @i1 bigint
declare @i2 bigint
declare @i3 bigint
declare @i4 bigint
select @i1 = convert( bigint, convert( timestamp, @d1 ) )
select @i2 = convert( bigint, convert( timestamp, @d2 ) )
select @i3 = convert( bigint, convert( timestamp, @d3 ) )
select @i4 = convert( bigint, convert( timestamp, @d4 ) )
select @i1
select @i2
select @i3
select @i4
select @i1 ^ @i2 ^ @i3 ^ @i4
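The XOR approach has the advantage that it cannot overflow and does not depend on row order, but it has a blind spot worth knowing: any value that appears an even number of times cancels itself out. A small Python sketch of both properties, using made-up integer values in place of the converted datetimes:

```python
from functools import reduce
from operator import xor

def xor_checksum(values):
    """XOR all integers together; the result does not depend on row order."""
    return reduce(xor, values, 0)

a = [20160105, 19700405, 19991212]
shuffled = [19991212, 20160105, 19700405]
with_dup_pair = a + [20000101, 20000101]  # the same value added twice

print(xor_checksum(a) == xor_checksum(shuffled))       # True
print(xor_checksum(a) == xor_checksum(with_dup_pair))  # True: the pair cancels out
```

Comparing the row count alongside the XOR value catches the duplicate-pair case shown here.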

I think you could do something like this on the SQL Server side to find the center ("average") value of the column. Then use that value on the Btrieve side to avoid overflow issues where I'm guessing you're more constrained.
-- January 1, 2000 value pulled out of the air as a stab in the dark
select
dateadd(
second,
avg(cast(datediff(second, '20000101', <data>) as bigint)),
'20000101'
) /* find the center */
I wouldn't be surprised if you had to resort to a floating point type with Btrieve or partition your scans into smaller ranges to avoid intermediate sums that get too big. And you might want to use a cursor and randomize the ordering of the rows so you don't hit them in a sorted order that causes an overflow. At this point I'm just speculating since I haven't seen any of the data and my knowledge of Btrieve is so ancient and minimal to begin with.
I also have a feeling that some of this effort is all about satisfying some uneasiness on the part of non-technical stakeholders. I'm sure you could come up with checksums and hashes that would work better but this summing concept is the one they can grasp and will set their minds at ease on top of being somewhat easier to implement quickly.
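For comparison, here is roughly what a hash-based check could look like. A plain sum cannot tell when one row moves forward by a minute and another moves back by a minute; summing per-row hashes can, while still being independent of row order. This is only a sketch under assumptions: both sides must render each value to exactly the same text, and `column_digest` is a hypothetical name.

```python
import hashlib
from datetime import datetime

def column_digest(values):
    """Order-independent digest: add per-value SHA-256 hashes modulo 2**64."""
    total = 0
    for value in values:
        text = value.strftime("%Y-%m-%d %H:%M:%S")
        row_hash = hashlib.sha256(text.encode("ascii")).digest()
        total = (total + int.from_bytes(row_hash[:8], "big")) % 2**64
    return total

old_rows = [datetime(2016, 1, 5, 12, 9), datetime(1999, 12, 12, 23, 5)]
new_rows = [datetime(1999, 12, 12, 23, 5), datetime(2016, 1, 5, 12, 9)]
changed = [datetime(2016, 1, 5, 12, 10), datetime(1999, 12, 12, 23, 4)]

print(column_digest(old_rows) == column_digest(new_rows))  # same rows, different order
print(column_digest(old_rows) == column_digest(changed))   # offsetting one-minute changes
```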

Related

UPDATE SET FORMAT not working in SQL Server 2016?

FORMAT instruction works in a SELECT but has no effect in an UPDATE:
SELECT @@VERSION
DROP TABLE IF EXISTS #t;
CREATE TABLE #t(DateMin datetime);
INSERT INTO #t VALUES ('2019-13-01 00:00:00')
SELECT * FROM #t
UPDATE #t SET DateMin = FORMAT(DateMin, 'dd/MM/yyyy');
SELECT * FROM #t;
SELECT @DateMin AS a, FORMAT(@DateMin, 'dd/MM/yyyy') AS b
A type like DATETIME isn't stored with a format.
So if one updates a DATETIME with a string in a certain format, it doesn't matter for the stored value in the DATETIME field.
The formatted string is implicitly converted to a datetime. At least if it's in a format that's valid.
The FORMAT function, which returns an NVARCHAR, is meant for presenting a datetime value in a query.
It can also be used to INSERT/UPDATE a string field with a datetime in a certain format. But that should be avoided, because it's much easier to work with a datetime than a string.
If you want to change that format for the user use this:
set dateformat dmy;
By running this statement:
DBCC USEROPTIONS;
you will see your dateformat is ydm, so you can always set it back to that if this is not what you wanted :)
You cannot set the output format of a datetime in the datetime itself.
If you need to output the datetime as formatted char/varchar, you need to use the convert-function when you select the data:
SELECT CONVERT(char(10), CURRENT_TIMESTAMP, 101) -- format: MM/dd/yyyy
SELECT CONVERT(char(10), CURRENT_TIMESTAMP, 103) -- format: dd/MM/yyyy
In your case:
SELECT @DateMin AS a, CONVERT(char(10), @DateMin, 103) AS b
That works as expected.
If you want to have a mutable data-type, you need to declare it as sql_variant:
DROP TABLE IF EXISTS #t;
CREATE TABLE #t(DateMin sql_variant);
INSERT INTO #t VALUES ('2019-01-13T00:00:00')
UPDATE #t SET DateMin = FORMAT(CAST(DateMin AS datetime), 'dd''/''MM''/''yyyy');
SELECT * FROM #t;
Also, your format-expression needs to explicitly put the / into quotation marks, aka 'dd''/''MM''/''yyyy', otherwise sql-server replaces it with the date-separator specific to the current culture, which would be . in my case.
Just use convert with option 103 instead, it works on all versions of sql-server and it's probably faster.
Also, your insert-statement fails on some versions of sql-server, because iso-date-format is 2019-01-13T00:00:00 and not 2019-13-01 00:00:00
Correct is:
INSERT INTO #t VALUES ('2019-01-13T00:00:00')
Also
DROP TABLE IF EXISTS #t;
is sql-server 2016+ only, otherwise you need
IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t
And post sql-server 2005, you should use datetime2 instead of datetime.
You shouldn't use datetime anymore, because it is only accurate to about 1/300 of a second: values are rounded to increments of .000, .003, or .007 seconds. That can do funny things when you insert an ISO datetime value, e.g. roll it over to the next day if you have 23:59:59.999, just as a scary example.
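The 23:59:59.999 case comes down to simple rounding arithmetic. The Python sketch below only approximates the behavior, assuming the time of day is kept in 1/300-second ticks and rounded to the nearest tick; `ms_to_ticks` is a hypothetical helper, not anything SQL Server exposes.

```python
TICKS_PER_SECOND = 300                            # assumed datetime resolution
TICKS_PER_DAY = 24 * 60 * 60 * TICKS_PER_SECOND   # 25,920,000 ticks

def ms_to_ticks(ms_since_midnight):
    """Round milliseconds since midnight to the nearest 1/300-second tick."""
    return (ms_since_midnight * 3 + 5) // 10

end_of_day_999 = 86_399_999  # 23:59:59.999 in milliseconds
end_of_day_998 = 86_399_998  # 23:59:59.998 in milliseconds

print(ms_to_ticks(end_of_day_999) == TICKS_PER_DAY)  # True: rounds up to midnight of the next day
print(ms_to_ticks(end_of_day_998) == TICKS_PER_DAY)  # False: stays on the same day
```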
I advise you to never use the sql_variant type. If you have a temp-table with defined columns, just create another column where you will write the char/varchar value to.

Converting SQL dates to different formats

I have a task which should be easy: just converting dates into a specific format
2015-11-16T20:34:19+08:00
(yyyy-mm-ddThh:mm:ss+[timezone offset]), which I later export to an Excel template that requires this format.
Looking at the database table where all the data is stored I noticed the column where the dates are stored under is of Varchar(20) datatype. As far as I know, it's a bad thing to save dates like that.
So basically what I need is to convert the following:
SELECT TIMESTAMP AS LASTCHANGEDATE FROM TABLE1
To a yyyy-mm-ddThh:mm:ss+[timezone offset] format, but TIMESTAMP has the datatype of varchar(20)
Anyone can help with this?
EDIT
The dates are stored atm like this 23.12.2015 17:08:18
In SQL Server it is something like this:
EDIT: Try it like this:
DECLARE @dtString VARCHAR(100) = '23.12.2015 17:08:18';
DECLARE @dt DATETIME = CONVERT(DATETIME, @dtString, 104);
SELECT CONVERT(VARCHAR(100),@dt,126)+'+08:00';
The reason why I tried the direct cast was your "The dates are stored atm like this". I thought, if they occur in different formats, it might be better not to specify one...
Old Code:
DECLARE @dtString VARCHAR(100) = '23.12.2015 17:08:18';
DECLARE @dt DATETIME = CAST(@dtString AS DATETIME);
SELECT CONVERT(VARCHAR(100),@dt,126)+'+08:00';
EDIT: the third parameter of CONVERT is 126. This will create an ISO 8601 compliant date equivalent
The result:
2015-12-23T17:08:18+08:00
EDIT: According to your comment you might implement this like here.
DECLARE @tbl TABLE(TimeStamp VARCHAR(100),item INT);
INSERT INTO @tbl VALUES
('23.12.2015 17:08:18',1123)
,('23.12.2015 19:08:18',1123)
,('24.12.2015 17:08:18',1123)
,('22.12.2015 19:08:18',3233)
SELECT item, CONVERT(VARCHAR(100),CONVERT(DATETIME,TimeStamp,104),126)+'+08:00' AS ConvertedDate
FROM @tbl
WHERE item IN (1123,3233,2342);
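Since the values end up in an Excel export anyway, the conversion could also be done in whatever code writes the export. A minimal Python sketch of the same transformation, assuming the fixed +08:00 offset used above; the function name `to_iso_with_offset` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

def to_iso_with_offset(text, offset_hours=8):
    """Parse 'dd.mm.yyyy hh:mm:ss' and render it as ISO 8601 with a fixed UTC offset."""
    parsed = datetime.strptime(text, "%d.%m.%Y %H:%M:%S")
    zone = timezone(timedelta(hours=offset_hours))
    return parsed.replace(tzinfo=zone).isoformat()

print(to_iso_with_offset("23.12.2015 17:08:18"))  # 2015-12-23T17:08:18+08:00
```

Like the T-SQL version, this attaches the offset as a label and does not shift the clock time.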

How to compare smalldatetime in stored procedure

I'm writing a stored procedure to compare dates, but it's not working properly. How can I make it compare only the dates and not the times? What I'm trying to do is compare the times and, if the ID is null, then insert a new entry with the same name but a new time. I'm keeping multiple entries with the same name but different test times.
ALTER PROCEDURE [dbo].[UL_TestData]
(
@Name varchar(30),
@Test_Time smalldatetime,
@ID INT output
)
AS
Declare @UpdateTime smalldatetime
SELECT @ID=ID FROM Info_User WHERE Name=@Name AND UpdateTime= @Test_Time
IF(@ID IS NULL)
BEGIN
INSERT INTO Info_User (Name, UpdateTime) VALUES (@Name, @UpdateTime)
END
There are a lot of solutions to this depending on the type of DBMS; however, here is one:
SELECT @ID=ID FROM Info_User WHERE Name=@Name AND floor(cast(UpdateTime as float))= floor(cast(@Test_Time as float))
This works because casting a smalldatetime to float yields the date as the whole-number part and the time as the fractional part.
I would cast the dates to a plain date which makes this solution independent of implementation details
select @ID=ID
from info_user
where Name = @Name
and cast (UpdateTime as Date) = Cast(@Test_Time as Date)
However, I would either add the date part of the UpdateTime as an additional (calculated) column or split the information into a date and a time part. This makes it much easier to query entries by the plain date.
As a rule of thumb: The type of columns (in general: the table layout) greatly depends on the type of query you usually run against your data.
Edit: As attila pointed out, the date datatype only exists in version 2008 and up

Using SQL 2005 trying to cast 16 digit Varchar as Bigint error converting

First, thanks for all your help! You really make a difference, and I GREATLY appreciate it.
So I have a Varchar column and it holds a 16 digit number, example: 1000550152872026
select *
FROM Orders
where isnumeric([ord_no]) = 0
returns: 0 rows
select cast([ord_no] as bigint)
FROM Progression_PreCall_Orders o
order by [ord_no]
returns: Error converting data type varchar to bigint.
How do I get this 16 digit number into a math datatype so I can add and subtract another column from it?
UPDATE: Found scientific notation stored as varchar ex: 1.00054E+15
How do I convert that back into a number then?
DECIMAL datatype seems to work fine:
DECLARE @myVarchar AS VARCHAR(32)
SET @myVarchar = '1000550152872026'
DECLARE @myDecimal AS DECIMAL(38,0)
SET @myDecimal = CAST(@myVarchar AS DECIMAL(38,0))
SELECT @myDecimal + 1
Also, here's a quick example where IsNumeric returns 1 but converting to DECIMAL fails:
DECLARE @myVarchar AS VARCHAR(32)
SET @myVarchar = '1000550152872026E10'
SELECT ISNUMERIC(@myVarchar)
DECLARE @myDecimal AS DECIMAL(38,0)
SET @myDecimal = CAST(@myVarchar AS DECIMAL(38,0)) --This statement will fail
EDIT
You could try to CONVERT to float if you're dealing with values written in scientific notation:
DECLARE @Orders AS TABLE(OrderNum NVARCHAR(64), [Date] DATETIME)
INSERT INTO @Orders VALUES('100055015287202', GETDATE())
INSERT INTO @Orders VALUES('100055015287203', GETDATE())
INSERT INTO @Orders VALUES('1.00055015287E+15', GETDATE()) --sci notation
SELECT
CONVERT(FLOAT, OrderNum, 2) +
CAST(REPLACE(CONVERT(VARCHAR(10), GETDATE(), 120), '-', '') AS FLOAT)
FROM @Orders
WITH validOrds AS
(
SELECT ord_no
FROM Orders
WHERE ord_no NOT LIKE '%[^0-9]%'
)
SELECT cast(validOrds.ord_no as bigint) as ord_no
FROM validOrds
LEFT JOIN Orders ords
ON ords.ord_no = validOrds.ord_no
WHERE ords.ord_no is null
Take a look at this link for an explanation of why isnumeric isn't functioning the way you are assuming it would: http://www.sqlservercentral.com/articles/IsNumeric/71512/
Take a look at this link for an SO post where a user has a similar problem as you:
Error converting data type varchar
hence, you should always use the correct datatype for each column unless you have a very specific reason to do so otherwise... Even then, you'll need to be extra careful when saving values to the column to ensure that they are indeed valid values
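If the values can be cleaned up outside the database before being written back, a decimal parser that accepts both plain digits and scientific notation makes the check explicit. A hedged Python sketch; `parse_order_number` is a hypothetical helper, and it cannot restore digits that were already lost when a value was stored as something like 1.00054E+15.

```python
from decimal import Decimal, InvalidOperation

def parse_order_number(text):
    """Return the value as an int, or None if it is not a whole number."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return None
    # Reject NaN/Infinity and anything with a fractional part.
    if not value.is_finite() or value != value.to_integral_value():
        return None
    return int(value)

print(parse_order_number("1000550152872026"))  # 1000550152872026
print(parse_order_number("1.00054E+15"))       # 1000540000000000
print(parse_order_number("12A4"))              # None
```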

How to enter a Date into a table in TSQL? (Error converting data type varchar to datetime)

I want to enter 30/10/1988 as the date to a DOB column in a table using a procedure
alter procedure addCustomer
@userName varchar(50),
@userNIC varchar(50),
@userPassword varchar(100),
@userDOB datetime,
@userTypeID int,
@userEmail varchar(50),
@userTelephone int,
@userAddress char(100),
@userCityID int,
@status int output
as
declare @userID int
declare @eid int
declare @tid int
declare @aid int
execute getLastRaw 'userID','tblUserParent', @userID output
insert into tblUserParent values (@userID, @userName, @userNIC, @userPassword, @userDOB, @userTypeID)
execute getLastRaw 'addressID','tblAddress', @aid output
insert into tblAddress values (@aid, @userAddress, @userID, @userCityID)
execute getLastRaw 'emailID','tblEmail', @eid output
insert into tblEmail values (@eid, @userEmail, @userID)
execute getLastRaw 'telephoneID','tblTelephoneNO', @tid output
insert into tblTelephoneNO values (@tid, @userTelephone , @userID)
insert into tblUserCustomer values (@userID, @eid , @tid, @aid)
...but it gives an error when i enter like this '30/10/1988'
Msg 8114, Level 16, State 5, Procedure addCustomer, Line 0 Error converting data type varchar to datetime.
...but when I enter like only the 30/10/1988
Incorrect syntax near '/'
How do I fix this?
If you would truly like to avoid the possibility of ambiguous dates, then you should always enter them in one of the two unambiguous date formats. An answer has already been selected and it's valid, but I'm a believer in spreading the knowledge ;)
As noticed by @cloud, my original post (from a younger and less wise me) was a link-only answer, so I'll pop in the contents of the archive of Jamie Thompson's answer on unambiguous date formats in T-SQL.
tl;dr;
yyyy-MM-ddTHH24:mi:ss
yyyyMMdd HH24:mi:ss
One of the most commonly used data types in SQL Server is [datetime], which unfortunately has some vagaries around how values get casted. A typical method for defining a [datetime] literal is to write it as a character string and then cast it appropriately. The cast syntax looks something like this:
DECLARE @dt NVARCHAR(19) = '2009-12-08 18:00:00';
SELECT CAST(@dt AS datetime);
Unfortunately in SQL Server 2005 the result of the cast operation may be dependent on your current language setting. You can discover your current language setting by executing:
SELECT @@LANGUAGE
To demonstrate how your language setting can influence the results of a cast take a look at the following code:
ALTER DATABASE tempdb
SET COMPATIBILITY_LEVEL = 90 ; --Behave like SQL Server 2005
USE tempdb
GO
DECLARE @t TABLE (
dateString NVARCHAR(19)
);
INSERT @t (dateString)
VALUES ('2009-12-08 18:00:00') --'yyyy-MM-dd hh24:mi:ss'
, ('2009-12-08T18:00:00') --'yyyy-MM-ddThh24:mi:ss'
, ('20091208 18:00:00') --'yyyyMMdd hh24:mi:ss'
SET LANGUAGE french;
SELECT 'french' AS lang
, DATENAME(MONTH,q.[dt]) AS mnth
, q.[dt]
FROM (
SELECT CAST(dateString AS DATETIME) AS dt
FROM @t
)q;
SET LANGUAGE us_english;
SELECT 'us_english' AS lang
, DATENAME(MONTH,q.[dt]) AS mnth
, q.[dt]
FROM (
SELECT CAST(dateString AS DATETIME) AS dt
FROM @t
)q;
We are taking the value which can be described in words as "6pm on 8th December 2009", defining it in three different ways, then seeing how the @@LANGUAGE setting can affect the results. Here are those results:
[image: french language datetime]
Notice how the interpretation of the month can change depending on @@LANGUAGE. If @@LANGUAGE='french' then the string '2009-12-08 18:00:00' is interpreted as 12th August 2009 ('août' is French for August for those that don't know) whereas if @@LANGUAGE='us_english' it is interpreted as 8th December 2009. Clearly this is a problem because the results of our queries have a dependency on a server-level or connection-level setting and that is NOT a good thing. Hence I recommend that you only define [datetime] literals in one of the two unambiguous date formats:
yyyy-MM-ddTHH24:mi:ss
yyyyMMdd HH24:mi:ss
That was going to be the end of this blog post but then I found out that this behaviour changed slightly in SQL Server 2008. Take the following code (see if you can figure out what the results will be before I tell you):
ALTER DATABASE tempdb
SET COMPATIBILITY_LEVEL = 100 ; --Behave like SQL Server 2008
GO
USE tempdb
GO
SET LANGUAGE french;
DECLARE @dt NCHAR(10) = '2009-12-08 18:00:00'; --Ambiguous date format
SELECT CAST(@dt AS datetime) AS [ExplicitCast]
, DATENAME(MONTH,@dt) AS [MonthFromImplicitCast]
, DATENAME(MONTH,CAST(@dt AS datetime)) AS [MonthFromExplicitCast];
Here we are doing three different things with our nchar literal: explicitly cast it as a [datetime]; extract the month name from the char literal using the DATENAME function (which results in an under-the-covers implicit cast); and extract the month name from the char literal using the DATENAME function after it has been explicitly casted as a [datetime]. Note that the compatibility level is set to SQL Server 2008 and @@LANGUAGE='french'. Here are the results:
[image]
(Were you correct?) Let's take a look at what is happening here. The behaviour when we are explicitly casting as [datetime] hasn't changed, our nchar literal is still getting interpreted as 12th August rather than 8th December when @@LANGUAGE='french'. The [MonthFromImplicitCast] field is interesting though, it seems as though the implicit cast has resulted in the desired value of 8th December. Why is that? To get the answer we can turn to BOL's description of the DATENAME function syntax:
[image]
The implicit cast is not casting to [datetime] at all, it is actually casting to [date] which is a new datatype in SQL Server 2008. The new date-related datatypes in SQL Server 2008 (i.e. [date], [datetime2], [time], [datetimeoffset]) disregard @@LANGUAGE and hence we get behaviour that is more predictable and, frankly, better.
These new behaviours for SQL Server 2008 were unknown to me when I began this blog post so I have learnt something in the course of authoring it, I hope it has helped you too. No doubt someone somewhere is going to get nastily burnt by this at some point, make sure that it isn't you by always using unambiguous date formats:
yyyy-MM-ddTHH24:mi:ss
yyyyMMdd HH24:mi:ss
regardless of which version you are on!
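The following works in MySQL, and in SQL Server for the date and datetime2 types, without ambiguity: yyyy-mm-dd. (For a SQL Server datetime column the interpretation of yyyy-mm-dd can depend on the language setting; yyyymmdd avoids that.) Like so: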
INSERT INTO TableName(DateColumn) VALUES ('1988-10-30');
...as an added benefit there's no question of whether it's a US or European style date on days like the fourth of March...
See if there is a culture setting that you can change to allow you to use dd/mm/yyyy. I believe it is expecting mm/dd/yyyy.
A potentially easy way around the problem is to use a date format with no ambiguity between mm/dd/yyyy and dd/mm/yyyy such as dd-mmm-yyyy, eg: 30-OCT-1988
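If the date arrives from an application as dd/mm/yyyy text, another option is to parse it there with an explicit format and hand SQL Server an unambiguous yyyyMMdd string (or, better, a typed parameter). A small Python sketch of why the explicit format matters; the variable names are illustrative only.

```python
from datetime import datetime

# The same text means two different dates depending on the assumed order.
text = "04/03/2016"
as_day_first = datetime.strptime(text, "%d/%m/%Y").date()
as_month_first = datetime.strptime(text, "%m/%d/%Y").date()
print(as_day_first.isoformat())    # 2016-03-04
print(as_month_first.isoformat())  # 2016-04-03

# Parse the DOB with the format it was typed in, then emit yyyyMMdd.
dob = datetime.strptime("30/10/1988", "%d/%m/%Y").date()
print(dob.strftime("%Y%m%d"))      # 19881030
```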