CAST and CONVERT both failing when attempting to convert string to date - sql

I'm dealing with a table containing records from questionnaires administered to people after completing an activity. There are several questions on the questionnaire, so each person has multiple records with the same collection date, like so.
PersonID  Question         Result      CollectedDate
-------------------------------------------------------------
1001      First activity?  Yes         10/23/2022
1001      Activity date    10/20/2022  10/23/2022
1001      Activity type    Painting    10/23/2022
1002      First activity?  No          10/24/2022
1002      Activity date    10/23/2022  10/24/2022
1002      Activity type    Writing     10/24/2022
Since my end goal is to compare the activity date with the questionnaire collection date and see how much time elapsed between them, I've altered my query a bit so I'm focusing only on each person's question regarding the activity date. It's a super simple query:
SELECT
PersonID,
Question,
Result,
CollectedDate
FROM Questionnaire
WHERE Question LIKE '%date%'
PersonID  Question       Result      CollectedDate
-------------------------------------------------------------
1001      Activity date  10/20/2022  10/23/2022
1002      Activity date  10/23/2022  10/24/2022
My main issue is that the Result field is varchar(50) in order to accommodate text answers, so any dates seen there are actually from free text fields in the front-end interface. I've tried using both CAST() and CONVERT() to turn it into an actual date format so the difference between the dates can be calculated. I've seen both of the following errors depending on which function I'm using or which date/time style I'm attempting to apply:
Conversion failed when converting date and/or time from character string
The conversion of a varchar data type to a datetime data type resulted in an out-of-range value
I've tried:
SELECT
PersonID,
Question,
CAST(Result as date),
CollectedDate
FROM Questionnaire
WHERE Question LIKE '%date%'
and...
SELECT
PersonID,
Question,
CONVERT(DATETIME,Result,101) as Result,
CollectedDate
FROM Questionnaire
WHERE Question LIKE '%date%'
...and have tried several different styles. Does anyone have any further suggestions? Is the date itself likely the problem, or is it the fact that the Result field contains a bunch of other stuff too, even though it's currently omitted from the query results?
UPDATE: There are some kind of wonky date formats in this Result field even when I have the other question types filtered out (I hate free text). For example, there are some formatted like 05/01/2022 and others like 5/1/2022. Some others have something like 5/19/2022 - 5/20/2022, like maybe the person couldn't remember the exact date of their activity. What's the best way to deal with all of this?

You should be able to get past the error by making sure you reject any value that can't be converted to a date. Largely, that is this:
Result = CASE
WHEN ISDATE(Result) = 1 THEN CONVERT(date, Result, 101) END
You'd think it would be enough to say WHERE Question = 'Activity Date' AND ISDATE(Result) = 1, but:
Someone still might have entered bad data on that row.
SQL Server might try to perform the CONVERT() operation before the filter.
You can identify the ones that have bad data using:
WHERE Question = 'Activity Date' AND ISDATE(Result) = 0
But until you've fixed the structure and stored dates in an independent column, fixing that data just means it's a matter of time before it happens again.
You might consider, in the meantime, just displaying what the user entered as a string, instead of trying to force it to be converted to a date. Especially since 101 might be a bad guess - what if the user is from the UK or Canada? They may have entered 05/12 and meant December 5th, not May 12th.
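A related option, if you are on SQL Server 2012 or later, is TRY_CONVERT(), which returns NULL instead of raising an error when a value can't be converted. This is only a sketch under assumptions: it assumes that version, and it assumes CollectedDate is already stored as a date or datetime column (the question doesn't say). The RawResult, ActivityDate and DaysElapsed aliases are illustrative names, and style 101 is still the same mm/dd/yyyy guess discussed above.

```sql
-- T-SQL sketch; assumes SQL Server 2012+ and a date/datetime CollectedDate column.
SELECT
    PersonID,
    Question,
    Result AS RawResult,                             -- keep what the user typed
    TRY_CONVERT(date, Result, 101) AS ActivityDate,  -- NULL when not a valid mm/dd/yyyy date
    CollectedDate,
    DATEDIFF(day, TRY_CONVERT(date, Result, 101), CollectedDate) AS DaysElapsed
FROM Questionnaire
WHERE Question = 'Activity date';
```

Rows such as 5/19/2022 - 5/20/2022 come back with a NULL ActivityDate and DaysElapsed while the raw text stays visible, so they can be reviewed by hand rather than failing the whole query.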

Related

Include records occurring within a date period

50 records in databank. I prepared a query to select contracts with ending dates between 1/1/2019 and 12/31/2020. Some of the records have dates outside the 12/31/2020; 12/31/2021. I want those records included as they were active during the queried period.
The between query only returns records with the ending date of 12/31/2020. I changed the criteria to end period between 1/1/2019 and 12/31/2021 and not 12/31/2022. That returns records before the end date of 12/31/20 and outside the start of the end period of 1/1/2019.
I've tried about 10 other things (can't remember all of them), but regardless, I am not getting the results I need.
I'm not VBA/SQL friendly, I'm a query kind of user. Sorry if that makes my question a little more difficult.
Thank you soooo much for any direction you can give me!!
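select *, DATE_FORMAT(datetime_column, '%m/%d/%Y') from table_name where datetime_column between '2019-01-01' and '2020-12-31'
I think the date format is why the query doesn't return the correct results. You could convert the date to this format and check the result. Here datetime_column and table_name are placeholders for your own names, and the date literals are written as year-month-day because that is the form MySQL (where DATE_FORMAT comes from) parses reliably.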
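If the goal is every contract that was active at some point during the period, a common pattern is an overlap test rather than BETWEEN on the end date alone: the contract must start on or before the period ends, and end on or after the period starts. This is a sketch only. The Contracts table and its StartDate and EndDate columns are hypothetical names, since the question doesn't list its columns, and a start date column is assumed to exist.

```sql
-- Generic SQL sketch; Contracts, StartDate and EndDate are hypothetical names.
-- Period of interest: 2019-01-01 through 2020-12-31.
SELECT *
FROM Contracts
WHERE StartDate <= '2020-12-31'   -- started before the period ended
  AND EndDate   >= '2019-01-01';  -- ended after the period began
```

A contract ending 12/31/2021 passes this test as long as it started on or before 12/31/2020. In Access, date literals are written between # signs (for example #12/31/2020#) instead of quoted strings.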

There's duplicated query result in Microsoft Access while checking for time overlapping

I got a table with a huge list of equipment booking details. I wrote a SQL Query to display the desired result that I wanted: A type of the equipment with time overlapping of booking.
So I check for the time overlapping by duplicating my table in order for it to check against each other.
The results I got are kind of repetitive.
For instance,
May CLASHES Claire
May CLASHES Sherene
Claire CLASHES May
Claire CLASHES Sherene
Sherene CLASHES May
Those in bold are repetitive.
How can I modify my SQL query in order to resolve the issue?
Please kindly advise. Thank you!
SELECT DISTINCT *
FROM 2015, 2015 AS 2015_1
WHERE ([2015].Equipment Like '*Video cam*' Or [2015].Equipment Like '*video recorder*' Or [2015].Equipment Like '*camcorder*')
AND ([2015_1].Equipment Like '*Video cam*' Or [2015_1].Equipment Like '*video recorder*' Or [2015_1].Equipment Like '*camcorder*')
AND ([2015].[Loaned By]<>[2015_1].[Loaned By])
AND ([2015_1].[Start Time]<=[2015].[End Time])
AND ([2015_1].[End Time] Is Null Or [2015_1].[End Time]>=[2015].[Start Time]);
EDIT
My table is called 2015.
The variables are (Field Name - Data Type):
ID - Number
Loaned By - Text
Equipment - Text
Start Date - Date/Time
Start Time - Date/Time
End Date - Date/Time
End Time - Date/Time
Durations (hours) - Number
You can add the following condition:
[2015].EquipmentType < [2015_1].EquipmentType
This will order them alphabetically.
Your question doesn't have enough information to clearly specify the column.
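Going by the field list in the question's edit, there is no EquipmentType field, so one candidate column for this comparison is [Loaned By]. The following is a sketch of the question's query with that change, not a tested solution. Because only one ordering of each pair is now examined, the null check on End Time is applied to both copies of the table, which is an addition to the original query.

```sql
-- Access SQL sketch: "<" replaces "<>" so each pair of borrowers is examined once.
SELECT DISTINCT *
FROM [2015], [2015] AS [2015_1]
WHERE ([2015].Equipment Like '*Video cam*' Or [2015].Equipment Like '*video recorder*' Or [2015].Equipment Like '*camcorder*')
  AND ([2015_1].Equipment Like '*Video cam*' Or [2015_1].Equipment Like '*video recorder*' Or [2015_1].Equipment Like '*camcorder*')
  AND ([2015].[Loaned By] < [2015_1].[Loaned By])
  -- overlap test, written symmetrically so an open-ended booking on either side still matches
  AND ([2015].[End Time] Is Null Or [2015_1].[Start Time] <= [2015].[End Time])
  AND ([2015_1].[End Time] Is Null Or [2015_1].[End Time] >= [2015].[Start Time]);
```

If the same two people can clash on more than one booking, comparing [2015].ID < [2015_1].ID instead would keep each pair of bookings once rather than each pair of names.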

applying knowledge of SQL for everyday workplace activities

My question is how to properly write a SQL query for the below highlighted/bold question.
There is a table in HMO database which stores doctor's working hours. Table has following fields: "FirstName", "LastName", "Date", "HoursWorked". Write a SQL statement which retrieves average working hours for period January-March for a doctor with name Joe Doe.
So far I have:
SELECT HoursWorked
FROM Table
WHERE DATE = (January - March) AND
SELECT AVG(HoursWorked) FROM Table WHERE FirstName="Joe",LastName="Doe"*
A few pointers as this sounds like a homework question (which we don't answer for you here, but we can try to give you some guidance).
You want to put all the things you want to return from your select first and you want to have all your search conditions at the end.
So the general format would be :
SELECT Column1,
       Column2,
       Column3
FROM YourTable
WHERE Column4 = Restriction1
AND Column5 = Restriction2
The next thing you need to think about is how the dates are formatted in your database table. Hopefully they're kept in a column of type datetime or date (options will depend on the database engine you're using, e.g. Microsoft SQL Server, Oracle or MySQL). In reality, some older databases can store dates in all sorts of formats, which makes this much harder, but since I'm assuming it's a homework type question, let's assume it's a datetime format.
You specify restrictions by comparing columns to a value, so if you wanted all rows where the date was after midnight on the 2nd of March 2012, you would have the WHERE clause :
WHERE MyDateColumn >= '2012-03-02 00:00:00'
Note that to avoid confusion, we usually try to format dates as "Year-Month-Day Hour:Minute:Second". This is because in different countries, dates are often written in different formats and this is considered a Universal format which is understood (by computers at least) everywhere.
So you would want to combine a couple of these comparisons in your WHERE, one for dates AFTER a certain date in time AND one for dates before another point in time.
If you give this a go and see where you get to, update your question with your progress and someone will be able to help get it finished if you have problems.
If you don't have access to an actual database and need to experiment with syntax, try this site : http://sqlfiddle.com/
You already have most of the answer written:
SELECT AVG(HoursWorked) FROM Table WHERE FirstName="Joe",LastName="Doe"*
You only need to fix the query:
SELECT AVG(HoursWorked) as AVGWORKED FROM Table WHERE FirstName='Joe' AND LastName='Doe'
That query will give you the average hours worked for Joe Doe. To restrict it to a period of time, add another "AND" condition. If you are using SQL Server, you can use the built-in function DateFromParts(year,month,day) to create a date, or you can convert a string to a date with Convert(Date,'MM/dd/yyyy') (the exact conversion function varies between database engines).
Example:
SELECT AVG(HoursWorked) as AVGWORKED FROM Table WHERE FirstName='Joe' AND LastName='Doe' AND DateColumn between DateFromParts(year,month,day) and Convert(Date,'MM/dd/yyyy')
In the example I showed both approaches (DateFromParts for the start date, and Convert(Date) for the end date).
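With the placeholders filled in, the query might look like the sketch below. It assumes SQL Server 2012 or later (for DATEFROMPARTS). The year 2012 and the table name DoctorHours are hypothetical, since the exercise gives neither, and the [Date] column is bracketed because Date is also a type name.

```sql
-- T-SQL sketch; DoctorHours and the year 2012 are hypothetical.
SELECT AVG(HoursWorked) AS AvgHoursWorked
FROM DoctorHours
WHERE FirstName = 'Joe'
  AND LastName  = 'Doe'
  AND [Date] >= DATEFROMPARTS(2012, 1, 1)   -- from 1 January (inclusive)
  AND [Date] <  DATEFROMPARTS(2012, 4, 1);  -- up to, but not including, 1 April
```

The half-open range (>= start, < day after the end) covers all of March even if the column stores a time of day. Note that in SQL Server, AVG over an integer column returns an integer, so HoursWorked may need a cast to a decimal type first if fractional averages matter.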

SQL Like function Broken? or Limited?

I am trying to use the LIKE function to get data with similar names. Everything looks fine but the data I get in return is missing some values when I get back more than ~20 rows of data.
I have a very basic query. I just want data that starts with Lab, ideally for the whole day, or at least 12 hours. The code below misses some data and I cannot discern a pattern for what it picks to skip.
SELECT History.TagName, DateTime, Value FROM History
WHERE History.TagName like ('Lab%')
AND Quality = 0
AND wwRetrievalMode = 'Full'
AND DateTime >= '20150811 6:00'
AND DateTime <= '20150811 18:00'
To give you an idea of the data I am pulling, I have Lab.Raw.NTU, Lab.Raw.Alk, Lab.Sett.NTU, etc. Most of the data should have values at 6am/pm, 10am/pm, and 2am/pm. Some have more, a few have less, not important. When I change the query to be more specific (i.e. only a 1 hour window or LIKE "Lab.Raw.NTU") I get all of my data. Currently, this will spit out data for all tags and I get both 6am data and 6pm data, but certain values will be missing, such as Lab.Raw.NTU at 6pm. There seems to be other data missing if I change the window to the previous day or the night shift, so I don't think it has to do with the data itself. Something weird is going on with the LIKE function but I have no idea what.
Is there another way to get the tagnames that I want besides like? Such as Tagname > Lab and Tagname <= Labz? (that gives me an error, so I am thinking not)
Please help.
It appears that you are using the Like operator correctly; that could be a red herring. Check the data type of the DateTime field. If it is character based such as varchar you are doing string comparisons instead of date comparisons, which could cause unexpected results. Try doing an explicit cast to ensure they are compared as dates:
DateTime >= convert(datetime, '20150811 6:00')

How to design SQL tables when column data arrives in multiple types/margins of error?

I've been given a stack of data where a particular value has been collected sometimes as a date (YYYY-MM-DD) and sometimes as just a year.
Depending on how you look at it, this is either a variance in type or margin of error.
This is a subprime situation, but I can't afford to recover or discard any data.
What's the optimal (eg. least worst :) ) SQL table design that will accept either form while avoiding monstrous queries and allowing maximum use of database features like constraints and keys*?
*i.e. Entity-Attribute-Value is out.
You could store the year, month and day components in separate columns. That way, you only need to populate the columns for which you have data.
If it comes in as just a year, make it default to 01 for month and day: YYYY-01-01.
This way you can still use a date/datetime datatype and don't have to worry about invalid dates.
Either bring it in as a string unmolested, and modify it so it's consistent in another step, or modify the year-only values during the import like SQLMenace recommends.
I'd store the value in a DATETIME type and another value (just an integer will do, or some kind of enumerated type) that signifies its precision.
It would be easier to give more information if you mentioned what kind of queries you will be doing on the data.
Either fix it, then store it (OK, not an option),
or store it broken with a fixed computed column.
Something like this:
CREATE TABLE ...
...
Broken varchar(20),
Fixed AS CAST(CASE WHEN Broken LIKE '[12][0-9][0-9][0-9]' THEN Broken + '0101' ELSE Broken END AS datetime)
This also allows you to detect good from bad source data
If you don't always have a full date, what sort of keys and constraints would you need? Perhaps store two columns of data; a full date, and a year. For data that has only year, the year is stored and date is null. For items with full info, both are populated.
I'd put three columns in the table:
The provided value (YYYY-MM-DD or YYYY)
A date column, Date or DateTime data type, which is nullable
A year column, as an integer or char(4) depending upon your needs.
I'd always populate the year column, populate the date column only when the provided value is a date.
And, because you've kept the provided value, you can always re-process down the road if needs change.
An alternative solution would be to use a date mask (like in IP). Store the date in a regular datetime field, and add an additional field of type smallint or similar, where you indicate which parts are present (could even go binary here):
If you have YYYY-MM-DD, you would have 3 bits of data, which will have the values 1 if data is present and 0 if not.
Example:
Date        Mask
2009-12-05  7 (111)
2009-12-01  6 (110, only year and month are known, and day is set to default 1)
2009-01-20  5 (101, for some strange reason, only the year and the day are known. January has 31 days, so it will never generate an error)
Which solution is better depends on what you will do with it.
This is better when you want to select those with full dates, which are between a certain period (less to write). Also this way it's easier to compare any dates which have masks like 7,6,4. It may also take up less memory (date + smallint may be smaller than int+int+int, and only if datetime uses 64 bit, and smallint uses up as much as int, it will be the same).
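As a rough illustration of the mask idea, here is a sketch in generic SQL. The events table and its column names are hypothetical, and the bit values (4 = year known, 2 = month known, 1 = day known) are an assumption read off the example above.

```sql
-- Sketch only; table and column names are hypothetical.
-- date_mask bits: 4 = year known, 2 = month known, 1 = day known.
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    event_date DATE NOT NULL,
    date_mask  SMALLINT NOT NULL CHECK (date_mask BETWEEN 1 AND 7)
);

INSERT INTO events (event_id, event_date, date_mask) VALUES (1, '2009-12-05', 7);  -- full date
INSERT INTO events (event_id, event_date, date_mask) VALUES (2, '2009-12-01', 6);  -- year and month only
INSERT INTO events (event_id, event_date, date_mask) VALUES (3, '2009-01-01', 4);  -- year only

-- Full dates only, within a period
SELECT event_id, event_date
FROM events
WHERE date_mask = 7
  AND event_date BETWEEN '2009-01-01' AND '2009-12-31';
```

Queries that can tolerate lower precision simply widen the mask filter, for example date_mask IN (6, 7) for anything known at least to the month.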
I was going to suggest the same solution as @ninesided did above. Additionally, you could have a date field and a field that quantitatively represents your uncertainty. This offers the advantage of being able to represent things like "on or about Sept 23, 2010". The problem is that to represent the case where you only know the year, you'd have to set your date to be the middle of the year, with 182.5 days' uncertainty (assuming non-leap year), which seems ugly.
You could use a similar but distinct approach with a mask that represents what date parts you're confident about - that's what SQLMenace offered in his answer above.
+1 each to recommendations from ninesided, Nikki9696 and Jeff Siver - I support all those answers though none was exactly what I decided upon.
My solution:
a date column used only for complete dates
an int column used for years
a constraint to ensure integrity between the two
a trigger to populate the year if only date is supplied
Advantages:
can run simple (one-column) queries on the date column with missing data ignored (by using NULL for what it was designed for)
can run simple (one-column) queries on the year column for any row with a date (because year is automatically populated)
insert either year or date or both (provided they agree)
no fear of disagreement between columns
self explanatory, intuitive
I would argue that methods using YYYY-01-01 to signify missing data (when flagged as such with a second explanatory column) fail seriously on points 1 and 5.
Example code for Sqlite 3:
create table events
(
    rowid integer primary key,
    event_year integer,
    event_date date,
    -- the check passes when either side is NULL, so year-only and date-only rows are accepted
    check (event_year = cast(strftime('%Y', event_date) as integer))
);
create trigger year_trigger after insert on events
begin
    update events set event_year = cast(strftime('%Y', event_date) as integer)
    where rowid = new.rowid and event_date is not null;
end;
-- various methods to insert
insert into events (event_year, event_date) values (2008, '2008-02-23');
insert into events (event_year) values (2009);
insert into events (event_date) values ('2010-01-19');
-- select events in January without expressions on supplementary columns
select rowid, event_date from events where strftime('%m', event_date) = '01';