I want to start by saying my SQL knowledge is limited (the sololearn SQL basics course is it), and I have fallen into a position where I am regularly asked to pull data from the SQL database for our ERP software. I have been pretty successful so far, but my current problem is stumping me.
I need to filter my results by having the date match from 2 separate tables.
My issue is that one of the tables outputs DATETIME with full time data, e.g. "2022-08-18 11:13:09.000",
while the other table zeros the time data, e.g. "2022-08-18 00:00:00.000".
Is there a way I can on the fly convert these to just a DATE e.g. "2022-08-18" so I can set them equal and get the results I need?
A simple CAST statement should work if I understand correctly.
CAST(dateToConvert AS DATE)
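As a minimal sketch of how that CAST could be used to match the two tables: the table and column names below (Orders, Shipments, CreatedAt, ShipDate) are hypothetical, and the syntax assumes an engine with a DATE type, such as SQL Server 2008 or later.

```sql
-- Hypothetical tables: Orders.CreatedAt carries a full time of day,
-- Shipments.ShipDate has the time zeroed out.
SELECT o.OrderID,
       s.ShipmentID,
       CAST(o.CreatedAt AS DATE) AS MatchDate
FROM Orders AS o
JOIN Shipments AS s
  -- Casting both sides drops the time part, so only the calendar date is compared.
  ON CAST(o.CreatedAt AS DATE) = CAST(s.ShipDate AS DATE);
```

Since the second column already has midnight as its time, casting only the first column would likely give the same result; casting both simply makes the intent explicit.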
I have two tables in a database in AWS Athena that I want to join.
I want to join them by several columns, one of them being date.
However, in one data set the date string for single-digit months is encoded as
"08/31/2018"
While the other would have it encoded as
"8/31/2018"
Is there a way to make them the same format?
I am unsure if it is easier to add the extra 0 to strings which lack the extra 0 or to concatenate strings which have the extra 0.
Based on what I have researched I think I will have to use the CASE and CONCAT functions.
Both of the tables were loaded into the database from a CSV file, and the variables are in the string format.
I have tried changing the values manually in the CSV file, tried running an R script on one of the tables to format the date in the same way, and have also tried re-loading the tables into the database as the same date format.
However, no matter what I do, whenever the data is loaded into the database, even when the columns have the same data type, the dates always end up in different formats:
one with the extra 0 and the other without it.
The last avenue I haven't tried is through a SQL query.
However I am not well versed in Athena and am having a hard time formatting this query.
I know this is rather vague, so please ask me for more information if you need.
If someone could help me start this query I would be grateful.
Thank you for the help.
Here is the query for changing dates in Athena.
date_parse(table.date_variable,'%m/%d/%Y')
Note, though, that Athena tables are immutable once created, so this converts the value at query time rather than changing the stored data.
You can convert the value to date using date_parse(). So, this should work:
date_parse(t1.datecol, '%m/%d/%Y') = date_parse(t2.datecol, '%m/%d/%Y')
Having said that, you should fix the data model. Store dates as dates not as strings! Then you can use an equality join and that is just better all around.
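A minimal sketch of a full join built on that condition might look like the following. The table names (table_a, table_b) and the extra join column (store_id) are hypothetical, and it assumes date_parse accepts both "08/31/2018" and "8/31/2018" with the same format string, as the answers above suggest.

```sql
-- Hypothetical tables and columns; datecol is a string in both tables.
SELECT a.store_id,
       date_parse(a.datecol, '%m/%d/%Y') AS parsed_date
FROM table_a AS a
JOIN table_b AS b
  ON a.store_id = b.store_id
  -- Parse both strings so the comparison is on the date, not its formatting.
 AND date_parse(a.datecol, '%m/%d/%Y') = date_parse(b.datecol, '%m/%d/%Y');
```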
In Netezza, I am trying to check whether a date value is valid, something like the ISDATE function in SQL Server.
I am getting dates like 11/31/2013, which is not valid. How can I check in Netezza whether a date is valid so that I can exclude the invalid ones from my process?
Thanks
I don't believe there is a built-in Netezza function to check if a date is valid. You may be able to write a Lua function to do this, or you could try joining to a "Date" lookup table, like so:
Create a table with two columns:
DATE_VALUE date
DATE_STRING varchar(10)
Load data into this table for valid dates (generate a file in your favorite tool: Excel, Unix, whatever). There can even be more than one row per DATE_VALUE (different "valid" formats) if all you use this for is this check. If you fill in from, say, 1900 to 2100, as long as your data is within that range, you'll be fine. And it's a small table, too: ~200 years is only ~73,000 rows. Add more if needed. Heck, since the NZ date datatype goes from AD 1 to AD 9999, you could fill it completely with only about 3.65 million rows (small for NZ).
Then, to isolate rows that have invalid dates, just use a JOIN or an EXISTS / NOT EXISTS to this table, on DATE_STRING. Since the table is so small, netezza will likely broadcast it to all SPUs, making the performance impact trivial.
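The lookup-table approach described above could be sketched roughly as follows. The names valid_dates, staging_table and date_text are hypothetical, and the table is assumed to have been loaded with one row per valid date string already.

```sql
-- Lookup table with the two columns described above.
CREATE TABLE valid_dates (
    date_value  DATE,
    date_string VARCHAR(10)
);

-- Rows whose date string has no match in the lookup table are treated as invalid
-- (for example '11/31/2013', which would never have been loaded into valid_dates).
SELECT s.*
FROM staging_table AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM valid_dates AS v
    WHERE v.date_string = s.date_text
);
```

Switching NOT EXISTS to EXISTS (or to an inner join) would keep only the rows with valid dates instead.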
Netezza Analytics Package 3.0 (free download) comes with a couple of Lua functions that verify date values: isdate() and todate(). Very simple to install / compile.
I was wondering if there was a way to store a date (example: 01/01/2013) as datetime without SQL Server CE adding the time (example: 12:00:00 AM).
I could always store it as the string "01/01/2013" but I really want to be able to compare the dates on querying the database.
I realize that as long as I only stored the date part, all of the times in the datetime field would have equal values (i.e. 12:00:00 AM), so comparing them wouldn't be a problem and I could just always ignore the time part, however, it seems ridiculous to have this unnecessary data appended to every entry in the table.
Is there a way to store only the date part of the datetime as datetime so that the dates can still be compared in the SQL query or do I just need to live with this overhead and move on?
Side Note:
I just spent the last 30 minutes searching Google and SO for an answer I was sure was already out there, but to my surprise, I couldn't find anything on this issue.
Update:
The conclusion I have come to is that I will just accept the time in the datetime format and let it always default to 12:00:00 AM by only adding the date part during the INSERT statement (e.g. 01/01/2013). As long as the time part always remains the same throughout, the dates will still be easily comparable and I can just trim it up when I convert it to string for screen display. I believe this will be the easiest way to handle this scenario. After all, I decided to use SQL for the power of its queries, otherwise, I might have just used XML instead of a database, in the first place.
No, you really can't get rid of the time component. It is part of the data type defined by SQL Server. I was very annoyed by it until I found that I could still display the dates without the time by using jQuery to reformat them with the date formatter plugin:
https://github.com/phstc/jquery-dateFormat
Good Luck!
select CONVERT(date, GETDATE())
I have a project that I am working on that requires me to delete records from the database if they are at least 3 years old.
I have something like this in DB2 SQL to get the date:
SELECT * FROM tableA
WHERE ADD_DATE < CHAR(CURRENT DATE-3 YEARS)
ADD_DATE is stored as characters in my system, which is why I am converting.
I know it is also possible to get the date and format it in VB.net which is the language I am using to call the SQL statements.
My question is whether it would be faster/better to get the date and perform the conversion inside the SELECT in SQL or would it be better to get the current date and convert it in VB.net and then use that date in the SQL statement. I'm thinking VB.net would be better because there are thousands of records that must be compared. I should be able to set it up in VB so that it only retrieves the date and converts it once but I am not sure what kind of performance hit each takes from these statements.
Thanks in advance.
If all you are doing with a call to the database would be getting the date, then it would be faster to get it client-side and avoid the round-trip to the database.
If you do it server side and you're comparing your date in a single set-based operation then the time difference for that is negligible. If you do the check in something loop-based (a cursor or something) then you'll be wasting time.
It doesn't sound like this is applicable to you, but for future reference be sure to take into consideration the possibility of the client and the database server being in different timezones. It could be safer to do it one way or the other based on the time zone your data is generated for.
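As a rough sketch of the server-side, set-based variant for the DB2 question: this assumes ADD_DATE holds dates as ISO-formatted strings ('YYYY-MM-DD'), so that string comparison sorts chronologically. That format is an assumption, since the question does not state how the characters are laid out.

```sql
-- The cutoff is a single expression compared against every row in one statement,
-- not something recalculated inside a loop or cursor.
-- CHAR(..., ISO) formats the date as 'YYYY-MM-DD' (assumed to match ADD_DATE).
DELETE FROM tableA
WHERE ADD_DATE < CHAR(CURRENT DATE - 3 YEARS, ISO);
```

Running the same WHERE clause in a SELECT first is a cheap way to confirm which rows would be removed.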
Doing a "Now" in VB.Net will definitely be faster than hitting the database.
Is it best practice to split a dateTime in two datetime SQL columns?
For example, 2010-12-17 01:55:00.000 is put in two columns:
one column containing a datetime for the date portion: 2010-12-17 00:00:00.000
one column containing a datetime for the time portion: 1900-01-01 01:55:00.000
I'm being told this is best practice because at some point SQL 2000 didn't allow putting a time in a date, that there are even data storage standards that enforce this, and that some companies have ensured that all their data is stored in that manner to comply with those standards.
If this is the case, I'm sure someone here has heard about it. Does any of this sound familiar?
In SQL Server 2008 you have date and time data types, so this becomes a non-issue. datetime has always allowed for time, even back in SQL Server 6 and 7.
The reason people split it up is that, with everything in one column, a query that returns all orders placed between 3 and 4 PM on any day requires a scan; with a separate time column this can be accomplished with a seek (much, much faster).
Starting in SQL 2005 I would do only one column.
If you wanted this information to be sargable (usable with an index seek), I would use computed columns instead. This way you can query on date or time or both, and your application code is only responsible for maintaining the one column.
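A minimal sketch of the computed-column idea, assuming SQL Server 2008 or later (for the date and time types); the table, column and index names are hypothetical:

```sql
-- Only PlacedAt is written by the application; the other two columns are derived.
CREATE TABLE dbo.Orders (
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    PlacedAt   datetime NOT NULL,
    PlacedDate AS CAST(PlacedAt AS date) PERSISTED,
    PlacedTime AS CAST(PlacedAt AS time) PERSISTED
);

-- Index the time portion so time-of-day filters can seek instead of scan.
CREATE INDEX IX_Orders_PlacedTime ON dbo.Orders (PlacedTime);

-- All orders placed between 3 and 4 PM on any day.
SELECT OrderID, PlacedAt
FROM dbo.Orders
WHERE PlacedTime >= '15:00' AND PlacedTime < '16:00';
```

On SQL Server 2005, which lacks the date and time types, the same idea would need datetime expressions for the derived columns instead.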
I know this is old, but another reason you might want to keep separate is for user input (and GenEric said in a comment that this is for time management). If you allow users to enter date/time as separate fields, and you want to be able to save the data with either field being empty, it is nice to have 2 separate null-able fields in your database. Otherwise I guess you either have to resort to kludges where certain date values equal "empty" or add extra bit fields as "no time / no date" flags.