SQL Server 2005 Import from Excel

I'd like to know what my best option would be to import data from an excel file on a weekly or monthly basis. At first, I thought I would use SSIS, but after much struggle with seemingly simple tasks, I'm starting to rethink my plan. Would it be better/easier to just write the SQL by hand or use the services of an SSIS package? The basic process will be as follows:
A separate process will download an .xls file to a local fileshare.
The xls file will have a filename like: 'myfilename MON YY'.
I will need to read the month and year from the filename, reformat them into a SQL date, and then query a DimDate table to find the corresponding date key.
For each row (after the first 2 header rows), insert the data with the date key, unless the row is a total row, then ignore.
Here are some of the issues I've been encountering with SSIS:
I can parse the date string from a flat file data source, but can't seem to do it with an Excel data source. Also, once parsed, I cannot seem to convert the string to a date in order to perform the lookup for the date key. For example, I want to do something like this:
select DateKey from DimDate
where ActualDate = convert(datetime, '01-' + 'JAN-10', 120)
but I don't think it is possible to use the 'convert' or 'datetime' keywords in an expression builder. I have also been unable to find where I can edit the SQL to ignore the first 2 rows of data.
I'm very skeptical of using SSIS because it seems like a kludgy way of doing something that can probably be accomplished more efficiently by writing the SQL yourself, but I may be forced to use SSIS. Thoughts?

SSIS is definitely the direction to go.
To hit on your problems: (DT_DBTIMESTAMP) is the conversion you want. The syntax is a bit different. For instance, to convert your example date, I would use:
(DT_DBTIMESTAMP)"01/01/2010"
If you use that expression in a derived column to replace your string date (or create a new column), you could then do a lookup against datetime columns in a DB.
If you need to exclude the first two rows, you will either need to write an SQL statement to query the file (as opposed to an excel file reader source) or use a conditional split to throw them away based on any condition that can be repeated with every import.
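If you go the SQL-statement route against the Excel source, one common trick is to query a worksheet range that starts below the header rows. A rough sketch (the sheet name and range here are just placeholders for your file):

SELECT F1, F2, F3
FROM [Sheet1$A3:C65536]

With HDR=NO in the connection string's extended properties the provider names the columns F1, F2, and so on, and the two header rows simply fall outside the queried range.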
Flat files are easier to work with, and do allow you to throw away x number of initial rows.
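And if the date-key lookup ends up in an Execute SQL Task or a hand-written script instead, a minimal T-SQL sketch (assuming the month and year have already been pulled out of the filename into variables) could look like:

DECLARE @FileMonth char(3), @FileYear char(2), @DateKey int
SET @FileMonth = 'JAN'   -- parsed from 'myfilename MON YY'
SET @FileYear  = '10'

SELECT @DateKey = DateKey
FROM DimDate
WHERE ActualDate = CONVERT(datetime, '01 ' + @FileMonth + ' 20' + @FileYear, 106)   -- style 106 = dd mon yyyy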

Related

Is it possible in SSMS to pull a DATETIME value as DATE?

I want to start by saying my SQL knowledge is limited (the sololearn SQL basics course is it), and I have fallen into a position where I am regularly asked to pull data from the SQL database for our ERP software. I have been pretty successful so far, but my current problem is stumping me.
I need to filter my results by having the date match from 2 separate tables.
My issue is that one of the tables outputs DATETIME with full time data. e.g. "2022-08-18 11:13:09.000"
While the other table zeros the time data. e.g. "2022-08-18 00:00:00.000"
Is there a way I can on the fly convert these to just a DATE e.g. "2022-08-18" so I can set them equal and get the results I need?
A simple CAST statement should work if I understand correctly.
CAST( dateToConvert AS DATE)
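For example, with two hypothetical tables (the names are made up, substitute your own) the join condition would just cast both sides:

SELECT o.OrderID, s.ShipmentID
FROM Orders o
JOIN Shipments s
  ON CAST(o.OrderDate AS DATE) = CAST(s.ShipDate AS DATE)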

Converting a STRING to DATE in Big Query [duplicate]

Been struggling with some datasets I want to use which have a problem with the date format.
Bigquery could not load the files and returned the following error:
Could not parse '4/12/2016 2:47:30 AM' as TIMESTAMP for field date (position 1) starting at location 21 with message 'Invalid time zone: AM'
I have been able to upload the file manually, but only as strings, and now I would like to set the fields back to the proper format. However, I just could not find a way to change the date column from string to a proper DateTime format.
Would love to know if this is possible as the file is just too long to be formatted in excel or sheets (as I have done with the smaller files from this dataset).
now would like to set the fields back to the proper format ... from string to proper DateTime format
Use parse_datetime('%m/%d/%Y %r', string_col) to parse a datetime out of the string.
If applied to the sample string in your question, you get a proper DATETIME value.
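For instance, a quick standalone check using the literal from the error message:

SELECT PARSE_DATETIME('%m/%d/%Y %r', '4/12/2016 2:47:30 AM') AS parsed
-- parsed: 2016-04-12T02:47:30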
As #Mikhail Berlyant rightly said, using the parse_datetime('%m/%d/%Y %r', string_col) function will convert your badly formatted dates to a standard ISO 8601 format accepted by Google BigQuery. The best option is then to save these query results to a new table in your BigQuery project.
I had a similar issue.
I uploaded my table with all of the columns in string format, then ran a query that used parse_datetime to convert the string column, with the query settings configured to store the output in a new table called heartrateSeconds_clean in the same dataset.
The Write if empty option is a good choice to avoid overwriting the existing raw data or arbitrarily writing output to a temporary table, unless you are sure that is what you want. Save the settings and run the query.
The output schema of the new table is updated automatically, and a preview of the resulting table shows the date column with its proper type.
NB: I did not apply an ORDER BY clause to the results, so the data is not ordered by any specific column in either version of the table.
This dataset has over 2M rows.
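A minimal sketch of that query, assuming the source table is mydataset.heartrateSeconds with columns Id, Time (the string column), and Value (the real names in your dataset may differ):

SELECT
  Id,
  PARSE_DATETIME('%m/%d/%Y %r', Time) AS Time,   -- STRING -> DATETIME
  Value
FROM mydataset.heartrateSeconds

In the query settings, point the destination at mydataset.heartrateSeconds_clean and choose Write if empty.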

AWS Athena - How to change format of date string

I have two tables in a database in AWS Athena that I want to join.
I want to join them by several columns, one of them being date.
However, in one dataset the date string for a single-digit month is encoded with a leading zero, e.g.
"08/31/2018"
while the other omits it, e.g.
"8/31/2018"
Is there a way to make them the same format?
I am unsure if it is easier to add the extra 0 to the strings which lack it or to concatenate the strings which have the extra 0.
Based on what I have researched I think I will have to use the CASE and CONCAT functions.
Both of the tables were loaded into the database from a CSV file, and the variables are in the string format.
I have tried changing the values manually in the CSV file, tried running an R script on one of the tables to format the date in the same way, and have also tried re-loading the tables into the database as the same date format.
However, no matter what I do, whenever the data is loaded into the database, even when the columns have the same date type, the two tables always end up with different formats:
one with the extra 0 and the other without it.
The last avenue I haven't tried is through a SQL query.
However I am not well versed in Athena and am having a hard time formatting this query.
I know this is rather vague, so please ask me for more information if you need.
If someone could help me start this query I would be grateful.
Thank you for the help.
Here is the expression for parsing the date strings in Athena:
date_parse(table.date_variable,'%m/%d/%Y')
Note, though, that Athena tables are immutable once created.
You can convert the value to date using date_parse(). So, this should work:
date_parse(t1.datecol, '%m/%d/%Y') = date_parse(t2.datecol, '%m/%d/%Y')
Having said that, you should fix the data model. Store dates as dates not as strings! Then you can use an equality join and that is just better all around.
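Putting that into the join (table and column names are placeholders), the comparison looks something like this:

SELECT t1.*, t2.*
FROM table_one t1
JOIN table_two t2
  ON date_parse(t1.date_col, '%m/%d/%Y') = date_parse(t2.date_col, '%m/%d/%Y')

As the answers above suggest, date_parse() with '%m/%d/%Y' should handle both the zero-padded and unpadded month strings, so the same expression works on both tables.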

Check date value in Netezza

In Netezza, I am trying to check whether a date value is valid or not, something like the ISDATE function in SQL Server.
I am getting dates like 11/31/2013, which is not valid. How can I check in Netezza whether a date is valid so I can exclude the invalid ones from my process?
Thanks
I don't believe there is a built-in Netezza function to check if a date is valid. You may be able to write a LUA function to do this, or you could try joining to a "Date" lookup table, like so:
Create a table with two columns:
DATE_VALUE date
DATE_STRING varchar(10)
Load data into this table for valid dates (generate a file in your favorite tool, excel, unix, whatever). There can even be more than one row per DATE_VALUE (different "valid" formats) if all you use this for is this check. If you fill in from, say, 1900 to 2100, as long as your data is within that range, you'll be fine. And it's a small table, too: ~200 years is only ~73,000 rows. Add more if needed. Heck, since the NZ date datatype goes from AD 1 to AD 9999, you could fill it completely with only ~3.7 million rows (small for NZ).
Then, to isolate rows that have invalid dates, just use a JOIN or an EXISTS / NOT EXISTS to this table, on DATE_STRING. Since the table is so small, netezza will likely broadcast it to all SPUs, making the performance impact trivial.
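A rough sketch of that approach (table and column names are illustrative):

CREATE TABLE date_lookup
(
    date_value  DATE,
    date_string VARCHAR(10)
);

-- ...load date_lookup with one row per valid date string...

-- rows in your staging table whose date string is not a valid date
SELECT s.*
FROM staging_table s
WHERE NOT EXISTS
(
    SELECT 1
    FROM date_lookup d
    WHERE d.date_string = s.date_string_col
);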
Netezza Analytics Package 3.0 (free download) comes with a couple LUA functions that verify date values: isdate() and todate(). Very simple to install / compile.

How to parse a string and create rows in SQL (postgres)

I have a single database field that contains a start date, end date, and exclusions in the form
available DD/MONTH/YYYY [to DD/MONTH/YYYY]?[, exclude WORD [, WORD]*]?
Meaning it always starts with "available DD/MONTH/YYYY", optionally has a single "to DD/MONTH/YYYY", and optionally has an exclude clause that is a comma separated list of strings. Think regular expression meanings for + , *, and ?
I have been tasked with extracting the data out so we will now have a "startdate" column, an "enddate" column, and a new table that will contain the exclusions. It will need to fill the startdate and enddate columns with the values parsed from the availability string. It will also need to create multiple records in the new exclusion table, one for each of the comma-separated values after the 'exclude' keyword in the availability string.
Is this a migration I can do in SQL only (postgres 8.4)?
This is against postgres 8.4.
Update: With the help of a co-worker we now have a SQL script that produces, as its results, the SQL to perform the insert statements based on parsing the exclusions. It uses a bunch of CASE statements and string manipulation within the SQL to generate the results. I then send the output to a file and execute that file to perform the inserts. I am doing the same for the start and end date columns.
It's not 100% SQL, but a simple .bat or .sh file that runs the first .sql file and then the generated one is all that is needed to get it to go.
Thanks for the input.
You can probably do that with a combination of the regexp functions and the to_date() or to_timestamp() functions.
But it may be easier to just mangle the text in a function in say pl/perl. That'll get you access to the full manipulation functions in perl, while keeping the work inside the database as seems to be your specification.
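For example (a sketch only, against hypothetical tables listings(id, availability, startdate, enddate) and listing_exclusions(listing_id, word)), the regexp and to_date() functions available in 8.4 can pull the pieces apart like this:

-- start and end dates from the availability string
UPDATE listings
SET startdate = to_date(substring(availability from 'available ([0-9]{2}/[A-Za-z]+/[0-9]{4})'), 'DD/MONTH/YYYY'),
    enddate   = to_date(substring(availability from 'to ([0-9]{2}/[A-Za-z]+/[0-9]{4})'), 'DD/MONTH/YYYY');

-- one exclusion row per comma-separated word after 'exclude'
INSERT INTO listing_exclusions (listing_id, word)
SELECT id,
       trim(regexp_split_to_table(substring(availability from 'exclude (.*)$'), ','))
FROM listings
WHERE availability ~ 'exclude';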
Why a single SQL script?
Write a simple script in Ruby/Python/Basic to read the data from the source, parse it, and put it into the destination database.
Or is the data set too big for that?