Are there SQL datatypes that don't work with R? - sql

I am trying to run an sqlQuery in RStudio, which seems to crash the program. I want to use the RODBC package to import a package name and an elapsed time from an Oracle database. When I try an sqlQuery such as the following
dataframe <- sqlQuery(channel,
"select package_name, elapsed_time from fooSchema.barTable")
When I run this with just package_name or other fields in the table, it works fine. If I try to run it with elapsed_time, RStudio crashes. The datatype of elapsed_time is INTERVAL DAY (3) TO SECOND (6), so one record, for example, looks like this: "+000 00:00:00.22723"
Are there certain data types, such as Interval Day to Second, from Oracle that don't work in RStudio or R in general?

The problem isn't R, RStudio, or even RODBC. The problem is that Oracle doesn't support interval data types for ODBC connections.
This is documented under section E.1:
https://docs.oracle.com/cd/B28359_01/server.111/b32009/app_odbc.htm#CIHBFHCG
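A common workaround, sketched here with the table and column names from the question, is to convert the interval on the Oracle side so the ODBC layer never sees an INTERVAL type, either as text or as a plain number of seconds:
select package_name,
       -- text form: arrives over ODBC as an ordinary character column
       to_char(elapsed_time) as elapsed_time_text,
       -- numeric form: total seconds represented by the interval
       extract(day from elapsed_time) * 86400
         + extract(hour from elapsed_time) * 3600
         + extract(minute from elapsed_time) * 60
         + extract(second from elapsed_time) as elapsed_seconds
from fooSchema.barTable
The seconds variant is usually the handier one, since it arrives in R as an ordinary numeric column.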
To get back to your question in a more general sense: base R supports Date, POSIXct, and POSIXlt objects.
Dates and POSIXct objects are stored as the number of days/seconds, respectively, since 1970-01-01, whereas POSIXlt is a list of elements.
Whatever SQL connector you're using will need to coerce the SQL version of a date and time into one of the above. Sometimes it'll just convert to a character string. For instance, RPostgreSQL returns columns stored as Postgres's date type as character, but Postgres timestamp columns are coerced into POSIXct directly.

Related

Pulling a SQL Server date to be used in Oracle query within SSIS environment is ignored in the Oracle query

I am having trouble with my Oracle query, which uses a variable stored in SSIS that holds a date pulled from SQL Server.
I am using an Execute SQL Task that simply gets a max date from a SQL Server table and stores it in a variable, e.g.
SELECT MAX(t.Date) FROM table t;
I then want to use that variable in my Oracle query, which is an ADO.NET source connection. I noticed you can't parameterize in those connections and found the workaround where you use a SQL expression with your user variable in it. So now my Oracle source query looks something like this:
"SELECT DISTINCT t.* FROM table t WHERE TO_CHAR(t.LastUpdateDate, 'YYYY-MM-DD') > " + "'#[User::LastUpdateDate]'"
The query syntax itself is fine, but when I run it, it pulls all rows and seems to completely ignore the WHERE clause on the date.
I've tried removing the TO_CHAR from LastUpdateDate.
I've tried adding a TO_CHAR to my user variable #[User::LastUpdateDate].
I've tried using the CONVERT() function from SQL Server on #[User::LastUpdateDate].
Nothing seems to work, and the query just runs and pulls in all data as if I didn't have the WHERE clause on the query.
Does anyone know how to rectify this issue or point out what I might be doing wrong?
Thank you for any and all help!
EDIT:
My date being pulled from SQL Server is in this format: 2022-09-01 20:17:58.0000000
This is not an answer, just troubleshooting advice.
You do not say what data type #[User::LastUpdateDate] is; I'll assume it's a datetime.
Ideally, all datetime data should be kept in datetime data types, and then format becomes completely irrelevant. However, since it's difficult to parameterise Oracle queries in SSIS, you have to concoct a string to be submitted, so date format does become important.
On to something a little different: it is a very good habit, performance-wise, not to put functions around columns that you are searching on. This is called sargability (look it up); the contrast is sketched below.
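To illustrate with the question's column and a hypothetical date value:
-- Non-sargable: the function wraps the column, so an index on LastUpdateDate can't be used
WHERE TO_CHAR(t.LastUpdateDate, 'YYYY-MM-DD') > '2022-09-01'
-- Sargable: the column stands alone; the function is applied to the constant instead
WHERE t.LastUpdateDate > TO_DATE('2022-09-01', 'YYYY-MM-DD')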
Given these things, I suggest that you concoct your required SQL query bit by bit and troubleshoot.
First, format your date parameter as a string Oracle can parse unambiguously. Remember this is normally a bad and unnecessary thing; we are only doing it because we have to concoct a SQL string.
So create another SSIS string variable called strLastUpdateDate and put this hideous expression in it (note that SSIS expression string literals take double quotes, and DATEPART only returns numbers, so we build a YYYY-MM-DD string):
(DT_STR,4,1252)DATEPART("yyyy", #[User::LastUpdateDate]) + "-" +
RIGHT("0" + (DT_STR,2,1252)DATEPART("mm", #[User::LastUpdateDate]), 2) + "-" +
RIGHT("0" + (DT_STR,2,1252)DATEPART("dd", #[User::LastUpdateDate]), 2)
Yes, this is ludicrously long code, but it will turn your date variable into a string Oracle can parse unambiguously. You could simplify this by doing the formatting in your original max query, but let's not go there. Use whatever debugging technique you have to confirm that it works as expected.
Now you should be able to use this as the source query (note the explicit TO_DATE on the literal, which keeps the date column itself bare):
"SELECT t.*, '" + #[User::strLastUpdateDate] + "' AS MyStrDate FROM table t WHERE
t.LastUpdateDate > TO_DATE('" + #[User::strLastUpdateDate] + "', 'YYYY-MM-DD')"
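For reference, with a hypothetical value of 2022-09-01 in strLastUpdateDate, the statement that should reach Oracle looks like this:
SELECT t.*, '2022-09-01' AS MyStrDate FROM table t WHERE
t.LastUpdateDate > TO_DATE('2022-09-01', 'YYYY-MM-DD')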
You can try running that and see if it makes any difference. Make sure you use the technique from https://dba.stackexchange.com/questions/8828/how-do-you-show-sql-executing-on-an-oracle-database to monitor what is actually being submitted to Oracle.
This is all from memory and googling; I haven't done SSIS for many years now.
I suspect that after all this you may still have the same problem, because I recall running into the same mysterious issue many years ago.

Odd error with casting to timestamp in standard SQL/Tableau

The latest version of Tableau has started using standard SQL when it connects to Google's BigQuery.
I recently tried to update a large table but found that there appeared to be errors when trying to parse datetimes. The table originates as a CSV which is loaded into BigQuery, where further manipulations happen. The datetime column in the original CSV contains strings in ISO standard datetime format (basically yyyy-mm-dd hh:mm). This saves a lot of annoying manipulation later.
But on trying to convert the datetime strings in Tableau into dates or datetimes I got a bunch of errors. On investigation they seemed to come from BigQuery and looked like this:
Error: Invalid timestamp: '2015-06-28 02:01'
I thought at first this might be a Tableau issue, so I loaded a chunk of the original CSV into Tableau directly, where the conversion of the string to a date worked perfectly well.
I then tried simpler versions of the conversion (to a year rather than a full datetime) and they still failed. The generated SQL for the simplest conversion looks like this:
SELECT
EXTRACT(YEAR
FROM
CAST(`Arrival_Date` AS TIMESTAMP)) AS `yr_Arrival_Date_ok`
FROM
`some_dataset`.`some_table` `some_table`
GROUP BY
1
The invalid timestamp in the error message always looks to me like a perfectly valid timestamp. And further analysis suggests it doesn't happen for all the rows in the source table, just occasional ones.
This error did not appear in older versions of Tableau/BigQuery, where legacy SQL was the default for Tableau, so I'm presuming it is a consequence of standard SQL.
So is there an intermittent problem with casting to timestamps in BigQuery? Or is this a Tableau problem which causes the SQL to be incorrectly formatted? And what can I do about it?
The seconds part of the canonical timestamp representation is required if the hour and minute are also present, which is why CAST rejects strings like '2015-06-28 02:01'. Try this instead with PARSE_TIMESTAMP (%F is the date part, %R is hours and minutes) and see if it works:
SELECT
EXTRACT(YEAR
FROM
PARSE_TIMESTAMP('%F %R', `Arrival_Date`)) AS `yr_Arrival_Date_ok`
FROM
`some_dataset`.`some_table` `some_table`
GROUP BY
1
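If you want to find the occasional offending rows first, BigQuery's SAFE. prefix makes a failed parse return NULL instead of raising an error, so a probe along these lines (same hypothetical table) lists the rows that won't parse:
SELECT
  `Arrival_Date`
FROM
  `some_dataset`.`some_table`
WHERE
  SAFE.PARSE_TIMESTAMP('%F %R', `Arrival_Date`) IS NULL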

Set timezone as IST (+5:30) in DB Browser for SQLite

I have been searching for a setting in DB Browser for SQLite to change the timezone to IST (Indian Standard Time, +5:30). Is there a way to set it directly without running any queries? I also found some SQL queries that can convert the db time to IST, but almost all are SELECT statements. I am looking for a setting to change the timezone permanently; if that is not possible, then maybe an UPDATE query which can read all records in the database and change/convert/replace all times to IST. Can someone shed some light on it?
My field name is "expire_time", set as DATETIME NOT NULL in CREATE TABLE.
What I found while searching was
INSERT INTO MyTable(MyColumn) VALUES(datetime(CURRENT_TIMESTAMP, 'localtime'))
but I am not looking for an insert statement, and
SELECT datetime(1092941466, 'unixepoch', 'localtime');
but I am not looking for a select statement.
Please help me either with a setting (if available in DB Browser for SQLite) or an update query that can change all times from GMT to IST.
Thanks.
EDIT
SQLite has no DATETIME type, and it treats datatypes very differently from other DBMSs. For example,
CREATE TABLE T (
Field MYTYPE
);
will run OK. SQLite applies so-called type affinity (https://www.sqlite.org/datatype3.html#affinity) to figure out which of its implemented datatypes to use instead of whatever was specified in CREATE TABLE. DATETIME (as well as MYTYPE) has NUMERIC affinity, a special affinity which means the column can store values of any type you want, TEXT for example.
This boils down to the datetime functions being the only way to work with DATETIME in SQLite, and those functions default to the UTC timezone. Any other timezone must be provided explicitly as part of the datetime string. There is no PRAGMA or anything similar to change this default.
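A minimal demonstration of that affinity behaviour, using a throwaway table:
CREATE TABLE demo (d DATETIME);
-- a datetime string, an integer, and a blob all go in without complaint
INSERT INTO demo VALUES ('2022-09-01 20:17'), (12345), (X'00');
SELECT d, typeof(d) FROM demo;  -- returns text, integer, blob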
EDIT
If expire_time currently holds a string expression of UTC time, you can get a text value in a specific timezone, for example:
select datetime(expire_time, '+05 hours','+30 minutes') || ' IST' as t
Note that datetime(d, 'utc') will most probably return NULL if the string d contains an explicit timezone. So I advise you to standardize on storing datetimes as UTC in the DB and to convert to the timezone you need only when generating output. This way you have the whole SQLite toolbelt at your disposal.
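That said, if you really do want the one-off in-place conversion you asked about, an UPDATE along these lines (a sketch, assuming expire_time holds UTC strings) would shift every stored value to IST:
-- rewrites all rows; afterwards the stored text no longer records its timezone
UPDATE MyTable SET expire_time = datetime(expire_time, '+05 hours', '+30 minutes');
Be aware that after this, UTC-based datetime arithmetic on that column will silently be off by 5:30, which is exactly why storing UTC and converting on output is the safer habit.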

Spark time datatype equivalent to MYSQL TIME

I am importing data to Spark from MySQL through JDBC, and one of the columns has a time type (SQL type TIME, JDBC type java.sql.Time) with large hour values (e.g. 168:03:01). Spark converts them to timestamp format, causing an error when reading the three-digit hour. How do I deal with the TIME type in Spark?
Probably your best shot at this moment is to cast the data before it is actually read by Spark and parse it directly in your application. The JDBC data source allows you to pass a valid subquery as a dbtable option or table argument. It means you can, for example, do something similar to this:
// MySQL's CAST targets CHAR (not TEXT); .load() actually executes the read
sqlContext.read.format("jdbc").options(Map(
  "url" -> "xxxx",
  "dbtable" -> "(SELECT some_field, CAST(time_field AS CHAR) AS time_field FROM table) tmp"
)).load()
and use some combination of built-in functions to convert it in Spark to a type that is applicable for your application.
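For instance, once the column arrives as a string, a Spark SQL expression along these lines (a sketch, assuming the cast column kept the name time_field and the frame is registered as a temp view called tmp) turns a value like 168:03:01 into a total number of seconds:
SELECT some_field,
       CAST(split(time_field, ':')[0] AS BIGINT) * 3600
         + CAST(split(time_field, ':')[1] AS BIGINT) * 60
         + CAST(split(time_field, ':')[2] AS BIGINT) AS elapsed_seconds
FROM tmp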

How to convert a SQL field stored as a tick into a date

I am exporting info from a database that has a field with a birthdate stored as ticks. I need to convert this to a regular date. I can either do it in the SQL statement (if there is a way?) or convert it in Excel, since I will import my data there.
EDIT: Sorry, I'm not very familiar with ticks, so I wasn't sure what info to include. The database is PostgreSQL. Just putting it in mm/dd/yyyy format is fine (Excel should understand that).
You need to convert the .NET ticks to the UNIX epoch and then to a PostgreSQL timestamp. A .NET tick is 100 nanoseconds, counted from 0001-01-01, and 621355968000000000 is the tick count at 1970-01-01, so subtracting it and dividing by 10,000,000 gives seconds since the UNIX epoch:
to_timestamp((("date" - 621355968000000000) / 10000000))
This will work with PostgreSQL version 9 and above.
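Put together for the mm/dd/yyyy output the question asks for (the table name here is a placeholder):
SELECT to_char(to_timestamp(("date" - 621355968000000000) / 10000000),
               'MM/DD/YYYY') AS birthdate
FROM mytable;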