Athena Date Partition Without Extra Bits - amazon-s3

I have a table with a projected date partition (p_date) that I'm trying to insert values into. When I insert values into this table and specify a string value for p_date, it complains that I am attempting to insert a varchar into a timestamp column (fair). But when I convert the value to a timestamp and do the same insert, it adds an unwanted millis value to the end of the timestamp.
-- ERROR varchar cannot be inserted into timestamp
INSERT INTO blah
(p_date)
VALUES
('2021-01-01 00:00:00');
-- No error, but adds an unwanted `.0` to the S3 key
INSERT INTO blah
(p_date)
VALUES
(timestamp '2021-01-01 00:00:00');
How can I insert rows into this table at the correct p_date partition without changing that field to a string or getting extra bits on the end?

Does your partition key have the type TIMESTAMP? In that case Athena will format the values the way it formats timestamps. If you want them formatted as dates, use the DATE type instead.
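For example, a minimal sketch assuming the projected partition column is redeclared as DATE (and the projection properties, such as projection.p_date.format, are updated to match):
-- hypothetical: blah recreated with p_date DATE instead of TIMESTAMP
INSERT INTO blah
(p_date)
VALUES
(DATE '2021-01-01');
-- the S3 key should then end in p_date=2021-01-01/ with no trailing `.0`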

Related

Dates inserting incorrectly - SQL

Dates are not inserting correctly into the table; is there any explanation or solution?
create table test
(
ID bigint,
MarketOpen datetime
);
insert into test (ID, MarketOpen)
values (1, 2019-01-19-11-40-00);
select * from test;
That's totally the wrong way to enter a date. SQL Server is treating your current syntax as an arithmetic expression, e.g. 2019-01-19-11-40-00 = 1948, and then converting the number 1948 to a datetime. You need to use a formatted string, e.g.:
insert into test (ID, MarketOpen)
values (1, '2019-01-19 11:40:00');
Note: as mentioned by seanb, it's best practice to use a non-ambiguous format when specifying dates, and the ISO format (yyyymmdd) is probably the best of these.
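For example, the ISO 8601 form with a T separator is parsed the same way regardless of the session's language or DATEFORMAT settings:
-- unambiguous ISO 8601 datetime literal in SQL Server
insert into test (ID, MarketOpen)
values (2, '2019-01-19T11:40:00');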

Bulk Inserting data into a table that has a default current timestamp column

I have a table on Redshift with the following structure:
CREATE TABLE schemaName.tableName (
some_id INTEGER,
current_time TIMESTAMP DEFAULT GETDATE()
);
If I bulk insert data from another table, for example:
INSERT INTO schemaName.tableName (some_id) SELECT id FROM otherSchema.otherTable;
Will the value of the current_time column be the same for all bulk-inserted rows, or will it depend on the insertion time of each record? The column's data type is TIMESTAMP.
I am considering this for Amazon Redshift only.
So far I have tested changing the default value of the current_time column to SYSDATE and bulk inserting 10 rows into the target table. With SYSDATE the current_time values come out like 2016-11-16 06:38:52.339208 and are the same for each row, whereas GETDATE() yields results like 2016-11-16 06:43:56. I haven't found any documentation about this and need confirmation.
To be precise, all rows get the same timestamp value after executing the following statement:
INSERT INTO schemaName.tableName (some_id) SELECT id FROM otherSchema.otherTable;
But if I change the table structure to the following:
CREATE TABLE schemaName.tableName (
some_id INTEGER,
current_time DOUBLE PRECISION DEFAULT RANDOM()
);
then the rows get different random values for current_time.
Yes, Redshift will use the same default value in the case of a bulk insert. The Redshift documentation says:
Because the evaluated DEFAULT expression for a given column is the same for
all loaded rows, a DEFAULT expression that uses a RANDOM() function
will assign the same value to all the rows.
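A quick way to verify this after a bulk insert (a sketch against the question's table; the column is quoted because current_time is a reserved word in Redshift):
-- returns 1 if every bulk-inserted row received the same default timestamp
SELECT COUNT(DISTINCT "current_time") AS distinct_defaults
FROM schemaName.tableName;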

Postgres Data type conversion

I have this dataset that's in SQL format. However, the date values need to be converted into a different format, because I get the following error:
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
ERROR: date/time field value out of range: "28-10-96"
LINE 58: ...040','2','10','','P13-00206','','','','','1-3-95','28-10-96'...
^
HINT: Perhaps you need a different "datestyle" setting.
I've definitely read the documentation on date format
http://www.postgresql.org/docs/current/static/datatype-datetime.html
But my question is: how do I convert all of the dates into a proper format without going through all 500 or so data rows and making sure each one is correct before inserting into the DB? The backend is handled by Rails, but I figured going through SQL to clean it up would be best here.
I have a CREATE TABLE statement above this dataset, and mind you the dataset was given to me via a DBF converter/external source.
Here's part of my dataset:
INSERT INTO winery_attributes
(ID,NAME,STATUS,BLDSZ_ORIG,BLDSZ_CURR,HAS_CAVE,CAVESIZE,PROD_ORIG,PROD_CURR,TOUR_TASTG,VISIT_DAY,VISIT_WEEK,VISIT_YR,VISIT_MKTG,VISIT_NMEV,VISIT_ALL,EMPLYEENUM,PARKINGNUM,WDO,LAST_UP,IN_CITYBDY,IN_AIASP,NOTES,SMLWNRYEXM,APPRV_DATE,ESTAB_DATE,TOTAL_SIZE,SUBJ_TO_75,GPY_AT_75,AVA,SUP_DIST)
VALUES
(1,'ACACIA WINERY','PROD','8000','34436','','0','50000','250000','APPT','75','525','27375','3612','63','30987','22','97','x','001_02169-MOD_AcaciaWinery','','','','','1-11-79','1-9-82','34436','x','125000','Los Carneros','1');
INSERT INTO winery_attributes
(ID,NAME,STATUS,BLDSZ_ORIG,BLDSZ_CURR,HAS_CAVE,CAVESIZE,PROD_ORIG,PROD_CURR,TOUR_TASTG,VISIT_DAY,VISIT_WEEK,VISIT_YR,VISIT_MKTG,VISIT_NMEV,VISIT_ALL,EMPLYEENUM,PARKINGNUM,WDO,LAST_UP,IN_CITYBDY,IN_AIASP,NOTES,SMLWNRYEXM,APPRV_DATE,ESTAB_DATE,TOTAL_SIZE,SUBJ_TO_75,GPY_AT_75,AVA,SUP_DIST)
VALUES
('2','AETNA SPRING CELLARS','PROD','2500','2500','','0','2000','20000','TST APPT','0','3','156','0','0','156','1','10','x','','','','','x','1-4-86','1-6-86','2500','','0','Napa Valley','3');
INSERT INTO winery_attributes
(ID,NAME,STATUS,BLDSZ_ORIG,BLDSZ_CURR,HAS_CAVE,CAVESIZE,PROD_ORIG,PROD_CURR,TOUR_TASTG,VISIT_DAY,VISIT_WEEK,VISIT_YR,VISIT_MKTG,VISIT_NMEV,VISIT_ALL,EMPLYEENUM,PARKINGNUM,WDO,LAST_UP,IN_CITYBDY,IN_AIASP,NOTES,SMLWNRYEXM,APPRV_DATE,ESTAB_DATE,TOTAL_SIZE,SUBJ_TO_75,GPY_AT_75,AVA,SUP_DIST)
VALUES
('3','ALTA VINEYARD CELLAR','PROD','480','480','','0','5000','5000','NO','0','4','208','0','0','208','4','6','x','003_U-387879','','','','','2-5-79','1-9-80','480','','0','Diamond Mountain District','3');
INSERT INTO winery_attributes
(ID,NAME,STATUS,BLDSZ_ORIG,BLDSZ_CURR,HAS_CAVE,CAVESIZE,PROD_ORIG,PROD_CURR,TOUR_TASTG,VISIT_DAY,VISIT_WEEK,VISIT_YR,VISIT_MKTG,VISIT_NMEV,VISIT_ALL,EMPLYEENUM,PARKINGNUM,WDO,LAST_UP,IN_CITYBDY,IN_AIASP,NOTES,SMLWNRYEXM,APPRV_DATE,ESTAB_DATE,TOTAL_SIZE,SUBJ_TO_75,GPY_AT_75,AVA,SUP_DIST)
VALUES
('4','BLACK STALLION','PROD','43600','43600','','0','100000','100000','PUB','50','350','18200','0','0','18200','2','45','x','P13-00391','','','','','1-5-80','1-9-85','43600','','0','Oak Knoll District of Napa Valley','3');
INSERT INTO winery_attributes
(ID,NAME,STATUS,BLDSZ_ORIG,BLDSZ_CURR,HAS_CAVE,CAVESIZE,PROD_ORIG,PROD_CURR,TOUR_TASTG,VISIT_DAY,VISIT_WEEK,VISIT_YR,VISIT_MKTG,VISIT_NMEV,VISIT_ALL,EMPLYEENUM,PARKINGNUM,WDO,LAST_UP,IN_CITYBDY,IN_AIASP,NOTES,SMLWNRYEXM,APPRV_DATE,ESTAB_DATE,TOTAL_SIZE,SUBJ_TO_75,GPY_AT_75,AVA,SUP_DIST)
VALUES
('5','ALTAMURA WINERY','PROD','11800','11800','x','3115','50000','50000','APPT','0','20','1040','0','0','1040','2','10','','P13-00206','','','','','1-3-95','28-10-96','14915','x','50000','Napa Valley','4');
The dates in your data set are strings. Since they are not in the default datestyle (which is YYYY-MM-DD), you should explicitly convert them to dates as follows:
to_date('1-5-80', 'DD-MM-YY')
If you store the data in a timestamp instead, use
to_timestamp('1-5-80', 'DD-MM-YY')
If you are given the data set in the form of the INSERT statements that you show, then first load all the data as simple strings into varchar columns, then add date columns and do an UPDATE (and similarly for integer and boolean columns):
UPDATE my_table
SET estab = to_date(ESTAB_DATE, 'DD-MM-YY'),  -- column estab of type date
    apprv = to_date(APPRV_DATE, 'DD-MM-YY'),  -- etc.
    ...
When the update is done you can ALTER TABLE to drop the text columns holding the dates (and likewise for the integer and boolean ones).
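Putting the steps together for one of the date columns, here is a sketch using the question's winery_attributes table; the intermediate column name estab_d is just an illustration:
-- 1. add a proper date column
ALTER TABLE winery_attributes ADD COLUMN estab_d date;
-- 2. populate it from the text column (unquoted ESTAB_DATE folds to estab_date)
UPDATE winery_attributes SET estab_d = to_date(estab_date, 'DD-MM-YY');
-- 3. drop the old text column and take over its name
ALTER TABLE winery_attributes DROP COLUMN estab_date;
ALTER TABLE winery_attributes RENAME COLUMN estab_d TO estab_date;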

Convert a column of string dates

I have 2 columns in Access that are saved as string dates in yyyymmdd format. I am linking the table to an Oracle database and need to convert the columns on insert to look like yyyy/mm/dd.
I am trying:
INSERT INTO TEST
(DATE) Values (20110818, To_DATE("YYYY/MM/DD"))
FROM TEST_DATE
I want to convert the entire column on insert from Access into Oracle.
Try it like this:
INSERT INTO TEST
(DATE)
SELECT TO_DATE('20110818','YYYYMMDD')
FROM TEST_DATE
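To convert the whole column rather than a single literal, select the source column instead. The column names below are hypothetical since the Access field names aren't shown; note that DATE is a reserved word in Oracle, so the target column needs a different name anyway:
INSERT INTO TEST (DATE_COL)
SELECT TO_DATE(DATE_STR, 'YYYYMMDD')
FROM TEST_DATE;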

Is it possible to explicitly insert values into a timestamp column in DB2?

Is it possible to explicitly insert values into a timestamp column in DB2? For example, I have a date-time value '2\11\2005 4:59:36 PM'. How do I convert it to a timestamp value in DB2?
Thanks in advance
Another way to specify an insert into a timestamp field:
insert into mytable (timestamp_field, ...) values ('2018-07-25-14.56.11.000000', ...)
INSERT INTO TimeStampTable (TIMESTAMPFIELD, ...)
VALUES (TIMESTAMP(CAST('04.02.2005' AS VARCHAR(10)), '13:14:53'), ...)
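For the 12-hour value in the question, TIMESTAMP_FORMAT can parse a meridian indicator. This is a sketch assuming DB2 LUW 9.7+ and a zero-padded month, day, and hour:
INSERT INTO mytable (timestamp_field)
VALUES (TIMESTAMP_FORMAT('02/11/2005 04:59:36 PM', 'MM/DD/YYYY HH12:MI:SS AM'));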