How to update previous rows where the last_modified date column has null values? - SQL

I have a loader table into which a feed inserts and updates records every three hours. A few records show NULL values in the last_modified date column, even though I have a MERGE which sets the last_modified column to SYSDATE. For the future, I have defaulted last_modified to SYSDATE and enabled a NOT NULL constraint. Is there any way to rectify just these records so that last_modified holds the correct timestamp (the records should have the last_modified date set to the date when the insert/update was actually done)?
Thanks

No, the last modification time is not stored in a row by default. You have to do that yourself like you are doing now, or enable some form of journaling. There is no way to correct any old records where you have not done so.

If your rows were modified "recently enough", you might still map their ora_rowscn to an approximate modification TIMESTAMP using SCN_TO_TIMESTAMP:
UPDATE MY_TABLE
SET Last_Modified = SCN_TO_TIMESTAMP(ora_rowscn)
WHERE Last_Modified IS NULL;
This is not a magic bullet though. To quote the documentation:
The usual precision of the result value is 3 seconds.
The association between an SCN and a timestamp when the SCN is generated is remembered by the database for a limited period of time. This period is the maximum of the auto-tuned undo retention period, if the database runs in the Automatic Undo Management mode, and the retention times of all flashback archives in the database, but no less than 120 hours. The time for the association to become obsolete elapses only when the database is open. An error is returned if the SCN specified for the argument to SCN_TO_TIMESTAMP is too old.
If you try to map ora_rowscn of rows outside the allowed window, you will get the error ORA-08181 "specified number is not a valid system change number".
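To keep the column maintained going forward without relying on every MERGE to set it, a row-level trigger is one option. A minimal sketch, assuming a table named my_table with a last_modified column (both placeholder names):

```sql
-- Sets last_modified automatically on every insert or update,
-- so application code cannot forget to populate it.
CREATE OR REPLACE TRIGGER trg_my_table_last_modified
BEFORE INSERT OR UPDATE ON my_table
FOR EACH ROW
BEGIN
  :NEW.last_modified := SYSDATE;
END;
/
```

The column default plus NOT NULL constraint described in the question already covers inserts; the trigger additionally covers updates.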

Related

Postgresql Performance: What Is the Best Way to Use pg_timezone_names?

We use only timestamps without time zone for a global application. However, some things have to be in local times for user convenience. For that to work, we have to handle the conversion from local time to UTC, including daylight saving transitions. We don't need precision below that of a minute.
pg_timezone_names contains everything we need, including the unambiguous long string for the time zone name (e.g., 'US/Eastern'), the interval utc_offset, and the boolean is_dst. (I am assuming the latter two values change as DST boundaries are crossed.)
I am trying to figure out the best performance model, assuming we ultimately have millions of users. Here are the options being considered:
1. TZ name string ('US/Eastern') in the table for the location. Every time a time transformation (from local to UTC or back) is needed, we directly call pg_timezone_names for the utc_offset of that time zone. (This is assuming that view is well-indexed.) Index on the string in the location table, of course.
2. Local table time_zones replicating pg_timezone_names, but adding id and boolean in_use columns (and dropping the abbreviation). Include tz_id in the location table as a foreign key instead of the string.
3. In the case of a local table, use a procedure that fires around the clock at one minute after every hour, over the 26 hours or so during which time zones can change, that checks the list of time zones in_use that have just passed 2 AM Sunday (based on the locally stored offset) and calls pg_timezone_names for the updated offset and is_dst values. Triggers on the local table check whenever a zone goes into use and make sure it has the correct values.
The question is whether it is faster to evaluate the indexed string in the location table and then pull the offset from pg_timezone_names every time it is needed, or use a local time_zones table to pull the offset with the FK. I'm thinking the second will be much faster, because it avoids the initial string handling, but it really depends on the speed of the view pg_timezone_names.
After researching this more and discussing with a colleague, I've realized a flaw in the second option above. That option would indeed be quite a bit faster, but it only works if one wishes to pull the current utc_offset for a time zone. If one needs to do it for a timestamp that is not current, or for a range of timestamps, the built-in Postgres machinery has to be used, so that each timestamp can be converted with AT TIME ZONE, which applies the appropriate daylight saving conversion for that particular timestamp.
It's slower, but I don't think it can be improved, unless one is only interested in the current timestamp conversion, which is extremely unlikely.
So I am back to the first option, and indexing the time zone string in the local table is no longer necessary, as it would never be searched or sorted on.
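The first option can be sketched like this, assuming hypothetical locations and events tables, with timestamps stored as UTC in a plain timestamp column and the IANA zone string stored per location:

```sql
-- The first AT TIME ZONE interprets the stored timestamp as UTC
-- (producing a timestamptz); the second converts it to the location's
-- local wall-clock time, so Postgres picks the correct UTC offset,
-- including DST, for each individual timestamp.
SELECT e.id,
       e.occurred_at AT TIME ZONE 'UTC' AT TIME ZONE l.tz_name AS local_time
FROM events e
JOIN locations l ON l.id = e.location_id;
```

This delegates the per-timestamp DST decision entirely to Postgres's time zone database, so no offsets need to be cached locally.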

extracting dates from SCN_TO_TIMESTAMP(ORA_ROWSCN)

I have a problem where I am supposed to extract the row creation date for each row as part of a large report. With SCN_TO_TIMESTAMP(ORA_ROWSCN) I can view record creation dates, but I cannot convert or extract that data and use it somewhere else. I'm getting an error message which says:
ORA-08181: specified number is not a valid system change number
ORA-06512: at "SYS.SCN_TO_TIMESTAMP", line 1
The query I wrote was as follows:
insert into MEMBER_CREATION_DATE (NATIONAL_ID, CHECKNO, CREATION_DATE)
select NATIONAL_ID, CHECKNO, trunc(scn_to_timestamp(ora_rowscn)) from MEMBER;
Your clue is ORA-08181: specified number is not a valid system change number.
What it means is that SCN_TO_TIMESTAMP can no longer map that ORA_ROWSCN to a timestamp: the database only retains the association between an SCN and its timestamp for a limited period, and the SCN stored for that row is too old, therefore you get the error.
You can check the oldest available SCN number in database by this query:
select min(SCN) min_scn from sys.smon_scn_time;
As Oracle states:
The association between an SCN and a timestamp when the SCN is generated is remembered by the database for a limited period of time. This period is the maximum of the auto-tuned undo retention period, if the database runs in the Automatic Undo Management mode, and the retention times of all flashback archives in the database, but no less than 120 hours. The time for the association to become obsolete elapses only when the database is open. An error is returned if the SCN specified for the argument to SCN_TO_TIMESTAMP is too old.
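Combining the two ideas above, the report insert can be limited to rows whose SCN still falls inside the retained mapping window, so the conversion never raises ORA-08181. A sketch, assuming your user can query sys.smon_scn_time:

```sql
-- Skip rows whose SCN is older than the oldest mapping the database retains.
insert into MEMBER_CREATION_DATE (NATIONAL_ID, CHECKNO, CREATION_DATE)
select NATIONAL_ID, CHECKNO, trunc(scn_to_timestamp(ora_rowscn))
from MEMBER
where ora_rowscn >= (select min(SCN) from sys.smon_scn_time);
```

Rows filtered out this way simply have no recoverable creation date.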

Oracle: Date time of load

I need to extract some data from an Oracle table that was loaded on a particular day. Is there a way to do that? The rows do not have any date/timestamp column.
Found it - ORA_ROWSCN. Have to figure out how to convert it to a date (SCN_TO_TIMESTAMP is not working)
In general, no. You'd need a date column in the table.
If the load was recent, you could try
select scn_to_timestamp( ora_rowscn ), t.*
from my_table t
However, there are several problems with this:
- Oracle only knows how to convert recent SCNs to timestamps (on the order of a few days). You would probably need to create a function that calls scn_to_timestamp and handles the exception raised when the SCN can't be converted to a timestamp.
- The conversion of an SCN to a timestamp is approximate (it should be within a minute).
- Unless the table was built with rowdependencies (which is not the default), the SCN is stored at the block level, not at the row level. So if your load changed one row in a block, all the rows in that block would show the same updated SCN. If you can tolerate picking up some rows that were loaded earlier, and/or you know that your load only writes to new blocks, this may be less of an issue.
Beyond that, you'd be looking at things like whether flashback logs were enabled or some other mechanism was in place to track data versioning.
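The exception-handling function suggested above could look like the following sketch (the function name is made up):

```sql
-- Returns NULL instead of raising ORA-08181 when the SCN is too old
-- for the database to map it to a timestamp.
CREATE OR REPLACE FUNCTION safe_scn_to_timestamp(p_scn NUMBER)
  RETURN TIMESTAMP
IS
  scn_too_old EXCEPTION;
  PRAGMA EXCEPTION_INIT(scn_too_old, -8181);
BEGIN
  RETURN SCN_TO_TIMESTAMP(p_scn);
EXCEPTION
  WHEN scn_too_old THEN
    RETURN NULL;
END;
/
```

With this in place, select safe_scn_to_timestamp(ora_rowscn), t.* from my_table t returns NULL for rows outside the conversion window instead of failing the whole query.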

How to find updated date for hive tables?

How can I find the last DML update timestamp for a Hive table? I can find transient_lastDdlTime by using "describe formatted", but that only reflects DDL changes. How can I figure out the latest UPDATED DATE for a Hive table (managed/external)?
Do show table extended like 'table_name';
It will give the number of milliseconds elapsed since the epoch.
Copy that number, remove the last 3 digits (leaving whole seconds), and do select from_unixtime(<seconds since epoch>)
e.g. select from_unixtime(1532442615);
This will give you timestamp of that moment in current system's time zone.
I guess this is what you're looking for...
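Rather than trimming digits by hand, the millisecond-to-second conversion can be done in the query itself (a sketch using the example value above):

```sql
-- from_unixtime expects seconds; dividing the millisecond value by 1000
-- and casting to bigint truncates away the millisecond part.
select from_unixtime(cast(1532442615733 / 1000 as bigint));
```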

SQL query date according to time zone

We are using a Vertica database with table columns of type timestamptz, all data is inserted according to the UTC timezone.
We are using spring-jdbc's NamedParameterJdbcTemplate
All queries are based on full calendar days, e.g. start date 2013/08/01 and end date 2013/08/31, which brings everything between '2013/08/01 00:00:00.0000' and '2013/08/31 23:59:59.9999'
We are trying to modify our queries to consider time zones, i.e. for my local time zone I can ask for '2013/08/01 00:00:00.0000 Asia/Jerusalem' till '2013/08/31 23:59:59.9999 Asia/Jerusalem', which is obviously different from '2013/08/01 00:00:00.0000 UTC' till '2013/08/31 23:59:59.9999 UTC'.
So far, I cannot find a way to do so, I tried setting the timezone in the session:
set timezone to 'Asia/Jerusalem';
This doesn't even work in my database client.
Calculating the difference in our Java code will not work for us as we also have queries returning date groupings (this will get completely messed up).
Any ideas or recommendations?
I am not familiar with Vertica, but some general advice:
It is usually best to use half-open intervals for date range queries. The start date should be inclusive, while the end date should be exclusive. In other words:
start <= date < end
or
start <= date && end > date
Your end date wouldn't be '2013/08/31 23:59:59.9999'; it would instead be the start of the next day, '2013/09/01 00:00:00.0000'. This avoids problems relating to the precision of the fractional seconds.
That example is for finding a single date. Since you are querying a range of dates, you have two inputs. So it would be:
yourFieldInDatabase >= yourStartParameter
AND
yourFieldInDatabase < yourEndParameter
Again, you would first increment the end parameter value to the start of the next day.
It sounds like perhaps Vertica is TZ aware, given that you talked about timestamptz types in your question. Assuming they are similar to Oracle's TIMESTAMPTZ type, then it sounds like your solution will work just fine.
But usually, if you are storing times in UTC in your database, then you would simply convert the query input time(s) in advance. So rather than querying between '2013/08/01 00:00:00.0000' and '2013/09/01 00:00:00.0000', you would convert that ahead of time and query between '2013/07/31 21:00:00.0000' and '2013/08/31 21:00:00.0000'. There are numerous posts already on how to do that conversion in Java either natively or with Joda Time, so I won't repeat that here.
As a side note, you should make sure that whatever TZDB implementation you are using (Vertica's, Java's, or JodaTime's) has the latest 2013d update, since that includes the change for Israel's daylight saving time rule that goes into effect this year.
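Putting the two pieces together, a half-open query over the pre-converted UTC window might look like this (a sketch; my_table and event_time are placeholder names):

```sql
-- Half-open interval: start inclusive, end exclusive.
-- Bounds are the local Asia/Jerusalem midnights pre-converted to UTC
-- (IDT is UTC+3 in August 2013).
SELECT *
FROM my_table
WHERE event_time >= '2013-07-31 21:00:00'
  AND event_time <  '2013-08-31 21:00:00';
```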
Okay, so apparently:
set time zone to 'Asia/Jerusalem';
worked and I just didn't realize it, but for the sake of helping others I'm going to add something else that works:
select my_field at time zone 'Asia/Jerusalem' from my_table;
This will work for timestamptz fields.