Audit data migration into Oracle - sql

I have a task to migrate data from another database to an Oracle database.
The data in the previous database carries audit information, i.e. it tracks the creation/update of records with update_time and update_user. For simplicity, let's assume the previous database is an Excel file in the following format:
Key | Value | Update_Time | Update_User |
----|-------|-------------|-------------|
a | 1 | 23/04/2020 | user1 |
b | 2 | 21/04/2020 | user2 |
a | 3 | 20/04/2020 | user1 |
a | 4 | 19/04/2020 | user5 |
a | 5 | 18/04/2020 | user2 |
What is the best practice for moving the data into Oracle so that users can still query the old audit info along with the new audit info, given that the data is now saved to the new Oracle table below? Does Oracle provide a native solution for this? I tried Oracle Flashback, but I am not sure how to include the previous audit data because, as I understand it, Flashback can only be queried for data changes from now on. Ideally, I want to store only the latest data in the Oracle table, like this, since these rows are the actual active data:
Key | Value | Last_Update_Time | Last_Update_User |
----|-------|------------------|------------------|
a | 1 | 23/04/2020 | user1 |
b | 2 | 21/04/2020 | user2 |
Let's say a user then edits the row with key b on 24/04/2020. I then want to fetch the following result for UI display (I am currently using Python SQLAlchemy to access the DB, but a solution with a SQL query is fine for a start):
Key | Value | Update_Time | Update_User |
----|-------|-------------|-------------|
b | 7 | 24/04/2020 | user2 | ---> this is an update on the new Oracle table above
a | 1 | 23/04/2020 | user1 | ---> this row and those below I want to somehow load into Oracle without explicitly creating a new table for them
b | 2 | 21/04/2020 | user2 |
a | 3 | 20/04/2020 | user1 |
a | 4 | 19/04/2020 | user5 |
a | 5 | 18/04/2020 | user2 |
After the change, the main data table in Oracle should look like this:
Key | Value | Last_Update_Time | Last_Update_User |
----|-------|------------------|------------------|
a | 1 | 23/04/2020 | user1 |
b | 7 | 24/04/2020 | user2 |

You can use the select query below. It assumes the full history is loaded into a table named Audit_table, and it returns the latest row for each key:
SELECT ad.*
FROM Audit_table ad
JOIN (SELECT Key, MAX(Update_Time) AS Update_Time -- latest change per key
      FROM Audit_table
      GROUP BY Key) rec -- group by Key only, so each key yields one row
  ON ad.Key = rec.Key
 AND ad.Update_Time = rec.Update_Time;
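As a rough illustration of the idea, here is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in for Oracle. The table name audit_table, the ISO date strings (so that MAX compares dates correctly as text), and the choice to append every new update to the same history table are assumptions made for this example, not something the question or answer prescribes. The rows are the ones from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit_table ("
    " key TEXT, value INTEGER, update_time TEXT, update_user TEXT)"
)

# Legacy audit rows from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO audit_table VALUES (?, ?, ?, ?)",
    [
        ("a", 1, "2020-04-23", "user1"),
        ("b", 2, "2020-04-21", "user2"),
        ("a", 3, "2020-04-20", "user1"),
        ("a", 4, "2020-04-19", "user5"),
        ("a", 5, "2020-04-18", "user2"),
    ],
)

# A new edit made after the migration is appended to the same table.
conn.execute(
    "INSERT INTO audit_table VALUES (?, ?, ?, ?)",
    ("b", 7, "2020-04-24", "user2"),
)

# Full history for the UI, newest change first.
history = conn.execute(
    "SELECT key, value, update_time, update_user"
    " FROM audit_table ORDER BY update_time DESC"
).fetchall()

# Latest row per key: join each row to the maximum update_time of its key.
latest_rows = conn.execute(
    """
    SELECT ad.key, ad.value, ad.update_time, ad.update_user
    FROM audit_table ad
    JOIN (SELECT key, MAX(update_time) AS update_time
          FROM audit_table
          GROUP BY key) rec
      ON ad.key = rec.key
     AND ad.update_time = rec.update_time
    ORDER BY ad.key
    """
).fetchall()

for row in latest_rows:
    print(row)
```

With this layout the "active data" table from the question becomes a query (or a view) over the history rather than a second copy of the data.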

Related

SQLITE: How to select a column value based on different columns in another table

I apologise in advance because I have no idea how to structure this question.
I have the following tables:
Sessions:
+----------+---------+
| login | host |
+----------+---------+
| breilly | node001 |
+----------+---------+
| pparker | node003 |
+----------+---------+
| jjameson | node004 |
+----------+---------+
| jjameson | node012 |
+----------+---------+
Userlist:
+----------+----------------+------------------+
| login | primary_server | secondary_server |
+----------+----------------+------------------+
| breilly | node001 | node010 |
+----------+----------------+------------------+
| pparker | node002 | node003 |
+----------+----------------+------------------+
| jjameson | node003 | node004 |
+----------+----------------+------------------+
What kind of SQL query should I perform so I can get a table like this?:
+----------+---------+------------+
| login | Host | Server |
+----------+---------+------------+
| jjameson | node004 | Secondary |
+----------+---------+------------+
| jjameson | node012 | Wrong Node |
+----------+---------+------------+
| pparker | node003 | Secondary |
+----------+---------+------------+
| breilly | node001 | Primary |
+----------+---------+------------+
Currently I'm just using Go with a bunch of structs / hashmaps to generate this.
I am planning to migrate the users / sessions to an in memory sqlite Database, but I can't seem to wrap my head around a query to get this sort of table.
The Server column is based on whether the user is logged on his primary / secondary or wrong machine.
I've put this in SQL Fiddle as well
Use a CASE expression:
select s.*,
       (case when s.host = ul.primary_server then 'Primary'
             when s.host = ul.secondary_server then 'Secondary'
             else 'Wrong Node'
        end) as server
from sessions s left join
     userlist ul
     on s.login = ul.login;
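Since the plan is an in-memory SQLite database, the query can be tried out with a small sketch like the following, written in Python with the standard-library sqlite3 module. The labels are capitalized to match the desired output, and the ORDER BY clause is an addition made here only so the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sessions (login TEXT, host TEXT);
    CREATE TABLE userlist (login TEXT, primary_server TEXT, secondary_server TEXT);

    INSERT INTO sessions VALUES
        ('breilly', 'node001'),
        ('pparker', 'node003'),
        ('jjameson', 'node004'),
        ('jjameson', 'node012');

    INSERT INTO userlist VALUES
        ('breilly', 'node001', 'node010'),
        ('pparker', 'node002', 'node003'),
        ('jjameson', 'node003', 'node004');
    """
)

# The LEFT JOIN keeps sessions whose login is missing from userlist;
# for those rows both comparisons are NULL, so they fall through to ELSE.
session_rows = conn.execute(
    """
    SELECT s.login,
           s.host,
           CASE WHEN s.host = ul.primary_server   THEN 'Primary'
                WHEN s.host = ul.secondary_server THEN 'Secondary'
                ELSE 'Wrong Node'
           END AS server
    FROM sessions s
    LEFT JOIN userlist ul ON s.login = ul.login
    ORDER BY s.login, s.host
    """
).fetchall()

for row in session_rows:
    print(row)
```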

How to add data or change schema to production database

I am new to working with databases and I want to make sure I understand the best way to add or remove data from a database without making a mess of any related data.
Here is a scenario I am working with:
I have a Tags table with an identity ID column. The tags can be selected in the web application to categorize stories submitted by users. When the database was first seeded, similar tags were seeded together, in order. As you can see, the Campuses (cities) are 1-4, the Colleges (subjects) are 5-7, and the Populations are 8-11.
If this database is live in production and the client wants to add a new Campus (City) tag, what is the best way to do this?
All the other city tags are sort of organized at the top, so it seems like the only option is to insert any new tags at the bottom of the table, where they will take whatever the next available ID is. I suppose this is fine because the DisplayCategory column tells us which category these new tags actually belong to.
Is this typical? Are there better ways to set up the database or handle this situation so that everything remains more organized?
Thank you
+----+------------------+---------------+-----------------+--------------+--------+----------+
| ID | DisplayName | DisplayDetail | DisplayCategory | DisplayOrder | Active | ParentID |
+----+------------------+---------------+-----------------+--------------+--------+----------+
| 1 | Albany | NULL | 1 | 0 | 1 | NULL |
| 2 | Buffalo | NULL | 1 | 1 | 1 | NULL |
| 3 | New York City | NULL | 1 | 2 | 1 | NULL |
| 4 | Syracuse | NULL | 1 | 3 | 1 | NULL |
| 5 | Business | NULL | 2 | 0 | 1 | NULL |
| 6 | Dentistry | NULL | 2 | 1 | 1 | NULL |
| 7 | Law | NULL | 2 | 2 | 1 | NULL |
| 8 | Student-Athletes | NULL | 3 | 0 | 1 | NULL |
| 9 | Alumni | NULL | 3 | 1 | 1 | NULL |
| 10 | Faculty | NULL | 3 | 2 | 1 | NULL |
| 11 | Staff | NULL | 3 | 3 | 1 | NULL |
+----+------------------+---------------+-----------------+--------------+--------+----------+
The terms "top" and "bottom" which you use aren't really applicable. "Albany" isn't at the "top" of the table; it's merely at the top of the specific result you see when you query the table without specifying a meaningful sort order. Without an ORDER BY clause the order is not guaranteed. In practice you often get rows back by ID or by an internal row identifier, which isn't the logical way to show this data.
Data in the table isn't inherently ordered. If you want to view your tags organized by their category, simply order your query by DisplayCategory (and probably by DisplayOrder afterwards), and you'll see your data properly organized. You can even create a persistent view for your convenience, although in many databases the ORDER BY still belongs in the query that selects from the view.
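To make this concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The snake_case column names and the new campus "Rochester" are hypothetical, chosen only for this example, and only the columns relevant to ordering are kept. The new tag receives the next free ID (12) but still sorts with the other campuses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags ("
    " id INTEGER PRIMARY KEY,"  # assigned automatically, like an identity column
    " display_name TEXT,"
    " display_category INTEGER,"
    " display_order INTEGER)"
)

insert_sql = (
    "INSERT INTO tags (display_name, display_category, display_order)"
    " VALUES (?, ?, ?)"
)

# Seed data from the question: IDs 1-11 are assigned in insertion order.
seed = [
    ("Albany", 1, 0), ("Buffalo", 1, 1), ("New York City", 1, 2),
    ("Syracuse", 1, 3), ("Business", 2, 0), ("Dentistry", 2, 1),
    ("Law", 2, 2), ("Student-Athletes", 3, 0), ("Alumni", 3, 1),
    ("Faculty", 3, 2), ("Staff", 3, 3),
]
conn.executemany(insert_sql, seed)

# Hypothetical new campus added later in production.
new_id = conn.execute(insert_sql, ("Rochester", 1, 4)).lastrowid

# The ID plays no role in how the tags are presented.
ordered = conn.execute(
    "SELECT id, display_name FROM tags"
    " ORDER BY display_category, display_order"
).fetchall()

for row in ordered:
    print(row)
```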

SQLite3 select last event by user

I have the following table 'events'.
| id | event_type | by_user | asset | time |
| 1 | owner | a | 10 | 1111111111 |
| 2 | updated | b | 20 | 1111111112 |
| 3 | owner | a | 30 | 1111111113 |
| 4 | owner | c | 20 | 1111111114 |
| 5 | updated | a | 10 | 1111111115 |
| 6 | owner | a | 20 | 1111111118 |
I would like to select the assets where user 'a' was the last user
with an 'owner' event_type. So in this example the id's 1, 3 and 6 (the
assets 10, 20 and 30 are owned by user 'a').
Basically, based on the events, I want to find the assets owned by user 'a'.
This is what correlated subqueries are for:
SELECT * FROM events e
WHERE event_type='owner'
AND time=(SELECT MAX(e_inner.time) FROM events e_inner
WHERE e_inner.asset=e.asset AND e_inner.event_type='owner')
This gives you, for each asset, its last ownership event. If you want it for specific assets or specific owners, just add an appropriate condition to the outer WHERE clause (for example, AND by_user='a').
Your query is ripe for breakage if you aren't guaranteeing uniqueness of {time, event_type, asset}: it will return all n rows if n users are assigned ownership of the same asset at the exact same time.
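A minimal sketch of the query with the owner filter added, run against the sample data with Python's standard-library sqlite3 module. The by_user = 'a' condition and the ORDER BY are additions made here for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY, event_type TEXT,"
    " by_user TEXT, asset INTEGER, time INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        (1, "owner", "a", 10, 1111111111),
        (2, "updated", "b", 20, 1111111112),
        (3, "owner", "a", 30, 1111111113),
        (4, "owner", "c", 20, 1111111114),
        (5, "updated", "a", 10, 1111111115),
        (6, "owner", "a", 20, 1111111118),
    ],
)

# For each asset, keep only its most recent 'owner' event,
# then keep the assets whose most recent owner is user 'a'.
owned = conn.execute(
    """
    SELECT e.id, e.asset
    FROM events e
    WHERE e.event_type = 'owner'
      AND e.by_user = 'a'
      AND e.time = (SELECT MAX(e_inner.time)
                    FROM events e_inner
                    WHERE e_inner.asset = e.asset
                      AND e_inner.event_type = 'owner')
    ORDER BY e.id
    """
).fetchall()

for row in owned:
    print(row)
```

Because the by_user filter sits in the outer query, an asset that was later handed to another user does not appear: its latest owner event belongs to someone else, so no row for 'a' matches the MAX.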

Last accessed timestamp of a Netezza table?

Does anyone know of a query that gives details on the last time a Netezza table was accessed by any operation (select, insert or update)?
Depending on your setup you may want to try the following query:
select *
from _v_qryhist
where lower(qh_sql) like '%tablename %'
There is a collection of history views in Netezza that should provide the information you require.
Netezza does not track this information in the catalog, so you will typically have to mine that from the query history database, if one is configured.
Modern Netezza query history information is typically stored in a dedicated database. Depending on permissions, you may be able to see if history collection is enabled, and which database it is using with the following command. Apologies in advance for the screen-breaking wrap to come.
SYSTEM.ADMIN(ADMIN)=> show history configuration;
CONFIG_NAME | CONFIG_DBNAME | CONFIG_DBTYPE | CONFIG_TARGETTYPE | CONFIG_LEVEL | CONFIG_HOSTNAME | CONFIG_USER | CONFIG_PASSWORD | CONFIG_LOADINTERVAL | CONFIG_LOADMINTHRESHOLD | CONFIG_LOADMAXTHRESHOLD | CONFIG_DISKFULLTHRESHOLD | CONFIG_STORAGELIMIT | CONFIG_LOADRETRY | CONFIG_ENABLEHIST | CONFIG_ENABLESYSTEM | CONFIG_NEXT | CONFIG_CURRENT | CONFIG_VERSION | CONFIG_COLLECTFILTER | CONFIG_KEYSTORE_ID | CONFIG_KEY_ID | KEYSTORE_NAME | KEY_ALIAS | CONFIG_SCHEMANAME | CONFIG_NAME_DELIMITED | CONFIG_DBNAME_DELIMITED | CONFIG_USER_DELIMITED | CONFIG_SCHEMANAME_DELIMITED
-------------+---------------+---------------+-------------------+--------------+-----------------+-------------+---------------------------------------+---------------------+-------------------------+-------------------------+--------------------------+---------------------+------------------+-------------------+---------------------+-------------+----------------+----------------+----------------------+--------------------+---------------+---------------+-----------+-------------------+-----------------------+-------------------------+-----------------------+-----------------------------
ALL_HIST_V3 | NEWHISTDB | 1 | 1 | 20 | localhost | HISTUSER | aFkqABhjApzE$flT/vZ7hU0vAflmU2MmPNQ== | 5 | 4 | 20 | 0 | 250 | 1 | f | f | f | t | 3 | 1 | 0 | 0 | | | HISTUSER | f | f | f | f
(1 row)
Also make note of the CONFIG_VERSION, as it will come into play when crafting the following query example. In my case, I happen to be using the version 3 format of the query history database.
Assuming history collection is configured, and that you have access to the history database, you can get the information you're looking for from the tables and views in that database. These are documented here. The following is an example, which reports when the given table was the target of a successful insert, update, or delete by referencing the "usage" column. Here I use one of the history table helper functions to unpack that column.
SELECT FORMAT_TABLE_ACCESS(usage),
hq.submittime
FROM "$v_hist_queries" hq
INNER JOIN "$hist_table_access_3" hta
USING (NPSID, NPSINSTANCEID, OPID, SESSIONID)
WHERE hq.dbname = 'PROD'
AND hta.schemaname = 'ADMIN'
AND hta.tablename = 'TEST_1'
AND hq.SUBMITTIME > '01-01-2015'
AND hq.SUBMITTIME <= '08-06-2015'
AND
(
instr(FORMAT_TABLE_ACCESS(usage),'ins') > 0
OR instr(FORMAT_TABLE_ACCESS(usage),'upd') > 0
OR instr(FORMAT_TABLE_ACCESS(usage),'del') > 0
)
AND status=0;
FORMAT_TABLE_ACCESS | SUBMITTIME
---------------------+----------------------------
ins | 2015-06-16 18:32:25.728042
ins | 2015-06-16 17:46:14.337105
ins | 2015-06-16 17:47:14.430995
(3 rows)
You will need to change the digit at the end of the $hist_table_access_3 table name to match your query history version.

Updating a field in a table with a number aggregated from other table

I have a log table of web log entries, each of which has a session ID. I also have a session table that summarizes the sessions from the log table. I need to run an UPDATE statement, but I don't see how to construct it for a field named "session_length", which should hold the number of events in that particular session.
Let's say I have the following log table:
| Session ID | Timestamp | Action | ...
| 1 | 00:00:00 | get | ...
| 2 | 00:00:00 | get | ...
| 1 | 00:00:01 | get | ...
| 1 | 00:00:02 | get | ...
| 2 | 00:00:02 | get | ...
In the session table, I would like to have the following values for session_length field:
| Session ID | session_length | ...
| 1 | 3 | ...
| 2 | 2 | ...
I am not sure whether this is possible, but I would like to do it with a single SQL UPDATE statement. In particular, I am using AWS Redshift, which is based on PostgreSQL.
You can do this with a correlated subquery in the update statement:
update sessions
    set session_length = (select count(*)
                          from log
                          where log.sessionid = sessions.sessionid
                         );
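The statement can be checked against the sample data with a small sketch like this one, written in Python with the standard-library sqlite3 module as a stand-in for Redshift (an assumption; the column names sessionid, ts and action are also chosen here for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE log (sessionid INTEGER, ts TEXT, action TEXT);
    CREATE TABLE sessions (sessionid INTEGER PRIMARY KEY, session_length INTEGER);

    INSERT INTO log VALUES
        (1, '00:00:00', 'get'),
        (2, '00:00:00', 'get'),
        (1, '00:00:01', 'get'),
        (1, '00:00:02', 'get'),
        (2, '00:00:02', 'get');

    -- One summary row per session; the length is not filled in yet.
    INSERT INTO sessions (sessionid) VALUES (1), (2);
    """
)

# The subquery is evaluated once per sessions row and counts
# the log entries that share that row's sessionid.
conn.execute(
    """
    UPDATE sessions
    SET session_length = (SELECT COUNT(*)
                          FROM log
                          WHERE log.sessionid = sessions.sessionid)
    """
)

lengths = conn.execute(
    "SELECT sessionid, session_length FROM sessions ORDER BY sessionid"
).fetchall()

for row in lengths:
    print(row)
```

A session with no log entries would end up with a length of 0 rather than NULL, because COUNT(*) over an empty set returns 0.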