How do we identify a session in Oracle GV$SESSION view?

I am trying to understand what the minimal set of parameters is to uniquely identify a session in GV$SESSION. I have seen a few online examples where AUDSID, SID, and INST_ID are used, and I am trying to understand why.

For a given instance, a session is uniquely identified by its SID and its SERIAL#, as explained in the documentation:
SID: Session identifier
SERIAL#: Session serial number. Used to uniquely identify a session's objects. Guarantees that session-level commands are applied to the correct session objects if the session ends and another session begins with the same session ID.
You can add INST_ID to that if you are running a RAC environment.
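As an illustration, here is a sketch of how that triple is typically used (the user name and the numbers are placeholders):
-- Find candidate sessions across all RAC instances:
SELECT inst_id, sid, serial#, username, status
FROM gv$session
WHERE username = 'SCOTT';
-- The same triple identifies a session to ALTER SYSTEM KILL SESSION
-- (the @inst_id part is the RAC syntax):
ALTER SYSTEM KILL SESSION '123,45678,@1';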

It's the pair of values SID and SERIAL#.
The description of both of them in the docs explains why:
SID
Session identifier
SERIAL#
Session serial number. Used to uniquely identify a session's objects. Guarantees that session-level commands are applied to the correct session objects if the session ends and another session begins with the same session ID.

SID and SERIAL# are enough in V$SESSION for a single-instance database.
SID, SERIAL#, and INST_ID are enough for a RAC cluster database in GV$SESSION.
NB: there is no need to use GV$SESSION if you are not on RAC.

GV$SESSION vs V$SESSION: V$SESSION is used on a standalone database, and GV$SESSION (G = global) is used mostly on RAC environments.
AUDSID is a unique identifier for the session and is used in sys.aud$ as the SESSIONID column. It is the leading column of the only index on sys.aud$.
The INST_ID column displays the instance number from which the associated V$ view information was obtained.
The best way to understand both of them is to refer to the Oracle documentation and understand what each column does:
https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2088.htm#REFRN30223
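As a quick sketch of what AUDSID buys you: it lets a session find itself, since it matches SYS_CONTEXT('USERENV', 'SESSIONID'):
-- Locate your own session across all instances of a RAC database:
SELECT inst_id, sid, serial#, audsid
FROM gv$session
WHERE audsid = SYS_CONTEXT('USERENV', 'SESSIONID');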

Related

How can I get the last issued sequence ID in Vertica?

Background: I am migrating from PostgreSQL to Vertica and found that there are some issues with IDENTITY or AUTO_INCREMENT columns. One of these issues is that Vertica cannot assign values to IDENTITY columns or alter a column that already has data into an IDENTITY column. Therefore I created a sequence, set the default value of the column to it, and added a unique constraint:
SELECT MAX(id_column) FROM MY_SCHEMA.my_table; -- returns 12345

CREATE SEQUENCE MY_SCHEMA.seq_id_column MINVALUE 12346 CACHE 1;

ALTER TABLE MY_SCHEMA.my_table
ALTER COLUMN id_column SET DEFAULT(MY_SCHEMA.seq_id_column.nextval);

ALTER TABLE MY_SCHEMA.my_table ADD UNIQUE(id_column);
Which works as expected. In this case, I have the cache deactivated, as I am on a single-node installation and I want my ID column to be contiguous. However, this is not an option on a cluster installation, as the needed lock leads to a bottleneck.
Question: In a vertica cluster with several nodes, how can I access the ID of the last insert in a session (without an additional select)?
E.g. in PostgreSQL I could do something like
INSERT INTO MY_SCHEMA.my_table DEFAULT VALUES RETURNING id_column; -- DEFAULT VALUES stands in for the real column list
which does not work in Vertica. Furthermore, the LAST_INSERT_ID() function of Vertica does not work for named sequences. I also feel that querying the current_value of MY_SCHEMA.seq_id_column could give wrong results due to caching, but I am unsure about this.
Why no additional SELECT?
To my knowledge, the select will only give correct values after a commit, and I cannot commit after every single insert for performance reasons.
The comments from LukStorms pointed me in the right direction.
The NEXTVAL() function (as far as I have tested) gives contiguous values in the case where one single session queries them. Furthermore, on concurrent access, if issued after an insert, CURRVAL retrieves the session's cached value, which is guaranteed to be unique but not necessarily contiguous. As I never call NEXTVAL anywhere other than in my default clause, this solves the problem for me, although there might be cases where an additional call to NEXTVAL between inserts increments the sequence counter.
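A sketch of that pattern, using the table and sequence from the question (the column name is a placeholder):
-- The DEFAULT clause calls NEXTVAL, so CURRVAL afterwards returns
-- the id that this session's insert was assigned:
INSERT INTO MY_SCHEMA.my_table (some_column) VALUES ('some value');
SELECT CURRVAL('MY_SCHEMA.seq_id_column');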
One case I can think of (and that I will test in the future) is what happens if AUTOCOMMIT is set to OFF; it is ON by default for the Vertica client drivers.
UPDATE:
This even seems to work with AUTOCOMMIT being OFF (shown using the vertica-python client driver, where C is the connection and cur the cursor):
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.fetchall()
--> 1
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 1
cur.execute("SET SESSION AUTOCOMMIT TO OFF")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 4
However, the sequence counter seems to be unaffected by a rollback of the connection. So the following happens:
C.rollback()
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()  # --> 4

Postgres Overlap in System and Local DB Table Name

I ran into a strange issue where I have a user table in my local DB, which appears to overlap with the Postgres system table user when I try to run SQL queries using Postico. While I am running queries against my DB, I am returned a current_user column that maps to the username accessing the DB. What should I change in my SQL to point to the local DB user table?
SQL commands I'm trying to run:
SELECT *
FROM user
;
Returned current_user: johndoe
UPDATE user
SET user.password = 'cryptopw', user.authentication_token = NULL
WHERE user.user_id = 210;
USER is one of the reserved key words in PostgreSQL. It is also reserved in the SQL:2011, SQL:2008, and SQL-92 standards.
From the documentation:
SQL distinguishes between reserved and non-reserved key words. According to the standard, reserved key words are the only real key words; they are never allowed as identifiers.
emphasis mine
That is, if you want to avoid trouble, don't use user as an identifier (that is: it shouldn't be the name of a table, or a column, or an index, or a constraint, or a function, ...).
If you still want to try:
As a general rule, if you get spurious parser errors for commands that contain any of the listed key words as an identifier you should try to quote the identifier to see if the problem goes away.
That is, try:
SELECT *
FROM "user";
and
UPDATE "user"
SET password = 'cryptopw', -- not "user".password
    authentication_token = NULL -- not "user".authentication_token
WHERE "user".user_id = 210; -- you can, but don't need, the "user" qualifier here
I would actually rename the table, call it my_user (or application_user or something else), follow the SQL standard, and forget about having to quote identifiers.
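For what it's worth, the rename itself is a one-liner (my_user being the suggested name):
ALTER TABLE "user" RENAME TO my_user;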

Spring application database row level security best approach

What is the best approach to implement row level security in a Spring web application and its database? I have many tables which contain application users' data. A user can select, update and delete only his own rows. Users are defined in a table in the database and log in to the application with spring-security. I am using one database account to connect from the application to the database.
My idea is to create a column with the username in every table (do I need relationships here?). Then I can just add 'where username = <username>' to the backend queries. Is this a good idea? What is the most common approach in cases like this?
I manage data access with JPA and Hibernate.
I think you need to define an owner for every object you need to validate. You can validate the user in the database with a where clause.
But you can also add another layer at the method level with Spring's @PostFilter annotation, which lets you filter the returned objects:
@PostFilter("filterObject.owner == authentication.name")
public List getMyObjects();
You can see more here.
http://www.concretepage.com/spring/spring-security/prefilter-postfilter-in-spring-security
Some databases already have a row-level security feature. Oracle offers Virtual Private Database, PostgreSQL has ALTER TABLE ... ENABLE ROW LEVEL SECURITY; (see the sketch below), and SQL Server has this feature too. In other databases you have to do all the work manually (via an auxiliary column in the table and/or a special check in a wrapper view); see the related questions about MySQL on SE and about MariaDB.
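A minimal sketch of the PostgreSQL variant (table, column, and setting names are hypothetical; since the application connects with a single database account, the policy compares against a session setting rather than current_user):
ALTER TABLE user_data ENABLE ROW LEVEL SECURITY;
CREATE POLICY owner_only ON user_data
    USING (username = current_setting('app.username'));
-- the application sets the logical user once per session:
SET app.username = 'johndoe';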

SQL dot notation

Can someone please explain to me how SQL Server uses dot notation to identify
the location of a table? I always thought that the location is Database.dbo.Table
But I see code that has something else in place of dbo, something like:
DBName.something.Table
Can someone please explain this?
This is a database schema. The full three-part name of a table is:
databasename.schemaname.tablename
For the default schema of the user, you can also omit the schema name:
databasename..tablename
You can also specify a linked server name:
servername.databasename.schemaname.tablename
You can read more about using identifiers as table names on MSDN:
The server, database, and owner names are known as the qualifiers of the object name. When you refer to an object, you do not have to specify the server, database, and owner. The qualifiers can be omitted by marking their positions with a period. The valid forms of object names include the following:
server_name.database_name.schema_name.object_name
server_name.database_name..object_name
server_name..schema_name.object_name
server_name...object_name
database_name.schema_name.object_name
database_name..object_name
schema_name.object_name
object_name
An object name that specifies all four parts is known as a fully qualified name. Each object that is created in Microsoft SQL Server must have a unique, fully qualified name. For example, there can be two tables named xyz in the same database if they have different owners.
Most object references use three-part names. The default server_name is the local server. The default database_name is the current database of the connection. The default schema_name is the default schema of the user submitting the statement. Unless otherwise configured, the default schema of new users is the dbo schema.
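For example, the forms above look like this in practice (server, database, and table names are placeholders):
SELECT * FROM MyServer.MyDatabase.dbo.MyTable; -- four-part, via a linked server
SELECT * FROM MyDatabase.dbo.MyTable;          -- three-part
SELECT * FROM dbo.MyTable;                     -- two-part, schema-qualified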
What @Szymon said. You should also make a point of always schema-qualifying object references (whether table, view, stored procedure, etc.). Unqualified object references are resolved in the following manner:
Probe the namespace of the current database for an object of the specified name belonging to the default schema of the credentials under which the current connection is running.
If not found, probe the namespace of the current database for an object of the specified name belonging to the dbo schema.
And if the object reference is to a stored procedure whose name begins with sp_, it's worse, as two more steps are added to the resolution process (unless the reference is database-qualified): the above two steps are repeated, but this time looking in the master database instead of the current database.
So a query like
select *
from foo
requires two probes of the namespace to resolve foo (assuming that the table/view is actually dbo.foo): first under your default schema (john_doe.foo) and then, not being found, under dbo (dbo.foo), whereas
select *
from dbo.foo
is immediately resolved with a single probe of the namespace.
This has 3 implications:
The redundant lookups are expensive.
It inhibits query plan caching, as every execution has to be re-evaluated, meaning the query has to be recompiled for every execution (and that takes out compile-time locks).
You will, at one point or another, shoot yourself in the foot, and inadvertently create something under your default schema that is supposed to exist (and perhaps already does) under the dbo schema. Now you've got two versions floating around.
At some point, you, or someone else (usually it happens in production) will run a query or execute a stored procedure and get...unexpected results. It will take you quite some time to figure out that there are two [differing] versions of the same object, and which one gets executed depends on their user credentials and whether or not the reference was schema-qualified.
Always schema-qualify unless you have a real reason not to.
That being said, it can sometimes be useful, for development purposes to be able to maintain the "new" version of something under your personal schema and the "current" version under the 'dbo' schema. It makes it easy to do side-by-side testing. However, it's not without risk (which see above).
When SQL Server sees an unqualified table name, it will first look in the current user's default schema to see if the table exists, and will use that one if it does.
If it doesn't exist there, it looks in the dbo schema and uses the table from there.
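A hypothetical demonstration of that lookup order (john_doe is a placeholder user whose default schema is john_doe):
CREATE TABLE dbo.foo (id INT);
CREATE TABLE john_doe.foo (id INT);

SELECT * FROM foo;     -- as john_doe, resolves to john_doe.foo
SELECT * FROM dbo.foo; -- unambiguous: always dbo.foo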

What's the best way to audit log DELETEs?

The user id in your connection string is not a variable and is different from the user id (which can be a GUID, for example) of your program. How do you audit log deletes if your connection string's user id is static?
The best place to log inserts/updates/deletes is through triggers. But with a static connection string, it's hard to log who deleted something. What's the alternative?
With SQL Server, you could use CONTEXT_INFO to pass info to the trigger.
I use this in code (called by web apps) where I have to use triggers (e.g. multiple write paths to the table). This is where I can't put my logic into the stored procedures.
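A sketch of that approach (table and audit-log names are hypothetical):
-- Application code sets the logical user on the connection before writing:
DECLARE @user VARBINARY(128) = CAST('johndoe' AS VARBINARY(128));
SET CONTEXT_INFO @user;

-- A delete trigger then reads it back when logging:
CREATE TRIGGER trg_my_table_delete ON dbo.my_table
AFTER DELETE
AS
BEGIN
    INSERT INTO dbo.audit_log (deleted_id, deleted_by, deleted_at)
    SELECT d.id,
           CAST(CONTEXT_INFO() AS VARCHAR(128)),
           GETDATE()
    FROM deleted d;
END;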
We have a similar situation. Our web application always runs as the same database user, but with different logical users that our application tracks and controls.
We generally pass the logical user ID as a parameter into each stored procedure. To track deletes, we generally don't delete the row; we just mark the status as deleted and set the LastChgID and LastChgDate fields accordingly. For important tables, where we keep an audit log (a copy of every change state), we use the above method and a trigger copies the row to an audit table; the LastChgID is already set properly, so the trigger doesn't need to worry about getting the ID.
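A sketch of that soft-delete pattern inside such a stored procedure (table and column names are hypothetical):
UPDATE dbo.my_table
SET status      = 'DELETED',
    LastChgID   = @logical_user_id, -- passed in as a procedure parameter
    LastChgDate = GETDATE()
WHERE id = @row_id;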