Spring application database row level security best approach - sql

What is the best approach to implementing row-level security in a Spring web application and its database? I have many tables that contain application users' data. A user can select, update, and delete only his own rows. Users are defined in a database table and log in to the application with Spring Security. I am using a single database account to connect from the application to the database.
My idea is to create a column with the username in every table (do I need relationships here?). Then I can just add 'where username = <username>' to the backend queries. Is that a good idea? What is the most common approach in cases like this?
I manage data access with JPA and Hibernate.

I think you need to define an owner for every object you need to secure. You can validate the user in the database with a WHERE clause, but you can also add another layer at the method level with Spring Security's @PostFilter annotation, which lets you filter the returned objects:
@PostFilter("filterObject.owner == authentication.name")
public List getMyObjects();
You can read more here:
http://www.concretepage.com/spring/spring-security/prefilter-postfilter-in-spring-security
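For the database side, here is a minimal sketch of the owner-column idea from the question (table and column names are illustrative; the foreign key back to the users table is the optional "relationship" the question asks about):

CREATE TABLE user_notes (
    id       BIGINT PRIMARY KEY,
    username VARCHAR(100) NOT NULL REFERENCES app_users (username),  -- row owner; assumes app_users.username is unique
    body     TEXT
);

-- every backend query is then scoped to the logged-in user:
SELECT * FROM user_notes WHERE username = :currentUsername;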

Some databases already have a row-level security feature. Oracle offers Virtual Private Database, PostgreSQL has ALTER TABLE ... ENABLE ROW LEVEL SECURITY;, and SQL Server has this feature too. In other databases you have to do all the work manually (via an auxiliary column in the table and/or a special check in a wrapper view): see this question about MySQL on SE, and MariaDB.
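As a minimal sketch of the PostgreSQL variant (the table and the app.current_user setting are assumptions; the application would set the variable on each connection before running queries):

ALTER TABLE user_notes ENABLE ROW LEVEL SECURITY;

CREATE POLICY owner_only ON user_notes
    USING (username = current_setting('app.current_user'));

-- the application sets the variable after grabbing a connection:
SET app.current_user = 'alice';

Note that with a single shared database account, the connecting role must not be allowed to bypass RLS; if that role owns the table, you also need ALTER TABLE user_notes FORCE ROW LEVEL SECURITY.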

Related

Creating view with nested 'no-lock' on SQL Server

Here is the scenario: I have a database model with about 500K new records every day. The database is almost never updated (only INSERT and DELETE statements).
Many users would like to run queries against the database with tools such as Power BI, but I haven't given anyone access, to prevent deadlocks (I only allow a specific IT-managed resource to access the data).
I would like to open up data access, but I must prevent anyone from blocking the insertion of new records.
Could I create a view with NOLOCK nested inside it, assuming no dirty reads can occur since no updates are performed?
Would that be an acceptable design? I know it's not a perfect solution and it's not meant for that.
It's a compromise to allow users with no SQL skills to perform ad-hoc queries and lookups.
Anything I might be missing?
I think you can use WITH (NOLOCK) after the table name in a query, such as:
SELECT * FROM [table Name] WITH (NOLOCK)
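So the view the question asks about might look like this (a sketch; the table and view names are hypothetical):

CREATE VIEW dbo.DailyRecords_ReadOnly
AS
SELECT r.*
FROM dbo.DailyRecords AS r WITH (NOLOCK);

You can then GRANT SELECT on the view to the reporting users and point Power BI at it, keeping the base table closed off.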

Use schema name in a JOIN in Redshift

Our database is set up so that each of our clients is hosted in a separate schema (the organizational level above a table in Postgres/Redshift, not the database structure definition). We have a table in the public schema that has metadata about our clients. I want to use some of this metadata in a view I am creating.
Say I have 2 tables:

public.clients
    name_of_schema_for_client
    metadata_of_client

client_name.usage_info
    (whatever columns; they aren't that important)
I basically want to get the metadata for the client I'm running my query on and use it later:
SELECT *
FROM client_name.usage_info
INNER JOIN public.clients
ON CURRENT_SCHEMA() = public.clients.name_of_schema_for_client
This is not possible because CURRENT_SCHEMA() is a leader-node function: it returns an error if it references a user-created table, an STL or STV system table, or an SVV or SVL system view (see https://docs.aws.amazon.com/redshift/latest/dg/r_CURRENT_SCHEMA.html).
Is there another way to do this? Or am I just barking up the wrong tree?
Your best bet is probably to set the search path manually within the transaction, from whatever source you call this from. See:
https://docs.aws.amazon.com/redshift/latest/dg/r_search_path.html
Let's say you only want to use the tables matching your best client:
set search_path to your_best_clients_schema, whatever_other_schemas_you_need_for_this;
Then you can just do:
select * from clients;
This will match the first clients table available on the search path, which by coincidence you just set to your client's schema!
You can manually revert afterwards if need be, or just reset the connection to return to the default; up to you.
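Putting it together for the original query, the pattern might look like this (acme_corp stands in for the client's schema name):

set search_path to acme_corp, public;

select u.*, c.metadata_of_client
from usage_info u                  -- resolves to acme_corp.usage_info via the search path
join public.clients c
  on c.name_of_schema_for_client = 'acme_corp';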

Is it better to do database operations from an SQL script or from application code?

Consider the following abstract situation (just as an example):
I have two tables, TableA and TableB. They have unique IDs and possibly other columns (which are irrelevant here). The relationship between them is many-to-many, so I have a third table, AssociationTable, that is used to store the relationships between them. Basically, AssociationTable will have two columns (ID_A and ID_B, both foreign keys).
If I delete a row in AssociationTable and the ID_A that was deleted was the last one, I would also like to delete the entry from TableA that corresponds to that ID.
I could do this:
a) From the application that uses the database
b) by using an SQL trigger
My question, basically, is the following:
Is there any good practice that says "if you can do something from both the application and from SQL, always prefer SQL"?
Or does it depend on the case? If so, what should I take into account?
Performance: the query plan for a stored procedure is compiled on the DB server, and subsequent requests can run faster.
A stored procedure can execute multiple steps, and the intermediate results need not go back to the application layer, reducing traffic between the application and the DB server.
Security: stored procedures are well-defined database objects that can be locked down with security measures. Use of typed parameters can help prevent SQL injection attacks.
Code re-use: SQL queries can be written once and re-used across multiple clients without writing the same SQL commands over and over again.
Abstraction: by putting all the SQL code into stored procedures, the application is completely abstracted from the field names, table names, etc. So when a SQL query needs to change, there is little or no impact on the application code.
There are more benefits to doing it in the database:
Other client application code need not worry about data integrity.
The data logic should remain as close to the data as possible.
It could be faster if managed by the DB (trigger invocation); see the sketch below.
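As a sketch of option (b), a PostgreSQL trigger that removes the TableA row once its last association is deleted might look like this (table and column names are taken from the question; EXECUTE FUNCTION needs PostgreSQL 11+, older versions use EXECUTE PROCEDURE):

CREATE FUNCTION prune_orphaned_a() RETURNS trigger AS $$
BEGIN
    -- delete the parent row only if no associations to it remain
    DELETE FROM TableA a
    WHERE a.id = OLD.ID_A
      AND NOT EXISTS (SELECT 1 FROM AssociationTable t
                      WHERE t.ID_A = OLD.ID_A);
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER prune_orphaned_a_trg
    AFTER DELETE ON AssociationTable
    FOR EACH ROW EXECUTE FUNCTION prune_orphaned_a();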

Postgres: Restructuring to Schemas

I have a Rails 3.2 multi-tenant, subdomain-based app which I'm trying to migrate over to PostgreSQL schemas (each account getting its own schema; right now all of the accounts share the same tables).
So, I'm thinking I need to:
Create a new DB
Create a schema for each Account (its id) and the tables under them
Grab all the data that belongs to each account and insert it into the new DB under the schema of said account
Does that sound correct? If so, what's a good way of doing it? Should I write a Ruby script that uses ActiveRecord, plucks the data, then inserts it into the new DB (pretty inefficient, but it should get the job done)? Or does Postgres provide good tools for doing such a thing?
EDIT:
As Craig recommended, I created schemas in the existing DB. I then looped through all of the Accounts in a Rake task, copying the data over with something like:
Account.all.each do |account|
  PgTools.set_search_path account.id, false
  sql = %{INSERT INTO tags SELECT DISTINCT "tags".* FROM "tags" INNER JOIN "taggings" ON "tags"."id" = "taggings"."tag_id" WHERE "taggings"."tagger_id" = #{admin.id} AND "taggings"."tagger_type" = 'User'}
  ActiveRecord::Base.connection.execute sql
  # more such commands
end
I'd do the conversion with SQL personally.
Create the new schemas in the same database as the current one for easy migration, because you can't easily query across databases with PostgreSQL.
Migrate the data using appropriate INSERT INTO ... SELECT queries. To do it without having to disable any foreign keys, you should build a dependency graph of your data. Copy the data into tables that depend on nothing first, then tables that depend on them, and so on.
You'll need to repeat this for each customer schema, so consider creating a PL/PgSQL function that uses EXECUTE ... dynamic SQL to:
Create the schema for a customer
Create the tables within the schema
Copy data in the correct order by looping over a hard-coded array of table names, doing:
EXECUTE 'INSERT INTO '||quote_ident(newschema)||'.'||quote_ident(tablename)
    ||' SELECT * FROM oldschema.'||quote_ident(tablename)
    ||' WHERE customer_id = '||quote_literal(customer_id);
where newschema, tablename and customer_id are PL/PgSQL variables.
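A condensed sketch of such a function (the table list is hard-coded and hypothetical, and it assumes every table carries the customer_id column used in the WHERE clause above):

CREATE OR REPLACE FUNCTION convert_customer(cust_id integer) RETURNS void AS $$
DECLARE
    newschema text := 'customer_' || cust_id;
    tablename text;
BEGIN
    EXECUTE 'CREATE SCHEMA ' || quote_ident(newschema);
    -- list tables in dependency order: parents first, children after
    FOREACH tablename IN ARRAY ARRAY['tags', 'taggings'] LOOP
        EXECUTE 'CREATE TABLE ' || quote_ident(newschema) || '.' || quote_ident(tablename)
             || ' (LIKE oldschema.' || quote_ident(tablename) || ' INCLUDING ALL)';
        EXECUTE 'INSERT INTO ' || quote_ident(newschema) || '.' || quote_ident(tablename)
             || ' SELECT * FROM oldschema.' || quote_ident(tablename)
             || ' WHERE customer_id = ' || quote_literal(cust_id);
    END LOOP;
END;
$$ LANGUAGE plpgsql;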
You can then invoke that function from SQL. While you could just do SELECT convert_customer(c.id) FROM customer c GROUP BY c.id, I'd probably do it from an external control script, so that each customer's work gets done and committed individually, avoiding the need to start again from scratch if the second-to-last customer conversion fails.
For bonus crazy points it's even possible to define triggers on the main customer schema's tables that replicate changes to already-migrated customers over to the copy of their data in the new schema, so they can keep using the system during the migration. I'd avoid that unless the migration was just too big to do without downtime, as it'd be a nightmare to test and you'd still need the triggers to throw an error on access by customer id x while the migration of x's data was actually in-progress, so it wouldn't be fully transparent.
If you're using different login users for different customers (strongly recommended) your function can also:
REVOKE rights on the schema from public
GRANT limited rights on the schema to the user(s) or role(s) who'll be using it
REVOKE rights on public from each table created
GRANT the desired limited rights on each table to the user(s) and role(s)
GRANT on any sequences used by those tables. This is required even if the sequence is created by a SERIAL pseudo-column.
That way all your permissions are consistent and you don't need to go and change them later. Remember that your webapp should never log in as a superuser.

What's the best way to audit log DELETEs?

The user id in your connection string is not a variable and is different from the user id (which can be a GUID, for example) of your program. How do you audit-log deletes if your connection string's user id is static?
The best place to log inserts/updates/deletes is through triggers. But with a static connection string, it's hard to log who deleted something. What's the alternative?
With SQL Server, you could use CONTEXT_INFO to pass info to the trigger.
I use this in code (called by web apps) where I have to use triggers (e.g., multiple write paths on the table), and where I can't put my logic into stored procedures.
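A minimal T-SQL sketch of the pattern (the audited table and column names are hypothetical; the app issues SET CONTEXT_INFO once per session or request):

-- application side, right after opening the connection:
DECLARE @who varbinary(128) = CAST('jsmith' AS varbinary(128));
SET CONTEXT_INFO @who;

-- trigger side, reading it back when rows are deleted:
CREATE TRIGGER trg_Orders_AuditDelete ON dbo.Orders
AFTER DELETE
AS
INSERT INTO dbo.Orders_DeleteAudit (OrderID, DeletedBy, DeletedAt)
SELECT d.OrderID,
       CAST(CONTEXT_INFO() AS varchar(128)),   -- the logical user set above
       SYSUTCDATETIME()
FROM deleted d;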
We have a similar situation. Our web application always runs as the same database user, but with different logical users that our application tracks and controls.
We generally pass the logical user ID as a parameter into each stored procedure. To track deletes, we generally don't delete the row; we just mark its status as deleted and set the LastChgID and LastChgDate fields accordingly. For important tables, where we keep an audit log (a copy of every change state), we use the above method and a trigger copies the row to an audit table; the LastChgID is already set properly, so the trigger doesn't need to worry about getting the ID.
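A sketch of that soft-delete pattern (object names are illustrative; LastChgID and LastChgDate are the fields described above):

CREATE PROCEDURE dbo.DeleteWidget
    @WidgetID      int,
    @LogicalUserID int
AS
-- mark the row deleted instead of removing it, recording who did it;
-- an audit trigger on dbo.Widgets can then copy the row as-is
UPDATE dbo.Widgets
SET Status      = 'DELETED',
    LastChgID   = @LogicalUserID,
    LastChgDate = SYSUTCDATETIME()
WHERE WidgetID = @WidgetID;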