SQL Limit number of references to another table without locking - sql

Is there a technique to avoid locking a row but still be able to limit the number of rows in another table that reference it?
For example:
create table accounts (
id integer,
name varchar,
max_users integer
);
create table users (
id integer,
account_id integer,
email varchar
);
I want to limit the number of users that belong to an account using the max_users value in accounts. Is there another way to ensure that concurrent calls won't create more users than permitted without locking the account row?
Something like this doesn't work: when the count is one below the limit, two concurrent transactions can both see the select count(*)... condition as true, and both inserts succeed:
begin;
insert into users(id, account_id, email)
select 1, 1, 'john@abc.com'
where (select count(*) from users where account_id = 1) < (select max_users from accounts where id = 1);
commit;
And the following works, but I'm having performance issues that are mostly caused by transactions waiting for locks:
begin;
select id from accounts where id = 1 for update;
insert into users(id, account_id, email)
select 1, 1, 'john@abc.com'
where (select count(*) from users where account_id = 1) < (select max_users from accounts where id = 1);
commit;
EDIT: Bonus question: what if the value is not stored in the database, but is something you can set dynamically?

Related

How to check if values exists in a table - postgres PLPGSQL

ERD
users
id
name
groups
id
name
users_in_groups
user_id
group_id
Problem summary
I'm writing a stored procedure in Postgres that receives a group name and an array of users and adds the users to the group. I want to assert first that the users exist in users, because I want to raise a custom error that I can catch in my server (if I rely on the default errors, such as an FK violation, I cannot classify them specifically enough in my server).
The stored procedure
CREATE FUNCTION add_users_to_group(group_name text, users text[])
RETURNS VOID AS $$
DECLARE
does_all_users_exists boolean;
BEGIN
SELECT exist FROM (
WITH to_check (user_to_check) as (select unnest(users))
SELECT bool_and(EXISTS (
SELECT * FROM users where id = to_check.user_to_check
)) as exist from to_check) as existance INTO does_all_users_exists;
IF NOT does_all_users_exists THEN
RAISE EXCEPTION '%', does_all_users_exists USING ERRCODE = 'XXXXX';
END IF;
-- TODO: loop through each user and insert into users_in_groups
END;
$$ LANGUAGE PLPGSQL VOLATILE STRICT SECURITY INVOKER;
The problem
When I execute the function with users that exist in the users table, I get the error I raise and the message is: f (so my variable was false). But when I run only the query that checks the existence of all the users:
WITH to_check (user_to_check) as (select unnest(users))
SELECT bool_and(EXISTS (
SELECT * FROM users where id = to_check.user_to_check
)) as exist from to_check
I get true, but I get it inside a table like so:
# | exist (boolean)
1 | true
So I guess I need to extract the true somehow.
Anyway, I know there is probably a better way to validate existence before the insert; suggestions are welcome.
Your logic seems unnecessarily complex. You can just check if any user doesn't exist using NOT EXISTS:
SELECT 1
FROM UNNEST(users) user_to_check
WHERE NOT EXISTS (SELECT 1 FROM users u WHERE u.id = user_to_check)
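The NOT EXISTS check above can be tried outside Postgres. The following is a minimal sketch using Python's sqlite3 module as a stand-in database; this is an assumption for illustration only. SQLite has no unnest(), so the candidate ids are staged in a temporary table, and the helper name missing_users, the table to_check, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT);
    INSERT INTO users (id, name) VALUES ('u1', 'Ann'), ('u2', 'Bob');
""")


def missing_users(conn, candidate_ids):
    """Return the candidate ids that have no row in users (hypothetical helper)."""
    # Stage the candidates in a temp table, standing in for unnest(users).
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS to_check (user_id TEXT)")
    conn.execute("DELETE FROM to_check")
    conn.executemany(
        "INSERT INTO to_check (user_id) VALUES (?)",
        [(c,) for c in candidate_ids],
    )
    rows = conn.execute("""
        SELECT c.user_id
        FROM to_check c
        WHERE NOT EXISTS (SELECT 1 FROM users u WHERE u.id = c.user_id)
        ORDER BY c.user_id
    """).fetchall()
    return [r[0] for r in rows]
```

An empty result means every user exists; a non-empty result is the list of ids that would justify raising the custom error.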
When you want to avoid issues with unique and foreign key constraints, you can SELECT and INSERT the records that you need for the next step. And you can do this for both tables (users and groups) in a single query, including the INSERT in users_in_groups:
CREATE FUNCTION add_users_to_group(group_name text, users text[])
RETURNS VOID AS $$
WITH id_users AS (
-- get id's for existing users:
SELECT id, name
FROM users
WHERE name =any($2)
), dml_users AS (
-- create id's for the new users:
INSERT INTO users (name)
SELECT s.name
FROM unnest($2) s(name)
WHERE NOT EXISTS(SELECT 1 FROM id_users i WHERE i.name = s.name)
-- Just to be sure, not sure you want this:
ON conflict do NOTHING
-- Result:
RETURNING id
), id_groups AS (
-- get id for an existing group:
SELECT id, name
FROM groups
WHERE name = $1
), dml_group AS (
-- create an id for the new group:
INSERT INTO groups (name)
SELECT s.name
FROM (VALUES($1)) s(name)
WHERE NOT EXISTS(SELECT 1 FROM id_groups i WHERE i.name = s.name)
-- Just to be sure, not sure you want this:
ON conflict do NOTHING
-- Result:
RETURNING id
)
INSERT INTO users_in_groups(user_id, group_id)
SELECT user_id, group_id
FROM (
-- get all user-id's
SELECT id FROM dml_users
UNION
SELECT id FROM id_users
) s1(user_id)
-- get all group-id's
, (
SELECT id FROM dml_group
UNION
SELECT id FROM id_groups
) s2(group_id);
$$ LANGUAGE sql VOLATILE STRICT SECURITY INVOKER;
And you don't need PL/pgSQL either; plain SQL will do.

The query returns values before inserting data

I have a rating table.
CREATE TABLE merchants_rating(
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users ON DELETE CASCADE,
merchant_id INTEGER NOT NULL REFERENCES users ON DELETE CASCADE,
rating INTEGER NOT NULL
);
I want to insert data into it and get the sum of the seller’s rating and the number of users who rated it.
I wrote this query:
WITH INSERT_ROW AS (
INSERT INTO MERCHANTS_RATING (USER_ID, MERCHANT_ID, RATING)
VALUES(147, 92, 2)
)
SELECT SUM(R.RATING) AS SUMMA_RATING, COUNT(R.USER_ID) AS USER_COUNT
FROM MERCHANTS_RATING AS R
WHERE R.MERCHANT_ID = 92
The data is added successfully, but the output is wrong. When the table is empty and I add the first row, I get these values:
SUMMA_RATING | USER_COUNT |
----------------------------
NULL | 0 |
But I expected to get:
SUMMA_RATING | USER_COUNT |
----------------------------
2 | 1 |
Since one user has rated the seller.
What have I done wrong?
Quote from the manual
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot “see” one another's effects on the target tables
(emphasis mine)
Luckily the manual also explains how to work around that:
This [...] means that RETURNING data is the only way to communicate changes between different WITH sub-statements and the main query
You need to use a UNION between the existing rows and the inserted rows:
WITH insert_row AS (
INSERT INTO merchants_rating (user_id, merchant_id, rating)
VALUES (147, 92, 2)
returning * --<< return the inserted row to the outer query
)
SELECT sum(r.rating) AS summa_rating, count(r.user_id) AS user_count
FROM (
SELECT rating, user_id
FROM merchants_rating
WHERE merchant_id = (SELECT merchant_id FROM insert_row)
UNION ALL
SELECT rating, user_id
FROM insert_row
) r;
If you intend to insert more than one row in the first step, you need to change merchant_id = to merchant_id IN.
Online example: https://rextester.com/BSCI15298
I think this is what you are trying to do:
1. Insert some values into merchants_rating from the insert_row CTE.
2. Select the sum and count from the merchants_rating table.
insert into merchants_rating (user_id, merchant_id, rating)
with insert_row as (
select 147, 92, 2
) select * from insert_row;
select sum(rating) AS summa_rating, count(user_id) AS user_count
from merchants_rating where merchant_id = 92;
See SQLFIDDLE
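To see why the aggregate has to read the table after the insert is visible, here is a minimal sketch of the two-statement approach using Python's sqlite3 module. SQLite is an assumed stand-in (it does not support data-modifying statements in WITH the way Postgres does), and the foreign keys from the original table are left out to keep the sketch self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE merchants_rating (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        merchant_id INTEGER NOT NULL,
        rating INTEGER NOT NULL
    )
""")

AGGREGATE = """
    SELECT SUM(rating) AS summa_rating, COUNT(user_id) AS user_count
    FROM merchants_rating
    WHERE merchant_id = ?
"""

# On an empty table SUM yields NULL and COUNT yields 0,
# which matches the output the asker observed.
before = conn.execute(AGGREGATE, (92,)).fetchone()

# Insert first, then aggregate in a separate statement of the same transaction.
with conn:
    conn.execute(
        "INSERT INTO merchants_rating (user_id, merchant_id, rating) VALUES (?, ?, ?)",
        (147, 92, 2),
    )
    after = conn.execute(AGGREGATE, (92,)).fetchone()
```

The second statement runs after the insert has taken effect inside the transaction, so it sees the new row.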

Operation on each row from select query

How do I execute an additional query (UPDATE) for each row returned by a SELECT?
I have to take the amount from each selected row and add it to the user's balance table.
Example:
status 0 - open
status 1 - processed
status 2 - closed
My select statement:
select id, user_id, sell_amount, sell_currency_id
from (select id, user_id, sell_amount, sell_currency_id,
sum(sell_amount)
over (order by buy_amount/sell_amount ASC, date_add ASC) as cumsell
from market t
where (status = 0 or status = 1) and type = 0
) t
where 0 <= cumsell and 7 > cumsell - sell_amount;
Select result from market table
id;user_id;amount;status
4;1;1.00000000;0
6;2;2.60000000;0
5;3;2.00000000;0
7;4;4.00000000;0
We take a total amount of 7 and send it to the user's balance table.
id;user_id;amount;status
4;1;0.00000000;2 -- took 1, sum 1, status changed to 2
6;2;0.00000000;2 -- took 2.6, sum=3.6, status changed to 2
5;3;0.00000000;2 -- took 2, sum 5.6, status changed to 2
7;4;2.60000000;1 -- took 1.4, sum 7.0, status changed to 1 (because 2.6 is still left to close)
User's balance table
user_id;balance
5;7 -- added 7 from previous operation
Postgres version 9.3
The general principle is to use UPDATE ... FROM over a subquery. Your example is too hard to turn into useful CREATE TABLE and SELECT statements, so I've made up a quick dummy dataset:
CREATE TABLE balances (user_id integer, balance numeric);
INSERT INTO balances (user_id, balance) VALUES (1,0), (2, 2.1), (3, 99);
CREATE TABLE transactions (user_id integer, amount numeric, applied boolean default 'f');
INSERT INTO transactions (user_id, amount) VALUES (1, 22), (1, 10), (2, -10), (4, 1000000);
If you wanted to apply the transactions to the balances you would do something like:
BEGIN;
LOCK TABLE balances IN EXCLUSIVE MODE;
LOCK TABLE transactions IN EXCLUSIVE MODE;
UPDATE balances SET balance = balance + t.amount
FROM (
SELECT t2.user_id, sum(t2.amount) AS amount
FROM transactions t2
GROUP BY t2.user_id
) t
WHERE balances.user_id = t.user_id;
UPDATE transactions
SET applied = true
FROM balances b
WHERE transactions.user_id = b.user_id;
COMMIT;
The LOCK statements are important for correctness in the presence of concurrent inserts/updates.
The second UPDATE marks the transactions as applied; you might not need something like that in your design.
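The same dummy dataset can be exercised with Python's sqlite3 module. This is only a sketch under assumptions: SQLite stands in for Postgres, a correlated subquery is used in place of UPDATE ... FROM (older SQLite versions lack that syntax), the table-level LOCK statements are omitted because SQLite has no equivalent statement, and an applied = 0 filter is added so that re-running the block does not apply a transaction twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balances (user_id INTEGER, balance NUMERIC);
    INSERT INTO balances (user_id, balance) VALUES (1, 0), (2, 2.1), (3, 99);
    CREATE TABLE transactions (user_id INTEGER, amount NUMERIC, applied INTEGER DEFAULT 0);
    INSERT INTO transactions (user_id, amount)
    VALUES (1, 22), (1, 10), (2, -10), (4, 1000000);
""")

with conn:  # both updates commit together or not at all
    # Add the per-user sum of unapplied transactions to each balance.
    conn.execute("""
        UPDATE balances
        SET balance = balance + (
            SELECT SUM(t.amount)
            FROM transactions t
            WHERE t.user_id = balances.user_id AND t.applied = 0
        )
        WHERE EXISTS (
            SELECT 1
            FROM transactions t
            WHERE t.user_id = balances.user_id AND t.applied = 0
        )
    """)
    # Mark only the transactions that had a matching balance row.
    conn.execute("""
        UPDATE transactions
        SET applied = 1
        WHERE applied = 0
          AND user_id IN (SELECT user_id FROM balances)
    """)

balances = dict(conn.execute("SELECT user_id, balance FROM balances"))
unapplied = [r[0] for r in conn.execute(
    "SELECT user_id FROM transactions WHERE applied = 0")]
```

User 4 has no row in balances, so that transaction stays unapplied, which mirrors the join condition in the second UPDATE above.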

How would I copy entire rows within a table and change one value?

insert into user
(user_id, account_id, user_type_cd, name, e_mail_addr, login_failure_cnt, admin_user, primary_user)
select *
from pnet_user
where account_id='1';
But now I want to change the 1 to 2 on the inserted entries, then to 3, and so on up to 1000.
In other words, the rows should be copied and written 1000 times, changing only the id each time.
Does that make sense? Sorry for my poor English, and thank you very much!
It depends on exactly what you mean.
If you want the first 1000 users, why not write:
WHERE account_id <= 1000
If you want all users:
Have no WHERE clause.
If you want the user selected via a parameter (user input):
DECLARE @ID int
SET @ID = 1;
WHERE account_id = @ID
insert into user
(user_id, account_id, user_type_cd, name, e_mail_addr, login_failure_cnt, admin_user, primary_user)
select *
from pnet_user
where account_id BETWEEN 1 AND 1000;
The SELECT part may return multiple rows at once. (I presume account_id is not really a string but a number.)
Besides, I would strongly recommend typing out the column names in the select statement.
insert into user (user_id,account_id,user_type_cd,name,e_mail_addr,login_failure_cnt,admin_user,primary_user) select * from pnet_user
You can also specify column names instead of *:
insert into user (user_id,account_id,user_type_cd,name,e_mail_addr,login_failure_cnt,admin_user,primary_user) select id,ac_id .... from pnet_user
If account_id is numeric, just change the where clause:
Insert user (user_id, account_id, user_type_cd,
name, e_mail_addr, login_failure_cnt,
admin_user, primary_user)
Select user_id, account_id, user_type_cd,
name, e_mail_addr, login_failure_cnt,
admin_user, primary_user
From pnet_user where account_id
Between 1 and 1000
If account_id really is a string, then cast it to a number:
Insert user (user_id, account_id, user_type_cd,
name, e_mail_addr, login_failure_cnt,
admin_user, primary_user)
Select user_id, account_id, user_type_cd,
name, e_mail_addr, login_failure_cnt,
admin_user, primary_user
From pnet_user where Cast(account_id as integer)
Between 1 and 1000

SQL - how to efficiently select distinct records

I've got a very performance sensitive SQL Server DB. I need to make an efficient select on the following problem:
I've got a simple table with 4 fields:
ID [int, PK]
UserID [int, FK]
Active [bit]
GroupID [int, FK]
Each UserID can appear several times with a GroupID (and in several groupIDs) with Active='false' but only once with Active='true'.
Such as:
(id,userid,active,groupid)
1,2,false,10
2,2,false,10
3,2,false,10
4,2,true,10
I need to select all the distinct users from the table in a certain group, where it should hold the last active state of the user. If the user has an active state - it shouldn't return an inactive state of the user, if it has been such at some point in time.
The naive solution would be a double select - one to select all the active users and then one to select all the inactive users which don't appear in the first select statement (because each user could have had an inactive state at some point in time). But this would run the first select (with the active users) twice - which is very unwanted.
Is there any smart way to make only one select to get the needed query? Ideas?
Many thanks in advance!
What about a view such as this:
create view ACTIVE as select * from USERS where Active = 1
Then just one select from that view will be sufficient:
select user from ACTIVE where ID ....
Try this:
Select
ug.GroupId,
ug.UserId,
max(cast(ug.Active as int)) LastState -- cast needed: SQL Server's max does not accept bit
from
UserGroup ug
group by
ug.GroupId,
ug.UserId
If the active field is set to 1 for a user / group combination you will get the 1, if not you will get a 0 for the last state.
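A quick way to check this behaviour is a sketch with Python's sqlite3 module. SQLite is an assumed stand-in for SQL Server here (it stores the flag as an integer, so no cast is needed), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserGroup (
        ID INTEGER PRIMARY KEY,
        UserId INTEGER,
        Active INTEGER,   -- 0 or 1, standing in for a bit column
        GroupId INTEGER
    );
    -- user 1 was inactive and later active; user 2 was only ever inactive
    INSERT INTO UserGroup (UserId, Active, GroupId)
    VALUES (1, 0, 10), (1, 1, 10), (2, 0, 10), (2, 0, 10);
""")

rows = conn.execute("""
    SELECT ug.GroupId, ug.UserId, MAX(ug.Active) AS LastState
    FROM UserGroup ug
    GROUP BY ug.GroupId, ug.UserId
    ORDER BY ug.GroupId, ug.UserId
""").fetchall()
```

Note that this reports whether a user was ever active in the group. That equals the current state only under the question's rule that a user has at most one active row per group.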
I'm not a big fan of the use of an "isActive" column the way you're doing it. This requires two UPDATEs to change an active status and has the effect of storing the information about the active status several times in the different records.
Instead, I would remove the active field and do one of the following two things:
If you already have a table somewhere in which (userid, groupid) is (or could be) a PRIMARY KEY or UNIQUE INDEX then add the active column to that table. When a user becomes active or inactive with respect to a particular group, update only that single record with true or false.
If such a table does not already exist then create one with (userid, groupid) as the PRIMARY KEY and the field active, and then treat the table as above.
In either case, you only need to query this table (without aggregation) to determine the users' status with respect to the particular group. Equally importantly, you only store the true or false value one time and only need to UPDATE a single value to change the status. Finally, this table acts as the place in which you can store other information specific to that user's membership in that group that applies only once per membership, not once per change-in-status.
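The one-row-per-membership design can be sketched as follows, using Python's sqlite3 module as an assumed stand-in for SQL Server. The table name user_group_membership and the helpers set_active and active_users are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_group_membership (
        userid  INTEGER NOT NULL,
        groupid INTEGER NOT NULL,
        active  INTEGER NOT NULL DEFAULT 0,  -- 0 = inactive, 1 = active
        PRIMARY KEY (userid, groupid)
    );
    INSERT INTO user_group_membership (userid, groupid, active)
    VALUES (1, 10, 1), (2, 10, 1), (3, 10, 0);
""")


def set_active(conn, userid, groupid, active):
    """Change a membership's status with a single-row UPDATE."""
    with conn:
        conn.execute(
            "UPDATE user_group_membership SET active = ? "
            "WHERE userid = ? AND groupid = ?",
            (int(active), userid, groupid),
        )


def active_users(conn, groupid):
    """List active users of a group; no aggregation or self-join needed."""
    rows = conn.execute(
        "SELECT userid FROM user_group_membership "
        "WHERE groupid = ? AND active = 1 ORDER BY userid",
        (groupid,),
    )
    return [r[0] for r in rows]
```

Because the primary key allows only one row per (userid, groupid), the status is stored once and the read query stays a plain filtered select.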
Try this:
SELECT t.* FROM tbl t
INNER JOIN (
SELECT MAX(id) id
FROM tbl
GROUP BY userid
) m
ON t.id = m.id
I'm not sure I understand what you want your query to return, but this query gives you the users in a group whose most recent entry is active. It uses row_number(), so you need at least SQL Server 2005.
Table definition:
create table YourTable
(
ID int identity primary key,
UserID int,
Active bit,
GroupID int
)
Index to support the query:
create index IX_YourTable_GroupID on YourTable(GroupID) include(UserID, Active)
Sample data:
insert into YourTable values
(1, 0, 10),
(1, 0, 10),
(1, 0, 10),
(1, 1, 10),
(2, 0, 10),
(2, 1, 10),
(2, 0, 10),
(3, 1, 10)
Query:
declare @GroupID int = 10
;with C as
(
select UserID,
Active,
row_number() over(partition by UserID order by ID desc) as rn
from YourTable as T
where T.GroupID = @GroupID
)
select UserID
from C
where rn = 1 and
Active = 1
Result:
UserID
-----------
1
3
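The query and sample data above can be replayed with Python's sqlite3 module as a rough check. This is a sketch under assumptions: SQLite stands in for SQL Server, it needs a SQLite build with window-function support (3.25 or later), the covering index is omitted, and a bound parameter replaces the @GroupID variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,  -- stands in for int identity
        UserID INTEGER,
        Active INTEGER,
        GroupID INTEGER
    );
    INSERT INTO YourTable (UserID, Active, GroupID) VALUES
        (1, 0, 10), (1, 0, 10), (1, 0, 10), (1, 1, 10),
        (2, 0, 10), (2, 1, 10), (2, 0, 10),
        (3, 1, 10);
""")

QUERY = """
    WITH C AS (
        SELECT UserID,
               Active,
               row_number() OVER (PARTITION BY UserID ORDER BY ID DESC) AS rn
        FROM YourTable
        WHERE GroupID = ?
    )
    SELECT UserID
    FROM C
    WHERE rn = 1 AND Active = 1
    ORDER BY UserID
"""

# rn = 1 is each user's most recent row within the group.
result = [r[0] for r in conn.execute(QUERY, (10,))]
```

User 2 is excluded because that user's latest row (the one with the highest ID) is inactive, even though an earlier row was active.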