Postgres update column, on conflict ignore this row - sql

I have a table with email and secondary_email. email column has a unique constraint, while secondary_email can be repeated across rows.
I have to write a query to copy secondary_email to email. If there is a conflict, then ignore this row.
This query
UPDATE users SET email = secondary_email
WHERE NOT EXISTS
(SELECT 1 FROM users WHERE email=secondary_email)
still throws the error ERROR: duplicate key value violates unique constraint "users_email_key"
Users Before
+----+-------+-----------------+
| id | email | secondary_email |
+----+-------+-----------------+
|  1 | NULL  | NULL            |
|  2 | NULL  | NULL            |
|  3 | NULL  |                 |
|  4 | NULL  | e1@example.com  |
|  5 | NULL  | e1@example.com  |
|  6 | NULL  | e2@example.com  |
+----+-------+-----------------+
Users After
+----+----------------+-----------------+
| id | email          | secondary_email |
+----+----------------+-----------------+
|  1 | NULL           | NULL            |
|  2 | NULL           | NULL            |
|  3 | NULL           |                 |
|  4 | e1@example.com | e1@example.com  |
|  5 | NULL           | e1@example.com  |
|  6 | e2@example.com | e2@example.com  |
+----+----------------+-----------------+

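You need table aliases to fix your query. Without them, both email and secondary_email inside the subquery resolve to the inner users table, so the subquery never looks at the row being updated: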
UPDATE users u
SET email = u.secondary_email
WHERE NOT EXISTS (SELECT 1 FROM users u2 WHERE u2.email = u.secondary_email);
For your overall problem, check for no duplicates within the column as well:
UPDATE users u
SET email = u.secondary_email
FROM (SELECT secondary_email, COUNT(*) AS cnt
      FROM users
      GROUP BY secondary_email
      HAVING COUNT(*) = 1
     ) s
WHERE s.secondary_email = u.secondary_email AND
      NOT EXISTS (SELECT 1 FROM users u2 WHERE u2.email = u.secondary_email);
Or choose the first one:
UPDATE users u
SET email = u.secondary_email
FROM (SELECT u1.*,
             ROW_NUMBER() OVER (PARTITION BY secondary_email ORDER BY id) AS seqnum
      FROM users u1
     ) s
WHERE s.id = u.id AND
      s.seqnum = 1 AND
      NOT EXISTS (SELECT 1 FROM users u2 WHERE u2.email = u.secondary_email);
Note: This will also filter out NULL values which seems like a good idea.
Here is a db<>fiddle.
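To try the "choose the first one" idea without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query from the answer: the window function and UPDATE ... FROM are swapped for correlated subqueries (MIN(id) per secondary_email) so that it also runs on older SQLite builds. The names COPY_SQL and demo are made up for illustration, and skipping empty strings is an assumption taken from the "Users After" table.

```python
import sqlite3

# Hypothetical statement: copy secondary_email into email only for the lowest id
# of each secondary_email, and only if no row already uses that value as email.
# Excluding '' is an assumption based on the "Users After" table.
COPY_SQL = """
UPDATE users
SET email = secondary_email
WHERE secondary_email IS NOT NULL
  AND secondary_email <> ''
  AND id = (SELECT MIN(u2.id) FROM users u2
            WHERE u2.secondary_email = users.secondary_email)
  AND NOT EXISTS (SELECT 1 FROM users u3
                  WHERE u3.email = users.secondary_email)
"""

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, email TEXT UNIQUE, secondary_email TEXT)"
    )
    rows = [(1, None), (2, None), (3, ""),
            (4, "e1@example.com"), (5, "e1@example.com"), (6, "e2@example.com")]
    conn.executemany(
        "INSERT INTO users (id, email, secondary_email) VALUES (?, NULL, ?)", rows
    )
    conn.execute(COPY_SQL)
    return conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()

if __name__ == "__main__":
    for row in demo():
        print(row)
```

On the sample data only ids 4 and 6 receive an email, and id 5 is skipped instead of raising the unique-constraint error.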

Related

Mark rows from one table where value exists in join table?

For my query, I have two tables, as defined below:
permissions table:
| permission_id | permission_description |
|---------------|------------------------|
| 1 | Create User |
| 2 | Edit User |
| 3 | Delete User |
users_permissions table:
| permission_id | user_id |
|---------------|---------|
| 1 | 1 |
| 1 | 2 |
| 3 | 2 |
| 3 | 5 |
| 1 | 3 |
| 3 | 1 |
Now, I need to retrieve a list of all permissions in the permissions table, with a column to indicate if the user with user_id of 1 exists for each permission in the users_permissions table.
So, my desired output for the query would be:
| permission_id | permission_description | has_permission |
|---------------|------------------------|----------------|
| 1 | Create User | TRUE |
| 2 | Edit User | FALSE |
| 3 | Delete User | TRUE |
So far, I have tried the following query, but it returns entries for all permissions and all user_id values:
SELECT permissions.permission_id,
permission_description,
CASE WHEN user_id = 1 THEN 'TRUE' ELSE 'FALSE' END AS has_permission
FROM permissions
INNER JOIN users_permissions ON permissions.permission_id = users_permissions.permission_id;
How do I limit that to just one entry per permission?
For clarity, the end goal is to get a list of available permissions and mark the ones the user already has.
If you only want to know the answer for one user then an exists subquery will do the job - no need for a join.
SELECT P.permission_id
, P.permission_description
, CASE WHEN exists (select 1 from users_permissions UP where UP.permission_id = P.permission_id and UP.user_id = 1) THEN 'TRUE' ELSE 'FALSE' END AS has_permission
FROM [permissions] P
PS - I wouldn't recommend having a table called permissions, as it's a reserved word in SQL Server.
Use LEFT JOIN instead of INNER JOIN, put the user_id = 1 filter in the ON clause, and check whether the joined user_id is NULL:
select permissions.permission_id,
permission_description,
case when user_id is null then 'FALSE' else 'TRUE' END as has_permission
FROM permissions
LEFT JOIN users_permissions
ON permissions.permission_id = users_permissions.permission_id and user_id = 1
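Both answers can be checked side by side. The sketch below loads the sample data from the question into an in-memory SQLite database (an assumption made for portability; the question targets SQL Server) and runs the EXISTS form and the LEFT JOIN form. The helper names make_db and permissions_for_user are hypothetical.

```python
import sqlite3

EXISTS_SQL = """
SELECT p.permission_id, p.permission_description,
       CASE WHEN EXISTS (SELECT 1 FROM users_permissions up
                         WHERE up.permission_id = p.permission_id
                           AND up.user_id = ?)
            THEN 'TRUE' ELSE 'FALSE' END AS has_permission
FROM permissions p
ORDER BY p.permission_id
"""

# The user filter lives in the ON clause, so permissions without a match
# survive the join with a NULL user_id instead of being dropped.
LEFT_JOIN_SQL = """
SELECT p.permission_id, p.permission_description,
       CASE WHEN up.user_id IS NULL THEN 'FALSE' ELSE 'TRUE' END AS has_permission
FROM permissions p
LEFT JOIN users_permissions up
       ON up.permission_id = p.permission_id AND up.user_id = ?
ORDER BY p.permission_id
"""

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE permissions ("
                 "permission_id INTEGER PRIMARY KEY, permission_description TEXT)")
    conn.execute("CREATE TABLE users_permissions (permission_id INTEGER, user_id INTEGER)")
    conn.executemany("INSERT INTO permissions VALUES (?, ?)",
                     [(1, "Create User"), (2, "Edit User"), (3, "Delete User")])
    conn.executemany("INSERT INTO users_permissions VALUES (?, ?)",
                     [(1, 1), (1, 2), (3, 2), (3, 5), (1, 3), (3, 1)])
    return conn

def permissions_for_user(conn, user_id):
    """Return (rows from the EXISTS query, rows from the LEFT JOIN query)."""
    return (conn.execute(EXISTS_SQL, (user_id,)).fetchall(),
            conn.execute(LEFT_JOIN_SQL, (user_id,)).fetchall())

if __name__ == "__main__":
    print(permissions_for_user(make_db(), 1))
```

The two forms agree here because each (permission_id, user_id) pair appears at most once in users_permissions. If pairs could repeat, the LEFT JOIN form would return duplicate rows while the EXISTS form would not.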

SQL count referrals for each user

My query:
SELECT COUNT(referrer) as refs, SUM(amount) as total, contracts.id, userid, fine
FROM contracts
JOIN users ON contracts.userid = users.id
WHERE active = 1
GROUP BY userid
my users table :
id | username | referrer (int)
1 | test | 2
2 | drekorig |
3 | maximili | 2
my contracts table:
id | userid | amount | fine | active
1 | 1 | 50 | 23/10/2018 | 1
2 | 2 | 120 | 24/10/2018 | 1
3 | 2 | 150 | 24/10/2018 | 1
How do I get the count of referrals for each User? My query actually gets the number of contracts instead...
Expected result:
refs | total | id | userid | fine
0 | 0 | 1 | 1 | 23/10/2018
2 | 270 | 2 | 2 | 24/10/2018
http://sqlfiddle.com/#!9/0a464d/5
SELECT COALESCE(MAX(r.`count`), 0) as refs, -- 0 instead of NULL for users without referrals
SUM(c.amount) as total,
MAX(c.id),
u.id,
MAX(c.fine)
FROM users u
LEFT JOIN
(SELECT referrer, COUNT(*) `count`
FROM users
GROUP BY referrer
) r
ON u.id = r.referrer
JOIN contracts c
ON c.userid = u.id
WHERE c.active = 1
GROUP BY u.id
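The reason the original query counted contracts is that COUNT(referrer) ran over the joined rows, one per contract. A variant of the answer, sketched below with Python's sqlite3 module, aggregates referrals and contracts in separate subqueries before joining so neither count is multiplied by the other. This is an illustration on the sample data, not the answer's exact query; REFERRAL_SQL, make_db and referral_report are hypothetical names, and SQLite stands in for MySQL.

```python
import sqlite3

REFERRAL_SQL = """
SELECT u.id AS userid,
       COALESCE(r.refs, 0) AS refs,
       c.total
FROM users u
JOIN (SELECT userid, SUM(amount) AS total
      FROM contracts
      WHERE active = 1
      GROUP BY userid) c ON c.userid = u.id
LEFT JOIN (SELECT referrer, COUNT(*) AS refs
           FROM users
           WHERE referrer IS NOT NULL
           GROUP BY referrer) r ON r.referrer = u.id
ORDER BY u.id
"""

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, referrer INTEGER)")
    conn.execute("CREATE TABLE contracts ("
                 "id INTEGER PRIMARY KEY, userid INTEGER, amount INTEGER, fine TEXT, active INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                     [(1, "test", 2), (2, "drekorig", None), (3, "maximili", 2)])
    conn.executemany("INSERT INTO contracts VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 50, "23/10/2018", 1),
                      (2, 2, 120, "24/10/2018", 1),
                      (3, 2, 150, "24/10/2018", 1)])
    return conn

def referral_report(conn):
    """One row per user with active contracts: (userid, refs, total)."""
    return conn.execute(REFERRAL_SQL).fetchall()

if __name__ == "__main__":
    print(referral_report(make_db()))
```

User 2 gets 2 referrals and a total of 270. User 3 has no contracts and is dropped by the inner join, as in the answer above.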

JOIN the table if records exist

Is it possible to do an INNER JOIN only if a record exists in the second table, and otherwise not join at all?
These are my tables:
User table
+--------+--------------+
| id | name |
+--------+--------------+
| 1 | John |
+--------+--------------+
| 2 | Josh |
+--------+--------------+
House table
+--------+-------------+--------------+
| id | owner_id | house_no |
+--------+-------------+--------------+
| 1 | 1 | 991 |
+--------+-------------+--------------+
this is my INNER JOIN query
SELECT h.owner_id, u.name, h.house_no FROM user u
INNER JOIN house h on u.id = h.owner_id
WHERE u.id = :id
it will return this result if id = 1
+--------+--------------+--------------+
| id | name | house_no |
+--------+--------------+--------------+
| 1 | John | 991 |
+--------+--------------+--------------+
But if I run it with id = 2, no result is returned.
What I want is for the query to still return the user row even when no data exists for id = 2 in the house table.
Use a left outer join instead.
SELECT u.id, u.name, h.house_no FROM user u
LEFT OUTER JOIN house h on u.id = h.owner_id
WHERE u.id = :id
The resulting record for id = 2 will be:
+--------+--------------+--------------+
| id | name | house_no |
+--------+--------------+--------------+
| 2 | Josh | null |
+--------+--------------+--------------+
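A quick way to see the difference is to run the left outer join for both ids. The sketch below uses Python's sqlite3 module with the sample rows from the question; make_db and owner_with_house are hypothetical helper names, and the :id placeholder is replaced by a positional parameter.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE house (id INTEGER PRIMARY KEY, owner_id INTEGER, house_no INTEGER)")
    conn.executemany("INSERT INTO user VALUES (?, ?)", [(1, "John"), (2, "Josh")])
    conn.execute("INSERT INTO house VALUES (1, 1, 991)")
    return conn

def owner_with_house(conn, user_id):
    # LEFT OUTER JOIN keeps the user row and fills house_no with NULL
    # (None in Python) when the user owns no house.
    return conn.execute(
        """
        SELECT u.id, u.name, h.house_no
        FROM user u
        LEFT OUTER JOIN house h ON u.id = h.owner_id
        WHERE u.id = ?
        """,
        (user_id,),
    ).fetchall()

if __name__ == "__main__":
    conn = make_db()
    print(owner_with_house(conn, 1))
    print(owner_with_house(conn, 2))
```

Note that the id column is selected from the user table (u.id) and not from house (h.owner_id), otherwise it would also come back as NULL for users without a house.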

Find and update specific duplicates in MS SQL

Given the table below:
+----+---------+-----------+-------------+-------+
| ID | NAME | LAST NAME | PHONE | STATE |
+----+---------+-----------+-------------+-------+
| 1 | James | Vangohg | 04333989878 | NULL |
| 2 | Ashly | Baboon | 09898788909 | NULL |
| 3 | James | Vangohg | 04333989878 | NULL |
| 4 | Ashly | Baboon | 09898788909 | NULL |
| 5 | Michael | Foo | 02933889990 | NULL |
| 6 | James | Vangohg | 04333989878 | NULL |
+----+---------+-----------+-------------+-------+
I want to use MS SQL to find and update duplicates (based on name, last name and phone number), but only the earlier one(s). The desired result for the above table is:
+----+---------+-----------+-------------+-------+
| ID | NAME | LAST NAME | PHONE | STATE |
+----+---------+-----------+-------------+-------+
| 1 | James | Vangohg | 04333989878 | DUPE |
| 2 | Ashly | Baboon | 09898788909 | DUPE |
| 3 | James | Vangohg | 04333989878 | DUPE |
| 4 | Ashly | Baboon | 09898788909 | NULL |
| 5 | Michael | Foo | 02933889990 | NULL |
| 6 | James | Vangohg | 04333989878 | NULL |
+----+---------+-----------+-------------+-------+
This query uses a CTE to apply a row number, where any number > 1 is a dupe of the row with the highest ID.
;WITH x AS
(
SELECT ID,NAME,[LAST NAME],PHONE,STATE,
rn = ROW_NUMBER() OVER (PARTITION BY NAME,[LAST NAME],PHONE ORDER BY ID DESC)
FROM dbo.YourTable
)
UPDATE x SET STATE = CASE rn WHEN 1 THEN NULL ELSE 'DUPE' END;
Of course, I see no reason to actually update the table with this information; every time the table is touched, this data is stale and the query must be re-applied. Since you can derive this information at run-time, this should be part of a query, not constantly updated in the table. IMHO.
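As a rough illustration of deriving the flag at query time instead of storing it, the sketch below computes the same DUPE marker in a plain SELECT. It runs against SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer), not SQL Server. The table name people, the column last_name and the helper names are assumptions made for this example.

```python
import sqlite3

# The flag is computed on the fly: every row except the highest ID in its
# (name, last_name, phone) group is reported as 'DUPE'; the rest get NULL.
DERIVED_STATE_SQL = """
SELECT id,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY name, last_name, phone
                                    ORDER BY id DESC) > 1
            THEN 'DUPE' END AS state
FROM people
ORDER BY id
"""

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people ("
                 "id INTEGER PRIMARY KEY, name TEXT, last_name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", [
        (1, "James", "Vangohg", "04333989878"),
        (2, "Ashly", "Baboon", "09898788909"),
        (3, "James", "Vangohg", "04333989878"),
        (4, "Ashly", "Baboon", "09898788909"),
        (5, "Michael", "Foo", "02933889990"),
        (6, "James", "Vangohg", "04333989878"),
    ])
    return conn

def derived_states(conn):
    """Return (id, state) pairs without writing anything back to the table."""
    return conn.execute(DERIVED_STATE_SQL).fetchall()

if __name__ == "__main__":
    print(derived_states(make_db()))
```

Because nothing is stored, the result stays correct after inserts and deletes without re-running an UPDATE.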
Try this statement.
update t1
set
t1.STATE = 'DUPE'
from
TableName t1
join
(
-- highest id per duplicated (name, last name, phone) group
select name, [LAST NAME], phone, max(id) as id, count(id) as cnt
from
TableName
group by name, [LAST NAME], phone
having count(id) > 1
) t2 on ( t1.name = t2.name and t1.[LAST NAME] = t2.[LAST NAME] and t1.phone = t2.phone and t1.id < t2.id)
If my understanding of your requirements is correct, you want to set STATE to DUPE on every row for which another row exists with a higher ID value and the same NAME, LAST NAME and PHONE. If so, use this:
update t set STATE = (case when sorted.RowNbr = 1 then null else 'DUPE' end)
from yourtable t
join (select
ID,
row_number() over
(partition by name, [last name], phone order by id desc) as RowNbr from yourtable)
sorted on sorted.ID = t.ID

MySQL Join Sub-Select Optimization

UPDATE users u
JOIN (select count(*) as job_count, user_id from job_responses where date_created > subdate(now(), 30) group by user_id) j
ON j.user_id = u.user_id
JOIN users_profile p
ON p.user_id = u.user_id
JOIN users_roles_xref x
ON x.user_id = u.user_id
SET num_job_responses = least(j.job_count, 5)
WHERE u.status = 1 AND p.visible = "Y" AND x.role_id = 2000
And explain tells me this:
+----+-------------+---------------+--------+---------------------------------+---------------+---------+----------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+--------+---------------------------------+---------------+---------+----------------------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 23008 | |
| 1 | PRIMARY | u | eq_ref | PRIMARY,user_id,status,status_2 | PRIMARY | 4 | j.user_id | 1 | Using where |
| 1 | PRIMARY | p | ref | user_id,visible | user_id | 4 | scoop_jazz.u.user_id | 2 | Using where |
| 1 | PRIMARY | x | ref | index_role_id,index_user_id | index_user_id | 4 | scoop_jazz.u.user_id | 3 | Using where |
| 2 | DERIVED | job_responses | range | date_created | date_created | 4 | NULL | 135417 | Using where; Using temporary; Using filesort |
+----+-------------+---------------+--------+---------------------------------+---------------+---------+----------------------+--------+----------------------------------------------+
I'm having trouble optimizing this query with explain. Any way to do it?
You will want to add an index on job_responses(date_created, user_id).
Then you can drop the current single-column index on date_created.
The most expensive part of the query is the subquery
(select count(*) as job_count, user_id
from job_responses
where date_created > subdate(now(), 30)
group by user_id)
The only two fields of note are user_id and date_created. There is an index on date_created that has been chosen to satisfy date_created in last 30 days. However, it will have to go back to the data pages to retrieve user_id, then group by it.
If you had a composite index, the user_id is available directly from the index. It also covers the single-column index date_created, so you can drop that one.
It ended up being easier and way faster to generate a temporary table, populate it, and then use a join on that table. I was "chunking" the original query, which ends up being very expensive when it has to create and destroy tables created by sub-selects.
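The temporary-table approach described above might look roughly like the following. This is a sketch under several assumptions, not the poster's actual code: it uses SQLite through Python's sqlite3 module instead of MySQL, takes a fixed cutoff date as a parameter in place of subdate(now(), 30), and leaves out the users_profile and users_roles_xref filters for brevity. The names recent_counts, make_db and refresh_response_counts are hypothetical.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users ("
                 "user_id INTEGER PRIMARY KEY, status INTEGER, "
                 "num_job_responses INTEGER DEFAULT 0)")
    conn.execute("CREATE TABLE job_responses (user_id INTEGER, date_created TEXT)")
    conn.executemany("INSERT INTO users (user_id, status) VALUES (?, ?)",
                     [(1, 1), (2, 1), (3, 0)])
    responses = ([(1, "2024-02-01")] * 7            # 7 recent responses
                 + [(2, "2024-02-01")] * 2          # 2 recent ...
                 + [(2, "2023-01-01")]              # ... and 1 old response
                 + [(3, "2024-02-01")])             # recent, but user is inactive
    conn.executemany("INSERT INTO job_responses VALUES (?, ?)", responses)
    return conn

def refresh_response_counts(conn, cutoff):
    # Step 1: materialize the aggregate once in a temporary table.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS recent_counts ("
                 "user_id INTEGER PRIMARY KEY, job_count INTEGER NOT NULL)")
    conn.execute("DELETE FROM recent_counts")
    conn.execute("INSERT INTO recent_counts "
                 "SELECT user_id, COUNT(*) FROM job_responses "
                 "WHERE date_created > ? GROUP BY user_id", (cutoff,))
    # Step 2: update from the small keyed table; the two-argument MIN is
    # SQLite's scalar counterpart of MySQL's LEAST.
    conn.execute("""
        UPDATE users
        SET num_job_responses = (SELECT MIN(rc.job_count, 5)
                                 FROM recent_counts rc
                                 WHERE rc.user_id = users.user_id)
        WHERE status = 1
          AND user_id IN (SELECT user_id FROM recent_counts)
    """)
    return conn.execute(
        "SELECT user_id, num_job_responses FROM users ORDER BY user_id").fetchall()

if __name__ == "__main__":
    print(refresh_response_counts(make_db(), "2024-01-01"))
```

The primary key on the temporary table gives the update an indexed lookup per user, which is the part a derived table in the original query could not offer.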