Query Parent and Children from single table - sql

I currently have a single table that holds all of my users. Some users have a team_leader value, which references the id of their team leader, who is stored in the same table.
What I want to do (and can't figure out) is query the database so that it returns the ids of all the team members and the leader in one result set.
For Example
name  | id | team_leader
------+----+------------
Jack  | 1  | null
Susan | 2  | 1
Bob   | 3  | 1
Eric  | 4  | null
SELECT name FROM users where team_leader = '<some user's id>'
returns ['Susan', 'Bob']
But I would like it to return the team leader included, such as
['Jack', 'Susan', 'Bob']
Does anyone have any idea how to include the team leader in the query results?
EDIT:
Okay, it seems I have not explained myself fully, my apologies. The goal of this query is as follows.
I have another table called leads, with a field called user_id that identifies the user who has access to the lead. I want team leaders to be able to reassign the leads associated with their accounts: if the current user is a team leader, they should be able to change user_id from their own id to anyone on their team, from one of their children to another, and from one of their children to themselves, but not to anyone outside their team. My idea was to use a WHERE EXISTS or a WHERE IN (which would mean adding a leader_id field to the lead table) that checks whether the new user_id is in the list of that team leader's members, including the leader.
Based off the example above.
UPDATE lead SET user_id = xxx
WHERE lead.id = yyy
AND ...
-- here is where I would check that the user_id xxx is part of the current
-- user's team which must be a team leader, for example user.id = 1
So my thought process was to get the previous query to then check against.
Hope this clears things up.

If I'm understanding correctly, you can just use or:
select name
from users
where team_leader = 1 or id = 1
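A minimal sketch of how this could be tried out, using Python's built-in sqlite3 module and the sample rows from the question. The leads table layout, the lead id 10, and the helper names are assumptions made for illustration. The second helper shows one way the same condition might guard the UPDATE described in the question's edit.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, team_leader INTEGER);
        INSERT INTO users VALUES
            (1, 'Jack', NULL), (2, 'Susan', 1), (3, 'Bob', 1), (4, 'Eric', NULL);
        -- hypothetical leads table: one lead, currently owned by Susan
        CREATE TABLE leads (id INTEGER PRIMARY KEY, user_id INTEGER);
        INSERT INTO leads VALUES (10, 2);
    """)
    return conn

def team_names(conn, leader_id):
    """Names of the leader and every user whose team_leader is that leader."""
    rows = conn.execute(
        "SELECT name FROM users WHERE team_leader = ? OR id = ? ORDER BY id",
        (leader_id, leader_id),
    ).fetchall()
    return [name for (name,) in rows]

def reassign_lead(conn, lead_id, new_user_id, leader_id):
    """Reassign a lead only if both its current and new owner are on the leader's team."""
    cur = conn.execute(
        "UPDATE leads SET user_id = :new_user "
        "WHERE id = :lead "
        "AND user_id IN (SELECT id FROM users "
        "                WHERE team_leader = :leader OR id = :leader) "
        "AND EXISTS (SELECT 1 FROM users "
        "            WHERE id = :new_user "
        "              AND (team_leader = :leader OR id = :leader))",
        {"new_user": new_user_id, "lead": lead_id, "leader": leader_id},
    )
    return cur.rowcount == 1  # True when the lead was actually updated

conn = make_db()
print(team_names(conn, 1))
print(reassign_lead(conn, 10, 3, 1))  # Bob is on Jack's team
print(reassign_lead(conn, 10, 4, 1))  # Eric is not
```

The parentheses around the OR matter in the EXISTS check: without them, AND binds tighter and the team test would no longer apply to the new user.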

WITH CTE AS (
-- anchor: the team leader
SELECT name, id, team_leader FROM [users]
WHERE id = 1
UNION ALL
-- recursive step: everyone who reports to a row already in the CTE
SELECT u.name, u.id, u.team_leader FROM [users] u
JOIN CTE ON CTE.id = u.team_leader
)
SELECT * FROM CTE
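A recursive CTE only pays off when teams can be nested. The sketch below runs that idea in SQLite (which spells it WITH RECURSIVE) through Python's sqlite3 module. The fifth user, Dana, who reports to Susan, is a hypothetical row added purely to show a second level; the question's data has only one level.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, team_leader INTEGER);
        INSERT INTO users VALUES
            (1, 'Jack', NULL), (2, 'Susan', 1), (3, 'Bob', 1), (4, 'Eric', NULL),
            (5, 'Dana', 2);  -- hypothetical: reports to Susan, not directly to Jack
    """)
    return conn

def team_tree(conn, leader_id):
    """Leader plus everyone below them, at any depth."""
    rows = conn.execute(
        """
        WITH RECURSIVE team(id, name) AS (
            SELECT id, name FROM users WHERE id = ?      -- anchor: the leader
            UNION ALL
            SELECT u.id, u.name                           -- step: direct reports
            FROM users u JOIN team t ON u.team_leader = t.id
        )
        SELECT name FROM team ORDER BY id
        """,
        (leader_id,),
    ).fetchall()
    return [name for (name,) in rows]

conn = make_db()
print(team_tree(conn, 1))
```

With a single level of hierarchy, the plain OR query gives the same rows and is simpler.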

Related

Is there a way to insert a record in SQL server if it does not match the latest version of the record based on three of the columns?

Consider the following table named UserAttributes:
+----+--------+----------+-----------+
| Id | UserId | AttrName | AttrValue |
+----+--------+----------+-----------+
| 4 | 1 | FavFood | Apples |
| 3 | 2 | FavFood | Burgers |
| 2 | 1 | FavShape | Circle |
| 1 | 1 | FavFood | Chicken |
+----+--------+----------+-----------+
I would like to insert a new record in this table if the latest version of a particular attribute for a user has a value that does not match the latest.
What I mean by the latest is, for example, if I was to do:
SELECT TOP(1) * FROM [UserAttributes] WHERE [UserId] = 1 AND [AttrName] = 'FavFood' ORDER BY [Id] DESC
I will be able to see that user ID 1's current favorite food is "Apples".
Is there a query safe for concurrency that will only insert a new favorite food if it doesn't match the current favorite food for this user?
I tried using the MERGE query with a HOLDLOCK, but the problem is that WHEN MATCHED/WHEN NOT MATCHED, and that works if I never want to insert a new record after a user has previously set their favorite food (in this example) to the new value. However, it does not consider that a user might switch to a new favorite food, then subsequently change back to their old favorite food. I would like to maintain all the changes as a historical record.
In the data set above, I would like to insert a new record if the user ID 1's new favorite food is "Burgers", but I do not want to insert a record if their new favorite food is "Apples" (since that is their current favorite food). I would also like to make this operation safe for concurrency.
Thank you for your help!
EDIT: I should probably also mention that when I split this operation into two queries (ie: first select their current favorite food, then do an insert query only if there is a new food detected) it works under normal conditions. However, we are observing race conditions (and therefore duplicates) since (as you may have guessed) the data set above is simply an example and there are many threads operating on this table at the same time.
A bit ugly, but to do it in one command, you could insert the user's (new) favorite food but filter with an EXCEPT of their current values.
e.g. (assuming the user's new data is in @UserID and @FavFood):
; WITH LatestFavFood AS
(SELECT TOP(1) UserID, AttrName, AttrValue
FROM [UserAttributes]
WHERE [UserId] = @UserID AND [AttrName] = 'FavFood'
ORDER BY [Id] DESC
)
INSERT INTO UserAttributes (UserID, AttrName, AttrValue)
SELECT @UserID, 'FavFood', @FavFood
EXCEPT
SELECT UserID, AttrName, AttrValue
FROM LatestFavFood
Here's a DB_Fiddle with three runs.
EDIT: I have changed the above to assume varchar types for AttrName rather than nvarchar. The fiddle has a mixture. Would be good to ensure you get them correct (especially food as it may have special characters).
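To see the insert-with-EXCEPT idea behave as described, here is a small sketch in Python's sqlite3 module. SQLite uses LIMIT 1 where SQL Server uses TOP(1), and the function name set_attr is an assumption for illustration. It only demonstrates the single-statement logic; it does not exercise concurrent writers, so locking hints or isolation levels on SQL Server would still need separate consideration.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE UserAttributes (
            Id INTEGER PRIMARY KEY AUTOINCREMENT,
            UserId INTEGER, AttrName TEXT, AttrValue TEXT);
        INSERT INTO UserAttributes (Id, UserId, AttrName, AttrValue) VALUES
            (1, 1, 'FavFood', 'Chicken'), (2, 1, 'FavShape', 'Circle'),
            (3, 2, 'FavFood', 'Burgers'), (4, 1, 'FavFood', 'Apples');
    """)
    return conn

def set_attr(conn, user_id, attr, value):
    """Insert a new row unless it equals the user's latest row for that attribute."""
    before = conn.total_changes
    conn.execute(
        """
        WITH latest AS (
            SELECT UserId, AttrName, AttrValue
            FROM UserAttributes
            WHERE UserId = :user AND AttrName = :attr
            ORDER BY Id DESC LIMIT 1
        )
        INSERT INTO UserAttributes (UserId, AttrName, AttrValue)
        SELECT :user, :attr, :value
        EXCEPT
        SELECT UserId, AttrName, AttrValue FROM latest
        """,
        {"user": user_id, "attr": attr, "value": value},
    )
    return conn.total_changes > before  # True when a row was inserted

conn = make_db()
print(set_attr(conn, 1, "FavFood", "Apples"))   # same as current value
print(set_attr(conn, 1, "FavFood", "Burgers"))  # a change
print(set_attr(conn, 1, "FavFood", "Apples"))   # switching back is a change too
```

Because only the latest row is compared, switching back to an earlier value still produces a new history row, which is the behaviour the question asks for.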

Search single entity which is presented by multiple rows SQL

I have a User table in which the same user can be represented by several different rows. For example
User Table
==========================
id name
1 John Doe
2 Doe, John
3 Nicholas Cage
4 BlackRiderXXX
5 Nicholas cage
where the users "John Doe", "Doe, John" and "BlackRiderXXX" are the same person. Likewise, "Nicholas Cage" and "Nicholas cage" are the same person. Other tables refer to user.id arbitrarily, depending on which user row performed the action.
For Action table it'll look like
Action Table
==========================
id user_id some_other_stuff
1 1 ...
2 2 ...
3 1 ...
4 4 ...
5 3 ...
Where the actions 1,2,3,4 are all done by John Doe.
I'll have these users merged by the user manually meaning we'd know who is whom. They'd also select which User is the one they'd like to be as their main user account so we need to know this information as well.
I'm simplifying a bit, but I have a dozen tables like the Action table above. We have mainly two use cases for querying:
1) Find actions which are done by user X (which should check all the users entities belonging to user X)
2) Find actions and group unique users
Main point is we will be using it everywhere around the codebase on 100+ queries so we want to design it well. How can I construct a system where the query will be simple enough also powerful enough to handle different querying ways?
Thanks
PS: We are using PostgreSQL
Why not include the "main" user in the first table?
User Table
id  name           main_user_id
1   John Doe       1
2   Doe, John      1
3   Nicholas Cage  3
4   BlackRiderXXX  1
5   Nicholas cage  3
Then you would join on:
select . . .
from actions a join
users u
on a.user_id = u.id
where u.main_user_id = 1;
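A short sketch of both use cases from the question against this layout, using Python's sqlite3 module. It assumes main_user_id points at a row in the same users table, so the two Nicholas Cage rows are given main_user_id 3 here; the function names are hypothetical.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (
            id INTEGER PRIMARY KEY, name TEXT,
            main_user_id INTEGER REFERENCES users (id));
        INSERT INTO users VALUES
            (1, 'John Doe', 1), (2, 'Doe, John', 1), (3, 'Nicholas Cage', 3),
            (4, 'BlackRiderXXX', 1), (5, 'Nicholas cage', 3);
        CREATE TABLE actions (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users (id));
        INSERT INTO actions VALUES (1, 1), (2, 2), (3, 1), (4, 4), (5, 3);
    """)
    return conn

def actions_by_main_user(conn, main_user_id):
    """Use case 1: all actions done by any row belonging to one real user."""
    rows = conn.execute(
        "SELECT a.id FROM actions a JOIN users u ON a.user_id = u.id "
        "WHERE u.main_user_id = ? ORDER BY a.id",
        (main_user_id,),
    ).fetchall()
    return [action_id for (action_id,) in rows]

def action_counts(conn):
    """Use case 2: group actions by real user rather than by user row."""
    return conn.execute(
        "SELECT u.main_user_id, COUNT(*) FROM actions a "
        "JOIN users u ON a.user_id = u.id "
        "GROUP BY u.main_user_id ORDER BY u.main_user_id"
    ).fetchall()

conn = make_db()
print(actions_by_main_user(conn, 1))
print(action_counts(conn))
```

Since every query goes through the same join on users, it could be wrapped in a view so the 100+ call sites do not each repeat it.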
If you want this selectable per end user, then use a different table:
create table end_user_users (
end_user_users_id serial primary key,
end_user_id int references end_users (end_user_id),
end_user_user_id int references users (id),
end_user_main_user_id int references users (id)
);
Then the query would look like:
select . . .
from actions a join
end_user_users euu
on euu.end_user_user_id = a.user_id and
euu.end_user_id = $my_id
where euu.end_user_main_user_id = 1;
You can use the regexp_replace(), initcap() and trim() functions to normalize the names into a common form to group by, and then generate a shared user_id for each distinct normalized name:
with new_action0 as
(
select u.id as id,
case when strpos(u.name,',') > 0 then
initcap(trim(regexp_replace(trim(u.name),'(.*),(.*)','\2 \1')))
else
case when lower(trim(u.name))='blackriderxxx' then
'John Doe'
else
trim(initcap(u.name))
end
end as name
from users u
)
select n.id, dense_rank() over (order by n.name) as user_id
from new_action0 n;
A new, clean user table can be created from this query with a create table .. as statement.

select users with same social security number different badge numbers

Hello, as the title suggests, I need help writing a query that does this. I need to find all the users who have had a badge number change. In the database there are often two records for the same person, each with a different badge number. I'm assuming it's the same person if the social security number matches.
Table:
Badge_no | SSN
123123 | 387-47-1234 2
34837 | 387-47-1234
837532 | 543-45-6392
584391 | 543-45-6392
In this case I would want it to output:
837532 | 543-45-6392
584391 | 543-45-6392
Thank you!
I believe the following should do the trick here:
SELECT *
FROM yourtable
WHERE SSN IN (SELECT SSN FROM yourtable GROUP BY SSN HAVING Count(*) >=2);
That subquery returns the SSNs that have more than one record. We then use those SSNs to select from the table again to get all of the fields associated with them.
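A runnable sketch of this query in Python's sqlite3 module. The table name badges and all rows other than the two 543-45-6392 records are made up for illustration. It uses COUNT(DISTINCT Badge_no) rather than COUNT(*), on the assumption that an exact repeat of the same badge/SSN pair should not count as a badge change.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE badges (Badge_no INTEGER, SSN TEXT);
        INSERT INTO badges VALUES
            (837532, '543-45-6392'), (584391, '543-45-6392'),
            (111111, '111-22-3333'),                           -- hypothetical: one badge only
            (222222, '222-33-4444'), (222222, '222-33-4444');  -- hypothetical: exact repeat
    """)
    return conn

def changed_badges(conn):
    """Rows whose SSN appears with at least two different badge numbers."""
    return conn.execute(
        "SELECT Badge_no, SSN FROM badges "
        "WHERE SSN IN (SELECT SSN FROM badges "
        "              GROUP BY SSN HAVING COUNT(DISTINCT Badge_no) >= 2) "
        "ORDER BY Badge_no"
    ).fetchall()

conn = make_db()
print(changed_badges(conn))
```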

Group by non-scalar value

Given a one-to-many relationship between Person and Item
Person Item
------- ------
Id <-----. Id
Name `---- PersonId
Label
Where there are many people and Item.Label takes few distinct values, it might make sense to adopt an equivalent schema:
Person List Item
-------- ------ ------
Id .--> Id <--. Id
ListId --` `-- ListId
Name Label
That way many people can share the same list.
The migration from second schema to the first is trivial. My question is, how to migrate from the first schema to the second?
The challenge is to pick exactly one representative Person for each possible outcome of
SELECT Label FROM Item WHERE PersonId = ?
I was able to solve the problem by using FOR XML present in MS SQL server. That is,
SELECT P.Id, (SELECT Label FROM Item WHERE PersonId = P.Id FOR XML) list
FROM Person P
and then simply SELECT MIN(P.Id) FROM ... GROUP BY list to collect representatives. I'm unsatisfied with this workaround though and wish to find a more pure solution.
edit:
SELECT p.Id, q.Id FROM Person p, Person q
WHERE NOT EXISTS ( --symmetric difference between
(SELECT Label FROM Item WHERE PersonId = p.Id) --and
(SELECT Label FROM Item WHERE PersonId = q.id))
Should be the equivalence relation of Persons, for which representatives need to be found. I still wouldn't know how to finish, and this does seem rather inefficient.
It depends! I suggest keeping your model close to your business logic.
If people own pre-made sets of items, it makes sense to create a table to hold that logic.
Consider a case where people can own just the "home edition", "pro edition" or "std edition".
It makes sense to create a relational table Edition_Items so that an edition can contain items (A,B), (A,B,C,D) or (A,C), for example.
You can then make a relational table between People and the Edition each one owns. In your scenario, if those editions are "customized" editions, then even if two of them contain the same set of items you can consider them different sets (simply because they are owned by different people).
So that "Assembled Set" table can be used as a relational table between people and items.
Edit:
OP's comment reinforces my last statement.
So your "List" table can be a relational table between People and items.
|People | |List| |List_Item| |Item|
|-------| |----| |---------| |----|
|P1, L1 | | L1 | | L1, I1 | |I1 |
|P2, L2 | | L2 | | L1, I2 | |I2 |
| L3 | | L2, I1 | |I3 |
| L4 | | L2, I1 |
Seeing it, you can ask: why keep a List table? It is useful if the List has properties of its own, such as isDeleted, Description, CreateTime, etc.
The final question is: do we put a reference to the list on people, a reference to people on the list, or create another relational table?
It depends on:
1) Is People-List a 1-1 relation?
2) Who comes first? (a chicken-and-egg problem?)
It is usually better to ask: which one can exist without the other?
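For the migration the question asks about (choosing one representative Person per distinct set of labels), one option is to do the grouping outside SQL. The sketch below is plain Python and makes two assumptions: a person's labels are treated as a set, so repeated labels are ignored, and people with no items are left out. The function name and return shape are hypothetical.

```python
from collections import defaultdict

def migrate(items):
    """items: iterable of (person_id, label) rows from the first schema.

    Returns (person_to_list, list_items):
      person_to_list maps each person id to the ListId it should reference,
      list_items maps each ListId to its sorted labels.
    The ListId of a group is the smallest person id that has that label set.
    """
    labels_by_person = defaultdict(set)
    for person_id, label in items:
        labels_by_person[person_id].add(label)

    # Visit people in id order so the first one seen per label set is the minimum.
    representative = {}
    for person_id in sorted(labels_by_person):
        signature = frozenset(labels_by_person[person_id])
        representative.setdefault(signature, person_id)

    person_to_list = {
        person_id: representative[frozenset(labels)]
        for person_id, labels in labels_by_person.items()
    }
    list_items = {rep: sorted(sig) for sig, rep in representative.items()}
    return person_to_list, list_items

rows = [(1, "a"), (1, "b"), (2, "b"), (2, "a"), (3, "a")]
person_to_list, list_items = migrate(rows)
print(person_to_list)
print(list_items)
```

This plays the same role as grouping on the FOR XML string in the question, with a frozenset standing in for the serialized list.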

MySQL duplicates -- how to specify when two records actually AREN'T duplicates?

I have an interesting problem, and my logic isn't up to the task.
We have a table that sometimes develops duplicate records (for process reasons, and this is unavoidable). Take the following example:
id  FirstName  LastName  PhoneNumber   email
--  ---------  --------  ------------  ----------------
1   John       Doe       123-555-1234  jdoe@gmail.com
2   Jane       Smith     123-555-1111  jsmith@foo.com
3   John       Doe       123-555-4321  jdoe@yahoo.com
4   Bob        Jones     123-555-5555  bob@bar.com
5   John       Doe       123-555-0000  jdoe@hotmail.com
6   Mike       Roberts   123-555-9999  roberts@baz.com
7   John       Doe       123-555-1717  wally@domain.com
We find the duplicates this way:
SELECT c1.*
FROM `clients` c1
INNER JOIN (
SELECT `FirstName`, `LastName`, COUNT(*)
FROM `clients`
GROUP BY `FirstName`, `LastName`
HAVING COUNT(*) > 1
) AS c2
ON c1.`FirstName` = c2.`FirstName`
AND c1.`LastName` = c2.`LastName`
This generates the following list of duplicates:
id  FirstName  LastName  PhoneNumber   email
--  ---------  --------  ------------  ----------------
1   John       Doe       123-555-1234  jdoe@gmail.com
3   John       Doe       123-555-4321  jdoe@yahoo.com
5   John       Doe       123-555-0000  jdoe@hotmail.com
7   John       Doe       123-555-1717  wally@domain.com
As you can see, based on FirstName and LastName, all of the records are duplicates.
At this point, we actually make a phone call to the client to clear up potential duplicates.
After doing so, we learn (for example) that records 1 and 3 are real duplicates, but records 5 and 7 are actually two different people altogether.
So we merge any extraneously linked data from records 1 and 3 into record 1, remove record 3, and leave records 5 and 7 alone.
Now here's where the problem comes in:
The next time we re-run the "duplicates" query, it will contain the following rows:
id  FirstName  LastName  PhoneNumber   email
--  ---------  --------  ------------  ----------------
1   John       Doe       123-555-4321  jdoe@gmail.com
5   John       Doe       123-555-0000  jdoe@hotmail.com
7   John       Doe       123-555-1717  wally@domain.com
They all appear to be duplicates, even though we've previously recognized that they aren't.
How would you go about identifying that these records aren't duplicates?
My first thought is to build a lookup table identifying which records aren't duplicates of each other (for example, {1,5},{1,7},{5,7}), but I have no idea how to build a query that would be able to use this data.
Further, if another duplicate record shows up, it may be a duplicate of 1, 5, or 7, so we would need them all to show back up in the duplicates list so the customer service person can call the person in the new record to find out which record he may be a duplicate of.
I'm stretched to the limit trying to understand this. Any brilliant geniuses out there that would care to take a crack at this?
Interesting problem. Here's my crack at it.
How about if we approach the problem from a slightly different perspective.
Consider that the system is clean for a start i.e all records currently in the system are either with Unique First + Last name combinations OR the same first + last name ones have already been manually confirmed to be different people.
At the point of entering a NEW user in the system, we have an additional check. Can be implemented as an INSERT Trigger or just another procedure called after the insert is successfully done.
This Trigger / Procedure matches the FIRST + LAST name combination of the "Inserted" record with all existing records in the table.
For all the matching First + Last names, it will create an entry in a matching table (new table) with NewUserID, ExistingMatchingRecordsUserID
From an SQL perspective,
TABLE MatchingTable
COLUMNS 1. NewUserID 2. ExistingMatchingRecordsUserID
Constraint : Logical PK = NewUserID + ExistingMatchingRecordsUserID
-- 'NewUserId' stands for the id of the newly inserted user
INSERT INTO MatchingTable (NewUserID, ExistingMatchingRecordsUserID)
SELECT 'NewUserId', u.userId FROM User u WHERE u.firstName = 'John' AND u.LastName = 'Doe'
All entries in MatchingTable need resolution.
When say an Admin logs into the system, the admin sees the list of all entries in MatchingTable
eg: New User John Doe - (ID 345) - 3 Potential matches John Doe - ID 123 ID 231 / ID 256
The admin will check up data for 345 against data in 123 / 231 and 256 and manually confirm if duplicate of ANY / None
If Duplicate, 345 is deleted from User Table (soft / hard delete - whatever suits you)
If NOT, the entries for ID 345 are just removed from MatchingTable (I would go with hard deletes here, as this is like a transactional temp table, but again anything is fine).
Additionally, when entries for ID 345 are removed from MatchingTable, all other entries in MatchingTable where ExistingMatchingRecordsUserID = 345 are automatically removed, so that data that has already been verified does not need to be checked manually again.
Again, this could be a potential DELETE trigger / Just logic executed additionally on DELETE of MatchingTable. The implementation is subject to preference.
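The workflow above can be sketched end to end with Python's sqlite3 module. Here the "trigger" is simply a second statement run after the insert, and the table, column, and function names are assumptions for illustration rather than a prescribed schema.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        -- already verified as three different people
        INSERT INTO users VALUES
            (123, 'John', 'Doe'), (231, 'John', 'Doe'), (256, 'John', 'Doe');
        CREATE TABLE matching (
            new_user_id INTEGER, existing_user_id INTEGER,
            PRIMARY KEY (new_user_id, existing_user_id));
    """)
    return conn

def add_user(conn, user_id, first, last):
    """Insert a user, then record every existing user with the same name."""
    conn.execute("INSERT INTO users VALUES (?, ?, ?)", (user_id, first, last))
    conn.execute(
        "INSERT INTO matching (new_user_id, existing_user_id) "
        "SELECT ?, id FROM users "
        "WHERE first_name = ? AND last_name = ? AND id <> ?",
        (user_id, first, last, user_id),
    )

def resolve_not_duplicate(conn, user_id):
    """Admin confirmed the user is a distinct person: clear all pending pairs."""
    conn.execute(
        "DELETE FROM matching WHERE new_user_id = ? OR existing_user_id = ?",
        (user_id, user_id),
    )

def pending(conn):
    return conn.execute(
        "SELECT new_user_id, existing_user_id FROM matching ORDER BY 1, 2"
    ).fetchall()

conn = make_db()
add_user(conn, 345, "John", "Doe")
print(pending(conn))          # 345 paired with each existing John Doe
resolve_not_duplicate(conn, 345)
print(pending(conn))          # nothing left to review
```

A later John Doe would be paired with all four existing rows, including 345, so the reviewer still sees every candidate.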
At the expense of adding a single byte per row to your table, you could add a manually_verified BOOL column, with a default of FALSE. Set it to TRUE if you have manually verified the data. Then you can simply query where manually_verified = FALSE.
It's simple, effective, and matches what is actually happening in the business processes: you manually verify the data.
If you want to go a step further, you might want to store when the row was verified and who verified it. Since this might be annoying to store in the main table, you could certainly store it in a separate table, and LEFT JOIN in the verification data. You could even create a view to recreate the appearance of a single master table.
To solve the problem of a new duplicate being added: you would check non-verified data against the entire data set. So that means your main table, c1, would have the condition manually_verified = FALSE, but your INNER JOINed table, c2, does not. This way, the unverified data will still find all potential duplicate matches:
SELECT * FROM table t1
INNER JOIN table t2 ON t1.name = t2.name AND t1.id <> t2.id
WHERE t1.manually_verified = FALSE
The possible matches for the duplicates will be in the joined table.
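As a rough illustration of the flag approach, the sketch below uses Python's sqlite3 module with the clients table from the question, after records 1, 5 and 7 have been verified as different people. The new record with id 8 and the function name are hypothetical, and the join uses FirstName and LastName in place of the single name column in the query above.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clients (
            id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT,
            manually_verified INTEGER NOT NULL DEFAULT 0);
        INSERT INTO clients VALUES
            (1, 'John', 'Doe', 1), (2, 'Jane', 'Smith', 1), (4, 'Bob', 'Jones', 1),
            (5, 'John', 'Doe', 1), (6, 'Mike', 'Roberts', 1), (7, 'John', 'Doe', 1);
    """)
    return conn

def pending_duplicates(conn):
    """Pairs (unverified id, candidate id): only c1 is restricted to unverified rows."""
    return conn.execute(
        "SELECT c1.id, c2.id FROM clients c1 "
        "INNER JOIN clients c2 "
        "  ON c1.FirstName = c2.FirstName AND c1.LastName = c2.LastName "
        " AND c1.id <> c2.id "
        "WHERE c1.manually_verified = 0 "
        "ORDER BY c1.id, c2.id"
    ).fetchall()

conn = make_db()
print(pending_duplicates(conn))   # everything verified, nothing to review
conn.execute("INSERT INTO clients (id, FirstName, LastName) VALUES (8, 'John', 'Doe')")
print(pending_duplicates(conn))   # the new row is matched against 1, 5 and 7
```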