Constraint across multiple columns based on row value - sql

I need help creating constraints to prevent the following conditions in a SQL Server Database based on the table below.
1) If the owns flag is set for a given ID, a new row can not be added for that ID with the owns flag set.
2) An id can not be own by different ownerNames. (This is independent of the first case, say we allow the own flag to be set more then once for a ID). John cannot own ID 123 because David owns it, but we can have two records saying David owns ID 123.
Owns Id OwnerName
==============================
1 123 David
1 123 John
0 123 Alexis
0 254 Brandon
1 956 Rod

Related

How to structure DBT tables with cyclical dependencies

I have one table containing my members.
customer_id
name
age
1
John
74
2
Sarah
87
Everyday, I get a new table containing the current members.
If a new member has joined, I want to add them.
If a member has left, I want to nullify their name/id
If a current member is still a member then I want to keep them as is.
Imagine that I get a new upload with the following rows
customer_id
name
age
2
Sarah
87
3
Melvin
23
I then want to generate the table
customer_id
name
age
Null
Null
74
2
Sarah
87
3
Melvin
23
I don't want to nullify anything by mistake and therefore I want to run a few tests on this table before I replace my old one. The way I've done this is by creating a temporary table (let's call it customer_temp). However, I've now created a cyclical dependency since I:
Need to read the live table customer in order to create the customer_temp
Need to replace the live table customer with customer_temp after I've run my tests
Is there anyway I can do this using dbt?
Destroying data is tricky. I would avoid that unless it's necessary (e.g., DSAR compliance).
Assuming the new data is loaded into the same table in your database each day, I think this is a perfect candidate for snapshots, with the option for invalidating hard-deleted records. See the docs. This allows you to capture the change history of a table without any data loss.
If you turned on snapshots in the initial state, your snapshot table would look like (assuming the existing records had a timestamp of 1/1):
customer_id
name
age
valid_from
valid_to
1
John
74
1/1/2022
2
Sarah
87
1/1/2022
Then, after the source table was updated, re-running dbt snapshot (today) would create this table:
customer_id
name
age
valid_from
valid_to
1
John
74
1/1/2022
5/12/2022
2
Sarah
87
1/1/2022
3
Melvin
23
5/12/2022
You can create the format you'd like with a simple query:
select
case when valid_to is null then customer_id else null end as customer_id,
case when valid_to is null then name else null end as name,
age
from {{ ref('my_snapshot') }}

Database schema for Sales Commissions

I'm trying to create a database with table titles which contains different titles, code(short code for the name) and commission of that title on other titles for instance.
I have a table named Title
Id Name Code CommissionOnA CommissionOnEng
1 Admin A 0 15
2 Engineer Eng 1 0
Now Is it good to have table schema like this, as the titles will change and can be inserted, updated or deleted dynamically. So with my current approach I have to alter table and add another column to it, in order to add commission for new title.
Is there any better way to do it, considering in mind that this also support multilevel sale heirarchy. Schema for any database is fine, but for MySql is preferred.
The Scenerio is, that the form where user creates a new title, dynamically renders all the titles that exist in the table with the textbox, so that when user creates a new title, he should be able to add commissions corresponding to other titles for the new title.
for instance if user creates a new Title name "Consultant" with code "c", he should see textboxes for Admin, Engineer, so that when user saves it, a row in the table gets created which has following data
Id Name Code CommissionOnA CommissionOnEng CommissionOnC
1 Admin A 0 15 0
2 Engineer Eng 1 0 0
3 Consultant C 12 5 0
Now I have another table called Employees
Id Name Title ManagerId
1 Rob 1 Null
2 Kate 2 1
3 Eli 3 2
4 Al 2 3
Now when Ido recursion, each time a junior get sale, a commission should be transfered to his manager as well as manager of his manager based on the commission specified in the title table.
So, when Al sells something, than Eli should get commission of 5 as, title of Eli is Consultant and Eli is boss of Al, so Employee with title Consultant(3) get commission of 5, if Employee with title Engineer(2) sells something.
It's better to normalise your table schemas so you don't need to add new columns instead put those related columns into their own table and then join these records via a foreign key.
For example, create a new table named commissions, then have a column for its unique ID, the ID that relates to the titles table and the commission amount:
commissions
----------------------------
id (INT, NOT NULL, Primary Key)
titles_id (INT, NOT NULL)
amount (INT, NOT NULL, DEFAULT=0)
and the data would look like:
id titles_id amount
1 1 15
2 2 1

How do I make a query for if value exists in row add a value to another field?

I have a database on access and I want to add a value to a column at the end of each row based on which hospital they are in. This is a separate value. For example - the hospital called "St. James Hospital" has the id of "3" in a separate field. How do I do this using a query rather than manually going through a whole database?
example here
Not the best solution, but you can do something like this:
create table new_table as
select id, case when hospital="St. James Hospital" then 3 else null
from old_table
Or, the better option would be to create a table with the columns hospital_name and hospital_id. You can then create a foreign key relationship that will create the mapping for you, and enforce data integrity. A join across the two tables will produce what you want.
Read about this here:
http://net.tutsplus.com/tutorials/databases/sql-for-beginners-part-3-database-relationships/
The answer to your question is a JOIN+UPDATE. I am fairly sure if you looked up you would find the below link.
Access DB update one table with value from another
You could do this:
update yourTable
set yourFinalColumnWhateverItsNameIs = {your desired value}
where someColumn = 3
Every row in the table that has a 3 in the someColumn column will then have that final column set to your desired value.
If this isn't what you want, please make your question clearer. Are you trying to put the name of the hospital into this table? If so, that is not a good idea and there are better ways to accomplish that.
Furthermore, if every row with a certain value (3) gets this value, you could simply add it to the other (i.e. Hospitals) table. No need to repeat it everywhere in the table that points back to the Hospitals table.
P.S. Here's an example of what I meant:
Let's say you have two tables
HOSPITALS
id
name
city
state
BIRTHS
id
hospitalid
babysname
gender
mothersname
fathername
You could get a baby's city of birth without having to include the City column in the Births table, simply by joining the tables on hospitals.id = births.hospitalid.
After examining your ACCDB file, I suggest you consider setting up the tables differently.
Table Health_Professionals:
ID First Name Second Name Position hospital_id
1 John Doe PI 2
2 Joe Smith Co-PI 1
3 Sarah Johnson Nurse 3
Table Hospitals:
hospital_id Hospital
1 Beaumont
2 St James
3 Letterkenny Hosptial
A key point is to avoid storing both the hospital ID and name in the Health_Professionals table. Store only the ID. When you need to see the name, use the hospital ID to join with the Hospitals table and get the name from there.
A useful side effect of this design is that if anyone ever misspells a hospital name, eg "Hosptial", you need correct that error in only one place. Same holds true whenever a hospital is intentionally renamed.
Based on those tables, the query below returns this result set.
ID Second Name First Name Position hospital_id Hospital
1 Doe John PI 2 St James
3 Johnson Sarah Nurse 3 Letterkenny Hosptial
2 Smith Joe Co-PI 1 Beaumont
SELECT
hp.ID,
hp.[Second Name],
hp.[First Name],
hp.Position,
hp.hospital_id,
h.Hospital
FROM
Health_Professionals AS hp
INNER JOIN Hospitals AS h
ON hp.hospital_id = h.hospital_id
ORDER BY
hp.[Second Name],
hp.[First Name];

How to change values of foreign keys in postgresql?

Let's say I have two tables: Customer and City. There are many Customers that live in the same City. The cities have an uid that is primary key. The customers have a foreign key reference to their respective city via Customer.city_uid.
I have to swap two City.uids with one another for external reasons. But the customers should stay attached to their cities. Therefore it is necessary to swap the Customer.city_uids as well. So I thought I first swap the City.uids and then change the Customer.city_uids accordingliy via an UPDATE-statement. Unfortunately, I can not do that since these uids are referenced from the Customer-table and PostgreSQL prevents me from doing that.
Is there an easy way of swapping the two City.uids with one another as well as the Customer.city_uids?
One solution could be:
BEGIN;
1. Drop foreign key
2. Make update
3. Create foreign key
COMMIT;
Or:
BEGIN;
1. Insert "new" correct information
2. Remove outdated information
COMMIT;
My instinct is to recommend not trying to change the city table's id field. But there is lot of information missing here. So it really is a feeling rather than a definitive point of view.
Instead, I would swap the values in the other fields of the city table. For example, change the name of city1 to city2's name, and vice-versa.
For example:
OLD TABLE NEW TABLE
id | name | population id | name | population
------------------------- -------------------------
1 | ABerg | 123456 1 | BBerg | 654321
2 | BBerg | 654321 2 | ABerg | 123456
3 | CBerg | 333333 3 | CBerg | 333333
(The ID was not touched, but the other values were swapped. Functionally the same as swapping the IDs, but with 'softer touch' queries that don't need to make any changes to table constraints, etc.)
Then, in your associated tables, you can do...
UPDATE
Customer
SET
city_uid = CASE WHEN city_uid = 1 THEN 2 ELSE 1 END
WHERE
city_uid IN (1,2)
But then, do you have other tables that reference city_uid? And if so, is it feasible for you to repeat that update on all those tables?
You could create two temporary cities.
You would have:
City 1
City 2
City Temp 1
City Temp 2
Then, you could do the follow:
Update all Customer UIDs from City 1 to City Temp 1.
Update all Customer UIDs from City 2 to City Temp 2.
Swap City 1 and 2 UIDs
Move all Customers back from City Temp 1 to City 1.
Move all Customers back from City Temp 2 to City 2.
Delete the temporally cities.
You can also add an ON UPDATE CASCADE clause to the parent table's CREATE TABLE statement, as described here:
How to do a cascading update?

MySQL duplicates -- how to specify when two records actually AREN'T duplicates?

I have an interesting problem, and my logic isn't up to the task.
We have a table with that sometimes develops duplicate records (for process reasons, and this is unavoidable). Take the following example:
id FirstName LastName PhoneNumber email
-- --------- -------- ------------ --------------
1 John Doe 123-555-1234 jdoe#gmail.com
2 Jane Smith 123-555-1111 jsmith#foo.com
3 John Doe 123-555-4321 jdoe#yahoo.com
4 Bob Jones 123-555-5555 bob#bar.com
5 John Doe 123-555-0000 jdoe#hotmail.com
6 Mike Roberts 123-555-9999 roberts#baz.com
7 John Doe 123-555-1717 wally#domain.com
We find the duplicates this way:
SELECT c1.*
FROM `clients` c1
INNER JOIN (
SELECT `FirstName`, `LastName`, COUNT(*)
FROM `clients`
GROUP BY `FirstName`, `LastName`
HAVING COUNT(*) > 1
) AS c2
ON c1.`FirstName` = c2.`FirstName`
AND c1.`LastName` = c2.`LastName`
This generates the following list of duplicates:
id FirstName LastName PhoneNumber email
-- --------- -------- ------------ --------------
1 John Doe 123-555-1234 jdoe#gmail.com
3 John Doe 123-555-4321 jdoe#yahoo.com
5 John Doe 123-555-0000 jdoe#hotmail.com
7 John Doe 123-555-1717 wally#domain.com
As you can see, based on FirstName and LastName, all of the records are duplicates.
At this point, we actually make a phone call to the client to clear up potential duplicates.
After doing so, we learn (for example) that records 1 and 3 are real duplicates, but records 5 and 7 are actually two different people altogether.
So we merge any extraneously linked data from records 1 and 3 into record 1, remove record 3, and leave records 5 and 7 alone.
Now here's were the problem comes in:
The next time we re-run the "duplicates" query, it will contain the following rows:
id FirstName LastName PhoneNumber email
-- --------- -------- ------------ --------------
1 John Doe 123-555-4321 jdoe#gmail.com
5 John Doe 123-555-0000 jdoe#hotmail.com
7 John Doe 123-555-1717 wally#domain.com
They all appear to be duplicates, even though we've previously recognized that they aren't.
How would you go about identifying that these records aren't duplicates?
My first though it to build a lookup table identifying which records aren't duplicates of each other (for example, {1,5},{1,7},{5,7}), but I have no idea how to build a query that would be able to use this data.
Further, if another duplicate record shows up, it may be a duplicate of 1, 5, or 7, so we would need them all to show back up in the duplicates list so the customer service person can call the person in the new record to find out which record he may be a duplicate of.
I'm stretched to the limit trying to understand this. Any brilliant geniuses out there that would care to take a crack at this?
Interesting problem. Here's my crack at it.
How about if we approach the problem from a slightly different perspective.
Consider that the system is clean for a start i.e all records currently in the system are either with Unique First + Last name combinations OR the same first + last name ones have already been manually confirmed to be different people.
At the point of entering a NEW user in the system, we have an additional check. Can be implemented as an INSERT Trigger or just another procedure called after the insert is successfully done.
This Trigger / Procedure matches the
FIRST + LAST name combination of
"Inserted"record with all existing
records in the table.
For all the matching First + Last names, it will create an entry in a matching table (new table) with NewUserID, ExistingMatchingRecordsUserID
From an SQL perspective,
TABLE MatchingTable
COLUMNS 1. NewUserID 2. ExistingUserID
Constraint : Logical PK = NewUserID + ExistingMatchingRecordsUserID
INSERT INTO MATCHINGTABLE VALUES ('NewUserId', userId)
SELECT userId FROM User u where u.firstName = 'John' and u.LastName = 'Doe'
All entries in MatchingTable need resolution.
When say an Admin logs into the system, the admin sees the list of all entries in MatchingTable
eg: New User John Doe - (ID 345) - 3 Potential matches John Doe - ID 123 ID 231 / ID 256
The admin will check up data for 345 against data in 123 / 231 and 256 and manually confirm if duplicate of ANY / None
If Duplicate, 345 is deleted from User Table (soft / hard delete - whatever suits you)
If NOT, the entries for ID 354 are just removed from MatchingTable (i would go with hard deletes here as this is like a transactional temp table but again anything is fine).
Additionally, when entries for ID 354 are removed from MatchingTable, all other entries in MatchingTable where ExistingMatchingRecordsUserID = 354 are automatically removed to ensure that unnecessary manual verification for already verified data is not needed.
Again, this could be a potential DELETE trigger / Just logic executed additionally on DELETE of MatchingTable. The implementation is subject to preference.
At the expense of adding a single byte per row to your table, you could add a manually_verified BOOL column, with a default of FALSE. Set it to TRUE if you have manually verified the data. Then you can simply query where manually_verified = FALSE.
It's simple, effective, and matches what is actually happening in the business processes: you manually verify the data.
If you want to go a step further, you might want to store when the row was verified and who verified it. Since this might be annoying to store in the main table, you could certainly store it in a separate table, and LEFT JOIN in the verification data. You could even create a view to recreate the appearance of a single master table.
To solve the problem of a new duplicate being added: you would check non-verified data against the entire data set. So that means your main table, c1, would have the condition manually_verified = FALSE, but your INNER JOINed table, c2, does not. This way, the unverified data will still find all potential duplicate matches:
SELECT * FROM table t1
INNER JOIN table t2 ON t1.name = t2.name AND t1.id <> t2.id
WHERE t1.manually_verified = FALSE
The possible matches for the duplicates will be in the joined table.