How to structure DBT tables with cyclical dependencies - sql

I have one table containing my members.
customer_id  name   age
1            John   74
2            Sarah  87
Every day, I get a new table containing the current members.
If a new member has joined, I want to add them.
If a member has left, I want to nullify their name and ID.
If a current member is still a member then I want to keep them as is.
Imagine that I get a new upload with the following rows:
customer_id  name    age
2            Sarah   87
3            Melvin  23
I then want to generate the table:
customer_id  name    age
Null         Null    74
2            Sarah   87
3            Melvin  23
I don't want to nullify anything by mistake and therefore I want to run a few tests on this table before I replace my old one. The way I've done this is by creating a temporary table (let's call it customer_temp). However, I've now created a cyclical dependency since I:
Need to read the live table customer in order to create the customer_temp
Need to replace the live table customer with customer_temp after I've run my tests
Is there any way I can do this using dbt?

Destroying data is tricky. I would avoid that unless it's necessary (e.g., DSAR compliance).
Assuming the new data is loaded into the same table in your database each day, I think this is a perfect candidate for snapshots, with the option for invalidating hard-deleted records (invalidate_hard_deletes in the snapshot config, at the time of writing). See the docs. This allows you to capture the change history of a table without any data loss.
If you turned on snapshots in the initial state, your snapshot table would look like this (assuming the existing records had a timestamp of 1/1):
customer_id  name   age  valid_from  valid_to
1            John   74   1/1/2022
2            Sarah  87   1/1/2022
Then, after the source table was updated, re-running dbt snapshot (today) would create this table:
customer_id  name    age  valid_from  valid_to
1            John    74   1/1/2022    5/12/2022
2            Sarah   87   1/1/2022
3            Melvin  23   5/12/2022
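You can create the format you'd like with a simple query (in a real project the snapshot columns carry dbt's default names, dbt_valid_from and dbt_valid_to, which the tables above abbreviate):
select
    case when dbt_valid_to is null then customer_id else null end as customer_id,
    case when dbt_valid_to is null then name else null end as name,
    age
from {{ ref('my_snapshot') }}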
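To see why this gives the table the question asks for, here is a rough simulation of the idea in Python with the standard-library sqlite3 module. It is only a sketch, not dbt's implementation: the table names snapshot and upload and the helper apply_upload are made up for illustration, the columns use the shortened valid_from/valid_to names, and it only handles new members and hard deletes (not changed attributes).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE snapshot (
        customer_id INTEGER, name TEXT, age INTEGER,
        valid_from TEXT, valid_to TEXT
    );
    INSERT INTO snapshot VALUES
        (1, 'John',  74, '2022-01-01', NULL),
        (2, 'Sarah', 87, '2022-01-01', NULL);

    -- Today's upload: John has left, Melvin has joined.
    CREATE TABLE upload (customer_id INTEGER, name TEXT, age INTEGER);
    INSERT INTO upload VALUES (2, 'Sarah', 87), (3, 'Melvin', 23);
""")


def apply_upload(conn, today):
    """Hypothetical helper: mimic a snapshot run that invalidates hard deletes."""
    with conn:
        # Close out current rows whose customer is missing from the upload.
        conn.execute(
            """
            UPDATE snapshot SET valid_to = ?
            WHERE valid_to IS NULL
              AND customer_id NOT IN (SELECT customer_id FROM upload)
            """,
            (today,),
        )
        # Open a new row for customers that have no current row yet.
        conn.execute(
            """
            INSERT INTO snapshot
            SELECT u.customer_id, u.name, u.age, ?, NULL
            FROM upload u
            WHERE u.customer_id NOT IN (
                SELECT customer_id FROM snapshot WHERE valid_to IS NULL
            )
            """,
            (today,),
        )


apply_upload(conn, "2022-05-12")

# Same shape as the query above: blank out id and name for closed rows.
members = conn.execute("""
    SELECT CASE WHEN valid_to IS NULL THEN customer_id END AS customer_id,
           CASE WHEN valid_to IS NULL THEN name END AS name,
           age
    FROM snapshot
    ORDER BY rowid
""").fetchall()
print(members)
```

Because nothing is ever overwritten, a bad upload only closes rows by setting valid_to, and the original values can still be recovered from the snapshot.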

Related

insert data into tables where ids need to be equal

I have two tables, customer and order, each with two records for 2020. The values marked with *** are what I want to add for FY 2021.
Customer:
ID   FY    Name
1    2020  Tina Smith
2    2020  Bobby Brown
134  2021  Tina Smith***
234  2021  Bobby Brown***
Order:
ID   2digitFY  Food     Drink
1    20        Hot Dog  Water
2    20        Burger   Soda
134  21        Hot Dog  Water***
234  21        Burger   Soda***
I want to add records to both tables that repeat the FY 2020/20 data with new sequence numbers and the year 2021/21 (the starred rows above). I can't figure out how I would make the new ids equal when they auto generate. Below is code similar to what I have set up (fake data used above).
insert into customer (id, fy, name)
    select id, '2021', name
    from customer
    where fy = '2020';
insert into order (id, 2digitFY, food, drink)
    select id, '21', food, drink
    from order
    where 2digitFY = '20';
"I can't figure out how I would make the new ids equal when they auto generate."
If that means those columns are primary keys whose values are always generated automatically, then you don't have control over them; Oracle does.
I presume that by "auto generate" you mean an identity column whose value is generated automatically. If so, modify it so that it uses the GENERATED BY DEFAULT ON NULL option. That means that if you don't provide an ID value, Oracle will generate it, but if you do provide one, the column will hold the value you inserted.
Similarly, if you're on 11g or lower (where identity columns didn't exist) and created those values by database triggers, make sure that they fire and populate ID columns only when their values are NULL.
If you do that, then you'll be able to create your own ID values and insert them as you wish.
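As a rough illustration of "generate the ID only when none is supplied", here is a sketch in Python using SQLite, whose INTEGER PRIMARY KEY columns behave in a similar way (auto-assigned when omitted, taken as given when supplied). This is a stand-in, not Oracle syntax. The table is renamed orders and the column fy2 because order and 2digitFY are not valid plain identifiers, and deriving the new IDs by adding an offset is just one assumed way of choosing them; it relies on the 2020 IDs already matching across the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, fy TEXT, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, fy2 TEXT, food TEXT, drink TEXT);

    -- No id supplied: the database generates 1 and 2 in each table.
    INSERT INTO customer (fy, name)
        VALUES ('2020', 'Tina Smith'), ('2020', 'Bobby Brown');
    INSERT INTO orders (fy2, food, drink)
        VALUES ('20', 'Hot Dog', 'Water'), ('20', 'Burger', 'Soda');
""")

# Pick the new ids ourselves so that both tables get the same values.
offset = conn.execute("SELECT MAX(id) FROM customer").fetchone()[0]

with conn:
    conn.execute(
        "INSERT INTO customer (id, fy, name) "
        "SELECT id + ?, '2021', name FROM customer WHERE fy = '2020'",
        (offset,),
    )
    conn.execute(
        "INSERT INTO orders (id, fy2, food, drink) "
        "SELECT id + ?, '21', food, drink FROM orders WHERE fy2 = '20'",
        (offset,),
    )

# The 2021 rows now line up by id across the two tables.
pairs = conn.execute("""
    SELECT c.id, c.name, o.food, o.drink
    FROM customer c JOIN orders o ON o.id = c.id
    WHERE c.fy = '2021'
    ORDER BY c.id
""").fetchall()
print(pairs)
```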

Constraint across multiple columns based on row value

I need help creating constraints to prevent the following conditions in a SQL Server Database based on the table below.
1) If the owns flag is set for a given ID, a new row cannot be added for that ID with the owns flag set.
2) An ID cannot be owned by different ownerNames. (This is independent of the first case; say we allow the owns flag to be set more than once for an ID.) John cannot own ID 123 because David owns it, but we can have two records saying David owns ID 123.
Owns Id OwnerName
==============================
1 123 David
1 123 John
0 123 Alexis
0 254 Brandon
1 956 Rod

Database schema for Sales Commissions

I'm trying to create a database with a table of titles, which contains the different titles, a code (a short code for the name), and the commission that each title earns on sales made by other titles.
I have a table named Title
Id Name Code CommissionOnA CommissionOnEng
1 Admin A 0 15
2 Engineer Eng 1 0
Is it good to have a table schema like this, given that titles will change and can be inserted, updated, or deleted dynamically? With my current approach I have to alter the table and add another column in order to add the commission for a new title.
Is there a better way to do it, keeping in mind that it should also support a multilevel sales hierarchy? A schema for any database is fine, but MySQL is preferred.
The scenario is that the form where a user creates a new title dynamically renders all the titles that exist in the table, each with a textbox, so that when the user creates a new title they can enter the commissions corresponding to the other titles.
For instance, if the user creates a new title named "Consultant" with code "C", they should see textboxes for Admin and Engineer, so that when the user saves it, a row is created in the table with the following data:
Id Name Code CommissionOnA CommissionOnEng CommissionOnC
1 Admin A 0 15 0
2 Engineer Eng 1 0 0
3 Consultant C 12 5 0
Now I have another table called Employees
Id Name Title ManagerId
1 Rob 1 Null
2 Kate 2 1
3 Eli 3 2
4 Al 2 3
Now when I do the recursion, each time a junior makes a sale, a commission should be transferred to his manager as well as to his manager's manager, based on the commission specified in the title table.
So, when Al sells something, Eli should get a commission of 5: Eli is Al's boss and has the title Consultant, and an employee with the title Consultant (3) gets a commission of 5 when an employee with the title Engineer (2) sells something.
It's better to normalise your table schemas so you don't need to add new columns. Instead, put those related columns into their own table and join the records via a foreign key.
For example, create a new table named commissions, then have a column for its unique ID, the ID that relates to the titles table and the commission amount:
commissions
----------------------------
id (INT, NOT NULL, Primary Key)
titles_id (INT, NOT NULL)
amount (INT, NOT NULL, DEFAULT=0)
and the data would look like:
id titles_id amount
1 1 15
2 2 1
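A minimal sketch of how the normalised layout could support the manager chain, written in Python against SQLite. One column here is my own assumption and not part of the answer's table: on_titles_id, which records whose sale the commission applies to, since the question needs a rate per pair of titles. The helper commissions_for_sale and the choice to key every payout on the seller's title are also assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE titles (id INTEGER PRIMARY KEY, name TEXT, code TEXT);
    CREATE TABLE commissions (
        id INTEGER PRIMARY KEY,
        titles_id INTEGER NOT NULL REFERENCES titles(id),     -- title that earns
        on_titles_id INTEGER NOT NULL REFERENCES titles(id),  -- assumed: title that sold
        amount INTEGER NOT NULL DEFAULT 0,
        UNIQUE (titles_id, on_titles_id)
    );
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY, name TEXT,
        title_id INTEGER REFERENCES titles(id),
        manager_id INTEGER REFERENCES employees(id)
    );

    INSERT INTO titles VALUES (1, 'Admin', 'A'), (2, 'Engineer', 'Eng'), (3, 'Consultant', 'C');
    -- Non-zero cells of the wide table from the question, one row per pair.
    INSERT INTO commissions (titles_id, on_titles_id, amount) VALUES
        (1, 2, 15), (2, 1, 1), (3, 1, 12), (3, 2, 5);
    INSERT INTO employees VALUES
        (1, 'Rob', 1, NULL), (2, 'Kate', 2, 1), (3, 'Eli', 3, 2), (4, 'Al', 2, 3);
""")


def commissions_for_sale(conn, seller_id):
    """Walk up the manager chain and look up each manager's rate on the seller's title."""
    return conn.execute(
        """
        WITH RECURSIVE chain(id, name, title_id, manager_id, depth) AS (
            SELECT m.id, m.name, m.title_id, m.manager_id, 1
            FROM employees s JOIN employees m ON m.id = s.manager_id
            WHERE s.id = :seller
            UNION ALL
            SELECT m.id, m.name, m.title_id, m.manager_id, chain.depth + 1
            FROM chain JOIN employees m ON m.id = chain.manager_id
        )
        SELECT chain.name, COALESCE(c.amount, 0)
        FROM chain
        JOIN employees s ON s.id = :seller
        LEFT JOIN commissions c
               ON c.titles_id = chain.title_id
              AND c.on_titles_id = s.title_id
        ORDER BY chain.depth
        """,
        {"seller": seller_id},
    ).fetchall()


payouts = commissions_for_sale(conn, 4)  # Al (Engineer) makes a sale
print(payouts)
```

Adding a new title then means inserting rows into titles and commissions rather than altering the table.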

How to change values of foreign keys in postgresql?

Let's say I have two tables: Customer and City. There are many Customers that live in the same City. The cities have an uid that is primary key. The customers have a foreign key reference to their respective city via Customer.city_uid.
I have to swap two City.uids with one another for external reasons, but the customers should stay attached to their cities. Therefore it is necessary to swap the Customer.city_uids as well. So I thought I would first swap the City.uids and then change the Customer.city_uids accordingly via an UPDATE statement. Unfortunately, I cannot do that, since these uids are referenced from the Customer table and PostgreSQL prevents me from changing them.
Is there an easy way of swapping the two City.uids with one another as well as the Customer.city_uids?
One solution could be:
BEGIN;
1. Drop foreign key
2. Make update
3. Create foreign key
COMMIT;
Or:
BEGIN;
1. Insert "new" correct information
2. Remove outdated information
COMMIT;
My instinct is to recommend not trying to change the city table's id field. But there is a lot of information missing here, so it really is a feeling rather than a definitive point of view.
Instead, I would swap the values in the other fields of the city table. For example, change the name of city1 to city2's name, and vice-versa.
For example:
OLD TABLE                      NEW TABLE
id | name  | population        id | name  | population
-------------------------      -------------------------
1  | ABerg | 123456            1  | BBerg | 654321
2  | BBerg | 654321            2  | ABerg | 123456
3  | CBerg | 333333            3  | CBerg | 333333
(The ID was not touched, but the other values were swapped. Functionally the same as swapping the IDs, but with 'softer touch' queries that don't need to make any changes to table constraints, etc.)
Then, in your associated tables, you can do...
UPDATE
    Customer
SET
    city_uid = CASE WHEN city_uid = 1 THEN 2 ELSE 1 END
WHERE
    city_uid IN (1, 2);
But then, do you have other tables that reference city_uid? And if so, is it feasible for you to repeat that update on all those tables?
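For what it's worth, here is the whole "swap the attributes, not the IDs" idea as a small runnable sketch in Python with SQLite standing in for PostgreSQL. The customer names and the helper swap_cities are invented for the example; the point is only that no constraint has to be dropped, because every intermediate state still satisfies the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE city (uid INTEGER PRIMARY KEY, name TEXT, population INTEGER);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY, name TEXT,
        city_uid INTEGER NOT NULL REFERENCES city(uid)
    );
    INSERT INTO city VALUES (1, 'ABerg', 123456), (2, 'BBerg', 654321), (3, 'CBerg', 333333);
    INSERT INTO customer VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 3);
""")


def swap_cities(conn, a, b):
    """Swap the non-key columns of cities a and b, then repoint their customers."""
    with conn:  # one transaction: either everything is swapped or nothing is
        row_a = conn.execute("SELECT name, population FROM city WHERE uid = ?", (a,)).fetchone()
        row_b = conn.execute("SELECT name, population FROM city WHERE uid = ?", (b,)).fetchone()
        conn.execute("UPDATE city SET name = ?, population = ? WHERE uid = ?", (*row_b, a))
        conn.execute("UPDATE city SET name = ?, population = ? WHERE uid = ?", (*row_a, b))
        conn.execute(
            "UPDATE customer "
            "SET city_uid = CASE WHEN city_uid = ? THEN ? ELSE ? END "
            "WHERE city_uid IN (?, ?)",
            (a, b, a, a, b),
        )


swap_cities(conn, 1, 2)

rows = conn.execute("""
    SELECT cu.name, ci.uid, ci.name
    FROM customer cu JOIN city ci ON ci.uid = cu.city_uid
    ORDER BY cu.id
""").fetchall()
print(rows)
```

Ann still lives in ABerg, but ABerg now carries uid 2. Any other table referencing city.uid would need the same CASE update inside the same transaction.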
You could create two temporary cities.
You would have:
City 1
City 2
City Temp 1
City Temp 2
Then, you could do the following:
Update all Customer UIDs from City 1 to City Temp 1.
Update all Customer UIDs from City 2 to City Temp 2.
Swap City 1 and 2 UIDs.
Move all Customers back from City Temp 1 to City 1.
Move all Customers back from City Temp 2 to City 2.
Delete the temporary cities.
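You can also add an ON UPDATE CASCADE clause to the foreign key constraint (it is declared on the referencing Customer table, not on the parent City table), so that a change to City.uid is propagated to Customer.city_uid automatically, as described here: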
How to do a cascading update?
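A small sketch of the cascading approach, again in Python with SQLite as a stand-in for PostgreSQL (both accept ON UPDATE CASCADE on a foreign key). The placeholder value -1 and the customer names are assumptions made for the example; the placeholder must be a uid that no real city uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE city (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY, name TEXT,
        city_uid INTEGER NOT NULL
            REFERENCES city(uid) ON UPDATE CASCADE
    );
    INSERT INTO city VALUES (1, 'ABerg'), (2, 'BBerg');
    INSERT INTO customer VALUES (1, 'Ann', 1), (2, 'Bob', 2);
""")

# Swap uids 1 and 2 through a placeholder; customers follow via the cascade.
with conn:
    conn.execute("UPDATE city SET uid = -1 WHERE uid = 1")
    conn.execute("UPDATE city SET uid = 1 WHERE uid = 2")
    conn.execute("UPDATE city SET uid = 2 WHERE uid = -1")

rows = conn.execute("""
    SELECT cu.name, ci.uid, ci.name
    FROM customer cu JOIN city ci ON ci.uid = cu.city_uid
    ORDER BY cu.id
""").fetchall()
print(rows)
```

The customer table is never updated explicitly here; each change to city.uid is carried over to the matching city_uid values by the constraint.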

Checking the integrity of the data for an entity

I have three tables STUDENT, DEPARTMENT and COURSE in a University database...
STUDENT has a UID as a Primary key -> which is the UNIQUE ID of the student
DEPARTMENT has Dept_id as a Primary Key -> which is the Dept. number
COURSE has C_id as Primary Key -> which is the Course/subject Id
I need to store marks in a table by relating the primary key of STUDENT, DEPARTMENT and COURSE for each student in each course.
UID Dept_id C_id marks
1 CS CS01 98
1 CS CS02 96
1 ME ME01 88
1 ME ME02 90
The problem is that if I create a table like this for marks, I feel the data operator might insert a wrong combination of keys for a student, for example:
UID Dept_id C_id marks
1 CS CS01 98
1 CS CS02 96
1 ME CS01 88 //wrong C_id (course id) inputted by the DBA
1 ME ME02 90
How can I prevent them from doing this?
Also, is there any other way to store marks for each student? I mean something like:
UID Dept_id CS01 CS02
1 CS 98 96
3 CS 95 92
You should avoid duplicating data in your database if possible (here the department appears both in Dept_id and as the prefix of C_id):
UID Dept_id C_id marks
1   CS      CS01 98
    ^^      ^^
You could:
Change the course ID to a two column key (department, course number), eg ('CS', '01').
or:
Keep the course name as it is, but put the department ID field in the course table and omit it from your marks table. If you need to calculate the total marks for a specific department you can still do this easily by adding a JOIN to your query.
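The second option could look roughly like this, sketched in Python with SQLite. The lowercase table and column names are assumptions based on the question's tables; the department lives only in the course table and per-department totals come from a JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The department is stored once, with the course it belongs to.
    CREATE TABLE course (c_id TEXT PRIMARY KEY, dept_id TEXT NOT NULL);
    CREATE TABLE marks (
        uid INTEGER NOT NULL,
        c_id TEXT NOT NULL REFERENCES course(c_id),
        marks INTEGER NOT NULL
    );
    INSERT INTO course VALUES ('CS01', 'CS'), ('CS02', 'CS'), ('ME01', 'ME'), ('ME02', 'ME');
    INSERT INTO marks VALUES (1, 'CS01', 98), (1, 'CS02', 96), (1, 'ME01', 88), (1, 'ME02', 90);
""")

# Total marks per department for student 1, recovered through the join.
totals = conn.execute("""
    SELECT c.dept_id, SUM(m.marks)
    FROM marks m JOIN course c ON c.c_id = m.c_id
    WHERE m.uid = 1
    GROUP BY c.dept_id
    ORDER BY c.dept_id
""").fetchall()
print(totals)
```

With no Dept_id column in marks, a row such as (1, ME, CS01) simply cannot be entered.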
Your last suggestion seems to be a bad idea. You would need a column in your table for every course and most values would be NULL.
I'm not sure why you need the department in this table if the course indicates the department. Thus, why wouldn't your table be:
UID C_id marks
1 CS01 98
1 CS02 96
1 ME01 88
1 ME02 90
What is missing from this table is some indication of time. For example, a student could take the same course twice if they failed it the first time. Thus, you would need additional columns to indicate the semester and year.
Your suggestion would be a nightmare to maintain. You would have to add new columns every time a new course was added to the schedule. It would also be harder to query much of the time.
If you want to make sure that each course is appropriate for the department, you can do that in a trigger (make sure to handle multiple record inserts or updates) or in the application. This still won't prevent all data entry errors (it is possible to pick CS89 when you meant CS98), but it will reduce the number of errors. In this case it is unlikely the data would come from anywhere other than the application, so I'd probably choose to enforce the rules in the application. A pull-down list where they choose the department and only the courses for that department are shown would do the trick.
You could add foreign key constraints to your tables to ensure that a valid value is entered for student IDs, course IDs and department IDs. You could also add unique constraints to the table to ensure inadvertent duplicates were not created. But in the end you can't prevent incorrect data from being inserted; if you knew it was incorrect, you wouldn't need to ask for it.
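A quick sketch of what those constraints catch and what they cannot, in Python with SQLite. The schema is trimmed to student and course for brevity, and the helper try_insert is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE student (uid INTEGER PRIMARY KEY);
    CREATE TABLE course (c_id TEXT PRIMARY KEY);
    CREATE TABLE marks (
        uid INTEGER NOT NULL REFERENCES student(uid),
        c_id TEXT NOT NULL REFERENCES course(c_id),
        marks INTEGER NOT NULL,
        UNIQUE (uid, c_id)  -- one mark per student per course
    );
    INSERT INTO student VALUES (1);
    INSERT INTO course VALUES ('CS01'), ('CS02');
""")


def try_insert(uid, c_id, marks):
    """Hypothetical helper: attempt an insert and report whether it was accepted."""
    try:
        with conn:
            conn.execute("INSERT INTO marks VALUES (?, ?, ?)", (uid, c_id, marks))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert(1, "CS01", 98),  # valid row
    try_insert(1, "CS99", 90),  # unknown course: foreign key violation
    try_insert(1, "CS01", 55),  # second mark for the same course: unique violation
    try_insert(1, "CS02", 89),  # accepted, even if the operator meant 98
]
for line in results:
    print(line)
```

The last insert is the point of the paragraph above: the constraints reject values that cannot be right, but a plausible wrong value still gets through.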
Example: 29th February 1957 couldn't be my birthday; 15th July 2025 couldn't be my birthday; 27th September 1974 wasn't my birthday.