'ON REPLACE CASCADE' in sqlite - sql

Let's say I'm tracking donations for a charity:
CREATE TABLE people {
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
account_number INTEGER NOT NULL,
UNIQUE(name),
UNIQUE(account_number)
};
CREATE TABLE donations {
donor_id INTEGER NOT NULL
REFERENCES people(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
amount FLOAT NOT NULL
};
I can guarantee that a person's name and account number are both individually unique.
Sometimes I need to update either a user's name or their account number. I receive data that contains both a name and an account number, one of which will have changed. Because both name and account_number are individually unique, it shouldn't matter which one has changed - so long as one of them is the same as an existing entry in the table, the correct row will be replaced.
This is my understanding of what will happen if I do this with a REPLACE statement. The person in the people table will be deleted, along with their referenced donation (because of ON DELETE CASCADE), and a new person added with no associated donations:
people donations
| id | name | account_number | | donor_id | amount |
|----|------|----------------| |----------|--------|
| 1 | John | 11111111111111 | | 1 | 100.00 |
| 2 | Mark | 22222222222222 | | 1 | 200.00 |
| 2 | 250.00 |
>>> REPLACE INTO people ( name, account_number ) VALUES ( 'Suzy', 22222222222222 );
people donations
| id | name | account_number | | donor_id | amount |
|----|------|----------------| |----------|--------|
| 1 | John | 11111111111111 | | 1 | 100.00 |
| 3 | Suzy | 22222222222222 | | 1 | 200.00 |
How can I ensure that referenced values are propagated correctly? This is my desired end result (where '3' is any people ID so long as it matches):
people donations
| id | name | account_number | | donor_id | amount |
|----|------|----------------| |----------|--------|
| 1 | John | 11111111111111 | | 1 | 100.00 |
| 3 | Suzy | 22222222222222 | | 1 | 200.00 |
| 3 | 250.00 |

This is my understanding of what will happen if I do this with a
REPLACE statement. The person in the people table will be deleted,
along with their referenced donation (because of ON DELETE CASCADE),
and a new person added with no associated donations
Correct, but the referenced donations will be permanently deleted.
See the demo.
Sometimes I need to update either a user's name or their account
number. I receive data that contains both a name and an account
number, one of which will have changed.
What you can do is a simple UPDATE:
UPDATE people
SET name = 'Suzy',
account_number = 22222222222222
WHERE name = 'Suzy' OR account_number = 22222222222222
See the demo.

Related

SQL - Get unique values by key selected by condition

I want to clean a dataset because there are repeated keys that should not be there. Although the key is repeated, other fields do change. On repetition, I want to keep those entries whose country field is not null. Let's see it with a simplified example:
| email | country |
| 1#x.com | null |
| 1#x.com | PT |
| 2#x.com | SP |
| 2#x.com | PT |
| 3#x.com | null |
| 3#x.com | null |
| 4#x.com | UK |
| 5#x.com | null |
Email acts as key, and country is the field which I want to filter by. On email repetition:
Retrieve the entry whose country is not null (case 1)
If there are several entries whose country is not null, retrieve one of them, the first occurrence for simplicity (case 2)
If all the entries' country is null, again, retrieve only one of them (case 3)
If the entry key is not repeated, just retrieve it no matter what its country is (case 4 and 5)
The expected output should be:
| email | country |
| 1#x.com | PT |
| 2#x.com | SP |
| 3#x.com | null |
| 4#x.com | UK |
| 5#x.com | null |
I have thought of doing a UNION or some type of JOIN to achieve this. One possibility could be querying:
SELECT
...
FROM (
SELECT *
FROM `myproject.mydataset.mytable`
WHERE country IS NOT NULL
) AS a
...
and then match it with the full table to add those values which are missing, but I am not able to imagine the way since my experience with SQL is limited.
Also, I have read about the COALESCE function and I think it could be helpful for the task.
Consider below approach
select *
from `myproject.mydataset.mytable`
where true
qualify row_number() over(partition by email order by country nulls last) = 1

How to update Multiple rows in PostgreSQL with other fields of same table?

I have table which consist of 35,000 records in which half of the rows have name as null value
If the field has null value, i want to update the field with the value of username.
Can anyone help me with this ?
This is sample table
name | username | idnumber | type
----------------------------------------------
-- | jack | 1 | A
Mark | Mark | 2 | B
-- | dev | 3 | A
After update i want it to look like this
name | username | idnumber | type
----------------------------------------------
jack | jack | 1 | A
Mark | Mark | 2 | B
dev | dev | 3 | A
You seem to want:
update t
set name = username
where name is null;
Note that -- is not a typical representation of NULL values. You might consider <null> for instance.

Calculate Equation From Seperate Tables Data

I'm working on my senior High School Project and am reaching out to the community for help! (As my teacher doesn't know the answer to my question).
I have a simple "Products" table as shown below:
I also have a "Orders" table shown below:
Is there a way I can create a field in the "Orders" table named "Total Cost", and make that automaticly calculate the total cost from all the products selected?
Firstly, I would advise against storing calculated values, and would also strongly advise against using calculated fields in tables. In general, calculations should be performed by queries.
I would also strongly advise against the use of multivalued fields, as your images appear to show.
In general, when following the rules of database normalisation, most sales databases are structured in a very similar manner, containing with the following main tables (amongst others):
Products (aka Stock Items)
Customers
Order Header
Order Line (aka Order Detail)
A good example for you to learn from would be the classic Northwind sample database provided free of charge as a template for MS Access.
With the above structure, observe that each table serves a purpose with each record storing information pertaining to a single entity (whether it be a single product, single customer, single order, or single order line).
For example, you might have something like:
Products
Primary Key: Prd_ID
+--------+-----------+-----------+
| Prd_ID | Prd_Desc | Prd_Price |
+--------+-----------+-----------+
| 1 | Americano | $8.00 |
| 2 | Mocha | $6.00 |
| 3 | Latte | $5.00 |
+--------+-----------+-----------+
Customers
Primary Key: Cus_ID
+--------+--------------+
| Cus_ID | Cus_Name |
+--------+--------------+
| 1 | Joe Bloggs |
| 2 | Robert Smith |
| 3 | Lee Mac |
+--------+--------------+
Order Header
Primary Key: Ord_ID
Foreign Keys: Ord_Cust
+--------+----------+------------+
| Ord_ID | Ord_Cust | Ord_Date |
+--------+----------+------------+
| 1 | 1 | 2020-02-16 |
| 2 | 1 | 2020-01-15 |
| 3 | 2 | 2020-02-15 |
+--------+----------+------------+
Order Line
Primary Key: Orl_Order + Orl_Line
Foreign Keys: Orl_Order, Orl_Prod
+-----------+----------+----------+---------+
| Orl_Order | Orl_Line | Orl_Prod | Orl_Qty |
+-----------+----------+----------+---------+
| 1 | 1 | 1 | 2 |
| 1 | 2 | 3 | 1 |
| 2 | 1 | 2 | 1 |
| 3 | 1 | 1 | 4 |
| 3 | 2 | 3 | 2 |
+-----------+----------+----------+---------+
You might also opt to store the product description & price on the order line records, so that these are retained at the point of sale, as the information in the Products table is likely to change over time.

Which normal form or other formal rule does this database design choice violate?

The project I'm working on is an application that lets you design data entry forms, and automagically generates a schema in an underlying PostgreSQL database
to persist them as well as the browsing and editing UI.
The use case I've encountered this with is a store back-office database, but the app itself intends to be somewhat universal. The administrator creates the following entry forms with the given fields:
Customers
name (text box)
Items
name (text box)
stock (number field)
Order
customer (combo box selecting a customer)
order lines (a grid showing order lines)
OrderLine
item (combo box selecting an item)
count (number field)
When all this is done, the resulting database schema will be equivalent to this:
create table Customers(id serial primary key,
name varchar);
create table Items(id serial primary key,
name varchar,
stock integer);
create table Orders(id serial primary key);
create table OrderLines(id serial primary key,
count integer);
create table Links(id serial primary key,
fk1 integer references Customers.id,
fk2 integer references Items.id,
fk3 integer references Orders.id,
fk4 integer references OrderLines.id);
Links being a special table that stores all the relationships between entities; every row has (usually) two of the foreign keys set to a value, and the rest set to NULL. Whenever a new entry form is added to the application instance, a new foreign key referencing the table for this form is added to Links.
So, suppose our shop stocks some widgets, gizmos, and thingeys. A customer named Adam orders two widgets and three gizmos, and Betty orders four gizmos and five thingeys. The database will contain the following data:
Customers
/----+-------\
| ID | NAME |
| 1 | Adam |
| 2 | Betty |
\----+-------/
Items
/----+---------+-------\
| ID | NAME | STOCK |
| 1 | widget | 123 |
| 2 | gizmo | 456 |
| 3 | thingey | 789 |
\----+---------+-------/
Orders
/----\
| ID |
| 1 |
| 2 |
\----/
OrderLines
/----+-------\
| ID | COUNT |
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
\----+-------/
Links
/----+------+------+------+------\
| ID | FK1 | FK2 | FK3 | FK4 |
| 1 | 1 | NULL | 1 | NULL |
| 2 | 2 | NULL | 2 | NULL |
| 3 | NULL | NULL | 1 | 1 |
| 4 | NULL | NULL | 1 | 2 |
| 5 | NULL | NULL | 2 | 3 |
| 6 | NULL | NULL | 2 | 4 |
| 7 | NULL | 1 | NULL | 1 |
| 8 | NULL | 2 | NULL | 2 |
| 9 | NULL | 2 | NULL | 3 |
| 10 | NULL | 3 | NULL | 4 |
\----+------+------+------+------/
(The tables also contain a bunch of timestamps for auditing and soft deletion but I don't think they're relevant here, they just make writing the SQL by the administrator that much messier. The management app is also used to implement a bunch of different use cases, but they're generally primarily data entry, master-detail views, and either scalar fields or selection boxes.)
When I've had to write a join through this thing I'd grumbled about it to my coworker, who replied "well using separate tables for each relationship is one way to do it, this is another..." Leaving aside the obvious-to-me ugliness of the above and the practical issues, I also have a nagging feeling this has to be a violation of some normal form, but it's been a while since college and I'm struggling to figure out which of the criteria apply here.
Is there something stronger "well that's just your opinion" I can use when critiquing this design?

Four Table Join in BigQuery

Okay, so I'm trying to link together four different tables, and its getting very difficult. I provided snippets of each table in the hopes you all could help out
Table 1: data
+--------+--------+-----------+
| charge | amount | date |
+--------+--------+-----------+
| 123 | 10000 | 2/10/2016 |
| 456 | 10000 | 1/28/2016 |
| 789 | 10000 | 3/30/2016 |
+--------+--------+-----------+
Table 2: data_metadata
+--------+------------+------------+
| charge | key | value |
+--------+------------+------------+
| 123 | identifier | trrkfll212 |
| 456 | code | test |
| 789 | ID | 123xyz |
+--------+------------+------------+
Table 3: buyer
+-----+-----------+----------+----------+
| id | date | discount | plan |
+-----+-----------+----------+----------+
| ABC | 2/13/2016 | yes | option a |
| DEF | 2/1/2016 | yes | option a |
| GHI | 1/22/2016 | no | option a |
+-----+-----------+----------+----------+
Table 4: buyer_metadata
+--------------+-----------+--------+
| id | |key| | value |
+--------------+-----------+--------+
| ABC | migration | TRUE |
| DEF | emid | foo |
| GHI | ID | 123xyz |
+--------------+-----------+--------+
Okay, so the tables data and data_metadata are obviously connected by the charge column.
The tables buyer and buyer_metadata are connected by the id column.
But I want to link all of them together. I'm pretty sure the way to accomplish this is through linking the metadata tables together through the common field in the "value" column (in this example: 123xyz).
Could anyone help?
This might look like something like that if all "link" columns are unique :
SELECT *
FROM data d
JOIN data_metadata dm ON d.charge = dm.charge
JOIN buyer_metada bm ON dm.value = bm.value
JOIN buyer b ON bm.id = b.id
If not, I think you'll have to use something like GROUP BY clause
Let's take it in two steps, first create composite tables for data and buyer. Composite table for data:
SELECT data.charge, data.amount, data.date,
data_metadata.key, data_metadata.value
FROM [data] AS data
JOIN (SELECT charge, key, value FROM [data_metadata]) AS data_metadata
ON data.charge = data_metadata.charge
And composite table for buyer:
SELECT buyer.id, buyer.date, buyer.discount, buyer.plan,
buyer_metadata.key, buyer_metadata.value
FROM [buyer] AS buyer
JOIN (SELECT key, value FROM [buyer_metadata]) AS buyer_metadata
ON buyer.id = buyer_metadata.id
And then let's join the two composite tables
SELECT composite_data.*, composite_buyer.*
FROM (
SELECT data.charge, data.amount, data.date,
data_metadata.key, data_metadata.value
FROM [data] AS data
JOIN (SELECT charge, key, value FROM [data_metadata]) AS data_metadata
ON data.charge = data_metadata.charge) AS composite_data
JOIN (
SELECT buyer.id, buyer.date, buyer.discount, buyer.plan,
buyer_metadata.key, buyer_metadata.value
FROM [buyer] AS buyer
JOIN (SELECT key, value FROM [buyer_metadata]) AS buyer_metadata
ON buyer.id = buyer_metadata.id) AS composite_buyer
ON composite_data.value = composite_buyer.value
I haven't tested it but it's probably close.
For reference, here is the page on BigQuery JOINs. And have you seen this SO?