I have two tables, customer and order, each with two records for 2020. The starred (***) values are what I want to add for FY 2021.
Customer:
ID   FY    Name
1    2020  Tina Smith
2    2020  Bobby Brown
134  2021  Tina Smith***
234  2021  Bobby Brown***
Order:
ID   2digitFY  Food     Drink
1    20        Hot Dog  Water
2    20        Burger   Soda
134  21        Hot Dog  Water***
234  21        Burger   Soda***
I want to add records to both tables that contain the same data as FY 2020/20, just with new sequence numbers and the year 2021/21 (the starred data above). I can't figure out how I would make the new ids equal when they auto generate. Below is code similar to what I have set up (fake data used above).
insert into customer (id, fy, name)
select id, '2021', name
from customer
where fy = '2020';
insert into order (id, 2digitFY, food, drink)
select id, '21', food, drink
from order
where 2digitFY = '20';
I can't figure out how I would make the new ids equal when they auto generate.
If that means those columns are primary keys whose values are generated automatically, then you don't control them; Oracle does.
I presume that by "auto generate" you mean an identity column whose value is generated automatically. If so, modify it to use the GENERATED BY DEFAULT ON NULL option. That means that if you don't provide an ID value, Oracle will generate one, but if you do provide it, the value you inserted is used.
Similarly, if you're on 11g or lower (where identity columns didn't exist) and created those values by database triggers, make sure that they fire and populate ID columns only when their values are NULL.
If you do that, then you'll be able to create your own ID values and insert them as you wish.
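A minimal sketch of that idea, using Python's sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere). SQLite's INTEGER PRIMARY KEY behaves much like GENERATED BY DEFAULT ON NULL here: it generates a value when none is supplied and keeps the one you provide. The table name orders, the column fy2, and the constant ID_OFFSET are hypothetical; deriving the new id from the old one is just one possible way to keep both tables in step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- INTEGER PRIMARY KEY: generated when omitted/NULL, kept when supplied.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, fy TEXT, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, fy2 TEXT, food TEXT, drink TEXT);
    INSERT INTO customer (id, fy, name) VALUES
        (1, '2020', 'Tina Smith'), (2, '2020', 'Bobby Brown');
    INSERT INTO orders (id, fy2, food, drink) VALUES
        (1, '20', 'Hot Dog', 'Water'), (2, '20', 'Burger', 'Soda');
""")

# Hypothetical rule: new id = old id + a fixed offset, applied to both tables,
# so the copied rows end up with matching ids.
ID_OFFSET = 1000

conn.execute(
    "INSERT INTO customer (id, fy, name) "
    "SELECT id + ?, '2021', name FROM customer WHERE fy = '2020'",
    (ID_OFFSET,),
)
conn.execute(
    "INSERT INTO orders (id, fy2, food, drink) "
    "SELECT id + ?, '21', food, drink FROM orders WHERE fy2 = '20'",
    (ID_OFFSET,),
)

new_customer_ids = [r[0] for r in conn.execute(
    "SELECT id FROM customer WHERE fy = '2021' ORDER BY id")]
new_order_ids = [r[0] for r in conn.execute(
    "SELECT id FROM orders WHERE fy2 = '21' ORDER BY id")]
print(new_customer_ids, new_order_ids)
```

Because the ids are supplied explicitly rather than generated, the copied customer and order rows share the same values.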
I have one table containing my members.
customer_id  name   age
1            John   74
2            Sarah  87
Every day, I get a new table containing the current members.
If a new member has joined, I want to add them.
If a member has left, I want to nullify their name/id.
If a current member is still a member then I want to keep them as is.
Imagine that I get a new upload with the following rows
customer_id  name    age
2            Sarah   87
3            Melvin  23
I then want to generate the table
customer_id  name    age
Null         Null    74
2            Sarah   87
3            Melvin  23
I don't want to nullify anything by mistake and therefore I want to run a few tests on this table before I replace my old one. The way I've done this is by creating a temporary table (let's call it customer_temp). However, I've now created a cyclical dependency since I:
Need to read the live table customer in order to create the customer_temp
Need to replace the live table customer with customer_temp after I've run my tests
Is there any way I can do this using dbt?
Destroying data is tricky. I would avoid that unless it's necessary (e.g., DSAR compliance).
Assuming the new data is loaded into the same table in your database each day, I think this is a perfect candidate for snapshots, with the option for invalidating hard-deleted records. See the docs. This allows you to capture the change history of a table without any data loss.
If you turned on snapshots in the initial state, your snapshot table would look like (assuming the existing records had a timestamp of 1/1):
customer_id  name   age  valid_from  valid_to
1            John   74   1/1/2022
2            Sarah  87   1/1/2022
Then, after the source table was updated, re-running dbt snapshot (today) would create this table:
customer_id  name    age  valid_from  valid_to
1            John    74   1/1/2022    5/12/2022
2            Sarah   87   1/1/2022
3            Melvin  23   5/12/2022
You can create the format you'd like with a simple query:
select
    case when valid_to is null then customer_id else null end as customer_id,
    case when valid_to is null then name else null end as name,
    age
from {{ ref('my_snapshot') }}
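To check the query's output without a dbt project, here is a rough sketch that runs the same CASE logic against an in-memory SQLite table. This is an assumption for illustration only: a real snapshot is built by dbt, the table name my_snapshot simply mirrors the ref above, and the dates are written in ISO form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_snapshot (
        customer_id INTEGER, name TEXT, age INTEGER,
        valid_from TEXT, valid_to TEXT
    );
    -- State after the second snapshot run: John's row has been closed out.
    INSERT INTO my_snapshot VALUES
        (1, 'John',   74, '2022-01-01', '2022-05-12'),
        (2, 'Sarah',  87, '2022-01-01', NULL),
        (3, 'Melvin', 23, '2022-05-12', NULL);
""")

# Rows with a valid_to are no longer current, so their id and name are nulled.
rows = conn.execute("""
    SELECT
        CASE WHEN valid_to IS NULL THEN customer_id END AS customer_id,
        CASE WHEN valid_to IS NULL THEN name END AS name,
        age
    FROM my_snapshot
    ORDER BY rowid
""").fetchall()

for row in rows:
    print(row)
```

The member who left keeps only the age column, while current members come through unchanged.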
I have 3 columns in the same SQL table: a number, a name, and some unrelated data. Each number repeats a certain number of times with a name next to it. A name can't appear twice under the same number, but the same name can appear under multiple different numbers. I need an SQL query to find which names have appeared under the same number the most times. Any help would be much appreciated.
Example: SQL query will find what names have been grouped together the most.
1 Bill
1 Bob
1 Dave
2 Bob
2 John
2 Bill
To confirm - you would like to find
The pairs of names that occur together within a 'number'
Of those, find the pair that occurs most often
The trick here is to get all the pairs, then count how many 'numbers' that pair appears in.
To get the pairs, join the table to itself on the number. To list each pairing only once, also require that the first name in the pair sorts before the second (first < second).
The answer to this question depends on your database (SQL Server, MySQL, etc.). However, here is an example written in T-SQL that is fairly generic and does most of the work: it shows the counts and orders them by the relevant count.
Feel free to take the TOP or LIMIT 1 to get just one pair with the most matches (noting that if there is a tie, only one pair would be chosen this way).
Alternatively, modify the query to work out what the maximum count is, then get all the pairs with that count.
CREATE TABLE NameGrps (NameNum int, Name varchar(30));
INSERT INTO NameGrps (NameNum, Name)
VALUES
(1, 'Bill'),
(1, 'Bob'),
(1, 'Dave'),
(2, 'Bob'),
(2, 'John'),
(2, 'Bill');
SELECT NamePairs.FirstInPair, NamePairs.SecondInPair, COUNT(NameNum) AS Num_Paired
FROM
(SELECT A.Name AS FirstInPair, B.Name AS SecondInPair, A.NameNum
FROM NameGrps A
INNER JOIN NameGrps B ON A.NameNum = B.NameNum AND A.Name < B.Name
) AS NamePairs
GROUP BY NamePairs.FirstInPair, NamePairs.SecondInPair
ORDER BY COUNT(NameNum) DESC, NamePairs.FirstInPair, NamePairs.SecondInPair;
And here are the results of the above
FirstInPair SecondInPair Num_Paired
Bill Bob 2
Bill Dave 1
Bill John 1
Bob Dave 1
Bob John 1
If you take a TOP or LIMIT 1 of that, it will find that the pair of Bill and Bob is the most frequent.
Here is a db<>fiddle with the query, as well as additional information (e.g., what the sub-query does, and adding a TOP 1 version).
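The tie-aware alternative mentioned earlier (work out the maximum count, then keep every pair that reaches it) could look roughly like this. It is a sketch run through Python's sqlite3 rather than SQL Server, and the CTE name pair_counts is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NameGrps (NameNum INTEGER, Name TEXT);
    INSERT INTO NameGrps (NameNum, Name) VALUES
        (1, 'Bill'), (1, 'Bob'), (1, 'Dave'),
        (2, 'Bob'), (2, 'John'), (2, 'Bill');
""")

top_pairs = conn.execute("""
    WITH pair_counts AS (
        -- Same self-join as above: one row per pair per number, then counted.
        SELECT A.Name AS FirstInPair, B.Name AS SecondInPair,
               COUNT(*) AS Num_Paired
        FROM NameGrps A
        INNER JOIN NameGrps B
            ON A.NameNum = B.NameNum AND A.Name < B.Name
        GROUP BY A.Name, B.Name
    )
    SELECT FirstInPair, SecondInPair, Num_Paired
    FROM pair_counts
    -- Keep every pair that reaches the maximum, so ties are not dropped.
    WHERE Num_Paired = (SELECT MAX(Num_Paired) FROM pair_counts)
    ORDER BY FirstInPair, SecondInPair
""").fetchall()

print(top_pairs)
```

With the sample data only one pair reaches the maximum, but if two pairs tied, both would be returned.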
I have 2 tables storing information. For example:
Table 1 contains persons:
ID NAME CITY
1 BOB 1
2 JANE 1
3 FRED 2
CITY is an id referencing a different table:
ID NAME
1 Amsterdam
2 London
The problem is that I want to insert data that I receive in the format:
ID NAME CITY
1 PETER Amsterdam
2 KEES London
3 FRED London
Given that the list of cities is complete (I never receive a city that is not in my list), how can I insert the new persons received from outside into the table with the right ID for the city?
Should I replace the city names before I try to insert them, or is there a performance-friendly way (I might have to insert thousands of lines at once) to make SQL do this for me?
The SQL server I'm using is Microsoft SQL Server 2012.
First, load the data to be inserted into a table.
Then, you can just use a join:
insert into persons(id, name, city)
    select st.id, st.name, c.id
    from #StagingTable st left join
         cities c
         on st.city = c.name;
Note: The persons.id should probably be an identity column so it wouldn't be necessary to insert it.
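As a quick way to try the staging-table approach end to end, here is a sketch using Python's sqlite3 in place of SQL Server (an assumption; the table name staging stands in for #StagingTable). It follows the note above and lets the database assign persons.id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE persons (
        id INTEGER PRIMARY KEY,  -- assigned automatically when omitted
        name TEXT,
        city INTEGER REFERENCES cities(id)
    );
    CREATE TABLE staging (id INTEGER, name TEXT, city TEXT);

    INSERT INTO cities VALUES (1, 'Amsterdam'), (2, 'London');
    INSERT INTO persons VALUES (1, 'BOB', 1), (2, 'JANE', 1), (3, 'FRED', 2);
    -- The received data, still carrying city names instead of ids.
    INSERT INTO staging VALUES
        (1, 'PETER', 'Amsterdam'), (2, 'KEES', 'London'), (3, 'FRED', 'London');
""")

# One set-based insert: the join translates each city name into its id.
conn.execute("""
    INSERT INTO persons (name, city)
    SELECT st.name, c.id
    FROM staging st
    LEFT JOIN cities c ON st.city = c.name
    ORDER BY st.id
""")

inserted = conn.execute(
    "SELECT name, city FROM persons WHERE id > 3 ORDER BY id").fetchall()
print(inserted)
```

A single INSERT ... SELECT handles all staged rows at once, which is what keeps this approach suitable for thousands of lines.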
insert into persons (ID, NAME, CITY) -- you don't need to include ID if it is auto increment
values
(1, 'PETER', (select ID from city where NAME = 'Amsterdam')) -- the subquery looks up the city ID by its name
If you want to add 1000 rows at a time, it would be better to use a stored procedure.
I'm trying to create a database with a titles table that contains different titles, a code (short code for the name), and the commission of each title on the other titles.
I have a table named Title
Id Name Code CommissionOnA CommissionOnEng
1 Admin A 0 15
2 Engineer Eng 1 0
Is it good to have a table schema like this, given that titles will change and can be inserted, updated, or deleted dynamically? With my current approach I have to alter the table and add another column in order to add a commission for each new title.
Is there a better way to do it, bearing in mind that it should also support a multilevel sales hierarchy? A schema for any database is fine, but MySQL is preferred.
The scenario is that the form where a user creates a new title dynamically renders all the titles that exist in the table, each with a textbox, so that the user can enter the new title's commissions corresponding to the other titles.
For instance, if the user creates a new title named "Consultant" with code "C", he should see textboxes for Admin and Engineer, so that when he saves it, a row is created in the table with the following data:
Id Name Code CommissionOnA CommissionOnEng CommissionOnC
1 Admin A 0 15 0
2 Engineer Eng 1 0 0
3 Consultant C 12 5 0
Now I have another table called Employees
Id Name Title ManagerId
1 Rob 1 Null
2 Kate 2 1
3 Eli 3 2
4 Al 2 3
Now when I do the recursion, each time a junior makes a sale, a commission should be transferred to his manager as well as to his manager's manager, based on the commission specified in the title table.
So, when Al sells something, Eli should get a commission of 5: Eli's title is Consultant and Eli is Al's boss, so an employee with title Consultant (3) gets a commission of 5 when an employee with title Engineer (2) sells something.
It's better to normalise your table schemas so you don't need to add new columns. Instead, put those related columns into their own table and then join these records via a foreign key.
For example, create a new table named commissions, then have a column for its unique ID, the ID that relates to the titles table and the commission amount:
commissions
----------------------------
id (INT, NOT NULL, Primary Key)
titles_id (INT, NOT NULL)
amount (INT, NOT NULL, DEFAULT=0)
and the data would look like:
id titles_id amount
1 1 15
2 2 1
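One possible way to take the normalised layout further is sketched below with Python's sqlite3 (MySQL would need its own syntax). Note the assumptions: this variant stores two title ids per row, the title that earns and the title that sells, which goes beyond the single titles_id column shown above, and the column names earning_title_id and selling_title_id are invented for the example. The data comes from the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE titles (id INTEGER PRIMARY KEY, name TEXT, code TEXT);
    -- One row per (earning title, selling title) pair, not one column per title.
    CREATE TABLE commissions (
        earning_title_id INTEGER NOT NULL REFERENCES titles(id),
        selling_title_id INTEGER NOT NULL REFERENCES titles(id),
        amount INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (earning_title_id, selling_title_id)
    );
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY, name TEXT,
        title_id INTEGER REFERENCES titles(id),
        manager_id INTEGER REFERENCES employees(id)
    );
    INSERT INTO titles VALUES
        (1, 'Admin', 'A'), (2, 'Engineer', 'Eng'), (3, 'Consultant', 'C');
    INSERT INTO commissions VALUES
        (1, 1, 0),  (1, 2, 15), (1, 3, 0),
        (2, 1, 1),  (2, 2, 0),  (2, 3, 0),
        (3, 1, 12), (3, 2, 5),  (3, 3, 0);
    INSERT INTO employees VALUES
        (1, 'Rob', 1, NULL), (2, 'Kate', 2, 1), (3, 'Eli', 3, 2), (4, 'Al', 2, 3);
""")

# Walk up the management chain from the seller and look up what each
# manager's title earns on a sale made by the seller's title.
payouts = conn.execute("""
    WITH RECURSIVE chain(emp_id, manager_id, depth) AS (
        SELECT id, manager_id, 0 FROM employees WHERE id = :seller
        UNION ALL
        SELECT e.id, e.manager_id, chain.depth + 1
        FROM employees e JOIN chain ON e.id = chain.manager_id
    )
    SELECT m.name, c.amount
    FROM chain
    JOIN employees m ON m.id = chain.emp_id
    JOIN commissions c
      ON c.earning_title_id = m.title_id
     AND c.selling_title_id = (SELECT title_id FROM employees WHERE id = :seller)
    WHERE chain.depth > 0
    ORDER BY chain.depth
""", {"seller": 4}).fetchall()

print(payouts)
```

Adding a new title then means inserting rows into titles and commissions, with no ALTER TABLE needed.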
I've asked this question here, but I don't think I got my point across.
Let's say I have the following tables (all PK are IDENTITY fields):
People (PersonId (PK), Name, SSN, etc.)
Loans (LoanId (PK), Amount, etc.)
Borrowers (BorrowerId(PK), PersonId, LoanId)
Let's say Mr. Smith got 2 loans in his name, 3 joint loans with his wife, and 1 joint loan with his mistress. For the purposes of the application I want to GROUP people, so that I can easily single out the loans that Mr. Smith took out jointly with his wife.
To accomplish that I added BorrowerGroup table, now I have the following (all PK are IDENTITY fields):
People (PersonId (PK), Name, SSN, etc.)
Loans (LoanId (PK), Amount, BorrowerGroupId, etc.)
BorrowerGroup(GroupId (PK))
Borrowers (BorrowerId(PK), GroupId, PersonId)
Now Mr. Smith is in 3 groups (himself, him and his wife, him and his mistress) and I can easily lookup his activity in any of those groups.
The problems with new design:
The only way to generate a new BorrowerGroup is by inserting MAX(GroupId)+1 with IDENTITY_INSERT ON, which just doesn't feel right. Also, the notion of a table with 1 column is kind of weird.
I'm a firm believer in surrogate keys, and would like to stick to that design if possible.
This application does not care about individuals; the GROUP is treated as an individual.
Is there a better way to group people for the purpose of this application?
You could just remove the BorrowerGroup table, since it carries no information. This information is already present via the loans that people share. I just assume you have a PeopleLoans table.
People          Loans             PeopleLoans
-----------     ------------      -----------
1 Smith         6  S1    60       1  6
2 Wife          7  S2    60       1  7
3 Mistress      8  S+W1  74       1  8
                9  S+W2  74       1  9
                10 S+W3  74       1  10
                11 S+M1  89       1  11
                                  2  8
                                  2  9
                                  2  10
                                  3  11
So your BorrowerGroups are actually almost the Loans - 6 and 7 with Smith only, 8 to 10 with Smith and Wife, and 11 with Smith and Mistress. So there is no need for BorrowerGroups in the first place, because they are identical to Loans grouped by the involved People.
But it might be quite hard to retrieve this information efficiently, so you could think about adding a GroupId directly to Loans. Ignoring the second column of Loans (it is there just for readability), the third column should represent your groups. They are redundant, so you have to be careful if you change them.
If you find a good way to derive a unique GroupId from the ids of the involved people, you could make it a computed column. If a string would be okay as a group id, you could just order the ids of the people and concatenate them with a separator.
Group 60 with Smith only would get id '1', group 74 would become 1.2, and group 89 would become 1.3. Not that smart, but unique and easy to compute.
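The sort-and-concatenate rule can be sketched in a few lines of Python, using the PeopleLoans pairs from the table above. The function name group_key is made up for illustration; in the database this would be a computed column or a small piece of application code.

```python
from collections import defaultdict

# (person_id, loan_id) pairs from the PeopleLoans table above.
people_loans = [
    (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11),
    (2, 8), (2, 9), (2, 10),
    (3, 11),
]

# Collect the set of borrowers for each loan.
borrowers_by_loan = defaultdict(set)
for person_id, loan_id in people_loans:
    borrowers_by_loan[loan_id].add(person_id)


def group_key(person_ids):
    """Order the person ids and join them with a separator."""
    return ".".join(str(pid) for pid in sorted(person_ids))


group_ids = {
    loan_id: group_key(ids)
    for loan_id, ids in sorted(borrowers_by_loan.items())
}
print(group_ids)
```

Sorting before joining is what makes the key independent of the order in which borrowers were added to a loan.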
Use the original schema:
People (PersonId (PK), Name, SSN, etc.)
Loans (LoanId (PK), Amount, etc.)
Borrowers (BorrowerId(PK), PersonId, LoanId)
Just query for the data you need (your example, finding the loans that husband and wife share):
SELECT
    l.*
FROM Borrowers b1
    INNER JOIN Borrowers b2 ON b1.LoanId=b2.LoanId
    INNER JOIN Loans l ON b1.LoanId=l.LoanId
WHERE b1.PersonId=@HusbandID
    AND b2.PersonId=@WifeID
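To see the self-join at work, here is a sketch on sample rows using Python's sqlite3 (an assumption; the original targets SQL Server, where @HusbandID and @WifeID would be variables). The person and loan ids reuse the Smith example from the earlier answer, and the Label column is a hypothetical substitute for the real loan columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Loans (LoanId INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE Borrowers (
        BorrowerId INTEGER PRIMARY KEY, PersonId INTEGER, LoanId INTEGER
    );
    INSERT INTO Loans VALUES
        (6, 'S1'), (7, 'S2'), (8, 'S+W1'), (9, 'S+W2'), (10, 'S+W3'), (11, 'S+M1');
    -- Person 1 = Smith, 2 = Wife, 3 = Mistress.
    INSERT INTO Borrowers (PersonId, LoanId) VALUES
        (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11),
        (2, 8), (2, 9), (2, 10),
        (3, 11);
""")

# b1 picks the husband's rows, b2 the wife's rows on the same loan.
joint_loan_ids = [row[0] for row in conn.execute("""
    SELECT l.LoanId
    FROM Borrowers b1
        INNER JOIN Borrowers b2 ON b1.LoanId = b2.LoanId
        INNER JOIN Loans l ON b1.LoanId = l.LoanId
    WHERE b1.PersonId = :husband_id
        AND b2.PersonId = :wife_id
    ORDER BY l.LoanId
""", {"husband_id": 1, "wife_id": 2})]

print(joint_loan_ids)
```

Only loans where both people appear as borrowers survive the join, so no separate group table is needed for this lookup.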
The design of the database seems OK. Why do you have to use MAX(GroupId)+1 when you create a new group? Can't you just create the row and then use SCOPE_IDENTITY() to return the new ID?
e.g.
INSERT INTO BorrowerGroup DEFAULT VALUES;
SELECT SCOPE_IDENTITY();
(See this other question)
(edit to SQL courtesy of this question)
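The same pattern can be tried out in SQLite through Python's sqlite3 module, which is only an analogy: there is no SCOPE_IDENTITY() in SQLite, and the cursor's lastrowid attribute plays a similar role. The table definition below is a guess at the single-column design described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Single-column table whose key is generated by the database.
conn.execute("CREATE TABLE BorrowerGroup (GroupId INTEGER PRIMARY KEY)")

# Insert a row with no explicit values and read back the generated id,
# instead of computing MAX(GroupId) + 1 by hand.
first_group_id = conn.execute(
    "INSERT INTO BorrowerGroup DEFAULT VALUES").lastrowid
second_group_id = conn.execute(
    "INSERT INTO BorrowerGroup DEFAULT VALUES").lastrowid

print(first_group_id, second_group_id)
```

Letting the database hand out the id avoids the race between reading the maximum and inserting the next value.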
I would do something more like this:
People (PersonId (PK), Name, SSN, etc.)
Loans (LoanId (PK), Amount, BorrowerGroupId, etc.)
BorrowerGroup(BorrowerGroupId (PK))
PersonBelongsToBorrowerGroup(BorrowerGroupId (PK), PersonId (PK))
I got rid of the Borrowers table. Just store the info in the BorrowerGroup table. That's my preference.
The consensus seems to be to omit the BorrowerGroup table, and I have to agree. Using MAX(GroupId)+1 raises all sorts of ACID/transaction issues, which is the main reason IDENTITY fields exist.
That said, the SQL that KM provided looks good. There are any number of ways to get the same results: joins, sub-selects, and so on. The real issue there is knowing the dataset. Given the explanation you provided, the datasets are going to be very small. That also supports removing the BorrowerGroup table.
I would have a group table and then a groupmembers(borrowers) table to accomplish the many-to-many relationship between loans and people. This allows the tracking of data on the group other than just a list of members (I believe someone else made this suggestion?).
CREATE TABLE LoanGroup
(
ID int NOT NULL
, Group_Name char(50) NULL
, Date_Started datetime NULL
, Primary_ContactID int NULL
, Group_Type varchar(25)
)