PostgreSQL Insert into table with subquery selecting from multiple other tables - sql

I am learning SQL (postgres) and am trying to insert a record into a table that references records from two other tables, as foreign keys.
Below is the syntax I am using for creating the tables and records:
-- Create a person table + insert single row
CREATE TABLE person (
pname VARCHAR(255) NOT NULL,
PRIMARY KEY (pname)
);
INSERT INTO person VALUES ('personOne');
-- Create a city table + insert single row
CREATE TABLE city (
cname VARCHAR(255) NOT NULL,
PRIMARY KEY (cname)
);
INSERT INTO city VALUES ('cityOne');
-- Create a employee table w/ForeignKey reference
CREATE TABLE employee (
ename VARCHAR(255) REFERENCES person(pname) NOT NULL,
ecity VARCHAR(255) REFERENCES city(cname) NOT NULL,
PRIMARY KEY(ename, ecity)
);
-- create employee entry referencing existing records
INSERT INTO employee VALUES(
SELECT pname FROM person
WHERE pname='personOne' AND <-- ISSUE
SELECT cname FROM city
WHERE cname='cityOne
);
Notice in the last block of code, where I'm doing an INSERT into the employee table, I don't know how to string together multiple SELECT sub-queries to get both the existing records from the person and city table such that I can create a new employee entry with attributes as such:
ename='personOne'
ecity='cityOne'
The textbook I have for class doesn't dive into sub-queries like this and I can't find any examples similar enough to mine such that I can understand how to adapt them for this use case.
Insight will be much appreciated.

There doesn’t appear to be any obvious relationship between city and person which will make your life hard
The general pattern for turning a select that has two base tables giving info, into an insert is:
INSERT INTO table(column,list,here)
SELECT column,list,here
FROM
a
JOIN b ON a.x = b.y
In your case there isn’t really anything to join on because your one-column tables have no column in common. Provide eg a cityname in Person (because it seems more likely that one city has many person) then you can do
INSERT INTO employee(personname,cityname)
SELECT p.pname, c.cname
FROM
person p
JOIN city c ON p.cityname = c.cname
But even then, the tables are related between themselves and don’t need the third table so it’s perhaps something of an academic exercise only, not something you’d do in the real world
If you just want to mix every person with every city you can do:
INSERT INTO employee(personname,cityname)
SELECT pname, cname
FROM
person p
CROSS JOIN city c
But be warned, two people and two cities will cause 4 rows to be inserted, and so on (20 people and 40 cities, 800 rows. Fairly useless imho)
However, I trust that the general pattern shown first will suffice for your learning; write a SELECT that shows the data you want to insert, then simply write INSERT INTO table(columns) above it. The number of columns inserted to must match the number of columns selected. Don’t forget that you can select fixed values if no column from the query has the info (INSERT INTO X(p,c,age) SELECT personname, cityname, 23 FROM ...)

The following will work for you:
INSERT INTO employee
SELECT pname, cname FROM person, city
WHERE pname='personOne' AND cname='cityOne';
This is a cross join producing a cartesian product of the two tables (since there is nothing to link the two). It reads slightly oddly, given that you could just as easily have inserted the values directly. But I assume this is because it is a learning exercise.
Please note that there is a typo in your create employee. You are missing a comma before the primary key.

Related

creating the sql query for company supervisors

I have four tables
create table emp (emp_ss int, emp_name nvarchar(20));
create table comp(comp_name nvarchar(20), comp_address nvarchar(20));
create table works (emp_ss int, comp_name nvarchar(20));
create table supervises (spv_ss int, emp_ss int );
Here SUPRVISER_SS and EMP_SS are subset of SS. Now I have to find:
the name of all the companies who have more than 4 supervisors
I have made a query for the above problem but not sure whether it is correct or not
SELECT COMP_NAME , COUNT(EMP_SS) FROM WORKS
WHERE EMP_SS IN (SELECT DISTINCT SPV_SS FROM supervises)
GROUP BY COMP_NAME
HAVING COUNT(EMP_SS) > 4;
the name of supervisors who have the largest number of employees
but unable to get the required result of the above condition
SELECT SPV_SS, COUNT(*) max_ FROM supervises GROUP BY SPV_SS
You don't need to have a seperate table for supervisors unless they come with extra information that doesn't belong in the employee table, just add an extra field (foreign key) in Employee table that links to the primary key in the same table.
First question: select company just use a group by companyid clause and then check if the count of supervisors is larger than 4 for.
Second question: select count(empid) and supervisor, use group by supervisor clause and add order by clause on the count column
I explained the logic, as for the actual sql code, you're gonna have to figure that out yourself.

How do I insert data from one table to another when there is unequal number of rows in one another?

I have a table named People that has 19370 rows with playerID being the primary column. There is another table named Batting, which has playerID as a foreign key and has 104324 rows.
I was told to add a new column in the People table called Total_HR, which is included in the Batting table. So, I have to insert that column data from the Batting table into the People table.
However, I get the error:
Msg 515, Level 16, State 2, Line 183 Cannot insert the value NULL into
column 'playerID', table 'Spring_2019_BaseBall.dbo.People'; column
does not allow nulls. INSERT fails. The statement has been terminated.
I have tried UPDATE and INSERT INTO SELECT, however got the same error
insert into People (Total_HR)
select sum(HR) from Batting group by playerID
I expect the output to populate the column Total_HR in the People table using the HR column from the Batting table.
You could use a join
BEGIN TRAN
Update People
Set Total_HR = B.HR_SUM
from PEOPLE A
left outer join
(Select playerID, sum(HR) HR_SUM
from Batting
group by playerID) B on A.playerID = B.playerID
Select * from People
ROLLBACK
Notice that I've put this code in a transaction block so you can test the changes before you commit
From the error message, it seems that playerID is a required field in table People.
You need to specify all required fields of table People in the INSERT INTO clause and provide corresponding values in the SELECT clause.
I added field playerID below, but you might need to add additional required fields as well.
insert into People (playerID, Total_HR)
select playerID, sum(HR) from Batting group by playerID
It is strange, however, that you want to insert rows in a table that should already be there. Otherwise, you could not have a valid foreign key on field playerID in table Batting... If you try to insert such rows from table Batting into table People, you might get another error (violation of PRIMARY KEY constraint)... Unless... you are creating the database just now and you want to populate empty table People from filled/imported table Batting before adding the actual foreign key constraint to table People. Sorry, I will not question your intentions. I personally would consider to update the query somewhat so that it will not attempt to insert any rows that already exist in table People:
insert into People (playerID, Total_HR)
select Batting.playerID, sum(Batting.HR)
from Batting
left join People on People.playerID = Batting.playerID
where People.playerID is null and Batting.playerID is not null
group by playerID
You need to calculate the SUM and then join this result to the People table to bring into the same rows both Total_HR column from People and the corresponding SUM calculated from Batting.
Here is one way to write it. I used CTE to make is more readable.
WITH
CTE_Sum
AS
(
SELECT
Batting.playerID
,SUM(Batting.HR) AS TotalHR_Src
FROM
Batting
GROUP BY
Batting.playerID
)
,CTE_Update
AS
(
SELECT
People.playerID
,People.Total_HR
,CTE_Sum.TotalHR_Src
FROM
CTE_Sum
INNER JOIN People ON People.playerID = CTE_Sum.playerID
)
UPDATE CTE_Update
SET
Total_HR = TotalHR_Src
;

How to perform a mass SQL insert to one table with rows from two seperate tables

I need some T-SQL help. We have an application which tracks Training Requirements assigned to each employee (such as CPR, First Aid, etc.). There are certain minimum Training Requirements which all employees must be assigned and my HR department wants me to give them the ability to assign those minimum Training Requirements to all personnel with the click of a button. So I have created a table called TrainingRequirementsForAllEmployees which has the TrainingRequirementID's of those identified minimum TrainingRequirements.
I want to insert rows into table Employee_X_TrainingRequirements for every employee in the Employees table joined with every row from TrainingRequirementsForAllEmployees.
I will add abbreviated table schema for clarity.
First table is Employees:
EmployeeNumber PK char(6)
EmployeeName varchar(50)
Second Table is TrainingRequirementsForAllEmployees:
TrainingRequirementID PK int
Third table (the one I need to Insert Into) is Employee_X_TrainingRequirements:
TrainingRequirementID PK int
EmployeeNumber PK char(6)
I don't know what the Stored Procedure should look like to achieve the results I need. Thanks for any help.
cross join operator is suitable when cartesian product of two sets of data is needed. So in the body of your stored procedure you should have something like:
insert into Employee_X_TrainingRequirements (TrainingRequirementID, EmployeeNumber)
select r.TrainingRequirementID, e.EmployeeNumber
from Employees e
cross join TrainingRequirementsForAllEmployees r
where not exists (
select 1 from Employee_X_TrainingRequirements
where TrainingRequirementID = r.TrainingRequirementID
and EmployeeNumber = e.EmployeeNumber
)

Underlying rows in Group By

I have a table with a certain number of columns and a primary key column (suppose OriginalKey). I perform a GROUP BY on a certain sub-set of those columns and store them in a temporary table with primary key (suppose GroupKey). At a later stage, I may need to get more details about one or more of those groupings (which can be found in the temporary table) i.e. I need to know which were the rows from the original table that formed that group. Simply put, I need to know the mappings between GroupKey and OriginalKey. What's the best way to do this? Thanks in advance.
Example:
Table Student(
StudentID INT PRIMARY KEY,
Level INT, --Grade/Class/Level depending on which country you are from)
HomeTown TEXT,
Gender CHAR)
INSERT INTO TempTable SELECT HomeTown, Gender, COUNT(*) AS NumStudents FROM Student GROUP BY HomeTown, Gender
On a later date, I would like to find out details about all towns that have more than 50 male students and know details of every one of them.
How about joining the 2 tables using the GroupKey, which, you say, are the same?
Or how about doing:
select * from OriginalTable where
GroupKey in (select GroupKey from my_temp_table)
You'd need to store the fields you grouped on in your temporary table, so you can join back to the original table. e.g. if you grouped on fieldA, fieldB, and fieldC, you'd need something like:
select original.id
from original
inner join temptable on
temptable.fieldA = original.fieldA and
temptable.fieldB = original.fieldB and
temptable.fieldC = original.fieldC

A Simple Sql Select Query

I know I am sounding dumb but I really need help on this.
I have a Table (let's say Meeting) which Contains a column Participants.
The Participants dataType is varchar(Max) and it stores Participant's Ids in comma separated form like 1,2.
Now my problem is I am passing a parameter called #ParticipantsID in my Stored Procedure and want to do something like this:
Select Participants from Meeting where Participants in (#ParticipantsID)
Unfortunately I am missing something crucial here.
Can some one point that out?
I've been there before... I changed the DB design to have one record contain a single reference to the other table. If you can't change your DB structures and you have to live with this, I found this solution on CodeProject.
New Function
IF EXISTS(SELECT * FROM sysobjects WHERE ID = OBJECT_ID(’UF_CSVToTable’))
DROP FUNCTION UF_CSVToTable
GO
CREATE FUNCTION UF_CSVToTable
(
#psCSString VARCHAR(8000)
)
RETURNS #otTemp TABLE(sID VARCHAR(20))
AS
BEGIN
DECLARE #sTemp VARCHAR(10)
WHILE LEN(#psCSString) > 0
BEGIN
SET #sTemp = LEFT(#psCSString, ISNULL(NULLIF(CHARINDEX(',', #psCSString) - 1, -1),
LEN(#psCSString)))
SET #psCSString = SUBSTRING(#psCSString,ISNULL(NULLIF(CHARINDEX(',', #psCSString), 0),
LEN(#psCSString)) + 1, LEN(#psCSString))
INSERT INTO #otTemp VALUES (#sTemp)
END
RETURN
END
Go
New Sproc
SELECT *
FROM
TblJobs
WHERE
iCategoryID IN (SELECT * FROM UF_CSVToTable(#sCategoryID))
You would not typically organise your SQL database in quite this way. What you are describing are two entities (Meeting & Participant) that have a one-to-many relationship. i.e. a meeting can have zero or more participants. To model this in SQL you would use three tables: a meeting table, a participant table and a MeetingParticipant table. The MeetingParticipant table holds the links between meetings & participants. So, you might have something like this (excuse any sql syntax errors)
create table Meeting
(
MeetingID int,
Name varchar(50),
Location varchar(100)
)
create table Participant
(
ParticipantID int,
FirstName varchar(50),
LastName varchar(50)
)
create table MeetingParticipant
(
MeetingID int,
ParticipantID int
)
To populate these tables you would first create some Participants:
insert into Participant(ParticipantID, FirstName, LastName) values(1, 'Tom', 'Jones')
insert into Participant(ParticipantID, FirstName, LastName) values(2, 'Dick', 'Smith')
insert into Participant(ParticipantID, FirstName, LastName) values(3, 'Harry', 'Windsor')
and create a Meeting or two
insert into Meeting(MeetingID, Name, Location) values(10, 'SQL Training', 'Room 1')
insert into Meeting(MeetingID, Name, Location) values(11, 'SQL Training', 'Room 2')
and now add some participants to the meetings
insert into MeetingParticipant(MeetingID, ParticipantID) values(10, 1)
insert into MeetingParticipant(MeetingID, ParticipantID) values(10, 2)
insert into MeetingParticipant(MeetingID, ParticipantID) values(11, 2)
insert into MeetingParticipant(MeetingID, ParticipantID) values(11, 3)
Now you can select all the meetings and the participants for each meeting with
select m.MeetingID, p.ParticipantID, m.Location, p.FirstName, p.LastName
from Meeting m
join MeetingParticipant mp on m.MeetingID=mp.MeetingID
join Participant p on mp.ParticipantID=p.ParticipantID
the above should produce
MeetingID ParticipantID Location FirstName LastName
10 1 Room 1 Tom Jones
10 2 Room 1 Dick Smith
11 2 Room 2 Dick Smith
11 3 Room 2 Harry Windsor
If you want to find out all the meetings that "Dick Smith" is in you would write something like this
select m.MeetingID, m.Location
from Meeting m join MeetingParticipant mp on m.MeetingID=mp.ParticipantID
where
mp.ParticipantID=2
and get
MeetingID Location
10 Room 1
11 Room 2
I have omitted important things like indexes, primary keys and missing attributes such as meeting dates, but it is clearer without all the goo.
Your table is not normalized. If you want to query for individual participants, they should be split into their own table, along the lines of:
Meeting
MeetingId primary key
Other stuff
Persons
PersonId primary key
Other stuff
Participants
MeetingId foreign key Meeting(MeetingId)
PersonId foreign key Persons(PersonId)
primary key MeetingId,PersonId
Otherwise, you have to resort to all sorts of trickery (what I call SQL gymnastics) to find out what you want. That trickery never scales well - your queries become slow very quickly as the table grows.
With a properly normalized database, the queries can remain fast well into the multi-millions of records (I work with DB2/z where we are used to truly huge tables).
There are valid reasons for sometimes reverting to second normal form (or even first) for performance but that should be a very hard thought out decision (and based on actual performance data). All databases should initially start of in 3NF.
SELECT * FROM Meeting WHERE Participants LIKE '%,12,%' OR Participants LIKE '12,%' OR Participants LIKE '%,12'
where 12 is the ID you are looking for....
Ugly, what a nasty model.
If I understand your question correctly, you are trying to pass in a comma separated list of participant ids and see if it is in your list. This link lists several ways to do such a thing"
[http://vyaskn.tripod.com/passing_arrays_to_stored_procedures.htm][1]
codezy.blogspot.com
If you store the participant ids in a comma-separated list (as text) in the database, you cannot easily query it (as a list) using SQL. You would have to resort to string-operations.
You should consider changing your schema to use another table to map meetings to participants:
create table meeting_participants (
meeting_id integer not null , -- foreign key
participant_id integer not null
);
That table would have multiple rows per meeting (one for each participant).
You can then query that table for individual participants, or number of participants, and such.
If participants is a separate data type you should be storing it as a child table of your meeting table. e.g.
MEETING
PARTICIPANT 1
PARTICIPANT 2
PARTICIPANT 3
Each participant would hold the meeting ID so you can do a query
SELECT * FROM participants WHERE meeting_id = 1
However, if you must store a comma separated list (for some external reason) then you can do a string search to find the appropriate record. This would be a very inefficient way to do a query though.
That is not the best way to store the information you have.
If it is all you have got then you need to be doing a contains (not an IN). The best answer is to have another table that links Participants to Meetings.
Try SELECT Meeting, Participants FROM Meeting CONTAINS(Participants, #ParticipantId)