Insert into table with Inner Join - sql

I've been trying to execute an insert into a table joined to another table. I tried to use an inner join as below, but it didn't work. I'm also not sure whether an INNER JOIN or a LEFT JOIN is more suitable here.
INSERT INTO ticketChangeSet (Comments, createdBy, createdDateTime)
VALUES ('Test', 'system', CURRENT_TIMESTAMP)
INNER JOIN tickets ON ticketChangeSet.ticket_id = tickets.id
WHERE tickets.id BETWEEN '3' AND '5'
Sample data:
tickets table
id comment createdDateTime closeDateTime createdBy
2 NULL 2022-07-05 15:36:20 2022-07-05 16:21:03 system
3 NULL 2022-07-05 15:36:20 2022-07-05 16:21:03 system
4 NULL 2022-07-05 15:36:20 2022-07-05 16:21:03 system
5 NULL 2022-07-05 15:36:20 2022-07-05 16:21:03 system
ticketChangeSet table
id comments createdBy createdDateTime ticket_id
1 Ticket not resolved system 2022-07-05 15:59:01 2
Basically, I want to insert the values ('Ticket not resolved', 'system', '2022-07-05 15:59:01') into the ticketChangeSet table for ticket_id 3 to 5 from the tickets table.

Just select the rows directly from differIssue (or maybe from Tickets - not certain) and supply your constants as the column values.
insert dbo.differIssue (Comments, createdby, dateTime) -- why the strange casing?
select 'Test', 'system', CURRENT_TIMESTAMP
from dbo.differIssue where Tickets_id between 89 and 100 -- why underscores
;
Notice the statement terminator and the use of the schema name - both best practices. I also assumed that the ID column is numeric and removed the string delimiters around those filter values. I left out the join because it did not seem required: presumably the relationship between differIssue and Tickets is 1:1, so an inner join does nothing useful. But perhaps you need to include rows from Tickets for that range of ID values which might not exist in differIssue? So try
insert dbo.differIssue (Comments, createdby, dateTime)
select 'Test', 'system', CURRENT_TIMESTAMP
from dbo.Tickets where id between 89 and 100
;
But this all seems highly suspicious. I think there is at least one key column missing from the logic - and perhaps more than one.
Update. Now you've changed the table names, added more columns, and changed the filter. You still use string constants for a numeric column - a bad habit.
insert dbo.ticketChangeSet (...)
select ...
from dbo.Tickets as TKT
where not exists (select * from dbo.ticketChangeSet as CHG where CHG.ticket_id = TKT.id)
;
I leave it to you to fill in the missing bits.
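For reference, one way the filled-in statement might look against the schema now posted in the question (a sketch only: the column list and literal values come straight from the question, and the NOT EXISTS guard skips tickets that already have a change set):
INSERT INTO dbo.ticketChangeSet (Comments, createdBy, createdDateTime, ticket_id)
SELECT 'Ticket not resolved', 'system', '2022-07-05 15:59:01', TKT.id
FROM dbo.tickets AS TKT
WHERE TKT.id BETWEEN 3 AND 5 -- numeric comparison, no string delimiters
AND NOT EXISTS (SELECT * FROM dbo.ticketChangeSet AS CHG WHERE CHG.ticket_id = TKT.id)
;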

Related

One SQL statement for counting the records in the master table based on matching records in the detail table?

I have the following master table called Master, with sample data:
ID---------------Date
1 2014-09-07
2 2014-09-07
3 2014-09-08
And the following details table, called Details:
masterId-------------Name
1 John Walsh
1 John Jones
2 John Carney
1 Peter Lewis
3 John Wilson
Now I want to find the count of Master records (grouped on the Date column) whose corresponding Details records have a Name containing the value "John".
I cannot figure out how to write a single SQL statement for this job.
Please note that a join is needed in order to find the master records to count. However, such a join creates duplicate master records; I need to remove those duplicates from the count when grouping on the Date column in the Master table.
The correct results should be:
count: grouped on Date column
2 2014-09-07
1 2014-09-08
Thanks and regards!
This answer assumes the following:
The Name field is always FirstName LastName.
You are looking once and only once for the first name John. The search criteria would differ depending on what you need.
SELECT Date, Count(*)
FROM tblmaster
INNER JOIN tbldetails ON tblmaster.ID=tbldetails.masterId
WHERE NAME LIKE 'John%'
GROUP BY Date, tbldetails.masterId
What we're doing here is using a wildcard character in our string search to say "look for John followed by any characters of any length".
Also, here is a way to create table variables based on what we're working with:
DECLARE @tblmaster as table(
ID int,
[date] datetime
)
DECLARE @tbldetails as table(
masterID int,
name varchar(50)
)
INSERT INTO @tblmaster (ID,[date])
VALUES
(1,'2014-09-07'),(2,'2014-09-07'),(3,'2014-09-08')
INSERT INTO @tbldetails(masterID, name) VALUES
(1,'John Walsh'),
(1,'John Jones'),
(2,'John Carney'),
(1,'Peter Lewis'),
(3,'John Wilson')
Based on all the comments below, this SQL statement in all its clunky glory should do the trick.
SELECT mainTable.[date], COUNT(t1.ID)
FROM @tblmaster mainTable INNER JOIN
(
SELECT ID, COUNT(*) as countOfAll
FROM @tblmaster t1
INNER JOIN @tbldetails t2 ON t1.ID = t2.masterId
WHERE name LIKE 'John%'
GROUP BY ID)
as t1 ON t1.ID = mainTable.ID
GROUP BY mainTable.[date]
Is this what you want?
select date, count(distinct m.id)
from master m join
details d
on d.masterid = m.id
where name like '%John%'
group by date;

Is there any way to get postgresql to report results from a join?

In other statistical software (Stata, for instance), when you perform a join between two separate tables there are options to report the results of the join.
For instance, if you join a table with another table on a column and the second table has non-unique values, it reports that.
Likewise, if you perform an inner join it reports the number of rows dropped from both tables, and if you perform a left or right outer join it lets you know how many rows were unmatched.
It will need a nasty outer join. Here is the CTE version:
-- Some data
CREATE TABLE bob
( ID INTEGER NOT NULL
, zname varchar
);
INSERT INTO bob(id, zname) VALUES
(2, 'Alice') ,(3, 'Charly')
,(4,'David') ,(5, 'Edsger') ,(6, 'Fanny')
;
CREATE TABLE john
( ID INTEGER NOT NULL
, zname varchar
);
INSERT INTO john(id, zname) VALUES
(4,'David') ,(5, 'Edsger') ,(6, 'Fanny')
,(7,'Gerard') ,(8, 'Hendrik') ,(9, 'Irene'), (10, 'Joop')
;
--
-- Encode presence in bob as 1, presence in John AS 2, both=3
--
WITH flags AS (
WITH b AS (
SELECT 1::integer AS flag, id
FROM bob
)
, j AS (
SELECT 2::integer AS flag, id
FROM john
)
SELECT COALESCE(b.flag, 0) + COALESCE(j.flag, 0) AS flag
FROM b
FULL OUTER JOIN j ON b.id = j.id
)
SELECT flag, COUNT(*)
FROM flags
GROUP BY flag;
The result:
CREATE TABLE
INSERT 0 5
CREATE TABLE
INSERT 0 7
flag | count
------+-------
1 | 2
3 | 3
2 | 4
(3 rows)
As far as I know there is no option to do that within Postgres, although you could get a rough idea by looking at the planner's row estimates.
Calculating the missing rows requires counting all rows, so databases generally try to avoid that kind of work.
The options I can think of:
writing multiple queries
doing a full outer join and filtering the results (maybe with a subquery; see the sketch after this list)
using writable common table expressions (data-modifying CTEs) to log the intermediate results
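As a sketch of the second option, here is one way to count matched and unmatched rows in a single pass over the bob/john tables from the answer above. This assumes PostgreSQL 9.4+ for the aggregate FILTER clause; on older versions, replace each FILTER with a SUM over a CASE expression.
WITH joined AS (
SELECT b.id AS b_id, j.id AS j_id
FROM bob AS b
FULL OUTER JOIN john AS j ON b.id = j.id
)
SELECT COUNT(*) FILTER (WHERE b_id IS NOT NULL AND j_id IS NOT NULL) AS matched -- rows an inner join would keep
, COUNT(*) FILTER (WHERE j_id IS NULL) AS only_in_bob -- rows an inner join would drop from bob
, COUNT(*) FILTER (WHERE b_id IS NULL) AS only_in_john -- rows an inner join would drop from john
FROM joined;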

SQL - need to determine implicit end dates for supplied begin dates

Consider the following:
CREATE TABLE Members
(
MemberID CHAR(10)
, GroupID CHAR(10)
, JoinDate DATETIME
)
INSERT Members VALUES ('1', 'A', '2010-01-01')
INSERT Members VALUES ('1', 'C', '2010-09-05')
INSERT Members VALUES ('1', 'B', '2010-04-15')
INSERT Members VALUES ('1', 'B', '2010-10-10')
INSERT Members VALUES ('1', 'A', '2010-06-01')
INSERT Members VALUES ('1', 'D', '2010-11-30')
What would be the best way to select from this table, determining the implied "LeaveDate", producing the following data set:
MemberID GroupID JoinDate LeaveDate
1 A 2010-01-01 2010-04-14
1 B 2010-04-15 2010-05-31
1 A 2010-06-01 2010-09-04
1 C 2010-09-05 2010-10-09
1 B 2010-10-10 2010-11-29
1 D 2010-11-30 NULL
As you can see, a member is assumed to have no lapse in membership. The [LeaveDate] for each member status period is assumed to be the day prior to the next chronological [JoinDate] that can be found for that member in a different group. Of course this is a simplified illustration of my actual problem, which includes a couple more categorization/grouping columns and thousands of different members with [JoinDate] values stored in no particular order.
Something like this, perhaps? Self-join, and select the minimum join date that is greater than the join date of the current row - i.e. one day after the leave date - then subtract one day from it.
You may need to adjust the date arithmetic for your particular RDBMS.
SELECT
m1.*
, MIN( m2.JoinDate ) - INTERVAL 1 DAY AS LeaveDate
FROM
Members m1
LEFT JOIN
Members m2
ON m2.MemberID = m1.MemberID
AND m2.JoinDate > m1.JoinDate
GROUP BY
m1.MemberID
, m1.GroupID
, m1.JoinDate
ORDER BY
m1.MemberID
, m1.JoinDate
Standard (ANSI) SQL solution:
SELECT memberid,
groupid,
joindate,
lead(joindate) OVER (PARTITION BY memberid ORDER BY joindate ASC) - INTERVAL '1' DAY AS leave_date
FROM members
ORDER BY joindate ASC

Last Record of a Join Table (how to optimize)

I have the same "problem" as described in (Last record of Join table): I need to join a "Master Table" with a "History Table", where I only want to join the latest (by date) record of the history table. So whenever I query a record from the master table, I also get the "latest" data from the history table.
Master Table
ID
FIRSTNAME
LASTNAME
...
History Table
ID
LASTACTION
DATE
This is possible by joining both tables and using a subselect to retrieve the latest history table record as described in the answer given in the link above.
My questions are:
How can I solve the problem that there might, in theory, be two history records with the same date?
Is this kind of join with a subselect really the best solution in terms of performance (and in general)? What do you think (I am NO expert in all this stuff) of integrating a further attribute into the History table named "ISLATESTRECORD", a boolean flag with a unique constraint that I manage manually? This attribute would explicitly mark the latest record, so I would not need any subselects and could use the attribute directly in the WHERE clause of the join.
On the other hand, this of course makes inserting a new record a little more complicated: I first have to clear the "ISLATESTRECORD" flag on the current latest record, then insert the new history record with "ISLATESTRECORD" set, and commit the transaction.
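In T-SQL that flag maintenance might look roughly like this (a sketch only: @MasterID and @Action are illustrative parameters, and it assumes History.ID is the foreign key to the master table, per the question's column list):
BEGIN TRANSACTION;
-- clear the flag on the current latest record for this master row
UPDATE History SET IsLatestRecord = 0
WHERE ID = @MasterID AND IsLatestRecord = 1;
-- insert the new record as the new latest one
INSERT INTO History (ID, LastAction, [Date], IsLatestRecord)
VALUES (@MasterID, @Action, GETDATE(), 1);
COMMIT TRANSACTION;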
What do you think is the recommended solution? I have no clue about the performance impact of the subselects: I might have millions of master table records to search, also using search attributes from the joined History table, like "give me the master table record with FIRSTNAME XYZ whose LASTACTION (in the History table) was 'changed_name'". So the subselect might be evaluated millions of times.
Or is it better to work with a subselect to find the latest record, since subselects are efficient and it is better to keep everything normalized?
Thank you very much
Below I solve your problem twice: once with a query on your existing tables, and once with an auto-incrementing identity column added to the history table. Adding an identity column to the history table gets around the uniqueness problem with the dates and makes the query simpler.
To solve the problem with your tables (with SQL Server example code):
DECLARE @MasterTable table (MasterID int,FirstName varchar(20),LastName varchar(20))
DECLARE @HistoryTable table (MasterID int,LastAction char(1),HistoryDate datetime)
INSERT INTO @MasterTable VALUES (1,'AAA','aaa')
INSERT INTO @MasterTable VALUES (2,'BBB','bbb')
INSERT INTO @MasterTable VALUES (3,'CCC','ccc')
INSERT INTO @HistoryTable VALUES (1,'I','1/1/2009')
INSERT INTO @HistoryTable VALUES (1,'U','2/2/2009')
INSERT INTO @HistoryTable VALUES (1,'U','3/3/2009') --<<dups
INSERT INTO @HistoryTable VALUES (1,'U','3/3/2009') --<<dups
INSERT INTO @HistoryTable VALUES (2,'I','5/5/2009')
INSERT INTO @HistoryTable VALUES (3,'I','7/7/2009')
INSERT INTO @HistoryTable VALUES (3,'U','8/8/2009')
SELECT
MasterID,FirstName,LastName,LastAction,HistoryDate
FROM (SELECT
m.MasterID,m.FirstName,m.LastName,h.LastAction,h.HistoryDate,ROW_NUMBER() OVER(PARTITION BY m.MasterID ORDER BY m.MasterID) AS RankValue
FROM @MasterTable m
INNER JOIN (SELECT
MasterID,MAX(HistoryDate) AS MaxDate
FROM @HistoryTable
GROUP BY MasterID
) dt ON m.MasterID=dt.MasterID
INNER JOIN @HistoryTable h ON dt.MasterID=h.MasterID AND dt.MaxDate=h.HistoryDate
) AllRows
WHERE RankValue=1
OUTPUT:
MasterID FirstName LastName LastAction HistoryDate
----------- --------- -------- ---------- -----------
1 AAA aaa U 2009-03-03
2 BBB bbb I 2009-05-05
3 CCC ccc U 2009-08-08
(3 row(s) affected)
To solve the problem with a better HistoryTable (with SQL Server example code):
It is better because it has an auto-incrementing HistoryID identity column.
DECLARE @MasterTable table (MasterID int,FirstName varchar(20),LastName varchar(20))
DECLARE @HistoryTableNEW table (HistoryID int identity(1,1), MasterID int,LastAction char(1),HistoryDate datetime)
INSERT INTO @MasterTable VALUES (1,'AAA','aaa')
INSERT INTO @MasterTable VALUES (2,'BBB','bbb')
INSERT INTO @MasterTable VALUES (3,'CCC','ccc')
INSERT INTO @HistoryTableNEW VALUES (1,'I','1/1/2009')
INSERT INTO @HistoryTableNEW VALUES (1,'U','2/2/2009')
INSERT INTO @HistoryTableNEW VALUES (1,'U','3/3/2009') --<<dups
INSERT INTO @HistoryTableNEW VALUES (1,'U','3/3/2009') --<<dups
INSERT INTO @HistoryTableNEW VALUES (2,'I','5/5/2009')
INSERT INTO @HistoryTableNEW VALUES (3,'I','7/7/2009')
INSERT INTO @HistoryTableNEW VALUES (3,'U','8/8/2009')
SELECT
m.MasterID,m.FirstName,m.LastName,h.LastAction,h.HistoryDate,h.HistoryID
FROM @MasterTable m
INNER JOIN (SELECT
MasterID,MAX(HistoryID) AS MaxHistoryID
FROM @HistoryTableNEW
GROUP BY MasterID
) dt ON m.MasterID=dt.MasterID
INNER JOIN @HistoryTableNEW h ON dt.MasterID=h.MasterID AND dt.MaxHistoryID=h.HistoryID
OUTPUT:
MasterID FirstName LastName LastAction HistoryDate HistoryID
----------- --------- -------- ---------- ----------------------- ---------
1 AAA aaa U 2009-03-03 00:00:00.000 4
2 BBB bbb I 2009-05-05 00:00:00.000 5
3 CCC ccc U 2009-08-08 00:00:00.000 7
(3 row(s) affected)
If the history table has a primary key (and all tables should have one), you can modify the subselect to extract the record with either the largest (or the smallest) PK value among the multiples that match the date criteria...
Select M.*, H.*
From Master M
Join History H
On H.PK = (Select Max(PK) From History
Where FK = M.PK
And Date = (Select Max(Date) From History
Where FK = M.PK))
As to performance, that can be addressed by adding the appropriate indexes to these tables (History.Date, History.FK), but in general, depending on the specific data distribution patterns, subqueries can adversely affect performance.
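For instance, a composite index supporting the correlated subselects above might look like this (a sketch; the table and column names follow the pseudo-schema in this answer rather than a real one):
-- lets each per-master lookup of Max(Date) and Max(PK) seek instead of scan
CREATE INDEX IX_History_FK_Date ON History (FK, [Date]);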

Updating Uncommitted data to a cell within an UPDATE statement

I want to convert a table storing Name-Value pair data into relational form in SQL Server 2008.
Source table
Strings
ID Type String
100 1 John
100 2 Milton
101 1 Johny
101 2 Gaddar
Target required
Customers
ID FirstName LastName
100 John Milton
101 Johny Gaddar
I am following the strategy given below.
Populate the Customers table with the ID values from the Strings table:
INSERT INTO Customers (ID) SELECT DISTINCT ID FROM Strings
You get the following
Customers
ID FirstName LastName
100 NULL NULL
101 NULL NULL
Update Customers with the rest of the attributes by joining it to Strings on the ID column. This way each record in Customers has 2 corresponding matching records:
UPDATE Customers
SET FirstName = (CASE WHEN S.Type=1 THEN S.String ELSE FirstName END),
LastName = (CASE WHEN S.Type=2 THEN S.String ELSE LastName END)
FROM Customers
INNER JOIN Strings S ON Customers.ID=S.ID
An intermediate state would look like:
ID FirstName LastName ID Type String
100 John NULL 100 1 John
100 NULL Milton 100 2 Milton
101 Johny NULL 101 1 Johny
101 NULL Gaddar 101 2 Gaddar
But this is not working as expected: when assigning the values in the SET clause, it sets only the committed values instead of the uncommitted ones. Is there any way to make the UPDATE statement use the uncommitted values (from within the processing time of the query)?
PS: I am not looking for alternate solutions, but to make my approach work by telling SQL Server to use uncommitted data for the UPDATE.
The easiest way to do it would be to split the update into two:
UPDATE Customers
SET FirstName = Strings.String
FROM Customers
INNER JOIN Strings ON Customers.ID=Strings.ID AND Strings.Type = 1
And then:
UPDATE Customers
SET LastName = Strings.String
FROM Customers
INNER JOIN Strings ON Customers.ID=Strings.ID AND Strings.Type = 2
There are probably ways to do it in one query such as a derived table, but unless that's a specific requirement I'd just use this approach.
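For completeness, the derived-table variant might look something like this (a sketch, assuming exactly the Customers/Strings schema from the question):
UPDATE Customers
SET FirstName = P.FirstName,
LastName = P.LastName
FROM Customers
INNER JOIN (
-- pivot the name/value rows into one row per ID, so each Customers row matches exactly one source row
SELECT ID,
MAX(CASE WHEN Type = 1 THEN String END) AS FirstName,
MAX(CASE WHEN Type = 2 THEN String END) AS LastName
FROM Strings
GROUP BY ID
) AS P ON Customers.ID = P.ID
Because the derived table collapses to one row per ID, the update is deterministic, which sidesteps the problem described in the question.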
Have a look at this; it should avoid all the steps you had:
DECLARE @Table TABLE(
ID INT,
Type INT,
String VARCHAR(50)
)
INSERT INTO @Table (ID,[Type],String) SELECT 100 ,1 ,'John'
INSERT INTO @Table (ID,[Type],String) SELECT 100 ,2 ,'Milton'
INSERT INTO @Table (ID,[Type],String) SELECT 101 ,1 ,'Johny'
INSERT INTO @Table (ID,[Type],String) SELECT 101 ,2 ,'Gaddar'
SELECT IDs.ID,
tName.String NAME,
tSur.String Surname
FROM (
SELECT DISTINCT ID
FROM @Table
) IDs LEFT JOIN
@Table tName ON IDs.ID = tName.ID AND tName.[Type] = 1 LEFT JOIN
@Table tSur ON IDs.ID = tSur.ID AND tSur.[Type] = 2
OK, I do not think you will find a solution to what you are looking for. From UPDATE (Transact-SQL), it states:
Using UPDATE with the FROM Clause
The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is, if the UPDATE statement is not deterministic. For example, in the UPDATE statement in the following script, both rows in Table1 meet the qualifications of the FROM clause in the UPDATE statement; but it is undefined which row from Table1 is used to update the row in Table2.
USE AdventureWorks;
GO
IF OBJECT_ID ('dbo.Table1', 'U') IS NOT NULL
DROP TABLE dbo.Table1;
GO
IF OBJECT_ID ('dbo.Table2', 'U') IS NOT NULL
DROP TABLE dbo.Table2;
GO
CREATE TABLE dbo.Table1
(ColA int NOT NULL, ColB decimal(10,3) NOT NULL);
GO
CREATE TABLE dbo.Table2
(ColA int PRIMARY KEY NOT NULL, ColB decimal(10,3) NOT NULL);
GO
INSERT INTO dbo.Table1 VALUES(1, 10.0), (1, 20.0);
GO
INSERT INTO dbo.Table2 VALUES(1, 0.0);
GO
UPDATE dbo.Table2
SET dbo.Table2.ColB = dbo.Table2.ColB + dbo.Table1.ColB
FROM dbo.Table2
INNER JOIN dbo.Table1
ON (dbo.Table2.ColA = dbo.Table1.ColA);
GO
SELECT ColA, ColB
FROM dbo.Table2;
Astander is correct (I am accepting his answer). The update is not happening because of a read-uncommitted issue, but because of the multiple rows returned by the JOIN. I have verified this. UPDATE picks only the first row generated from the multiple matching records to update the original table. This is the behavior in MSSQL, Sybase, and similar RDBMSs, but Oracle does not allow this kind of update and throws an error. I have verified this for MSSQL.
And again, MSSQL does not support updating a cell with UNCOMMITTED data. I don't know the status with other RDBMSs, and I have no idea whether any RDBMS provides within-query isolation-level management.
An alternate solution is to do it in one statement: aggregate to pivot the name/value pairs, then insert. This has fewer scans compared to the methods given in the answers above.
INSERT INTO Customers (ID, FirstName, LastName)
SELECT
ID
,MAX(CASE WHEN Type = 1 THEN String ELSE NULL END) AS FirstName
,MAX(CASE WHEN Type = 2 THEN String ELSE NULL END) AS LastName
FROM Strings
GROUP BY ID
Thanks to my friend Roji Thomas for helping me with this.