Select rows not in another table, SQL Server query

Select rows not in another table, SQL Server query - sql

Subject table
CREATE TABLE [dbo].[BS_Subject](
[SubjectID] [bigint] IDENTITY(1,1) NOT NULL,
[DepartmentID] [bigint] NOT NULL,
[SubjectName] [varchar](50) NOT NULL,
[SubjectDescription] [varchar](100) NULL,
[SubjectShortCode] [varchar](10) NOT NULL,
CONSTRAINT [PK_Subject] PRIMARY KEY CLUSTERED
(
[SubjectID] ASC
)
SubjectToClass table
CREATE TABLE [dbo].[BS_SubjectToClass](
[SubjectToClassID] [bigint] IDENTITY(1,1) NOT NULL,
[SubjectID] [bigint] NOT NULL,
[ClassID] [bigint] NOT NULL,
CONSTRAINT [PK_BS_SubjectToClass] PRIMARY KEY CLUSTERED
(
[SubjectToClassID] ASC
)
I need list all the rows in the Subject table where subjectid is not in SubjectToClass table of a specified class.
I have this but unable to go any further
select Distinct(BS_Subject.SubjectID) DepartmentID,
SubjectName, SubjectDescription, SubjectShortCode
from dbo.BS_Subject
where BS_Subject.SubjectID <> (
SELECT Distinct(BS_Subject.SubjectID)
FROM dbo.BS_Subject, dbo.BS_SubjectToClass
Where BS_Subject.SubjectID = BS_SubjectToClass.SubjectID
And BS_SubjectToClass.ClassID = 2)

SELECT SubjectID, DepartmentID, SubjectName, SubjectDescription, SubjectShortCode
FROM BS_Subject
WHERE NOT EXISTS
(SELECT SubjectToClassID FROM BS_SubjectToClass WHERE
BS_Subject.SubjectID = BS_SubjectToClass.SubjectID
AND BS_SubjectToClass.ClassID =2)

You need to use the NOT IN operator - not the <> (that's VB or something....)
SELECT
DISTINCT(BS_Subject.SubjectID) DepartmentID,
SubjectName, SubjectDescription, SubjectShortCode
FROM dbo.BS_Subject
WHERE
BS_Subject.SubjectID NOT IN
(SELECT DISTINCT(BS_Subject.SubjectID)
FROM dbo.BS_Subject, dbo.BS_SubjectToClass
WHERE BS_Subject.SubjectID = BS_SubjectToClass.SubjectID
AND BS_SubjectToClass.ClassID = 2)

Related

SQL Server: select records, not linked to another table

I have a table:
CREATE TABLE [dbo].[CollectionSite]
(
[SiteCode] [nvarchar](32) NOT NULL,
[AddressId] [int] NOT NULL,
[RemittanceId] [int] NULL,
// additional columns
)
and a linked table:
CREATE TABLE [dbo].[CollectionSiteAddress]
(
[Id] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](255) NULL,
[Address1] [nvarchar](255) NULL,
[Address2] [nvarchar](255) NULL,
[City] [nvarchar](128) NULL,
[State] [nvarchar](64) NULL,
[Zip] [nvarchar](32) NULL,
)
Relationship between these 2 tables:
ALTER TABLE [dbo].[CollectionSite] WITH CHECK
ADD CONSTRAINT [FK_CollectionSite_CollectionSiteAddress_AddressId]
FOREIGN KEY([AddressId]) REFERENCES [dbo].[CollectionSiteAddress] ([Id])
GO
ALTER TABLE [dbo].[CollectionSite] WITH CHECK
ADD CONSTRAINT [FK_CollectionSite_CollectionSiteAddress_RemittanceId]
FOREIGN KEY([RemittanceId]) REFERENCES [dbo].[CollectionSiteAddress] ([Id])
GO
I want to select all records from CollectionSiteAddress, which are not linked to CollectionSite (neither AddressId nor RemittanceId). Which request should I use?
I tried:
SELECT *
FROM CollectionSiteAddress
LEFT JOIN CollectionSite ON CollectionSiteAddress.Id = CollectionSite.AddressId
OR CollectionSiteAddress.Id = CollectionSite.RemittanceId
but it selects all records from CollectionSiteAddress

You are missing this WHERE clause:
WHERE CollectionSite.[SiteCode] IS NULL
because you want all the unmatched rows of CollectionSiteAddress.
I used the column [SiteCode] to check if it is NULL because it is not nullable in the definition of the table.
So you can write your query like this (shortened with aliases):
SELECT csa.*
FROM CollectionSiteAddress csa LEFT JOIN CollectionSite cs
ON csa.Id = cs.AddressId OR csa.Id = cs.RemittanceId
WHERE cs.[SiteCode] IS NULL
Or use NOT EXISTS:
SELECT csa.*
FROM CollectionSiteAddress csa
WHERE NOT EXISTS (
SELECT 1
FROM CollectionSite cs
WHERE csa.Id = cs.AddressId OR csa.Id = cs.RemittanceId
)

I can't figure out how to Order by with string_agg

I have this query (I am using SQL Server 2019) and is working fine (combining Dates and Notes into one column). However, the result I am looking for is to have the latest date show up first.
How can I achieve that from this query?
SELECT ID,
(SELECT string_agg(concat(Date, ': ', Notes), CHAR(13) + CHAR(10) + CHAR(13) + CHAR (10)) as Expr1
FROM(SELECT DISTINCT nd.Notes, nd.Date
FROM dbo.ReleaseTrackerNotes AS nd
INNER JOIN dbo.ReleaseTracker AS ac4 ON ac4.ID = nd.ReleaseTrackerID
WHERE (ac4.ID = ac.ID)) AS z_1) AS vNotes
FROM dbo.ReleaseTracker AS ac
GROUP BY ID
I have tried the ORDER BY but is not working
Here is my table:
CREATE TABLE [dbo].[ReleaseTrackerNotes](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ReleaseTrackerID] [int] NULL,
[AOC_ModelID] [int] NULL,
[Date] [date] NULL,
[Notes] [nvarchar](800) NULL,
CONSTRAINT [PK_ReleaseTrackerNotes] PRIMARY KEY CLUSTERED
CREATE TABLE [dbo].[ReleaseTracker](
[ID] [int] IDENTITY(1,1) NOT NULL,
[AOC_ModelID] [int] NOT NULL,
[MotherboardID] [int] NOT NULL,
[StatusID] [int] NOT NULL,
[TestCateoryID] [int] NULL,
[TestTypeID] [int] NULL,
[DateStarted] [date] NULL,
[DateCompleted] [date] NULL,
[LCS#/ORS#] [nvarchar](20) NULL,
[ETCDate] [date] NULL,
[CardsNeeded] [nvarchar](2) NULL,
CONSTRAINT [PK_Compatibility] PRIMARY KEY CLUSTERED

Use WITHIN GROUP (ORDER BY ...):
SELECT
ID,
STRING_AGG(TRY_CONVERT(varchar, Date, 101) + ': ' + Notes +
CHAR(13) + CHAR(10) + CHAR(13), CHAR(10))
WITHIN GROUP (ORDER BY Date DESC) AS Expr1
FROM
(
SELECT DISTINCT ac4.ID, nd.Notes, nd.Date
FROM dbo.ReleaseTrackerNotes AS nd
INNER JOIN dbo.ReleaseTracker AS ac4
ON ac4.ID = nd.ReleaseTrackerID
) AS vNotes
GROUP BY ID;

SQL Query to update a colums from a table base on a datetime field

I have 2 table called tblSetting and tblPaquets.
I need to update 3 fields of tblPaquets from tblSetting base on a where clause that use a datetime field of tblPaquest and tblSetting.
The sql below is to represent what I am trying to do and I know it make no sense right now.
My Goal is to have One query to achieve this goal.
I need to extract the data from tblSettings like this
SELECT TOP(1) [SupplierID],[MillID],[GradeFamilyID] FROM [tblSettings]
WHERE [DateHeure] <= [tblPaquets].[DateHeure]
ORDER BY [DateHeure] DESC
And Update tblPaquets with this data
UPDATE [tblPaquets]
SET( [SupplierID] = PREVIOUS_SELECT.[SupplierID]
[MillID] = PREVIOUS_SELECT.[MillID]
[GradeFamilly] = PREVIOUS_SELECT.[GradeFamilyID] )
Here the table design
CREATE TABLE [tblSettings](
[ID] [int] NOT NULL,
[SupplierID] [int] NOT NULL,
[MillID] [int] NOT NULL,
[GradeID] [int] NOT NULL,
[TypeID] [int] NOT NULL,
[GradeFamilyID] [int] NOT NULL,
[DateHeure] [datetime] NOT NULL,
[PeakWetEnable] [tinyint] NULL)
CREATE TABLE [tblPaquets](
[ID] [int] IDENTITY(1,1) NOT NULL,
[PaquetID] [int] NOT NULL,
[DateHeure] [datetime] NULL,
[BarreCode] [int] NULL,
[Grade] [tinyint] NULL,
[SupplierID] [int] NULL,
[MillID] [int] NULL,
[AutologSort] [tinyint] NULL,
[GradeFamilly] [int] NULL)

You can do this using CROSS APPLY:
UPDATE p
SET SupplierID = s.SupplierID,
MillID = s.MillID
GradeFamilly = s.GradeFamilyID
FROM tblPaquets p CROSS APPLY
(SELECT TOP (1) s.*
FROM tblSettings s
WHERE s.DateHeure <= p.DateHeure
ORDER BY p.DateHeure DESC
) s;
Notes:
There are no parentheses before SET.
I don't recommend using [ and ] to escape identifiers, unless they need to be escaped.
I presume the query on tblSettings should have an ORDER BY to get the most recent rows.

Speeding up this query

This is my query which takes about 1.5 seconds. Can I lower this?
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY NAME asc) peta_rn,
peta_query.*
FROM
(SELECT
BOOK, PAGETRIMMED, NAME, TYPE, PDF
FROM
CCWiseDocumentNames2 cdn
INNER JOIN
CCWiseInstr2 cwi ON cwi.ID = cdn.ID) as peta_query) peta_paged
WHERE
peta_rn > 1331900 AND peta_rn <= 1331950
These are my table structures:
CREATE TABLE [dbo].[CCWiseDocumentNames2](
[ID] [int] NULL,
[BK_PG] [varchar](50) NULL,
[NAME] [varchar](100) NULL,
[OTHERNAM] [varchar](100) NULL,
[TYPE] [varchar](50) NULL,
[INDEXNAME] [varchar](50) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[CCWiseInstr2](
[ID] [int] NULL,
[BK_PG] [varchar](50) NULL,
[DATE] [datetime] NULL,
[ITYPE] [varchar](50) NULL,
[BOOK] [int] NULL,
[PAGE] [varchar](50) NULL,
[NOBP] [varchar](50) NULL,
[DESC] [varchar](240) NULL,
[TIF] [varchar](50) NULL,
[INDEXNAME] [varchar](50) NULL,
[CONFIRM] [varchar](50) NULL,
[PDF] [varchar](50) NULL,
[PAGETRIMMED] [varchar](10) NULL,
[PageINT] [int] NULL,
[PageCHAR] [varchar](2) NULL,
[IdAuto] [int] NOT NULL
) ON [PRIMARY]
This is my execution plan:
As you can see it is 97% clustered index seek and 3% index scan. Any way to improve this query further?

You can't add rownumber on the fly to more than a million rows and expect a where clause will instantly recognize those rows with the newly generated rownumbers.

Because I don't have that volume of data, can only provide some options for your consideration:
Dedicate the clustered index for Name column (other than ID)
Make the join after you get row_number over name.
Include the three columns from CCWiseInstr2 into a non-clustered index on ID column. This could save some hard disk spindle's movement. Perfomance gain could only be observed with large volume of data.
CREATE NONCLUSTERED INDEX [idx2_ID_include] ON [dbo].[CCWiseInstr2] ([ID] ASC) INCLUDE ( [BOOK], [PDF], [PAGETRIMMED])
GO
With a as (
Select *
from ( SELECT ROW_NUMBER() OVER (ORDER BY NAME asc) as peta_rn, ID,
type
from CCWiseDocumentNames2) as Temp
where peta_rn > 1331900 AND peta_rn <= 1331950
)
select a.peta_rn,
a.type,
b.book,
b.PAGETRIMMED,
b.PDF
from a
join CCWiseInstr2 as b on a.id = b.id

SQL fastest 'GROUP BY' script

Is there any difference in how I edit the GROUP BY command?
my code:
SELECT Number, Id
FROM Table
WHERE(....)
GROUP BY Id, Number
is it faster if i edit it like this:
SELECT Number, Id
FROM Table
WHERE(....)
GROUP BY Number , Id

it's better to use DISTINCT if you don't want to aggregate data. Otherwise, there is no difference between the two queries you provided, it'll produce the same query plan

This examples are equal.
DDL:
CREATE TABLE dbo.[WorkOut]
(
[WorkOutID] [bigint] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[TimeSheetDate] [datetime] NOT NULL,
[DateOut] [datetime] NOT NULL,
[EmployeeID] [int] NOT NULL,
[IsMainWorkPlace] [bit] NOT NULL,
[DepartmentUID] [uniqueidentifier] NOT NULL,
[WorkPlaceUID] [uniqueidentifier] NULL,
[TeamUID] [uniqueidentifier] NULL,
[WorkShiftCD] [nvarchar](10) NULL,
[WorkHours] [real] NULL,
[AbsenceCode] [varchar](25) NULL,
[PaymentType] [char](2) NULL,
[CategoryID] [int] NULL
)
Query:
SELECT wo.WorkOutID, wo.TimeSheetDate
FROM dbo.WorkOut wo
GROUP BY wo.WorkOutID, wo.TimeSheetDate
SELECT DISTINCT wo.WorkOutID, wo.TimeSheetDate
FROM dbo.WorkOut wo
SELECT wo.DateOut, wo.EmployeeID
FROM dbo.WorkOut wo
GROUP BY wo.DateOut, wo.EmployeeID
SELECT DISTINCT wo.DateOut, wo.EmployeeID
FROM dbo.WorkOut wo
Execution plan:

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Select rows not in another table, SQL Server query - sql

SELECT SubjectID, DepartmentID, SubjectName, SubjectDescription, SubjectShortCode FROM BS_Subject WHERE NOT EXISTS (SELECT SubjectToClassID FROM BS_SubjectToClass WHERE BS_Subject.SubjectID = BS_SubjectToClass.SubjectID AND BS_SubjectToClass.ClassID =2)

Related

SQL Server: select records, not linked to another table

I can't figure out how to Order by with string_agg

SQL Query to update a colums from a table base on a datetime field

Speeding up this query

SQL fastest 'GROUP BY' script

Categories

Resources