SQL loop through a table and find the next part record

Not sure where to start on this one. I inherited a table that has a list of part numbers that are active and inactive. If a part number is inactive, they enter the next valid part number; if it is active, there is no NextPartNumber. They want to search on a part number and find all of the next part numbers in the chain.
Basically the table looks like this.
PartNumber Varchar(20), Active Varchar(3), NextPartNumber Varchar(20).
Problem is I do not know how many part numbers are in the chain. Here is a sample of the data:
PartNumber  Active  NextPartNumber
100X        No      XYZ
XYZ         No      45A6
45A6        Yes
QWER        No      RT98
RT98        No      POUL1
POUL1       No      N9HGT
N9HGT       No      FGH12
FGH12       Yes
I can write a query like this, but since I don't know how many part numbers there are, this won't work.
SELECT A.PartNumber, A.NextPartNumber, B.PartNumber, B.NextPartNumber, C.PartNumber, C.NextPartNumber
FROM tblPartTable AS A
INNER JOIN tblPartTable AS B ON A.PartNumber = B.NextPartNumber
INNER JOIN tblPartTable AS C ON B.PartNumber = C.NextPartNumber
WHERE A.PartNumber = '100X'

With SQL Server (which I'm assuming you're using, since your earlier questions have been about it), you can use a recursive common table expression to get the searched-for part and all of its successors; there is no need to loop manually:
WITH cte AS (
    -- Base condition: where do we start the search?
    SELECT t.* FROM tblPartTable t WHERE t.PartNumber = '100X'
    UNION ALL
    -- Continue condition: how do we find the next part from the current one?
    SELECT t.* FROM tblPartTable t JOIN cte ON t.PartNumber = cte.NextPartNumber
)
SELECT PartNumber, Active FROM cte;
An SQLfiddle to test with.
The same query works on most RDBMSs, the notable exception being MySQL (which only added recursive CTE support in version 8.0).
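One caveat worth adding, not from the answer above: if bad data ever makes the NextPartNumber chain circular, the recursion will hit SQL Server's default limit of 100 levels and raise an error. You can adjust that cap with a query hint; a sketch against the same table:
WITH cte AS (
    SELECT t.* FROM tblPartTable t WHERE t.PartNumber = '100X'
    UNION ALL
    SELECT t.* FROM tblPartTable t JOIN cte ON t.PartNumber = cte.NextPartNumber
)
SELECT PartNumber, Active
FROM cte
OPTION (MAXRECURSION 1000); -- allow chains up to 1000 parts; the default cap is 100
Note that this only bounds a runaway loop; it does not detect the cycle itself.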


Loop through table and update a specific column

I have the following table:
Id  Category
1   some thing
2   value
This table contains a lot of rows, and what I'm trying to do is update all the Category values so that the first letter of every word is capitalized. For example, some thing should become Some Thing.
At the moment this is what I have:
UPDATE MyTable
SET Category = (SELECT UPPER(LEFT(Category,1))+LOWER(SUBSTRING(Category,2,LEN(Category))) FROM MyTable WHERE Id = 1)
WHERE Id = 1;
But there are two problems. The first is that the capitalization only works for single-word values (hello => Hello, but hello world => Hello world). The second is that I would need to run this query X times, once for each Id. So my question is: how can I update all X rows at once? I was thinking of a cursor, but I don't have much experience with them.
Here is a fiddle to play with.
You can split the words apart, apply the capitalization, then munge the words back together. And no, you shouldn't be worrying about subqueries and Id: always approach updating a set of rows as a set-based operation, not one row at a time.
;WITH cte AS
(
    SELECT Id, NewCat = STRING_AGG(CONCAT(
               UPPER(LEFT(value, 1)),
               SUBSTRING(value, 2, LEN(value))), ' ')
           -- STRING_SPLIT guarantees no order, so sort the words by their
           -- position in the original string when gluing them back together
           WITHIN GROUP (ORDER BY CHARINDEX(value, Category))
    FROM
    (
        SELECT t.Id, t.Category, s.value
        FROM dbo.MyTable AS t
        CROSS APPLY STRING_SPLIT(Category, ' ') AS s
    ) AS x
    GROUP BY Id
)
UPDATE t
SET t.Category = cte.NewCat
FROM dbo.MyTable AS t
INNER JOIN cte ON t.Id = cte.Id;
This assumes your category doesn't have non-consecutive duplicate words within it; for example, bora frickin bora would get messed up (whereas bora bora frickin would be fine). It also assumes a case-insensitive collation (which could be catered to if necessary).
In Azure SQL Database you can use the new enable_ordinal argument to STRING_SPLIT() but, for now, you'll have to rely on hacks like CHARINDEX().
Updated db<>fiddle (thank you for the head start!)
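For completeness, here is a sketch of the ordinal-based version mentioned above; it assumes Azure SQL Database or SQL Server 2022+, where STRING_SPLIT() accepts the third enable_ordinal argument:
;WITH cte AS
(
    SELECT t.Id, NewCat = STRING_AGG(CONCAT(
               UPPER(LEFT(s.value, 1)),
               SUBSTRING(s.value, 2, LEN(s.value))), ' ')
           WITHIN GROUP (ORDER BY s.ordinal) -- true word position, no CHARINDEX hack
    FROM dbo.MyTable AS t
    CROSS APPLY STRING_SPLIT(t.Category, ' ', 1) AS s -- third argument enables the ordinal column
    GROUP BY t.Id
)
UPDATE t
SET t.Category = cte.NewCat
FROM dbo.MyTable AS t
INNER JOIN cte ON t.Id = cte.Id;
Because the ordering comes from the split itself, the duplicate-word caveat above also goes away.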

SQL Server Comparing 2 tables having Millions of Records

I have 2 tables in SQL Server: Table 1 and Table 2.
Table 1 has 500 records and Table 2 has millions of records.
Table 2 may or may not contain the 500 records of Table 1.
I have to compare Table 1 and Table 2, but the result should contain only the records of Table 1 that have some data change in Table 2, meaning the result should be at most 500 rows.
I don't have a primary key, but the columns in the 2 tables are the same. I have written the following query, but I am getting a timeout exception and it takes a long time to process. Please help.
WITH CTE_DUPLICATE (OLD_FIRSTNAME, New_FirstName,
                    OLD_LASTNAME, New_LastName,
                    OLD_MINAME, New_MIName,
                    OLD_FAMILYID, NEW_FAMILYID, ROWNUMBER)
AS (
    SELECT DISTINCT
        OLD.FIRST_NAME AS 'OLD_FIRSTNAME', New.First_Name AS 'NEW_FIRSTNAME',
        OLD.LAST_NAME AS 'OLD_LASTNAME', New.Last_Name AS 'NEW_LASTNAME',
        OLD.MI_NAME AS 'OLD_MINAME', New.MI_Name AS 'NEW_MINAME',
        OLD.FAMILY_ID AS 'OLD_FAMILYID', NEW.FAMILY_ID AS 'NEW_FAMILYID',
        ROW_NUMBER() OVER (PARTITION BY OLD.FIRST_NAME, New.First_Name,
                                        OLD.LAST_NAME, New.Last_Name,
                                        OLD.MI_NAME, New.MI_Name,
                                        OLD.FAMILY_ID, NEW.FAMILY_ID
                           ORDER BY OLD.FIRST_NAME, New.First_Name,
                                    OLD.LAST_NAME, New.Last_Name,
                                    OLD.MI_NAME, New.MI_Name,
                                    OLD.FAMILY_ID, NEW.FAMILY_ID) AS rank
    FROM EEMSCDBStatic OLD, EEMS_VIPFILE New
    WHERE OLD.MPID <> New.MPID AND OLD.FIRST_NAME <> New.First_Name
      AND OLD.LAST_NAME <> New.Last_Name AND OLD.MI_NAME <> New.MI_Name
      AND OLD.Family_Id <> New.Family_id
)
SELECT OLD_FIRSTNAME, New_FirstName,
       OLD_LASTNAME, New_LastName,
       OLD_MINAME, New_MIName,
       OLD_FAMILYID, NEW_FAMILYID
FROM CTE_DUPLICATE
WHERE ROWNUMBER = 1
I think the main problem here is that your query forces the database to fully multiply your tables, which means processing roughly 500 million combinations. That happens because you're pairing every record from Table 1 with every record from Table 2 that has at least one differing value, including MPID, which looks like the unique identifier that should be used to connect the records.
If MPID really is the column that identifies records in both tables, then your query should have a somewhat different structure:
SELECT old.FIRST_NAME, new.First_Name,
       old.LAST_NAME, new.Last_Name,
       old.MI_NAME, new.MI_Name,
       old.FAMILY_ID, new.FAMILY_ID
FROM EEMSCDBStatic old
INNER JOIN EEMS_VIPFILE new ON old.MPID = new.MPID
WHERE old.FIRST_NAME <> new.First_Name
  AND old.LAST_NAME <> new.Last_Name
  AND old.MI_NAME <> new.MI_Name
  AND old.Family_Id <> new.Family_id
ORDER BY old.FIRST_NAME, new.First_Name,
         old.LAST_NAME, new.Last_Name,
         old.MI_NAME, new.MI_Name,
         old.FAMILY_ID, new.FAMILY_ID
A couple of other thoughts:
If you're looking for any change in a record (even if only one column differs), you should use ORs in the WHERE clause, not ANDs. As written, the query only finds records where every column changed; for instance, it would fail to find a person who changed their first name but kept their last name. A sketch of that variant follows this list.
You should also consider indexing your tables if possible.
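For example (a sketch of the first point, reusing the table and column names from the corrected query above):
SELECT old.FIRST_NAME, new.First_Name,
       old.LAST_NAME, new.Last_Name,
       old.MI_NAME, new.MI_Name,
       old.FAMILY_ID, new.FAMILY_ID
FROM EEMSCDBStatic old
INNER JOIN EEMS_VIPFILE new ON old.MPID = new.MPID -- join on the identifier
WHERE old.FIRST_NAME <> new.First_Name             -- flag a row when ANY column differs
   OR old.LAST_NAME <> new.Last_Name
   OR old.MI_NAME <> new.MI_Name
   OR old.Family_Id <> new.Family_id;
Note that <> treats NULL as unknown, so if any of these columns are nullable you would also need IS NULL checks.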
Surely it is pointless to use the DISTINCT keyword together with ROW_NUMBER(); see sql query distinct with Row_Number.
You are doing a CROSS JOIN, which is terribly big in your case.
Perhaps in this condition:
where OLD.MPID <> New.MPID and old.FIRST_NAME <> New.First_Name and ...
you wanted OR instead of AND?
It is also not entirely clear why you use ROW_NUMBER() at all; perhaps to find the best match?
All of this is because, as @Shnugo correctly remarked, the logic behind your comparison is flawed: you must define some logic that joins the tables (like requiring the first and last names to be the same).

Access SQL query that uses a field from one table within where clause to create a calculated field

This really isn't that complicated, but I can't wrap my head around it. I've searched thoroughly, but even coming up with a search string or title for this question was hard enough. The explanation will tell all:
Table "labelDrops" has fields "PO" and "Qty" (among other unrelated fields)
Table "imageLiveCount" has field "PO"
I want to count instances (Count(imageLiveCount.id)) where imageLiveCount.PO = labelDrops.PO and then subtract that count from Qty to create a calculated field called QtyLeft.
The result would basically look like "SELECT PO, Qty, answerToThisQuestion AS QtyLeft FROM labelDrops"
Start by creating a query which gives you the count for each PO in your imageLiveCount table:
SELECT i.PO, Count(*) AS CountOfInstances
FROM imageLiveCount AS i
GROUP BY i.PO
If that query returns what you want, join it to the labelDrops table and compute QtyLeft:
SELECT
l.PO,
l.Qty,
(l.Qty - sub.CountOfInstances) AS QtyLeft
FROM
labelDrops AS l
INNER JOIN
(
SELECT i.PO, Count(*) AS CountOfInstances
FROM imageLiveCount AS i
GROUP BY i.PO
) AS sub
ON l.PO = sub.PO
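One caveat, not from the original answer: the INNER JOIN drops any labelDrops row whose PO has no rows in imageLiveCount at all. If those POs should appear with their full Qty remaining, a LEFT JOIN plus Access's Nz() function handles it (a sketch, assuming the same table and column names):
SELECT
    l.PO,
    l.Qty,
    (l.Qty - Nz(sub.CountOfInstances, 0)) AS QtyLeft
FROM
    labelDrops AS l
    LEFT JOIN
    (
        SELECT i.PO, Count(*) AS CountOfInstances
        FROM imageLiveCount AS i
        GROUP BY i.PO
    ) AS sub
    ON l.PO = sub.PO;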

Efficiently checking if more than one row exists in a recordset

I'm using SQL Server 2008 R2, and I'm trying to find an efficient way to test if more than 1 row exists in a table matching a condition.
The naive way to do it is a COUNT:
IF ( SELECT COUNT(*)
     FROM Table
     WHERE Column = <something>
   ) > 1
BEGIN
    ...
END
But this requires actually computing a COUNT, which is wasteful. I just want to test for more than 1.
The only thing I've come up with is a COUNT on a TOP 2:
IF ( SELECT COUNT(*)
     FROM ( SELECT TOP 2 0 AS x
            FROM Table
            WHERE Column = <something>
          ) AS x
   ) > 1
BEGIN
    ...
END
This is clunky and requires commenting to document. Is there a more terse way?
If you have a PK in the table you're checking for more than one row, you can nest another EXISTS clause. I'm not sure whether this is faster, but it achieves the result you want. For example, assume a Station table with a PK named ID and zero-to-many Location records per station, where Location has a PK named ID and an FK StationID, and you want the stations with at least two locations:
SELECT s.ID
FROM Station s
WHERE EXISTS (
    SELECT 1
    FROM Location L
    WHERE L.StationID = s.ID
    AND EXISTS (
        SELECT 1
        FROM Location L2
        WHERE L2.StationID = L.StationID
        AND L2.ID <> L.ID
    )
)
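For comparison (not from the original answer), the same "stations with at least two locations" result can be expressed as a set-based aggregate, which the optimizer often handles well:
SELECT L.StationID
FROM Location L
GROUP BY L.StationID
HAVING COUNT(*) > 1;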
I have the following solution to share, which may be lighter performance-wise.
I assume you're trying to fetch the first record and process it once you're sure the selection returns a single record. So go ahead and fetch it, but then immediately try to fetch the next record; if that succeeds, you know more than one record exists and can start your exception-handling logic. Otherwise, you can still process your single record.
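A minimal T-SQL sketch of that fetch-twice idea (all table, column, and variable names here are placeholders, not from the question):
DECLARE @something int = 42;        -- placeholder search value
DECLARE @matches TABLE (Id int);

-- Pull at most two rows: two is enough to know whether "more than one" exists
INSERT INTO @matches (Id)
SELECT TOP (2) Id
FROM dbo.SomeTable
WHERE SomeColumn = @something;

IF @@ROWCOUNT > 1
BEGIN
    PRINT 'More than one match: run the exception logic.';
END
ELSE
BEGIN
    PRINT 'Zero or one match: process the single record.';
END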

How can I stop joins from adding rows in my match query?

I'm having difficulty translating what I want into set-based SQL, since I think imperatively. Basically, I have a table of forms and a table of expectations. In the Expectation view, I want to look through the Forms table and report whether each expectation found a match. However, when I try to use joins to accomplish this, the joins add rows to the Expectation results when two or more forms match. I do not want this.
In an imperative fashion, I want the equivalent of this:
ForEach (row in Expectation table)
{
if (any form in the Form table matches the criteria)
{
MatchID = form.ID;
SignDate = form.SignDate;
...
}
}
What I have in SQL is this:
SELECT
    e.*, match.MatchID, match.MatchSignDate, ...
FROM
    POFDExpectation e
    LEFT OUTER JOIN
    (SELECT MIN(ID) as MatchID, MIN(SignDate) as MatchSignDate,
            COUNT(*) as MatchCount, ...
     FROM Form f
     GROUP BY (matching criteria columns)
    ) match
    ON (match.[match criteria] = e.[match criteria])
Which works okay, but very slowly, and every time there are two or more matches, an extra row is added to the Expectation results. Mathematically I understand that a join is a cross product and this is expected, but I'm unsure how to do it without one. A subquery, perhaps?
I'm not able to give too many further details about the implementation, but I'll be happy to try any suggestion and respond with the results. I have 880 Expectation rows, and 942 results are being returned. If I only allow results that match one form, I get 831 results. Neither is desirable, so if yours gets me to exactly 880, yours is the accepted answer.
Edit: I am using SQL Server 2008 R2, though a generic solution would be best.
Sample code:
--DROP VIEW ExpectationView; DROP TABLE Forms; DROP TABLE Expectations;
--Create Tables and View
CREATE TABLE Forms (ID int IDENTITY(1,1) PRIMARY KEY, ReportYear int, Name varchar(100), Complete bit, SignDate datetime)
GO
CREATE TABLE Expectations (ID int IDENTITY(1,1) PRIMARY KEY, ReportYear int, Name varchar(100))
GO
CREATE VIEW ExpectationView AS
SELECT e.*, filed.MatchID, filed.SignDate,
       ISNULL(filed.FiledCount, 0) AS FiledCount,
       ISNULL(name.NameCount, 0) AS NameCount
FROM Expectations e
LEFT OUTER JOIN
    (SELECT MIN(ID) AS MatchID, ReportYear, Name, Complete,
            MIN(SignDate) AS SignDate, COUNT(*) AS FiledCount
     FROM Forms f
     GROUP BY ReportYear, Name, Complete) filed
    ON filed.ReportYear = e.ReportYear AND filed.Name LIKE '%'+e.Name+'%' AND filed.Complete = 1
LEFT OUTER JOIN
    (SELECT MIN(ID) AS MatchID, ReportYear, Name, COUNT(*) AS NameCount
     FROM Forms f
     GROUP BY ReportYear, Name) name
    ON name.ReportYear = e.ReportYear AND name.Name LIKE '%'+e.Name+'%'
GO
--Insert Test Data
INSERT INTO Forms (ReportYear, Name, Complete, SignDate)
SELECT 2011, 'Bob Smith', 1, '2012-03-01' UNION ALL
SELECT 2011, 'Bob Jones', 1, '2012-10-04' UNION ALL
SELECT 2011, 'Bob', 1, '2012-07-20'
GO
INSERT INTO Expectations (ReportYear, Name)
SELECT 2011, 'Bob'
GO
SELECT * FROM ExpectationView --Should only return 1 result, returns 9
The 'filed' join shows that they have completed a form; 'name' shows that they may have started one but not finished it. My view has four different sets of match criteria, each a little stricter than the last, and counts each: 'Name Only Matches', 'Loose Matches', 'Matches' (the default), and 'Tight Matches' (used if there is more than one default match).
This is how I do it when I want to keep to a JOIN-type query format:
SELECT
    e.*,
    match.MatchID,
    match.MatchSignDate,
    ...
FROM POFDExpectation e
OUTER APPLY (
    SELECT TOP 1
        MIN(ID) as MatchID,
        MIN(SignDate) as MatchSignDate,
        COUNT(*) as MatchCount,
        ...
    FROM Form f
    WHERE f.[match criteria] = e.[match criteria]
    GROUP BY (matching criteria columns)
    -- Add ORDER BY here to control which row is TOP 1
) match
It usually performs better as well.
Semantically, {CROSS|OUTER} APPLY (table-expression) specifies a table-expression that is called once for each row in the preceding table expressions of the FROM clause and then joined to them. Pragmatically, however, the compiler treats it almost identically to a JOIN.
The practical difference is that unlike a JOIN table-expression, the APPLY table-expression is dynamically re-evaluated for each row. So instead of an ON clause, it relies on its own logic and WHERE clauses to limit/match its rows to the preceding table-expressions. This also allows it to make reference to the column-values of the preceding table-expressions, inside its own internal subquery expression. (This is not possible in a JOIN)
The reason that we want this here, instead of a JOIN, is that we need a TOP 1 in the sub-query to limit its returned rows, however, that means that we need to move the ON clause conditions to the internal WHERE clause so that it will get applied before the TOP 1 is evaluated. And that means that we need an APPLY here, instead of the more usual JOIN.
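To make that concrete against the sample Forms/Expectations tables from the question (my sketch, not part of the original answer; it borrows the ReportYear/Name/Complete criteria from the view):
SELECT e.ID, e.ReportYear, e.Name,
       m.MatchID, m.SignDate, m.FiledCount
FROM Expectations AS e
OUTER APPLY (
    SELECT
        MIN(f.ID) AS MatchID,       -- pick the lowest matching form ID
        MIN(f.SignDate) AS SignDate,
        COUNT(*) AS FiledCount      -- how many forms matched
    FROM Forms AS f
    WHERE f.ReportYear = e.ReportYear
      AND f.Name LIKE '%' + e.Name + '%'
      AND f.Complete = 1
) AS m;
Because the aggregate subquery always yields exactly one row, this returns one row per expectation (880 in the question's terms) no matter how many forms match.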
@RBarryYoung answered the question as I asked it, but there was a second question that I didn't make very clear. What I really wanted was a combination of his answer and another question's approach, so for the record, here's what I used:
SELECT
    e.*,
    ...
    match.MatchID,
    match.MatchSignDate,
    match.MatchCount
FROM
    POFDExpectation e
    OUTER APPLY (
        SELECT TOP 1
            ID as MatchID,
            ReportYear,
            ...
            SignDate as MatchSignDate,
            COUNT(*) OVER () as MatchCount
        FROM
            Form f
        WHERE
            f.[match criteria] = e.[match criteria]
        -- Add ORDER BY here to control which row is TOP 1
    ) match