Eliminating duplicates from a COALESCED column in a stored procedure?

I'm writing a stored procedure. The query I'm using takes rows that are identical in every way except for two columns, combines them into one row, and coalesces those two columns into a single comma-separated column. Now I'm running into another issue: sometimes those two columns contain duplicate values, and I want to eliminate the duplicates.
Example:
TeamID Team City State Equipment
1 Thunder OKC OK Basketball, Basketball, Basketball, Shorts, Jersey, Jersey
I want it to be like this:
TeamID Team City State Equipment
1 Thunder OKC OK Basketball, Shorts, Jersey
Here is the query that I'm using that combines the rows.
SELECT DISTINCT
AssignedOfficeID, AssignedOffice, OperatorID, OperatorName, RigMasterID, DrillerRigNumber, WellID,
County, State, WellName, CompanyMan, CompanyManPhone, DateStart, DateStop, Representative, RepresentativeID, RepresentativeAssignedID, RepresentativeAssigned,
PricePerDay, CotNumber, CustomerOrderTicketNumber,
Equipment = STUFF((SELECT ', ' + COALESCE(RentalEquipmentAbbreviation, EquipmentAbbreviation, '')
FROM #ActiveRigsInfo AS ARI2
WHERE ARI2.AssignedOfficeID = ARI1.AssignedOfficeID AND ARI2.AssignedOfficeID = ARI1.AssignedOfficeID
AND ARI2.OperatorID = ARI1.OperatorID AND ARI2.OperatorName = ARI1.OperatorName
AND ARI2.RigMasterID = ARI1.RigMasterID AND ARI2.DrillerRigNumber = ARI1.DrillerRigNumber
AND ARI2.WellID = ARI1.WellID AND ARI2.County = ARI1.County AND ARI2.State = ARI1.State
AND ARI2.WellName = ARI1.WellName AND ARI2.CompanyMan = ARI1.CompanyMan AND ARI2.CompanyManPhone = ARI1.CompanyManPhone
AND ARI2.DateStart = ARI1.DateStart AND ARI2.Representative = ARI1.Representative
AND ARI2.CotNumber = ARI1.CotNumber
FOR XML PATH(''), TYPE).value('.[1]', 'nvarchar(max)'),1,2,'')
FROM #ActiveRigsInfo AS ARI1
ORDER BY AssignedOffice, OperatorID, RigMasterID;
Is there a way to do this when creating the stored procedure? Or is there a way that I can alter my query to do this?
All I want to do is take out the duplicates from the coalesced columns. The query does as expected except for that.
Thanks. I hope that makes sense.

You need DISTINCT in your subquery:
SELECT DISTINCT
AssignedOfficeID, AssignedOffice, OperatorID, OperatorName, RigMasterID, DrillerRigNumber, WellID,
County, State, WellName, CompanyMan, CompanyManPhone, DateStart, DateStop, Representative, RepresentativeID, RepresentativeAssignedID, RepresentativeAssigned,
PricePerDay, CotNumber, CustomerOrderTicketNumber,
Equipment = STUFF((SELECT DISTINCT ', ' + COALESCE(RentalEquipmentAbbreviation, EquipmentAbbreviation, '')
FROM #ActiveRigsInfo AS ARI2
WHERE ARI2.AssignedOfficeID = ARI1.AssignedOfficeID AND ARI2.AssignedOfficeID = ARI1.AssignedOfficeID
AND ARI2.OperatorID = ARI1.OperatorID AND ARI2.OperatorName = ARI1.OperatorName
AND ARI2.RigMasterID = ARI1.RigMasterID AND ARI2.DrillerRigNumber = ARI1.DrillerRigNumber
AND ARI2.WellID = ARI1.WellID AND ARI2.County = ARI1.County AND ARI2.State = ARI1.State
AND ARI2.WellName = ARI1.WellName AND ARI2.CompanyMan = ARI1.CompanyMan AND ARI2.CompanyManPhone = ARI1.CompanyManPhone
AND ARI2.DateStart = ARI1.DateStart AND ARI2.Representative = ARI1.Representative
AND ARI2.CotNumber = ARI1.CotNumber
FOR XML PATH(''), TYPE).value('.[1]', 'nvarchar(max)'),1,2,'')
FROM #ActiveRigsInfo AS ARI1
ORDER BY AssignedOffice, OperatorID, RigMasterID;
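To see the effect of de-duplicating before concatenating, here is a minimal sketch using Python's built-in sqlite3 module. It is not the SQL Server query above: STUFF and FOR XML PATH are SQL Server-specific, so this uses SQLite's GROUP_CONCAT(DISTINCT ...) as a stand-in, and the team_equipment table is a hypothetical reduction of the example in the question.

```python
import sqlite3

# Hypothetical table modeled on the Thunder example from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team_equipment (team_id INTEGER, team TEXT, equipment TEXT);
INSERT INTO team_equipment VALUES
  (1, 'Thunder', 'Basketball'),
  (1, 'Thunder', 'Basketball'),
  (1, 'Thunder', 'Shorts'),
  (1, 'Thunder', 'Jersey'),
  (1, 'Thunder', 'Jersey');
""")

# DISTINCT is applied to the values before they are concatenated,
# which is the same idea as SELECT DISTINCT inside the FOR XML PATH subquery.
row = conn.execute("""
SELECT team_id, team, GROUP_CONCAT(DISTINCT equipment) AS equipment
FROM team_equipment
GROUP BY team_id, team
""").fetchone()

# SQLite does not guarantee the concatenation order, so sort before comparing.
items = sorted(row[2].split(","))
print(items)
```

Each equipment name appears once per team, whatever order the engine happens to concatenate them in.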

Related

How To Keep Records in Order from Derived Table

I am trying to update a SQL table from a remote DB2 table. There may be multiple updates for the same record, and I need them applied in the order in which they appear in the DB2 table. You cannot use ORDER BY in a derived table. I have tried several different options, but the updates still do not happen in order.
For example:
Change 1 - CUSTOMER NAME = ABCX
Change 2 - CUSTOMER NAME = ABC
After I run the query, the customer name is ABCX when it should be ABC.
I truly do not know what else to try. I've tried temp tables (still derived table), creating a concatenated field with date and time fields, sub-select, row_number() over(order by date, time) and many other things. I'd like to keep it in order by the date and time fields in the remote table.
Any insight would be appreciated.
Thank you.
Here is the basic code I have:
UPDATE A
SET
A.CUSRID = B.RECORD_ID,
A.CUSSTS = B.ACTIVE_CODE,
A.CUSCOM = B.COMPANY_NUMBER,
A.CUSMNM = B.CUSTOMER_NAME,
A.CUSAD1 = B.CUSTOMER_ADDRESS_1,
A.CUSAD2 = B.CUSTOMER_ADDRESS_2,
A.CUSAD3 = B.CUSTOMER_ADDRESS_3,
A.CUSZIP = B.CUSTOMER_ZIP_CODE,
A.CUSZPE = B.CUSZPE_NOT_USED,
A.CUSSTC = B.CUSTOMER_STATE,
A.CUSARA = B.CUSTOMER_AREA_CODE,
A.CUSPHN = B.CUSTOMER_PHONE,
A.CUSB17 = B.CUSB17_NOT_USED,
A.CUSTMT = B.STATEMENT_PRINT_CODE,
A.CUSCRL = B.CREDIT_LIMIT,
A.CUSCRC = B.CREDIT_CODE,
A.CUSMCD = B.CUSMCD_NOT_USED,
A.CUSTX1 = B.TAX_RATE_1,
A.CUSTX2 = B.TAX_RATE_2,
A.CUSTXC = B.TAX_RATE,
A.CUSTXE = B.TAX_EXEMPT_ID,
A.CUSB48 = B.CUSB48_NOT_USED,
A.CUSMDT = B.MAINTENANCE_DATE,
A.CUSB20 = B.CUSB20_NOT_USED,
A.CSSRCH = B.SEARCH_FIELD,
A.CUSBRN = B.BRANCH_ID,
A.CUSDST = B.DISTRIBUTOR_NUMBER,
A.CUSB28 = B.CUSB28_NOT_USED
FROM
dbo.mcusmas A
INNER JOIN (
SELECT
RECORD_ID,
ACTIVE_CODE,
COMPANY_NUMBER,
CUSTOMER_NUMBER,
CUSTOMER_NAME,
CUSTOMER_ADDRESS_1,
CUSTOMER_ADDRESS_2,
CUSTOMER_ADDRESS_3,
CUSTOMER_ZIP_CODE,
CUSZPE_NOT_USED,
CUSTOMER_STATE,
CUSTOMER_AREA_CODE,
CUSTOMER_PHONE,
CUSB17_NOT_USED,
STATEMENT_PRINT_CODE,
CREDIT_LIMIT,
CREDIT_CODE,
CUSMCD_NOT_USED,
TAX_RATE_1,
TAX_RATE_2,
TAX_RATE,
TAX_EXEMPT_ID,
CUSB48_NOT_USED,
MAINTENANCE_DATE,
CUSB20_NOT_USED,
SEARCH_FIELD,
BRANCH_ID,
DISTRIBUTOR_NUMBER,
CUSB28_NOT_USED
FROM remoteserver.MCUSMASPLG
WHERE Event_State_ID = '*New' AND SENT_TO_DATA_WAREHOUSE = 'N'
) B
ON A.CUSMNB = B.CUSTOMER_NUMBER
There is no "change 1" or "change 2". There are only rows and SQL Server arbitrarily ends up using one of them. If you want to control the rows, you should select the one you want in advance:
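FROM dbo.mcusmas A JOIN
(SELECT P.*,
-- DESC numbers the most recent row per customer as 1
ROW_NUMBER() OVER (PARTITION BY CUSTOMER_NUMBER ORDER BY <ordering col> DESC) as seqnum
FROM remoteserver.MCUSMASPLG P
WHERE Event_State_ID = '*New' AND
SENT_TO_DATA_WAREHOUSE = 'N'
) B
-- keep only the chosen row for each customer
ON A.CUSMNB = B.CUSTOMER_NUMBER AND B.seqnum = 1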
I don't know how you are determining which row is the right one. Presumably, some column has this information and you can use it in the ORDER BY.
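As a rough illustration of picking one row per customer before updating, here is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The changes table and its changed_at column are hypothetical stand-ins for the remote table and whatever ordering column you actually have.

```python
import sqlite3

# Hypothetical stand-in for the remote change log; changed_at is an assumed
# ordering column, not one taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE changes (customer_number INTEGER, customer_name TEXT, changed_at TEXT);
INSERT INTO changes VALUES
  (1, 'ABCX', '2020-01-01 09:00:00'),
  (1, 'ABC',  '2020-01-01 10:00:00'),
  (2, 'XYZ',  '2020-01-01 08:30:00');
""")

# Number the rows per customer, newest first, and keep only row 1.
latest = conn.execute("""
SELECT customer_number, customer_name
FROM (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY customer_number
                              ORDER BY changed_at DESC) AS seqnum
    FROM changes AS c
) AS ranked
WHERE seqnum = 1
ORDER BY customer_number
""").fetchall()

print(latest)  # [(1, 'ABC'), (2, 'XYZ')]
```

Customer 1 ends up with ABC, the later of its two changes, which is the result the question asks for.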

postgresql Multiple identical conditions are unified into one parameter

I have a query that needs to convert a string column to an array, and I have to filter on that column. The SQL looks like this:
select
parent_line,
string_to_array(parent_line, '-')
from
bx_crm.department
where
status = 0 and
'851' = ANY(string_to_array(parent_line, '-')) and
array_length(string_to_array(parent_line, '-'), 1) = 5;
parent_line is a varchar(50) column; its data looks like 0-1-851-88.
Questions:
string_to_array(parent_line, '-') appears several times in my SQL.
How many times is string_to_array(parent_line, '-') evaluated for each row: once or three times?
How can I turn string_to_array(parent_line, '-') into something like a parameter? In the end, my SQL might look like this:
depts = string_to_array(parent_line, '-')
select
parent_line,
depts
from
bx_crm.department
where
status = 0 and
'851' = ANY(depts) and
array_length(depts, 1) = 5;
Postgres supports lateral joins which can simplify this logic:
select parent_line, v.parents, status, ... other columns ...
from bx_crm.department d cross join lateral
(values (string_to_array(d.parent_line, '-'))) v(parents)
where d.status = 0 and
cardinality(v.parents) = 5 and
'851' = any(v.parents)
Use a derived table:
select *
from (
select parent_line,
string_to_array(parent_line, '-') as parents,
status,
... other columns ...
from bx_crm.department
) x
where status = 0
and cardinality(parents) = 5
and '851' = any(parents)

Oracle 11g - MERGE and error ORA-30926: unable to get a stable set of rows in the source tables

I have read lots of posts related to the Oracle (11g) error ORA-30926, and I've checked Oracle's documentation on the proper use of the merge statement.
Based on previous threads, I've changed my code to specify a distinct value in the using clause and that is the value I compare in the ON clause. But I still get the ORA-30926 error.
I've also tested the subquery in the USING clause and it returns data without any problem. I've created a temporary table containing only data that meets conditions in the WHERE clause of the USING statement and tried to run that and I still get the error. Both tables have data in them also.
I hope someone can spot something in my code that is incorrect or give me any recommendations on testing.
BEGIN
MERGE
INTO persons myTarget
USING (
select
distinct(USERID),
GIVENNAME,
INITIALS,
SN,
GENERATIONQUALIFIER,
TITLE,
DISPLAYNAME,
TELEPHONENUMBER,
FACSIMILETELEPHONENUMBER,
MOBILE,
OTHERTELEPHONE
from person_updates
WHERE
SN IS NOT NULL
AND LENGTH(SN) < 20
AND SUBSTR(USERID,0,2) IN (SELECT PLACEID FROM code_table)
AND (LENGTH(USERID) = 8 OR LENGTH(USERID) = 10)
) mySource
ON (myTarget.userid = mySource.USERID)
WHEN MATCHED THEN
UPDATE SET myTarget.first_name = UPPER(mySource.GIVENNAME),
myTarget.last_name = UPPER(mySource.SN),
myTarget.generation = UPPER(mySource.GENERATIONQUALIFIER),
myTarget.title = UPPER(mySource.TITLE),
myTarget.display_name = UPPER(mySource.DISPLAYNAME),
myTarget.phone_num = UPPER(mySource.TELEPHONENUMBER),
myTarget.fax_num = UPPER(mySource.FACSIMILETELEPHONENUMBER),
myTarget.mobile_num = UPPER(mySource.MOBILE),
myTarget.dsn_phone = UPPER(mySource.OTHERTELEPHONE);
END;
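As @shrek correctly pointed out, DISTINCT gives you distinct rows across the combination of all the columns you have selected, not distinct USERID values. I have used the ROW_NUMBER analytic function to keep only one row per USERID. Note that ORDER BY NULL keeps an arbitrary row for each USERID; order by a real column if you need a specific one.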
Query:
BEGIN
MERGE
INTO persons myTarget
USING (
select * from(
select
row_number() over(partition by userid order by null) as rn,
USERID,
GIVENNAME,
INITIALS,
SN,
GENERATIONQUALIFIER,
TITLE,
DISPLAYNAME,
EMPLOYEETYPE,
TELEPHONENUMBER,
FACSIMILETELEPHONENUMBER,
MOBILE,
OTHERTELEPHONE
from person_updates
WHERE
SN IS NOT NULL
AND LENGTH(SN) < 20
AND SUBSTR(USERID,0,2) IN (SELECT PLACEID FROM code_table)
AND (LENGTH(USERID) = 8 OR LENGTH(USERID) = 10)) where rn = 1
) mySource
ON (myTarget.userid = mySource.USERID)
WHEN MATCHED THEN
UPDATE SET myTarget.first_name = UPPER(mySource.GIVENNAME),
myTarget.last_name = UPPER(mySource.SN),
myTarget.generation = UPPER(mySource.GENERATIONQUALIFIER),
myTarget.title = UPPER(mySource.TITLE),
myTarget.display_name = UPPER(mySource.DISPLAYNAME),
myTarget.dod_emp_type = UPPER(mySource.EMPLOYEETYPE),
myTarget.phone_num = UPPER(mySource.TELEPHONENUMBER),
myTarget.fax_num = UPPER(mySource.FACSIMILETELEPHONENUMBER),
myTarget.mobile_num = UPPER(mySource.MOBILE),
myTarget.dsn_phone = UPPER(mySource.OTHERTELEPHONE);
END;
Hope this will help.
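To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module rather than Oracle (window functions need SQLite 3.25 or later). The two-column person_updates table and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical two-column reduction of person_updates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_updates (userid TEXT, title TEXT);
INSERT INTO person_updates VALUES
  ('AB123456', 'Mr'),
  ('AB123456', 'Dr'),
  ('CD654321', 'Ms');
""")

# DISTINCT compares whole rows, so both AB123456 rows survive.
distinct_rows = conn.execute(
    "SELECT DISTINCT userid, title FROM person_updates"
).fetchall()

# ROW_NUMBER partitions by userid, so exactly one row per userid is kept.
one_per_user = conn.execute("""
SELECT userid, title
FROM (
    SELECT userid, title,
           ROW_NUMBER() OVER (PARTITION BY userid ORDER BY title) AS rn
    FROM person_updates
) AS ranked
WHERE rn = 1
ORDER BY userid
""").fetchall()

print(len(distinct_rows))  # 3
print(one_per_user)        # [('AB123456', 'Dr'), ('CD654321', 'Ms')]
```

With duplicate userid values still present in the source, a MERGE keyed on userid would match the same target row more than once, which is what ORA-30926 complains about.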

How can i use stuff function for multiple columns in SQL server?

I need to concatenate values from rows that share the same IDs and average another column. Here is the sample table I have:
I need to concatenate the Response column, concatenate the Response Rating column, and average the Rating Avg column for rows that have the same ParticipantId, UserId, QuestionId and ConductedById.
Here is the target data I want:
Here the Response and Response Rating columns are concatenated across the matching rows, and the Rating Avg column is averaged. I have previously concatenated a single column using the STUFF function. Can this be achieved with STUFF as well?
You can do the following. Just group by those columns and make 2 subselects for concatenated columns:
select UserID,
ConductedByID,
QuestionID,
(SELECT STUFF((SELECT ';' + Response
FROM TableName tn2 WHERE tn1.UserID = tn2.UserID and
tn1.ConductedByID = tn2.ConductedByID and
tn1.QuestionID = tn2.QuestionID and
tn1.ParticipantID = tn2.ParticipantID
FOR XML PATH('')) ,1,1,'')) as Response,
(SELECT STUFF((SELECT ';' + cast(Rating as varchar)
FROM TableName tn2 WHERE tn1.UserID = tn2.UserID and
tn1.ConductedByID = tn2.ConductedByID and
tn1.QuestionID = tn2.QuestionID and
tn1.ParticipantID = tn2.ParticipantID
FOR XML PATH('')) ,1,1,'')) as [Response Rating],
AVG(case when Rating = 'n/a' then 0 else cast(Rating as int) end) as [Rating Avg],
ParticipantID
from TableName tn1
group by UserID, ConductedByID, QuestionID, ParticipantID
This works perfectly
STUFF(
(
SELECT DISTINCT ',' + val_name
FROM t_t43_value_set
INNER JOIN t_t43_factory
ON val_id = fac_country
INNER JOIN t_t43_delivery delivery
ON pvs_part_version_id = del_part_version_id
AND pvs_supplier_id = del_supplier_id
AND del_factory_id = fac_factory_id FOR xml path('')),1,1,'') AS 'Country'

SQL Server query: how to do a multiple-column comparison with a subquery

How can I make the update query to work based on the sub query?
How can I compare all these columns in the sub query to the columns in the update statement?
Is there some neat and clean way to do it?
The query I am trying is shown below:
UPDATE Temp_CropData
SET RecordStatus = 0,
Remarks = ISNULL(Remarks, '') +' Duplicate Records'
WHERE
(SELECT Commodity ,City,Period,CropCondition
FROM [Temp_CropData]
GROUP BY DDate,Commodity,City,Period,CropCondition
HAVING count(*) >1)
Try using MERGE:
MERGE INTO Temp_CropData
USING (
SELECT Commodity, City, Period, CropCondition
FROM Temp_CropData
GROUP
BY DDate, Commodity, City, Period, CropCondition
HAVING COUNT(*) > 1
) AS source
ON Temp_CropData.Commodity = source.Commodity
AND Temp_CropData.City = source.City
AND Temp_CropData.Period = source.Period
AND Temp_CropData.CropCondition = source.CropCondition
WHEN MATCHED THEN
UPDATE
SET RecordStatus = 0,
Remarks = ISNULL(Remarks, '') + ' Duplicate Records';
I'm slightly suspicious of the fact that your subquery's SELECT and GROUP BY clauses do not match, though (i.e. DDate is in the GROUP BY but not the SELECT).
Try this:
UPDATE cd
SET RecordStatus = 0,
Remarks = ISNULL(Remarks, '') +' Duplicate Records'
FROM Temp_CropData cd
JOIN (SELECT DDate, Commodity, City, Period, CropCondition
FROM [Temp_CropData]
GROUP BY DDate,Commodity,City,Period,CropCondition
HAVING count(*) >1) dup
ON cd.DDate = dup.DDate AND cd.Commodity=dup.Commodity AND cd.City = dup.City
AND cd.Period = dup.Period AND cd.CropCondition = dup.CropCondition
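If you want to try the idea on a small data set first, here is a sketch using Python's built-in sqlite3 module. It is a variant, not the UPDATE ... FROM ... JOIN above: it uses a correlated COUNT(*) subquery, which avoids join syntax that differs between engines. The crop table is a hypothetical reduction of Temp_CropData with fewer key columns.

```python
import sqlite3

# Hypothetical, reduced version of Temp_CropData (three key columns only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crop (
    id INTEGER PRIMARY KEY,
    ddate TEXT, commodity TEXT, city TEXT,
    record_status INTEGER DEFAULT 1,
    remarks TEXT
);
INSERT INTO crop (ddate, commodity, city) VALUES
  ('2020-01-01', 'Wheat', 'Lahore'),
  ('2020-01-01', 'Wheat', 'Lahore'),
  ('2020-01-02', 'Wheat', 'Lahore');
""")

# Flag every row whose key columns occur more than once in the table.
conn.execute("""
UPDATE crop
SET record_status = 0,
    remarks = COALESCE(remarks, '') || ' Duplicate Records'
WHERE (SELECT COUNT(*)
       FROM crop AS d
       WHERE d.ddate = crop.ddate
         AND d.commodity = crop.commodity
         AND d.city = crop.city) > 1
""")

flagged = conn.execute(
    "SELECT id, record_status, remarks FROM crop ORDER BY id"
).fetchall()
print(flagged)
# [(1, 0, ' Duplicate Records'), (2, 0, ' Duplicate Records'), (3, 1, None)]
```

Both copies of the duplicated row are flagged and the unique row is left alone. Every column used to define a duplicate has to appear in the comparison, which is why the joined subquery needs DDate in its SELECT list.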