Combining duplicate records in SQL Server - sql

I have a table in SQL Server 2012 that holds a list of parts, the location of each part, and the quantity on hand. The problem is that someone put a space in front of the location when they added a record to the database, so the same part and location now exist as two records.
I need to create a job that will find the parts with spaces before the location and add those parts to the identical parts without spaces in front of the location. I'm not quite sure where to even start with this.
This is the before:
Partno | PartRev | Location | OnHand | Identity_Column
--------------------------------------------------------------------
0D6591D 000 MV3 55.000 103939
0D6591D 000 MV3 -55.000 104618
This is what I would like to have after the job ran:
Partno | PartRev | Location | OnHand | Identity_Column
--------------------------------------------------------------------
0D6591D 000 MV3 0 104618

Two steps: 1. update the records with the correct locations, 2. delete the records with the wrong locations.
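-- Step 1: add the quantities of the space-prefixed rows to the matching clean rows.
-- SQL Server 2012 has no TRIM function (it was added in SQL Server 2017), so use LTRIM.
update mytable
set onhand = onhand +
(
select coalesce(sum(wrong.onhand), 0)
from mytable wrong
where wrong.location like ' %'
and ltrim(wrong.location) = mytable.location
-- match the same part, so quantities of different parts are not mixed
and wrong.partno = mytable.partno
and wrong.partrev = mytable.partrev
)
where location not like ' %';
-- Step 2: remove the space-prefixed rows.
-- Note: this also deletes space-prefixed rows that have no clean counterpart.
delete from mytable where location like ' %';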
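The two steps can be tried out on the sample rows from the question. The following is only a sketch: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, the table name mytable and the lower-case column names are assumptions, and matching on partno and partrev is an extra condition assumed here so that different parts at the same location are not merged.

```python
import sqlite3


def merge_space_prefixed(conn):
    """Fold quantities of rows whose location starts with a space into the clean rows."""
    # Step 1: add the quantities of the space-prefixed rows to the matching clean rows.
    conn.execute("""
        UPDATE mytable
        SET onhand = onhand + (
            SELECT COALESCE(SUM(wrong.onhand), 0)
            FROM mytable AS wrong
            WHERE wrong.location LIKE ' %'
              AND LTRIM(wrong.location) = mytable.location
              AND wrong.partno = mytable.partno
              AND wrong.partrev = mytable.partrev
        )
        WHERE location NOT LIKE ' %'
    """)
    # Step 2: remove the space-prefixed rows.
    conn.execute("DELETE FROM mytable WHERE location LIKE ' %'")
    conn.commit()


# In-memory SQLite database standing in for the SQL Server table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (partno TEXT, partrev TEXT, location TEXT, "
    "onhand REAL, identity_column INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("0D6591D", "000", " MV3", 55.0, 103939),
        ("0D6591D", "000", "MV3", -55.0, 104618),
    ],
)

merge_space_prefixed(conn)
rows = conn.execute(
    "SELECT partno, partrev, location, onhand, identity_column FROM mytable"
).fetchall()
print(rows)
```

Only the row with identity 104618 remains, with an on-hand quantity of zero, which is the result asked for in the question.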

You can group the rows and use a HAVING clause to identify the duplicated records. I've used REPLACE to replace spaces with empty strings in the location column; you could also use LTRIM and RTRIM:
CREATE TABLE #Sample
(
[Partno] VARCHAR(7) ,
[PartRev] INT ,
[Location] VARCHAR(5) ,
[OnHand] INT ,
[Identity_Column] INT
);
INSERT INTO #Sample
([Partno], [PartRev], [Location], [OnHand], [Identity_Column])
VALUES
('0D6591D', 000, ' MV3', 55.000, 103939),
('0D6591D', 000, 'MV3', -55.000, 104618)
;
SELECT Partno ,
PartRev ,
REPLACE( Location, ' ', '') Location,
SUM(OnHand) [OnHand]
FROM #Sample
GROUP BY REPLACE(Location, ' ', '') ,
Partno ,
PartRev
HAVING COUNT(Identity_Column) > 1;
DROP TABLE #Sample;
Produces:
Partno PartRev Location OnHand
0D6591D 0 MV3 0

Related

SQL - How to use Comma separated column values in a where clause

I have a table called Configuration. It contains the values like below,
Id SourceColumns TargetColumns SourceTable TargetTable
1 Name, Age CName, CAge STable TTable
2 EId EmplId EmpTable TTable
In a stored procedure, I have to get the column names from the above table and I have to compare the source table and target table.
I am able to do that easily for the 2nd record as it has only one column name, so in the where clause I can write sourcecolumn = targetcolumn like,
SELECT
EId
, EmplId
FROM
EmpTable E
JOIN TTable T ON E.Eid = T.EmplId
The first record in the table has 2 columns separated by comma (,).
I have to compare like this,
SELECT
Name
, Age
FROM
STable S
JOIN TTable T ON S.Name = T.CName AND S.Age = T.CAge
In some cases the source columns and target columns may have more column names separated by comma(,)
Please help me on this.
As I don't know whether you have completely understood the data model I suggested in the request comments and in order to properly answer the question:
Your table is not normalized, as the data in the columns SourceColumns and TargetColumns is not atomic. And one even has to interpret the data (the separator is the comma and the nth element in one column relates to the nth element in the other column).
This is what your tables should look like instead (the create statements are pseudo code):
create table configuration_tables
(
id_configuration_tables int,
source_table text,
target_table text,
primary key (id_configuration_tables),
unique key (source_table),
unique key (target_table) -- or not? in your sample two source tables map to the same target table
);
create table configuration_columns
(
id_configuration_columns int,
id_configuration_tables int,
source_column text,
target_column text,
primary key (id_configuration_columns),
foreign key (id_configuration_tables) references configuration_tables (id_configuration_tables)
);
Your sample data would then become
configuration_tables
id_configuration_tables | source_table | target_table
------------------------+--------------+-------------
1 | STable | TTable
2 | EmpTable | TTable
configuration_columns
id_configuration_columns | id_configuration_tables | source_column | target_column
-------------------------+-------------------------+---------------+--------------
1 | 1 | Name | CName
2 | 1 | Age | CAge
3 | 2 | EId | EmplId
As of SQL Server 2017 you can use STRING_AGG to build your queries. In earlier versions this is also possible with a STRING_AGG emulation, which you will easily find with Google or on SO.
select
'select s.' + string_agg (c.source_column + ', t.' + c.target_column, ', ') +
' from ' + t.source_table + ' s' +
' join ' + t.target_table + ' t' +
' on ' + string_agg('t.' + c.target_column + ' = s.' + c.source_column, ' and ') +
';' as query
from configuration_tables t
join configuration_columns c on c.id_configuration_tables = t.id_configuration_tables
group by t.source_table, t.target_table
order by t.source_table, t.target_table;
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=8866b2485ba9bba92c2391c67bb8cae0
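The same query generation can be sketched outside the database. The snippet below is only an illustration in Python: the dictionaries and the build_queries helper are hypothetical names, the data mirrors the normalized sample tables, and identifiers are concatenated without quoting, so it assumes trusted table and column names.

```python
from collections import defaultdict

# Normalized configuration, mirroring the sample data (structure is an assumption).
configuration_tables = {
    1: ("STable", "TTable"),
    2: ("EmpTable", "TTable"),
}
configuration_columns = [
    (1, "Name", "CName"),
    (1, "Age", "CAge"),
    (2, "EId", "EmplId"),
]


def build_queries(tables, columns):
    """Build one join query per source/target table pair."""
    # Collect the (source_column, target_column) pairs for each table mapping.
    pairs = defaultdict(list)
    for table_id, source_column, target_column in columns:
        pairs[table_id].append((source_column, target_column))

    queries = []
    # Order by source table, then target table, like the ORDER BY in the SQL version.
    for table_id, (source_table, target_table) in sorted(
        tables.items(), key=lambda item: item[1]
    ):
        cols = pairs[table_id]
        select_list = ", ".join(f"s.{s}, t.{t}" for s, t in cols)
        on_clause = " and ".join(f"t.{t} = s.{s}" for s, t in cols)
        queries.append(
            f"select {select_list} from {source_table} s "
            f"join {target_table} t on {on_clause};"
        )
    return queries


queries = build_queries(configuration_tables, configuration_columns)
for query in queries:
    print(query)
```

Because each column pair is its own row, a mapping with any number of columns is handled by the same two join calls, which is the point of normalizing the configuration.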

how to preserve column names on dynamic pivot

Sales data contains dynamic product names which can contain any characters.
Dynamic pivot table is created based on sample from
Crosstab with a large or undefined number of categories
translate() is used to remove bad characters.
In result pivot table column names are corrupted: missing characters and spaces are removed.
How to return data with same column names as in source data ?
I tried to use
quote_ident(productname) as tootjakood,
instead of
'C'||upper(Translate(productname,'Ø. &/+-,%','O')) as tootjakood,
but it returns error
ERROR: column "Ø, 12.3/3mm" does not exist
testcase:
create temp table sales ( saledate date, productname char(20), quantity int );
insert into sales values ( '2016-1-1', 'Ø 12.3/3mm', 2);
insert into sales values ( '2016-1-1', '+-3,4%/3mm', 52);
insert into sales values ( '2016-1-3', '/3,2m-', 246);
do $do$
declare
voter_list text;
begin
create temp table myyk on commit drop as
select saledate as kuupaev,
'C'||upper(Translate(productname,'Ø. &/+-,%','O')) as tootjakood,
sum(quantity)::int as kogus
from sales
group by 1,2
;
drop table if exists pivot;
voter_list := (
select string_agg(distinct tootjakood, ' ' order by tootjakood) from myyk
);
execute(format('
create table pivot (
kuupaev date,
%1$s
)', (replace(voter_list, ' ', ' integer, ') || ' integer')
));
execute (format($f$
insert into pivot
select
kuupaev,
%2$s
from crosstab($ct$
select
kuupaev,tootjakood,kogus
from myyk
order by 1
$ct$,$ct$
select distinct tootjakood
from myyk
order by 1
$ct$
) as (
kuupaev date,
%4$s
);$f$,
replace(voter_list, ' ', ' + '),
replace(voter_list, ' ', ', '),
'',
replace(voter_list, ' ', ' integer, ') || ' integer' -- 4.
));
end; $do$;
select * from pivot;
Postgres 9.1 is used.
You should use double quotes. Because you are using spaces as the separator between column names, you should either remove spaces from the column names or choose a different separator.
With
...
select saledate as kuupaev,
format ('"%s"', replace (upper(productname), ' ', '')) as tootjakood,
sum(quantity)::int as kogus
from sales
...
you'll get:
kuupaev | /3,2M- | +-3,4%/3MM | O12.3/3MM
------------+--------+------------+-----------
2016-01-01 | | 52 | 2
2016-01-03 | 246 | |
(2 rows)
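One way to avoid the separator problem entirely is to keep the column names in a list and quote each one before joining. The sketch below shows the idea in Python; quote_ident here is a simplified stand-in that always quotes (PostgreSQL's own quote_ident quotes only when needed), and column_definitions is a hypothetical helper.

```python
def quote_ident(name):
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def column_definitions(names, sql_type="integer"):
    """Return a comma-separated column definition list with quoted names."""
    # rstrip() drops the padding that a char(20) column would add to each name.
    unique_names = sorted({name.rstrip() for name in names})
    return ", ".join(f"{quote_ident(name)} {sql_type}" for name in unique_names)


product_names = ["Ø 12.3/3mm", "+-3,4%/3mm", "/3,2m-"]
definitions = column_definitions(product_names)
print(definitions)
```

Since the names are never packed into a single space-separated string, spaces inside a product name survive unchanged.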

How to concatenate the "overflow" of fields with character limits

I have a table with 3 address fields, and each address field has a limit of 100 characters.
I need to create a query to make the maximum character limit for each address field to be 30 characters long. If one address field is > 30 then I'll cut off the rest, but take the remainder and concatenate it onto the beginning of the next address field. I would do this until the last address field (address3) is filled up and then just get rid of the remainder on the last address field.
Is there a way to do this with an SQL query or with T-SQL?
You don't specify what to do with very short addresses, but my first crack at it would be something like this:
with temp as
(
select 1 id, 'abcdefghijklmnopqrstuvwxyz123456789' part1, 'second part' part2, 'third part' part3
),
concated as
(
SELECT id, part1 + part2 + part3 as whole
FROM temp
)
select id,
-- SUBSTRING positions are 1-based in T-SQL, so the 30-character pieces start at 1, 31 and 61
SUBSTRING(whole, 1, 30) f,
SUBSTRING(whole, 31, 30) s,
SUBSTRING(whole, 61, 30) t
from concated
This returns:
id | f | s | t
1 | abcdefghijklmnopqrstuvwxyz1234 | 56789second partthird part |
If that's not what you're looking for please specify the desired output for the above.
UPDATE:
Well... this appears to work but it's pretty gross. I'm sure someone can come up with a better solution.
with temp as
(
select 1 id, 'abcdefghijklmnopqrstuvwxyz123456789 ' part1, 'second part' part2, 'third part' part3
)
select id,
-- SUBSTRING positions are 1-based in T-SQL: characters 1-30 stay, the overflow starts at 31
SUBSTRING(part1, 1, 30) f,
SUBSTRING(SUBSTRING(part1, 31, 70) + SUBSTRING(part2, 1,30),1,30) s,
SUBSTRING(SUBSTRING(SUBSTRING(SUBSTRING(part1, 31, 70) + SUBSTRING(part2, 1,30),31,70),1,30) + SUBSTRING(part3, 1,30),1,30) t
from temp
I think I'd go with the problem description, and write something that's "obviously" correct (provided I've understood your spec :-))
/* Setup data - second example stolen from Abe, first just showing that it works with short enough data */
declare @t table (ID int not null,Address1 varchar(100) not null,Address2 varchar(100) not null,Address3 varchar(100) not null)
insert into @t (ID,Address1,Address2,Address3)
values (1,'abc','def','ghi'),
(2,'abcdefghijklmnopqrstuvwxyz123456789 ', 'second part', 'third part')
/* Actual query - shift address pieces through the address fields, but only to later ones */
;with Shift1 as (
select
ID,SUBSTRING(Address1,1,30) as Address1,SUBSTRING(Address1,31,70) as Address1Over,Address2,Address3
from @t
), Shift2 as (
select
ID,Address1,SUBSTRING(Address1Over+Address2,1,30) as Address2,SUBSTRING(Address1Over+Address2,31,70) as Address2Over,Address3
from Shift1
), Shift3 as (
select
ID,Address1,Address2,SUBSTRING(Address2Over+Address3,1,30) as Address3
from Shift2
)
select * from Shift3
Result:
ID Address1 Address2 Address3
----------- ------------------------------ ------------------------------ ------------------------------
1 abc def ghi
2 abcdefghijklmnopqrstuvwxyz1234 56789 second part third part
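The shifting logic of the CTE chain can be written as a small loop, which may make the intent easier to check. This is a sketch in Python, not a replacement for the query; shift_overflow is a hypothetical helper and the default limit of 30 comes from the question.

```python
def shift_overflow(fields, limit=30):
    """Cut each field to `limit` characters and carry the rest into the next field."""
    result = []
    carry = ""
    for field in fields:
        combined = carry + field          # overflow from the previous field goes first
        result.append(combined[:limit])   # keep at most `limit` characters
        carry = combined[limit:]          # the remainder moves on to the next field
    # Whatever is still in `carry` after the last field is discarded.
    return result


short = shift_overflow(["abc", "def", "ghi"])
long = shift_overflow(
    ["abcdefghijklmnopqrstuvwxyz123456789 ", "second part", "third part"]
)
print(short)
print(long)
```

Both sample rows give the same values as the query result above; short fields are left alone, and overflow only ever moves to later fields.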

Table Normalization (Parse comma separated fields into individual records)

I have a table like this:
Device
DeviceId Parts
1 Part1, Part2, Part3
2 Part2, Part3, Part4
3 Part1
I would like to create a table 'Parts', export data from Parts column to the new table. I will drop the Parts column after that
Expected result
Parts
PartId PartName
1 Part1
2 Part2
3 Part3
4 Part4
DevicePart
DeviceId PartId
1 1
1 2
1 3
2 2
2 3
2 4
3 1
Can I do this in SQL Server 2008 without using cursors?
-- Setup:
declare @Device table(DeviceId int primary key, Parts varchar(1000))
declare @Part table(PartId int identity(1,1) primary key, PartName varchar(100))
declare @DevicePart table(DeviceId int, PartId int)
insert @Device
values
(1, 'Part1, Part2, Part3'),
(2, 'Part2, Part3, Part4'),
(3, 'Part1')
--Script:
declare @DevicePartTemp table(DeviceId int, PartName varchar(100))
insert @DevicePartTemp
select DeviceId, ltrim(x.value('.', 'varchar(100)'))
from
(
select DeviceId, cast('<x>' + replace(Parts, ',', '</x><x>') + '</x>' as xml) XmlColumn
from @Device
)tt
cross apply
XmlColumn.nodes('x') as Nodes(x)
insert @Part
select distinct PartName
from @DevicePartTemp
insert @DevicePart
select tmp.DeviceId, prt.PartId
from @DevicePartTemp tmp
join @Part prt on
prt.PartName = tmp.PartName
-- Result:
select *
from @Part
PartId PartName
----------- ---------
1 Part1
2 Part2
3 Part3
4 Part4
select *
from @DevicePart
DeviceId PartId
----------- -----------
1 1
1 2
1 3
2 2
2 3
2 4
3 1
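The three stages of the script (split the list, build the distinct parts, map devices to part ids) can be sketched in a few lines of Python to check the expected result. The normalize helper and the dictionary layout are assumptions for illustration; part ids are assigned in sorted name order, which is how the sample result happens to be numbered.

```python
def normalize(devices):
    """Split comma separated part lists into a parts table and a device/part table."""
    # Stage 1: one (device_id, part_name) pair per list element, whitespace trimmed.
    split = [
        (device_id, name.strip())
        for device_id, parts in sorted(devices.items())
        for name in parts.split(",")
        if name.strip()
    ]
    # Stage 2: distinct part names, numbered from 1 in sorted order.
    distinct_names = sorted({name for _, name in split})
    part_ids = {name: part_id for part_id, name in enumerate(distinct_names, start=1)}
    # Stage 3: replace each part name with its id.
    device_parts = [(device_id, part_ids[name]) for device_id, name in split]
    return part_ids, device_parts


devices = {
    1: "Part1, Part2, Part3",
    2: "Part2, Part3, Part4",
    3: "Part1",
}
part_ids, device_parts = normalize(devices)
print(part_ids)
print(device_parts)
```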
You will need a Tally table to accomplish this without a cursor.
Follow the instructions to create a tally table here: Tally Tables by Jeff Moden
This script will put the table into your Temp database, so you probably want to change the "Use DB" statement
Then you can run the script below to insert a breakdown of Devices and Parts into a temp table. You should then be able to join on your part table by the part name (to get the ID) and insert into your new DevicePart table.
select *,
--substring(d.parts, 1, t.n)
substring(d.parts, t.n, charindex(', ', d.parts + ', ',t.n) - t.n) 'Part'
into #devicesparts
from device d
cross join tally t
where t.n < (select max(len(parts))+ 1 from device)
and substring(', ' + d.parts, t.n, 1) = ', '
Have a look at using fn_Split to create a table variable from the comma separated values.
You can then use this to drive your insert.
EDIT: Actually, I think you may still need a cursor. Leaving this answer in case fn_Split helps.
If there is a maximum number of parts per device then, yes, it can be done without a cursor, but this is quite complex.
Essentially, create a table (or view or subquery) that has a DeviceID and one PartID column for each possible index in the PartID string. This can be accomplished by making the PartID columns calculated columns using fn_split or another method of your choice. From there you do a multiple self-UNION of this table, with one table in the self-UNION for each PartID column. Each table in the self-UNION has only one of the PartID columns included in the select list of the query for the table.

Need help with a SQL query that combines adjacent rows into a single row

I need a solution to a problem that has the table structure as listed below.
Input
1 1/1/2009 Product1
2 2/2/2009 Product2
3 3/3/2009 Product3
4 4/4/2009 Product4
5 5/5/2009 Product5
Output
1 1/1/2009 2/2/2009 Product1
2 3/3/2009 4/4/2009 Product3
3 5/5/2009 Product5
I tried using a CTE, but was not very successful in extracting the second row's value.
Appreciate any help. Thanks.
I don't know whether Russ' answer helps you at all. Here is a link to an article that explains how to add row numbers to the results of a query. (Search for "row_number" to find the most likely example.)
Once you have a query numbering the rows properly, you should be able to throw that into a CTE, then select from it twice -- once for odd numbers, then again for even numbers. Have each result return the even numbered value for joining (odd - 1 = even). At that point, you can join the results of the queries and get two products on one row.
You are looking for PIVOT: http://msdn.microsoft.com/en-us/library/ms177410.aspx
Here is a best shot with the info you gave, I do something similar in one of my apps. You may need to use a dynamic SQL query if the pivot values change.
SELECT *
FROM (SELECT [Date]
,[Product]
FROM [Values]) p
PIVOT (Max([Date])
FOR [Product]
IN ('put date ranges here')) pvt
Here is what mine looks like; it allows for a varying set of values. It is used in a form builder to retrieve the values of user input.
--//Get a comma delimited list of field names from the field table for this form
DECLARE @FieldNames varchar(max)
SELECT @FieldNames = COALESCE(@FieldNames + ', ', '') + '[' + CAST([FieldName] AS varchar(max)) + ']'
FROM [Fields]
WHERE [FormID] = @FormID
--//create a dynamic sql pivot table that will bring our
--//fields and values together to look like a real sql table
DECLARE @SQL varchar(max)
SET @SQL =
'SELECT *
FROM (SELECT [Values].[RowID]
,[Fields].[FieldName]
,[Values].[Value]
FROM [Values]
INNER JOIN [Fields] ON [Fields].[FieldID] = [Values].[FieldID]
WHERE [Fields].[FormID] = ''' + @FormID + ''') p
PIVOT (Max([Value])
FOR [FieldName]
IN (' + @FieldNames + ')) pvt'
--//execute our sql to return the data
EXEC(@SQL)
You can actually create a dummy column initially and create a cte.
Then, use cte and join them on the dummy key and the row number or a series which simulates sequential numbers.
Then, filter in the dataset to display only odd numbered rows
create table dbo.test
(
id integer,
currdate varchar(20), -- just to keep simple, not to mess with dates, took this as string
productCode varchar(20)
);
insert into test values (1, '1/1/2009', 'product1');
insert into test values (2, '2/2/2009', 'product2');
insert into test values (3, '3/3/2009', 'product3');
insert into test values (4, '4/4/2009', 'product4');
insert into test values (5, '5/5/2009', 'product5');
with ctes as
(
select
't' as joinKey,
id, -- serves as the row number; otherwise create one using row_number() over (order by ...)
currdate,
productCode
from test
)
select
c1.id,
c1.currdate,
c1.productCode,
c2.id,
c2.currdate,
c2.productCode
from
(
select
joinKey,
id,
currdate,
productCode
from
ctes
)c1,
(
select
joinKey,
id,
currdate,
productCode
from
ctes
)c2
where c1.joinKey = c2.joinKey
and c1.id + 1 = c2.id
and c1.id % 2 = 1
The result is as shown below:
id currdate productCode id currdate productCode
1 1/1/2009 product1 2 2/2/2009 product2
3 3/3/2009 product3 4 4/4/2009 product4
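Note that this join returns only complete pairs, so the unpaired fifth row from the question's expected output is missing. The pairing rule itself, including the trailing odd row, can be sketched as follows; this is an illustration in Python with a hypothetical pair_rows helper, and in SQL the same effect would need an outer join instead of the inner join.

```python
def pair_rows(rows):
    """Combine each odd-positioned row with the row after it.

    Each input row is (id, date, product). Each output row is
    (new_id, first_date, second_date_or_None, first_product).
    """
    pairs = []
    for index in range(0, len(rows), 2):
        first = rows[index]
        # The last row has no partner when the number of rows is odd.
        second = rows[index + 1] if index + 1 < len(rows) else None
        second_date = second[1] if second is not None else None
        pairs.append((index // 2 + 1, first[1], second_date, first[2]))
    return pairs


rows = [
    (1, "1/1/2009", "Product1"),
    (2, "2/2/2009", "Product2"),
    (3, "3/3/2009", "Product3"),
    (4, "4/4/2009", "Product4"),
    (5, "5/5/2009", "Product5"),
]
paired = pair_rows(rows)
for row in paired:
    print(row)
```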