Duplicate rows with information merged and then remove duplicates

I have a table called customers that is populated from a form. Not all fields are required (this is because the form generates asf / xml with the inputted info), and I would like to merge duplicates into one row and then delete the duplicates.
Here is my table:
CID | LastName | FirstName | Street | City | ZipCode | HomePhone | CellPhone | EmailAddr
1 | Test | NULL | NULL | NULL | NULL | NULL | NULL | NULL
2 | NULL | TEST | NULL | NULL | NULL | NULL | NULL | NULL
3 | NULL | NULL | Test | NULL | NULL | NULL | NULL | NULL
4 | NULL | NULL | NULL | Test | NULL | NULL | NULL | NULL
5 | NULL | NULL | NULL | NULL | Test | NULL | NULL | NULL
6 | NULL | NULL | NULL | NULL | NULL | Test | NULL | NULL
7 | NULL | NULL | NULL | NULL | NULL | NULL | TEST | NULL
8 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | TEST
I want to merge the data from each field that isn't null into the first instance, update that record, and then delete the remaining 7 records.
I am still starting out in SQL but understand joins, inserts, updates, deletes, etc. Any advice or direction would be greatly appreciated. I have found multiple posts showing how to merge this data in a report, but not too many showing how to truly merge the data and delete the duplicate rows.
I just found this post while searching, so it may be what I am looking for:
mysql-consolidate-duplicate-data-records-via-update-delete

Try this one -
SET NOCOUNT ON;
-- Sample data: a new group starts at every row where LastName is filled in
DECLARE @temp TABLE
(
      CID INT PRIMARY KEY
    , LastName NVARCHAR(10)
    , FirstName NVARCHAR(10)
    , Street NVARCHAR(10)
    , City NVARCHAR(10)
    , ZipCode NVARCHAR(10)
    , HomePhone NVARCHAR(10)
    , CellPhone NVARCHAR(10)
    , EmailAddr NVARCHAR(10)
)
INSERT INTO @temp (CID, LastName, FirstName, Street, City, ZipCode, HomePhone, CellPhone, EmailAddr)
VALUES
    (1, 'Test', NULL, NULL, NULL, NULL, NULL, NULL, NULL),
    (2, NULL, 'TEST', NULL, NULL, NULL, NULL, NULL, NULL),
    (3, NULL, NULL, 'Test', NULL, NULL, NULL, NULL, NULL),
    (4, NULL, NULL, NULL, 'Test', NULL, NULL, NULL, NULL),
    (5, NULL, NULL, NULL, NULL, 'Test', NULL, NULL, NULL),
    (6, NULL, NULL, NULL, NULL, NULL, 'Test', NULL, NULL),
    (7, NULL, NULL, NULL, NULL, NULL, NULL, 'TEST', NULL),
    (8, NULL, NULL, NULL, NULL, NULL, NULL, NULL, 'TEST'),
    (12, 'Tes2', NULL, NULL, NULL, NULL, NULL, NULL, NULL),
    (14, NULL, 'TES2', NULL, NULL, NULL, NULL, NULL, NULL),
    (17, NULL, NULL, 'Tes2', NULL, NULL, NULL, NULL, NULL),
    (18, 'Tes3', NULL, NULL, NULL, NULL, NULL, NULL, NULL),
    (19, NULL, 'TES3', NULL, NULL, NULL, NULL, NULL, NULL),
    (20, NULL, NULL, 'Tes3', NULL, NULL, NULL, NULL, NULL),
    (21, NULL, NULL, NULL, 'Test3', NULL, NULL, NULL, NULL)
-- Holds the merged rows until they are copied back into @temp
DECLARE @buffer_temp TABLE
(
      CID INT PRIMARY KEY
    , LastName NVARCHAR(50)
    , FirstName NVARCHAR(50)
    , Street NVARCHAR(50)
    , City NVARCHAR(50)
    , ZipCode NVARCHAR(50)
    , HomePhone NVARCHAR(50)
    , CellPhone NVARCHAR(50)
    , EmailAddr NVARCHAR(50)
)
-- For every group start, NextCID is the last CID that belongs to that group
;WITH cte AS
(
    SELECT t.CID, NextCID = ISNULL(t2.CID, (SELECT MAX(y.CID) FROM @temp y))
    FROM @temp t
    OUTER APPLY (
        SELECT TOP 1 CID = t1.CID - 1
        FROM @temp t1
        WHERE t1.CID > t.CID
            AND t1.LastName IS NOT NULL
        ORDER BY t1.CID -- without ORDER BY, TOP 1 is not guaranteed to pick the next group
    ) t2
    WHERE t.LastName IS NOT NULL
)
-- MAX ignores NULLs, so each column collapses to its non-NULL value
INSERT INTO @buffer_temp
SELECT
      t2.CID
    , LastName = MAX(LastName)
    , FirstName = MAX(FirstName)
    , Street = MAX(Street)
    , City = MAX(City)
    , ZipCode = MAX(ZipCode)
    , HomePhone = MAX(HomePhone)
    , CellPhone = MAX(CellPhone)
    , EmailAddr = MAX(EmailAddr)
FROM @temp t
CROSS APPLY (
    SELECT *
    FROM cte t2
    WHERE t.CID BETWEEN t2.CID AND t2.NextCID
) t2
GROUP BY t2.CID
-- Replace the original rows with the merged ones
DELETE FROM @temp
INSERT INTO @temp
SELECT *
FROM @buffer_temp
SELECT *
FROM @temp
Output:
CID LastName FirstName Street City ZipCode HomePhone CellPhone EmailAddr
----------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1 Test TEST Test Test Test Test TEST TEST
12 Tes2 TES2 Tes2 NULL NULL NULL NULL NULL
18 Tes3 TES3 Tes3 Test3 NULL NULL NULL NULL

It looks like you want to merge records 1-8, then 9-16, then 17-24, and so on.
Fortunately, you have a CID field that you can use for identifying the groups. All you need is the group, and the formula (CID - 1)/8 does the trick (SQL Server does integer division when dividing integers so, say, 4/8 = 0 and not 0.5). Here is the query:
select (CID - 1) / 8 as NewCID,
max(LastName) as LastName, max(FirstName) as FirstName, . . .
from t
group by (CID - 1) / 8;
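The query above only returns the merged rows. To write the result back, as the question asks, one option is to update the first row of each group from the aggregated values and then delete the rest. The following is a minimal sketch, not a definitive implementation. It assumes SQL Server, groups of exactly eight consecutive CID values, and at most one non-NULL value per column in each group. The temp table #customers and its column sizes are hypothetical stand-ins for the real customers table.

```sql
-- Hypothetical stand-in for the customers table; column sizes are assumed.
CREATE TABLE #customers
(
    CID INT PRIMARY KEY,
    LastName NVARCHAR(50), FirstName NVARCHAR(50), Street NVARCHAR(50),
    City NVARCHAR(50), ZipCode NVARCHAR(50), HomePhone NVARCHAR(50),
    CellPhone NVARCHAR(50), EmailAddr NVARCHAR(50)
);

-- Same shape as the question: one filled column per row.
INSERT INTO #customers (CID, LastName)  VALUES (1, 'Test');
INSERT INTO #customers (CID, FirstName) VALUES (2, 'TEST');
INSERT INTO #customers (CID, Street)    VALUES (3, 'Test');
INSERT INTO #customers (CID, City)      VALUES (4, 'Test');
INSERT INTO #customers (CID, ZipCode)   VALUES (5, 'Test');
INSERT INTO #customers (CID, HomePhone) VALUES (6, 'Test');
INSERT INTO #customers (CID, CellPhone) VALUES (7, 'TEST');
INSERT INTO #customers (CID, EmailAddr) VALUES (8, 'TEST');

BEGIN TRANSACTION;

-- Step 1: copy the merged values into the first row of each group.
-- MAX ignores NULLs; if a column had several non-NULL values, the largest would win.
WITH merged AS
(
    SELECT MIN(CID) AS KeepCID,
           MAX(LastName) AS LastName, MAX(FirstName) AS FirstName,
           MAX(Street) AS Street, MAX(City) AS City,
           MAX(ZipCode) AS ZipCode, MAX(HomePhone) AS HomePhone,
           MAX(CellPhone) AS CellPhone, MAX(EmailAddr) AS EmailAddr
    FROM #customers
    GROUP BY (CID - 1) / 8
)
UPDATE c
SET c.LastName = m.LastName, c.FirstName = m.FirstName,
    c.Street = m.Street, c.City = m.City,
    c.ZipCode = m.ZipCode, c.HomePhone = m.HomePhone,
    c.CellPhone = m.CellPhone, c.EmailAddr = m.EmailAddr
FROM #customers AS c
JOIN merged AS m ON m.KeepCID = c.CID;

-- Step 2: delete every row that is not the first row of its group.
DELETE c
FROM #customers AS c
WHERE c.CID <> (SELECT MIN(k.CID)
                FROM #customers AS k
                WHERE (k.CID - 1) / 8 = (c.CID - 1) / 8);

COMMIT TRANSACTION;

-- Only the first row of each group remains, now holding the merged values.
SELECT * FROM #customers;
```

Wrapping both statements in one transaction keeps the table from being left half-merged if the DELETE fails. If the groups are not a fixed size, the grouping expression would have to be replaced with whatever actually identifies a duplicate.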

Related

insert based on condition check in similar columns between 2 tables and add NULL when missing SQL

CREATE TABLE #temppayload(
[user_id] [int] NOT NULL,
[field_id] [int] NOT NULL,
[type_id] [int] NOT NULL,
[field_value_string] [nvarchar](4000) NULL,
[field_value_numeric] [float] NULL,
[field_value_date] [date] NULL,
[field_value_bool] [bit] NULL,
[field_value_lookup] [int] NULL
)
insert into #temppayload values (1,31,1,'NEW',NULL,NULL,NULL,NULL)
insert into #temppayload values (1,11,1,'Update1',NULL,NULL,NULL,NULL)
insert into #temppayload values (1,12,2,NULL,NULL,NULL,NULL,'4')
insert into #temppayload values (4,41,4,Null,'1',NULL,NULL,NULL)
CREATE TABLE #tempdb(
[user_id] [int] NOT NULL,
[field_id] [int] NOT NULL,
[type_id] [int] NOT NULL,
[field_value_string] [nvarchar](4000) NULL,
[field_value_numeric] [float] NULL,
[field_value_date] [date] NULL,
[field_value_bool] [bit] NULL,
[field_value_lookup] [int] NULL
)
insert into #tempdb values (1,11,1,'create1',NULL,NULL,NULL,NULL)
insert into #tempdb values (1,12,2, NULL,NULL,NULL,NULL,NULL)
insert into #tempdb values (1,13,3,NULL,NULL,'2020-04-15',NULL,NULL)
insert into #tempdb values (4,41,4,NULL,'1',NULL,NULL,NULL)
Expected result:
user_id | field_id | type_id | field_value_string | field_value_numeric | field_value_date | field_value_bool | field_value_lookup
1 | 31 | 1 | NEW | NULL | NULL | NULL | NULL
1 | 11 | 1 | update1 | NULL | NULL | NULL | NULL
1 | 12 | 2 | NULL | NULL | NULL | NULL | 4
1 | 13 | 3 | NULL | NULL | NULL | NULL | NULL
Conditions:
1. Insert values that are new in the payload.
2. Insert values that changed between #temppayload and #tempdb.
3. Insert a row with NULLs when #temppayload does not have it but it exists in #tempdb.
I was thinking of doing the NOT EXISTS separately, the LEFT JOIN separately, and then a UNION of both results.
With a condition like the one below, the approach does not look efficient to me. I tried an INNER JOIN and a LEFT JOIN together but could not figure it out.
Any help is much appreciated. Thanks in advance.
select
distinct
st.user_id
,st.field_id
,st.type_id
,st.field_value_string
,st.field_value_numeric
,st.field_value_date
,st.field_value_bool
,st.field_value_lookup
from #temppayload st
where not exists (select *
from #tempdb t
where st.user_id = t.user_id
AND st.type_id = t.type_id
AND st.field_id = t.field_id
AND st.field_value_string = t.field_value_string
AND st.field_value_string IS NULL
AND t.field_value_string IS NULL)

I was able to figure this out with the approach below. Feel free to share if there is a more efficient way to do this.
select * into #modifiedpayload
from (
    -- payload rows that are new or differ from #tempdb (NULL-safe comparison)
    select distinct st.user_id, st.field_id, st.type_id, st.field_value_string, st.field_value_numeric, st.field_value_date, st.field_value_lookup
    from #temppayload st
    where not exists (
        select 1
        from #tempdb t
        where st.user_id = t.user_id
            AND st.type_id = t.type_id
            AND st.field_id = t.field_id
            AND ((st.field_value_string IS NULL AND t.field_value_string IS NULL) OR st.field_value_string = t.field_value_string)
            AND ((st.field_value_numeric IS NULL AND t.field_value_numeric IS NULL) OR st.field_value_numeric = t.field_value_numeric)
            AND ((st.field_value_date IS NULL AND t.field_value_date IS NULL) OR st.field_value_date = t.field_value_date)
            AND ((st.field_value_bool IS NULL AND t.field_value_bool IS NULL) OR st.field_value_bool = t.field_value_bool)
            AND ((st.field_value_lookup IS NULL AND t.field_value_lookup IS NULL) OR st.field_value_lookup = t.field_value_lookup)
    )
    UNION
    -- rows that exist only in #tempdb, returned with NULL values
    select t.user_id, t.field_id, t.type_id, NULL as field_value_string, NULL as field_value_numeric, NULL as field_value_date, NULL as field_value_lookup
    from #tempdb t
    LEFT JOIN #temppayload s
        on t.user_id = s.user_id AND t.field_id = s.field_id
    where s.field_id IS NULL
) AS X
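A possibly shorter alternative is to let EXCEPT do the NULL-safe comparison, since EXCEPT treats two NULLs as equal and removes duplicate rows. This is only a sketch under the assumption that #temppayload and #tempdb exist exactly as created in the question. Unlike the query above, it also returns field_value_bool.

```sql
-- Assumes #temppayload and #tempdb were created and filled as in the question.
(
    -- Payload rows with no identical row in #tempdb, i.e. new or changed values.
    -- EXCEPT compares NULLs as equal, so no IS NULL checks are needed.
    SELECT user_id, field_id, type_id, field_value_string, field_value_numeric,
           field_value_date, field_value_bool, field_value_lookup
    FROM #temppayload
    EXCEPT
    SELECT user_id, field_id, type_id, field_value_string, field_value_numeric,
           field_value_date, field_value_bool, field_value_lookup
    FROM #tempdb
)
UNION ALL
-- Rows that exist only in #tempdb come back with NULL values.
SELECT t.user_id, t.field_id, t.type_id, NULL, NULL, NULL, NULL, NULL
FROM #tempdb AS t
WHERE NOT EXISTS (SELECT 1
                  FROM #temppayload AS s
                  WHERE s.user_id = t.user_id
                    AND s.field_id = t.field_id);
```

The parentheses only make the evaluation order explicit; SQL Server evaluates EXCEPT and UNION from left to right anyway.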

Merging multiple rows into one row by Grouping and checking for non null

I have data in this format. I want to merge multiple rows into one row, grouping by the ID column. Each row has only one non-null value, and each column has exactly one non-null value per ID.
[ID], [foo], [foo1], [foo2], [foo3], [foo4], [foo5]
1, data1, null, null, null, null, null
1, null, data2, null, null, null, null
1, null, null, data3, null, null, null
1, null, null, null, data4, null, null
1, null, null, null, null, data5, null
1, null, null, null, null, null, data6
2, data1, null, null, null, null, null
2, null, data2, null, null, null, null
2, null, null, data3, null, null, null
2, null, null, null, data4, null, null
2, null, null, null, null, data5, null
2, null, null, null, null, null, data6
Desired Output:
[ID], [foo], [foo1], [foo2], [foo3], [foo4], [foo5]
1, data1, data2, data3, data4, data5, data6
2, data1, data2, data3, data4, data5, data6

The aggregate functions max and min (as well as most others) will just ignore nulls. You could group by the ID and query the max of the other columns, which would return the single non-null value this column has:
SELECT id, MAX(foo), MAX(foo1), MAX(foo2), MAX(foo3), MAX(foo4), MAX(foo5)
FROM mytable
GROUP BY id

In your case, you can use max():
select id, max(foo) as foo, max(foo1) as foo1, . . .
from t
group by id;
I should note that your original data structure is often produced by a query that is a bit awry. Sometimes it is easier to fix the code that generates that result.

Firebird 2.5 insert row to existing table

I have table TABLE1 in a Firebird 2.5 database and want to insert multiple rows.
script:
INSERT INTO TABLE1 (ID, IDPREDEK, ICO, DIC, FIRMA, MISTO, ULICE, PSC, CISSML, CISEVID, PLATOD, PLATDO, JMENO, PRIJMENI, TITUL, FUNKCE, TELEFON, TELEFON2, FAX, EMAIL, ODP_JMENO, ODP_PRIJMENI, ODP_TITUL, ODP_FUNKCE, ODP_TELEFON, ODP_TELEFON2, ODP_FAX, ODP_EMAIL, D_INIDOP, D_INISETR, D_KATPRAC, POCETMUZI, POCETZENY, HASCHILD, HASCHILD1, HASCHILD2, POZNAMKA)
VALUES (91, 89, NULL, NULL, 'CLY0010702 - PHM-LPH_DEPO / PRG/RSM/FSB/PHM', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, 'N', 'N', 'N', NULL);
The problem is this error:
Error while importing to table TABLE1:
Engine Error (code = 335544665)
violation of PRIMARY or UNIQUE KEY constraint
"PK_TABLE1" on table "FIRMY".
Problematic key value is ("ID=95).
SQL Error (code= -803):
Invalid insert or update value(s): object columns are constrained - no 2 table rows can have duplicate column values.
The last ID in TABLE1 is 94, and there are no two rows with the same ID 95.
Any ideas what to do?

Check whether your ID is an identity (auto-generated) column. If so, just leave it out of both the INSERT INTO (...) column list and the VALUES (...) list.

I found a solution: replace the ID value with a generator call:
... VALUES (GEN_ID(GEN_TABLE1 , 1), 91, NULL, NULL, ...
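When keys come from a generator, it can help to compare the generator's current value with the IDs already stored, because generated keys collide with existing rows if the two have drifted apart. The following is a sketch for Firebird 2.5. The generator name GEN_TABLE1 is taken from the answer above, and the assumption that TABLE1 takes its ID from that generator is not confirmed by the question.

```sql
-- Highest ID currently stored in the table.
SELECT MAX(ID) FROM TABLE1;

-- Current generator value; an increment of 0 reads it without changing it.
SELECT GEN_ID(GEN_TABLE1, 0) FROM RDB$DATABASE;

-- If the generator is behind MAX(ID), move it forward.
-- 94 is the last ID reported in the question; substitute the result of the first query.
SET GENERATOR GEN_TABLE1 TO 94;
```

After that, the next GEN_ID(GEN_TABLE1, 1) call returns a value above the highest stored ID.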

Insert from one table to another (different databases)

Auth2.dbo.Accounts: account_id, login_name, password, other columns...
Auth.dbo.Account: account_id, account, password, other columns...
I want to insert accounts (account_id, account, password) from Auth.dbo.Account into Auth2.dbo.Accounts (account_id, login_name, password), and give the other columns values of my choosing.
I was hoping this would work, but when I ran it in SQL Server Management Studio I got a syntax error near "select account_id from Auth.dbo.account":
INSERT [dbo].[Accounts] ([account_id], [login_name], [password],
[referral_id], [referral_code], [pcbang], [block],
[withdraw_remain_time], [age], [auth_ok],
[last_login_server_idx], [event_code], [server_list_mask],
[result], [ip], [game_code], [gamecode], [login_event],
[email], [security_a_1], [security_a_2], [security_a_3],
[security_a_4], [security_q_1], [security_q_2], [security_q_3],
[security_q_4], [votepoints], [cash], [country])
VALUES (select account_id from auth.dbo.Account, N'Imad',
N'3cfbbd2ae3c3e416c6d00a5a12ee60e8', NULL, NULL,
0, 0, NULL, -1997, 1, NULL, NULL, NULL, NULL,
N'::1', NULL, NULL, NULL, N'imad.lekal@outlook.com',
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
0, 189, NULL)

Try with simple INSERT/SELECT. In this case account_id will be from auth.dbo.Account and all the other values will be constants.
INSERT Auth2.[dbo].[Accounts] ([account_id], [login_name], [password], [referral_id], [referral_code], [pcbang], [block], [withdraw_remain_time], [age], [auth_ok], [last_login_server_idx], [event_code], [server_list_mask], [result], [ip], [game_code], [gamecode], [login_event], [email], [security_a_1], [security_a_2], [security_a_3], [security_a_4], [security_q_1], [security_q_2], [security_q_3], [security_q_4], [votepoints], [cash], [country])
select account_id , N'Imad', N'3cfbbd2ae3c3e416c6d00a5a12ee60e8', NULL, NULL, 0, 0, NULL, -1997, 1, NULL, NULL, NULL, NULL, N'::1', NULL, NULL, NULL, N'imad.lekal@outlook.com', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 189, NULL from auth.dbo.Account;

Remove the VALUES (
When you INSERT using SELECT you should not use VALUES, since the values are coming from the SELECT
Also prefix the table names with the database names, assuming that Auth and Auth2 are 2 different databases.
Finally, your FROM is off.
The following should work.
INSERT Auth2.dbo.Accounts
(
[account_id]
, [login_name]
, [password]
, [referral_id]
, [referral_code]
, [pcbang]
, [block]
, [withdraw_remain_time]
, [age]
, [auth_ok]
, [last_login_server_idx]
, [event_code]
, [server_list_mask]
, [result]
, [ip]
, [game_code]
, [gamecode]
, [login_event]
, [email]
, [security_a_1]
, [security_a_2]
, [security_a_3]
, [security_a_4]
, [security_q_1]
, [security_q_2]
, [security_q_3]
, [security_q_4]
, [votepoints]
, [cash]
, [country]
)
SELECT
account_id
, N'Imad'
, N'3cfbbd2ae3c3e416c6d00a5a12ee60e8'
, NULL
, NULL
, 0
, 0
, NULL
, -1997
, 1
, NULL
, NULL
, NULL
, NULL
, N'::1'
, NULL
, NULL
, NULL
, N'imad.lekal@outlook.com'
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, 0
, 189
, NULL
FROM auth.dbo.Account
PRO TIP: Format your SQL properly bro, it helps to debug :).

Hi, I think this will help you. First of all, create a database link:
CREATE DATABASE LINK your_db_link_name
CONNECT TO Schema_Name
IDENTIFIED BY <PWD>
USING 'Your Database Name';
Next, write the query:
INSERT [Accounts] ([account_id], [login_name], [password], [referral_id],
[referral_code], [pcbang], [block], [withdraw_remain_time], [age],
[auth_ok], [last_login_server_idx], [event_code], [server_list_mask],
[result], [ip], [game_code], [gamecode], [login_event], [email],
[security_a_1], [security_a_2], [security_a_3], [security_a_4],
[security_q_1], [security_q_2], [security_q_3], [security_q_4],
[votepoints], [cash], [country])
VALUES (
select ac.account_id
from Your_DB_Link_Name@Account ac, N'Imad', N'3cfbbd2ae3c3e416c6d00a5a12ee60e8',
NULL, NULL, 0, 0, NULL, -1997, 1, NULL, NULL, NULL, NULL, N'::1', NULL,
NULL, NULL, N'imad.lekal@outlook.com', NULL, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, 0, 189, NULL)
I think this will help you. If you have any questions, feel free to ask.

Update multiple column from another table using sql server 2005

I have one master table named table1 where I store or update mobileNo data daily for a month.
;WITH table1 AS (SELECT * FROM (VALUES
(9999999999, '01/10/2013', NULL, NULL, NULL, NULL),
(9999999999, NULL, '02/10/2013', NULL, NULL, NULL),
(9999999999, NULL, NULL, '03/10/2013', NULL, NULL),
(9999999999, NULL, NULL, NULL, '04/10/2013', NULL),
(9999999999, NULL, NULL, NULL, NULL, '30/10/2013'),
(9999999999, NULL, NULL, NULL, NULL, NULL),
(8888888888, '01/10/2013', NULL, NULL, NULL, NULL),
(8888888888, NULL, '02/10/2013', NULL, NULL, NULL),
(8888888888, NULL, NULL, '03/10/2013', NULL, NULL),
(8888888888, NULL, NULL, NULL, '04/10/2013', NULL),
(8888888888, NULL, NULL, NULL, NULL, '30/10/2013'))
as t(mobileno,date1,date2,date3,date4,date30))
I have another table named table2 where I keep each mobileNo from table1 exactly once. Now I want to update table2 from table1 wherever data exists in table1.
mobileno date1 date2 date3 date4 date30
--------------- ---------- ---------- ---------- ---------- ----------
8888888888 01/10/2013 02/10/2013 03/10/2013 04/10/2013 30/10/2013
9999999999 01/10/2013 02/10/2013 03/10/2013 04/10/2013 30/10/2013
However, I tried a query like this:
UPDATE table1
set table1.date1 =
(SELECT date1 from table2 where table2.mobileno = table1.mobileno)
Where table2.mobileno = table1.mobileno
How do I update in a single query without repeating the update for each of the 30 date columns? Please help me. Thanks in advance.

To keep NULL values in table2 from overwriting non-NULL values in table1, you can use the following UPDATE statement.
It only adds the ISNULL function to the previous post's query.
update table1
set
date1 = ISNULL(t2.date1, date1),
date2 = ISNULL(t2.date2, date2)
from table2 t2
where table1.mobileno = t2.mobileno
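If the goal is the reverse direction, filling the one-row-per-number table2 from the sparse rows in table1 (the question's wording suggests this, but it is an assumption), the rows can be collapsed with MAX first and then joined in a single UPDATE. This is a sketch only. The temp tables #table1 and #table2 are hypothetical stand-ins, only three of the date columns are shown, and the dates are kept as text as in the question.

```sql
-- Hypothetical stand-ins for table1 and table2 with a reduced column set.
CREATE TABLE #table1 (mobileno BIGINT, date1 VARCHAR(10), date2 VARCHAR(10), date30 VARCHAR(10));
CREATE TABLE #table2 (mobileno BIGINT PRIMARY KEY, date1 VARCHAR(10), date2 VARCHAR(10), date30 VARCHAR(10));

-- Single-row inserts, since SQL Server 2005 has no multi-row VALUES lists.
INSERT INTO #table1 VALUES (9999999999, '01/10/2013', NULL, NULL);
INSERT INTO #table1 VALUES (9999999999, NULL, '02/10/2013', NULL);
INSERT INTO #table1 VALUES (9999999999, NULL, NULL, '30/10/2013');
INSERT INTO #table1 VALUES (8888888888, '01/10/2013', NULL, NULL);
INSERT INTO #table2 (mobileno) VALUES (9999999999);
INSERT INTO #table2 (mobileno) VALUES (8888888888);

-- Collapse table1 to one row per mobileno (MAX ignores NULLs), then update
-- table2, keeping its existing value where table1 has nothing for that column.
UPDATE t2
SET date1  = COALESCE(a.date1,  t2.date1),
    date2  = COALESCE(a.date2,  t2.date2),
    date30 = COALESCE(a.date30, t2.date30)
FROM #table2 AS t2
JOIN (
    SELECT mobileno,
           MAX(date1) AS date1, MAX(date2) AS date2, MAX(date30) AS date30
    FROM #table1
    GROUP BY mobileno
) AS a ON a.mobileno = t2.mobileno;

SELECT * FROM #table2;
```

Each date column still has to be listed once, but all of them are updated in one statement rather than one UPDATE per column.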
