Finding duplicate members in a table - sql

I have come across a problem in writing a query to find duplicate members in a table. I have tried to simplify the problem with a sample table and data.
CREATE TABLE MYTABLE (
S_ID VARCHAR2(10),
PARAM VARCHAR2(10),
VALUE VARCHAR2(10)
);
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('1', 'NAME', 'A');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('1', 'AGE', '15');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('1', 'SEX', 'M');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('2', 'NAME', 'B');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('2', 'AGE', '16');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('2', 'SEX', 'M');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('3', 'NAME', 'A');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('3', 'AGE', '15');
INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES ('3', 'SEX', 'M');
Here the items with S_ID 1 and 3 are the same.

Here's an easy way to find duplicate values in one field:
SELECT FieldName FROM TableName
GROUP BY FieldName
HAVING COUNT(*) > 1

Do you mean this?
select colname, count(colname) from TableName
Group by colname Having (count(colname) > 1)
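Both queries above find repeated values in a single column, whereas the question asks for S_IDs whose whole set of (PARAM, VALUE) rows matches. What follows is a minimal sketch of one way to compare whole members, run through Python's sqlite3 module only so that it is self-contained. SQLite instead of Oracle, TEXT instead of VARCHAR2, and the names ROWS, QUERY and find_duplicate_members are all assumptions made for illustration. It also assumes that no S_ID repeats the same (PARAM, VALUE) row.

```python
import sqlite3

# Sample data from the question.
ROWS = [
    ("1", "NAME", "A"), ("1", "AGE", "15"), ("1", "SEX", "M"),
    ("2", "NAME", "B"), ("2", "AGE", "16"), ("2", "SEX", "M"),
    ("3", "NAME", "A"), ("3", "AGE", "15"), ("3", "SEX", "M"),
]

# Two members are duplicates when the number of (PARAM, VALUE) rows they
# share equals the total row count of each member.
QUERY = """
WITH cnt AS (
    SELECT S_ID, COUNT(*) AS n FROM MYTABLE GROUP BY S_ID
),
matched AS (
    SELECT a.S_ID AS id_a, b.S_ID AS id_b, COUNT(*) AS n
    FROM MYTABLE a
    JOIN MYTABLE b
      ON a.PARAM = b.PARAM AND a.VALUE = b.VALUE AND a.S_ID < b.S_ID
    GROUP BY a.S_ID, b.S_ID
)
SELECT m.id_a, m.id_b
FROM matched m
JOIN cnt ca ON ca.S_ID = m.id_a AND ca.n = m.n
JOIN cnt cb ON cb.S_ID = m.id_b AND cb.n = m.n
ORDER BY m.id_a, m.id_b
"""


def find_duplicate_members(rows):
    """Return (S_ID, S_ID) pairs whose PARAM/VALUE sets are identical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MYTABLE (S_ID TEXT, PARAM TEXT, VALUE TEXT)")
    conn.executemany(
        "INSERT INTO MYTABLE (S_ID, PARAM, VALUE) VALUES (?, ?, ?)", rows
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


duplicate_pairs = find_duplicate_members(ROWS)
print(duplicate_pairs)  # [('1', '3')]
```

The same SELECT should carry over to Oracle largely unchanged, since it relies only on a self-join, GROUP BY and COUNT, but that has not been checked here.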

Related

How to select distinct multi-column values in Oracle SQL?

I am trying to get distinct values with multi column select.
Sample table:
CREATE TABLE DUP_VALUES (ID NUMBER, NAME VARCHAR2(64));
INSERT INTO DUP_VALUES values (1, 'TEST1');
INSERT INTO DUP_VALUES values (2, 'TEST1');
INSERT INTO DUP_VALUES values (3, 'TEST2');
INSERT INTO DUP_VALUES values (4, 'TEST2');
INSERT INTO DUP_VALUES values (5, 'TEST1');
INSERT INTO DUP_VALUES values (6, 'TEST1');
INSERT INTO DUP_VALUES values (7, 'TEST1');
I want to get
ID NAME
1 TEST1
3 TEST2
I tried SELECT DISTINCT ID, NAME FROM DUP_VALUES
but I got all the rows back, because ID is unique.
Use aggregation:
select min(id) as id, name
from dup_values
group by name;

SQL Server: Find the group of records existing in another group of records

I'm new to SQL Server and am looking for a way to find out whether a group is included in another group.
For the sample data below, the query result should be grp_id 2, because its items 'A' and 'B' are both included in groups 3 and 5.
The result should be the grp_id of every group that is included in another group. I'll use this result to update another table, joined on grp_id.
The result should be:
+----+
| id |
+----+
| 2 |
+----+
I'm stuck because I can't find a way to compare the groups in SQL. My idea was to use a bitwise comparison, but for that I would have to add up a value for each item in a field. I think there could be an easier way.
Thank you and best regards!
Eric
create table tmp_grpid (grp_id int);
create table tmp_grp (grp_id int, item_val nvarchar(10));
insert into tmp_grpid(grp_id) values (1);
insert into tmp_grpid(grp_id) values (2);
insert into tmp_grpid(grp_id) values (3);
insert into tmp_grpid(grp_id) values (4);
insert into tmp_grpid(grp_id) values (5);
--
insert into tmp_grp(grp_id, item_val) values (1, 'A');
insert into tmp_grp(grp_id, item_val) values (2, 'A');
insert into tmp_grp(grp_id, item_val) values (2, 'B');
insert into tmp_grp(grp_id, item_val) values (3, 'A');
insert into tmp_grp(grp_id, item_val) values (3, 'B');
insert into tmp_grp(grp_id, item_val) values (3, 'C');
insert into tmp_grp(grp_id, item_val) values (4, 'A');
insert into tmp_grp(grp_id, item_val) values (4, 'C');
insert into tmp_grp(grp_id, item_val) values (4, 'D');
insert into tmp_grp(grp_id, item_val) values (5, 'A');
insert into tmp_grp(grp_id, item_val) values (5, 'B');
insert into tmp_grp(grp_id, item_val) values (5, 'E');
Geez!
Technically speaking, group 1 is found in all the other groups, right? The approach is to join the table to itself on the condition that the values are the same and the groups are different. Before that, we need to know how many items belong to each group, which is why the first derived table selects the count of elements per group. Joining that count to the self-join keeps only the groups whose full item count is matched by some other group. Hope this helps.
select distinct dist_grpid
from
(select grp_id, count(*) cc from tmp_grp group by grp_id) g
inner join
(
select dist.grp_id dist_grpid, tmp_grp.grp_id, count(*) cc
from
tmp_grp dist
cross join tmp_grp
where
dist.item_val = tmp_grp.item_val and
dist.grp_id != tmp_grp.grp_id
group by
dist.grp_id,
tmp_grp.grp_id
) cj on g.grp_id = cj.dist_grpid and g.cc = cj.cc
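As a quick check, here is a minimal sketch that runs a lightly re-aliased copy of the query above against the question's sample data, using Python's sqlite3 module. SQLite instead of SQL Server is an assumption made only to keep the example self-contained, and the names GROUP_ITEMS, SUBSET_QUERY and contained_groups are hypothetical. The sketch assumes that no group lists the same item twice.

```python
import sqlite3

# Sample rows of tmp_grp from the question: (grp_id, item_val).
GROUP_ITEMS = [
    (1, "A"),
    (2, "A"), (2, "B"),
    (3, "A"), (3, "B"), (3, "C"),
    (4, "A"), (4, "C"), (4, "D"),
    (5, "A"), (5, "B"), (5, "E"),
]

# g holds the item count per group; cj counts the items each pair of
# different groups has in common. A group is contained in another one
# when the shared count equals its own item count.
SUBSET_QUERY = """
select distinct cj.dist_grpid
from
    (select grp_id, count(*) as cc from tmp_grp group by grp_id) g
inner join
    (
        select dist.grp_id as dist_grpid, other.grp_id as other_grpid,
               count(*) as cc
        from tmp_grp dist
        cross join tmp_grp other
        where dist.item_val = other.item_val
          and dist.grp_id != other.grp_id
        group by dist.grp_id, other.grp_id
    ) cj on g.grp_id = cj.dist_grpid and g.cc = cj.cc
order by cj.dist_grpid
"""


def contained_groups(items):
    """Return the grp_ids whose items all appear in some other group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tmp_grp (grp_id int, item_val text)")
    conn.executemany(
        "insert into tmp_grp (grp_id, item_val) values (?, ?)", items
    )
    rows = conn.execute(SUBSET_QUERY).fetchall()
    conn.close()
    return [row[0] for row in rows]


contained = contained_groups(GROUP_ITEMS)
print(contained)  # [1, 2]
```

On this data the query returns group 1 as well as group 2, because 'A' on its own appears in every other group.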

How to select data from joined table as a table type?

I'd like to build a de-normalised view that holds the reference data in a table type.
create table reftab (id number, name varchar2(40), details varchar2(1000));
/
create table basictab (id number, name varchar2(300), contract varchar2 (20));
/
create or replace type reftype is object (id number, name varchar2(40), details varchar2(1000));
/
create or replace type reftypetab as table of reftype;
/
insert into basictab values (1, 'aaa', 'c1');
insert into basictab values (2, 'aab', 'c1');
insert into basictab values (3, 'aaa', 'c2');
insert into basictab values (4, 'aaa', 'c3');
insert into reftab values (1, 'asd', 'aaa');
insert into reftab values (1, 'asg', 'ass');
insert into reftab values (1, 'ash', 'add');
insert into reftab values (1, 'asf', 'agg');
insert into reftab values (3, 'asd', 'aaa');
insert into reftab values (3, 'ad', 'aa');
insert into reftab values (4, 'asd', 'aaa');
insert into reftab values (4, 'as', 'a');
insert into reftab values (4, 'ad', 'aa');
/
With this data, I'd like a view that contains the 4 rows of basictab plus an additional column of type reftypetab holding all the ref data joined on id.
I know I can obtain it by:
CREATE OR REPLACE FUNCTION pipef (p_id IN NUMBER) RETURN reftypetab PIPELINED AS
BEGIN
FOR x IN (select * from reftab where id = p_id) LOOP
PIPE ROW(reftype(x.id, x.name, x.details));
END LOOP;
RETURN;
END;
/
SELECT id, pipef(id)
FROM reftab
group BY id;
/
But is there a better way to get the result without a function?
Your current set-up gets:
SELECT id, pipef(id) as result
FROM reftab
group BY id;
ID RESULT(ID, NAME, DETAILS)
---------- ------------------------------------------------------------------------------------------------------------------------
1 REFTYPETAB(REFTYPE(1, 'asd', 'aaa'), REFTYPE(1, 'asg', 'ass'), REFTYPE(1, 'ash', 'add'), REFTYPE(1, 'asf', 'agg'))
4 REFTYPETAB(REFTYPE(4, 'asd', 'aaa'), REFTYPE(4, 'as', 'a'), REFTYPE(4, 'ad', 'aa'))
3 REFTYPETAB(REFTYPE(3, 'asd', 'aaa'), REFTYPE(3, 'ad', 'aa'))
You could use the collect() function to simplify that:
select id, cast(collect(reftype(id, name, details)) as reftypetab) as result
from reftab
group by id;
ID RESULT(ID, NAME, DETAILS)
---------- ------------------------------------------------------------------------------------------------------------------------
1 REFTYPETAB(REFTYPE(1, 'asd', 'aaa'), REFTYPE(1, 'asf', 'agg'), REFTYPE(1, 'ash', 'add'), REFTYPE(1, 'asg', 'ass'))
3 REFTYPETAB(REFTYPE(3, 'asd', 'aaa'), REFTYPE(3, 'ad', 'aa'))
4 REFTYPETAB(REFTYPE(4, 'asd', 'aaa'), REFTYPE(4, 'ad', 'aa'), REFTYPE(4, 'as', 'a'))
If you want information from basictab as well you can use a multiset operator:
select bt.id, bt.name,
cast(multiset(select reftype(rt.id, rt.name, rt.details)
from reftab rt where rt.id = bt.id) as reftypetab) as result
from basictab bt;
ID NAME RESULT(ID, NAME, DETAILS)
---------- ---------- ------------------------------------------------------------------------------------------------------------------------
1 aaa REFTYPETAB(REFTYPE(1, 'asd', 'aaa'), REFTYPE(1, 'asg', 'ass'), REFTYPE(1, 'ash', 'add'), REFTYPE(1, 'asf', 'agg'))
2 aab REFTYPETAB()
3 aaa REFTYPETAB(REFTYPE(3, 'asd', 'aaa'), REFTYPE(3, 'ad', 'aa'))
4 aaa REFTYPETAB(REFTYPE(4, 'asd', 'aaa'), REFTYPE(4, 'as', 'a'), REFTYPE(4, 'ad', 'aa'))
Further research gave me another way to do that:
SELECT id, CAST(COLLECT(reftype(r.id, r.name, r.details)) AS reftypetab) AS customer_ids
FROM reftab r group by id;
This looks much better, but I would still like to know whether there are any other ways to do that.
EDIT
Maybe that's not exactly what I was asking about, but a cursor expression can give a similar result.
SELECT id, cursor(select reftype(r.id, r.name, r.details) from reftab r where r.id = b.id) AS customer_ids
FROM basictab b;

Insert and Update multiple values in single SQL Statement

We are using Insert statements for multi inserts like this:
INSERT INTO [db1].[dbo].[tb1] ([ID], [CLM1], [CLM2])
VALUES
('1', 'A', 'DB'),
('2', 'AB', 'BQ'),
('3', 'AA', 'BH'),
('4', 'AD', 'BT'),
('5', 'AF', 'EB'),
('6', 'EA', 'AB')
In the above table, ID is the primary key. I want a single query that takes all the values, updates the records that already exist, and inserts the new ones.
You can use Merge:
MERGE INTO [db1].[dbo].[tb1] AS Target
USING (
VALUES
('1', 'A', 'DB'),
('2', 'AB', 'BQ'),
('3', 'AA', 'BH'),
('4', 'AD', 'BT'),
('5', 'AF', 'EB'),
('6', 'EA', 'AB')
) AS Source (new_ID, new_CLM1, new_CLM2)
ON Target.ID = Source.new_ID
WHEN MATCHED THEN
UPDATE SET
ID = Source.new_ID,
CLM1 = Source.new_CLM1,
CLM2 = Source.new_CLM2
WHEN NOT MATCHED BY TARGET THEN
INSERT (ID, CLM1, CLM2) VALUES (new_ID, new_CLM1, new_CLM2);
See the SQL Server MERGE documentation for details.

SQL Update First record of Duplicate row in table

I am looking to update the first record when a duplicate is found in a table.
CREATE TABLE tblauthor
(
Col1 varchar(20),
Col2 varchar(30)
);
CREATE TABLE tblbook
(
Col1 varchar(20),
Col2 varchar(30),
Col3 varchar(30)
);
INSERT INTO tblAuthor
(Col1,Col2)
VALUES
('1', 'John'),
('2', 'Jane'),
('3', 'Jack'),
('4', 'Joe');
INSERT INTO tblbook
(Col1,Col2,Col3)
VALUES
('1', 'John','Book 1'),
('2', 'John','Book 2'),
('3', 'Jack','Book 1'),
('4', 'Joe','Book 1'),
('5', 'Joe','Book 2'),
('6', 'Jane','Book 1'),
('7', 'Jane','Book 2');
The update should change the first record for each author, setting tblbook.Col3 to '1st'. After the update, select * from tblbook should include these rows:
('1', 'John','1st'),
('3', 'Jack','1st'),
('4', 'Joe','1st'),
('6', 'Jane','1st');
I can't seem to get this done, even with DISTINCT.
Use ROW_NUMBER to number the rows within each author's name (col2), ordered by col1, and then update the rows numbered 1:
update tblbook set col3 = '1st'
where col1 in(
select
col1
from (
select
tblbook.col1,
tblbook.col2,
tblbook.col3,
ROW_NUMBER() OVER (PARTITION BY tblbook.Col2 order by tblbook.col1) as rownum
from tblbook
left outer join tblauthor on tblbook.col2 = tblauthor.col2
) [t1]
where [t1].rownum = 1
)
Fiddle: http://sqlfiddle.com/#!3/4b6c8/20/0
If you want to update tblbook so the third column is '1st' on duplicates, then you can easily do so with an updatable CTE:
with toupdate as (
select tblbook.*, row_number() over (partition by col2 order by col1) as seqnum
from tblbook
)
update toupdate
set col3 = '1st'
where seqnum = 1;
This is the closest that I can come to understanding what you really want.
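For comparison, the same "first row per author" update can be sketched without window functions by using a correlated MIN subquery. This is a different technique from the ROW_NUMBER answers above, not a restatement of them, and it only matches them when Col1 is unique. The sketch runs through Python's sqlite3 module to stay self-contained. SQLite instead of SQL Server and the names BOOKS and mark_first_books are assumptions for illustration.

```python
import sqlite3

# Sample rows of tblbook from the question: (Col1, Col2, Col3).
BOOKS = [
    ("1", "John", "Book 1"), ("2", "John", "Book 2"),
    ("3", "Jack", "Book 1"),
    ("4", "Joe", "Book 1"), ("5", "Joe", "Book 2"),
    ("6", "Jane", "Book 1"), ("7", "Jane", "Book 2"),
]


def mark_first_books(books):
    """Set Col3 to '1st' on the lowest Col1 per author; return those rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tblbook (Col1 TEXT, Col2 TEXT, Col3 TEXT)")
    conn.executemany("INSERT INTO tblbook VALUES (?, ?, ?)", books)
    # Col1 is text, so MIN compares strings ('10' sorts before '2'),
    # just as ORDER BY col1 does on a varchar column.
    conn.execute(
        """
        UPDATE tblbook
        SET Col3 = '1st'
        WHERE Col1 = (SELECT MIN(b.Col1)
                      FROM tblbook AS b
                      WHERE b.Col2 = tblbook.Col2)
        """
    )
    rows = conn.execute(
        "SELECT Col1, Col2, Col3 FROM tblbook WHERE Col3 = '1st' ORDER BY Col1"
    ).fetchall()
    conn.close()
    return rows


first_books = mark_first_books(BOOKS)
print(first_books)
```

On the sample data this marks rows 1, 3, 4 and 6, which are the rows listed in the question.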