Select records within a group - sql

I am not sure how to frame this question, so I am asking with an example.
From the table below, I want to find all IDs that have no record with Type 'A'.
So from this table I want to find the records with ID 2.
TableA
+-----+------+
| ID1 | Type |
+-----+------+
| 1 | A |
| 1 | B |
| 1 | C |
| 2 | B |
| 2 | C |
| 3 | A |
| 3 | B |
| 3 | C |
+-----+------+
There is also a TableB that can be used if needed:
+-----+
| ID2 |
+-----+
| 1 |
| 2 |
| 3 |
+-----+
Thanks a lot for helping.

One method is to use a HAVING clause with a conditional COUNT:
SELECT ID1
FROM dbo.YourTable
GROUP BY ID1
HAVING COUNT(CASE WHEN [Type] = 'A' THEN 1 END) = 0;
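To check the conditional COUNT against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite stands in for SQL Server here (an assumption, made only so the example is self-contained), and the table is named TableA as in the question rather than dbo.YourTable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ID1 INTEGER, Type TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "B"), (2, "C"),
     (3, "A"), (3, "B"), (3, "C")],
)

# CASE yields NULL for every row whose Type is not 'A', and COUNT skips NULLs,
# so the count is 0 exactly for the groups that never have Type 'A'.
ids_without_a = [
    row[0]
    for row in conn.execute(
        """
        SELECT ID1
        FROM TableA
        GROUP BY ID1
        HAVING COUNT(CASE WHEN Type = 'A' THEN 1 END) = 0
        """
    )
]
print(ids_without_a)  # [2]
```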

You can use a CTE to select all the IDs that are associated to the value you want to exclude.
Then you can use a subquery to filter out those IDs:
create table #TableA (ID1 int, Type char(1))
insert into #TableA
values
(1, 'A')
,(1, 'B')
,(1, 'C')
,(2, 'B')
,(2, 'C')
,(3, 'A')
,(3, 'B')
,(3, 'C')
;with filteredIds
as
(
select
distinct ID1
from #TableA
where Type ='A'
)
select
*
from #TableA
where
ID1 not in (select id1 from filteredIds)
The result contains only the records whose ID1 has no row with the value 'A' (in your example, the records with ID1 = 2).

You can use not exists:
select t2.id2
from TableB t2
where not exists (select 1
from TableA t1
where t2.id2 = t1.id1 and t1.Type = 'A'
);
With an index on TableA(ID1, Type), this is probably the fastest method under most circumstances.
Note that this also finds ids that are not in TableA at all.
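The last remark can be seen in a small sketch, again using Python's sqlite3 module as a stand-in for SQL Server (an assumption). The id 4 in TableB is a hypothetical extra row, added only to show an id that has no rows in TableA at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ID1 INTEGER, Type TEXT)")
conn.execute("CREATE TABLE TableB (ID2 INTEGER)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "B"), (2, "C"),
     (3, "A"), (3, "B"), (3, "C")],
)
# 4 is hypothetical: it is not in the question's TableB and has no rows in TableA.
conn.executemany("INSERT INTO TableB VALUES (?)", [(1,), (2,), (3,), (4,)])

ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t2.ID2
        FROM TableB t2
        WHERE NOT EXISTS (SELECT 1
                          FROM TableA t1
                          WHERE t2.ID2 = t1.ID1 AND t1.Type = 'A')
        ORDER BY t2.ID2
        """
    )
]
print(ids)  # [2, 4]
```

If ids without any rows in TableA should be excluded, the GROUP BY / HAVING form, which reads only TableA, is the simpler choice.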

Related

How to update sql records by comparing duplicate strings?

I have the table below:
id | echantillon_dta | Est_en_double
1 | Bonjour | null
2 | Bonjour | null
3 | Bonjour | null
4 | Joke | null
5 | Joke | null
6 | | null
After the query runs, the table should look like this:
id | echantillon_dta | Est_en_double
1 | Bonjour | 1
2 | Bonjour | 1
3 | Bonjour | 1
4 | Joke | 4
5 | Joke | 4
6 | | null
How do I compare one string against another? And how do I update the column like this?
You can update with min(id) over the rows whose Record_Details is the same.
There is also an inconsistency in the description:
6 | Nope | 6 //No duplicates found, stay null
The row with id 6 has no duplicate, yet its isDuplicate value is shown as 6. Shouldn't it be null?
So I use having count(1) > 1 to solve it.
CREATE TABLE Table1
("id" int, "Record_Details" varchar2(11), "isDuplicate" varchar2(4))
;
INSERT ALL
INTO Table1 ("id", "Record_Details", "isDuplicate")
VALUES (1, 'Hello World', NULL)
INTO Table1 ("id", "Record_Details", "isDuplicate")
VALUES (2, 'Hello World', NULL)
INTO Table1 ("id", "Record_Details", "isDuplicate")
VALUES (3, 'Hello World', NULL)
INTO Table1 ("id", "Record_Details", "isDuplicate")
VALUES (4, 'Joke', NULL)
INTO Table1 ("id", "Record_Details", "isDuplicate")
VALUES (5, 'Joke', NULL)
INTO Table1 ("id", "Record_Details", "isDuplicate")
VALUES (6, 'Nope', NULL)
SELECT * FROM dual
;
update (
select T.*
, (select min("id")
from Table1 Tmp
where Tmp."Record_Details" = T."Record_Details"
group by Tmp."Record_Details" having count(1) > 1 --No duplicates found, stay null
) as "new_isDuplicate"
from Table1 T
)
set "isDuplicate" = "new_isDuplicate"
6 rows affected
select * from Table1
id | Record_Details | isDuplicate
-: | :------------- | :----------
1 | Hello World | 1
2 | Hello World | 1
3 | Hello World | 1
4 | Joke | 4
5 | Joke | 4
6 | Nope | null
You seem to want the minimum id with the same record_details.
This should work:
select t.*,
min(id) over (partition by record_details) as isDuplicate
from t;
If you want this as an update, a correlated subquery is a simple approach:
update t
set isduplicate = (select min(t2.id)
from t t2
where t2.record_details = t.record_details
);
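As a rough check of the correlated-subquery update, the sketch below runs it in SQLite through Python's sqlite3 module (an assumption; the thread targets Oracle). The table name t and the helper run_update are illustrative names. It also shows the difference that the HAVING COUNT(*) > 1 filter from the earlier answer makes for the row that has no duplicate.

```python
import sqlite3

def run_update(only_real_duplicates):
    """Return the isDuplicate column (ordered by id) after the update."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER, record_details TEXT, isDuplicate INTEGER)"
    )
    conn.executemany(
        "INSERT INTO t (id, record_details) VALUES (?, ?)",
        [(1, "Hello World"), (2, "Hello World"), (3, "Hello World"),
         (4, "Joke"), (5, "Joke"), (6, "Nope")],
    )
    # With the HAVING filter, groups of size 1 produce no row,
    # so the scalar subquery evaluates to NULL for them.
    having = (
        "GROUP BY t2.record_details HAVING COUNT(*) > 1"
        if only_real_duplicates
        else ""
    )
    conn.execute(
        f"""
        UPDATE t
        SET isDuplicate = (SELECT MIN(t2.id)
                           FROM t AS t2
                           WHERE t2.record_details = t.record_details
                           {having})
        """
    )
    return [row[0] for row in conn.execute("SELECT isDuplicate FROM t ORDER BY id")]

print(run_update(only_real_duplicates=False))  # [1, 1, 1, 4, 4, 6]
print(run_update(only_real_duplicates=True))   # [1, 1, 1, 4, 4, None]
```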
You can use a MERGE statement and analytic function to find the duplicates:
Oracle Setup:
CREATE TABLE Table_name ( id, Record_Details, isDuplicate ) AS
SELECT 1, 'Hello World', CAST( NULL AS NUMBER ) FROM DUAL UNION ALL
SELECT 2, 'Hello World', NULL FROM DUAL UNION ALL
SELECT 3, 'Hello World', NULL FROM DUAL UNION ALL
SELECT 4, 'Joke', NULL FROM DUAL UNION ALL
SELECT 5, 'Joke', NULL FROM DUAL UNION ALL
SELECT 6, 'Nope', NULL FROM DUAL;
Merge:
MERGE INTO table_name dst
USING (
SELECT ROWID rid,
MIN( id ) OVER ( PARTITION BY Record_details ) AS dupe_id
FROM table_name
) src
ON (
dst.ROWID = src.RID
AND dst.id <> src.dupe_id -- remove this line if you want to update all rows
)
WHEN MATCHED THEN
UPDATE SET isDuplicate = dupe_id;
Output:
ID | RECORD_DETAILS | ISDUPLICATE
-: | :------------- | ----------:
1 | Hello World | null
2 | Hello World | 1
3 | Hello World | 1
4 | Joke | null
5 | Joke | 4
6 | Nope | null

How to search/select a list of composite index values and get the exact matching rows in SQL Server?

I have a list of value pairs that I have to search for in a table in SQL Server. The table looks something like this:
| id | class | value |
| 1 | A | 300 |
| 2 | A | 400 |
| 1 | B | 500 |
| 2 | B | 350 |
| 1 | C | 230 |
| 2 | C | 120 |
The columns id and class have a unique composite index that I want to take advantage of. Now I have this list of id-class pairs that I have to get from this table:
(1, A)
(2, B)
I need to select them to UPDATE the value of both rows to any value. Let's say 1000.
My problem is, how do I select those two rows while taking advantage of the composite index?
I have tried this:
SELECT
*
FROM
table
WHERE
id IN (1, 2)
AND class IN ('A','B')
But this returns me the combinations:
| id | class |
| 1 | A |
| 1 | B |
| 2 | A |
| 2 | B |
and I just want:
| id | class |
| 1 | A |
| 2 | B |
this would work:
SELECT
*
FROM
test
WHERE
CAST(id as varchar)+class IN ('1A', '2B')
but this breaks the index. Is there a way to get what I need while taking advantage of the index?
The following scripts will take advantage of the composite index:
SELECT
*
FROM [table] T
INNER JOIN (
SELECT
1 AS ID
,'A' AS CLASS
UNION
SELECT
2 AS ID
,'B' AS CLASS
) t2
ON T.Id = t2.Id
AND T.class = t2.class
or this:
SELECT
*
FROM [table] T
WHERE (Id = 1 AND class = 'A')
OR (Id = 2 AND class = 'B')
or this:
WITH CTE AS(
SELECT 1 AS ID, 'A' AS CLASS UNION
SELECT 2 AS Id, 'B' AS CLASS
)
SELECT
*
FROM [table] T
WHERE EXISTS (
SELECT 1
FROM CTE t2
WHERE T.Id = t2.Id
AND T.class = t2.class
)
You can use the OR operator, see:
SELECT * FROM [table] WHERE (id = 1 AND class = 'A') OR (id = 2 AND class = 'B')
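When the list of pairs comes from application code, the OR form can be generated with one parameter pair per entry. Below is a minimal sketch using Python's sqlite3 module in place of SQL Server (an assumption); the index name ux_test_id_class is made up for the example, and the UPDATE to 1000 is the one described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, class TEXT, value INTEGER)")
# Hypothetical name for the unique composite index mentioned in the question.
conn.execute("CREATE UNIQUE INDEX ux_test_id_class ON test (id, class)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "A", 300), (2, "A", 400), (1, "B", 500),
     (2, "B", 350), (1, "C", 230), (2, "C", 120)],
)

pairs = [(1, "A"), (2, "B")]

# One "(id = ? AND class = ?)" term per pair, joined with OR.
# Each term is an equality on both index columns, so no expression
# is applied to the indexed columns themselves.
where = " OR ".join("(id = ? AND class = ?)" for _ in pairs)
params = [value for pair in pairs for value in pair]

conn.execute(f"UPDATE test SET value = 1000 WHERE {where}", params)

updated = conn.execute(
    "SELECT id, class FROM test WHERE value = 1000 ORDER BY id, class"
).fetchall()
print(updated)  # [(1, 'A'), (2, 'B')]
```

For long lists, loading the pairs into a temporary table and joining on both columns, as in the first script above, may be a better fit than a very long OR chain.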

How to rewrite a LEFT JOIN

Please, consider the following query:
create table lt (id1 int, val1 string);
insert into lt VALUES (1, "one"), (2, "two"), (3, "three");
create table rt (id2 int, val2 string);
insert into rt VALUES (2, "two"), (3, "three"), (4, "four");
select * from lt left join rt on id1=id2;
+-----+-------+------+-------+
| id1 | val1 | id2 | val2 |
+-----+-------+------+-------+
| 1 | one | NULL | NULL |
| 2 | two | 2 | two |
| 3 | three | 3 | three |
+-----+-------+------+-------+
For this specific example I can rewrite the LEFT JOIN as INNER JOIN + query that gets all IDs that are not in the "rt" table:
select lt.*, NULL as id2, NULL as val2 from lt where id1 not in (select id2 from rt)
union all
select * from lt join rt on id1=id2;
+-----+-------+------+-------+
| id1 | val1 | id2 | val2 |
+-----+-------+------+-------+
| 1 | one | NULL | NULL |
| 2 | two | 2 | two |
| 3 | three | 3 | three |
+-----+-------+------+-------+
Both queries give the same result for this example. But is this generally true? Can I rewrite any LEFT JOIN in this fashion (or maybe there is a shorter way)?
You can try the query below:
select val1, NULL as id2, NULL as val2 from lt where id1 not in (select id2 from rt)
union
select val1,id1, val1 from lt where 1=1 and id1 in (select id2 from rt)
OUTPUT:
val1 id2 val2
one
two 2 two
three 3 three
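On the "is this generally true?" part of the question: one case where the NOT IN rewrite is known to diverge from LEFT JOIN is a NULL in rt.id2, because x NOT IN (..., NULL) is never true. The sketch below shows this with Python's sqlite3 module; the extra (NULL, 'unknown') row is a hypothetical addition to the question's data, and the NOT EXISTS variant is one common way to avoid the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lt (id1 INTEGER, val1 TEXT);
    INSERT INTO lt VALUES (1, 'one'), (2, 'two'), (3, 'three');
    CREATE TABLE rt (id2 INTEGER, val2 TEXT);
    -- The NULL row is hypothetical; it is not in the question's data.
    INSERT INTO rt VALUES (2, 'two'), (3, 'three'), (4, 'four'), (NULL, 'unknown');
    """
)

left_join = conn.execute(
    "SELECT * FROM lt LEFT JOIN rt ON id1 = id2 ORDER BY 1"
).fetchall()

# Rewrite from the question: the NOT IN branch returns nothing once rt.id2 has a NULL.
not_in = conn.execute(
    """
    SELECT lt.*, NULL AS id2, NULL AS val2 FROM lt
    WHERE id1 NOT IN (SELECT id2 FROM rt)
    UNION ALL
    SELECT * FROM lt JOIN rt ON id1 = id2
    ORDER BY 1
    """
).fetchall()

# Same rewrite with NOT EXISTS, which is not affected by NULLs in rt.id2.
not_exists = conn.execute(
    """
    SELECT lt.*, NULL AS id2, NULL AS val2 FROM lt
    WHERE NOT EXISTS (SELECT 1 FROM rt WHERE rt.id2 = lt.id1)
    UNION ALL
    SELECT * FROM lt JOIN rt ON id1 = id2
    ORDER BY 1
    """
).fetchall()

print(left_join == not_in)      # False: the row for id1 = 1 is missing
print(left_join == not_exists)  # True
```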

select SQL rows depending on non-identical results of a query

I am trying to select only the relevant columns, without knowing in advance which ones they are.
I run:
select *
from table
where id = '1'
The result is maybe 10 rows and 100+ columns:
|id | column1 | column2 | column3 | column4 | column5 |....
| 1 | a | b | c | d | e |....
| 1 | a | XXX | c | d | e |....
| 1 | a | b | c | YYY | e |....
| 1 | a | b | c | d | e |....
In every row, one or more of the column values may differ, but I don't know which one(s).
Is there any way I can create a temp table from the first query and use a subquery to display only the columns that do not have the same value in all the rows?
So the result would look like this:
|id | column2 | column4 |
| 1 | b | d |
| 1 | XXX | d |
| 1 | b | YYY |
| 1 | b | d |
Since columns 2 and 4 are the ones with non-identical data, these are the ones I want to see.
As already mentioned, this would require dynamic sql.
Maybe this will help you:
CREATE TABLE Column_Relevance
SELECT id,
COUNT(DISTINCT(column_1))/COUNT(*) AS relevance_column_1,
COUNT(DISTINCT(column_2))/COUNT(*) AS relevance_column_2,
COUNT(DISTINCT(column_3))/COUNT(*) AS relevance_column_3
# AND SO ON (one comma-separated line per column)....
FROM your_table # placeholder for the table from the question
GROUP BY id;
A column that holds the same value in every row of an id has exactly one distinct value, so its relevance is 1/COUNT(*); any larger relevance means the column takes different values. You can build the whole statement in Excel in a few minutes.
Once the table is created, add another column holding COUNT(*) (called row_count here) and create a select statement based on the column relevance, e.g. select if(relevance_column_1 > 1/row_count, column_1, 'ignore') as column_1. This returns the string 'ignore' for every column whose values are all the same.
This is far from perfect but maybe it helps you a little.
Here is a way you could use some aggregation to help. You said you have nearly 100 columns so this could take some effort to create but once it is done it would be fine. And this is just for analysis. You could utilize sys.columns to build the code for you but then we are back in the land of dynamic sql.
create table #Something
(
ID int
, Column1 varchar(10)
, Column2 varchar(10)
, Column3 varchar(10)
, Column4 varchar(10)
, Column5 varchar(10)
)
insert #Something
values
(1, 'a', 'b', 'c', ' d ', 'e')
, (1, 'a', 'XXX', 'c', ' d ', 'e')
, (1, 'a', 'b', 'c', 'YYY', 'e')
, (1, 'a', 'b', 'c', ' d ', 'e')
;
with MinMax as
(
select ID
, MIN(Column1) as Col1Min
, MAX(Column1) as Col1Max
, MIN(Column2) as Col2Min
, MAX(Column2) as Col2Max
, MIN(Column3) as Col3Min
, MAX(Column3) as Col3Max
, MIN(Column4) as Col4Min
, MAX(Column4) as Col4Max
, MIN(Column5) as Col5Min
, MAX(Column5) as Col5Max
from #Something
group by ID
)
select s.ID
, Column1 = case when mm.Col1Max = mm.Col1Min then '' else s.Column1 end
, Column2 = case when mm.Col2Max = mm.Col2Min then '' else s.Column2 end
, Column3 = case when mm.Col3Max = mm.Col3Min then '' else s.Column3 end
, Column4 = case when mm.Col4Max = mm.Col4Min then '' else s.Column4 end
, Column5 = case when mm.Col5Max = mm.Col5Min then '' else s.Column5 end
from #Something s
join MinMax mm on mm.ID = s.ID
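The dynamic-SQL route mentioned in these answers can be sketched from application code: read the column list from the catalog, count distinct values per column, then select only the columns that vary. The version below uses Python's sqlite3 module and SQLite's PRAGMA table_info in place of SQL Server's sys.columns (an assumption); the table name something and the helper varying_columns are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE something (id INTEGER, column1 TEXT, column2 TEXT, "
    "column3 TEXT, column4 TEXT, column5 TEXT)"
)
conn.executemany(
    "INSERT INTO something VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "a", "b", "c", "d", "e"),
     (1, "a", "XXX", "c", "d", "e"),
     (1, "a", "b", "c", "YYY", "e"),
     (1, "a", "b", "c", "d", "e")],
)

def varying_columns(conn, table, row_id):
    """Names of non-id columns with more than one distinct value for row_id.

    The table name is interpolated into the SQL, so it must come from
    trusted code. COUNT(DISTINCT ...) ignores NULLs, so a column mixing
    NULL with a single value is not reported as varying.
    """
    # In PRAGMA table_info output, index 1 of each row is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    columns = [c for c in columns if c != "id"]
    counts_sql = ", ".join(f'COUNT(DISTINCT "{c}")' for c in columns)
    counts = conn.execute(
        f'SELECT {counts_sql} FROM "{table}" WHERE id = ?', (row_id,)
    ).fetchone()
    return [c for c, n in zip(columns, counts) if n > 1]

cols = varying_columns(conn, "something", 1)
select_list = ", ".join(f'"{c}"' for c in ["id"] + cols)
rows = conn.execute(
    f'SELECT {select_list} FROM "something" WHERE id = ? ORDER BY rowid', (1,)
).fetchall()

print(cols)  # ['column2', 'column4']
print(rows)  # [(1, 'b', 'd'), (1, 'XXX', 'd'), (1, 'b', 'YYY'), (1, 'b', 'd')]
```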
Have you tried using distinct? It returns only unique rows:
select *
from table
where id = '1'
|id | column2 | column4 |
| 1 | a | a |
| 1 | a | a |
| 1 | b | d |
| 1 | b | d |
select distinct * from table where id= '1'
|id | column2 | column4 |
| 1 | a | a |
| 1 | b | d |
I hope this helps you.

SQL to return one resultset from multiple tables when either table could return zero results

I have two tables with some columns that are similar and some that are different. I need to return a result that merges the differing columns into one result set. However, the filter condition may leave either of the tables with no matches. I tried a union, but that returns two rows with null values, and I would like just one. Here are two example tables:
TableA
----------------------------------------------------
| ID | ColumnA | ColumnB | ForeignKeyA | TimeStamp |
----------------------------------------------------
| 1 | Val1 | Val2 | KeyA | 2013-01-01|
----------------------------------------------------
| 2 | Val3 | Val4 | KeyB | 2013-01-02|
----------------------------------------------------
TableB
------------------------------------------
| ID | ColumnC | ForeignKeyA | TimeStamp |
------------------------------------------
| 1 | Val5 | KeyA | 2013-01-01|
------------------------------------------
| 2 | Val6 | KeyC | 2013-01-02|
------------------------------------------
and here are some pseudo queries and the return values I would like:
1)
SELECT TableA.ColumnA AS ColumnA,
TableA.ColumnB AS ColumnB,
TableB.ColumnC AS ColumnC,
TableA.id AS TableA_ID,
TableB.id AS TableB_ID
(WHERE ForeignKeyA in either table = KeyA and TimeStamp in either table = 2013-01-01)
>>
-------------------------------------------------------
| ColumnA | ColumnB | ColumnC | TableA_ID | TableB_ID |
-------------------------------------------------------
| Val1 | Val2 | Val5 | 1 | 1 |
-------------------------------------------------------
2)
SELECT TableA.ColumnA AS ColumnA,
TableA.ColumnB AS ColumnB,
TableB.ColumnC AS ColumnC,
TableA.id AS TableA_ID,
TableB.id AS TableB_ID
(WHERE ForeignKeyA in either table = KeyB and TimeStamp in either table = 2013-01-02)
>>
-------------------------------------------------------
| ColumnA | ColumnB | ColumnC | TableA_ID | TableB_ID |
-------------------------------------------------------
| Val3 | Val4 | Null | 2 | Null |
-------------------------------------------------------
3)
SELECT TableA.ColumnA AS ColumnA,
TableA.ColumnB AS ColumnB,
TableB.ColumnC AS ColumnC,
TableA.id AS TableA_ID,
TableB.id AS TableB_ID
(WHERE ForeignKeyA in either table = KeyC and TimeStamp in either table = 2013-01-02)
>>
-------------------------------------------------------
| ColumnA | ColumnB | ColumnC | TableA_ID | TableB_ID |
-------------------------------------------------------
| Null | Null | Val6 | Null | 2 |
-------------------------------------------------------
I think you can use a full outer join here, like this (you can add columns to the on clause, but it will not change anything):
with
cteA as (select * from TableA where TimeStamp = _ts and ForeignKeyA = _fk),
cteB as (select * from TableB where TimeStamp = _ts and ForeignKeyA = _fk)
select
A.ColumnA, A.ColumnB, B.ColumnC,
A.ID as TableA_ID, B.ID as TableB_ID
from cteA as A
full outer join cteB as B on 1 = 1
But you can also use a full outer join without prefiltering (this would be less efficient if you have indexes on the ForeignKeyA and TimeStamp columns):
select
A.ColumnA, A.ColumnB, B.ColumnC,
A.ID as TableA_ID, B.ID as TableB_ID
from TableA as A
full outer join TableB as B
on B.ForeignKeyA = A.ForeignKeyA and B.TimeStamp = A.TimeStamp
where
coalesce(A.ForeignKeyA, B.ForeignKeyA) = _fk and
coalesce(A.TimeStamp, B.TimeStamp) = _ts
The possibly tricky detail here is to join on (ForeignKeyA, TimeStamp), not on ID as one would normally do.
With this simplified setup (using legal column names):
CREATE TABLE tbl_a (id int, col_a text, col_b text, fk_a text, ts timestamp);
INSERT INTO tbl_a VALUES
(1, 'Val1', 'Val2', 'KeyA', '2013-01-01')
,(2, 'Val3', 'Val4', 'KeyB', '2013-01-02');
CREATE TABLE tbl_b (id int, col_c text, fk_a text, ts timestamp);
INSERT INTO tbl_b VALUES
(1, 'Val5', 'KeyA', '2013-01-01')
,(2, 'Val6', 'KeyC', '2013-01-02');
The query would be:
SELECT a.col_a, a.col_b, b.col_c
,a.id AS a_id, b.id AS b_id
FROM tbl_a a
FULL JOIN tbl_b b USING (fk_a, ts)
WHERE 'KeyA' IN (a.fk_a, b.fk_a)
AND '2013-01-01' IN (a.ts, b.ts);
Should be considerably faster than using CTEs. Test with EXPLAIN ANALYZE.
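To try the three cases from the question without a database server, here is a sketch using Python's sqlite3 module. Because older SQLite versions lack FULL OUTER JOIN, it emulates the full join over the two prefiltered sets with a LEFT JOIN plus a UNION ALL branch for the rows that exist only in tbl_b; that emulation is a swapped-in technique, not the query from either answer. The function name lookup and the text-typed ts column are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_a (id INTEGER, col_a TEXT, col_b TEXT, fk_a TEXT, ts TEXT);
    INSERT INTO tbl_a VALUES
      (1, 'Val1', 'Val2', 'KeyA', '2013-01-01'),
      (2, 'Val3', 'Val4', 'KeyB', '2013-01-02');
    CREATE TABLE tbl_b (id INTEGER, col_c TEXT, fk_a TEXT, ts TEXT);
    INSERT INTO tbl_b VALUES
      (1, 'Val5', 'KeyA', '2013-01-01'),
      (2, 'Val6', 'KeyC', '2013-01-02');
    """
)

QUERY = """
WITH a AS (SELECT * FROM tbl_a WHERE fk_a = :fk AND ts = :ts),
     b AS (SELECT * FROM tbl_b WHERE fk_a = :fk AND ts = :ts)
-- Rows found in tbl_a, with the tbl_b columns attached when they exist.
SELECT a.col_a, a.col_b, b.col_c, a.id AS a_id, b.id AS b_id
FROM a LEFT JOIN b ON 1 = 1
UNION ALL
-- Rows found only in tbl_b.
SELECT NULL, NULL, b.col_c, NULL, b.id
FROM b
WHERE NOT EXISTS (SELECT 1 FROM a)
"""

def lookup(fk, ts):
    """Return the merged rows for one (foreign key, timestamp) pair."""
    return conn.execute(QUERY, {"fk": fk, "ts": ts}).fetchall()

print(lookup("KeyA", "2013-01-01"))  # [('Val1', 'Val2', 'Val5', 1, 1)]
print(lookup("KeyB", "2013-01-02"))  # [('Val3', 'Val4', None, 2, None)]
print(lookup("KeyC", "2013-01-02"))  # [(None, None, 'Val6', None, 2)]
```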