SQL Query JOIN on the same table for a given data

SQL Query JOIN on the same table for a given data - sql

I have the following table and data:
CREATE TABLE TEST_TABLE (
ID NUMBER(6) NOT NULL,
COMMON_SEQ NUMBER(22),
NAME VARCHAR(20),
CONSTRAINT PK_CONST PRIMARY KEY (ID)
);
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1001, NULL, 'Michelle');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1002, NULL, 'Tiberius');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1003, NULL, 'Marigold');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1004, 999, 'Richmond');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1005, 999, 'Marianne');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1006, NULL, 'Valentin');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1007, 888, 'Juliette');
INSERT INTO TEST_TABLE (ID, COMMON_SEQ, NAME) VALUES (1008, NULL, 'Lawrence');
Some records in this table are related to each other by the common value of COMMON_SEQ (for example COMMON_SEQ of 999 relates Richmond and Marianne).
How can I select all names based on given ID as an input?
I tried joining table to itself (works ok when COMMON_SEQ is null). This example returns Michelle record:
SELECT T.ID, T.COMMON_SEQ,T.NAME
FROM TEST_TABLE T
LEFT JOIN TEST_TABLE T2 ON NOT T.COMMON_SEQ is NULL
AND T.COMMON_SEQ=T2.COMMON_SEQ AND T.ID<>T2.ID
WHERE T.ID=1001
But it doesn't bring back 2 records for ID 1004. This example returns only Richmond record (but I need to return also Marianne record):
SELECT T.ID, T.COMMON_SEQ,T.NAME
FROM TEST_TABLE T
LEFT JOIN TEST_TABLE T2 ON NOT T.COMMON_SEQ is NULL
AND T.COMMON_SEQ=T2.COMMON_SEQ AND T.ID<>T2.ID
WHERE T.ID=1004
How can I improve/rewrite the query to return Richmond and Marianne records when I supply only one ID value (either 1004 or 1005)?

You could use:
SELECT *
FROM TEST_TABLE t
WHERE COMMON_SEQ IN (SELECT COMMON_SEQ
FROM TEST_TABLE t1
WHERE t1.ID = 1004)
OR t.ID = 1004;
DBFiddle Demo
Passing the same parameter twice to handle NULL in COMMON_SEQ.

Try this
SELECT COALESCE (ty.id, tx.id) AS id,
COALESCE (ty.common_seq, tx.common_seq) AS common_seq,
COALESCE (ty.name, tx.name) AS name
FROM test_table tx LEFT OUTER JOIN test_table ty
ON (tx.common_seq = ty.common_seq)
WHERE tx.ID = 1004;
With this you can avoid using IN or EXISTS and this is likely to be more performant.

Related

Select query with "is null"-clause and subselect / left join returns no results

I'm trying, on base of Oracle 12c, to select entries from a DETAIL table which should fulfill two where-constraints:
foreign key is in sub-select
column is null
The following statement with the sub-select (in-clause) returns no results at all.
select *
from DETAIL
where (PARENT_ID) in (
select ID from MASTER where COL1 = 1)
and COL2 is null;
And also the left join returns no results:
select d.*
from MASTER m
left join DETAIL d
on d.PARENT_ID = m.ID
where m.COL1 = 1
and d.COL2 is null;
My table setup contains the following tables:
create table MASTER (
ID NUMBER(19) NOT NULL,
COL1 NUMBER(19),
primary key (ID)
);
insert into MASTER ("ID", "COL1") VALUES (1, 1);
insert into MASTER ("ID", "COL1") VALUES (2, 2);
insert into MASTER ("ID", "COL1") VALUES (3, 1);
create table DETAIL (
ID NUMBER(19) NOT NULL,
PARENT_ID NUMBER(19),
COL2 NUMBER(2,0),
primary key (ID),
foreign key (PARENT_ID) references MASTER(ID)
);
insert into DETAIL ("ID", "PARENT_ID", "COL2") VALUES (1, 1, 1);
insert into DETAIL ("ID", "PARENT_ID", "COL2") VALUES (2, 2, 1);
insert into DETAIL ("ID", "PARENT_ID", "COL2") VALUES (3, 3, 1);
insert into DETAIL ("ID", "PARENT_ID", "COL2") VALUES (4, 3, 2);
I would expect to get more than 0 entries as result on base of the code above.
Any recommandations? Thanks!

In your DETAIL table COL2 always has a value and is thus never NULL. Because of this the COL2 IS NULL condition is never satisfied, and your queries return no results.

SQL Server output in desired formatted way

I have two tables, tblName and tblCode.
tblName:
CREATE TABLE [dbo].[TblName]
(
[Id] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](50) NULL,
CONSTRAINT [PK_TblName]
PRIMARY KEY CLUSTERED ([Id] ASC)
) ON [PRIMARY]
tblCode:
CREATE TABLE [dbo].[tblCode]
(
[NameId] [int] NULL,
[Code] [varchar](50) NULL,
[Value] [varchar](50) NULL
) ON [PRIMARY]
Code:
INSERT INTO [dbo].[TblName]
([Name])
VALUES
('Rahul'),
('Rohit'),
('John'),
('David'),
('Stephen')
GO
INSERT INTO [dbo].[tblCode] ([NameId], [Code], [Value])
VALUES (1, 'DEL', 'Delivery'),
(1, 'DEL', 'Deployment'),
(2, 'REL', 'Release Management'),
(3, 'REL', 'Release Management'),
(4, 'TEST', 'Testing'),
(4, 'TEST', 'Final Testing')
I am trying to write a query to get all Names which are in tblCode with the Code and Value. For example I have NameId 1 present in tblCode with Code 'DEL' and Value as 'Delivery' and 'Deployment'. Similarly I have NameId 2,3 and 4 in tblCode with same or different Code and Value. So I am trying to get output in such a way if same name with same code is present in tblCode then it should come row with Name and Comma separated values as shown in below desired output.
This is they query I am using but its not giving the output I am looking for.
SELECT
N.Name,
CASE
WHEN C.Code = 'DEL'
THEN C.Value
ELSE ''
END As 'CodeValue'
FROM
TblName N
INNER JOIN
tblCode C ON N.Id = C.NameId
WHERE
C.NameId = 1 AND C.Code IN ('DEL', 'REL', 'TEST')

http://sqlfiddle.com/#!6/bdca78/11/0
CREATE TABLE tblName (id INTEGER, name VARCHAR(255));
INSERT INTO tblName VALUES(1, 'Rahul');
INSERT INTO tblName VALUES(2, 'Rohit');
INSERT INTO tblName VALUES(3, 'John');
INSERT INTO tblName VALUES(4, 'David');
INSERT INTO tblName VALUES(5, 'Steven');
CREATE TABLE tblCode(nameId INTEGER, code VARCHAR(255), value VARCHAR(255));
INSERT INTO tblCode VALUES(1, 'DEL', 'Delivery');
INSERT INTO tblCode VALUES(1, 'DEL', 'Development');
INSERT INTO tblCode VALUES(2, 'REL', 'Release Management');
INSERT INTO tblCode VALUES(3, 'REL', 'Release Management');
INSERT INTO tblCode VALUES(4, 'TEST', 'Testing');
INSERT INTO tblCode VALUES(4, 'TEST', 'Final Testing');
SELECT name,
codeValue
FROM
(SELECT tblName.name AS name,
STUFF((SELECT ',' + tblCode.value
FROM tblCode
WHERE tblCode.nameId = tblName.id
FOR XML PATH('')), 1 ,1, '') AS codeValue
FROM tblName) inline_view
WHERE codeValue IS NOT NULL;
Edit
ZoharPeled's solution is correct. This solution returns the name and a comma separated list of value and does not consider the code.
Zohar aggregates the list of value per code, which I think is what's required. OP, if the objective is to aggregate per name per code, including code in the output would make the result set more meaningful.
The following links show the difference, first is my SQL statement, then Zohar's.
My SQL: http://sqlfiddle.com/#!6/94fd0/1/0
Zohar's SQL: http://sqlfiddle.com/#!6/94fd0/2/0

Here is one way to do it:
;WITH CTE AS
(
SELECT DISTINCT
[NameId]
,[Code]
,(
SELECT STUFF(
(SELECT ',' + Value
FROM dbo.tblCode t1
WHERE t0.Code = t1.Code
AND t0.NameId = t1.NameId
FOR XML PATH(''))
, 1, 1, '')
) AS CodeValue
FROM dbo.tblCode t0
)
SELECT Name, CodeValue
FROM tblName
INNER JOIN CTE ON Id = CTE.NameId
ORDER BY Id
Results:
Name CodeValue
Rahul Delivery,Deployment
Rohit Release Management
John Release Management
David Testing,Final Testing
Read this SO post for an explanation on how to use STUFF and FOR XML to create a concatenated string from multiple rows.
You can see a live demo on rextester.

sql query to join two tables and a boolean flag to indicate whether it contains any words from third table

I have 3 tables with the following schema
create table main (
main_id int PRIMARY KEY,
secondary_id int NOT NULL
);
create table secondary (
secondary_id int NOT NULL,
tags varchar(100)
);
create table bad_words (
words varchar(100) NOT NULL
);
insert into main values (1, 1001);
insert into main values (2, 1002);
insert into main values (3, 1003);
insert into main values (4, 1004);
insert into secondary values (1001, 'good word');
insert into secondary values (1002, 'bad word');
insert into secondary values (1002, 'good word');
insert into secondary values (1002, 'other word');
insert into secondary values (1003, 'ugly');
insert into secondary values (1003, 'bad word');
insert into secondary values (1004, 'pleasant');
insert into secondary values (1004, 'nice');
insert into bad_words values ('bad word');
insert into bad_words values ('ugly');
insert into bad_words values ('worst');
expected output
----------------
1, 1000, good word, 0 (boolean flag indicating whether the tags contain any one of the words from the bad_words table)
2, 1001, bad word,good word,other word , 1
3, 1002, ugly,bad word, 1
4, 1003, pleasant,nice, 0
I am trying to use case to select 1 or 0 for the last column and use a join to join the main and secondary table, but getting confused and stuck. Can someone please help me with a query ? These tables are stored in redshift and i want query compatible with redshift.
you can use the above schema to try your query in sqlfiddle
EDIT: I have updated the schema and expected output now by removing the PRIMARY KEY in secondary table so that easier to join with the bad_words table.

You can use EXISTS and a regex comparison with \m and \M (markers for beginning and end of a word, respectively):
with
main(main_id, secondary_id) as (values (1, 1000), (2, 1001), (3, 1002), (4, 1003)),
secondary(secondary_id, tags) as (values (1000, 'very good words'), (1001, 'good and bad words'), (1002, 'ugly'),(1003, 'pleasant')),
bad_words(words) as (values ('bad'), ('ugly'), ('worst'))
select *, exists (select 1 from bad_words where s.tags ~* ('\m'||words||'\M'))::int as flag
from main m
join secondary s using (secondary_id)

select main_id, a.secondary_id, tags, case when c.words is not null then 1 else 0 end
from main a
join secondary b on b.secondary_id = a.secondary_id
left outer join bad_words c on c.words like b.tags

SELECT m.main_id, m.secondary_id, t.tags, t.is_bad_word
FROM srini.main m
JOIN (
SELECT st.secondary_id, st.tags, exists (select 1 from srini.bad_words b where st.tags like '%'+b.words+'%') is_bad_word
FROM
( SELECT secondary_id, LISTAGG(tags, ',') as tags
FROM srini.secondary
GROUP BY secondary_id ) st
) t on t.secondary_id = m.secondary_id;
This worked for me in redshift and produced the following output with the above mentioned schema.
1 1001 good word false
3 1003 ugly,bad word true
2 1002 good word,other word,bad word true
4 1004 pleasant,nice false

Select rows with duplicate values in 2 columns

This is my table:
CREATE TABLE [Test].[dbo].[MyTest]
(
[Id] BIGINT NOT NULL,
[FId] BIGINT NOT NULL,
[SId] BIGINT NOT NULL
);
And some data:
INSERT INTO [Test].[dbo].[MyTest] ([Id], [FId], [SId]) VALUES (1, 100, 11);
INSERT INTO [Test].[dbo].[MyTest] ([Id], [FId], [SId]) VALUES (2, 200, 12);
INSERT INTO [Test].[dbo].[MyTest] ([Id], [FId], [SId]) VALUES (3, 100, 21);
INSERT INTO [Test].[dbo].[MyTest] ([Id], [FId], [SId]) VALUES (4, 200, 22);
INSERT INTO [Test].[dbo].[MyTest] ([Id], [FId], [SId]) VALUES (5, 300, 13);
INSERT INTO [Test].[dbo].[MyTest] ([Id], [FId], [SId]) VALUES (6, 200, 12);
So I need 2 select query,
First Select FId, SId that like a distinct in both column so the result is:
100, 11
200, 12
100, 21
200, 22
300, 13
As you see the values of 200, 12 returned once.
Second query is the Id's of that columns whose duplicated in both FId, SId So the result is:
2
6
Does any one have any idea about it?

Standard SQL
SELECT
M.ID
FROM
( -- note all duplicate FID, SID pairs
SELECT FID, SID
FROM MyTable
GROUP BY FID, SID
HAVING COUNT(*) > 1
) T
JOIN -- back onto main table using these duplicate FID, SID pairs
MyTable M ON T.FID = M.FID AND T.SID = M.SID
Using windowing:
SELECT
T.ID
FROM
(
SELECT
ID,
COUNT(*) OVER (PARTITION BY FID, SID) AS CountPerPair
FROM
MyTable
) T
WHERE
T.CountPerPair > 1

First query:
SELECT DISTINCT Fid,SId
FROM MyTest
Second query:
SELECT DISTINCT a1.Id
FROM MyTest a1 INNER JOIN MyTest a2
ON a1.Fid = a2.Fid
AND a1.SId = a2.SId
AND a1.Id <> a2.Id
I cannot test them, but I think they should work...

first:
select distinct FId,SId from [Test].[dbo].[MyTest]
second query
select distinct t.Id
from [Test].[dbo].[MyTest] t
inner join [Test].[dbo].[MyTest] t2
on t.Id<>t2.Id and t.FId=t2.FId and t.SId=t2.SId

Part 1 is as mentioned above distinct.
This will resolve second part.
select id from [Test].[dbo].[MyTest] a
where exists(select 1 from [Test].[dbo].[MyTest] where a.[SId] = [SId] and a.[FId] = [FId] and a.id <> id)

How do I remove all but some records based on a threshold?

I have a table like this:
CREATE TABLE #TEMP(id int, name varchar(100))
INSERT INTO #TEMP VALUES(1, 'John')
INSERT INTO #TEMP VALUES(1, 'Adam')
INSERT INTO #TEMP VALUES(1, 'Robert')
INSERT INTO #TEMP VALUES(1, 'Copper')
INSERT INTO #TEMP VALUES(1, 'Jumbo')
INSERT INTO #TEMP VALUES(2, 'Jill')
INSERT INTO #TEMP VALUES(2, 'Rocky')
INSERT INTO #TEMP VALUES(2, 'Jack')
INSERT INTO #TEMP VALUES(2, 'Lisa')
INSERT INTO #TEMP VALUES(3, 'Amy')
SELECT *
FROM #TEMP
DROP TABLE #TEMP
I am trying to remove all but some records for those that have more than 3 names with the same id. Therefore, I am trying to get something like this:
id name
1 Adam
1 Copper
1 John
2 Jill
2 Jack
2 Lisa
3 Amy
I am not understanding how to write this query. I have gotten to the extent of preserving one record but not a threshold of records:
;WITH FILTER AS
(
SELECT id
FROM #TEMP
GROUP BY id
HAVING COUNT(id) >=3
)
SELECT id, MAX(name)
FROM #TEMP
WHERE id IN (SELECT * FROM FILTER)
GROUP BY id
UNION
SELECT id, name
FROM #TEMP
WHERE id NOT IN (SELECT * FROM FILTER)
Gives me:
1 Robert
2 Rocky
3 Amy
Any suggestions? Oh by the way, I don't care what records are preserved while merging.

You can do it using CTE
CREATE TABLE #TEMP(id int, name varchar(100))
INSERT INTO #TEMP VALUES(1, 'John')
INSERT INTO #TEMP VALUES(1, 'Adam')
INSERT INTO #TEMP VALUES(1, 'Robert')
INSERT INTO #TEMP VALUES(1, 'Copper')
INSERT INTO #TEMP VALUES(1, 'Jumbo')
INSERT INTO #TEMP VALUES(2, 'Jill')
INSERT INTO #TEMP VALUES(2, 'Rocky')
INSERT INTO #TEMP VALUES(2, 'Jack')
INSERT INTO #TEMP VALUES(2, 'Lisa')
INSERT INTO #TEMP VALUES(3, 'Amy')
SELECT *
FROM #TEMP;
WITH CTE(N) AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY id ORDER BY id)
FROM #Temp
)
DELETE CTE WHERE N>3;
SELECT *
FROM #TEMP;
DROP TABLE #TEMP

I will change your select like this (not tested)
select name from #temp group by name having count(id) > 3
then you can implement your query in a delete statement using your select as a where clause

in inner query you can use row_number function over (partition by id)
and then in outer query you have to give condition like below
select id,name from (
SELECT id,name, row_number() over (partition by id order by 1) count_id FROM #test
group by id, name )
where count_id <=3

If i got your question right, you need to get rows when id occurrence 3 or more times
select t1.name,t1.id from tbl1 t1
inner join tbl1 t2 on t1.id = t2.id
group by t1.name, t1.id
having count(t1.id) > 2

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL Query JOIN on the same table for a given data - sql

You could use: SELECT * FROM TEST_TABLE t WHERE COMMON_SEQ IN (SELECT COMMON_SEQ FROM TEST_TABLE t1 WHERE t1.ID = 1004) OR t.ID = 1004; DBFiddle Demo Passing the same parameter twice to handle NULL in COMMON_SEQ.

Related

Select query with "is null"-clause and subselect / left join returns no results

SQL Server output in desired formatted way

sql query to join two tables and a boolean flag to indicate whether it contains any words from third table

Select rows with duplicate values in 2 columns

How do I remove all but some records based on a threshold?

Categories

Resources