SQL query too complicated for me - sql

I have a table with a note column whose value can be 'Start' or 'End'. The other columns can hold the same values across rows, so that the only difference is in the 'note' column.
I need to select the rows that have 'note' set to 'Start', but only those for which there is no row with the same values and 'note' set to 'End'. Sorry, it's complicated to explain, so here is an example.
Coll1 Coll2 Coll3 note
-----------------------------
a a a Start
a a a End
b b b Start
b b b End
c c c Start <- I need select those rows
-- There is no row with 'c c c End' combination in the table
d d d Start
d d d End
e e e Start <- I need select those rows
-- There is no row with 'e e e End' combination in the table
Can anybody help me please?

Try to use
SELECT *
FROM tbl t1
WHERE t1.note = 'Start' AND NOT EXISTS (SELECT *
FROM tbl t2
WHERE t2.note = 'End'
AND t2.Coll1 = t1.Coll1
AND t2.Coll2 = t1.Coll2
AND t2.Coll3 = t1.Coll3)
This query may not be optimal, but it is easy to understand.
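A minimal sketch of the NOT EXISTS query above, run against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a database server. The table name tbl and the sample rows come from the question; using SQLite and Python as the harness is an assumption made purely for illustration.

```python
import sqlite3

# In-memory stand-in for the asker's table (SQLite is an assumption, not the asker's database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Coll1 TEXT, Coll2 TEXT, Coll3 TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        ("a", "a", "a", "Start"), ("a", "a", "a", "End"),
        ("b", "b", "b", "Start"), ("b", "b", "b", "End"),
        ("c", "c", "c", "Start"),
        ("d", "d", "d", "Start"), ("d", "d", "d", "End"),
        ("e", "e", "e", "Start"),
    ],
)

# Keep each 'Start' row for which no 'End' row with the same three values exists.
query = """
SELECT t1.Coll1, t1.Coll2, t1.Coll3, t1.note
FROM tbl t1
WHERE t1.note = 'Start'
  AND NOT EXISTS (SELECT 1
                  FROM tbl t2
                  WHERE t2.note = 'End'
                    AND t2.Coll1 = t1.Coll1
                    AND t2.Coll2 = t1.Coll2
                    AND t2.Coll3 = t1.Coll3)
ORDER BY t1.Coll1
"""
unmatched_starts = conn.execute(query).fetchall()
print(unmatched_starts)
```

Only the c and e rows should come back, since they are the only 'Start' rows without an 'End' partner.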

The simplest way should be to aggregate the records and check whether there is an End record for the group:
select col1, col2, col3
from mytable
group by col1, col2, col3
having count(case when note = 'Start' then 1 end) = 1
and count(case when note = 'End' then 1 end) = 0;
Adjust this as you like (e.g. if you are fine with several start records make it >= 1 instead of = 1).
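To make the difference between = 1 and >= 1 concrete, here is a small sketch using Python's sqlite3 module as a throwaway test bed (an assumption; any database would do). The group f with two Start rows is hypothetical and was added only to show where the two variants disagree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("a", "a", "a", "Start"), ("a", "a", "a", "End"),
        ("c", "c", "c", "Start"),
        # Hypothetical extra group: two Start rows and no End row.
        ("f", "f", "f", "Start"), ("f", "f", "f", "Start"),
    ],
)

def open_groups(start_condition):
    """Return groups without an End row whose Start count satisfies start_condition.

    start_condition is a fixed literal such as "= 1" or ">= 1", never user input.
    """
    sql = f"""
        SELECT col1, col2, col3
        FROM mytable
        GROUP BY col1, col2, col3
        HAVING COUNT(CASE WHEN note = 'Start' THEN 1 END) {start_condition}
           AND COUNT(CASE WHEN note = 'End' THEN 1 END) = 0
        ORDER BY col1
    """
    return conn.execute(sql).fetchall()

exactly_one = open_groups("= 1")    # strict: exactly one Start row
at_least_one = open_groups(">= 1")  # relaxed: one or more Start rows
print(exactly_one, at_least_one)
```

The strict variant drops the f group because it has two Start rows; the relaxed variant keeps it.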

Will this work for you?
SELECT *
FROM mytable t1
LEFT JOIN mytable t2 ON
t1.Coll1 = t2.Coll1 AND
t1.Coll2 = t2.Coll2 AND
t1.Coll3 = t2.Coll3 AND
t2.note = 'End'
WHERE t1.note = 'Start' AND t2.Coll1 IS NULL

You'll get more answers if you include CREATE TABLE and INSERT statements in your question. I'm using PostgreSQL; Oracle is similar.
create table test (
col1 char(1) not null,
col2 char(1) not null,
col3 char(1) not null,
note varchar(10) not null
check (note in ('start', 'end')),
primary key (col1, col2, col3, note)
);
I'm assuming primary key (col1, col2, col3, note). The presence of NULL complicates this approach.
insert into test values
('a', 'a', 'a', 'start'),
('a', 'a', 'a', 'end'),
('b', 'b', 'b', 'start'),
('b', 'b', 'b', 'end'),
('c', 'c', 'c', 'start'),
('d', 'd', 'd', 'start'),
('d', 'd', 'd', 'end'),
('e', 'e', 'e', 'start');
Now we can take a set of starts and a set of ends. A left outer join will preserve all the rows in starts; missing rows in ends will be filled with NULL.
with starts as (
select * from test where note = 'start'
), ends as (
select * from test where note = 'end'
)
select s.* from starts s
left outer join ends e
on e.col1 = s.col1
and e.col2 = s.col2
and e.col3 = s.col3
where e.col1 is null
and e.col2 is null
and e.col3 is null
and e.note is null;
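The NULL filling described above is easier to see if you look at the joined rows before filtering. This sketch uses Python's sqlite3 module instead of PostgreSQL (an assumption, chosen so it runs anywhere) and a cut-down copy of the test table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 TEXT, col2 TEXT, col3 TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [("a", "a", "a", "start"), ("a", "a", "a", "end"), ("c", "c", "c", "start")],
)

# Every 'start' row is preserved; the 'end' side is NULL when no partner exists.
joined = conn.execute("""
    SELECT s.col1, s.note, e.note
    FROM test s
    LEFT OUTER JOIN test e
      ON e.col1 = s.col1
     AND e.col2 = s.col2
     AND e.col3 = s.col3
     AND e.note = 'end'
    WHERE s.note = 'start'
    ORDER BY s.col1
""").fetchall()

# sqlite3 reports SQL NULL as Python None; these are the starts without an end.
unmatched = [row for row in joined if row[2] is None]
print(joined)
print(unmatched)
```

Filtering on the NULL side of the join is what turns the outer join into an anti-join.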

SELECT coll1, coll2, coll3, count(*)
FROM tbl
GROUP BY coll1, coll2, coll3
-- keep groups that consist of a single row, and require that row to be a 'Start'
HAVING count(*) < 2 AND min(note) = 'Start'

Here's another solution to add to the ones you've already got:
WITH sample_data AS (SELECT 'a' Coll1, 'a' Coll2, 'a' Coll3, 'Start' note FROM dual UNION ALL
SELECT 'a' Coll1, 'a' Coll2, 'a' Coll3, 'End' note FROM dual UNION ALL
SELECT 'b' Coll1, 'b' Coll2, 'b' Coll3, 'Start' note FROM dual UNION ALL
SELECT 'b' Coll1, 'b' Coll2, 'b' Coll3, 'End' note FROM dual UNION ALL
SELECT 'c' Coll1, 'c' Coll2, 'c' Coll3, 'Start' note FROM dual UNION ALL
SELECT 'd' Coll1, 'd' Coll2, 'd' Coll3, 'Start' note FROM dual UNION ALL
SELECT 'd' Coll1, 'd' Coll2, 'd' Coll3, 'End' note FROM dual UNION ALL
SELECT 'e' Coll1, 'e' Coll2, 'e' Coll3, 'Start' note FROM dual)
-- End of mimicking a table called "sample_data" with your data in it
-- for use in the query below:
SELECT coll1,
coll2,
coll3,
MAX(CASE WHEN note = 'Start' THEN note END) note
FROM sample_data
GROUP BY coll1,
coll2,
coll3
HAVING MAX(CASE WHEN note = 'Start' THEN note END) = 'Start'
AND MAX(CASE WHEN note = 'End' THEN note END) IS NULL;
COLL1 COLL2 COLL3 NOTE
----- ----- ----- -----
e e e Start
c c c Start
It only has one access through the table (as opposed to the two accesses in Surename's answer), but you should test which solution works best with your data and table structure - one may be faster than the other.

For completeness' sake, since a comment suggested thinking in sets:
select col1, col2, col3 from mytable where note = 'Start'
minus
select col1, col2, col3 from mytable where note = 'End';
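MINUS is Oracle's spelling; the SQL standard calls the same operator EXCEPT, which is what SQLite accepts. A rough sketch of the set-difference idea using Python's sqlite3 module (the SQLite harness is an assumption for illustration, the data is from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("a", "a", "a", "Start"), ("a", "a", "a", "End"),
        ("b", "b", "b", "Start"), ("b", "b", "b", "End"),
        ("c", "c", "c", "Start"),
        ("d", "d", "d", "Start"), ("d", "d", "d", "End"),
        ("e", "e", "e", "Start"),
    ],
)

# Set of start keys minus set of end keys; EXCEPT also removes duplicate rows.
open_starts = sorted(conn.execute("""
    SELECT col1, col2, col3 FROM mytable WHERE note = 'Start'
    EXCEPT
    SELECT col1, col2, col3 FROM mytable WHERE note = 'End'
""").fetchall())
print(open_starts)
```

Note that the set operator compares only the selected columns, so the note column must be left out of both SELECT lists.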

select col1,col2,col3,count(note) from tb1
group by col1,col2,col3
having count(note)=1
You can also do it like this:
select * from tb1
where note <> 'End' and note = 'Start'

Related

SQL - create new col based on value of other column group by third column

I have this table
Id  item    type
----------------
A   itemA1  X
A   itemA2  X
B   itemA1  X
B   itemA2  X
B   itemA3  Y
I would like to create a new indicator that tells whether the Id contains only items of type X, only items of type Y, or both, like this:
Id  Indicator
-------------
A   Only X
B   Both
EDIT: It's possible to have more than 2 kinds of types
Thanks in advance for your help
Consider the generic approach below:
select id,
if(count(distinct type)=1,'Only ','') || string_agg(distinct type) indicator
from your_table
group by id
This also covers the case where all items are 'Y':
with table_with_sample_data as (
select 'A' as Id ,'itemA1' as item, 'X' as type union all
select 'A', 'itemA2', 'X' union all
select 'B', 'itemA1', 'X' union all
select 'B', 'itemA2', 'X' union all
select 'B', 'itemA3', 'Y' union all
select 'C', 'itemA1', 'Y' union all
select 'C', 'itemA2', 'Y'
)
select id,
if(count(distinct type)=1,'Only '|| max(type), 'Both') indicator from table_with_sample_data
group by id
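The same idea can be written with a CASE expression, which is the portable spelling of BigQuery's IF(). Below is a sketch run through Python's sqlite3 module (an assumption made so the example is easy to try; the table name your_table and the rows for Id C follow the answers above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT, item TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("A", "itemA1", "X"), ("A", "itemA2", "X"),
        ("B", "itemA1", "X"), ("B", "itemA2", "X"), ("B", "itemA3", "Y"),
        ("C", "itemA1", "Y"), ("C", "itemA2", "Y"),
    ],
)

# One distinct type in the group -> 'Only <type>'; otherwise 'Both'.
indicators = conn.execute("""
    SELECT id,
           CASE WHEN COUNT(DISTINCT type) = 1
                THEN 'Only ' || MAX(type)
                ELSE 'Both'
           END AS indicator
    FROM your_table
    GROUP BY id
    ORDER BY id
""").fetchall()
print(indicators)
```

With more than two kinds of types, 'Both' would have to become something like a list of the distinct types, as the first query does with string_agg.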

How to check in SQL if multi columnar set is in the table (without string concatenation)

Let's assume I've 3 columns in a table with values like this:
table_1:
A | B | C
-----------------------
'xx' | '' | 'y'
'x' | 'y' | 'x'
'x' | 'x' | 'y'
'x' | 'yy' | ''
'x' | '' | 'yy'
'x' | 'y' | 'y'
I've a result set (result of an SQL SELECT statement) which I want to identify in the above table if it exists there:
[
('x', 'x', 'y')
('x', 'y', 'y')
]
If I simply compared the results of plain string concatenation, e.g. SELECT concat(A, B, C) FROM table_1, this result set would match 5 (of 6) rows in the table above instead of the 2 that really match.
I could solve this problem by comparing the results of a more complex string concatenation, like this: SELECT concat('A=', A, '_', 'B=', B, '_', 'C=', C )
BUT:
I don't want to use any hardcoded special separator in a string concatenation like _ or =
because any character might be in the data
e.g.: somewhere in column B there might be this value: xx_C=yy
it's not a clean solution
I don't want to use string concatenation at all, because it's an ugly solution
it makes the "distance" between the attributes disappear
not general enough
maybe I've columns with different datatypes I don't want to convert to a STRING based column
Question:
Is it possible to solve somehow this problem without using string concatenation?
Is there a simple solution for this multi column value checking problem?
I want to solve this in BigQuery, but I'm interested in a general solution for every relational database/data warehouse.
Thank you!
CREATE TABLE test.table_1 (
A STRING,
B STRING,
C STRING
) AS
SELECT * FROM (
SELECT 'xx', '', 'y'
UNION ALL
SELECT 'x', 'y', 'x'
UNION ALL
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'yy', ''
UNION ALL
SELECT 'x', '', 'yy'
UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT A, B, C
FROM test.table_1
WHERE (A, B, C) IN ( -- I need this functionality
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'y', 'y'
);
Below is the most generic way I can think of (BigQuery Standard SQL):
#standardSQL
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
You can test and play with the above using the sample data from your question, as in the example below
#standardSQL
WITH `project.test.table1` AS (
SELECT 'xx' a, '' b, 'y' c UNION ALL
SELECT 'x', 'y', 'x' UNION ALL
SELECT 'x', 'x', 'y' UNION ALL
SELECT 'x', 'yy', '' UNION ALL
SELECT 'x', '', 'yy' UNION ALL
SELECT 'x', 'y', 'y'
), `project.test.table2` AS (
SELECT 'x' a, 'x' b, 'y' c UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
with output
Row a b c
1 x x y
2 x y y
Use join:
SELECT t1.*
FROM test.table_1 t1 JOIN
(SELECT 'x' as a, 'x' as b, 'y' as c
UNION ALL
SELECT 'x', 'y', 'y'
) t2
USING (a, b, c);
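For engines that accept row values, the (A, B, C) IN (...) form from the question works as written. The sketch below checks this in SQLite (row values need SQLite 3.15 or later) through Python's sqlite3 module, and also reproduces the false positives of plain concatenation. The table name wanted is hypothetical and stands for the result set being looked up; whether your own engine supports row-value IN is something to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [
        ("xx", "", "y"), ("x", "y", "x"), ("x", "x", "y"),
        ("x", "yy", ""), ("x", "", "yy"), ("x", "y", "y"),
    ],
)
# Hypothetical table holding the result set to look for.
conn.execute("CREATE TABLE wanted (a TEXT, b TEXT, c TEXT)")
conn.executemany("INSERT INTO wanted VALUES (?, ?, ?)",
                 [("x", "x", "y"), ("x", "y", "y")])

# Plain concatenation loses the column boundaries and over-matches.
concat_matches = conn.execute("""
    SELECT a, b, c FROM table_1
    WHERE a || b || c IN (SELECT a || b || c FROM wanted)
""").fetchall()

# Row-value comparison keeps each column separate.
tuple_matches = conn.execute("""
    SELECT a, b, c FROM table_1
    WHERE (a, b, c) IN (SELECT a, b, c FROM wanted)
    ORDER BY a, b, c
""").fetchall()
print(len(concat_matches), tuple_matches)
```

The join with USING shown above is the fallback when row-value IN is not available.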

Comparing 2 lists in Oracle

I have 2 lists which I need to compare. I need to find if at least one element from List A is found in List B. I know IN doesn't work with 2 lists. What are my other options?
Basically something like this :
SELECT
CASE WHEN ('A','B','C') IN ('A','Z','H') THEN 1 ELSE 0 END "FOUND"
FROM DUAL
Would appreciate any help!
You are probably looking for something like this. The WITH clause is there just to simulate your "lists" (whatever you mean by that); they are not really part of the solution. The query you need is just the last three lines (plus the semicolon at the end).
with
first_list (str) as (
select 'A' from dual union all
select 'B' from dual union all
select 'C' from dual
),
second_list(str) as (
select 'A' from dual union all
select 'Z' from dual union all
select 'H' from dual
)
select case when exists (select * from first_list f join second_list s
on f.str = s.str) then 1 else 0 end as found
from dual
;
FOUND
----------
1
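If you want to try the EXISTS idea without an Oracle instance, here is a sketch using Python's sqlite3 module (an assumption; SQLite has no dual table, so the outer SELECT has no FROM clause). The two tables mirror first_list and second_list from the WITH clause above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_list (str TEXT)")
conn.execute("CREATE TABLE second_list (str TEXT)")
conn.executemany("INSERT INTO first_list VALUES (?)", [("A",), ("B",), ("C",)])
conn.executemany("INSERT INTO second_list VALUES (?)", [("A",), ("Z",), ("H",)])

# 1 if at least one element of the first list also appears in the second list.
found = conn.execute("""
    SELECT CASE WHEN EXISTS (SELECT 1
                             FROM first_list f
                             JOIN second_list s ON f.str = s.str)
                THEN 1 ELSE 0 END
""").fetchone()[0]

# The common elements themselves, via a set intersection.
common = conn.execute("""
    SELECT str FROM first_list
    INTERSECT
    SELECT str FROM second_list
""").fetchall()
print(found, common)
```

EXISTS can stop at the first match, whereas the INTERSECT variant is useful when you also need to know which elements matched.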
In Oracle you can do:
select
count(*) as total_matches
from table(sys.ODCIVarchar2List('A', 'B', 'C')) x,
table(sys.ODCIVarchar2List('A', 'Z', 'H')) y
where x.column_value = y.column_value;
You need to repeat the conditions:
SELECT (CASE WHEN 'A' IN ('A', 'Z', 'H') OR
'B' IN ('A', 'Z', 'H') OR
'C' IN ('A', 'Z', 'H')
THEN 1 ELSE 0
END) as "FOUND"
FROM DUAL
If you are working with collections of strings, you can try the multiset operators.
create type coll_of_varchar2 is table of varchar2(4000);
and:
-- check whether any common element exists
select * from dual where cardinality (coll_of_varchar2('A','B','C') multiset intersect coll_of_varchar2('A','Z','H')) > 0;
-- list of matching elements
select * from table(coll_of_varchar2('A','B','C') multiset intersect coll_of_varchar2('A','Z','H'));
Additionally:
-- union of elements
select * from table(coll_of_varchar2('A','B','C') multiset union distinct coll_of_varchar2('A','Z','H'));
select * from table(coll_of_varchar2('A','B','C') multiset union all coll_of_varchar2('A','Z','H'));
-- elements from col1 not in col2
select * from table(coll_of_varchar2('A','A','B','C') multiset except all coll_of_varchar2('A','Z','H'));
select * from table(coll_of_varchar2('A','A','B','C') multiset except distinct coll_of_varchar2('A','Z','H'));
-- check if col1 is a subset of col2
select * from dual where coll_of_varchar2('B','A') submultiset coll_of_varchar2('A','Z','H','B');
I am trying to do something very similar but the first list is another field on the same query created with listagg and containing integer numbers like:
LISTAGG(my_first_list,', ') WITHIN GROUP(
ORDER BY
my_id
) my_first_list
and return this with all the other fields that I am already returning
SELECT
CASE WHEN my_first_list IN ('1,2,3') THEN 1 ELSE 0 END "FOUND"
FROM DUAL

SQL Beginner question: CASE AS END for multiple columns?

I am making a stored procedure which creates a target data table (#tmp_target_table), does some checking on it, and outputs the results in a resultset table (#tmp_resultset_table). The resultset table needs to have multiple new columns: user_warning_id, user_warning_note, and user_warning_detail in addition to existing columns from #tmp_target_table.
I have a working stored procedure, shown below, but it has an issue: I need to write conditionA, conditionB, and conditionC repeatedly, and these conditions will need to change in the future. How would you write this so that it is more extensible?
<Working code>
SELECT existing_col1, existing_col2,
CASE
WHEN conditionA
THEN user_warning_id_A
WHEN conditionB
THEN user_warning_id_B
WHEN conditionC
THEN user_warning_id_C
END AS user_warning_id,
CASE
WHEN conditionA
THEN user_warning_note_A
WHEN conditionB
THEN user_warning_note_B
WHEN conditionC
THEN user_warning_note_C
END AS user_warning_note,
CASE
WHEN conditionA
THEN user_warning_detail_A
WHEN conditionB
THEN user_warning_detail_B
WHEN conditionC
THEN user_warning_detail_C
END AS user_warning_detail,
existing_col3, existing_col4
INTO #tmp_resultset_table
FROM #tmp_target_table
SELECT * FROM #tmp_resultset_table
In SQL Server, you can use a lateral join (i.e., apply) to arrange the data so you can use a reference table:
select tt.*,
v2.user_warning_id, v2.user_warning_note, v2.user_warning_detail
from #tmp_target_table tt cross apply
(values (case when conditionA then 'a'
when conditionB then 'b'
when conditionC then 'c'
end)
) v(cond) left join
(values ('a', user_warning_id_A, user_warning_note_A, user_warning_detail_A),
('b', user_warning_id_B, user_warning_note_B, user_warning_detail_B),
('c', user_warning_id_C, user_warning_note_C, user_warning_detail_C)
) v2(cond, user_warning_id, user_warning_note, user_warning_detail)
on v2.cond = v.cond;
This also makes it pretty easy to add more levels, if you like.
Note: You could combine v and v2 into a single values list. I separated them, because you might want to consider making v2 an actual reference table.
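As a rough sketch of the "actual reference table" idea: evaluate the conditions once to get a token, then join that token to a lookup table holding the three warning columns. Everything concrete here is hypothetical (the target table, its amount column, the three conditions and the warning texts), and it runs on SQLite via Python's sqlite3 module only so that it is easy to try; SQLite has no APPLY, so a derived table plays the role of v.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for #tmp_target_table.
conn.execute("CREATE TABLE target (id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO target VALUES (?, ?)",
                 [(1, 5), (2, 15), (3, 50), (4, None)])

# Hypothetical reference table: one row per condition token.
conn.execute("""
    CREATE TABLE warning_ref (
        cond TEXT PRIMARY KEY,
        user_warning_id INTEGER,
        user_warning_note TEXT,
        user_warning_detail TEXT
    )
""")
conn.executemany("INSERT INTO warning_ref VALUES (?, ?, ?, ?)", [
    ("a", 1, "low", "amount below 10"),
    ("b", 2, "mid", "amount below 20"),
    ("c", 3, "high", "amount 20 or more"),
])

# The conditions appear exactly once; the three output columns come from the join.
result = conn.execute("""
    SELECT t.id, w.user_warning_id, w.user_warning_note, w.user_warning_detail
    FROM (SELECT target.*,
                 CASE WHEN amount < 10 THEN 'a'
                      WHEN amount < 20 THEN 'b'
                      WHEN amount >= 20 THEN 'c'
                 END AS cond
          FROM target) AS t
    LEFT JOIN warning_ref AS w ON w.cond = t.cond
    ORDER BY t.id
""").fetchall()
print(result)
```

Rows that satisfy none of the conditions (here the NULL amount) keep NULL warning columns because of the LEFT JOIN, which matches what the original CASE expressions return without an ELSE.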
EDIT:
DB2 supports lateral joins with the lateral keyword. I don't remember if DB2 supports values(). So try this:
select tt.*,
v2.user_warning_id, v2.user_warning_note, v2.user_warning_detail
from #tmp_target_table tt cross join lateral
(select (case when conditionA then 'a'
when conditionB then 'b'
when conditionC then 'c'
end)
from sysibm.sysdummy1
) v(cond) left join
(select 'a' as cond, user_warning_id_A as user_warning_id, user_warning_note_A as user_warning_note, user_warning_detail_A user_warning_detail
from sysibm.sysdummy1
union all
select 'b', user_warning_id_B, user_warning_note_B, user_warning_detail_B
from sysibm.sysdummy1
union all
select 'c', user_warning_id_C, user_warning_note_C, user_warning_detail_C
from sysibm.sysdummy1
) v2(cond, user_warning_id, user_warning_note, user_warning_detail)
on v2.cond = v.cond;
You could put the messages into a table and the condition logic into a function.
Just using temp tables so you can test it out.
Warnings
select warningID = 1, note = 'note 1', detail = 'notes on warning 1'
into #warning
union
select warningID = 2, note = 'note 2', detail = 'notes on warning 2'
union
select warningID = 3, note = 'note 3', detail = 'notes on warning 3'
union
select warningID = 4, note = 'note 4', detail = 'notes on warning 4'
Data values that have to meet conditions coming from some table ... #conditions
select condID = 1, val1 = 10, val2 = 1
into #conditions
union
select condID = 2, val1 = 20, val2 = 1
union
select condID = 3, val1 = 5, val2 = 2
union
select condID = 4, val1 = 30, val2 = 1
union
select condID = 4, val1 = 12, val2 = 1
Then a function that determines warnings based on conditions in the data. Takes values as input and returns a warningID
create function testWarningF
(
@val1In int
)
returns int
as
begin
declare @retVal int
select @retVal = case when @val1In <= 10 then 1
when @val1In > 10 and @val1In <=20 then 2
else 3
end
return @retVal
end
go
Then, the SQL ...
select *
from #conditions c
inner join #warning w on w.warningID = dbo.testWarningF(val1)
... returns this result
condID val1 val2 warningID note detail
1 10 1 1 note 1 notes on warning 1
2 20 1 2 note 2 notes on warning 2
3 5 2 1 note 1 notes on warning 1
4 12 1 2 note 2 notes on warning 2
4 30 1 3 note 3 notes on warning 3
Possibly the simplest method is to move the conditions into a sub-select, then reference a token in the outer select. E.g.
SELECT existing_col1
, existing_col2
, CASE CON
WHEN 'A' THEN user_warning_id_A
WHEN 'B' THEN user_warning_id_B
WHEN 'C' THEN user_warning_id_C END AS user_warning_id
, CASE CON
WHEN 'A' THEN user_warning_note_A
WHEN 'B' THEN user_warning_note_B
WHEN 'C' THEN user_warning_note_C END AS user_warning_note
, CASE CON
WHEN 'A' THEN user_warning_detail_A
WHEN 'B' THEN user_warning_detail_B
WHEN 'C' THEN user_warning_detail_C END AS user_warning_detail
, existing_col3
, existing_col4
FROM (
SELECT T.*
, CASE WHEN conditionA THEN 'A'
WHEN conditionB THEN 'B'
WHEN conditionC THEN 'C' END AS CON
FROM
#tmp_target_table T
)
although Gordon's answer is also neat, even though it adds two joins in the access plan. In Db2 Syntax, this works (on Db2 11.1.3.3 anyway)
select tt.*,
v2.user_warning_id, v2.user_warning_note, v2.user_warning_detail
from #tmp_target_table tt
, (values (case when conditionA then 'a'
when conditionB then 'b'
when conditionC then 'c'
end)
) v(cond) left join
(values ('a', 'user_warning_id_A', 'user_warning_note_A', 'user_warning_detail_A'),
('b', 'user_warning_id_B', 'user_warning_note_B', 'user_warning_detail_B'),
('c', 'user_warning_id_C', 'user_warning_note_C', 'user_warning_detail_C')
) v2(cond, user_warning_id, user_warning_note, user_warning_detail)
on v2.cond = v.cond;
testing with
create table #tmp_target_table(i int);
insert into #tmp_target_table(values 1);
create variable conditionA boolean;
create variable conditionB boolean default true;
create variable conditionC boolean;
returns
I USER_WARNING_ID USER_WARNING_NOTE USER_WARNING_DETAIL
- ----------------- ------------------- ---------------------
1 user_warning_id_B user_warning_note_B user_warning_detail_B

Find duplicate records in database against unique attributes

I want to find duplicates of the IRN # entered into a table in the database. Here are the (logically) unique attributes of the IRN:
ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo
An IRN can have multiple WeldNos, meaning the above attributes may repeat for one IRN # (though of course at least one of the 5 attribute values must differ).
Now I am trying to find out whether any duplicate IRNs have been entered into the system. How can I find that with a SQL query?
P.S: Due to bad design of database, there is no primary key in the table..
Here is what I have tried so far but this does not give the correct results.
select * from WeldInfo a, WeldInfo b
where a.ProjectNo = b.ProjectNo and
a.DrawingNo = b.DrawingNo and
a.DrawingRev = b.DrawingRev and
a.SpoolNo = b.SpoolNo and
a.WeldNo = b.WeldNo and
a.IrnNo <> b.IrnNo;
I'm not sure I have understood your question, but try this:
select * from (
select count(*) over ( partition by ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo) rr,t.* from WeldInfo t)
where rr > 1;
Explanation.
with tab as (
select 1 as id, 'a' as a , 'b' as b , 'c' as c from dual
union all
select 2 , 'a', 'b', 'c' from dual
union all
select 3 , 'x', 'b', 'c' from dual
union all
select 3 , 'x', 'b', 'c' from dual
union all
select 3 , 'x', 'd', 'c' from dual
)
select t.*
, count(*) over (partition by a,b,c) cnt1
, count(distinct id) over (partition by a,b,c) cnt2
from tab t;
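In the query above, cnt1 counts rows per key while cnt2 counts distinct ids per key, and only cnt2 > 1 means the same key was entered under different ids. A sketch of the same distinction with GROUP BY, assuming the WeldInfo column names from the question and made-up rows, run on SQLite through Python's sqlite3 module for convenience:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE WeldInfo (
        IrnNo INTEGER, ProjectNo TEXT, DrawingNo TEXT,
        DrawingRev TEXT, SpoolNo TEXT, WeldNo TEXT
    )
""")
# Made-up rows for illustration only.
conn.executemany("INSERT INTO WeldInfo VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "P1", "D1", "R0", "S1", "W1"),
    (2, "P1", "D1", "R0", "S1", "W1"),  # same weld under a different IRN
    (3, "P1", "D1", "R0", "S1", "W2"),
    (3, "P1", "D1", "R0", "S1", "W2"),  # same row repeated under the same IRN
    (3, "P1", "D1", "R0", "S2", "W1"),
])

# Keys that occur under more than one IRN are the duplicate candidates.
duplicates = conn.execute("""
    SELECT ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo,
           COUNT(*) AS row_count,
           COUNT(DISTINCT IrnNo) AS irn_count
    FROM WeldInfo
    GROUP BY ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo
    HAVING COUNT(DISTINCT IrnNo) > 1
""").fetchall()
print(duplicates)
```

Switching the HAVING clause to COUNT(*) > 1 would also report the repeated row under IRN 3, so pick the count that matches your definition of a duplicate.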