SQL Concatenate based on string inclusion

I have the following SQL table:
COL_A || COL_B ||
=========================
aa || 1 ||
aa || 2 ||
aa.bb || 3 ||
aa.bb.cc || 4 ||
aa.bb.cc || 5 ||
dd || 6 ||
dd.ee || 7 ||
As part of a SELECT query, I'd like to group by the values of Col_A and concatenate the values in Col_B based on the values in Col_A being a subset of one another. Meaning, if a value of Col_A is contained by/is equal to another value of Col_A, the corresponding Col_B of the superset/same Col_A value should be concatenated together.
Desired result:
COL_A || COL_B ||
======================================
aa || [1, 2, 3, 4, 5] ||
aa.bb || [3, 4, 5] ||
aa.bb.cc || [4, 5] ||
dd || [6, 7] ||
dd.ee || [7] ||

You can use a self join with array_agg:
select t1.col_a, array_agg(distinct t2.col_b)
from vals t1 join vals t2 on t2.col_a ~ t1.col_a
group by t1.col_a order by t1.col_a
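Note that ~ is an unanchored regular-expression match, so the . in the data matches any character and aa would also match a value such as baa. Here is a minimal sketch of a stricter "same path or dotted descendant" test. It assumes SQLite through Python's sqlite3 module purely for illustration; the table name vals follows the answer above, and the dotted-path rule is my reading of the question, not something the original poster confirmed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (col_a TEXT, col_b INTEGER)")
conn.executemany(
    "INSERT INTO vals VALUES (?, ?)",
    [("aa", 1), ("aa", 2), ("aa.bb", 3), ("aa.bb.cc", 4),
     ("aa.bb.cc", 5), ("dd", 6), ("dd.ee", 7)],
)

# t2 matches t1 when it is the same path or a dotted descendant of it.
# Assumption: col_a never contains the LIKE wildcards % or _.
rows = conn.execute(
    """
    SELECT DISTINCT t1.col_a, t2.col_b
    FROM vals t1
    JOIN vals t2
      ON t2.col_a = t1.col_a
      OR t2.col_a LIKE t1.col_a || '.%'
    ORDER BY t1.col_a, t2.col_b
    """
).fetchall()

# Collect the col_b values per col_a in Python to keep the ordering explicit.
grouped = {}
for col_a, col_b in rows:
    grouped.setdefault(col_a, []).append(col_b)

for col_a, col_bs in grouped.items():
    print(col_a, col_bs)
```

In PostgreSQL the same join condition should work with array_agg in place of the Python grouping step.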

You can do this using a lateral join
select t.cola, Concat('[',x.colB,']') ColB
from t
left join lateral (
select string_agg(colb::text,',') colB
from t t2
where t2.cola ~ t.cola
)x on true
group by t.cola, x.colb;

Related

Separate columns by commas ignoring nulls

I have the below table:
A    B    C    D    E
A1   null C1   null E1
A2   B2   C2   null null
null null C3   null E3
I would like the below output (separated by commas, if any value is null, then do not add a comma):
F
A1, C1, E1
A2, B2, C2
C3, E3
You can concatenate all columns regardless of whether they are null or not (saving a lot of comparisons to null), but then fix the commas with string functions. Whether this will be faster or slower than checking each value individually for being null will depend on the data (how many columns - I assume 5 is just for illustration - and how frequent null is in the data, for example).
I included more data for testing in the with clause (which, obviously, is not part of the answer; remove it and use your actual table and column names).
with
inputs (a, b, c, d, e) as (
select 'A1', null, 'C1', null, 'E1' from dual union all
select 'A2', 'B2', 'C2', null, null from dual union all
select null, null, 'C3', null, 'E3' from dual union all
select null, null, null, null, null from dual union all
select null, 'B5', null, null, null from dual
)
select a, b, c, d, e,
regexp_replace(
trim (both ',' from a || ',' || b || ',' || c || ',' || d || ',' || e)
, ',+', ', ') as f
from inputs;
A    B    C    D    E    F
---- ---- ---- ---- ---- ---------------
A1        C1        E1   A1, C1, E1
A2   B2   C2             A2, B2, C2
          C3        E3   C3, E3
     B5                  B5
EDIT
In a comment, the OP expanded the scope of the question. The new requirement is to also remove leading and/or trailing whitespace from the input tokens (including ignoring tokens altogether, if they consist entirely of whitespace).
This can be achieved as follows:
select a, b, c, d, e,
ltrim(
rtrim(
regexp_replace(
a || ',' || b || ',' || c || ',' || d || ',' || e
, '[[:space:]]*,[,[:space:]]*', ', '
)
, ', ' || chr(9)
)
, ', ' || chr(9)
) as f
from inputs;
Basically, you want concat_ws(), which Oracle does not support. Instead, you can use:
select trim(',' from
(case when A is not null then ',' || A end ||
case when B is not null then ',' || B end ||
case when C is not null then ',' || C end ||
case when D is not null then ',' || D end ||
case when E is not null then ',' || E end
)
) as F -- values are separated by ',' with no space after it
from inputs; -- "inputs" is a placeholder; use your own table name
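For comparison, here is a hedged sketch of the same CASE idea that yields the ", " separator asked for in the question. It runs on SQLite through Python's sqlite3 module, which is my choice for a self-contained demo and not the asker's Oracle setup. The table name inputs is borrowed from the other answer. Unlike Oracle, SQLite turns the whole concatenation into NULL when one operand is NULL, hence the ELSE '' branches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputs (a TEXT, b TEXT, c TEXT, d TEXT, e TEXT)")
conn.executemany(
    "INSERT INTO inputs VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", None, "C1", None, "E1"),
        ("A2", "B2", "C2", None, None),
        (None, None, "C3", None, "E3"),
        (None, "B5", None, None, None),
    ],
)

# Prefix every non-null value with ', ' and then drop the first two
# characters (the leading separator) with substr(..., 3).
query = """
SELECT substr(
         CASE WHEN a IS NOT NULL THEN ', ' || a ELSE '' END ||
         CASE WHEN b IS NOT NULL THEN ', ' || b ELSE '' END ||
         CASE WHEN c IS NOT NULL THEN ', ' || c ELSE '' END ||
         CASE WHEN d IS NOT NULL THEN ', ' || d ELSE '' END ||
         CASE WHEN e IS NOT NULL THEN ', ' || e ELSE '' END,
         3) AS f
FROM inputs
ORDER BY rowid
"""
f_values = [row[0] for row in conn.execute(query)]
print(f_values)
```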

Select values substituting the values of other column

I have a table like this:
A B
28496 TS_28496_FX
7365 TS_7365_FX
14760 TS_14760_FX
222 TS_222_AA1
I want to find all records where column B does not match the pattern
'TS_' || A || '_FX'
so that the query returns only this row:
222 TS_222_AA1
Thanks
This is a way:
with yourData(A,B) as (
select '28496' ,'TS_28496_FX' from dual union all
select '7365' ,'TS_7365_FX' from dual union all
select '14760' ,'TS_14760_FX' from dual union all
select '222' ,'TS_222_AA1' from dual union all
select '999' ,'999' from dual
)
select *
from yourData
where B != 'TS_' || A || '_FX'
which gives:
A B
----- -----------
222 TS_222_AA1
999 999
This assumes that B is never null (a comparison with null is never true, so such rows would be silently dropped); otherwise you can use
where nvl(B, '-') != 'TS_' || A || '_FX'
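Here is a small sketch of why the null guard matters. It assumes SQLite through Python's sqlite3 module, so coalesce stands in for Oracle's nvl. The row with A = '333' and a null B is invented test data, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_data (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO your_data VALUES (?, ?)",
    [
        ("28496", "TS_28496_FX"),
        ("222", "TS_222_AA1"),
        ("333", None),  # hypothetical row with a null B
    ],
)

# Without a guard, NULL != '...' evaluates to NULL, so the row is filtered out.
plain = conn.execute(
    "SELECT a FROM your_data WHERE b != 'TS_' || a || '_FX' ORDER BY a"
).fetchall()

# With the guard, a null B is treated as '-' and counts as a mismatch.
guarded = conn.execute(
    "SELECT a FROM your_data "
    "WHERE coalesce(b, '-') != 'TS_' || a || '_FX' ORDER BY a"
).fetchall()

print(plain)
print(guarded)
```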

Select all rows with equal values in 2 columns within each group

Consider the following table
ID || YEAR || TERM || NAME || UNIT
----------------------------------------
1 || 1985 || 1 || MARIE || 01VS
1 || 1986 || 2 || MARIE || 01VS
1 || 1986 || 2 || MARIE || 07GB
1 || 1986 || 3 || MARIE || 07GB
2 || 1992 || 1 || AVALON || 01VS
2 || 1992 || 2 || AVALON || 01VS
2 || 1992 || 3 || AVALON || 01VS
3 || 2001 || 1 || DENIS || 08HK
3 || 2001 || 1 || DENIS || 07GB
3 || 2001 || 2 || DENIS || 08HK
3 || 2002 || 1 || DENIS || 08HK
I want to write a SQL query in H2 that returns, for each ID, all rows that share the same YEAR and TERM with another row of that ID. For the table above, the result should be:
ID || YEAR || TERM || NAME || UNIT
----------------------------------------
1 || 1986 || 2 || MARIE || 01VS
1 || 1986 || 2 || MARIE || 07GB
3 || 2001 || 1 || DENIS || 08HK
3 || 2001 || 1 || DENIS || 07GB
You can use exists:
select t.*
from table t
where exists (select 1
from table t1
where t1.id = t.id and t1.year = t.year and
t.term = t1.term and t1.unit <> t.unit
);
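Something like the below should work, I think (YEAR has to be part of the comparison as well, otherwise the 2002 row for ID 3 would also be returned):
select *
from table t
where exists (select t2.id, t2.year, t2.term from table t2
where t2.id = t.id
and t2.year = t.year
and t2.term = t.term
group by t2.id, t2.year, t2.term
having count(*) > 1)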
However it would be easier if the table had a primary key of some sort.
How about joining the table to a subquery with GROUP BY and HAVING?
select t.*
from yourtable t
join
(
select ID, YEAR, TERM
from yourtable
group by ID, YEAR, TERM
having count(*) > 1
) d on (d.ID = t.ID and d.YEAR = t.YEAR and d.TERM = t.TERM);
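As a quick check, the GROUP BY / HAVING join can be run against the sample data. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for H2, and the table name yourtable is the placeholder from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER, year INTEGER, term INTEGER, "
    "name TEXT, unit TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1985, 1, "MARIE", "01VS"), (1, 1986, 2, "MARIE", "01VS"),
        (1, 1986, 2, "MARIE", "07GB"), (1, 1986, 3, "MARIE", "07GB"),
        (2, 1992, 1, "AVALON", "01VS"), (2, 1992, 2, "AVALON", "01VS"),
        (2, 1992, 3, "AVALON", "01VS"), (3, 2001, 1, "DENIS", "08HK"),
        (3, 2001, 1, "DENIS", "07GB"), (3, 2001, 2, "DENIS", "08HK"),
        (3, 2002, 1, "DENIS", "08HK"),
    ],
)

# Keep only the rows whose (id, year, term) combination occurs more than once.
duplicates = conn.execute(
    """
    SELECT t.id, t.year, t.term, t.name, t.unit
    FROM yourtable t
    JOIN (
        SELECT id, year, term
        FROM yourtable
        GROUP BY id, year, term
        HAVING count(*) > 1
    ) d ON d.id = t.id AND d.year = t.year AND d.term = t.term
    ORDER BY t.id, t.year, t.term, t.unit
    """
).fetchall()

for row in duplicates:
    print(row)
```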

How to query the latest date from each duplicated name

I want to query the row(s) that have the latest date for each name.
This is my example table.
ID || NAME || DATE || INFOA || INFOB || INFOC
1 || Alice || 2015-08-20 12:0:0 || Y || N || Y
2 || Bob || 2015-08-20 12:0:0 || Y || N || Y
3 || Cheschire || 2015-08-20 12:0:0 || N || Y || Y
4 || Alice || 2015-08-25 12:0:0 || N || Y || N
5 || Bob || 2015-08-15 12:0:0 || Y || Y || N
Query I used
SELECT NAME, MAX(DATE), INFOA, INFOB, INFOC
FROM EXAMPLE_TABLE
GROUP BY NAME,INFOA,INFOB,INFOC
Result is...
Alice || 2015-08-20 12:0:0 || Y || N || Y
Bob || 2015-08-20 12:0:0 || Y || N || Y
Cheschire || 2015-08-20 12:0:0 || N || Y || Y
Alice || 2015-08-25 12:0:0 || N || Y || N
Bob || 2015-08-15 12:0:0 || Y || Y || N
But my expected result is...
Bob || 2015-08-20 12:0:0 || Y || N || Y
Cheschire || 2015-08-20 12:0:0 || N || Y || Y
Alice || 2015-08-25 12:0:0 || N || Y || N
What should I do?
Use NOT EXISTS to return a row only if there is no other row with the same name but a later date:
select *
from tablename t1
where NOT EXISTS (select 1 from tablename t2
where t2.name = t1.name
and t2.date > t1.date)
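Here is a sketch of the NOT EXISTS query run against the question's sample rows. It assumes SQLite through Python's sqlite3 module, with the dates stored as zero-padded ISO strings so that plain string comparison orders them correctly. The table name example_table comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example_table (id INTEGER, name TEXT, date TEXT, "
    "infoa TEXT, infob TEXT, infoc TEXT)"
)
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Alice", "2015-08-20 12:00:00", "Y", "N", "Y"),
        (2, "Bob", "2015-08-20 12:00:00", "Y", "N", "Y"),
        (3, "Cheschire", "2015-08-20 12:00:00", "N", "Y", "Y"),
        (4, "Alice", "2015-08-25 12:00:00", "N", "Y", "N"),
        (5, "Bob", "2015-08-15 12:00:00", "Y", "Y", "N"),
    ],
)

# A row survives only if no other row with the same name has a later date.
latest = conn.execute(
    """
    SELECT t1.id, t1.name, t1.date
    FROM example_table t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM example_table t2
        WHERE t2.name = t1.name
          AND t2.date > t1.date
    )
    ORDER BY t1.id
    """
).fetchall()

for row in latest:
    print(row)
```

If two rows for the same name share the latest date, this approach returns both of them.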
I tried below:
CREATE TABLE T1(AA varchar2(10),bb TIMESTAMP(6),cc varchar2(1),dd varchar2(1),ee varchar2(1));
INSERT INTO T1 VALUES ('a',systimestamp-5,'Y','N','Y');
INSERT INTO T1 VALUES ('b',systimestamp-5,'N','N','Y');
INSERT INTO T1 VALUES ('c',systimestamp-5,'N','Y','Y');
INSERT INTO T1 VALUES ('a',systimestamp-1,'N','Y','N');
insert into t1 values ('b',systimestamp-11,'Y','Y','N');
Now, below is the query I used to get output you wanted:
SELECT * FROM T1
WHERE (t1.aa, T1.BB) IN (SELECT aa, MAX(BB)
from t1 group by aa);
Output:
b 21-AUG-15 02.51.47.000000000 AM N N Y
c 21-AUG-15 02.51.47.000000000 AM N Y Y
a 25-AUG-15 02.51.48.000000000 AM N Y N
Note: as per your question, you need the latest date for each name, regardless of the values in the other columns.
Use the query below to get the results you expect:
select id,name,date1,infoa,infob,infoc
from
(
select id,name,date1, row_number() over (partition by name order by date1 desc) as s
,infoa,infob,infoc
from testpart
)
where s=1
order by date1
Please try with the below code snippet.
DECLARE @userData TABLE(
ID INT NOT NULL,
Name VARCHAR(MAX) NOT NULL,
[Date] DATETIME NOT NULL,
INFOA VARCHAR(MAX) NOT NULL,
INFOB VARCHAR(MAX) NOT NULL,
INFOC VARCHAR(MAX) NOT NULL
);
INSERT INTO @userData VALUES ('1','Alice','2015-08-20 12:0:0','Y','N','Y')
INSERT INTO @userData VALUES ('2','Bob','2015-08-20 12:0:0','Y','N','Y')
INSERT INTO @userData VALUES ('3','Cheschire','2015-08-20 12:0:0','N','Y','Y')
INSERT INTO @userData VALUES ('4','Alice','2015-08-25 12:0:0','N','Y','N')
INSERT INTO @userData VALUES ('5','Bob','2015-08-15 12:0:0','Y','Y','N')
SELECT a.ID,a.Name,a.Date, a.INFOA,a.INFOB,a.INFOC FROM (
select *,RANK() OVER (PARTITION BY [Name] ORDER BY [DATE] DESC) AS [Rank]
from @userData
) a where a.[Rank] = 1
ORDER BY a.ID

SQL Column compare in the same table (self-join)

I need a hint in order to solve this SQL (self-join) problem:
a table, with columns value and category
id || value || category || foo
------------------------------------
1 || 1 || a || 1
2 || 2 || a || 4
3 || 3 || a || 2
4 || 0 || b || 2
5 || 1 || b || 1
6 || 2 || b || 4
7 || 3 || b || 2
8 || 4 || b || 2
9 || 5 || b || 1
10 || 5 || b || 4
11 || 6 || b || 2
12 || 99 || z || 2
I would like to compare all values from category b and all values from category a and get all values that are in b and not in a or their id, so:
(0,1,2,3,4,5,5,6) "compare" (1,2,3) => (0,4,5,5,6)
ANSI SQL:
SELECT
*
FROM
tbl
WHERE
category = 'b'
AND value NOT IN (SELECT value FROM tbl WHERE category = 'a')
Start analyzing your task: "get all values that are in b and not in a or their id"
get all values > SELECT value FROM mytable
that are in b > WHERE category = 'b'
and not in a > AND value NOT IN (SELECT value FROM mytable WHERE category = 'a')
or their id - what should this mean?
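The decomposition above can be sketched end to end on the sample data. The example assumes SQLite through Python's sqlite3 module and uses the table name tbl from the first answer. It also shows a NOT EXISTS variant, which is my addition: NOT IN returns no rows at all if the subquery yields a NULL, while NOT EXISTS does not have that problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER, value INTEGER, category TEXT, foo INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (1, 1, "a", 1), (2, 2, "a", 4), (3, 3, "a", 2),
        (4, 0, "b", 2), (5, 1, "b", 1), (6, 2, "b", 4),
        (7, 3, "b", 2), (8, 4, "b", 2), (9, 5, "b", 1),
        (10, 5, "b", 4), (11, 6, "b", 2), (12, 99, "z", 2),
    ],
)

# Values of category b that do not occur in category a.
not_in = conn.execute(
    """
    SELECT id, value FROM tbl
    WHERE category = 'b'
      AND value NOT IN (SELECT value FROM tbl WHERE category = 'a')
    ORDER BY id
    """
).fetchall()

# Same result via NOT EXISTS, which stays correct if value can be NULL.
not_exists = conn.execute(
    """
    SELECT b.id, b.value FROM tbl b
    WHERE b.category = 'b'
      AND NOT EXISTS (
          SELECT 1 FROM tbl a
          WHERE a.category = 'a' AND a.value = b.value
      )
    ORDER BY b.id
    """
).fetchall()

print([value for _, value in not_in])
```

Selecting both id and value covers either reading of "or their id" in the question.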