How to split a cell and create a new row in SQL

I have a column that stores multiple comma-separated values. I need to split each row into as many rows as there are values in that column, repeating the remaining values of the row.
Example input:
John 111 2Jan
Sam 222,333 3Jan
Jame 444,555,666 2Jan
Jen 777 4Jan
Output:
John 111 2Jan
Sam 222 3Jan
Sam 333 3Jan
Jame 444 2Jan
Jame 555 2Jan
Jame 666 2Jan
Jen 777 4Jan
P.S : I have seen multiple questions similar to this, but could not find a way to split in such a way.

This solution is built on Vertica, but it should work on any database that offers a function corresponding to SPLIT_PART().
Part of it is the un-pivoting technique, which works on every ANSI-compliant database platform and which I explain here (only the un-pivoting part of that script applies):
Pivot sql convert rows to columns
I'm assuming that the minimalistic date representation is part of the second column of a two-column input table. So, in a first Common Table Expression, I split that short date literal away (a comment lists that CTE's output), and only then split the comma-separated list into tokens.
Here goes:
WITH
-- input
input(name,the_string) AS (
            SELECT 'John', '111 2Jan'
  UNION ALL SELECT 'Sam' , '222,333 3Jan'
  UNION ALL SELECT 'Jame', '444,555,666 2Jan'
  UNION ALL SELECT 'Jen' , '777 4Jan'
)
,
-- put the strange date literal into a separate column
the_list_and_the_date(name,list,datestub) AS (
  SELECT
    name
  , SPLIT_PART(the_string,' ',1)
  , SPLIT_PART(the_string,' ',2)
  FROM input
)
-- debug
-- SELECT * FROM the_list_and_the_date;
-- name|list       |datestub
-- John|111        |2Jan
-- Sam |222,333    |3Jan
-- Jame|444,555,666|2Jan
-- Jen |777        |4Jan
,
-- ten integers (too many for this example) to use as pivoting value and as "index"
ten_ints(idx) AS (
            SELECT 1
  UNION ALL SELECT 2
  UNION ALL SELECT 3
  UNION ALL SELECT 4
  UNION ALL SELECT 5
  UNION ALL SELECT 6
  UNION ALL SELECT 7
  UNION ALL SELECT 8
  UNION ALL SELECT 9
  UNION ALL SELECT 10
)
-- the final query - pivoting prepared input using a CROSS JOIN with ten_ints
-- and filter out where the SPLIT_PART() expression evaluates to the empty string
SELECT
  name
, SPLIT_PART(list,',',idx) AS token
, datestub
FROM the_list_and_the_date
CROSS JOIN ten_ints
WHERE SPLIT_PART(list,',',idx) <> ''
;
name|token|datestub
John|111 |2Jan
Jame|444 |2Jan
Jame|555 |2Jan
Jame|666 |2Jan
Sam |222 |3Jan
Sam |333 |3Jan
Jen |777 |4Jan
Happy playing ...
Marco the Sane
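For readers without Vertica at hand, here is a minimal sketch of the same CROSS JOIN idea run through Python's sqlite3 module. SQLite has no SPLIT_PART(), so the split_part function below is a hypothetical stand-in registered from Python purely for illustration, the people table name is made up, and the date is assumed to sit in its own column already.

```python
import sqlite3

def split_part(text, delimiter, index):
    """Return the 1-based index-th token of text, or '' when there is none."""
    parts = text.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else ""

conn = sqlite3.connect(":memory:")
# Hypothetical helper: makes split_part() callable from SQL in this session.
conn.create_function("split_part", 3, split_part)

conn.executescript("""
    CREATE TABLE people (name TEXT, list TEXT, datestub TEXT);
    INSERT INTO people VALUES
        ('John', '111',         '2Jan'),
        ('Sam',  '222,333',     '3Jan'),
        ('Jame', '444,555,666', '2Jan'),
        ('Jen',  '777',         '4Jan');
""")

rows = conn.execute("""
    WITH RECURSIVE ints(idx) AS (      -- the integers 1..10, used as token index
        SELECT 1
        UNION ALL
        SELECT idx + 1 FROM ints WHERE idx < 10
    )
    SELECT name, split_part(list, ',', idx) AS token, datestub
    FROM people
    CROSS JOIN ints
    WHERE split_part(list, ',', idx) <> ''   -- drop indexes past the last token
    ORDER BY people.rowid, idx
""").fetchall()

for row in rows:
    print(row)
```

The upper bound of 10 is an arbitrary assumption, as in the answer: a list with more tokens than integers would be silently truncated, so the integer range has to cover the longest list.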

Related

How to split comma delimited data from one column into multiple rows

I'm trying to write a query that will have a column show a specific value depending on another comma delimited column. The codes are meant to denote Regular time/overtime/doubletime/ etc. and they come from the previously mentioned comma delimited column. In the original view, there are columns for each of the different hours accrued separately. For the purposes of this, we can say A = regular time, B = doubletime, C = overtime. However, we have many codes that can represent the same type of time.
What my original view looks like:
Employee_FullName|EmpID|Code|Regular Time|Double Time|Overtime
John Doe         |123  |A,B |7           |2          |0
Jane Doe         |234  |B   |4           |0          |1
What my query outputs:
Employee_FullName|EmpID|Code|Hours
John Doe         |123  |A, B|10
John Doe         |123  |A, B|5
Jane Doe         |234  |B   |5
What I want the output to look like:
Employee_FullName|EmpID|Code|Hours
John Doe         |123  |A   |10
John Doe         |123  |B   |5
Jane Doe         |234  |B   |5
The output looks the way it does because the query currently pulls only from the regular time column. I've tried using a CASE expression to look for a specific code and then pull the matching number, but I get a variety of errors no matter how I write it. Here's what my query looks like:
SELECT [Employee_FullName],
       SUBSTRING(col, 1, CHARINDEX(' ', col + ' ' ) -1)'Code',
       hrsValue
FROM
(
    SELECT [Employee_FullName], col, hrsValue
    FROM myTable
    CROSS APPLY
    (
        VALUES ([Code],[RegularHours])
    ) C (COL, hrsValue)
) SRC
Any advice on how to fix it or perspective on what to use is appreciated!
Edit: I cannot change the comma delimited data, it is provided that way. I think a case within a cross apply will solve it but I honestly don't know.
Edit 2: I will be using a unique EmployeeID to identify them. In this case yes A is regular time, B is double time, C is overtime. The complication is that there are a variety of different codes and multiple refer to each type of time. There is never a case where A would refer to regular time for one employee and double time for another, etc. I am on SQL Server 2017. Thank you all for your time!
If you are on SQL Server 2016 or better, you can use OPENJSON() to split up the code values instead of cumbersome string operations:
SELECT t.Employee_FullName,
       Code = LTRIM(j.value),
       Hours = MAX(CASE j.[key]
                     WHEN 0 THEN RegularTime
                     WHEN 1 THEN DoubleTime
                     WHEN 2 THEN Overtime END)
FROM dbo.MyTable AS t
CROSS APPLY OPENJSON('["' + REPLACE(t.Code,',','","') + '"]') AS j
GROUP BY t.Employee_FullName, LTRIM(j.value);
You can use the following code to split up the values.
Note how NULLIF nulls out the CHARINDEX result if it returns 0.
The second half of the second APPLY is conditional on that null.
As written, it only looks for the first comma, so it handles at most two codes per row.
SELECT
    t.[Employee_FullName],
    Code = TRIM(v2.Code),
    v2.Hours
FROM myTable t
CROSS APPLY (VALUES( NULLIF(CHARINDEX(',', t.Code), 0) )) v1(comma)
CROSS APPLY (
    SELECT Code = ISNULL(LEFT(t.Code, v1.comma - 1), t.Code), Hours = t.RegularTime
    UNION ALL
    SELECT SUBSTRING(t.Code, v1.comma + 1, LEN(t.Code)), t.DoubleTime
    WHERE v1.comma IS NOT NULL
) v2;
You can go for a CROSS APPLY based approach, as given below.
Thanks to @Charlieface for the insert script.
CREATE TABLE mytable (
    "Employee_FullName" VARCHAR(8),
    "Code" VARCHAR(3),
    "RegularTime" INTEGER,
    "DoubleTime" INTEGER,
    "Overtime" INTEGER
);
INSERT INTO mytable
    ("Employee_FullName", "Code", "RegularTime", "DoubleTime", "Overtime")
VALUES
    ('John Doe', 'A,B', '10', '5', '0'),
    ('Jane Doe', 'B', '5', '0', '0');
SELECT
    t.[Employee_FullName],
    c.Code,
    CASE WHEN c.code = 'A' THEN t.RegularTime
         WHEN c.code = 'B' THEN t.DoubleTime
         WHEN c.code = 'C' THEN t.Overtime
    END AS Hours
FROM myTable t
CROSS APPLY (select value from string_split(t.code,',')
            ) c(code);
Employee_FullName|Code|Hours
John Doe         |A   |10
John Doe         |B   |5
Jane Doe         |B   |0

Oracle: union all query 1 and query 2 want to minus some rows if query 1 have rowdata

My query is below. I want to remove some rows from query1 when query2 has row data for the same org, but I don't know how to do it:
My query:
with query1 as (
  select wm_concat(linkman_name) name,
         wm_concat(phone_num) phone,
         t.org_id
  from (select linkman_name, phone_num, LINK_ORG_ID, org_id
        from TD_SM_LINKMAN
        where STATE = '2'
          and (LINK_ORG_ID is null or LINK_ORG_ID = '')) t
  group by t.org_id),
query2 as (
  select wm_concat(linkman_name) name,
         wm_concat(phone_num) phone,
         org_id
  from (select linkman_name, phone_num, LINK_ORG_ID, org_id
        from TD_SM_LINKMAN
        where STATE = '2'
          and (LINK_ORG_ID = '55')) t
  group by org_id)
select *
from query1
union all
select *
from query2
minus
-- This doesn't work. I want to remove the query1 rows where query1.org_id = query2.org_id; query2 is meant to be the outer query here.
(select * from query1 where query1.ORG_ID = query2.ORG_ID)
;
sample table
name |phone|link_org_id|org_id
lily |133  |           |1
ming |144  |           |1
hao  |333  |           |2
jane |1234 |55         |2
bob  |666  |           |3
herry|555  |           |3
query 1 result:
name phone org_id
lily,ming 133,144 1
hao 333 2
bob,herry 666,555 3
query 2 result:
name phone org_id
jane 1234 2
For example, jane is selected by query2 and hao by query1. Both are from the same org (org_id = 2), but I don't need hao, I just need jane. How can I do this?
I mean: if query2 finds a result for an org, I don't need query1's result for it; but if query2 finds no data, I need query1's data.
As it stands, you would first have to split the names (and phones) back into rows and only then apply set operators (UNION, MINUS) to that data.
That means you shouldn't use WM_CONCAT at all, at least not at the beginning, because you would
1. first concatenate the data,
2. then split it back into rows,
3. and only then UNION / MINUS the sets,
which makes the first two steps wasted work.
I'd suggest that you UNION / MINUS the data first and then aggregate it using WM_CONCAT. By the way, which database version do you use? WM_CONCAT is a) undocumented and b) no longer exists in recent Oracle database versions, so you'd better switch to LISTAGG, if possible.
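To make "pick the rows first, aggregate afterwards" concrete, here is a rough sketch using Python's sqlite3 module. It is not Oracle code: group_concat stands in for WM_CONCAT / LISTAGG, a NOT EXISTS filter is swapped in for the MINUS step, and the state values (not shown in the sample table) are assumed to be '2' for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE td_sm_linkman (
        linkman_name TEXT, phone_num TEXT, link_org_id TEXT,
        org_id INTEGER, state TEXT
    );
    -- sample rows from the question; state = '2' is an assumption
    INSERT INTO td_sm_linkman VALUES
        ('lily',  '133',  NULL, 1, '2'),
        ('ming',  '144',  NULL, 1, '2'),
        ('hao',   '333',  NULL, 2, '2'),
        ('jane',  '1234', '55', 2, '2'),
        ('bob',   '666',  NULL, 3, '2'),
        ('herry', '555',  NULL, 3, '2');
""")

rows = conn.execute("""
    SELECT org_id,
           group_concat(linkman_name) AS name,
           group_concat(phone_num)    AS phone
    FROM td_sm_linkman AS t
    WHERE state = '2'
      AND (
            link_org_id = '55'                      -- the "query2" rows
         OR (link_org_id IS NULL                    -- the "query1" rows, kept only
             AND NOT EXISTS (SELECT 1               -- when the org has no '55' row
                             FROM td_sm_linkman AS x
                             WHERE x.org_id = t.org_id
                               AND x.state = '2'
                               AND x.link_org_id = '55'))
      )
    GROUP BY org_id
    ORDER BY org_id
""").fetchall()

# The order of values inside group_concat is not guaranteed, so compare as sets.
result = {org_id: (set(name.split(",")), set(phone.split(",")))
          for org_id, name, phone in rows}
print(result)
```

Because the filtering happens on individual rows, nothing has to be split apart again: org 2 keeps only jane, while orgs 1 and 3 fall back to their rows without a link_org_id.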

Oracle SQL. Find Matching values in two different columns and different rows from same table or different one

it is my first question in this community. Any help is welcomed.
Imagine I have a table like this (It can also be having the columns in different tables, I do not mind):
Account_Name_1:
Nike
Pepsi
Coke
Account_Name_2:
Reebok
Coke
Nike
I need to query a list of Account Names who are in "Account_Name_1" and "Account_Name_2"
Which will result as:
Accounts_in_both_columns
Nike
Coke
How can I do this? I have tried with Inner Join but I am not sure,
Thank you :)
extra:
I also have a problem of naming inconsistency across the Account names, some of them are named differently even if they are the same account. Example:
Account_Name_1 Account_Name_2
Nike Reebok
Pepsi Coke
Coke Nike Inc
If we run the same query as before, it will only list 'Coke'.
I have read about UTL_MATCH, the Levenshtein distance algorithm and the JARO_WINKLER_SIMILARITY function. But I have not managed to create a column listing the values that are similar, together with how similar they are, so that I can investigate and decide whether they are the same account or not.
Please keep in mind it is not about same row matching, but value matching in two columns.
Thank you
I think you want union:
select account_name_1 as account_name
from t
union -- on purpose to remove duplicates
select account_name_2
from t;
EDIT:
If you want the values in both columns, just use exists:
select distinct account_name_1 as account_name
from t
where exists (select 1
              from t t2
              where t2.account_name_2 = t.account_name_1
             );
As far as I understood the question, it is intersect you're looking for:
with
tab_1 (col) as
  (select 'Nike' from dual union all
   select 'Pepsi' from dual union all
   select 'Coke' from dual
  ),
tab_2 (col) as
  (select 'Reebok' from dual union all
   select 'Coke' from dual union all
   select 'Nike' from dual
  )
-- code you need follows
select col from tab_1
intersect
select col from tab_2;

COL
------
Coke
Nike
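As a quick cross-check, the EXISTS and INTERSECT suggestions can be run side by side on the question's sample data. The sketch below uses Python's sqlite3 module rather than Oracle, and assumes a single table t with the two columns, as in the EXISTS answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (account_name_1 TEXT, account_name_2 TEXT);
    INSERT INTO t VALUES
        ('Nike',  'Reebok'),
        ('Pepsi', 'Coke'),
        ('Coke',  'Nike');
""")

# Set-based version: values present in both columns, duplicates removed.
intersect_rows = conn.execute("""
    SELECT account_name_1 FROM t
    INTERSECT
    SELECT account_name_2 FROM t
    ORDER BY 1
""").fetchall()

# Correlated version: keep a value if some row has it in the other column.
exists_rows = conn.execute("""
    SELECT DISTINCT account_name_1
    FROM t
    WHERE EXISTS (SELECT 1 FROM t AS t2
                  WHERE t2.account_name_2 = t.account_name_1)
    ORDER BY 1
""").fetchall()

print(intersect_rows)
print(exists_rows)
```

Both queries compare values exactly, so neither would pair 'Nike' with 'Nike Inc'; the naming inconsistency from the question needs a separate similarity step.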

SQL Server: pivoting without aggregation on a table with two columns

This is a question on a test. I have a table with two columns. I want to pivot on one of them and output the other.
Table structure:
(Name varchar(10), Age int)
I need output with age values as columns and Names listed below each age value.
From searching, I only see examples where there is at least one other column that is used to "group by" for want of a better term. In other words, there is a common factor in each row of the output. My problem does not have this property.
I tried:
SELECT
    [agevalue1], [agevalue2], [agevalue3], [agevalue4]
FROM
    (SELECT Name, Age FROM MyClass) AS SourceTable
PIVOT
    (MAX(Name)
     FOR Age IN ([agevalue1], [agevalue2], [agevalue3], [agevalue4])
    ) AS PivotTable;
I specified agevalue* as a string, i.e. in quotes. I got the column headings alright but a row of NULLS below them.
P.S.: The solution does not need to use pivot but I couldn't think of an alternative approach.
Sample Data:
Name Age
Bob 11
Rick 25
Nina 30
Sam 11
Cora 16
Rachel 25
Desired output:
11 16 25 30
Bob Cora Rick Nina
Sam NULL Rachel NULL
Try this :
with tab as
(
    Select 'A' Name, 10 Age union all
    Select 'B',11 union all
    Select 'c',10 union all
    Select 'D',11 union all
    Select 'E',11 union all
    Select 'F',11
)
select distinct
    Age
  , stuff((
        select ',' + g.Name
        from tab g
        where g.age = g1.age
        order by g.age
        for xml path('')
    ),1,1,'') as Names_With_Same_Age
from tab g1
group by g1.age,Name
To group these together in one row:
11 16 25 30
Bob Cora Rick Nina
and separate them from another set, like:
11 16 25 30
Sam NULL Rachel NULL
the rows must differ in some value, since MAX(Name) on its own would return only one Name for each Age.
This query numbers the rows within each Age and then pivots the result. As you said, PIVOT groups by all columns not referenced in the PIVOT clause, so it groups by this row indexer, which separates the values the way you wanted.
;WITH IndexedClass AS
(
    SELECT
        M.Name,
        M.Age,
        -- The ordering will determine which person goes first for each Age
        RowIndexer = ROW_NUMBER() OVER (PARTITION BY M.Age ORDER BY M.Name)
    FROM
        MyClass AS M
)
SELECT
    P.[11],
    P.[16],
    P.[25],
    P.[30]
FROM
    IndexedClass AS I
    PIVOT (
        MAX(I.Name) FOR I.Age IN ([11], [16], [25], [30])
    ) AS P
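The row-indexer idea does not depend on the PIVOT keyword. As a rough sketch, here it is with plain conditional aggregation, run through Python's sqlite3 module (this assumes an SQLite build with window functions, 3.25 or later). Grouping by RowIndexer plays the role that PIVOT's implicit grouping plays above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyClass (Name TEXT, Age INTEGER);
    INSERT INTO MyClass VALUES
        ('Bob', 11), ('Rick', 25), ('Nina', 30),
        ('Sam', 11), ('Cora', 16), ('Rachel', 25);
""")

rows = conn.execute("""
    WITH IndexedClass AS (
        SELECT Name, Age,
               -- 1 for the first name within an age, 2 for the second, ...
               ROW_NUMBER() OVER (PARTITION BY Age ORDER BY Name) AS RowIndexer
        FROM MyClass
    )
    SELECT MAX(CASE WHEN Age = 11 THEN Name END) AS "11",
           MAX(CASE WHEN Age = 16 THEN Name END) AS "16",
           MAX(CASE WHEN Age = 25 THEN Name END) AS "25",
           MAX(CASE WHEN Age = 30 THEN Name END) AS "30"
    FROM IndexedClass
    GROUP BY RowIndexer            -- one output row per position within an age
    ORDER BY RowIndexer
""").fetchall()

for row in rows:
    print(row)
```

Since the names are ordered alphabetically within each age, Rachel lands in the first row and Rick in the second, which differs from the order in the question's desired output; changing the ORDER BY inside ROW_NUMBER() changes who goes first.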

Conditionally append a character in select statement

Functionality I'm trying to add to my DB2 stored procedure:
Select a MIN() date from a joined table column.
IF there was more than one row in this joined table, append a " * " to the date.
Thanks, any help or guidance is much appreciated.
It's not clear which flavor of DB2 is needed nor if any suggestion worked. This works on DB2 for i:
SELECT
    T1.joinCol1,
    max( T2.somedateColumn ),
    count( t2.somedateColumn ),
    char(max( T2.somedateColumn )) concat case when count( T2.somedateColumn )>1 then '*' else '' end
FROM joinFile1 t1 join joinFile2 t2
    on joinCol1 = joinCol2
GROUP BY T1.joinCol1
ORDER BY T1.joinCol1
The SQL is fairly generic, so it should translate to many environments and versions.
Substitute table and column names as needed. The COUNT() here actually counts rows from the JOIN rather than the number of times the specific date occurs. If a count of duplicate dates is needed, then some changes to this example are also needed.
Hope this helps
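Here is a small sketch of the same pattern, run through Python's sqlite3 module instead of DB2. It uses MIN(), as the question asks, and SQLite's || operator in place of CONCAT; the parent and child tables and their columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical tables standing in for the joined files
    CREATE TABLE parent (id INTEGER);
    CREATE TABLE child (parent_id INTEGER, some_date TEXT);
    INSERT INTO parent VALUES (1), (2);
    INSERT INTO child VALUES
        (1, '2020-03-01'), (1, '2020-01-15'),
        (2, '2021-06-30');
""")

rows = conn.execute("""
    SELECT p.id,
           MIN(c.some_date)
           || CASE WHEN COUNT(c.some_date) > 1 THEN '*' ELSE '' END AS first_date
    FROM parent AS p
    JOIN child AS c ON c.parent_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

Parent 1 has two joined rows, so its earliest date carries the asterisk; parent 2 has a single row and its date comes back unchanged.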
Say I have result coming as
1 Jeff 1
2 Jeff 333
3 Jeff 77
4 Jeff 1
5 Jeff 14
6 Bob 22
7 Bob 4
8 Bob 5
9 Bob 6
Here the value 1 appears twice (in the third column).
So this query returns the count 2 with a * appended to it:
SELECT A.USER_VAL,
       DECODE(A.CNT, '1', A.CNT, '0', A.CNT, CONCAT(A.CNT, '*')) AS CNT
FROM (SELECT DISTINCT BT.USER_VAL, CAST(COUNT(*) AS VARCHAR2(2)) AS CNT
      FROM SO_BUFFER_TABLE_8 BT
      GROUP BY BT.USER_VAL) A