The motivation for this challenge was to easily and accurately simulate a data set of IP ranges that relate to each other in a certain way.
The Challenge
A table contains a single column of text type.
The text contains one or more lines, where each line contains one or more sections made of dashes.
The goal is to write a query that returns a tuple for each section with its start point and end point.
E.g.
'
--- -- -
 ----
'
The text above contains 2 lines.
It contains 4 sections.
The 1st line contains 3 sections.
The 2nd line contains 1 section.
The tuples for the 1st line are (1,3),(5,6),(8,8).
The tuple for the 2nd line is (2,5).
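To make the expected output concrete, here is a minimal Python sketch of the task itself (not one of the SQL solutions). It assumes positions are 1-based and inclusive, and that the second example line starts with one space, as its tuple (2,5) implies. The function name sections_per_line is made up for this illustration.

```python
import re

def sections_per_line(text):
    """Return {line_ind: [(start, end), ...]} with 1-based, inclusive columns."""
    result = {}
    # Drop the newline after the opening quote and the one before the closing quote.
    lines = text.strip("\n").split("\n")
    for line_ind, line in enumerate(lines, start=1):
        # m.start() is 0-based, so add 1; m.end() is exclusive, so it already
        # equals the 1-based position of the last dash.
        result[line_ind] = [(m.start() + 1, m.end()) for m in re.finditer(r"-+", line)]
    return result

example = "\n--- -- -\n ----\n"
print(sections_per_line(example))
# {1: [(1, 3), (5, 6), (8, 8)], 2: [(2, 5)]}
```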
Requirements
The solution should be a single SQL query (sub-queries are fine).
The use of T-SQL, PL/SQL etc. is not allowed.
The use of UDFs (User Defined Functions) is not allowed.
If needed, we might assume that there is only a single record in the table.
Sample Data
create table t (txt varchar (1000) not null);
insert into t (txt) values
(
'
 --- ---  ---   ---
----------          -
 - - -- -- --- ---
      ----- ---- --- -- -
   -------
'
);
Requested Result
* Only the last 2 columns (section_start/end) are required, the rest are for debugging purposes.
line_ind section_ind section_length section_start section_end
-------- ----------- -------------- ------------- -----------
       1           1              3             2           4
       1           2              3             6           8
       1           3              3            11          13
       1           4              3            17          19
       2           1             10             1          10
       2           2              1            21          21
       3           1              1             2           2
       3           2              1             4           4
       3           3              2             6           7
       3           4              2             9          10
       3           5              3            12          14
       3           6              3            16          18
       4           1              5             7          11
       4           2              4            13          16
       4           3              3            18          20
       4           4              2            22          23
       4           5              1            25          25
       5           1              7             4          10
Oracle
SELECT row_n AS line_ind
      ,dense_rank() over(PARTITION BY row_n ORDER BY s_beg) AS section_ind
      ,s_end - s_beg AS section_length
      ,s_beg - decode(row_n, 0, 0, instr(a,chr(10),1,row_n)) AS section_start
      ,s_end - decode(row_n, 0, 0, instr(a,chr(10),1,row_n)) -1 AS section_end
FROM (SELECT a
            ,s_beg
            ,DECODE(s_end, 0, length(a) + 1, s_end) AS s_end
            ,length(substr(a, 1, s_beg))
             - length(REPLACE(substr(a, 1, s_beg), chr(10))) AS row_n
            ,lvl
      FROM (SELECT txt as a
                  ,DECODE(LEVEL, 1, 0, regexp_instr(txt , '\s|\n', 1, LEVEL - 1)) + 1 AS s_beg
                  ,regexp_instr(txt , '\s|\n', 1, LEVEL) AS s_end
                  ,LEVEL AS lvl
            FROM t
            CONNECT BY LEVEL <= length(txt ) - length(regexp_replace(txt , '\s|\n')) + 1)
     )
WHERE s_beg != s_end;
Oracle
select regexp_instr (txt,'-+',1,level,0) - instr (txt,chr(10),regexp_instr (txt,'-+',1,level,0) - length (txt) - 1,1) as section_start
,regexp_instr (txt,'-+',1,level,1) - 1 - instr (txt,chr(10),regexp_instr (txt,'-+',1,level,0) - length (txt) - 1,1) as section_end
from t
connect by level <= regexp_count (txt,'-+')
;
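The second Oracle query appears to work by finding every run of dashes in the whole text and subtracting the position of the closest preceding newline (the negative position passed to instr searches backwards). Below is a rough Python sketch of that same idea, not a translation of the query. The sample text is rebuilt from the Requested Result table, so its exact spacing is an assumption, and the name sections_global is made up.

```python
import re

def sections_global(text):
    """Return [(section_start, section_end), ...] for every run of dashes in text."""
    out = []
    for m in re.finditer(r"-+", text):
        # 0-based index of the closest newline before the section (-1 if none),
        # playing the role of the backwards instr() search in the query.
        newline_idx = text.rfind("\n", 0, m.start())
        out.append((m.start() - newline_idx, m.end() - 1 - newline_idx))
    return out

# Sample text; spacing reconstructed from the Requested Result table (assumption).
sample = "\n" + "\n".join([
    " --- ---" + " " * 2 + "---" + " " * 3 + "---",
    "-" * 10 + " " * 10 + "-",
    " - - -- -- --- ---",
    " " * 6 + "----- ---- --- -- -",
    " " * 3 + "-" * 7,
]) + "\n"

for start, end in sections_global(sample):
    print(start, end)
```

The per-line offset falls out of a single subtraction, so no explicit line splitting is needed.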
Teradata
with l
as
(
select line_ind
,line
from table
(
regexp_split_to_table (-1,t.txt,'\r','')
returns (minus_one int,line_ind int,line varchar(1000))
)
as l
)
select l.line_ind
,s.section_ind
,regexp_instr (l.line,'\S+',1,s.section_ind,0) as section_start
,regexp_instr (l.line,'\S+',1,s.section_ind,1) - 1 as section_end
,char_length (s.section) as section_length
from table
(
regexp_split_to_table (l.line_ind,l.line,'\s+','')
returns (line_ind int,section_ind int,section varchar(1000))
)
as s
,l
where l.line_ind =
s.line_ind
order by l.line_ind
,s.section_ind
;
Related
I need to generate a sequence in SQL Server 2016 database based on the following logic.
I have three fields each represents ID of Brand, Category and the Product. A brand could have multiple categories and each category could have multiple Products.
I want to generate a Sequence based on the values in these 3 fields
BrandNum  CategoryNum  ProductID
1         1            1
1         2            1
1         1            2
1         2            2
1         3            1
1         4            1
2         1            1
2         1            2
2         1            3
2         1            4
2         1            10
2         1            20
10        10           10
11        9            2
2         10           1
2         1            200
For example if Brand Number is 1, category is 1 and ItemID is 1 then I want 1100001. The first 1 from left represents the Brand Number, second number from left represents the category number and right most number 1 is the item number.
As another example, if the brand number is 4, the category is 5 and the ItemID is 100, then I need to generate 4500100.
The following logic works (there could be a better way of writing it):
Select BrandNum*1000000+CategoryNum*100000+ItemID From Table
It fails, however, when the brand number is 10 and the category number is 10, for any ItemID (say 120).
The above code yields 11000120, but what I want is 101000120 (the first 10 is the brand number, the second 10 is the category number, and the last 5 digits are the item).
Could you please advise me what logic I should use to get this output?
It seems that what you ultimately want is an NVARCHAR with a fixed-width segment for each field. For the first row that is:
010100001, because the brand is 1 (01), the category is 1 (01) and the product is 1 (00001).
To get this, cast each value to NVARCHAR and left-pad the resulting string with zeros:
DECLARE @Product TABLE (Brand INT, Category INT, Product INT)
INSERT INTO @Product (Brand, Category, Product) VALUES
(1 , 1 , 1 ), (1 , 2 , 1 ), (1 , 1 , 2 ), (1 , 2 , 2 ),
(1 , 3 , 1 ), (1 , 4 , 1 ), (2 , 1 , 1 ), (2 , 1 , 2 ),
(2 , 1 , 3 ), (2 , 1 , 4 ), (2 , 1 , 10 ), (2 , 1 , 20 ),
(10, 10, 10 ), (11, 9 , 2 ), (2 , 10, 1 ), (2 , 1 , 200)
-- Left-pad each segment with zeros: 2 digits brand, 2 digits category, 5 digits product
SELECT *,
       RIGHT('00'+CAST(Brand AS NVARCHAR(2)),2)
       +RIGHT('00'+CAST(Category AS NVARCHAR(2)),2)
       +RIGHT('00000'+CAST(Product AS NVARCHAR(5)),5) AS SKU
FROM @Product
Brand  Category  Product  SKU
-----------------------------------
1      1         1        010100001
1      2         1        010200001
1      1         2        010100002
1      2         2        010200002
1      3         1        010300001
1      4         1        010400001
2      1         1        020100001
2      1         2        020100002
2      1         3        020100003
2      1         4        020100004
2      1         10       020100010
2      1         20       020100020
10     10        10       101000010
11     9         2        110900002
2      10        1        021000001
2      1         200      020100200
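The same fixed-width idea in a short Python sketch, next to the arithmetic version from the question, to show where the multiplication goes wrong. The function name sku and the range check are additions made for this illustration only.

```python
def sku(brand, category, product):
    """Fixed-width key: 2 digits brand, 2 digits category, 5 digits product."""
    # Range check added for the sketch: a value wider than its segment
    # would otherwise shift every digit to its right.
    if not (0 <= brand <= 99 and 0 <= category <= 99 and 0 <= product <= 99999):
        raise ValueError("value does not fit its segment")
    return f"{brand:02d}{category:02d}{product:05d}"

# Arithmetic version from the question: a two-digit category
# carries over into the brand digit.
print(10 * 1000000 + 10 * 100000 + 120)  # 11000120
print(sku(10, 10, 120))                  # 101000120
print(sku(1, 1, 1))                      # 010100001
```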
You may use CONCAT and FORMAT functions as the following:
select BrandNum, CategoryNum, ProductID,
concat(BrandNum, CategoryNum, format(ProductID,'00000')) sequence
from table_name
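One thing worth checking with the CONCAT variant is that only the product segment is padded. The Python sketch below mimics that behaviour (concat_key is a made-up name, and the f-string stands in for CONCAT and FORMAT). It reproduces the values asked for in the question, but two different brand/category pairs can end up with the same key.

```python
def concat_key(brand, category, product):
    # Mimics CONCAT(BrandNum, CategoryNum, FORMAT(ProductID, '00000')):
    # brand and category keep their natural width, product is padded to 5 digits.
    return f"{brand}{category}{product:05d}"

print(concat_key(1, 1, 1))      # 1100001
print(concat_key(10, 10, 120))  # 101000120

# Brand 1 / category 11 and brand 11 / category 1 are indistinguishable.
print(concat_key(1, 11, 5), concat_key(11, 1, 5))  # 11100005 11100005
```

If keys must be unique across all brand/category combinations, the fixed-width padding of every segment avoids this collision.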
I have a query like this:
select 1,2,3 from dual
union all
select 1,2,3 from dual
When I need to add a new row, I add another union all, and that's OK. But the problem appears when I need many unions, for example 20. It is annoying and inefficient to write another 17 unions. Is there a way (some procedure, function, whatever) to do this faster and more elegantly?
No problem, easy-peasy.
SQL> select 1, 2, 3
  2  from dual
  3  connect by level <= 10;

         1          2          3
---------- ---------- ----------
         1          2          3
         1          2          3
         1          2          3
         1          2          3
         1          2          3
         1          2          3
         1          2          3
         1          2          3
         1          2          3
         1          2          3

10 rows selected.

SQL>
Sometimes it's easier to use json_table in such cases:
select *
from json_table(
'{data:[
[1,2,3,"abc"],
[2,3,4,"def"],
[3,4,5,"xyz"],
]
}'
,'$.data[*]'
columns
a number path '$[0]',
b number path '$[1]',
c number path '$[2]',
d varchar2(30) path '$[3]'
);
Results:
         A          B          C D
---------- ---------- ---------- ------------------------------
         1          2          3 abc
         2          3          4 def
         3          4          5 xyz
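For comparison, the same "array of arrays to rows" mapping sketched in Python. Python's json module is strict, so unlike the literal above the keys are quoted and there is no trailing comma. The column names a to d simply follow the columns clause of the query.

```python
import json

# Strict JSON: quoted key, no trailing comma after the last inner array.
doc = '{"data": [[1, 2, 3, "abc"], [2, 3, 4, "def"], [3, 4, 5, "xyz"]]}'

columns = ("a", "b", "c", "d")
# Pair each positional element with its column name, like path '$[0]' .. '$[3]'.
rows = [dict(zip(columns, item)) for item in json.loads(doc)["data"]]

for row in rows:
    print(row)
```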
A variation on Littlefoot's answer:
select 1, 2, 3
from xmltable('1 to 20');
I have a table, table1, with a column line of type CLOB.
Here are the values:
seq  line
------------------------------
1    ISA*00*TEST
     ISA*00*TEST1
     GS*123GG*TEST*456:EHE
     ST*ERT*RFR*EDRR*EER
     GS*123GG*TEST*456:EHE
------------------------------
2    ISA*01*TEST
     GS*124GG*TEST*456:EHE
     GS*125GG*TEST*456:EHE
     ST*ERQ*RFR*EDRR*EER
     ST*ERW*RFR*EDRR*EER
     ST*ERR*RFR*EDRR*EER
I am trying to find the distinct string of the substring before the second star.
The output would be:
distinct_line_value count
ISA*00 2
GS*123GG 2
ST*ERT 1
ISA*01 1
GS*124GG 1
GS*125GG 1
ST*ERQ 1
ST*ERW 1
ST*ERR 1
Any ideas how I can do it based on distinct for the first 2 stars?
Here's one option:
Test case:
SQL> select * from test;
SEQ LINE
---------- --------------------------------------------------
1 ISA*00*TEST
ISA*00*TEST1
GS*123GG*TEST*456:EHE
ST*ERT*RFR*EDRR*EER
GS*123GG*TEST
2 ISA*01*TEST
GS*124GG*TEST*456:EHE
GS*125GG*TEST*456:EHE
ST*ERQ*RFR*EDRR*EER
ST*E
Query (see comments within the code; apart from that REGEXP_SUBSTR is crucial here, along with its 'm' match parameter which treats the input string as multiple lines):
SQL> with
  2  -- split CLOB values to rows
  3  inter as
  4    (select seq,
  5            regexp_substr(line, '^.*$', 1, column_value, 'm') res
  6     from test,
  7          table(cast(multiset(select level from dual
  8                              connect by level <= regexp_count(line, chr(10)) + 1
  9                             ) as sys.odcinumberlist))
 10    ),
 11  -- convert CLOB to VARCHAR2 (so that SUBSTR works)
 12  inter2 as
 13    (select to_char(res) res from inter)
 14  -- the final result
 15  select substr(res, 1, instr(res, '*', 1, 2)) val, count(*)
 16  from inter2
 17  group by substr(res, 1, instr(res, '*', 1, 2))
 18  order by 1;
VAL COUNT(*)
-------------------------------------------------- ----------
GS*123GG* 2
GS*124GG* 1
GS*125GG* 1
ISA*00* 2
ISA*01* 1
ST*ERQ* 1
ST*ERR* 1
ST*ERT* 1
ST*ERW* 1
9 rows selected.
SQL>
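The two steps (split each CLOB into lines, then group by the prefix before the second star) can be sketched in Python as below. Unlike the query output above, this version drops the trailing star, which matches the output format in the question. Skipping lines with fewer than two fields is an assumption, and prefix_counts is a made-up name.

```python
from collections import Counter

def prefix_counts(clobs):
    """Count lines by the text before the second '*' across all CLOB values."""
    counts = Counter()
    for clob in clobs:
        for line in clob.splitlines():
            parts = line.split("*")
            if len(parts) >= 2:  # assumption: ignore lines without a star
                counts["*".join(parts[:2])] += 1
    return counts

# Values taken from the question.
clobs = [
    "ISA*00*TEST\nISA*00*TEST1\nGS*123GG*TEST*456:EHE\n"
    "ST*ERT*RFR*EDRR*EER\nGS*123GG*TEST*456:EHE",
    "ISA*01*TEST\nGS*124GG*TEST*456:EHE\nGS*125GG*TEST*456:EHE\n"
    "ST*ERQ*RFR*EDRR*EER\nST*ERW*RFR*EDRR*EER\nST*ERR*RFR*EDRR*EER",
]

for value, count in prefix_counts(clobs).items():
    print(value, count)
```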
The question may be very simple, but I don't know how to solve it.
I have this table structure:
sno  left  Right
1    2     1
2    2     2
3    1     2
4    3     1
5    2     4
6    7     1
7    2     8
How do I get a result set like the one below?
sno  left  Right  Result
1    2     1      1
2    2     2      2
3    1     2      1
4    3     1      1
5    2     4      2
6    7     1      1
7    2     8      2
I want to select the minimum value that the two columns have in common.
E.g. 3 and 1:
1 is the minimum of the two, and 1 is matched within 3, so the matched value is 1.
E.g. 2 and 4:
2 is the minimum of the two, and 2 is matched within 4, so the matched value is 2.
Edited:
If we take 8 and 2, for example:
8 contains (1,2,3,4,5,6,7,8)
2 contains (1,2)
So the result is 2,
because 2 values are matched here.
I hope I explained it well, thanks.
The following SQL returns the absolute difference between the left and right values in a column named Result; ABS makes the difference positive. LEFT and RIGHT are reserved words in SQL Server, so the column names are bracketed here.
SELECT
sno,
[left],
[Right],
ABS([left] - [Right]) AS Result
FROM tablename
One of the possible solutions:
DECLARE @t TABLE ( sno INT, l INT, r INT )
INSERT INTO @t
VALUES ( 1, 2, 1 ),
       ( 2, 2, 2 ),
       ( 3, 1, 2 ),
       ( 4, 3, 1 ),
       ( 5, 2, 4 ),
       ( 6, 7, 1 ),
       ( 7, 2, 8 )
-- Treat l and r as a two-row set and take its minimum
SELECT *,
       (SELECT MIN(v) FROM (VALUES(l),(r)) m(v)) AS m
FROM @t
Output:
sno  l  r  m
1    2  1  1
2    2  2  2
3    1  2  1
4    3  1  1
5    2  4  2
6    7  1  1
7    2  8  2
case
when left < right then left
else right
end
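A quick Python check, using the rows from the question, that the row-wise minimum of the two columns gives the requested Result column. This only confirms the expected values, it is not one of the posted answers.

```python
# (sno, left, right) rows from the question
rows = [(1, 2, 1), (2, 2, 2), (3, 1, 2), (4, 3, 1), (5, 2, 4), (6, 7, 1), (7, 2, 8)]

# Same logic as the CASE expression: take left when it is smaller, else right.
result = [(sno, left, right, left if left < right else right)
          for sno, left, right in rows]

for row in result:
    print(*row)
```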
I have a denormalized table, something like:
CODES
ID | VALUE
10 | A,B,C
11 | A,B
12 | A,B,C,D,E,F
13 | R,T,D,W,W,W,W,W,S,S
The job is to convert it so that each token from VALUE generates a new row. Example:
CODES_TRANS
ID | VALUE_TRANS
10 | A
10 | B
10 | C
11 | A
11 | B
What is the best way to do it in PL/SQL without custom PL/SQL packages, ideally in pure SQL?
The obvious solution is to implement it with cursors. Any ideas?
Another alternative is to use the model clause:
SQL> select id
  2       , value
  3    from codes
  4   model
  5         return updated rows
  6         partition by (id)
  7         dimension by (-1 i)
  8         measures (value)
  9         ( value[for i from 0 to length(value[-1])-length(replace(value[-1],',')) increment 1]
 10           = regexp_substr(value[-1],'[^,]+',1,cv(i)+1)
 11         )
 12   order by id
 13       , i
 14  /
ID VALUE
---------- -------------------
10 A
10 B
10 C
11 A
11 B
12 A
12 B
12 C
12 D
12 E
12 F
13 R
13 T
13 D
13 W
13 W
13 W
13 W
13 W
13 S
13 S
21 rows selected.
I have written up to 6 alternatives for this type of query in this blogpost: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html
Regards,
Rob.
I have a pure SQL solution for you.
I adapted a trick I found on an old Ask Tom site, posted by Mihail Bratu. My adaptation uses regex to tokenise the VALUE column, so it requires 10g or higher.
The test data.
SQL> select * from t34
2 /
ID VALUE
---------- -------------------------
10 A,B,C
11 A,B
12 A,B,C,D,E,F
13 R,T,D,W1,W2,W3,W4,W5,S,S
SQL>
The query:
SQL> select t34.id
  2         , t.column_value value
  3  from t34
  4       , table(cast(multiset(
  5            select regexp_substr (t34.value, '[^(,)]+', 1, level)
  6            from dual
  7            connect by level <= length(value)
  8         ) as sys.dbms_debug_vc2coll )) t
  9  where t.column_value != ','
 10  /
ID VALUE
---------- -------------------------
10 A
10 B
10 C
11 A
11 B
12 A
12 B
12 C
12 D
12 E
12 F
13 R
13 T
13 D
13 W1
13 W2
13 W3
13 W4
13 W5
13 S
13 S
21 rows selected.
SQL>
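For reference, the transformation itself is small when written outside SQL. This Python sketch uses the t34 test data above and produces one (id, token) pair per token; explode is a made-up name, and empty tokens are skipped on the assumption that they carry no value.

```python
def explode(rows):
    """Turn (id, 'A,B,C') pairs into one (id, token) pair per token."""
    return [(row_id, token)
            for row_id, value in rows
            for token in value.split(",")
            if token]  # assumption: empty tokens are dropped

codes = [
    (10, "A,B,C"),
    (11, "A,B"),
    (12, "A,B,C,D,E,F"),
    (13, "R,T,D,W1,W2,W3,W4,W5,S,S"),
]

for row_id, token in explode(codes):
    print(row_id, token)
```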
Based on Celko's book, here is what I found and it's working well!
SELECT
TABLE1.ID
, MAX(SEQ1.SEQ) AS START_POS
, SEQ2.SEQ AS END_POS
, COUNT(SEQ2.SEQ) AS PLACE
FROM
TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
WHERE
SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
AND SEQ1.SEQ < SEQ2.SEQ
AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ
Here V_SEQ is a static table with one field:
SEQ, holding the integer values 1 through N, where N >= MAX_LENGTH(VALUE).
This relies on the fact that the VALUE is wrapped by ',' on both ends, like this:
,A,B,C,D,
If your tokens are fixed length (as in my case), you can simply use the PLACE field to calculate the actual string. If they are variable length, use START_POS and END_POS.
In my case the tokens are 2 characters long, so the final SQL is:
SELECT
TABLE1.ID
, SUBSTR(TABLE1.VALUE, T_SUB.PLACE * 3 - 2 , 2 ) AS SINGLE_VAL
FROM
(
SELECT
TABLE1.ID
, MAX(SEQ1.SEQ) AS START_POS
, SEQ2.SEQ AS END_POS
, COUNT(SEQ2.SEQ) AS PLACE
FROM
TABLE1, V_SEQ SEQ1, V_SEQ SEQ2
WHERE
SUBSTR(',' || TABLE1.VALUE || ',', SEQ1.SEQ, 1) = ','
AND SUBSTR(',' || TABLE1.VALUE || ',', SEQ2.SEQ, 1) = ','
AND SEQ1.SEQ < SEQ2.SEQ
AND SEQ2.SEQ <= LENGTH(TABLE1.VALUE)
GROUP BY TABLE1.ID, TABLE1.VALUE, SEQ2.SEQ
) T_SUB
INNER JOIN
TABLE1 ON TABLE1.ID = T_SUB.ID
ORDER BY TABLE1.ID, T_SUB.PLACE
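The idea behind the sequence-table join can be sketched in Python as follows: every comma position acts as an END_POS, the closest earlier comma is its START_POS, and the number of earlier commas is the token's PLACE. This is an illustration under assumptions, not a translation of the SQL: it assumes the stored VALUE is not already wrapped in commas, it does not reproduce the SEQ2 upper bound from the query, and split_by_positions is a made-up name.

```python
def split_by_positions(value):
    """Return [(place, start_pos, end_pos, token), ...] using 1-based positions."""
    wrapped = "," + value + ","
    # Stand-in for joining V_SEQ against the comma positions of the wrapped string.
    commas = [pos for pos, ch in enumerate(wrapped, start=1) if ch == ","]
    out = []
    for end_pos in commas:
        earlier = [pos for pos in commas if pos < end_pos]
        if not earlier:
            continue  # the leading comma has no partner to its left
        start_pos = max(earlier)  # MAX(SEQ1.SEQ)
        place = len(earlier)      # COUNT(...) per END_POS
        # Characters strictly between the two commas.
        token = wrapped[start_pos:end_pos - 1]
        out.append((place, start_pos, end_pos, token))
    return out

for place, start_pos, end_pos, token in split_by_positions("AA,BB,CC"):
    print(place, start_pos, end_pos, token)
# 1 1 4 AA
# 2 4 7 BB
# 3 7 10 CC
```

With fixed 2-character tokens the place alone is enough: the token sits at 1-based offset place * 3 - 2 of the unwrapped value, which is what the SUBSTR in the final query computes.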
Original Answer
In SQL Server TSQL we parse strings and make a table object. Here is sample code - maybe you can translate it.
http://rbgupta.blogspot.com/2007/10/tsql-parsing-delimited-string-into.html
Second Option
Count the number of commas per row and take the maximum. Let's say that the row with the most commas in the entire table has 5. Build a SELECT with 5 substrings. This makes it a set-based operation, which should be much faster than RBAR (row-by-agonizing-row) processing.