I want to generate composite sequences in the following format:
<Alphabet><2 digit numeric code>
Each alphabet series will have numeric values ranging from 00 to 99.
The initial value will be A00, the subsequent values will be A01, A02 and so on. Upon reaching A99, the next sequence should carry-on to B00. When the "B" series is exhausted, it will move over to the C-series (i.e. C00) and so on. The sequence will continue until it reaches Z99 - at which point it will reset back to A00.
How can this be done in SQL (or PL/SQL)?
Personally I would store just a NUMBER and then calculate the "composite sequence" on the fly with something like:
select
chr(ascii('A') + mod(trunc(number_sequence / 100), 26)) || to_char(mod(number_sequence, 100), 'FM00') composite_sequence,
...
from mytable
(26 assumes the English alphabet; adjust it for your desired alphabet.)
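The arithmetic behind this idea can be checked outside the database. The following is a minimal Python sketch, not Oracle code: the function name composite_code and its letters parameter are made up for illustration. Integer division by 100 picks the letter, the remainder gives the two-digit code, and the modulo on the letter index makes Z99 wrap back to A00.

```python
def composite_code(n: int, letters: int = 26) -> str:
    """Map a plain counter to a code in A00..Z99, wrapping after Z99."""
    # Every block of 100 counter values shares one letter.
    letter = chr(ord("A") + (n // 100) % letters)
    # The remainder is the numeric part, zero-padded to two digits.
    return f"{letter}{n % 100:02d}"


for n in (0, 1, 99, 100, 2599, 2600):
    print(n, composite_code(n))
```

The counter itself can keep growing forever; only the derived code wraps.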
Use:
SELECT CHR(x.ascii) || LPAD(y.num - 1, 2, '0') AS val
FROM (SELECT 64 + LEVEL AS ascii
FROM DUAL
CONNECT BY LEVEL <= 26) x,
(SELECT LEVEL AS num
FROM DUAL
CONNECT BY LEVEL <= 100) y
Oracle 18c:
Using an SQL query, I want to generate a list of coordinates that make up the line segments of a square grid graph:
STARTPOINT_X STARTPOINT_Y ENDPOINT_X ENDPOINT_Y
------------ ------------ ---------- ----------
0 0 1 0 --horizontal lines
1 0 2 0
2 0 3 0
3 0 4 0
4 0 5 0
5 0 6 0
...
0 0 0 1 --vertical lines
0 1 0 2
0 2 0 3
0 3 0 4
0 4 0 5
0 5 0 6
...
[220 rows selected]
Details:
The lines would be split at each intersection. So, in the image above, there are 220 lines. Each line is composed of two vertices.
Ideally, I would have the option of specifying in the query what the overall grid dimensions would be. For example, specify this somewhere in the SQL: DIMENSIONS = 10 x 10 (or DIMENSIONS = 100 x 100, etc.).
To keep things simple, we can assume the grid's overall shape will always be a square (length = width). And we can make the cell size 1 unit.
I've supplied sample data in this db<>fiddle. I created that data using Excel.
Hint: The vertical grid lines start at row 111.
The reason I want to generate this data is:
I want sample line data to work with when testing Oracle Spatial queries. Sometimes I need a few hundred lines. Other times, I need thousands of lines.
Also, if the lines are in a grid, then it will be obvious if any lines are missing in my results (by looking at the data in mapping software and spotting gaps).
How can I generate those grid line coordinates using SQL?
Related: Generate grid line features using SQL
You can use:
WITH range (v) AS (
SELECT LEVEL - 1 FROM DUAL CONNECT BY LEVEL <= 11
)
SELECT x.v AS startpoint_x,
y.v AS startpoint_y,
x.v + 1 AS endpoint_x,
y.v AS endpoint_y
FROM range x CROSS JOIN range y
WHERE x.v <= 9
UNION ALL
SELECT x.v AS startpoint_x,
y.v AS startpoint_y,
x.v AS endpoint_x,
y.v + 1 AS endpoint_y
FROM range x CROSS JOIN range y
WHERE y.v <= 9
or, more generally:
WITH range (v) AS (
SELECT LEVEL - 1 FROM DUAL CONNECT BY LEVEL - 1 <= GREATEST(:max_x, :max_y)
)
SELECT x.v AS startpoint_x,
y.v AS startpoint_y,
x.v + 1 AS endpoint_x,
y.v AS endpoint_y
FROM range x CROSS JOIN range y
WHERE x.v < :max_x
AND y.v <= :max_y
UNION ALL
SELECT x.v AS startpoint_x,
y.v AS startpoint_y,
x.v AS endpoint_x,
y.v + 1 AS endpoint_y
FROM range x CROSS JOIN range y
WHERE x.v <= :max_x
AND y.v < :max_y
db<>fiddle here
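To sanity-check the row count, the same logic can be mirrored outside the database. This is a minimal Python sketch under the same assumptions as the general query (unit cells, x running from 0 to max_x and y from 0 to max_y); the function name grid_segments is made up for illustration.

```python
def grid_segments(max_x: int, max_y: int):
    """Return (start_x, start_y, end_x, end_y) tuples for a unit grid."""
    # Horizontal segments: x < max_x, y <= max_y.
    horizontal = [(x, y, x + 1, y)
                  for y in range(max_y + 1)
                  for x in range(max_x)]
    # Vertical segments: x <= max_x, y < max_y.
    vertical = [(x, y, x, y + 1)
                for x in range(max_x + 1)
                for y in range(max_y)]
    return horizontal + vertical


segments = grid_segments(10, 10)
print(len(segments))  # 10*11 horizontal + 11*10 vertical = 220
```

For a 10 x 10 grid this gives the 220 rows mentioned in the question.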
There are a few ways to generate rows in Oracle. Note: this particular (recursive) approach might not be optimal for very large grids; for those you might want to cross join two rows a bunch of times. However, this approach is more amenable to injecting a variable for your dimension.
Selecting from the magic dual table usually returns one row, but you can use the recursive connect by with the magic level value to control how many rows you get. It doesn't return a level 0, so I hard-coded that in.
Looking at your square, it's a mirror image made up of single-unit vectors: all the horizontal vectors are repeated vertically, so only half have to be generated. Note that the union all in the final query just returns the same data with the x and y points swapped.
The query cross joins the dimension CTE three times. The first two copies supply the start and end points; the third is needed because, for the horizontal vectors for example, the vertical coordinate must be the same for both start and end. The filter where b.point - a.point = 1 keeps only vectors of length 1, discarding zero-length vectors (where start and end are equal) as well as anything longer.
with dimension as (
select 0 as point from dual
union all
select level
from dual
connect by level <= 10
), points as (
select
a.point as startpoint,
b.point as endpoint,
c.point as fixed
from dimension a
cross join dimension b
cross join dimension c
where b.point - a.point = 1
)
select
startpoint as startpoint_x,
fixed as startpoint_y,
endpoint as endpoint_x,
fixed as endpoint_y
from points
union all
select
fixed as startpoint_x,
startpoint as startpoint_y,
fixed as endpoint_x,
endpoint as endpoint_y
from points
order by startpoint_y, endpoint_y, startpoint_x, endpoint_x
The place to inject the variable is line 6 of the query (connect by level <= 10): replace that 10 with whatever grid size you want.
In a SQL*Plus script you could do that like this:
define dimension = 10;
with ...[ rest of the query blah blah ]
connect by level <= &dimension
The following SQL query is supposed to return the max consecutive numbers in a set.
WITH RECURSIVE Mystery(X,Y) AS ((SELECT A AS X, A AS Y FROM R)
UNION (SELECT m1.X, m2.Y
FROM Mystery m1, Mystery m2
WHERE m2.X = m1.Y + 1))
SELECT MAX(Y-X) + 1 FROM Mystery;
This query on the set {7, 9, 10, 14, 15, 16, 18} returns 3, because {14 15 16} is the longest chain of consecutive numbers and there are three numbers in that chain. But when I try to work through this manually I don't see how it arrives at that result.
For example, given the number set above I could create two columns:
m1.x  m2.y
----  ----
   7     7
   9     9
  10    10
  14    14
  15    15
  16    16
  18    18
If we are working on rows and columns, not the actual data, as I understand it WHERE m2.X = m1.Y + 1 takes the value from the next row in Y and puts it in the current row of X, like so
m1.X  m2.Y
----  ----
   9     7
  10     9
  14    10
  15    14
  16    15
  18    16
  18  Null?
The main part on which I am uncertain is where in the SQL recursion actually happens. According to Denis Lukichev recursion is the R part - or in this case the RECURSIVE Mystery(X,Y) - and stops when the table is empty. But if the above is true, how would the table ever empty?
Since I don't know how to proceed with the above, let me try a different direction. If WHERE m2.X = m1.Y + 1 is actually a comparison, the result should be:
m1.X  m2.Y
----  ----
  14    14
  15    15
  16    16
But at this point, it seems that it should continue recursively on this until only two rows are left (nothing else to compare). If it stops here to get the correct count of 3 rows (2 + 1), what is actually stopping the recursion?
I understand that for the above example the MAX(Y-X) + 1 effectively returns the actual number of recursion steps and adds 1.
But if I have 7 consecutive numbers and the recursion flows down to 2 rows, should this not end up with an incorrect 3 as the result? I understand recursion in C++ and other languages, but this is confusing to me.
Full disclosure, yes it appears this is a common university question, but I am retired, discovered this while researching recursion for my use, and need to understand how it works to use similar recursion in my projects.
Based on this db<>fiddle shared previously, you may find it instructive to alter the CTE to include an iteration number as follows, and then to show the content of the CTE rather than the output of final SELECT. Here's an amended CTE and its content after the recursion is complete:
Amended CTE
WITH RECURSIVE Mystery(X,Y,Z) AS ((SELECT A AS X, A AS Y, 1 as Z FROM R)
UNION (SELECT m1.X, m2.A, Z+1
FROM Mystery m1
JOIN R m2 ON m2.A = m1.Y + 1))
CTE Content
 x   y   z
--  --  --
 7   7   1
 9   9   1
10  10   1
14  14   1
15  15   1
16  16   1
18  18   1
 9  10   2
14  15   2
15  16   2
14  16   3
The Z field holds the iteration count. Where Z = 1 we've simply got the rows from the table R: the values X and Y are both from the field A. In terms of what we are attempting to achieve, these represent sequences of consecutive numbers, which start at X and continue to (at least) Y.
Where Z = 2, the second iteration, we find all the rows of the first iteration where there is a value in R which is one higher than our Y value, i.e. one higher than the last member of our sequence of consecutive numbers. That becomes the new highest number, and we add one to the number of iterations. As only three numbers in our original data set have successors within the set, only three rows are output in the second iteration.
Where Z = 3, the third iteration, we find all the rows of the second iteration (note we are not considering the rows of the first iteration again) where there is, again, a value in R which is one higher than our Y value, i.e. one higher than the last member of our sequence of consecutive numbers. That, again, becomes the new highest number, and we add one to the number of iterations.
The process will attempt a fourth iteration, but as there are no rows in R where the value is one more than the Y values from our third iteration, no extra data gets added to the CTE and recursion ends.
Going back to the original db<>fiddle, the process then searches our CTE content to output MAX(Y-X) + 1, which is the maximum difference between the first and last values in any consecutive sequence, plus one. It takes its value from the record produced in the third iteration: (16 - 14) + 1, which is 3.
For this specific piece of code, the output is always equivalent to the value in the Z field as every addition of a row through the recursion adds one to Z and adds one to Y.
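The iteration described above can be imitated in a few lines of Python. This is only a sketch of how such a recursive CTE is commonly evaluated (each pass looks only at the rows added by the previous pass), not a description of any particular database engine; the function name longest_run and its variable names are made up for illustration. It follows the amended CTE, which joins back to R.

```python
def longest_run(values):
    """Imitate the amended recursive CTE; returns (answer, CTE rows)."""
    r = set(values)
    # Anchor member: one (X, Y) row per value in R, with X = Y.
    mystery = {(a, a) for a in r}
    frontier = set(mystery)          # rows added by the latest iteration
    while frontier:                  # recursion ends when nothing new appears
        # Recursive member: extend each new row if Y + 1 exists in R.
        produced = {(x, y + 1) for (x, y) in frontier if y + 1 in r}
        frontier = produced - mystery   # UNION discards rows already present
        mystery |= frontier
    return max(y - x for x, y in mystery) + 1, mystery


answer, rows = longest_run([7, 9, 10, 14, 15, 16, 18])
print(answer)       # 3
print(len(rows))    # 11 rows, as in the CTE content
```

The loop stops on the fourth pass because no row from the third pass can be extended, which is the "table is empty" condition the question asks about.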
The code below is used to convert a date into an integer and vice versa. I want to know why these specific hexadecimal codes and this number series are used to get the date back from the int. If there is an article about this code sample, it would also help me understand it.
I tried an online hex-to-decimal conversion for these codes and found that they are 256^1, 256^2, ..., but I still cannot work out the exact reason.
declare @dDate date = '2017-10-12'
declare @iDate int = 0
select @iDate = ( (datepart(year,@dDate)*65536 | datepart(month,@dDate)*256 | datepart(dd,@dDate)))
select (@iDate&0xfff0000)/65536 --year
select (@iDate&0xff00)/256 --Month
select (@iDate&0xff) --Date
& is an operator doing bitwise AND. "|" is bitwise OR. See here and here. Also see here for an explanation on using bitwise AND/OR to store multiple number values in a single number column.
This part:
@iDate&0xfff0000
will "mask" (replace with zeros) the bits of @iDate that do not belong to the year, i.e. everything below the 256^2 (65536) place. Then you divide by 65536, which simply reverses the original multiplication of the year by 65536.
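The packing and masking can be tried out step by step outside SQL Server. Below is a minimal Python sketch of the same arithmetic, using the constants from the question; the function names pack_date and unpack_date are made up for illustration.

```python
def pack_date(year: int, month: int, day: int) -> int:
    # Multiplying by 65536 (256^2) and 256 moves year and month into
    # their own bit ranges; OR then merges the three parts.
    return (year * 65536) | (month * 256) | day


def unpack_date(packed: int):
    year = (packed & 0xFFF0000) // 65536   # keep the year bits, move them back down
    month = (packed & 0xFF00) // 256       # keep bits 8..15, move them back down
    day = packed & 0xFF                    # lowest 8 bits need no division
    return year, month, day


packed = pack_date(2017, 10, 12)
print(hex(packed))          # 0x7e10a0c
print(unpack_date(packed))  # (2017, 10, 12)
```

In the hexadecimal form the three parts are visible side by side: 7e1 is 2017, 0a is 10 and 0c is 12.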
If the concept of bitwise AND is foreign, I'll give an example that DOESN'T WORK in decimal. Bitwise AND converts the whole thing to binary and then masks things (like IP subnetting, if you're familiar with that).
Anyway, consider a decimal number 20171012. If such a thing as a decimal-wise AND existed, it could look like 20171012&11110000. The "1" places are "keepers" and the "0" places are "throw-aways". If you stack them vertically, the result is to keep the values with a "1" beneath them and replace the values with a "0" beneath them with a "0".
number        20171012
dec-wise AND  11110000
result        20170000
Now the result isn't 2017, so you'd have to divide by 10000 to get 2017.
For 20171012&1100 you have to use implied leading zeros:
number        20171012
dec-wise AND  00001100
result            1000
I probably would have converted to int by adding the year*10000 and month * 100 and day. Reverting back I would use a combination of integer division and MOD. But I think the bitwise AND is perhaps a bit more elegant (particularly for getting the month).
Based on your comment, I will include how I have converted dates to int and reverted back:
declare @dDate date = '2017-10-12'
declare @iDate int
set @iDate = year(@dDate) * 10000 + month(@dDate) * 100 + day(@dDate)
select @iDate
select 'year', @iDate/10000 -- basic integer division provides the year
select 'month', (@iDate % 10000)/100 -- combine modulo and integer division to get the month
select 'day', @iDate % 100 -- basic modulo arithmetic provides the day
returns:
20171012
year 2017
month 10
day 12
This is bit manipulation.
Bit Shifting
Decimal 3 = Binary 11
If we left-shift (<<) 3 by 4 bits it becomes 48, which is binary 110000 (4 zero bits are appended by the left shift).
Since T-SQL has no bit-shifting operators (at least in versions before SQL Server 2022), we can do the math instead.
Left-shifting a number x by n bits = x * 2^n
Therefore, multiplying a number by 256 is actually a left shift of that number by 8 bits (2^8 = 256).
Later on, when you bitwise OR the two numbers, their bits are effectively "concatenated".
For example, say you need to concatenate two binary numbers, 11 (3) and 10 (2); the resulting number should be 1110 = 14.
So first we left-shift 3 by 2 bits (3 * 2^2 = 12), and then we bitwise OR that number with the next number:
12 = 1100
2 = 0010
OR
---------------
14 = 1110
Your example is actually saving the whole date in a single integer variable, which is an efficient way of storing a date.
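The shift-then-OR idea is easy to verify in a language that does have shift operators. Here is a small Python sketch of the worked example; the function name concat_bits and its low_width parameter are made up for illustration, and it assumes the low part fits in the given number of bits.

```python
def concat_bits(high: int, low: int, low_width: int) -> int:
    """Place `low` in the lowest `low_width` bits and `high` above it."""
    # Shifting left by n bits is the same as multiplying by 2**n.
    return (high << low_width) | low


print(3 << 4, 3 * 2**4)              # 48 48
print(bin(concat_bits(3, 2, 2)))     # 0b1110
print(256 == 1 << 8, 65536 == 1 << 16)
```

If the low part needed more bits than low_width, it would overlap with the high part and OR would no longer act like concatenation.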
In my QlikView app I made the age groups with the following expression (it is actually defined as a dimension):
=Replace(Aggr(Class(Count(Surname), 10), Age), '<= x <', ' - ')
The groups are calculated properly, however, I have problems with sorting the groups from the smallest to the highest one. How can I do it?
When you say that the groups are calculated properly, I suppose you mean that your dimension shows 0 - 10, 10 - 20, ...
But the value of each group is wrong.
Try something like this: =Replace(Class(Age, 10), '<= x <', ' - ')
Or better:
=if(isnull(Age),<Null>,subfield(class(Age, 10),' ',1) & ' - ' & (num(subfield(class(Age, 10),' ',5))-1))
This one handles Null and makes better groups.
The Class function returns "0 <= x < 10", which should be translated to 0 - 9 instead of 0 - 10.
Subfield(X, ' ', Y) splits X by spaces and returns the Yth part.
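To see what labels the second expression is aiming for, here is a rough Python equivalent of the bucketing. It is a sketch, not QlikView code: the function name age_group is made up, and the text used for missing ages ("Null") is an assumption. It also assumes whole-number, non-negative ages.

```python
def age_group(age, width: int = 10) -> str:
    """Label an age with its bucket, e.g. 0 - 9, 10 - 19, ..."""
    if age is None:
        return "Null"                  # placeholder label for missing ages
    low = (age // width) * width       # lower bound of the bucket
    return f"{low} - {low + width - 1}"


for age in (0, 9, 10, 37, None):
    print(age, "->", age_group(age))
```

Ages 0 to 9 share the label 0 - 9, so neighbouring groups no longer appear to overlap at the boundary.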
It can be done by sorting the dimension by text.
I'm trying to experiment with this, http://gyazo.com/8190a3c98a520bbeb77335e05ea5a636 (a Visual Basic console application). I want it to let the user enter a word and have the console reply with that word in every possible spaced combination.
Say I'm using the word TEST, for example; it would be spaced out like this:
T EST
T E ST
T E S T
TE ST
TES T
T ES T
And so on... (every combination in which it can be spaced out, with multiple spaces or not)
Is this possible through the Console Application?
When counting, you start at the lowest digit. You start with that digit at zero and you count up until you reach the highest value for that digit, like this: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Then, once you reach the highest value, you have to add a second digit (e.g. 10). Then you go from lowest to highest again on the lowest digit again (e.g. 10 - 19) before incrementing the second digit again (e.g. 20). In that way, once you reach 999, you will have listed every possible combination of values in a three digit number.
When counting in binary, it works the same way, but the highest value for each digit is one, so you count up on the lowest digit like this: 0, 1. Then you have to add the second digit and count up again: 10, 11. Then you need to add a third digit (e.g. 100) and do it all again on the first two. By the time you get to 111, you will have listed every possible combination of 1's and 0's in a three-digit binary number.
So, if you think of the gap between each pair of adjacent letters as a digit in a binary number, where 0 means no space and 1 means there is a space, then all you have to do is count up from 0 to the highest value of a binary number whose number of digits is the length of your word minus 1. So, for instance, with the word TEST, the counting would look like this:
000 = TEST
001 = TES T
010 = TE ST
011 = TE S T
100 = T EST
...
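The counting idea can be sketched as follows. This is a Python illustration rather than Visual Basic, and the function name spaced_variants is made up; the same loop structure carries over to a VB console application. Each bit of the counter decides whether a space goes into one gap, with the leftmost bit controlling the first gap.

```python
def spaced_variants(word: str):
    """List every way to put single spaces between the letters of word."""
    if not word:
        return [""]
    gaps = len(word) - 1                 # one binary digit per gap
    variants = []
    for mask in range(2 ** gaps):        # count from 00..0 up to 11..1
        pieces = [word[0]]
        for i in range(gaps):
            # Read the mask left to right: gap 0 is the highest bit.
            if (mask >> (gaps - 1 - i)) & 1:
                pieces.append(" ")
            pieces.append(word[i + 1])
        variants.append("".join(pieces))
    return variants


for variant in spaced_variants("TEST"):
    print(variant)
```

For TEST this prints 8 lines, starting with TEST, TES T, TE ST, TE S T, T EST, in the same order as the table above.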