Informix 4gl Split a String or Char - variables

I wanted to know the Informix 4gl command to split a variable
such as
lv_var = variable01;variable02
into
lv_var01 = variable01
lv_var02 = variable02
Is there something in Informix 4gl that can do this?
In python I could do
lv_array = lv_var.split(";")
and use the variables from the array

It's possible with classic Informix 4gl with something like this...
define
p_list dynamic array of char(10)
main
define
i smallint,
cnt smallint,
p_str char(500)
let p_str = "a;b;c;d"
let cnt = toarray(p_str, ";")
for i = 1 to cnt
display p_list[i]
end for
end main
function toarray(p_str, p_sep)
define
p_str char(2000),
p_sep char(1),
i smallint,
last smallint,
ix smallint,
p_len smallint
let ix = 0
let p_len = length(p_str)
# -- get size of array needed
for i = 1 to p_len
if p_str[i] = p_sep then
let ix = ix + 1
end if
end for
if ix > 0 then
# -- we have more than one
allocate array p_list[ix + 1]
let ix = 1
let last = 1
for i = 1 to p_len
if p_str[i] = p_sep then
let p_list[ix] = p_str[last,i-1]
let ix = ix + 1
let last = i + 1
end if
end for
# -- set the last one
let p_list[ix] = p_str[last, p_len]
else
# -- only has one
allocate array p_list[1]
let ix = 1
let p_list[ix] = p_str
end if
return ix
end function
Out:
a
b
c
d
Dynamic array support requires IBM Informix 4GL 7.32.UC1 or higher

There isn't a standard function to do that. One major problem is returning the array. I'd probably write a C function to do the job, but in I4GL, it would look like:
FUNCTION nth_split_field(str, c, n)
DEFINE str VARCHAR(255)
DEFINE c CHAR(1)
DEFINE n INTEGER
...code to find nth field delimited by c in str...
END FUNCTION
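The body of the stub above is left elided in the original answer. As a rough illustration of the idea only, here is a sketch in Python rather than I4GL; the behaviour for a missing field (returning an empty string) is an assumption, not something the answer specifies.

```python
def nth_split_field(text: str, sep: str, n: int) -> str:
    """Return the n-th (1-based) field of text delimited by sep.

    Assumption for this sketch: an out-of-range n yields "".
    """
    if n < 1:
        return ""
    start = 0
    # Skip over the first n - 1 separators.
    for _ in range(n - 1):
        pos = text.find(sep, start)
        if pos == -1:
            return ""  # fewer than n fields
        start = pos + len(sep)
    # The field ends at the next separator, or at the end of the string.
    end = text.find(sep, start)
    return text[start:] if end == -1 else text[start:end]


print(nth_split_field("variable01;variable02", ";", 2))
```

Returning one field per call sidesteps the problem of returning an array, at the cost of rescanning the string on each call.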

What you'll find is that the products that have grown to supersede Informix 4GL over the years, such as FourJs Genero, have built-in methods that were added to simplify the Informix 4GL developer's life.
So something like this would do what you are looking for if you upgraded to Genero:
-- Example showing how string can be parsed using string tokenizer
-- New features added to Genero since Informix 4gl used include
-- STRING - like a CHAR but length does not need to be specified - http://www.4js.com/online_documentation/fjs-fgl-manual-html/?path=fjs-fgl-manual#c_fgl_datatypes_STRING.html
-- DYNAMIC ARRAY like an ARRAY but does not need to have length specified. Is also passed by reference to functions - http://www.4js.com/online_documentation/fjs-fgl-manual-html/?path=fjs-fgl-manual#c_fgl_Arrays_010.html
-- base.StringTokenizer - methods to split a string - http://www.4js.com/online_documentation/fjs-fgl-manual-html/?path=fjs-fgl-manual#c_fgl_ClassStringTokenizer.html
MAIN
DEFINE arr DYNAMIC ARRAY OF STRING
DEFINE i INTEGER
CALL string2array("abc;def;ghi",arr,";")
-- display result
FOR i = 1 TO arr.getLength()
DISPLAY arr[i]
END FOR
-- Should display
--abc
--def
--ghi
END MAIN
FUNCTION string2array(s,a,delimiter)
DEFINE s STRING
DEFINE a DYNAMIC ARRAY OF STRING
DEFINE delimiter STRING
DEFINE tok base.StringTokenizer
CALL a.clear()
LET tok = base.StringTokenizer.create(s,delimiter)
WHILE tok.hasMoreTokens()
LET a[a.getLength()+1] = tok.nextToken()
END WHILE
-- a is a DYNAMIC ARRAY, so it has been passed by reference and does not need to be explicitly returned
END FUNCTION

Using levenshtein on parts of string in SQL

I am trying to figure out a way to work some fuzzy searching methods into our store front search field using the Levenshtein method, but I'm running into a problem with how to search for only part of product names.
For example, a customer searches for scisors, but we have a product called electric scissor. Using the Levenshtein method levenshtein("scisors","electric scissor") we will get a result of 11, because the electric part will be counted as a difference.
What I am looking for is a way for it to look at substrings of the product name, so it would compare it to levenshtein("scisors","electric") and then also levenshtein("scisors","scissor") to see that we can get a result of only 2 in that second substring, and thus show that product as part of their search result.
Non-working example to give you an idea of what I'm after:
SELECT * FROM products p WHERE levenshtein("scisors", p.name) < 5
Question: Is there a way to write an SQL statement that handles checking for parts of the string? Would I need to create more functions in my database to be able to handle it perhaps or modify my existing function, and if so, what would it look like?
I am currently using this implementation of the levenshtein method:
//levenshtein(s1 as VARCHAR(255), s2 as VARCHAR(255))
//returns int
BEGIN
DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
DECLARE s1_char CHAR;
-- max strlen=255
DECLARE cv0, cv1 VARBINARY(256);
SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2), cv1 = 0x00, j = 1, i = 1, c = 0;
IF s1 = s2 THEN
RETURN 0;
ELSEIF s1_len = 0 THEN
RETURN s2_len;
ELSEIF s2_len = 0 THEN
RETURN s1_len;
ELSE
WHILE j <= s2_len DO
SET cv1 = CONCAT(cv1, UNHEX(HEX(j))), j = j + 1;
END WHILE;
WHILE i <= s1_len DO
SET s1_char = SUBSTRING(s1, i, 1), c = i, cv0 = UNHEX(HEX(i)), j = 1;
WHILE j <= s2_len DO
SET c = c + 1;
IF s1_char = SUBSTRING(s2, j, 1) THEN
SET cost = 0; ELSE SET cost = 1;
END IF;
SET c_temp = CONV(HEX(SUBSTRING(cv1, j, 1)), 16, 10) + cost;
IF c > c_temp THEN SET c = c_temp; END IF;
SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10) + 1;
IF c > c_temp THEN
SET c = c_temp;
END IF;
SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;
END WHILE;
SET cv1 = cv0, i = i + 1;
END WHILE;
END IF;
RETURN c;
END
This is a bit long for a comment.
First, I would suggest using a full-text search with a synonyms list. That said, you might have users with really bad spelling abilities, so the synonyms list might be difficult to maintain.
If you use Levenshtein distance, then I suggest doing it on a per word basis. For each word in the user's input, calculate the closest word in the name field. Then add these together to get the best match.
In your example, you would have these comparisons:
levenshtein('scisors', 'electric')
levenshtein('scisors', 'scissor')
The minimum would be the second. If the user types multiple words, such as 'electrk scisors', then you would be doing
levenshtein('electrk', 'electric') <-- minimum
levenshtein('electrk', 'scissor')
levenshtein('scisors', 'electric')
levenshtein('scisors', 'scissor') <-- minimum
This is likely to be an intuitive way to approach the search.
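The per-word scoring described above can be sketched outside the database. This is a minimal Python illustration, not the stored function from the question; the names levenshtein and word_score are hypothetical, and splitting on whitespace is an assumption about how product names are tokenized.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def word_score(query: str, name: str) -> int:
    """Sum, over the query words, of the distance to the closest word in name."""
    query_words = query.lower().split()
    name_words = name.lower().split()
    if not name_words:
        # Assumption: an empty name costs the full length of every query word.
        return sum(len(q) for q in query_words)
    return sum(min(levenshtein(q, w) for w in name_words) for q in query_words)


print(word_score("scisors", "electric scissor"))
print(word_score("electrk scisors", "electric scissor"))
```

A lower score means a better match, so products could be filtered with a threshold on word_score instead of on the distance to the whole name.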

IBM Informix aggregate function

I need to develop some kind of function in an Informix database in order to split one string into multiple rows, for example:
Column1
one,two,three,four
And my expected result is:
column1
one
two
three
four
What I was thinking is to create a function that splits the string into multiple rows. My current code is:
create function split(text_splitted varchar(100), separator char(1))
returning varchar(100)
define splitted_word varchar(100);
define current_val char(1);
define start, cont integer;
let start = 0;
let splitted_word = "";
let current_val = "";
for cont = 0 to length(text_splitted)
let current_val = substr(text_splitted, cont, 1);
if current_val = separator then
let splitted_word = substr(text_splitted, start, cont - start);
let start = cont + 1;
return splitted_word with resume;
end if;
end for;
end function
If you execute the following statement, it works fine:
execute function split('hello.my.name.is', '.');
And the result is:
hello
my
name
This is perfect, but my problem is that when I use this function in a query and it returns more than one row, an error is raised. From what I have found searching online, I need to create an aggregate function, but I am not able to build it. I am new to this kind of development.
Here is the little documentation i found: http://www.pacs.tju.edu/informix/answers/english/docs/dbdk/is40/extend/04aggs3.html
Thanks!
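One thing to note about the function as posted: it only returns a word when it reaches a separator, so the text after the last separator ("is" in the example) never appears in the output. A minimal sketch of the intended logic, written in Python rather than Informix SPL purely for illustration, with yield standing in for RETURN ... WITH RESUME; the name split_rows is hypothetical, and the sketch does not address the error raised when the function is used inside a query.

```python
from typing import Iterator


def split_rows(text: str, separator: str) -> Iterator[str]:
    """Yield each field of text, one row at a time."""
    start = 0
    for pos, ch in enumerate(text):
        if ch == separator:
            yield text[start:pos]
            start = pos + 1
    # Emit whatever follows the last separator as the final row.
    yield text[start:]


for row in split_rows("hello.my.name.is", "."):
    print(row)
```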

Count unknown variables from a table

I have a problem here: I have a table with a few repeated string values, and I want to know each value and how many times it occurs.
For example, a function returns a table of unknown "letters" in unknown quantities:
Function () return Table end
Table ={'a','a','c','b','b','a',...}
And I want to get this.
table.a={'a','a','a'}
table.b={'b','b'}
table.c={'c'}
....
....
I have no clue how to solve it...
Write a function, which creates a hash map of these things:
function RepetitionCounter(tInput)
local tCounter = {}
for i, v in ipairs(tInput) do
tCounter[v] = (tCounter[v] or 0) + 1
end
return tCounter
end
which you'll use as follows:
local tData = {'a','a','c','b','b','a',...}
local tCounts = RepetitionCounter(tData)
and the table tCounts would be as follows:
tCounts.a = 3
tCounts.b = 2
tCounts.c = 1
Modifying the function above by just a little, you can get the desired output. Replace the following line:
tCounter[v] = (tCounter[v] or 0) + 1
with
if not tCounter[v] then
    tCounter[v] = {}
end
-- insert on every occurrence, including the first one
table.insert(tCounter[v], v)
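The same grouping idea, sketched in Python for comparison with the Lua code above. The function name group_repetitions is hypothetical; the point is that every occurrence, including the first, is appended to the list for its value.

```python
from collections import defaultdict


def group_repetitions(items):
    """Map each distinct value to the list of its occurrences."""
    groups = defaultdict(list)
    for item in items:
        groups[item].append(item)  # a missing key starts as an empty list
    return dict(groups)


data = ['a', 'a', 'c', 'b', 'b', 'a']
groups = group_repetitions(data)
print(groups['a'], groups['b'], groups['c'])
```

The count for a value is then just the length of its list, so one structure covers both the counting and the grouping variants.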

Generalizing increasing number of nested loop algorithm

Sorry for the terrible title, but I have no clue how to generalize (or simplify) my loop case here.
I have a program that iterates over a sequence of integers, for example dimension=1 to 5.
In each iteration there is a main loop, and inside the main loop there are nested loops. The number of nested loops is [dimension].
For example, for dimension=1 there is a single For loop. For dimension=2 there is a For loop inside a For loop. And so on.
Is there any way to simplify the algorithm? Currently I manually write completely different code for each value of [dimension]. Imagine dimension=1 to 100; I'll be dead.
Here's my piece of program (written in VB.NET)
for dimension=2
Dim result(2) As Integer
For i = 0 To 1
For j = 0 To 1
result(0)=i
result(1)=j
Next
Next
For dimension=3
Dim result(3) As Integer
For i = 0 To 1
For j = 0 To 1
For k = 0 To 1
result(0)=i
result(1)=j
result(2)=k
Next
Next
Next
For dimension=4
Dim result(4) As Integer
For i = 0 To 1
For j = 0 To 1
For k = 0 To 1
For l = 0 To 1
result(0)=i
result(1)=j
result(2)=k
result(3)=l
Next
Next
Next
Next
And so on..
Any suggestion?
Thanks!
There are plenty of solutions:
Recursion
VB.NET supports methods (Sub and Function) and recursion, so this is probably the simplest:
void nestedLoop(int lower , int upper , int remaining_loops , int[] values)
    if(remaining_loops == 0)
        //process values list
    else
        for int i in [lower , upper)
            //indices run from 0 to the number of loops - 1
            values[remaining_loops - 1] = i
            nestedLoop(lower , upper , remaining_loops - 1 , values)
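A runnable version of the recursive idea, sketched in Python rather than VB.NET. The names nested_loop and process are hypothetical; process stands for whatever should happen in the innermost loop body.

```python
def nested_loop(lower, upper, dimension, process, values=None):
    """Emulate `dimension` nested loops over range(lower, upper)."""
    if values is None:
        values = []
    if len(values) == dimension:
        # Innermost level reached: hand over a snapshot of the loop variables.
        process(tuple(values))
        return
    for i in range(lower, upper):
        values.append(i)
        nested_loop(lower, upper, dimension, process, values)
        values.pop()


results = []
nested_loop(0, 2, 3, results.append)
print(len(results), results[0], results[-1])
```

With lower=0, upper=2 and dimension=3 this visits the same combinations as the hand-written For i / For j / For k example in the question.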
Integer Transformation
In theory, a number can be represented by any radix:
d_i * radix ^ i + d_i-1 * radix ^ (i - 1) ... + d_0 * radix ^ 0
Consider each digit the value of one of the nested loops:
for int i in [0 , max)
for int j in [0 , max)
for int k in [0 , max)
...
These loops can be represented by a 3-digit number with radix max, where d_0 = i, d_1 = j, and so on. How each digit is mapped to one of the loop variables is arbitrary and only affects the order of the output.
void nestedLoops(int upper , int dimension)
    for int i in [0 , pow(upper , dimension))
        int[] values
        int tmp = i
        for int j in [0 , dimension)
            //extract the digits of i in radix upper
            values[j] = tmp % upper
            tmp /= upper
        //all values of the loops are now in values
        //process them here
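The integer transformation can be sketched in Python as follows. The generator name nested_loops mirrors the pseudocode, and upper is assumed to be the exclusive bound shared by every loop.

```python
def nested_loops(upper, dimension):
    """Yield the loop variables of `dimension` nested loops over range(upper)."""
    for n in range(upper ** dimension):
        values = []
        tmp = n
        for _ in range(dimension):
            values.append(tmp % upper)  # next digit of n in radix `upper`
            tmp //= upper
        yield values


for combo in nested_loops(2, 2):
    print(combo)
```

Here values[0] is the least significant digit, so it changes fastest; reversing the list gives the order produced by conventionally nested loops.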
There are a few other options as well, but these are the most common.
Please do note that when you do
Dim result(2) As Integer
You are actually declaring an array of 3 elements, because the number in a VB.NET Dim statement is the upper bound (the highest index), not the length. It's a subtle difference in VB.NET
That being said, I'll assume that you meant to declare an array of only 2 elements. If this is the case then you could build and call a recursive function like this
LoopOver(result)
Sub LoopOver(ByRef array() As Integer, Optional ByVal level As Integer = 0)
If array.Length = level Then
Return
Else
array(level) = 1
LoopOver(array, level + 1)
End If
End Sub
This recursive function will call itself (i.e., it will loop) for as many times as the array's size.

Lua table.toString(tableName) and table.fromString(stringTable) functions?

I am wanting to convert a 2d lua table into a string, then after converting it to a string convert it back into a table using that newly created string. It seems as if this process is called serialization, and is discussed in the below url, yet I am having a difficult time understanding the code and was hoping someone here had a simple table.toString and table.fromString function
http://lua-users.org/wiki/TableSerialization
I am using the following code in order to serialize tables:
function serializeTable(val, name, skipnewlines, depth)
skipnewlines = skipnewlines or false
depth = depth or 0
local tmp = string.rep(" ", depth)
if name then tmp = tmp .. name .. " = " end
if type(val) == "table" then
tmp = tmp .. "{" .. (not skipnewlines and "\n" or "")
for k, v in pairs(val) do
tmp = tmp .. serializeTable(v, k, skipnewlines, depth + 1) .. "," .. (not skipnewlines and "\n" or "")
end
tmp = tmp .. string.rep(" ", depth) .. "}"
elseif type(val) == "number" then
tmp = tmp .. tostring(val)
elseif type(val) == "string" then
tmp = tmp .. string.format("%q", val)
elseif type(val) == "boolean" then
tmp = tmp .. (val and "true" or "false")
else
tmp = tmp .. "\"[inserializeable datatype:" .. type(val) .. "]\""
end
return tmp
end
The generated code can then be executed using loadstring(): http://www.lua.org/manual/5.1/manual.html#pdf-loadstring. If you passed an argument to the 'name' parameter, the string is an assignment statement; otherwise prepend "return " so that the chunk returns the table:
s = serializeTable({a = "foo", b = {c = 123, d = "foo"}})
print(s)
a = loadstring("return " .. s)()
The code lhf posted is a much simpler code example than anything from the page you linked, so hopefully you can understand it better. Adapting it to output a string instead of printing the output looks like:
t = {
{11,12,13},
{21,22,23},
}
local s = {"return {"}
for i=1,#t do
s[#s+1] = "{"
for j=1,#t[i] do
s[#s+1] = t[i][j]
s[#s+1] = ","
end
s[#s+1] = "},"
end
s[#s+1] = "}"
s = table.concat(s)
print(s)
The general idea with serialization is to take all the bits of data from some data structure like a table, and then loop through that data structure while building up a string that has all of those bits of data along with formatting characters.
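To make the round trip concrete, here is an analogous sketch in Python rather than Lua: nested lists play the role of array-like tables, the serializer walks the structure while building a string, and ast.literal_eval stands in for loadstring when reading it back. The names serialize and deserialize are hypothetical, and only lists, integers, booleans and strings are handled.

```python
import ast


def serialize(value):
    """Turn nested lists of simple values into a literal string."""
    if isinstance(value, list):
        return "[" + ", ".join(serialize(v) for v in value) + "]"
    if isinstance(value, (bool, int, str)):
        return repr(value)  # repr() adds quotes and escapes for strings
    raise TypeError("unserializable type: " + type(value).__name__)


def deserialize(text):
    """Parse the literal back into data without executing arbitrary code."""
    return ast.literal_eval(text)


t = [[11, 12, 13], [21, 22, 23]]
s = serialize(t)
print(s)
print(deserialize(s) == t)
```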
How about a JSON module? That also gives you data that is easier to exchange with other programs. I usually prefer dkjson, which also supports UTF-8, whereas cmjjson does not.
Inside Kong, this works:
local cjson = require "cjson"
kong.log.debug(cjson.encode(some_table))
Outside Kong, the lua-cjson package must be installed: https://github.com/openresty/lua-cjson/
Here is a simple program which assumes your table contains numbers only. It outputs Lua code that can be loaded back with loadstring()(). Adapt it to output to a string instead of printing it out. Hint: redefine print to collect the output into a table and then at the end turn the output table into a string with table.concat.
t = {
{11,12,13},
{21,22,23},
}
print"return {"
for i=1,#t do
print"{"
for j=1,#t[i] do
print(t[i][j],",")
end
print"},"
end
print"}"
Assuming that:
You don't have loops (table a referencing table b and b referencing a)
Your tables are pure arrays (all keys are consecutive positive integers, starting on 1)
Your values are integers only (no strings, etc)
Then a recursive solution is easy to implement:
function serialize(t)
local serializedValues = {}
local value, serializedValue
for i=1,#t do
value = t[i]
serializedValue = type(value)=='table' and serialize(value) or value
table.insert(serializedValues, serializedValue)
end
return string.format("{ %s }", table.concat(serializedValues, ', ') )
end
Prepend a return to the string resulting from this function and store it in a .lua file:
-- myfile.lua
return { { 1, 2, 3 }, { 4, 5, 6 } }
You can just use dofile to get the table back.
t = dofile 'myfile.lua'
Notes:
If you have loops, then you will have to handle them explicitly - usually with an extra table to "keep track" of repetitions
If you don't have pure arrays, then you will have to parse t differently, as well as handle the way the keys are rendered (are they strings? are they other tables? etc).
If you have more than just integers and subtables, then calculating serializedValue will be more complex.
Regards!
I have shorter code that converts a table to a string, but not the reverse:
function compileTable(table)
local index = 1
local holder = "{"
while true do
if type(table[index]) == "function" then
index = index + 1
elseif type(table[index]) == "table" then
holder = holder..compileTable(table[index])
elseif type(table[index]) == "number" then
holder = holder..tostring(table[index])
elseif type(table[index]) == "string" then
holder = holder.."\""..table[index].."\""
elseif table[index] == nil then
holder = holder.."nil"
elseif type(table[index]) == "boolean" then
holder = holder..(table[index] and "true" or "false")
end
if index + 1 > #table then
break
end
holder = holder..","
index = index + 1
end
return holder.."}"
end
If you want to rename the function, replace every occurrence of compileTable, because it calls itself when it detects a nested table. I have not tested it with escape sequences.
If you use the output to create a Lua file, you will get a compilation error when a string contains a newline or a " character, because they are not escaped.
This method is more memory efficient.
Note:
Functions are not supported.
Userdata is untested.
My solution:
local nl = string.char(10) -- newline
function serialize_list (tabl, indent)
indent = indent and (indent.." ") or ""
local str = ''
str = str .. indent.."{"
for key, value in pairs (tabl) do
local pr = (type(key)=="string") and ('["'..key..'"]=') or ""
if type (value) == "table" then
str = str..nl..pr..serialize_list (value, indent)..','
elseif type (value) == "string" then
str = str..nl..indent..pr..'"'..tostring(value)..'",'
else
str = str..nl..indent..pr..tostring(value)..','
end
end
str = str:sub(1, #str-1) -- remove last symbol
str = str..nl..indent.."}"
return str
end
local str = serialize_list(tables)
print('return '..nl..str)