Replace a character from all columns in Hive tables - hive

I need to apply a regex replace to every column in my Hive table.
Is there a way to do this for all columns without naming each column individually?

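Use regexp_replace. The syntax of the Hive REGEXP_REPLACE function is:
regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT);
It works on one column at a time, so list the table's columns and build one query per column, for example from Spark: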

import org.apache.spark.sql.DataFrame

// List the table's columns; each row holds one column name.
val cols: DataFrame = hiveContext.sql("show columns in dbname.table_name")
val names = cols.collect().map(_.getString(0).trim)
// Build the query for one column; the regex pattern must be a quoted string literal.
def regexpReplace(c: String): String =
  "select regexp_replace(" + c + ", '[^0-9a-zA-Z]', ' ') from dbname.table_name"
for (c <- names) {
  val res = regexpReplace(c)
  hiveContext.sql(res)
}
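Running one query per column returns each column separately. As an alternative, here is a minimal sketch (in Python, for brevity) that builds a single SELECT wrapping every column in regexp_replace. The function name build_clean_select and the column names are hypothetical; in practice the list would come from show columns. It assumes every column is a string column and that the pattern contains no characters needing extra escaping.

```python
def build_clean_select(table, columns, pattern="[^0-9a-zA-Z]", replacement=" "):
    """Return one SELECT that applies regexp_replace to every listed column."""
    exprs = [
        "regexp_replace(`{c}`, '{p}', '{r}') AS `{c}`".format(c=col, p=pattern, r=replacement)
        for col in columns
    ]
    return "SELECT " + ", ".join(exprs) + " FROM " + table


# Hypothetical column names, for illustration only.
query = build_clean_select("dbname.table_name", ["id", "name"])
print(query)
```

The resulting string can be passed to hiveContext.sql in one call, so the cleaned columns come back together in one result.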

Related

numpy/pandas - why is the element selected from a list by random.choice the same for every row

There is a list which contains integer values.
list=[1,2,3,.....]
I then use np.random.choice to select a random element and append it to an existing dataframe column; please refer to the code below:
df.message = df.message.astype(str) + "rowNumber=" + '"' + str(np.random.choice(list)) + '"'
But the element selected by np.random.choice and appended to the message column is always the same for every row.
What is the issue here?
The expected result is that the selected element differs from row to row.
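str(np.random.choice(list)) is evaluated once, so the same single value is appended to every row. Pass the size parameter to np.random.choice to draw one value per row, and convert the values to strings: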
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'message' : ['aa','bb','cc']})
L = [1,2,3,4,5]
df.message = (df.message.astype(str) + "rowNumber=" + '"' +
              np.random.choice(L, size=len(df)).astype(str) + '"')
print (df)
           message
0  aarowNumber="4"
1  bbrowNumber="2"
2  ccrowNumber="5"

How to match a string containing "=" and "AND" operators in a WHERE clause in Oracle

My query is shown below, but it does not work and fails with the error "SQL command not properly ended". The string HDR.TRX_DT = DTL.TRX_DT AND HDR.BU_TYPE = DTL.BU_TYPE AND HDR.BU_CODE = DTL.BU_CODE
AND HDR.TRX_NO = DTL.TRX_NO AND HDR.RGSTR_NO = DTL.RGSTR_NO AND HDR.TRX_TYP_CD in ('COS') is a value stored in a column. I want to use that value in the WHERE clause of a SELECT statement. How can I do that? Please suggest.
select * from mdbat.migration_ctrl_all where addition_condition='HDR.TRX_DT = DTL.TRX_DT AND HDR.BU_TYPE = DTL.BU_TYPE AND HDR.BU_CODE = DTL.BU_CODE
AND HDR.TRX_NO = DTL.TRX_NO AND HDR.RGSTR_NO = DTL.RGSTR_NO AND HDR.TRX_TYP_CD in ('COS')';
Escape the ' characters that delimit the COS string, as Alex Larionow said. Escape each one by doubling it with another ':
select * from mdbat.migration_ctrl_all where addition_condition='HDR.TRX_DT = DTL.TRX_DT AND HDR.BU_TYPE = DTL.BU_TYPE AND HDR.BU_CODE = DTL.BU_CODE
AND HDR.TRX_NO = DTL.TRX_NO AND HDR.RGSTR_NO = DTL.RGSTR_NO AND HDR.TRX_TYP_CD in (''COS'')';
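To illustrate the doubling rule, here is a minimal sketch in Python. It uses the standard sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere; both treat '' inside a string literal as one quote), with a shortened condition string. It also shows a bind parameter, which avoids manual escaping altogether.

```python
import sqlite3

# Shortened version of the stored condition, for illustration only.
condition = "HDR.TRX_NO = DTL.TRX_NO AND HDR.TRX_TYP_CD in ('COS')"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE migration_ctrl_all (addition_condition TEXT)")
conn.execute("INSERT INTO migration_ctrl_all VALUES (?)", (condition,))

# Option 1: double every single quote inside the string literal.
escaped = condition.replace("'", "''")
literal_sql = (
    "SELECT * FROM migration_ctrl_all WHERE addition_condition = '" + escaped + "'"
)
rows_literal = conn.execute(literal_sql).fetchall()

# Option 2: pass the value as a bind parameter, so no escaping is needed.
rows_bound = conn.execute(
    "SELECT * FROM migration_ctrl_all WHERE addition_condition = ?", (condition,)
).fetchall()
```

Both queries match the same single row.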

Eclipselink NamedNativeQuery pass column name as parameter and not a value

Trying to pass column name as parameter but JPA sets it as a value surrounding it with single quotes.
@NamedNativeQueries({
@NamedNativeQuery(
name = "Genre.findAllLocalized",
query = "SELECT "
+ " CASE "
+ " WHEN ? IS NULL THEN genre_default"
+ " ELSE ? "
+ " END localized_genre "
+ "FROM genre ORDER BY localized_genre")
})
Then:
List<String> res = em.createNamedQuery("Genre.findAllLocalized")
.setParameter(1, colName)
.setParameter(2, colName)
.getResultList();
The problem is that the column names passed in are treated as values, so the result list contains the literal string "col_name" repeated for every row instead of the values of the column passed as the parameter.
Is this achievable?
Basically it makes no sense to create a prepared query like this: parameters can only stand for values, not for column names, and how would you name that query anyway: "*"? So the short answer is: no.
But you could build the query string dynamically, if that matches your requirement:
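String colName = "colName";
// The column name is concatenated into the SQL text, so it must come from a trusted source.
String sql = "SELECT CASE WHEN " + colName + " IS NULL THEN genre_default ELSE " + colName
        + " END localized_genre FROM genre ORDER BY localized_genre";
Query query = entityManager.createNativeQuery(sql);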
Probably using a criteria builder is more the way you want to use JPA (code from https://en.wikibooks.org/wiki/Java_Persistence/Criteria):
// Select the employees and the mailing addresses that have the same address.
CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
CriteriaQuery criteriaQuery = criteriaBuilder.createQuery();
Root employee = criteriaQuery.from(Employee.class);
Root address = criteriaQuery.from(MailingAddress.class);
criteriaQuery.multiselect(employee, address);
criteriaQuery.where(criteriaBuilder.equal(employee.get("address"), address.get("address")));
Query query = entityManager.createQuery(criteriaQuery);
List<Object[]> result = query.getResultList();
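Because a column name cannot be bound as a parameter, a dynamically built query should only accept names from a known list. Here is a minimal sketch of that check in Python, using sqlite3 as a stand-in database. The allow-list, the genre_fr column, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical allow-list of localized columns.
ALLOWED_COLUMNS = {"genre_default", "genre_fr"}


def build_localized_query(col_name):
    """Interpolate the column name only after checking it against the allow-list."""
    if col_name not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: " + repr(col_name))
    return (
        "SELECT CASE WHEN " + col_name + " IS NULL THEN genre_default"
        " ELSE " + col_name + " END AS localized_genre"
        " FROM genre ORDER BY localized_genre"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genre (genre_default TEXT, genre_fr TEXT)")
conn.executemany(
    "INSERT INTO genre VALUES (?, ?)",
    [("Comedy", "Comedie"), ("Drama", None)],
)
localized = [row[0] for row in conn.execute(build_localized_query("genre_fr"))]
```

The row without a localized value falls back to genre_default, and any name outside the allow-list is rejected before it reaches the SQL text.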

Smarter way of constructing query

In my node.js module, I have some data in an array which I need to construct a query:
var valueClause = "(fieldA = '" + data.fieldA + "' AND fieldB = '";
var whereClause = ' WHERE ';
var hasAdded = false;
data.accounts.forEach(function (account) {
    whereClause += valueClause + account.fieldB + "') OR ";
    hasAdded = true;
});
if (hasAdded) {
    // remove OR
    whereClause = whereClause.substring(0, whereClause.length - 3);
    // use the whereClause in the query
    ...
}
At the end of the above code, if I have 2 accounts, whereClause is:
' WHERE (fieldA = 'abcde' AND fieldB = '0003') OR (fieldA = 'abcde' AND fieldB = '0002') OR
I always have to remove the last ' OR' bit.
Is there a smarter way to construct the above?
Since the value of fieldA seems to be fixed within the loop, you could write this query instead:
WHERE fieldA = 'abcde' AND fieldB IN ('0002','0003')
Since you have no prepared SQL statement, where you could use the array directly, you have to join the values in a way similar to your approach.
If at least one value is guaranteed to exist, I always concatenate with a leading comma inside the loop and then take substr(1) of the result.
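Joining a list with a separator avoids trimming a leading or trailing piece afterwards. Here is a minimal sketch in Python that builds the IN form this way. It swaps in ? placeholders instead of quoting the values by hand, which assumes the database driver supports positional placeholders; the function name build_where is hypothetical.

```python
def build_where(field_a, field_b_values):
    """Return a WHERE clause with placeholders plus the matching parameter list."""
    if not field_b_values:
        raise ValueError("at least one fieldB value is required")
    # join() puts the separator only between items, so nothing needs trimming.
    placeholders = ", ".join("?" for _ in field_b_values)
    sql = "WHERE fieldA = ? AND fieldB IN (" + placeholders + ")"
    params = [field_a] + list(field_b_values)
    return sql, params


sql, params = build_where("abcde", ["0002", "0003"])
print(sql)
print(params)
```

The same join idea works in node.js with Array.prototype.map and Array.prototype.join.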

Lua table.toString(tableName) and table.fromString(stringTable) functions?

I want to convert a 2D Lua table into a string, and later convert that string back into a table. This process seems to be called serialization and is discussed at the URL below, but I am having a difficult time understanding the code there and was hoping someone here had simple table.toString and table.fromString functions.
http://lua-users.org/wiki/TableSerialization
I am using the following code in order to serialize tables:
function serializeTable(val, name, skipnewlines, depth)
    skipnewlines = skipnewlines or false
    depth = depth or 0
    local tmp = string.rep(" ", depth)
    if name then
        -- numeric keys must be written as [1] = ... to stay valid Lua
        local key = type(name) == "number" and ("[" .. name .. "]") or name
        tmp = tmp .. key .. " = "
    end
    if type(val) == "table" then
        tmp = tmp .. "{" .. (not skipnewlines and "\n" or "")
        for k, v in pairs(val) do
            tmp = tmp .. serializeTable(v, k, skipnewlines, depth + 1) .. "," .. (not skipnewlines and "\n" or "")
        end
        tmp = tmp .. string.rep(" ", depth) .. "}"
    elseif type(val) == "number" then
        tmp = tmp .. tostring(val)
    elseif type(val) == "string" then
        tmp = tmp .. string.format("%q", val)
    elseif type(val) == "boolean" then
        tmp = tmp .. (val and "true" or "false")
    else
        tmp = tmp .. "\"[unserializable datatype:" .. type(val) .. "]\""
    end
    return tmp
end
The generated code can then be executed with loadstring() (http://www.lua.org/manual/5.1/manual.html#pdf-loadstring), provided you either passed an argument for the 'name' parameter or prepend "return " to the string afterwards:
s = serializeTable({a = "foo", b = {c = 123, d = "foo"}})
print(s)
a = loadstring("return " .. s)()
The code lhf posted is a much simpler code example than anything from the page you linked, so hopefully you can understand it better. Adapting it to output a string instead of printing the output looks like:
t = {
    {11,12,13},
    {21,22,23},
}
local s = {"return {"}
for i=1,#t do
    s[#s+1] = "{"
    for j=1,#t[i] do
        s[#s+1] = t[i][j]
        s[#s+1] = ","
    end
    s[#s+1] = "},"
end
s[#s+1] = "}"
s = table.concat(s)
print(s)
The general idea with serialization is to take all the bits of data from some data structure like a table, and then loop through that data structure while building up a string that has all of those bits of data along with formatting characters.
How about a JSON module? That way your data is also easier to exchange with other programs. I usually prefer dkjson, which also supports UTF-8, whereas cmjjson does not.
Under Kong, this works:
local cjson = require "cjson"
kong.log.debug(cjson.encode(some_table))
Outside Kong, the lua-cjson package must be installed: https://github.com/openresty/lua-cjson/
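The JSON route is a round trip through an encoder and a decoder. Here is a minimal sketch of that idea using Python's standard json module rather than a Lua library (the dkjson API itself is not shown here), with the same 2D data as the examples in this thread:

```python
import json

# The 2D table from the examples, as nested lists.
table_2d = [[11, 12, 13], [21, 22, 23]]

encoded = json.dumps(table_2d)  # table -> string
decoded = json.loads(encoded)   # string -> table
```

Because the string is plain JSON, any other language with a JSON parser can read it back as well.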
Here is a simple program which assumes your table contains numbers only. It outputs Lua code that can be loaded back with loadstring()(). Adapt it to output to a string instead of printing it out. Hint: redefine print to collect the output into a table and then at the end turn the output table into a string with table.concat.
t = {
    {11,12,13},
    {21,22,23},
}
print"return {"
for i=1,#t do
    print"{"
    for j=1,#t[i] do
        print(t[i][j],",")
    end
    print"},"
end
print"}"
Assuming that:
You don't have loops (table a referencing table b and b referencing a)
Your tables are pure arrays (all keys are consecutive positive integers, starting at 1)
Your values are integers only (no strings, etc)
Then a recursive solution is easy to implement:
function serialize(t)
    local serializedValues = {}
    local value, serializedValue
    for i=1,#t do
        value = t[i]
        serializedValue = type(value)=='table' and serialize(value) or value
        table.insert(serializedValues, serializedValue)
    end
    return string.format("{ %s }", table.concat(serializedValues, ', ') )
end
Prepend a return to the string produced by this function and store it in a .lua file:
-- myfile.lua
return { { 1, 2, 3 }, { 4, 5, 6 } }
You can just use dofile to get the table back.
t = dofile 'myfile.lua'
Notes:
If you have loops, then you will have to handle them explicitly, usually with an extra table to "keep track" of repetitions.
If you don't have pure arrays, then you will have to parse t differently, as well as handle the way the keys are rendered (are they strings? are they other tables? etc).
If you have more than just integers and subtables, then calculating serializedValue will be more complex.
Regards!
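The first note, about loops, can be sketched as follows. This is a minimal Python analogue of the recursive function above, not the answer's own code: nested lists stand in for Lua array tables, and a set of object ids plays the role of the extra "keep track" table. It keeps the same assumptions (pure arrays, integer values only).

```python
def serialize(value, _active=None):
    """Serialize nested lists of integers, refusing reference cycles."""
    if _active is None:
        _active = set()
    if isinstance(value, list):
        if id(value) in _active:
            raise ValueError("cycle detected: a list contains itself")
        _active.add(id(value))  # remember lists currently being serialized
        try:
            inner = ", ".join(serialize(item, _active) for item in value)
        finally:
            _active.discard(id(value))
        return "{ " + inner + " }"
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("only integers and nested lists are supported")
    return str(value)


text = serialize([[1, 2, 3], [4, 5, 6]])
print(text)
```

A list that is merely referenced twice is still serialized twice; only a list that contains itself, directly or indirectly, is rejected.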
I have shorter code that converts a table to a string, but not the reverse:
function compileTable(t)
    -- the parameter is named t so it does not shadow the standard table library
    local parts = {}
    for index = 1, #t do
        local value = t[index]
        if type(value) == "table" then
            parts[#parts + 1] = compileTable(value)
        elseif type(value) == "number" then
            parts[#parts + 1] = tostring(value)
        elseif type(value) == "string" then
            parts[#parts + 1] = "\"" .. value .. "\""
        elseif value == nil then
            parts[#parts + 1] = "nil"
        elseif type(value) == "boolean" then
            parts[#parts + 1] = value and "true" or "false"
        end
        -- functions and userdata are skipped
    end
    return "{" .. table.concat(parts, ",") .. "}"
end
If you want to change the name, replace every occurrence of compileTable with your preferred name, because the function calls itself when it detects a nested table. I don't know whether escape sequences work.
If you use this to generate a Lua file that returns the table, you will get a compilation error if a string contains a newline or a " character.
This method is more memory efficient.
Note:
Functions are not supported.
Userdata: I don't know.
My solution:
local nl = string.char(10) -- newline
function serialize_list (tabl, indent)
    indent = indent and (indent.." ") or ""
    local str = ''
    str = str .. indent.."{"
    for key, value in pairs (tabl) do
        local pr = (type(key)=="string") and ('["'..key..'"]=') or ""
        if type (value) == "table" then
            str = str..nl..pr..serialize_list (value, indent)..','
        elseif type (value) == "string" then
            str = str..nl..indent..pr..'"'..tostring(value)..'",'
        else
            str = str..nl..indent..pr..tostring(value)..','
        end
    end
    if str:sub(-1) == "," then
        str = str:sub(1, #str-1) -- remove the trailing comma (an empty table has none)
    end
    str = str..nl..indent.."}"
    return str
end
-- tables is the table you want to serialize
local str = serialize_list(tables)
print('return '..nl..str)