How to properly clean column header in Power Query and capitalize first letter only without changing other letter? - header

I would like to clean a column Header of the table so that my column header that has a name like below:
[Space][Space][Space]First Name[Space][Space]
[Space]MaintActType[Space]
TECO date[Space]
FIN Date
ABC indicator
COGS
Created On
And my desired Column Header Name to be like below:
First Name
Main Act Type
TECO Date
FIN Date
ABC Indicator
COGS
Created On
my code is as below:
let
Source = Excel.Workbook(File.Contents("C:\RawData\sample.xlsx"), null, true),
#"sample_Sheet" = Source{[Item="sample",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(#"sample_Sheet", [PromoteAllScalars=true]),
#"Trim ColumnSpace" = Table.TransformColumnNames(#"Promoted Headers", Text.Trim),
#"Split CapitalLetter" = Table.TransformColumnNames(#"Trim ColumnSpace", each Text.Combine(Splitter.SplitTextByPositions(Text.PositionOfAny(_, {"A".."Z"},2)) (_), " ")),
#"Remove DoubleSpace" = Table.TransformColumnNames(#"Split CapitalLetter", each Replacer.ReplaceText(_, " ", " ")),
#"Capitalise FirstLetter" = Table.TransformColumnNames(#"Remove DoubleSpace", Text.Proper),
#"Remove Space" = Table.TransformColumnNames(#"Capitalise FirstLetter", each Text.Remove(_, {" "})),
#"Separate ColumnName" = Table.TransformColumnNames(#"Remove Space", each Text.Combine(Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"}) (_), " "))
in
#"Separate ColumnName"
However, i get the result as below. Which is not what i wanted as all the capital letter we combined together. How do i change the code so that i get the result as wanted? I would really appreciate your help, please.
First Name
Main Act Type
TECODate
FINDate
ABCIndicator
COGS
Created On
Alternatively, i changed the code to:
let
Source = Excel.Workbook(File.Contents("C:\RawData\sample.xlsx"), null, true),
#"sample_Sheet" = Source{[Item="sample",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(#"sample_Sheet", [PromoteAllScalars=true]),
#"Trim ColumnSpace" = Table.TransformColumnNames(Input, Text.Trim),
#"Separate ColumnName" = Table.TransformColumnNames(#"Trim ColumnSpace", each Text.Combine(Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"}) (_), " ")),
#"Capitalise FirstLetter" = Table.TransformColumnNames(#"Separate ColumnName", Text.Proper)
in
#"Capitalise FirstLetter"
Unfortunately it return the result like so:
First Name
Main Act Type
Teco Date
Fin Date
Abc Indicator
COGS
Created On
I have no idea how to play around the code anymore.

One way is to mark the existing spaces with something (I used "ZZZ") and restore them back to spaces at the end. Here's your code with a couple of tweaks. Thanks for your code. I was trying to do Text.Proper and your sample helped me!
let
Source = Input,
#"Replaced Value" = Table.ReplaceValue(Source,"[Space]"," ",Replacer.ReplaceText,{"Headers"}),
#"Transposed Table" = Table.Transpose(#"Replaced Value"),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
#"Trim ColumnSpace" = Table.TransformColumnNames(#"Promoted Headers", Text.Trim),
#"Change space to ZZZ" = Table.TransformColumnNames(#"Trim ColumnSpace", each Replacer.ReplaceText(_, " ", " ZZZ ")),
#"Split CapitalLetter" = Table.TransformColumnNames(#"Change space to ZZZ", each Text.Combine(Splitter.SplitTextByPositions(Text.PositionOfAny(_, {"A".."Z"},2)) (_), " ")),
#"Capitalise FirstLetter" = Table.TransformColumnNames(#"Split CapitalLetter", Text.Proper),
#"Remove Space" = Table.TransformColumnNames(#"Capitalise FirstLetter", each Text.Remove(_, {" "})),
#"Separate ColumnName" = Table.TransformColumnNames(#"Remove Space", each Text.Combine(Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"}) (_), " ")),
#"Change ZZZ to space" = Table.TransformColumnNames(#"Separate ColumnName", each Replacer.ReplaceText(_, "ZZZ", " ")),
#"Remove DoubleSpace" = Table.TransformColumnNames(#"Change ZZZ to space", each Replacer.ReplaceText(_, " ", " "))
in
#"Remove DoubleSpace"

Related

My input function and len function does not return a correct result in some cases

I have a question I wrote this script to format an arabic text to a certain format. But I found a problem then when I type a longer sentence it adds more dots then there are letters. I will paste the code here and explain what the problem is.
import itertools
while True:
input_cloze_deletion = input("\nEnter: text pr 'exit'\n> ")
input_exit_check = input_cloze_deletion.strip().lower()
if input_exit_check == "exit":
break
# copy paste
interpunction_list = ["الله", ",", ".", ":", "?", "!", "'"]
# copy paste
interpunction_list = [ ",", ".", ":", "?", "!", "'", "-", "(", ")", "/", "الله", "اللّٰـه"]
text_replace_0 = input_cloze_deletion.replace(",", " ,")
text_replace_1 = text_replace_0.replace(".", " .")
text_replace_2 = text_replace_1.replace(":", " :")
text_replace_3 = text_replace_2.replace(";", " ;")
text_replace_4 = text_replace_3.replace("?", " ?")
text_replace_5 = text_replace_4.replace("!", " !")
text_replace_6 = text_replace_5.replace("'", " ' ")
text_replace_7 = text_replace_6.replace("-", " - ")
text_replace_8 = text_replace_7.replace("(", " ( ")
text_replace_9 = text_replace_8.replace(")", " ) ")
text_replace_10 = text_replace_9.replace("/", " / ")
text_replace_11 = text_replace_10.replace("الله", "اللّٰـه")
text_split_list = text_replace_11.split()
count_number = []
letter_count_list = []
index_list = itertools.cycle(range(1, 4))
for letter_count in text_split_list:
if letter_count in interpunction_list:
letter_count_list.append(letter_count)
elif "ـ" in letter_count:
letter_count = len(letter_count) - 1
count_number.append(letter_count)
print(letter_count)
else:
letter_count = len(letter_count)
count_number.append(letter_count)
print(letter_count)
for count in count_number:
letter_count_list.append(letter_count * ".")
zip_list = zip(text_split_list, letter_count_list)
zip_list_result = list(zip_list)
for word, count in zip_list_result:
if ((len(word)) == 2 or word == "a" or word == "و") and not word in interpunction_list :
print(f" {{{{c{(next(index_list))}::{word}::{count}}}}}", end="")
elif word and count in interpunction_list:
print(word, end = "")
else:
print(f" {{{{c{(next(index_list))}::{word}::{count}}}}}", end="")
when I type كتب عليـ ـنا و علـ ـي
the return is {{c1::كتب::...}} {{c2::عليـ::...}} {{c3::ـنا::...}} {{c1::و::..}} {{c2::علـ::..}} {{c3::ـي::..}}
but it should be
{{c1::كتب::...}} {{c2::عليـ::...}} {{c3::ـنا::..}} {{c1::و::.}} {{c2::علـ::..}} {{c3::ـي::.}}
I add a print function the print the len() results and the result is correct but it add an extra dot in some case.
But when I type just a single "و" it does a correct len() function but when I input a whole sentence it add an extra dot and I don't know why.
please help

Assign Word Table Cell values to variable in Word Document

I am trying to create 2 variable from a string in the cell. cell string is "Mr Jonhattan Smith Sun". Value1 I want as "Jonhattan" and value2 as "Smith Sun". I have the following codes but doesn't seem to work properly. any Help Please
value1 = Left(ThirdTable.Rows(10).Cells(2).Range.text, Len(ThirdTable.Rows(10).Cells(2).Range.text) - InStrRev(ThirdTable.Rows(10).Cells(2).Range.text, " "))
value2 = Right(ThirdTable.Rows(10).Cells(2).Range.text, Len(ThirdTable.Rows(10).Cells(2).Range.text) - InStrRev(ThirdTable.Rows(10).Cells(2).Range.text, " ") + 1)
Try:
ss = Split(ThirdTable.Rows(10).Cells(2).Range.Text, " ")
Value1 = ss(1)
Value2 = ss(2) & " " & ss(3)
Given that
ThirdTable.Rows(10).Cells(2).Range.Text Gives you Mr Jonhattan Smith Sun
Demo:
If that Weird Letter comes up like in the Demo use:
Value2 = ss(2) & " " & Left(ss(3), Len(ss(3)) - 1)

Talend - Read a SQL from .txt and pass variables to it

I need to read an Excel file and excecute a different SQL Query (Oracle) for each row of the Excel, depending on the value that a column ("tipo") of the table has. Also, I need to pass variables to the SQL query that comes from others columns of the same excel.
I have accomplished this with a tjava, creating a string that generates de query.. somthing like this (simplified):
String var1 = input_row.var1;
String var2 = input_row.var2;
String sql_query = ""
if (tipo.toString().equals(String.valueOf("Option1"))) {
query_ev = "select " + var1 + " as variable1, " + var2 + " as var 2 from dual";}
if (tipo.toString().equals(String.valueOf("Option2"))) {
query_ev = "select " + var1+ " from dual";}
context.sql = sql_query;
Nevertheless, my "problem" is that my queries are long, and this method only allows me to put them in the tjava whithout line breaks, so its very difficult to edit them since they remain a "one-row-query" in the tjava.
Is there a way to accomplish the same but having the queries formatted ?
for example, for Option1 having this
select " + var1 + " as variable1, " + var2 + " as var 2
from dual
where
...
I appreciate any clue.
Try this:
query_ev = "select "
" + var1 + " as variable1, " + var2 + " as var 2 "+
from dual";
I am without Java right now to verify but I think this approach could help formatting. Just check that you have enough spaces - don't do this
"as var 2"+"from dual"
which would result in SQL like
"as var 2from dual"

groovy closure instantiate variables

is it possible to create a set of variables from a list of values using a closure??
the reason for asking this is to create some recursive functionality based on a list of (say) two three four or five parts
The code here of course doesn't work but any pointers would be helpful.then
def longthing = 'A for B with C in D on E'
//eg shopping for 30 mins with Fiona in Birmingham on Friday at 15:00
def breaks = [" on ", " in ", "with ", " for "]
def vary = ['when', 'place', 'with', 'event']
i = 0
line = place = with = event = ""
breaks.each{
shortline = longthing.split(breaks[i])
longthing= shortline[0]
//this is the line which obviously will not work
${vary[i]} = shortline[1]
rez[i] = shortline[1]
i++
}
return place + "; " + with + "; " + event
// looking for answer of D; C; B
EDIT>>
Yes I am trying to find a groovier way to clean up this, which i have to do after the each loop
len = rez[3].trim()
if(len.contains("all")){
len = "all"
} else if (len.contains(" ")){
len = len.substring(0, len.indexOf(" ")+2 )
}
len = len.replaceAll(" ", "")
with = rez[2].trim()
place = rez[1].trim()
when = rez[0].trim()
event = shortline[0]
and if I decide to add another item to the list (which I just did) I have to remember which [i] it is to extract it successfully
This is the worker part for then parsing dates/times to then use jChronic to convert natural text into Gregorian Calendar info so I can then set an event in a Google Calendar
How about:
def longthing = 'A for B with C in D on E'
def breaks = [" on ", " in ", "with ", " for "]
def vary = ['when', 'place', 'with', 'event']
rez = []
line = place = with = event = ""
breaks.eachWithIndex{ b, i ->
shortline = longthing.split(b)
longthing = shortline[0]
this[vary[i]] = shortline[1]
rez[i] = shortline[1]
}
return place + "; " + with + "; " + event
when you use a closure with a List and "each", groovy loops over the element in the List, putting the value in the list in the "it" variable. However, since you also want to keep track of the index, there is a groovy eachWithIndex that also passes in the index
http://groovy.codehaus.org/GDK+Extensions+to+Object
so something like
breaks.eachWithIndex {item, index ->
... code here ...
}

Is there a better way to code this sqlQuery in R?

I'm writing an R script to get some database data and then do stuff with it, using the RODBC package. Currently all my sqlQuery commands are one long string;
stsample<-sqlQuery(odcon, paste"select * from bob.DESIGNSAMPLE T1, bob.DESIGNSUBJECTGROUP T2, bob.DESIGNEVENT T3, bob.CONFIGSAMPLETYPES T4 WHERE T1.SUBJECTGROUPID = T2.SUBJECTGROUPID AND T1.TREATMENTEVENTID = T3.TREATMENTEVENTID AND T1.SAMPLETYPEKEY = T4.SAMPLETYPEKEY AND T1.STUDYID = T2.STUDYID AND T1.STUDYID = T3.STUDYID AND T1.STUDYID = ", chstudid, sep=""))
head(stsample)
which looks ugly and is hard to read/update. I've tried putting them multiline, but then new line characters get in the way, currently my best is this using lots of paste's;
stsample<-sqlQuery(odcon,
paste(
"select ",
"* ",
"from ",
"BOB.DESIGNSAMPLE T1, ",
"BOB.DESIGNSUBJECTGROUP T2, ",
"BOB.DESIGNEVENT T3, ",
"BOB.CONFIGSAMPLETYPES T4 ",
"WHERE ",
"T1.SUBJECTGROUPID = T2.SUBJECTGROUPID ",
"AND T1.TREATMENTEVENTID = T3.TREATMENTEVENTID ",
"AND T1.SAMPLETYPEKEY = T4.SAMPLETYPEKEY ",
"AND T1.STUDYID = T2.STUDYID ",
"AND T1.STUDYID = T3.STUDYID ",
"AND T1.STUDYID = ",chstudid,
sep="")
)
head(stsample)
But I don't like having to put quotes around everyline, and getting my whitespace correct. Is there a better way ?
I would use something like this:
stsample<-sqlQuery(odcon,
paste("
####DATASET CONSTRUCTION QUERY #########
select
*
from
BOB.DESIGNSAMPLE T1,
BOB.DESIGNSUBJECTGROUP T2,
BOB.DESIGNEVENT T3,
BOB.CONFIGSAMPLETYPES T4
WHERE
T1.SUBJECTGROUPID = T2.SUBJECTGROUPID
AND T1.TREATMENTEVENTID = T3.TREATMENTEVENTID
AND T1.SAMPLETYPEKEY = T4.SAMPLETYPEKEY
AND T1.STUDYID = T2.STUDYID
AND T1.STUDYID = T3.STUDYID
AND T1.STUDYID =
###################################
", as.character(chstudid), sep="")
)
What about using gsub("\n", " ", "long multiline select string") instead of paste?
This is a really old question, but thought this may be useful to someone.
One thing I have found useful is the GetoptLong package, which provides the qq() function. I think it is inspired by Perl, but essentially it provides a way to do a multiline string with easy variable interpolation. For example:
library(GetoptLong)
tableName <- "myTable"
id <- 42
sqlQuery(odcon, qq("
SELECT * FROM #{tableName}
WHERE id = #{id}
LIMIT 1
")
Obviously I should mention the usual caveat that this is a bad idea if you are working directly with user input and it would be better to use some kind of prepared statement in that case.