Below is my awk code, which sorts and splits input files and writes the results to new output files. My problem is that I don't want any spaces (" ") in the output filenames, but the filenames are built from the first four columns of the input: file= path ""$1""$2""$3""$4"_03042017.csv". I have tried using gsub to remove the spaces, but it also removes the spaces inside the file contents, which is not the outcome I want. I only want to remove spaces from the filename. Can anyone please help? I'd appreciate it a lot.
awk -F"|" 'BEGIN { OFS = "|" } NR==1 {
for( i=1;i<5;i++) $i = ""
h = substr($0, index($0,$5)); sub(/^[[:blank:]]+/,"", h)
next
}
{
file= path ""$1""$2"_"$3"_"$4"_03042017.csv"
# remove 4 first field
for( i=1;i<5;i++) $i = ""
# cleaning starting space
Cleaned = substr($0, index($0,$5)); sub( /^[[:blank:]]+/, "", Cleaned)
print ( a[file]++ ? "" : "DM9 03042017" ORS h ORS ) Cleaned > file
}
END {
for(file in a) print "EOF " a[file] > file
} ' file1
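One way to keep the data untouched is to strip the spaces from the filename string after it has been built, instead of from the record. In awk, gsub accepts a target as its third argument, so gsub(/ /, "", file) changes only the variable file. The same idea as a minimal Python sketch; the function name output_name and the sample record are made up for illustration:

```python
def output_name(record, path="", suffix="_03042017.csv"):
    """Build the output filename from the first four '|'-separated fields."""
    f = record.split("|")
    # Same layout as the awk script: $1 $2 "_" $3 "_" $4 plus the date suffix.
    name = f[0] + f[1] + "_" + f[2] + "_" + f[3] + suffix
    # Remove spaces from the filename only; `record` itself is never modified.
    return path + name.replace(" ", "")


record = "AB C|12 3|X Y|Z|some data|more data"  # hypothetical input line
print(output_name(record))
print(record)  # the data still contains its spaces
```

The point is that the substitution is applied to the filename string alone, so whatever is later written to the file keeps its original spacing.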
I am trying to do a text processing script, for what seems to be a rather simple task.
I have a file, which contains the following repeated pattern
111 0 1000 other stuff #<- here a new element begins
some text & #<- "&" or white spaces increment -
some more #<- signal continue on next line
last line
221 1 1.22E22 # new element $2!=0 must be followed by float
text &
contiuned text
c comment line in between
more text &
last line
2221 88 -12.123 &
line1
line2
c comment line
last line
223 0 lll -111 $ element given by line
22 22 -3.14 $ element given by new line
I would like to get
111 0 1000 other stuff #<- here a new element begins
some text & #<- "&" or white spaces increment -
some more #<- signal continue on next line
last line &
xyz=1
221 1 1.22E22 # new element $2!=0 must be followed by float
text &
contiuned text
c comment line in between
more text &
last line &
xyz=1
2221 88 -12.123 &
line1
line2
c comment line
last line &
xyz=1
223 0 lll -111 & $ element given by line
xyz=1
22 22 -3.14 & $ element given by new line
xyz=1
I would like to develop an awk script that appends a string to the last line of each element. To do so, my script looks for the new-element pattern and continues to read until one of the next-element indicators is found. Unfortunately, it does not work properly: it prints the last line twice and fails to append to the very last line of the file.
function newelement(line) {
split(line, s, " ")
if (s[1] ~/^[0-9]+$/ && ((s[2] ~/^[0-9]+$/ && s[3] ~/\./) || (s[2] == 0 && s[3] !~/\./))) {
return 1
} else {
return -1
}
}
function contline(line) {
if (line~/&/ || line~/^[cC]/ || line~/^\s{3,10}[^\s]./) {
return 1
} else {
return -1
}
}
BEGIN {
subs = " xyz=1 "
} #increment to have the next line in store
FNR == 1 {
getline nextline < FILENAME
}
{
# get the next line
getline nextline < FILENAME
if (newelement($0) == 1 && NR < 3673) {
if (length(old) > 0 || $0~/^$/) {
printf("%s &\n%20s\n", old, subs)
print $0
}
# to capture one line elements with no following continuation
# i.e.
# 221 91 0.5 33333
# 22 0 11
#look at the next line
else if (($0!~/&/ && contline(nextline) == -1)) {
printf("%s &\n%20s\n", $0, subs)
}
}
else {
print "-" $0
}
# store last not - commented line
if ($0!~/^\s{0,20}[cC]/) old = $0
}
A comment line starts with c, or with c followed by a space. Comment lines should be preserved, but no strings should be appended to them.
Please check the following code and let me know if it works for you:
$ cat 3.1.awk
BEGIN{
subs = " xyz=1 "
threshold = 3673
}
# return boolean if the current line is a new element
function is_new_element(){
return ($1~/^[0-9]+$/) && (($2 ~ /^[0-9]+$/ && $3~/\./) || ($2 == 0 && $3 !~/\./))
}
# return boolean if the current line is a comment or empty line
function is_comment() {
return /^\s*[cC] / || /^\s*$/
}
# function to append extra text to line
# and followed by comments if applicable
function get_extra_text( extra_text) {
extra_text = sprintf("%s &\n%20s", prev, subs)
text = (text ? text ORS : "") extra_text
if (prev_is_comment) {
text = text ORS comment
prev_is_comment = 0
comment = ""
}
return text
}
NR < threshold {
# replace the above line with the following one if
# you want to process up to the first EMPTY line
#NR==1,/^\s*$/ {
# if the current line is a new element
if (is_new_element()) {
# save the last line and preceeding comments
# into the variable 'text', skip the first new element
if (has_hit_first_new_element) text = get_extra_text()
has_hit_first_new_element = 1
prev_is_new = 1
# before hitting the first new_element line, all lines
# should be printed as-is
} else if (!has_hit_first_new_element) {
print
next
# if current line is a comment
} else if (is_comment()) {
comment = (comment ? comment ORS : "") $0
prev_is_comment = 1
next
# if the current line is neither new nor comment
} else {
# if previous line a new element
if (prev_is_new) {
print (text ? text ORS : "") prev
text = ""
# if previous line is comment
} else if (prev_is_comment) {
print prev ORS comment
prev_is_comment = 0
comment = ""
} else {
print prev
}
prev_is_new = 0
}
# prev saves the last non-comment line
prev = $0
next
}
# print the last block if NR >= threshold
!is_last_block_printed {
print get_extra_text()
is_last_block_printed = 1;
}
# print lines when NR > threshold or after the first EMPTY line
{ print "-" $0 }
Where
The lines are divided into 3 categories and processed differently:
new element: is_new_element() returns true when the current line is a new element; the flag prev_is_new records that the previous line was a new element
comment: is_comment() returns true when the current line is a comment (or an empty line); the flag prev_is_comment records that a comment block precedes the current line
other lines: all lines not covered by the two categories above
Other notes:
You can select lines with NR < threshold (which is 3673 in your code), or use a range pattern NR==1,/^\s*$/ to process only a range of lines.
is_last_block_printed flag and related code are to make sure the last processing block is printed either at the end of the above range or in the END{} block
I did not check the trailing & for the continuing line, if they are followed by a comment or a new element, the logic has to be defined, i.e. which one should take precedence
If there are other lines before the first is_new_element() line, the code will not work well. This can be fixed by adding another flag instead of using if (NR > 1) to update text.
Testing Sample:
$ cat 3.1.txt
111 0 1000 other stuff #<- here a new element begins
some text & #<- "&" or white spaces increment -
some more #<- signal continue on next line
last line
221 1 1.22E22 # new element $2!=0 must be followed by float
text &
contiuned text
c comment line in between
more text &
last line
2221 88 -12.123 &
line1
line2
c comment line 1
last line
c comment line 2
c comment line 3
c comment line 4
c comment line 5
223 0 lll -111
223 0 22 -111
223 0 22 -111
c comment line in between 1
c comment line in between 2
22 22 -3.14
c comment line at the end
Output:
$ awk -f 3.1.awk 3.1.txt
111 0 1000 other stuff #<- here a new element begins
some text & #<- "&" or white spaces increment -
some more #<- signal continue on next line
last line &
xyz=1
221 1 1.22E22 # new element $2!=0 must be followed by float
text &
contiuned text
c comment line in between
more text &
last line &
xyz=1
2221 88 -12.123 &
line1
line2
c comment line 1
last line &
xyz=1
c comment line 2
c comment line 3
c comment line 4
c comment line 5
223 0 lll -111 &
xyz=1
223 0 22 -111 &
xyz=1
223 0 22 -111 &
xyz=1
c comment line in between 1
c comment line in between 2
22 22 -3.14 &
xyz=1
c comment line at the end
Some extra explanation:
One concern when processing the text is the trailing newline "\n" when appending subs to the prev line. This is especially important when consecutive new_element lines occur.
Note that the variable prev in the code is defined as the previous non-comment line (categories 1 and 3 defined above). There can be zero or more comment (category 2) lines between the prev line and the current line. That is also why we use print prev ORS comment instead of print comment ORS prev when printing regular comments (not those preceding a new_element line).
A block of comment lines (one or more consecutive comment lines) is saved into the variable comment. If the block is right before a new_element line, it is appended to the variable text. All other blocks of comments are printed by the print prev ORS comment line mentioned above.
The function get_extra_text() builds the extra_text in the order prev subs ORS comments, where comments is appended only when the prev_is_comment flag is 1. Note that the same variable text may hold multiple prev subs ORS comments blocks if there are consecutive new_element lines.
We only print on the category 3 lines mentioned above (neither a new_element nor a comment). This is a safe place where we do not have to worry about the trailing newline or the extra_text:
if prev_is_new is set, we print the cached text and then the variable prev (which is a new_element)
if prev_is_comment is set, we just print prev ORS comment. Note again that prev holds the last non-comment line before the current line; it does not have to be the line right above the current line.
in all other cases, we just print the prev line as-is
Since we are concatenating lines into text and comment variables, we use the following syntax to avoid the leading ORS (which is "\n" by default)
text = (text ? text ORS : "") prev
If the leading ORS is not a concern, you can just use the following:
text = text ORS prev
and because the lines are appended to these variables, we will need to reset
them (i.e. text = "") each time after we consume them, otherwise, the
concatenated variable will contain all previously processed lines.
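The accumulate-then-reset pattern described above can be sketched in Python as follows. This is only an illustration of the idiom; append_line and the sample strings are hypothetical names, and ORS here is an ordinary variable standing in for awk's output record separator:

```python
ORS = "\n"  # stands in for awk's output record separator


def append_line(buffer, line):
    """Append `line` to `buffer`, adding ORS only when `buffer` is non-empty."""
    return (buffer + ORS if buffer else "") + line


text = ""
for line in ["first", "second"]:
    text = append_line(text, line)
print(text)  # no leading newline before "first"

# Reset the buffer once it has been consumed, otherwise the next
# block would still carry every previously stored line.
text = ""
text = append_line(text, "third")
print(text)
```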
Final notes
Added a flag has_hit_first_new_element: if there are lines before the first new_element line, they are printed as-is. In this code the first new_element line has to be treated differently, and relying on NR==1 for that is not safe.
removed the code in the END{} block which is redundant
Try this:
function newelement(line){
split(line,s," ")
if(s[1]~/^[0-9]+$/ && ((s[2]~/^[0-9]+$/ && s[3]~/\./)|| (s[2]==0 && s[3]!~/\./))){return 1}
else{return -1}
}
BEGIN{
subs=" xyz=1 "
}
{
if (length($0)==0) next # Skip empty lines, remove or change it according to your needs.
if (newelement($0)==1){
if (length(last_data)>0) {
printf("%s &\n%20s\n",last_data,subs)
if (last_type=="c") {
print comments
}
}
last_data=$0
last_type="i"
} else if($0 ~/^\s*[cC] /) {
if (last_type=="c") comments = comments ORS $0
else comments = $0
last_type="c"
} else {
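# print the pending data line first, then any buffered comments,
# so a data line followed by a comment is not lost
if (length(last_data)>0) print last_data
if (last_type=="c") print comments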
last_data=$0
last_type="d"
}
}
END{
printf("%s &\n%20s\n",last_data,subs)
if (last_type=="c") print comments
}
Three variables:
last_data to hold last data line.
last_type to hold the type of the last line: i for indicator (new element), c for comments, d for any other data line.
comments to hold comments line(s).
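For readers who find the state handling easier to follow outside awk, here is a rough Python sketch of the same three-variable approach. It is not a line-for-line port. The names is_new_element and append_marker are made up. The test s[2]==0 is simplified to a literal "0". A pending data line is always emitted before any buffered comments, so that a data line followed by a comment is kept:

```python
import re

SUBS = " xyz=1 "


def is_new_element(line):
    """Rough equivalent of newelement(); only a literal "0" counts as zero."""
    s = (line.split() + ["", "", ""])[:3]
    if not re.fullmatch(r"[0-9]+", s[0]):
        return False
    has_dot = "." in s[2]
    if re.fullmatch(r"[0-9]+", s[1]) and has_dot:
        return True
    return s[1] == "0" and not has_dot


def append_marker(lines):
    out = []
    last_data, last_type, comments = "", "", []
    for line in lines:
        if not line:
            continue  # skip empty lines, as in the awk version
        if is_new_element(line):
            if last_data:
                # close the previous element, then emit buffered comments
                out += [last_data + " &", SUBS.rjust(20)]
                if last_type == "c":
                    out += comments
            last_data, last_type = line, "i"
        elif re.match(r"\s*[cC] ", line):
            comments = comments + [line] if last_type == "c" else [line]
            last_type = "c"
        else:
            if last_data:
                out.append(last_data)
            if last_type == "c":
                out += comments
            last_data, last_type = line, "d"
    # the END block: close the final element
    out += [last_data + " &", SUBS.rjust(20)]
    if last_type == "c":
        out += comments
    return out


sample = [
    "111 0 1000 other stuff",
    "    some text &",
    "    last line",
    "c comment",
    "221 1 1.22E22",
    "    text",
]
print("\n".join(append_marker(sample)))
```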
Issue: when the character I am looking for does not exist, I get a blank string
Aim: To look for three characters in order and only get the characters to the left of the character I am looking for. However if the character does not exist then to do nothing.
Code:
Dim vleftString As String = File.Name
vleftString = Left(vleftString, InStr(vleftString, "-"))
vleftString = Left(vleftString, InStr(vleftString, "_"))
vleftString = Left(vleftString, InStr(vleftString, " "))
As a 'fix' I have done
Dim vleftString As String = File.Name
vleftString = Replace(vleftString, "-", " ")
vleftString = Replace(vleftString, "_", " ")
vleftString = Left(vleftString, InStr(vleftString, " "))
vleftString = Trim(vleftString)
Based on Left of a character in a string in vb.net
If File.Name is, say, 1_2.pdf, it passes the "-" line and then works on the "_" line, removing everything after "_" (though not the "_" itself, which I also want removed).
When it reaches the line that looks for anything left of a space, it makes vleftString blank.
Since I'm not familiar with the old VB functions (and avoid them), here is a .NET approach. I assume you want to remove the parts after the separators "-", "_" and " "; if so, you can use this loop:
Dim fileName = "1_2.pdf".Trim() ' Trim used to show you the method, here nonsense
Dim name = Path.GetFileNameWithoutExtension(fileName).Trim()
For Each separator In {"-", "_", " "}
Dim index = name.IndexOf(separator)
If index >= 0 Then
name = name.Substring(0, index)
End If
Next
fileName = String.Format("{0}{1}", name, Path.GetExtension(fileName))
Result: "1.pdf"
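The key step in the loop is the index >= 0 check, which leaves the name alone when a separator is absent. The same guard in a short Python sketch, where str.find returns -1 for a missing separator (the function name cut_at_separators is made up for illustration):

```python
import os


def cut_at_separators(file_name, separators=("-", "_", " ")):
    """Keep only the part of the name left of each separator, if present."""
    stem, ext = os.path.splitext(file_name)
    for sep in separators:
        index = stem.find(sep)  # -1 when the separator does not occur
        if index >= 0:
            stem = stem[:index]
    return stem + ext


print(cut_at_separators("1_2.pdf"))
print(cut_at_separators("report.pdf"))  # no separator, so nothing is cut
```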
I am currently using VBA script to transfer a CSV table to Access (from inside Excel)
objAccess.DoCmd.TransferText acImportDelim, "", _
"table1", "C:\donPablo\StackOverFlow\StackCSV.csv", True
The problem is that Access incorrectly defines Types for my columns.
Some of my columns contain both text and numeric values, which is why half of the imported rows are damaged with the error "Type Conversion Failure".
I have read on the internet that you can fix that by
Creating the table with the exact same name and with predefined types for columns
objAccess.DoCmd.RunSQL "CREATE TABLE " + cstrTable + "(id Text);"
That didn't work. The same error.
Adding first column of type Text into the CSV file
So I added a row which is 100% text. The same error.
It seems like there is some kind of "clever" conversion going on inside Access and I can't bypass it.
The only possible scenario to bypass this conversion would be to convert all entries inside CSV file using this logic:
Before:
value1,value2,"value3", value4
After
"value1","value2","value3", "value4"
Is there a way to do this operation? regex of some kind maybe?
I created a hardcode solution for the above mentioned problem (no regex)
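Dim current_Char As String, next_Char As String, last_Char As String
Dim newLine As String, ignoretext As Boolean
Dim Counter As Long
' currentLine is assumed to already hold one line read from the CSV file
ignoretext = False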
For Counter = 1 To Len(currentLine)
current_Char = Mid(currentLine, Counter, 1)
next_Char = Mid(currentLine, Counter + 1, 1)
If ignoretext = False And current_Char = """" Then 'opening of existing quote
ignoretext = True
newLine = newLine & current_Char
ElseIf ignoretext = True And current_Char = """" Then 'ending of existing quote
ignoretext = False
newLine = newLine & current_Char
ElseIf ignoretext = True Then
newLine = newLine & current_Char
ElseIf ignoretext = False Then
If current_Char = "," Then
If last_Char <> """" Then
newLine = newLine & """"
End If
newLine = newLine & current_Char
If next_Char <> """" Then
newLine = newLine & """"
End If
Else
newLine = newLine & current_Char
End If
Else
newLine = newLine & current_Char
End If
last_Char = current_Char
Next
If Mid(currentLine, 1, 1) <> """" Then
newLine = """" & newLine
End If
If Mid(currentLine, Len(currentLine), 1) <> """" Then
newLine = newLine & """"
End If
Some of the variable definitions are probably missing, but the logic is still there ;)
What it does, basically:
before
"fsdf, dfafs",val,",fsd",156,fsd
after
"fsdf, dfafs","val",",fsd","156","fsd"
all of your fields are now Text type ;)
logic ignores all the commas once existing quote is detected
it will continue to ignore unless ending quote is found
for all other commas logic will add a before_quote if previous Char was not a quote
for all other commas logic will add a after_quote if next Char will not be a quote
all other chars will be appended to string AS IS
finally, at the end of the logic we will add quote at Start or End of string depending on existence of existing quotes at mentioned positions
have fun
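If the file can be pre-processed outside VBA, the same "quote every field" transformation can be sketched with Python's standard csv module, which already handles commas inside existing quotes. This is an alternative to the hand-written loop above, not a translation of it, and quote_all_fields is a made-up name:

```python
import csv
import io


def quote_all_fields(line):
    """Parse one CSV line and write it back with every field quoted."""
    row = next(csv.reader([line]))
    buffer = io.StringIO()
    csv.writer(buffer, quoting=csv.QUOTE_ALL).writerow(row)
    # the writer appends a line terminator; drop it for single-line use
    return buffer.getvalue().rstrip("\r\n")


print(quote_all_fields('"fsdf, dfafs",val,",fsd",156,fsd'))
```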
I'm trying to extract the artwork file from my iTunes MP3 files using AutoHotkey (v1.1). The script works well until it gets to the SaveArtworkToFile method.
objITunesApp := ComObjCreate("iTunes.Application")
objLibrary := objITunesApp.Sources.Item(1)
objPlaylist := objLibrary.Playlists.ItemByName("! iTunesCovers")
objTracks := objPlaylist.Tracks
Loop, % objTracks.Count
{
objTrack := objTracks.Item(A_Index)
Loop, % objTrack.Artwork.Count
{
objArtwork := objTrack.Artwork.Item(A_Index)
TrayTip, % "Track Index: " . objTrack.index
, % "Artwork: " . A_Index . "/" . objTrack.Artwork.Count . "`n"
. "Format: " . objArtwork.Format . "`n"
. "IsDownloadedArtwork: " . objArtwork.IsDownloadedArtwork . "`n"
. "Description: " . objArtwork.Description
strFilePath := objTrack.index . "-" . A_Index
if (objArtwork.Format = 1)
strExtension := "bmp"
else if (objArtwork.Format = 2)
strExtension := "jpg"
else if (objArtwork.Format = 4)
strExtension := "gif"
else if (objArtwork.Format = 5)
strExtension := "png"
else
strExtension := ""
strResult := objArtwork.SaveArtworkToFile(strFilePath . "." . strExtension)
MsgBox, % strFilePath . "." . strExtension . "`nResult: " . strResult
}
}
I get this error message:
---------------------------
SaveArtworkToFile.ahk
---------------------------
Error: 0x8000FFFF - Défaillance irrémédiable (Catastrophic failure)
Source: (null)
Description: (null)
HelpFile: (null)
HelpContext: 0
Specifically: SaveArtworkToFile
Line#
---> 017: strResult := objArtwork.SaveArtworkToFile(strFilePath)
---------------------------
I get the same result with the bmp and jpg file formats, and the strResult returned by SaveArtworkToFile is empty. Should this method be supported by the AHK iTunes.Application COM object?
Thanks and Happy New Year!
@Manuell: Oh! Thanks for bringing the doc back to my attention. In the
Parameters: filePath Full path to the artwork image file.
I missed the word "Full". In my script, I was relying on a relative path. I just tested it with an absolute path and this works!
Googled it for you: IITArtwork::SaveArtworkToFile
HRESULT IITArtwork::SaveArtworkToFile ( [in] BSTR filePath )
Save artwork data to an image file.
The format of the saved data is specified by the artwork's format
(JPEG, PNG, or BMP). The directory that contains the file must already
exist, it will not be created. If the file already exists, its
contents will be replaced.
Parameters: filePath Full path to the artwork image file.
That method doesn't return a value (as Hans said in a comment). Try:
objArtwork.SaveArtworkToFile(strFilePath . "." . strExtension)
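Since the documentation asks for a full path and an existing directory, the path can be prepared before the COM call. A small Python sketch of that preparation step only (it does not talk to iTunes); the helper name artwork_path is hypothetical, and the format codes are the ones used in the script above:

```python
import os

# Format codes as used in the AutoHotkey script above.
EXTENSIONS = {1: "bmp", 2: "jpg", 4: "gif", 5: "png"}


def artwork_path(track_index, artwork_index, fmt, directory="."):
    """Return an absolute file path and make sure its directory exists."""
    folder = os.path.abspath(directory)
    os.makedirs(folder, exist_ok=True)  # the COM method will not create it
    name = f"{track_index}-{artwork_index}.{EXTENSIONS.get(fmt, '')}"
    return os.path.join(folder, name)


print(artwork_path(3, 1, 2))
```

The resulting string is what would be passed as filePath.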
I am populating a listbox with some text & saving the output to textfile (sObj.txt)
'Saving items of lb1 in a file under C:\temp
Dim i As Integer
W = New IO.StreamWriter("C:\temp\sObj.txt")
For i = 0 To lb1.Items.Count - 1
W.WriteLine(lb1.Items.Item(i))
Next
W.Close()
This text file contains 3 (for example) entries, let's say abc in 1st line, def in 2nd line & ghi in the 3rd line.
Now I want to append another text file (MPadd.txt) using sObj.txt entries such that I get something like the following:
'Appending listbox items to the file MPadd.txt
Using SW As New IO.StreamWriter("c:\temp\MPadd.txt", True)
SW.WriteLine("some text" & abc & "some text")
SW.WriteLine("some text" & def & "some text")
SW.WriteLine("some text" & ghi & "some text")
End Using
Please help in getting it correctly. thanks.
Just read all the lines from the first file (just three lines so it is not a problem) and then loop over these lines adding prefix and postfix text as you like
EDIT
Following your last example
Dim commands() =
{
"cdhdef -t ftpv2 -c r -f {0} -x ",
"cdhdsdef -v CPUSRG {0} ",
"cacls K:\AES\data\Cdh\ftp\{0}\Archive /E /G OSSUSER:C"
}
Dim counter As Integer = 0
Dim objLines = IO.File.ReadAllLines("C:\temp\sObj.txt")
Using SW As New IO.StreamWriter("c:\temp\MPadd.txt", True)
' Loop only over the lines that have a matching command (max 3 now)
For x = 0 To Math.Min(commands.Length, objLines.Length) - 1
Dim line = objLines(x).Trim()
' This check will prevent empty lines to be used for the output
If line <> string.Empty Then
SW.WriteLine(string.Format(commands(counter), line))
counter += 1
End If
Next
End Using
This example uses composite formatting, where you define a format string with a numbered placeholder at the position where you want to insert another value.
Of course, this will work only if you have just 3 lines in your input file.
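The same placeholder idea in a short Python sketch, using str.format with the {0} placeholder. Pairing templates and names with zip stops at the shorter of the two, so a short or long input file does not cause an index error. The function name build_commands is made up, and unlike the VB code this version drops empty lines before pairing:

```python
COMMANDS = [
    "cdhdef -t ftpv2 -c r -f {0} -x ",
    "cdhdsdef -v CPUSRG {0} ",
    r"cacls K:\AES\data\Cdh\ftp\{0}\Archive /E /G OSSUSER:C",
]


def build_commands(names):
    """Fill each command template with the matching non-empty name."""
    cleaned = [n.strip() for n in names if n.strip()]
    # zip stops at the shorter sequence, so extra names or templates are ignored
    return [template.format(name) for template, name in zip(COMMANDS, cleaned)]


for command in build_commands(["abc", "def", "ghi"]):
    print(command)
```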