Remove all instances of a string 7 characters long in a textbox in VB - vb.net

I have two text boxes. The first contains the text shown below, exactly as it appears.
I need to remove the first 7 characters of each row and then show the edited text in the second box.
The first number is different every time, so I can't use this:
RawText.Text = Replace(RawText.Text, "1757792", " ")
TextFilter.Text = RawText.Text
because the number changes every row.
Is there a way to have a button remove ALL instances of ANY text 7 characters long?
1757792 02 08 09 10 15 21 22 29 34 40 44 46 47 48 53 56 58 68 69 71
1757793 01 07 16 20 22 25 30 36 38 39 42 48 49 51 58 66 70 72 79 80
1757794 01 02 07 09 10 18 29 32 35 36 48 53 54 56 62 65 68 69 71 73
1757795 01 02 06 09 12 18 23 27 30 35 43 52 57 59 60 61 62 73 74 76
1757796 01 11 13 14 18 19 22 31 34 41 45 46 54 57 61 70 71 72 79 80
1757797 01 08 10 18 19 21 32 41 43 44 45 54 61 62 64 66 68 73 74 80
1757798 02 03 06 09 10 23 27 28 33 36 38 41 49 53 60 61 64 73 74 80
1757799 02 12 16 34 36 44 51 52 55 57 58 59 64 71 73 75 76 78 79 80
1757800 05 11 13 17 18 19 23 24 27 31 34 38 39 45 48 61 67 73 79 80
1757801 17 23 29 31 35 38 43 45 48 51 56 57 60 64 65 66 67 73 77 78
1757802 05 06 11 14 17 20 21 27 28 29 33 41 45 49 58 66 67 73 79 80
1757803 06 07 10 11 12 19 20 21 25 30 33 35 38 42 46 51 65 66 75 80
1757804 06 14 16 19 20 23 32 42 43 44 48 52 62 67 68 69 71 72 74 78

You can use string methods for this. If you really want to remove the first seven characters of each line, you can use String.Substring:
Dim txt2Lines = From l In RawText.Lines
                Let index = Math.Min(l.Length, 7)
                Select l.Substring(index)
txt2.Lines = txt2Lines.ToArray()
This also handles the case where some lines are shorter than seven characters.
Note that it doesn't remove the leading space, since that space is not part of the first seven characters. You could use l.Substring(index).TrimStart() instead.
Another approach is to search for the first space and remove everything before it:
Dim txt2Lines = From l In RawText.Lines
                Let index = Math.Max(l.IndexOf(" "), 0)
                Select l.Substring(index)
txt2.Lines = txt2Lines.ToArray()
String.IndexOf returns -1 if the substring wasn't found; that's why I've used Math.Max(l.IndexOf(" "), 0), so that in that case the full line is taken.

You could use String.Split to split the text at vbCrLf (the line break), then use String.Substring to take the part of each line starting at index 7 (or index 8, if you also want to drop the space that follows the number), and there you are.
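A minimal sketch of that idea, assuming the RawText and TextFilter text boxes from the question (index 8 is used here, so the space after the seven-digit number is dropped as well):
' Sketch only; requires Imports System.Linq.
Dim lines = RawText.Text.Split(New String() {vbCrLf}, StringSplitOptions.None)
Dim trimmed = lines.Select(Function(l) l.Substring(Math.Min(l.Length, 8)))
TextFilter.Text = String.Join(vbCrLf, trimmed)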
And as GSerg pointed out, if you would like to replace all seven-digit occurrences, try this:
' Requires Imports System.Text.RegularExpressions.
' SubjectString is the input text (e.g. RawText.Text).
Dim ResultString As String
Try
    ResultString = Regex.Replace(SubjectString, "\d{7}", "", RegexOptions.Singleline)
Catch ex As ArgumentException
    ' Syntax error in the regular expression
End Try

Related

kafka consumer .net 'Protocol message end-group tag did not match expected tag.'

I am trying to read data from Kafka, as you can see:
var config = new ConsumerConfig
{
    BootstrapServers = "*******",
    GroupId = Guid.NewGuid().ToString(),
    AutoOffsetReset = AutoOffsetReset.Earliest
};
MessageParser<AdminIpoChange> parser = new(() => new AdminIpoChange());
using (var consumer = new ConsumerBuilder<Ignore, byte[]>(config).Build())
{
    consumer.Subscribe("AdminIpoChange");
    while (true)
    {
        AdminIpoChange item = new AdminIpoChange();
        var cr = consumer.Consume();
        item = parser.ParseFrom(new ReadOnlySpan<byte>(cr.Message.Value).ToArray());
    }
    consumer.Close();
}
I am using Google Protobuf to send and receive data. This code returns this error at the parser line:
KafkaConsumer.ConsumeAsync: Protocol message end-group tag did not match expected tag.
Google.Protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
at Google.Protobuf.ParsingPrimitivesMessages.CheckLastTagWas(ParserInternalState& state, UInt32 expectedTag)
at Google.Protobuf.ParsingPrimitivesMessages.ReadGroup(ParseContext& ctx, Int32 fieldNumber, UnknownFieldSet set)
at Google.Protobuf.UnknownFieldSet.MergeFieldFrom(ParseContext& ctx)
at Google.Protobuf.UnknownFieldSet.MergeFieldFrom(UnknownFieldSet unknownFields, ParseContext& ctx)
at AdminIpoChange.pb::Google.Protobuf.IBufferMessage.InternalMergeFrom(ParseContext& input) in D:\MofidProject\domain\obj\Debug\net6.0\Protos\Rlc\AdminIpoChange.cs:line 213
at Google.Protobuf.ParsingPrimitivesMessages.ReadRawMessage(ParseContext& ctx, IMessage message)
at Google.Protobuf.CodedInputStream.ReadRawMessage(IMessage message)
at AdminIpoChange.MergeFrom(CodedInputStream input) in D:\MofidProject\domain\obj\Debug\net6.0\Protos\Rlc\AdminIpoChange.cs:line 188
at Google.Protobuf.MessageExtensions.MergeFrom(IMessage message, Byte[] data, Boolean discardUnknownFields, ExtensionRegistry registry)
at Google.Protobuf.MessageParser`1.ParseFrom(Byte[] data)
at infrastructure.Queue.Kafka.KafkaConsumer.ConsumeCarefully[T](Func`2 consumeFunc, String topic, String group) in D:\MofidProject\infrastructure\Queue\Kafka\KafkaConsumer.cs:line 168
D:\MofidProject\mts.consumer.plus\bin\Debug\net6.0\mts.consumer.plus.exe (process 15516) exited with code -1001.
Update:
My sample data coming from Kafka:
- {"SymbolName":"\u0641\u062F\u0631","SymbolIsin":"IRo3pzAZ0002","Date":"1400/12/15","Time":"08:00-12:00","MinPrice":17726,"MaxPrice":21666,"Share":1000,"Show":false,"Operation":0,"Id":"100d8e0b54154e9d902054bff193e875","CreateDateTime":"2022-02-26T09:47:20.0134757+03:30"}
My Rlc model:
syntax = "proto3";

message AdminIpoChange
{
    string Id = 1;
    string SymbolName = 2;
    string SymbolIsin = 3;
    string Date = 4;
    string Time = 5;
    double MinPrice = 6;
    double MaxPrice = 7;
    int32 Share = 8;
    bool Show = 9;
    int32 Operation = 10;
    string CreateDateTime = 11;

    enum AdminIpoOperation
    {
        Add = 0;
        Edit = 1;
        Delete = 2;
    }
}
My data in bytes:
7B 22 53 79 6D 62 6F 6C 4E 61 6D 65 22 3A 22 5C 75 30 36 34 31 5C 75 30 36 32 46 5C 75 30
36 33 31 22 2C 22 53 79 6D 62 6F 6C 49 73 69 6E 22 3A 22 49 52 6F 33 70 7A 41 5A 30 30 30
32 22 2C 22 44 61 74 65 22 3A 22 31 34 30 30 2F 31 32 2F 31 35 22 2C 22 54 69 6D 65 22 3A
22 30 38 3A 30 30 2D 31 32 3A 30 30 22 2C 22 4D 69 6E 50 72 69 63 65 22 3A 31 37 37 32 36
2C 22 4D 61 78 50 72 69 63 65 22 3A 32 31 36 36 36 2C 22 53 68 61 72 65 22 3A 31 30 30 30
2C 22 53 68 6F 77 22 3A 66 61 6C 73 65 2C 22 4F 70 65 72 61 74 69 6F 6E 22 3A 30 2C 22 49
64 22 3A 22 31 30 30 64 38 65 30 62 35 34 31 35 34 65 39 64 39 30 32 30 35 34 62 66 66 31
39 33 65 38 37 35 22 2C 22 43 72 65 61 74 65 44 61 74 65 54 69 6D 65 22 3A 22 32 30 32 32
2D 30 32 2D 32 36 54 30 39 3A 34 37 3A 32 30 2E 30 31 33 34 37 35 37 2B 30 33 3A 33 30 22
7D
The data is definitely not protobuf binary; byte 0 starts a group with field number 15; inside this group is:
field 4, string
field 13, fixed32
field 6, varint
field 12, fixed32
field 6, varint
after this (at byte 151), an end-group token is encountered with field number 6
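Where does the "field number 15" reading come from? A protobuf tag byte packs the field number in its upper bits and the wire type in its lowest three bits, so the first byte of the dump decodes like this (a minimal, purely illustrative sketch):
Dim firstByte As Integer = &H7B              ' byte 0 of the dump; also the ASCII code of "{"
Dim fieldNumber As Integer = firstByte >> 3  ' 15
Dim wireType As Integer = firstByte And 7    ' 3 = start-group
In other words, the opening brace of a JSON document happens to look like a start-group tag for field 15, which is exactly the reading described above.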
There are many striking things about this:
your schema doesn't use groups (in fact, the mere existence of groups is now hard to find in the docs), so ... none of this looks right
end-group tokens are always required to match the last start-group field number, which it doesn't
fields inside a single level are usually (although as a "should", not a "must") written in numerical order
you have no field 12 or 13 declared
your field 6 is of the wrong type - we expect fixed64 here, but got varint
So: there's no doubt about it: that data is ... not what you expect. It certainly isn't valid protobuf binary. Without knowing how that data is stored, all we can do is guess, but on a hunch: let's try decoding it as UTF8 and see what it looks like:
{"SymbolName":"\u0641\u062F\u0631","SymbolIsin":"IRo3pzAZ0002","Date":"1400/12/15","Time":"08:00-12:00","MinPrice":17726,"MaxPrice":21666,"Share":1000,"Show":false,"Operation":0,"Id":"100d8e0b54154e9d902054bff193e875","CreateDateTime":"2022-02-26T09:47:20.0134757+03:30"}
or (formatted)
{
    "SymbolName": "\u0641\u062F\u0631",
    "SymbolIsin": "IRo3pzAZ0002",
    "Date": "1400/12/15",
    "Time": "08:00-12:00",
    "MinPrice": 17726,
    "MaxPrice": 21666,
    "Share": 1000,
    "Show": false,
    "Operation": 0,
    "Id": "100d8e0b54154e9d902054bff193e875",
    "CreateDateTime": "2022-02-26T09:47:20.0134757+03:30"
}
Oops! You've written the data as JSON, and you're trying to decode it as binary protobuf. Decode it as JSON instead, and you should be fine. If this was written with the protobuf JSON API: decode it with the protobuf JSON API.
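For illustration, a minimal sketch of that parsing step, assuming the bytes really are UTF-8 JSON and that Google.Protobuf's JSON API (JsonParser) is acceptable on the consuming side; cr is the ConsumeResult from the consume loop in the question:
' Sketch only; requires Imports System.Text and Imports Google.Protobuf.
Dim json As String = Encoding.UTF8.GetString(cr.Message.Value)
Dim item As AdminIpoChange = JsonParser.Default.Parse(Of AdminIpoChange)(json)
If the producer serialized with a general-purpose JSON library rather than with JsonFormatter, deserialize with that same library (or any JSON deserializer targeting a matching DTO) instead.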

Reverse the order of the rows by chunks of n rows

Consider the following sequence:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
which produces:
A B C D
0 56 83 99 46
1 40 70 22 51
2 70 9 78 33
3 65 72 79 87
4 0 6 22 73
.. .. .. .. ..
95 35 76 62 97
96 86 85 50 65
97 15 79 82 62
98 21 20 19 32
99 21 0 51 89
I can reverse the sequence with the following command:
df.iloc[::-1]
That gives me the following result:
A B C D
99 21 0 51 89
98 21 20 19 32
97 15 79 82 62
96 86 85 50 65
95 35 76 62 97
.. .. .. .. ..
4 0 6 22 73
3 65 72 79 87
2 70 9 78 33
1 40 70 22 51
0 56 83 99 46
How would I rewrite the code if I wanted to reverse the sequence every nth row, e.g. every 4th row?
IIUC, you want to reverse by chunk (3, 2, 1, 0, 7, 6, 5, 4, …):
One option is to use groupby with a custom group:
N = 4
group = df.index//N
# if the index is not a linear range
# import numpy as np
# np.arange(len(df))//N
df.groupby(group).apply(lambda d: d.iloc[::-1]).droplevel(0)
output:
A B C D
3 45 33 73 77
2 91 34 19 68
1 12 25 55 19
0 65 48 17 4
7 99 99 95 9
.. .. .. .. ..
92 89 68 48 67
99 99 28 52 87
98 47 49 21 8
97 80 18 92 5
96 49 12 24 40
[100 rows x 4 columns]
A very fast method, based only on indexing, is to use numpy to generate the list of indices reversed by chunk (this assumes the length of the DataFrame is a multiple of N):
import numpy as np
N = 4
idx = np.arange(len(df)).reshape(-1, N)[:, ::-1].ravel()
# array([ 3, 2, 1, 0, 7, 6, 5, 4, 11, ...])
# slice using iloc
df.iloc[idx]

Prevent Envoy from modifying the sharding key

We use a two-layer Envoy setup.
[front-end] -> E -> [middleware] -> E -> [backend]
Middleware is supposed to take the sharding key from the HTTP metadata and re-transmit it when talking to the backend.
What we have noticed is that Envoy modifies the HTTP header, which is crashing our service inside gRPC.
E1016 11:19:45.808599731 19 call.cc:912] validate_metadata: {"created":"#1602847185.808584663","description":"Illegal header value","file":"external/com_github_grpc_grpc/src/core/lib/surface/validate_metadata.cc","file_line":44,"offset":56,"raw_bytes":"36 37 36 38 33 61 34 34 36 35 36 35 37 30 34 33 36 66 36 34 36 35 34 31 34 39 33 61 36 35 36 33 36 63 36 39 37 30 37 33 36 35 32 64 37 30 36 63 37 35 36 37 36 39 36 65 a5 '67683a44656570436f646541493a65636c697073652d706c7567696e.'\u0000"}
E1016 11:19:45.808619606 19 call_op_set.h:947] assertion failed: false
Any way to avoid this?
Update:
This seems to happen only with x- headers.
The problem was actually not related to Envoy in the end; it turns out that gRPC strings are not null-terminated.

Repeating the format specifiers in awk

I am trying to format the output of AWK's printf() function. More precisely, I am trying to print a matrix with very long rows, and I would like to wrap them and continue on the next line. What I am trying to do is best illustrated with Fortran. Consider the following Fortran statement:
write(*,'(10I5)')(i,i=1,100)
The output would be the integers in the range 1:100 printed in rows of 10 elements.
Is it possible to do the same in AWK? I could do it by offsetting the index and printing a newline with "\n", but the question is whether it can be done in as elegant a manner as in Fortran.
As suggested in the comments, I would like to explain my Fortran code, given as an example above.
(i,i=1,100)        ! => an implied do loop going from 1 to 100
write(*,'(10I5)')  ! => a formatted write statement
10I5 says: print 10 integers, allocating a 5-character slot for each.
The trick is that once the 10 x 5 character slots given by the formatted write are used up, output continues on the next line, so no trailing "\n" is needed.
This may help you:
[akshay#localhost tmp]$ cat test.for
      implicit none
      integer i
      write(*,'(10I5)')(i,i=1,100)
      end
[akshay#localhost tmp]$ gfortran test.for
[akshay#localhost tmp]$ ./a.out
    1    2    3    4    5    6    7    8    9   10
   11   12   13   14   15   16   17   18   19   20
   21   22   23   24   25   26   27   28   29   30
   31   32   33   34   35   36   37   38   39   40
   41   42   43   44   45   46   47   48   49   50
   51   52   53   54   55   56   57   58   59   60
   61   62   63   64   65   66   67   68   69   70
   71   72   73   74   75   76   77   78   79   80
   81   82   83   84   85   86   87   88   89   90
   91   92   93   94   95   96   97   98   99  100
[akshay#localhost tmp]$ awk 'BEGIN{for(i=1;i<=100;i++)printf("%5d%s",i,i%10?"":"\n")}'
    1    2    3    4    5    6    7    8    9   10
   11   12   13   14   15   16   17   18   19   20
   21   22   23   24   25   26   27   28   29   30
   31   32   33   34   35   36   37   38   39   40
   41   42   43   44   45   46   47   48   49   50
   51   52   53   54   55   56   57   58   59   60
   61   62   63   64   65   66   67   68   69   70
   71   72   73   74   75   76   77   78   79   80
   81   82   83   84   85   86   87   88   89   90
   91   92   93   94   95   96   97   98   99  100

Strange encoding in PDF property fields

I have a question about how the document properties (Title, Author, etc.) are stored in a PDF file. It looks like UTF-16 in big-endian byte order.
So "MyName" will be encoded as:
FE FF 00 4D 00 79 00 4E 00 61 00 6D 00 65
However, I ran into the character "-", which should have the value FF 0D, but in its place I find these hex numbers: FF 5C 72.
So "My-Name" looks like this:
FE FF 00 4D 00 79 FF 5C 72 00 4E 00 61 00 6D 00 65
Does anybody know why FF 5C 72 is used here? Why three bytes, when everything else is UTF-16? And why these values?
You are not interpreting correctly what you see:
FE FF is the start of a sequence.
00 is a null byte.
4D in your case most likely translates to M.
79 in your case most likely translates to y.
4E in your case most likely translates to N.
61 in your case most likely translates to a.
6D in your case most likely translates to m.
65 in your case most likely translates to e.
Compare this to the output of my simple ascii command line tool, which prints a list of all ASCII aliases as a table with their hex and dec encodings:
$ ascii -h
Usage: ascii [-dxohv] [-t] [char-alias...]
-t = one-line output -d = Decimal table -o = octal table -x = hex table
-h = This help screen -v = version information
Prints all aliases of an ASCII character. Args may be chars, C \-escapes,
English names, ^-escapes, ASCII mnemonics, or numerics in decimal/octal/hex.
Dec Hex      Dec Hex      Dec Hex    Dec Hex    Dec Hex    Dec Hex    Dec Hex    Dec Hex
  0 00 NUL    16 10 DLE    32 20      48 30 0    64 40 @    80 50 P    96 60 `   112 70 p
  1 01 SOH    17 11 DC1    33 21 !    49 31 1    65 41 A    81 51 Q    97 61 a   113 71 q
  2 02 STX    18 12 DC2    34 22 "    50 32 2    66 42 B    82 52 R    98 62 b   114 72 r
  3 03 ETX    19 13 DC3    35 23 #    51 33 3    67 43 C    83 53 S    99 63 c   115 73 s
  4 04 EOT    20 14 DC4    36 24 $    52 34 4    68 44 D    84 54 T   100 64 d   116 74 t
  5 05 ENQ    21 15 NAK    37 25 %    53 35 5    69 45 E    85 55 U   101 65 e   117 75 u
  6 06 ACK    22 16 SYN    38 26 &    54 36 6    70 46 F    86 56 V   102 66 f   118 76 v
  7 07 BEL    23 17 ETB    39 27 '    55 37 7    71 47 G    87 57 W   103 67 g   119 77 w
  8 08 BS     24 18 CAN    40 28 (    56 38 8    72 48 H    88 58 X   104 68 h   120 78 x
  9 09 HT     25 19 EM     41 29 )    57 39 9    73 49 I    89 59 Y   105 69 i   121 79 y
 10 0A LF     26 1A SUB    42 2A *    58 3A :    74 4A J    90 5A Z   106 6A j   122 7A z
 11 0B VT     27 1B ESC    43 2B +    59 3B ;    75 4B K    91 5B [   107 6B k   123 7B {
 12 0C FF     28 1C FS     44 2C ,    60 3C <    76 4C L    92 5C \   108 6C l   124 7C |
 13 0D CR     29 1D GS     45 2D -    61 3D =    77 4D M    93 5D ]   109 6D m   125 7D }
 14 0E SO     30 1E RS     46 2E .    62 3E >    78 4E N    94 5E ^   110 6E n   126 7E ~
 15 0F SI     31 1F US     47 2F /    63 3F ?    79 4F O    95 5F _   111 6F o   127 7F DEL
Oh, surprise!
This table matches my "assumptions" from above perfectly. So you can safely reconsider your own assumptions about "UTF-16 in big-endian byte order".
And what does that mean for your hex numbers in question, FF 5C 72?!?
Well, look it up: FF you can skip, and 5C 72 is \r... Which means? (Answer left as an exercise for the reader.)