Kotlin query: Replace all the words in the string starting and ending with $ e.g. $lorem$ to <i>lorem</i> - kotlin

I am stuck on the following code challenge in Kotlin:
Replace all the words in the string starting and ending with $ e.g. $lorem$ to <i>lorem</i>
var incomingString = "abc 123 $Lorem$, $ipsum$, $xyz$ 547"
// My non-working code:
fun main(args: Array<String>) {
    val incomingString = "abc 123 \$Lorem$, \$ipsum$, \$xyz$ 547";
    var finalString = "";
    println(filteredValue)
    if (incomingString.contains("$")){
        val intermediateString = incomingString.replace("\$", "<i>")
        finalString = "$intermediateString</i>"
    }
    println(finalString)
}
Output is:
abc 123 <i>Lorem<i>, <i>ipsum<i>, <i>xyz<i> 547</i>
Desired output:
abc 123 <i>Lorem</i>, <i>ipsum</i>, <i>xyz</i> 547

I am not going to do your homework for you, but the reason the challenge uses $ is that the symbol has two special purposes:
String interpolation in Kotlin. Read up about it here: https://kotlinlang.org/docs/idioms.html#string-interpolation ... you need to prevent the $ from being treated as interpolation, and it seems you already have that part.
$ is also a special character in regular expressions. Regular expressions are an esoteric area of programming: complicated to get your head around, but very powerful and worth the effort. A regular expression (Regex) approach will earn you the most marks here, provided you also escape the $. Here are the docs for Kotlin's Regex replace function:
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/-regex/replace.html
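For anyone landing here later, the regex approach described above could look roughly like the sketch below. The function name italicize is made up for illustration, and it assumes the text between two $ signs never contains a $ itself. It uses the lambda overload of Regex.replace so that the $ in the replacement does not have to be escaped a second time.

```kotlin
// Hypothetical helper, not from the original post.
// In a raw string the backslash is literal, so the regex engine sees \$ (an escaped dollar).
// ([^$]+) captures one or more non-dollar characters between the two dollars.
private val dollarWord = Regex("""\$([^$]+)\$""")

fun italicize(input: String): String =
    dollarWord.replace(input) { match ->
        // groupValues[1] is the text captured between the dollar signs
        "<i>" + match.groupValues[1] + "</i>"
    }
```

Calling italicize("abc 123 \$Lorem$, \$ipsum$, \$xyz$ 547") should give the desired output from the question. A lone, unpaired $ is left untouched by this pattern.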

Every time you come across a $ in your input string, you need to alternate between replacing it with <i> and replacing it with </i>. Therefore, you need to have a variable that tells you what state you're in, and every time you make a replacement, you flip that variable. You may find Kotlin's String.replaceFirst method useful.
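A minimal sketch of that state-flipping idea, walking the string character by character rather than calling replaceFirst repeatedly (that swap is my choice, not part of the suggestion above). The name italicizeByToggling is hypothetical.

```kotlin
// Hypothetical helper: alternates between <i> and </i> on every '$' encountered.
fun italicizeByToggling(input: String): String {
    val out = StringBuilder()
    var insideTag = false // false: the next '$' opens a tag; true: it closes one
    for (ch in input) {
        if (ch == '$') {
            out.append(if (insideTag) "</i>" else "<i>")
            insideTag = !insideTag // flip the state after each replacement
        } else {
            out.append(ch)
        }
    }
    return out.toString()
}
```

Note that an odd number of $ signs leaves the last <i> unclosed, so real input may need a check on insideTag after the loop.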

Related

Escape hex like \u... in kotlin strings

I have a string "\ufffd\ufffd hello\n" and code like this:
fun main() {
    val bs = "\ufffd\ufffd hello\n"
    println(bs) // �� hello
}
I want to see "\ufffd\ufffd hello" printed instead. How can I escape \u for every hex value?
UPD:
val s = """\uffcd"""
val unicodeRegex = """(?<!\\\\)(\\\\\\\\)*(\\u)([A-Fa-f\\d]{4})""".toRegex()
return s.replace(unicodeRegex, """$1\\\\u$3""")
(I'm interpreting the question as asking how to clearly display a string that contains non-printable characters.  The Kotlin compiler converts sequences of a \u followed by 4 hex digits in string literals into single characters, so the question is effectively asking how to convert them back again.)
Unfortunately, there's no built-in way of doing this. It's fairly easy to write one, but it's a bit subjective, as there's no single definition of what's ‘printable’…
Here's an extension function that probably does roughly what you want:
fun String.printable() = map {
    when (Character.getType(it).toByte()) {
        Character.CONTROL, Character.FORMAT, Character.PRIVATE_USE,
        Character.SURROGATE, Character.UNASSIGNED, Character.OTHER_SYMBOL
            -> "\\u%04x".format(it.code) // it.code replaces the deprecated it.toInt() (Kotlin 1.5+)
        else -> it.toString()
    }
}.joinToString("")
println("\ufffd\ufffd hello\n".printable()) // prints ‘\ufffd\ufffd hello\u000a’
The sample string in the question is a bad example, because \uFFFD is the replacement character — a black diamond with a question mark, usually shown in place of any non-displayable characters.  So the replacement character itself is displayable!
The code above treats it as non-displayable by listing the Character.OTHER_SYMBOL type, but that will also escape many other symbols. So you'll probably want to remove it, leaving just the other 5 types. (I got those from this answer.)
Because the trailing newline is non-displayable, that gets converted to a hex code too.  You could extend the code to handle the escape codes \t, \b, \n, \r and maybe \\ too if needed.  (You could also make it more efficient… this was done for brevity!)
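As a rough sketch of that extension, the version below maps the common escapes back to their two-character form first and falls back to a hex code for the remaining non-displayable types. The name printableWithEscapes is made up, Character.OTHER_SYMBOL is left out as suggested above, and Char.code assumes Kotlin 1.5 or later.

```kotlin
// Hypothetical variant of printable(): named escapes first, \uXXXX for everything else
// that falls into one of the five non-displayable categories.
fun String.printableWithEscapes(): String = buildString {
    for (ch in this@printableWithEscapes) {
        when (ch) {
            '\t' -> append("\\t")
            '\b' -> append("\\b")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\\' -> append("\\\\")
            else -> when (Character.getType(ch).toByte()) {
                Character.CONTROL, Character.FORMAT, Character.PRIVATE_USE,
                Character.SURROGATE, Character.UNASSIGNED
                    -> append("\\u%04x".format(ch.code))
                else -> append(ch)
            }
        }
    }
}
```

Using buildString also avoids the intermediate list that map plus joinToString creates, which covers part of the efficiency remark.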
Simply escape the \ in your strings by adding another backslash in front of it:
val bs = "\\ufffd\\ufffd hello\n"
You can also use raw strings with """ so you don't have to escape the backslashes (which is useful for regex):
val bs = """\ufffd\ufffd hello\n"""
Note that in that case the \n would also NOT be counted as an LF character, and will be literally printed as the 2 characters "\n".
You can add literal line breaks in your raw string if you want an actual line feed, though:
val bs = """\ufffd\ufffd hello
"""

Perl6 regex not matching end $ character with filenames

I've been trying to learn Perl6 from Perl5, but the issue is that the regex works differently, and it isn't working properly.
I am making a test case to list all files in a directory ending in ".p6$"
This code works with the end character
if 'read.p6' ~~ /read\.p6$/ {
say "'read.p6' contains 'p6'";
}
However, if I try to fit this into a subroutine:
multi list_files_regex (Str $regex) {
    my @files = dir;
    for @files -> $file {
        if $file.path ~~ /$regex/ {
            say $file.path;
        }
    }
}
it no longer works. I don't think the issue is with the regex but with the file name; there may be some attribute I'm not aware of.
How can I get the file name to match the regex in Perl6?
Regexes are a first-class language within Perl 6, rather than simply strings, and what you're seeing here is a result of that.
The form /$foo/ in Perl 6 regex will search for the string value in $foo, so it will be looking, literally, for the characters read\.p6$ (that is, with the dot and dollar sign).
Depending on the situation of the calling code, there are a couple of options:
If you really are receiving regexes as strings, for example read as input or from a file, then use $file.path ~~ /<$regex>/. This means it will treat what's in $regex as regex syntax.
If you will just be passing a range of different regexes in, change the parameter to be of type Regex, and then do $file.path ~~ $regex. In this case, you'd pass them like list_files_regex(/foo/).
Last but not least, dir takes a test parameter, and so you can instead write:
for dir(test => /<$regex>/) -> $file {
say $file.path;
}

Password regex not working in Kotlin

I am trying to run the below code to validate a password string against my regex. But it's always returning false. What am I doing wrong?
fun main(args: Array<String>) {
    val PASSWORD_REGEX = """^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%!\-_?&])(?=\\S+$).{8,}""".toRegex()
    val password:String = "Align#123"
    println(PASSWORD_REGEX.matches(password))
}
You are using raw strings but are escaping the last \S which is causing a literal match of \S. If I remove the extra backslash, your test case works for me. And as others have stated, you might be able to remove that stanza entirely.
So this...
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%!\-_?&])(?=\\S+$).{8,}
                                                       ^
                                                       |
                                               Remove -+
Becomes this
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%!\-_?&])(?=\S+$).{8,}
I used Regex101 to help me, which seems to be a nice way to turn regex into English.
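To make the difference concrete, here is a small sketch that puts both patterns side by side. The names passwordRegex and brokenRegex are mine, and the special-character class assumes that @ is meant to be among the allowed symbols.

```kotlin
// Corrected pattern: in a raw string, \S reaches the regex engine as the
// "non-whitespace" class, so (?=\S+$) asserts that the password contains no whitespace.
val passwordRegex =
    """^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%!\-_?&])(?=\S+$).{8,}""".toRegex()

// Pattern from the question: \\S in a raw string is an escaped backslash followed by
// a literal S, so the lookahead demands a backslash at the start of the input.
val brokenRegex =
    """^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%!\-_?&])(?=\\S+$).{8,}""".toRegex()
```

With these definitions, passwordRegex.matches("Align#123") is true while brokenRegex.matches("Align#123") is false, which is the behaviour reported in the question.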

Split a BibTeX author field into parts

I am trying to parse a BibTeX author field using the following grammar:
use v6;
use Grammar::Tracer;
# Extract BibTeX author parts from string. The parts are separated
# by a comma and optional space around the comma
grammar Author {
    token TOP {
        <all-text>
    }
    token all-text {
        [<author-part> [[\s* ',' \s*] || [\s* $]]]+
    }
    token author-part {
        [<-[\s,]> || [\s* <!before ','>]]+
    }
}
my $str = "Rockhold, Mark L";
my $result = Author.parse( $str );
say $result;
Output:
TOP
| all-text
| | author-part
| | * MATCH "Rockhold"
| | author-part
But here the program hangs (I have to press CTRL-C) to abort.
I suspect the problem is related to the negative lookahead assertion. I tried to remove it, and then the program does not hang anymore, but then I am also not able to extract the last part "Mark L" with an internal space.
Note that for debugging purposes, the Author grammar above is a simplified version of the one used in my actual program.
The expression [\s* <!before ','>] may not make any progress. Since it's in a quantifier, it will be retried again and again (but not move forward), resulting in the hang observed.
Such a construct will reliably hang at the end of the string; doing [\s* <!before ',' || $>] fixes it by making the lookahead fail at the end of the string also (being at the end of the string is a valid way to not be before a ,).
At least for this simple example, it looks like the whole author-part token could just be <-[,]>+, but perhaps that's an oversimplification for the real problem that this was reduced from.
Glancing at all-text, I'd also point out the % quantifier modifier which makes matching comma-separated (or anything-separated, really) things easier.

Does mIRC Scripting have an escape character?

I'm trying to write a simple multi-line Alias that says several predefined strings of characters in mIRC. The problem is that the strings can contain:
{
}
|
which are all used in the scripting language to group sections of code/commands. So I was wondering if there was an escape character I could use.
In lack of that, is there a method, or alternative way to be able to "say" multiple lines of these strings, so that this:
alias test1 {
/msg # samplestring}contains_chars|
/msg # _that|break_continuity}{
}
Outputs this on typing /test1 on a channel:
<MyName> samplestring}contains_chars|
<MyName> _that|break_continuity}{
It doesn't have to use the /msg command specifically, either, as long as the output is the same.
So basically:
1. Is there an escape character of sorts I can use to differentiate code from a string in mIRC scripting?
2. Is there a way to tell a script to evaluate all characters in a string as a literal? Think " " quotes in languages like Java.
3. Is the above even possible using only mIRC scripting?
"In lack of that, is there a method, or alternative way to be able to "say" multiple lines of these strings, so that this:..."
I think you have to use msg # every time you want to message a channel. Alternatively, you can use the /say command to message the active window.
Regarding the other 3 questions:
1. Yes, for example you can use $chr(123) instead of a {, $chr(125) instead of a } and $chr(124) instead of a | (pipe). For a full list of numbers you can go to http://www.atwebresults.com/ascii-codes.php?type=2. The code for a dot is 46, so $chr(46) will represent a dot.
2. I don't think there is any 'simple' way to do this. To print identifiers as plain text you have to add a ! after the $. For example, '$!time' will return the plain text '$time', whereas $time will return the actual value of $time.
3. Yes.