What does tilde mean in julia? - dataframe

What does tilde (~) mean in julia? For example here:
using DataFrames
dt=readtable("timelog 54660.csv", separator=';')
dt[~[(x in [:Date, :Topic, :Hours]) for x in names(dt)]]
I can't find relevant info in the Julia docs. Searching for tilde or "~" seems to give nothing.

You can use help mode in the REPL by typing a question mark.
Entering ?~ gives the following output:
help?> ~
search: ~
~(x)
Bitwise not.
julia> ~4
-5
julia> ~10
-11
julia> ~true
false
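The help output covers scalars only, so here is a minimal sketch of what the expression in the question appears to do, written in Python rather than Julia. It assumes that `~` applied to the Bool array in the question inverts it element by element, so the indexing keeps the columns that are not in the list. The column names `Notes` and `Client` are made up for illustration.

```python
# Integer bitwise NOT follows two's complement: ~x == -x - 1,
# which is why ~4 is -5 and ~10 is -11 in the help output.
print(~4, ~10)

# Hypothetical column names standing in for names(dt).
names = ["Date", "Topic", "Hours", "Notes", "Client"]
listed = {"Date", "Topic", "Hours"}

# Same shape as [(x in [:Date, :Topic, :Hours]) for x in names(dt)].
mask = [name in listed for name in names]

# Assumed effect of ~ on that Bool array: flip every element.
# (Python's own ~True is -2, so `not` is used here instead.)
inverted = [not flag for flag in mask]

# Indexing with the inverted mask keeps the columns NOT in the list.
remaining = [name for name, flag in zip(names, inverted) if flag]
print(remaining)
```

As far as I know, current Julia versions no longer apply `~` to a whole array implicitly, and the elementwise form would be written with a dot (`.~` or `.!`); treat that as a pointer to check rather than a guarantee.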


Return lines with at least n consecutive occurrences of the pattern in bash [duplicate]

This might be a naive question, but I can't find an answer.
Given a text file, I'd like to find lines with at least a defined number of consecutive occurrences of a certain pattern, say, AT[GA]CT.
For example, for n=2, from the file:
ATGCTTTGA
TAGATGCTATACTTGA
TAGATGCTGTATACTTGA
Only the second line should be returned.
I know how to use grep/awk to search for at least one instance of this degenerate pattern, and for some defined number of pattern instances occurring non-consecutively. But the issue is the pattern occurrences MUST be consecutive, and I can't figure out how to achieve that.
Any help appreciated, thank you very much in advance!
I would use GNU AWK for this task in the following way. Let the content of file.txt be
ATGCTTTGA
TAGATGCTATACTTGA
TAGATGCTGTATACTTGA
then
awk 'BEGIN{p="AT[GA]CT";n=2;for(i=1;i<=n;i+=1){pat=pat p}}$0~pat' file.txt
output
TAGATGCTATACTTGA
Explanation: I use a for loop to repeat p n times, then filter lines by checking whether the line ($0) matches the pattern built earlier.
Alternatively you might use string formatting function sprintf as follows:
awk 'BEGIN{n=2}$0~sprintf("(AT[GA]CT){%s}",n)' file.txt
Explanation: I used the sprintf function; %s in the first argument marks where to put n. If you want to know more about what can be used in the first argument of printf and sprintf, read Format Modifiers.
(both solutions tested in GNU Awk 5.0.1)
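For comparison, the same idea (wrap the pattern in a group and repeat it n times with an interval quantifier) can be sketched in Python. This is only an illustration, not part of the awk answer; the function name `lines_with_consecutive` is made up for this example.

```python
import re


def lines_with_consecutive(lines, unit, n):
    """Return the lines that contain at least n back-to-back copies of unit."""
    # (?:unit){n} requires n copies with nothing in between them.
    pattern = re.compile("(?:%s){%d}" % (unit, n))
    return [line for line in lines if pattern.search(line)]


sample = ["ATGCTTTGA", "TAGATGCTATACTTGA", "TAGATGCTGTATACTTGA"]
result = lines_with_consecutive(sample, "AT[GA]CT", 2)
print(result)
```

Because the pattern is searched for anywhere in the line, a line with more than n consecutive copies also matches, which is what "at least n" asks for.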

Search for multiple question marks in pandas

I want to search for repeated punctuation marks in my dataset with pandas. For example, when I search for multiple exclamation points I use this script, which works:
df_double=df[df["text"].str.contains("!!")==True]
df_double
But when I want to change this script to search for multiple question marks, I get an error:
df_double=df[df["text"].str.contains("??")==True]
df_double
What is wrong with this script?
Escape ? with a backslash, because it is a special regex character, and use {2} to require two of them in a row (a raw string keeps the backslash intact):
df1 = df[df["text"].str.contains(r"\?{2}", na=False)]
Or:
df1 = df[df["text"].str.contains(r"\?\?", na=False)]
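To see why "!!" works and "??" does not, here is a small sketch using only Python's standard `re` module (pandas treats the `str.contains` pattern as a regular expression by default). The helper `is_valid_regex` and the sample strings are made up for illustration.

```python
import re


def is_valid_regex(pattern):
    """Return True if the pattern compiles as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# "!" has no special meaning in a regex, but "?" is a quantifier and
# needs something before it to apply to.
print(is_valid_regex("!!"), is_valid_regex("??"))

# re.escape builds the escaped form without writing the backslashes by hand.
escaped = re.escape("??")

texts = ["Really??", "What?", "Done!!"]
matches = [t for t in texts if re.search(escaped, t)]
print(matches)
```

If no regex features are needed at all, `str.contains` also accepts `regex=False`, which makes it treat the pattern as a literal string.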

Fractions for variables names in Julia

In julia you can write subscripts by \_ for variable names. I was wondering if there is anything similar for writing fractions in variable names. Something like \frac{}{} in LaTeX. I understand this may be harder as it takes two arguments. If there is none, I will use /. But in this case I would like to use some enclosures to make clear what is being differentiated. I assume () is not usable? [] or {} would be ok?
The subscripts and other non-Latin names you see in Julia code are just ordinary Unicode characters, the same as "regular" names. The LaTeX commands are only a feature of the Julia REPL that helps you remember and enter them.
As for Unicode, in principle you can represent some simple fractions like ⁽²⁺ⁱ⁾⁄₍ₛ₊ₜ₎, using the ⁄ (U+2044 Fraction slash) symbol together with subscripts and superscripts. The rendering depends on your font, but do not expect a vertical layout in any current font.
However, Julia rejects ⁄ (U+2044 Fraction slash, not the / on your keyboard) as an "invalid character" during parsing. The same applies to \not, which can only be used in combination with certain operators, so it is not an option either.
As for the brackets and the normal /, they are operators and are parsed differently. However, there is an (ugly) way to circumvent this: you can use macros to bypass the parsing and use strings as variable names. For example:
julia> macro n_str(name)
           esc(Symbol(name))
       end
@n_str (macro with 1 method)
julia> n"∂(2x + 3)/∂x" = 2
2
julia> 2n"∂(2x + 3)/∂x"
4

What is the regex for these cases?

What is the regex for these cases:
29000.12345678900, expected result 29000.123456789
29000.000, expected result 29000
29000.00003400, expected result 29000.000034
In short, I want to remove the trailing zeros after the decimal point when no digit 1-9 follows them, and I also want to remove the dot (.) if the number is actually an integer.
I use this regex
(?:.0*$|0*$)
but it gives me this result:
29123.6 from 29123.6400; the 4 is gone.
When I tested the regex separately, it works perfectly,
.0*$ gives me 29123 from 29123.0000
0*$ gives me 29123.6423 from 29123.642300
Am I missing something with the combined regex?
If you think regex is the best way of doing it, you can just use something like this:
\.?0+$
It works for both cases:
> '12300000.000001130000000'.replace(/\.?0+$/g, '')
"12300000.00000113"
> '12300000.000000000000'.replace(/\.?0+$/g, '')
"12300000"
You can use this regex
^\d+(\.\d*[1-9])?
^   -------------
|         |-> this part matches only if the digits after the . end with [1-9]
|
|-> ^ marks the start of the string; it is needed for the pattern to match
That solves your problem.
You simply want this:
^\d*(\.?\d*[1-9])?
^\d* means zero or more digits before the first group.
The () delimits a matching group.
\.? means a single dot (.) may be present but is optional, e.g. (.)
\d* means zero or more digits, e.g. (1234)
\.?\d* means an optional dot followed by zero or more digits, e.g. (.123)
[1-9] matches a single digit from 1 to 9, excluding 0, e.g. (4)
I don't know whether Objective-C supports something like the following construct, but in Python you can do it completely without regular expressions using str.rstrip():
In [1]: def shorten_number(number):
   ...:     return number.rstrip('0').rstrip('.')
In [2]: shorten_number('29000.12345678900')
Out[2]: '29000.123456789'
In [3]: shorten_number('29000.000')
Out[3]: '29000'
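None of the answers above spells out why the combined pattern in the question drops the 4. My reading is that the dot in `.0*$` is unescaped, so it matches any character, including the last non-zero digit. The sketch below reproduces that with Python's `re.sub` and compares it with the `\.?0+$` pattern suggested earlier; the helper name `strip_with` is made up for this example.

```python
import re

BROKEN = r"(?:.0*$|0*$)"  # pattern from the question: "." matches ANY character
FIXED = r"\.?0+$"         # escaped dot, optional, followed by one or more zeros


def strip_with(pattern, text):
    """Remove whatever the pattern matches from the text."""
    return re.sub(pattern, "", text)


# The unescaped dot swallows the "4" together with the trailing zeros.
print(strip_with(BROKEN, "29123.6400"))

# With the dot escaped, only the trailing zeros (and a bare dot) are removed.
for value in ["29123.6400", "29000.12345678900", "29000.000", "29000.00003400"]:
    print(value, "->", strip_with(FIXED, value))
```

One caveat: both `\.?0+$` and the `rstrip` approach assume the input contains a decimal point. A plain integer string such as "29000" would lose its trailing zeros as well, so guard for that case if it can occur.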

printing floating point numbers in D

It's been quite a while since I last used D Programming Language, and now I'm using it for some project that involves scientific calculations.
I have a bunch of floating point data, but when I print them using writefln, I get results like: 4.62593E-172 which is a zero! How do I use string formatting % stuff to print such things as 0?
Right now I'm using a hack:
if( abs(a) < 0.0000001 )
    writefln(0);
else
    writefln(a);
it does the job, but I want to do it using the formatting operations, if possible.
UPDATE
someone suggested writefln("%.3f", a) but the problem with it is that it prints needless extra zeros, i.e. 0 becomes 0.000 and 1.2 becomes 1.200
Can I make it also remove the trailing zeros?
Short answer: This can't be done with printf format specifiers.
Since D uses the same formatting as C99's vsprintf(), you find your answer in this thread: Avoid trailing zeroes in printf()
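One common workaround is to format with a fixed precision first and then trim the result as a string. The sketch below shows the idea in Python rather than D, purely as an illustration; the function name `format_trimmed` and the default of three decimals are assumptions, and a D version would need the equivalent string operations.

```python
def format_trimmed(value, decimals=3):
    """Format with fixed precision, then drop trailing zeros and a bare dot."""
    text = "%.*f" % (decimals, value)
    if "." in text:
        # "1.200" -> "1.2", "0.000" -> "0." -> "0"
        text = text.rstrip("0").rstrip(".")
    # A tiny negative value rounds to "-0"; report it as plain zero.
    return "0" if text == "-0" else text


for number in [4.62593e-172, 1.2, 0.0, 12.3456, 100.0]:
    print(format_trimmed(number))
```

The fixed-precision step is what turns values like 4.62593E-172 into zero, so the separate `abs(a) < 0.0000001` check is no longer needed; the chosen number of decimals sets the threshold.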
Try something like
writefln("%.3f", a);
Federico's answer should work. For more information, check the format specifiers section.
I see you are currently using Phobos; however, what you are trying to do is supported in Tango.
Stdout.formatln("{:f2}", 1.2);
will print "1.20"