Replace values in a file conditional on their value - awk

I have a file full of numbers that fall in the ranges 10.00-10.66, 20.67-21.33, 30.67-31.33 and 41.34-42.00.
Example input:
10.21 21.12 10.50 30.80
30.91 31.12 31.00 10.30
21.21 20.99 20.90 31.20
41.71 41.72 10.10 41.80
I want to convert the file such that:
10.00-10.20 = 0|0:[DOSE]
10.21-10.66 = .|.:[DOSE]
20.90-21.10 = 1|0:[DOSE]
20.67-20.89 = .|.:[DOSE]
21.11-21.33 = .|.:[DOSE]
30.90-31.10 = 0|1:[DOSE]
30.67-30.89 = .|.:[DOSE]
31.11-31.33 = .|.:[DOSE]
41.80-42.00 = 1|1:[DOSE]
41.34-41.79 = .|.:[DOSE]
Example output:
.|.:10.21 .|.:21.12 .|.:10.50 .|.:30.80
0|1:30.91 .|.:31.12 0|1:31.00 .|.:10.30
.|.:21.21 1|0:20.99 1|0:20.90 .|.:31.20
.|.:41.71 .|.:41.72 0|0:10.10 1|1:41.80
I can think of a way to do this in R, but the actual file is roughly 1000*5000000 elements in size, and I don't think R can cope!
Is there a way to replace every element in a file depending on its value, using a stream editor like sed or awk? Alternative programs are welcome!

A simple way to do this in awk would be like this:
{
    for (i = 1; i <= NF; ++i) {
        if ($i >= 10 && $i <= 10.2) $i = "0|0:" $i
        else if ($i >= 10.21 && $i <= 10.66) $i = ".|.:" $i
        # etc.
    }
    print
}
That is, loop through each field of each record and prepend the string you want, depending on the value of the field. You can put the script in a file and run it with awk -f script.awk input_file.
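For comparison, here is a minimal Python sketch of the same per-field mapping. The names CONFIDENT_RANGES, tag and convert_line are made up for illustration, and the sketch assumes that every value outside the four "confident" ranges listed in the question should get the .|. prefix. Decimal is used so that boundary values such as 10.20 compare exactly.

```python
from decimal import Decimal

# (low, high, prefix) taken from the question; hypothetical name.
# Assumption: anything outside these ranges is tagged ".|.".
CONFIDENT_RANGES = [
    (Decimal("10.00"), Decimal("10.20"), "0|0"),
    (Decimal("20.90"), Decimal("21.10"), "1|0"),
    (Decimal("30.90"), Decimal("31.10"), "0|1"),
    (Decimal("41.80"), Decimal("42.00"), "1|1"),
]


def tag(field):
    """Return the field with its genotype-style prefix prepended."""
    value = Decimal(field)
    for low, high, prefix in CONFIDENT_RANGES:
        if low <= value <= high:
            return f"{prefix}:{field}"
    return f".|.:{field}"


def convert_line(line):
    """Tag every whitespace-separated field of one input line."""
    return " ".join(tag(field) for field in line.split())
```

Reading the input one line at a time and printing convert_line(line) keeps memory use flat, although for a file of this size awk is likely to be faster.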


Nextflow adding def function in to script

I get an error like .command.sh: line 2: syntax error near unexpected token `(' when running this process:
/*
* Step 3
*/
chr_length = file(params.chr_length)
process create_bedgraph_and_bigwig {
publishDir "${params.outdir}/bedgraphandbigwig", mode: 'copy'
input:
set val(sample_id), file(vector_log) from vector_log_ch
set val(sample_id), file(target_query_bam) from target_query_bam_ch
file chr_length
output:
set val(sample_id), file("${sample_id}.bedgraph.log.txt") into bed_log_ch
set val(sample_id), file("${sample_id}.bed") into bed_ch
set val(sample_id), file("${sample_id}.clean.bed") into clean_bed_ch
set val(sample_id), file("${sample_id}.fragments.bed") into fragments_bed_ch
set val(sample_id), file("${sample_id}.sorted.fragments.bed") into sorted_fragments_bed_ch
shell:
'''
def fp = file(${vector_log})
def lines = fp.readLines()
def line3 = lines[3].split(' ')[4].toInteger()
def line4 = lines[4].split(' ')[4].toInteger()
def aln_sum = (10000/(line3 + line4)).toString()
bedtools bamtobed -bedpe -i !{target_query_bam} > !{sample_id}.bed 2>!{sample_id}.bedgraph.log.txt
awk '$1==$4 && $6-$2 < 1000 {{print $0}}' !{sample_id}.bed > !{sample_id}.clean.bed 2>!{sample_id}.bedgraph.log.txt
cut -f 1,2,6 !{sample_id}.clean.bed > !{sample_id}.fragments.bed 2>!{sample_id}.bedgraph.log.txt
sort -k 1,1 !{sample_id}.fragments.bed > !{sample_id}.sorted.fragments.bed
'''
}
The simple answer is to avoid using 'def' if the variable needs to be used in a shell definition or template. I couldn't actually find this after a quick search of the documentation, but I did find this note from the author:
Using groovy native string interpolation that would work, but when using the !{..} syntax scripts variable cannot be declared locally using the def keyword.
To summarise:
script/shell variable should be defensively declared in the local scope using the def keyword
do not use def when:
i. the variable needs to be referenced as an output value
ii. the variable needs to be used in a shell template
https://github.com/nextflow-io/nextflow/issues/678#issuecomment-386206123

Karate: one liner json path expression not working

I have a two line json path expression that prints something and I want to put it all in one line:
Given path 'device/'
When method get
Then status 200
#This correctly prints the value:
And def device_search = $.device[?(#.manufacturer == 'a manufacturer')]
And print device_search[0].id
#This doesn't work (prints null):
And print $.device[?(#.manufacturer == 'a manufacturer')][0].id
Thanks!
This is not supported. Use 2 steps.
But if you insist, use karate.get().
And print karate.get("$.device[?(#.manufacturer == 'a manufacturer')][0].id")

How can I read values from a configuration file (text)?

I need to read values (text) from a configuration file named .env and assign them to variables so I can use them later in my program.
The .env file contains name/value pairs and looks something like this:
ENVIRONMENT_VARIABLE_ONE = AC9157847d72b1aa5370fdef36786863d9
ENVIRONMENT_VARIABLE_TWO = 73cad721b8cad6718d469acc42ffdb1f
ENVIRONMENT_VARIABLE_THREE = +13335557777
What I have tried so far
read-values.red
Red [
]
contents: read/lines %.env
env-one: first contents
env-two: second contents
env-three: third contents
print env-one ; ENVIRONMENT_VARIABLE_ONE = AC9157847d72b1aa5370fdef36786863d9
print env-two ; ENVIRONMENT_VARIABLE_TWO = 73cad721b8cad6718d469acc42ffdb1f
print env-three ; ENVIRONMENT_VARIABLE_THREE = +13335557777
What I'm looking for
print env-one ; AC9157847d72b1aa5370fdef36786863d9
print env-two ; 73cad721b8cad6718d469acc42ffdb1f
print env-three ; +13335557777
How do I continue or change my code to parse these strings so that the env- variables contain just the values?
env-one: skip find first contents " = " 3
See help for find and skip
Another solution using parse could be:
foreach [word value] parse read %.env [collect some [keep to "=" skip keep to newline skip]] [set load word trim value]
This one adds the words to the global context: ENVIRONMENT_VARIABLE_ONE will be AC9157847d72b1aa5370fdef36786863d9, and so on.
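The same name/value split can be sketched in Python for comparison. The function name parse_env is hypothetical, and the sketch assumes each useful line has the form NAME = value, splitting on the first "=" only.

```python
def parse_env(lines):
    """Map each NAME to its value for lines shaped like 'NAME = value'."""
    values = {}
    for line in lines:
        name, sep, value = line.partition("=")
        if not sep:
            # Assumption: lines without "=" (blank lines, stray text) are skipped.
            continue
        values[name.strip()] = value.strip()
    return values
```

With the sample file, parse_env(open(".env")) would give a dictionary whose ENVIRONMENT_VARIABLE_THREE entry is the string +13335557777.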

Elegantly appending a set of strings (.txt file) to another set of strings (.txt also)?

This request might seem slightly ridiculous, unfortunately however, it is direly needed by my small company and because of this I will be awarding the maximum bounty for a good solution.
We have a set of legacy order information stored in a .txt file. In order to import this order information into our new custom database system, we need to, for each row, append on a value from another set.
So, in my .txt file I have :
Trans Date,NorthTotal,NorthSoFar,SouthTotal,SouthSoFar,IsNorthWorkingDay,IsSouthWorkingDay
2012-01-01,21,0,21,0,0,0
2012-01-02,21,0,21,0,0,0
2012-01-03,21,1,21,1,1,1
...
Now, I have a set of locations in a .txt file also, for which I need to add two columns - city and country. Let's say :
City, Country
London,England
Paris,France
For each row in my first text file, I need to append on a row of my second text file! So, for my end result, using my sample data above, I wish to have :
Trans Date,NorthTotal,NorthSoFar,SouthTotal,SouthSoFar,IsNorthWorkingDay,IsSouthWorkingDay,City,Country
2012-01-01,21,0,21,0,0,0,London,England
2012-01-02,21,0,21,0,0,0,London,England
2012-01-03,21,1,21,1,1,1,London,England
2012-01-01,21,0,21,0,0,0,Paris,France
2012-01-02,21,0,21,0,0,0,Paris,France
2012-01-03,21,1,21,1,1,1,Paris,France
...
At the moment my only idea for this is to import both files into an SQL database and write a complicated function to append the two together (hence my tag) - surely someone can save me and think of something that will not take all day though! Please?! Thank you very much.
Edit: I am open to solutions written in any programming language, but would prefer something that uses DOS or some kind of console program that can easily be rerun!
If you are open to using a database and importing these files (which should not be very difficult), then you do not need a "complicated function to append the two together". All you need is a simple cross join like this ... select t1.*, t2.* from t1, t2
See for yourself at... http://sqlfiddle.com/#!2/0c584/1
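If importing into a database is not convenient, the same cross join can be sketched in a few lines of Python. The function name cross_join is made up for illustration, and the sketch assumes both inputs start with a header line and that rows should be grouped by location, as in the desired output.

```python
def cross_join(first_lines, second_lines):
    """Yield the combined header, then every first-file row paired with every second-file row."""
    header1, *rows1 = [line.rstrip("\n") for line in first_lines]
    header2, *rows2 = [line.rstrip("\n") for line in second_lines]
    yield f"{header1},{header2}"
    # The outer loop runs over the second file so the output is grouped by location.
    for row2 in rows2:
        for row1 in rows1:
            yield f"{row1},{row2}"
```

Both inputs are held in memory here, which is reasonable for files of the size described; each output line can be written as soon as it is produced.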
Here is a solution in C#. You run it like:
joinfiles a.txt b.txt c.txt
where a.txt is the first file, b.txt the second one, and c.txt the output file that will be created. It generates the output at 100 MB/s on my machine so that is probably fast enough.
using System;
using System.IO;
using System.Text;

namespace JoinFiles
{
    class Program
    {
        static void Main(string[] args)
        {
            // Expect exactly: first input, second input, output file.
            if (args.Length != 3)
                return;
            string[] file1, file2;
            try
            {
                // Read both inputs fully and split them into non-empty lines.
                using (var sr1 = new StreamReader(args[0]))
                using (var sr2 = new StreamReader(args[1]))
                {
                    file1 = sr1.ReadToEnd().Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
                    file2 = sr2.ReadToEnd().Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
                }
                using (var outstream = new StreamWriter(args[2], false, Encoding.Default, 1048576))
                {
                    // Combined header, then every row of file1 for each row of file2.
                    outstream.WriteLine(file1[0] + "," + file2[0]);
                    for (int i = 1; i < file2.Length; i++)
                        for (int j = 1; j < file1.Length; j++)
                            outstream.WriteLine(file1[j] + "," + file2[i]);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }
}
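Bash script example:
echo -e 'c1\na\nb' > t1
echo -e 'c2\n1\n2' > t2
while read -r l1; do
    read -r -u 3 l2
    echo "$l1,$l2"
done <t1 3<t2
Note that this pairs line N of t1 with line N of t2; for the full cross product, loop over t2 again inside the loop over t1. See the read builtin in man bash: the -u option reads from the given file descriptor.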
You could also write a WSH script to do this and execute from the command line. Here is a quick hack (works but will certainly need some refining). You'll need to save this as a vbs file and execute on the cli like this... wscript script.vbs infile1.txt infile2.txt outfile.txt where script.vbs is your script and infile 1 and 2 are input filenames and outfile.txt is the output file.
Set FSO_In1 = CreateObject("Scripting.FileSystemObject")
Set FSO_In2 = CreateObject("Scripting.FileSystemObject")
Set FSO_Out = CreateObject("Scripting.FileSystemObject")
Set File_Out = FSO_In1.CreateTextFile(Wscript.Arguments.Item(2),2)
Set F1_file = FSO_In1.OpenTextFile(Wscript.Arguments.Item(0),1)
HeaderWritten = False
Header = F1_File.Readline 'Read the first header line from first file
Do While F1_File.AtEndOfStream = False
    F1_Line = F1_file.Readline
    Set F2_File = FSO_In2.OpenTextFile(Wscript.Arguments.Item(1),1)
    if HeaderWritten = False then
        Header = Header & "," & F2_File.Readline
        File_Out.Writeline(Header)
        HeaderWritten = True
    else
        F2_File.Readline 'Read the first header line from second file and ignore it
    end if
    Do While F2_File.AtEndOfStream = False
        F2_Line = F2_File.Readline
        out = F1_Line & "," & F2_Line
        File_Out.Writeline(out)
    Loop
    F2_File.Close
Loop
F1_File.Close
File_Out.Close

User input after print

I'm trying to make a simple Lua program that converts Fahrenheit to Celsius and Kelvin, and I don't know how to take input on the same line as the printed prompt. Here's what I mean.
I want the program to display:
Fahrenheit = "Here's the user input"
I know how to make it say
Fahrenheit =
"User input"
I'm still a novice.
This is my code so far:
print("Fahrenheit = ")
f = io.read()
c = (5/9)*(f-32)
print("Celsius = "..c)
k = c + 273
print("Kelvin = "..k)
Look into io.write() and io.read(). For instance, you could say:
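io.write("Fahrenheit = ")
f = tonumber(io.read())
io.write writes to standard output without appending a newline, so the input is typed on the same line as the prompt. io.read reads the next line of input and returns it as a string, which tonumber converts to a number.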
For reference, I suggest this link from the tutorial.
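The same prompt-then-read pattern, sketched in Python for comparison. The helper names ask and convert are hypothetical, and the Kelvin offset of 273.15 is an assumption that differs slightly from the 273 used in the question.

```python
import sys


def ask(label, out=sys.stdout, inp=sys.stdin):
    """Write the label without a trailing newline, then read one line of input."""
    out.write(label)
    out.flush()
    return inp.readline().strip()


def convert(fahrenheit):
    """Return (celsius, kelvin) for a Fahrenheit temperature."""
    celsius = (5 / 9) * (fahrenheit - 32)
    kelvin = celsius + 273.15  # assumption: 273.15 rather than the question's 273
    return celsius, kelvin
```

Because the prompt is written without a newline, whatever the user types appears right after "Fahrenheit = " on the same line.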