How to fetch object file names from a .map file - scripting

I need to collect all the names of object files from a .map file, make a list of them, and then calculate how much space they take in different memory areas. The map files are big (>2500 lines) and doing this manually takes a lot of time.
I tried
grep -r '.o' *.map
but it gave me a lot of irrelevant results, because the unescaped '.' in the pattern matches any character followed by an 'o'.
This is a sample from the map file:
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(start.o)
(start)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-main.lib(contiki-main.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(start.o) (main)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(ss_dhanush_init)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-main.lib(contiki-main.o) (ss_dhanush_services_init)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(mio_dma_drv.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(ss_dhanush_init) (MIO_Dma_Init)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(dma_drv.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(ss_dhanush_init) (Dma_Init)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(memory_map.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(mio_dma_drv.o) (Virtual_To_Physical)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(socVer.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(ss_dhanush_init) (System_SOC_VersionInit)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(c_fuction.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(ss_dhanush_init) (memset)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(Rip_api.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(ss_dhanush_init) (ripStartService)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(irq_handler.o)
AProject/I3/S-SDK/MPU_Asymmetri_ported_Lib_changes_trunk_code/AON/output/contiki-dhanushss.a(mio_dma_drv.o) (register_isr)
I just need a list of the object files in this map file.

I'm not sure exactly what you mean by a list of object files, but give this line a try:
grep -r -Po '[^(]*[.]o(?=[)])' *.map
The result is:
start.o
contiki-main.o
start.o
contiki-main.o
mio_dma_drv.o
dma_drv.o
memory_map.o
mio_dma_drv.o
socVer.o
c_fuction.o
Rip_api.o
irq_handler.o
mio_dma_drv.o
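If you only want each object file listed once, or want to see how many times each one appears, pipe that output through sort:
grep -r -Po '[^(]*[.]o(?=[)])' *.map | sort -u
grep -r -Po '[^(]*[.]o(?=[)])' *.map | sort | uniq -c | sort -rn
For the size calculation mentioned in the question, the right approach depends on your linker's map format, which the sample doesn't show. Assuming a GNU ld style memory map with lines like " .text  0x00000000  0x1a4  output/foo.o" (an assumption, not taken from your file), a rough gawk sketch that totals the bytes per object could look like this:
# Assumption: GNU ld map lines of the form  .section  0xADDR  0xSIZE  object-or-archive(member.o)
# strtonum() is gawk-specific; wrapped lines (very long section names) are not handled.
gawk '$1 ~ /^\./ && $3 ~ /^0x/ && $4 ~ /\.o\)?$/ { total[$4] += strtonum($3) }
      END { for (o in total) printf "%10d  %s\n", total[o], o }' your.map | sort -rn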

Nextflow input how to declare tuple in tuple

I am working with a nextflow workflow that, at a certain stage, groups a series of files by their sample id using groupTuple(), resulting in a channel that looks like this:
[sample_id, [file_A, file_B, ... , file_N]]
[sample_id, [file_A, file_B, ... , file_N]]
...
[sample_id, [file_A, file_B, ... , file_N]]
Note that this is the same channel structure that you get from .fromFilePairs().
I want to use these channel items in a process in such a way that, for each item, the process reads the sample_id from the first field and all the files from the inner tuple at once.
The nextflow documentation is somewhat cryptic about this, and it is hard to find how to declare this type of input for a process, so I thought I'd create a question on Stack Overflow and then answer it myself for anyone who will ever be looking for this answer.
How does one declare the inner tuple in the input section of a nextflow process?
In the example given above, my inner tuple contains items of only one type (files). I can therefore pass the whole second term of the tuple (i.e. the inner tuple) as a single input item under the file() qualifier. Like this:
input:
tuple \
val(sample_id), \
file(inner_tuple) \
from Input_channel
This will ensure that the tuple contents are read as files (one by one), the same way as performing .collect() on a channel of files, in the sense that all files will then be available in the nextflow temp directory where the process is executed.
The question is how you come up with the sample_id, but in case your files just differ by their file extensions, you might use something like this:
all_files = Channel.fromPath("/path/to/your/files/*")
all_files.map { it -> [it.simpleName, it] }
.groupTuple()
.set { grouped_files }
The path qualifier (previously the file qualifier) can be used to stage a single (file) value or a collection of (file) values into the process execution directory. The note at the bottom of the multiple input files section in the docs also mentions:
The normal file input constructs introduced in the input of files
section are valid for collections of multiple files as well.
This means, you can use a script variable, e.g.:
input:
tuple val(sample_id), path(my_files)
In which case, the variable will hold the list of files (preserving the original filenames). You could use it directly to refer to all of the files in the list, or, you could access specific (file) elements (if you need them) using square bracket (slice) notation.
This is the syntax you will want most of the time. However, if you need predictable filenames or if you need to deal with files that have identical filenames, you may need a different approach:
Alternatively, you could specify a target filename, e.g.:
input:
tuple val(sample_id), path('my_file')
In the case where a single file is received by the process, the file would be staged with the target filename. However, when a collection of files is received by the process, the filename will be appended with a numerical suffix representing its ordinal position in the list. For example:
process test {
    tag { sample_id }
    debug true
    stageInMode 'rellink'

    input:
    tuple val(sample_id), path('fastq')

    """
    echo "${sample_id}:"
    ls -g --time-style=+"" fastq*
    """
}

workflow {
    readgroups = Channel.fromFilePairs( '*_{1,2}.fastq' )
    test( readgroups )
}
Results:
$ touch {foo,bar,baz}_{1,2}.fastq
$ nextflow run .
N E X T F L O W ~ version 22.04.4
Launching `./main.nf` [scruffy_caravaggio] DSL2 - revision: 87a80d6d50
executor > local (3)
[65/66f860] process > test (bar) [100%] 3 of 3 ✔
baz:
lrwxrwxrwx 1 users 20 fastq1 -> ../../../baz_1.fastq
lrwxrwxrwx 1 users 20 fastq2 -> ../../../baz_2.fastq
foo:
lrwxrwxrwx 1 users 20 fastq1 -> ../../../foo_1.fastq
lrwxrwxrwx 1 users 20 fastq2 -> ../../../foo_2.fastq
bar:
lrwxrwxrwx 1 users 20 fastq1 -> ../../../bar_1.fastq
lrwxrwxrwx 1 users 20 fastq2 -> ../../../bar_2.fastq
Note that the names of staged files can be controlled using the * and ? wildcards. See the documentation for a table that shows how the wildcards are replaced depending on the cardinality of the input collection.

How to read the newly appended lines of a growing log file continuously in julia?

There is a shell command:
tail -n0 -f /path/to/growing/log
to display the newly appended lines of a file continuously.
How can I achieve the same thing in Julia?
Just repeatedly read the file:
file = open("/path/to/growing/log")
seekend(file)  # ignore contents that are already there to match the `-n0` option
while true
    sleep(0.2)
    data = read(file, String)
    !isempty(data) && print(data)
end

perl gunzip to buffer and gunzip to file have different byte orders

I'm using Perl v5.22.1, Storable 2.53_01, and IO::Uncompress::Gunzip 2.068.
I want to use Perl to gunzip a Storable file in memory, without using an intermediate file.
I have a variable $zip_file = '/some/storable.gz' that points to this zipped file.
If I gunzip directly to a file, this works fine, and %root is correctly set to the Storable hash.
gunzip($zip_file, '/home/myusername/Programming/unzipped');
my %root = %{retrieve('/home/myusername/Programming/unzipped')};
However if I gunzip into memory like this:
my $file;
gunzip($zip_file, \$file);
my %root = %{thaw($file)};
I get the error
Storable binary image v56.115 more recent than I am (v2.10)
so the Storable's magic number has been butchered: it should never be that high.
However, the strings in the unzipped buffer are still correct; the buffer starts with pst which is the correct Storable header. It only seems to be multi-byte variables like integers which are being broken.
Does this have something to do with byte ordering, such that writing to a file works one way while writing to a file buffer works in another? How can I gunzip to a buffer without it ruining my integers?
That's not related to unzip but to using retrieve vs. thaw. They expect different input, i.e. thaw expects the output from freeze while retrieve expects the output from store.
This can be verified with a simple test:
$ perl -MStorable -e 'my $x = {}; store($x,q[file.store])'
$ perl -MStorable=freeze -e 'my $x = {}; print freeze($x)' > file.freeze
On my machine this gives 24 bytes for the file created by store and 20 bytes for freeze. If I remove the leading 4 bytes from file.store the file is equivalent to file.freeze, i.e. store just added a 4 byte header. Thus you might try to uncompress the file in memory, remove the leading 4 bytes and run thaw on the rest.
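If you want to verify that on your own system before relying on it, a quick check in bash (reusing the two files created above; tail -c +5 simply drops the first 4 bytes) is:
$ cmp <(tail -c +5 file.store) file.freeze && echo "identical once the 4-byte header is removed"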

Tail and grep combo to show text in a multi line pattern

Not sure if you can do this through tail and grep. Let's say I have a log file that I would like to tail. It spits out quite a bit of information when in debug mode. I want to grep for information pertaining only to my module, and the module name is in the log like so:
/*** Module Name | 2014.01.29 14:58:01
a multi line
dump of some stacks
or whatever
**/
/*** Some Other Module Name | 2014.01.29 14:58:01
this should show up in the grep
**/
So as you can imagine, the number of lines spit out pertaining to "Module Name" could be 2 or 50, until the end pattern (**/) appears.
From your description, it seems like this should work:
tail -f log | awk '/^\/\*\*\* Module Name/,/^\*\*\//'
but be wary of buffering issues. (Lines appended to the file may only show up in your terminal after a noticeable delay.)
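If that latency becomes a problem, flushing awk's output after each matching line usually helps; fflush() is supported by gawk, mawk and other modern awks:
tail -f log | awk '/^\/\*\*\* Module Name/,/^\*\*\//{ print; fflush() }'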
I think you will have to install pcregrep for this. Then this will work:
tail -f logfile | pcregrep -M "^\/\*\*\* Some Other Module Name \|.*(\n|.)*?^\*\*/$"
This worked in testing, but for some reason I had to append your example text to the log file over 100 times before any output started showing up. However, all of the "Some Other Module Name" data that was written to the log file after this line was invoked did eventually get printed to stdout.

alternative to tail -F

I am monitoring a log file by doing "tail -n 0 -F filename". But this is taking up a lot of CPU as there are many messages being written to the logfile. Is there a way I can open the file, read the new entries, and close it, repeating this at a 5 second interval, so that I don't need to keep following the file? How can I remember the last read line so I can start from the next one in the next run? I am trying to do this in nawk by spawning the tail shell command.
You won't be able to magically use fewer resources to tail a file by writing your own implementation. If tail -f is using resources because the file is growing fast, a custom version won't help as long as you still want to view all lines as they are written. You are simply limited by your hardware's I/O and/or CPU.
Try using --sleep-interval=S where "S" is a number of seconds (the default is 1.0 - you can specify decimals).
tail -n 0 --sleep-interval=.5 -F filename
If you have so many log entries that tail is bogging down the CPU, how are you able to monitor them?
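That said, if you really want the open-read-close approach described in the question, here is a rough shell sketch of the idea. It remembers a byte offset between runs rather than a line number, and the log and state-file paths are placeholders you would replace:
#!/bin/sh
# Hypothetical paths - adjust for your setup.
log=/path/to/logfile
state=/tmp/logfile.offset

last=$(cat "$state" 2>/dev/null || echo 0)
size=$(wc -c < "$log")

# If the file shrank, it was probably truncated or rotated; start over.
if [ "$size" -lt "$last" ]; then
    last=0
fi

# Print only the bytes appended since the last run.
if [ "$size" -gt "$last" ]; then
    tail -c +"$((last + 1))" "$log"
fi

echo "$size" > "$state"
Run it from cron or a 'while sleep 5' loop; whether it actually uses less CPU than tail -F depends on how fast the log grows, as noted above.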