Piping output to local machine in a loop - ssh

I am trying to read some files located on a server and write only a certain number of columns from those files onto my local machine. I tried to do this in a for loop to avoid entering my password for each file. Below is what I have been able to cobble together so far.
The following code works but writes all the output to a single file which is not manageable due to its large size.
ssh user@xx.xxx.xxx.xx 'for loc in /hel/insur/*/201701*; do zcat $loc | grep -v NUMBER | awk -F',' -v OFS="," '\''{print $1,$2,$3,$4,$5}'\'' | gzip; done' > /cygdrive/c/Users/user1/Desktop/test/singlefile.csv.gz
So, I tried to write each file individually as shown below, but it gives me an error saying that it cannot find the location (possibly because I am ssh'd into the remote server).
ssh user@xx.xxx.xxx.xx 'for loc in /hel/insur/*/201701*; do zcat $loc | grep -v NUMBER | awk -F',' -v OFS="," '\''{print $1,$2,$3,$4,$5}'\'' | gzip > /cygdrive/c/Users/user1/Desktop/test/`echo $loc | cut -c84-112` ; done'
Any ideas on how to solve this?
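One way to keep each file separate is to do the per-file redirection on the local side (e.g. with one ssh call per file, reusing a single authenticated connection via OpenSSH's ControlMaster option so the password is only asked once). The filtering loop itself can be sketched locally; the directory names and sample data below are made up, and the ssh wrapper is omitted:

```shell
# Simulate the remote files locally; each source file gets its own
# filtered .csv.gz on the destination side instead of one huge file.
mkdir -p /tmp/demo_src /tmp/demo_out
printf 'a,b,c,d,e,f\nNUMBER,x,x,x,x,x\n1,2,3,4,5,6\n' | gzip > /tmp/demo_src/201701_one.gz

for loc in /tmp/demo_src/201701*; do
  # keep the first five columns, drop header-ish NUMBER rows, recompress
  zcat "$loc" | grep -v NUMBER | awk -F, -v OFS=, '{print $1,$2,$3,$4,$5}' \
    | gzip > /tmp/demo_out/"$(basename "$loc" .gz)".csv.gz
done

zcat /tmp/demo_out/201701_one.csv.gz
# prints: a,b,c,d,e
#         1,2,3,4,5
```

On the real data, the zcat/grep/awk/gzip pipeline would run inside the ssh command while the `> local/path` redirection stays on the local side, one output file per input file.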

Related

Running "ip | grep | awk" within a sed replacement

Problem Set (Raspberry Pi OS):
I have a file example.conf that contains a line IPv4addr=XXXXX. I am attempting to change this to the IP that is generated by the command
IPtest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
I want to automate this file change during a script install.sh, the line I am attempting is:
IPtest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
sudo sed -e "/IPv4addr/s/[^=]*$/$IPtest/" example.conf
Returns error:
sed: -e expression #1, char 32: unknown option to `s'
Substituting a simple value works in that code, such as SimpleTest='Works'
Any thoughts? I am open to other solutions as well; however, I am not an experienced Linux user, so I am using the tools I know from other problem sets.
$IPtest contains the / character; try something like this:
IPtest=$(ip --brief a show | grep eth0 | awk '{ print $3 }')
sudo sed -e '/IPv4addr/s#[^=]*$#'"$IPtest"'#' example.conf
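A quick way to see why the delimiter change matters (the address value here is made up):

```shell
IPtest='192.168.1.10/24'              # a value containing "/"
printf 'IPv4addr=XXXXX\n' > /tmp/example.conf

# With "/" as the sed delimiter, the "/" inside $IPtest would end the
# substitution early and trigger "unknown option to `s'". With "#" as
# the delimiter, the substitution goes through:
sed -e '/IPv4addr/s#[^=]*$#'"$IPtest"'#' /tmp/example.conf
# prints: IPv4addr=192.168.1.10/24
```

Any character not present in the pattern or replacement works as the delimiter; `#`, `|`, and `,` are common choices.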
You can shorten your variable assignment and let awk do the job of grep at the same time:
IPtest=$(ip --brief a s | awk '/eth0/{print $3}')
Using sed grouping and back-referencing:
sed -i.bak "s|\([^=]*.\).*|\1$IPtest|" example.conf

Force write to Xcode 'Debugger Output' in console?

The Xcode console has a 'Debugger output' filter. I understand this is for use with lldb, and that you can get messages to print to this output by using breakpoints. My question is not how to do that.
My question is: what is the underlying mechanism Xcode itself uses to write lldb messages to Debugger Output (not Target Output)? Is there a variable similar to stdout or stderr that writes here? Is it possible, from Xcode target code (Swift/Obj-C/C), to write to this output?
Looks like Xcode uses a tty to communicate with lldb, and you can interface with the Debugger Output using that:
echo "Wheeeeeeee" > $(lsof -p $(ps -A | grep -m1 MacOS/Xcode | awk '{print $1}') | grep -m2 dev/ttys | tail -1 | awk '{print $9}')
Breaking the above down:
$ ps -A | grep -m1 MacOS/Xcode | awk '{print $1}'
21280
This gives the process ID of Xcode (21280). Using this, we can find the files it has open:
$ lsof -p 21280 | grep /dev/ttys
Xcode 21280 tres 47u CHR 16,3 0t0 3569 /dev/ttys003
Xcode 21280 tres 58u CHR 16,5 0t0 3575 /dev/ttys005
The one with the highest number (/dev/ttys005 in this case) is the one we want, so let's extract it. tail -1 will give us the last line of output, and awk '{print $9}' will give us the 9th item on the line, which is what we want!
$ lsof -p 21280 | grep /dev/ttys | tail -1 | awk '{print $9}'
/dev/ttys005
Now we can use this to write whatever we want:
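Putting the extraction together on the sample lsof output shown above (the final write only does anything observable while Xcode is actually running, so it is left as a comment):

```shell
# Re-run the tail/awk extraction on the sample lsof lines from above
lsof_sample='Xcode 21280 tres 47u CHR 16,3 0t0 3569 /dev/ttys003
Xcode 21280 tres 58u CHR 16,5 0t0 3575 /dev/ttys005'

tty_path=$(printf '%s\n' "$lsof_sample" | grep /dev/ttys | tail -1 | awk '{print $9}')
echo "$tty_path"
# prints: /dev/ttys005

# With Xcode running, this line would appear in its Debugger Output:
# echo "Wheeeeeeee" > "$tty_path"
```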

Grep / awk, match exact string

I need to find the IDs of some Docker images, but some images have similar names:
$ docker images
REPOSITORY      TAG      IMAGE ID
app-node        latest   620350b79c5a
app-node-temp   latest   461c5143a985
If I run:
$ docker images | grep -w app-node-temp | awk -e '{print $3}'
461c5143a985
If I run instead:
$ docker images | grep -w app-node | awk -e '{print $3}'
620350b79c5a
461c5143a985
How can I match the exact name?
I'd say just use awk with exact string matching:
docker images | awk '$1 == "app-node" { print $3 }'
Dashes are considered non-word characters, so grep -w won't work when the difference is marked by a dash.
In context, grep '^app-node[[:space:]]' would work. It looks for the required name followed by a space.
Of course, grep | awk is an anti-pattern most of the time; it would be better to use:
docker images | awk '/^app-node[[:space:]]/ { print $3 }'
Or, an easier solution with awk again uses equality — as suggested by Tom Fenech in his answer:
for server in app-node app-node-temp
do
docker images | awk -v server="$server" '$1 == server { print $3 }'
…
done
If running docker images is too expensive, you can run it once and capture the output in a file and then scan the file. This shows how to pass a shell variable into the awk script.
More likely, the pipeline would be run to capture the image ID in a variable:
image_id=$(docker images | awk -v server="$server" '$1 == server { print $3 }')
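All of the exact-match variants above can be checked against the sample docker images output from the question, no Docker needed:

```shell
# Stand-in for `docker images` output (from the question)
images='REPOSITORY TAG IMAGE ID
app-node latest 620350b79c5a
app-node-temp latest 461c5143a985'

# awk equality on the first field matches app-node but not app-node-temp
printf '%s\n' "$images" | awk '$1 == "app-node" { print $3 }'
# prints: 620350b79c5a

# anchored grep: the name must be followed by whitespace
printf '%s\n' "$images" | grep '^app-node[[:space:]]' | awk '{print $3}'
# prints: 620350b79c5a
```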
docker images -q app-node is also good for your case: with a repository name as its argument, it prints only the IDs of images in that exact repository.

How to put this command in a Makefile?

I have the following command I want to execute in a Makefile but I'm not sure how.
The command is docker rmi -f $(docker images | grep "<none>" | awk "{print \$3}")
The command executed between $(..) should produce output which is fed to docker rmi, but this is not working from within the Makefile. I think that's because $ is treated specially in Makefiles, but I'm not sure how to modify the command to fit in there.
Any ideas?
$ in Makefiles needs to be doubled to prevent substitution by make:
docker rmi -f $$(docker images | grep "<none>" | awk "{print \$$3}")
Also, it'd be simpler to use a single-quoted string in the awk command to prevent expansion of $3 by the shell:
docker rmi -f $$(docker images | grep "<none>" | awk '{print $$3}')
I really recommend the latter. It's usually better to have awk code in single quotes because it tends to contain a lot of $s, and all the backslashes hurt readability.
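A minimal way to check the doubling behaviour without touching Docker (assumes make is installed; echo stands in for the real command, and recipe lines must start with a real tab):

```shell
# Inside a make recipe, $$ reaches the shell as a single $
cat > /tmp/demo.mk <<'EOF'
show:
	@echo one two three | awk '{print $$3}'
EOF

make -f /tmp/demo.mk show
# prints: three
```

If a single `$3` were used in the recipe, make would expand `$3` itself (to an empty variable plus a literal `3`) before the shell ever saw it.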

AWK to process compressed files and printing original (compressed) file names

I would like to process multiple .gz files with gawk.
I was thinking of decompressing and passing it to gawk on the fly
but I have an additional requirement to also store/print the original file name in the output.
The thing is there's 100s of .gz files with rather large size to process.
Looking for anomalies (~0.001% rows) and want to print out the list of found inconsistencies ALONG with the file name and row number that contained it.
If I could have all the files decompressed I would simply use FILENAME variable to get this.
Because of large quantity and size of those files I can't decompress them upfront.
Any ideas how to pass filename (in addition to the gzip stdout) to gawk to produce required output?
Assuming you are looping over all the files and piping their decompressed contents directly into awk, something like the following will work:
for file in *.gz; do
gunzip -c "$file" | awk -v origname="$file" '.... {print origname " whatever"}'
done
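A self-contained check of that loop (the file names and contents are made up; FNR gives the row number within each decompressed stream, covering the "file name and row number" requirement):

```shell
mkdir -p /tmp/gzdemo
printf 'good row\nBAD row\n' | gzip > /tmp/gzdemo/one.gz
printf 'fine row\n'          | gzip > /tmp/gzdemo/two.gz

for file in /tmp/gzdemo/*.gz; do
  # origname carries the compressed file's name into awk; FNR is the row number
  gunzip -c "$file" | awk -v origname="$file" '/BAD/ {print origname ":" FNR ": " $0}'
done
# prints: /tmp/gzdemo/one.gz:2: BAD row
```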
Edit: To use a list of filenames from some source other than a direct glob, something like the following can be used.
$ ls *.awk
a.awk e.awk
$ while IFS= read -d '' filename; do
echo "$filename";
done < <(find . -name \*.awk -printf '%P\0')
e.awk
a.awk
Using xargs instead of the above loop would, I believe, require the body of the command to be in a pre-written script file, which xargs can then call with each filename.
This uses a combination of xargs and sh (sh is needed to be able to pipe the two commands, gzip and awk, together):
find *.gz -print0 | xargs -0 -I fname sh -c 'gzip -dc fname | gawk -v origfile="fname" -f printbadrowsonly.awk >> baddata.txt'
I'm wondering if there's any bad practice with the above approach…