This question concerns code optimization.
I want to define a JSON file which specifies what files will be copied and where. Namely, I want to use it in an npm-based project to transfer files from node_modules to the destination directory from which I include them (in the page templates served to the browser). So I wrote a JSON file:
{
  "public": [
    {
      "bootstrap": {
        "js": [
          "bootstrap.min.js"
        ],
        "css": [
          "bootstrap.min.css"
        ]
      }
    },
    {
      "jquery": {
        "js": [
          "jquery.min.js"
        ]
      }
    }
  ]
}
The first level defines the name of the destination directory ('public'), which will contain the packages specified on the second level. The third level defines the names of the destination folders inside 'public' ('js' or 'css'), each of which holds the list of files to find and copy.
The code that traverses the JSON file follows:
#!/usr/bin/env bash
cfg='push2public.json'
P=$(cat "$cfg" | jq -r 'keys[0]')
n=$(cat "$cfg" | jq ".$P | length")
for (( i=0; i<n; i++ )); do
  p=$(cat "$cfg" | jq ".$P[$i]" | jq -r 'keys[0]')
  m=$(cat "$cfg" | jq ".$P[$i].$p | length")
  for (( j=0; j<m; j++ )); do
    d=$(cat "$cfg" | jq ".$P[$i].$p" | jq -r "keys[$j]")
    mkdir -p "$P/$d"
    l=$(cat "$cfg" | jq ".$P[$i].$p" | jq ".$d | length")
    for (( k=0; k<l; k++ )); do
      f=$(cat "$cfg" | jq -r ".$P[$i].$p.$d[$k]")
      find . -path "./node_modules/$p/*" -name "$f" | xargs -I{} cp -fa "{}" "$P/$d/"
    done
  done
done
The code seems to work, yet it looks kinda strange. Can you think of a better way to apply jq for the task just described?
In this response, I'll focus on the main point: it is possible to construct the shell commands with just one invocation of jq. jq does not have a "system" command for executing these commands, so the jq program given here may need to be modified, depending e.g. on security requirements.
To get the ball rolling, note that the script given in the question generates (and executes) the following shell commands:
find . -path ./node_modules/bootstrap/* -name bootstrap.min.css | xargs -I{} cp -fa {} public/css/
find . -path ./node_modules/bootstrap/* -name bootstrap.min.js | xargs -I{} cp -fa {} public/js/
find . -path ./node_modules/jquery/* -name jquery.min.js | xargs -I{} cp -fa {} public/js/
These commands can be generated with just one invocation of jq using the following jq program:
def construct:
  (.value | to_entries[]
    # generate one command per file listed in the array
    | "-name \(.value[]) | xargs -I{} cp -fa {} public/\(.key)/") as $s
  | "find . -path ./node_modules/\(.key)/* " + $s;

.[][]
| to_entries[]
| construct
Note that in the output produced by this jq program, the ordering is different, because the script in the question uses keys, which sorts the keys alphabetically.
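To actually execute the generated commands, they can be piped to a shell. A minimal sketch, assuming the program above is saved under the hypothetical name construct.jq and that the destination directories for this example configuration already exist (the script in the question created them with mkdir -p):
# create the destination directories first, as the original script did
mkdir -p public/js public/css
# a single jq invocation generates all the commands; the shell runs them
jq -r -f construct.jq push2public.json | sh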
Related
I want cmake to output the paths of the header directories of the libraries the project depends on, so that I can give them to ctags to generate tags.
I have tried generating tags for all system headers directly with ctags -R /usr/include, but the generated tags file is 190 MB, which is too large.
For example, if libcurl is used in the project, then cmake should output /usr/include/curl, and ctags can then run ctags -R /usr/include/curl.
I looked at cmake --help, but didn't find what I was looking for. How can I achieve this?
Generate compile_commands.json, then parse it: extract all "command" keys, pull out every -I<path> include flag from the compile commands, and interpret the paths relative to the build directory. Finally, sort -u the list.
$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=1 ...
$ jq -r '.[] | .command' "$builddir"/compile_commands.json |
grep -o -- '-I[^ ]*' |
sed 's/^-I//' |
sort -u |
( cd "$builddir" && xargs -d '\n' readlink -f ) |
sort -u
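To close the loop with ctags, the resulting paths can be handed over with xargs. A minimal sketch, assuming the build directory is named build and project.tags is an arbitrary output file name:
#!/bin/sh
# index only the include directories the project actually uses
builddir=build
jq -r '.[] | .command' "$builddir"/compile_commands.json |
  grep -o -- '-I[^ ]*' |
  sed 's/^-I//' |
  sort -u |
  ( cd "$builddir" && xargs -d '\n' readlink -f ) |
  sort -u |
  xargs -d '\n' ctags -R -f project.tags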
I'm trying to set up my CI file with some variables. I'm able to generate a variable like so:
...
variables:
  TARGET_PROJECT_DIR: "${CI_PROJECT_NAME}.git"
However, I don't seem to be able to do this:
...
variables:
  PROJECT_PROTOCOL_RELATIVE_URL: "${CI_PROJECT_URL//https:\/\/}.git"
If I run that in bash, I get the expected output, which is gitlab.com/my/repo/url.git with the 'https://' removed and the '.git' appended.
My workaround has just been to export it in the 'script' section, but it feels a lot neater to add this to the variables section, since this is part of a template that is being inherited by the actual jobs. Is it possible?
There are several more useful variables defined in the GitLab CI environment.
CI_PROJECT_PATH gives you the <namespace>/<project name> (or just <project name> if you have no extra namespace) string and
CI_SERVER_HOST gives you the server name, so you could do
variables:
  PROJECT_PROTOCOL_RELATIVE_URL: ${CI_SERVER_HOST}/${CI_PROJECT_PATH}.git
I have similar setups (also without quotes).
I'm not sure if that will work for you, since my runners and my server are under my control and I don't run pipelines with external projects.
But you can get all available variables displayed in the job log by running a job like this:
stages:
  - env

show-env:
  stage: env
  script:
    - env
Also always helpful is https://docs.gitlab.com/ee/ci/variables/predefined_variables.html
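If those predefined variables don't cover your case, the export workaround from the question can at least live once in the inherited template's before_script instead of in every job. A minimal sketch, reusing the question's own expression:
before_script:
  # bash parameter expansion works in script lines, unlike in the variables section
  - export PROJECT_PROTOCOL_RELATIVE_URL="${CI_PROJECT_URL//https:\/\/}.git"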
After looking around for similar challenges, I found your unanswered question. Here are my suggestions:
stages:
  - todo

todo-job:
  stage: todo
  only:
    - master
  script:
    # your question / example
    - echo ${CI_PROJECT_URL}
    - echo ${CI_PROJECT_URL:8:100}.git
    # Because you have word manipulation in the title, I have some more examples:
    # Return the substring between the two '_'
    - INPUT="someletters_12345_moreleters.ext"
    - SUBSTRING=`expr match "$INPUT" '.*_\([[:digit:]]*\)_.*' `
    - echo $SUBSTRING
    # Store a substring in a new variable and create an output
    - b=${INPUT:12:5}
    - echo $b
    # Substring using grep with regex (more readable)
    - your_number=$(echo "someletters_12345_moreleters.ext" | grep -E -o '[0-9]{5}')
    - echo $your_number
    # Substring using a variable and 'grep' with regex (more readable)
    - your_number=$(echo "$INPUT" | grep -E -o '[0-9]{5}')
    - echo $your_number
    # Split a string and return a part using 'cut'
    - your_id=$(echo "Release V14_TEST-42" | cut -d "_" -f2 )
    - echo $your_id
    # Split the string of a variable and return a part using 'cut'
    - VAR="Release V14_TEST-42"
    - your_number=$(echo "$VAR" | cut -d "_" -f2 )
    - echo $your_number
The GitLab output looks like:
$ echo ${CI_PROJECT_URL}
https://gitlab.com/XXXXXXXXXX/gitlab_related_projects/test
$ echo ${CI_PROJECT_URL:8:100}.git
gitlab.com/XXXXXXXXXX/gitlab_related_projects/test.git
$ INPUT="someletters_12345_moreleters.ext"
$ SUBSTRING=`expr match "$INPUT" '.*_\([[:digit:]]*\)_.*' `
$ echo $SUBSTRING
12345
$ b=${INPUT:12:5}
$ echo $b
12345
$ your_number=$(echo "someletters_12345_moreleters.ext" | grep -E -o '[0-9]{5}')
$ echo $your_number
12345
$ your_number=$(echo "$INPUT" | grep -E -o '[0-9]{5}')
$ echo $your_number
12345
$ your_number=$(echo "Release V14_TEST-42" | cut -d "_" -f2 )
$ echo $your_number
TEST-42
$ VAR="Release V14_TEST-42"
$ your_number=$(echo "$VAR" | cut -d "_" -f2 )
$ echo $your_number
TEST-42
Job succeeded
I'm trying to crawl some websites. However, my crawling process takes so long that I need multiple instances to shorten it. I've searched for other ways and I abort all unnecessary resource requests, but it's still way too slow for me (around 8-9 seconds per page).
What is the easiest way to run CasperJS instances in parallel, or even just two CasperJS processes at the same time, to crawl in parallel?
I have used GNU Parallel, following a blog post I found; however, although the processes are alive, they don't seem to be crawling in parallel, because the total execution time is the same as with one instance.
Should I use a Node.js server to create the instances?
What is the easiest and most practical way?
Can you adapt this:
https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Breadth-first-parallel-web-crawler-mirrorer
#!/bin/bash
# E.g. http://gatt.org.yeslab.org/
URL=$1
# Stay inside the start dir
BASEURL=$(echo $URL | perl -pe 's:#.*::; s:(//.*/)[^/]*:$1:')
URLLIST=$(mktemp urllist.XXXX)
URLLIST2=$(mktemp urllist.XXXX)
SEEN=$(mktemp seen.XXXX)
# Spider to get the URLs
echo $URL >$URLLIST
cp $URLLIST $SEEN
while [ -s $URLLIST ] ; do
  cat $URLLIST |
    parallel lynx -listonly -image_links -dump {} \; \
      wget -qm -l1 -Q1 {} \; echo Spidered: {} \>\&2 |
    perl -ne 's/#.*//; s/\s+\d+.\s(\S+)$/$1/ and do { $seen{$1}++ or print }' |
    grep -F $BASEURL |
    grep -v -x -F -f $SEEN | tee -a $SEEN > $URLLIST2
  mv $URLLIST2 $URLLIST
done
rm -f $URLLIST $URLLIST2 $SEEN
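For CasperJS itself the same idea applies: give GNU Parallel one URL per job so the instances genuinely run side by side. A minimal sketch, where crawl.js and urls.txt are hypothetical names for your crawl script and URL list:
# run up to 4 CasperJS instances at a time, one URL per instance
cat urls.txt | parallel -j4 casperjs crawl.js {}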
( find -print0 | xargs -0 cat ) | wc -l (from How to count all the lines of code in a directory recursively?) prints the total number of lines in all files in all subdirectories. But it also prints a bunch of lines like cat: ./x: Is a directory.
I tried ( find -print0 | xargs -0 cat ) | wc -l &> /dev/null (and also 2> /dev/null and > /dev/null 2>&1) but the messages are still printed to the shell.
Is it not possible to hide this output?
( find -type f -print0 | xargs -0 cat ) | wc -l overcomes this problem, but I'm still curious why redirecting stderr doesn't work, and if there is a more general purpose way to hide errors from cat.
You need to redirect the stderr stream of the cat command to /dev/null. What you have done is redirected the stderr stream of wc. Try this:
( find -print0 | xargs -0 cat 2>/dev/null ) | wc -l
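Alternatively, redirect stderr of the whole subshell. A redirection placed after the last command of a pipeline applies only to that command (wc in your attempts), which is why your versions still printed the messages:
( find -print0 | xargs -0 cat ) 2>/dev/null | wc -l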
If you want find only to find “regular” files, you must use find -type f ….
By the way, if you want to calculate lines of code, you should take a look at ohcount.
I want to make a list of files from locate's output and have scp take that list. I am not sure about the syntax.
My attempt, in pseudo-code:
locate labra | xargs scp {} masi@11.11.11:~/Desktop/
How can I copy the files to the destination?
xargs normally takes as many arguments as it can fit on the command line, but with -I it suddenly takes only one. GNU Parallel http://www.gnu.org/software/parallel/ may be a better solution:
locate labra | parallel -m scp {} masi@11.11.11:~/Desktop/
Since you are looking at scp, may I suggest you also check out rsync?
locate labra | parallel -m rsync -az {} masi@11.11.11:~/Desktop/
Typically, {} is a find-ism:
find ... -exec cmd {} \;
where {} is the current file that find is working on.
You can get xargs to behave similarly with:
locate labra | xargs -I{} echo {} more arguments
However, you'll quickly notice that it runs the command once per input line instead of making a single call to scp.
So in the context of your example:
locate labra | xargs -I{} scp '{}' masi@11.11.11:~/Desktop/
The single quotes around {} are stripped by the shell before xargs sees them; what actually handles paths with spaces is -I itself, since it passes each complete input line as a single argument.
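If the goal is a single scp invocation without GNU Parallel, command substitution is a minimal workaround, with the caveat that it breaks on paths containing whitespace:
# all matches in one scp call (assumes no spaces in the paths)
scp $(locate labra) masi@11.11.11:~/Desktop/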