Is there any way to filter BigQuery jobs by label(s)?
I created a job (query) with the label task_id:my_task:
bq query --use_legacy_sql=false --label "task_id:my_task" --project my-project 'SELECT * FROM `dataset.mytable`'
I tried the following to get all jobs with that label, but none of them worked:
bq ls -j --filter 'configuration.labels(task_id):my_task'
bq ls -j --filter 'configuration.labels.task_id:my_task'
bq ls -j --filter configuration.labels(task_id):my_task
bq ls -j --filter labels.task_id:my_task
According to the documentation for "bq ls" [1], --filter lists datasets that match the filter expression. For listing jobs, however, the documentation [2] mentions only three allowed flags: -j, -a and -n.
So there isn't a way to filter the job listing, at least through the bq command-line tool. As a workaround, you can use the following command to get all the jobs labeled "task_id:my_task":
for i in $(bq ls -j | awk 'NR>2 {print $1}'); do echo "$(bq show -j $i) $i" | awk '/task_id:my_task/ && /SUCCESS/ {print $(NF)}'; done
This command may take some time, though, so consider adding the -n flag to limit how many jobs are scanned, like this:
for i in $(bq ls -j -n 10 | awk 'NR>2 {print $1}'); do echo "$(bq show -j $i) $i" | awk '/task_id:my_task/ && /SUCCESS/ {print $(NF)}'; done
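For readability, the same workaround as a multi-line sketch (it assumes your default project is already set; -n 100 caps how many recent jobs are scanned):
#!/bin/bash
# Sketch: scan the most recent 100 jobs and print the IDs of those
# whose description contains the label task_id:my_task.
for job in $(bq ls -j -n 100 | awk 'NR>2 {print $1}'); do
    if bq show -j "$job" | grep -q 'task_id:my_task'; then
        echo "$job"
    fi
done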
You may also submit a feature request to the BigQuery issue tracker [3].
[1] https://cloud.google.com/bigquery/docs/reference/bq-cli-reference#bq_ls
[2] https://cloud.google.com/bigquery/docs/managing-jobs#listing_jobs
[3] https://issuetracker.google.com/issues/new?component=187149&template=1100108
I have this bash script which connects to a PostgreSQL db and performs a query. I would like to be able to read lines from a .txt file into the query as parameters. What is the best way to do that? Your assistance is greatly appreciated! My example code is below, but it is not working.
#!/bin/sh
query="SELECT ci.NAME_VALUE NAME_VALUE FROM certificate_identity ci WHERE ci.NAME_TYPE = 'dNSName' AND reverse(lower(ci.NAME_VALUE)) LIKE reverse(lower('%.$1'));"
(echo $1; echo $query | \
psql -t -h crt.sh -p 5432 -U guest certwatch | \
sed -e 's:^ *::g' -e 's:^*\.::g' -e '/^$/d' | \
sed -e 's:*.::g';) | sort -u
Considering that the file has only one SQL query per line:
while read -r line; do echo "${line}" | "your code to run psql here"; done < file_with_query.sql
That means: while reading the content of file_with_query.sql line by line, do something with each line.
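As a concrete sketch (the question actually wants parameters rather than whole queries, so this substitutes each line of a placeholder file params.txt into the query, reusing the crt.sh connection details from the question):
#!/bin/bash
# Sketch: read one domain per line and run the question's query for each.
while read -r domain; do
    query="SELECT ci.NAME_VALUE FROM certificate_identity ci WHERE ci.NAME_TYPE = 'dNSName' AND reverse(lower(ci.NAME_VALUE)) LIKE reverse(lower('%.${domain}'));"
    echo "$query" | psql -t -h crt.sh -p 5432 -U guest certwatch
done < params.txt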
This is a rather simple question but I haven't been able to find an answer.
I have a large number of jobs running in a cluster (>20) and I'd like to delete them all and start over.
According to this site I should be able to just do:
qdel -u netid
to get rid of them all, but in my case that returns:
qdel: invalid option -- 'u'
usage: qdel [{ -a | -c | -p | -t | -W delay | -m message}] [<JOBID>[<JOBID>]|'all'|'ALL']...
-a -c, -m, -p, -t, and -W are mutually exclusive
which obviously indicates that the command does not work.
Just to check, I did:
qstat -u <username>
and I do get a list of all my jobs, but:
qdel -u <username>
also fails.
Found the answer buried in an old supercluster.org thread:
qselect -u <username> | xargs qdel
Worked flawlessly.
Building on what Gabriel answered:
qselect -u <username> | xargs qdel
qselect -u <username> -s <state> | xargs qdel
<state> would be R for running jobs only.
qselect will also let you select jobs based on other criteria, such as requested resources (-l) or destination queue (-q).
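A sketch combining those selectors (the queue name batch is a placeholder; --no-run-if-empty avoids a qdel usage error when nothing matches):
# Delete only my running jobs in the 'batch' queue (TORQUE-style qselect).
qselect -u "$USER" -s R -q batch | xargs --no-run-if-empty qdel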
qdel -u <username>
will only work with SGE
sometimes a simple grep/cut can help too:
qstat | grep $USER | cut -d. -f1 | xargs qdel
This way we can also grep on a particular keyword for the jobs and delete them.
HTH
Try
$ qdel {id1..id2}
So for example:
$ qdel {1148613..1148650}
For UGE:
qstat -u "$USER" | gawk 'NR>2 {print $1}' | xargs qdel
# Delete all jobs owned by the current user.
#
# Command breakdown:
# ------------------
#
# qselect
# -u selects all jobs that belong to the current user
# -s EHQRTW selects all job states except for Complete
#
# xargs
# --no-run-if-empty Do not run qdel if the result set is empty
# to avoid triggering a usage error.
#
# qdel
# -a delete jobs asynchronously
#
# The backslashes are a trick to avoid matching any shell aliases.
\qselect -u $(whoami) -s EHQRTW | \xargs --no-run-if-empty \qdel -a
Another possibility is to run qdel all. It deletes all jobs from everyone; when you don't have permission to delete other people's jobs, it deletes only yours.
It is not the most beautiful solution, but it is surely the shortest!
qstat | cut -d. -f1 | sed "s; \(.*\) 0;qdel \1;" | bash
sed's power: it rewrites each job line from qstat into a qdel command, which bash then executes.
Just use the following command:
qdel all
It will cancel all jobs running on the cluster.
Is there a way to use the Slick SourceCodeGenerator to generate source code from a file of SQL CREATE statements? I know there is a way to connect to a DB and read in the schema, but I want to cut out that step and just give it the file. Please advise.
Slick reads metadata via JDBC. If you find a JDBC driver that can do that from a SQL file, you may be in luck. Otherwise, why not use an H2 in-memory database? It has compatibility modes for various SQL dialects, though they are limited. Another option would be using something like https://github.com/bgranvea/mysql2h2-converter first to produce an H2-compatible schema file.
We used the following script to load a sql schema from a mysql database, convert it to H2 compatible format and then use it in-memory for tests. You should be able to adapt it.
#!/bin/sh
echo ''
export IP=192.168.1.123
export user=foobar
export password=secret
export database=foobar
ping -c 1 $IP &&\
echo "" &&\
echo "Server is reachable"
# dump mysql schema for debuggability (ignore in git)
# convert the mysql to h2db using the converter.
## Disable the foreign key check at the beginning and re-enable it at the end to prevent foreign key errors
echo "SET FOREIGN_KEY_CHECKS=0;" > foobar-mysql.sql
## Dump the DB structure and remove AUTO_INCREMENT so the id columns start back at 1
mysqldump --compact -u $user -h $IP -d $database -p$password\
|sed 's/CONSTRAINT `_*/CONSTRAINT `/g' \
|sed 's/KEY `_*/KEY `/g' \
|sed 's/ AUTO_INCREMENT=[0-9]*//' \
>> foobar-mysql.sql
echo "SET FOREIGN_KEY_CHECKS=1;" >> foobar-mysql.sql &&\
java -jar mysql2h2-converter.jar foobar-mysql.sql \
|perl -0777 -pe 's/([^`]),/\1,\n /g' \
|perl -0777 -pe 's/\)\);/)\n);/g' \
|perl -0777 -pe 's/(CREATE TABLE [^\(]*\()/\1\n /g' \
|sed 's/UNSIGNED/unsigned/g' \
|sed 's/float/real/' \
|sed "s/\(int([0-9]*).*\) DEFAULT '\(.*\)'/\1 DEFAULT \2/" \
|sed "s/tinyint(1)/boolean/" \
> foobar-h2.sql
perl -ne 'print "$ARGV\n" if /.\z/' -- foobar-h2.sql
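Once foobar-h2.sql exists, one way to feed it to Slick is the standalone code generator pointed at an in-memory H2 database that loads the schema at connection time. This is an unverified sketch: the classpath variable, output folder, and package name are placeholders, and it assumes the generator's five-argument CLI (profile, JDBC driver, URL, output folder, package) and H2's INIT/runscript URL option fit your versions.
# Sketch: generate Slick sources from the converted schema file.
java -cp "$SLICK_CODEGEN_CLASSPATH" slick.codegen.SourceCodeGenerator \
  slick.jdbc.H2Profile \
  org.h2.Driver \
  "jdbc:h2:mem:test;MODE=MySQL;INIT=runscript from 'foobar-h2.sql'" \
  src/main/scala \
  com.example.generated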
I am new to awk and shell programming. I have a bunch of files named file_0001.dat, file_0002.dat, ..., file_1000.dat. I want to rename them so that the number after file_ becomes four times the original number. So I want to change
file_0001.dat to file_0004.dat
file_0002.dat to file_0008.dat
and so on.
Can anyone suggest a simple script to do it? I have tried the following, but without any success.
#!/bin/bash
a=$(echo $1 | sed -e 's:file_::g' -e 's:.dat::g')
b=$(echo "${a}*4" | bc)
shuf file_${a}.dat > file_${b}.dat
This script will do that trick for you:
#!/bin/bash
for i in `ls -r *.dat`; do
a=`echo $i | sed 's/file_//g' | sed 's/\.dat//g'`
almost_b=`bc -l <<< "$a*4"`
b=`printf "%04d" $almost_b`
rename "s/$a/$b/g" $i
done
Files before:
file_0001.dat file_0002.dat
Files after first execution:
file_0004.dat file_0008.dat
Files after second execution:
file_0016.dat file_0032.dat
Here's a pure bash way of doing it (without bc, rename or sed).
#!/bin/bash
for i in $(ls -r *.dat); do
prefix="${i%%_*}_"
oldnum="${i//[^0-9]/}"
newnum="$(printf "%04d" $(( 10#$oldnum * 4 )))"
mv "$i" "${prefix}${newnum}.dat"
done
To test it you can do
mkdir tmp && cd $_
touch file_{0001..1000}.dat
(paste code into convert.sh)
chmod +x convert.sh
./convert.sh
Using bash/sed/find:
files=$(find -name 'file_*.dat' | sort -r)
for file in $files; do
n=$(sed 's/[^_]*_0*\([^.]*\).*/\1/' <<< "$file")
let n*=4
nfile=$(printf "file_%04d.dat" "$n")
mv "$file" "$nfile"
done
ls -r1 | awk -F '[_.]' '{printf "%s %s_%04d.%s\n", $0, $1, 4*$2, $3}' | xargs -n2 mv
ls -r1 lists the files in reverse order, one per line, to avoid conflicts
the awk part generates the old and the new file name as a pair. For example: file_0002.dat becomes file_0002.dat file_0008.dat
xargs -n2 passes two arguments at a time to mv
This might work for you: it pairs an mv command for each of file_0001.dat through file_1000.dat with the corresponding target name (multiples of 4 up to 4000), reverse-sorts the commands so higher-numbered files are renamed first, and pipes them to sh:
paste <(seq -f'mv file_%04g.dat' 1000) <(seq -f'file_%04g.dat' 4 4 4000) |
sort -r |
sh
This can help to compute the new numbers (it reads the old numbers from a file and multiplies each by 4); a sketch that extends it to actually rename follows the loop:
#!/bin/bash
for i in `cat /path/to/requestedfiles |grep -o '[0-9]*'`; do
count=`bc -l <<< "$i*4"`
echo $count
done
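As promised, the renaming sketch (naming scheme from the question; reverse numeric order so a new name never collides with a file that has not been renamed yet):
#!/bin/bash
# Sketch: for each old number (highest first), compute new = old*4 and rename.
for i in $(grep -o '[0-9]*' /path/to/requestedfiles | sort -rn); do
    new=$(printf "%04d" $((10#$i * 4)))
    mv "file_${i}.dat" "file_${new}.dat"
done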
I'm trying to create a shell script to get the server stats of 100 servers and load the details into a table. Initially I'm creating a parameter file which has the list of all the servers; then I'm connecting to these servers through ssh and running df -k. ssh keys are already set up.
The issue I'm facing is that I'm not able to associate the server name with the result; I want the server name added as a column to the df -k output.
Also, the output cannot be loaded into a table as it has no proper delimiter, tab, or space formatting. I have tried sed and various other options, but no luck.
#!/bin/ksh
PARMFILE=/opt/sdw/scripts/db_scripts/server_stats.txt
value=$(<server_list1.txt)
echo "$value"
sourceservers=`grep = /opt/sdw/scripts/db_scripts/server_stats.txt | cut -d= -f2`
#Input array passed as parameter file to the script
set -A array_value $value
vLen=${#array_value[@]}
echo $vLen
for(( j=0; j<$vLen; j++))
do
#echo "${array_value[$j]}"
#ssh -q "${array_value[$j]}"; df -k
ssh -q "${array_value[$j]}" 'df -h' >> df.out
ssh -q "${array_value[$j]}" df -h | column -t >> df1.out
ssh -q "${array_value[$j]}" df -k | tr -s " " | sed 's/ /, /g' | sed '1 s/, / /g' | column -t >> df3.out
[[ ! $? = 0 ]] && echo Failure, errno $?, cannot connect to host "${array_value[$j]}" >> sshfailed.list
done
Output
Filesystem Size Used Avail Use%
Mounted on
/dev/mapper/vg00-lvol3 1.5G 434M 923M 32%
Desired output
Filesystem, Size, Used , Avail, Use%, Mounted on, Servername
/dev/mapper/vg00-lvol3, 1.5G, 434M, 923M, 32%, / br724
put "${array_value[$j]}" in a variable like $server for better readability.
then in sed, do a substitution like sed "s/Mounted on/Mounted on $server/g"
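A minimal sketch of that idea, applied to the df3.out pipeline from the question (the final two sed stages, which add the Servername column, are the addition):
# Sketch: capture the server name once, extend the header,
# and append the name as a last column on every data line.
server="${array_value[$j]}"
ssh -q "$server" df -k | tr -s " " | sed 's/ /, /g' | sed '1 s/, / /g' \
    | sed '1 s/$/, Servername/' \
    | sed "2,\$ s/\$/, $server/" >> df3.out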