making scripts for specific output from the mysql slow log - awk

I want to find a particular entry in the file and copy the lines that follow it, up to the next # line.
That is, I have a MySQL slow query log like the one below, and from it I want to select the query logged at a given date and time, up to the next # line.
Please advise.
# Time: 161205 10:27:39
# localhost []
# Query_time: 5.517501 Lock_time: 0.034388 Rows_sent: 50 Rows_examined: 27061434
SET timestamp=1480913859;
SELECT ,NULL,NULL,(SELECT
GROUP_CONCAT(project_master_name)
FROM
project_inquiry_detail pid,project_master pm
WHERE
order by T.InquiryDate desc , TL.rowid desc limit 0,50;
# Time: 161205 14:53:50

Is that what you are looking for?
sed -n -r -e '/^SELECT /p; /^(SELECT |#)/!{p};' mysql.log

This one works for me
sed -n -r '/^SELECT/,/^#/p' slow.log
The logic is simply "don't print anything unless told so" (the -n switch) and print only lines between (and including) a SELECT at the beginning and a # at the beginning of the line.
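Since the question is about one specific date and time, an awk variant may be closer to what was asked. This is only a sketch: the function name slow_entry is made up here, and it assumes every entry of interest starts with a "# Time:" header line as in the sample (some MySQL versions write that header only when the timestamp changes, in which case all entries sharing the timestamp are printed).

```shell
# Print the slow-log lines that belong to one "# Time:" header:
# from the header containing the given timestamp up to, but not
# including, the next "# Time:" header.
# $1 = timestamp text to look for, $2 = log file
slow_entry() {
    awk -v ts="$1" '
        /^# Time: / { keep = (index($0, ts) > 0) }  # each header switches printing on or off
        keep                                        # print lines while switched on
    ' "$2"
}
```

For example, slow_entry "161205 10:27:39" slow.log would print the header, the Query_time line, and the SQL statement of that entry.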

Related

Store my "Sybase" query result /output into a script variable

I need a variable to hold the result of a (Sybase) query that is run from a script.
I have built the following script; it works fine, and I get the desired result when I run it.
Script: EXECUTE_DAILY:
isql -U database_dba -P password <<EOF!
select the_name from table_name where m_num="NUMB912" and date="17/01/2019"
go
quit
EOF!
echo "All Done"
Output:
"EXECUTE_DAILY" 97 lines, 293 characters
user@zp01$ ./EXECUTE_DAILY
the_name
-----------------------------------
NAME912
(1 row affected)
But now I would like to keep the output (the_name: NAME912) in a variable.
So far this is basically what I'm trying, with no success:
variable=$(isql -U database_dba -P password -se "select the_name from table_name where m_num="NUMB912" and date="17/01/2019" ")
But it is not working: I can't save NAME912 in a variable.
You need to parse the output for the desired string/piece-of-data that you wish to store in your variable. I tend to make my life a bit easier by making sure I can easily/quickly search/parse out what I want.
Keeping a few issues in mind ...
I tend to use isql -s"|" -w10000 to ensure (most of the time) that a) all columns in the result set are delimited with a pipe ('|') and b) a single row of data does not wrap across multiple lines. The pipe delimiter makes it easier to parse out columns that may contain white space; use a different delimiter if a pipe may appear in your actual data
to make parsing of the isql output a bit easier I tend to add a unique, grep-able (literal) string to the rows that I'm looking to search/parse
some databases (eg, SQLAnywhere, Oracle) tend to mimic a literal value as the column header if said literal string has not been assigned an explicit alias/header; this means that if you do a simple search on your literal string then you'll get a match for the result set header as well as the actual data row
I tend to capture all isql output to a temporary file; this allows for easier follow-on processing, eg, error checking, data parsing, dumping contents to a logfile, etc
So, with the above in mind my code typically looks something like:
$ outfile=/tmp/.$$.isql.outfile
$ isql -s"|" -w10000 -U database_dba -P password <<-EOF > ${outfile} 2>&1
-- 'GREP'||'ME' ensures that 'GREPME' only shows up in the data row
select 'GREP'||'ME',the_name
from table_name
where m_num = "NUMB912"
and date = "17/01/2019"
go
EOF
$ cat ${outfile}
... snip ...
|'GREP'||'ME'|the_name | # notice the default column header = 'GREP'||'ME' which won't match my search for 'GREPME'
|------------|----------|
|GREPME |NAME912 | # this is the line I want to search/parse
... snip ...
$ read -r namevar < <(egrep GREPME ${outfile} | awk -F"|" '{print $3}')
$ echo ${namevar}
NAME912
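The grep-and-awk step can also be folded into a single awk call that trims the column padding. A minimal sketch, assuming the output file has the pipe-delimited layout shown above; the function name extract_name is only illustrative:

```shell
# Pull the second data column out of the row tagged with GREPME.
# $1 = file holding the captured isql output
extract_name() {
    awk -F'|' '/GREPME/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $3)   # trim padding around the value
        print $3
        exit                              # stop after the first matching row
    }' "$1"
}
```

It could then be used as namevar=$(extract_name "${outfile}"). The header row does not match because it contains 'GREP'||'ME' rather than the contiguous string GREPME.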

Appending the datetime to the end of every line in a 600 million row file

I have a 680 million row (19 GB) file, and I need the datetime appended to every line. I get this file every night, and I have to add the time that I processed it to the end of each line. I have tried many ways to do this, including sed/awk and loading it into a SQL database with the last column defaulting to the current timestamp.
Is there a fast way to do this? My fastest way so far takes two hours, and that is just not fast enough given the urgency of the information in this file. It is a flat CSV file.
edit1:
Here's what I've done so far:
awk -v date="$(date +"%Y-%m-%d %r")" '{ print $0","date}' lrn.ae.txt > testoutput.txt
Time = 117 minutes
perl -ne 'chomp; printf "%s.pdf\n", $_' EXPORT.txt > testoutput.txt
Time = 135 minutes
mysql load data local infile '/tmp/input.txt' into table testoutput
Time = 211 minutes
You don't specify if the timestamps have to be different for each of the lines. Would a "start of processing" time be enough?
If so, a simple solution is to use the paste command, with a pre-generated file of timestamps, exactly the same length as the file you're processing. Then just paste the whole thing together. Also, if the whole process is I/O bound, as others are speculating, then maybe running this on a box with an SSD drive would help speed up the process.
I just tried it locally on a 6 million row file (roughly 1% of yours), and it's actually able to do it in less than one second on a MacBook Pro with an SSD.
~> date; time paste file1.txt timestamps.txt > final.txt; date
Mon Jun 5 10:57:49 MDT 2017
real 0m0.944s
user 0m0.680s
sys 0m0.222s
Mon Jun 5 10:57:49 MDT 2017
I'm going to now try a ~500 million row file, and see how that fares.
Updated:
Ok, the results are in. Paste is blazing fast compared to your solution: it took just over 90 seconds in total to process the whole thing, 600M rows of simple data.
~> wc -l huge.txt
600000000 huge.txt
~> wc -l hugetimestamps.txt
600000000 hugetimestamps.txt
~> date; time paste huge.txt hugetimestamps.txt > final.txt; date
Mon Jun 5 11:09:11 MDT 2017
real 1m35.652s
user 1m8.352s
sys 0m22.643s
Mon Jun 5 11:10:47 MDT 2017
You still need to prepare the timestamps file ahead of time, but that's a trivial bash loop. I created mine in less than one minute.
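The timestamps file does not strictly need a loop. Here is a minimal sketch, assuming a single "start of processing" timestamp is acceptable for every line; the function name make_stamps and the chosen date format are illustrative, not taken from the answer:

```shell
# Write one copy of the same timestamp per line of the input file.
# $1 = data file, $2 = timestamps file to create
make_stamps() {
    stamp=$(date +"%Y-%m-%d %H:%M:%S")   # computed once, reused for every line
    lines=$(wc -l < "$1")
    # $((lines)) strips the padding that some wc implementations add
    yes "$stamp" | head -n "$((lines))" > "$2"
}
```

Something like make_stamps huge.txt hugetimestamps.txt would then produce the second input for paste. Note that paste joins with a tab by default, so for a CSV file paste -d',' is probably what is wanted.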
A solution that simplifies mjuarez' helpful approach:
yes "$(date +"%Y-%m-%d %r")" | paste -d',' file - | head -n "$(wc -l < file)" > out-file
Note that, as with the approach in the linked answer, you must know the number of input lines in advance - here I'm using wc -l to count them, but if the number is fixed, simply use that fixed number.
yes keeps repeating its argument indefinitely, each on its own output line, until it is terminated.
paste -d',' file - pastes a corresponding pair of lines from file and stdin (-) on a single output line, separated with ,
Since yes produces "endless" output, head -n "$(wc -l < file)" ensures that processing stops once all input lines have been processed.
The use of a pipeline acts as a memory throttle, so running out of memory shouldn't be a concern.
Another alternative to test is
$ date +"%Y-%m-%d %r" > timestamp
$ join -t, -j9999 file timestamp | cut -d, -f2-
The timestamp can also be generated in place with process substitution, <(date +"%Y-%m-%d %r"), instead of a file.
join creates a cross product of the first and second files by joining on a non-existent field (9999). Since the second file has only one line, this effectively appends that line to every line of the first file. The cut is needed to get rid of the empty key field that join puts at the start of each line.
If you want to add the same (current) datetime to each row in the file, you might as well leave the file as it is, and put the datetime in the filename instead. Depending on the use later, the software that processes the file could then first get the datetime from the filename.
To put the same datetime at the end of each row, some simple code could be written:
Make a string containing a separator and the datetime.
Read the lines from the file, append the above string and write back to a new file.
This way a conversion from datetime to string is only done once, and converting the file should not take much longer than copying the file on disk.

Query values into a variable in shell script

I have a query which gives me this result:
app_no
--------
(0 rows)
I need only the rows part, and of that just the number. I am saving the result into a variable, but I am not able to parse it.
napp=`psql -U postgres appdb -c "select appno from app.apps where properties&2048=1024"`
cap=$(echo "$napp"|sed -n 's/[0-9][0-9] rows/\1/p')
echo "$cap"
I just need the number of rows, and only the number.
If you need the number of appno entries that match, then you should probably use:
SELECT COUNT(*) FROM app.apps WHERE properties & 2048 = 1024
but the answer will always be 0 because the condition is always going to give 0 or false. You need the same bit twice, either both 1024 or both 2048.
SELECT COUNT(*) FROM app.apps WHERE properties & 1024 = 1024
SELECT COUNT(*) FROM app.apps WHERE properties & 2048 = 2048
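The bit arithmetic behind this can be checked quickly in the shell. This is only an illustration with made-up sample values; the helper name has_bits is hypothetical:

```shell
# Succeed if every bit of mask $2 is set in value $1.
# The parentheses matter in shell arithmetic, where == binds tighter than &.
has_bits() {
    [ "$(( ($1 & $2) == $2 ))" -eq 1 ]
}

# (value & 2048) is either 0 or 2048, so it can never equal 1024.
for value in 1024 2048 3072; do
    if [ "$(( (value & 2048) == 1024 ))" -eq 1 ]; then
        echo "$value: mixed-bit test matched"
    else
        echo "$value: mixed-bit test did not match"
    fi
    if has_bits "$value" 1024; then
        echo "$value: bit 1024 is set"
    fi
done
```

The mixed-bit branch never matches, whatever the value, which is why the original condition always yields zero rows.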
SQL interfaces that insist on headings and summaries are a nuisance when shell scripting. However, the psql manual suggests that -q and -t may help (with -A too, perhaps):
-A or --no-align
Switches to unaligned output mode. (The default output mode is otherwise aligned.)
-q or --quiet
Specifies that psql should do its work quietly. By default, it prints welcome messages and various informational output. If this option is used, none of this happens. This is useful with the -c option. Within psql you can also set the QUIET variable to achieve the same effect.
-t or --tuples-only
Turn off printing of column names and result row count footers, etc. This is equivalent to the \t command.
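Putting the three options together with the COUNT(*) query might look like the sketch below. The function name count_apps is hypothetical, the user and database names are the ones from the question, and the 1024 mask is just one of the two corrected conditions:

```shell
# Print only the number of matching rows: no header, no footer, no alignment.
count_apps() {
    psql -U postgres -qtA -c \
        "SELECT COUNT(*) FROM app.apps WHERE properties & 1024 = 1024" appdb
}
```

With that in place, napp=$(count_apps) should leave just the number in the variable, with no sed post-processing needed.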
If you want to parse the string as-is:
napp=$(psql -U postgres appdb -c "
select appno from app.apps
where properties&2048=1024;"
)
cap=$(echo "$napp" | sed -nr 's/.*\(([0-9]+) rows.*/\1/p')
echo "$cap"
But Jonathan Leffler's solution is the better one.

How to delete last row in output file generated by nzsql

I am trying to delete the last row in the file generated by nzsql. Here is the command:
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" > abc.out
When I execute this, the output is generated and stored in abc.out. It includes both the header columns and some time information at the bottom. I don't need the bottom metadata and want to keep only my header columns. How can I do this using only nzsql? Please help. Thanks in advance.
Use the -r flag in the nzsql command to avoid getting that row (assuming the metadata referred to in the question is the row count summary line, e.g. (3 rows)):
-r Suppresses the row count that is displayed at the end of the SQL output.
reference: http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.adm.doc%2Fr_sysadm_nzsql_command.html
Why don't you just pipe the output to a unix command to remove it? I think something like this will work:
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" | sed '$d' > abc.out
Seems to be a recommended solution for getting rid of the last line (although ed, gawk, and other tools can handle it).
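One caveat with sed '$d' is that it deletes the last line whatever it contains. A slightly more defensive sketch deletes it only when it looks like a row-count footer. The footer format "(3 rows)" or "(1 row)" is an assumption based on the example given in the other answer, and the function name strip_rowcount is made up:

```shell
# Read stdin, write stdout; drop the final line only if it is a
# row-count footer such as "(3 rows)" or "(1 row)".
strip_rowcount() {
    sed '${
        /^([0-9][0-9]* rows*)$/d
    }'
}
```

It would be used in the same place as the plain sed call, for example nzsql -A -c "..." | strip_rowcount > abc.out.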

Nano hacks: most useful tiny programs you've coded or come across

It's the first great virtue of programmers. All of us have, at one time or another automated a task with a bit of throw-away code. Sometimes it takes a couple seconds tapping out a one-liner, sometimes we spend an exorbitant amount of time automating away a two-second task and then never use it again.
What tiny hack have you found useful enough to reuse? To go so far as to make an alias for?
Note: before answering, please check to make sure it's not already on favourite command-line tricks using BASH or perl/ruby one-liner questions.
I found this on dotfiles.org just today. It's very simple, but clever. I felt stupid for not having thought of it myself.
###
### Handy Extract Program
###
extract () {
    # Quote "$1" throughout so file names containing spaces still work
    if [ -f "$1" ] ; then
        case "$1" in
            *.tar.bz2) tar xvjf "$1" ;;
            *.tar.gz) tar xvzf "$1" ;;
            *.bz2) bunzip2 "$1" ;;
            *.rar) unrar x "$1" ;;
            *.gz) gunzip "$1" ;;
            *.tar) tar xvf "$1" ;;
            *.tbz2) tar xvjf "$1" ;;
            *.tgz) tar xvzf "$1" ;;
            *.zip) unzip "$1" ;;
            *.Z) uncompress "$1" ;;
            *.7z) 7z x "$1" ;;
            *) echo "'$1' cannot be extracted via >extract<" ;;
        esac
    else
        echo "'$1' is not a valid file"
    fi
}
Here's a filter that puts commas in the middle of any large numbers in standard input.
$ cat ~/bin/comma
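#!/usr/bin/perl -p
# Replace every run of four or more digits with its comma-separated form
s/(\d{4,})/commify($1)/ge;
sub commify {
    local $_ = shift;
    # Repeatedly split off the last three digits until none are left
    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $_;
}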
I usually wind up using it for long output lists of big numbers, and I tire of counting decimal places. Now instead of seeing
-rw-r--r-- 1 alester alester 2244487404 Oct 6 15:38 listdetail.sql
I can run that as ls -l | comma and see
-rw-r--r-- 1 alester alester 2,244,487,404 Oct 6 15:38 listdetail.sql
This script saved my career!
Quite a few years ago, I was working remotely on a client database. I updated a shipment to change its status. But I forgot the where clause.
I'll never forget the feeling in the pit of my stomach when I saw (6834 rows affected). I basically spent the entire night going through event logs and figuring out the proper status on all those shipments. Crap!
So I wrote a script (originally in awk) that would start a transaction for any updates, and check the rows affected before committing. This prevented any surprises.
So now I never do updates from command line without going through a script like this. Here it is (now in Python):
import sys
import subprocess as sp

pgm = "isql"
if len(sys.argv) == 1:
    print("Usage: \nsql sql-string [rows-affected]")
    sys.exit()
sql_str = sys.argv[1].upper()
# Largest number of affected rows that is committed without complaint
max_rows_affected = 3
if len(sys.argv) > 2:
    max_rows_affected = int(sys.argv[2])
if sql_str.startswith("UPDATE"):
    # Wrap updates in a transaction so they can still be rolled back
    sql_str = "BEGIN TRANSACTION\\n" + sql_str
    p1 = sp.Popen([pgm, sql_str], stdout=sp.PIPE,
                  shell=True, universal_newlines=True)
    (stdout, stderr) = p1.communicate()
    print(stdout)
    # example -> (33 rows affected)
    affected = stdout.splitlines()[-1]
    affected = affected.split()[0].lstrip('(')
    num_affected = int(affected)
    if num_affected > max_rows_affected:
        print("WARNING! ", num_affected, "rows were affected, rolling back...")
        sql_str = "ROLLBACK TRANSACTION"
        ret_code = sp.call([pgm, sql_str], shell=True)
    else:
        sql_str = "COMMIT TRANSACTION"
        ret_code = sp.call([pgm, sql_str], shell=True)
else:
    ret_code = sp.call([pgm, sql_str], shell=True)
I use this script under assorted Linuxes to check whether a directory copy between machines (or to CD/DVD) worked, or whether copying (e.g. ext3 utf8 filenames -> fuseblk) has mangled special characters in the filenames.
#!/bin/bash
## dsum Do checksums recursively over a directory.
## Typical usage: dsum <directory> > outfile
export LC_ALL=C # Optional - use the same sort order across different locales
if [ $# != 1 ]; then echo "Usage: ${0/*\//} <directory>" 1>&2; exit 1; fi
cd "$1" 1>&2 || exit
#findargs=-follow # Uncomment to follow symbolic links
find . $findargs -type f | sort | xargs -d'\n' cksum
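The same idea can be wrapped into a direct comparison of two trees instead of diffing two output files by hand. This is a sketch, not the author's script: tree_sums and same_tree are made-up names, and find -exec ... + is used instead of GNU-only xargs -d.

```shell
# Checksum every regular file under a directory, sorted by path.
# Fails if the directory cannot be entered.
tree_sums() {
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort -k3 )
}

# Succeed only if both trees contain the same paths with the same checksums.
same_tree() {
    a=$(tree_sums "$1") && b=$(tree_sums "$2") && [ "$a" = "$b" ]
}
```

For example: same_tree /data/src /mnt/copy && echo "copy verified". Because the paths are part of the compared output, a mangled filename shows up as a mismatch just like changed content does.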
Sorry, don't have the exact code handy, but I coded a regular expression for searching source code in VS.Net that allowed me to search anything not in comments. It came in very useful in a particular project I was working on, where people insisted that commenting out code was good practice, in case you wanted to go back and see what the code used to do.
I have two ruby scripts that I modify regularly to download all of various webcomics. Extremely handy! Note: They require wget, so probably linux. Note2: read these before you try them, they need a little bit of modification for each site.
Date based downloader:
#!/usr/bin/ruby -w
Day = 60 * 60 * 24
Format = "hjlsdahjsd/comics/st%Y%m%d.gif"
t = Time.local(2005, 2, 5)
MWF = [1,3,5]
until t == Time.local(2007, 7, 9)
  # Only fetch on Monday, Wednesday and Friday
  if MWF.include? t.wday
    `wget #{t.strftime(Format)}`
    sleep 3
  end
  t += Day
end
Or you can use the number based one:
#!/usr/bin/ruby -w
Format = "http://fdsafdsa/comics/%08d.gif"
1.upto(986) do |i|
  `wget #{sprintf(Format, i)}`
  sleep 1
end
Instead of having to repeatedly open files in SQL Query Analyser and run them, I found the syntax needed to make a batch file, and could then run 100 at once. Oh the sweet sweet joy! I've used this ever since.
isqlw -S servername -d dbname -E -i F:\blah\whatever.sql -o F:\results.txt
This goes back to my COBOL days, but I had two generic COBOL programs, one batch and one online (mainframe folks will know what these are). They were shells of a program that could take any set of parameters and/or files and be run as a batch job or executed in an IMS test region. I had them set up so that, depending on the parameters, I could access files and databases (DB2 or IMS DB), and/or just manipulate working storage or whatever.
It was great because I could test that date function without guessing, or test why there was truncation or why there was a database ABEND. The programs grew in size as time went on to include all sorts of tests, and became a staple of the development group. Everyone knew where the code resided and included them in their unit testing as well. Those programs got so large (most of the code was commented-out tests), and it was all contributed by people through the years. They saved so much time and settled so many disagreements!
I coded a Perl script to map dependencies, without going into an endless loop, for a legacy C program I inherited that also had a diamond dependency problem.
I wrote a small program that e-mailed me when I received e-mails from friends on a rarely used e-mail account.
I wrote another small program that sent me text messages if my home IP changed.
To name a few.
Years ago I built a suite of applications on a custom web application platform in Perl.
One cool feature was to convert SQL query strings into human readable sentences that described what the results were.
The code was relatively short but the end effect was nice.
I've got a little app that you run and it dumps a GUID into the clipboard. You can run it /noui or not. With the UI, it's a single button that drops a new GUID every time you click it. Without it, it drops a new one and then exits.
I mostly use it from within VS. I have it as an external app and mapped to a shortcut. I'm writing an app that relies heavily on xaml and guids, so I always find I need to paste a new guid into xaml...
Any time I write a clever list comprehension or use of map/reduce in python. There was one like this:
if reduce(lambda x, c: locks[x] and c, locknames, True):
    print("Sub-threads terminated!")
The reason I remember that is that I came up with it myself, then saw the exact same code on somebody else's website. Nowadays it'd probably be done like:
if all(map(lambda z: locks[z], locknames)):
    print("ya trik")
I've got 20 or 30 of these things lying around, because once I coded up the framework for my standard console app in Windows I could pretty much drop in any logic I wanted, so I've got a lot of these little things that solve specific problems.
I guess the ones I'm using a lot right now are these two. One is a console app that takes stdin and colorizes the output based on XML profiles that match regular expressions to colors; I use it for watching my log files from builds. The other one is a command line launcher, so I don't pollute my PATH env var, which would exceed the limit on some systems anyway, namely win2k.
I'm constantly connecting to various linux servers from my own desktop throughout my workday, so I created a few aliases that will launch an xterm on those machines and set the title, background color, and other tweaks:
alias x="xterm" # local
alias xd="ssh -Xf me@development_host xterm -bg aliceblue -ls -sb -bc -geometry 100x30 -title Development"
alias xp="ssh -Xf me@production_host xterm -bg thistle1 ..."
I have a bunch of servers I frequently connect to, as well, but they're all on my local network. This Ruby script prints out the command to create aliases for any machine with ssh open:
#!/usr/bin/env ruby
require 'rubygems'
require 'dnssd'
handle = DNSSD.browse('_ssh._tcp') do |reply|
  print "alias #{reply.name}='ssh #{reply.name}.#{reply.domain}';"
end
sleep 1
handle.stop
Use it like this in your .bash_profile:
eval `ruby ~/.alias_shares`