I want to generate some trace file from disk IO, but the problem is I need the actual input data along with timestamp, logical address and access block size, etc.
I've been trying to solve the problem by using the "blktrace | blkparse" with "iozone" on the ubuntu VirtualBox environment, but it seems not working.
There is an option in blkparse for setting the output format to show the packet data, -f "%P", but it dose not print anything.
below is the command that i use:
$> sudo blktrace -a issue -d /dev/sda -o - | blkparse -i - -o ./temp/blktrace.sda.iozone -f "%-12C\t\t%p\t%d\t%S:%n:%N\t\t%P\n"
$> iozone -w -e -s 16M -f ./mnt/iozone.dummy -i 0
In the printing format "%-12C\t\t%p\t%d\t%S:%n:%N\t\t%P\n", all other things are printed well, but the "%P" is not printed at all.
Is there anyone who knows why the packet data is not displayed?
OR anyone who knows other way to get the disk IO packet data with actual input value?
As far as I know blktrace does not capture the actual data. It just capture the metadata. One way to capture real data is to write your own kernel module. Some students at FIU.edu did that in this paper:
"I/O deduplication: Utilizing content similarity to ..."
I would ask this question in linux-btrace mailing list as well:
http://vger.kernel.org/majordomo-info.html
While trying to convert a multipage document from a tiff to a pdf, I encountered the following problem:
↪ tiff2pdf 0271.f1.tiff -o 0271.f1.pdf
tiff2pdf: No support for 0271.f1.tiff with no photometric interpretation tag.
tiff2pdf: An error occurred creating output PDF file.
Does anybody know what causes this and how to fix it?
This is caused because one or more of the pages in the multi-page tiff does not have the photometric interpretation tag set. This is a required tag, so that means your tiffs are technically invalid (though I bet they work fine anyway).
To fix this, you must identify the page (or pages) that does not have the photometric interpretation set and fix it.
To identify the page, you can simply run something like:
↪ tiffinfo your-file.tiff
This will spit out the info for every page of your tiff. For each good page, you'll see something like:
TIFF Directory at offset 0x105c0 (67008)
Subfile Type: (0 = 0x0)
Image Width: 1760 Image Length: 2639
Resolution: 300, 300 pixels/inch
Bits/Sample: 1
Compression Scheme: CCITT Group 4
**Photometric Interpretation: min-is-white**
FillOrder: msb-to-lsb
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 2639
Planar Configuration: single image plane
Software: ScanFix(TM) Enhanced ImageGear Version: 11.00.024
DateTime: Mon Oct 31 15:11:07 2005
Artist: 1996-2001 AccuSoft Co., All rights reserved
If you have a bad page, it'll lack the photometric interpretation section, and you can fix it with:
↪ tiffset -d $page-number -s 262 0 your-file.tiff
Note that the value of zero is the default for the photometric interpretation key, which is 262. You can see the other values for this key at the link above.
If your tiff has a lot of pages (like mine does), you may not be able to easily identify the bad page by eye. In that case, you can take a brute force approach, setting the photometric interpretation for all pages to the default value.
# First, split the tiff into many one-page files
↪ tiffsplit your-file.tiff
# Then, set the photometric interpretation to the default for all pages
↪ find . -name '*.tiff' -exec tiffset -s 262 0 '{}' \;
# Then rejoin the pages
↪ tiffcp *.tiff -o out-file.tiff
Lot of dummy work, but gets the job done.
I have a system that generates large quantities of PostScript files that each contain multiple, multi-page documents. I want to write a script that takes these large PostScript documents and outputs multiple PDF documents from each.
For example one postscript file contains 200 letters to customers, each of which is 10 pages long. This postscript file contains 2000 pages. I want to output from this 1 ps document, 200x 10 page PDFs, one for each customer.
I'm thinking GhostScript is the way to go for this level of document manipulation but I'm not sure the best way to go - Is there a function in GhostScript to take 'pages 1-10' of the input ps file? Do I have to output the entire ps file as 2000 separate ps files (1 per page) then combine them back together again?
Or is there a much simpler way of acheiving my goal with something other than GhostScript?
Many Thanks,
Ben
Technically this will be possible in the next release of Ghostscript, or using the HEAD code in the Git repository. It is now possible to switch devices when using pdfwrite which will cause the device to close and complete the current PDF file. Switching back again will start a new one.
Combine this with a BeginPage and/or EndPage procedure in the page device dictionary, and you should be able to do something like what you want.
Caveat; I haven't tried any of this, and it will take some PostScript programming to get it to work.
Because of the nature of PostScript, there is no way to extract the 'N'th page from a file, so there is no way to specify a range of pages.
As lsemi suggests you could first convert to one large PDF file and then extract the ranges you want. Ghostscript is able to use the FirstPage and LastPage switches to do this (unlike PostScript, it is possible to extract a specific page from a PDF file).
Well, you might first make the PS into a PDF object collection (or directly generate a PDF from GhostScript by printing to the PDFWriter device), and then "cut" from the big PDF using pdftk, which would be quite fast.
Create the complete PDF file first with the help of Ghostscript:
gs \
-o 2000p.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
2000p.ps
Use pdftk to extract PDF files with 10 pages each:
for i in $(seq 0 10 199); do \
export start=$(( ${i} * 1 + 1 )); \
export end=$(( ${start} + 9 )); \
pdftk \
2000p.pdf \
cat ${start}-${end} \
output pages---${start}..${end}.pdf; \
done
You can have Ghostscript generate a 2000page sample+test PDF for you by first creating a sample PostScript file named '2000p.ps' with these contents:
%!PS
/H1 {/Helvetica findfont 48 scalefont setfont .2 .2 1 setrgbcolor} def
/pageframe {1 0 0 setrgbcolor 2 setlinewidth 10 10 575 822 rectstroke} def
/gopageno {H1 300 700 moveto } def
1 1 2000 {pageframe gopageno
4 string cvs
dup stringwidth pop
-1 mul 0 rmoveto
show
showpage} for
and then run this command:
gs -o 2000p.pdf -sDEVICE=pdfwrite -g5950x8420 2000p.ps
Hallo.
I have a big video file. ffmpeg, tcprobe and other tool say, it is an h264-stream in an AVI-container.
Now i'd like to cut out small chunks form the video.
Problem: The index of the video seam corrupted/destroyed. I kind of fixed this via mplayer -forceidx -saveidx <IndexFile> <BigVideoFile>. The Problem here is, that I'm now stuck with mplayer/mencoder which can use this index file via -loadidx <IndexFile>. I have tried correcting the index like described in man aviindex (mplayer -frames 0 -saveidx mpidx broken.avi ; aviindex -i mpidx -o tcindex ; avimerge -x tcindex -i broken.avi -o fixed.avi), but this didn't fix my video - meaning that most tools i've tested couldn't search in the video file.
Problem: I cut out parts of the video via following command: mencoder -loadidx in.idx -ss 8578 -endpos 20 -oac faac -ovc x264 -sws 9 -lavfopts format=mp4 -x264encopts <LotsOfOpts> -of lavf -vf scale=800:-10,harddup in.avi -o out.mp4. Now here the problem is, that some videos are corrupted at the beginning. I think this is because the fact, that i do not necessarily cut at keyframe.
Questions:
What is the best way to fix the index of an avi "inline" so that every tool can again work as expected with it?
How can i split at the keyframes? Is there an mencoder-option for this?
Are Keyframes coming in a frequency? How to find out this frequency? (So with a bit of math it should be possible to calculate the next keyframe and cut there)
Is ther perhaps some completely other way to split this movie? Doing it by hand is no option, i've to cut out 1000+ chunks ...
Thanks a lot!
https://spreadsheets.google.com/ccc?key=0AjWmZ0umsuZHdHNzZVhuMTkxTHdYbUdCQzF3cE51Snc&hl=en lists the various options available for splitting accurately.
I would attempt to use avidemux to repair the file before doing anything. You may also have better results using an MP4 Based Container than AVI.
As for ensuring your specified intervals are right at the keyframe, I would suggest encoding with FFMPEG with the: -g 1 option before using the split below to ensure every frame is in fact a keyframe. FFMPEG refers to keyframs as GOP or Groups of Pictures instead.
ffmpeg -i input.avi -g 1 -vcodec copy -acodec copy out.avi
Then multiple splits (with FFMPEG) :
ffmpeg -i input.avi -ss 00:00:10 -t 00:00:30 out1.avi -ss 00:00:35 -t 00:00:30 out2.avi
Some more options to try:
x264 Mapping FFMPEG encoding in linux
It's the first great virtue of programmers. All of us have, at one time or another automated a task with a bit of throw-away code. Sometimes it takes a couple seconds tapping out a one-liner, sometimes we spend an exorbitant amount of time automating away a two-second task and then never use it again.
What tiny hack have you found useful enough to reuse? To make go so far as to make an alias for?
Note: before answering, please check to make sure it's not already on favourite command-line tricks using BASH or perl/ruby one-liner questions.
i found this on dotfiles.org just today. it's very simple, but clever. i felt stupid for not having thought of it myself.
###
### Handy Extract Program
###
extract () {
if [ -f $1 ] ; then
case $1 in
*.tar.bz2) tar xvjf $1 ;;
*.tar.gz) tar xvzf $1 ;;
*.bz2) bunzip2 $1 ;;
*.rar) unrar x $1 ;;
*.gz) gunzip $1 ;;
*.tar) tar xvf $1 ;;
*.tbz2) tar xvjf $1 ;;
*.tgz) tar xvzf $1 ;;
*.zip) unzip $1 ;;
*.Z) uncompress $1 ;;
*.7z) 7z x $1 ;;
*) echo "'$1' cannot be extracted via >extract<" ;;
esac
else
echo "'$1' is not a valid file"
fi
}
Here's a filter that puts commas in the middle of any large numbers in standard input.
$ cat ~/bin/comma
#!/usr/bin/perl -p
s/(\d{4,})/commify($1)/ge;
sub commify {
local $_ = shift;
1 while s/^([ -+]?\d+)(\d{3})/$1,$2/;
return $_;
}
I usually wind up using it for long output lists of big numbers, and I tire of counting decimal places. Now instead of seeing
-rw-r--r-- 1 alester alester 2244487404 Oct 6 15:38 listdetail.sql
I can run that as ls -l | comma and see
-rw-r--r-- 1 alester alester 2,244,487,404 Oct 6 15:38 listdetail.sql
This script saved my career!
Quite a few years ago, i was working remotely on a client database. I updated a shipment to change its status. But I forgot the where clause.
I'll never forget the feeling in the pit of my stomach when I saw (6834 rows affected). I basically spent the entire night going through event logs and figuring out the proper status on all those shipments. Crap!
So I wrote a script (originally in awk) that would start a transaction for any updates, and check the rows affected before committing. This prevented any surprises.
So now I never do updates from command line without going through a script like this. Here it is (now in Python):
import sys
import subprocess as sp
pgm = "isql"
if len(sys.argv) == 1:
print "Usage: \nsql sql-string [rows-affected]"
sys.exit()
sql_str = sys.argv[1].upper()
max_rows_affected = 3
if len(sys.argv) > 2:
max_rows_affected = int(sys.argv[2])
if sql_str.startswith("UPDATE"):
sql_str = "BEGIN TRANSACTION\\n" + sql_str
p1 = sp.Popen([pgm, sql_str],stdout=sp.PIPE,
shell=True)
(stdout, stderr) = p1.communicate()
print stdout
# example -> (33 rows affected)
affected = stdout.splitlines()[-1]
affected = affected.split()[0].lstrip('(')
num_affected = int(affected)
if num_affected > max_rows_affected:
print "WARNING! ", num_affected,"rows were affected, rolling back..."
sql_str = "ROLLBACK TRANSACTION"
ret_code = sp.call([pgm, sql_str], shell=True)
else:
sql_str = "COMMIT TRANSACTION"
ret_code = sp.call([pgm, sql_str], shell=True)
else:
ret_code = sp.call([pgm, sql_str], shell=True)
I use this script under assorted linuxes to check whether a directory copy between machines (or to CD/DVD) worked or whether copying (e.g. ext3 utf8 filenames -> fusebl
k) has mangled special characters in the filenames.
#!/bin/bash
## dsum Do checksums recursively over a directory.
## Typical usage: dsum <directory> > outfile
export LC_ALL=C # Optional - use sort order across different locales
if [ $# != 1 ]; then echo "Usage: ${0/*\//} <directory>" 1>&2; exit; fi
cd $1 1>&2 || exit
#findargs=-follow # Uncomment to follow symbolic links
find . $findargs -type f | sort | xargs -d'\n' cksum
Sorry, don't have the exact code handy, but I coded a regular expression for searching source code in VS.Net that allowed me to search anything not in comments. It came in very useful in a particular project I was working on, where people insisted that commenting out code was good practice, in case you wanted to go back and see what the code used to do.
I have two ruby scripts that I modify regularly to download all of various webcomics. Extremely handy! Note: They require wget, so probably linux. Note2: read these before you try them, they need a little bit of modification for each site.
Date based downloader:
#!/usr/bin/ruby -w
Day = 60 * 60 * 24
Fromat = "hjlsdahjsd/comics/st%Y%m%d.gif"
t = Time.local(2005, 2, 5)
MWF = [1,3,5]
until t == Time.local(2007, 7, 9)
if MWF.include? t.wday
`wget #{t.strftime(Fromat)}`
sleep 3
end
t += Day
end
Or you can use the number based one:
#!/usr/bin/ruby -w
Fromat = "http://fdsafdsa/comics/%08d.gif"
1.upto(986) do |i|
`wget #{sprintf(Fromat, i)}`
sleep 1
end
Instead of having to repeatedly open files in SQL Query Analyser and run them, I found the syntax needed to make a batch file, and could then run 100 at once. Oh the sweet sweet joy! I've used this ever since.
isqlw -S servername -d dbname -E -i F:\blah\whatever.sql -o F:\results.txt
This goes back to my COBOL days but I had two generic COBOL programs, one batch and one online (mainframe folks will know what these are). They were shells of a program that could take any set of parameters and/or files and be run, batch or executed in an IMS test region. I had them set up so that depending on the parameters I could access files, databases(DB2 or IMS DB) and or just manipulate working storage or whatever.
It was great because I could test that date function without guessing or test why there was truncation or why there was a database ABEND. The programs grew in size as time went on to include all sorts of tests and become a staple of the development group. Everyone knew where the code resided and included them in their unit testing as well. Those programs got so large (most of the code were commented out tests) and it was all contributed by people through the years. They saved so much time and settled so many disagreements!
I coded a Perl script to map dependencies, without going into an endless loop, For a legacy C program I inherited .... that also had a diamond dependency problem.
I wrote small program that e-mailed me when I received e-mails from friends, on an rarely used e-mail account.
I wrote another small program that sent me text messages if my home IP changes.
To name a few.
Years ago I built a suite of applications on a custom web application platform in PERL.
One cool feature was to convert SQL query strings into human readable sentences that described what the results were.
The code was relatively short but the end effect was nice.
I've got a little app that you run and it dumps a GUID into the clipboard. You can run it /noui or not. With UI, its a single button that drops a new GUID every time you click it. Without it drops a new one and then exits.
I mostly use it from within VS. I have it as an external app and mapped to a shortcut. I'm writing an app that relies heavily on xaml and guids, so I always find I need to paste a new guid into xaml...
Any time I write a clever list comprehension or use of map/reduce in python. There was one like this:
if reduce(lambda x, c: locks[x] and c, locknames, True):
print "Sub-threads terminated!"
The reason I remember that is that I came up with it myself, then saw the exact same code on somebody else's website. Now-adays it'd probably be done like:
if all(map(lambda z: locks[z], locknames)):
print "ya trik"
I've got 20 or 30 of these things lying around because once I coded up the framework for my standard console app in windows I can pretty much drop in any logic I want, so I got a lot of these little things that solve specific problems.
I guess the ones I'm using a lot right now is a console app that takes stdin and colorizes the output based on xml profiles that match regular expressions to colors. I use it for watching my log files from builds. The other one is a command line launcher so I don't pollute my PATH env var and it would exceed the limit on some systems anyway, namely win2k.
I'm constantly connecting to various linux servers from my own desktop throughout my workday, so I created a few aliases that will launch an xterm on those machines and set the title, background color, and other tweaks:
alias x="xterm" # local
alias xd="ssh -Xf me#development_host xterm -bg aliceblue -ls -sb -bc -geometry 100x30 -title Development"
alias xp="ssh -Xf me#production_host xterm -bg thistle1 ..."
I have a bunch of servers I frequently connect to, as well, but they're all on my local network. This Ruby script prints out the command to create aliases for any machine with ssh open:
#!/usr/bin/env ruby
require 'rubygems'
require 'dnssd'
handle = DNSSD.browse('_ssh._tcp') do |reply|
print "alias #{reply.name}='ssh #{reply.name}.#{reply.domain}';"
end
sleep 1
handle.stop
Use it like this in your .bash_profile:
eval `ruby ~/.alias_shares`