I am trying to merge a number of GeoTIFFs into one large GeoTIFF with overviews; however, the final merged GeoTIFF shows a number of horizontal artifacts around the edges of the original input GeoTIFFs (see here for an example).
I create the merged file using the following code:
import os

# Produce combined VRT
string = 'gdalbuildvrt -srcnodata "0 0 0 0" -hidenodata -r bilinear %s -overwrite %s' % (tmp_vrt, GDal_merge_string)
os.system(string)

# Convert VRT to GeoTIFF
string = 'gdal_translate -b 1 -b 2 -b 3 -mask 4 --config GDAL_TIFF_INTERNAL_MASK YES -of GTiff %s %s' % (tmp_vrt, tmp_fname)
os.system(string)
I have a hunch that this might have to do with using gdal_translate on a VRT, as the errors occur at the edges of the original GeoTIFFs; in that case it might be related or similar to the issue found in this post.
This code uses VRTs to combine the GeoTIFFs for speed, but perhaps it would be better to just merge them with gdalwarp?
Edit: I have reduced the number of flags and left out the overviews in the code above, as suggested in the comment below by Benjamin. The error still seems to be produced by the code above, and I think the issue may lie in the masking process: I guess at some point in the process of stacking the bands the inputs are distorted. Is it generally inadvisable to run gdal_translate on VRTs?
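For reference, a minimal gdalwarp-based merge might look like the sketch below. The filenames are placeholders and I haven't verified that this avoids the artifacts; the nodata and resampling options just mirror the VRT-based code above:

gdalwarp -srcnodata "0 0 0 0" -dstnodata "0 0 0 0" -r bilinear input_1.tif input_2.tif merged.tif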
I am using gdal_rasterize and ogr2ogr with the goal of producing a raster of part of a .gpkg file.
With the first command I want to clip out a smaller area of a large map:
ogr2ogr -spat xmin ymin xmax ymax out.gpkg in.gpkg
This results in a file for which ogrinfo out.gpkg gives the expected output, listing the layer numbers and names.
Then trying to rasterize this new file with:
gdal_rasterize out.gpkg -burn 255 -ot Byte -ts 250 250 -l anylayer out.tif
results in: ERROR 1: Cannot get layer extent, with any of the layer names given by ogrinfo.
Using the same command on the original in.gpkg doesn't give errors and produces the expected .tif raster.
ogr2ogr --version GDAL 2.4.2, released 2019/06/28
gdal_rasterize --version GDAL 2.4.2, released 2019/06/28
In the end this process should be implemented with the GDAL C++ API.
Are the commands somehow invalid as given? If so, how?
Should the whole process be done differently? If so, how?
What does ERROR 1: Cannot get layer extent mean?
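One thing that may be worth checking (a guess, not a confirmed diagnosis): whether the -spat filter actually selected any features, since an empty layer has no extent for gdal_rasterize to compute. ogrinfo can report per-layer feature counts and extents, and the output extent can be supplied explicitly with -te so the layer extent is never queried (the coordinates below are placeholders):

ogrinfo -so -al out.gpkg
gdal_rasterize -burn 255 -ot Byte -ts 250 250 -te xmin ymin xmax ymax -l anylayer out.gpkg out.tif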
I have a PDF file with a grayscale image that I'm trying to convert to monochromatic bitmap (1 bit per pixel) using ghostscript. I can get everything to work fine, but I don't like the way the default grayscale conversion looks with coarse lines going through it. My understanding is that I can customize the halftone algorithm to my liking, but I can't seem to get the postscript commands to have an effect on the output. Am I using the 'sethalftone' command incorrectly? Is there a different approach I should consider?
gs -sDEVICE=bmpmono -sOutputFile=test.bmp -dBATCH -dNOPAUSE -r300 -c "<< /HalftoneType 1 /Frequency 40 /Angle 0 /SpotFunction {pop}>> sethalftone" -sPageList=1 input.pdf
I can completely remove the "-c" command line parameter and it makes no difference.
This is what the current mono conversion looks like that I'm trying to improve upon:
The default halftone built into Ghostscript has a limited working range and a tendency to produce an appearance that leans to one side (that is deliberate design, not accidental). There is a dither control, but it tends to work effectively only at around r/20 to r/5, in this case -dDITHERPPI=25 to -dDITHERPPI=60, thus reducing the size of the blocks. However, there are other potential controls.
I am no expert on how these settings are best used (the example below is borrowed); we would need to read the PostScript manual or play with the values, but they seem to fare better.
gswin32c -sDEVICE=bmpmono -r300 -o out.bmp -c "<< /Frequency 133 /Angle 45 /SuperCellSize 36 /Levels 255 >> .genordered /Default exch /Halftone defineresource { } settransfer 0.003 setsmoothness" -f ../examples/text_graph_image_cmyk_rgb.pdf
(Note: you may wish to include -dUseFastColor before the -c "..." part.)
Changing values
/Angle: common values are 0 (the GS default), 22, 30, 45, 52, 60 and 67; however /Angle 45 is the most common for this kind of mono conversion, and variations coupled with a range of /Frequency (line screen frequency, roughly LPI) and /SuperCellSize values may produce unachievable or subnormal results.
/Frequency (75 = GS default) can be set to different values; a common ratio is 4/3 = 133(%), another is 170, but a rule of thumb is r/2, so if you're using 300 dpi then /Frequency 150 may be better in this case. The correct choice comes down to a complex set of factors: a grid at 45 degrees is spaced at sqrt(2) = 1.41421356237 along the diagonal, so the horizontal and vertical effect is 0.7071... (in terms of page dpi); to match 300 dpi, the LPI frequency may need to be scaled accordingly, so consider 212.
/SuperCellSize somewhat eludes me (Integer; default value = 1; the actual cell size is determined by Frequency, Angle and the H/V resolution). A larger value will allow more levels to be attained. It is unclear what may be optimal without a sample of the target, so try a few values; some will trigger error feedback, but around 32 or 64 is a reasonable maximum to start from. Note the example uses 36. A quick comparison loop is sketched below.
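To compare a few values quickly, something like the following loop should work, assuming a POSIX shell with gs on the PATH (use gswin32c on Windows; input.pdf is a placeholder):

for s in 16 32 36 64; do
  gs -sDEVICE=bmpmono -r300 -o "out_$s.bmp" -c "<< /Frequency 133 /Angle 45 /SuperCellSize $s /Levels 255 >> .genordered /Default exch /Halftone defineresource { } settransfer 0.003 setsmoothness" -f input.pdf
done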
To prevent undesirable moire patterns, Peter Fink proposed the supercell method, named the Adobe Accurate Screen, to realize the screen angle having RT with the required accuracy [3]. The supercell designed in the image domain, including m × m (m: integer) identical halftone cells, is one solution to accurately approximate the required screen angle and screen ruling. https://www.imaging.org/site/PDFS/Papers/1998/PICS-0-43/668.pdf
You may also wish to try
/DotShape Integer; default value = 0 (CIRCLE). Other shapes available are:
1=REDBOOK, 2=INVERTED, 3=RHOMBOID, 4=LINE_X, 5=LINE_Y, 6=DIAMOND1, 7=DIAMOND2, 8=ROUNDSPOT.
These values and more can be found at https://ghostscript.com/doc/current/Language.htm; an example is sketched below.
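For instance, the earlier command with a non-default dot shape added to the dictionary (untested; this assumes .genordered accepts /DotShape as documented on that page):

gswin32c -sDEVICE=bmpmono -r300 -o out.bmp -c "<< /Frequency 133 /Angle 45 /SuperCellSize 36 /Levels 255 /DotShape 6 >> .genordered /Default exch /Halftone defineresource { } settransfer 0.003 setsmoothness" -f input.pdf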
Upper left is a mono area dithered in a graphics app, where, depending on resolution and density, "worms" or moire patterns start to appear. To regulate such effects a deliberate screen is applied, and at the smallest unit sizes, aberrations in the linework pattern will not be noticeable unless zoomed beyond the intended scrutiny.
gswin32c -sDEVICE=bmpmono -r300 -o out.bmp -c "<< /Frequency 106 /Angle 45 /SuperCellSize 64 /Levels 255 >> .genordered /Default exch /Halftone defineresource { } " -f test.pdf
ImageMagick can use custom halftones, defined by XML files, so if you can't get the result you need directly with Ghostscript, another option is to use Ghostscript to output a high-resolution greyscale image and then use ImageMagick to produce the black-and-white halftoned image.
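A rough sketch of that two-step pipeline (untested; h4x4a is one of ImageMagick's built-in ordered-dither halftone maps, and custom maps can be added via its thresholds.xml):

gs -sDEVICE=pnggray -r600 -o gray.png input.pdf
convert gray.png -ordered-dither h4x4a -type bilevel mono.bmp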
I have a raster dataset and have used gdal_sieve to eliminate clumps made up of fewer than a certain number of pixels. How do I eliminate large clumps made up of more than a certain number of pixels?
gdal_sieve.py does not support removing objects above a threshold size. However, your desired output can be achieved by sieving away the objects below your threshold and then calculating the difference between the input and sieved images, which isolates exactly the pixels the sieve changed (note that "A ^ B" is a bitwise XOR in gdal_calc's NumPy syntax, so this assumes a binary 0/1 raster):
gdal_sieve.py -st <<threshold>> input.tif sieved.tif
gdal_calc.py -A input.tif -B sieved.tif --calc="A ^ B" --outfile=output.tif
I issue the following command:
gs \
-o downsampled.pdf \
-sDEVICE=pdfwrite \
-dDownsampleColorImages=true \
-dColorImageResolution=180 \
-dColorImageDownsampleThreshold=1.0 \
And get the following errors:
Subsample filter does not support non-integer downsample factor (1.994360)
Failed to initialise downsample filter, downsampling aborted
(on some pages)
and:
Subsample filter does not support non-integer downsample factor (2.000029)
Failed to initialise downsample filter, downsampling aborted
Originally I tried to downsample to 150 dpi, which gave the error with factor (2.40????) on multiple pages, where the last few digits differ from page to page. So I guessed the images are approximately 150 * 2.4 = 360 dpi and tried downsampling to 180 instead. But it seems the images are all slightly off?
Is there a way to specify the factor instead of the dpi?
Is there a way to "round" the factor?
No, there is no way to specify the factor (these are the Adobe-specified distiller parameters, and we are currently limited to those). You cannot specify an approximation for rounding either, without modifying the source code.
You can use a different downsampling algorithm.
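For example, forcing Bicubic for colour images looks like this (the input name is a placeholder; the DownsampleType switches are standard pdfwrite distiller parameters):

gs \
-o downsampled.pdf \
-sDEVICE=pdfwrite \
-dDownsampleColorImages=true \
-dColorImageDownsampleType=/Bicubic \
-dColorImageResolution=180 \
-dColorImageDownsampleThreshold=1.0 \
input.pdf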
[much later]
In fact I just checked the current code, and you must be using an old version of Ghostscript.
The current default downsampling filter is the Bicubic filter, and if you do force the Subsample filter, then the code checks to see if the downsample factor requested is an integer.
If the factor is not an integer but is within 0.1 of an integer then it forces factor to the nearest integer.
If it's outside 0.1 of an integer factor then it aborts the subsample filter and switches to Bicubic.
I'd recommend upgrading.
[later edit]
So, leaving aside the bogus ColorDownsampleOption: the problem is actually not colour images at all, it's monochrome images, or more precisely in your case, imagemasks.
I set up this command line:
gs \
-sDEVICE=pdfwrite \
-sOutputFile=pdfwrite.pdf \
-dDownsampleColorImages=true \
-dDownsampleGrayImages=true \
-dDownsampleMonoImages=true \
-dColorImageDownsampleThreshold=1 \
-dGrayImageDownsampleThreshold=1 \
-dMonoImageDownsampleThreshold=1 \
-dColorImageDownsampleType=/Bicubic \
-dGrayImageDownsampleType=/Bicubic \
-dMonoImageDownsampleType=/Bicubic \
-dColorImageResolution=72 \
-dGrayImageResolution=72 \
-dMonoImageResolution=100 "gs sample.pdf"
And that produces an error message that the only filter available for monochrome images is Subsample, followed by the error messages you quote about the imprecise factor.
I guess basically this makes my point that an example file is pretty much vital in order to investigate problems.
So there is a problem there, and I will look into it, obviously for monochrome images it should be clamped to the nearest integer resolution, since no other filter is possible. However, Gray and Colour images do work as expected.
Reporting a bug, as I suggested in an earlier comment, would probably have got to this point much sooner. I'd still suggest you do that, so that this is not overlooked.
You may be interested to note that, for me, the resulting file when I don't downsample monochrome images, but do downsample the others, as per the command line above, is 785KB the original being 2.5MB.
I have a lot of PDF documents that I want to convert to PNG, edit in Gimp, and then save back to a multipage Acrobat file. I'm filling out forms and adding a scanned signature, trying to avoid printing, signing and scanning back in, while keeping the ability to type the information I need to enter.
I've been trying to use Imagemagick to convert to png files, which seems to work fine. I use the command convert -quality 100 -density 300x300 multipage.pdf single%d.png
(I'm not really sure if the quality parameter is right for png).
But I'm having problems with saving back to PDF. Some of the files have the wrong page size, and I've tried every command and procedure I can find, but there are always a few odd sizes. The resolution seems to vary so that it looks good at a certain zoom level, but either a few pages are specified at about 2" wide, or they are 8.5x11 while the others are about 35" wide. I've tried making sure Gimp had the canvas size and resolution correct, and to save the resolution in the file, but that doesn't seem to matter.
The command I use to save the files is convert -page letter -adjoin single*.png multipage.pdf. I've tried other parameters, but none seemed to matter.
If anyone has any ideas or alternatives, I'd appreciate it.
"I'm not really sure if the quality parameter is right for PNG."
For PNG output, the -quality setting is very unlike JPEG's quality setting (which is simply an integer from 0 to 100).
For PNG it is composed of two single digits:
The first digit (tens) is (largely) the zlib compression level, and it may go from 0 to 9.
(However the setting of 0 has a special meaning: when you use it you'll get Huffman compression, not zlib compression level 0. This is often better... Weird but true.)
The second digit is the PNG data encoding filter type (before it is compressed):
0 is none,
1 is "sub",
2 is "up",
3 is "average",
4 is "Paeth", and
5 is "adaptive".
In practical terms that means:
For illustrations with solid sequences of color a "none" filter (-quality 00) is typically the most appropriate.
For photos of natural landscapes an "adaptive" filtering (-quality 05) is generally the best.
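So, for instance, to use zlib compression level 9 with adaptive filtering, the conversion command from your question (density carried over unchanged) would become:

convert -density 300x300 -quality 95 multipage.pdf single%d.png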
"I'm having problems with saving back to PDF. Some of the files have the wrong page size, and I've tried every command and procedure I can find [...] but either a few pages are specified at about 2" wide, or they are 8.5x11 but the others are about 35" wide."
Not having available your PNG files, I created a few simple ones with different dimensions to verify the different commands (as I wasn't sure myself any more). Indeed, the one you used:
convert -page letter -adjoin single*.png multipage.pdf
does create all PDF pages in the (same) letter size, but it always places my sample of (differently sized) PNGs in the lower left corner of the PDF page. (Should a PNG exceed the PDF page size, it does scale it down to make it fit -- but it doesn't scale up smaller PNGs to fill the available page space.)
The following modification to the command will place the PNGs into the center of each PDF page:
convert \
-page letter \
-adjoin \
single*.png \
-gravity center \
multipage.pdf
If this is still not good enough for you, you can enforce a (possibly non-proportional!) scaling to almost fill the letter area by adding a -scale '590!x770!' parameter (this will leave a border of 11 pt at each edge of the page):
convert \
-page letter \
-adjoin \
single*.png \
-gravity center \
-scale '590!x770!' \
multipage.pdf
To drop the extra border, use -scale '612!x792!'. -- Should you want scaling to happen only upward, when required, while keeping the aspect ratio of the PNG, use -scale '590<x770<':
convert \
-page letter \
-adjoin \
single*.png \
-gravity center \
-scale '590<x770<' \
multipage.pdf
Why not just use Xournal? That's what I use to annotate PDFs.