Using GPU to rasterize image with 128 color channels

I need to rasterize a multispectral image, where each pixel contains the intensity (8 bits) at 128 different wavelengths, for a total of 1024 bits/pixel.
Currently I am using OpenGL, and rasterizing in 43 passes, each producing an image with 3 of the 128 channels, but this is too slow.
Is it possible to do it in a single pass by somehow telling the GPU to rasterize a 128 color component image (not necessarily using OpenGL)?
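One direction that might help (a sketch under assumptions, not a tested solution): modern OpenGL supports multiple render targets, so a single pass can write several color attachments at once. With 8 attachments of GL_RGBA32UI and four 8-bit intensities bit-packed into each 32-bit component, one pass could cover 8 * 4 * 4 = 128 channels. The attachment count, texture format, and packing scheme below are my assumptions, as is the use of GLEW for function loading.
// Hypothetical single-pass setup: 8 RGBA32UI color attachments, each packing
// 16 of the 128 8-bit wavelength channels (four per 32-bit component).
#include <GL/glew.h>
#include <vector>

GLuint createMultiChannelFBO(int width, int height)
{
    GLuint fbo;
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);

    const int numAttachments = 8;                 // assumes GL_MAX_COLOR_ATTACHMENTS >= 8
    std::vector<GLenum> drawBuffers;
    for (int i = 0; i < numAttachments; ++i) {
        GLuint tex;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        // Integer textures must use NEAREST filtering.
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32UI, width, height, 0,
                     GL_RGBA_INTEGER, GL_UNSIGNED_INT, nullptr);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i,
                               GL_TEXTURE_2D, tex, 0);
        drawBuffers.push_back(GL_COLOR_ATTACHMENT0 + i);
    }
    // Route the fragment shader outputs (layout(location = 0..7) out uvec4)
    // to the 8 attachments so one rasterization pass fills all 128 channels.
    glDrawBuffers(numAttachments, drawBuffers.data());
    return fbo;
}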

Related

Input feature to Feature maps

Can anybody please explain this basic thing to me: how does a 192x28x28 input image get reduced to 16x28x28 feature maps using a 1x1 conv mapping? My question is about understanding what exactly happens when 192 goes to 16.
I know about ((I - F + 2P)/S) + 1, but what happens in the process of reducing depth?
A 1x1 convolution kernel compresses the whole 192x28x28 input (which can be read as 192 feature maps of 28x28 pixels each) into a single 1x28x28 map: it reduces the depth along the feature-map axis to 1 while preserving the height and width of the original image.
But then why do you get 16? In a convolutional layer you can have several kernels; each kernel is an independent filter of the same size. In your case your 1x1 conv layer is configured with 16 kernels, hence you get 16 images of 28x28 (one per kernel).
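A minimal sketch of that per-pixel view of a 1x1 convolution, written as plain nested loops just to make the 192 -> 16 depth reduction explicit (no bias, batching, or optimizations; the memory layout and function name are illustrative):
#include <vector>

// input:  192 x 28 x 28 (flattened), weight: 16 x 192 (one 192-vector per kernel)
// output: 16 x 28 x 28 -- each output channel is a weighted sum over the 192
// input channels at the same spatial position, so H and W are untouched.
std::vector<float> conv1x1(const std::vector<float>& input,
                           const std::vector<float>& weight,
                           int inC = 192, int outC = 16, int H = 28, int W = 28)
{
    std::vector<float> output(outC * H * W, 0.0f);
    for (int k = 0; k < outC; ++k)              // 16 kernels -> 16 feature maps
        for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x) {
                float sum = 0.0f;
                for (int c = 0; c < inC; ++c)   // the depth reduction happens here
                    sum += weight[k * inC + c] * input[(c * H + y) * W + x];
                output[(k * H + y) * W + x] = sum;
            }
    return output;
}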

Darknet YOLO image size

I am trying to train a custom object classifier in Darknet YOLO v2:
https://pjreddie.com/darknet/yolo/
I gathered a dataset of images; most of them are 6000 x 4000 px, and some have lower resolutions as well.
Do I need to resize the images to be square before training?
I found that the config uses:
[net]
batch=64
subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
That's why I was wondering how to use it for datasets with different image sizes.
You don't have to resize them, because Darknet will do it for you!
It means you really don't need to do that, and you can use different image sizes during your training. What you posted above is just the network configuration; there should be a full network definition as well. The height and width tell you the network resolution, and it also keeps the aspect ratio, check e.g. this.
You don't need to resize your database images. PJReddie's YOLO architecture does it by itself, keeping the aspect ratio intact (no information is lost), according to the resolution in the .cfg file.
For example, if you have an image of size 1248 x 936, YOLO will resize it to 416 x 312 and then pad the extra space with black bars to fit the 416 x 416 network.
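For illustration, here is a rough OpenCV sketch of that resize-and-pad (letterbox) step for a 416 x 416 network. Whether Darknet centers the padding, and exactly which fill color it uses, may differ; both are assumptions here.
#include <algorithm>
#include <opencv2/opencv.hpp>

// Resize keeping the aspect ratio, then pad to a square network input.
cv::Mat letterbox(const cv::Mat& img, int netSize = 416)
{
    double scale = std::min((double)netSize / img.cols,
                            (double)netSize / img.rows);
    cv::Mat resized;
    cv::resize(img, resized, cv::Size(), scale, scale);    // e.g. 1248x936 -> 416x312

    int padX = netSize - resized.cols;
    int padY = netSize - resized.rows;
    cv::Mat boxed;
    cv::copyMakeBorder(resized, boxed,
                       padY / 2, padY - padY / 2,           // top, bottom
                       padX / 2, padX - padX / 2,           // left, right
                       cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));  // black bars
    return boxed;                                           // 416 x 416
}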
It is very common to resize images before training. 416x416 is slightly larger than usual; most ImageNet models resize and square the images to 256x256, for example, so I would expect the same here. Trying to train on 6000x4000 would require a farm of GPUs. The standard process is to square the image to the largest dimension (height or width), pad the shorter side with 0's, and then resize using standard image tools like PIL.
You do not need to resize the images; you can directly change the values in the darknet.cfg file.
When you open the darknet.cfg (yolo-darknet.cfg) file, you can see all the hyper-parameters and their values.
As shown in your cfg file, the image dimensions are (416, 416) -> (width, height); you can change these values and Darknet will automatically resize the images before training.
Since the images have high resolution, you can also lower the batch and subdivision values (e.g. to 32, 16, or 8; they have to be multiples of 2) so that Darknet does not crash with a memory allocation error.
By default the Darknet API resizes the images for both inference and training, but in theory any input size w = h = 32 * X, where X is a natural number, should work (w is the width, h the height). By default X = 13, so the input size is (w, h) = (416, 416). I use this rule with YOLOv3 in OpenCV, and it works better the bigger X is.

Can SoOffscreenRenderer use tiles bigger than 1024

The coin3d offscreen rendering class SoOffscreenRenderer is capable of rendering big images (e.g. 4000 x 2000 pixels), that don't fit on the screen or in a rendering buffer. This is done by partitioning the image into tiles that are rendered one after the other, where the default size of these tiles is 1024 x 1024.
I looked at the code of SoOffscreenRenderer and CoinOffscreenGLCanvas and found the environment variables COIN_OFFSCREENRENDERER_TILEWIDTH and COIN_OFFSCREENRENDERER_TILEHEIGHT. I could change the tile size using these variables, but only to sizes smaller than 1024. I could create tiles with 512 x 512 pixels, and also 768 x 768. When I used values bigger than 1024, the resulting tiles were always of size 1024 x 1024.
Is it possible to use bigger tile sizes like 2048 x 2048 or 4096 x 4096, and how would I do that?
It is possible to use larger tiles, and Coin does it automatically: it finds out which tile sizes work by querying the graphics card driver.
From CoinOffscreenGLCanvas.cpp:
// getMaxTileSize() returns the theoretical maximum gathered from
// various GL driver information. We're not guaranteed that we'll be
// able to allocate a buffer of this size -- e.g. due to memory
// constraints on the gfx card.
The reason why it did not work was that the environment variable COIN_OFFSCREENRENDERER_MAX_TILESIZE was set somewhere in our application using coin_setenv("COIN_OFFSCREENRENDERER_MAX_TILESIZE", "1024", 1);. Removing this call allowed bigger tile sizes to be used.
In CoinOffscreenGLCanvas::getMaxTileSize(void), the variable COIN_OFFSCREENRENDERER_MAX_TILESIZE is read and the tile size clamped accordingly.
On my older computer it generated tiles of size 1024, but on a newer machine the tiles were of size 4096.
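For reference, a minimal sketch of the knobs involved, assuming POSIX setenv and that the variables are set before the first offscreen render (i.e. before CoinOffscreenGLCanvas queries them). In the case above the actual fix was simply removing the coin_setenv call; the values below are arbitrary examples.
#include <cstdlib>

// Call before the first SoOffscreenRenderer::render(); on Windows use _putenv_s.
void allowLargeOffscreenTiles()
{
    // Raise (or simply don't lower) the cap read in
    // CoinOffscreenGLCanvas::getMaxTileSize(); 4096 is an arbitrary example.
    setenv("COIN_OFFSCREENRENDERER_MAX_TILESIZE", "4096", 1);

    // Optionally request a specific tile size as well.
    setenv("COIN_OFFSCREENRENDERER_TILEWIDTH", "2048", 1);
    setenv("COIN_OFFSCREENRENDERER_TILEHEIGHT", "2048", 1);
}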

Animated GIF larger than source images

I'm using imagemagick to create an animated GIF out of ~60 JPG 640x427px photos. The combined size of the JPGs is about 4MB.
However, the output GIF is ~12MB. Is there a reason why the GIF is considerably bigger? Can I conceivably achieve a GIF size of ~4MB?
The command I'm using is:
# Notes from my tests:
#   -channel RGB        -> no improvement in size
#   -dispose Background -> no improvement in size
#   -layers Optimize    -> about 2 MB improvement
convert -channel RGB \
        -delay 2x10 \
        -size 640 \
        -loop 0 \
        -dispose Background \
        -layers Optimize \
        portrait/*.jpg portrait.gif
Using gifsicle didn't seem to help either.
JPG is lossy compression.
GIF is lossless compression.
A better comparison would be to convert all the source images to GIF first, then combine them.
The first Google hit for GIF compression is http://ezgif.com/optimize, which claims lossy GIF compression; it might work for you, but I offer no warranty as I haven't tried it.
JPEG achieves its compression through a (lossy) transform, where a 16x16 / 8x8 block of pixels is transformed to a frequency representation and then quantized. Instead of keeping, say, 256 levels (i.e. 8 bits) per red/green/blue component, JPEG can ignore some frequency components, or use just 1 or 2 bits to represent them.
GIF, on the other hand, works by identifying repeated patterns in a paletted image (up to 256 entries) which occur exactly in the previously encoded/decoded stream. Both because of the JPEG compression and because of the kind of source typically encoded as JPEG (natural, full-color photos), the probability of (long) exact matches is quite low.
60 RGB images of size 640x427 amount to about 16 million pixels. Representing that much in 4 MB requires compressing down to about 2 bits per pixel. Achieving this with GIF would require a very lossy algorithm that quantizes true-color pixels not just to the closest entry in the target GIF palette, but also based on how good a dictionary of code words that particular selection produces. The dictionary builds up slowly, and to reach 2 bits/pixel the average decoded code word would have to map to about 5.5 matching pixels in the close neighborhood.
By contrast, ImageMagick has already compressed the 16 million pixels (each an 8-bit index into a 256-entry palette, i.e. about 16 MB raw) down to roughly 75% of that!
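To make those numbers concrete, a small back-of-the-envelope check using only the figures from the question (60 frames, 640x427, 4 MB target, 12 MB actual):
#include <cstdio>

int main()
{
    double pixels     = 60.0 * 640 * 427;           // ~16.4 million pixels
    double targetBits = 4.0 * 1024 * 1024 * 8;      // 4 MB budget, in bits
    double actualBits = 12.0 * 1024 * 1024 * 8;     // the 12 MB GIF, in bits

    std::printf("bits/pixel needed for 4 MB:  %.2f\n", targetBits / pixels);  // ~2.0
    std::printf("bits/pixel in the 12 MB GIF: %.2f\n", actualBits / pixels);  // ~6.1
    // ~6.1 bits vs. the 8 bits of a raw 256-color palette index is roughly 75%.
    return 0;
}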

How should I tile images for CATiledLayer?

I know how to tile images, I just don't get how the images should turn out, with sizes and so on.
The names should be Image_size_row_column, and one of the Apple tile images is:
Lake_125_0_0.png
I use TileCutter to tile the images, but I don't know if I should tile my original image into 512x512px tiles, and then also make a lower-resolution version of the original (from ≈7000x6000 down to ≈5000x4000) and tile that into 512x512px tiles as well, or whatever. I just don't get the whole setup.
The class reads images like this:
NSString *tileName = [NSString stringWithFormat:@"%@_%d_%d_%d",
                      imageName, (int)(scale * 1000), row, col];
And with the first of Apple's tiles being named Lake_125_0_0.png, that tells me nothing. I just don't get it. Anyone?
Thanks.
The tiles are by default always 256 by 256 pixels (although in the Apple example some tiles at the border of the image are cropped).
Lake_1000_1_2: full-resolution tile at scale 1, row 1, col 2.
Lake_500_1_2: half resolution: the tile is also 256 by 256 pixels, but it shows an area of the image that is actually 512 by 512 pixels (so you lose quality).
Lake_250_1_2: quarter resolution.
Lake_125_1_2: shows 8*256 by 8*256 pixels of the original image inside a 256 by 256 pixel tile.
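A small sketch of what that naming scheme works out to across the four detail levels; the tile grid per level is hypothetical, so substitute your own image dimensions:
#include <cstdio>

int main()
{
    const char* imageName = "Lake";
    // Levels: scale 1, 1/2, 1/4, 1/8 -> name prefixes 1000, 500, 250, 125.
    // A 256x256 tile at scale s covers 256/s by 256/s pixels of the original.
    for (int level = 0; level < 4; ++level) {
        double scale = 1.0 / (1 << level);
        int rows = 2, cols = 3;                  // hypothetical tile grid per level
        for (int row = 0; row < rows; ++row)
            for (int col = 0; col < cols; ++col)
                // Matches [NSString stringWithFormat:@"%@_%d_%d_%d", ...]
                std::printf("%s_%d_%d_%d.png\n",
                            imageName, (int)(scale * 1000), row, col);
    }
    return 0;
}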
I hope this helps.