I have an image with dimensions 1024*1024 stored in an HDF5 file, where it is treated as a data cube of slice thickness 1 (so the stored dimensions are 1024*1024*1). I used the Niermann HDF5 plug-in (https://github.com/niermann/gms_plugin_hdf5) to import the data. After importing, the data cube became 1*1024*1024 and is displayed as a stack of 1024 slices, each 1 pixel wide and 1024 pixels high.
Before considering re-implementing the plug-in, I'd like to ask: is there any way to "reshape" the data (as with "numpy.reshape"), so that the dimensions are treated properly?
Thanks!
If you don't like icol, irow and these expressions, another elegant solution is to just use the streaming object.
image ReShape3D( image input, number sx, number sy, number sz )
{
// Perform testing
number nPix=1
for ( number d=0; d<input.ImageGetNumDimensions(); d++ )
nPix *= input.ImageGetDimensionsize(d)
if ( sx*sy*sz < nPix ) Throw( "Input image larger than provided shape." )
if ( sx*sy*sz > nPix ) Throw( "Input image smaller than provided shape." )
image reShaped := input.Imageclone()
reShaped.ImageResize(3,sx,sy,sz)
object dStream = NewStreamFromBuffer(0)
ImageWriteImageDataToStream(input,dStream,0)
dStream.StreamSetPos(0,0)
ImageReadImageDataFromStream(reShaped,dStream,0)
return reshaped
}
Image before := RealImage("Before",4,10,20,30)
before = random()
Image after := ReShape3D( before,20,10,30 )
before.ShowImage()
after.ShowImage()
If your input/output array sizes do not match in dimension (such that slice would not work), then you can also 'stream' the data into and out of 1D using the following:
number sx = 4
number sy = 5
number sz = 2
image oneLine := RealImage( "1D",4, sx*sy*sz )
oneLine = icol
oneLine.ShowImage()
image reShape1Dto3D := RealImage( "1D->3D", 4, sx, sy, sz )
reShape1Dto3D = oneLine[icol + iwidth*irow + iwidth*iheight*iplane, 0 ]
reShape1Dto3D.ShowImage()
image reShape3Dto1D := RealImage( "3D->1D", 4, sx*sy*sz )
reShape3Dto1D[icol + iwidth*irow + iwidth*iheight*iplane, 0 ] = reShape1Dto3D
reShape3Dto1D.ShowImage()
The trick here is that you can address a single value in an image expression using square brackets: in a 3D image by [X,Y,Z], in a 2D image by [X,Y], and in a 1D image by [X,0]. [*]
The internal variables icol, irow, iplane are replaced by X,Y,Z coordinate of the evaluated expression, whereas iwidth, iheight and idepth are replaced by the dimension sizes of the evaluated expression.
What is the evaluated expression's size? It becomes defined by the only image of "known size" in the line - either left or right side, so that
reShape1Dto3D = oneLine[ icol + iwidth*irow + iwidth*iheight*iplane, 0 ]
becomes a loop over X/Y/Z of all pixels of reShape1Dto3D on the left-hand side of the expression. For each triplet (X/Y/Z) the value is taken from the computed position of oneLine.
Exactly the same is used in
reShape3Dto1D[ icol + iwidth*irow + iwidth*iheight*iplane, 0 ] = reShape1Dto3D
but here the loop is again over the size of reShape1Dto3D, because that is the image of "known size" in the line, even if it is on the righthand side.
* Higher dimensionality is not supported in this way, as [T,L,B,R] is already used for sub-areas.
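For readers more familiar with NumPy, the index arithmetic above can be sketched as follows. This is an illustration in Python rather than DM-script; the helper name `linear_index` and the array names are made up for the example, and it assumes the x-fastest memory order implied by the formula.

```python
import numpy as np

def linear_index(x, y, z, width, height):
    """Position of voxel (x, y, z) in the flattened 1D data (x runs fastest)."""
    return x + width * y + width * height * z

sx, sy, sz = 4, 5, 2
one_line = np.arange(sx * sy * sz, dtype=np.float32)  # like "oneLine = icol"

# 1D -> 3D: loop over every voxel of the target and look up its source position.
# NumPy indexes as [z, y, x], so the x axis is the last one.
cube = np.empty((sz, sy, sx), dtype=np.float32)
for z in range(sz):
    for y in range(sy):
        for x in range(sx):
            cube[z, y, x] = one_line[linear_index(x, y, z, sx, sy)]

# The explicit loop gives the same result as a plain C-order reshape.
same_as_reshape = np.array_equal(cube, one_line.reshape(sz, sy, sx))

# 3D -> 1D: flattening restores the original line.
flat_again = cube.reshape(-1)
```

The loop is only there to mirror the expression; in NumPy the reshape call alone would normally be used.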
After a few more trials with the examples in the "DM scripting handbook", I figured out a method:
image out = in.slice2(0,0,0, 1,1024,1, 2,1024,1)
That is, the output 2D image takes the y-z plane of the input image, extracted with the Slice2() command.
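In NumPy terms, the same operation amounts to dropping the length-1 axis. The sketch below assumes the imported cube is held as an array indexed [z, y, x] with an x extent of 1; the small sizes and variable names are illustrative only.

```python
import numpy as np

# Stand-in for the imported 1*1024*1024 cube, shrunk to 1 (x) * 5 (y) * 6 (z).
nz, ny, nx = 6, 5, 1
cube = np.arange(nz * ny * nx, dtype=np.float32).reshape(nz, ny, nx)

# Take the y-z plane at x = 0: output columns follow y, output rows follow z.
plane = cube[:, :, 0]

# np.squeeze removes every length-1 axis and gives the same 2D array here.
squeezed = np.squeeze(cube)
```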
I am trying to downsample (reduce the resolution of) a 3D array along the first two axes only. For example, if the array size is 40x50x300, downsampling it by a factor of 2 will make it 20x25x300.
For this purpose, I have found a function in scikit-image.
import numpy as np
def create_img(nX, nY, nMZ):
    "this function creates a demo array"
    img = np.zeros((nX,nY,nMZ))
    img_sp = np.arange(nX*nY).reshape(nX, nY) + 1
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            img[r,c,:] = img_sp[r,c]
    return img
image = create_img(7, 5, 10)
Now every pixel in the image(2D) has corresponding values on the z axis.
from skimage.measure import block_reduce
img_block = block_reduce(image, block_size=(2, 2, 1), cval=0, func=np.min)
Now the block_reduce function takes the minimum value of each (non-overlapping) 2x2 block and so downsamples the image along the x and y axes.
If the func argument is changed to np.max, it takes the maximum value of each 2x2 block. Other supported functions include np.mean, np.median and so on.
But I want to pick the XY values based on location/indices, for example the 0th element of each 2x2 block, or the element with the maximum indices.
How can I achieve that?
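One possible approach, sketched here with plain NumPy rather than block_reduce: pad the first two axes up to a multiple of the block size, then use strided slicing to keep one fixed position per block. The function name `pick_in_block` and its parameters are hypothetical; the padding with `cval` is meant to mimic how block_reduce handles shapes that are not a multiple of the block size.

```python
import numpy as np

def pick_in_block(arr, block=(2, 2), offset=(0, 0), cval=0):
    """Keep the element at `offset` inside every block along the first two axes.

    arr is a 3D array; the third axis is left untouched.
    """
    bx, by = block
    ox, oy = offset
    # Pad so that both leading dimensions are multiples of the block size.
    pad_x = (-arr.shape[0]) % bx
    pad_y = (-arr.shape[1]) % by
    padded = np.pad(arr, ((0, pad_x), (0, pad_y), (0, 0)),
                    mode="constant", constant_values=cval)
    # Strided slicing picks one element per block.
    return padded[ox::bx, oy::by, :]

# Demo array built like create_img(7, 5, 10): the value depends on (row, column) only.
demo = np.repeat((np.arange(7 * 5).reshape(7, 5) + 1)[:, :, None], 10, axis=2)

first = pick_in_block(demo, offset=(0, 0))  # 0th element of each 2x2 block
last = pick_in_block(demo, offset=(1, 1))   # element with the maximum indices
```

With an offset of (0, 0) no padded value is ever selected; with (1, 1) the blocks that hang over the edge return `cval`.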
I wrote a script to convert an EELS map to EELS line scan data, and it works well with DM 2.0: I can treat the result like directly collected EELS line scan data. But it does not work with DM 3.0 and above. It looks like DM 3.0 still recognizes it as an EELS map file: it still tries to generate elemental maps with multiple windows from it, rather than line scan profiles in one single window, and reports that the display type is incorrect. I'm not sure what code/command I need to add to make it work with DM 3.0 and above. I'd appreciate any suggestions/comments.
image source
source := getFrontImage()
number sizeX,sizeY,sizeZ
source.Get3Dsize(sizeX,sizeY,sizeZ)
Result( "Original size:"+ sizeX +"; "+ sizeY+"; "+sizeZ+""+"\n" )
image sum
number regionsizeX = 1
number regionsizeY = sizeY
number row,col
Result( "new size:"+ regionsizeX +"; "+ regionsizeY+"; "+row+""+row+" "+"\n" )
sum := RealImage("Line Scan of [0,0,"+regionSizeY+","+regionSizeX+"]",4,sizeX/regionSizeX,sizey/regionsizeY,sizeZ)
//sum := ImageClone(source)
sum = 0
for (row=0;row<regionsizeY;row++) for (col=0;col<regionSizeX;col++)
{
OpenAndSetProgressWindow("Doing sub-block","x = "+col," y = "+row)
sum += Slice3(source,col,row,0,0,sizeX/regionSizeX,regionsizeX,1,sizeY/regionSizeY,regionSizeY,2,sizez,1)
}
OpenAndSetProgressWindow("","","")
ImageCopyCalibrationFrom(sum, source)
sum.setdisplaytype (1)
sum.SetStringNote( "Meta Data:Format", "Spectrum image" )
sum.SetStringNote( "Meta Data:Signal", "EELS" )
showimage(sum)
I'm also a bit confused by your terminology. When you write "Convert a Map into a LineScan" do you mean:
a) Convert a 3D Spectrum-Image (xy scan, one spectral dimension) into a 2D Line-Scan Spectrum-Image (one spatial dimension, one spectral dimension)
or
b) Convert a 2D Map (xy scan, one value) in a 1D Line-Trace (one spatial dimension, one value per point) ?
I suppose you mean a) and answer to that.
I'm surprised if/that your script would work without issues in GMS 2.
Your final (supposedly line-scan SI) data is still a 3D dataset with the dispersion running in Z-direction. This is not the typical LineScan SI data format (which is dispersion in X, spatial dimension in Y, no Z dimension).
Am I right in thinking that you want to "collapse" your 3D data along the y-dimension (by summing) ?
If so, what you want to do is:
// Get Input
image src3D := GetFrontImage()
number sizeX,sizeY,sizeZ
if ( 3 != src3D.ImageGetNumDimensions() ) Throw( "Input not 3D")
src3D.Get3Dsize(sizeX,sizeY,sizeZ)
// Optional: Use Rect-ROI on image to specify area
// If no selection, will return full FOV
number t,l,b,r
src3D.GetSelection(t,l,b,r)
// Prepare output (for summing 3D of rect-selection along y)
// NB: 2D container has:
// X dimension (spatial) along Y
// Z dimension (energy) along X
number nSpatial = r - l
number nSpectral = sizeZ
number eOrig, eScale, sOrig, sScale
string eUnit, sUnit
src3D.ImageGetDimensionCalibration(0, sOrig, sScale, sUnit, 0)
src3D.ImageGetDimensionCalibration(2, eOrig, eScale, eUnit, 0)
string name
if ( nSpatial != sizeX )
name = "Y-projection of [" + t + "," + l + "," + b + "," + r + "] over " + (b-t) + " rows"
else
name = "Y-projection over " + sizeY + " rows"
image dst2D := RealImage( name, 4, nSpectral, nSpatial )
dst2D.ImageSetDimensionCalibration(0, eOrig, eScale, eUnit, 0)
dst2D.ImageSetDimensionCalibration(1, sOrig, sScale, sUnit, 0)
// Copy Tags (contains necessary meta tags! Meta Data Format & Signal)
dst2D.ImageGetTagGroup().TagGroupCopyTagsFrom( src3D.ImageGetTagGroup() )
// Display (with captions)
dst2D.ShowImage()
dst2D.ImageGetImageDisplay(0).ImageDisplaySetCaptionOn(1)
number doFAST = 0
if ( !doFAST )
{
// Perform actual summing (projection) by summing "line by line"
// into the LinePlot SI. Note the flipping of input and output dimensions!
for( number y = t; y<b; y++ )
{
number lineNumber = y - t
dst2D.slice2( 0,0,0, 0,nSpectral,1, 1,nSpatial,1 ) += src3D.slice2( l,y,0, 2,nSpectral,1, 0,nSpatial,1)
}
}
else
{
// Alternative (faster) projection. Use dedicated projection command.
image proj := src3D[l,t,0,r,b,nSpectral].Project(1) // Outcome of projection has x=x and y=z, so the axes need flipping
dst2D = proj.slice2(0,0,0, 1,nSpectral,1, 0,nSpatial,1 ) // Flip axis
}
// Display (with captions)
dst2D.ShowImage()
dst2D.ImageGetImageDisplay(0).ImageDisplaySetCaptionOn(1)
Note that iterating using slice blocks is fast, but not as fast as the dedicated 'Project' command available in latest GMS versions. The example uses either, but lines #51-56 might not be available in older GMS.
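To make the axis bookkeeping explicit, here is a rough NumPy analogue of the projection. It is a sketch only, not DM-script: it assumes the spectrum image is an array indexed [x, y, energy], and the function name and ROI arguments are illustrative.

```python
import numpy as np

def project_along_y(cube, top, left, bottom, right):
    """Sum a spectrum image (indexed [x, y, energy]) over the rows top..bottom-1.

    Returns a 2D array indexed [spatial, energy]: the spatial coordinate
    runs along the rows and the dispersion along the columns.
    """
    roi = cube[left:right, top:bottom, :]
    return roi.sum(axis=1)

# Small stand-in data: 4 (x) * 3 (y) * 5 (energy channels).
cube = np.arange(4 * 3 * 5, dtype=np.float64).reshape(4, 3, 5)
line_scan = project_along_y(cube, top=0, left=1, bottom=3, right=4)
```

The result has one spectrum per remaining x position, which matches the layout described above for line-scan data (dispersion in X, spatial dimension in Y).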
Edit to address comment below:
Other relevant meta data for spectra is also found in the tags. For EELS, in particular the collection & convergence angle as well as the HT is of importance. You can find out about the tag-path by checking the tags of a properly acquired EELS spectrum.
Or, you can find out about their tag-paths by "converting" an empty 1D line-plot into an EELS spectrum and then attempting a quantification. You will get the prompt to fill in the data. After doing so, check the tags of the image:
I tried getting individual characters from the image and passing them through the OCR, but the result is jumbled-up characters. Passing the whole image at least returns the characters in order, but it seems like the OCR is trying to read all the other contours as well.
example image:
Image being used
The result : 6A7J7B0
Desired result : AJB6779
The code
import re
import cv2
import pytesseract
img = cv2.imread("data/images/car6.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
# resize image to three times as large as original for better readability
gray = cv2.resize(gray, None, fx = 3, fy = 3, interpolation = cv2.INTER_CUBIC)
# perform gaussian blur to smoothen image
blur = cv2.GaussianBlur(gray, (5,5), 0)
# threshold the image using Otsus method to preprocess for tesseract
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
# create rectangular kernel for dilation
rect_kern = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
# apply dilation to make regions more clear
dilation = cv2.dilate(thresh, rect_kern, iterations = 1)
# find contours of regions of interest within license plate
try:
    contours, hierarchy = cv2.findContours(dilation, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
except:
    ret_img, contours, hierarchy = cv2.findContours(dilation, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# sort contours left-to-right
sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])
# create copy of gray image
im2 = gray.copy()
# create blank string to hold license plate number
plate_num = ""
# loop through contours and find individual letters and numbers in license plate
for cnt in sorted_contours:
    x,y,w,h = cv2.boundingRect(cnt)
    height, width = im2.shape
    # if height of box is not tall enough relative to total height then skip
    if height / float(h) > 6: continue
    ratio = h / float(w)
    # if height to width ratio is less than 1.5 skip
    if ratio < 1.5: continue
    # if width is not wide enough relative to total width then skip
    if width / float(w) > 15: continue
    area = h * w
    # if area is less than 100 pixels skip
    if area < 100: continue
    # draw the rectangle
    rect = cv2.rectangle(im2, (x,y), (x+w, y+h), (0,255,0),2)
    # grab character region of image
    roi = thresh[y-5:y+h+5, x-5:x+w+5]
    # perform bitwise not to flip image to black text on white background
    roi = cv2.bitwise_not(roi)
    # perform another blur on character region
    roi = cv2.medianBlur(roi, 5)
    try:
        text = pytesseract.image_to_string(roi, config='-c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 8 --oem 3')
        # clean tesseract text by removing any unwanted blank spaces
        clean_text = re.sub(r'[\W_]+', '', text)
        plate_num += clean_text
    except:
        text = None
if plate_num != None:
    print("License Plate #: ", plate_num)
For me, psm mode 11 worked; it was able to detect single-line as well as multi-line text:
text = pytesseract.image_to_string(img, lang='eng', config='--oem 3 --psm 11').replace("\n", "")
11 Sparse text. Find as much text as possible in no particular order.
If you want to extract a license plate number that spans two rows, you can replace the following line:
sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])
with
sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0] + cv2.boundingRect(ctr)[1] * img.shape[1] )
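A single-number key of the form x + y * width lets y dominate, so boxes on the same row whose top edges differ by even one pixel can end up out of order. A different technique, sketched below on plain (x, y, w, h) tuples as returned by cv2.boundingRect, groups boxes into rows first and then sorts each row left to right. The function name and the `row_tolerance` parameter are hypothetical and would need tuning for real plates.

```python
def sort_boxes_reading_order(boxes, row_tolerance=0.5):
    """Sort (x, y, w, h) boxes row by row, left to right within each row.

    A box joins the current row when its vertical centre lies within
    row_tolerance * h of the centre of the first box in that row.
    """
    rows = []
    # Visit boxes from top to bottom by vertical centre.
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2.0):
        x, y, w, h = box
        center = y + h / 2.0
        if rows and abs(center - rows[-1]["center"]) <= row_tolerance * h:
            rows[-1]["boxes"].append(box)
        else:
            rows.append({"center": center, "boxes": [box]})
    ordered = []
    for row in rows:
        ordered.extend(sorted(row["boxes"], key=lambda b: b[0]))
    return ordered

# Two rows of characters with slightly uneven top edges.
boxes = [(60, 12, 20, 40), (10, 10, 20, 40), (35, 8, 20, 40),
         (40, 62, 20, 40), (12, 60, 20, 40)]
ordered = sort_boxes_reading_order(boxes)
```

In the question's script, the same idea could be applied to the bounding rectangles of the contours that survive the size filters.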
Can anyone help me with the parameters for SetGeoTransform? I'm creating raster layers with GDAL, but I can't find a description of the 3rd and 5th parameters of SetGeoTransform. They should define the x and y axes for cells. I tried to find something about it here and here, but found nothing.
I need to find a description of these two parameters... Is it a value in degrees, radians, meters? Or something else?
The geotransform is used to convert from map to pixel coordinates and back using an affine transformation. The 3rd and 5th parameter are used (together with the 2nd and 4th) to define the rotation if your image doesn't have 'north up'.
But most images are north up, and then both the 3rd and 5th parameter are zero.
The affine transform consists of six coefficients returned by
GDALDataset::GetGeoTransform() which map pixel/line coordinates into
georeferenced space using the following relationship:
Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)
See the section on affine geotransform at:
https://gdal.org/tutorials/geotransforms_tut.html
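The relationship quoted above can be written out directly. The sketch below uses plain Python and does not call GDAL; the function names are illustrative, and the geotransform is just a 6-tuple in the GT(0)..GT(5) order.

```python
def pixel_to_geo(gt, pixel, line):
    """Apply the affine geotransform: pixel/line -> georeferenced x/y."""
    x = gt[0] + pixel * gt[1] + line * gt[2]
    y = gt[3] + pixel * gt[4] + line * gt[5]
    return x, y

def geo_to_pixel(gt, x, y):
    """Invert the affine geotransform: georeferenced x/y -> pixel/line."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    if det == 0:
        raise ValueError("geotransform is not invertible")
    dx, dy = x - gt[0], y - gt[3]
    pixel = (gt[5] * dx - gt[2] * dy) / det
    line = (-gt[4] * dx + gt[1] * dy) / det
    return pixel, line

# North-up example: 2-unit pixels, origin at (100, 200), GT(2) and GT(4) are zero.
north_up = (100.0, 2.0, 0.0, 200.0, 0.0, -2.0)
geo = pixel_to_geo(north_up, 3, 4)
back = geo_to_pixel(north_up, *geo)
```

Since GT(2) and GT(4) multiply a line or pixel count and add to a georeferenced coordinate, they carry georeferenced units per pixel, the same as GT(1) and GT(5).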
I did it as in the code below.
As a result, I was able to achieve the same as with SetGeoTransform.
# new file
dst = gdal.GetDriverByName('GTiff').Create(OUT_PATH, xsize, ysize, band_num, dtype)
# old file
ds = gdal.Open(fpath)
wkt = ds.GetProjection()
gcps = ds.GetGCPs()
dst.SetGCPs(gcps, wkt)
...
dst.FlushCache()
dst = None
Given the information from the aforementioned GDAL data model docs, the 3rd & 5th parameters of SetGeoTransform (x_skew and y_skew respectively) can be calculated from two control points (p1, p2) with known x and y in both "geo" and "pixel" coordinate spaces. p1 should be above-left of p2 in pixelspace.
x_skew = sqrt((p1.geox-p2.geox)**2 + (p1.geoy-p2.geoy)**2) / (p1.pixely - p2.pixely)
y_skew = sqrt((p1.geox-p2.geox)**2 + (p1.geoy-p2.geoy)**2) / (p1.pixelx - p2.pixelx)
In short this is the ratio of Euclidean distance between the points in geospace to the height (or width) of the image in pixelspace.
The units of the parameters are "geo"length/"pixel"length.
Here is a demonstration using the corners of the image stored as control points (gcps):
from osgeo import gdal
from math import sqrt
ds = gdal.Open(fpath)
gcps = ds.GetGCPs()
assert gcps[0].Id == 'UpperLeft'
p1 = gcps[0]
assert gcps[2].Id == 'LowerRight'
p2 = gcps[2]
y_skew = (
sqrt((p1.GCPX-p2.GCPX)**2 + (p1.GCPY-p2.GCPY)**2) /
(p1.GCPPixel - p2.GCPPixel)
)
x_skew = (
sqrt((p1.GCPX-p2.GCPX)**2 + (p1.GCPY-p2.GCPY)**2) /
(p1.GCPLine - p2.GCPLine)
)
x_res = (p2.GCPX - p1.GCPX) / ds.RasterXSize
y_res = (p2.GCPY - p1.GCPY) / ds.RasterYSize
ds.SetGeoTransform([
p1.GCPX,
x_res,
x_skew,
p1.GCPY,
y_skew,
y_res,
])
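As a possible cross-check on the skew terms, and as a different technique from the two-point formula above, all six coefficients can be fitted from three or more control points by least squares, using the affine relationship Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2) and the matching one for Ygeo. The sketch uses NumPy only; the function name and the sample points are made up for illustration.

```python
import numpy as np

def fit_geotransform(pixels, lines, xs, ys):
    """Least-squares fit of the six affine coefficients from control points.

    pixels/lines are the image coordinates, xs/ys the georeferenced ones.
    Needs at least three points that do not lie on one straight line.
    """
    pixels = np.asarray(pixels, dtype=float)
    lines = np.asarray(lines, dtype=float)
    # Each row is [1, pixel, line], matching GT(0) + pixel*GT(1) + line*GT(2).
    design = np.column_stack([np.ones_like(pixels), pixels, lines])
    cx, *_ = np.linalg.lstsq(design, np.asarray(xs, dtype=float), rcond=None)
    cy, *_ = np.linalg.lstsq(design, np.asarray(ys, dtype=float), rcond=None)
    return (cx[0], cx[1], cx[2], cy[0], cy[1], cy[2])

# Synthetic control points generated from a known, slightly skewed transform.
true_gt = (100.0, 2.0, 0.5, 200.0, 0.25, -2.0)
px = [0, 10, 0, 10]
ln = [0, 0, 10, 10]
gx = [true_gt[0] + p * true_gt[1] + l * true_gt[2] for p, l in zip(px, ln)]
gy = [true_gt[3] + p * true_gt[4] + l * true_gt[5] for p, l in zip(px, ln)]
fitted = fit_geotransform(px, ln, gx, gy)
```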
I have an array with shape 15x30 and want to save it as a pseudocolor plot with imsave() in pylab mode. However, the output image produced is 15x30 px. I tried setting the dpi parameter, but it doesn't help, and the function doesn't have any other parameter that changes the image size.
So how can I save a pseudocolor image from an array with imsave() and change the size of the output image?
A really hacky solution to this is to just scale up your data:
data = rand(10, 15)
new_data = np.zeros(np.array(data.shape) * 10)
for j in range(data.shape[0]):
    for k in range(data.shape[1]):
        new_data[j * 10: (j+1) * 10, k * 10: (k+1) * 10] = data[j, k]
# 'scaled.png' is a placeholder output filename
imsave('scaled.png', new_data)
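The same upscaling can be done without explicit loops. The sketch below uses np.repeat along both axes; the helper name `upscale_nearest` is illustrative, and the saving step is left out so the snippet runs without matplotlib.

```python
import numpy as np

def upscale_nearest(data, factor):
    """Blow up a 2D array so every value becomes a factor x factor block."""
    return np.repeat(np.repeat(data, factor, axis=0), factor, axis=1)

data = np.arange(6, dtype=float).reshape(2, 3)
new_data = upscale_nearest(data, 10)
```

The enlarged array can then be passed to imsave() in the same way as in the loop version.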