This question is possibly related to storing and retrieving a numpy array in the form of an image. So, I am saving an array of binary values to an image (using scipy.misc.toimage feature):
import numpy, random, scipy.misc
data = numpy.array([random.randint(0, 1) for i in range(100)]).reshape(100, 1).astype("b")
image = scipy.misc.toimage(data, cmin=0, cmax=1, mode='1')
image.save("arrayimage.png")
Notice that I am saving the data with mode 1 (1-bit pixels, black and white, stored with one pixel per byte). Now, when I try to read it back like:
data = scipy.misc.imread("arrayimage.png")
the resulting data array comes back as all zeroes.
The question is: is there any other way to retrieve data from the image, with the strict requirement that the image should be created with the mode 1. Thanks.
I think you want this:
from PIL import Image
import numpy
# Generate boolean data
data=numpy.random.randint(0, 2, size=(100, 1),dtype="bool")
# Convert to PIL image and save as PNG
Image.fromarray(data).convert("1").save("arrayimage.png")
Checking what you get with ImageMagick
identify -verbose arrayimage.png
Sample Output
Image: arrayimage.png
Format: PNG (Portable Network Graphics)
Mime type: image/png
Class: PseudoClass
Geometry: 1x100+0+0
Units: Undefined
Colorspace: Gray
Type: Bilevel <--- Bilevel means boolean
Base type: Undefined
Endianess: Undefined
Depth: 8/1-bit
Channel depth:
Gray: 1-bit
Channel statistics:
Pixels: 100
Gray:
min: 0 (0)
max: 255 (1)
mean: 130.05 (0.51)
standard deviation: 128.117 (0.502418)
kurtosis: -2.01833
skewness: -0.0394094
entropy: 0.999711
Colors: 2
Histogram:
49: ( 0, 0, 0) #000000 gray(0) <--- half the pixels are black
51: (255,255,255) #FFFFFF gray(255) <--- half are white
Colormap entries: 2
Colormap:
0: ( 0, 0, 0,255) #000000FF graya(0,1)
1: (255,255,255,255) #FFFFFFFF graya(255,1)
Related
I have a 16-bit image which I want to rescale to 8-bit while achieving a high contrast. Now I tried histogram equalization as follows:
image_equ = cv.equalizeHist(cv_image.astype(np.uint8))
But the output is super strange:
What is happening? Is the rescaling to 8-bit first maybe the problem?
cv2.equalizeHist does not support uint16 input, and cv_image.astype(np.uint8) results overflows.
The solution is using different library, or implement the equalization using NumPy.
We can find the NumPy implementation of uint8 equalization in the OpenCV documentation:
Histograms - 2: Histogram Equalization
We can adjust the code (using NumPy) for uint16 input and output:
Replace 256 with 65536 (256 = 2^8 and 65536 = 2^16).
Replace 255 with 65535.
Replace uint8 with uint16.
Assuming the original code is correct, the following should work for uint16:
hist, bins = np.histogram(img.flatten(), 65536, [0, 65536]) # Collect 16 bits histogram (65536 = 2^16).
cdf = hist.cumsum()
cdf_m = np.ma.masked_equal(cdf, 0) # Find the minimum histogram value (excluding 0)
cdf_m = (cdf_m - cdf_m.min())*65535/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint16')
# Now we have the look-up table...
img2 = cdf[img]
Complete code sample (building sample 16 bits input):
import cv2
import numpy as np
# Build sample input for testing.
################################################################################
img = cv2.imread('chelsea.png', cv2.IMREAD_GRAYSCALE) # Read sample input image.
cv2.imshow('img', img) # Show input for testing.
img = img.astype(np.uint16) * 16 + 1000 # Make the image 16 bit, but the pixels range is going to be [1000, 5080] not full range (for example).
################################################################################
#equ = cv2.equalizeHist(img) # error: (-215:Assertion failed) _src.type() == CV_8UC1 in function 'cv::equalizeHist'
# https://docs.opencv.org/4.x/d5/daf/tutorial_py_histogram_equalization.html
hist, bins = np.histogram(img.flatten(), 65536, [0, 65536]) # Collect 16 bits histogram (65536 = 2^16).
cdf = hist.cumsum()
cdf_m = np.ma.masked_equal(cdf, 0) # Find the minimum histogram value (excluding 0)
cdf_m = (cdf_m - cdf_m.min())*65535/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint16')
# Now we have the look-up table...
equ = cdf[img]
# Show result for testing.
cv2.imshow('equ', equ)
cv2.waitKey()
cv2.destroyAllWindows()
Input (before scaling to 16 bits):
Output:
I have a pan-tilt-zoom camera (changing focal length over time). There is no idea about its base focal length (e.g. focal length in time point 0). However, It is possible to track the change in focal length between frame and another based on some known constraints and assumptions (Doing a SLAM).
If I assume a random focal length (in pixel unit), for example, 1000 pixel. Then, the new focal lengths are tracked frame by frame. Would I get correct results relatively? Would the results (focal lengths) in each frame be correct up to scale to the ground truth focal length?
For pan and tilt, assuming 0 at start would be valid. Although it is not correct, The estimated values of new tili-pan will be correct up to an offset. However, I suspect the estimated focal length will not be even correct up to scale or offset.. Is it correct or not?
For a quick short answer - if pan-tilt-zoom camera is approximated as a thin lens, then this is the relation between distance (z) and focal length (f):
This is just an approximation. Not fully correct. For more precise calculations, see the camera matrix. Focal length is an intrinsic parameter in the camera matrix. Even if not known, it can be calculated using some camera calibration method such as DLT, Zhang's Method and RANSAC. Once you have the camera matrix, focal length is just a small part of it. You get many more useful things along with it.
OpenCV has an inbuilt implementation of Zhang's method. (Look at this documentation for explanations, but code is old and unusable. New up-to-date code below.) You need to take some pictures of a chess board through your camera. Here is some helper code:
import cv2
from matplotlib import pyplot as plt
import numpy as np
from glob import glob
from scipy import linalg
x,y = np.meshgrid(range(6),range(8))
world_points=np.hstack((x.reshape(48,1),y.reshape(48,1),np.zeros((48,1)))).astype(np.float32)
_3d_points=[]
_2d_points=[]
img_paths=glob('./*.JPG') #get paths of all checkerboard images
for path in img_paths:
im=cv2.imread(path)
ret, corners = cv2.findChessboardCorners(im, (6,8))
if ret: #add points only if checkerboard was correctly detected:
_2d_points.append(corners) #append current 2D points
_3d_points.append(world_points) #3D points are always the same
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(_3d_points, _2d_points, (im.shape[1],im.shape[0]), None, None)
print ("Ret:\n",ret)
print ("Mtx:\n",mtx)
print ("Dist:\n",dist)
You might want Undistortion: Correcting for Radial Distortion
# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((6*8,3), np.float32)
objp[:,:2] = np.mgrid[0:6,0:8].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.
for fname in img_paths:
img = cv2.imread(fname)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Find the chess board corners
ret, corners = cv2.findChessboardCorners(gray, (6,8),None)
# If found, add object points, image points (after refining them)
if ret == True:
objpoints.append(objp)
cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
imgpoints.append(corners)
if 'IMG_5456.JPG' in fname:
plt.figure(figsize=(20,10))
img_vis=img.copy()
cv2.drawChessboardCorners(img_vis, (6,8), corners, ret)
plt.imshow(img_vis)
plt.show()
#Calibration
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1],None,None)
# Reprojection Error
tot_error = 0
for i in range(len(objpoints)):
imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
error = cv2.norm(imgpoints[i],imgpoints2, cv2.NORM_L2)/len(imgpoints2)
tot_error += error
print ("Mean Reprojection error: ", tot_error/len(objpoints))
# undistort
mapx,mapy = cv2.initUndistortRectifyMap(mtx,dist,None,newcameramtx,(w,h),5)
dst = cv2.remap(img,mapx,mapy,cv2.INTER_LINEAR)
# crop the image
x,y,w,h = roi
dst = dst[y:y+h, x:x+w]
plt.figure(figsize=(20,10))
#cv2.drawChessboardCorners(dst, (6,8), corners, ret)
plt.imshow(dst)
plt.show()
# Reprojection Error
tot_error = 0
for i in range(len(objpoints)):
imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
error = cv2.norm(imgpoints[i],imgpoints2, cv2.NORM_L2)/len(imgpoints2)
tot_error += error
print ("Mean Reprojection error: ", tot_error/len(objpoints))
I've been using GEE to export some training patches from Sentinel-2 to be used in Python.
I could make it work, by following the GEE guide https://developers.google.com/earth-engine/tfrecord, and using the Export.image.toDrive function and then I can parse the exported TFRecord file to reconstruct my tiles.
var image_export_options = {
'patchDimensions': [366, 366],
'maxFileSize': 104857600,
// 'kernelSize': [366, 366],
'compressed': true
}
Export.image.toDrive({
image: clipped_img.select(bands.concat(['classes'])),
description: 'PatchesExport',
fileNamePrefix: 'Oros_1',
scale: 10,
folder: 'myExportFolder',
fileFormat: 'TFRecord',
region: export_area,
formatOptions: image_export_options,
})
However, when I try to specify the kernelSize in the formatOptions (that was supposed to "overlaps adjacent tiles by [kernelSize[0]/2, kernelSize[1]/2]", according to the guide) the files are exported but the '*mixer.json' doesn't reflect the increased number of patches and I am not able to iterate through the patches afterwards. The following command crashes the google colab session:
image_dataset = tf.data.TFRecordDataset(str(path/(file_prefix+'-00000.tfrecord.gz')), compression_type='GZIP')
first = next(iter(image_dataset))
first
The weird is that the problem happens only when I add the kernelSize to the formatOptions.
After some time trying to overcome this issue, I realized a not well documented behavior when one uses the kernel size to export patches from GEE.
Bundled with the exported TFRecord, there exists one xml file called mixer.
It doesn't matter if we use:
'patchDimensions': [184, 184],
'kernelSize': [1, 1], #default for no overlapping
or
'patchDimensions': [184, 184],
'kernelSize': [184, 184], #half patch overlapping
The mixer file remains the same and no mention to the kernel/overlapping size:
{'patchDimensions': [184, 184],
'patchesPerRow': 8,
'projection': {'affine': {'doubleMatrix': [10.0,
0.0,
493460.0,
0.0,
-10.0,
9313540.0]},
'crs': 'EPSG:32724'},
'totalPatches': 40}
In the second case, if we try to parse the patches using tf.io.parse_single_example(example_proto, image_features_dict), where image_features_dict equals something like:
{'B2': FixedLenFeature(shape=[184, 184], dtype=tf.float32, default_value=None),
'B3': FixedLenFeature(shape=[184, 184], dtype=tf.float32, default_value=None),
'B4': FixedLenFeature(shape=[184, 184], dtype=tf.float32, default_value=None)}
it will raise the error:
_FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.
Can't parse serialized Example. [Op:ParseExampleV2]
Instead, to parse these records which have kernelSize > 1, we have to consider patchDimentions + kernelSize as the resulting patch size, even though the mixer.xml file says on contraty. In this example, our patchSize would be 368 (original patch size + kernelSize). Be aware that for odd kernel sizes, the number to be added to the original patch size is kernelSize - 1.
I have tried to migrate a custom model to Android platform. The tensorflow version is 1.12. I used the command line recommended shown as below:
tflite_convert \
--output_file=test.tflite \
--graph_def_file=./models/test_model.pb \
--input_arrays=input_image \
--output_arrays=generated_image
to convert .pb file into tflite format.
I have checked input tensor shape of my .pb file in tensorboard:
dtype
{"type":"DT_FLOAT"}
shape
{"shape":{"dim":[{"size":474},{"size":712},{"size":3}]}}
Then I deploy tflite file upon Android, and allocate input ByteBuffer that planed to feed the model as:
imgData = ByteBuffer.allocateDirect(
4 * 1 * 712 * 474 * 3);
When I run the model on Android device the app crashed and then logcat prints like:
2019-03-04 10:31:46.822 17884-17884/android.example.com.tflitecamerademo E/AndroidRuntime: FATAL EXCEPTION: main
Process: android.example.com.tflitecamerademo, PID: 17884
java.lang.RuntimeException: Unable to start activity ComponentInfo{android.example.com.tflitecamerademo/com.example.android.tflitecamerademo.CameraActivity}: java.lang.IllegalArgumentException: Cannot convert between a TensorFlowLite buffer with 786432 bytes and a ByteBuffer with 4049856 bytes.
It's so weird since allocated ByteBuffer is exactly the product of 4 * 3 * 474 * 712 whereas tensorflow lite buffer is not the multiple of 474 or 712. I don't figure out why tflite model got a wrong shape.
Thanks in advance if anyone can give a solution.
You could visualize the TFLite model to debug what buffer sizes are actually allocated to the input tensors.
TensorFlow Lite models can be visualized using the
visualize.py
script.
If the input tensor's buffer size isn't what you expect it to be, then there might be a bug with the conversion (or with the arguments provided to tflite_convert)
Hello guys,
I also had the similar problem yesterday. I would like to mention solution which works for me.
Seems like TSLite only support exact square bitmap inputs
Like
Size 256* 256 detection working
Size 256* 255 detection not working throwing exception
And max size which is supported
257*257 should be max width and height for any bitmap input
Here is the sample code to crop and resize bitmap
private var MODEL_HEIGHT = 257
private var MODEL_WIDTH = 257
Crop bitmap
val croppedBitmap = cropBitmap(bitmap)
Created scaled version of bitmap for model input
val scaledBitmap = Bitmap.createScaledBitmap(croppedBitmap, MODEL_WIDTH, MODEL_HEIGHT, true)
https://github.com/tensorflow/examples/blob/master/lite/examples/posenet/android/app/src/main/java/org/tensorflow/lite/examples/posenet/PosenetActivity.kt#L578
Crop Bitmap to maintain aspect ratio of model input.
private fun cropBitmap(bitmap: Bitmap): Bitmap {
val bitmapRatio = bitmap.height.toFloat() / bitmap.width
val modelInputRatio = MODEL_HEIGHT.toFloat() / MODEL_WIDTH
var croppedBitmap = bitmap
// Acceptable difference between the modelInputRatio and bitmapRatio to skip cropping.
val maxDifference = 1e-5
// Checks if the bitmap has similar aspect ratio as the required model input.
when {
abs(modelInputRatio - bitmapRatio) < maxDifference -> return croppedBitmap
modelInputRatio < bitmapRatio -> {
// New image is taller so we are height constrained.
val cropHeight = bitmap.height - (bitmap.width.toFloat() / modelInputRatio)
croppedBitmap = Bitmap.createBitmap(
bitmap,
0,
(cropHeight / 2).toInt(),
bitmap.width,
(bitmap.height - cropHeight).toInt()
)
}
else -> {
val cropWidth = bitmap.width - (bitmap.height.toFloat() * modelInputRatio)
croppedBitmap = Bitmap.createBitmap(
bitmap,
(cropWidth / 2).toInt(),
0,
(bitmap.width - cropWidth).toInt(),
bitmap.height
)
}
}
return croppedBitmap
}
https://github.com/tensorflow/examples/blob/master/lite/examples/posenet/android/app/src/main/java/org/tensorflow/lite/examples/posenet/PosenetActivity.kt#L451
Thanks and Regards
Pankaj
I had changed the image dimensions from the standard 224 earlier in the model creation process to 299 for other reasons, so I just searched my Android Studio project for 224 and updated the two final references in ImageClassifier.java to 299, and I was back in business.
I use the following numpy array that hold an image which is black and white image with the following shape
print(img.shape)
(28, 112)
when I try to grayscale the image, to use it to get contours using opencv with following steps
#grayscale the image
grayed = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
#thredshold image
thresh = cv2.threshold(grayed, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
I got the following error
<ipython-input-178-7ebff17d1c18> in get_digits(img)
6
7 #grayscale the image
----> 8 grayed = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
9
10
error: C:\projects\opencv-python\opencv\modules\imgproc\src\color.cpp:11073: error: (-215) depth == 0 || depth == 2 || depth == 5 in function cv::cvtColor
the opencv errors have no information in it to be able to get what is wrong
Here is the working code for how you were trying it:
img = np.stack((img,) * 3,-1)
img = img.astype(np.uint8)
grayed = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(grayed, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
A simpler way of getting the same result is to invert the image yourself:
img = (255-img)
thresh = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)[1]
As you discovered, as you perform different operations on images, the image is required to be in different formats.
cv2.THRESH_BINARY_INV and cv2.THRESH_BINARY are designed to take a color image (and convert it to grayscale) so you need a three channel representation.
cv2.THRESH_OTSU works with grayscale images so one channel is okay for that.
Since your image was already grayscale from the start, you weren't able to convert it from color to grayscale nor did you really need to. I assume you were trying to invert the image but that's easy enough on your own (255-img).
At one point you tried to do an cv2.THRESH_OTSU with floating point values but cv2.THRESH_OTSU requires integers between 0 and 255.
If openCV had more user-friendly error messages it would really help with issues like these.