I'm using AVMutableCompositionLayerInstruction and the setCropRectangleRamp function to create a moving crop effect.
When exporting using AVAssetExportSession I set the output and renderSize to match the crop dimensions.
However the outputted video doesn't seem to follow the moving crop, but rather just outputs the center of the original video.
How do I get the encoder to encode the pixels inside the moving crop?
Figured this out. You need to use setTransformRamp(fromStart:toEnd:timeRange:) with corresponding parameters!
I have some medical images in nii.gz format that have different shapes. I want to resize them all to the same shape in order to feed them to a deep learning model. I tried using resample_img() of nibabel, but it destroys my images. I want some other function that just resizes an image to a particular shape, say (512,512,129).
Could someone please help me with this? I have been stuck on this step for quite a few days.
Maybe you can use this:
https://scikit-image.org/docs/dev/api/skimage.transform.html
I saw it in one of the papers. Here is the example in function ScaleToFixed:
https://github.com/sacmehta/3D-ESPNet/blob/master/Transforms.py
Here is how I did it. I have the volume of shape 320x320x130 (black and white so no rgb dimension). I want to make it twice as small. This worked for me:
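import nibabel as nib
import skimage.transform as skTrans
# file_path is the path to the .nii.gz volume
im = nib.load(file_path).get_fdata()
# order=1 is linear interpolation; preserve_range=True keeps the original intensity values
result1 = skTrans.resize(im, (160,160,130), order=1, preserve_range=True)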
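If you'd rather not pull in skimage, here is a minimal sketch of the same resize done with nearest-neighbour lookup in plain NumPy. This is a swapped-in technique, not what skTrans.resize does internally, and the helper name resize_nearest and the random stand-in volume are my own assumptions for illustration.

```python
import numpy as np

def resize_nearest(volume, target_shape):
    """Resize an N-D array to target_shape by nearest-neighbour index lookup."""
    index_lists = []
    for src_len, dst_len in zip(volume.shape, target_shape):
        # Map the centre of each output cell back onto the input grid.
        idx = np.floor((np.arange(dst_len) + 0.5) * src_len / dst_len).astype(int)
        index_lists.append(np.clip(idx, 0, src_len - 1))
    # np.ix_ builds an open mesh so the lookup covers every axis combination.
    return volume[np.ix_(*index_lists)]

# Small random stand-in for nib.load(file_path).get_fdata()
volume = np.random.rand(32, 32, 13)
resized = resize_nearest(volume, (16, 16, 13))
print(resized.shape)  # (16, 16, 13)
```

Nearest-neighbour does not blend values, so it is usually the safer choice for label maps (segmentation masks); for intensity images the linear interpolation in the snippet above tends to look smoother.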
You can use TorchIO:
import torchio as tio
image = tio.ScalarImage('path/to/image.nii.gz')
transform = tio.CropOrPad((512,512,129))
output = transform(image)
If you would like to keep the original field of view, you could use the Resample transform instead.
Disclaimer: I'm the main developer of TorchIO.
I'm trying to solve some simple captchas using OpenCV and pytesseract. Some of the captcha samples are:
I tried to remove the noisy dots with some filters:
import cv2
import numpy as np
import pytesseract
img = cv2.imread(image_path)
_, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
img = cv2.morphologyEx(img, cv2.MORPH_OPEN, np.ones((4, 4), np.uint8), iterations=1)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imwrite('res.png', img)
print(pytesseract.image_to_string('res.png'))
The resulting transformed images are:
Unfortunately, pytesseract recognizes only the first captcha correctly. Is there a better transformation?
Final Update:
As @Neil suggested, I tried to remove the noise by detecting connected pixels. To find them, I used a function named connectedComponentsWithStats, which detects connected pixels and assigns each group (component) a label. By finding the connected components and removing the ones with a small number of pixels, I managed to get better overall detection accuracy with pytesseract.
And here are the new resulting images:
I've taken a much more direct approach to filtering ink splotches from PDF documents. I won't share the whole thing, since it's a lot of code, but here is the general strategy I adopted:
Use Python Pillow library to get an image object where you can manipulate pixels directly.
Binarize the image.
Find all connected pixels and how many pixels are in each group of connected pixels. You can do this using the minesweeper algorithm (a flood fill), which is easy to search for.
Set some threshold value of pixels that all legitimate letters are expected to have. This will be dependent on your image resolution.
Replace all black pixels in groups below the threshold with white pixels.
Convert back to image.
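The steps above can be sketched roughly as follows. This is not the answerer's code: it works on a plain list of rows (1 = black ink, 0 = white) rather than a Pillow image so that it stays dependency-free, and the function name, the 4-neighbour connectivity and the min_size value are all assumptions for illustration.

```python
from collections import deque

def remove_small_blobs(pixels, min_size):
    """pixels: list of rows, 1 = black (ink), 0 = white. Returns a cleaned copy."""
    height, width = len(pixels), len(pixels[0])
    cleaned = [row[:] for row in pixels]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if pixels[y][x] != 1 or seen[y][x]:
                continue
            # Flood fill (breadth-first) to collect one group of connected black pixels.
            group, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and pixels[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Groups smaller than the threshold are treated as noise and whitened.
            if len(group) < min_size:
                for gy, gx in group:
                    cleaned[gy][gx] = 0
    return cleaned

page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
result = remove_small_blobs(page, min_size=3)
```

Here the 2x2 block survives and the two isolated pixels are removed. An explicit queue is used instead of recursion so that large blobs do not hit Python's recursion limit.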
Your final output image is too blurry. To enhance the performance of pytesseract you need to sharpen it.
Sharpening is not as easy as blurring, but there exist a few code snippets / tutorials (e.g. http://datahacker.rs/004-how-to-smooth-and-sharpen-an-image-in-opencv/).
Rather than chaining blurs, blur once using either a Gaussian or a median blur, and experiment with the parameters to get the amount of blur you need. You could try one method after the other, but there is no reason to chain blurs of the same method.
There is an OCR example in Python that detects the characters. Save several images, apply the filter, and train an SVM algorithm; that may help you. I trained an algorithm with only a few images, and the results were acceptable. Check this link.
Wish you luck
I know the post is a bit old, but I suggest you try this library I developed some time ago. If you have a set of labelled captchas, that service would fit you. Take a look: https://github.com/punkerpunker/captcha_solver
In README there is a section "Train model on external data" that you might be interested in.
I want to apply a warp to an image, specified by the source and destination locations of a (potentially small) number of control points, in a deep learning framework. I thought the function 'tf.contrib.image.sparse_image_warp' could do exactly what I want, but when I tried it, the warped image didn't look good.
More specifically, I want to warp the source image to the destination image using face landmarks. So, I used the following code:
warped_image, dense_flows = sparse_image_warp(source_image, source_image_landmarks, dest_image_landmarks)
And the results are here:
source image with landmark:
dest image with landmark:
warped result:
desired result generated by other method:
Am I using the function in the wrong way? Or can the function not do what I need?
Pay close attention to tf.contrib.image.sparse_image_warp: you need to supply the control points (facial landmarks in your example) in y-x coordinates rather than x-y.
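As a small sketch of that swap, assuming the landmarks are held as a NumPy array of (x, y) pairs with shape [batch, num_points, 2] (the values below are made up for illustration):

```python
import numpy as np

# Hypothetical landmarks as (x, y) pixel coordinates, shape [batch, num_points, 2].
landmarks_xy = np.array([[[30.0, 10.0], [55.0, 42.0]]], dtype=np.float32)

# Reverse the last axis so each point becomes (y, x), i.e. (row, column).
landmarks_yx = landmarks_xy[..., ::-1]
print(landmarks_yx[0, 0])  # [10. 30.]
```

Apply the same reversal to both the source and the destination landmarks before passing them to sparse_image_warp.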
I have data that I would like to save as PNGs. I need to keep the exact pixel dimensions: I don't want any inter-pixel interpolation, smoothing, up/down sizing, etc. I do want to use a colormap, though (and maybe some other features of matplotlib's imshow). As I see it, there are a couple of ways I could do this:
1) Manually roll my own colormapping. (I'd rather not do this)
2) Figure out how to make sure the pixel dimensions of the image in the figure produced by imshow are exactly correct, and then extract just the image portion of the figure for saving.
3) Use some other method which will directly give me a color-mapped array (i.e. my NxN grayscale array -> NxNx3 array, using one of matplotlib's colormaps). Then save it using another PNG save method such as scipy.misc.imsave.
How can I do one of the above? (Or is there another alternative?)
My problem arose when I was just saving the figure directly using savefig, and realized that I couldn't zoom into details. Upscaling wouldn't solve the problem, since the blurring between pixels is exactly one of the things I'm looking for - and the pixel size has a physical meaning.
EDIT:
Example:
import numpy as np
import matplotlib.pyplot as plt
X,Y = np.meshgrid(np.arange(-50.0,50,.1), np.arange(-50.0,50,.1))
Z = np.abs(np.sin(2*np.pi*(X**2+Y**2)**.5))/(1+(X/20)**2+(Y/20)**2)
plt.imshow(Z,cmap='inferno', interpolation='nearest')
plt.savefig('colormapeg.png')
plt.show()
Note that zooming in on the interactive figure gives you a very different view than trying to zoom in on the saved figure. I could up the resolution of the saved figure, but that has its own problems. I really just need the resolution fixed.
It seems you are looking for plt.imsave().
In this case,
plt.imsave("filename.png", Z, cmap='inferno')
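If you also want the color-mapped array itself (option 3 in the question), here is a minimal sketch of one way to get it. The random stand-in data, the output path in the temp directory and the Agg backend line are assumptions so the snippet runs on its own.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

Z = np.random.rand(100, 120)                   # stand-in for the data in the question
norm = Normalize(vmin=Z.min(), vmax=Z.max())   # scale the data to [0, 1]
rgba = plt.get_cmap("inferno")(norm(Z))        # float array, shape (100, 120, 4)
rgb = (rgba[..., :3] * 255).astype(np.uint8)   # drop alpha, convert to 8-bit RGB

out_path = os.path.join(tempfile.gettempdir(), "colormapped.png")
plt.imsave(out_path, rgb)                      # one image pixel per array element
```

Calling a colormap on normalized data returns RGBA values, so the saved PNG has exactly the same height and width as Z.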
I have a series of straight line segments of varying thickness, connected end-to-end to create a meandering path. Does anyone know a way to paint this as a smooth meandering line, sort of like vectorizing it? I am using QPainter. I haven't had any success finding an appropriate function in QPainterPath.
The data looks something like this:
[(QPointF, width), (QPointF, width), (QPointF, width), ... ]
Thanks!
EDIT: Added example image
I wanted to leave it open to creative responses, but I am just looking to move from linear interpolation (QPainter::drawLine()) to spline interpolation.
If I understand your question correctly...
Don't draw a line, draw a filled polygon that encloses your line data with the right thickness. Drawback: That requires calculations on your data beforehand.
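A rough sketch of that pre-calculation, in plain Python so it doesn't depend on Qt: each point is offset by half its width on both sides, along a normal estimated from its neighbours. The function name outline_polygon and the sample path are assumptions for illustration, and sharp corners would need extra care (mitering) that this does not attempt.

```python
import math

def outline_polygon(path):
    """path: list of ((x, y), width). Returns polygon vertices enclosing the path."""
    left, right = [], []
    n = len(path)
    for i, ((x, y), width) in enumerate(path):
        # Approximate the tangent at this vertex by the chord from the previous
        # point to the next one (clamped at the two ends of the path).
        (x0, y0), _ = path[max(i - 1, 0)]
        (x1, y1), _ = path[min(i + 1, n - 1)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy) or 1.0  # avoid division by zero
        nx, ny = -dy / length, dx / length  # unit normal to the tangent
        half = width / 2.0
        left.append((x + nx * half, y + ny * half))
        right.append((x - nx * half, y - ny * half))
    # Walk along one side and back along the other to close the shape.
    return left + right[::-1]

path = [((0.0, 0.0), 2.0), ((10.0, 0.0), 4.0), ((20.0, 0.0), 2.0)]
polygon = outline_polygon(path)
```

The resulting vertices can be loaded into a QPolygonF and painted with QPainter::drawPolygon() using a solid brush. To get the smooth look, you could spline-interpolate the points and widths first and feed the denser path into the same function.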