Cannot open direct import of RAW from camera with crtwo2fits - fits

I open a BULB image with gphoto2, but I cannot find a way to separate the bands of the RAW image. The crtwo2fits code could convert the images into FITS format, but cr2.py flags the file as closed:
from crtwo2fits import cr2
camera_file = cr2.CR2Image(self.camera.file_get(self.folder, self.file, gp.GP_FILE_TYPE_NORMAL))
data_type = 9  # for int32 numpy array
img = cr2.CR2Image.load(camera_file)
band = cr2._getExifValue(img, data_type)

Triangulated vtp File Plot Problem - matplotlib OpenFOAM vs Paraview Cut

Maybe you can help me out with a comment or hint for my problem.
I would like to plot a 2D slice stored as a vtp file (OpenFOAM) with matplotlib as a tricontourf plot.
1.) Creating the vtp slice in Paraview and saving it as a vtp file works like a charm.
2.) Using the runtime vtp file created by cuttingPlane (libsampling, OpenFOAM) produces a weird triangle order.
What am I missing?
Best,
import numpy as np

def loadVTPFile(filename):
    import vtk
    from vtk.util.numpy_support import vtk_to_numpy
    reader = vtk.vtkXMLPolyDataReader()
    reader.SetFileName(filename)
    reader.Update()
    data = reader.GetOutput()
    points = data.GetPoints()
    # Flat connectivity array, assumed to be [3, i, j, k, 3, i, j, k, ...]
    triangles = vtk_to_numpy(data.GetPolys().GetData())
    ntri = triangles.size // 4  # number of cells
    tri = np.take(triangles, [n for n in range(triangles.size) if n % 4 != 0]).reshape(ntri, 3)
    # List the point arrays stored in the file
    n_arrays = reader.GetNumberOfPointArrays()
    for i in range(n_arrays):
        print(reader.GetPointArrayName(i))
    # Point coordinates
    X = vtk_to_numpy(points.GetData())
    x = X[:, 0]
    y = X[:, 1]
    z = X[:, 2]
    # Define the velocity components U=(u,v,w)
    U = vtk_to_numpy(data.GetPointData().GetArray('UMean'))
    u = U[:, 0]
    v = U[:, 1]
    w = U[:, 2]
    magU = np.sqrt(u**2 + v**2 + w**2)
    p = vtk_to_numpy(data.GetPointData().GetArray('pMean'))
    Ma = vtk_to_numpy(data.GetPointData().GetArray('MaMean'))
    rho = vtk_to_numpy(data.GetPointData().GetArray('rhoMean'))
    return x, y, z, u, v, w, magU, p, Ma, rho, tri
1st image: Paraview vtp slice via matplotlib
2nd image: OpenFOAM cut via libsampling
Thanks for your help
OpenFOAM vtp slice export:
cellPoint, triangulated true/false, interpolated true/false and so on...
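One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: loadVTPFile reshapes the flat connectivity array as if every cell had exactly three points. If the OpenFOAM cut contains cells with a different number of points, that reshape silently scrambles the indices. The sketch below walks a flat array in the [n, i0, ..., n, j0, ...] layout that the loader assumes and fails loudly when a non-triangle shows up. The names split_cells and triangles_only are hypothetical helpers, not part of VTK.

```python
import numpy as np


def split_cells(flat):
    """Split a flat [n, i0, ..., n, j0, ...] connectivity array into per-cell index arrays."""
    flat = np.asarray(flat)
    cells = []
    pos = 0
    while pos < flat.size:
        n = int(flat[pos])                      # number of points in this cell
        cells.append(flat[pos + 1:pos + 1 + n])  # the point indices that follow
        pos += n + 1
    return cells


def triangles_only(flat):
    """Return an (ntri, 3) index array, or raise if any cell is not a triangle."""
    cells = split_cells(flat)
    sizes = {len(c) for c in cells}
    if sizes != {3}:
        raise ValueError(f"expected only triangles, found cell sizes {sorted(sizes)}")
    return np.vstack(cells)


print(triangles_only([3, 0, 1, 2, 3, 1, 2, 3]))
```

With the loader above, this would be called on the same array that is currently reshaped, i.e. the result of vtk_to_numpy(data.GetPolys().GetData()).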

Using string output from pytesseract to do a vlookup in pandas dataframe

I'm very new to Python, and I'm trying to make a simple image to song title to BPM program. My approach is to use pytesseract to generate a string output and then use that string to do a vlookup in a dataframe created by pandas. However, it always returns zero even though the song exists in the data.
from PIL import ImageGrab
import numpy as np
import pytesseract
import pandas as pd

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

def getTitleImage(left, top, width, height):
    printscreen_pil = ImageGrab.grab((left, top, left + width, top + height))
    printscreen_numpy = np.array(printscreen_pil.getdata(), dtype='uint8') \
        .reshape((printscreen_pil.size[1], printscreen_pil.size[0], 3))
    return printscreen_numpy

# Printscreen:
titleImage = getTitleImage(x, y, w, h)
# pytesseract to string:
songTitle = pytesseract.image_to_string(titleImage)
print('Name of the song: ', songTitle)
# Importing the csv data via pandas.
songTable = pd.read_csv(r'C:\Users\leech\Desktop\songList.csv')
# A simple vlookup formula that returns the BPM of the song by taking data from the same row.
bpmSong = songTable[songTable['Song Title'] == songTitle]['BPM'].sum()
print('The BPM of the song is: ', bpmSong)
Output:
Name of the song: Macarena
The BPM of the song is: 0
However, when I hard-code the string in the songTitle variable, it works:
songTitle = 'Macarena'
print('Name of the song: ', songTitle)
songTable = pd.read_csv(r'C:\Users\leech\Desktop\songList.csv')
bpmSong = songTable[songTable['Song Title'] == songTitle]['BPM'].sum()
print('The BPM of the song is: ', bpmSong)
Output:
Name of the song: Macarena
The BPM of the song is: 103
I have checked the string generated from pytesseract: It has no extra space in the front or the back, totally identical to the forced string, but they still produce different results. What could be the problem?
I found the answer.
It is because the songTitle coming from:
songTitle = pytesseract.image_to_string(titleImage)
...is actually 'Macarena\n' instead of 'Macarena'.
The two look the same when printed, except that the former adds a new line after it.
A great lesson learned for me.
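A minimal sketch of the fix, assuming the OCR string only needs its surrounding whitespace removed before the comparison. The table below is made-up stand-in data for songList.csv (only the Macarena row comes from the question), and lookup_bpm is a hypothetical helper name.

```python
import pandas as pd

# Stand-in for pd.read_csv(...); "Example Song" and its BPM are invented.
song_table = pd.DataFrame({
    "Song Title": ["Macarena", "Example Song"],
    "BPM": [103, 120],
})


def lookup_bpm(table, raw_title):
    """Return the BPM for a title, ignoring leading/trailing whitespace, or None."""
    title = raw_title.strip()  # drops the trailing '\n' added by the OCR step
    matches = table.loc[table["Song Title"] == title, "BPM"]
    return None if matches.empty else int(matches.iloc[0])


raw = "Macarena\n"           # what image_to_string returned in the question
print(repr(raw))             # repr() makes the hidden newline visible
print(lookup_bpm(song_table, raw))
```

Printing with repr() instead of a plain print is a quick way to spot invisible characters like this. Returning None for a missing title also avoids the ambiguity of .sum(), which yields 0 when nothing matches.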

Read pdf object from S3

I am trying to create a lambda function that will access a pdf form uploaded to s3 and strip out the data entered into the form and send it elsewhere.
I am able to do this when I can download the file locally. The script below works and lets me read the data from the pdf into my pandas dataframe:
import boto3
import PyPDF2 as pypdf
import pandas as pd

s3 = boto3.resource('s3')
s3.meta.client.download_file(bucket_name, asset_key, './target.pdf')
pdfobject = open("./target.pdf", 'rb')
pdf = pypdf.PdfFileReader(pdfobject)
data = pdf.getFormTextFields()
pdf_df = pd.DataFrame(data, columns=get_cols(data), index=[0])
But with lambda I cannot save the file locally because I get a "read only filesystem" error.
I have tried using the s3.get_object() method like below:
s3_response_object = s3.get_object(
    Bucket='pdf-forms-bucket',
    Key='target.pdf',
)
pdf_bytes = s3_response_object['Body'].read()
But I have no idea how to convert the resulting bytes into an object that can be parsed with PyPDF2. The output that I need, and that PyPDF2 will produce, looks like this:
{'form1[0].#subform[0].nameandmail[0]': 'Burt Lancaster',
'form1[0].#subform[0].mailaddress[0]': '675 Creighton Ave, Washington DC',
'form1[0].#subform[0].Principal[0]': 'David St. Hubbins',
'Principal[1]': None,
'form1[0].#subform[0].Principal[2]': 'Bart Simpson',
'Principal[3]': None}
So in summary, I need to be able to read a pdf with fillable forms into memory and parse it without downloading the file, because my lambda function environment won't allow local temp files.
Solved:
This does the trick:
import boto3
from PyPDF2 import PdfFileReader
from io import BytesIO
bucket_name ="pdf-forms-bucket"
item_name = "form.pdf"
s3 = boto3.resource('s3')
obj = s3.Object(bucket_name, item_name)
fs = obj.get()['Body'].read()
pdf = PdfFileReader(BytesIO(fs))
data = pdf.getFormTextFields()
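The key step in the snippet above is wrapping the downloaded bytes in BytesIO, which gives a seekable, file-like object that lives entirely in memory. A minimal sketch of just that step, using a placeholder byte string instead of a real S3 download (the payload below is made up and is not a valid PDF):

```python
from io import BytesIO

# Placeholder standing in for obj.get()['Body'].read(); not a parseable PDF.
pdf_bytes = b"%PDF-1.4\nplaceholder body\n"

stream = BytesIO(pdf_bytes)  # in-memory file object, nothing touches disk
header = stream.read(5)      # read from it like a regular binary file
stream.seek(0)               # rewind so a parser can start from the beginning

print(header)
print(stream.seekable())
```

Because the stream supports read() and seek(), it can be handed to code that expects an open binary file, which is what PdfFileReader(BytesIO(fs)) relies on.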

How to convert all type of images to text using python tesseract

I'm trying to convert all types of images in a folder to text using python tesseract. Below is the code that I'm using. With it, only .png files are converted to .txt; other types are not converted to text.
import os
import pytesseract
import cv2
import re
import glob
import concurrent.futures
import time

def ocr(img_path):
    out_dir = "Output//"
    img = cv2.imread(img_path)
    text = pytesseract.image_to_string(img,lang='eng',config='--psm 6')
    out_file = re.sub(".png",".txt",img_path.split("\\")[-1])
    out_path = out_dir + out_file
    fd = open(out_path,"w")
    fd.write("%s" %text)
    return out_file

os.environ['OMP_THREAD_LIMIT'] = '1'

def main():
    path = input("Enter the path : ")
    if os.path.isdir(path) == 1:
        out_dir = "ocr_results//"
        if not os.path.exists(out_dir):
            os.makedirs(out_dir)
        with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
            image_list = glob.glob(path+"\\*.*")
            for img_path,out_file in zip(image_list,executor.map(ocr,image_list)):
                print(img_path.split("\\")[-1],',',out_file,', processed')

if __name__ == '__main__':
    start = time.time()
    main()
    end = time.time()
    print(end-start)
How can I convert all types of image files to text? Please help me with the above code.
There is a bug in the ocr function.
First of all, the following line does convert every type of image file to text:
text = pytesseract.image_to_string(img,lang='eng',config='--psm 6')
However, the next chunk of code does the following:
- Replaces the .png extension in the filename with .txt using a regex
- Builds a new path from the resulting filename
- Writes the OCR output to the file at that path
out_file = re.sub(".png",".txt",img_path.split("\\")[-1])
out_path = out_dir + out_file
fd = open(out_path,"w")
fd.write("%s" %text)
In other words, all types of image files are converted, but not all results are written back correctly. The regex only replaces .png with .txt and assigns the result to out_file. When there is no .png (other image types), the variable keeps the original filename (e.g. sample.jpg), so the next lines write the OCR text to a file in the output directory that still carries the image extension instead of .txt.
One way to fix this is to add all the image formats you want to cover to the regex, escaping the dot and anchoring the pattern at the end of the name.
For example,
out_file = re.sub(r"\.(png|jpg|bmp|tiff)$",".txt",img_path.split("\\")[-1])
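As an alternative to listing extensions in the regex, the output name can be derived from the filename stem, which works for any extension. A minimal sketch, assuming Windows-style paths as in the question; text_output_name is a hypothetical helper, not part of pytesseract or OpenCV.

```python
from pathlib import PureWindowsPath


def text_output_name(img_path):
    """Return '<stem>.txt' for an image path, whatever its extension."""
    # PureWindowsPath splits on both backslashes and forward slashes,
    # and .stem drops only the final suffix.
    return PureWindowsPath(img_path).stem + ".txt"


for sample in ("C:\\scans\\page1.jpg", "scans/page2.tiff", "page3.PNG"):
    print(sample, "->", text_output_name(sample))
```

Inside ocr this would replace the re.sub line, for instance out_file = text_output_name(img_path).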

OpenCv_Python - Convert Frame Sequence To a Video

I am new to OpenCV with Python and am currently working on a project that uses it. I have a video data set folder named "VideoDataSet/dynamicBackground/canoe/input" that stores a sequence of image frames, and I would like to convert that sequence of frames into a video. However, I get an error when I run the program. I have tried various codecs, but it still gives me the same error. Can anyone shed some light on what might be wrong? Thank you.
This is my sample code:
import cv2
import numpy as np
import os
import glob as gb

filename = "VideoDataSet/dynamicBackground/canoe/input"
img_path = gb.glob(filename)
videoWriter = cv2.VideoWriter('test.avi', cv2.VideoWriter_fourcc(*'MJPG'),
                              25, (640,480))
for path in img_path:
    img = cv2.imread(path)
    img = cv2.resize(img,(640,480))
    videoWriter.write(img)
print ("you are success create.")
This is the error:
cv2.error: OpenCV(3.4.1) D:\Build\OpenCV\opencv-3.4.1\modules\imgproc\src\resize.cpp:4044: error: (-215) ssize.width > 0 && ssize.height > 0 in function cv::resize
(Note: the problem occurs at the line img = cv2.resize(img,(640,480)))
It is returning this error because you are trying to resize the directory entry! You need to put:
filename = "VideoDataSet/dynamicBackground/canoe/input/*"
so that the glob matches all the files in the folder. The error indicates that the source image had either zero width or zero height, which is what cv2.resize sees when cv2.imread cannot read the path and returns None. Adding:
print( img_path )
after the glob call shows that it was only returning the directory entry itself.
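The difference between globbing the folder and globbing its contents can be reproduced without OpenCV. The sketch below builds a throwaway folder with two empty files (the names are invented for the demonstration) and compares both patterns. list_frames is a hypothetical helper, and sorting is added on the assumption that frame order matters for the video, since glob does not guarantee any order.

```python
import glob
import os
import tempfile


def list_frames(folder):
    """Return the sorted paths of the regular files directly inside folder."""
    pattern = os.path.join(folder, "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


with tempfile.TemporaryDirectory() as folder:
    # Create two empty stand-in frame files, deliberately out of order.
    for name in ("in000002.jpg", "in000001.jpg"):
        open(os.path.join(folder, name), "w").close()

    bare = glob.glob(folder)            # pattern without wildcard: the folder itself
    frames = list_frames(folder)        # pattern with '*': the files inside
    frame_names = [os.path.basename(p) for p in frames]

print(len(bare), frame_names)
```

Passing each path from list_frames to cv2.imread, and skipping any result that is None, would avoid handing an empty image to cv2.resize.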
You subsequently discovered that although it was now generating a file, it was corrupted. This is because you are incorrectly specifying the codec. Replace your fourcc parameter with this:
cv2.VideoWriter_fourcc('M','J','P','G')
You can try this:
img_path = gb.glob(filename)
videoWriter = cv2.VideoWriter('frame2video.avi', cv2.VideoWriter_fourcc(*'MJPG'), 25, (640,480))
for path in img_path:
    img = cv2.imread(path)
    img = cv2.resize(img,(640,480))
    videoWriter.write(img)
# Release the writer so the output file is finalized
videoWriter.release()