The following simple Python code detects and tracks an object by color using a webcam.
My question is: how can I use the same code with a Kinect v2 instead of a webcam?
I am using Ubuntu 16.04 (Linux).
Can anyone help with this and tell me how to use the Kinect v2 as a webcam on Linux?
import cv2
import numpy as np

cap = cv2.VideoCapture(0)

while True:
    # Take each frame
    _, frame = cap.read()
    # Convert BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Define the range of blue color in HSV
    lower_blue = np.array([110, 50, 50])
    upper_blue = np.array([130, 255, 255])
    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    # Bitwise-AND the mask and the original image
    res = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.imshow('Original', frame)
    cv2.imshow('mask', mask)
    cv2.imshow('Detect-Blue', res)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:  # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
If you are still looking for a solution, here is one. For Linux there is an open-source library called "libfreenect2", which I have been using to grab images from the Kinect v2. Once you are done with the installation, you can adapt the bundled example program "Protonect.cpp" to your needs: add your code inside its frame-grabbing "while" loop (around line 349 in the source), and it will do the job. And of course, you have to add the OpenCV headers, since you are using cv2 functionality.
By the way, I have installed the library on my laptop with Ubuntu 16.04 and on an Nvidia Jetson TK1, and both work fine. In my work I used it only to save images and create 3D models out of them; I was not doing any kind of tracking, though.
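If you would rather stay in Python instead of editing Protonect.cpp, the pylibfreenect2 bindings wrap the same library. Here is a rough, untested sketch of grabbing color frames and feeding them into the question's HSV pipeline; the class and method names follow the pylibfreenect2 examples and may vary between versions:

import cv2
import numpy as np
from pylibfreenect2 import Freenect2, SyncMultiFrameListener, FrameType

fn = Freenect2()
assert fn.enumerateDevices() > 0, "no Kinect v2 found"
device = fn.openDefaultDevice()

# Listen only for color frames
listener = SyncMultiFrameListener(FrameType.Color)
device.setColorFrameListener(listener)
device.start()

while True:
    frames = listener.waitForNewFrame()
    color = frames["color"]
    # The color frame is 1920x1080 with 4 channels (BGRX on most pipelines);
    # drop the 4th channel so OpenCV sees an ordinary BGR image
    frame = color.asarray()[:, :, :3].copy()
    listener.release(frames)
    # From here on, the HSV masking from the question applies unchanged
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array([110, 50, 50]), np.array([130, 255, 255]))
    cv2.imshow('Detect-Blue', cv2.bitwise_and(frame, frame, mask=mask))
    if cv2.waitKey(5) & 0xFF == 27:
        break

device.stop()
device.close()
cv2.destroyAllWindows()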
I can't seem to find any documentation on how to use this model.
I am trying to use it to print out the objects that appear in a video; any help would be greatly appreciated.
I am just starting out, so go easy on me.
"I am trying to use it to print out the objects that appear in a video"
I take it your problem is printing out the names of the detected objects.
I don't know how you set up Faster R-CNN trained on Open Images v4, so I will show the way with the model from TensorFlow Hub (it also runs in Google Colab; the model is listed on AI Hub).
After some digging around and a LOT of trial and error, I came up with this:
#!/home/ahmed/anaconda3/envs/TensorFlow/bin/python3.8
import sys

import imageio
import tensorflow as tf
import tensorflow_hub as hub

# sys.argv[1] takes the video path from the terminal
video_path = sys.argv[1]
# Pass the video file to imageio so it can be read frame by frame
video = imageio.get_reader(video_path)
detections = {}

# Download and extract the model (faster_rcnn/openimages_v4/inception_resnet_v2 or
# openimages_v4/ssd/mobilenet_v2) into the same folder
module_handle = "*Path to the model folder*"
detector = hub.load(module_handle).signatures['default']

# Loop over every frame in the video
for index, frame in enumerate(video):
    # Convert the frame to tf.float32, the only input format the model accepts,
    # and add a batch dimension
    image = tf.image.convert_image_dtype(frame, tf.float32)[tf.newaxis]
    # Pass the converted image to the model
    detector_output = detector(image)
    class_names = detector_output["detection_class_entities"]
    scores = detector_output["detection_scores"]
    # There may be multiple objects in the frame
    for i in range(len(scores)):
        if scores[i] > 0.3:
            # Convert from bytes to string
            name = class_names[i].numpy().decode("ascii")
            # Record the frame numbers in which each detected object appears
            if name not in detections:
                detections[name] = [index]
            else:
                detections[name].append(index)

print(detections)
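Since the script reads the video path from sys.argv[1], it is run as, e.g., python detect_objects.py my_video.mp4 (the script and video file names here are placeholders).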
There are several topics on this, but many of them are very old and no real solutions have been offered (or at least none that work for me).
I am trying various libraries to get Python to read the frames of my USB camera (DCC1545M), and they all fail with various module or DLL import errors. I'm trying Instrumental, the Thorcam API, py-hardware, micromanager...
Specifically, I would ideally love to get it working with OpenCV, because of all the useful computer-vision features that you can later apply to the image, which I am not sure the other libraries offer.
However, I encounter the same issue as everyone else: OpenCV cannot read the USB camera in the first place.
cap = cv2.VideoCapture(1)  # tried different indices
cap.isOpened()  # returns False

img_counter = 0
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while True:
    ret, frame = cap.read()  # returned frame is empty
    cv2.imshow('preview', frame)
    k = cv2.waitKey(1)
    if k % 256 == 32:  # if SPACE is pressed, take an image
        img_name = 'frame_number_{}.png'.format(img_counter)
        cv2.imwrite(img_name, frame)
        print('frame taken')
        img_counter += 1

cap.release()
cv2.destroyAllWindows()
I have installed the driver from the Thorlabs website and I have the uc480_64.dll. The camera is successfully located using the Instrumental library:

from instrumental import list_instruments, instrument
from ctypes import *

paramsets = list_instruments()  # camera found
print(paramsets)
which returns
[<ParamSet[UC480_Camera] serial=b'4102675270' model=b'C1285R12M'
id=1>]
I was wondering if anyone knows whether, in the last couple of years, OpenCV has found a way to read such USB cameras, and if so, what that way is.
Or of any other reliable method that allows further image processing on the captured frames.
PS: I posted this on Super User because apparently hardware questions are not allowed on Stack Overflow, but Super User migrated it back here, so apologies if it is off-topic here as well.
Can you communicate with the camera in its native software?
https://www.thorlabs.com/software_pages/ViewSoftwarePage.cfm?Code=ThorCam
Our lab is using "pylablib cam-control" to communicate with a variety of cameras (including Thorlabs USB ones): https://pylablib-cam-control.readthedocs.io/en/latest/
Or, if you would prefer writing your own code, pylablib includes a class for Thorlabs USB cameras (it has actually been tested with your specific camera):
https://pylablib.readthedocs.io/en/latest/devices/uc480.html#cameras-uc480
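As a rough sketch of that route (the method names follow the pylablib docs linked above, but I have not run this against your exact setup, so treat it as a starting point):

import cv2
from pylablib.devices import uc480

print(uc480.list_cameras())   # the DCC1545M should show up here
cam = uc480.UC480Camera()     # open the first detected camera
try:
    cam.set_exposure(10e-3)   # 10 ms exposure
    frame = cam.snap()        # grab a single frame as a numpy array
    # Since the frame is a plain numpy array, OpenCV functions apply directly
    cv2.imshow('uc480 frame', frame)
    cv2.waitKey(0)
finally:
    cam.close()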
Try the following code. It works with my Thorlabs DCx camera:
import cv2
import numpy as np
from instrumental.drivers.cameras import uc480

# Initialize the camera
instruments = uc480.list_instruments()
cam = uc480.UC480_Camera(instruments[0])

# Start the live acquisition
cam.start_live_video(framerate="10Hz")

while cam.is_open:
    frame = cam.grab_image(timeout='100s', copy=True, exposure_time='10ms')
    frame1 = np.stack((frame,) * 3, -1)  # replicate the single channel into a 3-channel image
    frame1 = frame1.astype(np.uint8)
    gray = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    # Now you can apply OpenCV features
    cv2.imshow('Camera', gray)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break

cam.close()
cv2.destroyAllWindows()
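One design note: grab_image() triggers a fresh capture on every pass of the loop. If your version of Instrumental provides latest_frame(), using it inside the loop may give smoother live video, since it just returns the most recent frame from the running acquisition; check your version's camera API to be sure.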
I'm trying to build a model to predict gender from handwriting, so it's basically a binary classification problem where the labels are 0 for men and 1 for women. To get better insight into how my model is behaving, I'm using the TensorBoard callback in model.fit(). However, I have two issues:
1. When I try to visualize my images in the Images dashboard in TensorBoard, all I get is grey and black boxes, even though I set write_images to True. How can this be solved? (code below and image)
2. When I try to use the Projector dashboard, I see my data without labels, so all I can see is the set of points in 3 dimensions (PCA, 3 components). Does anyone have an idea of how this can be solved? (image)
import os
from tensorflow.keras.callbacks import TensorBoard

!rm -rf logs

log_dir = "./logs/1"
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

tbCallBack = TensorBoard(log_dir=log_dir,
                         histogram_freq=1,
                         embeddings_freq=1,
                         write_graph=False,
                         write_images=True)
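For completeness, this is roughly how the callback is wired into training (the model and data names here are placeholders, not my real code):

model.fit(x_train, y_train,
          epochs=10,
          validation_data=(x_val, y_val),
          callbacks=[tbCallBack])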
I want to compare two images for similarity. Since my purpose is to match a given image against a massive collection of images, I want to run the comparisons on GPU.
I came across the tf.image.ssim and tf.image.psnr functions, but I am unable to find any working examples. Solutions in PyTorch are also appreciated. Since I don't have a good understanding of CUDA and C, I am hesitant to try writing kernels in PyCUDA.
Would it help, in terms of processing, if I read the entire image collection and stored it as TFRecords for future processing?
Any guidance or solution would be greatly appreciated. Thank you.
Edit: I am matching images of the same size only. I don't want to do a mere histogram match; I want an SSIM or PSNR implementation for image similarity, so I am assuming the matches will be similar in color, content, etc.
Check out the example on the tensorflow doc page (link):
im1 = tf.image.decode_png(tf.io.read_file('path/to/im1.png'))
im2 = tf.image.decode_png(tf.io.read_file('path/to/im2.png'))
print(tf.image.ssim(im1, im2, max_val=255))
This should work on recent versions of TensorFlow. On older (1.x) versions, tf.image.ssim returns a symbolic tensor (print will not give you a value), but you can evaluate it inside a session to get the result.
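tf.image.psnr follows the same call pattern, so covering the PSNR half of the question is a one-liner:

# PSNR between the same pair of images, in dB
print(tf.image.psnr(im1, im2, max_val=255))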
There is no implementation of PSNR or SSIM in PyTorch. You can either implement them yourself or use a third-party package, like piqa which I have developed.
Assuming you already have torch and torchvision installed, you can get it with
pip install piqa
Then, for the image comparison:
import torch
from torchvision import transforms
from PIL import Image
from piqa import PSNR, SSIM

im1 = Image.open('path/to/im1.png')
im2 = Image.open('path/to/im2.png')

transform = transforms.ToTensor()
x = transform(im1).unsqueeze(0).cuda()  # .cuda() moves the tensor to the GPU
y = transform(im2).unsqueeze(0).cuda()

psnr = PSNR()
ssim = SSIM().cuda()

print('PSNR:', psnr(x, y))
print('SSIM:', ssim(x, y))
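Note that transforms.ToTensor() scales pixel values to [0, 1], which matches the default value range piqa's metrics expect; if you feed tensors in another range, check the metrics' value_range parameter.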
I'm discovering VTK and want to use it to plot a 3D numpy array. So far I've managed to convert a numpy array to a vtk.Volume and display it, but from there I'm having a hard time getting something pretty.
I get a very blocky rendering like this:
and I would like a smooth rendering: so either this volume but smoothed, or the surface extracted from this volume, smoothed.
I've tested a bunch of VTK mappers for this volume, like vtkSmartVolumeMapper, and played around with the shading and interpolation settings, but did not get great results.
Here is my code (in Python):
import vtk
import numpy as np

npa = None  # replace with your 3D numpy array
h, w, z = npa.shape

# Importing the numpy array (comes from http://www.vtk.org/Wiki/VTK/Examples/Python/vtkWithNumpy)
dataImporter = vtk.vtkImageImport()
data_string = npa.tostring()
dataImporter.CopyImportVoidPointer(data_string, len(data_string))
dataImporter.SetDataScalarTypeToUnsignedChar()
dataImporter.SetNumberOfScalarComponents(1)
dataImporter.SetDataExtent(0, z-1, 0, w-1, 0, h-1)
dataImporter.SetWholeExtent(0, z-1, 0, w-1, 0, h-1)

# Defining a transparency function
alphaChannelFunc = vtk.vtkPiecewiseFunction()
alphaChannelFunc.AddPoint(0, 0.0)
alphaChannelFunc.AddPoint(255, 1.0)

# Defining a color function
colorFunc = vtk.vtkColorTransferFunction()
colorFunc.AddRGBPoint(255, 1.0, 1.0, 1.0)
colorFunc.AddRGBPoint(128, 0.0, 0.0, 1.0)

# Creating the volume properties
volumeProperty = vtk.vtkVolumeProperty()
volumeProperty.SetColor(colorFunc)
volumeProperty.SetScalarOpacity(alphaChannelFunc)
volumeProperty.ShadeOn()
volumeProperty.SetInterpolationTypeToLinear()

# Creating the mapper
compositeFunction = vtk.vtkVolumeRayCastCompositeFunction()
volumeMapper = vtk.vtkVolumeRayCastMapper()
volumeMapper.SetVolumeRayCastFunction(compositeFunction)
volumeMapper.SetInputConnection(dataImporter.GetOutputPort())

# Creating the volume actor
volume = vtk.vtkVolume()
volume.SetMapper(volumeMapper)
volume.SetProperty(volumeProperty)

# Creating the renderer
renderer = vtk.vtkRenderer()
renderWin = vtk.vtkRenderWindow()
renderWin.AddRenderer(renderer)
renderInteractor = vtk.vtkRenderWindowInteractor()
renderInteractor.SetRenderWindow(renderWin)

# Adding the actor
renderer.AddVolume(volume)
renderer.SetBackground(0, 0, 0)
renderWin.SetSize(400, 400)

# Launching the renderer
renderInteractor.Initialize()
renderWin.Render()
renderInteractor.Start()
I get the impression that a volume actor is not the way to go to get something pretty; maybe I should go for a vtkPolyData or something? I went through the Marching Cubes example (in C++), which seems to take a volume and extract a surface out of it, but I can't get it to work for the moment (no errors, but the output is a completely white, unresponsive window that won't close). I sketch my Python attempt at the end of this post.
I could dive deeper to try to get it working, but first I would like to get input from you, since I'm a beginner in VTK and maybe I'm handling this all wrong.
I'm using Python 2.7.12 and VTK 5.10.1 on Ubuntu 14.
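For reference, here is roughly the marching-cubes pipeline I was attempting, translated to Python; the isovalue (128) and the smoothing settings are guesses to tune, and the snippet assumes the dataImporter and renderer from the code above:

# Extract an isosurface from the imported volume
mc = vtk.vtkMarchingCubes()
mc.SetInputConnection(dataImporter.GetOutputPort())
mc.SetValue(0, 128)  # isosurface at scalar value 128

# Smooth the extracted mesh
smoother = vtk.vtkWindowedSincPolyDataFilter()
smoother.SetInputConnection(mc.GetOutputPort())
smoother.SetNumberOfIterations(30)

# Recompute normals so the shading looks smooth
normals = vtk.vtkPolyDataNormals()
normals.SetInputConnection(smoother.GetOutputPort())

mapper = vtk.vtkPolyDataMapper()
mapper.SetInputConnection(normals.GetOutputPort())
mapper.ScalarVisibilityOff()  # otherwise the surface is colored by scalars, which can render all white

actor = vtk.vtkActor()
actor.SetMapper(mapper)
renderer.AddActor(actor)  # instead of renderer.AddVolume(volume)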