In MediaPipe, is it possible to see augmented landmarks rendered in real time?

So I am using MediaPipe Holistic Solutions to extract keypoints from the body, hands, and face, and I am using the data from this extraction for my calculations just fine. The problem is, I want to check whether my data augmentation works, but I am unable to see the augmented landmarks in real time. An example of how the keypoints are extracted:
lh_arr = np.array([[result.x, result.y, result.z] for result in results.left_hand_landmarks.landmark]).flatten()
If I then do, let's say, lh_arr[10:15]*2, I can't use this new data in the draw_landmarks function, as lh_arr is not of class 'mediapipe.python.solution_base.SolutionOutputs'. Is there a way to get draw_landmarks() to use an np array instead, or can I convert the np array back into the correct format? I have tried to get the flattened array back into a dictionary of the same format as results, but it did not work. Nor can I augment the results directly, as they are unsupported operand types.
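One workaround worth sketching (untested, and not from the original thread): rebuild a NormalizedLandmarkList protobuf from the augmented array and hand that to draw_landmarks(), which accepts the protobuf directly. Here image stands in for whatever frame you are drawing on:

import mediapipe as mp
from mediapipe.framework.formats import landmark_pb2

mp_drawing = mp.solutions.drawing_utils
mp_holistic = mp.solutions.holistic

# lh_arr is the flattened, augmented (x, y, z) array from above
points = lh_arr.reshape(-1, 3)

# Rebuild a NormalizedLandmarkList protobuf from the augmented values
augmented = landmark_pb2.NormalizedLandmarkList(
    landmark=[
        landmark_pb2.NormalizedLandmark(x=float(x), y=float(y), z=float(z))
        for x, y, z in points
    ]
)

# draw_landmarks renders the rebuilt landmarks onto the current frame
mp_drawing.draw_landmarks(image, augmented, mp_holistic.HAND_CONNECTIONS)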

Related

How to load in a downloaded tfrecord dataset into TensorFlow?

I am quite new to TensorFlow, and have never worked with TFRecords before.
I have downloaded a dataset of images from the internet, and the download format was TFRecord.
This is the file structure in the downloaded dataset (the original post showed screenshots here: the top-level folder layout and, e.g., the contents of the "test" subfolder).
What I want to do is load the training, validation, and testing data into TensorFlow, in a similar way to what happens when you load a built-in dataset. E.g., you might load the MNIST dataset like this, and get arrays containing pixel data and arrays containing the corresponding image labels:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
However, I have no idea how to do so.
I know that I can use dataset = tf.data.TFRecordDataset(filename) somehow to open the dataset, but would this act on the entire dataset folder, one of the subfolders, or the actual files? If it is the actual files, would it be on the .TFRecord file? How do I use/what do I do with the .PBTXT file which contains a label map?
And even after opening the dataset, how can I extract the data and create the necessary arrays which I can then feed into a TensorFlow model?
It's mostly archaeology, plus a few tricks.
First, I'd read the README.dataset and README.roboflow files. Can you show us what's in them?
Second, .pbtxt files are text-formatted, so we may be able to understand what that file is if you just open it with a text editor. Can you show us what's in it?
The thing to remember about a TFRecord file is that it's nothing but a sequence of binary records. tf.data.TFRecordDataset('balls.tfrecord') will give you a dataset that yields those records in order.
The third step is the hard part, because here you'll have binary blobs of data, but we don't have any clues yet about how they're encoded.
It's common for TFRecord files to contain serialized tf.train.Example protos.
So it would be worth a shot to try and decode it as a tf.train.Example to see if that tells us what's inside.
# Grab the first record from the file
for record in tf.data.TFRecordDataset('balls.tfrecord'):
    break

# Try to parse it as a tf.train.Example to see what's inside
example = tf.train.Example()
example.ParseFromString(record.numpy())
print(example)
The Example object is just a representation of a dict. If you get something other than an error there, look for the dict keys and see if you can make sense of them.
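To make that concrete, here is one small way to list the keys and the kind of value each holds (the features.feature map and its oneof 'kind' come from the Example proto definition):

# Peek at the feature keys and their value kinds
for key, feature in example.features.feature.items():
    kind = feature.WhichOneof('kind')  # 'bytes_list', 'float_list', or 'int64_list'
    print(key, kind)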
Then to make a dataset that decodes them you'll want something like:
def decode(record):
    # key_dtypes maps feature names (from the printed Example) to their dtypes
    return tf.io.parse_example(
        record, {key: tf.io.RaggedFeature(dtype) for key, dtype in key_dtypes.items()})
ds = ds.map(decode)
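As a purely hypothetical end-to-end sketch (the keys below are placeholders for whatever print(example) actually revealed), the pieces wire together like this; batching before the map lets parse_example handle a vector of serialized protos at a time:

import tensorflow as tf

# Placeholder keys/dtypes -- substitute the ones your file actually contains
key_dtypes = {'image/encoded': tf.string, 'image/object/class/label': tf.int64}

def decode(record):
    return tf.io.parse_example(
        record, {key: tf.io.RaggedFeature(dtype) for key, dtype in key_dtypes.items()})

ds = tf.data.TFRecordDataset('balls.tfrecord').batch(32).map(decode)

for features in ds.take(1):
    print({key: type(value).__name__ for key, value in features.items()})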

Storing pre-processed images

I am evaluating a couple of object detection models on a data set and was planning to pre-process the data by standardizing it to zero mean and unit variance. But I don't know how to store the images once they have been pre-processed. Currently they are in JPG format, but what format can be used after I have pre-processed them? Some of the models I am evaluating are YOLOv4, YOLOv5, and SSD.
If I instead scaled the pixel values from 0-255 to 0-1, what image format could I then use?
Also, if I train the object detector on pre-processed images and then want to apply it to a video, I assume I need to somehow pre-process the video to get decent results. How would I go about doing that?
I have calculated the mean and std on my data set using the Python module cv2. I read the images using imread, which returns a numpy array. Then I subtract the mean and divide by the std. This gives me a numpy array with both negative and positive floating-point values. But when I try to save this numpy array as an image using the function imwrite(filename, array), it doesn't work. I assume it's because the image isn't allowed to contain negative values.
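One workaround (a sketch, not from this thread): image formats like JPG and PNG store unsigned integers, so they cannot hold negative floats; you could store the raw float arrays instead, e.g. with numpy, and apply the same mean/std transform to each video frame at inference time. The file names and the mean/std values below are placeholders:

import cv2
import numpy as np

# Placeholders: use the mean/std you computed over your data set
mean, std = 118.0, 57.0

img = cv2.imread('image.jpg').astype(np.float32)
standardized = (img - mean) / std   # float array, may contain negatives

# JPG/PNG can't represent negative floats; save the raw array instead
np.save('image.npy', standardized)
restored = np.load('image.npy')     # comes back unchanged

# For video, apply the same transform to each frame before inference
cap = cv2.VideoCapture('video.mp4')
ok, frame = cap.read()
if ok:
    frame_std = (frame.astype(np.float32) - mean) / std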

How do I get and use value from a tensor within a TF 2.0 Dataset map step?

I'm using the TensorFlow 2.0 alpha.
I have TFRecord files I'm reading from, each one holding a short video clip with each frame encoded as a JPEG byte string to save space:
{
    'numframes': tf.io.FixedLenFeature([], tf.int64),
    'frames': tf.io.VarLenFeature(tf.string)
}
I have a map step in my tf.data.Dataset pipeline that successfully parses each example:
def parse_tfrecord(p):
    return tf.io.parse_single_example(p, example_schema)
My next step is to read out the number of frames from numframes and run the tf.io.decode_jpeg function on each frame in frames.values[i] with i being from range(numframes):
def parse_jpegs(p):
    numframes = p['numframes']
    return tf.map_fn(tf.io.decode_jpeg, [p['frames'].values[i] for i in range(numframes)])
My dataset pipeline for completeness:
def dataset():
    dataset = tf.data.Dataset.list_files("*.tfrecord")
    dataset = tf.data.TFRecordDataset(dataset)
    dataset = dataset.shuffle(1000).repeat()
    dataset = dataset.map(parse_tfrecord)
    dataset = dataset.map(parse_jpegs)
    return dataset
If I exclude the dataset.map(parse_jpegs) line it all works alright, showing me something like {'frames': <tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7f394c285518>, 'numframes': <tf.Tensor: id=2937, shape=(), dtype=int64, numpy=25>}
(Note that the numframes tensor includes a numpy value of 25. I can get that outside my dataset pipeline with the tensor.numpy() method)
Within that map function, though, I can't call .numpy() to get the value out of the tensor, and when I print the tensor itself no value is shown, because it hasn't been evaluated yet.
What is the best way to parse all these frames within the dataset pipeline?
EDIT: The error message I'm getting is TypeError: 'Tensor' object cannot be interpreted as an integer in parse_jpegs when trying to get numframes. It makes sense to me that a tensor can't be interpreted as an int, but how can I get the value out of that tensor to set the range?
The problem I'm running into comes down to the fact that each "frames" object has a different number of frames. If I could apply tf.io.decode_jpeg to each frame in that list without needing to record the number of frames separately, I would be fine with that, but I have "numframes" here so I know how many frames need to be decoded in my "frames" list.
EDIT: I'll leave the question up for anyone else who might find it helpful, but I ended up just returning the raw byte strings and doing the decode_jpeg in a separate generator function outside the dataset API. It was much easier that way, even if it might be slower.
In my specific case, I ended up finding out that map_fn was trying to produce an output tensor of the same type as its input tensor. Here, tf.io.decode_jpeg takes in a string (of bytes) and outputs a uint8 array, which was causing problems. An extra argument, tf.map_fn(..., dtype=tf.uint8), seems to have fixed it for me! Maybe not exactly as written, since I continued tinkering with it after asking the question, but I got it working now.
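A minimal sketch of how the fixed map step might look (my reconstruction under the schema above, not the poster's exact code): map over the SparseTensor's .values directly, so numframes isn't needed at all, and tell map_fn that the output dtype differs from the input:

def parse_jpegs(p):
    # p['frames'] is a SparseTensor; .values is a 1-D tensor of JPEG byte strings.
    # dtype tells map_fn the output (uint8 image) differs from the input (string).
    return tf.map_fn(tf.io.decode_jpeg, p['frames'].values, dtype=tf.uint8)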

H2OTwoDimTable seems to be missing functionality

I discovered that I can get a collection of eigenvectors from glrm_model (an H2O Generalized Low Rank Model estimator, glrm; sorry, I can't put this in the tags) this way:
EV = glrm_model._model_json["output"]['eigenvectors']
However, the type of EV is H2OTwoDimTable, which is not very capable.
If I try to do (where M is an H2O Data Frame):
M.mult(EV)
I get the error
AttributeError: 'H2OTwoDimTable' object has no attribute 'nrows'
If I try to convert EV to a numpy matrix:
EV.as_matrix()
I get the error:
AttributeError: 'H2OTwoDimTable' object has no attribute 'as_matrix'
I can convert EV to a pandas data frame and then convert that to a numpy matrix, which is an extra step, and then do the matrix multiplication.
IMHO, it would be better if the eigenvector reference returned an H2O Data Frame.
Also, it would be good if H2OTwoDimTable could better support matrix multiplication either as a left or right operand.
And EV.as_data_frame() has no use_pandas=False option.
Here's the python code which could be modified to better support matrix type things:
https://github.com/h2oai/h2o-3/blob/master/h2o-py/h2o/two_dim_table.py
The "TwoDimTable" class is used to store lightweight tabular data in a model. I am agreement with you about using H2OFrames instead of TwoDimTables, but it's a design choice that was made a long time ago (can't change it now).
Since H2OFrames can contain non-numeric data, there is an .as_data_frame() method to convert an H2OFrame or TwoDimTable to a pandas DataFrame. So you can chain .as_data_frame().as_matrix() to get a matrix (numpy.ndarray), if that's what you're looking for. Here's an example:
import h2o
from h2o.estimators.glrm import H2OGeneralizedLowRankEstimator

h2o.init()
data = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/glrm_test/cancar.csv")

# Train a GLRM model with recover_svd=True to keep the eigenvectors
glrm = H2OGeneralizedLowRankEstimator(k=4,
                                      transform="NONE",
                                      loss="Quadratic",
                                      regularization_x="None",
                                      regularization_y="None",
                                      max_iterations=1000,
                                      recover_svd=True)
glrm.train(x=data.names, training_frame=data)

# Get the eigenvector TwoDimTable from the model
EV = glrm._model_json["output"]['eigenvectors']

# Convert to various formats
evdf = EV.as_data_frame()                # pandas.core.frame.DataFrame
evmat = evdf.as_matrix()                 # numpy.ndarray
# or directly
evmat = EV.as_data_frame().as_matrix()
If you're interested in adding a .as_matrix() method to the TwoDimTable class, you could submit a pull request on the h2o-3 repo for that. I think that would be a useful extension. There's more info about how to contribute to H2O in our contributing guide.
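For reference, a sketch of what such an extension might look like as a standalone helper (assuming, per the linked two_dim_table.py source, that cell_values holds the table's rows):

import numpy as np

def twodimtable_as_matrix(table):
    # Assumption from the linked source: 'cell_values' holds the rows
    return np.array(table.cell_values)

# usage: evmat = twodimtable_as_matrix(EV)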

Why am I getting an error when trying to plot this graph?

Everything else works except when I try to plot this graph.
That is because the type of data expected by the Waveform Graph is not what you are giving it.
You may want to use a Waveform Chart instead. Have a look at Types of Graphs and Charts. A graph expects a data set (or multiple data points) as an array or waveform, while a chart expects a single data point.
You must connect an array of numbers to the Waveform Graph.
Just add a shift register initialized with an empty array, and build a new array by appending a new value on each iteration.