Is tensorflow_transform a going concern for tf 2.0?

For example, will it eventually work? Does it work? What are the goals and plans? Where can we read about it?

Is tensorflow_transform a going concern for tf 2.0?
Absolutely! Development is ongoing. Issues are being actively discussed, PRs are being worked on and there have been several changes to the master branch this week.
will it eventually work? Does it work?
Yes, it works now (in general, at least). If you are encountering some specific issue, perhaps you could ask a new question describing what, specifically, isn't working for you.
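It may help to see how small a working tf.Transform preprocessing function can be. A minimal sketch, assuming a single numeric input feature named 'x' (the feature name is illustrative):

    import tensorflow_transform as tft

    def preprocessing_fn(inputs):
        # 'inputs' maps feature names to Tensors. scale_to_0_1 is backed by
        # a full-pass analyzer that finds the dataset-wide min and max
        # before scaling each value into [0, 1].
        return {'x_scaled': tft.scale_to_0_1(inputs['x'])}

Running it end to end requires an Apache Beam pipeline (tft_beam.AnalyzeAndTransformDataset); the getting-started guide on the official site walks through that.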
What are the goals and plans? Where can we read about it?
The TensorFlow team is really good at communicating plans via RFCs and doing development in the open. I am less familiar with the work on tf-transform, but all the signs are that it is developed with the same culture. Check out:
the github repo
the official site

Related

OMNeT++ with Reinforcement Learning Tools [ML]

I am currently failing to find an easy, modular framework to link OpenAI Gym, TensorFlow, or Keras with OMNeT++ in such a way that the tools can communicate with each other and support online learning.
There are tools like omnetpy and veins-gym; however, one is very rigid and not trustworthy (with no certainty that it bridges to OpenAI Gym, for example), and the other is so poorly documented that a single person can't work out how it is supposed to be incorporated into a project.
OMNeT++ being such a big project, how is it possible that it is so disconnected from the ML world like this?
On top of that, I will still need to use federated learning, so a custom makeshift solution would be even more difficult.
I found various articles that say "we have used OMNeT++ and Keras or TensorFlow", etc., but none of them shared their code, so it is kind of mysterious how they did it.
Alternatively, I could use NS3, but as far as I know it has a very steep learning curve. Some ML tools for NS3 are apparently well documented, but since I haven't tried to implement anything in NS3 with those tools, I can't know for sure. OMNeT++ was easy to learn for what I need, and switching to NS3 still seems a burden with no clear guarantees.
I would like to ask for help in both senses:
if you have links to good middleware between OMNeT++ and OpenAI Gym, Keras, or the like, and you have used it, please share it with me.
if you have experience with NS3 and ML, using ML middleware to link NS3 with OpenAI Gym, Keras, and so on, please share it with me.
I will only be able to finish my POC if I manage to use Reinforcement Learning tooling online in an OMNeT++ simulation (i.e., the agent decides at simulation runtime which actions to take).
My project is actually complex, but the POC may be simple. I am relying on these tools because I don't have enough experience to build a complex system that translates one domain into another. So any help would be appreciated.
Thank You.
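For reference, the kind of middleware being asked about usually works by exposing the simulator as a Gym environment. A purely hypothetical sketch of that pattern (the sim_connection object and its restart/apply methods are invented placeholders, not a real OMNeT++ or veins-gym API):

    import gym
    from gym import spaces

    class OmnetEnv(gym.Env):
        """Hypothetical bridge: each step() sends an action to a running
        OMNeT++ simulation (e.g. over a socket) and blocks until the
        simulator reports back an observation and a reward."""

        def __init__(self, sim_connection):
            self.sim = sim_connection  # assumed IPC handle, not a real API
            self.action_space = spaces.Discrete(4)  # example sizes only
            self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(8,))

        def reset(self):
            return self.sim.restart()  # assumed to return the initial observation

        def step(self, action):
            obs, reward, done = self.sim.apply(action)  # assumed wire protocol
            return obs, reward, done, {}

Any RL library that speaks the classic Gym interface can then train against the simulation online.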

Which model should I use for TensorFlow (contrib or models)?

For example, if I want to use resnet_v2, there are two model files in TensorFlow:
one is here, another is here. Lots of TensorFlow models are in both models/research and tensorflow/contrib.
I am very confused: which model is better? Which model should I use?
In general, tf.contrib contains code contributed mostly by the community. It is meant to contain features and contributions that eventually should get merged into core TensorFlow, but whose interfaces may still change, or which require some testing to see whether they can find broader acceptance.
The code in tf.contrib isn't supported by the TensorFlow team. It is included in the hope that it is helpful, but it might change or be removed at any time; there are no guarantees.
The models/research folder contains machine learning models implemented in TensorFlow by researchers. The models are maintained by their respective authors and have a lower chance of being deprecated than contrib code.
On the other hand, models in the main models repository are officially supported by the TensorFlow team and are generally preferred, as they have a lower chance of being deprecated in future releases. If you have a model implemented in both, you should generally avoid the contrib version, keeping future compatibility in mind. That said, the community does do some awesome stuff there, so you might find models or work not present in the main repository that would be helpful to use directly from the contrib branch.
Also note the phrase "generally avoid", since the choice is a bit application-dependent.
Hope that answers your question; comment with your doubts.
With TensorFlow 2.0 (which will come soon), tf.contrib will be removed.
Therefore, you have to start using models/research if you want your project to be up to date and still working in the coming months.
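To make the difference concrete, here is a TF 1.x-era sketch of the two places the same resnet_v2 code has shipped (the models-repo import assumes you have cloned tensorflow/models and put models/research/slim on your PYTHONPATH):

    # 1) via tf.contrib (deprecated; gone in TensorFlow 2.0):
    from tensorflow.contrib.slim.nets import resnet_v2 as resnet_v2_contrib

    # 2) via the models repository (models/research/slim on the path):
    from nets import resnet_v2 as resnet_v2_models

Given the coming removal of tf.contrib, the second path is the safer bet.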

Why is there no mention of contrib.layers.linear in the Tensorflow documentation?

I'm trying to understand someone else's simple tensorflow model and they make use of contrib.layers.linear.
However, I cannot find any information on it anywhere, and it's not mentioned in the TensorFlow documentation.
The tf.contrib.layers module has API documentation here. As you observed in your answer, the contrib APIs in TensorFlow are (especially) subject to change. The tf.contrib.layers.linear() function appears to have been removed, but you can use tf.contrib.layers.fully_connected(…, activation_fn=None) to achieve the same effect.
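For illustration, a minimal TF 1.x sketch of that replacement (the placeholder shape and output size are arbitrary assumptions):

    import tensorflow as tf  # TF 1.x only; tf.contrib no longer exists in TF 2.0

    x = tf.placeholder(tf.float32, shape=[None, 10])  # example input

    # Equivalent of the removed tf.contrib.layers.linear(x, 5):
    # a fully connected layer with no activation function applied.
    y = tf.contrib.layers.fully_connected(x, 5, activation_fn=None)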
I managed to find the answer and felt it was still worth posting here to save others from wasting their time.
"In general, tf.contrib contains contributed code. It is meant to contain features and contributions that eventually should get merged into core TensorFlow, but whose interfaces may still change, or which require some testing to see whether they can find broader acceptance.
Code in tf.contrib isn't supported by the Tensorflow team. It is included in the hope that it is helpful, but it might change or be removed at any time; there are no guarantees." source
According to what I can see in the master branch, the function linear still exists in contrib.layers. It is actually a "simple alias which removes the activation_fn parameter":
    linear = functools.partial(fully_connected, activation_fn=None)
Here is a link from the 1.0 branch (to increase link persistence).
That said, if the documentation still mentions it, the link to contrib.layers.linear does indeed seem to be broken.
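For completeness, a self-contained version of that alias (import paths as they stood in TF 1.x):

    import functools
    from tensorflow.contrib.layers import fully_connected

    # linear(x, n) then behaves as fully_connected(x, n, activation_fn=None)
    linear = functools.partial(fully_connected, activation_fn=None)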

Does anyone have any idea how to create a 2D skeleton with the Kinect depthmap?

I'm currently using a Processing Kinect library which supplies a depth map. I was wondering how I could take that and use it to create a 2D skeleton, if possible. Not looking for any code here, just a general process I could use to achieve those results.
Also, given that we've seen this in several of the Kinect games so far, would it be difficult to have multiple skeletons running at once?
Disclaimer: the reason you haven't gotten an answer to this question yet is probably that it's a current research problem. So I can't give you a direct answer, but I will try to help with some information and useful resources on the topic.
There are mainly two different approaches to creating a skeleton from a depth map. The first uses machine learning; the second is purely algorithmic.
For the machine learning one, you'd need many samples of people doing a predetermined move, and use those samples to train your favorite learning algorithm. That's the approach Microsoft took and implemented in the Xbox (source); it works really well, BUT you need millions of samples to make it reliable... quite a drawback.
The "algorithmic" approach (that is, without using a training set) can be done in many different ways and is a research problem. It's often based on modeling the possible body postures and trying to match them with the depth image received. That's the approach chosen by PrimeSense (the guys behind the Kinect depth camera technology) for their skeleton tracking tool NITE.
The OpenKinect community maintains a wiki where they list some interesting research material about this topic. You might also be interested in this thread on the OpenNI mailing list.
If you're looking for an implementation of a skeleton tracking tool, PrimeSense released their own, NITE (closed source), as part of the OpenNI framework. That's what's used in most of the videos you might have seen that involve skeleton tracking. I think it can handle up to two skeletons at the same time, but that requires confirmation.
The best solution is to use FAAST (http://projects.ict.usc.edu/mxr/faast/), which requires OpenNI. I have struggled to get OpenNI to work on my computer. I have not yet seen an approach using Code Laboratories' CL NUI.
An algorithmic approach is http://code.google.com/p/skeletonization/, but you may run into a problem because your depth map only represents surfaces, not closed objects.

Setting up a lab for developers' performance testing

Our product has earned a bad reputation in terms of performance. It's a big, 13-year-old enterprise application that needs a refresh, and specifically a boost in its performance.
We decided to address the performance problem strategically in this version. We are evaluating a few options for how to do that.
We do have experienced load test engineers equipped with the best tools on the market, but they usually get a stable release late in the version's development life cycle, so in the last few versions developers didn't have enough time to fix all of their findings. (Yes, I know we need to deliver stable versions earlier; we are working on that process as well, but it's not in my area.)
One of the directions I am pushing is to set up a lab environment installed with the nightly build so developers can test the performance impact of their code.
I'd like this environment to be constantly loaded by scripts simulating real users' behavior. On this loaded environment, each developer would have to write a specific script that tests his code (i.e., a single user's experience in a real-world environment). I'd like to generate a report that shows each iteration's impact on existing features, as well as the performance of new features.
I am a bit worried that I'm aiming too high and that it will turn out to be too complicated.
What do you think of such an idea?
Does anyone have an experience with setting up such an environment?
Can you share your experience?
It sounds like a good idea, but in all honesty, if your organisation can't get a build to the expensive load test team it has employed just for this purpose, then it will never get your idea working.
Go for the low hanging fruit first. Get a nightly build available to the performance testing team earlier in the process.
In fact, if this version is all about performance, why not have the team just use this version to address all the performance issues that came in late in the last version's iteration?
EDIT: "Don't developers have a responsibility to performance test code" was a comment. Yes, true. I personally would have every developer have a copy of YourKit java profiler (it's cheap and effective) and know how to use it. However, unfortunately performance tuning is a really, really fun technical activity and it is possible to spend a lot of time doing this when you would be better developing features.
If your developer team are repeatedly developing noticeably slow code then education on performance or better programmers is the only answer, not more expensive process.
One of the biggest boosts in productivity is an automated build system which runs overnight (this is called Continuous Integration). Errors made yesterday are caught today, early in the morning, when I'm still fresh and when I might still remember what I did yesterday (instead of several weeks or months later).
So I suggest making this happen first, because it's the very foundation for anything else. If you can't reliably build your product, you will find it very hard to stabilize the development process.
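As a sketch of how little is needed to get started, here is a hypothetical nightly runner (the make targets and log naming are placeholders for whatever your product actually uses):

    import datetime
    import subprocess
    import sys

    def nightly():
        # Build, then test; write each step's output to a dated log file
        # and stop loudly on the first failure.
        stamp = datetime.date.today().isoformat()
        for name, cmd in [("build", ["make", "clean", "all"]),
                          ("tests", ["make", "test"])]:
            result = subprocess.run(cmd, capture_output=True, text=True)
            with open("%s-%s.log" % (stamp, name), "w") as log:
                log.write(result.stdout + result.stderr)
            if result.returncode != 0:
                sys.exit("%s failed on %s; see the log" % (name, stamp))

    if __name__ == "__main__":
        nightly()  # schedule via cron (or Task Scheduler) for the overnight run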
After you have done this, you will have all the knowledge necessary to create performance tests.
One piece of advice though: Don't try to achieve everything at once. Work one step at a time, fix one issue after the other. If someone comes up with "we must do this, too", you must do the same triage as you do with any other feature request: How important is this? How dangerous? How long will it take to implement? How much will we gain?
Postpone hard but important tasks until you have sorted out the basics.
Nightly builds are the right approach to performance testing. I suggest you require scripts that run automatically each night, then record the results in a database and provide regular reports. You really need two sorts of reports (a minimal sketch of the recording and baseline check follows the list):
A graph of each metric over time. This will help you see your trends.
A comparison of each metric against a baseline. You need to know when something drops dramatically in a day or when it crosses a performance threshold.
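As a concrete illustration of that recording and baseline comparison, a minimal sketch (the schema, metric orientation, and 10% threshold are all assumptions, not prescriptions):

    import datetime
    import sqlite3

    # Hypothetical schema: one row per metric per nightly run.
    conn = sqlite3.connect("perf.db")
    conn.execute("CREATE TABLE IF NOT EXISTS results "
                 "(run_date TEXT, metric TEXT, value REAL)")

    def record(metric, value):
        conn.execute("INSERT INTO results VALUES (?, ?, ?)",
                     (datetime.date.today().isoformat(), metric, value))
        conn.commit()

    def regressions(baseline, latest, threshold=0.10):
        # baseline/latest map metric name -> value; here higher means slower,
        # so flag anything more than `threshold` worse than its baseline.
        return [m for m, v in latest.items()
                if m in baseline and v > baseline[m] * (1 + threshold)]

The time-series graphs then come straight from querying the results table by metric.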
A few other suggestions:
Make sure your machines vary in the same way your intended environment does: have both low-end and high-end machines in the pool.
Once you start measuring, never change the machines. You need to compare like to like. You can add new machines, but you can't modify any existing ones.
We built a small test bed to do sanity testing, i.e., did the app fire up and work as expected when the buttons were pushed, did the validation work, etc. Ours was a web app, and we used Watir, a Ruby-based toolkit, to drive the browser. The output from those runs was created as XML documents, and our CI tool (CruiseControl) could output the results, errors, and performance as part of each build log. The whole thing worked well and could have been scaled onto multiple PCs for proper load testing.
However, we did all that because we had more bodies than tools. There are some high-end stress test harnesses that will do everything you need. They cost, but that will be less than the time spent hand-rolling your own. Another issue we had was getting our devs to write Ruby/Watir tests; in the end that fell to one person, and the testing effort was pretty much a bottleneck because of it.
Nightly builds are excellent and lab environments are excellent, but I think you're in danger of muddling performance testing with straight-up bug testing.
Ensure your lab conditions are isolated and stable (i.e., you vary only one factor at a time, whether that's your application or a Windows update) and that the hardware is representative of your target. Remember that your benchmark comparisons will only be bulletproof within the lab.
Having test scripts written by the developers who wrote the code tends to be toxic. It doesn't help you drive out misunderstandings at implementation time (since the same misunderstanding will be in the test script), and there is limited motivation to actually find problems. Far better is to take a TDD approach and write the tests first as a group (or have a separate group write them); failing that, you can still improve the process by writing the scripts collaboratively. Hopefully you have some user stories from your design stage, and it may be possible to replay logs for real-world experience (with the app varying).