Custom Data Reader (LMDB) in Tensorflow

Custom Data Reader (LMDB) in Tensorflow - tensorflow

In general, there are three ways I can think of for reading custom data in TF:
Native Implementation / Custom Data Reader
https://www.tensorflow.org/versions/r0.10/how_tos/new_data_formats/index.html
Python Function Wrapping
https://www.tensorflow.org/versions/r0.9/api_docs/python/script_ops.html
Placeholders
I have already implemented this succesfully. But I want an in-graph solution like (1) or (2).
Can someone elaborate on the pros and cons (mainly from performance/efficiency standpoint) the difference between (1) and (2), so I can use the queue runners.
My feeling says (1) should be the most efficient and robust way. But that solution would not be portable unless I share or PR the code and other users would have to compile. Whereas (2) and (3) are portable, right?
I have also opened a feature request 'LMDB Reading Feature' issue on GitHub that was misinterpreted and closed as a question.
UPDATE
TensorFlow not has a native reader: https://github.com/tensorflow/tensorflow/pull/9950

(2) and (3) Both suffer from Python's GIL; eventually you'll probably lock up. This implemented is therefore also slower because it's not in-graph, and will be quite hard to parallelize correctly. It's quick and easy but also suboptimal. So, go for 1 to go pro;
I have also found that (1) has two solutions:
(1A) Implement a custom op in the source.
This is the way if you want your Op to end up in the Tensorflow source-code at some point, quality permitting.
(1B) Implement a stand-alone custom op.
This turns out to be very easy and portable. You can just compile your own .cc code, and register it through Python. No rebuilding of source code required: https://www.tensorflow.org/extend/adding_an_op

Related

how to fix the problem of downloading fasttext-model300?

I'm using windows 10 and python 3.3. I tried to download fasttext_model300 to calculate soft cosine similarity between documents, but when I run my python file, it stops after arriving at this statement:
fasttext_model300 = api.load('fasttext-wiki-news-subwords-300')
There are no errors or not responding, It just stops without any reaction.
Does anybody know why it happens?
Thanks

I'd recommend against using the gensim api.load() functionality. It dynamically runs new, unversioned source code from remote servers – which is opaque in its operations & suboptimal for maintaining a secure local configuration, or debugging any issues which occur.
Instead, find the actual exact data files you trust and download them as plain data. Then, use specific library operations, like the KeyedVectors.load_word2vec_format() method, so instantiate exactly the model you need, using precise local-file paths you understand.
Following those steps may make it clearer what, if anything, is going wrong. If it doesn't, try also enabling logging at the INFO level to gather more information about what progress is made before failure (and add any new details as a comment or to your question).

python3 -m gensim.downloader --download fasttext-wiki-news-subwords-300
Try using this. Source : https://awesomeopensource.com/project/RaRe-Technologies/gensim-data

Gnuradio software source block

I'm currently trying to do some real-time signal-processing and I would like to use "gnuradio". I will be processing multiple channels of EEG which come in trough a custom interface (namely "Lab Streaming Layer"; LSL) in python.
Now my question is if there is an existing block already where you can kind of "push" samples into the signal-processing-graph during run-time? The only blocks I've found so far offer support for audio hardware, TCP-streams and files.

You will have to write your own block; that can be done in Python or C++, whatever is better for your case.
The GNU Radio Guided Tutorials (you should really read them in order from 1 to 5, at least) do explain how to do that.
Because we all know that people are lazy at reading, here's a rough preview of what you'll learn:
make a new Out-of-tree module: gr_modtool newmod sensorinterface, change into the newly generated directory: cd gr-sensorinterface
add a new source block: gr_modtool add eeg_sensor_source; the block type you'll want is "source"; you will be asked to fill in some block details.
edit the generated source file (in lib/ or python/, depending on which language you chose:
add a proper io signature: your output will probably have the size of float
edit the central work function; add code to get new samples, and copy those to the output_items buffer.
The guided tutorials are really nice!

The most flexible method is to write your own GNU Radio block, but there are several options for getting data into a flow graph without using any custom blocks. (Naming from the Python perspective.)
gnuradio.blocks.message_source, which takes data from a gnuradio.gr.msg_queue.
You can use a gnuradio.blocks.file_descriptor_source where the file descriptor is one end of a pipe.

Is it feasible to use Antlr for source code completion?

I don't know, if this question is valid since i'm not very familiar with source code parsing. My goal is to write a source code completion function for one existing programming language (Language "X") for learning purposes.
Is Antlr(v4) suitable for such a task or should the necessary AST/Parse Tree creation and parsing be done by hand, assuming no existing solutions exists?
I haven't found much information about that specific topic, except a list of compiler books, except a compiler is not what i'm after for.

The code completion in GoWorks is completely implemented using ANTLR 4. The following video shows the level of completion of this code completion engine. The code completion example runs from 5 minutes through the end of the video.
Intro to Tunnel Vision Labs' GoWorks IDE (Preview Release)
I have been working on code completion algorithms for many years, and strongly believe that there is no better solution (automated or manual) for producing a code completion solution for a new language that meets the requirements for what I would call highly-responsive code completion. If you are not interested in that level of performance or accuracy, other solutions may be easier for you to get involved with (I don't work with those personally, because I am too easily disappointed in the results).

Xtext uses ANTLR3 and has good autocomplete facilities. The problem is, it generates a seperate parser (again using antlr3) for autocomplete processing which is derived from AbstractInternalContentAssistParser. This multi-thousand line code part shows that the error recovery of ANTLR3 alone found to be insufficient by the xtext team.
Meanwhile ANTLR4 has a function parser.getExpectedTokensWithinCurrentRule() which lists possible token types for given position. It works when used in a ParseTreeListener. Remaining is semantics, scoping etc which is out of ANTLRs scope.

Is there any good tutoria or reference for writing code with Magma?

Currently I am trying to use Magma to do matrix operation on GPU, however, I found few documents about it. The only thing I can refer to is its testing program and the online generated document(here), which is not convenient to use. And the user guide seems outdated.

If you look here, getri and potri are supported.

Porting newlib to a custom ARM setup

this is my first post, and it covers something which I've been trying to get working on and off for about a year now.
Essentially it boils down to the following: I have a copy of newlib which I'm trying to get working on an LPC2388 (an ARM7TDMI from NXP). This is on a linux box using arm-elf-gcc
The question I have is that I've been looking at a lot of the tutorials talking about porting newlib, and they all talk about the stubs (like exit, open, read/write, sbrk), and I have a pretty good idea of how to implement all of these functions. But where should I put them?
I have the newlib distribution from sources.redhat.com/pub/newlib/newlib-1.18.0.tar.gz and after poking around I found "syscalls.c" (in newlib-1.18.0/newlib/libc/sys/arm) which contains all of the stubs which I have to update, but they're all filled in with rather finished looking code (which does NOT seem to work without the crt0.S, which itself does not work with my chip).
Should I just be wiping out those functions myself, and re-writing them? Or should I write them somewhere else. Should I make a whole new folder in newlib/libc/sys with the name of my "architecture" and change the target to match?
I'm also curious if there's proper etiquette on distribution of something like this after releasing it as an open source project. I currently have a script which downloads binutils, arm-elf-gcc, newlib, and gdb, and compiles them. If I am modifying files which are in the newlib directory, should I hand a patch which my script auto-applies? Or should I add the modified newlib to the repository?
Thanks for bothering to read! Following this is a more detailed breakdown of what I'm doing.
For those who want/need more info about my setup:
I'm building a ARM videogame console based loosely on the Uzebox project ( http://belogic.com/uzebox/ ).
I've been doing all sorts of things pulling from a lot of different resources as I try and figure it out. You can read about the start of my adventures here (sparkfun forums, no one responds as I figure it out on my own): forum.sparkfun.com/viewtopic.php?f=11&t=22072
I followed all of this by reading through the Stackoverflow questions about porting newlib and saw a few of the different tutorials (like wiki.osdev.org/Porting_Newlib ) but they also suffer from telling me to implements stubs without mentioning where, who, what, when, or how!

But where should I put them?
You can put them where you like, so long as they exist in the final link. You might incorporate them in the libc library itself, or you might keep that generic, and have the syscalls as a separate target specific object file or library.
You may need to create your own target specific crt0.s and assemble and link it for your target.
A good tutorial by Miro Samek of Quantum Leaps on getting GNU/ARM development up and running is available here. The examples are based on an Atmel AT91 part so you will need to know a little about your NXP device to adapt the start-up code.
A ready made Newlib porting layer for LPC2xxx was available here, but the links ot teh files appear to be broken. The same porting layer is used in Martin Thomas' WinARM project. This is a Windows port of GNU ARM GCC, but the examples included in it are target specific not host specific.
You should only need to modify the porting layer on Newlib, and since it is target and application specific, you need not (in fact probably should not) submit your code to the project.

When I was using newlib that is exactly what I did, blew away crt0.s, syscalls.c and libcfunc.c. My personal preference was to link in the replacement for crt0.s and syscalls.c (rolled the few functions in libcfunc into the syscalls.c replacement) based on the embedded application.
I never had an interest in pushing any of that work back into the distro, so cannot help you there.
You are on the right path though, crt0.S and syscalls.c are where you want to work to customize for your target. Personally I was interested in a C library (and printf) and would primarily neuter all of the functions to return 0 or 1 or whatever it took to get the function to just work and not get in the way of linking, periodically making the file I/O functions operate on linked in data in rom/ram. Basically without replacing or modifying any other files in newlib I had a fair amount of success, so you are on the right path.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas