Neural Networks Project? [closed] - project

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I'm looking for ideas for a Neural Networks project that I could complete in about a month or so. I'm doing it for the National Science Fair, so I need something that has some curb appeal as well since it's being judged.
It doesn't necessarily have to be completely new and unique, I'm just looking for ideas, but it should be complex enough that it would impress someone who knows about the field. My first idea was to implement a spam filter of sorts, but I recently found out that NN's aren't a very good way to do it. I've already got a basic NN simulator with Genetic Algorithms, and I'm also adding the the generic back-propagation algorithms as well.
Any ideas?

Look into Numenta's Hierarchical Temporal Memory (HTM) concept. This may be slightly off topic if the expectation is of "traditional" Neural Nets, but it is also an extremely promising avenue for Artificial Intelligence.
Although Numenta introduced HTM and its associated software platform, NuPIC, almost five years ago, the first commercial product based upon this technology was released (in beta) a few weeks ago by Vitamin D. It is called Vitamin D Video and essentially turns any webcam or IP camera into a sophisticated video monitoring system, recognizing classes of items (say persons vs. cats or other animals) in the video feed.
With the proper setup, this type of application could make for an interesting display at the Science Fair, one with much "curb appeal".
To wet your appetite or even get your feet wet with HTM technology you can download NuPIC and check its various sample applications. Chances are that you may find something that meets typical criteria of both geekness and coolness for science fairs.
Generally, HTMs aim at solving problems which are simple for humans but difficult for computers; such a statement is somewhat of a generic/applicable to Neural Nets, but HTMs take this to the "next level".
Although written in C (I think) NuPIC is typically interfaced in Python, which makes it a convenient test bed for simple yet sophisticated proofs of concept applications.

You could always try to play around with a neural network and stock courses, if I had a month of spare time for a neural network implementation, thats what I would play with.

A friend of mine in college wrote a NN to play go on a 9x9 board.
I don't think it ever got very good, but I think it would be fun to try.

Look on how a bidirectional associative memory compare with other classical edit distance algorithms (Levenshtein, Damerau-Levenshtein etc) for typo correction. Also consider the articles on hebbian unlearning while training your NN - it seems that the confabulation phenomena is avoided.

I've done some works on top of NN, mainly an XML based language (Neural XML). See details here
http://amazedsaint.blogspot.com/search/label/Neural%20Network
Also, one interesting .NET Neural network project is Aforge.net - Check out that as well..

You can implement the game Cellz or create a controller for it. It was first created by Simon M Lucas. It's a nice and interesting game, and i'm sure that everyone will love it. I used it also for a school project and it turned out very ok.
You can find in that page some links to other interesting games.

How about applying it to predicting exchange rate (USD - EUR for example for sub minute trading) should be fun to show net gain of money over 1 month.
I doubt this will work for trades longer than a minute... without a lot of extra work.
I like using committee machines so why not apply it to Face-Detection in images / movies or voice print authentication.
Finally you could get it to play pleasing music and use a crowd sourcing fitness function whereby people vote for the best "musicians"

Related

Choosing a chat-bot framework for data science research project and understanding the hidden costs of the development and rollout?

The question is about using a chat-bot framework in a research study, where one would like to measure the improvement of a rule-based decision process over time.
For example, we would like to understand how to improve the process of medical condition identification (and treatment) using the minimal set of guided questions and patient interaction.
Medical condition can be formulated into a work-flow rules by doctors; possible technical approach for such study would be developing an app or web site that can be accessed by patients, where they can ask free text questions that a predefined rule-based chat-bot will address. During the study there will be a doctor monitoring the collected data and improving the rules and the possible responses (and also provide new responses when the workflow has reached a dead-end), we do plan to collect the conversations and apply machine learning to generate improved work-flow tree (and questions) over time, however the plan is to do any data analysis and processing offline, there is no intention of building a full product.
This is a low budget academy study, and the PHD student has good development skills and data science knowledge (python) and will be accompanied by a fellow student that will work on the engineering side. One of the conversational-AI options recommended for data scientists was RASA.
I invested the last few days reading and playing with several chat-bots solutions: RASA, Botpress, also looked at Dialogflow and read tons of comparison material which makes it more challenging.
From the sources on the internet it seems that RASA might be a better fit for data science projects, however it would be great to get a sense of the real learning curve and how fast one can expect to have a working bot, and the especially one that has to continuously update the rules.
Few things to clarify, We do have data to generate the questions and in touch with doctors to improve the quality, it seems that we need a way to introduce participants with multiple choices and provide answers (not just free text), being in the research side there is also no need to align with any specific big provider (i.e. Google, Amazon or Microsoft) unless it has a benefit, the important consideration are time, money and felxability, we would like to have a working approach in few weeks (and continuously improve it) the whole experiment will run for no more than 3-4 months. We do need to be able to extract all the data. We are not sure about which channel is best for such study WhatsApp? Website? Other? and what are the involved complexities?
Any thoughts about the challenges and considerations about dealing with chat-bots would be valuable.

how to determine if a picture is explicit [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Improve this question
I'm looking for a way to determine whether a picture is explicit (is Safe For Work ) or not.
I am currently looking for an API that is capable of doing it, but so far I didn't have any success.
One of the ideas I had was to use the google search API and provide a URL to a picture, and looking whether or not it is in the results when safeSearch is enabled, but it will fail on a picture that was added before the crawler got to it.
Alternatively, I'm looking for pointers regarding what to look for in an image to determine how SFW it is. Any suggestions regarding shapes, colors or patterns?
As promised, a SFW paper from Google researchers and a patent for your study procured from this blog entry.
One of my colleagues led the development of the porn classification technology at one of the largest web companies. I will share what he told me about the development of the filter.
The definition of what is explicit varying greatly among jurisdictions so what is considered explicit in the US might not be in other parts of the world and vice-versa. So models need to take into account the users origin.
A purely imaged based approach is almost impossible to use effectively at web scale. The feature space is very complex in terms of how humans judge what is explicit and what is not and developing appropriate feature extraction technology for images proved to be exceedingly difficult.
Some of the most predictive features are the text on pages that link to the images. These are among the easiest features to develop also.
Building labeled training sets is very difficult since classifying porn and other explicit content for 8 hours a day tends to take a toll on the labelers. Because of this the turn over is fairly high with almost no one lasting a year.
Getting a high accuracy from the classifiers is still very, very difficult. They worked on it with several PhD's and a very experienced team and still did not achieve the accuracy that you are probably looking for.
If you have a more constrained problem space you can probably achieve a higher accuracy. If you are using image features only the algorithm or model will probably not generalize well and will have a high false positive rate. Best of luck.
See papers:
Detection of Pornographic Digital Images
Jorge A. Marcial-Basilio, Gualberto Aguilar-Torres, Gabriel Sánchez-Pérez, L. Karina Toscano-
Medina, and Héctor M. Pérez-Meana
Pornography Detection Using Support Vector Machine
Yu-Chun Lin (林語君) Hung-Wei Tseng (曾宏偉) Chiou-Shann Fuh
Image-Based Pornography Detection
Rigan Ap-apid
De La Salle University, Manila, Philippines
You can also take some hints from existing implementations e.g.:
"The Porn Detection Stick uses advanced image analyzing algorithms that categorize images as potentially harmful by identifying facial features, flesh tone colors, image back grounds, body part shapes, and more."

Is there a list of famous software products that do and do not do testing? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I would be interested in looking at a list of projects that did and did not do unit testing, and other forms of regression testing, to see how those companies turned out.
All test infected developers know it saves them time, but it would be interesting to what correlation there is between code quality/test coverage and business success. Something objective like:
xyz corp, makes operating systems, didnt test, makes $50M
123 corp, makes operating systems, does test, makes $100M
Does anyone know of any studies done?
Microsoft commissioned this internal study not so long ago. It compared teams that did and didn't use TDD. To quote the summary:
Based on the findings of the existing studies, it can be concluded that TDD seems to improve software quality, especially when employed in an industrial context. The findings were not so obvious in the semiindustrial or academic context, but none of those studies reported on decreased quality either. The productivity effects of TDD were not very obvious, and the results vary regardless of the context of the study. However, there were indications that TDD does not necessarily decrease the developer productivity or extend the project leadtimes: In some cases, significant productivity improvements were achieved with TDD while only two out of thirteen studies reported on decreased productivity. However, in both of those studies the quality was improved.
Yes, pick up a copy of Code Complete or even Rapid Development by Steve McConnell. He cites a number of studies.
Any realistic study would have to include thousands of companies. There are far too many factors other than does/doesn't unit test that affect the bottom line. I doubt Microsoft's profit changes all that much whether or not they release an amazing OS every year or one that's as buggy as hell. Just listing a few companies is anecdotal evidence.
Perl is big on testing and regression testing.
I always associate Unit testing with Agile development (XP in particular); you might find that any link between project success and unit testing is influenced by use of agile as well.
I don't know of any surveys specifically, but I did find this just now:
http://people.engr.ncsu.edu/txie/testingresearchsurvey.htm which has around 30 llinks to stuff such as: "Qualitative methods in empirical studies of software engineering. Seaman, C.B, Software Engineering, IEEE Transactions on , Volume: 25 , Issue: 4 , July-Aug. 1999"
Not wanted to sound rude - I assume you've already done bit of a search online?
I seem to remember that Code Complete might have references to research into unit testing and project success - but I'm not sure.
Another option would be to approach some software testing companies and see if they had any useful data.

Basic skills to work as an optimiser in the gaming industry [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I'm curious about a certain job title, that of "senior developer with a specialty in optimisation." It's not the actual title but that's essentially what it would be. What would this mean in the gaming industry in terms of knowledge and skills? I would assume basic stuff like
B-trees
Path finding
Algorithmic analysis
Memory management
Threading (and related topics like thread safety, atomicity, etc)
But this is only me conjecturing. What would be the real-life (and academic) basic knowledge required for such a job?
I interviewed for such a position a few years ago at one of the Big North American game studios.
The job required a lot of deep pipeline assembly programming, arithmetic optimization algorithms (think Duff's Device, branchless ifs), compile-time computation (SWAR), meta-template programming, computation of many values at once in parallel in very large registers (I forget the name for that)... You'll need to be solid on operating system fundementals, low level system operations, linear algebra, and C++ especially templates. You'll also become very familiar with the peculiar architecture of the PlayStation3, and probably be involved in developing libraries for that environment that the company's game teams will build on top of.
Generally I concur with Ether's post; this will typically be more about low-level optimisation than algorithmic stuff. Knowing good algorithms comes in handy, but there are many cases in games where you prefer the O(N) solution over the O(logN) solution because the first is far friendlier on the cache and requires less memory management. So you need a more holistic knowledge.
Perhaps on a more general level, the job may want to know if you can do some or all of the following:
use a CPU profiler (eg. VTune, CodeAnalyst) in both sampling and call graph mode;
use graphical profilers (eg. Microsoft Pix, NVPerfHud)
write your own profiling/timer code and generate useful output with it;
rewrite functions to remove dynamic memory allocations;
reorganise and reduce data to be more cache-friendly;
reorganise data to make it more SIMD-friendly;
edit graphics shaders to use fewer and cheaper instructions;
...and more, I'm sure.
This is a lot like my job actually. Real-life knowledge that would be practical for this:
Experience in using profilers of all kinds to locate bottlenecks.
Experience and skill in determining the reason those bottlenecks exist.
Good understanding of CPU caches, virtual memory, and common bottlenecks such as load-hit-store penalties, L2 misses, floating point code, etc.
Good understanding of multithreading and both lockless and locking solutions.
Good understanding of HLSL and graphics programming, including linear algebra.
Good understanding of SIMD techniques and the specific SIMD interfaces on relevant hardware (paired singles, VMX, SSE/MMX).
Good understanding of the assembly language used on relevant hardware. If writing assembly, a good understanding of instruction pairing, branch prediction, delay slots (if applicable), and any and all applicable stalls on the target platform.
Good understanding of the compilation and linking process, binary formats used on the target hardware, and tools to manipulate all of the above (including available compiler flags and optimizations).
Every once in a while people ask how to become good at low-level optimization. There are a few good sources of info, mostly proprietary, but I think it generally comes down to experience.
This is one of those "if you got it you know it" type of things. It's hard to list out specifics, and some studios will have different criteria than others.
To put it simply, the 'Senior Developer' part means you've been around the block; you have multiple years of experience in which you've excelled and have shipped games. You should have a working knowledge of a wide range of topics, with things such as memory management high up the list.
"Specialty in Optimization" essentially means that you know how to make a game run faster. You've already spent a significant amount of time successfully optimizing games which have shipped. You should have a wide knowledge of algorithms, 3d rendering (a lot of time is spent rendering), cpu intrinsics, memory management, and others. You should also typically have an in depth knowledge of the hardware you'd be working on (optimizing PS3 can be substantially different than optimizing for PC).
This is at best a starting point for understanding. The key is having significant real world experience in the topic; at a senior level it should preferably be from working on titles that have shipped.

Given this expectations, what language or system would you choose to implement the solution? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Here are the estimates the system should handle:
3000+ end users
150+ offices around the world
1500+ concurrent users at peak times
10.000+ daily updates
4-5 commits per second
50-70 transactions per second (reads/searches/updates)
This will be internal only business application, dedicated to help shipping company with worldwide shipment management.
What would be your technology choice, why that choice and roughly how long would it take to implement it? Thanks.
Note: I'm not recruiting. :-)
So, you asked how I would tackle such a project. In the Smalltalk world, people seem to agree that Gemstone makes things scale somewhat magically.
So, what I'd really do is this: I'd start developing in a simple Squeak image, using SandstoneDB. Then, this moment would come where a single image begins being too slow.
GemStone then takes care of copying your public objects (those visible from a certain root) back and forth between all instances. You get sessions and enhanced query functionalities, plus quite a fast VM.
It shares data with C, Java and Ruby.
In fact, they have their own VM for ruby, which is also worth a look.
wikipedia manages much more demanding requirements with MySQL
Your volumes are significant but not likely to strain any credible RDBMS if programmed efficiently. If your team is sloppy (i.e., casually putting SQL queries directly into components which are then composed into larger components), you face the likelihood of a "multiplier" effect where one logical requirement (get the data necessary for this page) turns into a high number of physical database queries.
So, rather than focussing on the capacity of your RDBMS, you should focus on the capacity of your programmers and the degree to which your implementation language and environment facilitate profiling and refactoring.
The scenario you propose is clearly a 24x7x365 one, too, so you should also consider the need for monitoring / dashboard requirements.
There's no way to estimate development effort based on the needs you've presented; it's great that you've analyzed your transactions to this level of granularity, but the main determinant of development effort will be the domain and UI requirements.
Choose the technology your developers know and are familiar with. All major technologies out there will handle such requirements with ease.
Your daily update numbers vs commits do not add up. Four commits per second = 14,400 per hour.
You did not mention anything about expected database size.
In any case, I would concentrate my efforts on choosing a robust back end like Oracle, Sybase, MS etc. This choice will make the most difference in performance. The front end could either be a desktop app or WEB app depending on needs. Since this will be used in many offices around the world, a WEB app might make the most sense.
I'd go with MySQL or PostgreSQL. Not likely to have problems with either one for your requirements.
I love object-databases. In terms of commits-per-second and database-roundtrip, no relational database can hold up. Check out db4o. It's dead easy to learn, check out the examples!
As for the programming language and UI framework: Well, take what your team is good at. Dynamic languages with fewer meta-time wasting will probably save time.
There is not enough information provided here to give a proper recommendation. A little more due diligence is in order.
What is the IT culture like? Do they prefer lots of little servers or fewer bigger servers or big iron? What is their position on virtualization?
What is the corporate culture like? What is the political climate like? The open source offerings may very well handle the load but you may need to go with a proprietary vendor just because they are already used to navigating the political winds of a large company. Perception is important.
What is the maturity level of the organization? Do they already have an Enterprise Architecture team in place? Do they even know what EA is?
You've described the operational side but what about the analytical side? What OLAP technology are they expecting to use or already have in place?
Speaking of integration, what other systems will you need to integrate with?