Choosing a chat-bot framework for a data science research project, and understanding the hidden costs of development and rollout - data-science

The question is about using a chat-bot framework in a research study, where one would like to measure the improvement of a rule-based decision process over time.
For example, we would like to understand how to improve the process of identifying (and treating) a medical condition using a minimal set of guided questions and patient interactions.
Doctors can formulate a medical condition as a set of work-flow rules. A possible technical approach for such a study is an app or web site where patients ask free-text questions that a predefined rule-based chat-bot answers. During the study, a doctor will monitor the collected data, improve the rules and the possible responses, and provide new responses when the work-flow reaches a dead end. We plan to collect the conversations and apply machine learning to generate an improved work-flow tree (and questions) over time. However, all data analysis and processing will be done offline; there is no intention of building a full product.
This is a low-budget academic study. The PhD student has good development skills and data science knowledge (Python) and will be accompanied by a fellow student who will work on the engineering side. One of the conversational-AI options recommended for data scientists was RASA.
I spent the last few days reading about and playing with several chat-bot solutions: RASA and Botpress. I also looked at Dialogflow and read tons of comparison material, which makes the choice more challenging.
From sources on the internet it seems that RASA might be a better fit for data science projects. However, it would be great to get a sense of the real learning curve and how fast one can expect to have a working bot, especially one whose rules have to be updated continuously.
A few things to clarify: we have data to generate the questions and are in touch with doctors to improve the quality. It seems that we need a way to present participants with multiple-choice answers (not just free text). Being on the research side, we have no need to align with any specific big provider (e.g. Google, Amazon or Microsoft) unless it has a benefit. The important considerations are time, money and flexibility. We would like to have a working approach in a few weeks (and continuously improve it); the whole experiment will run for no more than 3-4 months. We need to be able to extract all the data. We are not sure which channel is best for such a study (WhatsApp? A website? Something else?) or what complexities are involved.
Any thoughts about the challenges and considerations about dealing with chat-bots would be valuable.
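To make the setup described above concrete, here is a minimal sketch of a rule-based work-flow with multiple-choice answers, a dead-end flag, and a transcript that can be dumped for offline analysis. It does not use RASA or any other framework; the `Node` class, the `RULES` table and its questions are hypothetical placeholders for rules a doctor would author.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    """One step in the work-flow: a question with multiple-choice answers."""
    text: str
    # Maps each offered choice to the id of the next node; empty for a leaf.
    choices: dict = field(default_factory=dict)


# Hypothetical rules; in the study these would be written by the doctors.
RULES = {
    "start": Node("Do you have a fever?", {"yes": "cough", "no": "end_ok"}),
    # "no" is deliberately missing here to show how a dead end is recorded.
    "cough": Node("Do you also have a cough?", {"yes": "end_see_doctor"}),
    "end_ok": Node("No action needed for now."),
    "end_see_doctor": Node("Please book an appointment."),
}


def run_conversation(rules, answers, start="start"):
    """Walk the rule tree with a list of answers and return a transcript."""
    transcript = []
    node_id = start
    for answer in answers:
        node = rules[node_id]
        if not node.choices:
            break  # Reached a leaf; remaining answers are ignored.
        transcript.append(
            {"node": node_id, "question": node.text, "answer": answer}
        )
        if answer not in node.choices:
            # No rule covers this answer: flag it for the monitoring doctor.
            return {"transcript": transcript, "final": None, "dead_end": True}
        node_id = node.choices[answer]
    return {"transcript": transcript, "final": node_id, "dead_end": False}


def to_json_line(result):
    """Serialize one conversation so it can be appended to a log file."""
    return json.dumps(result, sort_keys=True)
```

For example, `run_conversation(RULES, ["yes", "no"])` ends in a dead end, which is the kind of record the doctor would review to extend the rules. Keeping the rules as plain data also means they can be edited without touching the code.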

Related

Do game developers build custom game engines just for a single game or a game franchise?

I am aware that game engines like Unity, Unreal and CryEngine provide almost all the tools necessary to build an AAA title. They are also the best choice if the game has a tight release date or if you're new to game development. But since they are generalist engines (meaning that they are made to fit multiple genres of games; correct me if I'm wrong), for some games (next-gen games or games that require a lot of performance) they might leave some performance on the table, which a custom engine could recover.
This brings me to my question,
Do game developers (indie game developers, large teams or even companies) still build game engines from scratch to tailor fit a game or a game franchise?
Thank You!
Big companies like Ubisoft and Rockstar built their own engines instead of using Unity or Unreal.
Rockstar uses the "Rockstar Advanced Game Engine"
and Ubisoft uses "AnvilNext".
But why?
There are many reasons to do this; I'll mention just two, quoted from #scremyCat:
the support
and the license.
Support: Highest degree of support and understanding. As they built it all, they understand all of its internals and can offer complete support. E.g. if a game needs feature X, they'll easily know whether they can implement it. Another benefit is not having to wait on external entities: if there's a game-breaking bug in the engine they can get right on it, whereas with a third-party engine, depending on the licensing agreement, this might not be possible (though they would typically license the source code anyway).
License: Licensing. As an indie developer, accepting that you might have to pay a small percentage of your revenue for licensing the engine might not be much of an issue: the amount you need to break even is unlikely to be very high, chances are you're already making a profit when your revenue reaches the level at which you pay a percentage, and your total revenue from a game isn't likely to be huge anyway, so the licensing fees may seem very reasonable. Meanwhile, a AAA game will have a much higher break-even target, and its expected revenue is most definitely in the tens to hundreds of millions, which means a large amount in licensing fees. It should be said that AAA studios usually get much better licensing deals to begin with than an indie dev does, but they are still paying huge amounts.
As for timeframe, it can take years to fully develop an engine of that scale. That is often why you'll see studios use the same version of an engine for a good cycle of games whilst working on the next version. As for what's involved: a LOT. They need to handle every platform they'll be targeting, the rendering, the physics, the AI, the audio, the input, the file system access, the asset management pipeline, the tools, etc.
How are they better than current popular engines? They aren't necessarily (to other developers), but to them with their own reasoning for doing it they are. The simplest answer for how can they be better is that when you're creating your own engine from scratch you can do whatever you want.
It should also be said that developing your own engine isn't limited to large game companies; a number of smaller developers also do this. The more popular reasons are typically that they enjoy it and that they want some functionality that isn't available in existing options. E.g. while you can create many games with Unity or Unreal, there are plenty of things which just aren't feasible or might take considerable work to make possible at all. This can be a reason for a smaller dev to make their own engine.
Yes, they absolutely do. Nintendo is a good example.

Building GIS apps from scratch?

I am a complete beginner in software and I am asking for a direction in which to research technologies for building my app. So far I just have an idea for the app: something like Zomato, but for different services. The idea of a location-based system is similar. I searched online and came to know about GIS systems. But on researching further, it seems I'd have to create a map altogether. That feels redundant, as we already have the Google Maps API.
But can I use this API to build a system on top of it?
Any tutorials or pointers in this direction would be helpful.
Also, what is the difference between GIS and GPS-based apps?
As you can see, I am not very clear on the fundamentals of GIS and GPS-based apps.
Thanks for the help.
Regarding Android, you have almost all you need by combining the platform API and the comprehensive Google Maps Android API. Regarding the latter, it's really a choice between convenience, possibly paying a licence fee to Google, and developing your own solution by aggregating free or cheaper services from elsewhere.
Most problems solved by apps are not the same problems solved by classical GIS software, since the former are more consumer-oriented (using public transportation, navigating a route, planning a trip, finding a nearby restaurant), and the latter is more specialist-oriented, typically solving larger-scale and more technical issues (detecting regions with flood risk, monitoring deforestation, calculating volumes of terrain to be bulldozed, etc.)
You should not, IMO, be discouraged by the seemingly hard technical concepts of geography and map making. Your best bet is to have a clear vision of what actual problems your app should be solving, and to study the geography topics gradually, as the need arises.
A bit of consideration on your question about GIS:
If it were created today, the GIS acronym would mean any software dealing with geographic data, be it a mobile app or a workstation software suite destined to specialized professional use.
But when it was created, the term meant almost exclusively the latter sense, and so it has a lot of tradition and cultural legacy to it, which is of course not always a good thing. Specifically (at least in my experience), the jargon and concepts used by the classic GIS community are a bit impenetrable to the newcomer, especially if she comes from the software-development field instead of the geo-sciences field.
But geographic information availability has gone from scarcity to overwhelming abundance, and so have its enabling technologies: GPS satellites, mobile computing and mobile connectivity.
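As an illustration of the consumer-style problem mentioned above (finding a nearby restaurant), here is a minimal sketch that filters a list of places by great-circle distance from the user's GPS position. It uses the haversine formula with a mean Earth radius of 6371 km and no maps API at all; the `nearby` helper and the `(name, lat, lon)` tuple layout are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius; fine for app-level accuracy.


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(places, lat, lon, radius_km):
    """Return names of places within radius_km, nearest first.

    `places` is a list of (name, lat, lon) tuples (hypothetical layout).
    """
    hits = []
    for name, plat, plon in places:
        distance = haversine_km(lat, lon, plat, plon)
        if distance <= radius_km:
            hits.append((distance, name))
    hits.sort()
    return [name for _, name in hits]
```

The position would come from the device's GPS, and the map display from an API such as Google Maps; a linear scan like this is only reasonable for small lists of places.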

Is using a geographically distributed development team a better approach for running a software startup?

It's commonly agreed that successful software development is as much about teamwork and communication as it is about individual programming expertise. Given this, one might assume that by operating a geographically distributed team you are at an immediate disadvantage to a tight-knit team all working locally.
When my startup company was founded, we couldn't afford shared office space and I was actually located in a different city to the rest of the team, so we all had to work remotely and use tools such as Basecamp, Skype and Trac to communicate. On the whole, this was really successful - we got a huge amount of quality work done in a short space of time and launched a successful product. Working remotely gave our developers the time and space they needed to focus on the job and be productive without having interruptions or enduring office politics. To me, this is a huge advantage.
Given my experience, as well as the success of software companies with distributed teams such as 37signals and StackOverflow (and I'm sure many more), I'm increasingly of the opinion that the advantages of running a distributed team outweigh those of running a centralised team, especially for start-up companies.
Would you agree?
I half agree.
Running a distributed team definitely has its disadvantages. As you pointed out in your own post, communication is a big problem. There are times, as a developer, I enjoy just bouncing ideas off other developers and swapping ideas that I may not have thought up on my own. In addition, it can be tough to get feedback or to perform code reviews (practices that I have found useful in my development experience).
With that said, I also think there is an advantage to a distributed team. The biggest is that developers tend to do better when they can focus and just develop, without worrying about being interrupted or having to attend frequent meetings. This was a huge advantage at one job I had at a smaller company.
In your specific situation, have you considered that you were so successful not because you were geographically dispersed, but because you're a small company? Small companies have an advantage in that they have a limited number of products and tend to be more focused, and, as a result, can maintain better control over their products, schedules, etc.
That's my 2 cents.
I agree that offices are quite distracting due to noise and interruptions. But the distractions that hinder you are the other side of the coin to the ability to ask people around you questions. Although I've not tried remote working for more than a few days at a time, the inability to get an answer to a quick question in 30s is the main disadvantage that I see.
Like-for-like comparisons that might give us empirical data are very hard to do, arguably practically impossible. So that gives us the licence to speculate, right?
My pet theory is that any sufficiently talented and motivated team can make most any system, method, geographical dispersion work.
I totally agree. An office environment provides mainly distractions and opportunities to waste time and look busy. A distributed team doesn't have to pay rent, they can deduct part of their own rent or mortgage from their taxes, and they can recruit talent from virtually anywhere in the world (instead of trying to find capable RoR developers in East Bumwipe, Oklahoma).
Are you a regular reader of Joel Spolsky's blog?
Joel described the centralized offices they have set up in order to increase productivity.
More than enough room for each developer, so they can walk up and down for a while whenever a bug haunts one of them. :)
Separated offices. During work hours, only the developer and the given task exist. Nothing else.
Sound-proof walls. (As far as I can remember.) Generally useful to provide full control over work space. Devs can listen to music without headphones, for example.
As you can see, FogCreek has managed to combine most advantages of remote work, while still keeping live communication as an option.
However, due to lack of teleportation, this customized and professional office is yet to solve the problem of different world-wide locations.
From personal experience I am much more productive when working remotely. I lose the sense that someone is staring over my shoulder, criticizing me for being lazy when I'm really just taking a moment to collect my thoughts.
I also appreciate not having a commute, even if I'm only saving 20 minutes each way it's a huge load off of my back, plus I don't have to dress to be in the office so I save time getting ready in the morning.
I've found that it's fairly easy to mitigate the communication issues by setting a certain time during the day for everyone to be online. We had people on the east and west coasts, so we had everyone stay online between 1 and 4 p.m. EST. Also, just making sure that everyone has each other's phone numbers was a good thing; there were many problems that could be resolved with a quick phone call.
I wish that more businesses would support remote developers. I'm in an office right now and I feel that being here is so wasteful. I could get more done in less time without the distractions involved, and would be better able to manage my time.
Pros: You can hire the person you like instead of sticking with those available in the neighborhood.
Cons: It can be difficult to communicate if your team members live in various time zones.
I think a start-up works best if the core team members are physically close. As the team grows and the product and processes mature, remote work gains traction, in my experience. During that critical first year there can't be too much communication between developers and founders.
Once the startup has real direction and good processes in place, remote working becomes very effective.
Certainly, having some developers working remotely saves real money in overhead costs and makes everyone happy, if it's possible.
In my startup a lot of our work requires direct physical interaction with expensive equipment, so we can't all be virtual. Some of us can, and our remote developers are good contributors.
I've been working for US-based companies from my country for about 4 years (as of Feb 2014). The experience has been very rewarding, and I now feel absolutely comfortable doing my job remotely, but there is a learning curve to get through, and it cannot be overlooked. So many subtleties of communication suddenly get lost when chatting over Skype or sending emails: a whole level of information carried by body language, and the sheer empathy that comes from personally knowing the person you're dealing with. Over time you learn strategies around that, but there's no denying that it is a learning process.
Also, even though having the team working in the same office is sometimes perceived as distraction-prone, in my view it also fosters a more dynamic environment, where ideas flow more freely and faster. It also encourages a "team attitude" towards problem solving, which is great for consistency.
I think the best approach, whenever possible, is a bit of both: work a few days from home, so people can focus and organize their own time, and then work a few days in the same office so that they are still part of a team, instead of islands in isolation.

Ideas for a distributed processing project?

I am looking for a project idea in distributed processing on Unix based systems. I wish to use only the C programming language. I have to finish the project in 4 months and it's a part of my course work. Can someone help me with an idea?
Cryptography problems
Distributed Ray Tracer
Chess AI (really, AI for any game)
Large Prime Number Search
Web crawler or other search mechanism
Generic Problem Solver (push out problem definition on the fly, followed by problem data).
Note on the last one:
An example would be if you have a gaming website with lots of board games that you were coming out with all the time. You don't want to have to install new clients on all your servers every time you write a new AI for a board game, so you have a program which you can send new AIs to and then after that you can just send the game data and the pushed AI will be used to solve the problem. This is best used for problems which can be broken into smaller chunks.
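The "generic problem solver" idea above can be sketched in a few lines. This is a single-process illustration in Python rather than the C the question asks for, and everything in it is an assumption: the `Worker` class stands in for a remote node, the solver is pushed as source text defining a function named `solve`, and data is split into one chunk per worker. Using `exec` on received code is unsafe for untrusted input and is used here only to show the push step.

```python
class Worker:
    """Stands in for one remote node; a real one would listen on a socket."""

    def __init__(self):
        self._solvers = {}

    def push_solver(self, name, source):
        """Install a solver from source text that defines `solve(chunk)`."""
        namespace = {}
        exec(source, namespace)  # Unsafe for untrusted code; illustration only.
        self._solvers[name] = namespace["solve"]

    def run(self, name, chunk):
        """Apply a previously pushed solver to one chunk of problem data."""
        return self._solvers[name](chunk)


def split(data, parts):
    """Split a list into at most `parts` contiguous chunks."""
    size = max(1, -(-len(data) // parts))  # Ceiling division, at least 1.
    return [data[i:i + size] for i in range(0, len(data), size)]


def solve_distributed(workers, name, source, data, combine):
    """Push the solver once, then send each worker its chunk of the data."""
    for worker in workers:
        worker.push_solver(name, source)
    chunks = split(data, len(workers))
    partials = [w.run(name, c) for w, c in zip(workers, chunks)]
    return combine(partials)


# Hypothetical problem definition, sent as text before any data is sent.
SUM_OF_SQUARES = "def solve(chunk):\n    return sum(x * x for x in chunk)\n"
```

For example, `solve_distributed([Worker() for _ in range(3)], "sumsq", SUM_OF_SQUARES, list(range(10)), sum)` pushes the definition to three workers and combines their partial sums. In a C project the pushed definition might instead be a shared object or a small interpreted description; that choice is outside this sketch.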
It is hard to answer without knowing anything about performance, the scale of the project, what you are trying to accomplish, etc. For example, is it one task or multiple tasks? Is the project just totally open?
4 months is pretty short, but maybe some kind of physics problem or math problem. Sorting or some kind of database work might be dull but beneficial.
Check out mapreduce for ideas! I was really motivated by this work, personally.
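For readers who have not seen the MapReduce model mentioned above, here is a minimal single-machine sketch of its three phases (map, shuffle, reduce) applied to word counting. A thread pool stands in for the cluster of workers, so this shows the structure of the computation rather than real distribution; the function names are chosen for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(mapped):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(item):
    """Reduce phase: collapse the values for one key into a single count."""
    key, values = item
    return key, sum(values)


def map_reduce(lines, workers=4):
    """Run map and reduce tasks on a pool that stands in for worker nodes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_words, lines))
        groups = shuffle(mapped)
        return dict(pool.map(reduce_counts, groups.items()))
```

A course project could keep this structure and replace the thread pool with processes on several Unix machines, which is where the interesting problems (task assignment, failures, moving the shuffled data) appear.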
We used distributed processing here at work, but it's such a broad field..
Yeah.
Why not write a distributed compiler? You could then present an interface for people to compile things on the fly, with each job passed to your distributed compile network. Java is probably well-suited, and you'll get to do fun things, like being very mindful of security and so on.
The BOINC project is always looking for help and is very interesting:
http://boinc.berkeley.edu/
If you want to leave your mark and change the way we search the web,
look into B-Trees.
B-Trees and offspring/variants are the workhorse of the internet.
Google uses them extensively to index the web.
Database indexes/indices are B-Tree offspring/variants.
Every LAMP system uses a database and indexes/indices.
Also, they are used extensively in distributed VLDB (Very Large DataBases)
Perhaps you can improve existing distributed databases (Cassandra and HBase)
These are lofty goals, but for me, this would leave a lasting mark
in the way Web data is processed, indexed and stored.
Write a distributed, fault tolerant, redundant network B+Tree or B*Tree.
Read Drozdek's book Data Structures and Algorithms in C++.
It's a good survey of B-Trees.
Read about skip trees
http://www.cs.huji.ac.il/~ittaia/papers/AAY-OPODIS05.pdf
Read about Efficient B-tree Based Indexing for Cloud Data Processing
http://www.comp.nus.edu.sg/~ooibc/vldb10-cgindex.pdf
Google search "Network B+Tree"
https://www.google.com/search?rlz=1C1CHKZ_enUS431US431&sourceid=chrome&ie=UTF-8&q=Network+B%2BTree

Does anybody actually use the PSP (Personal Software Process)?

I've been reading a bit about this recently but it looks to be a bit heavy. Does anybody have real world experience using it?
Are there any lightweight alternatives?
The Personal Software Process is a personal improvement process. The full-blown PSP is quite heavy and there are several forms, templates, and documents associated with it. However, a key point is that you are supposed to tailor the PSP to your specific needs.
Typically, when you are learning about the PSP (especially if you are learning it in a course), you will use the full PSP with all of its forms. However, as Watts S. Humphrey says in "PSP: A Self-Improvement Process for Software Engineers", it's important to "use a process that both works for you and produces the desired results". Even for an individual, multiple projects will probably require variations on the process in order to achieve the results you want to.
In the book I mentioned above, "PSP: A Self Improvement Process for Software Engineers", the steps that you should follow when defining your own process are:
Determine needs and priorities
Define objectives, goals, and quality criteria
Characterize the current process
Characterize the target process
Establish a strategy to develop the process
Validate the process
Enhance the process
If you are familiar with several process models, it should be fairly easy to take pieces from all of them and create a process or workflow that works on your particular project. If you want more advice, I would suggest picking up the book. There's an entire chapter dedicated to extending and modifying the PSP as well as creating your own process.
The Personal Software Process itself is a subset of the Capability Maturity Model (CMM) processes. There are no lightweight alternatives available as of now.