What Data Science method can I use for this specific sleep schedule problem of mine? - data-science

I have a little personal project of mine, in which I want to find out what my optimal sleep related behaviors are, that influence my sleep quality and my sleep through quality.
My explanatory variables are "Last Meal", "Turn Down", "Fell Asleep", "Woke Up" (all time variables) and my dependent variables are sleep quality and sleep through quality (both scale of 1 to 10).
What Data Science, Machine Learning or statistical method might be best for this problem?
Bonus question: can it be done in excel? Else, what are the methods/packages for it in Python or R? Excel Table

Related

Taguchi methods - number of experiments - reg

I want to conduct an experiment with ten factors ( factors like costs and capacities) to know the influence of each factor on the optimum value of an optimization problem. I want to know the number of levels required for each factor and the number of experiments required with factor levels for each experiment.
Cost of experiment is not a matter because these are experiments are going to be run using a software, but the time required to run is important because if large number of experiments are required the time will be more.
please throw light.
You have Minitab as a tag on this question; that said, Minitab has an excellent capability to help in planning DOEs. Go to the Assistant menu, then DOE, then Plan and Create...
If you click "Create Modeling Design" in the optimization experiment path, it will give you the setup screen where you can specify response, optimization objective, factors, etc. Notice that the design is a typical factorial-type design where low/high values are used in each experimental run. This should give good results, but just to let you know there are other design types that can be even better given the circumstances of each situation. For instance, you mentioned these are software experiments -- there is a nice design called a "Space Filling Design" which creates factor design points (not necessarily at low/high values) to optimally fill the design search space. These designs are often used for computer simulation experiments.
An excellent text on DOE is https://www.amazon.com/Design-Analysis-Experiments-Douglas-Montgomery/dp/1118146921

any path-finding tricks and strategies for mobile game (e.g rts) on less powerful device?

i'm developing a 2d game , rts game, is sort of like COC (Clash of Clans). cool mobile game,huh. but i run into some problem with path-finding, as usual,i do path-finding algorithm once every agent was placed somewhere in screen by finger-touch,but in some case, this incur performance penalty, and your mobile phone will be very hot as your agents increase suddenly and simultaneously.
actually,no matter what path-finding i use,e.g a*,dijkstra, or something special(maybe optimal),which is always time-comsuming process throughout the whole game loop ,especially massive agents on less powerfull mobile cpu. as far as i know, some game like this, shortest path is not the focus (will people care about the path agent walk through intentionally?) instead of efficient and naturally path-finding. so my mind come up with some solutions,maybe impractical.
solution 1 : use some cheaper path-finding algorithm, could be graph related or somethingelse because shortest path doesn't matter.
solution 2 : put some limits on ai module to process agent for path-finding, e.g upper limit to path-finding algorithm calls at interval,that is,just one or two agents of the those agents got planning, let rest of them plann after several game frames. as you know ,its drawback is obvious.
the above is what i thougt. hope your game dev disciplined guys give me brilliant idea , tricks, i'll appricate. thank you very much.
EDIT:
here is my related pseudo code,and procedure cresspond to my game logic.
//inside logic thread
procedure putonagent
if (need to put agent on world space)
//do standard a* path-finding for an agent
path_list=do_aStar_path_finding(attacktargetpos,startpos);
and then enqueue path_list;
......
end
the path_list's queue finally used by visual agents for stepping forward. any hints?
Look up "Hierarchical pathfinding" Say you're driving to a city far away, you don't plan the entire path before you get in the car!
Pathfinding is usually done in steps, like it's not one function call, after N iterations it'll return (and indicate it's not complete) so it can be run at the next available time. Basically rather than a function with locals think operator() and state variables as members of a class.
To make it fast you can make the heuristic crap with A* pathfinding, suppose I use a heuristic of 10* the distance-as-the-crow-flies, it may not find the shortest path but it will have a strong preference for heading towards the target rather than "fanning out" and exploring further around the closed region.

Profiling a VxWorks system

We've got a fairly large application running on VxWorks 5.5.1 that's been developed and modified for around 10 years now. We have some simple home-grown tools to show that we are not using too much memory or too much processor, but we don't have a good feel for how much headroom we actually have. It's starting to make it difficult to do estimates for future enhancements.
Does anybody have any suggestions on how to profile such a system? We've never had much luck getting the Wind River tools to work.
For bonus points: the other complication is that our system has very different behaviors at different times; during start-up it does a lot of stuff, then it sits relatively idle except for brief bursts of activity. If there is a profiler with some programmatic way to have to record state information, I think that'd be very useful too.
FWIW, this is compiled with GCC and written entirely in C.
I've done a lot of performance tuning of various kinds of software, including embedded applications. I won't discuss memory profiling - I think that is a different issue.
I can only guess where the "well-known" idea originated that to find performance problems you need to measure performance of various parts. That is a top-down approach, similar to the way governments try to control budget waste, by subdividing. IMHO, it doesn't work very well.
Measurement is OK for seeing if what you did made a difference, but it is poor at telling you what to fix.
What is good at telling you what to fix is a bottom-up approach, in which you examine a representative sample of microscopic units of what is being spent, and finding out the full explanation of why each one is being spent. This works for a simple statistical reason. If there is a reason why some percent (for example 40%) of samples can be saved, on average 40% of samples will show it, and it doesn't require a huge number of samples. It does require that you examine each sample carefully, and not just sort of aggregate them into bigger bunches.
As a historical example, this is what Harry Truman did at the outbreak of the U.S. involvement in WW II. There was terrific waste in the defense industry. He just got in his car, drove out to the factories, and interviewed the people standing around. Then he went back to the U.S. Senate, explained what the problems were exactly, and got them fixed.
Maybe this is more of an answer than you wanted. Specifically, this is the method I use, and this is a blow-by-blow example of it.
ADDED: I guess the idea of finding-by-measuring is simply natural. Around '82 I was working on an embedded system, and I needed to do some performance tuning. The hardware engineer offered to put a timer on the board that I could read (providing from his plenty). IOW he assumed that finding performance problems required timing. I thanked him and declined, because by that time I knew and trusted the random-halt technique (done with an in-circuit-emulator).
If you have the Auxiliary Clock available, you could use the SPY utility (configurable via the config.h file) which does give you a very rough approximation of which tasks are using the CPU.
The nice thing about it is that it does not require being attached to the Tornado environment and you can use it from the Kernel shell.
Otherwise, btpierre's suggestion of using taskHookAdd has been used successfully in the past.
I've worked on systems that have had luck using locally-built monitoring utilities based on taskSwitchHookAdd and related functions (delete hook, etc).
"Simply" use this to track the number of ticks a given task runs. I realize that this is fairly gross scale information for profiling, but it can be useful depending on your needs.
To see how much cpu% each task is using, calculate the percentage of ticks assigned to each task.
To see how much headroom you have, add a lowest priority "idle" task that just does "while(1){}", and see how much cpu% it is assigned to it. Roughly speaking, that's your headroom.

Does anybody actually use the PSP (Personal Software Process)?

I've been reading a bit about this recently but it looks to be a bit heavy. Does anybody have real world experience using it?
Are there any light weight alternatives?
The Personal Software Process is a personal improvement process. The full-blown PSP is quite heavy and there are several forms, templates, and documents associated with it. However, a key point is that you are supposed to tailor the PSP to your specific needs.
Typically, when you are learning about the PSP (especially if you are learning it in a course), you will use the full PSP with all of its forms. However, as Watts S. Humphrey says in "PSP: A Self-Improvement Process for Software Engineers", it's important to "use a process that both works for you and produces the desired results". Even for an individual, multiple projects will probably require variations on the process in order to achieve the results you want to.
In the book I mentioned above, "PSP: A Self Improvement Process for Software Engineers", the steps that you should follow when defining your own process are:
Determine needs and priorities
Define objectives, goals, and quality criteria
Characterize the current process
Characterize the target process
Establish a strategy to develop the process
Validate the process
Enhance the process
If you are familiar with several process models, it should be fairly easy to take pieces from all of them and create a process or workflow that works on your particular project. If you want more advice, I would suggest picking up the book. There's an entire chapter dedicated to extending and modifying the PSP as well as creating your own process.
The Personal Software Process itself is a subset of the Capability Maturity Model (CMM) processes. There are no light weight alternatives available as of now.

Is it premature optimization to develop on slow machines?

We should develop on slow boxen because it forces us to optimize early.
Randall Hyde points out in The Fallacy of Premature Optimization, there are plenty of misconceptions around the Hoare quote:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
In particular, even though machines these days scream compared with those in Hoare's day, it doesn't mean "optimization should be avoided." So does my respected colleague have a point when he suggests that we should develop on boxes of modest tempo? The idea is that performance bottlenecks are more irritating on a slow box and so they are likely to receive attention.
This should be community wiki since it's pretty subjective and there's no "right" answer.
That said, you should develop on the fastest machine available to you. Yes, anything slower will introduce irritation and encourage you to fix the slowdowns, but only at a very high price:
Your productivity as a programmer is directly related to the number of things you can hold in your head, and anything which slows down your process or impedes you at all lengthens the amount of time you have to hold those ideas in short term-memory, making you more likely to forget them, and have to go re-learn them.
Waiting for a program to compile allows the stack of bugs, potential issues, and fixes to drop out of your head as you get distracted. Waiting for a dialog to load, or a query to finish interrupts you similarly.
Even if you ignore that effect, you've still got the truth of the later statement - early optimization will leave you chasing yourself round in circles, breaking code that already works, and guessing (with often poor accuracy) about where things might get bogged down. Design your code properly in the first place, and you can forget about optimization until it's had a chance to settle for a bit, at which point any necessary optimization will be obvious.
Slow computers are not going to help you find your performance problems.
If your test data is only a few hundred rows in a table your db will cache it all and you'll never find badly written queries or bad table/index design. If your server application is not multi-threaded server you will not find that out until you stress test it with 500 users. Or if the app bottlenecks on bandwidth.
Optimization is "A Good Thing" but as I say to new developers who have all sorts of ideas about how to do it better 'I don't care how quickly you give me the wrong answer'. Get it right first, then make it faster when you find a bottleneck. An experienced programmer is going to design and build it reasonably well to start with.
If performance is really critical (real time? millisecond-transactions?) then you need to design and implement a set of benchmarks and tools to scientifically prove to yourselves that your changes are making it faster. There are way too many variables out there that affect performance.
Plus there's the classic programmer excuse they will bring out - 'but it's running slow because we have deliberately picked slow computers, it will run much faster when we deploy it.'
If your colleague thinks its important give him a slow computer and put him in charge of 'performance' :-)
I guess it would depend on what you're making and what the intended audience is.
If you're writing software for fixed hardware (say, console games) then use equipment (at least test equipment) that is similar or the same as what you will deploy on.
If you're developing desktop apps or something in that realm then develop on whatever machine you want and then tune it afterward to run on the desired min-spec hardware. Likewise, if you're developing in-house software, there is likely to be a min-spec for the machines that the company wants to buy. In that case, develop on a fast machine (to decrease development time and therefore costs) and test against that min-spec.
Bottom line, develop on the fastest machine you can get your hands on, and test on the minimum or exact hardware that you'll be supporting.
If you are programming on hardware that is close to the final test and production environments, you tend to find that there are less nasty surprises when it comes time to release the code.
I've seen enough programmers get side-swiped by serious, but unexpected problems caused by their machines being way faster than their most of their users. But also, I've seen the same problem occur with data. The code is tested on a small dataset and then "crumbles" on a large one.
Any differences in development and deployment environments can be the source of unexpected problems.
Still, since programming is expensive and time-consuming, if the end-user is running slow out-of-date equipment, the better solution is to deal with it at testing time (and schedule in a few early tests just to check usability and timing).
Why cripple your programmers just because you're worried about missing a potential problem? That's not a sane development strategy.
Paul.
for the love of Codd, use profiling tools, not slow development machines!
Optimization should be avoided, didn't that give us Vista? :p
But in all seriousness, its always a matter of tradeoffs. Important questions to ask yourself
What platform will your end users be using?
Can I drop cycles? What will happen if I do?
I agree with most that initial development should be done on the fastest or most efficient (not neccesarily the same) machine available to you. But for running tests, run it on your target platform, and test often and early.
Depends on your time to delivery. If you are in a 12 month delivery cycle then you should develop on a box with decent speed since your customers' 12 months from now will have better "average" boxes than the current "average".
As your development cycle approaches "today", your development machines should approach the current "average" speed of your clients' boxes.
I typically develop on the fastest machine I can get my hands on.
Most of the time I'm running a debug build, which is slow enough already.
I think it is a sound concept (but maybe because it works for me).
If my developer workstation is too fast I find I don't think ideas through thorougly enough simply because there is little time-penalty in re-generating the software image or downloading it to the target. I'd say at least half my downloads were unnecessary because I just remembered something I'd missed right before I was going to debug the code.
The target machine could well contain a throttled processor. If - on an embedded MCU - you have half the FLASH, RAM and clock cycles per second chances are developers will be a lot more careful when designing their code. I once suggested byte variables for the lengths of individual records in a data area (not in RAM but in a serial eeprom) and received the reply "we don't need to be stingy." A few months later they hit the RAM ceiling (128KiB). My reflection was that for this app there would never be any records larger than 256 bytes simply because there was no RAM to copy them to.
For server applications I think it would be a great idea to have a (much) lower-performing hardware to test on. Two or four cores instead of sixteen (or more). 1.6 GHz istdo 2.8. The list goes on. A server is usually - due to the very fact that everyone talks to it - a bottleneck in the system architecture. And that is long before you start developing the (server) application for it.