I've run into a problem entering data in SPSS for a question that allows multiple answers. Let's say the question is like this:
What is the main mode of access to these online courses? (you may choose more than one answer if applicable)
Wired campus network
Wireless campus network
Mobile broadband
Wired broadband/ADSL
Mobile packet data
Suppose a student selects more than one answer. How can I enter all of this data in SPSS? This is different from a scaling question where each parameter has its own scale: it is a single question with multiple answers. I really don't know how to find the solution. I've asked many people, consulted books, and searched the internet, but I haven't found an answer so far.
These are sometimes referred to as multiple response sets. You would typically have separate variables (i.e. columns) for each potential answer, and then use some type of integer representation for when a person checked that response and when they did not check that response. Most frequently people use 0 for when they did not check that response and 1 for when they did. Afterwards you can define multiple response sets through a GUI dialog, and this is useful when generating tables.
Googling for "multiple response sets SPSS" seems to bring up a lot of useful resources. I also know John Hall has posted tutorials for multiple response sets in SPSS that may be useful.
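The 0/1 coding described above can be sketched outside SPSS as follows. This is only an illustration: the variable names (such as access_wired_campus) and the two sample respondents are made up, and the resulting CSV is just one possible way to prepare a file for import.

```python
import csv
import io

# Answer options from the question; the short variable names are hypothetical.
OPTIONS = {
    "access_wired_campus": "Wired campus network",
    "access_wireless_campus": "Wireless campus network",
    "access_mobile_broadband": "Mobile broadband",
    "access_wired_adsl": "Wired broadband/ADSL",
    "access_mobile_packet": "Mobile packet data",
}


def encode(checked):
    """Return one row of 0/1 indicators, one per answer option."""
    return {var: int(label in checked) for var, label in OPTIONS.items()}


# Made-up answers from two respondents.
responses = [
    {"Wired campus network", "Mobile broadband"},
    {"Wireless campus network"},
]
rows = [encode(r) for r in responses]

# Write one column per option, one row per respondent.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(OPTIONS))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Each respondent becomes one row and each answer option one column, which is the layout a multiple response set is then defined over.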
I was reading some interesting questions about the topic "Can we make a program that, given a particular sequence, produces the next terms", like this one, and I really like the detailed answer of this one. I understand that the answer is "That's impossible without more restrictions", and that given some restrictions (polynomials, rational function or boolean map) we know some good algorithms, as the second answer I linked explains.
Now, a natural question is how much of the original, general question we can solve if we try our best, even if we can't always solve it. What I usually do when facing a hard sequence is check whether it's in OEIS, and if it seems to be there, see whether the entry has a formula or algorithm to produce it. You can download a small version of OEIS with the first terms of each sequence, and you can make queries to find formulas or Maple algorithms for a particular sequence. My question is: do you think it's feasible to download a small version of OEIS that includes, along with the first terms, a little algorithm to produce each sequence?
The natural problem here is that I haven't seen any link to download the entire database of OEIS with all the details, which maybe deserves its own question. Even if we had this, you need to read the formulas/algorithms (that can be written in different languages, from what I've seen) and interpret them correctly. But I thought maybe someone here knows how to solve this, in any case thanks in advance.
You could, as you note, download the sequences and their A-numbers from the link mentioned here: https://oeis.org/wiki/Welcome#Compressed_Versions
After searching that and finding one sequence (or a small number of sequences) of interest, you could scrape the respective page(s) for formulas. There are specific fields for Maple and Mathematica, which may be helpful, and otherwise, an entry in the PROGRAM field should include identifying information when it is not one of the standard languages with its own field in the database. See: http://oeis.org/wiki/Style_Sheet
Unofficially, but with the interests of the OEIS in mind, I would not recommend trying to download or scrape the OEIS in its entirety. Whether it's one person, or a whole host of people, we would certainly recommend using the compressed version of the database to identify sequences of interest by A-number first, then pulling their entire entry by scraping the site or querying the OEIS using methods that you have already mentioned: Programmatic access to On-Line Encyclopedia of Integer Sequences
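The "identify by A-number first" step could be sketched roughly as below. It assumes each data line of the compressed file looks like `A000045 ,0,1,1,2,...,` with `#` marking comment lines; check that against the file you actually download. The helper names parse_stripped and find_candidates are made up for this example, and the three sample lines stand in for the real data.

```python
def parse_stripped(lines):
    """Map A-number -> list of leading terms, skipping '#' comment lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        a_number, _, terms = line.partition(" ")
        table[a_number] = [int(t) for t in terms.split(",") if t.strip()]
    return table


def find_candidates(table, prefix):
    """Return A-numbers whose terms contain `prefix` as a contiguous run."""
    n = len(prefix)
    return [
        a for a, terms in table.items()
        if any(terms[i:i + n] == prefix for i in range(len(terms) - n + 1))
    ]


# Sample lines in the assumed format (real data would come from the download).
sample = [
    "# comment line",
    "A000027 ,1,2,3,4,5,6,7,8,9,10,",
    "A000045 ,0,1,1,2,3,5,8,13,21,34,",
    "A000290 ,0,1,4,9,16,25,36,49,64,81,",
]
table = parse_stripped(sample)
matches = find_candidates(table, [2, 3, 5, 8])
print(matches)
```

Only the A-numbers returned here would then be looked up individually on the site, which keeps the load on the OEIS small.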
If this sounds laborious, perhaps an alternative is the Wolfram Cloud, which achieves this through other means. For example, you can navigate to the cloud (you may have to register just to get access) at: https://www.wolframcloud.com/
Typing in something like FindSequenceFunction[{1, 2, 3, 5, 17, 305, 34865}] will give you a formula, if Wolfram/Mathematica can find one. The documentation for FindSequenceFunction can be found here: https://reference.wolfram.com/language/ref/FindSequenceFunction.html
Wolfram/Mathematica can also invoke the OEIS using packages like the one described here: https://mathematica.stackexchange.com/questions/40/is-it-possible-to-invoke-the-oeis-from-mathematica
I want to copy data from a website that sells courses such as ITIL, Prince2, PMP and many other IT-sector courses. There are descriptions for about 20,000 different courses.
I want to use Selenium to scrape all the data, but the descriptions are still subject to copyright.
Kindly let me know how I can rewrite all of those descriptions so that they keep the same meaning but use different words.
Is there any API that I could build code on to rewrite these descriptions, either by substituting synonyms or by changing the grammar into completely new sentences with the same meaning?
Kindly let me know where to start.
Thanks,
The task you are referring to is called paraphrasing.
There is a lot of research in this field. On arXiv you will find research papers on the topic. However, since you are asking for an API, I am assuming you don't want to implement these models yourself. Luckily, some authors have published their models online on GitHub. (Note: some are a re-implementation by someone else.)
When you use some of these implementations, note that most offer a pre-trained model. Do read which data set was used for training and try to pick the one that is the most similar to the data that you are facing. By doing so, more words in the domain of your descriptions will be available and more synonyms can be used.
Recently I attended two different job interviews, and one of the questions they asked was something like this:
1- You need to create an API that will use some microservices that are very slow. Some of them take a few seconds to respond (let's say 2 seconds). We have to do our best to make our API very reliable in terms of latency. What would you do to make this system fast?
2- This led to other questions: if I choose to cache some data, what do I have to do to avoid a stale cache? For example, what if I cached the user's personal info and they just updated their profile?
3- Finally, if it is not a read operation, what do we have to do so that services that take a long time do not impact the user experience? In this case, imagine that it's a write operation.
How would you answer these questions?
The question is a little vague but I'll try and throw a couple of solutions out there.
Before jumping into the cache, I would first ask questions about the data set. For instance, how large is this data set and how often does the data set change? If the data set isn't large, you can probably store all of it in memory indefinitely and on updates, you can update individual records in the cache.
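A minimal sketch of the "update individual records in the cache on updates" idea, assuming a single process and an in-memory dictionary. The class name ProfileCache, the 60-second TTL, and the slow_fetch/slow_store stand-ins for the slow microservice are all hypothetical; a real system would also have to deal with several API instances sharing a cache.

```python
import time


class ProfileCache:
    """Read-through cache with a TTL; writes refresh the cached record."""

    def __init__(self, fetch, store, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # slow read from the backing service
        self._store = store      # write to the backing service
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                  # fresh hit, no slow call
        value = self._fetch(key)             # miss or expired: ask the service
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def update(self, key, value):
        self._store(key, value)              # write to the source of truth first
        self._entries[key] = (self._clock() + self._ttl, value)


# Stand-ins for the slow microservice (hypothetical).
backend = {"u1": {"name": "Ann"}}
calls = []


def slow_fetch(key):
    calls.append(key)
    return backend[key]


def slow_store(key, value):
    backend[key] = value


cache = ProfileCache(slow_fetch, slow_store, ttl_seconds=60.0)
first = cache.get("u1")
second = cache.get("u1")                  # served from the cache
cache.update("u1", {"name": "Ann B."})
third = cache.get("u1")                   # sees the new profile, no slow read
```

The TTL bounds how stale a record can get if something changes the data behind the cache's back, while the write path keeps the common "user just edited their profile" case consistent.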
Of course, when we say we store it in a cache, we also have to keep data retrieval in mind. If the data has to be retrieved in many different ways and the data set is large, caching may not be as good a solution. This roughly addresses the first and second questions you posted, absent further information from the interviewer. This is really where you need to tease out requirements from the interviewer to see if you're on the right track.
Finally, for the third question, I think the interviewer is trying to get you to write asynchronously to something like a queuing mechanism, which lets the user get a quick response while your system takes its time processing the write. A follow-up question here may be how long a write can take to be processed, and that will lead to a series of more domain-specific questions. Again, you'll have to dig into the requirements to see what kind of trade-offs the interviewer wants you to make, because there is no silver bullet.
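As a rough illustration of the queuing idea, here is a sketch using an in-process queue and one worker thread. The function names (handle_request, slow_write) and the 0.05-second delay are made up; in a real system the queue would more likely be an external message broker so that accepted writes survive a restart.

```python
import queue
import threading
import time

jobs = queue.Queue()
results = {}


def slow_write(payload):
    time.sleep(0.05)  # stands in for a slow downstream service
    return payload.upper()


def worker():
    while True:
        item = jobs.get()
        if item is None:          # sentinel: stop the worker
            jobs.task_done()
            break
        job_id, payload = item
        results[job_id] = slow_write(payload)
        jobs.task_done()


def handle_request(job_id, payload):
    """Enqueue the write and return immediately with an 'accepted' status."""
    jobs.put((job_id, payload))
    return {"status": "accepted", "job_id": job_id}


thread = threading.Thread(target=worker, daemon=True)
thread.start()

response = handle_request(1, "save profile")  # returns before the write finishes
jobs.join()                                    # demo only: wait for the worker
jobs.put(None)
thread.join()
```

The caller gets the job ID right away and can poll or be notified later, which is where the follow-up questions about acceptable processing delay come in.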
As a way to score points for the study I’m doing, as well as out of interest in databases and wanting to help my team, I’m trying to build a capacity planning tool in MS Access 2007. I work in a department that handles registering and supporting tenders. I have attached two pictures of what I’m trying to do here.
I’ve already spent some weeks making multiple iterations with colleagues who are involved and help write VBA and SQL (out of interest, wanting to learn something, or otherwise; our core business, however, isn’t developing). The primary goals of the database are as follows:
A user can access, create and modify “cases” that correlate to a case ID that we use in a different system.
A user can write down his capacity per week per year for a case.
Multiple users can assign themselves to a case.
Users can leave messages (records) for other users to see on a case
Metadata can be attached to the case
The main problem we seem to be running into is that whenever a user tries to edit an existing case through the overview, the case data no longer “complies” with entries elsewhere. Forcing updates through Visual Basic also does not seem to have worked so far.
Adding to the complexity: most of the names we use are in Dutch.
Here is an overview of the relations.
http://imgur.com/O022LAG
Here is a screenshot of the case overview as seen by a user.
http://imgur.com/kuENqaq
Main question:
How can I make entire records change for multiple users based on the input of one user?
In compliance with the guidelines regarding asking subjective questions I’m trying to be a bit more precise here:
Additionally I’m uncertain:
whether it is our approach that is wrong,
if perhaps we’re overlooking a glaring issue, or
if we should redesign this from scratch with a different layout.
Any help specifying where we should look or what would be advisable to do would be much appreciated!
Kind regards,
Timo
I am interested in learning how a database engine works (i.e. the internals of it). I know most of the basic data structures taught in CS (trees, hash tables, lists, etc.) as well as a pretty good understanding of compiler theory (and have implemented a very simple interpreter) but I don't understand how to go about writing a database engine. I have searched for tutorials on the subject and I couldn't find any, so I am hoping someone else can point me in the right direction. Basically, I would like information on the following:
How the data is stored internally (i.e. how tables are represented, etc.)
How the engine finds data that it needs (e.g. run a SELECT query)
How data is inserted in a way that is fast and efficient
And any other topics that may be relevant to this. It doesn't have to be an on-disk database; even an in-memory database is fine (if it is easier), because I just want to learn the principles behind it.
Many thanks for your help.
If you're good at reading code, studying SQLite will teach you a whole boatload about database design. It's small, so it's easier to wrap your head around. But it's also professionally written.
SQLite 2.5.0 for Code Reading
http://sqlite.org/
The answer to this question is a huge one; expect a PhD thesis to answer it 100% ;)
But we can think about the problems one by one:
How to store the data internally:
You should have a data file containing your database objects, plus a caching mechanism that loads the data in focus, and some data around it, into RAM.
Assume you have a table with some data. We would create a data format to convert this table into a binary file by agreeing on a column delimiter and a row delimiter, and by making sure that the delimiter pattern never occurs in the data itself. For example, if you have chosen <*> to separate columns, you should validate the data you place in this table so that it does not contain this pattern. You could also use a row header and a column header: the row header specifies the size of the row and some internal indexing number to speed up your search, and the header at the start of each column gives the length of that column.
For example, a row like "Adam", 1, 11.1, "123 ABC Street POBox 456"
could be stored as
<&RowHeader, 1><&Col1,CHR, 4>Adam<&Col2, num,1,0>1<&Col3, Num,2,1>111<&Col4, CHR, 24>123 ABC Street POBox 456<&RowTrailer>
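A small sketch of the length-header idea, using Python's struct module. This is not the exact layout shown above: it assumes a made-up binary format (a 2-byte column count, then a 1-byte type tag and a 4-byte length in front of every column), and the names encode_row and decode_row are hypothetical.

```python
import struct


def encode_row(values):
    """Encode a row as: column count, then (type tag, byte length, payload) per column."""
    out = [struct.pack("<H", len(values))]
    for v in values:
        if isinstance(v, str):
            tag, payload = b"S", v.encode("utf-8")
        elif isinstance(v, int):
            tag, payload = b"I", struct.pack("<q", v)
        elif isinstance(v, float):
            tag, payload = b"F", struct.pack("<d", v)
        else:
            raise TypeError(f"unsupported column type: {type(v).__name__}")
        out.append(tag + struct.pack("<I", len(payload)) + payload)
    return b"".join(out)


def decode_row(data):
    """Inverse of encode_row."""
    (count,) = struct.unpack_from("<H", data, 0)
    offset, values = 2, []
    for _ in range(count):
        tag = data[offset:offset + 1]
        (length,) = struct.unpack_from("<I", data, offset + 1)
        payload = data[offset + 5:offset + 5 + length]
        offset += 5 + length
        if tag == b"S":
            values.append(payload.decode("utf-8"))
        elif tag == b"I":
            values.append(struct.unpack("<q", payload)[0])
        else:
            values.append(struct.unpack("<d", payload)[0])
    return values


row = ["Adam", 1, 11.1, "123 ABC Street POBox 456"]
blob = encode_row(row)
decoded = decode_row(blob)
```

One consequence of storing the length up front is that the reader never scans for a delimiter, so in this sketch a value containing <*> would not need to be rejected.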
How to find items quickly:
Try using hashing and indexing to point at the data stored and cached, based on different criteria.
Taking the same example as above, you could sort the values of the first column and store them in a separate object that points at the row IDs of the items in alphabetical order, and so on.
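The sorted-column index can be sketched with a plain list and binary search. The table contents and the lookup helper are made up for illustration; real engines typically use B-trees rather than a sorted list, but the principle of "sorted keys pointing at row IDs" is the same.

```python
import bisect

# Rows in insertion order; the position in the list serves as the row ID.
rows = [
    ("Dana", 4),
    ("Adam", 1),
    ("Carl", 3),
    ("Bea", 2),
]

# Index on column 0: sorted (value, row_id) pairs.
index = sorted((row[0], row_id) for row_id, row in enumerate(rows))


def lookup(name):
    """Return all rows whose first column equals `name`, via binary search."""
    start = bisect.bisect_left(index, (name, -1))
    found = []
    for value, row_id in index[start:]:
        if value != name:
            break
        found.append(rows[row_id])
    return found


print(lookup("Carl"))
```

The rows themselves stay in insertion order; only the small index object is kept sorted, so a second index on another column could be added without touching the stored data.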
How to speed up inserting data:
What I know from Oracle is that it inserts data into a temporary place, both in RAM and on disk, and does housekeeping on a periodic basis. The database engine is busy all the time optimizing its structure, but at the same time we do not want to lose data in case of a power failure or something like that.
So try to keep data in this temporary place with no sorting, append it to your original storage, and later, when the system is free, re-sort your indexes and clear the temporary area when done.
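A toy, in-memory sketch of the "append now, sort later" approach. The Table class and its method names are hypothetical, and the durability side (also writing the pending area to disk so nothing is lost on power failure) is deliberately left out.

```python
import bisect


class Table:
    def __init__(self):
        self.rows = []       # main storage, append-only
        self.index = []      # sorted (key, row_id) pairs
        self.pending = []    # row IDs inserted but not yet indexed

    def insert(self, key, value):
        """Fast path: append only, no sorting."""
        self.rows.append((key, value))
        self.pending.append(len(self.rows) - 1)

    def housekeeping(self):
        """Slow path, run when idle: fold pending rows into the sorted index."""
        for row_id in self.pending:
            bisect.insort(self.index, (self.rows[row_id][0], row_id))
        self.pending.clear()

    def find(self, key):
        """Use the index, then scan the small unindexed tail."""
        start = bisect.bisect_left(self.index, (key, -1))
        hits = []
        for k, row_id in self.index[start:]:
            if k != key:
                break
            hits.append(self.rows[row_id])
        hits += [self.rows[i] for i in self.pending if self.rows[i][0] == key]
        return hits


t = Table()
t.insert("b", 1)
t.insert("a", 2)
before = t.find("a")     # found by scanning the pending area
t.housekeeping()
after = t.find("a")      # found through the sorted index
```

Inserts stay cheap because they never touch the index, and lookups remain correct in the meantime because the unindexed tail is scanned as well.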
Good luck, great project.
There are books on the topic. A good place to start would be Database Systems: The Complete Book by Garcia-Molina, Ullman, and Widom.
SQLite was mentioned before, but I want to add something.
I personally learned a lot by studying SQLite. The interesting thing is that I did not go through the source code (though I did have a short look). I learned a lot by reading the technical material and especially by looking at the internal commands it generates. It has its own stack-based interpreter inside, and you can read the P-code it generates internally just by using EXPLAIN. Thus you can see how various constructs are translated for the low-level engine (which is surprisingly simple, but that is also the secret of its stability and efficiency).
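For instance, the generated program can be inspected from Python's built-in sqlite3 module. The table and query below are made up, and the exact instructions listed will differ between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO people (name) VALUES ('Adam')")

# EXPLAIN returns one row per instruction of the internal program.
cursor = conn.execute("EXPLAIN SELECT name FROM people WHERE id = 1")
columns = [d[0] for d in cursor.description]
program = cursor.fetchall()

for row in program:
    # Print the instruction address, the opcode, and its first three operands.
    print(row[0], row[1], row[2], row[3], row[4])

conn.close()
```

Comparing the output for queries with and without an index is a quick way to see how the engine's strategy changes.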
I would suggest focusing on www.sqlite.org
It's recent, small (source code 1MB), open source (so you can figure it out for yourself)...
Books have been written about how it is implemented:
http://www.sqlite.org/books.html
It runs on a variety of operating systems for both desktop computers and mobile phones so experimenting is easy and learning about it will be useful right now and in the future.
It even has a decent community here: https://stackoverflow.com/questions/tagged/sqlite
Okay, I have found a site which has some information on SQL and implementation - it is a bit hard to link to the page which lists all the tutorials, so I will link them one by one:
http://c2.com/cgi/wiki?CategoryPattern
http://c2.com/cgi/wiki?SliceResultVertically
http://c2.com/cgi/wiki?SqlMyopia
http://c2.com/cgi/wiki?SqlPattern
http://c2.com/cgi/wiki?StructuredQueryLanguage
http://c2.com/cgi/wiki?TemplateTables
http://c2.com/cgi/wiki?ThinkSqlAsConstraintSatisfaction
Maybe you can learn from HSQLDB. I think it offers a small and simple database for learning. You can look at the code since it is open source.
If MySQL interests you, I would also suggest this wiki page, which has got some information about how MySQL works. Also, you might want to take a look at Understanding MySQL Internals.
You might also consider looking at a non-SQL interface for your database engine. Please take a look at Apache CouchDB. It's what you would call a document-oriented database system.
Good Luck!
I am not sure whether it would fit your requirements, but I once implemented a simple file-oriented database with support for simple operations (SELECT, INSERT, UPDATE) using Perl.
I stored each table as a file on disk, with entries following a well-defined pattern, and manipulated the data using built-in Linux tools like awk and sed. To improve efficiency, frequently accessed data was cached.