I was given a problem statement: Suppose you’re building an app for restaurants to take customer orders.
Your app needs to store a list of orders. Servers keep adding
orders to this list, and chefs take orders off the list and make them.
It’s an order queue: servers add orders to the back of the queue,
and the chef takes the first order off the queue and cooks it.
Would you use an array or a linked list to implement this queue?
I replied: linked list. Lots of inserts are happening (servers
adding orders), which linked lists excel at. You don’t need search
or random access (what arrays excel at), because the chefs always
take the first order off the queue.
Now please advise whether my answer was correct. I was also thinking it might happen that the servers put ten items in the queue simultaneously, but at the other end the chef decides to pick first the item that will take the least time to prepare; in that case, which data structure is best?
A linked-list-based queue is the best option for the question above. With a linked-list queue you get O(1) operations for insert (servers placing orders) and delete (the chef taking orders). You also get dynamic memory allocation: memory is used only when it is needed.
I have worked on a similar project where we used to combine similar orders, which is what chefs generally do in the real world. For example, 2 French fries from server 1 and 3 French fries from server 2 become a single order of 5 French fries.
For the scenario where access is based on the time an item takes to prepare (rather than strict FIFO order), I still think the linked-list queue will be more useful.
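As a rough illustration of that claim (a sketch with invented field names, not code from any project mentioned above): the chef can scan the linked queue for the order with the shortest preparation time and unlink it in one O(n) pass, while ordinary inserts stay O(1).

```python
class OrderNode:
    def __init__(self, name, prep_minutes):
        self.name = name
        self.prep_minutes = prep_minutes
        self.next = None

def take_fastest(head):
    """Remove and return the node with the smallest prep time (O(n) scan)."""
    if head is None:
        return None, None
    best_prev, best = None, head
    prev, node = head, head.next
    while node is not None:
        if node.prep_minutes < best.prep_minutes:
            best_prev, best = prev, node
        prev, node = node, node.next
    if best_prev is None:          # the fastest order is the current head
        head = head.next
    else:                          # unlink it from the middle of the list
        best_prev.next = best.next
    return best, head

# Usage: fries -> burger -> salad; the chef picks the salad (5 min) first.
a = OrderNode("fries", 7); b = OrderNode("burger", 12); c = OrderNode("salad", 5)
a.next, b.next = b, c
picked, head = take_fastest(a)
print(picked.name)   # salad
```

(If the queue also keeps a tail pointer, removing the last node would require updating that pointer too.)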
Yes, a linked list should be the way to go.
This is because you are not only inserting elements at one end, you are also removing elements from the other end.
In this linked list you'll have a head and a tail, and you'll keep both pointers always available (as you need to remove elements from the tail and add elements at the head).
As orders are added at the head, the head pointer keeps updating: no problem here.
For the part where the chef removes an order: you'll have to remove the item from the list and also update the tail pointer to point to the second-to-last node, which is now the last node in the list.
The only way I can think of achieving this is by using a doubly linked list, where the tail gets updated every time you remove (or pop) an element from the queue (oops, sorry, linked list).
I guess that's how I was taught to implement a queue: using a doubly linked list.
You can refer to the implementation here.
EDIT:
You can do it with a singly linked list as well. Steps are:
Whenever a new element is added, add it to the back of the list and update the tail pointer.
Whenever the chef needs an element from the linked list, access the head, take out that element, and update head to head->next.
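A minimal sketch of those two steps in Python, assuming a singly linked list with head and tail pointers (the class and method names are mine, not from the answer):

```python
class Node:
    def __init__(self, order):
        self.order = order
        self.next = None

class OrderQueue:
    def __init__(self):
        self.head = None   # the chef takes orders from here
        self.tail = None   # servers add orders here

    def enqueue(self, order):
        """Add a new order at the back and update the tail pointer: O(1)."""
        node = Node(order)
        if self.tail is None:          # queue was empty
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node

    def dequeue(self):
        """Take one order from the front and move head to head.next: O(1)."""
        if self.head is None:
            return None
        node = self.head
        self.head = node.next
        if self.head is None:          # queue became empty
            self.tail = None
        return node.order

q = OrderQueue()
q.enqueue("burger"); q.enqueue("fries")
print(q.dequeue())   # burger (first in, first out)
```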
I have one redis list "waiting" and one redis list "partying".
I have a long running process that safely blocks on the "waiting" list item to come along, and then pops it and pushes it onto the "partying" list atomically using BRPOPLPUSH. Awesome.
Users in the "waiting" list are repeatedly asking "am I in the partying list yet?", but there is no fast (i.e. faster than O(n)) way to check whether a user is in a Redis list. You have to grab the whole list and loop through it.
So I'm resorting to switching from a Redis list to a Redis sorted set, with the score as the Unix timestamp of when the user joined the "waiting" sorted set. I can do a blocking pop on the lowest score (the user at the head of the queue). Using sorted sets, I can use ZSCORE to check in O(1) time whether they're on either list, so it's looking hopeful.
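For reference, a small redis-py sketch of the pattern described so far (the key names "waiting" and "partying" come from the question; the host settings and "user1" are placeholders): the list version uses BRPOPLPUSH, and the sorted-set version gives an O(1) membership check via ZSCORE.

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)

# List-based version: atomically pop from "waiting" and push onto "partying".
user = r.brpoplpush("waiting", "partying", timeout=0)   # blocks until an item arrives

# Sorted-set version: join time as the score, O(1) membership checks.
r.zadd("waiting", {"user1": time.time()})

def is_partying(user_id):
    return r.zscore("partying", user_id) is not None
```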
How can I perform the nice atomic equivalent of BRPOPLPUSH on sorted sets?
It's like I need a mythical BZRPOPMIN + ZADD = BZRPOPMINZADD. If the process dies between those two commands, a user will effectively disappear from both sets.
Looking into MULTI/EXEC transactions in Redis, they are not what they appear to be at first glance; they're more like pipelines, in that I can't take the result of the first command (BZRPOPMIN) and feed it into the second command (ZADD). I'm also very suspicious of putting the blocking BZRPOPMIN into the MULTI; am I right to be?
How can I perform the nice atomic equivalent of BRPOPLPUSH on sorted sets?
Sorry, you can't. We actually discussed this when the ZPOP family was added and decided against it: "However I'm not for the BZPOPZADD part, because instead experience with lists shown that this is not a good idea in general, unfortunately, and that that adding safety of message processing may be used other means. The worst thing abut BZPOPZADD and BRPOPLPUSH and so forth are the cascading effects, they create a lot of problems in replication for instance, and our BRPOPLPUSH replication is still not correct in certain ways (we can talk about it if you want)." (ref: https://github.com/antirez/redis/pull/4879#issuecomment-389116241)
I'm very suspicious of putting the blocking BZRPOPMIN into the MULTI too, am I right to be?
Definitely, and blocking commands can't be called inside a transaction anyway.
This is a concept question, regarding "best practice" and "efficient use" of resources.
Specifically dealing with large data sets in a db and on-line web applications, and moving from a procedural processing approach to a more Object Oriented approach.
Take a "list" page, found in almost all CRUD aspects of the application. The list displays a company, address and contact. For the sake of argument, and "proper" RDBMS design, assume we've normalized the data such that a company can have multiple addresses and contacts.
- For our scenario, let's say I have a list of 200 companies, each with 2-10 addresses, and each address has a contact (i.e. any franchise where the 'store' is named 'McDonalds', but there may be multiple addresses under that 'name').
TABLES
companies
addresses
contacts
To this point, I'd make a single DB call and use joins to pull back ALL my data, loop over the data, and output each line. Some grouping would be done at the application layer to display things in a friendly manner. (This seems like the most efficient way, as the RDBMS did the heavy lifting and there was a minimum of network calls: one to the DB, one from the DB, one HTTP request, one HTTP response.)
Another way of doing this, if you couldn't group at the application layer, is to query for the company list, loop over that, and inside the loop make separate DB call(s) for the address and contact. This is less efficient, because you're making multiple DB calls.
Now - the question, or sticking point.... Conceptually...
If I have a company object, an address object and a contact object, it seems that in order to achieve the same result you would call a 'getCompanies' method that returns a list, loop over the list, and call 'getAddress' for each, and likewise a 'getContact', passing in the company ID etc.
In a web app this means A LOT more traffic from the application layer to the DB, and a lot of smaller DB calls: it seems SERIOUSLY less efficient.
If you then move a fair amount of this logic to the client side, for an AJAX application, you're incurring network traffic ON TOP of the increased internal network overhead.
Can someone please comment on the best ways to approach this? Maybe it's a conceptual thing.
Someone suggested that a 'gateway' is what you use when accessing these large data sets, as opposed to smaller, more granular object data - but this doesn't really help my understanding, and I'm not sure it's accurate.
Of course getting everything you need at once from the database is the most efficient. You don't need to give that up just because you want to write your code as an OO model. Basically, you get all the results from the database first, then translate the tabular data into a hierarchical form to fill objects with. "getCompanies" could make a single database call joining addresses and contacts, and return "company" objects that contain populated lists of "addresses" and "contacts". See Object-relational mapping.
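A rough sketch of that approach in Python with sqlite3 (the table and column names are assumptions for illustration): one joined query, then the flat rows are folded into company objects that hold their addresses and contacts.

```python
import sqlite3

def get_companies(conn: sqlite3.Connection):
    """One DB round trip, then group the flat rows into a hierarchy."""
    rows = conn.execute("""
        SELECT c.id, c.name, a.street, p.contact_name
        FROM companies c
        JOIN addresses a ON a.company_id = c.id
        JOIN contacts  p ON p.address_id = a.id
    """)
    companies = {}
    for company_id, name, street, contact_name in rows:
        company = companies.setdefault(
            company_id, {"id": company_id, "name": name, "locations": []})
        company["locations"].append({"street": street, "contact": contact_name})
    return list(companies.values())
```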
I've dealt with exactly this issue many times. The first and MOST important thing to remember is: don't optimize prematurely. Optimize your code for readability, the DRY principle, etc., then come back and fix the things that are "slow".
However, specific to this case, rather than iteratively getting the addresses for each company one at a time, pass a list of all the company IDs to the fetcher, get all the addresses for all those company IDs, then cache that list of addresses in a map. When you need to fetch an address by addressID, fetch it from that local cache. This is called an IdentityMap. However, like I said, I don't recommend recoding the flow for this optimization until needed. Most often there are 10 things on a page, not 100, so you are saving only a few milliseconds by changing the "normal" flow for the optimized flow.
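A sketch of that batched lookup in Python (function and table names are mine): fetch all the addresses for the given company IDs in one query and keep them in a local map keyed by company ID, so later lookups never hit the database.

```python
import sqlite3

def build_address_map(conn, company_ids):
    """One query for all companies' addresses, cached in a dict (an identity map)."""
    placeholders = ",".join("?" * len(company_ids))
    rows = conn.execute(
        f"SELECT company_id, id, street FROM addresses "
        f"WHERE company_id IN ({placeholders})", company_ids)
    address_map = {}
    for company_id, address_id, street in rows:
        address_map.setdefault(company_id, []).append(
            {"id": address_id, "street": street})
    return address_map

# Later: address_map[company_id] instead of another round trip per company.
```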
Of course, once you've done this 20 times, writing code in the "optimized flow" becomes more natural, but you also have the experience of when to do it and when not to.
In my app I'm displaying Race objects that essentially have three states: pending, inProgress and completed. I want to display all Races that are currently pending or inProgress, but not the ones that are completed. To do this, I want to create a RESTful API for getting these resources from my server, but I'm not sure what the best (i.e. most RESTful) approach would be.
The issue is that when someone opens or refreshes the app, I need to do two things:
Perform a GET on all the Races that are currently displayed in the client to update their status.
GET all of the new pending or inProgress Races that have been created since the client last updated
I've come up with a few different solutions, though I don't know which, if any, would be best:
Simply delete the old Race records on the client and always GET all new records
Perform 2 separate GET operations, the first which updates all the old records, and the second where I GET all the new pending / inProgress Races
Perform a single GET operation where I specify the created date of the last client record, and GET all records that are newer.
To me, this seems like a pretty common scenario but I haven't been able to find a specific answer to this type of problem. I'd like to see what SO thinks :)
Thanks in advance for your help!
Simply delete the old Race records on the client and always GET all new records
This is probably the easiest solution. However you shouldn't do that if you need a very smooth update on your client (for games, data visualization, etc.).
Perform 2 separate GET operations (...) / Perform a single GET operation where I specify the created date of the last client record, and GET all records that are newer.
I would definitely do it with a single operation. Better than an update timestamp (timestamp operations are costly, and several operations could happen at the same time), I would use a sequence number. This is the way CouchDB handles "changes".
Moreover, as you will see in the documentation, this solution can then be upgraded to asynchronous notifications (if you need them).
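A minimal sketch of that single-operation endpoint in Python with Flask (the route, field names, and in-memory data are invented for illustration; the sequence number is whatever monotonic counter your server maintains): the client sends the highest sequence number it has seen, and gets back every race changed since then, including ones that just completed so it can drop them from its display.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in data store; in practice this comes from your database.
RACES = [
    {"id": 1, "seq": 4, "state": "completed"},
    {"id": 2, "seq": 7, "state": "inProgress"},
    {"id": 3, "seq": 9, "state": "pending"},
]

@app.route("/races")
def list_races():
    # The client passes the highest sequence number it has already seen.
    since = request.args.get("since", default=0, type=int)
    changed = [r for r in RACES if r["seq"] > since]
    return jsonify({
        "races": changed,
        "lastSeq": max((r["seq"] for r in changed), default=since),
    })
```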
I have a table on my database that outlines complex processes in a work breakdown structure (similar to what's used to create Gantt charts). There are multiple rows for a particular process, each row outlining a hierarchical step of a particular process.
I then have a table with some product types, each being linked to a particular process. When an order for a particular product is placed - it is to be manufactured with the associated process.
In my situation, the processes can be dynamic (steps added or removed, for example).
I'm curious as to the best way to capture current and historical revisions of each process, so that even though a process may have evolved over time, I can go back to a particular order and determine what the process looked like at that time.
I'm sure there are multiple ways to go about this, using logging or triggers with a new history table - but I've had no experience doing something like this and I'd like to know what worked well for others.
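One shape the history-table idea mentioned above could take (a sketch with assumed table and column names, not a recommendation from an answer): every edit inserts a new immutable revision of the process, and an order records the revision it was manufactured with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processes         (id INTEGER PRIMARY KEY, name TEXT);
    -- every edit inserts a new revision; old revisions are never changed
    CREATE TABLE process_revisions (id INTEGER PRIMARY KEY,
                                    process_id INTEGER REFERENCES processes(id),
                                    revision   INTEGER,
                                    created_at TEXT DEFAULT CURRENT_TIMESTAMP);
    CREATE TABLE process_steps     (revision_id INTEGER REFERENCES process_revisions(id),
                                    step_no INTEGER, description TEXT);
    -- an order points at the exact revision used to manufacture it
    CREATE TABLE orders            (id INTEGER PRIMARY KEY,
                                    product TEXT,
                                    process_revision_id INTEGER
                                        REFERENCES process_revisions(id));
""")
```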
A ConcurrentBag will allow multiple threads to add and remove items from the bag. It is possible that a thread will add an item to the bag and then end up taking that same item right back out. It says that the ConcurrentBag is unordered, but how unordered is it? On a single thread, the bag acts like a Stack. Does unordered mean "not like a linked list"?
What is a real world use for ConcurrentBag?
Because there is no ordering, the ConcurrentBag has a performance advantage over ConcurrentStack/Queue. It is implemented by Microsoft using thread-local storage, so every thread that adds items does so in its own space. When retrieving items, they come from that local storage; only when it is empty does the thread steal items from another thread's storage. So instead of a simple list, a ConcurrentBag is a distributed list of items. It is almost lock-free and should scale better under high concurrency.
Unfortunately, in .NET 4.0 there was a performance issue (fixed in 4.5); see
http://ayende.com/blog/156097/the-high-cost-of-concurrentbag-in-net-4-0
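A toy Python sketch of that local-storage-plus-stealing idea (purely illustrative: the real ConcurrentBag keeps genuinely thread-local storage and is mostly lock-free, whereas this version takes a single lock for simplicity). Each thread adds to and takes from its own deque, and only steals from another thread's deque when its own is empty.

```python
import threading
from collections import defaultdict, deque

class ToyBag:
    def __init__(self):
        self._per_thread = defaultdict(deque)   # thread id -> that thread's items
        self._lock = threading.Lock()

    def add(self, item):
        with self._lock:
            self._per_thread[threading.get_ident()].append(item)

    def try_take(self):
        with self._lock:
            me = threading.get_ident()
            mine = self._per_thread[me]
            if mine:
                return True, mine.pop()            # serve from local storage first
            for tid, theirs in self._per_thread.items():
                if tid != me and theirs:
                    return True, theirs.popleft()  # steal from another thread
            return False, None
```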
Bags are really useful for tracking instance counts. For example, if you want to keep a record of which hosts you're servicing web requests for, you can add their IP to the bag when you start servicing the request, and remove it when done.
Using a bag will allow you to tell at a glance which IPs you're currently servicing. It will also let you quickly query whether you're servicing a given IP address.
If you use a set for this rather than a bag, then having multiple concurrent requests from the same IP address will mess up your record-keeping.
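A small Python sketch of that bookkeeping (a lock plus a counter standing in for the bag; the function names are mine): each request adds its client IP on entry and removes it on exit, and membership can be checked at any time, even with overlapping requests from the same IP.

```python
import threading
from collections import Counter

_lock = threading.Lock()
_active = Counter()            # IP -> number of requests currently being serviced

def request_started(ip):
    with _lock:
        _active[ip] += 1

def request_finished(ip):
    with _lock:
        _active[ip] -= 1
        if _active[ip] <= 0:
            del _active[ip]

def is_servicing(ip):
    with _lock:
        return _active[ip] > 0   # with a plain set, overlapping requests from
                                 # the same IP would corrupt this answer
```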
Anything where you just need to keep track of what's there and don't need random access or guaranteed order. If you have a thread that adds items to process, and a thread that removes items in order to process them, a concurrent bag would work well if you don't care that they're processed in FIFO order.
Thanks to @Chris Jester-Young I came up with a good, real-world scenario that actually applies to a project I'm working on.
Find - Process - Store
Find - threads 1 & 2 are set to find or scrape data (file system, web, etc.). The results are stored in ConcurrentBag1.
Process - threads 3 & 4 are set to take items out of ConcurrentBag1, clean/transform/process the data, and then store the results in ConcurrentBag2.
Store - thread 5 is set to gather the results from ConcurrentBag2 and store them in SQL.
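A rough Python version of that pipeline (Python's standard library has no ConcurrentBag, so thread-safe queues stand in for the two bags, and the scrape/transform/SQL steps are placeholders):

```python
import threading, queue

found = queue.Queue()       # stands in for ConcurrentBag1
processed = queue.Queue()   # stands in for ConcurrentBag2
DONE = object()             # sentinel used to shut each stage down

def find(source):                       # threads 1 & 2: find/scrape data
    for item in source:
        found.put(item)

def process():                          # threads 3 & 4: clean/transform
    while (item := found.get()) is not DONE:
        processed.put(item.upper())     # placeholder transformation

def store():                            # thread 5: persist the results
    while (item := processed.get()) is not DONE:
        print("stored:", item)          # placeholder for the SQL insert

finders = [threading.Thread(target=find, args=(src,))
           for src in (["a", "b"], ["c", "d"])]
processors = [threading.Thread(target=process) for _ in range(2)]
storer = threading.Thread(target=store)
for t in finders + processors + [storer]:
    t.start()
for t in finders:
    t.join()
for _ in processors:
    found.put(DONE)         # one sentinel per processing thread
for t in processors:
    t.join()
processed.put(DONE)
storer.join()
```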