By user sorting I mean that as a user on the site you see a bunch of items, and you are supposed to be able to reorder them (I'm using jQuery UI).
The user only sees 20 items on each page, but the total number of items can be thousands.
I assume I need to add another column in the table for custom ordering.
If the user sees items 41-60, and he sorts them like:
41 = 2nd
42 = 1st
43 = 5th
etc.
I can't just set the ordering column to 2,1,5.
I would need to go through the entire table and change each record.
Is there any way to avoid this and somehow sort only the current selection?
Add another column to store the custom order, just as you suggested yourself. You can avoid having to reassign every row's value by using a REAL-typed column. For new rows, you still use an increasing integer sequence for the column's value. But if a user reorders a row, the floating-point type lets you use the formula ½ (previous row's value + next row's value) to update the column of the single row that was moved.
There are two special cases to take care of: a user moving a row to the very beginning or to the very end of the list. In those cases, just use min - 1 and max + 1, respectively.
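A minimal sketch of that update rule, assuming the application already knows the sort values of the two rows the moved row was dropped between. The function name `key_for_move` and the use of `None` to mean "no neighbour on this side" are illustrative choices, not part of the original suggestion.

```python
def key_for_move(prev_key, next_key):
    """Return a sort key for a row dropped between two neighbours.

    Pass None for prev_key when the row is moved to the very beginning,
    and None for next_key when it is moved to the very end.
    """
    if prev_key is None and next_key is None:
        return 1.0  # assumption: an empty list simply starts at 1
    if prev_key is None:
        return next_key - 1  # new first row: min - 1
    if next_key is None:
        return prev_key + 1  # new last row: max + 1
    return (prev_key + next_key) / 2  # halfway between the neighbours


# Moving a row between the rows keyed 41 and 42 touches one row only.
print(key_for_move(41.0, 42.0))
```

Only the moved row gets an UPDATE; every other row keeps its value.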
This approach is the simplest I can think of, but it also has some downsides. First, it has a theoretical limitation due to the datatype having only double-precision. After a finite number of reorderings, the values are too close together for their average to be a different number. But that's really only a theoretical limit you should never reach in practical applications. Also, the column will use 8 bytes of memory per row, which probably is much more than you actually need.
If your application might scale to the point where those 8 bytes matter, or where you might have users who reorder rows overeagerly, you should instead stick to an INTEGER column and use multiples of a constant as the default values (e.g. 100, 200, 300, ...). You still use the update formula from above, but whenever two values become too close together, you reassign all values. By tuning the constant multiplier to the average table size and user behaviour, you can control how often this expensive operation has to be done.
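The integer variant could look roughly like this. It is a sketch under assumptions: the sort values are held in a plain dict instead of a table, `GAP`, `renumber` and `move_between` are hypothetical names, and "too close together" is taken to mean that no free integer is left between the two neighbours.

```python
GAP = 100  # assumed spacing of the default values (100, 200, 300, ...)


def renumber(keys):
    """Reassign every key to a multiple of GAP, preserving the current order."""
    ordered = sorted(keys, key=keys.get)
    for position, row_id in enumerate(ordered, start=1):
        keys[row_id] = position * GAP


def move_between(keys, row_id, prev_id, next_id):
    """Place row_id between two adjacent rows.

    Returns True when the expensive full renumbering had to run first.
    """
    renumbered = False
    if keys[next_id] - keys[prev_id] < 2:
        # No free integer between the neighbours: spread all rows out again.
        renumber(keys)
        renumbered = True
    keys[row_id] = (keys[prev_id] + keys[next_id]) // 2
    return renumbered


keys = {"a": 100, "b": 200, "c": 300}
move_between(keys, "c", "a", "b")  # cheap case: only row "c" changes
print(sorted(keys, key=keys.get))
```

A larger `GAP` makes the full renumbering rarer at the cost of using up the integer range faster.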
There are a couple of ways I can think of to do this. One would be to use a SELECT FROM SELECT style statement, something like this:
SELECT *
FROM (
    SELECT col1, col2, col3...
    FROM ...
    WHERE ...
    LIMIT n,m
) AS Table_A
ORDER BY ...
The second option would be to use temp tables such as:
INSERT INTO temp_table_A SELECT ... FROM ... WHERE ... LIMIT n,m;
SELECT * FROM temp_table_A ORDER BY ...
Another option to look at would be a jQuery plugin such as DataTables.
One way I can think of is:
Add a new column (if feasible) or create a new table for holding the order of the items.
On any page you will show around 20 items based on the initial ordering.
Using jQuery UI's Draggable, you can send updates to this table.
I think you can do this with an extra column.
First, you could prepopulate this new column with a default sort order and then allow the user to interactively modify it with the drag and drop of jquery-ui.
Let's say this user has 100 items in the table. You set the values in the order column to [1,2,3,...,99,100]. I suggest that you run a script on the original table to set all items to a default sort order.
Now going back to your example where the user is presented with items 41-60: the initial presentation in their browser would rank those at orders [41,42,43,...,59,60]. You might also need to save the lowest order that appears in this subset, in this case 41. Or better yet, save the entire array of rankings and restore the exact same numbers in the new order. This covers the case where they select a set of records that are not already consecutively ordered, perhaps because they belong to someone else.
To demonstrate what I mean: when they reorder them in the page, your javascript reassigns those same numbers back to the subset in the new order. Like this:
item A : 41
item B : 45
item C : 46
item D : 47
item E : 51
item F : 54
item G : 57
Then the user changes them to this order, and you reassign the same numbers like this:
item D : 41
item F : 45
item E : 46
item A : 47
item C : 51
item B : 54
item G : 57
This should also work if the subset is consecutive.
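The reassignment can be sketched in a few lines. The function name `reassign_order` and the dict/list shapes are assumptions for illustration; the numbers are the ones from the example above.

```python
def reassign_order(old_order, new_sequence):
    """Hand the page's existing order numbers back out in the new sequence.

    old_order    -- dict mapping each visible item to its current number
    new_sequence -- the same items, in the order the user arranged them
    """
    if set(new_sequence) != set(old_order) or len(new_sequence) != len(old_order):
        raise ValueError("new_sequence must contain exactly the visible items")
    slots = sorted(old_order.values())  # the numbers this page already owns
    return dict(zip(new_sequence, slots))


before = {"A": 41, "B": 45, "C": 46, "D": 47, "E": 51, "F": 54, "G": 57}
after = reassign_order(before, ["D", "F", "E", "A", "C", "B", "G"])
print(after)
```

Because only numbers that already belonged to the visible rows are reused, no row outside the page needs an update.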
Related
I am using MongoDB as my database.
I have data which contains rank and name as columns. A new row can be inserted with a rank that either differs from the ranks already existing or is the same as one of them.
If it is the same, the ranks of the other rows must be adjusted.
Rows ranked at or below the inserted rank (that is, with a rank number greater than or equal to it) must be incremented by one, and the rows ranked above it can remain as they are.
The feature is something like a numbered list in MS Word type applications, where inserting a row in between adjusts the numbering of the rows below it.
Rank 1 is the highest rank.
For example, there are 3 rows:
Name Rank
A 1
B 2
C 3
Now I want to insert a row with D as name and 2 as rank. After the insert, the DB should look like below:
Name Rank
A 1
B 3
C 4
D 2
I can probably achieve this with database triggers that update the other rows.
I have a couple of questions:
(a) Is there any better way than a database trigger to handle this kind of scenario? Updating all the rows might be a time-consuming job.
(b) Does MongoDB support database triggers natively?
Best Regards,
Saurav
No, MongoDB does not provide triggers (yet). Also, I don't think a trigger is really a great way to achieve this.
So I would just like to throw out some ideas; see if they make sense.
Approach 1
Maybe instead of disturbing that many documents, you can create a collection with only one document (let's call the collection ranking). In that document, have an array field called ranks. Since it's an array, it already maintains a sequence.
{
_id : "RANK",
"ranks" : ["A","B","C"]
}
Now if you want to add D to this rank at 2nd position
db.ranking.update({_id:"RANK"},{$push : {"ranks":{$each : ["D"],$position:1}}});
This adds D at index 1, which is the 2nd position since indexes start at 0.
{
_id : "RANK",
"ranks" : ["A","D","B","C"]
}
But there is a catch: what if you want to move C from 4th position to 1st? You need to remove it from the end and put it at the beginning. I am fairly sure both operations can't be done in a single update (I didn't dig into the options much), so we can run two queries:
db.ranking.update({_id:"RANK"},{$pull : {"ranks": "C"}});
db.ranking.update({_id:"RANK"},{$push : {"ranks":{$each : ["C"],$position:0}}});
Then it would be like
{
_id : "RANK",
"ranks" : ["C","A","D","B"]
}
maintaining the rest of sequence.
Now you would probably want to store ids instead of A, B, C, etc. One document can be up to 16MB, so this ranks array could in principle hold more than 1.3 million ids if each id is a 12-byte MongoDB ObjectId (somewhat fewer in practice, because BSON adds a few bytes of overhead per array element). If that is not enough, there is still the option of follow-up document(s) holding further rankings.
Approach 2
Instead of storing the rank as a number, you can also just have two fields, such as followedBy and precededBy.
Your user documents would then look like this:
{
_id: "A",
"followedBy": "B"
}
{
_id: "B",
"followedBy": "C",
"precededBy": "A"
}
{
_id: "C",
"precededBy": "B"
}
If you want to add D at the second position, you need to update the two neighbours (A and B) and insert the new document, so only three documents are touched:
{
_id: "A",
"followedBy": "D" //changed from B to D
}
{
_id: "B",
"followedBy": "C",
"precededBy": "D" //changed from A to D
}
{
_id: "C",
"precededBy": "B"
}
{
_id: "D",
"followedBy": "B",
"precededBy": "A"
}
The downside of this approach is that you cannot sort by rank in the query itself; you have to fetch all the documents into the application and rebuild a linked-list style structure there.
This approach just preserves the ranking with minimal DB changes.
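Rebuilding the order in the application might look like the sketch below. It assumes the documents have already been fetched as plain dicts with the `_id`, `followedBy` and `precededBy` fields described above; the function name `ordered_ids` is hypothetical, and no MongoDB driver is involved.

```python
def ordered_ids(documents):
    """Walk the followedBy links from the head and return the ids in rank order."""
    by_id = {doc["_id"]: doc for doc in documents}
    # The head of the list is the only document without a predecessor.
    heads = [doc["_id"] for doc in documents if "precededBy" not in doc]
    if len(heads) != 1:
        raise ValueError("expected exactly one document without precededBy")

    order = []
    current = heads[0]
    while current is not None:
        if len(order) >= len(by_id):
            raise ValueError("cycle detected in followedBy links")
        order.append(current)
        current = by_id[current].get("followedBy")
    return order


docs = [
    {"_id": "B", "followedBy": "C", "precededBy": "D"},
    {"_id": "D", "followedBy": "B", "precededBy": "A"},
    {"_id": "A", "followedBy": "D"},
    {"_id": "C", "precededBy": "B"},
]
print(ordered_ids(docs))
```

The walk is linear in the number of documents, but it does require loading the whole list, which is the trade-off this approach makes.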
Summary:
I have a list of work items that I am attempting to assign to a list of workers. Each worker is allowed a maximum of 100 assigned work items. Each work item specifies the user that should work it (associated as an owner).
For example:
Jim works a total of 5 accounts, each with multiple work items. In total Jim already has 50 items assigned to him, so I am allowed to assign only 50 more.
My plight/goal:
I am using a temp table and a select statement to get the number of items each owner currently has assigned, and I calculate the available slots for new items and store the value in a new column. I need to select from the items table where the owner matches my list of owners (in the temp table), retrieving for each user only as many rows as that user has available slots. The query would return only 50 rows for Jim even though 200 may match the criteria, while Sam may get 0 rows because he has no available slots even though there are 30 items for him in the items table.
I realize I may be approaching this problem wrong. I want to avoid using a cursor.
Edit: Adding some example code
SELECT
nUserID_Owner
, CASE
WHEN COUNT(c.nWorkID) >= 100 THEN 0
ELSE 100 - COUNT(c.nWorkID)
END
,COUNT(c.nWorkID)
FROM tblAccounts cic
LEFT JOIN tblWorkItems c
ON c.sAccountNumber = cic.sAccountNumber
AND c.nUserID_WorkAssignedTo = cic.nUserID_Owner
AND c.nTeamID_WorkAssignedTo = cic.nTeamID_Owner
WHERE cic.nUserID_Collector IS NOT NULL
AND nUserID_CurrentOwner = 5288
AND c.bCompleted = 0
GROUP BY nUserID_Owner
This provides output values of 5288, 50, 50 (in Jim's scenario).
It took longer than I wanted it to but I found a solution.
I did use a sub-query as suggested above to produce the work items with a unique row count by user.
I used PARTITION BY to produce a unique row count for each worker and included in my HAVING clause that the row number must be < the count of available slots. I'd post the code, but it's beyond the character limit and I'd also have to change a lot of things to anonymize the system properly.
Originally I was approaching the problem incorrectly focusing on limiting the results rather than thinking about creating the necessary data to relate the result sets.
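Since the actual query was not posted, here is only a plain-Python illustration of the idea: number the rows per owner, then keep a row while its number still fits into that owner's free slots. The names `pick_items`, `work_items` and `available_slots` are made up for the sketch, and the comparison uses <= (with numbering starting at 1) so that exactly the number of free slots is returned, which is an assumption about the intended behaviour.

```python
from collections import defaultdict


def pick_items(work_items, available_slots):
    """Keep at most available_slots[owner] items per owner.

    work_items      -- iterable of (item_id, owner_id) in the desired priority order
    available_slots -- dict mapping owner_id to the number of free slots
    """
    taken = defaultdict(int)
    chosen = []
    for item_id, owner_id in work_items:
        # Plays the role of a per-owner row number (numbering restarts per owner).
        row_number = taken[owner_id] + 1
        if row_number <= available_slots.get(owner_id, 0):
            taken[owner_id] = row_number
            chosen.append((item_id, owner_id))
    return chosen


items = [(i, "jim") for i in range(200)] + [(200 + i, "sam") for i in range(30)]
result = pick_items(items, {"jim": 50, "sam": 0})
print(len(result))
```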
I'd like to ask for advice on one thing. I have a table in a DB. It has 2 columns and looks like this:
Name...bilance
Jane...+3
Jane...-5
Jane...0
Jane...-8
Jane...-2
Paul...-1
Paul...2
Paul...9
Paul...1
...
I have to walk through this table, and when I find a record with a different "name" than the previous row, I process all rows with the previous "name". (When I reach the first Paul row, I process all Jane rows.)
The processing goes like this:
Now I work only with the Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows one by one.
The task is to summarize the "bilance" column (within the scope of the current person) where the values have different signs.
Summary:
I loop through this table on 3 nested levels:
1st level = search for changes of the "name" column
2nd level = if a change was found, get all rows with the previous "name" and walk through them
3rd level = on each row, stop and walk through all previous rows with the current "name"
Can this be solved only using CURSOR and FETCH, or is there some smoother solution?
My real table has 30 000 rows and 1500 people, and if I do the logic in PHP it takes several minutes and then times out. So I would like to rewrite it for MS SQL 2000 (no other DB is allowed). Are cursors a fast solution, or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". The problem is a little more difficult than I explained; I simplified it just to describe my algorithm.
Each row of my table contains many more columns. The most important is month. That's why there are several rows for each person: each is for a different month.
The "bilances" are overtime hours and arrears hours of workers. I need to net the + and - bilances against each other, using values from previous months, so that I end up with as many zeroes as possible. The table must otherwise stay as it is; only the bilances are changed towards zero.
Example:
Row (Jane -5) will be summarized with row (Jane +3). Instead of 3 I will get 0 and instead of -5 I will get -2. Because I used this -5 to reduce +3.
Next row (Jane 0) won't be affected
Next row (Jane -8) cannot be used, because none of the previous bilances is positive any more
etc.
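The per-person netting described in the example can be sketched outside SQL as follows. This is one reading of the rules above, not a confirmed specification: each row is compared with all earlier rows of the same person, and two values of opposite sign cancel each other as far as the smaller one allows. The function name `neutralize` is hypothetical.

```python
def neutralize(balances):
    """Net each value against earlier values of the opposite sign.

    balances -- one person's "bilance" values in month order
    Returns a new list of the same length with as many zeroes as possible.
    """
    result = list(balances)
    for i in range(len(result)):
        for j in range(i):
            if result[i] == 0:
                break  # nothing left on this row to offset
            if result[i] * result[j] < 0:  # opposite signs
                offset = min(abs(result[i]), abs(result[j]))
                # Move both values towards zero by the same amount.
                result[i] += offset if result[i] < 0 else -offset
                result[j] += offset if result[j] < 0 else -offset
    return result


print(neutralize([3, -5, 0, -8, -2]))  # Jane's rows from the example
```

For Jane this reproduces the walk-through above: +3 and -5 become 0 and -2, and the later rows stay untouched. The two nested loops are quadratic per person, which is usually acceptable for a handful of months per worker.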
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line... "The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs".
But, it may be possible to use a group by query to get a lot of what you need.
select name,
       case when bilance < 0 then 'negative' else 'positive' end as sign,
       count(*)
from table
group by name,
         case when bilance < 0 then 'negative' else 'positive' end
That might not be perfect syntax for the case statement, but it should get you really close.
To put this work in context: I'm trying to filter a database of objects and build descriptions which can be verbalized for a speech UI. To minimise the descriptions, I want to find the shortest way to describe an object, based on the idea of Grice's Maxims.
It's possible in code by iterating through the records and running through all permutations, but I keep thinking there ought to be a way to do this in SQL... so far I haven't found it. (I'm using PostgreSQL.)
So I have a table that looks something like this:
id     colour   position   height
(int)  (text)   (text)     (int)
0      "red"    "left"     9
1      "red"    "middle"   8
2      "blue"   "middle"   8
3      "blue"   "middle"   9
4      "red"    "left"     7
There are two things I wish to find based on the attributes (excluding the ID).
a) are any of the records unique, based on the minimum number of attributes?
=> e.g. record 0 is unique based on colour and height
=> e.g. record 1 is the only red item in the middle
=> e.g. record 4 is unique as it's the only one which has a height of 7
b) how is a particular record unique?
=> e.g. how is record 0 unique? because it is the only item with a colour red, and height of 9
=> e.g. record 4 is unique because it is the only item with a height of 7
It may of course be that no objects are unique based on the attributes which is fine.
+++++++++++++++++++++++++
Answer for (a)
So the only way I can think to do this in SQL is to start off by testing a single attribute to see if there is a single match from all records. If not then add attribute 2 and test again. Then try attributes 1 and 3. Finally try attributes 1,2 and 3.
Something like this:-
single column test:
select * from griceanmaxims
where height in
    (select height
     from griceanmaxims
     group by height
     having count(*) = 1)
or relpos in
    (select relpos
     from griceanmaxims
     group by relpos
     having count(*) = 1)
or colour in
    (select colour
     from griceanmaxims
     group by colour
     having count(*) = 1)
double column tests:
(Select colour,relpos
from griceanmaxims
group by colour,relpos
having (count(colour)=1))
(Select colour,height
from griceanmaxims
group by colour,height
having (count(colour)=1))
etc
++++++++
I'm not sure if there's a better way or how to join up the results from the double column tests.
Also if anyone has any suggestions on how to determine the distinguishing factors for a record (as in question b), that would be great. My guess is that (b) would require (a) to be run for all of the field combinations, but I'm not sure if there's a better way.
Thanks in advance for any help on this one....
I like the idea of attacking the problem using a general-purpose language, e.g. C#:
1) Iterate through and see if any have 1 attribute which is unique, e.g. ID = 4, which is unique because height is 7. Take ID 4 out of the 'doing' collection, and put it into the 'done' collection with the appropriate attribute
Use a unit testing tool, e.g. MSUNIT, to prove the above works
2) Try and extend to n attributes
Unit Test
3) See if any can be unique with 2 attributes. Take those IDs out of doing and into done with the pairs of attributes
Unit Test
4) Extend to m attributes
Unit Test
5) Refactor, maybe using recursion
Hope this helps.
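The steps above can be sketched as follows, in Python rather than C# and without the unit-testing scaffolding. Everything here is illustrative: the function name `shortest_descriptions` is made up, and the sample data is the table from the question. Attribute subsets are tried from smallest to largest, and a record stops being considered once it has been singled out by a smaller subset.

```python
from collections import Counter
from itertools import combinations

ATTRIBUTES = ("colour", "position", "height")
RECORDS = {
    0: {"colour": "red", "position": "left", "height": 9},
    1: {"colour": "red", "position": "middle", "height": 8},
    2: {"colour": "blue", "position": "middle", "height": 8},
    3: {"colour": "blue", "position": "middle", "height": 9},
    4: {"colour": "red", "position": "left", "height": 7},
}


def shortest_descriptions(records, attributes):
    """Map each record id to the smallest attribute subsets that single it out."""
    found = {}
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            # How many records share each combination of values for this subset?
            counts = Counter(
                tuple(rec[a] for a in subset) for rec in records.values()
            )
            for rec_id, rec in records.items():
                already = found.get(rec_id)
                if already and len(already[0]) < size:
                    continue  # a shorter description exists already
                if counts[tuple(rec[a] for a in subset)] == 1:
                    found.setdefault(rec_id, []).append(subset)
    return found


for rec_id, subsets in sorted(shortest_descriptions(RECORDS, ATTRIBUTES).items()):
    print(rec_id, subsets)
```

Records that never become unique are simply absent from the result, which covers the case where no description exists. The result also answers question (b): the listed subsets are the distinguishing attributes of each record.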
I want to create a table with each row containing some sort of weight. Then I want to select random rows with probability equal to (weight of that row)/(sum of the weights of all rows). For example, with 5 rows weighted 1,2,3,4,5, out of 1000 draws I'd get the first row approximately 1/15*1000=67 times, and so on.
The table is to be filled manually. Then I'll take a random value from it. But I want to have an ability to change the probabilities on the filling stage.
I found this nice little algorithm in Quod Libet. You could probably translate it to some procedural SQL.
function WeightedShuffle(list of items with weights):
    max_score ← the sum of every item’s weight
    choice ← random number in the range [0, max_score)
    current ← 0
    for each item (i, weight) in items:
        current ← current + weight
        if current ≥ choice or i is the last item:
            return item i
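A runnable Python rendering of that pseudocode, as a sketch: the function name `weighted_choice` and the optional `rng` parameter are additions for illustration. It uses a strict comparison (running total greater than the draw) instead of ≥, so that a zero-weight item is skipped even when the draw is exactly 0.

```python
import random


def weighted_choice(weighted_items, rng=random):
    """Pick one item with probability weight / (sum of all weights).

    weighted_items -- non-empty list of (item, weight) pairs, weights >= 0
    rng            -- anything with a random() method returning a float in [0, 1)
    """
    total = sum(weight for _, weight in weighted_items)
    threshold = rng.random() * total  # uniform in [0, total)
    running = 0
    for item, weight in weighted_items:
        running += weight
        if running > threshold:
            return item
    # Only reached through floating-point round-off: fall back to the last item.
    return weighted_items[-1][0]


rows = [("row1", 1), ("row2", 2), ("row3", 3), ("row4", 4), ("row5", 5)]
print(weighted_choice(rows))
```

The same cumulative-sum walk can be written in procedural SQL by iterating over the rows with a running total.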
The easiest (and maybe best/safest?) way to do this is to add each row to the table as many times as its weight. Say I want "Tree" to be found 2x more often than "Dog": I insert "Tree" 2 times into the table and "Dog" once, and then just select elements at random one by one.
If the rows are complex/big then it would be best to create a separate table (weighted_Elements or something) in which you'll just have foreign keys to the real rows inserted as many times as the weights dictate.
The best possible scenario (if I understand your question properly) is to set up your table as you normally would and then add two columns, both INTs.
Column 1: Weight - This column holds your weight value, going from -X to +X, X being the highest value you want to have as a weight (e.g. X=100 gives -100 to 100). This value gives the row an actual weight and increases or decreases the probability of it coming up.
Column 2: Count - This column holds the count of how many times this row has come up. It is needed only if you want to use fair weighting. Fair weighting prevents one row from always showing up (e.g. if you have one row weighted at 100 and another at 2, the row with 100 will always show up; this column allows the weight 2 row to become more 'valuable' as you get more weight 100 results). This column should be incremented by 1 each time a row is pulled, but you can make the logic more advanced later so that it adds the weight, etc.
Logic: It's really simple now. Your query requests all rows as you normally would, then makes an extra select that (you can change the logic here to whatever you want) takes the weight, subtracts the count, and orders by that value.
The end result should be a table where heavier rows appear more often up to a certain point, after which the system evens itself out. If you leave out column 2, you will have a system that always returns the same weighted order unless you offset the base of the query (i.e. LIMIT [RANDOM NUMBER], [NUMBER OF ROWS TO RETURN]).
I'm not an expert in probability theory, but assuming you have a column called WEIGHT, how about
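select FIELD_1, ... FIELD_N, (rand() * WEIGHT) as SCORE
from YOURTABLE
order by SCORE desc
limit 0, 10
This would give you the 10 highest-scoring records, but you can change the limit clause, of course. Heavier rows are favoured, although the selection probability is not exactly proportional to the weight.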
The problem is called Reservoir Sampling (https://en.wikipedia.org/wiki/Reservoir_sampling)
The A-Res algorithm is easy to implement in SQL:
SELECT *
FROM table
ORDER BY pow(rand(), 1 / weight) DESC
LIMIT 10;
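For comparison, the same keying trick outside the database, as a sketch: each item gets the key random ** (1 / weight) and the k largest keys win, which mirrors the ORDER BY expression above. The function name `a_res_sample` and the `rng` parameter are illustrative, and all weights are assumed to be positive.

```python
import heapq
import random


def a_res_sample(weighted_items, k, rng=random):
    """Draw k distinct items, favouring heavier ones, using the A-Res keying.

    weighted_items -- iterable of (item, weight) pairs with weight > 0
    rng            -- anything with a random() method returning a float in [0, 1)
    """
    keyed = (
        (rng.random() ** (1.0 / weight), item) for item, weight in weighted_items
    )
    # Keep the k pairs with the largest keys; compare on the key only.
    winners = heapq.nlargest(k, keyed, key=lambda pair: pair[0])
    return [item for _, item in winners]


rows = [(1, 5), (2, 1)]  # (id, weight)
print(a_res_sample(rows, 1))
```

With k = 1 and these two rows, id 1 should come out about 5 times in 6, that is, proportionally to its weight.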
I came looking for the answer to the same question - I decided to come up with this:
id weight
1 5
2 1
SELECT * FROM table ORDER BY RAND()/weight
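It's not exact, but since it uses random numbers I wouldn't expect it to be. I ran it 70 times and got number 2 first 10 times. I would have expected 1/6th but I got 1/7th, which I'd say is pretty close. I'd have to run a script a few thousand times to get a really good idea of whether it's working. (For these two rows the formula puts the lighter row first with probability 1/10 rather than 1/6, so it favours heavier rows but is not exactly proportional.)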