I have a table "People" with a primary key "PersonID" and a field that is "Supervisor". The "Supervisor" field contains the foreign key of a "PersonID" to create a self join.
I would like to create an sql query that returns all people with "Me" (the PersonID that is logged into the database) as their supervisor, and anyone that has someone on that list labeled as their supervisor. Essentially I would like to list anyone below the supplied PersonID in the chain of command.
SQL is great for many things, but hierarchical data is one of its bigger challenges. Some vendors have provided custom extensions to work around this (e.g. Oracle's CONNECT BY syntax or SQL Server's hierarchyid data type), but we probably want to keep this standard SQL¹.
What you have modeled is called an "adjacency list" -- it is very simple, straightforward, and always consistent². But as you found out, it sucks for querying, especially for an unknown depth or for a subtree rather than from the root node.
Therefore, we need to supplement this with an additional model. There are basically 3 other models that you should use in conjunction with the adjacency list model.
Nested sets
Materialized Path
Ancestry traversal closure
To study them in depth, we'll use an example hierarchy: Alice is at the root; Bob, Christina, and Dwayne report to her; Erin, Frank, and Georgia report to Bob; Harry reports to Georgia; Isabella, Jake, and Lana report to Christina; Kirby reports to Jake; Mike, Opus, and Rianna report to Dwayne; and Norma reports to Mike.
For this discussion, we are also assuming this is a simple hierarchy, with no cycles.
Joe Celko's Nested Sets.
Basically, you store the "Left" and "Right" value of each node which indicates its position in the tree. The root node will always have 1 for "Left" and <count of nodes * 2> for "Right". This is easier to illustrate with a diagram:
Note that each node gets assigned a pair of number, one for "Left", and other for "Right". With that information, you can do some logical deductions. Finding all children becomes easy - you filter for values where the nodes' "Left" is greater than the target node's "Left" and where the same nodes' "Right" is smaller than the target node's "Right".
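For example, fetching everyone under a given node might look like this (a sketch only -- it assumes the pair is stored in columns named Lft and Rgt, since Left and Right are reserved words, and uses Bob's PersonID = 2 from the example hierarchy):

SELECT child.*
FROM People AS child, People AS boss
WHERE boss.PersonID = 2
AND child.Lft > boss.Lft
AND child.Rgt < boss.Rgt;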
The biggest downside of this model is that a change to the hierarchy almost invariably requires renumbering the entire tree, which makes it very painful to maintain for a fast-moving chart. If this is something you only update once a year, that might be acceptable.
The other issue with this model is that if you need multiple hierarchies, nested sets will not work without additional columns to track each separate hierarchy.
Materialized Path
You know how a filesystem path works, right? This is basically the same thing, except that we are storing it in the database³. For instance, a possible implementation of a materialized path might look like this:
ID Name Path
1 Alice 1/
2 Bob 1/2/
3 Christina 1/3/
4 Dwayne 1/4/
5 Erin 1/2/5/
6 Frank 1/2/6/
7 Georgia 1/2/7/
8 Harry 1/2/7/8/
9 Isabella 1/3/9/
10 Jake 1/3/10/
11 Kirby 1/3/10/11/
12 Lana 1/3/12/
13 Mike 1/4/13/
14 Norma 1/4/13/14/
15 Opus 1/4/15/
16 Rianna 1/4/16/
This is quite intuitive and can perform OK as long as you write your SQL queries to use predicates like WHERE Path LIKE '1/4/*'; the engine will be able to use an index on the path column. Note that if your queries start from the middle of the tree or go bottom-up, the index cannot be used and performance will suffer. But programming against a materialized path is pretty easy to understand, and updating a part of the tree won't propagate to unrelated nodes the way nested sets do, so that's also a plus in its favor.
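For example, to fetch the whole subtree when you only know a node's ID, you can join the table to itself (a sketch, assuming the table above is named People and keeps the ID and Path columns shown):

SELECT child.*
FROM People AS child, People AS boss
WHERE boss.ID = 2
AND child.ID <> boss.ID
AND child.Path LIKE boss.Path & "*";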
The biggest downside is that to be indexable, the path has to be a short text column. For an Access database, that puts a 255-character limit on your path field. Even worse, there is no good way to predict when you are about to hit the limit -- you could hit it because your tree is too deep, or because it is too wide (e.g. bigger numbers taking up too much space). For that reason, large trees might necessitate some hard-coded depth limit to avoid this situation.
Ancestry Traversal Closure
This model involves a separate table which is updated whenever the employee table is updated. Instead of recording only the immediate relationship, we enumerate all the ancestries between two nodes. To illustrate, this is what the tables will look like:
Employee table:
ID Name
1 Alice
2 Bob
3 Christina
4 Dwayne
5 Erin
6 Frank
7 Georgia
8 Harry
9 Isabella
10 Jake
11 Kirby
12 Lana
13 Mike
14 Norma
15 Opus
16 Rianna
Employee Ancestry Table:
Origin Ancestor
1 1
2 1
2 2
3 1
3 3
4 1
4 4
5 1
5 2
5 5
6 1
6 2
6 6
7 1
7 2
7 7
8 1
8 2
8 7
8 8
9 1
9 3
9 9
10 1
10 3
10 10
11 1
11 3
11 10
11 11
12 1
12 3
12 12
13 1
13 4
13 13
14 1
14 4
14 13
14 14
15 1
15 4
15 15
16 1
16 4
16 16
As you see, we generate one row for every ancestor/descendant relationship between two nodes. As a bonus, because it's a table, we can make use of foreign keys and cascading deletes to help keep it consistent; we still have to manage the inserts & updates manually, however. Because the table is also narrow, it is very easy to write queries that leverage indexes on the origin and the ancestor to find the subtree, the children, or the parents. This is the most flexible system, at the expense of extra complexity around maintenance.
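For example, assuming the ancestry table is named EmployeeAncestry, the whole subtree under Bob (ID 2) and the chain of command above Harry (ID 8) are both simple indexed joins -- a sketch:

SELECT e.*
FROM Employee AS e INNER JOIN EmployeeAncestry AS a ON e.ID = a.Origin
WHERE a.Ancestor = 2 AND a.Origin <> 2;

SELECT e.*
FROM Employee AS e INNER JOIN EmployeeAncestry AS a ON e.ID = a.Ancestor
WHERE a.Origin = 8 AND a.Ancestor <> 8;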
Maintaining the model
All 3 models discussed basically denormalize the data a bit in order to simplify the queries and support searches of arbitrary depth. A consequence is that we must manually manage the changes whenever the employee table is modified.
The simplest approach is to write a VBA procedure that truncates and rebuilds the entire chart using your preferred model. This can work very well when the chart is small or does not change often.
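A minimal sketch of such a rebuild for the ancestry closure model, assuming tables named Employee and EmployeeAncestry as above, with a SupervisorID column holding the adjacency list (Null at the root):

Public Sub RebuildAncestry()
    Dim dbs As DAO.Database
    Dim rst As DAO.Recordset
    Dim varAncestor As Variant
    Set dbs = CurrentDb()
    ' Truncate, then walk up the adjacency list from every employee,
    ' writing one (Origin, Ancestor) row per hop (including self).
    dbs.Execute "DELETE * FROM EmployeeAncestry", dbFailOnError
    Set rst = dbs.OpenRecordset("SELECT ID FROM Employee")
    Do While Not rst.EOF
        varAncestor = rst!ID
        Do While Not IsNull(varAncestor)
            dbs.Execute "INSERT INTO EmployeeAncestry (Origin, Ancestor) " & _
                "VALUES (" & rst!ID & ", " & varAncestor & ")", dbFailOnError
            varAncestor = DLookup("SupervisorID", "Employee", "ID = " & varAncestor)
        Loop
        rst.MoveNext
    Loop
    rst.Close
End Sub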
On the other end, you could consider using data macros on your employee table to perform the maintenance required to propagate updates to the hierarchy. A caveat, though: data macros make it harder to port the data to another RDBMS, since none of them support data macros. (To be fair, the problem would still exist if you were porting from SQL Server's stored procedures/triggers to Oracle's stored procedures/triggers -- those are so steeped in the vendor's dialect that porting is a challenge.) Using data macros or triggers + stored procedures means you can rely on the engine to maintain the hierarchy for you without any programming in the forms.
A common temptation is to use the form's AfterUpdate event to maintain the changes, and that would work... unless someone updates the table outside the form. For that reason, I would actually prefer a data macro over relying on everyone always using the form.
Note that in all of this discussion, we should NOT discard the adjacency list model. As I commented earlier, it is the most normalized and consistent way to model the hierarchy -- it is literally impossible to create a nonsensical hierarchy with it. For that reason alone, you should keep it as your "authoritative truth", upon which you build your chosen model to aid query performance.
Another good reason to keep the adjacency list model: whichever model you choose above, it introduces additional columns or tables that are not meant to be edited directly by users -- their purpose is roughly equivalent to a calculated field, and they should not be tinkered with. If users are allowed to edit only the SupervisorID field, it becomes easy to code your data macros/triggers/VBA procedures around that one field, updating the "calculations" in the additional fields/tables to keep them correct for the queries that depend on them.
1. The SQL standard does describe a way to write a recursive query. However, compliance with that particular feature seems to be poor, and the performance may not be great (which is the case with SQL Server's particular implementation). The 3 models discussed are easily implemented in most RDBMSs, and queries against the hierarchy can be easily written and ported. However, the code to automatically manage changes to the hierarchy invariably requires vendor-specific dialect, using triggers or stored procedures, which is not very portable.
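For reference, for the original question such a recursive query would look roughly like this (a sketch in the standard dialect, here for a supervisor with PersonID 5; Access does not support it):

WITH RECURSIVE Subordinates AS (
    SELECT PersonID FROM People WHERE Supervisor = 5
    UNION ALL
    SELECT p.PersonID
    FROM People AS p INNER JOIN Subordinates AS s ON p.Supervisor = s.PersonID
)
SELECT PersonID FROM Subordinates;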
2. When I say consistent, I mean only that the model cannot create nonsensical output. It's still possible to enter wrong data and get a weird hierarchy -- such as an employee's supervisor reporting to that employee -- but the result is still a hierarchy (even if it ends up as a cyclic graph), not one with undefined results. With the other models, failing to maintain the derived data correctly means the queries will start returning wrong results.
3. SQL Server's hierarchyid data type is in fact an implementation of this model.
As you probably have a rather limited count of levels -- say, six deep -- you can use a query with subqueries within subqueries... etc. Very simple.
For an unlimited number of levels, the fastest method I've found is to create a lookup function which walks the tree for each record. This can output either the level of the record or a compound key built from the key of the record and all keys above it.
As the lookup function will use the same recordset for every call, you can make it static, and (for JET) you can improve further by using Seek to locate the records.
Here's an example which will give you an idea:
Public Function RecursiveLookup(ByVal lngID As Long) As String
    Static dbs As Database
    Static tbl As TableDef
    Static rst As Recordset

    Dim lngLevel As Long
    Dim strAccount As String

    If dbs Is Nothing Then
        ' For testing only.
        ' Replace with OpenDatabase of backend database file.
        Set dbs = CurrentDb()
        Set tbl = dbs.TableDefs("tblAccount")
        Set rst = dbs.OpenRecordset(tbl.Name, dbOpenTable)
    End If

    With rst
        .Index = "PrimaryKey"
        While lngID > 0
            .Seek "=", lngID
            If Not .NoMatch Then
                lngLevel = lngLevel + 1
                lngID = !MasterAccountFK.Value
                If lngID > 0 Then
                    strAccount = Str(!AccountID) & strAccount
                End If
            Else
                lngID = 0
            End If
        Wend
        ' Leave recordset open.
        ' .Close
    End With

    ' Don't terminate static objects.
    ' Set rst = Nothing
    ' Set tbl = Nothing
    ' Set dbs = Nothing

    ' Alternative expression for returning the level
    ' (adjust the return type of the function accordingly):
    ' RecursiveLookup = lngLevel ' As Long

    RecursiveLookup = strAccount
End Function
This assumes a table with a primary key ID and a foreign (master) key pointing to the parent record - and a top level record (not used) with a visible key (AccountID) of 0.
Now your tree will be nicely shown almost instantaneously using a query like this where Account will be the visible compound key:
SELECT
*, RecursiveLookup([ID]) AS Account
FROM
tblAccount
WHERE
(AccountID > 0)
ORDER BY
RecursiveLookup([ID]);
If you wish to use this to add records to another table, you should not make an SQL call for each row, as this is very slow; instead, first open a recordset, then use AddNew/Update to append each record, and finally close that recordset.
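A sketch of that pattern, assuming rstIn is an already-open recordset of source rows and tblTarget is the (hypothetical) table to append to:

Dim rstOut As DAO.Recordset
Set rstOut = CurrentDb.OpenRecordset("tblTarget")
Do While Not rstIn.EOF
    rstOut.AddNew
    ' Copy or compute the fields to append.
    rstOut!Account = RecursiveLookup(rstIn!ID)
    rstOut.Update
    rstIn.MoveNext
Loop
rstOut.Close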
Consider the following set of functions:
Function BuildQuerySQL(lngsid As Long) As String
    Dim intlvl As Integer
    Dim strsel As String: strsel = selsql(intlvl)
    Dim strfrm As String: strfrm = "people as p0 "
    Dim strwhr As String: strwhr = "where p0.supervisor = " & lngsid

    ' Keep nesting one more self-join level until it returns no rows.
    While HasRecordsP(strsel & strfrm & strwhr)
        intlvl = intlvl + 1
        BuildQuerySQL = BuildQuerySQL & " union " & strsel & strfrm & strwhr
        strsel = selsql(intlvl)
        If intlvl > 1 Then
            strfrm = "(" & strfrm & ")" & frmsql(intlvl)
        Else
            strfrm = strfrm & frmsql(intlvl)
        End If
    Wend

    ' Strip the leading " union ".
    BuildQuerySQL = Mid(BuildQuerySQL, 8)
End Function

Function HasRecordsP(strSQL As String) As Boolean
    Dim dbs As DAO.Database
    Set dbs = CurrentDb
    With dbs.OpenRecordset(strSQL)
        HasRecordsP = Not .EOF
        .Close
    End With
    Set dbs = Nothing
End Function

Function selsql(intlvl As Integer) As String
    selsql = "select p" & intlvl & ".personid from "
End Function

Function frmsql(intlvl As Integer) As String
    frmsql = " inner join people as p" & intlvl & " on p" & intlvl - 1 & ".personid = p" & intlvl & ".supervisor "
End Function
Here, the BuildQuerySQL function is supplied with the PersonID of a supervisor, and it returns the SQL for an appropriate 'recursive' query that obtains the PersonID of every subordinate of that supervisor.
Such a function may therefore be evaluated to construct a saved query -- e.g. for a supervisor with PersonID = 5, creating a query called Subordinates:
Sub test()
    CurrentDb.CreateQueryDef "Subordinates", BuildQuerySQL(5)
End Sub
Or the SQL may be used to open a Recordset of the results, depending on the requirements of your application.
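For example (a sketch):

Dim rst As DAO.Recordset
Set rst = CurrentDb.OpenRecordset(BuildQuerySQL(5))
' ... iterate rst to process each subordinate's PersonID ...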
Note that the function constructs a UNION query, with each level of nesting unioned with the previous query.
After considering the options presented here, I have decided that I was going about this the wrong way. I have added a field "PermissionsLevel" to the "People" table, which is a lookup from another table with a simple "PermissionNumber" and "PermissionDescription". I then use a Select Case on the logged-in user's permission level in the Form_Load() event.
Select Case userPermissionLevel
    Case Creator
        ' Queries everyone in the database
    Case Administrator
        ' Queries everyone in the "Department" they are a member of
    Case Supervisor
        ' Queries all people WHERE supervisor = userID OR
        '   supervisor IN (SELECT PersonID FROM People WHERE supervisor = userID)
    Case Custodian ' (Person in charge of maintaining the HAZMAT Cabinet and SDS)
        ' Queries WHERE supervisor = DLookup("Supervisor", "People", "PersonID = " & userID)
End Select
Related
I need to process a list of technical skills one by one and get a count of the number of developers we have in 3 locations who have each skill. For example, skill = "Java": how many persons have this skill listed in their resume?
I have 2 tables:
Skills: contains a single column listing skills - "Java" for example
Resources: contains 4 columns, Resource-ID, Name, Location, and a long text field called "Resume" containing text of their resume.
If this were a single skill, I would process the SQL something like below (SQL syntax not exact):
SELECT count FROM [Resources] WHERE ([Resources].[Resume] Like "SKILL-ID*");
I want to process the Skills table serially, printing each "Skill" and the count for each location.
Help appreciated.
I've only used Access as a DB for single-record retrieval, never using table values as input to loop through a process. I suspect that this is a simple task in MS Access.
Ok, so we have to process that table.
I would simple "send out" a row with Location and skill for each match. We could write some "messy" code to then group by, but that is what SQL is for!!!
We could quite easily keep the results in code, but it's better to send the results out to a working table. Then we can use what SQL does best: group and count that data.
So, then, the code could be this:
Sub CountSkills()
    ' Empty out our working report table.
    CurrentDb.Execute "DELETE * FROM ReportResult"

    Dim rstSkills As DAO.Recordset
    Dim rstResources As DAO.Recordset
    Dim rstReportResult As DAO.Recordset
    Dim strSQL As String

    Set rstSkills = CurrentDb.OpenRecordset("Skills")
    strSQL = "SELECT Location, Resume FROM Resources " & _
             "ORDER BY Location"
    Set rstResources = CurrentDb.OpenRecordset(strSQL)
    Set rstReportResult = CurrentDb.OpenRecordset("ReportResult")

    Do While rstResources.EOF = False
        ' Now, for each resource row, scan the whole skill list.
        rstSkills.MoveFirst
        Do While rstSkills.EOF = False
            If InStr(rstResources!Resume, rstSkills!Skill) > 0 Then
                rstReportResult.AddNew
                rstReportResult!Location = rstResources!Location
                rstReportResult!Skill = rstSkills!Skill
                rstReportResult.Update
            End If
            rstSkills.MoveNext
        Loop
        rstResources.MoveNext
    Loop
End Sub
Now, the above will wind up with a table looking like this:
So, now we can query (and count) against above data.
So, this query would do the trick:
SELECT Location, Skill, Count(1) AS SkillCount
FROM ReportResult
GROUP BY Location, Skill
And now we get this:
And you can flip the above query to group by skill, then location, if you wish.
so, at the most simple level?
We write out ONE row + location for every match, and then use SQL on that to group by and count.
We COULD write code to actually count up by location, but as noted that VBA code would be a bit messy. Just spitting out one row of location + skill per match means we can then get counts by skill, by location, or by location and skill, just by running "simple" SQL against that list of location and skill records.
So, now use the report wizard on that query above, and we get something like this:
Of course it is simple to change around the above report, but you get the idea -- this is quite a simple task, as you noted.
Summarizing the count of developers by skill and location can be accomplished with SQL alone. It requires a dataset of all possible skill/location pairs. Consider this simple example:
Resources

ID  Name  Location  Resume
1   a     x         Java,Excel
2   b     x         Excel
3   c     y         Excel
4   d     z         VBA,Java
SELECT Skills.SkillName, Resources.Location,
Sum(Abs([Resume] Like "*" & [SkillName] & "*")) AS DevCt
FROM Skills, Resources
GROUP BY Skills.SkillName, Resources.Location;
SkillName  Location  DevCt
Excel      x         2
Excel      y         1
Excel      z         0
Java       x         1
Java       y         0
Java       z         1
VBA        x         0
VBA        y         0
VBA        z         1
This approach utilizes a Cartesian product of the Skills and Resources tables to generate the data pairs. This type of query can perform slowly with a large dataset. If it is too slow, saving the pairs to a table could improve performance. Otherwise, a VBA solution will be the only recourse, and would likely involve looping through a recordset object.
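Saving the results could be a single make-table query, run whenever the data changes (a sketch; SkillLocationCounts is a hypothetical table name):

SELECT Skills.SkillName, Resources.Location,
Sum(Abs([Resume] Like "*" & [SkillName] & "*")) AS DevCt
INTO SkillLocationCounts
FROM Skills, Resources
GROUP BY Skills.SkillName, Resources.Location;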
Regardless of approach, be aware that results will be skewed if a Resume has content like "excellence" (which matches "Excel"). Bad data output is a pitfall of poor database design.
Solution:
The Resources and Skills tables were added into MS Access.
Step 1: Create a query that executes the SQL below to get the counts (here named "Step1Query"):
SELECT Skills.SkillName, Resources.Location,
Sum(Abs([Resume] Like "*" & [SkillName] & "*")) AS DevCt
FROM Skills, Resources
GROUP BY Skills.SkillName, Resources.Location;
Step 2: Create a second query that uses the Step 1 query as input (you can do this via the crosstab wizard):
TRANSFORM Sum(Step1Query.DevCt) AS SumOfDevCt
SELECT Step1Query.SkillName
FROM Step1Query
GROUP BY Step1Query.SkillName
PIVOT Step1Query.Location;
The result is listed out in matrix form. Thanks all for your help.
I have the following SQL query which I am loading in to a DataSet:
SELECT i1.* , i2.* From tblMMLettersImportTable i1 Join tblMMLettersImportTable i2 on i1.SectionID + 1 = i2.SectionID Where i2.startpage - i1.endpage <> 1
The idea is to check that the index for the various sections of a document leads from one page straight on to the next with no gaps, i.e. section 2 ends on page 5 and section 3 starts on page 6.
I'm happy that the SQL works; however, by joining the table on itself, the field "SectionID" is duplicated. In SQL that's easy enough -- just use the i1. or i2. prefix to reference the correct one.
The issue comes when I load this in to a VB.net Dataset. I need to raise an error message with something like:
MessageBox.Show("There is a page gap between sections " & row.item("i1.sectionID") & " and " & row.item("i2.sectionID")
I get the error message Column 'i1.intline' does not belong to table Table. Makes sense, as that is not its name in the DataSet. I've considered using the column number to reference the item to pull out; however, the SQL table tblMMLettersImportTable is deleted, created and populated dynamically depending on the type of letter/document being produced, so I cannot always guarantee that the columns will be numbered the same. This is also why i1.* and i2.* are used instead of listing each column.
Is there a way that I can reference 2 items in a DataSet that have the same item name with VB.Net?
I have a table in MS Access which has stock prices arranged like
Ticker1, 9:30:00, $49.01
Ticker1, 9:30:01, $49.08
Ticker2, 9:30:00, $102.02
Ticker2, 9:30:01, $102.15
and so on.
I need to do some calculations where I compare the price in one row with the immediately previous price, and if the price movement is greater than X% in 1 second, I need to report the instance separately.
If I were doing this in Excel, it would be a fairly simple formula, but I have a few million rows of data, so that is not an option.
Any suggestions on how I could do it in MS Access?
I am open to any kind of solutions (with or without SQL or VBA).
Update:
I ended up traversing my records using ADODB.Recordset objects in nested loops; code below. I thought it was a good idea, and the logic worked for a small table (20k rows). But when I ran it on a larger table (3m rows), Access ballooned to its 2 GB limit without finishing the task (because of temporary tables -- the size of the original table was more like ~300 MB). Posting it here in case it helps someone with smaller data sets.
Do While Not rstTickers.EOF
    myTicker = rstTickers!ticker
    rstDates.MoveFirst
    Do While Not rstDates.EOF
        myDate = rstDates!Date_Only
        ' Get all prices for a given ticker for a given date.
        strSql = "select * from Prices where ticker = """ & myTicker & """ and Date_Only = #" & myDate & "#"
        rst.Open strSql, cn, adOpenKeyset, adLockOptimistic ' Needed to open in editable mode.
        rst.MoveFirst
        sPrice1 = rst!Open_Price
        rst!Row_Num = i
        rst.MoveNext
        Do While Not rst.EOF
            i = i + 1
            rst!Row_Num = i
            rst!Previous_Price = sPrice1
            sPrice2 = rst!Open_Price
            rst!Price_Move = Round(Abs((sPrice2 / sPrice1) - 1), 6)
            sPrice1 = sPrice2
            rst.MoveNext
        Loop
        i = i + 1
        rst.Close
        rstDates.MoveNext
    Loop
    rstTickers.MoveNext
Loop
If the data is always exactly one second apart, without any milliseconds, then you can join the table to itself on the ticker ID and on the time, offset by one second.
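In Access SQL, that self-join might look like the following (a sketch, assuming columns named Ticker, PriceTime, and Price):

SELECT cur.Ticker, cur.PriceTime, prev.Price AS PrevPrice, cur.Price
FROM Prices AS cur, Prices AS prev
WHERE cur.Ticker = prev.Ticker
AND cur.PriceTime = DateAdd("s", 1, prev.PriceTime);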
Otherwise, if there is no sequence counter of some sort to join on, you will need to create one. You can do this with a "ranking" query. There are multiple approaches to this; you can try each and see which one works fastest in your situation.
One approach is to use a subquery that returns the number of rows before the current row. Another approach is to join the table to itself on all the rows before the current one and do a group-by and count. Both approaches produce the same results, but depending on the nature of your data, how it is structured, and what indexes you have, one will be faster than the other.
Once you have a "rank" column, you do the procedure described in the first paragraph, but instead of joining on an offset of time, you join on an offset of rank.
I ended up moving my data to SQL Server (which had its own issues). I added a row number column (Row_Num) like this:
ALTER TABLE Prices ADD Row_Num INT NOT NULL IDENTITY (1,1)
It worked for me (I think) because my underlying data was already in the order I needed it to be in. I've read enough comments saying you shouldn't rely on this, because you don't know in what order the server stores the data.
Anyway, after that it was a join on itself. It took me a while to figure out the syntax (I am new to SQL). Adding the SQL here for reference (works on SQL Server but not Access).
UPDATE A SET Previous_Price = B.Open_Price
FROM Prices A INNER JOIN Prices B
ON A.Date_Only = B.Date_Only
WHERE ((A.Ticker = B.Ticker) AND (A.Row_Num = B.Row_Num + 1));
BTW, I had to first add the column Date_Only like this (works on Access but not SQL server)
UPDATE Prices SET Prices.Date_Only = Format([Time_Date],"mm/dd/yyyy");
I think the solution for row numbers described by @Rabbit should work better (broadly speaking); I just haven't had the time to try it out. It took me a whole day to get this far.
I am bringing over a record set that needs to be divided into 6 lists. I am using the field WrkList to hold the list number, which ranges from 1-6. I don't want to manually add the numbers to each of the new records in a repeating sequence (1, 2, 3, 4, 5, 6) as they are brought in. The WrkList field allows the records to be worked by 6 employees, using queries that take the field as their criterion. On any given day, over 1200 records may be appended to the table and would need to have the WrkList field populated. I want these divided as evenly as possible among the 6 groups as each new set of records is appended. Any help on getting started would be greatly appreciated.
Basically, you will open a DAO recordset that includes all the records for which WrkList is Null. You will sort this by the order they came in, or by some other logical criterion -- whatever helps your workers have a coherent work queue (perhaps no order at all).
You will go through the recordset from beginning to end and update the WrkList field with a variable, byteWrkList.
This variable changes with each edit: it increments by one, or, if it was 6 for the last edit, it returns to 1.
NOTE: This code does not itself filter for Null! OpenRecordset must be based on a query that does filter for Null (or on a SQL string that does the same thing).
Option Compare Database
Option Explicit
Private Sub AllocateTasks()
    Dim byteMax As Byte, byteWrkList As Byte
    byteMax = 6
    byteWrkList = 0

    Dim rstTask As Recordset
    Set rstTask = CurrentDb.OpenRecordset("tableOfTasks")

    Do Until rstTask.EOF
        ' Cycle 1, 2, ..., byteMax, then wrap back around to 1.
        If byteWrkList >= byteMax Then
            byteWrkList = 1
        Else
            byteWrkList = byteWrkList + 1
        End If
        rstTask.Edit
        ' Make sure you are not over-writing an existing value!
        ' Make sure it is Null, or that your recordset excluded Nulls.
        rstTask!WrkList = byteWrkList
        rstTask.Update
        rstTask.MoveNext
    Loop

    rstTask.Close
    Set rstTask = Nothing
End Sub
Then you just need a way to invoke (trigger) the above code ... but your post doesn't really have enough information to suggest what that is.
There are alternate (and elegant) methods to obtain byteWrkList, such as applying the Mod operator to an autonumber index. (This is not important; I just had to get it off my chest because Mod is fun.) Indeed, there are alternate methods to handle this entirely, but this is what I would start with.
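For instance, if the table has an autonumber key named ID, a single update query can do the whole allocation (a sketch; note it divides evenly only to the extent that the IDs have no large gaps):

UPDATE tableOfTasks SET WrkList = ((ID - 1) Mod 6) + 1 WHERE WrkList Is Null;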
I'm facing a recurring problem: I have to let a user reorder a list that is stored in a database.
The first straightforward approach I can think of is to have a "position" column with the ordering saved as an integer, e.g.
Data, Order
A 1
B 2
C 3
D 4
The problem here is that if I have to insert FOO in position 2, my table becomes
Data, Order
A 1
FOO 2
B 3
C 4
D 5
So to insert a new line, I have to do one INSERT and three UPDATEs on a table of five elements.
So my new idea is to use real numbers instead of integers; my new table becomes
Data, Order
A 1.0
B 2.0
C 3.0
D 4.0
If I want to insert an element FOO after A, this becomes
Data, Order
A 1.0
FOO 1.5
B 2.0
C 3.0
D 4.0
With only one SQL query executed.
This would work fine with theoretical real numbers, but floating-point numbers have limited precision. I'm wondering how feasible this is, and whether and how I can optimize it to avoid exceeding double precision within a reasonable number of modifications.
Edit:
This is how I've implemented it now in Python:
from decimal import Decimal

@classmethod
def get_middle_priority(cls, p, n):
    # Find a short decimal strictly between p and n.
    p = Decimal(str(p))
    n = Decimal(str(n))
    m = p + ((n - p) / 2)
    i = 0
    while True:
        m1 = round(m, i)
        if m1 > p and m1 < n:
            return m1
        else:
            i += 1

@classmethod
def create(cls, data, user):
    prev = data.get('prev')
    if prev is None or len(prev) < 1:
        # No predecessor given: place the new item before the current head.
        first = cls.list().first()
        if first is None:
            priority = 1.0
        else:
            priority = first.priority - 1.0
    else:
        # Place the new item between its predecessor and the following item.
        prev = cls.list().filter(Rotator.codice == prev).first()
        next = cls.list().filter(Rotator.priority > prev.priority).first()
        if next is None:
            priority = prev.priority + 1.0
        else:
            priority = cls.get_middle_priority(prev.priority, next.priority)
    r = cls(data.get('codice'), priority)
    DBSession.add(r)
    return r
If you want to control the position and there is no ORDER BY solution, then a rather simple and robust approach is to have each row point to the next or to the previous one. Updates/inserts/deletes (other than at the first and last positions) will require 3 operations:
Insert the new Item
Update the Item Prior to the New Item
Update the Item After the New Item
After you have that established, you can use a recursive CTE (with a UNION ALL) to create a sorted list that will never hit a limit.
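A sketch of that CTE in T-SQL, assuming each row stores the key of its predecessor in a PrevId column (Null for the head of the list):

WITH OrderedList AS (
    SELECT Id, Data, 1 AS Position
    FROM Items
    WHERE PrevId IS NULL
    UNION ALL
    SELECT i.Id, i.Data, o.Position + 1
    FROM Items AS i INNER JOIN OrderedList AS o ON i.PrevId = o.Id
)
SELECT Data FROM OrderedList ORDER BY Position;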
I have seen rather large implementations of this that were done via triggers to keep the list in perfect form. I, however, am not a fan of triggers and would just put the logic for the entire operation in a stored procedure.
You could use a string rather than numbers:
item order
A ffga
B ffgaa
C ffgb
Here, the problem of finite precision is handled by the ability to grow the string. String storage in the database is theoretically unlimited, bounded only by the size of the storage device. There is no better solution for absolute ordering of items. Relative ordering, as with linked lists, might work better (but then you can't do an ORDER BY query).
The linked-list idea is neat, but it's expensive to pull the data out in order. If you have a database which supports it, you can use something like CONNECT BY to pull it out. "Linked list in SQL" is a question dedicated to that problem.
Now, if you don't, I was thinking about how one can achieve an infinitely divisible range, and I thought of sections in a book. What about storing the list initially as
1
2
3
and then, to insert between 1 and 2, you insert a "subsection under 1" so that your list becomes
1
1.1
2
3
If you want to insert another one between 1.1 and 2, you place a second subsection under 1 and get
1
1.1
1.2
2
3
and lastly, if you want to add something between 1.1 and 1.2, you need to introduce a subsubsection and get
1
1.1
1.1.1
1.2
2
3
Maybe using letters instead of numbers would be less confusing.
I'm not sure if there is any standard lexicographic ordering in SQL databases which could sort this type of list correctly, but I think you could roll your own with some ORDER BY CASE and substringing. Edit: I found a question pertaining to this: linky
Another downside is that the worst-case field size of this solution grows linearly with the number of inserts (you could get long rows like 1.1.1.1.1.1 etc.), while in the best case it stays short or almost constant (rows like 1.934856.1).
This solution is also quite close to what you already had in mind, and I'm not sure that it's an improvement. A decimal number using the binary-partitioning strategy that you mentioned will probably gain one decimal digit per insert, right? So you would get
1, 2 -> 1, 1.5, 2 -> 1, 1.25, 1.5, 2 -> 1, 1.125, 1.25, 1.5, 2
So the best case of the subsectioning strategy seems better, but the worst case is a lot worse.
I'm also not aware of any infinite-precision decimal types for SQL databases. But you could of course save your number as a string, in which case this solution becomes even more similar to your original one.
Set all rows to a unique number starting at 1 and incrementing by 1 at the start. When you insert a new row, set its Order to the count(*) of the table + 1 (there are a variety of ways of doing this).
When the user updates the Order of a row, always update it by calling a stored procedure with the Id (PK) of the row to update and the new order. In the stored procedure:
update tableName set Order = Order + 1 where Order >= @updatedRowOrder;
update tableName set Order = @updatedRowOrder where Id = @pk;
That guarantees that there will always be space and a continuous sequence with no duplicates. I haven't worked out what would happen if you put in silly new Order numbers for a row (e.g. <= 0), but probably bad things; that's for the front-end app to prevent.
Cheers -