How to make BST with given integers - binary-search-tree

I have a given set of integers, such as (18, 22, 7, 23, 25, 37). I know what a binary search tree is, but in this case I can't work out which value should be the root node or where to start.

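To begin with, you can simply take the first element as the root. Then insert each remaining element in turn, starting at the root: go left if the element is smaller than the current node and right if it is greater, and repeat until you reach an empty spot. For your numbers, inserted in the given order, this gives:
    18
   /  \
  7    22
         \
          23
            \
             25
               \
                37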
This works well when the numbers are inserted in random order. If they arrive already sorted, the tree degenerates into a chain and is as bad as a linked list. In fact, in your example the last four numbers (22, 23, 25, 37) are already in sorted order, which is why they form a single chain down the right side.
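To make the insertion rule concrete, here is a minimal sketch in Python. The names `Node`, `insert`, `height` and `build` are made up for this illustration, and sending duplicates to the right is an arbitrary choice that the question does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        # Assumption for this sketch: equal keys go to the right.
        root.right = insert(root.right, key)
    return root


def height(root: Optional[Node]) -> int:
    """Number of nodes on the longest path from the root down to a leaf."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    """Insert the keys one by one, in the order given."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


tree = build([18, 22, 7, 23, 25, 37])
print(tree.key, tree.left.key, tree.right.key)  # 18 7 22
print(height(tree))  # 5: the chain 18 -> 22 -> 23 -> 25 -> 37
```

A height of 5 for only six keys shows how close this particular insertion order already is to the linked-list worst case.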

Related

Generate a progressive number when new record are inserted (some record need to have the same number)

The title may be a little confusing, so let me explain the problem. I have a pipeline that loads new records daily. The records contain sales, and the key is <date, location, ticket, line>. The data is loaded into a Redshift table and then exposed through a view that is read by another system. That system has a limitation: its ticket column is a varchar(10), but the ticket is a 30-character string. If the system takes only the first 10 characters, it will generate duplicates. The ticket number can be a "fake" number; it doesn't matter if it isn't equal to the real one. So I'm thinking of adding a new column to the Redshift table that contains a progressive number. The problem is that I cannot use an identity column, because records belonging to the same ticket must have the same "progressive number". I will then expose this new column (ticket_id) instead of the original one.
That is what I want:
day         location  ticket    line  amount  ticket_id
12/12/2020  67        123...GH  1     10      1
12/12/2020  67        123...GH  2     5       1
12/12/2020  67        123...GH  3     23      1
12/12/2020  23        123...GB  1     13      2
12/12/2020  23        123...GB  2     45      2
...         ...       ...       ...   ...     ...
12/12/2020  78        123...AG  5     100     153
The next day, when new data is loaded, I want to start with ticket_id 154, and so on.
Each row has a column that specifies the instant at which it was inserted. Rows inserted on the same day have the same insert_time.
My solution is:
Insert the records with ticket_id computed as a dense_rank. But each time I load new records (so each day) the ticket_id starts from one again, so...
...update the rows just inserted, setting ticket_id = ticket_id + the maximum ticket_id found among the rows where insert_time != max(insert_time).
Do you think there is a better solution? It would be very nice if a hash function existed that takes <day, location, ticket> as input and returns a value of at most 10 characters.
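For readers who want to see the "dense rank plus previous maximum" idea in isolation, here is a rough Python sketch. The function `assign_ticket_ids` and the dictionary layout of the rows are invented for this example, and numbering tickets in order of first appearance is an assumption (a real dense_rank needs an explicit ORDER BY).

```python
def assign_ticket_ids(rows, previous_max=0):
    """Give every row of one daily batch a ticket_id.

    Rows sharing the same (day, location, ticket) get the same id, and
    ids continue after previous_max, the highest id loaded so far.
    """
    ids = {}
    result = []
    for row in rows:
        key = (row["day"], row["location"], row["ticket"])
        if key not in ids:
            # Next unused number after everything assigned so far.
            ids[key] = previous_max + len(ids) + 1
        result.append({**row, "ticket_id": ids[key]})
    return result


day_one = [
    {"day": "12/12/2020", "location": 67, "ticket": "123...GH", "line": 1},
    {"day": "12/12/2020", "location": 67, "ticket": "123...GH", "line": 2},
    {"day": "12/12/2020", "location": 23, "ticket": "123...GB", "line": 1},
]
loaded = assign_ticket_ids(day_one)
print([r["ticket_id"] for r in loaded])  # [1, 1, 2]

# The next batch continues after the highest id seen so far.
day_two = [{"day": "13/12/2020", "location": 67, "ticket": "123...GH", "line": 1}]
highest = max(r["ticket_id"] for r in loaded)
print(assign_ticket_ids(day_two, previous_max=highest)[0]["ticket_id"])  # 3
```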
From the comments, it sounds like you cannot add a dimension table to look up the number or 10-character string that identifies each ticket, as this would be a data model change. A dimension table would likely be the best and most accurate way to do this.
You asked about a hash function to do this, and there are several. But first let's talk about hashes: they take strings of varying length and make a signature out of them. Since this process can significantly reduce the number of characters, there is a possibility that two different strings will generate the same hash. The longer the hash value, the lower the odds of such a collision, but the odds are never zero. Since you can only have 10 characters, this sets the odds of a hash collision.
The md5() function on Redshift takes a string and makes a 32-character string (base 16 characters) out of it. md5(day::text || location || ticket::text) will make such a hash out of the columns you mentioned. This process can produce 16^32 possible different strings, which is a big number.
But you only want a string of 10 characters. The good news is that hash functions like md5() spread the differences between strings across the whole output, so you can just pick any 10 characters to use. Doing this reduces the number of unique values to 16^10, or about 1.1 trillion. That is still a big number, but if you have billions of rows you could see a collision. One way to improve this would be to base64 encode the md5() output and then truncate to 10 characters. Doing this requires a UDF but raises the number of possible hashes to about 1.15E18, a million times larger. If you want the output to be an integer, you can convert hex strings to integers with strtol(), but a 10-digit number has only 10 billion possible values.
So if you are sure you want to use a hash, this is quite possible. Just remember what a hash does: two different tickets can still end up with the same value.
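As a rough illustration of the truncation idea, here is a Python sketch that uses `hashlib` rather than Redshift. The function names are made up, and the "|" separator between the fields is my own addition (the expression in the answer concatenates them directly).

```python
import base64
import hashlib


def ticket_hash_hex(day, location, ticket, length=10):
    """First `length` hex characters of the MD5 of the three key fields."""
    raw = f"{day}|{location}|{ticket}".encode("utf-8")
    return hashlib.md5(raw).hexdigest()[:length]


def ticket_hash_b64(day, location, ticket, length=10):
    """Same idea, but base64 encodes the raw digest before truncating."""
    raw = f"{day}|{location}|{ticket}".encode("utf-8")
    digest = hashlib.md5(raw).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")[:length]


print(ticket_hash_hex("2020-12-12", 67, "123...GH"))
print(ticket_hash_b64("2020-12-12", 67, "123...GH"))

# Number of distinct values each 10-character variant can take.
print(16 ** 10)  # 1099511627776, about 1.1 trillion
print(64 ** 10)  # 1152921504606846976, about 1.15E18
```

Note that standard base64 output can contain "+" and "/", which may or may not be acceptable to the consuming system.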

Equivalent of Python's Map operator in PostgreSQL

I have a table in PostgreSQL DB like this.
ID, Name, scores.
10, abc,{23,19,34}
11, def, {2333,233,24}
12, ghi, {321,34}
13,hio,{}
scores in the above data model is an array or list of numbers.
Now we have to find all students who have at least one score that equals 19 when divided by 10 (integer division). How can I achieve this?
I tried this, but it doesn't work:
SELECT * FROM students where 19 = ANY(scores)/10
Something like this works, but we need a better solution (probably using an inverted index), as there are millions of rows:
SELECT * FROM students where 190 <= ANY(scores) < 200
In your sample data, no record meets your condition, since no value equals 19 when divided by 10. I take it you are sure the condition applies to a single number from the array, and not to the sum of all of them or something similar?
First, do you have a GIN index on the scores column?
Second, will individual numbers in your scores array really exceed the int limit? If not, bigint will just slow your queries down.
If you do need bigint, keep the cast in the query below; otherwise remove ::bigint[] from it.
SELECT * FROM students where
scores && array[190,191,192,193,194,195,196,197,198,199]::bigint[]
It is ugly, but probably faster than any attempt to extract every number from the array, divide it by 10, and then compare it to 19.
If you only ever search for scores that equal 19 after dividing by 10, just use Abelisto's method, but it will hurt insert performance, since maintaining a functional index for millions of rows with long arrays will be slow.
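Since the question is framed in terms of Python, here is a small sketch of why the overlap test is equivalent to the division test. The two function names are invented, and the sketch assumes non-negative integer scores (Python's `//` floors while PostgreSQL integer division truncates, which only differs for negative values).

```python
students = {
    10: [23, 19, 34],
    11: [2333, 233, 24],
    12: [321, 34],
    13: [],
}


def matches_by_division(scores, target=19, divisor=10):
    """True if any score gives `target` under integer division."""
    return any(score // divisor == target for score in scores)


def matches_by_overlap(scores, target=19, divisor=10):
    """Same test expressed as 'the array overlaps 190..199'."""
    wanted = set(range(target * divisor, (target + 1) * divisor))
    return not wanted.isdisjoint(scores)


for student_id, scores in students.items():
    print(student_id, matches_by_division(scores), matches_by_overlap(scores))

print(matches_by_division([12, 195]), matches_by_overlap([12, 195]))  # True True
```

The overlap form is the one an index on the array column can help with, because it asks about specific stored values instead of a computed one.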

What is the best way to reassign ordinal number of a move operation

I have a column in a SQL Server table called "Ordinal" that is used to indicate the display order of the rows. It starts from 0 and increases by 10 for each subsequent row, so we have something like this:
Id Ordinal
1 0
2 20
3 10
It skips 10 so that we can move an item between two other items (based on ordinal) without having to reassign ordinal numbers for the entire table.
As you can imagine, ordinal numbers will eventually need to be reassigned for a move-in-between operation, either on the surrounding rows or for the entire table, once the unused ordinal numbers between the target items are all used up.
Is there an algorithm I can use to reorder the ordinal numbers effectively for the move operation, taking into consideration things like long-term maintainability of the table and minimizing update operations?
You can re-number the sequences using a somewhat complicated UPDATE statement:
UPDATE u
SET u.sequence = 10 * (c.num_below - 1)
FROM test u
JOIN (
    SELECT t.id, count(*) AS num_below
    FROM test t
    JOIN test tr ON tr.sequence <= t.sequence
    GROUP BY t.id
) c ON c.id = u.id
The idea is to count, for each row, the rows whose sequence is less than or equal to its own (this includes the row itself, hence the -1), multiply that count by ten, and assign the result as the new sequence.
The content of test before the UPDATE:
ID Sequence
__ ________
1 0
2 20
3 10
4 12
The content of test after the UPDATE:
ID Sequence
__ ________
1 0
2 30
3 10
4 20
Now the sequence numbers are evenly spread again, so you can continue inserting in the middle until you run out of new sequence numbers; then you can re-number again.
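The same renumbering can be sketched outside the database. The helper `renumber` below is hypothetical; it takes the Ordinal values from the question plus one extra row squeezed in at 12, and assumes the sequence values are unique.

```python
def renumber(sequences, step=10):
    """Map each row id to an evenly spaced sequence, keeping the order.

    `sequences` is a dict of row id -> current sequence value.
    """
    ordered = sorted(sequences, key=lambda row_id: sequences[row_id])
    return {row_id: step * position for position, row_id in enumerate(ordered)}


before = {1: 0, 2: 20, 3: 10, 4: 12}
after = renumber(before)
print(after)  # {1: 0, 3: 10, 4: 20, 2: 30}
```

Row 4 keeps its place between rows 3 and 2, and every neighbouring pair is again separated by a full step of 10.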
These won't answer your question directly--I just thought I might suggest some other approaches:
One possibility: don't try to do it by hand. Have your software manage the numbers. If they need rewriting, just save them with new numbers.
A second: use a "linked list" instead. In each record, store the index of the next record you want displayed, then have your code load the records directly into a linked list.
Yet another simple approach. Let's say you're inserting a new record with an ordinal equal to x.
First, check whether a row with ordinal value x already exists. If it does, update all the records whose ordinal is equal to or greater than x, increasing them by some step y. Then it is safe to insert the new record.
This way you don't run an update on every insert and, of course, you keep the order.

Most efficient ordering post database design

I have a posts table that has a post_order column, in which I store the order of each post. When I change the order of a row from 25 to 15, I have to update all the rows from 15 to the end.
That's fine for a few rows, but with thousands of rows it performs badly.
Is there a better, more efficient design for ordering posts?
Why not swap with the affected row instead of updating everything from 15 down? Let's say you have a table like this:
Post Post_Order
----------------------
x 1
y 2
z 3
. .
. .
t 10
If you want to make t the first post, you can change t's post_order to 1 and set the row that previously had order 1 (x) to t's old value (10).
You can use the old BASIC trick (from the time BASIC still had line numbers), of leaving gaps.
For example (shamelessly copied from Kuzgun's answer):
x 10
y 20
z 30
. .
. .
t 100
Then moving t to, say, second place would involve updating just one row:
x 10
t 15
y 20
z 30
. .
. .
Of course, you'll still need to move more than one row from time to time (when they "bunch up" too much), but this should be relatively rare (and you can make initial gaps larger if that becomes a problem).
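A tiny sketch of how the new position could be picked when gaps are used. The helper `position_between` is hypothetical; returning `None` is simply this example's way of signalling that the rows have bunched up and a renumbering pass is needed.

```python
def position_between(before, after):
    """Return an integer strictly between the two neighbours.

    Returns None when there is no free integer left in the gap.
    """
    if after - before < 2:
        return None
    return (before + after) // 2


print(position_between(10, 20))  # 15: t moves between x (10) and y (20)
print(position_between(10, 15))  # 12
print(position_between(10, 11))  # None: time to re-space the rows
```

With an initial gap of 10 you get a few halvings between the same two neighbours before the gap runs out, which is why larger initial gaps make renumbering rarer.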
Alternatively, just continue doing what you're doing.
Unless reordering thousands of items is really frequent, a modern DBMS on modern hardware shouldn't have much trouble with it. Just be careful to do it in a single statement, for example...
UPDATE POST
SET POST_ORDER = POST_ORDER + 1
WHERE POST_ORDER > 1 -- AND other criteria
...instead of issuing a separate UPDATE for each row.
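Here is one way the "single statement" advice could look for a move from one position to another, sketched with Python's built-in `sqlite3` module. The table layout (`post` with `id` and `post_order`) and the function `move_post` are assumptions made for this example, not taken from the question.

```python
import sqlite3


def move_post(conn, post_id, new_order):
    """Move one post and shift only the rows between old and new position."""
    (old_order,) = conn.execute(
        "SELECT post_order FROM post WHERE id = ?", (post_id,)
    ).fetchone()
    with conn:  # one transaction: commit on success, roll back on error
        if new_order < old_order:
            # Moving up: rows in [new, old) slide down by one place.
            conn.execute(
                "UPDATE post SET post_order = post_order + 1 "
                "WHERE post_order >= ? AND post_order < ?",
                (new_order, old_order),
            )
        elif new_order > old_order:
            # Moving down: rows in (old, new] slide up by one place.
            conn.execute(
                "UPDATE post SET post_order = post_order - 1 "
                "WHERE post_order > ? AND post_order <= ?",
                (old_order, new_order),
            )
        conn.execute(
            "UPDATE post SET post_order = ? WHERE id = ?", (new_order, post_id)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, post_order INTEGER NOT NULL)")
conn.executemany("INSERT INTO post VALUES (?, ?)", [(i, i) for i in range(1, 6)])
conn.commit()

move_post(conn, 5, 2)
print(conn.execute("SELECT id FROM post ORDER BY post_order").fetchall())
# [(1,), (5,), (2,), (3,), (4,)]
```

Only the rows between the old and the new position are touched, so moving a post from 25 to 15 updates eleven rows rather than everything from 15 to the end.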

How to get Next 4 digit number from Table (Contains 3,4, 5 & 6, digit long numbers)

I found a good method of getting the next 4-digit number:
How to find next free unique 4-digit number
But in my case I want to get the next available 4- or 5-digit number, and which one will depend on the user's request. These numbers are not key ID columns, but they are essential to the tagging structure used by the business.
Currently I use a TableAdapter query, but how would I write such a query?
I suppose I could loop through all the values until I find a 4-digit one, but I'm trying to think of something more efficient.
Function GetNextAvailableNumber(NumofDigits) as Long
'SQL Code Here ----
'Query Number Table
Return Long
End Function
Here's my current SQL:
'This Queries my View
SELECT MIN([Number]) AS Expr1
FROM LineNumbersNotUsed
'This is my View SQL
SELECT Numbers.Number
FROM Numbers
WHERE (((Exists (Select * From LineList Where LineList.LineNum = Numbers.Number))=False))
ORDER BY Numbers.Number;
Numbers is the list of all available numbers from 0 to 99999, basically what's available to use.
LineList is my final master table, where I keep the number and all the other relevant business information.
Hopefully this makes sense.
Gosh you guys are so tough on new guys.
I accidentally hit the enter key, and the question posted and I instantly get -3 votes.
Give a new guy a break will you! Please.
I apologize in advance in case I overlooked something in your question. Using your design, won't a query like this return the next unused 4 digit number?
SELECT MIN([Number]) AS next_number
FROM LineNumbersNotUsed
WHERE
[Number] > 999
AND [Number] < 10000;
This approach is not adequate with multiple concurrent users, but you didn't indicate that is an issue for you.
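To show how the digit count could become a parameter, here is a sketch using Python's `sqlite3` module with the Numbers and LineList tables from the question. The function `next_free_number` is made up for this example, and it inlines the view's NOT EXISTS condition instead of querying LineNumbersNotUsed.

```python
import sqlite3


def next_free_number(conn, num_digits):
    """Lowest unused number with exactly num_digits digits, or None."""
    low = 10 ** (num_digits - 1)  # 1000 for 4 digits
    high = 10 ** num_digits       # 10000 for 4 digits (exclusive)
    row = conn.execute(
        """
        SELECT MIN(n.Number)
        FROM Numbers AS n
        WHERE n.Number >= ? AND n.Number < ?
          AND NOT EXISTS (SELECT 1 FROM LineList AS l WHERE l.LineNum = n.Number)
        """,
        (low, high),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Numbers (Number INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE LineList (LineNum INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Numbers VALUES (?)", [(n,) for n in range(100000)])
# Pretend these three numbers are already taken.
conn.executemany("INSERT INTO LineList VALUES (?)", [(1000,), (1001,), (10000,)])

print(next_free_number(conn, 4))  # 1002
print(next_free_number(conn, 5))  # 10001
```

Like the query it is based on, this only reads the next number and does not reserve it, so it has the same limitation with concurrent users.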
The question you linked to explains that what you need is a table with 2 fields:
Number InUse
0000 No
0001 No
0002 Yes
0003 No
0005 Yes
Whenever a number is used/released, the table must be updated to set InUse to Yes/No.
Maybe I'm missing something, but from your explanation, and the SQL code you show us, it seems that you only have a table with a single field containing all numbers from 0 to 100000.
If that's the case, I don't see the usefulness of that table at all.
If I were you, and if I understand your need correctly, what you want is something like this:
First of all, create the table as above, with all running numbers from 0 to 100000, and a field for confirming if that number is used or not.
Initialise the InUse field with all the numbers already taken in your LineList table, something like:
UPDATE Numbers SET InUse = True
WHERE Numbers.Number IN (SELECT LineNum FROM LineList)
Write a function ReserveNumber(NumOfDigits as Integer) As Long to find and reserve a 4-digit or 5-digit free number following this logical sequence:
Depending on NumOfDigits (4 or 5) get the result of one of the queries as LowestNumber:
SELECT Min(Number) FROM Numbers WHERE Number < 10000 AND NOT InUse
SELECT Min(Number) FROM Numbers WHERE Number >= 10000 AND NOT InUse
Reserve that particular number to ensure it's not going to be used again:
UPDATE Numbers SET InUse = True WHERE Number = #LowestNumber
Return LowestNumber
Note: the logic above is a bit naive, as it assumes that no two users will attempt to get the lowest number at the same time. There is, however, a risk that this will happen one day.
To remove that risk you can, for instance, add a TakenBy column to the Numbers table and set it to the current username. Then, after you have reserved the number, read it again to ensure that TakenBy was really set by the current client. If not, just try again.
There are lots of ways to do this. You can also try working with table locks, but whatever your solution, make sure you test it.
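One of those ways, sketched with Python's `sqlite3` module: instead of a TakenBy column, this variant makes the UPDATE conditional on the number still being free and checks how many rows it changed. That is a different technique from the read-back check described above. The function `reserve_number` and the retry limit are assumptions for this example, and the ranges follow the two queries above.

```python
import sqlite3

# Ranges taken from the two SELECT Min(Number) queries in the answer.
RANGES = {4: (0, 10000), 5: (10000, 100000)}


def reserve_number(conn, num_digits, max_attempts=5):
    """Find the lowest free number and mark it as used, retrying on a race."""
    low, high = RANGES[num_digits]
    for _ in range(max_attempts):
        (candidate,) = conn.execute(
            "SELECT MIN(Number) FROM Numbers "
            "WHERE Number >= ? AND Number < ? AND InUse = 0",
            (low, high),
        ).fetchone()
        if candidate is None:
            return None  # nothing left in this range
        with conn:
            cursor = conn.execute(
                "UPDATE Numbers SET InUse = 1 WHERE Number = ? AND InUse = 0",
                (candidate,),
            )
        if cursor.rowcount == 1:
            return candidate  # this client flipped the flag
        # Otherwise another client took it first; look again.
    raise RuntimeError("could not reserve a number")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Numbers (Number INTEGER PRIMARY KEY, InUse INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO Numbers (Number) VALUES (?)", [(n,) for n in range(100000)])
conn.execute("UPDATE Numbers SET InUse = 1 WHERE Number IN (0, 1, 10000)")
conn.commit()

print(reserve_number(conn, 4))  # 2
print(reserve_number(conn, 4))  # 3
print(reserve_number(conn, 5))  # 10001
```

Whether a conditional UPDATE is enough depends on the database engine and its isolation settings, so the advice to test under real concurrent load still applies.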