Is there a better method of determining if an integer ends in specific two digits?

In a table having an integer primary key of indexRow in which the last two digits are currently 55, I'd like to change that to 50 but only if the column added is an integer value 55 and the indexRow ends in 55. I'm using SQLite.
I tested it as follows. Would you please tell me if this is the correct approach (if there is a better method) because I'd like to use it to run an update on the table?
Of course, I'll do it within a transaction and test before committing; but wanted to ask. I expected to have to use some math to determine which indexRows ended in 55, but converting to string seems quite easy.
select indexRow, indexRow-5, substring(format('%s', indexRow),-2)
from newSL
where added=55
and substring(format('%s', indexRow),-2)='55'
limit 10;
indexRow indexRow-5 substring(format('%s', indexRow),-2)
----------- ----------- ------------------------------------
10080171455 10080171450 55
10130031255 10130031250 55
10140021655 10140021650 55
10140080955 10140080950 55
10240330155 10240330150 55
10250230555 10250230550 55
10270031155 10270031150 55
10270290355 10270290350 55
10300110355 10300110350 55
10300110455 10300110450 55

Yes, use the modulo operator, %. In the expression x % y, the result is the remainder of dividing x by y. Therefore, 4173 % 100 = 73, and the condition indexRow % 100 = 55 matches every indexRow that ends in 55.
Note that % is a math operator, just like * for multiplication and / for division, and is not related to using the % in the format function.
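As a rough sketch of how the update could look with the modulo test, here is a small Python script using the standard sqlite3 module against an in-memory database. The table and column names come from the question; the three sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with a cut-down version of the newSL table (sample rows are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE newSL (indexRow INTEGER PRIMARY KEY, added INTEGER)")
conn.executemany(
    "INSERT INTO newSL (indexRow, added) VALUES (?, ?)",
    [
        (10080171455, 55),  # ends in 55 and added = 55: should become ...50
        (10130031255, 10),  # ends in 55 but added != 55: left alone
        (10140021650, 55),  # added = 55 but ends in 50: left alone
    ],
)

# "with conn" wraps the update in a transaction: commit on success, rollback on error.
with conn:
    conn.execute(
        "UPDATE newSL SET indexRow = indexRow - 5 "
        "WHERE added = 55 AND indexRow % 100 = 55"
    )

rows = [r[0] for r in conn.execute("SELECT indexRow FROM newSL ORDER BY indexRow")]
print(rows)
```

Because indexRow is the primary key, the update would fail if a row ending in 50 already existed for the same prefix, so testing inside a transaction as the question proposes still makes sense.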

Related

I want to know the specific reason why we have taken those 256, ... numbers in the conversion below

The code below converts a date into an integer and back. I want to know why these specific hexadecimal masks and this series of numbers are used to get the date back from the int. If there is an article about this technique, it would also help me understand the code.
I tried online hex-to-decimal conversion for these codes and found they are 256^1, 256^2, ..., but I still can't find the exact reason.
declare @dDate date = '2017-10-12'
declare @iDate int = 0
select @iDate = ( (datepart(year,@dDate)*65536 | datepart(month,@dDate)*256 | datepart(dd,@dDate)))
select (@iDate&0xfff0000)/65536 --year
select (@iDate&0xff00)/256 --Month
select (@iDate&0xff) --Date
& is the bitwise AND operator and | is the bitwise OR operator. Together they let you store multiple number values in a single number column.
This part:
@iDate&0xfff0000
will "mask", or eliminate/replace-with-zeros, the portion of iDate that isn't from 256^2. Then you divide by 65536 -- which is simply reversing the original math of multiplying the year by 65536.
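To make the masking concrete, here is a minimal Python sketch of the same packing and unpacking. It uses shifts instead of multiplying and dividing by 65536 and 256, which is equivalent for these non-negative values; the function names are only illustrative.

```python
def pack_date(year, month, day):
    # year * 65536 == year << 16, month * 256 == month << 8
    return (year << 16) | (month << 8) | day


def unpack_date(packed):
    # Mask out the field of interest, then shift it back down.
    year = (packed & 0xFFF0000) >> 16
    month = (packed & 0xFF00) >> 8
    day = packed & 0xFF
    return year, month, day


packed = pack_date(2017, 10, 12)
print(hex(packed))          # 0x7e10a0c: 0x7e1 = 2017, 0x0a = 10, 0x0c = 12
print(unpack_date(packed))  # (2017, 10, 12)
```

Printing the value in hex shows why the masks are written in hex: each field occupies whole hex digits, so 0xff00 lines up exactly with the month.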
If the concept of bitwise AND is foreign, I'll give an example that DOESN'T WORK in decimal. Bitwise AND converts the whole thing to binary and then masks things (like IP subnetting, if you're familiar with that).
Anyway, consider a decimal number 20171012. If such a thing as a decimal-wise AND existed, it could look like 20171012&11110000. The "1" places are "keepers" and the "0" places are "throw-aways". If you stack them vertically, the result is to keep the values with a "1" beneath them and replace the values with a "0" beneath them with a "0".
number 20171012
dec-wise AND 11110000
result 20170000
now the result isn't 2017, so you'd have to divide by 10000 to get 2017.
For 20171012&1100 you have to use implied leading zeros:
number 20171012
dec-wise AND 00001100
result 1000
I probably would have converted to int by adding the year*10000 and month * 100 and day. Reverting back I would use a combination of integer division and MOD. But I think the bitwise AND is perhaps a bit more elegant (particularly for getting the month).
Based on your comment, I will include how I have converted dates to int and reverted back:
declare @dDate date = '2017-10-12'
declare @iDate int
set @iDate = year(@dDate) * 10000 + month(@dDate) * 100 + day(@dDate)
select @iDate
select 'year', @iDate/10000 -- basic integer division provides the year
select 'month', (@iDate % 10000)/100 -- combine modulo and integer division to get the month
select 'day', @iDate % 100 -- basic modulo arithmetic provides the day
returns:
20171012
year 2017
month 10
day 12
This is bit manipulation.
Bit Shifting
Decimal 3 = Binary 11
If we left-shift (<<) 3 by 4 bits, it becomes 48, which is binary 110000 <- 4 zero bits added by the left shift
Since we don't have bit-shifting operators in T-SQL, we can do the math instead.
Left-shifting a number x by n bits = x * 2^n
Therefore, multiplying a number by 256 is actually a left shift of 8 bits (2^8 = 256).
Later on, when you do a bitwise OR between the 2 numbers, it actually "concatenates" the bits.
For example, you need to concatenate 2 binary numbers, (3) 11 and (2) 10, the resultant number should be 1110 = 14
So first we'll do 2 left shift in 3 = 3 * 2^2 = 12 and then we will do bitwise OR this number with the next number
12 = 1100
2 = 0010
OR
---------------
14 = 1110
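The same shift-then-OR steps can be checked quickly in Python, which does have shift operators. The helper name concat_bits is just for illustration.

```python
def concat_bits(high, low, low_width):
    # Make room for low_width bits on the right, then fill them in with OR.
    return (high << low_width) | low


# A left shift by n bits is the same as multiplying by 2**n.
assert 3 << 4 == 3 * 2**4 == 48

result = concat_bits(3, 2, 2)  # binary 11 followed by binary 10
print(bin(result), result)     # 0b1110 14
```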
Your example is actually saving the whole date in a single integer variable, which is an efficient way of storing a date.

How to handle decimal numbers in solidity?

If you want to find the percentage of some amount and do some calculation on that number, how to do that?
Suppose I perform : 15 % of 45 and need to divide that value with 7 how to get the answer.
Please help. I have done research, but getting answer like it is not possible to do that calculation. Please help.
You have a few options. To just multiply by a percentage (but truncate to an integer result), 45 * 15 / 100 = 6 works well (that is, 15% of 45 is 6.75, truncated to 6).
If you want to keep some more digits around, you can just scale everything up by, e.g., some power of 10: 4500 * 15 / 100 = 675 (i.e. 6.75 * 100).
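Here is a small sketch of the scaling idea applied to the full calculation from the question (15% of 45, divided by 7). It is written in Python with integer-only arithmetic to mirror what an integer-only language does; the scale factor of 10^4 and the function name are arbitrary choices for the example.

```python
SCALE = 10**4  # hypothetical choice: keep four decimal places


def percent_of_then_divide(amount, percent, divisor, scale=SCALE):
    # Do all multiplications first and a single integer division last,
    # so the result is truncated only once.
    return (amount * percent * scale) // (100 * divisor)


scaled = percent_of_then_divide(45, 15, 7)
whole, frac = divmod(scaled, SCALE)
print(scaled)                 # 9642
print(f"{whole}.{frac:04d}")  # 0.9642
```

The caller has to remember that the stored value is the real value multiplied by SCALE, and divide it out only when displaying the number.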

Composite indexing using Redis in a hierarchical data model

I have a data model like this:
Fields:
counter number (e.g. 00888, 00777, 00123 etc)
counter code (e.g. XA, XD, ZA, SI etc)
start date (e.g. 2017-12-31 ...)
end date (e.g. 2017-12-31 ...)
Other counter date (e.g. xxxxx)
Current Datastructure organization is like this (root and multiple child format):
counter_num + counter_code
---> start_date + end_date --> xxxxxxxx
---> start_date + end_date --> xxxxxxxx
---> start_date + end_date --> xxxxxxxx
Example:
00888 + XA
---> Jan 10 + Jan 20 --> xxxxxxxx
---> Jan 21 + Jan 31 --> xxxxxxxx
---> Feb 01 + Dec 31 --> xxxxxxxx
00888 + ZI
---> Jan 09 + Feb 24 --> xxxxxxxx
---> Feb 25 + Dec 31 --> xxxxxxxx
00777 + XA
---> Jan 09 + Feb 24 --> xxxxxxxx
---> Feb 25 + Dec 31 --> xxxxxxxx
Today the retrieval happens in 2 ways:
//Fetch unique counter data using all the composite keys
counter_number + counter_code + date (start_date <= date <= end_date)
//Fetch all the counter codes and corresponding data matching the below conditions
counter_number + date (start_date <= date <= end_date)
What's the best way to model this in redis as I need to cache some of the frequently hit data. I feel sorted sets should do this somehow, but unable to model it.
UPDATE:
Just to remove the confusion, the ask here is not for an SQL "BETWEEN" like query. 'Coz I don't know what the start_date and end_date values are. Think they are just column names.
What I don't want is
SELECT * FROM redis_db
WHERE counter_num AND
date_value BETWEEN start_date AND end_date
What I want is
SELECT * FROM redis_db
WHERE counter_num AND
start_date <= specific_date AND end_date >= specific_date
NOTE: The requirement is pretty much close to 2D indexing of what is proposed in Redis multi-dimensional indexing document
https://redis.io/topics/indexes#multi-dimensional-indexes
I understood the concept but unable to digest the implementation detail that is given.
I'm unlikely to get this done in time for the bounty, but what the hell...
This sounds like a job for geohashing. Geohashing is what you do when you want to index a 2-dimensional (or higher) dataset. For example, if you have a database of cities and you want to be able to quickly respond to queries like "find all the cities within 50km of X", you use geohashing.
For the purposes of this question, you can think of start_date and end_date as x and y coordinates. Normally in geohashing you're searching for points in your dataset near a particular point in space, or in a certain bounded region of space. In this case you just have a lower bound on one of the coordinates and an upper bound on the other one. But I suppose in practice the whole dataset is bounded anyway, so that's not a problem.
It would be nice if there was a library for doing this in Redis. There probably is, if you look hard enough. The newer versions of Redis have built-in geohashing functionality. See the commands starting with GEO. But it doesn't claim to be very accurate, and it's designed for the surface of a sphere rather than a flat surface.
So as far as I can see you have 3 options:
Map your search space to a small part of the sphere, preferably near the equator. Use the Redis GEO commands. To search, use GEORADIUS on a circle covering the triangle you're trying to search, taking into account the inbuilt inaccuracy and the distortion you get by mapping onto the sphere, then filter the results to get the ones that are actually inside the triangle.
Find some 3rd-party geohashing client for Redis which works on flat space and is more accurate than GEO.
Read the rest of this answer, or some other primer on geohashing, then implement it yourself on top of Redis. This is the hardest (but most educational) option.
If you have a database that indexes data using a numerical ordering, such that you can do queries like "find all the rows/records for which z is between a and b", you can build a geohash index on top of it. Suppose the coordinates are (non-negative) integers x and y. Then you add an integer-valued column z, and index by z. To calculate z, write x and y in binary, then take alternate digits from each. Example:
x = 969 = 0 1 1 1 1 0 0 1 0 0 1
y = 1130 = 1 0 0 0 1 1 0 1 0 1 0
z = 1750214 = 0110101011010011000110
Note that the index allows you to find, for example, all records positioned with z between 0101100000000000000000 and 0101101111111111111111 inclusive. In other words, all records for which z starts with 010110. Or to put it another way, you can find all records for which x starts with 001 and y starts with 110. This set of records corresponds to a square in the 2-dimensional space we are trying to search.
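The interleaving and the prefix search can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not tied to any Redis client; the function names and the fixed bit width are assumptions of the example.

```python
def interleave(x, y, bits):
    # Take alternate binary digits from x and y, most significant first, x before y.
    z = 0
    for i in range(bits - 1, -1, -1):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z


def prefix_range(prefix, prefix_len, total_bits):
    # All z values that start with the given prefix form one contiguous range.
    shift = total_bits - prefix_len
    low = prefix << shift
    high = low | ((1 << shift) - 1)
    return low, high


z = interleave(969, 1130, bits=11)
print(z, format(z, "022b"))  # 1750214 0110101011010011000110

low, high = prefix_range(0b010110, 6, 22)
print(format(low, "022b"), format(high, "022b"))
```

A range query on z between low and high then returns exactly the records in one searchable square.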
Not all squares can be searched in this way. We'll call these ones searchable squares. Suppose the client sends a request for all records for which (x,y) is inside a particular rectangle. (Or a circle, or some other reasonable geometric shape.) Then you need to find a set of searchable squares which cover the rectangle. Then, for each of these squares you've chosen, query the database for records inside that square and send the results to the client. (But you'll have to filter the results, because not all the records in the square are actually in the original rectangle.)
There's a balance to be struck. If you choose a small number of large searchable squares, you'll probably end up covering a much larger area of the map than you need; the query to the database will return lots of extra results that you'll have to filter out. Alternatively, if you use lots of little searchable squares, you'll be doing lots of queries to the database, many of which will return no results.
I said above that x and y could be start_date and end_date. But actually the distribution of your dataset won't be as symmetrical as in most uses of geohashing. So the performance might be better (or worse) if you use x = end_date + start_date and y = end_date - start_date.
Because your question remains a bit vague on how you desire to query your data, it remains unclear on how to solve your question. With that in mind, however, here are my thoughts on how I might model your data:
Updated answer, detailing how to use SORTED SET
I have edited this answer to be able to store your values in a way that you can query by dynamic date ranges. This edit assumes that your database values are timestamps, as in the value is for a single time, not 2, as in your current setup.
Yes, you are correct that using Sorted Sets will be able to accomplish this. I suggest that you always use a Unix timestamp value for the score component in these sorted sets.
In case you are not already familiar with Redis, let me explain its indexing limitations. Redis is a simple key-value store designed to quickly retrieve values by key. Because of this design, it does not contain many features of a traditional DBMS, such as indexing a column.
In Redis, you accomplish indexing by using a key, and the most nested key-like structures are available in HASH and SORTED SET, but you only get 2 key-like levels. In a HASH, you have the key (same as any data type), and an inner hash key, which can take the form of any string.
In a SORTED SET, you have the key (same as any data type), and a numeric value.
A HASH is nice to use to keep a grouped data organized.
A SORTED SET is nice if you want to query by a range of values. This could be a good fit for your data.
Your SORTED SET would look like the following:
key
00888:XA =>
score (date value) value
1452427200 (2016-01-10) xxxxxxxx
1452859200 (2016-01-15) yyyyxxxx
1453291200 (2016-01-20) zzzzxxxx
Let's use a more intuitive example, the 2017 Juventus roster:
To produce the SORTED SET in the table below, issue this command in your redis client:
ZADD JUVENTUS 32 "Emil Audero" 1 "Gianluigi Buffon" 42 "Mattia Del Favero" 36 "Leonardo Loria" 25 "Neto" 15 "Andrea Barzagli" 4 "Medhi Benatia" 19 "Leonardo Bonucci" 3 "Giorgio Chiellini" 40 "Luca Coccolo" 29 "Paolo De Ceglie" 26 "Stephan Lichtsteiner" 12 "Alex Sandro" 24 "Daniele Rugani" 43 "Alessandro Semprini" 23 "Dani Alves" 22 "Kwadwo Asamoah" 7 "Juan Cuadrado" 6 "Sami Khedira" 18 "Mario Lemina" 46 "Mehdi Leris" 38 "Rolando Mandragora" 8 "Claudio Marchisio" 14 "Federico Mattiello" 45 "Simone Muratore" 20 "Marko Pjaca" 5 "Miralem Pjanic" 28 "Tomás Rincón" 27 "Stefano Sturaro" 21 "Paulo Dybala" 9 "Gonzalo Higuaín" 34 "Moise Kean" 17 "Mario Mandzukic"
Jersey Name Jersey Name
32 Emil Audero 23 Dani Alves
1 Gianluigi Buffon 42 Mattia Del Favero
36 Leonardo Loria 25 Neto
15 Andrea Barzagli 4 Medhi Benatia
19 Leonardo Bonucci 3 Giorgio Chiellini
40 Luca Coccolo 29 Paolo De Ceglie
26 Stephan Lichtsteiner 12 Alex Sandro
24 Daniele Rugani 43 Alessandro Semprini
22 Kwadwo Asamoah 7 Juan Cuadrado
6 Sami Khedira 18 Mario Lemina
46 Mehdi Leris 38 Rolando Mandragora
8 Claudio Marchisio 14 Federico Mattiello
45 Simone Muratore 20 Marko Pjaca
5 Miralem Pjanic 28 Tomás Rincón
27 Stefano Sturaro 21 Paulo Dybala
9 Gonzalo Higuaín 34 Moise Kean
17 Mario Mandzukic
To query the roster by a range of jersey numbers:
ZRANGEBYSCORE JUVENTUS 1 5
Output:
1) "Gianluigi Buffon"
2) "Giorgio Chiellini"
3) "Medhi Benatia"
4) "Miralem Pjanic"
Note that the scores are not returned; however, the ZRANGEBYSCORE command orders the results in ascending order by score.
To add the scores, append "WITHSCORES" to the command, like so: ZRANGEBYSCORE JUVENTUS 1 5 WITHSCORES
By using ZRANGEBYSCORE, you should be able to query any key (counter number + counter code) with a date range, producing the values in that range.
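To show what such a range query returns without needing a running server, here is a tiny in-memory stand-in for a sorted set, written in Python. It is not a Redis client: the class and its methods are hypothetical and only imitate the ordering and the inclusive score range of ZADD and ZRANGEBYSCORE.

```python
import bisect


class SortedSetStandIn:
    """Hypothetical in-memory imitation of one Redis sorted set (not a real client)."""

    def __init__(self):
        self._items = []  # list of (score, member), kept sorted

    def zadd(self, score, member):
        # A member appears once; adding it again replaces its score.
        self._items = [(s, m) for (s, m) in self._items if m != member]
        bisect.insort(self._items, (score, member))

    def zrangebyscore(self, lo, hi):
        # Inclusive on both ends, ascending by score.
        scores = [s for s, _ in self._items]
        start = bisect.bisect_left(scores, lo)
        stop = bisect.bisect_right(scores, hi)
        return [m for _, m in self._items[start:stop]]


# Stands in for the key 00888:XA, using the Unix timestamp scores listed earlier.
counter = SortedSetStandIn()
counter.zadd(1452427200, "xxxxxxxx")
counter.zadd(1452859200, "yyyyxxxx")
counter.zadd(1453291200, "zzzzxxxx")

hits = counter.zrangebyscore(1452427200, 1452859200)
print(hits)  # ['xxxxxxxx', 'yyyyxxxx']
```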
Original: Below is my original answer, recommending HASH
Based on your examples, I recommend you use a HASH.
With a hash, you would have a main key to find the hash (Ex. 00888:XA). Then within the hash, you have key -> value pairs (Ex. 2017-01-10:2017-01-20 -> xxxxxxxx). I prefer to delimit or tokenize my keys' components with the colon char :, but you can use any delimiter.
HASH follows your example data structure very well:
key
00888:XA =>
hashkey value
2017-01-10:2017-01-20 xxxxxxxx
2017-01-21:2017-01-31 yyyyxxxx
2016-02-01:2016-12-31 zzzzxxxx
key
00888:ZI =>
hashkey value
2017-01-10:2017-01-20 xxxxxxxx
2017-01-21:2017-01-31 xxxxyyyy
2016-02-01:2016-12-31 xxxxzzzz
When querying for data, instead of GET key, you would query with HGET key hashkey. Same for setting values, instead of SET key value, use HSET key hashkey value.
Example commands
HSET 00777:XA 2017-01-10:2017-01-20 xxxxxxxx
HSET 00777:XA 2017-01-21:2017-01-31 yyyyyyyy
HSET 00777:XA 2016-02-01:2016-12-31 zzzzzzzz
(Note: there is also a HMSET to simplify this into a single command)
Then:
HGET 00777:XA 2017-01-21:2017-01-31
Would return yyyyyyyy
Unless there is some specific performance consideration, or other goal for your data, I think Hashes will work great for your system.
It's also very convenient if you want to get all hashkeys or all values for a given hash, using commands like HKEYS, HVALS, or HGETALL.

Efficient way to store FilePath

Currently I have a table with the following format/Desc:
ColumnName ColID PK IndexPos Null DataType
ID 1 1 N VARCHAR2 (1 Byte)
FILEPATH 2 N VARCHAR2 (127 Byte)
As you can see, the ID column is only 1 byte long, so we can store only 36 different file paths. I have more than 35 different file paths that have to be stored and retrieved. I know increasing the length of ID solves the issue, but I would also like to know whether there is a more efficient way to handle this.
Thanks!
The assertion that you can store only 35 different values in the table is incorrect, because varchar2 characters are not limited to letters and digits (even if they were you'd have 26 letters + 10 digits + 1 empty string = 37, not 35 possibilities).
If you need to store few more paths, say, 40 or 50, you could make your keys mixed case, so 'a' and 'A' would reference different paths. This would instantly give you 26 extra possibilities.
Expanding past the limit of 63 is a little harder, because you need to bring special characters into the mix. However, the theoretical maximum for a single character is 256 plus one combination for an empty string.

How to sew multiple values into one

I need to store 5 values in a single SQL Server column, each in the range 1-90. The values cannot be repeated. I thought of using the 2, 4, 8, 16, 32, 64, ... system, but as you can guess the numbers get really big, and with decimal I risk wrong calculations. Is there a convenient way to:
store the 5 values into a single column so that to avoid having 90 bit column in the table, see my previous post here.
quickly query the database for example to return all records with number X and Y
another option was a string (90) containing flags like 000001000011000 but this way I have to use substrings to query and I fear it will slow down on a table with 25.000 records or more.
First request: You say most are bits. But if not all of them are, then you can't use bitwise operators, and you can't save them in a single field.
In that case you need an additional table.
Row_id | fieldName | fieldValue
1 | name1 | value1
1 | name2 | value2
.
.
.
1 | name90 | value90
Second request: Saving the 5 values is very easy and fast with the additional table. Just create an index on row_id in both tables.
Third request: Here you say again that you can save them as bits, but you use strings instead, which is a bad idea.
You are right, a number isn't big enough to hold 90 bits; that is because a number can only hold 32 or 64 bits depending on its type.
In that case you need to use two fields (64 bits each) or three fields (32 bits each) to store all 90 possible flags.
Again, easy to do and really fast.
EDIT
To use multiple fields, you have to split the flags into groups.
For example, imagine 16 bits split into two 8-bit fields (each holding 0..255), with the leftmost bit of each field worth 128 and the rightmost worth 1:
01234567 89ABCDEF
01010101 11111111
Create fieldUp and fieldDown
SAVE
FieldUp = 01234567
FieldUp = 1 + 4 + 16 + 64
FieldDown = 89ABCDEF
FieldDown = 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128
Then selecting the rows with FLAGS [b1, b5, bA] all set would be
SELECT *
FROM TABLE
WHERE FieldUp & (64 + 4) = (64 + 4)
AND FieldDown & 32 = 32
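Applied to the original problem (5 distinct values, each 1-90), the two-field idea could be sketched like this in Python. The split at 45 bits per field is an assumption made for this example so that each field stays well inside a signed 64-bit integer; the function names are illustrative, and the bit numbering here (value v sets bit v - 1, counted from the least significant bit) is independent of the 16-bit example above.

```python
FIELD_BITS = 45  # assumed split: values 1..45 go in the first field, 46..90 in the second


def encode(values):
    # Turn a list of distinct values in 1..90 into two bitmask integers.
    if len(set(values)) != len(values):
        raise ValueError("values must not repeat")
    low = high = 0
    for v in values:
        if not 1 <= v <= 90:
            raise ValueError("value out of range 1..90")
        bit = v - 1
        if bit < FIELD_BITS:
            low |= 1 << bit
        else:
            high |= 1 << (bit - FIELD_BITS)
    return low, high


def contains_all(row, wanted):
    # Same test as "field & mask = mask" in SQL, applied to both fields.
    w_low, w_high = encode(wanted)
    low, high = row
    return (low & w_low) == w_low and (high & w_high) == w_high


row = encode([3, 17, 45, 46, 90])
print(contains_all(row, [3, 90]))  # True
print(contains_all(row, [3, 4]))   # False
```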
I resolved this by saving the numbers comma-separated; in my code I split this field into an array and process the data. The numbers are not meant for math operations, they are just treated as strings.