Efficient way to store FilePath - sql

Currently I have a table with the following format/Desc:
ColumnName ColID PK IndexPos Null DataType
ID 1 1 N VARCHAR2 (1 Byte)
FILEPATH 2 N VARCHAR2 (127 Byte)
As you can see, the ID column is only 1 byte long, so we can store only 36 different file paths. I have more than 35 different file paths that have to be stored and retrieved. I know increasing the length of ID solves the issue, but I would also like to know whether there is a more efficient way to handle this.
Thanks!

The assertion that you can store only 35 different values in the table is incorrect, because varchar2 characters are not limited to letters and digits (even if they were you'd have 26 letters + 10 digits + 1 empty string = 37, not 35 possibilities).
If you need to store a few more paths, say, 40 or 50, you could make your keys mixed case, so 'a' and 'A' would reference different paths. This would instantly give you 26 extra possibilities.
Expanding past the limit of 63 is a little harder, because you need to bring special characters into the mix. However, the theoretical maximum for a single-byte character is 256 values, plus one combination for an empty string.

Related

Is there a better method of determining if an integer ends in specific two digits?

In a table having an integer primary key of indexRow in which the last two digits are currently 55, I'd like to change that to 50 but only if the column added is an integer value 55 and the indexRow ends in 55. I'm using SQLite.
I tested it as follows. Would you please tell me whether this is the correct approach, or whether there is a better method? I'd like to use it to run an update on the table.
Of course, I'll do it within a transaction and test before committing; but wanted to ask. I expected to have to use some math to determine which indexRows ended in 55, but converting to string seems quite easy.
select indexRow, indexRow-5, substring(format('%s', indexRow),-2)
from newSL
where added=55
and substring(format('%s', indexRow),-2)='55'
limit 10;
indexRow indexRow-5 substring(format('%s', indexRow),-2)
----------- ----------- ------------------------------------
10080171455 10080171450 55
10130031255 10130031250 55
10140021655 10140021650 55
10140080955 10140080950 55
10240330155 10240330150 55
10250230555 10250230550 55
10270031155 10270031150 55
10270290355 10270290350 55
10300110355 10300110350 55
10300110455 10300110450 55
Yes, use the modulo operator, %. In the expression x % y, the result is the remainder of dividing x by y. Therefore, 4173 % 100 = 73.
Note that % is a math operator, just like * for multiplication and / for division, and is not related to using the % in the format function.
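To make the suggestion concrete, here is a small sketch that runs the modulo test against an in-memory SQLite database through Python's sqlite3 module. The table and column names come from the question; the three sample rows are made up for illustration, and the check assumes indexRow is never negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE newSL (indexRow INTEGER PRIMARY KEY, added INTEGER)")
conn.executemany(
    "INSERT INTO newSL (indexRow, added) VALUES (?, ?)",
    [
        (10080171455, 55),  # ends in 55 and added = 55: should change
        (10130031255, 10),  # ends in 55 but added <> 55: untouched
        (10140021650, 55),  # added = 55 but ends in 50: untouched
    ],
)

# indexRow % 100 is the last two decimal digits of a non-negative integer.
conn.execute(
    "UPDATE newSL SET indexRow = indexRow - 5 "
    "WHERE added = 55 AND indexRow % 100 = 55"
)

rows = [r[0] for r in conn.execute("SELECT indexRow FROM newSL ORDER BY indexRow")]
print(rows)
```

Only the first row matches both conditions, so only its key drops from ...55 to ...50.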

Unable to identify strange whitespace character in MSSQL table

We have a process that reads an XML file into our database and inserts any rows that aren't currently in another table to that table.
This process also has a trigger to write to an audit table and a nightly snapshot is also held in another table.
In the XML holding table a field looks like 1234567890123456 but it exists on our live table as 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6. Those spaces will not be removed by any combination of REPLACE functions. We have tried all CHAR values and it does not recognise the character. The audit table and nightly snapshot, however, contain the correct values.
Similarly, if we run a comparison between SELECT CASE WHEN '1234567890123456' = '1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 ' THEN 1 ELSE 0 END, this returns 1, so they match. However LEN('1234567890123456') is 16 and LEN('1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 ') is 32.
We have run some queries to loop through the characters in the field and output the ASCII and Unicode values for the characters. The digits return the correct ASCII/Unicode values, but this random whitespace character does not return a value.
An example of the incorrectly displayed one is 0x35000000320000003800000036000000380000003300000039000000370000003800000037000000330000003000000035000000340000003000000033000000 and a correct one is 0x3500320038003600380033003200300030003000360033003600380036003000. Both were added by the same means on the same day. One has the extra bytes, the other is fine.
How can we identify this character and get rid of it? Is there a reason this would have been inserted originally? How can we avoid this in future?
Data entry
It looks like some null (i.e. Char(0)) characters have got into the data.
If the data was supposed to be ASCII when it was entered but was handled as UTF-16 along the way, then it could be:
Entered character codes: 48 00
Sent to database: 48 00 00 00
To avoid that, remove disallowed characters as the first step in processing the input, say by using a regex to replace [\x00-\x1F] with an empty string.
Data repair
Search for entries which have a Char(0) in them to confirm that they can be found that way.
If so, replace the Char(0) with an empty string.
If that doesn't work, you could convert the data to the format '0x35000000320000003800000036000000380000003300000039000000370000003800000037000000330000003000000035000000340000003000000033000000', replace '000000' with '00', and then convert back.
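As a rough illustration of both points (what the extra bytes are, and the regex cleanup), the sketch below decodes a short sample in Python. The sample bytes are only the first two characters of the bad value from the question, and strip_control_chars is a hypothetical helper name; where such a filter would sit in your import process is an assumption.

```python
import re

# First two characters of the bad value from the question: 35 00 00 00 32 00 00 00.
raw = bytes.fromhex("3500000032000000")

# NVARCHAR data is UTF-16 little-endian, so every second character is U+0000.
text = raw.decode("utf-16-le")
print([hex(ord(c)) for c in text])

CONTROL_CHARS = re.compile(r"[\x00-\x1f]")

def strip_control_chars(value: str) -> str:
    """Remove ASCII control characters (U+0000 to U+001F) from value."""
    return CONTROL_CHARS.sub("", value)

cleaned = strip_control_chars(text)
print(cleaned)
```

Running the same filter on the incoming XML values before the insert should keep the NUL characters out of the live table in the first place.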

How to sew multiple values into one

I need to store 5 values in a single SQL Server column, each in the range 1-90. The values cannot be repeated. I thought of using the 2, 4, 8, 16, 32, 64, ... system, but as you can guess it will get really big, and using decimal I risk wrong calculations. Is there a convenient way to:
store the 5 values in a single column, so as to avoid having 90 bit columns in the table (see my previous post here), and
quickly query the database, for example to return all records with numbers X and Y?
Another option was a string (90) containing flags like 000001000011000, but this way I have to use substrings to query, and I fear it will slow down on a table with 25,000 records or more.
First request: You say most are bits. But if not all of them are, then you can't use bitwise operators and can't save them in a single field.
In that case you need an additional table.
Row_id | fieldName | fieldValue
1 | name1 | value1
1 | name2 | value2
.
.
.
1 | name90 | value90
Second request: Saving the 5 values is very easy and fast in the additional table. Just create an index on row_id in both tables.
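A minimal sketch of the additional-table idea, using Python's sqlite3 module instead of SQL Server so that it runs standalone. The table names draw and draw_value, the sample data and the rows_containing helper are all hypothetical, and the layout is simplified to one value per child row rather than the fieldName/fieldValue pair above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE draw (row_id INTEGER PRIMARY KEY);
    CREATE TABLE draw_value (
        row_id INTEGER NOT NULL REFERENCES draw(row_id),
        value  INTEGER NOT NULL CHECK (value BETWEEN 1 AND 90),
        PRIMARY KEY (row_id, value)   -- also rejects a repeated value in one row
    );
    CREATE INDEX idx_draw_value_value ON draw_value (value, row_id);
""")

# Made-up sample rows: three parent records with five values each.
draws = {1: [3, 17, 42, 58, 90], 2: [3, 8, 42, 61, 77], 3: [5, 17, 23, 58, 89]}
for row_id, values in draws.items():
    conn.execute("INSERT INTO draw (row_id) VALUES (?)", (row_id,))
    conn.executemany(
        "INSERT INTO draw_value (row_id, value) VALUES (?, ?)",
        [(row_id, v) for v in values],
    )

def rows_containing(x: int, y: int) -> list[int]:
    """Return the row_ids whose values include both x and y (x != y)."""
    cur = conn.execute(
        "SELECT row_id FROM draw_value WHERE value IN (?, ?) "
        "GROUP BY row_id HAVING COUNT(*) = 2 ORDER BY row_id",
        (x, y),
    )
    return [r[0] for r in cur]

print(rows_containing(3, 42))
```

The composite primary key enforces the "values cannot be repeated" rule, and the extra index on value is what lets the "X and Y" lookup avoid a full scan.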
Third request: Here you say again that you can save it as bits, but you use strings instead, which is a bad idea.
You are right, a number isn't big enough to hold 90 bits; that is because a number can only hold 32 or 64 bits depending on its type.
In that case you need to use two fields (64 bits each) or three fields (32 bits each) to store all 90 possible flags.
Again easy to do and really fast.
EDIT
To use multiple fields you have to create categories.
For example, imagine there are 16 bits split into two 8-bit fields (0..255 each):
01234567 89ABCDEF
01010101 11111111
Create fieldUp and fieldDown
SAVE
FieldUp = 01234567
FieldUp = 1 + 4 + 16 + 64
FieldDown = 89ABCDEF
FieldDown = 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128
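Then selecting the rows that have the flags [b1, b5, bA] set would be (with the same bit weights as in the SAVE step, b1 = 64 and b5 = 4 in FieldUp, and bA = 32 in FieldDown):
SELECT *
FROM TABLE
WHERE FieldUp & (64 + 4) = (64 + 4)
AND FieldDown & 32 = 32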
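The two-field idea can be sketched for the actual 1-90 case as follows. This is only an illustration in Python, not SQL Server code: the 45/45 split, the bit numbering (value v uses bit v - 1) and the function names encode and contains_all are assumptions chosen so that each half fits comfortably in a signed 64-bit BIGINT.

```python
FIELD_BITS = 45  # assumption: values 1-45 go in the low field, 46-90 in the high field

def encode(values):
    """Pack distinct values in 1..90 into a (low, high) pair of integers."""
    if len(set(values)) != len(values):
        raise ValueError("values must not repeat")
    low = high = 0
    for v in values:
        if not 1 <= v <= 90:
            raise ValueError(f"{v} is outside 1..90")
        bit = v - 1
        if bit < FIELD_BITS:
            low |= 1 << bit
        else:
            high |= 1 << (bit - FIELD_BITS)
    return low, high

def contains_all(low, high, wanted):
    """True if every value in wanted is flagged in the (low, high) pair."""
    want_low, want_high = encode(wanted)
    # Compare against the mask itself: "& mask" alone only proves that
    # at least one of the wanted bits is set.
    return (low & want_low) == want_low and (high & want_high) == want_high

low, high = encode([3, 17, 42, 58, 90])
print(low, high)
print(contains_all(low, high, [3, 90]))
print(contains_all(low, high, [3, 4]))
```

The same comparison carries over to a WHERE clause: compute the two masks for X and Y in the application and test each field with "field & mask = mask".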
I resolved this by saving the numbers comma-separated; in my code I then split this field into an array and process the data. The numbers are not meant for math operations, just used as a string.

SQL - Create Unique AlphaNumeric based on a 10-digit integer stored as VARCHAR

I'm trying to emulate a function in SQL that a client has produced in Excel. In effect, they have a unique, 10-digit numeric value (VARCHAR) as the primary key in one of their enterprise database systems. Within another database, they require a unique, 5-digit alphanumeric identifier. They want that 5-digit alphanumeric value to be a representation of the 10-digit number. So what they did in excel was to split the 10-digit number into pairs, then convert each of those pairs into a hexadecimal value, then stitch them back together.
The EXCEL equation is:
=IF(VALUE(MID(A2,1,4))>0,DEC2HEX(VALUE(MID(A2,3,2)))&DEC2HEX(VALUE(MID(A2,5,2)))&DEC2HEX(VALUE(MID(A2,7,2)))&DEC2HEX(VALUE(MID(A2,9,2))),DEC2HEX(VALUE(MID(A2,5,2)))&DEC2HEX(VALUE(MID(A2,7,2)))&DEC2HEX((VALUE(MID(A2,9,2)))))
I need the SQL equivalent of this. Of course, should someone out there know a better way to accomplish their goal of "a 5-digit alphanumeric identifier" based off the 10-digit number, I'm all ears.
ADDED 8/2/2011
First of all, thank you to everyone for the replies. Nice to see folks willing to help and even enjoying it! Based on all the responses, I'm apt to tell my client their intent is sound, only their method is off kilter. I'd also like to recommend a solution. So the challenge remains, just modified slightly:
CHALLENGE: Within SQL, take a 10 digit, unique NUMERIC string and represent it ALPHANUMERICALLY in as few characters as possible. The resulting string must also be unique.
Note that the first 3-4 characters in the 10-digit string are likely to be zeros, and that they could be stripped to shorten the resulting alphanumeric string. Not required, but perhaps helpful.
This problem is inherently impossible. You have a 10 digit numeric value that you want to convert to a 5 digit alphanumeric value. Since there are 10 numeric characters, this means that there are 10^10 = 10 000 000 000 unique values for your 10 digit number. Since there are 36 alphanumeric characters (26 letters + 10 numbers), there are 36^5 = 60 466 176 unique values for your 5 digit number. You cannot map a set of 10 billion elements into a set of around 60 million without at least two elements landing on the same value.
Now, let's take a closer look at what your client's code is doing:
So what they did in excel was to split the 10-digit number into pairs, then convert each of those pairs into a hexadecimal value, then stitch them back together.
This isn't 100% accurate. The Excel formula never uses the first 2 digits in its output, but performs this operation on the remaining 8 (or only on the last 6 when the first four digits are all zero). There are two main problems with this algorithm which may not be intuitively obvious:
Two 10 digit numbers can map to the same 5 digit number. Consider the numbers 1000000117 and 1000001701. The last four digits of 1000000117 get mapped to 1 11, whereas the last four digits of 1000001701 get mapped to 11 1. This causes both to map to 00111.
The 5 digit number may not even end up being 5 digits! For example, 1000001616 gets mapped to 001010.
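Both problems are easy to reproduce. The sketch below is a Python rendering of the Excel formula from the question, not the client's code; dec2hex and excel_code are hypothetical names, and dec2hex only mimics DEC2HEX for non-negative input with no padding.

```python
def dec2hex(n: int) -> str:
    """Mimic Excel's DEC2HEX for non-negative input: uppercase, no padding."""
    return format(n, "X")

def excel_code(key: str) -> str:
    """Apply the Excel formula from the question to a 10-digit string key."""
    pairs = [int(key[i:i + 2]) for i in range(0, 10, 2)]  # five 2-digit pairs
    if int(key[:4]) > 0:
        used = pairs[1:]   # MID(A2,3,2) through MID(A2,9,2)
    else:
        used = pairs[2:]   # MID(A2,5,2) through MID(A2,9,2)
    return "".join(dec2hex(p) for p in used)

print(excel_code("1000000117"))
print(excel_code("1000001701"))
print(excel_code("1000001616"))
```

The first two calls return the same code, and the third returns six characters instead of five, because each pair is converted without padding to a fixed width.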
So, what is a possible solution? Well, if you don't care if that 5 digit number is unique or not, in MySQL you can use something like:
hex(<NUMERIC VALUE> % 0xFFFFF)
The log of 10^10 base 2 is 33.219280948874
> return math.log(10 ^ 10) / math.log(2)
33.219280948874
> = 2 ^ 33.21928
9999993422.9114
So, it takes 34 bits to represent this number. In hex this will take 34/4 = 8.5, so 9 characters, much more than 5.
> return math.log(10 ^ 10) / math.log(16)
8.3048202372184
The Excel formula is ignoring the first 2 (or 4) characters of the 10 character string.
You could try encoding in base 36 instead of 16. This will get you to 7 characters or less.
> return math.log(10 ^ 10) / math.log(36)
6.4254860446923
The popular base 64 encoding will get you to 6 characters
> return math.log(10 ^ 10) / math.log(64)
5.5365468248123
Even Ascii85 encoding won't get you down to 5.
> return math.log(10 ^ 10) / math.log(85)
5.1829075929158
You need base 100 to get to 5 characters
> return math.log(10 ^ 10) / math.log(100)
5
There aren't 100 printable ASCII characters, so this is not going to work, as zkhr explained as well, unless you're willing to go beyond ASCII.
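For the base 36 suggestion, a minimal encoder might look like the following. It is written in Python purely to illustrate the arithmetic (the question asks for SQL, where the same divide-and-remainder loop would apply); to_base36 and from_base36 are hypothetical names and the 0-9, A-Z digit order is an assumption.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base36(n: int) -> str:
    """Encode a non-negative integer using the digits 0-9 and A-Z."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))

def from_base36(code: str) -> int:
    """Decode a base 36 string back to the original integer."""
    return int(code, 36)

largest = to_base36(9_999_999_999)  # the largest 10-digit value
print(largest, len(largest))
print(from_base36(largest))
```

Because the encoding is reversible, uniqueness is preserved, and values with leading zeros in the original string come out shorter than 7 characters.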
I found your question interesting (although I don't claim to know the answer) - I googled a bit for you out of interest and found this which may help you http://dpatrickcaldwell.blogspot.com/2009/05/converting-decimal-to-hexadecimal-with.html

How to split a really long mysql result set into two lines?

Suppose you have a result that is 100 chars long but you only have a 50 char width. How do you split a MYSQL result into two rows of 50 chars each?
Could you clarify the question a bit? Are you looking to insert 100 chars of data into a 50 char column? Or do you have 100 chars in the database but only have space in your app to display 50 chars?
I have 100 chars in the database result set but I want the result set string to have a break after the 50th char and continue onto the next line.
Example
SELECT * FROM FOO
returns
1 2 3 4 5 6 7 8 9...50 51 52 53..98 99 100
but I want
1 2 3 4 5 6 7 8 9...50
51 52... 99 100
Is this possible?
SELECT substring(col, 1, 50) FROM foo
UNION ALL
SELECT substring(col, 51) FROM foo
You're asking a question about formatting data for viewing. SQL is a declarative data retrieval language, not a data formatting language. You should solve this problem in your non-SQL code.
Formatting data in a SQL query is not a good idea, unless you have to write something that will run in a query analyzer. Your question isn't specific about whether or not that is the case.
Do you want to return the result set in PHP or MySQL? If the former, then it's easier.
Take the string, take the first 50 characters, put in a line break, and then append the rest of the string.
MySQL would work on the same principle, but you may have issues with line-break characters.
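Doing the split in application code, as suggested above, could look like this. The answers mention PHP; this sketch uses Python only to show the principle, and split_at and the sample value are hypothetical.

```python
def split_at(text: str, width: int = 50) -> str:
    """Insert a newline after every `width` characters of text."""
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    return "\n".join(chunks)

# Hypothetical 100-character value standing in for the column from the question.
sample = "".join(str(i % 10) for i in range(100))

lines = split_at(sample).split("\n")
print(lines)
```

Unlike the UNION ALL query, this keeps both halves of one value together, which matters as soon as the result set has more than one row.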