Multiple headers vs. single header

I am passing a hash identifier to a server through a web service call. The hash value is passed in the header.
If there are multiple hash values, which of the following implementations is suggested:
1) multiple headers - one for each hash identifier
2) single header - combine the hash values with a separator
Which one is better? Is there a better way?

As long as the (length of the hash * number of hashes to send) is reasonable (<4K), then a single header value is OK. If you use multiple headers, you'll probably want to number them, like "APP_HASH_1", "APP_HASH_2", etc.
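For illustration, the two layouts might look like this (the header names and hash values here are hypothetical):
Option 1 - multiple numbered headers:
APP_HASH_1: 2fd4e1c67a2d28fced849ee1bb76e739
APP_HASH_2: de9f2c7fd25e1b3afad3e85a0bd17d9b
Option 2 - single header with a separator:
APP_HASH: 2fd4e1c67a2d28fced849ee1bb76e739,de9f2c7fd25e1b3afad3e85a0bd17d9b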

Related

Oracle SQL trying to Hash 25 columns in one column

So when I was trying to hash 25 columns using the ORA_HASH function I was getting the error "too many parameters".
Is there any way we can hash all 25 columns, and quickly, because we have around 60M rows and no update date :(
select ORA_HASH(id, name, c...., ...) from table_name
Use concatenation with some special string as a delimiter, e.g. chr(10) here, assuming this character doesn't appear in your data:
col1||chr(10)||col2||....
Be careful with numeric and date columns. Either convert them explicitly to character strings, e.g.
...||to_char(col_date,'yyyy-mm-dd hh24:mi:ss')||...
or temporarily override the session settings to get constant values:
ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.';
ALTER SESSION SET NLS_DATE_FORMAT = 'DD.MM.YYYY HH24:MI:SS';
The problem with relying on NLS settings is that when they change and you perform a default conversion to a character string, you get a different hash code.
Note also that ORA_HASH can lead to duplicates; consider e.g. an MD5 hash to recognise changes in the table data.
Final note: Oracle has a (not well known) function DBMS_SQLHASH.GETHASH which may or may not be what you are looking for.
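Putting those pieces together, a minimal sketch of the concatenation approach might look like this (my_table and the column names are placeholders, and chr(10) is assumed not to occur in the data, per the note above):

SELECT id,
       ORA_HASH(col1 || chr(10) ||
                to_char(col_num) || chr(10) ||
                to_char(col_date, 'yyyy-mm-dd hh24:mi:ss')) AS row_hash
FROM   my_table;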
Surely your ultimate goal is not to get a hash? What is the hash for? It may very well not be the right way to achieve your goal.
Second, ORA_HASH is a weak, 32-bit hash that will produce a hash collision about every 25,000 rows! I wrote a whole blog post about this, see:
https://stewashton.wordpress.com/2014/02/15/compare-and-sync-tables-dbms_comparison/
Third, starting with version 12c there is a STANDARD_HASH function that seems to perform quite well and that goes up to 512 bits! (not bytes as I said before editing this answer...)
Finally, the right way to hash several things together is "hash chaining", not concatenating the values. ORA_HASH appears to support hash chaining (or something of similar effect) using the third parameter:
ora_hash(column1, 4294967295, ora_hash(column2))
With STANDARD_HASH, I would first use it on each column individually, then use UTL_RAW.CONCAT to concatenate the results, then either use STANDARD_HASH on the concatenated result or just use the concatenated value as if it were a big hash.
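A rough sketch of that per-column approach (12c+; my_table, col1 and col2 are placeholders, and NULL values would need e.g. NVL handling first):

SELECT STANDARD_HASH(
         UTL_RAW.CONCAT(
           STANDARD_HASH(col1, 'SHA256'),
           STANDARD_HASH(to_char(col2), 'SHA256')),
         'SHA256') AS row_hash
FROM   my_table;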

Storing HASHBYTES output in NVARCHAR vs BYTES

I am going to create:
1) a table for storing IDs and unique text values (which are expected to be large)
2) a stored procedure which will have a text value as input parameter (it will check if the value exists in the above table and return the corresponding ID if it exists, or insert a new record and return the new ID if not)
I want to optimize the search of text values by using a hash value of the text with an index created on it. So, during the search I expect a non-clustered index to be used (not the clustered index).
I decided to use HASHBYTES with SHA2_256, and I am wondering whether there are any differences/benefits in storing the hash value as BINARY(32) versus NVARCHAR(16)?
You can't reasonably store a hash value as characters, because binary data is not text. Various text-processing and comparison functions interpret those characters; for example, trailing whitespace is sometimes ignored, leading to incorrect results.
Since you've got 32 totally random, unstructured bytes to store, BINARY(32) is the most natural format, and it is also the fastest one.
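A minimal T-SQL sketch of that design (table and column names are hypothetical; note that before SQL Server 2016, HASHBYTES input was capped at 8000 bytes):

CREATE TABLE dbo.TextValues (
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    TextValue NVARCHAR(MAX) NOT NULL,
    TextHash  BINARY(32) NOT NULL  -- output of HASHBYTES('SHA2_256', ...)
);
CREATE NONCLUSTERED INDEX IX_TextValues_TextHash ON dbo.TextValues (TextHash);

-- Lookup: probe the hash index first, then compare the full text to
-- guard against the (unlikely) possibility of a collision.
DECLARE @TextValue NVARCHAR(MAX) = N'some large text value';
SELECT Id
FROM   dbo.TextValues
WHERE  TextHash  = HASHBYTES('SHA2_256', @TextValue)
AND    TextValue = @TextValue;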

Sql function to turn character field into number field

I'm importing data from one system to another. The former keys off an alphanumeric field whereas the latter requires a numeric integer field. I'd like to find or write a function that I can feed the alphanumeric value to and have it return a number that would be unique to the value passed in.
My first thought was to do a hash, but of course the result of any built-in hash is going to contain letters, and plus it's technically possible (however unlikely) that a hash may not be unique.
My first question is whether there is anything built into SQL that I'm overlooking; short of that, I'd like to hear suggestions on the easiest way to implement such a function.
Here is a function which should convert from base 10 (integer) to base 36 (alphanumeric) and back again:
https://www.simple-talk.com/sql/t-sql-programming/numeral-systems-and-numbers-conversion-in-sql/
You might find the resultant number is too big to be held in an integer though.
You could concatenate the ASCII values of each character of your string and cast the result as a bigint.
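For illustration, a quick T-SQL sketch of that idea (assuming short, printable-ASCII input; the concatenated number overflows BIGINT once the string grows past a handful of characters, echoing the size concern above):

DECLARE @s VARCHAR(10) = 'AB1';
DECLARE @result VARCHAR(40) = '';
DECLARE @i INT = 1;
WHILE @i <= LEN(@s)
BEGIN
    -- append the 2- or 3-digit ASCII code of each character
    SET @result = @result + CAST(ASCII(SUBSTRING(@s, @i, 1)) AS VARCHAR(3));
    SET @i = @i + 1;
END;
SELECT CAST(@result AS BIGINT) AS numeric_key;  -- 'AB1' -> 656649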
If the original data is known to be integers you can use cast:
SELECT CAST(varcharcol AS INT) FROM Table

Selecting rows where a column contains at least one character not in a whitelist

I'm converting some sensitive data from a low-security encryption to a higher-security one (specifically, from CFMX_COMPAT to AES with a 256-bit key). I intend to encode my AES-encrypted strings using hex, and CFMX_COMPAT is extremely likely to use special characters, so finding records that aren't yet converted should be as simple as (pseudocode):
select from table where column has at least one character not in [A-Z0-9]
Is this possible in SQL? If so, how?
I found this documentation, but I had no idea it was possible in a simple LIKE clause. Awesome!
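-- [^A-Z0-9] matches any single character outside A-Z and 0-9, so this
-- finds rows where foo contains at least one such character (whether the
-- range is case-sensitive depends on the column's collation)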
select top 10 foo
from bar
where foo like '%[^A-Z0-9]%'

Lucene field from TokenStream with stored values

I have a field which needs to come from a token stream; it cannot be instantiated with a string and then analyzed into tokens. For example, I might want to combine the data from multiple columns (in my RDBMS) into a single Lucene field, but I want to analyze each column in its own way. So I cannot simply concatenate them all into a single string and then analyze the resulting string.
The problem I am running into now is that fields created from token streams cannot be stored, which makes sense in the general case since the stream may not have an obvious string representation. However, I know the string representation, and I would like to store that.
I tried adding the same field twice, once with it being stored and having string data and once with it coming from a token stream, but it seems that this can't be done. Apart from some hack like adding a field with a name of "myfield__stored" is there a way to do this?
I am using Lucene 2.9.2.
I found a way. You can sneak it in by instantiating it as a normal field but calling SetTokenStream later:
// create the field as a normal stored, analyzed field with its string value
Field f = new Field(Name, StringValue, Store, Analyzed, TV);
// then swap in the token stream; this is what will actually be indexed
f.SetTokenStream(TokenStreamValue);
Because the reader/string value is only indexed if the token stream value is null, the token stream value will be indexed. The store methods look at the string/reader regardless of the token stream, so it is the string value that gets stored.