SQL Server: what to do when a decimal isn't really a decimal?

I have a situation where a decimal figure isn't truly a decimal; it is a sub-count of sorts. For example, when a prescription for medication is filled it is given an Rx number (let's say 345673). That number stays with the prescription through any refills, and each refill appends .1, .2, etc. So, over the life of that Rx number you could end up with 345673, 345673.1, 345673.2 and so on. The problem comes when you hit .10, .20, .30, etc. In decimal form those records are the same as .1, .2, .3.
Is there any way to track these numbers and keep the trailing zeros without having to use VARCHAR? This is the primary key column and I'm not crazy about using varchar on a PK (is that old-fashioned?)
Any and all suggestions/help is appreciated.
Edit to add: I should have explained why we can't use separate columns for this. This data originates elsewhere and is merged into our database from a bulk import operation. The merge is perpetual since much of the data is altered at the origin and combined on the next bulk import. The Rx number has to match exactly to perform the bulk import / merge.

I would recommend using a composite primary key. Add a second column, perhaps called refill, and use that for your incremental values. That way both columns are integers and there is no need to use varchar for your primary key.
You may need a trigger to maintain the refill value, since an IDENTITY column cannot restart its numbering for each Rx number.
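To make the collision and the two-column idea concrete, here is a minimal Python sketch. It assumes the import delivers the Rx number as text (for example "345673.10"), which the question does not confirm, and the function name split_rx is hypothetical.

```python
from decimal import Decimal
from typing import Tuple

def split_rx(raw: str) -> Tuple[int, int]:
    """Split an Rx string such as '345673.10' into (rx_id, refill).

    Assumption: the value arrives as text. A bare number such as
    '345673' is treated as refill 0 (the original fill).
    """
    rx_id, _, refill = raw.strip().partition(".")
    return int(rx_id), (int(refill) if refill else 0)

# As numbers, refill 1 and refill 10 collide ...
print(Decimal("345673.1") == Decimal("345673.10"))  # True
# ... but as a pair of integers they stay distinct.
print(split_rx("345673.1"))   # (345673, 1)
print(split_rx("345673.10"))  # (345673, 10)
```

If the value has already passed through a decimal column before it reaches you, the trailing zero is gone and no split can bring it back, so the split would have to happen on the raw import text.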

Convert the decimal to varchar using either CONVERT or CAST. Then you can use character/string functions such as LEFT(), SUBSTRING(), and REPLACE().
The other option is to parse out the decimal using math:
DECLARE @RxValue decimal(18,4) = 100.234
SELECT @RxValue [Rx Number]
,@RxValue - CONVERT(int,@RxValue) [Refill Count Decimal]
,LEN(CONVERT(float,(@RxValue - CONVERT(int,@RxValue))))-2 [After Decimal Length]
,POWER(10, LEN(CONVERT(float,(@RxValue - CONVERT(int,@RxValue))))-2) [Calc multiplier for Refill Count]
,CONVERT(int,@RxValue) [Rx ID]
,CONVERT(int,(@RxValue - CONVERT(int,@RxValue)) * POWER(10, LEN(CONVERT(float,(@RxValue - CONVERT(int,@RxValue))))-2)) [Refill Count]
The last two columns in the above select are your two desired columns. Everything above them is my attempt to "show my work" so you can see what I was thinking. The "-2" removes the "0." from the LEN() function's result. POWER() then raises 10 to that length, the number of digits after the decimal point, which gives the multiplier that turns the fractional part into an integer.
You should only need to replace all references to @RxValue with the column name of your "Rx number" to make it work. If you want to play around with it, you can change the value in @RxValue to whatever you want and make sure the result is what you expect. Be sure to change @RxValue's data type to match yours.
You can, of course, keep your "Rx number" as the PK if that is how things currently stand. You'll just be adding a couple of derived columns: [Rx ID] and [Refill Count].
Please leave a comment if you have a question.

Related

Real number comparison for trigram similarity

I am implementing trigram similarity for word matching in column comum1. similarity() returns real. I have converted 0.01 to real and rounded to 2 decimal digits. Though there are rank values greater than 0.01, I get no results on screen. If I remove the WHERE condition, lots of results are available. Kindly guide me how to overcome this issue.
SELECT *,ROUND(similarity(comum1,"Search_word"),2) AS rank
FROM schema.table
WHERE rank >= round(0.01::real,2)
I have also converted both numbers to numeric and compared, but that also didn't work:
SELECT *,ROUND(similarity(comum1,"Search_word")::NUMERIC,2) AS rank
FROM schema.table
WHERE rank >= round(0.01::NUMERIC,2)
LIMIT 50;
The WHERE clause can only reference input column names, coming from the underlying table(s). rank in your example is the column alias for a result - an output column name.
So your statement is illegal and should return an error message, unless you have another column named rank in schema.table, in which case you shot yourself in the foot. I would think twice before introducing such a naming collision, especially while you are not yet completely firm with SQL syntax.
Also, round() with a second parameter is not defined for real; you would need to cast to numeric like you tried. That is another reason your first query is illegal.
Also, the double-quotes around "Search_word" are highly suspicious. If that's supposed to be a string literal, you need single quotes: 'Search_word'.
This should work:
SELECT *, round(similarity(comum1,'Search_word')::numeric,2) AS rank
FROM schema.table
WHERE similarity(comum1, 'Search_word') > 0.01;
But it's still pretty useless as it fails to make use of trigram indexes. Do this instead:
SET pg_trgm.similarity_threshold = 0.01; -- set once
SELECT *
FROM schema.table
WHERE comum1 % 'Search_word';
See:
Finding similar strings with PostgreSQL quickly
That said, a similarity of 0.01 is almost no similarity. Typically, you need a much higher threshold.
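To get a feel for what a similarity value means, here is a rough Python sketch of a trigram similarity measure. It follows the way the pg_trgm documentation describes trigram extraction (lower-casing, splitting on non-alphanumeric characters, padding each word with two leading blanks and one trailing blank), but it is an approximation for illustration, not the extension's actual code. The function names are hypothetical.

```python
import re

def trigrams(text: str) -> set:
    """Trigram set of a string: lower-case it, split it into alphanumeric
    words, and pad each word with two leading spaces and one trailing space."""
    grams = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

print(round(similarity("word", "two words"), 6))  # 0.363636
print(similarity("word", "xyz"))                  # 0.0
```

Under this measure a score of 0.01 would require roughly one shared trigram out of a hundred distinct ones, which is why such a low threshold matches almost anything that shares a single fragment.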

CHECK Constraint for SUM failing on double precision column

I'm trying to understand a failure of my CHECK constraint in SQL Server 2008 R2 (the same problem occurs on SQL Server 2012).
My SQL command just increases the amount in two columns by 126.3, and the constraint checks whether the sum of two columns matches a third column.
Below are the steps to reproduce the problem:
CREATE TABLE FailedCheck ( item VARCHAR(10), qty_total DOUBLE PRECISION, qty_type1 DOUBLE PRECISION, qty_type2 DOUBLE PRECISION )
ALTER TABLE FailedCheck ADD CONSTRAINT TotalSum CHECK(qty_total = (qty_type1 + qty_type2));
INSERT INTO FailedCheck VALUES ('Item 2', 101.66, 91.44, 10.22);
UPDATE FailedCheck SET qty_total = qty_total + 126.3, qty_type1 = qty_type1 + 126.3
Column qty_total must contain the sum of qty_type1 and qty_type2. All columns are DOUBLE PRECISION. If I change the value from 126.3 to 126, it works. I've tested other values (int and double) and couldn't understand why sometimes it works and sometimes it doesn't.
What's wrong with my CHECK constraint ?
PS: Sorry for my english, it's not my primary language.
You chose a floating point data type, which only holds an approximate value: quite precise, but only up to a point. 1.3 may well be stored as 1.299999999999998 or something along those lines. So the sum of the approximate values of 91.44 and 10.22 may happen to be exactly the approximate value of 101.66, but it may also be very slightly different.
Never compare floating point values with the equal sign (=).
And better still, don't use floating point types in the first place unless you really, really need them. Use DECIMAL instead.
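The same effect can be reproduced outside the database. The short Python sketch below uses Python's float (an IEEE 754 double, assumed here to behave like DOUBLE PRECISION) for a well-known failing case, and Python's Decimal as a stand-in for a DECIMAL column with the values from the question.

```python
import math
from decimal import Decimal

# Binary floating point cannot represent most decimal fractions exactly,
# so an equality test on sums can fail even when the arithmetic "should" match.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True: compare with a tolerance instead

# The values from the question, kept exact with a decimal type.
qty_type1 = Decimal("91.44") + Decimal("126.3")
qty_type2 = Decimal("10.22")
qty_total = Decimal("101.66") + Decimal("126.3")
print(qty_total == qty_type1 + qty_type2)  # True
```

With an exact decimal type the equality in the CHECK constraint is safe; with a floating point type the constraint would need a tolerance instead of the equal sign.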

sql strip non-number characters to do avg (average) function on column like "$12.39"

I'm using Rails 3 & postgresql
I have column like -> formatted_price: "$17.99"
How can I use avg on this column?
I tried :
@items = Item.where(:user_id => @category.user_id, :asin => [@category.asins[0..-2].split(',')]).select("asin as asin, title as title, avg(sales_rank) as avg_rank, avg(formatted_price) as avg_price").group(:asin, :title)
I'm getting an error because of avg(formatted_price) as avg_price
Quick solution:
AVG(CAST(TRIM(LEADING '$' FROM formatted_price) AS NUMERIC))
Better solution: change the column to a more suitable type, e.g. money or fixed-precision numeric, and format it only when needed for display purposes.
Update: seems from the comments that the column is not formatted uniformly the way it was described in the OP. Although you could follow the suggestion from MatBailie and use substring with a regex to extract the numeric portion to get an average, to me it just does not make sense to take an average of a bunch of monetary values in different currencies.
So, either add a where clause to limit the query to those that are in the currency you want, or go back and rethink what you are trying to do.
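The idea of extracting the numeric portion while restricting the average to one currency can be sketched in Python as follows. The function name average_price and the sample values are hypothetical, and the pattern assumes prices look like "$17.99" with the symbol in front.

```python
import re
from decimal import Decimal
from typing import Iterable, Optional

def average_price(prices: Iterable[str], symbol: str = "$") -> Optional[Decimal]:
    """Average only the prices that start with the given currency symbol.

    Values in any other format or currency are skipped rather than guessed at.
    Returns None when nothing matches.
    """
    pattern = re.compile(re.escape(symbol) + r"\s*(\d+(?:\.\d+)?)")
    amounts = []
    for price in prices:
        match = pattern.fullmatch(price.strip())
        if match:
            amounts.append(Decimal(match.group(1)))
    return sum(amounts, Decimal(0)) / len(amounts) if amounts else None

print(average_price(["$17.99", "$12.39", "EUR 9.00"]))  # 15.19
```

Skipping the non-matching rows mirrors the advice to filter by currency first instead of averaging mixed monetary values.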

Manipulating a record data

I am looking for a way to take data from one table and manipulate it and bring it to another table using an SQL query.
I have a Column called NumberStuff that has data like this in it:
INC000000315482
I need to cut off the INC portion of the number and convert it into an integer and store it into a Column in another table so that it ends up looking like this:
315482
Any help would be much appreciated!
Another approach is to use the REPLACE function, either in T-SQL or as a Derived Column expression in SSIS.
TSQL
SELECT REPLACE(T.MyColumn, 'INC', '') AS ReplacedINC
SSIS
REPLACE([MyColumn], "INC", "")
This removes the character based data. It then becomes an optional exercise in converting to a numeric type before storing it to the target table or letting the implicit conversion happen.
Simplest version of what you need.
select cast(right(column,6) as int) from table
Are you doing this in an SSIS package or in plain SQL? And is the number always the last 6 characters?
This version is a little less dependent on your formatting: it trims the first 3 characters, the cast to int drops the leading zeros, and the number can be any length.
select cast(SUBSTRING('INC000000315482',4,LEN('INC000000315482') - 3) as int)
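The same transformation, written as a small Python sketch for comparison. The function name strip_prefix_to_int is hypothetical, and it assumes every value starts with the literal prefix INC followed only by digits.

```python
def strip_prefix_to_int(value: str, prefix: str = "INC") -> int:
    """Drop a known prefix and convert the rest to an integer.

    int() ignores leading zeros, so '000000315482' becomes 315482.
    Raises ValueError if the prefix is missing or the rest is not numeric.
    """
    if not value.startswith(prefix):
        raise ValueError(f"expected prefix {prefix!r} in {value!r}")
    return int(value[len(prefix):])

print(strip_prefix_to_int("INC000000315482"))  # 315482
```

Failing loudly on an unexpected format is a design choice here; a silent conversion would hide rows that do not follow the INC pattern.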

How to find MAX() value of character column?

We have a legacy table where one of the columns, part of a composite key, was filled in manually with values like:
code
------
'001'
'002'
'099'
etc.
Now we have a feature request in which we must know MAX(code) in order to give the user the next possible value; in the example above, the next value is '100'.
We tried to experiment with this but we still can't find any reasonable explanation how DB2 engine calculates that
MAX('001', '099', '576') is '576'
MAX('099', '99', 'www') is '99' and so on.
Any help or suggestion would be much appreciated!
You already have the answer for getting the maximum numeric value, so this addresses the other part of the question, regarding 'www', '099' and '99'.
The AS/400 uses EBCDIC to store values. EBCDIC differs from ASCII in several ways; the most important for your purposes is that alphabetic characters come before digits, which is the opposite of ASCII.
So for your MAX(), the three strings are sorted by EBCDIC value and the highest one is returned. In ascending order:
'www'
'099'
'99 '
As you can see, your '99' string is really '99 ' (padded with a trailing blank), so it sorts higher than the one with the leading zero.
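The ordering can be checked with a short Python sketch. It uses Python's cp037 codec as a stand-in for the EBCDIC code page on the AS/400; the actual code page (CCSID) of the column is an assumption.

```python
codes = ["099", "99 ", "www"]

# Python compares str by Unicode code point, which matches ASCII here:
# blank < digits < lower-case letters.
print(max(codes))  # www

# Under EBCDIC (code page 037 assumed) the blank sorts lowest,
# lower-case letters come next, and digits sort highest.
def ebcdic_key(text: str) -> bytes:
    return text.encode("cp037")

print(sorted(codes, key=ebcdic_key))     # ['www', '099', '99 ']
print(repr(max(codes, key=ebcdic_key)))  # '99 '
```

Comparing the encoded bytes is what makes '99 ' win: its first byte (the digit 9) is already higher than the first byte of '099' (the digit 0) and of 'www'.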
Cast it to int before applying max()
For the numeric maximum -- filter out the non-numeric values and cast to a numeric for aggregation:
SELECT MAX(INT(FLD1))
FROM MYTABLE -- placeholder: substitute your own table name
WHERE FLD1 <> ' '
AND TRANSLATE(FLD1, '0123456789', '0123456789') = FLD1
SQL Reference: TRANSLATE
And the reasonable explanation:
SQL Reference: MAX
MAX works correctly for your column's type, which is character. If you want the maximum of the integer values, convert the values to integer before calling MAX. But I see you are mixing in strings such as 'www'; how do you imagine that should work?
Filter for integer-only values, cast them to int and call MAX. This is not a well-designed solution, but for your problem I think it is enough.
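The filter, cast and max steps, plus the "next possible value" the question asks for, can be sketched in Python like this. The function name next_code is hypothetical, and the fixed width of 3 is taken from the sample codes.

```python
import re

def next_code(codes, width=3):
    """Return the next zero-padded code after the highest numeric one.

    Non-numeric entries such as 'www' are ignored. With no numeric
    entries at all, numbering starts at 1.
    """
    numeric = [int(c) for c in codes if re.fullmatch(r"[0-9]+", c.strip())]
    highest = max(numeric, default=0)
    return str(highest + 1).zfill(width)

print(next_code(["001", "002", "099", "www"]))  # 100
```

Note that zfill only pads; once the codes pass 999 the result is simply longer than the assumed width, so a real implementation would need a rule for that case.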
Sharing the solution for PostgreSQL that worked for me.
Suppose temporary_id is of a character type in the database. The query below converts it from char to int before aggregating, so the result comes back as an integer.
SELECT MAX(CAST (temporary_id AS Integer)) FROM temporary
WHERE temporary_id IS NOT NULL
As per my requirement I've applied MAX() aggregate function. One can remove that also and it will work the same way.