How can I get SAS to forget decimal places in variables? - formatting

I have a problem with decimal places in my SAS-variables.
I have two numeric variables which I want to compare in SAS.
They are both formatted with numx10.1 and one of the variables is calculated like this:
avg(delay)/30.44 as delay_months format=numx10.1
I want to compare the two values in a datastep using
where var1^=delay_months ;
But even though both variables show the same value, for example 4,1, SAS still returns this observation from the data step with the where statement.
I guess it is because of the decimals in the delay_months variable.
How can I get SAS to forget the decimal places and only focus on the number I see - for example 4,1?

A format does not change the stored value; it only controls how SAS displays it to you. One option is to round during the comparison (or when you create the variables):
where round(var1, 0.1) ^= round(delay_months, 0.1);
SAS uses floating-point representation for numeric values. Because of that, the value you see displayed is sometimes not exactly the value that is stored.
The SAS documentation covers this in more detail.
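The effect is not specific to SAS. As a minimal sketch, here is the same situation in Python, whose floats are 8-byte doubles like SAS numerics on most platforms. The values 3.3 and 1.1 + 2.2 are illustrative stand-ins (not taken from the question) for var1 and the computed avg(delay)/30.44.

```python
# Two values that display identically at one decimal place but are stored differently.
var1 = 3.3
delay_months = 1.1 + 2.2   # a computed value, standing in for avg(delay)/30.44

# Both print as 3.3 with one decimal, much like a numx10.1 format would show them.
shown_var1 = f"{var1:.1f}"
shown_delay = f"{delay_months:.1f}"

# The stored values differ in the last binary digits, so a raw comparison fails.
raw_equal = var1 == delay_months

# Rounding both sides to the displayed precision makes the comparison behave as expected.
rounded_equal = round(var1, 1) == round(delay_months, 1)

print(shown_var1, shown_delay, raw_equal, rounded_equal)  # 3.3 3.3 False True
```

This mirrors the round(var1, 0.1) ^= round(delay_months, 0.1) comparison above: round to the precision you care about, then compare.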

Related

SAS VARTYPE Function: run it (or equivalent) against all Variables

I am a DB administrator with 0 SAS experience and I work in government and have been tasked with ingesting SAS output from another team. The other team has limited SAS experience apparently and cannot answer the question "what is the data type of each SAS variable". We have dozens of tables and thousands of variables to import. Is there a way to run the SAS function "VarType" against all columns?
I've not found what I needed on SAS docs, SO search, etc.
I am expecting code that I can hand to the other team which they will run to produce the following (with only hard-coding the "dataset" ; no hard-coded table names/variable names):
TableName | VariableName | DataType | DataLength and/or other attributes as needed
--- | --- | --- | ---
MyTable 1 | Column1 | char | 25
MyTable 1 | Col2 | numeric | scale 10 precision 2
MyTable 2 | Col1 | (small? big? 32?) int | bytes? or something that tells me max range
... | | |
MyTable102 | Column100 | date | yyyy-mm-dd
Update: here's what I used based on the accepted answer. You would change:
library=SASHELP to library=YourLibrary, to change the library whose tables are scanned
out=yourDataset.sasSchemaDump: replace yourDataset with the destination library, where a new table named sasSchemaDump will be created and populated. Rename sasSchemaDump to your desired table name.
proc datasets library=SASHELP  memtype=data;
contents data=_ALL_ (read=green) out=yourDataset.sasSchemaDump;
title 'SAS Schema Dump';
run;
There is a dedicated SAS procedure for this, PROC CONTENTS:
proc contents data=sashelp.cars out=want; run;
It will create a SAS table want with all the information needed.
FYI: TYPE 1 is numeric, TYPE 2 is character.
If all tables are in the same library, you can do the following to cycle through every table in that library:
proc contents data=sashelp._all_ out=want; run;
Run PROC CONTENTS on the dataset and you will have the information you need.
SAS has only two data types: fixed-length character strings and floating-point numbers. The LENGTH is the number of bytes stored in the dataset. For character variables, the length determines how many characters the variable can hold (assuming a single-byte encoding). Floating-point numbers require 8 bytes, but you can store them with fewer bytes in the dataset if you can accept the resulting loss of precision. For example, if you know the values are integers, you might choose to store only 4 of the bytes.
You can sometimes learn more about a variable if its creator attached a permanent FORMAT to control how the variable is displayed. For example, SAS stores DATE values as the number of days since January 1, 1960, so to make those numbers meaningful to humans you need to attach a format such as DATE9. or YYMMDD10. so that they print as strings a human would recognize as dates. Similarly, there are display formats for time-of-day values (number of seconds since midnight) and datetime values (number of seconds since the start of 1960). Also, a format that displays no decimal places might mean the values are intended to be integers.
And if they attached a LABEL to the variable that might explain more about the variable than you can learn from the name alone.
They could also attach user defined formats to a variable. Those could be simple code/decode lookups, but they could also be more complex. A common complex one is used for collapsing a range (or multiple values and/or ranges) to a single decode. The definition of a user defined format is stored in a separate file, called a catalog, in particular a format catalog. You can use PROC FORMAT with the FMTLIB or CNTLOUT= option to see the definition of the user defined formats.
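For the receiving side, here is a rough sketch in Python of how the TYPE, LENGTH and FORMAT columns from the PROC CONTENTS output could be interpreted. The helper names (describe, sas_date_to_date, sas_datetime_to_datetime) are hypothetical, and the list of date format prefixes is a deliberately incomplete assumption, so treat the mapping as a starting heuristic rather than a complete rule set.

```python
from datetime import datetime, timedelta

# SAS counts dates in days and datetimes in seconds from the start of 1960.
SAS_EPOCH = datetime(1960, 1, 1)


def sas_date_to_date(days):
    """Convert a SAS date value (days since 1960-01-01) to a Python date."""
    return (SAS_EPOCH + timedelta(days=days)).date()


def sas_datetime_to_datetime(seconds):
    """Convert a SAS datetime value (seconds since 1960-01-01 00:00:00)."""
    return SAS_EPOCH + timedelta(seconds=seconds)


def describe(type_code, length, fmt):
    """Guess a target column type from TYPE (1 numeric, 2 character), LENGTH and FORMAT.

    Heuristic only: the format prefixes below are an assumed, partial list.
    """
    fmt = (fmt or "").upper()
    if type_code == 2:
        return f"char({length})"
    if fmt.startswith("DATETIME"):          # must be checked before DATE
        return "datetime"
    if fmt.startswith(("DATE", "YYMMDD", "MMDDYY", "DDMMYY")):
        return "date"
    if fmt.startswith("TIME"):
        return "time"
    return "float"                          # every other numeric is a floating-point number


print(describe(2, 25, ""))          # char(25)
print(describe(1, 8, "YYMMDD"))     # date
print(sas_date_to_date(366))        # 1961-01-01 (1960 was a leap year)
```

Numeric variables without a format stay "float" here because, as described above, SAS has no separate integer or decimal type to report.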

Float type storing values in format "2.46237846387469E+15"

I have a table ProductAmount with columns
Id [BIGINT]
Amount [FLOAT]
Now when I pass a value from my form to the table, it gets stored in the format 2.46237846387469E+15, whereas the actual value was 2462378463874687. Any ideas why this value is being converted and how to stop this?
It is not being converted. That is what the floating point representation is. What you are seeing is the scientific/exponential format.
I am guessing that you don't want to store the data that way. You can alter the column to use a fixed format representation:
alter table ProductAmount alter column amount decimal(20, 0);
This assumes that you do not want any decimal places. You can read more about decimal formats in the documentation.
I would strongly discourage you from using float unless:
You have a real floating point number (say an expected value from a statistical calculation).
You have a wide range of values (say, 0.00000001 to 1,000,000,000,000,000).
You only need a fixed number of digits of precision over a wide range of magnitudes.
Floating point numbers are generally not needed for general-purpose and business applications.
The value gets stored in a binary format, because this is what you specified by requesting FLOAT as the data type for the column.
The value that you store in the field is represented exactly, because 64-bit FLOAT uses 52 bits to represent the mantissa*. Even though you see 2.46237846387469E+15 when selecting the value back, it's only the presentation that is slightly off: the actual value stored in the database matches the data that you inserted.
But i want to store 2462378463874687 as a value in my db
You are already doing it. This is the exact value stored in the field. You just cannot see it, because the query tool in SQL Server Management Studio formats it using scientific notation. When you do any computations on the value, or read it back into a double field in your program, you will get back 2462378463874687.
If you would like to see the exact number in your select query in SQL Management Studio, use CONVERT:
CONVERT (VARCHAR(50), float_field, 128) -- See note below
Note 1: 128 is a deprecated format. It will work with SQL Server 2008, which is one of the tags of your question, but in SQL Server 2016 and above you need to use 3 instead.
Note 2: Since the name of the column is Amount, chances are good that you are looking for a different data type. Look into decimal data types, which are a much better fit for representing monetary amounts.
* 2462378463874687 is close to the limit for exact representation: it needs 52 significant bits, and a 64-bit float stores every integer exactly only up to 2^53.
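The claim that only the presentation differs can be checked outside the database. A small sketch in Python, whose float is the same 64-bit binary format as SQL Server's FLOAT (that equivalence, and the 14-decimal display format, are the assumptions here):

```python
value = 2462378463874687

# Store the integer in a 64-bit float and read it back.
as_float = float(value)
round_trips = int(as_float) == value        # True: nothing was lost

# The number needs 52 significant bits, below the 53-bit limit of a double.
bits_needed = value.bit_length()            # 52
within_exact_range = value < 2**53          # True

# Showing 15 significant digits in scientific notation only changes the display.
shown = format(as_float, ".14E")            # '2.46237846387469E+15'

# Past 2**53 the round trip starts to fail, which is where float really loses digits.
too_big = 2**53 + 1
lossy = int(float(too_big)) != too_big      # True

print(round_trips, bits_needed, shown, lossy)
```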

data conversion issue

I have a table in which one of the fields has the Real data type. I need to show the values in a decimal format like #.###, so I'm converting the real values to decimal. But for some values the conversion does not produce the actual value. For example, 20.05 is the actual value; if I multiply it by 100 and then cast it to decimal(9,4), it returns 2004.9999.
select cast(cast(20.05 as real)*100.00 as decimal(9,4))
Why does it return this?
Real and Float are not exact types...
Even if you see the value as "20.05", and even if you type it in like this, the stored value will differ by a tiny amount.
Your value 2004.9999 (or something similar, like 2005.00001) is due to the internal representation of this type.
If you do the conversion to decimal first, it should work as expected:
select cast(cast(20.05 as real) as decimal(9,4))*100.00
But you should really think about where and why you use floating point numbers...
UPDATE: Format-function
With SQL Server 2012+ you can use the FORMAT() function:
SELECT FORMAT(CAST(20.05 AS REAL)*100,'###.0.000')
This allows you to specify the format, and you get text back.
This is fine for presentation output (lists, reports), but less suitable if you want to continue calculating with the result.
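To see where 2004.9999 comes from, here is a sketch in Python that emulates a 32-bit REAL with the struct module. The helper as_real is hypothetical, and the multiplication runs in 64-bit arithmetic rather than SQL Server's own, so this illustrates the mechanism without reproducing the server's exact computation.

```python
import struct
from decimal import Decimal

FOUR_PLACES = Decimal("0.0001")


def as_real(x):
    """Round a Python float to the nearest 32-bit float, like a REAL column would."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


stored = as_real(20.05)

# The exact stored value sits slightly below 20.05.
print(Decimal(stored))

# Multiply first, then convert to 4 decimal places: the tiny error is scaled by 100.
multiply_first = Decimal(stored * 100).quantize(FOUR_PLACES)

# Convert to 4 decimal places first, then multiply: the error is rounded away early.
convert_first = Decimal(stored).quantize(FOUR_PLACES) * 100

print(multiply_first)   # 2004.9999
print(convert_first)    # 2005.0000
```

The second ordering corresponds to casting to decimal before multiplying, as in the query above.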

Parsing a large value that includes 3 smaller values in scientific notation

I'm using VB.Net 2013 and really could use some help. Perhaps I have been staring at it too long. I am presented with a value from a variable. The specific value is this
3.190E+01+3.366E+01+8.036E+00
The value is actually 3 smaller values in scientific notation as follows
3.190E+01
3.366E+01
8.036E+00
I need to get the individual values into individual variables. Once I have the individual values I need to calculate the notation of each value so 3.190E+01 is equivalent to 3.190*10^1 and 8.036E+00 is equivalent to 8.036*10^0. I can probably figure out the last part of this question if I can just get the individual values. The caveat is that the numbers will vary in size and the scientific notation part will not always be the same. I do believe it will always be E+XX though so possible to use some regex stuff that I don't fully understand.
Thank you, I look forward to your help and it is very much appreciated.
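One possible approach, sketched in Python rather than VB.Net: match each number with a regular expression that treats a + or - directly after E as part of the exponent, then let the standard float parser handle the notation. The pattern and the function name split_values are illustrative assumptions; the same pattern syntax should carry over to .NET's Regex class, but that is untested here.

```python
import re

raw = "3.190E+01+3.366E+01+8.036E+00"

# Optional sign, digits, optional fraction, then E with a signed exponent.
# Requiring the sign after E keeps the separating "+" from being read as an exponent sign.
PATTERN = re.compile(r"[+-]?\d+(?:\.\d+)?E[+-]\d+")


def split_values(text):
    """Return each scientific-notation number found in text as a float."""
    return [float(token) for token in PATTERN.findall(text)]


values = split_values(raw)
print(values)  # [31.9, 33.66, 8.036]
```

The float parser already evaluates 3.190E+01 as 3.190*10^1, so no separate exponent calculation is needed once the pieces are separated.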

SQL ROUND() function with truncate takes 119.1 and returns 119.09

I have data for pounds and pence stored within concatenated strings (unfortunately no way around this) but can not guarantee 2 decimal places.
E.g. I may get a value of 119.109, which must be translated to 2 decimal places with truncation, i.e. 119.10, NOT 119.11.
For this reason I am avoiding "CAST as Decimal" because I do not want to round. Instead I am using ROUND(amount, 2, 1) to force truncation at 2 decimal places.
This works for the most part but sometimes exhibits strange behaviour. For example, 119.10 outputs as 119.09. This can be replicated as:
ROUND(CAST('119.10' AS varchar),2,1)
My target field is Decimal(19,4) (but the 3rd and 4th decimal places will always be 0, it is a finance system so always pounds and pence...).
I assume the problem has something to do with ROUNDing a varchar... but I don't know any way around this without having to CAST, which would introduce rounding that way.
What is happening here?
Any ideas greatly appreciated.
This is due to the way floating point numbers work, and the fact that your string number is implicitly converted to a floating point number before being rounded. In your test case:
ROUND(CAST('119.10' AS varchar),2,1)
You are implicitly converting 119.10 to float so that it can be passed to the ROUND function. 119.10 cannot be stored exactly as a float, which you can verify by running the following:
SELECT CAST(CONVERT(FLOAT, '119.10') AS DECIMAL(30, 20))
Which returns:
119.09999999999999000000
Therefore, when you round this with truncate you get 119.09.
For what it is worth, you should always specify a length when converting to, or declaring, a varchar.
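The difference between truncating a binary float and truncating a decimal value can be sketched in Python with the decimal module. The function name truncate_pence is hypothetical; the point is only that parsing the string straight into a decimal type, never passing through float, avoids the 119.09 result.

```python
from decimal import Decimal, ROUND_DOWN

TWO_PLACES = Decimal("0.01")


def truncate_pence(amount_text):
    """Parse a string amount as an exact decimal and truncate it to 2 places."""
    return Decimal(amount_text).quantize(TWO_PLACES, rounding=ROUND_DOWN)


# Going through a binary float first reproduces the problem:
# the float nearest to 119.10 is slightly below it, so truncation drops to 119.09.
via_float = Decimal(float("119.10")).quantize(TWO_PLACES, rounding=ROUND_DOWN)

print(via_float)                  # 119.09
print(truncate_pence("119.10"))   # 119.10
print(truncate_pence("119.109"))  # 119.10
```

In T-SQL the analogous idea would presumably be to convert the string to a DECIMAL with enough scale before calling ROUND with truncation, so that the implicit conversion to float never happens.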