How can I avoid this error in SAS? - sql

When trying to merge datasets in SAS I continuously get the following error for a number of variables:
Column 115 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second
I've been able to get around this error usually by doing the following:
Changing one of the variables to the same "type" of the other. For example, changing variable A to a character type from a numeric type so that it matches the variable in the other dataset thereby allowing the merge to happen.
Importing the datasets that I am trying to merge together as CSV files and then adding the "guessing rows" option in the proc import step. For example:
proc import datafile='xxxxx'
out=fadados
dbms=csv replace;
getnames=yes;
guessingrows=200;
run;
However, even when I import my files as CSVs and use "guessingrows", I sometimes still get the above error. Sometimes there are so many mismatches that converting every variable to the same "type" so that they match between datasets is VERY time consuming and not feasible.
Can anyone advise me on how I can easily AVOID this error? Is there another way that people get around this? I get this error so often that I am tired of having to convert every single variable. There must be another way!
******UPDATE*****
Here is an example that everyone is asking for:
proc sql;
title 'MED REC COMBINED';
create table combined_bn_hw as
select * from bndados
outer union corr
select * from hwdados;
quit;
And here is the output I get in the log:
21019 proc sql;
21020 title 'MED REC COMBINED';
21021 create table combined_bn_hw as
21022 select * from bndados
21023 outer union corr
21024 select * from hwdados;
ERROR: Column 115 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second.
ERROR: Column 120 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second.
ERROR: Column 173 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second.
ERROR: Numeric expression requires a numeric format.
ERROR: Column 181 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second.
ERROR: Column 185 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second.
ERROR: Column 186 from the first contributor of OUTER UNION is not the same type as its
counterpart from the second.
21025 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds

Don't use PROC IMPORT to guess what types of variables you have in your data. Its decision is going to depend on what values are in the file. Just write a data step to read your CSV files yourself. Then you can control how the variables are defined.
PROC IMPORT has to guess whether your ID variable is numeric or character. Since it bases that guess on what is in the file, it can make different decisions for different sets of data. A common example is when a character variable is totally empty: PROC IMPORT will then think it should be a numeric variable.
You could recall the data step code that PROC IMPORT generates and update it to use consistent data types for your variables. But writing your own is not very hard. You don't have to make as complicated a program as PROC IMPORT generates. Just include an INFILE statement, define your variables, attach any required informats (for date values, for example), and then use a simple INPUT statement.
data want;
infile 'myfile.csv' dsd firstobs=2 truncover;
length var1 $20 var2 8 ... varlast 8 ;
informat var2 yymmdd10.;
format var2 yymmdd10.;
input var1 -- varlast;
run;
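To make sure both files end up with identical variable types, the same data step can be wrapped in a macro and called once per file. This is only a sketch: the macro name read_dados, the file names, and the three variables (patient_id, visit_dt, dose) are made up for illustration and would need to be replaced with the real layout.

```sas
/* Hypothetical macro: reads any CSV that shares one layout, always using
   the same variable definitions, so the resulting datasets have matching types. */
%macro read_dados(file=, out=);
  data &out;
    infile "&file" dsd firstobs=2 truncover;
    /* Assumed layout (not from the original post): */
    length patient_id $20 visit_dt 8 dose 8;
    informat visit_dt yymmdd10.;
    format visit_dt yymmdd10.;
    input patient_id visit_dt dose;
  run;
%mend read_dados;

/* Assumed file names; both datasets now share types by construction */
%read_dados(file=bndados.csv, out=bndados)
%read_dados(file=hwdados.csv, out=hwdados)
```

Because both datasets come from the same LENGTH and INFORMAT statements, an OUTER UNION CORR of the two should no longer hit type mismatches.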

Without an example it is difficult to test. Did you try the FORCE option on PROC APPEND?
Example:
proc append base=base data=one force; run;
proc append base=base data=two force; run;
proc append base=base data=e04 force; run;
Source:
http://www.sascommunity.org/wiki/PROC_APPEND_Alternatives
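Before choosing a fix, it may help to list exactly which variables differ in type between the two datasets. A minimal sketch, assuming the datasets from the question (bndados and hwdados) live in the WORK library, is to compare their entries in DICTIONARY.COLUMNS. Also keep in mind that, as far as I know, FORCE does not convert types: a variable whose type differs from the BASE= dataset is appended as missing.

```sas
proc sql;
  /* List variables that exist in both datasets but with different types.
     LIBNAME and MEMNAME values are stored in upper case in DICTIONARY.COLUMNS. */
  select a.name,
         a.type as type_bn,
         b.type as type_hw
  from dictionary.columns as a
       inner join dictionary.columns as b
       on upcase(a.name) = upcase(b.name)
  where a.libname = 'WORK' and a.memname = 'BNDADOS'
    and b.libname = 'WORK' and b.memname = 'HWDADOS'
    and a.type ne b.type;
quit;
```

The result is a short list of the variables that need attention, which is usually easier to work with than the column positions reported in the log.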

Related

SAS Proc SQL string truncation issue

I create a simple table from values in another table, as shown below:
create table summary3 as
select
substr(&Start_dt.,1,4) as time_range,
NFDPs ,
NFDPExceeds,
NblkExceeds,
NFDPExceedsLT30s as NFDPExceedsLT30,
NReports as Nbr_report ,
prcnt_FDP_ext ,
prcnt_blk_ext ,
prcnt_extLT30 as prcnt_ext_LT30,
prcnt_report,
monotonic() as id
from OAP_exceedances_by_year;
My problem arises with the very first column I created, time_range. When I try adding values to this table later on, I notice that this column is capped at a character length of 4, and anything longer is automatically truncated. Is there a way I can change either that first statement or my future insert / set statements to avoid the truncation? That is, I still want the first row to be only 4 characters, but future rows may need more.
Thanks!
This depends on how you do your future processing. If your data step later on says
data summary_final;
set summary3;
time_range = "ABCDEF";
run;
Then you could just change it like so:
data summary_final;
length time_Range $6;
set summary3;
time_range = "ABCDEF";
run;
But you certainly could do what you say also in the initial pull. For example...
proc sql;
create table namestr as
select substr(name,1,4) as namestr length=8
from sashelp.class;
quit;
That creates namestr as length=8 even though it has substr(1,4) in it; the names there will be truncated, as the substr asks it to, but future names will be allowed to be 8 long.
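If the table already exists and you would rather not recreate it, another option is to widen the column in place with ALTER TABLE. This is a sketch that assumes summary3 is in the WORK library; the new length of 10 is an arbitrary choice for illustration.

```sas
proc sql;
  /* Widen the existing character column; values already stored are kept */
  alter table summary3
    modify time_range char(10);
quit;
```

After this, later INSERT or SET steps can store values up to 10 characters long in time_range.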

Commas using SAS and TD SQL

I am using SAS to pull data in a Teradata environment. I am counting the rows in the Teradata table, but want the output to be in a comma format (i.e. 1,000,000). I was able to use the code below to display the value as a comma, but when I try to add the column in SAS, I can't since the output is in a character format. Does anyone have any suggestions on how to format the number value as comma, so that it can be used for calculation purposes in SAS? Thanks.
CAST(Count(*) as (format 'Z,ZZZ,ZZ9')) as char(10)) as rowCount,
Assuming you're using pass through, pull it in as numeric and format it on the SAS side. You've now converted it to character (char10) and SAS doesn't do math on character variables which makes logical sense.
select rowCount format=comma12. from con
(select
count(*) as rowCount ....
)
If you have a select * you can always format it later in a data step or via PROC DATASETS. SAS separates the display and storage layers so the format controls the appearance but the underlying data still remains numeric.
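The PROC DATASETS route mentioned above could look like the following sketch. The dataset name rowcounts and its single value are invented so the example runs on its own; in practice the dataset would come from the pass-through query.

```sas
/* Stand-in for the table pulled from Teradata (hypothetical data) */
data rowcounts;
  rowCount = 1000000;
run;

/* Attach a display format without rewriting the data */
proc datasets library=work nolist;
  modify rowcounts;
  format rowCount comma12.;
quit;

/* The variable is still numeric, so arithmetic works as usual */
data check;
  set rowcounts;
  doubled = rowCount * 2;
run;
```

Only the stored format attribute changes here, so the variable stays available for calculations.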

Rename row variables names in SAS after Proc FREQ function

I do this to get a TABLE like below
PROC FREQ data=projet.matchs;
TABLES circuit/ NOCUM;
run;
Circuit Fréquence Pourcentage
ATP 127 50.00
WTA 127 50.00
I need exactly the same except that I want "Male" instead of "ATP" and "Female" instead of "WTA".
So I guess it is a renaming function, but I don't know how to use it.
Thanks for the help
Note those are not "row variable names". They are the actual (or formatted) values of your variable CIRCUIT.
Looks like you want to create a custom format to change how the values in your variable are displayed.
proc format ;
value $gender 'ATP'='Male' 'WTA'='Female';
run;
Then tell the proc to use that format for your variable.
PROC FREQ data=projet.matchs;
TABLES circuit/ NOCUM;
format circuit $gender. ;
run;

SAS Character Values Exist in Numeric Type

I have a dataset at work with a variable that is numeric when I do a PROC CONTENTS. However, when I look at the actual underlying data, there are letter values in the variable, like 'R', 'A', etc.
Was wondering if anyone has an explanation for how/why SAS allows this kind of type assignment?
It doesn't, except if:
1) You have a format applied to the variable that is displaying it as a character variable. The display appears as a character, however the underlying variable is numeric.
proc format ;
value age
0 - 10='young'
11 - 12='preteen'
13 - 19='teen'
;
run;
proc print data=sashelp.class;
format age age.;
run;
2) If it's actually .R/.A, these are special missing values.
My guess is that you have a format applied to the data.
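To tell the two cases apart, one approach is to print the variable with all formats removed. The sketch below builds a small made-up dataset (demo, with a numeric variable score) that contains special missing values, then prints it without formats.

```sas
data demo;
  missing R A;   /* lets R and A in the raw data be read as special missing values */
  input score;
  datalines;
10
R
A
;
run;

proc print data=demo;
  format _all_;  /* strip any attached formats so the stored values are shown */
run;

proc freq data=demo;
  tables score / missing;  /* include missing values as their own categories */
run;
```

If the letters still show up with the formats stripped, they are special missing values rather than the result of a format.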

Netezza Formatting Functions

I've imported a somewhat large set of data, with, at times, an odd number format (e.g., 12,345.01- and 1,945.001-), and I am trying to 'fix' it.
The data was imported as VARCHAR(20)
My solution:
to_number(BadNumCol, 'S999G999G999D999')
input: 10426.95 ;261.000 ;33.93-
outputs:42695.00 ;261.000 ; 3.93
the output is NUMERIC(12,3)
desired output: 10426.95 ; 261.000 ; -33.93
What's going on here? What am I missing/not understanding in my ignorance?
And, how do I fix these ~400 Million data elements?
There are two issues that I see here.
In the sample input data from the "solution" section of your question, none of the values has group separators, so the group separators you specify in your TO_NUMBER function do not match the data.
Also, you are specifying a sign character anchored to the beginning of the string, when your data has a trailing minus instead.
The most appropriate conversion format string I can infer from your sample input data is: '999999999D999MI'
select * from num_test;
BADNUMCOL
-----------
261.000
10426.95
33.93-
(3 rows)
select to_number(BadNumCol,'999999999D999MI')::NUMERIC(12,3) GoodNumCol from num_test;
GOODNUMCOL
------------
10426.950
-33.930
261.000
(3 rows)