how to union 2 tables based on specific columns in sas - sql

I want to union two tables with the following query:
proc sql;
select * from Table1
outer union corr
select * from table2;
But get the error:
ERROR: The type of column EntryId from the left hand side of the OUTER UNION set operation is
different from EntryId on the right hand side
If I understand this correctly, and based on "UNION ALL two SELECTs with different column types - expected behaviour?", the first column is defined differently in the two tables, so the union cannot proceed (which is true):
RecordID num label='RecordID' format=20. informat=20.
and
RecordID num label='RecordID' format=11. informat=11.
However, there is a column I want to use that has the same type and format in both tables:
Pseu char(64) label='Pseu' format=$64. informat=$64.
Pseu char(64) label='Pseu' format=$64. informat=$64.
and in each table they are columns 3 and 4.
Is there a way to union these table together using that column as the reference, as opposed to the original?
I tried to no avail:
proc sql;
select * from Table1
outer union corr
select * from table2
on Table1.Pseu=Table2.Pseu;
ERROR: Found "on" when expecting ;
This follows the OUTER UNION CORRESPONDING example given at http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473694.htm. Here is what I want, based on that example:
table1
R y p
1 A 100
2 B 101
3 R 102
table2
R z p
4 A 102
5 R 103
6 T 104
MERGED
p    R  y  R  z
100  1  A
101  2  B
102  3  R  4  A
103        5  R
104        6  T

Something like this perhaps:
proc sql;
select * from Table1
outer union corr
select p, r as r2, z from table2;
This gives the column r from table2 an alias (r2), so it no longer collides with r from Table1.
Using a regular UNION:
select p, r, y, null, null from Table1
union all
select p, null, null, r as r2, z from table2
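A minimal sketch of the NULL-padding idea, run through Python's sqlite3 module. SQLite is only a stand-in for SAS PROC SQL here (it has no OUTER UNION CORR), and the lowercase table and column names are taken from the sample data in the question.

```python
import sqlite3

# Sample rows from the question; SQLite is an assumed stand-in for SAS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (r INTEGER, y TEXT, p INTEGER);
    CREATE TABLE table2 (r INTEGER, z TEXT, p INTEGER);
    INSERT INTO table1 VALUES (1, 'A', 100), (2, 'B', 101), (3, 'R', 102);
    INSERT INTO table2 VALUES (4, 'A', 102), (5, 'R', 103), (6, 'T', 104);
""")

# Pad the columns a side does not have with NULL so both SELECTs line up.
stacked = conn.execute("""
    SELECT p, r, y, NULL AS r2, NULL AS z FROM table1
    UNION ALL
    SELECT p, NULL, NULL, r, z FROM table2
    ORDER BY p, r2
""").fetchall()

for row in stacked:
    print(row)
```

Note that stacking the tables this way yields two rows for p = 102, one from each table. Collapsing them into a single row, as in the MERGED table of the question, would need a join on p rather than a union.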

The answers from my own search and from jarlh were both correct.
The issues arose from the size and number of columns in the data sets being unioned. I had to make sure that no column name was repeated across the two data sets (among my 600 total columns, some had similar names), so I had to rename columns.

Related

SQL: select rows from a certain table based on conditions in this and another table

I have two tables that share IDs in a PostgreSQL database.
I would like to select certain rows from table A, based on condition Y (in table A) AND on condition Z in a different table (B).
For example:
Table A Table B
ID | type ID | date
0 E 1 01.01.2022
1 F 2 01.01.2022
2 E 3 01.01.2010
3 F
IDs MUST be unique: the same ID can appear only once in each table, and if the same ID is in both tables, both rows refer to the same object.
Using an SQL query, I would like to find all cases where:
1 - the same ID exists in both tables
2 - type is F
3 - date is after 31.12.2021
And again, only rows from table A will be returned.
So the only returned row should be:1 F
It is a bit hard to understand what problem you are actually facing, as this is very basic SQL.
Use EXISTS:
select *
from a
where type = 'F'
and exists (select null from b where b.id = a.id and dt >= date '2022-01-01');
Or IN:
select *
from a
where type = 'F'
and id in (select id from b where dt >= date '2022-01-01');
Or, as the IDs are unique in both tables, join:
select a.*
from a
join b on b.id = a.id
where a.type = 'F'
and b.dt >= date '2022-01-01';
My favorite here is the IN clause, because you want to select data from table A where conditions are met. So no join needed, just a where clause, and IN is easier to read than EXISTS.
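As a quick check that the three forms above agree, here is a small sketch using Python's sqlite3 module in place of PostgreSQL. The column name dt and the ISO-formatted text dates are assumptions made for the example; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, dt TEXT);  -- ISO dates stored as text
    INSERT INTO a VALUES (0, 'E'), (1, 'F'), (2, 'E'), (3, 'F');
    INSERT INTO b VALUES (1, '2022-01-01'), (2, '2022-01-01'), (3, '2010-01-01');
""")

queries = {
    "exists": """
        SELECT * FROM a
        WHERE type = 'F'
          AND EXISTS (SELECT NULL FROM b WHERE b.id = a.id AND b.dt >= '2022-01-01')
    """,
    "in": """
        SELECT * FROM a
        WHERE type = 'F'
          AND id IN (SELECT id FROM b WHERE dt >= '2022-01-01')
    """,
    "join": """
        SELECT a.* FROM a
        JOIN b ON b.id = a.id
        WHERE a.type = 'F' AND b.dt >= '2022-01-01'
    """,
}

# Run each variant and collect its rows under the variant's name.
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
print(results)
```

All three variants return only the row with ID 1 and type F. The join form relies on the IDs being unique in B; with duplicates it would repeat rows from A.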
SELECT *
FROM A
WHERE type='F'
AND id IN (
SELECT id
FROM B
WHERE DATE>='2022-01-01' -- '2022' imo should be enough, need to check
);
I don't think joining is necessary.

SQL Union and special sum

I'm new to SQL queries, so I hope this question isn't stupid.
I got two tables like this:
Table 1:
Name    Value  Count
global  g      1
domain  x      2
domain  y      1
agg     ba     1
Table 2:
Name    Value  Count
global  g      1
domain  z      1
agg     bb     1
I need to get the following table, which consists of all rows without duplicates, and in which the count of the global row is changed to the sum of the 'domain' rows from the first table only:
Table 3:
Name    Value  Count
global  g      3
domain  x      2
domain  y      1
domain  z      1
agg     ba     1
agg     bb     1
Is this kind of operation possible?
SELECT * FROM table1
WHERE "Name" <> 'global' -- 1
UNION
SELECT -- 2
'global',
'g',
SUM("Count")
FROM table1
WHERE "Name" = 'domain'
UNION
SELECT * FROM table2
WHERE "Name" <> 'global' -- 1
1. Union both tables without the global row.
2. Create a new global row holding the expected sum of the table1 domain records, and union it in as well.
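The query above can be tried out with a short sketch in Python's sqlite3 module (an assumed stand-in for whatever database the question uses). The tables are filled with the sample rows from the question, and the result is sorted in Python because UNION does not guarantee an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 ("Name" TEXT, "Value" TEXT, "Count" INTEGER);
    CREATE TABLE table2 ("Name" TEXT, "Value" TEXT, "Count" INTEGER);
    INSERT INTO table1 VALUES ('global', 'g', 1), ('domain', 'x', 2),
                              ('domain', 'y', 1), ('agg', 'ba', 1);
    INSERT INTO table2 VALUES ('global', 'g', 1), ('domain', 'z', 1), ('agg', 'bb', 1);
""")

rows = conn.execute("""
    SELECT * FROM table1 WHERE "Name" <> 'global'   -- step 1
    UNION
    SELECT 'global', 'g', SUM("Count")              -- step 2
    FROM table1
    WHERE "Name" = 'domain'
    UNION
    SELECT * FROM table2 WHERE "Name" <> 'global'   -- step 1
""").fetchall()

# UNION gives no ordering guarantee, so sort before comparing or printing.
table3 = sorted(rows)
for row in table3:
    print(row)
```

The global row comes out with a count of 3, the sum of the two domain rows in table1, which matches Table 3 in the question.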
Try this out
SELECT x.name, x.total_val, sum(occurence)
FROM (SELECT name, total_val, occurence FROM test union all select name, total_val, occurence from test2) x
group by x.name, x.total_val

how to join two tables with different columns datatypes

I'm using Toad for Oracle.
I have two tables. The old table CUSTOMER has a primary key column custnum,
whose datatype is "number".
I have created a new table TRANSACTION, which also has a column named custnum, but its datatype is "varchar2".
custnum field values in customer table:
14953252
14442752
19321147
74893221
custnum field values in transaction table:
14953252
AR7475552
19321147
JK8932214
P887655532
WX7893534
My query has an error
select t.custnum,t.trascnum,c.custname,c.custaddress
from transaction t, customer c
where
t.custnum=c.custnum(+)
How can I join two tables whose join columns have different datatypes?
You can use LIKE with a LEFT JOIN:
select t.custnum, t.trascnum, c.custname, c.custaddress
from customer c left join
transaction t
on t.custnum like '%' || c.custnum;
Use the CAST function to change the data type of one column to that of the other.
You would want to do something like this:
where transaction.custnum = CAST(customer.custnum (+) AS varchar2(20)) -- varchar2 needs a length; 20 is an assumed size
To me, it looks as if you're looking for a full outer join:
with
  -- sample data; you have that & don't have to type it
  customer (custnum, custname) as
    (select 14953252, 'Scott' from dual union all
     select 14442752, 'Mike' from dual union all
     select 19321147, 'King' from dual
    ),
  transaction (custnum, trascnum) as
    (select '14953252' , 25 from dual union all
     select 'AR7475552', 13 from dual union all
     select '19321147' , 82 from dual
    )
-- query you might be interested in
select c.custnum,
       c.custname,
       t.custnum,
       t.trascnum
from customer c full outer join transaction t
  on c.custnum = to_number(regexp_substr(t.custnum, '\d+$'));

   CUSTNUM CUSTN CUSTNUM     TRASCNUM
---------- ----- --------- ----------
  14953252 Scott 14953252          25
                 AR7475552         13
  19321147 King  19321147          82
  14442752 Mike
In order for the database to compare the two values, either customer.custnum has to be converted to a string, or transaction.custnum has to be converted to a number. Probably the error you got was
ORA-01722: invalid number
as the database tried to treat AR7475552 as a number.
Converting a number to a string is easier:
select c.custnum
, c.custname
, t.custnum
, t.trascnum
from transaction t
left join customer c
on to_char(c.custnum) = t.custnum;
but there are ways to treat a string as a number, such as this (requires Oracle 12.2 or later):
select c.custnum
, c.custname
, t.custnum
, t.trascnum
from transaction t
left join customer c
on c.custnum = to_number(t.custnum default 0 on conversion error);
(I found that using default null on conversion error in a join crashes the session, even though it is valid in a query. If it's possible for c.custnum to contain 0 then you could use a more obscure value such as -1e-9.)
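The number-to-string approach can be sketched with Python's sqlite3 module. This is not Oracle: CAST(... AS TEXT) plays the role of to_char here, the table name txn is a hypothetical stand-in for TRANSACTION, and the customer names are the sample values used in the other answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (custnum INTEGER PRIMARY KEY, custname TEXT);
    -- "txn" is an assumed name standing in for the TRANSACTION table
    CREATE TABLE txn (custnum TEXT, trascnum INTEGER);
    INSERT INTO customer VALUES (14953252, 'Scott'), (14442752, 'Mike'), (19321147, 'King');
    INSERT INTO txn VALUES ('14953252', 25), ('AR7475552', 13), ('19321147', 82);
""")

# Convert the numeric key to text so both sides of the join compare as strings.
rows = conn.execute("""
    SELECT c.custnum, c.custname, t.custnum, t.trascnum
    FROM txn t
    LEFT JOIN customer c ON CAST(c.custnum AS TEXT) = t.custnum
    ORDER BY t.trascnum
""").fetchall()

for row in rows:
    print(row)
```

The transaction with custnum AR7475552 is kept with empty customer columns, because no customer number converts to that string, and no conversion error can occur in this direction.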
This code works for me:
select t.custnum,c.salary
from transactions t left outer join customer c on t.custnum=to_char(c.custnum);

SQL UNION ALL only include newer entries from 'bottom' table

Fair warning: I'm new to using SQL. I do so on an Oracle server either via AQT or with SQL Developer.
As I haven't been able to think or search my way to an answer, I put myself in your able hands...
I'd like to combine data from table A (high quality data) with data from table B (fresh data) such that the entries from B are only included when the date stamp are later than those available from table A.
Both tables include entries from multiple entities, and the latest date stamp varies with those entities.
On the 4th of January, the tables may look something like:
A                               B
entity  date   type  value      entity  date   type  value
X       1.jan  1     1          X       1.jan  1     2
X       1.jan  0     1          X       1.jan  0     2
X       2.jan  1     1          X       2.jan  1     2
Y       1.jan  1     1          X       3.jan  1     1      (new entry)
Y       3.jan  1     1          Y       1.jan  1     2
                                Y       3.jan  1     2
                                Y       4.jan  1     1      (new entry)
I have made an attempt at some code that I hope clarify my need:
WITH
AA AS
(SELECT entity, date, SUM(value)
FROM table_A
GROUP BY
entity,
date),
BB AS
(SELECT entity, date, SUM(value)
FROM table_B
WHERE date > ALL (SELECT date FROM AA)
GROUP BY
entity,
date
)
SELECT * FROM (SELECT * FROM AA UNION ALL SELECT * FROM BB)
Now, if the WHERE date > ALL (SELECT date FROM AA) would work separately for each entity, I think I would have what I need.
That is, for each entity I want all entries from A, and only newer entries from B.
As the data in table A often differs from that of B (values are often corrected), I don't think I can use something like: table A UNION ALL (table B MINUS table A)?
Thanks
Essentially you are looking for entries in BB which do not exist in AA. The condition date > ALL (SELECT date FROM AA) does not take the entity into account, so you will not get the correct records.
An alternative is to use a JOIN and filter out all entries that have a match in AA.
Something like the query below.
WITH
AA AS
(SELECT entity, date, SUM(value) AS value
FROM table_A
GROUP BY
entity,
date),
BB AS
-- keep only table_B rows that have no (entity, date) match in AA
(SELECT b.entity, b.date, SUM(b.value) AS value
FROM table_B b
LEFT OUTER JOIN AA
ON AA.entity = b.entity
AND AA.date = b.date
WHERE AA.date IS NULL
GROUP BY
b.entity,
b.date
)
SELECT * FROM (SELECT * FROM AA UNION ALL SELECT * FROM BB)
I find your question confusing, because I don't know where the aggregation is coming from.
The basic idea on getting newer rows from table_b uses conditions in the where clause, something like this:
select . . .
from table_a a
union all
select . . .
from table_b b
where b.date > (select max(a.date) from table_a a where a.entity = b.entity);
You can, of course, run this on your CTEs, if those are what you really want to combine.
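A small sketch of the per-entity comparison, using Python's sqlite3 module instead of Oracle. The column name dt, the year 2024 and the extra src column are assumptions added for the example; the rows follow the sample tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (entity TEXT, dt TEXT, type INTEGER, value INTEGER);
    CREATE TABLE table_b (entity TEXT, dt TEXT, type INTEGER, value INTEGER);
    INSERT INTO table_a VALUES
        ('X', '2024-01-01', 1, 1), ('X', '2024-01-01', 0, 1), ('X', '2024-01-02', 1, 1),
        ('Y', '2024-01-01', 1, 1), ('Y', '2024-01-03', 1, 1);
    INSERT INTO table_b VALUES
        ('X', '2024-01-01', 1, 2), ('X', '2024-01-01', 0, 2), ('X', '2024-01-02', 1, 2),
        ('X', '2024-01-03', 1, 1), ('Y', '2024-01-01', 1, 2), ('Y', '2024-01-03', 1, 2),
        ('Y', '2024-01-04', 1, 1);
""")

# All rows from A, plus rows from B that are newer than A's latest date
# for the same entity (correlated subquery on entity).
combined = conn.execute("""
    SELECT 'A' AS src, entity, dt, type, value FROM table_a
    UNION ALL
    SELECT 'B', b.entity, b.dt, b.type, b.value FROM table_b b
    WHERE b.dt > (SELECT MAX(a.dt) FROM table_a a WHERE a.entity = b.entity)
    ORDER BY entity, dt, src
""").fetchall()

from_b = [row for row in combined if row[0] == 'B']
print(from_b)
```

Only the two rows marked as new entries in the question survive from B. One caveat: an entity that appears in B but not in A gets a NULL maximum, so the comparison drops its rows; wrapping the subquery in COALESCE with an early date would keep them.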
Use UNION instead of UNION ALL; it removes duplicate records.
SELECT * FROM (
SELECT *
FROM AA
UNION
SELECT *
FROM BB )

SQL "IN" statement for multiple columns

I would like to find the (Name, X) combinations for which X=Y is never true.
Let's assume the following table:
*Name* *X* *Y*
A 2 1
A 2 2 <--- fulfills requirement for Name=A, X=2
A 10 1
A 10 2
B 3 1
B 3 3 <--- fulfills requirement for Name=B, X=3
B 1 1 <--- fulfills requirement for Name=B, X=1
B 1 3
So I would like to return the combination Name=A, X=10 for which X=Y is never true.
This was my approach (which is syntactically incorrect)
SELECT *
FROM TABLE
WHERE NAME
, X NOT IN (SELECT DISTINCT NAME
, X
FROM TABLE
WHERE X=Y)
My problem is the where statement which cannot handle multiple columns. Does anyone know how to do this?
Just put the columns into parentheses
SELECT *
FROM TABLE
WHERE (NAME, X) NOT IN (SELECT NAME, X
FROM TABLE WHERE X=Y);
The above is ANSI standard SQL, but not all DBMSs support this syntax.
A DISTINCT is not necessary in a sub-query for IN or NOT IN.
However, NOT EXISTS with a correlated sub-query is very often faster than a NOT IN condition.
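Both forms can be compared with a short sketch in Python's sqlite3 module. It assumes a SQLite build that supports row values (3.15 or later), a table named t rather than TABLE, and the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (name TEXT, x INTEGER, y INTEGER);
    INSERT INTO t VALUES ('A', 2, 1), ('A', 2, 2), ('A', 10, 1), ('A', 10, 2),
                         ('B', 3, 1), ('B', 3, 3), ('B', 1, 1), ('B', 1, 3);
""")

# Row-value form: compare the (name, x) pair against the pairs where x = y.
row_value = conn.execute("""
    SELECT name, x, y FROM t
    WHERE (name, x) NOT IN (SELECT name, x FROM t WHERE x = y)
    ORDER BY y
""").fetchall()

# Correlated NOT EXISTS form: no row with the same (name, x) has x = y.
not_exists = conn.execute("""
    SELECT name, x, y FROM t AS t1
    WHERE NOT EXISTS (SELECT 1 FROM t AS t2
                      WHERE t2.name = t1.name AND t2.x = t1.x AND t2.x = t2.y)
    ORDER BY y
""").fetchall()

print(row_value)
print(not_exists)
```

Both queries return the two rows for Name=A, X=10. The NOT EXISTS form also behaves predictably when name or x can be NULL, which is a common reason to prefer it over NOT IN.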
I use this on SQL Server
SELECT *
FROM TABLE
WHERE CONCAT(NAME, ';', X)
NOT IN (SELECT CONCAT(NAME, ';', X)
FROM TABLE WHERE X = Y);
I think you can use two conditions to achieve this:
SELECT *
FROM TABLE
WHERE NAME NOT IN(
SELECT a.NAME FROM TABLE a WHERE a.X=a.Y
) AND X NOT IN (
SELECT b.X FROM TABLE b WHERE b.X=b.Y
)
SELECT *
FROM TABLE t1
WHERE NOT EXISTS (SELECT NAME
,X
FROM TABLE t2
WHERE t1.Name=t2.Name
AND t1.X=t2.Y)
This will check if there is such a record