Merge two tables and chain id fields in order - sql

I'm looking for a way to merge two tables (or more) and modify/order their numeric id. To put it simply here is what I want to do schematically :
Table example 1:
Id Field
4 x
1 x
5 x
3 x
2 x
Table example 2:
Id Field
1 x
3 x
5 x
2 x
4 x
Expected result (modify table 1 as 1-2-3-4-5 and table 2 as 6-7-8-9-10, THEN order both by id ascending):
Id Field
1 x
2 x
3 x
4 x
5 x
6 x
7 x
8 x
9 x
10 x
I was aiming for a UNION of the tables nested inside a SELECT ROW_NUMBER() OVER (ORDER BY id), but I don't really know how to renumber table 2 as 6-7-8-9-10 beforehand.

Try using this example:
SELECT id, Field FROM t1
UNION ALL
SELECT (SELECT MAX(id) FROM t1) + ROW_NUMBER() OVER (ORDER BY id) AS id, Field
FROM t2
ORDER BY id
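The query above relies on t1 already holding the ids 1 to 5. As a hedged variant, here is a minimal sketch that renumbers both tables in one pass by tagging each row with its source table; it is run through Python's sqlite3 module only so that it is executable. The `src` and `new_id` column names and the distinct Field values (`a1`, `b1`, ...) are assumptions made for illustration, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3


def chained_ids(conn):
    """Number the rows of t1 first, then the rows of t2, as one sequence."""
    query = """
        SELECT ROW_NUMBER() OVER (ORDER BY src, id) AS new_id, Field
        FROM (
            SELECT 1 AS src, id, Field FROM t1
            UNION ALL
            SELECT 2 AS src, id, Field FROM t2
        ) AS both_tables
        ORDER BY new_id
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, Field TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER, Field TEXT)")
# Same id order as the question; Field values are made distinct (assumption)
# so the origin of each output row is visible.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(4, "a4"), (1, "a1"), (5, "a5"), (3, "a3"), (2, "a2")],
)
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?)",
    [(1, "b1"), (3, "b3"), (5, "b5"), (2, "b2"), (4, "b4")],
)

rows = chained_ids(conn)
for new_id, field in rows:
    print(new_id, field)
```

Ordering by `src` before `id` is what keeps every t1 row ahead of every t2 row, so the result does not depend on the ids already stored in t1.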

Related

SAS - PROC SQL- Assign a value to a column based on condition based on another columns

I want to assign a value in new_col based on value in column 'ind' when months = 1;
idnum1 months ind new_col
1 1 X X
1 2 X X
1 3 Y X
1 4 Y X
1 5 X X
2 1 Y Y
2 2 Y Y
2 3 X Y
2 4 X Y
2 5 X Y
The query below only assigns the value in the rows where months = 1, but I want that value in every row of new_col for each id:
create table tmp as
select t1.*,
case when months = 1 then ind end as new_col
from table t1;
I am trying to do it in SAS using proc sql;
Ideally you would use RETAIN within a data step:
data want;
set have;
retain new_var;
if months = 1 then new_var = ind;
run;
SQL isn't as good with this as a data step.
But assuming your variable ID is repeated then this would work. If it's not then you really do need the data step approach.
proc sql;
create table want as
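/* take ind from the months = 1 row only, then remerge it onto every row of the group */
select *, max(case when months = 1 then ind end) as new_col
from have
group by ID;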
quit;
EDIT: If you want to retain the first value per ID, just use FIRST.ID instead of the month test.
data want;
set have;
by ID;
retain new_var;
if first.id then new_var = ind;
run;
A more robust PROC SQL statement that copes with a repeated first month: when several rows share the first month, it picks the lowest ind and distributes it to the whole group.
data have; input
idnum1 months ind $ new_col $; datalines;
1 1 X X
1 2 X X
1 3 Y X
1 4 Y X
1 5 X X
2 1 Y Y
2 2 Y Y
2 3 X Y
2 4 X Y
2 5 X Y
3 1 Z .
3 1 Y .
3 1 X .
3 2 A .
;
proc sql;
create table want as
select
have.idnum1, months, ind, new_col, lowest_first_ind
from
have
join
( select idnum1, min(ind) as lowest_first_ind from
(
select idnum1, ind
from have
group by idnum1
having months = min(months)
)
group by idnum1
) value_seeker
on
have.idnum1 = value_seeker.idnum1
;
quit;
You can use a window function, if the query runs in a database that supports them (native PROC SQL does not):
select t1.*,
max(case when months = 1 then ind end) over (partition by idnum1) as new_col
from t1;
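To make the window-function idea concrete, here is a minimal sketch run against the question's sample rows with Python's sqlite3 module. SQLite stands in for whichever database would execute the query (an assumption, since the question targets SAS), and the table name `have` is borrowed from the other answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (idnum1 INTEGER, months INTEGER, ind TEXT)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?, ?)",
    [
        (1, 1, "X"), (1, 2, "X"), (1, 3, "Y"), (1, 4, "Y"), (1, 5, "X"),
        (2, 1, "Y"), (2, 2, "Y"), (2, 3, "X"), (2, 4, "X"), (2, 5, "X"),
    ],
)

# The CASE yields ind only on the months = 1 row and NULL elsewhere;
# MAX over the partition ignores NULLs and spreads that value to every row.
rows = conn.execute(
    """
    SELECT idnum1, months, ind,
           MAX(CASE WHEN months = 1 THEN ind END)
               OVER (PARTITION BY idnum1) AS new_col
    FROM have
    ORDER BY idnum1, months
    """
).fetchall()

for row in rows:
    print(row)
```

If an id had several months = 1 rows, MAX would pick the alphabetically greatest ind among them, so the tie rule is worth deciding explicitly.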
If there is only one months = 1 observation per BY group, then a simple join is enough.
create table WANT as
select t1.*,t2.ind as new_col
from table t1
left join (select idnum1,ind from table where months=1) t2
on t1.idnum1 = t2.idnum1
;

how to select one row from several rows with minimum value

This question is based on "SQL query to select distinct row with minimum value".
Consider the table:
id game point
1 x 1
1 y 10
1 z 1
2 x 2
2 y 5
2 z 8
Using suggested answers from mentioned question (select the ids that have the minimum value in the point column, grouped by game) we obtain
id game point
1 x 1
1 z 1
2 x 2
The question is how to obtain a single output row for each id. Both outputs
id game point
1 x 1
2 x 2
and
id game point
1 z 1
2 x 2
are acceptable.
Use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by point asc) as seqnum
from t
) t
where seqnum = 1;
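As a minimal sketch of the row_number() approach on the question's data, the following runs the query through Python's sqlite3 module. The extra `game` tie-breaker in the ORDER BY is an addition (an assumption, not part of the answer above) that makes the choice between tied rows deterministic; the question accepts either tied row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, game TEXT, point INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "x", 1), (1, "y", 10), (1, "z", 1), (2, "x", 2), (2, "y", 5), (2, "z", 8)],
)

# Rank rows inside each id by point (ties broken by game), keep rank 1 only.
rows = conn.execute(
    """
    SELECT id, game, point
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY point, game) AS seqnum
        FROM t
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY id
    """
).fetchall()

print(rows)
```

Without the tie-breaker, the database is free to return either `x` or `z` for id 1.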
Assuming the point values are distinct within each id (so that each id has exactly one row holding its minimum), a subquery and an inner join on two conditions give the result you are waiting for. If that assumption does not hold, tied rows will all be returned.
SELECT yt.*
FROM Yourtable yt INNER JOIN
(
SELECT ID, MIN(point) MinPoint
FROM Yourtable
GROUP BY ID
) t ON yt.ID = t.ID AND yt.point = t.MinPoint

SQL query for counting sets of values

Added more information to clear up some confusions. Thanks.
I am trying to group sets of values in SQL. I have the first table below and am trying to get the results shown in the second table. I have explored grouping sets in SQL 2008, cubes, and basic GROUP BY clauses, but I am not able to figure out the query. You can change the format of the result table if you want; the basic idea is to count identical sets of values. In this table the set a,b,c exists 2 times so its count is 2, x,y exists 3 times so its count is 3, and x,y,z exists 1 time so its count is 1. Please help.
UserId ProductId
1 a
1 b
1 c
2 x
2 y
3 x
3 y
4 x
4 y
5 a
5 b
5 c
6 x
6 y
6 z
ProductId Count
a 2
b 2
c 2
x 3
y 3
x 1
y 1
z 1
SELECT COUNT(`ProductId`), `ProductId` FROM TABLE_NAME WHERE 1 GROUP BY `ProductId` ORDER BY `ProductId` ASC
SELECT ProductId, COUNT(UserId) AS NbrOfUsers
FROM TABLE_NAME
GROUP BY ProductId
You're selecting ProductId & the count of how many UserId exist for that ProductId.
GROUP BY ProductId will group your counted UserId based on ProductId and also display the count as NbrOfUsers.
With the sample data, the output will look like this (it counts users per product, not identical sets of products):
ProductId NbrOfUsers
a 2
b 2
c 2
x 4
y 4
z 1
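Counting identical sets, as the question asks, needs one more step: collapse each user's products into a single set, then count how many users share each set. The sketch below does the collapsing in Python on top of a plain SELECT rather than in SQL, so it does not rely on any vendor-specific string aggregation. The table name `purchases` is hypothetical.

```python
import sqlite3
from collections import Counter, defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (UserId INTEGER, ProductId TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        (1, "a"), (1, "b"), (1, "c"),
        (2, "x"), (2, "y"),
        (3, "x"), (3, "y"),
        (4, "x"), (4, "y"),
        (5, "a"), (5, "b"), (5, "c"),
        (6, "x"), (6, "y"), (6, "z"),
    ],
)

# Step 1: build the set of products owned by each user.
products_by_user = defaultdict(set)
for user_id, product_id in conn.execute("SELECT UserId, ProductId FROM purchases"):
    products_by_user[user_id].add(product_id)

# Step 2: count how many users share exactly the same set.
set_counts = Counter(frozenset(products) for products in products_by_user.values())

for products, count in set_counts.items():
    print(sorted(products), count)
```

Using a frozenset as the key makes the comparison independent of the order in which a user's products were stored.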

SQL query: same rows

I'm having trouble finding the right sql query. I want to select all the rows with a unique x value and if there are rows with the same x value, then I want to select the row with the greatest y value. As an example I've put a part of my database below.
ID x y
1 2 3
2 1 5
3 4 6
4 4 7
5 2 6
The selected rows should then be those with ID 2, 4 and 5.
This is what I've got so far
SELECT *
FROM base
WHERE x IN
(
SELECT x
FROM base
GROUP BY x
HAVING COUNT(*) > 1
)
But this only results in the rows that occur more than once. I've added the tags R, postgresql and sqldf because I'm working in R with those packages.
Here is a typical way to formulate the query in ANSI SQL:
select b.*
from base b
where not exists (select 1
from base b2
where b2.x = b.x and
b2.y > b.y
);
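As a quick check of the NOT EXISTS formulation on the question's sample rows, here is a minimal sketch using Python's sqlite3 module; SQLite is an assumption standing in for whichever backend sqldf is configured to use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (ID INTEGER, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO base VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 1, 5), (3, 4, 6), (4, 4, 7), (5, 2, 6)],
)

# Keep a row only if no other row with the same x has a larger y.
ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT b.ID
        FROM base AS b
        WHERE NOT EXISTS (SELECT 1
                          FROM base AS b2
                          WHERE b2.x = b.x AND b2.y > b.y)
        ORDER BY b.ID
        """
    )
]

print(ids)
```

Rows that tie on the greatest y for the same x would all be returned by this form.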
In Postgres, you would use distinct on for performance:
select distinct on (x) b.*
from base b
order by x, y desc;
You could try this query:
select x, max(y) from base group by x;
And, if you'd also like the id column in the result:
select base.*
from base join (select x, max(y) as max_y from base group by x) as maxima
on (base.x = maxima.x and base.y = maxima.max_y);
Example:
CREATE TABLE tmp(id int, x int ,y int);
INSERT INTO .....
test=# SELECT x, max(y) AS y FROM tmp GROUP BY x;
x | y
---+---
4 | 7
1 | 5
2 | 6

PLSQL or SSRS, How to select having all values in a group?

I have a table like this.
ID NAME VALUE
______________
1 A X
2 A Y
3 A Z
4 B X
5 B Y
6 C X
7 C Z
8 D Z
9 E X
And the query:
SELECT * FROM TABLE1 T WHERE T.VALUE IN ('X','Z')
This query gives me
ID NAME VALUE
______________
1 A X
3 A Z
4 B X
6 C X
7 C Z
8 D Z
9 E X
But I want to see all rows of the names that have all the params. Only A and C have both X and Z values, so my desired result is:
ID NAME VALUE
______________
1 A X
2 A Y
3 A Z
6 C X
7 C Z
How can I get the desired result? It does not matter whether it is done in SQL or in Reporting Services. Maybe a "GROUP BY ..... HAVING" clause will help, but I'm not sure.
By the way, I don't know how many params will be in the list.
I really appreciate any help.
The standard approach would be something like
SELECT id, name, value
FROM table1 a
WHERE name IN (SELECT name
FROM table1 b
WHERE b.value IN ('X', 'Z')
GROUP BY name
HAVING COUNT(distinct value) = 2)
That would require that you determine how many values are in the list so that you can use a 2 in the HAVING clause if there are 2 elements, a 5 if there are 5 elements, etc. You could also use analytic functions
SELECT id, name, value
FROM (SELECT id,
name,
value,
-- count only the wanted values, but keep every row of the name
count(distinct case when value in ('X', 'Z') then value end)
over (partition by name) cnt
FROM table1 t1)
WHERE cnt = 2
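Since the number of params is not known in advance, the count used in the HAVING clause can be derived from the list itself. Below is a minimal sketch of the IN ... GROUP BY ... HAVING form with bound parameters, run through Python's sqlite3 module rather than Oracle or SSRS (an assumption made so that it is runnable). The function name `rows_for_names_with_all` is hypothetical.

```python
import sqlite3


def rows_for_names_with_all(conn, wanted):
    """Return every row of the names that own all of the wanted values."""
    wanted = sorted(set(wanted))  # duplicates must not inflate the count
    placeholders = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT id, name, value
        FROM table1
        WHERE name IN (SELECT name
                       FROM table1
                       WHERE value IN ({placeholders})
                       GROUP BY name
                       HAVING COUNT(DISTINCT value) = ?)
        ORDER BY id
    """
    # Only "?" markers are interpolated; the values themselves are bound.
    return conn.execute(query, [*wanted, len(wanted)]).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "A", "X"), (2, "A", "Y"), (3, "A", "Z"),
        (4, "B", "X"), (5, "B", "Y"),
        (6, "C", "X"), (7, "C", "Z"),
        (8, "D", "Z"), (9, "E", "X"),
    ],
)

result = rows_for_names_with_all(conn, ["X", "Z"])
for row in result:
    print(row)
```

Passing `len(wanted)` as the last bound parameter is what replaces the hard-coded 2 in the HAVING clause.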
I prefer to structure these "sets within sets" queries as an aggregation. I find this is the most flexible approach:
select t.*
from t
where t.name in (select name
from t
group by name
having sum(case when value = 'X' then 1 else 0 end) > 0 and
sum(case when value = 'Y' then 1 else 0 end) > 0
)
The subquery for the IN finds all names that have at least one X value and one Y value. Using the same logic, it is easy to adjust for other conditions (X and Y and Z; X and Y but not Z; and so on). The outer query just returns all the rows instead of the names.