joining table with multiple rows and same column name - sql

table 1
id | name | gender
1 | ABC | M
2 | CDE | M
3 | FGH | M
table 2
id | name | gender
4 | BAC | F
5 | DCE | F
6 | GFH | F
How can I produce output like this in an Oracle database?
id | name | gender
1 | ABC | M
2 | CDE | M
3 | FGH | M
4 | BAC | F
5 | DCE | F
6 | GFH | F

Use UNION [ALL]:
select * from table1
union all
select * from table2;
P.S. If the combined result contains duplicate rows, UNION removes them, whereas UNION ALL keeps every row, duplicates included.
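To make the UNION versus UNION ALL difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The repeated row (3, 'FGH', 'M') in table2 is an assumption added purely for illustration; it is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT, gender TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT, gender TEXT);
    INSERT INTO table1 VALUES (1, 'ABC', 'M'), (2, 'CDE', 'M'), (3, 'FGH', 'M');
    -- (3, 'FGH', 'M') is deliberately repeated here (illustrative assumption)
    INSERT INTO table2 VALUES (3, 'FGH', 'M'), (4, 'BAC', 'F');
""")

# UNION ALL simply appends the second result set to the first.
union_all_rows = conn.execute(
    "SELECT * FROM table1 UNION ALL SELECT * FROM table2 ORDER BY id"
).fetchall()

# UNION additionally removes rows that are identical in every column.
union_rows = conn.execute(
    "SELECT * FROM table1 UNION SELECT * FROM table2 ORDER BY id"
).fetchall()

print(len(union_all_rows))  # 5
print(len(union_rows))      # 4
```

If the two tables can never contain the same row, UNION ALL gives the same result and avoids the duplicate-elimination step.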

If you really need to "join" 2 tables:
with a as (
select 1 id, 'ABC' name, 'M' gender from dual union all
select 2 id, 'CDE' name, 'M' gender from dual union all
select 3 id, 'FGH' name, 'M' gender from dual ),
b as (
select 4 id, 'BAC' name, 'F' gender from dual union all
select 5 id, 'DCE' name, 'F' gender from dual union all
select 6 id, 'GFH' name, 'F' gender from dual )
select coalesce(a.id, b.id) id,
coalesce(a.name, b.name) name,
coalesce(a.gender, b.gender) gender
from a
full join b
on a.id = b.id
/* if name, gender not in pk */
-- and a.name = b.name
-- and a.gender = b.gender
;
In this case duplicated IDs are collapsed into a single row, and the coalesce function returns the first non-null value for the "name" and "gender" columns.
You can also use greatest, least, etc. instead of coalesce, but note that in Oracle these return NULL if any argument is NULL.
P.S. Be careful if the tables have no primary key!

Related

SQL Multiple Join but Retain all Records from One Table

I have a difficult operation that must be performed within SQL due to operational limitations.
I have 3 tables that contain required information. They each have some common columns that can be used for a join but do not have a single one that all three share.
The 3 tables are:
Table 1:
| rule_type | code |
|-----------|------|
| Type A | A1 |
| Type A | A1 |
| Type B | B1 |
| Type B | B1 |
| Type C | C1 |
| Type C | C1 |
Table 2:
| site_ref | code |
|----------|------|
| XYZ | A1 |
| XYZ | A1 |
| XYZ | C1 |
| XYZ | C1 |
Table 3:
| site_ref | population |
|----------|------------|
| XYZ | 100 |
| XYZ | 100 |
| XYZ | 100 |
The JOIN required must contain all 3 columns and an additional one that counts the number of distinct entries from table 1. The desired outcome would be:
| rule_type | code | site_ref | population | count |
|-----------|------|----------|------------|-------|
| Type A | A1 | XYZ | 100 | 2 |
| Type B | B1 | XYZ | 100 | 0 |
| Type C | C1 | XYZ | 100 | 2 |
I have attempted to create this joining on common columns via a FULL OUTER JOIN:
SELECT code, count(*) as count, site_ref, population, rule_type, population
FROM
(SELECT A.code, count(*) as count, B.site_ref, C.population, A.rule_type
FROM table_1 as A
FULL OUTER JOIN table_2 AS B ON A.code = B.code
JOIN table_3 as C ON B.site_ref = C.site_ref
WHERE site_ref = 'XYZ' AND rule_type in ('Type A', 'Type B', 'Type C'))
GROUP BY code, count(*) as count, site_ref, population, rule_type, population
But this is returning:
| rule_type | code | site_ref | population | count |
|-----------|------|----------|------------|-------|
| Type A | A1 | XYZ | 100 | 2 |
| Type C | C1 | XYZ | 100 | 2 |
Because there is no corresponding Type B row in table 2, there is nothing to count. I thought using a FULL OUTER JOIN would bring in these additional rule types, but it hasn't. Is there a way to adapt the JOIN so that it brings in the additional rows from table 1 and creates an entry showing the columns of Type B with a count of 0?
I don't understand table 3: does it always contain the same data? If not, how do we know which row belongs to Type B?
If you want to have all possibilities, you can do a cross join.
with table_1
as
(
Select 'Type A' as rule_type, 'A1' as code
Union all
Select 'Type A' as rule_type, 'A1' as code
Union all
Select 'Type B' as rule_type, 'B1' as code
Union all
Select 'Type B' as rule_type, 'B1' as code
Union all
Select 'Type C' as rule_type, 'C1' as code
Union all
Select 'Type C' as rule_type, 'C1' as code
),
table_2
as
(
Select 'XYZ' as site_ref, 'A1' as code
Union all
Select 'XYZ' as site_ref, 'A1' as code
Union all
Select 'XYZ' as site_ref, 'C1' as code
Union all
Select 'XYZ' as site_ref, 'C1' as code
),
table_3
as
(
Select 'XYZ' as site_ref, '100' as population
Union all
Select 'XYZ' as site_ref, '100' as population
Union all
Select 'XYZ' as site_ref, '100' as population
)
Select
x.code,
x.rule_type,
c.site_ref,
c.population,
x.count
from
(
Select
a.code,
a.rule_type,
b.count
from(
Select code,
count(code) as count
from table_2
group by code
) B
full outer join table_1 AS A ON
a.code = b.code
group by
a.code,
a.rule_type,
b.count
) x
cross join table_3 as c
WHERE site_ref = 'XYZ' AND rule_type in ('Type A', 'Type B', 'Type C')
group by
x.code,
x.rule_type,
c.site_ref,
c.population,
x.count
If you want default values, you can use CASE WHEN:
Select
a.code,
a.rule_type,
case when x.site_ref is NUll then 'XYZ' else x.site_ref end as site_ref ,
case when c.population is NUll then '100' else c.population end as population,
case when x.count is NUll then '0' else x.count end as count
from
(
Select
code,
site_ref,
count(b.Code) as count
from
table_2 as b
group by
b.code,
b.site_ref
) x
full outer join table_1 as a on
a.code = x.code
full outer join table_3 as c ON
x.site_ref = c.site_ref
group by
a.code,
a.rule_type,
x.site_ref,
c.population,
x.count
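As an alternative sketch, the zero count for Type B can also be produced with a LEFT JOIN driven from table 1, because COUNT(b.code) ignores the NULLs that unmatched rows produce. The example below runs in Python's built-in sqlite3 rather than the asker's database, and it assumes that table 3 holds a single population value per site_ref (hence the DISTINCT) and that the site is fixed to 'XYZ'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (rule_type TEXT, code TEXT);
    CREATE TABLE table_2 (site_ref TEXT, code TEXT);
    CREATE TABLE table_3 (site_ref TEXT, population INTEGER);
    INSERT INTO table_1 VALUES ('Type A', 'A1'), ('Type A', 'A1'),
                               ('Type B', 'B1'), ('Type B', 'B1'),
                               ('Type C', 'C1'), ('Type C', 'C1');
    INSERT INTO table_2 VALUES ('XYZ', 'A1'), ('XYZ', 'A1'),
                               ('XYZ', 'C1'), ('XYZ', 'C1');
    INSERT INTO table_3 VALUES ('XYZ', 100), ('XYZ', 100), ('XYZ', 100);
""")

rows = conn.execute("""
    SELECT a.rule_type, a.code, s.site_ref, s.population,
           COUNT(b.code) AS cnt        -- counts only matched table_2 rows
    FROM (SELECT DISTINCT rule_type, code FROM table_1) AS a
    CROSS JOIN (SELECT DISTINCT site_ref, population
                FROM table_3 WHERE site_ref = 'XYZ') AS s
    LEFT JOIN table_2 AS b
           ON b.code = a.code AND b.site_ref = s.site_ref
    GROUP BY a.rule_type, a.code, s.site_ref, s.population
    ORDER BY a.rule_type
""").fetchall()

for row in rows:
    print(row)
# ('Type A', 'A1', 'XYZ', 100, 2)
# ('Type B', 'B1', 'XYZ', 100, 0)
# ('Type C', 'C1', 'XYZ', 100, 2)
```

Deduplicating table 1 and table 3 before joining keeps the repeated rows from multiplying the count.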

Group and sort by associated table attribute

There are two tables, Group sections and Groups.
I want to group and sort groups by group sections.
Group Sections:
Id | Name | Priority
1 | Football | 2
2 | Basketball | 1
3 | Tennis | 3
Groups:
Id | section_id | Name
1 | 1 | Barcelona
2 | NULL | Noname
3 | 1 | Real Madrid
4 | 2 | Cedevita
5 | 3 | Ljubljana
What I have so far in SQL:
SELECT group_sections.id, group_sections.priority AS priority, group_sections.name, groups.*
FROM groups
LEFT OUTER JOIN group_sections ON group_sections.group_id = groups.id
GROUP BY group_sections.id, groups.id
ORDER BY group_sections.priority ASC
What I want to get:
Football => [<Group id: 1>, <Group id: 3>], Basketball => [<Group id: 2>], Tennis => [<Group id: 3>]
How can I get this in rails active record?
#groups.joins(:group_sections).group('group_sections.id', 'groups.id').order('group_sections.priority ASC')
This query is not working for me. Any ideas?
You can try string_agg:
select string_agg(name || ' => [' || groupnames ,', ') groupnames
from(
SELECT b.name, string_agg(a.name,', ') || ']' groupnames
FROM groups a
JOIN group_sections b ON b.id = a.section_id
group by a.section_id, b.name, b.priority
order by b.priority) x
PostgreSQL has a useful function for this named string_agg. Here is an example for your situation:
with
gs as (
select 1 as id, 'Football' as name, 2 as Priority union all
select 2 as id, 'Basketball' as name, 1 as Priority union all
select 3 as id, 'Tennis' as name, 3 as Priority
),
g as (
select 1 as id, 1 as section_id, 'Barcelona' as Name union all
select 2 as id, null as section_id, 'Noname' as Name union all
select 3 as id, 1 as section_id, 'Real Madrid' as Name union all
select 4 as id, 2 as section_id, 'Cedevita ' as Name
)
select
gs.id, gs.name, gs.Priority,
string_agg(g.id::text,',') as id_of_groups
from g
left join gs
on g.section_id = gs.id
group by gs.id, gs.name, gs.Priority
order by gs.Priority
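If the grouping is wanted as a mapping in application code rather than as one aggregated string, a rough equivalent is to fetch the joined rows ordered by priority and group them on the client. The sketch below uses Python's sqlite3 as a stand-in for Rails and PostgreSQL; the table name is quoted because GROUPS is a keyword in recent SQLite versions, and the variable name `grouped` is just an illustrative choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE group_sections (id INTEGER, name TEXT, priority INTEGER);
    CREATE TABLE "groups" (id INTEGER, section_id INTEGER, name TEXT);
    INSERT INTO group_sections VALUES (1, 'Football', 2),
                                      (2, 'Basketball', 1),
                                      (3, 'Tennis', 3);
    INSERT INTO "groups" VALUES (1, 1, 'Barcelona'), (2, NULL, 'Noname'),
                                (3, 1, 'Real Madrid'), (4, 2, 'Cedevita'),
                                (5, 3, 'Ljubljana');
""")

# The inner join drops groups without a section (here: 'Noname').
rows = conn.execute("""
    SELECT gs.name, g.id
    FROM "groups" AS g
    JOIN group_sections AS gs ON gs.id = g.section_id
    ORDER BY gs.priority, g.id
""").fetchall()

# Build section name -> list of group ids; dicts keep insertion order,
# so the sections stay sorted by priority.
grouped = {}
for section_name, group_id in rows:
    grouped.setdefault(section_name, []).append(group_id)

print(grouped)  # {'Basketball': [4], 'Football': [1, 3], 'Tennis': [5]}
```

Note that no GROUP BY is needed in the SQL for this shape of result; the ordering alone is enough and the grouping happens after the fetch.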

Selecting a record based on a series of criteria

I would like to run a query that will allow me to choose the best record for a particular username based on certain criteria. I have 2 columns (col01, col02) that are the criteria I am looking at.
• If one record (username a in the example below) has both columns as yes, I would like that one to take precedence.
• If one record has col01 as yes, it takes second precedence (username c in the example below).
• If one record has col01 as yes and the other has col02 as yes, then col01 takes precedence (username d in the example below).
• If one record has col02 as yes and the other records have no, then col02 takes third precedence (username g in the example below).
• If both records are the same, then neither should be returned, as these records need to be investigated further (usernames b, e, f).
Below are sample data and the expected output. How can this be done with a SQL query?
+----------+-----+-------+-------+
| username | id | col01 | col02 |
+----------+-----+-------+-------+
| a | 1 | yes | yes |
| a | 2 | yes | no |
| b | 3 | no | no |
| b | 4 | no | no |
| c | 5 | yes | no |
| c | 6 | no | no |
| d | 7 | yes | no |
| d | 8 | no | yes |
| e | 9 | no | yes |
| e | 10 | no | yes |
| f | 11 | yes | yes |
| f | 12 | yes | yes |
| g | 13 | no | no |
| g | 14 | no | yes |
+----------+----+--------+-------+
output
+----------+-----+-------+------+
| username | id | col01 | col02|
+----------+-----+-------+------+
| a | 1 | yes | yes |
| c | 5 | yes | no |
| d | 7 | yes | no |
| g | 14 | no | yes |
+----------+----+--------+------+
Edit: I was asked to explain the conditions. Basically the records come from the same area (username). col01 is the most recently updated information we have, while col02 is older. Both columns are important to us, which is why it is better if both are yes; col01, being more recent, holds the more dependable data. Where all the records are exactly the same, we have to dig a little deeper to understand our data.
Use analytic functions and then you do not need any self-joins:
Query:
SELECT username,
id,
col01,
col02
FROM (
SELECT t.*,
c.col02,
MIN( t.col01 ) OVER ( PARTITION BY username ) AS mincol01,
MAX( t.col01 ) OVER ( PARTITION BY username ) AS maxcol01,
MIN( c.col02 ) OVER ( PARTITION BY username ) AS mincol02,
MAX( c.col02 ) OVER ( PARTITION BY username ) AS maxcol02,
ROW_NUMBER() OVER ( PARTITION BY username
ORDER BY t.col01 DESC, c.col02 DESC ) AS rn
FROM table_name t
INNER JOIN
col02_table c
ON ( t.id = c.id )
)
WHERE ( mincol01 < maxcol01 OR mincol02 < maxcol02 )
AND rn = 1;
Output:
USERNAME ID COL01 COL02
-------- -- ----- -----
a 1 yes yes
c 5 yes no
d 7 yes no
g 14 no yes
with
inputs ( username, id, col01 , col02 ) as (
select 'a', 1, 'yes', 'yes' from dual union all
select 'a', 2, 'yes', 'no' from dual union all
select 'b', 3, 'no' , 'no' from dual union all
select 'b', 4, 'no' , 'no' from dual union all
select 'c', 5, 'yes', 'no' from dual union all
select 'c', 6, 'no' , 'no' from dual union all
select 'd', 7, 'yes', 'no' from dual union all
select 'd', 8, 'no' , 'yes' from dual union all
select 'e', 9, 'no' , 'yes' from dual union all
select 'e', 10, 'no' , 'yes' from dual union all
select 'f', 11, 'yes', 'yes' from dual union all
select 'f', 12, 'yes', 'yes' from dual union all
select 'g', 13, 'no' , 'no' from dual union all
select 'g', 14, 'no' , 'yes' from dual
)
-- Query begins here
select username,
max(id) keep (dense_rank last order by col01, col02) as id,
max(col01) as col01,
max(col02) keep (dense_rank last order by col01) as col02
from inputs
group by username
having min(col01) != max(col01) or min(col02) != max(col02)
;
USERNAME ID COL COL
-------- --- --- ---
a 1 yes yes
c 5 yes no
d 7 yes no
g 14 no yes
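For engines without analytic functions or Oracle's KEEP clause, the same precedence rules can be sketched with two correlated subqueries. The example below runs in Python's built-in sqlite3; the table name `records` is hypothetical, and the ordering relies on the string 'yes' sorting after 'no'. Like the queries above, it only checks that a username has at least one differing record, so with more than two records per username a tie at the top would not be detected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (username TEXT, id INTEGER, col01 TEXT, col02 TEXT)"
)
conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?)", [
    ('a', 1, 'yes', 'yes'), ('a', 2, 'yes', 'no'),
    ('b', 3, 'no', 'no'),   ('b', 4, 'no', 'no'),
    ('c', 5, 'yes', 'no'),  ('c', 6, 'no', 'no'),
    ('d', 7, 'yes', 'no'),  ('d', 8, 'no', 'yes'),
    ('e', 9, 'no', 'yes'),  ('e', 10, 'no', 'yes'),
    ('f', 11, 'yes', 'yes'), ('f', 12, 'yes', 'yes'),
    ('g', 13, 'no', 'no'),  ('g', 14, 'no', 'yes'),
])

best = conn.execute("""
    SELECT t.username, t.id, t.col01, t.col02
    FROM records AS t
    WHERE t.id = (              -- highest-ranked record of this username
            SELECT b.id FROM records AS b
            WHERE b.username = t.username
            ORDER BY b.col01 DESC, b.col02 DESC, b.id
            LIMIT 1)
      AND EXISTS (              -- skip usernames whose records are all identical
            SELECT 1 FROM records AS o
            WHERE o.username = t.username
              AND (o.col01 <> t.col01 OR o.col02 <> t.col02))
    ORDER BY t.username
""").fetchall()

for row in best:
    print(row)
# ('a', 1, 'yes', 'yes')
# ('c', 5, 'yes', 'no')
# ('d', 7, 'yes', 'no')
# ('g', 14, 'no', 'yes')
```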
Use multiple outer self joins: one for records with both columns yes, one for records with only col01 = yes, and one for records with only col02 = yes. Then add predicates to select only records whose id is the id of the first record in that set (the id of the row with the same username that has both yes, the id of the row with the same username that has only col01 = yes, etc.).
To get rid of rows that are duplicates, filter out any row for which there is another row (with a different id) that has the same values for username, col01, and col02.
Select distinct a.username, a.id,
a.col01, a.col02
From table a
left join table b -- <- this is rows with both cols = yes
on b.username=a.username
and b.col01='yes'
and b.col02='yes'
left join table c1 -- <- this is rows with col1 = yes
on c1.username=a.username
and c1.col01='yes'
and c1.col02='no'
left join table c2 -- <- this is rows with col2 = yes
on c2.username=a.username
and c2.col01='no'
and c2.col02='yes'
Where a.id = coalesce(b.id, c1.Id, c2.Id)
and not exists -- <- This gets rid of f
(select * from table
where username = a.username
and id != a.id
and col01 = a.col01
and col02 = a.col02)
If col02 is in another table, then in each place where you use the table and need col02, you will need to add another join to this other table.
Select distinct a.username, a.id,
a.col01, ot.col02
From (table a join otherTable ot
on ot.id = a.Id)
left join (table b join otherTable ob -- <- this rows with both cols yes
on ob.id= b.id)
on b.username=a.username
and b.col01='yes'
and ob.col02='yes'
left join (table c1 join otherTable oc1 -- <- this rows with col1 yes
on oc1.id= c1.id)
on c1.username=a.username
and c1.col01='yes'
and oc1.col02='no'
left join (table c2 join otherTable oc2 -- <- this rows with col2 yes
on oc2.id= c2.id)
on c2.username=a.username
and c2.col01='no'
and oc2.col02='yes'
Where a.id = coalesce(b.id, c1.Id, c2.Id)
and not exists -- <- This gets rid of f
(select * from table e
join otherTable oe
on oe.id= e.id
where e.username = a.username
and e.id != a.id
and e.col01 = a.col01
and oe.col02 = a.col02)

SQL statement to conditionally selecting records based on the previous record

I have 2 tables as below
Table 1 : Animal (ID is a primary key)
ID |Animal
----------
1 |Dog
2 |Cat
3 |Fish
4 |Bird
5 |Elephant
Table 2: Pet (ID here is a foreign key to the Animal table)
ID | Animal | Name
----------
1 | Dog | Annie
1 | Dog | Buckie
2 | Cat | Conner
2 | Cat | Kitten
3 | Fish| Lala
I want to write a SQL statement to append a row with "Fish" right after wherever a specific pet "Dog" appears without breaking the order.
Expected result should be:
ID | Animal | Name
----------
1 | Dog | Annie
3 | Fish| NULL
1 | Dog | Buckie
3 | Fish| NULL
2 | Cat | Conner
2 | Cat | Kitten
3 | Fish| Lala
I'm not too sure about Oracle11g but I think it has ROW_NUMBER.
You could add a row number to the original table,
and then union a fish table with corresponding row numbers.
For example
WITH Tablex AS (
-- Oracle requires a qualified t.* when other columns appear in the select list
SELECT ROW_NUMBER() OVER(ORDER BY t.ID, t.Name) AS ref_id, t.*
FROM your_table t
)
SELECT ID, Animal, Name
FROM (SELECT *
FROM Tablex
UNION ALL
SELECT *
FROM
(SELECT ref_id, 3 AS ID, 'Fish' AS Animal, NULL AS Name
FROM TableX
WHERE Animal = 'Dog'
) x
) X
ORDER BY ref_id, id
As commented above, the order of rows depends only on the ORDER BY clause and the order may not be actually incorporated in the table (if you put it INTO something).
Try this
select tn.Id
, case when tt.rn = 0 then tn.Animal else 'Fish' end Animal
, case when tt.rn = 0 then tn.Name else NULL end Name
, tn.rn+tt.rn rn
from (
select ID, Animal, Name, 2 * row_number() over (order by id, name) as rn
from pet
) tn
join (
select 0 rn from dual union
select 1 from dual
) tt on tt.rn <= case Animal when 'Dog' then 1 else 0 end
order by tn.rn+tt.rn;
with Q as (
select ID, Animal, Name,
row_number() over (order by id, name) rnum
from Pet
)
select ID, Animal, Name, rnum
from Q
union all
select 3, 'Fish', NULL, rnum+0.5
from Q
where ID=1 and name in('Annie','Buckie')
order by rnum
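The same interleaving can be sketched without ROW_NUMBER by carrying the sort key of each Dog row into its extra Fish row, plus a sequence flag that places the Fish row immediately after it. This is a minimal illustration in Python's built-in sqlite3, not Oracle; the column names sort_id, sort_name and seq are hypothetical helpers, and the order is assumed to be by (id, name) as in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pet (id INTEGER, animal TEXT, name TEXT);
    INSERT INTO pet VALUES (1, 'Dog', 'Annie'), (1, 'Dog', 'Buckie'),
                           (2, 'Cat', 'Conner'), (2, 'Cat', 'Kitten'),
                           (3, 'Fish', 'Lala');
""")

rows = conn.execute("""
    SELECT id, animal, name
    FROM (
        -- original rows, sorted by their own (id, name), seq 0
        SELECT id, animal, name, id AS sort_id, name AS sort_name, 0 AS seq
        FROM pet
        UNION ALL
        -- one Fish row per Dog row, reusing the Dog's sort key, seq 1
        SELECT 3, 'Fish', NULL, id, name, 1
        FROM pet
        WHERE animal = 'Dog'
    )
    ORDER BY sort_id, sort_name, seq
""").fetchall()

for row in rows:
    print(row)
# (1, 'Dog', 'Annie')
# (3, 'Fish', None)
# (1, 'Dog', 'Buckie')
# (3, 'Fish', None)
# (2, 'Cat', 'Conner')
# (2, 'Cat', 'Kitten')
# (3, 'Fish', 'Lala')
```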

Query for missing elements

I have a table with the following structure:
timestamp | name | value
0 | john | 5
1 | NULL | 3
8 | NULL | 12
12 | john | 3
33 | NULL | 4
54 | pete | 1
180 | NULL | 4
400 | john | 3
401 | NULL | 4
592 | anna | 2
Now what I am looking for is a query that will give me the sum of the values for each name, and that treats the nulls in between (ordered by the timestamp) as the first non-null name down the list, as if the table were as follows:
timestamp | name | value
0 | john | 5
1 | john | 3
8 | john | 12
12 | john | 3
33 | pete | 4
54 | pete | 1
180 | john | 4
400 | john | 3
401 | anna | 4
592 | anna | 2
and I would query SUM(value), name from this table group by name. I have thought and tried, but I can't come up with a proper solution. I have looked at recursive common table expressions, and think the answer may lie in there, but I haven't been able to properly understand those.
These tables are just examples, and I don't know the timestamp values in advance.
Could someone give me a hand? Help would be very much appreciated.
With Inputs As
(
Select 0 As [timestamp], 'john' As Name, 5 As value
Union All Select 1, NULL, 3
Union All Select 8, NULL, 12
Union All Select 12, 'john', 3
Union All Select 33, NULL, 4
Union All Select 54, 'pete', 1
Union All Select 180, NULL, 4
Union All Select 400, 'john', 3
Union All Select 401, NULL, 4
Union All Select 592, 'anna', 2
)
, NamedInputs As
(
Select I.timestamp
, Coalesce (I.Name
, (
Select I3.Name
From Inputs As I3
Where I3.timestamp = (
-- the expected table fills each NULL from the next non-null row
Select Min(I2.timestamp)
From Inputs As I2
Where I2.timestamp > I.timestamp
And I2.Name Is not Null
)
)) As name
, I.value
From Inputs As I
)
Select NI.name, Sum(NI.Value) As Total
From NamedInputs As NI
Group By NI.name
By the way, correcting the data first would be orders of magnitude faster than any query: update the name column to hold the proper value, make it non-nullable, and then run a simple Group By to get your totals.
Additional Solution
Select Coalesce(I.Name, I2.Name), Sum(I.value) As Total
From Inputs As I
Left Join (
Select I1.timestamp, MIN(I2.Timestamp) As NextNameTimestamp
From Inputs As I1
Left Join Inputs As I2
On I2.timestamp > I1.timestamp
And I2.Name Is Not Null
Group By I1.timestamp
) As Z
On Z.timestamp = I.timestamp
Left Join Inputs As I2
On I2.timestamp = Z.NextNameTimestamp
Group By Coalesce(I.Name, I2.Name)
You don't need a CTE; a simple correlated subquery is enough. It looks up the next non-null name after each row, which is what the expected table shows.
select t.timestamp, ISNULL(t.name, (
select top(1) i.name
from inputs i
where i.timestamp > t.timestamp
and i.name is not null
order by i.timestamp asc
)), t.value
from inputs t
And summing from here:
select name, SUM(value) as totalValue
from
(
select t.timestamp, ISNULL(t.name, (
select top(1) i.name
from inputs i
where i.timestamp > t.timestamp
and i.name is not null
order by i.timestamp asc
)) as name, t.value
from inputs t
) N
group by name
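I hope I'm not going to be embarrassed by offering you this little recursive CTE query of mine as a solution to your problem.
;WITH
numbered_table AS (
SELECT
timestamp, name, value,
-- number rows from the last timestamp backwards, so each NULL inherits the next non-null name
rownum = ROW_NUMBER() OVER (ORDER BY timestamp DESC)
FROM your_table
),
filled_table AS (
SELECT
timestamp,
name,
value,
rownum
FROM numbered_table
WHERE rownum = 1
UNION ALL
SELECT
nt.timestamp,
name = ISNULL(nt.name, ft.name),
nt.value,
nt.rownum -- carried along so the recursive join can reference ft.rownum
FROM numbered_table nt
INNER JOIN filled_table ft ON nt.rownum = ft.rownum + 1
)
SELECT timestamp, name, value
FROM filled_table
/* or go ahead aggregating instead; for tables longer than the default recursion limit of 100, append OPTION (MAXRECURSION 0) */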
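As a quick check against the question's expected table, where each NULL takes the name of the next non-null row by timestamp, here is a minimal sketch of the correlated-subquery approach run in Python's built-in sqlite3. The column is called ts here (an assumption for brevity), and LIMIT 1 stands in for SQL Server's TOP(1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputs (ts INTEGER, name TEXT, value INTEGER)")
conn.executemany("INSERT INTO inputs VALUES (?, ?, ?)", [
    (0, 'john', 5), (1, None, 3), (8, None, 12), (12, 'john', 3),
    (33, None, 4), (54, 'pete', 1), (180, None, 4), (400, 'john', 3),
    (401, None, 4), (592, 'anna', 2),
])

totals = conn.execute("""
    SELECT name, SUM(value) AS total
    FROM (
        SELECT COALESCE(t.name, (
                   -- nearest later row that has a name
                   SELECT n.name FROM inputs AS n
                   WHERE n.ts > t.ts AND n.name IS NOT NULL
                   ORDER BY n.ts
                   LIMIT 1)) AS name,
               t.value
        FROM inputs AS t
    )
    GROUP BY name
    ORDER BY name
""").fetchall()

print(totals)  # [('anna', 6), ('john', 30), ('pete', 5)]
```

These totals match summing the filled-in table from the question by hand: john 5+3+12+3+4+3, pete 4+1, anna 4+2.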