How can I choose which column I refer to? - sql

I have 2 tables with some duplicate columns. I need to join them without picking which columns I want to select:
CREATE TABLE IF NOT EXISTS animals (
id int(6) unsigned NOT NULL,
cond varchar(200) NOT NULL,
animal varchar(200) NOT NULL,
PRIMARY KEY (id)
) DEFAULT CHARSET=utf8;
INSERT INTO animals (id, cond, animal) VALUES
('1', 'fat', 'cat'),
('2', 'slim', 'cat'),
('3', 'fat', 'dog'),
('4', 'slim', 'dog'),
('5', 'normal', 'dog');
CREATE TABLE IF NOT EXISTS names (
id int(6) unsigned NOT NULL,
name varchar(200) NOT NULL,
animal varchar(200) NOT NULL,
PRIMARY KEY (id)
) DEFAULT CHARSET=utf8;
INSERT INTO names (id, name, animal) VALUES
('1', 'LuLu', 'cat'),
('2', 'DoDo', 'cat'),
('3', 'Jack', 'dog'),
('4', 'Shorty', 'dog'),
('5', 'Stinky', 'dog');
SELECT *
FROM animals AS a
JOIN names as n
ON a.id = n.id;
Result:
| id | cond | animal | id | name | animal |
| --- | ------ | ------ | --- | ------ | ------ |
| 1 | fat | cat | 1 | LuLu | cat |
| 2 | slim | cat | 2 | DoDo | cat |
| 3 | fat | dog | 3 | Jack | dog |
| 4 | slim | dog | 4 | Shorty | dog |
| 5 | normal | dog | 5 | Stinky | dog |
But when I try to make another request from the resulting table like:
SELECT name
FROM
(
SELECT *
FROM animals AS a
JOIN names as n
ON a.id = n.id
) as res_tbl
WHERE name = 'LuLu';
I get:
Query Error: Error: ER_DUP_FIELDNAME: Duplicate column name 'id'
Is there any way of avoiding it except removing duplicate columns from the 1st request?
P.S. In fact I am using PostgreSQL; I wrote my schema in MySQL syntax because I am more used to it.

You have columns with the same name in both tables, which causes ambiguity.
If you just want the name column in the outer query, then select that column only in the subquery:
select name
from (
select n.name
from animals a
inner join names n using (id)
) t
where ...
If you want more columns, then you would typically alias the homonym columns to remove the ambiguity - as for the joining column (here, id), the using() syntax is sufficient. So, for example:
select ...
from (
select id, a.cond, a.animal as animal1, n.name, n.animal as animal2
from animals a
inner join names n using (id)
) t
where ...
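As a concrete sketch of the aliasing approach, here is one complete query of that shape. The aliases animal_a and animal_n and the filter on 'LuLu' (borrowed from the question) are illustrative choices, not part of the answer above, and PostgreSQL syntax is assumed:

```sql
-- Sketch only: assumes the animals and names tables from the question exist.
SELECT t.id, t.name, t.cond
FROM (
    SELECT id,                      -- USING (id) merges the two id columns into one
           a.cond,
           a.animal AS animal_a,    -- hypothetical alias for animals.animal
           n.name,
           n.animal AS animal_n     -- hypothetical alias for names.animal
    FROM animals a
    INNER JOIN names n USING (id)
) AS t
WHERE t.name = 'LuLu';
```

Because every column of the derived table now has a unique name, the outer query can refer to any of them without triggering the duplicate column error.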
You may also select the records themselves, instead of the columns from them, which you can then access in the outer query using the (record).column syntax:
SELECT (t.a).cond AS animal_cond,
(t.n).name AS animal_name
FROM (
SELECT a, n
FROM animals AS a
JOIN names AS n
ON a.id = n.id
) t

Related

ORDER BY value in join table not grouped before aggregation

I am trying to order a Postgres result set based on an array_aggregate function.
I have the following query that works great:
select a.id, a.name, array_agg(f.name)
from actors a
join actor_films af on a.id = actor_id
join films f on film_id = f.id
group by a.id
order by a.id;
This gives me the following results, for example:
id | name | array_agg
----+--------+---------------------------------
1 | bob | {"delta force"}
2 | joe | {"delta force","the funny one"}
3 | fred | {"bad movie",AARRR}
4 | sally | {"the funny one"}
5 | suzzy | {"bad movie","delta force"}
6 | jill | {AARRR}
7 | victor | {"the funny one"}
I want to sort the results so that it is sorted alphabetically by Film name. For example, the final order should be:
id | name | array_agg
----+--------+---------------------------------
3 | fred | {"bad movie",AARRR}
6 | jill | {AARRR}
5 | suzzy | {"bad movie","delta force"}
1 | bob | {"delta force"}
2 | joe | {"delta force","the funny one"}
4 | sally | {"the funny one"}
7 | victor | {"the funny one"}
This is based on the alphabetical name of any movies they are in. When I add the ORDER BY f.name I get the following error:
ERROR: column "f.name" must appear in the GROUP BY clause or be used in an aggregate function
I cannot add it to the group, because I need it aggregated in the array, and I want to sort pre-aggregation, such that I can get the following order. Is this possible?
If you would like to reproduce this example, here is the setup code:
create table actors(id serial primary key, name text);
create table films(id serial primary key, name text);
create table actor_films(actor_id int references actors (id), film_id int references films (id));
insert into actors (name) values('bob'), ('joe'), ('fred'), ('sally'), ('suzzy'), ('jill'), ('victor');
insert into films (name) values('AARRR'), ('the funny one'), ('bad movie'), ('delta force');
insert into actor_films(actor_id, film_id) values (2, 2), (7, 2), (4,2), (2, 4), (1, 4), (5, 4), (6, 1), (3, 1), (3, 3), (5, 3);
And the final query with the error:
select a.id, a.name, array_agg(f.name)
from actors a
join actor_films af on a.id = actor_id
join films f on film_id = f.id
group by a.id
order by f.name, a.id;
You can use an aggregation function:
order by min(f.name), a.id
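A fuller sketch of that idea, using the tables from the question. The ORDER BY inside array_agg and the films alias are additions that are not in the answer above; they only make the order of the film names inside each array deterministic:

```sql
-- Sort actors by their alphabetically first film, then by id.
SELECT a.id,
       a.name,
       array_agg(f.name ORDER BY f.name) AS films  -- assumption: sorted array contents are wanted
FROM actors a
JOIN actor_films af ON a.id = af.actor_id
JOIN films f ON af.film_id = f.id
GROUP BY a.id                                      -- a.id is the primary key, so a.name may be selected
ORDER BY min(f.name), a.id;
```

Since min(f.name) is an aggregate, it is legal in ORDER BY even though f.name is not part of the GROUP BY. The exact ordering of mixed-case names depends on the database collation.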

How to return columns from nested SELECT queries in the final table?

I have a three-layer nested query which works.
select PARTNER, BIRTHDT, XSEXM, XSEXF from "schema"."platform.view/table2" where partner IN
(select SID from "schema"."platform.view/table1" where TYPE='BB' and CLASS='yy' and ID IN
(select SID from "schema"."platform.view/table1" where TYPE='AA' and CLASS='zz' and ID IN ("one", "two")
))
I want the values ("one", "two") from table1 in the innermost query to be present in the final table returned.
I have tried to get it like this:
select t1.ID, t2.SID from "schema"."platform.view/table1" t1
OUTER APPLY (
select SID from "schema"."platform.view/table1" t2
where t2.TYPE='BB' and t2.CLASS='yy' and t2.ID IN t1.SID
)
where t1.TYPE='AA' and t1.CLASS='zz' and t1.ID IN ("one", "two")
There are three identifiers:
1. ID ( ONE, TWO, etc.)
2. intermediate SID ( 123, 124, etc) which is again searched as ID
3. Partner ID (P12, P13, etc) which maps to table2.
Sample Data:
table1:
| ID | SID | TYPE | CLASS |
|------|-----|------|-------|
| ONE | 123 | AA | zz |
| TWO | 124 | AA | zz |
| 123 | P12 | BB | yy |
| THRE | 125 | AA | zz |
| 124 | P13 | BB | yy |
| 125 | P14 | BB | yy |
| FOUR | 123 | AA | zz |
table2:
| PARTNER | BIRTHDT | XSEXM | XSEXF |
|---------|----------|-------|-------|
| P12 | 19900214 | X | |
| P13 | 19900713 | X | |
| P14 | 19900407 | | X |
Desired Output for Input ("ONE", "TWO", "THRE"):
| ID | PARTNER | BIRTHDT | XSEXM | XSEXF |
|-----|---------|----------|-------|-------|
| ONE | P12 | 19900214 | X | |
| TWO | P13 | 19900713 | X | |
| THRE| P14 | 19900407 | | X |
How to map this initial search value with its final result rows in this three layer nested statement?
Since you want to "carry" information from your "inner" SELECTs, you could "join back" the data at the final projection step, but that requires a 1:1 relationship that you could use for the join.
This is not the case here.
Instead, don't use the WHERE ... IN (SELECT ID...) approach, but INNER JOINs instead.
These allow the same kind of filtering/selection but also give the option to project any column of the two involved tables.
For your rather abstract statement (the column names need a lot of context knowledge to make sense, which is something you may want to fix by adding useful column aliases), this can look like so:
drop table tab1;
drop table tab2;
CREATE TABLE TAB1
("ID" varchar(6)
, "SID" varchar(5)
, "TYPE" varchar(6)
, "CLASS" varchar(7))
;
INSERT INTO TAB1
VALUES ('ONE', '123', 'AA', 'zz');
INSERT INTO TAB1
VALUES ('TWO', '124', 'AA', 'zz');
INSERT INTO TAB1
VALUES ('123', 'P12', 'BB', 'yy');
INSERT INTO TAB1
VALUES ('THRE', '125', 'AA', 'zz');
INSERT INTO TAB1
VALUES ('124', 'P13', 'BB', 'yy');
INSERT INTO TAB1
VALUES ('125', 'P14', 'BB', 'yy');
INSERT INTO TAB1
VALUES ('FOUR', '123', 'AA', 'zz');
select * from tab1;
CREATE TABLE TAB2
("PARTNER" varchar(9)
, "BIRTHDT" varchar(10)
, "XSEXM" varchar(7)
, "XSEXF" varchar(7))
;
INSERT INTO TAB2
VALUES ('P12', '19900214', 'X', NULL);
INSERT INTO TAB2
VALUES ('P13', '19900713', 'X', NULL);
INSERT INTO TAB2
VALUES ('P14', '19900407', NULL, 'X');
with id_sel as (
select SID, ID
from TAB1
where
TYPE='AA'
and CLASS='zz'
and ID IN ('ONE', 'TWO', 'THRE')
),
part_sel as (
select
t1.SID, id.ID orig_id
from
TAB1 t1
inner join id_sel id
on t1.id = id.sid
where
t1.TYPE='BB'
and t1.CLASS='yy'
)
select
part_sel.orig_id, t2.PARTNER, t2.BIRTHDT, t2.XSEXM, t2.XSEXF
from
TAB2 t2
inner join part_sel
on t2.partner = part_sel.sid;
ORIG_ID PARTNER BIRTHDT XSEXM XSEXF
ONE P12 19900214 X ?
TWO P13 19900713 X ?
THRE P14 19900407 ? X
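The same result can also be sketched without CTEs, as a plain chain of INNER JOINs over the TAB1 and TAB2 tables created above. The aliases src and lnk are hypothetical names chosen for illustration:

```sql
-- src: the 'AA'/'zz' rows holding the search values (ONE, TWO, ...)
-- lnk: the 'BB'/'yy' rows whose ID equals the SID found in src
SELECT src.ID AS orig_id,
       t2.PARTNER, t2.BIRTHDT, t2.XSEXM, t2.XSEXF
FROM TAB1 src
INNER JOIN TAB1 lnk
        ON lnk.ID = src.SID
       AND lnk.TYPE = 'BB'
       AND lnk.CLASS = 'yy'
INNER JOIN TAB2 t2
        ON t2.PARTNER = lnk.SID
WHERE src.TYPE = 'AA'
  AND src.CLASS = 'zz'
  AND src.ID IN ('ONE', 'TWO', 'THRE');
```

Each join replaces one level of the original WHERE ... IN nesting, which is why the search value src.ID stays available for the final projection.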

SQL Joins with NOT IN displays incorrect data

I have 3 tables as below, and I need data where Expense.Expense_Code is not available in Income.Income_Code.
Table: Base
+----+-----------+----------------+
| ID | Reference | Reference_Name |
+----+-----------+----------------+
| 1 | 10000 | AAAA |
| 2 | 10001 | BBBB |
| 3 | 10002 | CCCC |
+----+-----------+----------------+
Table: Expense
+-----+---------+--------------+----------------+
| EID | BASE_ID | Expense_Code | Expense_Amount |
+-----+---------+--------------+----------------+
| 1 | 1 | I0001 | 25 |
| 2 | 1 | I0002 | 50 |
| 3 | 2 | I0003 | 75 |
+-----+---------+--------------+----------------+
Table: Income
+------+---------+-------------+------------+
| I_ID | BASE_ID | Income_Code | Income_Amt |
+------+---------+-------------+------------+
| 1 | 1 | I0001 | 10 |
| 2 | 1 | I0002 | 20 |
| 3 | 1 | I0003 | 30 |
+------+---------+-------------+------------+
SELECT DISTINCT Base.Reference,Expense.Expense_Code
FROM Base
JOIN Expense ON Base.ID = Expense.BASE_ID
JOIN Income ON Base.ID = Income.BASE_ID
WHERE Expense.Expense_Code IN ('I0001','I0002')
AND Income.Income_Code NOT IN ('I0001','I0002')
I expect no data to be returned.
However I am getting the result as below:
+-----------+--------------+
| REFERENCE | Expense_Code |
+-----------+--------------+
| 10000 | I0001 |
| 10000 | I0002 |
+-----------+--------------+
For Base.Reference (10000), Expense.Expense_Code='I0001','I0002', the same expense code is available in the Income table, therefore I should not get any data.
Am I doing something wrong with the joins?
Thanks in advance for your help!
You are not joining the EXPENSE and INCOME tables to each other in your query at all. There needs to be a condition relating these tables in order to get the desired result. You can also use a NOT EXISTS clause. Prefer NOT EXISTS over NOT IN when NULLs are allowed in the compared columns: a single NULL returned by the subquery makes NOT IN return no rows.
SELECT * FROM BASE B
JOIN EXPENSE E ON B.ID=E.BASE_ID
WHERE NOT EXISTS (SELECT 1 FROM INCOME I WHERE I.BASE_ID=E.BASE_ID AND I.INCOME_CODE=E.EXPENSE_CODE)
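The NULL caveat can be seen with a sketch like the following. It assumes, hypothetically, that income.income_code is nullable and that at least one row holds NULL there, which is not the case in the sample data of the question:

```sql
-- Assumption: at least one income row has income_code IS NULL.

-- NOT IN: "x NOT IN (..., NULL)" is never TRUE, so this returns no rows at all.
SELECT e.*
FROM expense e
WHERE e.expense_code NOT IN (SELECT i.income_code FROM income i);

-- NOT EXISTS: a NULL income_code never matches, so unmatched expenses are still returned.
SELECT e.*
FROM expense e
WHERE NOT EXISTS (
    SELECT 1
    FROM income i
    WHERE i.income_code = e.expense_code
);
```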
When the first join is performed, you end up with two rows having ID 1, because the relationship between the tables is not one-to-one: every row of the first table is paired with each matching row of the second table. Like so:
Output of the first join statement
Then, when the second join is executed, the DBMS finds two rows with ID 1 in the first joined result (BASE + EXPENSE) and three in the third table (INCOME).
Again, since the relationship is not one-to-one, every row from the first joined result is paired with every matching row from INCOME, like so:
Output of the second join statement
Finally, it applies your WHERE clause and outputs what you see. I highlighted the rows excluded by the WHERE clause:
Output of where statement
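To inspect these intermediate rows yourself, a sketch that runs the same two joins without the WHERE clause (table and column names as in the question):

```sql
-- Both joins go through Base.ID only, so each expense row of a base
-- is combined with every income row of the same base.
SELECT b.Reference, e.Expense_Code, i.Income_Code
FROM Base b
JOIN Expense e ON b.ID = e.BASE_ID
JOIN Income  i ON b.ID = i.BASE_ID
ORDER BY e.Expense_Code, i.Income_Code;
-- With the question's data: 2 expense rows x 3 income rows = 6 rows, all for Reference 10000.
```

The pairs (I0001, I0003) and (I0002, I0003) pass both conditions of the original WHERE clause, which is why DISTINCT still reports I0001 and I0002 for Reference 10000.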
...I need data where Expense.Expense_Code Should not be availalbe in Income.Income_Code
The following query will retrieve this data:
select b.*, e.*
from base b
join expense e on e.base_id = b.id
left join income i on i.base_id = e.base_id
and e.expense_code = i.income_code
where i.i_id is null
For reference the data script (slightly modified) is:
create table base (
id number(6),
reference number(6),
reference_name varchar2(10)
);
insert into base (id, reference, reference_name) values (1, 10000, 'AAAA');
insert into base (id, reference, reference_name) values (2, 10001, 'BBBB');
insert into base (id, reference, reference_name) values (3, 10002, 'CCCC');
create table expense (
eid number(6),
base_id number(6),
expense_code varchar2(10),
expense_amount number(6)
);
insert into expense (eid, base_id, expense_code, expense_amount) values (1, 1, 'I0001', 25);
insert into expense (eid, base_id, expense_code, expense_amount) values (2, 1, 'I0002', 50);
insert into expense (eid, base_id, expense_code, expense_amount) values (3, 1, 'I0003', 75);
insert into expense (eid, base_id, expense_code, expense_amount) values (4, 2, 'I0004', 101);
create table income (
i_id number(6),
base_id number(6),
income_code varchar2(10),
income_amt number(6)
);
insert into income (i_id, base_id, income_code, income_amt) values (1, 1, 'I0001', 10);
insert into income (i_id, base_id, income_code, income_amt) values (2, 1, 'I0002', 20);
insert into income (i_id, base_id, income_code, income_amt) values (3, 1, 'I0003', 30);
Result:
ID REFERENCE REFERENCE_NAME EID BASE_ID EXPENSE_CODE EXPENSE_AMOUNT
-- --------- -------------- --- ------- ------------ --------------
2 10,001 BBBB 4 2 I0004 101

Select rows into columns and show a flag in the column

Trying to get an output like the below:
| UserFullName | JAVA | DOTNET | C | HTML5 |
|--------------|--------|--------|--------|--------|
| Anne San | | | | |
| John Khruf | 1 | 1 | | 1 |
| Mary Jane | 1 | | | 1 |
| George Mich | | | | |
This shows the roles of a person. A person could have 0 or N roles. When a person has a role, I am showing a flag, like '1'.
Actually I have 2 blocks of code:
Block #1: The tables and a simple output which generates more than one row per person.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE AvailableRoles
(
id int identity primary key,
CodeID varchar(5),
Description varchar(500),
);
INSERT INTO AvailableRoles
(CodeID, Description)
VALUES
('1', 'JAVA'),
('2', 'DOTNET'),
('3', 'C'),
('4', 'HTML5');
CREATE TABLE PersonalRoles
(
id int identity primary key,
UserID varchar(100),
RoleID varchar(5),
);
INSERT INTO PersonalRoles
(UserID, RoleID)
VALUES
('John.Khruf', '1'),
('John.Khruf', '2'),
('Mary.Jane', '1'),
('Mary.Jane', '4'),
('John.Khruf', '4');
CREATE TABLE Users
(
UserID varchar(20),
EmployeeType varchar(1),
EmployeeStatus varchar(1),
UserFullName varchar(500),
);
INSERT INTO Users
(UserID, EmployeeType, EmployeeStatus, UserFullName)
VALUES
('John.Khruf', 'E', 'A', 'John Khruf'),
('Mary.Jane', 'E', 'A', 'Mary Jane'),
('Anne.San', 'E', 'A', 'Anne San'),
('George.Mich', 'T', 'A', 'George Mich');
Query 1:
SELECT
A.UserFullName,
B.RoleID
FROM
Users A
LEFT JOIN PersonalRoles B ON B.UserID = A.UserID
WHERE
A.EmployeeStatus = 'A'
ORDER BY
A.EmployeeType ASC,
A.UserFullName ASC
Results:
| UserFullName | RoleID |
|--------------|--------|
| Anne San | (null) |
| John Khruf | 1 |
| John Khruf | 2 |
| John Khruf | 4 |
| Mary Jane | 1 |
| Mary Jane | 4 |
| George Mich | (null) |
Block #2: An attempt to convert the rows into columns to be used in the final result
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE AvailableRoles
(
id int identity primary key,
CodeID varchar(5),
Description varchar(500),
);
INSERT INTO AvailableRoles
(CodeID, Description)
VALUES
('1', 'JAVA'),
('2', 'DOTNET'),
('3', 'C'),
('4', 'HTML5');
Query 1:
SELECT
*
FROM
(
SELECT CodeID, Description
FROM AvailableRoles
) d
PIVOT
(
MAX(CodeID)
FOR Description IN (Java, DOTNET, C, HTML5)
) piv
Results:
| Java | DOTNET | C | HTML5 |
|--------|--------|-------|--------|
| 1 | 2 | 3 | 4 |
Any help in mixing both blocks to show the top output will be welcome. Thanks.
Another option without PIVOT operator is:
select u.UserFullName,
max(case when a.CodeID='1' then '1' else '' end) JAVA,
max(case when a.CodeID='2' then '1' else '' end) DOTNET,
max(case when a.CodeID='3' then '1' else '' end) C,
max(case when a.CodeID='4' then '1' else '' end) HTML5
from
Users u
LEFT JOIN PersonalRoles p on (u.UserID = p.UserID)
LEFT JOIN AvailableRoles a on (p.RoleID = a.CodeID)
group by u.UserFullName
order by u.UserFullName
SQLFiddle: http://sqlfiddle.com/#!3/630c3/19
You can try this.
SELECT *
FROM
(
select u.userfullname,
case when p.roleid is not null then 1 end as roleid,
a.description
from users u
left join personalroles p
on p.userid = u.userid
left join availableroles a
on a.codeid = p.roleid
) d
PIVOT
(
MAX(roleID)
FOR Description IN (Java, DOTNET, C, HTML5)
) piv
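A variation on this PIVOT, sketched under the assumption that the EmployeeStatus filter and the ordering from Query 1 in the question should carry over to the final output. The alias RoleFlag and the explicit column list are illustrative additions (SQL Server syntax):

```sql
SELECT piv.UserFullName, piv.[JAVA], piv.[DOTNET], piv.[C], piv.[HTML5]
FROM
(
    SELECT u.UserFullName,
           u.EmployeeType,                                    -- kept only for sorting
           CASE WHEN p.RoleID IS NOT NULL THEN 1 END AS RoleFlag,  -- hypothetical alias
           a.Description
    FROM Users u
    LEFT JOIN PersonalRoles p ON p.UserID = u.UserID
    LEFT JOIN AvailableRoles a ON a.CodeID = p.RoleID
    WHERE u.EmployeeStatus = 'A'
) d
PIVOT
(
    MAX(RoleFlag)
    FOR Description IN ([JAVA], [DOTNET], [C], [HTML5])
) piv
ORDER BY piv.EmployeeType, piv.UserFullName;
```

PIVOT groups by every source column that is neither aggregated nor pivoted, here UserFullName and EmployeeType, so users without any role still get one row with NULL in every role column.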

Use DISTINCT ON with empty n:n relations

I'm a new user of PostgreSQL, trying to use DISTINCT ON but I can't reach my goal.
Here's a brief sketch of my database:
files with versioning
fields with model (for form generation purpose)
n:n relations between files' versions and fields
I would like to retrieve a whole set of fields for a specified file's version.
My problem is that we could (and will) have empty values, i.e. missing FileVersion_Field relations. I'll try to give you an example below:
FileVersion
+----------------+---------+---------+
| id_fileversion | id_file | version |
+----------------+---------+---------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
+----------------+---------+---------+
Field
+----------+-------+---------------+
| id_field | value | id_fieldmodel |
+----------+-------+---------------+
| 1 | Smith | 1 |
| 2 | 20 | 2 |
| 3 | 25 | 2 |
+----------+-------+---------------+
FileVersion_Field
+----------------+----------+
| id_fileversion | id_field |
+----------------+----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
+----------------+----------+
FieldModel
+---------------+------+
| id_fieldmodel | type |
+---------------+------+
| 1 | Name |
| 2 | Age |
+---------------+------+
In this example, I would like to get these results:
-- id_file=1 & version=1
Name | Smith
Age | 20
-- id_file=1 & version=2
Name |
Age | 25
Here's what I've tried, which doesn't work:
SELECT DISTINCT ON(FieldModel.id_fieldmodel) *
FROM File
LEFT JOIN FileVersion ON File.id_file = FileVersion.id_file
LEFT JOIN FileVersion_Field ON FileVersion.id_fileversion = FileVersion_Field.id_fileversion
LEFT JOIN Field ON FileVersion_Field.id_field = Field.id_field
RIGHT JOIN FieldModel ON (Field.id_fieldmodel = FieldModel.id_fieldmodel OR FieldModel.id_fieldmodel IS NULL)
WHERE (FieldModel.id_fieldmodel IS NOT NULL AND FileVersion.version = 2 AND File.id_file = 1)
OR (Field.id_fieldmodel IS NULL)
ORDER BY FieldModel.id_fieldmodel;
-- Sample Structure
CREATE TABLE File (
id_file integer PRIMARY KEY);
CREATE TABLE FieldModel (
id_fieldmodel integer PRIMARY KEY, type varchar(50));
CREATE TABLE FileVersion (
id_fileversion integer PRIMARY KEY,
id_file integer, version integer,
CONSTRAINT fk_fileversion_file FOREIGN KEY(id_file) REFERENCES File(id_file));
CREATE TABLE Field (
id_field integer PRIMARY KEY,
id_fieldmodel integer,
value varchar(255),
CONSTRAINT fk_field_fieldmodel FOREIGN KEY(id_fieldmodel) REFERENCES FieldModel(id_fieldmodel));
CREATE TABLE FileVersion_Field (
id_fileversion integer,
id_field integer,
PRIMARY KEY(id_fileversion, id_field),
CONSTRAINT fk_fileversionfield_fileversion FOREIGN KEY(id_fileversion) REFERENCES FileVersion(id_fileversion),
CONSTRAINT fk_fileversionfield_field FOREIGN KEY(id_field) REFERENCES Field(id_field));
-- Sample Data
INSERT INTO File (id_file) VALUES (1);
INSERT INTO FileVersion (id_fileversion, id_file, version) VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO FieldModel (id_fieldmodel, type) VALUES (1, 'Name'), (2, 'Age');
INSERT INTO Field (id_field, id_fieldmodel, value) VALUES (1, 1, 'Smith'), (2, 2, '20'), (3, 2, '25');
INSERT INTO FileVersion_Field (id_fileversion, id_field) VALUES (1, 1), (1, 2), (2, 3);
7 years later, time to exorcize my daemons!
I just needed to change my way of thinking.
First, we need the list of all used FieldModel for a File, whatever the version:
SELECT DISTINCT(fm.id_fieldmodel), fm.type
FROM FieldModel fm
LEFT JOIN Field f ON fm.id_fieldmodel = f.id_fieldmodel
LEFT JOIN FileVersion_Field fvf ON f.id_field = fvf.id_field
LEFT JOIN FileVersion fv ON fv.id_fileversion = fvf.id_fileversion
WHERE fv.id_file = 1;
-- id_fieldmodel | type
-- ---------------+------
-- 1 | Name
-- 2 | Age
Now, we need the list of Field for the same File, but this time with a specified version:
SELECT f.id_fieldmodel, f.value
FROM FileVersion_Field fvv
JOIN FileVersion fv ON fv.id_fileversion = fvv.id_fileversion
JOIN Field f ON f.id_field = fvv.id_field
WHERE fv.id_file = 1 AND fv.version = 2;
-- id_fieldmodel | value
-- ---------------+-------
-- 2 | 25
All that remains is to use a LEFT JOIN on both computed tables, by allowing NULL values in the fields:
SELECT fm.type, f.value
FROM (
SELECT DISTINCT(fm.id_fieldmodel), fm.type
FROM FieldModel fm
LEFT JOIN Field f ON fm.id_fieldmodel = f.id_fieldmodel
LEFT JOIN FileVersion_Field fvf ON f.id_field = fvf.id_field
LEFT JOIN FileVersion fv ON fv.id_fileversion = fvf.id_fileversion
WHERE fv.id_file = 1
) fm
LEFT JOIN (
SELECT f.id_fieldmodel, f.value
FROM FileVersion_Field fvv
JOIN FileVersion fv ON fv.id_fileversion = fvv.id_fileversion
JOIN Field f ON f.id_field = fvv.id_field
WHERE fv.id_file = 1 AND fv.version = 2
) f ON (f.id_fieldmodel = fm.id_fieldmodel OR f.id_fieldmodel IS NULL);
-- type | value
-- ------+-------
-- Name |
-- Age | 25