H2 seems to misinterpret a valid join clause - sql

Here is a simple test database schema. There is really nothing special about it. I am using H2 version 1.4.200 in Oracle compatibility mode.
create table STUFF (
ID number(19) generated by default as identity (start with 1 increment by 1),
NAME varchar2(128) not null,
constraint PK_STUFF primary key (ID),
constraint BK_STUFF unique (NAME)
);
create table STUFF_DETAILS (
ID number(19) generated by default as identity (start with 1 increment by 1),
BLAH varchar2(128) not null,
constraint PK_STUFF_DETAILS primary key (ID)
);
create table STUFF_MORE_DETAILS (
ID number(19) generated by default as identity (start with 1 increment by 1),
BLAH_BLAH varchar2(128) not null,
constraint PK_STUFF_MORE_DETAILS primary key (ID)
);
Here's a view definition that works fine. No objection from H2.
create or replace view V_STUFF1
(
ID,
NAME,
BLAH,
BLAH_BLAH
)
as select
S.ID,
S.NAME,
SD.BLAH,
SMD.BLAH_BLAH
from
STUFF S
inner join STUFF_DETAILS SD
inner join STUFF_MORE_DETAILS SMD
on SD.ID = SMD.ID
on S.ID = SD.ID
;
Here's a view definition that H2 chokes on with the following error message:
Caused by: org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "SD.ID" not found
create or replace view V_STUFF2
(
ID,
NAME,
BLAH,
BLAH_BLAH
)
as select
S.ID,
S.NAME,
SD.BLAH,
SMD.BLAH_BLAH
from
STUFF S
inner join STUFF_DETAILS SD
left outer join STUFF_MORE_DETAILS SMD
on SD.ID = SMD.ID
on S.ID = SD.ID
;
The only difference is the type of the join (left outer vs. inner), but I fail to see why this should make a difference with regard to the visibility of the SD.ID column.
To me this looks like a defect in H2, but before I raise an issue with the H2 project I want to make sure I am not missing something obvious or doing something stupid.
PS: I am aware that I can rewrite the view definition to make H2 accept it, but ideally I would like to keep the SQL code as close to the original as possible. It is a migration project.
PPS: Oracle (and DB2) have no trouble with either view definition, so the issue appears to be H2-specific.

One way to sidestep the error is to have each join predicate directly follow the name/alias of the table that is being joined.
By reordering the ON clauses, the view could take the form:
create or replace view V_STUFF2
(
ID,
NAME,
BLAH,
BLAH_BLAH
)
as select
S.ID,
S.NAME,
SD.BLAH,
SMD.BLAH_BLAH
from STUFF S
inner join STUFF_DETAILS SD on S.ID = SD.ID
left outer join STUFF_MORE_DETAILS SMD on SD.ID = SMD.ID
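As a rough check of what the reordered query returns, here is a minimal sketch. It runs the same select on SQLite through Python's sqlite3 module, not on H2 or Oracle (an assumption made only to keep the example self-contained). The column types are simplified and the sample rows are invented.

```python
import sqlite3

def stuff_rows():
    """Run the reordered join on a tiny in-memory data set."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table STUFF (ID integer primary key, NAME text not null unique);
        create table STUFF_DETAILS (ID integer primary key, BLAH text not null);
        create table STUFF_MORE_DETAILS (ID integer primary key, BLAH_BLAH text not null);
        -- invented sample rows: 3 has no details, 2 has no "more details"
        insert into STUFF values (1, 'a'), (2, 'b'), (3, 'c');
        insert into STUFF_DETAILS values (1, 'a-details'), (2, 'b-details');
        insert into STUFF_MORE_DETAILS values (1, 'a-more');
    """)
    rows = con.execute("""
        select S.ID, S.NAME, SD.BLAH, SMD.BLAH_BLAH
        from STUFF S
        inner join STUFF_DETAILS SD on S.ID = SD.ID
        left outer join STUFF_MORE_DETAILS SMD on SD.ID = SMD.ID
        order by S.ID
    """).fetchall()
    con.close()
    return rows

for row in stuff_rows():
    print(row)
```

Row 3 drops out because of the inner join, while row 2 is kept with a NULL BLAH_BLAH because of the left outer join.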

The issue has been acknowledged as a defect [1] by the H2 developers and resolved with this PR [2].
[1] https://github.com/h2database/h2database/issues/3311
[2] https://github.com/h2database/h2database/pull/3312

Related

How does an inner join work on two foreign keys from a single table?

I am working on a bus route management system. I made two tables, the first one is Cities and the second one is routes, with the following statements:
CREATE TABLE Cities
(
ID NUMBER GENERATED ALWAYS AS IDENTITY(START with 1 INCREMENT by 1) PRIMARY KEY,
Name Varchar(30) not null
);
CREATE TABLE routes
(
ID NUMBER GENERATED ALWAYS AS IDENTITY(START with 1 INCREMENT by 1) PRIMARY KEY,
Name Varchar(30) not null,
from NUMBER not null,
to NUMBER NOT NULL,
CONSTRAINT FROM_id_FK FOREIGN KEY(from) REFERENCES Cities(ID),
CONSTRAINT TO_id_FK FOREIGN KEY(to) REFERENCES Cities(ID)
);
I am joining the tables with an inner join:
select CITIES.Name
from CITIES
inner join ROUTES on CITIES.ID=ROUTES.ID
but it shows a single column:
Name
-----------
whereas I want the result as:
from | to
------------------------
What is a possible way to do this using an inner join?
I suspect you need something like the following:
select r.Name, cs.Name SourceCity, cd.Name DestinationCity
from routes r
join cities cs on cs.id = r.from
join cities cd on cd.id = r.to
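To see the double join at work, here is a small sketch on SQLite via Python's sqlite3 module (an assumption; the question looks like Oracle). The foreign-key columns are renamed to from_city_id and to_city_id because FROM and TO are reserved words, and the city and route names are made up.

```python
import sqlite3

def route_endpoints():
    """Join cities twice, once per foreign key, to name both ends of a route."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table cities (id integer primary key, name text not null);
        create table routes (
            id integer primary key,
            name text not null,
            from_city_id integer not null references cities(id),  -- hypothetical name
            to_city_id integer not null references cities(id)     -- hypothetical name
        );
        insert into cities values (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
        insert into routes values (1, 'R1', 1, 2), (2, 'R2', 3, 1);
    """)
    rows = con.execute("""
        select r.name, cs.name as source_city, cd.name as destination_city
        from routes r
        join cities cs on cs.id = r.from_city_id
        join cities cd on cd.id = r.to_city_id
        order by r.id
    """).fetchall()
    con.close()
    return rows

for row in route_endpoints():
    print(row)
```

Each alias (cs, cd) is an independent copy of cities, so one route row picks up two different city names.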
Hope this works for you:
select CITIES.Name,ROUTES.from,ROUTES.to
from CITIES inner join ROUTES on CITIES.ID=ROUTES.ID

Group by on non id field

I have the following setup of tables:
CREATE TABLE public.tags (
tag_id int4 NOT NULL,
creation_timestamp timestamp NULL,
"name" varchar(255) NULL,
CONSTRAINT tags_pkey PRIMARY KEY (tag_id)
);
-- public.tag_targets definition
-- Drop table
-- DROP TABLE public.tag_targets;
CREATE TABLE public.tag_targets (
id int4 NOT NULL,
creation_timestamp timestamp NULL,
target_id int8 NULL,
target_name varchar(255) NULL,
last_update_timestamp timestamp NULL,
tag_id int4 NULL,
CONSTRAINT tag_targets_pkey PRIMARY KEY (id),
CONSTRAINT fkcesi55mqvysjv63c1xf2j15oh FOREIGN KEY (tag_id) REFERENCES tags(tag_id)
);
I am trying to run the following query:
SELECT *
FROM tag_targets tt, tags t
WHERE tt.tag_id = t.tag_id
AND (t."name" IN ('Keeper', 'Pk'))
GROUP by tt.target_id
However it wants the PK of both Tags and Tagtarget in the group by:
ERROR: column "tt.id" must appear in the GROUP BY clause or be used in an aggregate function
Is there any way to group on the target_id column? Also, feel free to give any feedback on the table design, as I went for a generic mapping table and an independent tags table.
The problem is that you are requesting SELECT * but in GROUP BY you specified only tt.target_id. Generally speaking, every column in the SELECT list must either appear in GROUP BY or be used in an aggregate function. Oversimplifying: your database doesn't know what to do with the values you requested in the select that weren't used in GROUP BY or in any aggregate.
Try running the following query to see if you are getting something:
SELECT tt.target_id, count(*)
FROM tag_targets tt, tags t
WHERE tt.tag_id = t.tag_id
AND (t."name" IN ('Keeper', 'Pk'))
GROUP by tt.target_id
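For illustration, here is the same aggregate query on invented data, run on SQLite through Python's sqlite3 module rather than PostgreSQL (an assumption; SQLite is laxer about GROUP BY, so it will not reproduce the original error). The tables are cut down to the columns the query touches.

```python
import sqlite3

def tag_counts():
    """Count matching tag rows per target_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table tags (tag_id integer primary key, name text);
        create table tag_targets (
            id integer primary key,
            target_id integer,
            tag_id integer references tags(tag_id)
        );
        -- invented sample rows
        insert into tags values (1, 'Keeper'), (2, 'Pk'), (3, 'Other');
        insert into tag_targets values (1, 100, 1), (2, 100, 2), (3, 200, 1), (4, 300, 3);
    """)
    rows = con.execute("""
        select tt.target_id, count(*)
        from tag_targets tt, tags t
        where tt.tag_id = t.tag_id
        and t.name in ('Keeper', 'Pk')
        group by tt.target_id
        order by tt.target_id
    """).fetchall()
    con.close()
    return rows

print(tag_counts())
```

Only the grouped column and the aggregate are selected, so there is exactly one well-defined value per output column and group.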
Unrelated but your syntax of table1, table2 with the join in the "where" clause is the non-ANSI syntax. It's not wrong or anything, but the ANSI syntax of explicit joins is preferred for a litany of reasons I won't go into:
SELECT *
FROM
tag_targets tt
join tags t on
tt.tag_id = t.tag_id
where
t."name" IN ('Keeper', 'Pk')
On the surface, when you say group I am wondering if you mean "sort..." I am assuming you are new to SQL, so if that's an oversimplification, forgive me, but this would be perhaps what you wanted -- an "order by" instead of a group by.
SELECT *
FROM
tag_targets tt
join tags t on
tt.tag_id = t.tag_id
where
t."name" IN ('Keeper', 'Pk')
order by
tt.target_id
If, on the other hand, you only wanted a single record for each target_id (which is truly a "group by target_id"), then perhaps this is what you wanted: one record per target_id. In that case you have to decide how to prioritize which row is selected. In this example, I pick the one with the most recent update timestamp:
SELECT distinct on (tt.target_id)
*
FROM
tag_targets tt
join tags t on
tt.tag_id = t.tag_id
where
t."name" IN ('Keeper', 'Pk')
order by
tt.target_id, tt.last_update_timestamp desc
Not confident on either of these suggestions, so if they miss the mark, post some sample data and expected results.
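DISTINCT ON is PostgreSQL-specific. As a sketch of the same "one row per target_id, most recently updated first" idea, the example below swaps in ROW_NUMBER() and runs on SQLite through Python's sqlite3 module (both are assumptions, not what the answer above uses). It needs SQLite 3.25 or later for window functions, and the rows are invented.

```python
import sqlite3

def latest_row_per_target():
    """Keep only the most recently updated row for each target_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table tags (tag_id integer primary key, name text);
        create table tag_targets (
            id integer primary key,
            target_id integer,
            last_update_timestamp text,
            tag_id integer references tags(tag_id)
        );
        insert into tags values (1, 'Keeper'), (2, 'Pk');
        insert into tag_targets values
            (1, 100, '2021-01-01 00:00:00', 1),
            (2, 100, '2021-02-01 00:00:00', 2),
            (3, 200, '2021-01-15 00:00:00', 1);
    """)
    rows = con.execute("""
        select target_id, id, name
        from (
            select tt.target_id, tt.id, t.name,
                   row_number() over (
                       partition by tt.target_id
                       order by tt.last_update_timestamp desc
                   ) as rn
            from tag_targets tt
            join tags t on tt.tag_id = t.tag_id
            where t.name in ('Keeper', 'Pk')
        ) as ranked
        where rn = 1
        order by target_id
    """).fetchall()
    con.close()
    return rows

print(latest_row_per_target())
```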

How to select from multiple tables in a group by query?

I have some database tables containing some documents that people need to sign. The tables are defined (somewhat simplified) as follows.
create table agreement (
id integer NOT NULL,
name character varying(50) NOT NULL,
org_id integer NOT NULL,
CONSTRAINT agreement_pkey PRIMARY KEY (id),
CONSTRAINT org FOREIGN KEY (org_id) REFERENCES org (id) MATCH SIMPLE
);
create table version (
id integer NOT NULL,
content text NOT NULL,
publish_date timestamp NOT NULL,
agreement_id integer NOT NULL,
CONSTRAINT version_pkey PRIMARY KEY (id),
CONSTRAINT agr FOREIGN KEY (agreement_id) REFERENCES agreement (id) MATCH SIMPLE
);
I skipped the org table, to reduce clutter. I have been trying to write a query that would give me all the right agreement information for a given org. So far, I can do
SELECT a.id, a.name FROM agreement AS a
JOIN version as v ON (a.id = v.agreement_id)
JOIN org as o ON (o.id = a.org_id)
WHERE o.name = $1
GROUP BY a.id
This seems to give me a single record for each agreement that belongs to the org I want and has at least one version. But I need to also include content and date published of the latest version available. How do I do that?
Also, I have a separate table called signatures that links to a user and a version. If possible, I would like to extend this query to only include agreements where a given user didn't yet sign the latest version.
Edit: reflected the need for the org join, since I select orgs by name rather than by id
You can use a correlated subquery:
SELECT a.id, a.name, v.*
FROM agreement a JOIN
version v
ON a.id = v.agreement_id
WHERE a.org_id = $1 AND
v.publish_date = (SELECT MAX(v2.publish_date) FROM version v2 WHERE v2.agreement_id = v.agreement_id);
Notes:
The org table is not needed because agreement has an org_id.
No aggregation is needed for this query. You are filtering for the most recent record.
The correlated subquery is one method that retrieves the most recent version.
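Here is a rough, self-contained run of the correlated-subquery approach. It uses SQLite through Python's sqlite3 module in place of PostgreSQL (an assumption), a ? placeholder instead of $1, and invented agreements and versions.

```python
import sqlite3

def latest_versions(org_id):
    """Return each agreement of an org together with its most recent version."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table agreement (id integer primary key, name text not null, org_id integer not null);
        create table version (
            id integer primary key,
            content text not null,
            publish_date text not null,
            agreement_id integer not null references agreement(id)
        );
        -- invented sample rows
        insert into agreement values (1, 'NDA', 10), (2, 'Terms', 10), (3, 'Other', 20);
        insert into version values
            (1, 'nda v1', '2020-01-01', 1),
            (2, 'nda v2', '2021-01-01', 1),
            (3, 'terms v1', '2020-06-01', 2),
            (4, 'other v1', '2020-03-01', 3);
    """)
    rows = con.execute("""
        select a.id, a.name, v.content, v.publish_date
        from agreement a
        join version v on a.id = v.agreement_id
        where a.org_id = ?
          and v.publish_date = (select max(v2.publish_date)
                                from version v2
                                where v2.agreement_id = v.agreement_id)
        order by a.id
    """, (org_id,)).fetchall()
    con.close()
    return rows

print(latest_versions(10))
```

The subquery is re-evaluated per agreement, so only the version whose publish_date equals that agreement's maximum survives the filter.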
PostgreSQL has window functions.
Window functions let you order rows within a partition defined by a column or set of columns. The rank function returns each row's position within its partition for that ordering. If you filter to just the rows where the rank is 1, you get the highest-sorted row for each partition (a single row, as long as the ordering has no ties).
select u.id, u.name, u.content, u.publish_date from (
SELECT a.id, a.name, v.content, v.publish_date, rank() over (partition by a.id order by v.id desc) as pos
FROM agreement AS a
JOIN version as v ON (a.id = v.agreement_id)
JOIN org as o ON (o.id = a.org_id)
WHERE o.id = $1
) as u
where pos = 1
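A small sketch of the rank() idea on invented data, run on SQLite (3.25 or later) through Python's sqlite3 module instead of PostgreSQL (an assumption). Unlike the query above, it orders by publish_date rather than v.id and filters on a.org_id directly, which is a deliberate variation for this example.

```python
import sqlite3

def latest_versions_ranked(org_id):
    """Pick the top-ranked (most recently published) version per agreement."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table agreement (id integer primary key, name text not null, org_id integer not null);
        create table version (
            id integer primary key,
            content text not null,
            publish_date text not null,
            agreement_id integer not null references agreement(id)
        );
        -- invented sample rows
        insert into agreement values (1, 'NDA', 10), (2, 'Terms', 10), (3, 'Other', 20);
        insert into version values
            (1, 'nda v1', '2020-01-01', 1),
            (2, 'nda v2', '2021-01-01', 1),
            (3, 'terms v1', '2020-06-01', 2),
            (4, 'other v1', '2020-03-01', 3);
    """)
    rows = con.execute("""
        select u.id, u.name, u.content, u.publish_date
        from (
            select a.id, a.name, v.content, v.publish_date,
                   rank() over (partition by a.id order by v.publish_date desc) as pos
            from agreement a
            join version v on a.id = v.agreement_id
            where a.org_id = ?
        ) as u
        where u.pos = 1
        order by u.id
    """, (org_id,)).fetchall()
    con.close()
    return rows

print(latest_versions_ranked(10))
```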
SELECT a.id, a.name, max(v.publish_date) publish_date FROM agreement AS a
JOIN version as v ON (a.id = v.agreement_id)
JOIN org as o ON (o.id = a.org_id)
WHERE o.id = $1
GROUP BY a.id, a.name

How to massive update?

I have three tables:
group:
id - primary key
name - varchar
profile:
id - primary key
name - varchar
surname - varchar
[...etc...]
profile_group:
profile_id - integer, foreign key to table profile
group_id - integer, foreign key to table group
Profiles may be in many groups. I have a group named "Users" with id=1, and I want to assign all users to this group, but only if there is no such entry in profile_group yet.
How can I do it?
If I understood you correctly, you want to add entries like (profile_id, 1) to the profile_group table for all profiles that were not in this table before. If so, try this:
INSERT INTO profile_group(profile_id, group_id)
SELECT id, 1 FROM profile p
LEFT JOIN profile_group pg on (p.id=pg.profile_id)
WHERE pg.group_id IS NULL;
What you want to do is left join to the profile_group table and then exclude any matching records (this is done in the where clause of the SQL statement below).
In my experience this is faster than using NOT IN (SELECT ...), since the query optimizer seems to handle it better.
insert into profile_group (profile_id, group_id)
select p.id, 1
from profile p
left join profile_group pg on p.id = pg.profile_id
and pg.group_id = 1
where pg.profile_id is null
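To check that this anti-join insert only adds the missing memberships, here is a minimal sketch on SQLite via Python's sqlite3 module (an assumption, since the question does not name a database). The profiles and existing memberships are invented, and the group table itself is left out because it is not needed for the insert.

```python
import sqlite3

INSERT_MISSING = """
    insert into profile_group (profile_id, group_id)
    select p.id, 1
    from profile p
    left join profile_group pg on p.id = pg.profile_id
        and pg.group_id = 1
    where pg.profile_id is null
"""

def assign_all_to_users_group():
    """Add every profile to group 1 unless it is already a member."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table profile (id integer primary key, name text);
        create table profile_group (profile_id integer, group_id integer);
        insert into profile values (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        -- Ann is already in group 1, Bob is only in group 2
        insert into profile_group values (1, 1), (2, 2);
    """)
    con.execute(INSERT_MISSING)
    con.execute(INSERT_MISSING)  # second run finds nothing left to insert
    rows = con.execute(
        "select profile_id, group_id from profile_group order by profile_id, group_id"
    ).fetchall()
    con.close()
    return rows

print(assign_all_to_users_group())
```

Putting pg.group_id = 1 in the join condition rather than the where clause is what lets a profile that belongs only to other groups (Bob here) still be picked up.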

SQL Anomaly Using 'USING' Clause with Nested Queries?

I have a normalized database containing 3 tables whose DDL is this:
CREATE CACHED TABLE Clients (
cli_id INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 100) PRIMARY KEY,
defmrn_id BIGINT,
lastName VARCHAR(48) DEFAULT '' NOT NULL,
midName VARCHAR(24) DEFAULT '' NOT NULL,
firstName VARCHAR(24) DEFAULT '' NOT NULL,
doB INTEGER DEFAULT 0 NOT NULL,
gender VARCHAR(1) NOT NULL);
CREATE TABLE Client_MRNs (
mrn_id BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 100) PRIMARY KEY,
cli_id INTEGER REFERENCES Clients ( cli_id ),
inst_id INTEGER REFERENCES Institutions ( inst_id ),
mrn VARCHAR(32) DEFAULT '' NOT NULL,
CONSTRAINT climrn01 UNIQUE (mrn, inst_id));
CREATE TABLE Institutions (
inst_id INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 100) PRIMARY KEY,
loc_id INTEGER REFERENCES Locales (loc_id ),
itag VARCHAR(6) UNIQUE NOT NULL,
iname VARCHAR(80) DEFAULT '' NOT NULL);
The first table contains a foreign key column, defmrn_id, that is a reference to a "default identifier code" that is stored in the second table (which is a list of all identifier codes). A record in the first table may have many identifiers, but only one default identifier. So yeah, I have created a circular reference.
The third table is just normalized data from the second table.
I wanted a query that would find a CLIENT record based on matching a supplied identifier code to any of the identifier codes in CLIENT_MRNs that may belong to that CLIENT record.
My strategy was to first identify those records that matched in the second table (CLIENT_MRN) and then use that intermediate result to join to records in the CLIENT table that matched other user-supplied searching criteria. I also need to denormalize the identifier reference defmrn_id in the 1st table. Here is what I came up with...
SQL = SELECT c.*, r.mrn, i.inst_id, i.itag, i.iname
FROM Clients AS c
INNER JOIN
(
SELECT m.cli_id
FROM Client_MRNs AS m
WHERE m.mrn = ?
) AS m2 ON m2.cli_id = c.cli_id
INNER JOIN Client_MRNs AS r ON c.defmrn_id = r.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )
WHERE (<other user supplied search criteria...>);
The above works, but I spent some time trying to understand why the following was NOT working...
SQL = SELECT c.*, r.mrn, i.inst_id, i.itag, i.iname
FROM Clients AS c
INNER JOIN
(
SELECT m.cli_id
FROM Client_MRNs AS m
WHERE m.mrn = ?
) AS m2 USING ( cli_id )
INNER JOIN Client_MRNs AS r ON c.defmrn_id = r.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )
WHERE (<other user supplied search criteria...>);
It seems to me that the second SQL should work, but it fails on the USING clause every time. I am executing these queries against a database managed by HSQLDB 2.2.9 as the RDBMS. Is this a parsing issue in HSQLDB or is this a known limitation of the USING clause with nested queries?
You can always try with HSQLDB 2.3.0 (a release candidate).
The way you report the incomplete SQL does not allow proper checking. But there is an obvious mistake in the query. If you have:
SELECT INST_ID FROM CLIENT_MRNS AS R INNER JOIN INSTITUTIONS AS I USING (INST_ID)
INST_ID can be used in the SELECT column list only without a table qualifier. The reason is it is no longer considered a column of either table. The same is true with common columns if you use NATURAL JOIN.
This query is accepted by version 2.3.0:
SELECT c.*, r.mrn, inst_id, i.itag, i.iname
FROM Clients AS c
INNER JOIN
(
SELECT m.cli_id
FROM Client_MRNs AS m
WHERE m.mrn = 2
) AS m2 USING ( cli_id )
INNER JOIN Client_MRNs AS r ON c.defmrn_id = r.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )
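As a quick illustration of selecting a USING column without a table qualifier, here is a sketch on SQLite through Python's sqlite3 module rather than HSQLDB (an assumption). SQLite may be more permissive than HSQLDB about qualified references, so this only shows the unqualified form working. The tables are reduced to a few columns and the rows are invented.

```python
import sqlite3

def mrns_with_institutions():
    """Join on a shared column with USING and select it unqualified."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        create table institutions (inst_id integer primary key, itag text not null);
        create table client_mrns (
            mrn_id integer primary key,
            inst_id integer references institutions(inst_id),
            mrn text not null
        );
        insert into institutions values (100, 'AAA'), (101, 'BBB');
        insert into client_mrns values (1, 100, 'm-1'), (2, 101, 'm-2');
    """)
    rows = con.execute("""
        select inst_id, r.mrn, i.itag
        from client_mrns as r
        inner join institutions as i using (inst_id)
        order by r.mrn_id
    """).fetchall()
    con.close()
    return rows

print(mrns_with_institutions())
```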