HSQLDB is not accepting UNION in query

I'm using HSQLDB (v2.3.6) and my query is:
select id, name, phone
from person p, address a
where a.id = p.id
order by id LIMIT 10
union
select id, name, phone
from old_person op, old_address oa
where op.id = oa.id
order by id LIMIT 10
But the above query throws an error:
Caused by: org.hsqldb.HsqlException: unexpected token: UNION
I'm not sure why this is an issue.

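Your query is intended to get 10 rows from the person table and 10 rows from the old_person table before merging the two lists with UNION. Without parentheses, ORDER BY and LIMIT apply to the whole UNION and may only appear once, at the very end, which is why the parser rejects the UNION token after the first LIMIT. To limit each part separately, you need to use parentheses around each SELECT query.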
(select id, name, phone
from person p, address a
where a.id = p.id
order by id LIMIT 10)
union
(select id, name, phone
from old_person op, old_address oa
where op.id = oa.id
order by id LIMIT 10)
If you remove the first LIMIT and keep the last one, you may get fewer rows (which could also be different rows) as the total limit is reduced from 20 to 10.
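To see the difference between a per-branch limit and a single limit on the whole UNION, here is a small sketch using Python's built-in sqlite3 module rather than HSQLDB. The tables and data are made up for illustration. Note that SQLite does not accept parentheses around the branches of a UNION, so the sketch wraps each branch in a derived table instead, which has the same effect of limiting each part separately.

```python
import sqlite3

# Hypothetical, simplified tables: 15 rows each, ids do not overlap.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER, name TEXT);
    CREATE TABLE old_person (id INTEGER, name TEXT);
""")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(i, f"new{i}") for i in range(1, 16)])
conn.executemany("INSERT INTO old_person VALUES (?, ?)",
                 [(i, f"old{i}") for i in range(101, 116)])

# Limit applied to each branch before the UNION (derived-table form).
per_branch = conn.execute("""
    SELECT * FROM (SELECT id, name FROM person ORDER BY id LIMIT 10)
    UNION
    SELECT * FROM (SELECT id, name FROM old_person ORDER BY id LIMIT 10)
""").fetchall()

# A single ORDER BY / LIMIT at the end applies to the combined result.
whole_union = conn.execute("""
    SELECT id, name FROM person
    UNION
    SELECT id, name FROM old_person
    ORDER BY id LIMIT 10
""").fetchall()

print(len(per_branch), len(whole_union))
```

With this data the first query returns rows from both tables, while the second returns only the 10 lowest ids overall, all of which come from person.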

How to use Array_agg without returning the same values in different Order?

When using array_agg, it returns the same values in different orders. I tried using DISTINCT in a few places and it didn't work. I tried using an ORDER BY before and after the array, and it would either fail or not properly exclude results.
I am trying to find all fields in the Fieldname column that share the same time and same ID and put them into an array.
Columns are Fieldname, ID, Time
select b.Field, count(*)
from (select Time, ID, array_agg(fieldname) as Field
from a
group by 1,2
order by 3) b
group by b.field
order by 1 desc
This produces duplicate results.
For example, I will have:
Field Name    Count
Ghost,Mark    1234
Mark,Ghost    1234
I also tried the query below, where I add a subquery that first orders the fields alphabetically when grouping by time and ID, but it failed to execute. I think that is because array_agg is not in the root query?
select a.Field, count(*)
from
(select Time, ID, array_agg(fieldname) as field
from
(select Time, ID, fieldname
from a
group by 1,2
order by 3 desc) a
group by 1,2 ) b
group by 1
order by 2 desc
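Not a database-specific answer, but the underlying issue can be sketched in plain Python: two groups holding the same names in a different order count as different arrays unless the names are sorted inside each group before counting. The sample rows below are invented. In databases that allow ORDER BY inside the aggregate (PostgreSQL does, as array_agg(fieldname ORDER BY fieldname)), that is the usual way to get the same normalisation; whether your database supports it is an assumption to check.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (fieldname, id, time)
rows = [
    ("Ghost", 1, "10:00"), ("Mark", 1, "10:00"),
    ("Mark", 2, "10:05"), ("Ghost", 2, "10:05"),
    ("Mark", 3, "10:10"),
]

# Equivalent of GROUP BY time, id with array_agg(fieldname).
groups = defaultdict(list)
for fieldname, row_id, time in rows:
    groups[(time, row_id)].append(fieldname)

# Counting the arrays as they arrive: order differences create "duplicates".
unsorted_counts = Counter(tuple(names) for names in groups.values())

# Sorting inside each group first makes equal sets compare equal.
sorted_counts = Counter(tuple(sorted(names)) for names in groups.values())

print(unsorted_counts)
print(sorted_counts)
```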

SQL Optimization - using results from subqueries [Clickhouse]

Query aims: I would like to group by several columns, keeping only the top 5 representatives at each level. For example, I get the 5 most common items in the whole table, then for each item the 5 most common users, then for each distinct item-user pair the 5 most common values, and so on. The maximum number of distinct values at each level is therefore 5, 5^2, 5^3, 5^4, ... (see LIMIT BY: https://clickhouse.com/docs/en/sql-reference/statements/select/limit-by/). Without limiting the groups, it is basically this simple query:
SELECT toStartOfDay("timestamp") AS "dt_timestamp",
item,
user,
value,
Count() as cnt
from base_table
GROUP BY dt_timestamp,
item,
user,
value
ORDER BY dt_timestamp asc,
cnt desc
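To make the intended result concrete, here is a rough sketch of the hierarchical top-N selection in plain Python (not ClickHouse, and ignoring the day grouping). The function name and sample data are made up. Each level only looks at rows that survived the previous level, which is the kind of reuse asked about further down.

```python
from collections import Counter

def hierarchical_top_n(rows, n):
    """rows: iterable of (item, user, value) tuples; returns the kept triples."""
    rows = list(rows)

    # Level 1: the n most common items overall.
    item_counts = Counter(item for item, _, _ in rows)
    top_items = {item for item, _ in item_counts.most_common(n)}

    # Level 2: the n most common users within each kept item.
    user_counts = {}
    for item, user, _ in rows:
        if item in top_items:
            user_counts.setdefault(item, Counter())[user] += 1
    top_pairs = {(item, user)
                 for item, counts in user_counts.items()
                 for user, _ in counts.most_common(n)}

    # Level 3: the n most common values within each kept (item, user) pair.
    value_counts = {}
    for item, user, value in rows:
        if (item, user) in top_pairs:
            value_counts.setdefault((item, user), Counter())[value] += 1
    return {(item, user, value)
            for (item, user), counts in value_counts.items()
            for value, _ in counts.most_common(n)}

sample = [
    ("a", "u1", "x"), ("a", "u1", "x"), ("a", "u1", "y"),
    ("a", "u2", "x"), ("b", "u3", "z"),
]
kept = hierarchical_top_n(sample, 1)
print(kept)
```

The result can never hold more than n^3 triples, matching the 5, 5^2, 5^3 bound described above.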
I have a working query, but it is not as fast as I would like. The idea is to get the top 5 items from base_table and then, with inner joins over all columns, get the result:
SELECT toStartOfDay("timestamp") AS "dt_timestamp",
item,
user,
value,
Count() as cnt
FROM (
SELECT item,
user,
value,
Count() as cnt
FROM(
SELECT item,
user,
Count() as cnt
FROM (
SELECT item,
Count() as cnt
FROM base_table
GROUP BY item
ORDER BY cnt desc
limit 5
) as q
INNER JOIN base_table on base_table.item = q.item
GROUP BY item,
user
ORDER BY cnt desc
limit 5 by item
) as qq
INNER JOIN base_table on base_table.item = qq.item
AND base_table.user = qq.user
GROUP BY item,
user,
value
ORDER BY cnt desc,
item desc
limit 5 by item, user
) as qqq
INNER JOIN base_table on base_table.item = qqq.item
AND base_table.user = qqq.user
AND base_table.value = qqq.value
GROUP BY dt_timestamp,
item,
user,
value
ORDER BY dt_timestamp asc,
cnt desc
NOTE: LIMIT 5 BY column behaves differently from LIMIT, but that is not really relevant to this question.
Issues and room for optimization:
1) I would like to reuse the results of the subqueries (q, qq), as they already contain the extracted items and item-user pairs. So in qq I would use the result of q, and in qqq I would use the result of q or qq.
2) Is it possible to do the inner join not on the whole base_table but on a reduced one? That is, q would use the whole base_table, qq would use base_table without the unwanted items, and qqq would use base_table without the unwanted items and users.
I tried to do 1) and 2) with WITH ... AS, but it is not very efficient because the subquery is run again every time it is referenced.
If you have any idea how to optimize this query, it would be much appreciated.

How to do this query to select N rows with highest numbers ordered by col

Let's say I have a table that has three columns: ID, Name and Users.
I want to select the 3 rows with the highest number of users, ordered by Name ascending. How can I achieve that?
I used
select Name from Tables where ID IN (select ID from Tables order by Users desc limit 3)
But IN/ANY are not supported. Any other ways?
Thanks
When subqueries are allowed, you could use this.
It fetches the 3 records with highest value for column users. These 3 results will be ordered in the outer query.
select Name from
(
select Name
from Tables
order by Users desc
limit 3
) as temp
ORDER BY Name ASC
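As a quick check of the derived-table approach, here is a minimal sketch run against an in-memory SQLite database from Python. The sample rows are invented, and SQLite is only a stand-in for whatever database you are using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tables (ID INTEGER, Name TEXT, Users INTEGER)")
conn.executemany(
    "INSERT INTO Tables VALUES (?, ?, ?)",
    [(1, "delta", 50), (2, "alpha", 10), (3, "charlie", 70),
     (4, "bravo", 90), (5, "echo", 5)],
)

# Inner query picks the 3 rows with the most users;
# outer query re-sorts just those 3 rows by name.
names = [row[0] for row in conn.execute("""
    SELECT Name FROM (
        SELECT Name
        FROM Tables
        ORDER BY Users DESC
        LIMIT 3
    ) AS temp
    ORDER BY Name ASC
""")]

print(names)
```

Here "alpha" sorts first alphabetically but is not in the output, because it is not among the top 3 by Users.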
In MySQL:
SELECT id, name, users
FROM (SELECT id,name,users FROM tablename ORDER BY users DESC LIMIT 3) as a
ORDER BY name;
In SQL Server:
SELECT id, name, users
FROM (SELECT TOP 3 id,name,users FROM tablename ORDER BY users DESC ) as a
ORDER BY name;
In Oracle (which does not allow AS before a table alias):
SELECT id, name, users
FROM (SELECT id,name,users FROM tablename ORDER BY users DESC ) a
WHERE ROWNUM<=3
ORDER BY name;

Adding count in select query

I am trying to write a query that also returns a count from another table. The problem is that I have no idea what to put in the WHERE clause of the count subquery. As it is now, it just returns the count of all rows in that table.
Select
ID as Num,
(select Count(*) from TASK where ID=ID(Also tried Num)) as Total
from ORDER
The goal is to have a result that reads like
Num Total
_________________
1 13
2 5
3 22
You need table aliases. So I think you want:
Select ID as Num,
(select Count(*) from TASK t where t.ID = o.ID) as Total
from ORDER o;
By the way, ORDER is a terrible name for a table because it is a reserved word in SQL.
You can do it as a subquery or a join (or with an OVER clause).
I think the join is clearest when you are first learning SQL:
Select
ORDER.ID as Num, count(TASK.ID) AS Total
from ORDER
left join TASK ON ORDER.ID=TASK.ID
GROUP BY ORDER.ID
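For comparison, a small sketch that runs both the correlated subquery and the LEFT JOIN version against an in-memory SQLite database from Python. The table is named orders here (an assumption, to sidestep the reserved word ORDER), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (ID INTEGER PRIMARY KEY);
    CREATE TABLE task (ID INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO task VALUES (1), (1), (2);
""")

# Correlated subquery: the aliases make clear which ID belongs to which table.
subquery_rows = conn.execute("""
    SELECT o.ID AS Num,
           (SELECT COUNT(*) FROM task t WHERE t.ID = o.ID) AS Total
    FROM orders o
    ORDER BY o.ID
""").fetchall()

# LEFT JOIN + GROUP BY: COUNT(t.ID) skips NULLs, so orders without tasks get 0.
join_rows = conn.execute("""
    SELECT o.ID AS Num, COUNT(t.ID) AS Total
    FROM orders o
    LEFT JOIN task t ON t.ID = o.ID
    GROUP BY o.ID
    ORDER BY o.ID
""").fetchall()

print(subquery_rows)
print(join_rows)
```

Without the aliases, a condition like ID = ID compares a column with itself and is true for every row, which is why the original query counted the whole table.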

How do I create a SQL Distinct query and add some additional fields

I have the following query that selects combinations of first and last names and shows me dupes. It works, no problems here.
I want to include three other fields for reference; Id, cUser, and cDate. These additional fields, however, should not be used to determine duplicates as I'd likely not end up with any duplicates.
SELECT * FROM
(SELECT FirstName, LastName, COUNT(*) as "Count"
FROM Contacts
WHERE ContactTypeID = 1
GROUP BY LastName,FirstName
) AS X
WHERE COUNT > 1
ORDER BY COUNT DESC
Any suggestions? Thanks!
SELECT *
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY FirstName, LastName) AS cnt
FROM Contacts
WHERE ContactTypeId = 1
) q
WHERE cnt > 1
ORDER BY
cnt DESC
This will return all fields for each of the duplicated records.
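A minimal sketch of the window-function approach, run through Python's sqlite3 module. This assumes the bundled SQLite is 3.25 or newer (window functions were added in that release); the sample contacts are invented and the table is cut down to a few columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (
        Id INTEGER, FirstName TEXT, LastName TEXT, ContactTypeID INTEGER
    );
    INSERT INTO Contacts VALUES
        (1, 'Ann', 'Lee', 1),
        (2, 'Ann', 'Lee', 1),
        (3, 'Bob', 'Ray', 1),
        (4, 'Ann', 'Lee', 2);
""")

# COUNT(*) OVER (PARTITION BY ...) attaches the group size to every row,
# so the extra columns (Id here) stay available without being grouped on.
window_rows = conn.execute("""
    SELECT Id, FirstName, LastName, cnt
    FROM (
        SELECT *, COUNT(*) OVER (PARTITION BY FirstName, LastName) AS cnt
        FROM Contacts
        WHERE ContactTypeID = 1
    ) q
    WHERE cnt > 1
    ORDER BY Id
""").fetchall()

print(window_rows)
```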
If these fields are always the same, then you can include them in the GROUP BY and it will not affect the detection of duplicates.
If they are not, then you must decide which aggregate function to apply to them; for example, MAX() or MIN() would work and would give you some indication of the values associated with the duplicates.
Otherwise, if you want to see all of the records, you can join back to the source:
SELECT X2.* FROM
(SELECT FirstName, LastName, COUNT(*) as "Count"
FROM Contacts
WHERE ContactTypeID = 1
GROUP BY LastName,FirstName
) AS X INNER JOIN Contacts X2 ON X.LastName = X2.LastName AND X.FirstName = X2.FirstName
WHERE COUNT > 1
ORDER BY COUNT DESC
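Here is a hedged sketch of the join-back idea using Python's sqlite3 module with invented sample data. One detail is my own addition rather than part of the query above: the joined copy of Contacts is also filtered on ContactTypeID = 1, since otherwise contacts of other types that share a duplicated name would come back too. The count column is named cnt to avoid clashing with the COUNT keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (
        Id INTEGER, FirstName TEXT, LastName TEXT, ContactTypeID INTEGER
    );
    INSERT INTO Contacts VALUES
        (1, 'Ann', 'Lee', 1),
        (2, 'Ann', 'Lee', 1),
        (3, 'Bob', 'Ray', 1),
        (4, 'Ann', 'Lee', 2);
""")

# X finds the duplicated names; joining back to Contacts (X2) recovers
# the per-row columns such as Id for each duplicated record.
joined_rows = conn.execute("""
    SELECT X2.Id, X2.FirstName, X2.LastName, X.cnt
    FROM (
        SELECT FirstName, LastName, COUNT(*) AS cnt
        FROM Contacts
        WHERE ContactTypeID = 1
        GROUP BY LastName, FirstName
    ) AS X
    INNER JOIN Contacts X2
        ON X.LastName = X2.LastName
       AND X.FirstName = X2.FirstName
       AND X2.ContactTypeID = 1   -- assumption: keep the same type filter
    WHERE X.cnt > 1
    ORDER BY X2.Id
""").fetchall()

print(joined_rows)
```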