Your query does not include the specified expression "ID" as part of aggregate function - sql

In MS Access I have the query below. When I try to run it, I get the error Your query does not include the specified expression "ID" as part of aggregate function, and I can't find the reason. What is the problem in my query?
SELECT
Count(t2.subjectid) AS CountOfsubjectid,
t2.pname,
(
select
max(outcometime)
from
table1 t1
where
t1.id = t2.id
)
AS showntime
FROM
table2 AS t2
WHERE
t2.outcome = "accepted"
GROUP BY
t2.pname,
t2.showntime;
UPDATE (SAMPLE DATA):
Table1:
ID outcometime pname outcome subjectid
1 20181111 USB shown Ux1ku
1 20181113 USB shown Ux1ku
2 20181115 USB shown Tsn2f
3 20181116 USB shown O93nf
2 20181114 USB shown Tsn2f
2 20181112 USB shown Tsn2f
Table2:
ID outcometime pname outcome subjectid
1 20181118 USB accepted Ux1ku
2 20181119 USB accepted Tsn2f
3 20181117 USB accepted O93nf
Desired Result:
pname showntime countofsubjectid
USB 20181113 1
USB 20181115 1
USB 20181116 1
Also updated the sample data. It was wrong.
Thanks.

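Currently, you are running a correlated subquery in the SELECT clause of an aggregate query and then referencing that subquery by its alias in the GROUP BY clause. The subquery refers to t2.id, which is neither grouped nor aggregated, so the error names "ID"; in addition, showntime is an alias, not a column of table2, so it cannot be grouped on as t2.showntime.
Consider using a derived table: run the row-level query with the subquery first, and then run the aggregation in the outer query.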
SELECT
dt.pname,
COUNT(subjectid) AS CountOfsubjectid,
dt.showntime
FROM
(SELECT
t2.subjectid,
t2.pname,
(
select
max(outcometime)
from
table1 t1
where
t1.id = t2.id
)
AS showntime
FROM
table2 AS t2
WHERE
t2.outcome = 'accepted'
) AS dt
GROUP BY
dt.pname,
dt.showntime;
However, consider avoiding the correlated subquery, which is evaluated for every row of table2 and is therefore inefficient. Instead, join to an aggregate query that calculates the MAX once per id, and then aggregate again for the COUNT at the main level.
SELECT
t2.pname,
COUNT(t2.subjectid) AS CountOfsubjectid,
agg.showntime
FROM
table2 AS t2
INNER JOIN
(
select
t1.id,
max(outcometime) as showntime
from
table1 t1
group by
t1.id
) AS agg
ON t2.id = agg.id
WHERE
t2.outcome = 'accepted'
GROUP BY
t2.pname,
agg.showntime;
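To check the join-on-aggregate approach against the sample data from the question, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Access here (an assumption made so the example is self-contained); the table and column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, outcometime INTEGER, pname TEXT, outcome TEXT, subjectid TEXT);
CREATE TABLE table2 (id INTEGER, outcometime INTEGER, pname TEXT, outcome TEXT, subjectid TEXT);
INSERT INTO table1 VALUES
  (1, 20181111, 'USB', 'shown', 'Ux1ku'),
  (1, 20181113, 'USB', 'shown', 'Ux1ku'),
  (2, 20181115, 'USB', 'shown', 'Tsn2f'),
  (3, 20181116, 'USB', 'shown', 'O93nf'),
  (2, 20181114, 'USB', 'shown', 'Tsn2f'),
  (2, 20181112, 'USB', 'shown', 'Tsn2f');
INSERT INTO table2 VALUES
  (1, 20181118, 'USB', 'accepted', 'Ux1ku'),
  (2, 20181119, 'USB', 'accepted', 'Tsn2f'),
  (3, 20181117, 'USB', 'accepted', 'O93nf');
""")

# MAX is computed once per id in the derived table "agg";
# COUNT is then computed in the outer query, grouped by pname and showntime.
rows = conn.execute("""
SELECT t2.pname, agg.showntime, COUNT(t2.subjectid) AS CountOfsubjectid
FROM table2 AS t2
INNER JOIN (
    SELECT t1.id, MAX(t1.outcometime) AS showntime
    FROM table1 AS t1
    GROUP BY t1.id
) AS agg ON t2.id = agg.id
WHERE t2.outcome = 'accepted'
GROUP BY t2.pname, agg.showntime
ORDER BY agg.showntime
""").fetchall()

for row in rows:
    print(row)
```

On this data the three rows match the desired result listed in the question (one row per showntime, each with a count of 1).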

If I understand correctly, you need to write this as:
SELECT Count(t2.subjectid) AS CountOfsubjectid,
t2.pname,
(select max(outcometime)
from table1 as t1 inner join
table2 as tt2
on t1.id = tt2.id
where tt2.pname = t2.pname
) as showntime
FROM table2 AS t2
WHERE t2.outcome = "accepted"
GROUP BY t2.pname;

Related

How to use a subquery result for another sql select?

I want to use the result of a sql query and send another query based on the result.
Example (of course the real-life query is more complex):
table1: name, age
table2: name, age, field1, fieldN
First query:
select name, age from table1 where age > 18.
Now I'd like to find all entries from table2 that match the multiple resulting fields of the first query.
Important note: I want to retrieve the full rows of table2 where the match is.
But how?
If you want to automatically join based on matching column names, then you can use a NATURAL JOIN:
WITH query1 AS (
SELECT age, name FROM table1 WHERE age > 18
)
SELECT age, name, t2.field1, t2.fieldN
FROM table2 t2 NATURAL JOIN query1;
Now, NATURAL JOIN is generally not recommended: it is fragile, because queries using it can easily break when the schema changes. It may be OK for ad-hoc queries written by hand, or for queries like the one above, where the columns used are made explicit. Even so, I advise against it and recommend the common join style:
WITH query1 AS (
SELECT age, name FROM table1 WHERE age > 18
)
SELECT t2.age, t2.name, t2.field1, t2.fieldN
FROM table2 t2 JOIN query1 q1 ON t2.age = q1.age AND t2.name = q1.name;
Now I'd like to:
find
all entries from table2
that match the multiple resulting fields
of the first query
SELECT * -- find
FROM table2 t2 -- from t2
WHERE EXISTS (
SELECT * FROM table1 t1
WHERE t1.name = t2.name -- that match
AND t1.age = t2.age -- Huh? "multiple matching fields" ?
AND t1.age > 18 -- with the same condition
);
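One practical difference between the EXISTS form and a plain join is how duplicates in table1 are handled. The sketch below uses SQLite through Python's sqlite3 module with made-up rows (the names and values are illustrative assumptions, not data from the question): EXISTS returns each matching table2 row once, while a join repeats it once per matching table1 row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (name TEXT, age INTEGER);
CREATE TABLE table2 (name TEXT, age INTEGER, field1 TEXT, fieldN TEXT);
-- hypothetical data: 'ann' appears twice in table1
INSERT INTO table1 VALUES ('ann', 30), ('ann', 30), ('bob', 17), ('cy', 40);
INSERT INTO table2 VALUES
  ('ann', 30, 'a1', 'aN'),
  ('bob', 17, 'b1', 'bN'),
  ('dee', 50, 'd1', 'dN');
""")

# Semi-join: a table2 row qualifies if at least one table1 row matches.
semi = conn.execute("""
SELECT * FROM table2 t2
WHERE EXISTS (
    SELECT 1 FROM table1 t1
    WHERE t1.name = t2.name AND t1.age = t2.age AND t1.age > 18
)
""").fetchall()

# Plain join: one output row per matching pair, so duplicates multiply.
joined = conn.execute("""
SELECT t2.* FROM table2 t2
JOIN table1 t1 ON t1.name = t2.name AND t1.age = t2.age
WHERE t1.age > 18
""").fetchall()

print(semi)
print(joined)
```

If table1 is guaranteed to be unique on (name, age), both forms return the same rows.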
Actually this is what I was looking for, but thanks for any help:
select * from table2 where (name, age) IN (
select name, age from table1 where age > 18
)
Query built for MS SQL Server (it returns the full rows of table2, as requested):
select t2.*
from table1 as t1
join table2 as t2 on t1.name=t2.name and t1.age=t2.age
where t1.age > 18

SQL - get max result

Assume there is a table name "test" below:
name value
n1 1
n2 2
n3 3
Now, I want to get the name which has the max value. I have some solutions below:
Solution 1:
SELECT TOP 1 name
FROM test
ORDER BY value DESC
solution 2:
SELECT name
FROM test
WHERE value = (SELECT MAX(value) FROM test);
Now, I would like to use a join operation to find the result, like
SELECT name
FROM test
INNER JOIN test ON...
Could someone please help and explain how it works?
If you are looking for JOIN then
SELECT T.name, T.value
FROM test T
INNER JOIN
( SELECT T1.name, T1.value ,
RANK() OVER (PARTITION BY T1.name ORDER BY T1.value) N
FROM test T1
WHERE T1.value IN (SELECT MAX(t2.value) FROM test T2)
)T3 ON T3.N = 1 AND T.name = T3.name
or
select name, value
from
(
select name, value,
row_number() over(order by value desc) rn
from test
) src
where rn = 1
First, note that solutions 1 and 2 can give different results when value is not unique. If your test data contained an additional record ('n4', 3), then solution 1 would return either 'n3' or 'n4', but solution 2 would return both.
A solution with JOIN needs aliases for the table. With the query as you started it, the engine would report Ambiguous column name 'name', because it would not know whether to take name from the first or the second occurrence of the test table.
Here is a way to complete the JOIN version:
SELECT t1.name
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
WHERE t2.value IS NULL;
This query takes each of the records, and checks if any records exist that have a higher value. If not, the first record will be in the result. Note the use of LEFT: this denotes an outer join, so that records from t1 that have no match with t2 -- based on the ON condition -- are not immediately rejected (as would be the case with INNER): in fact, we want to reject all the other records, which is done with the WHERE clause.
A way to understand this mechanism, is to look at a variant of the query above, which lacks the WHERE clause and returns the values of both tables:
SELECT t1.value, t2.value
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
On your test data this will return:
t1.value t2.value
1 2
1 3
2 3
3 (null)
Note that the last entry would not be there if the join were an INNER JOIN. But with the outer join, one can now look for the NULL values and actually get those records in the result that would be excluded from an INNER JOIN.
Note that this query will give the same result as solution 2 when there are duplicate values. If you want to have also only one result like with solution 1, it suffices to add TOP 1 after SELECT.
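The anti-join and its behaviour with ties can be tried out with a small sketch in SQLite via Python's sqlite3 module. Two things are assumptions made for the sake of the demo: the extra row ('n4', 3) is added to create a tie, and SQLite's LIMIT 1 stands in for SQL Server's TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (name TEXT, value INTEGER);
INSERT INTO test VALUES ('n1', 1), ('n2', 2), ('n3', 3), ('n4', 3);
""")

# Every t1 row paired with all strictly larger t2 rows;
# rows holding the maximum get a single NULL partner.
pairs = conn.execute("""
SELECT t1.name, t1.value, t2.value
FROM test t1
LEFT JOIN test t2 ON t2.value > t1.value
ORDER BY t1.name, t2.value
""").fetchall()

# Keeping only the NULL partners leaves the rows with the maximum value.
winners = conn.execute("""
SELECT t1.name
FROM test t1
LEFT JOIN test t2 ON t2.value > t1.value
WHERE t2.value IS NULL
ORDER BY t1.name
""").fetchall()

# Equivalent of solution 1: exactly one row, chosen arbitrarily among ties.
top_one = conn.execute(
    "SELECT name FROM test ORDER BY value DESC LIMIT 1"
).fetchall()

print(pairs)
print(winners)
print(top_one)
```

With the tie in place, the anti-join returns both 'n3' and 'n4', while the LIMIT 1 variant returns only one of them.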
Alternative with pure INNER JOIN
If you really want an INNER join, then this will do it. Again the TOP 1 is only needed if you have non-unique values:
SELECT TOP 1 t1.name
FROM test t1
INNER JOIN (SELECT Max(value) AS value FROM test) t2
ON t2.value = t1.value;
But this one really is very similar to what you did in solution 2.

array_agg contains another array_agg

t1
id|entity_type
9|3
9|4
9|5
2|3
2|5
t2
id|entity_type
1|3
1|4
1|5
SELECT t1.id, array_agg(t1.entity_type)
FROM t1
GROUP BY
t1.id
HAVING ARRAY_AGG(t1.entity_type by t1.entity_type) =
(SELECT ARRAY_AGG(t2.entity_type by t2.entity_type)
FROM t2
WHERE t2.id = 1
GROUP BY t2.id);
Result:
t1.id = 9|array_agg{3,4,5}
I have two tables t1 and t2. I want to get value of t1.id where t1.entity_type array equals t2.entity_type array.
In this scenario everything works fine. For t2.id = 1 I receive t1.id = 9.
Both have the same array of entity_type: {3,4,5}
Now I'd like to get t1.id not only for equal sets, but also for smaller sets.
If I modify t2 this way:
t2
id|entity_type
1|3
1|4
and modify query this way:
SELECT t1.id, array_agg(t1.entity_type)
FROM t1
GROUP BY
t1.id
HAVING ARRAY_AGG(t1.entity_type by t1.entity_type) >= /*MODIFICATION*/
(SELECT ARRAY_AGG(t2.entity_type by t2.entity_type)
FROM t2
WHERE t2.id = 1
GROUP BY t2.id);
I don't receive the expected result:
t1.id = 9 has {3, 4, 5}
t2.id = 1 has {3, 4}
Arrays in t1 that contain the array in t2 should qualify. I expect to receive results as in first case but I get no rows.
Is there any method like: ARRAY_AGG contains another ARRAY_AGG?
Clean up
First of all, syntax error. I assume you mean:
ARRAY_AGG(t1.entity_type ORDER BY t1.entity_type)
Details in the manual.
Next, it would be inefficient to use two differing invocations of array_agg(). Use the same (ORDER BY in SELECT list and HAVING clause):
SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM t1
GROUP BY 1
HAVING array_agg(entity_type ORDER BY entity_type) = (
SELECT array_agg(entity_type ORDER BY entity_type)
FROM t2
WHERE id = 1
-- GROUP BY id -- not needed
);
"contains" operator @>
Like Nick commented, your 2nd query would work with the "contains" operator @>
SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM t1
GROUP BY 1
HAVING array_agg(entity_type ORDER BY entity_type) @> (
SELECT array_agg(entity_type ORDER BY entity_type)
FROM t2
WHERE id = 1
);
But this is very inefficient for big tables.
Faster query
This is a case of relational division. Depending on your (missing) exact table definition, there are more efficient techniques. We have gathered a whole arsenal under this related question:
How to filter SQL results in a has-many-through relation
Assuming (id, entity_type) is unique in both tables, this should be substantially faster for big tables, especially because it can use an index on t1 (as opposed to your original query):
SELECT t1.id
FROM t2
JOIN t1 USING (entity_type)
WHERE t2.id = 1
GROUP BY 1
HAVING count(*) = (SELECT count(*) FROM t2 WHERE id = 1);
You need two indexes:
First on t2.id, typically covered by the primary key.
Second on t1.entity_type:
CREATE INDEX t1_foo_idx ON t1 (entity_type, id);
The added id column is optional to allow index-only scans. Sequence of columns is essential:
Is a composite index also good for queries on the first field?
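The relational-division query can be checked against the modified sample data (t2 reduced to entity types 3 and 4). The sketch below runs it in SQLite through Python's sqlite3 module, which is an assumption made only to keep the example self-contained; the answer itself targets PostgreSQL, and the index name is the one used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, entity_type INTEGER, UNIQUE (id, entity_type));
CREATE TABLE t2 (id INTEGER, entity_type INTEGER, UNIQUE (id, entity_type));
INSERT INTO t1 VALUES (9, 3), (9, 4), (9, 5), (2, 3), (2, 5);
INSERT INTO t2 VALUES (1, 3), (1, 4);
CREATE INDEX t1_foo_idx ON t1 (entity_type, id);
""")

# A t1.id qualifies when it matches every entity_type of t2.id = 1,
# i.e. its number of matches equals the size of the t2 set.
rows = conn.execute("""
SELECT t1.id
FROM t2
JOIN t1 USING (entity_type)
WHERE t2.id = 1
GROUP BY 1
HAVING count(*) = (SELECT count(*) FROM t2 WHERE id = 1)
""").fetchall()

print(rows)
```

Only id 9 contains both 3 and 4; id 2 matches just one of the two entity types and is filtered out by the HAVING clause.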

SQL - remove duplicates from left join

I'm creating a joined view of two tables, but am getting unwanted duplicates from table2.
For example: table1 has 9000 records and I need the resulting view to contain exactly the same number; table2 may have multiple records with the same FKID, but I only want to return one of them (a randomly chosen one is fine with my customer). I have the following code that works correctly, but performance is slower than desired (over 14 seconds).
SELECT
OBJECTID
, PKID
,(SELECT TOP (1) SUBDIVISIO
FROM dbo.table2 AS t2
WHERE (t1.PKID = t2.FKID)) AS ProjectName
,(SELECT TOP (1) ASBUILT1
FROM dbo.table2 AS t2
WHERE (t1.PKID = t2.FKID)) AS Asbuilt
FROM dbo.table1 AS t1
Is there a way to do something similar with joins to speed up performance?
I'm using SQL Server 2008 R2.
I got close with the following code (~.5 seconds), but 'Distinct' only filters out records when all columns are duplicate (rather than just the FKID).
SELECT
t1.OBJECTID
,t1.PKID
,t2.ProjectName
,t2.Asbuilt
FROM dbo.table1 AS t1
LEFT JOIN (SELECT
DISTINCT FKID
,ProjectName
,Asbuilt
FROM dbo.table2) t2
ON t1.PKID = t2.FKID
table examples
table1:
OID, PKID
1, id1
2, id2
3, id4
5, id5
table2:
FKID, ProjectName, Asbuilt
id1, P1, AB1
id1, P5, AB5
id2, P10, AB2
id5, P4, AB4
In the above example returned records should be id5/P4/AB4, id2/P10/AB2, and (id1/P1/AB1 OR id1/P5/AB5)
My search came up with similar questions, but none that resolved my problem.
Thanks in advance for your help. This is my first post so let me know if I've broken any rules.
This will give the results you requested and should have the best performance.
SELECT
OBJECTID
, PKID
, t2.SUBDIVISIO
, t2.ASBUILT1
FROM dbo.table1 AS t1
OUTER APPLY (
SELECT TOP 1 *
FROM dbo.table2 AS t2
WHERE t1.PKID = t2.FKID
) AS t2
Your original query is producing arbitrary values for the two columns (the use of top with no order by). You can get the same effect with this:
SELECT t1.OBJECTID, t1.PKID, t2.ProjectName, t2.Asbuilt
FROM dbo.table1 t1 LEFT JOIN
(SELECT FKID, min(ProjectName) as ProjectName, MIN(asBuilt) as AsBuilt
FROM dbo.table2
group by fkid
) t2
ON t1.PKID = t2.FKID
This version replaces the distinct with a group by.
To get a truly random row in SQL Server (which your syntax suggests you are using), try this:
SELECT t1.OBJECTID, t1.PKID, t2.ProjectName, t2.Asbuilt
FROM dbo.table1 t1 LEFT JOIN
(SELECT FKID, ProjectName, AsBuilt,
ROW_NUMBER() over (PARTITION by fkid order by newid()) as seqnum
FROM dbo.table2
) t2
ON t1.PKID = t2.FKID and t2.seqnum = 1
This assumes version 2005 or greater.
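To see that the ROW_NUMBER approach keeps every table1 row while picking a single table2 row per FKID, here is a minimal sketch on the example tables, run in SQLite through Python's sqlite3 module. Two deviations are assumptions for the demo: SQLite (3.25 or later, for window functions) stands in for SQL Server, and the rows are ordered by ProjectName instead of newid() so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (objectid INTEGER, pkid TEXT);
CREATE TABLE table2 (fkid TEXT, projectname TEXT, asbuilt TEXT);
INSERT INTO table1 VALUES (1, 'id1'), (2, 'id2'), (3, 'id4'), (5, 'id5');
INSERT INTO table2 VALUES
  ('id1', 'P1', 'AB1'),
  ('id1', 'P5', 'AB5'),
  ('id2', 'P10', 'AB2'),
  ('id5', 'P4', 'AB4');
""")

# Number the table2 rows within each fkid and join only row number 1,
# so each table1 row gets at most one partner.
rows = conn.execute("""
SELECT t1.objectid, t1.pkid, t2.projectname, t2.asbuilt
FROM table1 t1
LEFT JOIN (
    SELECT fkid, projectname, asbuilt,
           ROW_NUMBER() OVER (PARTITION BY fkid ORDER BY projectname) AS seqnum
    FROM table2
) t2 ON t1.pkid = t2.fkid AND t2.seqnum = 1
ORDER BY t1.objectid
""").fetchall()

for row in rows:
    print(row)
```

The result has exactly as many rows as table1; 'id4', which has no match in table2, is kept with NULLs, and both columns for 'id1' come from the same table2 row.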
If you want the result you described, you need an INNER JOIN; the following query will satisfy your need:
SELECT
t1.OID,
t1.PKID,
MAX(t2.ProjectName) AS ProjectName,
MAX(t2.Asbuilt) AS Asbuilt
FROM table1 t1
JOIN table2 t2 ON t1.PKID = t2.FKID
GROUP BY
t1.OID,
t1.PKID
If you want to see all rows from the left table (table1), whether or not they have a match in the right table, use LEFT JOIN; the same query will then give you the desired result.
EDITED
This construction has good performance, and you don't need to use subqueries.

SQL nested query

I have a table like below
id name dependency
-----------------------
1 xxxx 0
2 yyyy 1
3 zzzz 2
4 aaaaaa 0
5 bbbbbb 4
6 cccccc 5
The list goes on. I want to select a group of rows from this table by giving the name of a row with dependency 0 in the WHERE clause, and follow the chain until there are no more dependent rows. (For example, rows 1, 2, 3 form one group, and rows 4, 5, 6 form another.) Please help.
Since you did not specify a product, I'll go with features available in the SQL specification. In this case, I'm using a common table expression, which is supported by many database products, including SQL Server 2005+ and Oracle (but not MySQL before version 8.0):
With MyDependents As
(
Select id, name, 0 As level
From MyTable
Where dependency = 0
And name = 'some value'
Union All
Select T.id, T.name, D.level + 1
From MyDependents As D
Join MyTable As T
On T.dependency = D.id
)
Select id, name, level
From MyDependents
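A runnable sketch of the recursive walk, using SQLite through Python's sqlite3 module on the rows from the question. It assumes that each row's dependency column holds the id of its parent (so children are found with T.dependency = D.id) and that SQLite's WITH RECURSIVE spelling is acceptable as a stand-in; the helper name group_for is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (id INTEGER PRIMARY KEY, name TEXT, dependency INTEGER);
INSERT INTO MyTable VALUES
  (1, 'xxxx', 0), (2, 'yyyy', 1), (3, 'zzzz', 2),
  (4, 'aaaaaa', 0), (5, 'bbbbbb', 4), (6, 'cccccc', 5);
""")

def group_for(conn, root_name):
    """Return (id, name, level) for the root row and all rows depending on it."""
    return conn.execute("""
        WITH RECURSIVE MyDependents(id, name, level) AS (
            -- anchor member: the root row, identified by name
            SELECT id, name, 0
            FROM MyTable
            WHERE dependency = 0 AND name = ?
            UNION ALL
            -- recursive member: rows whose dependency points at a row found so far
            SELECT T.id, T.name, D.level + 1
            FROM MyDependents AS D
            JOIN MyTable AS T ON T.dependency = D.id
        )
        SELECT id, name, level FROM MyDependents ORDER BY level, id
    """, (root_name,)).fetchall()

print(group_for(conn, "xxxx"))
print(group_for(conn, "aaaaaa"))
```

Each call returns one group: rows 1 to 3 for 'xxxx' and rows 4 to 6 for 'aaaaaa', with the level counting the distance from the root.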
Another solution, which does not rely on common table expressions but does assume a maximum depth (in this case two levels below level 0), would be something like
Select T1.id, T1.name, 0 As level
From MyTable As T1
Where T1.name = 'some value'
Union All
Select T2.id, T2.name, 1
From MyTable As T1
Join MyTable As T2
On T2.Dependency = T1.Id
Where T1.name = 'some value'
Union All
Select T3.id, T3.name, 2
From MyTable As T1
Join MyTable As T2
On T2.Dependency = T1.Id
Join MyTable As T3
On T3.Dependency = T2.Id
Where T1.name = 'some value'
Sounds like you want to recursively query your table, for which you will need a Common Table Expression (CTE)
This MSDN article explains CTEs very well. They are confusing at first but surprisingly easy to implement.
BTW this is obviously only for SQL Server, I'm not sure how you'd achieve that in MySQL.
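This is the first thing that came to mind. It can probably be done more directly/succinctly; I'll try to dwell on it a little. It assumes the ids within a group are contiguous.
SELECT *
FROM table T1
WHERE T1.id >=
(SELECT T2.id FROM table T2 WHERE T2.name = '---NAME HERE---')
AND T1.id <
(SELECT MIN(T3.id)
FROM table T3
WHERE T3.dependency = 0
-- T2 is not in scope here, so the root lookup is repeated
AND T3.id > (SELECT T2.id FROM table T2 WHERE T2.name = '---NAME HERE---'))
-- note: returns no rows for the last group, where no later root exists (MIN is NULL)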
If you can estimate a max depth, this works out to something like:
SELECT
COALESCE(t4.field1, t3.field1, t2.field1, t1.field1, t.field1),
COALESCE(t4.field2, t3.field2, t2.field2, t1.field2, t.field2),
COALESCE(t4.field3, t3.field3, t2.field3, t1.field3, t.field3),
....
FROM table AS t
LEFT JOIN table AS t1 ON t.dependency = t1.id
LEFT JOIN table AS t2 ON t1.dependency = t2.id
LEFT JOIN table AS t3 ON t2.dependency = t3.id
LEFT JOIN table AS t4 ON t3.dependency = t4.id
....
This is a wild guess just to be different, but I think it's kind of pretty, anyway. And it's at least as portable as any of the others. But I don't want to look too closely; I'd want to use sensible data, start testing, and check for sensible results.
A hierarchical query (Oracle's CONNECT BY syntax) will do:
SELECT *
FROM your_table
START WITH id = :id_of_group_header_row
CONNECT BY dependency = PRIOR id
Query works like this:
1. select all rows satisfying the START WITH condition (these rows are the roots now)
2. select all rows satisfying the CONNECT BY condition;
the keyword PRIOR means this column's value is taken from the root row
3. consider the rows selected in step 2 to be the new roots
4. go to step 2 until there are no more rows
4. go to step 2 until there are no more rows