Firebird group clause - sql

I can't understand Firebird's GROUP BY logic.
Query:
SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
works perfectly
But when I try to get other fields:
SELECT * FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
I get error: Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause)

When you use GROUP BY in your query, the field or fields specified are used as 'keys', and data rows are grouped based on unique combinations of those fields. In the result set, every such unique combination has one and only one row.
In your case, the only identifier in the group is t.id. Now consider that you have 2 records in the table, both with t.id = 1, but having different values for another column, say, t.name. If you try to select both id and name columns, it directly contradicts the constraint that one group can have only one row. That is why you cannot select any field apart from the group key.
Aggregate functions are different. When you sum, count, or take the maximum of a column, all the values of that column within a group are collapsed into a single value, so the result still has exactly one row per id. There can only be one answer to, say, the count of all names with a particular id.
In conclusion, if you want to show a column in the results, you need to group by it. This will, however, make the grouping more granular, which may not be desirable. In that case, you can do something like this:
select * from T1 t
where t.id in
(SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id)
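To see the one-row-per-key rule in action, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of Firebird purely for convenience, and the name column and sample rows are invented for illustration. SQLite is more lenient than Firebird about bare columns in a grouped query, so the sketch sticks to the portable form: group keys plus aggregates only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (id INTEGER, jid INTEGER, name TEXT);
    -- Two rows share id = 1 but have different names (hypothetical data).
    INSERT INTO T1 VALUES (1, 10, 'alpha'), (1, 10, 'beta'), (2, 20, 'gamma');
""")

# One output row per distinct id; every other column goes through an aggregate.
grouped = conn.execute("""
    SELECT t.id, COUNT(*) AS row_count, MIN(t.name) AS first_name, MAX(t.name) AS last_name
    FROM T1 t
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(grouped)
```

The two rows with id = 1 collapse into a single result row, which is why a bare t.name has no single value to show there.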

When you use a GROUP BY clause in a SELECT, the select list may contain only aggregate functions or columns that are listed in the GROUP BY clause. More about the GROUP BY clause: http://www.firebirdsql.org/manual/nullguide-aggrfunc.html
For example:
SELECT Max(t.jid), t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id

SELECT * FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
This will not execute because you grouped by t.id only: every column in the select list must either be wrapped in an aggregate function or be included in the GROUP BY clause.
SELECT * selects all columns, so every column except t.id is neither in the GROUP BY clause nor inside an aggregate function.
See also: How to use GROUP BY in firebird

Related

Joined query producing more results compared to solo query

I am performing the following query which has an inner join against another table.
select count(myTable.name)
from sch2.sample_detail as myTable
inner join sch1.otherTable as otherTable on myTable.name = otherTable.name
where otherTable.is_valid = 1
and myTable.name IS NOT NULL;
This produces a count of 4912304.
The following is a query just on a single table (my table).
SELECT COUNT(myTable.name)
from sch2.sample_detail as myTable
where myTable.name IS NOT NULL;
This produces a count of 2864654.
But how is this possible? Both queries have the clause where myTable.name IS NOT NULL.
Shouldn't the second query produce the same count, or even a higher one, because it doesn't have the otherTable.is_valid = 1 clause?
Why does the inner join produce a higher count?
Please advise if there is something I should amend in the first query, thanks.
An inner, left, or cross join can duplicate rows. sch1.otherTable.name is not unique, which causes row duplication: for each row in the left table, all matching rows from the right table are selected. This is normal join behavior.
To list the duplicated names, use this query, then decide how to remove the duplicates: filter, DISTINCT, filter by row_number, etc.
select count(*) cnt,
name
from sch1.otherTable
group by name
having count(*)>1
order by cnt desc;
If you need EXISTS (and do not need to select columns from otherTable), use left semi join.
Also, a subquery with DISTINCT can be used to pre-aggregate name before the join and filter:
select count(myTable.name)
from sch2.sample_detail as myTable
LEFT SEMI JOIN (select distinct name from sch1.otherTable otherTable where otherTable.is_valid = 1 ) as otherTable on myTable.name = otherTable.name
where myTable.name IS NOT NULL;
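The row multiplication described above can be reproduced on a tiny data set. This is a sketch using Python's sqlite3 module, not the engine from the question; the table names are simplified and the rows are invented. SQLite has no LEFT SEMI JOIN, so the sketch uses a DISTINCT subquery and an EXISTS filter as stand-ins for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample_detail (name TEXT);
    CREATE TABLE other_table (name TEXT, is_valid INTEGER);
    INSERT INTO sample_detail VALUES ('a'), ('b'), (NULL);
    -- 'a' appears three times on the right side, so it is not unique there.
    INSERT INTO other_table VALUES ('a', 1), ('a', 1), ('a', 1), ('b', 1), ('b', 0);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Count on the single table.
solo_count = scalar("SELECT COUNT(name) FROM sample_detail WHERE name IS NOT NULL")

# Plain inner join: each left row is repeated once per matching right row.
joined_count = scalar("""
    SELECT COUNT(s.name)
    FROM sample_detail s
    INNER JOIN other_table o ON s.name = o.name
    WHERE o.is_valid = 1 AND s.name IS NOT NULL
""")

# De-duplicate the right side before joining.
dedup_count = scalar("""
    SELECT COUNT(s.name)
    FROM sample_detail s
    INNER JOIN (SELECT DISTINCT name FROM other_table WHERE is_valid = 1) o
        ON s.name = o.name
    WHERE s.name IS NOT NULL
""")

# EXISTS only tests for a match and never multiplies rows.
exists_count = scalar("""
    SELECT COUNT(s.name)
    FROM sample_detail s
    WHERE s.name IS NOT NULL
      AND EXISTS (SELECT 1 FROM other_table o WHERE o.name = s.name AND o.is_valid = 1)
""")

print(solo_count, joined_count, dedup_count, exists_count)
```

With this data the plain join counts more rows than the table itself contains, while both de-duplicated variants return the expected number.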

Postgresql - Group By

I have a simple groupby scenario. Below is the output of the query.
Query is:
select target_date, type, count(*) from table_name group by target_date, type
The query and output is perfectly good.
My problem is I am using this in Grafana for plotting. That is Grafana with postgres as backend.
Because the "type2" category is missing on 01-10-2020 and 03-10-2020, it never gets plotted (side-by-side bar plot) at all, even though "type2" is present on other days.
It is expecting something like
So whenever a category is missing for a date, we need a count with value 0.
Need to handle this in query, as the source data cannot be modified.
Any help here is appreciated.
You need to create a list of all the target_date/type combinations. That can be done with a CROSS JOIN of two DISTINCT selects of target_date and type. This list can be LEFT JOINed to table_name to get counts for each combination:
SELECT dates.target_date, types.type, COUNT(t.target_date)
FROM (
SELECT DISTINCT target_date
FROM table_name
) dates
CROSS JOIN (
SELECT DISTINCT type
FROM table_name
) types
LEFT JOIN table_name t ON t.target_date = dates.target_date AND t.type = types.type
GROUP BY dates.target_date, types.type
ORDER BY dates.target_date, types.type
You may use a calendar table approach here:
SELECT
t1.target_date,
t2.type,
COUNT(t3.target_date) AS count
FROM (SELECT DISTINCT target_date FROM yourTable) t1
CROSS JOIN (SELECT DISTINCT type FROM yourTable) t2
LEFT JOIN yourTable t3
ON t3.target_date = t1.target_date AND
t3.type = t2.type
GROUP BY
t1.target_date,
t2.type
ORDER BY
t1.target_date,
t2.type;
The idea here is to cross join subqueries finding all distinct target dates and types, to generate a starting point for the query. Then, we left join this intermediate table to your actual table, and find the counts for each date and type.
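The cross join plus left join idea can be checked end to end with a small sketch. It runs on Python's sqlite3 module rather than PostgreSQL, and the rows (with "type2" missing on the first and third day) are invented to mirror the situation in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (target_date TEXT, type TEXT);
    -- Hypothetical rows: type2 only occurs on the second day.
    INSERT INTO table_name VALUES
        ('2020-10-01', 'type1'),
        ('2020-10-02', 'type1'),
        ('2020-10-02', 'type2'),
        ('2020-10-02', 'type2'),
        ('2020-10-03', 'type1');
""")

# Build every date/type combination, then count matching rows per combination.
# COUNT(t.target_date) ignores the NULLs produced by unmatched left-join rows,
# so missing combinations come out as 0 rather than disappearing.
zero_filled = conn.execute("""
    SELECT dates.target_date, types.type, COUNT(t.target_date) AS cnt
    FROM (SELECT DISTINCT target_date FROM table_name) dates
    CROSS JOIN (SELECT DISTINCT type FROM table_name) types
    LEFT JOIN table_name t
        ON t.target_date = dates.target_date AND t.type = types.type
    GROUP BY dates.target_date, types.type
    ORDER BY dates.target_date, types.type
""").fetchall()

for row in zero_filled:
    print(row)
```

Counting a column from the left-joined table, rather than COUNT(*), is what turns the unmatched combinations into zeros.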
select t.target_date, tmp.type, sum(case when t.type = tmp.type then 1 else 0 end)
from your_table t
cross join (select distinct type from your_table) tmp
group by t.target_date, tmp.type

Count records only from left side of a LEFT JOIN

I'm building an Access query with a LEFT JOIN that, among other things, counts the number of unique sampleIDs present in the left table of the JOIN, and counts the aggregate number of specimens (bugs) present in the right table of the JOIN, both for a given group of samples (TripID). Here's the pertinent chunk of SQL code:
SELECT DISTINCT t1.TripID, COUNT(t1.SampleID) AS Samples, SUM(t2.C1 + t2.C2) AS Bugs
FROM tbl_Sample AS t1
LEFT JOIN tbl_Bugs AS t2 ON t1.SampleID = t2.SampleID
GROUP BY t1.TripID
The trouble I'm having is that COUNT(t1.SampleID) is not giving me my desired result. My desired result is the number of unique SampleIDs present in t1 for a given TripID (let's say 7). Instead, what I get seems to be the number of rows in t2 for which the SampleID is contained within the given TripID group (let's say 77). How can I change this SQL query to get the desired number (7, not 77)?
First aggregate t2 per SampleID, then join the aggregated result to t1, like this:
SELECT t1.TripID, COUNT(t1.SampleID) AS Samples, SUM(t3.Bugs) as Bugs
FROM tbl_Sample AS t1
LEFT Join (
SELECT t2.SampleID, SUM(t2.C1 + t2.C2) as Bugs
FROM tbl_Bugs as t2
GROUP BY SampleID) AS t3 ON t1.SampleID = t3.SampleID
GROUP BY t1.TripID
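The effect of aggregating before joining can be seen on a few invented rows. This sketch uses Python's sqlite3 module instead of Access, so treat it as an illustration of the technique rather than Access-specific syntax. The trip number and bug counts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_Sample (SampleID INTEGER, TripID INTEGER);
    CREATE TABLE tbl_Bugs (SampleID INTEGER, C1 INTEGER, C2 INTEGER);
    -- Trip 100 has three samples; sample 1 has two bug rows, sample 3 has none.
    INSERT INTO tbl_Sample VALUES (1, 100), (2, 100), (3, 100);
    INSERT INTO tbl_Bugs VALUES (1, 1, 2), (1, 3, 4), (2, 5, 0);
""")

# Joining first repeats sample 1 once per bug row, which inflates the count.
naive = conn.execute("""
    SELECT t1.TripID, COUNT(t1.SampleID) AS Samples, SUM(t2.C1 + t2.C2) AS Bugs
    FROM tbl_Sample AS t1
    LEFT JOIN tbl_Bugs AS t2 ON t1.SampleID = t2.SampleID
    GROUP BY t1.TripID
""").fetchone()

# Aggregating tbl_Bugs to one row per SampleID first keeps the join one-to-one.
pre_aggregated = conn.execute("""
    SELECT t1.TripID, COUNT(t1.SampleID) AS Samples, SUM(t3.Bugs) AS Bugs
    FROM tbl_Sample AS t1
    LEFT JOIN (
        SELECT t2.SampleID, SUM(t2.C1 + t2.C2) AS Bugs
        FROM tbl_Bugs AS t2
        GROUP BY t2.SampleID
    ) AS t3 ON t1.SampleID = t3.SampleID
    GROUP BY t1.TripID
""").fetchone()

print(naive, pre_aggregated)
```

The bug total is the same in both queries; only the sample count changes, because the derived table has at most one row per SampleID.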
This is a tricky query, because you have different hierarchies. Here is one method:
select s.tripid, count(*) as numsamples,
(select sum(b.c1 + b.c2)
from tbl_Bugs b join
tbl_sample s2
on s2.sampleid = b.sampleid
where s2.tripid = s.tripid
) as numbugs
from tbl_sample s
group by s.tripid
You included a DISTINCT with a Group By. This is removing duplicates twice, which is unnecessarily complex. You can get rid of the DISTINCT.
I would have the count separate from what is going on in the group by.
SELECT dT.TripID
,(SELECT COUNT(DISTINCT(S.SampleID))
FROM tbl_Sample S
WHERE S.TripID = dT.TripID
) AS [Samples]
,dT.Bugs
FROM (
SELECT t1.TripID
,SUM(t2.C1 + t2.C2) AS Bugs
FROM tbl_Sample AS t1
LEFT JOIN tbl_Bugs AS t2 ON t1.SampleID = t2.SampleID
GROUP BY t1.TripID
) AS dT

SQL: is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

left join
(
SELECT my_number, MAX(id) as id
FROM table1
GROUP BY location
) newNum
on newNum.Part = c.OtherPart
left join table2 t2 on t2.id = newNum.id and t2.site = a.site
Situation: I have the data fields (among others) my_number, location, and id in table1. In table2 I have the data fields (among others) id, site, date. I am joining those to other views/tables (c and a) that have some of the same data fields and my_numbers.
My goal: Each my_number has multiple id's and I want the greatest id value for each site. That is why I used group by site.
Then I need to get the 'date' of the my_number based on the id, because the second table does not contain the my_number, just its associated id.
There are a total of 3 sites, so I need the 3 greatest id value for each site. Then I want to get the 'date' of those 3 id values
Output table ex:
a.num  a.site  a.date   c.OtherPart  T2.date
15     TN      1.1.16   17           3.19.16
15     FL      2.21.16  17           4.22.16
15     TX      1.7.15   17           3.21.16
When you put something like max(column) in a SQL query, the max function is operating on a set of values of column from a group. If you've defined your query with a group by, such that the results are grouped, then every column (other than the one on which you are grouping) has multiple values.
In your case, location has one value (it's what you're grouping by), but my_number and id have multiple values. If my_number is (1,2,3,4) and id is (5,6,7,8), you can display sum(my_number) or max(my_number) but obviously you can't display on a single row the 'number' my_number. It is not a number, but a list.
This is what is meant when the error message says "SQL: is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause". If you put the column my_number in an aggregate function (like sum) it will work, or if you add it to the group by clause it will work.
I'm not sure exactly what you want your SQL to do, but you should only fetch columns that appear in the GROUP BY clause or that are wrapped in an aggregate function. Anyway, please try this ;)
left join
(
SELECT T1.my_number, T1.id
FROM table1 T1,
(SELECT location, MAX(id) as id
FROM table1
GROUP BY location) TMP
WHERE T1.id= TMP.id AND T1.location = TMP.location
) newNum
on newNum.Part = c.OtherPart
left join table2 t2 on t2.id = newNum.id and t2.site = a.site
You can also fix the error with the following SQL, although it may not do what you want:
left join
(
SELECT location, MAX(id) as id
FROM table1
GROUP BY location
) newNum
on newNum.Part = c.OtherPart
left join table2 t2 on t2.id = newNum.id and t2.site = a.site
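The join-back pattern from the first query in this answer (find MAX(id) per group, then join to the same table to recover the other columns of that row) can be sketched in isolation. This uses Python's sqlite3 module, and the rows in table1 are invented. The surrounding views a and c from the question are left out because their definitions are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, my_number INTEGER, location TEXT);
    -- Hypothetical rows: several ids per location.
    INSERT INTO table1 VALUES
        (1, 15, 'TN'),
        (2, 15, 'TN'),
        (3, 15, 'FL'),
        (4, 16, 'TX'),
        (5, 15, 'TX');
""")

# The inner query yields one (location, greatest id) pair per location.
# Joining it back to table1 picks up my_number from exactly that row,
# so no non-aggregated column ever appears next to the GROUP BY.
latest_per_location = conn.execute("""
    SELECT T1.location, T1.my_number, T1.id
    FROM table1 T1
    JOIN (
        SELECT location, MAX(id) AS id
        FROM table1
        GROUP BY location
    ) TMP ON T1.id = TMP.id AND T1.location = TMP.location
    ORDER BY T1.location
""").fetchall()

print(latest_per_location)
```

Each location contributes exactly one row, the one carrying its greatest id, and that id can then be used to look up the date in the second table.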

Sum multiple columns using a subquery

I'm trying to play with Oracle's DB.
I'm trying to sum two columns from the same row and output a total on the fly.
However, I can't seem to get it to work. Here's the code I have so far.
SELECT a.name , SUM(b.sequence + b.length) as total
FROM (
SELECT a.name, a.sequence, b.length
FROM tbl1 a, tbl2 b
WHERE b.sequence = a.sequence
AND a.loc <> -1
AND a.id='10201'
ORDER BY a.location
)
The inner query works, but I can't seem to make the new query and the subquery work together.
Here's a sample table I'm using:
...[name][sequence][length]...
...['aa']['100000']['2000']...
...
...['za']['200000']['3001']...
And here's the output I'd like:
[name][ total ]
['aa']['102000']
...
['za']['203001']
Help much appreciated, thanks!
SUM() sums numbers across rows. Replace it with the plain expression sequence + length instead.
...or if there is the possibility of NULL values occurring in either the sequence or length columns, use: COALESCE(sequence, 0) + COALESCE(length, 0).
Or, if your intention was indeed to produce a per-name total (i.e. aggregating the sum of all the sequences and lengths for each name), add a GROUP BY a.name after the end of the subquery.
BTW: you shouldn't be referencing the internal aliases used inside a subquery from outside of that subquery. Some DB servers allow it (and I don't have convenient access to an Oracle server right now, so I can't test it), but it's not really good practice.
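The difference between adding two columns within a row and summing across rows can be shown with a short sketch. It runs on Python's sqlite3 module rather than Oracle, uses a single invented table named tbl in place of the join, and adds a hypothetical 'zb' row with a NULL length to show what COALESCE changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (name TEXT, sequence INTEGER, length INTEGER);
    INSERT INTO tbl VALUES ('aa', 100000, 2000), ('za', 200000, 3001), ('zb', 300000, NULL);
""")

# Plain addition works row by row; a NULL operand makes the whole result NULL.
row_totals = conn.execute(
    "SELECT name, sequence + length AS total FROM tbl ORDER BY name"
).fetchall()

# COALESCE substitutes 0 for NULL before adding.
safe_totals = conn.execute("""
    SELECT name, COALESCE(sequence, 0) + COALESCE(length, 0) AS total
    FROM tbl
    ORDER BY name
""").fetchall()

# SUM() without GROUP BY collapses all rows into one and skips NULL results.
grand_total = conn.execute("SELECT SUM(sequence + length) FROM tbl").fetchone()[0]

print(row_totals)
print(safe_totals)
print(grand_total)
```

The first two queries return one total per input row, which matches the output the question asks for; the SUM() version returns a single number for the whole table.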
I think what you are after is something like:
SELECT a.name,
SUM(B.sequence + B.length) AS total
FROM Tbl1 A
INNER JOIN Tbl2 B
ON B.sequence = A.sequence
WHERE A.loc <> -1
AND A.id = 10201
GROUP BY a.name
ORDER BY MIN(A.location) -- location is not in the GROUP BY, so it has to be aggregated here
Your query with the subquery fails for several reasons:
You use the table alias a, but it is not defined.
You use the table alias b, but it is not defined.
You have a sum() in the select clause with unaggregated columns, but no group by.
In addition, you have an order by in the subquery which is allowed syntactically, but ignored.
Here is a better way to write the query without a subquery:
SELECT t1.name, (t1.sequence + t2.length) as total
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
ORDER BY t1.location;
Note the use of proper join syntax, the use of aliases that make sense, and the simple calculation at this level.
Here is a version with a subquery:
select name, (sequence + length) as total
from (SELECT t1.name, t1.sequence, t2.length, t1.location
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
) t
ORDER BY location;
Note that the order by is going at the outer level. And, I gave the subquery an alias. This is not strictly required, but typically a good idea.