Group by SQL statement - sql

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE patient_id = 12)
GROUP BY medication_name
But, I would like to have the corresponding medication_dose also. So I type this up
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, it gives me an error saying:
"coumn 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.".
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?

Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
JOIN (SELECT y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
FROM BIOLOGICAL y
GROUP BY y.medication_name) x ON x.medication_name = b.medication_name
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?

If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However note that this is normally an indication that you are either building the query incorrectly, or that you need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should the one suggested by OMG Poinies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?

You need to put max(medication_dose) in your select. Group by returns a result set that contains distinct values for fields in your group by clause, so apparently you have multiple records that have the same medication_name, but different doses, so you are getting two results.
By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.)

Related

Access SQL GROUP BY problem (eg. tbl_Produktion.ID not part of the aggregation-function)

I want to group by two columns, however MS Access won't let me do it.
Here is the code I wrote:
SELECT
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter,
tbl_Produktion.ProduktionsID, tbl_Produktion.Linie,
tbl_Produktion.Schicht, tbl_Produktion.Anzahl_Schichten_P,
tbl_Produktion.Schichtteam, tbl_Produktion.Von, tbl_Produktion.Bis,
tbl_Produktion.Pause, tbl_Produktion.Kunde, tbl_Produktion.TeileNr,
tbl_Produktion.FormNr, tbl_Produktion.LabyNr,
SUM(tbl_Produktion.Stueckzahl_Prod),
tbl_Produktion.Stueckzahl_Ausschuss, tbl_Produktion.Ausschussgrund,
tbl_Produktion.Kommentar, tbl_Produktion.StvSchichtleiter,
tbl_Produktion.Von2, tbl_Produktion.Bis2, tbl_Produktion.Pause2,
tbl_Produktion.Arbeiter3, tbl_Produktion.Von3, tbl_Produktion.Bis3,
tbl_Produktion.Pause3, tbl_Produktion.Arbeiter4,
tbl_Produktion.Von4, tbl_Produktion.Bis4, tbl_Produktion.Pause4,
tbl_Produktion.Leiharbeiter5, tbl_Produktion.Von5,
tbl_Produktion.Bis5, tbl_Produktion.Pause5,
tbl_Produktion.Leiharbeiter6, tbl_Produktion.Von6,
tbl_Produktion.Bis6, tbl_Produktion.Pause6, tbl_Produktion.Muster
FROM
tbl_Personal
INNER JOIN
tbl_Produktion ON tbl_Personal.PersID = tbl_Produktion.Schichtleiter
GROUP BY
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter;
It works when I group it by all the columns, but not like this.
The error message say that the rest of the columns aren't part of the aggregation-function (translated from german to english as best as I could).
PS.: I also need the sum of "tbl_Produktion.Stueckzahl_Prod" therefore I tried using the SUM function (couldn't try it yet).
Have you tried something along these lines?
SELECT
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter,
MAX(tbl_Produktion.ProduktionsID), MAX(tbl_Produktion.Linie),
MAX(tbl_Produktion.Schicht), MAX(tbl_Produktion.Anzahl_Schichten_P),
MAX(tbl_Produktion.Schichtteam), MAX(tbl_Produktion.Von), MAX(tbl_Produktion.Bis),
SUM(tbl_Produktion.Stueckzahl_Prod)
FROM
tbl_Personal
INNER JOIN
tbl_Produktion ON tbl_Personal.PersID = tbl_Produktion.Schichtleiter
GROUP BY
tbl_Produktion.Datum, tbl_Produktion.Schichtleiter;
I have used the MAX function for all the data except the two items you specify in the GROUP BY and the one where you desire the SUM. I took the liberty of leaving out mush of your data just to get started.
Using the MAX function turns out to be a convenient workaround when the data item is known to be unique within each group. We cannot know your data or your itent, so we cannot tell you whether MAX will yield the results you need.
If you use an aggregation function in the select clause, you must group by every column that you're selecting that's not an aggregation. If you don't want to do that for some reason (perhaps it changes the output of the aggregation in way that you don't intend) you either must think of an aggregate to use (pick a value. Average? Max? Min?) or just do two selects, one for the aggregate, and one for the non-aggregates. But, then, you have to decide how to get the non-aggregated fields that make sense for the aggregate (or show them all in a table, I suppose?)

SQLite alias (AS) not working in the same query

I'm stuck in an (apparently) extremely trivial task that I can't make work , and I really feel no chance than to ask for advice.
I used to deal with PHP/MySQL more than 10 years ago and I might be quite rusty now that I'm dealing with an SQLite DB using Qt5.
Basically I'm selecting some records while wanting to make some math operations on the fetched columns. I recall (and re-read some documentation and examples) that the keyword "AS" is going to conveniently rename (alias) a value.
So for example I have this query, where "X" is an integer number that I render into this big Qt string before executing it with a QSqlQuery. This query lets me select all the electronic components used in a Project and calculate how many of them to order (rounding to the nearest multiple of 5) and the total price per component.
SELECT Inventory.id, UsedItems.pid, UsedItems.RefDes, Inventory.name, Inventory.category,
Inventory.type, Inventory.package, Inventory.value, Inventory.manufacturer,
Inventory.price, UsedItems.qty_used as used_qty,
UsedItems.qty_used*X AS To_Order,
ROUND((UsedItems.qty_used*X/5)+0.5)*5*CAST((X > 0) AS INT) AS Nearest5,
Inventory.price*Nearest5 AS TotPrice
FROM Inventory
LEFT JOIN UsedItems ON Inventory.id=UsedItems.cid
WHERE UsedItems.pid='1'
ORDER BY RefDes, value ASC
So, for example, I aliased UsedItems.qty_used as used_qty. At first I tried to use it in the next field, multiplying it by X, writing "used_qty*X AS To_Order" ... Query failed. Well, no worries, I had just put the original tab.field name and it worked.
Going further, I have a complex calculation and I want to use its result on the next field, but the same issue popped out: if I alias "ROUND(...)" AS Nearest5, and then try to use this value by multiplying it in the next field, the query will fail.
Please note: the query WORKS, but ONLY if I don't use aliases in the following fields, namely if I don't use the alias Nearest5 in the TotPrice field. I just want to avoid re-writing the whole ROUND(...) thing for the TotPrice field.
What am I missing/doing wrong? Either SQLite does not support aliases on the same query or I am using a wrong syntax and I am just too stuck/confused to see the mistake (which I'm sure it has to be really stupid).
Column aliases defined in a SELECT cannot be used:
For other expressions in the same SELECT.
For filtering in the WHERE.
For conditions in the FROM clause.
Many databases also restrict their use in GROUP BY and HAVING.
All databases support them in ORDER BY.
This is how SQL works. The issue is two things:
The logic order of processing clauses in the query (i.e. how they are compiled). This affects the scoping of parameters.
The order of processing expressions in the SELECT. This is indeterminate. There is no requirement for the ordering of parameters.
For a simple example, what should x refer to in this example?
select x as a, y as x
from t
where x = 2;
By not allowing duplicates, SQL engines do not have to make a choice. The value is always t.x.
You can try with nested queries.
A SELECT query can be nested in another SELECT query within the FROM clause;
multiple queries can be nested, for example by following the following pattern:
SELECT *,[your last Expression] AS LastExp From (SELECT *,[your Middle Expression] AS MidExp FROM (SELECT *,[your first Expression] AS FirstExp FROM yourTables));
Obviously, respecting the order that the expressions of the innermost select query can be used by subsequent select queries:
the first expressions can be used by all other queries, but the other intermediate expressions can only be used by queries that are further upstream.
For your case, your query may be:
SELECT *, PRC*Nearest5 AS TotPrice FROM (SELECT *, ROUND((UsedItems.qty_used*X/5)+0.5)*5*CAST((X > 0) AS INT) AS Nearest5 FROM (SELECT Inventory.id, UsedItems.pid, UsedItems.RefDes, Inventory.name, Inventory.category, Inventory.type, Inventory.package, Inventory.value, Inventory.manufacturer, Inventory.price AS PRC, UsedItems.qty_used*X AS To_Order FROM Inventory LEFT JOIN UsedItems ON Inventory.id=UsedItems.cid WHERE UsedItems.pid='1' ORDER BY RefDes, value ASC))

What do OrientDB's functions do when applied to the results of another function?

I am getting very strange behavior on 2.0-M2. Consider the following against the GratefulDeadConcerts database:
Query 1
SELECT name, in('written_by') AS wrote FROM V WHERE type='artist'
This query returns a list of artists and the songs each has written; a majority of the rows have at least one song.
Query 2
Now try:
SELECT name, count(in('written_by')) AS num_wrote FROM V WHERE type='artist'
On my system (OSX Yosemite; Orient 2.0-M2), I see just one row:
name num_wrote
---------------------------
Willie_Cobb 224
This seems wrong. But I tried to better understand. Perhaps the count() causes the in() to look at all written_by edges...
Query 3
SELECT name, in('written_by') FROM V WHERE type='artist' GROUP BY name
Produces results similar to the first query.
Query 4
Now try count()
SELECT name, count(in('written_by')) FROM V WHERE type='artist' GROUP BY name
Wrong path -- So try LET variables...
Query 5
SELECT name, $wblist, $wbcount FROM V
LET $wblist = in('written_by'),
$wbcount = count($wblist)
WHERE type='artist'
Produces seemingly meaningless results:
You can see that the $wblist and $wbcount columns are inconsistent with one another, and the $wbcount values don't show any obvious progression like a cumulative result.
Note that the strange behavior is not limited to count(). For example, first() does similarly odd things.
count(), like in RDBMS, computes the sum of all the records in only one value. For your purpose .size()seems the right method to call:
in('written_by').size()

Oracle SQL - Comparing AVG functions in WHERE

I'm trying to write a few Oracle SQL scripts for an assignment. I've managed to get all of it to work, except for one part. To summarize, I have to display data from 2 tables if the average of 1 column in table A is greater than the average of another column in table B. I realize you cannot include AVG functions in a WHERE clause or HAVING clause since it seems unable to properly access the data (from what I've read). When I exclude this clause, the script executes properly, so I'm confident there are no other errors.
I've tried writing it as follows but the error I get is ORA-00936: missing expression and it is just before the > sign. I thought this may be due to improper bracket placing but none of my attempts resolved this. Here is my attempt:
SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
FROM promos l
INNER JOIN sales r
ON r.promo_id = l.promo_id
GROUP BY l.l_category
HAVING (SELECT AVG(l.l_cost) OVER (PARTITION BY l.l_cost)) >
(SELECT AVG(r.r_sold) OVER (PARTITION BY r.r_sold));
I tried doing this without the OVER (PARTITION BY ...) as well as putting it into a WHERE clause but it didn't resolve the error. I'm pretty sure I need to put it into a SELECT statement somehow but I'm at a loss.
You do not need to use the OVER clause when applying the aggregate functions in the HAVING clause. Just use the aggregate functions on their own.
SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
FROM promos l
INNER JOIN sales r
ON r.promo_id = l.promo_id
GROUP BY l.l_category
HAVING HAVING AVG(l.l_cost) > AVG(r.r_sold)

Oracle Group by issue

I have the below query. The problem is the last column productdesc is returning two records and the query fails because of distinct. Now i need to add one more column in where clause of the select query so that it returns one record. The issue is that the column i need
to add should not be a part of group by clause.
SELECT product_billing_id,
billing_ele,
SUM(round(summary_net_amt_excl_gst/100)) gross,
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele) productdescr
FROM bil.bill_sum aa
WHERE file_id = 38613 --1=1
AND line_type = 'D'
AND (product_billing_id, billing_ele) IN (SELECT DISTINCT
product_billing_id,
billing_ele
FROM bil.bill_l2 )
AND trans_type_desc <> 'Change'
GROUP BY product_billing_id, billing_ele
I want to modify the select statement to the below way by adding a new filter to the where clause so that it returns one record .
(SELECT DISTINCT description
FROM RRES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele
AND (rate_structure_start_date <= TO_DATE(aa.p_effective_date,'yyyymmdd')
AND rate_structure_end_date > TO_DATE(aa.p_effective_date,'yyyymmdd'))
) productdescr
The aa.p_effective_date should not be a part of GROUP BY clause. How can I do it? Oracle is the Database.
So there are multiple RES.tariff records for a given product_billing_id/billing_ele, differentiated by the start/end dates
You want the description for the record that encompasses the 'p_effective_date' from bil.bill_sum. The kicker is that you can't (or don't want to) include that in the group by. That suggests you've got multiple rows in bil.bill_sum with different effective dates.
The issue is what do you want to happen if you are summarising up those multiple rows with different dates. Which of those dates do you want to use as the one to get the description.
If it doesn't matter, simply use MIN(aa.p_effective_date), or MAX.
Have you looked into the Oracle analytical functions. This is good link Analytical Functions by Example