Postal code and location amounts in SQL

If it's been asked before then my search function is cursed.
select postkode, gemeente, count(postkode) as total
from leveradressen
where land=1
group by postkode
order by postkode asc
postkode is the postal code (zipcode)
gemeente is the town name
I try to run this but I get
Msg 8120, Level 16, State 1, Line 4
Column 'leveradressen.Gemeente' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
... I have no idea how to make it work. I know WHY it does this, but this is an assignment to see if I can keep this job and I'm utterly failing at it.
What's happening is that the server thinks there are too many place names to pick from. But every zipcode matches exactly one place name; there are never any deviations. 2000 will always be Antwerpen, for example.
And no, I can't fix it by adding a 'group by' because that's already in there, and I can't use aggregate functions because postkode and gemeente (the place name) are both nvarchar.
I'm out of ideas.

You should group by gemeente as well.
So:
select postkode, gemeente, count(postkode) as total
from leveradressen
where land=1
group by postkode, gemeente
order by postkode asc
Not sure what you mean by "I can't fix it by adding a 'group by' because that's already in there".

You could use an aggregate function; MIN and MAX work on nvarchar columns too:
select postkode, MIN(gemeente) AS gemeente, count(postkode) as total
-- here goes agg function
from leveradressen
where land=1
group by postkode
order by postkode asc
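
To see that the two suggestions above give the same result when each zipcode maps to exactly one town, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the sample rows are made up. Note that SQLite tolerates a bare column outside GROUP BY, so it will not reproduce Msg 8120 itself; it only shows what the two fixed queries return.

```python
import sqlite3

# In-memory stand-in for the leveradressen table (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leveradressen (postkode TEXT, gemeente TEXT, land INTEGER)"
)
conn.executemany(
    "INSERT INTO leveradressen VALUES (?, ?, ?)",
    [
        ("2000", "Antwerpen", 1),
        ("2000", "Antwerpen", 1),
        ("9000", "Gent", 1),
        ("1012", "Amsterdam", 2),  # other country, removed by WHERE land = 1
    ],
)

# Fix 1: list every non-aggregated column in GROUP BY.
group_by_both = conn.execute(
    """
    SELECT postkode, gemeente, COUNT(postkode) AS total
    FROM leveradressen
    WHERE land = 1
    GROUP BY postkode, gemeente
    ORDER BY postkode ASC
    """
).fetchall()

# Fix 2: wrap the extra column in an aggregate (MIN works on text).
min_gemeente = conn.execute(
    """
    SELECT postkode, MIN(gemeente) AS gemeente, COUNT(postkode) AS total
    FROM leveradressen
    WHERE land = 1
    GROUP BY postkode
    ORDER BY postkode ASC
    """
).fetchall()

print(group_by_both)
print(min_gemeente)
```

If a zipcode ever did map to two town names, the first query would return one row per (postkode, gemeente) pair, while the second would still return one row per postkode and silently pick the alphabetically first name.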

Related

Specified Expression not part of an aggregate function

SELECT
Product_Line_ID=2 OR Product_Line_ID=3,
COUNT(Product_Finish), MIN(Standard_Price)
FROM Product_T
WHERE Product_Finish
GROUP BY Standard_Price
HAVING AVG(Standard_Price) <700
ORDER BY Product_FInish;
I keep getting this error: Your query does not include the specified expression 'Product_Line_ID=2 OR Product_Line_ID=3' as part of an aggregate function. Can anyone help me with this? Not sure how to select product line id that is 2 or 3.
What is confusing about the error message? Product_Line_ID=2 OR Product_Line_ID=3 is not valid SQL.
Your query basically makes no sense. You have boolean conditions in the SELECT, you have a WHERE clause with a column name and no conditions, you are ordering by the column you are counting.
I can guess that you intend something like this:
SELECT Product_Line_ID, COUNT(Product_Finish), MIN(Standard_Price)
FROM Product_T
WHERE Product_Line_ID IN (2, 3)
GROUP BY Product_Line_ID
HAVING AVG(Standard_Price) < 700
ORDER BY COUNT(Product_Finish);
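
A quick way to check the guessed query above is to run it against a few rows. This is only a sketch: it uses Python's sqlite3 module rather than the asker's database, and the rows in Product_T are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; the real Product_T contents are not shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product_T ("
    "Product_Line_ID INTEGER, Product_Finish TEXT, Standard_Price INTEGER)"
)
conn.executemany(
    "INSERT INTO Product_T VALUES (?, ?, ?)",
    [
        (1, "Oak", 100),      # line 1: removed by WHERE
        (2, "Cherry", 175),
        (2, "Ash", 200),
        (3, "Walnut", 375),
        (3, "Oak", 750),
        (3, "Ash", 650),
        (4, "Oak", 900),      # line 4: removed by WHERE
    ],
)

# WHERE filters rows first, GROUP BY forms one group per product line,
# HAVING then filters whole groups by their average price.
product_lines = conn.execute(
    """
    SELECT Product_Line_ID, COUNT(Product_Finish), MIN(Standard_Price)
    FROM Product_T
    WHERE Product_Line_ID IN (2, 3)
    GROUP BY Product_Line_ID
    HAVING AVG(Standard_Price) < 700
    ORDER BY COUNT(Product_Finish)
    """
).fetchall()

print(product_lines)
```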

Why can't the COUNT() function be used in a WHERE clause? How do I get around this?

I'm trying to write a query to return the town, and the number of runners from each town where the number of runners is greater than 5.
My query right now looks like this:
select hometown, count(hometown) from marathon2016 where count(hometown) > 5 group by hometown order by count(hometown) desc;
but sqlite3 responds with this:
Error: misuse of aggregate: count()
What am I doing wrong? Why can't I use count() here, and what should I use instead?
When you're trying to use an aggregate function (such as count) in a WHERE clause, you're usually looking for HAVING instead of WHERE:
select hometown, count(hometown)
from marathon2016
group by hometown
having count(*) > 5
order by count(*) desc
You can't use an aggregate in a WHERE clause because aggregates are computed across multiple rows (as specified by GROUP BY), but WHERE is used to filter individual rows to determine what row set GROUP BY will be applied to (i.e. WHERE happens before grouping and aggregates apply after grouping).
Try the following:
select
hometown,
count(hometown) as hometown_count
from
marathon2016
group by
hometown
having
hometown_count > 5
order by
hometown_count desc;
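
The WHERE versus HAVING difference described above can be reproduced with Python's sqlite3 module. This is a minimal sketch; the runner column and the rows are made up, and only the table and hometown column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marathon2016 (runner TEXT, hometown TEXT)")

# Invented sample: 7 runners from Boston, 6 from Austin, 3 from Salem.
rows = [(f"runner{i}", "Boston") for i in range(7)]
rows += [(f"runner{i}", "Austin") for i in range(7, 13)]
rows += [(f"runner{i}", "Salem") for i in range(13, 16)]
conn.executemany("INSERT INTO marathon2016 VALUES (?, ?)", rows)

# The original query: an aggregate inside WHERE is rejected by SQLite.
try:
    conn.execute(
        "SELECT hometown, COUNT(hometown) FROM marathon2016 "
        "WHERE COUNT(hometown) > 5 GROUP BY hometown"
    )
    where_failed = False
except sqlite3.OperationalError as exc:
    where_failed = True
    print("WHERE version rejected:", exc)

# The fix: filter the groups with HAVING, which runs after GROUP BY.
big_towns = conn.execute(
    """
    SELECT hometown, COUNT(*) AS runners
    FROM marathon2016
    GROUP BY hometown
    HAVING COUNT(*) > 5
    ORDER BY COUNT(*) DESC
    """
).fetchall()

print(big_towns)
```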

Finding most popular and most unique records using SQL

My mom wanted a baby name game for my brother's baby shower. Wanting to learn python, I volunteered to do it. I pretty much have the python bit, it's the SQL that is throwing me.
The way the game is supposed to work is everyone at the shower writes down names on paper, I manually enter them into Excel (normalizing spellings as much as possible) and export to MS Access. Then I run my python program to find the player with the most popular names and the player with the most unique names. The database, called "babynames", is just four columns.
ID | BabyFirstName | BabyMiddleName | PlayerName
---|---------------|----------------|-----------
My mom has changed things every so often, but as they stand right now, I have to figure out :
a) The most popular name (or names if there is a tie) out of all first and middle names
b) The most unique name (or names if there is a tie) out of all the first and middle names
c) The player that has the most number of popular names (wins a prize)
d) The player that has the most number of unique names (wins a prize)
I've been working on this for about a week now and can't even get a SQL query for a) and b) to work, much less c) and d). I'm more than just a bit frustrated.
BTW, I'm just looking at spellings of the names, not phonetics. As I manually enter names, I will change names like "Kris" to "Chris" and "Xtina" to "Christina" etc.
Editing to add a couple of the most recent queries I tried for a)
SELECT [BabyFirstName],
COUNT ([BabyFirstName]) AS 'FirstNameOccurrence'
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY 'FirstNameOccurrence' DESC
LIMIT 1
and
SELECT [BabyFirstName]
FROM [babynames]
GROUP BY [BabyFirstName]
HAVING COUNT(*) =
(SELECT COUNT(*)
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT(*) DESC
LIMIT 1)
These both lead to syntax errors.
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error in ORDER BY clause. (-3508) (SQLExecDirectW)')
I've tried using [FirstNameOccurrence] and just FirstNameOccurrence as well with the same error. Not sure why it's not recognizing it by that column name to order by.
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error. in query expression 'COUNT(*) = (SELECT COUNT(*) FROM [babynames] GROUP BY [BabyFirstName] ORDER BY COUNT(*) DESC LIMIT 1)'. (-3100) (SQLExecDirectW)")
I'll admit that I'm not really grokking all of the COUNT(*) commands here, but this was a solution for a similar issue here in stackoverflow that I figured I'd try when my other idea didn't pan out.
For A and B, use a group by clause in your SQL, and then count, and order by the count. Use descending order for A and ascending order for B, and just take the first result for each.
For C and D, use essentially the same strategy but add the PlayerName (e.g. group by babyname, playername) and then apply the same descending or ascending ordering.
Here's Microsoft's write-up for a group by clause in MS Access: https://office.microsoft.com/en-us/access-help/group-by-clause-HA001231482.aspx
Here's an even better write-up demonstrating how to do both group by and order by at the same time: http://rogersaccessblog.blogspot.com/2009/06/select-queries-part-3-sorting-and.html
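For the first query you tried, change it to the following (Access does not let ORDER BY refer to a column alias, so repeat the aggregate expression there):
SELECT TOP 1 [BabyFirstName],
COUNT([BabyFirstName]) AS FirstNameOccurrence
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT([BabyFirstName]) DESC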
For the second, change it to:
SELECT [BabyFirstName]
FROM [babynames]
GROUP BY [BabyFirstName]
HAVING COUNT(*) =
(SELECT TOP 1 COUNT(*)
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT(*) DESC)
Limiting the number of records returned by a SQL statement in Access is done by putting TOP n directly after SELECT, not by appending LIMIT after ORDER BY.
Also, TOP in Access returns every row that ties on the ORDER BY value, so if two or more names share the highest count and TOP 1 is specified, you'll see them all.
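
For parts a) and b), where first and middle names count together and ties should all be reported, one possible approach is to stack both columns with UNION ALL and compare each name's count to the overall maximum or minimum. The sketch below runs on SQLite through Python's sqlite3 module, not on Access (Access has no WITH clause, so the same idea would need saved queries or nested subqueries there). The sample rows and the helper name names_with_extreme_count are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE babynames ("
    "ID INTEGER PRIMARY KEY, BabyFirstName TEXT, "
    "BabyMiddleName TEXT, PlayerName TEXT)"
)
# Invented entries: Chris and Lee each appear 3 times, Dana and Marie once.
conn.executemany(
    "INSERT INTO babynames (BabyFirstName, BabyMiddleName, PlayerName) "
    "VALUES (?, ?, ?)",
    [
        ("Chris", "Lee", "Ann"),
        ("Christina", "Chris", "Ann"),
        ("Lee", "Marie", "Bob"),
        ("Chris", "Lee", "Bob"),
        ("Dana", "Christina", "Cy"),
    ],
)


def names_with_extreme_count(conn, extreme):
    """Return (name, occurrences) rows whose count equals the MAX or MIN count."""
    if extreme not in ("MAX", "MIN"):
        raise ValueError("extreme must be 'MAX' or 'MIN'")
    query = f"""
        WITH all_names AS (
            -- stack first and middle names into one column
            SELECT BabyFirstName AS name FROM babynames
            UNION ALL
            SELECT BabyMiddleName FROM babynames
        ),
        name_counts AS (
            SELECT name, COUNT(*) AS occurrences
            FROM all_names
            GROUP BY name
        )
        SELECT name, occurrences
        FROM name_counts
        WHERE occurrences = (SELECT {extreme}(occurrences) FROM name_counts)
        ORDER BY name
    """
    return conn.execute(query).fetchall()


most_popular = names_with_extreme_count(conn, "MAX")
most_unique = names_with_extreme_count(conn, "MIN")
print(most_popular)
print(most_unique)
```

Comparing against the maximum count returns every tied name, which is the same effect the answer above describes for TOP 1 with ties in Access.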

Getting row count with other columns

I need to get some columns which are LinkID, ReplyCount and the most important one is TotalRowCount.
This is my code:
SELECT
TOP(10) link.LinkID, mesaj.ReplyCount
FROM
TBL_UserIcerikler AS link
INNER JOIN
TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
WHERE
link.PublishDate >='2013-03-12 19:46:45.000'
ORDER BY
link.PublishDate DESC
It stops running when I add Count(*) AS a to the select list.
I get this message instead. How can I get the row count? Does anyone have any information about this topic?
Msg 208, Level 16, State 1, Line 1
Invalid object name 'TBL_UserIcerikler'
Count(*) is an aggregate function which returns the number of rows which have been summarised (not the number of rows returned by the query), so you must GROUP BY something and specify only the fields by which you group (or just return COUNT(*)).
It doesn't make a lot of sense to mix COUNT() and TOP().
For example :
SELECT link.LinkID, mesaj.ReplyCount, COUNT(*)
FROM TBL_UserIcerikler AS link
INNER JOIN TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
WHERE link.PublishDate >='2013-03-12 19:46:45.000'
GROUP BY link.LinkID, mesaj.ReplyCount;
I know it's not quite what you want, but you haven't given quite enough explanation as to what you want to get out of your database.
That said, I think you might have forgotten a comma in the expression list.
Why not post your modified query?
Please read this MSDN explanation of group by, you will understand why you need it to get your total count.
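
Since the question never says exactly what TotalRowCount should mean, the following is only one possible reading: the number of rows matching the filter, repeated next to each of the top rows. The sketch uses Python's sqlite3 module, so LIMIT stands in for SQL Server's TOP, and the sample rows are invented. It also runs the grouped query from the answer above to show that COUNT(*) with GROUP BY counts rows per group, not rows in the whole result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TBL_UserIcerikler (LinkID INTEGER, FromUserID INTEGER, PublishDate TEXT)"
)
conn.execute("CREATE TABLE TBL_UserMesajlar (UserID INTEGER, ReplyCount INTEGER)")
conn.executemany(
    "INSERT INTO TBL_UserIcerikler VALUES (?, ?, ?)",
    [
        (1, 10, "2013-03-13 10:00:00.000"),
        (2, 11, "2013-03-14 10:00:00.000"),
        (3, 10, "2013-03-15 10:00:00.000"),
        (4, 11, "2013-03-01 10:00:00.000"),  # older than the cutoff
    ],
)
conn.executemany("INSERT INTO TBL_UserMesajlar VALUES (?, ?)", [(10, 5), (11, 8)])

params = {"since": "2013-03-12 19:46:45.000"}

# Assumed meaning: total matching rows, computed by a scalar subquery
# so it is not affected by LIMIT (TOP in SQL Server).
top_with_total = conn.execute(
    """
    SELECT link.LinkID, mesaj.ReplyCount,
           (SELECT COUNT(*)
            FROM TBL_UserIcerikler AS l2
            INNER JOIN TBL_UserMesajlar AS m2 ON l2.FromUserID = m2.UserID
            WHERE l2.PublishDate >= :since) AS TotalRowCount
    FROM TBL_UserIcerikler AS link
    INNER JOIN TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
    WHERE link.PublishDate >= :since
    ORDER BY link.PublishDate DESC
    LIMIT 2
    """,
    params,
).fetchall()

# The grouped form: COUNT(*) is the size of each (LinkID, ReplyCount) group.
per_group = conn.execute(
    """
    SELECT link.LinkID, mesaj.ReplyCount, COUNT(*)
    FROM TBL_UserIcerikler AS link
    INNER JOIN TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
    WHERE link.PublishDate >= :since
    GROUP BY link.LinkID, mesaj.ReplyCount
    ORDER BY link.LinkID
    """,
    params,
).fetchall()

print(top_with_total)
print(per_group)
```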

Group by SQL statement

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, I would like to have the corresponding medication_dose also. So I type this up
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But it gives me an error saying:
"Column 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?
Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
-- latest date per patient and medication, so other patients' rows cannot interfere
JOIN (SELECT y.patient_id,
y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
FROM BIOLOGICAL y
GROUP BY y.patient_id, y.medication_name) x ON x.patient_id = b.patient_id
AND x.medication_name = b.medication_name
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However, note that this is normally an indication that you are either building the query incorrectly, or that you need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should be the one suggested by OMG Ponies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?
You need to put max(medication_dose) in your select. Group by returns a result set that contains distinct values for fields in your group by clause, so apparently you have multiple records that have the same medication_name, but different doses, so you are getting two results.
By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.)
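
The difference between the two suggestions in this thread (joining back to the latest date versus taking MAX(medication_dose)) shows up as soon as the highest dose is not the most recent one. Here is a small sketch using Python's sqlite3 module with invented rows; the subquery is restricted to the same patient, which is an assumption about what the asker wants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE biological ("
    "patient_id INTEGER, medication_name TEXT, "
    "medication_dose INTEGER, patient_history_date_bio TEXT)"
)
# Invented history: for patient 12 the most recent dose is the LOWER one.
conn.executemany(
    "INSERT INTO biological VALUES (?, ?, ?, ?)",
    [
        (12, "aspirin", 100, "2010-01-01"),
        (12, "aspirin", 50, "2010-03-01"),
        (12, "ibuprofen", 400, "2010-01-15"),
        (12, "ibuprofen", 200, "2010-02-01"),
        (13, "aspirin", 75, "2010-04-01"),  # another patient, later date
    ],
)
params = {"pid": 12}

# Join each row to the latest date for its medication (same patient only).
latest_rows = conn.execute(
    """
    SELECT b.medication_name, b.patient_history_date_bio AS med_date, b.medication_dose
    FROM biological AS b
    JOIN (SELECT y.medication_name, MAX(y.patient_history_date_bio) AS max_date
          FROM biological AS y
          WHERE y.patient_id = :pid
          GROUP BY y.medication_name) AS x
      ON x.medication_name = b.medication_name
     AND x.max_date = b.patient_history_date_bio
    WHERE b.patient_id = :pid
    ORDER BY b.medication_name
    """,
    params,
).fetchall()

# Aggregating the dose separately pairs the latest date with the highest dose,
# which need not come from the same row.
max_dose_rows = conn.execute(
    """
    SELECT medication_name, MAX(patient_history_date_bio), MAX(medication_dose)
    FROM biological
    WHERE patient_id = :pid
    GROUP BY medication_name
    ORDER BY medication_name
    """,
    params,
).fetchall()

print(latest_rows)
print(max_dose_rows)
```

With this data the join returns the doses actually recorded on the latest dates (50 and 200), while MAX(medication_dose) reports 100 and 400, values taken from older rows.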