Getting row count with other columns

I need to get some columns which are LinkID, ReplyCount and the most important one is TotalRowCount.
This is my code:
SELECT
TOP(10) link.LinkID, mesaj.ReplyCount
FROM
TBL_UserIcerikler AS link
INNER JOIN
TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
WHERE
link.PublishDate >='2013-03-12 19:46:45.000'
ORDER BY
link.PublishDate DESC
It no longer runs when I add "Count(*) AS a".
I get this message instead. How can I get the row count? Does anyone have any information about this topic?
Msg 208, Level 16, State 1, Line 1
Invalid object name 'TBL_UserIcerikler'

Count(*) is an aggregate function which returns the number of rows which have been summarised (not the number of rows returned by the query), so you must GROUP BY something and specify only the fields by which you group (or just return COUNT(*)).
It doesn't make a lot of sense to mix COUNT() and TOP().
For example:
SELECT link.LinkID, mesaj.ReplyCount, COUNT(*)
FROM TBL_UserIcerikler AS link
INNER JOIN TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
WHERE link.PublishDate >='2013-03-12 19:46:45.000'
GROUP BY link.LinkID, mesaj.ReplyCount;
I know it's not quite what you want, but you haven't given quite enough explanation as to what you want to get out of your database.
That said, I think you might have forgotten a comma in the expression list.
Why not post your modified query?

Please read the MSDN explanation of GROUP BY; it will show you why you need it to get your total count.
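The difference between a grouped COUNT(*) and a total row count is easier to see on a few rows. The following is only a sketch: it uses Python's sqlite3 module as a stand-in for SQL Server, the sample rows are invented, and LIMIT replaces TOP because SQLite has no TOP. The window form COUNT(*) OVER () is offered as one possible way to get a TotalRowCount next to the other columns (it needs SQLite 3.25 or later; SQL Server also supports the OVER clause), not as what either answer above proposed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TBL_UserIcerikler (LinkID INTEGER, FromUserID INTEGER, PublishDate TEXT);
CREATE TABLE TBL_UserMesajlar (UserID INTEGER, ReplyCount INTEGER);
-- Invented sample data; link 4 is older than the cutoff and gets filtered out.
INSERT INTO TBL_UserIcerikler VALUES
  (1, 10, '2013-03-13 08:00:00.000'),
  (2, 10, '2013-03-14 09:00:00.000'),
  (3, 20, '2013-03-15 10:00:00.000'),
  (4, 20, '2013-03-01 10:00:00.000');
INSERT INTO TBL_UserMesajlar VALUES (10, 5), (20, 7);
""")

# Grouped COUNT(*): counts the rows inside each (LinkID, ReplyCount) group.
grouped = conn.execute("""
SELECT link.LinkID, mesaj.ReplyCount, COUNT(*)
FROM TBL_UserIcerikler AS link
INNER JOIN TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
WHERE link.PublishDate >= '2013-03-12 19:46:45.000'
GROUP BY link.LinkID, mesaj.ReplyCount
ORDER BY link.LinkID
""").fetchall()

# Window COUNT(*): every row carries the number of rows that matched the
# WHERE clause, computed before LIMIT trims the output.
windowed = conn.execute("""
SELECT link.LinkID, mesaj.ReplyCount, COUNT(*) OVER () AS TotalRowCount
FROM TBL_UserIcerikler AS link
INNER JOIN TBL_UserMesajlar AS mesaj ON link.FromUserID = mesaj.UserID
WHERE link.PublishDate >= '2013-03-12 19:46:45.000'
ORDER BY link.PublishDate DESC
LIMIT 2
""").fetchall()

print(grouped)
print(windowed)
```

With this data each group holds one row, so the grouped count is 1 everywhere, while the windowed count reports 3 on both returned rows.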

Related

Why is my SQL aliasing not being recognized?

This may be an incredibly simple question, but I'm not seeing what the problem here is. I'm trying to teach myself SQL and was working on an experiment to play with subqueries and aliasing. When I try to enter the following query (into BigQuery), I get an error message "Unrecognized name: cast1 at [3:1]" which persists even if I copy the COUNT lines into the outer query. Obviously there is something I'm not understanding about aliasing here, but I'm not sure where I am going wrong. I would appreciate any help from more experienced SQL users out there on how to improve, thank you in advance!
SELECT
cast__1_,
cast1 + cast2 + cast3 + cast4 AS num_films,
(
SELECT
cast__1_,
COUNT (cast__1_) AS cast1,
COUNT (cast__2_) AS cast2,
COUNT (cast__3_) AS cast3,
COUNT (cast__4_) AS cast4,
FROM `dataproject1-351413.movie_data.movies` AS movies
WHERE
cast__1_ IS NOT Null
GROUP BY
cast__1_
)
FROM `dataproject1-351413.movie_data.movies`
GROUP BY
cast__1_
(The intended result was two columns, pairing each actor with the number of films across the dataset, in case that is not clear from the query)

The structure of your query seems a bit skewed. You need to treat the aggregate query as a derived table, such as:
SELECT cast__1_, cast1 + cast2 + cast3 + cast4 AS num_films
from (
SELECT
cast__1_,
COUNT (cast__1_) AS cast1,
COUNT (cast__2_) AS cast2,
COUNT (cast__3_) AS cast3,
COUNT (cast__4_) AS cast4,
FROM `dataproject1-351413`.movie_data.movies
WHERE cast__1_ IS NOT Null
GROUP BY cast__1_
)t;
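The derived-table shape can be tried on a tiny table. This is a minimal sketch using Python's sqlite3 module rather than BigQuery; the table is cut down to two cast columns and the rows are invented. The point it illustrates is that the aliases cast1 and cast2 only become visible to the outer SELECT because the aggregate query sits in the FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented stand-in for the movies table, reduced to two cast columns.
CREATE TABLE movies (title TEXT, cast__1_ TEXT, cast__2_ TEXT);
INSERT INTO movies VALUES
  ('A', 'Ann', 'Bob'),
  ('B', 'Ann', NULL),
  ('C', 'Cy',  'Ann'),
  ('D', NULL,  'Bob');
""")

rows = conn.execute("""
SELECT cast__1_, cast1 + cast2 AS num_films
FROM (
    -- The aggregate runs first; its aliases become columns of t.
    SELECT cast__1_,
           COUNT(cast__1_) AS cast1,
           COUNT(cast__2_) AS cast2
    FROM movies
    WHERE cast__1_ IS NOT NULL
    GROUP BY cast__1_
) AS t
ORDER BY cast__1_
""").fetchall()

print(rows)
```

Note that the sum adds up the filled cast slots in the films where each actor is listed first. In this sample, Ann's appearance in the second slot of film C is not attributed to her, so the result may not match the intended "films per actor" figure.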

postal code and location amounts in sql

If it's been asked before then my search function is cursed.
select postkode, gemeente, count(postkode) as total
from leveradressen
where land=1
group by postkode
order by postkode asc
postkode is the zip code.
gemeente is the town name.
I try to run this, but I get:
Msg 8120, Level 16, State 1, Line 4
Column 'leveradressen.Gemeente' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
... no idea how to make it work. I know WHY it does this, but this is an assignment to see if I can keep this job and I'm utterly failing at it.
What's happening is that the server thinks there are too many place names to pick from. But each zip code always maps to the same place name; there are never any deviations. 2000 will always be Antwerpen, for example.
And no, I can't fix it by adding a 'group by' because that's already in there, and I can't use aggregate functions because postkode and plaatsnaam are both nvarchar.
I'm out of ideas.

You should group by gemeente as well.
So:
select postkode, gemeente, count(postkode) as total
from leveradressen
where land=1
group by postkode, gemeente
order by postkode asc
Not sure what you mean by "i can't fix it by adding a 'group by' because that's in there".

You could use:
select postkode, MIN(gemeente) AS gemeente, count(postkode) as total
-- here goes agg function
from leveradressen
where land=1
group by postkode
order by postkode asc
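Both suggested fixes can be compared side by side. The sketch below uses Python's sqlite3 module with invented rows. One caveat: SQLite accepts the original query without complaint, so it cannot reproduce Msg 8120; it only shows that the two rewrites return the same rows when every postkode maps to a single gemeente.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented sample rows; land = 2 is excluded by the WHERE clause.
CREATE TABLE leveradressen (postkode TEXT, gemeente TEXT, land INTEGER);
INSERT INTO leveradressen VALUES
  ('2000', 'Antwerpen', 1),
  ('2000', 'Antwerpen', 1),
  ('9000', 'Gent',      1),
  ('1012', 'Amsterdam', 2);
""")

# Fix 1: list every non-aggregated column in GROUP BY.
by_both = conn.execute("""
SELECT postkode, gemeente, COUNT(postkode) AS total
FROM leveradressen
WHERE land = 1
GROUP BY postkode, gemeente
ORDER BY postkode ASC
""").fetchall()

# Fix 2: wrap the extra column in an aggregate; MIN works on text columns too.
with_min = conn.execute("""
SELECT postkode, MIN(gemeente) AS gemeente, COUNT(postkode) AS total
FROM leveradressen
WHERE land = 1
GROUP BY postkode
ORDER BY postkode ASC
""").fetchall()

print(by_both)
print(with_min)
```

If a postkode ever had two different gemeente values, the first query would return one row per combination, while the second would keep one row and silently pick the alphabetically first name.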

SQL ORDER BY clause causing GROUP BY/aggregate error

I get this error:
PG::GroupingError: ERROR: column "relationships.created_at" must appear in the GROUP BY clause or be used in an aggregate function
from this query:
last_check = @user.last_check.to_i
@new_relationships = User.select('*')
.from("(#{@rels_unordered.to_sql}) AS rels_unordered")
.joins("
INNER JOIN relationships
ON rels_unordered.id = relationships.character_id
WHERE EXTRACT(EPOCH FROM relationships.created_at) > #{last_check}
ORDER BY relationships.created_at DESC
")
Without the ORDER BY line, it works fine. I don't understand what the GROUP BY clause is. How do I get it working and still order by relationships.created_at?
EDIT
I understand you can GROUP BY relationships.created_at. But isn't grouping unnecessary? Is the problem that relationships.created_at is not included in the SELECT? How do you include it? If you've already done an INNER JOIN with relationships, why the hell isn't relationships.created_at included in the result??
I've just realised this is all happening because the logs show the query begins with SELECT COUNT(*) FROM..... So the COUNT is the aggregate function. But I never requested a COUNT! Why does the query start with that?
EDIT 2
Ok, this seems to be happening because of lazy querying. The first thing that happens to @new_relationships is @new_relationships.any? This affects the query and turns it into a count. So I suppose the question is, how do I force the query to run as originally intended? And also, how do I check if @new_relationships is empty without affecting the SQL query?

You just need to add a GROUP BY along with your ORDER BY clause:
last_check = @user.last_check.to_i
@new_relationships =
User.select('"rels_unordered".*')
.from("(#{@rels_unordered.to_sql}) AS rels_unordered")
.joins("INNER JOIN relationships
ON rels_unordered.id = relationships.character_id
WHERE EXTRACT(EPOCH FROM relationships.created_at) > #{last_check}
GROUP BY relationships.created_at
ORDER BY relationships.created_at DESC ")

It's asking for a GROUP BY after your FROM clause in your EXTRACT (essentially a subselect). There are ways around it, but I've found it's often easier to make a GROUP BY work. Try: ...FROM relationships.created_at GROUP BY id, or whatever indexing column you are using from that table. It seems like your ORDER BY is conflicting with itself. By grouping the subselect data it should lose its conflict.

Finding most popular and most unique records using SQL

My mom wanted a baby name game for my brother's baby shower. Wanting to learn python, I volunteered to do it. I pretty much have the python bit, it's the SQL that is throwing me.
The way the game is supposed to work is everyone at the shower writes down names on paper, I manually enter them into Excel (normalizing spellings as much as possible) and export to MS Access. Then I run my python program to find the player with the most popular names and the player with the most unique names. The database, called "babynames", is just four columns.
ID | BabyFirstName | BabyMiddleName | PlayerName
---|---------------|----------------|-----------
My mom has changed things every so often, but as they stand right now, I have to figure out:
a) The most popular name (or names if there is a tie) out of all first and middle names
b) The most unique name (or names if there is a tie) out of all the first and middle names
c) The player that has the most number of popular names (wins a prize)
d) The player that has the most number of unique names (wins a prize)
I've been working on this for about a week now and can't even get a SQL query for a) and b) to work, much less c) and d). I'm more than just a bit frustrated.
BTW, I'm just looking at spellings of the names, not phonetics. As I manually enter names, I will change names like "Kris" to "Chris" and "Xtina" to "Christina" etc.
Editing to add a couple of the most recent queries I tried for a)
SELECT [BabyFirstName],
COUNT ([BabyFirstName]) AS 'FirstNameOccurrence'
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY 'FirstNameOccurrence' DESC
LIMIT 1
and
SELECT [BabyFirstName]
FROM [babynames]
GROUP BY [BabyFirstName]
HAVING COUNT(*) =
(SELECT COUNT(*)
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT(*) DESC
LIMIT 1)
These both lead to syntax errors.
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error in ORDER BY clause. (-3508) (SQLExecDirectW)')
I've tried using [FirstNameOccurrence] and just FirstNameOccurrence as well with the same error. Not sure why it's not recognizing it by that column name to order by.
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error. in query expression 'COUNT(*) = (SELECT COUNT(*) FROM [babynames] GROUP BY [BabyFirstName] ORDER BY COUNT(*) DESC LIMIT 1)'. (-3100) (SQLExecDirectW)")
I'll admit that I'm not really grokking all of the COUNT(*) commands here, but this was a solution for a similar issue here in stackoverflow that I figured I'd try when my other idea didn't pan out.

For A and B, use a group by clause in your SQL, and then count, and order by the count. Use descending order for A and ascending order for B, and just take the first result for each.
For C and D, essentially use the same strategy but now just add the PlayerName (e.g. group by babyname,playername) and then use the ascending order/descending order question.
Here's Microsoft's write-up for a group by clause in MS Access: https://office.microsoft.com/en-us/access-help/group-by-clause-HA001231482.aspx
Here's an even better write-up demonstrating how to do both group by and order by at the same time: http://rogersaccessblog.blogspot.com/2009/06/select-queries-part-3-sorting-and.html
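The group, count, and order strategy for a) and b) might look like the following. This is a sketch under several assumptions: it runs on Python's sqlite3 module instead of Access through pyodbc, the rows are invented, and first and middle names are stacked with UNION ALL so that both columns are counted together. SQLite allows ORDER BY on the alias n; other engines may require repeating COUNT(*) there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented sample entries in the four-column layout from the question.
CREATE TABLE babynames (ID INTEGER, BabyFirstName TEXT, BabyMiddleName TEXT, PlayerName TEXT);
INSERT INTO babynames VALUES
  (1, 'Chris', 'Lee',   'Ann'),
  (2, 'Chris', 'Max',   'Bob'),
  (3, 'Zed',   'Lee',   'Ann'),
  (4, 'Ivy',   'Chris', 'Cy');
""")

# Stack first and middle names into one column, then group and count.
counts = conn.execute("""
SELECT name, COUNT(*) AS n
FROM (
    SELECT BabyFirstName AS name FROM babynames
    UNION ALL
    SELECT BabyMiddleName FROM babynames
) AS all_names
GROUP BY name
ORDER BY n DESC, name
""").fetchall()

# The first row has the highest count and the last row the lowest;
# keep every name that ties with them.
most_popular = [name for name, n in counts if n == counts[0][1]]
most_unique = [name for name, n in counts if n == counts[-1][1]]

print(counts)
print(most_popular, most_unique)
```

Resolving ties in Python keeps the SQL simple; the same counts could then be joined back to PlayerName for c) and d).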

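For the first query you tried, change it to the following (Access does not let ORDER BY refer to a column alias, so the aggregate is repeated there):
SELECT TOP 1 [BabyFirstName],
COUNT([BabyFirstName]) AS FirstNameOccurrence
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT([BabyFirstName]) DESC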
For the second, change it to:
SELECT [BabyFirstName]
FROM [babynames]
GROUP BY [BabyFirstName]
HAVING COUNT(*) =
(SELECT TOP 1 COUNT(*)
FROM [babynames]
GROUP BY [BabyFirstName]
ORDER BY COUNT(*) DESC)
Limiting the number of records returned by a SQL statement in Access is achieved by adding a TOP predicate directly after SELECT, not with ORDER BY ... LIMIT.
Also, the Access TOP predicate does not choose between rows that tie on the ORDER BY value, so if two or more rows share the top value and TOP 1 is specified, you'll see them all.
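The second query's HAVING pattern returns every name that ties for the highest count, which can be checked on a small sample. This is a sketch run on Python's sqlite3 module with invented rows; because SQLite has no TOP, the subquery uses LIMIT 1, which is exactly the part that has to become TOP 1 in Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented entries: Chris and Zed both appear twice as a first name.
CREATE TABLE babynames (ID INTEGER, BabyFirstName TEXT, BabyMiddleName TEXT, PlayerName TEXT);
INSERT INTO babynames VALUES
  (1, 'Chris', 'Lee', 'Ann'),
  (2, 'Chris', 'Max', 'Bob'),
  (3, 'Zed',   'Lee', 'Ann'),
  (4, 'Zed',   'Sam', 'Cy'),
  (5, 'Ivy',   'Kim', 'Cy');
""")

# The subquery finds the highest per-name count; the outer query keeps
# every name whose count equals it, so ties are all returned.
tied_top = conn.execute("""
SELECT BabyFirstName
FROM babynames
GROUP BY BabyFirstName
HAVING COUNT(*) = (
    SELECT COUNT(*)
    FROM babynames
    GROUP BY BabyFirstName
    ORDER BY COUNT(*) DESC
    LIMIT 1  -- SQLite spelling; Access would use SELECT TOP 1 instead
)
ORDER BY BabyFirstName
""").fetchall()

print(tied_top)
```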

Group by SQL statement

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But I would like to have the corresponding medication_dose as well. So I type this up:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, it gives me an error saying:
"Column 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.".
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?

Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
JOIN (SELECT y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
FROM BIOLOGICAL y
GROUP BY y.medication_name) x ON x.medication_name = b.medication_name
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
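The join against a derived table of per-medication maximum dates can be exercised on sample data. The sketch below uses Python's sqlite3 module with invented rows and an integer dose column (an assumption; the real column type is not shown). It also filters on patient_id inside the derived table, which goes beyond the query above and is an assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented history rows; patient 99 has a later aspirin entry than patient 12.
CREATE TABLE biological (
    patient_id INTEGER,
    medication_name TEXT,
    medication_dose INTEGER,
    patient_history_date_bio TEXT
);
INSERT INTO biological VALUES
  (12, 'aspirin',   500, '2010-01-01'),
  (12, 'aspirin',   100, '2010-06-01'),
  (12, 'ibuprofen', 200, '2010-03-01'),
  (99, 'aspirin',   300, '2010-09-01');
""")

latest = conn.execute("""
SELECT b.medication_name,
       b.patient_history_date_bio AS med_date,
       b.medication_dose
FROM biological b
JOIN (SELECT y.medication_name,
             MAX(y.patient_history_date_bio) AS max_date
      FROM biological y
      WHERE y.patient_id = ?          -- assumption: restrict the max to this patient
      GROUP BY y.medication_name) x
  ON x.medication_name = b.medication_name
 AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
ORDER BY b.medication_name
""", (12, 12)).fetchall()

print(latest)
```

Without the inner filter, the latest aspirin date in this sample would come from patient 99, and the join would then find no matching aspirin row for patient 12.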

If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However, note that this is normally an indication that you are either building the query incorrectly, or that you need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should be the one suggested by OMG Ponies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?

You need to put max(medication_dose) in your select. GROUP BY returns a result set that contains distinct values for the fields in your GROUP BY clause. Apparently you have multiple records with the same medication_name but different doses, which is why you get two results when you group by the dose as well.
By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.)
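It is worth checking what MAX(medication_dose) returns when the latest row does not carry the highest dose. This sketch uses Python's sqlite3 module with invented rows and an integer dose column (an assumption). It shows that the two MAX values are computed independently, so the output can pair a date with a dose that never occurred together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Invented rows: the aspirin dose was lowered from 500 to 100 over time.
CREATE TABLE biological (
    patient_id INTEGER,
    medication_name TEXT,
    medication_dose INTEGER,
    patient_history_date_bio TEXT
);
INSERT INTO biological VALUES
  (12, 'aspirin',   500, '2010-01-01'),
  (12, 'aspirin',   100, '2010-06-01'),
  (12, 'ibuprofen', 200, '2010-03-01');
""")

# Each MAX is evaluated on its own column within the group.
per_column_max = conn.execute("""
SELECT medication_name,
       MAX(patient_history_date_bio) AS med_date,
       MAX(medication_dose) AS medication_dose
FROM biological
WHERE patient_id = 12
GROUP BY medication_name
ORDER BY medication_name
""").fetchall()

print(per_column_max)
```

The aspirin row comes back with the latest date (2010-06-01) and the highest dose (500), although the dose on that date was 100. So this workaround only gives the latest dose if doses never decrease.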
