Is COUNT(fld) faster than COUNT(*)? [duplicate]


This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
COUNT(id) vs. COUNT(*) in MySQL
Short but simple: in MySQL, would SELECT COUNT(fld) AS count FROM tbl be faster than SELECT COUNT(*) AS count FROM tbl? As I understand it, * is the "all" selector in MySQL.
Does COUNT(*) select all columns of every row to compute the count, and therefore make a query like SELECT COUNT(id) less expensive? Or does it not really matter?

No, count(*) is faster than count(fld) (in the cases where there is a difference at all).
The count(fld) has to consider the data in the field, as it counts all non-null values.
The count(*) only counts the number of records, so it doesn't need access to the data.

SELECT COUNT(*) AS count FROM tbl
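With the MyISAM storage engine and no WHERE clause, the above query doesn't even count the rows; it reads the row count that the engine already stores for the table. Specifying a field instead of * can force MySQL to actually count the non-NULL values, so it's much faster to use * in that case. Engines such as InnoDB do not keep an exact row count, so there the shortcut does not apply.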

* is the “all” selector in MySQL
That's true when you SELECT columns, where the * is a shortcut for the whole column list.
SELECT * becomes SELECT foo, bar.
But COUNT(*) is not expanded to COUNT(foo,bar) which is nonsensical in SQL. COUNT is an aggregate function which normally needs one value per selected row.

Related

What happens when you use DISTINCT * in COUNT() in SQL?

I've just learned about the COUNT() function, and how it is possible to get the number of rows in a table by passing * as the argument.
SELECT COUNT(*) FROM table;
I've also learned that we can get the number of distinct values in a column of a table by using DISTINCT.
SELECT COUNT(DISTINCT column) FROM table;
I've noticed that the following returns nothing.
SELECT COUNT(DISTINCT *) FROM table;
Why is this?
I suppose the root of my issue is that I don't fully understand what the COUNT() function does with * as the argument. My resource says that the COUNT() function takes a column as an argument and counts how many non-NULL rows there are. So say we have a table with a column that contains both NULL and non-NULL values. If COUNT(column) doesn't count the NULL rows, what happens differently in COUNT(*) so that all the rows are counted? And by extension, what happens during COUNT(DISTINCT *)?

This would be a syntax error in most databases. If it were allowed, it would probably be equivalent to:
select count(*)
from (select distinct * from t) t
However, NULL values might throw it off.
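A rough way to see what the subquery form above does is to run it against a small in-memory SQLite database from Python. This is only a sketch: the table t and its columns a and b are made up for illustration, and other databases may treat NULLs differently. In SQLite, SELECT DISTINCT treats two NULLs as equal when detecting duplicate rows, whereas COUNT(DISTINCT b) skips NULLs entirely.

```python
import sqlite3

# Hypothetical table "t" with columns "a" and "b", used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "x"), (1, "x"), (2, None), (2, None), (3, "y")],
)

# Every row, duplicates and NULLs included.
total_rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# The subquery form: count whole rows after removing duplicate rows.
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM t) AS d"
).fetchone()[0]

# Distinct non-NULL values of a single column.
distinct_b = conn.execute("SELECT COUNT(DISTINCT b) FROM t").fetchone()[0]

print(total_rows, distinct_rows, distinct_b)  # prints: 5 3 2
```

The two (2, NULL) rows collapse into one distinct row, but the NULL in b is not counted by COUNT(DISTINCT b).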

Get count and result from SQL query in Go

I'm running a pretty straightforward query using the database/sql and lib/pq (postgres) packages and I want to toss the results of some of the fields into a slice, but I need to know how big to make the slice.
The only solution I can find is to do another query that is just SELECT COUNT(*) FROM tableName;.
Is there a way to both get the result of the query AND the count of returned rows in one query?

Conceptually, the problem is that the database cursor may not have been enumerated to the end, so the database does not really know how many records you will get before you actually read all of them. In the general case, the only way to count is to go through all the records in the result set.
In practice, you can force it to do so with a subquery such as
select *, (select count(*) from table) from table
and just ignore the second column for every record other than the first. But this is very crude and I do not recommend it.
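To make the subquery trick concrete, here is a minimal sketch in Python with an in-memory SQLite database rather than Go and Postgres. The names mytable, mycol and foo are placeholders borrowed from elsewhere in this thread, and the sample rows are invented. Note that if the outer query has a WHERE clause, the subquery has to repeat it, otherwise the count covers the whole table.

```python
import sqlite3

# Placeholder schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycol TEXT, foo TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("a", "bar"), ("b", "bar"), ("c", "baz")],
)

# Each returned row carries the total as an extra column.
cursor = conn.execute(
    "SELECT mycol, (SELECT COUNT(*) FROM mytable WHERE foo = 'bar') AS total "
    "FROM mytable WHERE foo = 'bar' ORDER BY mycol"
)

values = None
for i, (mycol, total) in enumerate(cursor):
    if values is None:
        # Size the container once, using the count from the first row.
        values = [None] * total
    values[i] = mycol

# With zero matching rows the loop never runs and no count is available.
values = values or []
print(values)  # prints: ['a', 'b']
```

In Go the same idea would size the slice from the first row; simply appending to a growing slice avoids the extra column altogether.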

Not sure if this is what you are asking for, but in SQL Server you can read @@ROWCOUNT to get the number of rows returned by the previous SELECT statement.
SELECT mytable.mycol FROM mytable WHERE mytable.foo = 'bar'
SELECT @@ROWCOUNT
If you want the row count included in your result set, you can use the OVER clause (MSDN):
SELECT mytable.mycol, count(*) OVER(PARTITION BY mytable.foo) AS 'Count' FROM mytable WHERE mytable.foo = 'bar'
You could also perhaps just separate two SQL statements with a ; . This would return a result set for each of the statements executed.

You would use count(*)
SELECT count(distinct last)
FROM (XYZTable)
WHERE date(FROM_UNIXTIME(time)) >= '2013-10-28' AND
id = 90 ;

counting rows in select clause with DB2

I would like to query a DB2 table and get all the results of a query plus, in a separate column, the number of rows returned by the select statement.
E.g., if the table contains columns 'id' and 'user_id', assuming 100 rows, the result of the query would appear in this format: (id) | (user_id) | 100.
I do not wish to use a 'group by' clause in the query (just in case you are confused about what I am asking). Also, I could not find an example here: http://mysite.verizon.net/Graeme_Birchall/cookbook/DB2V97CK.PDF.
Also, if there is a more efficient way of getting both these results (values + count), I would welcome any ideas. My environment uses zend framework 1.x, which does not have an ODBC adapter for DB2. (See issue http://framework.zend.com/issues/browse/ZF-905.)

If I understand what you are asking for, then the answer should be
select t.*, g.tally
from mytable t,
(select count(*) as tally
from mytable
) as g;
If this is not what you want, then please give an actual example of desired output, supposing there are 3 to 5 records, so that we can see exactly what you want.

You would use window/analytic functions for this:
select t.*, count(*) over() as NumRows
from table t;
This will work for whatever kind of query you have.
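As a quick sanity check that the derived-table join and the window function give the same result, here is a small sketch using Python's sqlite3 module as a stand-in for DB2. The table name mytable and the three sample rows are assumptions, and the window-function query assumes the underlying SQLite library is version 3.25 or later.

```python
import sqlite3

# Stand-in table with the 'id' and 'user_id' columns from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, user_id INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Approach 1: join every row to a one-row derived table holding the count.
cross_join_rows = conn.execute(
    "SELECT t.id, t.user_id, g.tally "
    "FROM mytable AS t CROSS JOIN (SELECT COUNT(*) AS tally FROM mytable) AS g "
    "ORDER BY t.id"
).fetchall()

# Approach 2: window function over the whole result set (SQLite 3.25+).
window_rows = conn.execute(
    "SELECT t.id, t.user_id, COUNT(*) OVER () AS num_rows "
    "FROM mytable AS t ORDER BY t.id"
).fetchall()

print(cross_join_rows)  # prints: [(1, 10, 3), (2, 20, 3), (3, 30, 3)]
print(window_rows == cross_join_rows)  # prints: True
```

One practical difference: the window version counts whatever rows survive the query's own WHERE clause, while the derived table needs the same filter repeated inside it.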

count(*) vs count(column-name) - which is more correct? [duplicate]

This question already has answers here:
In SQL, what's the difference between count(column) and count(*)?
(12 answers)
Closed 8 years ago.
Does it make a difference if you do count(*) vs count(column-name) as in these two examples?
I have a tendency to always write count(*) because it seems to fit better in my mind with the notion of it being an aggregate function, if that makes sense.
But I'm not sure if it's technically best as I tend to see example code written without the * more often than not.
count(*):
select customerid, count(*), sum(price)
from items_ordered
group by customerid
having count(*) > 1;
vs. count(column-name):
SELECT customerid, count(customerid), sum(price)
FROM items_ordered
GROUP BY customerid
HAVING count(customerid) > 1;

COUNT(*) counts all rows
COUNT(column) counts non-NULLs only
COUNT(1) is the same as COUNT(*) because 1 is a non-null expression
Your use of COUNT(*) or COUNT(column) should be based on the desired output only.
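The difference is easy to check on a tiny table. The sketch below uses Python's built-in sqlite3 module; the items_ordered name comes from the question, but the item column and all of the sample rows are made up for illustration.

```python
import sqlite3

# Illustrative data: one NULL item and one NULL customerid.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items_ordered (customerid INTEGER, item TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items_ordered VALUES (?, ?, ?)",
    [(1, "tent", 10.0), (1, None, 5.0), (2, "tent", 7.5), (None, "lamp", 2.0)],
)

count_star, count_one, count_item, count_distinct_item = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(item), COUNT(DISTINCT item) FROM items_ordered"
).fetchone()

# COUNT(*) and COUNT(1) see all 4 rows, COUNT(item) skips the NULL,
# and COUNT(DISTINCT item) counts 'tent' and 'lamp' once each.
print(count_star, count_one, count_item, count_distinct_item)  # prints: 4 4 3 2
```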

This applies to MySQL. I'm not sure about the others.
The difference is:
COUNT(*) will count the number of records.
COUNT(column_name) will count the number of records where column_name is not null.
Therefore COUNT(*) is what you should use. If you're using MyISAM and there is no WHERE clause, then the optimiser doesn't even have to look at the table, since the number of rows is already cached.

When it's an identifier (and guaranteed to be non-NULL) then it probably doesn't matter.
However, there is a difference between COUNT(*) and COUNT(column) in general, in that COUNT(column) will return a count of the non-NULL values in the column. There is also the COUNT(DISTINCT column) variant which returns the number of unique, non-NULL values.

Generally it's the same, but in the details, AFAIK, "count(*)" is better because "count(columnname)" forces the DB to execute a little more code to look up that column name (though not necessarily).

Yes, there is a possible difference in performance. Depending on your query and the indexing of the table in question, it can be quicker to get the count from an index instead of going to the table for the data. Thus you should probably specify the field name instead of using *.

Aggregate functions in WHERE clause in SQLite

Simply put, I have a table with, among other things, a column for timestamps. I want to get the row with the most recent (i.e. greatest value) timestamp. Currently I'm doing this:
SELECT * FROM table ORDER BY timestamp DESC LIMIT 1
But I'd much rather do something like this:
SELECT * FROM table WHERE timestamp=max(timestamp)
However, SQLite rejects this query:
SQL error: misuse of aggregate function max()
The documentation confirms this behavior (bottom of page):
Aggregate functions may only be used in a SELECT statement.
My question is: is it possible to write a query to get the row with the greatest timestamp without ordering the select and limiting the number of returned rows to 1? This seems like it should be possible, but I guess my SQL-fu isn't up to snuff.

SELECT * from foo where timestamp = (select max(timestamp) from foo)
or, if SQLite insists on treating subselects as sets,
SELECT * from foo where timestamp in (select max(timestamp) from foo)
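For what it's worth, the scalar-subquery form can be tried directly against SQLite from Python. This is just a sketch: the foo table follows the answer above, while the id column and the sample timestamps are invented.

```python
import sqlite3

# Sample data: ids 1, 2, 3 are assigned automatically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany("INSERT INTO foo (timestamp) VALUES (?)", [(100,), (300,), (200,)])

# Aggregate in a subquery, compared against in the outer WHERE clause.
by_subquery = conn.execute(
    "SELECT * FROM foo WHERE timestamp = (SELECT MAX(timestamp) FROM foo)"
).fetchall()

# The ORDER BY ... LIMIT 1 version from the question, for comparison.
by_order = conn.execute(
    "SELECT * FROM foo ORDER BY timestamp DESC LIMIT 1"
).fetchall()

print(by_subquery)  # prints: [(2, 300)]
print(by_order)     # prints: [(2, 300)]
```

The two agree here because the maximum is unique; with ties, the subquery form returns every tied row while LIMIT 1 returns only one of them.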

There are many ways to skin a cat.
If you have an Identity Column that has an auto-increment functionality, a faster query would result if you return the last record by ID, due to the indexing of the column, unless of course you wish to put an index on the timestamp column.
SELECT * FROM TABLE ORDER BY ID DESC LIMIT 1

I think I've answered this question 5 times in the past week now, but I'm too tired to find a link to one of those right now, so here it is again...
SELECT
    *
FROM
    table T1
    LEFT OUTER JOIN table T2 ON
        T2.timestamp > T1.timestamp
WHERE
    T2.timestamp IS NULL
You're basically looking for the row where no other row matches that is later than it.
NOTE: As pointed out in the comments, this method will not perform as well in this kind of situation. It will usually work better (for SQL Server at least) in situations where you want the last row for each customer (as an example).
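Here is a small sketch of the self-join approach using Python's sqlite3 module. The table is called foo here (borrowed from another answer, since table itself is a reserved word), and the id column and sample rows are assumptions. The data deliberately contains a tie to show that every row sharing the greatest timestamp comes back.

```python
import sqlite3

# Two rows share the greatest timestamp (ids 2 and 3).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany("INSERT INTO foo (timestamp) VALUES (?)", [(100,), (300,), (300,)])

# Keep only rows for which no later row exists (anti-join).
latest = conn.execute(
    "SELECT t1.id, t1.timestamp "
    "FROM foo AS t1 "
    "LEFT OUTER JOIN foo AS t2 ON t2.timestamp > t1.timestamp "
    "WHERE t2.timestamp IS NULL "
    "ORDER BY t1.id"
).fetchall()

print(latest)  # prints: [(2, 300), (3, 300)]
```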

You can simply do
SELECT *, max(timestamp) FROM table
Edit:
An aggregate function can't be used like this, so it gives an error. I guess what SquareCog suggested is the best thing to do:
SELECT * FROM table WHERE timestamp = (select max(timestamp) from table)
