sql count performance [duplicate] - sql

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Count(*) vs Count(1)
If I have a table where 'id' is the primary key, do these two commands perform differently?
select count(*) from t;
select count(id) from t;
thanks

These would have the same performance. In most databases, count() results in a scan of the table or available indexes. Whether or not it uses the index instead of the table depends only on the query optimizer. If the optimizer is smart enough to use the index, it should be smart enough in both cases.
Using available metadata tables, you can often get the number of rows in a table much more efficiently than by using a count() query.
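As a quick sanity check of the claim that both forms return the same answer on a primary key, here is a minimal sketch using Python's built-in sqlite3 module. SQLite, the `payload` column, and the row count are assumptions chosen only so the example runs anywhere; timing behaviour on other engines may differ and is not measured here.

```python
import sqlite3

# In-memory SQLite database; the table mirrors the question's "t" with "id" as primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")

# 1000 rows; the hypothetical payload column is NULL on odd ids to show it does not matter here.
conn.executemany(
    "INSERT INTO t (id, payload) VALUES (?, ?)",
    [(i, None if i % 2 else "x") for i in range(1, 1001)],
)

# A primary key is never NULL in this table, so both counts see every row.
count_star = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
count_id = conn.execute("SELECT COUNT(id) FROM t").fetchone()[0]
print(count_star, count_id)
```

Whether the two statements also share an execution plan is up to each engine's optimizer, as the answer above says.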

No, Oracle takes over and takes the fastest way in case of count(*)

I think if id is the primary key, count(*) and count(id) are semantically equivalent.
But to readers, count(id) signals the intention to count all rows where id is not null. To avoid confusion, I would rather use count(*).

Related

Improve SQL query by indexing or aggregated index

Assume this general SQL query takes too long to execute. Which of the 4 options below can be done to improve its performance? Why? This is just for the understanding of SQL, not DBMS-specific.
1. CREATE INDEX IDX ON Vaccinations (COUNT(*), Site)
2. Wrap the query in a view and ask users to query from the view.
3. CREATE AGGREGATE INDEX ON Vaccinations(COUNT(*))
4. CREATE INDEX IDX ON Vaccinations (Site)
The query:
SELECT Site, COUNT(*) AS "Number of Vaccinations"
FROM Vaccinations
GROUP BY Site;
A few rows from the Vaccinations table are shown here:
Personally, I'm leaning towards 1. Since the count(*) is the only thing being used in the SELECT...FROM as a condition. However, I'm not entirely sure what the "aggregate index" option entails. Can someone explain when it's preferred over normal index? Thanks!
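For what it's worth, the plain index on Site (the last of the four options) can be tried out directly. The sketch below uses Python's sqlite3 module; SQLite, the `PersonId` column, and the sample rows are assumptions, since the original sample rows are not shown. It creates the index and prints the plan SQLite reports for the GROUP BY query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the Site column is known from the question.
conn.execute("CREATE TABLE Vaccinations (PersonId INTEGER, Site TEXT)")
sites = ["North", "South", "North", "East", "North", "South"]
conn.executemany(
    "INSERT INTO Vaccinations VALUES (?, ?)",
    list(enumerate(sites)),
)

# The ordinary single-column index from the question.
conn.execute("CREATE INDEX IDX ON Vaccinations (Site)")

query = (
    "SELECT Site, COUNT(*) AS NumberOfVaccinations "
    "FROM Vaccinations GROUP BY Site"
)

# The fourth column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
counts = dict(conn.execute(query))

print(plan)
print(counts)
```

In SQLite the reported plan names the index, which suggests the grouping can be answered from the index entries alone. Other engines may plan this differently, so checking the plan on the actual DBMS is the safer test.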

Most efficient way of getting mariaDB/SQL record count

A question on a simple SQL statement, but one which I sometimes wonder about. Thought I'd see if anyone knew the answer.
When counting the records in a table using a simple SQL statement, which has the least overhead:
1) SELECT COUNT(single_primary_field) FROM table, i.e. SELECT COUNT(user_ID) FROM users;
2) SELECT COUNT(*) FROM table
I initially thought the first may be quickest. But perhaps not having a specific field to associate with makes the second quicker?
Probably makes very little difference speed wise either way.
Thanks
COUNT(column) counts only the rows where the selected column is not null.
COUNT(*) counts rows and does not care about the values in the columns.
Using COUNT(*) is the better way to count rows.
COUNT(*) is the most efficient way to count, according to the MySQL documentation for COUNT.
Have a read through https://mariadb.com/kb/en/library/explain/ to look at what type of indexing your query is using, it usually hints at its performance.
I think COUNT(*) is going to be the fastest because MariaDB stores a running row count, at least for some storage engines such as MyISAM.

Does the order of the columns in a SELECT statement make a difference?

This question was inspired by a previous question posted on SO, "Does the order of the WHERE clause make a difference?". Would it improve a SELECT statement's performance if the columns used in the WHERE section are placed at the beginning of the SELECT statement?
example:
SELECT customer.id,
transaction.id,
transaction.efective_date,
transaction.a,
[...]
FROM customer, transaction
WHERE customer.id = transaction.id;
I do know that limiting the list of columns to only the needed ones in a SELECT statement improves performance as opposed to using SELECT * because the current list is smaller.
For Oracle and Informix and any other self-respecting DBMS, the order of the columns should have no impact on performance. Similarly, it should be the case that the query engine finds the optimal order to process the Where clause so the order should not matter all things being equal (i.e., looking past constructs which might force an execution order).
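One way to check this on a given engine is to compare the plans for two orderings of the same column list. The sketch below does that with Python's sqlite3 module. SQLite is an assumption, the table is named `txn` because TRANSACTION is a keyword there, and the column definitions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables loosely following the question's example.
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn (id INTEGER, effective_date TEXT, a TEXT);
    """
)


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN for the given statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Same columns, same join condition; only the order of the select list differs.
where_columns_first = plan(
    "SELECT customer.id, txn.id, txn.effective_date, txn.a "
    "FROM customer, txn WHERE customer.id = txn.id"
)
where_columns_last = plan(
    "SELECT txn.a, txn.effective_date, txn.id, customer.id "
    "FROM customer, txn WHERE customer.id = txn.id"
)

print(where_columns_first)
print(where_columns_last)
```

Both statements produce the same plan in SQLite, which fits the answer above: the select list decides what is returned, not how the rows are found.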

In SQL is there a difference between count(*) and count(<fieldname>)

Pretty self explanatory question. Is there any reason to use one or the other?
Count(*) counts all records, including nulls, whereas Count(fieldname) does not include nulls.
Select count(*) selects any row, select count(field) selects rows where this field is not null.
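The NULL behaviour is easy to see on a tiny table. This is a minimal sketch using Python's sqlite3 module; SQLite and the sample values are assumptions, while `MyTable` and `fieldname` are borrowed from the wording in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, fieldname TEXT)")

# Five rows, two of which have NULL in fieldname.
conn.executemany(
    "INSERT INTO MyTable (fieldname) VALUES (?)",
    [("a",), (None,), ("b",), (None,), ("c",)],
)

# COUNT(*) counts rows; COUNT(fieldname) counts rows where fieldname IS NOT NULL.
all_rows, non_null = conn.execute(
    "SELECT COUNT(*), COUNT(fieldname) FROM MyTable"
).fetchone()

print(all_rows, non_null)
```

So the choice between the two is first of all about which number you want, and only then about speed.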
If you want to improve performance (i.e. be a complete performance Nazi), you might want to do neither.
Example:
SELECT COUNT(1) FROM MyTable WHERE ...
This puzzled me for a while too.
In MySQL, COUNT(*) counts every row, whether or not the row contains NULL values. COUNTing a single column counts the number of rows where that column is not null.
In terms of performance using a single column would be slightly faster,
COUNT(*) is faster if the table type is MyISAM and there is no WHERE clause. With a WHERE clause, the speed will be the same for MyISAM and InnoDB.

In SQL, what’s the difference between count(*) and count('x')? [duplicate]

This question already has answers here:
In SQL, what's the difference between count(column) and count(*)?
(12 answers)
Closed 9 years ago.
I have the following code:
SELECT <column>, count(*)
FROM <table>
GROUP BY <column> HAVING COUNT(*) > 1;
Is there any difference to the results or performance if I replace the COUNT(*) with COUNT('x')?
(This question is related to a previous one)
To say that SELECT COUNT(*) vs COUNT(1) results in your DBMS returning "columns" is pure bunk. That may have been the case long, long ago but any self-respecting query optimizer will choose some fast method to count the rows in the table - there is NO performance difference between SELECT COUNT(*), COUNT(1), COUNT('this is a silly conversation')
Moreover, COUNT(1) vs COUNT(*) will NOT have any difference in INDEX usage -- most DBMS will actually optimize COUNT(n) into COUNT(*) anyway. See the ASK TOM: Oracle has been optimizing COUNT(n) into COUNT(*) for the better part of a decade, if not longer:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1156151916789
problem is in count(col) to count(*) conversion
*** 03/23/00 05:46 pm *** one workaround is to set event 10122 to turn off count(col) -> count(*) optimization. Another work around is to change the count(col) to count(*), it means the same, when the col has a NOT NULL constraint. The bug number is 1215372.
One thing to note: if you are using COUNT(col) (don't!) and col is nullable, then it will actually have to count the number of non-NULL occurrences in the table (either via index scan, histogram, etc. if they exist, or a full table scan otherwise).
Bottom line: if what you want is the count of rows in a table, use COUNT(*).
The major performance difference is that COUNT(*) can be satisfied by examining the primary key on the table.
That is, in the simple case below, the query will return immediately, without needing to examine any rows.
select count(*) from table
I'm not sure if the query optimizer in SQL Server will do so, but in the example above, if the column you are grouping on has an index the server should be able to satisfy the query without hitting the actual table at all.
To clarify: this answer refers specifically to SQL Server. I don't know how other DBMS products handle this.
This question is slightly different from the other referenced question. There, it was asked what the difference was when using count(*) and count(SomeColumnName), and SQLMenace's answer was spot on.
To address this question, essentially there is no difference in the result. Both count(*) and count('x') and say count(1) will return the same number. The difference is that when using " * " just like in a SELECT all columns are returned, then counted. When a constant is used (e.g. 'x' or 1) then a row with one column is returned and then counted. The performance difference would be seen when " * " returns many columns.
Update: The above statement about performance is probably not quite right as discussed in other answers, but does apply to subselect queries when using EXISTS and NOT EXISTS
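To confirm that the results really are the same, the query from the question can be run with each variant. This is a minimal sketch with Python's sqlite3 module; SQLite is an assumption, and the `items` table and `category` column are hypothetical stand-ins for the `<table>` and `<column>` placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (category TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("a",), ("a",), ("b",), ("c",), ("c",), ("c",)],
)

# The question's query, with the count expression swapped in at both places.
template = (
    "SELECT category, {expr} FROM items "
    "GROUP BY category HAVING {expr} > 1 ORDER BY category"
)

results = {
    expr: conn.execute(template.format(expr=expr)).fetchall()
    for expr in ("COUNT(*)", "COUNT(1)", "COUNT('x')")
}

for expr, rows in results.items():
    print(expr, rows)
```

A non-NULL constant is never NULL, so counting it is the same as counting rows. This only checks the results; any performance difference would have to be measured on the DBMS in question.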
MySQL: According to the MySQL website, COUNT(*) is faster for single table queries when using MyISAM:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_count
I'm guessing that a HAVING clause with a count in it may change things.