What is the best way to get count of rows and distinct rows in a single query?
To get distinct count we can use subquery like this:
select count(*) from
(
select distinct * from table
)
I have 15+ columns and have many duplicates rows as well and I want to calculate count of rows as well as distinct count of rows in one query.
More if I use this
select count(*) as Rowcount , count(distinct *) as DistinctCount from table
This will not give accurate results as count(distinct *) doesn't work.
Why don't you just put the subquery inside another query?
select count(*),
(select count(*) from (select distinct * from table))
from table;
create table tbl
(
col int
);
insert into tbl values(1),(2),(1),(3);
select count(*) as distinct_count, sum(sum) as all_count
from (
select count(col) sum from tbl group by col
)A
I think I have understood what you are looking for. You need to use some window function. So, you query should be look like =>
Select COUNT(*) OVER() YourRowcount ,
COUNT(*) OVER(Partition BY YourColumnofGroup) YourDistinctCount --Basic of the distinct count
FROM Yourtable
NEW Update
select top 1
COUNT(*) OVER() YourRowcount,
DENSE_RANK() OVER(ORDER BY YourColumn) YourDistinctCount
FROM Yourtable ORDER BY TT DESC
Note: This code is written sql server. Please check the code and let me know.
Trying to get an overall distinct count of the employees for a range of records which has a group by on it.
I've tried using the "over()" clause but couldn't get that to work. Best to explain using an example so please see my script below and wanted result below.
EDIT:
I should mention I'm hoping for a solution that does not use a sub-query based on my "sales_detail" table below because in my real example, the "sales_detail" table is a very complex sub-query.
Here's the result I want. Column "wanted_result" should be 9:
Sample script:
CREATE TEMPORARY TABLE [sales_detail] (
[employee] varchar(100),[customer] varchar(100),[startdate] varchar(100),[enddate] varchar(100),[saleday] int,[timeframe] varchar(100),[saleqty] numeric(18,4)
);
INSERT INTO [sales_detail]
([employee],[customer],[startdate],[enddate],[saleday],[timeframe],[saleqty])
VALUES
('Wendy','Chris','8/1/2019','8/12/2019','5','Afternoon','1'),
('Wendy','Chris','8/1/2019','8/12/2019','5','Morning','5'),
('Wendy','Chris','8/1/2019','8/12/2019','6','Morning','6'),
('Dexter','Chris','8/1/2019','8/12/2019','2','Mid','2.5'),
('Jennifer','Chris','8/1/2019','8/12/2019','4','Morning','2.75'),
('Lila','Chris','8/1/2019','8/12/2019','2','Morning','3.75'),
('Rita','Chris','8/1/2019','8/12/2019','2','Mid','1'),
('Tony','Chris','8/1/2019','8/12/2019','4','Mid','2'),
('Tony','Chris','8/1/2019','8/12/2019','1','Morning','6'),
('Mike','Chris','8/1/2019','8/12/2019','4','Mid','1.5'),
('Logan','Chris','8/1/2019','8/12/2019','3','Morning','6.25'),
('Blake','Chris','8/1/2019','8/12/2019','4','Afternoon','0.5')
;
SELECT
[timeframe],
SUM([saleqty]) AS [total_qty],
COUNT(DISTINCT [s].[employee]) AS [employee_count1],
SUM(COUNT(DISTINCT [s].[employee])) OVER() AS [employee_count2],
9 AS [wanted_result]
FROM (
SELECT
[employee],[customer],[startdate],[enddate],[saleday],[timeframe],[saleqty]
FROM
[sales_detail]
) AS [s]
GROUP BY
[timeframe]
;
If I understand correctly, you are simply looking for a COUNT(DISTINCT) for all employees in the table? I believe this query will return the results you are looking for:
SELECT
[timeframe],
SUM([saleqty]) AS [total_qty],
COUNT(DISTINCT [s].[employee]) AS [employee_count1],
(SELECT COUNT(DISTINCT [employee]) FROM [sales_detail]) AS [employee_count2],
9 AS [wanted_result]
FROM #sales_detail [s]
GROUP BY
[timeframe]
You can try this below option-
SELECT
[timeframe],
SUM([saleqty]) AS [total_qty],
COUNT(DISTINCT [s].[employee]) AS [employee_count1],
SUM(COUNT(DISTINCT [s].[employee])) OVER() AS [employee_count2],
[wanted_result]
-- select count form sub query
FROM (
SELECT
[employee],[customer],[startdate],[enddate],[saleday],[timeframe],[saleqty],
(select COUNT(DISTINCT [employee]) from [sales_detail]) AS [wanted_result]
--caculate the count with first sub query
FROM [sales_detail]
) AS [s]
GROUP BY
[timeframe],[wanted_result]
Use a trick where you only count each person on the first day they are seen:
select timeframe, sum(saleqty) as total_qty),
count(distinct employee) as employee_count1,
sum( (seqnum = 1)::int ) as employee_count2
9 as wanted_result
from (select sd.*,
row_number() over (partition by employee order by startdate) as seqnum
from sales_detail sd
) sd
group by timeframe;
Note: From the perspective of performance, your complex subquery is only evaluated once.
I don't understand why I can't use this in my code :
SELECT MAX(SMTHNG), COUNT(MAX(SMTHNG))
FROM SomeTable;
Searched for an answer but didn't find it in documentation about these aggregate functions.
Also I get an SQL-compiler error "Invalid column name "SMTHNG"".
You want to know what the maximum SMTHNG in the table is with:
SELECT MAX(SMTHNG) FROM SomeTable;
This is an aggregation without GROUP BY and hence results in one single row containing the maximum SMTHNG.
Now you also want to know how often this SMTHNG occurs and you add COUNT(MAX(SMTHNG)). This, however, does not work, because you can not aggregate an aggregate directly.
This doesn't work either:
SELECT ANY_VALUE(max_smthng), COUNT(*)
FROM (SELECT MAX(smthng) AS max_smthng FROM sometable) t;
because the sub query only contains one row, so it's too late to count.
So, either use a sub query and select from the table again:
SELECT ANY_VALUE(smthng), COUNT(*)
FROM sometable
WHERE smthng = (SELECT MAX(smthng) FROM sometable);
Or count per SMTHNG before looking for the maximum. Here is how to get the counts:
SELECT smthng, COUNT(*)
FROM sometable
GROUP BY smthng;
And the easiest way to get the maximum from this result is:
SELECT TOP(1) smthng, COUNT(*)
FROM sometable
GROUP BY smthng
ORDER BY COUNT(*) DESC;
First of all, please read my comment.
Depending on what you're trying to achieve, the statement have to be changed.
If you want to count the highest values in SMTHNG field, you may try this:
SELECT T1.SMTHNG, COUNT(T1.SMTHNG)
FROM SomeTable T1 INNER JOIN
(
SELECT MAX(SMTHNG) AS A
FROM SomeTable
) T2 ON T1.SMTHNG = T2.A
GROUP BY T1.SMTHNG;
use cte like below or subquery
with cte as
(
select count(*) as cnt ,col from table_name
group by col
) select max(cnt) from cte
you can not use double aggregate function at a time on same column
I want to get first value in a field in Oracle when another corresponding field has max value.
Normally, we would do this using a query and a subquery. The subquery ordering by a field and the outer query with where rownum<=1.
But, I cannot do this because the table aliases persist only one level deep and this query is a part of another big query and I need to use some aliases from the outermost query.
Here's the query structure
select
(
select a --This should get first value of a after b's are sorted desc
from
(
select a,b from table1 where table1.ID=t2.ID order by b desc
)
where rownum<=1
)
) as "A",
ID
from
table2 t2
Now this is not gonna work because alias t2 wont be available at innermost query.
Real world analogy that comes to my mind is I have a table containing records for all employees of a company, their salaries(including past salaries) and the date from which the salary was effective. So, for each employee, there will multiple records. Now, I want to get latest salaries for all the employees.
With SQL server, I could have used SELECT TOP. But that's not available with Oracle and since where clauses execute before order by, I cannot use where rownum<=1 and order by in same query and expect correct results.
How do I do this?
Using your analogy of employees and their salaries, if I understand what you are trying to do, you could do something like this (haven't tested):
SELECT *
FROM (
SELECT employee_id,
salary,
effective_date,
ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY effective_date DESC) rowno
FROM employees
)
WHERE rowno=1
I would much rather see you connect the subquery up with a JOIN instead of embedding it in the SELECT. Cleaner SQL. Then you can use the windowing function that roartechs suggests.
Select t2.whatever, t1.a
From table2 t2
Inner Join (
Select tfirst.ID, tfirst.a
From (
Select ID, a,
ROW_NUMBER() Over (Partition BY ID ORDER BY b DESC) rownumber
FROM table1
) tfirst
WHERE tfirst.rownumber=1
) t1 on t2.ID=t1.ID
If i perform a standard query in SQLite:
SELECT * FROM my_table
I get all records in my table as expected. If i perform following query:
SELECT *, 1 FROM my_table
I get all records as expected with rightmost column holding '1' in all records. But if i perform the query:
SELECT *, COUNT(*) FROM my_table
I get only ONE row (with rightmost column is a correct count).
Why is such results? I'm not very good in SQL, maybe such behavior is expected? It seems very strange and unlogical to me :(.
SELECT *, COUNT(*) FROM my_table is not what you want, and it's not really valid SQL, you have to group by all the columns that's not an aggregate.
You'd want something like
SELECT somecolumn,someothercolumn, COUNT(*)
FROM my_table
GROUP BY somecolumn,someothercolumn
If you want to count the number of records in your table, simply run:
SELECT COUNT(*) FROM your_table;
count(*) is an aggregate function. Aggregate functions need to be grouped for a meaningful results. You can read: count columns group by
If what you want is the total number of records in the table appended to each row you can do something like
SELECT *
FROM my_table
CROSS JOIN (SELECT COUNT(*) AS COUNT_OF_RECS_IN_MY_TABLE
FROM MY_TABLE)