I have two tables, class_students and school_students. I need the total school population and each class's proportion of it, but my query runs the same subquery twice. This is my query:
SELECT t.class_name,
(SELECT COUNT (1) FROM school_students) as total_school_population,
COUNT (1) / (SELECT COUNT (1) FROM school_students)
FROM class_students t;
So, how do I optimize it?
If the two tables have no relationship, you can CROSS JOIN to a subquery that computes the total once, then use its column:
SELECT t.class_name,
t1.cnt total_school_population,
COUNT(1)/ t1.cnt
FROM class_students t CROSS JOIN
(
SELECT COUNT(1) cnt
from school_students
) t1
group by t.class_name,t1.cnt
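The CROSS JOIN approach above can be tried end to end with a small sketch. This one assumes SQLite through Python's sqlite3 module rather than the asker's database, and the sample rows and the student column are made up for illustration. The 1.0 factor is there because SQLite divides integers as integers.

```python
import sqlite3

# Hypothetical sample data; table names follow the question,
# the "student" column is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class_students (class_name TEXT, student TEXT);
    CREATE TABLE school_students (student TEXT);
    INSERT INTO class_students VALUES ('A', 's1'), ('A', 's2'), ('B', 's3');
    INSERT INTO school_students VALUES ('s1'), ('s2'), ('s3'), ('s4');
""")

# The derived table t1 holds the school total in a single row,
# so it is joined to every class row instead of being recomputed.
rows = conn.execute("""
    SELECT t.class_name,
           t1.cnt AS total_school_population,
           COUNT(1) * 1.0 / t1.cnt AS class_proportion
    FROM class_students t
    CROSS JOIN (SELECT COUNT(1) AS cnt FROM school_students) t1
    GROUP BY t.class_name, t1.cnt
    ORDER BY t.class_name
""").fetchall()

print(rows)  # [('A', 4, 0.5), ('B', 4, 0.25)]
```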
To avoid recalculating the total school count for each record, save the value in a SQL*Plus substitution variable using COLUMN ... NEW_VALUE:
COLUMN school_count NEW_VALUE school_count
SELECT count(*) school_count
FROM school_students;
Then use this variable in the division expression:
SELECT class_name,
'&school_count' as total_school_population,
COUNT (1) / &school_count as class_proportion
FROM class_students
GROUP BY class_name;
If the tables contain a very large number of records, gather statistics and/or use optimizer hints such as:
/*+ ALL_ROWS*/
I'm trying to select all columns in table top_teams_team as well as get a count of values for the hash_value column. The SQL statement here is partially working in that it returns two columns, hash_value and total. I still want it to give me all the columns of the table as well.
select hash_value, count(hash_value) as total
from top_teams_team
group by hash_value
The SQL statement below gives me all the columns, but duplicate hash_value entries are displayed, which isn't what I want. I tried adding the DISTINCT keyword, but it wasn't working correctly, or maybe I'm not putting it in the right place.
select *
from top_teams_team
inner join (
select hash_value, count(hash_value) as total
from top_teams_team
group by hash_value
) q
on q.hash_value = top_teams_team.hash_value
A combination of a window function with DISTINCT ON might do what you are looking for:
SELECT DISTINCT ON (hash_value)
*, COUNT(*) OVER (PARTITION BY hash_value) AS total_rows
FROM top_teams_team
-- ORDER BY hash_value, ???
;
DISTINCT ON is applied after the window function, so Postgres first counts rows per distinct hash_value before picking the first row per group (incl. that count).
The query picks an arbitrary row from each group. If you want a specific one, add ORDER BY expressions accordingly.
This is not "a count of values for the hash_value column" but a count of rows per distinct hash_value. I guess that's what you meant.
Detailed explanation:
Best way to get result count before LIMIT was applied
Select first row in each GROUP BY group?
Depending on undisclosed information there may be (much) faster query styles ...
Optimize GROUP BY query to retrieve latest row per user
I am assuming that by "but there are duplicates hash_value being displayed" you mean you are getting duplicate columns. If so, list the columns you need explicitly:
select q.hash_value, q.total, ttt.field1, ttt.field2, ttt.field3
from top_teams_team ttt
join (
select hash_value, count(hash_value) as total
from top_teams_team
group by hash_value
) q
on q.hash_value = ttt.hash_value
Try using COUNT as an analytic function:
SELECT *, COUNT(*) OVER (PARTITION BY hash_value) total
FROM top_teams_team;
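The window-function answers above can be sketched as follows. This assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module, with invented id and name columns. DISTINCT ON is PostgreSQL-specific, so the second query swaps in ROW_NUMBER() as a stand-in to keep one row per hash_value; it is not the same syntax as the answer above, only the same idea.

```python
import sqlite3

# Hypothetical sample data; only hash_value comes from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE top_teams_team (id INTEGER, name TEXT, hash_value TEXT);
    INSERT INTO top_teams_team VALUES
        (1, 'red', 'h1'), (2, 'blue', 'h1'), (3, 'green', 'h2');
""")

# Every row is kept; the count per hash_value is attached to each row.
rows = conn.execute("""
    SELECT *, COUNT(*) OVER (PARTITION BY hash_value) AS total
    FROM top_teams_team
    ORDER BY id
""").fetchall()

# One row per hash_value (lowest id wins), still carrying the count.
# ROW_NUMBER() replaces PostgreSQL's DISTINCT ON here.
first_per_hash = conn.execute("""
    SELECT id, name, hash_value, total
    FROM (
        SELECT *,
               COUNT(*) OVER (PARTITION BY hash_value) AS total,
               ROW_NUMBER() OVER (PARTITION BY hash_value ORDER BY id) AS rn
        FROM top_teams_team
    )
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(rows)
print(first_per_hash)
```

The ORDER BY inside ROW_NUMBER() plays the role of the ORDER BY that decides which row DISTINCT ON keeps.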
I am trying to run a query to get the number of values that are repeated (appear more than once) in one column called "abc". I am trying this, but I am not able to achieve it:
select COUNT(SELECT DISTINCT card_no, COUNT(*) AS cnt )
please help, thanks in advance.
For example, below is the column:
cards
123
456
123
Result:
Count
1
because 123 appears more than once.
You want the number of distinct values in the column that are repeated at least once, is that right?
SELECT COUNT(dupes)
FROM (SELECT card_no AS dupes, COUNT(*) cnt FROM table_name
GROUP BY card_no HAVING COUNT(*) > 1) A
Edit for explanation.
The inner query SELECT card_no AS dupes, COUNT(*) cnt FROM table_name GROUP BY card_no HAVING COUNT(*) > 1 returns only those values that are repeated in your table. The aliases on the columns are necessary because it's a subquery. You can run this query independently of the outer query to see what results it returns.
You have to have the group by on any field that you don't want to aggregate when you're aggregating other fields (e.g. performing a count of records), and the HAVING part is to filter out anything that isn't duplicated (i.e. has a count of 1). HAVING is the way to apply filtering on aggregated fields that you can't have in a WHERE.
The outer query SELECT COUNT(dupes)... is merely counting the number of card_no values returned by the inner query. Since these are grouped, it gives the number of distinct values that are duplicated.
A at the end there sets up an alias for the subquery so that it can be referenced like it's an actual table elsewhere in the query. This is necessary for any subquery in the FROM clause of another query. Effectively the select in the outer query reads SELECT COUNT(A.dupes)... and without the alias A there would be no way to qualify where the dupes field is being referenced from (even though in this case it's implied).
It's also worth noting that the field COUNT(*) cnt isn't required in the SELECT part of the subquery as it isn't being used anywhere else in the query. It will work just as effectively without it, as long as you still have the GROUP BY and HAVING clauses.
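To see the inner and outer queries from this explanation side by side, here is a minimal sketch using the question's three sample values. It assumes SQLite through Python's sqlite3 module; table_name is the placeholder name from the answer, and the subquery drops the unused cnt column as the note above suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (card_no INTEGER);
    INSERT INTO table_name VALUES (123), (456), (123);
""")

# Inner query alone: one row per card_no that occurs more than once.
inner = conn.execute("""
    SELECT card_no AS dupes, COUNT(*) AS cnt
    FROM table_name
    GROUP BY card_no
    HAVING COUNT(*) > 1
""").fetchall()

# Outer query: counts how many such card_no values exist.
outer = conn.execute("""
    SELECT COUNT(dupes)
    FROM (SELECT card_no AS dupes
          FROM table_name
          GROUP BY card_no
          HAVING COUNT(*) > 1) A
""").fetchone()[0]

print(inner)  # [(123, 2)]
print(outer)  # 1
```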
SELECT
card_no, COUNT(*) AS "Occurrences"
FROM
YourTable
GROUP BY card_no
HAVING
COUNT(*) > 1
I'm trying to do the following:
combine tables over a timerange using FROM TABLE_DATE_RANGE
FLATTEN that set of data
GROUP BY ColumnX
SELECT ColumnX, SUM(ColumnY), SUM(ColumnZ) over only unique ColumnX values.
here's the gist of my query:
SELECT
r.ColumnX
,SUM(r.ColumnY)
,SUM(r.ColumnZ)
FROM
(
SELECT *
FROM FLATTEN(
(
SELECT
ColumnX
,ColumnY
,ColumnZ
FROM TABLE_DATE_RANGE(projectx.events_,
TIMESTAMP('2015-09-01'), TIMESTAMP('2015-09-08'))), my_funky_object
)
WHERE ColumnY > 10
) r
GROUP BY
r.ColumnX
The problem is, I get a number of rows WAY GREATER than the count of unique values of ColumnX should be. So I took a step back and simply output the GROUP BY count of ColumnX in order to debug, and I get what looks like an intermediate result.
What is happening and how do I ensure that my outer select only aggregates over unique values of ColumnX?
You're getting the count of each distinct value of ColumnX, but you're only showing the count, not the value.
If your goal is to get an accurate count for the number of distinct values, try something like this:
SELECT
COUNT(*) ct
FROM (
SELECT
1
FROM
... rest of your query ...
GROUP BY r.ColumnX
)
That inner query will give you exactly one row (each with the value 1) for each distinct value of ColumnX. The outer select statement will count the number of such rows.
Another alternative is to use EXACT_COUNT_DISTINCT to get the exact count of distinct values. That's simpler but less scalable than using GROUP BY.
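The two counting styles can be compared on a toy table. This is only a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module, not BigQuery, the events table and its rows are invented, and COUNT(DISTINCT ...) stands in for EXACT_COUNT_DISTINCT.

```python
import sqlite3

# Hypothetical flattened data; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (ColumnX TEXT, ColumnY INTEGER);
    INSERT INTO events VALUES ('a', 11), ('a', 12), ('b', 20), ('c', 5);
""")

# Style 1: one row per group in the inner query, counted by the outer query.
grouped = conn.execute("""
    SELECT COUNT(*) AS ct
    FROM (
        SELECT 1
        FROM events
        WHERE ColumnY > 10
        GROUP BY ColumnX
    )
""").fetchone()[0]

# Style 2: count distinct values directly.
distinct = conn.execute("""
    SELECT COUNT(DISTINCT ColumnX)
    FROM events
    WHERE ColumnY > 10
""").fetchone()[0]

print(grouped, distinct)  # 2 2
```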
I have an Oracle database table with a lot of columns. I'd like to count the number of fully unique rows. The only thing I could find is:
SELECT COUNT(DISTINCT col_name) FROM table;
This however would require me listing all the columns and I haven't been able to come up with syntax that will do that for me. I'm guessing the reason for that is that this query would be very low performance? Is there a recommended way of doing this?
How about
SELECT COUNT(*)
FROM (SELECT DISTINCT * FROM Table)
It depends on what you are trying to accomplish.
To get a count of rows per distinct value of a specific column, so that you know which values exist and how many rows each one has:
SELECT
A_CODE, COUNT(*)
FROM MY_ARCHV
GROUP BY A_CODE
--This informs me there are 93 unique codes, and how many of each of those codes there are.
Another method
--How to count how many of a type value exists in an oracle table:
select A_CDE, --the value you need to count
count(*) as numInstances --how many of each value
from A_ARCH -- the table where it resides
group by A_CDE -- one output row per distinct value
Either way, you get something that looks like this:
A_CODE Count(*)
1603 32
1600 2
1605 14
I think you want a count of all distinct rows from a table, like this:
select count(1) as c
from (
select distinct *
from tbl
) distinct_tbl;
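A quick way to check the SELECT DISTINCT * subquery is to run it on a table with known duplicates. The sketch below assumes SQLite through Python's sqlite3 module rather than Oracle, and the two-column tbl with its rows is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (a INTEGER, b TEXT);
    INSERT INTO tbl VALUES (1, 'x'), (1, 'x'), (1, 'y'), (2, 'x');
""")

# All rows, duplicates included.
total_rows = conn.execute("SELECT COUNT(*) FROM tbl").fetchone()[0]

# Rows that differ in at least one column; no column list is needed.
distinct_rows = conn.execute("""
    SELECT COUNT(1) AS c
    FROM (
        SELECT DISTINCT *
        FROM tbl
    ) distinct_tbl
""").fetchone()[0]

print(total_rows, distinct_rows)  # 4 3
```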
SELECT DISTINCT col_name, count(*) FROM table_name group by col_name
If I perform a standard query in SQLite:
SELECT * FROM my_table
I get all records in my table as expected. If I perform the following query:
SELECT *, 1 FROM my_table
I get all records as expected, with the rightmost column holding '1' in every record. But if I perform the query:
SELECT *, COUNT(*) FROM my_table
I get only ONE row (whose rightmost column holds the correct count).
Why do I get such results? I'm not very good at SQL, so maybe this behavior is expected? It seems very strange and illogical to me :(.
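SELECT *, COUNT(*) FROM my_table is not what you want, and it is not valid standard SQL: every selected column that is not an aggregate has to appear in GROUP BY. SQLite tolerates the query and returns a single row whose non-aggregate columns come from an arbitrary row.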
You'd want something like
SELECT somecolumn,someothercolumn, COUNT(*)
FROM my_table
GROUP BY somecolumn,someothercolumn
If you want to count the number of records in your table, simply run:
SELECT COUNT(*) FROM your_table;
count(*) is an aggregate function. Aggregate functions need a GROUP BY to give meaningful per-row results. You can read: count columns group by
If what you want is the total number of records in the table appended to each row, you can do something like:
SELECT *
FROM my_table
CROSS JOIN (SELECT COUNT(*) AS COUNT_OF_RECS_IN_MY_TABLE
FROM MY_TABLE)
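The difference between the two queries discussed in this thread shows up on a three-row table. This is a minimal sketch using Python's sqlite3 module; the id and name columns and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER, name TEXT);
    INSERT INTO my_table VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

# Bare columns next to an aggregate: SQLite collapses the result to one row.
# Which id/name values appear in that row is not something to rely on.
bare = conn.execute("SELECT *, COUNT(*) FROM my_table").fetchall()

# CROSS JOIN against a one-row subquery keeps every row and appends the total.
joined = conn.execute("""
    SELECT *
    FROM my_table
    CROSS JOIN (SELECT COUNT(*) AS COUNT_OF_RECS_IN_MY_TABLE
                FROM my_table)
    ORDER BY id
""").fetchall()

print(len(bare), bare[0][-1])  # 1 3
print(joined)                  # [(1, 'a', 3), (2, 'b', 3), (3, 'c', 3)]
```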