Difference between count(*) and count(id) in Impala

I am new to Impala. When I use a HAVING clause with count(*), I get this error:
HAVING clause 'count(*)' requires return type 'BOOLEAN'. Actual type is 'BIGINT'.
The SQL is:
select listingid, inserttime, count(*) from listauditreason
group by listingid, inserttime
having count(*) >= 2
limit 20
But when I replace count(*) with count(listingid), the query runs and returns output:
select listingid, inserttime, count(listingid) from listauditreason
group by listingid, inserttime
having count(listingid) >= 2
limit 20
I am curious about the difference between the two and why the second form gives me the right result.
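As a point of reference for the general difference between the two forms, here is a minimal sketch using Python's built-in sqlite3 module rather than Impala. It does not reproduce or explain the Impala error above; it only shows the standard SQL behavior that count(*) counts rows while count(column) counts non-NULL values. The id column and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical miniature of listauditreason; the id column and the
# sample data are assumed for illustration and include one NULL id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listauditreason (id INTEGER, listingid INTEGER, inserttime TEXT)"
)
conn.executemany(
    "INSERT INTO listauditreason VALUES (?, ?, ?)",
    [(1, 10, "2017-01-01"), (None, 10, "2017-01-01"), (3, 20, "2017-01-02")],
)

# COUNT(*) counts every row in the group; COUNT(id) skips rows where id is NULL.
rows = conn.execute(
    """
    SELECT listingid, inserttime, COUNT(*) AS all_rows, COUNT(id) AS non_null_ids
    FROM listauditreason
    GROUP BY listingid, inserttime
    HAVING COUNT(*) >= 2
    ORDER BY listingid
    LIMIT 20
    """
).fetchall()
```

If the counted column never contains NULL, both forms return the same number for every group.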

Related

Combine SQL Queries in Same Table

I am trying to combine the following queries:
SELECT country_name,
avg(value) as Entrance_Age
FROM `bigquery-public-data.world_bank_intl_education.international_education`
WHERE indicator_code = 'UIS.THAGE.0'
GROUP BY country_name
ORDER BY avg(value) DESC LIMIT 10
SELECT avg(value) as Illiterate
FROM `bigquery-public-data.world_bank_intl_education.international_education`
WHERE indicator_code = 'UIS.ILLPOP.AG25T64'
GROUP BY country_name
ORDER BY avg(value) DESC LIMIT 10
The output from the first query is shown here: https://i.stack.imgur.com/jsxx7.png
The goal is to get another column named "Illiterate" next to Entrance_Age, showing the illiteracy rate for each of these 10 countries. All the data is from the same table. Each value is a statistic identified by its indicator_code.
I've tried multiple joins but can't seem to get one that works.
If there is anything I am missing from my question, please let me know.
You can use conditional aggregation:
SELECT country_name,
avg(case when indicator_code = 'UIS.THAGE.0' then value end) as Entrance_Age,
avg(case when indicator_code = 'UIS.ILLPOP.AG25T64' then value end) as illiterate
FROM `bigquery-public-data.world_bank_intl_education.international_education`
GROUP BY country_name
ORDER BY Entrance_Age DESC LIMIT 10
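To see what the conditional aggregation does, here is a small sketch that runs the same query shape against an in-memory SQLite table through Python's sqlite3 module. The local table name international_education stands in for the BigQuery public table, and the countries and values are made up for illustration.

```python
import sqlite3

# Stand-in for the BigQuery table; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE international_education "
    "(country_name TEXT, indicator_code TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO international_education VALUES (?, ?, ?)",
    [
        ("A", "UIS.THAGE.0", 6.0),
        ("A", "UIS.THAGE.0", 8.0),
        ("A", "UIS.ILLPOP.AG25T64", 100.0),
        ("B", "UIS.THAGE.0", 5.0),
        ("B", "UIS.ILLPOP.AG25T64", 300.0),
        ("B", "UIS.ILLPOP.AG25T64", 500.0),
    ],
)

# Each CASE yields NULL for rows of the other indicator, and AVG ignores
# NULLs, so each column averages only its own indicator's values.
rows = conn.execute(
    """
    SELECT country_name,
           AVG(CASE WHEN indicator_code = 'UIS.THAGE.0' THEN value END) AS Entrance_Age,
           AVG(CASE WHEN indicator_code = 'UIS.ILLPOP.AG25T64' THEN value END) AS Illiterate
    FROM international_education
    GROUP BY country_name
    ORDER BY Entrance_Age DESC
    LIMIT 10
    """
).fetchall()
```

Because there is no WHERE clause on indicator_code, a country that has only one of the two indicators still appears, with NULL in the other column.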

Select query to show timestamp (HH:MM:SS) column with group by HH (Oracle query)

I am trying to write a SELECT query in Oracle to get the following result from my_table.
The table contains timestamp_coloumn and a count column; records are inserted on a per-minute basis.
The query I tried looks something like this:
select to_char(timestamp_coloumn ,'HH24:MI:SS') as TS , count as count
from my_table
group by to_char(timestamp_coloumn ,'HH24');
Error:
ORA-00979: not a GROUP BY expression
If I make the SELECT and GROUP BY expressions match, as follows, it works, but I don't get my expected result (I don't know whether it is right to query like that):
select to_char(timestamp_coloumn ,'HH24:MI:SS') as TS , count as count
from my_table
group by to_char(timestamp_coloumn ,'HH24:MI:SS');
Expected result (one timestamp per hour, with count grouped and summed over all records in that hour):
timestamp_coloumn count
--------------------------
07:01:23 4
08:01:36 3
09:01:44 6
10:01:10 5
Please help me with this query
You can use MIN():
select min(to_char(timestamp_coloumn ,'HH24:MI:SS')) as TS, count(*)
from my_table
group by to_char(timestamp_coloumn ,'HH24');
Or make the two expressions match:
select to_char(timestamp_coloumn ,'HH24') as TS, count(*)
from my_table
group by to_char(timestamp_coloumn ,'HH24');
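The grouping idea can be tried out with a sketch in Python's sqlite3 module. Two things differ from the Oracle answers above and are assumptions: SQLite's strftime replaces Oracle's to_char, and the count column is summed with SUM, as the question describes, instead of counting rows with count(*). The sample rows are invented.

```python
import sqlite3

# Hypothetical sample data; timestamps are stored as ISO-8601 text,
# which SQLite's strftime() can parse.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (timestamp_coloumn TEXT, "count" INTEGER)')
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("2020-01-01 07:01:23", 1),
        ("2020-01-01 07:02:23", 3),
        ("2020-01-01 08:01:36", 3),
    ],
)

# Group by the hour only; MIN() picks the earliest time within each hour
# so the selected expression is a valid aggregate for the group.
rows = conn.execute(
    """
    SELECT MIN(strftime('%H:%M:%S', timestamp_coloumn)) AS TS,
           SUM("count") AS total
    FROM my_table
    GROUP BY strftime('%H', timestamp_coloumn)
    ORDER BY TS
    """
).fetchall()
```

Every non-aggregated expression in the SELECT list has to match a GROUP BY expression, which is why the original query raised ORA-00979 and why wrapping the full timestamp in MIN() resolves it.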

Get Total Sum with User Sum

SQL Table:
UserId ReportsRead
1 4
2 6
3 5
I would like to query that table so that I can get the following out:
UserId ReportsRead TotalReports
1 4 15
The problem is that because I apply the WHERE clause, the sum I get is the same as that user's ReportsRead.
SELECT UserId, ReportsRead, SUM(ReportsRead) AS TotalReports FROM MyTable WHERE UserId = 1
Is there a built in function that will allow me to do this? I would like to avoid Sub-queries entirely.
I don't usually recommend subqueries in this situation, but in this case, it seems like a simple approach:
SELECT UserId, ReportsRead,
(SELECT SUM(ReportsRead) from MyTable) AS TotalReports
FROM MyTable
WHERE UserId = 1;
If you want rows for all users, then window functions are the way to go:
select t.*, sum(reportsread) over () as totalreports
from mytable t;
However, you can't include a where clause and still expect to get the correct total.
Use the sum window function.
SELECT UserId, ReportsRead, SUM(ReportsRead) OVER() AS TotalReports
FROM MyTable
Use a filtering condition to get a specific userId like
SELECT *
FROM (SELECT UserId, ReportsRead, SUM(ReportsRead) OVER() AS TotalReports
FROM MyTable
) t
WHERE UserId=1
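The effect of where the filter sits can be checked with a short sketch in Python's sqlite3 module, using the three rows from the question. It assumes the bundled SQLite is version 3.25 or newer, since window functions are not available before that.

```python
import sqlite3

# The sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (UserId INTEGER, ReportsRead INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", [(1, 4), (2, 6), (3, 5)])

# WHERE runs before the window function, so the sum only sees user 1's row.
filtered_inside = conn.execute(
    """
    SELECT UserId, ReportsRead, SUM(ReportsRead) OVER () AS TotalReports
    FROM MyTable
    WHERE UserId = 1
    """
).fetchall()

# Computing the window sum in a derived table and filtering outside it
# keeps the total over all users.
filtered_outside = conn.execute(
    """
    SELECT *
    FROM (SELECT UserId, ReportsRead, SUM(ReportsRead) OVER () AS TotalReports
          FROM MyTable) t
    WHERE UserId = 1
    """
).fetchall()
```

The first query returns a total of 4 and the second a total of 15, which is the row the question asks for.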

How to filter records by their count per date?

I have a table 'A' that has a date column, and the same date can appear in several records. I'm trying to select the records where the number of records for that day is less than 5, while still keeping all the fields of the table.
I mean that if I have only 4 records on 11/10/2017, I need to select all 4 of those records.
You can select them with a subquery. In the subquery, group by the date column and use HAVING with an aggregated count to find how many records each date group has, then select all rows whose date has a count lower than 5:
SELECT *
FROM A
WHERE A.date in (SELECT subA.date
FROM A subA
GROUP BY subA.date
HAVING COUNT(*) < 5 );
Take Care's answer is good. Alternatively, you can use an analytic/windowing function. I'd benchmark both and see which one works better.
with cte as (
select *, count(1) over (partition by date) as cnt
from table_a
)
select *
from cte
where cnt < 5
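Both approaches can be compared side by side with a small sketch in Python's sqlite3 module. The table layout (an id plus a date column) and the sample rows are assumptions, and the window version needs SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical table A: 4 rows on one date, 5 rows on another.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE A (id INTEGER, "date" TEXT)')
sample = [(i, "2017-10-11") for i in range(1, 5)]
sample += [(i, "2017-10-12") for i in range(5, 10)]
conn.executemany("INSERT INTO A VALUES (?, ?)", sample)

# Subquery approach: find the dates with fewer than 5 rows, then keep
# every row that falls on one of those dates.
via_subquery = conn.execute(
    """
    SELECT id, "date"
    FROM A
    WHERE "date" IN (SELECT subA."date"
                     FROM A subA
                     GROUP BY subA."date"
                     HAVING COUNT(*) < 5)
    ORDER BY id
    """
).fetchall()

# Window approach: attach the per-date row count to each row, then filter.
via_window = conn.execute(
    """
    WITH cte AS (
        SELECT id, "date", COUNT(1) OVER (PARTITION BY "date") AS cnt
        FROM A
    )
    SELECT id, "date"
    FROM cte
    WHERE cnt < 5
    ORDER BY id
    """
).fetchall()
```

On this data both queries return the four rows dated 2017-10-11; which one performs better depends on the engine and data volume, so benchmarking both, as suggested above, is the safer route.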

adding count( ) column on each row

I'm not sure if this is even a good question or not.
I have a complex query with lots of unions that searches multiple tables for a certain keyword (user input). All of the tables searched are related to the table book.
The result set is paged using LIMIT, so at most 10 results are returned at a time.
However, I want an extra column in the result set showing the total number of results found, and I do not want to do this with a separate query. Is it possible to add a count() column to the result set that counts every result found?
the output would look like this:
ID Title Author Count(...)
1 book_1 auth_1 23
2 book_2 auth_2 23
4 book_4 auth_.. 23
...
Thanks!
This won't add the count to each row, but one way to get the total count without running a second query is to run your first query using the SQL_CALC_FOUND_ROWS option and then select FOUND_ROWS(). This is sometimes useful if you want to know how many total results there are so you can calculate the page count.
Example:
select SQL_CALC_FOUND_ROWS ID, Title, Author
from yourtable
limit 0, 10;
SELECT FOUND_ROWS();
From the manual:
http://dev.mysql.com/doc/refman/5.1/en/information-functions.html#function_found-rows
The usual way of counting in a query is to group on the fields that are returned:
select ID, Title, Author, count(*) as Cnt
from ...
group by ID, Title, Author
order by Title
limit 0, 10
The Cnt column will contain the number of records in each group, i.e. for each title.
Regarding second query:
select tbl.id, tbl.title, tbl.author, x.cnt
from tbl
cross join (select count(*) as cnt from tbl) as x
If you will not join to other table(s):
select tbl.id, tbl.title, tbl.author, x.cnt
from tbl, (select count(*) as cnt from tbl) as x
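The cross join approach can be sketched with Python's sqlite3 module to show that the total survives paging. The table tbl and its 23 rows are made up to mirror the example output in the question, and this uses SQLite rather than MySQL.

```python
import sqlite3

# Hypothetical table with 23 books, matching the count in the question's example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(i, f"book_{i}", f"auth_{i}") for i in range(1, 24)],
)

# The one-row subquery x is joined to every row, so each row carries the
# total count; LIMIT is applied afterwards and does not change cnt.
page = conn.execute(
    """
    SELECT tbl.id, tbl.title, tbl.author, x.cnt
    FROM tbl
    CROSS JOIN (SELECT COUNT(*) AS cnt FROM tbl) AS x
    ORDER BY tbl.id
    LIMIT 10
    """
).fetchall()
```

The page holds 10 rows and each one reports a cnt of 23. Note that the counting subquery must apply the same search conditions as the outer query, otherwise the total will not match the filtered results.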
My Solution:
SELECT COUNT(1) over(partition BY text) totalRecordNumber
FROM (SELECT 'a' text, id_consult_req
FROM consult_req cr);
If your problem is simply the speed or cost of running a second (complex) query, I would suggest selecting the result set into a hash table and counting the rows from there as you return them. Even more efficiently, use the row count of the previous result set, so you do not have to recount at all.
This will add the total count on each row:
select count(*) over (order by (select 1)) as Cnt,*
from yourtable
Here is your answer:
SELECT *, @cnt count_rows FROM (
SELECT *, (@cnt := @cnt + 1) row_number FROM your_table
CROSS JOIN (SELECT @cnt := 0 AS variable) t
) t;
You simply cannot do this, you'll have to use a second query.