How to use WHERE after GROUP BY in sqlite3

I am working with a dataset and used GROUP BY to get the count of one of the columns. Next, I want to keep only the groups with count > 2, but I find that WHERE does not work after GROUP BY. What should I do?
For example, for id_table with a single column id holding the values [2, 3, 2, 3, 4, 2], I want to count each value and find which id appears more than two times. In this case the output should be 2, since it appears 3 times. My code is below:
SELECT id
FROM id_table
GROUP BY id
WHERE count(id) > 2;
The error message is: near "WHERE": syntax error

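Use HAVING instead of WHERE. WHERE filters rows before they are grouped, so it cannot reference an aggregate such as count(id); HAVING filters the groups after aggregation: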
SELECT id
FROM id_table
GROUP BY id
HAVING count(id) > 2;
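To try the HAVING query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The helper name ids_appearing_more_than and the in-memory database are assumptions made for illustration; only id_table and its id column come from the question.

```python
import sqlite3


def ids_appearing_more_than(ids, threshold):
    """Return the ids that occur more than `threshold` times (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE id_table (id INTEGER)")
    conn.executemany("INSERT INTO id_table (id) VALUES (?)", [(i,) for i in ids])
    rows = conn.execute(
        """
        SELECT id
        FROM id_table
        GROUP BY id
        HAVING count(id) > ?   -- filter on the aggregate, after grouping
        ORDER BY id
        """,
        (threshold,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(ids_appearing_more_than([2, 3, 2, 3, 4, 2], 2))  # [2]
```

A WHERE clause is still valid in the same statement, but it has to come before GROUP BY and can only test individual rows.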

Related

Capture non-occurring data from dynamic input data

I have data rules like the ones given below:
1|Group1|Mandatory|1st occurrence
2|Group1|Optional|1st occurrence
3|Group1|Mandatory|1st occurrence
1|Group1|Mandatory|2nd occurrence
2|Group1|Optional|2nd occurrence
3|Group1|Mandatory|2nd occurrence
4|Group2|Mandatory|1st occurrence
5|Group2|Mandatory|1st occurrence
6|Group2|Optional|1st occurrence
As you can see, Group 1 is present two times for data records 1, 2 and 3. This means Group 1 can appear a minimum of one time and a maximum of two times. The last field gives the occurrence of that specific record within its group. Mandatory records must always occur; optional records may or may not occur in the input data. Either way, I need to capture what is missing.
Here is my input data. This single column is the only one I have in the input:
1
2
3
1
2
4
5
Is there any way to identify which records are missing from the input data according to the data rules table? In this example, the output should say that Mandatory record 3 is missing from Group 1 in the second occurrence. The input data and the data rules table are the only information available.
If anything needs to be added to get the desired result, I would like to hear what it is. All suggestions are welcome.
Thanks
I think you need something like this:
with input as (
    select column_value id,
           count(1) over (partition by column_value order by null
                          rows between unbounded preceding and current row) cnt
    from table(sys.odcinumberlist(1, 2, 3, 1, 2, 4, 5)))
select *
from data
where status = 'Mandatory'
  and (id, occurence) not in (select id, cnt from input)
Result:
ID GRP STATUS OCCURENCE
---- ---------- ---------- ---------
3 Group1 Mandatory 2
Count how many times each id appears in the input data and compare the result with the mandatory occurrences in your data.
Edit: explanation
select column_value id,
       count(1) over (partition by column_value order by null
                      rows between unbounded preceding and current row) cnt
from table(sys.odcinumberlist(1, 2, 3, 1, 2, 4, 5))
This part simulates your input data. table(sys.odcinumberlist(1, 2, 3, 1, 2, 4, 5)) is just a stand-in for the inputs; these ids are probably stored in some table, so select them from there. For each id I compute its running number of occurrences using the analytic version of count(), which gives:
id cnt
--- ---
1 1
1 2
2 1
2 2
3 1
4 1
5 1
Next, these pairs are compared with the mandatory (id, occurence) pairs in your data. If a pair is missing, the final select returns that row through the NOT IN clause.
This is how I understood your question. You may need some modifications, but this should give you some hints. Hope this helps (and sorry for my bad English ;-) ).
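The same idea (number each occurrence of an id, then look for mandatory pairs that never showed up) can be sketched outside the database. The following is an illustration in plain Python rather than the Oracle query above; the names RULES and missing_mandatory are hypothetical, and the rule tuples are transcribed from the question with the occurrence written as an integer.

```python
from collections import Counter

# Rules from the question as (id, group, status, occurrence) tuples.
RULES = [
    (1, "Group1", "Mandatory", 1),
    (2, "Group1", "Optional", 1),
    (3, "Group1", "Mandatory", 1),
    (1, "Group1", "Mandatory", 2),
    (2, "Group1", "Optional", 2),
    (3, "Group1", "Mandatory", 2),
    (4, "Group2", "Mandatory", 1),
    (5, "Group2", "Mandatory", 1),
    (6, "Group2", "Optional", 1),
]


def missing_mandatory(rules, input_ids):
    """Return the mandatory rules whose (id, occurrence) pair is absent from the input."""
    seen = Counter()
    observed = set()
    for value in input_ids:
        seen[value] += 1                    # running count, like the analytic count()
        observed.add((value, seen[value]))  # (id, n-th occurrence) pair
    return [
        rule for rule in rules
        if rule[2] == "Mandatory" and (rule[0], rule[3]) not in observed
    ]


print(missing_mandatory(RULES, [1, 2, 3, 1, 2, 4, 5]))
# [(3, 'Group1', 'Mandatory', 2)]
```

Optional rules are skipped on purpose; dropping the status test would report them as well.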

PostgreSQL: Query to know which fraction of the values are larger/smaller

I would like to query my database to know which fraction/percentage of the elements of a table are larger/smaller than a given value.
For instance, let's say I have a table shopping_list with the following schema:
id integer
name text
price double precision
with contents:
id name price
1 banana 1
2 book 20
3 chicken 5
4 chocolate 3
I am now going to buy a new item with price 4, and I would like to know where this new item will be ranked in the shopping list. In this case the element will be greater than 50% of the elements.
I know I can run two queries and count the number of elements, e.g.:
-- returns = 4
SELECT COUNT(*)
FROM shopping_list;
-- returns = 2
SELECT COUNT(*)
FROM shopping_list
WHERE price > 4;
But I would like to do it with a single query to avoid post-processing the results.
If you just want them in a single query, use UNION:
SELECT COUNT(*), 'total'
FROM shopping_list
UNION
SELECT COUNT(*),'greater'
FROM shopping_list
WHERE price > 4;
The simplest way is to use avg():
SELECT AVG( (price > 4)::float)
FROM shopping_list;
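For readers who want to check the numbers without a Postgres server, here is a rough sketch that runs the same averaging idea in SQLite through Python's sqlite3 module. This is an assumption-laden stand-in: the `::float` cast above is Postgres syntax, so the sketch uses a CASE expression instead, and the helper name price_stats is hypothetical.

```python
import sqlite3

ITEMS = [(1, "banana", 1), (2, "book", 20), (3, "chicken", 5), (4, "chocolate", 3)]


def price_stats(items, threshold):
    """Return (total, greater, fraction_greater) in a single query (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shopping_list (id INTEGER, name TEXT, price REAL)")
    conn.executemany("INSERT INTO shopping_list VALUES (?, ?, ?)", items)
    row = conn.execute(
        """
        SELECT COUNT(*) AS total,
               SUM(CASE WHEN price > ? THEN 1 ELSE 0 END) AS greater,
               AVG(CASE WHEN price > ? THEN 1.0 ELSE 0.0 END) AS fraction_greater
        FROM shopping_list
        """,
        (threshold, threshold),
    ).fetchone()
    conn.close()
    return row


print(price_stats(ITEMS, 4))  # (4, 2, 0.5)
```

Averaging a 0/1 flag gives the fraction of rows where the condition holds, which is why no second query or post-processing is needed.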
One way to get both results is as follows:
select count(*) as total,
(select count(*) from shopping_list where price > 4) as greater
from shopping_list
It will get both results in a single row, with the names you specified. It does, however, involve a query within a query.
I found the aggregate function PERCENT_RANK which does exactly what I wanted:
SELECT PERCENT_RANK(4) WITHIN GROUP (ORDER BY price)
FROM shopping_list;
-- returns 0.5

Find duplicated values on array column

I have a table with an array column like this:
my_table
id array
-- -----------
1 {1, 3, 4, 5}
2 {19,2, 4, 9}
3 {23,46, 87, 6}
4 {199,24, 93, 6}
And I want the result to show which values are repeated and where, like this:
value_repeated is_repeated_on
-------------- -----------
4 {1,2}
6 {3,4}
Is it possible? I don't know how to do this or even how to start. I'm lost!
Use unnest to convert the array to rows, and then array_agg to build an array from the ids.
It should look something like this:
SELECT v AS value_repeated, array_agg(id) AS is_repeated_on
FROM (SELECT id, unnest("array") AS v   -- "array" is a reserved word, so the column name is quoted
      FROM my_table) AS t               -- older PostgreSQL versions require an alias on the subquery
GROUP BY v
HAVING count(DISTINCT id) > 1;
Note that HAVING count(DISTINCT id) > 1 filters out values that appear in only one row.
The clean way to call a set-returning function like unnest() is in a LATERAL join, available since Postgres 9.3:
SELECT value_repeated, array_agg(id) AS is_repeated_on
FROM my_table
, unnest(array_col) value_repeated
GROUP BY value_repeated
HAVING count(*) > 1
ORDER BY value_repeated; -- optional
About LATERAL:
Call a set-returning function with an array argument multiple times
There is nothing in your question to rule out shortcut duplicates (the same element more than once in the same array, like I#MSoP commented), so it must be count(*), not count(DISTINCT id).
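As a way to see what unnest followed by array_agg does, the same transformation can be sketched in plain Python. This is only an illustration of the idea, not Postgres code; the names MY_TABLE and repeated_values are hypothetical. It follows the count(*) semantics, so a value repeated inside one array also counts as repeated.

```python
from collections import defaultdict

# Sample rows from the question as (id, array) pairs.
MY_TABLE = [
    (1, [1, 3, 4, 5]),
    (2, [19, 2, 4, 9]),
    (3, [23, 46, 87, 6]),
    (4, [199, 24, 93, 6]),
]


def repeated_values(rows):
    """Map each value that occurs more than once to the ids of the rows holding it."""
    ids_by_value = defaultdict(list)
    for row_id, values in rows:
        for value in values:                      # "unnest": one (id, value) pair per element
            ids_by_value[value].append(row_id)    # "array_agg": collect ids per value
    return {
        value: ids
        for value, ids in sorted(ids_by_value.items())
        if len(ids) > 1                           # HAVING count(*) > 1
    }


print(repeated_values(MY_TABLE))  # {4: [1, 2], 6: [3, 4]}
```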

SQL query for aggregate on multiple rows

I have data in a table like the following:
Name indicator
A 1
A 2
A 3
B 1
B 2
C 3
I want to get the count of Names for which both indicators 1 and 2 exist. In the preceding example, this number is 2 (A and B both have indicators 1 and 2).
The data I am dealing with is moderately large, and I need to get similar information for some other permutations of (predefined) indicators, which I can change once I have the base query.
Try this:
SELECT Name
FROM Tablename
WHERE indicator IN(1, 2)
GROUP BY Name
HAVING COUNT(DISTINCT indicator) = 2;
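Since the question asks for the number of such names and for a way to swap in other indicator sets, here is a small sketch that parameterises the query above and runs it in SQLite via Python's sqlite3 module. The helper name names_with_all_indicators is an assumption, as is the use of SQLite rather than the asker's database.

```python
import sqlite3

ROWS = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("C", 3)]


def names_with_all_indicators(rows, wanted):
    """Return the names that have every indicator in `wanted` (hypothetical helper)."""
    wanted = sorted(set(wanted))
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Tablename (Name TEXT, indicator INTEGER)")
    conn.executemany("INSERT INTO Tablename VALUES (?, ?)", rows)
    placeholders = ", ".join("?" for _ in wanted)  # one "?" per wanted indicator
    query = f"""
        SELECT Name
        FROM Tablename
        WHERE indicator IN ({placeholders})
        GROUP BY Name
        HAVING COUNT(DISTINCT indicator) = ?       -- must match all wanted indicators
        ORDER BY Name
    """
    names = [row[0] for row in conn.execute(query, (*wanted, len(wanted)))]
    conn.close()
    return names


names = names_with_all_indicators(ROWS, [1, 2])
print(names, len(names))  # ['A', 'B'] 2
```

The HAVING threshold is tied to the number of wanted indicators, so changing the list is the only edit needed for other combinations.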

SQL COUNT of COUNT

I have some data I am querying. The table is composed of two columns - a unique ID, and a value. I would like to count the number of times each unique value appears (which can easily be done with a COUNT and GROUP BY), but I then want to be able to count that. So, I would like to see how many items appear twice, three times, etc.
So for the following data (ID, val)...
1, 2
2, 2
3, 1
4, 2
5, 1
6, 7
7, 1
The intermediate step would be (val, count)...
1, 3
2, 3
7, 1
And I would like to have (count_from_above, new_count)...
3, 2 -- since three appears twice in the previous table
1, 1 -- since one appears once in the previous table
Is there any query which can do that? If it helps, I'm working with Postgres. Thanks!
Try something like this:
select
    times,
    count(*) as new_count
from ( select
           val,
           count(*) as times   -- how many rows share each value
       from my_table           -- placeholder name; "table" is a reserved word
       group by val ) a
group by times
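To check the two-level grouping against the sample data, here is a minimal sketch using Python's sqlite3 module. It groups by the value first and then by the resulting count. The table name items, the column names and the helper count_of_counts are assumptions chosen for the example, and SQLite stands in for Postgres.

```python
import sqlite3

DATA = [(1, 2), (2, 2), (3, 1), (4, 2), (5, 1), (6, 7), (7, 1)]


def count_of_counts(rows):
    """Return (times, new_count) pairs: how many values occur `times` times."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val INTEGER)")
    conn.executemany("INSERT INTO items (id, val) VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT times, COUNT(*) AS new_count
        FROM (SELECT val, COUNT(*) AS times   -- inner step: count per value
              FROM items
              GROUP BY val) AS per_value
        GROUP BY times                        -- outer step: count the counts
        ORDER BY times DESC
        """
    ).fetchall()
    conn.close()
    return result


print(count_of_counts(DATA))  # [(3, 2), (1, 1)]
```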