Listagg + Count in Select duplicates - sql

I'm writing up a query and cannot seem to get over this hurdle.
I am using both LISTAGG and COUNT (side-by-side) in it and whenever I do so, the ListAgg will duplicate when count is more than 1. Moreover, it adds more into the count when the ListAgg is more than one. They're each messing with each other, and I want to know how to keep them within the same query, but keep duplicates from appearing in the ListAgg while finding only the correct amount of instances for the Count.
I've tried using DISTINCT and various groupings, but to no avail.
Here is my (simplified) SQL:
SELECT DISTINCT /*+PARALLEL */ ID, NAME, LISTAGG(USERID, ';'), COUNT(MAIN_DATA)
FROM MAIN m
JOIN USERS u on m.pk1 = u.main_pk1
WHERE MAIN_DATA like '%keyword%'
GROUP BY ID, NAME
which yields something similar to this:
ID|NAME|USERID|MAIN_DATA
1|Hello|Jim|1
2|Hi|Arthur;Arthur;Arthur|3
3|Bonjour|Jane;Jane;Jim;Jim|4
ID 2 should list Arthur only once, and ID 3 has only 2 instances of the keyword, not 4. How can I achieve this?

Unfortunately, LISTAGG() doesn't support DISTINCT (at least in Oracle releases before 19c).
To remove duplicates, you need a subquery:
SELECT ID, NAME, LISTAGG(USERID, ';'), SUM(cnt)
FROM (SELECT ID, NAME, USERID, COUNT(*) as cnt
FROM MAIN m JOIN
USERS u
ON m.pk1 = u.main_pk1
WHERE m.MAIN_DATA like '%keyword%'
GROUP BY ID, NAME, USERID
) mu
GROUP BY ID, NAME;

Related

Filter by number of occurrences in a SQL Table

Given the following table where the Name value might be repeated in multiple rows:
How can we determine how many times a Name value exists in the table, and can we filter on names that have a specific number of occurrences?
For instance, how can I filter this table to show only names that appear twice?
You can use GROUP BY and HAVING to show names that appear twice in the table:
select name, count(*) cnt
from mytable
group by name
having count(*) = 2
Then if you want the overall count of names that appear twice, you can add another level of aggregation:
select count(*) cnt
from (
select name
from mytable
group by name
having count(*) = 2
) t
It sounds like you're looking for a histogram of the frequency of name counts, something like this:
with counts_cte(name, cnt) as (
select name, count(*)
from mytable
group by name)
select cnt, count(*) num_names
from counts_cte
group by cnt
order by 2 desc;
You need a GROUP BY clause to count how many times each name is repeated:
select name, count(*) AS Repeated
from Your_Table_Name
group by name;
If you want to show only the names that are repeated more than once, use the query below:
select name, count(*) AS Repeated
from Your_Table_Name
group by name having count(*) > 1;
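Since the table from the question is not shown, here is a small sketch with invented names that runs the HAVING filter and the histogram query from the answers above. It uses SQLite through Python's sqlite3 module; the table contents are an assumption made purely for illustration.

```python
import sqlite3

# Hypothetical data: Ann x2, Bob x2, Cid x1, Dee x3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (name TEXT);
    INSERT INTO mytable VALUES
        ('Ann'), ('Bob'), ('Ann'), ('Cid'), ('Bob'), ('Dee'), ('Dee'), ('Dee');
""")

# Names that appear exactly twice.
twice = conn.execute("""
    SELECT name, COUNT(*) AS cnt
    FROM mytable
    GROUP BY name
    HAVING COUNT(*) = 2
    ORDER BY name
""").fetchall()

# Histogram: how many names occur once, twice, three times, ...
histogram = conn.execute("""
    WITH counts_cte(name, cnt) AS (
        SELECT name, COUNT(*) FROM mytable GROUP BY name)
    SELECT cnt, COUNT(*) AS num_names
    FROM counts_cte
    GROUP BY cnt
    ORDER BY cnt
""").fetchall()

print(twice)
print(histogram)
```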

Count() how many times a name shows up in a table with the rest of info

I have read in various websites about the count() function but I still cannot make this work.
I made a small table with (id, name, last name, age) and I need to retrieve all columns plus a new one. In this new column I want to display how many times a name shows up or repeats itself in the table.
I have run some tests and can retrieve the name column together with the count column, but I haven't been able to retrieve all the data from the table.
Currently I have this
select a.n_showsup, p.*
from [test1].[dbo].[person] p,
(select count(*) n_showsup
from [test1].[dbo].[person])a
This gives me all the data in the output, but the n_showsup column just contains the total number of rows. I know this is because I'm missing a GROUP BY, but when I write GROUP BY name it shows me a lot of records. This is an example of what I need:
You can use window functions, if your RDBMS supports them:
select t.*, count(*) over(partition by name) n_showsup
from mytable t
Alternatively, you can join the table with an aggregation query that counts the number of occurrences of each name:
select t.*, x.n_showsup
from mytable t
inner join (select name, count(*) n_showsup from mytable group by name) x
on x.name = t.name
While the window function approach (@GMB's answer) is the right way to go, thinking through this from a subquery approach (like you were headed towards) would look something like:
select p.*, a.n_showsup
from [test1].[dbo].[person] p
INNER JOIN (
select name, count(*) n_showsup
from [test1].[dbo].[person]
GROUP BY name
) a ON p.name = a.name
This is VERY close to what you had, the difference is that we are grouping that subquery by name (so we get a count by name) and we can use that in the join criteria which we do with the ON clause on that INNER JOIN.
You should really never ever use a comma in your FROM clause. Instead use a JOIN.
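To check that the window-function and join approaches agree, here is a minimal sketch on SQLite via Python's sqlite3 module. The person rows are invented, the column names follow the description in the question, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical rows: the name 'Ana' appears twice, 'Luis' once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, last_name TEXT, age INTEGER);
    INSERT INTO person VALUES
        (1, 'Ana',  'Lopez', 30),
        (2, 'Ana',  'Diaz',  41),
        (3, 'Luis', 'Mora',  25);
""")

# Window function: every row keeps its columns and gains the per-name count.
windowed = conn.execute("""
    SELECT p.*, COUNT(*) OVER (PARTITION BY name) AS n_showsup
    FROM person p
    ORDER BY p.id
""").fetchall()

# Join against a per-name aggregate: same result without window functions.
joined = conn.execute("""
    SELECT p.*, a.n_showsup
    FROM person p
    INNER JOIN (SELECT name, COUNT(*) AS n_showsup
                FROM person
                GROUP BY name) a ON p.name = a.name
    ORDER BY p.id
""").fetchall()

print(windowed)
```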

Is there any optimal way to find the count of rows

I wrote a SQL query with one inner query and one outer query; the outer query builds its result from the inner one. Now I need the number of rows returned by the outer query, so I wrapped it in another SELECT statement and used the count() function, which works. Is there a better way to calculate the row count? Please see my query below and suggest the best way to do this.
SELECT count(*) FROM (
SELECT
COUNT(*) NO_OF_EMP
,SUM(tbl.AMOUNT) TOTAL_AMOUNT
,tbl.YYYYMM
,tbl.DATA_PICKED_BY_NAME
,MIN(DATA_PICKED_DATE) DATA_PICKED_DATE
,ROW_NUMBER() OVER (ORDER BY tbl.REFERENCE_ID) AS ROW_NUM
FROM (
SELECT
SALARY_REPORT_ID
,EMP_NAME
,EMP_CODE
,PAY_CODE
,PAY_CODE_NAME
,AMOUNT
,PAY_MODE
,PAY_CODE_DESC
,YYYYMM
,REMARK
,EMP_ID
,PRAN_NUMBER
,PF_NUMBER
,PRAN_NO
,ATTOFF_EMPCODE
,DATA_PICKED_DATE
,DATA_PICKED_BY
,DATA_PICKED_BY_NAME
,SUBSTR(REFERENCE_ID,0,3) REFERENCE_ID
FROM SALARY_DETAIL_REPORT_HISTORY
WHERE PAY_CODE=999
AND REFERENCE_ID LIKE '202%'
) tbl
GROUP BY tbl.REFERENCE_ID,tbl.YYYYMM,tbl.DATA_PICKED_BY_NAME
order by tbl.YYYYMM
)mytbl1
Concatenate the grouping values from your original query into a single value and select the distinct count of that:
SELECT count(distinct SUBSTR(REFERENCE_ID,0,3) || YYYYMM || DATA_PICKED_BY_NAME)
FROM SALARY_DETAIL_REPORT_HISTORY
WHERE PAY_CODE=999
AND REFERENCE_ID LIKE '202%'
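As a sanity check, the sketch below compares the wrapped COUNT(*) with the COUNT(DISTINCT ...) shortcut on a few invented rows. It runs on SQLite via Python's sqlite3 module, so substr is called with start position 1 (Oracle's SUBSTR treats position 0 as 1). The table is reduced to the columns the count depends on, and all data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salary_detail_report_history (
        reference_id TEXT, yyyymm TEXT, data_picked_by_name TEXT,
        pay_code INTEGER, amount INTEGER);
    INSERT INTO salary_detail_report_history VALUES
        ('2021A', '202101', 'Ravi', 999, 100),
        ('2021B', '202101', 'Ravi', 999, 200),  -- same group as the row above
        ('2022A', '202102', 'Ravi', 999,  50),
        ('2023A', '202102', 'Mina', 999,  70),
        ('1999A', '202101', 'Ravi', 999,  10),  -- excluded by LIKE '202%'
        ('2021C', '202101', 'Ravi', 100,  10);  -- excluded by pay_code
""")

# Approach from the question: group first, then count the groups.
wrapped = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT 1
        FROM salary_detail_report_history
        WHERE pay_code = 999 AND reference_id LIKE '202%'
        GROUP BY substr(reference_id, 1, 3), yyyymm, data_picked_by_name
    )
""").fetchone()[0]

# Approach from the answer: count distinct concatenated group keys.
direct = conn.execute("""
    SELECT COUNT(DISTINCT substr(reference_id, 1, 3) || yyyymm || data_picked_by_name)
    FROM salary_detail_report_history
    WHERE pay_code = 999 AND reference_id LIKE '202%'
""").fetchone()[0]

print(wrapped, direct)
```

The two agree here because the first two key parts have a fixed width. With variable-width parts or NULLs, plain concatenation could merge distinct groups, so a delimiter between the parts may be safer.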

Issue using NOT EXISTS in SQL

NOT EXISTS is not working.
I have a query which fetches 10k rows. There are 237 rows that I do not want in my final result, but when I use NOT EXISTS it still fetches the same number of rows, that is 10k. I have used the following query:
Select bu_name,
person_num,
name,
f_config_id,
ass_the
from x_asig_table
where not exists (select 1
from
(select XXH.x_asig_table.*,
count(*) over (partition by bu_name, person_num, name) as c
from XXH.x_asig_table) t
where c > 1);
The sub-query is not correlated with the main query, i.e. it doesn't matter what row you look at in the main query, the subquery will always give you the same result. So either you get all rows or none. It is not possible with this query to get some rows and others not.
Add criteria to your subquery that relates it to the main query to solve the problem.
You need to connect back to the outer query. Something like this (also simplified your query, untested, but should work):
Select bu_name,
person_num,
name,
f_config_id,
ass_the
from x_asig_table X
where not exists (
SELECT NULL
FROM x_asig_table Y
WHERE X.bu_name = Y.bu_name
AND X.person_num = Y.person_num
AND X.name = Y.name
GROUP BY Y.bu_name, Y.person_num, Y.name
HAVING COUNT(1) > 1
)
You appear to be trying to find only those rows where there is a single row per combination of bu_name, person_num and name (although the question is rather unclear about your intent). If so, then you can do it without using EXISTS like this:
SELECT bu_name,
person_num,
name,
f_config_id,
ass_the
FROM (
SELECT bu_name,
person_num,
name,
f_config_id,
ass_the,
COUNT(1) OVER ( PARTITION BY bu_name, person_num, name ) AS cnt
FROM x_asig_table
)
WHERE cnt = 1;
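The difference between the uncorrelated and correlated forms can be seen on a tiny invented data set. This is only a sketch, run on SQLite via Python's sqlite3 module: the rows are hypothetical, the column types are guessed, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical rows: (BU1, 1, Ann) is duplicated, the other two keys are unique.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x_asig_table (
        bu_name TEXT, person_num INTEGER, name TEXT, f_config_id TEXT, ass_the TEXT);
    INSERT INTO x_asig_table VALUES
        ('BU1', 1, 'Ann', 'c1', 'a1'),
        ('BU1', 1, 'Ann', 'c2', 'a2'),
        ('BU1', 2, 'Bob', 'c3', 'a3'),
        ('BU2', 3, 'Cid', 'c4', 'a4');
""")

# Uncorrelated: the subquery is the same for every outer row, so the filter
# is all-or-nothing. A duplicate exists somewhere, so no row passes.
uncorrelated = conn.execute("""
    SELECT * FROM x_asig_table
    WHERE NOT EXISTS (
        SELECT 1
        FROM (SELECT COUNT(*) OVER (PARTITION BY bu_name, person_num, name) AS c
              FROM x_asig_table) t
        WHERE c > 1)
""").fetchall()

# Correlated: the subquery only looks at rows sharing the outer row's key.
correlated = conn.execute("""
    SELECT x.* FROM x_asig_table x
    WHERE NOT EXISTS (
        SELECT NULL
        FROM x_asig_table y
        WHERE y.bu_name = x.bu_name
          AND y.person_num = x.person_num
          AND y.name = x.name
        GROUP BY y.bu_name, y.person_num, y.name
        HAVING COUNT(*) > 1)
    ORDER BY x.person_num
""").fetchall()

# Window-function alternative: keep rows whose key occurs exactly once.
windowed = conn.execute("""
    SELECT bu_name, person_num, name, f_config_id, ass_the
    FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY bu_name, person_num, name) AS cnt
          FROM x_asig_table t)
    WHERE cnt = 1
    ORDER BY person_num
""").fetchall()

print(uncorrelated, correlated, windowed)
```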

adding count( ) column on each row

I'm not sure if this is even a good question or not.
I have a complex query with lots of unions that searches multiple tables for a certain keyword (user input). All the tables that are searched are related to the table book.
The result set is paged using LIMIT, so a maximum of 10 results is returned at a time.
I want an extra column in the resultset displaying the total amount of results found however. I do not want to do this using a separate query. Is it possible to add a count() column to the resultset that counts every result found?
the output would look like this:
ID Title Author Count(...)
1 book_1 auth_1 23
2 book_2 auth_2 23
4 book_4 auth_.. 23
...
Thanks!
This won't add the count to each row, but one way to get the total count without running a second query is to run your first query using the SQL_CALC_FOUND_ROWS option and then select FOUND_ROWS(). This is sometimes useful if you want to know how many total results there are so you can calculate the page count.
Example:
select SQL_CALC_FOUND_ROWS ID, Title, Author
from yourtable
limit 0, 10;
SELECT FOUND_ROWS();
From the manual:
http://dev.mysql.com/doc/refman/5.1/en/information-functions.html#function_found-rows
The usual way of counting in a query is to group on the fields that are returned:
select ID, Title, Author, count(*) as Cnt
from ...
group by ID, Title, Author
order by Title
limit 0, 10
The Cnt column will contain the number of records in each group, i.e. for each title.
Regarding second query:
select tbl.id, tbl.title, tbl.author, x.cnt
from tbl
cross join (select count(*) as cnt from tbl) as x
If you will not join to other table(s):
select tbl.id, tbl.title, tbl.author, x.cnt
from tbl, (select count(*) as cnt from tbl) as x
My Solution:
SELECT COUNT(1) over(partition BY text) totalRecordNumber
FROM (SELECT 'a' text, id_consult_req
FROM consult_req cr);
If your problem is simply the speed or cost of running a second (complex) query, I would suggest selecting the result set into a hash table and counting the rows from there as you return them. Even more efficiently, use the row count of the previous result set, so you do not have to recount at all.
This will add the total count on each row:
select count(*) over (order by (select 1)) as Cnt,*
from yourtable
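Where window functions are available, the total is computed before LIMIT is applied, so each row of a page can carry the full count. The sketch below checks this on SQLite (3.25 or later) via Python's sqlite3 module and compares it with the cross-join variant. The book table and its 23 rows are invented to match the example output in the question.

```python
import sqlite3

# Hypothetical table with 23 books, paged 3 rows at a time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(i, f"book_{i}", f"auth_{i}") for i in range(1, 24)],
)

# COUNT(*) OVER () counts all rows of the result set; LIMIT trims afterwards.
page_window = conn.execute("""
    SELECT id, title, author, COUNT(*) OVER () AS total
    FROM book
    ORDER BY id
    LIMIT 3 OFFSET 0
""").fetchall()

# Cross join with a one-row aggregate: same output, no window functions needed.
page_cross = conn.execute("""
    SELECT b.id, b.title, b.author, x.cnt
    FROM book b
    CROSS JOIN (SELECT COUNT(*) AS cnt FROM book) x
    ORDER BY b.id
    LIMIT 3 OFFSET 0
""").fetchall()

print(page_window)
```

With a complex search query, the cross-join form has to repeat the search conditions inside the aggregate, whereas the window form reuses the same result set.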
Here is your answer:
SELECT *, @cnt count_rows FROM (
SELECT *, (@cnt := @cnt + 1) row_number FROM your_table
CROSS JOIN (SELECT @cnt := 0 AS variable) t
) t;
You simply cannot do this, you'll have to use a second query.