SQL count show 0 if no rows including where condition - sql

I'm doing a COUNT but cannot get the value 0 when there are no rows in the result.
If I remove the where condition:
AND documentstats.OPENINGDATE >= '2021-01-01T00:00:00.000'
it works fine and I get the value 0 when there are no rows in the result.
I am looking for an option to return value 0 in the NumberOfViews column when no rows are found in my count.
Can anyone help me?
SELECT
customertodocument.DocId,
COUNT (documentstats.DocId) AS NumberOfViews
FROM
customertodocument
LEFT JOIN
documentstats ON customertodocument.DocId = documentstats.DocId
AND customertodocument.customerId = documentstats.customerId
WHERE
customertodocument.customerId = '1111'
AND documentstats.openingdate >= '2021-01-01T00:00:00.000'
GROUP BY
customertodocument.DocId
ORDER BY
NumberOfViews ASC

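The second condition in the WHERE clause is filtering out all non-matches: for a customertodocument row with no match, documentstats.openingdate is NULL, so the comparison fails and the row is dropped. Because you have an explicit GROUP BY, the query returns no rows when no rows survive the WHERE clause.
If you want counts of 0, then move the condition to the ON clause of the LEFT JOIN. Note: in a LEFT JOIN, conditions on the second table go in the ON clause, while conditions on the first table stay in the WHERE clause.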
The query should look like:
SELECT cd.DocId, COUNT(ds.DocId) AS NumberOfViews
FROM customertodocument cd LEFT JOIN
documentstats ds
ON cd.DocId = ds.DocId AND
cd.customerId = ds.customerId AND
ds.openingdate >= '2021-01-01'
WHERE cd.customerId = 1111
GROUP BY cd.DocId
ORDER BY NumberOfViews ASC;
Notes:
Table aliases make the query easier to write and to read.
customerId looks like a number. If it is, then the comparison should be to a number. If the id is really a string, put the single quotes back in.
You have a date constant. There is no need to include the time. No real harm, except it clutters the query.
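To see the difference on a small scale, here is a minimal sketch with made-up rows. Only the table and column names come from the question; the column types, the two DocIds and the dates are assumptions for illustration. Document 20 was only opened in 2020, so it is the one that should come back with a count of 0.

```sql
-- Hypothetical sample data; column types are assumed.
CREATE TABLE customertodocument (customerId INT, DocId INT);
CREATE TABLE documentstats (customerId INT, DocId INT, openingdate DATE);

INSERT INTO customertodocument VALUES (1111, 10);
INSERT INTO customertodocument VALUES (1111, 20);
INSERT INTO documentstats VALUES (1111, 10, '2021-03-01');
INSERT INTO documentstats VALUES (1111, 10, '2021-04-01');
INSERT INTO documentstats VALUES (1111, 20, '2020-06-01');

-- Date filter in WHERE: the row for DocId 20 is removed before grouping.
SELECT cd.DocId, COUNT(ds.DocId) AS NumberOfViews
FROM customertodocument cd
LEFT JOIN documentstats ds
       ON cd.DocId = ds.DocId AND cd.customerId = ds.customerId
WHERE cd.customerId = 1111
  AND ds.openingdate >= '2021-01-01'
GROUP BY cd.DocId
ORDER BY cd.DocId;
-- returns: (10, 2)

-- Date filter in ON: DocId 20 is kept with NULLs on the ds side,
-- and COUNT(ds.DocId) skips those NULLs.
SELECT cd.DocId, COUNT(ds.DocId) AS NumberOfViews
FROM customertodocument cd
LEFT JOIN documentstats ds
       ON cd.DocId = ds.DocId AND cd.customerId = ds.customerId
      AND ds.openingdate >= '2021-01-01'
WHERE cd.customerId = 1111
GROUP BY cd.DocId
ORDER BY cd.DocId;
-- returns: (10, 2), (20, 0)
```

Counting ds.DocId rather than * matters here: COUNT(*) would count the NULL-extended row and report 1 for DocId 20.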

My guess is that there are no records in the data that meet both of the conditions. You are grouping by customertodocument.DocId, but if no values of customertodocument.DocID exist after filtering with the WHERE clause, the aggregation will have nothing to group by and you'll get no results. You can test this by running the following query:
SELECT *
FROM customertodocument
LEFT JOIN documentstats on customertodocument.DocId = documentstats.DocId and customertodocument.customerId = documentstats.customerId
WHERE customertodocument.customerId = '1111' AND documentstats.openingdate >= '2021-01-01T00:00:00.000'

Your WHERE condition matches nothing, so no rows are returned. A COUNT on its own in the SELECT clause would give you 0, but because you also select a column and group by it, you get no rows at all.
What value do you expect in customertodocument.DocId if no matching record is found?
You can get a count of 0 if you remove customertodocument.DocId from the SELECT clause, keep only the COUNT, and remove the GROUP BY clause.
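The difference between an aggregate with and without GROUP BY can be checked with a small sketch. The table name documentstats_demo and its single row are hypothetical, chosen so that nothing matches customer 1111.

```sql
-- Hypothetical table; no row belongs to customer 1111.
CREATE TABLE documentstats_demo (customerId INT, DocId INT);
INSERT INTO documentstats_demo VALUES (2222, 10);

-- Aggregate without GROUP BY: always exactly one row, here with the value 0.
SELECT COUNT(*) AS NumberOfViews
FROM documentstats_demo
WHERE customerId = 1111;

-- Aggregate with GROUP BY: one row per group. No rows survive the WHERE,
-- so there are no groups and the result is empty.
SELECT DocId, COUNT(*) AS NumberOfViews
FROM documentstats_demo
WHERE customerId = 1111
GROUP BY DocId;
```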

Related

How to create an additional column in a SQL query that contains the number of rows with a column value equal to a column value from the current row?

This is what I currently have (it doesn't work):
select MOCKSTEMS.WORD_ID,
MOCKSTEMS.STEM_ID,
MOCKSTEMS.LABSTEM,
MOCKSTEMS.LABSTEMCATEGORY,
MOCKLEMMAS.LEMMAFORM,
MOCKSTEMS.LEMMA_ID,
MOCKWORDS.ORIGINALWORD,
MOCKSTEMS.CONTAINEDIN,
COUNT(*) as SAMEVALUE from MOCKSTEMS where CONTAINEDIN=STEM_ID
from MOCKSTEMS
inner join MOCKWORDS on MOCKSTEMS.WORD_ID = MOCKWORDS.WORD_ID
inner join MOCKLEMMAS on MOCKSTEMS.LEMMA_ID = MOCKLEMMAS.LEMMA_ID
Basically, I wish to create a column called 'SAMEVALUE' that shows the number of rows in this query with 'CONTAINEDIN' values equal to the 'STEM_ID' value of each row. Is this possible, and if so, how can I do it with SQL?
EDITED:
This is what I get when I run the query without the 'COUNT(*) as SAMEVALUE from MOCKSTEMS where CONTAINEDIN=STEM_ID' row:
image of a few rows returned by the query.
For example, for the row with STEM_ID='stem-003' and LABSTEM='owotan okitz', I would like the SAMEVALUE column to have value 2, because there are 2 rows with CONTAINEDIN='stem-003', as circled in this image.
It would also be fine if the SAMEVALUE column just indicates true/false (or 0/1) depending on whether there are rows with CONTAINEDIN values equal to the STEM_ID of each row.
To get an overall count alongside the query results, you need an analytic function. To count only the rows that satisfy some condition, put the condition in a CASE expression that returns a value when it is true and NULL otherwise; COUNT then ignores the NULLs.
select MOCKSTEMS.WORD_ID,
MOCKSTEMS.STEM_ID,
MOCKSTEMS.LABSTEM,
MOCKSTEMS.LABSTEMCATEGORY,
MOCKLEMMAS.LEMMAFORM,
MOCKSTEMS.LEMMA_ID,
MOCKWORDS.ORIGINALWORD,
MOCKSTEMS.CONTAINEDIN,
COUNT(
case
when CONTAINEDIN=STEM_ID
then 1
end
) over() as SAMEVALUE
/*Over is empty to consider all the result set as a single window*/
from MOCKSTEMS
inner join MOCKWORDS on MOCKSTEMS.WORD_ID = MOCKWORDS.WORD_ID
inner join MOCKLEMMAS on MOCKSTEMS.LEMMA_ID = MOCKLEMMAS.LEMMA_ID
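Note that the window version compares CONTAINEDIN with STEM_ID on the same row and puts the same total on every row. If the goal is the one described in the question (for each row, the number of rows whose CONTAINEDIN equals that row's STEM_ID), one possible alternative is a correlated subquery. This is only a sketch: MOCKSTEMS_DEMO is a hypothetical two-column stand-in for MOCKSTEMS, and apart from 'stem-003' the values are invented.

```sql
-- Hypothetical reduced table; only the two relevant columns are modeled.
CREATE TABLE MOCKSTEMS_DEMO (STEM_ID VARCHAR(20), CONTAINEDIN VARCHAR(20));
INSERT INTO MOCKSTEMS_DEMO VALUES ('stem-001', 'stem-003');
INSERT INTO MOCKSTEMS_DEMO VALUES ('stem-002', 'stem-003');
INSERT INTO MOCKSTEMS_DEMO VALUES ('stem-003', NULL);

-- For each row s, count the rows c that point at s through CONTAINEDIN.
SELECT s.STEM_ID,
       s.CONTAINEDIN,
       (SELECT COUNT(*)
        FROM MOCKSTEMS_DEMO c
        WHERE c.CONTAINEDIN = s.STEM_ID) AS SAMEVALUE
FROM MOCKSTEMS_DEMO s
ORDER BY s.STEM_ID;
-- stem-001 | stem-003 | 0
-- stem-002 | stem-003 | 0
-- stem-003 | NULL     | 2
```

In the full query the subquery would sit in the select list next to the other columns. It counts rows of the stems table itself, not rows of the joined result, so the numbers could differ if the joins drop or repeat stems.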

PostgreSQL where clause not pushed down when using grouping sets

SELECT *
FROM (
SELECT SUM(quantity) AS quantity,
product_location_id,
location_bin_id,
product_lot_id,
product_serial_id,
CASE
WHEN GROUPING (product_location_id, location_bin_id, product_lot_id, product_serial_id) = 0 AND product_serial_id IS NOT NULL THEN
'Serial'
WHEN GROUPING (product_location_id, location_bin_id, product_lot_id, product_serial_id) = 0 THEN
'Lot'
ELSE
'Quantity'
END AS pick_by
FROM product_location_bins
WHERE status != 'Void'
AND has_quantity = 'Yes'
GROUP BY GROUPING SETS (
(product_location_id, location_bin_id, product_lot_id, product_serial_id),
(product_location_id, location_bin_id)
)
HAVING SUM(quantity) > 0
) x
WHERE x.product_serial_id = 5643
I have the above query. With a normal GROUP BY, Postgres is able to "push down" the outer WHERE clause and use the index on product_serial_id. With grouping sets it is unable to do so: it resolves the entire inner query and then filters the results. I'm wondering why this is. Is it a limitation of grouping sets?
Your query is odd. Your outer where clause eliminates the second set of results from grouping sets, because product_serial_id would be NULL for the second set. This gets filtered out in the outer where.
I think you want something like this for the outer query:
WHERE x.product_serial_id = 5643 OR x.product_serial_id IS NULL
I suppose that Postgres could add optimizations for poorly written code, that is, skip the work for the second grouping set because it is filtered out by the outer WHERE. However, that is not usually the focus of optimizations.
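The point about the second grouping set can be illustrated with a reduced sketch. The table bins_demo and its rows are hypothetical, with only three of the original columns, and the syntax assumes PostgreSQL as in the question.

```sql
-- Hypothetical, reduced version of product_location_bins.
CREATE TABLE bins_demo (location_bin_id INT, product_serial_id INT, quantity INT);
INSERT INTO bins_demo VALUES (1, 5643, 1);
INSERT INTO bins_demo VALUES (1, 5644, 1);

-- The coarser grouping set reports NULL for product_serial_id.
SELECT location_bin_id, product_serial_id, SUM(quantity) AS quantity
FROM bins_demo
GROUP BY GROUPING SETS (
    (location_bin_id, product_serial_id),
    (location_bin_id)
);
-- Rows produced (order may vary):
--   1 | 5643 | 1
--   1 | 5644 | 1
--   1 | NULL | 2

-- An outer equality filter can never match the NULL row,
-- so only the fine-grained row for serial 5643 survives.
SELECT *
FROM (
    SELECT location_bin_id, product_serial_id, SUM(quantity) AS quantity
    FROM bins_demo
    GROUP BY GROUPING SETS (
        (location_bin_id, product_serial_id),
        (location_bin_id)
    )
) x
WHERE x.product_serial_id = 5643;
-- returns: 1 | 5643 | 1
```

If the per-bin total is not needed, filtering on product_serial_id inside a plain GROUP BY query would give the same single row, and that is the form the planner can already optimize.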

SQL SELECT returns same item more than one time

I have the following SQL Command:
SELECT *
FROM Notes
INNER JOIN AuthorizedPersons
ON Notes.idPass = AuthorizedPersons.idPass
AND AuthorizedPersons.Privileged = 0
AND Notes.idUser =7
This returns the correct items! BUT returns the same item twice for each AuthorizedPerson that exists!
(Using DISTINCT does not solve the problem because items can have the same name.)
Query Results:
As you can see in the idPass 15 and 16 the description can be the same BUT idPass cannot since it's the primary key!
The query returns 3 times the idPass 30...
Try to use Where instead of the first AND.
SELECT *
FROM Notes
INNER JOIN AuthorizedPersons
ON Notes.idPass = AuthorizedPersons.idPass
WHERE AuthorizedPersons.Privileged = 0
AND Notes.idUser =7
In the AuthorizedPersons table, the column starting with 'IdUs..' repeats multiple times for the same idPass. That is why you are getting multiple rows for the same idPass value. To avoid the duplicate records, you can either use the DISTINCT keyword after excluding that particular column, or keep just one of the duplicated records and eliminate the others.
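For an inner join, moving a condition from ON to WHERE does not change which rows come back, so the repeats remain as long as several AuthorizedPersons rows share one idPass. If only the Notes columns are needed, one alternative that is not taken from the answers above is an EXISTS test. The sketch below uses assumed column types and a hypothetical personId column; only idPass, idUser and Privileged are named in the question.

```sql
-- Hypothetical sample data: three authorized persons for the same note.
CREATE TABLE Notes (idPass INT PRIMARY KEY, idUser INT, description VARCHAR(50));
CREATE TABLE AuthorizedPersons (idPass INT, personId INT, Privileged INT);

INSERT INTO Notes VALUES (30, 7, 'note A');
INSERT INTO AuthorizedPersons VALUES (30, 1, 0);
INSERT INTO AuthorizedPersons VALUES (30, 2, 0);
INSERT INTO AuthorizedPersons VALUES (30, 3, 0);

-- Inner join: one output row per matching AuthorizedPersons row,
-- so idPass 30 appears three times.
SELECT n.idPass, n.description
FROM Notes n
INNER JOIN AuthorizedPersons ap ON n.idPass = ap.idPass
WHERE ap.Privileged = 0
  AND n.idUser = 7;

-- EXISTS: each note is returned at most once, however many persons match.
SELECT n.idPass, n.description
FROM Notes n
WHERE n.idUser = 7
  AND EXISTS (SELECT 1
              FROM AuthorizedPersons ap
              WHERE ap.idPass = n.idPass
                AND ap.Privileged = 0);
```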

Why are these queries not equivalent? (correlated subquery vs. group by)

Why are these two SQL queries not equivalent? One uses a correlated subquery, the other uses group by. The first produces a little over 51000 rows from my database, the second nearly 66000. In both cases, I am simply trying to return all the parts meeting the stated condition, current revision only. A comparison of the output files shows that method #1 (oracle_test1.txt) fails to return quite a few values. Based on that, I can only assume that method #2 is correct. I have some code that has used method #1 for a long time, but it appears I will have to change it. My reasoning concerning the correlated subquery was that as the inner select is comparing the columns in the self join, it will find the max value for the prev value for all matches; then return that max prev value for use in the outer query. I designed that query long ago before becoming familiar with the use of group by. Any insights would be appreciated.
Query #1
select pobj_name, prev
from pfmc_part
where pmodel in ('PN-DWG', 'NO-DWG') and pstatus = 'RELEASED'
and prev = (select max(prev) from pfmc_part a where a.pobj_name = pfmc_part.pobj_name)
order by pobj_name, prev
Query #2
select pobj_name, max(prev) prev
from pfmc_part
where pmodel in ('PN-DWG', 'NO-DWG') and pstatus = 'RELEASED'
group by pobj_name
order by pobj_name, prev
Sample output:
Query #2     Query #1
P538512 B    P538512 B
P538513 A    P538513 A
P538514 C    P538514 C
P538520 B
P538522 B    P538522 B
P538525 A    P538525 A
P538531 C    P538531 C
P538533 A    P538533 A
P538538 B
P538541 B
P538542 B
P538553 A    P538553 A
P538569 A    P538569 A
Query 1 finds the max prev for each pobj_name across all rows, and then keeps only the rows with that prev that also satisfy your where clause.
Query 2, by contrast, first selects all rows that satisfy your where clause and then takes the max prev of those.
You may have rows that satisfy the where clause but do not hold the overall max prev for their pobj_name. Query 2 returns them, and Query 1 omits them.
There are two differences and the rest of the answers focus on one. The "easy" difference is that the max() in the group by is affected by the filter clause. The max() in the other query has no filter, and so it might return no rows (when max(prev) is on a row otherwise filtered out by the where conditions).
In addition, the where version of the query might return duplicate rows when there are multiple rows with the same value of max(prev) for a given pobj_name. The group by will never return duplicate rows.
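The first difference can be reproduced with a few made-up rows. In this sketch the table pfmc_part_demo, the column types and the 'WORKING' status are assumptions; the part numbers are borrowed from the sample output only for readability.

```sql
-- Hypothetical data: P538520 has a newer revision C that is not RELEASED.
CREATE TABLE pfmc_part_demo (
    pobj_name VARCHAR(20),
    prev      VARCHAR(5),
    pmodel    VARCHAR(10),
    pstatus   VARCHAR(10)
);
INSERT INTO pfmc_part_demo VALUES ('P538512', 'B', 'PN-DWG', 'RELEASED');
INSERT INTO pfmc_part_demo VALUES ('P538520', 'B', 'PN-DWG', 'RELEASED');
INSERT INTO pfmc_part_demo VALUES ('P538520', 'C', 'PN-DWG', 'WORKING');

-- Query #1 shape: max(prev) is computed over ALL rows of the part (C for P538520),
-- and the row holding C fails the status filter, so the part disappears.
SELECT pobj_name, prev
FROM pfmc_part_demo
WHERE pmodel IN ('PN-DWG', 'NO-DWG') AND pstatus = 'RELEASED'
  AND prev = (SELECT MAX(a.prev)
              FROM pfmc_part_demo a
              WHERE a.pobj_name = pfmc_part_demo.pobj_name)
ORDER BY pobj_name;
-- returns: (P538512, B)

-- Query #2 shape: max(prev) is computed over the filtered rows only.
SELECT pobj_name, MAX(prev) AS prev
FROM pfmc_part_demo
WHERE pmodel IN ('PN-DWG', 'NO-DWG') AND pstatus = 'RELEASED'
GROUP BY pobj_name
ORDER BY pobj_name;
-- returns: (P538512, B), (P538520, B)
```

Repeating the pmodel and pstatus conditions inside the subquery would make the first form look only at filtered rows as well, although it could still return duplicate rows when several rows tie on max(prev).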
this query
select pobj_name, prev
from pfmc_part
where pmodel in ('PN-DWG', 'NO-DWG') and pstatus = 'RELEASED'
and prev = (select max(prev) from pfmc_part a where a.pobj_name = pfmc_part.pobj_name)
order by pobj_name, prev
has a condition in its where clause that makes it return fewer rows: specifically, only rows where prev = (subquery). That condition on prev makes it entirely different from the group by query, which instead returns max(prev) under the alias prev in its select list.
If you wanted them to be more similar, you'd need to modify it like so:
select p.pobj_name, p.prev, maxes.max_prev
from pfmc_part p
join (select pobj_name, max(prev) as max_prev from pfmc_part group by pobj_name) maxes
on maxes.pobj_name = p.pobj_name
where p.pmodel in ('PN-DWG', 'NO-DWG') and p.pstatus = 'RELEASED'
order by p.pobj_name, p.prev
In query 1 you are ONLY selecting the rows whose prev field is equal to the max(prev) and in query 2 you are selecting all records ALONG WITH max(prev) that's meeting the conditions in the where and group by clause.
Basically, query 1 and query 2 have completely different where clauses. Hope this explains the missing records from query 1.
Your query #1 will certainly fail to return a row for a given pobj_name where maximum prev for that name does not correspond to a revision currently in the database. That could perhaps happen if a revision was skipped or if its row was deleted.
Your Query #2 does not suffer Query #1's limitation, and it may perform better on account of avoiding a correlated subquery. It would be inappropriate, however, if you wanted more data than just pobj_name and aggregate functions of the groups. And by the way, there's no point in including prev in the ORDER BY clause, since pobj_name will already be unique to each result row.
Overall, if the two queries happen to return similar results then that is a matter of the details of the data, not of the queries. They arrive at their results completely differently.

How to display filtered rows in aggregation

Given the following query:
SELECT dbo.ClientSub.chk_Name as Details, COUNT(*) AS Counts
FROM dbo.ClientMain
INNER JOIN
dbo.ClientSub
ON dbo.ClientMain.ate_Id = dbo.ClientSub.ate_Id
WHERE chk_Status=1
GROUP BY dbo.ClientSub.chk_Name
I want to display the rows in the aggregation even if they are filtered out by the WHERE clause.
Aggregate functions ignore NULL values, and COUNT returns zero when a group has no rows to count.
For your purpose you could use GROUP BY ALL (a SQL Server feature), which also returns the groups whose rows were all filtered out by the WHERE clause, with zero as the count.
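GROUP BY ALL is specific to SQL Server, and its documentation describes the syntax as provided for backward compatibility only. A portable way to get a similar result is to move the condition out of the WHERE clause and into a CASE expression inside COUNT. The sketch below is a simplification under stated assumptions: ClientSub_demo is a hypothetical single table, the join to ClientMain is left out, and chk_Status is assumed to live in the same table as chk_Name.

```sql
-- Hypothetical sample data: 'beta' has no row with chk_Status = 1.
CREATE TABLE ClientSub_demo (ate_Id INT, chk_Name VARCHAR(20), chk_Status INT);
INSERT INTO ClientSub_demo VALUES (1, 'alpha', 1);
INSERT INTO ClientSub_demo VALUES (2, 'alpha', 1);
INSERT INTO ClientSub_demo VALUES (3, 'beta', 0);

-- No WHERE clause, so every chk_Name forms a group.
-- The CASE yields NULL for non-matching rows, and COUNT skips NULLs.
SELECT chk_Name AS Details,
       COUNT(CASE WHEN chk_Status = 1 THEN 1 END) AS Counts
FROM ClientSub_demo
GROUP BY chk_Name
ORDER BY chk_Name;
-- alpha | 2
-- beta  | 0
```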
In case you use Oracle SQL:
Have you tried Select nvl(dbo.ClientSub.chk_name,'0') .... ?
I hope you are using Oracle. You can directly use count(chk_name). Passing a column name to count makes it ignore null values. Your query above returns the total number of records for each chk_name group. When you use count(chk_name), it counts only the records where chk_name is not null.
I hope that answers your question.
Thanks,
Aditya
Do you mean this? Which returns count 0 if chk_Name is NULL.
SELECT
dbo.ClientSub.chk_Name as Details,
SUM(CASE WHEN ISNULL(dbo.ClientSub.chk_Name, '')<>'' then 1 else 0 end) AS Counts
FROM
dbo.ClientMain INNER JOIN dbo.ClientSub ON dbo.ClientMain.ate_Id = dbo.ClientSub.ate_Id
where chk_Status=1
group by dbo.ClientSub.chk_Name