subquery Count() - Column must appear in the GROUP BY clause - sql

I wanted to find out why a subquery returned more than one value, so I wrote this query:
SELECT id,
(SELECT Count(tags[i])
FROM generate_subscripts(tags, 1) AS i
WHERE tags[i]='oneway') as oneway_string
FROM planet_osm_ways
WHERE 'oneway' = ANY(tags)
HAVING
(SELECT Count(tags[i])
FROM generate_subscripts(tags, 1) AS i
WHERE tags[i]='oneway') > 1
which should find all occurrences of 'oneway' in the tags array and count them. Instead it fails with:
[42803] ERROR: column "planet_osm_ways.id" must appear in the GROUP BY clause
or be used in an aggregate function Position: 8

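Change HAVING to WHERE. In PostgreSQL, a HAVING clause turns the query into a grouped query even without GROUP BY, so all rows form a single group and a bare column such as id is no longer allowed in the select list; that is what the error reports. There are no groups to filter here: you want a WHERE filter, which is applied to each row.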
SELECT id,
(SELECT Count(tags[i])
FROM generate_subscripts(tags, 1) AS i
WHERE tags[i]='oneway') as oneway_string
FROM planet_osm_ways
WHERE 'oneway' = ANY(tags)
AND
(SELECT Count(tags[i])
FROM generate_subscripts(tags, 1) AS i
WHERE tags[i]='oneway') > 1
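To see the per-row filter in action, here is a small runnable sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL, and because SQLite has no array type, the tags array is modelled as a hypothetical child table way_tags (one row per array element). The table names and sample data are assumptions, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for planet_osm_ways: one way_tags row per array element.
CREATE TABLE ways (id INTEGER PRIMARY KEY);
CREATE TABLE way_tags (way_id INTEGER, pos INTEGER, tag TEXT);
INSERT INTO ways VALUES (1), (2), (3);
INSERT INTO way_tags VALUES
  (1, 1, 'oneway'), (1, 2, 'yes'),
  (2, 1, 'oneway'), (2, 2, 'yes'), (2, 3, 'oneway'), (2, 4, 'no'),
  (3, 1, 'highway'), (3, 2, 'residential');
""")

# The correlated COUNT subquery is evaluated once per row of ways,
# so it belongs in WHERE (a row filter), not in HAVING (a group filter).
rows = conn.execute("""
SELECT w.id,
       (SELECT COUNT(*) FROM way_tags t
         WHERE t.way_id = w.id AND t.tag = 'oneway') AS oneway_count
FROM ways w
WHERE (SELECT COUNT(*) FROM way_tags t
        WHERE t.way_id = w.id AND t.tag = 'oneway') > 1
ORDER BY w.id
""").fetchall()

print(rows)
```

Only way 2 carries the tag more than once, so it is the only row that survives the filter.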


How to return instances where all members of GROUP BY equal a certain value?

I have the following table structure:
Essentially, I need to return the rows where ALL values of that grouping are FALSE for LeadIndication. I am using GROUP BY on the Parent column but am having trouble getting the instances where ALL records in the group are FALSE. Below is my attempt, which returns all groupings by "Parent"; I am having trouble layering in the additional logic.
SELECT [AssetID], [InvestmentID] FROM [rdInvestments] GROUP BY [AssetID],[InvestmentID]
As you can see from the yellow highlighted portion of my screenshot, I only want to return those rows, since ALL members of the group are FALSE for LeadIndication.
Using conditional aggregation:
SELECT AssetID
FROM rdInvestments
GROUP BY AssetID
HAVING SUM(IIF(LeadIndication <> 'FALSE', 1, 0)) = 0;
Another way:
SELECT AssetID
FROM rdInvestments
GROUP BY AssetID
HAVING SUM(IIF(LeadIndication = 'FALSE', 1, 0)) = COUNT(*);
Try this using a subquery:
SELECT DISTINCT
[AssetID]
FROM
[rdInvestments]
WHERE
[AssetID] Not In
(SELECT T.[AssetID]
FROM [rdInvestments] As T
WHERE T.LeadIndication = 'TRUE')
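A quick way to convince yourself that the conditional-aggregation and NOT IN approaches agree is to run both on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in engine, writes the condition with CASE instead of IIF for portability, and invents the sample rows; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rdInvestments (AssetID TEXT, InvestmentID INTEGER, LeadIndication TEXT);
-- Made-up rows: only asset A has FALSE on every investment.
INSERT INTO rdInvestments VALUES
  ('A', 1, 'FALSE'), ('A', 2, 'FALSE'),
  ('B', 3, 'FALSE'), ('B', 4, 'TRUE'),
  ('C', 5, 'TRUE');
""")

# Conditional aggregation: count the non-FALSE rows per group and require zero.
all_false = conn.execute("""
SELECT AssetID
FROM rdInvestments
GROUP BY AssetID
HAVING SUM(CASE WHEN LeadIndication <> 'FALSE' THEN 1 ELSE 0 END) = 0
ORDER BY AssetID
""").fetchall()

# Subquery variant: exclude every asset that has at least one TRUE row.
not_in = conn.execute("""
SELECT DISTINCT AssetID
FROM rdInvestments
WHERE AssetID NOT IN (SELECT T.AssetID
                      FROM rdInvestments AS T
                      WHERE T.LeadIndication = 'TRUE')
ORDER BY AssetID
""").fetchall()

print(all_false, not_in)
```

Both queries return only asset A on this sample.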

GROUPING multiple LIKE string

Data:
2015478 warning occurred at 20201403021545
2020179 error occurred at 20201303021545
2025480 timeout occurred at 20201203021545
2025481 timeout occurred at 20201103021545
2020482 error occurred at 20201473021545
2020157 timeout occurred at 20201403781545
2020154 warning occurred at 20201407851545
2027845 warning occurred at 20201403458745
In the above data, there are 3 kinds of strings I am interested in: warning, error and timeout.
Can we have a single query that groups by string and gives the count of occurrences, as below?
Output:
timeout 3
warning 3
error 2
I know I can write separate queries to find each count individually, but I am interested in a single query.
Thanks
You can use filtered aggregation for that:
select count(*) filter (where the_column like '%timeout%') as timeout_count,
count(*) filter (where the_column like '%error%') as error_count,
count(*) filter (where the_column like '%warning%') as warning_count
from the_table;
This returns the counts in three columns rather than three rows as you indicated.
If you do need this in separate rows, you can use regexp_replace() to clean up the string, then group by that:
select regexp_replace(the_column, '(.*)(warning|error|timeout)(.*)', '\2') as what,
count(*)
from the_table
group by what;
Please use the query below, which avoids hard-coding the values by locating the spaces with position() (equivalent to STRPOS):
select val, count(1) from
(select substring(column_name ,position(' ' in (column_name))+1,
length(column_name) - position(reverse(' ') in reverse(column_name)) -
position(' ' in (column_name))) as val from matching) qry
group by val; -- Provide the proper column name
If you want this on separate rows you can also use a lateral join:
select which, count(*)
from t cross join lateral
(values (case when col like '%error%' then 'error' end),
(case when col like '%warning%' then 'warning' end),
(case when col like '%timeout%' then 'timeout' end)
) v(which)
where which is not null
group by which;
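The same one-row-per-keyword idea can be tried without LATERAL by joining the data to a small keyword list. The sketch below is a variation on the query above, not the answer's exact technique: it runs on SQLite through Python's sqlite3 module, uses the sample data from the question, and the table name t and column name col are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [
    ("2015478 warning occurred at 20201403021545",),
    ("2020179 error occurred at 20201303021545",),
    ("2025480 timeout occurred at 20201203021545",),
    ("2025481 timeout occurred at 20201103021545",),
    ("2020482 error occurred at 20201473021545",),
    ("2020157 timeout occurred at 20201403781545",),
    ("2020154 warning occurred at 20201407851545",),
    ("2027845 warning occurred at 20201403458745",),
])

# Join each row to every keyword it contains, then count per keyword.
counts = conn.execute("""
SELECT k.word, COUNT(*) AS n
FROM t
JOIN (SELECT 'warning' AS word
      UNION ALL SELECT 'error'
      UNION ALL SELECT 'timeout') AS k
  ON t.col LIKE '%' || k.word || '%'
GROUP BY k.word
ORDER BY n DESC, k.word
""").fetchall()

print(counts)
```

On the question's data this gives timeout 3, warning 3, error 2, matching the requested output.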
On the other hand, if you simply want the second word -- but don't want to hardcode the values -- then you can use:
select split_part(col, ' ', 2) as which, count(*)
from t
group by which;

Reference an ALIAS in a SUM function - SQL Server

Background:
I am trying to calculate the profit margin inside of the query and I am running into errors. When I try to use the select statement in the SUM function, I trigger an error:
Cannot perform an aggregate function on an expression containing an
aggregate or a subquery.
I understand that this is caused by having a SELECT query inside of the SUM function. From there, I tried to reference the alias of the COGS column. I receive an error when I do that as well:
Invalid column name 'COGS'.
After experimenting with the query some more, I figured it might be due to the fact that I'm doing all of this inside of a SUM function, so I removed that and ran the query. It returned a few errors:
Column 'tbl_invoice.subTotal' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Column 'tbl_invoice.tradeinAmount' is invalid in the select list because it is not contained in either an aggregate function or the
GROUP BY clause.
Column 'tbl_invoice.subTotal' is invalid in the select list because it is not contained in either an aggregate function or the
GROUP BY clause.
Column 'tbl_invoice.tradeinAmount' is invalid in the select list because it is not contained in either an aggregate function or the
GROUP BY clause.
Column 'tbl_invoice.subTotal' is invalid in the select list because it is not contained in either an aggregate function or the
GROUP BY clause.
Column 'tbl_invoice.tradeinAmount' is invalid in the select list because it is not contained in either an aggregate function or the
GROUP BY clause.
Is there another way to use or reference the value I need in the SUM function?
Query:
--Main query
SELECT
custID,
COUNT(custID) AS InvoiceNum,
--This is the column that has an alias
(SELECT cogs FROM #tempMarketing where #tempMarketing.custID = tbl_invoice.custID) as COGS,
--This is where I am trying to calculate the profit margin
SUM(((((subTotal + (-1 * tradeinAmount) - (SELECT cogs FROM #tempMarketing where #tempMarketing.custID = tbl_invoice.custID)))
/ (NULLIF(subTotal + (-1 * tradeinAmount),0))) *100)) as Profitmargin,
FROM tbl_invoice
group by custID
order by InvoiceNum desc;
-- Join #tempMarketing once instead of repeating the correlated subquery
SELECT
a.custID,
COUNT(a.custID) AS InvoiceNum,
case when a.custid=b.custid then b.cogs else 0 end as COGS,
SUM(((((subTotal + (-1 * tradeinAmount) - case when a.custid=b.custid then b.cogs else 0 end))
/ (NULLIF(subTotal + (-1 * tradeinAmount),0))) *100)) as Profitmargin
FROM tbl_invoice a
left join #tempMarketing b on a.custID =b.custid
-- the GROUP BY expression must match the one in the select list exactly
group by a.custID,
case when a.custid=b.custid then b.cogs else 0 end
order by InvoiceNum desc;
You can try the following query, which uses a common table expression for the cogs column and joins it to the invoices (assuming #tempMarketing holds one row per custID):
WITH cte_base AS (
SELECT custID, cogs FROM #tempMarketing
)
SELECT
i.custID,
COUNT(i.custID) AS InvoiceNum,
--This is the column that has an alias
c.cogs AS COGS,
--This is where the profit margin is calculated
SUM(((i.subTotal + (-1 * i.tradeinAmount) - c.cogs)
/ NULLIF(i.subTotal + (-1 * i.tradeinAmount), 0)) * 100) AS Profitmargin
FROM tbl_invoice i
LEFT JOIN cte_base c ON c.custID = i.custID
--cogs is selected outside an aggregate, so it must be grouped as well
GROUP BY i.custID, c.cogs
ORDER BY InvoiceNum DESC;
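For readers who want to check the join-based shape end to end, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than SQL Server, so the temp table is an ordinary table named tempMarketing (no # prefix), the sample figures are invented, and it assumes one cogs row per customer. The margin formula is the asker's, written as subTotal - tradeinAmount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_invoice (custID INTEGER, subTotal REAL, tradeinAmount REAL);
CREATE TABLE tempMarketing (custID INTEGER, cogs REAL);  -- stand-in for #tempMarketing
INSERT INTO tbl_invoice VALUES (1, 100.0, 0.0), (1, 200.0, 0.0), (2, 80.0, 40.0);
INSERT INTO tempMarketing VALUES (1, 50.0), (2, 10.0);
""")

# Joining makes cogs an ordinary column, so it can be used inside SUM
# and listed in GROUP BY without a subquery or an alias reference.
result = conn.execute("""
SELECT i.custID,
       COUNT(i.custID) AS InvoiceNum,
       m.cogs AS COGS,
       SUM((i.subTotal - i.tradeinAmount - m.cogs)
           / NULLIF(i.subTotal - i.tradeinAmount, 0) * 100) AS Profitmargin
FROM tbl_invoice i
LEFT JOIN tempMarketing m ON m.custID = i.custID
GROUP BY i.custID, m.cogs
ORDER BY InvoiceNum DESC
""").fetchall()

print(result)
```

Customer 1 has two invoices with per-invoice margins of 50 and 75, which the asker's formula sums to 125; customer 2 has one invoice with a margin of 75.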

PostgreSQL use case when result in where clause

I use a complex CASE WHEN expression to select values. I would like to use its result in the WHERE clause, but Postgres says column 'd' does not exist.
SELECT id, name, case when complex_with_subqueries_and_multiple_when END AS d
FROM table t WHERE d IS NOT NULL
LIMIT 100 OFFSET 100;
Then I thought I can use it like this:
select * from (
SELECT id, name, case when complex_with_subqueries_and_multiple_when END AS d
FROM table t
LIMIT 100 OFFSET 100) t
WHERE d IS NOT NULL;
But now I am not getting 100 rows as a result. I could probably move LIMIT and OFFSET outside the subquery (next to the WHERE clause), but I suspect (without being sure why) that this would be a performance hit.
The CASE expression returns an array or null. What is the best/fastest way to exclude rows for which the CASE result is null? I need 100 rows (or fewer if that many do not exist, of course). I am using Postgres 9.4.
Edited:
SELECT count(*) OVER() AS count, t.id, t.size, t.price, t.location, t.user_id, p.city, t.price_type, ht.value as houses_type_value, ST_X(t.coordinates) as x, ST_Y(t.coordinates) AS y,
CASE WHEN t.classification='public' THEN
ARRAY[(SELECT i.filename FROM table_images i WHERE i.table_id=t.id ORDER BY i.weight ASC LIMIT 1), t.description]
WHEN t.classification='protected' THEN
ARRAY[(SELECT i.filename FROM table_images i WHERE i.table_id=t.id ORDER BY i.weight ASC LIMIT 1), t.description]
WHEN t.id IN (SELECT rl.table_id FROM table_private_list rl WHERE rl.owner_id=t.user_id AND rl.user_id=41026) THEN
ARRAY[(SELECT i.filename FROM table_images i WHERE i.table_id=t.id ORDER BY i.weight ASC LIMIT 1), t.description]
ELSE null
END AS main_image_description
FROM table t LEFT JOIN table_modes m ON m.id = t.mode_id
LEFT JOIN table_types y ON y.id = t.type_id
LEFT JOIN post_codes p ON p.id = t.post_code_id
LEFT JOIN table_houses_types ht on ht.id = t.houses_type_id
WHERE datetime_sold IS NULL AND datetime_deleted IS NULL AND t.published=true AND coordinates IS NOT NULL AND coordinates && ST_MakeEnvelope(17.831490030182, 44.404640972306, 12.151558389557, 47.837396630872) AND main_image_description IS NOT NULL
GROUP BY t.id, m.value, y.value, p.city, ht.value ORDER BY t.id LIMIT 100 OFFSET 0
To use the CASE WHEN result in the WHERE clause you need to wrap it up in a subquery like you did, or in a view.
SELECT * FROM (
SELECT id, name, CASE
WHEN name = 'foo' THEN true
WHEN name = 'bar' THEN false
ELSE NULL
END AS c
FROM case_in_where
) t WHERE c IS NOT NULL
With a table containing 1, 'foo', 2, 'bar', 3, 'baz' this will return records 1 & 2. I don't know how long this SQL Fiddle will persist, but here is an example: http://sqlfiddle.com/#!15/1d3b4/3 . Also see https://stackoverflow.com/a/7950920/101151
Your query returns fewer than 100 rows whenever the 100 rows starting at offset 100 include records for which d evaluates to NULL: the LIMIT is applied first and the NULL rows are filtered out afterwards. I don't know how to limit the subselect without rewriting your limiting logic (your CASE conditions) so that it works inside the WHERE clause:
WHERE ... AND (
t.classification='public' OR t.classification='protected'
OR t.id IN (SELECT rl.table_id ... rl.user_id=41026))
The way you write it will be different and it may be annoying to keep the CASE logic in sync with the WHERE limiting statements, but it would allow your limits to work only on matching data.
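The effect of where LIMIT sits relative to the filter can be seen on a toy table. The sketch below uses Python's sqlite3 module as a stand-in for Postgres; the table case_in_where matches the small example above, but the six rows and the page size of 3 are made up. It shows only row counts and says nothing about the performance question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE case_in_where (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO case_in_where VALUES
  (1, 'foo'), (2, 'baz'), (3, 'bar'), (4, 'baz'), (5, 'foo'), (6, 'bar');
""")

case_expr = "CASE WHEN name = 'foo' THEN 1 WHEN name = 'bar' THEN 0 ELSE NULL END"

# LIMIT inside the subquery: the page is cut first, then NULL rows are dropped,
# so the page can come back short.
limit_inside = conn.execute(f"""
SELECT id, c FROM (
  SELECT id, {case_expr} AS c FROM case_in_where ORDER BY id LIMIT 3
) t WHERE c IS NOT NULL
""").fetchall()

# LIMIT outside: NULL rows are dropped first, then the page is cut.
limit_outside = conn.execute(f"""
SELECT id, c FROM (
  SELECT id, {case_expr} AS c FROM case_in_where
) t WHERE c IS NOT NULL ORDER BY id LIMIT 3
""").fetchall()

print(limit_inside, limit_outside)
```

With the limit inside, the first page holds ids 1 and 3 only, because id 2 was counted toward the limit and then filtered out. With the limit outside, the page is full: ids 1, 3 and 5.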

How to use an ALIAS in a PostgreSQL ORDER BY clause?

I have the following query:
SELECT
title,
(stock_one + stock_two) AS global_stock
FROM
product
ORDER BY
global_stock = 0,
title;
Running it in PostgreSQL 8.1.23, I get this error:
Query failed: ERROR: column "global_stock" does not exist
Can anybody help me get it to work? I need the available items first, followed by the unavailable items. Many thanks!
You can always ORDER BY this way:
select
title,
( stock_one + stock_two ) as global_stock
from product
order by 2, 1
or wrap it in another SELECT:
SELECT *
from
(
select
title,
( stock_one + stock_two ) as global_stock
from product
) x
order by (case when global_stock = 0 then 1 else 0 end), title
One solution is to use the position:
select title,
( stock_one + stock_two ) as global_stock
from product
order by 2, 1
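However, the alias should work on its own, but not necessarily inside an expression: PostgreSQL only accepts an output column name in ORDER BY when it stands alone. What do you mean by "global_stock = 0"? If you want the available items first, repeat the expression instead of the alias:
select title,
( stock_one + stock_two ) as global_stock
from product
order by (case when stock_one + stock_two = 0 then 1 else 0 end), title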
In case anyone finds this when googling for whether you can just ORDER BY my_alias: Yes, you can. This cost me a couple hours.
As the postgres docs state:
The ordinal number refers to the ordinal (left-to-right) position of the output column. This feature makes it possible to define an ordering on the basis of a column that does not have a unique name. This is never absolutely necessary because it is always possible to assign a name to an output column using the AS clause.
So either this has been fixed since, or this question is specifically about the ORDER BY my_alias = 0, other_column syntax which I didn't actually need.
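As a runnable illustration of the "wrap it in another SELECT" approach, the sketch below sorts in-stock products first. It uses Python's sqlite3 module as a stand-in for PostgreSQL (SQLite is more lenient about aliases in ORDER BY, so it cannot reproduce the original error), and the product rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (title TEXT, stock_one INTEGER, stock_two INTEGER);
INSERT INTO product VALUES
  ('Bolt', 0, 0), ('Anvil', 1, 2), ('Chisel', 0, 5), ('Drill', 0, 0);
""")

# Inside the derived table global_stock is a real column name, so the outer
# ORDER BY may use it in an expression. 0 sorts before 1: in-stock rows first.
ordered = conn.execute("""
SELECT title, global_stock
FROM (SELECT title, (stock_one + stock_two) AS global_stock FROM product) x
ORDER BY CASE WHEN global_stock = 0 THEN 1 ELSE 0 END, title
""").fetchall()

print(ordered)
```

Anvil and Chisel (in stock) come first, followed by Bolt and Drill (out of stock), each group sorted by title.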