How come my count() passes evaluation? - sql

I am fairly new to SQL and I am trying to write a query that finds the countries in which exactly 3 specific languages (spanish, italian, german) are spoken and no other languages.
select country
from langusage
group by country
having count(case when language in ('spanish','german','italian') then 1 else 5 end)=3
The output is all countries that have at least 1 of the aforementioned languages. How come they pass the '=3' test?

The reason is that count(1) = count(5). count() counts the number of non-NULL values.
You intend sum():
select country
from langusage
group by country
having sum(case when language in ('spanish', 'german', 'italian') then 1 else 5 end) = 3;
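To see the difference concretely, here is a minimal sketch that runs both variants against an in-memory SQLite database from Python. The sample rows and the country name "Exampleland" are made up for illustration, and it assumes one row per (country, language) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE langusage (country TEXT, language TEXT)")
# Hypothetical sample data, one row per (country, language)
conn.executemany(
    "INSERT INTO langusage VALUES (?, ?)",
    [
        ("Switzerland", "german"), ("Switzerland", "italian"), ("Switzerland", "french"),
        ("Exampleland", "spanish"), ("Exampleland", "italian"), ("Exampleland", "german"),
        ("Spain", "spanish"),
    ],
)

QUERY = """
SELECT country
FROM langusage
GROUP BY country
HAVING {agg}(CASE WHEN language IN ('spanish', 'german', 'italian') THEN 1 ELSE 5 END) = 3
ORDER BY country
"""

def countries(agg):
    # agg is either COUNT or SUM; everything else in the query is identical
    return [row[0] for row in conn.execute(QUERY.format(agg=agg))]

count_result = countries("COUNT")  # counts non-NULL values, so 1 and 5 both count as one
sum_result = countries("SUM")      # adds the values, so any other language pushes the total past 3
print(count_result, sum_result)
```

With COUNT, every country that has exactly three rows passes, whatever the languages are. With SUM, only a country whose rows are the three listed languages reaches exactly 3.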

Related

Pivot aggregates in SQL

I'm trying to find a way to pivot the table below (I guess you would say it's in "long" format) into the ("wider") format where all the columns are essentially explicitly Boolean. I hope this simple example gets across what I'm trying to do.
Note that there are about 74 people, so the output table will have 223 columns (1 + 74 x 3).
I can't figure out an easy way to do it other than horribly with a huge number of left joins along "Town" by statements like
... left join(
select
town,
case when person = 'Richard' then 1 else 0 end as "Richard",
Fee as "Richard Fee"
from services
where person = 'Richard'
left join...
Can some smart person suggest a way to do this using PIVOT functions in SQL?
I am using Snowflake (and dbt so I can get some jinja into play if really necessary to loop through all the people).
Input:
Desired output:
ps. I know this is a ridiculous SQL ask, but this is the "output the client wants" so I have this undesirable task to fulfil.
If persons are known in advance then you could use conditional aggregation:
SELECT town,
MAX(CASE WHEN person = 'Richard' THEN 1 ELSE 0 END) AS "Richard",
MAX(CASE WHEN person = 'Richard' THEN Fee END) AS "Richard Fee",
MAX(CASE WHEN person = 'Richard' THEN Service END) AS "Richard Service",
MAX(CASE WHEN person = 'Caitlin' THEN 1 ELSE 0 END) AS "Caitlin",
...
FROM services
GROUP BY town;
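Since writing three expressions for each of about 74 people by hand is tedious, the select list can be generated in a loop. The following is a rough sketch in Python against SQLite, standing in for the jinja loop the question mentions; the sample rows, the town names, and the helpers `quote_ident` and `quote_literal` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE services (town TEXT, person TEXT, fee REAL, service TEXT)")
# Hypothetical sample rows
conn.executemany("INSERT INTO services VALUES (?, ?, ?, ?)", [
    ("Leeds", "Richard", 10.0, "Plumbing"),
    ("Leeds", "Caitlin", 20.0, "Wiring"),
    ("York", "Richard", 15.0, "Roofing"),
])

people = ["Richard", "Caitlin"]  # assumed to be known in advance

def quote_ident(name):
    # Double-quote an identifier, escaping embedded double quotes
    return '"' + name.replace('"', '""') + '"'

def quote_literal(value):
    # Single-quote a string literal, escaping embedded single quotes
    return "'" + value.replace("'", "''") + "'"

columns = []
for p in people:
    lit = quote_literal(p)
    columns.append(f"MAX(CASE WHEN person = {lit} THEN 1 ELSE 0 END) AS {quote_ident(p)}")
    columns.append(f"MAX(CASE WHEN person = {lit} THEN fee END) AS {quote_ident(p + ' Fee')}")
    columns.append(f"MAX(CASE WHEN person = {lit} THEN service END) AS {quote_ident(p + ' Service')}")

sql = "SELECT town,\n  " + ",\n  ".join(columns) + "\nFROM services\nGROUP BY town\nORDER BY town"
cur = conn.execute(sql)
header = [d[0] for d in cur.description]
rows = cur.fetchall()
print(header)
print(rows)
```

A person who has no row for a town gets 0 in the flag column and NULL in the fee and service columns, because the CASE expressions without ELSE yield NULL and MAX ignores NULLs.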

How to count when don't know specific types

I'm working on a table kind of like this:
id  type
1   A
2   A
3   B
4   C
5   C
I wanted to count the number of ids for each type, and get a table like this.
type_a  type_b  type_c
2       1       2
What I did was
SELECT
SUM(CASE WHEN type = 'A' THEN 1 ELSE 0 END) AS type_a,
SUM(CASE WHEN type = 'B' THEN 1 ELSE 0 END) AS type_b,
SUM(CASE WHEN type = 'C' THEN 1 ELSE 0 END) AS type_c
FROM myTable
My question is, if I don't know how many types there are, and can't specifically list all cases, how can I achieve it?
You are looking for "cross tabulation" or a "pivot table". I added tags.
However:
if I don't know how many types are there, and can't specifically list all cases, how can I achieve it?
Basically, that's impossible in a single SQL query because SQL demands to know the number of result columns at call time. It cannot return a dynamic number of columns on principle.
There are various workarounds with polymorphic types, or with a document type like json, jsonb, hstore or xml, or return arrays instead of individual columns ...
But to get exactly what you are asking for, an unknown number of dedicated columns, you need a two-step workflow. Like:
Build the query dynamically (determining the return type).
Execute it.
Related:
Dynamic alternative to pivot with CASE and GROUP BY
PostgreSQL Crosstab Query
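The two-step workflow can be sketched as follows. This is a minimal illustration in Python with an in-memory SQLite database rather than PostgreSQL; the helper names are hypothetical, and the data is the sample table from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, type TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "C")],
)

def quote_ident(name):
    # Double-quote an identifier, escaping embedded double quotes
    return '"' + name.replace('"', '""') + '"'

def quote_literal(value):
    # Single-quote a string literal, escaping embedded single quotes
    return "'" + value.replace("'", "''") + "'"

# Step 1: discover the types and build the query text (this fixes the result columns)
types = [row[0] for row in conn.execute("SELECT DISTINCT type FROM myTable ORDER BY type")]
select_list = ", ".join(
    f"SUM(CASE WHEN type = {quote_literal(t)} THEN 1 ELSE 0 END) AS {quote_ident('type_' + t.lower())}"
    for t in types
)
sql = f"SELECT {select_list} FROM myTable"

# Step 2: execute the generated query
cur = conn.execute(sql)
columns = [d[0] for d in cur.description]
counts = cur.fetchone()
print(columns, counts)
```

The number of result columns is still fixed when the second statement is prepared; it is only the first step that adapts to the data.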
That said, if your case is simple and you deal with a handful of known types, you can just over-provision. With a faster crosstab() query, or with simple conditional aggregation like you have it, just more elegant and efficient with the aggregate FILTER clause:
SELECT count(*) FILTER (WHERE type = 'A') AS type_a
, count(*) FILTER (WHERE type = 'B') AS type_b
, count(*) FILTER (WHERE type = 'C') AS type_c
, count(*) FILTER (WHERE type = 'D') AS type_d
-- that's all folks!
FROM tbl;
Types with no entries report 0 (count() never returns NULL) which would be correct anyway.
Does not work for unknown types, obviously.
If I understand you correctly, all you want is a simple COUNT:
SELECT
type
,COUNT(id)
FROM myTable
GROUP BY type
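If the wide layout is only needed for display, one option is to run the GROUP BY query and reshape the rows in client code. A small sketch in Python with SQLite, assuming the sample data from the question; the `type_` key prefix mirrors the column names asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, type TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "C")],
)

# One row per type; no need to know the types in advance
rows = conn.execute(
    "SELECT type, COUNT(id) FROM myTable GROUP BY type ORDER BY type"
).fetchall()

# Reshape the (type, count) rows into a single wide record on the client
counts_by_type = {f"type_{t.lower()}": n for t, n in rows}
print(counts_by_type)
```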

postgresql dynamically name columns in case statement

I'm looking for a way to dynamically or automatically name the columns in my case statement below. Scenario - I'm trying to find out how many different companies of various industries are found in each country. The countries are the rows while the categories are the columns.
I'm using PostgreSQL, so PIVOT won't work, and I don't have a new enough version where I can use cross-tab.
I want to be able to replicate this for much larger scenarios where I won't have to worry about 'hardcoding' the cat_nbr and column names like I do here.
SELECT country,
count(CASE WHEN cat_nbr = 1 THEN company_code END) retail,
count(CASE WHEN cat_nbr = 2 THEN company_code END) finance,
count(CASE WHEN cat_nbr = 3 THEN company_code END) oil,
count(CASE WHEN cat_nbr = 4 THEN company_code END) tech
FROM global_companies
GROUP BY country
the table structure format in case it isn't clear has these columns:
country - cat_nbr - company_code - cat_desc.
Cat_desc is where I have hardcoded the words 'retail', 'finance', etc
Is there some way I can do this with less hardcoding in terms of what I refer to each cat_nbr/cat_desc? There are lots and lots of cat_nbrs and cat_descs.
You cannot create a query whose rows have a dynamic number of columns. That's impossible, even with cross-tab.
You can, however:
create a query that returns a SQL statement which you can execute afterward, in the client.
create something like \crosstabview with the client.
You can read more information about this in my question, "How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?".
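The first option, a query that returns a SQL statement for the client to execute, might look roughly like this. The sketch uses Python with SQLite, where `group_concat` plays the role that `string_agg` would play in PostgreSQL. The sample rows are invented, and it assumes the `cat_desc` values contain no double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE global_companies "
    "(country TEXT, cat_nbr INTEGER, company_code TEXT, cat_desc TEXT)"
)
# Hypothetical sample rows
conn.executemany("INSERT INTO global_companies VALUES (?, ?, ?, ?)", [
    ("US", 1, "c1", "retail"), ("US", 1, "c2", "retail"), ("US", 2, "c3", "finance"),
    ("UK", 2, "c4", "finance"), ("UK", 4, "c5", "tech"),
])

# This query returns the text of another query, with one count(...) column per category.
# Assumption: cat_desc contains no double quotes, so it can be used as a quoted alias.
GENERATOR = """
SELECT 'SELECT country, '
    || group_concat(
         'count(CASE WHEN cat_nbr = ' || cat_nbr ||
         ' THEN company_code END) AS "' || cat_desc || '"',
         ', ')
    || ' FROM global_companies GROUP BY country ORDER BY country'
FROM (SELECT DISTINCT cat_nbr, cat_desc FROM global_companies)
"""
generated_sql = conn.execute(GENERATOR).fetchone()[0]

# The client then executes the generated statement
cur = conn.execute(generated_sql)
names = [d[0] for d in cur.description]
result = [dict(zip(names, row)) for row in cur]
print(generated_sql)
print(result)
```

The order of the generated columns is not guaranteed here, which is why the rows are read back by column name.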
Instead of hardcoding category names, you could hardcode country names for the columns and let the rows be dynamic as usual.
SELECT cat_nbr,
COUNT(CASE WHEN Country = 'US' THEN company_code END) AS NumUS,
COUNT(CASE WHEN Country = 'UK' THEN company_code END) AS NumUK,
COUNT(CASE WHEN Country = 'FR' THEN company_code END) AS NumFR,
...
FROM global_companies
GROUP BY cat_nbr;
Another alternative is you can aggregate the data into JSON or array structures.

SQL - Group by an aggregate function

I would like to know whether it's possible to group by an aggregate function.
Scenario:
I have a table which has biomass (kg), number of individuals for every day, and a description, so I can calculate the total average weight and total number of individuals between two dates as:
select
description,
sum(biomass)/sum(number_individuals) as av_weight,
sum(number_individuals) as individuals
from
Table
group by description
Which works okay, now, the thing is that I want to group those individuals separating them by weight ranges, in order to get something like:
description  range(kg)  number  av.weight(g)
Foo          2-3        2400    2584.48
I have tried something like
SELECT
description,
case when sum(biomass)/sum(number_individuals) >= 2000.0
and sum(biomass)*1000/sum(number_individuals) < 3000 then '2-3'
else 'nothing'
end as desc_range
FROM Table
Group by
description,
sum(biomass)/sum(number_individuals)
But it doesn't seem to work, and neither does using the alias desc_range, of course.
I am using Informix 9.40 TC3
Any help will be appreciated.
Best regards
If you want to aggregate on an aggregation, you usually need a subquery. However, you mention individuals, so perhaps this is what you want:
select description,
(case when biomass between 2 and 3 then '2-3'
else 'nothing'
end) as biomass,
sum(biomass)/sum(number_individuals) as av_weight, sum(number_individuals) as individuals
from Table
group by description,
(case when biomass between 2 and 3 then '2-3'
else 'nothing'
end);
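For the subquery route, here is a hedged sketch of grouping by a value derived from an aggregate: the inner query computes the average weight per description and day, and the outer query groups by the resulting range. It runs against SQLite from Python, and whether Informix 9.40 accepts a subquery in FROM is not checked here. The table name `samples`, the column `sample_date`, and the rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples "
    "(description TEXT, sample_date TEXT, biomass REAL, number_individuals INTEGER)"
)
# Hypothetical rows: biomass in kg, possibly several rows per day
conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", [
    ("Foo", "2024-01-01", 1500.0, 600),
    ("Foo", "2024-01-01", 1000.0, 400),
    ("Foo", "2024-01-02", 3500.0, 1000),
    ("Foo", "2024-01-03", 2700.0, 1000),
])

SQL = """
SELECT description,
       weight_range,
       SUM(individuals) AS individuals,
       SUM(biomass_kg) * 1000.0 / SUM(individuals) AS avg_weight_g
FROM (
    -- Inner level: one row per description and day, with its average weight bucketed
    SELECT description,
           sample_date,
           SUM(biomass) AS biomass_kg,
           SUM(number_individuals) AS individuals,
           CASE WHEN SUM(biomass) * 1000.0 / SUM(number_individuals) >= 2000
                 AND SUM(biomass) * 1000.0 / SUM(number_individuals) < 3000
                THEN '2-3' ELSE 'other' END AS weight_range
    FROM samples
    GROUP BY description, sample_date
) AS per_day
GROUP BY description, weight_range
ORDER BY description, weight_range
"""
rows = conn.execute(SQL).fetchall()
print(rows)
```

The outer GROUP BY can refer to `weight_range` because, from its point of view, it is an ordinary column of the derived table rather than an aggregate.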

Complex SQL query on one table

I have forgotten how to write SQL queries, as I have not used it for a long time.
I have a following requirement.
I have a table called match where I keep my competitors' details for the matches my team has played against them. Some important fields are:
match_id
competior_id
match_winner_id
ismatchtied
goals_scored_my_team
goals_scored_comp
From this table I want to get the head-to-head information for all my competitors, like this:
Competitor Matches Wins Losses Draws
A 10 5 4 1
B 8 3 2 1
I can get draw information from ismatchtied, which is set to 'Y' or 'N'.
I want to get all the info from one query. I could get it by executing separate queries and doing complex processing in my server code, but performance would take a hit.
Any help will be hugely appreciated.
cheers,
Saurav
You could use conditional aggregation, involving CASE expressions inside aggregate functions, like this:
SELECT
competitor_id,
COUNT(*) AS Matches,
COUNT(CASE WHEN goals_scored_my_team > goals_scored_comp THEN 1 END) AS Wins,
COUNT(CASE WHEN goals_scored_my_team < goals_scored_comp THEN 1 END) AS Losses,
COUNT(CASE WHEN goals_scored_my_team = goals_scored_comp THEN 1 END) AS Draws
FROM matches
GROUP BY
competitor_id
;
Every CASE above will evaluate to NULL when the condition isn't satisfied. And since COUNT(expr) omits NULLs, every COUNT(CASE ...) in the above query will effectively only count rows that match the corresponding WHEN condition.
So, the first COUNT counts only rows where my team scored more against the competitor, i.e. where my team won. In a similar way, the second and the third CASEs get the numbers of losses and draws.
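As a quick check of that behaviour, the query can be run against a few made-up rows. This is a sketch using Python with an in-memory SQLite database; the table and column names follow the query above and the scores are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches "
    "(competitor_id TEXT, goals_scored_my_team INTEGER, goals_scored_comp INTEGER)"
)
# Hypothetical results: (competitor, my goals, their goals)
conn.executemany("INSERT INTO matches VALUES (?, ?, ?)", [
    ("A", 3, 1), ("A", 0, 2), ("A", 1, 1), ("A", 2, 0),
    ("B", 0, 1), ("B", 2, 2),
])

SQL = """
SELECT competitor_id,
       COUNT(*) AS Matches,
       COUNT(CASE WHEN goals_scored_my_team > goals_scored_comp THEN 1 END) AS Wins,
       COUNT(CASE WHEN goals_scored_my_team < goals_scored_comp THEN 1 END) AS Losses,
       COUNT(CASE WHEN goals_scored_my_team = goals_scored_comp THEN 1 END) AS Draws
FROM matches
GROUP BY competitor_id
ORDER BY competitor_id
"""
# Each CASE yields NULL when its condition fails, and COUNT skips NULLs
rows = conn.execute(SQL).fetchall()
print(rows)
```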
SELECT m4.competior_id,
COUNT(*) AS TotalMatches,
(SELECT COUNT(*) FROM match m1 WHERE m1.goals_scored_my_team > m1.goals_scored_comp AND m1.competior_id = m4.competior_id) AS Wins,
(SELECT COUNT(*) FROM match m2 WHERE m2.goals_scored_comp > m2.goals_scored_my_team AND m2.competior_id = m4.competior_id) AS Losses,
(SELECT COUNT(*) FROM match m3 WHERE m3.goals_scored_my_team = m3.goals_scored_comp AND m3.competior_id = m4.competior_id) AS Draws
FROM match m4
GROUP BY m4.competior_id;