Alternative to CASE Statement to Compare and Summarize Data - sql

I have a table that looks like the following:
--------------------------------
region | price_a | price_b
--------------------------------
USA | 100 | 120
USA | 150 | 150
Canada | 300 | 300
Mexico | 20 | 25
I need to compare the values from each price column and count the matched and mismatched prices and summarize as follows:
Required Results
--------------------------------
region | price_match | price_mismatch
--------------------------------
USA | 1 | 1
Canada | 1 | 0
Mexico | 0 | 1
I can do this via multiple case statements (below) but I'm wondering if there is a better approach.
Current Code:
SELECT
region,
COUNT(CASE WHEN price_a = price_b THEN 'match' END) AS price_match,
COUNT(CASE WHEN price_a != price_b THEN 'mismatch' END) AS price_mismatch
FROM
FOO
GROUP BY region;

I am guessing from your recent questions that you're using Snowflake, in which case you can use a more compact syntax. I still think a case expression is better from a documentation and portability standpoint:
select region,
sum(iff(price_a=price_b,1,0)) price_match,
sum(iff(price_a=price_b,0,1)) price_mismatch
from cte
group by region;

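You can use sum, provided your database evaluates a comparison to 0 or 1 (MySQL and SQLite do):
select region, sum(price_a = price_b) as price_match, sum(price_a != price_b) as price_mismatch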
from foo
group by region
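To check that the boolean-sum shortcut and the CASE form agree, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The function name `compare_forms` is made up for illustration, and SQLite is an assumption; on an engine that does not treat comparisons as 0 or 1 the second query would need a CASE or IFF instead.

```python
import sqlite3

CASE_FORM = """
SELECT region,
       SUM(CASE WHEN price_a = price_b THEN 1 ELSE 0 END) AS price_match,
       SUM(CASE WHEN price_a <> price_b THEN 1 ELSE 0 END) AS price_mismatch
FROM foo
GROUP BY region
ORDER BY region
"""

# Relies on SQLite evaluating a comparison to 0 or 1.
BOOLEAN_FORM = """
SELECT region,
       SUM(price_a = price_b) AS price_match,
       SUM(price_a <> price_b) AS price_mismatch
FROM foo
GROUP BY region
ORDER BY region
"""


def compare_forms(rows):
    """Load rows into an in-memory table and run both query forms."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (region TEXT, price_a INTEGER, price_b INTEGER)")
    conn.executemany("INSERT INTO foo VALUES (?, ?, ?)", rows)
    case_result = conn.execute(CASE_FORM).fetchall()
    bool_result = conn.execute(BOOLEAN_FORM).fetchall()
    conn.close()
    return case_result, bool_result


sample = [("USA", 100, 120), ("USA", 150, 150), ("Canada", 300, 300), ("Mexico", 20, 25)]
case_rows, bool_rows = compare_forms(sample)
print(case_rows)
print(bool_rows)
```

Both queries scan the table once; the difference is only in how portable the syntax is.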

Coalesce value column on basis of other column without using multiple left join

So, I have a table like this:
|---------------------|------------------|------------------|
| ID | Region |isProductAvailable|
|---------------------|------------------|------------------|
| 12 | USA | Yes |
|---------------------|------------------|------------------|
| 13 | Ohio | No |
|---------------------|------------------|------------------|
| 14 | Australia | Yes |
|---------------------|------------------|------------------|
Now, the use case I have is this: there is a product, and its availability is based on a predefined hierarchy.
For example:
USA -> Ohio
Australia -> Sydney
Case 1: When I query this product table to check whether the product is available in Ohio, there is an entry for Ohio, so that row should be returned.
Case 2: When I query for Sydney, the table does not contain Sydney, so the query should look for its parent in the hierarchy specified above. Since there is an entry for Australia, the value for Australia should be returned.
P.S. I have solved this problem with left join and coalesce, but the problem is that the number of left joins increases as the length of the hierarchy increases.
select coalesce(rgn_Oh.isProductAvailable,rgn_USA.isProductAvailable)
from
(select t.* from t where region = 'Ohio') rgn_Oh
left join
(select t.* from t where region = 'USA') rgn_USA
on rgn_Oh.id = rgn_USA.id;
If I understand correctly, you can use order by for this:
select t.*
from t
where region in ('USA', 'Ohio', 'Australia', 'Sydney')
order by (case region
when 'Sydney' then 1
when 'Australia' then 2
when 'Ohio' then 3
when 'USA' then 4
end)
fetch first 1 row only;
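The ORDER BY idea can be generalized to a hierarchy of any length without adding joins. The sketch below is one possible way to do that with Python's sqlite3 module; the helper name `availability_for` and the idea of passing the chain as a list (most specific region first) are assumptions for illustration, and SQLite's LIMIT 1 stands in for fetch first 1 row only.

```python
import sqlite3


def availability_for(conn, chain):
    """Return the row for the first region in `chain` that exists in t.

    `chain` lists regions from most specific to least specific,
    for example ["Sydney", "Australia"].
    """
    placeholders = ", ".join("?" for _ in chain)
    # One "WHEN ? THEN rank" branch per region; lower rank wins.
    branches = " ".join(f"WHEN ? THEN {rank}" for rank in range(len(chain)))
    sql = (
        "SELECT region, isProductAvailable FROM t "
        f"WHERE region IN ({placeholders}) "
        f"ORDER BY CASE region {branches} END "
        "LIMIT 1"
    )
    # Parameters are bound in textual order: IN list first, then CASE branches.
    return conn.execute(sql, list(chain) + list(chain)).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, region TEXT, isProductAvailable TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(12, "USA", "Yes"), (13, "Ohio", "No"), (14, "Australia", "Yes")],
)
print(availability_for(conn, ["Ohio", "USA"]))
print(availability_for(conn, ["Sydney", "Australia"]))
```

Only the region names are bound as parameters; the ranks are generated integers, so the query text never contains user input.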

How Can I Count the Number of Times that Different Values Occur in a Column if the Possible Values Are Not Known?

Given the table of items
id | lccnumber | libraryid
--------------------------------------+------------------------+--------------------------------------
d6f7c1ba-a237-465e-94ed-f37e91bc64bd | PR6056.I4588 | 5d78803e-ca04-4b4a-aeae-2c63b924518b
1714f71f-b845-444b-a79e-a577487a6f7d | RC60 .A5 | 5d78803e-ca04-4b4a-aeae-2c63b924518b
1b6d3338-186e-4e35-9e75-1b886b0da53e | PR6056.I4588 | 5d78803e-ca04-4b4a-aeae-2c63b924518b
4428a37c-8bae-4f0d-865d-970d83d5ad55 | PR6056.I4588 | c2549bb4-19c7-4fcc-8b52-39e612fb7dbe
7212ba6a-8dcf-45a1-be9a-ffaa847c4423 | TK5105.88815 .A58 2004 | 5d78803e-ca04-4b4a-aeae-2c63b924518b
100d10bf-2f06-4aa0-be15-0b95b2d9f9e3 | TK5105.88815 .A58 2004 | c2549bb4-19c7-4fcc-8b52-39e612fb7dbe
Is there a SQL query that will produce the following result set?
lccnumber | 5d78803e-ca04-4b4a-aeae-2c63b924518b | c2549bb4-19c7-4fcc-8b52-39e612fb7dbe
------------------------+--------------------------------------+--------------------------------------
PR6056.I4588 | 2 | 1
RC60 .A5 | 1 | 0
TK5105.88815 .A58 2004 | 1 | 1
If the possible libraryids are known ahead of time, then I could do something like
SELECT lccNumber,
SUM(CASE WHEN libraryId = '5d78803e-ca04-4b4a-aeae-2c63b924518b' THEN 1 ELSE 0 END) AS "5d78803e-ca04-4b4a-aeae-2c63b924518b",
SUM(CASE WHEN libraryId = 'c2549bb4-19c7-4fcc-8b52-39e612fb7dbe' THEN 1 ELSE 0 END) AS "c2549bb4-19c7-4fcc-8b52-39e612fb7dbe"
FROM items
GROUP BY lccNumber;
but I am looking for a solution in the case that they are not known ahead of time. One approach that would probably work is to first query for the possible libraryIds and then programmatically construct a SELECT clause that accounts for all of these values, but I am wondering if there is a simpler or more efficient way to accomplish it.
You can put the values on separate rows:
SELECT lccNumber, libraryId, COUNT(*)
FROM items
GROUP BY lccNumber, libraryId;
Then re-arrange them at the application layer. Alternatively, you can combine these into records and aggregate them into an array for each lccNumber:
SELECT lccNumber, ARRAY_AGG( (libraryId, cnt) )
FROM (SELECT lccNumber, libraryId, COUNT(*) as cnt
FROM items
GROUP BY lccNumber, libraryId
) l
GROUP BY lccNumber
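As an illustration of re-arranging the grouped rows at the application layer, here is a rough sketch in Python with the built-in sqlite3 module. The function name `pivot_counts` is hypothetical, and SQLite is used only so the example is self-contained; the same loop would work on rows fetched from any driver.

```python
import sqlite3


def pivot_counts(conn):
    """Pivot (lccnumber, libraryid, count) rows into one dict per lccnumber."""
    rows = conn.execute(
        "SELECT lccnumber, libraryid, COUNT(*) FROM items "
        "GROUP BY lccnumber, libraryid"
    ).fetchall()
    # The set of library ids is discovered from the data, not known up front.
    libraries = sorted({lib for _, lib, _ in rows})
    table = {}
    for lcc, lib, cnt in rows:
        # Start every lccnumber with a zero for each library, then fill in.
        table.setdefault(lcc, dict.fromkeys(libraries, 0))[lib] = cnt
    return libraries, table


LIB_A = "5d78803e-ca04-4b4a-aeae-2c63b924518b"
LIB_B = "c2549bb4-19c7-4fcc-8b52-39e612fb7dbe"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (lccnumber TEXT, libraryid TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        ("PR6056.I4588", LIB_A),
        ("RC60 .A5", LIB_A),
        ("PR6056.I4588", LIB_A),
        ("PR6056.I4588", LIB_B),
        ("TK5105.88815 .A58 2004", LIB_A),
        ("TK5105.88815 .A58 2004", LIB_B),
    ],
)
libraries, table = pivot_counts(conn)
for lcc in sorted(table):
    print(lcc, [table[lcc][lib] for lib in libraries])
```

This keeps the SQL fixed regardless of how many libraries exist, at the cost of doing the column layout in code.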

Access sql to retrieve counts of values meeting a condition

I'm trying to write a query in Access that will return a count of values for each site in a table where the value exceeds a specified level, but also, for sites that have no values exceeding that level, return a specified value, such as "NA".
I've tried Iif, Switch, Union, sub queries, querying a different query, but no luck. I can get all the counts exceeding the level, or all sites with "NA" correct but showing total count for the rest, not just count above the level.
For example, in the table below, assuming level > 10, Houston = "NA", Detroit = 2, Pittsburgh PA = 3. I just can't get both sides of the query to work.
Apologize in advance for poor formatting.
+------------+-------+
| Site       | Value |
+------------+-------+
| Houston    | 10    |
| Houston    | 3     |
| Houston    | 0     |
| Detroit    | 15    |
| Detroit    | 7     |
| Detroit    | 4     |
| Detroit    | 12    |
| Pittsburgh | 23    |
| Pittsburgh | 2     |
| Pittsburgh | 18    |
| Pittsburgh | 12    |
+------------+-------+
Another solution is to use conditional aggregation, as follows:
SELECT site, SUM(IIf(value > 10, 1, 0)) AS value
FROM mytable
GROUP BY site
This approach should be more efficient than self-joining the table, since it needs to scan the table only once.
The SUM(IIf ...) is a handy construct to count how many records satisfy a given condition.
NB: it is generally not a good idea to return two different data types in the same column (in your use case, either a number or the string 'NA'). Most RDBMS do not allow that. So I provided a query that returns 0 when there are no matches, instead of NA. If you really want 'NA', you can try:
IIF(
SUM(IIf(value > 10, 1, 0)) = 0,
'NA',
STR(SUM(IIf(value > 10, 1, 0)))
) AS value
This demo on DB Fiddle, with your sample data, returns:
site | value
:--------- | ----:
Detroit | 2
Houston | 0
Pittsburgh | 3
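For readers who want to try the 'NA' variant outside Access, the following is a sketch using Python's sqlite3 module. SQLite has no Access-style Str function, so CASE and CAST are swapped in here as an assumption; the table name mytable comes from the answer above, and `counts_above` is a made-up helper name.

```python
import sqlite3

# CASE/CAST stand in for Access's IIf/Str; the logic is the same.
QUERY = """
SELECT site,
       CASE WHEN SUM(CASE WHEN value > :level THEN 1 ELSE 0 END) = 0
            THEN 'NA'
            ELSE CAST(SUM(CASE WHEN value > :level THEN 1 ELSE 0 END) AS TEXT)
       END AS above_level
FROM mytable
GROUP BY site
ORDER BY site
"""


def counts_above(conn, level):
    """Return (site, count-as-text or 'NA') for values above `level`."""
    return conn.execute(QUERY, {"level": level}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (site TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("Houston", 10), ("Houston", 3), ("Houston", 0),
        ("Detroit", 15), ("Detroit", 7), ("Detroit", 4), ("Detroit", 12),
        ("Pittsburgh", 23), ("Pittsburgh", 2), ("Pittsburgh", 18), ("Pittsburgh", 12),
    ],
)
print(counts_above(conn, 10))
```

Every value in the result column is text, which is what makes mixing 'NA' with counts possible in a single column.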
1. Get a list of all sites independent of the counts (the SiteList derived table below).
2. LEFT JOIN this back to your base table (SiteValues) to get the rows for each site that meet the threshold. Note: the join should be on the table's key, which I can't identify for this table; site alone isn't enough.
3. Count the values from the SiteValues dataset; the NULLs from unmatched rows are not counted, so sites without matches get 0.
WORKING DEMO:
SELECT SiteList.Site, Count(Sitevalues.Site)
FROM (SELECT site, value
FROM TableName) SiteList
LEFT JOIN TableName SiteValues
on SiteList.Site = SiteValues.Site
and SiteValues.Value > 10
and SiteValues.Value = SiteList.value
GROUP BY SiteList.Site
GIVING US:
+----+------------+------------------+
| | Site | (No column name) |
+----+------------+------------------+
| 1 | Detroit | 2 |
| 2 | Houston | 0 |
| 3 | Pittsburgh | 3 |
+----+------------+------------------+
Or, if you need the NA, you have to cast the count to a varchar:
SELECT SiteList.Site, case when Count(Sitevalues.Site) = 0 then 'NA' else cast(count(Sitevalues.site) as varchar(10)) end as SitesMeetingThreshold
FROM (SELECT site, value
FROM TableName) SiteList
LEFT JOIN TableName SiteValues
on SiteList.Site = SiteValues.Site
and SiteValues.Value > 10
and SiteValues.Value = SiteList.value
GROUP BY SiteList.Site
Just use conditional aggregation:
select site,
sum(iif(value > 10, 1, 0)) as cnt_11plus
from t
group by site;
I think 0 is better than N/A. But if you want that, you'll need to convert the results to a string.
select site,
iif(sum(iif(value > 10, 1, 0)) > 0,
str(sum(iif(value > 10, 1, 0))),
"N/A"
) as cnt_11plus
from t
group by site;
You can use UNION like this:
SELECT site, count(value) AS counter
FROM sites
WHERE value > 10
GROUP BY site
UNION
SELECT s.site, 'NA' AS counter
FROM sites AS s
WHERE value <= 10
AND NOT EXISTS (
SELECT 1 FROM sites WHERE site = s.site AND value > 10
)
GROUP BY site
Results:
site counter
Detroit 2
Houston NA
Pittsburgh 3
There is no need to convert the integer counter to Text, because Access does this implicitly for you.

sql Properly grouping my table

I'm using MS Access in order to play around with tables through SQL. I want to properly group my table and this is an example of what I want to do. Say I have a table like this:
Cool? | Age
Yes | 15
No | 34
No | 12
Yes | 26
Yes | 10
What I want is for the resulting table to show how many people are cool or not, grouped by age. For instance, in this example it would be:
AGE | Count that are cool | Count that is Not cool
<25 | 2 | 1
>=25 | 1 | 1
Thanks in advance!
Try this:
SELECT IIf(Age < 25, '<25', '>=25') AS AgeGroup,
SUM(IIf([Cool?] = 'Yes', 1, 0)) AS [Count that are cool],
SUM(IIf([Cool?] = 'No', 1, 0)) AS [Count that is Not cool]
FROM Table1
GROUP BY IIf(Age < 25, '<25', '>=25');
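To sanity-check the expected result from the question, here is a small sketch with Python's sqlite3 module. It assumes SQLite rather than Access, so CASE is used for the bucketing, and the table name people and column name cool are placeholders chosen to avoid quoting the original "Cool?" header.

```python
import sqlite3

QUERY = """
SELECT CASE WHEN age < 25 THEN '<25' ELSE '>=25' END AS age_group,
       SUM(CASE WHEN cool = 'Yes' THEN 1 ELSE 0 END) AS cool_count,
       SUM(CASE WHEN cool = 'No' THEN 1 ELSE 0 END) AS not_cool_count
FROM people
GROUP BY age_group
ORDER BY age_group
"""


def cool_by_age_group(rows):
    """Bucket rows by age and count Yes/No answers in each bucket."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (cool TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample = [("Yes", 15), ("No", 34), ("No", 12), ("Yes", 26), ("Yes", 10)]
print(cool_by_age_group(sample))
```

The bucket expression decides the row grouping, while the two conditional sums split each bucket by the Cool? answer; mixing those two roles up is the usual source of wrong counts here.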

PHP Database Query - Group by month

An edit per the suggestions:
$sql=
"SELECT SysproCompanyJ.dbo.InvMovements.StockCode,
SysproCompanyJ.dbo.InvMaster.Description,
SysproCompanyJ.dbo.InvMovements.TrnYear,
SysproCompanyJ.dbo.InvMovements.Warehouse,
SysproCompanyJ.dbo.InvMovements.TrnMonth,
SysproCompanyJ.dbo.InvMovements.TrnQty,
SysproCompanyJ.dbo.InvMovements.TrnValue
FROM SysproCompanyJ.dbo.InvMovements,
SysproCompanyJ.dbo.InvMaster
WHERE SysproCompanyJ.dbo.InvMovements.StockCode = SysproCompanyJ.dbo.InvMaster.StockCode
AND SysproCompanyJ.dbo.InvMovements.Warehouse = 'S2'
GROUP BY SysproCompanyJ.dbo.InvMovements.TrnMonth";
The sample DB data would be:
Stockcode | Description | TrnYear | Warehouse | TrnMonth | TrnQty | TrnValue
PN1 | Part Number 1 | 2013 | S2 | 1 | 100 | 10.00
PN2 | Part Number 2 | 2013 | S2 | 1 | 200 | 125.00
PN3 | Part Number 3 | 2013 | S2 | 1 | 200 | 60.00
PN1 | Part Number 1 | 2013 | S2 | 2 | 300 | 560.00
PN4 | Part Number 4 | 2013 | S2 | 2 | 400 | 30.00
PN5 | Part Number 5 | 2013 | S2 | 2 | 100 | 230.00
I'm trying to break down the data into separate tables grouped by month and then having a variable to sum the total TrnValue by month.
The current query as is gives the following error
Warning: odbc_exec() [function.odbc-exec]: SQL error: [Microsoft][ODBC SQL Server Driver][SQL Server]Column 'SysproCompanyJ.dbo.InvMovements.StockCode' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause., SQL state 37000 in SQLExecDirect in C:\wamp\www\dacs\S2_2.php on line 69
You can't use columns in the select list unless they are inside an aggregate function (min, max, sum, count, ...) or are included in the Group By.
Try something like this:
SELECT SysproCompanyJ.dbo.InvMovements.Warehouse,
SysproCompanyJ.dbo.InvMovements.TrnYear,
SysproCompanyJ.dbo.InvMovements.TrnMonth,
Sum(SysproCompanyJ.dbo.InvMovements.TrnQty) as sum_TrnQty,
Sum(SysproCompanyJ.dbo.InvMovements.TrnValue) as sum_TrnValue
FROM SysproCompanyJ.dbo.InvMovements,
SysproCompanyJ.dbo.InvMaster
WHERE SysproCompanyJ.dbo.InvMovements.StockCode = SysproCompanyJ.dbo.InvMaster.StockCode
AND SysproCompanyJ.dbo.InvMovements.Warehouse = 'S2'
GROUP BY SysproCompanyJ.dbo.InvMovements.Warehouse,
SysproCompanyJ.dbo.InvMovements.TrnYear,
SysproCompanyJ.dbo.InvMovements.TrnMonth
It usually doesn't make sense to include varchar columns (StockCode, Description) in any type of aggregate, and since they hold different values within each month, you probably don't want them in the Group By either.
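The monthly totals can be checked against the sample data with a short sketch in Python's sqlite3 module. This is a simplified stand-in: it uses a single in-memory table named InvMovements without the SysproCompanyJ.dbo prefix or the InvMaster join, and `monthly_totals` is a hypothetical helper name.

```python
import sqlite3

QUERY = """
SELECT Warehouse, TrnYear, TrnMonth,
       SUM(TrnQty) AS sum_TrnQty,
       SUM(TrnValue) AS sum_TrnValue
FROM InvMovements
WHERE Warehouse = ?
GROUP BY Warehouse, TrnYear, TrnMonth
ORDER BY TrnYear, TrnMonth
"""


def monthly_totals(conn, warehouse):
    """Return one row per (warehouse, year, month) with summed qty and value."""
    return conn.execute(QUERY, (warehouse,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE InvMovements (StockCode TEXT, TrnYear INTEGER, Warehouse TEXT, "
    "TrnMonth INTEGER, TrnQty INTEGER, TrnValue REAL)"
)
conn.executemany(
    "INSERT INTO InvMovements VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("PN1", 2013, "S2", 1, 100, 10.00),
        ("PN2", 2013, "S2", 1, 200, 125.00),
        ("PN3", 2013, "S2", 1, 200, 60.00),
        ("PN1", 2013, "S2", 2, 300, 560.00),
        ("PN4", 2013, "S2", 2, 400, 30.00),
        ("PN5", 2013, "S2", 2, 100, 230.00),
    ],
)
for row in monthly_totals(conn, "S2"):
    print(row)
```

If the per-item rows are also needed for the separate monthly tables, one option is to run the ungrouped query as well and print these totals under each month's rows in the PHP loop.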
When using GROUP BY in a SQL query, all fields shown should either be 'grouped' by or a calculated (aggregate) value.
For example;
SELECT city, count(*), avg(price) FROM properties GROUP BY city;
Which will produce something like;
City   |  count  |   Avg
Paris  |  3      |   166.666
This counts the number of rows per city. City is part of the 'group by'; 'count(*)' and 'avg(price)' are calculated (aggregate) columns.
If we introduce another column to the query, 'country':
SELECT country, city, count(*), avg(price) FROM properties GROUP BY city;
The query would give an error, because country is neither 'grouped' nor a calculated value. This error is quite logical; city names are not unique worldwide (e.g. 'Paris' in the USA and 'Paris' in France), so when grouping by city alone, the database cannot show a unique country name.
To resolve this, either include 'country' in the group by, or make it a calculated field;
SELECT country, city, count(*), avg(price) FROM properties GROUP BY country, city;
Will return the results grouped by country, then by city
Country | City | count | Avg
USA | Paris | 1 | 100.000
France | Paris | 2 | 200.000
Using a calculated value for country;
SELECT min(country), city, count(*), avg(price) FROM properties GROUP BY city;
Would return the results grouped by city, and the 'first' country in the group:
min(Country)| City | count | Avg
France | Paris | 3 | 166.666
This is probably not what you want: the row shows 'France', but it also includes results from Paris, USA.
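The two ways of resolving the error can be compared side by side with a sketch in Python's sqlite3 module. The three property rows are made up to match the figures in the example tables above, and note that SQLite itself is lenient about bare columns in grouped queries, so it will not reproduce the error that SQL Server raises; only the two valid queries are run here.

```python
import sqlite3

GROUP_BY_COUNTRY_CITY = """
SELECT country, city, COUNT(*), AVG(price)
FROM properties
GROUP BY country, city
ORDER BY country
"""

MIN_COUNTRY_PER_CITY = """
SELECT MIN(country), city, COUNT(*), AVG(price)
FROM properties
GROUP BY city
"""


def run_both(rows):
    """Run both grouping variants over the same rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE properties (country TEXT, city TEXT, price REAL)")
    conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", rows)
    by_country_city = conn.execute(GROUP_BY_COUNTRY_CITY).fetchall()
    by_city = conn.execute(MIN_COUNTRY_PER_CITY).fetchall()
    conn.close()
    return by_country_city, by_city


# Assumed sample rows: one property in Paris, USA and two in Paris, France.
sample = [("USA", "Paris", 100.0), ("France", "Paris", 200.0), ("France", "Paris", 200.0)]
by_country_city, by_city = run_both(sample)
print(by_country_city)
print(by_city)
```

The second result has a single Paris row labelled France even though one of its three properties is in the USA, which is the misleading outcome described above.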