Case statement COUNT for THEN in (PostgreSQL) 9.3.11 - sql

Firstly, this is a coursework question, so I am not looking for a full answer, just a hint :)
I have a "monarch" table that stores name, house(?), accession, coronation(?) and keeps track of monarchs (including prime ministers). House and coronation apply only to monarchs and are null when the entry is a prime minister.
It looks as follows:
I am required to write a psql query that returns the scheme (house, seventeenth, eighteenth, nineteenth, twentieth), listing the number of monarchs of each royal house that acceded to the throne in the 17th, 18th, 19th and 20th centuries. My issue is what to put in the THEN branch of each CASE.
EDIT:
Thank you for your suggestions! I made some changes to my query now:
SELECT house,
TO_CHAR(accession, 'YYYY' ) AS accession_year,
COUNT(CASE WHEN accession_year BETWEEN 1601 AND 1700 THEN name END) AS seventeenth,
COUNT(CASE WHEN accession_year BETWEEN 1701 AND 1800 THEN name END) AS eighteenth,
COUNT(CASE WHEN accession_year BETWEEN 1801 AND 1900 THEN name END) AS nineteenth,
COUNT(CASE WHEN accession_year BETWEEN 1901 AND 2000 THEN name END) AS twentieth,
FROM monarch
WHERE house IS NOT NULL
GROUP BY house
;
Now psql tells me that accession_year does not exist. I do not want to use full accession date in CASE statements. How can I still use my TO_CHAR in the query?

Each expression in a single SELECT clause is evaluated "as if" it's being computed in parallel with all other expressions in the same clause. As such, you're not allowed to have any dependencies between them since no resulting values are available at the start.
One option is to introduce a subquery:
SELECT house,
COUNT(CASE WHEN accession_year BETWEEN 1601 AND 1700 THEN name END) AS seventeenth,
COUNT(CASE WHEN accession_year BETWEEN 1701 AND 1800 THEN name END) AS eighteenth,
COUNT(CASE WHEN accession_year BETWEEN 1801 AND 1900 THEN name END) AS nineteenth,
COUNT(CASE WHEN accession_year BETWEEN 1901 AND 2000 THEN name END) AS twentieth
FROM (
-- TO_CHAR returns text, so cast it before comparing with integers
SELECT house,name,TO_CHAR(accession, 'YYYY' )::int AS accession_year
FROM monarch
WHERE house IS NOT NULL ) AS t
GROUP BY house
;
Now you have two separate SELECT clauses, and the outer one is allowed to depend on values computed by the inner one.
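To try the subquery idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The rows are made up for illustration, and SQLite's strftime('%Y', ...) with a CAST stands in for TO_CHAR, so treat it as a demonstration of the pattern rather than the exact coursework query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monarch (name TEXT, house TEXT, accession TEXT)")
# Hypothetical sample rows, only to exercise the query.
conn.executemany(
    "INSERT INTO monarch VALUES (?, ?, ?)",
    [
        ("m1", "Stuart", "1603-03-24"),
        ("m2", "Stuart", "1625-03-27"),
        ("m3", "Hanover", "1714-08-01"),
        ("m4", "Hanover", "1837-06-20"),
        ("pm1", None, "1721-04-03"),  # a prime minister: no house
    ],
)

# The inner SELECT computes accession_year once; the outer SELECT may use it freely.
query = """
SELECT house,
       COUNT(CASE WHEN accession_year BETWEEN 1601 AND 1700 THEN name END) AS seventeenth,
       COUNT(CASE WHEN accession_year BETWEEN 1701 AND 1800 THEN name END) AS eighteenth,
       COUNT(CASE WHEN accession_year BETWEEN 1801 AND 1900 THEN name END) AS nineteenth
FROM (
    SELECT house, name,
           CAST(strftime('%Y', accession) AS INTEGER) AS accession_year
    FROM monarch
    WHERE house IS NOT NULL
) AS t
GROUP BY house
ORDER BY house
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each house comes back as one row, with one count per century column.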

You can use sub-selects OR CTEs to SELECT (or just use) calculated columns, but in some simpler cases (like yours) a LATERAL join is more readable:
SELECT house,
COUNT(CASE WHEN accession_century = 17 THEN 1 END) AS seventeenth,
COUNT(CASE WHEN accession_century = 18 THEN 1 END) AS eighteenth,
COUNT(CASE WHEN accession_century = 19 THEN 1 END) AS nineteenth,
COUNT(CASE WHEN accession_century = 20 THEN 1 END) AS twentieth
FROM monarch,
date_part('century', accession) accession_century
WHERE house IS NOT NULL
GROUP BY house
Note: to_char() returns a string, which is not really useful for querying. Use date_part() or EXTRACT() instead. This matters especially in your case: they can extract the century, which is exactly what you want to search for.
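As a sanity check on the century numbers, here is a small sketch of the arithmetic. It assumes the convention that the first century starts in year 1 (so 1901 to 2000 is the 20th century), which is how I understand PostgreSQL's century field to work for AD dates; century_of is a hypothetical helper, not a library function.

```python
def century_of(year: int) -> int:
    """Century number for a positive (AD) year, e.g. 1901..2000 -> 20."""
    return (year - 1) // 100 + 1


# The BETWEEN ranges from the question, keyed by the century they should map to.
ranges = {17: (1601, 1700), 18: (1701, 1800), 19: (1801, 1900), 20: (1901, 2000)}

# Both ends of every range must land in the same, expected century.
consistent = all(
    century_of(first) == century and century_of(last) == century
    for century, (first, last) in ranges.items()
)
print(consistent)
```

Under that assumption, comparing the century with 17, 18, 19 and 20 selects the same rows as the four BETWEEN ranges.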

Related

SQL Summing columns based on date key

I have a dataset as given in the link, DataSet
I want to segregate the column "order_item_unit_status" as separate column and bring respective transaction amount for the same. Desired output is given below.
Objective is to consolidate the txn_amt into respective categories and group them based on txn_date_key. (Basically pivoting based on order_item_unit_status column and bringing txn_amt respectively.)
I used the below code,
Select *, CASE WHEN order_item_unit_status ='DELIVERED'
THEN txn_amt ELSE 0 END as DELIVERED,
CASE WHEN order_item_unit_status ='RETURNED'
THEN txn_amt ELSE 0 END as RETURNED
from sales
Got output as referred in the link Output
The output is not grouped by txn_date_key, and there are multiple line items per date. If I use GROUP BY txn_date_key, an error is thrown.
Also, I was informed that the server runs HiveSQL and does not support using ":" or date time, and that temp tables cannot be created. I'm currently stuck on how to proceed given these constraints.
Help would be much appreciated
You have to use your columns in the group by:
EDIT: also SUM() added for the correct output...
Select txn_amt, txn_date_key, order_item_unit_status,
SUM(CASE WHEN order_item_unit_status ='DELIVERED'
THEN txn_amt ELSE 0 END) as DELIVERED,
SUM(CASE WHEN order_item_unit_status ='RETURNED'
THEN txn_amt ELSE 0 END) as RETURNED
from sales
group by txn_amt,txn_date_key,order_item_unit_status
In hivesql you can use from_unixtime command
Unixtime
All selected columns that are not aggregated should be in the GROUP BY.
This query produces result you need:
Select txn_date_key,
sum(CASE WHEN order_item_unit_status ='DELIVERED'
THEN txn_amt ELSE 0 END) as DELIVERED,
sum(CASE WHEN order_item_unit_status ='RETURNED'
THEN txn_amt ELSE 0 END) as RETURNED
from sales
group by txn_date_key
Result:
txn_date_key | delivered | returned
20190701     | 3200      | 0
20210631     | 0         | 3000
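For anyone who wants to see the conditional SUM pivot run end to end, here is a minimal sketch with Python's sqlite3 module standing in for Hive. The sales rows are invented for illustration; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (txn_date_key INTEGER, order_item_unit_status TEXT, txn_amt INTEGER)"
)
# Hypothetical rows: two deliveries on one day, one return and one delivery on the next.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (20190701, "DELIVERED", 1200),
        (20190701, "DELIVERED", 2000),
        (20190702, "RETURNED", 3000),
        (20190702, "DELIVERED", 500),
    ],
)

# Only txn_date_key is selected un-aggregated, so it is the only GROUP BY column.
rows = conn.execute(
    """
    SELECT txn_date_key,
           SUM(CASE WHEN order_item_unit_status = 'DELIVERED' THEN txn_amt ELSE 0 END) AS delivered,
           SUM(CASE WHEN order_item_unit_status = 'RETURNED'  THEN txn_amt ELSE 0 END) AS returned
    FROM sales
    GROUP BY txn_date_key
    ORDER BY txn_date_key
    """
).fetchall()
print(rows)
```

The ELSE 0 matters here because SUM over only NULLs would return NULL instead of 0 for a date with no rows in that status.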

Using COUNT CASE WHEN MONTH Statement in MariaDB 10.2.15

I created a query to count the Ids in a table per month using COUNT, CASE WHEN and MONTH.
Code:
SELECT
COUNT(CASE WHEN MONTH(LogsFormatted.DateIn) = 1 THEN LogsFormatted.Id ELSE 0 END ) AS '1',
COUNT(CASE WHEN MONTH(LogsFormatted.DateIn) = 2 THEN LogsFormatted.Id ELSE 0 END ) AS '2'
FROM
HrAttLogsFormatted AS LogsFormatted
WHERE
LogsFormatted.DateIn BETWEEN '2019-01-01' AND '2019-02-31'
AND LogsFormatted.Late != ''
Output :
| 1 | 2 |
| 1378 | 1378 |
The output I want to make is to calculate the Id in each month, namely Month 1 and Month 2
| 1 | 2 |
| 792 | 586 |
The figures above are the actual counts.
The query above instead returns the sum of the month 1 and month 2 counts in both columns.
You should be counting NULL when the criteria in your CASE expression do not match. Also, I prefer counting 1 unless you really want to count the Ids themselves. This version should work:
SELECT
COUNT(CASE WHEN MONTH(lf.DateIn) = 1 THEN 1 END) AS '1',
COUNT(CASE WHEN MONTH(lf.DateIn) = 2 THEN 1 END) AS '2'
FROM HrAttLogsFormatted AS lf
WHERE
lf.DateIn BETWEEN '2019-01-01' AND '2019-02-31' AND
lf.Late != '';
Note carefully that the count you are currently seeing in each column is the sum of the individual counts, that is:
1378 = 792 + 586
The reason for this is that the COUNT function "counts" any non-NULL value as 1 and any NULL value as zero. With ELSE 0, your current CASE expression never yields NULL, so it counts 1 for every record that passes the WHERE clause.
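Here is a small sketch of that difference, using Python's sqlite3 module in place of MariaDB (COUNT ignores NULLs the same way in both). The logs table and its five rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER, month INTEGER)")
# Three rows in month 1, two rows in month 2.
conn.executemany("INSERT INTO logs VALUES (?, ?)", [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)])

# ELSE 0 yields a non-NULL value for every row, so COUNT counts all of them.
with_else = conn.execute(
    """
    SELECT COUNT(CASE WHEN month = 1 THEN id ELSE 0 END),
           COUNT(CASE WHEN month = 2 THEN id ELSE 0 END)
    FROM logs
    """
).fetchone()

# Without ELSE, non-matching rows yield NULL and are skipped by COUNT.
without_else = conn.execute(
    """
    SELECT COUNT(CASE WHEN month = 1 THEN id END),
           COUNT(CASE WHEN month = 2 THEN id END)
    FROM logs
    """
).fetchone()

print(with_else, without_else)
```

With ELSE 0 both columns show the total row count; without it, each column shows only its own month.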
Remove the ELSE part from the CASE expression. If you use ELSE 0, COUNT takes those rows into consideration as well, which gives you the wrong output.
SELECT
COUNT(CASE WHEN MONTH(LogsFormatted.DateIn) = 1 THEN LogsFormatted.Id END ) AS '1',
COUNT(CASE WHEN MONTH(LogsFormatted.DateIn) = 2 THEN LogsFormatted.Id END ) AS '2'
FROM
HrAttLogsFormatted AS LogsFormatted
WHERE
LogsFormatted.DateIn BETWEEN '2019-01-01' AND '2019-02-31'
AND LogsFormatted.Late != ''
If you are using MariaDB, I would just use SUM() with a boolean:
SELECT SUM( MONTH(lf.DateIn) = 1 ) as month_1,
SUM( MONTH(lf.DateIn) = 2 ) as month_2
FROM HrAttLogsFormatted lf
WHERE lf.DateIn >= '2019-01-01' AND
lf.DateIn < '2020-01-01' AND
lf.Late <> '';
This assumes that the id that you are counting is never NULL (a reasonable assumption for an id).
Note other changes:
The column aliases do not need to be escaped. Don't use non-standard names, unless you need them -- for some reason -- for downstream processing.
This uses a shorter table alias, so the query is easier to write and to read.
The date comparisons use inequalities, so this works both for dates and datetimes.
<> is the standard SQL comparison operator for inequality.
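To illustrate the point about inequalities and the SUM over a boolean, here is a sketch with Python's sqlite3 module. SQLite stores these values as text, so it only mimics how a datetime column behaves, and the rows are made up, but the effect of the closed BETWEEN range on a value with a time part is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (date_in TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?)",
    [
        ("2019-01-15 08:00:00",),
        ("2019-01-31 17:30:00",),  # last day of January, with a time part
        ("2019-02-10 09:00:00",),
        ("2019-03-01 00:00:00",),
    ],
)

# Closed range: the 17:30 row on Jan 31 sorts after '2019-01-31' and is lost.
closed = conn.execute(
    "SELECT COUNT(*) FROM logs WHERE date_in BETWEEN '2019-01-01' AND '2019-01-31'"
).fetchone()[0]

# Half-open range: everything before Feb 1 is kept, time part or not.
half_open = conn.execute(
    "SELECT COUNT(*) FROM logs WHERE date_in >= '2019-01-01' AND date_in < '2019-02-01'"
).fetchone()[0]

# A comparison evaluates to 0 or 1, so SUM() of it counts the matching rows.
per_month = conn.execute(
    """
    SELECT SUM(CAST(strftime('%m', date_in) AS INTEGER) = 1) AS month_1,
           SUM(CAST(strftime('%m', date_in) AS INTEGER) = 2) AS month_2
    FROM logs
    WHERE date_in >= '2019-01-01' AND date_in < '2019-03-01'
    """
).fetchone()

print(closed, half_open, per_month)
```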

Converting dates into weekdays then correlating it and summing it

The query is simple, but it is not behaving the way I want.
I am trying to check whether a given date falls on the weekday I am comparing it against.
Input
SELECT TO_CHAR(date '1982.03.09', 'DAY'),
(CASE When lower(TO_CHAR(date '1982.03.09', 'DAY')) like lower('TUESDAY')
then 1 else 0 end)
Output
The answer should have been 1 for the case statement.
I added lower to check whether it had something to do with the capitals
Reason
The reason I use a case statement is that when a student has an afterschool activity on Monday, I want to place either 1 or 0 in the table and calculate the sum of how many students have an afterschool activity on Monday, and so on.
Need eventually
I am doing this so that I can create a table of the week with the number of children doing afterschool activities for each day.
Any help regarding fixing my query would be greatly appreciated!
Thanks
to_char() blank-pads the day name produced by 'DAY' to nine characters, so TUESDAY comes back with trailing spaces. You can trim() them away. But instead of relying on a string representation (which might change when the locale changes), you are better off using extract() to get the day of the week as a number: 0 for Sunday, 1 for Monday and so on.
SELECT to_char(DATE '1982.03.09', 'DAY'),
CASE
WHEN trim(to_char(DATE '1982.03.09', 'DAY')) = 'TUESDAY' THEN
1
ELSE
0
END,
CASE extract(dow FROM DATE '1982.03.09')
WHEN 2 THEN
1
ELSE
0
END;
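To see why the original comparison returned 0, here is a plain Python sketch that imitates the padding rather than calling PostgreSQL. The ljust(9) call is my stand-in for the blank padding of 'DAY', and DAY_NAMES is a hypothetical lookup list, so this is an illustration of the effect, not the server's actual code path.

```python
from datetime import date

DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]

d = date(1982, 3, 9)

# Assumption: imitate the blank-padded day name, nine characters wide.
padded = DAY_NAMES[d.weekday()].ljust(9)

matches_raw = padded == "TUESDAY"              # trailing spaces break the comparison
matches_trimmed = padded.strip() == "TUESDAY"  # trimming fixes it

# Numeric day of week in the 0 = Sunday convention used by extract(dow ...).
dow = d.isoweekday() % 7

print(repr(padded), matches_raw, matches_trimmed, dow)
```

The numeric comparison sidesteps both the padding and any spelling or case issues.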
I'm a personal fan of extract (<datepart> from <date>) in lieu of to_char for problems like this.
Based on the output you are trying to achieve, I might also recommend a poor man's pivot table:
select
student_id,
max (case when extract (dow from activity_date) = 1 then 1 else 0 end) as mo,
max (case when extract (dow from activity_date) = 2 then 1 else 0 end) as tu,
max (case when extract (dow from activity_date) = 3 then 1 else 0 end) as we,
max (case when extract (dow from activity_date) = 4 then 1 else 0 end) as th,
max (case when extract (dow from activity_date) = 5 then 1 else 0 end) as fr
from activities
where activity_date between :FROM_DATE and :THRU_DATE
group by
student_id
Normally this would be a good use case for filter (where ...), but that would leave null values for student/day combinations where there is no activity. Depending on how you render your output, that may or may not be okay (Excel would handle it fine).
select
student_id,
max (1) filter (where extract (dow from activity_date) = 1) as mo,
max (1) filter (where extract (dow from activity_date) = 2) as tu,
max (1) filter (where extract (dow from activity_date) = 3) as we,
max (1) filter (where extract (dow from activity_date) = 4) as th,
max (1) filter (where extract (dow from activity_date) = 5) as fr
from activities
group by
student_id
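Here is a sketch of the 0 versus NULL difference, run through Python's sqlite3 module. The activities rows are invented, strftime('%w', ...) stands in for extract(dow ...) (both use 0 for Sunday), and a CASE without ELSE is used to mimic what the filter version leaves behind when nothing matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activities (student_id INTEGER, activity_date TEXT)")
# Hypothetical rows: 2019-09-02 is a Monday, 09-03 a Tuesday, 09-04 a Wednesday.
conn.executemany(
    "INSERT INTO activities VALUES (?, ?)",
    [(1, "2019-09-02"), (1, "2019-09-04"), (2, "2019-09-03")],
)

DOW = "CAST(strftime('%w', activity_date) AS INTEGER)"

# ELSE 0: days without an activity show up as 0.
rows_zero = conn.execute(
    f"""
    SELECT student_id,
           MAX(CASE WHEN {DOW} = 1 THEN 1 ELSE 0 END) AS mo,
           MAX(CASE WHEN {DOW} = 2 THEN 1 ELSE 0 END) AS tu
    FROM activities GROUP BY student_id ORDER BY student_id
    """
).fetchall()

# No ELSE: days without an activity come back as NULL (None in Python).
rows_null = conn.execute(
    f"""
    SELECT student_id,
           MAX(CASE WHEN {DOW} = 1 THEN 1 END) AS mo,
           MAX(CASE WHEN {DOW} = 2 THEN 1 END) AS tu
    FROM activities GROUP BY student_id ORDER BY student_id
    """
).fetchall()

print(rows_zero, rows_null)
```

If the goal is the number of children per day, summing the 0/1 columns of the first form over all students gives that directly.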

SQL Pivot 2 Columns

I have the table of the following format
I think my problem is a bit different from the possible duplicate question: I'm trying to get repeated 201601...201652 columns for the two metrics, orders and cost.
This is an approach for any database (including SQL Server) that does not rely on a proprietary PIVOT() function. It's a bit weird to do that for 52 weeks in such an example, though (and, to tell you the truth, the 105 resulting columns are not really the best output for the benefit of a human being reading the report).
Having said that, in this example, I do it for quarters of a year rather than weeks, and you'd just have to repeat the expressions 52 times instead of 4 times.
You could use perl or Visual Basic or whatever you prefer to generate the statement, actually.
Here goes:
-- the input table, don't use in real query ...
WITH
input(id,quarter,orders,cost) AS (
SELECT 1,201601,200,1000
UNION ALL SELECT 1,201602,300,1500
UNION ALL SELECT 1,201603,330,1800
UNION ALL SELECT 1,201604,500,2500
)
-- end of input -
SELECT
id
, SUM(CASE quarter WHEN 201601 THEN orders END) AS "orders_201601"
, SUM(CASE quarter WHEN 201602 THEN orders END) AS "orders_201602"
, SUM(CASE quarter WHEN 201603 THEN orders END) AS "orders_201603"
, SUM(CASE quarter WHEN 201604 THEN orders END) AS "orders_201604"
, SUM(CASE quarter WHEN 201601 THEN cost END) AS "cost_201601"
, SUM(CASE quarter WHEN 201602 THEN cost END) AS "cost_201602"
, SUM(CASE quarter WHEN 201603 THEN cost END) AS "cost_201603"
, SUM(CASE quarter WHEN 201604 THEN cost END) AS "cost_201604"
FROM input
GROUP BY id;
id|orders_201601|orders_201602|orders_201603|orders_201604|cost_201601|cost_201602|cost_201603|cost_201604
1| 200| 300| 330| 500| 1,000| 1,500| 1,800| 2,500
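The answer mentions generating the statement with a script. A rough sketch of that in Python could look like this; pivot_sql is a hypothetical helper, and it interpolates identifiers directly into the string, so it assumes the table, column and period values come from a trusted source.

```python
def pivot_sql(table, key, period_col, periods, metrics):
    """Build a conditional-aggregation pivot statement as a string."""
    columns = [
        f'  , SUM(CASE {period_col} WHEN {p} THEN {m} END) AS "{m}_{p}"'
        for m in metrics   # all periods for the first metric, then the next metric
        for p in periods
    ]
    lines = ["SELECT", f"  {key}", *columns, f"FROM {table}", f"GROUP BY {key};"]
    return "\n".join(lines)


# Same four quarters and two metrics as the hand-written statement above.
sql = pivot_sql("input", "id", "quarter", [201601, 201602, 201603, 201604], ["orders", "cost"])
print(sql)
```

For the 52 weekly columns from the question, passing range(201601, 201653) as periods should produce the 104 expressions without typing them by hand.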

AVG only valid dates

I have a simple search query like this one:
SELECT COUNT(id),
COUNT(CASE WHEN nation = 'german' THEN 1 END),
COUNT(CASE WHEN nation = 'french' THEN 1 END),
AVG(AGE(birthday))
FROM persons;
My problem is that I get an error:
ERROR: date out of range for timestamp
I suppose I get this error because not every person has a birthday saved.
birthday is a date-field
How can I prevent this error and only average birthdays that are valid dates? Thanks
How about this:
SELECT COUNT(id),
COUNT(CASE WHEN nation = 'german' THEN 1 END),
COUNT(CASE WHEN nation = 'french' THEN 1 END),
AVG(AGE(birthday))
FROM persons where birthday is not null;
I would use another case statement to exclude the invalid null values from the avg(age()) calculation:
SELECT
COUNT(id),
COUNT(CASE WHEN nation = 'german' THEN 1 END),
COUNT(CASE WHEN nation = 'french' THEN 1 END),
AVG(CASE WHEN birthday IS NOT NULL THEN AGE(birthday) END)
FROM persons;
If you were to add a where birthday is not null clause the average would be correct (or as correct as it can be) but the counts would be off due to the excluded rows not being counted.
See this SQL Fiddle demo and notice how the counts differ between the two queries.
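In case the fiddle is unavailable, here is a comparable sketch using Python's sqlite3 module. SQLite has no AGE(), so an age in days relative to a fixed date (via julianday) stands in for it, and the three persons rows are made up; the point is only how the WHERE clause changes the counts but not the average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER, nation TEXT, birthday TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?)",
    [
        (1, "german", "2000-01-01"),
        (2, "french", "2010-01-01"),
        (3, "german", None),  # no birthday saved
    ],
)

SELECT_LIST = """
    SELECT COUNT(id),
           COUNT(CASE WHEN nation = 'german' THEN 1 END),
           COUNT(CASE WHEN nation = 'french' THEN 1 END),
           AVG(julianday('2020-01-01') - julianday(birthday))  -- stand-in for AGE()
    FROM persons
"""

# NULL birthdays produce NULL ages, which AVG skips; the counts see every row.
all_rows = conn.execute(SELECT_LIST).fetchone()

# Filtering in WHERE removes the row from the counts as well.
filtered = conn.execute(SELECT_LIST + " WHERE birthday IS NOT NULL").fetchone()

print(all_rows, filtered)
```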