Case statement gives unexpected output - sql

I have a subquery producing some rows like this:
| year | quarter | id | status |
|------|---------|----|--------|
| 2020 | 4 | 1 | no |
| 2021 | 1 | 1 | yes |
I then want to add an extra column to track status change. I have tried this:
select
*,
case
when (year = 2020 and quarter = 4 and status = 'no') and (year = 2021 and quarter = 1 and status = 'yes')
then 1 else 0
end as status_change_during_2021_q1
from(
...
...
) t
order by id, year, quarter
limit 50
However, this is the output I get:
| year | quarter | id | status | status_change_during_2021_q1 |
|------|---------|----|--------|------------------------------|
| 2020 | 4 | 1 | no | 0 |
| 2021 | 1 | 1 | yes | 0 |
I do not understand why the second row isn't set to 1 for status_change...?

You are trying to compare values across rows. Use lag():
(case when status = 'yes' and
lag(status) over (partition by id order by year, quarter) = 'no'
then 1 else 0
end)
Your version cannot do anything useful, because, for instance, year cannot be both 2021 and 2020 in the same row.
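To see what lag() does with the two sample rows, here is a minimal sketch that runs the expression above against an in-memory SQLite database from Python. The table name t and the choice of SQLite are assumptions for illustration (window functions need SQLite 3.25 or later); the question does not say which database it uses.

```python
import sqlite3

# In-memory stand-in for the subquery output shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (year int, quarter int, id int, status text)")
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [(2020, 4, 1, "no"), (2021, 1, 1, "yes")],
)

# lag(status) reads the status of the previous row for the same id,
# ordered by year and quarter, so the comparison spans two rows.
query = """
select t.*,
       (case when status = 'yes' and
                  lag(status) over (partition by id order by year, quarter) = 'no'
             then 1 else 0
        end) as status_change_during_2021_q1
from t
order by id, year, quarter
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first row has no previous row within its id partition, so lag() returns NULL there, the comparison is not true, and the case expression falls through to 0.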

SQL Grouping by year gives incorrect results

I am trying to summarize sales data by month, sales region and type. The problem is that the results change when I group by year.
My simplified query is as follows:
SELECT
DAB700.DATUM,DAB000.X_REGION,DAB700.BELEG_ART, // the date, sales region, order type
// calculate the number of orders per month
COUNT (DISTINCT CASE WHEN MONTH(DAB700.DATUM) = 1 THEN DAB700.BELEG_NR END) as jan,
COUNT (DISTINCT CASE WHEN MONTH(DAB700.DATUM) = 2 THEN DAB700.BELEG_NR END) as feb,
COUNT (DISTINCT CASE WHEN MONTH(DAB700.DATUM) = 3 THEN DAB700.BELEG_NR END) as mar
FROM "DAB700.ADT" DAB700
left join "DAB050.ADT" DAB050 on DAB700.BELEG_NR = DAB050.ANUMMER // join to table 050, to pull in order info
left join "DF030000.DBF" DAB000 on DAB050.KDNR = DAB000.KDNR // join table 000 to table 050, to pull in customer info
left join "DAB055.ADT" DAB055 on DAB050.ANUMMER = left (DAB055.APNUMMER,6)// join table 055 to table 050, to pull in product info
WHERE (DAB700.BELEG_ART = 10 OR DAB700.BELEG_ART = 20) AND (DAB700.DATUM>={d '2021-01-01'}) AND (DAB700.DATUM<={d '2021-01-11'}) AND DAB055.ARTNR <> '999999' AND DAB055.ARTNR <> '999996' AND DAB055.TERMIN <> 'KW.22.22' AND DAB055.TERMIN <> 'KW.99.99' AND DAB050.AUF_ART = 0
group by DAB700.DATUM,DAB000.X_REGION,DAB700.BELEG_ART
This returns the following data, which is correct (manually checked):
| DATUM | X_REGION | BELEG_ART | jan | feb | mar |
|------------|----------|-----------|-----|-----|-----|
| 04.01.2021 | 1 | 10 | 3 | 0 | 0 |
| 04.01.2021 | 3 | 10 | 2 | 0 | 0 |
| 04.01.2021 | 4 | 10 | 1 | 0 | 0 |
| 04.01.2021 | 4 | 20 | 1 | 0 | 0 |
| 04.01.2021 | 6 | 20 | 2 | 0 | 0 |
| 05.01.2021 | 1 | 10 | 1 | 0 | 0 |
and so on....
The total number of records for Jan is 117 (correct).
Now I want to summarize the data in one row per group (for example, data grouped by region and type),
so I change my code so that I have:
SELECT
YEAR(DAB700.DATUM),
and
group by YEAR(DAB700.DATUM)
the rest of the code stays the same.
Now my results are:
| EXPR | X_REGION | BELEG_ART | jan | feb | mar |
|------|----------|-----------|-----|-----|-----|
| 2021 | 1 | 10 | 16 | 0 | 0 |
| 2021 | 1 | 20 | 16 | 0 | 0 |
| 2021 | 2 | 10 | 19 | 0 | 0 |
| 2021 | 2 | 20 | 22 | 0 | 0 |
| 2021 | 3 | 10 | 12 | 0 | 0 |
| 2021 | 3 | 20 | 6 | 0 | 0 |
Visually it is correct. But, the total count for January is now 116. A difference of 1. What am I doing wrong?
How can I keep the results from the first code - but have it presented as per the 2nd set?
You count distinct BELEG_NR. This is what makes the difference. Let's look at an example. Let's say your table contains four rows:
| DATUM | X_REGION | BELEG_ART | BELEG_NR |
|------------|----------|-----------|----------|
| 04.01.2021 | 1 | 10 | 100 |
| 04.01.2021 | 1 | 10 | 200 |
| 05.01.2021 | 1 | 10 | 100 |
| 05.01.2021 | 1 | 10 | 300 |
That gives you per day, region and belegart:
| DATUM | X_REGION | BELEG_ART | DISTINCT COUNT BELEG_NR |
|------------|----------|-----------|-------------------------|
| 04.01.2021 | 1 | 10 | 2 |
| 05.01.2021 | 1 | 10 | 2 |
and per year, region and belegart
| YEAR | X_REGION | BELEG_ART | DISTINCT COUNT BELEG_NR |
|------|----------|-----------|-------------------------|
| 2021 | 1 | 10 | 3 |
The BELEG_NR 100 never appears more than once per day, so every instance gets counted. But it appears twice for the year, so it gets counted once instead of twice.
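The four-row example can be reproduced with a small sketch. It uses an in-memory SQLite database from Python; the table name orders, the ISO date strings, and substr() as a stand-in for YEAR() are assumptions for illustration, not the asker's actual schema. The last query shows one possible way to keep the per-day totals in the yearly layout: count distinct per day first, then sum those counts per year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (datum text, x_region int, beleg_art int, beleg_nr int)"
)
conn.executemany("insert into orders values (?, ?, ?, ?)", [
    ("2021-01-04", 1, 10, 100),
    ("2021-01-04", 1, 10, 200),
    ("2021-01-05", 1, 10, 100),
    ("2021-01-05", 1, 10, 300),
])

# Distinct count per day: BELEG_NR 100 is counted once on each day.
per_day = conn.execute("""
    select datum, x_region, beleg_art, count(distinct beleg_nr) as n
    from orders
    group by datum, x_region, beleg_art
    order by datum
""").fetchall()

# Distinct count per year: BELEG_NR 100 is counted only once overall.
per_year = conn.execute("""
    select substr(datum, 1, 4) as yr, x_region, beleg_art,
           count(distinct beleg_nr) as n
    from orders
    group by substr(datum, 1, 4), x_region, beleg_art
""").fetchall()

# Sum of the per-day distinct counts, presented per year.
summed = conn.execute("""
    select substr(datum, 1, 4) as yr, x_region, beleg_art, sum(n) as n
    from (
        select datum, x_region, beleg_art, count(distinct beleg_nr) as n
        from orders
        group by datum, x_region, beleg_art
    ) d
    group by substr(datum, 1, 4), x_region, beleg_art
""").fetchall()

print(per_day, per_year, summed)
```

Whether the summed figure or the yearly distinct figure is the right one depends on whether an order that appears on two days should count once or twice.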

Sum case from previous month

I couldn't find the answer to this on here or on google.
This is part of the main table
+---+------+-----------------+---------------+
|   | Acct | Last_trans_date | Last_transpay |
+---+------+-----------------+---------------+
| 1 | ABC  | July 31         | Nov 5         |
| 2 | DEF  | Mar 1           | Aug 8         |
| 3 | GFH  | Mar 9           | Feb 7         |
+---+------+-----------------+---------------+
I want a count of the accounts whose last_trans_date and Last_transpay both fall in the previous month.
I used this
Select
year(open),
sum(case when month(last_trans_date) = month(current date - 1) and month(last_transpay) = month(current_date - 1) then 1 else 0 end) as activity
from table
group by 1
I don't think it's outputting the correct amount
SELECT Count(*)
FROM [table]
WHERE
CHARINDEX(@PrevMonth, Last_trans_date) = 1
AND CHARINDEX(@PrevMonth, Last_transpay) = 1
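If the two columns are stored as real dates rather than text like "July 31", the "previous month" test can compare year and month together. The sketch below is one possible way to do that, run against in-memory SQLite from Python. This is a different technique from the CHARINDEX query above, and the table name accounts, the sample dates, and the fixed reference date are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table accounts (acct text, last_trans_date text, last_transpay text)"
)
# Hypothetical rows with ISO dates; only ABC has both dates in March 2021.
conn.executemany("insert into accounts values (?, ?, ?)", [
    ("ABC", "2021-03-31", "2021-03-05"),
    ("DEF", "2021-03-01", "2021-02-08"),
    ("GFH", "2020-03-09", "2021-03-07"),  # same month number, different year
])

# Fixed reference date so the result is reproducible; in practice this
# would be the current date.
today = "2021-04-15"

# 'start of month' followed by '-1 month' yields the first day of the
# previous month; formatting as %Y-%m keeps year and month together.
query = """
select count(*)
from accounts
where strftime('%Y-%m', last_trans_date)
        = strftime('%Y-%m', :today, 'start of month', '-1 month')
  and strftime('%Y-%m', last_transpay)
        = strftime('%Y-%m', :today, 'start of month', '-1 month')
"""
activity = conn.execute(query, {"today": today}).fetchone()[0]
print(activity)
```

Comparing only the month number would also count the GFH row, whose transaction date is in March of an earlier year, which may be one reason a month()-only comparison returns an unexpected total.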

Pivot table in SQL but keep measure names in column

I'm having trouble pivoting a table correctly.
My input is this raw data table:
+------+---------+------------+----------+
| YEAR | FACULTY | ADMISSIONS | DROPOUTS |
+------+---------+------------+----------+
| 2018 | LAW | 15 | 2 |
| 2019 | LAW | 18 | 4 |
| 2020 | LAW | 11 | 1 |
| 2018 | MATH | 19 | 1 |
| 2019 | MATH | 17 | 6 |
| 2020 | MATH | 24 | 5 |
+------+---------+------------+----------+
I want to pivot the years into columns, but I also want to keep the measure names (admissions and dropouts) as row labels. That is, I want a table like this:
+---------+------------+------+------+------+
| FACULTY | MEASURE | 2018 | 2019 | 2020 |
+---------+------------+------+------+------+
| LAW | ADMISSIONS | 15 | 18 | 11 |
| LAW | DROPOUTS | 2 | 4 | 1 |
| MATH | ADMISSIONS | 19 | 17 | 24 |
| MATH | DROPOUTS | 1 | 6 | 5 |
+---------+------------+------+------+------+
I can pivot years using:
SELECT *
FROM
(
SELECT FACULTY, YEAR, ADMISSIONS, DROPOUTS
FROM TABLE
)
PIVOT (SUM (ADMISSIONS)
FOR YEAR IN (2018,2019,2020)
)
But I need to pivot both measures and still get the measure names column. Any ideas?
That's unpivoting, then pivoting. If your database supports lateral joins and values(), you can do:
select
t.faculty,
x.measure,
sum(case when t.year = 2018 then x.value end) value_2018,
sum(case when t.year = 2019 then x.value end) value_2019,
sum(case when t.year = 2020 then x.value end) value_2020
from mytable t
cross apply (values ('admission', t.admissions), ('dropout', t.dropouts)) as x(measure, value)
group by t.faculty, x.measure
I would unpivot using apply (assuming you are using SQL Server) and reaggregate:
select t.faculty, v.measure,
max(case when year = 2018 then val end) as [2018],
max(case when year = 2019 then val end) as [2019],
max(case when year = 2020 then val end) as [2020]
from t cross apply
(values ('ADMISSIONS', ADMISSIONS), ('DROPOUTS', DROPOUTS)
) v(measure, val)
group by t.faculty, v.measure
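For databases without cross apply or lateral joins, the same unpivot-then-pivot idea can be sketched with union all. The example below runs in in-memory SQLite from Python on the question's sample data; the table name enrollment and the union all unpivot are assumptions for illustration and are not taken from either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table enrollment (year int, faculty text, admissions int, dropouts int)"
)
conn.executemany("insert into enrollment values (?, ?, ?, ?)", [
    (2018, "LAW", 15, 2), (2019, "LAW", 18, 4), (2020, "LAW", 11, 1),
    (2018, "MATH", 19, 1), (2019, "MATH", 17, 6), (2020, "MATH", 24, 5),
])

# Step 1 (inner query): unpivot, one row per faculty, year and measure.
# Step 2 (outer query): pivot the years back into columns with
# conditional aggregation.
query = """
select faculty, measure,
       max(case when year = 2018 then val end) as "2018",
       max(case when year = 2019 then val end) as "2019",
       max(case when year = 2020 then val end) as "2020"
from (
    select faculty, year, 'ADMISSIONS' as measure, admissions as val
    from enrollment
    union all
    select faculty, year, 'DROPOUTS' as measure, dropouts as val
    from enrollment
) u
group by faculty, measure
order by faculty, measure
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because each faculty, measure and year combination has exactly one value here, max() and sum() give the same result; sum() would be the safer choice if several rows could share a combination.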

SQL Query to Display Daily Count Results in Columns from 1st to last day of month

I need a query that displays the daily count of each item bought by customers, with one column per day from the first to the last day of the month.
Sample data table "Item"
+---------------+-----------+-----------+-------+
| Purchase Date | Item Code | Item Name | Price |
+---------------+-----------+-----------+-------+
| 01-JAN-20     | 11        | Apple     | 1     |
| 01-JAN-20     | 11        | Apple     | 1     |
| 02-JAN-20     | 12        | Orange    | 2     |
| 02-JAN-20     | 11        | Apple     | 1     |
| 03-JAN-20     | 12        | Orange    | 2     |
| 03-JAN-20     | 12        | Orange    | 2     |
| 04-JAN-20     | 12        | Orange    | 2     |
| 04-JAN-20     | 11        | Apple     | 1     |
+---------------+-----------+-----------+-------+
The query should count purchases per day by item code and display the result as in the table below.
Each day should get its own column. For example, if today is the 4th of January, then tomorrow's counts should appear in a new column, and so on until the last day of the month (or something similar).
+--------+--------+--------+--------+--------+
| Items  | Jan 01 | Jan 02 | Jan 03 | Jan 04 | etc.
+--------+--------+--------+--------+--------+
| Apple  | 2      | 1      | 2      | 1      |
| Orange | 0      | 1      | 0      | 1      |
+--------+--------+--------+--------+--------+
If you know what dates you want, you can use conditional aggregation:
select item,
sum(case when purchase_date = '2020-01-01' then 1 else 0 end) as jan_1,
sum(case when purchase_date = '2020-01-02' then 1 else 0 end) as jan_2,
sum(case when purchase_date = '2020-01-03' then 1 else 0 end) as jan_3,
sum(case when purchase_date = '2020-01-04' then 1 else 0 end) as jan_4,
. . .
from items
group by item;
Note that this assumes purchase_date is stored as a true date type, so each comparison is against a date constant. The syntax for date constants differs among databases.
If you do not have a specific set of dates in mind, then you will need to use dynamic SQL.
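As a rough illustration of the dynamic SQL route, the sketch below builds the conditional-aggregation columns from whatever dates are present in the table and then runs the generated query. It uses in-memory SQLite from Python; the table name item, the snake_case column names, the ISO date strings, and the d_YYYY_MM_DD column aliases are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table item (purchase_date text, item_code int, item_name text, price int)"
)
conn.executemany("insert into item values (?, ?, ?, ?)", [
    ("2020-01-01", 11, "Apple", 1), ("2020-01-01", 11, "Apple", 1),
    ("2020-01-02", 12, "Orange", 2), ("2020-01-02", 11, "Apple", 1),
    ("2020-01-03", 12, "Orange", 2), ("2020-01-03", 12, "Orange", 2),
    ("2020-01-04", 12, "Orange", 2), ("2020-01-04", 11, "Apple", 1),
])

# Collect the dates that should become columns.
dates = [r[0] for r in conn.execute(
    "select distinct purchase_date from item order by purchase_date"
)]

# One sum(case ...) column per date. The date value is bound as a
# parameter; only the alias is spliced into the SQL text.
parts = []
for d in dates:
    alias = "d_" + d.replace("-", "_")
    parts.append(f"sum(case when purchase_date = ? then 1 else 0 end) as {alias}")

query = (
    "select item_name, " + ", ".join(parts) +
    " from item group by item_name order by item_name"
)
cursor = conn.execute(query, dates)
columns = [c[0] for c in cursor.description]
rows = cursor.fetchall()
print(columns)
for row in rows:
    print(row)
```

The aliases are derived from values read out of the table, so this is only safe when those values are trusted dates; a production version would validate them before building the SQL text.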

How to get last value for each user_id (postgreSQL)

A user's current ratio is the last ratio inserted for that user in the table "Ratio History":
user_id | year | month | ratio
For example if user with ID 1 has two rows
1 | 2019 | 2 | 10
1 | 2019 | 3 | 15
his ratio is 15.
Here is a slice of the development table:
user_id | year | month | ratio
1 | 2018 | 7 | 10
2 | 2018 | 8 | 20
3 | 2018 | 8 | 30
1 | 2019 | 1 | 40
2 | 2019 | 2 | 50
3 | 2018 | 10 | 60
2 | 2019 | 3 | 70
I need a query that returns one row per user_id with that user's last ratio.
For the data above, the query should return the following rows:
user_id | year | month | ratio
1 | 2019 | 1 | 40
2 | 2019 | 3 | 70
3 | 2018 | 10 | 60
I tried this query:
select rh1.user_id, ratio, rh1.year, rh1.month from ratio_history rh1
join (
select user_id, max(year) as maxYear, max(month) as maxMonth
from ratio_history group by user_id
) rh2 on rh1.user_id = rh2.user_id and rh1.year = rh2.maxYear and rh1.month = rh2.maxMonth
but I got only one row.
Use distinct on:
select distinct on (user_id) rh.*
from ratio_history rh
order by user_id, year desc, month desc;
distinct on is a very convenient Postgres extension. It returns one row for each combination of the key values in parentheses. Which row? The first one according to the sort criteria. Note that the sort criteria need to start with the expressions in parentheses.
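The sketch below shows why the join on max(year) and max(month) returns a single row, and gives a portable equivalent of distinct on using row_number(). It runs in in-memory SQLite from Python, which has no distinct on, so the row_number() query is a substitute technique rather than the Postgres query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ratio_history (user_id int, year int, month int, ratio int)"
)
conn.executemany("insert into ratio_history values (?, ?, ?, ?)", [
    (1, 2018, 7, 10), (2, 2018, 8, 20), (3, 2018, 8, 30), (1, 2019, 1, 40),
    (2, 2019, 2, 50), (3, 2018, 10, 60), (2, 2019, 3, 70),
])

# The attempt from the question: max(year) and max(month) are computed
# independently, so for user 1 they give (2019, 7), a pair that matches
# no actual row.
attempt = conn.execute("""
    select rh1.user_id, rh1.year, rh1.month, ratio
    from ratio_history rh1
    join (
        select user_id, max(year) as maxYear, max(month) as maxMonth
        from ratio_history group by user_id
    ) rh2 on rh1.user_id = rh2.user_id
         and rh1.year = rh2.maxYear and rh1.month = rh2.maxMonth
""").fetchall()

# Rank each user's rows from newest to oldest and keep the newest one.
latest = conn.execute("""
    select user_id, year, month, ratio
    from (
        select rh.*,
               row_number() over (
                   partition by user_id order by year desc, month desc
               ) as rn
        from ratio_history rh
    ) ranked
    where rn = 1
    order by user_id
""").fetchall()

print(attempt)
print(latest)
```

Only user 3 survives the original join, because that is the only user whose largest month happens to fall in their largest year.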