Combining multiple rows of data to one row per id - sql

I have raw data with multiple dates per category, and I use case when category = 'referral' then min(date) end as date_referral to get the earliest date of each category per id.
However, this does not return the data in a single row; it creates one row per category, like this:
id date_entered date_referral date_reply date_final
-------------------------------------------------------------------------
1 2020-12-20 null null null
1 2020-12-20 2020-12-21 null null
1 2020-12-20 null 2020-12-21 null
1 2020-12-20 null null 2020-12-24
I tried enforcing single rows by using distinct or group by (separately and together):
select distinct id
, date_entered
, case when category = 'referral' then min(date) end as date_referral
, case when category = 'reply' then min(date) end as date_reply
, case when category = 'final' then min(date) end as date_final
from data
group by id
, date_entered
, category
but it keeps returning multiple rows, each holding the earliest date for one category. I also tried wrapping this code in a CTE and then selecting distinct id, date_entered, date_referral, date_reply, date_final from it, but that still returns multiple rows.
How can I combine these rows and make it return one single row?

You should not group by category.
Use conditional aggregation like this:
select id, date_entered,
min(case when category = 'referral' then date end) as date_referral,
min(case when category = 'reply' then date end) as date_reply,
min(case when category = 'final' then date end) as date_final
from data
group by id, date_entered
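To see why moving the CASE inside MIN() collapses the rows, here is a minimal sketch that runs the query above against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the choice of SQLite are assumptions for illustration only; the original data and engine are not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table data (id int, date_entered text, category text, date text)")
# Hypothetical sample rows: two referral dates, one reply date, one final date.
conn.executemany(
    "insert into data values (?, ?, ?, ?)",
    [
        (1, "2020-12-20", "referral", "2020-12-21"),
        (1, "2020-12-20", "referral", "2020-12-23"),
        (1, "2020-12-20", "reply", "2020-12-21"),
        (1, "2020-12-20", "final", "2020-12-24"),
    ],
)

# The CASE yields NULL for rows of other categories, and MIN() ignores NULLs,
# so each column picks the earliest date of its own category only.
rows = conn.execute(
    """
    select id, date_entered,
           min(case when category = 'referral' then date end) as date_referral,
           min(case when category = 'reply' then date end) as date_reply,
           min(case when category = 'final' then date end) as date_final
    from data
    group by id, date_entered
    """
).fetchall()
print(rows)
```

Because category is no longer in the GROUP BY, the four input rows end up in one group and the result is a single row per id and date_entered.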

Related

CASE WHEN condition with MAX() function

There are a lot of questions on the CASE WHEN topic, but the closest to mine is How to use CASE WHEN condition with MAX() function, which has not been resolved.
Here is some of my sample data:
date debet
-------------------------
2022-07-15 57190.33
2022-07-14 815616516.00
2022-07-15 40866.67
2022-07-14 1221510.00
So, I want to get all records for the last two dates plus three additional columns: sum(sales) for the previous day, the sum for the current day, and the difference between them:
SELECT
[debet],
[date] ,
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END ) AS sum_act,
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END ) AS sum_prev ,
(
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END )
-
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END )
) AS diff
FROM
Table
WHERE
[date] = ( SELECT MAX(date) FROM Table WHERE date < ( SELECT MAX(date) FROM Table) )
OR
[date] = ( SELECT MAX(date) FROM Table WHERE date = ( SELECT MAX(date) FROM Table ) )
GROUP BY
[date],
[debet]
Of course, this fails with an error saying that I can't use an aggregate function inside CASE WHEN. For now I use this combination: sum(CASE WHEN [date] = dateadd(dd,-3,cast(getdate() as date)) THEN [debet] ELSE 0 END). But then I have to adjust it every time for weekends and holidays. The question is: is there any way other than using getdate() in the CASE WHEN expression to get the max date?
Expected result:
date sum_act sum_prev diff
-------------------------------------------------
2022-07-15 97190.33 0.00 97190.33
2022-07-14 0.00 508769.96 -508769.96
You can use dense_rank() to filter the last 2 dates in your table. After that, you can use a conditional case expression with sum() to calculate the required values:
select [date],
sum_act = sum(case when rn = 1 then [debet] else 0 end),
sum_prev = sum(case when rn = 2 then [debet] else 0 end),
diff = sum(case when rn = 1 then [debet] else 0 end)
- sum(case when rn = 2 then [debet] else 0 end)
from
(
select *, rn = dense_rank() over (order by [date] desc)
from tbl
) t
where rn <= 2
group by [date]
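The dense_rank() approach can be tried out with a small sketch like the one below. It is an assumption-laden illustration: it uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than SQL Server, made-up rows, and "as" aliases instead of the SQL Server "alias = expression" form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (date text, debet real)")
# Hypothetical rows covering three dates; only the last two should survive.
conn.executemany(
    "insert into tbl values (?, ?)",
    [
        ("2022-07-13", 10.0),
        ("2022-07-14", 20.0),
        ("2022-07-14", 5.0),
        ("2022-07-15", 30.0),
        ("2022-07-15", 12.5),
    ],
)

# dense_rank() gives every row of the latest date rn = 1, the date before it
# rn = 2, and so on, regardless of calendar gaps such as weekends.
rows = conn.execute(
    """
    select date,
           sum(case when rn = 1 then debet else 0 end) as sum_act,
           sum(case when rn = 2 then debet else 0 end) as sum_prev,
           sum(case when rn = 1 then debet else 0 end)
         - sum(case when rn = 2 then debet else 0 end) as diff
    from (
        select date, debet, dense_rank() over (order by date desc) as rn
        from tbl
    ) t
    where rn <= 2
    group by date
    order by date desc
    """
).fetchall()
print(rows)
```

Since the rank is based on the dates that actually occur in the table, no adjustment for weekends or holidays is needed.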
Two steps:
Get the sums for the last three dates
Show the results for the last two dates.
Well, we could also get all daily sums in step 1, but we just need the last three in order to calculate the sums for the last two days, so why aggregate more data than necessary?
Here is the query. You may have to put the date column name in brackets in SQL Server, as date is a keyword in SQL.
select top(2)
date,
sum_debit_current,
sum_debit_previous,
sum_debit_current - sum_debit_previous as diff
from
(
select
date,
sum(debet) as sum_debit_current,
lag(sum(debet)) over (order by date) as sum_debit_previous
from table
where date in (select distinct top(3) date from table order by date desc)
group by date
) daily_sums
order by date desc;
(SQL Server uses TOP(n) instead of standard SQL FETCH FIRST 3 ROWS and while SELECT DISTINCT TOP(3) date looks like "get the top 3 rows, then apply distinct on their date", it is really "apply distinct on the dates, then get the top 3" like in standard SQL.)
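As a rough sketch of the two-step idea, the following runs an equivalent query on SQLite (3.25 or later) via Python's sqlite3 module. Several things here are assumptions and not part of the answer: SQLite has no TOP, so LIMIT is swapped in; the table is called tbl; and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (date text, debet real)")
conn.executemany(
    "insert into tbl values (?, ?)",
    [
        ("2022-07-12", 99.0),  # older than the last three dates, filtered out
        ("2022-07-13", 10.0),
        ("2022-07-14", 20.0),
        ("2022-07-14", 5.0),
        ("2022-07-15", 30.0),
        ("2022-07-15", 12.5),
    ],
)

# Step 1 (inner query): daily sums for the last three dates, with lag()
# fetching the previous day's sum.
# Step 2 (outer query): keep only the last two dates.
rows = conn.execute(
    """
    select date,
           sum_debit_current,
           sum_debit_previous,
           sum_debit_current - sum_debit_previous as diff
    from (
        select date,
               sum(debet) as sum_debit_current,
               lag(sum(debet)) over (order by date) as sum_debit_previous
        from tbl
        where date in (select distinct date from tbl order by date desc limit 3)
        group by date
    ) daily_sums
    order by date desc
    limit 2
    """
).fetchall()
print(rows)
```

The third date is only there so that lag() has a previous value for the second-to-last date; it never appears in the output.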

Conditional CASE WHEN select snowflake SQL

I am stuck on a conditional Snowflake SELECT. I am trying to count the IDs that have the corresponding categorical value. I would appreciate some help.
Thanks
SELECT
YEAR(DATETIME) AS YEAR,
WEEKOVERYEAR(DATETIME) AS WEEK,
COUNT(CASE WHEN ID THEN CATEGORY = 'A')
from table
group by week, year;
Here is one method:
SELECT YEAR(DATETIME) AS YEAR,
WEEKOFYEAR(DATETIME) AS WEEK,
SUM(CASE WHEN CATEGORY = 'A' THEN 1 ELSE 0 END) as num_a
FROM table
GROUP BY week, year;
Snowflake supports COUNT_IF:
Returns the number of records that satisfy a condition.
Aggregate function
COUNT_IF( <condition> )
SELECT YEAR(DATETIME) AS YEAR,
WEEKOFYEAR(DATETIME) AS WEEK,
COUNT_IF(CATEGORY = 'A') AS num_a
FROM tab
GROUP BY week, year;
You can use IFF() here, since CASE WHEN is better suited to situations with multiple conditions.
SELECT
YEAR(DATETIME) AS YEAR,
WEEKOFYEAR(DATETIME) AS WEEK,
COUNT(IFF(CATEGORY = 'A',ID,NULL)) as count
from table
group by week, year;
COUNT(expr) counts the number of rows for which expr is not null.
If you want to count the rows where ID is not null AND CATEGORY = 'A', then
COUNT(CASE WHEN ID IS NOT NULL AND CATEGORY = 'A' THEN TRUE ELSE NULL END)
will give you that, or you can use a SUM like in Gordon's answer
SUM(CASE WHEN ID IS NOT NULL AND CATEGORY = 'A' THEN 1 ELSE 0 END)
or you can use the snowflake IFF as a shorter form for the same thing, which is how I do it
SUM( IFF( ID IS NOT NULL AND CATEGORY = 'A', 1, 0))
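The difference between these variants only shows up when ID can be NULL. The sketch below illustrates that with plain CASE expressions on SQLite via Python's sqlite3 module; the events table and its rows are hypothetical, and Snowflake-specific functions such as IFF and COUNT_IF are deliberately left out because SQLite does not share them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table events (id int, category text)")
# Hypothetical rows: three in category 'A', one of which has a NULL id.
conn.executemany(
    "insert into events values (?, ?)",
    [(1, "A"), (2, "A"), (None, "A"), (3, "B")],
)

count_rows, count_ids, sum_ids = conn.execute(
    """
    select
        -- counts every row of category 'A', even if id is NULL
        count(case when category = 'A' then 1 end),
        -- counts only rows of category 'A' whose id is not NULL
        count(case when category = 'A' then id end),
        -- same result as the previous line, written as a SUM
        sum(case when id is not null and category = 'A' then 1 else 0 end)
    from events
    """
).fetchone()
print(count_rows, count_ids, sum_ids)
```

With this data the first expression gives 3 and the other two give 2, so which form is right depends on whether rows with a NULL id should be counted.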

SQL: Query the same column 3 times with 3 different where clauses

I'm trying to show a table with 3 columns, each holding a price that needs to be displayed. The columns are differentiated by 'price_type', and there are 3 different price types.
It's probably something obvious I'm missing, but I'm after something like:
Select price as 'current', price as '10min', price as '30min'
from table
where Price_Type(current) = 'current' AND Price_Type(10min) = '10min' AND
Price_Type(30min) = '30min'
Order by date desc
I'm not sure what the actual syntax would be, but any help is appreciated.
With conditional aggregation:
select date,
max(case when Price_Type = 'current' then price end) as [current],
max(case when Price_Type = '10min' then price end) as [10min],
max(case when Price_Type = '30min' then price end) as [30min]
from table
group by date
order by date desc

How to sum count on specific columns in sql

I want to calculate a sum over specific counts:
select is_known_bot, count(*)
FROM "public"."bus_request"
where app_name = 'xxxxxx' and event_type <> 'browser_js'
and is_known_bot <>''
and date <= GETDATE() and date>= GETDATE()-14
group by is_known_bot
order by is_known_bot asc
I am getting the below table:
is_known_bot count
good 2
bad 3
Human 7
In the end, I want to get the table below:
is_known_bot count
bot 5
Human 7
You can use a CASE expression instead of the column is_known_bot, both in the select list and in the group by:
case when is_known_bot = 'Human' then is_known_bot else 'bot' end
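A minimal sketch of how that CASE expression could be used for grouping, run on SQLite through Python's sqlite3 module. The rows are made up to match the counts in the question, the other filters from the original query are omitted, and the alias bot_group is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table bus_request (is_known_bot text)")
# Hypothetical rows matching the counts in the question: 2 good, 3 bad, 7 Human.
conn.executemany(
    "insert into bus_request values (?)",
    [("good",)] * 2 + [("bad",)] * 3 + [("Human",)] * 7,
)

# The same CASE expression appears in the select list and in the group by,
# so 'good' and 'bad' fall into one group labelled 'bot'.
rows = conn.execute(
    """
    select case when is_known_bot = 'Human' then is_known_bot else 'bot' end as bot_group,
           count(*)
    from bus_request
    group by case when is_known_bot = 'Human' then is_known_bot else 'bot' end
    order by bot_group
    """
).fetchall()
print(rows)
```

This returns one row for Human with 7 and one row for bot with 5.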

How I can group by and count in PostgreSQL to prevent empty cells in result

I have a table in a PostgreSQL DB.
I need to calculate the SUM of counts for each event_type (for example, 4 and 1).
When I use a query like this:
SELECT account_id, date,
CASE
WHEN event_type = 1 THEN SUM(count)
ELSE null
END AS shows,
CASE
WHEN event_type = 4 THEN SUM(count)
ELSE null
END AS clicks
FROM widgetstatdaily WHERE account_id = 272 AND event_type = 1 OR event_type = 4 GROUP BY account_id, date, event_type ORDER BY date
I receive a table with <null> fields. That's because I have event_type in the select, so I need to GROUP BY it.
How can I write the query so that the result is grouped by account_id and date, without nulls in the cells? For example (first row):
272 2018-03-28 00:00:00.000000 57 2
Maybe I can group it after receiving the result?
You need conditional aggregation and some other fixes. Try this:
SELECT account_id, date,
SUM(CASE WHEN event_type = 1 THEN count END) as shows,
SUM(CASE WHEN event_type = 4 THEN count END) as clicks
FROM widgetstatdaily
WHERE account_id = 272 AND
event_type IN (1, 4)
GROUP BY account_id, date
ORDER BY date;
Notes:
The CASE expression should be an argument to the SUM().
The ELSE NULL is redundant. The default without an ELSE is NULL.
The logic in the WHERE clause is probably not what you intend. That is fixed using IN.
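The last note is about operator precedence: AND binds more tightly than OR, so the original filter is read as (account_id = 272 AND event_type = 1) OR event_type = 4. A small sketch on SQLite via Python's sqlite3 module shows the effect; the rows, including the second account 999, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table widgetstatdaily (account_id int, event_type int)")
# Hypothetical rows for two accounts.
conn.executemany(
    "insert into widgetstatdaily values (?, ?)",
    [(272, 1), (272, 4), (272, 2), (999, 4), (999, 1)],
)

# Original filter: the OR branch lets event_type = 4 rows of ANY account through.
loose = conn.execute(
    """
    select account_id, event_type from widgetstatdaily
    where account_id = 272 and event_type = 1 or event_type = 4
    order by account_id, event_type
    """
).fetchall()

# Fixed filter: IN keeps both event types tied to account 272.
strict = conn.execute(
    """
    select account_id, event_type from widgetstatdaily
    where account_id = 272 and event_type in (1, 4)
    order by account_id, event_type
    """
).fetchall()
print(loose)
print(strict)
```

The first query also returns the (999, 4) row, while the second returns only rows of account 272.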
Try this:
SELECT account_id, date,
SUM(CASE WHEN event_type = 1 THEN count else 0 END) as shows,
SUM(CASE WHEN event_type = 4 THEN count else 0 END) as clicks
FROM widgetstatdaily
WHERE account_id = 272 AND
event_type IN (1, 4)
GROUP BY account_id, date
ORDER BY date;
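The only difference between the two answers is the else 0. A minimal sketch of what it changes, run on SQLite through Python's sqlite3 module with hypothetical rows in which one date has shows but no clicks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table widgetstatdaily (account_id int, date text, event_type int, count int)"
)
# Hypothetical rows: 2018-03-29 has shows (event_type 1) but no clicks (event_type 4).
conn.executemany(
    "insert into widgetstatdaily values (?, ?, ?, ?)",
    [
        (272, "2018-03-28", 1, 57),
        (272, "2018-03-28", 4, 2),
        (272, "2018-03-29", 1, 40),
    ],
)

rows = conn.execute(
    """
    select date,
           -- without ELSE: a group with no matching rows sums only NULLs -> NULL
           sum(case when event_type = 4 then count end) as clicks_or_null,
           -- with ELSE 0: the same group sums zeros -> 0
           sum(case when event_type = 4 then count else 0 end) as clicks_or_zero
    from widgetstatdaily
    where account_id = 272 and event_type in (1, 4)
    group by account_id, date
    order by date
    """
).fetchall()
print(rows)
```

For 2018-03-29 the first column comes back as NULL (None in Python) and the second as 0, so the choice depends on whether "no clicks recorded" should read as missing or as zero.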