How do you use having for multiple conditions? - sql

In the following code, how do you exclude members whose spending is larger than $500 within each year (rather than their total spending across all years)?
select
Year
,month
,memberkey
,sum(spending) as spending
from table1
group by
1,2,3

A HAVING clause won't work here since you really want to aggregate at the YEAR level to determine which records should be included. Traditionally you would do this with a correlated subquery, but in Teradata you can make use of the QUALIFY clause:
SELECT "Year"
,"Month"
,MemberKey
,spending
from table1
QUALIFY sum(spending) OVER (PARTITION BY "Year", MemberKey) <= 500
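QUALIFY is not available in every database. Where it is missing, the same filter can be sketched by computing the windowed sum in a derived table and filtering on it in the outer query. The following is only a minimal sketch: Python's built-in sqlite3 module is used purely as a convenient way to run the SQL, the sample rows and the `yearly_total` alias are made up for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data: (year, month, memberkey, spending).
ROWS = [
    (2022, 1, "A", 300), (2022, 2, "A", 300),   # A spends 600 in 2022 -> excluded
    (2023, 1, "A", 100),                        # A spends 100 in 2023 -> kept
    (2022, 1, "B", 200), (2022, 3, "B", 250),   # B spends 450 in 2022 -> kept
]

QUERY = """
SELECT year, month, memberkey, SUM(spending) AS spending
FROM (
    SELECT year, month, memberkey, spending,
           -- per-member, per-year total attached to every row (assumed alias)
           SUM(spending) OVER (PARTITION BY year, memberkey) AS yearly_total
    FROM table1
) t
WHERE yearly_total <= 500
GROUP BY year, month, memberkey
ORDER BY year, month, memberkey
"""

def monthly_spending_under_yearly_cap(rows):
    """Return monthly sums for members whose yearly total is at most 500."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (year INT, month INT, memberkey TEXT, spending REAL)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Calling `monthly_spending_under_yearly_cap(ROWS)` drops member A's 2022 rows but keeps A's 2023 row, because the cap is evaluated per year rather than over all years.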

Related

How to conditional SQL select

My table consists of user_id, revenue, publish_month columns.
Right now I use group_by user_id and sum(revenue) to get revenue for all individual users.
Is there a single SQL query I can use to query for user revenue across a time period conditionally? If for a specific user, there is a row for this month, I want to query for this month, last month and the month before. If there is not yet a row for this month, I want to query for last month and the two months before.
Any advice on which approach to take would be helpful. Should I be using CASE expressions, IF/ELSE with EXISTS, or is this doable with a single SQL query?
UPDATE: Since I did a bad job of describing the question, I've added some example data and expected results.
Where current month is not present for user 33
Where current month is present
Assuming publish_month is a DATE datatype, this should get the most recent three months of data per user...
SELECT
user_id, SUM(revenue) as s_revenue
FROM
(
SELECT
user_id, revenue, publish_month,
MAX(publish_month) OVER (PARTITION BY user_id) AS user_latest_publish_month
FROM
yourtableyoudidnotname
)
summarised
WHERE
publish_month >= DATEADD(month, -2, user_latest_publish_month)
GROUP BY
user_id
If you want to limit that to the most recent 3 months out of the last 4 calendar months, just add AND publish_month >= DATEADD(month, -3, DATE_TRUNC(month, GETDATE()))
The ambiguity here is why it is important to include a Minimal Reproducible Example.
With input data and required results, we could test our code against your requirements.
If you're using strings for the publish_month, you shouldn't be, and should fix that with utmost urgency.
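The "latest month per user" approach from this answer can be tried out with a small script. This is a hedged sketch, not the answerer's exact code: it runs on SQLite through Python's sqlite3 module, so `DATEADD` is swapped for SQLite's `date(..., '-2 months')` modifier, and because SQLite has no DATE type the dates are stored as ISO-8601 text for this demo only. The table name `revenue_by_month` and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (user_id, revenue, publish_month as first day of month).
ROWS = [
    (33, 10, "2023-01-01"), (33, 20, "2023-02-01"),
    (33, 30, "2023-03-01"), (33, 40, "2023-04-01"),
    (34, 5, "2023-01-01"), (34, 7, "2023-03-01"), (34, 9, "2023-05-01"),
]

QUERY = """
SELECT user_id, SUM(revenue) AS s_revenue
FROM (
    SELECT user_id, revenue, publish_month,
           MAX(publish_month) OVER (PARTITION BY user_id) AS user_latest_publish_month
    FROM revenue_by_month
) summarised
-- keep the user's latest month and the two calendar months before it
WHERE publish_month >= date(user_latest_publish_month, '-2 months')
GROUP BY user_id
ORDER BY user_id
"""

def recent_three_month_revenue(rows):
    """Sum revenue over each user's latest month and the two months before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revenue_by_month (user_id INT, revenue INT, publish_month TEXT)"
    )
    conn.executemany("INSERT INTO revenue_by_month VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Note that user 34 has no row for April, so only two rows fall inside that user's three-month window. The window is defined by calendar months, not by the number of rows.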
You can use a windowing function to "number" the months. In this way the most recent one will have a value of 1, the prior 2, and the one before 3. Then you can only select the items with a number of 3 or less.
Here is how:
SELECT user_id, revenue, publish_month,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY publish_month DESC) as RN
FROM yourtableyoudidnotname
Now you just select the rows with RN of 3 or less and do your sum:
SELECT user_id, SUM(revenue) as s_revenue
FROM (
SELECT user_id, revenue, publish_month,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY publish_month DESC) as RN
FROM yourtableyoudidnotname
) X
WHERE RN <= 3
GROUP BY user_id
You could also do this without a sub query if you use the windowing function for SUM and a range, but I think this is easier to understand.
From the comments: there could be an issue if publish_month holds only the month number and you have months from more than one year. To solve this, make the biggest number in the ORDER BY always the most recent. So instead of
ORDER BY publish_month DESC
you would have
ORDER BY (100*publish_year)+publish_month DESC
This means more recent years will always have a higher number, so January of 2023 will be 202301 while December of 2022 will be 202212. Since January has the bigger number, it will get a row number of 1 and December will get a row number of 2.
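The effect of the combined year and month sort key can be checked with a short script. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or later), and the table name `monthly_revenue`, the separate `publish_year` and `publish_month` integer columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: (user_id, revenue, publish_year, publish_month).
ROWS = [
    (1, 5, 2022, 11), (1, 10, 2022, 12), (1, 20, 2023, 1), (1, 30, 2023, 2),
]

QUERY_TEMPLATE = """
SELECT user_id, SUM(revenue) AS s_revenue
FROM (
    SELECT user_id, revenue,
           ROW_NUMBER() OVER (
               PARTITION BY user_id ORDER BY {order_expr} DESC
           ) AS rn
    FROM monthly_revenue
) x
WHERE rn <= 3
GROUP BY user_id
"""

def latest_three_sum(rows, order_expr):
    """Sum the three rows ranked first by order_expr (descending) per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE monthly_revenue "
        "(user_id INT, revenue INT, publish_year INT, publish_month INT)"
    )
    conn.executemany("INSERT INTO monthly_revenue VALUES (?, ?, ?, ?)", rows)
    # order_expr comes from the fixed strings below, never from user input.
    result = conn.execute(QUERY_TEMPLATE.format(order_expr=order_expr)).fetchall()
    conn.close()
    return result

# Month number alone ranks Dec 2022 and Nov 2022 ahead of Jan and Feb 2023.
naive = latest_three_sum(ROWS, "publish_month")
# Year and month combined ranks Feb 2023, Jan 2023, Dec 2022 first.
fixed = latest_three_sum(ROWS, "(100 * publish_year) + publish_month")
```

With this data, ordering by the month number alone picks December, November and February, while the combined key picks the three most recent months.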

not counting duplicate ids month by month

I want to count the number of ids, but in a special way. For example, if an id is counted in April, I don't want to count that id again in May. In other words, I want to exclude ids that have already been counted in previous months.
This is the query I am using:
select store, monthname(created_at), count(distinct customer_id) from a group by 1,2;
Hope you can help me.
Thanks in advance!
Just count the first time a customer is seen. One method is two levels of aggregation:
select store, date_trunc('month', min_created_at), count(*)
from (select store, customer_id, min(created_at) as min_created_at
from a
group by store, customer_id
) c
group by 1, 2;
Note: monthname() is not appropriate for defining a month, because it does not take the year into account. If you really do want to ignore the year, you can use monthname() but that seems unusual.
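The two levels of aggregation can be sketched end to end as follows. This is an illustration only: it runs on SQLite through Python's sqlite3 module, so `date_trunc('month', ...)` is replaced by `strftime('%Y-%m', ...)`, which keeps the year as recommended above. The sample rows and the `first_month` and `new_customers` aliases are assumptions.

```python
import sqlite3

# Hypothetical sample data: (store, customer_id, created_at).
ROWS = [
    ("S1", 1, "2023-04-10"), ("S1", 1, "2023-05-02"),  # customer 1 returns in May
    ("S1", 2, "2023-05-15"),
    ("S1", 3, "2023-04-20"),
    ("S2", 1, "2023-05-01"),
]

QUERY = """
SELECT store,
       strftime('%Y-%m', min_created_at) AS first_month,
       COUNT(*) AS new_customers
FROM (
    -- inner level: first time each customer is seen in each store
    SELECT store, customer_id, MIN(created_at) AS min_created_at
    FROM a
    GROUP BY store, customer_id
) c
GROUP BY store, first_month
ORDER BY store, first_month
"""

def new_customers_per_month(rows):
    """Count each customer only in the month they first appear, per store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (store TEXT, customer_id INT, created_at TEXT)")
    conn.executemany("INSERT INTO a VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Customer 1's May purchase in store S1 is not counted again, because only the earliest `created_at` per store and customer survives the inner query.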
Hope this sql statement helps!
select store, monthname(created_at), count(distinct customer_id)
from a
where monthname(created_at)=MONTHNAME(current_date())
group by 1,2;

Is there a way to count how many strings in a specific column are seen for the 1st time?

Is there a way to count how many strings in a specific column are seen for the 1st time?
The value in column 2 is sometimes repeated, because some clients make several transactions at different times (a client can make a transaction in the 1st month and then another in the next year).
Is there a way for me to count how many IDs are completely new per month through a group by (never seen before)?
Please let me know if you need more context.
Thanks!
A simple way is two levels of aggregation. The inner level gets the first date for each customer. The outer summarizes by year and month:
select year(min_date), month(min_date), count(*) as num_firsts
from (select customerid, min(date) as min_date
from t
group by customerid
) c
group by year(min_date), month(min_date)
order by year(min_date), month(min_date);
Note that date/time functions depend on the database you are using, so the syntax for getting the year/month from the date may differ in your database.
You can assign a rank to each customer's transactions, ordered by date, so that rank 1 marks the first order for that customer_id.
The ranking is computed in an inline view, which is then queried to give you the month and the count of customer IDs for that month, keeping ONLY the rows whose rank = 1.
I have tested this on Oracle and it works as expected.
SELECT DISTINCT
EXTRACT(MONTH FROM date_of_transaction) AS month,
COUNT(customer_id)
FROM
(
SELECT
date_of_transaction,
customer_id,
RANK() OVER(PARTITION BY customer_id
ORDER BY
date_of_transaction ASC
) AS rank
FROM
table_1
)
WHERE
rank = 1
GROUP BY
EXTRACT(MONTH FROM date_of_transaction)
ORDER BY
EXTRACT(MONTH FROM date_of_transaction) ASC;
First, associate every ID with the year and month in which it is completely new, then count while grouping by year and month:
SELECT count(*) as new_customers, extract(year from t1.date) as year,
extract(month from t1.date) as month FROM table t1
WHERE not exists (SELECT 1 FROM table t2 WHERE t1.id = t2.id AND t2.date < t1.date)
GROUP BY year, month;
Your results will contain the new customer count, the year, and the month.
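A runnable sketch of the NOT EXISTS approach is shown below, with one hedged adjustment that is not in the original answer: it uses `COUNT(DISTINCT ...)` so that a customer with two transactions on their very first date is still counted once. It runs on SQLite via Python's sqlite3 module, so `EXTRACT` is replaced by `strftime`. The table name `transactions`, the column name `tx_date`, and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical sample data: (id, tx_date).
ROWS = [
    ("a", "2022-01-05"), ("a", "2022-01-05"),  # two transactions on a's first day
    ("a", "2023-02-01"),                       # repeat customer, not new in 2023
    ("b", "2022-01-20"),
    ("c", "2023-02-10"),
]

QUERY = """
SELECT CAST(strftime('%Y', t1.tx_date) AS INTEGER) AS year,
       CAST(strftime('%m', t1.tx_date) AS INTEGER) AS month,
       COUNT(DISTINCT t1.id) AS new_customers
FROM transactions t1
-- keep a row only if the same id has no earlier transaction
WHERE NOT EXISTS (
    SELECT 1 FROM transactions t2
    WHERE t1.id = t2.id AND t2.tx_date < t1.tx_date
)
GROUP BY year, month
ORDER BY year, month
"""

def new_ids_per_month(rows):
    """Count ids per (year, month) of their first transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (id TEXT, tx_date TEXT)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

With a plain `COUNT(*)`, customer "a" would be counted twice in January 2022, since both same-day rows pass the NOT EXISTS test.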

How to group by year with the year showing only once

I have tried using the following query
select distinct Year (SaleDate) AS SaleYear,Max(SalePrice)
from Sale
group by SaleDate
The years 2010 and 2014 are showing twice, even though I used DISTINCT and GROUP BY. The amounts in the max price column are different as well. Am I doing something wrong here?
You need to repeat year() in the group by:
select Year(SaleDate) AS SaleYear, Max(SalePrice)
from Sale
group by year(SaleDate);
SELECT DISTINCT with GROUP BY is almost never correct. All that your query does is aggregate by SaleDate and in the result set extract the year. That is why you see duplicates.
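The difference between grouping by the date and grouping by the year expression can be seen in a small experiment. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, where `strftime('%Y', SaleDate)` stands in for `Year(SaleDate)`, and the sample rows are invented.

```python
import sqlite3

# Hypothetical sample data: (SaleDate, SalePrice).
ROWS = [
    ("2010-01-05", 100), ("2010-06-01", 250),
    ("2014-03-03", 400), ("2014-09-09", 300),
]

# Groups by the full date: one output row per distinct date, so years repeat.
BY_DATE = """
SELECT strftime('%Y', SaleDate) AS SaleYear, MAX(SalePrice)
FROM Sale
GROUP BY SaleDate
ORDER BY SaleDate
"""

# Groups by the same expression that is selected: one output row per year.
BY_YEAR = """
SELECT strftime('%Y', SaleDate) AS SaleYear, MAX(SalePrice)
FROM Sale
GROUP BY strftime('%Y', SaleDate)
ORDER BY SaleYear
"""

def run(query, rows):
    """Load the sample rows into an in-memory Sale table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sale (SaleDate TEXT, SalePrice INT)")
    conn.executemany("INSERT INTO Sale VALUES (?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result

by_date = run(BY_DATE, ROWS)
by_year = run(BY_YEAR, ROWS)
```

Here `by_date` contains 2010 and 2014 twice each with different prices, which matches the symptom in the question, while `by_year` has a single row per year.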

How to use an sum() function without group by?

I just have to omit those records whose sum of sales in all 53 weeks is 0 and would need the output without group by
You cannot really get that in one query.
To find the years whose sales do not sum to zero, you first have to sum the sales per year.
First:
select YEAR(date) from YourTable group by YEAR(date) having sum(sales) > 0
Then:
select * from YourTable where YEAR(date) in (<firstquery>)
order by <anydatecolumn>
If you are using SQL Server (mssql), you can do that in one query using the OVER clause with PARTITION BY.
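A windowed SUM lets the per-year total sit next to every detail row, so no GROUP BY is needed in the output. The sketch below is one possible shape of that query, not the answerer's own code: it runs on SQLite via Python's sqlite3 module (3.25 or later), where `strftime('%Y', sale_date)` plays the role that `YEAR(date)` would in SQL Server. The `sale_date` column, the `year_total` alias, and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical sample data: (sale_date, sales), one row per week.
ROWS = [
    ("2021-01-04", 0), ("2021-06-07", 0),    # 2021 sums to 0 -> omitted
    ("2022-01-03", 0), ("2022-01-10", 15),   # 2022 sums to 15 -> kept
]

QUERY = """
SELECT sale_date, sales
FROM (
    SELECT sale_date, sales,
           -- yearly total repeated on every row of that year
           SUM(sales) OVER (PARTITION BY strftime('%Y', sale_date)) AS year_total
    FROM YourTable
) t
WHERE year_total > 0
ORDER BY sale_date
"""

def rows_from_years_with_sales(rows):
    """Return the detail rows of every year whose total sales exceed zero."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (sale_date TEXT, sales INT)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

The zero-sales week in 2022 stays in the output because the filter looks at the yearly total, not at the individual row.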