Partitions and groupby: first quarter with company sales - sql

I have a table with company sales by quarter. The table only registers transactions that occurred, so if there are no sales in a given quarter, it won't appear at all in the table.
I would like to find the first quarter that the company has any sales.
If I include group by 1, I get the error: 'quarter' is not present in the GROUP BY list. If I don't include it, I get duplicate rows. What is the correct syntax to get just one row for each company?
select
company,
first_value(quarter) over (partition by company order by year_quarter) as first_quarter
from
sales_table
group by 1
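A sketch of two ways this might be resolved. The error arises because window functions are evaluated after GROUP BY, so quarter inside first_value() would itself have to be grouped or aggregated. Both options below assume that year_quarter sorts chronologically (for example '2019Q1') and that each year_quarter value maps to a single quarter value; neither assumption is confirmed by the question.

```sql
-- Option 1: keep the window function, drop GROUP BY, and collapse the
-- repeated rows with DISTINCT (every row of a company carries the same
-- first_quarter value, so DISTINCT leaves one row per company).
select distinct
    company,
    first_value(quarter) over (partition by company order by year_quarter) as first_quarter
from sales_table;

-- Option 2: plain aggregation, no window function.
-- Assumes the earliest year_quarter is what is wanted as the "first quarter".
select
    company,
    min(year_quarter) as first_year_quarter
from sales_table
group by company;
```

Option 2 returns the year_quarter value rather than quarter, so it only fits if that column is an acceptable answer.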

Related

Delete duplicates using dense rank

I have a sales data table with cust_ids and their transaction dates.
I want to create a table that stores, for every customer, their cust_id, their last purchased date (on the basis of transaction dates) and the count of times they have purchased.
I wrote this code:
SELECT
cust_xref_id, txn_ts,
DENSE_RANK() OVER (PARTITION BY cust_xref_id ORDER BY CAST(txn_ts as timestamp) DESC) AS rank,
COUNT(txn_ts)
FROM
sales_data_table
But I understand that the above code would give an output like this (attached example picture)
How do I modify the code to get an output like :
I am a beginner in SQL queries and would really appreciate any help! :)
This would be an aggregation query, which changes the table key from (customer_id, date) to (customer_id):
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(txn_ts) as count_purchase_dates
FROM
sales_data_table
GROUP BY
cust_xref_id
You are looking for the last purchase date and the count of distinct transaction dates (for example, if a person buys twice on the same date, it should be counted once).
Although you mentioned that you want a count of dates, the sample data shows that you want a count of distinct dates: customer 284214 transacted 9 times, but distinct will give you 7.
So, here is the SQL you can use to get your result:
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(distinct txn_ts) as count_purchase_dates -- note: DISTINCT counts each distinct txn_ts value once
FROM sales_data_table
GROUP BY 1

Return defined number of unique values in separate columns all meeting same 'Where' Criteria

We enter overrides based on a unique value from our tables (we have two columns with unique values for each transaction, so may or may not be primary key).
Sometimes we have to enter multiple overrides based on the same set of criteria, and our system throws a warning if the same unique id is used for more than one override. It would therefore be nice to pull multiple unique values in one query that all meet the same criteria in the where clause.
Say we have some customers that were undercharged for three months, and we need to enter a commission override for each of the three sales people that split the accounts for each month:
I've tried the following code, but the same value gets returned for each column:
select month, customer, product, sum(sales),
any_value(unique_id)unique_id1,
any_value(unique_id)unique_id2,
any_value(unique_id)unique_id3
from table
where customer in (j,k,l) and product = m and year = o
group by 1,2,3;
This will give me a row for each month and customer, but the values in unique_id1, unique_id2 and unique_id3 are the same on each row.
I was able to use:
select month, customer, product, sum(sales),
string_agg(unique_id, "," LIMIT 3)
from table
where customer in (j,k,l) and product = m and year = o
group by 1,2,3;
and split the unique_ids in a spreadsheet but I feel there has to be a better way to accomplish this directly in SQL.
I figure I could use a sub query and select column based on row 1,2,3, but I'm trying to eliminate the redundancy of including the same 'where' criteria in the sub query.
Below is for BigQuery Standard SQL.
I think your second query was close enough to get to something like the below:
#standardSQL
SELECT month, customer, product, sales,
arr[OFFSET(0)] unique_id1,
arr[SAFE_OFFSET(1)] unique_id2,
arr[SAFE_OFFSET(2)] unique_id3
FROM (
SELECT month, customer, product, SUM(sales) sales,
ARRAY_AGG(unique_id ORDER BY month DESC LIMIT 3) arr
FROM `project.dataset.table`
WHERE customer IN ('j','k','l') AND product = 'm' AND year = 2019
GROUP BY month, customer, product
)

How do I use array_agg with a condition?

I have a table with a list of potential customers, their activity, and their sales representative. Every customer can have at most one sales rep. I've built a summary table where I aggregate the customer activity, group it by the sales rep, and filter by the customer creation date. This is NOT a cohort (the customers do not all correspond to the scheduled_flights; rather, this is a snapshot of activity for a given period of time). It looks something like this:
Now, in addition to the total number of customers, I'd also like to output an array of those actual customers. The customers field is currently calculated by performing sum(is_customer) as customers and then grouping by the sales rep. To build the array, I've tried array_agg(customer_name), which outputs the list of all customer names. I just need the list of names that also satisfy the condition is_customer = 1, but I can't use that as a where clause, since it would filter out other activity, like scheduled and completed flights for customers that were not new.
This should probably work:
array_agg(case when is_customer = 1 then customer_name end) within group (order by customer_name)
Snowflake should ignore NULL values in the aggregation.
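For context, here is a sketch of how that expression might sit inside the full summary query. Only is_customer and customer_name come from the question; the table name customer_activity and the columns sales_rep, scheduled_flights and completed_flights are assumed names for illustration.

```sql
-- Hypothetical table and column names, except is_customer and customer_name.
select
    sales_rep,
    sum(is_customer)       as customers,
    sum(scheduled_flights) as scheduled_flights,
    sum(completed_flights) as completed_flights,
    -- The CASE yields NULL for rows where is_customer <> 1;
    -- ARRAY_AGG leaves those NULLs out of the resulting array.
    array_agg(case when is_customer = 1 then customer_name end)
        within group (order by customer_name) as customer_names
from customer_activity
group by sales_rep;
```

Because the condition lives inside the aggregate rather than in a where clause, the other sums still see every row.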

How to count how many times certain values appear in a table in SQL and return that number in a column?

I've used the COUNT function to determine how many rows there are in a table or how often a value appears in a table.
However, I want to return the 'count' for multiple values in a table as a separate column.
Say we have a customer table with columns: Customer ID #, Name, Phone Number.
Say we also have a sales table with columns: Customer ID, Item Purchased, Date
I would like my query to return a column for customer ID and a column for # of times that customer ID appeared in the sales table. I would like to do this for all of my customer IDs at once--any tips?
You can use group by:
select customer_id,
count(*)
from sales
group by customer_id
This will return one row per customer ID, with the count of matching rows in the sales table.
You want to use GROUP BY:
Select Count(*), CustomerID
from Sales
GROUP BY CustomerID
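Both queries above return only customers that appear at least once in the sales table. If customers with no sales should also be listed with a count of 0, one possible variation is a left join from the customer table. This is a sketch; the table names customers and sales and the shared column customer_id are assumptions based on the description in the question.

```sql
-- Assumed names: customers(customer_id, ...), sales(customer_id, ...).
select
    c.customer_id,
    -- count(s.customer_id) skips the NULLs produced by unmatched customers,
    -- so customers without sales get 0 rather than 1.
    count(s.customer_id) as purchase_count
from customers c
left join sales s
    on s.customer_id = c.customer_id
group by c.customer_id;
```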

Sql join only on the first match

I don't know how to perform the following case.
I have the sales info in a table:
Number of Bill (key),
Internal number (key),
Client,
Date (month-year),
Product group,
Product,
Quantities,
Total,
Sales man.
I need to join this sales table with the annual forecast sales table, which has these columns:
Date (key),
Group product(key),
Sales man (key),
Total.
In each table, the combination of the key columns is the primary key. I need to add the forecast to the sales table. To do this, I need to attach the forecast total to the actual sale only on the first match of date, product group and sales man, so that the forecast total does not get inflated (a sales man can sell the same product group to the same client on the same day multiple times).
.. only on the first match of date, group product and sales man ..
You can use window functions for this. Consider using ROW_NUMBER() OVER(PARTITION BY ... ORDER BY ... ); the first match has a row number of 1.
More information and examples (sales!) can be found on MSDN.
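A minimal sketch of how the ROW_NUMBER() suggestion might be applied. All table and column names here (sales, forecast, bill_number, internal_number, sale_date, forecast_date, product_group, sales_man, total) are hypothetical stand-ins for the columns listed in the question, and the sketch assumes the date columns of the two tables are directly comparable.

```sql
-- Number the sales rows within each (date, product group, sales man) key.
with numbered_sales as (
    select
        s.*,
        row_number() over (
            partition by s.sale_date, s.product_group, s.sales_man
            order by s.bill_number, s.internal_number
        ) as rn
    from sales s
)
select
    ns.bill_number,
    ns.internal_number,
    ns.sale_date,
    ns.product_group,
    ns.sales_man,
    ns.total as actual_total,
    f.total  as forecast_total  -- NULL on every sale except the first match
from numbered_sales ns
left join forecast f
    on  f.forecast_date = ns.sale_date
    and f.product_group = ns.product_group
    and f.sales_man     = ns.sales_man
    and ns.rn = 1;              -- attach the forecast to the first sale only
```

With this shape, summing forecast_total counts each forecast row at most once. Forecast rows that have no matching sale do not appear at all, so a full outer join may be needed if those should be kept.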