PostgreSQL Query To Obtain Value that Occurs more than once in 12 months - sql

I have the following query, which returns the number of users who booked a flight at least twice, but I need to identify those who booked a flight more than once within a 12-month range:
SELECT COUNT(*)
FROM sales
WHERE customer in
(
SELECT customer
FROM sales
GROUP BY customer
HAVING COUNT(*) > 1
)

You would use window functions. The simplest method is lag():
select count(distinct customer)
from (select s.*,
lag(date) over (partition by customer order by date) as prev_date
from sales s
) s
where prev_date > s.date - interval '12 month';
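The lag() approach above can be tried out on a few rows. The following is only a sketch under stated assumptions: it runs the same query shape through Python's built-in sqlite3 module (SQLite 3.25 or later for window functions), the sample rows are made up, and date(s.date, '-12 months') stands in for PostgreSQL's s.date - interval '12 month'.

```python
import sqlite3

# Hypothetical sample data: (customer, ticket_id, date). Not from the original post.
rows = [
    ("alice", 1, "2019-01-10"),
    ("alice", 2, "2019-06-15"),  # 5 months after the previous booking: counts
    ("bob", 3, "2018-01-01"),
    ("bob", 4, "2019-03-01"),    # 14 months after the previous booking: does not count
    ("carol", 5, "2019-02-02"),  # single booking: does not count
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, ticket_id INTEGER, date TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Same structure as the lag() query; SQLite has no interval type,
# so date(..., '-12 months') replaces the PostgreSQL interval arithmetic.
query = """
SELECT COUNT(DISTINCT customer)
FROM (SELECT s.*,
             LAG(date) OVER (PARTITION BY customer ORDER BY date) AS prev_date
      FROM sales s) s
WHERE prev_date > date(s.date, '-12 months')
"""
repeat_customers = conn.execute(query).fetchone()[0]
print(repeat_customers)
```

Comparing each booking only with the one immediately before it is enough: if any two bookings of a customer fall within 12 months, then some pair of consecutive bookings does too.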

At the cost of a self-join, @AdrianKlaver's answer can adapt to any 12-month period.
SELECT COUNT(DISTINCT customer) FROM
(SELECT s1.customer
FROM sales s1
JOIN sales s2
ON s1.customer = s2.customer
AND s1.ticket_id <> s2.ticket_id
AND s2.date_field BETWEEN s1.date_field AND (s1.date_field + interval '1 year')
-- any matching pair means two bookings within a year, so no HAVING is needed
GROUP BY s1.customer) AS subquery;

A stab at it with a made up date field:
SELECT COUNT(*)
FROM sales
WHERE customer in
(
SELECT customer
FROM sales
WHERE date_field BETWEEN '01/01/2019' AND '12/31/2019'
GROUP BY customer
HAVING COUNT(*) > 1
)

Related

SQL get the first date the condition exists

I have a collections table:
subscription_id, transaction_date_est, start_date_est, end_date_est, transaction_id, invoice_number, user_id, transaction_amount, plan_code, transaction_type
I have users who purchased a subscription; for each subscription I generate a subscription ID. Each subscription has one or more invoices.
For each subscription, I am trying to find when the user passed an accumulated amount of $100, using a window function.
My expected results: subscription_id, transaction_date
I tried:
SELECT subscription_id, MAX(date) transaction_date
FROM (SELECT subscription_id,
SUM(usd_price) LAG (partition by subscription_id
ORDER BY first_transaction_date ASC) AS total
GROUP BY(current_period_started_at)
HAVING BY total >= 100
ORDER BY usd_price)
but I have not managed to extract the first date on which the user passed $100.
I think you want:
select c.*
from (
select c.*, sum(usd_price) over (partition by subscription_id order by transaction_date) as sum_usd_price
from collections c
) c
where sum_usd_price >= 100 and sum_usd_price - usd_price < 100
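To see why the two-part filter picks exactly the row where the running total first crosses 100, here is a small sketch using Python's sqlite3 module (SQLite 3.25 or later). The invoice rows are hypothetical, and the column names usd_price and transaction_date are taken from the answer above rather than from the asker's column list.

```python
import sqlite3

# Hypothetical invoices: (subscription_id, transaction_date, usd_price).
rows = [
    (1, "2020-01-01", 40),
    (1, "2020-02-01", 40),
    (1, "2020-03-01", 40),  # running total reaches 120 here: the crossing row
    (1, "2020-04-01", 40),  # total is 160, but 160 - 40 = 120 is already >= 100
    (2, "2020-01-15", 60),  # subscription 2 never reaches 100
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections "
    "(subscription_id INTEGER, transaction_date TEXT, usd_price INTEGER)"
)
conn.executemany("INSERT INTO collections VALUES (?, ?, ?)", rows)

# Running total per subscription, then keep the single row where the total
# is at or above 100 while the total before this row was still below 100.
query = """
SELECT subscription_id, transaction_date
FROM (SELECT c.*,
             SUM(usd_price) OVER (PARTITION BY subscription_id
                                  ORDER BY transaction_date) AS sum_usd_price
      FROM collections c) c
WHERE sum_usd_price >= 100 AND sum_usd_price - usd_price < 100
"""
first_crossing = conn.execute(query).fetchall()
print(first_crossing)
```

The second condition, sum_usd_price - usd_price < 100, is what removes every later row, so no MIN or extra grouping is needed.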

How to summarize information over the dynamic period in sql?

I have a table with orders and the following fields:
create table orders2 (
orderID int,
customerID int,
date DateTime,
amount int)
engine=Memory;
Each customer can make 0 or many orders each day. I need to create an SQL query that will show for each customer how many orders he/she made during the period of 3 days starting from the day when the customer has made his/her first order.
So, for each customer, the query should detect the date of the first order, then compute the date that is 3 days in the future from the first date, then filter rows to take only orders with dates in the given range, and then perform counting of orders (orderID) in that time period. At the moment, I was able to just detect the date of the first order for each customer.
SELECT
O.customerID,
O.date AS first_day,
COUNT(O.orderID) AS first_day_orders_num,
SUM(O.amount) AS first_day_amount
FROM orders2 AS O
INNER JOIN
(
SELECT
customerID,
MIN(date) AS first_date
FROM orders2
GROUP BY customerID
) AS I ON (O.customerID = I.customerID) AND (O.date = I.first_date)
GROUP BY
O.customerID,
O.date
I don't really understand what result you need. It can probably be solved using arrays.
Here is a solution using vanilla SQL:
select customerID, min(first_date), sum(num_orders_per_day)
from (
select customerID, date, min(date) first_date, count() num_orders_per_day
from orders2
group by customerID, date
having date <= first_date + interval 3 days
)
group by customerID
You can use window functions to get the first order date:
select o.CustomerID, count(*) as num_orders_3_days
from (select o.*, min(date) over (partition by CustomerID) as min_date
from orders2 o
) o
where date < min_date + interval '3 day'
group by CustomerID;
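The window-function version can be checked on a handful of rows. This is a sketch only: it uses Python's sqlite3 module (SQLite 3.25 or later) instead of the asker's engine, the rows are made up, and datetime(min_date, '+3 days') stands in for min_date + interval '3 day'.

```python
import sqlite3

# Hypothetical orders: (orderID, customerID, date, amount).
rows = [
    (1, 1, "2021-01-01 10:00:00", 5),  # first order of customer 1
    (2, 1, "2021-01-02 09:00:00", 7),
    (3, 1, "2021-01-04 09:59:59", 3),  # one second inside the 3-day window
    (4, 1, "2021-01-04 10:00:00", 9),  # exactly 3 days later: excluded by "<"
    (5, 1, "2021-01-10 12:00:00", 2),
    (6, 2, "2021-02-01 08:00:00", 4),  # customer 2 has a single order
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders2 (orderID INTEGER, customerID INTEGER, date TEXT, amount INTEGER)"
)
conn.executemany("INSERT INTO orders2 VALUES (?, ?, ?, ?)", rows)

# min(date) over the customer's partition attaches the first order date to
# every row, so the 3-day filter can be applied without a self-join.
query = """
SELECT customerID, COUNT(*) AS num_orders_3_days
FROM (SELECT o.*, MIN(date) OVER (PARTITION BY customerID) AS min_date
      FROM orders2 o) o
WHERE date < datetime(min_date, '+3 days')
GROUP BY customerID
ORDER BY customerID
"""
orders_in_window = conn.execute(query).fetchall()
print(orders_in_window)
```

Whether the boundary should be strict (<) or inclusive (<=) depends on how "a period of 3 days" is meant; the sketch keeps the strict comparison from the answer.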
Try this query:
SELECT customerID, orders_count
FROM (
SELECT customerID,
arraySort(x -> x.1, groupArray((date, orderID))) sorted_date_per_order_pairs,
sorted_date_per_order_pairs[1].1 + INTERVAL 3 day AS end_date,
arrayFilter(x -> x.1 < end_date, sorted_date_per_order_pairs) orders_in_period,
length(orders_in_period) orders_count
FROM orders2
GROUP BY customerID);

Joining two aggregated queries

I have 2 tables that look like this
users:
id | created_at
payments:
id | created_at
I need a table that is grouped by year and month and contains both number of users and payments
stats:
month | year | users | payments
Where users column contains number of registered users and payments - number of payments. I can get two tables separately, but how can I join them?
select
month(created_at) as month,
year(created_at) as year,
count(*) users
from
users
group by
month, year
having
users > 0
order by
year desc, month desc;
select
month(created_at) as month,
year(created_at) as year,
count(*) payments
from
payments
group by
month, year
having
payments > 0
order by
year desc, month desc;
The comparisons users > 0 and payments > 0 are useless: a group only exists if it has at least one row, so count(*) is never 0. In addition, order by in subqueries is meaningless.
You can do this with a full join:
select month, year, coalesce(users, 0) as users, coalesce(payments, 0) as payments
from (select month(created_at) as month, year(created_at) as year,
count(*) as users
from users
group by month, year
) u full join
(select month(created_at) as month, year(created_at) as year,
count(*) as payments
from payments
group by month, year
) p
using (month, year)
order by year desc, month desc;
If you know you have users and payments for all months (that you care about), you can use an inner join rather than a full join.
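Some engines (MySQL among them) do not support full join. As an alternative technique, not the one used in the answer above, the two tables can be stacked with UNION ALL and aggregated once, which yields the same month | year | users | payments shape. The sketch below runs this through Python's sqlite3 module with made-up rows; strftime replaces month() and year().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, created_at TEXT)")
conn.execute("CREATE TABLE payments (id INTEGER, created_at TEXT)")

# Hypothetical rows: March has users only, February has payments only.
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "2020-01-05"), (2, "2020-01-20"), (3, "2020-03-01")],
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(1, "2020-01-07"), (2, "2020-02-10")],
)

# Each row contributes 1 to exactly one of the two counters; summing per
# (year, month) gives both counts, with 0 where one side has no rows.
query = """
SELECT year, month, SUM(users) AS users, SUM(payments) AS payments
FROM (SELECT CAST(strftime('%Y', created_at) AS INTEGER) AS year,
             CAST(strftime('%m', created_at) AS INTEGER) AS month,
             1 AS users, 0 AS payments
      FROM users
      UNION ALL
      SELECT CAST(strftime('%Y', created_at) AS INTEGER) AS year,
             CAST(strftime('%m', created_at) AS INTEGER) AS month,
             0 AS users, 1 AS payments
      FROM payments)
GROUP BY year, month
ORDER BY year DESC, month DESC
"""
stats = conn.execute(query).fetchall()
print(stats)
```

Months that appear in only one of the tables still show up, which is the property the full join was chosen for.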
I think this is what you're looking for:
select a.*, b.payments from (
select month(created_at) as month, year(created_at) as year, count(*) users
from users group by month, year having users > 0 order by year desc, month desc
) a left join (
select month(created_at) as month, year(created_at) as year, count(*) payments
from payments group by month, year having payments > 0 order by year desc, month desc
) b on a.month = b.month and a.year = b.year

SQL Query to group ID overlap (via inner join) by month

I'm trying to find a query that will give me the number of customers that have transacted with 2 different entities in the same month. In other words, customer_ids that transacted with company_a and company_b within the same month. Here is what I have so far:
SELECT Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date) AS
payment_month,
Count(UNIQUE(company_a_customers.customer_id))
FROM (SELECT *
FROM my_table
WHERE ( merchant_name LIKE '%company_a%' )) AS company_a_customers
INNER JOIN (SELECT *
FROM my_table
WHERE ( merchant_name = 'company_b' )) AS
company_b_customers
ON company_a_customers.customer_id =
company_b_customers.customer_id
GROUP BY Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date)
The problem is that this is giving me a running total of all customers that transacted with company A on a month-by-month basis who also ever transacted with company B.
If I whittle it down to a specific month, it will obviously give me the correct overlap, because the query is only getting IDs for that month:
SELECT Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date) AS
payment_month,
Count(UNIQUE(company_a_customers.customer_id))
FROM (SELECT *
FROM my_table
WHERE ( merchant_name LIKE '%company_a%' )
AND transaction_date >= '2017-06-01'
AND transaction_date <= '2017-06-30') AS company_a_customers
INNER JOIN (SELECT *
FROM my_table
WHERE ( merchant_name = 'company_b' )
AND transaction_date >= '2017-06-01'
AND transaction_date <= '2017-06-30') AS
company_b_customers
ON company_a_customers.customer_id =
company_b_customers.customer_id
GROUP BY Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date)
How can I do this in one query to get monthly totals for customers who transacted with both companies within the given month?
Desired result: Output of second query, but for every month that is in the database. In other words:
January 2017: xx,xxx overlapping customers
February 2017: xx,xxx overlapping customers
March 2017: xx,xxx overlapping customers
Thanks very much.
You could simply calculate the year/month for both and add it as a join condition, but this is not very efficient, as it might create a huge intermediate result.
It is better to check for each month/customer whether there were transactions with both merchants using conditional aggregation, and then count by month:
SELECT payment_month, count(*)
FROM
( SELECT Extract(year FROM transaction_date)
|| Extract(month FROM transaction_date) AS payment_month,
customer_id
FROM my_table
WHERE ( merchant_name LIKE '%company_a%' )
OR ( merchant_name = 'company_b' )
GROUP BY payment_month,
customer_id
-- both merchants within the same months
HAVING SUM(CASE WHEN merchant_name LIKE '%company_a%' THEN 1 ELSE 0 END) > 0
AND SUM(CASE WHEN merchant_name = 'company_b' THEN 1 ELSE 0 END) > 0
) AS dt
GROUP BY 1
Your payment_month calculation is too complicated (and the returned string is not nicely formatted).
To get year/month as string:
TO_CHAR(transaction_date, 'YYYYMM')
as number:
EXTRACT(YEAR FROM transaction_date) * 100
+ EXTRACT(MONTH FROM transaction_date)
or calculate the first of month:
TRUNC(transaction_date, 'mon')
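The conditional-aggregation query from this answer can be exercised on a small made-up data set. The sketch below uses Python's sqlite3 module, so strftime('%Y%m', ...) stands in for TO_CHAR(transaction_date, 'YYYYMM'); the rows and merchant name variants are hypothetical.

```python
import sqlite3

# Hypothetical transactions: (customer_id, merchant_name, transaction_date).
rows = [
    (1, "company_a_store", "2017-06-03"),
    (1, "company_b", "2017-06-20"),         # customer 1: both in June
    (2, "company_a_store", "2017-06-05"),
    (2, "company_b", "2017-07-01"),         # customer 2: different months
    (3, "company_a_online", "2017-07-02"),
    (3, "company_b", "2017-07-15"),
    (3, "company_b", "2017-07-16"),         # customer 3: both in July
    (4, "company_a_store", "2017-06-01"),
    (4, "company_a_online", "2017-06-02"),  # customer 4: two company_a names, no company_b
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (customer_id INTEGER, merchant_name TEXT, transaction_date TEXT)"
)
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# Inner query: one row per (month, customer) that has at least one
# transaction with each merchant. Outer query: count customers per month.
query = """
SELECT payment_month, COUNT(*)
FROM (SELECT strftime('%Y%m', transaction_date) AS payment_month,
             customer_id
      FROM my_table
      WHERE merchant_name LIKE '%company_a%' OR merchant_name = 'company_b'
      GROUP BY payment_month, customer_id
      HAVING SUM(CASE WHEN merchant_name LIKE '%company_a%' THEN 1 ELSE 0 END) > 0
         AND SUM(CASE WHEN merchant_name = 'company_b' THEN 1 ELSE 0 END) > 0
     ) AS dt
GROUP BY payment_month
ORDER BY payment_month
"""
overlap_by_month = conn.execute(query).fetchall()
print(overlap_by_month)
```

Customer 4 is the case worth noting: two different names matching '%company_a%' do not count as an overlap here, because each merchant is tested by its own SUM(CASE ...).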
You should be able to get your desired results in one query just by counting the number of distinct merchant_names per month per customer ID. Using HAVING COUNT(DISTINCT merchant_name) > 1 will show you only customers with transactions with both (or more, if more than one name matches LIKE '%company_a%').
SELECT
EXTRACT(Year from transaction_date)||EXTRACT(Month from transaction_date) as payment_month
,customer_id
,COUNT(DISTINCT merchant_name) as CompanyCount
FROM my_table
WHERE transaction_date >= '2017-06-01' AND transaction_date <= '2017-06-30'
AND (merchant_name = 'company_b' or merchant_name LIKE '%company_a%')
GROUP BY
EXTRACT(Year from transaction_date)||EXTRACT(Month from transaction_date)
,customer_id
HAVING COUNT(DISTINCT merchant_name) > 1

Days Since Last Help Ticket was Filed

I am trying to create a report to show me the last date a customer filed a ticket.
Customers can file dozens of tickets. I want to know when the last ticket was filed and show how many days it's been since they have done so.
The fields I have are:
Customer,
Ticket_id,
Date_Closed
All from the Same table "Tickets"
I'm thinking I want to do a ranking of tickets by min date? I tried this query to grab something but it's giving me all the tickets from the customer. (I'm using SQL in a product called Domo)
select * from (select *, rank() over (partition by "Ticket_id"
order by "Date_Closed" desc) as date_order
from tickets ) zd
where date_order = 1
This should be simple enough,
SELECT customer,
MAX (date_closed) last_date,
ROUND((SYSDATE - MAX (date_closed)),0) days_since_last_ticket_logged
FROM tickets
GROUP BY customer
select Customer, datediff(day, "Date_Closed", current_date) as days_since_last_tkt
from
(select *, rank() over (partition by Customer order by "Date_Closed" desc) as date_order
from tickets) zd
where zd.date_order = 1
Or you can simply do
select customer, datediff(day, max(Date_closed), current_date) as days_since_last_tkt
from tickets
group by customer
To select other fields
select t.*
from tickets t
join (select customer, max(Date_closed) as mxdate,
datediff(day, max(Date_closed), current_date) as days_since_last_tkt
from tickets
group by customer) tt
on t.customer = tt.customer and tt.mxdate = t.date_closed
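The max()/group by approach can be tried with a fixed reference date so that the result is reproducible. This is a sketch under assumptions: it uses Python's sqlite3 module, where the difference of two julianday() values stands in for datediff(day, ...), the rows are made up, and the hypothetical today parameter replaces current_date.

```python
import sqlite3

# Hypothetical tickets: (Customer, Ticket_id, Date_Closed).
rows = [
    ("acme", 1, "2020-01-01"),
    ("acme", 2, "2020-01-20"),    # latest ticket for acme
    ("globex", 3, "2019-12-31"),  # only ticket for globex
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (Customer TEXT, Ticket_id INTEGER, Date_Closed TEXT)")
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", rows)

# Fixed reference date instead of the current date, for a repeatable result.
today = "2020-01-31"

# One row per customer: the latest closed date and the whole days since then.
query = """
SELECT Customer,
       MAX(Date_Closed) AS last_date,
       CAST(julianday(?) - julianday(MAX(Date_Closed)) AS INTEGER) AS days_since_last_tkt
FROM tickets
GROUP BY Customer
ORDER BY Customer
"""
days_since = conn.execute(query, (today,)).fetchall()
print(days_since)
```

Note the argument order: the later date comes first in the subtraction, which corresponds to passing the earlier date first in datediff(day, start, end).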
I would do this with a simple sub-query to select the last closed date for the customer. Then compare this to today with datediff() to get the number of days since last closed.
Select
LastTicket.Customer,
LastTicket.LastClosedDate,
DateDiff(day,LastTicket.LastClosedDate,getdate()) as DaysSinceLastClosed
From
(select
tickets.customer,
max(tickets.Date_Closed) as LastClosedDate
from tickets
Group By tickets.Customer) as LastTicket
Based on the responses this is what I did:
select "Customer",
Max("date_closed") "last_date",
round(datediff(DAY, CURRENT_DATE, max("date_closed")), 0) as "Closed_date"
from tickets
group by "Customer"
ORDER BY "Customer"