Help with getting statistics from a database in SQL

I'm using SQL Server 2005.
I have two tables, Users and Payments; Payments has a foreign key to Users.
Both Users and Payments have a date column (Users gets its value at registration, Payments at the time of payment). Users has a bit column called isPaymentsRecurring that tells me whether to renew the user; it defaults to 1 (recurring). Payments has a tinyint column called paymentSum where I store the payment amount. (The first payment is equal to the recurring one.)
I need a few statistics for a simple line chart, grouped by date. To treat each day as a single item, I use the hack below, which strips the time part.
Day hack
dateadd(dd,datediff(dd,0,date),0)
What I need are the values below, all grouped using the day hack:
1. Unique users in the system per day.
2. Users that ordered once and then set isPaymentRecurring to 0.
3. Sum of paymentSum per day.
4. How many users got a recurring payment per day, meaning a user's orders per day excluding their first order in the system.
Thanks. I have queries that run, but they don't give the results I want, so I'd like an expert opinion.

You could add computed columns to your Payments and Users tables that represent the day, month and year of payment or user registration:
ALTER TABLE dbo.Payments
ADD PaymentDay AS DAY(PaymentDate) PERSISTED,
PaymentMonth AS MONTH(PaymentDate) PERSISTED,
PaymentYear AS YEAR(PaymentDate) PERSISTED
Now these columns give you INT values for the day, month and year and you can easily query them. Since they're persisted, they can even be indexed!
You can easily group your payments by those columns now:
SELECT (list of fields), COUNT(*)
FROM dbo.Payments
GROUP BY PaymentYear, PaymentMonth, PaymentDay
or whatever you need to do. With this approach, you can also easily do stats on a per-month basis, e.g. sum up all payments and group by month.
Update: ok, so you need to understand how to create those queries:
Unique users in system per day
SELECT UserDay, UserMonth, UserYear, COUNT(*)
FROM dbo.Users
GROUP BY UserDay, UserMonth, UserYear
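That would count the users and group them by day, month, and year (assuming UserDay, UserMonth, and UserYear were added to Users the same way as the Payments columns above).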
Users that ordered once and then set isPaymentRecurring to 0.
SELECT UserDay, UserMonth, UserYear, COUNT(*)
FROM dbo.Users
WHERE isPaymentRecurring = 0
GROUP BY UserDay, UserMonth, UserYear
Is that what you're looking for: the number of users, grouped by day/month/year, that have isPaymentRecurring set to 0?
Sum of paymentSum per day
SELECT PaymentDay, PaymentMonth, PaymentYear, SUM(PaymentSum)
FROM dbo.Payments
GROUP BY PaymentDay, PaymentMonth, PaymentYear
That would sum up the PaymentSum column and group it by day, month, year
I don't really understand what you're trying to achieve with the other two queries, and what criteria would have to apply for those queries.
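For item 4, one possible reading is "count every payment except each user's first one, per day". Here is a minimal sketch under that assumption. It uses Python's built-in sqlite3 as a stand-in for SQL Server, so `date(...)` takes the place of the day hack, and the column names `userId` and `paymentDate` are hypothetical, since the question does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Payments (userId INTEGER, paymentDate TEXT, paymentSum INTEGER);
INSERT INTO Payments VALUES
  (1, '2010-01-01 09:00:00', 10),
  (2, '2010-01-01 12:30:00', 10),
  (1, '2010-02-01 09:00:00', 10),
  (2, '2010-02-01 12:30:00', 10),
  (3, '2010-02-01 15:00:00', 10);
""")

# Item 3: sum of paymentSum per day; date() drops the time part.
sum_per_day = conn.execute("""
    SELECT date(paymentDate) AS day, SUM(paymentSum)
    FROM Payments
    GROUP BY day
    ORDER BY day
""").fetchall()

# Item 4 (assumed reading): payments per day, skipping each user's
# earliest payment via a correlated subquery.
recurring_per_day = conn.execute("""
    SELECT date(p.paymentDate) AS day, COUNT(*)
    FROM Payments p
    WHERE p.paymentDate > (SELECT MIN(f.paymentDate)
                           FROM Payments f
                           WHERE f.userId = p.userId)
    GROUP BY day
    ORDER BY day
""").fetchall()

print(sum_per_day)
print(recurring_per_day)
```

The correlated subquery should carry over to SQL Server 2005 largely unchanged, with the day hack replacing `date(...)`. If the requirement is distinct users rather than payments, `COUNT(DISTINCT p.userId)` would be the likely substitute.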

Related

Delete duplicates using dense rank

I have a sales data table with cust_ids and their transaction dates.
I want to create a table that stores, for every customer, their cust_id, their last purchased date (on the basis of transaction dates) and the count of times they have purchased.
I wrote this code:
SELECT
cust_xref_id, txn_ts,
DENSE_RANK() OVER (PARTITION BY cust_xref_id ORDER BY CAST(txn_ts as timestamp) DESC) AS rank,
COUNT(txn_ts)
FROM
sales_data_table
But I understand that the above code would give an output like this (attached example picture)
How do I modify the code to get an output like :
I am a beginner in SQL queries and would really appreciate any help! :)
This is an aggregation query, which changes the table key from (customer_id, date) to (customer_id):
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(txn_ts) as count_purchase_dates
FROM
sales_data_table
GROUP BY
cust_xref_id
You are looking for the last purchase date and the count of distinct transaction dates (if a person buys twice on the same date, it counts as a single time).
You said you want the count of dates, but your sample data shows the count of distinct dates: customer 284214 transacted 9 times, but DISTINCT gives 7.
So here is the SQL you can use to get your result:
SELECT
cust_xref_id,
MAX(txn_ts) as last_purchase_date,
COUNT(DISTINCT txn_ts) as count_purchase_dates -- DISTINCT counts each date only once
FROM sales_data_table
GROUP BY 1
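The difference between the two answers, COUNT versus COUNT(DISTINCT), can be seen side by side in a small sketch. It runs against Python's built-in sqlite3 with made-up sample rows (an assumption for illustration, not the asker's data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_data_table (cust_xref_id INTEGER, txn_ts TEXT);
INSERT INTO sales_data_table VALUES
  (1, '2020-01-05'), (1, '2020-01-05'), (1, '2020-03-10'),
  (2, '2020-02-01');
""")

# One row per customer: latest date, all transactions, distinct dates.
rows = conn.execute("""
    SELECT cust_xref_id,
           MAX(txn_ts)            AS last_purchase_date,
           COUNT(txn_ts)          AS purchases,
           COUNT(DISTINCT txn_ts) AS distinct_purchase_dates
    FROM sales_data_table
    GROUP BY cust_xref_id
    ORDER BY cust_xref_id
""").fetchall()

for row in rows:
    print(row)
```

Customer 1 has three transactions on two distinct dates, so the two counts differ only for that customer.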

Return defined number of unique values in separate columns all meeting same 'Where' Criteria

We enter overrides based on a unique value from our tables (we have two columns with unique values for each transaction, so may or may not be primary key).
Sometimes we have to enter multiple overrides based on the same set of criteria, so it would be nice to be able to pull multiple unique values in one query that all meet the same criteria in the where clause as our system throws a warning if the same unique id is used for more than one override.
Say we have some customers that were under charged for three months and we need to enter a commission override for each of the three sales people that split the accounts for each month:
I've tried the following code, but the same value gets returned for each column:
select month, customer, product, sum(sales),
any_value(unique_id)unique_id1,
any_value(unique_id)unique_id2,
any_value(unique_id)unique_id3
from table
where customer in (j,k,l) and product = m and year = o
group by 1,2,3;
This will give me a row for each month and customer, but the values in unique_id1, unique_id2 and unique_id3 are the same on each row.
I was able to use:
select month, customer, product, sum(sales),
string_agg(unique_id, "," LIMIT 3)
from table
where customer in (j,k,l) and product = m and year = o
group by 1,2,3;
and split the unique_ids in a spreadsheet but I feel there has to be a better way to accomplish this directly in SQL.
I figure I could use a sub query and select column based on row 1,2,3, but I'm trying to eliminate the redundancy of including the same 'where' criteria in the sub query.
Below is for BigQuery Standard SQL.
I think your second query was close enough to get to something like the below:
#standardSQL
SELECT month, customer, product, sales,
arr[OFFSET(0)] unique_id1,
arr[SAFE_OFFSET(1)] unique_id2,
arr[SAFE_OFFSET(2)] unique_id3
FROM (
SELECT month, customer, product, SUM(sales) sales,
ARRAY_AGG(unique_id ORDER BY month DESC LIMIT 3) arr
FROM `project.dataset.table`
WHERE customer IN ('j','k','l') AND product = 'm' AND year = 2019
GROUP BY month, customer, product
)
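Outside BigQuery, where ARRAY_AGG with OFFSET may not be available, a similar result can be sketched with a different technique: number the rows per group with ROW_NUMBER() and pivot with conditional aggregation. The example below is an assumption-laden illustration using Python's built-in sqlite3 (it needs an SQLite build with window functions, 3.25 or later); the table name `sales_rows` and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_rows (year INTEGER, month INTEGER, customer TEXT,
                         product TEXT, sales INTEGER, unique_id TEXT);
INSERT INTO sales_rows VALUES
  (2019, 1, 'j', 'm', 100, 'a1'),
  (2019, 1, 'j', 'm', 50,  'a2'),
  (2019, 1, 'j', 'm', 25,  'a3'),
  (2019, 2, 'j', 'm', 70,  'b1'),
  (2019, 2, 'j', 'm', 30,  'b2');
""")

# The WHERE criteria appear once, inside the CTE; the outer query
# spreads the first three ids of each group across three columns.
rows = conn.execute("""
    WITH numbered AS (
        SELECT month, customer, product, sales, unique_id,
               ROW_NUMBER() OVER (PARTITION BY month, customer, product
                                  ORDER BY unique_id) AS rn
        FROM sales_rows
        WHERE customer IN ('j', 'k', 'l') AND product = 'm' AND year = 2019
    )
    SELECT month, customer, product, SUM(sales),
           MAX(CASE WHEN rn = 1 THEN unique_id END) AS unique_id1,
           MAX(CASE WHEN rn = 2 THEN unique_id END) AS unique_id2,
           MAX(CASE WHEN rn = 3 THEN unique_id END) AS unique_id3
    FROM numbered
    GROUP BY month, customer, product
    ORDER BY month
""").fetchall()

for row in rows:
    print(row)
```

A group with fewer than three rows gets NULL (None in Python) in the unused columns, which mirrors what SAFE_OFFSET returns in the BigQuery version.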

SQL - Most Common Day of Week Query

I am trying to find the day of the week on which most orders occur. I have written a query that returns a result, but I believe it returns the day of the week for every order date rather than the single most common day, so there are dozens of rows. How can I return only the one day?
The table is:
Orders
OrderID
OrderDate
SELECT MAX(DAYNAME(OrderDate)) as WeekDay
FROM Orders O
GROUP BY OrderDate
ORDER BY WeekDay DESC;
The query you are trying to write is:
SELECT TOP (1) DAYNAME(OrderDate) as WeekDay, COUNT(*)
FROM Orders O
GROUP BY DAYNAME(OrderDate)
ORDER BY COUNT(*) DESC;
Although this query appears to answer your question, you have to be very careful. After all, if Orders had only one day's worth of data, then that day of the week would predominate. My suggestion would be to take data from a specified number of weeks -- preferably with no holidays -- to get this information.
Your question doesn't specify the database (although I could guess). The specific date/time logic for this depends on the database.
This will let you know how many orders were placed on which day.
SELECT
DAYNAME(OrderDate) AS dayofweek, COUNT(OrderID) AS ordered
FROM
Orders
GROUP BY dayofweek
ORDER BY ordered DESC;
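To return only the single most common day, the grouped result can be sorted by count and limited to one row. A minimal sketch, assuming SQLite via Python's built-in sqlite3: SQLite has no DAYNAME, so `strftime('%w', ...)` (0 = Sunday) stands in for it, and the sample dates are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
INSERT INTO Orders (OrderDate) VALUES
  ('2024-01-01'), ('2024-01-08'), ('2024-01-15'),  -- Mondays
  ('2024-01-02'), ('2024-01-09'),                  -- Tuesdays
  ('2024-01-06');                                  -- Saturday
""")

# strftime('%w') yields '0'..'6' with 0 = Sunday.
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

# Group by weekday, sort by count, keep only the top row.
weekday, order_count = conn.execute("""
    SELECT CAST(strftime('%w', OrderDate) AS INTEGER) AS weekday,
           COUNT(*) AS n
    FROM Orders
    GROUP BY weekday
    ORDER BY n DESC
    LIMIT 1
""").fetchone()

most_common_day = DAY_NAMES[weekday]
print(most_common_day, order_count)
```

If two weekdays tie, LIMIT 1 picks one of them arbitrarily; adding a second ORDER BY key would make the choice deterministic.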

SQL: Filtering out customers who have not purchased over the past year

I have a couple tables with information on customers, their account information, and transactions/sales. Numerous customers can be on an account, but I want to find the number of accounts who have not purchased anything over the past year.
Customer table =
individual id
date added
first transaction date
gnc account number
Account table =
account number
date added
expiration date
first transaction date
last purchase date
Transactions table =
transaction id
sales date
account number (null when customer doesn't have an account)
What do I need to incorporate into a query (a subquery, etc.) to exclude the accounts that did not perform a transaction over the past year?
PSEUDO CODE:
SELECT some columns
FROM the account table
LEFT OUTER JOIN to INCLUDE ALL the accounts and only those that match from transaction
on the account number
and the salesdate >= sysdate-365 (assuming this is a year)
Avoid filtering in the WHERE clause unless the condition is on the account table. If you filter on the transaction table in the WHERE clause, the outer join effectively becomes an inner join, because the filter excludes the NULL rows (unless each WHERE condition also allows NULL with OR).
Assuming the last purchase date is correct, you just need a simple query like:
select a.*
from accounts a
where LastPurchaseDate >= dateadd(year, -1, getdate());
If you want the number, use select count(*) rather than select a.*.
If you want the number who have not made a purchase, then use < rather than >=.
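If the last purchase date column cannot be trusted, the outer-join pseudo code can be turned into an anti-join: put the date condition in the ON clause and keep only the accounts with no matching transaction. A minimal sketch using Python's built-in sqlite3; the table and column names are assumptions based on the question's description, and a fixed reference date replaces sysdate/getdate() so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (account_number INTEGER PRIMARY KEY);
CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY,
                           sales_date TEXT, account_number INTEGER);
INSERT INTO accounts VALUES (1), (2), (3);
INSERT INTO transactions (sales_date, account_number) VALUES
  ('2023-11-20', 1),     -- within the last year
  ('2021-05-01', 2),     -- older than a year
  ('2023-12-01', NULL);  -- customer without an account
""")

as_of = "2024-01-01"  # fixed reference date instead of the current date

# The date filter lives in ON, so accounts without a recent transaction
# survive the join with NULLs; the WHERE then keeps only those.
inactive = conn.execute("""
    SELECT a.account_number
    FROM accounts a
    LEFT JOIN transactions t
           ON t.account_number = a.account_number
          AND t.sales_date >= date(?, '-1 year')
    WHERE t.transaction_id IS NULL
    ORDER BY a.account_number
""", (as_of,)).fetchall()

print(inactive)
print(len(inactive))
```

Account 2 (last purchase in 2021) and account 3 (no transactions at all) both come back as inactive; the length of the list is the number the question asks for.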

How to produce a distinct count of records that are stored by day by month

I have a table with several "ticket" records in it. Each ticket is stored by day (e.g. 2011-07-30 00:00:00.000). I would like to count the unique records in each month of each year. I have used the following SQL statement:
SELECT DISTINCT
YEAR(TICKETDATE) as TICKETYEAR,
MONTH(TICKETDATE) AS TICKETMONTH,
COUNT(DISTINCT TICKETID) AS DAILYTICKETCOUNT
FROM
NAT_JOBLINE
GROUP BY
YEAR(TICKETDATE),
MONTH(TICKETDATE)
ORDER BY
YEAR(TICKETDATE),
MONTH(TICKETDATE)
This does produce a count but it is wrong as it picks up the unique tickets for every day. I just want a unique count by month.
Try combining year and month into one field, and grouping on that new field.
You may have to cast them to varchar to ensure that they don't simply get added together. Or you could multiply the year by 100 and add the month:
SELECT
(YEAR(TICKETDATE) * 100) + MONTH(TICKETDATE),
count(*) AS DAILYTICKETCOUNT
FROM NAT_JOBLINE GROUP BY
(YEAR(TICKETDATE) * 100) + MONTH(TICKETDATE)
Presuming that TICKETID is not a primary or unique key, but appears multiple times in table NAT_JOBLINE, that query should work. If it is unique (it occurs in no more than one row per value), you will need to select on a different column, one that uniquely identifies the "entity" that you want to count, rather than each occurrence/instance/reference of that entity.
(As ever, it is hard to tell without working with the actual data.)
I think you need to remove the first DISTINCT. You already have the GROUP BY. If I were that first DISTINCT, I would be confused as to what I was supposed to do.
SELECT
YEAR(TICKETDATE) as TICKETYEAR,
MONTH(TICKETDATE) AS TICKETMONTH,
COUNT(DISTINCT TICKETID) AS DAILYTICKETCOUNT
FROM NAT_JOBLINE
GROUP BY YEAR(TICKETDATE), MONTH(TICKETDATE)
ORDER BY YEAR(TICKETDATE), MONTH(TICKETDATE)
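The effect of COUNT(DISTINCT TICKETID) with a month-level GROUP BY can be checked on a tiny data set. A minimal sketch using Python's built-in sqlite3, where `strftime('%Y', ...)` and `strftime('%m', ...)` stand in for YEAR() and MONTH(), and the sample rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NAT_JOBLINE (TICKETID INTEGER, TICKETDATE TEXT);
INSERT INTO NAT_JOBLINE VALUES
  (1, '2011-07-30 00:00:00'),
  (1, '2011-07-31 00:00:00'),  -- same ticket on a second day
  (2, '2011-07-31 00:00:00'),
  (3, '2011-08-01 00:00:00');
""")

# Group by year and month only; a ticket seen on several days of the
# same month is counted once by COUNT(DISTINCT ...).
rows = conn.execute("""
    SELECT CAST(strftime('%Y', TICKETDATE) AS INTEGER) AS ticket_year,
           CAST(strftime('%m', TICKETDATE) AS INTEGER) AS ticket_month,
           COUNT(DISTINCT TICKETID) AS distinct_tickets,
           COUNT(*)                 AS row_count
    FROM NAT_JOBLINE
    GROUP BY ticket_year, ticket_month
    ORDER BY ticket_year, ticket_month
""").fetchall()

for row in rows:
    print(row)
```

July has three rows but only two distinct tickets, which is the gap between COUNT(*) and COUNT(DISTINCT TICKETID) discussed in the answers above.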
From what I understand from your comments to Phillip Kelley's solution:
SELECT TICKETDATE, COUNT(*) AS DAILYTICKETCOUNT
FROM NAT_JOBLINE
GROUP BY TICKETDATE
should do the trick, but I suggest you update your question.