Left join on same table, analysis on trends mysql - sql

The table structure is as follows:
+---------------+---------+---------+
| customer_name | date | balance |
+---------------+---------+---------+
| 123 | june 14 | 20 |
| 123 | june 15 | 30 |
| 1234 | june 14 | 30 |
| 12345 | june 16 | 50 |
+---------------+---------+---------+
I would like to join the table to itself, keeping the 2014 records as my base data set, and analyse trends to see which customers' balances do not change from 2014.
For example, I would like to show the following:
+-----------+-----------+-----------+
| customer  | june14bal | june15bal |
+-----------+-----------+-----------+
| 1234 | 30 | null |
| 123 | 20 | 30 |
+-----------+-----------+-----------+
I have tried multiple left joins but can't seem to get it working. The most important thing is that my sample starts with records from 2014 only.
Current script:
with TABLE_DATA as
(
select Customer ,DATE, Balance
from table
where dATE in ('30-JUN-2014','30-juN-2015')
)
SELECT
sum(inv1.balance) as year1bal,
sum(inv2.balance) as year2bal,
customer,
date
from table_datA inv1
left join TABLE_DATA inv2
on inv1.customer= inv2.customer and inv2.as_of_Date = '30-June-2015'
group by date, customer

You can add a HAVING clause after the GROUP BY, like:
having sum(inv1.balance) != sum(inv2.balance)
Or try the query below:
with table2014 as
(
select Customer ,sum(Balance) Balance2014
from tableName
where dATE ='30-JUN-2014' group by Customer
)
,Table2015 as
( select Customer ,sum( Balance) Balance2015
from tableName
where dATE ='30-juN-2015' group by Customer
)
SELECT
inv1.customer,Balance2014, Balance2015
from table2014 inv1
left join Table2015 inv2
on inv1.customer= inv2.customer
--where Balance2014 !=Balance2015
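The self-join idea can be tried end to end with a small in-memory database. This is only a sketch: it uses Python's sqlite3 module rather than the asker's database, the table name balances and column as_of_date are made up for the illustration, and the dates are assumed to be 30 June of each year stored as ISO strings. The point it shows is that the 2014 filter goes in WHERE (it defines the starting sample) while the 2015 filter goes in the ON clause (so 2014 customers without a 2015 row are kept with NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; dates assumed to be ISO strings.
conn.execute("CREATE TABLE balances (customer TEXT, as_of_date TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [
        ("123", "2014-06-30", 20),
        ("123", "2015-06-30", 30),
        ("1234", "2014-06-30", 30),
        ("12345", "2016-06-30", 50),  # no 2014 row, so it must not appear
    ],
)

query = """
    SELECT y14.customer, y14.balance AS june14bal, y15.balance AS june15bal
    FROM balances AS y14
    LEFT JOIN balances AS y15
           ON y15.customer = y14.customer
          AND y15.as_of_date = '2015-06-30'  -- filter the joined side in ON
    WHERE y14.as_of_date = '2014-06-30'      -- filter the driving side in WHERE
    ORDER BY y14.customer
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Moving the 2015 condition from ON into WHERE would turn the left join into an inner join in effect, and customer 1234 would disappear from the result.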

Related

Find the first order of a supplier in a day using SQL

I am trying to write a query to return supplier ID (sup_id), order date and the order ID of the first order (based on earliest time).
+--------+--------+------------+--------+-----------------+
|orderid | sup_id | items | sales | order_ts |
+--------+--------+------------+--------+-----------------+
|1111132 | 3 | 1 | 27,0 | 24/04/17 13:00 |
|1111137 | 3 | 2 | 69,0 | 02/02/17 16:30 |
|1111147 | 1 | 1 | 87,0 | 25/04/17 08:25 |
|1111153 | 1 | 3 | 82,0 | 05/11/17 10:30 |
|1111155 | 2 | 1 | 29,0 | 03/07/17 02:30 |
|1111160 | 2 | 2 | 44,0 | 30/01/17 20:45 |
|....... | ... | ... | ... | ... ... |
+--------+--------+------------+--------+-----------------+
Output I am looking for:
+--------+--------+------------+
| sup_id | date | order_id |
+--------+--------+------------+
|....... | ... | ... |
+--------+--------+------------+
I tried using a subquery in the join clause as below but didn't know how to join it without having selected order_id.
SELECT sup_id, date(order_ts), order_id
FROM sales s
JOIN
(
SELECT sup_id, date(order_ts) as date, min(time(order_date))
FROM sales
GROUP BY merchant_id, date
) m
on ...
Kindly assist.
You can use not exists:
select *
from sales
where not exists (
-- find sales for same supplier, earlier date, same day
select *
from sales as older
where older.sup_id = sales.sup_id
and older.order_ts < sales.order_ts
and older.order_ts >= cast(sales.order_ts as date)
)
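The query below might not be the fastest in the world, but it should give you all the information you need. The subquery is restricted to the same supplier and the same day, so each row is compared with the earliest order of its own day.
select order_id, sup_id, items, sales, order_ts
from sales s
where order_ts = (
select min(order_ts)
from sales m
where m.sup_id = s.sup_id
and date(m.order_ts) = date(s.order_ts)
)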
select sup_id, min(order_ts), min(order_id) from sales
where cast(order_ts as date) = '2022-03-15'
group by sup_id
This assumes orderid is an identity / auto-increment column, so that the lowest ID on a given day belongs to the earliest order.
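The join the asker started (a subquery with the earliest time per supplier and day, joined back to the table) can be completed as in the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module, so date() here is SQLite's function; the sample rows are invented for the illustration and the timestamps are assumed to be stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (orderid INTEGER, sup_id INTEGER, order_ts TEXT)")
# Invented sample rows, not the data from the question.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, 3, "2017-04-24 13:00"),
        (2, 3, "2017-04-24 09:15"),  # earlier order, same supplier and day
        (3, 3, "2017-04-25 16:30"),
        (4, 1, "2017-04-24 08:25"),
        (5, 1, "2017-04-24 10:30"),
    ],
)

query = """
    SELECT s.sup_id, date(s.order_ts) AS order_date, s.orderid
    FROM sales AS s
    JOIN (
        -- earliest timestamp per supplier and calendar day
        SELECT sup_id, date(order_ts) AS order_date, MIN(order_ts) AS first_ts
        FROM sales
        GROUP BY sup_id, date(order_ts)
    ) AS m
      ON m.sup_id = s.sup_id
     AND m.first_ts = s.order_ts
    ORDER BY s.sup_id, order_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Joining on the minimum timestamp itself avoids having to select the order ID inside the grouped subquery. If two orders of one supplier share the exact same earliest timestamp, both are returned.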

Find missing values SQL Server?

I have a table (dataset_final) that contains the number of sales (field quantity) of goods in a particular store for a particular week of the year. There are about 200 thousand unique goods and about 50 stores, over a period of 6 years.
dataset_final
+---------+-------------+---------+----------+----------+
| year_id | week_number | good_id | store_id | quantity |
+---------+-------------+---------+----------+----------+
| 2017 | 37 | 137233 | 9 | 1 |
+---------+-------------+---------+----------+----------+
| 2017 | 38 | 137233 | 9 | 4 |
+---------+-------------+---------+----------+----------+
| 2017 | 40 | 137233 | 9 | 3 |
+---------+-------------+---------+----------+----------+
| 2016 | 35 | 152501 | 23 | 6 |
+---------+-------------+---------+----------+----------+
| 2016 | 37 | 152501 | 23 | 3 |
+---------+-------------+---------+----------+----------+
I would like to fill in zeros for the missing values, i.e. the weeks of the year in which a given combination of good and store was not sold. For example:
+---------+-------------+---------+----------+----------+
| year_id | week_number | good_id | store_id | quantity |
+---------+-------------+---------+----------+----------+
| 2017 | 37 | 137233 | 9 | 1 |
+---------+-------------+---------+----------+----------+
| 2017 | 38 | 137233 | 9 | 4 |
+---------+-------------+---------+----------+----------+
| 2017 | 40 | 137233 | 9 | 3 |
+---------+-------------+---------+----------+----------+
| 2016 | 35 | 152501 | 23 | 6 |
+---------+-------------+---------+----------+----------+
| 2016 | 37 | 152501 | 23 | 3 |
+---------+-------------+---------+----------+----------+
| 2017 | 39 | 137233 | 9 | 0 |
+---------+-------------+---------+----------+----------+
| 2016 | 36 | 152501 | 23 | 0 |
+---------+-------------+---------+----------+----------+
I wanted to do this: find all unique combinations of year_id, week_number, good_id, store_id and add only those that are not in the dataset_final table. My query:
WITH t1 AS (SELECT DISTINCT
[year_id]
,[week_number]
,[good_id]
,[store_id]
FROM [fs_db].[dbo].[ds_dataset_final]),
t2 AS (SELECT DISTINCT [year_id], [week_number] FROM [fs_db].[dbo].[ds_dataset_final])
SELECT t2.[year_id], t2.[week_number], t1.[good_id], t1. [store_id] FROM t1
full join t2 ON t2.[year_id]=t1.[year_id] AND t2.[week_number]=t2.[week_number]
This query produces about 1.2 billion unique combinations, which seems too much.
Also, I take into account the combination only from the beginning of sales of goods, for example, if the table has sales of a particular product only from 2017, then I do not need to fill in earlier data.
The basic idea is to generate all the rows using cross join and then use left join to bring in the values.
Assuming you have all year/week combinations in your original table and have all the goods and stores in the table, you can use:
select yw.year_id, yw.week_number,
g.good_id, s.store_id,
coalesce(d.quantity, 0) as quantity
from (select distinct year_id, week_number
from fs_db..ds_dataset_final
) yw cross join
(select distinct good_id
from fs_db..ds_dataset_final
) g cross join
(select distinct store_id
from fs_db..ds_dataset_final
) s left join
fs_db..ds_dataset_final d
on d.year_id = yw.year_id and
d.week_number = yw.week_number and
d.good_id = g.good_id and
d.store_id = s.store_id;
You may have other sources for each of the dimensions (such as a proper dimension table). If so, don't use select distinct but use the reference tables.
EDIT:
Just add the following as the last line of the query:
where yw.year_id >= 2015 and yw.year_id < 2019
if you want the years 2015, 2016, 2017, and 2018.
This is very much pseudo SQL, since I don't know what your actual database looks like, but it should get you on the right path. You'll need to replace objects like dbo.Store with your actual objects, and I suggest creating a proper calendar table:
--This should really be a full calendar table, but we'll make a sample here
CREATE TABLE dbo.Weeks (Year int,
Week int);
INSERT INTO dbo.Weeks (Year, Week)
SELECT Y.Year,
W.Week
FROM (VALUES(2016),(2017),(2018),(2019))Y(Year)
CROSS APPLY (SELECT TOP 52 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Week
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N1(N),
(VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N2(N)) W
GO
WITH CTE AS(
SELECT W.Year,
W.Week,
S.StoreID,
G.GoodsID
FROM dbo.Weeks W
CROSS JOIN dbo.Store S
CROSS JOIN dbo.Goods G
WHERE EXISTS (SELECT 1
FROM dbo.YourTable YT
WHERE YT.year_id <= W.Year
AND YT.store_id = S.StoreID))
SELECT C.Year,
C.Week,
C.StoreID,
C.GoodsID,
ISNULL(YT.quantity,0) AS quantity
FROM CTE C
LEFT JOIN YourTable YT ON C.Year = YT.year_id
AND C.Week = YT.week_number
AND C.StoreID = YT.store_id
AND C.GoodsID = YT.good_id
--WHERE?
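To keep the row count down, the generated combinations can be limited to good/store pairs that actually occur and to the weeks from each pair's first sale onward. The sketch below shows one way to do that on the sample rows from the question, using SQLite through Python's sqlite3 module instead of SQL Server. The weeks table is a hypothetical 52-week calendar, and bounding each pair by its last sale as well is an extra assumption made here so the output stays small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset_final "
    "(year_id INT, week_number INT, good_id INT, store_id INT, quantity INT)"
)
conn.executemany(
    "INSERT INTO dataset_final VALUES (?, ?, ?, ?, ?)",
    [
        (2017, 37, 137233, 9, 1),
        (2017, 38, 137233, 9, 4),
        (2017, 40, 137233, 9, 3),
        (2016, 35, 152501, 23, 6),
        (2016, 37, 152501, 23, 3),
    ],
)
# Hypothetical calendar: 52 weeks per year (a real calendar may have 53).
conn.execute("CREATE TABLE weeks (year_id INT, week_number INT)")
conn.executemany(
    "INSERT INTO weeks VALUES (?, ?)",
    [(y, w) for y in (2016, 2017) for w in range(1, 53)],
)

query = """
    WITH pairs AS (
        -- one row per good/store pair with its first and last sale week
        SELECT good_id, store_id,
               MIN(year_id * 100 + week_number) AS first_wk,
               MAX(year_id * 100 + week_number) AS last_wk
        FROM dataset_final
        GROUP BY good_id, store_id
    )
    SELECT w.year_id, w.week_number, p.good_id, p.store_id,
           COALESCE(d.quantity, 0) AS quantity
    FROM pairs AS p
    JOIN weeks AS w
      ON w.year_id * 100 + w.week_number BETWEEN p.first_wk AND p.last_wk
    LEFT JOIN dataset_final AS d
      ON d.year_id = w.year_id
     AND d.week_number = w.week_number
     AND d.good_id = p.good_id
     AND d.store_id = p.store_id
    ORDER BY p.good_id, w.year_id, w.week_number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Dropping the upper bound (last_wk) fills every week from the first sale to the end of the calendar, which is closer to "from the beginning of sales" as described in the question.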

postgresql - cumul. sum active customers by month (removing churn)

I want to create a query to get the cumulative sum by month of our active customers. The tricky thing here is that (unfortunately) some customers churn and so I need to remove them from the cumulative sum on the month they leave us.
Here is a sample of my customers table :
customer_id | begin_date | end_date
-----------------------------------------
1 | 15/09/2017 |
2 | 15/09/2017 |
3 | 19/09/2017 |
4 | 23/09/2017 |
5 | 27/09/2017 |
6 | 28/09/2017 | 15/10/2017
7 | 29/09/2017 | 16/10/2017
8 | 04/10/2017 |
9 | 04/10/2017 |
10 | 05/10/2017 |
11 | 07/10/2017 |
12 | 09/10/2017 |
13 | 11/10/2017 |
14 | 12/10/2017 |
15 | 14/10/2017 |
Here is what I am looking to achieve :
month | active customers
-----------------------------------------
2017-09 | 7
2017-10 | 6
I've managed to achieve it with the following query ... However, I'd like to know if there is a better way.
select
"begin_date" as "date",
sum((new_customers.new_customers-COALESCE(churn_customers.churn_customers,0))) OVER (ORDER BY new_customers."begin_date") as active_customers
FROM (
select
date_trunc('month',begin_date)::date as "begin_date",
count(id) as new_customers
from customers
group by 1
) as new_customers
LEFT JOIN(
select
date_trunc('month',end_date)::date as "end_date",
count(id) as churn_customers
from customers
where
end_date is not null
group by 1
) as churn_customers on new_customers."begin_date" = churn_customers."end_date"
order by 1
;
You may use a CTE to count the end dates per month and then subtract that count from the count of begin dates, using a left join:
Query 1:
WITH edt
AS (
SELECT to_char(end_date, 'yyyy-mm') AS mon
,count(*) AS ct
FROM customers
WHERE end_date IS NOT NULL
GROUP BY to_char(end_date, 'yyyy-mm')
)
SELECT to_char(c.begin_date, 'yyyy-mm') as month
,COUNT(*) - MAX(COALESCE(ct, 0)) AS active_customers
FROM customers c
LEFT JOIN edt ON to_char(c.begin_date, 'yyyy-mm') = edt.mon
GROUP BY to_char(begin_date, 'yyyy-mm')
ORDER BY month;
Results:
| month | active_customers |
|---------|------------------|
| 2017-09 | 7 |
| 2017-10 | 6 |
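If "cumulative" is read as a running total, a different formulation counts, for each month, the customers who had started by that month and had not yet ended in it. The sketch below does that on the sample data with SQLite through Python's sqlite3 module (strftime is SQLite's function here, not PostgreSQL's), with dates assumed to be ISO strings. Under this reading October comes out as 13 (7 carried over from September, plus 8 new, minus 2 churned), which is what the window query in the question also produces, rather than the 6 in the desired-output table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INT, begin_date TEXT, end_date TEXT)")
sample = [
    (1, "2017-09-15", None), (2, "2017-09-15", None), (3, "2017-09-19", None),
    (4, "2017-09-23", None), (5, "2017-09-27", None),
    (6, "2017-09-28", "2017-10-15"), (7, "2017-09-29", "2017-10-16"),
    (8, "2017-10-04", None), (9, "2017-10-04", None), (10, "2017-10-05", None),
    (11, "2017-10-07", None), (12, "2017-10-09", None), (13, "2017-10-11", None),
    (14, "2017-10-12", None), (15, "2017-10-14", None),
]
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", sample)

query = """
    WITH months AS (
        SELECT DISTINCT strftime('%Y-%m', begin_date) AS month
        FROM customers
    )
    SELECT m.month,
           (SELECT COUNT(*)
              FROM customers AS c
             WHERE strftime('%Y-%m', c.begin_date) <= m.month
               -- a customer who ends in month M no longer counts for M
               AND (c.end_date IS NULL
                    OR strftime('%Y-%m', c.end_date) > m.month)
           ) AS active_customers
    FROM months AS m
    ORDER BY m.month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The months list is taken from begin_date only, so a month in which nobody joined would be missing; a calendar table would cover that case.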

SQL sum and group values from two tables

I'm trying to group sales data based on a sellers' name. The name is available in another table. My tables look like this:
InvoiceRow:
+-----------+----------+-----+----------+
| InvoiceNr | Title | Row | Amount |
+-----------+----------+-----+----------+
| 1 | Chair | 1 | 2000.00 |
| 2 | Sofa | 1 | 1500.00 |
| 2 | Cushion | 2 | 2000.00 |
| 3 | Lamp | 1 | 6500.00 |
| 4 | Table | 1 | -500.00 |
+-----------+----------+-----+----------+
InvoiceHead:
+-----------+----------+------------+
| InvoiceNr | Seller | Date |
+-----------+----------+------------+
| 1 | Adam | 2016-01-01 |
| 2 | Lisa | 2016-01-04 |
| 3 | Adam | 2016-01-08 |
| 4 | Carl | 2016-01-17 |
+-----------+----------+------------+
The query that I'm working with currently looks like this:
SELECT SUM(Amount)
FROM InvoiceRow
WHERE InvoiceNr IN (
SELECT InvoiceNr
FROM InvoiceHead
WHERE Date >= '2016-01-01' AND Date < '2016-02-01'
)
This works and will sum the values of all rows of all invoices (total sales) in the month of January.
What I want to do is a sales summary grouped by each sellers' name. Something like this:
+----------+------------+
| Seller | Amount |
+----------+------------+
| Adam | 8500.00 |
| Lisa | 3500.00 |
| Carl | -500.00 |
+----------+------------+
And after that maybe even grouped by month (but that's not part of this question, I'm hoping to be able to figure that out if I solve this).
I've tried all kinds of joins but I end up with a lot of duplicates, and I'm not sure how to SUM and group at the same time. Does anyone know how to do this?
Try this:
SELECT seller, SUM(amount) FROM InvoiceRow
JOIN InvoiceHead
ON InvoiceRow.InvoiceNr = InvoiceHead.InvoiceNr
GROUP BY InvoiceHead.seller;
Or, if you want to restrict it to a date range, try this:
SELECT seller, SUM(amount) FROM InvoiceRow
JOIN InvoiceHead
ON InvoiceRow.InvoiceNr = InvoiceHead.InvoiceNr
WHERE InvoiceHead.Date >= '2016-01-01' AND InvoiceHead.Date < '2016-02-01'
GROUP BY InvoiceHead.seller;
You just need to join the tables, filter result by date as you need and then make grouping:
select
H.Seller,
sum(R.Amount) as Amount
from InvoiceHead as H
left outer join InvoiceRow as R on R.InvoiceNr = H.InvoiceNr
where H.Date >= '2016-01-01' AND H.Date < '2016-02-01'
group by H.Seller
SELECT t.seller,sum(s.amount)
FROM invoiceRow s join InvoiceHead t
ON s.invoiceNr = t.invoiceNr
group by t.seller
You should just sum them up. If a date range is necessary, you can add a WHERE clause after the ON clause and filter your dates like this:
SELECT t.seller,sum(s.amount)
FROM invoiceRow s join InvoiceHead t
ON s.invoiceNr = t.invoiceNr
WHERE t.date between '2016-01-01' and '2016-01-31'
group by t.seller
You may try this once:
SELECT ih.Seller,
(
SELECT SUM(Amount) FROM invoicerow ir
INNER JOIN invoicehead ih1
ON (ir.InvoiceNr = ih1.InvoiceNr)
WHERE ih1.Seller = ih.Seller
) AS Amount
FROM invoicehead ih
GROUP BY ih.Seller
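The join-then-group approach from the answers above can be checked against the sample tables from the question. The sketch below loads them into an in-memory SQLite database through Python's sqlite3 module (an assumption, since the question does not name its database) and stores the dates as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE InvoiceRow (InvoiceNr INT, Title TEXT, Row INT, Amount REAL)")
conn.execute("CREATE TABLE InvoiceHead (InvoiceNr INT, Seller TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO InvoiceRow VALUES (?, ?, ?, ?)",
    [
        (1, "Chair", 1, 2000.00),
        (2, "Sofa", 1, 1500.00),
        (2, "Cushion", 2, 2000.00),
        (3, "Lamp", 1, 6500.00),
        (4, "Table", 1, -500.00),
    ],
)
conn.executemany(
    "INSERT INTO InvoiceHead VALUES (?, ?, ?)",
    [
        (1, "Adam", "2016-01-01"),
        (2, "Lisa", "2016-01-04"),
        (3, "Adam", "2016-01-08"),
        (4, "Carl", "2016-01-17"),
    ],
)

query = """
    SELECT h.Seller, SUM(r.Amount) AS Amount
    FROM InvoiceHead AS h
    JOIN InvoiceRow AS r ON r.InvoiceNr = h.InvoiceNr
    WHERE h.Date >= '2016-01-01' AND h.Date < '2016-02-01'
    GROUP BY h.Seller
    ORDER BY h.Seller
"""
rows = conn.execute(query).fetchall()
for seller, amount in rows:
    print(seller, amount)
```

Each invoice row joins to exactly one head row, so the join does not duplicate amounts; the half-open date range (>= first day, < first day of the next month) also works unchanged if the column holds timestamps.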

Data aggregation with left-outer join

I am trying to pull some data with transaction counts, by branch, by week, which will later be used to feed some dynamic .Net charts.
I have a calendar table, I have a branch table and I have a transaction table.
Here is my DB info (only relevant columns included):
Branch Table:
ID (int), Branch (varchar)
Calendar Table:
Date (datetime), WeekOfYear(int)
Transaction Table:
Date (datetime), Branch (int), TransactionCount(int)
So, I want to do something like the following:
Select b.Branch, c.WeekOfYear, sum(TransactionCount)
FROM BranchTable b
LEFT OUTER JOIN TransactionTable t
on t.Branch = b.ID
JOIN Calendar c
on t.Date = c.Date
WHERE YEAR(c.Date) = @Year -- (SP accepts this parameter)
GROUP BY b.Branch, c.WeekOfYear
Now, this works EXCEPT when a branch doesn't have any transactions for a week, in which case NO RECORD is returned for that branch on that week. What I WANT is to get that branch, that week and "0" for the sum. I tried isnull(sum(TransactionCount), 0) - but that didn't work, either. So I will get the following (making up sums for illustration purposes):
+--------+------------+-----+
| Branch | WeekOfYear | Sum |
+--------+------------+-----+
| 1 | 1 | 25 |
| 2 | 1 | 37 |
| 3 | 1 | 19 |
| 4 | 1 | 0 | //THIS RECORD DOES NOT GET RETURNED, BUT I NEED IT!
| 1 | 2 | 64 |
| 2 | 2 | 34 |
| 3 | 2 | 53 |
| 4 | 2 | 11 |
+--------+------------+-----+
So, why doesn't the left-outer join work? Isn't that supposed to
Any help will be greatly appreciated. Thank you!
EDIT: SAMPLE TABLE DATA:
Branch Table:
+----+---------------+
| ID | Branch |
+----+---------------+
| 1 | First Branch |
| 2 | Second Branch |
| 3 | Third Branch |
| 4 | Fourth Branch |
+----+---------------+
Calendar Table:
+------------+------------+
| Date | WeekOfYear |
+------------+------------+
| 01/01/2015 | 1 |
| 01/02/2015 | 1 |
+------------+------------+
Transaction Table
+------------+--------+--------------+
| Date | Branch | Transactions |
+------------+--------+--------------+
| 01/01/2015 | 1 | 12 |
| 01/01/2015 | 1 | 9 |
| 01/01/2015 | 2 | 4 |
| 01/01/2015 | 2 | 2 |
| 01/01/2015 | 2 | 23 |
| 01/01/2015 | 3 | 42 |
| 01/01/2015 | 3 | 19 |
| 01/01/2015 | 3 | 7 |
+------------+--------+--------------+
If you want to return a query that contains each Branch and each week, then you'll need to first create a full list of that, then use a LEFT JOIN to the transactions to get the count. The code will be similar to:
select bc.Branch,
bc.WeekOfYear,
TotalTransaction = coalesce(sum(t.TransactionCount), 0)
from
(
select b.id, b.branch, c.WeekOfYear, c.date
from branch b
cross join Calendar c
-- if you want to limit the number of rows returned use a WHERE to limit the weeks
-- so far in the year or using the date column
WHERE c.date <= getdate()
and YEAR(c.Date) = @Year -- (SP accepts this parameter)
) bc
left join TransactionTable t
on t.Date = bc.Date
and bc.id = t.branch
GROUP BY bc.Branch, bc.WeekOfYear
The subquery builds a full list of every branch paired with every date. Once you have this list, you can LEFT JOIN it to the transactions to get the total transaction count, and every branch and week is returned even when there are no transactions.
Bring in the Calendar before you bring in the transactions:
SELECT b.Branch, c.WeekOfYear, sum(TransactionCount)
FROM BranchTable b
INNER JOIN CalendarTable c ON YEAR(c.Date) = @Year
LEFT JOIN TransactionTable t ON t.Branch = b.ID AND t.Date = c.Date
GROUP BY b.Branch, c.WeekOfYear
ORDER BY c.WeekOfYear, b.Branch
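The effect of pairing branches with the calendar before bringing in the transactions can be seen on the sample data from the question. This sketch uses SQLite through Python's sqlite3 module instead of SQL Server, so the year filter is written with strftime and a bound parameter; the table names follow the question, and the dates are assumed to be ISO strings (1 and 2 January 2015).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BranchTable (ID INT, Branch TEXT)")
conn.execute("CREATE TABLE CalendarTable (Date TEXT, WeekOfYear INT)")
conn.execute("CREATE TABLE TransactionTable (Date TEXT, Branch INT, TransactionCount INT)")
conn.executemany(
    "INSERT INTO BranchTable VALUES (?, ?)",
    [(1, "First Branch"), (2, "Second Branch"), (3, "Third Branch"), (4, "Fourth Branch")],
)
conn.executemany(
    "INSERT INTO CalendarTable VALUES (?, ?)",
    [("2015-01-01", 1), ("2015-01-02", 1)],
)
conn.executemany(
    "INSERT INTO TransactionTable VALUES (?, ?, ?)",
    [
        ("2015-01-01", 1, 12), ("2015-01-01", 1, 9),
        ("2015-01-01", 2, 4), ("2015-01-01", 2, 2), ("2015-01-01", 2, 23),
        ("2015-01-01", 3, 42), ("2015-01-01", 3, 19), ("2015-01-01", 3, 7),
    ],
)

query = """
    SELECT b.Branch, c.WeekOfYear,
           COALESCE(SUM(t.TransactionCount), 0) AS Total
    FROM BranchTable AS b
    CROSS JOIN CalendarTable AS c            -- every branch with every date
    LEFT JOIN TransactionTable AS t
           ON t.Branch = b.ID AND t.Date = c.Date
    WHERE strftime('%Y', c.Date) = ?
    GROUP BY b.ID, b.Branch, c.WeekOfYear
    ORDER BY c.WeekOfYear, b.ID
"""
rows = conn.execute(query, ("2015",)).fetchall()
for row in rows:
    print(row)
```

In the original query the inner join to the calendar on t.Date discards the rows where the left join produced NULL transaction columns, which is why the fourth branch vanished. Here the calendar is joined to the branches first, so those rows survive and COALESCE turns the NULL sum into 0.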