multiple transactions within a certain time period, limited by date range - sql

I have a database of transactions, people, transaction dates, items, etc.
Each time a person buys an item, the transaction is stored in the table like so:
personNumber, TransactionNumber, TransactionDate, ItemNumber
What I want to do is to find people (personNumber) who, from January 1st 2012(transactionDate) until March 1st 2012 have purchased the same ItemNumber multiple times within 14 days (configurable) or less. I then need to list all those transactions on a report.
Sample data:
personNumber, TransactionNumber, TransactionDate, ItemNumber
1 | 100| 2001-01-31| 200
2 | 101| 2001-02-01| 206
2 | 102| 2001-02-11| 300
1 | 103| 2001-02-09| 200
3 | 104| 2001-01-01| 001
1 | 105| 2001-02-10| 200
3 | 106| 2001-01-03| 001
1 | 107| 2001-02-28| 200
Results:
personNumber, TransactionNumber, TransactionDate, ItemNumber
1 | 100| 2001-01-31| 200
1 | 103| 2001-02-09| 200
1 | 105| 2001-02-10| 200
3 | 104| 2001-01-01| 001
3 | 106| 2001-01-03| 001
How would you go about doing that?
I've tried doing it like so:
select *
from (
select personNumber, transactionNumber, transactionDate, itemNumber,
count(*) over (
partition by personNumber, itemNumber) as boughtSame
from transactions
where transactionDate between '2001-01-01' and '2001-03-01')t
where boughtSame > 1
and it gets me this:
personNumber, TransactionNumber, TransactionDate, ItemNumber
1 | 100| 2001-01-31| 200
1 | 103| 2001-02-09| 200
1 | 105| 2001-02-10| 200
1 | 107| 2001-02-28| 200
3 | 104| 2001-01-01| 001
3 | 106| 2001-01-03| 001
The issue is that I don't want TransactionNumber 107, since that's not within the 14 days. I'm not sure where to put in that limit of 14 days. I could do a datediff, but where, and over what?

Alas, the window functions in SQL Server 2005 just are not quite powerful enough. I would solve this using a correlated subquery.
The correlated subquery counts the number of times that a person purchased the item within 14 days after each purchase (and not counting the first purchase).
select t.*
from (select t.*,
(select count(*)
from transactions t2
where t2.personnumber = t.personnumber and
t2.itemnumber = t.itemnumber and
t2.transactionnumber <> t.transactionnumber and
t2.transactiondate >= t.transactiondate and
t2.transactiondate < DATEADD(day, 14, t.transactiondate)
) NumWithin14Days
from transactions t
where transactionDate between '2001-01-01' and '2001-03-01'
) t
where NumWithin14Days > 0
You may want to put the time limit in the subquery as well.
An index on transactions(personnumber, itemnumber, transactionnumber, transactiondate) might help this run much faster.
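One thing to watch: the subquery above looks only forward from each purchase, so the last purchase in a cluster (transactions 105 and 106 in the sample data) has nothing after it and would be filtered out. Below is a minimal sketch of a variant that checks both directions. It is written in Python against an in-memory SQLite database purely so it can be run as-is; `julianday()` is SQLite's stand-in for SQL Server's DATEDIFF, and the function name `repeat_purchases` is made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table transactions ("
    "personNumber int, transactionNumber int, transactionDate text, itemNumber int)"
)
conn.executemany(
    "insert into transactions values (?, ?, ?, ?)",
    [
        (1, 100, "2001-01-31", 200),
        (2, 101, "2001-02-01", 206),
        (2, 102, "2001-02-11", 300),
        (1, 103, "2001-02-09", 200),
        (3, 104, "2001-01-01", 1),
        (1, 105, "2001-02-10", 200),
        (3, 106, "2001-01-03", 1),
        (1, 107, "2001-02-28", 200),
    ],
)


def repeat_purchases(conn, start, end, days=14):
    """Return transactions that have another purchase of the same item by the
    same person within `days` days, before or after (hypothetical helper)."""
    sql = """
        select t.personNumber, t.transactionNumber, t.transactionDate, t.itemNumber
        from transactions t
        where t.transactionDate between :start and :end
          and exists (
              select 1
              from transactions t2
              where t2.personNumber = t.personNumber
                and t2.itemNumber = t.itemNumber
                and t2.transactionNumber <> t.transactionNumber
                and t2.transactionDate between :start and :end
                -- look both backwards and forwards in time
                and abs(julianday(t2.transactionDate)
                        - julianday(t.transactionDate)) <= :days
          )
        order by t.personNumber, t.transactionNumber
    """
    return conn.execute(sql, {"start": start, "end": end, "days": days}).fetchall()


rows = repeat_purchases(conn, "2001-01-01", "2001-03-01")
for row in rows:
    print(row)
```

On SQL Server the same idea would presumably be written with `ABS(DATEDIFF(day, t.transactiondate, t2.transactiondate)) <= 14` inside the subquery; that translation is untested here.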

If as your question states you just want to find people (personNumbers) with the specified criteria, you can do a self join and group by:
create table #tx (personNumber int, transactionNumber int, transactionDate dateTime, itemNumber int)
insert into #tx
values
(1, 100, '2001-01-31', 200),
(2, 101, '2001-02-01', 206),
(2, 102, '2001-02-11', 300),
(1, 103, '2001-02-09', 200),
(3, 104, '2001-01-01', 001),
(1, 105, '2001-02-10', 200),
(3, 106, '2001-01-03', 001),
(1, 107, '2001-02-28', 200)
declare @days int = 14
select t1.personNumber from #tx t1 inner join #tx t2 on
t1.personNumber = t2.personNumber
and t1.itemNumber = t2.itemNumber
and t1.transactionNumber < t2.transactionNumber
and datediff(day, t1.transactionDate, t2.transactionDate) between 0 and @days
group by t1.personNumber
-- if more than zero joined rows there is more than one transaction in period
having count(t1.personNumber) > 0
drop table #tx

Related

How to SUM column 1 and select column 2 by condition?

I'm stuck on how to sum column A (PRICE) and select column B (PERCENT) with a condition: if column B >= 50, select that row's ID.
Example table:
+----+-----------+---------+
| ID | PRICE | PERCENT |
+----+-----------+---------+
| 1 | 5 | 5 |
| 2 | 18 | 20 |
| 3 | 7 | 50 |
| 4 | 16 | 56 |
| 5 | 50 | 87 |
| 6 | 17 | 95 |
| 7 | 40 | 107 |
+----+-----------+---------+
SELECT ID, SUM(PRICE) AS PRICE, PERCENT FROM Table
I want the ID and PERCENT columns to come from a row with PERCENT >= 50.
The result should be
Any suggestions?
Try the query below:
declare @tbl table(ID int, PRICE int, [PERCENT] int);
insert into @tbl values
(1, 5, 5),
(2, 18, 20),
(3, 7, 50),
(4, 16, 56),
(5, 50, 87),
(6, 17, 95),
(7, 40, 107);
select top 1 ID,
(select sum(PRICE) from @tbl) PRICE,
[PERCENT]
from @tbl
where [PERCENT] > 50
You could include the total in a subquery in the SELECT clause of your query like this:
SELECT
[ID],
(SELECT SUM([PRICE]) FROM T) AS [PRICE],
[PERCENT]
FROM
T
WHERE
[PERCENT] >= 50
However, it remains unclear which of the five valid records should be picked. You indicated it should be the record where PERCENT has value 56, but IMHO value 50 would be possible too, just like 87, 95, and 107 (?). It is unclear why you pick value 56 as the correct one. If it doesn't matter, you could use TOP (1) in the SELECT clause, but if it does matter, you should extend the WHERE clause with appropriate conditions/filters.
Mixing aggregate data from groups back with individual elements/records like this is often fuzzy. I consider it to be a "code smell" and here in your question on StackOverflow, it might indicate an XY-problem. Anyway, these query results might get misinterpreted quite easily if you are not careful. Always remember that such aggregated data in the result (in this case the PRICE field) has practically nothing to do with the detail data in the result (in this case the ID and PERCENT fields). Unless you want to combine your aggregate data with your detail data (in a calculation for example), but you do not indicate you want anything like that in your question...
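To make the choice of row explicit rather than leaving it to chance, one option is to add an ORDER BY. The sketch below assumes the intended rule is "the row with the smallest PERCENT strictly above 50", which is only a guess based on 56 being the value the question points at. It runs against an in-memory SQLite table from Python, so `limit 1` takes the place of SQL Server's `TOP (1)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('create table T (ID int, PRICE int, "PERCENT" int)')
conn.executemany(
    "insert into T values (?, ?, ?)",
    [(1, 5, 5), (2, 18, 20), (3, 7, 50), (4, 16, 56),
     (5, 50, 87), (6, 17, 95), (7, 40, 107)],
)

# The scalar subquery returns the grand total of PRICE for every row;
# ORDER BY + LIMIT then pins down which single detail row is returned.
row = conn.execute(
    """
    select ID,
           (select sum(PRICE) from T) as PRICE,
           "PERCENT"
    from T
    where "PERCENT" > 50          -- assumed rule: strictly above 50
    order by "PERCENT"            -- assumed rule: smallest qualifying value
    limit 1
    """
).fetchone()
print(row)
```

If a different row is wanted (for example the one with PERCENT = 50), only the WHERE and ORDER BY lines need to change; the total is unaffected.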
You can use this trick to get the results of two queries in one query:
select ID as ID,T.[PERCENT] AS B, 0 as sumA
from Table_1 as T
where T.[PERCENT]>=50
union All
select 0 as ID,0 AS B, sum(t.[PRICE]) as sumA
from Table_1 as T
I'm not sure why you need this, but you can certainly achieve the output above using the query below.
Sample Data
declare @data table
(Id int, Price int, [Percent] int)
insert @data
VALUES (1,5,5),
(2,18,20),
(3,7,50),
(4,16,56),
(5,50,87),
(6,17,95),
(7,40,107)
Query
select top 1 ID, (select sum(price) from @data) as Price, [Percent]
from @data
where [Percent] >50
You can try the following code:
SELECT TOP (1) [ID], SUM(PRICE) OVER (), [PERCENT]
FROM @tbl
ORDER BY CASE WHEN [PERCENT] > 50 THEN 0 ELSE 1 END, [ID];
I am using OVER clause in order to extract/read data from the table only once - one table scan.

SQL performing day difference by matching value

My goal is to get the duration when the 1st OLD or 1st NEW status reaches to the 1st END. For example: Table1
ID Day STATUS
111 1 NEW
111 2 NEW
111 3 OLD
111 4 END
111 5 END
112 1 OLD
112 2 OLD
112 3 NEW
112 4 NEW
112 5 END
113 1 NEW
113 2 NEW
The desired outcome would be:
STATUS Count
NEW 2 (1 for ID 111-New on day 1 to End on day 4,and 1 for 112-new on day 3 to End on day 5)
OLD 2 (1 for ID 111-Old on day 3 to End on day 4, and 1 for 112-OLD on day 1 to End on day 5)
The following is T-SQL (SQL Server) and NOT available in MySQL. The choice of dbms is vital in a question because there are so many dbms specific choices to make. The query below requires using a "window function" row_number() over() and a common table expression neither of which exist yet in MySQL (but will one day). This solution also uses cross apply which (to date) is SQL Server specific but there are alternatives in Postgres and Oracle 12 using lateral joins.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Table1
(id int, day int, status varchar(3))
;
INSERT INTO Table1
(id, day, status)
VALUES
(111, 1, 'NEW'),
(111, 2, 'NEW'),
(111, 3, 'OLD'),
(111, 4, 'END'),
(111, 5, 'END'),
(112, 1, 'OLD'),
(112, 2, 'OLD'),
(112, 3, 'NEW'),
(112, 4, 'NEW'),
(112, 5, 'END'),
(113, 1, 'NEW'),
(113, 2, 'NEW')
;
Query 1:
with cte as (
select
*
from (
select t.*
, row_number() over(partition by id, status order by day) rn
from table1 t
) d
where rn = 1
)
select
t.id, t.day, ca.nxtDay, t.Status, ca.nxtStatus
from cte t
outer apply (
select top(1) Status, day
from cte nxt
where t.id = nxt.id
and t.status = 'NEW' and nxt.status = 'END'
order by day
) ca (nxtStatus, nxtDay)
where nxtStatus IS NOT NULL or Status = 'OLD'
order by id, day
Results:
| id | day | nxtDay | Status | nxtStatus |
|-----|-----|--------|--------|-----------|
| 111 | 1 | 4 | NEW | END |
| 111 | 3 | (null) | OLD | (null) |
| 112 | 1 | (null) | OLD | (null) |
| 112 | 3 | 5 | NEW | END |
As you can see, counting that Status column would result in NEW = 2 and OLD = 2
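For readers who want the two counts directly, here is a rough sketch of an alternative formulation that uses only MIN and GROUP BY instead of row_number() and outer apply. It is not the query above: it requires an END row after the first NEW or OLD row for both statuses, which is an assumption drawn from the desired outcome in the question. It runs against an in-memory SQLite copy of Table1 from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Table1 (id int, day int, status text)")
conn.executemany(
    "insert into Table1 values (?, ?, ?)",
    [
        (111, 1, "NEW"), (111, 2, "NEW"), (111, 3, "OLD"), (111, 4, "END"),
        (111, 5, "END"), (112, 1, "OLD"), (112, 2, "OLD"), (112, 3, "NEW"),
        (112, 4, "NEW"), (112, 5, "END"), (113, 1, "NEW"), (113, 2, "NEW"),
    ],
)

counts = conn.execute(
    """
    select f.status, count(*) as cnt
    from (
        -- first day each id was seen in each status
        select id, status, min(day) as first_day
        from Table1
        group by id, status
    ) f
    join (
        -- first END day per id; ids without an END row drop out of the join
        select id, min(day) as end_day
        from Table1
        where status = 'END'
        group by id
    ) e on e.id = f.id and e.end_day > f.first_day
    where f.status in ('NEW', 'OLD')
    group by f.status
    order by f.status
    """
).fetchall()
print(counts)
```

Selecting `e.end_day - f.first_day` in place of the count would give the duration per id and status, if that is what is actually needed.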

get continuously increasing sale records from table sql server

I have 2 tables
Product
ProdId, ProdName
1 A
2 B
and
Sale
SaleId, ProdId, Sale, Year
1, 1, 100, 2012
2, 1, 130, 2013
3, 2, 100, 2012,
4, 1, 150, 2014,
5, 1, 180, 2015
6, 2, 120, 2013,
7, 2, 90, 2014,
8, 2, 130, 2015
I want the name of the product whose sales are continuously increasing.
For example, product "A" has sales of 100 units in 2012, 130 units in 2013, 150 units in 2014 and 180 units in 2015, so product A has continuously increasing sales. A non-continuous case is product "B", with sales of 100 units in 2012, 120 units in 2013, 90 units in 2014 and 130 units in 2015, so for product "B" the increase is not continuous.
I want records like product "A", which has continuously increasing sales.
Help appreciated.
You can do this using row_number() twice:
select ProdId
from (select s.*,
row_number() over (partition by s.ProdId order by Sale) as seqnum_s,
row_number() over (partition by s.ProdId order by Year) as seqnum_y
from Sale s
) s
group by ProdId
having sum( case when seqnum_s = seqnum_y then 1 else 0 end) = count(*);
That is, order by the year and the sales. When all row numbers are the same, then the sales are increasing.
Note: There are some cases where tied sales might be considered increasing. This can be handled by the logic -- either by excluding or including such situations. I have not included logic for this, because your question is not clear what to do in that situation.
Use cross apply to get the previous year's sale amount and check with conditional aggregation for the increasing amount condition.
select prodid
from sale s1
cross apply (select sale as prev_sale
from sale s2
where s1.prodid=s2.prodid and s2.year=s1.year-1) s2
group by prodid
having sum(case when sale-prev_sale<0 then 1 else 0 end) = 0
To get all the rows for such ProdIds, use
select * from sale
where prodid in (select prodid
from sale s1
cross apply (select sale as prev_sale
from sale s2
where s1.prodid=s2.prodid and s2.year=s1.year-1) s2
group by prodid
having sum(case when sale-prev_sale<0 then 1 else 0 end) = 0
)
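The previous-year comparison can also be expressed with a plain self-join and NOT EXISTS, which avoids cross apply. The sketch below is an illustration only, run from Python against in-memory SQLite tables that mirror the question's schema. It assumes one row per product per year with no gaps, and it treats a tie (same sale as the previous year) as not increasing; both are assumptions, not something the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Product (ProdId int, ProdName text)")
conn.execute("create table Sale (SaleId int, ProdId int, Sale int, Year int)")
conn.executemany("insert into Product values (?, ?)", [(1, "A"), (2, "B")])
conn.executemany(
    "insert into Sale values (?, ?, ?, ?)",
    [
        (1, 1, 100, 2012), (2, 1, 130, 2013), (3, 2, 100, 2012),
        (4, 1, 150, 2014), (5, 1, 180, 2015), (6, 2, 120, 2013),
        (7, 2, 90, 2014), (8, 2, 130, 2015),
    ],
)

names = [
    r[0]
    for r in conn.execute(
        """
        select p.ProdName
        from Product p
        where exists (select 1 from Sale s where s.ProdId = p.ProdId)
          and not exists (
              -- any year whose sale is not above the previous year's sale
              select 1
              from Sale cur
              join Sale prev
                on prev.ProdId = cur.ProdId
               and prev.Year = cur.Year - 1
              where cur.ProdId = p.ProdId
                and cur.Sale <= prev.Sale
          )
        order by p.ProdName
        """
    )
]
print(names)
```

Changing `<=` to `<` would accept flat years as "increasing", if that is the preferred reading.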
Here's a way with a CTE
declare @sale table (SaleID int, ProdId int, Sale int, Year int)
insert into @sale
values
(1,1,100,2012),
(2,1,130,2013),
(3,2,100,2012),
(4,1,150,2014),
(5,1,180,2015),
(6,2,120,2013),
(7,2,90,2014),
(8,2,130,2015)
declare @product table (ProdID int, ProdName char(1))
insert into @product
values
(1,'A'),
(2,'B')
;with cte as(
select
row_number() over (partition by ProdId order by Year) as RN
,*
from @sale)
select
p.ProdName
,cte.*
from cte
inner join
@product p on
p.ProdID=cte.ProdId
where cte.ProdId IN
(select distinct
c1.ProdId
from cte c1
left join
cte c2 on c2.RN = c1.rn+1 and c2.ProdId = c1.ProdId
group by c1.ProdId
having min(case when c1.Sale < isnull(c2.Sale,999999) then 1 else 0 end) = 1)
RETURNS
+----------+----+--------+--------+------+------+
| ProdName | RN | SaleID | ProdId | Sale | Year |
+----------+----+--------+--------+------+------+
| A | 1 | 1 | 1 | 100 | 2012 |
| A | 2 | 2 | 1 | 130 | 2013 |
| A | 3 | 4 | 1 | 150 | 2014 |
| A | 4 | 5 | 1 | 180 | 2015 |
+----------+----+--------+--------+------+------+

calculate sum based on value of other row in another column

I am trying to figure out how to calculate the number of days on which the customer did not eat any candy.
Assume that the customer eats 1 candy per day.
If the customer purchases more candy, it gets added to the previous stock.
Eg.
Day Candy Puchased
0 30
40 30
65 30
110 30
125 40
170 30
Answer here is 20.
Meaning: on day 0 the customer bought 30 candies and his next purchase was on day 40, so he did not get to eat any candy from day 30 to day 39. In the same way, he did not eat any candy from day 100 to day 109.
Can anyone help me write the query? I think I have got the wrong logic in my query.
select sum(curr.candy_purchased-(nxt.day-curr.day)) as diff
from candies as curr
left join candies as nxt
on nxt.day=(select min(day) from candies where day > curr.day)
You need a recursive CTE.
First I need to create a row id, so I use row_number.
Then I need the base case for the recursion:
Day: how many days have passed since the previous purchase (0 from the db)
PrevD: the purchase day, carried forward so the next step can calculate Day (starts at 0)
Candy Puchased: how many candies were bought (30 from the db)
Remaining: how many candies are left after eating (starts at the first purchase, 30)
NotEat: how many days the customer couldn't eat candy (starts at 0)
Level: recursion level (starts at 1)
Recursive case:
Day, PrevD and Candy Puchased are easy.
Remaining: the new purchase plus the leftover stock; if more days passed than candies were left, the leftover is 0.
NotEat: keep adding the difference whenever the customer ran out of candy.
SQL Fiddle Demo
WITH Candy as (
SELECT
ROW_NUMBER() over (order by [Day]) as rn,
*
FROM Table1
), EatCandy ([Day], [PrevD], [Candy Puchased], [Remaining], [NotEat], [Level]) as (
SELECT [Day], 0 as [PrevD], [Candy Puchased], [Candy Puchased] as [Remaining], 0 as [NotEat], 1 as [Level]
FROM Candy
WHERE rn = 1
UNION ALL
SELECT c.[Day] - ec.[PrevD],
c.[Day],
c.[Candy Puchased],
c.[Candy Puchased] +
IIF((c.[Day] - ec.[PrevD]) > ec.[Remaining], 0, ec.[Remaining] - (c.[Day] - ec.[PrevD])),
ec.[NotEat] +
IIF((c.[Day] - ec.[PrevD]) > ec.[Remaining], (c.[Day] - ec.[PrevD]) - ec.[Remaining], 0),
ec.[Level] + 1
FROM Candy c
JOIN EatCandy ec
ON c.rn = ec.[level] + 1
)
select * from EatCandy
OUTPUT
| Day | PrevD | Candy Puchased | Remaining | NotEat | Level |
|-----|-------|----------------|-----------|--------|-------|
| 0 | 0 | 30 | 30 | 0 | 1 |
| 40 | 40 | 30 | 30 | 10 | 2 |
| 25 | 65 | 30 | 35 | 10 | 3 |
| 45 | 110 | 30 | 30 | 20 | 4 |
| 15 | 125 | 40 | 55 | 20 | 5 |
| 45 | 170 | 30 | 40 | 20 | 6 |
To get the single answer, select MAX(NotEat) from EatCandy in the final query.
Nice question.
Check my answer and also try it with different sample data.
If it does not work with different sample data, please let me know.
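The bookkeeping that the recursion performs can be checked outside the database with a simple loop. The following is a small Python sketch of the same running-stock logic, assuming one candy eaten per day and purchases given as (day, amount) pairs; the function name `candyless_days` is made up for this illustration.

```python
def candyless_days(purchases):
    """Count days without candy between purchases.

    purchases: iterable of (day, candies_bought) pairs.
    Assumes the customer eats exactly one candy per day while stock lasts.
    """
    stock = 0        # candies on hand right after the previous purchase
    prev_day = None  # day of the previous purchase
    missed = 0       # running total of days with nothing to eat

    for day, bought in sorted(purchases):
        if prev_day is not None:
            elapsed = day - prev_day
            # days beyond what the stock could cover are candy-less days
            missed += max(0, elapsed - stock)
            # whatever was not eaten carries over to the new purchase
            stock = max(0, stock - elapsed)
        stock += bought
        prev_day = day

    return missed


sample = [(0, 30), (40, 30), (65, 30), (110, 30), (125, 40), (170, 30)]
print(candyless_days(sample))
```

The `stock` variable plays the role of the Remaining column and `missed` the role of NotEat, so the loop is handy for cross-checking the query against other sample data.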
declare @t table([Day] int, CandyPuchased int)
insert into @t
values (0, 30),(40,30),(65, 30)
,(110, 30),(125,40),(170,30)
select * from @t
;With CTE as
(
select *,ROW_NUMBER()over(order by [day])rn from @t
)
,CTE1 as
(
select [day],[CandyPuchased],rn from CTE c where rn=1
union all
select a.[Day],case when a.Day-b.Day<b.CandyPuchased
then a.CandyPuchased+(b.CandyPuchased-(a.Day-b.Day))
else a.CandyPuchased end CandyPuchased
,a.rn from cte A
inner join CTE B on a.rn=b.rn+1
)
--select * from CTE1
select sum(case when a.Day-b.Day>b.CandyPuchased
then (a.Day-b.Day)-b.CandyPuchased else 0 end)[CandylessDays]
from CTE1 A
inner join CTE1 b on a.rn=b.rn+1
If you just need the result at the end of the series, you don't really need that join.
select max(days) --The highest day in the table (convert these to int first)
- (sum(candies) --Total candies purchased
- (select top 1 candies from MyTable order by days desc)) --Minus the candies purchased on the last day
from MyTable
If you need this as a sort of running total, try over:
select *, sum(candies) over (order by days) as TotalCandies
from MyTable
order by days desc

Filter SQL Query?

The sample data is:
l16seqno | l16lcode | carrno | ecarrno | l16qty | reasoncode
32001 | 12 | 207620 | 370036873034035916 | 32 | 0
32269 | 12 | 207620 | 370036873034035916 | -32 | 800
39075 | 12 | 207620 | 370036873034035916 | 32 | 0
39074 | 12 | 207622 | 370036873034035923 | 32 | 0
32268 | 12 | 207622 | 370036873034035923 | -32 | 800
31999 | 12 | 207622 | 370036873034035923 | 32 | 0
32271 | 12 | 207624 | 370036873034035930 | -32 | 800
32005 | 12 | 207624 | 370036873034035930 | 32 | 0
39077 | 12 | 207624 | 370036873034035930 | 32 | 0
I log all events in table Z02T1. Whenever l16lcode=12, I am blocking or unblocking a pallet. When I block a pallet the l16qty field is negative, and when I unblock it is positive.
Reason codes can be found in Z02T2 table (can be connected to Z02T1 by l16seqno – a unique sequence number of each log record).
Z14T1 table contains info about pallets – pallet numbers.
My aim is to find two lines for each pallet i.e.
when blocked with code 800 ... and ... when unblocked with code 0
For this I have to find the nearest next record of l16lcode=12 for the same pallet with reason code 0 (after there was a record for this pallet with reason code 800).
The initial query I have made is:
select Z02T1.datreg, Z02T1.l16seqno, Z02T1.l16lcode, Z02T1.divcode, Z02T1.carrno,
Z14T1.ecarrno, Z02T1.l16qty, Z02T2.reascode from Z02T2
inner join Z02T1 on Z02T1.l16seqno=Z02T2.l16seqno
left outer join Z14T1 ON Z14T1.carrno=Z02T1.carrno
where Z02T1.l16lcode=12
and (Z02T2.reascode=800 or Z02T2.reascode=0 )
order by Z14T1.ecarrno
How can I change this query to get one record with reasoncode 800 and then the very next record with reasoncode 0 for the same ecarrno field?
Here is some sample code that you could use to modify your existing query.
Be aware that this example filters on the first occurrence of reasoncode=800 and then subfilters on the first occurrence of reasoncode=0 that has an l16seqno greater than that of the reasoncode=800 record.
CREATE TABLE reasons (
l16seqno int NOT NULL,
carrno int NOT NULL,
reasoncode int NOT NULL
);
INSERT INTO reasons
(l16seqno, carrno, reasoncode)
VALUES
(1, 1, 0),
(2, 1, 800),
(3, 1, 0),
(10, 300, 0),
(11, 300, 800),
(12, 300, 0),
(13, 300, 800),
(14, 300, 0),
(1003, 1212, 0),
(1004, 1212, 800),
(1005, 1212, 0),
(1006, 1212, 0);
WITH cte1 (l16seqno, carrno, reasoncode, rownumber)
AS
(
SELECT l16seqno, carrno, reasoncode, ROW_NUMBER() OVER (PARTITION BY carrno, reasoncode ORDER BY l16seqno)
FROM reasons
WHERE reasoncode = 800
),
cte2 (l16seqno, carrno, reasoncode, rownumber)
AS
(
SELECT r.l16seqno, r.carrno, r.reasoncode, ROW_NUMBER() OVER (PARTITION BY r.carrno, r.reasoncode ORDER BY r.l16seqno)
FROM reasons AS r
INNER JOIN cte1 AS c ON r.carrno = c.carrno
WHERE r.reasoncode = 0 AND r.l16seqno > c.l16seqno
)
SELECT r.l16seqno, r.carrno, r.reasoncode
FROM reasons AS r
LEFT OUTER JOIN cte1 AS c1 ON c1.l16seqno = r.l16seqno
LEFT OUTER JOIN cte2 AS c2 ON c2.l16seqno = r.l16seqno
WHERE c1.rownumber = 1
OR c2.rownumber = 1
ORDER BY r.carrno, r.l16seqno;
Here is the SQL Fiddle demo of the sample code listed above.
I hope this helps.
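If every block event should be paired with its own unblock, rather than only the first one per pallet, a correlated MIN subquery is one way to find "the nearest next record". The sketch below is a variation on the sample above, not a replacement for it; it reuses the made-up `reasons` table and runs against in-memory SQLite from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table reasons ("
    "l16seqno int not null, carrno int not null, reasoncode int not null)"
)
conn.executemany(
    "insert into reasons values (?, ?, ?)",
    [
        (1, 1, 0), (2, 1, 800), (3, 1, 0),
        (10, 300, 0), (11, 300, 800), (12, 300, 0), (13, 300, 800), (14, 300, 0),
        (1003, 1212, 0), (1004, 1212, 800), (1005, 1212, 0), (1006, 1212, 0),
    ],
)

pairs = conn.execute(
    """
    select b.carrno,
           b.l16seqno as blocked_seq,
           -- nearest later record with reasoncode 0 for the same pallet;
           -- NULL if the pallet has not been unblocked yet
           (select min(u.l16seqno)
            from reasons u
            where u.carrno = b.carrno
              and u.reasoncode = 0
              and u.l16seqno > b.l16seqno) as unblocked_seq
    from reasons b
    where b.reasoncode = 800
    order by b.carrno, b.l16seqno
    """
).fetchall()
for pair in pairs:
    print(pair)
```

Each result row carries both sequence numbers, so the full log records can be fetched by joining back on l16seqno.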
Here you go:
;with cte as
(
Select l16seqno
,l16lcode
,carrno
,ecarrno
,l16qty
,reasoncode
,ROW_NUMBER() Over(Partition By ecarrno, reasoncode Order By l16seqno) rn
From MyTable
)
Select l16seqno
,l16lcode
,carrno
,ecarrno
,l16qty
,reasoncode
From cte
Where rn = 1
Order By ecarrno asc, reasoncode desc