I need to show the monthly inventory data - sql

I have a table like the following that holds inventory details.
InventoryTable.
InventoryTableID DateCreated quantity ItemName
-------------------------------------------------
1 2010-02-04 12 abc
2 2010-03-10 4 abc
3 2010-03-13 5 xyz
4 2010-03-13 19 def
5 2010-03-17 15 abc
6 2010-03-29 15 abc
7 2010-04-01 22 xyz
8 2010-04-13 5 abc
9 2010-04-15 6 def
From the above table, if my admin wants to know the inventory details for April 2010 (i.e. Apr 1st 2010 - Apr 30th 2010),
I need the output shown below.
Inventory as of Apr 1st 2010
ItemName Datecreated qty
----------------------------
abc 2010-03-29 15
xyz 2010-04-01 22
def 2010-03-13 19
Inventory as of Apr 30th 2010
ItemName Datecreated qty
---------------------------
abc 2010-04-13 5
xyz 2010-04-01 22
def 2010-04-15 6

For your first result set, run with @YourDataParam = '2010-04-01'. For the second set, use '2010-04-30'.
;with cteMaxDate as (
    -- latest entry date per item on or before the requested date
    select it.ItemName, max(it.DateCreated) as MaxDate
    from InventoryTable it
    where it.DateCreated <= @YourDataParam
    group by it.ItemName
)
-- join back to pick up the quantity recorded on that latest date
select it.ItemName, it.DateCreated, it.quantity as qty
from cteMaxDate c
inner join InventoryTable it
    on c.ItemName = it.ItemName
    and c.MaxDate = it.DateCreated
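
To try the query above without a SQL Server instance, here is a minimal sketch that runs the same "latest row per item as of a date" logic against an in-memory SQLite database from Python. The helper name inventory_as_of is hypothetical, dates are stored as ISO text (an assumption so that string comparison orders them correctly), and a ? placeholder stands in for the @YourDataParam variable.

```python
import sqlite3

# Sample rows from the question: (InventoryTableID, DateCreated, quantity, ItemName)
rows = [
    (1, "2010-02-04", 12, "abc"), (2, "2010-03-10", 4, "abc"),
    (3, "2010-03-13", 5, "xyz"), (4, "2010-03-13", 19, "def"),
    (5, "2010-03-17", 15, "abc"), (6, "2010-03-29", 15, "abc"),
    (7, "2010-04-01", 22, "xyz"), (8, "2010-04-13", 5, "abc"),
    (9, "2010-04-15", 6, "def"),
]

def inventory_as_of(conn, as_of):
    """Return (ItemName, DateCreated, quantity) for each item's latest row on or before as_of."""
    sql = """
        WITH cteMaxDate AS (
            SELECT ItemName, MAX(DateCreated) AS MaxDate
            FROM InventoryTable
            WHERE DateCreated <= ?
            GROUP BY ItemName
        )
        SELECT it.ItemName, it.DateCreated, it.quantity
        FROM cteMaxDate c
        JOIN InventoryTable it
          ON it.ItemName = c.ItemName AND it.DateCreated = c.MaxDate
        ORDER BY it.ItemName
    """
    return conn.execute(sql, (as_of,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE InventoryTable ("
    "InventoryTableID INTEGER PRIMARY KEY, DateCreated TEXT, quantity INTEGER, ItemName TEXT)"
)
conn.executemany("INSERT INTO InventoryTable VALUES (?, ?, ?, ?)", rows)

start = inventory_as_of(conn, "2010-04-01")
end = inventory_as_of(conn, "2010-04-30")
print(start)
print(end)
```

Note that if an item had two rows on the same latest date, the join would return both of them.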

How to get top values when there is a tie

I am having difficulty figuring out this dang problem. From the data and queries I have given below I am trying to see the email address that has rented the most movies during the month of September.
There are only 4 relevant tables in my database and they have been anonymized and shortened:
Table "cust":
cust_id f_name l_name  email
-----------------------------------------------
1       Jack   Daniels jack.daniels@google.com
2       Jose   Quervo  jose.quervo@yahoo.com
5       Jim    Beam    jim.beam@protonmail.com
Table "rent":
inv_id cust_id rent_date
------------------------------
10     1       9/1/2022 10:29
11     1       9/2/2022 18:16
12     1       9/2/2022 18:17
13     1       9/17/2022 17:34
14     1       9/19/2022 6:32
15     1       9/19/2022 6:33
16     3       9/1/2022 18:45
17     3       9/1/2022 18:46
18     3       9/2/2022 18:45
19     3       9/2/2022 18:46
20     3       9/17/2022 18:32
21     3       9/19/2022 22:12
10     2       9/19/2022 11:43
11     2       9/19/2022 11:42
Table "inv":
mov_id inv_id
-------------
22     10
23     11
24     12
25     13
26     14
27     15
28     16
29     17
30     18
31     19
31     20
32     21
Table "mov":
mov_id titl         rate
------------------------
22     Anaconda     3.99
23     Exorcist     1.99
24     Philadelphia 3.99
25     Quest        1.99
26     Sweden       1.99
27     Speed        1.99
28     Nemo         1.99
29     Zoolander    5.99
30     Truman       5.99
31     Patient      1.99
32     Racer        3.99
and here is my current query progress:
SELECT cust.email,
COUNT(DISTINCT inv.mov_id) AS "Rented_Count"
FROM cust
JOIN rent ON rent.cust_id = cust.cust_id
JOIN inv ON inv.inv_id = rent.inv_id
JOIN mov ON mov.mov_id = inv.mov_id
WHERE rent.rent_date BETWEEN '2022-09-01' AND '2022-09-31'
GROUP BY cust.email
ORDER BY "Rented_Count" DESC;
and here is what it outputs:
email                   Rented_Count
------------------------------------
jack.daniels@google.com 6
jim.beam@protonmail.com 6
jose.quervo@yahoo.com   2
and what I want it to be outputting:
email
jack.daniels@google.com
jim.beam@protonmail.com
In the results I am actually getting there is a tie for first place (Jim and Jack), and that is fine, but I would like the query to list both tying email addresses, not just Jack's. So I don't think limiting the number of rows or using MAX will work.
I think it must have something to do with dense_rank, but I don't know how to use that in this scenario together with the COUNT and GROUP BY.
Your creativity and help would be appreciated.
You're missing the FETCH FIRST ROWS WITH TIES clause. It will work together with the ORDER BY clause to get you the highest values (FIRST ROWS), including ties (WITH TIES).
SELECT cust.email
FROM cust
INNER JOIN rent
ON rent.cust_id = cust.cust_id
INNER JOIN inv
ON inv.inv_id = rent.inv_id
INNER JOIN mov
ON mov.mov_id = inv.mov_id
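-- September has only 30 days; a half-open range also keeps every timestamp on Sep 30
WHERE rent.rent_date >= DATE '2022-09-01'
AND rent.rent_date < DATE '2022-10-01'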
GROUP BY cust.email
ORDER BY COUNT(DISTINCT inv.mov_id) DESC
FETCH FIRST 1 ROWS WITH TIES
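
If your database does not support FETCH FIRST ... WITH TIES, the dense_rank idea from the question gives the same result. Below is a minimal sketch run against in-memory SQLite from Python (window functions need SQLite 3.25 or later). The single rentals table and its rows are a hypothetical simplification of the four joined tables, not the question's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per rental, already joined to the movie id.
conn.execute("CREATE TABLE rentals (email TEXT, mov_id INTEGER)")
conn.executemany(
    "INSERT INTO rentals VALUES (?, ?)",
    [
        ("jack.daniels@google.com", 22), ("jack.daniels@google.com", 23),
        ("jack.daniels@google.com", 24),
        ("jim.beam@protonmail.com", 28), ("jim.beam@protonmail.com", 29),
        ("jim.beam@protonmail.com", 30),
        ("jim.beam@protonmail.com", 30),  # repeat rental, ignored by DISTINCT
        ("jose.quervo@yahoo.com", 22), ("jose.quervo@yahoo.com", 23),
    ],
)

sql = """
    WITH counts AS (
        -- one row per customer with the number of distinct movies rented
        SELECT email, COUNT(DISTINCT mov_id) AS rented_count
        FROM rentals
        GROUP BY email
    ),
    ranked AS (
        -- tied counts share the same rank, so every top customer gets rank 1
        SELECT email, DENSE_RANK() OVER (ORDER BY rented_count DESC) AS rnk
        FROM counts
    )
    SELECT email FROM ranked WHERE rnk = 1 ORDER BY email
"""
top_emails = [row[0] for row in conn.execute(sql)]
print(top_emails)
```

Aggregating first and ranking afterwards keeps the GROUP BY and the window function in separate steps, which avoids the confusion about how to combine them.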

R - get a vector that tells me if a value of another vector is the first appearance or not

I have a data frame of sales with three columns: the code of the customer, the month the customer bought that item, and the year.
A customer can buy something in September and then make another purchase in December, and so appear twice. But I'm interested in knowing the genuinely new customers by month and year.
So I have thought of iterating with some checks, using the %in% function to build a boolean vector that tells me whether a customer is new or not, and then counting by month and year with SQL using this new vector.
But I'm wondering if there's a specific function or a better way to do that.
This is an example of the data I would like to have:
date cust month new_customer
1 14975 25 1 TRUE
2 14976 30 1 TRUE
3 14977 22 1 TRUE
4 14978 4 1 TRUE
5 14979 25 1 FALSE
6 14980 11 1 TRUE
7 14981 17 1 TRUE
8 14982 17 1 FALSE
9 14983 18 1 TRUE
10 14984 7 1 TRUE
11 14985 24 1 TRUE
12 14986 22 1 FALSE
To put it more simply: the data frame is sorted by date, and I'm interested in a vector (new_customer) that tells me whether the customer purchased something for the first time or not. For example, customer 25 bought something on the first day and then bought something again four days later, so the second purchase is not from a new customer. The same can be seen with customers 17 and 22.
I created dummy data myself with id, a numeric month, and year:
dat <-data.frame(
id = c(1,2,3,4,5,6,7,8,1,3,4,5,1,2,2),
month = c(1,6,7,8,2,3,4,8,11,1,10,9,1,12,2),
year = c(2019,2019,2019,2019,2019,2020,2020,2020,2020,2020,2021,2021,2021,2021,2021)
)
id month year
1 1 1 2019
2 2 6 2019
3 3 7 2019
4 4 8 2019
5 5 2 2019
6 6 3 2020
7 7 4 2020
8 8 8 2020
9 1 11 2020
10 3 1 2020
11 4 10 2021
12 5 9 2021
13 1 1 2021
14 2 12 2021
15 2 2 2021
Then, group by id and arrange by year and month (order is meaningful). Then use filter and row_number().
library(dplyr)

dat %>%
  group_by(id) %>%
  arrange(year, month) %>%     # sort chronologically
  filter(row_number() == 1)    # keep each id's first purchase
id month year
<dbl> <dbl> <dbl>
1 1 1 2019
2 5 2 2019
3 2 6 2019
4 3 7 2019
5 4 8 2019
6 6 3 2020
7 7 4 2020
8 8 8 2020
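
For the boolean new_customer vector the question asks for, here is a minimal sketch of the first-appearance check in Python, using the cust column from the question's example. The function name first_appearance is hypothetical. In R itself the same flag can be obtained with the base function duplicated(), as !duplicated(cust), provided the data frame is already sorted by date.

```python
def first_appearance(values):
    """Return a list of booleans: True where a value occurs for the first time."""
    seen = set()
    flags = []
    for value in values:
        flags.append(value not in seen)  # new only if not seen earlier
        seen.add(value)
    return flags

# cust column from the question, already sorted by date
cust = [25, 30, 22, 4, 25, 11, 17, 17, 18, 7, 24, 22]
new_customer = first_appearance(cust)
print(new_customer)
```

Summing the flags per month and year then gives the count of new customers for each period.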
Sample Code
You can adapt your code according to this logic.
Create the table (Customer_Id holds values such as C_01, so it has to be a character column):
CREATE TABLE PURCHASE(Posting_Date DATE,Customer_Id VARCHAR(15),Customer_Name VARCHAR(15));
Insert Data Into Table
Posting_Date Customer_Id Customer_Name
2018-01-01 C_01 Jack
2018-02-01 C_01 Jack
2018-03-01 C_01 Jack
2018-04-01 C_02 James
2019-04-01 C_01 Jack
2019-05-01 C_01 Jack
2019-05-01 C_03 Gill
2020-01-01 C_02 James
2020-01-01 C_04 Jones
Code
WITH Date_CTE (PostingDate,CustomerID,FirstYear)
AS
(
SELECT MIN(Posting_Date) as [Date],
Customer_Id,
YEAR(MIN(Posting_Date)) as [F_Purchase_Year]
FROM PURCHASE
GROUP BY Customer_Id
)
SELECT T.[ActualYear],(CASE WHEN T.[Customer Status] = 'new' THEN COUNT(T.[Customer Status]) END) AS [New Customer]
FROM (
SELECT DISTINCT YEAR(T2.Posting_Date) AS [ActualYear],
T2.Customer_Id,
(CASE WHEN T1.FirstYear = YEAR(T2.Posting_Date) THEN 'new' ELSE 'old' END) AS [Customer Status]
FROM Date_CTE AS T1
left outer join PURCHASE AS T2 ON T1.CustomerID = T2.Customer_Id
) AS T
GROUP BY T.[ActualYear],T.[Customer Status]
Final Result
ActualYear New Customer
2018 2
2019 1
2020 1
2019 NULL
2020 NULL
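
The NULL rows above come from grouping on the customer status as well as the year. As a possible simplification (a sketch, not the answer's original method), you can count each customer once in the year of their first purchase. The example runs against in-memory SQLite from Python, with dates stored as ISO text and strftime standing in for YEAR(); both are assumptions of this sketch.

```python
import sqlite3

# Sample rows from the answer: (Posting_Date, Customer_Id, Customer_Name)
purchases = [
    ("2018-01-01", "C_01", "Jack"), ("2018-02-01", "C_01", "Jack"),
    ("2018-03-01", "C_01", "Jack"), ("2018-04-01", "C_02", "James"),
    ("2019-04-01", "C_01", "Jack"), ("2019-05-01", "C_01", "Jack"),
    ("2019-05-01", "C_03", "Gill"), ("2020-01-01", "C_02", "James"),
    ("2020-01-01", "C_04", "Jones"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PURCHASE (Posting_Date TEXT, Customer_Id TEXT, Customer_Name TEXT)"
)
conn.executemany("INSERT INTO PURCHASE VALUES (?, ?, ?)", purchases)

sql = """
    WITH first_purchase AS (
        -- earliest purchase date per customer
        SELECT Customer_Id, MIN(Posting_Date) AS first_date
        FROM PURCHASE
        GROUP BY Customer_Id
    )
    SELECT CAST(strftime('%Y', first_date) AS INTEGER) AS first_year,
           COUNT(*) AS new_customers
    FROM first_purchase
    GROUP BY first_year
    ORDER BY first_year
"""
new_per_year = conn.execute(sql).fetchall()
print(new_per_year)
```

A year with no new customers simply does not appear in this result, so there are no NULL rows to filter out.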

Reverse track forced records relationships based on user-defined tagging

I have this table where the tagging [Tag_To] is updated by an algorithm based on Year and Period of coverage. My current task (in question) is to update the Status given the Year.
ID Year Method Period_From Period_To SeqNo Tag_To Status
-----------------------------------------------------------------------------------
10 2019 A 2019-01-01 2019-12-31 1
11 2019 B 2019-01-01 2019-06-30 2 1
12 2019 B 2019-07-01 2019-12-31 3 1
13 2019 C 2019-01-01 2019-06-30 4 2
14 2020 A 2020-01-01 2020-12-31 1
15 2020 B 2020-01-01 2020-06-30 2 1
16 2020 B 2020-07-01 2020-12-31 3 1
17 2020 C 2020-01-01 2020-12-31 4 2,3
18 2021 A 2021-01-01 2021-12-31 1
19 2021 B 2021-01-01 2021-12-31 2 1
20 2021 C 2021-07-01 2021-12-31 3 2
The SeqNo is applied per Year and the Tag_To is done based on period of coverage.
11 and 12 are tagged to 10 since B follows A and their period falls within 10 period coverage.
13 is tagged to 11 since C follows B and the period...
15 and 16 to 14
Also note that 17 is tagged to 15 and 16 (2,3) because 17's coverage spans across the 2 periods of 15 and 16 combined
and so on...
The objective is to update the Status by Year such that each path is considered Closed if the path already has Methods A, B and C (there are actually more methods, but to simplify). Status should be Open for paths that haven't completed the methods.
From the example above, there are 5 paths:
10(A)-->11(B)-->13(C) = Closed
10(A)-->12(B)-->??? = Open
14(A)-->15(B)-->17(C) = Closed
14(A)-->16(B)-->17(C) = Closed
18(A)-->19(B)-->20(C) = Closed
Therefore the status update should be:
ID Year Method Period_From Period_To SeqNo Tag_To Status
-----------------------------------------------------------------------------------
10 2019 A 2019-01-01 2019-12-31 1 Open
11 2019 B 2019-01-01 2019-06-30 2 1 Closed
12 2019 B 2019-07-01 2019-12-31 3 1 Open
13 2019 C 2019-01-01 2019-06-30 4 2 Closed
14 2020 A 2020-01-01 2020-12-31 1 Closed
15 2020 B 2020-01-01 2020-06-30 2 1 Closed
16 2020 B 2020-07-01 2020-12-31 3 1 Closed
17 2020 C 2020-01-01 2020-12-31 4 2,3 Closed
18 2021 A 2021-01-01 2021-12-31 1 Closed
19 2021 B 2021-01-01 2021-12-31 2 1 Closed
20 2021 C 2021-07-01 2021-12-31 3 2 Closed
I hope I have explained everything clearly. Would really appreciate if anyone could help.
Just to update viewers: I have managed to solve this on my own. Although the solution is not dynamic at all and quite inefficient, it did the job for me. Here's what I did.
UPDATE Table SET
Status =
CASE WHEN Method = 'B'
AND NOT EXISTS ( SELECT * FROM Table P INNER JOIN
(
SELECT VALUE AS Tag_To
FROM Table AV
CROSS APPLY STRING_SPLIT(AV.Tag_To, ',')
WHERE AV.Method = 'C'
) C ON P.Sequence_No = C.Tag_To
WHERE P.ID = AValue.ID
)
THEN 'Open'
WHEN Method = 'A'
AND NOT EXISTS ( SELECT * FROM Table P INNER JOIN
(
SELECT VALUE AS Tag_To
FROM Table AV
CROSS APPLY STRING_SPLIT(AV.Tag_To, ',')
WHERE AV.Method = 'B'
) C ON P.Sequence_No = C.Tag_To
WHERE P.ID = AValue.ID
)
THEN 'Open'
ELSE 'Closed'
END
FROM Table AValue
WHERE Year = @Year
;WITH CTE AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY A.Method ORDER BY A.Sequence_No ASC) SN,
A.ID,
A.Method,
A.Sequence_No,
A.Tag_To,
A.Period_From,
A.Period_To,
A.Status
FROM Table A
LEFT JOIN
(
SELECT VALUE AS Tag_To
FROM Table AV
CROSS APPLY STRING_SPLIT(AV.Tag_To, ',')
WHERE Year = @Year
) B ON A.Sequence_No = B.Tag_To
WHERE Year = @Year
),
CTE2 AS
(
SELECT DISTINCT SN FROM CTE
WHERE Status = 'Open'
)
UPDATE Table SET
Status = 'Open'
FROM Table
INNER JOIN CTE ON Table.ID = CTE.ID
INNER JOIN CTE2 ON CTE.SN = CTE2.SN
Yeah, it's ugly but, hey, it did the job! :)
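
For comparison, the same rule can be sketched outside SQL as a small tree walk. This is only an illustration under an assumption inferred from the expected output in the question: a record is Closed if its method is the last one ('C'), or if at least one record is tagged to it and every record tagged to it is Closed. The names compute_status and LAST_METHOD are hypothetical.

```python
from collections import defaultdict

# (ID, Year, Method, SeqNo, Tag_To) taken from the question's table
records = [
    (10, 2019, "A", 1, ""), (11, 2019, "B", 2, "1"),
    (12, 2019, "B", 3, "1"), (13, 2019, "C", 4, "2"),
    (14, 2020, "A", 1, ""), (15, 2020, "B", 2, "1"),
    (16, 2020, "B", 3, "1"), (17, 2020, "C", 4, "2,3"),
    (18, 2021, "A", 1, ""), (19, 2021, "B", 2, "1"),
    (20, 2021, "C", 3, "2"),
]
LAST_METHOD = "C"  # assumption: a path is complete once it reaches this method

def compute_status(records, year):
    """Map each record ID of the given year to 'Open' or 'Closed'."""
    rows = [r for r in records if r[1] == year]
    method_by_seq = {seq: method for _, _, method, seq, _ in rows}
    # children[parent SeqNo] = SeqNos of the records tagged to that parent
    children = defaultdict(list)
    for _, _, _, seq, tag_to in rows:
        for parent in filter(None, tag_to.split(",")):
            children[int(parent)].append(seq)

    def closed(seq):
        if method_by_seq[seq] == LAST_METHOD:
            return True
        kids = children.get(seq, [])
        # open if nothing follows this record or any follower is still open
        return bool(kids) and all(closed(k) for k in kids)

    return {rec_id: ("Closed" if closed(seq) else "Open")
            for rec_id, _, _, seq, _ in rows}

status_2019 = compute_status(records, 2019)
status_2020 = compute_status(records, 2020)
print(status_2019)
print(status_2020)
```

Because Tag_To always points at SeqNo values within the same year, the walk is done per year, just like the @Year filter in the SQL above.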

How to copy the previous value into another column

I have a Postgres database that holds a large amount of data, along with the schema and all the other database attributes. I want to fill in an open price that is taken from the previous close price of the same symbol.
My data is like:
ID DATE SYMBOL OPEN CLOSE
1 1.01.2020 ABC 2,33
2 1.01.2020 XYZ 10,32
3 1.01.2020 KLM 30,33
4 1.01.2020 DEF 50,78
5 3.01.2020 ABC 3,00
6 3.01.2020 KLM 31,00
7 4.01.2020 ABC 4,00
8 4.01.2020 XYZ 13,00
9 4.01.2020 KLM 25,00
10 4.01.2020 DEF 48,00
11 5.01.2020 XYZ 11,50
12 5.01.2020 DEF 47,53
13 7.01.2020 ABC 4,58
14 7.01.2020 XYZ 12,54
15 7.01.2020 KLM 25,78
16 7.01.2020 DEF 48,33
I created an OPEN column, which should hold the previous close price of the same symbol.
My expected output:
ID DATE SYMBOL OPEN CLOSE
1 01.01.2020 ABC 2,33
2 01.01.2020 XYZ 10,32
3 01.01.2020 KLM 30,33
4 01.01.2020 DEF 50,78
5 03.01.2020 ABC 2,33 3,00
6 03.01.2020 KLM 30,33 31,00
7 04.01.2020 ABC 3,00 4,00
8 04.01.2020 XYZ 10,32 13,00
9 04.01.2020 KLM 31,00 25,00
10 04.01.2020 DEF 50,78 48,00
11 05.01.2020 XYZ 13,00 11,50
12 05.01.2020 DEF 48,00 47,53
13 07.01.2020 ABC 4,00 4,58
14 07.01.2020 XYZ 11,50 12,54
15 07.01.2020 KLM 25,00 25,78
16 07.01.2020 DEF 47,53 48,33
Open value = Previous close price value
ABC 01.01.2020 close price 2,33 = ABC 03.01.2020 open price 2,33
My database is live and fetches new data every day, so the OPEN column must keep being filled with the previous close price of the same symbol.
Not every symbol has a price on every day, and the table has more than 100,000 rows at the moment. I tried a few SQL queries but could not figure it out. I am fairly new to database queries.
So is this possible? If so, how? Thanks in advance.
I think you just want lag():
select t.*,
lag(close) over (partition by symbol order by date) as prev_close
from t;
If you want to update the value in the table, you can use join (after adding the column):
update t
set open = tt.prev_close
from (select t.*,
lag(close) over (partition by symbol order by date) as prev_close
from t
) tt
where tt.id = t.id and
tt.prev_close is distinct from t.open;
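
To see what lag() returns on days where a symbol has no row, here is a minimal sketch using a few rows from the question in an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The table name prices and the column names trade_date, open_price and close_price are hypothetical stand-ins, and the dates are rewritten as ISO text so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices ("
    "id INTEGER PRIMARY KEY, trade_date TEXT, symbol TEXT, close_price REAL)"
)
# A subset of the question's rows; XYZ has no row on 2020-01-03.
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01", "ABC", 2.33),
        (2, "2020-01-01", "XYZ", 10.32),
        (5, "2020-01-03", "ABC", 3.00),
        (7, "2020-01-04", "ABC", 4.00),
        (8, "2020-01-04", "XYZ", 13.00),
    ],
)

sql = """
    SELECT id, symbol,
           -- previous close of the same symbol, whatever day that was
           LAG(close_price) OVER (PARTITION BY symbol ORDER BY trade_date) AS open_price,
           close_price
    FROM prices
    ORDER BY id
"""
result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

The first row of each symbol has no previous close, so its open price comes back as NULL (None in Python), which matches the empty OPEN cells in the expected output.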

return the last row that meets a condition in sql

I have two tables:
Meter
ID SerialNumber
=======================
1 ABC1
2 ABC2
3 ABC3
4 ABC4
5 ABC5
6 ABC6
RegisterLevelInformation
ID MeterID ReadValue Consumption PreviousReadDate ReadType
============================================================================
1 1 250 250 1 jan 2015 EST
2 1 550 300 1 feb 2015 ACT
3 1 1000 450 1 apr 2015 EST
4 2 350 350 1 jan 2015 EST
5 2 850 500 1 feb 2015 ACT
6 2 1000 150 1 apr 2015 ACT
7 3 1500 1500 1 jan 2015 EST
8 3 2500 1000 1 mar 2015 EST
9 3 5000 2500 4 apr 2015 EST
10 4 250 250 1 jan 2015 EST
11 4 550 300 1 feb 2015 ACT
12 4 1000 450 1 apr 2015 EST
13 5 350 350 1 jan 2015 ACT
14 5 850 500 1 feb 2015 ACT
15 5 1000 150 1 apr 2015 ACT
16 6 1500 1500 1 jan 2015 EST
17 6 2500 1000 1 mar 2015 EST
18 6 5000 2500 4 apr 2015 EST
I am trying to group by meter serial and return the last actual read date for each of the meters but I am unsure as to how to accomplish this. Here is the sql I have thus far:
select a.SerialNumber, ReadTypeCode, MAX(PreviousReadDate) from Meter as a
left join RegisterLevelInformation as b on a.MeterID = b.MeterID
where ReadType = 'ACT'
group by a.SerialNumber,b.ReadTypeCode, PreviousReadDate
order by a.SerialNumber
I can't seem to get the MAX function to take effect in returning only the latest actual reading row and it returns all dates and the same meter serial is displayed several times.
If I use the following sql:
select a.SerialNumber, count(*) from Meter as a
left join RegisterLevelInformation as b on a.MeterID = b.MeterID
group by a.SerialNumber
order by a.SerialNumber
then each serial is shown only once. Any help would be greatly appreciated.
Like @PaulGriffin said in his comment, you need to remove the PreviousReadDate column from your GROUP BY clause.
Why are you experiencing this behaviour?
Basically, you are grouping by (SerialNumber, ReadTypeCode, PreviousReadDate), so the query returns one row for each distinct combination of those three values. Because PreviousReadDate is itself part of the grouping key, every group contains a single date, and MAX() over a single value just returns that value, so the output is the same as if you had not used MAX() at all.
What you wanted to achieve
Get MAX value of PreviousReadDate for every pair of (SerialNumber,ReadTypeCode). So this is what your GROUP BY clause should include.
select a.SerialNumber, ReadTypeCode, MAX(PreviousReadDate) from Meter as a
left join RegisterLevelInformation as b on a.MeterID = b.MeterID
where ReadType = 'ACT'
group by a.SerialNumber,b.ReadTypeCode
order by a.SerialNumber
Is the correct SQL query for what you want.
Difference example
ID MeterID ReadValue Consumption PreviousReadDate ReadType
============================================================================
1 1 250 250 1 jan 2015 EST
2 1 550 300 1 feb 2015 ACT
3 1 1000 450 1 apr 2015 EST
Here if you apply the query with grouping by 3 columns you would get result:
SerialNumber | ReadTypeCode | PreviousReadDate
ABC1 | EST | 1 jan 2015 -- which is MAX of 1 value (1 jan 2015)
ABC1 | ACT | 1 feb 2015
ABC1 | EST | 1 apr 2015
But instead when you only group by SerialNumber,ReadTypeCode it would yield result (considering the sample data that I posted):
SerialNumber | ReadTypeCode | PreviousReadDate
ABC1 | EST | 1 apr 2015 -- which is MAX of 2 values (1 jan 2015, 1 apr 2015)
ABC1 | ACT | 1 feb 2015 -- which is MAX of 1 value (because ReadTypeCode is different from the row above)
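
The difference between the two groupings can be reproduced with a minimal sketch against in-memory SQLite from Python. The single reads table is a hypothetical flattening of the Meter and RegisterLevelInformation join for meter ABC1, and the dates are rewritten as ISO text so that MAX() compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reads (SerialNumber TEXT, ReadType TEXT, PreviousReadDate TEXT)"
)
conn.executemany(
    "INSERT INTO reads VALUES (?, ?, ?)",
    [
        ("ABC1", "EST", "2015-01-01"),
        ("ABC1", "ACT", "2015-02-01"),
        ("ABC1", "EST", "2015-04-01"),
    ],
)

# Grouping by the date as well: every group holds one row, so MAX() changes nothing.
by_three = conn.execute(
    "SELECT SerialNumber, ReadType, MAX(PreviousReadDate) FROM reads "
    "GROUP BY SerialNumber, ReadType, PreviousReadDate "
    "ORDER BY PreviousReadDate"
).fetchall()

# Grouping by serial and read type only: MAX() picks the latest date per pair.
by_two = conn.execute(
    "SELECT SerialNumber, ReadType, MAX(PreviousReadDate) FROM reads "
    "GROUP BY SerialNumber, ReadType "
    "ORDER BY ReadType"
).fetchall()

print(by_three)
print(by_two)
```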
Explanation of your second query
In this query - you are right indeed - each serial is shown only once.
select a.SerialNumber, count(*) from Meter as a
left join RegisterLevelInformation as b on a.MeterID = b.MeterID
group by a.SerialNumber
order by a.SerialNumber
But this query would also produce results you don't expect if you grouped by more columns (as you did in your first query; try it yourself).
You need to remove PreviousReadDate from your Group By clause.
This is what your query should look like:
select a.SerialNumber, ReadTypeCode, MAX(PreviousReadDate) from Meter as a
left join RegisterLevelInformation as b on a.MeterID = b.MeterID
where ReadType = 'ACT'
group by a.SerialNumber,b.ReadTypeCode
order by a.SerialNumber
To understand how the group by clause works when you mention multiple columns, follow this link: Using group by on multiple columns
You will understand what was wrong with your query and why it returns all dates and the same meter serial is displayed several times.
Good luck!
Kudos! :)