smart way to create a master list (avoiding cross joins)

smart way to create a master list (avoiding cross joins) - sql

I need to create a Table of date,product and inventory count only for the days inventory 0 , something like this
Date Product store Inv
Jan1 1 1 0
Feb4 1 1 0
From the inventory table that only has a record whenever inventory changes
Like this
Store Product start_date end_date Inv
1 1 Jan 4 Jan10 5
1 1 Jan10 jan 15 4
I know I can create a master table by cross joining all store,product and calendar days in a year and then join only with days where date falls between start and end date of the inventory table. Is there a better way than this ? Can cross join be avoided ? Thanks

Are you looking for lag():
select t.*
from (select t.*,
lag(inventory) over (partition by product, store order by date) as prev_inventory
from t
) t
where prev_inventory is null or prev_inventory <> inventory;

Create a table with dates (which can be handy to have around for a ton of reasons) then left join from the inventory table and use BETWEEN against your date columns.

Related

The nearest row in the other table

One table is a sample of users and their purchases.
Structure:
Email | NAME | TRAN_DATETIME (Varchar)
So we have customer email + FirstName&LastName + Date of transaction
and the second table that comes from second system contains all users, they sensitive data and when they got registered in our system.
Simplified Structure:
Email | InstertDate (varchar)
My task is to count minutes difference between the rows insterted from sale(first table)and the rows with users and their sensitive data.
The issue is that second table contain many rows and I want to find the nearest in time row that was inserted in 2nd table, because sometimes it may be a few minutes difeerence(delay or opposite of delay)and sometimes it can be a few days.
So for x email I have row in 1st table:
E_MAIL NAME TRAN_DATETIME
p****#****.eu xxx xxx 2021-10-04 00:03:09.0000000
But then I have 3 rows and the lastest is the one I want to count difference
Email InstertDate
p****#****.eu 2021-05-20 19:12:07
p****#****.eu 2021-05-20 19:18:48
p****#****.eu 2021-10-03 18:32:30 <--
I wrote that some query, but I have no idea how to match nearest row in the 2nd table
SELECT DISTINCT TOP (100)
,a.[E_MAIL]
,a.[NAME]
,a.[TRAN_DATETIME]
,CASE WHEN b.EMAIL IS NOT NULL THEN 'YES' ELSE 'NO' END AS 'EXISTS'
,(ABS(CONVERT(INT, CONVERT(Datetime,LEFT(a.[TRAN_DATETIME],10),120))) - CONVERT(INT, CONVERT(Datetime,LEFT(b.[INSERTDATE],10),120))) as 'DateAccuracy'
FROM [crm].[SalesSampleTable] a
left join [crm].[SensitiveTable] b on a.[E_MAIL]) = b.[EMAIL]

Totally untested: I'd need sample data and database the area of suspect is the casting of dates and the datemath.... since I dont' know what RDBMS and version this is.. consider the following "pseudo code".
We assign a row number to the absolute difference in seconds between the dates those with rowID of 1 win.
WTIH CTE AS (
SELECT A.*, B.* row_number() over (PARTITION BY A.e_mail
ORDER BY abs(datediff(second, cast(Tran_dateTime as Datetime), cast(InsterDate as DateTime)) desc) RN
FROM [crm].[SalesSampleTable] a
LEFT JOIN [crm].[SensitiveTable] b
on a.[E_MAIL] = b.[EMAIL])
SELECT * FROM CTE WHERE RN = 1

Last record per transaction

I am trying to select the last record per sales order.
My query is simple in SQL Server management.
SELECT *
FROM DOCSTATUS
The problem is that this database has tens of thousands of records, as it tracks all SO steps.
ID SO SL Status Reason Attach Name Name Systemdate
22 951581 3 Processed Customer NULL NULL BW 2016-12-05 13:33:27.857
23 951581 3 Submitted Customer NULL NULL BW 2016-17-05 13:33:27.997
24 947318 1 Hold Customer NULL NULL bw 2016-12-05 13:54:27.173
25 947318 1 Invoices Submit Customer NULL NULL bw 2016-13-05 13:54:27.300
26 947318 1 Ship Customer NULL NULL bw 2016-14-05 13:54:27.440
I would to see the most recent record per the SO
ID SO SL Status Reason Attach Name Name Systemdate
23 951581 4 Submitted Customer NULL NULL BW 2016-17-05 13:33:27.997
26 947318 1 Ship Customer NULL NULL bw 2016-14-05 13:54:27.440

Well I'm not sure how that table has two Name columns, but one easy way to do this is with ROW_NUMBER():
;WITH cte AS
(
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY SO ORDER BY Systemdate DESC)
FROM dbo.DOCSTATUS
)
SELECT ID, SO, SL, Status, Reason, ..., Systemdate
FROM cte WHERE rn = 1;
Also please always reference the schema, even if today everything is under dbo.

I think you can keep it this simple:
SELECT *
FROM DOCSTATUS
WHERE ID IN (SELECT MAX(ID)
FROM DOCSTATUS
GROUP BY SO)
You want only the maximum ID from each SO.

An efficient method with the right index is a correlated subquery:
select t.*
from t
where t.systemdate = (select max(t2.systemdate) from t t2 where t2.so = t.so);
The index is on (so, systemdate).

How to create a Select statement which contains a SUM on a different table

I currently have a select statement that is causing me issues, i have two tables:
Customer table
ID Month4Value Month5Value
1 24 5
Orders table
ID Year Month Value Quantity
1 2018 8 10 2
1 2018 4 2 1
1 2018 6 10 4
1 2018 4 7 3
I currently have the below view:
Create View Values as
Select ID, Year, Month, ROUND(SUM(Value*Quantity),2) as NewQuantity
FROM Orders
GROUP BY ID, Year, Month
The below select statement is what i am trying to run
Select Customer.ID, Customer.Month5Value, NewQuantity
from Customer inner join Values on Customer.ID = Values.ID
where ROUND(Customer.Month5Value, 2) <> ROUND(NewQuantity,2)
AND Values.Year = 2018
AND Values.Month = 5
What i am trying to achieve is to find any mismatches between the Orders table and the Customer table. In the above example, what i am expecting is to highlight that the value in Customer.Month5Value does not match the total of the (Quantity*Value) from the Orders table.
As there are 0 orders for Month 5 in the Orders Table, the Month5Value should be 0. However, it returns no entrys.
Any thoughts about what i have missed?
EDIT -
I have updated my query to this:
Select Customer.ID, Customer.Month5Value, NewQuantity
from Customer left join Values on Customer.ID = Values.ID
where ROUND(Customer.Month5Value, 2) <> ISNULL((Select NewQuantity from Customer left join Values on Customer.ID = Values.ID where Values.Month = 5 and Values.Year = 2018),0)
This has given me a list of IDs which have an incorrect amount in Month5Value on the Customer table, but displays lines for each month entry
ID Month5Value NewQuantity
1 5 24
1 5 40
1 5 20
How can i adjust this so that I get one line per ID with the correct value for NewQuantity (either 0 or NULL in this case)?

I think the INNER JOIN is removing any records which are missing from VALUES. Replacing the INNER JOIN with LEFT JOIN may give the result you are looking for.

Forcing empty rows from query

I have a table containing monthly statistics for clients.
Columns are CustNo, Year, Month, Trips
Some customers do not have any trips in some months and therefore there are combinations of CustNo, Year and Month that have no rows in that table.
I am trying to write a Query that shows 0 for those combinations of CustNo, Year and Month that have no trips, instead of producing an empty row.
To start with I have created a ValidPeriods table that has a Year and a Month column containing those periods that are valid.
I can then Query like this:
SELECT v.ValidYear, v.ValidMonth, tc.CustNo, tc.Trips
FROM ValidPeriods v
LEFT OUTER JOIN TempTrips AS tc ON v.ValidYear = tc.Year
AND v.ValidMonth = tc.Month
WHERE tc.CustNo IN (1001230, 1001286, 1001292)
This will give me rows for all periods, with 1 row with NULL values for those periods where there are no customers in the list that have any trips.
But how do I get one row for each customer in the list for all periods?
Ideally I want this:
2016 1 1001230 0
2016 1 1001286 14
2016 1 1001292 23
2016 2 1001230 7
2016 2 1001286 0
2016 2 1001292 4
etc...

Generate the rows using cross join. Then fill in the values using left join:
SELECT ym.ValidYear, ym.ValidMonth, c.CustNo, COALESCE(tt.Trips, 0)
FROM ValidPeriods ym CROSS JOIN
(VALUES (1001230), (1001286), (1001292)) c(CustNo) LEFT JOIN
TempTrips tt
ON tt.ValidYear = ym.ValidYear AND tt.ValidMOnth = ym.ValidMonth AND
tt.CustNo = c.CustNo;

Adding in missing dates from results in SQL

I have a database that currently looks like this
Date | valid_entry | profile
1/6/2015 1 | 1
3/6/2015 2 | 1
3/6/2015 2 | 2
5/6/2015 4 | 4
I am trying to grab the dates but i need to make a query to display also for dates that does not exist in the list, such as 2/6/2015.
This is a sample of what i need it to be:
Date | valid_entry
1/6/2015 1
2/6/2015 0
3/6/2015 2
3/6/2015 2
4/6/2015 0
5/6/2015 4
My query:
select date, count(valid_entry)
from database
where profile = 1
group by 1;
This query will only display the dates that exist in there. Is there a way in query that I can populate the results with dates that does not exist in there?

You can generate a list of all dates that are between the start and end date from your source table using generate_series(). These dates can then be used in an outer join to sum the values for all dates.
with all_dates (date) as (
select dt::date
from generate_series( (select min(date) from some_table), (select max(date) from some_table), interval '1' day) as x(dt)
)
select ad.date, sum(coalesce(st.valid_entry,0))
from all_dates ad
left join some_table st on ad.date = st.date
group by ad.date, st.profile
order by ad.date;
some_table is your table with the sample data you have provided.
Based on your sample output, you also seem to want group by date and profile, otherwise there can't be two rows with 2015-06-03. You also don't seem to want where profile = 1 because that as well wouldn't generate two rows with 2015-06-03 as shown in your sample output.
SQLFiddle example: http://sqlfiddle.com/#!15/b0b2a/2
Unrelated, but: I hope that the column names are only made up. date is a horrible name for a column. For one because it is also a keyword, but more importantly it does not document what this date is for. A start date? An end date? A due date? A modification date?

You have to use a calendar table for this purpose. In this case you can create an in-line table with the tables required, then LEFT JOIN your table to it:
select "date", count(valid_entry)
from (
SELECT '2015-06-01' AS d UNION ALL '2015-06-02' UNION ALL '2015-06-03' UNION ALL
'2015-06-04' UNION ALL '2015-06-05' UNION ALL '2015-06-06') AS t
left join database AS db on t.d = db."date" and db.profile = 1
group by t.d;
Note: Predicate profile = 1 should be applied in the ON clause of the LEFT JOIN operation. If it is placed in the WHERE clause instead then LEFT JOIN essentially becomes an INNER JOIN.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

smart way to create a master list (avoiding cross joins) - sql

Are you looking for lag(): select t.* from (select t.*, lag(inventory) over (partition by product, store order by date) as prev_inventory from t ) t where prev_inventory is null or prev_inventory <> inventory;

Create a table with dates (which can be handy to have around for a ton of reasons) then left join from the inventory table and use BETWEEN against your date columns.

Related

The nearest row in the other table

Last record per transaction

How to create a Select statement which contains a SUM on a different table

Forcing empty rows from query

Adding in missing dates from results in SQL

Categories

Resources