Creating a View from an Aggregated Self-Join

I've been working for around two weeks now on building some detailed reporting options for the company I work at. I asked a question here sometime last week or the week before, and that got me started on a query, which I eventually tightened up substantially.
I'm starting from an inventory ledger which just keeps track of single transactions. The goal is to build a more thorough ledger that will keep a running stock total, a running sales total, and if an item went out of stock, it will track the days until resupply.
The initial query used a With ... as statement to define the table with its aggregates before doing the self join on the aggregated columns. Unfortunately, I can't do the same thing to create a view, so I need to find a way to create those aggregates differently that will still allow me to self-join on them to keep my totals in order.
Here is how I've retooled my statement so far:
Create View 'QLedger' as
Select tcum.txnid,
tcum.Item,
tcum.TxnDate,
tcum.[Tran Type],
tcum.Quantity,
tcum.cumq
from (select *, SUM( Quantity )
OVER (PARTITION BY InventoryLedger.Item
ORDER BY InventoryLedger.TxnID
ROWS UNBOUNDED PRECEDING ) cumq,
abs(
sum(
case when [Tran Type] = 'Shipping'
or [Tran Type] = 'Customer Return'
then Quantity end)
over (partition by qryrptInventoryLedger.item
order by InventoryLedger.txnid
rows unbounded preceding)) LifeSales
from InventoryLedger) tcum
left outer join InventoryLedger tcumnext
on tcum.Item = tcumnext.Item
and tcum.TxnID < tcumnext.TxnID
and
tcum.cumq = 0 and tcumnext.cumq >0
where tcum.Item = '103-02'
and tcum.cumq = 0
group by tcum.TxnID, tcum.TxnDate, tcum.Item, tcum.[tran type], tcum.Quantity
This is almost right, except the table I'm self joining to (tcumnext) doesn't have a running/cumulative quantity column to compare to tcum. I can't at all figure out how to make one to compare with. Can anyone help me out? I'd really appreciate it. It's exciting and frustrating to be so, so close after working on this for so long.

If you already solved the aggregate-functions problem with a WITH clause in your query, you can use that same WITH clause inside the view as well.
Here's an example of a view that uses a with clause which contains aggregate functions:
http://social.msdn.microsoft.com/Forums/sk/sqlgetstarted/thread/302040c6-6a1b-4f99-8a1d-84bb196cb5e6
First post there.
Hope this helps =)

You can use a with statement in a view:
create view xxx as
with <blah blah blah>
select <your query>
Does this solve your problem?
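To make that concrete, here is a minimal sketch of a view whose body starts with a WITH clause and then self-joins the CTE, so both sides of the join carry the running total (cumq). It is run against an in-memory SQLite database from Python only so it can be tried anywhere; SQLite stands in for SQL Server here, and window functions need SQLite 3.25 or newer. The sample rows and the RestockTxnID column are assumptions made up for the illustration, not part of the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, cut-down version of the ledger (only the columns needed here).
CREATE TABLE InventoryLedger (TxnID INTEGER PRIMARY KEY, Item TEXT, Quantity INTEGER);
INSERT INTO InventoryLedger VALUES
  (1, '103-02',   5),
  (2, '103-02',  -5),
  (3, '103-02',  10),
  (4, '103-02', -10);

-- The view body begins with a CTE; the CTE is then joined to itself,
-- so the "next" side (nxt) also has the cumulative quantity column.
CREATE VIEW QLedger AS
WITH tcum AS (
  SELECT TxnID, Item, Quantity,
         SUM(Quantity) OVER (PARTITION BY Item ORDER BY TxnID
                             ROWS UNBOUNDED PRECEDING) AS cumq
  FROM InventoryLedger
)
SELECT cur.TxnID, cur.Item, cur.Quantity, cur.cumq,
       MIN(nxt.TxnID) AS RestockTxnID   -- first later row with stock again
FROM tcum AS cur
LEFT JOIN tcum AS nxt
  ON nxt.Item = cur.Item
 AND nxt.TxnID > cur.TxnID
 AND nxt.cumq > 0
WHERE cur.cumq = 0
GROUP BY cur.TxnID, cur.Item, cur.Quantity, cur.cumq;
""")

rows = conn.execute(
    "SELECT TxnID, cumq, RestockTxnID FROM QLedger ORDER BY TxnID"
).fetchall()
for row in rows:
    print(row)
```

Each out-of-stock row comes back with the TxnID of the first later transaction that brought stock above zero, or NULL when no resupply has happened yet.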

Related

SQL sum, multiple Group By's and Date criteria

I'm looking to perform a sum calculation on a SQL table to find the quantity of a particular stock item held against a particular salesperson up to and including a specific date.
I'm able to perform the sum function to find the quantities on a Salesperson/item basis whenever I do not factor in the date range criteria, but as soon as I add that aspect, it all goes a bit pear-shaped! Here is my code so far:
SELECT Salesperson, Item No, Sum(Quantity) AS 'Quantity'
FROM dbo
WHERE (Location Code='VAN')
GROUP BY Salesperson, Item No,
HAVING (Registering Date<={ts '2017-05-03 00:00:00'})
The location code = VAN filter is required to ensure it ignores Warehouse quantities. My SQL knowledge is limited to the few instances I run into it at work, and my interaction is largely through Microsoft Query in Excel. When looking at the above code, I figured that the 'Registering date' criteria should be in the 'WHERE' section; however, when I add the criteria using the options available in Microsoft Query, it creates the 'HAVING' line.
If anyone could provide any pointers, it would be much appreciated!
Cheers
Peter

I would imagine a query like this:
SELECT Salesperson, [Item No], Sum(Quantity) AS Quantity
--------------------^ escape the non-standard column name
FROM dbo.??
---------^ table name goes here
WHERE [Location Code] = 'VAN' AND
[Registering Date] <= '2017-05-03'
------^ put the filtering condition in the correct clause
GROUP BY Salesperson, [Item No]
-------------------------------^ remove the comma
Your code, as written, has multiple errors. I am guessing that most are transcription errors rather than in the original query (queries don't run if no table is given in the FROM for instance). The "major" error would then be filtering in the HAVING clause rather than the WHERE clause.
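As a rough illustration of why the date test belongs in WHERE: rows are filtered before they are grouped, so only VAN rows up to the cut-off date reach the SUM. The sketch below runs the corrected query shape against an in-memory SQLite database from Python. The table name ItemLedger, the sample rows, and storing dates as ISO text are all assumptions for the example; the real table name is not given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table; the real name is not given in the question.
CREATE TABLE ItemLedger (
  Salesperson        TEXT,
  [Item No]          TEXT,
  [Location Code]    TEXT,
  [Registering Date] TEXT,   -- ISO-8601 text, so string comparison orders by date
  Quantity           INTEGER
);
INSERT INTO ItemLedger VALUES
  ('PETER', 'A100', 'VAN',       '2017-05-01',  4),
  ('PETER', 'A100', 'VAN',       '2017-05-03', -1),
  ('PETER', 'A100', 'VAN',       '2017-05-10',  7),   -- after the cut-off
  ('PETER', 'A100', 'WAREHOUSE', '2017-05-02', 50),   -- wrong location
  ('ANNA',  'B200', 'VAN',       '2017-04-20',  2);
""")

# Row-level conditions go in WHERE, so they are applied before grouping.
totals = conn.execute("""
SELECT Salesperson, [Item No], SUM(Quantity) AS Quantity
FROM ItemLedger
WHERE [Location Code] = 'VAN'
  AND [Registering Date] <= '2017-05-03'
GROUP BY Salesperson, [Item No]
ORDER BY Salesperson, [Item No]
""").fetchall()

for row in totals:
    print(row)
```

HAVING is still the right place for conditions on the aggregate itself, for example keeping only groups whose SUM(Quantity) is above zero.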

SQL months_between grouping issue

First post; go easy on me.
Relatively new to SQL (anything beyond simple queries really), but attempting to learn more complex functions in an effort to take advantage of superior server resources. My issue:
I would like to use a SUM function to aggregate cash flows across a very large variety of sources. I would like to see these cash flows along a monthly time period. Because the cash flows start at different times, I would like to season them so that they are all aligned. My current code:
select
months_between(A.reporting_date, B.start_date) as season,
sum(case when A.current_balance is null then B.original_balance
else A.current_balance end) as cashflow
from dataset1 A, dataset2 B
group by season
order by season
Now, executing the code like this generates an error message that states that A.reporting_date and B.start_date must be GROUPED or part of an AGGREGATE function.
The problem is, if I add them to the GROUP BY statement, while it generates output without error, I get cash flow sums that are essentially Cartesian crosses with all the grouped variables.
So long story short, is there any way for me to get cash flow sums grouped by only the season? If so, any ideas how to do it?
Thank you.

Many databases don't allow a column alias defined in the select list to be used in the where or group by clauses of the same query (order by generally does accept it).
For your query you should use months_between(A.reporting_date, B.start_date) instead of the alias season in group by; repeating the expression in order by works everywhere too.
Also, your query will return a cross product, as a join condition isn't specified.
select
months_between(A.reporting_date, B.start_date) as season,
sum(case when A.current_balance is null then B.original_balance
else A.current_balance end) as cashflow
from dataset1 A
JOIN dataset2 B ON --add a join condition
group by months_between(A.reporting_date, B.start_date)
order by months_between(A.reporting_date, B.start_date)
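Here is a small runnable sketch of the same shape: group by the expression itself and give the join a condition. It uses an in-memory SQLite database from Python, so a few things are assumptions. SQLite has no months_between, so a hand-written whole-month difference stands in for it. The join key loan_id is hypothetical, since the question never says how dataset1 and dataset2 relate. COALESCE is used as shorthand for the CASE ... IS NULL expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- loan_id is a hypothetical key linking the two datasets.
CREATE TABLE dataset2 (loan_id INTEGER PRIMARY KEY, start_date TEXT, original_balance REAL);
CREATE TABLE dataset1 (loan_id INTEGER, reporting_date TEXT, current_balance REAL);
INSERT INTO dataset2 VALUES (1, '2020-01-01', 100.0), (2, '2020-03-01', 200.0);
INSERT INTO dataset1 VALUES
  (1, '2020-01-01', NULL),
  (1, '2020-02-01', 90.0),
  (2, '2020-03-01', NULL),
  (2, '2020-04-01', 180.0);
""")

# Stand-in for months_between(A.reporting_date, B.start_date): whole months only.
SEASON = """
  (CAST(strftime('%Y', A.reporting_date) AS INTEGER)
     - CAST(strftime('%Y', B.start_date) AS INTEGER)) * 12
  + CAST(strftime('%m', A.reporting_date) AS INTEGER)
  - CAST(strftime('%m', B.start_date) AS INTEGER)
"""

query = f"""
SELECT {SEASON} AS season,
       SUM(COALESCE(A.current_balance, B.original_balance)) AS cashflow
FROM dataset1 A
JOIN dataset2 B ON B.loan_id = A.loan_id   -- without this, every A row meets every B row
GROUP BY {SEASON}                          -- the expression, not the alias
ORDER BY season
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Both loans line up on season 0 and season 1 even though they start in different calendar months, and there is one output row per season.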

How to merge two or more queries with different where conditions? I have to reuse the code which is being used in 1st where code

---below query gives all the customers from fact irrespective of condition
SELECT count( dbo.Fact_Promotion.customerid) as Mailquantity
FROM dbo.Fact_Promotion
INNER JOIN dbo.Dim_Promotion
ON dbo.Fact_Promotion.PromotionID = dbo.Dim_Promotion.PromotionID
---below query gives customers with where condition
SELECT count(distinct fact_loan.customerid) as [New loans] ,avg(Fact_Loan.Financeamount) as [Avg New Loan Amount]
FROM dbo.Fact_Promotion
where <condition>
AND dbo.Fact_Loan.LoanTypeID = 6
AND dbo.Fact_Loan.AccountStatusID = 1
----below query gives customers with different where condition
SELECT count(distinct fact_loan.customerid) as [Total loans],avg(Fact_Loan.Financeamount) as [Avg Total Loan Amount]
FROM dbo.Fact_Promotion
where <condition>
AND dbo.Fact_Loan.AccountStatusID = 1

I'm not sure from your question what you are trying to achieve.
The WHERE clause in the second query appears to deliver a subset of the data from the WHERE clause in the third query. Both WHERE statements look identical with the exception that the second query (New loans) includes an extra condition that the LoanTypeId (presumably the financial product they have taken) is 6. I guess this is the latest loan product or campaign.
Without knowing what you're trying to do it's difficult to give you an answer but if you want to show total number of customers by LoanTypeId you could aggregate a count by adding the LoanTypeId column to the SELECT statement and adding a GROUP BY dbo.Fact_Loan.LoanTypeId to the end of the statement.
This may not be as straightforward as that, since you're doing some other stuff in your SELECT (such as the DISTINCT and the AVG), but without knowing what your end goal is, it's difficult to fully answer your question.
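For what it's worth, the GROUP BY suggestion could look roughly like the sketch below, run here against an in-memory SQLite database from Python. The Fact_Loan columns are taken from the question, but the sample rows are invented and the elided <condition> is simply left out, so treat it as a guess at the shape rather than a drop-in answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down, hypothetical Fact_Loan with invented rows.
CREATE TABLE Fact_Loan (
  CustomerID      INTEGER,
  LoanTypeID      INTEGER,
  AccountStatusID INTEGER,
  FinanceAmount   REAL
);
INSERT INTO Fact_Loan VALUES
  (1, 6, 1, 1000.0),
  (2, 6, 1, 3000.0),
  (2, 6, 1, 2000.0),   -- same customer, second loan
  (3, 2, 1,  500.0),
  (4, 2, 0,  900.0);   -- filtered out by AccountStatusID
""")

# One row per loan type instead of one query per loan type.
rows = conn.execute("""
SELECT LoanTypeID,
       COUNT(DISTINCT CustomerID) AS Customers,
       AVG(FinanceAmount)         AS AvgAmount
FROM Fact_Loan
WHERE AccountStatusID = 1
GROUP BY LoanTypeID
ORDER BY LoanTypeID
""").fetchall()

for row in rows:
    print(row)
```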

Trouble with pulling distinct data

Ok this is hard to explain partially because I'm bad at sql but this code isn't doing exactly what I want it to do. I'll try to explain what it is supposed to do as best I can and hopefully someone can spot a glaring mistake. I'm sorry about the long winded explanation but there is a lot going on here and I really could use the help.
The point of this script is to search for parts which need to be obsoleted. In other words, they haven't been used in three years and are still active.
When we obsolete a part, "part.status" is set to 'O'. It is normally null. Also, the word 'OBSOLETE' is usually written into "part.description".
The "WORK_ORDER" table contains every scheduled work order. These are defined by base, lot, and sub IDs. It also contains many dates, such as the date when the work order was closed.
The "REQUIREMENT" table contains all the parts required for each job. Many jobs may require multiple parts, some at different legs of the job. The way this is handled is that a given "REQUIREMENT.WORKORDER_BASE_ID" and "REQUIREMENT.WORKORDER_LOT_ID" may be listed on a dozen or so subsequent rows. Each line specifies a different "REQUIREMENT.PART_ID". The sub ID separates which leg of the job the part is needed for. All of the parts I care about start with 'PCH'.
When I run this code it returns 14 lines; I happen to know it should be returning about 39 right now. I believe the screwy part starts at line 17. I found that code on another forum, hoping that it would help solve the original problem. Without that code, I get like 27K lines because the DB is pulling every criteria-matching requirement from every criteria-matching work order. Many of these parts are used on multiple jobs. I've also tried using DISTINCT on REQUIREMENT.PART_ID, which seems like it should solve the problem. Alas, it doesn't.
So I know despite all the information I probably still didn't give nearly enough. Does anyone have any suggestions?
SELECT
PART.ID [Engr Master]
,PART.STATUS [Master Status]
,WO.CLOSE_DATE
,PT.ID [Die]
,PT.STATUS [Die Status]
FROM PART
CROSS APPLY(
SELECT
WORK_ORDER.BASE_ID
,WORK_ORDER.LOT_ID
,WORK_ORDER.SUB_ID
,WORK_ORDER.PART_ID
,WORK_ORDER.CLOSE_DATE
FROM WORK_ORDER
WHERE
GETDATE() - (360*3) > WORK_ORDER.CLOSE_DATE
AND PART.ID = WORK_ORDER.PART_ID
AND PART.STATUS ='O'
)WO
CROSS APPLY(
SELECT
REQUIREMENT.WORKORDER_BASE_ID
,REQUIREMENT.WORKORDER_LOT_ID
,REQUIREMENT.WORKORDER_SUB_ID
,REQUIREMENT.PART_ID
FROM REQUIREMENT
WHERE
WO.BASE_ID = REQUIREMENT.WORKORDER_BASE_ID
AND WO.LOT_ID = REQUIREMENT.WORKORDER_LOT_ID
AND WO.SUB_ID = REQUIREMENT.WORKORDER_SUB_ID
AND REQUIREMENT.PART_ID LIKE 'PCH%'
)REQ
CROSS APPLY(
SELECT
PART.ID
,PART.STATUS
FROM PART
WHERE
REQ.PART_ID = PART.ID
AND PART.STATUS IS NULL
)PT
ORDER BY PT.ID

This is difficult to understand without any sample data, but I took a stab at it anyway. I removed the second JOIN to PART (that had alias PART1) as it seemed unnecessary. I also removed the subquery that was looking for parts HAVING COUNT(PART_ID) = 1.
The first JOIN to PART should be done on REQUIREMENT.PART_ID = PART.PART_ID, as the relationship has already been defined from WORK_ORDER to REQUIREMENT; hence you can JOIN PART directly to REQUIREMENT at this point.
EDIT 03/23/2015
If I understand this correctly, you just need a distinct list of PCH parts, and their respective last (read: MAX) CLOSE_DATE. If that is the case, here is what I propose.
I broke the query up into a couple of CTEs. The first CTE is simply going through the PART table and pulling out a DISTINCT list of PCH parts, grouping by PART_ID and DESCRIPTION.
The second CTE is going through the REQUIREMENT table, joining to the WORK_ORDER table and, for each PART_ID (handled by the PARTITION), assigning the CLOSE_DATE a ROW_NUMBER in descending order. This will ensure that each ROW_NUMBER with a value of "1" will be the Max CLOSE_DATE for each PART_ID.
The final SELECT statement simply JOINs the two CTEs on PART_ID, filtering where LastCloseDate = 1 (the ROW_NUMBER assigned in the second CTE).
If I understand the requirements correctly, this should give you the desired results.
Additionally, I removed the filter WHERE PART.DESCRIPTION NOT LIKE 'OB%' because we're already filtering by PART.STATUS IS NULL and you stated above that an 'O' is placed in this field for Obsolete parts. Also, [DIE] and [ENGR MASTER] have the same value in the 27 rows being pulled before, so I just used the same field and labeled them differently.
; WITH Parts AS(
-- PART's key column is ID (per the question); expose it as PART_ID for the final join
SELECT prt.ID AS PART_ID
, prt.DESCRIPTION
FROM PART prt
WHERE prt.STATUS IS NULL
AND prt.ID LIKE 'PCH%'
GROUP BY prt.ID, prt.DESCRIPTION
)
, LastCloseDate AS(
SELECT req.PART_ID
, wrd.CLOSE_DATE
, ROW_NUMBER() OVER(PARTITION BY req.PART_ID ORDER BY wrd.CLOSE_DATE DESC) AS LastCloseDate
FROM REQUIREMENT req
INNER JOIN WORK_ORDER wrd
ON wrd.BASE_ID = req.WORKORDER_BASE_ID
AND wrd.LOT_ID = req.WORKORDER_LOT_ID
AND wrd.SUB_ID = req.WORKORDER_SUB_ID
WHERE wrd.CLOSE_DATE IS NOT NULL
AND GETDATE() - (365 * 3) > wrd.CLOSE_DATE
)
SELECT prt.PART_ID AS [DIE]
, prt.PART_ID AS [ENGR MASTER]
, prt.DESCRIPTION
, lst.CLOSE_DATE
FROM Parts prt
INNER JOIN LastCloseDate lst
ON prt.PART_ID = lst.PART_ID
WHERE LastCloseDate = 1
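The ROW_NUMBER idea can be tried on toy data with the sketch below, which uses an in-memory SQLite database from Python. Several things are assumptions: the lot and sub IDs are dropped to keep it short, the rows are invented, and a fixed CUTOFF date replaces GETDATE() - (365 * 3) so the result is repeatable. Note one deliberate difference from the query above: here the age test is applied after ranking, so a part is listed only when its most recent closed work order is older than the cut-off. That is a choice made for this sketch, not something the query above does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of the three tables (base ID only).
CREATE TABLE PART (ID TEXT PRIMARY KEY, STATUS TEXT);
CREATE TABLE WORK_ORDER (BASE_ID TEXT PRIMARY KEY, CLOSE_DATE TEXT);
CREATE TABLE REQUIREMENT (WORKORDER_BASE_ID TEXT, PART_ID TEXT);
INSERT INTO PART VALUES ('PCH-1', NULL), ('PCH-2', NULL), ('PCH-3', 'O');
INSERT INTO WORK_ORDER VALUES
  ('W1', '2010-01-15'), ('W2', '2011-06-30'), ('W3', '2014-11-02');
INSERT INTO REQUIREMENT VALUES
  ('W1', 'PCH-1'), ('W2', 'PCH-1'),
  ('W1', 'PCH-2'), ('W3', 'PCH-2'),
  ('W1', 'PCH-3');
""")

CUTOFF = "2012-03-23"  # stands in for "three years before today"

rows = conn.execute("""
WITH Ranked AS (
  -- rn = 1 marks the most recent closed work order for each part
  SELECT req.PART_ID, wo.CLOSE_DATE,
         ROW_NUMBER() OVER (PARTITION BY req.PART_ID
                            ORDER BY wo.CLOSE_DATE DESC) AS rn
  FROM REQUIREMENT req
  JOIN WORK_ORDER wo ON wo.BASE_ID = req.WORKORDER_BASE_ID
  WHERE wo.CLOSE_DATE IS NOT NULL
)
SELECT p.ID, r.CLOSE_DATE
FROM PART p
JOIN Ranked r ON r.PART_ID = p.ID
WHERE r.rn = 1
  AND p.STATUS IS NULL
  AND p.ID LIKE 'PCH%'
  AND r.CLOSE_DATE < ?
ORDER BY p.ID
""", (CUTOFF,)).fetchall()

for row in rows:
    print(row)
```

PCH-2 drops out because it was last used in 2014, and PCH-3 drops out because it is already marked obsolete, leaving one row per remaining part.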

SQL sub queries - is there a better way

This is an SQL efficiency question.
A while back I had to write a collection of queries to pull data from an ERP system. Most of these were simple enough, but one of them resulted in a rather inefficient query, and it's been bugging me ever since as there's got to be a better way.
The problem is not complex. You have rows of sales data. In each row you have quantity, sales price and the salesman code, among other information.
Commission is paid based on a stepped sliding scale. The more they sell, the better the commission. Steps might be 1000, 10000, 10000$ and so forth. The real-world problem is more complex, but that's essentially it.
The only way I found of doing this was to do something like this (obviously not the real query)
select qty, price, salesman,
(select top 1 percentage from comissions
where comisiones.salesman = saleslines.salesman
and saleslines.qty > comisiones.qty
order by comissiones.qty desc
) percentage
from saleslines
This results in the correct commission but is horrendously heavy.
Is there a better way of doing this? I'm not looking for someone to rewrite my sql, more 'take a look at foobar queries' and I can take it from there.
The real life commission structure can be specified for different salesmen, articles and clients and even sales dates. It also changes from time to time, so everything has to be driven by the data in the tables, i.e. I can't put fixed ranges in the sql. The current query returns some 3-400000 rows and takes around 20-30 secs. Luckily it's only used monthly, but the slowness is kinda bugging me.
This is on mssql.
Ian
edit:
I should have given a more complex example from the beginning. I realize now that my initial example is missing a few essential elements of the complexity, apologies to all.
This may better capture it
select client-code, product, product-family, qty, price, discount, salesman,
(select top 1 percentage from comissions
where comisiones.salesman = saleslines.salesman
and saleslines.qty > comisiones.qty
and [
a collection of conditions which may or may not apply:
Exclude rows if the salesman has offered discounts above max discounts
which appear in each row in the commissions table
There may be a special scale for the product family
There may be a special scale for the product
There may be a special scale for the client
A few more cases
]
order by [
The user can control the order though a table
which can prioritize by client, family or product
It normally goes from most to least specific.
]
) percentage
from saleslines
Needless to say, the real query is not easy to follow. Just to make life more interesting, its naming is multi-language.
Thus for every row of salesline the commission can be different.
It may sound overly complex but if you think of how you would pay commission it makes sense. You don't want to pay someone for selling stuff at high discounts, you also want to be able to offer a particular client a discount on a particular product if they buy X units. The salesman should earn more if they sell more.
In all the above I'm excluding date limited special offers.
I think partitions may be the solution, but I need to explore this more in depth as I know nothing about partitions. It's given me a few ideas.

If you are using a version of SQL Server that supports common-table expressions such as SQL Server 2005 and later, a more efficient solution might be:
With RankedCommissions As
(
Select SL.qty, SL.price, SL.salesman, C.percentage
, Row_Number() Over ( Partition By SL.salesman Order By C.Qty Desc ) As CommissionRank
From SalesLines As SL
Join Commissions As C
On SL.salesman = C.salesman
And SL.qty > C.qty
)
Select qty, price, salesman, percentage
From RankedCommissions
Where CommissionRank = 1
If you needed to account for the possibility that there are no Commissions values for a given salesman where the SalesLine.Qty > Commission.Qty, then you could do something like:
With RankedCommissions As
(
Select SL.qty, SL.price, SL.salesman, C.percentage
, Row_Number() Over ( Partition By SL.salesman Order By C.Qty Desc ) As CommissionRank
From SalesLines As SL
Join Commissions As C
On SL.salesman = C.salesman
And SL.qty > C.qty
)
Select SL.qty, SL.price, SL.salesman, RC.percentage
From SalesLines As SL
Left Join RankedCommissions As RC
On RC.salesman = SL.salesman
And RC.CommissionRank = 1
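One caveat worth a sketch: partitioning by salesman ranks all of a salesman's lines together, so only one row per salesman gets rank 1. If the sales lines have some unique key, partitioning by that key ranks the matching commission steps separately for each line. The example below tries this on an in-memory SQLite database from Python. The line_id column and the sample rows are assumptions, since the question shows no key for saleslines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- line_id is a hypothetical unique key for each sales line.
CREATE TABLE SalesLines (line_id INTEGER PRIMARY KEY, salesman TEXT, qty INTEGER, price REAL);
CREATE TABLE Commissions (salesman TEXT, qty INTEGER, percentage REAL);
INSERT INTO Commissions VALUES
  ('ian', 0, 1.0), ('ian', 1000, 2.5), ('ian', 10000, 4.0),
  ('bob', 500, 3.0);
INSERT INTO SalesLines VALUES
  (1, 'ian',    50, 9.99),
  (2, 'ian',  1500, 9.50),
  (3, 'ian', 20000, 9.00),
  (4, 'bob',   100, 5.00);   -- below bob's only step, so no commission row
""")

rows = conn.execute("""
WITH RankedCommissions AS (
  -- For each sales line, rank the steps it qualifies for, highest step first.
  SELECT SL.line_id, C.percentage,
         ROW_NUMBER() OVER (PARTITION BY SL.line_id
                            ORDER BY C.qty DESC) AS CommissionRank
  FROM SalesLines SL
  JOIN Commissions C
    ON C.salesman = SL.salesman
   AND SL.qty > C.qty
)
SELECT SL.line_id, SL.qty, SL.salesman, RC.percentage
FROM SalesLines SL
LEFT JOIN RankedCommissions RC
  ON RC.line_id = SL.line_id
 AND RC.CommissionRank = 1
ORDER BY SL.line_id
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps sales lines that qualify for no step at all; they come back with a NULL percentage instead of disappearing.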

select
saleslines.qty, saleslines.price, saleslines.salesman,
max(comissions.percentage) as percentage
from saleslines
inner join comissions on comissions.salesman = saleslines.salesman and
saleslines.qty > comissions.qty
group by
saleslines.qty, saleslines.price, saleslines.salesman