How would I translate this SQL query into a Raven Map/Reduce query? - ravendb

Following on from my previous question, "When is a groupby query evaluated in RavenDB?", I decided to completely restructure the data into a format that is theoretically easier to query.
Having now created the new data structure, I am struggling to work out how to query it.
It took me 30 seconds to write the following SQL query which gives me exactly the results I need:
SELECT GroupCompanyId, AccountCurrency, AccountName, DATEPART(year, Date) AS Year,
(SELECT SUM(Debit) AS Expr1
FROM Transactions AS T2
WHERE (T1.GroupCompanyId = GroupCompanyId) AND (T1.AccountCurrency = AccountCurrency) AND (T1.AccountName = AccountName) AND (DATEPART(year,
Date) < DATEPART(year, T1.Date))) AS OpeningDebits,
(SELECT SUM(Credit) AS Expr1
FROM Transactions AS T2
WHERE (T1.GroupCompanyId = GroupCompanyId) AND (T1.AccountCurrency = AccountCurrency) AND (T1.AccountName = AccountName) AND (DATEPART(year,
Date) < DATEPART(year, T1.Date))) AS OpeningCredits, SUM(Debit) AS Db, SUM(Credit) AS Cr
FROM Transactions AS T1
WHERE (DATEPART(year, Date) = 2011)
GROUP BY GroupCompanyId, AccountCurrency, AccountName, DATEPART(year, Date)
ORDER BY GroupCompanyId, AccountCurrency, Year, AccountName
So far I have got the Map/Reduce as follows, which from Studio appears to give the correct results - i.e. it breaks down and groups the data by date.
public Transactions_ByDailyBalance()
{
Map = transactions => from transaction in transactions
select new
{
transaction.GroupCompanyId,
transaction.AccountCurrency,
transaction.Account.Category,
transaction.Account.GroupType,
transaction.AccountId,
transaction.AccountName,
transaction.Date,
transaction.Debit,
transaction.Credit,
};
Reduce = results => from result in results
group result by new
{
result.GroupCompanyId,
result.AccountCurrency,
result.Category,
result.GroupType,
result.AccountId,
result.AccountName,
result.Date,
}
into g
select new
{
GroupCompanyId = g.Select(x=>x.GroupCompanyId).FirstOrDefault(),
AccountCurrency = g.Select(x=>x.AccountCurrency).FirstOrDefault(),
Category=g.Select(x=>x.Category).FirstOrDefault(),
GroupType=g.Select(x=>x.GroupType).FirstOrDefault(),
AccountId = g.Select(x=>x.AccountId).FirstOrDefault(),
AccountName=g.Select(x=>x.AccountName).FirstOrDefault(),
Date=g.Select(x=>x.Date).FirstOrDefault(),
Debit=g.Sum(x=>x.Debit),
Credit=g.Sum(x=>x.Credit)
};
Index(x=>x.GroupCompanyId,FieldIndexing.Analyzed);
Index(x=>x.AccountCurrency,FieldIndexing.Analyzed);
Index(x=>x.Category,FieldIndexing.Analyzed);
Index(x=>x.AccountId,FieldIndexing.Analyzed);
Index(x=>x.AccountName,FieldIndexing.Analyzed);
Index(x=>x.Date,FieldIndexing.Analyzed);
}
}
However, I can't work out how to query the data at one go.
I need the opening balance as well as the period balance, so I ended up writing the query below, which takes the account as a parameter. Oren commented on my previous question that I was mixing Linq with a Lucene query; having rewritten the query, I have basically ended up with a mixed query again.
Although the SQL query above filters by year, I actually need to be able to determine the balance as of any day.
private LedgerBalanceDto GetAccountBalance(BaseAccountCode account, DateTime periodFrom, DateTime periodTo, string queryName)
{
using (var session = MvcApplication.RavenSession)
{
var query = session.Query<Transactions_ByDailyBalance.Result, Transactions_ByDailyBalance>()
.Where(c=>c.AccountId==account.Id && c.Date>=periodFrom && c.Date<=periodTo)
.OrderBy(c=>c.Date)
.ToList();
var debits = query.Sum(c => c.Debit);
var credits = query.Sum(c => c.Credit);
var ledgerBalanceDto = new LedgerBalanceDto
{
Account = account,
Credits = credits,
Debits = debits,
Currency = account.Currency,
CurrencySymbol = account.CurrencySymbol,
Name = queryName,
PeriodFrom = periodFrom,
PeriodTo = periodTo
};
return ledgerBalanceDto;
}
}
Required result:
GroupCompanyId    AccountCurrency  AccountName  Year  OpeningDebits  OpeningCredits  Db          Cr
Groupcompanies-2  EUR              Customer 1   2011  148584.2393    125869.91       10297.6891  28023.98
Groupcompanies-2  EUR              Customer 2   2011  236818.0054    233671.55       50959.85    54323.38
Groupcompanies-2  USD              Customer 3   2011  69426.11761    23516.3776      10626.75    0
Groupcompanies-2  USD              Customer 4   2011  530587.9223    474960.51       97463.544   131497.16
Groupcompanies-2  USD              Customer 5   2011  29542.391      28850.19        4023.688    4231.388
Any suggestions would be greatly appreciated
Jeremy
In answer to the comment
I basically ended up doing pretty much the same thing. Actually, I wrote an index that does it in only two hits - once for the opening balance and again for the period balance. This is almost instantaneous for grouping by the account name, category etc.
However, my problem now is getting a daily running balance for an individual account. If I bring down all the data for the account and the period, it's not a problem: I can sum the balance on the client. However, when the data is paged and the debits and credits are grouped by Date and Id, the page boundary can fall in the middle of a date, so the opening/closing balance is not correct.
Page 1
Opening balance until 26/7/12 = 0
25/7/12 Acct1 Db 100 Cr 0 Bal +100 Runn Bal +100
26/7/12 Acct1 Db 100 Cr 0 Bal +100 Runn Bal +200
26/7/12 Acct1 Db 200 Cr 0 Bal +200 Runn Bal +400
Closing balance until 26/7/12 = +400
Page 2
Opening balance until 26/7/12 = +450 (this is wrong - it should be the balance at the end of Page 1, but it is the balance until the 26/7/12 - i.e. includes the first item on Page 2)
26/7/12 Acct1 Db 50 Cr 0 Bal +50 Runn Bal +500 (should be +450)
27/7/12 Acct1 Db 60 Cr 0 Bal +60 Runn Bal +560 (should be +510)
I just can't think up an algorithm to handle this.
Any ideas?
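One possible way to think about the paging problem, offered as a sketch rather than a tested RavenDB solution: derive each page's opening balance from the rows that precede the page in the sort order, instead of from the balance up to a date. The Python below only illustrates the arithmetic on an in-memory list; the function name and row layout are hypothetical, and in practice the sum over the preceding rows would have to come from an index or a separate query, which is not shown here.

```python
# Rows sorted by (date, id); the amounts mirror the Page 1 / Page 2 example above.
rows = [
    ("25/7/12", 100.0, 0.0),
    ("26/7/12", 100.0, 0.0),
    ("26/7/12", 200.0, 0.0),
    ("26/7/12", 50.0, 0.0),
    ("27/7/12", 60.0, 0.0),
]


def page_with_running_balance(sorted_rows, page, page_size, opening=0.0):
    """Return (page opening balance, rows with running balance) for a zero-based page."""
    start = page * page_size
    # The opening balance counts exactly the rows before this page, so a date
    # that straddles the page boundary is never counted twice.
    balance = opening + sum(db - cr for _, db, cr in sorted_rows[:start])
    page_opening = balance
    out = []
    for day, db, cr in sorted_rows[start:start + page_size]:
        balance += db - cr
        out.append((day, db, cr, balance))
    return page_opening, out


opening_1, page_1 = page_with_running_balance(rows, page=0, page_size=3)
opening_2, page_2 = page_with_running_balance(rows, page=1, page_size=3)
print(opening_2, [row[3] for row in page_2])
```

With a page size of 3, the second page opens at the closing balance of the first page, and its running balances match the "should be" values in the example.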

Hi, this is a problem I also faced recently with RavenDB when I needed to retrieve rolling balances as at any given date. I never found a way of doing this all in one go, but I managed to reduce the number of documents that I needed to pull back in order to calculate the rolling balance.
I did this by writing multiple map/reduce indexes that summed up the value of transactions within specific periods:
My first index summed up the value of all transactions grouped at the year level.
My second index summed up the value of all transactions at the day level.
So if someone wanted their account balance as at 1st June 2012, I would:
Use the year-level map/reduce index to get the value of transactions for each year before 2012 and sum them together (so if transactions started being captured in 2009, I should be pulling back 3 documents).
Use the day-level map/reduce index to get all documents between the start of the year and the 1st of June.
I then added the day totals to the year totals for my final rolling balance (I could also have had a monthly map/reduce index but didn't bother).
Anyway, it's not as quick as in SQL, but it was the best alternative I could come up with to avoid bringing back every single transaction.
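The two-level rollup described above can be sketched in plain Python. This is only an illustration of the arithmetic, not RavenDB index code: the dictionaries stand in for the year-level and day-level map/reduce results, and the transaction values and function names are made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (date, debit, credit). Values are illustrative only.
transactions = [
    (date(2009, 3, 1), 100.0, 0.0),
    (date(2010, 7, 15), 50.0, 20.0),
    (date(2011, 12, 31), 0.0, 30.0),
    (date(2012, 1, 10), 40.0, 0.0),
    (date(2012, 6, 1), 10.0, 5.0),
    (date(2012, 6, 2), 999.0, 0.0),  # after the as-at date, must be excluded
]


def build_rollups(rows):
    """Stand-in for the two indexes: net totals per year and per day."""
    by_year = defaultdict(float)
    by_day = defaultdict(float)
    for day, debit, credit in rows:
        net = debit - credit
        by_year[day.year] += net
        by_day[day] += net
    return dict(by_year), dict(by_day)


def balance_as_at(by_year, by_day, as_at):
    """Whole prior years come from the yearly rollup, the current year from the daily one."""
    prior_years = sum(total for year, total in by_year.items() if year < as_at.year)
    start_of_year = date(as_at.year, 1, 1)
    current_year = sum(
        total for day, total in by_day.items() if start_of_year <= day <= as_at
    )
    return prior_years + current_year


by_year, by_day = build_rollups(transactions)
balance = balance_as_at(by_year, by_day, date(2012, 6, 1))
print(balance)
```

For an as-at date of 1st June 2012, only three yearly entries (2009 to 2011) plus the 2012 daily entries up to that date are read, rather than every transaction.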

Related

MS Access Query - Display All Rows from Original Joined Table

I need to write a query that displays ALL revenue centers for a month whether they have revenue or not. This seems like a simple request, but I seem to have hit a brick wall. Below is my SQL:
SELECT ID_ItemNominal, ItemNominal_Description, Sum(Nz([ITM_Net],0)) AS ITM_Net_Total
FROM TSub_ItmNominal LEFT JOIN (T_Invoice RIGHT JOIN T_LineItems ON T_Invoice.ITM_Reference = T_LineItems.ITM_Reference) ON TSub_ItmNominal.ID_ItemNominal = T_LineItems.ITM_Nominal
WHERE (((Year([ITM_Date]))=[report_year] Or (Year([ITM_Date])) Is Null) AND ((Month([ITM_Date]))=[report_month]))
GROUP BY TSub_ItmNominal.ID_ItemNominal, TSub_ItmNominal.ItemNominal_Description
HAVING (((TSub_ItmNominal.ID_ItemNominal) Like "4*"))
ORDER BY TSub_ItmNominal.ID_ItemNominal;
ID_ItemNominal = the Integer code for the Revenue Center
ItemNominal_Description = the description of the Revenue Center
ITM_Net = the Currency amount for the Line Item on the Invoice, to be SUM for a month total
ITM_Date = the Date of the Invoice
My thought was to use the LEFT JOIN to say that I want to see ALL of the revenue centers, even if those records do not have any data for that month. What I get instead is that the centers that DO have revenue for the year but DO NOT have revenue for the month are filtered out.
What the Current query provides:
40500 | Sales - Digital | $###.##
40700 | Sales - Misc | $###.##
40800 | Sales - Mail | $###.##
40900 | Sales - Clothing| $0.00
We have not had any revenue under 40900 so far this year, so it shows up in the query result. We have had revenue in 40600 this year, but not for the month of April. The 40600 center seems to be filtered out by the WHERE part of the query, as are any other revenue centers that have revenue for the year but not for the selected month.
I would like to see these revenue centers included in the query but show as $0.00 for the month.
Any help would be greatly appreciated, I feel like I am close but I just can't seem to get the correct results. Thank you in advance!
You could take the original table with all rows and left join your query to it, like this:
SELECT t1.ID_ItemNominal, t1.ItemNominal_Description,t2.ITM_Net_Total
FROM TSub_ItmNominal AS t1 LEFT JOIN (SELECT ID_ItemNominal, ItemNominal_Description, Sum(Nz([ITM_Net],0)) AS ITM_Net_Total
FROM TSub_ItmNominal LEFT JOIN (T_Invoice RIGHT JOIN T_LineItems ON T_Invoice.ITM_Reference = T_LineItems.ITM_Reference) ON TSub_ItmNominal.ID_ItemNominal = T_LineItems.ITM_Nominal
WHERE (((Year([ITM_Date]))=[report_year] Or (Year([ITM_Date])) Is Null) AND ((Month([ITM_Date]))=[report_month]))
GROUP BY TSub_ItmNominal.ID_ItemNominal, TSub_ItmNominal.ItemNominal_Description
HAVING (((TSub_ItmNominal.ID_ItemNominal) Like "4*")) ) AS t2 ON t1.ID_ItemNominal = t2.ID_ItemNominal
WHERE ((([t1].[ID_ItemNominal]) Like "4*"))
ORDER BY t1.ID_ItemNominal;
Usually, when you combine a LEFT JOIN with a WHERE condition on the joined (right-hand) table, the result is equivalent to an INNER JOIN, because unmatched rows have Null in those columns and fail the condition. But according to your specification, the sub-items table should be joined optionally, since the actual sales data does not cover all revenue centers.
Therefore, run the WHERE filtering within a subquery and then left join this subquery to the main set of revenue centers. Also, for readability, the query below converts the RIGHT JOIN to a LEFT JOIN and uses table aliases instead of full table names.
SELECT main.ITM_Nominal,
main.ItemNominal_Description,
SUM(NZ(sub.[ITM_Net], 0)) AS ITM_Net_Total
FROM (T_LineItems AS main
LEFT JOIN T_Invoice AS inv
ON inv.ITM_Reference = main.ITM_Reference)
LEFT JOIN (
SELECT ID_ItemNominal, [ITM_Net]
FROM TSub_ItmNominal
WHERE ID_ItemNominal ALIKE '4%'
AND YEAR([ITM_Date]) = [report_year]
AND MONTH([ITM_Date]) = [report_month]
) AS sub
ON sub.ID_ItemNominal = main.ITM_Nominal
GROUP BY main.ITM_Nominal,
main.ItemNominal_Description
ORDER BY main.ITM_Nominal;
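The difference between filtering in the outer WHERE clause and filtering inside a subquery before the LEFT JOIN can be reproduced with a small, self-contained sketch. It runs against SQLite through Python rather than Access, and the tables `centers` and `line_items`, their columns, and the sample rows are simplified, hypothetical stand-ins for the real schema (the name given to center 40600 is made up).

```python
import sqlite3

# Simplified, hypothetical stand-ins for the revenue-center and line-item tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE centers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE line_items (center_id INTEGER, month INTEGER, net REAL);
    INSERT INTO centers VALUES (40500, 'Sales - Digital'),
                               (40600, 'Center 40600'),
                               (40900, 'Sales - Clothing');
    -- 40500 sold in April, 40600 only in March, 40900 never.
    INSERT INTO line_items VALUES (40500, 4, 120.0), (40600, 3, 75.0);
""")

# Filter in the outer WHERE: 40600 joins to its March row, which then fails
# the month test, so the whole center disappears from the result.
filtered_in_where = conn.execute("""
    SELECT c.id, COALESCE(SUM(l.net), 0)
    FROM centers AS c
    LEFT JOIN line_items AS l ON l.center_id = c.id
    WHERE l.month = 4 OR l.month IS NULL
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Filter inside a subquery, then LEFT JOIN: every center survives.
filtered_in_subquery = conn.execute("""
    SELECT c.id, COALESCE(SUM(s.net), 0)
    FROM centers AS c
    LEFT JOIN (SELECT center_id, net FROM line_items WHERE month = 4) AS s
           ON s.center_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(filtered_in_where)
print(filtered_in_subquery)
```

The first query loses the center that has revenue only in another month, which matches the symptom in the question; the second keeps it with a total of zero. Access syntax differs (Nz instead of COALESCE, for example), so treat this as a demonstration of the join behaviour only.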

How to group on an aggregated data

I have to gather data from a table such that:
1. Each order is grouped by shipment type.
2. Orders are kept only if the sum of sales at the shipping-group level is less than $35 while the overall order sales are greater than $35.
For example, suppose an order has two shipping groups, S2H and SDD:
S2H = $20
SDD = $25
Order total = $45
In the above example the individual shipment total is less than $35, however order total is greater than $35.
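To make the rule concrete, here is a minimal Python sketch of the two-level aggregation it implies: first total the sales per order and shipping group, then keep the orders where every group total is below the threshold while the order total is above it. The sample rows and the function name are hypothetical. In SQL the same shape would be an inner GROUP BY on order and shipping group, followed by an outer GROUP BY on order with a HAVING clause on MAX and SUM of the group totals.

```python
from collections import defaultdict

# Hypothetical line items: (order_no, shipping_group, net_price).
line_items = [
    ("A1", "S2H", 20.0),
    ("A1", "SDD", 25.0),  # each group < 35, order total 45: qualifies
    ("B2", "SDD", 40.0),  # one group is already >= 35: excluded
    ("C3", "S2H", 10.0),
    ("C3", "SDD", 15.0),  # order total 25: excluded
]

THRESHOLD = 35.0


def orders_below_per_group_but_above_total(items, threshold=THRESHOLD):
    # Level 1: total sales per (order, shipping group).
    group_totals = defaultdict(float)
    for order_no, group, price in items:
        group_totals[(order_no, group)] += price

    # Level 2: collect the group totals of each order.
    per_order = defaultdict(list)
    for (order_no, _group), total in group_totals.items():
        per_order[order_no].append(total)

    # Keep orders where every group is under the threshold but the order is over it.
    return sorted(
        order_no
        for order_no, totals in per_order.items()
        if max(totals) < threshold and sum(totals) > threshold
    )


qualifying = orders_below_per_group_but_above_total(line_items)
print(qualifying)
```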
I've got this far and would like to know how to move forward from here to get the desired data:
SELECT
(CASE WHEN Sales > 35 AND Digital = 0 AND SDD = 1 AND BOPS = 0 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Above_SDD_Order,
(CASE WHEN Sales <= 35 AND  Digital = 0 AND SDD = 1 AND BOPS = 0 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Below_SDD_Order,
(CASE WHEN Sales > 35 AND Digital = 1 AND SDD = 1 AND BOPS = 0 AND S2H = 1 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Above_SDD_S2H_Digital_Order,
(CASE WHEN Sales <= 35 AND Digital = 1 AND SDD = 1 AND BOPS = 0 AND S2H = 1 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Below_SDD_S2H_Digital_Order,
(CASE WHEN Sales > 35 AND Digital = 0 AND SDD = 1 AND BOPS = 0 AND S2H = 1 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Above_S2H_SDD_Order,
(CASE WHEN Sales <= 35 AND Digital = 0 AND SDD = 1 AND BOPS = 0 AND S2H = 1 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Below_SDD_S2H_Order,
(CASE WHEN Sales > 35 AND Digital = 0 AND SDD = 1 AND BOPS = 0 AND S2H = 0 AND PREORDER = 1 THEN order_no
      ELSE NULL END) AS Above_Preorder_SDD_Order,
(CASE WHEN Sales <= 35 AND Digital = 0 AND SDD = 1 AND BOPS = 0 AND S2H = 0 AND PREORDER = 1 THEN order_no
      ELSE NULL END) AS Below_Preorder_SDD_Order,
(CASE WHEN Sales <= 35 AND Digital = 1 AND SDD = 1 AND BOPS = 0 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Below_Digital_SDD_Order,
(CASE WHEN Sales > 35 AND Digital = 1 AND SDD = 1 AND BOPS = 0 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END )AS Above_Digital_SDD_Order,
(CASE WHEN Sales <= 35 AND Digital = 0 AND SDD = 1 AND BOPS = 1 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Below_BOPS_SDD_Order,
(CASE WHEN Sales > 35 AND Digital = 0 AND SDD = 1 AND BOPS = 1 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END) AS Above_BOPS_SDD_Order,
(CASE WHEN Sales <= 35 AND Digital = 1 AND SDD = 1 AND BOPS = 1 AND S2H = 0 AND PREORDER = 0 THEN order_no
      ELSE NULL END )AS Below_Digital_BOPS_SDD_Order,
(CASE WHEN Sales > 35 AND Digital = 1 AND SDD = 1 AND BOPS = 1 AND S2H = 0 AND PREORDER = 0 THEN order_no ELSE NULL END )AS Above_Digital_BOPS_SDD_Order
FROM 
(SELECT order_no,
MAX(CASE WHEN (Shipping_Group = 'DIGITAL_GIFT_CARD' OR Shipping_Group = 'DIGITAL' OR Shipping_Group = 'GAME_INFORMER_DIGITAL' OR Shipping_Group = 'LOYALTY') THEN 1 ELSE 0 END ) AS Digital,
MAX(CASE WHEN Shipping_Group = 'GAME_INFORMER_PHYSICAL' OR Shipping_Group = 'PHYSICAL' THEN 1 ELSE 0 END) AS S2H,
MAX(CASE WHEN Shipping_Group = 'PHYSICAL_PREORDER' OR Shipping_Group = 'DIGITAL_PREORDER' OR Shipping_Group = 'PICKUP_PREORDER' THEN 1 ELSE 0 END ) AS Preorder,    
MAX(CASE WHEN Shipping_Group = 'PICKUP' THEN 1 ELSE 0 END ) AS BOPS,
MAX(CASE WHEN Shipping_Group = 'SDD' THEN 1 ELSE 0 END) AS SDD,
SUM(pl_net_price) as Sales
From 
(SELECT order_no, Shipping_Group,pl_net_price
FROM 
eim-prod.EDW_VIEWS.ORDER_SUBMIT_PRODUCT_LINEITEM 
)
group by 1)
group by 1,2,3,4,5,6,7,8,9,10,11,12,13,14;
You can also tell me if my approach is wrong and how I should take it further as I am a newbie with SQL. Thanks

Compare data in the same table

I have a table that stores monthly data and I would like to create a comparison between the quantity movement within a period.
Here is an example of the table (image not shown).
The SQL statement below is meant to return any changes that have happened within a period: any partial or total loss of a fund/policy, as well as any partial or total gain. I have been battling with it for a while, so any help would be much appreciated.
I currently have 5 queries combined with UNION: one where the policies and funds match and there is a difference in the quantities held; two where the policies exist in the previous period but not in the current one, and vice versa; and two where the securities exist in the previous period but not in the current one, and vice versa. The first three work, but the last two (securities present in one period but not the other) do not seem to return every occurrence.
SELECT distinct pc.[Client]
,pc.Policy
,cast(pc.Qty as decimal) AS CurrQ
,0 AS PrevQ
,cast(pc.Qty as decimal) - 0 AS QtyDiff
,CASE WHEN cast(pc.Qty as decimal) - 0 > 0 THEN 'Bought Units'
WHEN cast(pc.Qty as decimal) - 0 < 0 THEN 'Sold Units'
ELSE 'Unknown'
END AS TransactionType
,convert(varchar,cast(pc.[ValDate] as date),103) AS CurrValDate
,'' AS PrevValDate
FROM table pc
WHERE convert(varchar,cast(pc.[ValDate] as date),103) = convert(varchar,getdate(),103)
AND pc.Policy IN (SELECT policy
FROM table
WHERE convert(varchar(10),[ValDate],103) = convert(varchar(10),getdate()-1,103)
AND pc.[Fund] NOT IN (SELECT PM.[Fund]
FROM table pc
LEFT JOIN table pm ON pc.policy = pm.policy
WHERE convert(varchar,cast(pc.[ValDate] as date),103) = convert(varchar,getdate(),103))
AND convert(varchar,cast(pm.[ValDate] as date),103) = convert(varchar,getdate()-1,103))
As @Larnu rightly mentioned in the comment section, the extra conditions in the query effectively turned the LEFT JOIN into an INNER JOIN. I changed the code to have policy, fund and date in the ON clause:
FROM table pc
LEFT JOIN table pm ON (pc.policy = pm.policy
AND pc.fund = pm.fund
AND pc.[ValDate]-1 = pm.[ValDate])
and got rid of the sub queries.
Thanks again Larnu.

SQL Server: difference in days for two dates in separate rows

I am using SQL Server 2012 and am currently working on a report that asks me to find the difference in days between two dates.
Basically, for a particular ReportID, I'm trying to find the difference in days between the (ReportCompletedDate when the ReportType = 'PaperReceived') - (ReportCompletedDate when the ReportType = 'Form Completed')
I tried to give some records below...
ReportID  ReportType      ReportCompletedDate
-------------------------------------------------
450       PaperReceived   9/5/2013
450       Form Completed  8/13/2013
451       PaperReceived   9/7/2013
451       Form Completed  7/12/2013
452       PaperReceived   10/6/2013
452       Form Completed  3/13/2013
So, for ReportID = 450 for example, I want to get 9/5/2013 - 8/13/2013 in days.
If anyone also knows how to do this for calendar days and business days, that would be awesome...but at least calendar days should be fine.
I'm not sure how to go about this, usually when the two dates are in line, it's easier for me to figure out, but not sure how to go about it when the dates are in separate rows. Any advice would be gladly appreciated.
You can do a self join to make it appear on "one line". Like this:
-- Start date first, end date second, so the result is PaperReceived minus Form Completed.
SELECT BASE.ReportID, DateDiff(day, FORM.ReportCompletedDate, BASE.ReportCompletedDate) AS Diff
FROM TABLE_NAME_YOU_DID_NOT_SAY BASE
LEFT JOIN TABLE_NAME_YOU_DID_NOT_SAY FORM ON BASE.ReportID = FORM.ReportID AND FORM.ReportType = 'Form Completed'
WHERE BASE.ReportType = 'PaperReceived'
Here is one approach:
SELECT RCVD.REPORTID
, DATEDIFF(DAY, CMPLD.REPORTCOMPLETEDDATE, RCVD.REPORTCOMPLETEDDATE) DAY_DIFF
FROM (
SELECT REPORTID
, REPORTCOMPLETEDDATE
FROM REPORTS
WHERE REPORTTYPE = 'PaperReceived'
) RCVD
JOIN (
SELECT REPORTID
, REPORTCOMPLETEDDATE
FROM REPORTS
WHERE REPORTTYPE = 'Form Completed'
) CMPLD
ON RCVD.REPORTID = CMPLD.REPORTID
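As a cross-check on what the self-join should return, the same pairing can be sketched in Python using the sample rows from the question. This is only an illustration, not SQL Server code; it assumes the dates are in month/day/year order (which 8/13/2013 implies), and the function name is hypothetical. It covers calendar days only, not business days.

```python
from datetime import datetime

# Sample rows from the question: (ReportID, ReportType, ReportCompletedDate).
rows = [
    (450, "PaperReceived", "9/5/2013"),
    (450, "Form Completed", "8/13/2013"),
    (451, "PaperReceived", "9/7/2013"),
    (451, "Form Completed", "7/12/2013"),
    (452, "PaperReceived", "10/6/2013"),
    (452, "Form Completed", "3/13/2013"),
]


def days_between_report_types(records):
    """Pair the two rows of each report and return PaperReceived minus Form Completed in days."""
    by_report = {}
    for report_id, report_type, completed in records:
        # Assumption: month/day/year format.
        parsed = datetime.strptime(completed, "%m/%d/%Y").date()
        by_report.setdefault(report_id, {})[report_type] = parsed
    return {
        report_id: (types["PaperReceived"] - types["Form Completed"]).days
        for report_id, types in by_report.items()
        if "PaperReceived" in types and "Form Completed" in types
    }


diffs = days_between_report_types(rows)
print(diffs)
```

Reports that lack one of the two rows are skipped here, which corresponds to the inner join variant; a left join would instead return a Null difference for them.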

Access SQL query - Percentage of Total calculation

In Access SQL, I am attempting what should be a simple task: calculating a percentage of total. There are 3 item stores (Sears, kmart & Mktpl), and for any given week I wish to calculate each store's percentage of total sales (everything can be obtained from one table, tbl_BUChannelReporting).
For example week 5 dummy numbers - Sears 7000, kmart 2500, mktpl 2000
the following ratios would be returned: sears 61%, kmart 22%, mktpl 17%
I was originally trying to create a sub query and wasn't getting anywhere so I am essentially trying to sum sales on one of the item stores in week 5 divided by the sum of all 3 item store sales in week 5. The following is my query, which is giving me "cannot have aggregate function in expression" error:
SELECT FY, FW, Rept_Chnl, BU_NM, Order_Store, Item_Store, CDBL(
SUM(IIF([item_store]="sears", revenue, IIF([item_store]="kmart", revenue, IIF([item_store]="mktpl", revenue,0)))) /
(SUM(IIF([item_store]="sears",revenue,0)+SUM(IIF([item_store]="kmart",revenue,0)+SUM(IIF([item_store]="mktpl",revenue,0))))))
AS Ratios
FROM tbl_BUChannelReporting
WHERE FY = "2017"
AND FW = 5
GROUP BY FY, FW, Rept_Chnl, BU_NM, Order_Store, item_store
Thanks all in advance for taking the time. This is my 1st post here and I don't consider myself anything but a newbie anxious to learn from the best and see how this turns out.
Take care!
-D
Consider using two derived tables or saved aggregate queries: one that groups on Item_Store and another that leaves Item_Store out of the grouping in order to sum the revenue of all stores. All other groupings (FY, FW, Rept_Chnl, BU_NM, Order_Store) remain in both and are used to join the two. Then, in the outer query, calculate the percentage ratio.
SELECT i.*, CDbl(i.Store_Revenue / a.Store_Revenue) As Ratios
FROM
(SELECT t.FY, t.FW, t.Rept_Chnl, t.BU_NM, t.Order_Store, t.Item_Store,
SUM(t.Revenue) As Store_Revenue
FROM tbl_BUChannelReporting t
WHERE t.FY = '2017' AND t.FW = 5
GROUP BY t.FY, t.FW, t.Rept_Chnl, t.BU_NM, t.Order_Store, t.Item_Store) As i
INNER JOIN
(SELECT t.FY, t.FW, t.Rept_Chnl, t.BU_NM, t.Order_Store,
SUM(t.Revenue) As Store_Revenue
FROM tbl_BUChannelReporting t
WHERE t.FY = '2017' AND t.FW = 5
GROUP BY t.FY, t.FW, t.Rept_Chnl, t.BU_NM, t.Order_Store) As a
ON i.FY = a.FY AND i.FW = a.FW AND i.Rept_Chnl = a.Rept_Chnl
AND i.BU_NM = a.BU_NM AND i.Order_Store = a.Order_Store
Or save each above SELECT statement as its own query and reference both below:
SELECT i.*, (i.Store_Revenue / a.Store_Revenue) As Ratios
FROM
Indiv_Item_StoreAggQ As i
INNER JOIN
All_Item_StoreAggQ As a
ON i.FY = a.FY AND i.FW = a.FW AND i.Rept_Chnl = a.Rept_Chnl
AND i.BU_NM = a.BU_NM AND i.Order_Store = a.Order_Store
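The arithmetic behind both queries is "per-store total divided by all-store total within the same group". A minimal Python sketch using the week 5 dummy numbers from the question is shown here as a sanity check; the function name is hypothetical and the zero-total guard is an added assumption, not something the Access queries do.

```python
# Week 5 dummy numbers from the question.
week5_sales = {"sears": 7000, "kmart": 2500, "mktpl": 2000}


def percent_of_total(sales_by_store):
    """Divide each store's revenue by the revenue of all stores combined."""
    total = sum(sales_by_store.values())
    if total == 0:
        # Assumption: report 0.0 rather than dividing by zero.
        return {store: 0.0 for store in sales_by_store}
    return {store: amount / total for store, amount in sales_by_store.items()}


ratios = percent_of_total(week5_sales)
rounded = {store: round(ratio * 100) for store, ratio in ratios.items()}
print(rounded)
```

Rounded to whole percentages, this reproduces the ratios quoted in the question for Sears, kmart and mktpl.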