Cumulative SUM in a query (MS Access SQL)

Using MS Access SQL, I have a query (actually a UNION of several queries) and need a cumulative sum over it (it is a statement of account, so the items are in chronological order).
How do I get a cumulative sum?
Since several rows share the same date, I would have to add a new ID to tell them apart; however, SQL in MS Access does not seem to have ROW_ID or anything similar.

So, we need to sort donation data into chronological order across multiple tables, with duplicate dates. First, combine all the donator tables in one UNION query, which keeps the syntax as simple as possible. Then, to put things in order, we need a rule for ordering rows that share a date. The dataset offers two natural tie-breakers for duplicate dates: the donator and the amount. For instance, we could decide that, within a date, bigger donations come first. If the rule is complicated enough, we can move it into a public function in a code module and call that function from the query so that we can sort by it:
'Sorted Donations:'
SELECT (BestDonator(q.donator)) as BestDonator, *
FROM tblCountries as q
UNION SELECT (BestDonator(j.donator)) as BestDonator, *
FROM tblIndividuals as j
ORDER BY EvDate Asc, Amount DESC , BestDonator DESC;
'Module code:'
Public Function BestDonator(donator As String) As Long
BestDonator = Len(donator) 'longer names are better :)'
End Function
With Sorted Donations we have settled on an order for the duplicate dates and have combined the individual donations and the country donations, so now we can calculate the running sum directly, using either DSum or a correlated subquery. There is no need to calculate a row ID. The tricky part is getting the syntax right. I ended up moving the running sum calculation into a function and omitting BestDonator, because I couldn't easily put this query together in the query designer and I ran out of time to debug it:
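Public Function RunningSum(EvDate As Date, Amount As Currency) As Currency
'Build a US-style date literal so the criteria do not depend on regional settings'
Dim d As String
d = Format(EvDate, "\#mm\/dd\/yyyy\#")
RunningSum = DSum("Amount", "[Sorted Donations]", "(EvDate < " & d & ") OR (EvDate = " & d & " AND Amount >= " & Amount & ")")
End Function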
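Carefully note the OR in the DSum criteria of the RunningSum calculation. This is the tricky part of summing the right amounts: a row counts toward the running sum if it falls on an earlier date, or on the same date with an amount at least as large, which mirrors the sort order EvDate ASC, Amount DESC.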
| donator | EvDate | Amount | RunningSum |
| Reiny | 1/10/2020 | 321 | 321 |
| Czechia | 3/1/2020 | 7455 | 7776 |
| Germany | 3/18/2020 | 4222 | 11998 |
| Jim | 3/18/2020 | 222 | 12220 |
| Australien | 4/15/2020 | 13423 | 25643 |
| Mike | 5/31/2020 | 345 | 25988 |
| Portugal | 6/6/2020 | 8755 | 34743 |
| Slovakia | 8/31/2020 | 3455 | 38198 |
| Steve | 9/6/2020 | 875 | 39073 |
| Japan | 10/10/2020 | 5234 | 44307 |
| John | 10/11/2020 | 465 | 44772 |
| Slowenia | 11/11/2020 | 4665 | 49437 |
| Spain | 11/22/2020 | 7677 | 57114 |
| Austria | 11/22/2020 | 3221 | 60335 |
| Bill | 11/22/2020 | 767 | 61102 |
| Bert | 12/1/2020 | 755 | 61857 |
| Hungaria | 12/24/2020 | 9996 | 71853 |
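The same running sum can be sketched with a correlated subquery instead of DSum. The block below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module rather than Access, a hypothetical table named donations, ISO-formatted dates, and just the first four rows of the table above.

```python
import sqlite3

# First four donations from the table above, deliberately inserted out of order.
# Dates are ISO strings so that plain text comparison sorts chronologically.
rows = [
    ("Czechia", "2020-03-01", 7455),
    ("Germany", "2020-03-18", 4222),
    ("Jim", "2020-03-18", 222),
    ("Reiny", "2020-01-10", 321),
]

conn = sqlite3.connect(":memory:")
# "donations" is a hypothetical stand-in for the Sorted Donations query.
conn.execute("CREATE TABLE donations (donator TEXT, EvDate TEXT, Amount INTEGER)")
conn.executemany("INSERT INTO donations VALUES (?, ?, ?)", rows)

# For each row, sum every row that sorts at or before it under the order
# EvDate ASC, Amount DESC. This is the same OR condition as in the DSum criteria.
query = """
SELECT d.donator, d.EvDate, d.Amount,
       (SELECT SUM(p.Amount)
          FROM donations AS p
         WHERE p.EvDate < d.EvDate
            OR (p.EvDate = d.EvDate AND p.Amount >= d.Amount)) AS RunningSum
  FROM donations AS d
 ORDER BY d.EvDate ASC, d.Amount DESC
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

If two rows share both the date and the amount, this condition counts each of them for the other, so a further tie-breaker (such as BestDonator) would be needed in that case.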


Get similar employees based on their attribute values

Consider the following sample table ("Customer") with these records:
| customer-id | att-a | att-b | att-c | att-d | att-e | att-f | att-g | att-h | att-i | att-j |
| customer-1 | att-a-7 | att-b-3 | att-c-10 | att-d-10 | att-e-15 | att-f-11 | att-g-2 | att-h-7 | att-i-5 | att-j-14 |
| customer-2 | att-a-9 | att-b-7 | att-c-12 | att-d-4 | att-e-10 | att-f-4 | att-g-13 | att-h-4 | att-i-1 | att-j-13 |
| customer-3 | att-a-10 | att-b-6 | att-c-1 | att-d-1 | att-e-13 | att-f-12 | att-g-9 | att-h-6 | att-i-7 | att-j-4 |
| customer-19 | att-a-7 | att-b-9 | att-c-13 | att-d-5 | att-e-8 | att-f-5 | att-g-12 | att-h-14 | att-i-13 | att-j-15 |
I have these records, and many more, loaded into a SQL database, and I want to find the top 10 most similar customers based on their attribute values. For example, customer-1 and customer-19 have at least one matching column value, i.e. "att-a-7", so the output should give me those 2 customer IDs as the top similar customers: customer-1 and customer-19.
P.S. Any number of columns, one or more, can match between rows.
I'm using a window function to find the top 10 similar customers, and I'm not sure whether this is correct.
This is the approach I used in my query:
row_number() over (partition by att-a, att-b,..,att-j order by customer-id) as customers
Is this correct?
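One thing to keep in mind is that PARTITION BY over all attribute columns only groups rows that are identical in every attribute, so it cannot find customers that share just one value. A possible alternative, sketched here as an assumption rather than a definitive answer, is a self-join that counts matching columns per pair of customers. The sketch uses SQLite through Python, hypothetical underscore column names (customer_id, att_a, att_b, att_c) and only the first three attributes of four rows from the sample table.

```python
import sqlite3

# First three attributes of four sample rows; column names are hypothetical.
rows = [
    ("customer-1", "att-a-7", "att-b-3", "att-c-10"),
    ("customer-2", "att-a-9", "att-b-7", "att-c-12"),
    ("customer-3", "att-a-10", "att-b-6", "att-c-1"),
    ("customer-19", "att-a-7", "att-b-9", "att-c-13"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id TEXT, att_a TEXT, att_b TEXT, att_c TEXT)"
)
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", rows)

# Pair every customer with every other customer once (a.customer_id < b.customer_id)
# and count the columns on which the pair agrees. In SQLite a comparison
# evaluates to 1 or 0, so the comparisons can simply be added up.
query = """
SELECT first_id, second_id, matches
  FROM (SELECT a.customer_id AS first_id,
               b.customer_id AS second_id,
               (a.att_a = b.att_a) + (a.att_b = b.att_b) + (a.att_c = b.att_c) AS matches
          FROM customer AS a
          JOIN customer AS b ON a.customer_id < b.customer_id) AS scored
 WHERE matches > 0
 ORDER BY matches DESC, first_id, second_id
 LIMIT 10
"""
pairs = conn.execute(query).fetchall()
print(pairs)
```

On engines where a comparison cannot be added directly, each term would need to be written as a CASE expression instead.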

How to optimize a nested inner Hive query

I have a table with the following stock data, with columns such as date, ticker, open and close (stock prices).
I want to know on which date each stock gave its highest margin. So if I have 516 different stocks, my query should return 516 rows of ticker, date, open, close and a new column Margin (which will be max(close-open)).
| deep_stocks.date_ | deep_stocks.ticker | deep_stocks.open | deep_stocks.close |
| 20100721 | A | 27.68 | 27.58 |
| 20100722 | A | 27.95 | 28.72 |
| 20100723 | A | 28.56 | 29.3 |
| 20100726 | A | 29.22 | 29.64 |
| 20100727 | A | 29.73 | 28.87 |
| 20100728 | A | 28.79 | 28.78 |
| 20100729 | A | 28.97 | 28.15 |
| 20100730 | A | 27.78 | 27.93 |
| 20100802 | A | 28.35 | 28.82 |
| 20100803 | A | 28.7 | 27.84 |
I have written a query where my approach was:
Step 1 - Get the difference between Close and Open prices (Inner/Sub query)
Step 2 - Get the maximum of margin for every stock (used group by with max function)
Step 3 - Join the results with Main Table and get the data.
I'll put my query below. Can someone please correct it, as it is taking a long time to run? I would also like to know whether there is an alternative approach.
As mentioned, this is the query for my approach:
SELECT ds.ticker, ds.date_, ds.open, ds.close, ds.Margin FROM
(SELECT ticker, date_, close, open, case(close-open)>0 when true then round(close-open,2) else 0 end as Margin FROM DataStocks) ds
JOIN (SELECT dsIn.ticker, max(dsIn.Margin) mxMargin FROM
(select ticker, case(close-open)>0 when true then round(close-open,2) else 0 end as Margin FROM DataStocks ) dsIn group by dsIn.ticker) dsEx
ON ds.ticker=dsEx.ticker AND ds.Margin=dsEx.mxMargin ORDER BY ds.Margin;
Are there any alternatives to this query, or is it possible to optimize it?
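One alternative that is sometimes suggested for this kind of "best row per group" problem is a window function, which avoids joining the table back to an aggregated copy of itself. The sketch below is an assumption-laden illustration, not a tuned Hive query: it runs on SQLite through Python (a SQLite version with window function support is assumed), uses a hypothetical table stocks with columns open_price and close_price, takes four of the ticker A rows from the sample, and adds two invented rows for a hypothetical ticker B.

```python
import sqlite3

rows = [
    ("A", 20100721, 27.68, 27.58),
    ("A", 20100722, 27.95, 28.72),
    ("A", 20100723, 28.56, 29.30),
    ("A", 20100726, 29.22, 29.64),
    ("B", 20100721, 10.00, 10.50),  # hypothetical row, not from the sample data
    ("B", 20100722, 10.50, 12.00),  # hypothetical row, not from the sample data
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stocks (ticker TEXT, date_ INTEGER, open_price REAL, close_price REAL)"
)
conn.executemany("INSERT INTO stocks VALUES (?, ?, ?, ?)", rows)

# Innermost query: margin per row (negative margins are clamped to 0, as in the
# original query). Middle query: rank the rows of each ticker by margin.
# Outer query: keep the top-ranked row(s) per ticker; RANK keeps ties.
query = """
SELECT ticker, date_, open_price, close_price, margin
  FROM (SELECT m.*,
               RANK() OVER (PARTITION BY ticker ORDER BY margin DESC) AS rnk
          FROM (SELECT ticker, date_, open_price, close_price,
                       CASE WHEN close_price > open_price
                            THEN ROUND(close_price - open_price, 2)
                            ELSE 0 END AS margin
                  FROM stocks) AS m) AS ranked
 WHERE rnk = 1
 ORDER BY ticker
"""
best = conn.execute(query).fetchall()
for row in best:
    print(row)
```

Hive also provides RANK() OVER (PARTITION BY ... ORDER BY ...), so the same shape should carry over, but whether it is actually faster on a given cluster would need to be measured.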

SQL Query for Pivoting data in MS-Access

I have a table with three fields: ticketNumber, attendee and tableNumber.
ticketNumber | attendee | tableNumber
------------ | -------- | -----------
A1 | alex | 3
A2 | bret | 2
A3 | chip | 1
A4 | dale | 2
A5 | eric | 2
A6 | finn | 3
I'd like to generate a table with each tableNumber as a field and the list of names sitting at that tableNumber.
1 | 2 | 3
chip | bret | alex
| dale | finn
| eric |
The query I've been working on is:
transform attendee
select attendee
from registration
group by tableNumber, name
pivot tableNumber
This gives me:
attendee | 1 | 2 | 3
chip | chip | |
bret | | bret |
dale | | dale |
eric | | eric |
alex | | | alex
finn | | | finn
I know how to get the table I require using PivotTableView, but I'd like to know how to do it with a query so that I can use it in some code I'm working on. I'm unsure how to write this query without having to select a field (in my case, select attendee). Also, is it possible to generate the table without the empty cells?
Thank you :)
The table needs a unique identifier field that sorts properly, and I don't think ticketNumber can be relied on for that. An autonumber field (called ID below, with the table named Table1) should serve. Then try:
TRANSFORM First(Table1.attendee) AS FirstOfattendee
SELECT DCount("*","Table1","tableNumber=" & [tableNumber] & " AND ID<" & [ID])+1 AS RowSeq
FROM Table1
GROUP BY DCount("*","Table1","tableNumber=" & [tableNumber] & " AND ID<" & [ID])+1
PIVOT Table1.tableNumber;
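To see why the row sequence removes the empty cells, here is a small sketch of the same idea outside Access. It is an illustration under assumptions: SQLite through Python has no TRANSFORM/PIVOT, so the pivot is emulated with conditional aggregation over a fixed list of table numbers, the DCount call becomes a correlated COUNT(*) subquery, and the ID column is a hypothetical autonumber assigned in ticket order.

```python
import sqlite3

# Sample registration data with a hypothetical autonumber ID.
rows = [
    (1, "A1", "alex", 3),
    (2, "A2", "bret", 2),
    (3, "A3", "chip", 1),
    (4, "A4", "dale", 2),
    (5, "A5", "eric", 2),
    (6, "A6", "finn", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registration "
    "(ID INTEGER, ticketNumber TEXT, attendee TEXT, tableNumber INTEGER)"
)
conn.executemany("INSERT INTO registration VALUES (?, ?, ?, ?)", rows)

# RowSeq = position of the attendee within their own table (1, 2, 3, ...),
# i.e. the number of earlier IDs at the same table plus one.
# Grouping by RowSeq stacks the names of each table into consecutive rows.
query = """
SELECT RowSeq,
       MAX(CASE WHEN tableNumber = 1 THEN attendee END) AS table1,
       MAX(CASE WHEN tableNumber = 2 THEN attendee END) AS table2,
       MAX(CASE WHEN tableNumber = 3 THEN attendee END) AS table3
  FROM (SELECT r.attendee, r.tableNumber,
               (SELECT COUNT(*)
                  FROM registration AS p
                 WHERE p.tableNumber = r.tableNumber
                   AND p.ID < r.ID) + 1 AS RowSeq
          FROM registration AS r) AS seq
 GROUP BY RowSeq
 ORDER BY RowSeq
"""
seating = conn.execute(query).fetchall()
for row in seating:
    print(row)
```

Each output row holds the first, second, third (and so on) attendee of every table, so the only empty cells left are at the bottom of the shorter columns.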

How to define a sub query inside SQL statement to be used several times as a table alias?

I have an MS Access database for rainfall data of several climate stations.
For each day of each station, I want to calculate the rainfall in the previous day (if recorded), and the sum of the rainfall at the previous 3 and 7 days.
Due to the huge amount of data and the limitations of Access, I made a query that processes one station at a time. I first applied an auxiliary query to compute the dates. For each station, the following SQL statement is applied (saved as the RainFallStudy query):
SELECT [173].ID, [173].AirportCode, [173].RFmm,
DateSerial([rYear], [rMonth], [rDay]) AS DateSer,
[DateSer]-1 AS DM1,
[DateSer]-2 AS DM2,
[DateSer]-3 AS DM3,
[DateSer]-4 AS DM4,
[DateSer]-5 AS DM5,
[DateSer]-6 AS DM6,
[DateSer]-7 AS DM7
FROM [173]
WHERE ((([173].AirportCode) = 786660));
I used DM1, DM2, etc. as the date serials of day-1, day-2, etc.
Then I used another query that joins the RainFallStudy query to itself with left joins, as shown in the figure:
The SQL statement is:
SELECT RainFallStudy.ID, RainFallStudy.AirportCode,
RainFallStudy.RFmm AS RF0, RainFallStudy.DateSer,
RainFallStudy.DM1, RainFallStudy_1.RFmm AS RF1,
RainFallStudy_2.RFmm AS RF2, RainFallStudy_3.RFmm AS RF3,
RainFallStudy_4.RFmm AS RF4, RainFallStudy_5.RFmm AS RF5,
RainFallStudy_6.RFmm AS RF6, RainFallStudy_7.RFmm AS RF7,
Nz([rf1], 0) + Nz([rf2], 0) + Nz([rf3], 0) + Nz([rf4], 0) + Nz([rf5], 0) + Nz([rf6], 0) + Nz([rf7], 0) AS RF_W
FROM ((((((RainFallStudy
LEFT JOIN RainFallStudy AS RainFallStudy_1 ON RainFallStudy.DM1 = RainFallStudy_1.DateSer)
LEFT JOIN RainFallStudy AS RainFallStudy_2 ON RainFallStudy.DM2 = RainFallStudy_2.DateSer)
LEFT JOIN RainFallStudy AS RainFallStudy_3 ON RainFallStudy.DM3 = RainFallStudy_3.DateSer)
LEFT JOIN RainFallStudy AS RainFallStudy_4 ON RainFallStudy.DM4 = RainFallStudy_4.DateSer)
LEFT JOIN RainFallStudy AS RainFallStudy_5 ON RainFallStudy.DM5 = RainFallStudy_5.DateSer)
LEFT JOIN RainFallStudy AS RainFallStudy_6 ON RainFallStudy.DM6 = RainFallStudy_6.DateSer)
LEFT JOIN RainFallStudy AS RainFallStudy_7 ON RainFallStudy.DM7 = RainFallStudy_7.DateSer;
Now I suffer from the slow performance of this query, as each station has between 1,000 and 750,000 records! Is there a better way to get what I need with a faster SQL statement? Second question: can I write this as a standalone SQL statement (one query without the auxiliary query)? I will use it in Python, which requires a single SQL statement (as far as I know).
Thanks in advance.
As requested by @Andre, here is some sample data from table [173] in HTML
And here is sample output (HTML)
I created an additional column rDate (DateTime) and filled it with this query:
UPDATE Rainfall SET Rainfall.rDate = DateSerial([rYear],[rMonth],[rDay]);
Then your desired result can be achieved with several subqueries, using SUM() for the last two columns:
SELECT r.ID, r.AirportCode, r.rDate, r.RFmm,
(SELECT RFmm FROM Rainfall r1 WHERE r1.AirportCode = r.AirportCode AND r1.rDate = r.rDate-1) AS Yesterday,
(SELECT SUM(RFmm) FROM Rainfall r3 WHERE r3.AirportCode = r.AirportCode AND r3.rDate BETWEEN r.rDate-3 AND r.rDate-1) AS Prev3days,
(SELECT SUM(RFmm) FROM Rainfall r7 WHERE r7.AirportCode = r.AirportCode AND r7.rDate BETWEEN r.rDate-7 AND r.rDate-1) AS PrevWeek
FROM Rainfall r
Make sure AirportCode and rDate are indexed for larger numbers of records.
| ID | AirportCode | rDate | RFmm | Yesterday | Prev3days | PrevWeek |
| 11216 | 409040 | 23.01.2012 | 0,51 | | | |
| 11217 | 409040 | 24.01.2012 | 0 | 0,51 | 0,51 | 0,51 |
| 11218 | 409040 | 25.01.2012 | 0 | 0 | 0,51 | 0,51 |
| 11219 | 409040 | 26.01.2012 | 2,03 | 0 | 0,51 | 0,51 |
| 11220 | 409040 | 27.01.2012 | 0 | 2,03 | 2,03 | 2,54 |
| 11221 | 409040 | 28.01.2012 | 0 | 0 | 2,03 | 2,54 |
| 11222 | 409040 | 29.01.2012 | 0 | 0 | 2,03 | 2,54 |
| 11223 | 409040 | 30.01.2012 | 0 | 0 | 0 | 2,54 |
| 11224 | 409040 | 31.01.2012 | 0,25 | 0 | 0 | 2,03 |
| 11225 | 409040 | 01.02.2012 | 0 | 0,25 | 0,25 | 2,28 |
| 11226 | 409040 | 02.02.2012 | 0 | 0 | 0,25 | 2,28 |
| 11227 | 409040 | 03.02.2012 | 4,32 | 0 | 0,25 | 0,25 |
| 11228 | 409040 | 04.02.2012 | 13,21 | 4,32 | 4,32 | 4,57 |
| 11229 | 409040 | 05.02.2012 | 1,02 | 13,21 | 17,53 | 17,78 |
Use Nz() to avoid NULL values in the first row.
It appears that you store the date in separate fields (rYear, rMonth, rDay), so you need the DateSerial function to get a date. This means that, to use the date in a join or WHERE clause, Access must compute it for every row of the table and cannot use an index. Store the date in its own field and index it to avoid this calculation.
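The subquery approach with a stored, indexed date column can be tried out on a few rows of the sample output. This is a minimal sketch under assumptions: it uses SQLite through Python instead of Access, stores rDate as ISO text, replaces the date arithmetic rDate-1 with SQLite's date() modifiers, and the index name is made up.

```python
import sqlite3

# First five rows of the sample output (ID, AirportCode, rDate, RFmm).
rows = [
    (11216, 409040, "2012-01-23", 0.51),
    (11217, 409040, "2012-01-24", 0.0),
    (11218, 409040, "2012-01-25", 0.0),
    (11219, 409040, "2012-01-26", 2.03),
    (11220, 409040, "2012-01-27", 0.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rainfall (ID INTEGER, AirportCode INTEGER, rDate TEXT, RFmm REAL)"
)
conn.executemany("INSERT INTO rainfall VALUES (?, ?, ?, ?)", rows)
# Index on the columns every subquery filters by (hypothetical index name).
conn.execute("CREATE INDEX idx_station_date ON rainfall (AirportCode, rDate)")

# One correlated subquery per derived column; the date ranges end at the
# previous day, so the current day's rainfall is never included.
query = """
SELECT r.ID, r.rDate, r.RFmm,
       (SELECT p.RFmm FROM rainfall AS p
         WHERE p.AirportCode = r.AirportCode
           AND p.rDate = date(r.rDate, '-1 day')) AS Yesterday,
       (SELECT SUM(p.RFmm) FROM rainfall AS p
         WHERE p.AirportCode = r.AirportCode
           AND p.rDate BETWEEN date(r.rDate, '-3 days')
                           AND date(r.rDate, '-1 day')) AS Prev3days,
       (SELECT SUM(p.RFmm) FROM rainfall AS p
         WHERE p.AirportCode = r.AirportCode
           AND p.rDate BETWEEN date(r.rDate, '-7 days')
                           AND date(r.rDate, '-1 day')) AS PrevWeek
  FROM rainfall AS r
 ORDER BY r.AirportCode, r.rDate
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The first row has no earlier data, so its three derived columns come back as NULL (None in Python), which is the case Nz() covers in Access.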

Only Some Dates From SQL SELECT Being Set To "0" or "1969-12-31" -- UNIX_TIMESTAMP

So I have been doing pretty well on my project (Link to previous StackOverflow question), and have managed to learn quite a bit, but there is this one problem that has been really dogging me for days and I just can't seem to solve it.
It has to do with using the UNIX_TIMESTAMP call to convert dates in my SQL database to UNIX time-format, but for some reason only one set of dates in my table is giving me issues!
So these are the values I am getting -
#abridged here, see the results from the SELECT statement below to see the rest
#of the fields outputted
| firstVst | nextVst | DOB |
| 1206936000 | 1396238400 | 0 |
| 1313726400 | 1313726400 | 278395200 |
| 1318910400 | 1413604800 | 0 |
| 1319083200 | 1413777600 | 0 |
when I use this SELECT statement -
So my big question is: why in the heck are 3 out of 4 of my DOBs being set to 0 (i.e. 12/31/1969 on my PC)? Why is this not happening in my other fields?
I can see the data just fine using a simpler SELECT statement, and the DOB field looks fine...?
#formatting broken to change some variable names etc.
select * FROM people;
| ref | lastName | firstName | DOB | rN | lN | firstVst | disp | repName | nextVst |
| 10001 | BlankA | NameA | 1968-04-15 | 1000000 | 4600000 | 2008-03-31 | Positive | Patrick Smith | 2014-03-31 |
| 10002 | BlankB | NameB | 1978-10-28 | 1000001 | 4600001 | 2011-08-19 | Positive | Patrick Smith | 2011-08-19 |
| 10003 | BlankC | NameC | 1941-06-08 | 1000002 | 4600002 | 2011-10-18 | Positive | Patrick Smith | 2014-10-18 |
| 10004 | BlankD | NameD | 1952-08-01 | 1000003 | 4600003 | 2011-10-20 | Positive | Patrick Smith | 2014-10-20 |
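It's because those DOBs are from before 1 January 1970, which is where the UNIX epoch starts, so anything prior to that would be negative, and UNIX_TIMESTAMP gives you 0 instead, as your output shows. (A timestamp of 0 displays as 12/31/1969 on a PC whose time zone is behind UTC.)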
From Wikipedia:
Unix time, or POSIX time, is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, not counting leap seconds.
A bit more elaboration: Basically what you're trying to do isn't possible. Depending on what it's for, there may be a different way you can do this, but using UNIX timestamps probably isn't the best idea for dates like that.
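To make the sign issue concrete, here is a small Python sketch that computes signed seconds since the epoch for two of the DOBs above. It is only an illustration: the helper unix_seconds is made up for this example, and the UTC-4 server offset is an assumption, chosen because it reproduces the one non-zero DOB value (278395200) in the question's output.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_seconds(moment):
    """Signed number of seconds between the Unix epoch and an aware datetime."""
    return int((moment - EPOCH).total_seconds())


# Assumption: the database server interprets dates at UTC-4.
server_tz = timezone(timedelta(hours=-4))

# DOB of 10002 (after the epoch) and DOB of 10001 (before the epoch).
dob_1978 = unix_seconds(datetime(1978, 10, 28, tzinfo=server_tz))
dob_1968 = unix_seconds(datetime(1968, 4, 15, tzinfo=server_tz))

print(dob_1978)  # 278395200, the same value as in the question's output
print(dob_1968)  # a negative number, which a 0-or-positive timestamp cannot hold
```

Signed arithmetic on real date values handles pre-1970 dates without trouble, which is why keeping DOB as a date and converting it in application code may be a better fit than UNIX_TIMESTAMP.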