How to summarize data by month? - kql

I am trying to summarize my data monthly.
Using something like
`
bin_at(TimeGenerated, 30d,datetime(2022-01-01 00:00:00))
`
does give me data at 30-day intervals, but it does not account for months of different lengths. For example, it does not handle the fact that January has 31 days but February has only 28.
I read the documentation but I found nothing I could use.
This is what I have tried and if you're aware of anything that might help me, please comment.

Use startofmonth(), which returns the start of the calendar month that contains the given datetime:
startofmonth(TimeGenerated)
Demo:
range i from 1 to 1000 step 1
| extend TimeGenerated = ago(100d *rand())
| summarize count() by startofmonth = startofmonth(TimeGenerated)
startofmonth | count_
------------------------------
2022-08-01T00:00:00Z | 134
2022-09-01T00:00:00Z | 312
2022-10-01T00:00:00Z | 310
2022-11-01T00:00:00Z | 244
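The idea behind grouping by the start of the month can also be illustrated outside KQL. The following is a minimal Python sketch, not KQL: the helper names start_of_month and count_by_month and the sample timestamps are made up for illustration. It truncates each timestamp to the first instant of its calendar month and counts rows per month, so January (31 days) and February (28 days) each get their own bucket regardless of length.

```python
from collections import Counter
from datetime import datetime, timezone


def start_of_month(ts: datetime) -> datetime:
    """Truncate a timestamp to midnight on the first day of its month."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def count_by_month(timestamps):
    """Count timestamps per calendar month, keyed by the month start."""
    counts = Counter(start_of_month(ts) for ts in timestamps)
    return dict(sorted(counts.items()))


# Hypothetical sample data (not from the original question).
sample = [
    datetime(2022, 1, 31, 23, 59, tzinfo=timezone.utc),
    datetime(2022, 2, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2022, 2, 28, 12, 0, tzinfo=timezone.utc),
    datetime(2022, 3, 1, 8, 30, tzinfo=timezone.utc),
]

monthly_counts = count_by_month(sample)
for month_start, n in monthly_counts.items():
    print(month_start.isoformat(), n)
```

A fixed 30-day bin anchored at 2022-01-01 would instead put 2022-01-31 in the second bin, together with most of February, which is the irregularity described in the question.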

Related

I need to generate a loan repayment schedule per date from a summarized loan table in Databricks using spark SQL

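I have the following table:
LoanID | Event_count | StartDateTime | EndDateTime | Frequency | Amount
--------------------------------------------------------------------------------------
12 | 3 | 2020-09-01T00:00:00Z | 2020-12-01T00:00:00Z | Monthly | 120
99 | 4 | 2021-01-01T00:00:00Z | 2021-10-01T00:00:00Z | Quarterly | 50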
Column definitions
Event_count is the number of times a repayment is made.
StartDateTime is the time of the first payment.
EndDateTime is the date of the last payment.
Frequency is the interval of payment.
Amount is the sum that is paid back each time.
How do I transform this to the format below? (without using loops as they are not supported by Databricks Spark SQL)
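Date | LoanID | RepaymentAmount | RepaymentNumber
---------------------------------------------------
2020-09-01 | 12 | 120 | 1
2020-10-01 | 12 | 120 | 2
2020-11-01 | 12 | 120 | 3
2021-01-01 | 99 | 50 | 1
2021-04-01 | 99 | 50 | 2
2021-07-01 | 99 | 50 | 3
2021-10-01 | 99 | 50 | 4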

As of today, Databricks SQL is offered primarily as a federated query engine for most use cases, not as a tool for transformation work. That is why Spark SQL supports only basic SQL queries, and procedural constructs (such as for and while loops) are not supported. Adding them would be a new feature request and may be implemented in the future. For now, you should handle this in PySpark or with set-based Spark SQL.

You should be able to use the range function with a CROSS JOIN, a simple example:
%sql
SELECT *
FROM tmp t
CROSS JOIN range(1, 99) r
WHERE r.id <= t.Event_count
Just a word of caution on the event count: the second argument to range is an exclusive upper bound, so it must be greater than the maximum number of events you can have. I've just guessed at 99 here. Cross joins can also be slow, so please test with your data.
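To make the row expansion concrete, here is a minimal sketch in plain Python rather than Spark SQL. It mirrors what the CROSS JOIN with range achieves: one output row per repayment number from 1 to Event_count, with the date shifted by (RepaymentNumber - 1) periods. The names MONTHS_PER_PERIOD, add_months and repayment_schedule are illustrative, and the mapping of Frequency to a number of months is an assumption based on the two values in the sample table. In Spark SQL itself the same offset could presumably be applied with the built-in add_months function on top of the cross join.

```python
import calendar
from datetime import date

# Assumed mapping from the Frequency column to a step in months.
MONTHS_PER_PERIOD = {"Monthly": 1, "Quarterly": 3}


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return date(year, month0 + 1, day)


def repayment_schedule(loans):
    """Expand each loan into one row per repayment.

    Each loan is (LoanID, Event_count, start date, Frequency, Amount);
    EndDateTime is not needed because Event_count fixes the number of rows.
    """
    rows = []
    for loan_id, event_count, start, frequency, amount in loans:
        step = MONTHS_PER_PERIOD[frequency]
        for number in range(1, event_count + 1):
            rows.append((add_months(start, (number - 1) * step), loan_id, amount, number))
    return rows


# The two loans from the question's sample table.
loans = [
    (12, 3, date(2020, 9, 1), "Monthly", 120),
    (99, 4, date(2021, 1, 1), "Quarterly", 50),
]

schedule = repayment_schedule(loans)
for row in schedule:
    print(row)
```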

postgreSQL 1 minute average of values

I need to query the average value per minute from my PostgreSQL database.
The timestamps are stored in milliseconds, and the data looks like this:
timestamp | value
------------------
1528029265001 | 123
1528029265020 | 232
1528029265025 | 332
1528029265029 | 511
... | ...
1528029265176 | 231
I tried:
SELECT
avg(value),
extract(minutes FROM to_timestamp(timestamp/1000)) as one_min
FROM table GROUP BY one_min
But it seems to be stuck in querying.
I'm sure there is a super easy way to do it.
Any suggestions?

I am guessing that you want:
SELECT floor(timestamp/(60 * 1000)) as timestamp_minute,
avg(value)
FROM table
GROUP BY timestamp_minute;
However, if your problem is performance, this will have the same performance issues. For that, you would want a where clause that limits the amount of data being processed.
Because the data is not being collected at even intervals, you might want the simple average of the first and last values, or something like that.
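The bucketing used in the query above can be checked by hand with a small Python sketch. This is an illustration only, not PostgreSQL: the function name average_per_minute is made up, and the last sample row is a hypothetical value added so that a second minute appears. Integer division of the millisecond timestamp by 60 * 1000 gives the minute bucket, exactly as floor(timestamp/(60 * 1000)) does.

```python
from collections import defaultdict

MS_PER_MINUTE = 60 * 1000


def average_per_minute(rows):
    """Average the values of (timestamp_ms, value) rows per one-minute bucket.

    The result is keyed by the bucket's starting timestamp in milliseconds.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [sum, count]
    for ts_ms, value in rows:
        bucket = ts_ms // MS_PER_MINUTE
        totals[bucket][0] += value
        totals[bucket][1] += 1
    return {
        bucket * MS_PER_MINUTE: total / count
        for bucket, (total, count) in sorted(totals.items())
    }


rows = [
    (1528029265001, 123),
    (1528029265020, 232),
    (1528029265025, 332),
    (1528029265029, 511),
    (1528029325000, 100),  # hypothetical row one minute later
]

averages = average_per_minute(rows)
print(averages)
```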

Select two rows that are closest to a given value

I have the below table in MS Access.
Day ABC
365 25
548 35
730 37
913 58
1095 146
I want to query it so that I get the rows immediately before and after a given value of the Day column. The value is variable, e.g. value = 432.
For this example, the query would return the following table.
Day ABC
365 25
548 35
Because the given value = 432 is larger than the Day value of 365 and smaller than the Day value of 548.
What I managed to do is get one field, but not both. The following query gave me correct rows of the Day field.
Select Max(Day) As Day From Table Where Day < 432
UNION
Select Min(Day) As Day From Table Where Day > 432
When I use this code and add another field like ABC I get an error.
You tried to execute a query that does not include the specified
expression 'ABC' as part of an aggregate function.
Could you help me? I think this should be a really easy task. Thank you!

There are a couple of problems with your SQL. The error message occurs because ABC is neither aggregated nor listed in a GROUP BY clause in the two SELECT statements.
You'll also get a circular reference because of Max(Day) As Day; change the alias to something else, such as lDay.
You'll probably get a syntax error (I did) from calling the table Table, which I'm pretty sure is a reserved word.
Maybe this query will work better:
Select Top 1 Day, ABC From Table3 Where Day <= 913 ORDER BY Day Desc
UNION ALL
Select Top 1 Day, ABC From Table3 Where Day >= 913 ORDER BY Day Asc

You could also be a little fancier:
Select Top 2 Day, ABC From Table3 Order By Abs(Day - 432) Asc
Note that this returns the two rows closest to the given value, which matches the title of the question but not necessarily your detailed explanation (one row before and one row after).
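The two readings of the question (one row on each side of the value, versus the two rows nearest to it) can be compared with a short Python sketch. This is plain Python rather than Access SQL, and the function names neighbours and two_closest are made up for illustration; the rows are the ones from the question.

```python
def neighbours(rows, value):
    """Return the row just below and the row just above value (None if absent)."""
    below = [r for r in rows if r[0] < value]
    above = [r for r in rows if r[0] > value]
    before = max(below, key=lambda r: r[0], default=None)
    after = min(above, key=lambda r: r[0], default=None)
    return before, after


def two_closest(rows, value):
    """Return the two rows whose Day is nearest to value, nearest first."""
    return sorted(rows, key=lambda r: abs(r[0] - value))[:2]


# (Day, ABC) rows from the question.
rows = [(365, 25), (548, 35), (730, 37), (913, 58), (1095, 146)]

print(neighbours(rows, 432))
print(two_closest(rows, 432))
print(neighbours(rows, 1200))
print(two_closest(rows, 1200))
```

For 432 both approaches pick the rows with Day 365 and 548. For a value past the end of the table, such as 1200, they differ: there is no row after it, while the two closest rows are 1095 and 913.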

Find median of a list of values in Access 2010 - SQL or VBA

I have a list of time-intervals (in seconds) between consecutive datetime-stamped records in a dataset in Access 2010. I want to find the median time interval for each Animal on each Date.
Please can someone tell me how to go about this - either in SQL or VBA?
Example data:
Animal Date Time_interval
1 18/07/14 1
1 18/07/14 18
1 18/07/14 100
1 18/07/14 121
1 18/07/14 156
2 18/07/14 14
2 18/07/14 35
(I also have a field for Time, not included here to keep things simple)
Thanks very much!!

You could run a query to compare the two date/time entries using the DateDiff function.
Here is the setup for DateDiff:
DateDiff ( interval, date1, date2, [firstdayofweek], [firstweekofyear])
From what I understand, create a new query and add a field like this:
median_time_interval: DateDiff ("s", Date, Time)
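For reference, the per-group median that the question asks for can be sketched in Python. This is not Access SQL or VBA, only an illustration of the calculation: the function name median_by_group is made up, and the rows are the example data from the question. The median is taken with the standard library's statistics.median, which averages the two middle values when a group has an even number of rows.

```python
from collections import defaultdict
from statistics import median


def median_by_group(rows):
    """Return the median Time_interval for each (Animal, Date) pair."""
    groups = defaultdict(list)
    for animal, day, interval in rows:
        groups[(animal, day)].append(interval)
    return {key: median(values) for key, values in groups.items()}


# (Animal, Date, Time_interval) rows from the question.
rows = [
    (1, "18/07/14", 1),
    (1, "18/07/14", 18),
    (1, "18/07/14", 100),
    (1, "18/07/14", 121),
    (1, "18/07/14", 156),
    (2, "18/07/14", 14),
    (2, "18/07/14", 35),
]

medians = median_by_group(rows)
print(medians)
```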

Using iif to mimic CASE for days of week

I've hit a little snag with one of my queries. I'm throwing together a simple chart to plot a number of reports being submitted by day of week.
My query to start was :
SELECT Weekday(incidentdate) AS dayOfWeek
, Count(*) AS NumberOfIncidents
FROM Incident
GROUP BY Weekday(incidentdate);
This works fine and returns what I want, something like
1 200
2 323
3 32
4 322
5 272
6 282
7 190
The problem is, I want the number returned by the weekday function to read as the corresponding day of the week, like case when 1 then 'sunday' and so forth. Since Access doesn't have the SQL Server equivalent that returns the weekday as a word, I have to work around it.
Problem is, it's not coming out the way I want. So I wrote it using iif since I can't use CASE. The problem is, since each iif statement is treated like a column selection (the way I'm writing it), my data comes out unusable, like this
SELECT
iif(weekday(incidentdate) =1,'Sunday'),
iif(weekday(incidentdate) =2,'Monday')
'so forth
, Count(*) AS NumberOfIncidents
FROM tblIncident
GROUP BY Weekday(incidentdate);
Expr1000  Expr1001  count
Sunday              20
          Monday    106
                    120
                    186
                    182
                    164
                    24
Of course, I want my weekdays to be in the same column as the original query. Halp pls

Use the WeekdayName() function.
SELECT
WeekdayName(Weekday(incidentdate)) AS dayOfWeek,
Count(*) AS NumberOfIncidents
FROM Incident
GROUP BY WeekdayName(Weekday(incidentdate));

As BWS Suggested, Switch was what I wanted. Here's what I ended up writing
SELECT
switch(
Weekday(incidentdate) = 1, 'Sunday',
Weekday(incidentdate) = 2,'Monday',
Weekday(incidentdate) = 3,'Tuesday',
Weekday(incidentdate) = 4,'Wednesday',
Weekday(incidentdate) = 5,'Thursday',
Weekday(incidentdate) = 6,'Friday',
Weekday(incidentdate) = 7,'Saturday'
) as DayOfWeek
, Count(*) AS NumberOfIncidents
FROM tblIncident
GROUP BY Weekday(incidentdate);
Posting this here so there's actual code for future readers
Edit: WeekdayName(weekday(yourdate)), as HansUp said, is probably a little easier :)

check this previous post:
What is the equivalent of Select Case in Access SQL?

Why not just create a 7 row table with day number & day name then just join to it?
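The lookup-table idea can be sketched in Python, with a dictionary standing in for the 7-row table. This is an illustration rather than Access SQL: the names DAY_NAMES, access_weekday and incidents_by_day and the sample dates are made up, and the numbering assumes the Weekday() default in which 1 is Sunday and 7 is Saturday.

```python
from collections import Counter
from datetime import date

# Stand-in for the 7-row lookup table: day number -> day name (1 = Sunday).
DAY_NAMES = {
    1: "Sunday", 2: "Monday", 3: "Tuesday", 4: "Wednesday",
    5: "Thursday", 6: "Friday", 7: "Saturday",
}


def access_weekday(d: date) -> int:
    """Convert Python's isoweekday (Monday=1..Sunday=7) to Sunday=1..Saturday=7."""
    return d.isoweekday() % 7 + 1


def incidents_by_day(dates):
    """Count incidents per weekday and return (name, count) pairs in day order."""
    counts = Counter(access_weekday(d) for d in dates)
    return [(DAY_NAMES[number], counts[number]) for number in sorted(counts)]


# Hypothetical incident dates: a Sunday, two Mondays, and a Saturday.
incident_dates = [
    date(2024, 1, 7),
    date(2024, 1, 8),
    date(2024, 1, 8),
    date(2024, 1, 13),
]

summary = incidents_by_day(incident_dates)
print(summary)
```

Grouping on the day number and only then mapping to the name keeps the rows in weekday order, which grouping on the name alone would not.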

We Keep Coding
