Split date_start and date_end by hours on Metabase - sql

I have a table with a column "Begin At" and another column "End At" that represent when a task begins and when it ends. I would like a bar chart that displays the number of tasks being done in each hour over an interval of time.
For example, from the following table
I would want to be able to see that from 07/12/2021 21:00 to 07/12/2021 22:00 there were 3 tasks being done (row 1, row 2, row 3).
Also, since I will have several thousand rows, I would like to use the date widget from Metabase to specify the time range.
I have been struggling with this for the last week. I tried to create auxiliary questions to query afterwards, but my only success was hard-coding the 24 hours of a day. With that approach I could not use the time widget: I had to write the dates into the SQL myself each time I wanted to check a specific day, and I could only check whole 24-hour days, not, for example, from 02/12/2021 6:00 to 04/12/2021 18:00.
My Metabase is running on a PostgreSQL database. Is this even possible in Metabase? If not, what would you advise for building this? Other platforms? Pure SQL? Python?
Thank you so much

I am not sure about Metabase, but from a PostgreSQL point of view this calls for range types, specifically tsrange or tstzrange, depending on whether you have time zone information or not.
So a query could be:
SELECT
    *
FROM "someTable"
WHERE
    tsrange("Begin At", "End At", '[)')
    &&
    tsrange('02/12/2021 6:00', '04/12/2021 18:00', '[)')
However, I don't know how you would get the '02/12/2021 6:00' and '04/12/2021 18:00' out of your Metabase user interface.
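To make the per-hour counting concrete, here is a minimal Python sketch of the logic, not a Metabase or PostgreSQL implementation. It walks the requested interval in one-hour buckets and applies the same half-open overlap test that `tsrange(...) && tsrange(...)` expresses. The function name `tasks_per_hour` and the sample rows are made up for illustration; in PostgreSQL the buckets would more likely come from `generate_series`.

```python
from datetime import datetime, timedelta


def tasks_per_hour(tasks, range_start, range_end):
    """Count tasks active in each one-hour bucket of [range_start, range_end)."""
    counts = {}
    # Align the first bucket to the top of the hour.
    bucket = range_start.replace(minute=0, second=0, microsecond=0)
    while bucket < range_end:
        bucket_end = bucket + timedelta(hours=1)
        # Half-open overlap test: [begin, end) overlaps [bucket, bucket_end).
        counts[bucket] = sum(
            1 for begin, end in tasks if begin < bucket_end and end > bucket
        )
        bucket = bucket_end
    return counts


# Hypothetical (begin_at, end_at) rows; not the data from the question.
tasks = [
    (datetime(2021, 12, 7, 20, 30), datetime(2021, 12, 7, 21, 15)),
    (datetime(2021, 12, 7, 21, 10), datetime(2021, 12, 7, 21, 50)),
    (datetime(2021, 12, 7, 21, 45), datetime(2021, 12, 7, 23, 5)),
]
counts = tasks_per_hour(
    tasks, datetime(2021, 12, 7, 20, 0), datetime(2021, 12, 8, 0, 0)
)
for hour, active in counts.items():
    print(hour, active)
```

Because the interval bounds are ordinary parameters, they can span any range (for example 6:00 on one day to 18:00 two days later) rather than a fixed 24 hours.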

Related

Finding the Closest Unbooked Dates Using SQL

Scenario
A user selects a date. Based on the selection I check whether the date & time is booked or not (No issues here).
If a date & time is booked, I need to show them n alternative dates based on their date and time parameters, and those proposed alternative dates have to be as close to their chosen date as possible. The list of alternative dates should start from the date the query is run on. My backend handles this.
My Progress So Far
SELECT alternative_date
FROM GENERATE_SERIES(
    TIMESTAMP '2022-08-20 05:00:00',
    date_trunc('month', TIMESTAMP '2022-08-20 07:00:00') + INTERVAL '1 month - 1 day',
    INTERVAL '1 day'
) AS G(alternative_date)
WHERE NOT EXISTS(
    SELECT * FROM events T
    WHERE T.bookDate::DATE = G.alternative_date::DATE
)
The code above uses the GENERATE_SERIES(...) function in PostgreSQL. It searches all dates, starting from 2022-08-20 and up to the end of August. It returns only the dates which do not exist in the bookDate column (meaning they have not yet been booked).
Problems I Need Help With
When searching for alternative dates, I'm providing 3 important things:
The user's preferred booking date, so I can suggest other dates close to it that the user can choose. How would I go about doing this? It's the part where I'm having the most trouble.
The user's start and end times, so that when providing a list of alternative dates I can tell the user, for instance, that there is free space between 06 and 07 on 2022-08-22. I'm also facing some issues here; a push in the right direction would be great!
I want to add another WHERE but it fails. The current WHERE is a NOT EXISTS, so it looks for all dates not equal to what is given. My other WHERE basically checks whether the place is open for booking or not.
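To get the closest free dates, you can ORDER BY the "distance" between each alternative date and the user's preferred date, so the shortest intervals come first. Your series only runs forward from the preferred date, so a plain difference would also work; taking the absolute value keeps the ordering correct if the series ever contains earlier dates:
ORDER BY ABS(EXTRACT(EPOCH FROM (alternative_date - TIMESTAMP '2022-08-20 05:00:00')))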
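The "order by distance" idea can be sketched outside SQL as well. The following Python example is only an illustration under assumed inputs: the function name `closest_free_dates`, the booked dates, and the search window are all hypothetical. It filters out booked dates (the NOT EXISTS part) and then sorts the remaining candidates by absolute distance from the preferred date, breaking ties in favour of the earlier date.

```python
from datetime import date, timedelta


def closest_free_dates(preferred, booked, search_start, search_end, n):
    """Return up to n unbooked dates, nearest to the preferred date first."""
    candidates = []
    day = search_start
    while day <= search_end:
        if day not in booked:  # plays the role of NOT EXISTS
            candidates.append(day)
        day += timedelta(days=1)
    # Sort by absolute distance in days, then by date for a stable tie-break.
    candidates.sort(key=lambda d: (abs((d - preferred).days), d))
    return candidates[:n]


# Hypothetical bookings; not data from the question.
booked = {date(2022, 8, 20), date(2022, 8, 21), date(2022, 8, 23)}
suggestions = closest_free_dates(
    preferred=date(2022, 8, 20),
    booked=booked,
    search_start=date(2022, 8, 15),
    search_end=date(2022, 8, 31),
    n=3,
)
print(suggestions)
```

The absolute value matters here because the search window starts before the preferred date; without it, earlier dates would always sort ahead of later ones regardless of distance.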
If you want to recommend time slots smaller than whole dates (hour ranges), you need to switch the whole thing from dates to hours, i.e. change the generate_series step from 1 day to 1 hour (or whatever your smallest bookable unit is) and exclude invalid hours (nighttime, I assume) in the WHERE. From there, it is pretty much the same as with dates.
As for "second where", there can be only one WHERE, but it can be composed from multiple conditions - you can add more conditions using AND operator (and it can also be sub-query if needed):
WHERE NOT EXISTS(
    SELECT * FROM events T
    WHERE T.bookDate::DATE = G.alternative_date::DATE
) AND NOT EXISTS (
    SELECT 1 FROM events WHERE "roomId" = '13b46460-162d-4d32-94c0-e27dd9246c79'
)
(warning: this second sub-query is probably dangerous in real world, since the room will be used more than one time, I assume, so you need to add some time condition to the subquery to check against date)

I would like to know if there's a way to complete this query

I'm trying to obtain the average time spent on an "activity" in a Moodle database. I am not an SQL expert, but I have managed to get to the point shown in the picture. My question is whether there is a way to first obtain the time difference between timestamps by day (this "activity" does not have a start-time column like many others) and then sum those differences to get the average for that activity. For the first step I tried the EXTRACT() function, comparing the dates in the format "%Y-%m-%d", but it only sums the first row where they are equal. So far I have been doing this with a single SQL statement; I know stored procedures exist, but my level of SQL is not that high.
Thanks in advance!
data obtained so far
Data in table logs (the most important columns, I think):
component  action     objecttable    userid  courseid  timecreated
mod_quiz*  viewed     quiz_attempts  6       2         1645287525
mod_forum  viewed     forum          5       2         1645288525
core       loggedout  user           2       0         1645291745
mod_page   viewed     page           5       2         1645291955
Data I'm trying to get:
Activity  StartTime  EndTime  Total
forum     19:01      19:10    9 minute(s)
quiz      15:45      16:00    15 minute(s)
page      ...        ...      ...
workshop  ...        ...      ...
But so far I have only managed to sort the data into a single column:
Time
2022-x-x h:m
....
When I try to sum by day with the EXTRACT() function, matching the dates in a very long query, it only returns the first record.
NOTE: * Half of the "activities" were easy to calculate since they have "timestart" and "timeend" columns, but I cannot figure out how to handle the ones that do not have a "timestart" column.
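One possible way to approximate a duration when there is no "timestart" column is to treat the gap between a user's log entry and that same user's next log entry as the time spent on the first entry. This is an assumption about how to read the logs, not something Moodle defines, and the sketch below is Python rather than SQL. The function name `activity_durations` is hypothetical; the rows are the four sample rows from the logs table above (with the footnote asterisk dropped from mod_quiz). Where the database supports window functions, a LEAD-style lookup over each user's ordered rows would express the same idea.

```python
from collections import defaultdict

# (component, action, objecttable, userid, courseid, timecreated)
logs = [
    ("mod_quiz", "viewed", "quiz_attempts", 6, 2, 1645287525),
    ("mod_forum", "viewed", "forum", 5, 2, 1645288525),
    ("core", "loggedout", "user", 2, 0, 1645291745),
    ("mod_page", "viewed", "page", 5, 2, 1645291955),
]


def activity_durations(rows):
    """Sum, per objecttable, the seconds until the same user's next log entry."""
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[3]].append(row)

    totals = defaultdict(int)
    for events in by_user.values():
        events.sort(key=lambda r: r[5])  # order each user's events by time
        # Pair every event with the one that follows it for the same user.
        for current, following in zip(events, events[1:]):
            totals[current[2]] += following[5] - current[5]
    return dict(totals)


durations = activity_durations(logs)
for activity, seconds in durations.items():
    print(activity, round(seconds / 60, 1), "minute(s)")
```

A user's last event has no successor, so it contributes no duration under this assumption; a real report would probably also cap very long gaps, since a gap may include idle time.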

Amount of overlaps per minute

I would like to make an SQL-Statement in order to find the amount of users that are using a channel by date and time. Let me give you an example:
Let's call this table Data:
Date Start End
01.01.2020 17:00 17:30
01.01.2020 17:01 17:03
01.01.2020 17:29 18:30
Data is a table that shows when a user started a connection on a channel and the time the connection was closed. A connection can be made at any time, which means from 00:00 until the next day.
What I am trying to achieve is to count the maximum number of simultaneous connections over a long period of time, let's say 1st February to 1st April.
My idea was to make another table with timestamps in Excel. The table would display a Timestamp for every Minute in a specific date.
Then I tried to make a statement like:
SELECT *
FROM Data,Timestamps
WHERE Timestamps.Time BETWEEN Data.Start AND Data.End;
Logically this statement does what it is supposed to do. The only problem is that it performs so poorly that it never finishes, given the number of timestamps and the amount of data I have to check.
Could anybody help me with this problem? Any other ideas I can try or how to improve my statement?
Regards!
So why do you create the other table in Excel and not directly in MS Access? And why not set up proper indexes on the timestamps? That will speed it up considerably.
By the way, I think your statement will repeat a row for every connection that matches your Start .. End period, so the number of rows produced will be enormous. You should rather try:
SELECT Timestamps.Time, COUNT(*)
FROM Data,Timestamps
WHERE Timestamps.Time BETWEEN Data.Start AND Data.End
GROUP BY Timestamps.Time;
But sorry if the syntax in MS Access is different.
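To show what the grouped query computes, here is a small Python sketch of the same counting, using the three sample rows from the question. It is an illustration only: the function name `connections_per_minute` is made up, and it assumes start and end times fall on whole minutes. Each connection contributes one count to every minute from its start to its end inclusive, mirroring BETWEEN, and the peak is the largest per-minute count.

```python
from collections import Counter
from datetime import datetime, timedelta


def connections_per_minute(rows):
    """Count open connections for every minute covered by at least one row."""
    counts = Counter()
    for start, end in rows:
        minute = start
        while minute <= end:  # inclusive on both ends, like BETWEEN
            counts[minute] += 1
            minute += timedelta(minutes=1)
    return counts


# The three sample rows from the question (date 01.01.2020).
rows = [
    (datetime(2020, 1, 1, 17, 0), datetime(2020, 1, 1, 17, 30)),
    (datetime(2020, 1, 1, 17, 1), datetime(2020, 1, 1, 17, 3)),
    (datetime(2020, 1, 1, 17, 29), datetime(2020, 1, 1, 18, 30)),
]
per_minute = connections_per_minute(rows)
peak = max(per_minute.values())
print("peak concurrent connections:", peak)
```

This produces one count per minute instead of one row per matching pair, which is the reduction the GROUP BY achieves.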

Selecting Between Hours with Timestamp in SQL

I need to figure out how I can select the AVG from another column in a table, between two hour time intervals. I am using PL/SQL/Serverpages or PSP, so the user would select their interval of choice from a drop down menu (ex "2PM-4PM, 4PM-6PM",etc.) and then on the second page, using their choice I will provide information from another column in the table. The issue I have is that the format of my timestamp column is:
30-OCT-16 02.52.00.000000000 PM
30-OCT-16 02.54.00.000000000 PM
The way I have been trying to solve this problem is by using the following methodology:
IF number_text = 1 THEN
SELECT AVG(column) INTO avg_power
FROM table
WHERE date_column BETWEEN TO_DATE('12','HH') AND TO_DATE('2','HH')
AND LIKE '%PM';
I am going to use various IF statements in order to activate each select statement with the IF contingent on which interval the user selects from a drop down list.
As I said, the variable time depends on what the user selects on a prior page. My biggest issues in this situation are figuring out how I am supposed to code the WHERE clause as well as finding a way to work with the data, in terms of hours, as it exists in the database, while also taking AM and PM into account. I greatly appreciate any and all help to solve this issue.
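One way to think about the WHERE clause is to compare the hour of day in 24-hour form instead of matching on the displayed text, which makes AM and PM a non-issue. The Python sketch below illustrates that idea on timestamp strings in the format shown above; it is not PL/SQL and not the asker's code. The helper names and the sample values (including the third row) are hypothetical, and the parsing assumes English month and AM/PM names.

```python
import re
from datetime import datetime


def parse_timestamp(text):
    """Parse a string such as '30-OCT-16 02.52.00.000000000 PM'."""
    # Python's %f accepts at most 6 fractional digits, so trim the rest.
    trimmed = re.sub(r"\.(\d{6})\d*(?= )", r".\1", text)
    return datetime.strptime(trimmed, "%d-%b-%y %I.%M.%S.%f %p")


def average_between_hours(rows, start_hour, end_hour):
    """Average the values whose hour of day lies in [start_hour, end_hour)."""
    values = [
        value
        for stamp, value in rows
        if start_hour <= parse_timestamp(stamp).hour < end_hour
    ]
    return sum(values) / len(values) if values else None


# Hypothetical (timestamp, value) rows for illustration.
rows = [
    ("30-OCT-16 02.52.00.000000000 PM", 10.0),
    ("30-OCT-16 02.54.00.000000000 PM", 20.0),
    ("30-OCT-16 04.10.00.000000000 PM", 50.0),
]
# "2PM-4PM" becomes hours 14 and 16 in 24-hour form.
avg_power = average_between_hours(rows, 14, 16)
print(avg_power)
```

With the interval expressed as two hour numbers, a single parameterised query could replace the chain of IF statements, one per drop-down choice.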

How to group time data into buckets in QlikView?

I have a list of times in QlikView. For example:
1:45 am
2:34 am
3:55 am
etc.
How do I split it into groups like this:
1 - 2 am
2 - 3 am
3 - 4 am
etc.
I used the class function, but something is wrong. It works but it doesn't create time buckets, it creates some sort of converted decimal buckets.
You have a couple of options. By far the simplest is to create a new field which reformats your time field. For example, I created TimeBucket, which formats the time field as an hour and appends the same time with one hour added as the upper bound:
LOAD
TimeField,
Time(TimeField,'h tt') & ' - ' & Time(TimeField + maketime(1,0,0),'h tt') as TimeBucket;
LOAD
*
INLINE [
TimeField
1:45
2:34
3:55
16:45
17:56
];
This then results in the following:
However, depending on your exact requirements, this solution may have problems due to the nature of Time as this is a dual function.
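For readers who want to see the bucketing logic in isolation, here is a minimal Python sketch of the same idea: truncate the time to the hour for the lower bound and add one hour for the upper bound. It is not QlikView script, the function name `hour_bucket` is made up, and the label format ("1 AM - 2 AM") is an assumption that may differ from what Time(..., 'h tt') displays.

```python
from datetime import datetime, timedelta


def hour_label(moment):
    """Format an hour as '1 AM' or '12 PM', without a leading zero."""
    return moment.strftime("%I %p").lstrip("0")


def hour_bucket(time_text):
    """Map a time like '1:45' to a label such as '1 AM - 2 AM'."""
    parsed = datetime.strptime(time_text, "%H:%M")
    start = parsed.replace(minute=0)      # lower bound: top of the hour
    end = start + timedelta(hours=1)      # upper bound: one hour later
    return f"{hour_label(start)} - {hour_label(end)}"


for value in ["1:45", "2:34", "3:55", "16:45", "17:56"]:
    print(value, "->", hour_bucket(value))
```

Truncating to the hour before formatting keeps the label tied to the hour boundary rather than to the original minute value.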
Another alternative is to use intervalmatch as follows. One point to remember is that intervalmatch includes the end-points in an interval. This means for time, we have to make the "end" times be one second before the start of the next interval, otherwise we will generate two records instead of one if your source data has a time that sits on an interval boundary.
TimeBuckets:
LOAD
maketime(RecNo()-1,0,0) as Start,
maketime(RecNo()-1,59,59) as End,
trim(mid(time(maketime(RecNo()-1),'h tt'),1,2)) & ' - ' & trim(time(maketime(mod(RecNo(),24)),'h tt')) as Bucket
AUTOGENERATE(24);
SourceData:
LOAD
*
INLINE [
TimeField
1:45
2:34
3:55
16:45
17:56
];
BucketedSourceData:
INTERVALMATCH (TimeField)
LOAD
Start,
End
RESIDENT TimeBuckets;
LEFT JOIN (BucketedSourceData)
LOAD
*
RESIDENT TimeBuckets;
DROP TABLES SourceData, TimeBuckets;
This then results in the following:
More information on intervalmatch may be found in both the QlikView installed help as well as the QlikView Reference manual.
Write a nested if statement in your script, testing the later boundary first so that every branch can be reached:
If(TIME > MakeTime(2,45), 'bucket 2',
If(TIME > MakeTime(1,45), 'bucket 1', 'Others'
)
)
Not the most elegant, but if you can't get the 1:45 to work with the date() function, you can always convert to military time and just add the hours and minutes, then make buckets out of that.