Group by time span in rails - sql

I want to get output from the users table based on the time each record was created. The time is stored in the created_at column.
The output should look like this:
Time           User count
2 am - 6 am    10
6 am - 10 am   5
10 am - 2 pm   5
2 pm - 6 pm    5
6 pm - 10 pm   5
10 pm - 2 am   5
I can't simply group by created_at. The solution I found is to add another column, say time_span, set it to "2 am - 6 am" when the created_at time falls in that span, and then group by the time_span column. Is there a better solution?

My suggestion is to create another column in the database. That way you avoid calculations in your selects at the expense of one simple column.

I'm not sure what you mean by not being able to use group_by, but the following will work:
# Hour of day (0-23) at which each user was created
hours = User.all.collect {|u| u.created_at.hour}
# Six 4-hour spans starting at 2 am; hours 0 and 1 are treated as 24 and 25
ranges = [(2...6), (6...10), (10...14), (14...18), (18...22), (22...26)]
summary = hours.group_by {|h| ranges.find {|r| r === (h<2 ? h+24 : h)}}
ranges.each {|r| puts "#{r} = #{(summary[r] || []).length}"}
There are probably opportunities to simplify this and you could push the grouping up into the database if you'd like, but I thought I'd go ahead and share.
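One possible simplification, sketched here in Python rather than Ruby: because the spans are all 4 hours wide and start at 2 am, the span can be computed arithmetically instead of searched for in a list of ranges. The names `span_index`, `count_by_span` and `sample_hours` are made up for this illustration, and the hours are sample values, not real data.

```python
from collections import Counter

# Labels for the six 4-hour spans, in order, starting at 2 am.
LABELS = ["2 am - 6 am", "6 am - 10 am", "10 am - 2 pm",
          "2 pm - 6 pm", "6 pm - 10 pm", "10 pm - 2 am"]


def span_index(hour):
    """Map an hour of day (0-23) to the index of its 4-hour span."""
    # Shift so that 2 am becomes 0, wrap around midnight, then split into 4-hour blocks.
    return ((hour - 2) % 24) // 4


def count_by_span(hours):
    """Return (label, count) pairs for every span, including empty ones."""
    counts = Counter(span_index(h) for h in hours)
    return [(label, counts.get(i, 0)) for i, label in enumerate(LABELS)]


# Hypothetical creation hours, standing in for u.created_at.hour values.
sample_hours = [0, 1, 2, 5, 6, 13, 14, 22, 23]
summary = count_by_span(sample_hours)
for label, n in summary:
    print(f"{label}: {n}")
```

The same expression, applied to the hour extracted from created_at, is also what a database-side GROUP BY would use.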

Related

InfluxDB v1.8: subquery using MAX selector

I'm using InfluxDB 1.8 and trying to make a little more complex query than Influx was made for.
I want to retrieve all data that refers to the last month stored, based on tag and field values that my script stores (not the default "time" field that Influx creates). Say we have this infos measurement:
time                 field_month  field_week  tag_month  tag_week  some_data
1631668119209113500  8            1           8          1         random
1631668119209113500  8            2           8          2         random
1631668119209113500  8            3           8          3         random
1631668119209113500  9            1           9          1         random
1631668119209113500  9            1           9          1         random
1631668119209113500  9            2           9          2         random
Here 8 refers to August and 9 to September, and some_data is a value stored in a given week of that month.
I can use the MAX selector on field_month to get the last month of the year stored (I can't use the Flux date package because I'm on v1.8). Further, I want the data grouped by tag_month and tag_week so I can COUNT how many times some_data was stored in each week of the month; that's why the same data is repeated in the field and tag keys. Something like this:
SELECT COUNT(field_month) FROM infos WHERE field_month = 9 GROUP BY tag_month, tag_week
Replacing 9 by MAX Selector:
SELECT COUNT(field_month) FROM infos WHERE field_month = (SELECT MAX(field_month) FROM infos) GROUP BY tag_month, tag_week
The first query works, but the second does not.
Am I doing something wrong? Is there any other possibility to make this work in v1.8?
NOTE: I know Influx wasn't supposed to be used like that. I've tried and managed this easily with PostgreSQL, using an adapted form of the second query above. But while we straighten things up to use Postgres, we have to use InfluxDB v1.8.
In PostgreSQL you can try:
SELECT COUNT(field_month) FROM infos WHERE field_month =
(SELECT field_month FROM infos ORDER BY field_month DESC limit 1)
GROUP BY tag_month, tag_week;
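To make the intended result concrete, the two steps the second query is meant to combine (find the latest month, then count per tag_month and tag_week for that month) can be spelled out as below. This is plain Python over the sample rows from the question, not InfluxQL; the tuple layout and the names `rows`, `last_month` and `counts` are assumptions made for the illustration.

```python
from collections import Counter

# Sample rows from the question as (field_month, tag_month, tag_week).
# Tag values are kept as strings here, field values as integers.
rows = [
    (8, "8", "1"),
    (8, "8", "2"),
    (8, "8", "3"),
    (9, "9", "1"),
    (9, "9", "1"),
    (9, "9", "2"),
]

# Step 1: the equivalent of SELECT MAX(field_month) FROM infos.
last_month = max(field_month for field_month, _, _ in rows)

# Step 2: the equivalent of COUNT(field_month) ... WHERE field_month = <last_month>
# GROUP BY tag_month, tag_week.
counts = Counter(
    (tag_month, tag_week)
    for field_month, tag_month, tag_week in rows
    if field_month == last_month
)

for (tag_month, tag_week), n in sorted(counts.items()):
    print(f"month {tag_month}, week {tag_week}: {n}")
```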

Finding the exact overlapping time

with tickets as (
    select o.SSTID, o.open_Id, o.Createddatetime openTime, c.Createddatetime closeTime
    from dbo.Close_ticket c
    inner join dbo.Openticket o ON o.SSTID = c.SSTID and c.Open_ID = o.open_id
)
select t1.SSTID,
    SUM(isnull(datediff(hour
        , case when t1.openTime > t2.openTime then t1.openTime else t2.openTime end
        , case when t1.closeTime > t2.closeTime then t2.closeTime else t1.closeTime end), 0)) as [OverLappingtime]
from tickets t1
left join tickets t2 on t1.SSTID = t2.SSTID
    and t1.openTime < t2.closeTime and t2.openTime < t1.closeTime
    and t1.open_id < t2.open_id
group by t1.SSTID
This is my code, in which each ticket is compared with every other ticket to find the total overlapping time. But if I create more tickets, the total exceeds 24 hours even though all the tickets were created on the same day. How can I find the exact overlapping time? Looking at the first three tickets, the second and third were opened and closed within the opening and closing time of the first.
I need the exact overlapping time.
This is my Openticket table.
[Open_ID,SSTID,Createddatetime]
- 1,1,2020-04-27 06:40:32.337
- 2,1,2020-04-27 12:40:32.337
- 3,1,2020-04-27 14:40:32.337
- 4,1,2020-04-27 15:40:32.337
- 5,1,2020-04-27 18:40:32.337
This is my Close_ticket table.
[Close_id,open_id,SSTID,Createddatetime]
- 1,1,1,2020-04-27 20:40:32.337
- 2,2,1,2020-04-27 15:40:32.337
- 3,3,1,2020-04-27 16:40:32.337
- 4,4,1,2020-04-27 17:40:32.337
- 5,5,1,2020-04-27 21:40:32.337
You keep saying "the logic I've used so far is the one I mentioned" but at no point have you actually mentioned this logic in any useful form so that anyone can understand what it is you are doing: all you are doing is stating numbers with no indication on how you calculated these numbers.
Please provide a step by step guide to show how you calculated an overlap figure of 4 hours for the first 3 tickets.
For example, taking your data for the first 3 tickets but moving the start/end times to the hour (rather than 40:32.337) for the sake of simplicity, these are the possible overlap calculations:
2 overlaps 1 by 3 hours => overlap is 3
3 overlaps 1 by 2 hours => overlap is 2
You want to calculate the overlap of both 2 & 3 compared to 1: 3 + 2 = 5
You only want the overlap when all 3 tickets overlap: 1
You don't want to double count any overlap: 2 overlaps 1 by 3 hours, 3 overlaps 1 by 2 hours, 2 & 3 overlap each other by 1 hour (double count) => 3 + 2 - 1 = 4
So which of these possible calculations are you using? Or are you using completely different logic and, if so, what is that logic?
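The candidate calculations can be checked with a short Python sketch. It assumes the first three tickets from the posted tables with times truncated to whole hours (ticket 1: 6 to 20, ticket 2: 12 to 15, ticket 3: 14 to 16); the helper names `overlap` and `union_length` are made up for this illustration.

```python
def overlap(a, b):
    """Length of the intersection of two (start, end) intervals, 0 if disjoint."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def union_length(intervals):
    """Total length covered by the intervals, counting shared stretches once."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Tickets as (open hour, close hour), times truncated to the hour (an assumption).
t1, t2, t3 = (6, 20), (12, 15), (14, 16)

# Option 1: add up each ticket's overlap with ticket 1.
summed = overlap(t1, t2) + overlap(t1, t3)

# Option 2: only the stretch where all three tickets are open at once.
common = max(0, min(t1[1], t2[1], t3[1]) - max(t1[0], t2[0], t3[0]))

# Option 3: time during which ticket 1 overlaps at least one other ticket,
# with no double counting (clip the others to ticket 1, then take the union).
clipped = [(max(s, t1[0]), min(e, t1[1])) for s, e in (t2, t3)]
no_double_count = union_length(clipped)

print(summed, common, no_double_count)
```

Merging intervals before measuring them is what keeps the total from exceeding the length of the day, which the pairwise sum in the question cannot guarantee.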

Excel Powerpivot measure conundrum- Average (of average?)

I have a powerpivot table that shows work_tickets and timestamps for each step taken towards resolution:
Ticket | Step | Time | TicketDuration
--------------------------------------
1        1      5:30   15
1        2      5:33   15
1        3      5:45   15
2        1      6:00   10
2        2      6:05   10
2        3      6:10   10
[ticketDuration] is a calculated column I added on my own. Now I'm trying to create a measure for the [AverageTicketDuration] so that it returns 12.5 minutes for the table above{ (15+10)/2 }. I haven't got a clue how to use DAX to produce the results. Please help!
What you are looking for is the AVERAGEX function, which has the following definition: AVERAGEX(<table>, <expression>)
The idea is that it iterates through each row of the given table, applies your calculation, and then averages the results.
In the example below, I use Table1 as the table name.
To iterate over tickets, use VALUES(Table1[Ticket]), which returns the unique values in the Ticket column.
Then, assuming the ticket duration is always the same within a ticket ID, the aggregation used in the expression is AVERAGE(Table1[TicketDuration]), wrapped in CALCULATE so that it is evaluated for the current ticket only. For ticket 1, for example, (15 + 15 + 15)/3 = 15.
Put together, the measure looks like this:
AverageTicketDuration:=AVERAGEX(VALUES(Table1[Ticket]), CALCULATE(AVERAGE(Table1[TicketDuration])))
Dropped into a pivot table with your sample data, it returns 12.5.
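As a cross-check of the arithmetic, here is the same "average of per-ticket averages" written out in Python over the sample table. This is only an illustration of the calculation, not DAX; the function name `average_ticket_duration` and the tuple layout are assumptions.

```python
from collections import defaultdict

# Sample rows from the question as (ticket, step, ticket_duration_minutes).
rows = [
    (1, 1, 15), (1, 2, 15), (1, 3, 15),
    (2, 1, 10), (2, 2, 10), (2, 3, 10),
]


def average_ticket_duration(rows):
    """Average the duration within each ticket, then average across tickets."""
    durations_by_ticket = defaultdict(list)
    for ticket, _step, duration in rows:
        durations_by_ticket[ticket].append(duration)
    # One value per distinct ticket, like iterating over the unique ticket IDs.
    per_ticket = [sum(d) / len(d) for d in durations_by_ticket.values()]
    return sum(per_ticket) / len(per_ticket)


result = average_ticket_duration(rows)
print(result)
```

Averaging per ticket first means a ticket with many steps does not weigh more than a ticket with few.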

Very simple BigQuery SQL script won't return "0" for Count rows with no results

I am trying to make this very simple SQL script work:
SELECT
DATE(SEC_TO_TIMESTAMP(created_utc)) date_submission,
COUNT(*) AS num_apples_oranges_submissions
FROM
[fh-bigquery:reddit_comments.2008]
WHERE
(LOWER(body) CONTAINS ('apples')
AND LOWER(body) CONTAINS ('oranges'))
GROUP BY
date_submission
ORDER BY
date_submission
The results look like this:
1 2008-01-07 3
2 2008-01-08 1
3 2008-01-09 2
4 2008-01-10 3
5 2008-01-11 2
6 2008-01-13 2
7 2008-01-15 2
8 2008-01-16 3
As you can see, for days where there were no submissions containing both "apples" and "oranges", instead of a value of 0 being returned, the entire row is simply missing (such as on the 12th and 14th).
How can I fix this? I'm at my wits' end. Thank you.
Try the query below. It returns every day that has at least one comment, with a count of 0 for days where none of them match:
SELECT
DATE(SEC_TO_TIMESTAMP(created_utc)) date_submission,
SUM((LOWER(body) CONTAINS ('apples') AND LOWER(body) CONTAINS ('oranges'))) AS num_apples_oranges_submissions
FROM
[fh-bigquery:reddit_comments.2008]
GROUP BY
date_submission
ORDER BY
date_submission
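The difference between filtering in WHERE and summing a condition can be reproduced on a small scale. The sketch below uses Python's built-in sqlite3 module as a stand-in for BigQuery, so the SQL dialect differs (`instr` replaces `CONTAINS`); the table name `comments`, its columns, and the sample rows are all made up for the illustration.

```python
import sqlite3

# Hypothetical sample comments as (day, body).
sample_rows = [
    ("2008-01-11", "Apples and oranges"),
    ("2008-01-12", "just bananas"),
    ("2008-01-13", "apples, oranges, pears"),
    ("2008-01-13", "oranges only"),
]

# The condition both queries share: body mentions apples and oranges.
MATCH = "instr(lower(body), 'apples') > 0 AND instr(lower(body), 'oranges') > 0"


def daily_counts(rows):
    """Return (filtered, conditional) per-day counts for the same data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (day TEXT, body TEXT)")
    conn.executemany("INSERT INTO comments VALUES (?, ?)", rows)
    # WHERE removes non-matching rows before grouping, so empty days vanish.
    filtered = conn.execute(
        f"SELECT day, COUNT(*) FROM comments WHERE {MATCH} "
        "GROUP BY day ORDER BY day"
    ).fetchall()
    # Summing the condition keeps every day that has any row at all.
    conditional = conn.execute(
        f"SELECT day, SUM({MATCH}) FROM comments GROUP BY day ORDER BY day"
    ).fetchall()
    conn.close()
    return filtered, conditional


filtered, conditional = daily_counts(sample_rows)
print(filtered)
print(conditional)
```

A day with no comments at all would still be missing from both results, since there is no row to group on.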

SQL Update each record with its position in an ordered select

I'm using Access via OleDb. I have a table with columns ID, GroupID, Time and Place. An application inserts new records into the table; unfortunately, the Place isn't calculated correctly.
I want to update each record in a group with its correct place according to its time ascending.
So assume the following data:
ID GroupId Time Place
Chuck 1 10:01 2
Alice 1 09:01 3
Bob 1 09:31 1
should result in:
ID GroupId Time Place
Chuck 1 10:01 3
Alice 1 09:01 1
Bob 1 09:31 2
I could come up with a solution using a cursor but that's AFAIK not possible in Access.
I just did a search on performing "ranking in Access" and I got this support.microsoft result.
It seems you create a query with a field that has the following expression:
Place: (Select Count(*) from table1 Where [GroupId] = [table1alias].[GroupId] And [Time] < [table1alias].[Time]) + 1
I can't test this, so I hope it works.
Using this you may be able to do (where queryAbove is the above query):
UPDATE table1
SET [Place] = queryAbove.[Place]
FROM queryAbove
WHERE table1.ID = queryAbove.ID
It's a long shot but please give it a go.
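The ranking idea behind that expression (place = number of rows with an earlier time, plus one) can be sketched in Python, here restricted to rows of the same group as the question asks. It assumes Time is a zero-padded "HH:MM" string, so plain string comparison orders it correctly; the function name `assign_places` is made up for the illustration.

```python
def assign_places(records):
    """Return copies of the records with Place recomputed within each group."""
    updated = []
    for rec in records:
        # Count rows in the same group with a strictly earlier time.
        earlier = sum(
            1
            for other in records
            if other["GroupId"] == rec["GroupId"] and other["Time"] < rec["Time"]
        )
        updated.append({**rec, "Place": earlier + 1})
    return updated


# Sample data from the question, with the incorrect places.
records = [
    {"ID": "Chuck", "GroupId": 1, "Time": "10:01", "Place": 2},
    {"ID": "Alice", "GroupId": 1, "Time": "09:01", "Place": 3},
    {"ID": "Bob", "GroupId": 1, "Time": "09:31", "Place": 1},
]

places = {rec["ID"]: rec["Place"] for rec in assign_places(records)}
print(places)
```

Rows with identical times in the same group receive the same place under this scheme.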
I don't think Time is a number or time-formatted column. Unfortunately, it is a text string containing the digits and delimiters of the time format, which is why sorting by the Time column does not work. Removing the delimiters ":" and ",", casting to integer, and then sorting numerically could do the job.