I have a lot of values in a Postgres DB that include a time value.
The database contains records of unit colors, something like this:
[
{
id: 1234,
unit: 2,
color: "red",
time: "Wed, 16 Dec 2020 21:45:30"
},
{
id: 1235,
unit: 2,
color: "red",
time: "Wed, 16 Dec 2020 21:47:30"
},
{
id: 1236,
unit: 6,
color: "blue",
time: "Wed, 16 Dec 2020 21:48:30"
},
{
id: 1237,
unit: 6,
color: "green",
time: "Wed, 16 Dec 2020 21:49:30"
},
{
id: 1238,
unit: 6,
color: "blue",
time: "Wed, 16 Dec 2020 21:49:37"
},
]
I want to be able to query this list in 10-minute averages, which should return the earliest record that contains the average.
For example, in the 10-minute period 21:40 - 21:50 I should only receive the 2 unique units with the average value that they had within that time period.
The returned data should look something like this:
[
{
id: 1234,
unit: 2,
color: "red",
time: "Wed, 16 Dec 2020 21:45:30"
},
{
id: 1236,
unit: 6,
color: "blue",
time: "Wed, 16 Dec 2020 21:48:30"
},
]
What type of query should I be using to achieve something like this?
Thanks
You can use distinct on:
select distinct on (x.time_trunc, t.unit) t.*
from mytable t
cross join lateral (values (
    date_trunc('hour', t.time)
    + extract(minute from t.time)::int / 10 * interval '10 minute'
)) as x(time_trunc)
order by x.time_trunc, t.unit, t.time;
The trick is to truncate the timestamps to 10-minute boundaries. For this we use date arithmetic; the ::int cast matters, since integer division floors the minutes to the start of the bucket. I moved the computation into a lateral join so there is no need to repeat the expression. Then distinct on comes into play to select the earliest record per timestamp bucket and per unit.
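As a quick illustration of the bucket arithmetic (a minimal sketch using only timestamps from the question, not part of the original answer):
select ts,
       date_trunc('hour', ts)
       + extract(minute from ts)::int / 10 * interval '10 minute' as bucket
from (values ('2020-12-16 21:45:30'::timestamp),
             ('2020-12-16 21:48:30'::timestamp)) as v(ts);
-- both rows get bucket = 2020-12-16 21:40:00, i.e. the 21:40 - 21:50 period,
-- so distinct on keeps only the earliest row per (bucket, unit)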
I don't see how the question relates to an average whatsoever.
I have the current date, and I have a list which is coming from a server. I want to find all records with the first nearest date.
"Results": [
{
"date": "May 9, 2020 8:09:03 PM",
"id": 1
},
{
"date": "Apr 14, 2020 8:09:03 PM",
"id": 2
},
{
"date": "Mar 15, 2020 8:09:03 PM",
"id": 3
},
{
"date": "May 9, 2020 8:19:03 PM",
"id": 4
}
],
Today's date is Wed Jul 20 00:00:00 GMT+01:00 2022; I am getting the current date as described in my own earlier Stack Overflow question.
Expected Output
[Result(date=May 9, 2020 8:09:03 PM, id = 1), Result(date=May 9, 2020 8:19:03 PM, id = 4)]
So how can I do this in an idiomatic way in Kotlin?
There are quite a few ways to solve this, ranging from simple to moderately complex depending on requirements such as efficiency. With some assumptions, below is a linear-time solution that is reasonably idiomatic and only a few lines in essence.
import kotlin.math.abs
import java.time.LocalDateTime
import java.time.temporal.ChronoUnit
import java.time.temporal.Temporal
data class Result(val date: LocalDateTime, val id: Int)
fun getClosestDays(toDate: Temporal, results: List<Result>): List<Result> {
    // Find the minimum distance in days to the given date (null for an empty list)
    val minimumDayCount = results.minOfOrNull { abs(ChronoUnit.DAYS.between(toDate, it.date)) }
    // Grab all results that match the minimum day count
    return results.filter { abs(ChronoUnit.DAYS.between(toDate, it.date)) == minimumDayCount }
}
fun main() {
    getClosestDays(
        LocalDateTime.now(),
        listOf(
            Result(LocalDateTime.of(2020, 5, 9, 8, 9, 3), 1),
            Result(LocalDateTime.of(2020, 4, 14, 8, 9, 3), 2),
            Result(LocalDateTime.of(2020, 3, 15, 8, 9, 3), 3),
            Result(LocalDateTime.of(2020, 5, 9, 8, 19, 3), 4)
        )
    ).also { println(it) }
}
Here is the output:
[Result(date=2020-05-09T08:09:03, id=1), Result(date=2020-05-09T08:19:03, id=4)]
Given two arrays like a = [10, 20, 30] and b = [9, 21, 32], how can I construct an array in Snowflake that consists of the minimum or maximum element at each index? I.e. the desired output for the minimum is [9, 20, 30] and for the maximum is [10, 21, 32].
I looked at Snowflake's array functions and didn't find a function that does this.
If the arrays are always the same size (reusing Lukasz' great data CTE):
WITH cte AS (
SELECT ARRAY_CONSTRUCT(10, 20, 30) AS a, ARRAY_CONSTRUCT(9, 21, 32) AS b
)
SELECT a,b
,ARRAY_AGG(LEAST(a[n.index], b[n.index])) WITHIN GROUP(ORDER BY n.index) AS min_array
,ARRAY_AGG(GREATEST(a[n.index], b[n.index])) WITHIN GROUP(ORDER BY n.index) AS max_array
FROM cte
,table(flatten(a)) n
GROUP BY 1,2;
gives:
A              | B             | MIN_ARRAY     | MAX_ARRAY
[ 10, 20, 30 ] | [ 9, 21, 32 ] | [ 9, 20, 30 ] | [ 10, 21, 32 ]
And if you have uneven lists:
WITH cte AS (
SELECT ARRAY_CONSTRUCT(10, 20, 30) AS a, ARRAY_CONSTRUCT(9, 21, 32) AS b
union all
SELECT ARRAY_CONSTRUCT(10, 20, 30) AS a, ARRAY_CONSTRUCT(9, 21, 32, 45) AS b
)
SELECT a,b
,ARRAY_AGG(LEAST(a[n.index], b[n.index])) WITHIN GROUP(ORDER BY n.index) AS min_array
,ARRAY_AGG(GREATEST(a[n.index], b[n.index])) WITHIN GROUP(ORDER BY n.index) AS max_array
FROM cte
,table(flatten(iff(array_size(a)>=array_size(b), a, b))) n
GROUP BY 1,2;
A              | B                 | MIN_ARRAY     | MAX_ARRAY
[ 10, 20, 30 ] | [ 9, 21, 32 ]     | [ 9, 20, 30 ] | [ 10, 21, 32 ]
[ 10, 20, 30 ] | [ 9, 21, 32, 45 ] | [ 9, 20, 30 ] | [ 10, 21, 32 ]
The iff(array_size(a) >= array_size(b), a, b) picks the larger array to flatten; but since indexing past the end of the smaller array yields NULL, LEAST/GREATEST return NULL for those positions, and ARRAY_AGG drops NULLs, so you don't even need the size comparison, unless you want to NVL/COALESCE the values to safe defaults for the NULLs.
SELECT 1 as a, null as b, least(a,b);
gives:
A | B    | LEAST(A,B)
1 | null | null
like so:
SELECT a,b
,ARRAY_AGG(LEAST(nvl(a[n.index],10000), nvl(b[n.index],10000))) WITHIN GROUP(ORDER BY n.index) AS min_array
,ARRAY_AGG(GREATEST(nvl(a[n.index],0), nvl(b[n.index],0))) WITHIN GROUP(ORDER BY n.index) AS max_array
FROM cte
,table(flatten(iff(array_size(a)>=array_size(b), a, b))) n
GROUP BY 1,2;
A              | B                 | MIN_ARRAY         | MAX_ARRAY
[ 10, 20, 30 ] | [ 9, 21, 32 ]     | [ 9, 20, 30 ]     | [ 10, 21, 32 ]
[ 10, 20, 30 ] | [ 9, 21, 32, 45 ] | [ 9, 20, 30, 45 ] | [ 10, 21, 32, 45 ]
Using a numbers table and [] to access elements, with ARRAY_AGG to build the new arrays:
WITH cte AS (
SELECT ARRAY_CONSTRUCT(10, 20, 30) AS a, ARRAY_CONSTRUCT(9, 21, 32) AS b
), numbers AS (
SELECT ROW_NUMBER() OVER(ORDER BY seq4())-1 AS IND
FROM TABLE(GENERATOR(ROWCOUNT => 10001))
)
SELECT a,b
,ARRAY_AGG(LEAST(a[ind], b[ind])) WITHIN GROUP(ORDER BY n.ind) AS min_array
,ARRAY_AGG(GREATEST(a[ind], b[ind])) WITHIN GROUP(ORDER BY n.ind) AS max_array
FROM cte
JOIN numbers n
ON n.ind < GREATEST(ARRAY_SIZE(a), ARRAY_SIZE(b))
GROUP BY a,b;
Output:
A              | B             | MIN_ARRAY     | MAX_ARRAY
[ 10, 20, 30 ] | [ 9, 21, 32 ] | [ 9, 20, 30 ] | [ 10, 21, 32 ]
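For reference, the numbers CTE just materializes the indexes 0 through 10000; a quick sanity check of that building block alone (a minimal sketch, not part of the original answer):
SELECT ROW_NUMBER() OVER (ORDER BY seq4()) - 1 AS ind
FROM TABLE(GENERATOR(ROWCOUNT => 5));
-- IND: 0, 1, 2, 3, 4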
I want to create a query that returns a list of day/hour time slots when the following conditions are specified.
select day, hour
...
where
target_datetime between '2021-08-01 00:00:00' and '2021-08-01 23:59:59'
[
{ 'day': '2021-08-01', 'hour': 0 },
{ 'day': '2021-08-01', 'hour': 1 },
{ 'day': '2021-08-01', 'hour': 2 },
...
{ 'day': '2021-08-01', 'hour': 23 }
]
How can I get this?
Use generate_series to create the records from the time interval, and jsonb_build_object with jsonb_agg to create your JSON document:
SELECT
jsonb_agg(
jsonb_build_object(
'day',tm::date,
'hour',EXTRACT(HOUR FROM tm)))
FROM generate_series('2021-08-01 00:00:00'::timestamp,
'2021-08-01 23:59:59'::timestamp,
interval '1 hour') j (tm);
Demo: db<>fiddle
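If plain day/hour columns are preferred over a single JSON document (a minimal variant of the same generate_series idea, not part of the original answer):
SELECT tm::date AS day,
       EXTRACT(HOUR FROM tm) AS hour
FROM generate_series('2021-08-01 00:00:00'::timestamp,
                     '2021-08-01 23:59:59'::timestamp,
                     interval '1 hour') AS j(tm);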
I am using fullcalendar.io.
businessHours: [
{
daysOfWeek: [ 1, 2, 3],
startTime: '10:00',
endTime: '12:00',
},
{
daysOfWeek: [ 1, 2, 3],
startTime: '15:00',
endTime: '18:00',
}
]
I have one custom setting for a store on my website: they are only open from 10 AM to 12 PM and from 3 PM to 6 PM. I am using FullCalendar to show their hours, but when I apply this setting it is not working.
How can I show a grey background during the times that are not within business hours (i.e. between 12 PM and 3 PM)?
Similar questions have been asked here before:
Count items for a single key: jq count the number of items in json by a specific key
Calculate the sum of object values:
How do I sum the values in an array of maps in jq?
Question
How can one emulate the COUNT aggregate function so that it behaves like its SQL original? Let's extend this question even further to include other regular SQL functions:
COUNT
SUM / MAX/ MIN / AVG
ARRAY_AGG
The last one is not a standard SQL function - it's from PostgreSQL but is quite useful.
The input is a stream of valid JSON objects. For demonstration, let's pick a simple story of owners and their pets.
Model and data
Base relation: Owner
id name age
1 Adams 25
2 Baker 55
3 Clark 40
4 Davis 31
Base relation: Pet
id name litter owner_id
10 Bella 4 1
20 Lucy 2 1
30 Daisy 3 2
40 Molly 4 3
50 Lola 2 4
60 Sadie 4 4
70 Luna 3 4
Source
From the above we get a derived relation, Owner_Pet (the result of an SQL JOIN of the two relations), presented in JSON format as the source data for our jq queries:
{ "owner_id": 1, "owner": "Adams", "age": 25, "pet_id": 10, "pet": "Bella", "litter": 4 }
{ "owner_id": 1, "owner": "Adams", "age": 25, "pet_id": 20, "pet": "Lucy", "litter": 2 }
{ "owner_id": 2, "owner": "Baker", "age": 55, "pet_id": 30, "pet": "Daisy", "litter": 3 }
{ "owner_id": 3, "owner": "Clark", "age": 40, "pet_id": 40, "pet": "Molly", "litter": 4 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 50, "pet": "Lola", "litter": 2 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 60, "pet": "Sadie", "litter": 4 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pet_id": 70, "pet": "Luna", "litter": 3 }
Requests
Here are sample requests and their expected output:
COUNT the number of pets per owner:
{ "owner_id": 1, "owner": "Adams", "age": 25, "pets_count": 2 }
{ "owner_id": 2, "owner": "Baker", "age": 55, "pets_count": 1 }
{ "owner_id": 3, "owner": "Clark", "age": 40, "pets_count": 1 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pets_count": 3 }
SUM up the number of whelps per owner and get their MAX (MIN/AVG):
{ "owner_id": 1, "owner": "Adams", "age": 25, "litter_total": 6, "litter_max": 4 }
{ "owner_id": 2, "owner": "Baker", "age": 55, "litter_total": 3, "litter_max": 3 }
{ "owner_id": 3, "owner": "Clark", "age": 40, "litter_total": 4, "litter_max": 4 }
{ "owner_id": 4, "owner": "Davis", "age": 31, "litter_total": 9, "litter_max": 4 }
ARRAY_AGG pets per owner:
{ "owner_id": 1, "owner": "Adams", "age": 25, "pets": [ "Bella", "Lucy" ] }
{ "owner_id": 2, "owner": "Baker", "age": 55, "pets": [ "Daisy" ] }
{ "owner_id": 3, "owner": "Clark", "age": 40, "pets": [ "Molly" ] }
{ "owner_id": 4, "owner": "Davis", "age": 31, "pets": [ "Lola", "Sadie", "Luna" ] }
Here's an alternative using only basic jq, without any custom functions. (I took the liberty of dropping the redundant fields from the output.)
Count
In> jq -s 'group_by(.owner_id) | map({ owner_id: .[0].owner_id, count: map(.pet) | length})'
Out> [{"owner_id":1,"count":2}, ...]
Sum
In> jq -s 'group_by(.owner_id) | map({owner_id: .[0].owner_id, sum: map(.litter) | add})'
Out> [{"owner_id":1,"sum":6}, ...]
Max
In> jq -s 'group_by(.owner_id) | map({owner_id: .[0].owner_id, max: map(.litter) | max})'
Out> [{"owner_id":1,"max":4}, ...]
Aggregate
In> jq -s 'group_by(.owner_id) | map({owner_id: .[0].owner_id, agg: map(.pet) })'
Out> [{"owner_id":1,"agg":["Bella","Lucy"]}, ...]
Sure, these might not be the most efficient implementations, but they show nicely how to implement such functions oneself. All that changes between the different aggregations is the inside of the last map and the function after the pipe | (length, add, max).
The first map iterates over the groups, taking the owner_id from the first item, and uses map again to iterate over the items of the same group, as illustrated below. Not as pretty as SQL, but not terribly more complicated.
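To make the intermediate shape concrete, this is roughly what group_by(.owner_id) produces on the sample data before the outer map runs (abridged; source.data is the input file name used in the answer below):
jq -s 'group_by(.owner_id)' source.data
# [
#   [ {"owner_id": 1, "pet": "Bella", ...}, {"owner_id": 1, "pet": "Lucy", ...} ],
#   [ {"owner_id": 2, "pet": "Daisy", ...} ],
#   ...
# ]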
I learned jq today and managed to do this already, so this should be encouraging for anyone getting started. jq is neither like sed nor like SQL, but not terribly hard either.
Extended jq solution:
Custom count() function:
jq -sc 'def count($k): group_by(.[$k])[] | length as $l | .[0]
| .pets_count = $l
| del(.pet_id, .pet, .litter);
count("owner_id")' source.data
The output:
{"owner_id":1,"owner":"Adams","age":25,"pets_count":2}
{"owner_id":2,"owner":"Baker","age":55,"pets_count":1}
{"owner_id":3,"owner":"Clark","age":40,"pets_count":1}
{"owner_id":4,"owner":"Davis","age":31,"pets_count":3}
Custom sum() function:
jq -sc 'def sum($k): group_by(.[$k])[] | map(.litter) as $litters | .[0]
| . + {litter_total: $litters | add, litter_max: $litters | max}
| del(.pet_id, .pet, .litter);
sum("owner_id")' source.data
The output:
{"owner_id":1,"owner":"Adams","age":25,"litter_total":6,"litter_max":4}
{"owner_id":2,"owner":"Baker","age":55,"litter_total":3,"litter_max":3}
{"owner_id":3,"owner":"Clark","age":40,"litter_total":4,"litter_max":4}
{"owner_id":4,"owner":"Davis","age":31,"litter_total":9,"litter_max":4}
Custom array_agg() function:
jq -sc 'def array_agg($k): group_by(.[$k])[] | map(.pet) as $pets | .[0]
| .pets = $pets | del(.pet_id, .pet, .litter);
array_agg("owner_id")' source.data
The output:
{"owner_id":1,"owner":"Adams","age":25,"pets":["Bella","Lucy"]}
{"owner_id":2,"owner":"Baker","age":55,"pets":["Daisy"]}
{"owner_id":3,"owner":"Clark","age":40,"pets":["Molly"]}
{"owner_id":4,"owner":"Davis","age":31,"pets":["Lola","Sadie","Luna"]}
This is a nice exercise, but SO is not a programming service, so I will focus here on some key concepts for generic solutions in jq that are efficient, even for very large collections.
GROUPS_BY
The key to efficiency here is avoiding the built-in group_by, as it requires sorting. Since jq is fundamentally stream-oriented, the following definition of GROUPS_BY is likewise stream-oriented. It takes advantage of the efficiency of key-based lookups, while avoiding calling tojson on strings:
# emit a stream of the groups defined by f
def GROUPS_BY(stream; f):
  reduce stream as $x ({};
    ($x|f) as $s
    | ($s|type) as $t
    | (if $t == "string" then $s else ($s|tojson) end) as $y
    | .[$t][$y] += [$x] )
  | .[][] ;
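As a quick sanity check of GROUPS_BY in isolation (a minimal sketch, not part of the original answer; grouping the stream 1, 2, 1 by identity):
jq -n 'def GROUPS_BY(stream; f):
         reduce stream as $x ({};
           ($x|f) as $s
           | ($s|type) as $t
           | (if $t == "string" then $s else ($s|tojson) end) as $y
           | .[$t][$y] += [$x] )
         | .[][];
       [GROUPS_BY((1,2,1); .)]'
# => [[1,1],[2]]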
distinct and count_distinct
# Emit an array of the distinct entities in `stream`, without sorting
def distinct(stream):
  reduce stream as $x ({};
    ($x|type) as $t
    | (if $t == "string" then $x else ($x|tojson) end) as $y
    | if (.[$t] // {} | has($y)) then . else .[$t][$y] += [$x] end )
  | [.[][]] | add ;
# Emit the number of distinct items in the given stream
def count_distinct(stream):
  def sum(s): reduce s as $x (0; . + $x);
  reduce stream as $x ({};
    ($x|type) as $t
    | (if $t == "string" then $x else ($x|tojson) end) as $y
    | .[$t][$y] = 1 )
  | sum( .[][] ) ;
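As a quick check of these two definitions (my own examples, not from the original answer): distinct(1, 2, 1, "a") yields [1, 2, "a"], keeping first-appearance order within each type bucket, and count_distinct(1, 2, 1, "a") yields 3.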
Convenience function
def owner: {owner_id,owner,age};
Example: "COUNT the number of pets per owner"
GROUPS_BY(inputs; .owner_id)
| (.[0] | owner) + {pets_count: count_distinct(.[]|.pet_id)}
Invocation: jq -nc -f program1.jq input.json
Output:
{"owner_id":1,"owner":"Adams","age":25,"pets_count":2}
{"owner_id":2,"owner":"Baker","age":55,"pets_count":1}
{"owner_id":3,"owner":"Clark","age":40,"pets_count":1}
{"owner_id":4,"owner":"Davis","age":31,"pets_count":3}
Example: "SUM up the number of whelps per owner and get their MAX"
GROUPS_BY(inputs; .owner_id)
| (.[0] | owner)
+ {litter_total: (map(.litter) | add)}
+ {litter_max: (map(.litter) | max)}
Invocation: jq -nc -f program2.jq input.json
Output: as given.
Example: "ARRAY_AGG pets per owner"
GROUPS_BY(inputs; .owner_id)
| (.[0] | owner) + {pets: distinct(.[]|.pet)}
Invocation: jq -nc -f program3.jq input.json
Output:
{"owner_id":1,"owner":"Adams","age":25,"pets":["Bella","Lucy"]}
{"owner_id":2,"owner":"Baker","age":55,"pets":["Daisy"]}
{"owner_id":3,"owner":"Clark","age":40,"pets":["Molly"]}
{"owner_id":4,"owner":"Davis","age":31,"pets":["Lola","Sadie","Luna"]}