InfluxDB v1.8: subquery using MAX selector - sql

I'm using InfluxDB 1.8 and trying to write a query that is a little more complex than Influx was designed for.
I want to retrieve all data for the last month stored, based on tag and field values that my script writes (not the default "time" column that Influx creates). Say we have this infos measurement:
time                 field_month  field_week  tag_month  tag_week  some_data
1631668119209113500  8            1           8          1         random
1631668119209113500  8            2           8          2         random
1631668119209113500  8            3           8          3         random
1631668119209113500  9            1           9          1         random
1631668119209113500  9            1           9          1         random
1631668119209113500  9            2           9          2         random
Here 8 refers to August and 9 to September, and some_data is stored in a given week of that month.
I can use the MAX selector on field_month to get the last month of the year stored (I can't use the Flux date package because I'm on v1.8). I also want the data grouped by tag_month and tag_week so I can COUNT how many times some_data was stored in each week of the month; that's why the same data is repeated in field and tag keys. Something like this:
SELECT COUNT(field_month) FROM infos WHERE field_month = 9 GROUP BY tag_month, tag_week
Replacing 9 with the MAX selector:
SELECT COUNT(field_month) FROM infos WHERE field_month = (SELECT MAX(field_month) FROM infos) GROUP BY tag_month, tag_week
The first query works (see results here), but not the second.
Am I doing something wrong? Is there any other possibility to make this work in v1.8?
NOTE: I know Influx wasn't supposed to be used like that. I've tried and managed this easily with PostgreSQL, using an adapted form of the second query above. But while we straighten things up to use Postgres, we have to use InfluxDB v1.8.

In PostgreSQL you can try:
SELECT COUNT(field_month) FROM infos WHERE field_month =
(SELECT field_month FROM infos ORDER BY field_month DESC limit 1)
GROUP BY tag_month, tag_week;
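For InfluxDB 1.8 itself: as far as I know, InfluxQL accepts a subquery only in the FROM clause, not as a value inside WHERE, so the second query from the question cannot work as written. One workaround is to do the two steps on the client side: read the maximum month first, then substitute it into the counting query. A minimal Python sketch, assuming a hypothetical run_query helper that executes an InfluxQL string and returns the result rows as dictionaries (how it talks to the server is not shown), and assuming the selector's result column is named "max":

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def count_for_latest_month(run_query: Callable[[str], List[Row]]) -> List[Row]:
    """Run the MAX query first, then reuse its value in the COUNT query.

    run_query is a hypothetical helper: it takes an InfluxQL string and
    returns the result rows as dictionaries.
    """
    # Step 1: find the latest month stored in the field.
    max_rows = run_query("SELECT MAX(field_month) FROM infos")
    if not max_rows:
        return []  # empty measurement, nothing to count

    # Assumption: the selector's result column is named "max".
    latest_month = int(max_rows[0]["max"])

    # Step 2: put the literal value where the subquery was attempted.
    count_query = (
        "SELECT COUNT(field_month) FROM infos "
        f"WHERE field_month = {latest_month} "
        "GROUP BY tag_month, tag_week"
    )
    return run_query(count_query)
```

This costs one extra round trip, but both statements are plain InfluxQL of the kind that already works in the first query of the question.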

Related

Excel Powerpivot measure conundrum- Average (of average?)

I have a powerpivot table that shows work_tickets and timestamps for each step taken towards resolution:
Ticket | Step | Time | TicketDuration
--------------------------------------
1      | 1    | 5:30 | 15
1      | 2    | 5:33 | 15
1      | 3    | 5:45 | 15
2      | 1    | 6:00 | 10
2      | 2    | 6:05 | 10
2      | 3    | 6:10 | 10
[TicketDuration] is a calculated column I added myself. Now I'm trying to create a measure [AverageTicketDuration] that returns 12.5 minutes for the table above, i.e. (15+10)/2. I haven't got a clue how to use DAX to produce this result. Please help!
What you are looking for is the AVERAGEX function, which has the following definition: AVERAGEX(<table>, <expression>)
The idea is that it iterates through each row of the given table, applies your expression, then averages the results.
In the example below, I use Table1 as the table name.
To iterate over tickets, we use VALUES(Table1[Ticket]), which returns the unique values in the Ticket column.
Then, assuming that the ticket duration is always the same within a ticket ID, the aggregation used in the expression would be AVERAGE(Table1[TicketDuration]), since for ticket 1, for example, (15 + 15 + 15)/3 = 15. Wrapping it in CALCULATE makes the average be evaluated for the current ticket only; without it, the row context created by AVERAGEX does not filter Table1.
Put together, the measure would look like this:
measure:=AVERAGEX( VALUES( Table1[Ticket]), CALCULATE( AVERAGE(Table1[TicketDuration])))
The result when dropped into a pivot using your sample data.
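To make the "average of per-ticket averages" logic concrete outside of DAX, here is a small Python sketch of the same computation on the sample rows from the question. The function name and the tuple layout are only illustrative choices, not anything from Power Pivot.

```python
from collections import defaultdict
from statistics import mean

# Sample rows from the question: (ticket, step, ticket_duration_minutes)
rows = [
    (1, 1, 15), (1, 2, 15), (1, 3, 15),
    (2, 1, 10), (2, 2, 10), (2, 3, 10),
]


def average_ticket_duration(rows):
    """Average the duration per ticket first, then average those results."""
    durations_by_ticket = defaultdict(list)
    for ticket, _step, duration in rows:
        durations_by_ticket[ticket].append(duration)
    # One value per distinct ticket, like iterating over the unique tickets.
    per_ticket = [mean(values) for values in durations_by_ticket.values()]
    return mean(per_ticket)


print(average_ticket_duration(rows))  # 12.5
```

Because each ticket contributes exactly one value to the outer average, a ticket with many steps does not weigh more than a ticket with few steps.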

How can I get an incremental counter with SQL?

Can you help me with a SQL query to get the desired result?
Database used: Redshift
The requirement is:
I have 3 columns: dish_id, category_id, counter
I want counter to increase by 1 each time a dish_id is repeated; otherwise it should remain 1.
The query should read the source table and return results like this:
dish_id category_id counter
21 4 1
21 6 2
21 6 3
12 1 1
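Unless I misunderstood your question, you can accomplish that using window functions (the ORDER BY makes the numbering within each dish_id deterministic):
SELECT dish_id, category_id, ROW_NUMBER() OVER (PARTITION BY dish_id ORDER BY category_id) AS counter FROM my_table;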
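For readers less familiar with window functions, numbering rows within a partition boils down to keeping one running counter per dish_id. A minimal Python sketch of that idea on the sample rows, assuming the rows already arrive in the order in which they should be numbered (the function name is just illustrative):

```python
from collections import defaultdict


def add_counter(rows):
    """Append a per-dish_id running counter to each (dish_id, category_id) row."""
    seen = defaultdict(int)  # dish_id -> how many times it has appeared so far
    numbered = []
    for dish_id, category_id in rows:
        seen[dish_id] += 1
        numbered.append((dish_id, category_id, seen[dish_id]))
    return numbered


rows = [(21, 4), (21, 6), (21, 6), (12, 1)]
for row in add_counter(rows):
    print(row)
```

Each dish_id starts at 1 and increases only when the same dish_id shows up again, which is the counter described in the question.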

subtract every next column value from previous?

I have a dataset where each row's data has somehow been added on top of the previous row's data, and that for every column. That means
the row with ID 1 is the original, pure data, but the row with e.g. ID 10 has the data from the previous 9 rows added to it.
What I want now is to get the original, pure data for every distinct item, i.e. for every ID. How can I subtract the accumulated data from, let's say, ID 10? I would have to subtract the values of the previous row (ID 9), and so on.
I want to do this either in SQL Server or in RapidMiner, since I am working with those tools. Any ideas?
Here is a sample:
ID col1 col2 col3
1 12 2 3
2 15 5 5
3 20 8 8
So the real, correct data for the item with ID 3 is not 20, 8, 8; it is (20-15), (8-5), (8-5), so 5, 3, 3.
Subtract from each row the row before it, for every item except the first:
1 12 2 3
Try the Lag Series operator; it should do the job. To get this operator you need to install the Series extension from the RM Marketplace.
This operator copies the selected attributes and shifts every row of the example set by one position, so the row with ID 1 gets a copy at ID 2, and so on (you can also specify the lag value). Afterwards you can subtract one value from the other with Generate Attributes.
I think lag() is the answer to your question:
select id,
       (case when id = 1 then col
             else col - lag(col) over (order by id)
        end) as col_original
from your_table;  -- your_table and col are placeholders for the real names
However, sample data would clarify the question.
Within RapidMiner there is the Differentiate operator contained in the Series extension (which is not installed by default and needs to be downloaded from the RapidMiner Marketplace). This can be used to calculate differences between attributes in adjacent examples.
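Both the lag() approach and the Differentiate operator compute the difference between adjacent rows. A short Python sketch of the same step on the sample data from the question, assuming the rows are already sorted by ID (the function name is illustrative):

```python
def undo_accumulation(rows):
    """Turn accumulated rows back into per-row values.

    rows: list of (id, col1, col2, ...) tuples, sorted by id.
    The first row is kept as is; every later row has the previous
    row's values subtracted column by column.
    """
    result = []
    previous = None
    for row in rows:
        row_id, values = row[0], row[1:]
        if previous is None:
            original = values
        else:
            original = tuple(cur - prev for cur, prev in zip(values, previous))
        result.append((row_id, *original))
        previous = values  # keep the accumulated values for the next row
    return result


sample = [(1, 12, 2, 3), (2, 15, 5, 5), (3, 20, 8, 8)]
for row in undo_accumulation(sample):
    print(row)
```

For ID 3 this gives 5, 3, 3, matching the values worked out by hand in the question.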

Group by time span in rails

I want to get output from the users table based on the creation time of each record. The time is stored in the created_at column.
The output should look like this:
Time user count
2 am - 6 am 10
6 am - 10 am 5
10 am - 2 pm 5
2 pm - 6 pm 5
6 pm - 10 pm 5
10 pm - 2 am 5
I can't group by created_at directly. The solution I found is to create another column, say time_span, and set it to 2 am - 6 am if the created_at time falls in that span; then I can group by the time_span column. Is there a better solution?
My suggestion is to create another column in the database; this way you avoid calculations in selects at the expense of one simple column.
I'm not sure what you mean by not being able to use group_by, but the following will work:
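# Hour of day (0-23) for every user; assumes the model is named User
hours = User.all.collect {|u| u.created_at.hour}
ranges = [(2...6), (6...10), (10...14), (14...18), (18...22), (22...26)]
# Hours 0 and 1 are shifted to 24 and 25 so they fall into 22...26 (10 pm - 2 am)
summary = hours.group_by {|h| ranges.find {|r| r === (h<2 ? h+24 : h)}}
ranges.each {|r| puts "#{r} = #{(summary[r] || []).length}"}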
There are probably opportunities to simplify this and you could push the grouping up into the database if you'd like, but I thought I'd go ahead and share.
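The bucketing itself can also be done arithmetically instead of searching a list of ranges: shift the hour so that 2 am becomes 0, wrap around midnight, and divide by the span length. A minimal Python sketch of that idea, with illustrative names and the labels taken from the question:

```python
from collections import Counter

# Labels in the order used in the question; each span is 4 hours long.
SPAN_LABELS = [
    "2 am - 6 am", "6 am - 10 am", "10 am - 2 pm",
    "2 pm - 6 pm", "6 pm - 10 pm", "10 pm - 2 am",
]


def span_label(hour):
    """Map an hour of day (0-23) to its 4-hour span, starting at 2 am."""
    # (hour - 2) % 24 moves 2 am to 0 and wraps 0 and 1 around to 22 and 23.
    return SPAN_LABELS[((hour - 2) % 24) // 4]


def count_by_span(hours):
    """Return (label, count) pairs for every span, including empty ones."""
    counts = Counter(span_label(h) for h in hours)
    return [(label, counts[label]) for label in SPAN_LABELS]


for label, count in count_by_span([2, 5, 6, 13, 23, 0, 1]):
    print(label, count)
```

The same expression could be pushed into the database as a computed value to group on, which avoids both the extra column and loading every record into memory.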

MDX query to count number of rows that match a certain condition (newest row for each question, client group)

I have the following fact table:
response_history_id client_id question_id answer
1 1 2 24
2 1 2 27
3 1 3 12
4 1 2 43
5 2 2 39
It holds history of client answers to some questions. The largest response_history_id for each client_id,question_id combination is the latest answer for that question and client.
What I want to do is to count the number of clients whose latest answer falls within a specific range
I have some dimensions:
question associated with question_id
client associated with client_id
response_history_id associated with response_history_id
range associated with answer: 0-20 = low, 20-40 = medium, >40 = high
and some measures:
max_history_id as max(response_history_id)
clients_count as distinct count(client_id)
Now, I want to group only the latest answers by range:
select
[ranges].members on 0,
{[Measures].[clients_count]} on 1
from (select [question].[All].[2] on 1 from [Cube])
What I get is:
Measures All low medium high
clients_count 2 0 2 1
But what I wanted (and I can't get) is the calculation based on the latest answer:
Measures All low medium high
clients_count 2 0 1 1
I understand why my query doesn't give me the desired result, it's more for demonstration purpose. But I have tried a lot of more complex MDX queries and still couldn't get any good result.
Also, I can't generate a static view from my fact table, because later on I would like to limit the search by another column in the fact table, a timestamp. My queries must eventually be able to get the number of clients whose latest answer to a question before a given timestamp falls within a specific range.
Can anyone help me with this please?
I can define other dimensions and measures and I am using iccube.
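To pin down the expected numbers independently of MDX, here is a small Python sketch of the computation described above: keep only the row with the largest response_history_id per client for the chosen question, then count clients per range. The boundary handling (below 20 is low, up to and including 40 is medium) is an assumption, since the ranges in the question overlap at 20 and 40, and the function names are illustrative.

```python
def range_label(answer):
    """Classify an answer; boundary handling at 20 and 40 is an assumption."""
    if answer < 20:
        return "low"
    if answer <= 40:
        return "medium"
    return "high"


def latest_answer_counts(facts, question_id):
    """Count clients per range, using only each client's latest answer.

    facts: iterable of (response_history_id, client_id, question_id, answer).
    """
    latest = {}  # client_id -> (response_history_id, answer)
    for response_history_id, client_id, q_id, answer in facts:
        if q_id != question_id:
            continue
        if client_id not in latest or response_history_id > latest[client_id][0]:
            latest[client_id] = (response_history_id, answer)

    counts = {"low": 0, "medium": 0, "high": 0}
    for _history_id, answer in latest.values():
        counts[range_label(answer)] += 1
    return counts


facts = [
    (1, 1, 2, 24),
    (2, 1, 2, 27),
    (3, 1, 3, 12),
    (4, 1, 2, 43),
    (5, 2, 2, 39),
]
print(latest_answer_counts(facts, question_id=2))
```

For question 2 this yields 0 low, 1 medium and 1 high, which is the second result table above.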