Get average of rows group by value intervals - sql

I have a table as follows:
ID | Value
1 5
1 1000
1 1500
2 1000
2 1800
3 40
3 1000
3 1200
3 2000
3 2500
I want to obtain the average of Value for each ID, grouped by a given range r of Value. For instance, if r=1000 in this case, the expected result would be:
ID | Value
1 5
1 1250
2 1400
3 40
3 1100
3 2250
I have seen that this can be done with time intervals. My question is: how can I perform this type of GROUP BY operation for integer/float types?

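You could try it this way. FLOOR puts each value in the bucket [k*1000, (k+1)*1000), so 1500 and 1800 stay with 1000; ROUND would push them up into the 2000 bucket:
SELECT id, AVG(value) AS AvgValue
FROM (SELECT id, value, FLOOR(value / 1000) AS bucket FROM yourtable) t
GROUP BY id, bucket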
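A minimal sketch of the bucketing idea, run against an in-memory SQLite database from Python. It assumes the sample rows from the question, non-negative integer values, and buckets that are half-open intervals of width r; the column alias `bucket` and the variable names are illustrative only.

```python
import sqlite3

# Sample rows from the question: (id, value).
rows = [(1, 5), (1, 1000), (1, 1500), (2, 1000), (2, 1800),
        (3, 40), (3, 1000), (3, 1200), (3, 2000), (3, 2500)]
r = 1000  # bucket width (the "range r" from the question)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?)", rows)

# In SQLite, INTEGER / INTEGER truncates, so value / r is the bucket
# index for non-negative values (0 for 0..999, 1 for 1000..1999, ...).
query = """
    SELECT id, AVG(value) AS avg_value
    FROM (SELECT id, value, value / ? AS bucket FROM yourtable) t
    GROUP BY id, bucket
    ORDER BY id, bucket
"""
result = conn.execute(query, (r,)).fetchall()
for row in result:
    print(row)
```

On engines where `/` yields a decimal result, the bucket expression would need an explicit floor instead of relying on integer division.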

Related

Average of Rows Based On Another Columns Values

I have a table like this. I'm looking for a clean way in SQL to create a new column with the average between the Column 2 values for the rows where Column 1 equals 1 and 2 for each id.
I have some ideas on gross ways to do this, but I am looking for a straightforward solution since this seems like it should not be too difficult.
ID | Column 1 | Column 2
1 1 100
1 2 75
1 3 50
2 1 45
2 2 90
2 3 60
Use avg as a window function with a FILTER clause:
select *,
avg(column2)
filter (where column1 in (1,2))
over (partition by id) as avrg
from the_table;
id | column1 | column2 | avrg
1 1 100 87.50
1 2 75 87.50
1 3 50 87.50
2 1 45 67.50
2 2 90 67.50
2 3 60 67.50
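For engines without FILTER on window aggregates, the same result can be reasoned about as two steps: average Column 2 per id over the rows where Column 1 is 1 or 2, then attach that average to every row of the id. A rough pure-Python sketch of that logic, using the sample rows from the question (variable names are illustrative):

```python
from collections import defaultdict

# Sample rows from the question: (id, column1, column2).
rows = [(1, 1, 100), (1, 2, 75), (1, 3, 50),
        (2, 1, 45), (2, 2, 90), (2, 3, 60)]
wanted = {1, 2}  # column1 values that take part in the average

# Step 1: collect column2 per id, keeping only the wanted column1 rows.
selected = defaultdict(list)
for id_, c1, c2 in rows:
    if c1 in wanted:
        selected[id_].append(c2)

# Step 2: one average per id, then attach it to every row of that id.
averages = {id_: sum(vals) / len(vals) for id_, vals in selected.items()}
result = [(id_, c1, c2, averages.get(id_)) for id_, c1, c2 in rows]

for row in result:
    print(row)
```

An id with no qualifying rows gets `None` here, which mirrors the NULL an SQL average over an empty set would give.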

Group by and provide groups only if unique in group

I have the following dataset:
Amount Document Number
0 200 12345
1 90 2222
2 200 456789
3 90 4444
4 300 4789
5 300 4789
So basically I want to assign group numbers to the data above (maybe using ngroup), grouping on Amount.
A group should get a number only if all Document Numbers in that group are unique.
This is the outcome I would like:
Amount Document Number Group
0 200 12345 1
1 90 2222 2
2 200 456789 1
3 90 4444 2
4 300 4789
5 300 4789
In other words: group the data by Amount, and assign a group number only when every Document Number in that group is unique.
I think you want rank():
select t.*, rank() over (order by amount, document_number) as grouping
from t;
In pandas, you could first build a mask that is False for every row whose Amount group contains a duplicate, using duplicated and groupby.transform, and then use this mask with groupby.ngroup:
mask_dup = ~(df.duplicated().groupby(df['Amount']).transform(any))
df.loc[mask_dup, 'Group'] = df[mask_dup].groupby('Amount').ngroup()+1
print (df)
Amount Document Number Group
0 200 12345 2.0
1 90 2222 1.0
2 200 456789 2.0
3 90 4444 1.0
4 300 4789 NaN
5 300 4789 NaN
If your frame has more than these two columns, you need to specify the subset in duplicated.
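For comparison, a dependency-free sketch of the same rule. It differs from the pandas output above in one deliberate way: groups are numbered in order of first appearance (200 is group 1, 90 is group 2), as in the desired outcome in the question, rather than by sorted Amount. All names are illustrative.

```python
from collections import defaultdict

# Sample rows from the question: (amount, document_number).
rows = [(200, 12345), (90, 2222), (200, 456789),
        (90, 4444), (300, 4789), (300, 4789)]

# Collect the document numbers seen for each amount.
docs_by_amount = defaultdict(list)
for amount, doc in rows:
    docs_by_amount[amount].append(doc)

# Number an amount only if all of its document numbers are distinct.
# Dicts preserve insertion order, so numbering follows first appearance.
group_ids = {}
for amount, docs in docs_by_amount.items():
    if len(set(docs)) == len(docs):
        group_ids[amount] = len(group_ids) + 1

# None marks rows whose amount group contains a repeated document number.
groups = [group_ids.get(amount) for amount, _ in rows]
print(groups)
```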

re-indexing duplicate rows

Hi, I have the table below:
ID length
1 1050
1 1000
1 900
1 600
2 545
2 434
3 45
3 7
4 5
I need SQL code to produce the table below:
ID IDK length
1 1 1050
1 2 1000
1 3 900
1 4 600
2 1 545
2 2 434
3 1 45
3 2 7
4 1 5
IDK is a new column that numbers the rows within each ID, ordered by length from largest to smallest (as in the table above).
Thank you very much.
This is a pain in MS Access. Here is one way using a correlated subquery:
select t.*,
(select count(*)
from foo as t2
where t2.id = t.id and t2.length >= t.length
) as idk
from foo as t;
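The correlated subquery is plain SQL, so it can be tried outside MS Access. Below is a small sketch that runs it in an in-memory SQLite database from Python, assuming the table is called `foo` as in the answer and holds the sample rows from the question:

```python
import sqlite3

# Sample rows from the question: (id, length).
rows = [(1, 1050), (1, 1000), (1, 900), (1, 600),
        (2, 545), (2, 434), (3, 45), (3, 7), (4, 5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, length INTEGER)")
conn.executemany("INSERT INTO foo VALUES (?, ?)", rows)

# For each row, count the rows of the same id whose length is at least
# as large: the longest row gets 1, the next one 2, and so on.
query = """
    SELECT t.id,
           (SELECT COUNT(*)
            FROM foo AS t2
            WHERE t2.id = t.id AND t2.length >= t.length) AS idk,
           t.length
    FROM foo AS t
    ORDER BY t.id, idk
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that two rows with the same id and the same length would both receive the larger count, so this numbering is only gap-free when lengths are distinct within an id.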

Possible to group by counts?

I am trying to change something like this:
Index Record Time
1 10 100
1 10 200
1 10 300
1 10 400
1 3 500
1 10 600
1 10 700
2 10 800
2 10 900
2 10 1000
3 5 1100
3 5 1200
3 5 1300
into this:
Index CountSeq Record LastTime
1 4 10 400
1 1 3 500
1 2 10 700
2 3 10 1000
3 3 5 1300
I am trying to apply this logic per unique index -- I just included three indexes to show the outcome.
So for a given index I want to combine rows by streaks of the same Record. Notice that the first four entries for Index 1 all have Record 10, so it is more succinct to say that there were 4 entries with Record 10, ending at time 400. Then I repeat the process going forward, in sequence.
In short I am trying to perform a count-grouping over sequential chunks of the same Record, within each index. In other words I am NOT looking for this:
select index, count(*) as countseq, record, max(time) as lasttime
from Table1
group by index,record
Which combines everything by the same record whereas I want them to be separated by sequence breaks.
Is there a way to do this in SQL?
It's hard to solve your problem without a single primary key, so I'll assume you have a primary key column that increases with each row (primkey). The following query returns the same table with an extra 'diff' column that is 1 if the previous row (by primkey) has the same index and record as the current one, and 0 otherwise:
SELECT *,
IF((SELECT index, record FROM yourTable p2 WHERE p1.primkey = p2.primkey)
= (SELECT index, record FROM yourTable p2 WHERE p1.primkey-1 = p2.primkey), 1, 0) as diff
FROM yourTable p1
If you instead keep a temporary variable that is incremented each time the IF expression is false, 'diff' becomes a running group number and you would get a result like this:
primkey Index Record Time diff
1 1 10 100 1
2 1 10 200 1
3 1 10 300 1
4 1 10 400 1
5 1 3 500 2
6 1 10 600 3
7 1 10 700 3
8 2 10 800 4
9 2 10 900 4
10 2 10 1000 4
11 3 5 1100 5
12 3 5 1200 5
13 3 5 1300 5
That would solve your problem: you would just add 'diff' to the GROUP BY clause.
Unfortunately I can't test it on sqlite, but you should be able to use variables like this.
It's probably a dirty workaround but I couldn't find any better way, hope it helps.
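The streak logic itself (start a new group whenever Index or Record changes from one row to the next) can be sketched outside SQL. The example below uses `itertools.groupby`, which groups only consecutive equal keys, and assumes the rows are already ordered by Time as in the question:

```python
from itertools import groupby

# Sample rows from the question, ordered by time: (index, record, time).
rows = [(1, 10, 100), (1, 10, 200), (1, 10, 300), (1, 10, 400),
        (1, 3, 500), (1, 10, 600), (1, 10, 700),
        (2, 10, 800), (2, 10, 900), (2, 10, 1000),
        (3, 5, 1100), (3, 5, 1200), (3, 5, 1300)]

# groupby starts a new group every time the key changes, so each group
# is one uninterrupted streak of the same (index, record) pair.
result = []
for (index, record), streak in groupby(rows, key=lambda r: (r[0], r[1])):
    streak = list(streak)
    last_time = streak[-1][2]  # rows are time-ordered, so the last is the max
    result.append((index, len(streak), record, last_time))

for row in result:
    print(row)  # (Index, CountSeq, Record, LastTime)
```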

Sum operation performed on rows till specified value: a new row for each group for which the sum exceeds the specified value

CREATE TABLE TEMP(RESOURCE_VALUE VARCHAR2(63 BYTE),TOT_COUNT NUMBER)
I want a query that splits the rows into consecutive ranges of RESOURCE_VALUE such that the sum of TOT_COUNT in each range does not exceed a given value. Say the limit is 50,000: the query then has to display every range, from which RESOURCE_VALUE to which RESOURCE_VALUE, whose sum is <= 50,000. Each RESOURCE_VALUE can be included in only one range.
Example input:
resource_value | tot_count
---------------+----------
1 100
2 50
3 20
4 30
5 300
6 250
7 200
8 30
9 60
10 200
11 110
12 120
Then the output has to be something like this:
Sample output 1, when sum(tot_count) <= 300:
start resource_value | end resource_value | sum
---------------------+--------------------+-----
1 4 300
5 5 300
6 6 250
7 9 290
10 10 200
11 12 230
Sample output 2, when sum(tot_count) <= 500:
start resource_value | end resource_value | sum
---------------------+--------------------+-----
1 4 300
5 5 300
6 8 480
9 12 490
I am guessing from your table structure that you use Oracle. In Oracle you can use this query to get what you want (here with a limit of 300):
with vw1(val,flg,sumval) as
(select 1 val,0 flg,TOT_COUNT sumval
from TEMP where RESOURCE_VALUE = '1'
union all
select vw1.val + 1 val,
case when vw1.sumval + t1.TOT_COUNT > 300 then vw1.flg + 1 else vw1.flg end flg,
case when vw1.sumval + t1.TOT_COUNT > 300 then t1.TOT_COUNT else vw1.sumval + t1.TOT_COUNT end sumval
From TEMP t1,vw1 WHERE t1.RESOURCE_VALUE = TO_CHAR(vw1.val + 1))
select min(val) START_RESOURCE_VALUE,max(val) END_RESOURCE_VALUE,
max(sumval) "SUM" from vw1 group by flg order by min(val);
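The recursive query walks the rows in order, keeps a running sum, and starts a new range whenever adding the next row would exceed the limit. A rough Python sketch of that greedy rule follows; `split_ranges` is a hypothetical helper name, and the sums are computed directly from the sample input, so they may differ from the hand-written sample outputs (rows 1 to 4 add up to 200, for example).

```python
def split_ranges(rows, limit):
    """Greedily split (resource_value, tot_count) rows into consecutive
    ranges whose running sum stays at or below limit."""
    ranges = []
    start = end = None
    total = 0
    for value, count in rows:
        # Close the current range if this row would push it over the limit.
        if start is not None and total + count > limit:
            ranges.append((start, end, total))
            start, total = None, 0
        if start is None:
            start = value
        end = value
        total += count
    if start is not None:
        ranges.append((start, end, total))
    return ranges


# Sample input from the question: (resource_value, tot_count).
rows = [(1, 100), (2, 50), (3, 20), (4, 30), (5, 300), (6, 250),
        (7, 200), (8, 30), (9, 60), (10, 200), (11, 110), (12, 120)]

for start, end, total in split_ranges(rows, 300):
    print(start, end, total)
```

A single row whose count already exceeds the limit ends up in a range of its own in this sketch; whether that is acceptable depends on the requirements.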