Need query to find rows that may have 4, 5 or 6 consecutive numbers - sql

Trying to come up with a query that will find rows that contain 4, 5 or 6 consecutive numbers.
For example: Table MyNumbers contains 6 columns of number combinations from 1 to 52.
Column names are: nbr1 nbr2 nbr3 nbr4 nbr5 nbr6
Row one contains: 1 5 43 50 51 52
Row two contains: 41 42 43 44 45 52 <----- five consecutive numbers
Row three contains: 8 14 38 39 42 50
Row four contains: 1 2 3 4 15 29 <----- four consecutive numbers
Row five contains: 8 14 24 36 48 51
Row six contains: 1 2 3 4 5 6 <----- six consecutive numbers
Need to come up with a query that would find rows 2, 4 and 6 based on containing a result set where there were 4 or more consecutive numbers in that row.
I created a database that contains all possible combinations of 6 numbers out of 52 (1 to 52). What I would like to do is eliminate rows that have four or more consecutive numbers, so I am not sure the above would do the trick. For those who asked, I am using SQL Server 2008 R2.

Assuming the numbers in each row are always increasing and never repeat:
select *
from mynumbers
where nbr4 - nbr1 = 3
or nbr5 - nbr2 = 3
or nbr6 - nbr3 = 3
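I took the liberty of simplifying: any run of 5 or 6 consecutive numbers already contains a run of 4 consecutive numbers, so testing for a run of 4 is enough. Because the values are strictly increasing, a difference of exactly 3 between two columns that are three positions apart forces the two columns in between to hold the two intermediate values.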
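The difference test used in the query can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function name has_run_of_four is hypothetical, and it assumes each row is already sorted in strictly increasing order, as the answer does.

```python
def has_run_of_four(row):
    """Return True if a strictly increasing row contains 4 consecutive integers."""
    # For strictly increasing values, row[i + 3] - row[i] == 3 forces the two
    # values in between to be row[i] + 1 and row[i] + 2.
    return any(row[i + 3] - row[i] == 3 for i in range(len(row) - 3))


# The six example rows from the question.
rows = [
    (1, 5, 43, 50, 51, 52),
    (41, 42, 43, 44, 45, 52),
    (8, 14, 38, 39, 42, 50),
    (1, 2, 3, 4, 15, 29),
    (8, 14, 24, 36, 48, 51),
    (1, 2, 3, 4, 5, 6),
]

# Row numbers (1-based) that contain a run of four or more.
flagged = [n for n, row in enumerate(rows, start=1) if has_run_of_four(row)]
print(flagged)  # [2, 4, 6]
```

To eliminate those rows instead of finding them, the same condition would be negated.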

Related

Group rows using the cumulative sum of a third column

I have a table with two columns:
sort_column = A column I use for sorting
value_column = My metric of interest (a positive integer)
Using SQL, I need to create contiguous groups of rows, ordered by sort_column, such that the sum of value_column within each group is the largest possible but staying below 100 (100 not included).
Find below an example of my desired result.
Thanks
sort_column value_column desired_result
1 53 1
2 25 1
3 33 2
4 25 2
5 10 2
6 46 3
7 9 3
8 49 4
9 48 4
10 53 5
11 33 5
12 52 6
13 29 6
14 16 6
15 66 7
16 1 7
17 62 8
18 57 9
19 47 10
20 12 10
OK, so after a few lengthy attempts, I came to the conclusion that the task is impossible with pure SQL. A given value of the desired column depends on previous values of that same column, in a way that cannot be derived from the first two columns alone. The problem therefore cannot be tackled without a recursive CTE, which BigQuery does not support.
I solved the issue by writing a JavaScript UDF for the task. It seems to be working fine and produces the expected results.
Many thanks everyone!
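The author's UDF is not shown, so the following is only a minimal Python sketch of the sequential logic described in the question, not the actual JavaScript used. The function name assign_groups and the limit parameter are hypothetical. It walks the rows in sort_column order and starts a new group whenever adding the next value would bring the running sum to 100 or more.

```python
def assign_groups(values, limit=100):
    """Greedily group consecutive values so each group's sum stays below limit."""
    groups = []
    group_id = 1
    running = 0
    for value in values:
        # Open a new group only if the current one is non-empty and the
        # next value would push its sum to the limit or beyond.
        if running and running + value >= limit:
            group_id += 1
            running = 0
        running += value
        groups.append(group_id)
    return groups


# value_column from the question, already ordered by sort_column.
value_column = [53, 25, 33, 25, 10, 46, 9, 49, 48, 53,
                33, 52, 29, 16, 66, 1, 62, 57, 47, 12]
print(assign_groups(value_column))
```

Each group ID depends on the running sum left over from the previous rows, which is why a plain window function over the first two columns cannot reproduce it.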

iterrows() of 2 columns and save results in one column

In my data frame I want to use iterrows() over two columns but save the result in one column. For example, df is:
x y
5 10
30 445
70 32
expected output is
points sequence
5 1
10 2
30 1
445 2
I know about iterrows(), but it saves the output in two different columns. How can I get the expected output, and is there any way to generate the sequence number according to a condition? Any help will be appreciated.
First, never use iterrows, because it is really slow.
If you want a 1, 2 sequence based on the number of columns, convert the values to a NumPy array with DataFrame.to_numpy, flatten it with numpy.ravel, then build the sequence with numpy.tile:
import numpy as np
import pandas as pd
df = pd.DataFrame({'points': df.to_numpy().ravel(),
                   'sequence': np.tile([1,2], len(df))})
print(df)
points sequence
0 5 1
1 10 2
2 30 1
3 445 2
4 70 1
5 32 2
Do it this way:
>>> pd.DataFrame([i[1] for i in df.iterrows()])
points sequence
0 5 1
1 10 2
2 30 1
3 445 2

Create new ID based on cumulative sum in excel vba

I need to create a new transport ID based on the cumulative sum of the volume being transported. Let's say that originally everything was transported in truck A with a capacity of 25. Now I want to assign these items to shipments with truck B (capacity 15).
The only real constraint is that the amount shipped cannot exceed capacity.
I can't post a picture because of the restrictions, but the overall setup would be like this:
Old Trans # Volume New Trans # Cumulative Volume for Trans
1 1
1 9
1 3
1 7
1 4
2 9
2 10
3 8
3 5
3 9
4 4
4 6
4 8
5 9
5 1
5 5
5 8
6 3
6 4
6 3
6 4
6 4
6 7
7 7
7 10
7 4
8 10
8 6
8 7
9 4
9 9
9 6
10 7
10 4
10 1
10 1
10 5
10 2
11 9
11 3
11 9
12 8
12 5
12 9
13 9
Expected output would be that the first three entries would result in a new shipment ID of 1; the next two entries would result in a new shipment ID of 2; and so on. I've tried everything that I know (excluding VBA): INDEX/LOOKUP/IF functions. My VBA skills are very limited, though. Any tips? Thanks!
I think I see what you're trying to do here. You can do it with just an IF formula (and a new column to keep track of the running total):
In columns C and D, insert these formulas in row 3 and copy down (changing 15 to whatever you want your new volume capacity to be):
Column C: =IF(B3+C2<15,B3+C2,B3)
Column D: =IF(B3+C2<15,D2,D2+1)
And for the cells C2 and D2:
C2: = B2
D2: = A2
Is this what you're looking to do?
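For readers who want to check the two helper columns outside Excel, here is a minimal Python sketch of the same running-total idea. It is not the answerer's code; assign_shipments and its parameters are hypothetical names. One assumption differs from the formulas above: the sketch lets a shipment reach the capacity exactly (the question only says the amount cannot exceed capacity), whereas the formulas use a strict < comparison and would start a new shipment at exactly 15.

```python
def assign_shipments(volumes, capacity=15):
    """Return (new_shipment_id, cumulative_volume) for each volume, in order."""
    result = []
    shipment_id = 1
    load = 0
    for volume in volumes:
        # Start a new shipment when the next item would overflow the truck.
        if load > 0 and load + volume > capacity:
            shipment_id += 1
            load = 0
        load += volume
        result.append((shipment_id, load))
    return result


# First seven volumes from the question's table.
for new_id, cumulative in assign_shipments([1, 9, 3, 7, 4, 9, 10]):
    print(new_id, cumulative)
```

The first element of each pair corresponds to column D (new transport ID) and the second to column C (cumulative volume for that transport).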
A simple formula could be written that 'floats' the range totals for each successive load ID.
In the following, I've typed 25 and 15 in D1:E1 and used a custom number format of I\D 0. In this way, the column is identified and the cell can be referenced as a true number load limit. You can hard-code the limits into the formula if you prefer by overwriting D$1, but you will not have a one-size-fits-all formula that can be copied right for alternate load limits as I have in my example.
      
The formula in D2 is,
=IF(ROW()=2, 1, (SUM(INDEX($B:$B, MATCH(D1, D1:D$1, 0)):$B2)>D$1)+ D1)
Fill right to E2 then down as necessary.

PowerPivot formula for row wise weighted average

I have a table in PowerPivot which contains the logged data of a traffic control camera mounted on a road. This table is filled with the velocity and the number of vehicles that pass this camera during a specific time (e.g. 14:10 - 15:25). Now I want to know how I can get the average velocity of cars for a specific hour and list them in a separate table with 24 rows (hour 0 - 23), where the second column of each row is the weighted average velocity of that hour. A sample of my stat_table data is given below:
count vel hour
----- --- ----
133 96.00237 15
117 91.45705 21
81 81.90521 6
2 84.29946 21
4 77.7841 18
1 140.8766 17
2 56.14951 14
6 71.72839 13
4 64.14309 9
1 60.949 17
1 77.00728 21
133 100.3956 6
109 100.8567 15
54 86.6369 9
1 83.96901 17
10 114.6556 21
6 85.39127 18
1 76.77993 15
3 113.3561 2
3 94.48055 2
In a separate PowerPivot table I have 24 rows and 2 columns, but when I enter my formula, every row gets updated with the same number. My formula is:
=sumX(FILTER(stat_table, stat_table[hour]=[hour]), stat_table[count] * stat_table[vel])/sumX(FILTER(stat_table, stat_table[hour]=[hour]), stat_table[count])
Create a new calculated column named "WeightedVelocity" as follows
WeightedVelocity = [count]*[vel]
Create a measure "WeightedAverage" as follows
WeightedAverage = sum(stat_table[WeightedVelocity]) / sum(stat_table[count])
Use measure "WeightedAverage" in VALUES area of pivot Table and use "hour" column in ROWS to get desired result.
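The arithmetic behind the measure is the sum of count * vel divided by the sum of count, evaluated separately for each hour. A minimal Python sketch of that calculation follows; it is an illustration only, not DAX. The function name weighted_average_by_hour is hypothetical, and the three sample rows are made up for the example and are not taken from stat_table.

```python
from collections import defaultdict


def weighted_average_by_hour(rows):
    """rows: iterable of (count, vel, hour). Returns {hour: weighted average vel}."""
    weighted = defaultdict(float)  # sum of count * vel per hour
    counts = defaultdict(int)      # sum of count per hour
    for count, vel, hour in rows:
        weighted[hour] += count * vel
        counts[hour] += count
    return {hour: weighted[hour] / counts[hour] for hour in sorted(counts)}


# Illustrative rows (count, vel, hour), not the question's data.
sample = [(2, 10.0, 1), (1, 40.0, 1), (3, 50.0, 2)]
print(weighted_average_by_hour(sample))  # {1: 20.0, 2: 50.0}
```

Grouping by hour here plays the role that placing the "hour" column in ROWS plays in the pivot table.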

Excluding rows dynamically

Let's assume we have the following:
A
1 10
2 20
3 30
4 20
5 10
6 30
7 20
8
9
10 =AVERAGE(A1:A7)
11 4
12 6
I would like to find a way to calculate the average of A1:A7 in cell A10 while excluding the row range defined by A11 and A12. That is, according to the above setup the result should be 20:
((10 + 20 + 30 + 20) / 4) = 20
because if rows 4, 5 and 6 are excluded, what's left is rows 1, 2, 3 and 7 to be averaged.
Two other options:
=AVERAGE(FILTER(A1:A7,ISNA(MATCH(ROW(A1:A7),A11:A12,0))))
=ArrayFormula(AVERAGEIF(MATCH(ROW(A1:A7),A11:A12,0),NA(),A1:A7))
Seems to meet your requirement, though not flexible:
=(sum(A1:A7)-indirect("A"&A11)-indirect("A"&A12))/(count(A1:A7)-2)
Adjusted after misunderstanding the requirements (A11 and A12 define the first and last row of a range to exclude, not two individual rows):
=(SUM(A1:A7)-SUM(INDIRECT("A"&A11&":A"&A12)))/(COUNT(A1:A7)-A12+A11-1)
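The adjusted formula subtracts the excluded range from both the sum and the count. A minimal Python sketch of the same calculation is shown below as a cross-check; the function name average_excluding is hypothetical, and rows are numbered from 1 to match the spreadsheet.

```python
def average_excluding(values, first_row, last_row):
    """Average values, skipping 1-based rows first_row..last_row inclusive."""
    kept = [
        value
        for row, value in enumerate(values, start=1)
        if not first_row <= row <= last_row
    ]
    return sum(kept) / len(kept)


# A1:A7 from the question, with A11 = 4 and A12 = 6.
column_a = [10, 20, 30, 20, 10, 30, 20]
print(average_excluding(column_a, 4, 6))  # 20.0
```

Like the formula, this sketch assumes at least one row remains after the exclusion; otherwise the division fails.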