Redshift - Breaking a number into 10 parts and finding which part a number falls into - sql

I am trying to break down a given number into 10 equal parts and then check a row of numbers to see which of the 10 parts each one falls into.
ref_number, number_to_check
70, 34
70, 44
70, 14
70, 24
In the above data set, I would like to break 70 into 10 equal parts (in this case 7, 14, 21, and so on up to 70). Next I would like to see which "part" the value in column "number_to_check" falls into.
Output expected:
ref_number, number_to_check, part
70, 34, 5
70, 44, 7
70, 14, 2
70, 24, 4

You want arithmetic. If I understand correctly:
select ceiling(number_to_check * 10.0 / ref_number)
Here is a db<>fiddle (the fiddle happens to use Postgres).
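To sanity-check that arithmetic outside the database, here is a small Python sketch (purely an illustration of the formula, not Redshift code) run against the sample data from the question:

import math

# part = ceiling(number_to_check * 10 / ref_number): each part is
# ref_number / 10 wide and parts are numbered 1 through 10.
rows = [(70, 34), (70, 44), (70, 14), (70, 24)]
for ref_number, number_to_check in rows:
    part = math.ceil(number_to_check * 10 / ref_number)
    print(ref_number, number_to_check, part)

# Output:
# 70 34 5
# 70 44 7
# 70 14 2
# 70 24 4

This matches the expected output in the question.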

Related

Postgis: How do I select every second point from LINESTRING?

In DBeaver I have a table containing some GPS coordinates stored in PostGIS LINESTRING format.
My question is: if I have, say, this info:
LINESTRING(20 20, 30 30, 40 40, 50 50, 60 60, 70 70)
which built-in ST function can I use to get every N-th element in that LINESTRING? For example, if I choose 2, I would get:
LINESTRING(20 20, 40 40, 60 60)
, if 3:
LINESTRING(20 20, 50 50)
and so on.
I've tried ST_Simplify and ST_PointN, but that's not exactly what I need, because I still want it to stay a LINESTRING, just with fewer points (lower resolution).
Any ideas?
Thanks :-)
Welcome to SO. Have you tried using ST_DumpPoints and applying a modulo (%) over the vertex path? E.g. every second record:
WITH j AS (
  SELECT ST_DumpPoints('LINESTRING(20 20, 30 30, 40 40, 50 50, 60 60, 70 70)') AS point
)
SELECT ST_AsText(ST_MakeLine((point).geom))
FROM j
WHERE (point).path[1] % 2 = 0;
st_astext
-------------------------------
LINESTRING(30 30,50 50,70 70)
(1 row)
Further reading:
ST_MakeLine
CTE
ST_Simplify should return a linestring unless the simplification results in an invalid geometry for a linestring, i.e., fewer than 2 vertices. If you always want a linestring returned, consider ST_SimplifyPreserveTopology. It ensures that at least two vertices are returned in a linestring.
https://postgis.net/docs/ST_SimplifyPreserveTopology.html

self join data to get distinct years

I have this dataframe that I need to join in order to find the academic years.
df11=pd.read_csv('https://s3.amazonaws.com/todel1623/myso.csv')
df11.course_id.value_counts()
274 3
285 2
260 1
I can use self-join and get the respective years without any problem.
df=df11.merge(df11[['course_id']], on='course_id')
df.course_id.value_counts()
274 9
285 4
260 1
But the expected count in this case is
274 6
285 4
260 2
This is because even though there are 3 records for id 274, the course duration is only 24 months (2 years), so it should give 3 * 2 = 6. And even though there is only 1 record for 260, since the duration is 24 months it should return 2 records (one for the current year and the other for current_year + 1), with the rest of the column values staying the same for that group.
Can I write a loop over the dataframe, something like this?
for row in df:
    for i in range(int(row.duration_inmonths / 12)):
        row.year = row.year + i
        df.append(row)
In the following case, the first record should be 2017 and not 2018.
myl = list()
for row in df11.values:
    for i in range(int(row[15] / 12)):
        row[5] = row[5] + i
        myl.append(row)
myl[:2]
[array([383, 1102, 'C-43049', 'M.B.A./M.M.S.', 'Un-Aided', 2018, 80000,
8000, 900, 312, 89212, 2018, 12, 260, 95, 24, 1102.0,
'M.B.A./M.M.S.'], dtype=object),
array([383, 1102, 'C-43049', 'M.B.A./M.M.S.', 'Un-Aided', 2018, 80000,
8000, 900, 312, 89212, 2018, 12, 260, 95, 24, 1102.0,
'M.B.A./M.M.S.'], dtype=object)]
The NumPy rows do not seem to get appended to the list with the changed values. It worked when I converted each row to a Python list first:
myl.append(row.tolist())
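For completeness, a more pandas-idiomatic sketch of the same idea: repeat each row once per year implied by the course duration, then offset the year column. The column names duration_inmonths and year are assumptions taken from the pseudocode above, so adjust them to the actual CSV.

import pandas as pd

def expand_years(df):
    # One output row per 12 months of duration (at least one row per record).
    n_years = (df["duration_inmonths"] // 12).clip(lower=1).astype(int)
    out = df.loc[df.index.repeat(n_years)].copy()
    # 0, 1, 2, ... within each original row's block of repeats.
    offset = out.groupby(level=0).cumcount()
    out["year"] = out["year"] + offset.to_numpy()
    return out.reset_index(drop=True)

# Example: a single 24-month record for course 260 becomes 2 rows,
# one for the current year and one for current_year + 1.
df = pd.DataFrame({"course_id": [260], "year": [2018], "duration_inmonths": [24]})
print(expand_years(df))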

Dask: aggregate values into fixed ranges with start and end times?

In Dask, or even pandas, how would you go about grouping a dataframe that has 3 columns (time / level / spread) into a set of fixed ranges over time?
Time only moves in one direction, like a loop counting up. The end result would be the start time and end time, along with the high of level, the low of level, and the first and last values of level over each fixed range. Example:
12:00:00, 10, 1
12:00:01, 11, 1
12:00:02, 12, 1
12:00:03, 11, 1
12:00:04, 9, 1
12:00:05, 6, 1
12:00:06, 10, 1
12:00:07, 14, 1
12:00:08, 11, 1
12:00:09, 7, 1
12:00:10, 13, 1
12:00:11, 8, 1
This example uses a fixed level range of 7: the level cannot move more than 7 in total from high to low within each bin. It does not matter that the first bin spans 8 seconds and the second only 2; time is irrelevant, what matters is that the total distance from high to low does not go past 7, the fixed bin size. The first bin could just as well have spanned 5 seconds instead of 8, and the next 200 seconds instead of 2. So the first few rows of the result would look something like this:
First Time, Last Time, High Level, Low Level, First Level, Last Level, Spread
12:00:00, 12:00:07, 13, 6, 10, 13, 1
12:00:07, 12:00:09, 14, 7, 13, 7, 1
12:00:09, X, 13, 7, X, X, X
How could this be aggregated in Dask with a fixed window on level, moving forward in time and starting a new bin each time the level moves above or below the high/low allowed by the fixed range X?
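There is no ready-made resample for this kind of level-based bar, because each bin depends on where the previous one ended. Below is a minimal pandas sketch of one reading of the rule: close the current bin as soon as the high-low spread of the levels seen so far reaches the fixed range, and let the closing row also start the next bin. The exact boundary handling in the example above is ambiguous, so treat the >= test and the column names time / level / spread as assumptions. In Dask this sequential logic only parallelises cleanly if bins never cross partition boundaries (e.g. via map_partitions on pre-cut chunks); otherwise run it on a single pandas partition.

import pandas as pd

def bin_by_level_range(df, level_range=7):
    bins = []
    start = 0
    hi = lo = df["level"].iloc[0]
    for i in range(1, len(df)):
        v = df["level"].iloc[i]
        hi, lo = max(hi, v), min(lo, v)
        if hi - lo >= level_range:           # fixed bin size reached
            chunk = df.iloc[start:i + 1]     # include the row that closed the bin
            bins.append({
                "first_time": chunk["time"].iloc[0],
                "last_time": chunk["time"].iloc[-1],
                "high": chunk["level"].max(),
                "low": chunk["level"].min(),
                "first": chunk["level"].iloc[0],
                "last": chunk["level"].iloc[-1],
                "spread": chunk["spread"].iloc[0],
            })
            start = i                        # the closing row starts the next bin
            hi = lo = v
    # Rows after the last closed bin form a still-open partial bin (the X row above).
    return pd.DataFrame(bins)

levels = [10, 11, 12, 11, 9, 6, 10, 14, 11, 7, 13, 8]
df = pd.DataFrame({
    "time": pd.date_range("12:00:00", periods=len(levels), freq="s").time,
    "level": levels,
    "spread": 1,
})
print(bin_by_level_range(df))

With the sample data this closes bins at 12:00:07 and 12:00:09, which lines up with the bin boundaries in the example; the per-bin highs, lows, first and last values then follow from the rows each bin contains.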

How many different orders are possible in which the key values can occur while searching for a particular key?

When searching for the key value 60 in a binary search tree, nodes containing the key values 10, 20, 40, 50, 70, 80, 90 are traversed, not necessarily in the order given. How many different orders are possible in which these key values can occur on the search path from the root to the node containing the value 60?
The answer is 7C4.
Searching for 60, we encounter 4 keys less than 60 (10,20,40,50) & 3 keys greater (70,80,90).
The four lesser keys must appear in ascending order, while the three greater ones must appear in descending order; otherwise some keys would be missed on the search path.
Note that on the path, the lesser keys need not be contiguous and can be separated by greater keys.
For example: 10, 90, 20, 80, 40, 70, 50
but the order within each group of keys (lesser and greater) remains the same.
Now, out of the seven total positions, the lesser keys occupy four positions, which can be selected in 7C4 ways. Once we know the places of these four keys, the places of the three greater keys have only one possible arrangement.
For example, if we know that
10, _, 20, _, 40, _, 50
then there is only one placement for 90, 80 & 70, namely positions 2, 4 & 6 respectively.
Therefore, for each choice of positions for the lesser keys, there is a unique placement of the greater keys.
So the total number of orders = 7C4 * 1 = 35 ways.
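The count can also be verified by brute force: a permutation of these seven keys is a valid search path for 60 exactly when every key falls inside the (low, high) bounds implied by the comparisons made so far. A short Python check of the argument above:

from itertools import permutations
from math import comb, inf

keys = [10, 20, 40, 50, 70, 80, 90]
target = 60

def is_valid_search_path(path):
    low, high = -inf, inf
    for k in path:
        if not (low < k < high):
            return False
        if k < target:
            low = k    # search went right, so later keys must be larger than k
        else:
            high = k   # search went left, so later keys must be smaller than k
    return True

count = sum(is_valid_search_path(p) for p in permutations(keys))
print(count, comb(7, 4))   # prints: 35 35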

How to floor a number in sql based on a range

I would like to know if there is a function or some other way to round a number down to the lowest whole value: something like floor, but to a specified increment like 10. So for example:
0.766, 5.0883, 9, 9.9999 would all be floored to 0
11.84848, 15.84763, 19.999 would all be floored to 10
etc...
I'm basically looking to fit numbers into the ranges 0, 10, 20, 30, etc.
Can I also do it with different ranges? For example 0, 100, 200, 300, etc.
Thank you.
You can do this with arithmetic and floor():
select 10*floor(val / 10)
You can replace the 10s with whatever value you want.
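The same arithmetic in Python, purely as an illustration of why it works: dividing by the increment, flooring, and multiplying back snaps each value down to the nearest multiple of that increment.

import math

def floor_to(val, increment=10):
    return increment * math.floor(val / increment)

print([floor_to(v) for v in (0.766, 5.0883, 9, 9.9999)])     # [0, 0, 0, 0]
print([floor_to(v) for v in (11.84848, 15.84763, 19.999)])   # [10, 10, 10]
print(floor_to(234.5, increment=100))                        # 200 (ranges of 100)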