Getting col % from a base size - tidyr

I'm trying to produce output for a multi-response table as column percentages. I can compute a % of the column total, but not of a fixed base. How do I do it? For example:
Past week used (Seg A, Seg B, Seg C) =
Olive Oil: 80, 100, 150
Sunflower Oil: 35, 95, 105
Coconut Oil: 109, 209, 15
Segment sizes A=120, B=250, C=165
I need the col% for each segment.
So Seg A should be calculated as:
Olive Oil = 80/120; Sunflower Oil = 35/120; Coconut Oil = 109/120
Similarly for Seg B & Seg C.
I'm using tidyr and dplyr to generate my outputs.
Any advice will be much appreciated.
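The question asks for tidyr/dplyr, but the fixed-base arithmetic itself is simple; here is a purely illustrative sketch of the same division in pandas (names and layout taken from the example above):

```python
import pandas as pd

# Counts per segment, from the question
counts = pd.DataFrame(
    {"Seg A": [80, 35, 109], "Seg B": [100, 95, 209], "Seg C": [150, 105, 15]},
    index=["Olive Oil", "Sunflower Oil", "Coconut Oil"],
)
# Fixed segment bases (NOT the column totals)
bases = pd.Series({"Seg A": 120, "Seg B": 250, "Seg C": 165})
# Divide each column by its segment base to get the fixed-base col%
col_pct = counts.div(bases, axis=1) * 100
print(col_pct.round(1))
```

The key point is dividing by a separate vector of bases rather than by `colSums`/column totals; in dplyr terms this corresponds to joining the segment sizes onto the long table before computing the percentage.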

Related

Get certain percentile values over SQL table

Let's say I have a table storing users, the number of red balls they have, the total number of balls (blue, yellow, other colors etc.), and the ratio of red to total balls.
Schema looks like this:
**user_id** | **ratio** | **red_balls** | **total_balls**
1 | 0.2 | 2 | 10
2 | 0.3 | 6 | 20
I want to select the 0, 25, 50, 75, and 100 percentile values based on ordering the red_balls column, so this doesn't mean I want the 0, 0.25, etc. values for the ratio column. I want the 25th percentile of the red_balls column. Any suggestions?
I think this can do what you want:
select
    percentile_cont(0.00) within group (order by red_balls) as p0,
    percentile_cont(0.25) within group (order by red_balls) as p25,
    percentile_cont(0.50) within group (order by red_balls) as p50,
    percentile_cont(0.75) within group (order by red_balls) as p75,
    percentile_cont(1.00) within group (order by red_balls) as p100
from your_table
Each percentile_cont(...) within group (order by red_balls) computes the requested percentile over the red_balls column; 0 and 1 correspond to its minimum and maximum. Filtering on the ratio column would not do this, since that only returns rows whose ratio happens to equal 0, 0.25, etc.
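As a cross-check outside the database, the same interpolated percentiles can be sketched with Python's statistics module (the red_balls values below are hypothetical):

```python
import statistics

# Hypothetical red_balls column values
red_balls = [2, 3, 6, 8, 10]
# method="inclusive" interpolates over the data range,
# matching SQL's percentile_cont behaviour
q25, q50, q75 = statistics.quantiles(red_balls, n=4, method="inclusive")
# The 0th and 100th percentiles are just the min and max
p0, p100 = min(red_balls), max(red_balls)
print(p0, q25, q50, q75, p100)
```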

Redshift - Breaking number into 10 parts and finding which part does a number fall into

I am trying to break down a given number into 10 equal parts and then, for each row, see which of the 10 parts a second number falls into.
ref_number, number_to_check
70, 34
70, 44
70, 14
70, 24
In the above data set, I would like to break 70 into 10 equal parts (in this case 7, 14, 21, and so on up to 70). Next I would like to see which "part" the value in column "number_to_check" falls into.
Output expected:
ref_number, number_to_check, part
70, 34, 5
70, 44, 7
70, 14, 2
70, 24, 4
You want arithmetic. If I understand correctly:
select ceiling(number_to_check * 10.0 / ref_number)
Here is a db<>fiddle (the fiddle happens to use Postgres).
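The same bucketing arithmetic, sketched in Python against the question's sample rows:

```python
import math

def part(number_to_check, ref_number, parts=10):
    # Bucket i covers the range ((i-1) * ref_number/parts, i * ref_number/parts]
    return math.ceil(number_to_check * parts / ref_number)

for n in (34, 44, 14, 24):
    print(70, n, part(n, 70))
```

Note that 14 lands exactly on a boundary (2 × 7), and the ceiling correctly assigns it to part 2, matching the expected output.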

Postgis: How do I select every second point from LINESTRING?

In DBeaver I have a table containing some GPS coordinates stored as Postgis LINESTRING format.
My questions is: If I have, say, this info:
LINESTRING(20 20, 30 30, 40 40, 50 50, 60 60, 70 70)
which built-in ST function can I use to get every N-th element in that LINESTRING? For example, if I choose 2, I would get:
LINESTRING(20 20, 40 40, 60 60)
, if 3:
LINESTRING(20 20, 50 50)
and so on.
I've tried ST_Simplify and ST_PointN, but that's not exactly what I need, because I still want the result to stay a LINESTRING, just with fewer points (lower resolution).
Any ideas?
Thanks :-)
Welcome to SO. Have you tried using ST_DumpPoints and applying a modulo (%) filter over the vertices' path? E.g., keeping every second vertex:
WITH j AS (
  SELECT ST_DumpPoints('LINESTRING(20 20, 30 30, 40 40, 50 50, 60 60, 70 70)') AS point
)
SELECT ST_AsText(ST_MakeLine((point).geom)) FROM j
WHERE (point).path[1] % 2 = 0;
st_astext
-------------------------------
LINESTRING(30 30,50 50,70 70)
(1 row)
Further reading:
ST_MakeLine
CTE
ST_Simplify should return a linestring unless the simplification results in an invalid geometry for a linestring, i.e., fewer than 2 vertices. If you always want a linestring back, consider ST_SimplifyPreserveTopology. It ensures that at least two vertices are returned in a linestring.
https://postgis.net/docs/ST_SimplifyPreserveTopology.html
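Outside the database, the same thinning on a plain list of coordinates is a slice. Note that slicing keeps the first vertex (matching the question's expected LINESTRING(20 20, 40 40, 60 60)), whereas the path % 2 = 0 filter above keeps the 2nd, 4th, and 6th:

```python
# Vertices of the example LINESTRING
coords = [(20, 20), (30, 30), (40, 40), (50, 50), (60, 60), (70, 70)]

def thin(coords, n):
    # Keep every n-th vertex, starting from the first
    return coords[::n]

print(thin(coords, 2))
print(thin(coords, 3))
```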

self join data to get distinct years

I have this dataframe that I need to join in order to find the academic-years.
df11=pd.read_csv('https://s3.amazonaws.com/todel1623/myso.csv')
df11.course_id.value_counts()
274 3
285 2
260 1
I can use self-join and get the respective years without any problem.
df=df11.merge(df11[['course_id']], on='course_id')
df.course_id.value_counts()
274 9
285 4
260 1
But the expected count in this case is
274 6
285 4
260 2
This is because even though there are 3 years for id 274, the course duration is only 24 months. And even though there is only 1 record for 260, since the duration is 24 months it should return 2 records (once for the current year and once for current_year + 1), the rest of the column values being the same for that group.
Can I write a loop for the dataframe, something like this?
for row in df:
    for i in range(int(df.duration_inmonths / 12)):
        df.row.year = df.row.year + i
        df.append(df.row)
In the following case, the first record should be 2017 and not 2018.
myl = list()
for row in df11.values:
    for i in range(int(row[15] / 12)):
        row[5] = row[5] + i
        myl.append(row)
myl[:2]
[array([383, 1102, 'C-43049', 'M.B.A./M.M.S.', 'Un-Aided', 2018, 80000,
8000, 900, 312, 89212, 2018, 12, 260, 95, 24, 1102.0,
'M.B.A./M.M.S.'], dtype=object),
array([383, 1102, 'C-43049', 'M.B.A./M.M.S.', 'Un-Aided', 2018, 80000,
8000, 900, 312, 89212, 2018, 12, 260, 95, 24, 1102.0,
'M.B.A./M.M.S.'], dtype=object)]
The numpy rows do not seem to append to the list with the changed values: each append stores a reference to the same array, so every entry reflects the final mutation. It worked when I converted the row to a list:
myl.append(row.tolist())
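A sketch of the same expansion with explicit copies instead of shared references (the column names year and duration_inmonths, and the toy values, are assumed from the question):

```python
import pandas as pd

df = pd.DataFrame({
    "course_id": [274, 285, 260],
    "year": [2017, 2017, 2017],
    "duration_inmonths": [24, 24, 24],
})

rows = []
for _, row in df.iterrows():
    # One record per year of the course; copying the row avoids the
    # shared-reference problem seen with the raw numpy rows
    for i in range(int(row["duration_inmonths"] / 12)):
        r = row.copy()
        r["year"] = row["year"] + i
        rows.append(r)

expanded = pd.DataFrame(rows).reset_index(drop=True)
print(expanded)
```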

Money Denominations

I have a payroll database. On the payroll payslip I would like to show money denominations for the salary of each employee, i.e. if an employee has got 759 dollars, then the cashier will withdraw 7 one-hundreds, 1 fifty, and 9 ones from the bank.
Please give me code in VB.NET.
Salary | hundred | fifty | one
759 | 7 | 1 | 9
Please help me, thanks a lot.
Here's an answer in Python:
# Target amount
amount = 759
# The denominations to be used, largest first
denoms = [100, 50, 20, 10, 5, 1]
# Take as many of each denomination as possible (greedy)
for d in denoms:
    count = amount // d
    amount -= count * d
    print("%ix%i" % (count, d))
Sample output:
7x100
1x50
0x20
0x10
1x5
4x1
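Wrapped as a small reusable function returning the counts (the greedy approach is correct for these standard denominations):

```python
def denominations(amount, denoms=(100, 50, 20, 10, 5, 1)):
    # Greedy: take as many of each denomination as possible, largest first
    counts = {}
    for d in denoms:
        counts[d], amount = divmod(amount, d)
    return counts

print(denominations(759))
```

Translating this loop to VB.NET is mechanical: integer division (`\`) gives the count and `Mod` gives the remainder for each denomination.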