I am computing my fuel consumption from OBD2 parameters, MAF to be specific, and I am receiving data on a per-second basis. Here is a section of my data.
TS RS EngS MAF R MAP EL TD Travel
14:41:22 31 932 1056 98 23978 12130
14:41:23 29 2084 2639 107 23210 12130
14:41:24 32 2154 3867 149 38826 12130
14:41:25 36 2426 4683 184 36266 12130
14:41:26 39 2391 3031 133 682 12130
14:41:27 40 1784 2794 132 30634 12130
14:41:28 42 1864 2853 140 30378 12130
14:41:29 43 1953 2900 132 29098 12130
14:41:30 46 2031 3017 135 29098 12130
14:41:31 45 2027 2969 126 20138 12130
14:41:32 47 2122 4253 174 42154 12130
14:41:33 51 2220 4722 183 20906 12130
Where
TS : Time Stamp,
RS : Road Speed,
EngS : Engine Speed,
MAF R : Mass Air Flow Rate,
MAP : Manifold Absolute Pressure,
EL : Engine Load,
TD Travel : Total Distance Traveled
So basically, from this data I am trying to compute my instantaneous fuel consumption and the mileage in KMPL.
For that, since the data is per second, I am taking the MAF of each row and using this formula,
Fuel Consumption = MAF/(14.7*710),
where 14.7 = ideal air/fuel ratio,
and 710 is density of gasoline in grams/L
So this should give my consumption, and I am calculating the distance (in km) from RS / 3600. I then divide the distance by the fuel consumption to get mileage. However, the calculation is coming out horribly wrong. The mileage of my car is around 14 KMPL. Here are my results.
TS Distance (inKM) Fuel Consum(L) Mileage(KMPL)
14:41:22 0.0086111111 0.1008355216 0.0853975957
14:41:23 0.0080555556 0.2519933158 0.0319673382
14:41:24 0.0088888889 0.369252805 0.0240726374
14:41:25 0.01 0.4471711626 0.0223628016
14:41:26 0.0108333333 0.2894246837 0.0374305785
14:41:27 0.0111111111 0.2667939842 0.0416467828
14:41:28 0.0116666667 0.2724277871 0.0428248043
14:41:29 0.0119444444 0.2769157317 0.0431338602
14:41:30 0.0127777778 0.2880878491 0.0443537546
14:41:31 0.0125 0.2835044163 0.0440910239
14:41:32 0.0130555556 0.4061112437 0.0321477323
14:41:33 0.0141666667 0.4508952017 0.0314189785
Can someone tell me what I am doing wrong that makes the computation so far off? As the formulas are simple, there isn't much scope for error. Thank you.
MAF is in g/s.
MAF (g/s) * 1/14.7 * 1 L/710 g = fuel consumption FC in L/s.
Speed (V) is in KPH (km/hr), so V (km/hr) * (1 hr/3600 s) = v in KPS (km/s),
so FC (L/s) / v (km/s) = L/km.
You want km/L, which is v/FC, so your final formula is
KmPL = V * 1/3600 * 1/MAF * 14.7 * 710
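The unit chain above can be sketched in Python as follows. This is only a minimal sketch: the function name and the maf_scale parameter are hypothetical and not from the thread. maf_scale is there because the formula only holds when MAF really is in g/s; if the logger stores raw counts in hundredths of a g/s (an assumption about the data that the thread does not confirm), the logged value has to be scaled first.

```python
AIR_FUEL_RATIO = 14.7             # air/fuel ratio by mass, from the question
GASOLINE_DENSITY_G_PER_L = 710.0  # g/L, from the question


def km_per_litre(speed_kph, maf, maf_scale=1.0):
    """Instantaneous mileage in km/L.

    maf_scale converts the logged MAF value to g/s. It is an assumption:
    use 1.0 if the log is already in g/s, 0.01 if it holds hundredths of a g/s.
    """
    maf_g_per_s = maf * maf_scale
    if maf_g_per_s <= 0:
        raise ValueError("MAF must be positive")
    # g/s of air -> g/s of fuel -> L/s of fuel
    fuel_l_per_s = maf_g_per_s / (AIR_FUEL_RATIO * GASOLINE_DENSITY_G_PER_L)
    # km/hr -> km/s
    dist_km_per_s = speed_kph / 3600.0
    return dist_km_per_s / fuel_l_per_s
```

For the first row of the question's data (RS = 31, MAF R = 1056), km_per_litre(31, 1056) gives about 0.085 km/L, close to the question's own result, while km_per_litre(31, 1056, maf_scale=0.01) gives about 8.5 km/L.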
Divide the MAF by 14.7 to get grams of fuel per second,
next divide by 454 to get lbs of fuel/sec,
next divide by 6.701 to get gallons of fuel/sec,
then multiply by 3600 to get gallons/hr.
In short, GPH = MAF * 0.0805, and then MPG = MPH / GPH.
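The same steps in US units can be written out as a small Python sketch, mainly to show where the 0.0805 factor comes from. The function names are hypothetical; the constants are the ones quoted in this answer, and MAF is assumed to be in g/s.

```python
def gallons_per_hour(maf_g_per_s):
    """Fuel flow in US gallons per hour, following the steps in the answer."""
    fuel_g_per_s = maf_g_per_s / 14.7       # grams of fuel per second
    fuel_lb_per_s = fuel_g_per_s / 454.0    # pounds of fuel per second
    fuel_gal_per_s = fuel_lb_per_s / 6.701  # gallons per second, using the answer's lb/gal figure
    return fuel_gal_per_s * 3600.0          # gallons per hour


def miles_per_gallon(speed_mph, maf_g_per_s):
    """Instantaneous MPG = MPH / GPH."""
    return speed_mph / gallons_per_hour(maf_g_per_s)
```

Evaluating gallons_per_hour(1.0) gives roughly 0.0805, which is the shortcut factor in GPH = MAF * 0.0805.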
I'm trying to select some values based on some proprietary data, and I just changed the variables to reference house prices.
I am trying to get the total offers for houses where they were sold at the bid or at the ask price, with offers under 15 and offers * sale price less than 5,000,000.
I then want to get the total number of offers for each neighborhood on each day, but instead I'm getting the total offers across each neighborhood (n1 + n2 + n3 + n4 + n5) across all dates and the total offers in the dataset across all dates.
My current query is this:
SELECT DISTINCT(neighborhood),
DATE(date_of_sale),
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`
WHERE ((offers * accepted_sale_price < 5000000)
AND (offers < 15)
AND (house_bid = sale_price OR
house_ask = sale_price))) as bid_ask_off,
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`) as
total_offers,
FROM `big_query.a_table_name.houseprices`
GROUP BY neighborhood, DATE(date_of_sale) LIMIT 100
Which I am expecting a result like, with date being repeated throughout as d1, d2, d3, etc.:
but am instead receiving
I'm aware that there are some inherent problems with what I'm trying to select / group, but I'm not sure what to google or what tutorials to look at in order to perform this operation.
It's querying quite a bit of data, and I want to keep costs down, as I've already racked up a smallish bill on queries.
Any help or advice would be greatly appreciated, and I hope I've provided enough information.
Here is a sample dataframe.
neighborhood date_of_sale offers accepted_sale_price house_bid house_ask
bronx 4/1/2022 3 323 320 323
manhattan 4/1/2022 4 244 230 244
manhattan 4/1/2022 8 856 856 900
queens 4/1/2022 15 110 110 135
brooklyn 4/2/2022 12 115 100 115
manhattan 4/2/2022 9 255 255 275
bronx 4/2/2022 6 330 300 330
queens 4/2/2022 10 405 395 405
brooklyn 4/2/2022 4 254 254 265
staten_island 4/3/2022 2 442 430 442
staten_island 4/3/2022 13 195 195 225
bronx 4/3/2022 4 650 650 690
manhattan 4/3/2022 2 286 266 286
manhattan 4/3/2022 6 356 356 400
staten_island 4/4/2022 4 361 361 401
staten_island 4/4/2022 5 348 348 399
bronx 4/4/2022 8 397 340 397
manhattan 4/4/2022 9 333 333 394
manhattan 4/4/2022 11 392 325 392
I think that this is what you need.
As we group by neighbourhood we do not need DISTINCT.
We take SUM(offers) for total_offers directly from the table, and bids from a sub-query grouped by neighbourhood, which we join to the main table.
SELECT
h.neighborhood,
DATE(h.date_of_sale) AS date_,
s.bids AS bid_ask_off,
SUM(h.offers) AS total_offers,
FROM
`big_query.a_table_name.houseprices` h
LEFT JOIN
(SELECT
neighborhood,
SUM(offers) AS bids
FROM
`big_query.a_table_name.houseprices`
WHERE offers * accepted_sale_price < 5000000
AND offers < 15
AND (house_bid = sale_price OR
house_ask = sale_price)
GROUP BY neighborhood) s
ON h.neighborhood = s.neighborhood
GROUP BY
h.neighborhood,
DATE(date_of_sale),
s.bids
LIMIT 100;
Or the following, which changes the initial query more but may be closer to what you need, since the sub-query is grouped by date as well as by neighbourhood.
SELECT
h.neighborhood,
DATE(h.date_of_sale) AS date_,
s.bids AS bid_ask_off,
SUM(h.offers) AS total_offers,
FROM
`big_query.a_table_name.houseprices` h
LEFT JOIN
(SELECT
date_of_sale dos,
neighborhood,
SUM(offers) AS bids
FROM
`big_query.a_table_name.houseprices`
WHERE offers * accepted_sale_price < 5000000
AND offers < 15
AND (house_bid = sale_price OR
house_ask = sale_price)
GROUP BY
neighborhood,
date_of_sale) s
ON h.neighborhood = s.neighborhood
AND h.date_of_sale = s.dos
GROUP BY
h.neighborhood,
DATE(date_of_sale),
s.bids
LIMIT 100;
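To check what the per-neighbourhood, per-day result should look like, the grouping logic can be sketched in plain Python on a few rows of the sample data. This is a single pass with a conditional sum rather than the join used in the queries above, and it assumes that sale_price in the query refers to the accepted_sale_price column of the sample. The function name is hypothetical.

```python
from collections import defaultdict

# (neighborhood, date_of_sale, offers, accepted_sale_price, house_bid, house_ask)
# The 4/1/2022 rows from the sample data in the question.
ROWS = [
    ("bronx", "4/1/2022", 3, 323, 320, 323),
    ("manhattan", "4/1/2022", 4, 244, 230, 244),
    ("manhattan", "4/1/2022", 8, 856, 856, 900),
    ("queens", "4/1/2022", 15, 110, 110, 135),
]


def offers_by_group(rows):
    """Return {(neighborhood, date): (bid_ask_off, total_offers)}."""
    totals = defaultdict(lambda: [0, 0])
    for hood, day, offers, price, bid, ask in rows:
        group = totals[(hood, day)]
        group[1] += offers  # every row counts towards total_offers
        # Only rows passing all three filters count towards bid_ask_off.
        if offers < 15 and offers * price < 5_000_000 and price in (bid, ask):
            group[0] += offers
    return {key: tuple(value) for key, value in totals.items()}
```

Each (neighborhood, date) pair gets its own pair of sums, which is the shape the question asks for; the queens row is excluded from bid_ask_off because its offers value is not below 15.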
I have a df that looks like this:
Total Initial Follow Sched Supp Any
0 5525 3663 968 296 65 533
I transpose the df because I have to add a column with the percentages based on the 'Total' column.
Now my df looks like this:
0
Total 5525
Initial 3663
Follow 968
Sched 296
Supp 65
Any 533
So, how can I add this percentage column?
The expected output looks like this
0 Percentage
Total 5525 100
Initial 3663 66.3
Follow 968 17.5
Sched 296 5.4
Supp 65 1.2
Any 533 9.6
Do you know how can I add this new column?
I'm working in jupyterlab with pandas and numpy
Divide column 0 by the scalar from Total with Series.div, then multiply by 100 with Series.mul, and finally round with Series.round:
df['Percentage'] = df[0].div(df.loc['Total', 0]).mul(100).round(1)
print (df)
0 Percentage
Total 5525 100.0
Initial 3663 66.3
Follow 968 17.5
Sched 296 5.4
Supp 65 1.2
Any 533 9.6
Consider below df:
In [1328]: df
Out[1328]:
b
a
Total 5525
Initial 3663
Follow 968
Sched 296
Supp 65
Any 533
In [1327]: df['Perc'] = round(df.b.div(df.loc['Total', 'b']) * 100, 1)
In [1330]: df
Out[1330]:
b Perc
a
Total 5525 100.0
Initial 3663 66.3
Follow 968 17.5
Sched 296 5.4
Supp 65 1.2
Any 533 9.6
I was trying to solve the "analyze weather patterns" problem as described here (https://joins-238123.netlify.com/window-functions/)
You're worried that hurricanes are happening more frequently, so you
decide to do a tiny bit of analysis. For each kind of weather event
find the 2 events that occurred the closest together and when they
happened
Table weather with data like:
type day
rain 6
rain 12
thunderstorm 13
rain 21
rain 27
rain 37
rain 44
rain 54
thunderstorm 56
rain 58
rain 61
rain 65
rain 68
rain 73
rain 82
hurricane 87
rain 92
rain 95
rain 98
rain 108
thunderstorm 111
rain 118
rain 123
rain 128
rain 131
hurricane 135
rain 136
rain 140
rain 149
thunderstorm 158
rain 159
rain 167
rain 175
hurricane 178
rain 179
rain 186
rain 192
rain 200
thunderstorm 202
rain 210
rain 219
thunderstorm 222
rain 226
rain 232
thunderstorm 238
rain 241
rain 246
rain 253
thunderstorm 257
rain 257
rain 267
rain 277
rain 286
rain 295
rain 302
rain 307
thunderstorm 312
rain 316
rain 325
thunderstorm 330
I could come up with :
select type, day, COALESCE(day - LAG(day, 1) over (partition by type order by day), 0) as days_since_previous from weather
It gives me results like:
type day days_since_previous
hurricane 87 0
hurricane 135 48
hurricane 178 43
rain 6 0
rain 12 6
rain 21 9
rain 27 6
But I can't get it to narrow the results down to the 2 closest events and only display the days between them.
How do I go about getting the desired result, like:
type day days_since_previous
rain 61 3
hurricane 178 43
thunderstorm 238 16
You can use another window function to whittle down the rows:
SELECT type, day, days_since_previous
FROM (
SELECT type, day, (day - prev_day) AS days_since_previous,
ROW_NUMBER() OVER(PARTITION BY type ORDER BY (day - prev_day)) AS RowNum
FROM (
select type, day,
LAG(day, 1) over (partition by type order by day) as prev_day
from weather
) src
WHERE prev_day IS NOT NULL -- Ignore "first" events
) src
WHERE RowNum = 1
order by day
I also removed the COALESCE since that was causing the "first" events to be included in the calculations.
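The same idea (gap to the previous event of the same type, then the smallest gap per type) can be sketched outside SQL in Python, which may help when checking the query's output by hand. The function name is hypothetical and EVENTS is only a subset of the weather table from the question.

```python
from collections import defaultdict

# A subset of the weather table from the question: (type, day)
EVENTS = [
    ("rain", 54), ("rain", 58), ("rain", 61), ("rain", 65), ("rain", 68),
    ("hurricane", 87), ("hurricane", 135), ("hurricane", 178),
    ("thunderstorm", 202), ("thunderstorm", 222),
    ("thunderstorm", 238), ("thunderstorm", 257),
]


def closest_events(events):
    """Return {type: (day, days_since_previous)} for the closest pair per type."""
    days_by_type = defaultdict(list)
    for kind, day in events:
        days_by_type[kind].append(day)

    result = {}
    for kind, days in days_by_type.items():
        days.sort()
        # Pair each day with its predecessor, like LAG(day) within a partition.
        gaps = [(later - earlier, later) for earlier, later in zip(days, days[1:])]
        if gaps:  # a type with a single event has no gap
            gap, day = min(gaps)  # ties go to the earliest day
            result[kind] = (day, gap)
    return result
```

Note that ties are broken towards the earliest day here, whereas ROW_NUMBER() with only the gap in its ORDER BY leaves the choice among tied rows unspecified.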
If you don't insist on displaying the day value - you could run a nested query:
In one SELECT (in a WITH clause, or a nested sub-select), add the gap to the previous day as an OLAP function, as you suggest. There is no real need for COALESCE.
From that fullselect, run a GROUP BY select.
Like so:
WITH
w_gap2prev AS (
SELECT
*
, day - LAG(day) OVER(PARTITION BY type ORDER BY day) AS gap
FROM weather
)
SELECT
type
, MIN(gap) AS days_since_previous
FROM w_gap2prev
WHERE gap IS NOT NULL
GROUP BY type
;
-- out type | days_since_previous
-- out --------------+---------------------
-- out hurricane | 43
-- out rain | 3
-- out thunderstorm | 16
-- out (3 rows)
-- out
-- out Time: First fetch (3 rows): 56.441 ms. All rows formatted: 56.479 ms
This might be a duplicate question, but I didn't find a conclusive answer in the others.
I have the vehicle data, i.e. velocity (m/s), yaw rate (in radians), and sampling times. From the first two I calculated the curvature of the road using the equation curvature = YawRate/velocity.
mSec Speed YawRate(with offset 500) Velocity
22 113 513 31.38888889
53 113 513 31.38888889
84 113 513 31.38888889
115 113 513 31.38888889
915 110 510 30.55555556
946 110 510 30.55555556
978 110 510 30.55555556
24 109 510 30.27777778
56 109 510 30.27777778
87 109 511 30.27777778
118 109 511 30.27777778
Now I want to plot the road curvature on an image of the road (something like showing the trail of the vehicle). I have the equation for curvature,
Curvature = YawRate/Velocity
Remember, I have to plot this trajectory on an image. How can I do it?
P.S. At high speeds the steering angle is not significant, so I am ruling out steering angle as an input.
What you want to plot are the known positions of the vehicle across time.
From the available data set we can infer that the polar coordinates are given by the steering angle (which seems to be absolute - deduct a quarter turn) and the instantaneous radius of curvature.
Convert from polar to Cartesian.
Without better information, you have to assume that the instantaneous center of rotation is fixed.
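One way to get vehicle positions from the logged data is simple dead reckoning: integrate the yaw rate to get heading, then integrate velocity along that heading. This is a different route from the polar conversion described above, and only a rough sketch under assumptions the thread does not confirm: the yaw rate has already had its offset removed and been converted to rad/s, the time steps are in seconds, and both quantities are constant within a step. The function name is hypothetical.

```python
import math


def integrate_path(samples, x=0.0, y=0.0, heading=0.0):
    """Dead-reckon a path from (dt_s, speed_m_s, yaw_rate_rad_s) samples.

    Returns a list of (x, y) positions in metres, starting at the given
    origin. Uses a plain Euler step, so the error grows with the step size.
    """
    path = [(x, y)]
    for dt, speed, yaw_rate in samples:
        heading += yaw_rate * dt             # integrate yaw rate into heading
        x += speed * math.cos(heading) * dt  # advance along the current heading
        y += speed * math.sin(heading) * dt
        path.append((x, y))
    return path
```

The resulting coordinates are in metres in the vehicle's starting frame. Drawing them on the road image still needs a mapping from metres to pixels (scale, origin and camera perspective), which depends on calibration data not given in the question.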
I have the following excel file:
W1000x554 1032 408 52.1 29.5 70700 12300
W1000x539 1030 407 51.1 28.4 68700 12000
W1000x483 1020 404 46 25.4 61500 10700
W1000x443 1012 402 41.9 23.6 56400 9670
W1000x412 1008 402 40 21.1 52500 9100
W1000x371 1000 400 36.1 19 47300 8140
W1000x321 990 400 31 16.5 40900 6960
W1000x296 982 400 27.1 16.5 37800 6200
W1000x584 1056 314 64 36.1 74500 12500
I want to define a function that can ask the user for one of the first column's names and then read all the relevant data of that row later.
For example if the user defines W1000x412 then read : 1008 402 40 21.1 52500 9100.
Any ideas?
I suspect what @Marc means is that a formula such as the one in J2 below (copied across and down as necessary) will 'pick out' the values you want. It is not clear to me from your question whether these should be kept separate (as in Row2 of the example) or strung together (CONCATENATE [&], as in J7 of the example, where they are space [" "] delimited):
I am also not entirely sure about your 'define a function' but have assumed you do not require a UDF.
I have used Row1 to provide the offset for VLOOKUP, to save adjusting manually the formula for each column.
Column I holds the expected user input, which might best be chosen from a Data Validation list with Source $A$2:$A$10.
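If the lookup is needed outside the worksheet, the same "find the row by its first column" idea can be sketched in Python with a dictionary keyed by the section name. This is an illustration only: the function name is hypothetical, the values are typed in from a few rows of the question's table rather than read from the Excel file, and the meaning of each column is not stated in the question.

```python
# Section name -> remaining values of that row (a few rows from the question's table)
SECTIONS = {
    "W1000x554": (1032, 408, 52.1, 29.5, 70700, 12300),
    "W1000x539": (1030, 407, 51.1, 28.4, 68700, 12000),
    "W1000x412": (1008, 402, 40, 21.1, 52500, 9100),
    "W1000x584": (1056, 314, 64, 36.1, 74500, 12500),
}


def lookup_section(name, table=SECTIONS):
    """Return the row values for a section name, like an exact-match VLOOKUP."""
    key = name.strip()
    if key not in table:
        raise KeyError(f"unknown section: {name!r}")
    return table[key]
```

For example, lookup_section("W1000x412") returns the values 1008, 402, 40, 21.1, 52500 and 9100 from that row. The name could come from input() when prompting the user.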