I just want to ask how to solve my problem.
On my report window, I have 2 computed fields:
1. A field that computes a decimal value, which I display to two decimal places only (no problem with the display).
2. My other computed field multiplies the result of the first computed field by a certain number. The problem is here: the value being multiplied is not the value displayed by the first field; instead it uses the full, unrounded amount.
Scenario:
2,055,232.135 is the computed value and it displays as 2,055,232.14, which is good.
But if I multiply it by 9 (2055232.135 * 9), the result is 18,497,089.215, which will be displayed as 18,497,089.22.
What I want is for the displayed value (2,055,232.14) to be multiplied by 9 (2055232.14 * 9), which results in 18,497,089.26.
I want the second computed field to show that second value, so that if the user does the computation by hand, the figures match.
The following expression can be used in the second computed field. Round the first result to two decimals before multiplying, so the multiplication uses the same value that is displayed:
String( Round( 2055232.135, 2) * 9, "#,##0.00")
Hello,
I am analyzing the following dataset with this information.
The column ['program_number'] is an object but I want to change it to an integer column.
I have tried to replace some values but it doesn't work.
As you can see, some values like 6 are duplicated, e.g. '6 ' and 6.
How can I resolve it? Many thanks
UPDATE
Didn't see 1X and 3X at first.
If you need those numbers and just want to remove the X, then:
df["Program"] = df["Program"].str.strip(" X").astype(int)
If there is data in the column which isn't numeric or which shouldn't be converted, you can use pd.to_numeric with errors='coerce'. Cells which can't be converted will become NaN. Be aware that this will result in floating-point numbers.
df["Program"] = pd.to_numeric(df["Program"], errors="coerce")
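If you still want an integer dtype after coercing, one option is pandas' nullable Int64 type, which tolerates missing values. A minimal sketch, assuming the column holds strings such as '6 ', '1X', and '3X' (the 'abc' value is made up for illustration):

import pandas as pd

# hypothetical data resembling the column in the question
df = pd.DataFrame({"Program": ["6 ", "6", "1X", "3X", "abc"]})

# strip spaces and trailing X, coerce whatever remains, keep a NaN-capable integer dtype
cleaned = pd.to_numeric(df["Program"].str.strip(" X"), errors="coerce")
df["Program"] = cleaned.astype("Int64")   # <NA> for 'abc', integers elsewhere
print(df["Program"])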
Old answer:
You want to use str.strip() here, rather than replace.
Try this:
df1['program_number'] = df1['program_number'].str.strip().astype(int)
I am new to Splunk.
I have aggregated a column using a 'by' clause. Now I want to multiply each element in the column by a different factor, element-wise: the first element by 0.05 and all the rest by 0.07.
Please help
Adding the following to your query will generate a new column, called count, which increments by one for each result. Then you know which is the first element and can multiply it by 0.05, while multiplying all other results by 0.07:
| streamstats count
| eval "count(adCategory)" = case(count=1, 'count(adCategory)'*0.05,
                                  1==1, 'count(adCategory)'*0.07)
It's in reference to the Google Finance function in Google Sheets: https://support.google.com/docs/answer/3093281?hl=en
I would like to obtain the "all time LOW" (ATL) and "all time HIGH" (ATH) for a specific ticker (i.e. ABBV or GOOG) but only in 1 cell for each. Basically "What's the ATL/ATH value for this ticker?"
I've tried to do both formulas for ATL and ATH, but only ATL gives the expected result for now.
To get the ATL, you can use
=GOOGLEFINANCE("ABBV","low","01/12/1980",TODAY(),7)
and to get the ATH you can use:
=GOOGLEFINANCE("ABBV","high","01/12/1980",TODAY(),7)
The output of this is 2 columns of data:
Please note that column A, containing the timestamp, is the one causing trouble when computing the MAX function, as it translates into some weird figures.
In order to get the ATL, I'll be using the MIN function which works perfectly fine:
=MIN(GOOGLEFINANCE("ABBV","low","01/01/1980",TODAY(),7))
as it will just scan the 2 columns of data and grab the lowest value which is 32.51 in USD.
BUT when I'm trying to do the same with MAX or MAXA for the ATH using for example
=MAX(GOOGLEFINANCE("ABBV","high","01/12/1980",TODAY(),7))
the result that comes out is 43616.66667 which seems to be a random computation of the column A containing the timestamp.
The expected result of the ATH should be 125.86 in USD.
I've tried using FILTER to exclude values >1000, but FILTER doesn't let me search in column B, so then I tried VLOOKUP using this formula:
=VLOOKUP(MAX(GOOGLEFINANCE("ABBV","high","01/12/1980",TODAY(),7)),GOOGLEFINANCE("ABBV","high","01/12/1980",TODAY(),7),2,FALSE)
but again it returns the value from column B based on the MAX value of column A, which ends up giving me 80.1 and not the expected 125.86.
use:
=MAX(INDEX(GOOGLEFINANCE("ABBV", "high", "01/12/1980", TODAY(), 7), , 2))
43616.66667 is not a "random computation"; it's the date 31/05/2019 16:00:00 converted into a date serial value.
The MAX and MIN functions return a single output from all cells in the included range, which in your case is two columns. The date is treated as a number too, so maxing over those two columns will output the max value whether it comes from the 1st or the 2nd column. By introducing INDEX you can skip the 1st column and look for the max value only in the 2nd column.
=MAX(INDEX(GOOGLEFINANCE("BTCSGD", "price", "01/12/1980", TODAY(), 7), , 2))
Replace BTCSGD with any ticker you want to look up. You can put ABCXYZ, where ABC is the stock/ETF/crypto symbol and XYZ is the currency.
I have a data frame, let's say xyz. I have written code to find the % of null values each column possesses in the dataframe. My code is below:
round(100*(xyz.isnull().sum()/len(xyz.index)), 2)
Let's say I got the following results:
abc 26.63
def 36.58
ghi 78.46
I want to drop column ghi because it has more than 70% of null values.
I achieved it using the following code:
xyz = xyz.drop(xyz.loc[:,round(100*(xyz.isnull().sum()/len(xyz.index)), 2)>70].columns, 1)
But I did not understand how this code works; can anyone please explain it?
The code is doing the following:
xyz.drop( [...], 1)
removes the specified elements along a given axis, either by row or by column. In this particular case, xyz.drop(..., 1) means you're dropping along axis 1, i.e., columns.
xyz.loc[:, ... ].columns
will return a list with the column names resulting from your slicing condition
round(100*(xyz.isnull().sum()/len(xyz.index)), 2)>70
This instruction counts the nulls in each column and normalizes by the number of rows, effectively computing the percentage of NaN in each column. Then the amount is rounded to 2 decimal places, and finally it returns True if the percentage of NaN is more than 70%. Hence, you get a mapping between columns and a True/False array.
Putting everything together: you first produce a Boolean array that marks which columns have more than 70% NaN; then, using .loc, you apply Boolean indexing to select only the columns you want to drop (NaN % > 70%); then, using .columns, you recover the names of those columns, which are finally passed to the .drop instruction.
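As a minimal sketch with made-up data (not the asker's actual xyz), showing each intermediate step of that one-liner:

import pandas as pd
import numpy as np

# hypothetical frame: 'ghi' is 80% null, 'abc' only 20%
xyz = pd.DataFrame({
    "abc": [1, 2, 3, 4, np.nan],
    "ghi": [np.nan, np.nan, np.nan, np.nan, 1],
})

pct = round(100 * (xyz.isnull().sum() / len(xyz.index)), 2)  # % of nulls per column: abc 20.0, ghi 80.0
mask = pct > 70                                              # Boolean Series: abc False, ghi True
cols = xyz.loc[:, mask].columns                              # Index(['ghi'])
xyz = xyz.drop(cols, axis=1)                                 # only 'abc' remains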
Hopefully this clears things up!
If your code is hard to understand, you can just use dropna with thresh, since pandas already covers this case:
df=df.dropna(axis=1,thresh=round(len(df)*0.3))
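Here thresh is the minimum number of non-null values a column must have to be kept, so keeping columns with at least 30% non-null values is equivalent to dropping columns with more than 70% nulls. A quick sketch with made-up data to show the two approaches agree:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "abc": [1, 2, 3, np.nan, 5, 6, 7, 8, 9, 10],  # 10% null -> kept
    "ghi": [1, 2] + [np.nan] * 8,                 # 80% null -> dropped
})

kept = df.dropna(axis=1, thresh=round(len(df) * 0.3))  # keep columns with at least 3 non-null values
print(kept.columns)  # Index(['abc'], dtype='object')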
The autoaxis for one of my embedded charts isn't behaving well, sometimes only showing one other major value besides top and bottom. So I thought I'd set my own boundaries, which seemed pretty easy given that one of the columns on the chart is always going to be larger than any of the others.
<Maximum>=(((Max(Fields!Entered.Value, "Chart1") + 10) \ 50) + 1) * 50</Maximum>
(the other columns detail what happened to the things that entered this process)
Round up to the nearest 50 with a little overage to put the label on top. Then I can put the intervals at this divided by 5 and I'm gold.
Except I'm not gold. The chart groups records by date and the individual bars are Sum(Fields!Entered.Value) et cetera, so it's drastically underscaling when multiple batches get processed on a single date. But hey, it groups records by date, I can use that:
<ChartCategoryHierarchy>
<ChartMembers>
<ChartMember>
<Group Name="Chart1_CategoryGroup">
<GroupExpressions>
<GroupExpression>=Fields!Date.Value</GroupExpression>
</GroupExpressions>
</Group>
</ChartMember>
</ChartMembers>
</ChartCategoryHierarchy>
as:
<Maximum>=(((Max(Fields!Entered.Value, "Chart1_CategoryGroup") + 10) \ 50) + 1) * 50</Maximum>
and it'll aggregate over the group just fine. Right?
The ValueAxis_Primary.Maximum expression for the chart 'Chart1' has a scope parameter that is not valid for an aggregate function. The scope parameter must be set to a string constant that is equal to either the name of a containing group, the name of a containing data region, or the name of a dataset.
Nope! It works just fine for "Chart1" but not for "Chart1_CategoryGroup"!
So, uh:
what scope are the axis calculations operating in, 'cause it ain't the category scope?
is there some way to provide them an aggregate scope that groups the data by date so they can do their calculations proper?
You Have To Nest The Scope
A little extra work gave me this insight:
Max(Fields!Entered.Value, "Chart1_CategoryGroup") returns the maximum of the entered fields within one single category group, which is not the level the Y axis is concerned with. What you're interested in is the maximum value of the summed calculation (within a group) for the whole chart, so specify the scopes to do that:
<Maximum>
=(((Max(
Sum(Fields!Entered.Value, "Chart1_CategoryGroup")
, "Chart1") + 10) \ 50) + 1) * 50
</Maximum>