I'm learning SPSS for a research methods class but I'm a bit confused on how to enter and define values that represent a meaning after a certain point. For example, the problem I am working on states:
Suppose the following indexed scores represent performance on a new survey meant to understand an individual’s level of depression. Suppose a score of above 20 represents a depressed individual based on the survey design.
Scores: 13.5, 15.7, 14.3, 16.7, 21.2, 20.7, 22.3, 17.4, 16.8, and 12.4
What is the relative frequency of those individuals that represent depressed individuals?
How would one mark the values over 20 as "depressed" in order to calculate the relative frequency accurately?
Please and thank you!
You need to calculate a new variable based on the existing one, which will have a value of 1 if the original variable is over the threshold and 0 if not. There are a few ways to do that; this is the simplest one:
compute depressed=(score>20).
You can now add labels and analyse:
value labels depressed
1 "depressed: score over 20"
0 "score not over 20".
frequencies depressed.
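As a cross-check outside SPSS, the same dummy-variable logic can be sketched in Python. This is only an illustration using the ten scores from the question; the names `scores`, `THRESHOLD`, `depressed`, and `relative_frequency` are made up for this sketch.

```python
# Scores from the question; a score above 20 counts as "depressed".
scores = [13.5, 15.7, 14.3, 16.7, 21.2, 20.7, 22.3, 17.4, 16.8, 12.4]
THRESHOLD = 20

# Same idea as the SPSS compute: 1 if over the threshold, 0 otherwise.
depressed = [1 if s > THRESHOLD else 0 for s in scores]

# Relative frequency = number of flagged cases / total number of cases.
relative_frequency = sum(depressed) / len(depressed)
print(relative_frequency)  # 0.3
```

Three of the ten scores (21.2, 20.7, 22.3) are over 20, so the frequencies table in SPSS should report 30% for the value 1.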
I have attached a table.
The table lists car models together with the number of cars in each EU class (each column holds the count of that model in one class).
For each model (each row), I am trying to find the EU class (column name) that holds the maximum value, using Python.
For example, in the first row the maximum is in Euro 5, with 3677 cars, for the model 320 GH.
I tried different commands such as
maxValuesObj = D_high_model_EUstd.loc[D_high_model_EUstd['Model of Vehicle'].idxmax()]
but faced this error "reduction operation 'argmax' not allowed for this type".
I was wondering if anybody can help or suggest to me any solution.
Thanks
You can use DataFrame.idxmax with axis=1.
For example:
df.drop(columns="Model of Vehicle").idxmax(axis=1)  # Drop the text column, since idxmax doesn't work on it
Or create another dataframe:
pd.concat({
    "Model of Vehicle": df["Model of Vehicle"],
    "Max": df.drop(columns="Model of Vehicle").idxmax(axis=1)
}, axis=1)
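Here is a small self-contained sketch of the same idea. Only the first row (320 GH, Euro 5 = 3677) comes from the question; the second model, the other column names, and the other counts are made up for illustration, as are the names `counts` and `result`.

```python
import pandas as pd

# Hypothetical data: only "320 GH" with 3677 cars in Euro 5 is from the question.
df = pd.DataFrame({
    "Model of Vehicle": ["320 GH", "Model B"],
    "Euro 4": [1200, 900],
    "Euro 5": [3677, 450],
    "Euro 6": [2100, 300],
})

# Move the text column into the index so only numeric columns remain as data.
counts = df.set_index("Model of Vehicle")

result = pd.DataFrame({
    "Max class": counts.idxmax(axis=1),  # column name holding each row's maximum
    "Max count": counts.max(axis=1),     # the maximum value itself
})
print(result)
```

Putting the model name in the index keeps it attached to each result row, and avoids the 'argmax' error that comes from calling idxmax on a text column.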
There are seven "Species" of fish in this data set, and some have very few observations. To make analysis of Species a little easier, I need to create a new column in the data called Species.grouped that indicates if the fish is "Perch", "Bream", or "Other". So I have to group the 5 smallest groups of "Species" into a single group called "Other". The resultant column (Species.grouped) should have the value "Perch" if the fish is a Perch, "Bream" if it's a Bream, and "Other" if it's anything else.
Then I need to run a regression predicting the Weight of a fish using Species.grouped and Width as independent predictor variables (no interaction).
This looks like a problem that could be solved by mapping the 'Species' values to the desired categories ('Perch', 'Bream' and 'Other') using a dictionary and then applying that mapping to the pandas column.
This Answer shows plenty of examples that could help you achieve your requirement.
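A minimal sketch of the dictionary mapping, assuming the data is in a pandas DataFrame with a "Species" column. The sample rows and the species names "Pike" and "Smelt" are hypothetical, as are the names `fish` and `keep`; the regression step is not shown.

```python
import pandas as pd

# Hypothetical sample; only "Perch" and "Bream" are names taken from the question.
fish = pd.DataFrame({"Species": ["Perch", "Bream", "Pike", "Perch", "Smelt"]})

# Species that keep their own name; everything else becomes "Other".
keep = {"Perch": "Perch", "Bream": "Bream"}

# map() yields NaN for species missing from the dictionary,
# and fillna() turns those into "Other".
fish["Species.grouped"] = fish["Species"].map(keep).fillna("Other")
print(fish)
```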
I need help programming a process in AnyLogic.
I want to assign a specific number to an Agent (50%=1,20%=2,30%=3). And later I want to use the number to add to a stock.
For example: if the Agent has the number 3, I want the stock (variable) to go down by exactly this amount.
I tried using a Parameter on the Agent, but I am only able to create a string and with that I can't calculate later…
I would be very happy if anyone knows a solution to the problem! :)
Here is a simple solution for your case:
set the parameter type to "int"
Assign the initial value according to your probabilities, as below. Note that the probabilities used for numbers 2 and 3 are relative (conditional) probabilities, given that the value 1 was not drawn
Each agent instance will now draw a number between 1 and 3 (50% 1, 20% 2 and 30% 3). Then you can do your stock adjustments with that parameter.
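The conditional-probability logic can be sketched as follows. This is a Python illustration of the arithmetic only, not AnyLogic code; the function `draw_number` and the variables `shares` and `stock` are hypothetical names for this sketch.

```python
import random

def draw_number(rng):
    """Draw 1, 2 or 3 with overall probabilities 50%, 20% and 30%."""
    if rng.random() < 0.5:   # 50% of all agents get 1
        return 1
    if rng.random() < 0.4:   # 0.2 / (1 - 0.5) = 0.4 of the remaining half get 2
        return 2
    return 3                 # the rest (30% overall) get 3

rng = random.Random(42)      # fixed seed so the run is repeatable
draws = [draw_number(rng) for _ in range(100_000)]
shares = {k: draws.count(k) / len(draws) for k in (1, 2, 3)}
print(shares)                # shares should be close to 0.5, 0.2 and 0.3

# Because the number is an integer, it can be used directly in arithmetic.
stock = 100
stock -= 3                   # an agent whose number is 3 lowers the stock by 3
```

The second draw uses 0.4 rather than 0.2 because it only happens for the half of the agents that did not get 1.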
I have a field in my dashboard called outcome that displays the performance results for a doctor's office for multiple measures. A majority of the outcome values are rates and should be displayed as percentages. Unfortunately, there are two outcomes, "Utilization Management: Measure 3" and "Utilization Management: Measure 5" that are NOT rates, but actually number values.
Is there a way to display the outcome field in my table so that all 'Measures' that are NOT "Utilization Management: Measure 3" or "Utilization Management: Measure 5" get displayed as percentages, while the two aforementioned measures are displayed as number values?
Please do not get hung up on the appropriateness of combining rate and number values in the same field, as I've tried to have that conversation with my customer...they are insistent on this display ability and will not let it through UAT without it. Thanks.
Same question posed and packaged workbook attached for reference here.
One approach is to define a string valued calculated field that makes the number format part of the calculation logic. This solution does not play well with Measure Names and Measure Values, since you can't put string valued measures on the Measure Values shelf. But you can still build the view your customer wants, with a little effort.
First create a static set based on your field [Measure] (confusing choice of field name, by the way. You have a dimension named Measure.). Call it [Percentage Measures] and check off the ones you want displayed as percentages.
Then for each of your numeric measure fields that you want treated this way, make a corresponding calculated field that looks something like:
if attr([Percentage Measures]) then
    str(round(sum([outcome]) * 100, 2)) + "%"
else
    str(round(sum([outcome]), 3))
end
This approach assumes you will use your calculation on views where your dimension named [Measure] is on some (non-filter) shelf. Adjust the round() function arguments as desired.
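To make the branching explicit, here is the same formatting logic sketched in Python. It is not Tableau syntax; the set `PERCENTAGE_MEASURES` stands in for the static set, and the measure name "Some Rate Measure" and the function `format_outcome` are hypothetical.

```python
# Stand-in for the static set [Percentage Measures]; the member name is hypothetical.
PERCENTAGE_MEASURES = {"Some Rate Measure"}

def format_outcome(measure, outcome):
    """Return the outcome as text: a percentage for rate measures, a plain number otherwise."""
    if measure in PERCENTAGE_MEASURES:
        return f"{round(outcome * 100, 2)}%"
    return str(round(outcome, 3))

print(format_outcome("Some Rate Measure", 0.5))
print(format_outcome("Utilization Management: Measure 3", 12.3456))
```

As in the calculated field, the result is a string, so it can be displayed but no longer summed or sorted numerically.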
I took a look at your workbook. It seems that it is not possible to dynamically format values. There is a feature idea here: https://community.tableau.com/ideas/1411 which I believe would allow you to do what you want.
I have a list of financial metrics in column A, rest of the columns are the time periods the financial data is for.
Let's say I'm trying to calculate a ratio, but the financial metrics in A are not entirely unique, in the sense that a metric type may have more than one associated metric depending on how the company reports the metric.
For example, let's say I need Depreciation Expense on the income statement... that item may be reported as Depreciation, or DepreciationAndAmortization, or something else.
Any ideas how the formula in the ratio I'm trying to calculate can lookup the metric in A1, use the number immediately to the right as part of the formula... and if the metric Depreciation for example is 0, it would look for the next one I specify, like DepreciationAndAmortization, and use that one instead as the first one isn't reported.
If I understand correctly, this should do it:
=MAX(INDEX(B:B,MATCH("*depreciation*",A:A,)),INDEX(B:B,MATCH("*depreciation*",A:A,)+MATCH("*depreciation*",INDEX(A:A,1+MATCH("*depreciation*",A:A,)):INDEX(A:A,100+MATCH("*depreciation*",A:A,)),)))
If the alternatives are say in E2 and E3 then:
=MAX(VLOOKUP(E2,A:B,2,0),VLOOKUP(E3,A:B,2,0))
i.e., try both and take whichever is larger.
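For comparison, the fallback described in the question (use the first metric unless it is 0, then try the next one) can be sketched in Python. Note that this is a priority-order lookup rather than the MAX approach above, so the two can differ when both metrics are reported. The figures are made up, and `reported` and `first_reported` are hypothetical names.

```python
# Hypothetical figures; the metric names come from the question.
reported = {
    "Depreciation": 0,
    "DepreciationAndAmortization": 1250.0,
}

def first_reported(metrics, candidates):
    """Return the first candidate metric that is present and non-zero, else 0."""
    for name in candidates:
        value = metrics.get(name, 0)
        if value != 0:
            return value
    return 0

depreciation = first_reported(reported, ["Depreciation", "DepreciationAndAmortization"])
print(depreciation)  # 1250.0
```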
Regarding your concern that Excel Hero's answer returns a Value error: you can wrap it in the IFERROR function, which returns 0 if the value you're looking for isn't available.
=IFERROR(MAX(INDEX(B:B,MATCH("*depreciation*",$A:$A,)),INDEX(B:B,MATCH("*depreciation*",$A:$A,)+MATCH("*depreciation*",INDEX($A:$A,1+MATCH("*depreciation*",$A:$A,)):INDEX($A:$A,100+MATCH("*depreciation*",$A:$A,)),))),0)