Among the rows meeting one criterion, count how many of the 5 largest also meet another criterion - excel-2016

In a large table, I want to look only at the rows from a given sample (say sample1), find the 5 among them with the largest probability (another column in the table), and then count how many of those 5 meet a third criterion, which is Cell type = 1. I want to be able to do this for every sample in the large table.
I have tried to use LARGE((range=criteria)*(range);{1;2;3;4;5}) together with COUNTIFS, but I can't seem to build two criteria and two ranges into it. With COUNTIFS, it just takes the 5 rows with the largest probability in the whole table and checks how many of them are also sample1 and cell type = 1.
I guess the solution is some kind of nested IF or COUNTIF (I don't know whether that is even possible), but I just can't get it to work.
I have tried this one:
=COUNTIFS(Table1[Sample];"sample1";Table1[Cell type];"1";Table1[Probability];LARGE((Table1[Sample]="sample1")*(Table1[Probability]);{1;2;3;4;5}))
Which returns 1
I have also tried this one:
=COUNTIF(Table1[Sample];LARGE(IF(AND(Table1[Cell type]=1;Table1[Sample]="sample1");Table1[Probability]);{1;2;3;4;5}))
Which returns 0
I have run out of good ideas and really need some help, thanks!
To make it even clearer: with the data below, looking only at the rows from sample1, I want to count how many of the 5 rows with the highest probability (within sample1) also have cell type = 1. The results should be:
sample1 = 3
sample2 = 4
sample3 = 5
Sample -- Cell type -- Probability
sample1 -- 1 -- 0,95
sample1 -- 1 -- 0,9
sample1 -- 1 -- 0,85
sample1 -- 0 -- 0,8
sample1 -- 0 -- 0,75
sample1 -- 0 -- 0,7
sample1 -- 0 -- 0,65
sample2 -- 1 -- 0,97
sample2 -- 1 -- 0,95
sample2 -- 1 -- 0,93
sample2 -- 1 -- 0,91
sample2 -- 0 -- 0,89
sample2 -- 0 -- 0,87
sample2 -- 0 -- 0,85
sample2 -- 0 -- 0,83
sample2 -- 1 -- 0,81
sample3 -- 1 -- 0,87
sample3 -- 1 -- 0,86
sample3 -- 1 -- 0,85
sample3 -- 1 -- 0,84
sample3 -- 1 -- 0,83
sample3 -- 1 -- 0,82
sample3 -- 1 -- 0,81
sample3 -- 1 -- 0,8
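
To pin down the logic being asked for, independent of any particular Excel formula, here is a minimal Python sketch. It is not an Excel solution; the function name count_top_n_matching and the tuple layout are made up for illustration, and ties at the cutoff are assumed to be broken by row order.

```python
from collections import defaultdict

# Example data from the question: (sample, cell_type, probability).
rows = [
    ("sample1", 1, 0.95), ("sample1", 1, 0.90), ("sample1", 1, 0.85),
    ("sample1", 0, 0.80), ("sample1", 0, 0.75), ("sample1", 0, 0.70),
    ("sample1", 0, 0.65),
    ("sample2", 1, 0.97), ("sample2", 1, 0.95), ("sample2", 1, 0.93),
    ("sample2", 1, 0.91), ("sample2", 0, 0.89), ("sample2", 0, 0.87),
    ("sample2", 0, 0.85), ("sample2", 0, 0.83), ("sample2", 1, 0.81),
    ("sample3", 1, 0.87), ("sample3", 1, 0.86), ("sample3", 1, 0.85),
    ("sample3", 1, 0.84), ("sample3", 1, 0.83), ("sample3", 1, 0.82),
    ("sample3", 1, 0.81), ("sample3", 1, 0.80),
]


def count_top_n_matching(rows, n=5, wanted_cell_type=1):
    """Per sample: take the n highest-probability rows, then count those of the wanted cell type."""
    by_sample = defaultdict(list)
    for sample, cell_type, probability in rows:
        by_sample[sample].append((probability, cell_type))

    counts = {}
    for sample, entries in by_sample.items():
        # Rank within the sample only, not across the whole table.
        top = sorted(entries, key=lambda e: e[0], reverse=True)[:n]
        counts[sample] = sum(1 for _, cell_type in top if cell_type == wanted_cell_type)
    return counts


print(count_top_n_matching(rows))
```

The key point is the order of operations: filter by sample first, rank second, and only then test the cell type. The COUNTIFS attempt tests the cell type and the sample against a ranking that was not restricted in the same way.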

Related

Convert duplicate rows into one with different values

I'm trying to find a way to rearrange my data frame. Currently more than half of the rows are duplicates of a single object, and I would like to combine them into one. A fraction of my dataset is shown below:
#NAME Sample1 Sample2 Sample3 sample4 Sample5
AAC(6')-Ib7 5 0 0 0 0
AAC(6')-Ib7 0 3 0 0 25
AAC(6')-Ib7 0 0 0 0 0
AAC(6')-Ib7 0 0 0 10 0
AAC(6')-Ib7 0 0 0 0 0
And I would like to have the output:
#NAME Sample1 Sample2 Sample3 sample4 Sample5
AAC(6')-Ib7 5 3 0 10 25
Can you give me any tips on how to rearrange it?
My original dataset has more than 7000 rows, but most are duplicates (there should be around 800 unique rows). Do I have to do it for each value separately?
I would appreciate your help!
Thank you.
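
One way to read the request is "group by name and add the columns up". The Python sketch below illustrates that reading only; the function name collapse_duplicates is hypothetical, and summing is an assumption that holds when each column has at most one non-zero value per name, as in the example (otherwise taking the maximum might be what is wanted).

```python
# Example rows from the question: (name, [Sample1, Sample2, Sample3, sample4, Sample5]).
rows = [
    ("AAC(6')-Ib7", [5, 0, 0, 0, 0]),
    ("AAC(6')-Ib7", [0, 3, 0, 0, 25]),
    ("AAC(6')-Ib7", [0, 0, 0, 0, 0]),
    ("AAC(6')-Ib7", [0, 0, 0, 10, 0]),
    ("AAC(6')-Ib7", [0, 0, 0, 0, 0]),
]


def collapse_duplicates(rows):
    """Merge rows sharing a name by summing their values column by column."""
    merged = {}
    for name, values in rows:
        if name not in merged:
            merged[name] = list(values)
        else:
            # Assumption: summing is the intended way to combine duplicates.
            merged[name] = [a + b for a, b in zip(merged[name], values)]
    return merged


print(collapse_duplicates(rows))
```

Because the grouping handles every name in one pass, nothing has to be done separately for each value.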

How to query the n rows before and after a random row in Hive

After selecting a random row, I want to be able to select n number of records preceding and following it.
Example:
id content
1 add
2 bob
3 cdf
4 asd
If the random row id is 3 (and n is 1), I need the select to return:
2 bob
3 cdf
4 asd
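
The intended selection can be sketched outside Hive as "find the position of the chosen row in id order, then take the n positions on either side". The Python below is only an illustration of that logic, not a Hive query; window_around is a hypothetical helper, and rows are assumed to be ordered by id.

```python
import random

# Example table from the question: (id, content).
table = [(1, "add"), (2, "bob"), (3, "cdf"), (4, "asd")]


def window_around(rows, center_id, n):
    """Return the row with center_id plus up to n rows before and after it, in id order."""
    ordered = sorted(rows, key=lambda r: r[0])
    position = [r[0] for r in ordered].index(center_id)
    # Clamp the start at 0 so rows near the top do not wrap around.
    return ordered[max(0, position - n): position + n + 1]


# The case from the question: row id 3 with n = 1.
print(window_around(table, 3, 1))

# The same selection around a randomly chosen row.
random_id = random.choice(table)[0]
print(window_around(table, random_id, 1))
```

Working with positions rather than raw id values matters when ids have gaps. In SQL this kind of positional lookup is usually done with a row-number window function, though whether that fits this particular table is an assumption.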

How to sum values of two columns by an ID column, keeping some columns with repeated values and excluding others?

I need to organize a large df by summing the values of some columns per ID (the ID is not sequential), keeping the columns whose values are repeated within an ID and dropping the columns whose values differ within an ID. Below is a reproducible example and the output I need. I think there is a simple way to do this, but I am not very familiar with R.
df=read.table(textConnection("
ID spp effort generalist specialist
1 a 10 1 0
1 b 10 1 0
1 c 10 0 1
1 d 10 0 1
2 a 16 1 0
2 b 16 1 0
2 e 16 0 1
"), header = TRUE)
The output I need:
ID effort generalist specialist
1 10 2 2
2 16 2 1
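
The required output amounts to a group-by on ID that keeps effort (constant within an ID), sums generalist and specialist, and drops spp. As a language-neutral illustration of that logic, here is a small Python sketch; it is not R code, and summarise_by_id is a made-up name.

```python
# The example data frame from the question, one dict per row.
rows = [
    {"ID": 1, "spp": "a", "effort": 10, "generalist": 1, "specialist": 0},
    {"ID": 1, "spp": "b", "effort": 10, "generalist": 1, "specialist": 0},
    {"ID": 1, "spp": "c", "effort": 10, "generalist": 0, "specialist": 1},
    {"ID": 1, "spp": "d", "effort": 10, "generalist": 0, "specialist": 1},
    {"ID": 2, "spp": "a", "effort": 16, "generalist": 1, "specialist": 0},
    {"ID": 2, "spp": "b", "effort": 16, "generalist": 1, "specialist": 0},
    {"ID": 2, "spp": "e", "effort": 16, "generalist": 0, "specialist": 1},
]


def summarise_by_id(rows):
    """Group by ID: keep effort, sum generalist and specialist, drop spp."""
    out = {}
    for row in rows:
        # Assumption: effort is identical for all rows of an ID, so the first value is kept.
        agg = out.setdefault(
            row["ID"],
            {"ID": row["ID"], "effort": row["effort"], "generalist": 0, "specialist": 0},
        )
        agg["generalist"] += row["generalist"]
        agg["specialist"] += row["specialist"]
    return list(out.values())


for line in summarise_by_id(rows):
    print(line)
```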

Ensure percentages are between 0 and 1, inclusive (using a single function)

I have percentages in a condition table:
create table condition (percent_decimal number(3,2));
insert into condition values (-0.01);
insert into condition values (0.1);
insert into condition values (1);
insert into condition values (1.1);
commit;
PERCENT_DECIMAL
---------------
-0.01
.1
1
1.1
I want to select the values, but modify them to present them as percentages between 0 and 1 (inclusive):
Convert -0.01 to 0
Leave .1 as is
Leave 1 as is
Convert 1.1 to 1
I can successfully do this using the greatest and least functions:
select
percent_decimal,
least(1,greatest(0,percent_decimal)) as percent_modified
from
condition
PERCENT_DECIMAL PERCENT_MODIFIED
--------------- ----------------
-0.01 0
.1 .1
1 1
1.1 1
However, I'm wondering if there is a more succinct way of doing this--with a single function.
You could use a single case expression:
select
percent_decimal,
case when percent_decimal < 0 then 0
when percent_decimal > 1 then 1
else percent_decimal
end as percent_modified
from
condition
/
PERCENT_DECIMAL PERCENT_MODIFIED
--------------- ----------------
-0.01 0
.1 .1
1 1
1.1 1
which is longer, but uses no functions, and I think it's clearer to someone coming along later what your logic is.
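
The operation in question is commonly called clamping. For comparison, here is the same logic as a short Python sketch, purely to illustrate what both SQL versions compute; clamp is a helper defined here, not a built-in.

```python
def clamp(value, low=0.0, high=1.0):
    """Limit value to the closed interval [low, high]."""
    # Same shape as least(1, greatest(0, x)): cap from above, then from below.
    return max(low, min(high, value))


values = [-0.01, 0.1, 1, 1.1]
for v in values:
    print(v, clamp(v))
```

Values already inside the interval, including the endpoints 0 and 1, pass through unchanged.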

Can I split a dynamic semicolon delimited string into columns using T-SQL?

I have a table that looks like so:
Id SubNumber Values
1 1 1;4;8;3
2 2 8;9;7;10
3 3 41;45;23;0
I will not always have exactly 4 values, and the number of "SubNumbers" can be greater than 3. Is there any way I can query this table so that it looks like this?
Id SubNumber 1 2 3 4
1 1 1 4 8 3
2 2 8 9 7 10
3 3 41 45 23 0
Within a given table, every row will have the same number of semicolon-delimited values, but that number can vary from table to table. So a table may have 10 values per row, or just 1, or any other number.
The 2nd table doesn't have to have numbers to represent the values. It can even be blank or the default that is given by SQL when no name is provided.
This is not a duplicate of the example provided because this deals with separating a dynamic number of values into columns.
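
To make the desired transformation concrete, here is a minimal Python sketch of the split, not a T-SQL solution. The function name split_values is hypothetical, the rows are taken from the first table above, and values are assumed to be integers.

```python
# Rows from the source table in the question: (Id, SubNumber, Values).
rows = [
    (1, 1, "1;4;8;3"),
    (2, 2, "8;9;7;10"),
    (3, 3, "41;45;23;0"),
]


def split_values(rows, sep=";"):
    """Expand the delimited Values string into one column per value."""
    result = []
    for row_id, sub_number, values in rows:
        # The number of resulting columns is whatever the data contains; nothing is hard-coded.
        parts = [int(v) for v in values.split(sep)]
        result.append((row_id, sub_number, *parts))
    return result


for line in split_values(rows):
    print(line)
```

The number of output columns is only known once the data has been read, which is what makes this harder in SQL, where a query's column list is normally fixed in advance.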