SQL: Is there a way I can find whether a value is within a specific index range of another value?

I have two columns filled with mostly 0s and a few 1s. Whenever a 1 occurs in the first column, I want to check whether a 1 occurs in the second column within a range of 5 rows of that index. For example, if a 1 occurs in column 1 at row 83, I would like to return TRUE if one or more 1s occur in column 2 in rows 83-88, and FALSE otherwise. Examples are listed in the code block. I then want to count the number of TRUE and FALSE occurrences.
TRUE:
0 0
0 0
0 0
1 1
0 0
0 0
0 0
0 0
0 0
0 0
TRUE:
0 0
0 0
0 0
1 0
0 0
0 0
0 1
0 1
0 0
0 0
FALSE:
0 0
0 0
0 1
1 0
0 0
0 0
0 0
0 0
0 0
0 1
I have no idea where to begin, so I do not have any code to start with :(
Kind regards,
Kai

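Assuming you have an ordering column, you can use window functions. The query below returns 'true' if at least one row with col1 = 1 has a 1 in col2 in the current row or the 4 following rows (a 5-row window; use 5 following if row 88 in your example should be included):
select (case when count(*) = 0 then 'false' else 'true' end)
from (select t.*,
             max(col2) over (order by <ordering column>
                             rows between current row and 4 following
                            ) as max_col2_5
      from t
     ) t
where col1 = 1 and max_col2_5 = 1;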
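The question also asks for counts of TRUE and FALSE cases, which a single 'true'/'false' result does not give. Below is a minimal sketch of the same look-ahead window in plain Python (not SQL), so it can be checked without a database. The function name `count_window_hits` and the `following` parameter are assumptions made for illustration; `following=4` mirrors `rows between current row and 4 following`.

```python
def count_window_hits(col1, col2, following=4):
    """For each row where col1 is 1, check whether col2 contains a 1
    in the current row or in the next `following` rows.
    Returns (true_count, false_count)."""
    true_count = false_count = 0
    for i, value in enumerate(col1):
        if value != 1:
            continue
        window = col2[i:i + following + 1]  # current row plus `following` rows
        if any(v == 1 for v in window):
            true_count += 1
        else:
            false_count += 1
    return true_count, false_count


# The three examples from the question, one (col1, col2) pair each.
example_true_1 = ([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0, 0, 0])
example_true_2 = ([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 1, 1, 0, 0])
example_false = ([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0, 0, 0, 1])

print(count_window_hits(*example_true_1))  # (1, 0)
print(count_window_hits(*example_true_2))  # (1, 0)
print(count_window_hits(*example_false))   # (0, 1)
```

In SQL the same counts could presumably be obtained by grouping the rows with col1 = 1 on max_col2_5 instead of collapsing them into one value.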

Related

Add condition in SQL query based on table value

I am using Oracle as my database. I want to add a condition to a SQL query based on table data: if CT_GENERAL is 1 in a row, I want to apply an additional condition (CST_GENERAL = user argument).
select * from ch_caseinfo where
case when ct_general = 1
then cst_general = %3
end
%3 = Funding
//TABLE STRUCTURE
//CH_CASEINFO
VOLUMEID | CT_ADVERSE | CT_GENERAL | CT_HA | CT_MI | CST_GENERAL   | CST_MI
149634   | 0          | 0          | 0     | 0     |               |
161077   | 0          | 0          | 0     | 0     |               |
161147   | 0          | 1          | 0     | 1     | Funding       | Composition/ingredients
161268   | 0          | 1          | 0     | 0     | Funding       |
161306   | 0          | 1          | 0     | 0     | Manufacturing |
240131   | 0          | 1          | 1     | 0     | Funding       |
239364   | 0          | 0          | 0     | 0     |               |
239364   | 0          | 0          | 0     | 0     |               |
147434   | 0          | 0          | 0     | 0     |               |
147466   | 0          | 0          | 0     | 0     |               |
158990   | 0          | 1          | 0     | 1     | Funding       | Administration
98863    | 1          | 1          | 1     | 1     | Funding       | Disposal
159757   | 1          | 1          | 1     | 1     | Funding       | Disposal
98863    |            |            |       |       |               |
191039   | 1          | 1          | 0     | 0     | Other         |
97007    | 0          | 0          | 0     | 0     |               |
ORA-00905: missing keyword
00905. 00000 - "missing keyword"
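Write the WHERE clause so that the expression is always true for rows where the filter should not apply (CT_GENERAL is 0). In the example below, when ct_general = 0 the CASE returns cst_general itself, so the comparison cst_general = cst_general holds, unless cst_general is NULL, in which case the row is dropped. In your sample data CST_GENERAL looks empty wherever CT_GENERAL is 0, so you will probably need to accommodate NULLs, for example by writing the predicate as (ct_general = 0 OR cst_general = USERARGUMENT) instead.
SELECT *
FROM ch_caseinfo
WHERE CASE WHEN ct_general = 0 THEN cst_general ELSE USERARGUMENT END = cst_general
  AND OTHERCRITERIA = CRITERIA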
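To see how NULLs affect this kind of conditional filter, here is a small sketch using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption: both treat a comparison with NULL as not true, but other details differ). The table holds only a few rows and columns from the question, the helper name `matching_ids` is made up for illustration, and the OR form is an alternative predicate, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ch_caseinfo (volumeid INTEGER, ct_general INTEGER, cst_general TEXT)"
)
conn.executemany(
    "INSERT INTO ch_caseinfo VALUES (?, ?, ?)",
    [
        (149634, 0, None),            # empty CST_GENERAL stored as NULL
        (161147, 1, "Funding"),
        (161306, 1, "Manufacturing"),
        (191039, 1, "Other"),
        (97007, 0, None),
    ],
)


def matching_ids(where_clause, user_argument):
    """Run the filter with one bound parameter and return the matching ids."""
    sql = f"SELECT volumeid FROM ch_caseinfo WHERE {where_clause} ORDER BY volumeid"
    return [row[0] for row in conn.execute(sql, (user_argument,))]


# CASE form: rows with ct_general = 0 compare NULL = NULL and are dropped.
case_form = matching_ids(
    "CASE WHEN ct_general = 0 THEN cst_general ELSE ? END = cst_general", "Funding"
)
# OR form: rows with ct_general = 0 pass regardless of cst_general.
or_form = matching_ids("ct_general = 0 OR cst_general = ?", "Funding")

print(case_form)  # [161147]
print(or_form)    # [97007, 149634, 161147]
```

Which result is wanted depends on whether rows with CT_GENERAL = 0 should be returned at all; the question does not say.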

How do I grab rows surrounding a flagged value?

I'm starting with a table like this:
code new_code_flag
abc123 0
xyz456 0
wer098 1
jio234 0
bcx190 0
eiw157 0
nzi123 0
epj676 0
ere654 0
yru493 1
ale674 0
I want to grab the 2 records before and 2 records after each value where "new_code_flag"=1. I want my output to look like this:
code new_code_flag
abc123 0
xyz456 0
wer098 1
jio234 0
bcx190 0
epj676 0
ere654 0
yru493 1
ale674 0
Any help on how to do this in SQL or SAS?
SQL tables represent unordered sets. Hence, in SQL you need to have a column that specifies the ordering. Assuming you do, you can do something like:
with t as (
      select t.*, row_number() over (order by ?) as seqnum
      from tbl t
     )
select t.*
from t
where exists (select 1
              from t t2
              where t2.new_code_flag = 1 and
                    t.seqnum between t2.seqnum - 2 and t2.seqnum + 2
             );
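The logic of that query (keep a row if any flagged row lies within 2 positions of it) can be checked against the sample data with a short Python sketch. The function name `rows_near_flag` and the `radius` parameter are illustrative assumptions, and list position stands in for the ordering column that the SQL version needs.

```python
def rows_near_flag(rows, radius=2):
    """Keep every (code, flag) row that lies within `radius` positions
    of a row whose flag is 1, preserving the original order."""
    flagged = [i for i, (_, flag) in enumerate(rows) if flag == 1]
    return [
        row
        for i, row in enumerate(rows)
        if any(abs(i - j) <= radius for j in flagged)
    ]


table = [
    ("abc123", 0), ("xyz456", 0), ("wer098", 1), ("jio234", 0),
    ("bcx190", 0), ("eiw157", 0), ("nzi123", 0), ("epj676", 0),
    ("ere654", 0), ("yru493", 1), ("ale674", 0),
]

for code, flag in rows_near_flag(table):
    print(code, flag)
```

For the sample table this keeps the nine rows listed in the question and drops eiw157 and nzi123.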
You could create two lag and two lead copies of the flag variable and then test if any of the 5 variables are 1 (true).
data have;
  input code $ flag ;
cards;
abc123 0
xyz456 0
wer098 1
jio234 0
bcx190 0
eiw157 0
nzi123 0
epj676 0
ere654 0
yru493 1
ale674 0
;
data want ;
  set have ;
  set have(keep=flag rename=(flag=lead1_flag) firstobs=2) have(drop=_all_ obs=1);
  set have(keep=flag rename=(flag=lead2_flag) firstobs=3) have(drop=_all_ obs=2);
  lag1_flag=lag1(flag);
  lag2_flag=lag2(flag);
  if lag1_flag or lag2_flag or flag or lead1_flag or lead2_flag ;
run;
Results
                      lead1_  lead2_  lag1_  lag2_
Obs  code    flag     flag    flag    flag   flag
1    abc123  0        0       1       .      .
2    xyz456  0        1       0       0      .
3    wer098  1        0       0       0      0
4    jio234  0        0       0       1      0
5    bcx190  0        0       0       0      1
6    epj676  0        0       1       0      0
7    ere654  0        1       0       0      0
8    yru493  1        0       .       0      0
9    ale674  0        .       .       1      0
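Alternatively, merge the data with a copy of the flag read two rows ahead and use a counter to keep a run of rows around each flag:
data want(drop=_: i);
  merge have have(keep=flag firstobs=3 rename=(flag=_flag));
  if flag or _flag then i=1;
  if 0<i<=3 then do;
    output;
    i+1;
  end;
  else delete;
run;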

WHERE command is not working

Using InfluxQL, when I run the following command
select "P_askbid_midprice1" from "/HFT/Data_HFT/OrderBook/DCIX_OB" limit 50
I got the following result
name: /HFT/Data_HFT/OrderBook/DCIX_OB
time P_askbid_midprice1
---- ------------------
2015-05-30T00:00:00Z 0
2015-05-30T00:00:01Z 0
2015-05-30T00:00:02Z 0
2015-05-30T00:00:03Z 0
2015-05-30T00:00:04Z 0
2015-05-30T00:00:05Z 0
2015-05-30T00:00:06Z 0
2015-05-30T00:00:07Z 0
2015-05-30T00:00:08Z 0
2015-05-30T00:00:09Z 0
2015-05-30T00:00:10Z 0
2015-05-30T00:00:11Z 0
2015-05-30T00:00:12Z 0
2015-05-30T00:00:13Z 0
2015-05-30T00:00:14Z 0
2015-05-30T00:00:15Z 0
2015-05-30T00:00:16Z 0
2015-05-30T00:00:17Z 0
2015-05-30T00:00:18Z 0
2015-05-30T00:00:19Z 0
2015-05-30T00:00:20Z 0
2015-05-30T00:00:21Z 0
2015-05-30T00:00:22Z 0
2015-05-30T00:00:23Z 0
2015-05-30T00:00:24Z 0
2015-05-30T00:00:25Z 0
2015-05-30T00:00:26Z 0
2015-05-30T00:00:27Z 0
2015-05-30T00:00:28Z 0
2015-05-30T00:00:29Z 0
2015-05-30T00:00:30Z 0
2015-05-30T00:00:31Z 0
2015-05-30T00:00:32Z 0
2015-05-30T00:00:33Z 0
2015-05-30T00:00:34Z 0
2015-05-30T00:00:35Z 0
2015-05-30T00:00:36Z 0
2015-05-30T00:00:37Z 0
2015-05-30T00:00:38Z 0
2015-05-30T00:00:39Z 0
2015-05-30T00:00:40Z 0
But with the command
select "P_askbid_midprice1" from "/HFT/Data_HFT/OrderBook/DCIX_OB" WHERE time > '2016-05-30' and time < '2015-05-31'
I got nothing from that command, even though it is pretty similar to the previous one.
What is the problem with that command?
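The two conditions contradict each other: time cannot be both after 2016-05-30 and before 2015-05-31, so with and no point can ever match. It has to be one or the other, which would call for or instead of and. Note, however, that as far as I know InfluxQL does not support or between time conditions. Since the data you show is from 2015-05-30, the lower bound was probably meant to be 2015-05-30 (an assumption on my part):
select "P_askbid_midprice1"
from "/HFT/Data_HFT/OrderBook/DCIX_OB"
WHERE
time > '2015-05-30'
and time < '2015-05-31'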

SQL Case statements deriving new attribute

I have a table with direction indicators, and based on them I need to derive a new column that tells whether the row is In or Out.
ORG_IN ORG_OUT DEST_IN DEST_OUT Direction
0 0 0 0 NULL
0 0 0 1 Out
0 0 1 0 In
0 1 0 0 Out
0 1 0 1 Out
0 1 1 0 NULL
1 0 0 0 In
1 0 0 1 NULL
1 0 1 0 In
This is the query where I derive the direction:
http://sqlfiddle.com/#!4/a9f82/1
Do you think it will cover all combinations that may appear in the future? Right now I can only see the combinations above. Is there a better way to write the SQL?
select t.*,
       case ORG_IN + DEST_IN - ORG_OUT - DEST_OUT
         when 2 then 'In'
         when 1 then 'In'
         when 0 then null
         when -1 then 'Out'
         when -2 then 'Out'
       end as Direction
from tablename t
I can't figure out any more valid combinations. However, I'd recommend a check constraint that makes sure no invalid combinations are entered:
check (ORG_IN + ORG_OUT < 2 and DEST_IN + DEST_OUT < 2)
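As a quick sanity check of both the CASE expression and the constraint, the sketch below enumerates all 16 flag combinations in Python. The function names `direction` and `is_valid` are assumptions introduced for illustration; they mirror the arithmetic in the query and the check constraint above.

```python
from itertools import product


def direction(org_in, org_out, dest_in, dest_out):
    """Mirror of the CASE expression: positive score is 'In',
    negative is 'Out', zero gives None (NULL)."""
    score = org_in + dest_in - org_out - dest_out
    if score > 0:
        return "In"
    if score < 0:
        return "Out"
    return None


def is_valid(org_in, org_out, dest_in, dest_out):
    """Mirror of the check constraint: In and Out may not both be set
    for the origin, nor for the destination."""
    return org_in + org_out < 2 and dest_in + dest_out < 2


# All combinations of four 0/1 flags that pass the constraint.
valid = [combo for combo in product((0, 1), repeat=4) if is_valid(*combo)]

for combo in valid:
    print(combo, direction(*combo))
```

Under these assumptions exactly nine combinations pass the constraint, and they are the nine rows already shown in the question.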

Calculating ratio value within a line which contain binary numbers "0" & "1"

I have a data file which contains more than 2000 lines and 45001 columns.
The first column is actually a string which describes the data type.
From column #2 up to column #45001, the data is represented as
"1"
or
"0"
For example, the pattern of data in a line is
(0 0 0 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0)
The total number of data values is 25. Within this line, there are 5 sub-groups which consist only of "1"s, e.g. (11 111 1111 1 111). The "0"s in between the sub-groups act as delimiters. The total number of "1"s is 13.
I would like to calculate the ratio of
(total of all "1"s / total of number of sub-groups made only by "1"s)
That is
(13/5).
I tried with this code for calculating the total of all "1"s ;
awk -F '0' '{print NF}' < inputfile.in
This gives value 13.
But I don't know how to go further from here to calculate the ratio that I want.
I don't know how to find the number of sub-groups within each line because the occurrences of "1"s and "0"s are random.
Wish to get some kind help to sort this problem.
Appreciate any help in advance.
It is not clear to me from the description what the format of the input file is. Assume the input looks like:
$ cat file
0 0 0 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0
To count up the number of ones and the number of groups of ones and take their ratio:
$ awk '{f=0;s1=0;s2=0;for (i=2;i<=NF;i++){s1+=$i;if ($i && !f)s2++;f=$i}; print s1/s2}' file
2.6
Update: Handling all zeros
Suppose one of the lines in the file has all zeros:
$ cat file
0 0 0 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
For the second line, both sums are zero, which would lead to a divide-by-zero error. We can avoid that by adding an if statement which prints the ratio if one exists, or 0/0 if it doesn't:
if (s2>0)print s1/s2; else print s1"/"s2
The complete code is now:
$ awk '{f=0;s1=0;s2=0;for (i=2;i<=NF;i++){s1+=$i;if ($i && !f)s2++;f=$i}; if (s2>0)print s1/s2; else print s1"/"s2}' file
2.6
0/0
How it works
The code uses three variables. f is a flag which is true (1) if we are currently in a group of ones and false (0) otherwise. s1 is the number of ones on the line. s2 is the number of groups of ones on the line.
f=0;s1=0;s2=0
At the beginning of each line, we initialize the variables.
for (i=2;i<=NF;i++){s1+=$i;if ($i && !f)s2++;f=$i}
We loop over each field on the line starting with field 2. If the field contains a 1, we increment counter s1. If the field is 1 and is the start of a new group, we increment s2.
if (s2>0)print s1/s2; else print s1"/"s2}
If we encountered at least one one, we print the ratio s1/s2. Otherwise, we print 0/0.
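For readers who find the one-liner dense, the same flag-and-two-counters idea can be sketched in Python. The function name `ones_per_group` is an assumption for illustration, and returning `None` for a line with no ones is a choice made here in place of printing 0/0. Unlike the awk loop, which starts at field 2, this sketch expects only the 0/1 values, so any leading label column must be removed first.

```python
def ones_per_group(bits):
    """Return (number of ones) / (number of runs of consecutive ones),
    or None when the line contains no ones at all."""
    total = 0         # plays the role of s1
    groups = 0        # plays the role of s2
    in_group = False  # plays the role of f
    for bit in bits:
        if bit == 1:
            total += 1
            if not in_group:
                groups += 1  # a new run of ones starts here
        in_group = (bit == 1)
    return total / groups if groups else None


line = "0 0 0 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0"
bits = [int(field) for field in line.split()]

print(ones_per_group(bits))      # 2.6
print(ones_per_group([0] * 25))  # None
```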
Here is an awk that does what you need:
cat file
data 0 0 0 1 1 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0
data 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
data 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
data 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
BMR_10#O24-BMR_6#O13-H13 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1
data 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 1
awk '{$1="";$0="0 "$0" 0";t=split($0,b,"1")-1;gsub(/ +/,"");n=split($0,a,"[^1]+")-2;print (n?t/n:0)}' file
2.6
0
25
11
5.5
3