select rows where only 1 of boolean columns is true - sql

I have a table with the following format:
CREATE TABLE segments(id INT, walk BOOLEAN, taxi BOOLEAN, bus BOOLEAN,
subway BOOLEAN, bike BOOLEAN);
INSERT INTO segments (id, walk, taxi, bus, subway, bike)
VALUES (0,false,false,false,false,true),
(1,true,true,false,false,false),(2,true,false,false,false,false),
(3,true,false,true,false,false),(4,true,true,true,false,false),
(5,false,false,true,false,false),(6,true,true,false,false,false),
(7,true,false,false,false,false),(8,true,false,true,false,false),
(9,true,true,true,false,false),(10,true,false,true,false,false);
SELECT * FROM segments;
id walk taxi bus subway bike
0 f f f f t
1 t t f f f
2 t f f f f
3 t f t f f
4 t t t f f
5 f f t f f
6 t t f f f
7 t f f f f
8 t f t f f
9 t t t f f
10 t f t f f
But I want to filter for rows where exactly one of walk, taxi, bus, subway or bike is true, and no other.
Expected output:
id walk taxi bus subway bike
0 f f f f t
2 t f f f f
5 f f t f f
7 t f f f f

You can cast boolean as int in postgres:
SELECT *
from segments
where walk::int + taxi::int + bus::int + subway::int + bike::int = 1;
id walk taxi bus subway bike
0 false false false false true
2 true false false false false
5 false false true false false
7 true false false false false
See db fiddle.
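For readers without a Postgres instance at hand, the same sum-of-flags filter can be tried with Python's built-in sqlite3 module. This is only a sketch on a different engine: SQLite stores booleans as the integers 0 and 1, so the ::int cast from the answer above is not needed there. The table and data mirror the question.

```python
import sqlite3

# In-memory SQLite database; SQLite stores BOOLEAN values as integers 0/1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments(id INT, walk BOOLEAN, taxi BOOLEAN, "
    "bus BOOLEAN, subway BOOLEAN, bike BOOLEAN)"
)
rows = [
    (0, 0, 0, 0, 0, 1), (1, 1, 1, 0, 0, 0), (2, 1, 0, 0, 0, 0),
    (3, 1, 0, 1, 0, 0), (4, 1, 1, 1, 0, 0), (5, 0, 0, 1, 0, 0),
    (6, 1, 1, 0, 0, 0), (7, 1, 0, 0, 0, 0), (8, 1, 0, 1, 0, 0),
    (9, 1, 1, 1, 0, 0), (10, 1, 0, 1, 0, 0),
]
conn.executemany("INSERT INTO segments VALUES (?, ?, ?, ?, ?, ?)", rows)

# Exactly one flag set <=> the flags sum to 1.
single_mode_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM segments "
        "WHERE walk + taxi + bus + subway + bike = 1 ORDER BY id"
    )
]
print(single_mode_ids)
```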

One way is to encode your booleans as bits, and then check that exactly 1 bit is set. Example:
select * from segments
where 1*walk::int+2*taxi::int+4*bus::int+8*subway::int+16*bike::int in (1,2,4,8,16);
Fiddle
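The "exactly one bit set" test does not have to list the powers of two by hand. A minimal Python sketch of the same idea, using the usual x & (x - 1) trick; the function name has_exactly_one and the flag order are illustrative, not part of the answer:

```python
def has_exactly_one(flags):
    """Return True when exactly one of the boolean flags is set."""
    # Pack the flags into an integer: first flag is bit 0, the next bit 1, ...
    mask = sum(int(flag) << position for position, flag in enumerate(flags))
    # A non-zero integer has a single bit set iff clearing its lowest
    # set bit leaves zero.
    return mask != 0 and (mask & (mask - 1)) == 0

# (walk, taxi, bus, subway, bike) for ids 0, 1 and 5 from the question.
print(has_exactly_one((False, False, False, False, True)))
print(has_exactly_one((True, True, False, False, False)))
print(has_exactly_one((False, False, True, False, False)))
```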

How can I aggregate strings from many cells into one cell?

Say I have two classes with a handful of students each, and I want to think of the possible pairings in each class. In my original data, I have one line per student.
What's the easiest way in Pandas to turn this dataset
Class Students
0 1 A
1 1 B
2 1 C
3 1 D
4 1 E
5 2 F
6 2 G
7 2 H
Into this new stuff?
Class Students
0 1 A,B
1 1 A,C
2 1 A,D
3 1 A,E
4 1 B,C
5 1 B,D
6 1 B,E
7 1 C,D
8 1 C,E
9 1 D,E
10 2 F,G
11 2 F,H
12 2 G,H
Try This:
import itertools
import pandas as pd
cla = [1, 1, 1, 1, 1, 2, 2, 2]
s = ["A", "B", "C", "D", "E", "F", "G", "H"]
df = pd.DataFrame(cla, columns=["Class"])
df['Student'] = s
def create_combos(list_students):
    # Build every unordered pair of students as an "X,Y" string.
    combos = itertools.combinations(list_students, 2)
    str_students = []
    for i in combos:
        str_students.append(str(i[0]) + "," + str(i[1]))
    return str_students
def iterate_df(class_id):
    # Collect the pairs for one class, plus a matching list of class ids.
    df_temp = df.loc[df['Class'] == class_id]
    list_student = list(df_temp['Student'])
    list_combos = create_combos(list_student)
    list_id = [class_id for i in list_combos]
    return list_id, list_combos
list_classes = set(list(df['Class']))
new_id = []
new_combos = []
for idx in list_classes:
    tmp_id, tmp_combo = iterate_df(idx)
    new_id += tmp_id
    new_combos += tmp_combo
new_df = pd.DataFrame(new_id, columns=["Class"])
new_df["Student"] = new_combos
print(new_df)
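The same result can probably be had more compactly by letting pandas do the per-class grouping. A sketch, assuming the column is named Students as in the question (the answer above uses Student) and a pandas version that has Series.explode (0.25 or later); the name pairs is illustrative:

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    "Class": [1, 1, 1, 1, 1, 2, 2, 2],
    "Students": ["A", "B", "C", "D", "E", "F", "G", "H"],
})

pairs = (
    df.groupby("Class")["Students"]
    # One list of "X,Y" strings per class.
    .apply(lambda names: [",".join(pair) for pair in combinations(names, 2)])
    # One row per pair; the class id stays in the index.
    .explode()
    # Turn the class index back into an ordinary column.
    .reset_index()
)
print(pairs)
```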

How to build column by column dataframe pandas

I have a dataframe looking like this example
A | B | C
__|___|___
s s nan
nan x x
I would like to create a table of intersections between columns like this
| A | B | C
__|______|____|______
A | True |True| False
__|______|____|______
B | True |True|True
__|______|____|______
C | False|True|True
__|______|____|______
Is there an elegant cycle-free way to do it?
Thank you!
Setup
import numpy as np
import pandas as pd
df = pd.DataFrame(dict(A=['s', np.nan], B=['s', 'x'], C=[np.nan, 'x']))
Option 1
You can use numpy broadcasting to compare each column with every other column, then check whether any of the comparisons is True.
v = df.values
pd.DataFrame(
(v[:, :, None] == v[:, None]).any(0),
df.columns, df.columns
)
A B C
A True True False
B True True True
C False True True
Replacing any with sum gives a count of the intersections instead.
v = df.values
pd.DataFrame(
(v[:, :, None] == v[:, None]).sum(0),
df.columns, df.columns
)
A B C
A 1 1 0
B 1 2 1
C 0 1 1
Or use np.count_nonzero instead of sum:
v = df.values
pd.DataFrame(
np.count_nonzero(v[:, :, None] == v[:, None], 0),
df.columns, df.columns
)
A B C
A 1 1 0
B 1 2 1
C 0 1 1
Option 2
Fun & Creative way
d = pd.get_dummies(df.stack()).unstack(fill_value=0)
d = d.T.dot(d)
d.groupby(level=1).sum().groupby(level=1, axis=1).sum()
A B C
A 1 1 0
B 1 2 1
C 0 1 1
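To see why the broadcasting in Option 1 works, it may help to look at the shapes involved. A small sketch with the same data; here v is typed in directly as the object array that df.values is expected to give for the Setup above, and left, right, pairwise and shared are illustrative names:

```python
import numpy as np

# Same cells as the Setup DataFrame: rows are records, columns are A, B, C.
v = np.array([['s', 's', np.nan],
              [np.nan, 'x', 'x']], dtype=object)

left = v[:, :, None]       # shape (2, 3, 1): column i laid out along axis 1
right = v[:, None]         # shape (2, 1, 3): column j laid out along axis 2
pairwise = left == right   # shape (2, 3, 3): row r, column i vs column j

# Collapsing the row axis answers "do columns i and j agree in any row?"
shared = pairwise.any(0)
print(pairwise.shape, shared.shape)
print(shared)
```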

Edit text in file(UTF16)

I want to replace one word in a text file (the file extension is not .txt).
The file encoding is Unicode (UTF-16).
A short sample of the text:
I D = " f f 0 3 4 a 9 2 - d d 9 f - 4 3 7 4 - a 8 a d - f 5 5 4 0 0 2 a 4 1 9 b " I S S U E _ D A T E = " 2 0 1 7 - 0 2 - 1 6 T 1 7 : 2 9 : 1 8 . 9 7 0 2 2 9 4 Z " S E Q U E N C E = " 0 " M A N A G I N G _ A P P L I C A T I O N _ T O K E N = " " > < L I C E N S E P U B L I C _ I D = " 3 A A - U J F - 8 K P " U S E R N A M E = " N d a G 6 Z T w u v I X Z B i t h 8 g o d d Q x E r x 0 + O g M c t 0 2 3 f X K O E w = " P A S S W O R D = " F 9 b n 6 b v w l f I 5 Z A 2 t h M h 9 d d s x Q L w = " T Y P E = " T R I A L " F L A G S = " 4 " D I S P L A Y _ N A M E =
I want to change T R I A L to another word.
It's not too hard to modify your text file. Use IO.File.ReadAllText to read the file into a string, then use String.Replace(oldValue As String, newValue As String) to change the string. Then use IO.File.WriteAllText to save the string back to the file. This should work regardless of the file extension, as long as the file isn't open in another program.
For example:
Dim myFileContents as String = IO.File.ReadAllText("Path\To\My\File\File.extension")
myFileContents = myFileContents.Replace("T R I A L", "Some other word")
IO.File.WriteAllText("Path\To\My\File\File.extension", myFileContents)
Modify this to suit your situation; it is only a basic implementation. Note that String.Replace() changes every occurrence of the word, not just the first one.
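The spaced-out letters in the sample are what UTF-16 text tends to look like when a tool reads it one byte per character, so once the file is decoded properly the word is likely to be TRIAL without spaces. A Python sketch of the same read, replace, write cycle with the encoding stated explicitly; the file name sample.dat, the sample content and the replacement word FULL are made up for illustration:

```python
import os
import tempfile

def replace_in_utf16_file(path, old, new):
    """Replace every occurrence of old with new in a UTF-16 text file."""
    # "utf-16" honours the byte-order mark on read and writes one on save.
    with open(path, encoding="utf-16", newline="") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-16", newline="") as handle:
        handle.write(text.replace(old, new))

# Demo on a throwaway file standing in for the real one.
demo_path = os.path.join(tempfile.mkdtemp(), "sample.dat")
with open(demo_path, "w", encoding="utf-16", newline="") as handle:
    handle.write('TYPE="TRIAL" FLAGS="4"')

replace_in_utf16_file(demo_path, "TRIAL", "FULL")
with open(demo_path, encoding="utf-16", newline="") as handle:
    result = handle.read()
print(result)
```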

pandas most efficient way to compare dataframe and series

I have a dataframe of shape (n, p) and a series of length n
I can compare them with:
for i in df.keys():
    df[i] > ts
Is there a way to do it in one line, something like df > ts?
If so, is it more efficient?
I think you need DataFrame.gt:
print (df.gt(s, axis=0))
Sample:
df = pd.DataFrame({'A':[1,2,3],
'B':[4,5,6],
'C':[7,8,9],
'D':[1,3,5],
'E':[5,3,6],
'F':[7,4,3]})
print (df)
A B C D E F
0 1 4 7 1 5 7
1 2 5 8 3 3 4
2 3 6 9 5 6 3
s = pd.Series([1,2,3])
print (s)
0 1
1 2
2 3
dtype: int64
print (df.gt(s, axis=0))
A B C D E F
0 False True True False True True
1 False True True True True True
2 False True True True True False
If you need other comparison functions:
lt
gt
le
ge
ne
eq
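As a quick check that the one-liner does the same work as the per-column loop from the question, the two results can be compared directly. A sketch using the sample data above; looped and vectorised are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9],
                   'D': [1, 3, 5], 'E': [5, 3, 6], 'F': [7, 4, 3]})
s = pd.Series([1, 2, 3])

# The loop from the question, with each column's result collected.
looped = pd.DataFrame({col: df[col] > s for col in df.columns})
# The one-liner: axis=0 lines the Series up with the rows.
vectorised = df.gt(s, axis=0)

print(looped.equals(vectorised))
```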

Is there a limit on the number of operations VBA can perform?

Inspired by a puzzle I saw online, and since I'm a VBA newbie, I thought it might be an interesting exercise to help me learn how to use For loops by writing a brute-force search for its solutions.
This led to creating a monstrosity that takes ages to partially run and actually won't fully run at all.
All the code is meant to do is print 3 columns of valid combinations:
Private Sub CommandButton1_Click()
    Dim j As Long
    Dim abc As String
    Dim def As String
    Dim ghi As String
    j = 1
    For a = 1 To 9
      For b = 1 To 9
        For c = 1 To 9
          For d = 1 To 9
            For e = 1 To 9
              For f = 1 To 9
                For g = 1 To 9
                  For h = 1 To 9
                    For i = 1 To 9
                      'Line breaks included for ease of reading
                      If a = b Or a = c Or a = d Or a = e Or a = f Or a = g Or a = h Or a = i _
                          Or b = c Or b = d Or b = e Or b = f Or b = g Or b = h Or b = i _
                          Or c = d Or c = e Or c = f Or c = g Or c = h Or c = i _
                          Or d = e Or d = f Or d = g Or d = h Or d = i _
                          Or e = f Or e = g Or e = h Or e = i _
                          Or f = g Or f = h Or f = i _
                          Or g = h Or g = i _
                          Or h = i Then
                      Else
                        abc = a & b & c
                        def = d & e & f
                        ghi = g & h & i
                        ThisWorkbook.Sheets("Sheet1").Cells(j, 1).Value = abc
                        ThisWorkbook.Sheets("Sheet1").Cells(j, 2).Value = def
                        ThisWorkbook.Sheets("Sheet1").Cells(j, 3).Value = ghi
                        j = j + 1
                      End If
                    Next i
                  Next h
                Next g
              Next f
            Next e
          Next d
        Next c
      Next b
    Next a
End Sub
This obviously involves lots of simple operations, and it either produces (Not Responding) messages or simply doesn't run. Was it possible to tell before clicking "Go" that this would be the case?
My job has me working on, and adding to, spreadsheets that perform lots of operations on other spreadsheets, each with conceivably hundreds of thousands of data items. Continuing to add functionality to these files may or may not be sustainable, and I need to know how to tell before I sink time into further development.
Is there a hard limit to what can be done with VBA in terms of volumes/numbers of operations? Is there a tool that will estimate the viability of a macro actually running to completion? A heuristic commonly employed in industry?
Basically, what methods or tools exist to inform as to whether the demands of a macro or series of macros will exceed available memory?
Thanks
You should attempt to do a tiny bit of maths before you begin.
Your code generates all the permutations of 9 things taken 9 at a time. Before you begin, you know that this will fill 9! = 362880 rows, so the code should work.
A separate issue is how much time it will take, and whether this is the most efficient method.
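The "tiny bit of maths" fits in a couple of lines. A Python sketch (not VBA) that counts how often the innermost test in the question runs, how many rows pass it, and shows that the valid rows can be generated directly; the variable names are illustrative:

```python
import math
from itertools import permutations

digits = range(1, 10)

# Nine nested loops of nine values each: the If test runs 9**9 times.
iterations = 9 ** 9
# Rows that pass the test: orderings of nine distinct digits.
valid_rows = math.factorial(9)
print(iterations, valid_rows)

# Generating only the valid orderings avoids the wasted iterations.
rows = [
    ("".join(map(str, p[:3])), "".join(map(str, p[3:6])), "".join(map(str, p[6:])))
    for p in permutations(digits)
]
print(len(rows), rows[0])
```

So the nested loops run roughly a thousand iterations for every row they write, on top of three worksheet writes per valid row.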
Not an answer, but can it be tackled by just looping over the numbers? Something along these lines, which checks the 1st character against the rest:
Dim l As Long
Dim i As Integer
For l = 11111111 To 99999999
    For i = 2 To 8
        DoEvents
        If (Mid(l, i, 1) <> Mid(l, 1, 1)) Then
            Debug.Print l
        End If
    Next i
Next l