I have been researching and Googling this forever, but I cannot find an answer. I have duplicate values in one column and I would like to display each value only once. Is this even possible in SQL?
What I have:
A  B  C
A  2  3
A  2  4
B  4  4
B  3  4
C  3  9
What I would like:
A  B  C
A  2  3
A     4
B  4  4
B  3  4
C     9
Use this:
SELECT A,
       CASE WHEN LAG(B) OVER (ORDER BY A) = B THEN '' ELSE CONVERT(VARCHAR, B) END AS B,
       C
FROM TABLENAME

Note that LAG needs a deterministic ORDER BY: if several rows share the same A value, add a tiebreaker column to the OVER clause so the output order is stable.
I am wondering how I can create an edge list (from, to) based on this type of data. Both columns are inside a pandas DataFrame and their type is string.
Name  Co-Workers
A     A,B,C,D
B     A,B,C,D
C     A,B,C,E
D     A,B,D,E
E     C,D,E
I also want to remove self-connections like AA, BB, CC, ...
IIUC, you can explode your data and filter it:
df2 = df.copy()
df2['Co-Workers'] = df['Co-Workers'].str.split(',')  # string -> list of names
df2 = df2.explode('Co-Workers')                      # one row per (Name, co-worker) pair
df2[df2['Name'].ne(df2['Co-Workers'])]               # drop self-connections like AA
output:
Name Co-Workers
0 A B
0 A C
0 A D
1 B A
1 B C
1 B D
2 C A
2 C B
2 C E
3 D A
3 D B
3 D E
4 E C
4 E D
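If you need an actual list of (from, to) tuples rather than a DataFrame, a small follow-up on the filtered frame (a sketch reusing the df2 built above):

edges = df2[df2['Name'].ne(df2['Co-Workers'])]
edge_list = list(edges.itertuples(index=False, name=None))  # plain (from, to) tuples
# [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'A'), ...]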
First, split the column from a string into a list of separate values.
Second, explode the column.
Third, create a directed graph.

Process the data with mozway's code above, and then:
import networkx as nx
import matplotlib.pyplot as plt

G = nx.from_pandas_edgelist(df2, source='Name', target='Co-Workers')
plt.figure(figsize=(10, 8))
nx_graph = nx.compose(nx.DiGraph(), G)  # promote to a directed graph
nx.draw_shell(nx_graph, with_labels=True)
plt.show()
Result graph:
I have a table like:

a  1
a  2
b  1
b  3
b  2
b  4

I wanted the output like this:

1  2  3  4
a  a
b  b  b  b

The number of rows in the output may vary. Pivoting does not work since this is Exasol, and CASE can't work since the columns are dynamic.
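For comparison, the same reshape is a few lines in pandas; a minimal sketch, where the column names key and num are assumptions rather than anything from the original table:

import pandas as pd

# sample data from the question (column names assumed)
df = pd.DataFrame({'key': ['a', 'a', 'b', 'b', 'b', 'b'],
                   'num': [1, 2, 1, 3, 2, 4]})

# copy the key into a value column so it can fill the pivoted cells
out = (df.assign(val=df['key'])
         .pivot(index='key', columns='num', values='val')
         .fillna(''))
print(out)

This produces one column per distinct number, however many there are, which is exactly the dynamic part a hand-written CASE cannot cover.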
I am a newbie to Python and pandas. I have a huge data set, and instead of applying a function row by row I want to apply it to a batch of rows and map the results back to the corresponding rows.
Example:
ID Values
a 2
b 3
c 4
d 5
e 6
f 7
def function(x):
    # makes a call to an API and returns the value related to x
    return response

df['squared_values'] = df['Values'].apply(lambda row: function(row))

The above applies the function row by row, which is time-consuming.
I need a way to do the operation on batches of rows instead, for example:

batch = 3
df['squared_values'] = df['Values'].apply(lambda batch: function(batch))
On the first pass the values should be:

ID  Values  squared_values
a   2       4
b   3       9
c   4       16
d   5
e   6
f   7
On the second pass:

ID  Values  squared_values
a   2       4
b   3       9
c   4       16
d   5       25
e   6       36
f   7       49
Is this operation really too slow?
df['squared_values'] = df['Values'] ** 2
You can always use iloc to select rows:

df['squared_values'].update(df.iloc[0:4]['Values'] ** 2)

But I can't imagine this being quicker.
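If the real bottleneck is a per-row API call rather than the arithmetic, the usual fix is to send several values per request. A minimal sketch, where call_api is a hypothetical stand-in for the real API (assumed to take a list and return one result per input):

import pandas as pd

df = pd.DataFrame({'ID': list('abcdef'), 'Values': [2, 3, 4, 5, 6, 7]})

def call_api(values):
    # hypothetical stand-in: the real API would return one result per input value
    return [v ** 2 for v in values]

batch = 3
for start in range(0, len(df), batch):
    rows = df.index[start:start + batch]  # one batch of rows
    df.loc[rows, 'squared_values'] = call_api(df.loc[rows, 'Values'].tolist())

After the first iteration rows a–c are filled and d–f are still empty, which is the pass-by-pass behaviour described above.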
I am trying to get the rows of a table that correspond to selected indexes. For example, I have one xls file containing several columns of data. Currently my code searches two selected columns and finds their indexes; now I want to look up the corresponding elements of those rows in another column.
Let A B C D E F G be the column names, with thousands of rows of numbers, like:
A  B  C  D  E  F  G
1  3  4  5  6  3  3
3  4  5  6  3  2  7
...
4  7  3  2  5  3  2
Currently my code searches two specific columns (say B and F, for values within some range); now I want to get the column A value for each row that falls in those ranges:
B  F  A
3  4  5
3  5  3
7  7  3
5  4  6
...

This is my current code:
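A sketch of the described filtering in pandas, where the file name and the range bounds are hypothetical placeholders:

import pandas as pd

df = pd.read_excel('data.xls')  # hypothetical file name

b_lo, b_hi = 3, 7  # hypothetical range for column B
f_lo, f_hi = 2, 7  # hypothetical range for column F

# keep rows where both B and F fall inside their ranges, then read off A
mask = df['B'].between(b_lo, b_hi) & df['F'].between(f_lo, f_hi)
selected = df.loc[mask, ['B', 'F', 'A']]
print(selected)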