Sequence function on existing data - sequence

picture of example dataset
I am looking for a function in R to change values recorded at 0, 1, 2, 3, 4, 5, 6, etc. to values at every 5. I have a big dataset similar to columns A and B and would like to change it to columns H and I (as shown in the attached picture).
I'd like to change from every cm to every 5 cm, so that species that cover 5 cm or less are registered as a point (similar to equ_palu). Moreover, the species bet_nana covers 0-10 cm and is therefore registered as 5 and 10 in columns H and I.

Related

How to label a whole dataset?

I have a question. I have a pandas dataframe that contains 5000 columns and 12 rows. Each row represents the signal received from an electrocardiogram lead. I want to assign 3 labels to this dataset. These 3 labels belong to the entire dataset and are not related to a specific row. How can I do this?
I have attached a picture of my pandas dataframe.
My labels are: Atrial Fibrillation: 0,
right bundle branch block: 1,
T Wave Change: 2
I tried to assign 3 labels to a large dataset
(not to a specific row or column)
but I didn't find a solution.
As you can see, it has 12 rows and 5000 columns. Each row holds 5000 samples from one specific lead, and the 12 rows correspond to the 12 leads (I, II, III, aVR, ..., V6). Professional experts have assigned 3 labels to this dataframe, which helps us train an ML model to detect different heart diseases. I have 10000 dataframes just like this one, and each has 3 or 4 specific labels. Here is my question: how can I assign these 3 labels to the dataset I mentioned? As I said before, the labels don't refer to specific rows; each dataframe has 3 or 4 labels for the whole thing. In other words, how can I assign 3 labels to a whole dataframe?
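One possible way to keep labels attached to a whole recording, rather than to a row, is to store each dataframe next to its label list and build the training target separately. This is a minimal sketch that uses plain Python lists as a stand-in for the dataframe; `LabeledRecording` and `multi_hot` are hypothetical names for illustration, not pandas API.

```python
from dataclasses import dataclass
from typing import List

# Label codes taken from the question.
LABEL_CODES = {
    "Atrial Fibrillation": 0,
    "right bundle branch block": 1,
    "T Wave Change": 2,
}


@dataclass
class LabeledRecording:
    """One 12-lead recording plus the labels that apply to all of it."""
    signal: List[List[float]]  # 12 rows (leads) x 5000 samples
    labels: List[int]          # label codes for the whole recording


def multi_hot(labels: List[int], num_classes: int) -> List[int]:
    """Turn a list of label codes into a fixed-length 0/1 target vector."""
    target = [0] * num_classes
    for code in labels:
        target[code] = 1
    return target


# Toy stand-in for one recording: 12 leads, 5000 samples each.
signal = [[0.0] * 5000 for _ in range(12)]
recording = LabeledRecording(signal=signal, labels=[0, 1, 2])

# A dataset is then a list of such recordings; X and y stay aligned by index.
dataset = [recording]
X = [r.signal for r in dataset]
y = [multi_hot(r.labels, len(LABEL_CODES)) for r in dataset]
```

With 10000 recordings, `X` would hold 10000 signals of shape 12 x 5000 and `y` one target vector per recording, so a recording with 3 labels and one with 4 labels are handled the same way (the fourth label would need its own code).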

moving from tabular to graph representation of a given data

Suppose that I have the following data t:
activity  teacher  group  students  duration  subject
One       A        a      3         45        Math
One       B        b      2         45        Math
two       A        c      7         60        P.E
One       D        a      3         45        Math
two       C        c      7         60        P.E
I want to construct graph data from this tabular data. I am actually interested in predicting the teacher by applying some kind of graph ML. Is there a way to transform the tabular data into graph data, maybe using NetworkX?
I tried the following code:
import matplotlib.pyplot as plt
import networkx as nx
G = nx.from_pandas_edgelist(df, "subject", "teacher", edge_attr=True, create_using=nx.Graph())
nx.draw_networkx(G)
plt.show()
The output of this looks like a graph, but I don't understand how it works, how I can get the new data, or what the best way is to identify the nodes and the edges.
thank you in advance for any help.
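For context, the call in the question turns each row into one edge between a subject node and a teacher node, with the remaining columns stored as edge attributes. One alternative way to decide what is a node and what is an edge is to make every teacher a node and connect it to a node for each attribute value it appears with. The sketch below does this in plain Python; the choice of `NODE_COLUMNS` and the `(kind, value)` node naming are assumptions for illustration, not something the data dictates.

```python
from collections import defaultdict

# Rows of the table from the question.
rows = [
    {"activity": "One", "teacher": "A", "group": "a", "students": 3, "duration": 45, "subject": "Math"},
    {"activity": "One", "teacher": "B", "group": "b", "students": 2, "duration": 45, "subject": "Math"},
    {"activity": "two", "teacher": "A", "group": "c", "students": 7, "duration": 60, "subject": "P.E"},
    {"activity": "One", "teacher": "D", "group": "a", "students": 3, "duration": 45, "subject": "Math"},
    {"activity": "two", "teacher": "C", "group": "c", "students": 7, "duration": 60, "subject": "P.E"},
]

# Assumption: the values of these columns become nodes linked to the teacher.
NODE_COLUMNS = ["activity", "group", "subject"]


def build_edges(rows):
    """Return a set of undirected edges (teacher node, attribute node).

    Nodes are (kind, value) tuples so that teacher "A" and group "a"
    can never be confused with each other.
    """
    edges = set()
    for row in rows:
        teacher = ("teacher", row["teacher"])
        for col in NODE_COLUMNS:
            edges.add((teacher, (col, row[col])))
    return edges


def adjacency(edges):
    """Build a neighbour lookup from the edge set."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


edges = build_edges(rows)
adj = adjacency(edges)

# Teachers connected to the Math subject node.
math_teachers = sorted(
    name for kind, name in adj[("subject", "Math")] if kind == "teacher"
)
```

The resulting edge set can be handed to NetworkX with `G = nx.Graph(); G.add_edges_from(edges)` if a graph object is needed for drawing or for a graph ML library.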

Dendrograms with SciPy

I have a dataset that I shaped according to my needs. The dataframe is as follows:
Index A B C D ..... Z
Date/Time 1 0 0 0,35 ... 1
Date/Time 0,75 1 1 1 1
The total number of rows is 8878
What I am trying to do is create a time-series dendrogram (for example, the whole A column compared to the whole B column over the whole time range).
I am expecting an output like this:
(source: rsc.org)
I tried to construct the linkage matrix with Z = hierarchy.linkage(X, 'ward').
However, when I print the dendrogram, it just shows an empty picture.
There is no problem if I compare every time point with each other and plot, but that way the dendrogram becomes far too complicated to read, even in truncated form.
Is there a way to treat each column as a whole time series and compare the columns with each other in SciPy?
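`hierarchy.linkage` clusters the rows of the array it is given, so one way to compare whole columns is to transpose the data first. Below is a minimal sketch on a made-up 5 x 4 array standing in for the real 8878-row dataframe. It assumes the values are already numeric; entries written with decimal commas such as "0,35" would presumably need converting first.

```python
import numpy as np
from scipy.cluster import hierarchy

# Toy stand-in for the dataframe: rows are time points, columns are series.
# The values are invented for illustration only.
columns = ["A", "B", "C", "D"]
X = np.array([
    [1.00, 0.95, 0.00, 0.05],
    [0.75, 0.80, 0.10, 0.00],
    [1.00, 1.00, 0.35, 0.30],
    [0.50, 0.55, 0.90, 1.00],
    [0.25, 0.20, 1.00, 0.95],
])

# linkage() treats each row as one observation, so transpose to make
# each column (one whole time series) a single observation.
Z = hierarchy.linkage(X.T, method="ward")

# no_plot=True returns the tree layout without needing matplotlib.
# Drop it and call plt.show() afterwards to actually draw the tree.
tree = hierarchy.dendrogram(Z, labels=columns, no_plot=True)
leaf_order = tree["ivl"]

# Cut the tree into two flat clusters to see which columns group together.
clusters = hierarchy.fcluster(Z, t=2, criterion="maxclust")
```

With the transpose, the dendrogram has one leaf per column (26 for A to Z) instead of one per time point, which keeps it readable.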

Reordering rows in sql database - idea

I was thinking about a simple way of reordering rows in a relational database table.
I would like to avoid the method described here:
How can I reorder rows in sql database
My simple idea was to use a ListOrder column of type double precision (64-bit IEEE 754 floating point).
When inserting a row between two existing rows, we calculate its listOrder value as the average of the two neighbouring rows' values.
Example:
1. Starting state:
value, listOrder
a 1
b 2
c 3
d 4
e 5
f 6
2. Moving "e" two rows up
One simple sql update on e-row: update mytable set listorder=2.5 where value='e'
value, listOrder
a 1
b 2
e 2.5
c 3
d 4
f 6
3. Moving "a" one position down
value, listOrder
b 2
a 2.25
e 2.5
c 3
d 4
f 6
I have a question: how many insertions can I perform (in the worst case) and still have a properly ordered list?
For a 64-bit integer there are fewer than 64 insertions in the same place.
Do floating-point types allow more insertions?
Are there other problems with the described approach?
Do you see any patches/adjustments to make this idea safe and usable in applications?
This is similar to a lexical order, which can also be done with varchar columns:
A
B
C
D
E
F
becomes
A
B
BM
C
D
F
becomes
B
BF
BM
C
D
F
I prefer the two-step process, where you update every row in the table after the one you move to be one larger. SQL is efficient about this, so updating the rows following a change is not as bad as it seems. You preserve something that's more human readable, the storage size for your ordinal value scales linearly with your data size, and you don't risk reaching a point where you don't have enough precision to put an item in between two values.
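To get a feel for the precision limit, the averaging scheme can be simulated. The sketch below counts how many times a row can be inserted into the same gap before the average stops being strictly between its neighbours. It measures Python floats, which are IEEE 754 doubles; the assumption is that a double precision column in the database behaves the same way. The function name is hypothetical.

```python
def count_midpoint_insertions(lo: float, hi: float) -> int:
    """Count how many new keys fit between lo and hi by averaging.

    Every insertion goes directly after lo, which is the worst case:
    all insertions land in the same, ever-shrinking gap.
    """
    count = 0
    while True:
        mid = (lo + hi) / 2
        # Once the average is no longer strictly between its neighbours,
        # the new row can no longer be ordered correctly.
        if not (lo < mid < hi):
            return count
        hi = mid  # the new row becomes the upper neighbour for the next insert
        count += 1


between_1_and_2 = count_midpoint_insertions(1.0, 2.0)
between_2_and_3 = count_midpoint_insertions(2.0, 3.0)
print(between_1_and_2, between_2_and_3)
```

Because a double has a 53-bit significand, the count for neighbouring small integers comes out at roughly 50, so the floating-point column does not allow more same-place insertions than the 64-bit integer estimate in the question. A periodic renumbering pass would be one way to reset the gaps.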

VBA to find maximum value in a chart

I have a range of data in columns A, B, and C. I have displayed it as a line graph with B on the primary axis and C on the secondary axis. Column A is the category axis. I want to find the maximum value of column C and put a data callout on that point, and on the point of column B at the same position.
I know this sounds confusing. In this example, the maximum of Column C occurs at Point 27 (or 1.50% on the category axis). I would like a dot at point 27 for both Column B and C.
Column A is a percentage from -5.00 to 10.00 in increments of 0.25%. Columns B and C are plotted against the change.
In the past I have done something similar: use a formula in column D to identify the largest number in columns C and B, and make it a value high on your chart if the result is true.
Add Column D as a series to the chart.
Change the chart type on that series only to a scatter chart or something that puts points up there.
You can put a label on or simply put the amount showing above the plotted point.
You don't need VBA for this.
You might be interested to know I found a solution that works for me. First, I added columns D and E using the formulas =IF(C2=MAX(C$2:C$62),C2,NA()) and =IF(C2=MAX(C$2:C$62),B2,NA()). This gave me the point on the graph for both lines B and C where C was maximum. I then formatted the graph so that these points had data callouts (a request from the client). Finally, I set columns D and E to have a white font, to match the background so they appear invisible. I don't love this step, but I don't want the client to see the extra rows of #N/A, etc.
The basic VBA for the data callout is:
ActiveChart.FullSeriesCollection(5).Select
ActiveChart.SetElement (msoElementDataLabelCallout)
Here the series is 5 (column E), and I'm putting a data callout on the graphed point, which happens to be the maximum of column 3.
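For readers who want to check the helper-column logic outside Excel, the two worksheet formulas can be mirrored in a few lines of Python. This is only an illustration of what columns D and E contain; the sample values are made up, `None` stands in for #N/A, and `helper_columns` is a hypothetical name.

```python
def helper_columns(b, c):
    """Mirror the two worksheet formulas.

    D keeps the value of C only where C reaches its maximum;
    E keeps the value of B on that same row. Every other row is None.
    """
    c_max = max(c)
    d = [cv if cv == c_max else None for cv in c]
    e = [bv if cv == c_max else None for bv, cv in zip(b, c)]
    return d, e


# Made-up sample values; the real sheet uses rows 2 to 62.
b = [10.0, 12.5, 11.0, 9.0]
c = [0.2, 0.9, 1.4, 0.7]
d, e = helper_columns(b, c)

# Zero-based position of the maximum of C, i.e. the point that gets the callout.
max_row = d.index(max(c))
```

If the maximum of C occurs more than once, both the formulas and this sketch mark every such row, so more than one callout would appear.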