How can I read a data file into Fortran 95? - formatting

How can I get Fortran to read the following data in a txt file?
22 20
0. 10. 1 16.
0. 0. 1 16.
10. 10. 0 16.
10. 0. 0 16.
20. 10. 0 16.
20. 0. 0 16.
30. 10. 0 16.
30. 0. 0 16.
40. 10. 0 16.
40. 0. 0 16.
50. 10. 0 16.
50. 0. 0 16.
60. 10. 0 16.
60. 0. 0 16.
70. 10. 0 16.
70. 0. 0 16.
80. 10. 0 16.
80. 0. 0 16.
90. 10. 0 16.
90. 0. 0 16.
100. 10. 1 11.
100. 0. 1 11.
1 2 4
1 4 3
3 4 6
3 6 5
5 6 8
5 8 7
7 8 10
7 10 9
9 10 12
9 12 11
11 12 14
11 14 13
13 14 16
13 16 15
15 16 18
15 18 17
17 18 20
17 20 19
19 20 22
19 22 21
I have tried the code below, but I am only able to read the first 2 rows. I need it to read all of the data.
program FILEREADER
real, dimension(:,:), allocatable :: x
integer :: n,m
open (unit=99, file='data.txt', status='old', action='read')
read(99, *), n
read(99, *), m
allocate(x(n,m))
do I=1,n,1
read(99,*) x(I,:)
write(*,*) x(I,:)
enddo
end
This is the result I get:
0 0.000 0.000 0.000 0.000
22 20.000 0.000 10.000 1.000

How can I plot two lines in one graph where values of the lines do not exist for the same x axis?

I would like to plot SupDem (variable) where e_boix_regime==1 and SupDem where e_boix_regime==0.
My data:
year SupDem e_boix_regime
1997 0.98 1
1998 0.75 0
My code:
dem = dem_aut[dem_aut["e_boix_regime"]==1].SupDem
aut = dem_aut[dem_aut["e_boix_regime"]==0].SupDem
year = dem_aut["year"]
plt.plot(year, dem, label="Support for Democracy in Democracies")
plt.plot(year, aut, label="Support for Democracy in Autocracies")
plt.show()
The error is the following: x and y must have same first dimension, but have shapes (53,) and (28,)
I just wanted to plot two lines together.
This can help you solve the problem. I hope you can reproduce the code with it:
two (or more) graphs in one plot with different x-axis AND y-axis scales in python
Issue
Your issue concerns the shapes of x and y. To plot a line, the x-values and y-values must contain the same number of data points.
Solution
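Select year with the same condition you already apply to SupDem, so that each line gets matching x and y values. The example below uses random data in which e_boix_regime takes the values 1 and 2, so it filters on e_boix_regime == 1 and e_boix_regime == 2; with the question's data the second condition would be e_boix_regime == 0.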
Source Code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(
{
"SupDem": np.random.randint(1, 11, 30),
"year": np.random.randint(10, 21, 30),
"e_boix_regime": np.random.randint(1, 3, 30),
}
) # see DataFrame below
df["e_boix_regime"].value_counts() # counts per regime value (they change on every random run); for the DataFrame shown below: 1 = 11, 2 = 19
df[df["e_boix_regime"] == 2][["SupDem", "year"]] # see below
# you need same no. of data points for both x/y axis i.e. `year` and `SupDem`
plt.plot(
df[df["e_boix_regime"] == 1]["year"], df[df["e_boix_regime"] == 1]["SupDem"], marker="o", label="e_boix_regime==1"
)
# hence applying same condition for grabbing year which is applied for SupDem
plt.plot(
df[df["e_boix_regime"] == 2]["year"], df[df["e_boix_regime"] == 2]["SupDem"], marker="o", label="e_boix_regime==2"
)
plt.xlabel("Year")
plt.ylabel("SupDem")
plt.legend()
plt.show()
Output
PS: Ignore the plotted data points; they are generated from random values.
DataFrame Outputs
SupDem year e_boix_regime
0 1 12 2
1 10 10 1
2 5 19 2
3 4 14 2
4 8 14 2
5 4 17 2
6 2 15 2
7 10 11 1
8 8 11 2
9 6 19 2
10 5 15 1
11 8 17 1
12 9 10 2
13 1 14 2
14 8 18 1
15 3 13 2
16 6 16 2
17 1 16 1
18 7 13 1
19 8 15 2
20 2 17 2
21 5 10 2
22 1 19 2
23 5 20 2
24 7 16 1
25 10 14 1
26 2 11 2
27 1 18 1
28 5 16 1
29 10 18 2
df[df["e_boix_regime"] == 2][["SupDem", "year"]]
SupDem year
0 1 12
2 5 19
3 4 14
4 8 14
5 4 17
6 2 15
8 8 11
9 6 19
12 9 10
13 1 14
15 3 13
16 6 16
19 8 15
20 2 17
21 5 10
22 1 19
23 5 20
26 2 11
29 10 18
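Applied to the question's own 0/1 coding, the same idea can be sketched as follows. The small dem_aut frame here is a hypothetical stand-in (only its first two rows come from the question), and the Agg backend is an assumption made so the sketch runs without a display. Grouping by e_boix_regime guarantees that year and SupDem are taken from the same rows.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for dem_aut; the real data is not shown in full
dem_aut = pd.DataFrame({
    "year": [1997, 1998, 1999, 2000, 2001],
    "SupDem": [0.98, 0.75, 0.80, 0.60, 0.91],
    "e_boix_regime": [1, 0, 1, 0, 1],
})

labels = {
    1: "Support for Democracy in Democracies",
    0: "Support for Democracy in Autocracies",
}

fig, ax = plt.subplots()
for regime, group in dem_aut.groupby("e_boix_regime"):
    # x and y come from the same filtered rows, so their lengths always match
    ax.plot(group["year"], group["SupDem"], marker="o", label=labels[regime])
ax.set_xlabel("Year")
ax.set_ylabel("SupDem")
ax.legend()
```

In an interactive session, drop the matplotlib.use line and call plt.show() at the end.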

Keep only the first value on duplicated column (set 0 to others)

Supposing I have the following situation:
A dataframe where the first column ['ID'] will eventually have duplicated values.
import pandas as pd
df = pd.DataFrame({"ID": [1,2,3,4,4,5,5,5,6,6],
"l_1": [10,12,32,45,45,20,20,20,20,20],
"l_2": [11,12,32,11,21,27,38,12,9,6],
"l_3": [5,9,32,12,21,21,18,12,8,1],
"l_4": [6,21,12,77,77,2,2,2,8,8]})
ID l_1 l_2 l_3 l_4
1 10 11 5 6
2 12 12 9 21
3 32 32 32 12
4 45 11 12 77
4 45 21 21 77
5 20 27 21 2
5 20 38 18 2
5 20 12 12 2
6 20 9 8 8
6 20 6 1 8
When duplicated IDs occur:
I need to keep only the first values for columns l_1 and l_4 (in the other duplicated rows they must be zero).
Columns 'l_2' and 'l_3' must stay the same.
When duplicated IDs occur, the values of columns l_1 and l_4 are also duplicated in those rows.
Expected output:
ID l_1 l_2 l_3 l_4
1 10 11 5 6
2 12 12 9 21
3 32 32 32 12
4 45 11 12 77
4 0 21 21 0
5 20 27 21 2
5 0 38 18 0
5 0 12 12 0
6 20 9 8 8
6 0 6 1 0
Is there a straightforward way to accomplish this using pandas or numpy?
I could only accomplish it by doing all these steps:
x1 = df[df.duplicated(subset=['ID'], keep=False)].copy()
x1.loc[x1.groupby('ID')['l_1'].apply(lambda x: (x.shift(1) == x)), 'l_1'] = 0
x1.loc[x1.groupby('ID')['l_4'].apply(lambda x: (x.shift(1) == x)), 'l_4'] = 0
df = df.drop_duplicates(subset=['ID'], keep=False)
df = pd.concat([df, x1])
Isn't this just:
df.loc[df.duplicated('ID'), ['l_1','l_4']] = 0
Output:
ID l_1 l_2 l_3 l_4
0 1 10 11 5 6
1 2 12 12 9 21
2 3 32 32 32 12
3 4 45 11 12 77
4 4 0 21 21 0
5 5 20 27 21 2
6 5 0 38 18 0
7 5 0 12 12 0
8 6 20 9 8 8
9 6 0 6 1 0
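To see why the one-liner is enough, it can help to look at the boolean mask it builds. This is a minimal sketch on the question's data; the name mask is only illustrative. DataFrame.duplicated defaults to keep='first', so the first row of each ID is False and every later repeat is True.

```python
import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4, 4, 5, 5, 5, 6, 6],
                   "l_1": [10, 12, 32, 45, 45, 20, 20, 20, 20, 20],
                   "l_2": [11, 12, 32, 11, 21, 27, 38, 12, 9, 6],
                   "l_3": [5, 9, 32, 12, 21, 21, 18, 12, 8, 1],
                   "l_4": [6, 21, 12, 77, 77, 2, 2, 2, 8, 8]})

# True only for the second and later occurrences of each ID
mask = df.duplicated("ID")
print(mask.tolist())

# Zero the two columns on exactly those rows; l_2 and l_3 are untouched
df.loc[mask, ["l_1", "l_4"]] = 0
print(df)
```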

Removing rows and keeping consecutive rows pandas

I would like to omit the first row and keep x consecutive rows.
In the example below I would like to keep 7. How do I achieve this?
df = pd.Series(range(1,101)).to_frame()
df.columns = ['numbers']
df['numbers'][1::7]
1 2
8 9
15 16
22 23
29 30
36 37
43 44
50 51
57 58
64 65
71 72
78 79
85 86
92 93
99 100
I would like to keep the values below, but then continue with the next row sequence:
remove 1 and keep 2 to 7, then remove 8 and keep 9 to 14, and so on.
df = pd.Series(range(1,101)).to_frame()
df.columns = ['numbers']
df['numbers'][1:7]
1 2
2 3
3 4
4 5
5 6
6 7
Or loc:
df.loc[df.index % 7 != 0]
giving
numbers
1 2
2 3
3 4
4 5
5 6
6 7
8 9
9 10
10 11
11 12
12 13
13 14
15 16
16 17
... ...
Or drop:
df.drop(df.index[::7])
numbers
1 2
2 3
3 4
4 5
5 6
6 7
8 9
9 10
10 11
11 12
12 13
13 14
15 16
16 17
17 18
18 19
.. ...
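Both answers rely on the default 0, 1, 2, ... index. If the index holds other labels, a variant based on row positions can be sketched like this; block and pos are hypothetical names introduced for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"numbers": range(1, 101)})

block = 7                  # drop 1 row, then keep the next block - 1 rows
pos = np.arange(len(df))   # row positions, independent of the index labels

# Keep every row whose position is not the first of its block
kept = df[pos % block != 0]
print(kept.head(8))
```

Changing block adjusts how many consecutive rows survive after each dropped row.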

Interpolate proportionally with duplicate index

I have a table like
df = pd.DataFrame([1,np.nan,3,1,np.nan,3,50,np.nan,52], index=[7, 8, 9, 7, 12, 27, 7, 8, 9]):
index values
7 1
8 NaN
9 3
7 1
12 NaN
27 3
7 50
8 NaN
9 52
Rows are correctly sorted. However, index here is not ordered, and has duplicates by design.
How can I interpolate the values here proportionally to the index (method="index")?
If I try to interpolate using the index, the resulting Series is wrong because of the duplicate index:
df.interpolate(method='index'):
index values desired actual
7 1 1 1
8 NaN 2 2
9 3 3 3
7 1 1 1
12 NaN 1.5 52 <-- wat
27 3 3 3
7 50 50 50
8 NaN 51 1.1 <-- wat
9 52 52 52
If not reproducible: Pandas 0.23.3, Numpy: 1.14.5, Python: 3.6.5
Try grouping the DataFrame into runs, starting a new group each time the index decreases:
df.groupby(df.index.to_series().diff().lt(0).cumsum())\
.apply(lambda x: x.interpolate(method='index'))
Output:
0
7 1.0
8 2.0
9 3.0
7 1.0
12 1.5
27 3.0
7 50.0
8 51.0
9 52.0
A more complicated way, for the situation I mentioned in my comment on Scott's answer:
np.where(df['values'].isnull(),df['values'].shift()+(df['values'].shift(-1)-df['values'].shift())*(df['index']-df['index'].shift())/(df['index'].shift(-1)-df['index'].shift()),df['values'])
Out[219]: array([ 1. , 2. , 3. , 1. , 1.5, 3. , 50. , 51. , 52. ])
This checks where each null value sits between its two valid neighbours and fills it in proportion to the difference in index.
Limitation: it only handles a single missing value between two valid values.
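The grouping idea can also be written without groupby(...).apply, whose handling of group keys has changed between pandas versions. This is a sketch under the assumption that a new run starts wherever the index decreases; run_id, pieces and result are hypothetical names.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 3, 1, np.nan, 3, 50, np.nan, 52],
              index=[7, 8, 9, 7, 12, 27, 7, 8, 9], dtype=float)

idx = s.index.to_numpy()
# A negative step in the index marks the start of a new run: 0,0,0,1,1,1,2,2,2
run_id = np.cumsum(np.diff(idx, prepend=idx[0]) < 0)

# Interpolate each run on its own, where the index is increasing and unique
pieces = [chunk.interpolate(method="index") for _, chunk in s.groupby(run_id)]
result = pd.concat(pieces)
print(result)
```

Because the runs are contiguous, concatenating them in group order restores the original row order.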

Column values of multilevel indexed DataFrame are not properly updated

import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(30).reshape(6,5), index=[list('aaabbb'), list('XYZXYZ')])
print(df)
df.loc[pd.IndexSlice['a'], 3] /= 10
print(df)
From the above code I expected below table:
0 1 2 3 4
a X 0 1 2 0.3 4
Y 5 6 7 0.8 9
Z 10 11 12 1.3 14
b X 15 16 17 18 19
Y 20 21 22 23 24
Z 25 26 27 28 29
But the actual result is as below table:
0 1 2 3 4
a X 0 1 2 NaN 4
Y 5 6 7 NaN 9
Z 10 11 12 NaN 14
b X 15 16 17 18.0 19
Y 20 21 22 23.0 24
Z 25 26 27 28.0 29
What went wrong in the code?
You need to specify the second level with : to select all of its values:
df.loc[pd.IndexSlice['a', :], 3] /= 10
print(df)
0 1 2 3 4
a X 0 1 2 0.3 4
Y 5 6 7 0.8 9
Z 10 11 12 1.3 14
b X 15 16 17 18.0 19
Y 20 21 22 23.0 24
Z 25 26 27 28.0 29
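Solution with slice (note that slice('a') means slice(None, 'a'), i.e. every first-level label up to and including 'a', which here is only 'a'):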
df.loc[(slice('a'), slice(None)), 3] /= 10
print(df)
0 1 2 3 4
a X 0 1 2 0.3 4
Y 5 6 7 0.8 9
Z 10 11 12 1.3 14
b X 15 16 17 18.0 19
Y 20 21 22 23.0 24
Z 25 26 27 28.0 29
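A likely explanation for the NaN values is index alignment: selecting with 'a' alone drops the first level from the result, so the divided values no longer carry the labels of the rows being assigned to. The sketch below compares the two selections. Building the frame with float dtype is an assumption made here so the divided values fit the column without a dtype change; partial and full are hypothetical names.

```python
import numpy as np
import pandas as pd

# Assumption: float dtype, so assigning 0.3 etc. needs no upcast
df = pd.DataFrame(np.arange(30, dtype=float).reshape(6, 5),
                  index=[list("aaabbb"), list("XYZXYZ")])

partial = df.loc["a", 3]                  # first level dropped: index is X, Y, Z
full = df.loc[pd.IndexSlice["a", :], 3]   # both levels kept: (a, X), (a, Y), (a, Z)

print(partial.index.tolist())
print(full.index.tolist())

# With both levels kept, the result aligns with the rows it is written back to
df.loc[pd.IndexSlice["a", :], 3] /= 10
print(df)
```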