tidyverse across where(!is.factor)?

I would like to create factor variables for all non-factor columns. I tried:
dat %>%
mutate(across(where(!is.factor), as.factor, .names = "{.col}_factor"))
But I get this error message:
Error in `mutate()`:
! Problem while computing `..1 = across(where(!is.factor), as.factor, .names = "{.col}_factor")`.
Caused by error in `across()`:
! invalid argument type
Run `rlang::last_error()` to see where the error occurred.

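!is.factor tries to negate the function is.factor itself rather than its result, which is what raises "invalid argument type". where() needs a predicate function, which in tidyverse shorthand can be written as a formula: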
dat %>%
mutate(across(where(~!is.factor(.x)), as.factor, .names = "{.col}_factor"))

Related

Error in as.data.frame.default(y) : cannot coerce class ‘"function"’ to a data.frame

# Use x in the left_join() function
combined_tb <-
x %>%
left_join(presentation.tree, by = "toElementId", copy = TRUE) %>%
left_join(labels, by = c("toElementId"="elementId"), copy = TRUE) %>%
select(roleId, fromElementId, toElementId, order, fromHref,
toHref, concept, period, unit, value, decimals, labelString)
I get this error message:
Error in as.data.frame.default(y) :
cannot coerce class ‘"function"’ to a data.frame
I tried to solve the issue, but nothing works and I don't know what to do.

KeyError: 'date' Pandas

```
if __name__ == "__main__":
    pd.options.display.float_format = '{:.4f}'.format
    temp1 = pd.read_csv('_4streams_alabama.csv.gz')
    temp1['date'] = pd.to_datetime(temp1['date'])

    def vacimpval(x):
        for date in x['date'].unique():
            if date >= '2022-06-16':
                x['vac_count'] = x['vac_count'].interpolate()
                x['vac_count'] = x['vac_count'].astype(int)

    for location in temp1['location_name'].unique():
        s = temp1.apply(vacimpval)
```
In the code above, I am trying to apply this function to every location so that I can fill in the values using the interpolate() method, but I don't know why I keep getting a KeyError.
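Source of the error:
There are only two places in your code where you access 'date', and since, as you said, temp1.columns contains 'date', the problem is in x['date'].
temp1.apply(vacimpval) calls the function once per column, so x is a single column (a Series) and x['date'] looks for a row label 'date', which does not exist. To work per location, group by location_name instead of calling temp1.apply inside the loop.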
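A minimal sketch of a per-location fill, assuming the column names from the question (location_name, date, vac_count). The small frame is made-up sample data standing in for the CSV, and the date cutoff from the question is left out to keep the example short.

```python
import pandas as pd

# Made-up sample data standing in for '_4streams_alabama.csv.gz' (assumption).
temp1 = pd.DataFrame({
    "location_name": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2022-06-16", "2022-06-17", "2022-06-18"] * 2),
    "vac_count": [10.0, None, 30.0, 100.0, None, 300.0],
})

# Reproduce the failure: apply() passes one column (a Series) at a time,
# so x['date'] is a label lookup in that column's row index.
try:
    temp1.apply(lambda x: x["date"])
except KeyError as err:
    print("KeyError:", err)

# Interpolate vac_count separately within each location, in date order.
temp1 = temp1.sort_values(["location_name", "date"])
temp1["vac_count"] = (
    temp1.groupby("location_name")["vac_count"]
    .transform(lambda s: s.interpolate())
)

# astype(int) only works once no NaN is left (e.g. no leading gaps).
temp1["vac_count"] = temp1["vac_count"].astype(int)
print(temp1)
```

Using groupby with transform keeps the result aligned with the original rows, so it can be assigned straight back to the column.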

python JupyterNotebook with pandas matrix()

Hi there, this is my code. When I try to run it I get an error:
df = pd.read_csv(file, sep='|', encoding='latin-1')
arreglox = df[df.columns['id':'date_in':'date_out':'objetive':'comments']].as_matrix()
arregloy = df[df.columns[1]].as_matrix()
Here is the error:
File "<ipython-input-30-6060fe26b2b1>", line 1
arreglox = df[df.columns['id':'date_in':'date_out':'objetive':'comments']].as_matrix()
^
SyntaxError: invalid syntax
Please help me, thank you very much.
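The syntax is wrong: a slice can have at most three parts (start:stop:step), so chaining five names with colons cannot be parsed. If you want those columns in that order, pass a list of names instead: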
arreglox = df[['id','date_in','date_out','objetive','comments']].as_matrix()
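Note that as_matrix() was deprecated and later removed from pandas (it is gone as of pandas 1.0), so on a current install the same selection would use to_numpy(). A minimal sketch, assuming a made-up frame that only borrows the column names from the question:

```python
import pandas as pd

# Made-up rows; only the column names come from the question.
df = pd.DataFrame({
    "id": [1, 2],
    "date_in": ["2020-01-01", "2020-01-05"],
    "date_out": ["2020-01-03", "2020-01-09"],
    "objetive": ["a", "b"],
    "comments": ["ok", "late"],
})

# Select columns by name with a list, then convert to a NumPy array.
arreglox = df[["id", "date_in", "date_out", "objetive", "comments"]].to_numpy()

# df.columns[1] is the name of the second column ('date_in' here).
arregloy = df[df.columns[1]].to_numpy()

print(arreglox.shape)
print(arregloy)
```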

Concatenate DataFrames.DataFrame in Julia

I have a problem when I try to concatenate multiple DataFrames (a data structure from the DataFrames package!) with the same columns but different numbers of rows. Here's my code:
using(DataFrames)
DF = DataFrame()
DF[:x1] = 1:1000
DF[:x2] = rand(1000)
DF[:time] = append!( [0] , cumsum( diff(DF[:x1]).<0 ) ) + 1
DF1 = DF[DF[:time] .==1,:]
DF2 = DF[DF[:time] .==round(maximum(DF[:time])),:]
DF3 = DF[DF[:time] .==round(maximum(DF[:time])/4),:]
DF4 = DF[DF[:time] .==round(maximum(DF[:time])/2),:]
DF1[:T] = "initial"
DF2[:T] = "final"
DF3[:T] = "1/4"
DF4[:T] = "1/2"
DF = [DF1;DF2;DF3;DF4]
The last line gives me the error
MethodError: Cannot `convert` an object of type DataFrames.DataFrame to an object of type LastMain.LastMain.LastMain.DataFrames.AbstractDataFrame
This may have arisen from a call to the constructor LastMain.LastMain.LastMain.DataFrames.AbstractDataFrame(...),
since type constructors fall back to convert methods.
I don't understand this error message. Can you help me out? Thanks!
I just ran into this exact problem on Julia 0.5.0 x86_64-linux-gnu, DataFrames 0.8.5, with both hcat and vcat.
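Neither clearing the workspace nor reloading DataFrames solved the problem, but restarting the REPL fixed it immediately. The LastMain prefixes in the error suggest that definitions from before the workspace was cleared were being mixed with the freshly loaded DataFrames module, which a new session avoids.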

genfromtxt in Python-3.5

I am trying to fix a data set using genfromtxt in Python 3.5, but I keep getting the following error:
ndtype = np.dtype(dict(formats=ndtype, names=names))
TypeError: data type not understood
This is the code I'm using. Any help will be appreciated!
names = ["country", "year"]
names.extend(["col%i" % (idx+1) for idx in range(682)])
dtype = "S64,i4" + ",".join(["f18" for idx in range(682)])
dataset = np.genfromtxt(data_file, dtype=dtype, names=names, delimiter=",", skip_header=1, autostrip=2)
dtype = "S64,i4" + ",".join(["f18" for idx in range(682)])
is going to produce something like:
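S64,i4f18,f18,f18,f18...
Note the lack of a comma after the i4: the second and third fields are fused into "i4f18", which cannot be parsed as a data type. Ending the prefix with a comma ("S64,i4,") restores the separator.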
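One way to avoid the missing separator is to build the full list of field codes first and join it once. A minimal sketch; note that it swaps f18 for f8 (64-bit float), which is an assumption about what was intended, since f18 is not a float size NumPy accepts:

```python
import numpy as np

n_cols = 682

names = ["country", "year"]
names.extend(["col%i" % (idx + 1) for idx in range(n_cols)])

# Original construction: no comma between "i4" and the first float field.
broken = "S64,i4" + ",".join(["f18" for idx in range(n_cols)])
print(broken[:17])

# Build every field code in one list, then join once.
# "f8" is an assumed replacement for "f18".
fields = ["S64", "i4"] + ["f8"] * n_cols
dtype = ",".join(fields)

# np.dtype parses the string, so a malformed one fails here
# rather than inside genfromtxt.
parsed = np.dtype(dtype)
print(len(parsed.names), len(names))
```

The resulting dtype string can then be passed to np.genfromtxt together with names, as in the question.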