How to add entity_effects corresponding to a variable in a DataFrame?

I want to add entity effects that correspond to my column host_id, and pass drop_absorbed = True when defining the model.
This is what I tried:
panel.reset_index(inplace = True)
modFE = PanelOLS(panel.price_USD2,
                 panel[['bedrooms', 'beds', 'number_of_reviews', 'review_scores_rating',
                        'host_is_superhost2_x', 'n_listings', 'centrococo_d', 'basilica_d']],
                 entity_effects = panel.host_id, time_effects = False, drop_absorbed = True)
resFE = modFE.fit(cov_type = 'unadjusted')
print(resFE)
However, the modFE line raised "ValueError: series can only be used with a 2-level MultiIndex".
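The error comes from passing a Series to entity_effects, which expects a boolean flag; PanelOLS reads the entity and time dimensions from a 2-level MultiIndex on the data instead. A minimal sketch with toy data (the year column and the values are made up for illustration):

```python
import pandas as pd

# Toy panel: each row is one listing-year; host_id is the entity dimension.
df = pd.DataFrame({
    "host_id": [1, 1, 2, 2],
    "year": [2019, 2020, 2019, 2020],
    "price_USD2": [100.0, 110.0, 90.0, 95.0],
    "bedrooms": [2, 2, 1, 1],
})

# PanelOLS expects entity/time as a 2-level MultiIndex, not as an argument:
panel = df.set_index(["host_id", "year"])
print(panel.index.nlevels)  # 2

# With the index in place, entity_effects is just a boolean flag
# (requires the linearmodels package):
# from linearmodels.panel import PanelOLS
# modFE = PanelOLS(panel["price_USD2"], panel[["bedrooms"]],
#                  entity_effects=True, drop_absorbed=True)
# resFE = modFE.fit(cov_type="unadjusted")
```

The same reshaping applied to the original data (set_index(['host_id', <time column>]) instead of reset_index) should make entity_effects = True work as intended.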


ArcPy Script Invalid Character Issue

I keep getting an error when I run this tool script and I am not sure how to troubleshoot it (it previously worked). It states that I have invalid characters on line 13: arcpy.analysis.Select(in_features=TCG4232_USNG_GRIDS, out_feature_class=TCG4232_USNG_GRIDS_Selection, where_clause="")
import arcpy

# Set variables
aprx = arcpy.mp.ArcGISProject('CURRENT')
TCG4232_USNG_GRIDS = "TCG4232_USNG_GRIDS"
Output_Folder = arcpy.GetParameterAsText(0)

# Execute Select
arcpy.env.overwriteOutput = True
TCG4232_USNG_GRIDS_Selection = "P:/PROJECTS/RP_Quality_Control_cGIS/MapSalesArcProProject/MapSalesLive1211/MapSalesLive/ExportedGrids.gdb/TCG4232_USNG_GRIDS_Selection"
arcpy.analysis.Select(in_features=TCG4232_USNG_GRIDS, out_feature_class=TCG4232_USNG_GRIDS_Selection, where_clause="")

# Add selected grid to map contents
arcpy.env.workspace = r"P:/PROJECTS/RP_Quality_Control_cGIS/MapSalesArcProProject/MapSalesLive1211/MapSalesLive/ExportedGrids.gdb"
aprx = arcpy.mp.ArcGISProject('CURRENT')
map = aprx.listMaps("Map")[0]
map.addDataFromPath(TCG4232_USNG_GRIDS_Selection)

# List of layer names that should be turned off
p = arcpy.mp.ArcGISProject("Current")
m = p.listMaps("Map")[0]
layer_names = ['TCG4232_USNG_GRIDS_Selection', 'TCG4232_USNG_GRIDS']
lyrList = m.listLayers()
for lyr in lyrList:
    lyr.visible = True
    if lyr.name in layer_names:
        lyr.visible = False

# Print to PDF
try:
    aprx = arcpy.mp.ArcGISProject('CURRENT')
    l = aprx.listLayouts()[0]
    l.mapSeries.refresh()
    if l.mapSeries is not None:
        ms = l.mapSeries
        if ms.enabled:
            indexLyr = "TCG4232_USNG_GRIDS_Selection"
            ms.exportToPDF(Output_Folder, "ALL", "",
                           "PDF_SINGLE_FILE", 150, "FASTEST",
                           True, "ADAPTIVE", True,
                           "LAYERS_ONLY", True, 80, True, False)
except Exception as e:
    print(f"Error: {e.args[0]}")

# Delete selected layer
arcpy.management.Delete("TCG4232_USNG_GRIDS_Selection", '')
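Not an authoritative diagnosis, but an "invalid character" SyntaxError on a line like this most often means a non-ASCII character (typically curly quotes pasted from a word processor or PDF) ended up in the script. A small sketch for locating such characters on a suspect line (the string below is a contrived example containing curly quotes, not the original script):

```python
# Scan a line of source for non-ASCII characters and report their
# positions and code points; curly quotes show up as U+201C / U+201D.
line = 'arcpy.analysis.Select(in_features=g, where_clause=\u201c\u201d)'
bad = [(i, ch, f"U+{ord(ch):04X}") for i, ch in enumerate(line) if ord(ch) > 127]
print(bad)
```

Retyping the quotes on the offending line in a plain-text editor usually clears the error.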

Replacing append with concat?

Whenever I run this code I get:
The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
What should I do to make the code run with concat?
final_dataframe = pd.DataFrame(columns = my_columns)
for symbol in stocks['Ticker']:
    api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/quote?token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(api_url).json()
    final_dataframe = final_dataframe.append(
        pd.Series([symbol,
                   data['latestPrice'],
                   data['marketCap'],
                   'N/A'],
                  index = my_columns),
        ignore_index = True)
See this release note, or, from another post: "append is the specific case (axis=0, join='outer') of concat" link
The change in your code should be (the pd.Series is pulled into a variable just for presentation; note that it must be converted to a one-row DataFrame with .to_frame().T, otherwise pd.concat would append it as a new column instead of a new row):
s = pd.Series([symbol, data['latestPrice'], data['marketCap'], 'N/A'], index = my_columns)
final_dataframe = pd.concat([final_dataframe, s.to_frame().T], ignore_index = True)
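As a self-contained illustration of the same pattern (with a hard-coded dict standing in for the IEX Cloud responses, since the API token and stocks table are not available here):

```python
import pandas as pd

my_columns = ["Ticker", "Price", "Market Cap", "Shares to Buy"]
final_dataframe = pd.DataFrame(columns=my_columns)

# Stand-in for the API responses (the real code queries IEX Cloud).
fake_quotes = {"AAPL": (150.0, 2.4e12), "MSFT": (300.0, 2.2e12)}

for symbol, (price, cap) in fake_quotes.items():
    # Build a one-row DataFrame and concat it onto the accumulator.
    row = pd.DataFrame([[symbol, price, cap, "N/A"]], columns=my_columns)
    final_dataframe = pd.concat([final_dataframe, row], ignore_index=True)

print(final_dataframe.shape)  # (2, 4)
```

For larger loops, collecting the rows in a list and calling pd.concat once at the end avoids repeatedly copying the accumulator.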

Appending tables generated from a loop

I am a new Python user and am trying to append together tables that I have pulled from a PDF using Camelot, but I am having trouble getting them to join.
Here is my code:
url = 'https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_AT_Tables.pdf'
tables = camelot.read_pdf(url, flavor='stream', edge_tol = 500, pages = '1-end')
i = 0
while i in range(0, tables.n):
    header = tables[i].df.index[tables[i].df.iloc[:,0]=='Metropolitan Statistical Area'].to_list()
    header = str(header)[1:-1]
    header = int(header)
    tables[i].df = tables[i].df.rename(columns = tables[i].df.iloc[header])
    tables[i].df = tables[i].df.drop(columns = {'': 'Blank'})
    print(tables[i].df)
    #appended_data.append(tables[i].df)
    #if i > 0:
    #    dfs = tables[i-1].append(tables[i], ignore_index = True)
    #pass
    i = i + 1
any help would be much appreciated
You can collect the cleaned tables in a list and use pandas.concat() to concatenate the DataFrames in one call:
dfs = []
for table in tables:
    header = table.df.index[table.df.iloc[:, 0] == 'Metropolitan Statistical Area'].to_list()[0]
    table.df = table.df.rename(columns = table.df.iloc[header])
    table.df = table.df.drop(columns = [''])
    dfs.append(table.df)
df_ = pd.concat(dfs, ignore_index = True)
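The header-promotion plus concat idea can be shown without Camelot; the two raw DataFrames below are toy stand-ins for the table.df objects Camelot returns (text grids in which one row holds the real column names):

```python
import pandas as pd

# Toy stand-ins for camelot's table.df: all-string grids whose header
# row must be found and promoted to column names.
raw1 = pd.DataFrame([["Metropolitan Statistical Area", "Index"],
                     ["Boston", "1.02"]])
raw2 = pd.DataFrame([["Metropolitan Statistical Area", "Index"],
                     ["Denver", "1.10"]])

cleaned = []
for raw in (raw1, raw2):
    # Locate the header row, promote it to column names, drop rows above it.
    hdr = raw.index[raw.iloc[:, 0] == "Metropolitan Statistical Area"][0]
    df = raw.rename(columns=raw.iloc[hdr]).drop(index=range(hdr + 1))
    cleaned.append(df)

combined = pd.concat(cleaned, ignore_index=True)
print(combined)
```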

bnlearn error in structural.em

I get an error when trying to use structural.em from the "bnlearn" package.
This is the code:
cut.learn <- structural.em(cut.df, maximize = "hc",
                           maximize.args = "restart",
                           fit = "mle", fit.args = list(),
                           impute = "parents", impute.args = list(), return.all = FALSE,
                           max.iter = 5, debug = FALSE)
Error in check.data(x, allow.levels = TRUE, allow.missing = TRUE,
warn.if.no.missing = TRUE, : at least one variable has no observed
values.
Has anyone had the same problem? Please tell me how to fix it. Thank you.
I got structural.em working. I am currently working on a Python interface to bnlearn that I call pybnl, and I ran into the problem you describe above.
Here is a Jupyter notebook that shows how to use structural.em from Python on the marks dataset.
The gist of it is described in slides-bnshort.pdf on page 135, "The MARKS Example, Revisited".
You have to create an initial fit with an initial imputed dataframe by hand and then provide the arguments to structural.em like so (ldmarks is the latent-discrete-marks dataframe, where the LAT column contains only missing/NA values):
library(bnlearn)
data('marks')
dmarks = discretize(marks, breaks = 2, method = "interval")
ldmarks = data.frame(dmarks, LAT = factor(rep(NA, nrow(dmarks)), levels = c("A", "B")))
imputed = ldmarks
# Randomly set values of the unobserved variable in the imputed data.frame
imputed$LAT = sample(factor(c("A", "B")), nrow(dmarks), replace = TRUE)
# Fit the parameters over an empty graph
dag = empty.graph(nodes = names(ldmarks))
fitted = bn.fit(dag, imputed)
# Although we've set imputed values randomly, nonetheless override them with a uniform distribution
fitted$LAT = array(c(0.5, 0.5), dim = 2, dimnames = list(c("A", "B")))
# Use whitelist to enforce arcs from the latent node to all others
r = structural.em(ldmarks, fit = "bayes", impute="bayes-lw", start=fitted, maximize.args=list(whitelist = data.frame(from = "LAT", to = names(dmarks))), return.all = TRUE)
You have to use bnlearn 4.4-20180620 or later, because it fixes a bug in the underlying impute function.

R EVMIX convert pdf to uniform marginals

I'm trying to convert a distribution into a pseudo-uniform distribution. Using the spd R package, it is easy and it works as expected.
library(spd)
x <- c(rnorm(100,-1,0.7),rnorm(100,3,1))
fit<-spdfit(x,upper=0.9,lower=0.1,tailfit="GPD", kernelfit="epanech")
uniformX = pspd(x,fit)
I want to generalize extreme value modeling to include threshold uncertainty, so I used the evmix package.
library(evmix)
x <- c(rnorm(100,-1,0.7),rnorm(100,3,1))
fit = fgkg(x, phiul = FALSE, phiur = FALSE, std.err = FALSE)
pgkg(x,fit$lambda, fit$ul, fit$sigmaul, fit$xil, fit$phiul, fit$ur,
fit$sigmaur, fit$xir, fit$phiur)
I'm messing up somewhere.
Please check out the help for the pgkg function:
help(pgkg)
which gives the syntax:
pgkg(q, kerncentres, lambda = NULL, ul = as.vector(quantile(kerncentres,
0.1)), sigmaul = sqrt(6 * var(kerncentres))/pi, xil = 0, phiul = TRUE,
ur = as.vector(quantile(kerncentres, 0.9)), sigmaur = sqrt(6 *
var(kerncentres))/pi, xir = 0, phiur = TRUE, bw = NULL,
kernel = "gaussian", lower.tail = TRUE)
You have missed the kernel centres (the data), which is always needed for kernel density estimators. Here is the corrected code:
library(evmix)
x <- c(rnorm(100,-1,0.7),rnorm(100,3,1))
fit = fgkg(x, phiul = FALSE, phiur = FALSE, std.err = FALSE)
prob = pgkg(x, x, fit$lambda, fit$ul, fit$sigmaul, fit$xil, fit$phiul,
fit$ur, fit$sigmaur, fit$xir, fit$phiur)
hist(prob) # now uniform as expected