Error using H2O (from Python 3.9.10) and XGBoost backend on macOS (Monterey, Apple M1)

I am trying to use H2O from Python and have run into a problem with XGBoost on macOS.
H2O does not seem to find the XGBoost backend at all.
More precisely, the following simple snippet
import pandas as pd
import h2o
data = [['2015-01-01', '2490.925806' , '-0.41'],
['2015-01-02', '2412.623113' , '-0.48'],
['2015-01-03', '2365.611276' , '-0.55']]
df = pd.DataFrame(data, columns=["time", "base", "target"]).set_index("time", drop=True)
h2o.init(nthreads=-1)
estimator = h2o.estimators.H2OXGBoostEstimator()
training_frame = h2o.H2OFrame(df)
estimator.train(["base"], "target", training_frame)
gives me the error:
H2OResponseError: Server error water.exceptions.H2ONotFoundArgumentException:
Error: POST /3/ModelBuilders/xgboost not found
Request: POST /3/ModelBuilders/xgboost
data: {'training_frame': 'Key_Frame__upload_893634781f588299bbd20d51c98d43a9.hex', 'nfolds': '0', 'keep_cross_validation_models': 'True', 'keep_cross_validation_predictions': 'False', 'keep_cross_validation_fold_assignment': 'False', 'score_each_iteration': 'False', 'fold_assignment': 'auto', 'response_column': 'target', 'ignore_const_cols': 'True', 'stopping_rounds': '0', 'stopping_metric': 'auto', 'stopping_tolerance': '0.001', 'max_runtime_secs': '0.0', 'seed': '-1', 'distribution': 'auto', 'tweedie_power': '1.5', 'categorical_encoding': 'auto', 'quiet_mode': 'True', 'ntrees': '50', 'max_depth': '6', 'min_rows': '1.0', 'min_child_weight': '1.0', 'learn_rate': '0.3', 'eta': '0.3', 'sample_rate': '1.0', 'subsample': '1.0', 'col_sample_rate': '1.0', 'colsample_bylevel': '1.0', 'col_sample_rate_per_tree': '1.0', 'colsample_bytree': '1.0', 'colsample_bynode': '1.0', 'max_abs_leafnode_pred': '0.0', 'max_delta_step': '0.0', 'score_tree_interval': '0', 'min_split_improvement': '0.0', 'gamma': '0.0', 'nthread': '-1', 'build_tree_one_node': 'False', 'calibrate_model': 'False', 'max_bins': '256', 'max_leaves': '0', 'sample_type': 'uniform', 'normalize_type': 'tree', 'rate_drop': '0.0', 'one_drop': 'False', 'skip_drop': '0.0', 'tree_method': 'auto', 'grow_policy': 'depthwise', 'booster': 'gbtree', 'reg_lambda': '1.0', 'reg_alpha': '0.0', 'dmatrix_type': 'auto', 'backend': 'auto', 'gainslift_bins': '-1', 'auc_type': 'auto', 'scale_pos_weight': '1.0'}
For more information about my distribution:
OS: Monterey 12.3
Processor: Apple M1
Python: 3.9.10
H2O: 3.36.0.3
I suspect the Apple M1 is the cause of the error, but is that really the case?

I am sorry, but XGBoost is not yet supported on Apple M1 processors.
https://h2oai.atlassian.net/browse/PUBDEV-8482
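Since the limitation is tied to the processor, it can help to check what the Python interpreter itself reports before calling the XGBoost estimator. The following is a minimal sketch using only the standard library; `is_apple_silicon` is a hypothetical helper name, and treating "Darwin" plus "arm64" as the affected platform is an assumption based on the answer above.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when the interpreter reports macOS on an arm64 CPU."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # Note: a Python build running under Rosetta reports "x86_64" here,
    # so this only detects a natively running arm64 interpreter.
    return system == "Darwin" and machine == "arm64"


if is_apple_silicon():
    print("Apple Silicon detected: H2O's XGBoost backend may be unavailable here.")
else:
    print("Not running natively on Apple Silicon.")
```

The optional arguments exist only so the check can be exercised on another machine, for example `is_apple_silicon("Darwin", "arm64")`.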

Related

Google Analytics data api dictionary to pandas data frame

I exported Google Analytics data in the dictionary format below, with 3 dimensions and 2 metrics. How can I convert it to a pandas DataFrame? I don't need the rowCount, minimums, maximums, or nextPageToken fields. Thank you.
{'reports': [{'columnHeader': {'dimensions': ['ga:date', 'ga:eventCategory', 'ga:eventAction'],
'metricHeader': {'metricHeaderEntries': [{'name': 'ga:totalEvents', 'type': 'INTEGER'},
{'name': 'ga:UniqueEvents', 'type': 'INTEGER'},
{'name': 'ga:eventvalue', 'type': 'INTEGER'}]}},
'data': {'rows': [
{'dimensions': ['20220820', 'accordion ', 'accordion'], 'metrics': [{'values': ['547', '528', '0']}]},
{'dimensions': ['20220817', 'accordion click', 'benefits'], 'metrics': [{'values': ['26', '26', '0']}]},
{'dimensions': ['20220818', 'accordion click', 'for-your-dog '], 'metrics': [{'values': ['1', '1', '0']}]},
{'dimensions': ['20220819', 'account', 'register'], 'metrics': [{'values': ['1465', '1345', '0']}]},
{'dimensions': ['20220820', 'account', 'reminders'], 'metrics': [{'values': ['59', '54', '0']}]}],
'rowCount': 17, 'minimums': [{'values': ['1', '1', '0']}],
'maximums': [{'values': ['40676', '37725', '5001337']}]},
'nextPageToken': '1000'}]}
final dataframe format below
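One possible approach is to flatten each row into a plain dict keyed by the dimension and metric names from `columnHeader`, and only then hand the result to pandas. The sketch below is not from the original post: `report_to_records` is a hypothetical helper, the sample is a shortened copy of the dictionary above, and it assumes each row carries a single `values` list (one date range).

```python
def report_to_records(response):
    """Flatten a response dict shaped like the one above into row dicts."""
    records = []
    for report in response.get("reports", []):
        header = report["columnHeader"]
        dim_names = header.get("dimensions", [])
        metric_names = [
            entry["name"]
            for entry in header["metricHeader"]["metricHeaderEntries"]
        ]
        for row in report.get("data", {}).get("rows", []):
            record = dict(zip(dim_names, row.get("dimensions", [])))
            # Assumption: one date range, so only the first 'values' list is used.
            record.update(zip(metric_names, row["metrics"][0]["values"]))
            records.append(record)
    return records


# Shortened sample with the same shape as the dictionary in the question.
sample = {
    "reports": [{
        "columnHeader": {
            "dimensions": ["ga:date", "ga:eventCategory", "ga:eventAction"],
            "metricHeader": {"metricHeaderEntries": [
                {"name": "ga:totalEvents", "type": "INTEGER"},
                {"name": "ga:UniqueEvents", "type": "INTEGER"},
                {"name": "ga:eventvalue", "type": "INTEGER"},
            ]},
        },
        "data": {
            "rows": [
                {"dimensions": ["20220820", "accordion ", "accordion"],
                 "metrics": [{"values": ["547", "528", "0"]}]},
                {"dimensions": ["20220819", "account", "register"],
                 "metrics": [{"values": ["1465", "1345", "0"]}]},
            ],
            "rowCount": 17,
        },
        "nextPageToken": "1000",
    }]
}

records = report_to_records(sample)
print(records)
```

Because rowCount, minimums, maximums, and nextPageToken are never read, they simply do not appear in the output. Passing `records` to `pandas.DataFrame(records)` should then give one column per dimension and metric; the metric values are still strings at that point, so they would need a cast if numbers are wanted.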

Slicing PySpark DataFrame by converting to Pandas DataFrame, Error when converting back to PySpark DataFrame

I want to slice a PySpark DataFrame by selecting a specific column and several rows as below:
import pandas as pd
# Data filled in our DataFrame
rows = [['Lee Chong Wei', 69, 'Malaysia'],
['Lin Dan', 66, 'China'],
['Srikanth Kidambi', 9, 'India'],
['Kento Momota', 15, 'Japan']]
# Columns of our DataFrame
columns = ['Player', 'Titles', 'Country']
# DataFrame is created
df = spark.createDataFrame(rows, columns)
# Converting DataFrame to pandas
pandas_df = df.toPandas()
# First DataFrame formed by slicing
df1 = pandas_df.iloc[[2], :2]
# Second DataFrame formed by slicing
df2 = pandas_df.iloc[[2], 2:]
# Converting the slices to PySpark DataFrames
df1 = spark.createDataFrame(df1, schema = "Country")
df2 = spark.createDataFrame(df2, schema = "Country")
I am running a notebook on Databricks, so there is no need to import SparkSession.
A ParseException is raised when running the following lines:
df1 = spark.createDataFrame(df1, schema = "Country")
df2 = spark.createDataFrame(df2, schema = "Country")
Please let me know if you have any idea how to solve this issue. The full error message is below:
---------------------------------------------------------------------------
ParseException Traceback (most recent call last)
<command-4065192899858765> in <module>
23
24 # Converting the slices to PySpark DataFrames
---> 25 df1 = spark.createDataFrame(df1, schema = "Country")
26 df2 = spark.createDataFrame(df2, schema = "Country")
/databricks/spark/python/pyspark/sql/session.py in createDataFrame(self, data, schema, samplingRatio, verifySchema)
706
707 if isinstance(schema, str):
--> 708 schema = _parse_datatype_string(schema)
709 elif isinstance(schema, (list, tuple)):
710 # Must re-encode any unicode strings to be consistent with StructField names
/databricks/spark/python/pyspark/sql/types.py in _parse_datatype_string(s)
841 return from_ddl_datatype("struct<%s>" % s.strip())
842 except:
--> 843 raise e
844
845
/databricks/spark/python/pyspark/sql/types.py in _parse_datatype_string(s)
831 try:
832 # DDL format, "fieldname datatype, fieldname datatype".
--> 833 return from_ddl_schema(s)
834 except Exception as e:
835 try:
/databricks/spark/python/pyspark/sql/types.py in from_ddl_schema(type_str)
823 def from_ddl_schema(type_str):
824 return _parse_datatype_json_string(
--> 825 sc._jvm.org.apache.spark.sql.types.StructType.fromDDL(type_str).json())
826
827 def from_ddl_datatype(type_str):
/databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
1302
1303 answer = self.gateway_client.send_command(command)
-> 1304 return_value = get_return_value(
1305 answer, self.gateway_client, self.target_id, self.name)
1306
/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
121 # Hide where the exception came from that shows a non-Pythonic
122 # JVM exception message.
--> 123 raise converted from None
124 else:
125 raise
ParseException:
mismatched input '<EOF>' expecting {'APPLY', 'CALLED', 'CHANGES', 'CLONE', 'COLLECT', 'CONTAINS', 'CONVERT', 'COPY', 'COPY_OPTIONS', 'CREDENTIAL', 'CREDENTIALS', 'DEEP', 'DEFINER', 'DELTA', 'DETERMINISTIC', 'ENCRYPTION', 'EXPECT', 'FAIL', 'FILES', 'FORMAT_OPTIONS', 'HISTORY', 'INCREMENTAL', 'INPUT', 'INVOKER', 'LANGUAGE', 'LIVE', 'MATERIALIZED', 'MODIFIES', 'OPTIMIZE', 'PATTERN', 'READS', 'RESTORE', 'RETURN', 'RETURNS', 'SAMPLE', 'SCD TYPE 1', 'SCD TYPE 2', 'SECURITY', 'SEQUENCE', 'SHALLOW', 'SNAPSHOT', 'SPECIFIC', 'SQL', 'STORAGE', 'STREAMING', 'UPDATES', 'UP_TO_DATE', 'VIOLATION', 'ZORDER', 'ADD', 'AFTER', 'ALL', 'ALTER', 'ALWAYS', 'ANALYZE', 'AND', 'ANTI', 'ANY', 'ARCHIVE', 'ARRAY', 'AS', 'ASC', 'AT', 'AUTHORIZATION', 'BETWEEN', 'BOTH', 'BUCKET', 'BUCKETS', 'BY', 'CACHE', 'CASCADE', 'CASE', 'CAST', 'CATALOG', 'CATALOGS', 'CHANGE', 'CHECK', 'CLEAR', 'CLUSTER', 'CLUSTERED', 'CODE', 'CODEGEN', 'COLLATE', 'COLLECTION', 'COLUMN', 'COLUMNS', 'COMMENT', 'COMMIT', 'COMPACT', 'COMPACTIONS', 'COMPUTE', 'CONCATENATE', 'CONSTRAINT', 'COST', 'CREATE', 'CROSS', 'CUBE', 'CURRENT', 'CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP', 'CURRENT_USER', 'DAY', 'DATA', 'DATABASE', 'DATABASES', 'DATEADD', 'DATEDIFF', 'DBPROPERTIES', 'DEFAULT', 'DEFINED', 'DELETE', 'DELIMITED', 'DESC', 'DESCRIBE', 'DFS', 'DIRECTORIES', 'DIRECTORY', 'DISTINCT', 'DISTRIBUTE', 'DIV', 'DROP', 'ELSE', 'END', 'ESCAPE', 'ESCAPED', 'EXCEPT', 'EXCHANGE', 'EXISTS', 'EXPLAIN', 'EXPORT', 'EXTENDED', 'EXTERNAL', 'EXTRACT', 'FALSE', 'FETCH', 'FIELDS', 'FILTER', 'FILEFORMAT', 'FIRST', 'FN', 'FOLLOWING', 'FOR', 'FOREIGN', 'FORMAT', 'FORMATTED', 'FROM', 'FULL', 'FUNCTION', 'FUNCTIONS', 'GENERATED', 'GLOBAL', 'GRANT', 'GRANTS', 'GROUP', 'GROUPING', 'HAVING', 'HOUR', 'IDENTITY', 'IF', 'IGNORE', 'IMPORT', 'IN', 'INCREMENT', 'INDEX', 'INDEXES', 'INNER', 'INPATH', 'INPUTFORMAT', 'INSERT', 'INTERSECT', 'INTERVAL', 'INTO', 'IS', 'ITEMS', 'JOIN', 'KEY', 'KEYS', 'LAST', 'LATERAL', 'LAZY', 'LEADING', 'LEFT', 'LIKE', 'ILIKE', 
'LIMIT', 'LINES', 'LIST', 'LOAD', 'LOCAL', 'LOCATION', 'LOCK', 'LOCKS', 'LOGICAL', 'MACRO', 'MAP', 'MATCHED', 'MERGE', 'MINUTE', 'MONTH', 'MSCK', 'NAMESPACE', 'NAMESPACES', 'NATURAL', 'NO', NOT, 'NULL', 'NULLS', 'OF', 'ON', 'ONLY', 'OPTION', 'OPTIONS', 'OR', 'ORDER', 'OUT', 'OUTER', 'OUTPUTFORMAT', 'OVER', 'OVERLAPS', 'OVERLAY', 'OVERWRITE', 'PARTITION', 'PARTITIONED', 'PARTITIONS', 'PERCENTILE_CONT', 'PERCENT', 'PIVOT', 'PLACING', 'POSITION', 'PRECEDING', 'PRIMARY', 'PRINCIPALS', 'PROPERTIES', 'PROVIDER', 'PROVIDERS', 'PURGE', 'QUALIFY', 'QUERY', 'RANGE', 'RECIPIENT', 'RECIPIENTS', 'RECORDREADER', 'RECORDWRITER', 'RECOVER', 'REDUCE', 'REFERENCES', 'REFRESH', 'REMOVE', 'RENAME', 'REPAIR', 'REPEATABLE', 'REPLACE', 'REPLICAS', 'RESET', 'RESPECT', 'RESTRICT', 'REVOKE', 'RIGHT', RLIKE, 'ROLE', 'ROLES', 'ROLLBACK', 'ROLLUP', 'ROW', 'ROWS', 'SECOND', 'SCHEMA', 'SCHEMAS', 'SELECT', 'SEMI', 'SEPARATED', 'SERDE', 'SERDEPROPERTIES', 'SESSION_USER', 'SET', 'MINUS', 'SETS', 'SHARE', 'SHARES', 'SHOW', 'SKEWED', 'SOME', 'SORT', 'SORTED', 'START', 'STATISTICS', 'STORED', 'STRATIFY', 'STRUCT', 'SUBSTR', 'SUBSTRING', 'SYNC', 'SYSTEM_TIME', 'SYSTEM_VERSION', 'TABLE', 'TABLES', 'TABLESAMPLE', 'TBLPROPERTIES', TEMPORARY, 'TERMINATED', 'THEN', 'TIME', 'TIMESTAMP', 'TIMESTAMPADD', 'TIMESTAMPDIFF', 'TO', 'TOUCH', 'TRAILING', 'TRANSACTION', 'TRANSACTIONS', 'TRANSFORM', 'TRIM', 'TRUE', 'TRUNCATE', 'TRY_CAST', 'TYPE', 'UNARCHIVE', 'UNBOUNDED', 'UNCACHE', 'UNION', 'UNIQUE', 'UNKNOWN', 'UNLOCK', 'UNSET', 'UPDATE', 'USE', 'USER', 'USING', 'VALUES', 'VERSION', 'VIEW', 'VIEWS', 'WHEN', 'WHERE', 'WINDOW', 'WITH', 'WITHIN', 'YEAR', 'ZONE', IDENTIFIER, BACKQUOTED_IDENTIFIER}(line 1, pos 7)
== SQL ==
Country
-------^^^
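The traceback shows the schema string being passed to `from_ddl_schema`, whose comment describes the expected format as "fieldname datatype, fieldname datatype". A bare "Country" has a field name but no datatype, which would explain the parser hitting `<EOF>` at position 7. A minimal sketch of building such a string follows; `ddl_schema` is a hypothetical helper, and the types chosen (string, bigint) are assumptions about the sample data rather than something stated in the post.

```python
def ddl_schema(fields):
    """Build a DDL schema string from (column name, type name) pairs."""
    parts = []
    for name, type_name in fields:
        if "`" in name:
            # Escaping backticks inside names is left out of this sketch.
            raise ValueError("backticks in column names are not handled here")
        parts.append("`{}` {}".format(name, type_name))
    return ", ".join(parts)


# pandas_df.iloc[[2], :2] keeps Player and Titles; iloc[[2], 2:] keeps Country.
schema_df1 = ddl_schema([("Player", "string"), ("Titles", "bigint")])
schema_df2 = ddl_schema([("Country", "string")])
print(schema_df1)
print(schema_df2)

# In the notebook, these strings would replace schema="Country", for example:
# df1 = spark.createDataFrame(df1, schema=schema_df1)
# df2 = spark.createDataFrame(df2, schema=schema_df2)
```

Note that the first slice holds the Player and Titles columns, so a schema naming only Country would not match it even with a datatype added.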

Jupyter .save_to_html function does not store config

I'm trying to use the .save_to_html() function for a kepler.gl Jupyter notebook map.
It all works great inside Jupyter, and I can reload the same map with a defined config.
Where it goes wrong is when I use the save_to_html() function. The map is saved to an HTML file, but the configuration reverts to the basic configuration from before I customized it.
Please help! I love kepler; once I solve this little thing, it will be our absolute go-to tool.
Thanks
I have tried changing the filters, colours, and point sizes. None of this works.
map_1 = KeplerGl(height=500, data={'data': df},config=config)
map_1
config = map_1.config
config
map_1.save_to_html(data={'data_1': df},
file_name='privateers.html',config=config)
Config
{'version': 'v1',
'config': {'visState': {'filters': [{'dataId': 'data',
'id': 'x8t9c53mf',
'name': 'time_update',
'type': 'timeRange',
'value': [1565687902187.5417, 1565775465282],
'enlarged': True,
'plotType': 'histogram',
'yAxis': None},
{'dataId': 'data',
'id': 'biysqlu36',
'name': 'user_id',
'type': 'multiSelect',
'value': ['HNc0SI3WsQfhOFRF2THnUEfmqJC3'],
'enlarged': False,
'plotType': 'histogram',
'yAxis': None}],
'layers': [{'id': 'ud6168',
'type': 'point',
'config': {'dataId': 'data',
'label': 'Point',
'color': [18, 147, 154],
'columns': {'lat': 'lat', 'lng': 'lng', 'altitude': None},
'isVisible': True,
'visConfig': {'radius': 5,
'fixedRadius': False,
'opacity': 0.8,
'outline': False,
'thickness': 2,
'strokeColor': None,
'colorRange': {'name': 'Uber Viz Qualitative 1.2',
'type': 'qualitative',
'category': 'Uber',
'colors': ['#12939A',
'#DDB27C',
'#88572C',
'#FF991F',
'#F15C17',
'#223F9A'],
'reversed': False},
'strokeColorRange': {'name': 'Global Warming',
'type': 'sequential',
'category': 'Uber',
'colors': ['#5A1846',
'#900C3F',
'#C70039',
'#E3611C',
'#F1920E',
'#FFC300']},
'radiusRange': [0, 50],
'filled': True},
'textLabel': [{'field': None,
'color': [255, 255, 255],
'size': 18,
'offset': [0, 0],
'anchor': 'start',
'alignment': 'center'}]},
'visualChannels': {'colorField': {'name': 'ride_id', 'type': 'string'},
'colorScale': 'ordinal',
'strokeColorField': None,
'strokeColorScale': 'quantile',
'sizeField': None,
'sizeScale': 'linear'}},
{'id': 'an8tbef',
'type': 'point',
'config': {'dataId': 'data',
'label': 'previous',
'color': [221, 178, 124],
'columns': {'lat': 'previous_lat',
'lng': 'previous_lng',
'altitude': None},
'isVisible': False,
'visConfig': {'radius': 10,
'fixedRadius': False,
'opacity': 0.8,
'outline': False,
'thickness': 2,
'strokeColor': None,
'colorRange': {'name': 'Global Warming',
'type': 'sequential',
'category': 'Uber',
'colors': ['#5A1846',
'#900C3F',
'#C70039',
'#E3611C',
'#F1920E',
'#FFC300']},
'strokeColorRange': {'name': 'Global Warming',
'type': 'sequential',
'category': 'Uber',
'colors': ['#5A1846',
'#900C3F',
'#C70039',
'#E3611C',
'#F1920E',
'#FFC300']},
'radiusRange': [0, 50],
'filled': True},
'textLabel': [{'field': None,
'color': [255, 255, 255],
'size': 18,
'offset': [0, 0],
'anchor': 'start',
'alignment': 'center'}]},
'visualChannels': {'colorField': None,
'colorScale': 'quantile',
'strokeColorField': None,
'strokeColorScale': 'quantile',
'sizeField': None,
'sizeScale': 'linear'}},
{'id': 'ilpixu9',
'type': 'arc',
'config': {'dataId': 'data',
'label': ' -> previous arc',
'color': [146, 38, 198],
'columns': {'lat0': 'lat',
'lng0': 'lng',
'lat1': 'previous_lat',
'lng1': 'previous_lng'},
'isVisible': True,
'visConfig': {'opacity': 0.8,
'thickness': 2,
'colorRange': {'name': 'Global Warming',
'type': 'sequential',
'category': 'Uber',
'colors': ['#5A1846',
'#900C3F',
'#C70039',
'#E3611C',
'#F1920E',
'#FFC300']},
'sizeRange': [0, 10],
'targetColor': None},
'textLabel': [{'field': None,
'color': [255, 255, 255],
'size': 18,
'offset': [0, 0],
'anchor': 'start',
'alignment': 'center'}]},
'visualChannels': {'colorField': None,
'colorScale': 'quantile',
'sizeField': None,
'sizeScale': 'linear'}},
{'id': 'inv52pp',
'type': 'line',
'config': {'dataId': 'data',
'label': ' -> previous line',
'color': [136, 87, 44],
'columns': {'lat0': 'lat',
'lng0': 'lng',
'lat1': 'previous_lat',
'lng1': 'previous_lng'},
'isVisible': False,
'visConfig': {'opacity': 0.8,
'thickness': 2,
'colorRange': {'name': 'Global Warming',
'type': 'sequential',
'category': 'Uber',
'colors': ['#5A1846',
'#900C3F',
'#C70039',
'#E3611C',
'#F1920E',
'#FFC300']},
'sizeRange': [0, 10],
'targetColor': None},
'textLabel': [{'field': None,
'color': [255, 255, 255],
'size': 18,
'offset': [0, 0],
'anchor': 'start',
'alignment': 'center'}]},
'visualChannels': {'colorField': None,
'colorScale': 'quantile',
'sizeField': None,
'sizeScale': 'linear'}}],
'interactionConfig': {'tooltip': {'fieldsToShow': {'data': ['time_ride_start',
'user_id',
'ride_id']},
'enabled': True},
'brush': {'size': 0.5, 'enabled': False}},
'layerBlending': 'normal',
'splitMaps': []},
'mapState': {'bearing': 0,
'dragRotate': False,
'latitude': 49.52565611453996,
'longitude': 6.2730441822977845,
'pitch': 0,
'zoom': 9.244725880765998,
'isSplit': False},
'mapStyle': {'styleType': 'dark',
'topLayerGroups': {},
'visibleLayerGroups': {'label': True,
'road': True,
'border': False,
'building': True,
'water': True,
'land': True,
'3d building': False},
'threeDBuildingColor': [9.665468314072013,
17.18305478057247,
31.1442867897876],
'mapStyles': {}}}}
Expected:
Fully configured map, as in the Jupyter widget
Actual:
Colors and filters are not configured. The size and position of the map are carried over, so if I save it while looking at an empty area, the HTML file opens on the same area.
The kepler.gl Jupyter user guide shows the following under the save section:
# this will save current map
map_1.save_to_html(file_name='first_map.html')
# this will save map with provided data and config
map_1.save_to_html(data={'data_1': df}, config=config, file_name='first_map.html')
# this will save map with the interaction panel disabled
map_1.save_to_html(file_name='first_map.html', read_only=True)
So it looks like either this is a bug, if the config parameter does not work, or you are changing the map configuration after assigning it to config. In the latter case, passing the current configuration directly would fix it:
map_1.save_to_html(data={'data_1': df},
file_name='privateers.html',config=map_1.config)
I think it is a bug (or feature?) that happens when you save the map configuration in the same cell, or before the map has been displayed. Generally, the config only exists after the map has actually been rendered.
The problem, as far as I can see (and how I solved a similar one), is that the 'data' key you used when instantiating the map (1) differs from the one you passed when saving the HTML (2).
map_1 = KeplerGl(height=500, data={'data': df},config=config)  # 1)
map_1.save_to_html(data={'data_1': df}, file_name='privateers.html',config=config)  # 2)
Name both keys the same and your HTML file will use the correct configuration.
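A quick way to spot such a mismatch before saving is to compare the dataId values in the config with the keys of the data dict. The sketch below is plain Python over the config structure shown in the question; `config_data_ids` and `missing_datasets` are hypothetical helper names, not part of kepler.gl, and the handling of list-valued dataId entries is a defensive assumption.

```python
def config_data_ids(config):
    """Collect every dataset id that a kepler.gl config dict refers to."""
    vis_state = config.get("config", {}).get("visState", {})
    ids = set()

    def add(value):
        # Assumption: a dataId may be a single string or a list of strings.
        if isinstance(value, (list, tuple)):
            ids.update(value)
        elif value is not None:
            ids.add(value)

    for flt in vis_state.get("filters", []):
        add(flt.get("dataId"))
    for layer in vis_state.get("layers", []):
        add(layer.get("config", {}).get("dataId"))
    tooltip = vis_state.get("interactionConfig", {}).get("tooltip", {})
    ids.update(tooltip.get("fieldsToShow", {}).keys())
    return ids


def missing_datasets(config, data):
    """Return the ids used by the config that have no matching key in data."""
    return sorted(config_data_ids(config) - set(data))


# Trimmed-down version of the config from the question.
config = {"version": "v1", "config": {"visState": {
    "filters": [{"dataId": "data", "name": "user_id", "type": "multiSelect"}],
    "layers": [{"id": "ud6168", "type": "point",
                "config": {"dataId": "data", "label": "Point"}}],
    "interactionConfig": {"tooltip": {"fieldsToShow": {"data": ["user_id"]},
                                      "enabled": True}},
}}}

print(missing_datasets(config, {"data_1": None}))  # key passed to save_to_html
print(missing_datasets(config, {"data": None}))    # key used when creating the map
```

An empty result means every dataset the config refers to is present under the same name in the data passed to save_to_html.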
Had this issue as well. Solved it by converting all pandas column dtypes to ones that are JSON serializable, i.e. converting the 'datetime' column from dtype <m8[ns] to object.

How to explode quantiles in hive

I am trying to get quantiles of a field and I want to explode them so that each value is a separate row rather than all of them forming a single array. First, I calculate 20 quantiles as below:
select percentile_approx(probability,
array(0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35,
0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75,
0.8, 0.85, 0.9, 0.95, 1.0)) as quantiles
from my_table
The above code gives the array below:
[0.17808226409449213, 0.18250386256254247, 0.18525207046272224, 0.18800918537059694, 0.1907743631982954, 0.200154288105411, 0.30419108474685375, 0.3299437131426226, 0.352433633041806, 0.3589875791100745, 0.37581775428218006, 0.3825168120904496, 0.3966342376441502, 0.4173753044627164, 0.43268994899295316, 0.44015098935735575, 0.461413042176578, 0.4720422104416653, 0.487852850513824, 0.5050010622123932]
But since I wanted to explode it, I tried using lateral view posexplode like below (actually, I passed the output from the above code):
select i, x
lateral view posexplode([0.17808226409449213, 0.18250386256254247,0.18525207046272224, 0.18800918537059694,
0.1907743631982954, 0.200154288105411, 0.30419108474685375, 0.3299437131426226,
0.352433633041806, 0.3589875791100745, 0.37581775428218006, 0.3825168120904496,
0.3966342376441502, 0.4173753044627164, 0.43268994899295316, 0.44015098935735575,
0.461413042176578, 0.4720422104416653, 0.487852850513824, 0.5050010622123932]) q as i, x
but it gives the error message below:
ParseException: "\nextraneous input '[' expecting {'(', ')', 'SELECT', 'FROM', 'ADD', 'AS', 'ALL', 'ANY', 'DISTINCT', 'WHERE', 'GROUP', 'BY', 'GROUPING', 'SETS', 'CUBE', 'ROLLUP', 'ORDER', 'HAVING', 'LIMIT', 'AT', 'OR', 'AND', 'IN', NOT, 'NO', 'EXISTS', 'BETWEEN', 'LIKE', RLIKE, 'IS', 'NULL', 'TRUE', 'FALSE', 'NULLS', 'ASC', 'DESC', 'FOR', 'INTERVAL', 'CASE', 'WHEN', 'THEN', 'ELSE', 'END', 'JOIN', 'CROSS', 'OUTER', 'INNER', 'LEFT', 'SEMI', 'RIGHT', 'FULL', 'NATURAL', 'ON', 'PIVOT', 'LATERAL', 'WINDOW', 'OVER', 'PARTITION', 'RANGE', 'ROWS', 'UNBOUNDED', 'PRECEDING', 'FOLLOWING', 'CURRENT', 'FIRST', 'AFTER', 'LAST', 'ROW', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'DIRECTORY', 'VIEW', 'REPLACE', 'INSERT', 'DELETE', 'INTO', 'DESCRIBE', 'EXPLAIN', 'FORMAT', 'LOGICAL', 'CODEGEN', 'COST', 'CAST', 'SHOW', 'TABLES', 'COLUMNS', 'COLUMN', 'USE', 'PARTITIONS', 'FUNCTIONS', 'DROP', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 'TO', 'TABLESAMPLE', 'STRATIFY', 'ALTER', 'RENAME', 'ARRAY', 'MAP', 'STRUCT', 'COMMENT', 'SET', 'RESET', 'DATA', 'START', 'TRANSACTION', 'COMMIT', 'ROLLBACK', 'MACRO', 'IGNORE', 'BOTH', 'LEADING', 'TRAILING', 'IF', 'POSITION', 'EXTRACT', '+', '-', '*', 'DIV', '~', 'PERCENT', 'BUCKET', 'OUT', 'OF', 'SORT', 'CLUSTER', 'DISTRIBUTE', 'OVERWRITE', 'TRANSFORM', 'REDUCE', 'SERDE', 'SERDEPROPERTIES', 'RECORDREADER', 'RECORDWRITER', 'DELIMITED', 'FIELDS', 'TERMINATED', 'COLLECTION', 'ITEMS', 'KEYS', 'ESCAPED', 'LINES', 'SEPARATED', 'FUNCTION', 'EXTENDED', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'LAZY', 'FORMATTED', 'GLOBAL', TEMPORARY, 'OPTIONS', 'UNSET', 'TBLPROPERTIES', 'DBPROPERTIES', 'BUCKETS', 'SKEWED', 'STORED', 'DIRECTORIES', 'LOCATION', 'EXCHANGE', 'ARCHIVE', 'UNARCHIVE', 'FILEFORMAT', 'TOUCH', 'COMPACT', 'CONCATENATE', 'CHANGE', 'CASCADE', 'RESTRICT', 'CLUSTERED', 'SORTED', 'PURGE', 'INPUTFORMAT', 'OUTPUTFORMAT', DATABASE, DATABASES, 'DFS', 'TRUNCATE', 'ANALYZE', 'COMPUTE', 'LIST', 'STATISTICS', 'PARTITIONED', 'EXTERNAL', 'DEFINED', 'REVOKE', 'GRANT', 'LOCK', 
'UNLOCK', 'MSCK', 'REPAIR', 'RECOVER', 'EXPORT', 'IMPORT', 'LOAD', 'ROLE', 'ROLES', 'COMPACTIONS', 'PRINCIPALS', 'TRANSACTIONS', 'INDEX', 'INDEXES', 'LOCKS', 'OPTION', 'ANTI', 'LOCAL', 'INPATH', STRING, BIGINT_LITERAL, SMALLINT_LITERAL, TINYINT_LITERAL, INTEGER_VALUE, DECIMAL_VALUE, DOUBLE_LITERAL, BIGDECIMAL_LITERAL, IDENTIFIER, BACKQUOTED_IDENTIFIER}(line 3, pos 24)\n\n== SQL ==\n\nselect i, x\nlateral view posexplode([0.17808226409449213, 0.18250386256254247,0.18525207046272224, 0.18800918537059694, \n------------------------^^^\n 0.1907743631982954, 0.200154288105411, 0.30419108474685375, 0.3299437131426226, \n 0.352433633041806, 0.3589875791100745, 0.37581775428218006, 0.3825168120904496, \n 0.3966342376441502, 0.4173753044627164, 0.43268994899295316, 0.44015098935735575, \n 0.461413042176578, 0.4720422104416653, 0.487852850513824, 0.5050010622123932]) q as i, x\n"
On the other hand, if I create the array inside posexplode like below, it works fine:
select i, x
lateral view posexplode(array('a', 'b', 'c')) q as i, x
| i| x|
+---+---+
| 0| a|
| 1| b|
| 2| c|
+---+---+
select s.*, e.i, e.x
from
(
select percentile_approx(probability,
array(0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35,
0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75,
0.8, 0.85, 0.9, 0.95, 0.99)) as quantiles
from my_table
) s lateral view outer posexplode (s.quantiles) e as i, x
;
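To make the target shape concrete, here is a small Python sketch (not Hive) of what posexplode does to an array: it yields one (position, value) row per element, with positions starting at 0 as in the table above. The function name mirrors the Hive one purely for illustration, and the values are the first three quantiles from the array in the question.

```python
def posexplode(values):
    """Return one (position, value) pair per array element, zero-based."""
    return list(enumerate(values))


quantiles = [0.17808226409449213, 0.18250386256254247, 0.18525207046272224]
for i, x in posexplode(quantiles):
    print(i, x)
```

In the query, the array is taken from the subquery column (s.quantiles) instead of being pasted in as a bracketed literal, which is the form the parser rejected.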

Keras: Missing clear_session, set_session, and get_session?

I'm using Keras 2.2.0 and am trying to do something like the following:
import tensorflow as tf
import keras.backend as K
K.clear_session()
sess = tf.Session()
K.set_session(sess)
...
with K.get_session() as sess:
    ...
However, I get errors saying AttributeError: 'module' object has no attribute 'clear_session'. So it seems this functionality is no longer in keras.backend?
For instance, if I do dir(keras.backend), I get:
['Function', 'NAME_SCOPE_STACK', 'Print', 'RandomStreams', 'T', 'T_softsign', '_BACKEND', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '_backend', '_config', '_config_path', '_epsilon', '_floatx', '_image_data_format', '_keras_base_dir', '_keras_dir', 'abs', 'absolute_import', 'all', 'any', 'arange', 'argmax', 'argmin', 'backend', 'batch_dot', 'batch_flatten', 'batch_get_value', 'batch_normalization', 'batch_set_value', 'bias_add', 'binary_crossentropy', 'cast', 'cast_to_floatx', 'categorical_crossentropy', 'clip', 'common', 'concatenate', 'constant', 'contextmanager', 'conv1d', 'conv2d', 'conv2d_transpose', 'conv3d', 'conv3d_transpose', 'cos', 'count_params', 'ctc_batch_cost', 'ctc_cost', 'ctc_create_skip_idxs', 'ctc_interleave_blanks', 'ctc_path_probs', 'ctc_update_log_p', 'cumprod', 'cumsum', 'defaultdict', 'depthwise_conv2d', 'division', 'dot', 'dropout', 'dtype', 'elu', 'epsilon', 'equal', 'eval', 'exp', 'expand_dims', 'eye', 'f', 'flatten', 'floatx', 'foldl', 'foldr', 'function', 'gather', 'get_uid', 'get_value', 'get_variable_shape', 'gradients', 'greater', 'greater_equal', 'hard_sigmoid', 'has_arg', 'identity', 'ifelse', 'image_data_format', 'image_dim_ordering', 'importlib', 'in_test_phase', 'in_top_k', 'in_train_phase', 'int_shape', 'is_keras_tensor', 'is_placeholder', 'is_sparse', 'is_tensor', 'json', 'l2_normalize', 'learning_phase', 'less', 'less_equal', 'local_conv1d', 'local_conv2d', 'log', 'logsumexp', 'map_fn', 'max', 'maximum', 'mean', 'min', 'minimum', 'moving_average_update', 'name_scope', 'ndim', 'normalize_batch_in_training', 'not_equal', 'np', 'one_hot', 'ones', 'ones_like', 'os', 'pattern_broadcast', 'permute_dimensions', 'placeholder', 'pool', 'pool2d', 'pool3d', 'pow', 'print_function', 'print_tensor', 'prod', 'py_all', 'py_any', 'py_slice', 'py_sum', 'random_binomial', 'random_normal', 'random_normal_variable', 'random_uniform', 'random_uniform_variable', 'relu', 'repeat', 'repeat_elements', 'reset_uids', 
'reshape', 'resize_images', 'resize_volumes', 'reverse', 'rnn', 'round', 'separable_conv1d', 'separable_conv2d', 'set_epsilon', 'set_floatx', 'set_image_data_format', 'set_image_dim_ordering', 'set_learning_phase', 'set_value', 'shape', 'sigmoid', 'sign', 'sin', 'slice', 'softmax', 'softplus', 'softsign', 'sparse_categorical_crossentropy', 'spatial_2d_padding', 'spatial_3d_padding', 'sqrt', 'square', 'squeeze', 'stack', 'std', 'stop_gradient', 'sum', 'switch', 'sys', 'tanh', 'temporal_padding', 'th_sparse_module', 'theano', 'theano_backend', 'tile', 'to_dense', 'transpose', 'truncated_normal', 'update', 'update_add', 'update_sub', 'var', 'variable', 'zeros', 'zeros_like']
and don't see any of those 3 in there.
How should I be writing this code in modern Keras?
Thanks!
EDIT: https://github.com/keras-team/keras/issues/11015
It seems that it is not available, and you may have to downgrade.
It might be that your backend is set to Theano (I believe clear_session is only available with the TensorFlow backend in Keras). You can change the backend setting in your keras.json to TensorFlow, and clear_session should then be available to you.
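To check which backend is configured, the setting can be read without importing Keras at all. This is a minimal sketch that assumes the default config location (~/.keras/keras.json) and that the KERAS_BACKEND environment variable, when set, takes precedence over the file; `configured_keras_backend` is a hypothetical helper name.

```python
import json
import os


def configured_keras_backend(config_path=None, environ=None):
    """Return the configured backend name, or None if nothing is set."""
    environ = os.environ if environ is None else environ
    # The environment variable overrides whatever keras.json says.
    if environ.get("KERAS_BACKEND"):
        return environ["KERAS_BACKEND"]
    if config_path is None:
        # Assumption: the config lives in the default ~/.keras directory.
        config_path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
    try:
        with open(config_path) as handle:
            return json.load(handle).get("backend")
    except (OSError, ValueError):
        # Missing or unreadable config file: nothing explicitly configured.
        return None


print(configured_keras_backend())
```

If this prints theano, that would be consistent with the dir(keras.backend) output in the question, which lists Theano-specific names such as theano_backend and th_sparse_module.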