How to use WHERE statement on JSON stored in Presto SQL column to filter? - sql

In Presto, I have a table in which one column holds data like the following:
header | header 2
{Data: [{'item1': 'stuff1', 'item2': 'stuff2', 'item3': 'stuff3'}, {...}]} | cell 2
{Data: [{'item1': 'stuff11', 'item2': 'stuff21', 'item3': 'stuff31'}, {...}]} | cell 4
I was able to SELECT a nested value with:
SELECT header.Data[1].item1 FROM table
which returns:
header
stuff1
stuff11
However, if I want to filter the table using the WHERE statement:
SELECT * FROM table WHERE header.Data[1].item1 = 'stuff1'
The above statement threw an error and didn't work.
I would like to return something like
header | header 2
{Data: [{'item1': 'stuff1', 'item2': 'stuff2', 'item3': 'stuff3'}, {...}]} | cell 2
Any input would be helpful. Thanks
I've also tried several other queries, such as the one below, but all returned a similar error:
WHERE header.Data[1].item1 = 'stuff1'
An example of the error:
Query:
SELECT header.Data[1].item1 AS f FROM table WHERE f LIKE '%stuff%'
'''
An error occurred while calling o12.execute. : java.sql.SQLException: Query failed (#20220330_200148_01673_9bq5k): line 2:7: Column 'f' cannot be resolved at io.prestosql.jdbc.AbstractPrestoResultSet.resultsException(AbstractPrestoResultSet.java:1761) at io.prestosql.jdbc.PrestoResultSet.getColumns(PrestoResultSet.java:252) at io.prestosql.jdbc.PrestoResultSet.create(PrestoResultSet.java:54) at io.prestosql.jdbc.PrestoStatement.internalExecute(PrestoStatement.java:249) at io.prestosql.jdbc.PrestoStatement.execute(PrestoStatement.java:227) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) at py4j.Gateway.invoke(Gateway.java:259) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:209) at java.lang.Thread.run(Thread.java:750) Caused by: io.prestosql.spi.PrestoException: line 2:7: Column 'f' cannot be resolved at io.prestosql.sql.analyzer.SemanticExceptions.semanticException(SemanticExceptions.java:48) at io.prestosql.sql.analyzer.SemanticExceptions.semanticException(SemanticExceptions.java:43) at io.prestosql.sql.analyzer.SemanticExceptions.missingAttributeException(SemanticExceptions.java:33) at io.prestosql.sql.analyzer.Scope.lambda$resolveField$7(Scope.java:228) at java.base/java.util.Optional.orElseThrow(Optional.java:408) at io.prestosql.sql.analyzer.Scope.resolveField(Scope.java:228) at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitIdentifier(ExpressionAnalyzer.java:438) at 
io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitIdentifier(ExpressionAnalyzer.java:342) at io.prestosql.sql.tree.Identifier.accept(Identifier.java:72) at io.prestosql.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:27) at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:365) at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitLikePredicate(ExpressionAnalyzer.java:702) at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.visitLikePredicate(ExpressionAnalyzer.java:342) at io.prestosql.sql.tree.LikePredicate.accept(LikePredicate.java:76) at io.prestosql.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:27) at io.prestosql.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:365) at io.prestosql.sql.analyzer.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:303) at io.prestosql.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1691) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.analyzeExpression(StatementAnalyzer.java:2606) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.analyzeWhere(StatementAnalyzer.java:2465) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.lambda$visitQuerySpecification$23(StatementAnalyzer.java:1528) at java.base/java.util.Optional.ifPresent(Optional.java:183) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:1528) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:322) at io.prestosql.sql.tree.QuerySpecification.accept(QuerySpecification.java:144) at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:339) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:349) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1039) at 
io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:322) at io.prestosql.sql.tree.Query.accept(Query.java:107) at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27) at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:339) at io.prestosql.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:308) at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:83) at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:75) at io.prestosql.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:256) at io.prestosql.execution.SqlQueryExecution.(SqlQueryExecution.java:182) at io.prestosql.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:757) at io.prestosql.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:123) at io.prestosql.$gen.Presto_343____20220330_135137_2.call(Unknown Source) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829)
'''

The alias f introduced by SELECT header.Data[1].item1 AS f is not available in the WHERE clause, because WHERE is evaluated before the SELECT list. You need to repeat the whole expression:
where header.Data[1].item1 LIKE '%stuff%'
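If repeating the expression gets unwieldy, another common rewrite is to define the alias in an inner query and filter on it in the outer query. The sketch below only illustrates that shape. It runs against SQLite through Python's sqlite3 module so that it is self-contained; the table `events` and its plain `item1` column are hypothetical stand-ins for `header.Data[1].item1`. SQLite is more lenient than Presto about aliases in WHERE, so this shows that the rewritten form works, not the original failure.

```python
import sqlite3

# Hypothetical stand-in table: `item1` plays the role of header.Data[1].item1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, item1 TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "stuff1"), (2, "stuff11"), (3, "other")],
)

# The alias f is defined in the inner query, so the outer WHERE can see it
# as an ordinary column of the derived table t.
rows = conn.execute(
    """
    SELECT id, f
    FROM (SELECT id, item1 AS f FROM events) AS t
    WHERE f LIKE '%stuff%'
    ORDER BY id
    """
).fetchall()

print(rows)
```

In Presto the same idea would be written as a subquery in FROM or as a WITH clause around the original SELECT.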

Related

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

I had been running a Hive job successfully, but since yesterday it fails with an error after the map phase completes. The query and logs are below:
INSERT INTO TABLE zong_dwh.TEMP_P_UFDR_imp6
SELECT
from_unixtime(begin_time+5*3600,'yyyy-MM-dd') AS Date1,
from_unixtime(begin_time+5*3600,'HH') AS Hour1,
MSISDN AS MSISDN,
A.prot_type AS Protocol,
B.protocol as Application,
host AS Domain,
D.browser_name AS browser_type,
cast (null as varchar(10)) as media_format,
C.ter_type_name_en as device_category,
C.ter_brand_name as device_brand,
rat as session_technology,
case
when rat=1 then Concat(mcc,mnc,lac,ci)
when rat=2 then Concat(mcc,mnc,lac,sac)
when rat=6 then concat(mcc,mnc,eci)
end AS Actual_Site_ID,
sum(coalesce(L4_DW_THROUGHPUT,0)+coalesce(L4_UL_THROUGHPUT,0)) as total_data_volume,
sum(coalesce(TCP_UL_RETRANS_WITHPL,0)/coalesce(TCP_DW_RETRANS_WITHPL,1)) AS retrans_rate,
sum(coalesce(DATATRANS_UL_DURATION,0) + coalesce(DATATRANS_DW_DURATION,0)) as duration,
count(sessionkey) as usage_quantity,
round(sum(L4_DW_THROUGHPUT)/1024/1024,4)/sum(end_time*1000+end_time_msel-begin_time*1000-begin_time_msel) AS downlink_throughput,
round(sum(L4_UL_THROUGHPUT)/1024/1024,4)/sum(end_time*1000+end_time_msel-begin_time*1000-begin_time_msel) as uplink_throughput
from
ps.detail_ufdr_http_browsing_17923 A
INNER JOIN ps.dim_protocol B ON B.protocol_id=A.prot_type
INNER JOIN ps.dim_terminal C on substr(A.imei,1,8)=C.tac
inner join ps.dim_browser_type D on A.browser_type=D.browser_type_id
Group by
from_unixtime(begin_time+5*3600,'yyyy-MM-dd'),
from_unixtime(begin_time+5*3600,'HH'),MSISDN,
prot_type,
B.protocol,
host,
D.browser_name,
cast (null as varchar(10)),
C.ter_type_name_en,
C.ter_brand_name,
rat,
case
when rat=1 then Concat(mcc,mnc,lac,ci)
when rat=2 then Concat(mcc,mnc,lac,sac)
when rat=6 then concat(mcc,mnc,eci)
end;
Logs:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"2019-02-11","_col1":"05","_col2":"3002346407","_col3":146,"_col4":"","_col5":null,"_col6":null,"_col7":"35538908","_col8":6,"_col9":"","_col10":"","_col11":"","_col12":"0ED1102"},"value":{"_col0":75013,"_col1":4.0,"_col2":2253648000,"_col3":5,"_col4":0}}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:182)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1769)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:176)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"2019-02-11","_col1":"05","_col2":"3002346407","_col3":146,"_col4":"","_col5":null,"_col6":null,"_col7":"35538908","_col8":6,"_col9":"","_col10":"","_col11":"","_col12":"0ED1102"},"value":{"_col0":75013,"_col1":4.0,"_col2":2253648000,"_col3":5,"_col4":0}}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(org.apache.hadoop.io.Text,org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.IntWritable) on object org.apache.hadoop.hive.ql.udf.UDFConv#2e2f720 of class org.apache.hadoop.hive.ql.udf.UDFConv with arguments {:org.apache.hadoop.io.Text, 16:org.apache.hadoop.io.IntWritable, 10:org.apache.hadoop.io.IntWritable} of size 3
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1034)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:182)
at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:193)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:104)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1019)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
... 7 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1010)
... 18 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160)
... 23 more

apache-spark-sql: Error does not return the column name with error

When I use Spark SQL to query the data in a DataFrame, my query returns an error. From the error, I cannot figure out which column is causing it.
My table is gigantic with 120 columns and 176M rows.
Here is my query:
%sql
select order_entry_date, count(1) cnt, sum(paid_units) paid_unit, sum(total_revenue) rev
from mart_bc_order_item
group by 1
order by 1
The error is below:
java.lang.NumberFormatException: For input string: "�"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:252)
at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:125)
at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:94)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:167)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:166)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Can someone help here?
Thanks,
Vivek
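You are getting the NumberFormatException because a string value could not be parsed as a number. The stack trace shows the failure in CSVTypeCast.castTo calling toInt, so a column that the CSV schema treats as an integer contains a non-numeric value. Check your schema and your data again.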
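Since the exception does not name the column, one way to narrow it down is to scan the raw CSV outside Spark and report every cell in the integer-typed columns that does not parse. The following is a minimal sketch of that idea using only Python's standard library. The helper `find_non_integer_cells`, the sample data, and the assumption that `paid_units` is the integer column are all hypothetical; substitute the columns your schema actually declares as integers.

```python
import csv
import io


def find_non_integer_cells(lines, int_columns):
    """Yield (line_number, column, value) for cells that int() cannot parse.

    Assumes one record per physical line, with the header on line 1.
    Note: Python's int() tolerates surrounding whitespace and underscores,
    which Java's Integer.parseInt does not, so this check is slightly looser.
    """
    reader = csv.DictReader(lines)
    for line_number, row in enumerate(reader, start=2):
        for column in int_columns:
            value = row.get(column)
            if value is None or value == "":
                continue  # empty cells are skipped in this sketch
            try:
                int(value)
            except ValueError:
                yield (line_number, column, value)


# Hypothetical sample: the second data row carries a mis-encoded character.
sample = io.StringIO(
    "order_entry_date,paid_units,total_revenue\n"
    "2017-01-01,3,10\n"
    "2017-01-02,\ufffd,20\n"
)
bad_cells = list(find_non_integer_cells(sample, ["paid_units"]))
print(bad_cells)
```

For a file of this size you would pass an open file handle instead of the in-memory sample, since the generator reads one row at a time.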

Oracle SQL: Select specific string from CLOB field

I have a table where one of the fields is a CLOB that stores error message information.
The CLOB field has the following content:
oracle.retail.sim.common.core.SimServerException: Error processing message! [Inbound: true, MessageType: ItemLocCre, BusinessId: 1101505002]
at oracle.retail.sim.service.mps.SimMessageCommand.buildException(Unknown Source)
at oracle.retail.sim.service.mps.SimMessageProcessCommand.doExecute(Unknown Source)
at oracle.retail.sim.common.core.Command.execute(Unknown Source)
at sun.reflect.GeneratedMethodAccessor273.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:221)
Caused by: oracle.retail.sim.common.core.SimServerException: Item not found for Id: 1101505002
at oracle.retail.sim.server.integration.consumer.itemloc.ItemLocConsumer.buildItemNotFoundException(Unknown Source)
at oracle.retail.sim.server.integration.consumer.itemloc.ItemLocCreateConsumer.handleMessage(Unknown Source)
at oracle.retail.sim.server.integration.consumer.itemloc.ItemLocCreateConsumer.handleMessage(Unknown Source)
at oracle.retail.sim.server.integration.consumer.SimMessageConsumerFactory.consume(Unknown Source)
... 56 more
I'm trying to display the content of the CLOB directly in the PL/SQL output, so I'm using the following query:
select id, dbms_lob.substr(message_error, 4000, 1) AS ERROR_MESSAGE
from THE_TABLE;
What I want is to select only the line that contains the 'Caused by' string. Specifically, I need to extract only the following error message:
Item not found for Id: 1101505002
Is it possible to do so with only a SELECT statement?
Thanks in advance,
Best regards,
The following query (substitute your actual table and column names) will extract one line of text, from the words Caused by to the end of that line. It doesn't matter if the line of text begins with Caused by - you will only get everything from those words to the end of the line.
If you need a shorter sub-string from that, you will need to explain in more detail how it can be "recognized" - how do you decide what can be left out and what must be returned. How is that delimited?
select regexp_substr(message, 'Caused by:.*') as caused_line
from test_data;
(Note that by default the wildcard character . does not match the end-of-line in regular expressions in Oracle.)
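To see how the pattern behaves on the sample text, here is a small sketch using Python's re module as a stand-in for Oracle's regular expressions (in both, `.` does not match a newline by default). The second pattern is an assumption about how the shorter message is delimited: it treats everything after the exception class name and its colon as the message, and captures that part in a group.

```python
import re

# Abbreviated copy of the CLOB content from the question.
clob_text = (
    "oracle.retail.sim.common.core.SimServerException: Error processing message! "
    "[Inbound: true, MessageType: ItemLocCre, BusinessId: 1101505002]\n"
    "\tat oracle.retail.sim.common.core.Command.execute(Unknown Source)\n"
    "Caused by: oracle.retail.sim.common.core.SimServerException: "
    "Item not found for Id: 1101505002\n"
    "\tat oracle.retail.sim.server.integration.consumer."
    "SimMessageConsumerFactory.consume(Unknown Source)\n"
    "\t... 56 more\n"
)

# Same pattern as the query: from "Caused by:" to the end of that line.
caused_line = re.search(r"Caused by:.*", clob_text).group(0)

# Assumed delimiter: skip the class name (no colons in it), capture the rest.
message = re.search(r"Caused by: [^:\n]+: (.*)", clob_text).group(1)

print(caused_line)
print(message)
```

If that delimiter assumption holds for your data, Oracle's regexp_substr accepts a subexpression argument that can return just the captured group in the same way.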

Reference subquery fields in QueryDSL

I have to build a query in QueryDSL with a subquery, like this:
Expression<String> caseExpression = new CaseBuilder()
...;
Expression<?>[] queryProjection={
table.parameter1,
caseExpression
};
Expression<?>[] subqueryProjection={
table.parameter1.as("alias1"),
table.parameter2.as("alias2"),
table.parameter3.as("alias3"),
table.parameter4.as("alias4")
};
SQLSubQuery subQuery = new SQLSubQuery()
.from(table)
.where(...);
JPASQLQuery query = new JPASQLQuery(entityManager, ORACLE_TEMPLATE)
.from(subQuery.list(subqueryProjection));
query.list(queryProjection);
I am getting the following exception:
9/10/2014 09:45:55 AM org.apache.catalina.core.StandardWrapperValve invoke
GRAVE: Servlet.service() para servlet rtve-rest lanzó excepción
java.sql.SQLException: ORA-00904: "TABLE"."PARAMETER1": invalid identifier
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:331)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:288)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:745)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:219)
at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:813)
at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1049)
at oracle.jdbc.driver.T4CPreparedStatement.executeMaybeDescribe(T4CPreparedStatement.java:854)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1154)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3370)
at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3415)
at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:93)
at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.hibernate.engine.jdbc.internal.proxy.AbstractStatementProxyHandler.continueInvocation(AbstractStatementProxyHandler.java:122)
at org.hibernate.engine.jdbc.internal.proxy.AbstractProxyHandler.invoke(AbstractProxyHandler.java:81)
at $Proxy78.executeQuery(Unknown Source)
at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
at org.hibernate.loader.Loader.doQuery(Loader.java:829)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:289)
at org.hibernate.loader.Loader.doList(Loader.java:2438)
at org.hibernate.loader.Loader.doList(Loader.java:2424)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2254)
at org.hibernate.loader.Loader.list(Loader.java:2249)
at org.hibernate.loader.custom.CustomLoader.list(CustomLoader.java:331)
at org.hibernate.internal.SessionImpl.listCustomQuery(SessionImpl.java:1784)
at org.hibernate.internal.AbstractSessionImpl.list(AbstractSessionImpl.java:229)
at org.hibernate.internal.SQLQueryImpl.list(SQLQueryImpl.java:156)
at org.hibernate.ejb.QueryImpl.getResultList(QueryImpl.java:257)
at com.mysema.query.jpa.sql.AbstractJPASQLQuery.list(AbstractJPASQLQuery.java:145)
This is caused because the field in my queryProjection is not the same as in my subqueryProjection. This field has to be the same as in the subquery ("alias1").
How could I reference a field by its alias? Or how could I reference a field in a subquery from outside it?
Thank you very much in advance
You need to use a Path instance with alias1 as the name or get rid of the aliasing. The simplest way to create that Path is
Expressions.path(Object.class, "alias1")
Replace Object with a more specific type if needed.
I managed to do it with Expressions.stringTemplate:
Expression<?>[] queryProjection={
Expressions.stringTemplate("alias1"),
caseExpression
};

Hive throws ArrayIndexOutOfBoundsException when select count(1) on ORC table

I have a simple table with 9 fields, using ORCFile format (I followed the steps mentioned here). When I try to count the number of rows in that table (350 million rows, btw) by submitting:
select count(1) from my_orc_table;
I get an 'ArrayIndexOutOfBoundsException'. Let me copy the stack, just in case it provides more information:
Error: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:304)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:220)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Thanks!!