I am using AWS Athena to run a query that combines values from columns spread across several data sets (e.g., there is a Parquet file for each of clients 1-4). However, the output for "all_clients_total_clicks" is simply empty. The strange thing is that similar code works on another table - just not the one I'm currently working on.
Can someone please confirm whether my syntax is acceptable, or point me in the right direction/documentation for review? SQL below:
SELECT "columnA",
sum("columnX") AS "TotalImpressions",
cast(sum("client1_column_total_clicks") AS double)
+ cast(sum("client2_column_total_clicks") AS double)
+ cast(sum("client3_column_total_clicks") AS double)
+ cast(sum("client4_column_total_clickss") AS double) AS "all_clients_total_clicks"
FROM "db_name"."db_table"
Group by "columnA"
The issue stemmed from trying to add NULL values. Using TRY + COALESCE resolved it for me.
Presto DB Documentation for Conditionals
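For reference, a hedged sketch of how that combination might look applied to the query from the question (the column and table names are taken from the question; wrapping each summed term and falling back to 0 is an assumption about how the answerer applied TRY + COALESCE):

```sql
SELECT "columnA",
    sum("columnX") AS "TotalImpressions",
    -- TRY absorbs cast failures; COALESCE replaces a NULL sum with 0
    coalesce(try(cast(sum("client1_column_total_clicks") AS double)), 0)
    + coalesce(try(cast(sum("client2_column_total_clicks") AS double)), 0)
    + coalesce(try(cast(sum("client3_column_total_clicks") AS double)), 0)
    + coalesce(try(cast(sum("client4_column_total_clicks") AS double)), 0) AS "all_clients_total_clicks"
FROM "db_name"."db_table"
GROUP BY "columnA"
```

Without the COALESCE, a single all-NULL client column makes the whole addition NULL, which would explain the empty output.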
I have some problems displaying my AWS Timestream data in Grafana. I added DevEUI as a global dashboard variable with 3 different specific values. But when I use the multivalue syntax ${DevEUI} in my query with more than one value, I get an error every time.
I hope somebody can give me a hint.
Regards and thanks in advance
You most probably have a list of values as the value of your multivalue Grafana variable, but you are still using the = operator in your query. Try ... AND DevEUI IN ('${DevEUI}'). Or maybe without the single quotes or the parentheses... the exact syntax depends on your Grafana variable.
But this is just an educated guess, since I can see neither your database schema nor the definition of this Grafana variable (both of which are important details in a question like yours, for future reference).
This is how I did it for a multivalued string value:
timestream_variable_name = ANY(VALUES ${grafana_variable_name:singlequote})
You might have to adjust the formatting Grafana applies to the concatenated variable value it generates, depending on your data type.
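Putting that predicate into a full Timestream query, a minimal sketch (the database, table, and measure names here are placeholders, not from the question; DevEUI is the variable name the asker used):

```sql
SELECT time, measure_value::double
FROM "myDatabase"."myTable"
WHERE DevEUI = ANY(VALUES ${DevEUI:singlequote})
ORDER BY time DESC
```

With the `:singlequote` format option, Grafana expands a multivalue selection into a quoted, comma-separated list, which is what ANY(VALUES ...) expects.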
I know this is long after the original question, but @alparius pointed me in the right direction, so I wanted to post the fix for the problem Joe reported.
Use formatting to get the proper quotes/values when formatting your query. Something like this:
SELECT * FROM database WHERE searchiterm IN (${Multi-Value_Variable:sqlstring})
I have tried every combination I can think of in BigQuery, but I keep getting this error. I am trying to follow the standard SQL convention of table_name.column, but this format is not working.
I want to select the "Event_ID" field, but that field exists in two different tables I am using. The values should be the same, so I don't really care which one it pulls from.
I've tried these formats, moving parentheses and periods around:
'table_name.event_id'
table_name.event_id
table_nameevent_id
The table name I am working with is very long, which may complicate things. Here is a stripped down version:
highestlevelfoldername_datasetname.tablename --I have tried highestlevelfoldername_datasetname.tablename.event_id -- and that is not working
I've googled around and also do not see the correct formatting.
You should add an alias to your table and then use it to qualify the field you select.
For example,
select a.event_id
from `project.dataset.table1` a
...
This seems like a very simple question; however, I still cannot get rid of the Data Type Mismatch error. Scenario:
-> Excel file linked in as table [tbl_Mast_CC_List]. I convert the possible Cost Center numbers into values for safety via a query; there are NO text values in the Cost Center and no preceding 000's. Next arrow
-> qry_CC_Clean is CostCenter: Val([tbl_Mast_CC_List.CostCenter])
-> then I create the Unmatched Query, here is the SQL:
SELECT
qry_CC_S1_Clean_F2F_Alloc.DataName
, qry_CC_S1_Clean_F2F_Alloc.Year
, qry_CC_S1_Clean_F2F_Alloc.CostCenter
FROM qry_CC_S1_Clean_F2F_Alloc
LEFT JOIN qry_CC_S1_Clean_Mast_CC_List
ON qry_CC_S1_Clean_F2F_Alloc.CostCenter = qry_CC_S1_Clean_Mast_CC_List.CostCenter
WHERE (((qry_CC_S1_Clean_Mast_CC_List.CostCenter) Is Null))
ORDER BY qry_CC_S1_Clean_F2F_Alloc.CostCenter;
The only time I can get it to work is if I make a table from the query, and I don't really want to do that. Any suggestions would be greatly appreciated, because I have to run this unmatched query against numerous tables to make sure the company is not missing any cost centers rolling through. Thank you!
Your problem is likely due to using the Val() function and then testing its result against Nulls. My understanding is that Val() doesn't return Null; it returns 0 when it can't find anything. You might be better off running the conversion in the opposite direction, i.e. using CStr() on the numeric CostCenter field and comparing that to the text data from Excel.
Alternately you could change the Excel field itself to a number format instead of text.
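A hedged sketch of the opposite-direction comparison in Access SQL (the query and field names are taken from the question; which side holds the numeric field is an assumption based on the answer above):

```sql
SELECT a.DataName, a.Year, a.CostCenter
FROM qry_CC_S1_Clean_F2F_Alloc AS a
LEFT JOIN tbl_Mast_CC_List AS m
  -- convert the numeric side to text instead of Val()-converting the text side
  ON CStr(a.CostCenter) = m.CostCenter
WHERE m.CostCenter Is Null
ORDER BY a.CostCenter;
```

Because CStr() can pass Nulls through as errors only on Null input, this sidesteps the Val()-returns-0 behavior that breaks the Is Null test.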
I'm using BigQuery with a dataset called '87891428' containing daily tables. I try to query a date range using the TABLE_DATE_RANGE function:
SELECT avg(foo)
FROM (
TABLE_DATE_RANGE(87891428.a_abc_,
TIMESTAMP('2014-09-30'),
TIMESTAMP('2014-10-19'))
)
But this leads to a very explicit error message:
Error: Encountered "" at line 3, column 21. Was expecting one of:
I have the feeling that TABLE_DATE_RANGE doesn't like a dataset name starting with a number, because when I copied a few tables into a new dataset called 'test' the query ran properly. Has anyone else encountered this issue, and if so, what is the best workaround (as far as I know you can't rename a dataset)?
The fix for this is to use brackets around the dataset name and table prefix:
SELECT avg(foo)
FROM (
TABLE_DATE_RANGE([87891428.a_abc_],
TIMESTAMP('2014-09-30'),
TIMESTAMP('2014-10-19'))
)
I am using Spark with Scala and trying to get data from a database using JdbcRDD.
val rdd = new JdbcRDD(sparkContext,
driverFactory,
testQuery,
rangeMinValue.get,
rangeMaxValue.get,
partitionCount,
rowMapper)
.persist(StorageLevel.MEMORY_AND_DISK)
Within the query there are no ? placeholders to set (since the query is quite long, I am not putting it here). So I get an error saying:
java.sql.SQLException: Parameter index out of range (1 > number of parameters, which is 0).
I have no idea what the problem is. Can someone suggest any kind of solution?
Got the same problem.
Used this:
SELECT * FROM tbl WHERE ... AND ? = ?
And then called it with lower bound 1, upper bound 1, and 1 partition.
It will always run only one partition.
Your problem is that Spark expects your query string to contain two ? parameters.
From Spark user list:
In order for Spark to split the JDBC query in parallel, it expects an
upper and lower bound for your input data, as well as a number of
partitions so that it can split the query across multiple tasks.
For example, depending on your data distribution, you could set an
upper and lower bound on your timestamp range, and spark should be
able to create new sub-queries to split up the data.
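To illustrate, the query string passed to JdbcRDD needs two ? placeholders that Spark fills with the lower and upper bound of each partition's sub-range; a minimal sketch (the table and column names are placeholders, not from the question):

```sql
SELECT *
FROM events
-- Spark binds each partition's sub-range into the two ? placeholders
WHERE event_id >= ? AND event_id <= ?
```

With rangeMinValue, rangeMaxValue, and partitionCount set, Spark splits [min, max] into partitionCount sub-ranges and runs this query once per sub-range.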
Another option is to load up the whole table using the HadoopInputFormat
class of your database as a NewHadoopRDD.