SQL Aliasing with Statistical Functions not working

I am using Google Standard SQL with BigQuery. I have the following code to get a variance and standard deviation from a table, but the aliasing is not working: the column names come back as f0 and f1 rather than Variance and StdDev.
#StandardSQL
SELECT VARIANCE(Results) AS Variance,
STDDEV(Results) AS StdDev
FROM `totals`
Screenshot of output

You're seeing this apparent problem because the query is overwriting a table that previously had those column names.
If you do a
SELECT * FROM `bikes-data.bikes_data.var_and_stddev`
you'll find out that the table has the correct column names.
Somewhere in the display code the previous column names were cached, but your query is working as expected. You can also work around the problem by refreshing your browser tab.
I filed this as a bug: https://issuetracker.google.com/issues/128651254.

Related

How can I use data from more than one measurement in a single Grafana panel?

I am attempting to create a gauge panel in Grafana (version 6.6.2; upgrading is a last resort, but possible if necessary) that shows the percentage of total available memory used by the Java Virtual Machine running a process of mine. The problem I am running into is the following:
I have used Spring Boot Actuator's metrics and imported them into an Influx database with Micrometer, but in the process the two values I would like to use in my calculation were stored in two different measurements: jvm_memory_used and jvm_memory_max.
My initial idea was to simply run a SELECT on both measurements to get the values I want, then divide "used" by "max" and multiply by 100 to get the percentage to display. Unfortunately I run into syntax errors when I try to do this manually, and I am unsure whether I can do it using Grafana's query builder.
I know that the syntax is incorrect, but I am not familiar enough with InfluxQL to know how to structure this query properly. Here is what I tried:
(SELECT last("value")
FROM "jvm_memory_used"
WHERE ("area" = 'heap')
AND $timeFilter
GROUP BY time($__interval) fill(null)
) /
(SELECT last("value")
FROM "jvm_memory_max"
WHERE ("area" = 'heap')
AND $timeFilter
GROUP BY time($__interval) fill(null)
)
(The AND and GROUP BY clauses are there because they are the defaults from Grafana's query builder; I am not sure whether they are necessary.)
I'm assuming that my parentheses and the division are illegal, but I am not sure how to resolve it.
How can I divide these two values from separate tables?
EDIT: I have gotten slightly further but it seems that there is a new issue. I now have the following query that I am sending in:
SELECT 100 * (last("used") / sum("max")) AS "percentUsed"
FROM(
SELECT last("value") AS "used"
FROM "jvm_memory_used"
WHERE ("area" = 'heap')
AND $timeFilter
),(
SELECT last("value") AS "max"
FROM "jvm_memory_max"
WHERE ("area" = 'heap')
AND $timeFilter
)
GROUP BY time($__interval) fill(null)
and the result I get is this:
How can I now get this query to return only one gauge with data, instead of two with nulls?
I've accepted an answer that works for versions of Grafana after 7. If there are any other answers that arise that do not involve updating the version of Grafana, please provide them as well!
I am not particularly experienced with Influx, but since your question is how to combine two measurements (query results) in one Grafana panel, I can tell you about one approach:
You can use a transformation. That way you can keep two separate queries. With the "Binary operation" transformation mode you can simply divide one of your values by the other.
In your specific case, to display the result as a percentage, you can then pick "Percent (0.0-1.0)" as the unit, and that should accomplish your goal.

What format do you use in Google BigQuery to specify a table.column when getting a "column name is ambiguous" error?

I have tried every combination I can think of in BigQuery, but I keep getting this error. I am trying to follow the standard SQL convention of writing table_name.column, but that format is not working.
I want to select the "Event_ID" field, but that field exists in two of the tables I am using. The values should be the same, so I don't really care which one it pulls from.
I've tried these formats, moving parentheses and periods around:
'table_name.event_id'
table_name.event_id
table_nameevent_id
The table name I am working with is very long, which may complicate things. Here is a stripped down version:
highestlevelfoldername_datasetname.tablename --I have tried highestlevelfoldername_datasetname.tablename.event_id -- and that is not working
I've googled around and also do not see the correct formatting.
You should add an alias to your table and then use it to qualify the field you select.
For example,
select a.event_id
from `project.dataset.table1` a
...
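To make the alias idea concrete, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only a stand-in for BigQuery here, and the table names (table1, table2), the extra columns, and the sample rows are all made up for illustration. The point is just that an unqualified event_id is rejected as ambiguous, while a.event_id is not.

```python
import sqlite3

# Hypothetical stand-in tables; in BigQuery these would be `project.dataset.table1` etc.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (event_id INTEGER, name TEXT);
    CREATE TABLE table2 (event_id INTEGER, venue TEXT);
    INSERT INTO table1 VALUES (1, 'launch'), (2, 'review');
    INSERT INTO table2 VALUES (1, 'room A'), (2, 'room B');
""")

# Without a qualifier the engine cannot tell which event_id is meant.
try:
    conn.execute(
        "SELECT event_id FROM table1 JOIN table2 "
        "ON table1.event_id = table2.event_id"
    )
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# With short aliases, every column reference names its table explicitly.
rows = conn.execute("""
    SELECT a.event_id, a.name, b.venue
    FROM table1 AS a
    JOIN table2 AS b ON a.event_id = b.event_id
    ORDER BY a.event_id
""").fetchall()

print(ambiguous_error)
print(rows)
```

A short alias also sidesteps the long-table-name problem from the question, since the full name only has to be written once in the FROM clause.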

How to query string inside a record in Google BigQuery? Docs not working

I want to query a subset of a record in the bitcoin blockchain using the Google BigQuery database. I go here and click "View dataset": https://console.cloud.google.com/marketplace/details/bigquery-public-data/bitcoin-blockchain. Then, in the left sidebar, it seems you have to open the dropdown at 'bigquery-public-data', then click 'bitcoin_blockchain', then 'transactions'. Then on the right you have to click the 'Query Table' button. This is the only way I have found to select the table; just copying and pasting the command below won't recreate the error.
Based on the table that appears after following the steps above, I noticed that outputs is a record type. I would like to view only one string from inside the record. The string is called output_pubkey_base58.
So I read the docs, and the docs imply the command would be:
SELECT outputs.output_pubkey_base58 FROM `bigquery-public-data.bitcoin_blockchain.transactions` LIMIT 1000;
I get an error: Cannot access value on Array<Struct<output_satoshis ... I also tried outputs[0].output_pubkey_base58, which didn't work either.
The annoying thing is that this problem has the same form as the first example in the docs, which queries the citiesLived.place field from inside the citiesLived record with the same kind of command: https://cloud.google.com/bigquery/docs/legacy-nested-repeated
You need to unnest the array into a new variable.
SELECT o.output_pubkey_base58
FROM
`bigquery-public-data.bitcoin_blockchain.transactions`,
UNNEST (outputs) as o
LIMIT
1000
I feel the confusion here is between legacy SQL and standard SQL. In standard SQL, UNNEST must be used, as described in the documentation: https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql#differences_in_repeated_field_handling
Selecting nested repeated leaf fields
Using legacy SQL, you can "dot" into a nested repeated field without needing to consider where the repetition occurs. In standard SQL, attempting to "dot" into a nested repeated field results in an error.
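If it helps to picture what the comma join with UNNEST does, here is a rough analogy in plain Python. It is not BigQuery code: the sample rows, the transaction_id key, and the unnest_outputs helper are invented for illustration. Each row holds an array of structs, and unnesting produces one result row per array element, which is why the field can only be read after flattening.

```python
# Made-up sample rows shaped like the table: each row has a repeated "outputs" field.
transactions = [
    {"transaction_id": "t1",
     "outputs": [{"output_pubkey_base58": "addrA"},
                 {"output_pubkey_base58": "addrB"}]},
    {"transaction_id": "t2",
     "outputs": [{"output_pubkey_base58": "addrC"}]},
    {"transaction_id": "t3", "outputs": []},
]


def unnest_outputs(rows):
    """Yield one struct per (row, array element) pair,
    roughly what FROM t, UNNEST(outputs) AS o produces."""
    for row in rows:
        for o in row["outputs"]:
            yield o


# row["outputs"] is a list, so there is no single output_pubkey_base58 on it;
# after flattening, each o is one struct and the field access is well defined.
pubkeys = [o["output_pubkey_base58"] for o in unnest_outputs(transactions)]
print(pubkeys)
```

Note that the row with an empty array contributes nothing, which matches how the comma (cross) join with UNNEST behaves.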

Select IIF SUM command

I am using Jet SQL from Excel over an ADODB connection to an IBM400 server to try to get some data. I have done this before, and it works fine with all other Jet SQL commands, but I have run into a problem that I am unable to solve. It is quite simple, so I imagine I am just not using the correct syntax; what I am trying to do is get some totals.
I have a table that contains part numbers and the quantity at each location of that part (more than one location per part). My goal is to have an SQL command return the total quantity (summed over all locations) per part. I am able to do this successfully one part at a time using the following (for simplicity I will use part numbers 12345678 and 01234567):
SELECT SUM(CPJDDTA81.F4101JD.LIPQOH) FROM CPJDDTA81.F4101JD WHERE CPJDDTA81.F4101JD.IMLITM = '12345678'
CPJDDTA81.F4101JD is my table, IMLITM is the column name of part numbers, LIPQOH is the quantity on hand per location.
The single search produces the sum I want however the problem comes when trying to run more than one sum within one sql command. I have tried using a select iif command like the following:
SELECT IIF(CPJDDTA81.F4101JD.IMLITM = '12345678',SUM(CPJDDTA81.F4101JD.LIPQOH),IIF(CPJDDTA81.F4101JD.IMLITM = '01234567',SUM(CPJDDTA81.F4101JD.LIPQOH),0) FROM CPJDDTA81.F4101JD
This command provides an error saying that "=" is not a valid token (the = sign within the IIF statement). I was hoping that someone out there can help me write a correct statement to accomplish this. My actual part list will be much larger so I will be using VBA to construct the SQL statement but I need to learn how to do two parts first. Thanks ahead of time.
SELECT CPJDDTA81.F4101JD.IMLITM, SUM(CPJDDTA81.F4101JD.LIPQOH) AS TotalQuantity
FROM CPJDDTA81.F4101JD
GROUP BY CPJDDTA81.F4101JD.IMLITM
Does the above help?
Additionally, the items can be limited by adding a WHERE clause.
SELECT CPJDDTA81.F4101JD.IMLITM, SUM(CPJDDTA81.F4101JD.LIPQOH) AS TotalQuantity
FROM CPJDDTA81.F4101JD
WHERE CPJDDTA81.F4101JD.IMLITM IN ('12345678', '01234567')
GROUP BY CPJDDTA81.F4101JD.IMLITM
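Since the question mentions building the statement in code for a longer part list, here is a minimal sketch of that idea in Python against an in-memory SQLite table. It is an illustration only: SQLite stands in for the real server, the CPJDDTA81 schema prefix is dropped, the quantities are invented, and totals_for_parts is a hypothetical helper name. The same pattern (one placeholder per part number in the IN list, then GROUP BY) should carry over to VBA.

```python
import sqlite3

# Stand-in for CPJDDTA81.F4101JD with made-up quantities per location.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE F4101JD (IMLITM TEXT, LIPQOH INTEGER)")
conn.executemany(
    "INSERT INTO F4101JD VALUES (?, ?)",
    [("12345678", 5), ("12345678", 7),
     ("01234567", 3), ("01234567", 4),
     ("99999999", 100)],
)


def totals_for_parts(connection, part_numbers):
    """Return {part number: total quantity} for the requested parts."""
    # One "?" per part keeps the values out of the SQL text itself.
    placeholders = ", ".join("?" for _ in part_numbers)
    sql = (
        "SELECT IMLITM, SUM(LIPQOH) AS TotalQuantity "
        "FROM F4101JD "
        f"WHERE IMLITM IN ({placeholders}) "
        "GROUP BY IMLITM"
    )
    return dict(connection.execute(sql, list(part_numbers)).fetchall())


totals = totals_for_parts(conn, ["12345678", "01234567"])
print(totals)
```

Part numbers are kept as text so that leading zeros such as in 01234567 survive.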

How best to sum multiple boolean values via SQL?

I have a table that contains, among other things, about 30 columns of boolean flags that denote particular attributes. I'd like to return them, sorted by frequency, as a recordset along with their column names, like so:
Attribute Count
attrib9 43
attrib13 27
attrib19 21
etc.
My efforts thus far can achieve something similar, but I can only get the attributes in columns using conditional SUMs, like this:
SELECT SUM(IIF(a.attribIndex=-1,1,0)), SUM(IIF(a.attribWorkflow =-1,1,0))...
Plus, the query is already getting a bit unwieldy with all 30 SUM/IIFs and won't handle any changes in the number of attributes without manual intervention.
The first six characters of the attribute columns are the same (attrib) and unique in the table, is it possible to use wildcards in column names to pick up all the applicable columns?
Also, can I pivot the results to give me a sorted two-column recordset?
I'm using Access 2003 and the query will eventually be via ADODB from Excel.
This depends on whether or not you have the attribute names anywhere in the data. If you do, then birdlips' answer will do the trick. However, if the names exist only as column names, you've got a bit more work to do, and I'm afraid you can't do it with simple SQL.
No, you can't use wildcards in column names in SQL. You'll need procedural code to do this (i.e., a VB module in Access; you could do it within a stored procedure if you were on SQL Server). Use that code to build the SQL.
It won't be pretty. I think you'll need to do it one attribute at a time: select a string whose value is that attribute name together with the count-where-True, then either A) run that and store the result in a new row in a scratch table, or B) append all those selects together with "Union" between them before running the batch.
My Access VB is more than a bit rusty, so I don't trust myself to give you anything like executable code....
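For what it's worth, option B can be sketched in Python against an in-memory SQLite table. This is not Access VB, so treat it as an outline of the logic only: the items table, its sample rows, the AttrCount alias, and the attribute_count_sql helper are all made up, and PRAGMA table_info is SQLite's way of listing columns (in Access the column list would have to come from some other schema lookup). True is stored as -1, as in the question.

```python
import sqlite3

# Hypothetical table with three of the boolean flag columns; -1 means True.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER, attribIndex INTEGER, "
    "attribWorkflow INTEGER, attribArchive INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [(1, -1, 0, -1), (2, -1, 0, 0), (3, -1, -1, 0), (4, 0, 0, -1)],
)


def attribute_count_sql(connection, table, prefix="attrib"):
    """Build one SELECT per matching column and glue them with UNION ALL."""
    # Column names come from the schema, so new attributes are picked up automatically.
    columns = [
        row[1]
        for row in connection.execute(f"PRAGMA table_info({table})")
        if row[1].startswith(prefix)
    ]
    selects = [
        f"SELECT '{col}' AS Attribute, "
        f"SUM(CASE WHEN {col} = -1 THEN 1 ELSE 0 END) AS AttrCount "
        f"FROM {table}"
        for col in columns
    ]
    return " UNION ALL ".join(selects) + " ORDER BY AttrCount DESC"


counts = conn.execute(attribute_count_sql(conn, "items")).fetchall()
print(counts)
```

The result is already the sorted two-column recordset, so no pivot step is needed afterwards.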
Just a simple count and group by should do it
Select attribute_name
, count(*)
from attribute_table
group by attribute_name
To answer your comment use Analytic Functions for that:
Select attribute_table.*
, count(*) over(partition by attribute_name) cnt
from attribute_table
In Access, Cross Tab queries (the traditional tool for transposing datasets) need at least 3 numeric/date fields to work. However since the output is to Excel, have you considered just outputting the data to a hidden sheet then using a pivot table?