BigQuery - Increment Values aside from the Windowed Functions - google-bigquery

Is there another way to assign incremental values to a column without using the window functions ROW_NUMBER() and RANK() in Google BigQuery?
I'm actually encountering resource problems when using these functions.
Thanks for the answer.

Related

Recreating TD functions in normal SQL

I've been using rank(), row_number() and dense_rank() in Teradata SQL for quite some time and have had to transition to an older version of SQL without these functions.
Is there a way to recreate these functions easily? I'm currently using proc sql in SAS-EG. I'm aware that SAS can use first. and last. processing, but there must be a way to do it solely in SQL?
I'm aware of the monotonic() function but have yet to be able to reset it where I want my partition to end, or to create a dense_rank with it.
Any help would be greatly appreciated.
RANK, ROW_NUMBER and DENSE_RANK are all window functions in MySQL 8.0. Just update to this version and they will be there.
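If upgrading is not an option, RANK and DENSE_RANK can usually be emulated with correlated subqueries in plain SQL. A rough sketch, assuming a hypothetical table scores(id, grp, val) where grp plays the role of the partition:
-- a rough sketch; scores(id, grp, val) is a hypothetical table
SELECT s.id, s.grp, s.val,
       -- RANK: 1 + number of rows in the same partition with a larger value
       (SELECT COUNT(*) + 1
          FROM scores r
         WHERE r.grp = s.grp AND r.val > s.val) AS rnk,
       -- DENSE_RANK: 1 + number of distinct larger values in the same partition
       (SELECT COUNT(DISTINCT r.val) + 1
          FROM scores r
         WHERE r.grp = s.grp AND r.val > s.val) AS dense_rnk
  FROM scores s
 ORDER BY s.grp, rnk;
ROW_NUMBER additionally needs a deterministic tie-breaker (for example, also comparing a unique id in the subquery) so that equal values still get distinct numbers.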

How to write Window functions using Druid?

For example, I wanted to write window functions like SUM(...) OVER (window).
Since the OVER clause is not supported by Druid, how do I achieve the same using the Druid native query API or SQL API?
You should use a GroupBy query. As Druid is a time series database, you have to specify the interval (window) you want to query data from. You can use aggregation methods over this data, for example a SUM() aggregation.
If you want, you can also do extra filtering within your aggregation (like "only sum records where city = paris"). You could also apply the SUM aggregation only to records that fall within a certain time window inside your selected interval.
If you are a PHP user then maybe this package is handy for you: https://github.com/level23/druid-client#sum
We have tried to implement an easy way to query such data.
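For the SQL API, the same idea (an interval filter plus a filtered aggregation) can be sketched roughly like this; the datasource visits and the columns city and amount are made-up placeholders:
-- a rough sketch in Druid SQL; visits, city and amount are hypothetical names
SELECT
  TIME_FLOOR(__time, 'PT1H') AS hour_bucket,
  SUM(amount) AS total_amount,
  -- "only sum records where city = paris"
  SUM(CASE WHEN city = 'paris' THEN amount ELSE 0 END) AS paris_amount
FROM visits
WHERE __time >= TIMESTAMP '2000-01-01 00:00:00'
  AND __time <  TIMESTAMP '2000-01-02 00:00:00'
GROUP BY 1
The WHERE clause on __time plays the role of the interval, and the CASE expression does the per-aggregation filtering described above.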

List of aggregation functions in Spark SQL

I'm looking for a list of pre-defined aggregation functions in Spark SQL. I have in mind something analogous to Presto Aggregate Functions.
I Ctrl+F'd around a little in the SQL API docs to no avail... it's also hard to tell at a glance which functions are aggregations and which are not. For example, if I didn't know avg was an aggregate function, I'd be hard pressed to tell that it is one (in a way that actually scales to the full set of functions):
avg - avg(expr) - Returns the mean calculated from values of a group.
If such a list doesn't exist, can someone at least confirm to me that there's no pre-defined function like any/bool_or or all/bool_and to determine if any or all of a boolean column in a group are true (or false)?
For now, my workaround is
select grp_col, count(if(bool_col, true, NULL)) > 0 any_agg
Just take a look at the Spark docs, in the Aggregate functions section.
The list of functions is under RelationalGroupedDataset - specifically, the APIs that return a DataFrame (not another RelationalGroupedDataset):
https://spark.apache.org/docs/latest/api/scala/index.html?org/apache/spark/sql/RelationalGroupedDataset.html#org.apache.spark.sql.RelationalGroupedDataset
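As for any/bool_or and all/bool_and specifically: newer Spark releases do ship bool_and/bool_or aggregates, but on versions without them the conditional-count workaround from the question generalizes to both cases. A sketch in Spark SQL, assuming a hypothetical table t with columns grp_col and bool_col:
-- a sketch; t, grp_col and bool_col are hypothetical names
SELECT
  grp_col,
  -- "any": at least one row in the group where bool_col is true
  COUNT(IF(bool_col, TRUE, NULL)) > 0        AS any_agg,
  -- "all": every row in the group is true (NULLs count as not true here)
  COUNT(IF(bool_col, TRUE, NULL)) = COUNT(*) AS all_agg
FROM t
GROUP BY grp_col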

Search the position of a column in SQLite

I have been searching and I can't find how to get the position of an item in a column in SQLite.
When I use ROW_NUMBER() I get:
ERROR: near "(": syntax error (code 1)
SELECT Nom ROW_NUMBER () OVER (ORDER BY Score) AS Rownumber
FROM tableau
I'm using MIT App Inventor with the Taifun SQLite extension.
Another question: how can I know which item is in position 2 (or another number) in the column?
This is too long for a comment.
ROW_NUMBER() is an ANSI-standard function. However, not all databases support it. For instance, older versions of SQLite, MySQL, and MS Access (among others) do not.
Presumably, you are using one of these databases that does not support this function.
I would suggest that you research what database you are using. Then, try investigating how to implement the functionality you want using that database. If you can't figure out how, ask another question, providing sample data, desired results, and a tag for the database you are using.
If this is indeed the code you are running:
SELECT Nom ROW_NUMBER () OVER (ORDER BY Score) AS Rownumber FROM tableau
Then what is Nom? This is a syntax error in practically every implementation. What you are probably looking to do is:
SELECT Nom, ROW_NUMBER () OVER (ORDER BY Score) AS Rownumber FROM tableau
Notice the comma after Nom.
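If the SQLite build behind the extension predates window function support, the position can still be computed with a correlated subquery, and a given position can be read back with LIMIT/OFFSET. A sketch against the tableau table from the question, assuming Score values are unique:
-- position of each row without window functions (assumes Score values are unique)
SELECT Nom,
       (SELECT COUNT(*) FROM tableau t2 WHERE t2.Score <= t1.Score) AS Rownumber
  FROM tableau t1
 ORDER BY Score;

-- the item at position 2 when ordered by Score
SELECT Nom FROM tableau ORDER BY Score LIMIT 1 OFFSET 1;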

Window functions and allow large results

The window function documentation states that window functions cannot be used to generate large query results:
https://developers.google.com/bigquery/query-reference#windowfunctions
This statement is repeated in the documentation for large query results:
https://developers.google.com/bigquery/querying-data#largequeryresults
I've created a query that uses window functions and produces a lot of results. The query can be found below for interest; it runs over the standard Google Analytics data extract into BigQuery.
When I run this query it returns a "Response too large to return" message. Specifying "Allow Large Results" seems to correct the problem. So I'm using both window functions and large results for this query.
This seems to be at odds with the statement that window functions can't be used to generate large query results. Can someone help me understand what this statement means?
SELECT
  CONCAT(fullVisitorId, STRING(visitId)) AS fullVisitID,
  hits.hitNumber AS Sequence,
  hits.page.pagePath AS PagePath,
  LAG(PagePath, 1) OVER
    (PARTITION BY fullVisitID ORDER BY Sequence ASC) AS PrePage,
  LEAD(PagePath, 1) OVER
    (PARTITION BY fullVisitID ORDER BY Sequence ASC) AS PostPage
FROM [<<TABLE NAME>>]
WHERE hits.type = 'PAGE'
This is a case of the product improving faster than its documentation.
Initially, window functions were not parallelizable and hence not compatible with "allow large results" (which works by parallelizing the output). However, BigQuery is now capable of parallelizing window function queries when they use the PARTITION keyword, which is why this query now works.
Note that each partition can't be too big for this to work.
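The same PARTITION BY idea is also the usual way around the resource errors mentioned in the original question: numbering rows within reasonably sized partitions, instead of over the whole table, lets the work be parallelized. A sketch in BigQuery, with my_table, user_id and event_time as placeholder names:
-- a sketch; my_table, user_id and event_time are hypothetical names
SELECT
  user_id,
  event_time,
  -- numbering restarts per user_id, so no single partition has to cover the whole table
  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time) AS seq
FROM my_table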