"contains" in Bigquery standard SQL

I want to migrate from legacy SQL to standard SQL.
I had the following query in legacy SQL:
SELECT
hits.page.pageTitle
FROM [mytable]
WHERE hits.page.pageTitle contains '%'
And I tried this in Standard SQL:
SELECT
hits.page.pageTitle
FROM `mytable`
WHERE STRPOS(hits.page.pageTitle, "%")
But it gives me this error:
Error: Cannot access field page on a value with type ARRAY<STRUCT<...>> at [4:21]

Try this one:
SELECT
hits.page.pageTitle
FROM `table`,
UNNEST(hits) hits
WHERE REGEXP_CONTAINS(hits.page.pageTitle, r'%')
LIMIT 1000
In the ga_sessions schema, "hits" is an ARRAY (that is, a field in REPEATED mode), so hits.page cannot be accessed directly. You need to UNNEST the array first; each element then becomes its own row whose fields you can reference.
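As a rough mental model only (plain Python, not BigQuery code), the comma join with UNNEST pairs each session row with every element of its own hits array, after which each element's fields can be read directly. The sample rows and the `unnest_hits` helper are made up for illustration.

```python
# Hypothetical rows shaped like ga_sessions: "hits" is a list (ARRAY) of dicts (STRUCTs).
sessions = [
    {"visitId": 1, "hits": [{"page": {"pageTitle": "100% cotton"}},
                            {"page": {"pageTitle": "Home"}}]},
    {"visitId": 2, "hits": [{"page": {"pageTitle": "Sale: 50% off"}}]},
]


def unnest_hits(rows):
    """Yield one (row, hit) pair per array element, like FROM t, UNNEST(hits) hits."""
    for row in rows:
        for hit in row["hits"]:
            yield row, hit


# Similar in spirit to: WHERE REGEXP_CONTAINS(hits.page.pageTitle, r'%')
titles = [hit["page"]["pageTitle"]
          for _, hit in unnest_hits(sessions)
          if "%" in hit["page"]["pageTitle"]]
print(titles)
```

Trying to read `row["hits"]["page"]` without the inner loop would fail for the same reason the original query did: the field lives on the elements, not on the array.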

Related

Using Regex to determine what kind of SQL statement a row is from a list?

I have a large list of SQL commands such as
SELECT * FROM TEST_TABLE
INSERT .....
UPDATE .....
SELECT * FROM ....
etc. My goal is to parse this list into a set of results so that I can easily determine a good count of how many of these statements are SELECT statements, how many are UPDATES, etc.
so I would be looking at a result set such as
SELECT 2
INSERT 1
UPDATE 1
...
I figured I could do this with regex, but I'm a bit lost beyond simply looking at every string and comparing against 'SELECT' as a prefix, which can run into multiple issues. Is there any other way to do this using regex?

You can add the SQL statements to a table and run them through a SQL query. If the SQL text is in a column called SQL_TEXT, you can get the SQL command type using this:
upper(regexp_substr(trim(regexp_replace(SQL_TEXT, '\\s', ' ')),
'^([\\w\\-]+)')) as COMMAND_TYPE

You'll need to do some cleanup to create a column that indicates the type of statement you have. The rest is just basic aggregation:
with cte as
(select *, trim(lower(split_part(regexp_replace(col, '\\s', ' '),' ',1))) as statement
from t)
select statement, count(*) as freq
from cte
group by statement;
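The same cleanup and count can be sketched outside the database. This is a minimal Python version, assuming the statements are already available as a list of strings; `statement_type` is a hypothetical helper name. Like the queries above, it only looks at the first word, so it reports WITH for statements that begin with a CTE and UNKNOWN for ones that begin with a comment.

```python
import re
from collections import Counter

# First run of word characters after optional leading whitespace.
FIRST_WORD = re.compile(r"^\s*(\w+)")


def statement_type(sql_text):
    """Return the upper-cased leading keyword, or 'UNKNOWN' if none is found."""
    match = FIRST_WORD.match(sql_text)
    return match.group(1).upper() if match else "UNKNOWN"


# Made-up sample input.
statements = [
    "SELECT * FROM TEST_TABLE",
    "insert into t values (1)",
    "  UPDATE t SET x = 1",
    "SELECT * FROM other_table",
]

counts = Counter(statement_type(s) for s in statements)
for name, freq in counts.most_common():
    print(name, freq)
```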

SQL is a language and needs a parser to turn it from text into a structure. Regular expressions can only do part of the work (such as lexing).
Regular Expression Vs. String Parsing
You will have to limit your ambition if you want to restrict yourself to regular expressions.
Still, you can get some distance if you want to. A quick search found this example of tokenizing MySQL statements using regex: https://swanhart.livejournal.com/130191.html

How to use SELECT in an append query in Access SQL?

In MS Access, I created a query via the Create menu -> Query Design (named Query3).
I want to use it in an SQL command in another query, but when I run it I get this error:
Syntax error on query expression 'select f1'
SQL command
INSERT INTO boors (boors.Nemad, boors.Volumn, boors.Price,
boors.LastPrice, boors.LastPerc, boors.LastPr,
boors.LastPer, boors.MinPrice, boors.MaxPrice,
boors.distance, boors.inout, boors.Power)
values (select f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12 FROM Query3)

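It appears that you are mixing the SQL for inserting literal values (INSERT ... VALUES) with the SQL for inserting from a table or query (INSERT ... SELECT). As you are doing the latter, drop the VALUES keyword and the parentheses around the SELECT; your SQL should look like this (shown with the first two columns only):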
INSERT INTO boors (Nemad, Volumn)
SELECT F1, F2
FROM Query3
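To see the INSERT ... SELECT form run end to end, here is a small sketch using Python's built-in sqlite3 module as a stand-in for Access (an assumption: the statement shape is the same, but Access itself is not involved). A table named source_rows plays the role of Query3, its contents are made up, and only two of the twelve columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boors (Nemad TEXT, Volumn INTEGER);
    -- Stand-in for the saved query Query3 (hypothetical data).
    CREATE TABLE source_rows (F1 TEXT, F2 INTEGER);
    INSERT INTO source_rows VALUES ('AAA', 100), ('BBB', 250);
""")

# No VALUES keyword: the SELECT supplies the rows to insert.
conn.execute("INSERT INTO boors (Nemad, Volumn) SELECT F1, F2 FROM source_rows")

rows = conn.execute("SELECT Nemad, Volumn FROM boors ORDER BY Nemad").fetchall()
print(rows)
```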

BigQuery: Querying repeated fields

I'm trying to use the following query to get the rows for which the event names are equal to: EventGamePlayed, EventGetUserBasicInfos or EventGetUserCompleteInfos
select *
from [com_test_testapp_ANDROID.app_events_20170426]
where event_dim.name in ("EventGamePlayed", "EventGetUserBasicInfos", "EventGetUserCompleteInfos");
I'm getting the following error: Cannot query the cross product of repeated fields event_dim.name and user_dim.user_properties.value.index.
Is it possible to make it work without flattening the result?
Also, I'm not sure why the error is talking about the "user_dim.user_properties.value.index" field.

The error is due to the SELECT *, which includes all columns. Rather than using legacy SQL, try this using standard SQL, which doesn't have this problem with repeated field cross products:
#standardSQL
SELECT *
FROM com_test_testapp_ANDROID.app_events_20170426
CROSS JOIN UNNEST(event_dim) AS event_dim
WHERE event_dim.name IN ("EventGamePlayed", "EventGetUserBasicInfos", "EventGetUserCompleteInfos");
You can read more about working with repeated fields/arrays in the Working with Arrays topic. If you are used to using legacy SQL, you can read about differences between legacy and standard SQL in BigQuery in the migration guide.

Sub-Queries in Sybase SQL

We have an application which indexes data using user-written SQL statements. We place those statements within parentheses so we can limit the query to certain criteria. For example:
select * from (select F_Name from table_1)q where ID > 25
However, we have discovered that this format does not work with a Sybase database, which reports a syntax error around the parentheses. I've tried playing around on a test instance but haven't been able to find a way to achieve this result. I'm not directly involved in the development and my SQL knowledge is limited. I'm assuming the 'q' gives the subquery result an alias for the application to use.
Does Sybase have a specific syntax? If so, how could this query be adapted for it?
Thanks in advance.

Sybase ASE is case-sensitive with respect to all identifiers, and the query should work once the case matches.
As @HannoBinder noted: select id from ... is not the same as select ID from ..., so check the case of every identifier.
Also make sure that the column ID is returned by the inner query (aliased Q) so that it can be used in the outer WHERE clause.
If the table and column names are in upper case, the following query should work:
select * from (select F_NAME, ID from TABLE_1) Q where ID > 25
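The point about the ID column can be checked with any engine that supports derived tables. The sketch below uses Python's sqlite3 module rather than Sybase (an assumption made so the example is runnable). SQLite is not case-sensitive for identifiers, so this only illustrates the missing-column issue, not the case issue. The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (ID INTEGER, F_Name TEXT);
    INSERT INTO table_1 VALUES (10, 'Ann'), (30, 'Bob'), (40, 'Cy');
""")

# The derived table q exposes only F_Name, so the outer WHERE cannot see ID.
try:
    conn.execute("select * from (select F_Name from table_1) q where ID > 25")
    outer_filter_failed = False
except sqlite3.OperationalError as exc:
    outer_filter_failed = True
    print("error:", exc)

# Returning ID from the inner query makes it available to the outer filter.
rows = conn.execute(
    "select * from (select F_Name, ID from table_1) q where ID > 25 order by ID"
).fetchall()
print(rows)
```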

SQL with LIMIT1 returns all records

I made a mistake and entered:
SELECT * FROM table LIMIT1
instead of
SELECT * FROM table LIMIT 1 (note the space between LIMIT and 1)
in the CLI of MySQL. I expected to receive some kind of parse error, but I was surprised, because the query returned all of the records in the table. My first thought was "stupid MySQL, I bet that this will return error in PostgreSQL", but PostgreSQL also returned all records. Then tested it with SQLite - with the same result.
After some digging, I realized that it doesn't matter what I enter after the table. As long as there are no WHERE/ORDER/GROUP clauses:
SELECT * FROM table SOMETHING -- works and returns all records in table
SELECT * FROM table WHERE true SOMETHING -- doesn't work - returns parse error
I guess that this is standardized behavior, but I couldn't find any explanation of why. Any ideas?

Your first query is equivalent to this query using a table alias:
SELECT * FROM yourtable AS LIMIT1
The AS keyword is optional. The table alias allows you to refer to columns of that table using the alias LIMIT1.foo rather than the original table name. It can be useful to use aliases if you wish to give tables a shorter or a more descriptive alias within a query. It is necessary to use aliases if you join a table to itself.
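The alias interpretation is easy to confirm in SQLite, one of the engines mentioned in the question. This is a minimal sketch using Python's sqlite3 module; the items table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, name TEXT);
    INSERT INTO items VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

# LIMIT1 is parsed as a table alias, so every row comes back.
all_rows = conn.execute("SELECT * FROM items LIMIT1").fetchall()

# The alias can qualify column names, which confirms how it was parsed.
via_alias = conn.execute(
    "SELECT LIMIT1.id FROM items LIMIT1 ORDER BY LIMIT1.id"
).fetchall()

# With the space, LIMIT is the keyword and only one row is returned.
one_row = conn.execute("SELECT * FROM items LIMIT 1").fetchall()

print(len(all_rows), via_alias, len(one_row))
```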
From the SQLite documentation: (syntax diagram not preserved)

This is why I want the DB engine to enforce the AS keyword for alias names:
http://beyondrelational.com/modules/2/blogs/70/posts/10814/should-alias-names-be-preceded-by-as.aspx

SELECT * FROM table LIMIT1;
LIMIT1 is taken as a table alias here, because LIMIT1 is not a reserved keyword in SQL.
Anything that follows the table name and is not a reserved keyword is treated as a table alias.
SELECT * FROM table LIMIT 1;
When LIMIT appears right after the table name, the parser recognizes it as a reserved keyword and applies the limit. If you want to use a reserved keyword as an identifier, you can quote it (MySQL uses backticks), like:
SELECT * FROM table `LIMIT`;
OR
SELECT * FROM table `LIMIT 1`;
Everything inside the backticks is treated as a user-defined identifier.
A common mistake is using keywords such as date, timestamp, or limit as column names.
