Group By Using Wildcards in Big Query - google-bigquery

I have this query:
SELECT SomeTableA.*
FROM SomeTableB
LEFT JOIN SomeTableA USING (XYZ)
GROUP BY SomeTableA.*
I know that I cannot use a wildcard in the GROUP BY clause. At the same time, I don't really like listing all the columns (there can be up to 20) manually.
Could this be added as a new feature? Or is there an easy way to get the list of all 20 columns from SomeTableA for the GROUP BY part?

If your query is exactly as shown in the question, try the query below instead; no grouping is required:
#standardSQL
SELECT DISTINCT *
FROM `project.dataset.tableA`
WHERE xyz IN (SELECT xyz FROM `project.dataset.tableB`)
As for GROUP BY with wildcards in BigQuery: this sounds more like grouping by a struct, which is not supported, so you can submit a feature request if you want - https://issuetracker.google.com/issues/new?component=187149&template=0
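As a rough, runnable illustration of the DISTINCT rewrite above, the sketch below uses Python's built-in sqlite3 module rather than BigQuery. The table contents and the `val` column are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (xyz INTEGER, val TEXT);
    CREATE TABLE tableB (xyz INTEGER);
    -- tableA holds an exact duplicate row and one row with no match in tableB
    INSERT INTO tableA VALUES (1, 'a'), (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO tableB VALUES (1), (1), (2);
""")

# DISTINCT * de-duplicates whole rows, so no column list is needed.
rows = conn.execute("""
    SELECT DISTINCT *
    FROM tableA
    WHERE xyz IN (SELECT xyz FROM tableB)
    ORDER BY xyz
""").fetchall()
print(rows)
```

One difference from the original LEFT JOIN: a key in tableB with no match in tableA would have produced an all-NULL row there, which this form omits.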

Related

Split Array Into Parts - Get all Unique Items - GoogleSQL

I'm looking to take a String-Array field in Google SQL and transpose it to show all in one column. From there I can take all unique/distinct items from it.
The image above is a sample of what I am trying to do.
I can't get the string array to split out into resulting rows.
Any help or suggestions would be greatly appreciated.
I think you can do it using unnest, assuming columnB is holding the array:
select numbers
from yourtable t
cross join unnest(t.ColumnB) numbers
and for distinct values:
select distinct numbers
from yourtable t
cross join unnest(t.ColumnB) numbers
Adding this as an answer (as it is too long for a comment), just to point out that users often use overly verbose syntax with the unnest function. For example, instead of unnest(t.ColumnB) one can use either unnest(ColumnB) or just t.ColumnB, as in the examples below:
select number
from your_table t, t.ColumnB number
and
select distinct number
from your_table t, t.ColumnB number
I personally prefer this shortcut version of unnest, so I wanted to share it, though obviously this is a matter of personal preference.
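For readers who have not used unnest before, here is a minimal Python model of what the queries above compute. It is only an analogy, not BigQuery code, and the sample rows are hypothetical.

```python
# Hypothetical rows: each has an id and an array column, like ColumnB above.
rows = [
    {"id": 1, "ColumnB": ["a", "b"]},
    {"id": 2, "ColumnB": ["b", "c"]},
    {"id": 3, "ColumnB": []},
]

# CROSS JOIN UNNEST: one output row per (row, array element) pair.
# A row whose array is empty contributes nothing.
numbers = [number for row in rows for number in row["ColumnB"]]

# SELECT DISTINCT: keep each value once (sorted here only for stable output).
distinct_numbers = sorted(set(numbers))

print(numbers)
print(distinct_numbers)
```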

BigQuery Wildcard tables with Regex and date range

Is it possible to combine the table wildcard functions as documented here?
I've taken a look through the Table Query functions SO answer, but it doesn't quite seem to cover my use case.
I have table names in the format: s_CUSTOMER_ID_YYYYMMDD
I can find all the tables for a customer ID using:
SELECT *
FROM TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")')
And I can find all the tables for a date range via:
SELECT *
FROM (TABLE_DATE_RANGE([project:dataset],
TIMESTAMP('2016-01-01'),
TIMESTAMP('2016-03-01')))
But how do I query for both at the same time?
I tried using sub queries like this:
SELECT * FROM
(SELECT *
FROM TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")'))
,(SELECT *
FROM (TABLE_DATE_RANGE([project:dataset],
TIMESTAMP('2016-01-01'),
TIMESTAMP('2016-03-01'))))
...but the parser complains with Error: Can't parse table: project:dataset.
Adding a dot, so that both read project:dataset., produces a different error: Error: Error preparing subsidiary query: Dataset project:dataset. not found
Are my table names poorly done? What would be a better way of organising them if so?
Below is a quick "solution". It should work, and you can improve it based on any real or extra requirements you probably have:
SELECT *
FROM
TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")
AND RIGHT(table_id, 8) BETWEEN "20160101" AND "20160301"')
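The predicate above combines a prefix match with a string comparison on the last eight characters. A small Python sketch of the same filter, applied to a hypothetical list of table names, shows why that works: fixed-width YYYYMMDD strings sort in date order. The function name, the sample names, and the trailing underscore in the pattern (added so that customer 42 does not also match 421) are assumptions for illustration.

```python
import re

def matching_tables(table_ids, customer_id, start, end):
    """Return table ids of the form s_<customer_id>_YYYYMMDD within [start, end]."""
    prefix = re.compile(r"^s_" + re.escape(customer_id) + r"_")
    return [
        t for t in table_ids
        # t[-8:] plays the role of RIGHT(table_id, 8)
        if prefix.match(t) and start <= t[-8:] <= end
    ]

tables = [
    "s_42_20151231",
    "s_42_20160115",
    "s_42_20160301",
    "s_42_20160302",
    "s_7_20160115",
]
selected = matching_tables(tables, "42", "20160101", "20160301")
print(selected)
```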

Access SQL Query - Top 25 containing /images/

I'm a university student (thus a beginner) with some Access tasks to do, but I need help because so far it does not provide the desired result.
I have a table Table1 containing web log records. My field of interest is "cs-uri-stem", as it contains all the URL GET requests.
I want to select the TOP 25 images (records must contain /images/ in the "cs-uri-stem" field). So far I tried the following, with no success:
SELECT TOP 25
FROM Table 1
WHERE "cs-uri-stem"="cs-uri-stem"
HAVING [/images/];
An alert window keeps appearing saying that the SELECT request is not correct, but I don't know if this is caused by the fact that the Access is in Spanish in my university.
Thanks in advance!
Possible answer to my question, provided with Siyual and MCP_infiltrator's help [Thanks again!]:
SELECT TOP 25
Table1.[cs-uri-stem],
Count(Table1.[cs-method]) AS TotalHits
FROM Table1
WHERE (((Table1.[cs-uri-stem]) Like '*/images*'))
GROUP BY Table1.[cs-uri-stem]
ORDER BY Count(Table1.[cs-method]) DESC
This will provide a top 25 list of only the visited images, with no duplicates, instead of all the URLs.
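To try the same grouping logic outside Access, a minimal sketch with Python's sqlite3 module could look like the following. Note the dialect differences, which are assumptions of this sketch rather than part of the Access answer: SQLite uses LIMIT instead of TOP, % instead of * as the LIKE wildcard, and double quotes instead of brackets around the hyphenated field names. The sample log rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 ("cs-uri-stem" TEXT, "cs-method" TEXT)')
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [
        ("/images/logo.png", "GET"),
        ("/images/logo.png", "GET"),
        ("/images/banner.jpg", "GET"),
        ("/index.html", "GET"),
    ],
)

# Count hits per image URL and keep the 25 most requested ones.
top_images = conn.execute("""
    SELECT "cs-uri-stem", COUNT("cs-method") AS TotalHits
    FROM Table1
    WHERE "cs-uri-stem" LIKE '%/images/%'
    GROUP BY "cs-uri-stem"
    ORDER BY TotalHits DESC
    LIMIT 25
""").fetchall()
print(top_images)
```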
There are several things wrong with your query, but this should work for what you need:
SELECT TOP 25 Images
FROM Table1
WHERE [cs-uri-stem] Like '%/images/%'
As for the what/why things are wrong...
For your Select statement, you aren't specifying any fields that you're wanting to get. I'm assuming by your question that you have a field named images that you're wanting to get back. If it's some other field, change that, or just use SELECT TOP 25 * to get everything.
Your From clause has a space in the table name.
Your Where clause makes no sense. This is where you need to be putting your logic for your query. In this case, you want anything that has /images/ in the cs-uri-stem field. Like is the operator you need to use here.
Finally, your Having is just plain wrong. It's not used correctly, nor is it even in the right context.
SELECT TOP 25 *
FROM TABLE1
WHERE [CS-URI-STEM] LIKE "*/images*"
This will select the top 25 records, with all columns, from the table where cs-uri-stem has /images in it. Using * in the SELECT list tells the database that you want all the columns of the table.
Here is a link describing some of it: Select Top n in MS-Access
Another link, from office.microsoft.com
I would create three queries:
query1:
SELECT Images
FROM Table1
WHERE [cs-uri-stem] Like '*/images/*'
query2:
select images, count(images) as image_count
from query1 as q
group by images
query3:
select top 25 *
from query2
order by image_count desc
query3 should have the 25 results that you're interested in.
SELECT TOP 25 * FROM Table1 WHERE [cs-uri-stem] LIKE '%/images/%'
http://www.w3schools.com/sql/sql_like.asp

Use of the HAVING clause when using multiple sums

I was having a problem getting multiple sums from multiple tables. Short story: my problem was solved in the "sql sum data from multiple tables" thread on this site. Where it comes up short is that I'd now like to show only sums that are greater than a certain amount. So while I have sub-selects in my select, I think I need to use a HAVING clause to filter out the summed amounts that are too low.
Example, using the code specified in the link above (more specifically the answer that the owner has chosen as correct), I would only like to see a query result if SUM(AP2.Value) > 1500. Any thoughts?
If you need to filter on the result of ANY aggregate function, you MUST use a HAVING clause. WHERE is applied at the row level, as the DB scans the tables for matching rows. HAVING is applied basically immediately before the result set is sent out to the client. At the time WHERE operates, the aggregate results are not (and cannot be) available, so you have to use a HAVING clause, which is applied after the main query is complete and all aggregate results are available.
So... long story short, yes, you'll need to do
SELECT ...
FROM ...
WHERE ...
HAVING (SUM_AP > 1500)
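Note that some databases (MySQL, for example) let you use column aliases in the HAVING clause; others, such as SQL Server, require you to repeat the aggregate expression there. In technical terms, HAVING on a query like the one above works basically the same as wrapping the initial query in another query and applying another WHERE clause on the wrapper: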
SELECT *
FROM (
SELECT ...
) AS child
WHERE (SUM_AP > 1500)
You could wrap that query as a subselect and then specify your criteria in the WHERE clause:
SELECT
PROJECT,
SUM_AP,
SUM_INV
FROM (
SELECT
AP1.[PROJECT],
(SELECT SUM(AP2.Value) FROM AP AS AP2 WHERE AP2.PROJECT = AP1.PROJECT) AS SUM_AP,
(SELECT SUM(INV2.Value) FROM INV AS INV2 WHERE INV2.PROJECT = AP1.PROJECT) AS SUM_INV
FROM AP AS AP1
INNER JOIN INV AS INV1 ON
AP1.[PROJECT] = INV1.[PROJECT]
WHERE
AP1.[PROJECT] = 'XXXXX'
GROUP BY
AP1.[PROJECT]
) SQ
WHERE
SQ.SUM_AP > 1500
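The equivalence between HAVING and a WHERE on a wrapping query can be checked with a small sketch. This one uses Python's sqlite3 module with a simplified, made-up AP table (no INV join), and repeats the aggregate in HAVING instead of relying on alias support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AP (PROJECT TEXT, Value REAL);
    INSERT INTO AP VALUES ('P1', 1000), ('P1', 800), ('P2', 700), ('P2', 500);
""")

# Variant 1: filter the aggregated groups with HAVING.
having_rows = conn.execute("""
    SELECT PROJECT, SUM(Value) AS SUM_AP
    FROM AP
    GROUP BY PROJECT
    HAVING SUM(Value) > 1500
""").fetchall()

# Variant 2: aggregate in a subselect, then filter with WHERE on the wrapper.
wrapped_rows = conn.execute("""
    SELECT PROJECT, SUM_AP
    FROM (
        SELECT PROJECT, SUM(Value) AS SUM_AP
        FROM AP
        GROUP BY PROJECT
    ) AS SQ
    WHERE SQ.SUM_AP > 1500
""").fetchall()

print(having_rows, wrapped_rows)
```

Both variants keep only P1 (sum 1800) and drop P2 (sum 1200).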

Oracle Group by issue

I have the query below. The problem is that the subquery for the last column, productdescr, returns two records, and the query fails despite the DISTINCT. I need to add one more condition to the WHERE clause of that subquery so that it returns one record. The issue is that the column I need to add should not be part of the GROUP BY clause.
SELECT product_billing_id,
billing_ele,
SUM(round(summary_net_amt_excl_gst/100)) gross,
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele) productdescr
FROM bil.bill_sum aa
WHERE file_id = 38613 --1=1
AND line_type = 'D'
AND (product_billing_id, billing_ele) IN (SELECT DISTINCT
product_billing_id,
billing_ele
FROM bil.bill_l2 )
AND trans_type_desc <> 'Change'
GROUP BY product_billing_id, billing_ele
I want to modify the subquery as shown below, adding a new filter to its WHERE clause so that it returns one record:
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele
AND (rate_structure_start_date <= TO_DATE(aa.p_effective_date,'yyyymmdd')
AND rate_structure_end_date > TO_DATE(aa.p_effective_date,'yyyymmdd'))
) productdescr
The aa.p_effective_date should not be a part of GROUP BY clause. How can I do it? Oracle is the Database.
So there are multiple RES.tariff records for a given product_billing_id/billing_ele, differentiated by the start/end dates.
You want the description for the record that encompasses the 'p_effective_date' from bil.bill_sum. The kicker is that you can't (or don't want to) include that in the group by. That suggests you've got multiple rows in bil.bill_sum with different effective dates.
The issue is what you want to happen when you are summarising those multiple rows with different dates: which of those dates do you want to use to get the description?
If it doesn't matter, simply use MIN(aa.p_effective_date), or MAX.
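One way to apply the MAX suggestion is to aggregate first in an inline view and look up the description afterwards. The sketch below is a variation on that idea, not the answerer's exact query, and it runs on Python's sqlite3 module rather than Oracle. The simplified tables, the `amount` column, and the sample rows are assumptions; dates are stored as yyyymmdd text so they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bill_sum (product_billing_id TEXT, billing_ele TEXT,
                           amount INTEGER, p_effective_date TEXT);
    CREATE TABLE tariff_nt (product_billing_id TEXT, billing_ele TEXT,
                            description TEXT,
                            rate_structure_start_date TEXT,
                            rate_structure_end_date TEXT);
    INSERT INTO bill_sum VALUES
        ('P1', 'E1', 100, '20200115'),
        ('P1', 'E1', 250, '20200710');
    INSERT INTO tariff_nt VALUES
        ('P1', 'E1', 'Old tariff', '20190101', '20200601'),
        ('P1', 'E1', 'New tariff', '20200601', '20991231');
""")

# Inner query: one row per group, carrying the latest effective date.
# Outer query: pick the tariff description valid on that date.
rows = conn.execute("""
    SELECT g.product_billing_id, g.billing_ele, g.gross,
           (SELECT t.description
            FROM tariff_nt t
            WHERE t.product_billing_id = g.product_billing_id
              AND t.billing_ele = g.billing_ele
              AND t.rate_structure_start_date <= g.eff_date
              AND t.rate_structure_end_date > g.eff_date) AS productdescr
    FROM (SELECT product_billing_id, billing_ele,
                 SUM(amount) AS gross,
                 MAX(p_effective_date) AS eff_date
          FROM bill_sum
          GROUP BY product_billing_id, billing_ele) g
""").fetchall()
print(rows)
```

Swapping MAX for MIN in the inner query would select the tariff that was valid on the earliest effective date instead.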
Have you looked into Oracle's analytical functions? This is a good link: Analytical Functions by Example