I am building partition-based tables in a dataset, and I am trying to query those partitions using a date range.
Here is an example of the data:
Dataset:
logs
Tables:
logs_20170501
logs_20170502
logs_20170503
First I tried TABLE_DATE_RANGE:
SELECT count(*) FROM TABLE_DATE_RANGE([logs.logs_],
TIMESTAMP("2017-05-01"),
TIMESTAMP("2017-05-03")) as logs_count
I keep getting: "ERROR: Error evaluating subsidiary query"
I tried these options as well:
Single quotes:
SELECT count(*) FROM TABLE_DATE_RANGE([logs.logs_],
TIMESTAMP('2017-05-01'),
TIMESTAMP('2017-05-03')) as logs_count
Adding the project ID:
SELECT count(*) FROM TABLE_DATE_RANGE([main_sys_logs:logs.logs_],
TIMESTAMP('2017-05-01'),
TIMESTAMP('2017-05-03')) as logs_count
And it didn't work.
So I tried to use _TABLE_SUFFIX:
SELECT
count(*)
FROM [main_sys_logs:logs.logs_*]
WHERE _TABLE_SUFFIX BETWEEN '20170501' AND '20170503'
And I got this error:
Invalid table name: 'main_sys_logs:logs.logs_*'
I have been switching the SQL dialect between legacy SQL on/off, and I just got different errors on the table name part.
Are there any tips or help for this matter?
Maybe my table name is built wrong with the "_" at the end and this is causing the problem? Thanks for any help.
So I tried this query and it worked:
SELECT count(*) FROM TABLE_DATE_RANGE(logs.logs_,
TIMESTAMP("2017-05-01"),
TIMESTAMP("2017-05-03")) as logs_count
It started to work after I ran this query; I don't know if this is the reason, but I just queried the __TABLES__ metadata for the dataset:
SELECT *
FROM logs.__TABLES__
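For what it's worth, the _TABLE_SUFFIX attempt above only works in standard SQL, where wildcard tables are written with backticks rather than the legacy [bracket] syntax. A minimal sketch, assuming the project is main_sys_logs as above:
#standardSQL
SELECT COUNT(*) AS logs_count
FROM `main_sys_logs.logs.logs_*`
WHERE _TABLE_SUFFIX BETWEEN '20170501' AND '20170503'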
Is there any way within Snowflake, via SQL query, to view which tables are being queried the most, as well as which columns? I want to know which data is of most value to my users, and I'm not sure how to do this programmatically. Any thoughts are appreciated - thank you!
2021 update
The new ACCESS_HISTORY view has this information (in preview right now, enterprise edition).
For example, if you want to find the most used columns:
select obj.value:objectName::string objName
, col.value:columnName::string colName
, count(*) uses
, min(query_start_time) since
, max(query_start_time) until
from snowflake.account_usage.access_history
, table(flatten(direct_objects_accessed)) obj
, table(flatten(obj.value:columns)) col
group by 1, 2
order by uses desc
Ref: https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html
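If you want the most used tables rather than columns, the same view works without the column flatten - a sketch along the same lines:
select obj.value:objectName::string objName
, count(*) uses
from snowflake.account_usage.access_history
, table(flatten(direct_objects_accessed)) obj
group by 1
order by uses desc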
2020 answer
The best I found (for now):
For any given query, you can find which tables are scanned by looking at the plan generated for it:
SELECT *, "objects"
FROM TABLE(EXPLAIN_JSON(SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.any_table_or_view')))
WHERE "operation"='TableScan'
You can also find all of your previously run queries:
select QUERY_TEXT
from table(information_schema.query_history())
So the natural next step would be to combine both - but that's not straightforward, as you'll get an error like:
SQL compilation error: argument 1 to function EXPLAIN_JSON needs to be constant, found 'SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.c')'
The solution is to combine the queries from query_history() with SYSTEM$EXPLAIN_PLAN_JSON outside of a single query (so that each string argument is a constant), and then you will be able to find the most queried tables.
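A minimal sketch of that combination, assuming Snowflake Scripting is available (run as an anonymous block in Snowsight); scanned_objects is a hypothetical scratch table, and iterations will fail for statements that no longer compile, so treat this purely as a starting point:
DECLARE
  stmt STRING;
  c CURSOR FOR
    SELECT query_text
    FROM TABLE(information_schema.query_history())
    WHERE query_type = 'SELECT';
BEGIN
  CREATE OR REPLACE TEMPORARY TABLE scanned_objects (objects STRING);
  FOR rec IN c DO
    -- EXPLAIN_JSON needs a constant argument, so build each statement
    -- as a string (escaping quotes) and run it with dynamic SQL.
    stmt := 'INSERT INTO scanned_objects '
         || 'SELECT "objects"::string FROM TABLE(EXPLAIN_JSON(SYSTEM$EXPLAIN_PLAN_JSON('''
         || REPLACE(rec.query_text, '''', '''''')
         || '''))) WHERE "operation" = ''TableScan''';
    EXECUTE IMMEDIATE stmt;
  END FOR;
  RETURN 'done';
END;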
I am currently working on an Access database where we collect customer feedback.
I have one table with the following structure and data:
And I want to display the following result:
Indeed, what I want is an MS Access query that displays, for every date value in my table, the number of records matching that date in the column "date_import" (2nd column of the result) and the number of records matching it in the column "date_answered" (3rd column of the result).
I have no idea how to do this, since all the subqueries would need to be aware of each other.
Has anyone ever faced this issue and might be able to help me?
Thanks in advance,
P.S.: I'm using the 2016 version of MS Access, but I'm pretty sure what I'm trying to do is also achievable in previous versions of Access, which is why I added several tags.
Hmmm . . . I think this will work:
select dte, sum(is_contact), sum(is_answer)
from (select date_import as dte, 1 as is_contact, 0 as is_answer
from t
union all
select date_answered, 0 as is_contact, 1 as is_answer
from t
) t
group by dte;
Not all versions of MS Access allow union all in the FROM clause. If that is a problem, you can create a view (a saved query) and then select from the view, as sketched below.
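A sketch of that two-step variant, assuming a saved Access query named qryContactDates (the name is made up):
-- saved query qryContactDates:
select date_import as dte, 1 as is_contact, 0 as is_answer from t
union all
select date_answered, 0 as is_contact, 1 as is_answer from t;
-- then aggregate over the saved query:
select dte, sum(is_contact), sum(is_answer)
from qryContactDates
group by dte;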
Is it possible to combine the table wildcard functions as documented here?
I've taken a look through the Table Query functions SO answer, but it doesn't quite seem to cover my use case.
I have table names in the format: s_CUSTOMER_ID_YYYYMMDD
I can find all the tables for a customer ID using:
SELECT *
FROM TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")')
And I can find all the tables for a date range via:
SELECT *
FROM (TABLE_DATE_RANGE([project:dataset],
TIMESTAMP('2016-01-01'),
TIMESTAMP('2016-03-01')))
But how do I query for both at the same time?
I tried using sub queries like this:
SELECT * FROM
(SELECT *
FROM TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")'))
,(SELECT *
FROM (TABLE_DATE_RANGE([project:dataset],
TIMESTAMP('2016-01-01'),
TIMESTAMP('2016-03-01'))))
...but the parser complains of Error: Can't parse table: project:dataset.
Adding a dot so they are project:dataset. brings an error Error: Error preparing subsidiary query: Dataset project:dataset. not found
Are my table names poorly done? What would be a better way of organising them if so?
Below is a quick "solution" - it should work, and you can improve it based on the real/extra requirements you probably have:
SELECT *
FROM
TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")
AND RIGHT(table_id, 8) BETWEEN "20160101" AND "20160301"')
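A variant of the same idea (just a sketch): pull the trailing YYYYMMDD out with REGEXP_EXTRACT instead of RIGHT, so the prefix and the date constraint live in one anchored pattern (non-matching tables yield NULL and drop out of the BETWEEN):
SELECT *
FROM
  TABLE_QUERY([project:dataset],
    'REGEXP_EXTRACT(table_id, r"^s_CUSTOMER_ID_(\d{8})$") BETWEEN "20160101" AND "20160301"')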
I am running the following query and keep getting the error message:
SELECT NTH(2,split(Web_Address_,'.')) +'.'+NTH(3,split(Web_Address_,'.')) as D , Web_Address_
FROM [Domains.domain]
limit 10
Error message: Error: (L1:110): (L1:119): SELECT clause has mix of
aggregations 'D' and fields 'Web_Address_' without GROUP BY
clause Job ID:
symmetric-aura-572:job_axsxEyfYpXbe2gpmlYzH6bKGdtI
I tried to use a GROUP BY clause on field D and/or Web_Address_, but I am still getting errors about GROUP BY.
Does anyone know why this is the case? I have had success with similar query before.
You probably want to use WITHIN RECORD aggregation here, not GROUP BY
select concat(p1, '.', p2), Web_Address_ FROM
(SELECT
   NTH(2, split(Web_Address_, '.')) WITHIN RECORD p1,
   NTH(3, split(Web_Address_, '.')) WITHIN RECORD p2,
   Web_Address_
 FROM (SELECT 'a.b.c' as Web_Address_))
P.S. If you are just trying to cut off the first part of the web address, it is easier to do with the RIGHT and INSTR functions.
You can also consider using URL functions: HOST, DOMAIN and TLD
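A quick sketch of the URL-function route in legacy SQL (the sample address is made up):
SELECT
  HOST('http://www.sub.example.com/path') AS h,    -- www.sub.example.com
  DOMAIN('http://www.sub.example.com/path') AS d,  -- example.com
  TLD('http://www.sub.example.com/path') AS t      -- .com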
I'm trying to replicate the following SQL in Rails 3 ActiveRecord; nothing I've found so far comes close, so any help would be appreciated.
SELECT AVG(DAILY_AVG) FROM (
SELECT user_code, (COUNT(actioned_at) / 200) as DAILY_AVG
FROM transactions
GROUP BY user_code
) TMP
I'm currently executing this directly using ...connection.select_value(sql), but I would really like to figure out the ActiveRecord way of doing this.
The inner query can be written as:
Transaction.group(:user_code).select("COUNT(actioned_at) / 200 AS daily_avg")
And then to nest this to get the average we can do:
Transaction.select("AVG(daily_avg)").from(Transaction.group(:user_code).select("COUNT(actioned_at) / 200 AS daily_avg"))[0].avg.to_f
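For reference, that nested relation should produce SQL along these lines (the subquery alias is whatever ActiveRecord assigns; "subquery" here is an assumption):
SELECT AVG(daily_avg)
FROM (
  SELECT COUNT(actioned_at) / 200 AS daily_avg
  FROM transactions
  GROUP BY user_code
) subquery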