BigQuery / I can't get correct results using "IN" clauses - SQL

I want to extract exactly matching data, not partially matching data.
But I can't extract it when I execute the SQL code below.
I expected this SQL code to return no rows, but it returns all of the rows.
SQL code:
WITH a AS(
SELECT
001 AS id_a,
112345678901234567 AS x
UNION ALL
SELECT
002,
112345678901233567
UNION ALL
SELECT
003,
112345678901232568
),
comp_a AS(
SELECT
*
FROM
a
WHERE
x IN(112345678901234000, 112345678901233000, 112345678901232000)
),
comp_b AS(
SELECT
004 AS id_b
UNION ALL
SELECT
005
)
SELECT
id_a,
id_b
FROM
comp_a
LEFT OUTER JOIN
comp_b
ON (
comp_a.id_a = comp_b.id_b
)
WHERE
comp_b.id_b IS NULL
;
I think "in" clauses are used for perfectly match.
But, perhaps, I think this sql code isn't executed "in" clauses , but it is executed "like" clauses.
I will be glad you answer solution of my question.
■Further note:
 ・I deleted cashe of browser and Bigquery. But I couldn't solve it.
 ・This sql code is sample code , because I can't expose real sql code.
・I can recreate this problem in One enviroment of BigQuery,
but I can't recreate in Other enviroment of BigQuery.
This Problem may be not problem of sql code , but problem of enviroment
or setting.

Thank you for answering my question.
I have solved my problem.
The cause of my problem is not BigQuery, but the cell format in Excel.
Detail:
I tried to check the data using Excel, because the results contain a lot of rows.
Sadly, because the Excel cells were formatted as numbers, some of the values were rounded, so I mistook the correct results for wrong ones.
Sorry about my misunderstanding.
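For anyone who hits the same confusion: Excel stores numbers as double-precision floats with roughly 15 significant digits, so 18-digit integers such as 112345678901234567 are silently rounded when they land in a numeric cell. One way to double-check the raw values without going through Excel (just a sketch, based on the sample data above) is to keep everything inside BigQuery and cast the values to STRING:
WITH a AS (
  SELECT 001 AS id_a, 112345678901234567 AS x UNION ALL
  SELECT 002, 112345678901233567 UNION ALL
  SELECT 003, 112345678901232568
)
-- Exporting x as a STRING keeps every digit, so nothing downstream
-- (Excel, a CSV viewer, etc.) can silently round it.
SELECT id_a, CAST(x AS STRING) AS x_as_text
FROM a
WHERE x IN (112345678901234000, 112345678901233000, 112345678901232000);
-- Returns zero rows, which is the correct result for this sample data.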

Related

Query Snowflake Jobs [duplicate]

Is there any way, within a Snowflake SQL query, to view which tables are being queried the most, as well as which columns? I want to know what data is of most value to my users and I'm not sure how to do this programmatically. Any thoughts are appreciated - thank you!
2021 update
The new ACCESS_HISTORY view has this information (in preview right now, enterprise edition).
For example, if you want to find the most used columns:
select obj.value:objectName::string objName
, col.value:columnName::string colName
, count(*) uses
, min(query_start_time) since
, max(query_start_time) until
from snowflake.account_usage.access_history
, table(flatten(direct_objects_accessed)) obj
, table(flatten(obj.value:columns)) col
group by 1, 2
order by uses desc
Ref: https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html
2020 answer
The best I found (for now):
For any given query, you can find which tables are scanned by looking at the plan generated for it:
SELECT *, "objects"
FROM TABLE(EXPLAIN_JSON(SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.any_table_or_view')))
WHERE "operation"='TableScan'
You can also find all of your previously run queries:
select QUERY_TEXT
from table(information_schema.query_history())
So the natural next step would be to combine both - but that's not straightforward, as you'll get an error like:
SQL compilation error: argument 1 to function EXPLAIN_JSON needs to be constant, found 'SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.c')'
The solution is to take the queries from query_history() and run SYSTEM$EXPLAIN_PLAN_JSON over them outside that query (so the strings are constants); then you will be able to find the most queried tables.
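A rough, untested sketch of that last step: generate one constant-string EXPLAIN statement per query text taken from query_history(), run the generated statements from a client script, and aggregate their "objects" output. The query_type filter is only an assumption, to skip non-SELECT statements:
-- Build one EXPLAIN statement per previous query; EXPLAIN_JSON only accepts
-- constant arguments, so the generated statements must be executed separately
-- (e.g. from a client script) and their results combined afterwards.
select 'select "objects" from table(explain_json(system$explain_plan_json('
       || '''' || replace(query_text, '''', '''''') || ''')))'
       || ' where "operation" = ''TableScan'';' as explain_stmt
from table(information_schema.query_history())
where query_type = 'SELECT';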

SQL Division shows as null in SSRS

I am trying to do some divisions in SQL and put them into an SSRS table object.
This SQL works, and the result is 0.6:
select testcol2/5 testcol from (select 3 testcol2 from dual)
But when I try it the other way around (dividing by the column), the SSRS table shows no value instead of 1.666...:
select 5/testcol2 testcol from (select 3 testcol2 from dual)
Please can someone help me get SSRS to show that value?
This is pretty strange, but I was able to reproduce it. When I run the SQL in a different tool, I can confirm that the result is 1.666666..., but SSRS isn't displaying it.
Here is a solution that works. Change the SQL to round the result:
select ROUND(5/testcol2, 2) as testcol2
from (select 3 testcol2 from dual)
Now SSRS will display 1.67 as expected. And it will honor any additional number formatting that you place on the textbox as usual.
Edit:
To satisfy my curiosity, I dug in a little more. I used the dump function in Oracle to determine that it was using a length of 193. This works if you round with up to 28 decimal places. Anything more than that and SSRS won't display the value.
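If you want to reproduce that check, the internal representation can be inspected with a couple of one-liners along the same lines (a sketch using the sample values from the question):
-- Full-precision result: the NUMBER that SSRS refuses to display.
select dump(5/testcol2) as raw_value from (select 3 testcol2 from dual);
-- Rounded result: a short NUMBER that SSRS displays normally.
select dump(round(5/testcol2, 2)) as rounded_value from (select 3 testcol2 from dual);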

How to select all data from table but only display date-specific rows within DATE-data type column, in Oracle SQL?

I'm having trouble writing a query that returns all columns of a table, limited to rows where the DATE-type "enroll_date" column contains '30-Jan-07'. The closest I've come is the query below, but it displays neither the data nor the whole workbook - just the one column - which leads me to believe that this is not just an issue with my approach but perhaps a formatting issue as well.
SELECT TO_DATE(enroll_date, 'DD-MM-YY')
FROM student.enrollment
WHERE enroll_date= '30-Jan-07';
Again, I need to display all columns, but only the rows specific to the date '30-Jan-07'. I'm sure a nested query is somehow the right solution, but unfortunately my chops aren't there yet - I'm working on it! :D
UPDATE
Please see the attached screenshot of the output. The query/solution should retrieve all of the columns and rows enclosed within the red rectangle mark-up - thank you!
One possible problem is that the date column has a time component (this is hidden in SQL). One method is to use trunc():
SELECT e.*
FROM student.enrollment e
WHERE TRUNC(e.enroll_date) = DATE '2007-01-30';
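If the hidden time component really is the culprit, another option (a sketch, untested) is a half-open date range, which covers the whole calendar day without wrapping enroll_date in a function, so any index on the column stays usable:
SELECT e.*
FROM student.enrollment e
-- All times on 30-Jan-2007, from midnight up to (but not including) the next day.
WHERE e.enroll_date >= DATE '2007-01-30'
  AND e.enroll_date < DATE '2007-01-31';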
You can specify whichever columns you want in the following query:
SELECT col1, col2, col3, ...
FROM student.enrollment
WHERE TO_CHAR(enroll_date, 'DD-MON-YY') = '30-JAN-07';

Running Oracle SQL query over several dates

Within Crystal Reports, I'm using the following query (against an Oracle database) to generate data for a single field in a report:
SELECT SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND (e2.ENT_START_DATE <= {?HB_As_At_Date}
AND e2.ENT_END_DATE > {?HB_As_At_Date})
AND e2.ENT_CREATED_DATE<={?HB_As_At_Date})
AND cp1.CPE_CPA_CPY_CODE='HB'
This works fine and returns a single integer value, based on the {?HB_As_At_Date} supplied (The {?} syntax is Crystal's way of embedding parameter values into SQL). The content of the above query isn't my issue though - what I want to do is run it repeatedly for several different dates, and have that output be what gets fed through to Crystal for use in the report.
So say I want this query run for every Monday in September, I'd currently run the Crystal report once with a parameter of 07/09/2015, then again for 14/09/2015, etc.
I'd instead like to use my SELECT statement in conjunction with a query that tabulates this as needed, running the above once per required date, with the output being something like:
Date Entitlement
07/09/2015 450,000.00
14/09/2015 460,123.00
21/09/2015 465,456.00
28/09/2015 468,789.00
Could someone point me in the right direction in terms of which keywords I should be reading up on here? I'd imagine it's quite straightforward to generate a set of dates and run my SQL as a subquery against them, but I'm not sure where to start.
The only way I can think of without using a stored procedure is by repeating (i.e. copy/paste) your query for each date parameter and then combining them as sub-queries using UNION. Something like this:
SELECT SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND (e2.ENT_START_DATE <= {?HB_As_At_Date_1}
AND e2.ENT_END_DATE > {?HB_As_At_Date_1})
AND e2.ENT_CREATED_DATE<={?HB_As_At_Date_1})
AND cp1.CPE_CPA_CPY_CODE='HB'
UNION
SELECT SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND (e2.ENT_START_DATE <= {?HB_As_At_Date_2}
AND e2.ENT_END_DATE > {?HB_As_At_Date_2})
AND e2.ENT_CREATED_DATE<={?HB_As_At_Date_2})
AND cp1.CPE_CPA_CPY_CODE='HB'
As for your comment about writing a script for that, I don't know how you are running your report. But if you have an app/website running it, then you can generate the SQL in the app/website's language and assign it to the report object before you run it. Or even better, you can generate the SQL, run it, and assign the results to the report object. I do this all the time as I prefer my code to run the queries rather than the report itself, because I follow the layered design pattern in my app. The report lives in the presentation layer, which cannot communicate with the database directly; instead it calls the business/data layer, which generates/runs the query and returns the results to the business/presentation layer.
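If the Date column from the desired output is needed as well, each branch can additionally select its own parameter as a literal, and UNION ALL avoids an unnecessary de-duplication. A sketch only, untested against the real tables:
-- one branch per As-At date parameter
SELECT {?HB_As_At_Date_1} AS "AsAtDate"
, SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND (e2.ENT_START_DATE <= {?HB_As_At_Date_1}
AND e2.ENT_END_DATE > {?HB_As_At_Date_1})
AND e2.ENT_CREATED_DATE<={?HB_As_At_Date_1})
AND cp1.CPE_CPA_CPY_CODE='HB'
UNION ALL
SELECT {?HB_As_At_Date_2} AS "AsAtDate"
, SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND (e2.ENT_START_DATE <= {?HB_As_At_Date_2}
AND e2.ENT_END_DATE > {?HB_As_At_Date_2})
AND e2.ENT_CREATED_DATE<={?HB_As_At_Date_2})
AND cp1.CPE_CPA_CPY_CODE='HB'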
Edit the parameter to take multiple values as input and change the query as below.
Use either the start date or the end date, but not both:
SELECT SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND e2.ENT_END_DATE in ({?HB_As_At_Date})
AND e2.ENT_CREATED_DATE in ({?HB_As_At_Date}))
AND cp1.CPE_CPA_CPY_CODE='HB'
Actually, there is a more elegant way to solve this problem.
Let's suppose that the main query is t1, and the parameters to use are available in a table t2.
Here is an example, where the sub-queries t1 and t2 can be replaced by real tables.
select t1.title , t2.ref_date, t2.dt_label
from (
select 'abc' title from dual union all
select 'def' title from dual
) t1
cross join
(
select to_date('01.07.2019', 'dd.mm.yyyy') ref_date,'S1/2019' dt_label from dual union all
select to_date('01.12.2019', 'dd.mm.yyyy') ref_date,'Y/2019' dt_label from dual union all
select to_date('01.07.2020', 'dd.mm.yyyy') ref_date,'S1/2020' dt_label from dual union all
select to_date('01.12.2020','dd.mm.yyyy') ref_date, 'Y/2020' dt_label from dual
) t2
where t2.ref_date < to_date('01.08.2020','dd.mm.yyyy')
order by t2.ref_date, t1.title;
This way, the query remains the same, while the extraction parameters are maintained in an auxiliary table.
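Applied to the query from the question, that might look roughly like the sketch below (untested, with a hard-coded list of the September Mondays standing in for the {?HB_As_At_Date} parameter):
SELECT d.as_at_date
, SUM(e1.ENT_LOCAL_AMOUNT+e1.ENT_DISCRETIONARY_AMOUNT) AS "Entitlement"
FROM (
-- auxiliary date table: every Monday in September 2015
select to_date('07.09.2015','dd.mm.yyyy') as_at_date from dual union all
select to_date('14.09.2015','dd.mm.yyyy') from dual union all
select to_date('21.09.2015','dd.mm.yyyy') from dual union all
select to_date('28.09.2015','dd.mm.yyyy') from dual
) d
CROSS JOIN CLAIM_PERIODS cp1
JOIN ENTITLEMENTS e1
ON cp1.CPE_REFNO=e1.ENT_CPE_REFNO
WHERE e1.ENT_REFNO=(SELECT MAX(to_number(e2.ENT_REFNO))
FROM ENTITLEMENTS e2
WHERE e1.ENT_CPE_REFNO=e2.ENT_CPE_REFNO
AND (e2.ENT_START_DATE <= d.as_at_date
AND e2.ENT_END_DATE > d.as_at_date)
AND e2.ENT_CREATED_DATE <= d.as_at_date)
AND cp1.CPE_CPA_CPY_CODE='HB'
GROUP BY d.as_at_date
ORDER BY d.as_at_date;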

WHERE clause on calculated field not working

I have an Access SQL query pulling back results from a Latitude & Longitude input (similar to a store locator). It works perfectly fine until I attempt to put in a WHERE clause limiting the results to only resultants within XXX miles (3 in my case).
The following query works fine without the WHERE distCalc < 3 clause being added in:
PARAMETERS
[selNum] Long
, [selCBSA] Long
, [cosRadSelLAT] IEEEDouble
, [radSelLONG] IEEEDouble
, [sinRadSelLAT] IEEEDouble;
SELECT B.* FROM (
SELECT A.* FROM (
SELECT
CERT
, RSSDHCR
, NAMEFULL
, BRNUM
, NAMEBR
, ADDRESBR
, CITYBR
, STALPBR
, ZIPBR
, simsLAT
, simsLONG
, DEPDOM
, DEPSUMBR
, 3959 * ArcCOS(
cosRadSelLAT
* cosRadSimsLAT
* cos(radSimsLONG - radSelLONG)
+ sinRadSelLAT
* sinRadSimsLAT
) AS distCalc
FROM aBRc
WHERE CBSA = selCBSA
AND cosRadSimsLAT IS NOT NULL
AND UNINUMBR <> selNum
) AS A
ORDER BY distCalc
) AS B
WHERE B.distCalc < 3
ORDER BY B.DEPSUMBR DESC;
When I add the WHERE distCalc < 3 clause, I get the dreaded
This expression is typed incorrectly, or it is too complex to be evaluated.
error.
Given that the value is created in the A sub-query, I thought it would be available in the outer B query for comparative calculations. I could recalculate distCalc in the WHERE, however, I'm trying to avoid that since I'm using a custom function (ArcCOS). I'm already doing one hit on each row and there is significant overhead involved in doing additional ones if I can avoid it.
The way you have it typed, you are limiting by B.distCalc, which requires calculation of A.distCalc, on which you are asking for a sort. Even if this worked, it would require n^2 calculations to compute.
Try putting the filter on distCalc in the inner query (using the formula for distCalc, not distCalc itself).
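To illustrate the suggestion, the inner query would repeat the formula in its WHERE clause instead of referring to the distCalc alias. A sketch using the question's column and function names, with only a handful of output columns for brevity (Access SQL doesn't allow inline comments, so the filter on the last condition is the repeated formula itself):
PARAMETERS
[selNum] Long
, [selCBSA] Long
, [cosRadSelLAT] IEEEDouble
, [radSelLONG] IEEEDouble
, [sinRadSelLAT] IEEEDouble;
SELECT
CERT
, NAMEBR
, ADDRESBR
, DEPSUMBR
, 3959 * ArcCOS(
cosRadSelLAT
* cosRadSimsLAT
* cos(radSimsLONG - radSelLONG)
+ sinRadSelLAT
* sinRadSimsLAT
) AS distCalc
FROM aBRc
WHERE CBSA = selCBSA
AND cosRadSimsLAT IS NOT NULL
AND UNINUMBR <> selNum
AND 3959 * ArcCOS(
cosRadSelLAT
* cosRadSimsLAT
* cos(radSimsLONG - radSelLONG)
+ sinRadSelLAT
* sinRadSimsLAT
) < 3
ORDER BY DEPSUMBR DESC;
Note that this evaluates ArcCOS twice per row, which is exactly the overhead the question is trying to avoid; the ArcCOS2 workaround described further down sidesteps that by computing the whole formula once inside the function.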
This is not an answer. It's a formatted comment. What happens when you do this:
select *
from (
select somefield, count(*) records
from sometable
group by somefield) temp
If that runs successfully, try it with
where records > 0
at the end. If that fails, you probably need another approach. If it succeeds, start building your real query using baby steps like this. Test early and test often.
I was able to "solve" the problem by pushing the complicated formula into the function and then returning the value (which was then able to be used in the WHERE clause).
Instead of:
3959 * ArcCOS( cosRadSelLAT * cosRadSimsLAT * cos(radSimsLONG - radSelLONG) + sinRadSelLAT * sinRadSimsLAT) AS distCalc
I went with:
ArcCOS2(cosRadSelLAT,cosRadSimsLAT,radSimsLONG, radSelLONG,sinRadSelLAT,sinRadSimsLAT) AS distCalc
The ArcCOS2 function contains the full formula. The upside is that it works; the downside is that it appears to be a tad slower. I appreciate everyone's help on this. Thank you.