BigQuery - Cannot join on repeated field - google-bigquery

I'm trying to create a single-column table in which each row is a consecutive date between two given dates. The query works fine until I add a WHERE clause that contains a subquery, i.e. NOT IN (SELECT ....). It works fine if I do something like NOT IN (TIMESTAMP('xyz')).
I keep getting an error saying "Cannot join on repeated field t2.f0__group.SomeDate".
I have no clue why this is happening. Also, I'm fairly new to BQ, so if there is an easier way to do this, please let me know. Thanks.
SELECT SomeDate FROM
(
SELECT DATE_ADD(Day, i, "DAY") SomeDate
FROM
(
SELECT '2020-01-03' Day
) T1
CROSS JOIN
(
SELECT
POSITION(
SPLIT(
RPAD('', DATEDIFF('2020-01-30','2020-01-03') * 2, 'a,'))) i
FROM
(
SELECT NULL
)
) T2
)
WHERE SomeDate NOT IN (SELECT OtherDate FROM
(
SELECT TIMESTAMP('2020-01-04 00:00:00 UTC') AS OtherDate
),
(
SELECT TIMESTAMP('2020-01-06 00:00:00 UTC') AS OtherDate
),
(
SELECT TIMESTAMP('2020-01-08 00:00:00 UTC') AS OtherDate
)
)

I suggest starting over from scratch using the example below.
I think it does exactly what you are trying to achieve, probably with minor adjustments.
SELECT SomeDate
FROM (
SELECT
DATE(DATE_ADD(TIMESTAMP('2020-01-03'), pos - 1, "DAY")) AS SomeDate
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP('2020-01-30'), TIMESTAMP('2020-01-03')), '.'),'') AS h
FROM (SELECT NULL)),h
))
)
) a
LEFT JOIN (
SELECT OtherDate FROM
(SELECT '2020-01-04' AS OtherDate),
(SELECT '2020-01-06' AS OtherDate),
(SELECT '2020-01-08' AS OtherDate)
) b
ON b.OtherDate = a.SomeDate
WHERE b.OtherDate IS NULL
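To check the result outside BigQuery, the intended output (one row per day from 2020-01-03 to 2020-01-30, minus the three excluded dates) can be reproduced with a small Python sketch. The name date_series is hypothetical and not part of any BigQuery API; it only mirrors the logic of the LEFT JOIN ... IS NULL anti-join above.

```python
from datetime import date, timedelta


def date_series(start, end, excluded):
    """Return every date from start to end (inclusive) that is not excluded."""
    skip = set(excluded)
    n_days = (end - start).days + 1  # inclusive range, like 1 + DATEDIFF(...)
    days = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in days if d not in skip]


# Same literal dates as in the query above.
excluded = [date(2020, 1, 4), date(2020, 1, 6), date(2020, 1, 8)]
result = date_series(date(2020, 1, 3), date(2020, 1, 30), excluded)

for d in result:
    print(d.isoformat())
```

The range covers 28 days and three of them are excluded, so a correct query should return 25 rows.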

How can I transform my N little queries into one query?

I have a query that gives me the first available value for a given date and key pair.
SELECT
TOP 1 value
FROM
my_table
WHERE
date >= 'myinputdate'
AND key = 'myinputkey'
ORDER BY date
I have N pairs of keys and dates, and I am trying to avoid querying each pair one by one. The table is rather big, and so is N, so the current approach is heavy and slow.
How can I query all the pairs in one query?
A solution is to use APPLY like a "function" created on the fly with one or many columns from another set:
DECLARE @inputs TABLE (
myinputdate DATE,
myinputkey INT)
INSERT INTO @inputs(
myinputdate,
myinputkey)
VALUES
('2019-06-05', 1),
('2019-06-01', 2)
SELECT
I.myinputdate,
I.myinputkey,
R.value
FROM
@inputs AS I
CROSS APPLY (
SELECT TOP 1
T.value
FROM
my_table AS T
WHERE
T.date >= I.myinputdate AND
T.key = I.myinputkey
ORDER BY
T.date ) AS R
You can use OUTER APPLY if you also want the inputs without a match to be shown, with NULL as the result value. This approach supports fetching multiple columns and using ORDER BY with TOP to control the number of rows returned per input.
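To make the per-pair "TOP 1 ... ORDER BY date" semantics concrete, here is a minimal Python sketch of the same lookup. The function name first_on_or_after and the sample rows are hypothetical; dates are assumed to be ISO strings so that string order equals date order. It behaves like OUTER APPLY: an input without a matching row yields None.

```python
from bisect import bisect_left
from collections import defaultdict


def first_on_or_after(rows, inputs):
    """rows: (key, date, value) tuples; inputs: (date, key) pairs.

    For each input pair, return the value of the earliest row with the same
    key and date >= the input date, or None when no such row exists.
    """
    by_key = defaultdict(list)
    for key, day, value in rows:
        by_key[key].append((day, value))
    for series in by_key.values():
        series.sort()  # sort each key's rows by date once

    results = []
    for day, key in inputs:
        series = by_key.get(key, [])
        # (day,) sorts before every (day, value), so this finds date >= day.
        pos = bisect_left(series, (day,))
        value = series[pos][1] if pos < len(series) else None
        results.append((day, key, value))
    return results


# Hypothetical sample data standing in for my_table.
rows = [
    (1, "2019-06-01", 10),
    (1, "2019-06-07", 11),
    (2, "2019-05-30", 20),
    (2, "2019-06-02", 21),
]
inputs = [("2019-06-05", 1), ("2019-06-01", 2), ("2019-07-01", 1)]
results = first_on_or_after(rows, inputs)
print(results)
```

Sorting each key's rows once and then doing a binary search per input is roughly what an index on (key, date) lets the database do for each APPLY invocation.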
This solution works without variables. You control your N by setting the right value in the row_num predicate.
There are plenty of ways to do what you want, and the right one depends on your specific needs. As already answered, you can use a temp table or table variable to store the conditions and then join on the same conditions you would otherwise use as predicates. You can also create a user-defined data type and use it as a parameter to a function or procedure. Or you can use CROSS APPLY with a VALUES clause to build that list and then join on it.
DROP TABLE IF EXISTS #temp;
CREATE TABLE #temp ( d DATE, k VARCHAR(100) );
GO
INSERT INTO #temp
VALUES ( '20180101', 'a' ),
( '20180102', 'b' ),
( '20180103', 'c' ),
( '20180104', 'd' ),
( '20190101', 'a' ),
( '20190102', 'b' ),
( '20180402', 'c' ),
( '20190103', 'c' ),
( '20190104', 'd' );
SELECT a.d ,
a.k
FROM ( SELECT d ,
k ,
ROW_NUMBER() OVER ( PARTITION BY k ORDER BY d DESC ) row_num
FROM #temp
WHERE (d >= '20180401'
AND k = 'a')
OR (d > '20180401'
AND k = 'b')
OR (d > '20180401'
AND k = 'c')
) a
WHERE a.row_num <= 1;
-- VALUES way
SELECT a.d ,
a.k
FROM ( SELECT t.d ,
t.k ,
ROW_NUMBER() OVER ( PARTITION BY t.k ORDER BY t.d DESC ) row_num
FROM #temp t
CROSS APPLY (VALUES('20180401','a'), ('20180401', 'b'), ('20180401', 'c')) f(d,k)
WHERE t.d >= f.d AND f.k = t.k
) a
WHERE a.row_num <= 1;
If all the keys are using the same date, then use window functions:
SELECT key, value
FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY key ORDER BY date) as seqnum
FROM my_table t
WHERE date >= @input_date AND
key IN ( . . . )
) t
WHERE seqnum = 1;
SELECT key, date,value
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY key,date ORDER BY date) as rownum,key,date,value
FROM my_table
WHERE
date >= 'myinputdate'
) as d
WHERE d.rownum = 1;

How to use Dynamic Lag function to avoid joining a table to itself to retrieve date value

I'm currently writing code in SQL to add the column in red to the following table:
The logic is the following:
For every row:
if flag for this row =1 then use date of this row
if flag for this row =0 then find the latest row (based on date) on which flag was = 1 for the same party and return the date of that row. If no such row exists, return null
I've found a way to do this by joining the table to itself but I would like to avoid doing that as the size of the table is pretty massive.
What I have
select b.*, a.date,
from table a left join table b on a.party=b.party
where a.flag =1
Someone told me I could use the lag function, the partition over function and a case when to return the value I'm after but I haven't been able to figure it out.
Can someone help? Thank you so much!
Try this:
DECLARE @tab1 TABLE(PARTY CHAR(1),DATE DATE,Flag bit)
INSERT INTO @tab1
SELECT 'A','7-24-2018',1 Union ALL
SELECT 'A','7-28-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'B','7-13-2018',1 Union ALL
SELECT 'B','7-17-2018',0 Union ALL
SELECT 'B','7-18-2018',0 Union ALL
SELECT 'C','7-8-2018',1 Union ALL
SELECT 'C','7-13-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-20-2018',0
select t.*,
max(case when flag = 1 then date end) over (partition by PARTY order by date) as [Last Flag On Date]
from @tab1 t
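The windowed MAX above returns, for each row, the latest flagged date of the same party that is not later than the row's own date. A minimal Python sketch of that rule, for checking expected results by hand, could look like this. The function name last_flag_dates is hypothetical, and the party "D" row is an added assumption to show the NULL case; the other rows come from the sample above.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


def last_flag_dates(rows):
    """rows: (party, date, flag). Append the last flagged date <= row date."""
    flagged = defaultdict(list)
    for party, day, flag in rows:
        if flag == 1:
            flagged[party].append(day)
    for days in flagged.values():
        days.sort()

    out = []
    for party, day, flag in rows:
        days = flagged.get(party, [])
        pos = bisect_right(days, day)  # number of flagged dates <= day
        out.append((party, day, flag, days[pos - 1] if pos else None))
    return out


rows = [
    ("A", date(2018, 7, 24), 1),
    ("A", date(2018, 7, 28), 0),
    ("A", date(2018, 7, 29), 0),
    ("B", date(2018, 7, 13), 1),
    ("B", date(2018, 7, 17), 0),
    ("D", date(2018, 7, 1), 0),  # hypothetical party with no flagged row
]
result = last_flag_dates(rows)
for row in result:
    print(row)
```

A row with flag = 1 gets its own date, because its date is itself the latest flagged date not after it.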
Try this:
select b.*, a.date,
from table a left join table b on a.party=b.party
where a.flag = CASE
WHEN a.flag = 1 THEN a.date
WHEN a.flag = 0 THEN (
SELECT date FROM (
SELECT TOP 1 row_number() OVER ( ORDER BY a.date DESC ) rs , a.date
FROM a
WHERE a.flag = 1
GROUP BY a.date) s )
END
Use CROSS APPLY to obtain the latest row with flag 1:
SELECT *
FROM yourtable t
CROSS APPLY
(
SELECT TOP 1 x.Date as [Last flag on date]
FROM yourtable x
WHERE x.Party = t.Party
AND x.Flag = 1
ORDER BY x.Date desc
) d
Yes, it can be done by joining the table to itself, if written properly.
@Sahi's query is also good and simple.
Since you were asking for a dynamic LAG(), here is one.
This query may or may not be very performant, but it is certainly worth learning.
Test this with various sample data and tell me for which scenarios it does not work, so that I can correct my script accordingly.
DECLARE @tab1 TABLE(PARTY CHAR(1),DATE DATE,Flag bit)
INSERT INTO @tab1
SELECT 'A','7-24-2018',1 Union ALL
SELECT 'A','7-28-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'A','7-29-2018',0 Union ALL
SELECT 'B','7-13-2018',1 Union ALL
SELECT 'B','7-17-2018',0 Union ALL
SELECT 'B','7-18-2018',0 Union ALL
SELECT 'C','7-8-2018',1 Union ALL
SELECT 'C','7-13-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-19-2018',0 Union ALL
SELECT 'C','7-20-2018',0;
WITH cte
AS (SELECT *,
Row_number()
OVER (
partition BY party
ORDER BY flag DESC, [date] DESC ) rn
FROM @tab1)
SELECT *,
CASE
WHEN flag = 1 THEN [date]
ELSE Lag([date], (SELECT TOP 1 a.rn - a1.rn
FROM cte a1
WHERE a1.party = a.party))
OVER (
ORDER BY party )
END
FROM cte a

How to call a sql query and pass a parameter from another table?

I have a complex sql query, named qryARAT2B_EXT.
SELECT
*
FROM
(
SELECT
*
FROM
(
SELECT
*,
firstStudy,
ABS(DATEDIFF('d', firstStudy, Check_Date)) as diff
FROM
(
SELECT
*,
(
SELECT
TOP 1 Check_Date
FROM
qryARAT2B
WHERE
PATNR = [PARAM]
ORDER BY
Check_Date
)
AS firstStudy
FROM
(
SELECT
*
FROM
qryARAT2B
WHERE
PATNR = [PARAM]
)
AS myPatientsWithStudy
)
AS myPatientsFirstStudy
)
WHERE
diff = 0
)
AS T1
LEFT JOIN
(
SELECT
*
FROM
(
SELECT
*,
firstStudy,
ABS(DATEDIFF('d', firstStudy, Check_Date)) as diff
FROM
(
SELECT
*,
(
SELECT
TOP 1 Check_Date
FROM
qryARAT2B
WHERE
PATNR = [PARAM]
ORDER BY
Check_Date
)
AS firstStudy
FROM
(
SELECT
*
FROM
qryARAT2B
WHERE
PATNR = [PARAM]
)
AS myPatientsWithStudy
)
AS myPatientsFirstStudy
)
WHERE
diff = 4
)
As T2
ON T1.PATNR = T2.PATNR
When I open it in MS Access, it asks for the value of [PARAM] and produces the result.
I have a table of patients, tblPatient, with the columns PATNR and so on.
It contains the PATNRs of the patients:
000001
000002
...
XXXXXX
I need to write SQL that calculates the data for all PATNRs at once, something like this:
SELECT (SELECT * FROM qryARAT2B_EXT WHERE [PARAM] = PATNR) from tblPatient
But MS Access does not accept it, and I am not able to pass a parameter to qryARAT2B_EXT from SQL. Is there any specific syntax for this in MS Access?

Select rows with same ID/email but different value in other table

I have two tables, person and email. Some email addresses have the same value but belong to persons with different IDs.
Can anyone tell me how to write an SQL query that finds them? I have tried but can't figure it out. The answers I have found always look for the match within the same table.
Like this:
Table_person  Table_email
1             email@persoon1
2             email@persoon2
3             email@persoon3
4             email@persoon1
5             email@persoon5
6             email@persoon2
The output should be:
Table_person  Table_email
1             email@persoon1
4             email@persoon1
2             email@persoon2
6             email@persoon2
Using a common table expression with row_number()
;with cte as (
select *
, rn = row_number() over (partition by email order by person_id)
from email e
)
select *
from cte
where exists (
select 1
from cte i
where i.email = cte.email
and rn > 1
)
or using exists()
select *
from email e
where exists (
select 1
from email i
where i.email = e.email
and i.person_id <> e.person_id
)
rextester demo: http://rextester.com/JHFEF82373
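The EXISTS query keeps a row when some other row has the same email but a different person_id. As a cross-check of the expected output, here is a minimal Python sketch of the same rule applied to the sample data from the question. The function name shared_emails is hypothetical.

```python
from collections import defaultdict


def shared_emails(rows):
    """rows: (person_id, email). Keep rows whose email has 2+ distinct persons."""
    people = defaultdict(set)
    for person_id, email in rows:
        people[email].add(person_id)
    keep = [row for row in rows if len(people[row[1]]) > 1]
    # Order by email, then person_id, to match the expected output listing.
    return sorted(keep, key=lambda row: (row[1], row[0]))


rows = [
    (1, "email@persoon1"),
    (2, "email@persoon2"),
    (3, "email@persoon3"),
    (4, "email@persoon1"),
    (5, "email@persoon5"),
    (6, "email@persoon2"),
]
result = shared_emails(rows)
for person_id, email in result:
    print(person_id, email)
```

Counting distinct person IDs per email, rather than rows per email, means that an address stored twice for the same person is not reported. That matches the i.person_id <> e.person_id condition in the EXISTS version.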
Hope this helps:
;with cte(Table_person,Table_email)
AS
(
SELECT 1,'email@persoon1' UNION ALL
SELECT 2,'email@persoon2' UNION ALL
SELECT 3,'email@persoon3' UNION ALL
SELECT 4,'email@persoon1' UNION ALL
SELECT 5,'email@persoon5' UNION ALL
SELECT 6,'email@persoon2'
)
,Cte2
AS
(
SELECT Table_person,Table_email From
(
Select Table_person,Table_email,ROW_NUMBER()OVER(Partition by Table_email Order By Table_person )Seq
from cte
)dt WHERE dt.Seq>1
)
,Final
AS
(
SELECT Table_person,Table_email From
(
Select Table_person,Table_email,ROW_NUMBER()OVER(Partition by Table_email Order By Table_email )Seq2
from cte
)dt
where dt.Seq2>1
Union ALL
SELECT Table_person,Table_email From cte2
)
SELECT Table_person,Table_email from Final

Optimize select query (inner select + group)

My current version is:
SELECT DT, AVG(DP_H2O) AS Tx,
(SELECT AVG(Abs_P) / 1000000 AS expr1
FROM dbo.BACS_MinuteFlow_1
WHERE (DT =
(SELECT MAX(DT) AS Expr1
FROM dbo.BACS_MinuteFlow_1
WHERE DT <= dbo.BACS_KongPrima.DT ))
GROUP BY DT) AS Px
FROM dbo.BACS_KongPrima
GROUP BY DT
but it runs very slowly.
Basically, in the inner select I'm selecting the latest time at or before my time, and then grouping by that nearest time.
Are there any possible optimizations? Maybe I can join it somehow, but the trouble is that I'm not sure how to group by this nearest date.
Thank you
You could try rearranging it to use a CROSS APPLY, as in the code below. I'm not sure whether this will improve performance, but I generally try to avoid using a subquery as a column at all costs, and SQL Server is pretty good at optimising the APPLY operator.
WITH Bacs_MinuteFlow_1 (Abs_P ,DT ) AS
(SELECT 5.3,'2011/10/10'
UNION SELECT 6.2,'2011/10/10'
UNION SELECT 7.8,'2011/10/10'
UNION SELECT 5.0,'2011/03/10'
UNION SELECT 4.3,'2011/03/10'),
BACS_KongPrima (DP_H2O ,DT)AS
(SELECT 2.3,'2011/10/15'
UNION SELECT 2.6,'2011/10/15'
UNION SELECT 10.2,'2011/03/15')
SELECT DT, AVG(DP_H2O) AS Tx,
a.Px
FROM BACS_KongPrima
CROSS APPLY
(
SELECT AVG(Abs_P) / 1000000 AS Px
FROM BACS_MinuteFlow_1
WHERE DT =
(SELECT MAX(DT) AS maxdt
FROM BACS_MinuteFlow_1
WHERE DT <= BACS_KongPrima.DT
)
) a
GROUP BY DT,a.Px
Cheers
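For reference, the "latest DT at or before my DT" lookup is an as-of match. A minimal Python sketch of the same calculation is shown below, using the sample rows from the CTEs above. The function name as_of_average is hypothetical, and the sketch says nothing about the plan SQL Server actually picks. It only shows that the per-DT averages can be computed once and then looked up with a binary search instead of being recomputed for every outer row.

```python
from bisect import bisect_right
from collections import defaultdict


def as_of_average(minute_rows, kong_dates):
    """minute_rows: (abs_p, dt); kong_dates: iterable of dt strings.

    For each distinct kong dt, return AVG(abs_p) / 1000000 taken at the
    latest minute dt that is <= the kong dt, or None if there is none.
    Dates are 'YYYY/MM/DD' strings, so string order equals date order.
    """
    groups = defaultdict(list)
    for abs_p, dt in minute_rows:
        groups[dt].append(abs_p)
    stamps = sorted(groups)  # distinct minute timestamps, ascending

    result = {}
    for dt in sorted(set(kong_dates)):
        pos = bisect_right(stamps, dt)  # count of stamps <= dt
        if pos == 0:
            result[dt] = None
            continue
        values = groups[stamps[pos - 1]]
        result[dt] = sum(values) / len(values) / 1000000
    return result


minute_rows = [
    (5.3, "2011/10/10"), (6.2, "2011/10/10"), (7.8, "2011/10/10"),
    (5.0, "2011/03/10"), (4.3, "2011/03/10"),
]
kong_dates = ["2011/10/15", "2011/10/15", "2011/03/15"]
px = as_of_average(minute_rows, kong_dates)
print(px)
```

In SQL terms this corresponds to pre-aggregating BACS_MinuteFlow_1 by DT once and then matching each BACS_KongPrima DT to the nearest earlier aggregated row.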