I have two tables joined with a LEFT JOIN, and the query is grouped by the left table's ID column. The right table has a date column called close_date. The problem is: if a left-table record has any right-table records that have not been closed (i.e. with a close_date of 0000-00-00), I do not want that left-table record to be shown at all, and if it has no right-table records with a close_date of 0000-00-00, I want only the right-table record with the MAX close_date to be returned.
So for simplicity's sake, let's say the tables look like this:
Table1
id
1
2
Table2
table1_id | close_date
1 | 0000-00-00
1 | 2010-01-01
2 | 2010-01-01
2 | 2010-01-02
I would like the query to only return this:
Table1.id | Table2.close_date
2 | 2010-01-02
I tried to come up with an answer using aliased CASE expressions and aggregate functions, but I could not filter on the result, and I was trying not to write a three-mile-long query to solve the problem. I looked through a few of the related posts on here, but none seem to meet the criteria of this particular case.
Use:
SELECT t1.id,
       MAX(t2.close_date)
  FROM TABLE1 t1
  JOIN TABLE2 t2 ON t2.table1_id = t1.id
 WHERE NOT EXISTS(SELECT NULL
                    FROM TABLE2 t
                   WHERE t.table1_id = t1.id
                     AND t.close_date = '0000-00-00')
 GROUP BY t1.id
The '0000-00-00' string should be implicitly converted by MySQL to the column's DATE/DATETIME type for the comparison. If not, cast the value explicitly.
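To see it against the sample data, here is a minimal, hypothetical reproduction (the table definitions are assumptions, and storing a literal 0000-00-00 requires a sql_mode that permits zero dates):

CREATE TABLE table1 (id INT PRIMARY KEY);
CREATE TABLE table2 (table1_id INT, close_date DATE);

INSERT INTO table1 VALUES (1), (2);
-- id 1 still has an open record; id 2 does not
INSERT INTO table2 VALUES
  (1, '0000-00-00'), (1, '2010-01-01'),
  (2, '2010-01-01'), (2, '2010-01-02');

-- The query above should then return only: 2 | 2010-01-02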
Try:
select table1_id, close_date from table2
where close_date = (select max(close_date) from table2) or close_date = '0000-00-00'
Hello everyone, this is my first question here. I have been browsing through the questions but couldn't quite find the answer to my problem:
I have a couple of tables which I need to join. The key I join on is non-unique (in this case it's a date). This is working fine, but now I also need to group the results based on another column without getting cross-join-like results (meaning each value of this column should only appear once, even though the column can have different values in each table).
Here is an example of what I have and what I would like to get:
Table1
Date/Key | Group Column | Example Value1
01-01-2022 | a | 1
01-01-2022 | d | 2
01-01-2022 | e | 3
01-01-2022 | f | 4
Table 2
Date/Key | Group Column | Example Value 2
01-01-2022 | a | 1
01-01-2022 | b | 2
01-01-2022 | c | 3
01-01-2022 | d | 4
Wanted Result:
Table Result
Date/Key | Group Column | Example Value1 | Example Value2
01-01-2022 | a | 1 | 1
01-01-2022 | b | NULL | 2
01-01-2022 | c | NULL | 3
01-01-2022 | d | 2 | 4
01-01-2022 | e | 3 | NULL
01-01-2022 | f | 4 | NULL
I have tried a couple of approaches, but I always get results where values in the group column appear multiple times. I am under the impression that full joining and then grouping over the group column should work, but apparently I am missing something. I also figured I could brute-force the result by left joining everything with the ON set to table1.date = table2.date AND table1.Groupcolumn = table2.Groupcolumn etc., and then doing UNIONs of all permutations (so each table was on "the left" once), but that is not only tedious, BigQuery also doesn't like it since it contains too many subqueries.
I feel kinda bad that my first question is something that I should actually know, but I hope someone can help me out!
I do not need a full code solution; just a hint toward the correct approach would suffice (also, in case I missed it: if this was already answered, I would also appreciate just a link to it!)
Edit:
So one solution I came up with, which appears to work, was to select the group column of each table, UNION them in a WITH clause, and then join this "list" onto the first table, like:
WITH list AS (
  SELECT t1.GroupColumn FROM Table_1 t1 WHERE CONDITION1
  UNION DISTINCT
  SELECT t1.GroupColumn FROM Table_1 t1 WHERE CONDITION2 ... etc.
),
result AS (
  SELECT s.GroupColumn, t1.Example_Value1, t2.Example_Value2
  FROM Table_1 t1
  LEFT JOIN (SELECT * FROM list) s
         ON s.GroupColumn = t1.GroupColumn
  LEFT JOIN Table_2 t2
         ON s.GroupColumn = t2.GroupColumn
        AND t1.key = t2.key
  ...
)
SELECT * FROM result
I think what you are looking for is a FULL OUTER JOIN and then you can coalesce the date and group columns. It doesn't exactly look like you need to group anything based on the example data you posted:
SELECT
coalesce(table1.date_key, table2.date_key) AS date_key,
coalesce(table1.group_column, table2.group_column) AS group_column,
table1.example_value_1,
table2.example_value_2
FROM
table1
FULL OUTER JOIN
table2
USING
(date_key,
group_column)
ORDER BY
date_key,
group_column;
Consider below simple approach
select * from (
select *, 'example_value1' type from table1 union all
select *, 'example_value2' type from table2
)
pivot (
any_value(example_value1)
for type in ('example_value1', 'example_value2')
)
if applied to the sample data in your question, the output matches the wanted result shown above
I have a database with around 50 million entries showing the status of a device for a given day, simplified to the form:
id | status
-------------
1 | Off
1 | Off
1 | On
2 | Off
2 | Off
3 | Off
3 | Off
3 | On
...
such that each id is guaranteed to have at least 2 rows with an 'Off' status, but doesn't have to have an 'On' status. I'm trying to get a list of only the ids that do not have an 'On' status. For example, in the above data set I'd want a query that returns only '2'.
The current query is:
SELECT DISTINCT id FROM table
EXCEPT
SELECT DISTINCT id FROM table WHERE status <> 'Off'
This seems to work, but it has to iterate over the entire table twice, which ends up taking ~10-12 minutes per query. Is there a simpler way to do this with only a single query?
You can use WHERE NOT EXISTS instead:
Select Distinct Id
From Table A
Where Not Exists
(
Select *
From Table B
Where A.Id = B.Id
And B.Status = 'On'
)
I would also recommend looking at the indexes on the Status column. 10-12 minutes to run is excessively long. Even with 50m records, with proper indexing, a query like this shouldn't take longer than a second.
To add an index to the column, you can run this (I'm assuming SQL Server, your syntax may vary):
Create NonClustered Index Ix_YourTable_Status On YourTable (Status Asc);
You can use conditional aggregation.
select id
from table
group by id
having count(case when status='On' then 1 end)=0
You can use the help of a self LEFT JOIN (an anti-join):
SELECT DISTINCT A.Id
FROM Table A
LEFT JOIN Table B ON A.Id = B.Id
                 AND B.Status = 'On'
WHERE B.Id IS NULL
I'm trying to write a WHERE clause that matches on two values from a different table, for example in SQLite.
Table 1:
Date | CID | text
1/1/90 22:22:22 | 1 | hi
1/1/90 21:22:30 | 1 | How are you
1/1/90 03:22:22 | 3 | hey
Table 2:
ID | date | CID | text
100 | 1/1/89 11:22:11 | 1 | hello
200 | 1/1/90 22:22:22 | 1 | hi
300 | 1/1/90 21:22:30 | 1 | How are you
400 | 1/1/90 03:22:22 | 3 | hey
500 | 1/1/85 02:22:22 | 3 | hey
600 | 1/1/90 03:22:22 | 80 | hey
How do I make the query give me the ID from Table 2 matching the CID and date from Table 1?
Note: if I filter only on the date, rows whose dates happen to match, like IDs 600 and 400, both end up in the result, and that's not what I'm looking for.
Only rows matching both the CID and the date from Table 1 should be listed in the result.
If I understand correctly, you want a join with two conditions:
select t1.*, t2.id
from table1 t1
join table2 t2
  on t1.cid = t2.cid and t1.date = t2.date;
If you want to keep all rows in table1, even those with no matches, then use a left join instead.
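For example, a minimal sketch of that variant (same tables and columns as above; rows of table1 with no match will show NULL for t2.id):

select t1.*, t2.id
from table1 t1
left join table2 t2
  on t1.cid = t2.cid and t1.date = t2.date;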
I have a table in which I am trying to find items that are scheduled for the same start and end time, where two Boolean fields indicate a schedule collision. Here is roughly what the table looks like, without the excess columns:
id | RecordNo | Starttime | Endtime | Description | Bool1 | Bool2
Now, these records have different RecordNo values, but if two records have the same Description, Starttime and Endtime, and one record has Bool1 as FALSE while the other record has Bool2 as TRUE (or vice versa), that would be a schedule collision.
Can someone help me with this query?
For collisions with the exact same starttime and endtime:
with records as (
  select starttime, endtime
  from table
  group by starttime, endtime
  having count(*) > 1
)
select t.recordno
from table t
inner join records r
  on t.starttime = r.starttime
 and t.endtime = r.endtime
but I think you may want overlapping collisions too
select t1.recordno
from table t1
inner join table t2
  on t1.recordno <> t2.recordno
 and ((t1.starttime between t2.starttime and t2.endtime)
   or (t1.endtime between t2.starttime and t2.endtime))
This is a little dangerous though because it will join every record in the table to every record in the table. If you have 10 rows in the table it will create a 100 row set before narrowing it to the results. For 100 rows it will create 10000 rows before narrowing to your results.
rows ^ 2
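If that blow-up is a concern, one common mitigation (just a sketch, not part of the original answer) is to pair each record only with higher record numbers, which avoids matching a record against itself and reporting each pair twice, and to use the symmetric overlap test so that one interval fully containing the other is caught as well:

select t1.recordno, t2.recordno as colliding_recordno
from table t1
inner join table t2
  on t1.recordno < t2.recordno      -- each pair examined once, no self-matches
 and t1.starttime <= t2.endtime     -- symmetric interval-overlap test,
 and t2.starttime <= t1.endtime     -- also catches full containment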
Based on your last comment, you may want the second approach: match on description, exact times, and differing Booleans, which gives you back the colliding records:
select t1.recordno
from table t1
inner join table t2
on t1.starttime=t2.starttime
and t1.endtime=t2.endtime
and t1.description=t2.description
and t1.Bool1 != t2.Bool1
and t1.Bool2 != t2.Bool2
Consider that I have two tables.
One is "Table1" as shown below.
One more table is "Table2" as shown below.
Now, what I need is all the records from Table1 whose IDs are not in Table2's Reference column.
Please guide me on how to do this.
Thanks in advance.
How to do it with your current schema (impossible to use indexes):
SELECT Table1.*
FROM Table1
WHERE NOT EXISTS
(
SELECT 1
FROM Table2
WHERE CONCAT(',', Table2.Reference, ',') LIKE CONCAT('%,', Table1.ID, ',%')
)
This works by wrapping the entire Reference value in commas on both sides. You end up with ,2,3, and ,7,8,9, for your sample data. Then you can safely search for ,<Table1.ID>, within that string without partial matches.
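A quick way to see why the wrapping matters (the 23 below is just a hypothetical ID that would otherwise be a false partial match):

SELECT CONCAT(',', '2,3', ',') LIKE CONCAT('%,', 2, ',%') AS id_2_found,    -- 1: ',2,3,' contains ',2,'
       CONCAT(',', '2,3', ',') LIKE CONCAT('%,', 23, ',%') AS id_23_found;  -- 0: ',23,' does not appear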
How to really do it:
Normalize your database, and get rid of those ugly, useless comma-separated lists.
Fix your table2 to be:
SlNo | Reference
------+-----------
1 | 2
1 | 3
2 | 7
2 | 8
2 | 9
and add a table2Names as:
SlNo | Name
------+---------
1 | Test
2 | Test 2
Then you can simply do:
SELECT Table1.*
FROM Table1
WHERE NOT EXISTS(SELECT 1 FROM Table2 WHERE Table2.Reference = Table1.ID)