SQL JOIN strange behavior - sql

I am very new to SQL.
I have two tables to merge. The following code works:
SELECT *
FROM confirm
JOIN order ON confirm.email = order.email
But this one does not work for me:
SELECT *
FROM confirm
JOIN order ON confirm.custid = order.custid
Both columns (email, custid) are VARCHAR(22). The first query returns the expected results, but the second returns no matching rows. The email and custid values vary in length from record to record. I also tried trim(), to no avail. Any pointers?
custid has the format AB12345: two letters followed by digits.

As Jeremy stated in his comment, if the "custid" does not match exactly, it will not join.
Example:
Table A:
Email: abc@abc.com
CustId: AB12345
Email: abc2@abc.com
CustId: AB12346
Table B:
Email: abc@abc.com
CustId: AB12345
Email: abc3@abc.com
CustId: AB12347
If you do:
SELECT * FROM TableA A
INNER JOIN TableB B on B.CustId = A.CustId
Your result will be:
Email: abc@abc.com and CustId: AB12345
This is because only that CustId appears in both tables. AB12346 (abc2) and AB12347 (abc3) each exist in only one table, so they will not appear in the results.
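The example above can be reproduced with a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here, since the question does not say which database is used. The last query is an added assumption about one possible cause: it shows that, in SQLite, a value with a trailing space is not equal to the same value without it. Other engines may treat trailing spaces differently.

```python
import sqlite3

# In-memory database; the table and column names follow the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Email TEXT, CustId TEXT);
    CREATE TABLE TableB (Email TEXT, CustId TEXT);
    INSERT INTO TableA VALUES ('abc@abc.com', 'AB12345'), ('abc2@abc.com', 'AB12346');
    INSERT INTO TableB VALUES ('abc@abc.com', 'AB12345'), ('abc3@abc.com', 'AB12347');
""")

# Only the CustId present in both tables survives the inner join.
matched = conn.execute("""
    SELECT A.Email, A.CustId
    FROM TableA A
    INNER JOIN TableB B ON B.CustId = A.CustId
""").fetchall()
print(matched)  # [('abc@abc.com', 'AB12345')]

# Assumption for illustration: a hidden trailing space is enough to break an
# exact match in SQLite (the comparison yields 0, i.e. false).
trailing_space_equal = conn.execute("SELECT 'AB12346' = 'AB12346 '").fetchone()[0]
print(trailing_space_equal)  # 0
```

Comparing LENGTH(custid) between the two tables for a value that should match is one way to spot such invisible differences.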

First check both tables for a custid that you expect to match. Filter each table with a WHERE condition, for example select * from tblconfirm where custid = 'AB1234', and run the same check against the other table. If the value is present in both tables, the join will return it; otherwise you get no result.

How to delete records in BigQuery based on values in an array?

In Google BigQuery, I would like to delete a subset of records, based on the value of a specific column. It's a query that I need to run repeatedly and that I would like to run automatically.
The problem is that this specific column is of the form STRUCT<column_1 ARRAY<STRING>, column_2 ARRAY<STRING>, ... >, and I don't know how to use such a column in the WHERE clause of a DELETE statement.
Here is basically what I am trying to do (this code does not work):
DELETE
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
The error that I'm getting is: Syntax error: Expected end of input but got keyword LEFT at [3:1]
If I replace the DELETE with SELECT *, it does work:
SELECT *
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
Does somebody know how to use such a column to delete a subset of records?
EDIT:
Here is some code to create a reproducible example with some silly data (fill in your own dataset and table name in all queries):
Suppose you want to delete all rows where category.type contains the value 'food'.
1 - create a table:
CREATE TABLE <DATASET>.<TABLE_NAME>
(
article STRING,
category STRUCT<
color STRING,
type ARRAY<STRING>
>
);
2 - Insert data into the new table:
INSERT <DATASET>.<TABLE_NAME>
SELECT "apple" AS article, STRUCT('red' AS color, ['fruit','food'] as type) AS category
UNION ALL
SELECT "cabbage" AS article, STRUCT('blue' AS color, ['vegetable', 'food'] as type) AS category
UNION ALL
SELECT "book" AS article, STRUCT('red' AS color, ['object'] as type) AS category
UNION ALL
SELECT "dog" AS article, STRUCT('green' AS color, ['animal', 'pet'] as type) AS category;
3 - Show that select works (return all rows where category.type contains the value 'food'; these are the rows I want to delete):
SELECT *
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
4 - My attempt at deleting rows where category.type contains 'food' does not work:
DELETE
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
Syntax error: Unexpected keyword LEFT at [3:1]
This is the code I used to delete the desired records (the records where category.type contains the value 'food'.)
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS(SELECT 1 FROM UNNEST(t1.category.type) t2 WHERE t2 = 'food')
The embarrassing thing is that I've seen this kind of answer on similar questions (for example on update queries). But I come from Oracle SQL, and I think that there you are required to connect your subquery with your main query in the WHERE clause of the subquery (i.e. connect t1 with t2), so I didn't understand these answers. That's why I posted this question.
However, I learned that in BigQuery the subquery is already correlated with the outer row: UNNEST(t1.category.type) expands the array of the current t1 row, so you don't have to explicitly connect table t1 and 'table' t2.
It is still possible to connect them explicitly (perhaps even recommended?):
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS (SELECT 1 FROM <DATASET>.<TABLE_NAME> t2 LEFT JOIN UNNEST(t2.category.type) AS type WHERE type = 'food' AND t1.article=t2.article)
A second difficulty for me was that the ID in my actual data is hidden inside a nested array/struct construction, so I got stuck connecting t1 and t2. Fortunately this is not always an absolute necessity.
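The Oracle-style pattern described above, where the subquery is explicitly connected to the outer row, can be sketched with Python's sqlite3 module. SQLite has no ARRAY or STRUCT types, so this sketch models category.type as a separate child table; the table names articles and article_types are hypothetical and not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized stand-in for the BigQuery table:
    -- one row per article, and one row per (article, type) pair.
    CREATE TABLE articles (article TEXT PRIMARY KEY, color TEXT);
    CREATE TABLE article_types (article TEXT, type TEXT);
    INSERT INTO articles VALUES
        ('apple', 'red'), ('cabbage', 'blue'), ('book', 'red'), ('dog', 'green');
    INSERT INTO article_types VALUES
        ('apple', 'fruit'), ('apple', 'food'),
        ('cabbage', 'vegetable'), ('cabbage', 'food'),
        ('book', 'object'),
        ('dog', 'animal'), ('dog', 'pet');
""")

# Correlated EXISTS: the subquery is evaluated per outer row and is tied to it
# through t2.article = articles.article.
conn.execute("""
    DELETE FROM articles
    WHERE EXISTS (
        SELECT 1
        FROM article_types t2
        WHERE t2.article = articles.article
          AND t2.type = 'food'
    )
""")

remaining = [row[0] for row in conn.execute("SELECT article FROM articles ORDER BY article")]
print(remaining)  # ['book', 'dog']
```

In the BigQuery version, UNNEST(t1.category.type) plays the role of the join condition, because the array being expanded already belongs to the current row.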
Since you did not provide any sample data I am going to explain using some dummy data. In case you add your sample data, I can update the answer.
Firstly, according to your description, you have only a STRUCT, not an ARRAY<STRUCT<col_1, col_2>>. For this reason, you do not need UNNEST to access the values within the data. Below is an example of how to access a particular field within a STRUCT.
WITH data AS (
SELECT 1 AS id, STRUCT("Alex" AS name, 30 AS age, "NYC" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Leo" AS name, 18 AS age, "Sydney" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Robert" AS name, 25 AS age, "Paris" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Mary" AS name, 28 AS age, "London" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Ralph" AS name, 45 AS age, "London" AS city) AS info
)
SELECT * FROM data
WHERE info.city = "London"
Notice that the STRUCT is named info, and that the field we accessed and used in the WHERE clause is city.
Now, in order to delete the rows that contain a specific value within the STRUCT (in your case I assume it would be your_struct.column_1), you can use DELETE, or MERGE with DELETE. I have saved the above data in a table to execute the examples below, which both produce the same output.
First method: DELETE
DELETE FROM `project.dataset.table`
WHERE info.city = "Sydney"
Second method: MERGE and DELETE
MERGE `project.dataset.table` a
USING (SELECT * FROM `project.dataset.table` WHERE info.city = "Sydney") b
ON a.info.city = b.info.city
WHEN MATCHED AND b.id = 1 THEN
DELETE
And the output for both queries:
Row | id | info.name | info.age | info.city
1 | 1 | Alex | 30 | NYC
2 | 1 | Robert | 25 | Paris
3 | 1 | Ralph | 45 | London
4 | 1 | Mary | 28 | London
As you can see, the row where info.city = "Sydney" was deleted in both cases.
It is important to point out that the deleted rows are removed from your source table itself. Therefore, you should be careful.
Note: Since you want to run this process every day, you could use a scheduled query in the BigQuery Console, appending or overwriting the results after each run. Also, it is good practice not to delete data from your source table. Thus, consider creating a new table from your source table without the rows you do not want.
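The last suggestion, building a new table without the unwanted rows instead of deleting from the source, can be sketched as follows. This is a SQLite stand-in run through Python's sqlite3 module: the STRUCT fields are flattened into plain columns, and the table names source and source_filtered are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flattened version of the dummy data used above (info.name -> name, etc.).
    CREATE TABLE source (id INTEGER, name TEXT, age INTEGER, city TEXT);
    INSERT INTO source VALUES
        (1, 'Alex', 30, 'NYC'),
        (1, 'Leo', 18, 'Sydney'),
        (1, 'Robert', 25, 'Paris'),
        (1, 'Mary', 28, 'London'),
        (1, 'Ralph', 45, 'London');

    -- Keep the source intact and write the filtered rows to a new table.
    CREATE TABLE source_filtered AS
        SELECT * FROM source WHERE city <> 'Sydney';
""")

source_count = conn.execute("SELECT COUNT(*) FROM source").fetchone()[0]
kept_names = [row[0] for row in conn.execute("SELECT name FROM source_filtered ORDER BY name")]
print(source_count)  # 5
print(kept_names)    # ['Alex', 'Mary', 'Ralph', 'Robert']
```

BigQuery offers a comparable CREATE TABLE ... AS SELECT statement, so the same idea should carry over with the STRUCT access written as info.city.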

Find table with only two rows. Access 2016

I'm looking for a query that I can use in MS Access 2016 that returns every CompanyID whose rows have the values "Iphone" and "Ipad", i.e. every CompanyID that has only two rows, with those specific values.
CompanyID Product_Name
1 Iphone
1 Ipad
1 Headphones
2 Iphone
2 Galaxy
3 Playstation 4
3 Nintendo Switch
4 Iphone
4 Ipad
In the example table above I will therefore get the CompanyID = 4.
I have tried to use the same logic as the SQL in this post, but Access does not support the USING syntax.
The SQL query used in post is:
SELECT CompanyID
FROM DATA AS a
JOIN DATA AS b
USING (CompanyID)
WHERE a.Product_Name = "Iphone"
AND b.Product_Name = "Ipad";
Any feedback is much appreciated.
Since you state:
So all CompanyID that has only two rows with specific values... In the
example table above I will therefore get the CompanyID = 4.
It would seem that you require the CompanyID for which the only two Product_Name values are Ipad & Iphone, with no other values associated with the CompanyID.
To obtain this result, I might suggest the following SQL query:
select t.companyid
from data t
group by t.companyid
having max(t.product_name in ('Iphone','Ipad'))=-1
Which will return:
CompanyID
4
Here, for every record within each group of records associated with a given CompanyID, the expression t.product_name in ('Iphone','Ipad') is evaluated.
This expression will either return True (-1) or False (0).
If all records within the group are either 'Iphone' or 'Ipad', then this expression will return True (-1) for every record, and the maximum over the group will be -1.
Whereas, if any record within the group is some other value, this expression will return False (0) and therefore the maximum of the group will be 0, thus excluding it from the result.
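The same grouping idea can be tried outside Access with a minimal sketch in Python's sqlite3 module. Two things differ from the Access query and are assumptions of this sketch: SQLite evaluates a true condition to 1 rather than -1, so the test becomes MIN(...) = 1, and an extra COUNT(DISTINCT ...) = 2 check is added so that a company owning only one of the two products would not qualify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (companyid INTEGER, product_name TEXT);
    INSERT INTO data VALUES
        (1, 'Iphone'), (1, 'Ipad'), (1, 'Headphones'),
        (2, 'Iphone'), (2, 'Galaxy'),
        (3, 'Playstation 4'), (3, 'Nintendo Switch'),
        (4, 'Iphone'), (4, 'Ipad');
""")

# The IN test yields 1 (true) or 0 (false) per row in SQLite.
# MIN(...) = 1 means every row in the group passed the test.
only_both = [row[0] for row in conn.execute("""
    SELECT companyid
    FROM data
    GROUP BY companyid
    HAVING MIN(product_name IN ('Iphone', 'Ipad')) = 1
       AND COUNT(DISTINCT product_name) = 2
    ORDER BY companyid
""")]
print(only_both)  # [4]
```

Company 1 is excluded because its Headphones row makes the minimum 0, which mirrors the maximum of 0 in the Access version.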
You can use an INNER JOIN to filter the results that do not contain both values:
SELECT a.CompanyID
FROM (
SELECT CompanyID
FROM DATA
WHERE Product_Name = 'Iphone'
) a
INNER JOIN (
SELECT CompanyID
FROM DATA
WHERE Product_Name = 'Ipad'
) b ON b.CompanyID = a.CompanyID
Output:
CompanyID
1
4
How does this work?
First, all CompanyIDs that have an Iphone row are gathered. These are then joined with all CompanyIDs that have an Ipad row. Because of the INNER JOIN, only CompanyIDs found in both sets are returned. Note that CompanyID 1 is included even though it also has other products.
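To check the output listed above, the join of the two filtered subqueries can be run against the sample data with Python's sqlite3 module. SQLite stands in for Access here, so minor syntax differences are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DATA (CompanyID INTEGER, Product_Name TEXT);
    INSERT INTO DATA VALUES
        (1, 'Iphone'), (1, 'Ipad'), (1, 'Headphones'),
        (2, 'Iphone'), (2, 'Galaxy'),
        (3, 'Playstation 4'), (3, 'Nintendo Switch'),
        (4, 'Iphone'), (4, 'Ipad');
""")

# Companies with an Iphone row, joined to companies with an Ipad row.
has_both = [row[0] for row in conn.execute("""
    SELECT a.CompanyID
    FROM (SELECT CompanyID FROM DATA WHERE Product_Name = 'Iphone') a
    INNER JOIN (SELECT CompanyID FROM DATA WHERE Product_Name = 'Ipad') b
        ON b.CompanyID = a.CompanyID
    ORDER BY a.CompanyID
""")]
print(has_both)  # [1, 4]
```

This answers "has both products" rather than "has only these two products", which is why company 1 shows up.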
While late to this, it looks like the rows are only being pulled from one table, DATA. If that's the case, the easiest solution should be to join DATA to itself, because a single row cannot have both product names:
SELECT a.CompanyID
FROM DATA AS a INNER JOIN DATA AS b ON a.CompanyID = b.CompanyID
WHERE a.Product_Name = "Iphone"
AND b.Product_Name = "Ipad";
For Access, the FROM statement is generally only used for combining tables, or queries, and the data manipulation is in the other statements.
If there is more than one table and the two tables are connected by CompanyID, it should look more like this:
SELECT DATA1.CompanyID
FROM DATA1 INNER JOIN DATA2 ON DATA1.CompanyID = DATA2.CompanyID
WHERE DATA1.Product_Name = "Iphone"
AND DATA2.Product_Name = "Ipad";

SQL select with conditions and without duplicates

I'm trying to create SQL query to get specific entities with some conditions. The thing is I have some duplicate entities I want to avoid.
My data table (the table represent drivers) is:
I want to get some drivers with condition of specific facility ID/parkinglot ID.
For example, in case I want all the drivers at facility with ID '2', I want to get:
11112 Michael Smith
and not:
11112 Michael Smith
11112 Michael Smith
I want the same behavior when filtering by parkinglot ID, and when filtering by facility ID and parkinglot ID together.
I tried:
SELECT * FROM "DRIVERS"
where facilityid = '2'
group by driverid
And I got an error:
Could not execute 'SELECT * FROM "DRIVERS" where facilityid = '2' group by driverid'
invalid column name: The column 'DRIVERS.FIRSTNAME' is invalid in the select list because the GROUP BY clause or an aggregation function does not contain it: line 1 col 8 (at pos 7)
Any ideas?
Thank you!
If you just want to get the distinct values from the table then do:
SELECT DISTINCT *
FROM "DRIVERS"
WHERE facilityid = '2';
GROUP BY should be used when you are aggregating data.
Edit:
To get the results you asked for you could use:
SELECT DISTINCT DRIVERID, FIRSTNAME, LASTNAME
FROM "DRIVERS"
WHERE facilityid = '2';
You will get the result with this query:
select distinct DRIVERID, FIRSTNAME, LASTNAME FROM "DRIVERS" where facilityid = '2';
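The effect of DISTINCT on an explicit column list can be illustrated with a small sketch in Python's sqlite3 module. The original data table is not shown in the question, so the rows below are hypothetical, as are the PARKINGLOTID column and the second driver; only driver 11112, Michael Smith, comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: one row per driver per parking lot.
    CREATE TABLE "DRIVERS" (
        DRIVERID INTEGER, FIRSTNAME TEXT, LASTNAME TEXT,
        FACILITYID TEXT, PARKINGLOTID TEXT
    );
    INSERT INTO "DRIVERS" VALUES
        (11112, 'Michael', 'Smith', '2', '10'),
        (11112, 'Michael', 'Smith', '2', '11'),
        (11113, 'Jane', 'Doe', '3', '10');
""")

# Without DISTINCT, the driver appears once per matching row.
plain = conn.execute("""
    SELECT DRIVERID, FIRSTNAME, LASTNAME
    FROM "DRIVERS"
    WHERE facilityid = '2'
""").fetchall()

# DISTINCT applies to the whole selected column list, not to a single column.
unique = conn.execute("""
    SELECT DISTINCT DRIVERID, FIRSTNAME, LASTNAME
    FROM "DRIVERS"
    WHERE facilityid = '2'
""").fetchall()

print(len(plain))  # 2
print(unique)      # [(11112, 'Michael', 'Smith')]
```

With this data, SELECT DISTINCT * would still return two rows, because the PARKINGLOTID values differ between them.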

Can I get duplicate results (from one table) in an INTERSECT operation between two tables?

I know the wording of the question is awkward, but I couldn't phrase it any better. Let me explain the situation.
There's table A which has a bunch of columns (a, b, c ... ) and I run a SELECT query on it like so:
SELECT a FROM A WHERE b IN ('....') (the ellipsis indicates a number of values to be matched to)
There's another table B which has a bunch of columns (d, e, f ... ) and I run a SELECT query on it like so:
SELECT d FROM B WHERE f = '...' (the ellipsis indicates a single value to be matched to)
Now I should say here that the two tables store different types of information about the same entity, but the columns a and d contain the exact same data (in this case, an ID). I want to find out the intersection of the two tables so I run this:
SELECT a FROM A WHERE b IN ('....') INTERSECT SELECT d FROM B WHERE f = '...'
Now here's the problem:
The first SELECT contains a set of values in the WHERE clause, right? So let's say the set is (1234, 2345,3456). Now, the result of this query when b is matched ONLY to 1234 is, let's say, abc. When it's matched to 2345, it's def, suppose. And matching to 3456, it gives abc.
Let's suppose these two results (abc and def) are also in the set of results from the second SELECT.
So, now, putting the entire set of values to be matched back into the WHERE clause, the INTERSECT operation will give me abc and def. But I want abc twice, since two values in the WHERE clause set match the second SELECT.
Is there any way I can get that?
I hope it's not too complicated to understand my problem. This is a real-life problem I'm facing in my job.
Data structure and my code
Table A contains general information about a company:
company_id | branch_id | no_of_employees | city
Table B contains the financials of the company:
company_id | branch_id | revenue | profits
First SELECT:
SELECT branch_id FROM A WHERE CITY IN ('Dallas', 'Miami', 'New Orleans')
Now, running each city separately in the first SELECT, I get the branch_ids:
branch_id | city
23 | Dallas
45 | Miami
45 | New Orleans
Once again, this seems impractical as to how two cities can have the same branch ids, but please bear with me on this.
Second SELECT:
SELECT branch_id FROM B
WHERE REVENUE = 5000000
I know this is a little impractical, but for the purpose of this example, it suffices.
Running this query I get the following set:
11
23
45
22
10
So the INTERSECT will give me just 23 and 45. But I want 45 twice, since both Miami and New Orleans have that branch_id and that branch_id has generated a revenue of 5 million.
Directly from Microsoft's documentation (https://msdn.microsoft.com/en-us/library/ms188055.aspx):
"INTERSECT returns distinct rows that are output by both the left and right input queries operator."
So NO, it is not possible to get the same value twice when using INTERSECT because the results will be DISTINCT. However if you build an INNER JOIN correctly you can do essentially the same thing as INTERSECT except keep the repetitive results by NOT using distinct or group by.
SELECT
A.a
FROM
A
INNER JOIN B
ON A.a = B.d
AND B.F = '....'
WHERE b IN ('....')
And for the specific example you added:
SELECT
A.branch_id
FROM
A
INNER JOIN B
ON A.branch_id = B.branch_id
AND B.REVENUE = 5000000
WHERE A.CITY IN ('Dallas', 'Miami', 'New Orleans')
You overcomplicated your task a lot:
SELECT *
FROM A
WHERE CITY IN (...)
AND EXISTS
(
SELECT 1 FROM B
WHERE B.REVENUE = 5000000
AND B.branch_id = A.branch_id
)
INTERSECT and EXCEPT both return row sets with DISTINCT applied.
They are set operations and do not perform regular joining or filtering.
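The difference between the two approaches can be seen on the question's sample data with a minimal sketch in Python's sqlite3 module. The branch_id, city and revenue values come from the question; the company_id, no_of_employees and profits values are made up to fill the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (company_id INTEGER, branch_id INTEGER, no_of_employees INTEGER, city TEXT);
    CREATE TABLE B (company_id INTEGER, branch_id INTEGER, revenue INTEGER, profits INTEGER);
    INSERT INTO A VALUES
        (1, 23, 100, 'Dallas'), (2, 45, 50, 'Miami'), (3, 45, 70, 'New Orleans');
    INSERT INTO B VALUES
        (1, 11, 5000000, 1), (1, 23, 5000000, 1), (2, 45, 5000000, 1),
        (4, 22, 5000000, 1), (5, 10, 5000000, 1);
""")

# INTERSECT removes duplicates, so branch 45 appears only once.
intersected = sorted(row[0] for row in conn.execute("""
    SELECT branch_id FROM A WHERE city IN ('Dallas', 'Miami', 'New Orleans')
    INTERSECT
    SELECT branch_id FROM B WHERE revenue = 5000000
"""))

# The INNER JOIN keeps one output row per matching row of A.
joined = [row[0] for row in conn.execute("""
    SELECT A.branch_id
    FROM A
    INNER JOIN B
        ON A.branch_id = B.branch_id
       AND B.revenue = 5000000
    WHERE A.city IN ('Dallas', 'Miami', 'New Orleans')
    ORDER BY A.branch_id
""")]

print(intersected)  # [23, 45]
print(joined)       # [23, 45, 45]
```

If B contained several rows for the same branch_id, the join would multiply the matches further, whereas the EXISTS form returns each row of A at most once.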

SQL - include results you are looking for in a column and set all other values to null

I have two tables, one with orders and another with order comments. I want to join these two tables. They are joined on a column "EID" which exists in both tables. I want all orders. I also want to see all comments with only certain criteria AND all other comments should be set to null. How do I go about this?
Orders Table
Order_Number
1
2
3
4
Comments Table
Comments
Cancelled On
Ordered On
Cancelled On
Cancelled On
In this example I would like to see for my results:
Order_Number | Comments
1 | Cancelled On
2 | Null
3 | Cancelled On
4 | Cancelled On
Thanks!
This seems like a rather trivial left join.
select o.order_number, c.comments
from orders o
left join comments c
on o.eid = c.eid
and c.comments = 'Cancelled On' -- replace with your criteria for comments
Tested on Oracle, there might be subtle syntax differences for other DB engines.
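The reason the criteria belong in the ON clause rather than in WHERE can be shown with a small sketch in Python's sqlite3 module. The EID values are assumptions, since the question does not list them, and the criterion comments = 'Cancelled On' is inferred from the desired result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (eid INTEGER, order_number INTEGER);
    CREATE TABLE comments (eid INTEGER, comments TEXT);
    INSERT INTO orders VALUES (1, 1), (2, 2), (3, 3), (4, 4);
    INSERT INTO comments VALUES
        (1, 'Cancelled On'), (2, 'Ordered On'), (3, 'Cancelled On'), (4, 'Cancelled On');
""")

# Criteria in ON: every order is kept, non-matching comments become NULL.
in_on = conn.execute("""
    SELECT o.order_number, c.comments
    FROM orders o
    LEFT JOIN comments c
        ON o.eid = c.eid
       AND c.comments = 'Cancelled On'
    ORDER BY o.order_number
""").fetchall()

# Criteria in WHERE: the filter runs after the join and drops order 2.
in_where = conn.execute("""
    SELECT o.order_number, c.comments
    FROM orders o
    LEFT JOIN comments c ON o.eid = c.eid
    WHERE c.comments = 'Cancelled On'
    ORDER BY o.order_number
""").fetchall()

print(in_on)     # [(1, 'Cancelled On'), (2, None), (3, 'Cancelled On'), (4, 'Cancelled On')]
print(in_where)  # [(1, 'Cancelled On'), (3, 'Cancelled On'), (4, 'Cancelled On')]
```

Python shows SQL NULL as None, so order 2 matches the desired "2 | Null" row only in the first query.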
It depends on what you want:
Are you trying to SET the other comments to null (replace the values in the table)?
or
Are you trying to DISPLAY the other comments as null (leave the table unchanged)?
If you want to change the values in the table, use:
UPDATE `table` SET `column` = null WHERE condition;
otherwise use:
SELECT column FROM table LEFT JOIN othertable ON join_condition AND condition;