SQL SELECT query where the IDs were already found - sql

I have 2 tables:
Table A has 3 columns (for example) with opportunity sales header data:
OPP_ID, CLOSE_DTTM, STAGE
Table B has 3 columns with the individual line items for the Opportunities:
OPP_LINE_ID, OPP_ID, AMOUNT_USD
I have a select statement that correctly parses through Table A and returns a list of Opportunities. What I would like to do is, without joining the data, to have a SELECT statement that will get data from Table B but only for the OPP_IDs that were found in my first query.
The result should be 2 views/resultset (one for each select query) and not just 1 combined view where Table B is joined to Table A.
The reason why I want to keep them separate is because I will have to perform a few manipulations to the result from table B and i don't want the result from table A affected.

Subquery is all what you need
SELECT OPP_ID, CLOSE_DTTM, STAGE
From table a
where a.opp_id IN (Select opp_id from table b)

Presuming you're using this in some client side data access library that represents B's data in some 2 dimensional collection and you want to manipulate it without affecting/ having A's data present in that collection:
Identify the records in A:
SELECT * FROM a WHERE somecolumn = 'somevalue'
Identify the records in B that relate to A, but don't return A's data:
SELECT b.* FROM a JOIN b ON a.opp_id = b.opp_id WHERE a.somecolumn = 'somevalue'
Just because JOIN is used doesn't mean your end-consuming program has to know about A's data. You could also use IN, like the other answer does, but internally the database will rewrite them to be the same thing anyway

I tend to use exists for this type of query:
select b.*
from b
where exists (select 1 from a where a.opp_id = b.opp_id);
If you want two results sets, you need to run two queries. It is unclear what the second query is, perhaps the first query on A.

Related

How to choose all column of a single table in a joined table?

I have a query that is like this:
SELECT * FROM A JOIN B
Table A and B are almost equal, but I use the join to filter on certain rows. Now my result table is having all columns twice, and this is causing referencing trouble where I am using this table. Therefore I would like to only show table A's columns. Is there a more easy way to achieve this than retyping all column names on the spot of *?
If you only want the columns from A, then use:
select A.*
You can add additional columns as you like:
select A.*, B.special_column

BigQuery: Select column if it exists, else put NULL?

I am updating a daily dashboard. Assume everyday I have two tables TODAY and BEFORE_TODAY. What I have been doing daily was something like:
SELECT a, b FROM TODAY
UNION ALL
SELECT a,b FROM BEFORE_TODAY;
TODAY table is generated daily and is appended to all the data before it. Now, I need to generate a new column say c and in order to UNION ALL the two, I need that to be available on BEFORE_TODAY as well.
How can I add a conditional statement to BEFORE_TODAY to check if I have a c column and use that column, else use NULL instead of it.
Something is wrong with your data model. You should be putting the data into separate partitions of the same table. Then you would have no problems. You could just query the master table for the partitions you want. The column would appear in the history, with a NULL value.
That is the right solution. You can create a hacked solution, assuming that your tables have a primary key. For instance, if a, b is unique on each row, you could do:
select t.a, t.b, t.c
from today t
union all
select b.a, b.b,
(select c -- not qualified on purpose
from before_today b2
where b2.a = b.a and b2.b = b.b
) as
from before_today b cross join
(select null as c) c;
If b.c does not exist, then the subquery returns c.c. If it does exist this it returns b2.c from the subquery.
Although by combining dynamic SQL and INFORMATION_SCHEMA, you can write a script to achieve what you told, this doesn't seems to be the right thing to do.
As part of your data model planning, you should add column c to BEFORE_TODAY ahead of time. The newly added column on existing rows will always have NULL value. Then you can add column to TODAY and reference column c as normal.

SQL Inner Join w/ Unique Vals

Questions similar to this one about using DISTINCT values in an INNER JOIN have been asked a few times, but I don't see my (simple) use case.
Problem Description:
I have two tables Table A and Table B. They can be joined via a variable ID. Each ID may appear on multiple rows in both Table A and Table B.
I would like to INNER JOIN Table A and Table B on the distinct values of ID which appear in Table B and select all rows of Table A with a Table A.ID which appears matching some condition in Table B.
What I want:
I want to make sure I get only one copy of each row of Table A with a Table A.ID matching a Table B.ID which satisfies [some condition].
What I would like to do:
SELECT * FROM TABLE A
INNER JOIN (
SELECT DISTINCT ID FROM TABLE B WHERE [some condition]
) ON TABLE A.ID=TABLE B.ID
Additionally:
As a further (really dumb) constraint, I can't say anything about the SQL standard in use, since I'm executing the SQL query through Stata's odbc load command on a database I have no information about beyond the variable names and the fact that "it does accept SQL queries," ( <- this is the extent of the information I have).
If you want all rows in a that match an id in b, then use exists:
select a.*
from a
where exists (select 1 from b where b.id = a.id);
Trying to use join just complicates matters, because it both filters and generates duplicates.

SQL Query returns more

I'm having a bit of a problem with a SQL Query that returns too many results. I'm fairly new to SQL so please bear with me.
Please see the following:
Table Structures
The Query that I use looks like:
SELECT TABLE_B.*
FROM
TABLE_A
JOIN
TABLE_B
ON
TABLE_A.COMMON_ID=TABLE_B.COMMON_ID
AND TABLE_A.SEQ_3C=TABLE_B.SEQ_3C
JOIN
TABLE_C
ON
TABLE_A.COMMON_ID=TABLE_C.EMPLID
WHERE
TABLE_B.ITEM_STATUS<>'C'
and TABLE_A.CHECKLIST_STATUS='I'
and TABLE_A.ADMIN_FUNCTION='ADMA'
and TABLE_A.CHECKLIST_CD='APPL'
and TABLE_A.COMMON_ID = '123456789'
and TABLE_C.ADMIT_TERM='2171'
and TABLE_C.INSTITUTION='SOMEWHERE'
I just want the results from Table_B and not what it's giving me.
Please explain this to me as I have spent 3 days on it non-stop.
What am I missing?
You want data from TABLE_B? Then select from it only and have the conditions on the other tables in your where clause.
The inner joins on the other tables serve as existence tests, I assume? Don't do that. You'd only multiply your records, just as you are doing now, only to have to dismiss duplicates later. That can cause bad performance on large tables and errors in more complicated queries. Use EXISTS or IN instead.
select *
from table_b
where item_status <> 'C'
and (common_id, seq_3c) in
(
select common_id, seq_3c
from table_a
where checklist_status = 'I'
and admin_function = 'ADMA'
and checklist_cd = 'APPL'
)
and common_id in
(
select EMPLID
from table_c
where admit_term = '2171'
and institution = 'SOMEWHERE'
);
SELECT DISTINCT TABLE_B.*
FROM
TABLE_A
JOIN
TABLE_B
ON
TABLE_A.COMMON_ID=TABLE_B.COMMON_ID
AND TABLE_A.SEQ_3C=TABLE_B.SEQ_3C
JOIN
TABLE_C
ON
TABLE_A.COMMON_ID=TABLE_C.EMPLID
WHERE
TABLE_B.ITEM_STATUS<>'C'
and TABLE_A.CHECKLIST_STATUS='I'
and TABLE_A.ADMIN_FUNCTION='ADMA'
and TABLE_A.CHECKLIST_CD='APPL'
and TABLE_A.COMMON_ID = '123456789'
and TABLE_C.ADMIT_TERM='2171'
and TABLE_C.INSTITUTION='SOMEWHERE'
This should be easy to understand without looking at all your tables and output.
Suppose you join two tables, A and B, on a column id. You only want the columns from table B, and in table B the `id' column is a unique identifier.
Even so, if in table A an id (the same id) appears five times, the join will have five rows for that id. Then you just select the columns from table B, so it will look like you got the same row five different times.
Perhaps you don't really need a join? What is your underlying problem you are trying to solve?
It's hard to answer this question without more information about why you're executing these joins. I can explain why you're getting the results you're getting, and hopefully that will allow you to solve the problem yourself.
You start, in your FROM clause, with table A. You join this table with table B on matching COMMON_ID, which, based on the tables you provide, returns three matches for the one record you have in table A. This increases your result set size to three records. Next, you join these three records with table C, on matching ID. Because all ID's are, in fact, identical, this returns nine matches for every record in your current result set: you now have 9 x 3 = 27 records in your result set.
Finally, the WHERE clause comes into effect. This clause excludes 6 out of 9 records in table C, so you have 3 of those records left. Your final result set is therefore 1 (table A) x 3 (table B) x 3 (table C) = 9 records.

How to get names present in both views?

I have a very large view containing 5 million records containing repeated names with each row having unique transaction number. Another view of 9000 records containing unique names is also present. Now I want to retrieve records in first view whose names are present in second view
select * from v1 where name in (select name from v2)
But the query is taking very long to run. Is there any short cut method?
Did you try just using a INNER JOIN. This will return all rows that exist in both tables:
select v1.*
from v1
INNER JOIN v2
on v1.name = v2.name
If you need help learning JOIN syntax, here is a great visual explanation.
You can add the DISTINCT keyword which will remove any duplicate values that the query returns.
use JOIN.
The DISTINCT will allow you to return only unique records from the list since you are joining from the other table and there could be possibilities that a record may have more than one matches on the other table.
SELECT DISTINCT a.*
FROM v1 a
INNER JOIN v2 b
ON a.name = b.name
For faster performance, add an index on column NAME on both tables since you are joining through it.
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins