How to use Except clause in Bigquery? - sql

I am trying to use the existing Except clause in Bigquery. Please find my query below
select * EXCEPT (b.hosp_id, b.person_id,c.hosp_id) from
person a
inner join hospital b
on a.hosp_id= b.hosp_id
inner join reading c
on a.hosp_id= c.hosp_id
As you can see I am using 3 tables. All the 3 tables have the hosp_id column, so I would like to remove duplicate columns which are b.hosp_id and c.hosp_id. Simlarly, I would like to remove b.person_id column as well.
When I execute the above query, I get the syntax error as shown below
Syntax error: Expected ")" or "," but got "." at [9:19]
Please note that all the columns that I am using in Except clause is present in the tables used. Additional info is all the tables used are temp tables created using with clause. When I do the same manually by selecting column of interest, it works fine. But I have several columns and can't do this manually.
Can you help? I am trying to learn Bigquery. Your inputs would help

I use the EXCEPT on a per-table basis:
select p.* EXCEPT (hosp_id, person_id),
h.*,
r.* EXCEPT (hosp_id)
from person p inner join
hospital h
on p.hosp_id = h.hosp_id inner join
reading r
on p.hosp_id = r.hosp_id;
Note that this also uses meaningful abbreviations for table aliases, which makes the query much simpler to understand.
In your case, I don't think you need EXCEPT at all if you use the USING clause.

Try this instead:
select * EXCEPT (person_id) from
person a
inner join hospital b
using (hosp_id)
inner join reading c
using (hosp_id)
You can only put column names (not paths) in the EXCEPT list, and you can simply avoid projecting the duplicate columns with USING instead of ON.

Related

Best way to "SELECT *" from multiple tabs in sql without duplicate?

I am trying to retrieve every data stored in 2 tabs from my database through a SELECT statement.
The problem is there are a lot of columns in each tab and manually selecting each column would be a pain in the ass.
So naturally I thought about using a join :
select * from equipment
join data
on equipment.id = data.equipmentId
The problem is I am getting the equipment ID 2 times in the result.
I thought that maybe some specific join could help me filter out the duplicate key, but I can't manage to find a way...
Is there any way to filter out the foreign key or is there a better way to do the whole thing (I would rather not have to post process the data to manually remove those duplicate columns)?
You can use USING clause.
"The USING clause specifies which columns to test for equality when
two tables are joined. It can be used instead of an ON clause in the
JOIN operations that have an explicit join clause."
select *
from test
join test2 using(id)
Here is a demo
You can also use NATURAL JOIN
select *
from test
natural join test2;

Unnesting 3rd level dependency in Google BigQuery

I'm trying to Replace the schema in existing table using BQ. There are certain fields in BQ which have 3-5 level schema dependency.
For Ex. comsalesorders.comSalesOrdersInfo.storetransactionid this field is nested under two fields.
Since I'm using this to replace existing table, I can not change the field names in query.
The query looks similar to this
SELECT * REPLACE(comsalesorders.comSalesOrdersInfo.storetransactionid AS STRING) FROM CentralizedOrders_streaming.orderStatusUpdated, UNNEST(comsalesorders) AS comsalesorders, UNNEST(comsalesorders.comSalesOrdersInfo) AS comsalesorders.comSalesOrdersInfo
BQ enables unnesting first schema field but presents problem for 2nd nesting.
What changes do I need to make to this query to use UNNEST() for such depedndent schemas ?
Given that you don't have a schema, I will try to provide a generalized answer. Please try to understand the difference between the 2 queries.
-- Provide an alias for each unnest (as if each is a separate table)
select c.stuff
from table
left join unnest(table.first_level_nested) a
left join unnest(a.second_level_nested) b
left join unnest(b.third_level_nested) c
-- b and c won't work here because you are 'double unnesting'
select c.stuff
from table
left join unnest(table.first_level_nested) a
left join unnest(first_level_nested.second_level_nested) b
left join unnest(first_level_nested.second_level_nested.third_level_nested) c
I'm not sure I understand your question, but as I could guess, you want to change one column type to another type, such as STRING.
The UNNEST function is only used with columns that are array types, for example:
"comsalesorders":["comSalesOrdersInfo":{}, comSalesOrdersInfo:{}, comSalesOrdersInfo:{}]
But not with this kind of columns:
"comSalesOrdersInfo":{"storeTransactionID":"X1056-943462","ItemsWarrenty":0,"currencyCountry":"USD"}
Therefore, if a didn't misunderstand your question, I would make a query like this:
SELECT *, CAST(A.comSalesOrdersInfo.storeTransactionID as STRING)
FROM `TABLE`, UNNEST(comsalesorders) as A

ORA-00904 Invalid Identifier for a query involving inner join

I have a very simple query that I am trying to execute:
select *
from submissions
inner join (
select *
from hackers
inner join challenges
on hackers.hacker_id = challenges.hacker_id
) dtable
on submissions.challenge_id = dtable.challenge_id
and submissions.hacker_id = dtable.hacker_id;
Oracle rejects it with:
ORA-00904: "DTABLE"."HACKER_ID": invalid identifier.
I have kept the alias dtable visible by keeping it outside the brackets. Why is Oracle rejecting my query?
SELECT * in your sub-query is a problem.
You have the same column name in both tables that are being joined. This means that you're trying to create an inline-view called dtable where at least two columns have the same name (in this case both tables have a hacker_id column and your use of * is essentially saying "use them both".). You can't do that.
You're going to need SELECT hackers.a, hackers.b, challenges.x, challenges.y, etc, etc in your sub-query. By being explicit in this way you can ensure that no two columns have the same name.
An alternative could be SELECT hackers.*, challenges.a AS c_a, challenges.b AS c_b, etc.
Either way you are being explicit about which fields to pick up, what their positions and names are, etc. The end result being that you can then avoid columns having the same name as other columns.
You do not need subqueries for this. Your query is not actually "simple". The simple form looks more like this:
select . . .
from submissions s join
hackers h
on s.hacker_id = h.hacker_id join
challenges c
on s.challenge_id = c.challenge_id;
Note that I removed the condition between challenge and hackers on hacker_id. That extra join condition doesn't really make sense to me (although it might make sense if you provided sample data).
As others have said: The sub-select selects two different columns hacker_id from two different tables. This confuses Oracle.
But there is no need for a sub-select
select * from
submissions
inner join challenges
on submissions.challenge_id = challenges.challenge_id
inner join hackers
on submissions.hacker_id = hackers.hacker_id;

Column in field list is ambiguous error

i've been recently working in mysql and in one of the requests i wrote :
SELECT SIGLE_EEP, ID_SOUS_MODULE, LIBELLE
FROM mef_edi.eep a, mef_edi.envoi e, mef_edi.sous_module s
WHERE a.ID_EEP = e.ID_EEP
AND a.ID_SOUS_MODULE = s.ID_SOUS_MODULE;
and they told me :
Column ID_SOUS_MODULE in field list is ambiguous
What should i do ?
More than one table has a column named ID_SOUS_MODULE.
So you need to name the table every time you mention the column to specify which table you mean.
Change
SELECT ID_SOUS_MODULE
for instance to
SELECT a.ID_SOUS_MODULE
I agree with the answer above, you may have duplicate column names across your 3 tables, assigning the table id (a, e, s) as noted above will avoid that issue in the select. In addition to what #juergen said you may want to get rid of that cartesian join by using an inner or left join (inner seems to be what your going for). The way you are joining your table you are joining every possible combination of rows together than filtering. using a proper join will get you better performance in the long run as your table line counts grow. Here is an example of a non cartesian join:
SELECT SIGLE_EEP, ID_SOUS_MODULE, LIBELLE
FROM mef_edi.eep a
INNER JOIN mef_edi.envoi e ON (a.ID_EEP = e.ID_EEP)
INNER JOIN mef_edi.sous_module s ON (a.ID_SOUS_MODULE = s.ID_SOUS_MODULE)

SQL Server: select over several tables / conditions

Okay, I'm relatively new to the more advanced uses of SQL Server.
I have several tables that I need to gather informations from, and several of these tables links to other tables where I need a specific information. As a result, I just want one row with all the information, preferential named with aliases.
For example:
Tab_Transcoders:
ID, VideoCamID, InputStreamID, OutputStreamID.
where InputStream links to another table where I need the row of the matching ID, where in this row are other ID's (e.g. StreamType_ID that belongs to a third table containing ID_StreamType and Description etc.)
Same with OutputStreamID, same with VideoCamID.
In the end, I need a row containing for example:
ID, VideoCamID, InputStreamID, InputStreamType, InputStreamTypeDesc,
OutputStreamID, OutputStreamType, OutputStreamDesc, VideoCamID, etc. etc. etc.
It is important for me that I can set aliases, as for example InputStreamID & OutputStreamID links to the same table where all my Streams are listed (with IP's, Descs..)
I can accomplish this with doing like 100 SELECTS & SUBSELECTS, but I don't think that's an appropriate way.
I read some informations about things like CURSOR, UNION, FETCH, JOIN etc. etc.. but I don't know which one I have to use for my purpose.
eli
I think you want something like the following....
Select
t.ID,
t.VideoCamID,
i.InputStreamID,
is.StreamType as InputStreamType,
is.StreamDesc as InputStreamDesc,
o.OutputStreamID,
os.StreamType as OutputStreamType,
os.StreamDesc as OutputStreamDesc,
v.VideoCamID
from
Tab_Transcoders t
inner join InputStreams i on i.InputStreamID=t.InputStreamId
inner join Streams is on is.StreamId=i.StreamId
inner join OutputStreams o on o.OutputStreamId=t.OutputStreamId
inner join Streams os on os.StreamID=o.StreamId
inner join VideoCams v on v.VideoCamId=t.VideoCamID
If there is a defined relationship between your tables, then Use Join.
e.g Customer Order
Order will have customer Id
Select Order.ID,Order.Quantity, Order.CustomerId, Customer.FullName, Customer.Address
From Orders Order
Join
Customer
On
Order.CustomerId = Customer.CustomerId
First start by getting data from two tables using the join and then if it works as per your requirement, add another required table in the join.
Read about SQl JOINS.. It is fairly simple.
I will recommend reading you some of the articles around CTE aka Common Table Expression.
Refer http://msdn.microsoft.com/en-us/library/ms190766%28v=sql.105%29.aspx.
Apart from this never use subqueries. Try to use inner join / any other join if possible.