I'm using Hive 1.2.1 and I'm running into problems when trying to join using a subquery.
My main table is applications and I'm trying to join it to the credits table, based on account and dates. The date condition is giving me trouble when I try to get just one row (the credit has to be after the application, and there can be only one, to avoid duplicates in the join). I'm using the following code:
SELECT COUNT(1)
FROM applications apps
LEFT JOIN credits c
ON c.python_id =
(
SELECT python_id
FROM credits cr
WHERE cr.ind in ('NP','0P')
AND cr.acct_nbr = apps.acct_nbr
AND cr.date >= apps.date
ORDER BY cr.date DESC
LIMIT 1
)
I'm getting the following error:
[Code: 40000, SQL State: 42000] Error while compiling statement: FAILED: ParseException line 8:24 cannot recognize input near 'SELECT' 'python_id' 'FROM' in expression specification
Could you please help?
Thank you
The issues with your query are:
> Hive does not support a subquery with an equals (=) comparison; you can write a subquery only with IN, NOT IN, EXISTS and NOT EXISTS.
> You cannot have a subquery that returns more than one row.
Please look into https://cwiki.apache.org/confluence/display/Hive/Subqueries+in+SELECT
There is an issue with your logic as well.
My understanding is that you are trying to get a count from the main table with a left join, and there is no filter condition on the outer query to say which records you want.
So the count will always equal the number of records in the main table (applications), since each application matches at most one credit. If you can provide sample data with the expected input and output, we can help you with the query.
Hope this helps.
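A minimal sketch of the point about the count, using Python's built-in sqlite3 as a stand-in for Hive (an assumption; the sample rows and the app_id column are hypothetical). It simplifies the join to acct_nbr only and assumes each application matches at most one credit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applications (app_id INTEGER, acct_nbr TEXT);
CREATE TABLE credits (python_id INTEGER, acct_nbr TEXT);
INSERT INTO applications VALUES (1, 'A'), (2, 'B'), (3, 'C');
-- hypothetical data: at most one credit per account, none for 'C'
INSERT INTO credits VALUES (10, 'A'), (20, 'B');
""")

# Number of rows in the main table.
total_apps = conn.execute("SELECT COUNT(1) FROM applications").fetchone()[0]

# COUNT(1) over the LEFT JOIN counts every application, matched or not.
joined = conn.execute("""
    SELECT COUNT(1)
    FROM applications apps
    LEFT JOIN credits c ON c.acct_nbr = apps.acct_nbr
""").fetchone()[0]

# COUNT(c.python_id) skips the NULLs produced for unmatched applications.
matched = conn.execute("""
    SELECT COUNT(c.python_id)
    FROM applications apps
    LEFT JOIN credits c ON c.acct_nbr = apps.acct_nbr
""").fetchone()[0]

print(total_apps, joined, matched)
```

Counting a column from the joined table, rather than COUNT(1), is one way to count only the applications that actually found a credit.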
You should join on acct_nbr and use the date as a filter in the WHERE clause.
SELECT COUNT(1)
FROM applications apps
JOIN credits c
ON c.acct_nbr = apps.acct_nbr
WHERE c.ind in ('NP','0P')
AND c.date >= apps.date
-- No ORDER BY ... LIMIT 1 here: COUNT(1) already returns a single row,
-- and c.date cannot be sorted on after aggregation.
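If the goal is one credit per application, one option is to keep the equality join, filter in WHERE, and aggregate per application. The sketch below uses Python's sqlite3 as a stand-in for Hive (an assumption), with a hypothetical app_id key and made-up rows. MAX mirrors the ORDER BY cr.date DESC LIMIT 1 in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applications (app_id INTEGER, acct_nbr TEXT, date TEXT);
CREATE TABLE credits (python_id INTEGER, acct_nbr TEXT, ind TEXT, date TEXT);
INSERT INTO applications VALUES (1, 'A', '2020-01-10'), (2, 'B', '2020-02-01');
-- hypothetical rows: 10 is before the application, 13 has the wrong ind
INSERT INTO credits VALUES
  (10, 'A', 'NP', '2020-01-05'),
  (11, 'A', 'NP', '2020-01-15'),
  (12, 'A', '0P', '2020-03-01'),
  (13, 'B', 'XX', '2020-02-10');
""")

# Equality join on the account, date and ind filters in WHERE,
# then one output row per application via GROUP BY.
rows = conn.execute("""
    SELECT apps.app_id, MAX(c.date) AS credit_date
    FROM applications apps
    JOIN credits c ON c.acct_nbr = apps.acct_nbr
    WHERE c.ind IN ('NP', '0P')
      AND c.date >= apps.date
    GROUP BY apps.app_id
    ORDER BY apps.app_id
""").fetchall()

print(rows)  # [(1, '2020-03-01')]
```

Application 2 disappears because the filter sits in WHERE on an inner join; keeping unmatched applications would need a LEFT JOIN against a pre-filtered credits subquery instead.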
I had a working sample query earlier in my code, shown below.
SELECT DISTINCT
nombre_aplicacion,
APLICACION,
NOMBRE_APLCODE,
DESCRIPCION,
AREAFUNC
FROM (
select CODAPLICATION nombre_aplicacion,
APLICACION,
NOMBRE_APLCODE,
DESCRPTION,
AREAFUNC
from admin.VW_APLICACIONES#dblink,
admin.VW_PRODUCTOS#dblink
where nombre_aplicacion (+) = CODAPLICATION
)
WHERE 1=1
ORDER BY nombre_aplicacion ASC;
When I try a similar query against different tables, I get the error ORA-00904: "NOMBRE_APLICACION": invalid identifier.
If I remove nombre_aplicacion (+) = CODAPLICATION from the WHERE condition, the query returns results. Can anyone suggest why I get this error now when the sample query worked earlier? Is this join valid?
The query is not valid as:
In the inner sub-query you select areafunc and in the outer query you use area which does not appear in the inner sub-query so will not be available.
In the inner sub-query, you define CODAPLICATION to have the alias nombre_aplicacion and then you try to use that alias in the WHERE clause as a join condition; that will not work.
You have not described which column belongs to which table but you want something like:
SELECT DISTINCT
a.codaplication AS nombre_aplicacion,
a.aplicacion,
a.nombre_aplcode,
p.descrption,
p.areafunc
from APLICACIONES a
LEFT OUTER JOIN PRODUCTOS p
ON (a.primary_key_column = p.foreign_key_column)
ORDER BY nombre_aplicacion ASC;
Note: you are going to have to correct the code to give the correct table aliases for each column and give the correct columns for the join condition.
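A small runnable sketch of the same shape, using Python's sqlite3 rather than Oracle (an assumption, so the (+) syntax and the database link are not reproduced). The join key codaplication on both tables and the sample rows are hypothetical. The join condition uses real column names, and the alias appears only in the SELECT list and ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical tables and join key, for illustration only
CREATE TABLE aplicaciones (codaplication TEXT, aplicacion TEXT);
CREATE TABLE productos (codaplication TEXT, areafunc TEXT);
INSERT INTO aplicaciones VALUES ('APP1', 'Billing'), ('APP2', 'Payroll');
INSERT INTO productos VALUES ('APP1', 'Finance');
""")

# The ON clause refers to the underlying columns, not the SELECT alias.
rows = conn.execute("""
    SELECT DISTINCT
        a.codaplication AS nombre_aplicacion,
        a.aplicacion,
        p.areafunc
    FROM aplicaciones a
    LEFT OUTER JOIN productos p
        ON a.codaplication = p.codaplication
    ORDER BY nombre_aplicacion ASC
""").fetchall()

print(rows)  # [('APP1', 'Billing', 'Finance'), ('APP2', 'Payroll', None)]
```

APP2 has no matching product, so the outer join keeps it with a NULL areafunc.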
From this question
Write a select query for getting table value using another table field value
I tried this query
select guardian_nm,guardian_age
from guardian
where stu_uid IN (
select stu_uid from student where stu_id=1 order by timestamp desc limit 1)
But getting the following error
Error code is -4743 ATTEMPT TO USE A FUNCTION WHEN THE APPLICATION
COMPATIBILITY SETTING IS SET FOR A PREVIOUS LEVEL
Can anyone help me, please?
You are using LIMIT 1, so you can compare stu_uid with = instead of IN.
I would also suggest not using MySQL keywords, such as timestamp, as column names.
Use a join instead of a subquery.
The advantage of a join includes that it executes faster. The
retrieval time of the query using joins almost always will be faster
than that of a subquery. By using joins, you can maximize the
calculation burden on the database i.e., instead of multiple queries
using one join query.
Try:
select guardian.guardian_nm, guardian.guardian_age
from guardian
inner join
student
on guardian.stu_uid=student.stu_uid
where student.stu_id=1
order by student.timestamp desc limit 1 ;
Working demo: http://sqlfiddle.com/#!9/923c30/3
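To compare the two forms side by side, here is a sketch in Python's sqlite3 (an assumption: it is not the database that raised error -4743, so it does not reproduce that error). The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (stu_uid INTEGER, stu_id INTEGER, timestamp TEXT);
CREATE TABLE guardian (stu_uid INTEGER, guardian_nm TEXT, guardian_age INTEGER);
-- hypothetical rows: student 1 has two records, the later one is stu_uid 101
INSERT INTO student VALUES
  (100, 1, '2020-01-01'), (101, 1, '2020-06-01'), (102, 2, '2020-07-01');
INSERT INTO guardian VALUES (100, 'Ann', 40), (101, 'Bob', 45), (102, 'Cy', 50);
""")

# Scalar subquery compared with =, since LIMIT 1 returns at most one row.
via_subquery = conn.execute("""
    SELECT guardian_nm, guardian_age
    FROM guardian
    WHERE stu_uid = (SELECT stu_uid FROM student
                     WHERE stu_id = 1
                     ORDER BY timestamp DESC LIMIT 1)
""").fetchall()

# Equivalent join: sort the joined rows and keep the first one.
via_join = conn.execute("""
    SELECT g.guardian_nm, g.guardian_age
    FROM guardian g
    INNER JOIN student s ON g.stu_uid = s.stu_uid
    WHERE s.stu_id = 1
    ORDER BY s.timestamp DESC
    LIMIT 1
""").fetchall()

print(via_subquery, via_join)
```

The two forms agree here because each student row has one guardian. If a student could have several guardians, the join with LIMIT 1 would return only one of them, while the subquery form would return all of them.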
I am trying to create a table by checking two subquery expressions within the WHERE clause, but my query fails with the error below:
Unsupported sub query expression. Only 1 sub query expression is
supported
Code snippet is as follows (Not the exact code. Just for better understanding) :
Create table winners row format delimited fields terminated by '|' as
select
games,
players
from olympics
where
exists (select 1 from dom_sports where dom_sports.players = olympics.players)
and not exists (select 1 from dom_sports where dom_sports.games = olympics.games)
If I execute the same command with only one subquery in the WHERE clause, it executes successfully. Is there an alternative way to achieve the same result?
Of course. You can use joins.
An inner join acts like EXISTS, and a left join plus an IS NULL filter in the WHERE clause mimics NOT EXISTS.
There can be an issue with granularity (duplicate rows), but that depends on your data.
select distinct
olympics.games,
olympics.players
from olympics
inner join dom_sports dom_sports on dom_sports.players = olympics.players
left join dom_sports dom_sports2 on dom_sports2.games = olympics.games
where dom_sports2.games is null
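The equivalence, and the granularity caveat, can be checked on a tiny data set. This sketch runs both forms in Python's sqlite3 (an assumption: unlike the Hive version in the question, SQLite accepts two subquery expressions, which is what makes the comparison possible). The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE olympics (games TEXT, players TEXT);
CREATE TABLE dom_sports (games TEXT, players TEXT);
-- hypothetical rows; 'ann' appears twice in dom_sports
INSERT INTO olympics VALUES ('rowing', 'ann'), ('judo', 'bob'), ('golf', 'cy');
INSERT INTO dom_sports VALUES ('judo', 'ann'), ('chess', 'bob'), ('chess', 'ann');
""")

# Original formulation with EXISTS and NOT EXISTS.
with_exists = conn.execute("""
    SELECT games, players FROM olympics
    WHERE EXISTS (SELECT 1 FROM dom_sports d WHERE d.players = olympics.players)
      AND NOT EXISTS (SELECT 1 FROM dom_sports d WHERE d.games = olympics.games)
""").fetchall()

join_sql = """
    SELECT {distinct} olympics.games, olympics.players
    FROM olympics
    INNER JOIN dom_sports d1 ON d1.players = olympics.players
    LEFT JOIN dom_sports d2 ON d2.games = olympics.games
    WHERE d2.games IS NULL
"""
# Join formulation: inner join for EXISTS, left join + IS NULL for NOT EXISTS.
with_joins = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
# Without DISTINCT, 'ann' is repeated once per matching dom_sports row.
without_distinct = conn.execute(join_sql.format(distinct="")).fetchall()

print(with_exists, with_joins, len(without_distinct))
```

The inner join multiplies rows when a player appears more than once in dom_sports, which is why the join version needs DISTINCT to match the EXISTS result.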
I would like to JOIN 2 tables.
One table is keyword_data (keyword mapping).
The other table is filled with Google rankings and other metrics.
Somehow I cannot JOIN these two tables.
Some context:
Dataset: visibility
Table 1: keyword_data
Columns: keyword, universe, category, search_volume, cpc
Dataset: visibility
Table 2: results
Columns: Date, Keyword, Website, Position
In order to receive ranking data by date I wrote the following SQL line.
SELECT Date, Position, Website FROM `visibility.results` Keyword INNER
JOIN `visibility.keyword_data` keyword ON `visibility.results` Keyword
= `visibility.keyword_data` keyword GROUP BY Date;
(besides that, 100 other lines with no success ;-) )
I am using Google BigQuery for this with standard SQL (unchecked Legacy SQL).
How can I JOIN those 2 data tables?
How familiar are you with SQL? I think you're using aliases incorrectly; something like this should work:
SELECT r.Date, r.Position, r.Website
FROM `visibility.results` AS r
INNER JOIN `visibility.keyword_data` AS k
ON r.Keyword = k.keyword
GROUP BY r.Date, r.Position, r.Website
First of all, I have never worked with Google BigQuery, but in my opinion there are a couple of things wrong with this query.
To start with, you join tables by naming each table and then providing the key the tables are joined on. Also, any column in your SELECT list that is not wrapped in an aggregate function (MIN, MAX, etc.) must be included in the GROUP BY clause. For reference, here is a solution that would work in Microsoft SQL Server, in case that helps, since the syntax is quite similar.
SELECT results.Date AS DATE
,results.Position AS POSITION
,results.Website AS WEBSITE
FROM visibility.dbo.keyword_data AS keyword_data
INNER JOIN visibility.dbo.results AS results
ON results.keyword = keyword_data.keyword
GROUP BY results.Date
,results.Position
,results.Website
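A runnable sketch of the join-plus-grouping pattern, using Python's sqlite3 instead of BigQuery or SQL Server (an assumption; backtick-quoted dataset names are omitted). The sample rows and the best_position and keywords aliases are hypothetical. Every selected column is either in GROUP BY or wrapped in an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keyword_data (keyword TEXT, universe TEXT, category TEXT,
                           search_volume INTEGER, cpc REAL);
CREATE TABLE results (Date TEXT, Keyword TEXT, Website TEXT, Position INTEGER);
INSERT INTO keyword_data VALUES
  ('shoes', 'u1', 'fashion', 1000, 0.5),
  ('boots', 'u1', 'fashion', 400, 0.7);
-- hypothetical rows; 'hats' has no mapping row in keyword_data
INSERT INTO results VALUES
  ('2020-01-01', 'shoes', 'a.example', 3),
  ('2020-01-01', 'boots', 'a.example', 7),
  ('2020-01-01', 'shoes', 'b.example', 1),
  ('2020-01-02', 'hats',  'a.example', 2);
""")

# Aliases r and k name the tables; the join key is given in ON.
rows = conn.execute("""
    SELECT r.Date, r.Website,
           MIN(r.Position) AS best_position,
           COUNT(*) AS keywords
    FROM results AS r
    INNER JOIN keyword_data AS k ON r.Keyword = k.keyword
    GROUP BY r.Date, r.Website
    ORDER BY r.Date, r.Website
""").fetchall()

print(rows)
```

The 'hats' row is dropped because the INNER JOIN keeps only keywords that have a mapping in keyword_data.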
I'm trying to put together a query that updates a field within a table. I'm attempting to run a sub select query that gives me a number, and then use that number that resulted from the sub-query as part of the criteria for the update query.
USE EMMS
Update [2_import_VZW_tbl_SMTN]
set [2_import_VZW_tbl_SMTN].[Client_ID] =[tbl_Foundation_Account].[Client_ID]
where ([tbl_Foundation_Account].[Foundation_Account_ID] =
(Select TOP 1 tbl_Foundation_Account.Foundation_Account_ID
FROM tbl_Foundation_Account
INNER JOIN [2_Import_tbl_AWCDSU]
ON tbl_Foundation_Account.Foundation_Account_ID =
[2_Import_tbl_AWCDSU].[ECPD Profile ID]))
My issue is I keep receiving this error
The multi-part identifier "tbl_Foundation_Account.Foundation_Account_ID" could not be bound.
Am I using the sub-query incorrectly? When I've received this error before, it's been because of some ambiguity in the table or field names, but this time I've checked for all that and it should be fine. Can anyone explain what SQL sin I have committed?
On the error
The multi-part identifier "tbl_Foundation_Account.Foundation_Account_ID" could not be bound.
This is because the column [tbl_Foundation_Account].[Client_ID] does not exist in the scope of the outer UPDATE query.
The only table the outer query knows about is [2_import_VZW_tbl_SMTN], and it has no column named [tbl_Foundation_Account].[Client_ID].
It is akin to writing a column name with a typo or, as you said:
When I've received this error before, it's been because of some
ambiguity in the table or field names
Please try a query like the one below.
Note that I am using a subquery and ensuring that it returns a single value by using
select top 1 [Client_ID]
in the inner query. The rest of the query syntax is the same.
USE EMMS
Update [2_import_VZW_tbl_SMTN]
set [2_import_VZW_tbl_SMTN].[Client_ID] =
(
select top 1 [Client_ID]
from [tbl_Foundation_Account]
where [Foundation_Account_ID] =
(
Select TOP 1 a.Foundation_Account_ID
FROM tbl_Foundation_Account a
INNER JOIN [2_Import_tbl_AWCDSU] b
ON a.Foundation_Account_ID = b.[ECPD Profile ID]
)
)
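The nested-subquery form of the update can be tried out in Python's sqlite3 (an assumption: SQLite uses LIMIT 1 where SQL Server uses TOP 1). The shortened table names import_smtn and import_awcdsu, the ecpd_profile_id and line_id columns, and the sample rows are hypothetical stand-ins for the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical, simplified versions of the tables in the question
CREATE TABLE tbl_Foundation_Account (Foundation_Account_ID INTEGER, Client_ID INTEGER);
CREATE TABLE import_awcdsu (ecpd_profile_id INTEGER);
CREATE TABLE import_smtn (line_id INTEGER, Client_ID INTEGER);
INSERT INTO tbl_Foundation_Account VALUES (1, 500), (2, 600);
INSERT INTO import_awcdsu VALUES (2);
INSERT INTO import_smtn VALUES (1, NULL), (2, NULL);
""")

# The inner subquery picks one matching account id; the outer subquery
# looks up its Client_ID; the UPDATE writes that single value.
conn.execute("""
    UPDATE import_smtn
    SET Client_ID = (
        SELECT Client_ID
        FROM tbl_Foundation_Account
        WHERE Foundation_Account_ID = (
            SELECT a.Foundation_Account_ID
            FROM tbl_Foundation_Account a
            INNER JOIN import_awcdsu b
                ON a.Foundation_Account_ID = b.ecpd_profile_id
            LIMIT 1
        )
        LIMIT 1
    )
""")

rows = conn.execute(
    "SELECT line_id, Client_ID FROM import_smtn ORDER BY line_id"
).fetchall()
print(rows)  # [(1, 600), (2, 600)]
```

Because the subquery does not reference the table being updated, every row receives the same Client_ID; a per-row value would need a correlation between the subquery and the updated table.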
Another poster submitted this answer earlier but then deleted it. I was able to try it before it was deleted, and it works exactly how I needed it to. I will use this as the accepted answer unless someone can tell me why it is a bad idea.
USE EMMS
Update [2_import_VZW_tbl_SMTN]
set [2_import_VZW_tbl_SMTN].[Client_ID] = [tbl_Foundation_Account].[Client_ID]
from [tbl_Foundation_Account]
where ([tbl_Foundation_Account].[Foundation_Account_ID] =
(Select TOP 1 tbl_Foundation_Account.Foundation_Account_ID
FROM tbl_Foundation_Account
INNER JOIN [2_Import_tbl_AWCDSU]
ON tbl_Foundation_Account.Foundation_Account_ID = [2_Import_tbl_AWCDSU].[ECPD Profile ID]))