Join two tables based on the partial matched column - hive

i want to join two tables in hive based on the partial match. So far, i tried with the below SQL query:
select * from tableA a join tableB b on a.id like '%'+b.id+'%';
and instr but nothing working, is there a way?

JOIN in Hive only support equation conditions. Plus, you should use CONCAT for your string concatenation.
This is one solution instead.
select *
from tableA a, tableB b
where a.id like concat('%',b.id,'%')

Related

Distinct IDs from one table for inner join SQL

I'm trying to take the distinct IDs that appear in table a, filter table b for only these distinct IDs from table a, and present the remaining columns from b. I've tried:
SELECT * FROM
(
SELECT DISTINCT
a.ID,
a.test_group,
b.ch_name,
b.donation_amt
FROM table_a a
INNER JOIN table_b b
ON a.ID=b.ID
ORDER by a.ID;
) t
This doesn't seem to work. This query worked:
SELECT DISTINCT a.ID, a.test_group, b.ch_name, b.donation_amt
FROM table_a a
inner join table_b b
on a.ID = b.ID
order by a.ID
But I'm not entirely sure this is the correct way to go about it. Is this second query only going to take unique combinations of a.ID and a.test_group or does it know to only take distinct values of a.ID which is what I want.
Your first and second query are similar.(just that you can not use ; inside your query) Both will produce the same result.
Even your second query which you think is giving you desired output, can not produce the output what you actually want.
Distinct works on the entire column list of the select clause.
In your case, if for the same a.id there is different a.test_group available then it will have multiple records with same a.id and different a.test_group.

join or merge two table based on id merge

I have two tables:
I am looking for the results like mentioned in the last.
I tried union (only similar col can be merged), left join, right join i am getting repeated fields in Null areas what can be other options where i can get null without column repeating
A full join would get all results from both tables.
select
A.ID,
A.ColA,
A.ColB,
B.ColC,
B.ColD
from TableA A
full join Table B on A.ID = B.ID
Here is a good post to understand joins
You can try distinct:
select distinct * from
tableA a,
tableB b
where a.id = b.id;
It will not give any duplicate tuples.

how do I get the following postgres queries?

I have two tables: tableA and tableB. TableA has a field idA and tableB has a record idB and idBPtrA where idBptrA is a pointer to tableA (one of the idA).
I want, using postgres, to select records from a TableA that have the minimal number of records in tableB.
Something like:
select idA,idB,count(idBPtrA) as c
from tableA,tableB
group by idBPtrA
where idA=idB order by c
This of course doesn't work and gives me an error, but I think it should be very similar to that... Any ideas?
I think this is the query that you want:
select a.idA, count(b.idB) as c
from tableA a left join
tableB b
on a.idA = b.idptrA
group by a.idA
order by c;
Notes:
Use table aliases and qualify column names (especially if you are learning SQL).
Learn and use proper, explicit JOIN syntax. Simple rule: Never use commas in the FROM clause.

Select returns the same row multiple times

I have the following problem:
In DB, I have two tables. The value from one column in the first table can appear in two different columns in the second one.
So, the configuration is as follows:
TABLE_A: Column Print_group
TABLE _B: Columns Print_digital and Print_offset
The value from the different rows and Print_group column of the Table_A can appear in one row of the Table_B but in different column.
I have the following query:
SELECT DISTINCT * FROM Table_A
INNER JOIN B ON (Table_A. Print_digital = Table_B.Print_group OR
Table_A.Print_offset = Table_B.Print_group)
The problem is that this query returns the same row from the Table_A two times.
What I am doing wrong? What is the right query?
Thank you for your help
If I'm understanding your question correctly, you just need to clarify your fields to come from Table_A:
SELECT DISTINCT A.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group
EDIT:
Given your comments, looks like you just need SELECT DISTINCT B.*
SELECT DISTINCT B.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group
I've still another question... first,to be clear, the right query version is
SELECT DISTINCT A.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group.
If I want it returns also one column from the B table it again returns duplicate rows. My query (the bad one) is the following one:
SELECT DISTINCT A.*, B.Id
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group

Join SQL query to get data from two tables

I'm a newbie, just learning SQL and have this question: I have two tables with the same columns. Some registers are in the two tables but others only are in one of the tables. To illustrate, suppose table A = (1,2,3,4), table B=(3,4,5,6), numbers are registers. I need to select all registers in table B if they are not in table A, that is result=(5,6). What query should I use? Maybe a join. Thanks.
You can either use a NOT IN query like this:
SELECT col from A where col not in (select col from B)
or use an outer join:
select A.col
from A LEFT OUTER JOIN B on A.col=B.col
where B.col is NULL
The first is easier to understand, but the second is easier to use with more tables in the query.
Select register from TABLE_B b
Where not exists (Select register from TABLE_A a where a.register = b.register)
I assumed you have a column named register in TABLE_A and TABLE_B