Say my table contains duplicate rows (a multiset). With the query below I get the duplicates:
select name, address from users group by
name, address having count(*) > 1
But my problem is this: say I have another field called credit. I want to compare the credit values within each set of duplicates and keep the highest one (that is, the max), for example taking the second row if its credit is higher than the first.
select name, address from users group by
name, address having count(*) > 1
Use:
select A.name, A.address, max ( A.credits ) mc from users A
where (A.name, A.address) in
(
select B.name, B.address from users B group by
B.name, B.address having count(*) > 1
)
group by A.name, A.address
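A minimal sketch of the same idea, run against an in-memory SQLite database from Python. It is a variation, not the query above: since the outer query already groups by name and address, the duplicate filter can be moved into a HAVING clause, which avoids the row-value IN that some databases do not accept. The sample rows and the column name credits are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data: Ann appears twice with different credits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, address TEXT, credits INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("Ann", "1 Main St", 10),
        ("Ann", "1 Main St", 25),
        ("Bob", "2 Oak Ave", 7),
    ],
)

# One pass: keep only groups with more than one row, report the highest credit.
max_credits = conn.execute(
    """
    SELECT name, address, MAX(credits) AS mc
    FROM users
    GROUP BY name, address
    HAVING COUNT(*) > 1
    """
).fetchall()

print(max_credits)  # [('Ann', '1 Main St', 25)]
```

Bob is left out because his (name, address) pair occurs only once.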
Related
Hello, I am trying to find which customers have the same email but a different name. What happened is that our name, first name, and last name fields were populated with the email name, so a name looks like janedoe-at-yahoo. I was able to find all the -at- records; now I want to match those emails with the correct name (Jane Doe) so I can fix all the -at- entries. I hope that makes sense. Please help.
SELECT C.CUST_NO, C.NAM, C.EMAIL_ADRS_1
FROM AR_CUST C
INNER JOIN (SELECT NAM, EMAIL_ADRS_1, COUNT(*) AS COUNTOF
FROM AR_CUST
GROUP BY NAM, EMAIL_ADRS_1
HAVING COUNT(*)>1
) DT ON C.NAM = DT.NAM AND C.EMAIL_ADRS_1 = DT.EMAIL_ADRS_1
ORDER BY C.EMAIL_ADRS_1
This code brings up only the duplicate emails.
SELECT C.CUST_NO, C.NAM, C.FST_NAM, C.LST_NAM, C.EMAIL_ADRS_1
FROM AR_CUST C
INNER JOIN (SELECT NAM, EMAIL_ADRS_1, COUNT(*) AS COUNTOF
FROM AR_CUST
GROUP BY NAM, EMAIL_ADRS_1
HAVING COUNT(*)>1 AND NAM LIKE '%-at-%'
) DT ON C.EMAIL_ADRS_1 = DT.EMAIL_ADRS_1
ORDER BY C.EMAIL_ADRS_1
This code brings up the -at- emails and some of the correlated names, but it returns only 100 records when there should be at least 900.
Hello, I am trying to find which customers have the same email but a different name.
If you just want the email addresses that are duplicates, you can answer this with aggregation:
SELECT C.EMAIL_ADRS_1
FROM AR_CUST C
GROUP BY C.EMAIL_ADRS_1
HAVING MIN(C.NAM) <> MAX(C.NAM);
If you want the original records, use exists:
select c.*
from ar_cust c
where exists (select 1
from ar_cust c2
where c2.email_adrs_1 = c.email_adrs_1 and
c2.nam <> c.nam
);
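Both queries above can be tried on a small in-memory SQLite table from Python. This is only a sketch: the table is cut down to three columns and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical rows: one email shared by two different names,
# and one email shared by two rows with the same name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ar_cust (cust_no INTEGER, nam TEXT, email_adrs_1 TEXT)")
conn.executemany(
    "INSERT INTO ar_cust VALUES (?, ?, ?)",
    [
        (1, "Jane Doe", "janedoe@yahoo.com"),
        (2, "janedoe-at-yahoo", "janedoe@yahoo.com"),
        (3, "John Roe", "john@example.com"),
        (4, "John Roe", "john@example.com"),
    ],
)

# Emails that carry at least two distinct names.
emails = conn.execute(
    """
    SELECT email_adrs_1
    FROM ar_cust
    GROUP BY email_adrs_1
    HAVING MIN(nam) <> MAX(nam)
    """
).fetchall()

# The original records behind those emails.
records = conn.execute(
    """
    SELECT c.cust_no, c.nam
    FROM ar_cust c
    WHERE EXISTS (SELECT 1
                  FROM ar_cust c2
                  WHERE c2.email_adrs_1 = c.email_adrs_1
                    AND c2.nam <> c.nam)
    ORDER BY c.cust_no
    """
).fetchall()

print(emails)   # [('janedoe@yahoo.com',)]
print(records)  # [(1, 'Jane Doe'), (2, 'janedoe-at-yahoo')]
```

The John Roe rows share an email but also share a name, so neither query returns them.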
Here are the two table structures
I have two tables from which I am trying to fetch duplicate records based on a condition such as a.fname = b.fname and a.phone_no <> b.phone_no.
I also need to include another column, 'address', which is in table 2, and apply the same duplicate check to address as well.
SELECT
"Fname"||' '||"Lname" AS "Customer_Name",
COUNT(*) AS "Countof"
FROM "S_CONTACT" A
WHERE EXISTS (
SELECT 1
FROM "S_CONTACT" B
WHERE A."PHONE" != B."PHONE"
AND A."Fname" = B."Fname"
AND A."EMAIL"=B."EMAIL"
AND A."Lname"=B."Lname"
AND "DOB" IS NULL
)
GROUP BY "Fname","Lname","EMAIL"
HAVING count(*) >1;
The above SQL gives me a list of customers with duplicate names and emails.
But I do not know how to bring the address column, which lives in a different table (t2), into this SQL.
If you are trying to find duplicates in one table:
select c.fname, c.lname, c.email
from contacts c
group by c.fname, c.lname, c.email
having min(c.phone) <> max(c.phone);
If you want to count null as a different value, then use:
having min(c.phone) <> max(c.phone) or count(c.phone) <> count(*)
You can do the same thing on the second table:
select c.fname, c.lname, c.email
from second_table c
group by c.fname, c.lname, c.email
having min(c.address) <> max(c.address) or count(c.address) <> count(*)
If you need the results in the same result set, then use union or union all or some similar mechanism.
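To see how the two HAVING variants differ, here is a small sketch using SQLite from Python. The table name contacts follows the answer above; the sample rows are assumptions chosen so that one group has two different phones and another has one phone plus a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (fname TEXT, lname TEXT, email TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?)",
    [
        ("Al", "Ng", "al@x.org", "111"),
        ("Al", "Ng", "al@x.org", "222"),   # two different phones
        ("Bo", "Li", "bo@x.org", "333"),
        ("Bo", "Li", "bo@x.org", None),    # one phone and one NULL
        ("Cy", "Ro", "cy@x.org", "444"),
        ("Cy", "Ro", "cy@x.org", "444"),   # same phone twice
    ],
)

base = """
    SELECT fname, lname, email
    FROM contacts
    GROUP BY fname, lname, email
    HAVING {condition}
    ORDER BY fname
"""

# MIN and MAX ignore NULLs, so Bo's group looks like a single phone here.
strict = conn.execute(
    base.format(condition="MIN(phone) <> MAX(phone)")
).fetchall()

# COUNT(phone) skips NULLs while COUNT(*) does not, so a mismatch flags NULLs.
with_nulls = conn.execute(
    base.format(
        condition="MIN(phone) <> MAX(phone) OR COUNT(phone) <> COUNT(*)"
    )
).fetchall()

print(strict)      # [('Al', 'Ng', 'al@x.org')]
print(with_nulls)  # [('Al', 'Ng', 'al@x.org'), ('Bo', 'Li', 'bo@x.org')]
```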
I have this query:
SELECT
A.USERID,
A.NAME,
PVT.PHONE, -- PROBABLY A CASE STATEMENT ON NULL WILL GO HERE...
PVT.ADDRESS -- ON HERE AS WELL...
FROM
USERS A
-- I NEED TO CREATE A PIVOT TABLE HERE WITH THE ALIAS OF 'PVT' ON TABLE 'B'
B Contents:
UserID  PHONE         ADDRESS     TYPE
1       444-555-2222  XXXXXXX     PHONE
1       XXXXXXX       66 Nowhere  NOTADDRESS
I want, on the same row, the user's phone by getting B.PHONE if TYPE = 'PHONE'.
I also want, on the same row, the user's address by getting B.ADDRESS content if TYPE = 'ADDRESS'.
As you see in the table dump above, I don't have a record matching the user ID AND TYPE = 'ADDRESS'
So I would need to show a blank or 'No address' in the main SELECT which will show the phone, but on the same row, blank or 'No address'.
I don't want to create an INNER JOIN because if there are no matching UserID's in B, the query will not return the info that I have in table A for that user.
Also, a LEFT JOIN will create two rows, which I don't want.
I think a pivoted table used as an alias would do it, but I don't know how to create such an alias.
Any ideas?
How about using conditional aggregation?
SELECT A.USERID, A.NAME,
B.PHONE, B.ADDRESS
FROM USERS A LEFT JOIN
(SELECT UserId, MAX(CASE WHEN TYPE = 'PHONE' THEN PHONE END) as PHONE,
MAX(CASE WHEN TYPE = 'ADDRESS' THEN ADDRESS END) as ADDRESS
FROM B
GROUP BY UserId
) B
ON B.UserId = A.UserId;
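A runnable sketch of the conditional-aggregation approach, using SQLite from Python with the two rows from the table dump in the question. The user names, the second user with no rows in B, and the COALESCE defaults (a blank phone, 'No address') are assumptions added to show the one-row-per-user result the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER, name TEXT)")
conn.execute("CREATE TABLE b (userid INTEGER, phone TEXT, address TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob")],  # hypothetical users; Bob has no rows in b
)
conn.executemany(
    "INSERT INTO b VALUES (?, ?, ?, ?)",
    [
        (1, "444-555-2222", "XXXXXXX", "PHONE"),
        (1, "XXXXXXX", "66 Nowhere", "NOTADDRESS"),
    ],
)

# The subquery collapses b to one row per user, so the LEFT JOIN cannot
# multiply rows; COALESCE fills in the missing values.
rows = conn.execute(
    """
    SELECT a.userid, a.name,
           COALESCE(pvt.phone, '') AS phone,
           COALESCE(pvt.address, 'No address') AS address
    FROM users a
    LEFT JOIN (SELECT userid,
                      MAX(CASE WHEN type = 'PHONE' THEN phone END) AS phone,
                      MAX(CASE WHEN type = 'ADDRESS' THEN address END) AS address
               FROM b
               GROUP BY userid) pvt
           ON pvt.userid = a.userid
    ORDER BY a.userid
    """
).fetchall()

print(rows)
# [(1, 'Alice', '444-555-2222', 'No address'), (2, 'Bob', '', 'No address')]
```

User 1 gets a phone but no address because no row has TYPE = 'ADDRESS', and user 2 still appears even though B has nothing for that ID.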
If you have to use PIVOT, then you need the pivot in a subquery and a left join to it:
SELECT
A.USERID,
A.NAME,
PVT.PHONE,
PVT.[ADDRESS]
FROM
Users A
LEFT JOIN (SELECT *
FROM
(SELECT
UserID,
[Type],
(CASE [Type] WHEN 'PHONE' THEN PHONE WHEN 'ADDRESS' THEN [Address] END) Info
FROM UserInfo) AS UI
PIVOT (
MAX(Info)
FOR [Type] IN ([PHONE], [ADDRESS])
) P
) PVT ON A.UserID = PVT.UserID
This gives you pretty much the same execution plan as the conditional aggregation query, but it is not as easy on the eyes.
Suppose I have the following scenario.
A table that stores people:
id_person, name, age (...)
And a table that stores the addresses of people:
id_address, id_person, city
If I run a query like this:
select * from people P left join address A on P.id_person = A.id_person
I get id_person = null in the result set (because there IS a person, but no address has been recorded for them, which is fine).
The null is coming from the address table. Is it possible to solve this without writing select field1, field2, field3 ... (lots of fields)?
Example
Person
id_person Name
1 John
2 Steve
Address
id_address id_person city
1 1 'AnyCity'
When I run a query like this
select * from people P left join address A on P.id_person = A.id_person
where P.name = 'Steve'
His id_person is returning null
You mean you only want the id_person from the people table, not from the address table (which sometimes is NULL)?
select p.id_person, p.name, p.age, a.id_address, a.city
from people P left join address A ON P.id_person = A.id_person
Is it possible to solve this without doing select field1, field2, field3 ... (lots of fields)?
No - you either use * or identify the fields. You could select all fields from one table and then cherry pick from the other table:
select P.*, A.id_address, A.city, ...
from people P
left join address A on P.id_person = A.id_person
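As a quick check of the cherry-picking approach, here is a sketch using SQLite from Python with the Person and Address rows from the question. The age values are made up, and the join condition is written with ON. Because id_person is taken only from people, it is never overwritten by the NULL from the unmatched address side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id_person INTEGER, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE address (id_address INTEGER, id_person INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", 30), (2, "Steve", 40)],  # ages are hypothetical
)
conn.execute("INSERT INTO address VALUES (1, 1, 'AnyCity')")

# All columns from people, plus only the address columns that do not
# duplicate a people column.
cur = conn.execute(
    """
    SELECT P.*, A.id_address, A.city
    FROM people P
    LEFT JOIN address A ON P.id_person = A.id_person
    WHERE P.name = 'Steve'
    """
)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns)  # ['id_person', 'name', 'age', 'id_address', 'city']
print(rows)     # [(2, 'Steve', 40, None, None)]
```

Steve keeps id_person = 2; only the address columns come back as NULL.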
In Hive, I've got four tables:
temp_basic_info (ID, MSISDN, GENDER, AGE, DAY, MONTH, YEAR, RELATIONSHIPSTATUS)
temp_education (ID, EDUCATION)
likes_and_music (ID, NAME, PAGE)
temp_output (ID, MSISDN, GENDER, AGE, DAY, MONTH, YEAR, RELATIONSHIPSTATUS, EDUCATION, LIKES_AND_PREFERENCES)
temp_output is empty.
Now, I want to transfer the appropriate fields from the other three tables into temp_output. likes_and_music has multiple rows with the same ID, paired with varying NAMEs and PAGEs, so I'd have to put them in an array.
My projected output is something like the following:
0001 msisdn1 male 21 1 2 92 0 College [Jeep, soccer, PC games, etc...]
And here's my query so far:
Select a.ID, a.MSISDN, a.GENDER, a.AGE, a.DAY, a.MONTH, a.YEAR, a.RELATIONSHIPSTATUS, b.EDUCATION, COLLECT_SET(c.NAME) FROM temp_basic_info a JOIN temp_education b ON (a.ID = b.ID) JOIN likes_and_music c ON (c.ID = b.ID) GROUP BY a.ID, a.MSISDN, a.GENDER, a.AGE, a.DAY, a.MONTH, a.YEAR, a.RELATIONSHIPSTATUS, b.EDUCATION, c.name limit 10;
But the latter returns the following error:
FAILED: SemanticException [Error 10002]: Line 1:311 Invalid column reference 'EDUCATION'
What am I missing?
temp_education (ID, NAME)
I don't see the column b.education for table temp_education b