Teradata Rank Over Query (Getting one row to left join) - sql

Hi, I am new to Teradata and am stuck on a problem.
There is an ID table which stores a unique ID given to each person:
CREATE TABLE IDS(
ID VARCHAR(8),
UPDATED_DATE DATE)
Then we have a name table and an address table, neither of which has a primary key, that store demographic information for the IDs:
CREATE TABLE NAMES(
ID VARCHAR(8),
NAME VARCHAR(50))
CREATE TABLE ADDRESSES(
ID VARCHAR(8),
ADDRESS VARCHAR(200))
Now each ID can have multiple names and addresses. However, for names and addresses I want to use the ones that occur most often. If two names have the same COUNT, I just want the first row:
ID NAME COUNT
1234 John Smith 6
1234 Johnnie Smith 6
1234 J Smith 2
In the above example I want the name John Smith. I am using a left join since an ID may not have a name or address. Here is what I am trying:
SELECT * FROM
(SELECT ID as V_ID from IDS) a
LEFT JOIN
(SELECT ID, NAME, COUNT(*) AS COUNTER,(RANK() OVER(ORDER BY COUNTER DESC)) AS RNK
FROM NAMES
GROUP BY ID)b
ON a.ID = b.ID
AND b.RNK = 1 -- Should give me only the first row
LEFT JOIN
(SELECT ID, ADDRESS, COUNT(*) AS COUNTER, (RANK() OVER (ORDER BY COUNTER DESC) ) AS RNK
FROM ADDRESSES
GROUP BY ID) c
ON c.ID = a.ID
And c.RNK = 1
However, this is not giving me the desired result. I also tried ROW_NUMBER instead of RANK, but that did not work either. How should I write this query in Teradata?

I solved it. I needed a QUALIFY clause and a PARTITION BY:
SELECT * FROM
(SELECT ID as V_ID from IDS) a
LEFT JOIN
(SELECT ID, NAME, COUNT(*) AS COUNTER
FROM NAMES
GROUP BY ID, NAME
qualify ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COUNTER DESC) = 1
)b
ON a.V_ID = b.ID
LEFT JOIN
(SELECT ID, ADDRESS, COUNT(*) AS COUNTER
FROM ADDRESSES
GROUP BY ID, ADDRESS
qualify ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COUNTER DESC) = 1
) c
ON c.ID = a.V_ID
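For anyone who wants to try the "count, rank per ID, keep the top row, then left join" idea without a Teradata system, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is only a stand-in: it has no QUALIFY, so the ROW_NUMBER() filter moves into the join condition, and it needs SQLite 3.25 or later for window functions. The sample rows, the extra ID 5678, and the NAME tie-breaker in the ORDER BY are assumptions made for illustration, not part of the original post.

```python
import sqlite3

# SQLite stands in for Teradata; sample data below is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IDS (ID TEXT, UPDATED_DATE TEXT);
CREATE TABLE NAMES (ID TEXT, NAME TEXT);
INSERT INTO IDS VALUES ('1234', '2020-01-01'), ('5678', '2020-01-01');
INSERT INTO NAMES VALUES
  ('1234', 'John Smith'), ('1234', 'John Smith'),
  ('1234', 'Johnnie Smith'), ('1234', 'Johnnie Smith'),
  ('1234', 'J Smith');
""")

rows = conn.execute("""
SELECT a.ID, b.NAME, b.COUNTER
FROM IDS a
LEFT JOIN (
    -- Rank the per-name counts inside each ID; NAME breaks ties
    -- so the "first row" is deterministic (an added assumption).
    SELECT ID, NAME, COUNTER,
           ROW_NUMBER() OVER (PARTITION BY ID
                              ORDER BY COUNTER DESC, NAME) AS RN
    FROM (SELECT ID, NAME, COUNT(*) AS COUNTER
          FROM NAMES
          GROUP BY ID, NAME) counted
) b ON b.ID = a.ID AND b.RN = 1
ORDER BY a.ID
""").fetchall()

for row in rows:
    print(row)
```

ID 5678 has no names, so the left join keeps it with NULLs. Without a tie-breaker, which of two equally frequent names comes out first is up to the database.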

Related

Selecting the Id's that have the same EmailAddress column value

What I need:
I am looking for a solution that gives me all the Employee Ids that share the same EmailAddress value (the filter needs to be by EmailAddress).
I want to know which Ids correspond to the duplicated email addresses and retrieve that information.
Table Employee:
Id | PlNumber | EmailAddress | EmployeeBeginingDate | EmployedEndDate | Name UserId(FK) | CreatedBy | CreatedOn
SELECT a.Id,a.EmailAddress
FROM Employee a
INNER JOIN (SELECT
Employee.Id as EmployeeId,
Employee.EmailAddress as EmailAddress,
FROM Employee
GROUP BY Employee.Id,Employee.EmailAddress
HAVING count(Employee.EmailAddress) > 1
) b
ON a.Id= b.EmployeeId
ORDER BY a.Id
I am always getting an error:
the multi-part identifier could not be bound.
I know why the error is happening but I couldn't solve this.
UPDATE: After a few changes the query returns 0 rows, but I know it should return at least 3 rows, since I have duplicate values.
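Try the query below. Because Employee is aliased as a in the outer query, you have to refer to it as a there rather than as Employee. The subquery also groups by EmailAddress only: if Id is unique, grouping by Id as well leaves one row per group, so the HAVING filter returns nothing.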
SELECT a.Id, a.EmailAddress
FROM Employee a
INNER JOIN (SELECT
Employee.EmailAddress as EmailAddress
FROM Employee
GROUP BY Employee.EmailAddress
HAVING count(Employee.EmailAddress) > 1
) b
ON a.EmailAddress = b.EmailAddress
ORDER BY a.Id
Assuming the ids are different on each row, I would go for exists:
SELECT e.Id, e.EmailAddress
FROM Employee e
WHERE EXISTS (SELECT 1
FROM Employee e2
WHERE e2.EmailAddress = e.EmailAddress AND
e2.Id <> e.Id
)
ORDER BY e.EmailAddress;
Or, if you want to know the number of matches, use window functions:
SELECT e.Id, e.EmailAddress, cnt
FROM (SELECT e.*, COUNT(*) OVER (PARTITION BY e.EmailAddress) as cnt
FROM Employee e
) e
WHERE cnt >= 2;
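To see why the original query returned 0 rows and the window-function version does not, here is a small sketch in Python with the built-in sqlite3 module (SQLite 3.25+ for window functions). SQLite is a stand-in for the asker's database, and the five sample rows and example.com addresses are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Id INTEGER PRIMARY KEY, EmailAddress TEXT);
INSERT INTO Employee VALUES
  (1, 'a@example.com'), (2, 'b@example.com'), (3, 'a@example.com'),
  (4, 'c@example.com'), (5, 'a@example.com');
""")

# Grouping by Id as well: every group holds exactly one row,
# so HAVING COUNT(...) > 1 can never be true.
per_id = conn.execute("""
SELECT Id, EmailAddress
FROM Employee
GROUP BY Id, EmailAddress
HAVING COUNT(EmailAddress) > 1
""").fetchall()

# Counting per EmailAddress with a window function keeps every row
# and attaches the size of its duplicate group.
dup_rows = conn.execute("""
SELECT Id, EmailAddress, cnt
FROM (SELECT e.*, COUNT(*) OVER (PARTITION BY e.EmailAddress) AS cnt
      FROM Employee e) e
WHERE cnt >= 2
ORDER BY Id
""").fetchall()

print(per_id)
print(dup_rows)
```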

How can I randomly distribute rows in one table to rows in another table in oracle SQL

I am trying to figure out a SQL query that will distribute records from one table to another table randomly.
For example:
I have a table of Customers, and I want to assign each a car out of a table of cars.
I want to make sure that the cars are randomly distributed, so that no property of a customer predicts which car they receive.
Customers:
(Jon,Sam,Sara,Jack,Adam,Adrian)
Cars:
(BMW,Dodge,Lexus)
Result:
(Jon-BMW,Sam-Lexus,Sara-BMW,Jack-Dodge,Adam-Dodge,Adrian-BMW)
How can I do that in Oracle SQL?
Here's one option:
with t as
(select u.name ||'-'||a.name comb,
row_number() over (partition by u.name order by dbms_random.value(1, n.cnt)) rn
from customers u cross join cars a
join (select count(*) cnt from cars) n on 1 = 1
)
select t.comb
from t
where rn = 1;
COMB
-----------------------------------------
Adam-Lexus
Adrian-BMW
Jack-Lexus
Jon-BMW
Sam-Dodge
Sara-Lexus
6 rows selected.
One method that might be more efficient than a full cross join is to give each customer a random car number and join on it:
select c.*, cc.car
from (select c.*,
-- random integer between 1 and the number of cars
floor(dbms_random.value(1, cc.cnt + 1)) as seqnum
from customers c cross join
(select count(*) as cnt from cars) cc
) c join
(select cc.*, row_number() over (order by dbms_random.random) as seqnum
from cars cc
) cc
on cc.seqnum = c.seqnum;
If there is no requirement that every car be used, and DB resources are not a concern:
select customer_name||'-'||car_name result
from (
select u.name customer_name, c.name car_name,
row_number() over ( partition by u.name order by dbms_random.value ) ord
from customers u
cross join cars c
)
where ord = 1
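The cross-join approach can be tried locally with a short Python sketch using the built-in sqlite3 module (SQLite 3.25+). This is only an approximation of the Oracle answers: SQLite's random() is swapped in for dbms_random.value, and the table layout (a single name column in each table) is assumed from the aliases used above.

```python
import sqlite3

customers = ["Jon", "Sam", "Sara", "Jack", "Adam", "Adrian"]
cars = ["BMW", "Dodge", "Lexus"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.execute("CREATE TABLE cars (name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?)", [(n,) for n in customers])
conn.executemany("INSERT INTO cars VALUES (?)", [(n,) for n in cars])

# Pair every customer with every car, shuffle the cars within each
# customer, and keep the first one.
pairs = conn.execute("""
SELECT customer_name, car_name
FROM (
    SELECT u.name AS customer_name, c.name AS car_name,
           ROW_NUMBER() OVER (PARTITION BY u.name
                              ORDER BY random()) AS ord
    FROM customers u CROSS JOIN cars c
) picked
WHERE ord = 1
ORDER BY customer_name
""").fetchall()

for customer, car in pairs:
    print(f"{customer}-{car}")
```

Each run prints a different assignment. Every customer gets exactly one car, but nothing guarantees that every car is handed out.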

PARTITION BY duplicated id and JOIN with the ID with the least value

I need to JOIN tables hstT and hstD through a view in SQL Server 2008. The main table contains data about employees and their "logins" (so multiple records are associated with employee x in month x), and the second table has info about their area by month. I need to join both tables, but using the earliest record as the reference for the join and applying it to the rest of the records associated with that id.
So hstT is something like:
id id2 period name
----------------------
x 1 0718 john
x 1 0818 john
y 2 0718 jane
And hstD:
id2 period area
----------------------
1 0718 sales
1 0818 hr
2 0707 mng
With an OUTER JOIN I managed to merge all the data based on id2 (the user id) and the period, but as I mentioned, I need to join the other table based on the earliest record for each id (which I could use as the criterion), so the result would look like this:
id id2 period name area
---------------------------
x 1 0718 john sales
x 1 0818 john sales
y 2 0718 jane mng
I know I could use ROW_NUMBER, but I don't know how to use it in a view and JOIN on those conditions:
SELECT T.*,D.*, ROW_NUMBER() OVER (PARTITION BY T.ID ORDER BY T.PERIOD ASC) AS ORID
FROM dbo.hstT AS T LEFT OUTER JOIN
dbo.hstD AS D ON T.period = D.period AND T.id2 = D.id2
WHERE ORID = 1
--prompts error as orid doesn't exist in any table
You can use apply for this:
select t.*, d.area
from hstT t outer apply
(select top (1) d.*
from hstD d
where d.id2 = t.id2 and d.period <= t.period
order by d.period asc
) d;
Actually, if you just want the earliest period, then you can filter and join:
select t.*, d.area
from hstT t left join
(select d.*, row_number() over (partition by id2 order by period asc) as seqnum
from hstD d
) d
on d.id2 = t.id2 and seqnum = 1;
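The filter-and-join version can be checked against the sample rows from the question with a small Python sketch using the built-in sqlite3 module (SQLite 3.25+). SQLite stands in for SQL Server here, and storing period as text is an assumption. Note that text ordering of these period values only matches chronological order by luck of the sample data; a real date column would be safer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hstT (id TEXT, id2 INTEGER, period TEXT, name TEXT);
CREATE TABLE hstD (id2 INTEGER, period TEXT, area TEXT);
INSERT INTO hstT VALUES
  ('x', 1, '0718', 'john'), ('x', 1, '0818', 'john'), ('y', 2, '0718', 'jane');
INSERT INTO hstD VALUES
  (1, '0718', 'sales'), (1, '0818', 'hr'), (2, '0707', 'mng');
""")

rows = conn.execute("""
SELECT t.id, t.id2, t.period, t.name, d.area
FROM hstT t
LEFT JOIN (
    -- Number each user's area rows from earliest period onwards.
    SELECT id2, area,
           ROW_NUMBER() OVER (PARTITION BY id2 ORDER BY period) AS seqnum
    FROM hstD
) d ON d.id2 = t.id2 AND d.seqnum = 1
ORDER BY t.id, t.period
""").fetchall()

for row in rows:
    print(row)
```

Both of john's rows pick up "sales", the area from his earliest hstD record, which is the result the question asks for.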

How to GROUP BY max date in left join

I have three tables in Oracle DB
House (
id
)
Person (
id,
house_id
)
Bill (
id,
date,
amount,
person_id
)
I need to get a list of person ids and the amount from each person's last bill, if one exists, by house id. The last bill is the bill with the most recent date.
I can get it by person id this way:
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN
(SELECT person_id, amount FROM Bill WHERE date =
(SELECT MAX(date) FROM Bill b1 WHERE person_id = 1)
) b ON b.person_id = p.id
WHERE p.id = 1;
How can I get a list of person ids with the amount of each person's latest bill, by house id?
Sample data:
House(id:1)
House(id:2)
Person(id:1, house_id:1)
Person(id:2, house_id:1)
Person(id:3, house_id:2)
Bill(id:1, date:01-11-2011, amount:100, person_id:1)
Bill(id:2, date:01-11-2012, amount:200, person_id:1)
Bill(id:3, date:01-11-2011, amount:90, person_id:2)
Bill(id:4, date:01-11-2012, amount:10, person_id:2)
Bill(id:5, date:01-11-2011, amount:190, person_id:3)
Result for select by house_id = 1:
person_id:1, amount:200
person_id:2, amount:10
You can do this with aggregation:
select p.id as person_id,
max(b.amount) keep (dense_rank first order by b.date desc) as most_recent_amount
from bill b join
person p
on b.person_id = p.id
where p.house_id = 1
group by p.id;
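Here is a rough way to check the expected result against the sample data, sketched in Python with the built-in sqlite3 module (SQLite 3.25+). SQLite has no KEEP (DENSE_RANK FIRST ...), so ROW_NUMBER() is swapped in, and a left join is used so that a person without bills would still appear. The column name bill_date and the ISO date format (reading 01-11-2011 as 1 November 2011) are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER, house_id INTEGER);
CREATE TABLE Bill (id INTEGER, bill_date TEXT, amount INTEGER, person_id INTEGER);
INSERT INTO Person VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO Bill VALUES
  (1, '2011-11-01', 100, 1), (2, '2012-11-01', 200, 1),
  (3, '2011-11-01', 90, 2),  (4, '2012-11-01', 10, 2),
  (5, '2011-11-01', 190, 3);
""")

rows = conn.execute("""
SELECT p.id AS person_id, b.amount
FROM Person p
LEFT JOIN (
    -- rn = 1 marks each person's most recent bill.
    SELECT person_id, amount,
           ROW_NUMBER() OVER (PARTITION BY person_id
                              ORDER BY bill_date DESC) AS rn
    FROM Bill
) b ON b.person_id = p.id AND b.rn = 1
WHERE p.house_id = 1
ORDER BY p.id
""").fetchall()

print(rows)
```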

SQL First row only following aggregation Rules

I need to select only the most significant value from a table, using PostgreSQL (latest version). A data sample follows:
Table Company
Id, Name, ExternalId, StartAt
1 Comp1 54123 21/05/2000
2 Comp2 23123 21/05/2000
Table Address
Id, Company, Address_Type, City
1 1 7 A
2 2 2 B
3 2 62 C
Table Address_Type
Id, Name, importance_order
62 Adt1 1
7 Adt2 2
2 Adt3 2
What I need to do is get each company and its main address, based on "importance_order". There is already a function that returns this result:
Create function~~~~
Select * from Company c
join Address a on c.address_id = a.id
Join AddressType at on a.adresstype_id = at.id
ORDER by at.importance_order
Limit 1
My problem now is that this function is called once for every row in the query, and it takes a very long time (about 20 minutes). Is it possible to get the same result by joining tables instead? I need this join to get the first "most important" address and then the city name, but in a faster way. I need to keep the number of subqueries to a minimum.
Select * from table t
inner join Company c on t.company_id = c.id
left join address a on (c.company_id = c.id)
left join addresstype at on (a.adresstype_id = at.id)
where at.id = (
select max(id) from addresstype
where adresstype in (
select adresstype from adress where company_id = c.id
)
)
If anything is unclear, tell me and I will go into more detail.
Thanks.
For this you need PostgreSQL 8.4+ (the first release with window functions), I suppose:
SELECT T.*
FROM TABLE AS T
INNER JOIN
(
SELECT * FROM
(
SELECT C1.*, ROW_NUMBER() OVER(partition by C1.ID ORDER BY T.IMPORTANCE_ORDER) AS RN
FROM COMPANY AS C1
INNER JOIN ADDRESS AS A
ON C1.ID = A.COMPANY
INNER JOIN ADDRESS_TYPE AS T
ON T.ID = A.ADDRESS_TYPE
) A
WHERE RN = 1
) AS B
ON B.ID= T.COMPANY_ID
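To check the ROW_NUMBER() idea against the sample data from the question, here is a minimal sketch in Python with the built-in sqlite3 module (SQLite 3.25+), standing in for PostgreSQL. It assumes a lower importance_order means a more important address, which matches the ORDER BY ... LIMIT 1 in the existing function, and it uses the column names from the sample tables. The extra a.id tie-breaker is an added assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER, name TEXT);
CREATE TABLE address (id INTEGER, company INTEGER, address_type INTEGER, city TEXT);
CREATE TABLE address_type (id INTEGER, name TEXT, importance_order INTEGER);
INSERT INTO company VALUES (1, 'Comp1'), (2, 'Comp2');
INSERT INTO address VALUES (1, 1, 7, 'A'), (2, 2, 2, 'B'), (3, 2, 62, 'C');
INSERT INTO address_type VALUES (62, 'Adt1', 1), (7, 'Adt2', 2), (2, 'Adt3', 2);
""")

rows = conn.execute("""
SELECT c.id, c.name, a.city
FROM company c
LEFT JOIN (
    -- Rank each company's addresses once, instead of calling a
    -- function per row; rn = 1 is the most important address.
    SELECT a.company, a.city,
           ROW_NUMBER() OVER (PARTITION BY a.company
                              ORDER BY t.importance_order, a.id) AS rn
    FROM address a
    JOIN address_type t ON t.id = a.address_type
) a ON a.company = c.id AND a.rn = 1
ORDER BY c.id
""").fetchall()

print(rows)
```

Comp2 has two addresses; the one of type 62 (importance_order 1) wins, so its city is C.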