Get the rows with first phone number - sql

I need to display 100 members per page. Because of multiple phone numbers of a member, I have to pick the first phone number for each member.
Here is the query, which returns every phone number of a member:
SELECT * FROM
(
SELECT
row_number() over(order by(1)) rn,
NAME, PHONE
FROM MEMBERS t0
LEFT OUTER JOIN MEMBER_IDENTITY ON MEMBER_IDENTITY.ID=t0.ID
LEFT JOIN MEMBER_PHONE ON MEMBER_PHONE.MEMBER_ID=t0.ID
WHERE
NAME LIKE 'U%'
ORDER BY NAME ASC
)
WHERE rn >= 0
AND rn <= 100
How can I pick the first (or MAX, etc.) phone number?

You could make a sub query for retrieving the phone numbers together with their row number per member, and then filter out the first of them:
SELECT *
FROM (
SELECT row_number() OVER (ORDER BY name ASC) rn,
name,
phone
FROM members t0
LEFT JOIN member_identity
ON member_identity.id = t0.id
LEFT JOIN (
SELECT member_id,
phone,
row_number() OVER (PARTITION BY member_id ORDER BY (1)) ph_rn
FROM member_phone
) member_phone
ON member_phone.member_id = t0.id
AND ph_rn = 1
WHERE name LIKE 'U%'
ORDER BY name ASC
)
WHERE rn BETWEEN 0 AND 100
I would:
use the same order for determining the rn value as the result set (ORDER BY name ASC), otherwise the order will not be consistent across pages;
use BETWEEN in the outer WHERE condition, although the lower bound condition (0) is not necessary for the first page.
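To make the pattern concrete, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. That setup is an assumption made only to keep the example self-contained (it needs SQLite 3.25 or later for window functions); the sample rows are invented, and "first" is taken to mean the lowest phone value, since the original ORDER BY (1) leaves the choice arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE member_phone (member_id INTEGER, phone TEXT);
    -- made-up sample data: Uma has two phones, Ursula has none
    INSERT INTO members VALUES (1, 'Uma'), (2, 'Ulrich'), (3, 'Bob'), (4, 'Ursula');
    INSERT INTO member_phone VALUES
        (1, '555-0102'), (1, '555-0101'), (2, '555-0200');
""")

query = """
SELECT rn, name, phone
FROM (
    SELECT row_number() OVER (ORDER BY m.name) AS rn,
           m.name,
           p.phone
    FROM members m
    LEFT JOIN (
        -- number each member's phones; ph_rn = 1 is the one we keep
        SELECT member_id,
               phone,
               row_number() OVER (PARTITION BY member_id ORDER BY phone) AS ph_rn
        FROM member_phone
    ) p
      ON p.member_id = m.id
     AND p.ph_rn = 1
    WHERE m.name LIKE 'U%'
) AS page
WHERE rn BETWEEN 1 AND 100
ORDER BY rn
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each member appears once, members without a phone keep a NULL phone because of the LEFT JOIN, and the page filter is applied to row numbers that follow the same name order as the output.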

When you switch to MAX you can apply the ROW_NUMBER directly on the name (Windowed Aggregate Functions are calculated after GROUP BY/HAVING):
SELECT * FROM
(
SELECT
row_number() over (ORDER BY NAME ASC) rn,
NAME, MAX(PHONE) AS PHONE
FROM MEMBERS t0
LEFT OUTER JOIN MEMBER_IDENTITY ON MEMBER_IDENTITY.ID=t0.ID
LEFT JOIN MEMBER_PHONE ON MEMBER_PHONE.MEMBER_ID=t0.ID
WHERE NAME LIKE 'U%'
GROUP BY NAME
)
WHERE rn >= 0
AND rn <= 100
or move the aggregation into a Derived Table:
SELECT * FROM
(
SELECT
row_number() over (ORDER BY NAME ASC) rn,
NAME, PHONE
FROM MEMBERS t0
LEFT OUTER JOIN MEMBER_IDENTITY ON MEMBER_IDENTITY.ID=t0.ID
LEFT JOIN
( SELECT MEMBER_ID, MAX(PHONE) AS PHONE
FROM MEMBER_PHONE
GROUP BY MEMBER_ID
) MEMBER_PHONE
ON MEMBER_PHONE.MEMBER_ID=t0.ID
WHERE NAME LIKE 'U%'
)
WHERE rn >= 0
AND rn <= 100

Related

How to join to a statement with a row_number() function in SQL?

I have a SQL statement with a row_number() function, and I would like to join on additional tables to get the fields below. How would I accomplish this?
Desired fields:
EMPLOYEE.EMPLID
EMPLOYEE.JOBTITLE
NAME.FIRST_NAME
NAME.LAST_NAME
LOCATION.ADDRESS
PROFESSIONAL_NAME.PROF_NAME
Beginning SQL:
SELECT COUNT(*)
FROM
(
SELECT EMPLOYEE.*, ROW_NUMBER() OVER (PARTITION BY EMPLID ORDER BY
PRIM_ROLE_IND DESC, EMPL_RCD ASC) as RN
FROM EMPLOYEE
WHERE JOB_INDICATOR = 'P'
) dt
WHERE RN = 1
When I try to add a left join at the end, I get an error that says "EMPLOYEE"."EMPLID": invalid identifier.
What I'm trying:
SELECT
EMPLOYEE.EMPLID,
EMPLOYEE.JOBTITLE,
NAME.FIRST_NAME,
NAME.LAST_NAME,
LOCATION.ADDRESS,
PROFESSIONAL_NAME.PROF_NAME
FROM
(
SELECT EMPLOYEE.*, ROW_NUMBER() OVER (PARTITION BY EMPLID ORDER BY
PRIM_ROLE_IND DESC, EMPL_RCD ASC) as RN
FROM EMPLOYEE
WHERE JOB_INDICATOR = 'P'
)
LEFT JOIN NAME ON EMPLOYEE.EMPLID = NAME.EMPLID
WHERE
RN = 1
AND
NAME.EFFDT = (
SELECT
MAX (NAME2.EFFDT)
FROM
NAME NAME2
WHERE
NAME2.EMPLID = NAME.EMPLID
AND NAME.NAME_TYPE = 'PRI'
)
AND EMPLOYEE.JOB_INDICATOR = 'P'
You just need to alias your table
...
(
SELECT EMPLOYEE.*, ROW_NUMBER() OVER (PARTITION BY EMPLID ORDER BY
PRIM_ROLE_IND DESC, EMPL_RCD ASC) as RN
FROM EMPLOYEE
WHERE JOB_INDICATOR = 'P'
) temp_employee --add this
LEFT JOIN NAME ON temp_employee.EMPLID = NAME.EMPLID
...
When you wrap row_number() in an inner select, you essentially create a new (derived) table. You need to alias this table and then refer to that alias. In the query above, the FROM source is the inner select, not the EMPLOYEE table, so EMPLOYEE.EMPLID is not visible outside it. See below for a simplified example.
select newtable.field from (select field from mytable) newtable
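The same point can be tried end to end with a small sketch. It uses Python's sqlite3 module and an in-memory database purely as an assumption for runnability (SQLite 3.25+ for window functions); the rows and the short aliases e and n are made up, and the tables are cut down to a few columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emplid TEXT, empl_rcd INTEGER, jobtitle TEXT);
    CREATE TABLE name (emplid TEXT, first_name TEXT);
    -- made-up sample data: E1 holds two jobs, E2 holds one
    INSERT INTO employee VALUES
        ('E1', 0, 'Analyst'), ('E1', 1, 'Tutor'), ('E2', 0, 'Clerk');
    INSERT INTO name VALUES ('E1', 'Ada'), ('E2', 'Ben');
""")

query = """
SELECT e.emplid, e.jobtitle, n.first_name
FROM (
    SELECT employee.*,
           row_number() OVER (PARTITION BY emplid ORDER BY empl_rcd) AS rn
    FROM employee
) e   -- outside the parentheses, the derived table is known only by this alias
LEFT JOIN name n ON n.emplid = e.emplid
WHERE e.rn = 1
ORDER BY e.emplid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every outer reference goes through the alias e, including the rn filter, so the outer query never needs to name the EMPLOYEE table directly.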

select value based on max of other column

I have a few questions about a table I'm trying to make in Postgres.
The following table is my input:
id  area  count  function
--------------------------
1   100   20     living
1   200   30     industry
2   400   10     living
2   400   10     industry
2   400   20     education
3   150   1      industry
3   150   1      education
I want to group by id and get the dominant function based on max area, while summing up the rows for area and count. When area is equal, it should be based on max count; when area and count are both equal, it should be based on a priority among functions (I still have to decide whether education takes priority over industry or vice versa). So the result should be:
id  area  count  function
--------------------------
1   300   50     industry
2   1200  40     education
3   300   2      industry
I tried a lot of things and maybe it's easy, but I don't get it. Can someone help me get the right SQL?
One method uses row_number() and conditional aggregation:
select id, sum(area), sum(count),
max(function) filter (where seqnum = 1) as function
from (select t.*,
row_number() over (partition by id order by area desc) as seqnum
from t
) t
group by id;
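As a check against the sample data from the question, here is a minimal sketch of the row_number() plus conditional aggregation idea, run in an in-memory SQLite database via Python's sqlite3 module (an assumption for runnability; SQLite 3.25+ is needed for window functions). It writes the condition with CASE instead of FILTER so it also runs on older engines, and it extends the ordering with count and then function descending. That last tie-break is only one possible priority rule, chosen here because it happens to reproduce the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, area INTEGER, "count" INTEGER, "function" TEXT);
    INSERT INTO t VALUES
        (1, 100, 20, 'living'),
        (1, 200, 30, 'industry'),
        (2, 400, 10, 'living'),
        (2, 400, 10, 'industry'),
        (2, 400, 20, 'education'),
        (3, 150, 1, 'industry'),
        (3, 150, 1, 'education');
""")

query = """
SELECT id,
       sum(area) AS area,
       sum("count") AS "count",
       -- only the top-ranked row per id contributes a non-NULL value
       max(CASE WHEN seqnum = 1 THEN "function" END) AS "function"
FROM (
    SELECT t.*,
           row_number() OVER (
               PARTITION BY id
               ORDER BY area DESC, "count" DESC, "function" DESC
           ) AS seqnum
    FROM t
) ranked
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The tie-break columns in the window's ORDER BY are what decide ids 2 and 3, where several rows share the maximum area.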
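Another method uses distinct on:
select distinct on (id) id, sum(area) over (partition by id) as area,
sum(count) over (partition by id) as count,
function
from t
-- t.area and t.count are qualified so they refer to the input columns, not the summed output columns
order by id, t.area desc, t.count desc;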
Use a scalar sub-query for "function".
select t.id, sum(t.area), sum(t.count),
(
select "function"
from the_table
where id = t.id
order by area desc, count desc, "function" desc
limit 1
) as "function"
from the_table as t
group by t.id order by t.id;
You can use sum as a window function:
select distinct on (t.id)
id,
sum(area) over (partition by id) as area,
sum(count) over (partition by id) as count,
( select function from tbl_test where tbl_test.id = t.id order by area desc, count desc limit 1 ) as function
from tbl_test t
This is how you get the function for each group based on id:
select yt1.id, yt1.function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null;
(we ensure that no yt2 exists with the same id but a higher area)
This would work nicely, but several rows may share the max area while having different function values. To cope with this issue, let's ensure that exactly one is chosen:
select yt1.id, max(yt1.function) as function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null
group by yt1.id;
Now, let's join this to our main table:
select yourtable.id, sum(yourtable.area), sum(yourtable.count), t.function
from yourtable
join (
select yt1.id, max(yt1.function) as function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null
group by yt1.id
) t
on yourtable.id = t.id
group by yourtable.id, t.function;

How do I select 100 records from one table for each unique record from another

I have one table of addresses, another table of coupons. I want to select 10 coupons per address. How would I go about doing that? I know this is very basic, but I've been out of SQL for some time now and trying to get reacquainted with it the best I can...
Table 1
Name Address
-------------------
Store 1 Address 1
Store 2 Address 2
Table 2
Coupons
--------
coupon1
coupon2
...
coupon19
coupon20
You can use window functions:
select t1.*, t2.coupons
from (
select t1.*, row_number() over(order by id) rn
from table1 t1
) t1
inner join (
select t2.*, row_number() over(order by id) rn
from table2 t2
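) t2 on (t2.rn - 1) / 10 = t1.rn - 1
The idea is to enumerate rows of each table with row_number(), then join the results with a condition on the row numbers. Both sides of the condition start at 0, and the division is assumed to be integer division (as in SQL Server or PostgreSQL), so coupons 1-10 go to the first address, 11-20 to the second, and so on. The above query gives you 10 coupons per address.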
To get a stable result, you need a column (or a set of columns) in each table that uniquely identifies each row: I assumed id in both tables.
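The row-number join can be tried on the sample data with the sketch below. It assumes an in-memory SQLite database driven from Python's sqlite3 module (SQLite 3.25+ for window functions), an id column on both tables as described above, and row numbers aligned so that both sides of the join condition start at zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, coupons TEXT);
    INSERT INTO table1 VALUES
        (1, 'Store 1', 'Address 1'), (2, 'Store 2', 'Address 2');
""")
# coupon1 .. coupon20, as in the question
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(i, f"coupon{i}") for i in range(1, 21)],
)

query = """
SELECT s.name, c.coupons
FROM (SELECT t1.*, row_number() OVER (ORDER BY id) AS rn FROM table1 t1) s
JOIN (SELECT t2.*, row_number() OVER (ORDER BY id) AS rn FROM table2 t2) c
  -- integer division: coupon rows 1-10 map to 0, rows 11-20 map to 1
  ON (c.rn - 1) / 10 = s.rn - 1
ORDER BY s.rn, c.rn
"""
rows = conn.execute(query).fetchall()

# group the coupons by store for easier inspection
per_store = {}
for name, coupon in rows:
    per_store.setdefault(name, []).append(coupon)
print(per_store)
```

With 2 addresses and 20 coupons every coupon is used exactly once; with more coupons than 10 per address, the surplus rows simply find no matching address.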
Do you want 10 coupons per store? 100 coupons per store? Your question response is different from the post. Or maybe you'd like to evenly distribute all available coupons across all the stores? Some of this query only builds sample data to demonstrate the output; the main thing to focus on is the use of NTILE(10) to break the coupons into ten groups. A ROW_NUMBER applied within each group then gives you ten coupons per id value, which can be joined on:
WITH random_data AS
(
SELECT ROW_NUMBER() OVER (ORDER BY id) AS nums
FROM sysobjects
), store_info AS
(
SELECT ROW_NUMBER() OVER (ORDER BY nums) AS join_id,
'Store' + CONVERT(VARCHAR(10),nums) AS StoreName,
'Address' + CONVERT(VARCHAR(10),nums) AS StoreAddress
FROM random_data
), more_random_data AS
(
SELECT ROW_NUMBER() OVER (ORDER BY t2.nums) AS nums
FROM random_data t1
CROSS JOIN random_data t2
), coupons AS
(
SELECT NTILE(10) OVER (ORDER BY nums) AS group_id,
'Coupon' + CONVERT(VARCHAR(10),nums) AS Coupon,
nums
FROM more_random_data
), coupons_with_join_id AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY nums) AS join_id,
Coupon
FROM coupons
)
SELECT StoreName, StoreAddress, Coupon
FROM store_info AS si
JOIN coupons_with_join_id AS cwji
ON si.join_id = cwji.join_id
ORDER BY si.join_id, Coupon
The inherent issue here is that the 2 tables have no relation to each other. So your options are either to force a pseudo relation, like the other answers show, or create a relation between the two tables, like adding a store_name column to the coupon table.
This distributes all coupons (almost) evenly across all addresses:
with addr as
( -- prepare addresses by adding a sequence
select Name, Address,
-- 1-n
row_number() over (order by name) as rn
from table1
)
,coup as
( -- prepare coupons by adding same "sequence"
select coupons,
-- 1-n, same number of coupons (+/-1) for each address
ntile((select count(*) from table1))
over (order by coupons) as num
from table2
)
select *
from addr
join coup
on addr.rn = coup.num

SQL query to get first and last of a sequence

With the following query ...
select aa.trip_id, aa.arrival_time, aa.departure_time, aa.stop_sequence, aa.stop_id, bb.stop_name
from OeBB_Stop_Times aa
left join OeBB_Stops bb on aa.stop_id = bb.stop_id
I get the following table:
Now I want the first and last rows, by stop_sequence, for each trip_id, so the result should be:
How can I do that?
Thanks
You can do a sub-query to get the min and max and join against that data.
Like this:
select aa.trip_id, aa.arrival_time, aa.departure_time, aa.stop_sequence, aa.stop_id, bb.stop_name
from OeBB_Stop_Times aa
join (
SELECT trip_id, max(stop_sequence) as max_stop, min(stop_sequence) as min_stop
FROM OeBB_Stop_Times
GROUP BY trip_id
) sub on aa.trip_id = sub.trip_id AND (aa.stop_sequence = sub.max_stop or aa.stop_sequence = sub.min_stop)
left join OeBB_Stops bb on aa.stop_id = bb.stop_id
You can use the ROW_NUMBER() window function twice to filter out rows, as in:
select *
from (
select *,
row_number() over(partition by trip_id order by arrival_time) as rn,
row_number() over(partition by trip_id order by arrival_time desc) as rnr
from OeBB_Stop_Times
) x
where rn = 1 or rnr = 1
order by trip_id, arrival_time
You can use row_number():
select s.*
from (select st.trip_id, st.arrival_time, st.departure_time,
st.stop_sequence, st.stop_id, s.stop_name,
row_number() over (partition by st.trip_id order by st.stop_sequence) as seqnum_asc,
row_number() over (partition by st.trip_id order by st.stop_sequence desc) as seqnum_desc
from OeBB_Stop_Times st left join
OeBB_Stops s
on st.stop_id = s.stop_id
) s
where 1 in (seqnum_asc, seqnum_desc);
Note that I fixed the table aliases so they are meaningful rather than arbitrary letters.
Actually, if the stop_sequence is guaranteed to start at 1, this is a bit simpler:
select s.*
from (select st.trip_id, st.arrival_time, st.departure_time,
st.stop_sequence, st.stop_id, s.stop_name,
max(stop_sequence) over (partition by st.trip_id) as max_stop_sequence
from OeBB_Stop_Times st left join
OeBB_Stops s
on st.stop_id = s.stop_id
) s
where stop_sequence in (1, max_stop_sequence);
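For a quick check of the two-row_number() approach, here is a minimal sketch on invented stop data. It assumes an in-memory SQLite database accessed through Python's sqlite3 module (SQLite 3.25+ for window functions) and uses a simplified, hypothetical stop_times table in place of OeBB_Stop_Times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stop_times (trip_id TEXT, stop_sequence INTEGER, stop_id TEXT);
    -- made-up sample data: T1 has three stops, T2 has two
    INSERT INTO stop_times VALUES
        ('T1', 1, 'A'), ('T1', 2, 'B'), ('T1', 3, 'C'),
        ('T2', 1, 'C'), ('T2', 2, 'A');
""")

query = """
SELECT trip_id, stop_sequence, stop_id
FROM (
    SELECT st.*,
           -- 1 marks the first stop of the trip
           row_number() OVER (PARTITION BY trip_id ORDER BY stop_sequence) AS seqnum_asc,
           -- 1 marks the last stop of the trip
           row_number() OVER (PARTITION BY trip_id ORDER BY stop_sequence DESC) AS seqnum_desc
    FROM stop_times st
) s
WHERE 1 IN (seqnum_asc, seqnum_desc)
ORDER BY trip_id, stop_sequence
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The middle stop of T1 is dropped because it is ranked 2 in both directions, while a two-stop trip such as T2 keeps both of its rows.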

How to reference a window function inside the current table?

I have this part in a larger query which consumes a lot of RAM:
TopPerPost as
(
select Id,
CloseReasonTypeId,
Name,
ReasonsPerPost.TotalByCloseReason,
row_number() over(partition by Id order by TotalByCloseReason desc) seq -- Get the most common Id (The most common close Reason)
from ReasonsPerPost
where Name is NOT NULL and TopPerPost.seq=1 -- Remove useless results here, instead of doing it later
)
but I got: The multi-part identifier "TopPerPost.seq" could not be bound.
Last detail... I only use the Name column in a later INNER JOIN of that table.
You can't reference a window function in the where of the same query. Just create a second cte.
with TopPerPost as
(
select Id,
CloseReasonTypeId,
Name,
ReasonsPerPost.TotalByCloseReason,
row_number() over(partition by Id order by TotalByCloseReason desc) seq -- Get the most common Id
from ReasonsPerPost
where Name is NOT NULL
)
, OnlyTheTop as
(
select *
from TopPerPost
where seq = 1
)
Or you can do it like this.
select * from
(
select Id,
CloseReasonTypeId,
Name,
ReasonsPerPost.TotalByCloseReason,
row_number() over(partition by Id order by TotalByCloseReason desc) seq -- Get the most common Id
from ReasonsPerPost
where Name is NOT NULL
) s
where seq = 1
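The wrap-then-filter pattern can be checked with the small sketch below. It assumes an in-memory SQLite database used from Python's sqlite3 module (SQLite 3.25+ for window functions) rather than SQL Server, and the rows are invented. The NULL-named row deliberately carries the highest total to show that the inner WHERE removes it before the rows are numbered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ReasonsPerPost (Id INTEGER, Name TEXT, TotalByCloseReason INTEGER);
    -- made-up sample data
    INSERT INTO ReasonsPerPost VALUES
        (1, 'duplicate', 5), (1, 'off-topic', 2), (1, NULL, 9),
        (2, 'unclear', 1), (2, 'off-topic', 4);
""")

query = """
SELECT Id, Name, TotalByCloseReason
FROM (
    SELECT Id, Name, TotalByCloseReason,
           row_number() OVER (PARTITION BY Id ORDER BY TotalByCloseReason DESC) AS seq
    FROM ReasonsPerPost
    WHERE Name IS NOT NULL   -- applied before the window function runs
) s
WHERE seq = 1                -- seq is an ordinary column at this level
ORDER BY Id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The outer query sees seq as a plain column of the derived table, which is why the filter is legal there and not in the inner WHERE.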
Here is another option that should eliminate the need for so many rows being returned.
select Id,
CloseReasonTypeId,
Name,
s.TotalByCloseReason
from ReasonsPerPost rpp
cross apply
(
select top 1 TotalByCloseReason
from ReasonsPerPost rpp2
where rpp2.Id = rpp.Id
order by TotalByCloseReason desc
) s
where Name is NOT NULL
Attempt #4...this would be a LOT easier with a sql fiddle to work with.
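select rpp.Id,
rpp.CloseReasonTypeId,
rpp.Name,
s.TotalByCloseReason
from ReasonsPerPost rpp
inner join
(
-- a joined derived table cannot reference rpp, so aggregate per Id instead of using top 1
select Id, max(TotalByCloseReason) as TotalByCloseReason
from ReasonsPerPost
where Name is NOT NULL
group by Id
) s on s.Id = rpp.Id and s.TotalByCloseReason = rpp.TotalByCloseReason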
where Name is NOT NULL
The query below might work for your needs, but without looking at the data it is hard to tell whether it will.
;with t as
(
Select Id, max(totalbyclosereason) TC from reasonsperpost where name is not null group by id
)
Select T.id,t.tc,c.closereasontypeid,c.name
From t join reasonsperpost c on t.id = c.id and t.tc = c.totalbyclosereason