Compare dates and data column - sql

I have tables like this:
TABLE 1 - PERSON:
m_id | name |
-------------
22 | jo |
-------------
77 | john |
--------------
TABLE 2 - AMT_DATA
m_id | amt | activity |
-------------------------
22 | 100 | - |
-------------------------
77 | 300 | n |
-------------------------
TABLE 3 - STATUS_DATA:
m_id | status | s_date |
22 | - | 01.01.2000 |
22 | n | 01.01.2001 |
22 | - | 01.01.2002 |
77 | - | 01.01.2001 |
77 | n | 01.01.2002 |
How can I write a query or procedure that returns all m_ids whose row with the latest status_data.s_date also has status_data.status = '-'?
I need to get result like this:
person.m_id | person.name | amt_data.amt | status | s_date
------------------------------------------------------------------
22 | jo | 100 | - | 01.01.2002

I don't see what amt really has to do with the question. You can just join that in.
One method is:
select p.*, status_date, status
from person p join
(select m_id, max(s_date) as status_date,
max(status) keep (dense_rank first order by s_date desc) as status
from status_data
group by m_id
) s
using (m_id)
where status = '-';
The keep syntax is Oracle's (rather verbose) way of implementing a "first" aggregation function.

You can use the analytical function as follows:
Select * from
(Select p.m_id,
p.name,
a.amt,
s.status,
s.s_date,
Row_number() over (partition by p.m_id order by s.s_date desc) as rn
From person p
join amt_data a on p.m_id = a.m_id
join status_data s on p.m_id = s.m_id)
-- filter on status only after the latest row per m_id has been picked
Where rn = 1 and status = '-';
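To check the rank-then-filter idea against the sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for Oracle here (it needs SQLite 3.25+ for window functions), and the dates are rewritten as ISO strings so they sort correctly as text; both are assumptions of this sketch, not part of the question.

```python
import sqlite3

# In-memory database with the sample rows from the question.
# Assumption: s_date is stored as an ISO string so text ordering matches date ordering.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (m_id INTEGER, name TEXT);
CREATE TABLE amt_data (m_id INTEGER, amt INTEGER, activity TEXT);
CREATE TABLE status_data (m_id INTEGER, status TEXT, s_date TEXT);
INSERT INTO person VALUES (22, 'jo'), (77, 'john');
INSERT INTO amt_data VALUES (22, 100, '-'), (77, 300, 'n');
INSERT INTO status_data VALUES
  (22, '-', '2000-01-01'), (22, 'n', '2001-01-01'), (22, '-', '2002-01-01'),
  (77, '-', '2001-01-01'), (77, 'n', '2002-01-01');
""")

# Rank every status row per m_id (latest first), then keep the latest row
# only if its status is '-'.
query = """
SELECT m_id, name, amt, status, s_date
FROM (
  SELECT p.m_id, p.name, a.amt, s.status, s.s_date,
         ROW_NUMBER() OVER (PARTITION BY p.m_id ORDER BY s.s_date DESC) AS rn
  FROM person p
  JOIN amt_data a ON a.m_id = p.m_id
  JOIN status_data s ON s.m_id = p.m_id
)
WHERE rn = 1 AND status = '-'
"""
latest_dash_rows = conn.execute(query).fetchall()
print(latest_dash_rows)
```

Only m_id 22 comes back, because the latest row for 77 has status 'n'.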

Related

Each rows to column values

I'm trying to create a view that shows first table's columns plus second table's first 3 records sorted by date in 1 row.
I tried selecting specific rows from the sub table with OFFSET and joining them to the main table. However, the joined query result is ordered by date only, so without a
WHERE tblMain_id = ..
clause in the joining SQL it returns the wrong record.
Here is sqlfiddle example: sqlfiddle demo
tblMain
| id | fname | lname | salary |
+----+-------+-------+--------+
| 1 | John | Doe | 1000 |
| 2 | Bob | Ross | 5000 |
| 3 | Carl | Sagan | 2000 |
| 4 | Daryl | Dixon | 3000 |
tblSub
| id | email | emaildate | tblmain_id |
+----+-----------------+------------+------------+
| 1 | John#Doe1.com | 2019-01-01 | 1 |
| 2 | John#Doe2.com | 2019-01-02 | 1 |
| 3 | John#Doe3.com | 2019-01-03 | 1 |
| 4 | Bob#Ross1.com | 2019-02-01 | 2 |
| 5 | Bob#Ross2.com | 2018-12-01 | 2 |
| 6 | Carl#Sagan.com | 2019-10-01 | 3 |
| 7 | Daryl#Dixon.com | 2019-11-01 | 4 |
View I am trying to achieve:
| id | fname | lname | salary | email_1 | emaildate_1 | email_2 | emaildate_2 | email_3 | emaildate_3 |
+----+-------+-------+--------+---------------+-------------+---------------+-------------+---------------+-------------+
| 1 | John | Doe | 1000 | John#Doe1.com | 2019-01-01 | John#Doe2.com | 2019-01-02 | John#Doe3.com | 2019-01-03 |
View I have created
| id | fname | lname | salary | email_1 | emaildate_1 | email_2 | emaildate_2 | email_3 | emaildate_3 |
+----+-------+-------+--------+---------+-------------+---------------+-------------+---------------+-------------+
| 1 | John | Doe | 1000 | (null) | (null) | John#Doe1.com | 2019-01-01 | John#Doe2.com | 2019-01-02 |
You can use conditional aggregation:
select m.id, m.fname, m.lname, m.salary,
max(s.email) filter (where seqnum = 1) as email_1,
max(s.emailDate) filter (where seqnum = 1) as emailDate_1,
max(s.email) filter (where seqnum = 2) as email_2,
max(s.emailDate) filter (where seqnum = 2) as emailDate_2,
max(s.email) filter (where seqnum = 3) as email_3,
max(s.emailDate) filter (where seqnum = 3) as emailDate_3
from tblMain m left join
(select s.*,
row_number() over (partition by tblMain_id order by emailDate) as seqnum
from tblsub s
) s
on s.tblMain_id = m.id
where m.id = 1
group by m.id, m.fname, m.lname, m.salary;
Here is a SQL Fiddle.
Here is a solution that should get you what you expect.
This works by first ranking records within each table and joining them together. Then, the outer query uses aggregation to generate the expected output.
This solution will work even if the first record in the main table does not have id 1. Filtering also occurs within the JOIN conditions, so this should be quite efficient.
SELECT
m.id,
m.fname,
m.lname,
m.salary,
MAX(CASE WHEN s.rn = 1 THEN s.email END) email_1,
MAX(CASE WHEN s.rn = 1 THEN s.emaildate END) email_date1,
MAX(CASE WHEN s.rn = 2 THEN s.email END) email_2,
MAX(CASE WHEN s.rn = 2 THEN s.emaildate END) email_date2,
MAX(CASE WHEN s.rn = 3 THEN s.email END) email_3,
MAX(CASE WHEN s.rn = 3 THEN s.emaildate END) email_date3
FROM
(
SELECT m.*, ROW_NUMBER() OVER(ORDER BY id) rn
FROM tblMain m
) m
INNER JOIN (
SELECT
tblmain_id,
email,
emaildate,
ROW_NUMBER() OVER(PARTITION BY tblmain_id ORDER BY emaildate) rn
FROM tblSub
) s
ON m.id = s.tblmain_id
AND m.rn = 1
AND s.rn <= 3
GROUP BY
m.id,
m.fname,
m.lname,
m.salary
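For anyone who wants to try the rank-and-pivot idea without a database server, here is a rough sketch using Python's sqlite3 module on a subset of the sample rows. It assumes SQLite 3.25+ for ROW_NUMBER(), pivots only the email columns to keep it short, and uses a LEFT JOIN so that people with fewer than three emails still appear; these choices are mine, not part of either answer.

```python
import sqlite3

# Subset of the sample data: two people, five emails.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblMain (id INTEGER, fname TEXT, lname TEXT, salary INTEGER);
CREATE TABLE tblSub (id INTEGER, email TEXT, emaildate TEXT, tblmain_id INTEGER);
INSERT INTO tblMain VALUES (1, 'John', 'Doe', 1000), (2, 'Bob', 'Ross', 5000);
INSERT INTO tblSub VALUES
  (1, 'John#Doe1.com', '2019-01-01', 1),
  (2, 'John#Doe2.com', '2019-01-02', 1),
  (3, 'John#Doe3.com', '2019-01-03', 1),
  (4, 'Bob#Ross1.com', '2019-02-01', 2),
  (5, 'Bob#Ross2.com', '2018-12-01', 2);
""")

# Number each person's emails by date, then spread ranks 1..3 into columns
# with conditional aggregation (MAX over a CASE that matches one rank).
pivot_sql = """
SELECT m.id, m.fname,
       MAX(CASE WHEN s.rn = 1 THEN s.email END) AS email_1,
       MAX(CASE WHEN s.rn = 2 THEN s.email END) AS email_2,
       MAX(CASE WHEN s.rn = 3 THEN s.email END) AS email_3
FROM tblMain m
LEFT JOIN (
  SELECT tblmain_id, email,
         ROW_NUMBER() OVER (PARTITION BY tblmain_id ORDER BY emaildate) AS rn
  FROM tblSub
) s ON s.tblmain_id = m.id AND s.rn <= 3
GROUP BY m.id, m.fname
ORDER BY m.id
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Bob has only two emails, so his email_3 comes back as NULL (None in Python), and his earlier email (Bob#Ross2.com) lands in email_1 because the ranking is by date, not by id.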

Retrieve the minimal create date with multiple rows

I have an issue with an SQL query that I am trying to write. I am trying to retrieve the row that has the minimal create_dt for each inst (see table) and amount (which isn't unique).
Unfortunately I can't use group by as the amount column isn't unique.
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company A | 400 | 4545 | 01/11/2018 |
| Company A | 200 | 4545 | 31/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
| Company B | 212 | 4893 | 04/10/2016 |
| Company B | 100 | 4893 | 10/10/2017 |
| Company B | 20 | 4893 | 04/10/2018 |
+--------------+--------+------+-------------+
In the above example I expect to see:
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
+--------------+--------+------+-------------+
Code:
SELECT
bill_company, bill_name, account_no
FROM
dbo.customer_information;
SELECT
balance_id, balance_id2, minus_balance,new_balance,
create_date, account_no
FROM
dbo.btr
SELECT
balance_id, balance_id2, expired_Date, amount, balance_type, account_no
FROM
dbo.btr_balance
SELECT
balance_ist, expired_date, account_no, balance_type
FROM
dbo.BALANCE_inst
Retrieve the minimal create date for a balance instance with the lowest balance for a balance inst.
(SELECT
bill_company,
bill_name,
account_no,
balance_ist,
amount,
MIN(create_date)
FROM
dbo.mtr btr
LEFT JOIN
btr_balance btrb ON btr.balance_id = btrb.balance_id
AND btr.balance_id2 = btrb.balance_id2
LEFT JOIN
balance_inst bali ON btr.account_no = bali.account_no
AND btrb.expired_date = bali.expired_date
GROUP BY
bill_company, bill_name, account_no,amount, balance_ist)
I have seen some solutions that use a correlated query, but I can't seem to get my head around them.
A common table expression (CTE) will help you.
;with cte as (
select *, row_number() over(partition by company_name order by create_date) rn
from dbo.myTable
)
select * from cte
where rn = 1;
Use row_number(). I assumed bill_company is your company name.
select * from
( SELECT bill_company,
bill_name,
account_no,
balance_ist,
amount,
create_date,
row_number() over(partition by bill_company order by create_date) rn
FROM dbo.mtr btr left join btr_balance btrb
on btr.balance_id = btrb.balance_id and btr.balance_id2 = btrb.balance_id2
left join balance_inst bali
on btr.account_no = bali.account_no and btrb.expired_date = bali.expired_date
) t where t.rn=1
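Since the question mentions correlated queries, here is a small sketch of that alternative, run through Python's sqlite3 module against the flattened sample table. The table name balances and the ISO date strings are assumptions made for this illustration; the real schema spans several joined tables.

```python
import sqlite3

# Flattened sample table from the question.
# Assumption: dates are stored as ISO strings so MIN() picks the earliest date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE balances (company_name TEXT, amount INTEGER, inst INTEGER, create_date TEXT);
INSERT INTO balances VALUES
  ('Company A', 1000, 4545, '2018-10-01'),
  ('Company A',  400, 4545, '2018-11-01'),
  ('Company A',  200, 4545, '2018-10-31'),
  ('Company B', 2000, 4893, '2016-10-01'),
  ('Company B',  212, 4893, '2016-10-04'),
  ('Company B',  100, 4893, '2017-10-10'),
  ('Company B',   20, 4893, '2018-10-04');
""")

# Correlated subquery: for each outer row b, the inner query finds the
# earliest create_date among rows with the same inst; b is kept only if
# its own date equals that minimum.
earliest_sql = """
SELECT b.company_name, b.amount, b.inst, b.create_date
FROM balances b
WHERE b.create_date = (
  SELECT MIN(b2.create_date) FROM balances b2 WHERE b2.inst = b.inst
)
ORDER BY b.inst
"""
earliest_rows = conn.execute(earliest_sql).fetchall()
print(earliest_rows)
```

One difference from the row_number() versions: if two rows for the same inst share the earliest date, this form returns both of them.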

sum of columns and list difference between rows

I am trying to get the difference between rows based on group by SELL_ID on the below table,
table1 - (table formatting courtesy of GitHub)
+---------+---------+----------+----------+------------------+---------+
| seq_ID | REQ_ID | CALL_ID | SELL_ID | REGION | COUNT |
+---------+---------+----------+----------+------------------+---------+
| 1 | 123 | C001 | S1 | AGL | 510563 |
| 2 | 123 | C001 | S1 | USL | 122967 |
| 3 | 123 | C001 | S1 | VALIC | 614106 |
| 4 | 123 | C001 | S2 | Inforce | 1247636 |
| 5 | 123 | C001 | S2 | NB | 0 |
| 6 | 123 | C001 | S3 | Seriatim Summary | 1247636 |
+---------+---------+----------+----------+------------------+---------+
I am trying to get the results as below,
table2 -
+---------+---------+----------+----------+-------+
| seq_ID | REQ_ID | CALL_ID | Summary | COUNT |
+---------+---------+----------+----------+-------+
| 1 | 123 | C001 | S1_vs_S2 | 0 |
| 2 | 123 | C001 | S2_vs_S3 | 0 |
| 3 | 123 | C001 | S3_vs_s1 | 0 |
+---------+---------+----------+----------+-------+
S1_vs_S2 is the difference between (sum(count) from table1 where sell_id='S1') and (sum(count) from table1 where sell_id='S2')
Below is the code that I am using, but it doesn't fetch the results:
INSERT INTO table2 (SEQ_ID, REQ_ID,call_id,summary,count)
SELECT min(seq_id) seq_id
, req_id
, call_id
, S1_vs_S2
,((SELECT sum(c2) FROM TABLE_STG_CTRL WHERE source='S1')-
SELECT sum(c2) FROM TABLE_STG_CTRL WHERE source='S2'))
FROM table1
GROUP BY req_ID, Ctrl_ID, c1, source
ORDER BY SEQ_ID ;
Does this do what you want?
select req_id, call_id, sell_id,
lead(sell_id) over (partition by req_id, call_id order by seq_id) as next_sell_id,
(cnt -
lead(cnt) over (partition by req_id, call_id order by seq_id)
) as diff
from (select req_id, call_id, sell_id, sum(count) as cnt, min(seq_id) as seq_id
from t
group by req_id, call_id, sell_id
) t
First, group the data by sell_id, req_id and call_id; this is subquery t in my code. Then self-join this result and show the difference. The only difficulty is constructing the join condition carefully:
demo with your sample data
with t as (
select sell_id sid, req_id, call_id, sum(cnt) cnt
from table1
group by sell_id, req_id, call_id )
select case t1.sid when 'S1' then 1 when 'S2' then 2 when 'S3' then 3 end id,
t1.req_id, t1.call_id, t1.sid||'_vs_'||t2.sid summary, t1.cnt - t2.cnt diff
from t t1
join t t2 on t1.req_id = t2.req_id
and t1.call_id = t2.call_id
and (t1.sid, t2.sid) in (('S1', 'S2'), ('S2', 'S3'), ('S3', 'S1'))
order by id
BTW, count is the name of a built-in aggregate function in Oracle; please avoid such names when naming columns etc.
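Here is a rough, runnable version of the group-then-self-join approach using Python's sqlite3 module. The column is named cnt instead of COUNT, and the pairs to compare are listed in a small pairs CTE rather than a tuple IN list, because that form is what SQLite accepts; both are choices of this sketch.

```python
import sqlite3

# Sample data from the question, with the COUNT column renamed to cnt.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (seq_id INTEGER, req_id INTEGER, call_id TEXT,
                     sell_id TEXT, region TEXT, cnt INTEGER);
INSERT INTO table1 VALUES
  (1, 123, 'C001', 'S1', 'AGL', 510563),
  (2, 123, 'C001', 'S1', 'USL', 122967),
  (3, 123, 'C001', 'S1', 'VALIC', 614106),
  (4, 123, 'C001', 'S2', 'Inforce', 1247636),
  (5, 123, 'C001', 'S2', 'NB', 0),
  (6, 123, 'C001', 'S3', 'Seriatim Summary', 1247636);
""")

# Step 1: one total per (req_id, call_id, sell_id).
# Step 2: join the totals to themselves through an explicit list of pairs.
diff_sql = """
WITH totals AS (
  SELECT req_id, call_id, sell_id, SUM(cnt) AS total
  FROM table1
  GROUP BY req_id, call_id, sell_id
),
pairs(left_id, right_id) AS (
  VALUES ('S1', 'S2'), ('S2', 'S3'), ('S3', 'S1')
)
SELECT t1.req_id, t1.call_id,
       p.left_id || '_vs_' || p.right_id AS summary,
       t1.total - t2.total AS diff
FROM pairs p
JOIN totals t1 ON t1.sell_id = p.left_id
JOIN totals t2 ON t2.sell_id = p.right_id
              AND t2.req_id = t1.req_id
              AND t2.call_id = t1.call_id
ORDER BY summary
"""
diff_rows = conn.execute(diff_sql).fetchall()
for row in diff_rows:
    print(row)
```

With the sample data every total is 1247636, so all three differences come out as 0, which matches the expected table2.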

Sql two table query most duplicated foreign key

I have these two tables, sport and student:
First table sport:
|idsport | name |
_______________________
| 1 | bobsled |
| 2 | skating |
| 3 | boarding |
| 4 | iceskating |
| 5 | skiing |
Second table student:
foreign key
|idstudent | name | sport_idsport
__________________________________________
| 1 | john | 3 |
| 2 | pauly | 2 |
| 3 | max | 1 |
| 4 | jane | 2 |
| 5 | nico | 5 |
So far I have this, which outputs the id that occurs most often, but I can't get it to work with two tables:
SELECT sport_idsport
FROM (SELECT sport_idsport FROM student GROUP BY sport_idsport ORDER BY COUNT(*) desc)
WHERE ROWNUM<=1;
I need to output the name of the most popular sport; in this case it would be skating.
I use Oracle SQL.
with counter as (
Select sport_idsport,
count(*) as cnt,
dense_rank() over (order by count(*) desc) as rn
from student
group by sport_idsport
)
select s.*, c.cnt
from sport s
join counter c on c.sport_idsport = s.idsport and c.rn = 1;
SQLFiddle example: http://sqlfiddle.com/#!4/b76e21/1
select cnt, sport_idsport from (
select count(*) cnt, sport_idsport
from student
group by sport_idsport
order by count(*) desc
)
where rownum = 1
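To see the count-then-join idea return the sport name on the sample data, here is a short sketch using Python's sqlite3 module. SQLite stands in for Oracle (3.25+ is assumed for DENSE_RANK()), and the counting and ranking are split into two CTEs purely for readability.

```python
import sqlite3

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sport (idsport INTEGER, name TEXT);
CREATE TABLE student (idstudent INTEGER, name TEXT, sport_idsport INTEGER);
INSERT INTO sport VALUES
  (1, 'bobsled'), (2, 'skating'), (3, 'boarding'), (4, 'iceskating'), (5, 'skiing');
INSERT INTO student VALUES
  (1, 'john', 3), (2, 'pauly', 2), (3, 'max', 1), (4, 'jane', 2), (5, 'nico', 5);
""")

# Count students per sport, rank the counts (ties share a rank),
# then join back to sport to get the name of every top-ranked sport.
popular_sql = """
WITH counts AS (
  SELECT sport_idsport, COUNT(*) AS cnt
  FROM student
  GROUP BY sport_idsport
),
ranked AS (
  SELECT sport_idsport, cnt,
         DENSE_RANK() OVER (ORDER BY cnt DESC) AS rnk
  FROM counts
)
SELECT s.name, r.cnt
FROM sport s
JOIN ranked r ON r.sport_idsport = s.idsport
WHERE r.rnk = 1
"""
popular_rows = conn.execute(popular_sql).fetchall()
print(popular_rows)
```

Because of DENSE_RANK(), two sports tied for first place would both be returned, whereas a rownum = 1 filter keeps only one of them.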

how to get median for every record?

There's no median function in sql server, so I'm using this wonderful suggestion:
https://stackoverflow.com/a/2026609/117700
This computes the median over an entire dataset, but I need the median per client_id.
My dataset is:
+-----------+-------------+
| client_id | TimesTested |
+-----------+-------------+
| 214220 | 1 |
| 215425 | 1 |
| 212839 | 4 |
| 215249 | 1 |
| 210498 | 3 |
| 110655 | 1 |
| 110655 | 1 |
| 110655 | 12 |
| 215425 | 4 |
| 100196 | 1 |
| 110032 | 1 |
| 110032 | 1 |
| 101944 | 3 |
| 101232 | 2 |
| 101232 | 1 |
+-----------+-------------+
here's the query I am using:
select client_id,
(
SELECT
(
(SELECT MAX(TimesTested ) FROM
(SELECT TOP 50 PERCENT t.TimesTested
FROM counted3 t
where t.timestested>1
and CLIENT_ID=t.CLIENT_ID
ORDER BY t.TimesTested ) AS BottomHalf)
+
(SELECT MIN(TimesTested ) FROM
(SELECT TOP 50 PERCENT t.TimesTested
FROM counted3 t
where t.timestested>1
and CLIENT_ID=t.CLIENT_ID
ORDER BY t.TimesTested DESC) AS TopHalf)
) / 2 AS Median
) TotalAvgTestFreq
from counted3
group by client_id
but it is giving me funny data:
+-----------+------------------+
| client_id | median???????????|
+-----------+------------------+
| 100007 | 84 |
| 100008 | 84 |
| 100011 | 84 |
| 100014 | 84 |
| 100026 | 84 |
| 100027 | 84 |
| 100028 | 84 |
| 100029 | 84 |
| 100042 | 84 |
| 100043 | 84 |
| 100071 | 84 |
| 100072 | 84 |
| 100074 | 84 |
+-----------+------------------+
How can I get the median for every client_id?
I am currently trying to use this awesome query from Aaron's site:
select c3.client_id,(
SELECT AVG(1.0 * TimesTested ) median
FROM
(
SELECT o.TimesTested ,
rn = ROW_NUMBER() OVER (ORDER BY o.TimesTested ), c.c
FROM counted3 AS o
CROSS JOIN (SELECT c = COUNT(*) FROM counted3) AS c
where count>1
) AS x
WHERE rn IN ((c + 1)/2, (c + 2)/2)
) a
from counted3 c3
group by c3.client_id
Unfortunately, as Richardthekiwi points out:
it's for a single median whereas this question is about a median per-partition
I would like to know how I can join it on counted3 to get the median per partition.
Note: If testFreq is an int or bigint type, you need to CAST it before taking an average, otherwise you'll get integer division, e.g. (2+5)/2 => 3 if 2 and 5 are the median records - e.g. AVG(Cast(testfreq as float)).
select client_id, avg(testfreq) median_testfreq
from
(
select client_id,
testfreq,
rn=row_number() over (partition by CLIENT_ID
order by testfreq),
c=count(testfreq) over (partition by CLIENT_ID)
from tbk
where timestested>1
) g
where rn in ((c + 1)/2, (c + 2)/2)
group by client_id;
The median is found either as the central record in an ODD number of rows, or the average of the two central records in an EVEN number of rows. This is handled by the condition rn in ((c + 1)/2, (c + 2)/2), which uses integer division to pick either the one or two records required (row 3 of 5, or rows 2 and 3 of 4).
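Here is a small sketch of the per-partition median that can be run with Python's sqlite3 module. It uses a few rows from the question's counted3 table, leaves out the timestested > 1 filter to keep the arithmetic easy to follow, and relies on SQLite's integer division behaving like SQL Server's for (c + 1)/2 and (c + 2)/2; SQLite 3.25+ is assumed for the window functions.

```python
import sqlite3

# A few rows from the question: one client with 3 values, one with 1, one with 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE counted3 (client_id INTEGER, TimesTested INTEGER);
INSERT INTO counted3 VALUES
  (212839, 4),
  (110655, 1), (110655, 1), (110655, 12),
  (215425, 1), (215425, 4);
""")

# Number the rows per client in value order and count them per client.
# (c + 1)/2 and (c + 2)/2 are the same row for odd c and the two middle
# rows for even c; 1.0 * forces a floating-point average.
median_sql = """
SELECT client_id, AVG(1.0 * TimesTested) AS median
FROM (
  SELECT client_id, TimesTested,
         ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY TimesTested) AS rn,
         COUNT(*) OVER (PARTITION BY client_id) AS c
  FROM counted3
) g
WHERE rn IN ((c + 1) / 2, (c + 2) / 2)
GROUP BY client_id
ORDER BY client_id
"""
median_rows = conn.execute(median_sql).fetchall()
for row in median_rows:
    print(row)
```

Client 110655 has values 1, 1, 12, so its median is the middle value 1; client 215425 has values 1 and 4, so its median is their average, 2.5.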
try this:
select client_id,
(
SELECT
(
(SELECT MAX(testfreq) FROM
(SELECT TOP 50 PERCENT t.testfreq
FROM counted3 t
where t.timestested>1
and c3.CLIENT_ID=t.CLIENT_ID
ORDER BY t.testfreq) AS BottomHalf)
+
(SELECT MIN(testfreq) FROM
(SELECT TOP 50 PERCENT t.testfreq
FROM counted3 t
where t.timestested>1
and c3.CLIENT_ID=t.CLIENT_ID
ORDER BY t.testfreq DESC) AS TopHalf)
) / 2 AS Median
) TotalAvgTestFreq
from counted3 c3
group by client_id
I added the c3 alias to the outer CLIENT_ID references and the outer table.