Extract different rates associated with one ID


I have a single loan table with user_id, loan_id, interest_rate, loan_date, and other columns that aren't relevant here.
How would I extract all the user_ids of those who took out at least two loans and got a better interest rate on a later one?
select member_id, Annual_interest_rate, count(*)
from (select member_id, Annual_interest_rate, count(*)
from loan_book
group by member_id
having count(*)>1)
group by member_id, Annual_interest_rate
It shows the rows from the subquery, but with a count of 1 instead of 2.
Does the subquery destroy the necessary info? Is there a way to write it as one query?
Sample table:
user  loan  air  date
0001  2345  2.6  09/03
0002  1346  2.6  03/05
0003  1118  3.7  05/03
0002  6756  1.2  05/08
0003  1286  3.2  01/10
0001  2222  3.0  09/11
The result would be:
user  loan  air  date
0002  6756  1.2  05/08
0003  1286  3.2  01/10
as those were the two loans that had better interest rates than their predecessors. If a user has more than two loans, every loan that is better than at least one of its predecessors should show.

Here is a query that might work, or at least the approach might give you some ideas.
SELECT LB2.*
FROM loan_book LB1 INNER JOIN loan_book LB2
ON LB1.user_id = LB2.user_id
AND LB1.loan_id != LB2.loan_id
AND LB1.loan_date < LB2.loan_date
AND LB1.interest_rate > LB2.interest_rate
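Joining the table with itself puts two loans by the same user in each row, so you can compare them directly and do any further grouping on the result. Note that a loan which beats several earlier loans appears once per earlier loan; add DISTINCT if you want each loan only once. Hope this helps.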
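To try the self-join idea on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The column names follow the answer's query; the ISO dates are an assumption (the question's MM/YY values are read as month/year), and DISTINCT is added so a loan that beats several earlier loans is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loan_book ("
    "user_id TEXT, loan_id INTEGER, interest_rate REAL, loan_date TEXT)"
)

# Sample rows from the question. Dates are assumed to be month/year and are
# stored as ISO strings so that plain string comparison orders them correctly.
sample_rows = [
    ("0001", 2345, 2.6, "2003-09-01"),
    ("0002", 1346, 2.6, "2005-03-01"),
    ("0003", 1118, 3.7, "2003-05-01"),
    ("0002", 6756, 1.2, "2008-05-01"),
    ("0003", 1286, 3.2, "2010-01-01"),
    ("0001", 2222, 3.0, "2011-09-01"),
]
conn.executemany("INSERT INTO loan_book VALUES (?, ?, ?, ?)", sample_rows)

# LB1 is the earlier loan, LB2 the later loan with a lower rate.
better_loans = conn.execute(
    """
    SELECT DISTINCT LB2.user_id, LB2.loan_id, LB2.interest_rate, LB2.loan_date
    FROM loan_book LB1
    INNER JOIN loan_book LB2
            ON LB1.user_id = LB2.user_id
           AND LB1.loan_date < LB2.loan_date
           AND LB1.interest_rate > LB2.interest_rate
    ORDER BY LB2.user_id
    """
).fetchall()

for row in better_loans:
    print(row)
```

On this data the query lists loans 6756 and 1286, matching the two users the question expects.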

Related

Duplicate rows because 1 column has multiple distinct values

I'm running a SELECT query to get data across multiple tables in the same server instance. However I've just noticed that the rows pulled on some data get duplicated because the main table I'm pulling from has a few different values in one of the columns. Here's the query:
SELECT DISTINCT BIF030.C_ACCOUNT AS ACCOUNTNUMBER,
BIF003.C_ACCOUNTTYPE AS ACCOUNTTYPECODE,
CON013.C_DESCRIPTION AS ACCOUNTTYPE,
BIF003.C_DIVISION AS ZONE_DIVISIONCODE,
CON028.C_DESCRIPTION AS ZONE_DIVISION,
BIF030.C_METER as METERNUMBER,
BIF005.C_METERCUSTOM1 AS REGISTERNUMBER,
CONVERT(DECIMAL(20,2), BIF030.N_CONSUMP) AS CONSUMPTION,
CON007.C_DESCRIPTION AS UNITS,
BIF030.T_READDATE AS READINGDATE,
MONTH(BIF030.T_READDATE) AS READINGMONTH,
DAY(BIF030.T_READDATE) AS READINGDAY,
YEAR(BIF030.T_READDATE) AS READINGYEAR,
BIF030.I_DAYS AS READINGDAYSCOUNT
FROM ADVANCED.BIF030
LEFT JOIN ADVANCED.CON007 ON CON007.C_UNITS=BIF030.C_UNITS
LEFT JOIN ADVANCED.BIF005 ON BIF005.C_METER=BIF030.C_METER
LEFT JOIN ADVANCED.BIF003 ON BIF003.C_ACCOUNT=BIF030.C_ACCOUNT
LEFT JOIN ADVANCED.CON013 ON CON013.C_ACCOUNTTYPE=BIF003.C_ACCOUNTTYPE
LEFT JOIN ADVANCED.CON028 ON CON028.C_DIVISION=BIF003.C_DIVISION
WHERE T_READDATE > '01-01-2014'
ORDER BY ACCOUNTNUMBER, READINGDATE ASC
I know SELECT DISTINCT is frowned upon, but I get even more rows without it. Here's a sample of what the data looks like when pulled:
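| ACCOUNTNUMBER | ACCOUNTTYPECODE | ACCOUNTTYPE | ZONE_DIVISIONCODE | ZONE_DIVISION | METERNUMBER | REGISTERNUMBER | CONSUMPTION | UNITS | READINGDATE | READINGMONTH | READINGDAY | READINGYEAR | READINGDAYSCOUNT |
|---------------|-----------------|-------------|-------------------|---------------|-------------|----------------|-------------|-------|-------------|--------------|------------|-------------|------------------|
| 1234567 | SP | ACCOUNT TYPE 1 | 00 | 00-NO ZONE | 123456789 | 987654321 | 3.00 | Thousands of Gallons | 2014-01-16 00:00:00.00 | 1 | 16 | 2014 | 30 |
| 1234567 | MF | ACCOUNT TYPE 2 | 02 | 02-GRAVITY | 123456789 | 987654321 | 3.00 | Thousands of Gallons | 2014-01-16 00:00:00.00 | 1 | 16 | 2014 | 30 |
| 1234567 | SR | ACCOUNT TYPE 3 | 02 | 02-GRAVITY | 123456789 | 987654321 | 3.00 | Thousands of Gallons | 2014-01-16 00:00:00.00 | 1 | 16 | 2014 | 30 |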
I also know the column that is messing this up is the "AccountTypeCode" because other accounts that don't have multiple codes associated with the "AccountNumber" only show 1 set of rows. So this one specifically (and probably others) is tripling the amount of rows pulled when it should only pull one for each "ReadingDate".
Also if anyone knows a good way to optimize the query I'd be happy to learn. I know just enough SQL to be dangerous, but not enough to figure this out. Thanks.

OK, good news, and I want to add this in case it helps anyone else in the future. Since ACCOUNTTYPECODE and ZONE_DIVISIONCODE come from the BIF003 table, I needed one more condition in the WHERE clause. This is what fixed it for me:
AND BIF030.C_CUSTOMER = BIF003.C_CUSTOMER
The C_CUSTOMER column exists in both BIF003 and BIF030, and its values differed between the rows for the same account, which led to the separate ACCOUNTTYPECODE results. Matching on C_CUSTOMER as well removes the extra rows.
Thanks, everyone, for kick-starting my brain on this one.
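The effect can be reproduced on a tiny scale. The sketch below uses Python's sqlite3 module with heavily simplified, hypothetical versions of BIF030 and BIF003 (only three columns each, invented customer codes C1 to C3). It places the extra condition in the ON clause rather than in WHERE, which is an assumption on my part: a WHERE filter on the right-hand table drops unmatched rows and so behaves like an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified stand-ins for the real tables.
    CREATE TABLE BIF030 (C_ACCOUNT TEXT, C_CUSTOMER TEXT, T_READDATE TEXT);
    CREATE TABLE BIF003 (C_ACCOUNT TEXT, C_CUSTOMER TEXT, C_ACCOUNTTYPE TEXT);

    -- One meter reading for the account.
    INSERT INTO BIF030 VALUES ('1234567', 'C1', '2014-01-16');

    -- The same account number appears under three different customers.
    INSERT INTO BIF003 VALUES ('1234567', 'C1', 'SP');
    INSERT INTO BIF003 VALUES ('1234567', 'C2', 'MF');
    INSERT INTO BIF003 VALUES ('1234567', 'C3', 'SR');
    """
)

# Joining on the account alone matches all three BIF003 rows.
account_only = conn.execute(
    """
    SELECT BIF030.C_ACCOUNT, BIF003.C_ACCOUNTTYPE
    FROM BIF030
    LEFT JOIN BIF003 ON BIF003.C_ACCOUNT = BIF030.C_ACCOUNT
    ORDER BY BIF003.C_ACCOUNTTYPE
    """
).fetchall()

# Adding the customer to the join condition leaves a single match.
account_and_customer = conn.execute(
    """
    SELECT BIF030.C_ACCOUNT, BIF003.C_ACCOUNTTYPE
    FROM BIF030
    LEFT JOIN BIF003 ON BIF003.C_ACCOUNT = BIF030.C_ACCOUNT
                    AND BIF003.C_CUSTOMER = BIF030.C_CUSTOMER
    """
).fetchall()

print(len(account_only), account_only)
print(len(account_and_customer), account_and_customer)
```

The first query returns the reading three times, once per account type; the second returns it once.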

How to get the set size, first and last record in a db2 ordered set with one call

I have a very big transaction table on DB2 v11, and I need to query a subset of it as efficiently as possible. All I need is the total count of the set (not known in advance; it's based on criteria, let's say one day), the ID of the first record, and the ID of the last record.
The old code fetched the entire table and then used only the first record ID, the last record ID, and the size, ignoring the rest. Now this code is timing out. It's a complex query with several joins.
Is there a way to fetch just the size of the set, the first record, and the last record in one SELECT query?
I've read that reordering the list in order to fetch the first record (fetching with DESC, then changing to ASC) is not efficient.
sample table 1 TRANSACTION_RECORDS:
tdID TIMESTAMP name
-------------------------------
123 2020-03-31 john
234 2020-03-31 dan
456 2020-03-01 Eve
675 2020-04-01 joy
sample table 2 TRANSACTION_TYPE:
invoiceId tdID account
------------------------------
897 123 abc
898 123 def
877 234 mnc
899 456 opp
Sample query
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
group by tr.tdID
order by TR.tdID ASC
This results in multiple rows (but it requires the GROUP BY):
123,123
234,234
456,456
What I want is:
123,456

As I mentioned in the comments, this query needs neither GROUP BY nor ORDER BY. Just do:
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
It should work as expected.
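The question also asks for the size of the set, which a COUNT in the same select list can supply. Here is a rough sketch in Python's sqlite3 module (SQLite rather than DB2, so treat it as an illustration of the aggregate only). The column ts is a hypothetical stand-in for TIMESTAMP, and tdID is used as the record ID. With the sample rows as posted, only tdID 123 and 234 fall on 2020-03-31.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TRANSACTION_RECORDS (tdID INTEGER, ts TEXT, name TEXT);
    CREATE TABLE TRANSACTION_TYPE (invoiceId INTEGER, tdID INTEGER, account TEXT);

    INSERT INTO TRANSACTION_RECORDS VALUES
        (123, '2020-03-31', 'john'),
        (234, '2020-03-31', 'dan'),
        (456, '2020-03-01', 'Eve'),
        (675, '2020-04-01', 'joy');

    INSERT INTO TRANSACTION_TYPE VALUES
        (897, 123, 'abc'),
        (898, 123, 'def'),
        (877, 234, 'mnc'),
        (899, 456, 'opp');
    """
)

# Without GROUP BY the aggregates run over the whole filtered set,
# so the query returns exactly one row.
first_id, last_id, record_count, joined_rows = conn.execute(
    """
    SELECT MIN(tr.tdID),
           MAX(tr.tdID),
           COUNT(DISTINCT tr.tdID),  -- records, ignoring join duplicates
           COUNT(*)                  -- rows produced by the join
    FROM TRANSACTION_RECORDS tr
    JOIN TRANSACTION_TYPE tt ON tr.tdID = tt.tdID
    WHERE date(tr.ts) = '2020-03-31'
    """
).fetchone()

print(first_id, last_id, record_count, joined_rows)
```

COUNT(DISTINCT ...) and COUNT(*) differ here because record 123 joins to two type rows; which one is the "size" depends on what the caller counts.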

Given account contributions: How to sum contributions per individual? In relation to a threshold?

Given contribution amounts per account, how do I 1) SUM the contributions made by each individual, and 2) find the number of people who have contributed less than, exactly, or more than $5,000?
Right now I have a database table "[dbo].[FakeRRSPs]" with these columns:
Account_ID | Personal_ID | Contributions
My current code successfully gives the number of unique individuals:
select distinct(personal_id), sum(contributions), count(account_id),
(select count(distinct(personal_id))
from [dbo].[FakeRRSPs]
)
from [dbo].[FakeRRSPs]
where personal_id is not null
group by personal_id
For example, there are 2M people holding 2.5M accounts.
Issues I face:
1) How do I count the number of individuals who contribute below, at, or above the $5K threshold (after SUM(contribution) per person)?
2) Some people contribute $10K in total, for example $5K in each of two accounts. Both accounts are picked up, when I'm hoping to capture only the SUM(Contribution) for this person.
I hope this is clear enough - it certainly isn't to me! Thanks everyone.

SQL Fiddle
MS SQL Server 2017 Schema Setup:
create table Contribution (PID int,AID int,C int)
insert into Contribution(PID,AID,C)VALUES(235,1245,1200)
insert into Contribution(PID,AID,C)VALUES(256,1246,0)
insert into Contribution(PID,AID,C)VALUES(256,1247,3500)
insert into Contribution(PID,AID,C)VALUES(256,1248,10000)
insert into Contribution(PID,AID,C)VALUES(421,1249,0)
Query 1:
select * from (select PID,sum(C) AS SC from Contribution
group by PID) as test
where test.SC<=5000
Results:
| PID | SC   |
|-----|------|
| 235 | 1200 |
| 421 | 0    |
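The query above lists the people at or below the threshold. To get the three head counts the question asks for (below, at, above), one possible extension is conditional aggregation over the same per-person sums. This is a sketch in Python's sqlite3 module using the fiddle's sample rows; the column aliases are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Contribution (PID INTEGER, AID INTEGER, C INTEGER);
    INSERT INTO Contribution (PID, AID, C) VALUES (235, 1245, 1200);
    INSERT INTO Contribution (PID, AID, C) VALUES (256, 1246, 0);
    INSERT INTO Contribution (PID, AID, C) VALUES (256, 1247, 3500);
    INSERT INTO Contribution (PID, AID, C) VALUES (256, 1248, 10000);
    INSERT INTO Contribution (PID, AID, C) VALUES (421, 1249, 0);
    """
)

# The inner query sums per person (one row per PID, however many accounts);
# the outer query counts how many of those totals fall in each band.
below, at_threshold, above = conn.execute(
    """
    SELECT SUM(CASE WHEN total < 5000 THEN 1 ELSE 0 END) AS below,
           SUM(CASE WHEN total = 5000 THEN 1 ELSE 0 END) AS at_threshold,
           SUM(CASE WHEN total > 5000 THEN 1 ELSE 0 END) AS above
    FROM (SELECT PID, SUM(C) AS total
          FROM Contribution
          GROUP BY PID) AS per_person
    """
).fetchone()

print(below, at_threshold, above)
```

Because the grouping happens before the counting, a person with several accounts is counted once, by their combined total.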

Create column based on grouping other values

I have difficulties formulating my issue.
I have a view that returns the results below. I need to add a column to the view that pairs up round-trip flights by giving both legs the same number.
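| Flt_No | From_Airport | To_Airport | Dep_Date       | RequiredResult |
|--------|--------------|------------|----------------|----------------|
| 124    | LCA          | CDG        | 10/19/14 5:00  | 1              |
| 125    | CDG          | LCA        | 10/19/14 10:00 | 1              |
| 197    | LCA          | BCN        | 10/4/12 5:00   | 2              |
| 198    | BCN          | LCA        | 10/4/12 11:00  | 2              |
| 501    | LCA          | HER        | 15/8/12 12:05  | 3              |
| 502    | HER          | LCA        | 15/8/12 15:15  | 3              |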
I.e. flight 124 is going from Larnaca to CDG, and flight 125 is going back from CDG to Larnaca - they both have to have the same identifier.
The two legs of a round trip will always have consecutive flight numbers.
I have a bunch of conditions which I won't write now.
Omitting hours is not an option, they're important.
I was thinking dense_rank() but I don't know how to create one identifier for 2 flights with different numbers, please help.

If your data is similar to the sample data posted, then the following query should give the required result:
SELECT *,
DENSE_RANK() OVER (ORDER BY CASE
WHEN From_Airport < To_Airport THEN From_Airport
ELSE To_Airport
END)
FROM mytable

Join conditions are not limited to simple equality. Assuming {Flight No, Departure, Destination} is unique on any one day, then a self join should do it:
select whatever
from flights outbound
inner join flights inbound on outbound.flt_no + 1 = inbound.flt_no
and cast(outbound.dep_date as date)
= cast(inbound.dep_date as date)
and outbound.From_Airport = inbound.To_Airport
and outbound.To_Airport = inbound.From_Airport
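As a quick check of the self-join on the question's sample flights, here is a sketch using Python's sqlite3 module. The table name flights comes from the answer; the ISO timestamps are an assumption, since the question mixes month/day and day/month formats, and SQLite's date() is used in place of the CAST.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights ("
    "flt_no INTEGER, from_airport TEXT, to_airport TEXT, dep_date TEXT)"
)

# Sample flights from the question, with dates rewritten as ISO strings
# (an assumed reading of the original mixed date formats).
sample_flights = [
    (124, "LCA", "CDG", "2014-10-19 05:00"),
    (125, "CDG", "LCA", "2014-10-19 10:00"),
    (197, "LCA", "BCN", "2012-10-04 05:00"),
    (198, "BCN", "LCA", "2012-10-04 11:00"),
    (501, "LCA", "HER", "2012-08-15 12:05"),
    (502, "HER", "LCA", "2012-08-15 15:15"),
]
conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", sample_flights)

# Pair each outbound leg with the return leg: next flight number,
# same calendar day, airports swapped.
round_trips = conn.execute(
    """
    SELECT outbound.flt_no, inbound.flt_no
    FROM flights AS outbound
    INNER JOIN flights AS inbound
            ON outbound.flt_no + 1 = inbound.flt_no
           AND date(outbound.dep_date) = date(inbound.dep_date)
           AND outbound.from_airport = inbound.to_airport
           AND outbound.to_airport = inbound.from_airport
    ORDER BY outbound.flt_no
    """
).fetchall()

print(round_trips)
```

Each result row holds the two flight numbers of one round trip, so either number (or the row's position) can serve as the shared identifier.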

SQL hourly log , show all matching rows that have a value below threshold for n hours

I have a simple SQL log table (named market_history in SQLite) for US markets. It looks something like this:
Sample table (market_history)
id datetime market percent
1 9/5/2014 7:50 ARIZONA 50.0
2 9/5/2014 7:50 ATLANTA 97.4
3 9/5/2014 7:50 AUSTIN 78.8
4 9/5/2014 7:50 BOSTON 90.9
6 9/5/2014 7:50 CHARLOTTE 100.0
7 9/5/2014 7:50 CHICAGO 90.3
This table is an hourly snapshot of network capacity in various systems in each market. I would like to set up an alert system that sends an alert email if any one market is below a threshold percent (say 50) for more than 2 consecutive hours (one row is recorded per market every hour). So the query should show me a unique list of market names where the percent is < 50.0 for more than the last 2 consecutive entries.
Here's the SQL I'm trying, but it's not working:
Sample SQL (not working):
SELECT
mh.datetime, mh.market, mh.percent
FROM markets_history mh
WHERE
(SELECT mh1.precent FROM markets_history mh1 WHERE mh1.datetime BETWEEN "2015-03-23 00:00:00" AND "2015-03-23 00:59:59" AND mh.market=mh1.market ) < 50 AND (SELECT mh2.precent FROM markets_history mh2 WHERE mh2.datetime BETWEEN "2015-03-23 01:00:00" AND "2015-03-23 01:59:59" AND mh.market=mh2.market ) < 50
ORDER by mh.datetime
I know I'm missing something. Any suggestions?

If the time windows are fixed and reliable, just check that the largest value in the window is below the threshold. It doesn't matter how far back you look, so the same query works if you need to extend this to more than two hours.
select market
from markets_history mh
where mh.datetime between <last_two_hours> and <now>
group by mh.market
having max(percent) < 50.0
-- and count(*) = 2 /* if you need to be sure of two ... */
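A small sketch of the MAX-in-window check, run through Python's sqlite3 module. The rows are invented for illustration, the table is named market_history as in the question's description, and the window bounds are passed as parameters in place of the placeholders above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE market_history ("
    "id INTEGER PRIMARY KEY, datetime TEXT, market TEXT, percent REAL)"
)

# Invented hourly snapshots: ARIZONA stays low, BOSTON recovers.
snapshots = [
    (1, "2015-03-23 00:50:00", "ARIZONA", 45.0),
    (2, "2015-03-23 00:50:00", "BOSTON", 45.0),
    (3, "2015-03-23 01:50:00", "ARIZONA", 40.0),
    (4, "2015-03-23 01:50:00", "BOSTON", 90.0),
]
conn.executemany("INSERT INTO market_history VALUES (?, ?, ?, ?)", snapshots)

window_start = "2015-03-23 00:00:00"
window_end = "2015-03-23 01:59:59"

# A market qualifies only if even its highest reading in the window is low,
# and only if both hourly readings are actually present.
low_markets = conn.execute(
    """
    SELECT market
    FROM market_history
    WHERE datetime BETWEEN ? AND ?
    GROUP BY market
    HAVING MAX(percent) < 50.0
       AND COUNT(*) = 2
    ORDER BY market
    """,
    (window_start, window_end),
).fetchall()

print(low_markets)
```

Only ARIZONA is returned, since BOSTON's second reading lifts its maximum above the threshold.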

Here is an approach that should work in SQLite. Find the last good id (if any) in each market, where "good" means at or above the threshold. Then count the number of rows with an id larger than that id.
select mh.market,
       sum(case when lastgood.market is null then 1
                when lastgood.maxid < mh.id then 1
                else 0
           end) as NumInRow
from market_history mh left join
     (select market, max(id) as maxid
      from market_history
      where percent >= 50.0
      group by market
     ) as lastgood
     on lastgood.market = mh.market
group by mh.market;
This query is a little bit complicated because it needs to take into account the possibility of there not being any good id. If that is the case, then all rows for the market count.
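To see the last-good-id idea produce numbers, here is a sketch in Python's sqlite3 module with invented rows. It assumes "good" means a reading at or above 50.0, that ids increase with time, and that the join is on market alone with a GROUP BY per market, so treat the exact query shape as my reading of the approach rather than a definitive version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE market_history ("
    "id INTEGER PRIMARY KEY, datetime TEXT, market TEXT, percent REAL)"
)

# Invented snapshots; ids increase with time.
snapshots = [
    (1, "2015-03-23 00:50:00", "ATLANTA", 60.0),  # last good ATLANTA row
    (2, "2015-03-23 00:50:00", "BOSTON", 40.0),   # BOSTON is never good
    (3, "2015-03-23 00:50:00", "CHICAGO", 40.0),
    (4, "2015-03-23 01:50:00", "ATLANTA", 40.0),
    (5, "2015-03-23 01:50:00", "BOSTON", 45.0),
    (6, "2015-03-23 01:50:00", "CHICAGO", 60.0),  # CHICAGO recovers
    (7, "2015-03-23 02:50:00", "ATLANTA", 30.0),
    (8, "2015-03-23 03:50:00", "ATLANTA", 20.0),
]
conn.executemany("INSERT INTO market_history VALUES (?, ?, ?, ?)", snapshots)

# Count, per market, the rows recorded after its last good reading.
# A market with no good reading at all has every row counted.
low_streaks = conn.execute(
    """
    SELECT mh.market,
           SUM(CASE WHEN lastgood.market IS NULL THEN 1
                    WHEN lastgood.maxid < mh.id THEN 1
                    ELSE 0
               END) AS NumInRow
    FROM market_history mh
    LEFT JOIN (SELECT market, MAX(id) AS maxid
               FROM market_history
               WHERE percent >= 50.0
               GROUP BY market) AS lastgood
           ON lastgood.market = mh.market
    GROUP BY mh.market
    ORDER BY mh.market
    """
).fetchall()

# Markets low for more than 2 consecutive entries would trigger the alert.
alerts = [market for market, streak in low_streaks if streak > 2]

print(low_streaks)
print(alerts)
```

ATLANTA has three low readings since its last good one, so it is the only market that crosses the "more than 2" line here.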
