Joining onto a table that doesn't have ranges, but requires ranges - sql

Trying to find the best way to write this SQL statement.
I have a customer table that holds each customer's internal credit score, and another table with the definitions of those scores. I would like to join the two tables, but the second table doesn't offer an easy way to link them.
The customer's score is an integer between 1 and 999, and the definition table has these columns and rows:
Score Description
----- -----------
60    LOW
99    MED
999   HIGH
So basically if a customer has a score between 1 and 60 they are low, 61-99 they are med, and 100-999 they are high.
I can't really INNER JOIN these, because that would only match customers whose score is exactly 60, 99, or 999, and would exclude everyone with any other score.
I don't want to do a case statement with the static numbers, because our scores may change in the future and I don't want to have to update my initial query when/if they do. I also cannot create any tables or functions to do this- I need to create a SQL statement to do it for me.
EDIT:
A coworker said this would work, but it's a little crazy. I'm thinking there has to be a better way:
SELECT
    internal_credit_score,
    (
        SELECT
            credit_score_short_desc
        FROM
            cf_internal_credit_score
        WHERE
            internal_credit_score = (
                SELECT
                    max(credit.internal_credit_score)
                FROM
                    cf_internal_credit_score credit
                WHERE
                    cs.internal_credit_score <= credit.internal_credit_score
                    AND credit.internal_credit_score <= (
                        SELECT
                            min(credit2.internal_credit_score)
                        FROM
                            cf_internal_credit_score credit2
                        WHERE
                            cs.internal_credit_score <= credit2.internal_credit_score
                    )
            )
    )
FROM
    customer_statements cs

Try this: change your table so that it contains the range of the scores:
ScoreTable
-------------
LowScore int
HighScore int
ScoreDescription string
data values
LowScore HighScore ScoreDescription
-------- --------- ----------------
1 60 Low
61 99 Med
100 999 High
query:
SELECT
    .... , Score.ScoreDescription
FROM YourTable
INNER JOIN ScoreTable Score ON YourTable.Score >= Score.LowScore
    AND YourTable.Score <= Score.HighScore
WHERE ...

Assuming your table is named CreditTable, this is what you want:
select * from
(
select Description, Score
from CreditTable
where Score > 80 /*client's credit*/
order by Score
)
where rownum = 1
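Also, make sure your high score reference value is 1000, even though the client's highest possible score is 999, because the comparison is a strict >. Note that with a strict > a score exactly equal to a threshold (60 or 99) falls into the next band; if the thresholds are meant as inclusive upper bounds, as in the question, use >= instead, in which case 999 can stay as it is.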
Update
The above SQL gives you the credit record for a given value. If you want to join with, say, Clients table, you'd do something like this:
select
c.Name,
c.Score,
(select Description from
(select Description from CreditTable where Score > c.Score order by Score)
where rownum = 1)
from clients c
I know this is a sub-select that is executed for each returned row, but then again, CreditTable is ridiculously small, so there will be no significant performance loss from using the sub-select.
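To try the lookup logic without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no rownum, so LIMIT 1 stands in for it. The table names credit_def and customers and the sample customers are made up for illustration, and the comparison is >= on the assumption that each threshold is an inclusive upper bound, as described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Threshold table from the question; names are hypothetical.
    CREATE TABLE credit_def (score INTEGER, description TEXT);
    INSERT INTO credit_def VALUES (60, 'LOW'), (99, 'MED'), (999, 'HIGH');

    -- Made-up customers covering every band boundary.
    CREATE TABLE customers (name TEXT, score INTEGER);
    INSERT INTO customers VALUES ('ann', 1), ('bob', 60), ('cy', 61),
                                 ('dee', 99), ('eve', 100), ('fay', 999);
""")

# For each customer, pick the smallest threshold that is >= the score.
rows = conn.execute("""
    SELECT c.name,
           c.score,
           (SELECT d.description
              FROM credit_def d
             WHERE d.score >= c.score
             ORDER BY d.score
             LIMIT 1) AS description
      FROM customers c
     ORDER BY c.score
""").fetchall()

bands = {name: desc for name, _score, desc in rows}
print(bands)
```

Because the thresholds live in the table, changing a band only requires updating a row in credit_def; the query itself stays the same.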

You can use analytic functions to convert the data in your score description table to ranges (I assume that you meant that 100-999 should map to 'HIGH', not 99-999).
with x as (
  select 60 score, 'Low' description from dual union all
  select 99, 'Med' from dual union all
  select 999, 'High' from dual
)
select description,
       nvl(lag(score) over (order by score),0) + 1 low_range,
       score high_range
  from x;
DESC LOW_RANGE HIGH_RANGE
---- ---------- ----------
Low 1 60
Med 61 99
High 100 999
You can then join this to your CUSTOMER table with something like
SELECT c.*,
sd.*
FROM customer c,
(select description,
nvl(lag(score) over (order by score),0) + 1 low_range,
score high_range
from score_description) sd
WHERE c.credit_score BETWEEN sd.low_range AND sd.high_range
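The same idea can be checked locally with Python's sqlite3 module. This is only a sketch under a few assumptions: SQLite 3.25 or newer (for window functions), COALESCE in place of Oracle's nvl, and made-up customer rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE score_description (score INTEGER, description TEXT);
    INSERT INTO score_description VALUES (60, 'Low'), (99, 'Med'), (999, 'High');

    -- Hypothetical customers, one per interesting boundary.
    CREATE TABLE customer (id INTEGER, credit_score INTEGER);
    INSERT INTO customer VALUES (1, 1), (2, 60), (3, 61), (4, 100), (5, 999);
""")

# Turn each threshold into a (low_range, high_range) pair using LAG.
ranges = conn.execute("""
    SELECT description,
           COALESCE(LAG(score) OVER (ORDER BY score), 0) + 1 AS low_range,
           score AS high_range
      FROM score_description
     ORDER BY score
""").fetchall()

# Join customers to the derived ranges.
joined = conn.execute("""
    SELECT c.id, sd.description
      FROM customer c
      JOIN (SELECT description,
                   COALESCE(LAG(score) OVER (ORDER BY score), 0) + 1 AS low_range,
                   score AS high_range
              FROM score_description) sd
        ON c.credit_score BETWEEN sd.low_range AND sd.high_range
     ORDER BY c.id
""").fetchall()

print(ranges)
print(joined)
```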


how can I count some values for data in a table based on same key in another table in Bigquery?

I have one table like the one below. Each id is unique.
id      times_of_going_out
------  ------------------
fef666  2
S335gg  1
9a2c50  1
and another table like this one. In this second table the "id" is not unique; there are different "category_name" values for a single id.
id      category_name          city
------  ---------------------  ----
S335gg  Games & Game Supplies  tk
9a2c50  Telephone Companies    os
9a2c50  Recreation Centers     ky
fef666  Recreation Centers     ky
I want to find the difference between the destinations (category_name) of people who go out often (times_of_going_out > 5) and people who don't go out often (times_of_going_out <= 5).
** Both tables are a small sample of large tables.
 ・ Where do people who go out twice often go?
 ・ Where do people who go out 6 times often go?
Thank you
The expected result could be something like this:
less than 5                       more than 5
--------------------------------  --------------------------------
top ten "category_name" for uids  top ten "category_name" for uids
with "times_of_going_out" less    with "times_of_going_out" more
than 5 times                      than 5 times
Steps:
Combine the data and aggregate the total times_of_going_out per category.
Create the two groups that you need: less than or equal to 5, and more than 5. If you don't want 5 itself in the lower group, you can adjust the comparison in the code.
Rank the categories within each group using dense_rank(), based on the total times_of_going_out.
Filter the result so that it keeps only the top 10 ranks for each group.
with main as (
select
category_name,
sum(coalesce(times_of_going_out,0)) as total_time_per_category
from table1 as t1
left join table2 as t2
on t1.id = t2.id
group by 1
),
category as (
select
*,
if(total_time_per_category > 5, 'more than 5', 'less than equal to 5') as is_more_than_5_times
from main
),
ranking_ as (
select *,
case when
is_more_than_5_times = 'more than 5' then
dense_rank() over (partition by is_more_than_5_times order by total_time_per_category desc)
else NULL
end AS rank_more_than_5,
case when
is_more_than_5_times = 'less than equal to 5' then
dense_rank() over (partition by is_more_than_5_times order by total_time_per_category)
else NULL
end AS rank_less_than_equal_5
from category
)
select
is_more_than_5_times,
string_agg(category_name,',') as list
from ranking_
where rank_less_than_equal_5 <=10 or rank_more_than_5 <= 10
group by 1
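Note that the query above groups categories by their summed totals. If the goal is instead to classify each person by their own times_of_going_out and then list the most common categories per group (which is how the question reads), a variant might look like the sketch below. It runs on SQLite (3.25+ for window functions) through Python's sqlite3 module rather than BigQuery, the ids zz0001 and zz0002 are invented so that the "more than 5" group is not empty, and it assumes one row per (id, category_name) in the second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT PRIMARY KEY, times_of_going_out INTEGER);
    INSERT INTO table1 VALUES
        ('fef666', 2), ('S335gg', 1), ('9a2c50', 1),
        ('zz0001', 6), ('zz0002', 8);          -- last two rows are invented

    CREATE TABLE table2 (id TEXT, category_name TEXT, city TEXT);
    INSERT INTO table2 VALUES
        ('S335gg', 'Games & Game Supplies', 'tk'),
        ('9a2c50', 'Telephone Companies',   'os'),
        ('9a2c50', 'Recreation Centers',    'ky'),
        ('fef666', 'Recreation Centers',    'ky'),
        ('zz0001', 'Recreation Centers',    'ky'),   -- invented
        ('zz0002', 'Telephone Companies',   'os'),   -- invented
        ('zz0002', 'Recreation Centers',    'ky');   -- invented
""")

rows = conn.execute("""
    WITH labelled AS (          -- label every visit with the person's group
        SELECT CASE WHEN t1.times_of_going_out > 5
                    THEN 'more than 5' ELSE '5 or fewer' END AS grp,
               t2.category_name
          FROM table1 t1
          JOIN table2 t2 ON t2.id = t1.id
    ),
    counted AS (                -- how many people per group and category
        SELECT grp, category_name, COUNT(*) AS n_visitors
          FROM labelled
         GROUP BY grp, category_name
    ),
    ranked AS (                 -- rank categories inside each group
        SELECT grp, category_name, n_visitors,
               DENSE_RANK() OVER (PARTITION BY grp
                                  ORDER BY n_visitors DESC) AS rnk
          FROM counted
    )
    SELECT grp, category_name, n_visitors
      FROM ranked
     WHERE rnk <= 10
     ORDER BY grp, n_visitors DESC, category_name
""").fetchall()

for row in rows:
    print(row)
```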

SQL query to get average sum of other rows and store in current rows

I have a table like this, and I want a query that stores, for each row, the average of the other rows' points.
USER_ID POINTS
------------- --------
a14e43e4f851 134
1e86e5adedbf 40
3c66730edf69 149
32e24082f97b 67
b33e3100a7be 124
274ee414ad8f 85
bdeef25fc797 172
For example, for user_id = a14e43e4f851, the average of the other users' points should be
avg(40+149+67+124+85+172).
PS - the user's own points (134) are not included in the calculation for user a14e43e4f851.
Output should look like this --
USER_ID POINTS AVG
------------- ------- ------
a14e43e4f851 134 106 which is avg(40+149+67+124+85+172)
1e86e5adedbf 40 avg(134+149+67+124+85+172)
3c66730edf69 149 avg(134+40+67+124+85+172)
32e24082f97b 67 avg(134+40+149+124+85+172)
b33e3100a7be 124 ...
274ee414ad8f 85 ...
bdeef25fc797 172 ...
You could use a correlated subquery:
select t.*,
(select avg(t1.points) from mytable t1 where t1.user_id <> t.user_id) as average
from mytable t
An alternative uses window functions:
select t.*,
    (sum(points) over() - points) / nullif(count(*) over() - 1, 0) as average
from mytable t
Note: avg obviously conflicts with a language keyword, so I use average instead.
If you wanted an update statement:
update mytable t
set t.average = (
select avg(t1.points) from mytable t1 where t1.user_id <> t.user_id
)
However, I would not recommend actually storing this value; this is derived information, that can easily be computed on the fly whenever needed, using the first statement. If you are going to run the query often, you could create a view:
create view myview as
select t.*,
    (sum(points) over() - points) / nullif(count(*) over() - 1, 0) as average
from mytable t
I'm assuming user_id is the PK.
WITH q AS (SELECT sum(points) AS s, count(*) AS n FROM mytable)
UPDATE mytable SET average = (q.s-points)/(q.n-1) FROM q;
The idea is that:
the average of all users' scores is sum(score)/count(*)
the sum of all users' scores except this one equals the sum of all scores minus this user's score
the average of all users' scores except this one is therefore (sum(score)-score_for_this_user)/(count(*)-1)
The nice thing is that sum() and count() only have to be calculated once.
To handle the case where there is only one row in the table:
WITH q AS (SELECT sum(points) AS s, NULLIF(count(*),1) AS n FROM mytable)
UPDATE mytable SET average = (q.s-points)/(q.n-1) FROM q;
This makes the count NULL instead of 1, so the divisor is NULL and the updated average is NULL too, rather than raising a division-by-zero error.
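As a quick sanity check of the identity above, the following sketch (Python's sqlite3 module, using the sample data from the question) computes the "average of everyone else" both ways: with the totals trick and with the correlated subquery. The `* 1.0` is only there to force floating-point division in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (user_id TEXT PRIMARY KEY, points INTEGER);
    INSERT INTO mytable VALUES
        ('a14e43e4f851', 134), ('1e86e5adedbf', 40), ('3c66730edf69', 149),
        ('32e24082f97b', 67),  ('b33e3100a7be', 124), ('274ee414ad8f', 85),
        ('bdeef25fc797', 172);
""")

# (total - own points) / (row count - 1); NULLIF guards the single-row case.
by_totals = dict(conn.execute("""
    SELECT t.user_id,
           (q.s - t.points) * 1.0 / NULLIF(q.n - 1, 0) AS average
      FROM mytable t
     CROSS JOIN (SELECT SUM(points) AS s, COUNT(*) AS n FROM mytable) q
""").fetchall())

# Reference result: average over all other rows via a correlated subquery.
by_subquery = dict(conn.execute("""
    SELECT t.user_id,
           (SELECT AVG(t1.points)
              FROM mytable t1
             WHERE t1.user_id <> t.user_id) AS average
      FROM mytable t
""").fetchall())

for user_id, average in by_totals.items():
    print(user_id, round(average, 2), round(by_subquery[user_id], 2))
```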

SQL counting query

Sorry if this is a basic question.
Basically, I have a table that is as follows, below is a basic sample
store ProdCode result
----- -------- ------
13p   I10x     5
13p   I20x     7
13p   I30x     8
14a   K38z     23
17a   K38z     23
my data set has nearly 100,000 records.
What I'm trying to do is, for every store find the top 10 prodCode.
I am unsure of how to do this but what I tried was:
select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null
group by store,prod_code
order by count(prod_code) desc limit 10
This is giving me something completely different, and I'm unsure how to go about achieving my final result.
All help is appreciated.
Thanks
The expected output should be: for every store (s_code), display the top 10 prodcode values, so:
store prodcode result
----- -------- ------
1a    abc      5
1a    abd      4
2a    dgf      1
2a    ldk      6
(10 rows per store, then the next store code)
You can use the table twice in the FROM clause: once for the data, and once to count how many records for that store have a lower result value.
SELECT a.s_code, a.prod_code, count(*)
FROM top10_secondary a
LEFT OUTER JOIN top10_secondary b
ON a.s_code = b.s_code
AND b.result < a.result
GROUP BY a.s_code, a.prod_code
HAVING count(*) < 10
With this technique, though, you may get more than 10 records per store if the 10th result value occurs multiple times, because the rule is simply "include a record as long as fewer than 10 records have result values lower than mine".
It looks like in your case, "result" is a ranking, so they would not be duplicated per store.
This is a good case for Window functions.
SELECT
s_code,
prod_code,
prod_count
FROM
(
SELECT
s_code,
prod_code,
prod_count,
RANK() OVER (PARTITION BY s_code ORDER BY prod_Count DESC) as prod_rank
FROM
(SELECT s_code, prod_code, count(prod_code) prod_count FROM top10_secondary GROUP BY s_code, prod_code) t1
) t2
WHERE prod_rank <= 10
The innermost query gets the count of each product at each store. The middle query determines the rank of those products within each store based on that count. The outermost query then limits the results based on that rank.
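Here is a small runnable sketch of the same three-level query, using Python's sqlite3 module (SQLite 3.25+ for RANK()). The rows are invented, and TOP_N is set to 2 instead of 10 so the effect is visible on a handful of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample: one row per sale of a product in a store.
    CREATE TABLE top10_secondary (s_code TEXT, prod_code TEXT);
    INSERT INTO top10_secondary VALUES
        ('13p', 'I10x'), ('13p', 'I10x'), ('13p', 'I10x'),
        ('13p', 'I20x'), ('13p', 'I20x'),
        ('13p', 'I30x'),
        ('14a', 'K38z'), ('14a', 'K38z'),
        ('14a', 'K40z');
""")

TOP_N = 2  # the question asks for 10; 2 keeps the demo small

rows = conn.execute("""
    SELECT s_code, prod_code, prod_count
      FROM (SELECT s_code, prod_code, prod_count,
                   RANK() OVER (PARTITION BY s_code
                                ORDER BY prod_count DESC) AS prod_rank
              FROM (SELECT s_code, prod_code,
                           COUNT(prod_code) AS prod_count
                      FROM top10_secondary
                     GROUP BY s_code, prod_code) t1) t2
     WHERE prod_rank <= ?
     ORDER BY s_code, prod_rank
""", (TOP_N,)).fetchall()

for row in rows:
    print(row)
```

RANK() keeps ties, so a store can return more than TOP_N rows when several products share the last qualifying count; ROW_NUMBER() would cut off at exactly TOP_N instead.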

sql combining 2 queries with different order by group by

I have a query where I am counting the most frequent response in a database and ranking them by highest amount so using group by and order by.
The following shows how to do it for one:
select health, count(health) as count
from [Health].[Questionaire]
group by Health
order by count(Health) desc
which outputs the following:
Health Count
----------- -----
Very Good 6
Good 5
Poor 4
I would like to run a similar query on another column of the same table and get both results from a single SQL statement, like the following:
Health Count Diet Count
----------- ----- ----- -----
Very Good 6 Very Good 6
Good 5 Good 4
Poor 4 Poor 3
UPDATE!!
Hello, this is what the table looks like at the moment:
ID Diet Health
----------- ----- -------
101 Very Good Very Good
102 Poor Good
103 Poor Poor
I would like to run a similar query on another column of the same table, so that one SQL statement covers both queries, like the following:
Health Count Diet Count
----------- ----- ----- -----
Very Good 2 Very Good 1
Poor 1 Good 1
Good 0 Poor 1
Can anyone please help me out with this one?
Can provide further clarification if needed!
Here are 2 different ways of doing it; notice that I removed the redundant column:
Test data:
CREATE TABLE #t (Health varchar(20), Diet varchar(20))
INSERT #t values
('Very good', 'Very good'),
('Poor', 'Good'),
('Poor', 'Poor')
Query 1:
;WITH CTE1 as
(
SELECT Health, count(*) CountHealth
FROM #t --[Health].[Questionaire]
GROUP BY health
), CTE2 as
(
SELECT Diet, count(*) CountDiet
FROM #t --[Health].[Questionaire]
GROUP BY Diet
)
SELECT
coalesce(Health, Diet) Grade,
coalesce(CountHealth, 0) CountHealth,
coalesce(CountDiet, 0) CountDiet
FROM CTE1
FULL JOIN
CTE2
ON CTE1.Health = CTE2.Diet
ORDER BY CountHealth DESC
Result 1:
Grade CountHealth CountDiet
Poor 2 1
Very good 1 1
Good 0 1
Mixing the results like that is really not good practice, so here is a different solution
Query 2:
SELECT Health, count(*) Count, 'Health' Grade
FROM #t --[Health].[Questionaire]
GROUP BY health
UNION ALL
SELECT Diet, count(*) CountDiet, 'Diet'
FROM #t --[Health].[Questionaire]
GROUP BY Diet
ORDER BY Grade, Count DESC
Result 2:
Health Count Grade
Good 1 Diet
Poor 1 Diet
Very good 1 Diet
Poor 2 Health
Very good 1 Health
You need to join the table to itself, but (as your sample data shows) you also have to deal with gaps in the actual data for specific values.
If you have a table that has the range of health/diet values:
select
    v.value Status,
    -- distinct, because the two joins multiply each other's rows
    count(distinct a.id) healthCount,
    count(distinct b.id) DietCount
from health_diet_values v
left join Questionaire a on a.health = v.value
left join Questionaire b on b.diet = v.value
group by v.value
or if you don't have such a table, you need to generate the list of values manually and join from that:
select
    v.value Status,
    -- distinct, because the two joins multiply each other's rows
    count(distinct a.id) healthCount,
    count(distinct b.id) DietCount
from (select 'Very Good' value union all
      select 'Good' union all
      select 'Poor') v
left join Questionaire a on a.health = v.value
left join Questionaire b on b.diet = v.value
group by v.value
Both of these queries produce zeroes if there is no matching data for the value.
Note that in your desired output you have a redundant column - you repeat the value column. The above queries produce output that looks like:
Status HealthCount DietCount
-------------------------------
Very Good 2 1
Good 1 1
Poor 0 1
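To see why the two left joins need care, here is a sketch using Python's sqlite3 module and the three sample rows from the question. Joining the same table twice on different columns multiplies matching rows (for 'Poor', one health match times two diet matches), so this version counts distinct ids; the table layout is assumed from the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Questionaire (id INTEGER PRIMARY KEY, diet TEXT, health TEXT);
    INSERT INTO Questionaire VALUES
        (101, 'Very Good', 'Very Good'),
        (102, 'Poor',      'Good'),
        (103, 'Poor',      'Poor');
""")

rows = conn.execute("""
    SELECT v.value AS status,
           COUNT(DISTINCT a.id) AS health_count,
           COUNT(DISTINCT b.id) AS diet_count
      FROM (SELECT 'Very Good' AS value UNION ALL
            SELECT 'Good' UNION ALL
            SELECT 'Poor') v
      LEFT JOIN Questionaire a ON a.health = v.value
      LEFT JOIN Questionaire b ON b.diet = v.value
     GROUP BY v.value
     ORDER BY health_count DESC, v.value
""").fetchall()

for row in rows:
    print(row)
```

With a plain COUNT(a.id) the 'Poor' row would report a health count of 2 instead of 1, because the single health match is repeated once per diet match.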

SQL query to select min date

I have a SQL table called transaction where different types of transactions are stored, e.g. payment arrangements, sent letters and so on.
I have run this query:
SELECT TOP 6 Case_Ref as [Case Ref], TrancRefNO as [Tranc RefNO], Date_CCYYMMDD, LetterSent, Arr_Freq,
SMS_Sent_CCYYMMDD
From Transaction
Where (LEN(LetterSent) > 0 OR Arr_Freq > 0)
The table looks something like this
Case Ref Tranc RefNO Date_CCYYMMDD LetterSent Arr_Freq SMS_Sent_CCYYMMDD
-------- ----------- ---------- ---------- ---------- -----------------
15001 100 20140425 Stage1
15001 101 20140430 Stage2
15001 102 20140510 30
15001 104 20140610 30
15002 105 20140425 Stage1
15002 106 20140610 30
From the table, I can clearly see that a letter was sent on '20140430' for the case 15001 and the person started arrangements on '20140510'. And a letter was sent on '20140425' for the case 15002 and the person made arrangements on '20140610'.
I'm trying to create an Excel report using C# which will show the total number of cases that got arrangements after receiving a letter and the total number of cases that got arrangements after receiving an SMS.
I have tried
select MAX(ROW_NUMBER() OVER(ORDER BY o3.Date_CCYYMMDD ASC)), o3.
from
(
select o.TrancRefNO, o.Date_CCYYMMDD , sq.LetterSent
from Transaction o
join Transaction sq on sq.TrancRefNO= o.TrancRefNO
and sq.Date_CCYYMMDD <= o.Date_CCYYMMDD
where o.Arr_Freq >0
and len(sq.LetterSent ) > 0
) o2
join Transaction o3 on o3.TrancRefNO= o2.TrancRefNO
But it gives me an error:
Msg 4109, Level 15, State 1, Line 2
Windowed functions cannot be used in the context of another windowed function or aggregate.
P.s Title will need to be changed as I don't know what to call it.
SELECT * FROM table as t1
WHERE (LetterSent != '' OR SMS_SENT_CCYYMMDD != '')
AND (SELECT COUNT(*) FROM table AS t2
WHERE t1.case_ref = t2.case_ref
AND t1.DATE_CCYYMMDD < t2.DATE_CCYYMMDD
AND t2.Arr_freq > 0) > 0
My assumptions based on what I could glean from your post:
ARR_FREQ!='' indicates that some type of arrangement was made on the specified date
Since NULL is not shown, I'm assuming all empty values are ''. With NULL values you will have to use COALESCE
Hope this helps. I'm not sure about your second question (max date) in the comments. You would need to explain it a bit more.
SELECT TOP 1 ROW_NUMBER() OVER(ORDER BY Date_CCYYMMDD ASC), mytable.*
FROM mytable
or just
SELECT TOP 1 * FROM mytable
ORDER BY Date_CCYYMMDD ASC
but I guess you don't want the overall MIN date, but rather the MIN date per group
SELECT * FROM table where Date =
(SELECT MIN(Date) from table)
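A per-case version might look like the sketch below, which uses Python's sqlite3 module with the sample rows from the question. The table name transactions and the snake_case column names are stand-ins chosen for this example (TRANSACTION is a reserved word in several dialects), and empty cells are stored as '' or NULL as an assumption. For each case it takes the first arrangement date with MIN(...) GROUP BY, then the latest letter sent before that date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (case_ref INTEGER, tranc_ref_no INTEGER,
                               date_ccyymmdd TEXT, letter_sent TEXT,
                               arr_freq INTEGER);
    INSERT INTO transactions VALUES
        (15001, 100, '20140425', 'Stage1', NULL),
        (15001, 101, '20140430', 'Stage2', NULL),
        (15001, 102, '20140510', '',       30),
        (15001, 104, '20140610', '',       30),
        (15002, 105, '20140425', 'Stage1', NULL),
        (15002, 106, '20140610', '',       30);
""")

rows = conn.execute("""
    SELECT a.case_ref,
           MAX(l.date_ccyymmdd) AS last_letter,
           a.first_arrangement
      FROM (SELECT case_ref, MIN(date_ccyymmdd) AS first_arrangement
              FROM transactions
             WHERE arr_freq > 0
             GROUP BY case_ref) a            -- earliest arrangement per case
      JOIN transactions l
        ON l.case_ref = a.case_ref
       AND LENGTH(l.letter_sent) > 0         -- letter rows only
       AND l.date_ccyymmdd < a.first_arrangement
     GROUP BY a.case_ref, a.first_arrangement
     ORDER BY a.case_ref
""").fetchall()

for row in rows:
    print(row)
print("cases with an arrangement after a letter:", len(rows))
```

CCYYMMDD strings sort in date order, so plain string comparison is enough here; the same shape of query would work for the SMS column by swapping the letter condition.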