SQL select join the row with max (arithmetic(value1, value2)) - sql

I am trying to build a trade system where people can make offers on the items they want. There are two currencies in the system, gold and silver, with 100 silver = 1 gold. Note that people can offer the same price as others, so the highest offer price may be duplicated.
The table structure looks roughly like this:
Trade table
    ID
TradeOffer table
    ID
    UserID
    TradeID references Trade(ID)
    GoldOffer
    SilverOffer
I want to show the user a list of trades sorted by highest offer price whenever they run a search with constraints.
The ideal output would look like this:
Trade.ID TradeOffer.ID HighestGoldOffer HighestSilverOffer UserID
where HighestGoldOffer and HighestSilverOffer are the GoldOffer and SilverOffer values of the offer with the highest (GoldOffer * 100 + SilverOffer), and UserID is the user who made that offer.
I know I can run two separate queries: one to retrieve all the trades that satisfy the constraints, and then another, using the extracted IDs, to get the highest offer for each. But I am a perfectionist, so I would prefer to do it in one SQL query instead of two.
I could just select all offers where (GoldOffer * 100 + SilverOffer) = MAX(GoldOffer * 100 + SilverOffer), but this could return the same trade more than once if several people offered the same price. Also, a trade may not have any offers yet, in which case GoldOffer and SilverOffer will be empty; I would still like to show such a trade, as having no offer.
I hope I have made myself clear. Thanks for any help.

Model and test data
CREATE TABLE Trade (ID INT)
CREATE TABLE TradeOffer
(
ID INT,
UserID INT,
TradeID INT,
GoldOffer INT,
SilverOffer INT
)
INSERT Trade VALUES (1), (2), (3)
INSERT TradeOffer VALUES
(1, 1, 1, 10, 15),
(2, 2, 1, 11, 15),
(3, 1, 2, 10, 16),
(4, 2, 2, 10, 16)
Query
SELECT
    [TradeID],
    [TradeOfferID],
    [HighestGoldOffer],
    [HighestSilverOffer],
    [UserID]
FROM (
    SELECT
        t.ID AS [TradeID],
        tOffer.ID AS [TradeOfferID],
        tOffer.GoldOffer AS [HighestGoldOffer],
        tOffer.SilverOffer AS [HighestSilverOffer],
        tOffer.[UserID],
        RANK() OVER (
            PARTITION BY t.ID
            ORDER BY (([GoldOffer] * 100) + [SilverOffer]) DESC
        ) AS [Rank]
    FROM Trade t
    LEFT JOIN TradeOffer tOffer
        ON tOffer.TradeID = t.ID
) x
WHERE [Rank] = 1
Result
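One thing to watch with the query above: RANK() gives tied rows the same rank, so on the test data both offers for trade 2 (offers 3 and 4, each worth 1016 silver) come back with rank 1, and the trade appears twice. If exactly one row per trade is wanted, one option is ROW_NUMBER() with an explicit tie-breaker. The sketch below compares the two on the same test data. It runs in Python against an in-memory SQLite database as a stand-in for SQL Server, and breaking ties on the lowest offer ID is an assumption of this sketch, not something the question specifies.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trade (ID INT);
    CREATE TABLE TradeOffer (ID INT, UserID INT, TradeID INT,
                             GoldOffer INT, SilverOffer INT);
    INSERT INTO Trade VALUES (1), (2), (3);
    INSERT INTO TradeOffer VALUES
        (1, 1, 1, 10, 15),
        (2, 2, 1, 11, 15),
        (3, 1, 2, 10, 16),
        (4, 2, 2, 10, 16);
""")


def top_offers(window_function, tie_breaker=""):
    """Return the rank-1 offer rows per trade for the given window function."""
    sql = f"""
        SELECT TradeID, TradeOfferID, HighestGoldOffer, HighestSilverOffer, UserID
        FROM (
            SELECT t.ID          AS TradeID,
                   o.ID          AS TradeOfferID,
                   o.GoldOffer   AS HighestGoldOffer,
                   o.SilverOffer AS HighestSilverOffer,
                   o.UserID,
                   {window_function}() OVER (
                       PARTITION BY t.ID
                       ORDER BY (o.GoldOffer * 100 + o.SilverOffer) DESC{tie_breaker}
                   ) AS rn
            FROM Trade t
            LEFT JOIN TradeOffer o ON o.TradeID = t.ID
        ) x
        WHERE rn = 1
        ORDER BY TradeID, TradeOfferID
    """
    return conn.execute(sql).fetchall()


with_ties = top_offers("RANK")                      # ties kept: trade 2 twice
one_per_trade = top_offers("ROW_NUMBER", ", o.ID")  # lowest offer ID wins a tie

for row in with_ties:
    print("RANK      ", row)
for row in one_per_trade:
    print("ROW_NUMBER", row)
```

In both variants trade 3, which has no offers, still appears, with NULL (None in Python) in the offer columns, because of the LEFT JOIN.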

Related

SQL split volume into buckets with no repetition of bucket ID

I'm trying to solve a task in MS SQL that may be simple but is beyond me. I have two tables: the first holds each company name and the total volume of bottles it used; the second holds a list of buckets, each with a unique ID and a bottle capacity. My task is to assign each company the number of buckets needed to cover its whole volume of bottles, without using the same bucket twice (no bucket ID may be assigned to two or more companies).
Is anyone able to help me with that?
Thank you!
Given the additional simplifications that every bottle is the same and every bucket has the same capacity (per your comment), this will do the trick:
-- demo schema
create table companies (cname char, bottlesUsed int);
create table buckets (id int, capacity int);
-- demo data
insert companies values ('a', 41), ('b', 2), ('c', 5), ('d', 50);
insert buckets select top 20 row_number() over (order by object_id), 20 from sys.objects;
with
bucketnums as
(
    select i = row_number() over (order by id),
           id
    from buckets
),
bucketRanges as
(
    select cname,
           firstBucketNum = 1 + lag(lastBucketNum, 1, 0) over (order by cname),
           lastBucketNum
    from ( -- running total of bucket count required by each customer
        select cname,
               lastBucketNum = sum(ceiling(bottlesUsed * 1.0 / 20))
                               over (order by cname rows unbounded preceding)
        from companies
    ) t
)
select companyName = br.cname,
       allocatedBucketId = bn.id
from bucketRanges br
join bucketnums bn on bn.i between firstBucketNum and lastBucketNum;
If the bottle size or bucket capacity is variable, this problem becomes much more... "interesting" :)
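For the fixed-capacity case, the arithmetic behind the query (buckets needed per company, a running total, then contiguous ranges of bucket numbers) can be traced in a few lines of Python. This is only a sketch of the same idea outside the database; the function name `allocate` is made up for illustration, and the data mirrors the demo rows above.

```python
import math
from itertools import accumulate

BUCKET_CAPACITY = 20  # every bucket holds 20 bottles, as in the demo data

companies = [("a", 41), ("b", 2), ("c", 5), ("d", 50)]
bucket_ids = list(range(1, 21))  # 20 buckets with IDs 1..20


def allocate(companies, bucket_ids, capacity):
    """Give each company a contiguous, non-overlapping run of bucket IDs."""
    ordered = sorted(companies)  # same order as ORDER BY cname
    needed = [math.ceil(bottles / capacity) for _, bottles in ordered]
    last = list(accumulate(needed))           # running total = lastBucketNum
    first = [1] + [n + 1 for n in last[:-1]]  # previous last + 1 = firstBucketNum
    return {
        name: bucket_ids[lo - 1:hi]           # bucket numbers lo..hi, 1-based
        for (name, _), lo, hi in zip(ordered, first, last)
    }


allocation = allocate(companies, bucket_ids, BUCKET_CAPACITY)
print(allocation)
```

Like the join in the query, this sketch does not complain when there are too few buckets; a company near the end would simply receive fewer than it needs.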

How to specify a linear programming-like constraint (i.e. max number of rows for a dimension's attributes) in SQL server?

I'm looking to assign unique person IDs to marketing programs, but I need to optimize based on each person's probability score (some people can be sent to multiple programs, some to only one) and to respect constraints such as the budgeted mail quantity for each program.
I'm using SQL Server and can put IDs into their highest-scoring program using row_number() over(partition by person_ID order by Prob_Score). What I need to return is a table where each ID is assigned to a program, but I'm not sure how to add the maximum mail quantity constraint specific to each program. I've looked into the CHECK() constraint functionality, but I'm not sure whether it's applicable.
create table test_marketing_table(
PersonID int,
MarketingProgram varchar(255),
ProbabilityScore real
);
insert into test_marketing_table (PersonID, MarketingProgram, ProbabilityScore)
values (1, 'A', 0.07)
,(1, 'B', 0.06)
,(1, 'C', 0.02)
,(2, 'A', 0.02)
,(3, 'B', 0.08)
,(3, 'C', 0.13)
,(4, 'C', 0.02)
,(5, 'A', 0.04)
,(6, 'B', 0.045)
,(6, 'C', 0.09);
--this section assigns everyone to their highest scoring program,
--but this isn't necessarily what I need
with x
as
(
select *, row_number()over(partition by PersonID order by ProbabilityScore desc) as PersonScoreRank
from test_marketing_table
)
select *
from x
where PersonScoreRank='1';
I also need to specify some constraints: at most two C packages, one A package, and one B package can be sent. How can I reassign the IDs to a program while still using the highest probability score left available?
The final result should look like:
PersonID  MarketingProgram  ProbabilityScore  PersonScoreRank
3         C                 0.13              1
6         C                 0.09              1
1         A                 0.07              1
6         B                 0.045             2
You need to rethink your ROW_NUMBER() formula based on your actual need, and you should also have a table of Marketing Programs to make this work efficiently. This covers the basic ideas you need to incorporate to efficiently perform the filtering you need.
MarketingPrograms Table
CREATE TABLE MarketingPrograms (
ProgramID varchar(10),
PeopleDesired int
)
Populate the MarketingPrograms Table
INSERT INTO MarketingPrograms (ProgramID, PeopleDesired) Values
('A', 1),
('B', 1),
('C', 2)
Use the MarketingPrograms Table
with x as (
    select *,
           row_number() over (partition by MarketingProgram order by ProbabilityScore desc) as ProgramScoreRank
    from test_marketing_table
)
select *
from x
INNER JOIN MarketingPrograms m
    ON x.MarketingProgram = m.ProgramID
WHERE x.ProgramScoreRank <= m.PeopleDesired
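To see what a per-program cap of this kind returns on the question's sample data, here is a small sketch that runs the idea in Python against an in-memory SQLite database (a stand-in for SQL Server, so column types are simplified). It ranks people within each program, using the MarketingProgram column of test_marketing_table, and keeps ranks up to PeopleDesired.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_marketing_table (
        PersonID INT, MarketingProgram TEXT, ProbabilityScore REAL);
    INSERT INTO test_marketing_table VALUES
        (1, 'A', 0.07), (1, 'B', 0.06), (1, 'C', 0.02),
        (2, 'A', 0.02),
        (3, 'B', 0.08), (3, 'C', 0.13),
        (4, 'C', 0.02),
        (5, 'A', 0.04),
        (6, 'B', 0.045), (6, 'C', 0.09);

    CREATE TABLE MarketingPrograms (ProgramID TEXT, PeopleDesired INT);
    INSERT INTO MarketingPrograms VALUES ('A', 1), ('B', 1), ('C', 2);
""")

capped = conn.execute("""
    WITH x AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY MarketingProgram
                   ORDER BY ProbabilityScore DESC
               ) AS ProgramScoreRank
        FROM test_marketing_table
    )
    SELECT x.PersonID, x.MarketingProgram, x.ProbabilityScore, x.ProgramScoreRank
    FROM x
    INNER JOIN MarketingPrograms m ON x.MarketingProgram = m.ProgramID
    WHERE x.ProgramScoreRank <= m.PeopleDesired   -- cap rows per program
    ORDER BY x.MarketingProgram, x.ProgramScoreRank
""").fetchall()

for row in capped:
    print(row)
```

On this data the cap on each program is respected, but person 3 is selected for both B and C, which is not the allocation listed in the question. Limiting how often a person may be picked would need an extra step on top of this filter.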

avoiding group by for column used in datediff?

As the database is currently constructed, I can only use a date field of a certain table in a DATEDIFF function, and that table is also part of a COUNT aggregation (the count is not over the date field itself, but over the entity, where that date field is not null). The GROUP BY at the end messes up the counting, since the one entry with a date is counted on its own, as its own group.
In some detail:
Our lead recruiter wants a report that shows the number of applications and conducted interviews per opening. So far, no problem. Additionally, he would like to see the total duration per opening, from making it public to signing a new employee, and of course only if the opening has already been filled.
I have 4 tables to join:
table 1 holds the data of the opening
table 2 has the single applications
table 3 has the interview data of the applications
table 4 has the data regarding the publication of the openings (with the date when a certain opening was made public)
The problem is the duration requirement. Table 4 holds the starting point, and in table 2 one applicant (or none) per opening has a date field filled with the time they returned a signed contract, at which point the opening counts as filled. When I use that field in a DATEDIFF, I'm forced to also put that column in the GROUP BY clause, and that results in two rows per opening: one row has all the numbers as wanted, and the second row always holds that one person who has an entry in that date field.
So far I haven't come up with a way of avoiding that problem, other than explaining to my colleague that he will get his time-to-fill number in another report.
SELECT
table1.col1 as NameOfProject,
table1.col2 as Company,
table1.col3 as OpeningType,
table1.col4 as ReasonForOpening,
count (table2.col2) as NumberOfApplications,
sum (case when table2.colSTATUS = 'withdrawn' then 1 else 0 end) as NumberOfApplicantsWhoWithdraw,
sum (case when table3.colTypeInterview = 'PhoneInterview' then 1 else 0 end) as NumberOfPhoneInterview,
...more sum columns...,
table1.finished, -- shows "1" if opening is occupied
DATEDIFF(day, table4.colValidFrom, **table2.colContractReceived**) as DaysToCompletion
FROM
table2 left join table3 on table2.REF_NR = table3.REF_NR
join table1 on table2.PROJEKT = table1.KBEZ
left join table4 on table1.REFNR = table4.PRJ_REFNR
GROUP BY
**table2.colContractReceived**
and all other columns except the ones in aggregate (sum and count) functions go in the GROUP BY section
ORDER BY table1.NameOfProject
Here is a short rebuild of what it looks like. First a row where the opening is not filled and all aggregations come out in one row as wanted. The next project/opening shows up double, because the field used in the datediff is grouped independently...
project   company  no_of_applications  no_of_phoneinterview  no_of_personalinterview  ...  time_to_fill_in_days  filled?
2018_312  comp a   27                  4                     2                        ...  null                  0
2018_313  comp b   54                  7                     4                        ...  null                  0
2018_313  comp b   1                   1                     1                        ...  42                    1
I'd be glad to get any idea how to solve this. Thanks for considering my request!
(During the 'translation' of all the specific column and table names I might have built in a syntax error here and there, but the query worked well except for that unwanted extra aggregation per filled opening.)
If I've understood your requirement properly, you need to show the number of days between the starting point and the time at which an applicant returned a signed contract, but with only a single row per opening: the filled row if the position was filled, otherwise the unfilled row.
I've achieved this by assuming that a position counts as filled according to the "ContractsReceived" column. This may be wrong, but the principle should still give you what you are looking for.
I've essentially wrapped your query in a subquery, ranked the rows by the ContractsReceived column in descending order, partitioned by project, and then filtered for the first row of this ranking in the outer query.
Even if my assumption about the column structure and data types is wrong, this should provide you with a model to work with.
The only issue you might have with this ranking solution is if you want to aggregate over both rows in one (that is, include the summed columns from both the filled and the unfilled row per project). If that is the case, let me know and we can work around it.
Please let me know if you have any questions.
-- table variables standing in for the four real tables
declare @table1 table (
    REFNR int,
    NameOfProject nvarchar(20),
    Company nvarchar(20),
    OpeningType nvarchar(20),
    ReasonForOpening nvarchar(20),
    KBEZ int
);
declare @table2 table (
    NumberOfApplications int,
    Status nvarchar(15),
    REF_NR int,
    ReturnedApplicationDate datetime,
    ContractsReceived bit,
    PROJEKT int
);
declare @table3 table (
    TypeInterview nvarchar(25),
    REF_NR int
);
declare @table4 table (
    PRJ_REFNR int,
    StartingPoint datetime
);
insert into @table1 (REFNR, NameOfProject, Company, OpeningType, ReasonForOpening, KBEZ)
values (1, '2018_312', 'comp a' ,'Permanent', 'Business growth', 1),
       (2, '2018_313', 'comp a', 'Permanent', 'Business growth', 2),
       (3, '2018_313', 'comp a', 'Permanent', 'Business growth', 3);
insert into @table2 (NumberOfApplications, Status, REF_NR, ReturnedApplicationDate, ContractsReceived, PROJEKT)
values (27, 'Processed', 4, '2018-04-01 08:00', 0, 1),
       (54, 'Withdrawn', 5, '2018-04-02 10:12', 0, 2),
       (1, 'Processed', 6, '2018-04-15 15:00', 1, 3);
insert into @table3 (TypeInterview, REF_NR)
values ('Phone', 4),
       ('Phone', 5),
       ('Personal', 6);
insert into @table4 (PRJ_REFNR, StartingPoint)
values (1, '2018-02-25 08:00'),
       (2, '2018-03-04 15:00'),
       (3, '2018-03-04 15:00');
-- rank the rows per project so that a filled row (ContractsReceived = 1) comes first
select * from
(
    SELECT
        RANK() OVER (Partition by NameOfProject, Company order by ContractsReceived desc) as rowno,
        table1.NameOfProject,
        table1.Company,
        table1.OpeningType,
        table1.ReasonForOpening,
        case when ContractsReceived > 0 then datediff(DAY, StartingPoint, ReturnedApplicationDate) else null end as TimeToFillInDays,
        ContractsReceived Filled
    FROM
        @table2 table2 left join @table3 table3 on table2.REF_NR = table3.REF_NR
        join @table1 table1 on table2.PROJEKT = table1.KBEZ
        left join @table4 table4 on table1.REFNR = table4.PRJ_REFNR
    group by NameOfProject, Company, OpeningType, ReasonForOpening, ContractsReceived,
             StartingPoint, ReturnedApplicationDate
) x where rowno = 1
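A different route from the ranking approach, offered only as a sketch: if the date difference is wrapped in an aggregate such as MAX(), the contract date no longer has to appear in the GROUP BY, so each opening stays on one row and the counts cover all of its applications. In T-SQL that would be along the lines of MAX(DATEDIFF(day, ...)). The Python example below demonstrates the effect on an in-memory SQLite database; the tables `opening`, `application` and `publication` and their columns are hypothetical, simplified stand-ins for table1, table2 and table4, and SQLite's julianday() arithmetic replaces DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for table1, table2 and table4.
    CREATE TABLE opening     (project TEXT, company TEXT);
    CREATE TABLE application (project TEXT, status TEXT, contract_received TEXT);
    CREATE TABLE publication (project TEXT, valid_from TEXT);

    INSERT INTO opening VALUES ('2018_312', 'comp a'), ('2018_313', 'comp b');
    INSERT INTO publication VALUES
        ('2018_312', '2018-02-25'), ('2018_313', '2018-03-04');
    INSERT INTO application VALUES
        ('2018_312', 'processed', NULL),
        ('2018_312', 'withdrawn', NULL),
        ('2018_313', 'processed', NULL),
        ('2018_313', 'withdrawn', NULL),
        ('2018_313', 'processed', '2018-04-15');  -- the one signed contract
""")

rows = conn.execute("""
    SELECT o.project,
           o.company,
           COUNT(a.project) AS applications,
           SUM(CASE WHEN a.status = 'withdrawn' THEN 1 ELSE 0 END) AS withdrawn,
           -- The date sits inside MAX(), so it need not appear in GROUP BY.
           -- julianday() arithmetic is SQLite's stand-in for DATEDIFF(day, ...).
           CAST(MAX(julianday(a.contract_received) - julianday(p.valid_from))
                AS INTEGER) AS days_to_fill
    FROM application a
    JOIN opening o ON a.project = o.project
    LEFT JOIN publication p ON o.project = p.project
    GROUP BY o.project, o.company
    ORDER BY o.project
""").fetchall()

for row in rows:
    print(row)
```

The unfilled opening keeps a NULL (None) duration because every contract date in its group is NULL, while the filled opening shows its duration on the same row as its counts.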

Sort user preferences through their selections and rank

Overview
I'm attempting to create a sorter that gives each user only an item they can actually get, based on the users' ranks and their preferences.
I'm not really sure where to start with this. Below you'll see a SQL Fiddle of a simplified version of what I'm looking at.
http://sqlfiddle.com/#!3/40f0c5/1/0
Initial Code
CREATE TABLE selections
(
id int,
item_id int,
preference int
);
CREATE TABLE ranks
(
id int,
rank int
);
INSERT INTO selections
(id, item_id, preference)
VALUES
(14063, 1, 1),
(14063, 2, 2),
(14063, 3, 3),
(15026, 1, 2),
(15026, 2, 1),
(15026, 3, 3),
(25014, 1, 1),
(25014, 2, 2),
(25014, 3, 3);
INSERT INTO ranks
(id, rank)
VALUES
(14063, 1),
(15026, 2),
(25014, 3);
Expected Outcome
Based on the tables above, if I run the sorter, the results should be as shown below. Ideally, I would ONLY want to show the item the user got, based on their preference and rank.
14063(1) - item(1)
15026(2) - item(2)
25014(3) - item(3)
I was able to come up with a working solution for you, but it's far from perfect: using a WHILE loop like I'm doing here breaks one of the basic rules of SQL optimization, which is to work with set-based queries as opposed to RBAR. That said, though, I tried coming up with a way to do this with a CTE, with ROW_NUMBER(), and with some NOT EXISTS queries, and failed each time because of the dual nature of the sort. My WHILE loops are pretty unimpressive, so hopefully someone can come along and suggest some improvements for you. There are plenty of people out there whose righteous indignation could probably motivate a criticism or two - hopefully they'll also toss in some ideas or an answer of their own. :)
With that cheerfully self-critical caveat, and wishing you the best of luck on performance, here's a query that will get you the desired resultset:
DECLARE @SortingOutcome TABLE
(
    UserID INT,
    UserRank INT,
    ItemID INT,
    ItemPreference INT
)
DECLARE @Looper INT = 1
DECLARE @Ender INT
SELECT @Ender = MAX(Rank) FROM Ranks
-- walk the ranks from best to worst
WHILE @Looper <= @Ender
BEGIN
    INSERT INTO @SortingOutcome
    (
        UserID,
        UserRank,
        ItemID,
        ItemPreference
    )
    -- the most preferred item of this rank's user that nobody has taken yet
    SELECT TOP 1
        r.ID,
        rank,
        item_id,
        preference
    FROM
        Ranks r
    INNER JOIN
        Selections s ON
        r.id = s.ID
    WHERE
        r.rank = @Looper AND
        NOT EXISTS
        (
            SELECT 1
            FROM @SortingOutcome
            WHERE ItemID = s.item_id
        )
    ORDER BY preference
    SET @Looper = @Looper + 1
END
SELECT * FROM @SortingOutcome
SQLFiddle
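For anyone wanting to check the logic outside the database, the loop amounts to a greedy pass: take users from best rank to worst, and give each one their most preferred item that is still free. Here is a minimal Python sketch of that pass on the question's data. The function name `assign_items` is made up for illustration, and unlike the WHILE loop it processes every user, including users who share a rank.

```python
# (user id, item id, preference) rows from the selections table
selections = [
    (14063, 1, 1), (14063, 2, 2), (14063, 3, 3),
    (15026, 1, 2), (15026, 2, 1), (15026, 3, 3),
    (25014, 1, 1), (25014, 2, 2), (25014, 3, 3),
]
# user id -> rank, from the ranks table (1 is the best rank)
ranks = {14063: 1, 15026: 2, 25014: 3}


def assign_items(ranks, selections):
    """Walk users from best rank to worst; each takes their top free item."""
    taken = set()
    outcome = {}
    for user_id, _rank in sorted(ranks.items(), key=lambda kv: kv[1]):
        # This user's choices, most preferred (lowest number) first.
        choices = sorted(
            (pref, item) for uid, item, pref in selections if uid == user_id
        )
        for _pref, item in choices:
            if item not in taken:
                taken.add(item)
                outcome[user_id] = item
                break
    return outcome


outcome = assign_items(ranks, selections)
print(outcome)
```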

sql select a field into 2 columns

I am trying to run the 2 queries below on the same table, hoping to get the results in 2 different columns.
Query 1: select ID as M from table where field = 1
returns:
1
2
3
Query 2: select ID as N from table where field = 2
returns:
4
5
6
My goal is to get
Column1 - Column2
-----------------
1 4
2 5
3 6
Any suggestions? I am using SQL Server 2008 R2
Thanks
There has to be a primary key to foreign key relationship to JOIN data between two tables.
That is the idea behind relational algebra and normalization. Otherwise, the correlation of the data is meaningless.
http://en.wikipedia.org/wiki/Database_normalization
The CROSS JOIN will give you all possibilities. (1,4), (1,5), (1, 6) ... (3,6). I do not think that is what you want.
You can always use a ROW_NUMBER() OVER () function to generate a surrogate key in both tables. Order the data the way you want inside the OVER () clause. However, this is still not in any Normal form.
In short: why do this?
Here is a quick test database. It stores products from sporting goods and home goods in non-normal form.
The results of the final SELECT do not mean anything.
-- Just play
use tempdb;
go
-- Drop table
if object_id('abnormal_form') > 0
drop table abnormal_form
go
-- Create table
create table abnormal_form
(
Id int,
Category int,
Name varchar(50)
);
-- Load store products
insert into abnormal_form values
(1, 1, 'Bike'),
(2, 1, 'Bat'),
(3, 1, 'Ball'),
(4, 2, 'Pot'),
(5, 2, 'Pan'),
(6, 2, 'Spoon');
-- Sporting Goods
select * from abnormal_form where Category = 1
-- Home Goods
select * from abnormal_form where Category = 2
-- Does not mean anything to me
select Id1, Id2 from
    (select ROW_NUMBER () OVER (ORDER BY ID) AS Rid1, Id as Id1
     from abnormal_form where Category = 1) as s
join
    (select ROW_NUMBER () OVER (ORDER BY ID) AS Rid2, Id as Id2
     from abnormal_form where Category = 2) as h
on s.Rid1 = h.Rid2
We definitely need more information from the user.
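For reference, here is what the ROW_NUMBER() pairing produces on the test rows, as a sketch run from Python against an in-memory SQLite database (standing in for SQL Server 2008 R2, which is an assumption of this example).

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE abnormal_form (Id INT, Category INT, Name TEXT);
    INSERT INTO abnormal_form VALUES
        (1, 1, 'Bike'), (2, 1, 'Bat'), (3, 1, 'Ball'),
        (4, 2, 'Pot'),  (5, 2, 'Pan'), (6, 2, 'Spoon');
""")

pairs = conn.execute("""
    SELECT s.Id1, h.Id2
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY Id) AS Rid1, Id AS Id1
          FROM abnormal_form WHERE Category = 1) AS s
    JOIN (SELECT ROW_NUMBER() OVER (ORDER BY Id) AS Rid2, Id AS Id2
          FROM abnormal_form WHERE Category = 2) AS h
      ON s.Rid1 = h.Rid2       -- pair rows by position, not by any real key
    ORDER BY s.Rid1
""").fetchall()

print(pairs)
```

The rows line up only by their position in each ordered list. If the two categories had different row counts, the inner join would drop the unmatched rows; in SQL Server a FULL OUTER JOIN on the row numbers would keep them, with NULL on the shorter side.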