Several rows grouped in one row

I need a SQL solution for the following problem.
I have these rows in a table:
cod coda pricea priceb pricec
x1 y 20 50
x2 y 20 50
x3 y 60
x4 z 80
x5 z 85
x6 z 85
I need to collapse these into a single row per coda, given that the prices are always the same within a coda:
coda pricea priceb pricec
y 20 50 60
z 80 85 85
How can I get this result with SQL?
I tried SUM with GROUP BY coda, but that returns the sum of the prices.

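If the prices are always the same for a given coda value, you can group by coda and use MIN (or MAX) to get them into one row. Assuming the empty cells are NULL, the aggregate skips them and each column picks up its one distinct price:
select coda
, min(pricea) pricea
, min(priceb) priceb
, min(pricec) pricec
from your_table -- placeholder name; "table" itself is a reserved word
group by coda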
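
A minimal sketch of this query, run against SQLite through Python's sqlite3 module as a stand-in for whatever database is actually in use. The table name prices is hypothetical, and the placement of each price in its column (with NULL for the blank cells) is an assumption inferred from the expected result in the question.

```python
import sqlite3

# In-memory database; "prices" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (cod TEXT, coda TEXT, pricea INT, priceb INT, pricec INT)"
)

# Assumed layout: blank cells in the question are stored as NULL (None).
rows = [
    ("x1", "y", 20, 50, None),
    ("x2", "y", 20, 50, None),
    ("x3", "y", None, None, 60),
    ("x4", "z", 80, None, None),
    ("x5", "z", None, 85, None),
    ("x6", "z", None, None, 85),
]
conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?, ?)", rows)

# MIN ignores NULLs, so each group keeps its one non-NULL price per column.
result = conn.execute(
    """
    SELECT coda, MIN(pricea), MIN(priceb), MIN(pricec)
    FROM prices
    GROUP BY coda
    ORDER BY coda
    """
).fetchall()

for row in result:
    print(row)
```

SUM gives the wrong answer here only because a price that repeats within a coda (20 and 50 for y) is added once per row; MIN and MAX are unaffected by the repetition.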

Related

Guidance needed | Optimization challenge. Would love to get some inputs 🙏🏻

I need some guidance on how to approach this problem. I've simplified a real-life example, and any pointers that help me crack it would be awesome.
I've been looking at the public optimization tools at https://www.cvxpy.org/, but I'm a noob and can't figure out which algorithm would help me (or whether I really need one).
Problem:
x1 to x4 are items with certain properties (a,b,c,y,z)
I have certain needs:
Parameter My Needs
a 150
b 800
c 80
My goal is to get all optimal coefficient sets for x1 to x4 (the coefficients can be fractions) so as to get as much of a, b and c as possible to satisfy the needs from the smallest possible y.
These conditions must always be met:
1) The individual z value of each item must stay within its threshold (between the minimum and maximum z for x1, x2, x3 and x4).
2) Total y must stay within limits (y >= 1000 and y <= 2000).
To illustrate an example:
x1
Each x1 has the following properties
a 20 Minimum z 10 Maximum z 50
b 200
c 0
y 300
z 20
x2
Each x2 has the following properties
a 30 Minimum z 60 Maximum z 160
b 5
c 20
y 50
z 40
x3
Each x3 has the following properties
a 20 Minimum z 100 Maximum z 200
b 200
c 15
y 200
z 40
x4
Each x4 has the following properties
a 5 Minimum z 100 Maximum z 300
b 30
c 20
y 500
z 200
One possible arrangement is shown below. It is not the optimal solution, since I'm trying to keep y as low as possible while staying above 1000, but it illustrates the output:
2x1+2x2+3x3+0.5x4
In this instance:
Coeff x1 2
Coeff x2 2
Coeff x3 3
Coeff x4 0.5
This set of coefficients yields
Optimal?
total y 1550 Yes
total a 162.5 Yes
total b 1025 Yes
total c 95 Yes
z of x1 40 Yes
z of x2 80 Yes
z of x3 120 Yes
z of x4 100 Yes
Lowest y? No
Can anyone help me out?
Thanks!
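
The arithmetic in the example arrangement can be checked with a short script. This is only a sketch under two assumptions that the question implies but does not state outright: the needs for a, b and c are read as minimums, and every property (including z) scales linearly with the coefficient. The names ITEMS, NEEDS and evaluate are hypothetical.

```python
# Per-unit properties of each item, copied from the question.
ITEMS = {
    "x1": {"a": 20, "b": 200, "c": 0, "y": 300, "z": 20, "z_min": 10, "z_max": 50},
    "x2": {"a": 30, "b": 5, "c": 20, "y": 50, "z": 40, "z_min": 60, "z_max": 160},
    "x3": {"a": 20, "b": 200, "c": 15, "y": 200, "z": 40, "z_min": 100, "z_max": 200},
    "x4": {"a": 5, "b": 30, "c": 20, "y": 500, "z": 200, "z_min": 100, "z_max": 300},
}
NEEDS = {"a": 150, "b": 800, "c": 80}  # assumed to be lower bounds
Y_MIN, Y_MAX = 1000, 2000


def evaluate(coeffs):
    """Return totals, per-item z values, and whether every condition holds."""
    totals = {
        key: sum(coeffs[name] * ITEMS[name][key] for name in ITEMS)
        for key in ("a", "b", "c", "y")
    }
    # Assumption: z of an item is its per-unit z times its coefficient.
    z_values = {name: coeffs[name] * ITEMS[name]["z"] for name in ITEMS}
    feasible = (
        all(totals[key] >= NEEDS[key] for key in NEEDS)
        and Y_MIN <= totals["y"] <= Y_MAX
        and all(
            ITEMS[name]["z_min"] <= z_values[name] <= ITEMS[name]["z_max"]
            for name in ITEMS
        )
    )
    return totals, z_values, feasible


totals, z_values, feasible = evaluate({"x1": 2, "x2": 2, "x3": 3, "x4": 0.5})
print(totals, z_values, feasible)
```

Under these assumptions the objective (total y) and every condition are linear in the four coefficients, so the problem has the shape of a linear program, which is one of the problem classes CVXPY is built to handle.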

SQL : SELECT all rows with maximum values and with WHERE condition also

I have a table that looks like this:
id data version rulecode
---------------------------
a1 1 100 x1
a2 1 100 x1
a1 1 200 x2
a4 2 500 x2
a7 2 200 x1
a6 2 500 x1
a7 2 500 x2
a8 2 150 x2
a9 3 120 x1
a10 3 130 x2
a10 3 120 x1
a12 3 130 x2
a13 3 130 x1
a14 3 110 x2
a15 3 110 x1
a16 4 220 x1
a17 4 230 x2
a18 4 240 x2
a19 4 240 x1
..........................
..........................
Now I want only the rows that have the maximum version within each data value, restricted to the data values 1, 2 and 4.
When I tried dense_rank(), I got rows for only one value of the data column:
SELECT * FROM
(SELECT *, dense_rank() OVER (ORDER BY version desc) as col
FROM public.table_name WHERE data in (1,2,4))x
WHERE x.col=1
Output:
id data version rulecode
---------------------------
a1 1 200 x2
My expected output:
id data version rulecode
a1 1 200 x2
a4 2 500 x2
a6 2 500 x1
a7 2 500 x2
a18 4 240 x2
a19 4 240 x1
Note: the values in the data column run into the millions.
Can someone help me out here to get the expected output?

You seem to want a PARTITION BY, so that the ranking restarts for each data value:
SELECT *
FROM (SELECT *,
DENSE_RANK() OVER (PARTITION BY data ORDER BY version desc) as seqnum
FROM public.table_name
WHERE data in (1, 2, 4)
) x
WHERE x.seqnum = 1

Using analytic functions:
WITH cte AS (
SELECT *, MAX(version) OVER (PARTITION BY data) max_version
FROM yourTable
)
SELECT id, data, version, rulecode
FROM cte
WHERE version = max_version AND data IN (1, 2, 4);
Note that we could have also filtered the data values inside the CTE. I will leave it as is, for a general solution to your problem.
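
A minimal sketch that runs both approaches on a subset of the sample rows and checks that they agree. It uses SQLite through Python's sqlite3 module as a stand-in for the real database, so it assumes a SQLite build with window-function support (3.25 or later); the subset of rows is chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (id TEXT, data INT, version INT, rulecode TEXT)"
)
# A subset of the rows from the question.
rows = [
    ("a1", 1, 100, "x1"), ("a2", 1, 100, "x1"), ("a1", 1, 200, "x2"),
    ("a4", 2, 500, "x2"), ("a7", 2, 200, "x1"), ("a6", 2, 500, "x1"),
    ("a7", 2, 500, "x2"), ("a8", 2, 150, "x2"),
    ("a9", 3, 120, "x1"), ("a10", 3, 130, "x2"),
    ("a16", 4, 220, "x1"), ("a17", 4, 230, "x2"),
    ("a18", 4, 240, "x2"), ("a19", 4, 240, "x1"),
]
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?)", rows)

# Approach 1: rank versions within each data value, keep rank 1.
ranked = conn.execute(
    """
    SELECT id, data, version, rulecode
    FROM (SELECT *,
                 DENSE_RANK() OVER (PARTITION BY data ORDER BY version DESC) AS seqnum
          FROM table_name
          WHERE data IN (1, 2, 4)) x
    WHERE seqnum = 1
    ORDER BY data, id
    """
).fetchall()

# Approach 2: attach the per-data maximum version, keep the rows that match it.
maxed = conn.execute(
    """
    WITH cte AS (
        SELECT *, MAX(version) OVER (PARTITION BY data) AS max_version
        FROM table_name
    )
    SELECT id, data, version, rulecode
    FROM cte
    WHERE version = max_version AND data IN (1, 2, 4)
    ORDER BY data, id
    """
).fetchall()

for row in ranked:
    print(row)
```

Both queries keep ties, so every row that shares the maximum version within a data value is returned.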

How to do conditional count based on row value in SAS/SQL?

Re-uploading since there were some problems with my last post, and I did not know that we were supposed to post sample data. I'm fairly new to SAS, and I have a problem that I know how to solve in Excel but not in SAS. However, the dataset is too large to reasonably use in Excel.
I have four variables: id, year_start, group_name, test_score.
Sample data:
id year_start group_name test_score
1 19931231 Red 90
1 19941230 Red 89
1 19951231 Red 91
1 19961231 Red 92
2 19930630 Red 85
2 19940629 Red 87
2 19950630 Red 95
3 19950931 Blue 90
3 19960931 Blue 90
4 19930331 Red 95
4 19940331 Red 97
4 19950330 Red 98
4 19960331 Red 95
5 19931231 Red 96
5 19941231 Red 97
My goal is a fractional ranked list by test_score for each year. I had hoped to achieve this with PROC RANK and its FRACTION option, which would order by test_score (highest is 1, second highest is 2, and so on) and then divide by the total number of observations to give a fractional rank. Unfortunately, year_start differs widely from row to row. For each id/year combination, I want to look back one year from year_start and rank that observation against all other ids that have a year_start within that one-year range. I'm not interested in comparing by calendar year; the rank of each id should be relative to its own year_start. As a further complication, I would like the rank to be computed by group_name.
PROC SQL is totally fine if someone has a SQL solution.
Using the above data, the ranks would be like this:
id year_start group_name test_score rank
1 19931231 Red 90 0.75
1 19941230 Red 89 0.8
1 19951231 Red 91 1
1 19961231 Red 92 1
2 19930630 Red 85 1
2 19940629 Red 87 0.8
2 19950630 Red 95 0.75
3 19950931 Blue 90 1
3 19960931 Blue 90 1
4 19930331 Red 95 1
4 19940331 Red 97 0.2
4 19950330 Red 98 0.2
4 19960331 Red 95 0.333
5 19931231 Red 96 0.25
5 19941231 Red 97 0.667
In order to calculate the rank for row 1,
we first exclude blue observations.
Then we count the number of observations that fall within a year before that year_start, 19931231 (so we have 4 observations).
We count how many of these observations have a higher test_score, and then add 1 to find the order of the current observation (So it is the 3rd highest).
Then, we divide the order by the total number to get the rank (3/4= 0.75).
In Excel, the formula for this variable would look something like this. Assume the formula is for row 1 and there are 100 rows, with id=A, year_start=B, group_name=C, and test_score=D:
=(1+countifs(D1:D100,">"&D1,
B1:B100,"<="&B1,
B1:B100,">"&B1-365.25,
C1:C100, C1))/
countifs(B1:B100,"<="&B1,
B1:B100,">"&B1-365.25,
C1:C100, C1)
Thanks so much for the help!
ahammond428
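
The counting rule described in the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not a SAS solution: the look-back window is taken as one calendar year, excluding the lower boundary and including year_start itself; the two Blue dates are entered as 30 September because 31 September does not exist; and the names ROWS, one_year_before and fractional_rank are hypothetical.

```python
from datetime import date

# (id, year_start, group_name, test_score), from the sample data.
ROWS = [
    (1, date(1993, 12, 31), "Red", 90), (1, date(1994, 12, 30), "Red", 89),
    (1, date(1995, 12, 31), "Red", 91), (1, date(1996, 12, 31), "Red", 92),
    (2, date(1993, 6, 30), "Red", 85), (2, date(1994, 6, 29), "Red", 87),
    (2, date(1995, 6, 30), "Red", 95),
    (3, date(1995, 9, 30), "Blue", 90), (3, date(1996, 9, 30), "Blue", 90),
    (4, date(1993, 3, 31), "Red", 95), (4, date(1994, 3, 31), "Red", 97),
    (4, date(1995, 3, 30), "Red", 98), (4, date(1996, 3, 31), "Red", 95),
    (5, date(1993, 12, 31), "Red", 96), (5, date(1994, 12, 31), "Red", 97),
]


def one_year_before(d):
    """Same day in the previous year; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def fractional_rank(row, rows):
    """(1 + number of higher scores in the window) / (window size)."""
    _, start, group, score = row
    lower = one_year_before(start)
    # Same group, year_start within the year leading up to this row's year_start.
    window = [r for r in rows if r[2] == group and lower < r[1] <= start]
    order = 1 + sum(1 for r in window if r[3] > score)
    return order / len(window)


ranks = [round(fractional_rank(r, ROWS), 3) for r in ROWS]
for row, rank in zip(ROWS, ranks):
    print(row[0], row[1].isoformat(), row[2], row[3], rank)
```

For the first row this reproduces the worked steps: four Red observations fall in the window, two of them score higher, so the rank is 3/4 = 0.75. Changing the strict inequality on the lower boundary changes which observations a full year back are counted.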

Your example isn't correct if I'm reading it correctly, so it's hard to know exactly what you're trying to do. But try the following and see if it works. You may need to tweak inequalities to be open or closed depending on whether you want to include one year to the date. Note that your year_start column needs to be imported in a SAS date format for this to work. Otherwise you can change it over with input(year_start, yymmdd8.).
proc sql;
select distinct
a.id,
a.year_start,
a.group_name,
a.test_score,
1+sum(case when b.test_score > a.test_score then 1 else 0 end) as rank_num,
count(b.id) as rank_denom,
calculated rank_num / calculated rank_denom as rank
from testdata a left join testdata b
on a.group_name = b.group_name
and intnx('year',a.year_start,-1,'s') le b.year_start le a.year_start
group by a.id, a.year_start, a.group_name, a.test_score
order by id, year_start;
quit;
Note that I changed dates of 9/31 to 9/30 (since there is no 9/31), but left 3/30, 6/29, and 12/30 alone since perhaps that was intended, though the other dates seem to be quarter-end.

Consider correlated count subqueries in SQL:
DATA
data ranktable;
infile datalines missover;
input id year_start group_name $ test_score;
datalines;
1 19931231 Red 90
1 19941230 Red 89
1 19951231 Red 91
1 19961231 Red 92
2 19930630 Red 85
2 19940629 Red 87
2 19950630 Red 95
3 19950930 Blue 90
3 19960930 Blue 90
4 19930331 Red 95
4 19940331 Red 97
4 19950330 Red 98
4 19960331 Red 95
5 19931231 Red 96
5 19941231 Red 97
;
run;
data ranktable;
set ranktable;
format year_start date9.;
year_start = input(put(year_start,z8.),yymmdd8.);
run;
PROC SQL
Additional fields included for your review
proc sql;
select r.id, r.year_start, r.group_name, r.test_score,
put(intnx('year', r.year_start, -1, 's'), yymmdd10.) as year_ago,
(select count(*) from ranktable sub
where sub.test_score >= r.test_score
and sub.group_name = r.group_name
and sub.year_start <= r.year_start
and sub.year_start >= intnx('year', r.year_start, -1, 's')) as num_rank,
(select count(*) from ranktable sub
where sub.group_name = r.group_name
and sub.year_start <= r.year_start
and sub.year_start >= intnx('year', r.year_start, -1, 's')) as denom_rank,
calculated num_rank / calculated denom_rank as rank
from ranktable r;
quit;
OUTPUT
You will notice slight differences from your expected results. These may be due to the 365.25-day year you apply to every year, whereas SAS's intnx steps back one full calendar year, whose length in days varies from year to year.

Firebird sql select multiple rows

I have a table in Firebird 2.5 like this:
Point X Y Z
1 100 100 50
2 110 120 50.34
3 145 155 56
How can I write a select query that returns point 1 and point 3 in a single row, like this?
point1 P1X P1Y P1Z point2 P2X P2Y P2Z
1 100 100 50 3 145 155 56

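What you really want to do is a bit unclear. The following returns the desired results for this sample data, but only because point 1 holds the minimum and point 3 the maximum of every column: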
select min(point) as point1, min(x) as p1x, min(y) as p1y, min(z) as p1z,
max(point) as point2, max(x) as p2x, max(y) as p2y, max(z) as p2z
from t;
Alternatively, you might want:
select p1.*, p2.*
from t p1 join
t p2
on p1.point = 1 and p2.point = 3;
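
A minimal sketch of the self-join variant with explicit column aliases, so that the output columns carry the names from the question. It runs on SQLite through Python's sqlite3 module rather than on Firebird, which is an assumption made only to keep the example self-contained; the table name t follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (point INT, x REAL, y REAL, z REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 100, 100, 50), (2, 110, 120, 50.34), (3, 145, 155, 56)],
)

# Join the table to itself: p1 is pinned to the first point, p2 to the second.
cursor = conn.execute(
    """
    SELECT p1.point AS point1, p1.x AS p1x, p1.y AS p1y, p1.z AS p1z,
           p2.point AS point2, p2.x AS p2x, p2.y AS p2y, p2.z AS p2z
    FROM t p1
    JOIN t p2 ON p1.point = ? AND p2.point = ?
    """,
    (1, 3),
)
columns = [d[0] for d in cursor.description]
row = cursor.fetchone()

print(columns)
print(row)
```

Unlike the min/max version, the join picks the two points by their point value, so it keeps working when the chosen points do not hold the extreme values of every column.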

SQL combine columns based on another column

I've got a list of items and two locations (X and Y) that these items can be in. The two locations hold these items in different quantities.
So when someone places an order for a few items, the items can be pulled from either of the two locations.
Below is the 'Orders' table I've created, but it shows the available stock for both locations in two separate columns.
ItemNumber Location Stock X Stock Y
A X 12 32
B X 10 54
C X 5 23
A Y 54 30
C Y 65 36
D Y 76 23
E X 12 31
F X 32 19
F Y 72 40
What I want to see is the available stock for the requested location in a single column, rather than the stock for both locations in two columns as above.
The result table I want to see is:
ItemNumber Location Avail Stock
A X 12
B X 10
C X 5
A Y 30
C Y 36
D Y 23
E X 12
F X 32
F Y 40
I just can't get my head around how to do this. It would be great if anyone could help, or tell me whether it's even possible.
Thanks

You can use a CASE WHEN expression:
SELECT ItemNumber,
Location,
CASE WHEN Location = 'X' THEN [Stock X]
WHEN Location = 'Y' THEN [Stock Y]
END Avail_Stock
FROM Orders

You have the union tag, so (use UNION ALL instead if identical rows must be kept rather than merged):
SELECT ItemNumber,
Location,
[Stock X] AS Avail_Stock
FROM Orders
WHERE Location = 'X'
UNION
SELECT ItemNumber,
Location,
[Stock Y] AS Avail_Stock
FROM Orders
WHERE Location = 'Y'
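
A minimal sketch that runs the CASE version and the UNION version on the sample data and checks that they return the same rows. It uses SQLite through Python's sqlite3 module as a stand-in; SQLite happens to accept the bracketed column names, but that choice of engine is an assumption and the original database is not stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (ItemNumber TEXT, Location TEXT, [Stock X] INT, [Stock Y] INT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        ("A", "X", 12, 32), ("B", "X", 10, 54), ("C", "X", 5, 23),
        ("A", "Y", 54, 30), ("C", "Y", 65, 36), ("D", "Y", 76, 23),
        ("E", "X", 12, 31), ("F", "X", 32, 19), ("F", "Y", 72, 40),
    ],
)

# One pass over the table: pick the stock column that matches the location.
case_rows = conn.execute(
    """
    SELECT ItemNumber, Location,
           CASE WHEN Location = 'X' THEN [Stock X]
                WHEN Location = 'Y' THEN [Stock Y]
           END AS Avail_Stock
    FROM Orders
    ORDER BY ItemNumber, Location
    """
).fetchall()

# Two filtered passes glued together; UNION also removes duplicate rows.
union_rows = conn.execute(
    """
    SELECT ItemNumber, Location, [Stock X] AS Avail_Stock
    FROM Orders WHERE Location = 'X'
    UNION
    SELECT ItemNumber, Location, [Stock Y] AS Avail_Stock
    FROM Orders WHERE Location = 'Y'
    ORDER BY ItemNumber, Location
    """
).fetchall()

for row in case_rows:
    print(row)
```

The two forms agree here because no row is repeated. The CASE form returns NULL for any location other than X or Y, while the UNION form drops such rows entirely.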
