Multiple rows of data, need subquery to only pull one row? - sql

I currently have data like so:
Product_ID  IND  1_Revenue  2_Revenue  Revenue_Code  Channel
1           S    $50        $75        1             E
1           S    $50        $75        2             SE
2           P    $100       $0         1             E
3           S    $400       $60        1             SE
3           S    $400       $60        2             S
For rows where IND = 'S', I want the row with the highest revenue where Channel = 'SE'. Revenue_Code refers to the fields 1_Revenue and 2_Revenue.
So in this case I'd expect the output to contain the 2nd and 4th rows.
I’ve tried multiple things and nothing has worked. What is the best solution?

As we understand it, a simple WHERE clause is enough to get your result:
select Product_ID, IND, 1_Revenue, 2_Revenue, Revenue_Code, Channel
from yourtable
where IND = 'S' and Channel = 'SE'
If you need anything else, please say so.

I don't quite understand what you mean by the highest revenue. Based on your description, if you just filter for rows where IND = 'S' and Channel = 'SE', won't you get rows 2 and 4, as follows?
data want;
set have;
if IND = 'S' and channel = 'SE';
run;
Or, if you want to use SQL:
PROC SQL;
create table want as
select * from have where IND = 'S' and channel = 'SE';
quit;
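
To check the filter against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module rather than SAS. The table name `have` follows the answer above. The CASE expression rests on an assumption the question does not confirm: that Revenue_Code = 1 points at 1_Revenue and Revenue_Code = 2 points at 2_Revenue.

```python
import sqlite3

# Sample rows copied from the question; revenue stored as plain numbers.
rows = [
    (1, "S", 50, 75, 1, "E"),
    (1, "S", 50, 75, 2, "SE"),
    (2, "P", 100, 0, 1, "E"),
    (3, "S", 400, 60, 1, "SE"),
    (3, "S", 400, 60, 2, "S"),
]

conn = sqlite3.connect(":memory:")
# Column names that start with a digit are quoted so the parser accepts them.
conn.execute(
    'CREATE TABLE have (Product_ID INTEGER, IND TEXT, "1_Revenue" REAL, '
    '"2_Revenue" REAL, Revenue_Code INTEGER, Channel TEXT)'
)
conn.executemany("INSERT INTO have VALUES (?, ?, ?, ?, ?, ?)", rows)

# Assumption: Revenue_Code = 1 selects 1_Revenue, Revenue_Code = 2 selects 2_Revenue.
query = """
SELECT Product_ID, Revenue_Code, Channel,
       CASE Revenue_Code WHEN 1 THEN "1_Revenue" ELSE "2_Revenue" END AS revenue
FROM have
WHERE IND = 'S' AND Channel = 'SE'
ORDER BY Product_ID
"""
selected = conn.execute(query).fetchall()
print(selected)
```

With this data the filter alone already returns the 2nd and 4th rows, so no "highest revenue" step is needed unless a product can have several S/SE rows.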

Related

How can I use a row value to dynamically select a column name in Oracle SQL 11g?

I have two tables, one with a single row for each "batch_number" and another with defect details for each batch. The first table has a "defect_of_interest" column which I would like to link to one of the columns in the second table. I am trying to write a query that would then pick the maximum value in that dynamically linked column for any "unit_number" in the "batch_number".
Here is the SQLFiddle with example data for each table: http://sqlfiddle.com/#!9/a1c27d
For example, the maximum value in the DEFECT_DETAILS.SCRATCHES column for BATCH_NUMBER = A1 is 12.
Here is my desired output:
BATCH_NUMBER DEFECT_OF_INTEREST MAXIMUM_DEFECT_COUNT
------------ ------------------ --------------------
A1 SCRATCHES 12
B3 BUMPS 4
C2 STAINS 9
I have tried using the PIVOT function, but I can't get it to work. Not sure if it works in cases like this. Any help would be much appreciated.
If the number of columns is fixed (it seems to be) you can use CASE to select the specific value according to the related table. Then aggregating is simple.
For example:
select
  batch_number,
  max(defect_of_interest) as defect_of_interest,
  max(defect_count) as maximum_defect_count
from (
  select
    d.batch_number,
    b.defect_of_interest,
    case when b.defect_of_interest = 'SCRATCHES' then d.scratches
         when b.defect_of_interest = 'BUMPS' then d.bumps
         when b.defect_of_interest = 'STAINS' then d.stains
    end as defect_count
  from defect_details d
  join batches b on b.batch_number = d.batch_number
) x
group by batch_number
order by batch_number;
See Oracle example in db<>fiddle.
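
For anyone without access to the fiddle, the same CASE idea can be tried locally. This is only a sketch using Python's sqlite3 module instead of Oracle, and the rows are made up (the fiddle data is not reproduced here); they are chosen only so that the maxima match the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE batches (batch_number TEXT, defect_of_interest TEXT);
CREATE TABLE defect_details (
    batch_number TEXT, unit_number INTEGER,
    scratches INTEGER, bumps INTEGER, stains INTEGER
);
-- Made-up rows, chosen only so the maxima match the desired output.
INSERT INTO batches VALUES ('A1', 'SCRATCHES'), ('B3', 'BUMPS'), ('C2', 'STAINS');
INSERT INTO defect_details VALUES
    ('A1', 1, 12, 3, 0), ('A1', 2, 7, 20, 1),
    ('B3', 1, 5, 4, 2),  ('B3', 2, 9, 1, 8),
    ('C2', 1, 0, 6, 9),  ('C2', 2, 30, 2, 4);
""")

# CASE picks the column named by defect_of_interest; MAX then aggregates per batch.
query = """
SELECT d.batch_number, b.defect_of_interest,
       MAX(CASE b.defect_of_interest
               WHEN 'SCRATCHES' THEN d.scratches
               WHEN 'BUMPS' THEN d.bumps
               WHEN 'STAINS' THEN d.stains
           END) AS maximum_defect_count
FROM defect_details d
JOIN batches b ON b.batch_number = d.batch_number
GROUP BY d.batch_number, b.defect_of_interest
ORDER BY d.batch_number
"""
result = conn.execute(query).fetchall()
print(result)
```

Note that the other columns hold larger values in some rows (for example 20 bumps in batch A1), and they are ignored because CASE only looks at the column of interest for that batch.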

SQL: Select Top 2 Query is Excluding Records with more than 2 Records

I just joined after having a problem writing a query in MS Access. I am trying to write a query that pulls the first two valid samples from a list of replicated sample results and then averages their values. My current query does this for samples with exactly two valid results, but it skips samples that have more than two valid results. Here's my query:
SELECT temp_platevalid_table.samp_name AS samp_name, avg (temp_platevalid_table.mean_conc) AS fin_avg, count(temp_platevalid_table.samp_valid) AS sample_count
FROM Temp_PlateValid_table
WHERE (Temp_PlateValid_table.id In (SELECT TOP 2 S.id
FROM Temp_PlateValid_table as S
WHERE S.samp_name = S.samp_name and s.samp_valid=1 and S.samp_valid=1
ORDER BY ID))
GROUP BY Temp_PlateValid_table.samp_name
HAVING ((Count(Temp_PlateValid_table.samp_valid))=2)
ORDER BY Temp_PlateValid_table.samp_name;
Here's an example of what I'm trying to do:
ID Samp_Name Samp_Valid Mean_Conc
1 54d2d2 1 15
2 54d2d2 1 20
3 54d2d2 1 25
The average mean_conc should be 17.5, however, with my current query, I wouldn't receive a value at all for 54d2d2. Is there a way to tweak my query so that I get a value for samples that have more than two valid values? Please note that I'm using MS Access, so I don't think I can use fancier SQL code (partition by, etc.).
Thanks in advance for your help!
Is this what you want?
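select pv.samp_name, avg(pv.mean_conc)
from Temp_PlateValid_table pv
where pv.samp_valid = 1 and
      pv.id in (select top 2 pv2.id
                from Temp_PlateValid_table as pv2
                where pv2.samp_name = pv.samp_name and pv2.samp_valid = 1
                order by pv2.id
               )
group by pv.samp_name;
The subquery is correlated on samp_name, so the first two ids are picked per sample, and the HAVING Count(...) = 2 filter is no longer needed. You might need avg(pv.mean_conc * 1.0).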
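
The correlated "first two per sample" pattern can be checked against the example rows from the question. This is a sketch in Python's sqlite3 module, not Access: SQLite has no TOP, so ORDER BY ... LIMIT 2 stands in for it here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Temp_PlateValid_table "
    "(id INTEGER, samp_name TEXT, samp_valid INTEGER, mean_conc REAL)"
)
# Example rows from the question: three valid replicates of one sample.
conn.executemany(
    "INSERT INTO Temp_PlateValid_table VALUES (?, ?, ?, ?)",
    [(1, "54d2d2", 1, 15), (2, "54d2d2", 1, 20), (3, "54d2d2", 1, 25)],
)

# The subquery is correlated on samp_name, so the two lowest ids are chosen per sample.
query = """
SELECT pv.samp_name, AVG(pv.mean_conc) AS fin_avg
FROM Temp_PlateValid_table AS pv
WHERE pv.id IN (SELECT pv2.id
                FROM Temp_PlateValid_table AS pv2
                WHERE pv2.samp_name = pv.samp_name AND pv2.samp_valid = 1
                ORDER BY pv2.id
                LIMIT 2)
GROUP BY pv.samp_name
"""
averages = conn.execute(query).fetchall()
print(averages)  # [('54d2d2', 17.5)]
```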

SQL Case with calculation on 2 columns

I have a value table, and I need to write a CASE statement that touches two columns. Here is an example:
Type  State  Min  Max  Value
A     TX     2    15   100
A     TX     16   30   200
A     TX     31+       500
Let's say I have another table that has the following:
Type  State  Weight  Value
A     TX     14      ?
When I join the tables, I need a CASE statement that takes Weight, Type, and State from table 2, compares them to table 1, recognizes that the weight falls between 2 and 15 (row 1), and updates Value in table 2 with 100.
Is this doable?
Thanks
This returns 0 if no row covers the weight. Here table_a is the table with Weight and table_b is the value table.
select Type, State, Weight,
       coalesce((select table_b.Value
                 from table_b
                 where table_b.Type = table_a.Type
                   and table_b.State = table_a.State
                   and table_a.Weight between table_b.Min and table_b.Max), 0) as Value
from table_a
For an Alteryx solution: (1) run both tables into a Join tool, joining on Type and State; (2) send the output to a Filter tool where you force Weight to be between Min and Max; (3) send that output to a Select tool, where you grab only the specific columns you want, since the Join will give you all columns from all tables. Done.
Caveats: the data running from Join to Filter could be large, since you are joining every Type/State combination in the Lookup table to the other table. Depending on the size of your datasets, that might be cumbersome. Alteryx is very fast though, and at least we're limiting on State and Type, so if your datasets aren't too large, this simple solution will work fine.
With larger data, try to do it as part of your original select, utilizing one of the other solutions given here for your SQL query.
Assuming that the Min and Max columns in the first table are of integer type, you need an INNER JOIN on the ranges:
SELECT *
FROM another_table a
JOIN first_table b
ON a.type = b.type
AND a.State = b.State
AND a.Weight BETWEEN b.min AND b.max
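
A runnable sketch of the range join, using Python's sqlite3 module. Two things here are assumptions, not part of the question: the columns are renamed to the hypothetical `min_w` and `max_w` to avoid clashing with the MIN/MAX function names, and the open-ended "31+" row is stored as min_w = 31 with max_w = NULL. A plain BETWEEN never matches a NULL bound, so the upper bound is tested separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_table (type TEXT, state TEXT, min_w INTEGER, max_w INTEGER, value INTEGER);
CREATE TABLE another_table (type TEXT, state TEXT, weight INTEGER);
-- Assumption: the '31+' band is stored with a NULL upper bound.
INSERT INTO first_table VALUES
    ('A', 'TX', 2, 15, 100), ('A', 'TX', 16, 30, 200), ('A', 'TX', 31, NULL, 500);
INSERT INTO another_table VALUES ('A', 'TX', 14), ('A', 'TX', 40);
""")

# Join on type/state plus the weight band; a NULL max_w means "no upper limit".
query = """
SELECT a.type, a.state, a.weight, b.value
FROM another_table a
JOIN first_table b
  ON a.type = b.type
 AND a.state = b.state
 AND a.weight >= b.min_w
 AND (b.max_w IS NULL OR a.weight <= b.max_w)
ORDER BY a.weight
"""
matched = conn.execute(query).fetchall()
print(matched)  # [('A', 'TX', 14, 100), ('A', 'TX', 40, 500)]
```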

Conditional Select in SAS

I am trying to create a table in SAS, which is a subset of a larger table. I am using the following chart as an example. As you can see, columnA has 501 and 502 repeated twice. What I want is to select the row with the max number in ColumnB. The second chart is the result that I would like to have.
Chart 1
A B C
501 1 O
502 1 K
503 1 V
501 2 Y
502 2 U
504 1 I
Chart 2
A B C
503 1 V
501 2 Y
502 2 U
504 1 I
What I am thinking right now is:
PROC SQL;
CREATE TABLE CHART2 AS
SELECT
C.COLUMNA,
C.COLUMNC
FROM CHART1 C;
QUIT;
I am not sure how to express that, when there are duplicate values in columnA, only the row where columnB has the max number should be selected. The formatting of the table is a little bit weird. I hope you get my point.
One option is to use the having clause in proc sql. Think of it as a filter that gets applied after any groupings have been done.
proc sql noprint;
create table want as
select *
from sashelp.class
group by sex
having age = max(age)
;
quit;
In the above code, we are keeping the rows where the age value on the row is equal to the maximum age (max(age)) for that sex (as we are grouping by sex).
You will notice in the results that for Females we get two rows returned because there were two records that had an age equal to the max female age, but only one row for Males.
Without more details about your data I can't be certain that this will exactly fit your needs but it may.
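
The HAVING approach relies on PROC SQL merging the group maximum back onto every row. Outside SAS, a correlated subquery is one way to get the same effect. Here is a sketch with Python's sqlite3 module applied to Chart 1 from the question; the table and column names `chart1`, `a`, `b`, `c` are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chart1 (a INTEGER, b INTEGER, c TEXT)")
# Chart 1 from the question.
conn.executemany(
    "INSERT INTO chart1 VALUES (?, ?, ?)",
    [(501, 1, "O"), (502, 1, "K"), (503, 1, "V"),
     (501, 2, "Y"), (502, 2, "U"), (504, 1, "I")],
)

# Keep each row whose b equals the maximum b among rows with the same a.
query = """
SELECT a, b, c
FROM chart1 AS t
WHERE b = (SELECT MAX(b) FROM chart1 AS x WHERE x.a = t.a)
ORDER BY a
"""
chart2 = conn.execute(query).fetchall()
print(chart2)  # [(501, 2, 'Y'), (502, 2, 'U'), (503, 1, 'V'), (504, 1, 'I')]
```

As with the HAVING version, ties are kept: if two rows shared the maximum b for the same a, both would be returned.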
You can try this:
PROC SORT data = Chart1;
by A descending B;
RUN;
DATA Chart2;
set Chart1;
by A;
if first.A then output;
RUN;
The first step sorts your data by ascending order of A and then by descending order of B. The second step keeps only the first row for each value of A.
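
The sort-then-keep-first logic is not specific to SAS. As a rough illustration, the same two steps in plain Python look like this, with the rows of Chart 1 typed in as tuples (the variable names are placeholders).

```python
from itertools import groupby

# Chart 1 from the question as (A, B, C) tuples.
chart1 = [
    (501, 1, "O"), (502, 1, "K"), (503, 1, "V"),
    (501, 2, "Y"), (502, 2, "U"), (504, 1, "I"),
]

# Step 1: sort by A ascending, then B descending (like PROC SORT with "by A descending B").
ordered = sorted(chart1, key=lambda row: (row[0], -row[1]))

# Step 2: keep only the first row of each A group (like "if first.A then output").
chart2 = [next(group) for _, group in groupby(ordered, key=lambda row: row[0])]
print(chart2)  # [(501, 2, 'Y'), (502, 2, 'U'), (503, 1, 'V'), (504, 1, 'I')]
```

Unlike the HAVING approach, this keeps exactly one row per value of A even when several rows tie for the maximum B.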

Merging cells within the same table in SAS or Proc Sql

I have a table with a customer column followed by multiple columns (one per shop), each holding a flag that indicates whether the customer has visited that shop; if they haven't, the cell is null. The shops are listed in order of importance, with Shop1 being the highest, then Shop2, Shop3, and so on. The flag is the shop number itself. So, for example, if a customer hasn't visited Shop1, that cell will be blank, but if they have visited Shop2, that cell will be '2'.
I need to merge the columns together to create a table that holds, for each customer, the top 4 shops they have visited. For example, the entry for a customer could read first column '2', second column '5', third column '7', fourth column '8', because they haven't visited shops 1, 3, 4, or 6. Could someone please help? Thank you.
I think this is what you are looking for. Take the input data (what I assume you mean), perform a transpose, filter the missing (null) values, and transpose again.
data input;
input customer $ shop01 shop02 shop03 shop04;
datalines;
Bill 1 2 . .
Ted . 2 3 .
;
proc sort data=input;
by customer;
run;
proc transpose data=input out=temp(where=(col1 ^= .) drop=_name_);
by customer;
run;
proc transpose data=temp out=output(drop=_name_);
by customer;
run;
This gives me:
Bill 1 2
Ted 2 3
Given how your data is laid out, you could use the smallest function with a variable list to return the first smallest non-missing value in your shop variables, the second smallest, and so on.
data test(drop=shop01-shop08);
set input;
first = smallest(1,of shop01-shop08);
second = smallest(2,of shop01-shop08);
third = smallest(3,of shop01-shop08);
fourth = smallest(4,of shop01-shop08);
run;
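
To show what the smallest-based data step computes, here is a small Python sketch. The helper `smallest` is a hypothetical stand-in written for this illustration, not a SAS or library API; it ignores missing values (None) and returns None when fewer than k values are present.

```python
def smallest(k, values):
    """Return the k-th smallest non-missing value (k starts at 1), or None if there is none."""
    present = sorted(v for v in values if v is not None)
    return present[k - 1] if k <= len(present) else None

# Input rows from the transpose example; None marks a shop that was not visited.
customers = {
    "Bill": [1, 2, None, None],
    "Ted": [None, 2, 3, None],
}

# first, second, third, fourth for each customer.
top_shops = {
    name: [smallest(k, shops) for k in range(1, 5)]
    for name, shops in customers.items()
}
print(top_shops)  # {'Bill': [1, 2, None, None], 'Ted': [2, 3, None, None]}
```

Because the flag in each cell is the shop number and lower numbers are more important, the k-th smallest value is also the k-th most important shop the customer visited.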