Grouping data and keeping only distinct values in SQL - sql

Is it possible to group the following data in PostgreSQL:
(TL;DR: Note the similar target entries for the two print_names qz.M2 and qz.M1)
print_name | target
qz.R | q3zA
qz.S | NULL
qz.M1 | q2zA
qz.M1 | q1zA
qz.M2 | q2zA
qz.M2 | q1zA
in such a way that the distinct values of target are still in the result while the doubling of qz.M* is avoided.
The result desired would therefore be:
print_name | target
qz.R | q3zA
qz.S | NULL
qz.M1 | q2zA
qz.M2 | q1zA
I tried:
SELECT min(target) FROM Table GROUP BY print_name;
However, this of course only yields one of two entries in target.
Thank you for your help!

I don't think this is achievable without special-casing specific print_name values if you want a consistent answer.
SELECT t.print_name,
CASE
WHEN t.print_name = 'qz.M1' THEN max(t.target)
WHEN t.print_name = 'qz.M2' THEN min(t.target)
ELSE max(t.target) END as Target -- every branch must be an aggregate when grouping
FROM Table t
GROUP BY t.print_name

Your desired results would seem to indicate just a simple aggregate:
select print_name, Max(target) target
from t
group by print_name
Note that your sample data does not include any column that defines a reliable order, so max() will be based on string ordering. On the sample rows this returns q2zA for both qz.M1 and qz.M2.
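To compare the two suggestions on the sample rows, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name t and the rule "MIN for qz.M2, MAX for everything else" are assumptions taken from the answers above, not a general solution.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (print_name TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("qz.R", "q3zA"),
        ("qz.S", None),
        ("qz.M1", "q2zA"),
        ("qz.M1", "q1zA"),
        ("qz.M2", "q2zA"),
        ("qz.M2", "q1zA"),
    ],
)

# Plain aggregate: both qz.M* groups collapse to the same target.
plain_max = conn.execute(
    "SELECT print_name, MAX(target) FROM t "
    "GROUP BY print_name ORDER BY print_name"
).fetchall()

# Special-cased aggregate: qz.M2 takes MIN, every other name takes MAX.
per_name = conn.execute(
    """
    SELECT print_name,
           CASE WHEN print_name = 'qz.M2' THEN MIN(target)
                ELSE MAX(target) END AS target
    FROM t
    GROUP BY print_name
    ORDER BY print_name
    """
).fetchall()

print(plain_max)
print(per_name)
```

Only the special-cased version reproduces the desired result, which is why a consistent answer needs either hard-coded names or an extra column that defines an order.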

Related

Eliminating Entries Based On Revision

I need to figure out how to eliminate older revisions from my query's results. My database stores orders as 'Q000000', and revisions have an appended '-number'. My query currently is as follows:
SELECT DISTINCT Estimate.EstimateNo
FROM Estimate
INNER JOIN EstimateDetails ON EstimateDetails.EstimateID = Estimate.EstimateID
INNER JOIN EstimateDoorList ON EstimateDoorList.ItemSpecID = EstimateDetails.ItemSpecID
WHERE (Estimate.SalesRepID = '67' OR Estimate.SalesRepID = '61') AND Estimate.EntryDate >= '2017-01-01 00:00:00.000' AND EstimateDoorList.SlabSpecies LIKE '%MDF%'
ORDER BY Estimate.EstimateNo
So for instance, the results would include:
Q120455-10
Q120455-11
Q121675-2
Q122361-1
Q123456
Q123456-1
From this, I need to eliminate 'Q120455-10' because of the presence of '-11' for that order, and 'Q123456' because of the presence of the '-1' revision. I'm struggling to figure out how to do this. My immediate thought was to use CASE statements, but I'm not sure how best to implement them or how to filter. Thank you in advance; let me know if any more information is needed.
First you have to parse your EstimateNo column into a sequence number and a revision number using CHARINDEX and SUBSTRING (or STRING_SPLIT in newer versions), and CAST/CONVERT the revision to a numeric type:
SELECT
-- Estimates without a '-' suffix (e.g. 'Q123456') are treated as revision 0
CASE WHEN CHARINDEX('-',Estimate.EstimateNo) > 0 THEN LEFT(Estimate.EstimateNo,CHARINDEX('-',Estimate.EstimateNo)-1) ELSE Estimate.EstimateNo END as [EstimateNo],
CASE WHEN CHARINDEX('-',Estimate.EstimateNo) > 0 THEN CAST(SUBSTRING(Estimate.EstimateNo,CHARINDEX('-',Estimate.EstimateNo)+1,LEN(Estimate.EstimateNo)) as INT) ELSE 0 END as [EstimateRevision]
FROM
...
You can then use
APPLY - to select TOP 1 row that matches the EstimateNo or
Window function such as ROW_NUMBER to select only records with row number of 1
For example, using a ROW_NUMBER would look something like below:
SELECT
ROW_NUMBER() OVER(PARTITION BY EstimateNo ORDER BY EstimateRevision DESC) AS "LastRevisionForEstimate",
-- rest of the needed columns
FROM
(
-- query above goes here
) AS parsed
You can then wrap the query above in a simple select with a WHERE predicate that keeps only the rows where LastRevisionForEstimate is 1, for instance:
SELECT --needed columns
FROM -- result set above
WHERE LastRevisionForEstimate = 1
Please note that this is, to a certain extent, pseudocode, as I do not have your schema and cannot test the query.
If you dislike the nested selects, check out Common Table Expressions.
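The parse-then-rank approach can be tried end to end with the sketch below. It uses Python's sqlite3 module as a stand-in for SQL Server, so instr/substr replace CHARINDEX/SUBSTRING, and it needs an SQLite build with window functions (3.25 or newer). The single-column Estimate table, the names BaseNo, Revision and rn, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Estimate (EstimateNo TEXT)")
conn.executemany(
    "INSERT INTO Estimate VALUES (?)",
    [(n,) for n in [
        "Q120455-10", "Q120455-11", "Q121675-2",
        "Q122361-1", "Q123456", "Q123456-1",
    ]],
)

query = """
WITH parsed AS (
    -- Split 'Q123456-1' into base 'Q123456' and revision 1;
    -- a number without '-' counts as revision 0.
    SELECT EstimateNo,
           CASE WHEN instr(EstimateNo, '-') > 0
                THEN substr(EstimateNo, 1, instr(EstimateNo, '-') - 1)
                ELSE EstimateNo END AS BaseNo,
           CASE WHEN instr(EstimateNo, '-') > 0
                THEN CAST(substr(EstimateNo, instr(EstimateNo, '-') + 1) AS INTEGER)
                ELSE 0 END AS Revision
    FROM Estimate
),
ranked AS (
    -- Row number 1 is the highest revision within each base number.
    SELECT EstimateNo,
           ROW_NUMBER() OVER (PARTITION BY BaseNo ORDER BY Revision DESC) AS rn
    FROM parsed
)
SELECT EstimateNo FROM ranked WHERE rn = 1 ORDER BY EstimateNo
"""
latest = [row[0] for row in conn.execute(query)]
print(latest)
```

Casting the revision to an integer matters here: compared as text, '-2' would sort after '-11'.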

Access Count of Rows Where Field is not Null

I need to count rows where the value of Master.[Date BP] is not Null - any ideas how I would do this?
I tried this but it doesn't seem to work.
SELECT Master.[Date BP], Count(Master.[Date BP]) AS CountOfField,
FROM Master
GROUP BY Master.[Date BP];
SELECT Count([Date BP]) AS CountOfField
FROM Master
Using count is the right idea. You just need to remove the group by clause, as you want a single answer. Additionally, you have a redundant comma at the end of the select list:
SELECT Count(Master.[Date BP]) AS CountOfField
FROM Master
The above answers are correct. In case you run into similar problems with other aggregate functions: aggregate functions in SQL ignore NULL values, with the exception of COUNT(*), which counts every row.
Reference: https://msdn.microsoft.com/en-us/library/ms173454.aspx
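As a quick check of the NULL behaviour described above, here is a small sketch that uses Python's sqlite3 module instead of Access. The four sample rows are made up for illustration, and the column is quoted with double quotes rather than Access-style brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Master ("Date BP" TEXT)')
conn.executemany(
    "INSERT INTO Master VALUES (?)",
    [("2017-01-05",), (None,), ("2017-02-11",), (None,)],
)

# COUNT(column) skips NULLs; COUNT(*) counts every row.
non_null, all_rows = conn.execute(
    'SELECT COUNT("Date BP"), COUNT(*) FROM Master'
).fetchone()
print(non_null, all_rows)
```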

How to select all data from table but only display date-specific rows within DATE-data type column, in Oracle SQL?

I'm having trouble writing a query that returns all columns of a table but only the rows whose DATE-typed "enroll_date" column contains '30-Jan-07'. The closest I have come is the query below, but it displays no data, and only the one column rather than the entire table, which leads me to believe that this is not just an issue with approach but perhaps a formatting issue as well.
SELECT TO_DATE(enroll_date, 'DD-MM-YY')
FROM student.enrollment
WHERE enroll_date= '30-Jan-07';
Again, I need to display all columns, but only the rows specific to the date '30-Jan-07'. I'm sure a nested solution is ideal and somehow the right solution, but unfortunately my chops aren't there yet. I'm working on it! :D
UPDATE
Please see attached screenshot of output. The query/solution should retrieve all columns and rows enclosed within the red-rectangle mark-up-thank you!
One possible problem is that the date column has a time component (Oracle's DATE type always stores a time of day, and the default display format often hides it). One method is to use trunc():
SELECT e.*
FROM student.enrollment e
WHERE TRUNC(e.enroll_date) = DATE '2007-01-30';
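The effect of a hidden time component can be reproduced with the sketch below. It uses Python's sqlite3 module as a stand-in for Oracle, so timestamps are stored as ISO text and SQLite's date() plays the role of TRUNC(). The table layout and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student_id INTEGER, enroll_date TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [
        (101, "2007-01-30 10:18:00"),
        (102, "2007-01-30 00:00:00"),
        (103, "2007-02-02 09:00:00"),
    ],
)

def ids(where_clause):
    """Return the student ids matching the given WHERE clause."""
    sql = f"SELECT student_id FROM enrollment WHERE {where_clause} ORDER BY student_id"
    return [row[0] for row in conn.execute(sql)]

# Equality against midnight misses rows that carry a time of day.
exact = ids("enroll_date = '2007-01-30 00:00:00'")

# Truncating to the day (TRUNC() in Oracle) matches the whole day.
truncated = ids("date(enroll_date) = '2007-01-30'")

# A half-open range matches the same rows without wrapping the column.
ranged = ids("enroll_date >= '2007-01-30 00:00:00' AND enroll_date < '2007-01-31 00:00:00'")

print(exact, truncated, ranged)
```

In Oracle the range form would read enroll_date >= DATE '2007-01-30' AND enroll_date < DATE '2007-01-31'. It is worth considering when enroll_date is indexed, since wrapping the column in TRUNC() can keep a plain index from being used.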
You can specify whichever columns you want in the following query:
SELECT col1, col2, col3, ...
FROM student.enrollment
WHERE TO_CHAR(enroll_date, 'DD-MON-YY') = '30-JAN-07';

Oracle Group by issue

I have the query below. The problem is that the subquery for the last column, productdescr, returns two distinct records, so the query fails. I need to add one more condition to the WHERE clause of that subquery so that it returns one record. The issue is that the column I need to add should not be part of the GROUP BY clause.
SELECT product_billing_id,
billing_ele,
SUM(round(summary_net_amt_excl_gst/100)) gross,
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele) productdescr
FROM bil.bill_sum aa
WHERE file_id = 38613 --1=1
AND line_type = 'D'
AND (product_billing_id, billing_ele) IN (SELECT DISTINCT
product_billing_id,
billing_ele
FROM bil.bill_l2 )
AND trans_type_desc <> 'Change'
GROUP BY product_billing_id, billing_ele
I want to modify the subquery as shown below, adding a new filter to its WHERE clause so that it returns one record:
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele
AND (rate_structure_start_date <= TO_DATE(aa.p_effective_date,'yyyymmdd')
AND rate_structure_end_date > TO_DATE(aa.p_effective_date,'yyyymmdd'))
) productdescr
The aa.p_effective_date should not be a part of GROUP BY clause. How can I do it? Oracle is the Database.
So there are multiple RES.tariff_nt records for a given product_billing_id/billing_ele, differentiated by the start/end dates.
You want the description for the record that encompasses the 'p_effective_date' from bil.bill_sum. The kicker is that you can't (or don't want to) include that in the GROUP BY. That suggests you've got multiple rows in bil.bill_sum with different effective dates.
The issue is what you want to happen when you summarise those multiple rows with different dates: which of those dates should be used to look up the description?
If it doesn't matter, simply use MIN(aa.p_effective_date), or MAX.
Have you looked into the Oracle analytical functions? This is a good link: Analytical Functions by Example
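One way to apply the MIN/MAX suggestion is to aggregate first and look up the description afterwards, as in the sketch below. It runs on Python's sqlite3 module rather than Oracle, drops the schema prefixes, stores dates as 'yyyymmdd' text so they compare correctly as strings, and uses an invented amount column and sample rows. All of these are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bill_sum (product_billing_id TEXT, billing_ele TEXT,
                       amount INTEGER, p_effective_date TEXT);
CREATE TABLE tariff_nt (product_billing_id TEXT, billing_ele TEXT, description TEXT,
                        rate_structure_start_date TEXT, rate_structure_end_date TEXT);
INSERT INTO bill_sum VALUES ('P1', 'E1', 100, '20170110'), ('P1', 'E1', 250, '20170305');
INSERT INTO tariff_nt VALUES ('P1', 'E1', 'Old plan', '20160101', '20170201'),
                             ('P1', 'E1', 'New plan', '20170201', '20990101');
""")

query = """
WITH summed AS (
    -- Aggregate first; MAX picks one effective date per group,
    -- so the date never has to appear in GROUP BY.
    SELECT product_billing_id, billing_ele,
           SUM(amount) AS gross,
           MAX(p_effective_date) AS eff_date
    FROM bill_sum
    GROUP BY product_billing_id, billing_ele
)
SELECT s.product_billing_id, s.billing_ele, s.gross, t.description
FROM summed s
LEFT JOIN tariff_nt t
       ON t.product_billing_id = s.product_billing_id
      AND t.billing_ele = s.billing_ele
      AND t.rate_structure_start_date <= s.eff_date
      AND t.rate_structure_end_date > s.eff_date
"""
result = conn.execute(query).fetchall()
print(result)
```

Swapping MAX for MIN would select 'Old plan' on this data, so the choice is a business decision rather than a technical one.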

Group by SQL statement

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, I would like to have the corresponding medication_dose also. So I type this up
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But it gives me an error saying:
"Column 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?
Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
JOIN (SELECT y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
FROM BIOLOGICAL y
WHERE y.patient_id = ? -- same patient as the outer filter, so max_date is per patient
GROUP BY y.medication_name) x ON x.medication_name = b.medication_name
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
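The join-to-latest-date pattern can be checked with the sketch below, which runs on Python's sqlite3 module rather than SQL Server. The sample rows and the text dose values are made up. The second patient is included to show why the inner query should be restricted to the same patient as the outer one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE biological (patient_id INTEGER, medication_name TEXT, "
    "medication_dose TEXT, patient_history_date_bio TEXT)"
)
conn.executemany(
    "INSERT INTO biological VALUES (?, ?, ?, ?)",
    [
        (12, "drug_a", "10 mg", "2010-01-05"),
        (12, "drug_a", "20 mg", "2010-03-01"),
        (12, "drug_b", "5 mg", "2010-02-10"),
        (13, "drug_a", "40 mg", "2010-06-01"),  # another patient, later date
    ],
)

query = """
SELECT b.medication_name,
       b.patient_history_date_bio AS med_date,
       b.medication_dose
FROM biological b
JOIN (SELECT y.medication_name,
             MAX(y.patient_history_date_bio) AS max_date
      FROM biological y
      WHERE y.patient_id = :pid          -- latest date for this patient only
      GROUP BY y.medication_name) x
  ON x.medication_name = b.medication_name
 AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = :pid
ORDER BY b.medication_name
"""
latest = conn.execute(query, {"pid": 12}).fetchall()
print(latest)
```

Without the inner patient filter, drug_a's maximum date would come from patient 13 and patient 12's drug_a row would drop out of the result.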
If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However, note that this is normally an indication that you are either building the query incorrectly or need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should be the one suggested by OMG Ponies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?
You need to put max(medication_dose) in your select. Group by returns a result set that contains distinct values for fields in your group by clause, so apparently you have multiple records that have the same medication_name, but different doses, so you are getting two results.
By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.)