SQL MS-Access Select Distinct for multiple columns - sql

Sorry for asking about this topic again, but I haven't been able to derive a solution to my problem from the existing answers.
I have one table ("Data") from which I need to pull three columns ("PID", "Manager", "Customer"),
and only "PID" has to be distinct. I don't care which records are pulled for the other columns ("Manager" / "Customer"); it could be the first entry or whatever.
SELECT Distinct PID, Manager, Customer
FROM Data;
This gives me all the rows where the combination of PID, Manager and Customer is distinct, so if there are two entries with the same PID but a different Manager, I get two records instead of one.
Thank you very much.

You can do this
Hope you will find this helpful
SELECT PID, max(Manager), max(Customer)
FROM Data
group by PID
Or
SELECT PID, min(Manager), min(Customer)
FROM Data
group by PID
EDIT
Here is an example to explain the MAX and MIN functions.
Here is the sample table:
CREATE TABLE data(
PID int ,
Manager varchar(20) ,
Customer varchar(20)
) ;
insert into data
values
(1,'a','b'),
(1,'c','d'),
(3,'1','e'),
(3,'5','e'),
(3,'3','e');
Now, these are the three queries that return the respective outputs:
select * from data;
SELECT PID, max(Manager), max(Customer)
FROM Data
group by PID;
SELECT PID, min(Manager), min(Customer)
FROM Data
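group by PID;
Output for the above queries:
select * from data returns all five rows.
The MAX query returns:
PID|max(Manager)|max(Customer)
==============================
1  |c           |d
3  |5           |e
The MIN query returns:
PID|min(Manager)|min(Customer)
==============================
1  |a           |b
3  |1           |e
Explanation:
MAX:
MAX returns 'c' and '5' for Manager because 'c' is greater than 'a', and likewise '5' is greater than '1' and '3'.
The MIN function is the exact opposite of MAX and is self-explanatory.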
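One caveat worth checking: Manager is declared as varchar, so MAX and MIN compare its values as strings, not as numbers. The sketch below uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for the original engine. The table name `demo` and its rows are hypothetical, chosen so that string order and numeric order disagree.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the `demo` table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (PID INTEGER, Manager TEXT)")
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(3, "10"), (3, "5"), (3, "3")])

# Text comparison: '5' sorts after '10' because strings compare character by character.
text_max = conn.execute("SELECT MAX(Manager) FROM demo WHERE PID = 3").fetchone()[0]

# Numeric comparison after an explicit cast.
numeric_max = conn.execute(
    "SELECT MAX(CAST(Manager AS INTEGER)) FROM demo WHERE PID = 3"
).fetchone()[0]

print(text_max)     # 5
print(numeric_max)  # 10
```

With the single-character sample values used in the answer, both comparisons happen to give the same order.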

SELECT "PID", max("Manager"), max("Customer")
FROM "Data"
GROUP BY "PID";
This query returns each unique "PID" together with the max values of "Manager" and "Customer" for that "PID".
DISTINCT applies to all the columns in the select list, so you need GROUP BY plus an aggregate function (which returns one value for several rows).
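The difference between the two approaches can be sketched as follows, using Python's sqlite3 module with an in-memory SQLite database as a stand-in for Access. The rows are hypothetical. Note that each aggregate is computed independently, so the Manager and Customer returned for a PID may come from different source rows, which is acceptable here only because the asker does not care which values are picked.

```python
import sqlite3

# In-memory SQLite stand-in for the "Data" table; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (PID INTEGER, Manager TEXT, Customer TEXT)")
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?)",
    [(1, "Ann", "Zeta"), (1, "Bob", "Acme"), (2, "Cid", "Bolt")],
)

# DISTINCT looks at the whole (PID, Manager, Customer) combination.
distinct_rows = conn.execute(
    "SELECT DISTINCT PID, Manager, Customer FROM Data ORDER BY PID, Manager"
).fetchall()

# GROUP BY collapses to one row per PID; each MAX is evaluated on its own column.
grouped_rows = conn.execute(
    "SELECT PID, MAX(Manager), MAX(Customer) FROM Data GROUP BY PID ORDER BY PID"
).fetchall()

print(len(distinct_rows))  # 3
print(grouped_rows)        # [(1, 'Bob', 'Zeta'), (2, 'Cid', 'Bolt')]
```

The pair ('Bob', 'Zeta') for PID 1 does not exist as a row in the table: 'Bob' comes from one row and 'Zeta' from the other.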

Related

Group by question in SQL Server, migration from MySQL

I failed to find a solution to my problem and would love your help.
(The post has been edited to contain only one question.)
The goal is to group by one column while selecting multiple columns.
In MySQL you can simply group by whatever you want and it will still select all the columns. For example, say I want to select the newest 100 transactions, grouped by Email (only the last transaction of each email).
In MySQL I would do this:
SELECT * FROM db.transactionlog
group by Email
order by TransactionLogId desc
LIMIT 100;
In SQL Server it's not possible. Googling a bit suggested wrapping each column I want in an aggregate as a hack. Couldn't that cause a mix of values (mixing columns between the grouped rows)?
For example:
SELECT TOP(100)
Email,
MAX(ResultCode) as 'ResultCode',
MAX(Amount) as 'Amount',
MAX(TransactionLogId) as 'TransactionLogId'
FROM [db].[dbo].[transactionlog]
group by Email
order by TransactionLogId desc
TransactionLogId is the primary key, which is an identity column; I order by it to get the last inserted rows.
I just want to know whether the ResultCode and Amount I get from such a query will be those of the last inserted row, and not the highest of the grouped rows or whatever.
~Edit~
Sample data -
row1:
Email: test#email.com
ResultCode: 100
Amount: 27
TransactionLogId: 1
row2:
Email: test#email.com
ResultCode: 50
Amount: 10
TransactionLogId: 2
Using the sample data above, my goal is to get the row details of TransactionLogId = 2,
but what actually happens is that I get mixed values of the two: I do get TransactionLogId = 2, but with the ResultCode and Amount of the first row.
How do I avoid that?
Thanks.
First find the latest transaction log ID for each email, then join back against the same table to retrieve the full record:
;WITH MaxTransactionByEmail AS
(
SELECT
Email,
MAX(TransactionLogId) as LatestTransactionLogId
FROM
[db].[dbo].[transactionlog]
group by
Email
)
SELECT
T.*
FROM
[db].[dbo].[transactionlog] AS T
INNER JOIN MaxTransactionByEmail AS M ON T.TransactionLogId = M.LatestTransactionLogId
You are currently getting mixed results because each aggregate function such as MAX() considers all rows that share a particular value of Email, independently of the other columns. So the MAX() of the Amount column over the values 10 and 27 is 27, even though that row has the lower transaction log ID.
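The mixing described in the question, and the join-back fix, can be reproduced with the two sample rows. This is a minimal sketch using Python's sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server, so the `[db].[dbo]` prefix and the CTE are replaced by a plain table name and a derived table.

```python
import sqlite3

# In-memory SQLite stand-in for the transactionlog table, loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionlog ("
    "TransactionLogId INTEGER PRIMARY KEY, Email TEXT, ResultCode INTEGER, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transactionlog VALUES (?, ?, ?, ?)",
    [(1, "test#email.com", 100, 27), (2, "test#email.com", 50, 10)],
)

# One MAX per column: every column is maximised on its own, so rows get mixed.
mixed = conn.execute(
    "SELECT Email, MAX(ResultCode), MAX(Amount), MAX(TransactionLogId) "
    "FROM transactionlog GROUP BY Email"
).fetchall()

# Find the latest id per email first, then join back to fetch that whole row.
latest = conn.execute(
    "SELECT T.Email, T.ResultCode, T.Amount, T.TransactionLogId "
    "FROM transactionlog AS T "
    "INNER JOIN (SELECT Email, MAX(TransactionLogId) AS LatestTransactionLogId "
    "            FROM transactionlog GROUP BY Email) AS M "
    "ON T.TransactionLogId = M.LatestTransactionLogId"
).fetchall()

print(mixed)   # [('test#email.com', 100, 27, 2)]
print(latest)  # [('test#email.com', 50, 10, 2)]
```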
Another solution is to use the ROW_NUMBER() window function to rank the rows within each Email, then pick only the first row of each ranking:
;WITH TransactionsRanking AS
(
SELECT
T.*,
MostRecentTransactionLogRanking = ROW_NUMBER() OVER (
PARTITION BY
T.Email -- Start a different ranking for each different value of Email
ORDER BY
T.TransactionLogId DESC) -- Order the rows by the TransactionLogID descending
FROM
[db].[dbo].[transactionlog] AS T
)
SELECT
T.*
FROM
TransactionsRanking AS T
WHERE
T.MostRecentTransactionLogRanking = 1
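A runnable sketch of the ROW_NUMBER() approach, again with Python's sqlite3 module as a stand-in for SQL Server. It assumes a SQLite build with window function support (3.25 or later). The rows are hypothetical, and LIMIT plays the role of SQL Server's TOP(100) to cover the "newest 100" part of the question.

```python
import sqlite3

# In-memory SQLite stand-in; the rows and e-mail addresses are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionlog ("
    "TransactionLogId INTEGER PRIMARY KEY, Email TEXT, ResultCode INTEGER, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transactionlog VALUES (?, ?, ?, ?)",
    [
        (1, "a@example.com", 100, 27),
        (2, "a@example.com", 50, 10),
        (3, "b@example.com", 0, 5),
    ],
)

query = """
WITH TransactionsRanking AS (
    SELECT T.*,
           ROW_NUMBER() OVER (
               PARTITION BY T.Email              -- restart the ranking for each Email
               ORDER BY T.TransactionLogId DESC  -- newest row gets rank 1
           ) AS MostRecentTransactionLogRanking
    FROM transactionlog AS T
)
SELECT TransactionLogId, Email, ResultCode, Amount
FROM TransactionsRanking
WHERE MostRecentTransactionLogRanking = 1
ORDER BY TransactionLogId DESC
LIMIT 100  -- SQLite counterpart of TOP(100)
"""
newest_per_email = conn.execute(query).fetchall()
print(newest_per_email)  # [(3, 'b@example.com', 0, 5), (2, 'a@example.com', 50, 10)]
```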

Multiplying fields from separate columns which have the same ID in SQL?

I have two tables which are joined by an ID...
table 1
- Assessment ID
- Module ID
- Assessment Weighting
table 2
- ID
- AssessmentID
- ModuleID
- UserID
- MarkFrom100
An assessment can have many students taking the assessment.
For example
A module has two assessments, one worth 60% and the other worth 40%. For each row in table 2, I want to take the weighting value from table 1 and multiply it by the mark out of 100.
SELECT * FROM Assessment, ModuleAssessmentUser WHERE
INNER JOIN moduleassementuser.assessmentID on Assessment.assessmentID
MULTIPLY AssessmentWeighting BY MarkFrom100 AS finalmark
UserID = 1
I know this is way off, but I really don't know how else to go about it.
My SQL knowledge is limited, so any help is appreciated!
You can use the SUM function in a subquery to total the marks of each group, which then lets you multiply that total by the weighting.
Subquery:
SELECT ModuleID, AssessmentID, UserID, SUM(MarkFrom100) as Total
FROM Table_2
GROUP BY ModuleID, AssessmentID, UserID
Then use this subquery as a table in the main query:
SELECT T1.Assessment_ID, T1.ModuleID, Q1.UserID, (Q1.Total * T1.Assessment_Weighting) as FinalMark
FROM (SELECT ModuleID, AssessmentID, UserID, SUM(MarkFrom100) as Total
FROM Table_2
GROUP BY ModuleID, AssessmentID, UserID) AS Q1
INNER JOIN Table_1 as T1 on T1.ModuleID = Q1.ModuleID AND T1.Assessment_ID = Q1.AssessmentID
-- WHERE T1.ModuleID = 2 -- a particular module ID
Note that the WHERE clause is commented out. Leave it that way if you want all the data; uncomment it if you want a particular module.
NOTE:
I don't have your database, so it may need some tweaks, but the main idea is there.
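As an alternative sketch, the weighting can be applied row by row with a plain join, and the weighted marks summed per user and module. The example uses Python's sqlite3 module with an in-memory database. The table and column names are taken from the asker's attempted query, and storing the weighting as an integer percentage (60, 40) is an assumption, as are the marks themselves.

```python
import sqlite3

# In-memory SQLite stand-in. Assumption: AssessmentWeighting is an integer percentage.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Assessment (AssessmentID INTEGER, ModuleID INTEGER, AssessmentWeighting INTEGER)"
)
conn.execute(
    "CREATE TABLE ModuleAssessmentUser ("
    "ID INTEGER, AssessmentID INTEGER, ModuleID INTEGER, UserID INTEGER, MarkFrom100 INTEGER)"
)
conn.executemany("INSERT INTO Assessment VALUES (?, ?, ?)", [(1, 1, 60), (2, 1, 40)])
conn.executemany(
    "INSERT INTO ModuleAssessmentUser VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 70), (2, 2, 1, 1, 50)],  # hypothetical marks for user 1
)

# Weighted mark for each assessment the user took.
per_assessment = conn.execute(
    "SELECT u.AssessmentID, u.MarkFrom100 * a.AssessmentWeighting / 100.0 AS WeightedMark "
    "FROM ModuleAssessmentUser AS u "
    "INNER JOIN Assessment AS a ON a.AssessmentID = u.AssessmentID "
    "WHERE u.UserID = 1 ORDER BY u.AssessmentID"
).fetchall()

# Final module mark: the sum of the weighted marks.
final_marks = conn.execute(
    "SELECT u.UserID, u.ModuleID, SUM(u.MarkFrom100 * a.AssessmentWeighting) / 100.0 AS FinalMark "
    "FROM ModuleAssessmentUser AS u "
    "INNER JOIN Assessment AS a ON a.AssessmentID = u.AssessmentID "
    "GROUP BY u.UserID, u.ModuleID"
).fetchall()

print(per_assessment)  # [(1, 42.0), (2, 20.0)]
print(final_marks)     # [(1, 1, 62.0)]
```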

Get unique records from table avoiding all duplicates based on two key columns

I have a table Trial_tb with columns p_id,t_number and rundate.
Sample values:
p_id|t_number|rundate
=====================
111 |333     |1/7/2016
111 |333     |1/1/2016
222 |888     |1/8/2016
222 |444     |1/2/2016
666 |888     |1/6/2016
555 |777     |1/5/2016
p_id and t_number are key columns. I need to fetch values such that the result does not contain any record whose p_id/t_number combination is duplicated. For example, 111|333 is duplicated and hence not valid. The query should fetch all records other than the first two.
I wrote the script below, but it fetches only the last record. :(
select rundate,p_id,t_number from
(
select rundate,p_id,t_number,
count(p_id) over (partition by p_id) PCnt,
count(t_number) over (partition by t_number) TCnt
from trialtb
)a
where a.PCnt=1 and a.TCnt=1
The HAVING clause is ideal for this job. HAVING allows you to filter on aggregated records.
-- Finding unique combinations.
SELECT
p_id,
t_number
FROM
trialtb
GROUP BY
p_id,
t_number
HAVING
COUNT(*) = 1
;
This query returns combinations of p_id and t_number that occur only once.
If you want to include rundate you could add MAX(rundate) AS rundate to the select clause. Because you are only looking at unique occurrences the max or min would always be the same.
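Run against the question's sample rows, the HAVING query can be sketched like this, using Python's sqlite3 module and an in-memory SQLite database as a stand-in, with rundate stored as plain text. The asker's own attempt counted p_id and t_number in separate partitions, so any row sharing either value with another row was dropped, which left only 555|777.

```python
import sqlite3

# In-memory SQLite stand-in for trialtb, loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trialtb (p_id INTEGER, t_number INTEGER, rundate TEXT)")
conn.executemany(
    "INSERT INTO trialtb VALUES (?, ?, ?)",
    [
        (111, 333, "1/7/2016"),
        (111, 333, "1/1/2016"),
        (222, 888, "1/8/2016"),
        (222, 444, "1/2/2016"),
        (666, 888, "1/6/2016"),
        (555, 777, "1/5/2016"),
    ],
)

# Keep only the (p_id, t_number) combinations that occur exactly once.
unique_rows = conn.execute(
    "SELECT p_id, t_number, MAX(rundate) AS rundate "
    "FROM trialtb "
    "GROUP BY p_id, t_number "
    "HAVING COUNT(*) = 1 "
    "ORDER BY p_id, t_number"
).fetchall()

for row in unique_rows:
    print(row)
# (222, 444, '1/2/2016')
# (222, 888, '1/8/2016')
# (555, 777, '1/5/2016')
# (666, 888, '1/6/2016')
```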
Do you mean:
select
p_id,t_number
from
trialtb
group by
p_id,t_number
having
count(*) = 1
or do you need the run date too?
select
p_id,t_number,max(rundate)
from
trialtb
group by
p_id,t_number
having
count(*) = 1
Since you are only looking at combinations with a single row, using max or min should work fine.

SQL Server Sum multiple rows into one - no temp table

I would like to see the most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row
that is, combine multiple rows while summing a column.
But how do I then delete the duplicates? In other words, I have data like this:
Person Value
--------------
1 10
1 20
2 15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person Value
-------------
1 30
2 15
And I would like to do this without using a temp table. I think I'll need to use OVER PARTITION BY, but I'm not sure. I'm just trying to challenge myself to avoid the temp table way. I'm working with SQL Server 2008 R2.
Simply put, give me a concise statement that gets from my input to my output in the same table. So if my table is named People and I run select * from People before the operation I am asking about, I get the first set above, and when I run select * from People after the operation, I get the second set of data above.
I'm not sure why you want to avoid a temp table, but here's one way to do so (though in my opinion this is overkill):
UPDATE MyTable SET VALUE = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);
WITH DUP_TABLE AS
(SELECT ROW_NUMBER()
OVER (PARTITION BY Person ORDER BY Person) As ROW_NO
FROM MyTable)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
The first query updates every row of a duplicated person to the summed value. The second query removes the duplicate rows.
Demo: http://sqlfiddle.com/#!3/db7aa/11
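The same two-step idea (write the totals, then delete the extra rows) can be tried locally with Python's sqlite3 module. This is a variation, not the SQL Server statement above: SQLite cannot delete through a CTE, so the sketch identifies the row to keep by its lowest rowid instead of ROW_NUMBER(), and it writes the total only into that row. The table name People comes from the question.

```python
import sqlite3

# In-memory SQLite stand-in for the People table, loaded with the question's rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO People VALUES (?, ?)", [(1, 10), (1, 20), (2, 15)])

# The row kept for each person is the one with the lowest rowid.
keepers = "SELECT MIN(rowid) FROM People GROUP BY Person"

# Step 1: write each person's total into that person's keeper row.
conn.execute(
    "UPDATE People "
    "SET Value = (SELECT SUM(P2.Value) FROM People AS P2 WHERE P2.Person = People.Person) "
    f"WHERE rowid IN ({keepers})"
)
# Step 2: delete every row that is not a keeper.
conn.execute(f"DELETE FROM People WHERE rowid NOT IN ({keepers})")
conn.commit()

remaining = conn.execute("SELECT Person, Value FROM People ORDER BY Person").fetchall()
print(remaining)  # [(1, 30), (2, 15)]
```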
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.
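A short sketch of this query on the question's data, using Python's sqlite3 module as a stand-in for SQL Server. It also shows that the SELECT only reports the totals: the table itself still holds all three rows afterwards, so it does not by itself satisfy the "same table" requirement in the question.

```python
import sqlite3

# In-memory SQLite stand-in for the People table, loaded with the question's rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO People VALUES (?, ?)", [(1, 10), (1, 20), (2, 15)])

# One output row per Person, with Value summed within each group.
totals = conn.execute(
    "SELECT Person, SUM(Value) AS Value FROM People GROUP BY Person ORDER BY Person"
).fetchall()

# The underlying table is unchanged by the SELECT.
row_count = conn.execute("SELECT COUNT(*) FROM People").fetchone()[0]

print(totals)     # [(1, 30), (2, 15)]
print(row_count)  # 3
```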

SQL Max Function per group

I have a complex query which may return more than one record per group. There is a field that holds a sequential number. If more than one record is returned in a group, I just want the record with the highest sequential number.
I’ve tried using the SQL MAX function, but if I try to add more than one field it returns all records, instead of the one with the highest sequential field in that group.
I am trying to accomplish this in MS Access.
Edit: 4/5/11
Trying to create a table as an example of what I am trying to do
I have the following table:
tblItemTrans
ItemID(PK)
Eventseq(PK)
ItemTypeID
UserID
Eventseq is a number field that increments for each ItemID. (Don’t ask me why, that’s how the table was created.) Each ItemID can have one or many Eventseq values. I only need the last record (max(Eventseq)) per ItemTypeID.
Hope this helps any.
SELECT A.*
FROM YourTable A
INNER JOIN (SELECT GroupColumn, MAX(SequentialColumn) MaxSeq
FROM YourTable
GROUP BY GroupColumn) B
ON A.GroupColumn = B.GroupColumn AND A.SequentialColumn = B.MaxSeq
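Applied to the table from the question, the join can be sketched as follows with Python's sqlite3 module standing in for Access. Treating ItemID as the group column is an assumption (the question mentions both ItemID and ItemTypeID), and the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for tblItemTrans; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblItemTrans ("
    "ItemID INTEGER, Eventseq INTEGER, ItemTypeID INTEGER, UserID TEXT, "
    "PRIMARY KEY (ItemID, Eventseq))"
)
conn.executemany(
    "INSERT INTO tblItemTrans VALUES (?, ?, ?, ?)",
    [(1, 1, 10, "u1"), (1, 2, 10, "u2"), (2, 1, 20, "u3")],
)

# Assumption: the group column is ItemID. Join each row to its group's maximum Eventseq.
last_events = conn.execute(
    "SELECT A.ItemID, A.Eventseq, A.ItemTypeID, A.UserID "
    "FROM tblItemTrans AS A "
    "INNER JOIN (SELECT ItemID, MAX(Eventseq) AS MaxSeq "
    "            FROM tblItemTrans GROUP BY ItemID) AS B "
    "ON A.ItemID = B.ItemID AND A.Eventseq = B.MaxSeq "
    "ORDER BY A.ItemID"
).fetchall()

print(last_events)  # [(1, 2, 10, 'u2'), (2, 1, 20, 'u3')]
```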
If your SequentialNumber is an ID (unique across the table), then you could use
select *
from tbl
where seqnum in (
select max(seqnum) from tbl
group by groupcolumn)
If it is not, an alternative to Lamak's query is the Access domain function DMAX
select *
from tbl
where seqnum = DMAX("seqnum", "tbl", "groupcolumn='" & groupcolumn & "'")
Note: if groupcolumn is a date, use # instead of the single quotes ' in the above; if it is numeric, remove the single quotes.
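Why the uniqueness condition matters for the IN version can be shown with a small sketch in Python's sqlite3 module (a stand-in for Access; the rows are hypothetical). When seqnum restarts in each group, the IN list matches rows from other groups too. A correlated subquery, which is standard SQL and plays the same role as the DMAX call, avoids that.

```python
import sqlite3

# In-memory SQLite stand-in; seqnum restarts at 1 in each group, so it is not unique.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (groupcolumn TEXT, seqnum INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [("a", 1), ("a", 2), ("b", 1)])

# IN version: the list of maxima is (2, 1), so the row ('a', 1) matches as well.
in_rows = conn.execute(
    "SELECT groupcolumn, seqnum FROM tbl "
    "WHERE seqnum IN (SELECT MAX(seqnum) FROM tbl GROUP BY groupcolumn) "
    "ORDER BY groupcolumn, seqnum"
).fetchall()

# Correlated version: each row is compared only with the maximum of its own group.
correlated_rows = conn.execute(
    "SELECT groupcolumn, seqnum FROM tbl "
    "WHERE seqnum = (SELECT MAX(t2.seqnum) FROM tbl AS t2 "
    "                WHERE t2.groupcolumn = tbl.groupcolumn) "
    "ORDER BY groupcolumn"
).fetchall()

print(in_rows)          # [('a', 1), ('a', 2), ('b', 1)]
print(correlated_rows)  # [('a', 2), ('b', 1)]
```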