Transformation into a table related to BI - sql

Current situation
date (nvarchar(9))
sku (nvarchar(5))
smith (decimal)
jones (decimal)
johnson (decimal)
nguyen (decimal)
date       sku    smith   jones    johnson  nguyen
-----------------------------------------------------------
11/4/2007  X2271  2404    9055,33  7055,22  0
11/4/2007  B1112  108,99  0        244,92   1001,01
Requested result:
date       sku    salesperson  sales
------------------------------------------
11/4/2007  X2271  Smith        2404
11/4/2007  X2271  Jones        9055,33
11/4/2007  X2271  Johnson      7055,22
11/4/2007  B1112  Smith        108,99
11/4/2007  B1112  Johnson      244,92
11/4/2007  B1112  Nguyen       1001,01
I need some help solving this problem. It is part of a BI task.

You can use the UNPIVOT operator to accomplish this.
SELECT
    date,
    sku,
    salesperson,
    sales
FROM (
    SELECT
        date,
        sku,
        smith,
        jones,
        johnson,
        nguyen
    FROM
        YourTable
) q
UNPIVOT
    (sales FOR salesperson IN (smith, jones, johnson, nguyen)) AS YourUnpivot
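
The requested result leaves out the rows where sales are 0, and it shows capitalized names. UNPIVOT only drops NULLs, so zero-sales rows would still appear. A possible alternative, swapping UNPIVOT for CROSS APPLY with a VALUES list, is sketched below. It assumes the same YourTable placeholder as above and that a zero really means "no sale"; the salesperson labels are typed by hand.

```sql
-- Sketch only: CROSS APPLY (VALUES ...) used in place of UNPIVOT.
-- YourTable is the placeholder name from the answer above.
SELECT
    t.[date],
    t.sku,
    v.salesperson,
    v.sales
FROM YourTable AS t
CROSS APPLY (VALUES
    ('Smith',   t.smith),
    ('Jones',   t.jones),
    ('Johnson', t.johnson),
    ('Nguyen',  t.nguyen)
) AS v (salesperson, sales)   -- one output row per (label, amount) pair
WHERE v.sales <> 0;           -- drop zero rows (assumption: 0 means no sale)
```

With the two sample rows this produces the six rows of the requested result. The same WHERE clause can be added to the UNPIVOT query if only the zero filter is needed.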

Related

Get max value and max date from sql query

I have a table with duplicate member names but these duplicates also have more than one date and a specific ID. I want the row of the member name with the most recent date (because a member could have been called more than one time a day) and biggest CallID number.
MemberID  FirstName  LastName  CallDate    CallID
0123      Carl       Jones     2019-03-01  123456
0123      Carl       Jones     2020-10-12  215789
0123      Carl       Jones     2020-10-12  312546
2045      Sarah      Marty     2021-05-09  387945
2045      Sarah      Marty     2021-08-11  398712
4025      Jane       Smith     2021-10-18  754662
4025      Jane       Smith     2021-11-03  761063
8282      Suzy       Aaron     2019-12-12  443355
8282      Suzy       Aaron     2019-12-12  443386
So the desired output from this table would be
MemberID  FirstName  LastName  CallDate    CallID
0123      Carl       Jones     2020-10-12  312546
2045      Sarah      Marty     2021-08-11  398712
4025      Jane       Smith     2021-11-03  761063
8282      Suzy       Aaron     2019-12-12  443386
The query I've tried is
SELECT DISTINCT MemberID, FirstName, LastName, MAX(CallDate) as CallDate, MAX(CallID) as CallID
FROM dbo.table
GROUP BY MemberID, FirstName, LastName, CallDate, CallID
ORDER BY LastName asc;
But I'm still getting duplicate names with all their CallDates and CallIDs.
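Try removing CallDate and CallID from the GROUP BY clause. Grouping by them produces one group per call, so MAX has nothing to collapse. DISTINCT is no longer needed either: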
SELECT MemberID, FirstName, LastName, MAX(CallDate) as CallDate, MAX(CallID) as CallID
FROM dbo.table
GROUP BY MemberID, FirstName, LastName
ORDER BY LastName asc;
Hopefully that should do it.
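You can also use a window function. Ordering by CallDate first and CallID second matches the "most recent date, then biggest CallID" requirement:
select * from (
    select *, row_number() over (partition by MemberID order by CallDate desc, CallID desc) rn
    from tablename
) t where rn = 1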
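
The GROUP BY answer and a ROW_NUMBER query agree on the sample data, but they are not equivalent in general. Two independent MAX() calls can combine values from different rows, whereas ROW_NUMBER always returns one real row. A small sketch with made-up rows (the @Calls table variable and its values are hypothetical, not taken from the question):

```sql
-- Hypothetical rows: the later call happens to have the smaller CallID.
DECLARE @Calls TABLE (MemberID CHAR(4), CallDate DATE, CallID INT);
INSERT INTO @Calls (MemberID, CallDate, CallID)
VALUES ('9999', '2021-01-05', 900),
       ('9999', '2021-02-01', 100);

-- Independent MAX() calls: returns 2021-02-01 together with CallID 900,
-- a combination that no single row contains.
SELECT MemberID, MAX(CallDate) AS CallDate, MAX(CallID) AS CallID
FROM @Calls
GROUP BY MemberID;

-- ROW_NUMBER(): returns the actual latest row (2021-02-01, CallID 100).
SELECT MemberID, CallDate, CallID
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY MemberID
                              ORDER BY CallDate DESC, CallID DESC) AS rn
    FROM @Calls
) AS t
WHERE rn = 1;
```

If CallID always increases over time, as it does in the question's sample, both approaches give the same answer.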

How do I select a max date by person in a table

I am not very experienced with SSRS/SQL queries and need to write a report that pulls % allocations by person, which are then compared to a wage table to allocate the wages. These allocations change quarterly, but all allocations continue to be stored in the table. If a person's allocation did not change, they do NOT get a new entry in the table. Here is a sample table called Allocations.
First Name  Last Name  Date      Area  Percent
----------  ---------  --------  ----  -------
Smith       Bob        01/01/20  A       50.00
Smith       Bob        01/01/20  B       50.00
Doe         Jane       01/01/20  A       25.00
Doe         Jane       01/01/20  B       25.00
Doe         Jane       01/01/20  C       50.00
Doe         Jane       04/01/20  A       35.00
Doe         Jane       04/01/20  C       65.00
Wayne       Bruce      01/01/20  A      100.00
Wayne       Bruce      04/01/20  B      100.00
The results that I would want to have from this sample table when querying it are:
First Name  Last Name  Date      Area  Percent
----------  ---------  --------  ----  -------
Smith       Bob        01/01/20  A       50.00
Smith       Bob        01/01/20  B       50.00
Doe         Jane       04/01/20  A       35.00
Doe         Jane       04/01/20  C       65.00
Wayne       Bruce      04/01/20  B      100.00
However, I would also like to pull this by comparing it to a date that the user inputs, so that they could run this report at any point in time and get the correct "max" dates. So, for example, if there were also 7/1/20 dates in here, but the user input date was 6/30/20, I would NOT want to pull the 7/1/20 data. In other words, I would like to pull the rows with the maximum date by name w/o going over the user's input date.
Any idea on the best way to accomplish this?
Thanks in advance for any advice you can provide.
In SQL, ROW_NUMBER can be used to number the records within each group, ordered by a particular field. Number each person's rows from the newest date down, then filter for ROW_NUM = 1:
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Last_Name, First_Name ORDER BY [Date] DESC) AS ROW_NUM
    FROM Allocations
) AS T
WHERE ROW_NUM = 1
However, I noticed that some people have several rows with the same date and you want all of them. In this case you'd want to use RANK, which allows for ties, so every record sharing the latest date is captured.
SELECT * FROM (
    SELECT *, RANK() OVER (PARTITION BY Last_Name, First_Name ORDER BY [Date] DESC) AS RNK
    FROM Allocations
) AS T
WHERE RNK = 1
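
The question also asks for a cutoff date supplied by the user. One way to sketch that is to filter on the date inside the subquery, so that later rows are discarded before ranking. The parameter name @AsOfDate is hypothetical (in SSRS it would be mapped to a report parameter), and the sketch assumes [Date] is stored as a date or datetime type rather than as text such as '01/01/20'.

```sql
-- Sketch only: @AsOfDate is an assumed parameter name.
DECLARE @AsOfDate DATE = '2020-06-30';

SELECT *
FROM (
    SELECT *,
           RANK() OVER (PARTITION BY Last_Name, First_Name
                        ORDER BY [Date] DESC) AS rnk
    FROM Allocations
    WHERE [Date] <= @AsOfDate   -- drop rows dated after the cutoff before ranking
) AS t
WHERE rnk = 1;                  -- keep every row that shares the latest remaining date
```

Because WHERE is evaluated before the window function, a 7/1/20 row never takes part in the ranking when the cutoff is 6/30/20, and the 4/1/20 rows rank first instead.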

Find duplicate batches based on multiple columns

I have a table that contains a series of related records (batches). Each batch has a unique id and can contain customer payments. I want to find if a batch is duplicate even if it is submitted on different days.
A batch can have 1 or more records. Here is sample data set:
BatchId  InputAmount  CustomerName  BatchDate
-------  -----------  ------------  -----------
182944   $475.00      Barry Smith   16-Mar-2019
182944   $260.00      John Smith    16-Mar-2019
182944   $265.00      Jane Smith    16-Mar-2019
182944   $400.00      Sara Smith    16-Mar-2019
182944   $175.00      Andy Smith    16-Mar-2019
182945   $475.00      Barry Smith   16-Mar-2019
182945   $260.00      John Smith    16-Mar-2019
182945   $265.00      Jane Smith    16-Mar-2019
182945   $400.00      Sara Smith    16-Mar-2019
182945   $175.00      Andy Smith    16-Mar-2019
183194   $100.00      Paul Green    21-Mar-2019
183195   $100.00      Nancy Green   21-Mar-2019
183197   $150.00      John Brown    20-Mar-2019
183197   $210.00      Sarah Brown   20-Mar-2019
183198   $150.00      John Brown    21-Mar-2019
183198   $210.00      Sarah Brown   21-Mar-2019
183200   $125.00      John Doe      20-Mar-2019
183200   $110.00      Sarah Doe     20-Mar-2019
183202   $125.00      John Doe      21-Mar-2019
183202   $110.00      Sarah Doe     21-Mar-2019
183202   $115.00      Paul Rudd     21-Mar-2019
Batches (182944, 182945) and (183197, 183198) are duplicates, while the other batches are not.
I thought maybe I could create a summary table with counts and sums and get close but I'm having trouble finding the true duplicates by including the names as well.
DECLARE @Summaries TABLE(
    BatchId INT,
    BatchDate DATETIME,
    BatchCount INT,
    BatchAmount MONEY)
-- Summarize the data so we can look for duplicates
INSERT INTO @Summaries
SELECT a.BatchId, a.BatchDate, COUNT(*) AS RecordCount, SUM(a.InputAmount) AS BatchAmount
FROM Batches a
WHERE a.BatchDate BETWEEN '20190316' AND '20190321'
GROUP BY a.BatchId, a.BatchDate
ORDER BY a.BatchId DESC
-- Find the potential duplicate batches based on the counts and sums
SELECT A.* FROM @Summaries A
INNER JOIN (SELECT BatchCount, BatchAmount, BatchDate FROM @Summaries
            GROUP BY BatchCount, BatchAmount, BatchDate
            HAVING COUNT(*) > 1) B
    ON A.BatchCount = B.BatchCount
    AND A.BatchAmount = B.BatchAmount
WHERE DATEDIFF(DAY, A.BatchDate, B.BatchDate) BETWEEN -1 AND 1
Thank you for the help. I'm using a SQL Server 2012 database.
You can try something like the following:
with cte as
(
    select BatchId from table_name
    group by BatchId
    having count(*) > 1
)
select * from table_name a where a.BatchId in (select BatchId from cte)
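
Note that the query above returns every batch that has more than one row, which is not the same thing as a duplicate batch. A possible way to compare whole batches, sketched here against the Batches table from the question, is to pair up batches with the same row count and check that every (CustomerName, InputAmount) row finds a match in the other batch. The sketch assumes that a batch never contains the same customer and amount twice; the CTE name BatchSizes and the output column names are made up for illustration.

```sql
-- Sketch only: pairs of batches whose rows match one-for-one, ignoring BatchDate.
WITH BatchSizes AS (
    SELECT BatchId, COUNT(*) AS RecordCount
    FROM Batches
    GROUP BY BatchId
)
SELECT
    a.BatchId AS BatchId,
    b.BatchId AS DuplicateBatchId
FROM Batches AS a
JOIN Batches AS b
    ON  a.BatchId < b.BatchId              -- each pair once, never a batch with itself
    AND a.CustomerName = b.CustomerName
    AND a.InputAmount  = b.InputAmount
JOIN BatchSizes AS sa ON sa.BatchId = a.BatchId
JOIN BatchSizes AS sb ON sb.BatchId = b.BatchId
WHERE sa.RecordCount = sb.RecordCount      -- same number of rows in both batches
GROUP BY a.BatchId, b.BatchId, sa.RecordCount
HAVING COUNT(*) = sa.RecordCount;          -- every row found a partner
```

On the sample data this returns (182944, 182945) and (183197, 183198). Batches 183200 and 183202 share two rows but differ in size, so they are excluded. This works on SQL Server 2012 since it needs no string aggregation.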

SQL aggregation without min and max

I'm relatively new to SQL. I currently have the following CoursesTbl
StudentName    CourseID  InstructorName
Harry Potter   180       John Wayne
Harry Potter   181       Tiffany Williams
John Williams  180       Robert Smith
John Williams  181       Bob Adams
Now what I really want is this:
StudentName    Course1(180)  Course2(181)
Harry Potter   John Wayne    Tiffany Williams
John Williams  Robert Smith  Bob Adams
I've tried this query:
Select StudentName, Min(InstructorName) as Course1, Max(InstructorName) as Course2
from CoursesTbl
Group By StudentName
Now it's clear to me that I need to group by the Student Name. But using Min and Max messes up the instructor order.
i.e. Min for Harry is John Wayne and Max is Tiffany Williams
Min for John Williams is Bob Adams and Max is Robert Smith.
So it does not display instructors in the correct order.
Can anyone please suggest how this could be fixed?
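You can use conditional aggregation, that is, a CASE expression inside an aggregate function, to pivot the data into columns. Each CASE picks out the instructor for one specific CourseID, so the column no longer depends on alphabetical order: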
select
    [StudentName],
    Course1 = max(case when CourseId = 180 then InstructorName end),
    Course2 = max(case when CourseId = 181 then InstructorName end)
from CoursesTbl
group by StudentName
You could also use the PIVOT operator to get the result:
select
    StudentName,
    Course1 = [180],
    Course2 = [181]
from
(
    select StudentName,
        CourseId,
        InstructorName
    from CoursesTbl
) d
pivot
(
    max(InstructorName)
    for CourseId in ([180], [181])
) piv

How do I transpose multiple rows to columns in SQL

My first time asking a question on here.
I am working at a university and I have a table of student IDs and their supervisors. Some students have one supervisor and some have two or three, depending on their subject.
The table looks like this:
ID Supervisor
1 John Doe
2 Peter Jones
2 Sarah Jones
3 Peter Jones
3 Sarah Jones
4 Stephen Davies
4 Peter Jones
4 Sarah Jones
5 John Doe
I want to create a view that turns that into this:
ID  Supervisor 1    Supervisor 2  Supervisor 3
1   John Doe
2   Peter Jones     Sarah Jones
3   Peter Jones     Sarah Jones
4   Stephen Davies  Peter Jones   Sarah Jones
5   John Doe
I have looked at PIVOT functions, but don't think it matches my needs.
Any help is greatly appreciated.
PIVOT was the right clue, it only needs a little 'extra' :)
DECLARE @tt TABLE (ID INT, Supervisor VARCHAR(128));
INSERT INTO @tt (ID, Supervisor)
VALUES
    (1, 'John Doe'),
    (2, 'Peter Jones'),
    (2, 'Sarah Jones'),
    (3, 'Peter Jones'),
    (3, 'Sarah Jones'),
    (4, 'Stephen Davies'),
    (4, 'Peter Jones'),
    (4, 'Sarah Jones'),
    (5, 'John Doe');
SELECT
    *
FROM
(
    SELECT
        ID,
        -- Number each student's supervisors 1, 2, 3 and turn that into the pivot column name
        'Supervisor ' + CAST(ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Supervisor) AS VARCHAR(128)) AS supervisor_id,
        Supervisor
    FROM
        @tt
) AS tt
PIVOT(
    MAX(Supervisor) FOR
    supervisor_id IN ([Supervisor 1], [Supervisor 2], [Supervisor 3])
) AS piv;
Result:
ID  Supervisor 1  Supervisor 2  Supervisor 3
1   John Doe      NULL          NULL
2   Peter Jones   Sarah Jones   NULL
3   Peter Jones   Sarah Jones   NULL
4   Peter Jones   Sarah Jones   Stephen Davies
5   John Doe      NULL          NULL
You will notice that the assignment to Supervisor X is done by ordering on the Supervisor VARCHAR value, i.e. alphabetically, which is why Stephen Davies ends up as Supervisor 3 for ID 4. If you want the ordering done differently, you might want to include an [Ordering] column and then change to ROW_NUMBER() OVER(PARTITION BY ID ORDER BY [Ordering]). E.g. an [Ordering] column could be an INT IDENTITY(1,1). I'll leave that as an exercise to you if that's what's really needed.
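
A minimal sketch of that suggestion follows. It uses only the three rows for ID 4, and the [Ordering] values are filled in by hand rather than by IDENTITY so that the intended order is explicit; both choices are assumptions made for illustration. The inner query deliberately leaves [Ordering] out of its select list, because PIVOT groups by every column it is not told to aggregate or spread.

```sql
-- Sketch only: explicit Ordering values stand in for an IDENTITY column.
DECLARE @tt TABLE (Ordering INT, ID INT, Supervisor VARCHAR(128));
INSERT INTO @tt (Ordering, ID, Supervisor)
VALUES
    (1, 4, 'Stephen Davies'),
    (2, 4, 'Peter Jones'),
    (3, 4, 'Sarah Jones');

SELECT ID, [Supervisor 1], [Supervisor 2], [Supervisor 3]
FROM
(
    SELECT
        ID,
        -- Number supervisors by Ordering instead of alphabetically
        'Supervisor ' + CAST(ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Ordering) AS VARCHAR(10)) AS supervisor_id,
        Supervisor
    FROM @tt
) AS src
PIVOT
(
    MAX(Supervisor) FOR supervisor_id IN ([Supervisor 1], [Supervisor 2], [Supervisor 3])
) AS piv;
```

For ID 4 this yields Stephen Davies, Peter Jones, Sarah Jones, matching the layout requested in the question.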