How to create an SQL query that takes values on different rows and joins them together on the same row (variable number of joins required) - sql

Not sure how to phrase the question really, but here's what I have and here's what I need.
I've got a table that looks like this:
Name K% Year
Albert Pujols 7.90% 2006
Albert Pujols 8.50% 2007
Albert Pujols 8.40% 2008
Albert Pujols 9.10% 2009
Albert Pujols 10.90% 2010
Albert Pujols 8.90% 2011
Albert Pujols 11.30% 2012
I'd like to create a query that will produce output that looks like:
Albert Pujols 7.90% 8.50% 8.40% 9.10% 10.90% 8.90% 11.30%
While this particular player has 7 rows, I can't guarantee that every player will have the same number.
Is this even possible?
I'd appreciate any help. I wouldn't have any trouble if I knew that there were only 2 rows (inner join on name)... but the variable number of rows is throwing me for a loop.
Edit:
Peter Wooster's answer of pivoting was the solution I needed.

If you are doing this so you can print a report, best thing to do is use a report writer that supports cross tabs. Jasper Reports does.
SQL is not really good at this kind of stuff. There are tricky ways you could get it to give you the results, but they'd be pretty silly.
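For reference, one of the "tricky ways" is a pivot built with conditional aggregation. The sketch below is only an illustration, not the accepted solution from the thread: it uses Python's sqlite3 module, the table name stats and its column names are assumptions, and the second player is hypothetical data added to show a differing row count. The list of years is read from the table first, so the number of output columns follows the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names are assumptions made for this sketch.
conn.execute("CREATE TABLE stats (name TEXT, k_pct TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?)",
    [
        ("Albert Pujols", "7.90%", 2006),
        ("Albert Pujols", "8.50%", 2007),
        ("Albert Pujols", "8.40%", 2008),
        ("Albert Pujols", "9.10%", 2009),
        ("Albert Pujols", "10.90%", 2010),
        ("Albert Pujols", "8.90%", 2011),
        ("Albert Pujols", "11.30%", 2012),
        # Hypothetical second player with fewer rows.
        ("Example Player", "15.00%", 2011),
        ("Example Player", "14.20%", 2012),
    ],
)

# Find the years present so the column list is not hard-coded.
years = [r[0] for r in conn.execute("SELECT DISTINCT year FROM stats ORDER BY year")]

# One MAX(CASE ...) column per year. The years are integers read from the
# table itself, so formatting them into the SQL text is safe here.
columns = ", ".join(
    f'MAX(CASE WHEN year = {int(y)} THEN k_pct END) AS "{int(y)}"' for y in years
)
pivot_sql = f"SELECT name, {columns} FROM stats GROUP BY name ORDER BY name"

pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Players with no row for a given year simply get NULL (None in Python) in that column.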

Related

How to merge crosstab info down in Access?

Not sure if this is possible but I'm hoping it is. I am using MS Access for Estate Planning for work. I've gotten to the point where I've got the data to look like this:
File_Name | Executor_1 | Executor_2 | Beneficiary_1 | Beneficiary_2
Hill, Hank | Peggy Hill | | Peggy Hill |
Hill, Hank | | Bobby Hill | | Bobby Hill
Gribble, Dale | Nancy Gribble | | |
Gribble, Dale | | Joseph Gribble | Joseph Gribble |
Gribble, Dale | | | | John Redcorn
But I need it to look like this:
File_Name | Executor_1 | Executor_2 | Beneficiary_1 | Beneficiary_2
Hill, Hank | Peggy Hill | Bobby Hill | Peggy Hill | Bobby Hill
Gribble, Dale | Nancy Gribble | Joseph Gribble | Joseph Gribble | John Redcorn
I need it in the latter format so I can use Mail Merge in Word to create the Will. Can anyone provide any guidance? We don't currently use any software for Est. Planning, so anything beats having to go into Word and retype everything manually. Please let me know if more information is needed.
Edit:
This is what the SQL looks like:
TRANSFORM Last(File_Roles.File_Name) AS LastOfFile_Name
SELECT File_Roles.Executor_1,
File_Roles.Executor_2,
File_Roles.Beneficiary_1,
File_Roles.Beneficiary_2,
File_Roles.Trustee_1,
File_Roles.Trustee_2,
File_Roles.Guardian_1,
File_Roles.Guardian_2,
File_Roles.ATTY_IF_1,
File_Roles.ATTY_IF_2,
File_Roles.HCATTY_IF_1,
File_Roles.HCATTY_IF_2
FROM File_Roles
GROUP BY File_Roles.Executor_1,
File_Roles.Executor_2,
File_Roles.Beneficiary_1,
File_Roles.Beneficiary_2,
File_Roles.Trustee_1,
File_Roles.Trustee_2,
File_Roles.Guardian_1,
File_Roles.Guardian_2,
File_Roles.ATTY_IF_1,
File_Roles.ATTY_IF_2,
File_Roles.HCATTY_IF_1,
File_Roles.HCATTY_IF_2
PIVOT File_Roles.File_Name;
You can use GROUP BY and MAX()
SELECT
t.File_Name,
MAX(t.Executor_1) As Executor_1,
MAX(t.Executor_2) As Executor_2,
MAX(t.Beneficiary_1) As Beneficiary_1,
MAX(t.Beneficiary_2) As Beneficiary_2
FROM table_or_query t
GROUP BY File_Name
But maybe you can fix your original crosstab query to do this right away. You are probably grouping incorrectly: the crosstab query must group by File_Name and apply Max in the Total row of the value field (though it is difficult to say without seeing the query).
GROUP BY File_Name means that one row is created for each distinct value of File_Name.
Since this will merge several rows into one, you must specify an aggregate function for every column in the SELECT list that is not listed in the GROUP BY clause. This can be e.g. SUM(), AVG(), MIN() or MAX(). See SQL Aggregate Functions for a complete list. Since aggregate functions ignore Null values, MAX() will return the non-Null value from the merged rows.
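The GROUP BY and MAX() approach can be tried out with a small sketch. This one uses Python's sqlite3 module as a stand-in for Access, so it is an approximation rather than the exact Access behaviour. The table name file_roles and the placement of the Null cells are assumptions inferred from the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE file_roles (
           File_Name TEXT, Executor_1 TEXT, Executor_2 TEXT,
           Beneficiary_1 TEXT, Beneficiary_2 TEXT)"""
)
# Sparse rows: each one fills only some role columns (None becomes NULL).
# Which cells are empty is an assumption based on the desired result.
conn.executemany(
    "INSERT INTO file_roles VALUES (?, ?, ?, ?, ?)",
    [
        ("Hill, Hank", "Peggy Hill", None, "Peggy Hill", None),
        ("Hill, Hank", None, "Bobby Hill", None, "Bobby Hill"),
        ("Gribble, Dale", "Nancy Gribble", None, None, None),
        ("Gribble, Dale", None, "Joseph Gribble", "Joseph Gribble", None),
        ("Gribble, Dale", None, None, None, "John Redcorn"),
    ],
)

# MAX() skips NULLs, so each group keeps the single non-NULL value per column.
merged = conn.execute(
    """SELECT File_Name,
              MAX(Executor_1)    AS Executor_1,
              MAX(Executor_2)    AS Executor_2,
              MAX(Beneficiary_1) AS Beneficiary_1,
              MAX(Beneficiary_2) AS Beneficiary_2
       FROM file_roles
       GROUP BY File_Name
       ORDER BY File_Name"""
).fetchall()

for row in merged:
    print(row)
```

If a column held two different non-Null values for the same File_Name, MAX() would silently keep only the larger one, so this relies on each role appearing at most once per file.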

The difference between those two SQL queries

I have converted a SQL query written by another senior developer, who is also the group lead; I am new to programming. He wrote a query that reads a collection of rows from the DB by sending an array of parameters. For example:
SELECT [LastName],[FirstMidName],[EnrollmentDate]
FROM [ContosoUniversity1].[dbo].[Student]
WHERE ([LastName] ='Alexander' AND [FirstMidName] = 'Carson')
OR ([LastName] ='Justice' AND [FirstMidName] = 'Peggy')
However, I was given an assignment to improve the security of the query. I made some changes to apply sqlParameter() to the query. The query is now written as:
SELECT [LastName],[FirstMidName],[EnrollmentDate]
FROM [ContosoUniversity1].[dbo].[Student]
WHERE [LastName] IN ('Alexander','Justice')
AND [FirstMidName] IN ('Carson','Peggy')
So basically it follows the WHERE ... IN pattern, which lets me carry on with my other tasks. The two queries gave the same result, but he insisted that mine was logically bad. I had a very hard time understanding his explanation, and I now doubt whether I converted the query correctly. Could anyone share an opinion?
The first query will only bring in exact pairings of names. Imagine if someone else called Carson Justice went to the school. Your query would bring him in; the senior's query would not.
I.e.
FirstMidName | LastName
Carson | Alexander
Peggy | Justice
Peggy | Alexander
Carson | Justice
The senior's query would return Carson Alexander and Peggy Justice.
Your query would return all 4 names (Carson Alexander, Peggy Justice, Peggy Alexander, Carson Justice).
Yours is logically wrong because it will bring in Peggy Alexander. The first query won't bring her in. And that doesn't seem like the intent of the exercise.
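The difference is easy to reproduce. The sketch below uses Python's sqlite3 module rather than SQL Server, with a cut-down Student table; the two extra students are hypothetical rows added to expose the mismatch. It also shows that the pairwise OR form can be parameterized just as well as the IN form, so the security goal does not require changing the logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (LastName TEXT, FirstMidName TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [
        ("Alexander", "Carson"),
        ("Justice", "Peggy"),
        ("Alexander", "Peggy"),  # hypothetical extra student
        ("Justice", "Carson"),   # hypothetical extra student
    ],
)

# The (LastName, FirstMidName) pairs we actually want.
wanted = [("Alexander", "Carson"), ("Justice", "Peggy")]

# Pairwise form: one parameterized (… AND …) group per wanted pair.
pair_clause = " OR ".join("(LastName = ? AND FirstMidName = ?)" for _ in wanted)
pair_params = [value for pair in wanted for value in pair]
pair_rows = conn.execute(
    f"SELECT LastName, FirstMidName FROM Student WHERE {pair_clause} "
    "ORDER BY LastName, FirstMidName",
    pair_params,
).fetchall()

# IN form: matches every combination of the listed last and first names.
marks = ", ".join("?" for _ in wanted)
in_rows = conn.execute(
    f"SELECT LastName, FirstMidName FROM Student "
    f"WHERE LastName IN ({marks}) AND FirstMidName IN ({marks}) "
    "ORDER BY LastName, FirstMidName",
    [p[0] for p in wanted] + [p[1] for p in wanted],
).fetchall()

print(pair_rows)
print(in_rows)
```

Only placeholders are formatted into the SQL text; the name values always travel as parameters.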

SQL Group By with Text Transformation

I'm trying to do some transformations on a large data set that I'm working on and was hoping for a bit of assistance on a particular grouping. I have a series of records that follow a pattern similar to below:
Language Full Name Customer ID
--------------------------------------
English John Smith 12222
French John Smith 12222
Spanish John Smith 12222
English Karen Wong 55999
Cantonese Karen Wong 55999
I need the data such that the Full Name and Customer ID are not repeated, so I'm simply using DISTINCT for that. However, one oddity in the requirement is that all the different languages need to be preserved and squashed into the output, so the resulting data needs to look like this:
Languages Spoken Full Name Customer ID
----------------------------------------------------
English, French, Spanish John Smith 12222
English, Cantonese Karen Wong 55999
Sounded like a simple thing but I guess I'm not a big SQL guru and keep getting funny results. Any help would be much appreciated :)
If you're using SQL Server 2017 or Azure SQL, then you can just use STRING_AGG:
https://learn.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-2017
For everything else (covers solutions from SQL Server 2005 and on):
Simulating group_concat MySQL function in Microsoft SQL Server 2005?
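To illustrate the shape of the aggregation, here is a sketch using Python's sqlite3 module. SQLite has no STRING_AGG in older versions, so its GROUP_CONCAT function stands in for it here; the table name customers and the column names are assumptions. SQLite does not guarantee the order of the concatenated values, so the order of languages within a row may vary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names are assumptions made for this sketch.
conn.execute("CREATE TABLE customers (language TEXT, full_name TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("English", "John Smith", 12222),
        ("French", "John Smith", 12222),
        ("Spanish", "John Smith", 12222),
        ("English", "Karen Wong", 55999),
        ("Cantonese", "Karen Wong", 55999),
    ],
)

# One output row per customer; the languages are joined with ", ".
result = conn.execute(
    """SELECT GROUP_CONCAT(language, ', ') AS languages_spoken,
              full_name,
              customer_id
       FROM customers
       GROUP BY full_name, customer_id
       ORDER BY customer_id"""
).fetchall()

for row in result:
    print(row)
```

On SQL Server 2017 the equivalent call would be STRING_AGG with the same GROUP BY; its WITHIN GROUP (ORDER BY ...) clause can fix the order of the languages.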

SSRS Combining values within columns from multiple rows when grouped

I feel like this should be relatively easy to do in an SSRS report. I'm using VS 2010. I have a table that comes in from a basic SQL query, and I'm just dropping the columns into a table in Visual Studio. I want to group the table by company first, which I do via the row group properties. I have a table that looks like this.
Company Contact ContactSub SubCert Year
Bank3 Joey Steven.B A 2010
Bank2 Dave James A 2010
Bank2 Dave Steve B 2010
Bank2 Dave Mark B 2010
Bank2 Dave James A 2011
Bank2 Dave Steve A 2011
Bank2 Dave Mark B 2011
Bank2 Dave James A 2012
Bank2 Dave Steve A 2012
Bank2 Dave Mark A 2012
I now want to combine the ContactSubs and their SubCerts into one row, BUT using only the most recent year, because some ContactSubs may have had their SubCert upgraded from a B to an A.
Company Contact ContactSub SubCert Year
Bank3 Joey Steven.B A 2010
Bank2 Dave James,Steve,Mark A,A,A 2012
I added an additional group-by property, the "Year" column, to the row and used this formula for the ContactSub and SubCert columns in the table:
=Join(LookupSet(Fields!Company.Value,Fields!Company.Value,Fields!SubCert.Value,"DataSet Name"),",")
But this returned me:
Company Contact ContactSub SubCert Year
Bank3 Joey Steven.B A 2010
Bank2 Dave James,Steve,Mark,James A,B,B,A, 2012
Steve,Mark,James, Steve A,B,A,A,
Mark A
How could I clarify my formula to make it say for only the newest year instead of using the values for all years?
Hope this makes sense.
With your data:
And a table grouped on Company:
I use the following expressions:
ContactSub
=Join(LookupSet(Fields!Company.Value & Max(Fields!Year.Value)
, Fields!Company.Value & Fields!Year.Value
, Fields!ContactSub.Value
, "DataSet1"), ",")
SubCert
=Join(LookupSet(Fields!Company.Value & Max(Fields!Year.Value)
, Fields!Company.Value & Fields!Year.Value
, Fields!SubCert.Value
, "DataSet1"), ",")
You can see I'm using Max(Fields!Year.Value) as well as Fields!Company.Value to only match on the highest year in the LookupSet expression.
This gives the required results:
Your problem is that it's working as intended - the LOOKUPSET() function is returning all records from your dataset where the Company matches. You need to either tighten your criteria in your use of the LOOKUPSET() function, or add some custom code to go through the returned array and purge duplicates.
One option for tightening up the lookup might be to add a calculated field to your dataset that concatenates the Company name and the Year together, which, at least looking at your sample data, would provide the slightly more unique key you're looking for.
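The composite-key idea from both answers can be sketched outside SSRS. The Python below is only an emulation: lookup_set is a hypothetical helper that imitates what LookupSet does (collect a value from every row whose key matches), and the key is the company name concatenated with the year, matched against each company's highest year.

```python
# Sample rows from the question: (Company, Contact, ContactSub, SubCert, Year).
rows = [
    ("Bank3", "Joey", "Steven.B", "A", 2010),
    ("Bank2", "Dave", "James", "A", 2010),
    ("Bank2", "Dave", "Steve", "B", 2010),
    ("Bank2", "Dave", "Mark", "B", 2010),
    ("Bank2", "Dave", "James", "A", 2011),
    ("Bank2", "Dave", "Steve", "A", 2011),
    ("Bank2", "Dave", "Mark", "B", 2011),
    ("Bank2", "Dave", "James", "A", 2012),
    ("Bank2", "Dave", "Steve", "A", 2012),
    ("Bank2", "Dave", "Mark", "A", 2012),
]


def lookup_set(source_key, dataset, key_of, value_of):
    """Hypothetical stand-in for SSRS LookupSet: values of all matching rows."""
    return [value_of(row) for row in dataset if key_of(row) == source_key]


def composite_key(row):
    """Company and Year concatenated, like Fields!Company.Value & Fields!Year.Value."""
    return f"{row[0]}{row[4]}"


summary = {}
for company in dict.fromkeys(row[0] for row in rows):  # keeps first-seen order
    max_year = max(row[4] for row in rows if row[0] == company)
    key = f"{company}{max_year}"  # only rows from the newest year will match
    summary[company] = (
        ",".join(lookup_set(key, rows, composite_key, lambda r: r[2])),
        ",".join(lookup_set(key, rows, composite_key, lambda r: r[3])),
        max_year,
    )

for company, values in summary.items():
    print(company, values)
```

Matching on company alone would pull in every year's rows, which is the repeated output seen in the question.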

Combining almost identical rows into 1

I have a tricky problem that I wouldn't mind a bit of help on. I've made some progress using queries that I've found here and elsewhere, but am getting seriously stumped now.
I have a mailing list that has numerous near duplications that I'm trying to combine into one meaningful row, taking data such as this.
Title | Forename | Surname | Address1 | Postcode | Phone | Age | Income | Ownership | Gas
Mrs | D | Andrews | 122 Somewhere | BH10 | 123456 | 66-70 | | Homeowner |
Ms | Diane | Andrews | 122 Somewhere | BH10 | 123456 | | £25-40 | | EDF
and making one row along the lines of
Title | Forename | Surname | Address1 | Postcode | Phone | Age | Income | Ownership | Gas
Mrs | Diane | Andrews | 122 Somewhere | BH10 | 123456 | 66-70 | £25-40 | Homeowner | EDF
I have over 127 million records, most duplicated with a similar pattern, but no clear logic as was proven when I added an identity field. I also have over 90 columns to consider, so it's a bit of work!
There isn't a clear pattern to the data, so I'm thinking I may have a huge case statement to try to climb over.
Using the following code I can get a decent start on returning only the full name. My attempt at comparing the fields across rows is as follows.
SELECT c1.*
FROM
Mailing c1
JOIN
Mailing c2 ON c1.Telephone1 = c2.Telephone1 AND c1.surname = c2.surname
WHERE
len(c1.Forename) > len(c2.Forename)
AND c2.over_18 <> ''
AND c1.Telephone1 = '123456'
Has anyone got any pointers as to how I should progress please? I'm open to discussion and ideas...
I'm using SQL 2005 and apologies in advance if the tagging is all over the place!
Cheers,
Jon
Would it work by assuming that all persons with the same surname and phone number (Do all persons have a phone?) were the same person?
INSERT INTO newtable <fieldnames>
SELECT lastname,phone,max(field3),max(field4)....
FROM oldtable
GROUP BY lastname,phone
But that would collapse John Smith and Jack Smith living together into one person.
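Both the merge and its pitfall can be seen in a small sketch. It uses Python's sqlite3 module rather than SQL Server 2005, only a handful of the 90 columns, and assumes the blank cells are stored as empty strings; the two Smith rows are hypothetical. NULLIF turns empty strings into NULL so that MAX() keeps whichever row has a value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A cut-down Mailing table; the column selection is an assumption.
conn.execute(
    "CREATE TABLE Mailing (Title TEXT, Forename TEXT, Surname TEXT, "
    "Phone TEXT, Age TEXT, Income TEXT)"
)
conn.executemany(
    "INSERT INTO Mailing VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Mrs", "D", "Andrews", "123456", "66-70", ""),
        ("Ms", "Diane", "Andrews", "123456", "", "£25-40"),
        ("Mr", "John", "Smith", "555000", "", ""),  # hypothetical
        ("Mr", "Jack", "Smith", "555000", "", ""),  # hypothetical
    ],
)

# Treat surname + phone as "the same person" and keep one value per column.
merged = conn.execute(
    """SELECT MAX(NULLIF(Forename, '')) AS Forename,
              Surname,
              Phone,
              MAX(NULLIF(Age, ''))    AS Age,
              MAX(NULLIF(Income, '')) AS Income,
              COUNT(*)                AS merged_rows
       FROM Mailing
       GROUP BY Surname, Phone
       ORDER BY Surname"""
).fetchall()

for row in merged:
    print(row)
```

The Andrews rows merge as intended, but John and Jack Smith are folded into a single "John" row. The merged_rows count at least makes such groups easy to list for review before anything is deleted.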
Perhaps you should consider outsourcing it to a data-entry sweatshop somewhere, after you have preprocessed the data. :-)
And/or be prepared to take the flak for mistaken bundling.
Perhaps adding something like "To improve our green footprint, we have merged x listings at your address together. If you would like separate mailings, please contact us"