Transpose SQL table when the value is text - sql

I have a table containing a series of survey responses, structured like this:
Question Category | Question Number | Respondent ID | Answer
This seemed the most logical storage for the data. The combination of Question Category, Question Number, and Respondent ID is unique.
Problem is, I've been asked for a report with the Respondent ID as the columns, with Answer as the rows. Since Answer is free-text, the numeric-expecting PIVOT command doesn't help. It would be great if each row was a specific Question Category / Question Number pairing so that all of the information is displayed in a single table.
Is this possible? I'm guessing a certain amount of dynamic SQL will be required, especially with the expected 50-odd surveys to display.

I think this task should be done by your client code; doing the transposition on the SQL side is not a very good idea. Such SQL (even if it can be constructed) will likely be extremely complicated and fragile.
First of all, you should count how many distinct answers there are. You probably don't want to create a report 1000 columns wide if all answers are different. You also probably want to make sure that the answers are narrow: what if someone gave a really long, 1 KB answer?
Then you should construct your report (whether HTML or something else) dynamically, based on the results of your standard, non-transposed SQL.
For example, in HTML you can create as many columns as you want using <th>column</th> for the table header and <td>value</td> for each data cell, as long as you already know how many columns are going to be in your output.
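A rough sketch of that client-side approach, in Python. The sample rows, the function names (pivot_answers, to_html) and the column labels are all made up for illustration; the only assumption taken from the question is that each row is (Question Category, Question Number, Respondent ID, Answer) and that the combination is unique.

```python
from collections import defaultdict
from html import escape

# Hypothetical rows, shaped like the non-transposed query result:
# (question_category, question_number, respondent_id, answer)
rows = [
    ("General", 1, 101, "Yes, mostly"),
    ("General", 1, 102, "Not really"),
    ("General", 2, 101, "Too slow"),
    ("Pricing", 1, 102, "Fair"),
]

def pivot_answers(rows):
    """Return (respondent_ids, table) with one table row per question."""
    respondents = sorted({r[2] for r in rows})
    by_question = defaultdict(dict)
    for category, number, respondent, answer in rows:
        by_question[(category, number)][respondent] = answer
    table = [
        [category, number] + [answers.get(rid, "") for rid in respondents]
        for (category, number), answers in sorted(by_question.items())
    ]
    return respondents, table

def to_html(respondents, table):
    """Render the pivoted table; the number of columns is known by now."""
    header = "".join(
        f"<th>{escape(str(h))}</th>" for h in ["Category", "Number", *respondents]
    )
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in table
    )
    return f"<table><tr>{header}</tr>{body}</table>"

respondents, table = pivot_answers(rows)
html = to_html(respondents, table)
```

Missing answers come out as empty cells, and free-text answers are HTML-escaped before they reach the page.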

To me, it seems that the problem is the number of columns. You don't know how many respondents there are.
One idea would be to concatenate the respondent ids. You can do this in SQL Server as:
select distinct Answer,
       (select cast(RespondentId as varchar(255)) + '; '
        from Responses r2
        where r2.Answer = r.Answer
        for xml path ('')
       ) AllResponders
from Responses r
(This is untested so may have syntax errors.)

If Reporting Services or Excel Power Pivot are possibilities for the report, then either could probably accomplish what you want more easily than a straight SQL query. In RS you can use a tablix, and in Power Pivot a pivot table. Both avoid having to define your pivot columns in a SQL PIVOT statement, and both can dynamically determine the column names from a tabular result set.

Related

How to implement a matrix concept in SQL Server

I am stuck on the concept of creating a matrix in SQL Server like one that was created in Excel, and I couldn't find a good answer online. The room numbers are in the first row, and the functional requirements are in the first column. So, for example, when a camera is needed in one of the rooms, I place an X mark at the corresponding row and column to indicate that the room contains one. I attached a sample of the Excel sheet to explain better. Excel Matrix.png
Rather than having multiple columns for every possible functional requirement, use proper relational methods for a many-to-many relationship:
Rooms
------
Id
RoomName
Functions
---------
Id
FunctionName
RoomFunctions
-------------
RoomId
FunctionId
Then you can relate one room to a variable number of functions, and can add functions easily without changing your data structure.
Without having data to work with, it's hard to give you an example.
With that said, the pivot method may help you out. You can just have a dummy column with a 1 or 0 based on whether or not there is an 'X' in your data. Then in the pivot you would just take the max of that column for the various values.
It may require massaging your data into a better format, but should be doable.
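To make both suggestions concrete, here is a small sketch that stores the data in the three tables described above and then rebuilds the Excel-style matrix with a max over a per-room flag. It runs against an in-memory SQLite database from Python as a stand-in for SQL Server, and the sample rooms and functions are invented. It also uses MAX(CASE ...) rather than the PIVOT operator, which is a substitution on my part, not what the answer literally proposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rooms (Id INTEGER PRIMARY KEY, RoomName TEXT);
    CREATE TABLE Functions (Id INTEGER PRIMARY KEY, FunctionName TEXT);
    CREATE TABLE RoomFunctions (
        RoomId INTEGER REFERENCES Rooms(Id),
        FunctionId INTEGER REFERENCES Functions(Id),
        PRIMARY KEY (RoomId, FunctionId)
    );
    -- Hypothetical sample data
    INSERT INTO Rooms VALUES (1, '101'), (2, '102');
    INSERT INTO Functions VALUES (1, 'Camera'), (2, 'Projector');
    INSERT INTO RoomFunctions VALUES (1, 1), (2, 1), (2, 2);
""")

room_names = [r[0] for r in conn.execute("SELECT RoomName FROM Rooms ORDER BY Id")]

# One flag column per room: 'X' if any matching link row exists, '' otherwise.
columns = ", ".join(
    "MAX(CASE WHEN r.RoomName = ? THEN 'X' ELSE '' END)" for _ in room_names
)
query = f"""
    SELECT f.FunctionName, {columns}
    FROM Functions f
    LEFT JOIN RoomFunctions rf ON rf.FunctionId = f.Id
    LEFT JOIN Rooms r ON r.Id = rf.RoomId
    GROUP BY f.Id, f.FunctionName
    ORDER BY f.Id
"""
matrix = conn.execute(query, room_names).fetchall()
```

Adding a room or a function is then just an INSERT; the column list is rebuilt from the Rooms table each time the report runs.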

SQL - Return first instance of record when many instances are found in database

I am working with some inherited code and having trouble solving the following issue: I would like to be able to search our inventory database by a number of criteria, including hardware serial number and comments left about the hardware. I then would like to return a list of pieces of hardware in our inventory that match this search criteria.
The issue I am having is that there may be multiple comments for one piece of hardware, so when returning the list of hardware, I see multiple results for the same piece (because of the joining to the Events table). How can I display the record once for each piece of hardware instead of multiple records for each piece of hardware?
Here is a snippet of the SQL query minus all of the if statements containing search parameters and such:
SELECT
UPPER(Hardware.HardwareSerialNumber) AS HardwareSerialNumber,
UPPER(Hardware.HardwareName) AS HardwareName,
Hardware.HardwareFirstDeploymentDate,
Hardware.HardwareActive,
Hardware.HardwareAccountNumber,
Hardware.BillingAccountNumber,
Hardware.LastUpdated,
Hardware.Comments,
Events.EventComments
FROM
Hardware
LEFT JOIN
Events
ON
Hardware.HardwareSerialNumber = Events.HardwareSerialNumber
WHERE 1=1
Thank you in advance!
I read your question and the comments a few times and I think your question is perhaps simpler than we are making it. It sounds like you have the data you WANT but just can't figure out how to tease out subsets from it. I would suggest that the following approaches are possible.
GROUP
This is a solution on the client side. Since you have the data you want, order by serial number, then output based on the GROUP param of cfoutput. Something like:
<cfoutput group="serialnumber">
.... output a line item for the hardware.
<cfoutput group="comments">
... line item for each comment.
</cfoutput>
</cfoutput>
Of course if you just want to show the hardware piece itself, just do the OUTER cfoutput and skip the inner one.
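For readers not on ColdFusion, the same grouping idea can be sketched in Python with itertools.groupby. The rows below are invented and only carry three of the columns from the query (serial number, name, event comment); this is an illustration of the technique, not a drop-in replacement for cfoutput.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical joined rows: (serial_number, hardware_name, event_comment).
# A piece of hardware with no events has None from the LEFT JOIN.
rows = [
    ("SN001", "ROUTER A", "Replaced fan"),
    ("SN001", "ROUTER A", "Firmware updated"),
    ("SN002", "SWITCH B", None),
]

# groupby only merges adjacent rows, so the data must be ordered by the key,
# just as the query has to be ordered by serial number for cfoutput group.
rows.sort(key=itemgetter(0))

report = []
for serial, group in groupby(rows, key=itemgetter(0)):
    group = list(group)
    comments = [comment for _, _, comment in group if comment is not None]
    # One line item per piece of hardware, with its comments nested inside.
    report.append((serial, group[0][1], comments))
```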
SELECT DISTINCT QUERY OF A QUERY
Again, this is a client-side solution. Since you have the data you want, run a separate query of a query and select distinct the rows that are related to the hardware (sans comments).
<cfquery name="hardware" dbtype="query">
SELECT DISTINCT hardwareserialnumber, hardwarename
FROM qryHardware
</cfquery>
...where qryHardware in the example is the name of the query returned above. Then use this subquery as you need.
Split Queries
You might try just running 2 queries, one containing the hardware and the other the comments. If you need to search against the comments, use a subquery to figure out WHICH hardware serials you should be working with, then pull in the comments in a second query. Often people work really hard to get all the data into one query - which is a laudable goal and lets the DB do what it does best, but there are times when it is a bit of a wasted effort. Lineheader / lineitem or Item / comment can be one of those times IMO.
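A minimal sketch of the two-query idea, using Python with an in-memory SQLite database in place of the real inventory database. The table and column names follow the query in the question; the sample data and the search term are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Hardware (HardwareSerialNumber TEXT PRIMARY KEY, HardwareName TEXT);
    CREATE TABLE Events (HardwareSerialNumber TEXT, EventComments TEXT);
    -- Hypothetical sample data
    INSERT INTO Hardware VALUES ('SN001', 'Router A'), ('SN002', 'Switch B');
    INSERT INTO Events VALUES
        ('SN001', 'Replaced fan'),
        ('SN001', 'Fan noisy again'),
        ('SN002', 'Firmware updated');
""")

search = "%fan%"  # hypothetical search term against the comments

# Query 1: one row per piece of hardware. The subquery decides WHICH serials
# match, so multiple matching comments cannot duplicate the hardware row.
hardware = conn.execute("""
    SELECT HardwareSerialNumber, HardwareName
    FROM Hardware
    WHERE HardwareSerialNumber IN (
        SELECT HardwareSerialNumber FROM Events WHERE EventComments LIKE ?
    )
    ORDER BY HardwareSerialNumber
""", (search,)).fetchall()

# Query 2: pull the comments for just those serials.
serials = [row[0] for row in hardware]
placeholders = ", ".join("?" for _ in serials)
comments = conn.execute(
    "SELECT HardwareSerialNumber, EventComments FROM Events "
    f"WHERE HardwareSerialNumber IN ({placeholders}) ORDER BY rowid",
    serials,
).fetchall()
```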

Adding columns dynamically to a View or return from Stored Procedure

I've found a lot of bits and pieces of this, but I can't put them together. This is basically the idea of the table, where Name is a varchar, Date is a datetime, and Number is an int:
Name | Date   | Number
A    | 1-2-11 | 15
B    | 1-2-11 | 8
A    | 1-1-11 | 5
I'd like to create a view that looks like this
Name | 1-2-11 | 1-1-11
A    | 15     | 5
B    | 8      |
At first I was using a temp table, and appending each date row to it. I read on another forum that way was a major resource hog. Is that true? Is there a better way to do this?
I would combine dynamic SQL with a pivot as I mentioned in this answer.
You want to look into "cross-tab" or "pivot" statements. In SQL Server 2005 and up, it's PIVOT, but the syntax varies between platforms.
This is a very complex subject, particularly since you want to add columns to a view as your data grows over time. Besides your platform's documentation, check out the myriad other SO posts on the subject.
If the date column is a known set then you can use pivot in some cases.
It is often faster to use dynamic sql BUT this can be very dangerous so be wary.
To really know what the best solution is for your problem we would need some more information -- how much data -- how much variation is expected in the different columns, etc.
However, it is true, both PIVOT and Dynamic SQL will be faster than a temp table.
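To show what "dynamic SQL plus a pivot" amounts to, here is a sketch that first reads the distinct dates and then builds one output column per date. It runs from Python against in-memory SQLite, so it uses MAX(CASE ...) instead of T-SQL's PIVOT operator; the table name Readings and the ISO-formatted dates are assumptions for the example. The date values are bound as parameters, and the column aliases are quoted, which is the part where dynamic SQL gets dangerous if you are careless.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Readings (Name TEXT, Date TEXT, Number INTEGER);
    -- Sample rows from the question, with dates written in ISO form (assumed)
    INSERT INTO Readings VALUES
        ('A', '2011-01-02', 15),
        ('B', '2011-01-02', 8),
        ('A', '2011-01-01', 5);
""")

# Step 1: find out which columns the result needs.
dates = [
    row[0]
    for row in conn.execute("SELECT DISTINCT Date FROM Readings ORDER BY Date DESC")
]

def quote_identifier(name):
    """Quote a value for use as a column alias (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'

# Step 2: build one aggregate column per date. The date in the CASE test is a
# bound parameter; only the quoted alias is interpolated into the SQL text.
columns = ", ".join(
    f"MAX(CASE WHEN Date = ? THEN Number END) AS {quote_identifier(d)}"
    for d in dates
)
query = f"SELECT Name, {columns} FROM Readings GROUP BY Name ORDER BY Name"

cursor = conn.execute(query, dates)
header = [col[0] for col in cursor.description]
result = cursor.fetchall()
```

A view cannot grow columns on its own, so in practice the generated statement would be run from a stored procedure or from client code each time the report is needed.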
I would do it with Access or Excel instead of T-SQL.

Can scalar functions be applied before filtering when executing a SQL Statement?

I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.
Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is getting called on data that should have been filtered out by the WHERE clause. But it appears that the substring call is being applied before the filtering happens, so the query fails.
Here is an example of what I mean. Let's say we have two tables, each with 2 columns and having 2 rows and 1 row respectively. The first column in each is just an id. NAME is just a string, and NAME_LENGTH tells us how many characters in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.
NAMES: ID, NAME
1, "Peter"
2, "X"
LONG_NAMES: ID, NAME_LENGTH
1, 5
If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):
SELECT substring(NAME,1,len(NAME)-3)
FROM NAMES;
I would soon find out that this gives me an error, because when it reaches "X" it will try to use a negative number for the length in the substring call, and it will fail.
The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:
SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.
However, from what I can observe, SQL Server will sometimes try to calculate the substring expression for everything in the table, and then apply the join to filter out rows. Is this supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular database engine or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so, then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
Martin gave this link that pretty much explains what is going on - the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it, I will gladly accept that instead of this one.
I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.
TSQL divide by zero encountered despite no columns containing 0
EDIT: As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if I can make sense of it.
Joe Celko, who helped write early SQL standards, has posted something similar to this several times in various USENET newsgroups. (I'm skipping over the clauses that don't apply to your SELECT statement.) He usually said something like "This is how statements are supposed to act like they work". In other words, SQL implementations should behave exactly as if they performed these steps, without actually being required to perform each of them.
Build a working table from all of the table constructors in the FROM clause.
Remove from the working table those rows that do not satisfy the WHERE clause.
Construct the expressions in the SELECT clause against the working table.
So, following this, no SQL dbms should act like it evaluates functions in the SELECT clause before it acts like it applies the WHERE clause.
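As a toy model of those three steps (and nothing more than that), here is the question's example worked through in Python. The sql_substring helper is a hypothetical stand-in that rejects a negative length the way the question describes SUBSTRING failing; it is not how any engine is implemented.

```python
def sql_substring(value, start, length):
    """Hypothetical stand-in for SUBSTRING that rejects a negative length."""
    if length < 0:
        raise ValueError("invalid length parameter passed to substring")
    return value[start - 1:start - 1 + length]

# The two tables from the question.
names = [(1, "Peter"), (2, "X")]   # NAMES: ID, NAME
long_names = [(1, 5)]              # LONG_NAMES: ID, NAME_LENGTH

# Step 1: build the working table from the FROM clause (the inner join).
working = [
    (name_id, name)
    for name_id, name in names
    for long_id, _ in long_names
    if name_id == long_id
]

# Step 2: apply the WHERE clause (the example query has none).

# Step 3: only now evaluate the SELECT expressions against the working table.
result = [sql_substring(name, 1, len(name) - 3) for _, name in working]

# Evaluating the expression over every row of NAMES before the join has
# filtered anything hits "X" and fails, which is the behaviour the question
# reports.
try:
    [sql_substring(name, 1, len(name) - 3) for _, name in names]
    eager_failed = False
except ValueError:
    eager_failed = True
```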
In a recent posting, Joe expands the steps to include CTEs.
CJ Date and Hugh Darwen say essentially the same thing in chapter 11 ("Table Expressions") of their book A Guide to the SQL Standard. They also note that this chapter corresponds to the "Query Specification" section (sections?) in the SQL standards.
You are thinking about something called the query execution plan. It is based on query optimization rules, indexes, temporary buffers, and execution-time statistics. If you are using SQL Server Management Studio, there is a toolbar above the query editor where you can display the estimated execution plan, which shows how your query will be rearranged to gain speed. So if you have just used your Name table and it is in the buffer, the engine might first process that data and only then join it with the other table.
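Given that the plan may reorder things, the "case when" guard mentioned in the question is one defensive option: the length check sits in the same expression as the substring call. The sketch below only shows the shape of that guard. It runs on SQLite from Python, where the functions are spelled length/substr rather than len/substring and where a negative length does not raise an error, so it illustrates the construct rather than reproducing the SQL Server failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NAMES (ID INTEGER PRIMARY KEY, NAME TEXT);
    INSERT INTO NAMES VALUES (1, 'Peter'), (2, 'X');
""")

# The guard and the substring live in one expression, so the result does not
# depend on whether a join or WHERE predicate has already removed short names.
rows = conn.execute("""
    SELECT ID,
           CASE WHEN length(NAME) > 3
                THEN substr(NAME, 1, length(NAME) - 3)
           END AS TRIMMED
    FROM NAMES
    ORDER BY ID
""").fetchall()
```

Short names come back as NULL instead of causing an error; add a WHERE clause on top if they should be dropped from the result entirely.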

How to get multi row data of one column to one row of one Column

I need to combine data from multiple rows of one column into a single row.
For example, from this format
ID Interest
Sports
Cooking
Movie
Reading
to that format
ID Interest
Sports,Cooking
Movie,Reading
I wonder whether we can do that in MS Access SQL. If anybody knows how, please help me.
Take a look at Allen Browne's approach: Concatenate values from related records
As for the normalization argument, I'm not suggesting you store concatenated values. But if you want to join them together for display purposes (like a report or form), I don't think you're violating the rules of normalization.
This is called de-normalizing data. It may be acceptable for final reporting. Apparently some experts believe it's good for something, as seen here.
(Mind you, kevchadder's question is right on.)
Have you looked into the SQL Pivot operation?
Take a look at this link:
http://technet.microsoft.com/en-us/library/ms177410.aspx
Just noticed you're using access. Take a look at this article:
http://www.blueclaw-db.com/accessquerysql/pivot_query.htm
This is not something you should do in SQL, and it's most likely not possible at all.
Merging the rows in your application code shouldn't be too hard.
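For instance, a minimal sketch in Python. The ID values 1 and 2 are made up, since the IDs are not shown in the question, and merge_interests is just an illustrative name.

```python
from collections import defaultdict

# Hypothetical rows as read from the table: (ID, Interest)
rows = [
    (1, "Sports"),
    (1, "Cooking"),
    (2, "Movie"),
    (2, "Reading"),
]

def merge_interests(rows, sep=","):
    """Collapse the rows to one (ID, joined interests) pair per ID."""
    merged = defaultdict(list)
    for record_id, interest in rows:
        merged[record_id].append(interest)
    # dicts keep insertion order, so IDs appear in first-seen order
    return [(record_id, sep.join(values)) for record_id, values in merged.items()]

result = merge_interests(rows)
```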