How to implement a matrix concept in SQL Server

How to implement a matrix concept in SQL Server - sql

I am stuck in this concept of creating a matrix in SQL Server where it is created in Excel. I couldn't find good answer online. There are room numbers as the first row and on the first column there are functional requirements. So for example when there is a camera needed in one of the rooms,I will place X mark in the desired row and col coordinate to indicate that it contains one.I attached an sample of the Excel to explain better. Excel Matrix.png

Rather than having multiple columns for every possible functional requirement, use proper relational methods for a many-to-may relationship:
Rooms
------
Id
RoomName
Functions
---------
Id
FunctionName
RoomFunctions
-------------
RoomId
FunctionId
Then you can relate one room to a variable number of functions, and can add functions easily without changing your data structure.

Without having data to work with, it's hard to give you an example.
With that said, the pivot method may help you out. You can just have dummy column with a 1 or 0 based on whether or not it has an 'X' in your data. Then in the pivot you would just do a max on that for the various values.
It may require massaging your data into a better format, but should be doable.

Related

SQL DB2 - How to SELECT or compare columns based on their name?

Thank you for checking my question out!
I'm trying to write a query for a very specific problem we're having at my workplace and I can't seem to get my head around it.
Short version: I need to be able to target columns by their name, and more specifically by a part of their name that will be consistent throughout all the columns I need to combine or compare.
More details:
We have (for example), 5 different surveys. They have many questions each, but SOME of the questions are part of the same metric, and we need to create a generic field that keeps it. There's more background to the "why" of that, but it's pretty important for us at this point.
We were able to kind of solve this with either COALESCE() or CASE statements but the challenge is that, as more surveys/survey versions continue to grow, our vendor inevitably generates new columns for each survey and its questions.
Take this example, which is what we do currently and works well enough:
CASE
WHEN SURVEY_NAME = 'Service1' THEN SERV1_REC
WHEN SURVEY_NAME = 'Notice1' THEN FNOL1_REC
WHEN SURVEY_NAME = 'Status1' THEN STAT1_REC
WHEN SURVEY_NAME = 'Sales1' THEN SALE1_REC
WHEN SURVEY_NAME = 'Transfer1' THEN Null
ELSE Null
END REC
And also this alternative which works well:
COALESCE(SERV1_REC, FNOL1_REC, STAT1_REC, SALE1_REC) as REC
But as I mentioned, eventually we will have a "SALE2_REC" for example, and we'll need them BOTH on this same statement. I want to create something where having to come into the SQL and make changes isn't needed. Given that the columns will ALWAYS be named "something#_REC" for this specific metric, is there any way to achieve something like:
COALESCE(all columns named LIKE '%_REC') as REC
Bonus! Related, might be another way around this same problem:
Would there also be a way to achieve this?
SELECT (columns named LIKE '%_REC') FROM ...
Thank you very much in advance for all your time and attention.
-Kendall

Table and column information in Db2 are managed in the system catalog. The relevant views are SYSCAT.TABLES and SYSCAT.COLUMNS. You could write:
select colname, tabname from syscat.tables
where colname like some_expression
and syscat.tabname='MYTABLE
Note that the LIKE predicate supports expressions based on a variable or the result of a scalar function. So you could match it against some dynamic input.
Have you considered storing the more complicated properties in JSON or XML values? Db2 supports both and you can query those values with regular SQL statements.

Transpose SQL table when the value is text

I have a table containing a series of survey responses, structured like this:
Question Category | Question Number | Respondent ID | Answer
This seemed the most logical storage for the data. The combination of Question Category, Question Number, and Respondent ID is unique.
Problem is, I've been asked for a report with the Respondent ID as the columns, with Answer as the rows. Since Answer is free-text, the numeric-expecting PIVOT command doesn't help. It would be great if each row was a specific Question Category / Question Number pairing so that all of the information is displayed in a single table.
Is this possible? I'm guessing a certain amount of dynamic SQL will be required, especially with the expected 50-odd surveys to display.

I think this task should be done by your client code - trying to do this transposing on SQL side is not very good idea. Such SQL (even if it can be constructed) will likely be extremely complicated and fragile.
First of all, you should count how many distinct answers are there - you probably don't want to create report 1000 columns wide if all answers are different. Also, you probably want to make sure that answers are narrow - what if someone gave really bad 1KB wide answer?
Then, you should construct your report (would that be HTML or something else) dynamically based on results of your standard, non-transposed SQL.
For example, in HTML you can create as many columns as you want using <th>column</th> for table header and <td>value</td> for data cell - as long as you know already how many columns are going to be in your output result.

To me, it seems that the problem is the number of columns. You don't know how many respondents there are.
One idea would be to concatenate the respondent ids. You can do this in SQL Server as:
select distinct Answer,
(select cast(RespondentId as varchar(255))+'; '
from Responses r2
where r2.Answer = r.Answer
for xml path ('')
) AllResponders
from Responses r
(This is untested so may have syntax errors.)

If reporting services or excel power pivot are possibilities for the report then they could both probably accomplish what you want easier than a straight sql query. In RS you can use a tablix, and in power pivot a pivot table. Both avoid having to define your pivot columns in an sql pivot statement, and both can dynamically determine the column names from a tabular result set.

How to get multi row data of one column to one row of one Column

I need to get data in multiple row of one column.
For example data from that format
ID Interest
Sports
Cooking
Movie
Reading
to that format
ID Interest
Sports,Cooking
Movie,Reading
I wonder that we can do that in MS Access sql. If anybody knows that, please help me on that.

Take a look at Allen Browne's approach: Concatenate values from related records
As for the normalization argument, I'm not suggesting you store concatenated values. But if you want to join them together for display purposes (like a report or form), I don't think you're violating the rules of normalization.

This is called de-normalizing data. It may be acceptable for final reporting. Apparently some experts believe it's good for something, as seen here.
(Mind you, kevchadder's question is right on.)

Have you looked into the SQL Pivot operation?
Take a look at this link:
http://technet.microsoft.com/en-us/library/ms177410.aspx
Just noticed you're using access. Take a look at this article:
http://www.blueclaw-db.com/accessquerysql/pivot_query.htm

This is nothing you should do in SQL and it's most likely not possible at all.
Merging the rows in your application code shouldn't be too hard.

How best to sum multiple boolean values via SQL?

I have a table that contains, among other things, about 30 columns of boolean flags that denote particular attributes. I'd like to return them, sorted by frequency, as a recordset along with their column names, like so:
Attribute Count
attrib9 43
attrib13 27
attrib19 21
etc.
My efforts thus far can achieve something similar, but I can only get the attributes in columns using conditional SUMs, like this:
SELECT SUM(IIF(a.attribIndex=-1,1,0)), SUM(IIF(a.attribWorkflow =-1,1,0))...
Plus, the query is already getting a bit unwieldy with all 30 SUM/IIFs and won't handle any changes in the number of attributes without manual intervention.
The first six characters of the attribute columns are the same (attrib) and unique in the table, is it possible to use wildcards in column names to pick up all the applicable columns?
Also, can I pivot the results to give me a sorted two-column recordset?
I'm using Access 2003 and the query will eventually be via ADODB from Excel.

This depends on whether or not you have the attribute names anywhere in data. If you do, then birdlips' answer will do the trick. However, if the names are only column names, you've got a bit more work to do--and I'm afriad you can't do it with simple SQL.
No, you can't use wildcards to column names in SQL. You'll need procedural code to do this (i.e., a VB Module in Access--you could do it within a Stored Procedure if you were on SQL Server). Use this code build the SQL code.
It won't be pretty. I think you'll need to do it one attribute at a time: select a string whose value is that attribute name and the count-where-True, then either A) run that and store the result in a new row in a scratch table, or B) append all those selects together with "Union" between them before running the batch.
My Access VB is more than a bit rusty, so I don't trust myself to give you anything like executable code....

Just a simple count and group by should do it
Select attribute_name
, count(*)
from attribute_table
group by attribute_name
To answer your comment use Analytic Functions for that:
Select attribute_table.*
, count(*) over(partition by attribute_name) cnt
from attribute_table

In Access, Cross Tab queries (the traditional tool for transposing datasets) need at least 3 numeric/date fields to work. However since the output is to Excel, have you considered just outputting the data to a hidden sheet then using a pivot table?

Database : best way to model a spreadsheet

I am trying to figure out the best way to model a spreadsheet (from the database point of view), taking into account :
The spreadsheet can contain a variable number of rows.
The spreadsheet can contain a variable number of columns.
Each column can contain one single value, but its type is unknown (integer, date, string).
It has to be easy (and performant) to generate a CSV file containing the data.
I am thinking about something like :
class Cell(models.Model):
column = models.ForeignKey(Column)
row_number = models.IntegerField()
value = models.CharField(max_length=100)
class Column(models.Model):
spreadsheet = models.ForeignKey(Spreadsheet)
name = models.CharField(max_length=100)
type = models.CharField(max_length=100)
class Spreadsheet(models.Model):
name = models.CharField(max_length=100)
creation_date = models.DateField()
Can you think about a better way to model a spreadsheet ? My approach allows to store the data as a String. I am worried about it being too slow to generate the CSV file.

from a relational viewpoint:
Spreadsheet <-->> Cell : RowId, ColumnId, ValueType, Contents
there is no requirement for row and column to be entities, but you can if you like

Databases aren't designed for this. But you can try a couple of different ways.
The naiive way to do it is to do a version of One Table To Rule Them All. That is, create a giant generic table, all types being (n)varchars, that has enough columns to cover any forseeable spreadsheet. Then, you'll need a second table to store metadata about the first, such as what Column1's spreadsheet column name is, what type it stores (so you can cast in and out), etc. Then you'll need triggers to run against inserts that check the data coming in and the metadata to make sure the data isn't corrupt, etc etc etc. As you can see, this way is a complete and utter cluster. I'd run screaming from it.
The second option is to store your data as XML. Most modern databases have XML data types and some support for xpath within queries. You can also use XSDs to provide some kind of data validation, and xslts to transform that data into CSVs. I'm currently doing something similar with configuration files, and its working out okay so far. No word on performance issues yet, but I'm trusting Knuth on that one.
The first option is probably much easier to search and faster to retrieve data from, but the second is probably more stable and definitely easier to program against.
It's times like this I wish Celko had a SO account.

You may want to study EAV (Entity-attribute-value) data models, as they are trying to solve a similar problem.
Entity-Attribute-Value - Wikipedia

The best solution greatly depends of the way the database will be used. Try to find a couple of top use cases you expect and then decide the design. For example if there is no use case to get the value of a certain cell from database (the data is always loaded at row level, or even in group of rows) then is no need to have a 'cell' stored as such.

That is a good question that calls for many answers, depending how you approach it, I'd love to share an opinion with you.
This topic is one the various we searched about at Zenkit, we even wrote an article about, we'd love your opinion on it: https://zenkit.com/en/blog/spreadsheets-vs-databases/

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas