Is it possible to map values onto a table given corresponding row and column indices in SQL?

I have a SQL table in the form of:
| value | row_loc | column_loc |
|-------|---------|------------|
| a | 0 | 1 |
| b | 1 | 1 |
| c | 1 | 0 |
| d | 0 | 0 |
I would like to find a way to map it onto a table/grid, given the indices, using SQL. Something like:
| d | a |
| c | b |
(The context being, I would like to create a colour map with colours corresponding to values a, b, c, d, in the locations specified)
I would be able to do this iteratively in python, but cannot figure out how to do it in SQL, or if it is even possible! Any help or guidance on this problem would be greatly appreciated!
EDIT: a, b, c, d are examples of numeric values (which could not be selected using named variables in practice, so I'm relying on selecting them by location). Also worth noting: the number of rows and columns will always be the same. The value column is not the primary key of this table, so it is not necessarily unique; it is just a continuous value.

Yes, it is possible, assuming the number of columns is limited, since a SQL query returns a fixed, predetermined number of columns. The number of rows in the result set depends on the number of distinct row_loc values, so we group by row_loc and then pick each cell's value with a simple CASE expression.
with t (value, row_loc, column_loc) as (
select 'a', 0, 1 from dual union all
select 'b', 1, 1 from dual union all
select 'c', 1, 0 from dual union all
select 'd', 0, 0 from dual
)
select max(case column_loc when 0 then value else null end) as column0
, max(case column_loc when 1 then value else null end) as column1
from t
group by row_loc
order by row_loc
I tested it on Oracle. I was not sure what to do if multiple values match the same coordinate, so I chose max. Other vendors also offer special clauses such as count ... filter (where ...), and the Oracle pivot clause can be used as well.
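As a rough sketch of the same idea outside Oracle, the snippet below runs the conditional-aggregation query against an in-memory SQLite database from Python and builds one MAX(CASE ...) column per column index. SQLite, the table name t, and the generated column0/column1 aliases are assumptions made for illustration only; the asker's real database and schema may differ.

```python
import sqlite3

# Hypothetical in-memory table mirroring the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (value TEXT, row_loc INTEGER, column_loc INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 0, 1), ("b", 1, 1), ("c", 1, 0), ("d", 0, 0)],
)

# Read the number of grid columns from the data itself.
(n_cols,) = conn.execute("SELECT MAX(column_loc) + 1 FROM t").fetchone()

# One MAX(CASE ...) expression per column index. The indices are integers
# computed here, not user input, so formatting them into the SQL is safe.
select_list = ",\n       ".join(
    f"MAX(CASE column_loc WHEN {i} THEN value END) AS column{i}"
    for i in range(n_cols)
)
query = f"SELECT {select_list}\nFROM t\nGROUP BY row_loc\nORDER BY row_loc"

grid = conn.execute(query).fetchall()
for row in grid:
    print(row)
```

Generating the select list in the client is one way around the fixed-column limitation: the query text still has a fixed number of columns, but it is rebuilt whenever the grid size changes.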

Related

How to create a table to count with a conditional

I have a database with a lot of columns with pass, fail, blank indicators
I want to create a function to count each type of value and create a table from the counts. The structure I am thinking is something like
| Value | x | y | z |
|-------|------------------|------------------|------------------|
| pass | count if x=pass | count if y=pass | count if z=pass |
| fail | count if x=fail | count if y=fail | count if z=fail |
| blank | count if x=blank | count if y=blank | count if z=blank |
| total | count(x) | count(y) | count(z) |
where x,y,z are columns from another table.
I don't know which could be the best approach for this
thank you all in advance
I tried this structure but it shows syntax error
CREATE FUNCTION Countif (columnx nvarchar(20),value_compare nvarchar(10))
RETURNS Count_column_x AS
BEGIN
IF columnx=value_compare
count(columnx)
END
RETURN
END
Also, I don't know how to add each count to the actual table I am trying to create
Conditional counting (or any conditional aggregation) can often be done inline by placing a CASE expression inside the aggregate function that conditionally returns the value to be aggregated or a NULL to skip.
An example would be COUNT(CASE WHEN SelectMe = 1 THEN 1 END). Here the aggregated value is 1 (which could be any non-null value for COUNT()). For other aggregate functions, a more meaningful value would be provided. The implicit ELSE returns a NULL, which is not counted.
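To make the NULL-skipping behaviour concrete, here is a minimal sketch that runs that expression in an in-memory SQLite database from Python. The demo table and its rows are made up for illustration; only the COUNT(CASE ...) expression comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a single flag column, including one NULL.
conn.execute("CREATE TABLE demo (SelectMe INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?)", [(1,), (0,), (1,), (None,)])

# COUNT(*) counts every row; COUNT(CASE ...) counts only rows where the
# CASE yields a non-NULL value, i.e. where SelectMe = 1.
total, selected = conn.execute(
    "SELECT COUNT(*), COUNT(CASE WHEN SelectMe = 1 THEN 1 END) FROM demo"
).fetchone()
print(total, selected)
```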
For your problem, I believe the first thing to do is to UNPIVOT your data, placing the column names and values side by side. You can then group by value and use conditional aggregation as described above to calculate your results. A few more details finish the job: (1) a totals row using WITH ROLLUP, (2) a CASE expression to adjust the labels for the blank and total rows, and (3) some ORDER BY tricks to get the rows in the right order.
The results may be something like:
SELECT
CASE
WHEN GROUPING(U.Value) = 1 THEN 'Total'
WHEN U.Value = '' THEN 'Blank'
ELSE U.Value
END AS Value,
COUNT(CASE WHEN U.Col = 'x' THEN 1 END) AS x,
COUNT(CASE WHEN U.Col = 'y' THEN 1 END) AS y
FROM #Data D
UNPIVOT (
Value
FOR Col IN (x, y)
) AS U
GROUP BY U.Value WITH ROLLUP
ORDER BY
GROUPING(U.Value),
CASE U.Value WHEN 'Pass' THEN 1 WHEN 'Fail' THEN 2 WHEN '' THEN 3 ELSE 4 END,
U.VALUE
Sample data:
x
y
Pass
Pass
Pass
Fail
Pass
Fail
Sample results:
| Value | x | y |
|-------|---|---|
| Pass | 3 | 1 |
| Fail | 1 | 1 |
| Blank | 0 | 2 |
| Total | 4 | 4 |
See this db<>fiddle for a working example.
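For engines without UNPIVOT or WITH ROLLUP, the same shape of result can be approximated with UNION ALL. The sketch below does this in SQLite from Python; it is a swapped-in variant, not the SQL Server query above. The data table and its four rows are hypothetical, chosen so that the counts match the sample results, and '' stands for a blank indicator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (x TEXT, y TEXT)")
# Hypothetical rows; '' represents a blank indicator.
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("Pass", "Pass"), ("Pass", "Fail"), ("Pass", ""), ("Fail", "")],
)

query = """
WITH u(col, value) AS (          -- manual unpivot: one row per (column, value)
    SELECT 'x', x FROM data
    UNION ALL
    SELECT 'y', y FROM data
),
counted AS (                     -- conditional counts per distinct value
    SELECT CASE WHEN value = '' THEN 'Blank' ELSE value END AS label,
           COUNT(CASE WHEN col = 'x' THEN 1 END) AS x,
           COUNT(CASE WHEN col = 'y' THEN 1 END) AS y
    FROM u
    GROUP BY value
)
SELECT label, x, y FROM counted
UNION ALL                        -- totals row, standing in for WITH ROLLUP
SELECT 'Total', SUM(x), SUM(y) FROM counted
"""
counts = {label: (x, y) for label, x, y in conn.execute(query)}
print(counts)
```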
I don't think you need a generic solution like a function that takes the value as a parameter.
Perhaps you could create a view that groups your data, and then query that view, filtering by your value.
The view body would be something like this:
select value, count(*) as Total
from table_name
group by value
Feel free to explain your situation in more detail so I can help further.
You can do this by grouping by the status column.
select status, count(*) as total
from some_table
group by status
Rather than making a whole new table, consider using a view. This is a query that looks like a table.
create view status_counts as
select status, count(*) as total
from some_table
group by status
You can then select total from status_counts where status = 'pass' or the like and it will run the query.
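A small sketch of how such a view behaves, using SQLite from Python as a stand-in engine (an assumption; the thread is about SQL Server). The some_table rows are invented. The point it illustrates is that a plain view stores no data: each select against it re-runs the grouping query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (status TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?)", [("pass",), ("fail",), ("pass",)]
)

# The view is just a named query over some_table.
conn.execute(
    "CREATE VIEW status_counts AS "
    "SELECT status, COUNT(*) AS total FROM some_table GROUP BY status"
)

pass_query = "SELECT total FROM status_counts WHERE status = 'pass'"
(before,) = conn.execute(pass_query).fetchone()

# New rows are reflected the next time the view is queried.
conn.execute("INSERT INTO some_table VALUES ('pass')")
(after,) = conn.execute(pass_query).fetchone()
print(before, after)
```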
You can also create a "materialized view". This is like a view, but the results are written to a real table. The CREATE MATERIALIZED VIEW syntax below is the Azure Synapse Analytics dialect of T-SQL, which keeps this table up to date for you; in SQL Server itself the closest equivalent is an indexed view.
create materialized view status_counts with (distribution = hash(status)) as
select status, count(*) as total
from some_table
group by status
You'd do this for performance reasons on a large table which does not update very often.

How to create a funnel visual/bar chart in Tableau by creating a calculated field using an existing column in the data source?

In my data source, there's a column called 'Pool'
Within that column, there are about 3 values:
| Pool |
| C |
| B |
| C |
| A |
So as you can see, there are 3 distinct values, A, B, C. I want to create a funnel, or essentially a bar chart that will calculate each and count them in the whole column for each of those three values. However, I know I can't just place the column itself in the sheet since I also want to have a fourth bar that counts all the values as a "All" category.
So eventually having a visual that states (but this is in tabular form to help illustrate what I mean)
All | 20
A | 10
B | 5
C | 5
Please find an indicative answer in fiddle
You could use UNION to combine two results: one that brings the COUNT for each of your values, and one that COUNTs all your samples.
(SELECT Pool, COUNT(Pool) AS your_count
FROM your_table
GROUP BY Pool)
UNION
(SELECT 'ALL', COUNT(*) AS your_count
FROM your_table)
ORDER BY your_count DESC
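The following sketch runs a variant of that query on the four Pool values shown in the question, using SQLite from Python. SQLite, the label alias, the tie-breaking sort on label, and the use of UNION ALL instead of UNION are choices made here for a deterministic illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Pool TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?)", [("C",), ("B",), ("C",), ("A",)]
)

query = """
SELECT Pool AS label, COUNT(*) AS your_count
FROM your_table
GROUP BY Pool
UNION ALL                      -- the two result sets never overlap here
SELECT 'All', COUNT(*)
FROM your_table
ORDER BY your_count DESC, label
"""
bars = conn.execute(query).fetchall()
for label, n in bars:
    print(label, n)
```

The resulting rows can then be used as a custom SQL data source, with label on one axis and your_count on the other.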

How do I select rows where only return keys that don't have '1' in column c

Title is confusing I know, I'm just not sure how to word this. Anyway let me describe with a table:
| key | column b | column c |
|-----|----------|----------|
| a | 13 | 2 |
| a | 14 | 2 |
| a | 15 | 1 |
| b | 16 | 2 |
| b | 17 | 2 |
I'd like to select all keys where column c doesn't equal 1, so the select will result in returning only key 'b'
To clarify, my result set should not contain keys that have a row where column c is set to 1. Therefore I'd like a sql query that would return the keys that satisfy the previous statement.
To make my question as clear as possible. From the table above, what I want returned by some sql statement is a result set containing [{b}] based on the fact that key 'a' has at least one row where column c is equal to 1 whereas key 'b' does not have any rows that contain 1 in column c.
SELECT t.[Key]
FROM TableName t
WHERE NOT EXISTS (SELECT 1
FROM TableName
WHERE t.[key] = [key]
AND ColumnC = 1)
GROUP BY t.[Key]
SELECT KEY
FROM WhateverYourTableNameIs
WHERE c <> '1'
I would do this using group by and aggregation:
select [key]
from table t
group by [key]
having sum(case when c = 1 then 1 else 0 end) = 0;
The having clause counts the number of rows that have c = 1. The = 0 says that there are no such rows for a given key.
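A quick check of the approaches on the question's sample rows, sketched with SQLite from Python (the table name t and the double-quoted "key" column are assumptions for this illustration). It also shows why a plain row filter on c is not enough: it still returns key 'a', because some of its rows have c = 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("key" TEXT, b INTEGER, c INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 13, 2), ("a", 14, 2), ("a", 15, 1), ("b", 16, 2), ("b", 17, 2)],
)

# Aggregation: keep keys whose count of c = 1 rows is zero.
having = [r[0] for r in conn.execute(
    'SELECT "key" FROM t GROUP BY "key" '
    "HAVING SUM(CASE WHEN c = 1 THEN 1 ELSE 0 END) = 0"
)]

# Anti-join: keep keys for which no row with c = 1 exists.
not_exists = [r[0] for r in conn.execute(
    'SELECT DISTINCT o."key" FROM t AS o WHERE NOT EXISTS '
    '(SELECT 1 FROM t AS i WHERE i."key" = o."key" AND i.c = 1)'
)]

# Row-level filter: tests each row in isolation, so 'a' survives.
row_filter = sorted({r[0] for r in conn.execute('SELECT "key" FROM t WHERE c <> 1')})

print(having, not_exists, row_filter)
```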
Elaboration based on other comments:
You asked for ALL keys where column c doesn't equal 1. That is exactly what the query I suggested will give you. The other part of your question so the SELECT will result in returning only key 'b', is ambiguous. The question as asked will give you results from columns A and B. There is nothing in your question to limit the result set. You either need an additional condition to your WHERE clause, or your question is inherently unanswerable.

SQL group by and count fixed column values

I'm facing a problem in a data import script in SQL (MySQL) where I need to GROUP rows by type to COUNT how many rows there are of each type. So far, it isn't really a problem, because I know that I can do:
SELECT
data.type,
COUNT(data.type)
FROM data
GROUP BY data.type;
So, by doing it, I have the result:
-------------- ---------------------
| type | COUNT(data.type) |
|--------------|---------------------|
| 0 | 1 |
| 1 | 46 |
| 2 | 35 |
| 3 | 423 |
| 4 | 64 |
| 5 | 36 |
| 9 | 1 |
-------------- ---------------------
I know that in the type column the values will always be in the range from 0 to 9, like the above result. So, I would like to list not only the existing values in the table content but the missing type values too, with their COUNT value set to 0.
Based on the above query result, the expected result would be:
-------------- ---------------------
| type | COUNT(data.type) |
|--------------|---------------------|
| 0 | 1 |
| 1 | 46 |
| 2 | 35 |
| 3 | 423 |
| 4 | 64 |
| 5 | 36 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 1 |
-------------- ---------------------
As a trick, I could INSERT one row of each type before running the GROUP/COUNT (subtracting 1 from each count), flagging some other column on INSERT so that I can DELETE these rows afterwards. So, the steps of my import script would change to:
TRUNCATE table; (I can't safely import new content if there is old data in the table)
INSERT "control" rows;
LOAD DATA INFILE INTO TABLE;
GROUP/COUNT-1 the table content;
DELETE "control" rows; (So I can still work with the table content)
Do any other jobs;
But, I was looking for a cleaner way to reach the expected result. If possible, a single query, without a bunch of JOINs.
I would appreciate any suggestion or advice. Thank you very much!
EDIT
I would like to thank you for the answers about CREATEing a table that stores all the types and JOINing to it. It really solves the problem. My approach solves it too, but it does so by storing the types, as yours did.
So, I have "another" question, just a clarification, based on the received answers and my desired scope... is it possible to reach the expected result with some MySQL command that will not CREATE a new table and/or INSERT these types?
I don't actually see any problem with solving my question by storing the types... I would just like to find a simpler command... something like a 'best practice'... some kind of filter... so that I could run:
GROUP BY data.type(0,1,2,3,4,5,6,7,8,9)
and it could return these filtered values.
I am really interested to learn such a command, if it really exists/is possible.
And again, thank you very much!
Let's assume that you have a types table with all the valid types:
SELECT t.type,
COUNT(data.type)
FROM types t left join data on data.type = t.type
GROUP BY t.type
order by t.type
You should include the explicit order by and not depend on the group by to produce results in a particular order.
The easiest way is to create a table of all type values and then join on that table when getting the count:
select t.type,
count(d.type)
from types t
left join data d
on t.type = d.type
group by t.type
See SQL Fiddle with demo
Or you can use the following:
select t.type,
count(d.type)
from
(
select 0 type
union all
select 1
union all
select 2
union all
select 3
union all
select 4
union all
select 5
union all
select 6
union all
select 7
union all
select 8
union all
select 9
) t
left join data d
on t.type = d.type
group by t.type
See SQL Fiddle with Demo
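If writing out ten UNION ALL branches feels heavy, one alternative (a different technique from the answers above) is to generate the 0 to 9 list with a recursive common table expression and left join to it. The sketch below tries this in SQLite from Python; MySQL accepts WITH RECURSIVE from version 8.0, but that version requirement is an assumption about the asker's environment. The data rows are synthesized from the counts shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (type INTEGER)")

# Rebuild rows matching the counts from the question's first result.
observed = {0: 1, 1: 46, 2: 35, 3: 423, 4: 64, 5: 36, 9: 1}
conn.executemany(
    "INSERT INTO data VALUES (?)",
    [(t,) for t, n in observed.items() for _ in range(n)],
)

query = """
WITH RECURSIVE types(type) AS (
    SELECT 0                                   -- first type value
    UNION ALL
    SELECT type + 1 FROM types WHERE type < 9  -- stop after 9
)
SELECT t.type, COUNT(d.type) AS n              -- COUNT skips the NULLs
FROM types AS t
LEFT JOIN data AS d ON d.type = t.type
GROUP BY t.type
ORDER BY t.type
"""
counts = conn.execute(query).fetchall()
for type_value, n in counts:
    print(type_value, n)
```

No table is created and no control rows are inserted; the list of types exists only for the duration of the query.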
One option would be having a static numbers table with the values 0-9. Not sure if this is the most elegant approach, and if you were using SQL Server, I could think of another approach.
Try something like this:
SELECT
numbers.number,
COUNT(data.type)
FROM numbers
left join data
on numbers.number = data.type
GROUP BY numbers.number;
And the SQL Fiddle.
Okay... I think I found it! Thank you all!!! I'm accepting my own answer.
I agree with @GordonLinoff's comment that the best practice is to store the type values and describe them, so that you keep a concise, understandable database and queries.
But, as far as I've learned, if some data might be irrelevant information, it is preferable to handle it some other way than storing it.
So, I developed this query:
SELECT
SUM(IF(data.type = 0, 1, 0)) AS `0`,
SUM(IF(data.type = 1, 1, 0)) AS `1`,
SUM(IF(data.type = 2, 1, 0)) AS `2`,
SUM(IF(data.type = 3, 1, 0)) AS `3`,
SUM(IF(data.type = 4, 1, 0)) AS `4`,
SUM(IF(data.type = 5, 1, 0)) AS `5`,
SUM(IF(data.type = 6, 1, 0)) AS `6`,
SUM(IF(data.type = 7, 1, 0)) AS `7`,
SUM(IF(data.type = 8, 1, 0)) AS `8`,
SUM(IF(data.type = 9, 1, 0)) AS `9`
FROM data;
It is not the fastest, most optimized, or prettiest query, but for the size of data I'll handle (fewer than 100,000 rows per import) it "manually" does the GROUP/COUNT job, running in 0.13 sec on an ordinary developer machine.
It differs from my expected result just in the way rows and columns are selected - instead of 10 rows with 2 columns I've got 1 row with 10 columns, labeled with the matching type. Also, as we have a standardization to the type value (and we'll not change it for sure) which gives it a name and description, I'm now able to use the type name as the column label, instead of joining to a table with the types info to select a third column in the result (which really, is not that important as it's an importation script based on some standards).
Thank you all so much for the help!

How can I efficiently transfer data from a vertical database layout to a horizontal one?

I want to transfer data from a vertical db layout like this:
---------------------
| ID | Type | Value |
---------------------
| 1 | 10 | 111 |
---------------------
| 1 | 14 | 222 |
---------------------
| 2 | 10 | 333 |
---------------------
| 2 | 25 | 444 |
---------------------
to a horizontal one:
---------------------------------
| ID | Type10 | Type14 | Type25 |
---------------------------------
| 1 | 111 | 222 | |
---------------------------------
| 2 | 333 | | 444 |
---------------------------------
Creating the layout is not a problem, but the database is rather large, with millions of entries, and queries get canceled if they take too much time.
How can this be done efficiently (so that the query is not canceled)?
with t as
(
select 1 as ID, 10 as type, 111 as Value from dual
union
select 1, 14, 222 from dual
union
select 2, 10, 333 from dual
union
select 2, 25, 444 from dual
)
select ID,
max(case when type = 10 then Value else null end) as Type10,
max(case when type = 14 then Value else null end) as Type14,
max(case when type = 25 then Value else null end) as Type25
from t
group by id
This returns what you want, and I think it is the better way.
Note that the max function is only there to satisfy the group by clause; any aggregate function can be used here (such as sum or min).
Break it up into smaller chunks and don't wrap the whole thing in a single transaction. First, create the table, and then do groups of inserts from the old table into the new table. Insert by range of ID, for example, in small enough chunks that it won't overwhelm the database's log and take too long.
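The chunking idea can be sketched as a loop that pivots one ID range at a time and commits after each range. The example below uses SQLite from Python purely for illustration; the table names vertical and horizontal, the tiny CHUNK size, and the assumption that IDs are reasonably dense integers are all hypothetical, and a real chunk size would be tuned to the database's log and timeout limits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vertical (id INTEGER, type INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO vertical VALUES (?, ?, ?)",
    [(1, 10, 111), (1, 14, 222), (2, 10, 333), (2, 25, 444)],
)
conn.execute(
    "CREATE TABLE horizontal "
    "(id INTEGER PRIMARY KEY, type10 INTEGER, type14 INTEGER, type25 INTEGER)"
)
conn.commit()

CHUNK = 1  # hypothetical chunk size; far larger in practice

pivot_chunk = """
INSERT INTO horizontal (id, type10, type14, type25)
SELECT id,
       MAX(CASE WHEN type = 10 THEN value END),
       MAX(CASE WHEN type = 14 THEN value END),
       MAX(CASE WHEN type = 25 THEN value END)
FROM vertical
WHERE id >= ? AND id < ?      -- only this ID range
GROUP BY id
"""

first_id, last_id = conn.execute("SELECT MIN(id), MAX(id) FROM vertical").fetchone()
start = first_id
while start <= last_id:
    conn.execute(pivot_chunk, (start, start + CHUNK))
    conn.commit()             # one short transaction per chunk
    start += CHUNK

result = conn.execute("SELECT * FROM horizontal ORDER BY id").fetchall()
print(result)
```

Because each range is committed separately, a canceled or failed chunk can be retried from its starting ID without redoing the earlier ones.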
The vertical table -- also known as the Entity-Attribute-Value anti-pattern -- always becomes a problem, sometimes very shortly after it is put into practice. If you haven't done so already, check out what Joe Celko has to say about this tactic, and you'll see even more proof of how troublesome this approach is. I'll stop there, since you're the smart person who knew to come to this site, and not the guilty but well-intentioned party who perpetrated the EAV table in your database.
The options for dealing with this type of table are not pretty, and, as you've stated, they get worse/slower as the amount of data needed for production queries grows.
Build a declared global temporary table (DGTT) that is not logged and preserves committed rows, and use it to stage the horizontal version of the EAV table contents. DGTTs are good for this kind of data shoveling because they do not incur any logging overhead.
Employ the traditional CASE and MAX() groupings as shown in the previous recommendation. The problem is that the query changes every time a new TYPE is introduced into your EAV table.
Use DB2's SQL-XML publishing features to turn the vertical data into XML. Here's an example that works with the table and column names you provided:
WITH t(id, type, value) as (
VALUES (1,10,111), (1,14,222), (2,10,333), (2,25,444)
)
SELECT
XMLSERIALIZE( CONTENT
XMLELEMENT(NAME "outer",
XMLATTRIBUTES(id AS "id"),
XMLAGG(XMLELEMENT(NAME attr ,
XMLATTRIBUTES(type as "typeid"), value) ORDER BY type)
) AS VARCHAR(1024)
)
FROM t as t group by id;
The benefit of the SQL-XML approach is that any new values handled by the EAV table will not require a rewrite to the SQL that pivots the values.
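Another way to avoid hand-editing the CASE/MAX query each time a new type appears is to generate it from the distinct types currently in the table. This is a different technique from the SQL-XML approach, shown here as a sketch with SQLite from Python; the vertical table name and the TypeNN column naming are assumptions, and the type values are cast to int before being formatted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vertical (id INTEGER, type INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO vertical VALUES (?, ?, ?)",
    [(1, 10, 111), (1, 14, 222), (2, 10, 333), (2, 25, 444)],
)

# Discover which types exist right now.
types = [r[0] for r in conn.execute("SELECT DISTINCT type FROM vertical ORDER BY type")]

# Build one MAX(CASE ...) column per type; int() guards the formatted value.
columns = ", ".join(
    f"MAX(CASE WHEN type = {int(t)} THEN value END) AS Type{int(t)}" for t in types
)
query = f"SELECT id, {columns} FROM vertical GROUP BY id ORDER BY id"

cursor = conn.execute(query)
header = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(header)
print(rows)
```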