Oracle: any way to transform data this way?

Is there any way to map the first table to the second table with an SQL query or, if too complicated, a PL/SQL block?
Original
-------------------------------------
| col1   | col2   | col3   | col4   |
-------------------------------------
| key    | case 1 | case 2 | case 3 |
| value1 | v1c1   | v1c2   | v1c3   |
| value2 | v2c1   | v2c2   | v2c3   |
-------------------------------------
Target
-----------------------------
| key    | case   | result |
-----------------------------
| value1 | case 1 | v1c1   |
| value1 | case 2 | v1c2   |
| value1 | case 3 | v1c3   |
| value2 | case 1 | v2c1   |
| value2 | case 2 | v2c2   |
| value2 | case 3 | v2c3   |
-----------------------------
The original table can have a variable number of columns, and 'key' is a hardcoded string that always appears in column 1 of the original table. No other row has 'key' in column 1, so that row is a unique pivot row.
Thank you

If dynamic SQL is allowed, then it is possible to have all your requirements fulfilled using one query:
SELECT col1 as "key"
,extractvalue(dbms_xmlgen.getXMLType('select "' || tc.Column_Name ||
'" as v from Original where col1 = ''key''')
,'/ROWSET/ROW/V') "case"
,extractvalue(dbms_xmlgen.getXMLType('select "' || tc.Column_Name ||
'" as v from Original where col1 = ''' ||
replace(col1, '''', '''''') || '''')
,'/ROWSET/ROW/V') "result"
FROM Original
,(SELECT Column_Name
FROM All_Tab_Columns tc
WHERE tc.Owner = 'YOURSCHEMA'
and tc.Table_Name = 'ORIGINAL'
and Column_Name != 'COL1'
ORDER BY tc.COLUMN_ID) tc
WHERE col1 != 'key'
ORDER BY "key"
,"case"
Some more details as requested:
dbms_xmlgen.getXMLType returns an XmlType instance which is basically the result of the supplied query string as XML.
The format is ROWSET for the root node and ROW for each row. Every column will be an element as well.
The two SELECTs that I am building each return only one value, and to make things easier I gave them the column alias "V" so that I know which value to pick from the XML.
extractValue is a function that returns the result of an XPath expression from an XmlType.
'/ROWSET/ROW/V' returns the first V node, from the first ROW node that resides under the root node ROWSET.
<ROWSET><ROW><V>Abc</V></ROW></ROWSET>
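For instance, here is a minimal, self-contained sketch of the same pattern; it runs against DUAL only (no table assumed) and returns the string Abc:
-- dbms_xmlgen wraps the query result as <ROWSET><ROW><V>Abc</V></ROW></ROWSET>,
-- and extractvalue pulls out the text of the first matching V node.
SELECT extractvalue(dbms_xmlgen.getXMLType('select ''Abc'' as v from dual'),
                    '/ROWSET/ROW/V') AS v
  FROM dual;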

The original table can have a variable number of columns
Really?
The straightforward way is to select and union the parts you want.
select col1 as key, 'case 1' as "case", col2 as result
from test
where col1 <> 'key'
union all
select col1 as key, 'case 2' as "case", col3 as result
from test
where col1 <> 'key'
union all
select col1 as key, 'case 3' as "case", col4 as result
from test
where col1 <> 'key'
Straightforward, but not dynamic.
Later . . .
Based on your comment . . . although I don't think it's necessary.
select col1 as key, (select col2 from test where col1='key') as "case", col2 as result
from test
where col1 <> 'key'
union all
select col1 as key, (select col3 from test where col1='key') as "case", col3 as result
from test
where col1 <> 'key'
union all
select col1 as key, (select col4 from test where col1='key') as "case", col4 as result
from test
where col1 <> 'key'
Oracle 11 also supports UNPIVOT, which I haven't used.
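For completeness, a rough sketch of what UNPIVOT could look like here (untested; the table and column names follow the queries above, and the case labels are hard-coded, so it still does not handle a variable number of columns):
select col1 as key, case_name as "case", result
from test
unpivot (result for case_name in (col2 as 'case 1',
                                  col3 as 'case 2',
                                  col4 as 'case 3'))
where col1 <> 'key'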

I don't know which parts can change, but this should be a start for you. If the column names can change (key, case 1, etc.) you will need another query to get the correct column names. If you have questions, feel free to ask:
declare
  v_query    VARCHAR2(5000);
  v_case     VARCHAR2(255);
  v_colcount PLS_INTEGER;
begin
  -- Get number of columns
  select count(*)
    into v_colcount
    from user_tab_columns
   where table_name = 'T1';
  -- Build case expression to get the correct value for the result column
  v_case := 'case';
  for i in 1 .. v_colcount - 1
  loop
    v_case := v_case || ' when rn = ' || to_char(i) || ' then col' || to_char(i + 1);
  end loop;
  v_case := v_case || ' end result';
  -- Build final query ("key" and "case" are quoted because they are reserved/keyword names)
  v_query := 'select col1 "key", ''case ''||rn "case", ' || v_case || '
  from t1
  cross join (
    select rownum rn
    from dual
    connect by level <= ' || to_char(v_colcount - 1) || '
  ) cj
  where col1 <> ''key''
  order by "key", "case"';
  -- Display query (would probably be replaced with an insert using execute immediate)
  dbms_output.put_line(v_query);
end;
This produces the following query (which assumes your original table is called t1):
select col1 "key", 'case '||rn "case", case when rn = 1 then col2 when rn = 2 then col3 when rn = 3 then col4 end result
  from t1
  cross join (
    select rownum rn
    from dual
    connect by level <= 3
  ) cj
  where col1 <> 'key'
  order by "key", "case"
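If the goal is to materialise the result rather than just display the statement, the dbms_output.put_line call could be replaced with something along these lines (TARGET and its columns key_col, case_col and result_col are made-up names for illustration, not part of the question):
  -- Hypothetical: run the generated statement to fill a pre-created table
  -- target(key_col, case_col, result_col) instead of printing the query text.
  execute immediate 'insert into target (key_col, case_col, result_col) ' || v_query;
  commit;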

Try this:
with data as
  (select level l from dual connect by level <= 3)
select col1,
       'case ' || l as "case",
       decode(l, 1, col2, 2, col3, 3, col4) as "values"
  from myTable, data
 where col1 <> 'key'
 order by 1, 2;
Cheers

Related

SQL - Create a formatted output with placeholder rows

For reasons of our IT department, I am stuck doing this entirely within an SQL query.
Simplified, I have this as an input table:
And I need to create this:
And I am just not sure where to start with this. In my normal C# way of thinking it's easy: Column1 is ordered; if the value in Col1 is new, add a new row to the output and put the contents of Column1 in it. Then, while the contents of the input Column1 are unchanged, keep adding the contents of Column2 to new rows.
In SQL... nope, I just cannot see the right way to start!
This is a presentation issue that can easily be handled in the application or presentation layer; in SQL it tends to be clunky. The goal of a database is not to render a UI but to store and retrieve data quickly and efficiently, so that it can serve as many clients as possible within the same hardware and software resource constraints.
A query that does this could look like:
with
y as (
  select col1, row_number() over (order by col1) as r1
  from (select distinct col1 from t) x
),
z as (
  select t.col1, y.r1, t.col2,
         row_number() over (partition by t.col1 order by t.col2) as r2
  from t
  join y on y.col1 = t.col1
)
select col1, col2
from (
  select col1, null as col2, r1, 0 as r2 from y
  union all
  select null, col2, r1, r2 from z
) w
order by r1, r2
As you see, it looks clunky and bloated.
You need a header row for each group, consisting of col1 and a null col2, plus all the rows of the table with null as col1.
You can do it with UNION ALL and conditional sorting:
select
  case when t.col2 is null then t.col1 end col1,
  t.col2
from (
  select col1, col2 from tablename
  union all
  select distinct col1, null from tablename
) t
order by
  t.col1,
  case when t.col2 is null then 1 else 2 end,
  t.col2
See the demo (written for MySQL, but it is standard SQL).
Results:
| col1 | col2  |
| ---- | ----- |
| SetA |       |
|      | BH101 |
|      | BH102 |
|      | BH103 |
| SetB |       |
|      | BH201 |
|      | BH202 |
|      | BH203 |
I agree, formatting should be done outside of SQL, but if you have no choice, here is some SQL Server code that will generate your output
select *
from (
  select top 100
    case
      when col2 is null then ' ' + col1
      else ''
    end as firstCol,
    IsNull(col2, '') as Col2
  from dbo.test t1
  group by col1, col2 with rollup
  order by col1, col2
) x
where x.firstCol is not null

Selecting multiple values into a single row - SQL Server

I need to merge a table with ID and various bit flags like this
-----------------
a1 | x |   | x |
-----------------
a1 |   | x |   |
-----------------
a1 |   |   |   |
-----------------
b2 | x |   |   |
-----------------
b2 |   |   |   |
-----------------
c3 | x | x | x |
into such form
-----------------
a1 | x | x | x |
-----------------
b2 | x |   |   |
-----------------
c3 | x | x | x |
The problem is that the data are joined by a kind of option ID; each option has a unique ID which is joined to a1, b2, and so on. When I try to SELECT with DISTINCT I get the results of the first table. I can do it with subqueries in the SELECT, but that is a weak solution for performance reasons.
Do you have any idea how to select and combine all these flags into a single row?
Use aggregation:
select col1, max(col2), max(col3), max(col4)
from table_name
group by col1
For the given result set you can use MIN and GROUP BY:
SELECT
    tbl.Col
  , MIN(tbl.Col1) Col1
  , MIN(tbl.Col2) Col2
  , MIN(tbl.Col3) Col3
FROM @table tbl
GROUP BY tbl.Col
However, if the 'empty' cells hold empty strings rather than NULLs, use MAX(); otherwise MIN() returns the empty strings:
SELECT
    tbl.Col
  , MAX(tbl.Col1) Col1
  , MAX(tbl.Col2) Col2
  , MAX(tbl.Col3) Col3
FROM @table tbl
GROUP BY tbl.Col
For example:
DECLARE @table TABLE
(
    Col  VARCHAR(50),
    Col1 VARCHAR(50),
    Col2 VARCHAR(50),
    Col3 VARCHAR(50)
)
INSERT INTO @table (Col, Col1, Col2, Col3)
VALUES
  ('a1', 'x',  NULL, 'x')
, ('a1', NULL, 'x',  NULL)
, ('a1', NULL, 'x',  NULL)
, ('b2', 'x',  NULL, NULL)
, ('b2', NULL, NULL, NULL)
, ('c3', 'x',  'x',  'x')
SELECT
    tbl.Col
  , MIN(tbl.Col1) Col1
  , MIN(tbl.Col2) Col2
  , MIN(tbl.Col3) Col3
FROM @table tbl
GROUP BY tbl.Col
OUTPUT:
Col  Col1  Col2  Col3
a1   x     x     x
b2   x     NULL  NULL
c3   x     x     x
You want aggregation:
select col1, max(col2), max(col3), max(col4)
from table_name t
group by col1;
This assumes the blank values are NULL.
The general solution for such a situation is to simply aggregate and either use MIN or MAX on the columns.
SQL Server's data type BIT, however, is quirky. It's a little like a BOOLEAN, but not a real boolean. It is a little like a very limited numeric data type, but it isn't really a numeric type either. And there simply exist no aggregation functions for this data type. In standard SQL you'd have ANY and EVERY for the BOOLEAN type. In PostgreSQL you have BIT_OR and BIT_AND for BIT and BOOL_OR and BOOL_AND for BOOLEAN. SQL Server has nothing.
So convert your columns to a numeric type before applying MIN (which then acts as a logical AND over the group) or MAX (which acts as a logical OR). E.g.
select
  id,
  max(bit1 + 0) as bit1agg,
  max(bit2 + 0) as bit2agg,
  max(bit3 + 0) as bit3agg
from mytable
group by id
order by id;
You can also use CAST or CONVERT instead of course.
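For instance, the same aggregation written with an explicit CAST, against the same hypothetical mytable:
-- Same idea as above, converting the BIT columns explicitly instead of adding 0.
select
  id,
  max(cast(bit1 as tinyint)) as bit1agg,
  max(cast(bit2 as tinyint)) as bit2agg,
  max(cast(bit3 as tinyint)) as bit3agg
from mytable
group by id
order by id;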

Update column value in all rows of a table on mod(rownum,10) = number

I have a table tab1 that looks like this:
col1 | col2 | col3
-----|------|------
abc  | 100  | text
abc  | 100  | text
abc  | 100  | text
...  | ...  | ...
I need to update col2 value in each row like this:
update tab1
set col2 = 1,23
when mod(rownum,10) = 1;
update tab1
set col2 = 12,34
when mod(rownum,10) = 2;
update tab1
set col2 = 123,45
when mod(rownum,10) = 3;
and so on, up to mod(rownum,10) = 9.
But obviously this query doesn't work, and the reason, as far as I know, is that rownum always returns 1 in this situation. However, I do get the correct last digit for each row number with the query select mod(rownum,10) as lastDig from tab1; I just don't understand how to use the result of this select in the conditions of my updates.
Could you please provide an example of a query that will do the job in this situation? Do I need to use a subquery or select into a temporary table? Please explain. I'm a junior frontend guy, but I need to create a demo table this way. I believe the database is version 10, as is PL/SQL Developer.
Result wanted looks like this:
col1 | col2   | col3
-----|--------|------
abc  | 1.23   | text
abc  | 12.34  | text
abc  | 123.45 | text
...  | ...    | ...
You could use a CASE expression or DECODE:
update tab1
set col2 = CASE mod(rownum,10) WHEN 1 THEN 1.23
                               WHEN 2 THEN 12.34
                               WHEN 3 THEN 123.45
                               -- ...
                               ELSE col2
           END
-- WHERE ...
UPDATE tab1
SET col2 = DECODE(mod(rownum,10), 1, 1.23, 2, 12.34, 3, 123.45, ..., col2)
-- WHERE ...;
DBFiddle Demo
You have not told us whether there is a specific order in which you want to treat rows as 1, 2, 3, and so on. If there is indeed an order, then ROWNUM is unreliable and may not work; you would need row_number() with a specific ORDER BY column. That can be combined with a MERGE statement.
MERGE INTO tab1 tgt
USING (
  SELECT
    CASE mod(ROW_NUMBER() OVER (
               ORDER BY col1 -- the column which is in order and unique
             ), 10)
      WHEN 1 THEN 1.23
      WHEN 2 THEN 12.34
      WHEN 3 THEN 123.45
      --..
      --.. 9
      ELSE col2
    END AS col2
  FROM tab1 t
) src
ON (tgt.rowid = src.rowid) -- use a primary key/unique key instead of rowid if there is one
WHEN MATCHED THEN UPDATE SET tgt.col2 = src.col2;
Demo

SQL IF/CONDITIONAL Query

Is it possible to structure a query that will display a static value for a row based on a column?
EG.
In INFORMIX, the coltype in syscolumns is returned as an integer. I would like to have it print out the column type as a string rather than an integer.
For example, when I run a simple query to get the system tables
SELECT * FROM SYSCOLUMNS WHERE TABID < 100
I get
colname  tabid  colno  coltype  collength
------------------------------------------
tabname  1      1      13       128
where coltype = 13 corresponds to VARCHAR.
So my original query would give me
COLNAME  COLTYPE
col1     0
col2     1
...
But I want it to be returned as
COLNAME  COLTYPE
col1     CHAR
col2     SMALLINT
...
Is such a thing possible to do in a single query?
SELECT COLUMN,
       CASE
         WHEN COLTYPE = 0 THEN 'CHAR'
         WHEN COLTYPE = 1 THEN 'SMALLINT'
         ELSE CAST(COLTYPE AS VARCHAR)
       END AS COLTYPE
FROM MyTable
So, for example, with this table:
COLUMN | COLTYPE
-------+--------
col1   | 0
col2   | 1
col3   | 2
The result would be:
COLUMN | COLTYPE
-------+----------
col1   | CHAR
col2   | SMALLINT
col3   | 2
However, if coltype maps to an id or similar in another table, it would make more sense to join to that table, like:
SELECT MyTable.COLUMN, SecondTable.COLUMNNAME
FROM MyTable
JOIN SecondTable ON MyTable.COLTYPE = SecondTable.COLTYPE
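If such a lookup table does not exist yet, it could be created by hand; a minimal sketch (SecondTable and its columns are hypothetical names matching the join above, and only two type codes are filled in):
-- Hypothetical lookup table for the join above; extend with the remaining coltype codes.
CREATE TABLE SecondTable (
  COLTYPE    INTEGER,
  COLUMNNAME VARCHAR(30)
);
INSERT INTO SecondTable VALUES (0, 'CHAR');
INSERT INTO SecondTable VALUES (1, 'SMALLINT');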

SELECT with calculated column that is dependent upon a correlation

I don't do a lot of SQL, and most of the time I'm doing CRUD operations. Occasionally I'll get something a bit more complicated. So, this may be a newbie question, but I'm ready. I've been trying to figure this out for hours, and it's been no use.
So, Imagine the following table structure:
> | ID | Col1 | Col2 | Col3 | .. | Col8 |
I want to select ID and a calculated column. The calculated column has a range of 0 - 8 and it contains the number of matches to the query. I also want to restrict the result set to only include rows that have a certain number of matches.
So, from this sample data:
> | 1 | 'a' | 'b' | 1 | 2 |
> | 2 | 'b' | 'c' | 1 | 2 |
> | 3 | 'b' | 'c' | 4 | 5 |
> | 4 | 'x' | 'x' | 9 | 9 |
I want to query on Col1 = 'a' OR Col2 = 'c' OR Col3 = 1 OR Col4 = 5 where the calculated result > 1 and have the result set look like:
> | ID | Cal |
> | 1 | 2 |
> | 2 | 2 |
> | 3 | 2 |
I'm using T-SQL and SQL Server 2005, if it matters, and I can't change the DB Schema.
I'd also prefer to keep it as one self-contained query and not have to create a stored procedure or temporary table.
This answer will work with SQL 2005, using a CTE to clean up the derived table a little.
WITH Matches AS
(
    SELECT ID,
           CASE WHEN Col1 = 'a' THEN 1 ELSE 0 END +
           CASE WHEN Col2 = 'c' THEN 1 ELSE 0 END +
           CASE WHEN Col3 = 1   THEN 1 ELSE 0 END +
           CASE WHEN Col4 = 5   THEN 1 ELSE 0 END AS Result
    FROM Table1
    WHERE Col1 = 'a' OR Col2 = 'c' OR Col3 = 1 OR Col4 = 5
)
SELECT ID, Result
FROM Matches
WHERE Result > 1
Here's a solution that leverages the fact that a boolean comparison returns the integers 1 or 0:
SELECT * FROM (
  SELECT ID, (Col1='a') + (Col2='c') + (Col3=1) + (Col4=5) AS calculated
  FROM MyTable
) q
WHERE calculated > 1;
Note that you have to parenthesize the boolean comparisons because + has higher precedence than =. Also, you have to put it all in a subquery because you normally can't use a column alias in a WHERE clause of the same query.
It might seem like you should also use a WHERE clause in the subquery to restrict its rows, but in all likelihood you're going to end up with a full table scan anyway so it's probably not a big win. On the other hand, if you expect that such a restriction would greatly reduce the number of rows in the subquery result, then it'd be worthwhile.
Re Quassnoi's comment, if you can't treat boolean expressions as integer values, there should be a way to map boolean conditions to integers, even if it's a bit verbose. For example:
SELECT * FROM (
  SELECT ID,
           CASE WHEN Col1='a' THEN 1 ELSE 0 END
         + CASE WHEN Col2='c' THEN 1 ELSE 0 END
         + CASE WHEN Col3=1 THEN 1 ELSE 0 END
         + CASE WHEN Col4=5 THEN 1 ELSE 0 END AS calculated
  FROM MyTable
) q
WHERE calculated > 1;
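And if, as mentioned above, a restriction in the subquery would greatly reduce the rows it produces, the same conditions can simply be repeated there (a sketch on the same hypothetical MyTable):
-- Same query, but the subquery only keeps rows that match at least one condition.
SELECT * FROM (
  SELECT ID,
           CASE WHEN Col1='a' THEN 1 ELSE 0 END
         + CASE WHEN Col2='c' THEN 1 ELSE 0 END
         + CASE WHEN Col3=1 THEN 1 ELSE 0 END
         + CASE WHEN Col4=5 THEN 1 ELSE 0 END AS calculated
  FROM MyTable
  WHERE Col1='a' OR Col2='c' OR Col3=1 OR Col4=5
) q
WHERE calculated > 1;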
This query is more index friendly:
SELECT id, SUM(match)
FROM (
  SELECT id, 1 AS match
  FROM mytable
  WHERE col1 = 'a'
  UNION ALL
  SELECT id, 1 AS match
  FROM mytable
  WHERE col2 = 'c'
  UNION ALL
  SELECT id, 1 AS match
  FROM mytable
  WHERE col3 = 1
  UNION ALL
  SELECT id, 1 AS match
  FROM mytable
  WHERE col4 = 5
) q
GROUP BY id
HAVING SUM(match) > 1
This will only be efficient if all the columns you are searching for are, first, indexed and, second, have high cardinality (many distinct values).
See this article in my blog for performance details:
Matching 3 of 4
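In SQL Server terms, one index per searched column is what lets each UNION ALL branch do a seek; a hypothetical sketch (the index names, and the choice to add id as a second key column so each index covers its branch, are assumptions, not from the article):
-- One narrow index per searched column; id is included so each
-- branch of the UNION ALL can be answered from the index alone.
CREATE INDEX ix_mytable_col1 ON mytable (col1, id);
CREATE INDEX ix_mytable_col2 ON mytable (col2, id);
CREATE INDEX ix_mytable_col3 ON mytable (col3, id);
CREATE INDEX ix_mytable_col4 ON mytable (col4, id);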