How to categorize several columns in one statement in SQL/PLSQL - sql

I've got a table with 20 columns that I would like to categorize like this:
0-25 --> 1
25-50 --> 2
50-75 --> 3
75-100 --> 4
I'd prefer not to write 20 case ... when statements. Does anyone know how to do this more dynamically and efficiently? It can be SQL or PL/SQL.
I tried some PL/SQL, but I didn't see a simple method to use the column names as variables.
Many thanks.
Frans

Your example is a bit confusing, but assuming you want to put a certain value into those categories, the function width_bucket might be what you are after:
Something like this:
with sample_data as (
select trunc(dbms_random.value(1,100)) as val
from dual
connect by level < 10
)
select val, width_bucket(val, 0, 100, 4) as category
from sample_data;
This will assign the numbers 1-4 to the (random) values from sample_data. The arguments 0 and 100 define the range from which to build the buckets, and the final parameter 4 says how many (equally wide) buckets that range is divided into. The result of the function is the bucket into which the value val falls.
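To make the bucket arithmetic concrete, here is a minimal Python sketch of the rule width_bucket applies to an ascending range. It is an illustrative port, not Oracle code: the Python function and its parameter names are assumptions made for this example. Values below the range land in bucket 0 and values at or above the upper bound land in bucket n + 1.

```python
import math

def width_bucket(val, low, high, buckets):
    """Illustrative port of equal-width bucketing (assumes low < high)."""
    if val < low:
        return 0                  # underflow bucket
    if val >= high:
        return buckets + 1        # overflow bucket (the upper bound itself is outside)
    width = (high - low) / buckets
    return math.floor((val - low) / width) + 1

for v in (0, 24, 25, 50, 99, 100):
    print(v, width_bucket(v, 0, 100, 4))
```

Note that a value of exactly 100 falls into bucket 5 here, so the upper bound may need to be set slightly above the largest expected value.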
SQLFiddle example: http://sqlfiddle.com/#!4/d41d8/10721

The case statement is probably the most efficient way of doing it. A more dynamic way would be to create a lookup table using the with clause. Here is an example of the code:
with ref as (
select 0 as lower, 25 as higher, 1 as val from dual union all
select 25, 50, 2 from dual union all
select 50, 75, 3 from dual union all
select 75, 100, 4 from dual
)
select ref.val
from t left outer join ref
on t.col >= ref.lower and t.col < ref.higher
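That said, this particular lookup could be done with arithmetic (note that this version puts a boundary value such as 25 into the lower category, whereas the join above puts it into the higher one):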
select trunc((t.col - 1) / 25) + 1 as val
from t
And, if your problem is managing the different columns, you might consider unpivot. However, I think it is probably easier just to write the code and modify the column names in a text editor or Excel.
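As a rough illustration of the lookup-table approach applied to more than one column, the sketch below runs the same range join in an in-memory SQLite database from Python. The table t, its columns col_a and col_b, and the sample rows are hypothetical; the lookup is joined once per column to be categorized.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with two of the columns to categorize.
conn.execute("create table t (id integer, col_a real, col_b real)")
conn.executemany("insert into t values (?, ?, ?)",
                 [(1, 10, 25), (2, 60, 99), (3, 75, 0)])

# The ranges are defined once and reused for every column via separate joins.
sql = """
with ref(lo, hi, val) as (
    values (0, 25, 1), (25, 50, 2), (50, 75, 3), (75, 100, 4)
)
select t.id, ra.val as cat_a, rb.val as cat_b
from t
left join ref ra on t.col_a >= ra.lo and t.col_a < ra.hi
left join ref rb on t.col_b >= rb.lo and t.col_b < rb.hi
order by t.id
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

The ranges are still written only once, but each additional column costs one more join, which is the trade-off against simply repeating the case expression.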

Related

WHILE Window Operation with Different Starting Point Values From Column - SQL Server [duplicate]

In SQL there are aggregation operators, like AVG, SUM, COUNT. Why doesn't it have an operator for multiplication? "MUL" or something.
I was wondering, does it exist for Oracle, MSSQL, or MySQL? If not, is there a workaround that would give this behaviour?
By MUL do you mean progressive multiplication of values?
Even with 100 rows of some small size (say 10s), your MUL(column) is going to overflow any data type! With such a high probability of mis/ab-use, and very limited scope for use, it does not need to be a SQL Standard. As others have shown there are mathematical ways of working it out, just as there are many many ways to do tricky calculations in SQL just using standard (and common-use) methods.
Sample data:
Column
1
2
4
8
COUNT : 4 items (1 for each non-null)
SUM : 1 + 2 + 4 + 8 = 15
AVG : 3.75 (SUM/COUNT)
MUL : 1 x 2 x 4 x 8 ? ( =64 )
For completeness, the Oracle, MSSQL, MySQL core implementations *
Oracle : EXP(SUM(LN(column))) or POWER(N,SUM(LOG(N, column)))
MSSQL : EXP(SUM(LOG(column))) or POWER(N,SUM(LOG(column)/LOG(N)))
MySQL : EXP(SUM(LOG(column))) or POW(N,SUM(LOG(N,column)))
Take care when using EXP/LOG in SQL Server and watch the return type: http://msdn.microsoft.com/en-us/library/ms187592.aspx
The POWER form allows for larger numbers (using bases larger than Euler's number), and in cases where the result grows too large to turn it back using POWER, you can return just the logarithmic value and calculate the actual number outside of the SQL query
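A minimal sketch of that last idea, assuming the query returns only the base-10 log sum (for example from SUM(LOG(10, col)) in Oracle or MySQL). The function name and the sample value 1000.5 are made up for illustration. The client splits the log into a mantissa and an exponent, so the product can be reported even when it would overflow a float.

```python
import math

def split_log10(log10_sum):
    """Turn a base-10 log sum into (mantissa, exponent): product = mantissa * 10**exponent."""
    exponent = math.floor(log10_sum)
    mantissa = 10 ** (log10_sum - exponent)   # always in the range [1, 10)
    return mantissa, exponent

# Hypothetical log sum returned by the database.
log10_sum = 1000.5
mantissa, exponent = split_log10(log10_sum)
print(f"{mantissa:.4f}e{exponent}")
```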
* LOG(0) and LOG(-ve) are undefined. The below shows only how to handle this in SQL Server. Equivalents can be found for the other SQL flavours, using the same concept
create table MUL(data int)
insert MUL select 1 yourColumn union all
select 2 union all
select 4 union all
select 8 union all
select -2 union all
select 0
select CASE WHEN MIN(abs(data)) = 0 then 0 ELSE
EXP(SUM(Log(abs(nullif(data,0))))) -- the base mathematics
* round(0.5-count(nullif(sign(sign(data)+0.5),1))%2,0) -- pairs up negatives
END
from MUL
Ingredients:
taking the abs() of data, if the min is 0, multiplying by whatever else is futile, the result is 0
When data is 0, NULLIF converts it to null. The abs(), log() both return null, causing it to be precluded from sum()
If data is not 0, abs allows us to multiply a negative number using the LOG method - we will keep track of the negativity elsewhere
Working out the final sign
sign(data) returns 1 for >0, 0 for 0 and -1 for <0.
We add another 0.5 and take the sign() again, so we have now classified 0 and 1 both as 1, and only -1 as -1.
again use NULLIF to remove from COUNT() the 1's, since we only need to count up the negatives.
% 2 against the count() of negative numbers returns either
--> 1 if there is an odd number of negative numbers
--> 0 if there is an even number of negative numbers
more mathematical tricks: we take 1 or 0 off 0.5, so that the above becomes
--> (0.5-1=-0.5=>round to -1) if there is an odd number of negative numbers
--> (0.5-0= 0.5=>round to 1) if there is an even number of negative numbers
we multiply this final 1/-1 against the SUM-PRODUCT value for the real result
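The ingredients above can be sketched outside SQL as well. The Python function below is an illustration of the same three steps (short-circuit on zero, sum the logs of the absolute values, fix the sign from the parity of the negatives), not a translation of the query. It derives the sign directly from the parity instead of using the round(0.5 - ...) trick, because Python's round() rounds halves to the nearest even number.

```python
import math

def product_via_logs(values):
    """Multiply values as exp(sum(log(abs(v)))), handling zeros and signs separately."""
    values = [v for v in values if v is not None]    # aggregates skip NULLs
    if any(v == 0 for v in values):
        return 0.0                                   # any zero makes the whole product zero
    magnitude = math.exp(sum(math.log(abs(v)) for v in values))
    negatives = sum(1 for v in values if v < 0)
    sign = -1 if negatives % 2 else 1                # an odd number of negatives flips the sign
    return sign * magnitude

print(product_via_logs([1, 2, 4, 8, -2]))       # close to -128, up to floating-point error
print(product_via_logs([1, 2, 4, 8, -2, 0]))    # 0.0
```

As with the SQL version, the result is a floating-point approximation, so round it if an exact integer is expected.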
No, but you can use Mathematics :)
if yourColumn is always bigger than zero:
select EXP(SUM(LOG(yourColumn))) As ColumnProduct from yourTable
I see an Oracle answer is still missing, so here it is:
SQL> with yourTable as
2 ( select 1 yourColumn from dual union all
3 select 2 from dual union all
4 select 4 from dual union all
5 select 8 from dual
6 )
7 select EXP(SUM(LN(yourColumn))) As ColumnProduct from yourTable
8 /
COLUMNPRODUCT
-------------
64
1 row selected.
Regards,
Rob.
With PostgreSQL, you can create your own aggregate functions, see http://www.postgresql.org/docs/8.2/interactive/sql-createaggregate.html
To create an aggregate function on MySQL, you'll need to build an .so (Linux) or .dll (Windows) file. An example is shown here: http://www.codeproject.com/KB/database/mygroupconcat.aspx
I'm not sure about MSSQL and Oracle, but I bet they have options to create custom aggregates as well.
You'll break any datatype fairly quickly as numbers mount up.
Using LOG/EXP is tricky because of numbers <= 0 that will fail when using LOG. I wrote a solution in this question that deals with this
Using CTE in MS SQL:
CREATE TABLE Foo(Id int, Val int)
INSERT INTO Foo VALUES(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)
;WITH cte AS
(
SELECT Id, Val AS Multiply, row_number() over (order by Id) as rn
FROM Foo
WHERE Id=1
UNION ALL
SELECT ff.Id, cte.multiply*ff.Val as multiply, ff.rn FROM
(SELECT f.Id, f.Val, (row_number() over (order by f.Id)) as rn
FROM Foo f) ff
INNER JOIN cte
ON ff.rn -1= cte.rn
)
SELECT * FROM cte
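For reference, the recursive CTE carries a running product from row to row, so the last row holds the product of the whole column. A small Python sketch of the same accumulation over the sample values inserted into Foo (the list below simply repeats Foo.Val in Id order):

```python
from itertools import accumulate
from operator import mul

vals = [2, 3, 4, 5, 6]                    # Foo.Val ordered by Id, as in the sample insert
running = list(accumulate(vals, mul))     # product so far, one entry per row
print(running)                            # [2, 6, 24, 120, 720]
```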
Not sure about Oracle or sql-server, but in MySQL you can just use * like you normally would.
mysql> select count(id), count(id)*10 from tablename;
+-----------+--------------+
| count(id) | count(id)*10 |
+-----------+--------------+
| 961 | 9610 |
+-----------+--------------+
1 row in set (0.00 sec)

SQL With clause, same column name in different tables

I have two different tables: mode_table and station_table. They both have a column called day_column. I want to get the records from each table whose day_column is greater than 20.
I am not sure if I am allowed to use the same alias (day_Val) in both tables. Is this correct?
WITH
active_mode AS (
SELECT (IF(day_column > '20', 0, 1)) AS day_Val
FROM mode_table
),
active_station AS (
SELECT (IF(day_column > '20', 0, 1)) AS day_Val
FROM station_table
)
SELECT day_column, day_val
FROM active_mode, active_station
WHERE day_Val != 1
The reason for this question is that I have around 14 tables, and I would rather use a single alias than create 14 of them.
The goal is to report any row in all those 14 tables that has day_column > 20. All the tables have a column named day_column.
Do you just want union all? For example:
select . . . -- whatever columns you want
from mode_table m
where day_column > 20
union all
select . . .
from station_table s
where day_column > 20
The immediate problem I can see in your script is that neither active_mode nor active_station exposes a column called "day_column"; each CTE only returns day_Val.
As for your question, the answer is yes, you can reuse the alias, but not in the way you are doing it. Consider using a union all statement instead.
For example:
Select
day_Val
From active_mode
Union all
Select
day_Val
From active_station
Otherwise the reference is ambiguous, because the server does not know which table expression you are referring to.
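With 14 tables, the union all statement can also be generated rather than typed. The following Python sketch builds it from a list of table names and runs it against an in-memory SQLite database. The sample rows and the extra source column are assumptions added for illustration, and the table names come from a fixed list in the code, not from user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
tables = ["mode_table", "station_table"]      # in practice, all 14 table names
for name in tables:
    conn.execute(f"create table {name} (day_column integer)")
conn.executemany("insert into mode_table values (?)", [(5,), (21,)])
conn.executemany("insert into station_table values (?)", [(30,), (20,)])

# One SELECT per table, glued together with UNION ALL; the literal labels the source table.
sql = "\nunion all\n".join(
    f"select '{name}' as source, day_column from {name} where day_column > 20"
    for name in tables
)
rows = conn.execute(sql).fetchall()
print(rows)
```

Because every branch uses the same column names, the outer query never has to disambiguate between tables.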

Oracle SQL - How to "pivot" one row to many

In Oracle 12c, I have a view, which takes a little time to run. When I add the where clause, it will return exactly one row of interest. The row has columns/value like this...
I need this flipped so that I can see one row per EACH "set". I need the SQL to return something like
I know I can do a UNION ALL for each of the entry sets, but the view takes a little while to run and there are about 30 different sets (I only showed 3 - Car, Boat, and Truck), so I would rather avoid that.
Is there a better way of doing this? I have looked at PIVOT/UNPIVOT, but I didn't see how to make this work.
I think you are looking for UNPIVOT
WITH TEMP_DATA (ID1, CarPrice, CarTax, BoatPrice, BoatTax, TruckPrice, TruckTax)
AS (
select 'AAA', 1, 2, 3, 4, 5, 6 from dual )
select TYPE, PRICE, TAX
from temp_data
unpivot
(
(PRICE, TAX)
for TYPE IN
(
(CarPrice, CarTax) as 'CAR',
(BoatPrice, BoatTax) as 'BOAT',
(TruckPrice, TruckTax) as 'TRUCK'
)
)
;
OUTPUT:
TYPE       PRICE        TAX
----- ---------- ----------
CAR            1          2
BOAT           3          4
TRUCK          5          6
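To show what the paired UNPIVOT does to the single row, here is a small Python sketch of the same reshaping. The dictionary mirrors the TEMP_DATA sample row; the list of set names is an assumption that would be extended to all 30 sets.

```python
# The one wide row, keyed by column name, as in the TEMP_DATA sample.
row = {"ID1": "AAA", "CarPrice": 1, "CarTax": 2,
       "BoatPrice": 3, "BoatTax": 4, "TruckPrice": 5, "TruckTax": 6}

sets = ["Car", "Boat", "Truck"]   # one entry per Price/Tax column pair

# Each pair of columns becomes one output row: (TYPE, PRICE, TAX).
unpivoted = [(name.upper(), row[f"{name}Price"], row[f"{name}Tax"]) for name in sets]
for line in unpivoted:
    print(line)
```

The wide row is read once and then split, which is the point of preferring UNPIVOT over one UNION ALL branch per set when the view is slow.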

BigQuery SQL same column multiple expressions

I'm using Google's Big Query service to do some data processing...my database looks like:
value
-----
'a'
'b'
'a'
'a'
'a'
'b'
I want to write a query to count the occurrences of the various values.
Example:
Count('a') Count('b')
---------- ----------
4          2
I'd normally use Case to solve this; but BQ doesn't support Case.
Anyone have any ideas?
Thanks!
The first thing I would suggest is a group by:
select value, count(*)
from t
group by value
But you seem to want the values in one row. According to this documentation, it does support case. If you prefer, you can use if:
select sum(if(value = 'a', 1, 0)) as A, sum(if(value = 'b', 1, 0)) as B
from t
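The same conditional aggregation can be checked locally. This sketch uses an in-memory SQLite table from Python with the sample values from the question; SQLite has no two-branch if() in every version, so the sum is written with case, which is equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (value text)")
# Sample data from the question: a, b, a, a, a, b
conn.executemany("insert into t values (?)", [(v,) for v in "abaaab"])

# Each sum only counts the rows where its condition holds.
counts = conn.execute(
    "select sum(case when value = 'a' then 1 else 0 end) as a, "
    "       sum(case when value = 'b' then 1 else 0 end) as b "
    "from t"
).fetchone()
print(counts)   # (4, 2)
```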

Comparing list of values against table

I have been trying to find a solution to this problem for some time without success, so any help would be much appreciated. A list of IDs needs to be compared against a table to find out which records exist (returning one of their values) and which do not. The list of IDs is in text format:
100,
200,
300
a DB table:
ID(PK) value01 value02 value03 .....
--------------------------------------
100 Ann
102 Bob
300 John
304 Marry
400 Jane
and the output I need is:
100 Ann
200 missing or empty or whatever indication
300 John
The obvious solution is to create a table and join, but I only have read access (the DB is a closed vendor product; I'm just a user). Writing a PL/SQL function also seems complicated, because the table has 200+ columns and 100k+ records, and I had no luck creating a dynamic array of records. Also, the list of IDs to be checked contains hundreds of IDs and I need to do this periodically, so any solution where each ID has to be changed on a separate line of code wouldn't be very useful.
Database is Oracle 10g.
There are many built-in public collection types. You can leverage one of them like this:
with ids as (select /*+ cardinality(a, 1) */ column_value id
from table(UTL_NLA_ARRAY_INT(100, 200, 300)) a
)
select ids.id, case when m.id is null then '**NO MATCH**' else m.value end value
from ids
left outer join my_table m
on m.id = ids.id;
To see a list of public types on your DB, run:
select owner, type_name, coll_type, elem_type_name, upper_bound, precision, scale from all_coll_types
where elem_type_name in ('FLOAT', 'INTEGER', 'NUMBER', 'DOUBLE PRECISION')
The hint
/*+ cardinality(a, 1) */
is just used to tell Oracle how many elements are in our array (if not specified, the default is an assumption of 8k elements). Just set it to a reasonably accurate number.
You can transform a variable into a query using CONNECT BY (tested on 11g, should work on 10g+):
SQL> WITH DATA AS (SELECT '100,200,300' txt FROM dual)
2 SELECT regexp_substr(txt, '[^,]+', 1, LEVEL) item FROM DATA
3 CONNECT BY LEVEL <= length(txt) - length(REPLACE(txt, ',', '')) + 1;
ITEM
--------------------------------------------
100
200
300
You can then join this result to the table as if it were a standard view:
SQL> WITH DATA AS (SELECT '100,200,300' txt FROM dual)
2 SELECT v.id, dbt.value01
3 FROM dbt
4 RIGHT JOIN
5 (SELECT to_number(regexp_substr(txt, '[^,]+', 1, LEVEL)) ID
6 FROM DATA
7 CONNECT BY LEVEL <= length(txt) - length(REPLACE(txt, ',', '')) + 1) v
8 ON dbt.id = v.id;
ID VALUE01
---------- ----------
100 Ann
300 John
200
One way of tackling this is to dynamically create a common table expression that can then be included in the query. The final syntax you'd be aiming for is:
with list_of_values as (
select 100 val from dual union all
select 200 val from dual union all
select 300 val from dual union all
...)
select
lov.val,
...
from
list_of_values lov left outer join
other_data t on (lov.val = t.val)
It's not very elegant, particularly for large sets of values, but compatibility with a database on which you might have few privileges is very good.
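Since the list of IDs arrives as text and changes periodically, the CTE text can be generated by a small script instead of being edited by hand. The Python sketch below is one way to do that; the function name is hypothetical and it assumes a non-empty list of integer IDs. Converting each item with int() rejects anything that is not a number before it reaches the SQL text.

```python
def build_lov_cte(ids):
    """Return the text of a list_of_values CTE with one SELECT ... FROM dual per ID."""
    # int() raises ValueError on non-numeric input, so only numbers end up in the SQL.
    selects = [f"select {int(i)} val from dual" for i in ids]
    return "with list_of_values as (\n" + " union all\n".join(selects) + "\n)"

# The ID list in the text format shown in the question.
text = "100,\n200,\n300"
ids = [part.strip() for part in text.split(",") if part.strip()]
print(build_lov_cte(ids))
```

The generated text is then placed in front of the main select that joins list_of_values to the vendor table.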