Is there a way to round an Oracle crosstab PIVOT?

I want to pivot data using the AVG() function, but I want to round the results to prevent repeating decimals from displaying.
When I try something like this: PIVOT( ROUND( AVG(column_name), 2) FOR ...)
I get an error: ORA-56902: expect aggregate function inside pivot operation
Here is a very simple example of "number of students registered in a course":
CREATE TABLE TBL_EXAMPLE
(
enrolled NUMBER,
course VARCHAR2(50 CHAR)
);
INSERT INTO TBL_EXAMPLE (enrolled, course) VALUES (1, 'math');
INSERT INTO TBL_EXAMPLE (enrolled, course) VALUES (2, 'math');
INSERT INTO TBL_EXAMPLE (enrolled, course) VALUES (2, 'math');
INSERT INTO TBL_EXAMPLE (enrolled, course) VALUES (1, 'english');
INSERT INTO TBL_EXAMPLE (enrolled, course) VALUES (4, 'english');
SELECT *
FROM TBL_EXAMPLE
PIVOT ( AVG(enrolled) FOR course IN ('math', 'english') );
'math' 'english'
---------------|-------------
1.6666666666...| 2.5
What I want is:
SELECT *
FROM TBL_EXAMPLE
PIVOT ( ROUND(AVG(enrolled), 2) FOR course IN ('math', 'english') );
'math' 'english'
---------------|-------------
1.67 | 2.50
In the real world application, the SQL is being dynamically generated based on user input on a report, and due to the complexities of the real world scenario I can't just re-write the query like this:
SELECT ROUND("'math'", 2) as "'math'", ROUND("'english'", 2) as "'english'"
FROM TBL_EXAMPLE
PIVOT ( AVG(enrolled) FOR course IN ('math', 'english') );
So, my question is, is there any workaround I can use to bypass ORA-56902 in this scenario, or any other way to 'trick' Oracle into NOT returning up to 38 digits of decimal precision when numbers don't divide evenly via the AVG() calculation in a PIVOT clause?

Maybe I'm missing something, but why not perform the AVG() in a subquery with a ROUND and then apply your PIVOT:
select *
from
(
select round(avg(enrolled), 2) enrolled, course
from tbl_example
group by course
) d
PIVOT
(
max(enrolled)
FOR course IN ('math', 'english')
);
See SQL Fiddle with Demo
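As a sketch of why this works: the subquery has already grouped by course, so each course contributes exactly one row, and the outer MAX() only passes the rounded average through (PIVOT requires some aggregate there). The variation below also adds the aliases math and english in the IN list. Those aliases are my assumption, not part of the answer above; they give the pivoted columns plain names instead of quoted ones such as "'math'".

```sql
-- Oracle: round in the subquery, then pivot the already-aggregated rows.
SELECT *
FROM (
    SELECT ROUND(AVG(enrolled), 2) AS enrolled, course
    FROM tbl_example
    GROUP BY course            -- one row per course
)
PIVOT (
    MAX(enrolled)              -- pass-through: only one value per course
    FOR course IN ('math' AS math, 'english' AS english)
);
```

Because the rounding happens before the PIVOT, dynamically generated SQL only has to change the inner SELECT, not the list of pivoted columns.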

Related

SQL Query for insert many values in a table and take only a value from another table

I want to insert many values into a table, taking the ID reference from another table. I have tried different ways, and finally found this one, which works.
INSERT INTO tblUserFreeProperty (id, identname, val, pos)
VALUES ((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.G', N'??_??#False', 1),
((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.Qta_C', N'??_??#0', 2),
((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.Qta_M', N'??_??#0', 3),
((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.UbicM', N'??_??#No', 4),
((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.UbicS', N'??_??#', 5),
((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.UbicP', N'??_??#', 6),
((SELECT id FROM tblpart where tblPart.ordernr=N'3CFSU05'),N'DSR_Mag.UbicC', N'??_??#', 7);
This works, but I'm looking for an "easy query" because I need to write the command from Visual Studio.
The link I noted earlier should have sufficed to explain the correct syntax.
Insert into ... values ( SELECT ... FROM ... )
But seeing as there has been much misinformation on this post, I will show how you should do it.
INSERT INTO tblUserFreeProperty (id, identname, val, pos)
SELECT p.id, v.identname, v.val, v.pos
FROM (VALUES
(N'DSR_Mag.G', N'??_??#False', 1),
(N'DSR_Mag.Qta_C', N'??_??#0', 2),
(N'DSR_Mag.Qta_M', N'??_??#0', 3),
(N'DSR_Mag.UbicM', N'??_??#No', 4),
(N'DSR_Mag.UbicS', N'??_??#', 5),
(N'DSR_Mag.UbicP', N'??_??#', 6),
(N'DSR_Mag.UbicC', N'??_??#', 7)
) AS v(identname, val, pos)
JOIN tblpart p ON p.ordernr = N'3CFSU05';
Note the use of a standard JOIN clause; there are no subqueries. Note also the use of short, meaningful table aliases.
As for the VALUES table constructor, it can also be replaced with a temp table, a table variable, or a table-valued parameter. Or indeed another table.
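The table-variable alternative might look roughly like the sketch below. The variable name @props and the column types (nvarchar(100), int) are assumptions for illustration; they should match the real definition of tblUserFreeProperty.

```sql
-- T-SQL sketch: stage the rows in a table variable, then join once to tblpart.
DECLARE @props TABLE (
    identname nvarchar(100),   -- assumed type
    val       nvarchar(100),   -- assumed type
    pos       int
);

INSERT INTO @props (identname, val, pos)
VALUES
    (N'DSR_Mag.G',     N'??_??#False', 1),
    (N'DSR_Mag.Qta_C', N'??_??#0',     2),
    (N'DSR_Mag.Qta_M', N'??_??#0',     3);

INSERT INTO tblUserFreeProperty (id, identname, val, pos)
SELECT p.id, v.identname, v.val, v.pos
FROM @props AS v
JOIN tblpart AS p ON p.ordernr = N'3CFSU05';
```

When the command is built from application code, a table-valued parameter plays the same role as @props and lets the rows be passed in as data rather than concatenated into the SQL text.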
Side note: I don't know what you are storing in those columns, but it appears you have multiple pieces of info in each. Do not do this. Store each atomic value in its own column.
INSERT tblUserFreeProperty (id, identname, val, pos)
SELECT tblPart.id, X.A, X.B, X.C
FROM (
VALUES
(N'DSR_Mag.G0', N'??_??#True', 1),
(N'DSR_Mag.G1', N'??_??#True', 2),
(N'DSR_Mag.G2', N'??_??#False', 3)
) X(A,B,C)
CROSS JOIN tblPart
WHERE tblPart.ordernr=N'555';

Want to use multiple aggregate function with snowflake pivot columns function

CREATE TABLE person (id INT, name STRING, date date, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', DATE'2021-01-30', 1, 'Street 1'),
(200, 'Mary', DATE'2021-01-20', 1, 'Street 2'),
(300, 'Mike', DATE'2021-01-21', 3, 'Street 3'),
(100, 'John', DATE'2021-05-15', 4, 'Street 4');
SELECT * FROM person
PIVOT (
SUM(age) AS a, MAX(date) AS c
FOR name IN ('John' AS john, 'Mike' AS mike)
);
The above is Databricks SQL code. How do I implement the same logic in Snowflake?
Below is the syntax for PIVOT in Snowflake:
SELECT ...
FROM ...
PIVOT ( <aggregate_function> ( <pivot_column> )
FOR <value_column> IN ( <pivot_value_1> [ , <pivot_value_2> ... ] ) )
[ ... ]
In Snowflake, your AS keyword goes outside the PIVOT clause.
Check this example for your reference:
select *
from monthly_sales
pivot(sum(amount) for month in ('JAN', 'FEB', 'MAR', 'APR'))
as p
order by empid;
Visit this official document and check the given examples for better understanding.
Firstly, there is no "AGE" column in your table DDL, as far as I can see.
Secondly, I do not think you can pivot on multiple aggregate functions. Each aggregated value is placed under the corresponding "JOHN" or "MIKE" column, so a single column cannot hold two separate values. I don't know how your Databricks example would work.
Your example would look something like this in Snowflake, after removing one aggregate function:
SELECT *
FROM
person
PIVOT (
MAX(date) FOR name IN ('John', 'Mike')
)
as p (id, class, address, john, mike)
;
Snowflake does not support multiple aggregate expressions in the PIVOT clause.
And as noted by others, your AGE column is missing, and you also do not have an ORDER BY clause, which makes rolling your own SQL harder.
SELECT
SUM(IFF(name='John',age,null)) AS john_sum_age,
MAX(IFF(name='John',date,null)) AS john_max_date,
SUM(IFF(name='Mike',age,null)) AS mike_sum_age,
MAX(IFF(name='Mike',date,null)) AS mike_max_date
FROM person
If you had an ORDER BY in your example, it would become the GROUP BY clause in this form:
SELECT
<grouping_columns>,
SUM(IFF(name='John',age,null)) AS john_sum_age,
MAX(IFF(name='John',date,null)) AS john_max_date,
SUM(IFF(name='Mike',age,null)) AS mike_sum_age,
MAX(IFF(name='Mike',date,null)) AS mike_max_date
FROM person
GROUP BY <grouping_columns>
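To make the conditional-aggregation pattern concrete, here is a small self-contained Snowflake sketch. Everything about the table is hypothetical: person_demo, the event_date column name, and in particular the age column and its values are invented for illustration, since the original table has no AGE column. Grouping by class is likewise just one possible choice of grouping column.

```sql
-- Hypothetical demo table; the age values are made up for this example.
CREATE OR REPLACE TEMPORARY TABLE person_demo (
    id INT, name STRING, age INT, event_date DATE, class INT
);

INSERT INTO person_demo VALUES
    (100, 'John', 30, '2021-01-30', 1),
    (200, 'Mary', 25, '2021-01-20', 1),
    (300, 'Mike', 40, '2021-01-21', 3),
    (100, 'John', 30, '2021-05-15', 4);

-- One output row per class; each IFF keeps only the rows for one name,
-- so every (name, aggregate) pair gets its own output column.
SELECT
    class,
    SUM(IFF(name = 'John', age, NULL))        AS john_sum_age,
    MAX(IFF(name = 'John', event_date, NULL)) AS john_max_date,
    SUM(IFF(name = 'Mike', age, NULL))        AS mike_sum_age,
    MAX(IFF(name = 'Mike', event_date, NULL)) AS mike_max_date
FROM person_demo
GROUP BY class
ORDER BY class;
```

The cost of this form is that the list of names is hard-coded twice per aggregate, so adding a name means adding one column expression for each aggregate function.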

Run mode() function of each value in INT ARRAY

I have a table with an INT ARRAY column representing some features (this is done instead of having a separate boolean column for each feature). The column is called feature_ids. If a record has a specific feature, the ID of that feature is present in the feature_ids column. For context, the feature IDs map as follows:
1: Fast
2: Expensive
3: Colorfull
4: Deadly
In other words, I could have had 4 columns called is_fast, is_expensive, is_colorfull and is_deadly, but I don't, since my real application has 100+ features, and they change quite a bit.
Now back to the question: I want to do an aggregate mode() on the records in the table, returning the most "frequent" features (e.g. whether it's more common to be "fast" than not). I want the result in the same format as the original feature_ids column, but with a feature's ID represented ONLY if, within each group, it's more common for it to be there than not:
CREATE TABLE test (
id INT,
feature_ids integer[] DEFAULT '{}'::integer[],
age INT,
type character varying(255)
);
INSERT INTO test (id, age, feature_ids, type) VALUES (1, 10, '{1,2}', 'movie');
INSERT INTO test (id, age, feature_ids, type) VALUES (2, 2, '{1}', 'movie');
INSERT INTO test (id, age, feature_ids, type) VALUES (3, 9, '{1,2,4}', 'movie');
INSERT INTO test (id, age, feature_ids, type) VALUES (4, 11, '{1,2,3}', 'wine');
INSERT INTO test (id, age, feature_ids, type) VALUES (5, 12, '{1,2,4}', 'hat');
INSERT INTO test (id, age, feature_ids, type) VALUES (6, 12, '{1,2,3}', 'hat');
INSERT INTO test (id, age, feature_ids, type) VALUES (7, 8, '{1,4}', 'hat');
I wanna do a query something like this:
SELECT
type, avg(age) as avg_age, mode() within group (order by feature_ids) as most_frequent_features
from test group by "type"
The result I expect is:
type avg_age most_frequent_features
hat 10.6 [1,2,4]
movie 7.0 [1,2]
wine 11.0 [1,2,3]
I have made an example here: https://www.db-fiddle.com/f/rTP4w7264vDC5rqjef6Nai/1
I find this quite tricky. The following is a rather brute-force approach -- calculating the "mode" explicitly and then bringing in the other aggregates:
select tf.type, t.avg_age,
array_agg(feature_id) as features
from (select t.type, feature_id, count(*) as cnt,
dense_rank() over (partition by t.type order by count(*) desc) as seqnum
from test t cross join
unnest(feature_ids) feature_id
group by t.type, feature_id
) tf join
(select t.type, avg(age) as avg_age
from test t
group by t.type
) t
on tf.type = t.type
where seqnum <= 2
group by tf.type, t.avg_age;
Here is a db<>fiddle.
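A different formulation, offered only as a sketch and not as the answer's method, applies the "more common than not" rule literally: keep a feature when it appears in more than half of the rows of its type. It assumes the test table from the question (PostgreSQL) and that a feature ID never repeats within a single array.

```sql
-- PostgreSQL: keep features present in more than half of each type's rows.
select g.type,
       g.avg_age,
       array_agg(f.feature_id order by f.feature_id) as most_frequent_features
from (select type, avg(age) as avg_age, count(*) as n
      from test
      group by type) g
join (select t.type, u.feature_id, count(*) as cnt
      from test t
      cross join unnest(t.feature_ids) as u(feature_id)
      group by t.type, u.feature_id) f
  on f.type = g.type
where f.cnt * 2 > g.n          -- strict majority within the type
group by g.type, g.avg_age;
```

On the sample data this yields {1,2,4} for hat, {1,2} for movie and {1,2,3} for wine, matching the arrays in the expected result. One caveat: a type in which no feature reaches a majority drops out entirely because of the inner join; a left join from g would keep it with a NULL array.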

Getting Count Only of Distinct Value Combinations of multiple fields.

Please consider the following:
IF OBJECT_ID ('tempdb..#Customer') IS NOT NULL
DROP TABLE #Customer;
CREATE TABLE #Customer
(
CustomerKey INT IDENTITY (1, 1) NOT NULL
,CustomerNum INT NOT NULL
,CustomerName VARCHAR (25) NOT NULL
,Planet VARCHAR (25) NOT NULL
)
GO
INSERT INTO #Customer (CustomerNum, CustomerName, Planet)
VALUES (1, 'Anakin Skywalker', 'Tatooine')
, (2, 'Yoda', 'Coruscant')
, (3, 'Obi-Wan Kenobi', 'Coruscant')
, (4, 'Luke Skywalker', 'Tatooine')
, (4, 'Luke Skywalker', 'Tatooine')
, (4, 'Luke Skywalker', 'Bespin')
, (4, 'Luke Skywalker', 'Bespin')
, (4, 'Luke Skywalker', 'Endor')
, (4, 'Luke Skywalker', 'Tatooine')
, (4, 'Luke Skywalker', 'Kashyyyk');
Notice that there are a total of 10 records. I know that I can get the list of distinct combinations of CustomerName and Planet with either of the following two queries.
SELECT DISTINCT CustomerName, Planet FROM #Customer;
SELECT CustomerName, Planet FROM #Customer
GROUP BY CustomerName, Planet;
However, what I'd like is a simple way to get just the count of those values, not the values themselves. I'd like a way that's quick to type, but also performant. I know I could load the values into a CTE, Temp Table, Table Variable, or Sub Query, and then count the records. Is there a better way to accomplish this?
This will work in SQL Server 2005:
SELECT COUNT(*) AS cnt
FROM
( SELECT 1 AS d
FROM Customer
GROUP BY Customername, Planet
) AS t ;
Tested in SQL-Fiddle. An index on (CustomerName, Planet) would be used, according to the query plan (for the 2012 version).
The simplest approach to think of, "get all distinct values in a subquery, then count", yields an identical plan:
SELECT COUNT(*) AS cnt
FROM
 ( SELECT DISTINCT Customername, Planet
   FROM  Customer
 ) AS t ;
And also the one (thanks to @Aaron Bertrand) using the ranking function ROW_NUMBER() (not sure if it will be efficient in the 2005 version, too, but you can test):
SELECT COUNT(*) AS cnt
FROM
(SELECT rn = ROW_NUMBER()
OVER (PARTITION BY CustomerName, Planet
ORDER BY CustomerName)
FROM Customer) AS x
WHERE rn = 1 ;
There are also other ways to write this (one even without a subquery, thanks to @Mikael Erksson!) but they are not as efficient.
The subquery/CTE method is the "right" way to do it.
A quick (in terms of typing but not necessarily performance) and dirty way is:
select count(distinct customername+'###'+Planet)
from #Customer;
The '###' is to separate the values so you don't get accidental collisions.
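A tiny illustration of the collision the separator guards against, using two made-up rows in an inline VALUES list (SQL Server 2008 or later). Without a separator, ('ab', 'c') and ('a', 'bc') both concatenate to 'abc' and are counted as one combination.

```sql
-- Two distinct (a, b) pairs that collide when concatenated directly.
SELECT COUNT(DISTINCT a + b)         AS without_separator,  -- 1
       COUNT(DISTINCT a + '###' + b) AS with_separator      -- 2
FROM (VALUES ('ab', 'c'),
             ('a',  'bc')) AS v(a, b);
```

The separator only helps if it cannot occur in the data itself; if '###' might appear in a name, the subquery approach is the safer choice.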

MySQL INSERT with multiple nested SELECTs

Is a query like this possible? MySQL gives me a syntax error. Multiple insert-values with nested selects...
INSERT INTO pv_indices_fields (index_id, veld_id)
VALUES
('1', SELECT id FROM pv_fields WHERE col1='76' AND col2='val1'),
('1', SELECT id FROM pv_fields WHERE col1='76' AND col2='val2')
I've just tested the following (which works):
insert into test (id1, id2) values (1, (select max(id) from test2)), (2, (select max(id) from test2));
I imagine the problem is that you haven't got ()s around your selects, as this query would not work without them.
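Applied to the query from the question, that fix would look roughly like this. It is a sketch that assumes each subquery returns at most one row; a subquery that matches no row yields NULL for veld_id, and one that matches several rows raises an error.

```sql
-- MySQL: each scalar subquery wrapped in its own parentheses.
INSERT INTO pv_indices_fields (index_id, veld_id)
VALUES
    ('1', (SELECT id FROM pv_fields WHERE col1 = '76' AND col2 = 'val1')),
    ('1', (SELECT id FROM pv_fields WHERE col1 = '76' AND col2 = 'val2'));
```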
When you have a subquery like that, it has to return one column and one row only. If your subqueries do return one row only, then you need parentheses around them, as @Thor84no noticed.
If they return (or could return) more than one row, try this instead:
INSERT INTO pv_indices_fields (index_id, veld_id)
SELECT '1', id
FROM pv_fields
WHERE col1='76'
AND col2 IN ('val1', 'val2')
or if your conditions are very different:
INSERT INTO pv_indices_fields (index_id, veld_id)
( SELECT '1', id FROM pv_fields WHERE col1='76' AND col2='val1' )
UNION ALL
( SELECT '1', id FROM pv_fields WHERE col1='76' AND col2='val2' )