Oracle - Group-level summaries - sql

I am trying to create a report that has a summary for each group. For example:
ID  NAME      COUNT  TOTAL  TYPE
-------------------------------------------------------------
1   Test 1    10            A
2   Test 2    8             A
                     18
7   Mr. Test  9             B
12  XYZ       4             B
                     13
25  ABC       3             C
26  DEF       5             C
19  GHIJK     1             C
                     9
I have a query that can do everything except the TOTAL columns:
select sd.id ID, count(sd.data_def_id) COUNT, defs.data_name NAME, sd.type
from some_data sd, data_defs defs
where sd.data_def_id = defs.data_def_id
group by sd.type, sd.id, defs.data_name
order by sd.id asc, count(sd.data_def_id) desc;
I'm just not sure how to get a summary on a group. In this case, I'm trying to get a sum of COUNT for each group of ID.
UPDATE:
Groups are by type. Forgot that in the original post.
TOTAL is SUM(COUNT) for each group.

How about using ROLLUP like...
select sd.id ID, count(sd.data_def_id) COUNT, defs.data_name NAME, sd.type
from some_data sd, data_defs defs
where sd.data_def_id = defs.data_def_id
group by ROLLUP(sd.type, (sd.id, defs.data_name))
order by sd.id asc, count(sd.data_def_id) desc;
This works for a similar example in my database, but I only did it over two columns, not sure how it will function over more...
Hope this is helpful,
Craig...
EDIT: In a ROLLUP, columns you want to sum over but not subtotal over (like id and data_name) should be lumped together inside the ROLLUP in parentheses.

Assuming SQL*Plus, you could do something like this:
col d1 noprint
col d2 noprint
WITH q AS
(SELECT sd.id, count(sd.data_def_id) COUNT, defs.data_name NAME, sd.type
FROM some_data sd JOIN data_defs defs ON (sd.data_def_id = defs.data_def_id)
GROUP BY sd.type, sd.id, defs.data_name)
SELECT 1 d1, type d2, id, count, name, NULL total FROM q
UNION ALL
SELECT 2, type, NULL, NULL, NULL, SUM(count) FROM q GROUP BY type
ORDER BY 2,1,3;
I can't make this work in PL/SQL Developer 8, only SQL*Plus. Not even the command window will work...
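For anyone who wants to see the UNION ALL idea run end to end without SQL*Plus, here is a minimal sketch using Python's built-in sqlite3 module instead of Oracle. The table q, the column name cnt and the sample rows are assumptions copied from the example report in the question; the real query would start from the grouped join instead.

```python
import sqlite3

# Hypothetical pre-aggregated rows, copied from the question's example report.
rows = [
    (1, "Test 1", 10, "A"), (2, "Test 2", 8, "A"),
    (7, "Mr. Test", 9, "B"), (12, "XYZ", 4, "B"),
    (25, "ABC", 3, "C"), (26, "DEF", 5, "C"), (19, "GHIJK", 1, "C"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q (id INTEGER, name TEXT, cnt INTEGER, type TEXT)")
conn.executemany("INSERT INTO q VALUES (?, ?, ?, ?)", rows)

# d1 = 1 marks detail rows, d1 = 2 marks the subtotal row of each type.
# Sorting by type first and d1 second puts each subtotal after its group.
report = conn.execute(
    """
    SELECT 1 AS d1, type AS d2, id, name, cnt, NULL AS total FROM q
    UNION ALL
    SELECT 2, type, NULL, NULL, NULL, SUM(cnt) FROM q GROUP BY type
    ORDER BY 2, 1, 3
    """
).fetchall()

for row in report:
    print(row)
```

Both halves of the UNION ALL must return the same number of columns, which is why the detail rows carry a NULL total and the subtotal rows carry NULL id, name and cnt.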

Try a subquery that returns the count of all the items of the type:
select sd.id ID, count(sd.data_def_id) COUNT, tot.total_for_type, defs.data_name NAME, sd.type
from some_data sd, data_defs defs,
(select sd2.type, count(sd2.data_def_id) total_for_type
from some_data sd2
group by sd2.type) tot
where sd.data_def_id = defs.data_def_id
and sd.type = tot.type
group by sd.type, sd.id, defs.data_name, tot.total_for_type
order by sd.id asc, count(sd.data_def_id) desc;
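A rough way to check the "total per type on every row" idea is to join against a subquery that is grouped by type. The sketch below uses Python's sqlite3 module rather than Oracle, and the rows in some_data and data_defs are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE data_defs (data_def_id INTEGER, data_name TEXT);
    CREATE TABLE some_data (id INTEGER, data_def_id INTEGER, type TEXT);
    INSERT INTO data_defs VALUES (100, 'Test 1'), (200, 'Test 2'), (300, 'Mr. Test');
    -- Made-up detail rows: id 1 appears 3 times, id 2 twice, id 7 once.
    INSERT INTO some_data VALUES
        (1, 100, 'A'), (1, 100, 'A'), (1, 100, 'A'),
        (2, 200, 'A'), (2, 200, 'A'),
        (7, 300, 'B');
    """
)

# The inner query computes one total per type; the join repeats that total
# on every row of the same type.
with_totals = conn.execute(
    """
    SELECT sd.id, defs.data_name, COUNT(sd.data_def_id) AS cnt,
           tot.total_for_type, sd.type
    FROM some_data sd
    JOIN data_defs defs ON sd.data_def_id = defs.data_def_id
    JOIN (SELECT type, COUNT(data_def_id) AS total_for_type
          FROM some_data
          GROUP BY type) tot ON tot.type = sd.type
    GROUP BY sd.type, sd.id, defs.data_name, tot.total_for_type
    ORDER BY sd.id
    """
).fetchall()

for row in with_totals:
    print(row)
```

Unlike the subtotal-row approaches, this repeats the total next to each detail row, so the report tool would have to decide where to display it.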

Related

Oracle Create a view replacing ids with names in column (not 1nf)

We have for example this table:
pl_num camp_type products
1 T 1,2,3
2 B 1,3,4
Yeah, I know it's not in 1NF, but we have to work with it because that is how the application loads the data.
And we have the table DICT_PRODUCT, for example (in reality there are more than 500 products):
id product_name
1 a
2 b
3 c
4 d
So what we need is a view in which each product id is replaced by its name from the dictionary:
---V_TAB1 ---
pl_num camp_type products
1 T a,b,c
2 B a,c,d
Try this. It will work if the products column in TAB1 contains only numbers and no other characters.
WITH prod
AS (SELECT pl_num, camp_type, TO_NUMBER (TRIM (COLUMN_VALUE)) product
FROM Tab1 t, XMLTABLE (t.products))
SELECT prod.pl_num,
prod.camp_type,
LISTAGG (d.product_name, ',') WITHIN GROUP (ORDER BY id) products
FROM prod JOIN dict_product d ON prod.product = d.id
GROUP BY prod.pl_num, prod.camp_type;
Try this one:
select distinct *
from (
select t.u_name, u_id, regexp_substr(t.prod,'[^,]+', 1, level) id
from (select prod,u_id, u_name from cmdm.t_prod) t
connect by regexp_substr(prod,'[^,]+',1,level) is not null) ut
inner join cmdm.t_dct dt
on ut.id=dt.id

Merge rows by one column

I have the columns name and value, and I need to write a select that merges all rows with the same name into one row (something like distinct). But when I use distinct, I can't merge/sum the value column.
Example:
name value
A 10
B 5
C 20
A 5
C 1
B 5
And the result would be:
A 15
B 10
C 21
This is my select so far, but it is not "merged", it does exactly what my example shows.
select
projects.name,
sum(current_date - (projects_programmers.joined_at))
from projects, projects_programmers
where projects.id = projects_programmers.project_id
group by projects.name, projects_programmers.joined_at
select name, sum(value) value from yourTableName group by name
select distinct
"name", sum("value") over (partition by "name")
from table_name
You can use a window function for this.
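The query in the question groups by joined_at as well as by name, so every distinct join date keeps its own row. Grouping by name alone collapses them. A minimal sketch with Python's sqlite3 module, using a hypothetical table t filled with the example rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 10), ("B", 5), ("C", 20), ("A", 5), ("C", 1), ("B", 5)],
)

# Group only by the column that identifies a merged row; every other
# selected column has to be wrapped in an aggregate such as SUM.
merged = conn.execute(
    "SELECT name, SUM(value) AS value FROM t GROUP BY name ORDER BY name"
).fetchall()

print(merged)
```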

How do I aggregate numbers from a string column in SQL

I am dealing with a poorly designed database column which has values like this
ID cid Score
1 1 3 out of 3
2 1 1 out of 5
3 2 3 out of 6
4 3 7 out of 10
I want the aggregate sum and percentage of Score column grouped on cid like this
cid sum percentage
1 4 out of 8 50
2 3 out of 6 50
3 7 out of 10 70
How do I do this?
You can try this way :
select
t.cid
, cast(sum(s.a) as varchar(5)) +
' out of ' +
cast(sum(s.b) as varchar(5)) as sum
, ((cast(sum(s.a) as decimal))/sum(s.b))*100 as percentage
from MyTable t
inner join
(select
id
, cast(left(score, charindex(' ', score) - 1) as int) a
, cast(substring(score,charindex('out of', score)+7,len(score)) as int) b
from MyTable
) s on s.id = t.id
group by t.cid
Redesign the table, but on-the-fly as a CTE. Here's a solution that's not as short as you could make it, but that takes advantage of the handy SQL Server function PARSENAME. You may need to tweak the percentage calculation if you want to truncate rather than round, or if you want it to be a decimal value, not an int.
In this or most any solution, you have to count on the column values for Score to be in the very specific format you show. If you have the slightest doubt, you should run some other checks so you don't miss or misinterpret anything.
with
P(ID, cid, Score2Parse) as (
select
ID,
cid,
replace(Score,space(1),'.')
from scores
),
S(ID,cid,pts,tot) as (
select
ID,
cid,
cast(parsename(Score2Parse,4) as int),
cast(parsename(Score2Parse,1) as int)
from P
)
select
cid, cast(round(100e0*sum(pts)/sum(tot),0) as int) as percentage
from S
group by cid;
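The same parse-then-aggregate idea can be tried outside SQL Server. This is only a sketch using Python's sqlite3 module: it swaps PARSENAME for SQLite's instr and substr, and it assumes every Score value has exactly the form "<number> out of <number>". The table name scores matches the CTE answer; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER, cid INTEGER, score TEXT)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, 1, "3 out of 3"), (2, 1, "1 out of 5"),
     (3, 2, "3 out of 6"), (4, 3, "7 out of 10")],
)

# Inner query: text before the first space is the points, text after
# ' out of ' (8 characters long) is the maximum.
# Outer query: sum both parts per cid and derive the rounded percentage.
summary = conn.execute(
    """
    SELECT cid,
           SUM(pts) || ' out of ' || SUM(tot) AS total,
           CAST(ROUND(100.0 * SUM(pts) / SUM(tot)) AS INTEGER) AS percentage
    FROM (SELECT cid,
                 CAST(substr(score, 1, instr(score, ' ') - 1) AS INTEGER) AS pts,
                 CAST(substr(score, instr(score, ' out of ') + 8) AS INTEGER) AS tot
          FROM scores)
    GROUP BY cid
    ORDER BY cid
    """
).fetchall()

for row in summary:
    print(row)
```

Any row that deviates from the expected format would silently produce a wrong number here, so a format check beforehand is still a good idea.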

Simple SQL query with select and group by

I'm having trouble understanding something.
I have the following table:
ID PROD PRICE
1 A 10
2 B 20
3 C 30
4 A 1
5 B 12
6 C 2
7 A 7
8 B 8
9 C 9
10 A 5
11 B 2
I want to get all the minimum prices of all the prod, meaning I want to get 3 records, the minimum price for every prod.
From the example above, this is what I want to get:
ID PROD MIN(PRICE)
4 A 1
11 B 2
6 C 2
This is the query I wrote:
select id, prod, min(price)
from A1
group by(prod);
But these are the records I got:
ID PROD MIN(PRICE)
1 A 1
2 B 2
3 C 2
As you can see, the ID value is wrong; it only gives me some kind of line counter and not the actual ID value.
You can check it at the next link
What am I doing wrong?
SELECT a.*
FROM A1 a
INNER JOIN
(
SELECT Prod, MIN(Price) minPrice
FROM A1
GROUP BY Prod
) b ON a.Prod = b.Prod AND
a.Price = b.minPrice
For MSSQL
SELECT ID, Prod, Price
FROM
(
SELECT ID, Prod, Price,
ROW_NUMBER() OVER(Partition BY Prod ORDER BY Price ASC) s
FROM A1
) a
WHERE s = 1
You must be using MySQL or perhaps PostgreSQL.
In standard SQL, all non-aggregate columns in the select-list must be cited in the GROUP BY clause.
I'm not clear whether you need the ID column. If not, then use:
SELECT prod, MIN(price) AS min_price
FROM A1
GROUP BY prod;
If you need the matching ID number, then that becomes a sub-query:
SELECT A1.id, A1.prod, A1.price
FROM A1
JOIN (SELECT prod, MIN(price) AS min_price
FROM A1
GROUP BY prod
) AS A2 ON A1.prod = A2.prod AND A1.price = A2.min_price;
Can you please explain what the problem is with what I wrote? And yes, I need the ID column.
select id, prod, min(price)
from A1
group by(prod);
In standard SQL, and in most SQL DBMSs even where they deviate from the standard, you would get an error message.
Where you are allowed to omit the ID column from the GROUP BY clause, then you get a quasi-random value for ID for the correct prod and MIN(price) values. Basically, the optimizer will choose any convenient ID that it knows about, based on its whims. Specifically, it does not do the sub-query and join that the full answer does. For example, it might do a sequential scan, and the ID it returns might be the first, or last, that it encounters for the given prod value, or it might be some other value — I'm not even sure whether the ID returned for prod = 'A' has to be an ID that was associated with prod = 'A'; you'd have to read the manual carefully. Basically, your query is indeterminate, so many return values are permissible and 'correct' (but not what you wanted).
Note that if you grouped by ID and not prod, then the result in prod would be determinate. That's because the ID column is a candidate key (unique identifier) for the table. (I believe PostgreSQL distinguishes between the two cases — but I'm not certain of that; MySQL does not.)
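To make the difference concrete, here is a small sketch of the sub-query-and-join approach, run through Python's sqlite3 module with the rows from the question. The aliases a, b and min_price are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A1 (id INTEGER PRIMARY KEY, prod TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO A1 VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "B", 20), (3, "C", 30), (4, "A", 1), (5, "B", 12),
     (6, "C", 2), (7, "A", 7), (8, "B", 8), (9, "C", 9), (10, "A", 5),
     (11, "B", 2)],
)

# Step 1 (inner query): the minimum price per prod.
# Step 2 (join): fetch the full rows whose price equals that minimum,
# so the id really belongs to the cheapest row.
cheapest = conn.execute(
    """
    SELECT a.id, a.prod, a.price
    FROM A1 a
    JOIN (SELECT prod, MIN(price) AS min_price
          FROM A1
          GROUP BY prod) b
      ON a.prod = b.prod AND a.price = b.min_price
    ORDER BY a.prod
    """
).fetchall()

print(cheapest)
```

If two rows of the same prod tie for the lowest price, this returns both of them.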

Selecting and sorting data from a single table

Correction to my question....
I'm trying to select and sort in a query from a single table. The primary key for the table is a combination of a serialized number and a time/date stamp.
The table's name in the database is "A12", the columns are defined as:
Serial2D (PK, char(25), not null)
Completed (PK, datetime, not null)
Result (smallint, null)
MachineID (FK, smallint, null)
PT_1 (float, null)
PT_2 (float, null)
PT_3 (float, null)
PT_4 (float, null)
Since the primary key for the table is a combination of the "Serial2D" and "Completed", there can be multiple "Serial2D" entries with different values in the "Completed" and "Result" columns. (I did not make this database... I have to work with what I got)
I want to write a query that uses the value of the "Result" column (always a "0" or "1") and retrieves only one row for each "Serial2D" value. If the "Result" column has a "1" for a row, I want to choose it over any entries for that Serial2D that have a "0" in the Result column. There should be only one entry in the table that has a Result column entry of "1" for any Serial2D value.
Ex. table
Serial2d  Completed  Result  PT_1  PT_2  PT_3  PT_4
--------  ---------  ------  ----  ----  ----  ----
A1        1:00AM     0       32.5  20    26    29
A1        1:02AM     0       32.5  10    29    40
A1        1:03AM     1       10    5     4     3
B1        1:04AM     0       29    4     1     9
B1        1:05AM     0       40    3     4     9
C1        1:06AM     1       9     7     6     4
What I would like to be able to retrieve is:
Serial2d  Completed  Result  PT_1  PT_2  PT_3  PT_4
--------  ---------  ------  ----  ----  ----  ----
A1        1:03AM     1       10    5     4     3
B1        1:05AM     0       40    3     4     9
C1        1:06AM     1       9     7     6     4
I'm new to SQL and I'm still learning ALL the syntax. I'm finding it difficult to search for the correct operators to use since I'm not sure what I need, so please forgive my ignorance. A post with my answer could be staring me right in the face and I wouldn't know it, so please just point me to it.
I appreciate the answers to my previous post, but the answers weren't sufficient for me due to MY lack of information and ineptness with SQL. I know this is probably insanely easy for some, but try to remember when you first started SQL... that's where I'm at.
Since you are using SQL Server, you can use Windowing Functions to get this data.
Using a sub-query:
select *
from
(
select *,
row_number() over(partition by serial2d
order by result desc, completed desc) rn
from a12
) x
where rn = 1
Or you can use CTE for this query:
;with cte as
(
select *,
row_number() over(partition by serial2d
order by result desc, completed desc) rn
from a12
)
select *
from cte c
where rn = 1;
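The ranking can be tried on the sample rows without a SQL Server instance. The sketch below runs it through Python's sqlite3 module, which assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). Storing Completed as a 'HH:MM' string and leaving out the PT_ columns are simplifications for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a12 (serial2d TEXT, completed TEXT, result INTEGER)")
conn.executemany(
    "INSERT INTO a12 VALUES (?, ?, ?)",
    [("A1", "01:00", 0), ("A1", "01:02", 0), ("A1", "01:03", 1),
     ("B1", "01:04", 0), ("B1", "01:05", 0), ("C1", "01:06", 1)],
)

# Within each serial, rank rows so that result = 1 comes first and,
# among equal results, the latest completion time wins. Keep rank 1.
picked = conn.execute(
    """
    SELECT serial2d, completed, result
    FROM (SELECT a12.*,
                 ROW_NUMBER() OVER (PARTITION BY serial2d
                                    ORDER BY result DESC, completed DESC) AS rn
          FROM a12) x
    WHERE rn = 1
    ORDER BY serial2d
    """
).fetchall()

print(picked)
```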
You can group by Serial to get the MAX of each Time.
SELECT Serial, MAX([Time]) AS [Time]
FROM myTable
GROUP BY Serial
HAVING MAX(Result) >= 0
SELECT
t.Serial,
max_Result,
MAX([time]) AS max_time
FROM
myTable t inner join
(SELECT
Serial,
MAX([Result]) AS max_Result
FROM
myTable
GROUP BY
Serial) m on
t.serial = m.serial and
t.result = m.max_result
group by
t.serial,
max_Result
This can be solved using a correlated sub-query:
SELECT
T.serial,
T.[time],
T.result
FROM tablename T
WHERE
T.result = 1
OR
NOT EXISTS(
SELECT 1
FROM tablename
WHERE
serial = T.serial
AND (
[time] > T.[time]
OR
result = 1
)
)
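For comparison, the correlated sub-query logic can be checked on the sample rows too. This sketch uses Python's sqlite3 module and the column names from the question (serial2d, completed, result) instead of serial and [time]. The 'HH:MM' strings for Completed are an assumption made so that plain string comparison orders the times correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a12 (serial2d TEXT, completed TEXT, result INTEGER)")
conn.executemany(
    "INSERT INTO a12 VALUES (?, ?, ?)",
    [("A1", "01:00", 0), ("A1", "01:02", 0), ("A1", "01:03", 1),
     ("B1", "01:04", 0), ("B1", "01:05", 0), ("C1", "01:06", 1)],
)

# Keep a row if it has result = 1, or if no other row for the same serial
# is either later or has result = 1 (i.e. it is the latest row of a serial
# that never succeeded).
kept = conn.execute(
    """
    SELECT t.serial2d, t.completed, t.result
    FROM a12 t
    WHERE t.result = 1
       OR NOT EXISTS (SELECT 1
                      FROM a12 o
                      WHERE o.serial2d = t.serial2d
                        AND (o.completed > t.completed OR o.result = 1))
    ORDER BY t.serial2d
    """
).fetchall()

print(kept)
```

This relies on the question's statement that at most one row per Serial2D has Result = 1; with several such rows, all of them would be returned.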