Running the same query over and over again - sql

I'm using Oracle SQL and I need help with a hard query.
I have the following table (MyTable):
id int,
name1 int,
name2 int,
..
..
..
name80 int,
These column names are fake.
Here is my query:
select id ,cnt / (select count(*) from MyTable)
from(
select id, name1, name2, count(distinct name1) over(partition by name2) cnt
from MyTable);
I need to run this query again for each next pair of columns. For example, the next pair will be:
select id ,cnt / (select count(*) from MyTable)
from(
select id, name2, name3, count(distinct name2) over(partition by name3) cnt
from MyTable);
And so on.
The final output table needs to include id and the calculation for each pair:
id int,
"calc of name1+name2" float,
"calc of name2+name3" float,
"calc of name3+name4" float,
"calc of name4+name5" float,
"calc of name5+name6" float,
...
...
...
"calc of name79+name80" float,
Can someone show me how to do that? I'd really appreciate any help. I feel lost.

Am I missing something? You want a query like this:
select id,
count(distinct name2) over (partition by name3) / count(*) over (),
count(distinct name3) over (partition by name4) / count(*) over (),
. . .
from mytable;
My guess is that your problem is typing all these rows.
You can run a query like this to generate the code:
select replace(replace('count(distinct <thiscol>) over (partition by <nextcol>) / count(*) over () as <thiscol>_<nextcol>,',
'<thiscol>', column_name
), '<nextcol>', lead(column_name) over (order by column_id)
)
from all_tab_columns atc
where table_name = 'MYTABLE'
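The same code generation can also be done on the client side. Below is a minimal sketch in Python, assuming the column names are already available as a list. The helper name build_pair_query and the four sample columns are hypothetical, not from the question. It only builds the SQL text and does not run it:

```python
def build_pair_query(table, columns):
    """Build one SELECT with a ratio expression per adjacent column pair."""
    template = ("count(distinct {this}) over (partition by {nxt}) "
                "/ count(*) over () as {this}_{nxt}")
    # zip(columns, columns[1:]) pairs each column with its successor,
    # so the last column never starts a pair of its own.
    exprs = [template.format(this=a, nxt=b)
             for a, b in zip(columns, columns[1:])]
    return "select id,\n  " + ",\n  ".join(exprs) + "\nfrom " + table


# Hypothetical sample: name1 .. name4 instead of the full name1 .. name80.
columns = ["name" + str(i) for i in range(1, 5)]
sql = build_pair_query("MyTable", columns)
print(sql)
```

With 80 columns this yields 79 expressions, one per adjacent pair, which matches the output layout described in the question.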

You can use dynamic SQL (EXECUTE IMMEDIATE) and the USER_TAB_COLUMNS view for the table's column metadata.

Related

Find duplicate ID and add new sequence ID

I have a table where ID must be unique. There are some IDs that are not unique. How do I generate a new column which adds a sequence to this ID? I want to generate ID_new_generated in the table below
ID  Company Name  ID_new_generated
1   A             1
1   B             1_2
2   C             2
You can use a window function (e.g. RANK) to generate a secondary ID over each window of rows that share the same ID, then concatenate it to the ID to create the new one.
Something like:
select
ID
, companyName
, rank() over(partition by ID ORDER BY companyName)
, concat(ID, '_', rank() over(partition by ID ORDER BY companyName)) as new_id
from test;
See this demo: https://www.db-fiddle.com/f/bd6aQKnZ7gcZCQjFpZicrp/0
Syntax will be different depending on which sql you are using.
Assuming you are looking for a solution in SQL Server:
First you will need to add a nullable column ID_Generated like below:
ALTER TABLE tablename
ADD ID_Generated varchar(25) null
GO
Then, use row_number like below in a CTE (you can use a temp table instead if you are using MySQL):
;with cte as (
SELECT DISTINCT t.ID,
(ROW_NUMBER() over (partition by t.ID order by t.ID)) as RowNumber
FROM tablename t
INNER JOIN (select ID, Count(*) RecCount
From tablename
group by ID
having Count(*) > 1) tt on t.ID = tt.ID
)
Update t
set t.ID_Generated = cte.RowNumber
from tablename t
inner join cte on t.ID = cte.ID
I think you want:
select ID, companyName,
(case when row_number() over (partition by id order by companyname) = 1
then cast(id as varchar(255))
else id || '_' || row_number() over (partition by id order by companyname)
end) as new_id
from test;
|| is the ANSI/ISO standard concatenation operator in SQL. Not all databases support it, so you might need to replace the operator with the one appropriate for your database.
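To check the CASE / row_number() query against the sample rows from the question, here is a small sketch that runs it in SQLite through Python's sqlite3 module. SQLite is only a stand-in chosen for this illustration (it supports || and, from version 3.25, window functions), and cast(... as text) replaces varchar(255):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (id integer, companyName text)")
conn.executemany("insert into test values (?, ?)",
                 [(1, "A"), (1, "B"), (2, "C")])

# The first row of each id keeps the plain id; later rows get id_<n>.
rows = conn.execute("""
    select id, companyName,
           case when row_number() over (partition by id order by companyName) = 1
                then cast(id as text)
                else id || '_' || row_number() over (partition by id order by companyName)
           end as new_id
    from test
    order by id, companyName
""").fetchall()

for row in rows:
    print(row)
```

The rows come back as (1, 'A', '1'), (1, 'B', '1_2') and (2, 'C', '2'), which is the ID_new_generated column the question asks for.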

aggregation according to different conditions on same column

I have a table #tbl like below. I need to write a query so that, if there are more than 3 records available for a particular cid, the avg(val) of that cid is displayed against each id, and if there are fewer than 3 records available for a particular cid, the avg(val) of all available records is displayed instead.
Please suggest.
declare #tbl table(id int, cid int, val float )
insert into #tbl
values(1,100,20),(2,100,30),(3,100,25),(4,100,31),(5,100,50),
(6,200,30),(7,200,30),(8,300,90)
Your description is not clear, but I believe you need windowed functions:
WITH cte AS (
SELECT *, COUNT(*) OVER(PARTITION BY cid) AS cnt
FROM #tbl
)
SELECT id, (SELECT AVG(val) FROM cte) AS Av
FROM cte
WHERE cnt <=3
UNION ALL
SELECT id, AVG(val) OVER(PARTITION BY cid) AS Av
FROM cte
WHERE cnt > 3
ORDER BY id;
DBFiddle Demo
EDIT:
SELECT id,
CASE WHEN COUNT(*) OVER(PARTITION BY cid) <= 3 THEN AVG(val) OVER()
ELSE AVG(val) OVER(PARTITION BY cid)
END
FROM #tbl
ORDER BY id;
DBFiddle Demo2
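The single-pass CASE version can be tried against the sample data with a short sketch like the one below. It uses SQLite via Python's sqlite3 module purely as a stand-in for SQL Server (an assumption of this example, needing SQLite 3.25 or later for window functions), and a plain table named tbl instead of #tbl:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (id integer, cid integer, val real)")
conn.executemany(
    "insert into tbl values (?, ?, ?)",
    [(1, 100, 20), (2, 100, 30), (3, 100, 25), (4, 100, 31), (5, 100, 50),
     (6, 200, 30), (7, 200, 30), (8, 300, 90)],
)

# Groups with at most 3 rows fall back to the average over the whole table.
rows = conn.execute("""
    select id,
           case when count(*) over (partition by cid) <= 3
                then avg(val) over ()
                else avg(val) over (partition by cid)
           end as av
    from tbl
    order by id
""").fetchall()

for row in rows:
    print(row)
```

Ids 1 to 5 (cid 100, five rows) get that group's own average of 31.2, while ids 6 to 8 get the overall average of 38.25.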
You can try the following. First calculate the average for each cid depending on its number of occurrences, then join each cid back to the ids to display the whole table.
;WITH CidAverages AS
(
SELECT
T.cid,
Average = CASE
WHEN COUNT(1) >= 3 THEN AVG(T.val)
ELSE (SELECT AVG(Val) FROM #tbl) END
FROM
#tbl AS T
GROUP BY
T.cid
)
SELECT
T.*,
C.Average
FROM
#tbl AS T
INNER JOIN CidAverages AS C ON T.cid = C.cid
Given the clarifications in the comments, I think this is the intention:
declare #tbl table(id int, cid int, val float )
insert into #tbl
values(1,100,20),(2,100,30),(3,100,25),(4,100,31),(5,100,50),
(6,200,30),(7,200,30),(8,300,90);
select distinct
cid
, case
when count(*) over (partition by cid) > 3 then avg(val) over (partition by cid)
else avg (val) over (partition by 1)
end as avg
from #tbl;
http://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=fdf4c4457220ec64132de7452a034976
cid avg
100 31.2
200 38.25
300 38.25
Several aspects of a query like this are likely to produce a poor query plan when run at scale, so I'd want to test it on a larger data set and tune it before using it.
The description was not clear about what happens when there are exactly 3 records: it mentions 'more than 3' and 'less than 3'. In this code, 'more than 3' decides the category, and 'less than 3' is interpreted as 'less than or equal to 3'.

Select max from calculated column alongside other columns

I have a table that has columns ID, FIELD1, FIELD2, all of type NUMBER.
I want to find the MAX of a function on FIELD1 and FIELD2, and display that alongside the ID.
I tried
SELECT ID, MAX(SQRT(FIELD1 + FIELD2)) AS CALC
FROM TABLE;
But it returns ORA-00937: not a single-group group function.
I tried the solutions in this thread, but they have their own errors.
SELECT * FROM (
SELECT ID, SQRT(FIELD1 + FIELD2) AS CALC,
RANK() OVER (ORDER BY CALC DESC) AS RANKING
FROM TABLE
)
WHERE RANKING = 1;
gives the error
ORA-06553: PLS-306: wrong number or types of arguments in call to
'OGC_CALC'
and so does
SELECT ID, SQRT(FIELD1 + FIELD2) AS CALC
FROM TABLE
WHERE CALC = (
SELECT MAX(CALC)
FROM TABLE
);
Using Oracle Database 11g Express Edition Release 11.2.0.2.0.
How can I get this to work? Thanks.
Can you try the two queries below?
SELECT ID,FIELD1,FIELD2,SQRT(FIELD1 + FIELD2) AS CALC
FROM TABLE WHERE SQRT(FIELD1 + FIELD2)= (SELECT MAX(SQRT(FIELD1 + FIELD2)) FROM TABLE);
Or, as suggested by Aleksej, without using an aggregate function or GROUP BY:
SELECT *
FROM (
SELECT ID,
SQRT(FIELD1 + FIELD2)
FROM TABLE
ORDER BY 2 DESC
)
WHERE ROWNUM=1;
Initial query:
SELECT *
FROM (
SELECT ID,
MAX(SQRT(FIELD1 + FIELD2)) AS CALC
FROM TABLE
GROUP BY ID
ORDER BY 2 DESC
)
WHERE ROWNUM=1;
If you need a single value, with the maximum sqrt(field1+field2) among all the values of ID, these are two possible ways:
select *
from (
select *
from yourTable
order by sqrt(field1 + field2) desc
)
where rownum = 1
select id, field1, field2
from (
select t.*,
row_number() over ( order by sqrt(field1 + field2) desc) as rn
from yourTable t
)
where rn = 1
Notice that if more than one ID shares the same maximum value, this will pick one of them arbitrarily.
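If ties should all be kept, one option is to compute the expression in an inner query and rank on its alias one level further out, since an alias cannot be referenced in the SELECT list that defines it. A rough sketch, run in SQLite through Python's sqlite3 module as a stand-in for Oracle; the table name your_table, the sample rows, and the registered sqrt function are assumptions of this example:

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register sqrt explicitly, because not every SQLite build ships math functions.
conn.create_function("sqrt", 1, math.sqrt)
conn.execute("create table your_table (id integer, field1 real, field2 real)")
conn.executemany("insert into your_table values (?, ?, ?)",
                 [(1, 9, 16), (2, 30, 34), (3, 1, 3)])

# Innermost query defines calc, the middle one ranks it, the outer one filters.
rows = conn.execute("""
    select id, calc
    from (
        select id, calc, rank() over (order by calc desc) as ranking
        from (select id, sqrt(field1 + field2) as calc from your_table)
    )
    where ranking = 1
""").fetchall()

print(rows)
```

Because rank() gives tied rows the same rank, every ID that reaches the maximum would be returned, unlike the rownum and row_number() variants.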

How to group rows, select one by a number - ext with other columns - in SQL Oracle?

I had an issue with writing a query that would gather groups in a column, and then select one of them by a number.
A good person (@sstan) gave me this:
select your_col
from (select your_col,
row_number() over (order by your_col) as rn
from your_table
group by your_col)
where rn = 2
And it works. However, it appears that my query needs to consider other columns. For now, it looks like this:
select MAINCOL, sum(some_col+other_col) as together_col, count(another_col)
from my_table
where date_col >= next_day(trunc(sysdate), 'MONDAY') - 14
and date_col < next_day(trunc(sysdate), 'MONDAY') - 7
group by MAINCOL, other_col, together_col
order by MAINCOL
So the challenge is to extend the upper query with what is below. It seems simple, but I couldn't make it work.
You can try giving the inner query an alias:
SELECT t.your_col, t.your_col2, t.your_col3
FROM (select your_col, your_col2, your_col3,
row_number() over (order by your_col) as rn
from your_table
group by your_col, your_col2, your_col3) t
where t.rn = 2
Got it!
With help of Stack, of course.
select t.*
from (select MAINCOL, col1, col2, col3, col4, DENSE_RANK()OVER(ORDER BY MAINCOL) GROUPID
from tab_1
group by MAINCOL, col1, col2
) t
where GROUPID = 1;
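The idea behind DENSE_RANK here is that every row with the same MAINCOL value receives the same group number, so filtering on that number returns a whole group together with its other columns. A small sketch of just that step, run in SQLite via Python's sqlite3 module as a stand-in for Oracle; the table my_table, its two columns, and the sample rows are made up for this illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (maincol text, col1 integer)")
conn.executemany("insert into my_table values (?, ?)",
                 [("A", 1), ("A", 2), ("B", 3), ("B", 4), ("C", 5)])

# dense_rank numbers the distinct maincol values 1, 2, 3 without gaps.
rows = conn.execute("""
    select maincol, col1
    from (select maincol, col1,
                 dense_rank() over (order by maincol) as groupid
          from my_table) t
    where groupid = 2
    order by col1
""").fetchall()

print(rows)
```

Asking for group 2 returns both 'B' rows. With row_number() instead, each row would get its own number, and the filter would return a single row.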

Union select statements horizontally

Let's say the results of my select statements are as follows (I have 5 of those):
Id  Animal  AnimalId
1   Dog     Dog1
1   Cat     Cat57
Id  Transport  TransportId
2   Car        Car100
2   Plane      Plane500
I'd like to get a result as follows:
Id  Animal  AnimalId  Transport  TransportId
1   Dog     Dog1
1   Cat     Cat57
2                     Car        Car100
2                     Plane      Plane500
What I can do is create a table variable that specifies all possible columns and insert the records from each select statement into it. But maybe there is a better solution, like PIVOT?
Edit
queries: 1st: Select CategoryId as Id, Animal, AnimalId from Animal
2nd: Select CategoryId as Id, Transport, TransportId from Transport
How about this? If you need them in the same rows, this gets the row_number() for each row and joins on those:
select a.id,
a.aname,
a.aid,
t.tname,
t.tid
from
(
select id, aname, aid, row_number() over(order by aid) rn
from animal
) a
left join
(
select id, tname, tid, row_number() over(order by tid) rn
from transport
) t
on a.rn = t.rn
see SQL Fiddle with Demo
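To see what the row_number() join produces, here is a sketch that runs it in SQLite through Python's sqlite3 module, used here only as a stand-in for SQL Server. The column names (id, aname, aid, tname, tid) follow the query above; the table definitions are assumed for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table animal (id integer, aname text, aid text);
    create table transport (id integer, tname text, tid text);
    insert into animal values (1, 'Dog', 'Dog1'), (1, 'Cat', 'Cat57');
    insert into transport values (2, 'Car', 'Car100'), (2, 'Plane', 'Plane500');
""")

# Each side is numbered independently, then rows are paired by position.
rows = conn.execute("""
    select a.id, a.aname, a.aid, t.tname, t.tid
    from (select id, aname, aid,
                 row_number() over (order by aid) as rn
          from animal) a
    left join (select id, tname, tid,
                      row_number() over (order by tid) as rn
               from transport) t
      on a.rn = t.rn
    order by a.rn
""").fetchall()

for row in rows:
    print(row)
```

The animals and transports end up side by side on two rows, paired only by their position, so this differs from the staggered layout in the question, where each source keeps its own rows.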
If you don't need them in the same row, then use UNION ALL:
select id, aname, aid, 'Animal' tbl
from animal
union all
select id, tname, tid, 'Transport'
from transport
see SQL Fiddle with Demo
Edit #1, here is a version with an UNPIVOT and PIVOT:
select an_id, [aname], [aid], [tname], [tid]
from
(
select *, row_number() over(partition by col order by col) rn
from animal
unpivot
(
value
for col in (aname, aid)
) u
union all
select *, row_number() over(partition by col order by col) rn
from transport
unpivot
(
value
for col in (tname, tid)
) u
) x1
pivot
(
min(value)
for col in([aname], [aid], [tname], [tid])
) p
order by an_id
see SQL Fiddle with Demo
This would do it for you:
SELECT
ID, field1, field2, '' as field3, '' as field4
FROM sometable
UNION ALL
SELECT
ID, '', '', field3, field4
FROM someothertable
create table Animal (
Animal varchar(50)
,AnimalID varchar(50)
)
create table Transport (
Transport varchar(50)
,TransportID varchar(50)
)
insert into Animal values ('Dog', 'Dog1')
insert into Animal values ('Cat', 'Cat57')
insert into Transport values ('Car', 'Car100')
insert into Transport values ('Plane', 'Plane500')
select ID = 1
,A.Animal
,A.AnimalID
,Transport = ''
,TransportID = ''
from Animal A
union
select ID = 2
,Animal = ''
,AnimalID = ''
,T.Transport
,T.TransportID
from Transport T
To get it in the format you want, select the values you want, and then null (or an empty string) for the other columns.
SELECT
CategoryId as Id,
Animal as 'Animal',
AnimalId as 'AnimalId',
null as 'Transport',
null as 'TransportId'
FROM Animal
UNION
SELECT
CategoryId as Id,
null as 'Animal',
null as 'AnimalId',
Transport as 'Transport',
TransportId as 'TransportId'
FROM Transport
I'm still not sure of the purpose of this, but this should give the output you want.
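A quick way to confirm that the NULL-padded union gives the staggered layout from the question is to run it on the sample rows. The sketch below does that in SQLite via Python's sqlite3 module as a stand-in for SQL Server; the CategoryId columns come from the question's edit, the table definitions are assumed, and UNION ALL is used so that no duplicate elimination takes place:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Animal (CategoryId integer, Animal text, AnimalId text);
    create table Transport (CategoryId integer, Transport text, TransportId text);
    insert into Animal values (1, 'Dog', 'Dog1'), (1, 'Cat', 'Cat57');
    insert into Transport values (2, 'Car', 'Car100'), (2, 'Plane', 'Plane500');
""")

# Each branch fills its own columns and pads the other source's columns with NULL.
rows = conn.execute("""
    select CategoryId as Id, Animal, AnimalId,
           null as Transport, null as TransportId
    from Animal
    union all
    select CategoryId, null, null, Transport, TransportId
    from Transport
    order by Id
""").fetchall()

for row in rows:
    print(row)
```

Every row carries values only in the columns of its own source (NULL appears as None in Python), and further select statements can be appended as extra UNION ALL branches with their own pair of columns.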
You shouldn't need to pivot, your results are already fine.
If you want, you can just UNION all 5 statements together in the same format as the first select: ID/Category/CategoryID. Then you'll get one long result set with all 5 sets appended 3 columns wide.
Is that what you want? Or do you need to distinguish between 'categories'?
Given your example, try:
Select CategoryId as Id, Animal, AnimalId from Animal
union all
Select CategoryId as Id, Transport, TransportId from Transport
If you want, you can alias the columns like:
Select CategoryId as Id, Animal as category, AnimalId as categoryID from Animal
union all
Select CategoryId as Id, Transport, TransportId from Transport
You really don't need to pivot; just space out your columns as you were thinking initially. You don't pivot to move columns; you pivot to perform an aggregate function over grouped data.