Row Aggregation in SQL [duplicate] - sql

I have below data:
Source Dest Fare
jal del 5000
del jal 6000
mum jal 7000
jal mum 8000
I want the below output:
City Total_Fare
jal/del 11000
mum/jal 15000
Can anyone suggest a query?

I'd use least and greatest to create source-dest pairs, and then group on them and sum the fare:
SELECT LEAST(source, dest) || '/' || GREATEST(source, dest) AS city,
SUM(fare) AS total_fare
FROM mytable
GROUP BY LEAST(source, dest) || '/' || GREATEST(source, dest)

You want to aggregate by the two involved cities, no matter whether they be source or destination. In order to do that, order them. A simple way to do this is with GREATEST and LEAST:
select
greatest(source, dest) || '/' || least(source, dest) as cities,
sum(fare) as total_fare
from mytable
group by greatest(source, dest), least(source, dest)
order by greatest(source, dest), least(source, dest);
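For anyone who wants to try the pair-ordering idea without a database at hand, here is a minimal sketch using Python's built-in sqlite3 module. One assumption to note: SQLite has no LEAST/GREATEST, so the sketch swaps in SQLite's multi-argument MIN/MAX scalar functions, which behave the same way for this purpose. The table name mytable comes from the answers above.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (source TEXT, dest TEXT, fare INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("jal", "del", 5000), ("del", "jal", 6000),
     ("mum", "jal", 7000), ("jal", "mum", 8000)],
)

# Assumption: multi-argument MIN/MAX stand in for LEAST/GREATEST in SQLite.
# Ordering the two cities makes (jal, del) and (del, jal) land in one group.
pair_totals = conn.execute(
    """
    SELECT MIN(source, dest) || '/' || MAX(source, dest) AS city,
           SUM(fare) AS total_fare
    FROM mytable
    GROUP BY MIN(source, dest), MAX(source, dest)
    ORDER BY city
    """
).fetchall()

print(pair_totals)
```

The labels come out alphabetically ordered within each pair (del/jal rather than jal/del); which city is shown first is just a matter of which function you put on the left of the concatenation.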


Oracle SQL - variable number of columns in model clause

I am looking into the Oracle SQL Model clause. I am trying to write dynamic Oracle SQL which can be adapted to run for a varying number of columns each time, using this model clause. However I am struggling to see how I could adapt this (even using PL/SQL) to a dynamic/generic query or procedure
Here is a rough view of the table I am working on:
OWNER||ACCOUNT_YEAR||ACCOUNT_NAME||PERIOD_1||PERIOD_2||PERIOD_3||PERIOD_4||PERIOD_5||PERIOD_6||....
---------------------------------------------------------------------------------------------------
9640|| 2018 ||something 1|| 34 || 444 || 982 || 55 || 42 || 65 ||
9640|| 2018 ||something 2|| 333 || 65 || 666 || 78 || 44 || 55 ||
9640|| 2018 ||something 3|| 6565 || 783 || 32 || 12 || 46 || 667 ||
Here is what I have so far:
select OWNER, PERIOD_1, PERIOD_2, PERIOD_3, PERIOD_4, PERIOD_5, PERIOD_6, PERIOD_7, PERIOD_8, PERIOD_9, PERIOD_10, PERIOD_11, PERIOD_12, ACCOUNT_YEAR, ACCOUNT_NAME
from DATA-TABLE
where OWNER IN ('9640') and PERIOD_1 is not null
MODEL ignore nav
Return UPDATED ROWS
PARTITION BY (OWNER, ACCOUNT_NAME)
DIMENSION BY (ACCOUNT_YEAR)
MEASURES (PERIOD_1,PERIOD_2, PERIOD_3, PERIOD_4, PERIOD_5, PERIOD_6, PERIOD_7, PERIOD_8, PERIOD_9, PERIOD_10, PERIOD_11, PERIOD_12)
RULES
(
PERIOD_1[2021] = PERIOD_1[2018] * 1.05,
PERIOD_2[2021] = PERIOD_2[2018] * 1.05,
PERIOD_3[2021] = PERIOD_3[2018] * 1.05,
PERIOD_4[2021] = PERIOD_4[2018] * 1.05,
PERIOD_5[2021] = PERIOD_5[2018] * 1.05,
PERIOD_6[2021] = PERIOD_6[2018] * 1.05,
PERIOD_7[2021] = PERIOD_7[2018] * 1.05,
PERIOD_8[2021] = PERIOD_8[2018] * 1.05,
PERIOD_9[2021] = PERIOD_9[2018] * 1.05,
PERIOD_10[2021] = PERIOD_10[2018] * 1.05,
PERIOD_11[2021] = PERIOD_11[2018] * 1.05,
PERIOD_12[2021] = PERIOD_12[2018] * 1.05
)
ORDER BY ACCOUNT_YEAR asc;
As you can see in the measures and rules sections, I am currently hardcoding each period column into this query.
I want to be able to use this model clause (well, specifically the rules part) in a flexible way, so I can have a query which could be run for, say, just periods 1-3, or periods 5-12.
I have tried looking into this, but all examples show the left-hand side of the rule (e.g. PERIOD_12[2021] =...) explicitly referring to a column in a table, rather than a parameter or variable I can simply swap in for something else.
Any help on how I might accomplish this through SQL or PL/SQL would be greatly appreciated.
First, you should try to avoid dynamic columns by changing the table structure to a simpler format. SQL is much simpler if you store the data vertically instead of horizontally - use multiple rows instead of multiple columns.
If you can't change the data structure, you still want to keep the MODEL query as simple as possible, because the MODEL clause is a real pain to work with. Transform the table from columns to rows using UNPIVOT, run a simplified MODEL query, and then transform the results back if necessary.
If you really, really need dynamic columns in a pure SQL statement, you'll either need to use an advanced data type like Gary Myers suggested, or use the Method4 solution below.
Sample Schema
To make the examples fully reproducible, here's the sample data I used, along with the MODEL query (which I had to slightly modify to only reference 6 variables and the new table name).
create table data_table
(
owner number,
account_year number,
account_name varchar2(100),
period_1 number,
period_2 number,
period_3 number,
period_4 number,
period_5 number,
period_6 number
);
insert into data_table
select 9640, 2018 ,'something 1', 34 , 444 , 982 , 55 , 42 , 65 from dual union all
select 9640, 2018 ,'something 2', 333 , 65 , 666 , 78 , 44 , 55 from dual union all
select 9640, 2018 ,'something 3', 6565 , 783 , 32 , 12 , 46 , 667 from dual;
commit;
MODEL query:
select OWNER, PERIOD_1, PERIOD_2, PERIOD_3, PERIOD_4, PERIOD_5, PERIOD_6, ACCOUNT_YEAR, ACCOUNT_NAME
from DATA_TABLE
where OWNER IN ('9640') and PERIOD_1 is not null
MODEL ignore nav
Return UPDATED ROWS
PARTITION BY (OWNER, ACCOUNT_NAME)
DIMENSION BY (ACCOUNT_YEAR)
MEASURES (PERIOD_1,PERIOD_2, PERIOD_3, PERIOD_4, PERIOD_5, PERIOD_6)
RULES
(
PERIOD_1[2021] = PERIOD_1[2018] * 1.05,
PERIOD_2[2021] = PERIOD_2[2018] * 1.05,
PERIOD_3[2021] = PERIOD_3[2018] * 1.05,
PERIOD_4[2021] = PERIOD_4[2018] * 1.05,
PERIOD_5[2021] = PERIOD_5[2018] * 1.05,
PERIOD_6[2021] = PERIOD_6[2018] * 1.05
)
ORDER BY ACCOUNT_YEAR, ACCOUNT_NAME asc;
Results:
OWNER PERIOD_1 PERIOD_2 PERIOD_3 PERIOD_4 PERIOD_5 PERIOD_6 ACCOUNT_YEAR ACCOUNT_NAME
----- -------- -------- -------- -------- -------- -------- ------------ ------------
9640 35.7 466.2 1031.1 57.75 44.1 68.25 2021 something 1
9640 349.65 68.25 699.3 81.9 46.2 57.75 2021 something 2
9640 6893.25 822.15 33.6 12.6 48.3 700.35 2021 something 3
UNPIVOT approach
This example uses static code to demonstrate the syntax, but this can also be made more dynamic if necessary, perhaps through PL/SQL that creates temporary tables.
create table unpivoted_data as
select *
from data_table
unpivot (quantity for period_code in (period_1 as 'P1', period_2 as 'P2', period_3 as 'P3', period_4 as 'P4', period_5 as 'P5', period_6 as 'P6'));
With unpivoted data, the MODEL clause becomes simpler. Instead of listing a rule for each period, simply partition by the PERIOD_CODE:
select *
from unpivoted_data
where OWNER IN ('9640')
and (OWNER, ACCOUNT_YEAR, ACCOUNT_NAME) in
(
select owner, account_year, account_name
from unpivoted_data
where period_code = 'P1'
and quantity is not null
)
MODEL ignore nav
Return UPDATED ROWS
PARTITION BY (OWNER, ACCOUNT_NAME, PERIOD_CODE)
DIMENSION BY (ACCOUNT_YEAR)
MEASURES (QUANTITY)
RULES
(
QUANTITY[2021] = QUANTITY[2018] * 1.05
)
ORDER BY ACCOUNT_YEAR, ACCOUNT_NAME, PERIOD_CODE;
Results:
OWNER ACCOUNT_YEAR ACCOUNT_NAME PERIOD_CODE QUANTITY
----- ------------ ------------ ----------- --------
9640 2021 something 1 P1 35.7
9640 2021 something 1 P2 466.2
9640 2021 something 1 P3 1031.1
...
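To illustrate why the vertical layout helps, here is a rough sketch in Python with the built-in sqlite3 module. It is not the MODEL clause (SQLite has none); it swaps in a plain INSERT ... SELECT to show that, once each period is a row, a single generic statement covers every period. The table and column names follow unpivoted_data above, and only the first three periods of one account are loaded to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE unpivoted_data ("
    "owner INTEGER, account_year INTEGER, account_name TEXT, "
    "period_code TEXT, quantity REAL)"
)
# A few 2018 rows in the long (one row per period) format.
conn.executemany(
    "INSERT INTO unpivoted_data VALUES (9640, 2018, 'something 1', ?, ?)",
    [("P1", 34), ("P2", 444), ("P3", 982)],
)

# One statement projects every period; no per-column rule is needed.
# Assumption: plain INSERT ... SELECT stands in for the MODEL rule here.
conn.execute(
    """
    INSERT INTO unpivoted_data
    SELECT owner, 2021, account_name, period_code, quantity * 1.05
    FROM unpivoted_data
    WHERE account_year = 2018
    """
)

projected = conn.execute(
    "SELECT period_code, quantity FROM unpivoted_data "
    "WHERE account_year = 2021 ORDER BY period_code"
).fetchall()

for period_code, quantity in projected:
    print(period_code, round(quantity, 2))
```

Restricting the run to, say, periods 1-3 then becomes an ordinary WHERE condition on period_code instead of a change to the query's column list.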
Dynamic SQL in SQL
If you really need to do this all in one query, my open source package Method4 can help. Once the package is installed, you call it by passing in a query that will generate the query you want to run.
This query returns the same results as the previous MODEL query, but will automatically adjust based on the columns in the table.
select * from table(method4.dynamic_query(
q'[
--Generate the MODEL query.
select
replace(replace(q'<
select OWNER, #PERIOD_COLUMN_LIST#, ACCOUNT_YEAR, ACCOUNT_NAME
from DATA_TABLE
where OWNER IN ('9640') and PERIOD_1 is not null
MODEL ignore nav
Return UPDATED ROWS
PARTITION BY (OWNER, ACCOUNT_NAME)
DIMENSION BY (ACCOUNT_YEAR)
MEASURES (#PERIOD_COLUMN_LIST#)
RULES
(
#RULES#
)
ORDER BY ACCOUNT_YEAR, ACCOUNT_NAME asc
>', '#PERIOD_COLUMN_LIST#', period_column_list)
, '#RULES#', rules) sql_statement
from
(
--List of columns.
select
listagg(column_name, ', ') within group (order by column_id) period_column_list,
listagg(column_name||'[2021] = '||column_name||'[2018] * 1.05', ','||chr(10)) within group (order by column_id) rules
from user_tab_columns
where table_name = 'DATA_TABLE'
and column_name like 'PERIOD%'
)
]'
));
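If the statement can be built on the client side instead, the same text generation is easy to sketch outside the database. The Python function below is a hypothetical helper (build_model_query and its parameters are not part of Method4 or Oracle); it only assembles the SQL text, mirroring the two listagg expressions above, and leaves execution to whatever driver you use.

```python
def build_model_query(period_columns, base_year=2018, target_year=2021, factor=1.05):
    """Return MODEL query text for the given measure columns (hypothetical helper).

    The column names are spliced into the SQL text, so they should come from a
    trusted source such as the data dictionary, never from user input.
    """
    column_list = ", ".join(period_columns)
    # One rule per column, same shape as the hardcoded rules in the question.
    rules = ",\n".join(
        f"{col}[{target_year}] = {col}[{base_year}] * {factor}"
        for col in period_columns
    )
    return (
        f"select OWNER, {column_list}, ACCOUNT_YEAR, ACCOUNT_NAME\n"
        "from DATA_TABLE\n"
        "where OWNER IN ('9640') and PERIOD_1 is not null\n"
        "MODEL ignore nav\n"
        "Return UPDATED ROWS\n"
        "PARTITION BY (OWNER, ACCOUNT_NAME)\n"
        "DIMENSION BY (ACCOUNT_YEAR)\n"
        f"MEASURES ({column_list})\n"
        "RULES\n"
        "(\n"
        f"{rules}\n"
        ")\n"
        "ORDER BY ACCOUNT_YEAR, ACCOUNT_NAME asc"
    )


# Example: generate the query for periods 1-3 only.
query_text = build_model_query(["PERIOD_1", "PERIOD_2", "PERIOD_3"])
print(query_text)
```

Each distinct column list still produces a different statement that has to be parsed separately; the helper just saves you from typing the rules by hand.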
Don't.
You can get an idea of the underlying obstruction if you understand the PARSE, BIND, EXECUTE flow of SQL as demonstrated by the DBMS_SQL package
https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_SQL.html#GUID-BF7B8D70-6A09-4E04-A216-F8952C347BAF
A cursor is opened and an SQL statement is parsed once. After being parsed, DESCRIBE_COLUMNS can be called, which tells you definitively which columns will be returned by the execution of that SQL statement. From that point you can do multiple BIND and EXECUTE executions, putting different values for variables into the same statement and re-running it. Each EXECUTE may be followed by one or more FETCHes. None of the bind, execute or fetch steps can affect what columns are returned (in number of columns, name, order or datatype).
The only way to change the columns returned is to parse a different SQL statement.
Depending on what you want at the end, you might be able to use a complex datatype (such as XML or JSON) to return data with different internal structures from the same statement (or even in different rows returned by the same statement).
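The same principle can be seen in miniature with Python's built-in sqlite3 module (used here only as a stand-in for Oracle and DBMS_SQL; the table t and its payload column are made up for the illustration). The statement text fixes the column list, different bind values leave it unchanged, and a JSON column is what lets each row carry a different internal shape.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, json.dumps({"period_1": 34})),
     (2, json.dumps({"period_5": 42, "period_6": 65}))],
)


def column_names(cursor):
    """Names of the result columns as described by the driver."""
    return [description[0] for description in cursor.description]


sql = "SELECT id, payload FROM t WHERE id = ?"

# Same statement text, two different bind values.
cur = conn.execute(sql, (1,))
cols_first, row_first = column_names(cur), cur.fetchone()
cur = conn.execute(sql, (2,))
cols_second, row_second = column_names(cur), cur.fetchone()

# The column list is identical; only the content inside the JSON differs.
keys_first = sorted(json.loads(row_first[1]))
keys_second = sorted(json.loads(row_second[1]))
print(cols_first, keys_first)
print(cols_second, keys_second)
```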

Loop Through a Table to concatenate Rows

I have a table of similar structure:
Name Movies_Watched
A Terminator
B Alien
A Batman
B Rambo
B Die Hard
....
I am trying to get this:
Name Movies_Watched
A Terminator;Batman
B Alien, Die Hard, Rambo
My initial guess was:
SELECT Name, Movies_Watched || Movies_Watched from TABLE
But obviously that's wrong. Can someone tell me how can I loop through the 2nd column and concatenate them? What's the logic like?
I found out that group_concat is the right approach, but I haven't been able to figure it out yet. I tried:
select name, group_concat(movies_watched) from table group by 1
But it throws an error saying: User-defined transform function group_concat must have an over clause
You are looking for string_agg():
select name, string_agg(movies_watched, ';') as movies_watched
from t
group by name;
That said, you are using Postgres, so you should learn how to use arrays instead of strings for such things. For instance, there is no confusion with arrays when the movie name has a semicolon. That would be:
select name, array_agg(movies_watched) as movies_watched
from t
group by name;
Use array_agg:
SELECT Name, array_agg(Movies_Watched)
FROM data_table
GROUP BY Name
I think you need listagg or group_concat, since you are using Vertica; the query above is the Postgres solution:
SELECT Name, listagg(Movies_Watched)
FROM data_table
GROUP BY Name
or
select Name,
group_concat(Movies_Watched) over (partition by Name order by name) ag
from mytable
As already mentioned, in Vertica it's LISTAGG():
WITH
input(nm,movies_watched) AS (
SELECT 'A','Terminator'
UNION ALL SELECT 'B','Alien'
UNION ALL SELECT 'A','Batman'
UNION ALL SELECT 'B','Rambo'
UNION ALL SELECT 'B','Die Hard'
)
SELECT
nm AS "Name"
, LISTAGG(movies_watched) AS movies_watched
FROM input
GROUP BY nm;
-- out Name | movies_watched
-- out ------+----------------------
-- out A | Terminator,Batman
-- out B | Alien,Rambo,Die Hard
-- out (2 rows)
-- out
-- out Time: First fetch (2 rows): 12.735 ms. All rows formatted: 12.776 ms
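If you just want to see the grouping logic run locally, here is a small sketch with Python's built-in sqlite3 module. SQLite's aggregate is called group_concat and takes the separator as a second argument, so this is a stand-in for LISTAGG or string_agg, not Vertica syntax. The table name watched is made up. SQLite does not promise any order inside the concatenated string, so the sketch splits it back into a set before comparing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE watched (name TEXT, movies_watched TEXT)")
conn.executemany(
    "INSERT INTO watched VALUES (?, ?)",
    [("A", "Terminator"), ("B", "Alien"), ("A", "Batman"),
     ("B", "Rambo"), ("B", "Die Hard")],
)

# One output row per name; the aggregate glues that group's values together.
rows = conn.execute(
    """
    SELECT name, group_concat(movies_watched, ';') AS movies_watched
    FROM watched
    GROUP BY name
    ORDER BY name
    """
).fetchall()

# Order inside the joined string is not guaranteed, so compare as sets.
movies_by_name = {name: set(joined.split(";")) for name, joined in rows}
print(rows)
```

So there is no explicit loop: GROUP BY decides which rows belong together, and the aggregate function does the concatenation per group.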

SQL A Union B result different from A+B?

I was using Union to merge two SQL queries, but the results are different from a combination of two different queries.
Below is my SQL code:
(SELECT CONCAT(Name, '(', LEFT(Occupation, 1), ')')
FROM OCCUPATIONS
ORDER BY Name ASC)
UNION
(SELECT CONCAT('There are a total of ', COUNT(Occupation), ' ',
LOWER(Occupation), 's.')
FROM OCCUPATIONS
GROUP BY Occupation
ORDER BY Occupation ASC)
If I only run the first half of the query, I get the following result:
Aamina(D)
Ashley(P)
Belvet(P)
Britney(P)
Christeen(S)
Eve(A)
Jane(S)
Jennifer(A)
Jenny(S)
Julia(D)
Ketty(A)
Kristeen(S)
Maria(P)
Meera(P)
Naomi(P)
Priya(D)
Priyanka(P)
Samantha(A)
If I only run the second half of the query, I get the following result:
There are a total of 4 actors.
There are a total of 3 doctors.
There are a total of 7 professors.
There are a total of 4 singers.
Both results above are in the expected order. However, if I run the whole query, I get the following result:
Ashley(P)
Samantha(A)
Julia(D)
Britney(P)
Maria(P)
Meera(P)
Priya(D)
Priyanka(P)
Jennifer(A)
Ketty(A)
Belvet(P)
Naomi(P)
Jane(S)
Jenny(S)
Kristeen(S)
Christeen(S)
Eve(A)
Aamina(D)
There are a total of 4 actors.
There are a total of 3 doctors.
There are a total of 7 professors.
There are a total of 4 singers.
As you may notice, the order of the first half is screwed. Does anyone know why?
How is Union different from writing two separate SQL queries? Thanks!
The order is not "screwed". You have no order by for the overall query, just for the subqueries, and the ordering of the subqueries is not preserved. You are also using UNION, which removes duplicates, and the database is free to reorder the rows while doing so.
The safe way to execute this query is:
select str
from ((select concat(Name, '(', LEFT(Occupation, 1), ')') as str, 1 as which
from OCCUPATIONS
) union all
(select concat('There are a total of ', COUNT(Occupation), ' ',
lower(Occupation), 's.') as str, 2 as which
from OCCUPATIONS
group by occupation
)
) o
order by which, str
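Here is a runnable sketch of that pattern using Python's built-in sqlite3 module, with a handful of rows from the question. Two dialect substitutions are assumptions of the sketch: SQLite uses || instead of CONCAT and substr(x, 1, 1) instead of LEFT(x, 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occupations (name TEXT, occupation TEXT)")
conn.executemany(
    "INSERT INTO occupations VALUES (?, ?)",
    [("Ashley", "Professor"), ("Samantha", "Actor"),
     ("Julia", "Doctor"), ("Britney", "Professor")],
)

# The extra "which" column pins each half in place; the single outer
# ORDER BY is the only thing that determines the final row order.
rows = conn.execute(
    """
    SELECT str
    FROM (
        SELECT name || '(' || substr(occupation, 1, 1) || ')' AS str,
               1 AS which
        FROM occupations
        UNION ALL
        SELECT 'There are a total of ' || COUNT(occupation) || ' '
               || lower(occupation) || 's.' AS str,
               2 AS which
        FROM occupations
        GROUP BY occupation
    ) AS o
    ORDER BY which, str
    """
).fetchall()

lines = [row[0] for row in rows]
print("\n".join(lines))
```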

Oracle/SQL - Multiple Records Into One [string aggregation]

I realize this is a ridiculous request, but what I'm trying to do is pull multiple records back into a single column along with some literal text.
So given a table like this
REGION CITY SID
-------------------
1 Chicago 1234
1 Palatine 567
1 Algonquin 234
1 Wauconda 987
I would like to get back a single record (other columns like region are fine) with a single column like this:
<option value="1234">Chicago</option><option value="567">Palatine</option><option value="234">Algonquin</option><option value="987">Wauconda</option>
Any thoughts on how to do this? I'm running Oracle 9i and cannot do this in PL/SQL
Okay the table format has changed a bit, but the idea is the same
COUNTRY STORECODE STORE_NAME
------------------------------
USA 1234 Chicago
USA 567 Palatine
CAN 987 Toronto
So I found this code going through the links listed
SELECT COUNTRY,
LTRIM(MAX(SYS_CONNECT_BY_PATH(STORECODE,','))
KEEP (DENSE_RANK LAST ORDER BY curr),',') AS COUNTRY_HTML
FROM (SELECT COUNTRY,
STORECODE,
ROW_NUMBER() OVER (PARTITION BY COUNTRY ORDER BY STORECODE) AS curr,
ROW_NUMBER() OVER (PARTITION BY COUNTRY ORDER BY STORECODE) -1 AS prev
FROM tablename)
GROUP BY COUNTRY
CONNECT BY prev = PRIOR curr AND COUNTRY = PRIOR COUNTRY
START WITH curr = 1;
And when I run it I see this output
COUNTRY COUNTRY_HTML
--------------------
USA 1234,567
CAN 987
My thought was simply to have the inner select pull from another select where I do my concat of the STORECODE and STORE_NAME along with the html required like this...
SELECT COUNTRY,
LTRIM(MAX(SYS_CONNECT_BY_PATH(RECORD_HTML,','))
KEEP (DENSE_RANK LAST ORDER BY curr),',') AS COUNTRY_HTML
FROM (SELECT COUNTRY,
RECORD_HTML,
ROW_NUMBER() OVER (PARTITION BY COUNTRY ORDER BY RECORD_HTML) AS curr,
ROW_NUMBER() OVER (PARTITION BY COUNTRY ORDER BY RECORD_HTML) -1 AS prev
FROM (SELECT COUNTRY, '<option value="' || STORECODE || '">' || STORE_NAME || '</option>' AS RECORD_HTML FROM tablename))
GROUP BY COUNTRY
CONNECT BY prev = PRIOR curr AND COUNTRY = PRIOR COUNTRY
START WITH curr = 1;
While our front-end environment does accept the query, when I try to review the results I get an error: the resource is invalid. You may need to re-create or fix the query before viewing.
I know that error probably isn't helpful, but any ideas why my version isn't working?
Thanks!
It's disgusting but you could do something like this:
select replace(blah2,',','')
from ( select wm_concat(blah) as blah2
from ( select '<option value="' || sid || '">' || city || '</option>' as blah
from my_table
)
)
Have you played around with DBMS_XMLGEN?
You can create a user-defined aggregate function in Oracle; see the documentation.

How to code Microsoft Excel "Shift Cells Up" feature in SQL

Take a simple table like below:
Column Headings: || Agent's Name || Time Logged In || Center ||
Row 1: Andrew || 12:30 PM || Home Base
Row 2: Jeff || 7:00 AM || Virtual Base
Row 3: Ryan || 6:30 PM || Test Base
Now let's say that a single cell is deleted so the table now looks like this:
Column Headings: || Agent's Name || Time Logged In || Center ||
Row 1: Andrew || 12:30 PM ||
Row 2: Jeff || 7:00 AM || Virtual Base
Row 3: Ryan || 6:30 PM || Test Base
Notice that "Home Base" is missing. Now in Excel you can delete the cell and shift the rest up so the finished product looks like below:
Column Headings: || Agent's Name || Time Logged In || Center ||
Row 1: Andrew || 12:30 PM || Virtual Base
Row 2: Jeff || 7:00 AM || Test Base
Row 3: Ryan || 6:30 PM ||
And you can see we are left with a blank cell in the last row.
How do I code this procedure of shifting the cells up in SQL?
I've been struggling on this problem for weeks! Thank you!
You can't - SQL tables aren't Excel sheets - they just don't have that kind of a structure. No matter how hard you try - you won't be able to do something like that. It's just fundamentally different.
SQL Server tables have rows and columns - sure - but they have no implied order or anything. You cannot "shift" a row up - there's no "up" per se - it all depends on your ordering.
It's worse than comparing apples to oranges - it's like comparing apples to granite blocks - it's just not the same - don't waste your time trying to make it the same.
One of many options is to use an outer apply to fetch the Center from the next row:
-- table variable holding the sample rows
declare @t table (name varchar(50), login time, center varchar(50))
insert into @t (name, login, center)
select 'Andrew', '12:30 PM', 'Home Base'
union all select 'Jeff', '7:00 AM', 'Virtual Base'
union all select 'Ryan', '6:30 PM', 'Test Base'
-- give each row the center of the next row (by name); the last row gets NULL
update t1
set t1.center = t3.center
from @t t1
outer apply (
select top 1 t2.center
from @t t2
where t2.name > t1.name
order by t2.name
) t3
select * from @t
You do have to specify an ordering (the example orders on name.)
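The "take the value from the next row" logic can be tried out with Python's built-in sqlite3 module. This is a sketch under a couple of assumptions: SQLite has no OUTER APPLY, so a correlated subquery with LIMIT 1 plays that role, and it only selects the shifted value instead of updating the table. The table name agents is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (name TEXT, login TEXT, center TEXT)")
conn.executemany(
    "INSERT INTO agents VALUES (?, ?, ?)",
    [("Andrew", "12:30 PM", "Home Base"),
     ("Jeff", "7:00 AM", "Virtual Base"),
     ("Ryan", "6:30 PM", "Test Base")],
)

# For each row, look up the center of the next row in name order.
# Without an explicit ordering there is no "next row" to take it from.
shifted = conn.execute(
    """
    SELECT t1.name,
           (SELECT t2.center
            FROM agents t2
            WHERE t2.name > t1.name
            ORDER BY t2.name
            LIMIT 1) AS shifted_center
    FROM agents t1
    ORDER BY t1.name
    """
).fetchall()

print(shifted)
```

The last row has no successor, so its shifted value comes back as NULL (None in Python), which matches the blank cell in the Excel picture.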
If you have two sets and you simply want to assign one of each in the second set to items from the first set, "using them up", the simplest thing is to use ROW_NUMBER() on both sets and do a LEFT JOIN on the ROW_NUMBER() column from the first to the second set, assigning each row the next available item:
WITH numbered1 AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY set1.sortorder /* choice of order by is obviously important */) AS ROW_NUM
FROM set1
)
,numbered2 AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY set2.sortorder /* choice of order by is obviously important */) AS ROW_NUM
FROM set2
)
SELECT *
FROM numbered1
LEFT JOIN numbered2
ON numbered1.keys = numbered2.keys -- if there are any keys
AND numbered1.ROW_NUM = numbered2.ROW_NUM
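Stripped of SQL, the pairing idea is just "sort both sets, then match by position". The Python sketch below shows that logic on plain lists; pair_by_position is a made-up helper name, and it ignores the optional keys condition from the join.

```python
from itertools import zip_longest


def pair_by_position(first, second, key=None):
    """Pair the n-th item of `first` with the n-th item of `second` after sorting.

    Mirrors the ROW_NUMBER() + LEFT JOIN approach: every item of `first` is
    kept, missing partners become None, surplus items of `second` are dropped.
    """
    ordered_first = sorted(first, key=key)
    ordered_second = sorted(second, key=key)
    paired = zip_longest(ordered_first, ordered_second, fillvalue=None)
    # Cut off pairs that exist only because `second` was longer than `first`.
    return list(paired)[: len(ordered_first)]


agents = ["Ryan", "Andrew", "Jeff"]
centers = ["Virtual Base", "Test Base"]
assignments = pair_by_position(agents, centers)
print(assignments)
```

As in the SQL version, the result depends entirely on the sort order you choose for each set.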