SQL Merge two rows with same ID but different column values (Oracle) - sql

I am trying to merge different rows into one when they have the same id but different column values.
For example :
(table1)
id colour
1 red
1 blue
2 green
2 red
I would like this to be combined so that the result is:
id colour1 colour2
1 red blue
2 green red
Or
id colour
1 red, blue
2 green, red
Or any other variation of the above so that the rows are joined together some way.
Any help would be appreciated! Thanks in advance.

Please read my Comment first - you shouldn't even think about doing this unless it is ONLY for reporting purposes, and you want to see how this can be done in plain SQL (as opposed to the correct solution, which is to use your reporting tool for this job).
The second format is easiest, especially if you don't care about the order in which the colors appear:
select id, listagg(colour, ', ') within group (order by null)
from table1
group by id
order by null means the order of the values within each list is arbitrary (not guaranteed, rather than truly random). If you want a specific order, use that in order by with listagg(). For example, to order the colours alphabetically, you could say within group (order by colour).
For the first format, you need to have an a priori limit on the number of columns, and how you do it depends on the version of Oracle you are using (which you should always include in every question you post here and on other discussion boards). The concept is called "pivoting"; since version 11, Oracle has an explicit PIVOT operator that you can use.
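If you want to try the comma-separated format without an Oracle instance at hand, here is a minimal sketch using Python's standard sqlite3 module. It is a stand-in under stated assumptions, not Oracle: SQLite has no LISTAGG, so group_concat plays that role here, and it makes no promise about the order of the values inside each list.

```python
import sqlite3

# In-memory stand-in for table1 from the question (SQLite, not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, colour TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "red"), (1, "blue"), (2, "green"), (2, "red")],
)

# group_concat is SQLite's rough counterpart of Oracle's LISTAGG;
# the order of colours inside each list is not guaranteed here.
merged = conn.execute(
    "SELECT id, group_concat(colour, ', ') AS colours "
    "FROM table1 GROUP BY id ORDER BY id"
).fetchall()

for row_id, colours in merged:
    print(row_id, colours)
```

Each id comes back as a single row with its colours joined by ", ".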

The following would solve your problem in the first of the two ways that you proposed. Listagg is what you would use to solve it the second of the two ways (as pointed out in the other answer):
select id,
min(decode(rn,1,colour,null)) as colour1,
min(decode(rn,2,colour,null)) as colour2,
min(decode(rn,3,colour,null)) as colour3
from (
select id,
colour,
row_number() over(partition by id order by colour) as rn
from table1
)
group by id;
In this approach, you need to add additional decode expressions up to the maximum number of possible colours for a given ID (this solution is not dynamic).
Additionally, this puts the colours into colour1, colour2, etc. based on the alphabetical order of the colour names. If you prefer a random order, or some other order, you need to change the order by inside row_number().
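The same numbering-then-pivoting idea can be tried locally with Python's sqlite3 module. This is only a sketch under assumptions: it runs on SQLite (3.25 or later, for window functions) rather than Oracle, and it uses CASE expressions where the query above uses decode.

```python
import sqlite3

# In-memory stand-in for table1 from the question (SQLite, not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, colour TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "red"), (1, "blue"), (2, "green"), (2, "red")],
)

# Number the colours within each id, then pick each numbered row into its
# own column. MIN ignores the NULLs produced by the non-matching CASE branches.
pivot_sql = """
SELECT id,
       MIN(CASE WHEN rn = 1 THEN colour END) AS colour1,
       MIN(CASE WHEN rn = 2 THEN colour END) AS colour2
FROM (SELECT id,
             colour,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY colour) AS rn
      FROM table1)
GROUP BY id
ORDER BY id
"""
pivoted = conn.execute(pivot_sql).fetchall()
print(pivoted)
```

Because the numbering is alphabetical, id 1 comes out as blue, red rather than in the order the rows were inserted.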

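Try this, it works for me (note that GROUP_CONCAT is a MySQL function; in Oracle the counterpart is LISTAGG):
Here student is the name of the table and studentId is a column. GROUP_CONCAT merges all subjects for each student; the GROUP BY is needed to get one row per student.
SELECT studentId, GROUP_CONCAT(subjects) FROM student GROUP BY studentId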

Related

Is it possible to group by rows over a column that is not part of the GROUP BY, that will become a FK to the Group?

I'm attempting to figure out how to group by similar rows, leaving one column out of the grouping, that will become a FK to the grouping. I know I can do it using a cursor and temp tables, but I would like to figure out how to do it in a set based way.
For example, let's assume I have the following table:
Let's assume I want each unique group of Letters and Numbers by Color to be one group, and then I want to build a FK to the color.
So, for example, in the above example, both Blue and Black have the same row values for A (A 1 and A 2). So this would be one group. Red has a different group, as it has an extra number (A 1, A 2, A 3) so it would be a separate group.
The end result would look like this:
Is this possible? Or do I have to use a looping mechanism?
The simplest way is to use string_agg() to bring the letter/numbers together.
select c.*,
dense_rank() over (order by ln) as groupid
from (select color,
string_agg(concat(letter, ':', number), ',') within group (order by letter, number) as ln
from t
group by color
) c;
You can then join back to the original table to assign groupid.

ORDER BY an aggregated column in Report Builder 3.0

In Report Builder 3.0, I retrieved some items and counted them using a Count aggregate. Now I want to order them from highest to lowest. How do I use ORDER BY on the aggregated column? The picture below shows the column that I want to order by; it is ticked.
Pic
The code is very simple, as shown below:
SELECT DISTINCT act_id, NameOfAct
FROM Acts
Your picture indicates you also want a Total row at the bottom:
SELECT
COALESCE(NameOfAct,'Total') NameOfAct,
COUNT(DISTINCT act_id) c
FROM Acts
GROUP BY ROLLUP(NameOfAct)
ORDER BY
CASE WHEN NameOfAct is null THEN 1 ELSE 0 END,
c DESC;
Result of example data:
NameOfAct count
-------------- -------
Act_B 3
Act_A 2
Act_Z 1
Total 6
Try it with example rows at: http://sqlfiddle.com/#!18/dbd6c/2
I looked at the Pic. So you might have duplicate acts with the same name. And you want to know the number of acts that have the same unique name.
You might want to group the results by name:
GROUP BY NameOfAct
And include the act names and their counts in the query results:
SELECT NameOfAct, COUNT(*) AS ActCount
(Since the act_id column is not included in the groups, you need to omit it in the SELECT. The DISTINCT is also not necessary anymore, since all groups are unique already.)
Finally, you can sort the data (probably descending to get the acts with the largest count on top):
ORDER BY ActCount DESC
Your complete query would become something like this:
SELECT NameOfAct, COUNT(*) AS ActCount
FROM Acts
GROUP BY NameOfAct
ORDER BY ActCount DESC
Edit:
By the way, you use field "act_id" in your SELECT clause. That's somewhat confusing. If you want to know counts, you want to look at either the complete table data or group the table data into smaller groups (with the GROUP BY clause). Then you can use aggregate functions to get more information about those groups (or the whole table), like counts, average values, minima, maxima...
Single record information, like an act's ID in your case, is typically not important if you want to use statistic/aggregate methods on grouped data. Suppose your query returns an act name which is used 10 times. Then you have 10 records in your table, each with a unique act_id, but with the same name.
If you need just one act_id that represents each group / act name (and assuming act_id is an autonumbering field), you might include the latest / largest act_id value in the query using the MAX aggregate function:
SELECT NameOfAct, COUNT(*) AS ActCount, MAX(act_id) AS LatestActId
(The rest of the query remains the same.)
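To see the grouped and sorted counts in action, here is a small sketch using Python's sqlite3 module. The rows are made up for illustration (chosen to match the counts in the example result above), and SQLite stands in for the report's actual data source.

```python
import sqlite3

# Hypothetical sample rows: six acts sharing three names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Acts (act_id INTEGER PRIMARY KEY, NameOfAct TEXT)")
conn.executemany(
    "INSERT INTO Acts VALUES (?, ?)",
    [(1, "Act_A"), (2, "Act_B"), (3, "Act_B"),
     (4, "Act_A"), (5, "Act_Z"), (6, "Act_B")],
)

# One row per act name, with its count and its largest act_id,
# sorted so the most frequent name comes first.
counts = conn.execute(
    """
    SELECT NameOfAct, COUNT(*) AS ActCount, MAX(act_id) AS LatestActId
    FROM Acts
    GROUP BY NameOfAct
    ORDER BY ActCount DESC
    """
).fetchall()

for name, act_count, latest_id in counts:
    print(name, act_count, latest_id)
```

If two names could have the same count, adding NameOfAct as a second ORDER BY key would make the output order deterministic.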

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
Find the number of a row with a given id.
Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using some SQL:2003 features (such as row_number, supported in Firebird 3.0)? I am by no means an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?
Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
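To find the "row" number of a row with a given id, number all rows in a subquery and filter afterwards. (A where clause at the same level as row_number() is applied before the window function, so it would always return 1.)
select id, rn
from (select id, row_number() over (order by name, id) as rn
      from t
     ) numbered
where id = <id>
This assumes the ids are unique.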
To solve the second problem, join the numbered rows to the numbered row of the target id (again filtering on id only after the numbering has been computed):
select n.*
from (select id, name, row_number() over (order by name, id) as rn
      from t
     ) n join
     (select id, row_number() over (order by name, id) as rn
      from t
     ) tid
     on n.rn between tid.rn - 5 and tid.rn + 5
where tid.id = <id>
order by n.rn
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can serve as your counter, simplifying your queries.
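For a quick local check of the row-number and neighbours approach, here is a sketch using Python's sqlite3 module. It assumes SQLite 3.25 or later (for window functions) rather than Firebird, and the people rows are invented: 26 single-letter names inserted in reverse alphabetical order so that id order and name order differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")

# Hypothetical data: id 1 -> 'Z', id 2 -> 'Y', ..., id 26 -> 'A'.
names = [chr(ord("Z") - i) for i in range(26)]
conn.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    list(enumerate(names, start=1)),
)

# Every row numbered in the display order. The filter on id must be
# applied outside this subquery, otherwise only one row gets numbered.
numbered = (
    "SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS rn FROM people"
)

target_id = 10  # the row named 'Q'

# Problem 1: position of the target row in the sorted result.
(position,) = conn.execute(
    f"SELECT rn FROM ({numbered}) WHERE id = ?", (target_id,)
).fetchone()

# Problem 2: the target row plus the 5 rows before and after it.
neighbours = conn.execute(
    f"""
    SELECT n.id, n.name, n.rn
    FROM ({numbered}) AS n
    JOIN ({numbered}) AS tid
      ON n.rn BETWEEN tid.rn - 5 AND tid.rn + 5
    WHERE tid.id = ?
    ORDER BY n.rn
    """,
    (target_id,),
).fetchall()

print(position)
print([name for _, name, _ in neighbours])
```

Note that this still numbers the whole table on every call, so it shows the logic rather than a fix for the performance concern raised in the question.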

Controlling result of oracle query

I have a schema like this
create table sample(id number ,name varchar2(30),mark number);
Now I have to return the names of the top three marks. How can I write an SQL query for this?
If I use max(mark), it returns only the maximum, and
select name from sample
returns all the names! I tried many ways, but I was unable to limit the result to 3 rows.
Please suggest a way to solve my problem.
How do you want to handle ties? If Mary gets a mark of 100, Tom gets a mark of 95, and John and Dave both get a mark of 90, what results do you want, for example? Do you want both John and Dave to be returned since they both tied for third? Or do you want to pick one of the two so that the result always has exactly three rows? What happens if Beth also tied for second with a score of 95? Do you still consider John and Dave tied for third place or do you consider them tied for fourth place?
You can use analytic functions to get the top N results though which analytic function you pick depends on how you want to resolve ties.
SELECT id,
name,
mark
FROM (SELECT id,
name,
mark,
rank() over (order by mark desc) rnk
FROM sample)
WHERE rnk <= 3
will return the top three rows using the RANK analytic function to rank them by MARK. RANK returns the same rank for people that are tied and uses the standard sports approach to determining your rank so that if two people tie for second, the next competitor is in fourth place, not third. DENSE_RANK ensures that numeric ranks are not skipped so that if two people tie for second, the next row is third. ROW_NUMBER assigns each row a different rank by arbitrarily breaking ties.
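To make the difference between the three functions concrete, here is a sketch that runs them side by side on the marks from the tie example above (Mary 100, Tom and Beth 95, John and Dave 90). It uses Python's sqlite3 module with SQLite 3.25 or later as a stand-in for Oracle, and it adds name as a tie-breaker for ROW_NUMBER so the output is repeatable; that tie-breaker is an assumption of this sketch, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER, name TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, "Mary", 100), (2, "Tom", 95), (3, "Beth", 95),
     (4, "John", 90), (5, "Dave", 90)],
)

# Compute all three rankings for every row in one pass.
ranked = conn.execute(
    """
    SELECT name, mark,
           RANK()       OVER (ORDER BY mark DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY mark DESC)       AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY mark DESC, name) AS row_num
    FROM sample
    ORDER BY mark DESC, name
    """
).fetchall()

# Apply the "<= 3" filter to each ranking in turn.
top_by_rank = [name for name, _, rnk, _, _ in ranked if rnk <= 3]
top_by_dense_rank = [name for name, _, _, dense, _ in ranked if dense <= 3]
top_by_row_number = [name for name, _, _, _, num in ranked if num <= 3]

print(top_by_rank)
print(top_by_dense_rank)
print(top_by_row_number)
```

With this data, RANK puts John and Dave in fourth place, so they drop out of the top three, while DENSE_RANK puts them third and returns all five rows.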
If you really want to use ROWNUM rather than analytic functions, you can also do
SELECT id,
name,
mark
FROM (SELECT id,
name,
mark
FROM sample
ORDER BY mark DESC)
WHERE rownum <= 3
You cannot, however, have the ROWNUM predicate at the same level as the ORDER BY clause since the predicate is applied before the ordering.
SELECT t2.name FROM
(
  SELECT t.*, row_number() over (order by mark desc) rn
  FROM sample t
) t2
WHERE t2.rn <= 3

How to randomize order of data in 3 columns

I have 3 columns of data in SQL Server 2005 :
LASTNAME
FIRSTNAME
CITY
I want to randomly re-order these 3 columns (and munge the data) so that the data is no longer meaningful. Is there an easy way to do this? I don't want to change any data, I just want to re-order the index randomly.
When you say "re-order" these columns, do you mean that you want some of the last names to end up in the first name column? Or do you mean that you want some of the last names to get associated with a different first name and city?
I suspect you mean the latter, in which case you might find a programmatic solution easier (as opposed to a straight SQL solution). Sticking with SQL, you can do something like:
UPDATE the_table
SET lastname = (SELECT lastname FROM the_table ORDER BY RAND())
Depending on what DBMS you're using, this may work for only one line, may make all the last names the same, or may require some variation of syntax to work at all, but the basic approach is about right. Certainly some trials on a copy of the table are warranted before trying it on the real thing.
Of course, to get the first names and cities to also be randomly reordered, you could apply a similar query to either of those columns. (Applying it to all three doesn't make much sense, but wouldn't hurt either.)
Since you don't want to change your original data, you could do this in a temporary table populated with all rows.
Finally, if you just need a single random value from each column, you could do it in place without making a copy of the data, with three separate queries: one to pick a random first name, one a random last name, and the last a random city.
I suggest using newid with checksum for doing randomization
SELECT LASTNAME, FIRSTNAME, CITY FROM the_table ORDER BY CHECKSUM(NEWID())
In SQL Server 2005+ you could prepare a ranked rowset containing the three target columns and three additional computed columns filled with random rankings (one for each of the three target columns). Then the ranked rowset would be joined with itself three times using the ranking columns, and finally each of the three target columns would be pulled from their own instance of the ranked rowset. Here's an illustration:
WITH sampledata (FirstName, LastName, CityName) AS (
SELECT 'John', 'Doe', 'Chicago' UNION ALL
SELECT 'James', 'Foe', 'Austin' UNION ALL
SELECT 'Django', 'Fan', 'Portland'
),
ranked AS (
SELECT
*,
FirstNameRank = ROW_NUMBER() OVER (ORDER BY NEWID()),
LastNameRank = ROW_NUMBER() OVER (ORDER BY NEWID()),
CityNameRank = ROW_NUMBER() OVER (ORDER BY NEWID())
FROM sampledata
)
SELECT
fnr.FirstName,
lnr.LastName,
cnr.CityName
FROM ranked fnr
INNER JOIN ranked lnr ON fnr.FirstNameRank = lnr.LastNameRank
INNER JOIN ranked cnr ON fnr.FirstNameRank = cnr.CityNameRank
This is the result:
FirstName LastName CityName
--------- -------- --------
James Fan Chicago
John Doe Portland
Django Foe Austin
select *, rand() from the_table order by rand();
I understand some versions of SQL have a rand() that doesn't change for each line. Check for yours. Works on MySQL.