SQL - Ordering second column based on the first column - sql

I am trying to retrieve data from a table, but I need it to be ordered in a very specific way and I'm not sure if it's possible using Oracle SQL alone.
What I need to do is retrieve all of the rows, ordered so that each row where column 3 is null (shown as a blank in the tables below) comes first, followed immediately by the rows whose column 3 value matches that row's column 1 value.
What I have:
+------+-------+------+
| Col1 | Col2 | Col3 |
+------+-------+------+
| 1 | text | |
| 2 | text | 1 |
| 3 | text | 1 |
| 8 | text | 10 |
| 9 | text | 10 |
| 10 | text | |
+------+-------+------+
What I would like as a result:
+------+-------+------+
| Col1 | Col2 | Col3 |
+------+-------+------+
| 1 | text | |
| 2 | text | 1 |
| 3 | text | 1 |
| 10 | text | |
| 8 | text | 10 |
| 9 | text | 10 |
+------+-------+------+
What I have tried:
First thing I tried was using:
ORDER BY coalesce(Col3, Col1)
and it got me close to the result, but the Col1 value 10 needs to be shown before the Col3 value 10.
+------+-------+------+
| Col1 | Col2 | Col3 |
+------+-------+------+
| 1 | text | |
| 2 | text | 1 |
| 3 | text | 1 |
| 8 | text | 10 |
| 9 | text | 10 |
| 10 | text | |
+------+-------+------+
I've also tried creating a new column, Col4, that is true if Col3 is null and false otherwise, but this was essentially the same thing as the coalesce above.
I also tried just running some basic order by's but had no success in achieving this.

In Oracle, you would just use nulls first:
order by coalesce(col3, col1), col3 nulls first, col1
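To check the ordering logic without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. SQLite is only a stand-in for Oracle here, the table name t is an assumption, and `col3 IS NOT NULL` is used as a portable substitute for `col3 nulls first` (it evaluates to 0 for the rows with a null col3 and 1 for the others).

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "text", None), (2, "text", 1), (3, "text", 1),
     (8, "text", 10), (9, "text", 10), (10, "text", None)],
)

# 1st key: groups each "parent" row with the rows that point to it.
# 2nd key: puts the parent (col3 IS NULL -> 0) ahead of its children (-> 1).
# 3rd key: orders the children among themselves.
rows = conn.execute(
    """
    SELECT col1, col2, col3
    FROM t
    ORDER BY COALESCE(col3, col1), col3 IS NOT NULL, col1
    """
).fetchall()

ordered_ids = [r[0] for r in rows]
print(ordered_ids)
```

The second sort key is what the plain `coalesce(Col3, Col1)` attempt was missing: it breaks the tie between row 10 and the rows whose Col3 is 10.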

Your table looks very much like hierarchical data, where in some sense col1 is a unique row identifier, and col3 points to a row's parent row.
If so, it may be better to use a hierarchical query (connect by) for this. The ordering is hierarchical, and siblings (descendants from the same parent) are ordered according to the order siblings by clause.
Like this:
with
sample_table(col1, col2, col3) as (
select 1, 'text', null from dual union all
select 2, 'text', 1 from dual union all
select 3, 'text', 1 from dual union all
select 8, 'text', 10 from dual union all
select 9, 'text', 10 from dual union all
select 10, 'text', null from dual
)
select *
from sample_table
start with col3 is null
connect by col3 = prior col1
order siblings by col1
;
COL1 COL2 COL3
---------- ---- ----------
1 text
2 text 1
3 text 1
10 text
8 text 10
9 text 10
The with clause is not part of the solution - I added it there so I can test the query. (Remember this "with clause" way to create sample tables for testing - you can include them yourself, instead of the formatted table in your original question, so that people can easily test their answers on your sample data.)
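For readers without Oracle's connect by, the same hierarchical ordering can be sketched with a recursive common table expression. The example below is an illustration in Python's built-in sqlite3 module, not the Oracle query itself; the `path` column and the zero-padded sort key are assumptions introduced for the sketch, and the padding assumes non-negative ids of at most ten digits.

```python
import sqlite3

# SQLite stands in for Oracle here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_table (col1 INTEGER, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?)",
    [(1, "text", None), (2, "text", 1), (3, "text", 1),
     (8, "text", 10), (9, "text", 10), (10, "text", None)],
)

query = """
WITH RECURSIVE tree(col1, col2, col3, path) AS (
    -- Roots: rows without a parent (the START WITH part).
    SELECT col1, col2, col3, printf('%010d', col1)
    FROM sample_table
    WHERE col3 IS NULL
    UNION ALL
    -- Children: col3 points at the parent's col1 (the CONNECT BY part).
    SELECT c.col1, c.col2, c.col3, p.path || '/' || printf('%010d', c.col1)
    FROM sample_table AS c
    JOIN tree AS p ON c.col3 = p.col1
)
SELECT col1, col2, col3
FROM tree
ORDER BY path  -- parent first, then its descendants, siblings by col1
"""
hierarchy = conn.execute(query).fetchall()
for row in hierarchy:
    print(row)
```

Sorting on the accumulated path plays the role of order siblings by col1: a parent's path is a prefix of its children's paths, so it sorts immediately before them.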

Related

Translate table values to text following a fixed pattern

We use software to store combinations of financial elements. Those elements are allowed in certain combinations. Exceptions of these combinations are SQL-like statements in the front-end, and are saved as numerical values in a database table like the following example:
+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 |
+------+------+------+------+------+
| 1 | 2 | 4 | 5 | 1 |
+------+------+------+------+------+
| -1 | 2 | 6 | 4 | 5 |
+------+------+------+------+------+
| 1 | 2 | 5 | 7 | 1 |
+------+------+------+------+------+
I would like to translate those numerical values back to a SQL-statement like the following example:
+------+-----------+------+-----------+------+-----------+------+-----------+------+-----------+
| Col1 | Col1Trans | Col2 | Col2Trans | Col3 | Col3Trans | Col4 | Col4Trans | Col5 | Col5Trans |
+------+-----------+------+-----------+------+-----------+------+-----------+------+-----------+
| 1 | ( | 2 | SELECT | 4 | CODE | 5 | LIKE | 1 | * |
+------+-----------+------+-----------+------+-----------+------+-----------+------+-----------+
| -1 | | 2 | SELECT | 6 | NUMBER | 4 | = | 5 | AND |
+------+-----------+------+-----------+------+-----------+------+-----------+------+-----------+
| 1 | ( | 2 | SELECT | 5 | TOOL | 7 | <> | 1 | * |
+------+-----------+------+-----------+------+-----------+------+-----------+------+-----------+
The numerical values differ in each column so I can only imagine the use of a lot of case...when statements which I doubt will be efficient. I don't want to create tables to hold the translation values. Are there ways to do this with arrays?
Are there any code samples to easily loop through table/columns and translate the contents of it?
You can use the code below and add more CASE branches as required.
SELECT Col1
,CASE
WHEN Col1 = 1 THEN '('
ELSE '' END AS Col1Trans
,Col2
,CASE
WHEN Col2 = 2 THEN 'SELECT'
END AS Col2Trans
,Col3
,CASE
WHEN Col3 = 4 THEN 'CODE'
WHEN Col3 = 6 THEN 'NUMBER'
WHEN Col3 = 5 THEN 'TOOL'
END AS Col3Trans
,Col4
,CASE
WHEN Col4 = 5 THEN 'LIKE'
WHEN Col4 = 4 THEN '='
WHEN Col4 = 7 THEN '<>'
END AS Col4Trans
,Col5
,CASE
WHEN Col5 = 1 THEN '*'
WHEN Col5 = 5 THEN 'AND'
END AS Col5Trans
FROM your_table
The best way to avoid so many CASE WHEN or DECODE expressions is to define the translations in a WITH clause, as follows:
With col1trans (value, translation) as
(Select 1, '(' from dual union all
Select -1, null from dual),
Col2trans (value, translation) as
(Select 2, 'SELECT' from dual)
..
... till col5trans
Select m.col1, t1.translation as col1trans,
.... till m.col5, t5.translation
From your_table m join col1trans t1 on m.col1 = t1.value
join col2trans t2 on m.col2 = t2.value
... till col5trans
Cheers!!
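As a runnable illustration of the inline-lookup idea, here is a minimal sketch in Python's built-in sqlite3 module. It is a variation, not the exact query above: instead of five separate CTEs it uses a single hypothetical `trans(col, val, translation)` lookup keyed by column name, and it uses LEFT JOIN so that values without a translation (such as -1 in Col1) still return a row. The table name your_table and the mappings are taken from the example in the question.

```python
import sqlite3

# SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (col1, col2, col3, col4, col5)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?, ?)",
    [(1, 2, 4, 5, 1), (-1, 2, 6, 4, 5), (1, 2, 5, 7, 1)],
)

query = """
WITH trans(col, val, translation) AS (
    -- Inline lookup: no permanent translation table is created.
    VALUES ('col1', 1, '('),
           ('col2', 2, 'SELECT'),
           ('col3', 4, 'CODE'), ('col3', 6, 'NUMBER'), ('col3', 5, 'TOOL'),
           ('col4', 5, 'LIKE'), ('col4', 4, '='), ('col4', 7, '<>'),
           ('col5', 1, '*'), ('col5', 5, 'AND')
)
SELECT t1.translation, t2.translation, t3.translation,
       t4.translation, t5.translation
FROM your_table AS m
LEFT JOIN trans AS t1 ON t1.col = 'col1' AND t1.val = m.col1
LEFT JOIN trans AS t2 ON t2.col = 'col2' AND t2.val = m.col2
LEFT JOIN trans AS t3 ON t3.col = 'col3' AND t3.val = m.col3
LEFT JOIN trans AS t4 ON t4.col = 'col4' AND t4.val = m.col4
LEFT JOIN trans AS t5 ON t5.col = 'col5' AND t5.val = m.col5
ORDER BY m.rowid
"""
translated = conn.execute(query).fetchall()
for row in translated:
    # Skip untranslated (NULL) parts when rebuilding the statement text.
    print(" ".join(part for part in row if part))
```

Adding a new code then only means adding one more row to the lookup, rather than another CASE branch.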

Over Partition to find duplicates and remove them based on criteria SQL

I hope everyone is doing well. I have a dilemma that I cannot quite figure out.
I am trying to find a unique value for a field that is not a duplicate.
For example:
Table 1
|Col1 | Col2| Col3 |
| 123 | A | 1 |
| 123 | A | 2 |
| 12 | B | 1 |
| 12 | B | 2 |
| 12 | C | 3 |
| 12 | D | 4 |
| 1 | A | 1 |
| 2 | D | 1 |
| 3 | D | 1 |
Col 1 is the field that would have the duplicate values. Col2 would be the owner of the value in Col 1. Col 3 uses the row number() Over Partition syntax to get the numbers in ascending order.
The goal I am trying to accomplish is to remove the value in Col1 if it is not truly unique when looking at Col2.
Example:
Col1 has the value 123, Col2 has the value A. Although there are two instances of 123 being owned by A, I can determine that it is indeed unique.
Now look at the Col1 value 12, which has values B, C and D in Col2.
Value 12 is associated with three different owners, which eliminates 12 from our result list.
So in the end I would like to see a result table such as this:
|Col1 | Col2|
| 123 | A |
| 1 | A |
| 2 | D |
| 3 | D |
To summarize, I would like to first use the partition numbers to identify whether the value in Col1 is repeated. From there I want to verify that the values in Col2 are the same. If so, the Col1 and Col2 values remain as one single entry. However, if the values in Col2 do not match, all records for that Col1 value are removed.
I will provide the syntax code for my query if needed.
Update**
I failed to mention that table 1 is the result of inner joining two tables.
So Col1 comes from table a and Col2 comes from table b.
The values in table a for Colx are hard to interpret, so I had to make sense of them and assign them proper name values.
The join query I used to combine the two is:
Select a.Col1, B.Col2 FROM Table a INNER JOIN Table b on a.Colx = b.Colx
Update**
Table a:
|Col1 | Colx| Col3 |
| 123 | SMS | 1 |
| 123 | S9W | 2 |
| 12 | NAV | 1 |
| 12 | NFR | 2 |
| 12 | ABC | 3 |
| 12 | DEF | 4 |
| 1 | SMS | 1 |
| 2 | DEF | 1 |
| 3 | DES | 1 |
Table b:
|Colx | Col2|
| SMS | A |
| S9W | A |
| DEF | D |
| DES | D |
| NAV | B |
| NFR | B |
| ABC | C |
Above are sample data for both tables that get joined in order to create the first table displayed in this body.
Thank you all so much!
The NOT EXISTS operator can be used for this:
SELECT distinct Col1 , Col2
FROM table t
WHERE NOT EXISTS(
SELECT 1 FROM table t1
WHERE t.col1=t1.col1 AND t.col2 <> t1.col2
)
If I understand correctly, you want:
select col1, min(col2) as col2
from t
group by col1
having min(col2) = max(col2);
I think the third column is confusing you. It doesn't seem to play any role in the logic you want.
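To compare the two approaches on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is a stand-in for the actual database, and the table name t and the aliases a and b are assumptions for the illustration. The aggregate version filters groups with HAVING and keeps a Col1 value only when its smallest and largest owner are the same, that is, when it has a single owner.

```python
import sqlite3

# SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(123, "A"), (123, "A"), (12, "B"), (12, "B"), (12, "C"),
     (12, "D"), (1, "A"), (2, "D"), (3, "D")],
)

# Approach 1: keep a row only if no other row has the same col1
# with a different owner in col2.
not_exists_rows = conn.execute(
    """
    SELECT DISTINCT a.col1, a.col2
    FROM t AS a
    WHERE NOT EXISTS (
        SELECT 1 FROM t AS b
        WHERE b.col1 = a.col1 AND b.col2 <> a.col2
    )
    ORDER BY a.col1
    """
).fetchall()

# Approach 2: one group per col1; a single owner means MIN = MAX.
having_rows = conn.execute(
    """
    SELECT col1, MIN(col2) AS col2
    FROM t
    GROUP BY col1
    HAVING MIN(col2) = MAX(col2)
    ORDER BY col1
    """
).fetchall()

print(not_exists_rows)
print(having_rows)
```

Both queries drop every row for the value 12, because it has owners B, C and D, and collapse the two (123, A) rows into one.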

Oracle group by only ONE column

I have a table in an Oracle database, which has 40 columns.
I know that if I want to do a group by query, all the columns in select must be in group by.
I simply just want to do:
select col1, col2, col3, col4, col5 from table group by col3
If I try:
select col1, col2, col3, col4, col5 from table group by col1, col2, col3, col4, col5
It does not give the required output.
I have searched for this, but did not find any solution. All the queries that I found use some kind of Add() or count(*) function.
In Oracle is it not possible to simply group by one column ?
UPDATE:
My apologies, for not being clear enough.
My Table:
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 1 | 1 | some text 1 | 100 |
| 2 | 1 | some text 1 | 200 |
| 3 | 2 | some text 1 | 200 |
| 4 | 3 | some text 1 | 78 |
| 5 | 4 | some text 1 | 65 |
| 6 | 5 | some text 1 | 101 |
| 7 | 5 | some text 1 | 200 |
| 8 | 1 | some text 1 | 200 |
| 9 | 6 | some text 1 | 202 |
+--------+----------+-------------+-------+
and by running following query:
select col1, col2, col3 from table where col3='200' group by col1;
I would like to get the following output:
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 |
| 3 | 2 | some text 1 | 200 |
| 7 | 5 | some text 1 | 200 |
+--------+----------+-------------+-------+
Long comment here;
Yeah, you can't do that. Think about it... If you have a table like so:
Col1 Col2 Col3
A A 1
B A 2
C A 3
And you're grouping by only Col2, which will group down to a single row... what happens to Col1 and Col3? Both of those have 3 distinct row values.
How is your DBMS supposed to display those?
Col1 Col2 Col3
A? A 1?
B? 2?
C? 3?
This is why you have to group by all columns, or otherwise aggregate or concatenate them. (SUM(),MAX(), MIN(), etc..)
Show us how you want the results to look and I'm sure we can help you.
Edit - Answer:
First off, thanks for updating your question. Your query doesn't have id but your expected results do, so I will answer for each separately.
Without id
You will still need to group by all columns to achieve what you're going for. Let's walk through it.
If you run your query without any group by:
select col1, col2, col3 from table where col3='200'
You will get this back:
+----------+-------------+-------+
| col1 | col2 | col3 |
+----------+-------------+-------+
| 1 | some text 1 | 200 |
| 2 | some text 1 | 200 |
| 5 | some text 1 | 200 |
| 1 | some text 1 | 200 |
+----------+-------------+-------+
So now you want to see the col1 = 1 row only once. But to do so, you need to roll all of the columns up, so your DBMS knows what to do with each of them. If you try to group by only col1, your DBMS will throw an error because you didn't tell it what to do with the extra data in col2 and col3:
select col1, col2, col3 from table where col3='200' group by col1 --Errors
+----------+-------------+-------+
| col1 | col2 | col3 |
+----------+-------------+-------+
| 1 | some text 1 | 200 |
| 2 | some text 1 | 200 |
| 5 | some text 1 | 200 |
| ? | some text 1?| 200? |
+----------+-------------+-------+
If you group by all 3, your DBMS knows to group together the entire rows (which is what you want), and will only display duplicate rows once:
select col1, col2, col3 from table where col3='200' group by col1, col2, col3
+----------+-------------+-------+
| col1 | col2 | col3 |
+----------+-------------+-------+
| 1 | some text 1 | 200 |
| 2 | some text 1 | 200 | --Desired results
| 5 | some text 1 | 200 |
+----------+-------------+-------+
With id
If you want to see id, you will have to tell your DBMS which id to display. Even if we group by all columns, you won't get your desired results, because the id column will make each row distinct (They will no longer group together):
select id, col1, col2, col3 from table where col3='200' group by id, col1, col2, col3
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 | --id = 2
| 3 | 2 | some text 1 | 200 |
| 7 | 5 | some text 1 | 200 |
| 8 | 1 | some text 1 | 200 | --id = 8
+--------+----------+-------------+-------+
So in order to group these rows, we need to explicitly say what to do with the ids. Based on your desired results, you want to choose id = 2, which is the minimum id, so let's use MIN():
select MIN(id), col1, col2, col3 from table where col3='200' group by col1, col2, col3
--Note, MIN() is an aggregate function, so id need not be in the group by
Which returns your desired results (with id):
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 |
| 3 | 2 | some text 1 | 200 |
| 7 | 5 | some text 1 | 200 |
+--------+----------+-------------+-------+
Final thought
Here were your two trouble rows:
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 |
| 8 | 1 | some text 1 | 200 |
+--------+----------+-------------+-------+
Any time you hit these, just think about what you want each column to do, one at a time. You will need to handle all columns any time you do grouping or aggregates.
id, you only want to see id = 2, which is the MIN()
col1, you only want to see distinct values, so GROUP BY
col2, you only want to see distinct values, so GROUP BY
col3, you only want to see distinct values, so GROUP BY
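The column-by-column reasoning above can be checked with a minimal sketch in Python's built-in sqlite3 module, loaded with the table from the question. SQLite is a stand-in for Oracle here, and the table name t and the alias min_id are assumptions for the illustration.

```python
import sqlite3

# SQLite stands in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 INTEGER, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1, "some text 1", 100), (2, 1, "some text 1", 200),
     (3, 2, "some text 1", 200), (4, 3, "some text 1", 78),
     (5, 4, "some text 1", 65), (6, 5, "some text 1", 101),
     (7, 5, "some text 1", 200), (8, 1, "some text 1", 200),
     (9, 6, "some text 1", 202)],
)

# id is aggregated with MIN(); every other selected column is grouped.
grouped = conn.execute(
    """
    SELECT MIN(id) AS min_id, col1, col2, col3
    FROM t
    WHERE col3 = 200
    GROUP BY col1, col2, col3
    ORDER BY min_id
    """
).fetchall()

for row in grouped:
    print(row)
```

Rows 2 and 8 share the same col1, col2 and col3, so they fall into one group and MIN(id) keeps the id 2.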
Maybe analytic functions are what you need.
Try something like this:
select col1, col2, col3, col4, col5
, sum(col3) over (partition by col1) as col1_summary
, count(*) over () as total_count
from t1
If you search for the topic, you will find thousands of examples,
for example this one:
Introduction to Analytic Functions (Part 1)
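To show what an analytic (window) function returns compared with GROUP BY, here is a minimal sketch using Python's built-in sqlite3 module and the table from the question. It assumes the bundled SQLite is version 3.25 or later, which is when window functions were added; the table name t1 and the choice of col3 as the summed column are assumptions for the illustration.

```python
import sqlite3

# SQLite (3.25+) stands in for Oracle here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, col1 INTEGER, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?)",
    [(1, 1, "some text 1", 100), (2, 1, "some text 1", 200),
     (3, 2, "some text 1", 200), (4, 3, "some text 1", 78),
     (5, 4, "some text 1", 65), (6, 5, "some text 1", 101),
     (7, 5, "some text 1", 200), (8, 1, "some text 1", 200),
     (9, 6, "some text 1", 202)],
)

# Every input row is kept; the aggregates are attached to each row.
windowed = conn.execute(
    """
    SELECT id, col1, col3,
           SUM(col3) OVER (PARTITION BY col1) AS col1_summary,
           COUNT(*)  OVER ()                  AS total_count
    FROM t1
    ORDER BY id
    """
).fetchall()

for row in windowed:
    print(row)
```

Unlike GROUP BY, the window version does not collapse rows, so no column needs to be aggregated or grouped in order to be selected.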
Why do you want to GROUP BY , wouldn't you want to ORDER BY instead?
If you state an English language version of the problem you are trying to solve (i.e. the requirements) it would be easier to be more specific.
I guess you may need the UNPIVOT function;
otherwise, post the specific final result you want.
select col3, col_group
from table
UNPIVOT ( col_group for value in ( col1,col2,col4,col5))
SELECT * FROM table
WHERE id IN (SELECT MIN(id) FROM table WHERE col3='200' GROUP BY col1)

Fetch the column which has the Max value for a row in Hive

I have a scenario where I need to pick the greatest value in a row from three columns. There is a function called greatest, but it doesn't work in my version of Hive (0.13).
Please suggest a better way to accomplish this.
Example table:
+---------+------+------+------+
| Col1 | Col2 | Col3 | Col4 |
+---------+------+------+------+
| Group A | 1 | 2 | 3 |
+---------+------+------+------+
| Group B | 4 | 5 | 1 |
+---------+------+------+------+
| Group C | 4 | 2 | 1 |
+---------+------+------+------+
expected Result:
+---------+------------+------------+
| Col1 | output_max | max_column |
+---------+------------+------------+
| Group A | 3 | Col4 |
+---------+------------+------------+
| Group B | 5 | col3 |
+---------+------------+------------+
| Group C | 4 | col2 |
+---------+------------+------------+
select col1
,tuple.col1 as output_max
,concat('Col',tuple.col2) as max_column
from (select Col1
,sort_array(array(struct(Col2,2),struct(Col3,3),struct(Col4,4)))[2] as tuple
from t
) t
;
sort_array(Array)
Sorts the input array in ascending order according to the natural ordering of the array elements and returns it
(as of version 0.9.0).
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
hive> select col1
> ,tuple.col1 as output_max
> ,concat('Col',tuple.col2) as max_column
>
> from (select Col1
> ,sort_array(array(struct(Col2,2),struct(Col3,3),struct(Col4,4)))[2] as tuple
> from t
> ) t
> ;
OK
Group A 3 Col4
Group B 5 Col3
Group C 4 Col2
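The trick in the query above is to pair each value with its column number, sort the pairs, and take the last one. The same logic can be sketched in plain Python to make the mechanics visible; this is an illustration of the idea, not Hive code, and the function name max_with_column is hypothetical.

```python
# Sample rows from the question: (Col1, Col2, Col3, Col4).
rows = [
    ("Group A", 1, 2, 3),
    ("Group B", 4, 5, 1),
    ("Group C", 4, 2, 1),
]


def max_with_column(row):
    """Return (group, greatest value, name of the column holding it)."""
    group, *values = row
    # Pair each value with its column number (Col2 -> 2, Col3 -> 3, ...),
    # mirroring struct(Col2,2), struct(Col3,3), struct(Col4,4).
    pairs = sorted((value, index) for index, value in enumerate(values, start=2))
    # After an ascending sort the last pair holds the greatest value,
    # like indexing the sorted array with [2] in the Hive query.
    value, index = pairs[-1]
    return (group, value, f"Col{index}")


result = [max_with_column(row) for row in rows]
for line in result:
    print(line)
```

Because the pairs are compared value first and column number second, a tie between two columns resolves to the higher-numbered column.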

SQL Count across columns

I know that this table structure is horrible and that I should look into database normalization, but this is what I have to work with at the moment.
I need to find the most common number across the columns where one of them has a specific id (in my example 3). Both columns will never have the same value.
Query
SELECT Col1, Col2 FROM scores WHERE Col1 = 3 OR Col2 = 3
Result
+------+------+
| Col1 | Col2 |
+------+------+
| 1 | 3 |
| 3 | 1 |
| 2 | 3 |
| 6 | 3 |
| 3 | 7 |
| 3 | 9 |
| 2 | 3 |
| 5 | 3 |
+------+------+
I'm hoping to get a result like this (I don't need count for 3 since it's the ID, but it can be included)
+-------+-------+
| Value | Count |
+-------+-------+
| 1 | 2 |
| 2 | 2 |
| 5 | 1 |
| 6 | 1 |
| 7 | 1 |
| 9 | 1 |
+-------+-------+
I've tried a few things such as UNION and nested SELECT but that doesn't seem to solve this thing.
Any suggestions?
If you want a count of the values where the OTHER column is 3, then a UNION would work like this:
SELECT value, theCount = COUNT(*)
FROM (
SELECT value = col1
FROM scores
WHERE col2 = 3
UNION ALL
SELECT col2
FROM scores
WHERE col1 = 3) T
GROUP BY value
ORDER BY value;
One way is using case:
SELECT
case Col1 when 3 then Col2 else Col1 end,
count(*)
FROM scores
WHERE Col1 = 3 OR Col2 = 3
Group by
case Col1 when 3 then Col2 else Col1 end;
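To try the CASE approach on the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is a stand-in for the actual database; the aliases value and cnt, the ORDER BY, and the extra row (4, 8), which should be filtered out, are assumptions added for the illustration.

```python
import sqlite3

# SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 3), (3, 1), (2, 3), (6, 3), (3, 7), (3, 9), (2, 3), (5, 3),
     (4, 8)],  # extra row without the id 3; the WHERE clause excludes it
)

# For each matching row, pick the column that is NOT the id 3,
# then count how often each of those values occurs.
counts = conn.execute(
    """
    SELECT CASE col1 WHEN 3 THEN col2 ELSE col1 END AS value,
           COUNT(*) AS cnt
    FROM scores
    WHERE col1 = 3 OR col2 = 3
    GROUP BY CASE col1 WHEN 3 THEN col2 ELSE col1 END
    ORDER BY value
    """
).fetchall()

for value, cnt in counts:
    print(value, cnt)
```

This relies on the statement in the question that both columns never hold the same value, so exactly one of them is the id in every matching row.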