How can I separate same column values to a variable based on value in another column? - sql

Suppose I have the table below:
A B
1 one
2 two
1 three
2 four
1 last
For the rows where A = 1, I need the output one;three;last.
How can I query this in Oracle SQL?

If you care whether you get the string "one;three;last" or "three;one;last" or some other ordering of the three values, you need an additional column to order the results by (a database table is inherently unordered). If there is an id column that you're not showing, for example, you could order by id in the listagg.
If you don't care what order the values appear in the result, you could do something like this:
select listagg( b, ';' ) within group (order by a)
from your_table
where a = 1
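For anyone who wants to try the idea without an Oracle instance, here is a minimal sketch in Python using the built-in sqlite3 module. It swaps in SQLite's group_concat for Oracle's listagg (a different function, with no ordering guarantee in this form). The table name your_table comes from the query above; the helper name concat_b_for_a is made up for illustration.

```python
import sqlite3


def concat_b_for_a(a_value):
    """Return the B values for one A value joined with ';' (order not guaranteed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (a INTEGER, b TEXT)")
    conn.executemany(
        "INSERT INTO your_table (a, b) VALUES (?, ?)",
        [(1, "one"), (2, "two"), (1, "three"), (2, "four"), (1, "last")],
    )
    # group_concat stands in for Oracle's listagg here (an assumption made
    # only so that the sketch runs locally on SQLite).
    row = conn.execute(
        "SELECT group_concat(b, ';') FROM your_table WHERE a = ?",
        (a_value,),
    ).fetchone()
    conn.close()
    return row[0]


if __name__ == "__main__":
    print(concat_b_for_a(1))
```

As with the listagg version, the three values come back in one string, but without a real ordering column their order should not be relied on.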

Related

Valid SQL causes Access error requiring expression in SELECT and GROUP? [duplicate]

I have this:
SELECT name, value,
MIN(value) as find_min
FROM history
WHERE date_num >= 1609459200
AND date_num <= 1640995200
AND name IN('A')
GROUP BY name
I am trying to get the minimum value between the two dates for each name separately:
name value
A 3
B 4
C 9
A 0
C 2
I keep getting this common error:
column "history.value" must appear in the GROUP BY clause or be used in an aggregate function
I have read the question "must appear in the GROUP BY clause or be used in an aggregate function", and I still do not understand:
Why do I have to include everything in GROUP BY? What is the logic?
Why is this not working?
Is MIN() OVER (PARTITION BY name) better, and if so, how can I get only a single result per name?
EDIT:
If I try GROUP BY name, find_min it fails as well, even though in this case it could produce a unique result (the values are all the same).
That is actually easy to understand.
When you say GROUP BY name, all rows where name is the same are grouped together to form a single result row. Now the original table could contain two rows with the same name, but different value. If you add value to the SELECT list, which of those should be output? On the other hand, determining min(value) for each group is no problem.
Even if there is only a single value for the whole group (like with your find_min), you have to add the column to GROUP BY.
There is actually one exception: if the primary key of a table is in the GROUP BY clause, other columns from that table need not be in GROUP BY, because this proves automatically that there can be no different values.
Try the query below:
SELECT name,
MIN(value) as find_min
FROM history
WHERE date_num >= 1609459200 AND date_num <= 1640995200
GROUP BY name
I removed name IN ('A') because you are looking for the minimum value of every name, and that condition would restrict the result to just A.
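To see that query run end to end, here is a small sketch using Python's built-in sqlite3 module (SQLite rather than the asker's database, so this is only an approximation). The date_num values in SAMPLE_ROWS are invented to fall inside the range from the question, and the function name is hypothetical.

```python
import sqlite3

# name/value pairs from the question; the date_num values are made up
# so that every row falls inside the queried range.
SAMPLE_ROWS = [
    ("A", 3, 1609459200),
    ("B", 4, 1612137600),
    ("C", 9, 1614556800),
    ("A", 0, 1617235200),
    ("C", 2, 1619827200),
]


def min_value_per_name(rows, start, end):
    """Return one (name, minimum value) row per name within [start, end]."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE history (name TEXT, value INTEGER, date_num INTEGER)"
    )
    conn.executemany("INSERT INTO history VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT name, MIN(value) AS find_min "
        "FROM history "
        "WHERE date_num >= ? AND date_num <= ? "
        "GROUP BY name "
        "ORDER BY name",
        (start, end),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, find_min in min_value_per_name(SAMPLE_ROWS, 1609459200, 1640995200):
        print(name, find_min)
```

Only name and the aggregate appear in the SELECT list, so every output column has exactly one value per group.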
To answer your question, GROUP BY groups similar data in a table.
For example this table:
A B C
a d 1
a k 2
b d 3
And you have the query:
SELECT A, B, MIN(C)
FROM t
GROUP BY A
This would not work, because there is no decisive answer for what to do with the row a k 2: you group by column A but not by column B, so the group for a contains two rows with different values of B. Therefore you have to group by every column that is not inside an aggregate such as MIN, MAX or SUM.
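The two unambiguous variants of that query can be sketched as follows, again with Python's sqlite3 module as a stand-in database. The table t and its three rows are the ones from the example above; the function name is made up. Note that SQLite itself tolerates the ambiguous form, so the sketch only runs the two well-defined queries.

```python
import sqlite3


def grouped_minimums():
    """Return MIN(C) grouped by A alone, and grouped by A and B together."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (A TEXT, B TEXT, C INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [("a", "d", 1), ("a", "k", 2), ("b", "d", 3)],
    )
    # Option 1: drop B from the SELECT list, one row per value of A.
    by_a = conn.execute(
        "SELECT A, MIN(C) FROM t GROUP BY A ORDER BY A"
    ).fetchall()
    # Option 2: keep B, but then group by it as well.
    by_a_b = conn.execute(
        "SELECT A, B, MIN(C) FROM t GROUP BY A, B ORDER BY A, B"
    ).fetchall()
    conn.close()
    return by_a, by_a_b


if __name__ == "__main__":
    by_a, by_a_b = grouped_minimums()
    print(by_a)
    print(by_a_b)
```

Grouping by A alone collapses the two a rows into one; grouping by A and B keeps them apart because their B values differ.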

Is the ordering of a GROUP BY with a MAX aggregate well defined?

Let's assume I run the following in SQLite:
CREATE TABLE my_table
(
id INTEGER PRIMARY KEY,
NAME VARCHAR(20),
date DATE,
num INTEGER,
important VARCHAR(20)
);
INSERT INTO my_table (NAME, date, num, important)
VALUES ('A', '2000-01-01', 10, 'Important 1');
INSERT INTO my_table (NAME, date, num, important)
VALUES ('A', '2000-02-01', 20, 'Important 2');
INSERT INTO my_table (NAME, date, num, important)
VALUES ('A', '1999-12-01', 30, 'Important 3');
The table looks like this:
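+----+------+------------+-----+-------------+
| id | NAME | date       | num | important   |
+----+------+------------+-----+-------------+
| 1  | A    | 2000-01-01 | 10  | Important 1 |
| 2  | A    | 2000-02-01 | 20  | Important 2 |
| 3  | A    | 1999-12-01 | 30  | Important 3 |
+----+------+------------+-----+-------------+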
If I execute:
SELECT id
FROM my_table
GROUP BY NAME;
the results are:
+----+
| id |
+----+
| 1 |
+----+
If I execute:
SELECT id, MAX(date)
FROM my_table
GROUP BY NAME;
The results are:
+----+------------+
| id | max(date) |
+----+------------+
| 2 | 2000-02-01 |
+----+------------+
And if I execute:
SELECT id,
MAX(date),
MAX(num)
FROM my_table
GROUP BY NAME;
The results are:
+----+------------+----------+
| id | max(date) | max(num) |
+----+------------+----------+
| 3 | 2000-02-01 | 30 |
+----+------------+----------+
My question is, is this well defined? Specifically, am I guaranteed to always get id = 2 when doing the second query (with the single Max(date) aggregate), or is this just a side effect of how SQLite is likely ordering the table to grab the Max before grouping?
I ask this because I specifically do want id = 2. I will then execute another query that selects the important field for that row (for my actual problem the first query would return multiple ids and I'd select all important fields for all those rows at once).
Additionally, this is all happening in an iOS Core Data query, so I'm not able to do more complicated subqueries. If I knew that the ordering of a GROUP BY is defined by an aggregate then I'd feel pretty confident my queries wouldn't break (until Apple moves away from SQLite for Core Data).
Thanks!
From the SQLite manual:
2.5. Bare columns in an aggregate query
The usual case is that all column names in an aggregate query are either arguments to aggregate functions or else appear in the GROUP BY clause. A result column which contains a column name that is not within an aggregate function and that does not appear in the GROUP BY clause (if one exists) is called a "bare" column. Example:
SELECT a, b, sum(c) FROM tab1 GROUP BY a;
In the query above, the "a" column is part of the GROUP BY clause and so each row of the output contains one of the distinct values for "a". The "c" column is contained within the sum() aggregate function and so that output column is the sum of all "c" values in rows that have the same value for "a". But what is the result of the bare column "b"? The answer is that the "b" result will be the value for "b" in one of the input rows that form the aggregate. The problem is that you usually do not know which input row is used to compute "b", and so in many cases the value for "b" is undefined.
Special processing occurs when the aggregate function is either min() or max(). Example:
SELECT a, b, max(c) FROM tab1 GROUP BY a;
When the min() or max() aggregate functions are used in an aggregate query, all bare columns in the result set take values from the input row which also contains the minimum or maximum. So in the query above, the value of the "b" column in the output will be the value of the "b" column in the input row that has the largest "c" value. There is still an ambiguity if two or more of the input rows have the same minimum or maximum value or if the query contains more than one min() and/or max() aggregate function. Only the built-in min() and max() functions work this way.
If bare columns appear in an aggregate query that lacks a GROUP BY clause, and the number of input rows is zero, then the values of the bare columns are arbitrary. For example, in this query:
SELECT count(*), b FROM tab1;
If the tab1 table contains no rows (so count(*) evaluates to 0) then the bare column "b" will have an arbitrary and meaningless value.
Most other SQL database engines disallow bare columns. If you include a bare column in a query, other database engines will usually raise an error. The ability to include bare columns in a query is an SQLite-specific extension.
https://www.sqlite.org/lang_select.html
am I guaranteed to always get id = 2 when doing the second query (with
the single Max(date) aggregate), or is this just a side effect of how
SQLite is likely ordering the table to grab the Max before grouping?
Yes, the result that you get is guaranteed because it is documented in Bare columns in an aggregate query.
The value for the column id that you get is from the row that contains the max date.
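A quick way to check this yourself is the sketch below, which replays the table and the second query from the question through Python's built-in sqlite3 module. The function name is made up; it relies on the bare-column rule quoted above, which applies only when the query has a single min() or max().

```python
import sqlite3


def id_of_latest_row_per_name():
    """Run the single-MAX query from the question and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table ("
        " id INTEGER PRIMARY KEY,"
        " NAME VARCHAR(20),"
        " date DATE,"
        " num INTEGER,"
        " important VARCHAR(20))"
    )
    conn.executemany(
        "INSERT INTO my_table (NAME, date, num, important) VALUES (?, ?, ?, ?)",
        [
            ("A", "2000-01-01", 10, "Important 1"),
            ("A", "2000-02-01", 20, "Important 2"),
            ("A", "1999-12-01", 30, "Important 3"),
        ],
    )
    # id is a bare column: with exactly one max(), SQLite takes it from the
    # row that holds the maximum date.
    rows = conn.execute(
        "SELECT id, MAX(date) FROM my_table GROUP BY NAME"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(id_of_latest_row_per_name())
```

With two aggregates, as in the third query from the question, the manual says the choice of row is ambiguous again, so the guarantee should not be stretched to that case.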

Order By clause in sql server

Suppose there is a table, and I need to sort it alphabetically by one column (name) and, at the same time, sort rows that have the same name by the ID column in ascending order. I fail to understand how this works. Once the records are sorted by name, will it then sort all rows again by the id column?
Can someone explain how the ORDER BY clause actually works in this case?
select name,
id
from hack h
order by name,
id
Use order by name, id:
select name,
id
from hack
order by name,
id
If I understand correctly, you want to know what happens when the ORDER BY clause has two or more columns. Let's go through an example.
The first column is id and the second is name:
2 A
5 B
6 A
3 A
1 B
The query "select name,id from hack order by name,id" returns the following:
A 2
A 3
A 6
B 1
B 5
As you can see, it sorts by the name column first, and then sorts by id within each group of rows that share the same name.
I hope that makes it clear.
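If you would like to reproduce that example, here is a minimal sketch with Python's built-in sqlite3 module (SQLite instead of SQL Server, which behaves the same way for a plain two-column ORDER BY). The table hack and its five rows are the ones from the example; the function name is made up.

```python
import sqlite3


def sorted_by_name_then_id():
    """Return the example rows ordered by name, with ties broken by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hack (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO hack (id, name) VALUES (?, ?)",
        [(2, "A"), (5, "B"), (6, "A"), (3, "A"), (1, "B")],
    )
    # name is the primary sort key; id only decides the order of rows
    # whose name is equal.
    rows = conn.execute(
        "SELECT name, id FROM hack ORDER BY name, id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, row_id in sorted_by_name_then_id():
        print(name, row_id)
```

The rows are not sorted twice: the second column is consulted only when two rows compare equal on the first.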
This answers the original question.
In the code you posted:
substring(name, len(name) - 2, len(name))
returns the last 3 characters of the name.
So you are sorting by these last 3 characters and not by name.
When there are 2 names with the same last 3 characters these will be sorted by id.
If there is more than one column name after the "order by" keyword, the system orders the records by the first column listed, and uses each following column only to order rows that have equal values in the columns before it.

Query to find duplicate values for two fields

Sorry for the title, but I didn't know how to explain it.
I have a table that has two fields, A and B.
I want to find all rows in the table that have a duplicate A (more than one record), but A should count as a duplicate only if B differs between the rows.
Example:
Field A Field B
10 10
10 10 // This is not a duplicate
10 10
10 5 // This is a duplicate
How can I do this in a single query?
Let's break this down into how you would go about constructing such a query. You don't make it clear whether you're looking for all values of A or all rows but let's assume all values of A initially.
The first step therefore is to create a list of all values of A. This can be done two ways, DISTINCT or GROUP BY. I'm going to use GROUP BY because of what else you want to do:
select a
from your_table
group by a
This returns a single column that is unique on A. Now, how can you change this to give you the unique values? The most obvious thing to use is the HAVING clause, which allows you to restrict on aggregated values. For instance the following will give you all values of A which only appear once in the table
select a
from your_table
group by a
having count(*) = 1
That is, the count of all rows for that value of A is 1. You don't want this, of course; you want to apply the condition to column B. There needs to be more than one value of B for the situation you want to identify to be possible (if there's only one value of B then it's impossible). This gets us to:
select a
from your_table
group by a
having count(b) > 1
This still isn't enough, as you want two different values of B. The above just counts the number of records in the group with a non-null value of B. Inside an aggregate function you can use the DISTINCT keyword to count only unique values, bringing us to:
select a
from your_table
group by a
having count(distinct b) > 1
To transcribe this into English: select all unique values of A from YOUR_TABLE that have more than one distinct value of B in the group.
You can use this method, or something similar, to build up your own queries as you create them. Determine what you want to achieve and slowly build up to it.
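To compare the last two steps side by side, here is a rough sketch using Python's built-in sqlite3 module. The first four rows are the ones from the question; the two rows with A = 20 are added purely for illustration, as are the function and constant names.

```python
import sqlite3

# Rows from the question, plus two invented rows (A = 20) where B never differs.
SAMPLE_ROWS = [(10, 10), (10, 10), (10, 10), (10, 5), (20, 7), (20, 7)]


def values_of_a(having_clause):
    """Return the values of A whose group satisfies the given HAVING clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", SAMPLE_ROWS)
    # having_clause is one of the fixed strings used below,
    # never user input.
    query = (
        "SELECT a FROM your_table GROUP BY a HAVING "
        + having_clause
        + " ORDER BY a"
    )
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


if __name__ == "__main__":
    print(values_of_a("count(b) > 1"))           # any repeated A
    print(values_of_a("count(distinct b) > 1"))  # repeated A with differing B
```

count(b) flags every A that occurs more than once, while count(distinct b) flags only those whose rows disagree on B.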
select FIELD from your_table group by FIELD having count(b) > 1
Take into consideration that this counts every duplicate row, not the number of different values.
For example, if you have the values
1
1
2
1
the count for value 1 is 3, not 2.