SQL - querying without duplicates based on another column / improving the condition

I have written a query that involves joins and returns the result below:
Name ID
AAA 1
BBB 1
BBB 6
CCC 1
CCC 6
DDD 6
EEE 1
But I want to filter the result further: when a value in the first column appears more than once, the row with the lower ID should be dropped. That is, the BBB and CCC rows with ID 1 should be removed. The result should be:
AAA 1
BBB 6
CCC 6
DDD 6
EEE 1
Note: my query has the condition WHERE (ID = '6' OR ID = '1'). Is there any way to improve this condition so that it says WHERE ID = 6, or ID = 1 only if no 6 is available in that table?

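You will likely want to add:
GROUP BY name
to the bottom of your query and change ID to MAX(ID) in your SELECT statement. Since 6 is greater than 1, MAX(ID) returns 6 whenever a name has both rows and falls back to 1 otherwise.
It is hard to give a more specific answer without seeing the query you have already written.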
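As a rough sketch of that suggestion, the snippet below runs the GROUP BY / MAX(ID) query against an in-memory SQLite database from Python. The table name `joined` is hypothetical and stands in for the output of the asker's join query, which is not shown; SQLite is only a convenient stand-in for whatever database is actually in use.

```python
import sqlite3

# Hypothetical table standing in for the result of the (unseen) join query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joined (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?)",
    [("AAA", 1), ("BBB", 1), ("BBB", 6), ("CCC", 1),
     ("CCC", 6), ("DDD", 6), ("EEE", 1)],
)

# One row per name; MAX(id) prefers 6 over 1 when both exist.
rows = conn.execute(
    """
    SELECT name, MAX(id) AS id
    FROM joined
    WHERE id IN (1, 6)
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for name, id_ in rows:
    print(name, id_)
```

Keeping the WHERE clause unchanged and letting MAX(ID) pick the winner avoids having to express "1 only if there is no 6" in the condition itself.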

Related

Mariadb Building the best INDEX for a given SELECT - GROUP BY

I do not have much database knowledge.
To learn, I am reading MariaDB's documentation on indexes, but there are parts that I do not understand.
From the document:
Algorithm, step 2b (GROUP BY)
WHERE aaa = 123 AND bbb = 1 GROUP BY ccc ⇒ INDEX(bbb, aaa, ccc) or INDEX(aaa, bbb, ccc) (='s first, in any order; then the GROUP BY)
My understanding: the order of the columns in the index matters, but the order of the conditions in the WHERE clause does not. So the aaa and bbb parts of the index are used for the WHERE clause, and ccc is then sorted within the rows that match aaa and bbb.
GROUP BY x,y ⇒ INDEX(x,y) (no WHERE)
Does (no WHERE) mean that I must not use a WHERE clause?
What if I use it like this?
WHERE x > 1 GROUP BY x, y
My thinking:
(1) from table
(2) where x > 1 -> uses the index
(3) group by x, y -> uses the index? Because (2) already left the rows sorted, or are they sorted again?
(4) having -> if I do not enter this keyword, is it not used?
(5) select -> output the data(?)
(6) order by -> does group by already do the order by(?)

Algorithm, step 2b (GROUP BY)
WHERE aaa = 123 AND bbb = 1 GROUP BY ccc ⇒ INDEX(bbb, aaa, ccc) or INDEX(aaa, bbb, ccc) (='s first, in any order; then the GROUP BY)
Suppose there is a table like this:
aaa | bbb | ccc
------------------
123 | 1 | 30
123 | 1 | 48
123 | 2 | 27
125 | 1 | 11
125 | 3 | 29
125 | 3 | 40
The result of the WHERE aaa = 123 AND bbb = 1 clause is:
aaa | bbb | ccc
------------------
123 | 1 | 30
123 | 1 | 48
Look at the ccc column.
Within the rows that share the same aaa and bbb values, the index keeps ccc in sorted order.
So the GROUP BY clause can form its groups quickly, because the ccc values are already sorted.
**CAUTION**
Now think about the clause WHERE aaa >= 123 AND bbb = 1 GROUP BY ccc.
aaa | bbb | ccc
------------------
123 | 1 | 30
123 | 1 | 48
125 | 1 | 11
Here the ccc column is not sorted.
The ordering of ccc is meaningful only among rows whose aaa and bbb columns have the same values.
GROUP BY x,y ⇒ INDEX(x,y) (no WHERE)
The same reasoning applies here.
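The point about equality versus range conditions can be checked with a small sketch. It models INDEX(aaa, bbb, ccc) as a list of tuples in sorted order, which is an assumed simplification of how an index orders its entries, not MariaDB's actual implementation. The helper name `ccc_in_index_order` is made up for the illustration.

```python
# The example rows from the table above, as (aaa, bbb, ccc) tuples.
rows = [
    (123, 1, 30), (123, 1, 48), (123, 2, 27),
    (125, 1, 11), (125, 3, 29), (125, 3, 40),
]

# Simplified model of INDEX(aaa, bbb, ccc): entries kept in tuple order.
index = sorted(rows)

def ccc_in_index_order(predicate):
    """Return the ccc values of matching rows, in the order the index yields them."""
    return [ccc for (aaa, bbb, ccc) in index if predicate(aaa, bbb)]

# Equality on both leading columns: ccc comes out already sorted.
equality = ccc_in_index_order(lambda aaa, bbb: aaa == 123 and bbb == 1)

# Range on aaa: rows from different aaa values interleave, so ccc is not sorted.
range_scan = ccc_in_index_order(lambda aaa, bbb: aaa >= 123 and bbb == 1)

print(equality, equality == sorted(equality))
print(range_scan, range_scan == sorted(range_scan))
```

With the range condition, the grouping step can no longer rely on the index order of ccc and needs some other way to collect the groups.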

GROUP BY x,y ⇒ INDEX(x,y) (no WHERE)
should probably say "(if there is no WHERE)". If there is a WHERE, then that index may or may not be useful. You should (usually) build the INDEX based on the WHERE, and only if you get past it, consider the GROUP BY.
WHERE x > 1 GROUP BY x, y
OK, that can use INDEX(x,y), in that order. First, it will filter, and that leaves the rest of the index still in a good order for the grouping. Similarly:
WHERE x > 1 ORDER BY x, y
WHERE x > 1 GROUP BY x, y ORDER BY x, y
No sorting should be necessary.
So, here are the steps I might take:
1. WHERE x > 1 ... --> INDEX(x) (or any index _starting_ with `x`)
2. ... GROUP BY x, y --> INDEX(x,y)
3. recheck that I did not mess up the WHERE.
This has no really good index:
WHERE x > 1 AND y = 4 GROUP BY x,y
1. WHERE x > 1 AND y = 4 ... --> INDEX(y,x) in this order!
2. ... GROUP BY x,y --> can use that index
However, flipping to GROUP BY y,x has the same effect (ignoring the order of display).
(4) having -> if i did not enter this keyword, is it not used?
HAVING, if present, is applied after things for which INDEXes are useful. Having no HAVING does mean there is no HAVING.
(6) order by -> group by already order by(?)
That has become a tricky question. Until very recently (MySQL 8.0; don't know when or if MariaDB changed), GROUP BY implied the equivalent ORDER BY. That was non-standard and potentially interfered with optimization. With 8.0, GROUP BY does not imply any order; you must explicitly request the order (if you care).
(I updated the source document in response to this discussion.)
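A minimal sketch of the last point, requesting the order explicitly instead of relying on GROUP BY. It uses Python's built-in SQLite purely as a stand-in, so it says nothing about how MariaDB or MySQL would plan the query; the table `t`, the index name `idx_x_y`, and the sample data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
# Composite index in the order discussed above: x first, then y.
conn.execute("CREATE INDEX idx_x_y ON t (x, y)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(3, 2), (1, 5), (2, 9), (3, 1), (2, 9), (3, 2)],
)

# The ORDER BY is spelled out rather than assumed from the GROUP BY.
grouped = conn.execute(
    """
    SELECT x, y, COUNT(*) AS n
    FROM t
    WHERE x > 1
    GROUP BY x, y
    ORDER BY x, y
    """
).fetchall()

for row in grouped:
    print(row)
```

When the index already delivers rows in (x, y) order, the explicit ORDER BY should cost little, and it keeps the result order well defined on servers where GROUP BY no longer implies sorting.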

Remove group rows

How do I remove rows within each ID group, based on the other columns, such that:
ID   Att  Comp Att.  Inc. Att
aaa  2    0          2
aaa  2    0          2
bbb  3    1          2
bbb  3    1          2
bbb  3    0          2
becomes:
ID   Att  Comp Att.  Inc. Att
aaa  2    0          2
bbb  3    1          2
I need to discard rows that are not just exact duplicates, but that also convey the same data for an ID based on the other columns.

Use drop_duplicates; see the documentation at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html
I can't tell for sure from your description which columns should be considered when looking for duplicates, but you can tell drop_duplicates which column(s) to look at.

I want to get my output like this using sql.

Input:
id type value
1 a aa
1 a aaa
1 b bb
1 b bbb
1 c cc
1 c ccc
Output:
id type_a type_b type_c
1 aa;aaa bb;bbb cc;ccc
I need to do this in DB2.

Please give some information about your database version, as different systems and versions offer different techniques.
Refer to the article below for the possible ways of concatenating strings:
String Aggregation Techniques
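As one illustration of the general idea (conditional aggregation combined with a string-aggregation function), here is a sketch using SQLite's group_concat from Python. This is not DB2 syntax: on Db2 versions that provide it, LISTAGG plays the analogous role, but whether the asker's version has it is an assumption. The table name `src` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, type TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?)",
    [(1, "a", "aa"), (1, "a", "aaa"), (1, "b", "bb"),
     (1, "b", "bbb"), (1, "c", "cc"), (1, "c", "ccc")],
)

# CASE yields NULL for other types; group_concat skips NULLs,
# so each output column only collects the values of its own type.
pivoted = conn.execute(
    """
    SELECT id,
           group_concat(CASE WHEN type = 'a' THEN value END, ';') AS type_a,
           group_concat(CASE WHEN type = 'b' THEN value END, ';') AS type_b,
           group_concat(CASE WHEN type = 'c' THEN value END, ';') AS type_c
    FROM src
    GROUP BY id
    """
).fetchall()

for row in pivoted:
    print(row)
```

The order of the values inside each concatenated string is not guaranteed by this query, so code that depends on it should sort explicitly.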

How to Combine field of different row in MS-Access? [duplicate]

Possible Duplicate:
is there a group_concat function in ms-access?
Ms Access Query: Concatenating Rows through a query
I have the following table:
ID TAG
----------
1 AAA
1 BBB
2 CCC
2 DDD
2 EEE
I want to get the following output:
1 AAA, BBB
2 CCC, DDD, EEE
How can I get this combined field as a result in MS-Access?

View to replace values with max value corresponding to a match

I am sure my question is very simple for some, but I cannot figure it out and it is one of those things difficult to search an answer for. I hope you can help.
In a table in SQL I have the following (simplified data):
UserID UserIDX Number Date
aaa bbb 1 21.01.2000
aaa bbb 5 21.01.2010
ppp ggg 9 21.01.2009
ppp ggg 3 15.02.2020
xxx bbb 99 15.02.2020
And I need a view which will give me the same amount of records, but for every combination of UserID and UserIDX, there should be only 1 value under the Number field, i.e. the highest value found in the combination data set. The Date field needs to remain unchanged. So the above would be transformed to:
UserID UserIDX Number Date
aaa bbb 5 21.01.2000
aaa bbb 5 21.01.2010
ppp ggg 9 21.01.2009
ppp ggg 9 15.02.2020
xxx bbb 99 15.02.2020
So, for all instances of aaa+bbb combination the unique value in Number should be 5 and for ppp+ggg the unique number is 9.
Thank you very much.
Leo

-- "table" stands for your actual table name
select a.userid, a.useridx, b.maxnum as number, a.date
from table a
inner join (
select userid, useridx, max(number) as maxnum
from table
group by userid, useridx) b
on a.userid = b.userid and a.useridx = b.useridx
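To try this join out as an actual view, here is a small sketch against an in-memory SQLite database in Python. The table name `user_numbers` and view name `user_max_numbers` are hypothetical, since the question does not name the table, and SQLite merely stands in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_numbers (userid TEXT, useridx TEXT, number INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO user_numbers VALUES (?, ?, ?, ?)",
    [("aaa", "bbb", 1, "21.01.2000"),
     ("aaa", "bbb", 5, "21.01.2010"),
     ("ppp", "ggg", 9, "21.01.2009"),
     ("ppp", "ggg", 3, "15.02.2020"),
     ("xxx", "bbb", 99, "15.02.2020")],
)

# Join every row to the maximum number of its (userid, useridx) pair.
conn.execute(
    """
    CREATE VIEW user_max_numbers AS
    SELECT a.userid, a.useridx, b.maxnum AS number, a.date
    FROM user_numbers AS a
    INNER JOIN (
        SELECT userid, useridx, MAX(number) AS maxnum
        FROM user_numbers
        GROUP BY userid, useridx
    ) AS b
      ON a.userid = b.userid AND a.useridx = b.useridx
    """
)

view_rows = conn.execute(
    "SELECT userid, useridx, number, date FROM user_max_numbers"
).fetchall()

for row in view_rows:
    print(row)
```

The view returns as many rows as the base table, with the Date values untouched and only the Number column replaced by the per-pair maximum.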
