Postgres Group by like fuzzy logic - sql

I want to group the rows of a table by name, but the names are only similar, not identical.
id  name              subject_id
---------------------------------
1   Ganeash           1
2   Ganesha P         2
3   Shree Ganesh Pai  1
4   Gaanesh shree G   1
5   Ramesh shri       2
Every row except the last contains some variant of Ganesh, so the output should be:
name    count
-------------
Ganesh  4
Ramesh  1
If I try the soundex function, I get an error:
postgres=# SELECT soundex('hello world!');
ERROR: function soundex(unknown) does not exist
LINE 1: SELECT soundex('hello world!');
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

You can use CASE:
select name,count(*) from (
select case when name like '%Ganesh%' THEN 'Ganesh'
when name like '%Ramesh%' THEN 'Ramesh' end as name
from test) a
group by name

It is pretty simple, you just need to split the text field into an array of words and then unnest the arrays into rows. Then you manipulate the rows using standard SQL (count(), GROUP BY, etc.):
SELECT count(*),
unnest(regexp_split_to_array(name, E'\\s+')) AS name2
FROM names
GROUP BY name2
ORDER BY 1 DESC
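Splitting on whitespace counts exact words only, so spellings such as Ganeash and Gaanesh still end up in separate groups, and a LIKE '%Ganesh%' pattern does not catch them either. Below is a minimal sketch of fuzzy grouping done outside the database, in Python with the standard difflib module. The list of canonical names and the 0.8 similarity threshold are assumptions picked for the sample rows, not something given in the question.

```python
from collections import Counter
from difflib import SequenceMatcher

# Sample rows from the question.
names = ["Ganeash", "Ganesha P", "Shree Ganesh Pai", "Gaanesh shree G", "Ramesh shri"]

# Assumption: the target spellings are known in advance, as in the CASE answer.
CANONICAL = ["Ganesh", "Ramesh"]
THRESHOLD = 0.8  # assumed cut-off; tune it for real data


def best_canonical(name):
    """Return the canonical name most similar to any word in `name`, or None."""
    best_score, best_name = 0.0, None
    for word in name.lower().split():
        for canon in CANONICAL:
            score = SequenceMatcher(None, word, canon.lower()).ratio()
            if score > best_score:
                best_score, best_name = score, canon
    return best_name if best_score >= THRESHOLD else None


# Count rows per matched canonical name.
counts = Counter(best_canonical(n) for n in names)
for name, count in counts.most_common():
    print(name, count)
```

For the sample rows this yields 4 for Ganesh and 1 for Ramesh. Inside PostgreSQL, soundex becomes available only once the fuzzystrmatch extension is installed, which is why the call in the question fails.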

Related

merging multiple rows into one based on id

I have data in this format in an Amazon Redshift database:
id  answer
1   house
1   apple
1   moon
1   money
2   123
2   xyz
2   abc
and what I am looking for is:
id  answer
1   house, apple, moon, money
2   123, xyz, abc
Any idea? I cannot hard-code the answers because they vary, so I would prefer a solution that simply collects the answers from each id's rows and joins them with a delimiter.
You can use the aggregate function listagg:
select id , listagg(answer,',')
from table
group by id
You can use string_agg(concat(answer,''),',') with GROUP BY, like this:
select id , string_agg(concat(answer,''),',') as answer
from table
group by id
Edit:
You don't need the concat; you can just use string_agg(answer,',').
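To try the aggregate-and-join idea without a Redshift cluster, here is a small runnable sketch using Python's built-in sqlite3 module. SQLite's group_concat plays the role of listagg / string_agg here; the table name answers is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the question does not name the table.
conn.execute("CREATE TABLE answers (id INTEGER, answer TEXT)")
conn.executemany(
    "INSERT INTO answers (id, answer) VALUES (?, ?)",
    [(1, "house"), (1, "apple"), (1, "moon"), (1, "money"),
     (2, "123"), (2, "xyz"), (2, "abc")],
)

# group_concat is SQLite's counterpart of listagg / string_agg.
rows = conn.execute(
    "SELECT id, group_concat(answer, ', ') FROM answers GROUP BY id ORDER BY id"
).fetchall()

merged = dict(rows)  # id -> joined answers
for id_, joined in merged.items():
    print(id_, joined)
```

The order of the items inside each joined string is not guaranteed unless the aggregate is given an explicit ordering, which listagg supports through WITHIN GROUP (ORDER BY ...).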

How to sort the string 'MH/122020/[xx]x' in an Access query?

I am trying to sort the numbers,
MH/122020/101
MH/122020/2
MH/122020/145
MH/122020/12
How can I sort these in an Access query?
I tried format(mid(first(P.PFAccNo),11),"0") but it didn't work.
You need to use expressions in your ORDER BY clause. For test data
ID  PFAccNo
--  -------------
1   MH/122020/101
2   MH/122020/2
3   MH/122020/145
4   MH/122020/12
5 MH/122021/1
the query
SELECT PFAccNo, ID
FROM P
ORDER BY
Left(PFAccNo,9),
Val(Mid(PFAccNo,11))
returns
PFAccNo        ID
-------------  --
MH/122020/2    2
MH/122020/12   4
MH/122020/101  1
MH/122020/145  3
MH/122021/1    5
You have to convert the substring starting at position 11 to a number; numbers sort numerically, whereas text sorts character by character.
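The same two-part sort key (text prefix, then the suffix as a number) can be illustrated outside Access. This is only a Python sketch of the idea, using the test values from the answer above; the helper name sort_key is made up for the example.

```python
accounts = [
    "MH/122020/101",
    "MH/122020/2",
    "MH/122020/145",
    "MH/122020/12",
    "MH/122021/1",
]


def sort_key(acc_no):
    """Split at the last '/' and compare the trailing part as an integer."""
    prefix, _, number = acc_no.rpartition("/")
    return prefix, int(number)


ordered = sorted(accounts, key=sort_key)
for acc_no in ordered:
    print(acc_no)
```

A plain sorted(accounts) compares the strings character by character and would put MH/122020/101 before MH/122020/12, which is the behaviour the question is trying to avoid.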
How about this?
SELECT
tmpTbl.yourFieldName
FROM
tmpTbl
ORDER BY
CLng(Mid([tmpTbl].[yourFieldname],InStrRev([tmpTbl].[yourFieldname],"/")+1));
Given the following data in my test_table, column DATETIMESTAMP:
XXX123
YYY000
XXX-1234
my Statement:
SELECT CInt(Mid(datetimestamp,4)) AS Ausdr1
FROM test_data
ORDER BY 1;
sorts my data. Please change 4 to 11 and it will work for you.

Counting the number of rows based on like values

I'm a little bit lost on this. I would like to list the number of names beginning with the same letter, and find the total amount of names containing that first same letter.
For instance:
name | total
-------|--------
A | 12
B | 10
C | 8
D | 7
E | 3
F | 2
...
Z | 1
12 names beginning with letter 'A', 10 with 'B' and so on.
This is what I have so far
SELECT
LEFT(customers.name,1) AS 'name'
FROM customers
WHERE
customers.name LIKE '[a-z]%'
GROUP BY name
However, I'm unsure how I would add up columns based on like values.
This should work for you:
SELECT
LEFT(customers.name,1) AS 'name',
COUNT(*) AS NumberOfCustomers
FROM customers
WHERE
customers.name LIKE '[a-z]%'
GROUP BY LEFT(customers.name,1)
EDIT: Forgot the explanation; as many have mentioned already, you need to group on the calculation itself and not the alias you give it, as the GROUP BY operation actually happens prior to the SELECT and therefore has no idea of the alias yet. The COUNT part you would have figured out easily. Hope that helps.
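As a quick way to see grouping on the expression itself in action, here is a sketch using Python's built-in sqlite3 module. SQLite has no LEFT(), so substr(name, 1, 1) stands in for it; the customer names are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
# Invented sample data for illustration.
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [("Alice",), ("Adam",), ("anna",), ("Bob",), ("Bea",), ("Carl",)],
)

# Group by the same expression that is selected, not by the column itself.
rows = conn.execute(
    """
    SELECT upper(substr(name, 1, 1)) AS initial, COUNT(*) AS total
    FROM customers
    GROUP BY upper(substr(name, 1, 1))
    ORDER BY initial
    """
).fetchall()

for initial, total in rows:
    print(initial, total)
```

The upper() call is an addition so that "anna" is counted under A; drop it if upper and lower case initials should be kept apart.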
You don't want to count the names, but only the first letters. So you must not group by name, but group by the first letter
SELECT LEFT(name, 1) AS name, count(*)
FROM customers
GROUP BY LEFT(name, 1)

Select required data in SQL WHERE condition

I have a table like
EID  Name   Desc
1    DMK    Den (Obsolete)
2    KMPL   K descforce
3    SFFSS  system force (Obsolete)
4    QEMPL  Yes
5    BGRNK  BoardGMP
6    JIGG   J G (obsolete)
How do I retrieve the rows where Desc does not contain (Obsolete)? The result table should look like this:
EID  Name   Desc
2    KMPL   K descforce
4    QEMPL  Yes
5    BGRNK  BoardGMP
How to specify that in WHERE clause of sql query?
You are looking for the not like clause. In MS Access, this uses * as a wildcard, so the query is:
select t.*
from table as t
where [desc] not like "*obsolete*";
I forget if like in Access is case sensitive, so you might need:
select t.*
from table as t
where lcase([desc]) not like "*obsolete*";
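For readers without Access at hand, the same filter can be tried with Python's built-in sqlite3 module. Note two differences that are specific to this sketch: standard SQL uses % rather than * as the wildcard, and the table name items is an assumption. The column is quoted because desc is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; "desc" must be quoted because it is a keyword.
conn.execute('CREATE TABLE items (eid INTEGER, name TEXT, "desc" TEXT)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, "DMK", "Den (Obsolete)"),
        (2, "KMPL", "K descforce"),
        (3, "SFFSS", "system force (Obsolete)"),
        (4, "QEMPL", "Yes"),
        (5, "BGRNK", "BoardGMP"),
        (6, "JIGG", "J G (obsolete)"),
    ],
)

# lower() makes the match independent of how "Obsolete" is capitalised.
rows = conn.execute(
    """
    SELECT eid, name, "desc"
    FROM items
    WHERE lower("desc") NOT LIKE '%obsolete%'
    ORDER BY eid
    """
).fetchall()

for row in rows:
    print(row)
```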

how to select one tuple in rows based on variable field value

I'm quite new to SQL and I'd like to write a SELECT statement that retrieves only the first row of each set of rows sharing a column value. I'll try to make it clearer with a table example.
Here is my table data :
chip_id | sample_id
-------------------
1 | 45
1 | 55
1 | 5986
2 | 453
2 | 12
3 | 4567
3 | 9
I'd like a SELECT statement that fetches one row for each chip_id (1, 2, 3),
Like this :
chip_id | sample_id
-------------------
1 | 45 or 55 or whatever
2 | 12 or 453 ...
3 | 9 or ...
How can I do this?
Thanks
I'd probably:
set a variable to 0
order your table by chip_id
read the table row by row
if the row's chip_id is greater than the variable, store the row in a result array and set the variable to that chip_id
loop until done
return your result array
Depending on your database, query and version, though, you'll probably get unpredictable or unreliable results.
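The loop described above can be sketched in a few lines of Python. This is an illustration of the procedure on the question's sample data, not a recommendation over doing it in SQL; the function name first_per_chip is made up for the example.

```python
# (chip_id, sample_id) pairs from the question.
rows = [(1, 45), (1, 55), (1, 5986), (2, 453), (2, 12), (3, 4567), (3, 9)]


def first_per_chip(rows):
    """Keep the first row seen for each chip_id after ordering by chip_id."""
    result = []
    last_chip = None
    # sorted() is stable, so rows with equal chip_id keep their input order.
    for chip_id, sample_id in sorted(rows, key=lambda r: r[0]):
        if chip_id != last_chip:
            result.append((chip_id, sample_id))
            last_chip = chip_id
    return result


for chip_id, sample_id in first_per_chip(rows):
    print(chip_id, sample_id)
```

Which row counts as "first" here depends entirely on the order in which the rows arrive, which is the unreliability the answer warns about.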
You can get one value using row_number():
select chip_id, sample_id
from (select chip_id, sample_id,
row_number() over (partition by chip_id order by rand()) as seqnum
from your_table -- placeholder; the question does not name the table
) t
where seqnum = 1
This returns a random value. In SQL, tables are inherently unordered, so there is no concept of "first". You need an auto incrementing id or creation date or some way of defining "first" to get the "first".
If you have such a column, then replace rand() with the column.
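A runnable sketch of the row_number() approach, using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). Here sample_id itself is used as the ordering column, purely as an assumption so that the result is deterministic; the table name samples is also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name for the question's data.
conn.execute("CREATE TABLE samples (chip_id INTEGER, sample_id INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, 45), (1, 55), (1, 5986), (2, 453), (2, 12), (3, 4567), (3, 9)],
)

# Number the rows inside each chip_id group, then keep only row 1.
rows = conn.execute(
    """
    SELECT chip_id, sample_id
    FROM (
        SELECT chip_id, sample_id,
               row_number() OVER (PARTITION BY chip_id ORDER BY sample_id) AS seqnum
        FROM samples
    ) AS t
    WHERE seqnum = 1
    ORDER BY chip_id
    """
).fetchall()

for chip_id, sample_id in rows:
    print(chip_id, sample_id)
```

With ORDER BY sample_id the "first" row is the smallest sample_id per chip; swapping in a creation date or auto-increment column changes which row is picked.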
Provided I understood your output, if you are using PostgreSQL 9, you can use this (string_agg expects text, so a numeric sample_id is cast first):
SELECT chip_id ,
string_agg(sample_id::text, ' or ')
FROM your_table
GROUP BY chip_id
You need to group your data with a GROUP BY query.
When you group, you generally want the max, the min, or some other value to represent each group. You can compute sums, counts, and all kinds of group operations.
For your example, you don't seem to want a specific group operation, so the query could be as simple as this one:
SELECT chip_id, MAX(sample_id)
FROM table
GROUP BY chip_id
This way you are retrieving the maximum sample_id for each of the chip_id.