sql distinct on one column of a group


SELECT DISTINCT inta, name, PHN# FROM nydta.adres
WHERE inta <> ' '
I want DISTINCT to apply only to inta. A lot of the time the phone is blank, so those rows come through as extra results. I do want all the columns, but distinct on inta.
Secondly, inta is an internet (email) address column, and I would like to exclude one domain, say
@excludethisdomain.com
The data looks like this:
ACCOUNT@ALLSTARS.COM GATES LOU 212-555-1212 ALLSTARREADING
PHERWESTBARN@MSN.COM BARN HEAT 212-555-1212
PHERWESTBARN@MSN.COM BARN RALP EARLS
The second and third rows share an email address, so I only want one of them.

Going by the comments under the question: if you want one row per distinct email, and it does not matter which of the names is picked for a given email, you can use subqueries to select the values:
select distinct
t.inta ,
(select top 1 a.name from nydta.adres a where a.inta=t.inta) name,
(select top 1 a.PHH from nydta.adres a where a.inta=t.inta) PHH
from
nydta.adres t
where
inta <> ' '
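Another way to get one row per address, and to cover the second part of the question (excluding one domain), is to group by the address and aggregate the other columns. The following is only a sketch under assumptions: it uses Python's sqlite3 module with an in-memory table as a stand-in for the asker's database, and the column name phone, the excluded domain and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows modelled on the question's data; the column
# names (inta, name, phone) are assumptions for this sketch.
rows = [
    ("ACCOUNT@ALLSTARS.COM", "GATES LOU", "212-555-1212"),
    ("PHERWESTBARN@MSN.COM", "BARN HEAT", "212-555-1212"),
    ("PHERWESTBARN@MSN.COM", "BARN RALP", ""),
    ("SOMEONE@EXCLUDETHISDOMAIN.COM", "DOE JANE", "212-555-0000"),
    (" ", "NO EMAIL", ""),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE adres (inta TEXT, name TEXT, phone TEXT)")
conn.executemany("INSERT INTO adres VALUES (?, ?, ?)", rows)

# One row per address: GROUP BY the address and aggregate the rest.
# The NOT LIKE filter drops every address in the unwanted domain.
query = """
    SELECT inta, MIN(name) AS name, MAX(phone) AS phone
    FROM adres
    WHERE inta <> ' '
      AND UPPER(inta) NOT LIKE '%@EXCLUDETHISDOMAIN.COM'
    GROUP BY inta
    ORDER BY inta
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

MIN(name) and MAX(phone) are picked independently, so the name and phone in a result row can come from different source rows. MAX(phone) simply prefers a non-blank phone over a blank one.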

Related

Get all records where each words in string exists on any of the columns in a table

I am building search functionality and need help with a Postgres query. My use case: given an input string, what is the best (most optimized) way in Postgres to get all records in which any of the words in the string appears in any of the table's columns?
Sample table (the table I am working with has 40 columns):
FName Occupation
John Engineer
Carlos Doctor
Case 1: Given the string 'John Doctor', it would return both records.
Output:
FName Occupation
John Engineer
Carlos Doctor
Case 2: Given the string 'John Engineer', it would return only 1 row.
Output:
FName Occupation
John Engineer
Case 3: Given the string 'Carlos', it would return 1 row.
Output:
FName Occupation
Carlos Doctor

Basically, you want to do the following:
SELECT FName, Occupation
FROM yourtable
WHERE
'John' IN (FName, Occupation) OR
'Doctor' IN (FName, Occupation);
I don't know if this is already a sufficient answer for you because it's unclear if the logic to fetch the different names from your "search string" must be written as SQL query, too. I think that's a much better task for your application.
If this must also be done in pure SQL, you could use UNNEST to split your string.
Something like this:
WITH sub AS
(SELECT UNNEST(STRING_TO_ARRAY('John Doctor', ' ')) AS searchNames)
SELECT
DISTINCT y.FName, y.Occupation
FROM yourtable y, sub
WHERE
sub.searchNames IN (y.FName, y.Occupation);
This will split your string by spaces into the different names, i.e. you need to provide a search string in the form you have mentioned, with a space between the names.
This will produce the correct results according to your description.
This can be extended to as many columns as needed. For example, let's add a further column col and search for Test3 in it as well; the query then looks like this:
SELECT FName, Occupation,col
FROM yourtable
WHERE 'John' IN (FName, Occupation, col)
OR 'Doctor' IN (FName, Occupation, col)
OR 'Test3' IN (FName, Occupation, col);
Or again with UNNEST like this:
WITH sub AS
(SELECT UNNEST(STRING_TO_ARRAY('John Doctor Test3', ' ')) AS searchNames)
SELECT
DISTINCT y.FName, y.Occupation, y.col
FROM yourtable y, sub
WHERE
sub.searchNames IN (y.FName, y.Occupation, y.col);
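If the splitting is done in the application, as suggested above, the idea can be sketched roughly as follows. This is an illustration only, using Python's sqlite3 module with an in-memory copy of the sample table; the helper name search and its columns parameter are hypothetical. Each word is passed as a bound parameter, so the search string never ends up inside the SQL text.

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (FName TEXT, Occupation TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [("John", "Engineer"), ("Carlos", "Doctor")],
)


def search(conn, search_string, columns=("FName", "Occupation")):
    """Return rows where any word of search_string equals any listed column."""
    words = search_string.split()
    if not words:
        return []
    column_list = ", ".join(columns)  # fixed identifiers, never user input
    # One "? IN (col1, col2, ...)" test per word, combined with OR.
    where = " OR ".join(f"? IN ({column_list})" for _ in words)
    sql = f"SELECT {column_list} FROM yourtable WHERE {where} ORDER BY FName"
    return conn.execute(sql, words).fetchall()


both = search(conn, "John Doctor")
one = search(conn, "John Engineer")
carlos = search(conn, "Carlos")
print(both, one, carlos)
```

Because the words are combined with OR in a single WHERE clause, each row is returned at most once and no DISTINCT is needed.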

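Use the case-insensitive regexp match operator (~*) together with ANY to find the records that contain at least one of the words in the list. Note that this matches substrings of the whole row's text representation, not exact column values.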
select *
from the_table t
where t::text ~* any(string_to_array(the_words_list, ' '));

How do I display multiple fields when using distinct count?

I am trying to count the distinct first and last name combinations that share the same email address, and I'm not sure where to go from here. Field1 and Field2 are in the same table.
My output should have the concatenated field, field 1, and field 2.
SELECT COUNT(DISTINCT(CONCAT(first_name,last_name)))
FROM `datalake.core.profile_snapshot`
WHERE classic_country = 'US' and
email.personal = 'example@provider.net'
LIMIT 1000
Appreciate any help!

SELECT
first_name
,last_name
,email_address
,count(1) as number
FROM datalake.core.profile_snapshot
GROUP BY
first_name
,last_name
,email_address
If you want to reduce the result set to a particular email address then just add a where clause to do so.
I've used email_address instead of email.personal.

In SQL, LIMIT only caps the number of rows returned; it does not filter. To filter on an aggregate you need HAVING.
Emails with 1000+ distinct names:
SELECT email
/*Put a pipe character "|" between first and last name so that different names cannot concatenate to the same value,
such as Jane Doe and Jan Edoe. Not a realistic example, but without a separator concatenation could produce the same value*/
,COUNT(DISTINCT CONCAT(first_name,'|',last_name)) AS DistinctNames
FROM datalake.core.profile_snapshot
WHERE classic_country = 'US'
AND email.personal = 'example@provider.net' /*Can comment this out if you want to see all emails with 1000+ distinct names*/
GROUP BY email
/*HAVING clause = WHERE clause for aggregates*/
HAVING COUNT(DISTINCT CONCAT(first_name,'|',last_name)) > 1000 /*more than 1000 distinct names for each email*/
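The effect of the separator, and of HAVING as a filter on the aggregate, can be seen in a small sketch. It assumes Python's sqlite3 module and a made-up table profile with a handful of hypothetical rows; SQLite's || operator stands in for CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile (email TEXT, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO profile VALUES (?, ?, ?)",
    [
        ("a@example.com", "Jane", "Doe"),
        ("a@example.com", "Jan", "eDoe"),   # concatenates to the same "JaneDoe"
        ("a@example.com", "Jane", "Doe"),   # true duplicate
        ("b@example.com", "Mark", "Morrow"),
    ],
)

# Count distinct names per email with and without a separator, and keep
# only emails that have more than one distinct name.
query = """
    SELECT email,
           COUNT(DISTINCT first_name || last_name)        AS no_separator,
           COUNT(DISTINCT first_name || '|' || last_name) AS with_separator
    FROM profile
    GROUP BY email
    HAVING COUNT(DISTINCT first_name || '|' || last_name) > 1
    ORDER BY email
"""
result = conn.execute(query).fetchall()
print(result)
```

Without the separator the two different names collapse into one value, so the count comes out too low. The HAVING clause removes the second email, which has only one distinct name.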

PostgreSQL Return Row if Value Exists in One of Several Columns

Ok, I am stuck on this one.
I have a PostgreSQL table customers that looks like this:
id firm1 firm2 firm3 firm4 firm5 lastname firstname
1 13 8 2 0 0 Smith John
2 3 2 0 0 0 Doe Jane
Each row corresponds to a client/customer. Each client/customer can be associated with one or multiple firms; the numeric value under each firm# columns corresponds to the firm id in a different table.
So I am looking for a way of returning all rows of customers that are associated with a specific firm.
For example, SELECT id, lastname, firstname where 8 exists in firm1, firm2, firm3, firm4, firm5 would just return the John Smith row as he is associated with firm 8 under the firm2 column.
Any ideas on how to accomplish that?

You can use the IN operator for that:
SELECT *
FROM customers
WHERE 8 IN (firm1, firm2, firm3, firm4, firm5);
But it would be much better in the long run if you normalized your data model.
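What a normalized model could look like is sketched below, under assumptions: a link table, here called customer_firms (a hypothetical name), holds one row per customer and firm instead of the five firmN columns. The sketch uses Python's sqlite3 module with in-memory tables rather than PostgreSQL, and loads the two sample customers from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        lastname TEXT,
        firstname TEXT
    );
    -- One row per (customer, firm) pair replaces firm1..firm5.
    CREATE TABLE customer_firms (
        customer_id INTEGER REFERENCES customers(id),
        firm_id INTEGER,
        PRIMARY KEY (customer_id, firm_id)
    );
    INSERT INTO customers VALUES (1, 'Smith', 'John'), (2, 'Doe', 'Jane');
    INSERT INTO customer_firms VALUES (1, 13), (1, 8), (1, 2), (2, 3), (2, 2);
""")

# All customers associated with a given firm, via a plain join.
query = """
    SELECT c.id, c.lastname, c.firstname
    FROM customers c
    JOIN customer_firms cf ON cf.customer_id = c.id
    WHERE cf.firm_id = ?
    ORDER BY c.id
"""
firm_8 = conn.execute(query, (8,)).fetchall()
firm_2 = conn.execute(query, (2,)).fetchall()
print(firm_8, firm_2)
```

With this layout a customer can belong to any number of firms, and the lookup no longer needs to list every firm column.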

You should consider normalizing your tables. With the current schema you have to join the firms table once for each firm column in your customers table.
select *
from customers c
left join firms f1
on f1.firm_id = c.firm1
left join firms f2
on f2.firm_id = c.firm2
left join firms f3
on f3.firm_id = c.firm3
left join firms f4
on f4.firm_id = c.firm4
left join firms f5
on f5.firm_id = c.firm5

You can "unpivot" using a combination of array and unnest, as specified in this answer: unpivot and PostgreSQL.
In your case, I think this should work:
select lastname,
firstname,
unnest(array[firm1, firm2, firm3, firm4, firm5]) as firm_id
from customers
Now you can select from this result (using either a WITH statement or an inner query) where firm_id is the value you care about.

How to count a number of substring in a row via SQL?

I am working on a database representing a simple address book, using MS Studio 2015 (C#) and MS SQL Server 2008. I have successfully added 'insert row' and 'remove row' methods to my code. Now I want to write a query (a stored procedure) that counts the occurrences of a substring in every row.
For example, I have the database which includes a table called Contacts:
PersonID Name Surname City Phone
1 Alice Karlsson Gotheburg 69-58-12
2 Mark Morrow Stockholm 48-48-48
3 Katherine Karlsson Gotheburg 69-58-16
If I search for 'th' in the table and count its occurrences, I want to get the following result:
PersonID Name Surname City Phone Count
3 Katherine Karlsson Gotheburg 69-58-16 2
1 Alice Karlsson Gotheburg 69-58-12 1
I don't know how to do that. I've been googling all day but haven't found a satisfying result. Here on stackoverflow.com I found a solution that returns the following result:
ColumnName ColumnValue
Contacts.City Gotheburg
Contacts.Name Katherine
Contacts.City Gotheburg
Please give me any idea for composing a query that returns the expected result.
Full-text search is the expected result.
UPD: 'th' is a substring I'm looking for in a row, so it should count 'Agathe', 'th' and 'youth' the same way.

Try the following:
Select
PersonId,
Name,
Surname,
City,
Phone,
sum(count) as count
From
(
select
*,
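-- Replace with '' and divide by LEN('th'): LEN ignores trailing spaces, so replacing with ' ' miscounts a match at the end of a value
((Len(name) - LEN(REPLACE(name, 'th', ''))) / LEN('th') +
(Len(surname) - LEN(REPLACE(surname, 'th', ''))) / LEN('th') +
(Len(city) - LEN(REPLACE(city, 'th', ''))) / LEN('th')) as count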
from Contacts
where name like '%th%' or surname like '%th%' or city like '%th%'
)T
Group by PersonId, Name, Surname, City, Phone
Order by 6 desc
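The counting trick used here is: remove every occurrence of the substring, see how much shorter the value became, and divide by the substring's length. A minimal sketch of that idea, run against the question's sample data, is shown below. It assumes Python's sqlite3 module instead of SQL Server, so LENGTH stands in for LEN, and the column alias cnt is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Contacts "
    "(PersonID INTEGER, Name TEXT, Surname TEXT, City TEXT, Phone TEXT)"
)
conn.executemany(
    "INSERT INTO Contacts VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alice", "Karlsson", "Gotheburg", "69-58-12"),
        (2, "Mark", "Morrow", "Stockholm", "48-48-48"),
        (3, "Katherine", "Karlsson", "Gotheburg", "69-58-16"),
    ],
)

needle = "th"
# (length before - length after removing the needle) / needle length
# gives the number of occurrences in one column.
term = "(LENGTH({c}) - LENGTH(REPLACE({c}, :n, ''))) / LENGTH(:n)"
total = " + ".join(term.format(c=col) for col in ("Name", "Surname", "City"))

query = f"""
    SELECT PersonID, Name, Surname, City, Phone, {total} AS cnt
    FROM Contacts
    WHERE {total} > 0
    ORDER BY cnt DESC, PersonID
"""
result = conn.execute(query, {"n": needle}).fetchall()
for row in result:
    print(row)
```

SQLite's REPLACE is case-sensitive, whereas in SQL Server the behaviour depends on the column's collation, so counts for mixed-case data may differ between the two.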

What you are trying to achieve here is full-text search.
Please follow this link:
http://blog.sqlauthority.com/2008/09/05/sql-server-creating-full-text-catalog-and-index/
Create a full-text index and use this script:
select * from yourtable
where freetext (*,'your_search_item')

Try this way
select * from Contacts where Contacts.City like '%th%' or
Contacts.Name like '%th%'

You need to create a table-valued function that loops over all rows, column by column, and looks for the substring while keeping a counter.
Inside the loop you can use built-in functions that help with searching text, such as CHARINDEX('th',Name+Surname+City,0),
which gives the exact position of the substring within the text.

sql query order by parts of

Let's say you have a table with 10,000 records of different email addresses, and within this table there are a few hundred (this can vary and should not matter) addresses that contain a specific domain name, e.g. @horses.com.
I would like to retrieve all 10,000 records in one single query, but with the ones that contain @horses.com always at the top of the list.
Something like this: "SELECT TOP 10000 * FROM dbo.Emails ORDER BY -- the records that contain @horses.com come first"
OR
Give me 10,000 records from the table dbo.Emails, but make sure every one that contains "@horses.com" comes first, no matter how many there are.
BTW, this is on a SQL Server 2012 server.
Anyone?

Try this:
SELECT TOP 10000 *
FROM dbo.Emails
ORDER BY IIF(Email LIKE '%@horses.com', 0, 1)
This assumes the email ends in '@horses.com', which isn't unreasonable. If you really want a contains-like match, add another % after the .com.
Edit: The IIF function is only available in SQL Server 2012 and later; for a more portable solution use CASE WHEN Email LIKE '%@horses.com' THEN 0 ELSE 1 END.
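The portable CASE form can be tried out with a small sketch. It assumes Python's sqlite3 module and an in-memory Emails table with made-up addresses; SQLite has no TOP, so LIMIT is used instead, and a second sort key is added here only to make the order within each group predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emails (Email TEXT)")
conn.executemany(
    "INSERT INTO Emails VALUES (?)",
    [
        ("amy@example.com",),
        ("zed@horses.com",),
        ("bob@example.org",),
        ("cat@horses.com",),
    ],
)

# Rows matching the domain get sort key 0, all others 1, so the
# matching rows come first regardless of how many there are.
query = """
    SELECT Email
    FROM Emails
    ORDER BY CASE WHEN Email LIKE '%@horses.com' THEN 0 ELSE 1 END, Email
    LIMIT 10000
"""
ordered = [row[0] for row in conn.execute(query)]
print(ordered)
```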

SELECT TOP 10000 *
FROM dbo.Emails
ORDER BY case when charindex('@horses.com', email) > 0
then 1
else 2
end,
email

SELECT 1,* FROM dbo.Emails where namn like '%@horses.com%'
union
SELECT 2,* FROM dbo.Emails where namn not like '%@horses.com%'
order by 1
