Finding multiple occurrences in SQLite3 database? - sql

I am having trouble getting my SQL query to return all entries in a column that occur more than once.
I have looked at other StackOverflow answers on this, but whenever I apply the same query to my table it just returns entries whose first character is the same.
For example, there are around 6 entries in the database that start with a '-', and it returns all of them even though they are not identical matches.
Would someone be able to tell me where my query is going wrong?
I would have thought that looking for duplicates would be a standard procedure.
Here is the query I am using:
SELECT name FROM subs GROUP BY name HAVING (COUNT(name) > 1);
Here is a sample of the output:
-Johnny-
-Lady_Gaga-
-Randy_Marsh-
AJWesty
All_CAPS
NB: I am looking to find all the usernames in the database that occur more than once.
Thanks in advance!

You could try stripping the dashes from the name column. Here is one way to do that:
SELECT REPLACE(name, '-', '')
FROM subs
GROUP BY REPLACE(name, '-', '')
HAVING (COUNT(name) > 1);
But this approach also removes dashes that occur inside a name. To avoid this, strip a dash only when the name both starts and ends with one:
SELECT
CASE WHEN SUBSTR(name, 1, 1) = '-' AND SUBSTR(name, LENGTH(name), 1) = '-'
THEN SUBSTR(name, 2, LENGTH(name)-2)
ELSE name END AS name
FROM subs
GROUP BY
CASE WHEN SUBSTR(name, 1, 1) = '-' AND SUBSTR(name, LENGTH(name), 1) = '-'
THEN SUBSTR(name, 2, LENGTH(name)-2)
ELSE name END
HAVING COUNT(*) > 1;
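If you want to check the second query without touching the real database, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory table. The sample rows are invented for illustration, and the alias clean_name and the extra COUNT(*) column are my additions, not part of the query above.

```python
import sqlite3

# Throwaway in-memory database; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subs (name TEXT)")
conn.executemany(
    "INSERT INTO subs (name) VALUES (?)",
    [("-Johnny-",), ("Johnny",), ("-Lady_Gaga-",),
     ("AJWesty",), ("AJWesty",), ("All-CAPS",)],
)

# Strip one leading and one trailing dash only when both are present,
# then keep the normalized names that occur more than once.
QUERY = """
SELECT
  CASE WHEN SUBSTR(name, 1, 1) = '-' AND SUBSTR(name, LENGTH(name), 1) = '-'
       THEN SUBSTR(name, 2, LENGTH(name) - 2)
       ELSE name END AS clean_name,
  COUNT(*) AS occurrences
FROM subs
GROUP BY clean_name
HAVING COUNT(*) > 1
ORDER BY clean_name
"""

duplicates = conn.execute(QUERY).fetchall()
print(duplicates)
```

With these rows, "-Johnny-" and "Johnny" are grouped together, while "All-CAPS" keeps its interior dash and is not reported.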

Related

Oracle : Sort a list on second word in SQL Query?

In Oracle, how do I sort a column (for example, NAME) by its second word? It looks a bit tricky to me.
I have a list of objects that I want to sort by name. My code already sorts on the whole name, but my requirement is slightly different.
The names are, for example:
Bull AMP
Cat DEF
Dog AMP
Frog STR
Zebra DEF
I want the list to be :
Bull AMP
Dog AMP
Cat DEF
Zebra DEF
Frog STR
That is, the list should be sorted by the second word.
I tried the query below, but it didn't work.
SELECT NAME,
SUBSTR(NAME, INSTR(' ',NAME),
LENGTH(NAME) - INSTR(' ',NAME) +2) AS word2
FROM animal_master
ORDER BY SUBSTR(NAME,INSTR(' ',NAME), LENGTH(NAME) - INSTR(' ',NAME) +2) asc;
Can anyone please point out what is wrong?
The arguments to your INSTR function were back to front, which is why you were getting incorrect results. I would recommend using the following:
SELECT NAME
FROM animal_master
ORDER BY SUBSTR(NAME,INSTR(NAME, ' ')) asc;
SUBSTR(NAME,INSTR(NAME, ' ')) returns everything from the first space onward (here, the second word with a leading space), and you order by that. If you want to order by the second word and then by the full name, you can do something like this:
ORDER BY SUBSTR(NAME,INSTR(NAME, ' ')), NAME
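As a quick sanity check of that ORDER BY, here is a small sketch that runs the same expression in SQLite through Python's sqlite3 module. SQLite is only a stand-in here (an assumption on my part, since the question is about Oracle); its INSTR and SUBSTR take their arguments in the same order for this use.

```python
import sqlite3

# In-memory table holding the sample names from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animal_master (name TEXT)")
conn.executemany(
    "INSERT INTO animal_master (name) VALUES (?)",
    [("Bull AMP",), ("Cat DEF",), ("Dog AMP",), ("Frog STR",), ("Zebra DEF",)],
)

# Sort by everything from the first space onward, then by the full name.
rows = conn.execute(
    "SELECT name FROM animal_master "
    "ORDER BY SUBSTR(name, INSTR(name, ' ')), name"
).fetchall()

sorted_names = [row[0] for row in rows]
print(sorted_names)
```

This produces the order requested in the question: Bull AMP, Dog AMP, Cat DEF, Zebra DEF, Frog STR.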
Clean up all that nested instr/substr stuff with regexp_replace:
order by regexp_replace(NAME, '.* (\w)', '\1'), NAME;
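The regex matches everything up to and including the last space and keeps the first character of the last "word", so the replacement leaves only that last word; the query orders by it first, then by the full name.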
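Try the following. It takes the text between the first and second spaces; the second INSTR searches NAME || ' ' so that a second space is also found when the name has only two words (otherwise INSTR returns 0 and SUBSTR returns NULL).
SELECT NAME
FROM animal_master
ORDER BY TRIM(SUBSTR(NAME, (INSTR(NAME, ' ', 1, 1) + 1), (INSTR(NAME || ' ', ' ', 1, 2) - INSTR(NAME, ' ', 1, 1) - 1))) asc;
The query below spells out the same steps: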
SELECT NAME
FROM (
SELECT
NAME,
(INSTR(NAME, ' ', 1, 1) + 1) second_word_start,
(INSTR(NAME || ' ', ' ', 1, 2) - INSTR(NAME, ' ', 1, 1) - 1) second_word_length
FROM animal_master)
ORDER BY TRIM(SUBSTR(NAME, second_word_start, second_word_length)) asc;
select display_name from employee order by substr(upper(display_name),instr(upper(display_name),' ',1));

How to select specific text from the string in efficient way using SQL

Below is my query. It is giving me correct output but I need to run it efficiently as it is used for 500k records.
DECLARE @DESC_MESSAGE VARCHAR(5000)
SET @DESC_MESSAGE = '12345 VENKAT was entered ODC ABCD-3'
SELECT REPLACE(@DESC_MESSAGE,SUBSTRING(@DESC_MESSAGE,1,CHARINDEX('was',@DESC_MESSAGE,3)-1),'')
I just want to retrieve the text after 'was', which can vary from record to record.
For example, the text can be:
'112233 XYZ was entered ODC PQRS-3' or
'223344 HARRY was gone out of ODC AMD-3'
Please suggest an efficient way to retrieve such text.
I would be inclined to use stuff():
select stuff(col, 1, charindex('was ', col + 'was ') + 3, '')
The + 'was ' in the charindex() call just guarantees that a match is found even if 'was ' is not in the text.
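To make the intent of the stuff() expression concrete, here is a rough Python equivalent, offered only as an illustration of the logic and not as a replacement for doing this in SQL Server. The function name text_after_marker is hypothetical.

```python
def text_after_marker(message: str, marker: str = "was ") -> str:
    """Return the text following the first occurrence of marker.

    str.partition yields (before, marker, after); when the marker is
    absent, after is an empty string, so nothing is raised.
    """
    _, _, rest = message.partition(marker)
    return rest


samples = [
    "12345 VENKAT was entered ODC ABCD-3",
    "223344 HARRY was gone out of ODC AMD-3",
    "no marker here",
]
extracted = [text_after_marker(s) for s in samples]
print(extracted)
```

Like the stuff() version, this drops everything up to and including the first 'was ' and returns an empty string when the marker is missing.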
Half a million rows is not that many.
What I can see in your question is a design issue:
why do you need to split a column in order to query it?
Why don't you store the parts in separate columns in the first place?
Alternatively, you could add another column that contains only the text after the "was".
That would hold up better even if the number of rows grows a lot.
select LTRIM(stuff(@DESC_MESSAGE, 1, CHARINDEX(' was', @DESC_MESSAGE + ' was') + 3, ''))

splitting a column into individual words then filter results

I have a column where I would like to break it into individual words for an "auto suggest" on text box. I was able to accomplish this thanks to this article:
http://discourse.bigdinosaur.org/t/sql-and-database-splitting-a-column-into-individual-words/709/45
However, the results have characters that I don't want like "/' , etc.
I have a database function that I use when I want to filter out characters, but I cannot figure out how to merge the 2 and get it to work.
Here's the splitting code:
;WITH for_tally (spc_loc) AS
(SELECT 1 AS spc_loc
UNION ALL
SELECT spc_loc + 1 from for_tally WHERE spc_loc < 65
)
SELECT spc_loc into #tally from for_tally OPTION(MAXRECURSION 65)
select substring(name, spc_loc, charindex(' ', name + ' ', spc_loc) - spc_loc) as word
into #temptable
from #tally, products
where spc_loc <= convert(int, len(name))
and substring(' ' + name, spc_loc, 1) = ' '
Then, here's how I view the table:
select distinct word from #temptable order by word
Then here's how I call the database function in other queries:
SELECT * INTO #Keywords FROM dbo.SplitStringPattern(@Keywords, '[abcdefghijklmnopqrstuvwxyz0123456789-]')
Where @Keywords is the string to filter.
I've tried everything I can think of to filter the first query through the dbo.SplitStringPattern function,
e.g.:
select substring(dbo.SplitStringPattern(name, '[abcdefghijklmnopqrstuvwxyz0123456789-]'), spc_loc, charindex(' ', name + ' ', spc_loc) - spc_loc) as word
into #temptable
from #tally, products
where spc_loc <= convert(int, len(name))
and substring(' ' + name, spc_loc, 1) = ' '
But I get "Cannot find either column "dbo" or the user-defined function or aggregate "dbo.SplitStringPattern", or the name is ambiguous."
I need this to be optimized, as the query has to return results very quickly.
Or, if there's a better way of doing this, I'm open to suggestions.
Create a T-SQL Function to filter out the unwanted characters before your "database function" is called. The following web page links should help.
MSDN Create Function (Transact-SQL)
How to remove special characters from a string in MS SQL Server (T-SQL)
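For what it's worth, here is a small Python sketch of the overall idea (split each name on spaces, drop unwanted characters, keep the distinct words). It is not T-SQL and makes assumptions: the allowed character set is taken from the pattern in the question, words are lowercased first, and the function name suggestion_words and the sample names are made up.

```python
import re

# Anything outside a-z, 0-9 and '-' is treated as unwanted
# (mirrors the pattern passed to dbo.SplitStringPattern in the question).
UNWANTED = re.compile(r"[^a-z0-9-]")


def suggestion_words(names):
    """Split names on whitespace, strip unwanted characters, de-duplicate."""
    words = set()
    for name in names:
        for raw in name.split():
            cleaned = UNWANTED.sub("", raw.lower())
            if cleaned:  # skip tokens that were only punctuation
                words.add(cleaned)
    return sorted(words)


words = suggestion_words(['Men\'s "Trail" Shoe', "Trail/Road shoe-bag"])
print(words)
```

Cleaning each word after the split, rather than cleaning the whole name first, keeps the word boundaries intact.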

Sort Everything After A Specific Character in SQL

I need to sort a field on everything after a space using SQL. In the example below, I would like it to sort (ascending) beginning with the last name.
USA-J. Doe
USA-M. Mouse
USA-A. Mouse
USA-D. Duck
USA-P. Panther
USA-T. Bird
I need it to sort the entire string, but on the last name. If there are two last names that are identical, I would like for it to take the initial of the first name into account. The result would be:
USA-T. Bird
USA-J. Doe
USA-D. Duck
USA-A. Mouse
USA-M. Mouse
USA-P. Panther
I will need to use this code in both SQL Server and MS Access.
I hope that someone can fully answer this question. For whatever reason, someone has scored me a -1 on this question. I cannot figure out why. I have been as specific as I know to be and I wasn't able to find an answer to the final piece--sorting by first letter if the last name is the same.
Thank you guys for responding. The information helped. I had to add brackets around "name" because the name of the field was similar to the name of the actual table.
This depends on the database. The following is how you might do this in SQL Server:
order by substring(name, charindex(' ', name) + 1, len(name))
Similar logic works in other databases but the functions are different.
For instance, in Oracle:
order by substr(name, instr(name, ' ') + 1)
And, in MySQL, you could use similar logic, but this is simpler:
order by substring_index(name, ' ', -1)
And in MS Access:
order by mid(name, instr(name, ' ') + 1)
To sort by last name and then by first initial, use something like this in SQL Server:
order by substring(name, charindex(' ', name) + 1, len(name)), substring(name, charindex('-', name) + 1, 1)
In Access:
order by mid(name, instr(name, ' ') + 1), mid(name, instr(name, '-') + 1, 1)
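Here is a minimal sketch of the two-key sort, run in SQLite through Python's sqlite3 module purely as a stand-in (an assumption; the question targets SQL Server and MS Access, whose function names differ). The table name people is hypothetical.

```python
import sqlite3

# In-memory table with the sample values from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("USA-J. Doe",), ("USA-M. Mouse",), ("USA-A. Mouse",),
     ("USA-D. Duck",), ("USA-P. Panther",), ("USA-T. Bird",)],
)

# First key: everything after the space (the last name).
# Second key: the single character after the dash (the first initial).
rows = conn.execute(
    "SELECT name FROM people "
    "ORDER BY SUBSTR(name, INSTR(name, ' ') + 1), "
    "         SUBSTR(name, INSTR(name, '-') + 1, 1)"
).fetchall()

ordered = [row[0] for row in rows]
print(ordered)
```

The two Mouse rows tie on the first key and are separated by the initial, giving the order asked for in the question.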

Running SQL query to remove all trailing and beginning double quotes only affects first record in result set

I'm having a problem when running the below query - it seems to ONLY affect the very first record. The query removes all trailing and beginning double quotes. The first query is the one that does this; the second query is just to demonstrate that there are multiple records that have beginning double quotes that I need removed.
QUESTION: As you can see, the first record resulting from the top query is fine: its double quotes have been removed from the beginning. But all subsequent records appear to be untouched. Why?
If quotes are always assumed to exist at both the beginning and the end, adjust your CASE expression to look for rows where both conditions hold:
CASE
WHEN ([Message] LIKE '"%' AND [Message] LIKE '%"') THEN LEFT(RIGHT([Message], LEN([Message])-1), LEN([Message])-2)
ELSE [Message]
END
EDIT
If that assumption is not valid, combine the above syntax with your existing CASE logic:
CASE
WHEN ([Message] LIKE '"%' AND [Message] LIKE '%"') THEN LEFT(RIGHT([Message], LEN([Message])-1), LEN([Message])-2)
WHEN ([Message] LIKE '"%') THEN RIGHT([Message], LEN([Message])-1)
WHEN ([Message] LIKE '%"') THEN LEFT([Message], LEN([Message])-1)
ELSE [Message]
END
Because your CASE expression only evaluates the first condition that is met, it will only ever remove one of the quotes.
Try something like the following:
SELECT REPLACE(SUBSTRING(Message, 1, 1), '"', '') + SUBSTRING(Message, 2, LEN(Message) - 2) + REPLACE(SUBSTRING(Message, LEN(Message), 1), '"', '')
EDIT: As Martin Smith pointed out, my original code wouldn't work if a string was under two characters, so ...
CREATE TABLE #Message (Message VARCHAR(20))
INSERT INTO #Message (Message)
SELECT '"SomeText"'
UNION
SELECT '"SomeText'
UNION
SELECT 'SomeText"'
UNION
SELECT 'S'
SELECT
CASE
WHEN LEN(Message) >=2
THEN REPLACE(SUBSTRING(Message, 1, 1), '"', '') + SUBSTRING(Message, 2, LEN(Message) - 2) + REPLACE(SUBSTRING(Message, LEN(Message), 1), '"', '')
ELSE Message
END AS Message
FROM #Message
DROP TABLE #Message
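For comparison, the same rule (remove at most one double quote from each end and leave everything else alone) can be sketched in a few lines of Python. This only illustrates the logic and is not a drop-in for the T-SQL above; the function name strip_outer_quotes is hypothetical.

```python
def strip_outer_quotes(message: str) -> str:
    """Remove one leading and one trailing double quote, if present."""
    if message.startswith('"'):
        message = message[1:]
    if message.endswith('"'):
        message = message[:-1]
    return message


# Same test values as the temp table above.
samples = ['"SomeText"', '"SomeText', 'SomeText"', 'S']
cleaned = [strip_outer_quotes(s) for s in samples]
print(cleaned)
```

Each end is checked independently, so short strings and strings quoted on only one side are handled without a length guard.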
Try this (note that it removes every double quote, including any inside the message):
SELECT REPLACE([Message], '"', '') AS [Message] FROM SomeTable