Related
I have a table that includes names and allows for a "nickname" for each name in parenthesis.
PersonName
John (Johnny) Hendricks
Zekeraya (Zeke) Smith
Ajamain Sterling (Aljo)
Beth ()) Jackson
I need to extract the Nickname, and return a column of nicknames and a column of full names (Full string without the nickname portion in parenthesis). I also need a condition for the nickname to be null if no nickname exists, and so that the nickname only returns letters. So far I have been able to figure out how to get the nickname out using Substring, but I can't figure out how to create a separate column for just the name.
Select SUBSTRING(PersonName, CHARINDEX('(', PersonName) +1,(((LEN(PersonName))-CHARINDEX(')',REVERSE(PersonName)))-CHARINDEX('(',PersonName)))
as NickName
from dbo.Person
Any help would be appreciated. I'm using MS SQL Server 2019. I'm pretty new at this, as you can tell.
Using your existing substring, one simple way is to use apply.
Assuming your last row is an example of a nickname that should be NULL, you can use an inline if to check its length - presumably a nickname must be longer than 1 character? Adjust this logic as required.
select PersonName, Iif(Len(nn)<2,null,nn) NickName, Trim(Replace(Replace(personName, Concat('(',nn,')') ,''),' ','')) FullName
from Person
cross apply (values(SUBSTRING(PersonName, CHARINDEX('(', PersonName) +1,(((LEN(PersonName))-CHARINDEX(')',REVERSE(PersonName)))-CHARINDEX('(',PersonName))) ))c(nn)
The following code will deal correctly with missing parenthesis or empty strings.
Note how the first CROSS APPLY feeds into the next
SELECT
PersonName,
NULLIF(NickName, ''),
FullName = ISNULL(REPLACE(personName, ' (' + NickName + ')', ''), PersonName)
FROM t
CROSS APPLY (VALUES(
NULLIF(CHARINDEX('(', PersonName), 0))
) v1(opening)
CROSS APPLY (VALUES(
SUBSTRING(
PersonName,
v1.opening + 1,
NULLIF(CHARINDEX(')', PersonName, v1.opening), 0) - v1.opening - 1
)
)) v2(NickName);
db<>fiddle
Assume there's a table projects containing project name, location, team id, start and end years. How can I concatenate rows so that the same names would combine the other information into one string?
name location team_id start end
Library Atlanta 2389 2015 2017
Library Georgetown 9920 2003 2007
Museum Auckland 3092 2005 2007
Expected output would look like this:
name Records
Library Atlanta, 2389, 2015-2017
Georgetown, 9920, 2003-2007
Museum Auckland, 3092, 2005-2007
Each line should contain end-of-line / new line character.
I have a function for this, but I don't think it would work with just using CONCAT. What are other ways this can be done? What I tried:
CREATE OR REPLACE TYPE projects (name TEXT, records TEXT);
CREATE OR REPLACE FUNCTION records (INT)
RETURNS SETOF projects AS
$$
RETURN QUERY
SELECT p.name
CONCAT(p.location, ', ', p.team_id, ', ', p.start, '-', p.end, CHAR(10))
FROM projects($1) p;
$$
LANGUAGE PLpgSQL;
I tried using CHAR(10) for new line, but its giving a syntax error (not sure why?).
The above sample concatenate the string but expectedly leaving out duplicated names.
You do not need PL/pgSQL for that.
First eliminate duplicate names using DISTINCT and then in a subquery you can concat the columns into a single string. After that use array_agg to create an array out of it. It will then "merge" multiple arrays, in case the subquery returns more than one row. Finally, get rid of the commas and curly braces using array_to_string. Instead of using the char value of a newline, you can simply use E'\n' (E stands for escape):
WITH j (name,location,team_id,start,end_) AS (
VALUES ('Library','Atlanta',2389,2015,2017),
('Library','Georgetown',9920,2003,2007),
('Museum','Auckland',3092,2005,2007)
)
SELECT
DISTINCT q1.name,
array_to_string(
(SELECT array_agg(concat(location,', ',team_id,', ',start,'-', end_, E'\n'))
FROM j WHERE name = q1.name),'') AS records
FROM j q1;
name | records
---------+----------------------------
Library | Atlanta, 2389, 2015-2017
| Georgetown, 9920, 2003-2007
|
Museum | Auckland, 3092, 2005-2007
Note: try to not use reserved strings (e.g. end,name,start, etc.) to name your columns. Although PostgreSQL allows you to use them, it is considered a bad practice.
Demo: db<>fiddle
A bit simple query:
select
name,
string_agg( concat(location, ', ', team_id, ', ', start, '-', "end"), E'\n') AS records
FROM t
group by name;
PostgreSQL fiddle
I am looking for some help in separating scientific names in my data. I want to take only the genus names and group them, but they are both connected in the same column. I saw the SQL Sever had a CHARINDEX command, but PostgreSQL does not. Does there need to be a function created for this? If so, how would it look?
I want to change 'Mallotus philippensis' to just 'Mallotus' or to just 'philippensis'
I am currently using Postgres 11, 12.
Use SPLIT_PART:
WITH yourTable AS (
SELECT 'Mallotus philippensis'::text AS genus
)
SELECT
SPLIT_PART(genus, ' ', 1) AS genus,
SPLIT_PART(genus, ' ', 2) AS species
FROM yourTable;
Demo
Probably string_to_array will be slightly more efficient than split_part here because string splitting will be done only once for each row.
SELECT
val_arr[1] AS genus,
val_arr[2] AS species
FROM (
SELECT string_to_array(val, ' ') as val_arr
FROM (
VALUES
('aaa bbb'),
('cc dddd'),
('e fffff')
) t (val)
) tt;
I am writing a small query to extract all last names from a bunch of Author name database. Names on file will contain first and middle name, or just first name.
Ex: John Smith
John T. Smith
So I cannot search purely by just after the first space... But I know for sure that the lastname should be from the END to the first space from the right side of the string. I don't really care about first name.
This is what I currently have...
select [name], LEFT([name], CHARINDEX(' ', [name] + ' ')-1) as firstName,
SUBSTRING([name], charindex(' ', [name]+' ') + 1, LEN([name])) as lastName
from Author
;
I am quite new to sql, any help is highly appreciated!
Thanks in advance!
EDIT: for those who ever come across this need for help, this line helps:
select substr(t.string,instr(t.string,' ',-1)+1) last_word
For Oracle DB try this :
WITH t AS
(SELECT name AS l FROM <your query>
)
SELECT SUBSTR(l,instr(l,' ',-1)+1,LENGTH(l)) FROM t;
SUBSTRING_INDEX() should work in this case.
select name, SUBSTRING_INDEX(name,' ',-1) as lastName from Author;
I'm trying to work out if this is possible, let me give an example. Would be awesome if you could guide me in the right direction please.
Table = names
--------------------
Marks & Spencer
Marks & Spencer
marks & spencer
What I am trying to do is to return distinct values where I have converted all & signs and changed to upper case.
So my query is:
SELECT regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi') AS name FROM names GROUP BY names;
However, what I would like to do is also return one of the original values, it does not matter which one, but I only want 1 row to be returned, like
Result
----------------
name original
------------------------
MARKS&SPENCER Marks & Spencer
Is this possible? Because at the moment, what I get returned is this:
Result
----------------
name original
------------------------
MARKS&SPENCER Marks & Spencer
MARKS&SPENCER Marks & Spencer
MARKS&SPENCER marks & spencer
Thank you for reading, would really appreciate the help.
==========
EDIT
The query I am using to get the above result is:
SELECT names.name, T.result FROM names
INNER JOIN
(
SELECT DISTINCT regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi') AS result FROM names
) AS T
ON regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi')=T.result
GROUP BY T.result, names.name
ORDER BY T.result ASC
I am using PostgreSQL btw, which can do more than MySQL incase that changes things?
You need to group by the new name to get only one row and, as you don't care which original name appears, aggregate it with something like min:
SELECT min(name),regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi') AS name
FROM names
GROUP BY regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi')
There is still room for improvement:
SELECT regexp_replace(upper(name), E'&(?:AMP;)+|\\+', '&', 'g') AS name
, min(name) AS min_org_name
-- , string_agg(name) AS org_names -- if you want a list of originals
-- , array_to_string(array_agg(name), ', ') AS org_names -- for pg < 9.0+
, count(*) AS ct
FROM (
SELECT *
FROM (VALUES
('Marks & Spencer')
, ('Marks & Spencer')
, ('marks & spencer')
, ('marks & speNceR + sons')
, ('marks &AMP; speNceR & sons')
) AS names(name)
) name
GROUP BY 1;
Major points
Improve regexp:
replace &(amp;)* with identical &(amp;)+
after use of upper() on the original, the 'i' flag only slows execution. Rather upper case pattern, too: &(AMP;)+
Use non-capturing parenthesis: (?:)
As you use a escape sequence \\+, use proper syntax E''
Simplify GROUP BY with positional parameter, no need to spell it out twice
At present you're grouping by the original field (you can't group by a field in your select).
Do you want one of these?
SELECT DISTINCT
name AS original,
regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi') AS name
FROM
names
Or...
SELECT
name AS original,
regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi') AS name
FROM
names
GROUP BY
name,
regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi')
Or...
SELECT
original,
name
FROM
(
SELECT
name AS original,
regexp_replace(UPPER(name), '&(amp;)*|\\+', '&', 'gi') AS name
FROM
names
)
AS clean_data
GROUP BY
original,
name