I have a number of tables, such as 'App_build' and 'Server_build', each with a column called 'buildid' that contains a large number of records, e.g.:
buildid
-----------
Application1_BLD_01
Application1_BLD_02
Application1_BLD_03
Application2_BLD_01
Application3_BLD_01
Application3_BLD_02
Application4_1_0_0_1 - old format to be disregarded
Application4_1_0_0_2
Application4_BLD_03
I want to write a function called getmax(tablename), e.g. getmax('App_build'),
which will return a recordset that lists only the highest value per application, e.g.:
buildid
--------
Application1_BLD_03
Application2_BLD_01
Application3_BLD_02
Application4_BLD_03
I am new to SQL so am not sure how to start - I guess I can use a split command and then the MAX function but I have no idea where to start.
Any help will be great.
Assuming current version PostgreSQL 9.2 for lack of information.
Plain SQL
The simple query could look like this:
SELECT max(buildid)
FROM app_build
WHERE buildid !~ '\d+_\d+_\d+_\d+$' -- to exclude old format
GROUP BY substring(buildid, '^[^_]+')
ORDER BY substring(buildid, '^[^_]+');
The WHERE condition uses a regular expression:
buildid !~ '\d+_\d+_\d+_\d+$'
It excludes every buildid that ends in four integers separated by _.
\d .. character class shorthand for digits. Only one backslash \ in modern PostgreSQL with standard_conforming_strings = ON.
+ .. 1 or more of preceding atom.
$ .. As last character: anchored to the end of the string.
There may be a cheaper or more accurate way, but you did not specify the format precisely.
GROUP BY and ORDER BY extract the string before the first occurrence of _ with substring() and use it as the app name to group and order by. The regexp explained:
^ .. As first character: anchor search expression to start of string.
[^_] .. Character class: any character that is not _.
This does the same as split_part(buildid, '_', 1), and split_part() may be faster.
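To see what the two regular expressions do without a database at hand, here is a minimal Python sketch of the same filter-and-group logic. The helper name latest_builds and the build_ids list are made up for illustration (the data is the sample from the question), and it assumes Python's re module treats these simple patterns the same way PostgreSQL does.

```python
import re

OLD_FORMAT = re.compile(r"\d+_\d+_\d+_\d+$")  # ends in four integers separated by _
APP_NAME = re.compile(r"^[^_]+")              # everything before the first _

def latest_builds(build_ids):
    """Return the highest buildid per application, skipping old-format ids."""
    best = {}
    for build_id in build_ids:
        if OLD_FORMAT.search(build_id):
            continue  # same effect as: WHERE buildid !~ '\d+_\d+_\d+_\d+$'
        match = APP_NAME.match(build_id)
        if match is None:
            continue  # id starts with _, so there is no app name to group by
        app = match.group(0)
        # plain string comparison, like max(buildid) on a text column
        if app not in best or build_id > best[app]:
            best[app] = build_id
    return [best[app] for app in sorted(best)]

build_ids = [
    "Application1_BLD_01", "Application1_BLD_02", "Application1_BLD_03",
    "Application2_BLD_01", "Application3_BLD_01", "Application3_BLD_02",
    "Application4_1_0_0_1", "Application4_1_0_0_2", "Application4_BLD_03",
]
print(latest_builds(build_ids))
```

Note that the maximum is taken on the whole string, so this only orders builds correctly while the build numbers are zero-padded to the same width.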
Function
If you want to write a function where the table name is variable, you need dynamic SQL. That is a plpgsql function with EXECUTE:
CREATE OR REPLACE FUNCTION getmax(_tbl regclass)
RETURNS SETOF text AS
$func$
BEGIN
RETURN QUERY
EXECUTE format($$
SELECT max(buildid)
FROM %s
WHERE buildid !~ '\d+_\d+_\d+_\d+$'
GROUP BY substring(buildid, '^[^_]+')
ORDER BY substring(buildid, '^[^_]+')$$, _tbl);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM getmax('app_build');
Or if you are, in fact, using mixed case identifiers:
SELECT * FROM getmax('"App_build"');
->SQLfiddle demo.
More info on the object identifier class regclass in this related question:
Table name as a PostgreSQL function parameter
What you want is a groupwise maximum. It can be done with MAX(), but the usual way is a LEFT JOIN of the table against itself:
SELECT b1.buildid
FROM builds AS b1
LEFT JOIN builds AS b2 ON
split_part(b1.buildid, '_', 1)=split_part(b2.buildid, '_', 1)
AND
split_part(b1.buildid, '_', 3)::int<split_part(b2.buildid, '_', 3)::int
WHERE b2.buildid IS NULL;
But since you're using PostgreSQL, it can also be done with DISTINCT ON ():
SELECT DISTINCT ON (split_part(buildid, '_', 1)) buildid
FROM builds
ORDER BY split_part(buildid, '_', 1),split_part(buildid, '_', 3)::int DESC
http://sqlfiddle.com/#!12/308bf/9
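Both queries compare the third field as an integer, which matters once build numbers are not zero-padded. A rough Python sketch of that idea, with a hypothetical split_part helper that imitates the PostgreSQL function and made-up sample ids:

```python
def split_part(text, delimiter, n):
    """Rough analogue of PostgreSQL split_part(): 1-based field, '' if missing."""
    parts = text.split(delimiter)
    return parts[n - 1] if 0 < n <= len(parts) else ""

def groupwise_max(build_ids):
    """Keep, per application, the id whose third field is numerically highest."""
    best = {}
    for build_id in build_ids:
        app = split_part(build_id, "_", 1)
        # int() raises on a missing or non-numeric field, much like ::int would
        number = int(split_part(build_id, "_", 3))
        if app not in best or number > best[app][0]:
            best[app] = (number, build_id)
    return [best[app][1] for app in sorted(best)]

# Made-up ids without zero padding, to show the difference
builds = ["App_BLD_9", "App_BLD_10", "Other_BLD_02", "Other_BLD_01"]
print(groupwise_max(builds))             # numeric comparison picks App_BLD_10
print(max(b for b in builds if b.startswith("App_")))  # string comparison picks App_BLD_9
```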
Related
I have a SQL table where usernames appear in different cases, for example "ACCOUNTS\Ninja.Developer" or "ACCOUNTS\ninja.developer".
I want to find how many records have a username in which the first letter of both the first and the last name is capitalized. How can I use a regex to find the total?
x table
User
"ACCOUNTS\James.McAvoy"
"ACCOUNTS\michael.fassbender"
"ACCOUNTS\nicholas.hoult"
"ACCOUNTS\Oscar.Isaac"
Do you want something like this?
select count(*)
from t
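where name rlike 'ACCOUNTS\\\\[A-Z][a-z0-9]*[.][A-Z][a-z0-9]*' -- in MySQL, \\\\ in the literal matches one backslash; the comparison must also be case-sensitive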
Of course, different databases implement regular expressions differently, so the actual comparator may not be rlike.
In SQL Server, you can do:
select count(*)
from t
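where name collate Latin1_General_BIN like 'ACCOUNTS\[A-Z]%[.][A-Z]%';
LIKE has no quantifiers, so % stands in for the rest of each name. The comparison has to be case-sensitive; a binary collation is the safest choice, because under a dictionary-order collation the range [A-Z] can also match lowercase letters.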
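For checking the pattern itself outside the database, here is a small Python sketch over the four sample usernames. The pattern is my own rendering of the idea (backslash escaped, first letter of each name part uppercase), not the exact syntax of any particular SQL dialect.

```python
import re

# \\ matches the literal backslash; [A-Z] is case-sensitive in Python by default
CAPITALIZED = re.compile(r"^ACCOUNTS\\[A-Z][^.]*\.[A-Z]")

users = [
    r"ACCOUNTS\James.McAvoy",
    r"ACCOUNTS\michael.fassbender",
    r"ACCOUNTS\nicholas.hoult",
    r"ACCOUNTS\Oscar.Isaac",
]

count = sum(1 for user in users if CAPITALIZED.match(user))
print(count)
```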
In most cases string comparison in MS SQL Server uses a case-insensitive collation, so we need a trick. Here is an example:
declare @accts table(acct varchar(100))
--sample data
insert @accts values
('ACCOUNTS\James.McAvoy'),
('ACCOUNTS\michael.fassbender'),
('ACCOUNTS\nicholas.hoult'),
('ACCOUNTS\Oscar.Isaac')
;with accts as (
select
--strip the prefix, then split at the dot
left(replace(acct,'ACCOUNTS\',''),charindex('.',replace(acct,'ACCOUNTS\',''))-1) frst,
--the last name is everything after the dot: total length minus the dot position
right(replace(acct,'ACCOUNTS\',''),len(replace(acct,'ACCOUNTS\',''))-charindex('.',replace(acct,'ACCOUNTS\',''))) last
from @accts
)
,groups as (--add comparison columns
select frst, last,
case when CAST(frst as varbinary(max)) = CAST(lower(frst) as varbinary(max)) then 'lower' else 'Upper' end frstCase, --binary compare circumvents the case-insensitive collation
case when CAST(last as varbinary(max)) = CAST(lower(last) as varbinary(max)) then 'lower' else 'Upper' end lastCase
from accts
)
--and gather fruit
select frstCase, lastCase, count(frst) cnt
from groups
group by frstCase,lastCase
Your question is a little vague but;
You might be looking for the DISTINCT command.
REF
I don't think you need regex.
Maybe do something like:
Get distinct names from Table X as Table A
Use inputs table A as where clause on Table X
count
union
I hope this helps,
Rhys
Given your example set you can use a combination of techniques. First if the user name always begins with "ACCOUNTS\" then you can use substr to select the characters that start after the "\" character.
For the first name:
Then you can use a regex function to see if it matches against [A-Z] or [a-z] assuming your username must start with an alpha character.
For the last name:
Use the instr function on the substr and search for the character '.' and again apply the regex function to match against [A-Z] or [a-z] to see if the last name starts with an upper or a lower character.
To total:
Select all matches where both first and last match against upper and do a count. Repeat for the lower matches and you'll have both totals.
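The steps above (strip the prefix, split at the dot, test the first letter of each part, then count) can be sketched in Python as follows. The function name name_case is hypothetical, and string slicing and partition() stand in for substr and instr.

```python
from collections import Counter

PREFIX = "ACCOUNTS\\"

def name_case(username):
    """Classify the first letter of the first and last name as upper or lower."""
    # substr step: drop the fixed prefix if it is present
    rest = username[len(PREFIX):] if username.startswith(PREFIX) else username
    # instr step: split at the first dot
    first, _, last = rest.partition(".")

    def kind(name):
        return "upper" if name[:1].isupper() else "lower"

    return kind(first), kind(last)

users = [
    "ACCOUNTS\\James.McAvoy",
    "ACCOUNTS\\michael.fassbender",
    "ACCOUNTS\\nicholas.hoult",
    "ACCOUNTS\\Oscar.Isaac",
]

totals = Counter(name_case(user) for user in users)
print(totals)
```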
I have a colon-delimited column with values like 1:2:3: and I want to turn this into 1,2,3. My query looks like this:
select name
from status where id IN (SELECT REPLACE(NEXT_LIST,':',',')
FROM status);
but I got an error
ORA-01722: invalid number
(1, 2, 3, 4) is different from ('1, 2, 3, 4'). IN requires the former, a list of values; you give it the latter, a string.
You have two options mainly:
Build the query dynamically, i.e. get the list first, then use this to build a query string.
Tokenize the string. This can be done with a custom pipelined function or a recursive query, maybe also via some XML functions. Google "Oracle tokenize string" to find a method that suits you.
UPDATE Option #3: Use LIKE as in ':1:2:3:4:' like '%:3:%'
(This requires your next_list to contain only simple numbers separated with colons. No leading zeros, no blanks, no other characters.)
select name
from status
where (select ':' || next_list || ':' from status) like '%:' || id || ':%'
I agree with Thorsten, but I wonder whether replacing one more time would work. I mean something like this:
select name
from status where id IN (SELECT replace(REPLACE(NEXT_LIST,':',','),'''','')
FROM status);
The REPLACE function returns a string, so the nested query returns a list of string values (with colons replaced by commas), not a list of numbers. When the Oracle engine evaluates id IN (str_value), it tries to cast str_value to a number and raises ORA-01722: invalid number, because values like '1,2,3' cannot be parsed as a single number.
The "pure sql" approach leads us to using custom function detecting if a number is in a colon-separated list:
-- you need Oracle 12c to use function in the WITH clause
-- on earlier versions just unwrap CASE statement and put it into query
WITH
FUNCTION in_list(p_id NUMBER, p_list VARCHAR2) RETURN NUMBER DETERMINISTIC IS
BEGIN
RETURN CASE WHEN
instr(':' || p_list || ':', ':' || p_id || ':') > 0
THEN 1 ELSE 0 END;
END;
SELECT *
FROM status
WHERE in_list(id, next_list) = 1;
Here I assume that the values in the next_list column are strings of numbers separated by colons, without spaces. In the general case you would need to modify the function to match your specific list format.
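The reason for wrapping both the list and the id in colons is easier to see in a few lines of Python. This is only a sketch of the same containment test, using the same name in_list for readability:

```python
def in_list(item_id, colon_list):
    """True if item_id appears as a whole element of a colon-separated list."""
    # Delimiters on both sides prevent 3 from matching inside 13 or 30
    return f":{item_id}:" in f":{colon_list}:"

print(in_list(3, "1:2:3:4"))   # whole element, found
print(in_list(3, "13:23:30"))  # only a substring of other elements, not found
print("3" in "13:23:30")       # the naive substring test gives a false positive
```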
I run the following query:
select * from my_temp_table
And get this output:
PNRP1-109/RT
PNRP1-200-16
PNRP1-209/PG
013555366-IT
How can I alter my query to strip the last two characters from each value?
Use the SUBSTR() function.
SELECT SUBSTR(my_column, 1, LENGTH(my_column) - 2) FROM my_table;
Another way using a regular expression:
select regexp_replace('PNRP1-109/RT', '^(.*).{2}$', '\1') from dual;
This replaces your string with group 1 from the regular expression, where group 1 (inside of the parens) includes the set of characters after the beginning of the line, not including the 2 characters just before the end of the line.
While not as simple for your example, arguably more powerful.
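For comparison, both approaches look like this in a Python sketch over the sample values from the question: slicing plays the role of SUBSTR, and re.sub the role of regexp_replace with the same pattern.

```python
import re

values = ["PNRP1-109/RT", "PNRP1-200-16", "PNRP1-209/PG", "013555366-IT"]

# Slicing: keep everything except the last two characters
sliced = [value[:-2] for value in values]

# Regex: group 1 captures everything before the final two characters
via_regex = [re.sub(r"^(.*).{2}$", r"\1", value) for value in values]

print(sliced)
print(via_regex)
```

In this sketch the two differ for values shorter than two characters: the slice returns an empty string, while the regex does not match and leaves the value unchanged.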
I use Postgres 8.1. My sub function returns a string in which several IDs are concatenated. I need to split those IDs and use them in the WHERE clause of the main SELECT query.
For example, the sub function:
subFunction( 'item_id' character varying )RETURNS character varying AS
-- implementation of sub function---
return concatenatedString;
The concatenatedString looks like this: 23|32|25|234.
And in my main query
SELECT * FROM tableName WHERE id IN (--need to get ids returning from sub function splitting the string--) .
Is there any way to split the string returned by the sub function and put the result into the IN clause?
Another approach to solve this : Split PostgreSQL Query filtering
As @JustBob says in his comments, your best bet would be to upgrade to a more recent PostgreSQL version, after which you could use regexp_split_to_table, or regexp_split_to_array combined with array operators, e.g.:
SELECT * FROM tablename WHERE id = ANY (regexp_split_to_array(subFunction(...), '\|')::int[]); -- cast assumes id is an integer column
However, luckily for you, your string resembles an alternation regex, so you might just be able to get away with this:
SELECT * FROM tablename WHERE id::text ~ ('^(' || subFunction(...) || ')$'); -- parentheses needed: ~ and || have the same precedence
This will do a regular expression match against a regex looking like this
^(23|32|25|234)$
which will return true if your id value is in the list.
That should work even in 8.1.
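A quick way to convince yourself that the anchors matter is to try the same pattern in Python. This is a sketch with a hypothetical helper id_in_list, assuming the string contains only digits and | characters:

```python
import re

def id_in_list(id_value, pipe_list):
    """Treat '23|32|25|234' as a regex alternation anchored at both ends."""
    pattern = "^(" + pipe_list + ")$"
    return re.search(pattern, str(id_value)) is not None

ids = "23|32|25|234"
print(id_in_list(23, ids))    # exact element
print(id_in_list(3, ids))     # only part of an element, rejected by the anchors
print(id_in_list(2345, ids))  # longer than any element, rejected as well
```

Without ^ and $ the last two calls would also match, because 3 and 234 occur inside other values.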
I have a column that has values stored in the following format:
name#URL
All data is stored with this and a second # is never present.
I've got the following statement that strips the URL from this column:
SELECT SUBSTRING ( wf_name ,PATINDEX ( '%#%' , wf_name )+1 , LEN(wf_name)-(PATINDEX ( '%#%' , wf_name )) )
However, I also want to take the name (everything left of the #). Unfortunately I don't understand the functions I'm using above (having read the documentation, I'm still confused). Could somebody please help me understand the flow and how I can adjust this to get everything left of the #?
Have a look at the following example
; WITH Table1 AS (
SELECT 'TADA#TEST' AS NameURL
)
SELECT *,
LEFT(NameURL,PATINDEX('%#%',NameURL) - 1) LeftText,
RIGHT(NameURL,LEN(NameURL) - PATINDEX('%#%',NameURL)) RightText
FROM Table1
SQL Fiddle DEMO
Using functions
PATINDEX (Transact-SQL)
Returns the starting position of the first occurrence of a pattern in
a specified expression, or zeros if the pattern is not found, on all
valid text and character data types.
LEFT (Transact-SQL)
Returns the left part of a character string with the specified number
of characters.
RIGHT (Transact-SQL)
Returns the right part of a character string with the specified number
of characters.
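To follow the arithmetic: if the # sits at 1-based position p, the name is the first p - 1 characters and the URL is the last LEN - p characters. A minimal Python sketch of that flow, where split_name_url and the sample value are made up for illustration:

```python
def split_name_url(value):
    """Split 'name#URL' at the first #, mimicking PATINDEX, LEFT and RIGHT."""
    position = value.find("#") + 1  # 1-based like PATINDEX, 0 when not found
    if position == 0:
        return value, ""  # no separator: treat the whole value as the name
    name = value[:position - 1]                       # LEFT(value, position - 1)
    url = value[len(value) - (len(value) - position):]  # RIGHT(value, LEN - position)
    return name, url

# Hypothetical sample value; any name#URL string works the same way
print(split_name_url("Widget#http://example.com/w"))
```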