Searching each element of array using like - sql

I would like some professional advice regarding my problem.
Our database uses an ETL process to retrieve massive amounts of data, and most of that data is aggregated.
The problem is that I need to retrieve data based on the selected stations.
For example, say I selected MNL.
Now, in the aggregated column we have arrays like:
{MNL-CEB,CEB-MNL,DVO-MNL,MNL-DVO,DVO-CEB,CEB-DVO}
Given my selected code (MNL), I should only pick the following elements from the array:
MNL-CEB,
CEB-MNL,
DVO-MNL,
MNL-DVO
I've been trying various WHERE conditions with no success. Hope you guys can help me out. Thanks!
Here's the code I've been using:
select distinct
unnest(fpd.segment_agg) segments
from daylightreport.f_dailysales_agg fpd
The data is too big to unnest; doing so makes the script take much longer to run.
Edit: I'm also using more than one station code. For instance:
WHERE fpd.segment_agg IN ('MNL','CEB') or something similar to this.
Thanks in advance!

One way that avoids unnesting is to convert the array to a string then do a regex match on that string:
select *
from daylightreport.f_dailysales_agg fpd
where array_to_string(fpd.segment_agg, ',') ~ '(^|[,-])MNL([,-]|$)';
The regex makes sure the search string only matches a complete station code (bounded by a comma, a hyphen, or the start/end of the string), not part of a longer one. If partial matches can't happen anyway, a simple LIKE would probably do as well. This also assumes that the data never contains a ,
This is not going to be fast though, but it avoids unnesting into multiple rows.
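If it helps, the boundary idea is easy to sanity-check outside the database. Here's a small Go sketch (the sample data is made up) showing that a pattern anchored to commas, hyphens, or string ends matches a whole station code but not part of a longer token:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchesStation reports whether code appears as a complete station
// code in the joined segment string, i.e. bounded by a comma, a
// hyphen, or the start/end of the string.
func matchesStation(joined, code string) bool {
	re := regexp.MustCompile(`(^|[,-])` + regexp.QuoteMeta(code) + `([,-]|$)`)
	return re.MatchString(joined)
}

func main() {
	segments := []string{"MNL-CEB", "CEB-MNL", "DVO-MNL", "MNL-DVO", "DVO-CEB", "CEB-DVO"}
	joined := strings.Join(segments, ",") // what array_to_string(..., ',') produces

	fmt.Println(matchesStation(joined, "MNL"))      // true: MNL appears as a whole code
	fmt.Println(matchesStation("XMNLY-CEB", "MNL")) // false: MNL is part of a longer token
}
```

regexp.QuoteMeta is only there so a code containing regex metacharacters can't break the pattern.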
Another option (although probably not really that much faster) is to write a function that loops through the array elements and uses the LIKE operator:
create or replace function any_element_like(p_elements text[], p_pattern text)
  returns boolean
as
$$
declare
  l_word text;
begin
  foreach l_word in array p_elements
  loop
    if l_word like p_pattern then
      return true;
    end if;
  end loop;
  return false;
end;
$$
language plpgsql;
That could be used like this:
select *
from daylightreport.f_dailysales_agg fpd
where any_element_like(fpd.segment_agg, '%MNL%');


How to filter text according to a large number of possible values of a LIKE operator in PostgreSQL?

EDIT: I changed my example a bit because it was incorrect and misleading. Here is a more correct one (I hope so).
This is a complex problem to explain, so I'll try to be as clear as I can.
I have a CASE that returns a value according to a text filter by means of the LIKE operator.
I need to generate 1 column (class_of_event) with N possible values that classify one given string in N possible categories.
This set of values searched by the LIKE operator will be used again and again in the script, and will be updated occasionally.
The script is more or less like this:
SELECT
event,
CASE
WHEN
event LIKE '%MURDER%' or
event LIKE '%KILL%' or
... --and so on with many other possible values...
event LIKE '%WAR%'
THEN 'VIOLENCE'
WHEN
event LIKE '%MARRIAGE%' or
event LIKE '%MARRIED%' or
... --and so on with many other possible values...
event LIKE '%WIFE%'
THEN 'RELATIONSHIP'
ELSE NULL
END class_of_event
FROM history_facts
I know I can use the pipe | instead of the OR operator, thus writing
CASE WHEN event LIKE '%MARRIAGE%|%MARRIED%|%WIFE%' THEN 'RELATIONSHIP' ELSE null END class_of_event
instead of the long list of OR operators.
Anyway, this could turn into a VERY LONG string, because I may want to enlarge the set of values to be looked for.
ALSO, this set of values will be used again in the (long) script, and it will be a problem if one day I have to rewrite them all consistently.
So I tried putting these values in the return value of a function:
CREATE OR REPLACE FUNCTION relationship_event()
RETURNS text AS
$$SELECT text '%MARRIAGE%|%MARRIED%|%WIFE%'$$ LANGUAGE sql IMMUTABLE PARALLEL SAFE;
and then using the following:
CASE WHEN event LIKE relationship_event() THEN 'RELATIONSHIP' ELSE null END class_of_event
This seemed a good solution because I could just define or update the function once at the beginning of the script and then use it everywhere I needed it.
The problem is that this method performs quite well in some cases and horribly in other cases.
So, is there a way to:
1) write a more concise version of event LIKE 'a' OR event LIKE 'b' OR event LIKE 'c' OR...
2) and store the strings I am looking for in some "global variable" that I can rewrite only once and re-use everywhere in the script?
Thanks everybody, this is driving me crazy.
I think I could do this easily with SAS or Python, but I can't achieve it in PostgreSQL.
I know I can use the pipe | instead of the OR operator, thus writing
No, you cannot. LIKE does not support a pipe as an "or" operator.
You can simplify the expressions using an array:
SELECT event,
       CASE
         WHEN event ilike any (array['%MURDER%','%KILL%','%WAR%'])
           THEN 'VIOLENCE'
         WHEN event ilike any (array['%MARRIAGE%','%MARRIED%','%WIFE%'])
           THEN 'RELATIONSHIP'
       END as class_of_event
FROM history_facts;
You can put this into a function:
create or replace function map_event(p_input text)
  returns text
as
$$
  select CASE
           WHEN p_input ilike any (array['%MURDER%','%KILL%','%WAR%'])
             THEN 'VIOLENCE'
           WHEN p_input ilike any (array['%MARRIAGE%','%MARRIED%','%WIFE%'])
             THEN 'RELATIONSHIP'
         END;
$$
language sql
immutable;
Then you just need to call the function, rather than having the CASE expression:
select event,
map_event(event) as class_of_event
from history_facts;

Go application making SQL Query using GROUP_CONCAT on FLOATS returns []uint8 instead of actual []float64

I have a problem using GROUP_CONCAT in a query made by my Go application.
Any idea why a GROUP_CONCAT of FLOATs would look like a []uint8 on the Go side?
Can't seem to properly convert the suckers either.
It's definitely floats; I can see it in the raw query results. But when I do the same query in Go and try to .Scan the result, Go complains that it's a []uint8, not a []float64 (which it actually is). Attempts to convert to floats give me the wrong values (and way too many of them).
For example, at the database, I query and get 2 floats for the column in question; it looks like this:
"5650.50, 5455.00"
On the Go side, however, Go sees a []uint8 instead of a []float64. Why does this happen? How does one work around this to get the actual results?
My problem is that I have to use this SQL with the GROUP_CONCAT; due to the nature of the database I am working with, this is the best way to get the information. More importantly, the query itself works great and returns the data the function needs, but now I can't read it out because of type issues. I'm no stranger to those, but Go isn't cooperating with me today.
I'd be more than pleased to learn WHY go is doing it this way, and delighted to learn of a way to deal with it.
Example:
SELECT ID, getDistance(33.1543,-110.4353, Loc.Lat, Loc.Lng) as distance,
GROUP_CONCAT(values) FROM stuff INNER JOIN device on device.ID = stuff.ID WHERE (someConditionsETC) GROUP BY ID ORDER BY ID
The actual result, when interfacing with the actual database (not within my application), is
"5650.00, 5850.50"
It's clearly 2 floats.
The same query produces a slice of uint8 when run from Go and trying to .Scan the result in. If I range through and print those values, I get way more than 2, and they are uint8s (bytes) that look like this:
53,55,56,48,46,48,48
Not sure how Go expects me to handle this.
Solution... stupid simple and not terribly obvious:
// needs "fmt", "strconv" and "strings" imported
crazyBytes := []uint8("5760.00,5750.50")
aString := string(crazyBytes)
strSlice := strings.Split(aString, ",") // string representation of our array (of floats)
var floatz []float64
for _, x := range strSlice {
	fmt.Printf("At last, Float: %s \r\n", x)
	f, err := strconv.ParseFloat(x, 64)
	if err != nil {
		fmt.Printf("Error: %s", err)
	}
	floatz = append(floatz, f)
	fmt.Printf("as float: %s \r\n", strconv.FormatFloat(f, 'f', -1, 64))
}
Yea sure, it's obvious NOW.
GROUP_CONCAT returns a string. So in Go you get a byte array of characters, not a float. The result you posted, 53,55,56,48,46,48,48, translates into the string "5780.00", which does look like one of your values. So you need to either fix your SQL to return floats, or use the strings and strconv packages in Go to parse and convert your string into floats. I think the former approach is better, but it is up to you.

Create an array from input array in PostgreSQL

I am working on creating a PostgreSQL function. I have a situation where I receive an array as input, and I want to take each element of that array, fetch some other column for it, and build another array from the results, maintaining the same order. I have tried the below, but I have some issues while executing it.
Below is an example of what I need (let us say input_array is the input array to the function):
Example:
FOREACH item IN ARRAY $1
LOOP
  tempVar := (SELECT some_column FROM some_table WHERE some_other_column = cast(item AS varchar));
  some_other_array := array_append(some_other_array, tempVar);
END LOOP;
But using the above approach I am not able to get the expected array as output; somehow the values are not as expected. And I am not able to debug what's going wrong here either, as I can't see the RAISE NOTICE output in the console :(
Any other suggestions on this are highly appreciated.
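Leaving the SQL aside for a moment, the behavior the loop is after — look up each input element and keep the results in input order — can be sketched like this (Go, purely illustrative; the map stands in for some_table and all names are made up):

```go
package main

import "fmt"

// mapPreservingOrder looks up each element of input and returns the
// results in the same order as the input. A missing key yields ""
// here; in the plpgsql version that would be a NULL appended to the
// output array.
func mapPreservingOrder(input []string, lookup map[string]string) []string {
	out := make([]string, 0, len(input))
	for _, item := range input {
		out = append(out, lookup[item]) // append preserves input order
	}
	return out
}

func main() {
	lookup := map[string]string{"a": "1", "b": "2", "c": "3"}
	fmt.Println(mapPreservingOrder([]string{"c", "a", "b"}, lookup)) // [3 1 2]
}
```

The point of the sketch is that order comes from iterating the input, never from the lookup side; that is the property the plpgsql loop needs to preserve.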

Constructing a recursive compare with SQL

This is an ugly one. I wish I wasn't having to ask this question, but the project is already built such that we are handling heavy loads of validations in the database. Essentially, I'm trying to build a function that will take two stacks of data, weave them together with an unknown batch of operations or comparators, and produce a long string.
Yes, that was phrased very poorly, so I'm going to give an example. I have a form that can have multiple iterations of itself. For some reason, the system wants to know if the entered start date on any of these forms is equal to the entered end date on any of these forms. Unfortunately, due to the way the system is designed, everything is stored as a string, so I have to format it as a date first, before I can compare. Below is pseudo code, so please don't correct me on my syntax
Input data:
'logFormValidation("to_date(#) == to_date(^)"
, formname.control1name, formname.control2name)'
Now, as I mentioned, there are multiple iterations of this form, and I need to loop through and build a full comparison. (Note: it may not always be typical boolean comparisons; it could be internally called functions as well, so IN or anything like that won't work.) In the end, I need to get it into a format like the one below so the validation parser can read it.
OR(to_date(formname.control1name.1) == to_date(formname.control2name.1)
,to_date(formname.control1name.2) == to_date(formname.control2name.1)
,to_date(formname.control1name.3) == to_date(formname.control2name.1)
,to_date(formname.control1name.1) == to_date(formname.control2name.2)
:
:
,to_date(formname.control1name.n) == to_date(formname.control2name.n))
Yeah, it's ugly...but given the way our validation parser works, I don't have much of a choice. Any input on how this might be accomplished? I'm hoping for something more efficient than a double recursive loop, but don't have any ideas beyond that
Okay, seeing as my question is apparently terribly unclear, I'm going to add some more info. I don't know what comparison I will be performing on the items; I'm just trying to reformat the data into something usable for ANY given function. If I were to do this outside the database, it'd look something like this. Note: pseudocode. '#' is the place marker in a function for vals1; '^' is the place marker for vals2.
function dynamicRecursiveValidation(string functionStr, strArray vals1, strArray vals2){
string finalFunction = "OR("
foreach(i in vals1){
foreach(j in vals2){
finalFunction += functionStr.replace('#', i).replace('^', j) + ",";
}
}
finalFunction = finalFunction.substring(0, finalFunction.length - 1); //to remove last comma
finalFunction += ")";
return finalFunction;
}
That is all I'm trying to accomplish. Take any given comparator and two arrays, and create a string that contains every possible combination. Given the substitution characters I listed above, below is a list of possible added operations
# > ^
to_date(#) == to_date(^)
someFunction(#, ^)
# * 2 - 3 <= ^ / 4
All I'm trying to do is produce the string that I will later execute, and I'm trying to do it without having to kill the server with a recursive loop.
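For what it's worth, the pseudocode above translates fairly directly into something runnable. Here's a Go sketch of the same string-building (names carried over from the pseudocode):

```go
package main

import (
	"fmt"
	"strings"
)

// dynamicRecursiveValidation substitutes every pairing of vals1 (into
// the '#' marker) and vals2 (into the '^' marker) in functionStr, then
// wraps the resulting comparisons in OR(...).
func dynamicRecursiveValidation(functionStr string, vals1, vals2 []string) string {
	parts := make([]string, 0, len(vals1)*len(vals2))
	for _, i := range vals1 {
		for _, j := range vals2 {
			s := strings.Replace(functionStr, "#", i, 1)
			s = strings.Replace(s, "^", j, 1)
			parts = append(parts, s)
		}
	}
	return "OR(" + strings.Join(parts, ",") + ")"
}

func main() {
	out := dynamicRecursiveValidation(
		"to_date(#) == to_date(^)",
		[]string{"f.c1.1", "f.c1.2"},
		[]string{"f.c2.1"},
	)
	fmt.Println(out)
	// OR(to_date(f.c1.1) == to_date(f.c2.1),to_date(f.c1.2) == to_date(f.c2.1))
}
```

Using strings.Join sidesteps the trailing-comma trim from the pseudocode. The nested loop is unavoidable since every pairing is required, but it runs once per pair, not recursively.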
I don't have solution code for this, but algorithmically you can do the following:
Create a temp table (start_date, end_date, form_id) and populate it with every date from any existing form.
Get the start_date from the form and simply:
SELECT end_date, form_id FROM temp_table WHERE end_date = <start date to check>
For the reverse
SELECT start_date, form_id FROM temp_table WHERE start_date = <end date to check>
If the database is available, why not let it do all the heavy lifting?
I ended up performing a cross product of the data, and looping through the results. It wasn't the sort of solution I really wanted, but it worked.

Search for string with conditions

I have a table with a name field, which is a string. I need to create a SQL statement that searches for the children of a node without finding the children of those children. Is it possible to use LIKE and some wildcards to accomplish this? You can see some examples below of the results I need to get based on my search string.
The search string is /home
Then the following entries should be returned:
/home/something
/home/somethingElse
but not
/home/something/foo
/home/something/bar
/home/somethingElse/foo
but if the search string is /home/something
These should be returned
/home/something/foo
/home/something/bar
SELECT name FROM table
WHERE name LIKE '/home/%' AND name NOT LIKE '/home/%/%'
should filter out anything with a second-level node under it.
I would probably search on the number of slashes in addition to the actual keywords. So the first one would be searching for /home with 1-2 slashes.
The second one would be /home/something with 2-3 slashes.
I don't have SQL up in front of me, but I'll work on some sample code for you.
Edit:
CREATE FUNCTION [dbo].[ufn_CountChar] ( @pInput VARCHAR(1000), @pSearchChar CHAR(1) )
RETURNS INT
AS
BEGIN
  RETURN (LEN(@pInput) - LEN(REPLACE(@pInput, @pSearchChar, '')))
END
GO
This little function will act nicely to count the number of slashes in your strings.
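The length-difference trick ports to pretty much any language if you ever need it outside SQL; for example, a Go equivalent (illustrative only):

```go
package main

import (
	"fmt"
	"strings"
)

// countChar mirrors LEN(input) - LEN(REPLACE(input, ch, '')):
// remove every occurrence of ch and see how much shorter the string got.
func countChar(input, ch string) int {
	return len(input) - len(strings.ReplaceAll(input, ch, ""))
}

func main() {
	fmt.Println(countChar("/home/something/foo", "/")) // 3
}
```

Go's standard library already provides strings.Count, so the helper exists only to mirror the SQL expression.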
Enjoy. Hope this helps,
Cheers,