Individual fuzzy membership values of a crisp number - fuzzy-logic

Consider a trapezoidal membership function with parameters a, b, c, d. How can I define fuzzy membership functions for four linguistic variables such as {low, medium-low, medium-large, large}, with an explicit equation for each that uses these four values? In the case of a triangular membership function with (a, b, c), I can define μ(low) as max{0, min((b-x)/(b-a), 1)}, μ(medium) as max{0, min((x-a)/(b-a), (c-x)/(c-b))}, and μ(high) as max{0, min((x-b)/(c-b), 1)}.
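One possible way to carry the triangular formulas over to four labels is sketched below in Python. It assumes the four breakpoints a < b < c < d act as the peaks of the four sets, so the two outer labels are shoulders and the two middle labels are triangles. This layout and all function names are assumptions made for illustration, not an established definition.

```python
def mu_low(x, a, b):
    # Left shoulder: 1 at or below a, falling linearly to 0 at b.
    return max(0.0, min((b - x) / (b - a), 1.0))


def mu_mid(x, left, peak, right):
    # Triangle: 0 at left, 1 at peak, 0 at right.
    return max(0.0, min((x - left) / (peak - left), (right - x) / (right - peak)))


def mu_high(x, c, d):
    # Right shoulder: 0 at or below c, rising linearly to 1 at d.
    return max(0.0, min((x - c) / (d - c), 1.0))


def memberships(x, a, b, c, d):
    # Assumed layout: each breakpoint is the peak of one label.
    return {
        "low": mu_low(x, a, b),
        "medium-low": mu_mid(x, a, b, c),
        "medium-large": mu_mid(x, b, c, d),
        "large": mu_high(x, c, d),
    }


print(memberships(15, 0, 10, 20, 30))
```

With this layout, the four membership values of any crisp x between a and d sum to 1, which may or may not be what a given application needs.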

Related

SQL, how to group rows based on field values

I have a question about grouping query results. The image shows an example: a cable list. Each cable comes with two attributes, a 'From' location and a 'To' location. Grouping the cable list by location becomes tricky.
When we group the data by the two locations, the results land in two groups:
'A->B',
'B->A'
But in reality, it makes more sense to combine these two groups into one, 'A<->B', that is, the list of cables between two locations.
I thought of adding one more field to mark the cables between two locations, but I couldn't come up with a way to do it.
thank you for sharing your thoughts.
regards,
Roland
SQL, group
You can group by the least and the greatest of the 2 columns.
Use your database's functions to do this.
The SQL standard is a CASE expression like:
GROUP BY CASE WHEN A < B THEN A ELSE B END,
CASE WHEN A < B THEN B ELSE A END
Or maybe you can use a function like IF() or IIF():
GROUP BY IF(A < B, A, B),
IF(A < B, B, A)
or functions like LEAST() and GREATEST():
GROUP BY LEAST(A, B),
GREATEST(A, B)
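The CASE variant can be tried end to end with a small sketch like the following, which uses Python's built-in sqlite3 module. The table name cables, the column names from_loc and to_loc, and the sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cables (cable_id INTEGER, from_loc TEXT, to_loc TEXT)")
conn.executemany(
    "INSERT INTO cables VALUES (?, ?, ?)",
    [(1, "A", "B"), (2, "B", "A"), (3, "A", "C"), (4, "A", "B")],
)

# Normalise each pair so the smaller location always comes first,
# then group on the normalised pair.
rows = conn.execute(
    """
    SELECT CASE WHEN from_loc < to_loc THEN from_loc ELSE to_loc END AS loc1,
           CASE WHEN from_loc < to_loc THEN to_loc ELSE from_loc END AS loc2,
           COUNT(*) AS cable_count
    FROM cables
    GROUP BY CASE WHEN from_loc < to_loc THEN from_loc ELSE to_loc END,
             CASE WHEN from_loc < to_loc THEN to_loc ELSE from_loc END
    ORDER BY loc1, loc2
    """
).fetchall()

for row in rows:
    print(row)
```

Here the 'A->B' and 'B->A' cables end up in the same ('A', 'B') group.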

Oracle SQL how to get rowmax?

I likely lack the correct vocabulary, which is why my Google searches for the following rowmax-type operation were unsuccessful: I want to create a new column that is, for each row, the maximum of two existing columns, bounded below by 0.
SELECT a,b, rowmax(a,b,0) as c
FROM ...
Use greatest():
greatest(a, b, 0) as c
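To see the row-wise behaviour without an Oracle instance, here is a minimal sketch using Python's sqlite3 module. SQLite has no GREATEST(), so its multi-argument scalar MAX() stands in for it here; the table name t and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(3, 5), (-4, -2), (7, 1)])

# With more than one argument, SQLite's MAX() is a per-row scalar function,
# playing the role that GREATEST(a, b, 0) plays in Oracle.
rows = conn.execute(
    "SELECT a, b, MAX(a, b, 0) AS c FROM t ORDER BY rowid"
).fetchall()

for row in rows:
    print(row)
```

In Oracle, GREATEST returns NULL if any argument is NULL, so columns that can contain NULL may need NVL or COALESCE first.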

how to segment groups based on different criteria

I'm trying to assign test and control groups to the table below, based on the values in columns A to F.
Eventually, I want a table that looks like the one below. If different zips have the same values in all columns, assign half of the zips to test and half to control. If the zips cannot be split evenly, give the extra zip to control.
You could use row_number() and mod():
select
t.*,
case when mod(
row_number() over(partition by A, B, C, D, E, F order by zip),
2
) = 0 then 'T' else 'C' end tc_group
from mytable t
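row_number() assigns increasing numbers to records that share the same (A, B, C, D, E, F) values, ordered by increasing zip. We then assign even row numbers to the test group T and odd row numbers to the control group C. Because numbering starts at 1, a partition with an odd number of zips gives its extra zip to control.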
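The same partition-and-alternate logic can be sketched in plain Python, which may help when checking the expected split by hand. The function name assign_groups and the sample rows are hypothetical, and rows are assumed to be dicts keyed by column name.

```python
from itertools import groupby


def assign_groups(rows, keys=("A", "B", "C", "D", "E", "F")):
    def key_fn(row):
        return tuple(row[k] for k in keys)

    # Sort by the partition key, then by zip, like
    # row_number() over (partition by A..F order by zip).
    ordered = sorted(rows, key=lambda r: (key_fn(r), r["zip"]))
    result = {}
    for _, members in groupby(ordered, key=key_fn):
        for n, row in enumerate(members, start=1):
            # Even row numbers go to test, odd ones to control.
            result[row["zip"]] = "T" if n % 2 == 0 else "C"
    return result


same = dict(A=1, B=1, C=1, D=1, E=1, F=1)
other = dict(A=2, B=2, C=2, D=2, E=2, F=2)
rows = [
    {"zip": 10003, **same},
    {"zip": 10001, **same},
    {"zip": 10002, **same},
    {"zip": 20002, **other},
    {"zip": 20001, **other},
]
groups = assign_groups(rows)
print(groups)
```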
I think a stratified sample will do what you want:
select t.*,
(case when mod(row_number() over (order by a, b, c, d, e, f), 2) = 1
then 'C' else 'T'
end) as test_group
from t;
This is not exactly how you phrased the question, but it should have the same effect of splitting rows with the same values in the columns evenly in the two groups. When there are odd numbers, sometimes the extra will go to test and sometimes to control.
It is unclear from the question whether you want balanced control and test groups -- which is what I would expect. If you actually want all groups with odd numbers to go to control (as you suggest), then all the onesies will be in the control and that seems biased to me.

OrientDB group by query using graph

I need to perform a grouped aggregate on a property of vertices of a certain class. However, the group-by field is a vertex two steps away from my current node, and I can't make it work.
My case:
The vertex A contains the property I want to aggregate on and has n outgoing edges labeled references. The vertex I want to group on is whichever of the referenced vertices (B, C or D) has a defined by edge to vertex F.
A ----references--> B --defined by--> E
 \---references--> C --defined by--> F
  \--references--> D --defined by--> G
...
The query I thought would work is:
select sum(property), groupOn from (
select property, out('references')[out('definedBy').#rid = F] as groupOn from AClass
) group by groupOn
But it doesn't work: the inner statement gives me a strange, incorrect response (it returns no vertices), and I suspect that out() isn't supported inside bracket conditions. (The reason for the .#rid is that the docs I found stated that only "=" is supported.)
out('references')[out('definedBy') contains F] doesn't work either; it returns the out('definedBy') for the $current vertex.
Anyone with an idea how to achieve this? In my example, the result I would like is the property in one column and the #rid of the C vertex in another. Then I can happily perform my group by aggregates.
Solved it! In OrientDB 2.1 (I'm trying rc4) there's a new clause called UNWIND (see SELECT in docs).
Using UNWIND I can do:
SELECT sum(property), groupOn from (
SELECT property, out('references') as groupOn
FROM AClass
UNWIND groupOn
) WHERE groupOn.out('definedBy')=F
GROUP BY groupOn
It could potentially be a slow query, depending on the number of AClass vertices and their references. I'll report back if I find any performance issues.
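For readers unfamiliar with UNWIND, the following plain-Python sketch imitates what the query above does conceptually: it turns each row with a list-valued groupOn into one row per element, filters on the definedBy target, and then sums per group. This is only an emulation with made-up data; the unwind helper and the dictionaries are hypothetical and are not part of any OrientDB API.

```python
from collections import defaultdict


def unwind(rows, field):
    # Emit one copy of the row per element of the list-valued field.
    for row in rows:
        for value in row[field]:
            yield {**row, field: value}


# Made-up stand-ins for AClass vertices and the definedBy edges.
a_vertices = [
    {"property": 10, "groupOn": ["B", "C", "D"]},
    {"property": 5, "groupOn": ["C"]},
]
defined_by = {"B": "E", "C": "F", "D": "G"}

totals = defaultdict(int)
for row in unwind(a_vertices, "groupOn"):
    # Keep only referenced vertices whose definedBy target is F.
    if defined_by.get(row["groupOn"]) == "F":
        totals[row["groupOn"]] += row["property"]

print(dict(totals))
```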

What is the difference between `::` and `.` in pig?

What is the difference between :: and . in pig?
When do I use one vs the other?
E.g., I know that :: is needed in a join when a field exists in both aliases:
A = foreach (join B by (x), C by (y)) generate B::y as b_y, C::y as c_y;
and I need . when accessing group fields:
A = foreach (group B by (x,y)) generate group.x as x, group.y as y, SUM(B?z) as z;
However, do I pass B::z or B.z to SUM above instead of B?z?
In Pig, :: is used as a disambiguation tool after operations which could possibly create naming collisions. Notably, this happens with JOIN, CROSS, and FLATTEN. Consider two relations, A:{(id:int, name:chararray)} and B:{(id:int, location:chararray)}. If you want to associate names with locations, naturally you would do:
C = JOIN A BY id, B BY id;
Without the disambiguation operator, your schema would be
C:{(id:int, name:chararray, id:int, location:chararray)}
Now you can't tell which field id refers to. To avoid this, Pig will instead do
C:{(A::id:int, A::name:chararray, B::id:int, B::location:chararray)}
Likewise, you could FLATTEN two bags whose tuples have fields with the same name, and they would also collide. So the same operator is used in this case as well. When there is no such conflict, you do not need to use the full name: name is unambiguous here. To simplify C, then, you can do this:
D = FOREACH C GENERATE A::id, name, location;
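The naming rule described above can be mimicked with a toy model in Python. This is not how Pig is implemented; join_schema and resolve are hypothetical helpers that only illustrate when a short name is enough and when the alias::field form is required.

```python
def join_schema(left_alias, left_fields, right_alias, right_fields):
    # After a join, every field carries the alias it came from.
    return [f"{left_alias}::{f}" for f in left_fields] + [
        f"{right_alias}::{f}" for f in right_fields
    ]


def resolve(name, schema):
    # A name matches a field if it is the full name or its unprefixed tail.
    matches = [f for f in schema if f == name or f.endswith("::" + name)]
    if len(matches) != 1:
        raise ValueError(f"{name!r} matches {len(matches)} fields: {matches}")
    return matches[0]


schema = join_schema("A", ["id", "name"], "B", ["id", "location"])
print(resolve("name", schema))   # unambiguous short name
print(resolve("A::id", schema))  # the prefix is needed for id
```

Calling resolve("id", schema) raises ValueError in this model, mirroring the ambiguity that :: exists to resolve.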
The . operator, by contrast, projects fields from bags and tuples. If you have a bag b with schema {(x:int, y:int, z:int)}, the projection b.y yields a bag with just the specified field: {(y:int)}. You can project multiple fields at once with parentheses: b.(y,z) yields {(y:int, z:int)}.
When used with tuples, the result is a tuple with just the specified fields. If the tuple t has schema (x:int, y:int, z:int), then t.x is the tuple (x:int) and t.(y,z) is the tuple (y:int, z:int).
To your specific question about SUM: note that SUM, along with the other summary statistic UDFs, takes a bag as its argument. Therefore, you need to create a bag with just the one field per tuple that you want to sum. Use the projection operator . for this: B.z.
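As a rough analogy in Python, a bag can be modelled as a list of rows, projection as picking fields out of every row, and SUM as a function over a bag of one-field tuples. The helpers project and pig_sum are hypothetical names for this sketch and do not correspond to any Pig API.

```python
def project(bag, *fields):
    # Like b.(y, z): keep only the named fields of each tuple in the bag.
    return [tuple(row[f] for f in fields) for row in bag]


def pig_sum(bag):
    # Like SUM: expects a bag whose tuples hold a single numeric field.
    return sum(t[0] for t in bag)


B = [
    {"x": 1, "y": 2, "z": 10},
    {"x": 1, "y": 2, "z": 5},
]

b_z = project(B, "z")        # analogous to B.z
b_yz = project(B, "y", "z")  # analogous to B.(y, z)
total = pig_sum(b_z)         # analogous to SUM(B.z)
print(b_z, b_yz, total)
```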
IIRC you get :: as a side effect of some statements. You need not worry about it unless (as you mentioned) a name exists under two different prefixes.
The . is different in that you are going inside the structure.
group.x as x, group.y as y is equivalent to FLATTEN(group)
SUM(B?z) - here you should do SUM(B.z), to specify that you need a particular field to SUM.