Combining two tuples into one in Oracle DB - SQL

Say I have a bunch of letters grouped together and I want to find out which pair of letters occurs together most often. As an example, I have
a b d b
b c e a
and it should return ab or ba, because the pair a-b occurs most often here.
So far I have written a query that pulls every two letters that appear together, but when I run it, the letters of each pair come back as separate tuples, like this:
a
b
b
c
d
e
b
a
I need to compare the occurrence of all the PAIRS. My idea so far is to use nvl() to concatenate them (but nvl() of a-b and b-a returns two separate instances) and then run a count, but I'm not sure how to run a count on these, given how I selected the letters:
select a.value, b.value
from Letter a, Letter b, Word aw, Word bw, Sentence senA, sentence senb
where a.id = aw.aid and aw.pid = sena.id and b.id = bw.aid and
bw.pid = senb.id and aw.pid = bw.pid and a.value != b.value
;
TL;DR: I want to do a count(a.ltr-b.ltr pair) but not sure how to code that.
Thanks!
EDIT: table structure:
letter(id, value)
\
word(aid, pid)
\
sentence(id, name,sid)
Basically, if two letters end up in the same sentence.id, they are a pair (bond).
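One possible way to count the pairs, offered as a sketch rather than a tested answer: join word to itself on the sentence id and keep only rows where the first letter sorts before the second, so a-b and b-a collapse into a single pair. It assumes the table structure given above, with word.pid holding the sentence id (so the sentence table itself is not needed); the column aliases are made up for illustration. Note that NVL only substitutes a value for NULL; concatenation in Oracle is done with ||, and neither is required here.

```sql
-- Sketch only: assumes letter(id, value) and word(aid, pid) as described
-- in the question, where word.pid is the sentence id.
-- a.value < b.value keeps each unordered pair once (a-b, never b-a)
-- and also drops a letter paired with itself.
SELECT a.value  AS first_letter,
       b.value  AS second_letter,
       COUNT(*) AS pair_count
FROM   letter a
       JOIN word aw  ON aw.aid = a.id
       JOIN word bw  ON bw.pid = aw.pid   -- both letters in the same sentence
       JOIN letter b ON b.id   = bw.aid
WHERE  a.value < b.value
GROUP  BY a.value, b.value
ORDER  BY pair_count DESC;
```

The first row of the result is the most frequent pair. If the same letter can appear more than once in a sentence, each combination is counted, which may or may not be what is wanted.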

Related

PIG REPLACE with NULL

I have three values A, B and C.
I want to be able to replace the value of C with a NULL value if A AND B have values in their cells.
I'm unsure where to start. I've tried something like
FOR EACH X GENERATE REPLACE(C, ((A IS NOT NULL AND B IS NOT NULL) ? NULL:C) ;
But I'm unsure whether this will work; it doesn't seem right. I don't want to add any more values, just update the value of C.
Maybe something like
FOR EACH X GENERATE (A IS NOT NULL AND B IS NOT NULL) ? NULL:C AS NEW_C;
Then drop C, whilst retaining A, B and NEW_C?
You can simply do:
Y = FOREACH X GENERATE A, B, (A IS NOT NULL AND B IS NOT NULL ? NULL : C) AS C;
There is no need to create NEW_C and then drop C: no fields are carried into the new relation unless you name them explicitly (or use GENERATE * so that all fields are carried through).

PostGIS Intersection of multiple tables

In my PostGIS DB, I need to create a new layer from the intersection of multiple polygon layers while keeping all fields (columns) from all layers (tables), so the output table needs all columns of the 3 input tables.
I believe it has to involve ST_Intersects, but I am unable to find the correct syntax. Can you help me design the SQL command, given the following generic table names:
- TableA (with the columns: GeomA (geometry), ColumnA1, ColumnA2)
- TableB (with the columns: GeomB (geometry), ColumnB1)
- TableC (with the columns: GeomC (geometry), ColumnC1)
All geometry fields from TableA,TableB and TableC are in the same projection.
Thanks
For clarity, since "intersection of multiple polygon layers" is a bit vague, it could mean:
- you want to find all polygons from A that intersect a polygon from B and a polygon from C
- or A with B, and B with C
- or A with B, A with C and B with C
For simplicity, let us assume the first scenario; I presume the others are easy to deduce from it:
select A.*, B.*, C.*
from A, B, C
where st_intersects(A.geomA, B.geomB) = true
and st_intersects(A.geomA, C.geomC) = true
[EDIT] If you need not just the matching rows but the intersection geometry itself, we can do the following (in the simple case of two intersecting geometries):
select A.*, B.*, st_intersection(A.geomA, B.geomB) as geomAB
from A, B
where st_intersects(A.geomA, B.geomB) = true
I simplified the case because if A intersects B and A intersects C, it is not certain that those two intersections overlap each other and produce a result.
If we suppose A intersects B, B intersects C and C intersects A, then there will usually be a common intersection, although pairwise overlap does not strictly guarantee one (the result can be an empty geometry). That would look like:
select A.*, B.*, C.*, st_intersection(A.geomA, st_intersection(B.geomB, C.geomC)) as geomABC
from A, B, C
where st_intersects(A.geomA, B.geomB) = true
and st_intersects(A.geomA, C.geomC) = true
and st_intersects(B.geomB, C.geomC) = true
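Since the question asks for a new layer, the last query could be wrapped in CREATE TABLE ... AS. The following is a minimal sketch under assumptions: new_layer and geom are hypothetical names, and the table and column names are the generic ones from the question. Only the intersection geometry is kept, so the output has a single geometry column.

```sql
-- Sketch only: new_layer and geom are hypothetical names.
CREATE TABLE new_layer AS
SELECT a.ColumnA1,
       a.ColumnA2,
       b.ColumnB1,
       c.ColumnC1,
       ST_Intersection(ST_Intersection(a.GeomA, b.GeomB), c.GeomC) AS geom
FROM   TableA a
       JOIN TableB b ON ST_Intersects(a.GeomA, b.GeomB)
       JOIN TableC c ON ST_Intersects(a.GeomA, c.GeomC)
                    AND ST_Intersects(b.GeomB, c.GeomC)
-- Three polygons can overlap pairwise without sharing a common area,
-- so rows whose three-way intersection is empty are dropped.
WHERE  NOT ST_IsEmpty(
           ST_Intersection(ST_Intersection(a.GeomA, b.GeomB), c.GeomC));
```

Polygons that only touch along an edge or at a point would yield line or point geometries here, so an extra filter on the geometry type may be needed if the layer must contain polygons only.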

Oracle: outer join(+) with or clause replacement

I have an enormous select that schematically looks like this:
SELECT c_1, c_2, ..., c_j FROM t_1, t_2, ..., t_k
WHERE e_11 = e_12(+)
AND e_21 = e_22(+)
AND ...
AND e_l1 = e_l2(+)
ORDER BY o
where j, k and l are in the hundreds and e_mn is a column from some table. I need to add new columns A_1 and A_2 to the select from a new table T. The new columns are connected to the existing select via a column, call it B, from a table R. I want those rows where A_1 = B or A_2 = B, and also those rows where there is no corresponding A_i for the value B.
Suppose I only had to deal with tables T and R then I want this:
SELECT * FROM R
LEFT OUTER JOIN T
ON (A_1 = B OR A_2 = B)
To mimic this behaviour I'd want something like this in the big select:
SELECT c_1, c_2, ..., c_j, A_1, A_2 FROM t_1, t_2, ..., t_k, T
WHERE e_11 = e_12(+)
AND e_21 = e_22(+)
AND ...
AND e_l1 = e_l2(+)
AND (B = A_1(+) OR B = A_2(+))
ORDER BY o
This is, however, syntactically incorrect, since the (+) operator cannot be used with the OR clause. And if I leave out the (+)'s, I lose those rows where there is no corresponding A_i for B.
What are my options here? Can I somehow find a way to do this without changing the whole body of the select? I doubt there is a reasonable way to do this, nevertheless I'd appreciate any help.
Thanks.
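One option that leaves the body of the select untouched, sketched here under assumptions: wrap the existing query in an inline view and attach T outside it with an ANSI LEFT OUTER JOIN, where OR is allowed in the ON condition. As far as I know, Oracle's restriction on mixing (+) with ANSI join syntax applies per query block, so the old-style joins can stay inside the view. The tables r, s and t and their columns below are hypothetical stand-ins for the schematic names in the question; the inner query has to expose B, and the ORDER BY moves to the outer query.

```sql
-- Sketch only: r, s and t are hypothetical stand-ins. The inner query plays
-- the role of the existing (+) select; t is the new table with a_1 and a_2.
SELECT big.r_id,
       big.s_val,
       t.a_1,
       t.a_2
FROM  (
        -- existing old-style body, unchanged except that it also exposes b
        SELECT r.id  AS r_id,
               s.val AS s_val,
               r.b   AS b
        FROM   r, s
        WHERE  r.id = s.r_id(+)
      ) big
      LEFT OUTER JOIN t
        ON (t.a_1 = big.b OR t.a_2 = big.b)
ORDER  BY big.r_id;
```

Rows of the inner query with no matching row in t come back with NULL in a_1 and a_2, which is the behaviour of the two-table LEFT OUTER JOIN shown in the question.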

When 2 output values are returned it should display the hardcoded one, and when 1 output value is returned it should display that value itself

When I execute a query with input parameter ABC, it returns two values (Partner, Smith). Whenever two values are returned, Smith is always one of them.
But when the same query is executed with input parameter 'xyz', it returns only one value.
My requirement is: if the query returns two values, only SMITH must be returned in the output, and if it returns one value, that value itself should be displayed.
The query below satisfies the first part of my requirement but not the second. Instead of displaying the single output value, it returns NULL whenever only one value is returned.
SELECT R.REGION_GID
FROM GTM_TRANSACTION T,
GTM_TRANSACTION_INVOLVED_PARTY P,
CONTACT C,
LOCATION L,
REGION_DETAIL R
WHERE T.GTM_TRANSACTION_GID=P.GTM_TRANSACTION_GID
AND R.COUNTRY_CODE3_GID = L.COUNTRY_CODE3_GID
AND R.REGION_GID LIKE 'SSN/BP.GTM_COMPL%'
AND L.LOCATION_GID = C.LOCATION_GID
AND P.INVOLVED_PARTY_CONTACT_GID=C.CONTACT_GID
AND P.INVOLVED_PARTY_QUAL_GID='SHIP_FROM'
AND T.GTM_TRANSACTION_GID=$SHIP_FROM
INTERSECT
SELECT R.REGION_GID
FROM GTM_TRANSACTION T,
GTM_TRANSACTION_INVOLVED_PARTY P,
CONTACT C,
LOCATION L,
REGION_DETAIL R
WHERE T.GTM_TRANSACTION_GID=P.GTM_TRANSACTION_GID
AND R.COUNTRY_CODE3_GID = L.COUNTRY_CODE3_GID
AND R.REGION_GID ='SSN/BP.GTM_COMPL_NO_CODING'
AND L.LOCATION_GID = C.LOCATION_GID
AND P.INVOLVED_PARTY_CONTACT_GID=C.CONTACT_GID
AND P.INVOLVED_PARTY_QUAL_GID='SHIP_FROM'
AND T.GTM_TRANSACTION_GID=$SHIP_FROM
As far as I can tell, the only difference between the two halves of your INTERSECT is in the filters for R.REGION_GID. The first half has:
R.REGION_GID LIKE 'SSN/BP.GTM_COMPL%'
while the second has
R.REGION_GID = 'SSN/BP.GTM_COMPL_NO_CODING'
Given how INTERSECT works, I think this means the first half is redundant. The only question then is whether the second half is returning one row or two rows. You want it to always return one row, with 'SMITH' taking precedence. The following logic may be what you want (as a bonus, I've tidied up your JOINs too):
SELECT TOP 1
R.REGION_GID
FROM
GTM_TRANSACTION T
JOIN GTM_TRANSACTION_INVOLVED_PARTY P ON
T.GTM_TRANSACTION_GID=P.GTM_TRANSACTION_GID
JOIN CONTACT C ON
P.INVOLVED_PARTY_CONTACT_GID=C.CONTACT_GID
JOIN LOCATION L ON
L.LOCATION_GID = C.LOCATION_GID
JOIN REGION_DETAIL R ON
R.COUNTRY_CODE3_GID = L.COUNTRY_CODE3_GID
WHERE
R.REGION_GID ='SSN/BP.GTM_COMPL_NO_CODING'
AND P.INVOLVED_PARTY_QUAL_GID='SHIP_FROM'
AND T.GTM_TRANSACTION_GID=$SHIP_FROM
ORDER BY
CASE WHEN R.REGION_GID = 'SMITH' then 1 else 2 end
That last line will want to be something like CASE WHEN R.REGION_GID = 'SMITH' then 1 else 2 end, but you haven't told us much about your data, so I really don't know.
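SELECT TOP 1 is SQL Server syntax. The question does not say which database is in use; if it is Oracle (an assumption), the same idea can be sketched with an ordered inline view and ROWNUM. The filter and the 'SMITH' ordering are copied from the query above and carry the same caveat about the unknown data.

```sql
-- Sketch only: assumes an Oracle database, which the question does not state.
SELECT REGION_GID
FROM  (
        SELECT R.REGION_GID
        FROM   GTM_TRANSACTION T
               JOIN GTM_TRANSACTION_INVOLVED_PARTY P
                 ON T.GTM_TRANSACTION_GID = P.GTM_TRANSACTION_GID
               JOIN CONTACT C
                 ON P.INVOLVED_PARTY_CONTACT_GID = C.CONTACT_GID
               JOIN LOCATION L
                 ON L.LOCATION_GID = C.LOCATION_GID
               JOIN REGION_DETAIL R
                 ON R.COUNTRY_CODE3_GID = L.COUNTRY_CODE3_GID
        WHERE  R.REGION_GID = 'SSN/BP.GTM_COMPL_NO_CODING'
        AND    P.INVOLVED_PARTY_QUAL_GID = 'SHIP_FROM'
        AND    T.GTM_TRANSACTION_GID = $SHIP_FROM
        -- preferred value sorts first
        ORDER  BY CASE WHEN R.REGION_GID = 'SMITH' THEN 1 ELSE 2 END
      )
WHERE  ROWNUM = 1;   -- keep only the first row of the ordered result
```

The ORDER BY has to sit inside the inline view, because ROWNUM is assigned before an ORDER BY in the same query block is applied.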

How to find the occurrences of a column mapped to a corresponding column in a query SQL

I have a query as below
select custref, tetranumber
from
(select *
from cdsheader h, custandaddr c
where h.custref=c.cwdocid and c.addresstype = 'C' )
where tetranumber = '034096'
The objective is that each value in the 2nd column should have only one corresponding value in the 1st column.
Ex: 034096 should always have 2600135 as the first column.
I would like to check if there is any value apart from 2600135 for 034096.
(I am a Java developer and suggested a solution to prevent 1-to-n or n-to-n mappings in the data, but there is already bad data in the DB (Oracle), so I would like to find the bad data so that I can delete it.)
re: The objective is that each value in the 2nd column should have only one corresponding value in the 1st column
You'll need to use an aggregate function, such as MAX or MIN, to determine which of the rows is returned.
Thanks for the responses, guys.
I have figured out a way, and here it goes:
select custref, count(distinct(tetranumber)) from(
select custref, tetranumber from cdsheader h, custandaddr c where h.custref=c.cwdocid and c.addresstype = 'C')
group by custref having count(distinct(tetranumber))>1
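Note that the query above lists each custref that maps to more than one tetranumber. The objective stated in the question runs in the other direction (one tetranumber should have only one custref), so a check for that could look like the sketch below. It assumes the same tables and join as the question; custref_count is a made-up alias.

```sql
-- Sketch only: same tables and join condition as in the question.
-- Lists every tetranumber that is mapped to more than one custref.
SELECT tetranumber,
       COUNT(DISTINCT custref) AS custref_count
FROM   cdsheader h,
       custandaddr c
WHERE  h.custref = c.cwdocid
AND    c.addresstype = 'C'
GROUP  BY tetranumber
HAVING COUNT(DISTINCT custref) > 1;
```

Running both checks together shows whether the bad data is one-to-many in either direction.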