How to use an intersect operator in MongoDB?

(select A from 'TableB' where C = c and G = g)
intersect
(select A from 'TableB' where C = d and G = h)
First of all, because MySQL does not provide an INTERSECT operator, I rewrote the query above as follows.
select A
from 'TableB'
where C = c and G = g and A in(
select A
from 'TableB'
where C = d and G = h)
I want to get the same result in MongoDB.
Is there a way to do this?

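A single $and over {C: c}, {C: d}, {G: g}, {G: h} cannot work here, because no one document has two different values for C or for G. Instead, mirror the IN subquery with two queries:
// Values of A from documents that match the second set of conditions
const inner = await TableB.distinct('A', {C: d, G: h});
// Values of A that match the first set of conditions and also appear in `inner`
const result = await TableB.distinct('A', {C: c, G: g, A: {$in: inner}});
result is an array holding the values of A that occur both in a document with C = c, G = g and in a document with C = d, G = h.
Hope it helps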
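The INTERSECT semantics the question asks for can be sketched in plain Python, independent of any database driver. This is only an illustration: the helper name intersect_a and the sample rows are made up for the example, and the string values stand in for c, d, g and h.

```python
def intersect_a(rows, first, second):
    """Return the values of A found in rows matching `first` AND in rows matching `second`."""
    def values_matching(cond):
        # Collect A from every row whose fields equal all key/value pairs in `cond`.
        return {r["A"] for r in rows if all(r.get(k) == v for k, v in cond.items())}
    return values_matching(first) & values_matching(second)

# Hypothetical sample data: only A = 1 appears under both condition sets.
rows = [
    {"A": 1, "C": "c", "G": "g"},
    {"A": 1, "C": "d", "G": "h"},
    {"A": 2, "C": "c", "G": "g"},
    {"A": 3, "C": "d", "G": "h"},
]
result = intersect_a(rows, {"C": "c", "G": "g"}, {"C": "d", "G": "h"})
print(result)  # {1}
```

Note that the two conditions are evaluated against different rows and only the A values are intersected, which is why a single filter on one row cannot express it.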

Why does DMux4Way operate like this?

When sel is 01, b should be chosen, but why is the selecting gate And(a = nsel1, b = sel[0])? As I read it, that turns sel from 01 into 00.
CHIP DMux4Way {
IN in, sel[2];
OUT a, b, c, d;
PARTS:
// Put your code here:
Not(in = sel[0], out = nsel0);
Not(in = sel[1], out = nsel1);
And(a = nsel1, b = nsel0, out = s00);
And(a = in, b = s00, out = a);
And(a = nsel1, b = sel[0], out = s01);
And(a = in, b = s01, out = b);
And(a = sel[1], b = nsel0, out = s10);
And(a = in, b = s10, out = c);
And(a = sel[1], b = sel[0], out = s11);
And(a = in, b = s11, out = d);
}
I just figured it out: Hack is little-endian here, meaning sel[0] is the rightmost (least significant) bit of sel.
In each pair of ANDs, the first one determines whether the output is activated, and the second one gates the input to that output.
So:
And(a = nsel1, b = nsel0, out = s00);
s00 is True only if both its inputs (nsel0 and nsel1) are True. Since they are the inverse of sel[0] and sel[1], s00 will only be true if both sel[0] and sel[1] are False.
And(a = in, b = s00, out = a);
The second And will only be true if both in and s00 are True. So it will only be True if in is True, and sel[0] and sel[1] are False.
Similar logic applies for the other cases. But the key point is that for the 01 and 10 cases, it is not enough to check that the relevant bit is True; you also have to check that the other bit is False.
Edit: In addition, as the OP points out in a subsequent comment, the machine is little-endian, and you must keep this in mind when operating on a multi-bit bus (like sel).
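To make the bit ordering concrete, here is a small Python simulation of the same gate structure. It is a sketch, not HDL: the function name dmux4way and the choice to pass sel as an integer are assumptions for the example, with bit 0 of the integer playing the role of sel[0].

```python
def dmux4way(inp, sel):
    """Route `inp` to one of (a, b, c, d); `sel` is a 2-bit integer, bit 0 is the LSB."""
    sel0 = bool(sel & 1)         # sel[0], the rightmost bit
    sel1 = bool((sel >> 1) & 1)  # sel[1], the leftmost bit
    nsel0, nsel1 = not sel0, not sel1
    a = inp and (nsel1 and nsel0)  # active when sel = 00
    b = inp and (nsel1 and sel0)   # active when sel = 01
    c = inp and (sel1 and nsel0)   # active when sel = 10
    d = inp and (sel1 and sel0)    # active when sel = 11
    return tuple(int(bool(x)) for x in (a, b, c, d))

for sel in range(4):
    print(format(sel, "02b"), dmux4way(True, sel))
# 00 (1, 0, 0, 0)
# 01 (0, 1, 0, 0)
# 10 (0, 0, 1, 0)
# 11 (0, 0, 0, 1)
```

With sel written as 01, sel[1] is 0 and sel[0] is 1, so nsel1 and sel[0] are both true and b is selected.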

Compare Tuples value present inside a bag with a hardcoded String value

I have a data set with these columns:
FMID,County,WIC,WICcash
Here is a sample of the data:
1002267,Douglas,Y,N
21005876,Douglas,Y,N
1001666,Douglas,N,Y
I have grouped the data by County and filtered it on County = 'Douglas'. Here is the output:
(Douglas,{(1002267,Douglas,Y,N),(21005876,Douglas,Y,N),(1001666,Douglas,N,Y)})
Now, wherever the WIC or WICcash column has the value Y, I want to count it and report the combined count across both columns.
Here, the WIC and WICcash columns together contain 3 Y values, so my output should be
Douglas 3
How can I achieve this?
Below is the code that I have written so far:
load_data = LOAD 'PigPrograms/Markets/DATA_GOV_US_Farmers_Market_DataSet.csv' USING PigStorage(',') as (FMID:long,County:chararray, WIC:chararray, WICcash:chararray);
group_markets_by_county = GROUP load_data BY County;
filter_county = FILTER group_markets_by_county BY group == 'Douglas';
DUMP filter_county;
To look inside a bag, you can use a nested FOREACH.
A = LOAD 'input3.txt' AS (FMID:long,County:chararray, WIC:chararray, WICcash:chararray);
B = GROUP A by County;
describe B; /* B: {group: chararray,A: {(FMID: long,County: chararray,WIC: chararray,WICcash: chararray)}} */
C = FOREACH B {
FILTER_WIC_Y = FILTER A by WIC == 'Y';
COUNT_WIC_Y = COUNT(FILTER_WIC_Y);
FILTER_WICcash_Y = FILTER A by WICcash == 'Y';
COUNT_WICcash_Y = COUNT(FILTER_WICcash_Y);
GENERATE group, COUNT_WIC_Y + COUNT_WICcash_Y as count;
}
dump C;
Alternatively, you can map 'Y' and 'N' to 1 and 0 and add them up.
A = LOAD 'input3.txt' AS (FMID:long,County:chararray, WIC:chararray, WICcash:chararray);
B = FOREACH A GENERATE FMID, County, (WIC == 'Y' ? 1 : 0 ) as wic, (WICcash == 'Y' ? 1 : 0 ) as wiccash;
C = GROUP B by County;
D = FOREACH C GENERATE group, SUM(B.wic) + SUM(B.wiccash) as count;
dump D;
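For comparison, the same "map Y to 1 and sum per county" idea can be sketched in Python. This is only an illustration of the logic, not a replacement for the Pig script; the function name count_y_by_county is made up, and the rows are the sample data from the question.

```python
from collections import Counter

# Sample rows from the question: (FMID, County, WIC, WICcash)
rows = [
    (1002267, "Douglas", "Y", "N"),
    (21005876, "Douglas", "Y", "N"),
    (1001666, "Douglas", "N", "Y"),
]

def count_y_by_county(records):
    """Sum the number of 'Y' flags in the WIC and WICcash columns per county."""
    counts = Counter()
    for _fmid, county, wic, wiccash in records:
        # Each comparison is True (1) or False (0), so adding them counts the 'Y' flags.
        counts[county] += (wic == "Y") + (wiccash == "Y")
    return dict(counts)

print(count_y_by_county(rows))  # {'Douglas': 3}
```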

Scala Slick 3.1.0 - counting columns from 2 tables in left join query

I am trying to construct a double count left join query in Slick 3.1 similar to this:
SELECT
COUNT(p.id) AS replies,
COUNT(f.id) AS images
FROM posts AS p
LEFT JOIN files AS f
ON p.id = f.post_id
WHERE thread = :thread_id
The join part of the query is quite simple and looks like this:
val joinQ = (threadId: Rep[Long]) =>
(postDAO.posts joinLeft fileRecordDAO.fileRecords on (_.id === _.postId))
.filter(_._1.thread === threadId)
Using joinQ(1L).map(_._1.id).length generates COUNT(1) which counts all rows - not the result I want to obtain. Using joinQ(1L).map(_._1.id).countDistinct generates COUNT(DISTINCT p.id) which is somewhat what I'm looking for, but trying to do two of these generates this monstrosity:
select x2.x3, x4.x5
from (select count(distinct x6.`id`) as x3
from `posts` x6
left outer join `files` x7 on x6.`id` = x7.`post_id`
where x6.`thread` = 1) x2,
(select
count(distinct (case when (x8.`id` is null) then null else 1 end)) as x5
from `posts` x9
left outer join `files` x8 on x9.`id` = x8.`post_id`
where x9.`thread` = 1) x4
Here's the double countDistinct code:
val q = (threadId: Rep[Long]) => {
val join = joinQ(threadId)
val q1 = join.map(_._1.id).countDistinct // map arg type is
val q2 = join.map(_._2).countDistinct // (Post, Rep[Option[FileRecord]])
(q1, q2)
}
Any ideas? :)
COUNT is an aggregate function, so you also need to group by some field (e.g. id). Try the following query:
def joinQ(threadId: Long) = (postDAO.posts joinLeft fileRecordDAO.fileRecords on (_.id === _.postId))
.filter(_._1.thread === threadId)
val q = (threadId: Long) => {
joinQ(threadId).groupBy(_._1.id).map {
case (id, qry) => (id, qry.map(_._1.id).countDistinct, qry.map(_._2.map(_.id)).countDistinct)
}
}
It generates the following SQL:
SELECT x2.`id`,
Count(DISTINCT x2.`id`),
Count(DISTINCT x3.`id`)
FROM `posts` x2
LEFT OUTER JOIN `files` x3
ON x2.`id` = x3.`post_id`
WHERE x2.`thread` = 10
GROUP BY x2.`id`
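To see why the plain and DISTINCT counts differ on a left join, here is a small sketch using Python's built-in sqlite3 module rather than Slick. The table contents are invented for the example (one thread with two posts, one of which has two files), and the schema is reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, thread INTEGER);
    CREATE TABLE files (id INTEGER PRIMARY KEY, post_id INTEGER);
    INSERT INTO posts VALUES (1, 10), (2, 10);
    INSERT INTO files VALUES (100, 1), (101, 1);  -- post 1 has two files, post 2 has none
""")

base = """
    SELECT {replies}, {images}
    FROM posts AS p LEFT JOIN files AS f ON p.id = f.post_id
    WHERE p.thread = ?
"""
# Plain COUNT counts every joined row in which the column is not NULL.
plain = conn.execute(
    base.format(replies="COUNT(p.id)", images="COUNT(f.id)"), (10,)
).fetchone()
# COUNT(DISTINCT ...) collapses the rows duplicated by the join.
distinct = conn.execute(
    base.format(replies="COUNT(DISTINCT p.id)", images="COUNT(DISTINCT f.id)"), (10,)
).fetchone()
print(plain)     # (3, 2)
print(distinct)  # (2, 2)
```

The join yields three rows for two posts, so only the DISTINCT form reports the actual number of posts.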

Limit but not Order in PIG

I ran into a problem while using LIMIT in Pig.
The result of LIMIT is sorted, but I don't want the result to be sorted.
From the example on the website:
A = LOAD 'data' AS (a1:int,a2:int,a3:int);
DUMP A;
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
Using Limit
X = LIMIT A 3;
DUMP X;
(1,2,3)
(4,3,3)
(7,2,5)
Is it possible to show the top 3 lines without sorting the result?
(1,2,3)
(4,2,1)
(8,3,4)
My code is below:
A = LOAD '$input';
B = foreach A generate $s_field;
C = FILTER B BY $pattern;
D = FOREACH C {
topnresult = LIMIT B $lines;
GENERATE FLATTEN(topnresult);
}
dump D;
Thank you very much.
LIMIT by itself does not guarantee which tuples are returned, or in what order, unless the relation was explicitly ordered first, so the output need not match the first lines of the input. There are several ways to solve this problem; one option is:
input.txt
1 2 3
4 2 1
8 3 4
4 3 3
7 2 5
8 4 3
PigScript:
A = LOAD 'input.txt' AS (a1:int,a2:int,a3:int);
B = RANK A;
C = FILTER B BY rank_A<=3;
D = FOREACH C GENERATE a1,a2,a3;
DUMP D;
Output:
(1,2,3)
(4,2,1)
(8,3,4)
Option 2:
A = LOAD 'input.txt' AS (a1:int,a2:int,a3:int);
B = GROUP A ALL;
C = FOREACH B {
top3list = LIMIT A 3;
GENERATE FLATTEN(top3list);
}
DUMP C;
Output:
(1,2,3)
(4,2,1)
(8,3,4)
UPDATE: Solution 1:
A = LOAD '$input';
B = foreach A generate $s_field;
C = FILTER B BY $pattern;
D = GROUP C ALL;
E = FOREACH D {
topnresult = LIMIT C $lines;
GENERATE FLATTEN(topnresult);
}
DUMP E;
Solution 2:
A = LOAD '$input';
B = foreach A generate $s_field;
C = FILTER B BY $pattern;
D = RANK C;
E = FILTER D BY rank_C<=$lines;
F = FOREACH E GENERATE $1..;
DUMP F;
I have tested the solutions using the command line below, and they work fine:
>pig -x local -param input='input.txt' -param s_field='$0,$1,$2' -param pattern='$0<10' -param lines=3 myscript.pig
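The two approaches above (take the first n records as they come, or rank the records and keep ranks up to n) can be sketched in Python to show that neither needs a sort. The function names first_n and first_n_by_rank are made up for this illustration, and the data is the sample from the question.

```python
from itertools import islice

def first_n(records, n):
    """Return the first n records in their original order, without sorting."""
    return list(islice(records, n))

def first_n_by_rank(records, n):
    """Same result via explicit 1-based ranks, mirroring RANK followed by FILTER."""
    ranked = enumerate(records, start=1)
    return [rec for rank, rec in ranked if rank <= n]

data = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (8, 4, 3)]
print(first_n(data, 3))          # [(1, 2, 3), (4, 2, 1), (8, 3, 4)]
print(first_n_by_rank(data, 3))  # [(1, 2, 3), (4, 2, 1), (8, 3, 4)]
```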

Hardcore SQL(ite) : fetch segment intersection

I think this one is quite difficult, so I'll try to make it as clear as possible.
Basically, I have geographic data:
Nodes (#ID,lat,lng)
WayNodes (#ID,#node_id,#way_id, sequence)
Ways(#id,name)
And what I want is to get the intersection of two GROUPS of ways.
For example I need to find intersection(s) between the ways called "name1" : {way1, way2, way3} and the ways called "name2" : {way4, way5, way6}
So what I need is an equivalent of this:
float x;
float y;
float A1 = Y2-Y1;
float B1 = X1-X2;
float C1 = A1*X1+B1*Y1;
float A2 = Y4-Y3;
float B2 = X3-X4;
float C2 = A2*X3+B2*Y3;
float det = A1*B2 - A2*B1;
if(det == 0){
//Lines are parallel
x = 0.0;
y = 0.0;
}else{
x = (B2*C1 - B1*C2)/det;
y = (A1*C2 - A2*C1)/det;
}
BOOL intersection = (x<MAX(X1,X2) && x<MAX(X3,X4) && x>MIN(X1,X2) && x>MIN(X3,X4));
But in SQL!
I think it is possible; my query looks like this:
(F1, F2 and F3 stand in for the very long functions that compute x, y and det; they should be correct.)
SELECT F1(n1.lat,n1.lng,n2.lat,n2.lng,n3.lat,n3.lng,n4.lat,n4.lng) AS x,
F2(n1.lat,n1.lng,n2.lat,n2.lng,n3.lat,n3.lng,n4.lat,n4.lng) AS y,
F3(n1.lat,n1.lng,n2.lat,n2.lng,n3.lat,n3.lng,n4.lat,n4.lng) AS det,
FROM Nodes n1, Nodes n2, Nodes n3, Nodes n4
JOIN WayNodes wn1 ON n1.id = wn1.node_id
JOIN WayNodes wn2 ON n2.id = wn1.node_id
JOIN WayNodes wn3 ON n3.id = wn1.node_id
JOIN WayNodes wn4 ON n4.id = wn1.node_id
JOIN Way w1 ON wn1.way_id = w1.id AND wn2..way_id = w1.id
JOIN Way w2 ON wn3..way_id = w2.id AND wn4..way_id = w2.id
WHERE det != 0 AND
x < MAX(n1.lng, n2.lng)
AND x > MIN(n1.lng, n2.lng)
AND x < MAX(n3.lng, n4.lng)
AND x > MIN(n3.lng, n4.lng)
AND wn1.sequence=wn2.sequence - 1
AND wn3.sequence=wn4.sequence - 1
AND w1.name = "name1"
AND w2.name = "name2"
Apparently something doesn't work in the joins... any ideas?
Could it be that each join needs to reference its own alias, like this?
JOIN WayNodes wn1 ON n1.id = wn1.node_id
JOIN WayNodes wn2 ON n2.id = wn2.node_id
JOIN WayNodes wn3 ON n3.id = wn3.node_id
JOIN WayNodes wn4 ON n4.id = wn4.node_id
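For checking the SQL results against a reference, the segment-intersection computation from the question can be sketched in Python. This is an illustration under a few assumptions: points are passed as (x, y) tuples, the function name segment_intersection and the eps tolerance are made up, and unlike the question's code it tests both x and y against the segment bounds (inclusively), so vertical segments are handled too.

```python
def segment_intersection(p1, p2, p3, p4, eps=1e-12):
    """Return the intersection point of segments p1-p2 and p3-p4, or None."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    # Line through p1, p2 written as a1*x + b1*y = c1 (same form as the question).
    a1, b1 = y2 - y1, x1 - x2
    c1 = a1 * x1 + b1 * y1
    # Line through p3, p4 written as a2*x + b2*y = c2.
    a2, b2 = y4 - y3, x3 - x4
    c2 = a2 * x3 + b2 * y3
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None  # parallel or collinear lines
    x = (b2 * c1 - b1 * c2) / det
    y = (a1 * c2 - a2 * c1) / det

    def within(v, lo, hi):
        # True when v lies between lo and hi (in either order), with tolerance eps.
        return min(lo, hi) - eps <= v <= max(lo, hi) + eps

    inside = (within(x, x1, x2) and within(x, x3, x4)
              and within(y, y1, y2) and within(y, y3, y4))
    return (x, y) if inside else None

print(segment_intersection((0, 0), (2, 2), (0, 2), (2, 0)))  # (1.0, 1.0)
print(segment_intersection((0, 0), (1, 1), (2, 0), (3, 1)))  # None
```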