Joining two subqueries with Cypher in Neo4J - sql

I have the following SQL query:
SELECT q1.customerId, q1.invoiceId, q2.workId, sum(q2.price)
FROM (select customer.id as customerId, invoice.id as invoiceId, work.id as workId from customer, invoice, workinvoice, work where customer.id=invoice.customerid and invoice.id=workinvoice.invoiceId and workinvoice.workId=work.id
) as q1, (select work.id as workId, sum((price * hours * workhours.discount) + (purchaseprice * amount * useditem.discount)) as price from worktype,workhours,work,warehouseitem,useditem where worktype.id=workhours.worktypeid and workhours.workid=work.id and work.id=useditem.workid and useditem.warehouseitemid=warehouseitem.id group by work.id
) as q2
WHERE q1.workId = q2.workId group by q1.invoiceId;
This query should return the sum of work prices for each invoice per customer.
I would like to know how to do this kind of query in Neo4J. I know that there is UNION https://neo4j.com/docs/cypher-manual/current/clauses/union/. However, that does not seem to do what I want. I need to make two subqueries and join them on the same node, as in the SQL example. What would be the correct way to do this with Cypher?

There's a fairly complex example of how to do a join in Cypher, which you can find here: https://github.com/moxious/halin/blob/master/src/api/data/queries/dbms/3.5/tasks.js#L22
Basically, the technique is that you run the first query, collect the results. Then you run the second, collect the results. Then you unwind the second, match using a filter, and return the result.
In really simplified form, it looks something like this:
CALL something() YIELD a, b
WITH collect({ a: a, b: b }) as resultSet1
CALL somethingElse YIELD a, c
WITH resultSet1, collect({ a: a, c: c }) as resultSet2
UNWIND resultSet2 as rec
WITH [item in resultSet1 WHERE item.a = rec.a][0] as match, rec
RETURN match.a, match.b, rec.c
The list comprehension bit is basically doing the join. Here we're joining on the "a" field.
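To make the join logic easier to follow outside Cypher, here is a rough Python sketch of the same steps. The two result sets are made-up stand-ins for the CALL ... YIELD outputs, and join_on_a is a hypothetical helper name. As far as I know, indexing an empty list with [0] in Cypher yields null rather than an error, so the sketch returns None for unmatched records.

```python
# Hypothetical result sets standing in for the two CALL ... YIELD steps.
result_set1 = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]
result_set2 = [{"a": 2, "c": "q"}, {"a": 1, "c": "p"}, {"a": 3, "c": "r"}]


def join_on_a(left, right):
    """For each record on the right, take the first left record with the same "a"."""
    joined = []
    for rec in right:  # UNWIND resultSet2 AS rec
        # The list comprehension filter is the join condition.
        matches = [item for item in left if item["a"] == rec["a"]]
        match = matches[0] if matches else None  # no match: keep the row, with nulls
        joined.append({
            "a": match["a"] if match else None,
            "b": match["b"] if match else None,
            "c": rec["c"],
        })
    return joined


rows = join_on_a(result_set1, result_set2)
for row in rows:
    print(row)
```

Note that this behaves like an outer join driven by the second result set: the record with a = 3 has no partner and comes back with empty a and b fields.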

I figured out the solution I wanted:
MATCH (inv:invoice)-[:WORK_INVOICE]->(w:work)<-[h:WORKHOURS]-(wt:worktype)
WITH inv, w, SUM(wt.price * h.hours * h.discount) as workTimePrice
OPTIONAL MATCH (w)-[u:USED_ITEM]->(i:item)
WITH inv, workTimePrice + SUM(u.amount * u.discount * i.purchaseprice) as workItemPrice
RETURN inv, sum(workItemPrice) as invoicePrice

Related

Google bigquery is extremely slow on simple query

I have a simple query that counts records from 4 tables (NO JOINING):
SELECT count(tx._sequence_num) as txc,
count(o._sequence_num) as oc,
count(t._sequence_num) as tc,
count(ol._sequence_num) as olc
FROM `xxx.TAX_TRANSACTIONS` tx,
xxx.ORDER o,
xxx.TRANSACTION t,
xxx.ORDER_LINES ol
It never returns a result.
If I split it into 4 queries like this:
SELECT count(tx._sequence_num) as txc FROM `xxx.TAX_TRANSACTIONS` tx; --202685
SELECT count(o._sequence_num) as oc FROM xxx.ORDER o; --175642
SELECT count(t._sequence_num) as tc FROM xxx.TRANSACTION t; --199392
SELECT count(ol._sequence_num) as olc FROM xxx.ORDER_LINES ol; --174947
Each one returns after just 1-2 seconds (the --xxxxxx on the right is the record count).
The same happens with this simple join; I never get a result:
SELECT ol.DEVICE_ID AS VIN,
tx.TAX_LINES AS SKU,
o.USER_ID AS ACCOUNT_DN,
o.ORDER_NUMBER,
cast(t.AMOUNT as FLOAT64)/100 AS TOTAL_AMOUNT ,
t.TRANSACTION_STATUS,
t.TRANSACTION_TYPE,
t.TRANSACTION_TAG,
t.CREATED_ON ,
tx.TAX_CALCULATED,
tx.TRANSACTION_STATUS AS TAX_TXN_STATUS,
tx.ERROR_MESSAGE REMARKS,
tx.TRANSACTION_ID AS TAX_TXN_ID,
tx.TAXATION_TYPE AS TAX_TXN_TYPE,
tx.TRANSACTION_DATE TAX_TXN_DATE
FROM xxx.TAX_TRANSACTIONS tx join
`xxx.ORDER` o on o.ORDER_NUMBER = tx.ORDER_NUMBER join
xxx.TRANSACTION t on o.ORDER_NUMBER = t.ORDER_NUMBER join
xxx.ORDER_LINES ol on o.ID = ol.ORDER_ID
WHERE (t.TRANSACTION_TYPE IN ("purchase") AND t.TRANSACTION_STATUS ="approved" AND tx.TAXATION_TYPE = "SalesInvoice") or
(t.TRANSACTION_TYPE IN ("refund") AND tx.TAXATION_TYPE = "ReturnInvoice") or
(tx.TRANSACTION_STATUS IN ("Error"))
ORDER BY CREATED_ON DESC
Is there something wrong with my query? Please let me know how to resolve the problem (joining). Thank you
You say you're not doing any JOINs, but actually you are. Worse, you are doing CROSS JOINs. By putting 4 tables as you have done in your FROM clause you are implicitly joining all 4 of them together.
In other words, the number of rows produced by the join will be 202685 * 175642 * 199392 * 174947, which is roughly 1.24 × 10^21, a humungous number. That's why your query doesn't complete.
Maybe take a look at the execution graph, which is currently in preview (I can see it in your screenshot above); it might give an indication of what operation is being performed here.
If you want COUNTs of the number of rows in each of those tables then you have to write 4 separate queries, as you have done.
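As a small illustration of the point above, here is a sketch using Python's built-in sqlite3 module. The table names and sizes are made up for the demo. It shows that a comma-separated FROM list multiplies the row counts, and that one scalar subquery per table is an alternative to separate statements if a single result row is wanted. I have only run this form on SQLite, so treat its use on BigQuery as an assumption to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical miniature tables; names and sizes are made up for the demo.
sizes = {"tax_transactions": 3, "orders": 4, "transactions": 5, "order_lines": 6}
for name, n in sizes.items():
    conn.execute(f"CREATE TABLE {name} (seq INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(i,) for i in range(n)])

# Comma-separated FROM list: an implicit cross join, so every count sees 3*4*5*6 rows.
cross_counts = conn.execute(
    "SELECT count(tx.seq), count(o.seq), count(t.seq), count(ol.seq) "
    "FROM tax_transactions tx, orders o, transactions t, order_lines ol"
).fetchone()

# One scalar subquery per table: each count is computed independently.
separate_counts = conn.execute(
    "SELECT (SELECT count(seq) FROM tax_transactions), "
    "       (SELECT count(seq) FROM orders), "
    "       (SELECT count(seq) FROM transactions), "
    "       (SELECT count(seq) FROM order_lines)"
).fetchone()

print(cross_counts)     # every column is the product of the table sizes
print(separate_counts)  # one row count per table
```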
UPDATE, as a demonstration I have a table that has 288 rows in it
select count(*)
from `project.dataset.t` a
returns 288
select count(*)
from `project.dataset.t` a,
`project.dataset.t` b
returns 82944
select count(*)
from `project.dataset.t` a,
`project.dataset.t` b,
`project.dataset.t` c
returns 23887872
select count(*)
from `project.dataset.t` a,
`project.dataset.t` b,
`project.dataset.t` c,
`project.dataset.t` d
returns 6879707136 (about 6.9 billion). That's an enormous number, and that's for a table with only 288 rows in it. Your query will (as I said above) produce roughly 1.24 × 10^21 rows.
Here is the execution graph for my query that returns 6879707136:

Get count and details of each count details

I have a table called 'A' and another table called 'B'.
In table A I keep all the master details, while table B keeps the status of each record in A, such as approved or rejected.
I need a single query with output like this:
{
submitted_count: 5,
{[details of first app], [details of 2 app], [], [],[]},
rejected_count : 2,
{[details of first app],[details of second app]}
}
How would I achieve this?
If you want to convert the result of the query to JSON you need to use the json_agg function.
select json_agg(t)
from (
Select
count(1) as total,
string_agg(tb.detail,',') as details
FROM A tb
inner join B tbb
on tb.id = tbB.id_A
where tbb.status = true
union
Select
count(1) as total,
string_agg(tb.detail,',') as details
FROM A tb
inner join B tbb
on tb.id = tbB.id_A
where tbb.status = false
) t;
The output is a little bit different:
[{"total":2,"details":"Bob,Logan"},{"total":3,"details":"Scott,Jean,Gambit"}]
There is an example here how to use it
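For anyone who wants to try the idea without a PostgreSQL instance, here is a minimal sketch in Python with the built-in sqlite3 module. It is not the json_agg query itself: SQLite's group_concat stands in for string_agg, a single GROUP BY on status replaces the two UNION branches, and the JSON is built in Python. The table contents are invented to resemble the output shown above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data; status 1 means true, 0 means false.
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, detail TEXT);
    CREATE TABLE b (id_a INTEGER, status INTEGER);
    INSERT INTO a VALUES (1,'Bob'),(2,'Logan'),(3,'Scott'),(4,'Jean'),(5,'Gambit');
    INSERT INTO b VALUES (1,1),(2,1),(3,0),(4,0),(5,0);
""")

# One GROUP BY on status yields a count and a detail list per status.
rows = conn.execute("""
    SELECT count(*) AS total, group_concat(tb.detail, ',') AS details
    FROM a tb
    INNER JOIN b tbb ON tb.id = tbb.id_a
    GROUP BY tbb.status
    ORDER BY tbb.status DESC
""").fetchall()

result = [{"total": total, "details": details} for total, details in rows]
print(json.dumps(result))
```

The order of names inside each details string is not guaranteed by group_concat, so do not rely on it.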

Why do we need 'WHERE EXISTS' operator in SQL?

I found this tutorial about the SQL EXISTS operator, and I am trying to understand why we need it.
It seems always possible to replace 'EXISTS' with another expression (shown below).
As an example, this SQL:
SELECT count(SupplierName)
FROM Suppliers
WHERE EXISTS (SELECT * FROM Products WHERE SupplierId = Suppliers.supplierId AND Price < 20);
could be replaced with this SQL:
SELECT count(SupplierName)
FROM Suppliers
WHERE (SELECT count(*) FROM Products WHERE SupplierId = Suppliers.supplierId AND Price < 20) > 0;
I tested it, and got the same result.
Are there any situations where we have to use 'EXISTS'?
Well, in your example, EXISTS is more efficient. The subquery needs to read all matching rows in order to do the count.
EXISTS, by contrast, can (and does!) stop at the first matching row.
You could argue that the SQL engine could identify this situation. However, that would be very, very difficult for more complex queries.
In contrast to your viewpoint, I find EXISTS to be more useful than IN or correlated subqueries with aggregations. It is the preferred method for expressing this logic.
EXISTS is better when it comes to performance, since it needs to retrieve only a single row rather than enumerating all of them.
http://sqlblog.com/blogs/andrew_kelly/archive/2007/12/15/exists-vs-count-the-battle-never-ends.aspx
EXISTS can perform better in some cases.
For example, to execute the query below, the engine has to count all the matching rows in the subquery:
SELECT count(SupplierName)
FROM Suppliers
WHERE (SELECT count(*) FROM Products WHERE SupplierId = Suppliers.supplierId AND Price < 20) > 0;
In this case it can stop as soon as the first matching Products row is found:
SELECT count(SupplierName)
FROM Suppliers
WHERE EXISTS (SELECT * FROM Products WHERE SupplierId = Suppliers.supplierId AND Price < 20);
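If you want to check the equivalence yourself, here is a small sketch using Python's built-in sqlite3 module. The table contents are invented for the demo. It only confirms that the two forms return the same answer; it says nothing about how a given engine executes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data: Acme and Initech each sell something under 20, Globex does not.
conn.executescript("""
    CREATE TABLE Suppliers (SupplierId INTEGER PRIMARY KEY, SupplierName TEXT);
    CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, SupplierId INTEGER, Price REAL);
    INSERT INTO Suppliers VALUES (1,'Acme'),(2,'Globex'),(3,'Initech');
    INSERT INTO Products VALUES (1,1,10),(2,1,15),(3,2,25),(4,3,5);
""")

exists_sql = """
    SELECT count(SupplierName) FROM Suppliers
    WHERE EXISTS (SELECT * FROM Products
                  WHERE Products.SupplierId = Suppliers.SupplierId AND Price < 20)
"""
count_sql = """
    SELECT count(SupplierName) FROM Suppliers
    WHERE (SELECT count(*) FROM Products
           WHERE Products.SupplierId = Suppliers.SupplierId AND Price < 20) > 0
"""

exists_result = conn.execute(exists_sql).fetchone()[0]
count_result = conn.execute(count_sql).fetchone()[0]
print(exists_result, count_result)
```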
In most cases, EXISTS() is also more expressive. Try, for instance, to rewrite the fragment below into a NOT COUNT(*) > 0 version. Which one is more readable?
--
-- All the possible moves for the current board.
-- ,which is := all the (empty spots * all the numbers 1-9)
-- for which the same number does not occur yet in the same {row/col/box}.
--
CREATE VIEW valid_moves AS (
SELECT DISTINCT su.iii AS iii
, su.yyy AS yyy
, su.xxx AS xxx
, su.box AS box
, su.y3 AS y3
, su.x3 AS x3
, su.z3 AS z3
, nn.val AS val
FROM v_sudoku su -- current board, including filled-in numbers
CROSS JOIN all_numbers nn -- Cartesian product here ...
WHERE su.val IS NULL -- empty spots
AND NOT EXISTS (SELECT * FROM v_sudoku ny
WHERE ny.yyy = su.yyy AND ny.val = nn.val
)
AND NOT EXISTS (SELECT * FROM v_sudoku nx
WHERE nx.xxx = su.xxx AND nx.val = nn.val
)
AND NOT EXISTS (SELECT * FROM v_sudoku nz
WHERE nz.box = su.box AND nz.val = nn.val
)
);

Concat sql result in one row Oracle 10g

I'm using Oracle 10.
I have to concatenate the results of two SQL queries into one row.
The first query is :
SELECT DISTINCT F.comments from flight F, task WHERE F.id = task.flight_id and task.name like 'BO%' AND F.comments IS NOT NULL
Which returns :
Initial comment.
And the second query (it concatenates the results of the query into one row):
SELECT (RTRIM(XMLAGG(xmlelement(X, T.comments||',')order by F.id).extract('//text()'),',')) list from flight F, task T where F.id = T.flight_id and T.name like 'BOS%' AND T.comments IS NOT NULL
Which returns :
First comment.,Second comment.,Third comment.
I have to concat the results into one row so I did :
SELECT DISTINCT F.comments from flight F, task WHERE F.id = task.flight_id and task.name like 'BO%' AND F.comments IS NOT NULL
UNION ALL
SELECT (RTRIM(XMLAGG(xmlelement(X, T.comments||',')order by F.id).extract('//text()'),',')) list from flight F, task T where F.id = T.flight_id and T.name like 'BOS%' AND T.comments IS NOT NULL
Which returns two rows: the first holds the result of the first query, and the second holds the result of the second query.
I would like to retrieve them in one row like :
Initial comment.First comment.,Second comment.,Third comment.
Thank you !
Push the two queries into a single subquery in the correct order then apply your xmlagg over top. Something like:
SELECT (RTRIM(XMLAGG(xmlelement(X, comments||',')order by sortorder, id).extract('//text()'),',')) list
from (
SELECT DISTINCT 1 sortorder,
f.id,
F.comments
from flight F, task
WHERE F.id = task.flight_id
and task.name like 'BO%'
AND F.comments IS NOT NULL
union all
SELECT 2, f.id, T.comments
from flight F, task T
where F.id = T.flight_id
and T.name like 'BOS%'
AND T.comments IS NOT NULL )
(pls forgive any minor syntax glitches - I'm away from my database at the moment)
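Here is a rough, runnable sketch of the same idea using Python's built-in sqlite3 module, with invented sample rows. It is not the Oracle query: SQLite has no XMLAGG, so the rows are ordered in SQL and joined in Python instead. One deliberate change is that the second branch sorts by the task id (an assumed column) so that the order of the task comments is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data; task.id is an assumed column used only for ordering.
conn.executescript("""
    CREATE TABLE flight (id INTEGER PRIMARY KEY, comments TEXT);
    CREATE TABLE task (id INTEGER PRIMARY KEY, flight_id INTEGER, name TEXT, comments TEXT);
    INSERT INTO flight VALUES (1, 'Initial comment.');
    INSERT INTO task VALUES
        (1, 1, 'BOS1', 'First comment.'),
        (2, 1, 'BOS2', 'Second comment.'),
        (3, 1, 'BOS3', 'Third comment.');
""")

# Both queries go into one subquery; sortorder keeps the flight comment first.
rows = conn.execute("""
    SELECT comments FROM (
        SELECT DISTINCT 1 AS sortorder, f.id AS id, f.comments AS comments
        FROM flight f, task t
        WHERE f.id = t.flight_id AND t.name LIKE 'BO%' AND f.comments IS NOT NULL
        UNION ALL
        SELECT 2, t.id, t.comments
        FROM flight f, task t
        WHERE f.id = t.flight_id AND t.name LIKE 'BOS%' AND t.comments IS NOT NULL
    )
    ORDER BY sortorder, id
""").fetchall()

# Aggregation happens in Python here, standing in for XMLAGG plus RTRIM.
combined = ",".join(comment for (comment,) in rows)
print(combined)
```

With this approach every piece is separated by a comma, including the flight comment and the first task comment.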

activerecord sql difficult query

I need to make a query with two requirements: first, SELECT DISTINCT ON; second, ORDER BY (ordering by other fields). BTW, HAVING doesn't work.
On an SQL forum I found a solution:
WITH d AS (
SELECT DISTINCT ON ({Dlist}) {slist}
FROM {flist}
....
)
SELECT * FROM d ORDER BY {order fields}
So, how can I do this via ActiveRecord methods and get back an ActiveRecord::Relation?
My full query looks something like this:
WITH d AS (
SELECT DISTINCT ON(item_info_id, volume) items.item_info_id, items.volume, items.*
FROM "items" INNER JOIN "item_info" ON "item_info"."id" = "items"."item_info_id" WHERE "items"."type" IN ('Product')
AND "items"."published" = 't'
AND ("items"."item_info_id" IS NOT NULL)
AND ("items"."price" BETWEEN 2 AND 823489)
)
SELECT * FROM d ORDER BY price
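The code below might work for you or give you some hints:
class Item < ActiveRecord::Base
  # Returns an ActiveRecord::Relation, so further scopes can be chained onto it.
  def self.what_you_want_to_achieve
    # Step 1: one id per (item_info_id, volume) pair via DISTINCT ON.
    item_ids = where("item_info_id IS NOT NULL")
      .select("DISTINCT ON (item_info_id, volume) items.item_info_id, items.volume, items.id")
      .map(&:id)
    # Step 2: filter and order only those rows.
    where(:id => item_ids).published.products.price_between(2, 823489).order(:price)
  end
end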
I assume you know how to define scopes such as published, products and price_between.
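To show the two-step shape of that approach outside Rails, here is a minimal sketch with Python's built-in sqlite3 module and invented rows. SQLite has no DISTINCT ON, so MIN(id) with GROUP BY stands in for it; that substitution is my assumption and picks the lowest id per group, which PostgreSQL's DISTINCT ON does not promise without an ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample rows for the demo.
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, item_info_id INTEGER, volume INTEGER,
                        price REAL, published INTEGER, type TEXT);
    INSERT INTO items VALUES
        (1, 10, 1, 50, 1, 'Product'),
        (2, 10, 1, 40, 1, 'Product'),   -- same (item_info_id, volume) as id 1
        (3, 10, 2, 30, 1, 'Product'),
        (4, 11, 1, 20, 1, 'Product'),
        (5, NULL, 1, 10, 1, 'Product'); -- excluded: no item_info_id
""")

# Step 1: one id per (item_info_id, volume); MIN(id) stands in for DISTINCT ON.
item_ids = [row[0] for row in conn.execute("""
    SELECT MIN(id) FROM items
    WHERE item_info_id IS NOT NULL
    GROUP BY item_info_id, volume
""")]

# Step 2: filter and order only those rows.
placeholders = ",".join("?" for _ in item_ids)
ordered = conn.execute(f"""
    SELECT id, price FROM items
    WHERE id IN ({placeholders})
      AND published = 1 AND type = 'Product' AND price BETWEEN 2 AND 823489
    ORDER BY price
""", item_ids).fetchall()
print(ordered)
```

One thing to keep in mind: here, as in the Ruby code, the published, type and price filters run after the distinct ids are picked, whereas the original CTE applies them before DISTINCT ON, so the two can return different rows.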