I have a tasks table in BigQuery with a created date and a last modified date. I would like to report the number of task open and task close events by date, in the same table if possible.
view: tasks {
derived_table: {
sql:
SELECT *
FROM UNNEST(ARRAY<STRUCT<CREATED_DATE DATE, LAST_MODIFIED DATE, ID INT64, STATE STRING>>[
('2020-12-01', '2020-12-01', 1, "OPEN"),
('2020-12-01', '2020-12-03', 2, "CLOSED"),
('2020-12-02', '2020-12-03', 3, "CLOSED"),
('2020-12-03', '2020-12-05', 4, "OPEN"),
('2020-12-05', '2020-12-05', 5, "CLOSED")])
;;
}
dimension_group: created {
type: time
datatype: date
sql: ${TABLE}.created_date ;;
}
dimension_group: last_modified {
type: time
datatype: date
sql: ${TABLE}.last_modified ;;
}
dimension: id {
type: number
}
dimension: state {
type: string
}
measure: number_of_tasks {
type: count_distinct
sql: ${id} ;;
}
measure: number_of_open_tasks {
type: count_distinct
sql: ${id} ;;
filters: {
field: "state"
value: "OPEN"
}
}
measure: number_of_closed_tasks {
type: count_distinct
sql: ${id} ;;
filters: {
field: "state"
value: "CLOSED"
}
}
}
explore: tasks {}
I can get the number of opened tasks using the created date.
I can get the number of closed tasks with a filtered measure, counting tasks whose last modified date falls in the aggregation period and whose state is CLOSED.
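In plain SQL these are two separate aggregations over two different date columns; a rough sketch (tasks stands in for the derived table above, this is not Looker-generated SQL):

-- opened tasks per day
SELECT created_date, COUNT(DISTINCT id) AS opened_tasks
FROM tasks -- the derived table defined above
GROUP BY created_date;

-- closed tasks per day (the filtered measure)
SELECT last_modified, COUNT(DISTINCT id) AS closed_tasks
FROM tasks
WHERE state = 'CLOSED'
GROUP BY last_modified;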
However, if I try to combine these in a single table I get a row for each combination of dates.
How can I count task state changes by date?
| Date       | Number of Opened Tasks | Number of Closed Tasks |
+------------+------------------------+------------------------+
| 2020-12-01 | 2                      | 0                      |
| 2020-12-02 | 1                      | 0                      |
| 2020-12-03 | 1                      | 2                      |
| 2020-12-04 | 0                      | 0                      |
| 2020-12-05 | 1                      | 1                      |
A colleague has suggested a solution: stacking the tasks table on itself creates (up to) two rows per task, one for the open event and one for the close event.
view: tasks {
derived_table: {
sql:
WITH tab AS (
SELECT *
FROM UNNEST(ARRAY<STRUCT<CREATED_DATE DATE, LAST_MODIFIED DATE, ID INT64, STATE STRING>>[
('2020-12-01', '2020-12-01', 1, "OPEN"),
('2020-12-01', '2020-12-03', 2, "CLOSED"),
('2020-12-02', '2020-12-03', 3, "CLOSED"),
('2020-12-03', '2020-12-05', 4, "OPEN"),
('2020-12-05', '2020-12-05', 5, "CLOSED")])
)
SELECT *, 1 open_count, 0 closed_count, created_date AS action_date
FROM tab
UNION DISTINCT
SELECT *, 0 open_count, 1 closed_count, last_modified AS action_date
FROM tab
WHERE state = "CLOSED"
;;
}
dimension_group: created {
type: time
datatype: date
sql: ${TABLE}.created_date ;;
}
dimension_group: last_modified {
type: time
datatype: date
sql: ${TABLE}.last_modified ;;
}
dimension_group: action {
type: time
datatype: date
sql: ${TABLE}.action_date ;;
}
dimension: id {
type: number
}
dimension: state {
type: string
}
dimension: open_count {
type: number
hidden: yes
}
dimension: closed_count {
type: number
hidden: yes
}
measure: number_opened {
type: sum
sql: ${open_count} ;;
}
measure: number_closed {
type: sum
sql: ${closed_count} ;;
}
}
explore: tasks {}
The open and close events can then be counted against the action date.
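Roughly, the aggregation this enables looks like the following (a sketch only, not the SQL Looker generates; stacked_tasks stands in for the stacked derived table above):

SELECT action_date,
       SUM(open_count)   AS number_opened,
       SUM(closed_count) AS number_closed
FROM stacked_tasks -- placeholder name for the derived table defined above
GROUP BY action_date
ORDER BY action_date;

Dates with no events (2020-12-04 in the example) will only appear if missing dates are filled in, for example with a calendar table or the dimension fill option in the Looker visualization.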
Related
Given a table Location, where each row has many related records in a table Event, for example:
Location has columns: id, city, state
Event has columns: id, name, date, location_id
I want to perform a query that results in a structure like this (json array shown here):
[
  {
    location: {
      id: 1,
      city: 'San Francisco',
      state: 'California'
    },
    events: [
      {
        id: 1,
        name: 'Fest 1',
        date: 'March 1, 2022'
      },
      {
        id: 2,
        name: 'Fest 2',
        date: 'March 2, 2022'
      }
    ]
  },
  {
    location: {
      id: 2,
      city: 'Seattle',
      state: 'Washington'
    },
    events: [
      {
        id: 3,
        name: 'Fest 3',
        date: 'March 3, 2022'
      },
      {
        id: 4,
        name: 'Fest 4',
        date: 'March 4, 2022'
      }
    ]
  }
]
I'm struggling to achieve this. I've tried various subqueries and GROUP BY approaches, but I'm not getting what I need.
Use json_agg() to create the array and to_jsonb() to convert the rows to json:
select
to_jsonb(location) as location,
json_agg(to_jsonb(event)) as events
from location
left join event on event.location_id = location.id
group by 1
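One caveat (a sketch, assuming Postgres 9.4+ for the FILTER clause): a location with no events still produces one joined row of NULLs, so json_agg() returns [null] for it. Filtering the aggregate keeps the array empty instead:

select
  to_jsonb(location) as location,
  -- skip the all-NULL row produced by the left join for event-less locations
  coalesce(json_agg(to_jsonb(event)) filter (where event.id is not null), '[]'::json) as events
from location
left join event on event.location_id = location.id
group by 1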
I want to convert the following SQL query to MongoDB query:
SELECT count(invoiceNo), year, month, manager
FROM battle
WHERE year=2021 AND month='Dec' OR year=2022 AND month='Jan' AND manager = 'name#test.com'
GROUP BY year,month;
I've tried to do so, but it seems to be incorrect:
const getNoOfOrders = await BattlefieldInfo.aggregate([
{
$match: {
$and: [
{
year: periodDate[0]['year']
},
{ month: periodDate[0]['month'] }
],
$or: [
{
$and: [
{
year: prevYear
},
{ month: prevMonth }
]
}
],
$and: [{ manager: email }]
}
},
{
$group: {
_id: '$month'
}
},
{
$project: {
// noOfOrders: { $count: '$invoiceNo' },
month: 1,
year: 1,
manager: 1
}
}
]);
I am getting an empty array, but the result should be something like this:
| count(invoiceNo) | manager       | year | month |
+------------------+---------------+------+-------+
| 2                | name#test.com | 2021 | Dec   |
| 3                | name#test.com | 2022 | Jan   |
From my point of view, parentheses (brackets) are important to group the conditions together, such as the month and year pairs, so that the manager condition applies to both:
SELECT count(invoiceNo), `year`, month, manager
FROM battle
WHERE ((`year` = 2021 AND month = 'Dec')
    OR (`year` = 2022 AND month = 'Jan'))
  AND manager = 'abc#email.com'
GROUP BY month, `year`
Sample DBFiddle
The same goes for your MongoDB query. To match on both month and year you don't need $and; you can simply write:
{
year: 2021,
month: "Dec"
}
Instead of:
$and: [
{
year: 2021
},
{
month: "Dec"
}
]
Also, make sure the $group stage has an accumulator operator:
noOfOrders: {
$count: {}
}
Or
noOfOrders: {
$sum: 1
}
Complete MongoDB query
db.collection.aggregate([
{
$match: {
$or: [
{
year: 2021,
month: "Dec"
},
{
year: 2022,
month: "Jan"
}
],
manager: "abc#email.com"
}
},
{
$group: {
_id: {
month: "$month",
year: "$year"
},
noOfOrders: {
$count: {}
},
manager: {
$first: "$manager"
}
}
},
{
$project: {
_id: 0,
noOfOrders: 1,
month: "$_id.month",
year: "$_id.year",
manager: "$manager"
}
}
])
Sample Mongo Playground
Note:
It would be better, for both queries, to add manager as one of the group keys. Since you are filtering for a single manager's records it is fine here, but without that filter the query would produce the wrong output.
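For example, with manager added to the group key, the SQL version would read (a sketch):

SELECT count(invoiceNo), `year`, month, manager
FROM battle
WHERE ((`year` = 2021 AND month = 'Dec')
    OR (`year` = 2022 AND month = 'Jan'))
  AND manager = 'abc#email.com'
GROUP BY `year`, month, manager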
I'm struggling to see how I would represent the following type of postgres SQL query in a cube.js schema:
SELECT
CASE
WHEN COUNT(tpp.net_total_amount) > 0 THEN
SUM(tpp.net_total_amount) / COUNT(tpp.net_total_amount)
ELSE
NULL
END AS average_spend_per_customer
FROM
(
SELECT
SUM(ts.total_amount) AS net_total_amount
FROM
postgres.transactions AS ts
WHERE
ts.transaction_date >= '2020-11-01' AND
ts.transaction_date < '2020-12-01'
GROUP BY
ts.customer_id,
ts.event_id
) AS tpp
;
I had the feeling that pre-aggregations might be what I'm after, but that doesn't seem to be the case after looking into them. I can get a list of total amount spent per customer per event with the following schema:
cube(`TransactionTotalAmountByCustomerAndEvent`, {
sql: `SELECT * FROM postgres.transactions`,
joins: {
},
measures: {
sum: {
sql: `SUM(total_amount)`,
type: `number`
}
},
dimensions: {
eventId: {
sql: `event_id`,
type: `string`
},
customerId: {
sql: `customer_id`,
type: `string`
},
transactionDate: {
sql: `transaction_date`,
type: `time`
}
},
preAggregations: {
customerAndEvent: {
type: `rollup`,
measureReferences: [sum],
dimensionReferences: [customerId, eventId]
}
}
});
But that is really just giving me the output of the inner SELECT statement grouped by customer and event. How do I query the cube to get the average customer spend per event figure I'm after?
You might find it easier to model the dataset as two different cubes, Customers and Transactions. You'll then need to set up a join between the cubes and then create a special dimension with the subQuery property set to true. I've included an example below to help you understand:
cube('Transactions', {
sql: `SELECT * FROM postgres.transactions`,
measures: {
spend: {
sql: `total_amount`,
type: `sum`, // aggregate, so the subQuery dimension below yields total spend per customer
},
},
dimensions: {
eventId: {
sql: `event_id`,
type: `string`
},
customerId: {
sql: `customer_id`,
type: `string`
},
transactionDate: {
sql: `transaction_date`,
type: `time`
},
},
})
cube('Customers', {
sql: `SELECT DISTINCT customer_id FROM postgres.transactions`, // one row per customer
joins: {
Transactions: {
relationship: `hasMany`,
sql: `${Customers.id} = ${Transactions.customerId}` // reference the cube members so the right columns are used
}
},
measures: {
averageSpend: {
sql: `${spendAmount}`,
type: `avg`,
},
},
dimensions: {
id: {
sql: `customer_id`,
type: `string`
},
spendAmount: {
sql: `${Transactions.spend}`,
type: `number`,
subQuery: true
},
}
})
You can find more information on the Subquery page in the documentation.
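For reference, the calculation this models is roughly the following (a conceptual sketch only, not the SQL Cube.js generates; the original query also grouped by event_id and filtered by transaction_date, which would map to an extra dimension and a date range filter on transactionDate):

SELECT AVG(customer_totals.spend) AS average_spend_per_customer
FROM (
  -- one row per customer: total spend (what the subQuery dimension provides)
  SELECT customer_id, SUM(total_amount) AS spend
  FROM postgres.transactions
  GROUP BY customer_id
) AS customer_totals;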
I have quite an issue here. I have three tables:
@Entity()
class Vessel {
  @OneToMany(() => WorkOrder, (workOrder) => workOrder.vessel)
  workOrders: WorkOrder[];
}

@Entity()
class WorkOrder {
  @ManyToOne(() => Vessel, (vessel) => vessel.workOrders)
  vessel: Vessel;

  @OneToOne(() => Order, (order) => order.workOrder)
  order: Order;
}

@Entity()
class Order {
  @OneToOne(() => WorkOrder, (workOrder) => workOrder.order)
  workOrder: WorkOrder;

  @Column()
  paid: boolean;
}
I need to write a query that fetches all vessels with their workOrders and each workOrder's order. A workOrder should ONLY be selected if its order has paid: false.
So if a vessel has no workOrders whose order has paid = false, it should still be selected, but with an empty workOrders array.
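In SQL terms, what I'm after is essentially a left join where the paid = false condition lives in the join rather than in a WHERE clause (a sketch using the simplified column names from the entities above; in my real schema the flag is isInvoiced):

SELECT vessel.*, work_order.*, o.*
FROM vessel
LEFT JOIN (work_order
           JOIN "order" o
             ON o.id = work_order."orderId"
            AND o.paid = false) -- paid is the simplified name; the real column is isInvoiced
       ON work_order."vesselId" = vessel.id;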
The data would look something like this:
vessels: [
{
id: 123,
workOrders: [] // or null, or not selected, doesn't really matter
},
{
id: 1234,
workOrders: [
{
id: 99,
order: {
id: 12345,
paid: false
}
},
{
id: 88,
order: {
id: 321,
paid: false
}
},
]
}
]
I tried this:
const vessels = await getConnection()
.createQueryBuilder()
.select('vessel')
.from('vessel', 'vessel')
.leftJoinAndSelect(
'vessel.workOrders',
subQuery =>
subQuery
.select('workOrder.*')
.from(WorkOrder, 'workOrder')
.leftJoinAndSelect('workOrder.order', 'order')
.where('order.isInvoiced = :isInvoiced', { isInvoiced: false }),
'workOrders',
'workOrders.vesselId = vessel.id'
)
.getMany();
Which generates the following SQL:
SELECT
"vessel"."id" AS "vessel_id",
"vessel"."vesselName" AS "vessel_vesselName",
"vessel"."isDeactivated" AS "vessel_isDeactivated",
"vessel"."externalID" AS "vessel_externalID",
"vessel"."externalDepartment" AS "vessel_externalDepartment",
"vessel"."operationalPlan" AS "vessel_operationalPlan",
"vessel"."planLastUpdated" AS "vessel_planLastUpdated",
"vessel"."phoneNumber" AS "vessel_phoneNumber",
"vessel"."email" AS "vessel_email",
"vessel"."info" AS "vessel_info",
"vessel"."vesselPictureId" AS "vessel_vesselPictureId",
"vessel"."categoryId" AS "vessel_categoryId",
"vessel"."departmentId" AS "vessel_departmentId",
"vessel"."externalClientId" AS "vessel_externalClientId",
"workOrders".*
FROM
"vessel" "vessel"
LEFT JOIN (
SELECT
"order"."id" AS "order_id",
"order"."orderDate" AS "order_orderDate",
"order"."isInvoiced" AS "order_isInvoiced",
"order"."isApproved" AS "order_isApproved",
"order"."isEmergency" AS "order_isEmergency",
"order"."isDisease" AS "order_isDisease",
"order"."description" AS "order_description",
"order"."when" AS "order_when",
"order"."precautions" AS "order_precautions",
"order"."info" AS "order_info",
"order"."orderNumber" AS "order_orderNumber",
"order"."customerId" AS "order_customerId",
"order"."locationId" AS "order_locationId",
"order"."customerUserId" AS "order_customerUserId",
"order"."categoryId" AS "order_categoryId",
"order"."orderStatusId" AS "order_orderStatusId",
"order"."departmentId" AS "order_departmentId",
workOrder.*
FROM
"work_order" "workOrder"
LEFT JOIN "order" "order" ON "order"."id" = "workOrder"."orderId"
WHERE
"order"."isInvoiced" = 0
) "workOrders" ON workOrders.vesselId = "vessel"."id"
Which didn't work. I did, however, get the correct data using getRawMany, but with the following result:
[
{
id: null, // workOrder id
order_id: null,
paid: null,
vessel_id: 123
},
{
id: 99, // workOrder id
order_id: 12345,
order_paid: false,
vessel_id: 1234
},
{
id: 88, // workOrder id
order_id: 321,
order_paid: false,
vessel_id: 1234
},
]
How can I get the data I want, formatted as I want?
I wonder how we can do the following translation from SQL to MongoDB.
Assume the table, named table, has the following structure:
| id | contribution | time          |
+----+--------------+---------------+
| 1  | 300          | Jan 2, 1990   |
| 2  | 1000         | March 3, 1991 |
And I want to find a ranking of ids in descending order of their number of contributions.
This is what I do using SQL:
select id, count(*) c from table group by id order by c desc;
How can I translate this SQL into MongoDB using count(), order() and group()?
Thank you very much!
Setting up test data with:
db.donors.insert({donorID:1,contribution:300,date:ISODate('1990-01-02')})
db.donors.insert({donorID:2,contribution:1000,date:ISODate('1991-03-03')})
db.donors.insert({donorID:1,contribution:900,date:ISODate('1992-01-02')})
You can use the new Aggregation Framework in MongoDB 2.2:
db.donors.aggregate(
{ $group: {
_id: "$donorID",
total: { $sum: "$contribution" },
donations: { $sum: 1 }
}},
{ $sort: {
donations: -1
}}
)
To produce the desired result:
{
"result" : [
{
"_id" : 1,
"total" : 1200,
"donations" : 2
},
{
"_id" : 2,
"total" : 1000,
"donations" : 1
}
],
"ok" : 1
}
Check the MongoDB aggregation documentation:
db.collection.group({
  key: { id: true },
  reduce: function(obj, prev) { prev.sum += 1; },
  initial: { sum: 0 }
});
After you get the result, sort it by sum.