Complex count based on the latest date of a month - sql

I have a model:
class HistoricalRecord(models.Model):
    history_id = models.CharField(max_length=32)
    type = models.CharField(max_length=8)
    history_date = models.DateField()
How can I get the count of each type of HistoricalRecord, counting only the latest object (per history_id) for a given month? For example:
With these example objects:
HistoricalRecord.objects.create(history_id="ABC1", type="A", history_date=date(2000, 10, 5))
HistoricalRecord.objects.create(history_id="ABC1", type="A", history_date=date(2000, 10, 27))
HistoricalRecord.objects.create(history_id="DEF1", type="A", history_date=date(2000, 10, 16))
HistoricalRecord.objects.create(history_id="ABC1", type="B", history_date=date(2000, 10, 8))
The result should be:
[
    {
        "type": "A",
        "type_count": 2
    },
    {
        "type": "B",
        "type_count": 0
    }
]
"A" is 2 because the latest HistoryRecord object with history_id "ABC1" is on the 27th and the type is A; the other one is the record with history_id "DEF1".
I've tried:
HistoricalRecord.objects.filter(history_date__range=(month_start, month_end)).order_by("type").values("type").annotate(type_count=Count("type"))
but obviously this is incorrect since it gets all the values for the month. The structure of the result doesn't have to be exactly like above, as long as it clearly conveys the count of each type.

This can likely be done with .extra(); add this to the query:
.extra(
    where=["""history_date = (SELECT MAX(history_date) FROM historical_record hr
               WHERE hr.history_id = historical_record.history_id
               AND hr.history_date < %s)"""],
    params=[month_end]
)
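Putting it together with the filter from the question, the whole chain might look like the sketch below. month_start and month_end come from the question; the variable name latest_per_history_id is just for illustration, and the raw table name historical_record must match your real table (Django's default would include the app label, e.g. myapp_historicalrecord):

from django.db.models import Count

# Keep only rows whose history_date is the latest one for their history_id.
latest_per_history_id = """history_date = (SELECT MAX(history_date) FROM historical_record hr
                            WHERE hr.history_id = historical_record.history_id
                            AND hr.history_date < %s)"""

qs = (
    HistoricalRecord.objects
    .filter(history_date__range=(month_start, month_end))
    .extra(where=[latest_per_history_id], params=[month_end])
    .values("type")
    .annotate(type_count=Count("type"))
    .order_by("type")
)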

Related

How can I modify all values that match a condition inside a json array?

I have a table which has a JSON column called people like this:
Id | people
---+-----------------------------------------
 1 | [{ "id": 6 }, { "id": 5 }, { "id": 3 }]
 2 | [{ "id": 2 }, { "id": 3 }, { "id": 1 }]
...and I need to update the people column and put a 0 at the path $[*].id wherever id = 3, so after executing the query the table should end up like this:
Id | people
---+-----------------------------------------
 1 | [{ "id": 6 }, { "id": 5 }, { "id": 0 }]
 2 | [{ "id": 2 }, { "id": 0 }, { "id": 1 }]
There may be more than one match per row.
Honestly, I haven't tried any query yet, since I cannot figure out how to loop inside a field, but my idea was something like this:
UPDATE mytable
SET people = JSON_SET(people, '$[*].id', 0)
WHERE /* ...something should go here */
This is my version:
SELECT VERSION()
+-----------------+
| version()       |
+-----------------+
| 10.4.22-MariaDB |
+-----------------+
If the id values in people are unique, you can use a combination of JSON_SEARCH and JSON_REPLACE to change the values:
UPDATE mytable
SET people = JSON_REPLACE(people, JSON_UNQUOTE(JSON_SEARCH(people, 'one', 3)), 0)
WHERE JSON_SEARCH(people, 'one', 3) IS NOT NULL
Note that the WHERE clause is necessary to prevent the query from replacing values with NULL when the value is not found: JSON_SEARCH returns NULL, which then causes JSON_REPLACE to return NULL as well.
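A minimal illustration of the failure mode that the WHERE clause guards against:
-- No match here, so JSON_SEARCH returns NULL...
SELECT JSON_SEARCH('[{"id": 6}]', 'one', 3);
-- ...and a NULL path argument makes JSON_REPLACE return NULL,
-- which would overwrite people with NULL if the WHERE guard were missing:
SELECT JSON_REPLACE('[{"id": 6}]', JSON_SEARCH('[{"id": 6}]', 'one', 3), 0);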
If the id values are not unique, you will have to rely on string replacement, preferably using REGEXP_REPLACE to deal with possible differences in spacing in the values (and also to avoid replacing the 3 in, for example, 23 or 34):
UPDATE mytable
SET people = REGEXP_REPLACE(people, '("id"\\s*:\\s*)3\\b', '\\10')
Demo on dbfiddle
As stated in the official documentation, MySQL stores JSON-format strings in a string column; for this reason you can use either the JSON_SET function or any string function.
For your specific task, applying the REPLACE string function may suit your case:
UPDATE mytable
SET people = REPLACE(people, CONCAT('"id": ', 3, ' '), CONCAT('"id": ', 0, ' '))
WHERE ....;

Bulk update in PostgreSQL by unnesting a JSON array

I want to do a batch update in PostgreSQL in one go by passing in an array of JSON objects, but I am not sure how I should approach the problem.
An example:
[
    { "oldId": 25, "newId": 30 },
    { "oldId": 41, "newId": 53 }
]
should resolve as:
UPDATE table SET id = 30 WHERE id = 25 and UPDATE table SET id = 53 WHERE id = 41, in a single command, of course.
Use the function jsonb_array_elements() in the from clause:
update my_table
set id = (elem->'newId')::int
from jsonb_array_elements(
    '[
        { "oldId": 25, "newId": 30 },
        { "oldId": 41, "newId": 53 }
    ]') as elem
where id = (elem->'oldId')::int
Note that if the column id is unique (primary key) the update may result in a duplication error depending on the data provided.
Db<>fiddle.
You need to unnest the array, then cast the elements to the proper data type:
update the_table
set id = (x.item ->> 'newId')::int
from jsonb_array_elements('[...]') x(item)
where the_table.id = (x.item ->> 'oldId')::int
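In application code you would normally bind the JSON array as a query parameter rather than inlining it; a minimal sketch, assuming a driver that uses PostgreSQL positional placeholders such as $1 (e.g. node-postgres):
-- $1 is bound to the JSON array string by the driver
update the_table
set id = (x.item ->> 'newId')::int
from jsonb_array_elements($1::jsonb) as x(item)
where the_table.id = (x.item ->> 'oldId')::int;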

How to write a CASE clause with another column as a condition using knex.js

So my code is like the one below:
.select('id','units',knex.raw('case when units > 0 then cost else 0 end'))
but it gives me an error like this one:
hint: "No operator matches the given name and argument type(s). You might need to add explicit type casts."
Any idea how I should write my code so I can use one column as a condition for a different column?
I don't get the same error you do:
CASE types integer and character varying cannot be matched
but regardless, the issue is that you're trying to compare apples and oranges. Postgres is quite strict on column types, so attempting to put an integer 0 and a string (value of cost) in the same column does not result in an implicit cast.
Turning your output into a string does the trick:
.select(
  "id",
  "units",
  db.raw("CASE WHEN units > 0 THEN cost ELSE '0' END AS cost")
)
Sample output:
[
  { id: 1, units: null, cost: '0' },
  { id: 2, units: 1.2, cost: '2.99' },
  { id: 3, units: 0.9, cost: '4.50' },
  { id: 4, units: 5, cost: '1.23' },
  { id: 5, units: 0, cost: '0' }
]
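If you would rather get a numeric cost back, an alternative is to cast the column instead of stringifying the 0; a sketch assuming every cost value is a numeric string (Postgres will raise an error on any row where the cast fails):
.select(
  "id",
  "units",
  // cast the varchar column so both CASE branches are numeric
  db.raw("CASE WHEN units > 0 THEN cost::numeric ELSE 0 END AS cost")
)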

Django rest with raw

Sorry, I have a problem with raw(): I'm trying to add up the total sales for each month, but I have an error.
This is my view:
class TotalSale(ListAPIView):
    serializer_class = TotalSaleSerealizer

    def get_queryset(self):
        queryset = Sale.objects.raw(
            "SELECT 1 id, SUM(totalprice), to_char(datesale,'yyyy-MM') "
            "FROM sales_sale GROUP BY to_char(datesale,'yyyy-MM')")
        return queryset
I am using to_char to change the format of my date so that I can calculate the sales for each month. This query works well when I run it directly in PostgreSQL, but when I run it from Django it does not return the correct data.
1,'1197','2018-10'
1,'612','2018-09'
1,'1956','2018-08'
which is correct: the sum of the sales for each month.
But when I run it in Django, this is what I get:
{
    "id": 1,
    "totalprice": 144,
    "datesale": "2018-08-06"
},
{
    "id": 1,
    "totalprice": 144,
    "datesale": "2018-08-06"
},
{
    "id": 1,
    "totalprice": 144,
    "datesale": "2018-08-06"
}
I think the error comes from the 1 id: it only returns the data that has id 1. My question is why that happens and how I can solve it. I tried removing the 1 id, but then another error comes up. How can I fix this problem?
I think the rest is irrelevant, since the view is the most important part, but here is the remaining code:
class TotalSaleSerealizer(ModelSerializer):
    class Meta:
        model = Sale
        fields = ('id', 'totalprice', 'datesale')

class Sale(models.Model):
    id_customer = models.ForeignKey(Customer, null=False, blank=False,
                                    on_delete=models.CASCADE)
    id_user = models.ForeignKey(User, null=False, blank=False,
                                on_delete=models.CASCADE)
    datesale = models.DateField(auto_now_add=True)
    totalprice = models.IntegerField()
These are the serializer and the model.
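For reference, the same per-month totals can be computed without raw(), which sidesteps the fake-id problem entirely; a minimal ORM sketch (assumes Django 1.10+ for TruncMonth):

from django.db.models import Sum
from django.db.models.functions import TruncMonth

# Group sales by calendar month and sum totalprice per group.
totals = (
    Sale.objects
    .annotate(month=TruncMonth('datesale'))
    .values('month')
    .annotate(total=Sum('totalprice'))
    .order_by('month')
)
# Each row is a dict like {'month': date(2018, 8, 1), 'total': 1956}.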

BigQuery: Select entire repeated field with group

I'm using Legacy SQL, but am not strictly limited to it (though it does have some functions I find useful, "HASH" for example).
Anyhow, the simple task is that I want to group by one top level column, while still keeping the first instance of a nested+repeated set of data alongside.
So, the following "works", and produces nested output:
SELECT
  cd,
  subarray.*
FROM [magicalfairy.land]
Now I attempt to just grab the entire first subarray (honestly, I don't expect this to work, of course).
The following is what doesn't work:
SELECT
  cd,
  FIRST(subarray.*)
FROM [magicalfairy.land]
GROUP BY cd
Any alternate approaches would be appreciated.
Edit: an example of the expected data behaviour.
If the input data were roughly:
[
    {
        "cd": "something",
        "subarray": [
            { "hello": 1, "world": 1 },
            { "hello": 2, "world": 2 }
        ]
    },
    {
        "cd": "something",
        "subarray": [
            { "hello": 1, "world": 1 },
            { "hello": 2, "world": 2 }
        ]
    }
]
I would expect to get out:
[
    {
        "cd": "something",
        "subarray": [
            { "hello": 1, "world": 1 },
            { "hello": 2, "world": 2 }
        ]
    }
]
You'll have a much better time preserving the structure using standard SQL, e.g.:
WITH T AS (
  SELECT
    cd,
    ARRAY<STRUCT<x INT64, y BOOL>>[
      STRUCT(off, MOD(off, 2) = 0),
      STRUCT(off - 1, false)] AS subarray
  FROM UNNEST([1, 2, 1, 2]) AS cd WITH OFFSET off)
SELECT
  cd,
  ANY_VALUE(subarray) AS subarray
FROM T
GROUP BY cd;
ANY_VALUE will return some value of subarray for each group. If you wanted to concatenate the arrays instead, you could use ARRAY_CONCAT_AGG.
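For instance, a concatenating variant of the query above (a sketch reusing the same T):
SELECT
  cd,
  -- merge all subarrays in the group into one array
  ARRAY_CONCAT_AGG(subarray) AS subarray
FROM T
GROUP BY cd;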
To run this against your table, try the below:
SELECT
  cd,
  ANY_VALUE(subarray) AS subarray
FROM `magicalfairy.land`
GROUP BY cd
Try below (BigQuery Standard SQL)
SELECT cd, subarray
FROM (
  SELECT cd, subarray,
    ROW_NUMBER() OVER(PARTITION BY cd) AS num
  FROM `magicalfairy.land`
)
WHERE num = 1
This gives you the expected result, the equivalent of an "ANY ARRAY".
This solution can be extended to a "FIRST ARRAY" by adding ORDER BY sort_col to the OVER() clause, assuming that sort_col defines the logical order; see the sketch below.
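A minimal sketch of that extension, with sort_col standing in for whatever column defines the order:
SELECT cd, subarray
FROM (
  SELECT cd, subarray,
    -- pick the first subarray per cd according to sort_col
    ROW_NUMBER() OVER(PARTITION BY cd ORDER BY sort_col) AS num
  FROM `magicalfairy.land`
)
WHERE num = 1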