Django field widget doesn't show appropriate attribute - sql

I'm using Django, and this is a question about how to organize models or, equivalently, how to organize tables in SQL.
At the moment I have a table where each row contains a primary key, a "value" (a float that is a multiple of 0.01) and an "amount" (an integer). This is how I need to store the data.
However, I need to serve it differently: I need to sum the "amount"s over rows with the same "value".
For example, my table is
| id | value | amount |
| 1 | 1.2 | 10 |
| 2 | 1.2 | 27 |
| 3 | 1.2 | 4 |
| 4 | 1.3 | 21 |
| 5 | 1.3 | 1 |
| 6 | 1.4 | 5 |
| 7 | 1.4 | 9 |
For my app I need to serve this as
| value | amount |
| 1.2 | 41 |
| 1.3 | 22 |
| 1.4 | 14 |
Now my question is: what is the best way to do this? Should I generate the second table from the first every time I need to serve it? Or should I add a new model to my app that gets updated every time my current model gets updated, so that it contains redundant information but gets the job done faster?
EDIT:
qb = Order.objects.filter(
    models.Q(status='B') | models.Q(status='K')
).filter(
    side='L', market__pk=self.pk
).order_by(
    '-value'
).values('value').annotate(amount_sum=Sum('amount'))
The output is
[{'amount_sum': 22, 'value': Decimal('1.3')}, {'amount_sum': 41, 'value': Decimal('1.2')}]

from django.db.models import Sum
MyTable.objects.values('value').annotate(amount_sum=Sum('amount'))
This returns a QuerySet of dictionaries, each containing value and amount_sum. The name amount_sum is arbitrary; you can call it whatever you like.
Django doc for Sum
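The values().annotate() call above corresponds roughly to a GROUP BY in SQL. The following is a minimal sketch of that idea using Python's built-in sqlite3 module rather than Django; the table name my_table and the helper summed_amounts are hypothetical, and the rows are the sample data from the question.

```python
import sqlite3

# Sample rows from the question: (id, value, amount).
ROWS = [
    (1, 1.2, 10), (2, 1.2, 27), (3, 1.2, 4),
    (4, 1.3, 21), (5, 1.3, 1),
    (6, 1.4, 5), (7, 1.4, 9),
]

def summed_amounts(rows):
    """Return (value, total amount) pairs, one per distinct value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, value REAL, amount INTEGER)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    # Group rows that share a value and sum their amounts.
    result = conn.execute(
        "SELECT value, SUM(amount) AS amount_sum "
        "FROM my_table GROUP BY value ORDER BY value"
    ).fetchall()
    conn.close()
    return result

print(summed_amounts(ROWS))
```

Because the grouping is computed at query time, this sketch needs no second table; whether that is fast enough depends on the table size and indexing, which the question does not specify.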

Related

How to manage relationships between a main table and a variable number of secondary tables in Postgresql

I am trying to create a postgresql database to store the performance specifications of wind turbines and their characteristics.
The way I have structured this in my head is the following:
A main table with a unique id for each turbine model as well as basic information about them (rotor size, max power, height, manufacturer, model id, design date, etc.)
example structure of the "main" table holding all of the main turbine characteristics
| turbine_model | rotor_size | height | max_power | etc. |
| model_x1 | 200 | 120 | 15 | etc. |
| model_b7 | 250 | 145 | 18 | etc. |
A lookup table for each turbine model storing how much power it produces at a given wind speed, with one column for wind speed and another column for power output. There will be as many of these tables as there are rows in the main table.
example table "model_x1":
| wind_speed | power_output |
| 1 | 0.5 |
| 2 | 1.5 |
| 3 | 2.0 |
| 4 | 2.7 |
| 5 | 3.2 |
| 6 | 3.9 |
| 7 | 4.9 |
| 8 | 7.0 |
| 9 | 10.0 |
However, I am struggling to find a way to implement this as I cannot find a way to build relationships between each row of the "main" table and the lookup tables. I am starting to think this approach is not suited for a relational database.
How would you design a database to solve this problem?
A relational database is perfect for this, but you will want to learn a little bit about normalization to design the layout of the tables.
Basically, you'll want to add a 3rd column to your poweroutput reference table so that each model is just more rows (grow long, not wide).
Here is an example of what I mean. I took it a step further to show that you might want reference data for other metrics in addition to wind speed (rpm in this case).
PowerOutput Reference Table
+----------+--------+------------+-------------+
| model_id | metric | metric_val | poweroutput |
+----------+--------+------------+-------------+
| model_x1 | wind | 1 | 0.5 |
| model_x1 | wind | 2 | 1.5 |
| model_x1 | wind | 3 | 3 |
| ... | ... | ... | ... |
| model_x1 | rpm | 1250 | 1.5 |
| model_x1 | rpm | 1350 | 2.5 |
| model_x1 | rpm | 1450 | 3.5 |
| ... | ... | ... | ... |
| model_bg | wind | 1 | 0.7 |
| model_bg | wind | 2 | 0.9 |
| model_bg | wind | 3 | 1.2 |
| ... | ... | ... | ... |
| model_bg | rpm | 1250 | 1 |
| model_bg | rpm | 1350 | 1.5 |
| model_bg | rpm | 1450 | 2 |
+----------+--------+------------+-------------+
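As a rough illustration of the "long, not wide" layout, here is a minimal sketch using Python's built-in sqlite3 module instead of PostgreSQL. The table names turbine and power_output, the column types, and the helper power_curve are assumptions made for the example; the sample values come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE turbine (
    model_id   TEXT PRIMARY KEY,
    rotor_size REAL,
    height     REAL,
    max_power  REAL
);
-- One shared lookup table: every model contributes rows, not a new table.
CREATE TABLE power_output (
    model_id    TEXT NOT NULL REFERENCES turbine(model_id),
    metric      TEXT NOT NULL,
    metric_val  REAL NOT NULL,
    poweroutput REAL NOT NULL,
    PRIMARY KEY (model_id, metric, metric_val)
);
""")
conn.executemany("INSERT INTO turbine VALUES (?, ?, ?, ?)", [
    ("model_x1", 200, 120, 15),
    ("model_b7", 250, 145, 18),
])
conn.executemany("INSERT INTO power_output VALUES (?, ?, ?, ?)", [
    ("model_x1", "wind", 1, 0.5),
    ("model_x1", "wind", 2, 1.5),
    ("model_x1", "wind", 3, 2.0),
])

def power_curve(conn, model_id, metric="wind"):
    """Return (metric_val, poweroutput) pairs for one turbine model."""
    return conn.execute(
        "SELECT metric_val, poweroutput FROM power_output "
        "WHERE model_id = ? AND metric = ? ORDER BY metric_val",
        (model_id, metric),
    ).fetchall()

print(power_curve(conn, "model_x1"))
```

The foreign key on model_id is what ties each lookup row back to its row in the main table, which is the relationship the question was missing.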

Creating a view that joins multiple tables on an ID and a timestamp that needs to be rounded

I have a web application that sends data to my SQLite database, into different tables depending on the information. I would like to make a view that merges multiple tables together based on cownumber and ts (timestamp). There are no updates to my tables, so a change to the same cownumber sends the full record as a new entry with a new timestamp. The AJAX calls are made table by table, so the timestamps do not sync up exactly; they are generally 5-20 seconds off, depending on the connection.
Here is a sample of the three tables
+----master_animal-----+
+----------------------------------------------------+
| cownumber | height | weight | ts |
+-----------+----------+--------+--------------------+
| 1 | 150 | ... | 2017-12-01 12:28:00|
| 2 | 170 | ... | 2017-12-03 17:16:00|
| 3 | 60 | ... | 2017-12-03 08:09:00|
| 4 | 109 | ... | 2017-12-04 23:23:00|
+----animal_inventory-----+
+-------------------------------------------------------------+
| cownumber | brandlocation| dateacquired| ts |
+-----------+--------------+-------------+--------------------+
| 1 | ... | ... | 2017-12-01 12:28:50|
| 2 | ... | ... | 2017-12-03 17:16:30|
| 3 | ... | ... | 2017-12-03 08:09:12|
| 4 | ... | ... | 2017-12-04 23:23:23|
+----experiment-----+
+-------------------------------------------------------------+
| cownumber | ageatwean | birthweight | ts |
+-----------+--------------+-------------+--------------------+
| 1 | ... | ... | 2017-12-01 12:28:20|
| 2 | ... | ... | 2017-12-03 17:16:41|
| 3 | ... | ... | 2017-12-03 08:09:24|
| 4 | ... | ... | 2017-12-04 23:23:11|
The View I wrote
CREATE VIEW testing
AS SELECT a.height,a.weight,a.cownumber,
b.brandlocation,b.dateacquired,
c.ageatwean,c.birthweight
FROM master_animal a, animal_inventory b, experiment c
WHERE a.cownumber=b.cownumber
AND ROUND(a.ts/10000) = ROUND(b.ts/10000)
AND a.cownumber=c.cownumber
AND ROUND(a.ts/10000) = ROUND(c.ts/10000);
The query I wrote
Select * from testing where cownumber = 1;
What I was hoping to get back was
+----testing-----+
+----------------------------------------------------+
| cownumber | height | weight | brandlocation| dateacquired | ageatwean |birthweight |
+-----------+--------+--------+--------------+--------------+-----------+------------+
| 941 | 0 | ... | ... | ... | ... | .. |
There should be one row for cownumber 941 as long as all the correlated records are within a few seconds of each other. I am not sure whether I need to divide by 10000 or by something smaller. Parts of the same record should be no more than 50 seconds apart; anything more than 50 seconds apart should be considered a new record.
When I test this with only one record for that cownumber, it works fine. But let's say I change some information in each table: I provide a new height and a new brandlocation.
Instead of getting two rows, the first being the initial data entry and the second showing the same cownumber with the changed values, I get back 8 rows with partial changes.
height|weight|cownumber|brandlocation|dateacquired|ageatwean|birthweight|
0.0|0.0|941|0|0|0.0|0
0.0|0.0|941|0|0|0.0|0
0.0|0.0|941|Left Hip|0|0.0|0
0.0|0.0|941|Left Hip|0|0.0|0
50.0|0.0|941|0|0|0.0|0
50.0|0.0|941|0|0|0.0|0
50.0|0.0|941|Left Hip|0|0.0|0
50.0|0.0|941|Left Hip|0|0.0|0
I assume the issue is in my WHERE clause, but I am not sure exactly how to fix it.
The timestamps are stored as strings. When you try to divide one, the database tries to convert it to a number, which results in 2017. So all timestamps end up being the same.
Dividing also cannot measure the distance between two timestamps; the values 9999 and 10000 end up different although they are right next to each other. (And an integer division gives an integer result, so the ROUND() has no effect.)
To compute the distance, convert the timestamp into a number of seconds first, and then use abs():
SELECT ...
FROM master_animal m
JOIN animal_inventory i ON m.cownumber = i.cownumber
AND abs(strftime('%s', m.ts) - strftime('%s', i.ts)) <= 50
JOIN experiment e ON m.cownumber = e.cownumber
AND abs(strftime('%s', m.ts) - strftime('%s', e.ts)) <= 50;
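To see the 50-second join condition in action, here is a small sketch using Python's built-in sqlite3 module. It uses only two of the three tables and a reduced set of columns, and the sample rows (cow 941 submitted twice, about an hour apart) are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_animal    (cownumber INTEGER, height REAL, ts TEXT);
CREATE TABLE animal_inventory (cownumber INTEGER, brandlocation TEXT, ts TEXT);
""")
# Hypothetical data: each submission writes one row per table, a few seconds apart.
conn.executemany("INSERT INTO master_animal VALUES (?, ?, ?)", [
    (941, 0.0, "2017-12-01 12:28:00"),
    (941, 50.0, "2017-12-01 13:30:00"),
])
conn.executemany("INSERT INTO animal_inventory VALUES (?, ?, ?)", [
    (941, "0", "2017-12-01 12:28:50"),
    (941, "Left Hip", "2017-12-01 13:30:20"),
])

# strftime('%s', ...) turns the timestamp string into Unix seconds,
# so the difference is a real distance in seconds.
rows = conn.execute("""
    SELECT m.cownumber, m.height, i.brandlocation
    FROM master_animal m
    JOIN animal_inventory i
      ON m.cownumber = i.cownumber
     AND abs(strftime('%s', m.ts) - strftime('%s', i.ts)) <= 50
    ORDER BY m.ts
""").fetchall()

for row in rows:
    print(row)
```

With this data each master_animal row pairs only with the animal_inventory row from the same submission, giving two rows instead of the four that a join on cownumber alone would produce.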

influxdb/SQL get field count

I have an InfluxDB table; let's call it my_table.
my_table is structured like this (simplified):
+------+----+----+
| Time | m1 | m2 |
+======+====+====+
| 1    | 8  | 4  |
+------+----+----+
| 2    | 1  | 12 |
+------+----+----+
| 3    | 6  | 18 |
+------+----+----+
| 4    | 4  | 1  |
+------+----+----+
I was wondering whether it is possible to find out, for each time, how many of the metrics are larger than a certain (dynamic) threshold.
So let's say I want to know how many of the metrics (columns) are higher than 5.
I would want to do something like this:
select fieldcount(/m*/) from my_table where /m*/ > 5
Returning:
1
1
2
0
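To make the intended result concrete, the per-row count can be written client-side in plain Python. This is only a sketch of the desired semantics over the sample rows above, not an InfluxDB query; the function name count_above is hypothetical.

```python
# Rows copied from the sample table: (time, m1, m2).
ROWS = [(1, 8, 4), (2, 1, 12), (3, 6, 18), (4, 4, 1)]

def count_above(rows, threshold):
    """For each row, count how many metric columns exceed the threshold."""
    return [
        sum(1 for metric in metrics if metric > threshold)
        for _time, *metrics in rows
    ]

print(count_above(ROWS, 5))  # [1, 1, 2, 0]
```

Because the threshold is an ordinary argument here, it can be changed without touching the stored data.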
I am relatively restricted in how I structure the database, as I'm using the Diamond collector (Python), which takes care of all data collection for me and flushes it to my InfluxDB without me specifying what the tables should look like.
EDIT
I am aware of a possible solution if I hardcode the threshold and add a third metric named mGreaterThan5:
+------+----+----+---------------+
| Time | m1 | m2 | mGreaterThan5 |
+======+====+====+===============+
| 1    | 8  | 4  | 1             |
+------+----+----+---------------+
| 2    | 1  | 12 | 1             |
+------+----+----+---------------+
| 3    | 6  | 18 | 2             |
+------+----+----+---------------+
| 4    | 4  | 1  | 0             |
+------+----+----+---------------+
However, this means that I can't easily change the threshold to 6 or any other number, which is why I would prefer a better solution if there is one.
EDIT2
A similar problem occurs when trying to retrieve the x highest metrics. E.g.:
On Jan 1st what were the highest 3 values of m? Given table:
+-----+-----+----+-----+----+-----+----+
| Time| m1 | m2 | m3 | m4 | m5 | m6 |
+=====+=====+====+=====+====+=====+====+
| 1/1 | 8 | 4 | 1 | 7 | 2 | 0 |
+-----+-----+----+-----+----+-----+----+
Am I screwed if I keep the table structured this way?

Primary key auto-increment manipulation

Is there any way to have a primary key with a feature that increments it but fills in gaps? Assuming I have the following table:
____________________
| ID | Value |
| 1 | A |
| 2 | B |
| 3 | C |
^^^^^^^^^^^^^^^^^^^^^
Notice that the value is only an example, the order has nothing to do with the question.
Once I remove the row with the ID of 2 (the table will look like this):
____________________
| ID | Value |
| 1 | A |
| 3 | C |
^^^^^^^^^^^^^^^^^^^^^
And I add another row, with regular auto-increment feature it will look like this:
____________________
| ID | Value |
| 1 | A |
| 3 | C |
| 4 | D |
^^^^^^^^^^^^^^^^^^^^^
As expected.
The output I'd want would be:
____________________
| ID | Value |
| 1 | A |
| 2 | D |
| 3 | C |
^^^^^^^^^^^^^^^^^^^^^
Where the gap is filled with the new row. Also note that maybe, in memory, it would look different. But the point is that the primary key would fill the gaps.
Given the primary keys 1, 2, 3, 6, 7, 10, 11 (for instance), 4 should be filled in first, then 5, 8 and so on. When the table is empty (even if it held a million rows before), it should start over from 1.
How do I accomplish that? Is there any built-in feature similar to that? Can I implement it?
EDIT: If it's not possible, why not?
No, you don't want to do that, as juergen-d said. It's unlikely to do what you think it does, and even less so in a multi-user environment.
In a multi-user environment you are likely to get gaps even when there are no deletes, just from aborted inserts.
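For illustration only, the lookup the question describes amounts to finding the smallest unused positive ID. The sketch below does this in plain Python over an in-memory collection; the function name smallest_free_id is hypothetical, and this is not a built-in database feature. If two sessions ran this at the same time, both could pick the same value before either inserted its row, which is one way the multi-user problem mentioned above shows up.

```python
def smallest_free_id(used_ids):
    """Return the smallest positive integer that is not in used_ids."""
    used = set(used_ids)
    candidate = 1
    while candidate in used:
        candidate += 1
    return candidate

ids = [1, 2, 3, 6, 7, 10, 11]
print(smallest_free_id(ids))  # 4
print(smallest_free_id([]))   # 1, an empty table starts over
```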

Relative incremental ID by reference field

I have a table to store reservations for certain events; relevant part of it is:
class Reservation(models.Model):
    # django creates an auto-increment field "id" by default
    event = models.ForeignKey(Event)
    # Some other reservation-specific fields..
    first_name = models.CharField(max_length=255)
Now, I wish to retrieve the sequential ID of a given reservation relative to reservations for the same event.
Disclaimer: Of course, we assume reservations are never deleted, or their relative position might change.
Example:
+----+-------+------------+--------+
| ID | Event | First name | Rel.ID |
+----+-------+------------+--------+
| 1 | 1 | AAA | 1 |
| 2 | 1 | BBB | 2 |
| 3 | 2 | CCC | 1 |
| 4 | 2 | DDD | 2 |
| 5 | 1 | EEE | 3 |
| 6 | 3 | FFF | 1 |
| 7 | 1 | GGG | 4 |
| 8 | 1 | HHH | 5 |
+----+-------+------------+--------+
The last column is the "Relative ID", that is, a sequential number, with no gaps, for all reservations of the same event.
Now, what's the best way to accomplish this without having to manually calculate the relative ID for each import (I don't like that)? I'm using PostgreSQL as the underlying database, but I'd prefer to stick with the Django abstraction layer in order to keep this portable (i.e. no database-specific solutions such as triggers).
Filtering using Reservation.objects.filter(event_id = some_event_id) should suffice. This will give you a QuerySet that should have the same ordering each time. Or am I missing something in your question?
I hate always being the one who answers his own questions, but I solved it using this:
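from django.db.models import Q

class Reservation(models.Model):
    # ...
    def relative_id(self):
        # Subtract the number of earlier reservations that belong to other events.
        return self.id - Reservation.objects.filter(id__lt=self.id).filter(~Q(event=self.event)).count()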
Assuming reservations are never deleted, we can safely assume the "relative id" is the incremental id minus the count of earlier reservations that do not belong to the same event.
I tried to think of drawbacks, but I didn't find any.
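The subtraction formula can be checked against the example table with a few lines of plain Python. The sketch below is not Django code; both function names are hypothetical. The second function is a different technique, shown for comparison: it counts the reservations of the same event up to and including the given ID, which in Django would correspond roughly to Reservation.objects.filter(event=self.event, id__lte=self.id).count().

```python
# (id, event) pairs from the example table.
RESERVATIONS = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 1), (6, 3), (7, 1), (8, 1)]

def relative_id_by_subtraction(res_id, event, reservations):
    """id minus the number of earlier reservations for other events."""
    others_before = sum(
        1 for rid, ev in reservations if rid < res_id and ev != event
    )
    return res_id - others_before

def relative_id_by_counting(res_id, event, reservations):
    """Number of reservations for the same event with id <= res_id."""
    return sum(1 for rid, ev in reservations if rid <= res_id and ev == event)

for rid, ev in RESERVATIONS:
    print(rid, ev,
          relative_id_by_subtraction(rid, ev, RESERVATIONS),
          relative_id_by_counting(rid, ev, RESERVATIONS))
```

Both functions reproduce the Rel.ID column for the example data. They differ only when the id sequence has gaps: the counting variant does not depend on the ids being consecutive, while the subtraction variant does.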