Reference an arbitrary row and field in another table - sql

Is there any form (data type, inheritance, ...) of implementing something like this in PostgreSQL:
CREATE TABLE log (
    datareferenced table_row_column_reference,
    logged boolean
);
The referenced data may be any row's field in the database. My objective is to implement something like this without using a procedural language or building it in a higher layer: only a relational approach, and without modifying the rest of the tables. Another desired feature is referential integrity, for example:
-- Table foo (id, field1, field2, fieldn)
-- ('bar', '2014-01-01', 4.33, Null)
-- Table log (datareferenced, logged)
-- ({table foo -> id:'bar' -> field2 } <=> 4.33, True)
DELETE FROM foo where id='bar';
-- as result, on cascade, deleted both rows.
I have an application built on an MVC pattern. The logic is written in Python. The application is a management tool and very data intensive. My goal is to implement a module that can store additional information for every piece of data present in the database (DDBB). For example, a client has a series of attributes (name, address, phone, email, ...) spread across multiple tables, and I want the app to be able to store metadata for every record in the whole DDBB. A piece of metadata could be the last modification, a user flag, etc.
I have implemented the metadata model (in Postgres), its mapping to objects, and a partial API. But the part left is the most important one: the glue. My plan B is to create that glue in the data-mapping layer as a module. Something like this:
address = person.addresses[0]
address.saveMetadata('foo', 'bar')

# in the superclass of Address
def saveMetadata(self, code, value):
    self.mapper.metadata_adapter.save(self, code, value)

# in the metadata adapter class
def save(self, entity, code, value):
    # Parameterized query: the original string interpolation produced
    # unquoted values (invalid SQL) and was open to injection.
    sql = """UPDATE metadata_values
             SET value = %s
             WHERE code = %s
               AND idmetadata = (
                 SELECT id FROM metadata_rels mr
                 WHERE mr.schema = %s AND mr.table = %s
                   AND mr.field = %s AND mr.recordpk = %s)"""
    self.mapper.execute(sql, (
        value, code,
        self.class2data[entity.__class__]["schema"],
        self.class2data[entity.__class__]["table"],
        self.class2data[entity.__class__]["field"],
        entity.id))

def read(self, entity, code):
    sql = """SELECT mv.value
             FROM metadata_values mv
             JOIN metadata_rels mr ON mv.idmetadata = mr.id
             WHERE mv.code = %s AND mr.schema = %s AND mr.table = %s
               AND mr.field = %s AND mr.recordpk = %s"""
    return self.mapper.execute(sql, (
        code,
        self.class2data[entity.__class__]["schema"],
        self.class2data[entity.__class__]["table"],
        self.class2data[entity.__class__]["field"],
        entity.id))
But it would add overhead between Python and PostgreSQL, complicate the Python logic, and using a procedural language and triggers may be very laborious and bug-prone. That is why I'm looking at doing the same at the DDBB level.

No, there's nothing like that in PostgreSQL.
You could build triggers yourself to do it, probably using a composite type. But you've said (for some reason) you don't want to use PL/PgSQL, so you've ruled that out. Getting RI triggers right is quite hard, though, and you must apply a trigger to the referencing and referenced ends.
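For illustration, a minimal sketch of the composite-type half of that approach, in the same Python-plus-SQL style as the question (type, column and connection names are my own, and note that a composite value enforces nothing by itself; the triggers would have to supply every integrity check):
import psycopg2

conn = psycopg2.connect("dbname=app")  # connection string is illustrative
with conn, conn.cursor() as cur:
    # A composite type naming the referenced schema/table/column/row.
    cur.execute("""
        CREATE TYPE rowref AS (
            schema_name text,
            table_name  text,
            column_name text,
            pk_value    text
        )
    """)
    # The log table from the question, using the composite type.
    cur.execute("""
        CREATE TABLE log (
            datareferenced rowref,
            logged boolean
        )
    """)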
Frankly, this seems like a square peg, round hole kind of problem. Are you sure PostgreSQL is the right choice for this application?
Describe your needs and goal in context. Why do you want this? What problem are you trying to solve? Maybe there's a better way to approach the same problem one step back...

Related

How to associate multiple tables on Exposed

I'm creating an API using Kotlin with Exposed, Ktor, and Postgres.
When I need to run a select on a table that is associated with another table, I have to "parse" the ResultRow into an entity object. If the table has more associations, I have to repeat the mapping every time, which is awkward.
Is there an easier way to do it? Writing this much code just to create the objects feels like too much!
Right now I'm doing it like this:
fun ResultRow.toInterval() = Interval(
    this[Intervals.idInterval],
    Setting(
        this[OfficeSettings.idOffice],
        Office(
            this[Offices.idOffice],
            this[Offices.code],
            this[Offices.name]
        ),
        this[OfficeSettings.scheduleDaysRule],
        this[OfficeSettings.serviceTimeRule],
        this[OfficeSettings.countInterval],
        this[OfficeSettings.restTimeRule]
    ),
    this[Intervals.date],
    this[Intervals.startTime],
    this[Intervals.endTime]
)
Just to parse the result into an Interval, which is associated with a Setting, and Setting is associated with an Office.

SQL string lookup. This worked, but seems like it isn't best practice

Using SSMS 2014
I have data in the db in a particular column. The average line of data looks something like this:
(5:30) 3-J.WINSTON PASS INCOMPLETE DEEP LEFT TO 13-M.EVANS (23-R.ALFORD).
My task is to retrieve the first player featured in the line, in this case I am retrieving J.Winston.
I used this to retrieve the name and update a column with it. It worked exactly as needed, but something about it tells me it is poorly constructed. Any tips on improving it?
UPDATE nfl.dbo.Temp_NFL2015
SET Player_Name = SUBSTRING(
        description,
        LEN(LEFT(description, PATINDEX('%-%', description))) + 1,
        CHARINDEX(' ', description,
                  LEN(LEFT(description, PATINDEX('%-%', description))))
          - 1
          - LEN(LEFT(description, PATINDEX('%-%', description))))
FROM nfl.dbo.Temp_NFL2015
Normalization of the data would help simplify things when you're extracting data or doing research.
Considering your example:
(5:30) 3-J.WINSTON PASS INCOMPLETE DEEP LEFT TO 13-M.EVANS (23-R.ALFORD).
The syntax (roughly) works out to:
(Game Time) (Source Player) (Action) (Target Player) ((Other Involved Players))
This suggests a base table Actions with the following structure:
ActionID (identity to mark discrete actions)
GameTime
SourcePlayer
Action
TargetPlayer
To include other involved players, we split those off into a separate table, Action_InvolvedPlayers:
InvolvedActionID (references Actions.ActionID)
InvolvedPlayer
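As a hedged sketch of that structure (column types are guesses from the sample line, and the pyodbc connection details are illustrative, not prescriptive):
import pyodbc

# Adjust driver/server/database for your environment.
conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=localhost;DATABASE=nfl;Trusted_Connection=yes;")
cur = conn.cursor()
# One row per discrete action within a play.
cur.execute("""
    CREATE TABLE Actions (
        ActionID     int IDENTITY(1,1) PRIMARY KEY,
        GameTime     varchar(10),
        SourcePlayer varchar(50),
        Action       varchar(100),
        TargetPlayer varchar(50)
    )
""")
# Zero or more additional players per action.
cur.execute("""
    CREATE TABLE Action_InvolvedPlayers (
        InvolvedActionID int REFERENCES Actions(ActionID),
        InvolvedPlayer   varchar(50)
    )
""")
conn.commit()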
Points for consideration:
Your dataset likely contains multiple games, so consider an explicit way to say 'this game had this action take place'.
It might make sense to store player data in another table and just reference those players from the Actions table.
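Whatever storage design you settle on, the extraction itself is easy to sanity-check outside T-SQL. A minimal Python sketch of the same first-player parse (the regex encodes an assumption that the source player always appears as the first "digits-NAME" token):
import re

# First "<jersey>-<INITIAL>.<LASTNAME>" token, e.g. "3-J.WINSTON".
FIRST_PLAYER = re.compile(r"\b\d+-([A-Z]\.[A-Z'.-]+)")

def first_player(description):
    match = FIRST_PLAYER.search(description)
    return match.group(1) if match else None

print(first_player(
    "(5:30) 3-J.WINSTON PASS INCOMPLETE DEEP LEFT TO 13-M.EVANS (23-R.ALFORD)."
))  # -> J.WINSTON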

Django - SQL bulk get_or_create possible?

I am using get_or_create to insert objects into the database, but the problem is that doing 1000 at once takes too long.
I tried bulk_create, but it doesn't provide the functionality I need (it creates duplicates, ignores unique values, and doesn't trigger the post_save signals I need).
Is it even possible to do get_or_create in bulk via a customized SQL query?
Here is my example code:
related_data = json.loads(urllib2.urlopen(final_url).read())
for item in related_data:
    kw = item['keyword']
    e, c = KW.objects.get_or_create(KWuser=kw, author=author)
    e.project.add(id)  # add m2m to parent project
related_data contains 1000 rows looking like this:
[{"cmp":0,"ams":3350000,"cpc":0.71,"keyword":"apple."},
{"cmp":0.01,"ams":3350000,"cpc":1.54,"keyword":"apple -10810"}......]
The KW model also sends a signal I use to create another parent model:
@receiver(post_save, sender=KW)
def grepw(sender, **kwargs):
    if kwargs.get('created', False):
        id = kwargs['instance'].id
        kww = kwargs['instance'].KWuser
        # KeyO
        a, b = KeyO.objects.get_or_create(defaults={'keyword': kww}, keyword__iexact=kww)
        KW.objects.filter(id=id).update(KWF=a.id)
This works, but as you can imagine doing thousands of rows at once takes a long time and even crashes my tiny server. What bulk options do I have?
As of Django 2.2, bulk_create has an ignore_conflicts flag. Per the docs:
On databases that support it (all but Oracle), setting the ignore_conflicts parameter to True tells the database to ignore failure to insert any rows that fail constraints such as duplicate unique values
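For the KW model from the question, that might look like the sketch below (field names are taken from the question). One caveat matters here: bulk_create does not send post_save signals, so the KeyO receiver above would still not fire.
# assumes KW is importable from your app's models
kws = [KW(KWuser=item['keyword'], author=author) for item in related_data]
KW.objects.bulk_create(
    kws,
    ignore_conflicts=True,  # rows violating the unique constraint are skipped
)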
This post may be of use to you:
stackoverflow.com/questions/3395236/aggregating-saves-in-django
Note that the answer recommends using the commit_on_success decorator, which is deprecated. It was replaced by the transaction.atomic decorator. Documentation is here:
transactions
from django.db import transaction

@transaction.atomic
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()
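Applied to the loop from the question, that might look like this (a sketch using the question's own names; a single transaction for the whole batch avoids one commit per save):
from django.db import transaction

@transaction.atomic
def import_keywords(related_data, author, project_id):
    for item in related_data:
        e, created = KW.objects.get_or_create(KWuser=item['keyword'],
                                              author=author)
        e.project.add(project_id)  # m2m to the parent project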
If I understand correctly, get_or_create means SELECT or INSERT on the Postgres side.
You have a table with a UNIQUE constraint or index and a large number of rows to either INSERT (if not yet there) and get the newly created ID, or otherwise SELECT the ID of the existing row. Not as simple as it may seem from the outside. With a concurrent write load, the matter is even more complicated.
And there are various parameters that need to be defined (how exactly to handle conflicts):
How to use RETURNING with ON CONFLICT in PostgreSQL?
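For a single row, that pattern might look like the following psycopg2 sketch. Table and column names are assumptions modeled on the question's KW model, with a unique constraint on (kwuser, author_id); the no-op DO UPDATE makes RETURNING yield the id even when the row already exists.
import psycopg2

conn = psycopg2.connect("dbname=app")  # connection string is illustrative
with conn, conn.cursor() as cur:
    cur.execute("""
        INSERT INTO app_kw (kwuser, author_id)
        VALUES (%s, %s)
        ON CONFLICT (kwuser, author_id) DO UPDATE
            SET kwuser = EXCLUDED.kwuser  -- no-op so the row is returned
        RETURNING id
    """, ('apple', 42))
    kw_id = cur.fetchone()[0]  # id of the new or pre-existing row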

What's the reasoning behind result columns being excluded from auto-select statements in PetaPoco

If I have a POCO class with the ResultColumn attribute set, then when I do a Single<Entity>() call, my result column isn't mapped. I've set my column to be a result column because its value should always be generated by the SQL column's default constraint. I don't want this column to be inserted or updated from the business layer. What I'm trying to say is that my column's type is a simple SQL data type and not a related entity type (as I've seen ResultColumn used mostly for those).
Looking at code I can see this line in PetaPoco:
// Build column list for automatic select
QueryColumns = (from c in Columns
                where !c.Value.ResultColumn
                select c.Key
               ).ToArray();
Why are result columns excluded from the automatic select statement? As I understand it, their nature is to be read-only, so they should be used in selects only. I can see the reasoning when a column is actually a related entity type (complex). OK, but then we should have a separate attribute like ComputedColumnAttribute that would always be returned in selects but never used in inserts or updates.
Why did the PetaPoco team decide to omit result columns from selects, then?
How am I supposed to read result columns, then?
I can't answer why the creator did not add them to auto-selects, though I would assume it's because your particular use-case is not the main one that they were considering. If you look at the examples and explanation for that feature on their site, it's more geared towards extra columns you bring back in a join or calculation (like maybe a description from a lookup table for a code value). In these situations, you could not have them automatically added to the select because they are not part of the underlying table.
So if you want to use that attribute, and get a value for the property, you'll have to use your own manual select statement rather than relying on the auto-select.
Of course, the beauty of using PetaPoco is that you can easily modify it to suit your needs, by either creating a new attribute, like you suggest above, or modifying the code you showed to not exclude those fields from the select (assuming you are not using ResultColumn in other join-type situations).

nhibernate and DDD suggestion

I am fairly new to nHibernate and DDD, so please bear with me.
I have a requirement to create a new report from my SQL table. The report is read-only and will be bound to a GridView control in an ASP.NET application.
The report contains the following fields: Style, Color, Size, LAQty, MTLQty, Status.
I have the entities for Style, Color and Size, which I use in other ASP.NET pages via repositories. I am not sure if I should use the same entities for my report or not. If I use them, where am I supposed to map the Qty and Status fields?
If I should not use the same entities, should I create a new class for the report?
As I said, I am new to this and just trying to learn and code properly.
Thank you
For reports it's usually easier to use plain values or special DTOs. Of course you can query for the entity that references all the information, but to put it into a list (e.g. using data binding) it's handier to have a single class that contains all the values, plain.
To get more specific solutions than the few below, you need to tell us a little about your domain model. What does the class model look like?
Generally, you have at least three options to get "plain" values from the database using NHibernate.
Write HQL that returns an array of values.
For instance:
select e1.Style, e1.Color, e1.Size, e2.LAQty, e2.MTLQty
from Entity1 e1 inner join e1.Entity2 e2
where (some condition)
The result will be a list of object[]. Every item in the list is a row; every item in the object[] is a column. This is quite like SQL, but at a higher level (you describe the query at the entity level) and database independent.
Or you create a DTO (data transfer object) only to hold one row of the result:
select new ReportDto(e1.Style, e1.Color, e1.Size, e2.LAQty, e2.MTLQty)
from Entity1 e1 inner join e1.Entity2 e2
where (some condition)
ReportDto needs to implement a constructor that takes all these arguments. The result is a list of ReportDto.
Or you use the Criteria API (recommended):
session.CreateCriteria(typeof(Entity1), "e1")
    .CreateCriteria("Entity2", "e2")
    .Add( /* some condition */ )
    .SetProjection(Projections.ProjectionList()
        .Add(Projections.Property("e1.Style"), "Style")
        .Add(Projections.Property("e1.Color"), "Color")
        .Add(Projections.Property("e1.Size"), "Size")
        .Add(Projections.Property("e2.LAQty"), "LAQty")
        .Add(Projections.Property("e2.MTLQty"), "MTLQty"))
    .SetResultTransformer(Transformers.AliasToBean(typeof(ReportDto)))
    .List<ReportDto>();
The ReportDto needs to have a property with the name of each alias: "Style", "Color", etc. The output is a list of ReportDto.
I'm not schooled in DDD exactly, but I've always modeled my nouns as types, and I'm surprised the report itself is an entity. DDD or not, I wouldn't do that; rather, I'd have my reports reflect the results of a query, in which quantity is presumably count(*) or sum(lineItem.quantity) and status is also calculated (perhaps in the page).
You haven't described your domain, but there is a clue in those column headings that you may be doing a pivot over the data to create LAQty and MTLQty, which you'll find hard to do in NHibernate, as it's designed for OLTP and didn't even do UNION last I checked. That said, there is nothing wrong with abusing HQL (Hibernate Query Language) for lightweight reporting, as long as you understand you are abusing it.
I see Stefan has done a grand job of describing the syntax for that, so I'll stop there :-)