Is there a more efficient way to return all records in a "traversed" link'ed list? - sql

I have a generic (non-V) class with a LINKMAP field that links a chain of time-series data. Each record links to the next record based on a key of the next month/day/hour/etc. and a value of the #rid of the next record. I could use edges (non-lightweight, with a single property the same as the LINKMAP key), but I don't want to because I only need a one-direction link and want to use the least amount of storage space (plus my assumption is that while using edges might offer slightly easier querying, it wouldn't offer any benefit in terms of performance or storage space).
I currently have the following batch query:
begin;
let r = select from data where key='AAA';
let y = select expand(links['2017']) from $r;
let m = select expand(links['07']) from $y;
let d = select expand(links['15']) from $m;
let h = select expand(links['10']) from $d;
return (select $r[0], $y[0], $m[0], $d[0], $h[0])
This works, in the sense that I get a result set with the #rids of each record in the link'ed list.
I have two questions:
Is there a more efficient way to do this query? I've tried various TRAVERSE options with $path and such, but nothing else I attempted worked at all. I did not try MATCH because I'm not using edges.
The current result set is flat, with each value prop ($r, $y, etc.) containing the corresponding #rid. That's OK for my purposes, but I'd like to know if there's a way to return the actual records instead of just the #rids. Nothing I've tried works.
PS - I'm using orientjs, but the batch runs identically from the console as well.

Related

Django ORM Cross Product

I have three models:
class Customer(models.Model):
pass
class IssueType(models.Model):
pass
class IssueTypeConfigPerCustomer(models.Model):
customer=models.ForeignKey(Customer)
issue_type=models.ForeignKey(IssueType)
class Meta:
unique_together=[('customer', 'issue_type')]
How can I find all tuples of (custmer, issue_type) where there is no IssueTypeConfigPerCustomer object?
I want to avoid a loop in Python. A solution which solves this in the DB would be preferred.
Background: for every customer and for every issue-type, there should be a config in the DB.
If you can afford to make one database trip for each issue type, try something like this untested snippet:
def lacking_configs():
for issue_type in IssueType.objects.all():
for customer in Customer.objects.filter(
issuetypeconfigpercustomer__issue_type=None
):
yield customer, issue_type
missing = list(lacking_configs())
This is probably OK unless you have a lot of issue types or if you are doing this several times per second, but you may also consider having a sensible default instead of making a config object mandatory for each combination of issue type and customer (IMHO it is a bit of a design-smell).
[update]
I updated the question: I want to avoid a loop in Python. A solution which solves this in the DB would be preferred.
In Django, every Queryset is either a list of Model instances or a dict (values querysets), so it is impossible to return the format you want (a list of tuples of Model) without some Python (and possibly multiple trips to the database).
The closest thing to a cross product would be using the "extra" method without a where parameter, but it involves raw SQL and knowing the underlying table name for the other model:
missing = Customer.objects.extra(
select={"issue_type_id": 'appname_issuetype.id'},
tables=['appname_issuetype']
)
As a result, each Customer object will have an extra attribute, "issue_type_id", containing the id of one IssueType. You can use the where parameter to filter based on NOT EXISTS (SELECT 1 FROM appname_issuetypeconfigpercustomer WHERE issuetype_id=appname_issuetype.id AND customer_id=appname_customer.id). Using the values method you can have something close to what you want - this is probably enough information to verify the rule and create the missing records. If you need other fields from IssueType just include them in the select argument.
In order to assemble a list of (Customer, IssueType) you need something like:
cross_product = [
(customer, IssueType.objects.get(pk=customer.issue_type_id))
for customer in
Customer.objects.extra(
select={"issue_type_id": 'appname_issuetype.id'},
tables=['appname_issuetype'],
where=["""
NOT EXISTS (
SELECT 1
FROM appname_issuetypeconfigpercustomer
WHERE issuetype_id=appname_issuetype.id
AND customer_id=appname_customer.id
)
"""]
)
]
Not only this requires the same number of trips to the database as the "generator" based version but IMHO it is also less portable, less readable and violates DRY. I guess you can lower the number of database queries to a couple using something like this:
missing = Customer.objects.extra(
select={"issue_type_id": 'appname_issuetype.id'},
tables=['appname_issuetype'],
where=["""
NOT EXISTS (
SELECT 1
FROM appname_issuetypeconfigpercustomer
WHERE issuetype_id=appname_issuetype.id
AND customer_id=appname_customer.id
)
"""]
)
issue_list = dict(
(issue.id, issue)
for issue in
IssueType.objects.filter(
pk__in=set(m.issue_type_id for m in missing)
)
)
cross_product = [(c, issue_list[c.issue_type_id]) for c in missing]
Bottom line: in the best case you make two queries at the cost of legibility and portability. Having sensible defaults is probably a better design compared to mandatory config for each combination of Customer and IssueType.
This is all untested, sorry if some homework was left for you.

Neo4J create temp variable within Cypher

So my Top-Level problem is I am trying to return whether a MERGE resulted in the creation of a new Node or not.
In order to do this I was thinking I could just create a simple temp boolean setting it to TRUE using ON CREATE
How I imagine it working:
MERGE(: Person {id:'Tom Jones'})
WITH false as temp_bool
ON CREATE set temp_bool = true
RETURN temp_bool
Obviously this does not work.
I am looking for a way to create arbitrary temp values within a Cypher query, and have the ability to return those variables in the end.
Thanks
You can do what you want, here's how (combination of my first answer, with #cybersam's addition). You just do it with a node property you create and then remove, instead of an unbound variable as you've been trying.
MERGE(tom:Person {id:'Tom Jones'})
ON CREATE set tom.temp_bool = true
ON MATCH set tom.temp_bool = false
WITH tom, tom.temp_bool AS result
REMOVE tom.temp_bool
RETURN result;
In simple merging cases like this where maximum one node could be created, a cleaner way to achieve what you are looking for could be checking the result stats. I case of using Bolt API you should check:
results.consume().counters.nodes_created = 1

How to use LINQ to Entities to make a left join using a static value

I've got a few tables, Deployment, Deployment_Report and Workflow. In the event that the deployment is being reviewed they join together so you can see all details in the report. If a revision is going out, the new workflow doesn't exist yet new workflow is going into place so I'd like the values to return null as the revision doesn't exist yet.
Complications aside, this is a sample of the SQL that I'd like to have run:
DECLARE #WorkflowID int
SET #WorkflowID = 399 -- Set to -1 if new
SELECT *
FROM Deployment d
LEFT JOIN Deployment_Report r
ON d.FSJ_Deployment_ID = r.FSJ_Deployment_ID
AND r.Workflow_ID = #WorkflowID
WHERE d.FSJ_Deployment_ID = 339
The above in SQL works great and returns the full record if viewing an active workflow, or the left side of the record with empty fields for revision details which haven't been supplied in the event that a new report is being generated.
Using various samples around S.O. I've produced some Entity to SQL based on a few multiple on statements but I feel like I'm missing something fundamental to make this work:
int Workflow_ID = 399 // or -1 if new, just like the above example
from d in context.Deployments
join r in context.Deployment_Reports.DefaultIfEmpty()
on
new { d.Deployment_ID, Workflow_ID }
equals
new { r.Deployment_ID, r.Workflow_ID }
where d.FSJ_Deployment_ID == fsj_deployment_id
select new
{
...
}
Is the SQL query above possible to create using LINQ to Entities without employing Entity SQL? This is the first time I've needed to create such a join since it's very confusing to look at but in the report it's the only way to do it right since it should only return one record at all times.
The workflow ID is a value passed in to the call to retrieve the data source so in the outgoing query it would be considered a static value (for lack of better terminology on my part)
First of all don't kill yourself on learning the intricacies of EF as there are a LOT of things to learn about it. Unfortunately our deadlines don't like the learning curve!
Here's examples to learn over time:
http://msdn.microsoft.com/en-us/library/bb397895.aspx
In the mean time I've found this very nice workaround using EF for this kind of thing:
var query = "SELECT * Deployment d JOIN Deployment_Report r d.FSJ_Deployment_ID = r.Workflow_ID = #WorkflowID d.FSJ_Deployment_ID = 339"
var parm = new SqlParameter(parameterName="WorkFlowID" value = myvalue);
using (var db = new MyEntities()){
db.Database.SqlQuery<MyReturnType>(query, parm.ToArray());
}
All you have to do is create a model for what you want SQL to return and it will fill in all the values you want. The values you are after are all the fields that are returned by the "Select *"...
There's even a really cool way to get EF to help you. First find the table with the most fields, and get EF to generated the model for you. Then you can write another class that inherits from that class adding in the other fields you want. SQL is able to find all fields added regardless of class hierarchy. It makes your job simple.
Warning, make sure your filed names in the class are exactly the same (case sensitive) as those in the database. The goal is to make a super class model that contains all the fields of all the join activity. SQL just knows how to put them into that resultant class giving you strong typing ability and even more important use-ability with LINQ
You can even use dataannotations in the Super Class Model for displaying other names you prefer to the User, this is a super nice way to keep the table field names but show the user something more user friendly.

Orientdb sql auto increment: id is always null (sql batch, update increment, variables)

I was looking at another question in stackoverflow regarding auto increment fields in orientdb, where one of the answers was to create our own vertex with counter field.
However, when I'm trying to execute the following code (both java api and console script batch), It is not working.
Do note however that the id is returned good (did some debug attempts, returning the id variable only), and the vertex is created.
However, the vertex id is always null (unless I set it explicit, that is).
The script:
script sql
LET id = UPDATE CCounter INCREMENT value=1 RETURN AFTER $current WHERE name='session'
LET csession = CREATE VERTEX CDate SET id=$id.result, meet_date='2015-01-01 15:23:00'
end
I tried playing around with $id and $current , but nothing seems to work.
Currently I am doing it in a 2-transaction mode; one to get the id, and another to create the vertex. I really hope there is a better way though.
P.S.
I am using version 2.0-M2
You should execute
LET csession = CREATE VERTEX CDate SET id=$id.value, meet_date='2015-01-01 15:23:00'
Note the $id.value in place of $id.result.
As stated in a comment to Lvca, the answer was:
(1) Change the field name. Id seems to be reserved (ish). It's probably possible to still bypass it and use a field named 'id', but I didn't want to mess around with it.
(2) From some reason, the result was a collection (shown as '[id]'). It took me some time to figure it out, but I just had to choose the first value from it.
(3) Also, there are 2 'values' here. One of the field ($current.value), and the second one(not sure where it's coming from).
Final solution that works:
script sql
LET id = UPDATE CCounter INCREMENT value = 1 RETURN AFTER $current.value WHERE name='session'
LET csession = CREATE VERTEX CDate SET data_id=$id[0].value, meet_date='2015-01-01 15:23:00'
end

Magento Bulk update attributes

I am missing the SQL out of this to Bulk update attributes by SKU/UPC.
Running EE1.10 FYI
I have all the rest of the code working but I"m not sure the who/what/why of
actually updating our attributes, and haven't been able to find them, my logic
is
Open a CSV and grab all skus and associated attrib into a 2d array
Parse the SKU into an entity_id
Take the entity_id and the attribute and run updates until finished
Take the rest of the day of since its Friday
Here's my (almost finished) code, I would GREATLY appreciate some help.
/**
* FUNCTION: updateAttrib
*
* REQS: $db_magento
* Session resource
*
* REQS: entity_id
* Product entity value
*
* REQS: $attrib
* Attribute to alter
*
*/
See my response for working production code. Hope this helps someone in the Magento community.
While this may technically work, the code you have written is just about the last way you should do this.
In Magento, you really should be using the models provided by the code and not write database queries on your own.
In your case, if you need to update attributes for 1 or many products, there is a way for you to do that very quickly (and pretty safely).
If you look in: /app/code/core/Mage/Adminhtml/controllers/Catalog/Product/Action/AttributeController.php you will find that this controller is dedicated to updating multiple products quickly.
If you look in the saveAction() function you will find the following line of code:
Mage::getSingleton('catalog/product_action')
->updateAttributes($this->_getHelper()->getProductIds(), $attributesData, $storeId);
This code is responsible for updating all the product IDs you want, only the changed attributes for any single store at a time.
The first parameter is basically an array of Product IDs. If you only want to update a single product, just put it in an array.
The second parameter is an array that contains the attributes you want to update for the given products. For example if you wanted to update price to $10 and weight to 5, you would pass the following array:
array('price' => 10.00, 'weight' => 5)
Then finally, the third and final attribute is the store ID you want these updates to happen to. Most likely this number will either be 1 or 0.
I would play around with this function call and use this instead of writing and maintaining your own database queries.
General Update Query will be like:
UPDATE
catalog_product_entity_[backend_type] cpex
SET
cpex.value = ?
WHERE cpex.attribute_id = ?
AND cpex.entity_id = ?
In order to find the [backend_type] associated with the attribute:
SELECT
  backend_type
FROM
  eav_attribute
WHERE entity_type_id =
  (SELECT
    entity_type_id
  FROM
    eav_entity_type
  WHERE entity_type_code = 'catalog_product')
AND attribute_id = ?
You can get more info from the following blog article:
http://www.blog.magepsycho.com/magento-eav-structure-role-of-eav_attributes-backend_type-field/
Hope this helps you.