Issue about huge query constructing huge object? - sql

I am working on a company's legacy project. There's a huge object which is constructed by a super long query. The query looks something like this:
SELECT *
FROM item i
JOIN item_product prod on prod.item = i.id
LEFT JOIN product_shippingaddress ps on ps.product = prod.id
LEFT JOIN product_packageinfo pp on pp.product = prod.id
.
.
.
(80 lines of query)
WHERE i.id = @itemId
It's a very long query that pulls in all kinds of information about the product, and it constructs a huge 'Item' object exposing all of that information.
Here's an example of how things work currently:
int itemId = createItem();                  // creates a record in the item table
associateAddressToItem(itemId);             // add a shipping address to the item
foreach (int productId in productsToAdd) {
    addProduct(itemId, productId);          // insert info into the item_product table
}
Item item = getItem(itemId);                // invokes the huge query to collect information for a single item
foreach (Product product in item.products) {
    updateItemPrice(product.Id, itemId);
    if (product.shippable())
    {
        addItemTax(itemId, product.Id);     // add tax based on address and product attributes
    }
}
item = getItem(itemId);                     // calls the huge query again to refresh the object
charge(item);                               // charge based on the item's price and tax
This flow invokes getItem twice, which I don't think is efficient at all, since it runs the huge query twice. However, it currently seems necessary, because the object has to be re-fetched from the database after the updates.
Is there a better way to handle this kind of situation? I feel it is not optimized, but I can't come up with a way to improve it.

If I've understood your question correctly, here's what you currently do:
Create an empty record in the item table.
Fill a table (item_product) with the data you have.
Request that data back.
Update the data.
Request that data back again.
Update the data again.
And every one of those steps queries the database.
What I suggest you do instead:
Update the Item object in memory first, doing as much calculation as you can without hitting the database.
Request an ID for the new record.
Write the data to the database. Once.
That way, most of the calculation is performed in memory, without running long queries to re-fetch information you already have.
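A minimal C# sketch of that idea, where every helper name (LoadProduct, ComputePrice, ComputeTax, SaveItem) is a hypothetical stand-in, not your actual API:
// Build the item graph in memory first; all names below are made up for illustration.
var item = new Item { ShippingAddress = address };   // address obtained earlier, held in memory
foreach (int productId in productsToAdd)
{
    Product product = LoadProduct(productId);        // one cheap lookup per product, or batch them
    product.Price = ComputePrice(product);           // pure in-memory calculation
    if (product.Shippable())
        product.Tax = ComputeTax(address, product);  // tax from data you already hold
    item.Products.Add(product);
}
int itemId = SaveItem(item);  // a single write that persists the item, products, prices and taxes
Charge(item);                 // charge from the in-memory totals; no second huge read needed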

Related

SSIS Inserting incrementing ID with starting range into multiple tables at a time

Is there a reliable way to solve a simple task?
I've got a number of XML files which will be converted into 6 SQL tables (via SSIS).
Before the end of this process I need to add a new column (in fact, one common to all tables) to each of them.
This column holds an ID with a given starting value and an increment step of +1, like (350000, 1).
Yes, I know how to solve this at the SSMS SQL stage, but I need a solution at the SSIS level, before the data reaches SQL.
I'm sure there are well-known patterns for dealing with this.
I am going to take a stab at this. Just to be clear, there isn't a lot of information in your question to go on.
Most XML files that I have dealt with have a common element (let's call it a customer) with one to many attributes (this can be invoices, addresses, email, contacts, etc).
So your table structure will be somewhat star shaped around the customer.
So your XML will have core customer information on a one-to-one basis that can be loaded into a single main table, plus arrays of invoices, addresses, etc. Those arrays become their own tables, referencing the customer as a key.
I think you are asking how to create that key.
Load the customer data first and return the identity column to be used as a foreign key when loading the other tables.
I find it easiest to do this in a script component. I'm only going to explain how to get the key back; I personally would handle the whole process in C# (deserializing and all).
Add this to your using block:
using System.Data.OleDb;
Add this to your Main or row-processing method, depending on where the script task / component sits:
string sql = @"INSERT INTO Customer(CustName, field1, field2, ...)
               VALUES(?, ?, ?, ...); SELECT CAST(SCOPE_IDENTITY() AS int);";
OleDbCommand cmd = new OleDbCommand();
cmd.Connection = conn;                      // an OleDbConnection you have configured elsewhere
cmd.CommandType = System.Data.CommandType.Text;
cmd.CommandText = sql;
cmd.Parameters.AddWithValue("@p1", Row.CustName);  // input column from the buffer; OleDb binds parameters by position, so names are cosmetic
...
cmd.Connection.Open();
int CustomerKey = (int)cmd.ExecuteScalar(); // ExecuteScalar returns the first row / first column, which here is SCOPE_IDENTITY()
cmd.Connection.Close();
Now you can use CustomerKey for all of the other tables.
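For example, a child-table insert could then look like this (the Invoice table and its columns are made up for illustration):
// Hypothetical child-table insert reusing the captured key.
string childSql = @"INSERT INTO Invoice(CustomerKey, InvoiceDate, Amount) VALUES(?, ?, ?);";
OleDbCommand childCmd = new OleDbCommand(childSql, cmd.Connection);
childCmd.Parameters.AddWithValue("@p1", CustomerKey);
childCmd.Parameters.AddWithValue("@p2", Row.InvoiceDate);  // assumed input columns
childCmd.Parameters.AddWithValue("@p3", Row.Amount);
cmd.Connection.Open();
childCmd.ExecuteNonQuery();
cmd.Connection.Close();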

SQL Selecting Matching Data from Another Table where matching columns have the same data

So I'm making a database of orders, and I want to create a list of options for a dropdown box on an order form. The list needs to be derived from a list of services in another table, and it needs to match a preceding Category dropdown.
I'm not sure where I'm going wrong. The query is below.
SELECT ServiceTypes.ServiceType
FROM ServiceTypes
WHERE Orders.ServiceCatagory = ServiceTypes.ServiceType;
So Orders is the table I want to pull the list into.
ServiceTypes holds the list of service types I want to pull; it's composed of ServiceType and ServiceCatagory. I need to limit the list of service types based on the service category chosen in Orders.
So if someone sets the ServiceCatagory on an order to "Install", the only results I want from ServiceTypes are those whose ServiceCatagory equals the order's ServiceCatagory.
I suspect I need a join, but I'm not sure how or what kind.
-update-
I should point out, I'm doing this in Access and just trying to populate a listbox.
My new query looks like this
SELECT ServiceTypes.ServiceType
FROM ServiceTypes
INNER JOIN Orders ON Orders.ServiceCatagory = ServiceTypes.ServiceCatagory;
Still not sure if that's right
So I've tested some of the SQL I've been using to get the information, and I think I've actually got the correct code:
SELECT ServiceTypes.ServiceType
FROM ServiceTypes
WHERE ServiceTypes.ServiceCatagory = Orders.ServiceCatagory
The real problem was my implementation in Access and where the query was being called from.
I didn't need a JOIN; I just needed to change where the call is made, i.e. after the Category has been selected, rather than at the database design level.

Simplest way to persist data in Azure - recommended options?

I'm giving Azure a go with MVC4, and have the simplest data storage requirement I can think of. Each time my controller is hit, I want to record the datetime and a couple of other details to some location. There will most likely be only a few thousand hits at most per month. Then I'd like to view a page telling me how many hits (rows appended) there are.
Writing to a text file in the folder with Server.MapPath... gives permission errors and seems not to be possible due to the platform's distributed nature. Getting a whole SQL instance is around $10 a month. Table or blob storage sounds promising, but setting up the service and learning to use it seems nowhere near as simple as a basic file or database.
Any thoughts would be appreciated.
Use table storage. For all intents and purposes it's free (it'll be pennies per month for that volume, a fraction of the cost of your web roles anyway).
As for how complicated you think it is, it's really not. Have a look at this article to get going. http://www.windowsazure.com/en-us/develop/net/how-to-guides/table-services/#create-table
//Create a class to hold your data
public class MyLogEntity : TableEntity
{
    public MyLogEntity(int id, DateTime when)
    {
        // PartitionKey and RowKey are strings, so convert the values
        this.PartitionKey = when.ToString("yyyyMMdd");
        this.RowKey = id.ToString();
    }
    public MyLogEntity() { }
    public string OtherProperty { get; set; }
}
//Connect to table storage
var connstr = CloudConfigurationManager.GetSetting("StorageConnectionString"); // from the config file
var storageAccount = CloudStorageAccount.Parse(connstr);
// Create the table client.
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
// Create the table if it doesn't exist.
var table = tableClient.GetTableReference("MyLog");
table.CreateIfNotExists();
var e = new MyLogEntity(%SOMEID%, %SOMEDATETIME%);
e.OtherProperty = "Some Other Value";
// Create the TableOperation that inserts the log entity.
var insertOperation = TableOperation.Insert(e);
// Execute the insert operation.
table.Execute(insertOperation);
Augmenting @Eoin's answer a bit: when using table storage, tables are segmented into partitions based on the partition key you specify. Within a partition, you can either search for a specific row (via row key) or scan the partition for a group of rows. Exact-match searching is very, very fast; partition-scanning (or table-scanning) can take a while, especially with large quantities of data.
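For instance, with the SDK from the answer above, the two access patterns look roughly like this (the key values are made up for illustration):
// Exact-match lookup by partition key + row key: very fast.
var point = TableOperation.Retrieve<MyLogEntity>("20130115", "42");
var single = table.Execute(point).Result as MyLogEntity;
// Partition scan: returns every row in the partition, slower as the partition grows.
var scan = new TableQuery<MyLogEntity>().Where(
    TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "20130115"));
var matches = table.ExecuteQuery(scan);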
In your case, you want a count of rows (entities). Storing your rows seems pretty straightforward, but how will you tally up a count? By day? By month? By year? It may be worth aligning your partitions to a day or month to make counting quicker (there's no function that returns number of rows in a table or partition - you'd end up querying them).
One trick is to keep an accumulated count in another table, updated each time you write an entity. This would be very fast:
Write the entity (similar to what Eoin illustrated)
Read row from Counts table corresponding to the type of row you wrote
Increment and write value back
Now you have a very fast way to retrieve counts at any given time. You could have counts for individual days, specific months, whatever you choose. And for this, you could have the specific date as your partition key, giving you very fast access to the correct entity holding the accumulated count.
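Here's a rough sketch of that counter pattern using the same SDK (the DailyCount entity and the "Counts" table name are my assumptions):
// Hypothetical counter entity: one accumulated count per day.
public class DailyCount : TableEntity
{
    public DailyCount() { }
    public DailyCount(DateTime day)
    {
        PartitionKey = day.ToString("yyyyMMdd"); // date as partition key for fast exact-match access
        RowKey = "hits";
    }
    public int Count { get; set; }
}
// Read-modify-write today's count alongside each log write.
var countsTable = tableClient.GetTableReference("Counts");
countsTable.CreateIfNotExists();
var retrieve = TableOperation.Retrieve<DailyCount>(DateTime.UtcNow.ToString("yyyyMMdd"), "hits");
var counter = (DailyCount)countsTable.Execute(retrieve).Result ?? new DailyCount(DateTime.UtcNow);
counter.Count++;
countsTable.Execute(TableOperation.InsertOrReplace(counter));
Note that two instances incrementing at the same time can race; if that matters, use Replace with the entity's ETag and retry on conflict instead of InsertOrReplace.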

CRM 2011 sdk - quickest/most efficient way to get contact info + associated list IDs

Our CRM 2011 database contains approx. 20,000 contacts, and I need to loop through all of them using the SDK. Currently the following LINQ query takes a very long time to execute:
Dim contactList = From c In orgService.ContactSet
                  Select New With {
                      Key .ContactId = c.ContactId,
                      Key .EMailAddress1 = c.EMailAddress1,
                      Key .ListIds = From l In c.listcontact_association Select l.ListId
                  }
As you can see, I just need a couple of fields from each contact, plus a list of associated marketing list IDs. Perhaps it's taking a long time because it runs an additional query (to get the list IDs) for each contact in the result?
I'm fairly new to LINQ, so I'm not sure how the above translates into actual FetchXML communication. Is there a more efficient way of getting that info that would result in a shorter query run time?
More Info: I'm writing code to sync a CRM database with a CreateSend database. So I do need to go through all contact records, not only adding to the CS database, but also reflecting changes in list membership and updating activity or other info for each contact where needed. The sync process will eventually run nightly on the CRM server itself, so it's expected to take time to run, but of course I want to make it as efficient as possible.
In the absence of more information, it looks like your query is actually two - one query to get the list of contacts, and then another to get the list of list ids. This means SQL Server is returning the full list of 20,000 contacts, then for each contact, your code asks SQL Server for the list of associated list ids.
This means you are making 20,000 + 1 separate calls to SQL Server (though this is actually an abstracted total - the CRM SQL translator actually makes more than that, but it doesn't add significant overhead).
So what you really want to do is make just 1 query that gets you all your data, and then begin to work with it. I'm not proficient enough in VB.NET to make the translation, but the below C# code should get you there most of the way.
// Gets all contacts and list ids in a flat result set
var contacts = from c in orgService.ContactSet
               join lms in orgService.ListMemberSet on c.ContactId equals lms.EntityId.Id into leftJoinedContact
               from ljc in leftJoinedContact.DefaultIfEmpty()
               where ljc.EntityType == Xrm.Contact.EntityLogicalName
               select new
               {
                   c.ContactId,
                   c.EMailAddress1,
                   ljc.ListId
               };
// Calls .ToList() to put the result set in memory
var contactList = contacts.ToList();
// Manipulates the result set in memory, grouping by contact info
var newContactList = contactList.GroupBy(x => new { x.ContactId, x.EMailAddress1 })
    .Select(x => new { x.Key.ContactId, x.Key.EMailAddress1, Ids = x.Select(y => y.ListId).ToList() })
    .ToList();
// var contactsWithListIds = newContactList.Where(x => x.Ids.Where(y => y != null).Any()).ToList();
// var contactsWithoutListIds = newContactList.Where(x => !x.Ids.Where(y => y != null).Any()).ToList();
foreach (var contact in newContactList)
{
    // do your per-contact sync work here
    throw new NotImplementedException();
}
Yes!
But we need more information. The fastest way is to use straight SQL (i.e. filtered views).
The second fastest way is to use straight FetchXML (although some may debate that linq is on par with this in terms of performance).
However, if you can filter your query to reduce the 20,000 records to only the records you need, you'll save the most time. So the first question I have is: why are you iterating through all 20,000 records? Do you need to process them all, or do you need to check them all against certain criteria and then do something with the ones that match?

MVC creating SQL views vs. putting calculations in the controller

I applied for a job, and they required me to create a small MVC app before the interview, which I did. They rejected it, saying that I had used bad practices. Please help me figure out what I did wrong!
The task involved a simple database with a Product and a load of Sales (a one to many).
I had to:
display a list of products with total sales
display and allow editing and deletion of the sales
My solution:
create a left-joined SQL view which used a GROUP BY and a self-join to get the sales totals. This view had one row per product
create an inner-joined SQL view with all of the product and sales data. This had one row per sale
For #1, I just rendered out the view
For #2, I had to render out product details and sales details (a one to many on a single page) so I did the following in the controller:
public ActionResult Details(int id)
{
// get details for the selected product
var product = db.ProductsWithTotals.Where(q => q.ProductId == id).Single();
ViewData["CatalogueNumber"] = product.CatalogueNumber;
ViewData["Title"] = product.Title;
ViewData["Artist"] = product.Artist;
ViewData["TotalSold"] = product.TotalSold;
ViewData["ProductId"] = product.ProductId;
// then pass its sales lines to the view
var salesLines = db.SalesLineDetails.Where(q => q.ProductId == id);
return View(salesLines);
}
If anyone could explain how I could have done this more gracefully, it'd be greatly appreciated.
The only things I personally would have done differently are:
Use EF (but yeah, L2S is fine for a quickie like this)
Create a ViewModel class for the details View, as sketched below. I use ViewData as sparingly as possible.
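For example, the Details action could hand the View a single strongly-typed object instead of a handful of ViewData entries. The class below is just a sketch built from your property names; SalesLineDetail stands for whatever entity type db.SalesLineDetails returns:
// Hypothetical ViewModel bundling the product header and its sales lines.
public class ProductDetailsViewModel
{
    public int ProductId { get; set; }
    public string CatalogueNumber { get; set; }
    public string Title { get; set; }
    public string Artist { get; set; }
    public int TotalSold { get; set; }
    public IEnumerable<SalesLineDetail> SalesLines { get; set; }
}

public ActionResult Details(int id)
{
    var product = db.ProductsWithTotals.Single(q => q.ProductId == id);
    var model = new ProductDetailsViewModel
    {
        ProductId = product.ProductId,
        CatalogueNumber = product.CatalogueNumber,
        Title = product.Title,
        Artist = product.Artist,
        TotalSold = product.TotalSold,
        SalesLines = db.SalesLineDetails.Where(q => q.ProductId == id)
    };
    return View(model);  // the View is then strongly typed to ProductDetailsViewModel
}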
But yeah, there's nothing bad here, and I agree that it was probably not a good place to work at. People also tend to make up vague reasons when they don't want to say they didn't like your face.
/shrug