Why am I getting different results via LINQ to Entities than by running the SQL generated by the same query?

I'm working on a school project that was started by another group last semester. This semester I'm on a team tasked with completing it. The two groups have no members in common: my team is completely new and is trying to finish another team's project with little to no documentation.
Anyway, with that background out of the way, I am having an issue with the project. Entity Framework does not seem to like the views I have created. It may be worth mentioning that this is a complex view, created by joining about 6-7 tables.
As an arbitrary test (I don't really need answers that contain "what"), I executed this query in SQL Server Management Studio:
SELECT *
FROM [dbo].[Course_Answers_Report] -- Course_Answers_Report is a View
WHERE question like '%what%'
Which produces the following output:
survey_setup_id | course_number | crn_number | term_offered | course_title | Instructor_Name | question_type_id | question | answer
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2617 | 107013 | 5001 | 201505 | Advanced Microsoft Access | -output omitted- | 2 | I understood what the teacher was saying. | A
2617 | 107013 | 5001 | 201505 | Advanced Microsoft Access | -output omitted- | 2 | I can apply what I learned in this class. | A
2617 | 107013 | 5001 | 201505 | Advanced Microsoft Access | -output omitted- | 2 | I understood what was expected of me in this course. | A
Now in Visual Studio I have this small bit of code (as a side note, this is in MVC; however, debugging has shown that the issue doesn't lie in MVC itself but somewhere in the LINQ, Entity Framework, or controller code):
public ActionResult modelTest()
{
    using (SurveyEntities context = new SurveyEntities())
    {
        // Log the SQL that EF generates to the debug output.
        context.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);
        var questions = context
            .Course_Answers_Report
            .Where(r => r.question.Contains("what"))
            .ToList();
        ViewBag.Questions = questions;
    }
    return View();
}
This outputs the following table in the View (again, the problem is decidedly not in the View, because when debugging, the variable that holds the List already contains the incorrect data):
survey_setup_id | course_number | crn_number | term_offered | course_title | Instructor_Name | question_type_id | question | answer
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2617 | 107013 | 5001 | 201505 | Advanced Microsoft Access | -output omitted- | 2 | I understood what the teacher was saying. | A
2617 | 107013 | 5001 | 201505 | Advanced Microsoft Access | -output omitted- | 2 | I understood what the teacher was saying | A
2617 | 107013 | 5001 | 201505 | Advanced Microsoft Access | -output omitted- | 2 | I understood what the teacher was saying. | A
As you can see, this output is incorrect: the question (or rather the record) never changes when it should.
The SQL generated by this LINQ statement is:
SELECT
[Extent1].[survey_setup_id] AS [survey_setup_id],
[Extent1].[course_number] AS [course_number],
[Extent1].[crn_number] AS [crn_number],
[Extent1].[term_offered] AS [term_offered],
[Extent1].[course_title] AS [course_title],
[Extent1].[Instructor_Name] AS [Instructor_Name],
[Extent1].[question_type_id] AS [question_type_id],
[Extent1].[question] AS [question],
[Extent1].[answer] AS [answer]
FROM (SELECT
[Course_Answers_Report].[survey_setup_id] AS [survey_setup_id],
[Course_Answers_Report].[course_number] AS [course_number],
[Course_Answers_Report].[crn_number] AS [crn_number],
[Course_Answers_Report].[term_offered] AS [term_offered],
[Course_Answers_Report].[course_title] AS [course_title],
[Course_Answers_Report].[Instructor_Name] AS [Instructor_Name],
[Course_Answers_Report].[question_type_id] AS [question_type_id],
[Course_Answers_Report].[question] AS [question],
[Course_Answers_Report].[answer] AS [answer]
FROM [dbo].[Course_Answers_Report] AS [Course_Answers_Report]) AS [Extent1]
WHERE [Extent1].[question] LIKE N'%what%'
When this SQL is run inside SQL Server Management Studio, it produces the proper results. I am at a loss as to why EF is behaving this way. Can anyone offer insight?
EDIT: Per request of Danny Varod, the EDMX can be found here http://pastebin.com/dUf6J4fV and the View can be found here http://pastebin.com/sCsqNYWc (the view is kind of ugly/sloppy as it was just supposed to be a test and experiment)

Your problem is visible in the edmx file:
warning 6002: The table/view 'wctcsurvey.dbo.Course_Answers_Report' does not have a primary key defined. The key has been inferred and the definition was created as a read-only table/view.
<EntityType Name="Course_Answers_Report">
<Key>
<PropertyRef Name="survey_setup_id" />
</Key>
No primary key is defined on the view, so EF has "guessed" one. The guessed column, survey_setup_id, is not unique in the view (all 3 rows in the correct result have the same value). EF identifies entities by their key, so it treats the three rows as the same object and returns it 3 times (they share the same guessed primary key, after all).
If you define a correct key in your model (i.e. a column, or combination of columns, that is unique for every row), the problem will disappear.
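To see why a non-unique key produces repeated rows, here is a deliberately simplified Python model of key-based identity resolution. It is not EF code, and the `materialize` helper is hypothetical; it also assumes, purely for illustration, that survey_setup_id plus question is unique in the view, which you would need to verify against your own data.

```python
# The three rows the SQL query really returns (other columns left out).
rows = [
    {"survey_setup_id": 2617, "question": "I understood what the teacher was saying."},
    {"survey_setup_id": 2617, "question": "I can apply what I learned in this class."},
    {"survey_setup_id": 2617, "question": "I understood what was expected of me in this course."},
]

def materialize(rows, key_columns):
    """Simplified identity map: keep one object per distinct key value."""
    identity_map = {}
    result = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        # A row whose key was already seen resolves to the first object with that key.
        entity = identity_map.setdefault(key, row)
        result.append(entity)
    return result

# Key guessed from a non-unique column: the same object comes back three times.
bad = materialize(rows, ["survey_setup_id"])
# Hypothetical composite key that is unique per row: three distinct objects.
good = materialize(rows, ["survey_setup_id", "question"])

print([entity["question"] for entity in bad])
print([entity["question"] for entity in good])
```

With the single-column key all three results are the first question repeated, which matches the symptom in the question; with a key that is unique per row, each row keeps its own values.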

Related

PowerBI Report or SQL Query Grouping Data Spanning Columns

I'm wracking my brain trying to figure this out. I have a dataset / table that looks like this:
ID | Person1 | Person2 | Person3 | EffortPerPerson
01 | Bob | Ann | Frank | 2
02 | Frank | Bob | Joe | 3
03 | Ann | Joe | Beth | 1
I'm trying to add up "Effort" for each person. For example, Bob is 2+3, Joe is 3+1, etc. My goal is to produce a Power BI scatter plot showing total Effort for each person.
In a perfect world, the query shouldn't care how many "Person" fields there are. It should just add up the Effort value for every row in which the individual's name appears.
I thought GROUP BY would work, but that groups on a single column, and I can't wrap my head around how to make nested queries work here.
Anyone have any ideas? Thanks in advance!
As Nick suggested, you should go with the Unpivot transformation. Go to Edit Queries and select the Transform tab.
Select the columns you want to turn into rows, open the dropdown menu under Unpivot Columns, and select "Unpivot Only Selected Columns".
And that's it! Power BI will aggregate the values for you.
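If you would rather do the same thing on the SQL side, the unpivot can be approximated with UNION ALL followed by GROUP BY. The sketch below runs it against an in-memory SQLite database from Python; the table name `effort` and the lower-case column names are assumptions made for the example. Unlike the Power BI transformation, this form needs one SELECT per Person column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE effort (id TEXT, person1 TEXT, person2 TEXT, person3 TEXT,"
    " effort_per_person INTEGER)"
)
conn.executemany(
    "INSERT INTO effort VALUES (?, ?, ?, ?, ?)",
    [
        ("01", "Bob", "Ann", "Frank", 2),
        ("02", "Frank", "Bob", "Joe", 3),
        ("03", "Ann", "Joe", "Beth", 1),
    ],
)

# Stack the three person columns into one column, then group on it.
totals = conn.execute(
    """
    SELECT person, SUM(effort_per_person) AS total_effort
    FROM (
        SELECT person1 AS person, effort_per_person FROM effort
        UNION ALL
        SELECT person2, effort_per_person FROM effort
        UNION ALL
        SELECT person3, effort_per_person FROM effort
    ) AS unpivoted
    GROUP BY person
    ORDER BY person
    """
).fetchall()

print(totals)  # [('Ann', 3), ('Beth', 1), ('Bob', 5), ('Frank', 5), ('Joe', 4)]
```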

Conditionally Relate New Tables

I run a store that sells cigarettes, which have certain promotions given to us by the manufacturer that affect their pricing. These promotions are organized into groups of products (like all menthol or all reds) and are subject to frequent change, making them a bear to manage. My end goal here is to create a table (or tables) that will help me track these promotions and run an UPDATE query that will adjust their prices.
I have an inventory table like this:
itemnum|dept_id|cost |price
-----------------------------
123 | cig | 2.6 | 3.4
234 | 401 | 2.22| 23.4
345 | cig | 3.33| 3.45
456 | cig | 4.00| 4.56
567 | 901 | 4.5 | 5.67
678 | cig | 4.1 | 6.25
789 | cig | 5.2 | 6.25
My initial thought was creating a set of new tables like
CigGroup
Brand | Group_id | Itemnum
-------------------------------
Altria| a_men | 123
Altria| a_men | 345
Altria| a_black | 456
RJR | r_crush | 678
RJR | r_crush | 789
And
CigGroup_Promo
Group_id |promo_1|promo_2|promo_n...|net_promo|
--------------------------------------------
a_men | .5 | 1 | .1 | 1.6 (promo_1 + ...promo_n...)
a_red | .25 | 1 | NULL | 1.25
a_black | .25 | .5 | .1 | .85
r_crush | .25 | .1 | NULL | .35
r_filter | .35 | .5 | NULL | .85
I thought that maybe I could do something conditional with foreign keys and have CigGroup.Itemnum reference inventory.itemnum only when inventory.dept_id = 'cig', though from "SQL Server Conditional Foreign Key" I gathered that this might not be possible. (I've also looked into composite keys, but I'm not sure how to apply them to my data.)
So, here are my questions:
First, is it possible to populate my new table(s) (however that ends up being structured) with inventory.itemnum only when inventory.dept_id = 'cig' ?
Second, can I set CigGroup_Promo.Net_Promo as a function of promo_1, promo_2, promo_n..., or is that yet another table that I would be creating?
Any suggestions on how to structure tables for these data and how to relate them would be greatly appreciated.
Side note: I could, instead of creating CigGroup, create new values for inventory.dept_id, which I would honestly prefer not to do, but might make things simpler.
Once all the tables are created and related, I'm hoping to be able to run something like:
UPDATE i SET price =
    CASE WHEN 1.07 * (i.cost - p.net_promo) >= .5 + (i.cost - p.net_promo)
         THEN 1.07 * (i.cost - p.net_promo)
         ELSE .5 + (i.cost - p.net_promo)
    END
FROM inventory i JOIN CigGroup g ON i.itemnum = g.itemnum
JOIN CigGroup_Promo p ON g.group_id = p.group_id
It looks to me like several designs are possible, depending on how the source data is loaded and whether you need to track all periodic changes (in which case your model will need date/time support).
There are a variety of options, but I would explore a star schema design: wide, descriptive dimension tables linked by primary key / foreign key relationships to a central fact table that records all your transactions (in your case, the various "promotion" prices that need to be tracked).
For your example, as I understand it, I would opt for a star schema with dimensions for item, brandGroup, and anything else required, along with one fact table for tracking inventory and another for tracking price updates. Designing the tables as a conformed dimensional model allows all types of analysis across this new warehouse.
With regard to your "CigGroup" table specifically, I would create an "Items" table holding the most granular SKU / item on sale, which can then be structured into a hierarchy using attributes, i.e. new columns in the table.
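As one possible reading of this advice, and not the only way to model it, the sketch below stores one row per promotion instead of promo_1 ... promo_n columns and derives net_promo with a view, so it never has to be maintained by hand. It runs on in-memory SQLite from Python; the names `cig_group`, `group_promo` and `group_net_promo` are hypothetical, and only part of the sample data from the question is loaded. Because only cigarette items are ever inserted into cig_group, the price update cannot touch other departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inventory (itemnum INTEGER PRIMARY KEY, dept_id TEXT, cost REAL, price REAL);
    CREATE TABLE cig_group (itemnum INTEGER PRIMARY KEY REFERENCES inventory(itemnum),
                            brand TEXT, group_id TEXT);
    -- One row per promotion, so adding a promotion never changes the schema.
    CREATE TABLE group_promo (group_id TEXT, promo_name TEXT, amount REAL);
    -- net_promo is derived on demand from the promotion rows.
    CREATE VIEW group_net_promo AS
        SELECT group_id, SUM(amount) AS net_promo FROM group_promo GROUP BY group_id;

    INSERT INTO inventory VALUES (123, 'cig', 2.6, 3.4), (234, '401', 2.22, 23.4),
                                 (345, 'cig', 3.33, 3.45), (456, 'cig', 4.00, 4.56),
                                 (567, '901', 4.5, 5.67);
    INSERT INTO cig_group VALUES (123, 'Altria', 'a_men'), (345, 'Altria', 'a_men'),
                                 (456, 'Altria', 'a_black');
    INSERT INTO group_promo VALUES ('a_men', 'promo_1', 0.5), ('a_men', 'promo_2', 1.0),
                                   ('a_men', 'promo_3', 0.1), ('a_black', 'promo_1', 0.25),
                                   ('a_black', 'promo_2', 0.5), ('a_black', 'promo_3', 0.1);
    """
)

# Same pricing rule as in the question, applied only to items that have a group.
conn.execute(
    """
    UPDATE inventory
    SET price = (
        SELECT CASE WHEN 1.07 * (inventory.cost - p.net_promo) >= 0.5 + (inventory.cost - p.net_promo)
                    THEN 1.07 * (inventory.cost - p.net_promo)
                    ELSE 0.5 + (inventory.cost - p.net_promo)
               END
        FROM cig_group g
        JOIN group_net_promo p ON p.group_id = g.group_id
        WHERE g.itemnum = inventory.itemnum
    )
    WHERE itemnum IN (SELECT itemnum FROM cig_group)
    """
)

prices = dict(conn.execute("SELECT itemnum, ROUND(price, 2) FROM inventory"))
print(prices)
```

If you need a history of price changes, as the answer suggests, the promotion rows would also need effective dates; that is left out here to keep the example short.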

How to join the DurationDetails table with the costs for each program

How should I design a database for a tourism company to calculate the cost of flights and hotels for each tour program, based on date?
What I have so far is:
Table - program
+-----------+-------------+
| ProgramID | ProgramName |
+-----------+-------------+
| 1 | Alexia |
| 2 | Amon |
| 3 | Sfinx |
+-----------+-------------+
Every program has a duration of either 8 days or 15 days; those are the only two options.
So I created a ProgramDuration table with a one-to-many relationship to program (one program, many durations).
Table - ProgramDuration
+------------+-----------+---------------+
| DurationNo | programID | Duration |
+------------+-----------+---------------+
| 1 | 1 | 8 for Alexia |
| 2 | 1 | 15 for Alexia |
+------------+-----------+---------------+
The same applies to the Amon and Sfinx programs: each has an 8-day and a 15-day version.
Every 8- or 15-day program has fixed details for each day, as follows:
Table Duration Details
+------+--------+--------------------+-------------------+
| Days | Hotel | Flight | transfers |
+------+--------+--------------------+-------------------+
| Day1 | Hilton | amsterdam to luxor | airport to hotel |
| Day2 | Hilton | | AbuSimple musuem |
| Day3 | Hilton | | |
| Day4 | Hilton | | |
| Day5 | Hilton | Luxor to amsterdam | |
+------+--------+--------------------+-------------------+
Every program starts on its flight date, so
if the flight date is 25/06/2017 for the 8-day Alexia program, the costs will be as follows:
+------------+-------+--------+----------+
| Date | Hotel | Flight | Transfer |
+------------+-------+--------+----------+
| 25/06/2017 | 25 | 500 | 20 |
| 26/06/2017 | 25 | | 55 |
| 27/06/2017 | 25 | | |
| 28/06/2017 | 25 | | |
| 29/06/2017 | 25 | 500 | |
+------------+-------+--------+----------+
This is what I actually need: how do I set up the relationships to join the costs with the program,
for flight and hotel costs as above?
For these 5 days the cost will be 1200:
25 is the cost per day for the Hilton hotel
500 is the cost per flight
20 and 55 are the costs of the transfers
[Image showing what I need: relation between duration and cost]
Truthfully, I don't fully understand exactly what you're trying to accomplish. Your description is not clear, your tables seem to be missing information / contain information that should not be in your tables, and the way that I'm understanding your description doesn't really make sense based on the UI screenshot that you shared.
It looks like you're working on an application for a travel agency which will allow agents to create an itinerary for a trip. They can give this trip a name (so if a particular package is a hit with customers, they can just offer the "Alexia" package), and the utility will calculate the total estimated cost of the trip. If I understand correctly, the trips will be either 8 or 15 days long.
Personally, I would delete the "ProgramDuration" table altogether. If there are two versions of the Alexia trip at index 1, then you're going to run into all manner of issues. I can get into the details of why this is a bad idea, but unless you're really hung up on having this ProgramDuration table, it's not worth the time. You should add a "duration" field to your "program" table, and assign a new ProgramID for each different duration version of the "Alexia" program.
Your table "Duration details" also misses the mark. Your fields in this table will make it harder to add new features to your application down the line. You should have a field "ProgramID," which we will use to join this table against the program table later. You should have a field "Day" which obviously indicates the day in the itinerary. You should have only one more field "ItemID." We're going to use the "ItemID" field to join your itinerary against a new items table we're going to create.
Your items table is where you define all of the items that can possibly appear in an itinerary. Your current itinerary table has three possible "types" of expenses, flights, hotels, and transfers. What if your travel agents want to start adding meal expenditures into their itineraries / budgets? What about activities that cost money? What about currency exchange fees? What about items that your clientele will need before their trip (wall adapters, luggage, etc.)? In your items table, you will have fields for an ItemID, ItemName, ItemUnitPrice, and ItemType. A possible item is as follows:
ItemID: 1, ItemName: Night At The Hilton, ItemUnitPrice: 300, ItemType: Lodging
Using the "SELECT [Column] AS [Alias]" syntax with some CTEs or subqueries and the JOIN operator, we can easily reconstitute a table that looks like your "Program Duration Details" table, but we will be afforded considerably more flexibility to add or remove things later down the line.
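To make the join concrete, here is a rough sketch of the three-table layout described above, run on in-memory SQLite from Python. The table and column names (`program`, `item`, `itinerary`, `unit_price`, and so on) are my own assumptions, and the prices are the ones from the question's cost table rather than the example item above. Only the five days shown in the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE program (program_id INTEGER PRIMARY KEY, program_name TEXT, duration INTEGER);
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, item_name TEXT, unit_price REAL, item_type TEXT);
    CREATE TABLE itinerary (program_id INTEGER, day INTEGER, item_id INTEGER);

    INSERT INTO program VALUES (1, 'Alexia', 8);
    INSERT INTO item VALUES
        (1, 'Night at the Hilton', 25, 'Hotel'),
        (2, 'Amsterdam to Luxor', 500, 'Flight'),
        (3, 'Luxor to Amsterdam', 500, 'Flight'),
        (4, 'Airport to hotel', 20, 'Transfer'),
        (5, 'Museum transfer', 55, 'Transfer');
    -- One row per item per day of the program.
    INSERT INTO itinerary VALUES
        (1, 1, 1), (1, 1, 2), (1, 1, 4),
        (1, 2, 1), (1, 2, 5),
        (1, 3, 1),
        (1, 4, 1),
        (1, 5, 1), (1, 5, 3);
    """
)

# Turn day numbers into calendar dates from the flight date and pivot costs by item type.
start_date = "2017-06-25"
daily = conn.execute(
    """
    SELECT date(?, '+' || (it.day - 1) || ' days') AS travel_date,
           SUM(CASE WHEN i.item_type = 'Hotel' THEN i.unit_price ELSE 0 END) AS hotel,
           SUM(CASE WHEN i.item_type = 'Flight' THEN i.unit_price ELSE 0 END) AS flight,
           SUM(CASE WHEN i.item_type = 'Transfer' THEN i.unit_price ELSE 0 END) AS transfer
    FROM itinerary it
    JOIN item i ON i.item_id = it.item_id
    WHERE it.program_id = ?
    GROUP BY it.day
    ORDER BY it.day
    """,
    (start_date, 1),
).fetchall()

total = sum(hotel + flight + transfer for _, hotel, flight, transfer in daily)
for row in daily:
    print(row)
print("total:", total)  # total: 1200.0
```

The total matches the 1200 the question expects for those five days, and a new kind of expense only needs a new row in item, not a new column.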
In the interests of security and programmability, I would also add a table called "ItemTypeTable" with a single field "TypeName." You can use this table to prevent unauthorized users from defining new item types, and you can use this table to create drop down menus, navigation, and all manner of other useful features. There might be cleaner implementations, but this shouldn't represent a serious performance or size hit.
All in all, at the risk of being somewhat rude, it seems like you're trying to take on a rather large, sophisticated task with a very rudimentary understanding of basic relational database design and implementation. If you are doing this in a professional context, I would strongly encourage you to consider consulting with another professional that may be more experienced in this area.

Doing BULK Insert SUSPENDED With Wait Type LCK_M_RIn_LN

I'm having awful problems doing a bulk insert. I'm using SqlBulkCopy to insert a number of rows into a table. At first I would get a Timeout exception, so I set the SqlBulkCopy's BulkCopyTimeout to a ridiculous(?) 1800 seconds. The exception is no longer thrown (yet). So I checked the Activity Monitor in SQL Server Management Studio (as suggested in "Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. The statement has been terminated") and saw that my BULK INSERT's Task Status is SUSPENDED with a Wait Type of LCK_M_RIn_LN. My code goes like this:
Using sqlCon As SqlConnection = connection.Connect()
    Dim sqlBulkCopy As New SqlBulkCopy(sqlCon, SqlBulkCopyOptions.CheckConstraints And
                                               SqlBulkCopyOptions.FireTriggers And
                                               SqlBulkCopyOptions.KeepNulls And
                                               SqlBulkCopyOptions.KeepIdentity, sqlTran)
    sqlBulkCopy.BulkCopyTimeout = 1800 ' is this ridiculous?
    sqlBulkCopy.BatchSize = 1000
    sqlBulkCopy.DestinationTableName = destinationTable
    sqlBulkCopy.WriteToServer(dataTableObject)
    sqlTran.Commit()
End Using
I have been searching for solutions on the web, but to no avail, although I did find this definition of LCK_M_RIn_LN:
Occurs when a task is waiting to acquire a NULL lock on the current key value, and an Insert Range lock between the current and previous key. A NULL lock on the key is an instant release lock. For a lock compatibility matrix, see sys.dm_tran_locks (Transact-SQL).
from http://msdn.microsoft.com/en-us/library/ms179984.aspx
But it's not helping. Could someone help me out? My deepest gratitude.
Edit
I think it's because of the KeepIdentity attribute because the primary key is auto incremented. This is according to SqlBulkCopy Insert with Identity Column. I'll see if it fixes my issue.
Edit 2
I don't know what's happening, but BULK INSERT worked fine when I tested it in Management Studio (using Transact-SQL directly). Maybe the problem is with SqlBulkCopy. When I checked the Activity Monitor, the query it generated was this:
insert bulk TableName ([ColumnName] Int)
Edit 3
I forgot to mention that I'm actually using Entity Framework, so I copied some code (translated from C# to VB, actually) that creates a DataTable from an entity object, since EntityDataReader is only available for C# (which distressed me). But anyway, I dropped the SqlBulkCopy approach and just stored the values in XML, because when I looked at it I realized I did not need the values in a database.
I hit something similar trying to bulk insert from Java, but with wait type ASYNC_NETWORK_IO, e.g.
+-----------+-------+-------------+---------+--------+----------------+--------------------------------------+
| Status | BlkBy | Command | CPUTime | DiskIO | LastBatch | ProgramName |
+-----------+-------+-------------+---------+--------+----------------+--------------------------------------+
| SUSPENDED | . | BULK INSERT | 15 | 4 | 09/16 02:42:04 | Microsoft JDBC Driver for SQL Server |
+-----------+-------+-------------+---------+--------+----------------+--------------------------------------+
It's hard to say what the exact issue was, but there are a few things I observed:
Either the driver swallows errors or you only get them when the copy completes; e.g. when I tried to insert a single row, I got exceptions with the errors I needed to fix.
Tuning can be important, specifically the batch size (see https://dba.stackexchange.com/questions/165966/how-does-one-investigate-the-performance-of-a-bulk-insert-statement)
Once I'd addressed these, the full load worked as expected.
Here are some stats I generated for batch size/rows (note the data is going across the Atlantic); the point is that the performance is very variable.
+------------+------+----------+----------+----------+
| batch size | rows | start | end | duration |
+------------+------+----------+----------+----------+
| 100 | 2500 | 09:15:45 | 09:18:17 | 00:02:32 |
| 1000 | 2500 | 09:23:34 | 09:25:35 | 00:02:00 |
| 2500 | 2500 | 09:32:53 | 09:34:55 | 00:02:01 |
| 2500 | 7500 | 10:27:18 | 10:30:49 | 00:03:31 |
| 7500 | 7500 | 10:38:10 | 10:45:57 | 00:07:47 |
+------------+------+----------+----------+----------+
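If you want to collect numbers like these for your own setup, a small harness along the following lines may help. It is only a sketch in Python against in-memory SQLite, so the timings say nothing about SQL Server or a transatlantic link; the `insert_in_batches` helper and the `target` table are hypothetical names, and the point is just the shape of the measurement: same rows, different batch sizes, one commit per batch.

```python
import sqlite3
import time

def insert_in_batches(conn, rows, batch_size):
    """Insert rows in chunks of batch_size, committing after each chunk."""
    started = time.perf_counter()
    for offset in range(0, len(rows), batch_size):
        batch = rows[offset:offset + batch_size]
        conn.executemany("INSERT INTO target (value) VALUES (?)", batch)
        conn.commit()
    return time.perf_counter() - started

rows = [(n,) for n in range(2500)]

for batch_size in (100, 1000, 2500):
    # Fresh database per run so earlier runs do not affect later ones.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (value INTEGER)")
    elapsed = insert_in_batches(conn, rows, batch_size)
    count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
    print(f"batch size {batch_size}: {count} rows in {elapsed:.4f}s")
    conn.close()
```

Against a real server, repeat each run several times, since a single measurement can easily be dominated by network noise.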

How To Traverse a Tree/Work With Hierarchical data in SQL Code

Say I have an employee table, with a record for each employee in my company and a column for supervisor (as seen below). I would like to prepare a report that lists the name and title for each step in a supervision line. E.g. for dick robbins, id #15, I'd like a list of each supervisor in his "chain of command," all the way up to the president, big cheese. I'd like to avoid using cursors, but if that's the only way to do this then that's OK.
id fname lname title supervisorid
1 big cheese president 1
2 jim william vice president 1
3 sally carr vice president 1
4 ryan allan senior manager 2
5 mike miller manager 4
6 bill bryan manager 4
7 cathy maddy foreman 5
8 sean johnson senior mechanic 7
9 andrew koll senior mechanic 7
10 sarah ryans mechanic 8
11 dana bond mechanic 9
12 chris mcall technician 10
13 hannah ryans technician 10
14 matthew miller technician 11
15 dick robbins technician 11
The real data probably won't be more than 10 levels deep, but I'd rather not just do 10 outer joins. I was hoping there was something better than that, and less involved than cursors.
Thanks for any help.
This is basically a port of the accepted answer on my question that I linked to in the OP comments.
You can use common table expressions:
WITH Family AS
(
    SELECT e.id, e.supervisorid, 0 AS Depth
    FROM Employee e
    WHERE id = @SupervisorID
    UNION ALL
    SELECT e2.id, e2.supervisorid, Depth + 1
    FROM Employee e2
    JOIN Family
        ON Family.id = e2.supervisorid
    -- The president is his own supervisor; skip that row to avoid endless recursion.
    WHERE e2.id <> e2.supervisorid
)
SELECT *
FROM Family
For more:
Recursive Queries Using Common Table Expressions
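The CTE above starts at a supervisor and walks down to everyone below them, whereas the question asks for the opposite direction: from one employee up to the president. Here is a sketch of that upward walk, run on in-memory SQLite from Python with the question's sample data. SQLite spells it WITH RECURSIVE, while SQL Server uses plain WITH; the CTE name `chain` is my own choice. The guard in the recursive part is needed because the president is listed as his own supervisor.

```python
import sqlite3

employees = [
    (1, "big", "cheese", "president", 1),
    (2, "jim", "william", "vice president", 1),
    (3, "sally", "carr", "vice president", 1),
    (4, "ryan", "allan", "senior manager", 2),
    (5, "mike", "miller", "manager", 4),
    (6, "bill", "bryan", "manager", 4),
    (7, "cathy", "maddy", "foreman", 5),
    (8, "sean", "johnson", "senior mechanic", 7),
    (9, "andrew", "koll", "senior mechanic", 7),
    (10, "sarah", "ryans", "mechanic", 8),
    (11, "dana", "bond", "mechanic", 9),
    (12, "chris", "mcall", "technician", 10),
    (13, "hannah", "ryans", "technician", 10),
    (14, "matthew", "miller", "technician", 11),
    (15, "dick", "robbins", "technician", 11),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT,"
    " title TEXT, supervisorid INTEGER)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", employees)

chain = conn.execute(
    """
    WITH RECURSIVE chain(id, fname, lname, title, supervisorid, depth) AS (
        -- Anchor: the employee we start from.
        SELECT id, fname, lname, title, supervisorid, 0
        FROM employee
        WHERE id = ?
        UNION ALL
        -- Recursive step: the supervisor of the row found in the previous step.
        SELECT s.id, s.fname, s.lname, s.title, s.supervisorid, c.depth + 1
        FROM employee s
        JOIN chain c ON s.id = c.supervisorid
        WHERE c.id <> c.supervisorid  -- stop once we reach the self-supervising president
    )
    SELECT depth, fname, lname, title FROM chain ORDER BY depth
    """,
    (15,),
).fetchall()

for depth, fname, lname, title in chain:
    print(depth, fname, lname, title)
```

For id 15 this lists dick robbins at depth 0 and then each supervisor in turn, ending with big cheese at depth 7.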
You might be interested in the "Materialized Path" solution, which does slightly denormalize the table but can be used on any type of SQL database and saves you from having to write recursive queries. In fact, it can even be used on NoSQL databases.
You just need to add a column which holds the entire ancestry of the object. For example, the table below includes a column named tree_path:
+----+-----------+----------+----------+
| id | value | parent | tree_path|
+----+-----------+----------+----------+
| 1 | Some Text | 0 | |
| 2 | Some Text | 0 | |
| 3 | Some Text | 2 | -2-|
| 4 | Some Text | 2 | -2-|
| 5 | Some Text | 3 | -2-3-|
| 6 | Some Text | 3 | -2-3-|
| 7 | Some Text | 1 | -1-|
+----+-----------+----------+----------+
Selecting all the descendants of the record with id=2 looks like this:
SELECT * FROM comment_table WHERE tree_path LIKE '-2-%' ORDER BY tree_path ASC
To build a tree, you can sort by tree_path to get an array that's fairly easy to convert to a tree.
You can also index tree_path and the index can be used when the wildcard is not at the beginning.
For example, tree_path LIKE '-2-%' can use the index, but tree_path LIKE '%-2-' cannot.
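Here is a small sketch of the idea using the sample table above, run on in-memory SQLite from Python. The `child_path` helper is hypothetical and simply shows how a new row's tree_path could be derived from its parent. Note that with this layout the ancestors of a row (the "chain of command" direction) can be read straight out of its own tree_path without any query recursion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comment_table (id INTEGER PRIMARY KEY, value TEXT,"
    " parent INTEGER, tree_path TEXT)"
)
conn.executemany(
    "INSERT INTO comment_table VALUES (?, ?, ?, ?)",
    [
        (1, "Some Text", 0, ""),
        (2, "Some Text", 0, ""),
        (3, "Some Text", 2, "-2-"),
        (4, "Some Text", 2, "-2-"),
        (5, "Some Text", 3, "-2-3-"),
        (6, "Some Text", 3, "-2-3-"),
        (7, "Some Text", 1, "-1-"),
    ],
)

def child_path(conn, parent_id):
    """Path to store on a new child: the parent's own path plus the parent's id."""
    (parent_path,) = conn.execute(
        "SELECT tree_path FROM comment_table WHERE id = ?", (parent_id,)
    ).fetchone()
    return f"{parent_path or '-'}{parent_id}-"

# All descendants of id 2: every row whose path starts with '-2-'.
descendants = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM comment_table WHERE tree_path LIKE '-2-%' ORDER BY tree_path, id"
    )
]

# All ancestors of id 5, nearest last: split its own path.
(path_of_5,) = conn.execute("SELECT tree_path FROM comment_table WHERE id = 5").fetchone()
ancestors = [int(part) for part in path_of_5.strip("-").split("-") if part]

print(descendants)          # [3, 4, 5, 6]
print(ancestors)            # [2, 3]
print(child_path(conn, 3))  # -2-3-
```

The cost of this approach is that moving a subtree means rewriting tree_path on every row beneath it.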
Another option is a recursive function that returns the supervisor (if any) or null. It could also be a stored procedure that invokes itself, combining the results with UNION.
SQL is a language for performing set operations and recursion is not one of them. Further, many database systems have limitations on recursion using stored procedures as a safety measure to prevent rogue code from running away with precious server resources.
So, when working with SQL, always think 'flat', not 'hierarchical'. I would therefore highly recommend the 'tree_path' method that has been suggested. I have used the same approach, and it works wonderfully and, crucially, very robustly.