SQL Join Problems - sql

I'm having a problem that I assume is related to the Join in my SQL statement.
select s.customer as 'Customer',
s.store as 'Store',
s.item as 'Item',
d.dlvry_dt as 'Delivery',
i.item_description as 'Description',
mj.major_class_description as 'Major Description',
s.last_physical_inventory_dt as 'Last Physical Date',
s.qty_physical as 'Physical Qty',
s.avg_unit_cost as 'Unit Cost',
[qty_physical]*[avg_unit_cost] as Value
from argus.DELIVERY d
join argus.STORE_INVENTORY s
ON (s.store = d.store)
join argus.ITEM_MASTER i
ON (s.item = i.item)
join argus.MINOR_ITEM_CLASS mi
ON (i.minor_item_class = mi.minor_item_class)
join argus.MAJOR_ITEM_CLASS mj
ON (mi.major_item_class = mj.major_item_class)
where s.last_physical_inventory_dt between '6/29/2011' and '7/2/2012'
and s.customer = '20001'
and s.last_physical_inventory_dt IS NOT NULL
It comes back with a seemingly endless number of copies of one record. Is there something wrong with the way I'm joining these tables?

join argus.MINOR_ITEM_CLASS mi
ON (i.minor_item_class = mi.minor_item_class)
join argus.MAJOR_ITEM_CLASS mj
ON (mi.major_item_class = mj.major_item_class)
My guess is that the problem is in one of these two joins. A bare JOIN is an INNER JOIN, which returns one row for every pair of rows that satisfies the ON condition, so a key that matches several rows on the other side multiplies the output. I don't know what your data looks like, but I am assuming there is a many-to-many relationship between minor item class and major item class. If so, the query returns the same record repeatedly, identical in almost every field, with only the major item class differing.
I would look at the results. Most of the columns will have repeating data that doesn't change while one of the columns will have a different value for every row. That should tell you that the column with differing data for each row is the column that you should be joining differently.
Otherwise, I would say that your query is formatted correctly.
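One way to test that guess is to count how often each join key occurs in the table being joined to. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real server. The two miniature tables and their rows are hypothetical, since the actual argus data isn't shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature data; the real argus tables are not shown in the question.
    CREATE TABLE item_master (item TEXT, minor_item_class TEXT);
    CREATE TABLE minor_item_class (minor_item_class TEXT, major_item_class TEXT);
    INSERT INTO item_master VALUES ('A', 'M1');
    -- The same minor class listed under two major classes (join key is not unique).
    INSERT INTO minor_item_class VALUES ('M1', 'X'), ('M1', 'Y');
""")

# Join keys that occur more than once in the joined-to table.
repeated_keys = conn.execute("""
    SELECT minor_item_class, COUNT(*) AS n
    FROM minor_item_class
    GROUP BY minor_item_class
    HAVING COUNT(*) > 1
""").fetchall()

# Every repeated key multiplies the matching item rows in the join result.
joined = conn.execute("""
    SELECT i.item, mi.major_item_class
    FROM item_master i
    JOIN minor_item_class mi ON i.minor_item_class = mi.minor_item_class
    ORDER BY mi.major_item_class
""").fetchall()

print(repeated_keys)
print(joined)
```

If the first query returns any rows for a table, joining to that table on that key alone will repeat records. The same check can be run against each table in the original query in turn.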

Related

SQL Server query stuck when adding order by clause

I'm working on writing some queries for a very large (and messy) database on behalf of a client. So far, I've just been grabbing the columns I need and making sure that they contain the data I'm looking for. There have been no issues up until I add anything to my "order by" clause. When I add an order by clause to my query, the SQL Server remains stuck on "executing query." Example of code below:
SELECT TOP 100
PEATR.[EffectiveDate] AS 'Effective Date',
CustomerRoot.[CustomerName] AS 'Name',
CustomerAddressRoot.[Street1] AS 'Address 1',
CustomerAddressRoot.[Street2] AS 'Address 2',
CustomerAddressRoot.[City],
CustomerAddressRoot.[State],
CustomerAddressRoot.[Zip],
CustomerAddressRoot.[Country],
CustomerAddressRoot.[AddressDesc] AS 'Description'
FROM PrEmployeeAccrueTierRoot AS PEATR, CustomerRoot, CustomerAddressRoot
ORDER BY CustomerRoot.[CustomerName]
I have also tried creating an inner join on CustomerRoot and CustomerAddressRoot, as the query is returning repeat data, especially in the "Address 1" column. When I ran the code below, I received the following error message:
The objects "CustomerRoot" and "CustomerRoot" in the FROM clause have the same exposed names. Use correlation names to distinguish them.
Code:
SELECT TOP 100
PEATR.[EffectiveDate] AS 'Effective Date',
CustomerRoot.[CustomerName] AS 'Name',
CustomerAddressRoot.[Street1] AS 'Address 1',
CustomerAddressRoot.[Street2] AS 'Address 2',
CustomerAddressRoot.[City],
CustomerAddressRoot.[State],
CustomerAddressRoot.[Zip],
CustomerAddressRoot.[Country],
CustomerAddressRoot.[AddressDesc] AS 'Description'
FROM PrEmployeeAccrueTierRoot AS PEATR, CustomerRoot, CustomerAddressRoot
INNER JOIN CustomerRoot ON CustomerAddressRoot.CustomerId=CustomerRoot.CustomerId
I did assign aliases to all of the tables previously, though I was still returning the same error message. Any guidance or suggestions would be much appreciated.
You are performing a Cartesian join.
FROM PrEmployeeAccrueTierRoot AS PEATR, CustomerRoot, CustomerAddressRoot
...is the same as...
FROM PrEmployeeAccrueTierRoot AS PEATR
inner join CustomerRoot on 1=1
inner join CustomerAddressRoot on 1=1
There is no join logic.
So, if PrEmployeeAccrueTierRoot has 1000 rows and CustomerRoot has 1000 rows and CustomerAddressRoot has 1000 rows, your result will have 1,000,000,000 rows.
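The row arithmetic is easy to confirm on a small scale. The sketch below uses Python's sqlite3 module with three hypothetical ten-row tables (a, b and c are made-up names, not the tables from the question) and counts the rows produced by the comma syntax and by the equivalent "on 1=1" joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three hypothetical tables with 10 rows each.
for name in ("a", "b", "c"):
    conn.execute(f"CREATE TABLE {name} (id INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(i,) for i in range(10)])

# Comma-separated FROM list with no join conditions.
comma_join = conn.execute("SELECT COUNT(*) FROM a, b, c").fetchone()[0]

# The same query written as inner joins whose condition is always true.
on_true = conn.execute(
    "SELECT COUNT(*) FROM a INNER JOIN b ON 1=1 INNER JOIN c ON 1=1"
).fetchone()[0]

print(comma_join, on_true)  # both are 10 * 10 * 10 = 1000
```

With 1000 rows per table instead of 10, the same product is the 1,000,000,000 rows mentioned above.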
Try two things:
Include join logic.
Your query should look something like this:
SELECT TOP 100
PEATR.[EffectiveDate] AS 'Effective Date',
CustomerRoot.[CustomerName] AS 'Name',
CustomerAddressRoot.[Street1] AS 'Address 1',
CustomerAddressRoot.[Street2] AS 'Address 2',
CustomerAddressRoot.[City],
CustomerAddressRoot.[State],
CustomerAddressRoot.[Zip],
CustomerAddressRoot.[Country],
CustomerAddressRoot.[AddressDesc] AS 'Description'
FROM PrEmployeeAccrueTierRoot AS PEATR
inner join CustomerRoot on CustomerRoot.customerrootid = peatr.customerrootid
inner join CustomerAddressRoot on CustomerAddressRoot.customerrootid = CustomerRoot.customerrootid
ORDER BY CustomerRoot.[CustomerName]
Of course, I don't know what columns you should actually join on. Know thy data.
Then you'll have one problem remaining: If your query (without the TOP 100) would return millions of rows, you're asking the database server to perform all of the logic and gather all of the rows, then sort them by CustomerName, then return the first 100 rows. That could still be slow. You'll want to...
Apply filters.
SELECT TOP 100
PEATR.[EffectiveDate] AS 'Effective Date',
CustomerRoot.[CustomerName] AS 'Name',
CustomerAddressRoot.[Street1] AS 'Address 1',
CustomerAddressRoot.[Street2] AS 'Address 2',
CustomerAddressRoot.[City],
CustomerAddressRoot.[State],
CustomerAddressRoot.[Zip],
CustomerAddressRoot.[Country],
CustomerAddressRoot.[AddressDesc] AS 'Description'
FROM PrEmployeeAccrueTierRoot AS PEATR
inner join CustomerRoot on CustomerRoot.customerrootid = peatr.customerrootid
inner join CustomerAddressRoot on CustomerAddressRoot.customerrootid = CustomerRoot.customerrootid
WHERE CustomerAddressRoot.State = 'Alaska'
ORDER BY CustomerRoot.[CustomerName]
...at least during testing, to speed things up.

Display multiple rows in a single row

I have been trying to achieve this:
Instead I am getting similar but NOT the result I'm hoping for:
In this result that I got from my query, I have rows repeating the SAME thing. For example, if you look at the first 4 results that I highlighted, I want the 1st row to appear and the next 3 to disappear, just like in the first image attached, NO repetition.
I have tried my ways but have got nothing. As in the first image attached, that is the kind of result I am looking for. Kind of nested rows in the last column. Below is what I have tried. I am also attaching a link to my .sql file for ease if anyone can help me with this problem (link). I am using MS SQL.
SELECT cj.completed_job_id AS 'Job Card No.',
c.cus_name AS 'Customer',
c.cus_address AS 'Address',
jt.job_type AS 'Job Type',
cj.no_of_days AS 'No. of Days',
CONCAT(jm.mat_quantity, ' ', jm.mat_type) AS 'Materials Used'
FROM completed_jobs cj
JOIN customers c
ON cj.customer_id = c.customer_id
JOIN job_types jt
ON cj.job_type = jt.job_type
JOIN job_materials jm
ON cj.completed_job_id = jm.completed_job_id;
You can select all of the columns except "Materials Used" and add a subquery that returns only the materials for that row.
In that subquery you can convert the result to XML so that all the materials land in one cell, then strip the XML tags in a following step.
Not very elegant, but it works. Part of the SQL is below:
SELECT cj.completed_job_id AS 'Job Card No.',
c.cus_name AS 'Customer',
c.cus_address AS 'Address',
jt.job_type AS 'Job Type',
cj.no_of_days AS 'No. of Days',
(SELECT CONCAT(jm.mat_quantity, ' ', jm.mat_type) as material FROM job_materials jm WHERE cj.completed_job_id = jm.completed_job_id FOR XML AUTO) AS 'Materials Used'
FROM completed_jobs cj
JOIN customers c
ON cj.customer_id = c.customer_id
JOIN job_types jt
ON cj.job_type = jt.job_type

SQL Query Stuck in Infinite Loop

I'm running a pretty simple SQL query on my database, and it seems to be returning the same record over and over, creating an infinite loop. Maybe I'm missing something obvious, but I don't see it. Here's the query:
select s.customer as 'Customer',
s.store as 'Store',
s.item as 'Item',
d.dlvry_dt as 'Delivery',
i.item_description as 'Description',
mj.major_class_description as 'Major Description',
s.last_physical_inventory_dt as 'Last Physical Date',
s.qty_physical as 'Physical Qty',
s.avg_unit_cost as 'Unit Cost',
[qty_physical] * [avg_unit_cost] as Value
from database.DELIVERY d,
database.STORE_INVENTORY s,
database.ITEM_MASTER i,
database.MINOR_ITEM_CLASS mi,
database.MAJOR_ITEM_CLASS mj,
database.STORE_INVENTORY_ADJUSTMENT sa
where sa.store = s.store
and s.last_physical_inventory_dt between '6/29/2011' and '7/2/2011'
and s.customer = '20001'
and s.last_physical_inventory_dt is not null
There is one record that falls on 7/1/2011 and it repeats it forever until I cancel the query.
Any help on preventing this?
You're joining all these tables: database.DELIVERY, database.ITEM_MASTER, database.MINOR_ITEM_CLASS, and database.MAJOR_ITEM_CLASS - without specifying how to join them. You need to specify how these tables are joined with the rest.
If each of these tables has ONLY 100 rows, that gives you at least 100 * 100 * 100 * 100 rows (100 million)! (see Cartesian Join)
You haven't joined all your tables. For example, database.MINOR_ITEM_CLASS, database.MAJOR_ITEM_CLASS and database.ITEM_MASTER are missing joins. Missing joins cause the query engine to do a Cartesian join on the tables not explicitly joined. So you don't have an endless loop; you just get many duplicate copies of the same record. Eventually your query will stop.
Add the appropriate joins for those tables and let us know how it goes. You could also try adding the DISTINCT keyword.
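To see why a single matching record gets repeated, here is a small sketch with Python's sqlite3 module. The two tables are cut-down, hypothetical versions of STORE_INVENTORY and ITEM_MASTER, and the join column (item) is an assumption about the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down tables; one inventory record, three items in the master list.
    CREATE TABLE store_inventory (store TEXT, item TEXT);
    CREATE TABLE item_master (item TEXT, item_description TEXT);
    INSERT INTO store_inventory VALUES ('S1', 'A');
    INSERT INTO item_master VALUES ('A', 'apple'), ('B', 'banana'), ('C', 'cherry');
""")

# No join condition: the single inventory record is paired with every item_master row.
unjoined = conn.execute("""
    SELECT s.store, s.item
    FROM store_inventory s, item_master i
""").fetchall()

# With a join condition, only the matching item_master row is paired with it.
joined = conn.execute("""
    SELECT s.store, s.item, i.item_description
    FROM store_inventory s
    JOIN item_master i ON i.item = s.item
""").fetchall()

print(len(unjoined), unjoined)
print(len(joined), joined)
```

The unjoined query returns the same (store, item) pair once per row of item_master. DISTINCT would collapse those copies, but the server would still have to build the full product first.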

SQL query question

So...I'm new to sql and loving what I've learned, but here is a question I'm stumped on.
I've created a website form. It's a timecard type form. A user logs on and clicks a pull down menu (which is being selected from the DB) It's called jobsite, then he scrolls down and clicks Jobsite#2. (both fields are selecting from the same table..) When they submit, it submits into the DB just fine. Showing both fields as the ID number they picked.. Not the name..
SELECT jobsite.Jobsite as 'Customer'
FROM timeCard
LEFT JOIN jobsite ON timeCard.jobsite = jobsite.id
That query works great. Pulling timeCard.jobsite information. If I change it to
LEFT JOIN jobsite ON timeCard.jobsite2 = jobsite.id
That works great too pulling all the inputted data from Jobsite2...What query will work to get them to show as 'Customer 1, and Customer2' on the same query? Does my question make sense? I'm trying to get it to export into Excel so that it would show up with all the information on one line basically. Employee 1, worked at jobsite 1 and jobsite 4 (but instead show: Employee (me) worked at (this place) and (that place)
Where both jobsites come from the same table in the DB
Thanks for any help you guys/gals can give
You're going to need to join to the jobsite table twice, because one timecard record is associated with two different jobsite records (assuming some things about your schema and DBMS...):
SELECT t.EmployeeName, j1.jobsite AS 'Customer 1', j2.jobsite AS 'Customer 2'
FROM timecard t
LEFT JOIN jobsite j1 ON t.jobsite = j1.id
LEFT JOIN jobsite j2 ON t.jobsite2 = j2.id
SELECT js1.jobsite AS customer1, js2.jobsite AS customer2
FROM timecard tc
LEFT JOIN
jobsite js1
ON js1.id = tc.jobsite
LEFT JOIN
jobsite js2
ON js2.id = tc.jobsite2
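Both queries above join the jobsite table twice under different aliases. A runnable sketch of that pattern, using Python's sqlite3 module, is below. The employee column and the sample rows are made up for illustration, since the real timecard schema isn't shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and data; only jobsite, jobsite2 and id come from the question.
    CREATE TABLE jobsite (id INTEGER PRIMARY KEY, jobsite TEXT);
    CREATE TABLE timecard (employee TEXT, jobsite INTEGER, jobsite2 INTEGER);
    INSERT INTO jobsite VALUES (1, 'Warehouse'), (4, 'Harbor');
    INSERT INTO timecard VALUES ('Employee 1', 1, 4), ('Employee 2', 4, NULL);
""")

# Each alias (j1, j2) is an independent lookup into the same jobsite table.
rows = conn.execute("""
    SELECT t.employee, j1.jobsite AS customer1, j2.jobsite AS customer2
    FROM timecard t
    LEFT JOIN jobsite j1 ON t.jobsite = j1.id
    LEFT JOIN jobsite j2 ON t.jobsite2 = j2.id
    ORDER BY t.employee
""").fetchall()

for row in rows:
    print(row)
```

Because both joins are LEFT JOINs, a timecard with no second jobsite still shows up, with an empty second customer.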
Are you asking for something like this?
SELECT jobsite.Jobsite as 'Customer'
FROM timeCard, jobSite
WHERE timeCard.jobsite = jobsite.id
OR timeCard.jobsite2 = jobsite.id
(PS. Since you mentioned you are new to SQL, I'd like to take this opportunity to make sure you know about parameterized queries (articles on ASP.Net, PHP), which you should always be using instead of concatenating strings)

choosing latest string when aggregating results in mysql

I've been tasked with generating some reports on our Request Tracker usage. Request Tracker is a ticketing system we use for several departments where I work. To do this I'm taking a nightly snapshot of details about tickets altered for the day into another database. This approach decouples my reporting from the internal database schema that RT uses.
Amongst many other questions for the report, I need to report how many tickets were resolved in each month per Department. In RT the department is stored as a CustomField, and my modelling follows that trend, as you can see in my query below. However due to how I'm grabbing snapshots each night, I have multiple rows for a ticket, and the Department field can change over the month. I'm only interested in the most recent Department field. I don't know how to get that in a query.
I know I can use 'GROUP BY' to reduce my query results down to one per ticket, but when I do that, I don't know how to grab the last Department setting. As the Departments are all strings, a MAX() doesn't get the last one. MySQL doesn't require you to use an aggregating function for fields you're selecting, but the results are indeterminate (from my testing it looks like it might grab the first one on my version of MySQL).
To illustrate, here are the results from a query that shows me two tickets, and all their Department field settings:
"ticket_num","date","QueueName","CF","CFValue","closed"
35750,"2009-09-22","IT_help","Department","",""
35750,"2009-09-23","IT_help","Department","",""
35750,"2009-09-24","IT_help","Department","",""
35750,"2009-09-25","IT_help","Department","",""
35750,"2009-09-26","IT_help","Department","",""
35750,"2009-10-02","IT_help","Department","",""
35750,"2009-10-03","IT_help","Department","",""
35750,"2009-10-12","IT_help","Department","",""
35750,"2009-10-13","IT_help","Department","",""
35750,"2009-10-26","IT_help","Department","Conference/Visitors","2009-10-26 10:10:32"
35750,"2009-10-27","IT_help","Department","Conference/Visitors","2009-10-26 10:10:32"
36354,"2009-10-20","IT_help","Department","",""
36354,"2009-10-21","IT_help","Department","",""
36354,"2009-10-22","IT_help","Department","FS Students",""
36354,"2009-10-23","IT_help","Department","FS Students",""
36354,"2009-10-26","IT_help","Department","FS Students","2009-10-26 12:23:00"
36354,"2009-10-27","IT_help","Department","FS Students","2009-10-26 12:23:00"
As we can see, both tickets were closed on the 26th, and both tickets had an empty Department field for a few days when they first showed up. I've included my query below; you can see that I've artificially limited the rows returned in the second half of the WHERE clause:
SELECT d.ticket_num, d.date, q.name as QueueName, cf.name as CF, cfv.value as CFValue, d.closed
FROM daysCF dcf
INNER JOIN daily_snapshots d on dcf.day_id = d.id
INNER JOIN Queues q on d.queue_id = q.id
INNER JOIN CustomFieldValues cfv on dcf.cfv_id = cfv.id
INNER JOIN CustomFields cf on cf.id = cfv.field_id
WHERE cf.name = 'Department' and (d.ticket_num = 35750 or d.ticket_num = 36354)
ORDER by d.ticket_num, d.date
How can I modify that query so I get a result set that tells me that in October there was one ticket closed for "FS Students" and one ticket closed for "Conference/Visitors"?
This is the "greatest-n-per-group" problem that comes up frequently on Stack Overflow.
Here's how I'd solve it in your case:
SELECT d1.ticket_num, d1.date, q.name as QueueName,
cf.name as CF, cfv.value as CFValue, d1.closed
FROM daysCF dcf
INNER JOIN daily_snapshots d1 ON (dcf.day_id = d1.id)
INNER JOIN Queues q ON (d1.queue_id = q.id)
INNER JOIN CustomFieldValues cfv ON (dcf.cfv_id = cfv.id)
INNER JOIN CustomFields cf ON (cf.id = cfv.field_id)
LEFT OUTER JOIN daily_snapshots d2 ON (d1.ticket_num = d2.ticket_num AND d1.date < d2.date)
WHERE d2.id IS NULL AND cf.name = 'Department'
ORDER by d1.ticket_num, d1.date;
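The self-join trick is easier to follow on a stripped-down table. The sketch below uses Python's sqlite3 module and simplifies the schema by storing the department directly on each snapshot row, which is an assumption made for brevity (the real schema goes through daysCF and CustomFieldValues). The rows are a subset of the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical layout: department stored directly on the snapshot.
    CREATE TABLE daily_snapshots (
        id INTEGER PRIMARY KEY, ticket_num INTEGER, date TEXT, department TEXT
    );
    INSERT INTO daily_snapshots (ticket_num, date, department) VALUES
        (35750, '2009-10-13', ''),
        (35750, '2009-10-26', 'Conference/Visitors'),
        (35750, '2009-10-27', 'Conference/Visitors'),
        (36354, '2009-10-21', ''),
        (36354, '2009-10-27', 'FS Students');
""")

# d2 matches any later snapshot of the same ticket. A d1 row with no such
# match (d2.id IS NULL) is therefore the latest snapshot for its ticket.
latest = conn.execute("""
    SELECT d1.ticket_num, d1.date, d1.department
    FROM daily_snapshots d1
    LEFT OUTER JOIN daily_snapshots d2
      ON d1.ticket_num = d2.ticket_num AND d1.date < d2.date
    WHERE d2.id IS NULL
    ORDER BY d1.ticket_num
""").fetchall()

for row in latest:
    print(row)
```

Each ticket comes back once, carrying the department from its most recent snapshot.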
MySQL doesn't have a LAST aggregate, so one way to do this is with a temporary table.
CREATE TEMPORARY TABLE last_dates SELECT ticket_num, MAX(date) AS date
FROM daily_snapshots GROUP BY ticket_num
That gets you a table with the last date for each ticket. Then in your main query, join against this table with both the ticket_num and date fields. This will filter out all rows for which the date isn't the latest for the corresponding ticket number.
You might need an index on that temporary table, I'll leave that to you.
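Here is a rough sketch of the temporary-table approach end to end, using Python's sqlite3 module. As a simplification it keeps the department directly on the snapshot row, which is an assumption and not the real RT-derived schema. SQLite also needs the AS keyword in CREATE TABLE ... AS SELECT, where the MySQL statement above can omit it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical layout: department stored directly on the snapshot.
    CREATE TABLE daily_snapshots (ticket_num INTEGER, date TEXT, department TEXT);
    INSERT INTO daily_snapshots VALUES
        (35750, '2009-10-26', ''),
        (35750, '2009-10-27', 'Conference/Visitors'),
        (36354, '2009-10-22', ''),
        (36354, '2009-10-27', 'FS Students');

    -- Last snapshot date per ticket.
    CREATE TEMPORARY TABLE last_dates AS
        SELECT ticket_num, MAX(date) AS date
        FROM daily_snapshots
        GROUP BY ticket_num;
""")

# Joining on both ticket_num and date keeps only each ticket's latest row,
# which can then be counted per department.
counts = conn.execute("""
    SELECT s.department, COUNT(*) AS tickets
    FROM daily_snapshots s
    JOIN last_dates l
      ON l.ticket_num = s.ticket_num AND l.date = s.date
    GROUP BY s.department
    ORDER BY s.department
""").fetchall()

print(counts)
```

The MAX(date) comparison relies on the dates being stored in ISO year-month-day form, so that string order and date order agree.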