Problems while trying to optimize my SQL (inner join and group)

I'm having a problem joining and grouping two tables. I'm using MS SQL Server 2005 Express.
Thank you in advance!

You just need to add date_request to your JOIN criteria:
SELECT otd.userid,otd.task,otd.date_request,ot.approved_by
FROM otd
JOIN ot
ON otd.userid = ot.requested_by
AND otd.date_request = ot.date_request
WHERE otd.userid ='xxx'
AND CONVERT(varchar,otd.date_request,101) BETWEEN '09/10/2013' AND '09/11/2013'
AND ot.status ='A'
ORDER BY otd.date_request,ot.date_request ASC
Demo: SQL Fiddle
Note: The dates are changed in the Fiddle, but the extra JOIN criterion is the important part. Also, I'm not sure why you're converting your date field, but if it's already a date type you can just change the format of your date strings and skip the conversion (as in the Fiddle).

Based on your schema, there's no way to determine which task in otd the record in ot is referring to. Perhaps you meant to include a task column in ot? For example, take a look at your first record in otd. Task 1 by user xxx requested on 9/10/2013. Now look at all the records in ot. You're joining ot on otd.userid = ot.requested_by, and there are two records in ot requested by xxx. So that join matches those two records for task 1 by xxx, and the same two records for task 2 by xxx, and again for tasks 5 and 6.
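The row multiplication described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the rows are made up for illustration (they are not the asker's data); only the table and column names come from the queries in this thread.

```python
import sqlite3

# Hypothetical rows shaped like the otd/ot tables; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE otd (userid TEXT, task TEXT, date_request TEXT);
    CREATE TABLE ot  (requested_by TEXT, date_request TEXT, approved_by TEXT, status TEXT);
    INSERT INTO otd VALUES ('xxx', 'task 1', '2013-09-10'),
                           ('xxx', 'task 2', '2013-09-10'),
                           ('xxx', 'task 5', '2013-09-11'),
                           ('xxx', 'task 6', '2013-09-11');
    INSERT INTO ot  VALUES ('xxx', '2013-09-10', 'boss1', 'A'),
                           ('xxx', '2013-09-11', 'boss2', 'A');
""")

# Joining on the user alone pairs every otd row with every ot row for that user.
user_only = conn.execute("""
    SELECT otd.task, ot.approved_by
    FROM otd JOIN ot ON otd.userid = ot.requested_by
""").fetchall()

# Adding date_request to the join criteria keeps one ot row per otd row.
user_and_date = conn.execute("""
    SELECT otd.task, ot.approved_by
    FROM otd JOIN ot ON otd.userid = ot.requested_by
                    AND otd.date_request = ot.date_request
    ORDER BY otd.task
""").fetchall()

print(len(user_only))      # 4 otd rows x 2 ot rows = 8
print(len(user_and_date))  # 4
```

If two requests by the same user can share a date and have different approvers, even the date is not enough, which is why a task column in ot would be the more robust key.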

Related

Having troubles with a conditional count in SQL

I'm working on an SQL project (involving a library database) and I'm having a hard time figuring out how to make a conditional count.
So, I have 4 tables: Imprumuturi, Cititori, Autori, Carti. I need to list the 'Cititori' that have more than one borrowed 'Carti' at the current time.
I tried to use
SELECT cititori.nume_cititor,COUNT(imprumuturi.pk_cititor)
AS numar_imprumuturi FROM cititori, imprumuturi
WHERE imprumuturi.data_return IS NULL GROUP BY cititori.nume_cititor
HAVING COUNT(imprumuturi.pk_cititor)>1
ORDER BY cititori.nume_cititor;
And while it lists all the 'Cititori', it doesn't count the number of active borrowed 'Carti' as it should.
Can I get a hint or some help on how to make it work?
These are the fields of my database
Seems you missed the relation between the tables:
SELECT cititori.nume_cititor,COUNT(imprumuturi.pk_cititor)
AS numar_imprumuturi
FROM cititori
INNER JOIN imprumuturi ON imprumuturi.pk_cititor = cititori.pk_cititor
WHERE imprumuturi.data_return IS NULL
GROUP BY cititori.nume_cititor
HAVING COUNT(imprumuturi.pk_cititor)>1
ORDER BY cititori.nume_cititor;
As suggested, you should not use the old implicit join syntax based on comma-separated table names and where condition, but use explicit join syntax.
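To see why the missing relation inflates the count, here is a minimal sketch using Python's sqlite3 module in place of the asker's database. The sample readers and loans are invented, and the key column name pk_cititor on both tables is an assumption, since the real schema was only shown as an image.

```python
import sqlite3

# Hypothetical library data; pk_cititor as the shared key is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cititori (pk_cititor INTEGER PRIMARY KEY, nume_cititor TEXT);
    CREATE TABLE imprumuturi (pk_cititor INTEGER, data_return TEXT);
    INSERT INTO cititori VALUES (1, 'Ana'), (2, 'Ion'), (3, 'Maria');
    -- Ana has two open loans, Ion one open and one returned, Maria none.
    INSERT INTO imprumuturi VALUES (1, NULL), (1, NULL), (2, NULL), (2, '2020-01-05');
""")

# Comma join without a relation: every reader is paired with every open loan.
implicit = conn.execute("""
    SELECT c.nume_cititor, COUNT(i.pk_cititor)
    FROM cititori c, imprumuturi i
    WHERE i.data_return IS NULL
    GROUP BY c.nume_cititor
    HAVING COUNT(i.pk_cititor) > 1
    ORDER BY c.nume_cititor
""").fetchall()

# Explicit join on the key: each reader is counted against their own loans only.
explicit = conn.execute("""
    SELECT c.nume_cititor, COUNT(i.pk_cititor)
    FROM cititori c
    INNER JOIN imprumuturi i ON i.pk_cititor = c.pk_cititor
    WHERE i.data_return IS NULL
    GROUP BY c.nume_cititor
    HAVING COUNT(i.pk_cititor) > 1
    ORDER BY c.nume_cititor
""").fetchall()

print(implicit)  # [('Ana', 3), ('Ion', 3), ('Maria', 3)]
print(explicit)  # [('Ana', 2)]
```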

How can I write a SQL query to display ID, First Name, Last Name of all players with more than 2000 career hits based on schema

I am new to SQL and DB management. I am working on writing queries based on a schema which you can find below. This is an exercise for me to get familiar reading, writing queries on SQL Server for my job. Could you please help me out defining query based on the schema and simply explain the logic?
Thanks a lot!
SQL Server is my DBMS and here is the question:
Display ID, First Name, Last Name, and Hits to display all players with more than 2000 career hits.
This one you can get by typing this query in Microsoft SQL Server
SELECT
MLB_PLAYERS.FIRST_NAME,
MLB_PLAYERS.LAST_NAME,
MLB_PLAYERS.ID,
CAREER_STATS.HITS
FROM
MLB_PLAYERS LEFT JOIN CAREER_STATS on MLB_PLAYERS.ID=CAREER_STATS.ID
WHERE
CAREER_STATS.HITS>2000
So you have a simple structure to follow:
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
Here you only need three of them: SELECT, FROM and WHERE. With SELECT you decide which columns you want in the output. In FROM you choose the tables those columns come from. If you use two different tables, you need to join them. I used a left join because I wanted to match hits to existing players. We can match them on a shared key, which in this case is their ID. Finally, you use WHERE to apply conditions to your query.
I guess you could do it with a join and a group
select p.FIRST_NAME,
p.LAST_NAME,
p.ID,
sum(g.HITS) as hits -- total the hits; count() would only count rows
from MLB_PLAYERS p
left join KEY_GAMES_STATS g on p.ID = g.ID -- not sure how to link these 2 tables
group by p.FIRST_NAME,
p.LAST_NAME,
p.ID
having sum(g.HITS) > 2000

Question on getting number of day from single date column

Above is the screenshot of the tables for my practice. I want to extract the number of days between the earliest and latest sales made by staff 'Ali'. I do not have any SQL IDE to run the code and want to check any problem with my code.
SELECT DAYDIFF(day, MAX(st.Date), MIN(st.Date)) AS Duration
FROM SALES_TRANSACTION AS ST
LEFT JOIN SALES_MASTER AS sm
ON sm.Product_ID = st.Product_ID
GROUP BY sm.Staff_Name
HAVING sm.Staff_Name = 'Ali'
ORDER BY st.Date DESC
Here is the dataset
https://drive.google.com/file/d/13XCxQgbEONU22ZDYhQq-I1u-dh3A2fPc/view?usp=sharing
You want logic more like this:
SELECT DATEDIFF(day, MIN(st.Date), MAX(st.Date)) AS Duration
FROM SALES_TRANSACTION ST JOIN
STAFF_MASTER sm
ON sm.Staff_id = st.Staff_Id
WHERE sm.Staff_Name = 'Ali';
Note the changes:
The filtering is in the WHERE clause rather than the HAVING. In general, it is better to filter before aggregating if possible.
The LEFT JOIN is replaced by a JOIN. First, you need a match to get the name. Second, the foreign key reference should be valid so an outer join should not be necessary.
The correct table for the staff name is STAFF_MASTER.
If you are using SQL Server (which has the 3 argument DATEDIFF() syntax), then the smaller date is the second argument.
And finally, there are many tools on the web where you can test SQL, such as db<>fiddle, SQL Fiddle, and db-fiddle. You can also download free databases onto almost any platform.
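If no SQL tool is at hand, Python's built-in sqlite3 module is one way to try the logic locally. The sketch below is only an approximation: SQLite has no DATEDIFF(), so the day difference is computed with julianday() instead, and the rows are invented for illustration. Table and column names follow the answer above.

```python
import sqlite3

# Hypothetical rows; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STAFF_MASTER (Staff_Id INTEGER PRIMARY KEY, Staff_Name TEXT);
    CREATE TABLE SALES_TRANSACTION (Staff_Id INTEGER, Date TEXT);
    INSERT INTO STAFF_MASTER VALUES (1, 'Ali'), (2, 'Siti');
    INSERT INTO SALES_TRANSACTION VALUES
        (1, '2021-01-03'), (1, '2021-01-20'), (1, '2021-01-10'), (2, '2021-02-01');
""")

# Filter on the staff name in WHERE, then aggregate the earliest and latest dates.
# julianday() replaces DATEDIFF(day, ...), which SQLite does not provide.
duration = conn.execute("""
    SELECT CAST(julianday(MAX(st.Date)) - julianday(MIN(st.Date)) AS INTEGER) AS Duration
    FROM SALES_TRANSACTION st
    JOIN STAFF_MASTER sm ON sm.Staff_Id = st.Staff_Id
    WHERE sm.Staff_Name = 'Ali'
""").fetchone()[0]

print(duration)  # 17 (2021-01-03 to 2021-01-20)
```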

How do I force SQL Server 2005 to run a join before the where?

I've got a SQL query that joins a pricing table to a table containing user-provided answers. My query is used to get the price based on the entered quantity. Below is my SQL statement:
SELECT JobQuestion.Value, Price.Min, Price.Max, Price.Amount FROM Price
INNER JOIN JobQuestion
ON Price.QuestionFK=JobQuestion.QuestionFK
AND JobQuestion.JobFK=1
WHERE Price.Min <= JobQuestion.Value
AND Price.Max >= JobQuestion.Value
The problem is SQL Server is running the where clause before the JOIN and it is throwing the error:
Conversion failed when converting the varchar value 'TEST' to data type int.
because it is doing the min and max comparisons before the join ('TEST' is a valid user-entered value in the JobQuestion table, but should not be returned when JobQuestion is joined to Price). I believe SQL Server is choosing to run the WHERE first because for some reason the optimizer thinks that would be a more efficient query. If I just run
SELECT JobQuestion.Value, Price.Min, Price.Max, Price.Amount FROM Price
INNER JOIN JobQuestion
ON Price.QuestionFK=JobQuestion.QuestionFK
AND JobQuestion.JobFK=1
I get these results back:
500 1 500 272.00
500 501 1000 442.00
500 1001 2000 782.00
So, adding the WHERE should filter out the last two and just return the first record. How do I force SQL to run the JOIN first or use another technique to filter out just the records I need?
Try "re-phrasing" the query as follows:
SELECT *
FROM (
SELECT JobQuestion.Value,
Price.Min,
Price.Max,
Price.Amount
FROM Price
INNER
JOIN JobQuestion
ON Price.QuestionFK = JobQuestion.QuestionFK
AND JobQuestion.JobFK = 1
) SQ
WHERE SQ.Min <= SQ.Value
AND SQ.Max >= SQ.Value
As per the answer from Christian Hayter, if you have the choice, change the table design =)
You shouldn't be comparing strings to ints. If you have any influence at all over your table design, then split the two different uses of the JobQuestion.Value column into two different columns.
First, this is very likely a sign of poor design.
If you cannot change schema, then maybe you could force this behavior using hints. Quote:
Hints are options or strategies specified for enforcement by the SQL Server query processor on SELECT, INSERT, UPDATE, or DELETE statements. The hints override any execution plan the query optimizer might select for a query.
And some more:
Caution:
Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend that <join_hint>, <query_hint>, and <table_hint> be used only as a last resort by experienced developers and database administrators.
In case you have no influence over your table design - Could you try to filter out those records with numeric values using ISNUMERIC()? I would guess adding this to your where clause could help.
You can likely remove the WHERE and just add those conditions as predicates to your join. Since it's an inner join, this should work:
SELECT JobQuestion.Value, Price.Min, Price.Max, Price.Amount
FROM Price
INNER JOIN JobQuestion
ON Price.QuestionFK=JobQuestion.QuestionFK
AND JobQuestion.JobFK=1
AND Price.Min <= JobQuestion.Value
AND Price.Max >= JobQuestion.Value
You can use TRY_PARSE on those string columns to convert them to numeric values; if SQL Server cannot convert a value, it gives you NULL instead of an error message.
P.S. TRY_PARSE was first introduced in SQL Server 2012, so it is not available on SQL Server 2005, but it might be helpful on newer versions.

SQL join to find inconsistencies between two data sources

I have a SQL challenge that is wracking my brain. I am trying to reconcile two reports for licenses of an application.
The first report is an Access database table. It has been created and maintained by hand by my predecessors. Whenever they installed or uninstalled the application they would manually update the table, more or less. It has a variety of columns of inconsistent data, including Name (displayName), Network ID (SAMAccountName), and Computer Name. Each record has a value for at least one of these fields. Most only have 1 or 2 of the values, though.
The second report is based on an SMS inventory. It has three columns: NetbiosName for the computer name, SAMAccountName, and displayName. Every record has a NetbiosName, but there are some nulls in SAMAccountName and displayName.
I have imported both of these as tables in an MS SQL Server 2005 database.
What I need to do is get a report of each record in the Access table that is not in the SMS table and vice versa. I think it can be done with a properly formed join and where clause, but I can't see how to do it.
Edit to add more detail:
If the records match for at least one of the three columns, it is a match. So I need the records from the Access table where the Name, NetworkID, and ComputerName are all missing from the SMS table. I can do it for any one column, but I can't see how to combine all three columns.
Taking Kaboing's answer and the edited question, the solution seems to be:
SELECT *
FROM report_1 r1
FULL OUTER JOIN report_2 r2
ON r1.SAMAccountName = r2.SAMAccountName
OR r1.NetbiosName = r2.NetbiosName
OR r1.DisplayName = r2.DisplayName
WHERE r2.NetbiosName IS NULL OR r1.NetbiosName IS NULL
Not sure whether records will show up multiple times
You need to look at the EXCEPT operator. It's new in SQL Server 2005 and does the same thing that Oracle's MINUS does.
SQL1
EXCEPT
SQL2
will give you all the rows in SQL1 not found in SQL2
IF
SQL1 = A, B, C, D
SQL2 = B, C, E
the result is A, D
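The A, B, C, D example can be checked with a short sketch. It runs in SQLite through Python's sqlite3 module rather than SQL Server (SQLite also supports EXCEPT), and the table names sql1 and sql2 are placeholders mirroring the notation above.

```python
import sqlite3

# sql1 and sql2 are placeholder tables holding the values from the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sql1 (val TEXT);
    CREATE TABLE sql2 (val TEXT);
    INSERT INTO sql1 VALUES ('A'), ('B'), ('C'), ('D');
    INSERT INTO sql2 VALUES ('B'), ('C'), ('E');
""")

# Rows in sql1 that do not appear in sql2.
only_in_sql1 = [row[0] for row in conn.execute(
    "SELECT val FROM sql1 EXCEPT SELECT val FROM sql2 ORDER BY val")]

# Swapping the operands gives the other direction of the comparison.
only_in_sql2 = [row[0] for row in conn.execute(
    "SELECT val FROM sql2 EXCEPT SELECT val FROM sql1 ORDER BY val")]

print(only_in_sql1)  # ['A', 'D']
print(only_in_sql2)  # ['E']
```

Note that EXCEPT compares whole rows, so on its own it does not express the "match on any one of three columns" rule from the question.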
Building on Gabriel1836's answer, made simpler, but perhaps a bit harder to interpret:
SELECT *
FROM report_1 r1
FULL OUTER JOIN report_2 r2 ON r1.SAMAccountName = r2.SAMAccountName
WHERE r2.SAMAccountName IS NULL OR r1.SAMAccountName IS NULL
Take a look at the tablediff.exe utility that comes with SQL Server.
Try the following:
SELECT r1.displayName, 'report_1' as type
FROM report_1 r1
LEFT OUTER JOIN report_2 r2 ON r1.SAMAccountName = r2.SAMAccountName
WHERE r2.SAMAccountName IS NULL
UNION
SELECT r2.displayName, 'report_2' as type
FROM report_1 r1
RIGHT OUTER JOIN report_2 r2 ON r1.SAMAccountName = r2.SAMAccountName
WHERE r1.SAMAccountName IS NULL
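The query above matches on SAMAccountName only. For the edited question (a record counts as matched if any one of the three columns agrees), one possible alternative is a NOT EXISTS anti-join with the three comparisons ORed together. This is a sketch rather than any answerer's method: it runs in SQLite via Python's sqlite3 module instead of SQL Server, the rows are invented, and both tables are assumed to share the column names displayName, SAMAccountName and NetbiosName.

```python
import sqlite3

# Hypothetical rows; report_1 plays the Access table, report_2 the SMS inventory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE report_1 (displayName TEXT, SAMAccountName TEXT, NetbiosName TEXT);
    CREATE TABLE report_2 (displayName TEXT, SAMAccountName TEXT, NetbiosName TEXT);
    INSERT INTO report_1 VALUES
        ('Ann Lee', NULL,   NULL),    -- matches report_2 by name only
        (NULL,      'bob',  'PC-2'),  -- matches by account and computer
        ('Cy Doe',  'cdoe', NULL);    -- no match in report_2
    INSERT INTO report_2 VALUES
        ('Ann Lee', 'alee', 'PC-1'),
        (NULL,      'bob',  'PC-2'),
        (NULL,      NULL,   'PC-9');  -- no match in report_1
""")

# A row is unmatched when no row in the other table agrees on ANY of the
# three columns. Comparisons involving NULL are never true, so missing
# values do not count as matches.
UNMATCHED = """
    SELECT a.displayName, a.SAMAccountName, a.NetbiosName
    FROM {a} a
    WHERE NOT EXISTS (
        SELECT 1 FROM {b} b
        WHERE a.SAMAccountName = b.SAMAccountName
           OR a.NetbiosName    = b.NetbiosName
           OR a.displayName    = b.displayName
    )
"""
only_in_1 = conn.execute(UNMATCHED.format(a="report_1", b="report_2")).fetchall()
only_in_2 = conn.execute(UNMATCHED.format(a="report_2", b="report_1")).fetchall()

print(only_in_1)  # [('Cy Doe', 'cdoe', None)]
print(only_in_2)  # [(None, None, 'PC-9')]
```

Because each side is queried separately, a record cannot show up multiple times the way it can with a join on ORed conditions.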