Using SELECT DISTINCT or alternative with a 3 table query

Here I have an SQL statement that retrieves all of the right data, but I need the results to be DISTINCT.
For example, WEEK_NUMBER comes back as week_number = 1,1,1,1,1,1 etc.
I want it to condense into a single 1. It is a 3 table query and I'm not sure how to include SELECT DISTINCT or an alternative. Any ideas?
SELECT WEEKLY_TIMECARD.*,DAILY_CALCULATIONS.*,EMPLOYEE_PROFILES.EMPLOYEE_NUMBER
FROM WEEKLY_TIMECARD, DAILY_CALCULATIONS, EMPLOYEE_PROFILES
WHERE EMPLOYEE_PROFILES.EMPLOYEE_NUMBER = WEEKLY_TIMECARD.EMPLOYEE_NUMBER
AND EMPLOYEE_PROFILES.EMPLOYEE_NUMBER = DAILY_CALCULATIONS.EMPLOYEE_NUMBER
AND WEEKLY_TIMECARD.WEEK_NUMBER = DAILY_CALCULATIONS.WEEK_NUMBER

Try this:
SELECT DISTINCT WEEKLY_TIMECARD.WEEK_NUMBER
FROM
WEEKLY_TIMECARD,
DAILY_CALCULATIONS,
EMPLOYEE_PROFILES
WHERE EMPLOYEE_PROFILES.EMPLOYEE_NUMBER = WEEKLY_TIMECARD.EMPLOYEE_NUMBER
AND EMPLOYEE_PROFILES.EMPLOYEE_NUMBER = DAILY_CALCULATIONS.EMPLOYEE_NUMBER
AND WEEKLY_TIMECARD.WEEK_NUMBER = DAILY_CALCULATIONS.WEEK_NUMBER

You should add GROUP BY WEEK_NUMBER.

Since you are showing all the fields from tables WEEKLY_TIMECARD and DAILY_CALCULATIONS, if you use SELECT DISTINCT... you may end up with exactly the same situation you are encountering now (many rows with the same value), because DISTINCT only removes rows that are identical in every selected column.
Besides the DISTINCT and GROUP BY usage, you need to consider the following:
Do you really need all the fields? If you do, then maybe you do need the duplicate values. If you don't, just include the fields you need.
Do you need to aggregate data, or do you only need to deduplicate the values? If you need to aggregate data, you must use GROUP BY and the appropriate aggregate functions. If you don't need to aggregate data, I would advise you not to use GROUP BY, because it can make your query run very slowly (this may depend on which RDBMS you are using).
Whichever solution you choose, be sure your tables are properly indexed.
Besides that, I would use INNER JOIN to explicitly define the relations between your data (rather than implicitly defining them using WHERE conditions)... but that's my personal preference.
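To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The column layouts, the HOURS column and the sample rows are assumptions made up for illustration, since the question does not show the real schema. It compares the raw join, SELECT DISTINCT on the one column of interest, and a GROUP BY with an aggregate, all written with explicit INNER JOINs.

```python
import sqlite3

# Hypothetical schema and data; the real tables surely have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE_PROFILES (EMPLOYEE_NUMBER INTEGER PRIMARY KEY);
CREATE TABLE WEEKLY_TIMECARD (EMPLOYEE_NUMBER INTEGER, WEEK_NUMBER INTEGER);
CREATE TABLE DAILY_CALCULATIONS (EMPLOYEE_NUMBER INTEGER, WEEK_NUMBER INTEGER, HOURS REAL);
INSERT INTO EMPLOYEE_PROFILES VALUES (1), (2);
INSERT INTO WEEKLY_TIMECARD VALUES (1, 1), (2, 1), (1, 2);
INSERT INTO DAILY_CALCULATIONS VALUES (1, 1, 8), (1, 1, 7.5), (2, 1, 8), (1, 2, 6);
""")

# The same three-table relation as in the question, written with explicit joins.
joins = """
FROM EMPLOYEE_PROFILES AS e
INNER JOIN WEEKLY_TIMECARD AS w
        ON w.EMPLOYEE_NUMBER = e.EMPLOYEE_NUMBER
INNER JOIN DAILY_CALCULATIONS AS d
        ON d.EMPLOYEE_NUMBER = e.EMPLOYEE_NUMBER
       AND d.WEEK_NUMBER = w.WEEK_NUMBER
"""

# Without DISTINCT: one row per matching daily record, so weeks repeat.
all_rows = conn.execute(
    "SELECT w.WEEK_NUMBER" + joins + "ORDER BY w.WEEK_NUMBER"
).fetchall()

# DISTINCT on only the column of interest collapses the repeats.
distinct_weeks = conn.execute(
    "SELECT DISTINCT w.WEEK_NUMBER" + joins + "ORDER BY w.WEEK_NUMBER"
).fetchall()

# GROUP BY is the tool when an aggregate per week is needed.
hours_per_week = conn.execute(
    "SELECT w.WEEK_NUMBER, SUM(d.HOURS)" + joins
    + "GROUP BY w.WEEK_NUMBER ORDER BY w.WEEK_NUMBER"
).fetchall()

print(all_rows)
print(distinct_weeks)
print(hours_per_week)
```

With this sample data the plain join repeats week 1 three times, while the DISTINCT query returns each week once. Selecting every column from the two detail tables would bring the repeats back, because the rows then differ in other columns.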

Related

Trying to understand how WHERE IN in a subquery works in Teradata SQL?

I'm trying to build a sub-query with a list in the WHERE clause. I have tried several variations and I think the problem is with the way I'm structuring the WHERE IN. Help is greatly appreciated!
SELECT a.ACCT_SK,
a.BTN,
a.PRODUCT_SET,
MAX(b.ORD_CREATD_DT)
FROM MM.MEC_ACCT_ATTR a, CDI_CRM.ORD_MSTR b
WHERE a.ACCT_SK=b.ACCT_SK AND a.BTN=b.BTN
(SELECT b.ACCT_SK, b.ORD_CREATD_DT
FROM CDI_CRM.ORD_MSTR b
WHERE b.ACCT_SK IN ('44347714',
'44023302',
'43604964'));
SELECT Failed. 3706: (-3706)Syntax error: expected something between '(' and the 'SELECT' keyword
The desired output is a table with Product set for 50 ACCT_SKs with the most recent order date matched on ACCT_SK and BTN.

Sample data and desired results would really help. Your query doesn't make much sense, but I suspect you want:
SELECT a.ACCT_SK, a.BTN, a.PRODUCT_SET,
MAX(o.ORD_CREATD_DT)
FROM MM.MEC_ACCT_ATTR a JOIN
CDI_CRM.ORD_MSTR o
ON a.ACCT_SK = o.ACCT_SK AND a.BTN = o.BTN
WHERE a.ACCT_SK IN ('44347714', '44023302', '43604964')
GROUP BY a.ACCT_SK, a.BTN, a.PRODUCT_SET;
This returns the columns you want for the three specified accounts.
Notes:
Always use proper, explicit, standard JOIN syntax. Never use commas in the FROM clause.
Your subquery simply makes no sense. It is not connected to anything else in the query.
You are using an aggregation function (MAX()) so your query is an aggregation query and needs a GROUP BY.
Use meaningful table aliases. a makes sense for an accounts table, but b does not make sense for an orders table.
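For anyone who wants to try the JOIN / WHERE IN / GROUP BY pattern without a Teradata system, here is a rough sketch using Python's sqlite3 module. The schema prefixes are dropped, and the BTN values, product sets and dates are invented sample data, so treat it as an illustration of the query shape rather than of the real tables.

```python
import sqlite3

# Hypothetical stand-ins for MM.MEC_ACCT_ATTR and CDI_CRM.ORD_MSTR.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MEC_ACCT_ATTR (ACCT_SK TEXT, BTN TEXT, PRODUCT_SET TEXT);
CREATE TABLE ORD_MSTR (ACCT_SK TEXT, BTN TEXT, ORD_CREATD_DT TEXT);
INSERT INTO MEC_ACCT_ATTR VALUES
  ('44347714', '555-0001', 'VOICE'),
  ('44023302', '555-0002', 'DATA'),
  ('99999999', '555-0003', 'VIDEO');
INSERT INTO ORD_MSTR VALUES
  ('44347714', '555-0001', '2019-01-15'),
  ('44347714', '555-0001', '2019-06-30'),
  ('44023302', '555-0002', '2018-11-02'),
  ('99999999', '555-0003', '2019-03-03');
""")

# The list of accounts goes into IN (...) through bound parameters.
wanted = ["44347714", "44023302", "43604964"]
placeholders = ", ".join("?" for _ in wanted)

query = f"""
SELECT a.ACCT_SK, a.BTN, a.PRODUCT_SET, MAX(o.ORD_CREATD_DT)
FROM MEC_ACCT_ATTR AS a
JOIN ORD_MSTR AS o
  ON a.ACCT_SK = o.ACCT_SK AND a.BTN = o.BTN
WHERE a.ACCT_SK IN ({placeholders})
GROUP BY a.ACCT_SK, a.BTN, a.PRODUCT_SET
ORDER BY a.ACCT_SK
"""

latest_orders = conn.execute(query, wanted).fetchall()
for row in latest_orders:
    print(row)
```

Account 43604964 has no rows in the sample data, so it simply does not appear in the result. With an inner join, an account in the IN list that has no orders is dropped in the same way.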

SQL cartesian product turns

I am trying to understand the cartesian product with the SELECT command,
but when I try different combinations I get different results. For example, when I type
select X.A,Y.A,Z.A
From X,Y,Z
I get X x Y x Z (every combination of rows),
but if I try
select X.A,Y.A,Z.A
From X,Y,Z
where (conditions)
then, depending on how I write the conditions, I get yet more different combinations.

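Depending on the database you are using, you may want to experiment with CROSS JOIN. A WHERE clause filters the rows of the cartesian product, so different conditions keep different combinations.
Also, the results vary depending on the data. To get a consistent row order, use an ORDER BY clause.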
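A small experiment makes this easier to see. The sketch below uses Python's sqlite3 module with three tiny tables named X, Y and Z as in the question; the values in them are made up. It builds the full cartesian product and then shows how WHERE conditions reduce it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE X (A INTEGER);
CREATE TABLE Y (A INTEGER);
CREATE TABLE Z (A INTEGER);
INSERT INTO X VALUES (1), (2);
INSERT INTO Y VALUES (1), (2), (3);
INSERT INTO Z VALUES (2), (3);
""")

base = "SELECT X.A, Y.A, Z.A FROM X CROSS JOIN Y CROSS JOIN Z "
order = " ORDER BY X.A, Y.A, Z.A"

# No condition: every row of X paired with every row of Y and of Z.
product = conn.execute(base + order).fetchall()

# One condition keeps only the combinations where X and Y agree.
x_equals_y = conn.execute(base + "WHERE X.A = Y.A" + order).fetchall()

# Two conditions keep only the combinations where all three agree.
all_equal = conn.execute(
    base + "WHERE X.A = Y.A AND Y.A = Z.A" + order
).fetchall()

print(len(product))   # 2 * 3 * 2 rows
print(x_equals_y)
print(all_equal)
```

The comma syntax in the question (From X,Y,Z) produces the same product as the CROSS JOINs here. The ORDER BY is what keeps the row order stable from run to run.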

GROUP BY clause order omitting results in Oracle 11g query

I have a simple query that appears to give the desired result:
select op.opr, op.last, op.dept, count(*) as counter
from DWRVWR.BCA_M_OPRIDS1 op
where op.opr = '21B'
group by op.opr, op.last ,op.dept;
My original query returns no results. The only difference was the order of the group by clause:
select op.opr, op.last, op.dept, count(*) as counter
from DWRVWR.BCA_M_OPRIDS1 op
where op.opr = '21B'
group by op.opr, op.dept, op.last;
In actuality, this was part of a much larger, more complicated query, but I narrowed down the problem to this. All documentation I was able to find states that the order of the group by clause doesn't matter. I really want to understand why I am getting different results, as I would have to review all of my queries that use the group by clause, if there is a potential issue. I'm using SQL Developer, if it matters.
Also, if the order of the group by clause did not matter and every field not used in an aggregate function is required to be listed in the group by clause, wouldn't the group by clause simply be redundant and seemingly unnecessary?

All documentation I was able to find states that the order of the group by clause doesn't matter
That's not entirely true; it depends.
The grouping itself is not affected by the order of columns in the GROUP BY clause: it produces the same set of groups regardless of the order. Perhaps that is what the documentation you found was referring to. However, the order does matter for other aspects.
Before Oracle 10g, GROUP BY implicitly performed an ORDER BY, so the order of the columns in the GROUP BY clause did matter: the groups were the same, only ordered differently. Starting with Oracle 10g, if you want the result set in a specific order, you must add an ORDER BY clause. Other databases have a similar history.
Another case where the order can matter is if you have indexes on the table. Whether a multi-column index is used may depend on how its columns line up with the columns specified in the GROUP BY or ORDER BY clauses. So if you change the order, your query may stop using the index and perform differently. The result is the same, but the performance is not.
Also the order of the columns in the GROUP BY clause becomes important if you use some features like ROLLUP. This time the results themselves will not be the same.
It is recommended to follow the best practice of listing the fields in the GROUP BY clause in the order of the hierarchy. This makes the query more readable and more easily maintainable.
Also, if the order of the group by clause did not matter and every field not used in an aggregate function is required to be listed in the group by clause, wouldn't the group by clause simply be redundant and seemingly unnecessary?
No, the GROUP BY clause is mandatory in standard SQL and in Oracle. There is only one exception in which you can omit the GROUP BY clause: when you want the aggregate functions to apply to the entire result set. In this case, your SELECT list must consist only of aggregate expressions.
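As a quick sanity check of the first and last points, here is a minimal sketch using Python's sqlite3 module. It runs on SQLite, not Oracle, and the table name, column names and rows are invented. It shows that swapping the column order in GROUP BY yields the same set of groups, and that an aggregate without GROUP BY applies to the whole result set.

```python
import sqlite3

# Hypothetical stand-in for the operator table in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE oprids (opr TEXT, last_name TEXT, dept TEXT);
INSERT INTO oprids VALUES
  ('21B', 'Smith', 'HR'),
  ('21B', 'Smith', 'HR'),
  ('21B', 'Jones', 'IT'),
  ('30C', 'Lee',   'HR');
""")

select = "SELECT opr, last_name, dept, COUNT(*) FROM oprids "

by_last_dept = conn.execute(select + "GROUP BY opr, last_name, dept").fetchall()
by_dept_last = conn.execute(select + "GROUP BY opr, dept, last_name").fetchall()

# The row order is not guaranteed without ORDER BY, so compare as sorted lists.
same_groups = sorted(by_last_dept) == sorted(by_dept_last)

# No GROUP BY: the aggregate covers the entire result set and returns one row.
total = conn.execute("SELECT COUNT(*) FROM oprids").fetchone()[0]

print(same_groups)
print(sorted(by_last_dept))
print(total)
```

If the two column orders ever return different groups on a real system, as in the question, the cause is something other than standard GROUP BY semantics.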

Creating view ,SQL Query performance

I am trying to create a view, but a select statement against this view takes more than 15 seconds. How can I make it faster? My query for the view is below.
create view Summary as
select distinct A.Process_date,A.SN,A.New,A.Processing,
COUNT(case when B.type='Sold' and A.status='Processing' then 1 end) as Sold,
COUNT(case when B.type='Repaired' and A.status='Processing' then 1 end) as Repaired,
COUNT(case when B.type='Returned' and A.status='Processing' then 1 end) as Returned
from
(select distinct M.Process_date,M.SN,max(P.enter_date) as enter_date,M.status,
COUNT(case when M.status='New' then 1 end) as New,
COUNT(case when M.status='Processing' and P.cn is null then 1 end) as Processing
from DB1.dbo.Item_details M
left outer join DB2.dbo.track_data P on M.SN=P.SN
group by M.Process_date,M.SN,M.status) A
left outer join DB2.dbo.track_data B on A.SN=B.SN
where A.enter_date=B.enter_date or A.enter_date is null
group by A.Process_date,A.New,A.Processing,A.SN
After creating this view, my select query is
select distinct process_date,sum(New),sum(Processing),sum(sold),sum(repaired),sum(returned) from Summary where month(process_date)=03 and year(process_date)=2011
Please suggest what changes I should make for the query to perform faster.
Thank you
ARB

It is hard to give advice without seeing the actual data and the structure of the tables. I would rewrite the query keeping these principles in mind:
Use inner join instead of outer join if possible.
Get rid of the CASE expressions inside the COUNT function. Build the query so that the conditions go in the WHERE section, not in COUNT.
Try not to use aggregated values in GROUP BY. Currently you use the aggregated values New and Processing for grouping. Use GROUP BY on existing table values if possible.
If the query gets too complicated, break it into smaller queries and combine the results in the final query. Writing a stored procedure may help in this case.
I hope this helps.

For tuning a database query, I would add a few items to what @Davyd has already listed:
Look at the tables and the indexes on those tables. Adding the right indexes and avoiding the wrong ones speeds up the query.
Is there anything in the WHERE condition that is not part of any index? Sometimes we put an index on a column but then apply a cast or convert to that column in the query, so the underlying index is not effective. You may consider creating an index on the cast/convert expression.
Look at normal form conformity or over-normalisation.
Good luck.
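The point about functions on an indexed column applies to the month(process_date) and year(process_date) filter in the question. Here is a hedged sketch in Python's sqlite3 module; SQLite stands in for the real database, and the table, index name and rows are made up. It compares a filter that wraps the column in a function with an equivalent date-range filter and asks the engine for its query plan.

```python
import sqlite3

# Hypothetical, simplified version of the item table with an index on the date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_details (process_date TEXT, sn TEXT);
CREATE INDEX idx_item_process_date ON item_details (process_date);
INSERT INTO item_details VALUES
  ('2011-02-28', 'A1'),
  ('2011-03-01', 'A2'),
  ('2011-03-31', 'A3'),
  ('2011-04-01', 'A4');
""")

# Function applied to the column: the plain index on process_date cannot help.
by_function = """
SELECT sn FROM item_details
WHERE strftime('%Y-%m', process_date) = '2011-03'
ORDER BY sn
"""

# Same rows expressed as a range on the bare column: the index is usable.
by_range = """
SELECT sn FROM item_details
WHERE process_date >= '2011-03-01' AND process_date < '2011-04-01'
ORDER BY sn
"""

def plan(sql):
    """Return SQLite's query plan for sql as one string (detail is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

function_rows = conn.execute(by_function).fetchall()
range_rows = conn.execute(by_range).fetchall()

print(function_rows == range_rows)
print(plan(by_function))
print(plan(by_range))
```

Both filters return the same rows, but only the range version mentions the index in its plan. SQL Server has its own tools for viewing execution plans, and the same rewrite idea (compare the bare column against a date range) is worth trying there.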

If you are using PostgreSQL, I suggest you use a tool like "http://explain.depesz.com/" to see more clearly which part of your query is slow. Depending on what you get, you could either optimize your indexes or rewrite part of your query. If you are using another database, I'm sure a similar tool exists.
If none of these ideas help, the final solution would be to create a "materialized query". There is plenty of information on the web about this.
Good luck.

Only one expression can be specified in the select list when the subquery is not introduced with EXISTS

Here is my query. I got that error. Please help me. Thanks.
ASC
ALTER PROCEDURE [dbo].[sp_CostAllocation_Test]
@CompanyCode VARCHAR(3),
@EmpCode VARCHAR(600),
@PayCode VARCHAR(600)
AS
SELECT
CTPY33PAYRP.CTPAPECOD As EmployeeCode,
CTPY33PAYRP.CTPAPPCOD As paycode,
(select PY11RPTFPD.rpcol as columntotal from PY11RPTFPD where rppcod =CTPAPPCOD) ,
(SELECT COCODE,CTPAPECOD,CTPAPPCOD
FROM CTPY33PAYRP
WHERE CTPY33PAYRP.COCODE = @CompanyCode
AND CTPY33PAYRP.CTPAPECOD =@EmpCode
AND CTPY33PAYRP.COCODE = @CompanyCode
AND CTPY33PAYRP.CTPAPPCOD=@PayCode) As PayCode_Check,
PY11RPTFPD.RPPCOD As PayType,
(SELECT RPCOL,RPPCOD
FROM PY11RPTFPD,CTPY33PAYRP
WHERE CTPY33PAYRP.CTPAPPCOD=PY11RPTFPD.RPPCOD)
from CTPY33PAYRP,PY11RPTFPD
ORDER BY CTPAPECOD

I have to say your naming conventions aren't exactly transparent!
Without knowing the schemas for your tables it's a bit hard to say for sure, but I would guess that you are having trouble with this sub-query:
(SELECT COCODE,CTPAPECOD,CTPAPPCOD FROM CTPY33PAYRP
WHERE CTPY33PAYRP.COCODE = @CompanyCode AND CTPY33PAYRP.CTPAPECOD =@EmpCode
AND CTPY33PAYRP.COCODE = @CompanyCode AND CTPY33PAYRP.CTPAPPCOD=@PayCode) As PayCode_Check,
and with this sub-query:
(SELECT RPCOL,RPPCOD
FROM PY11RPTFPD,CTPY33PAYRP
WHERE CTPY33PAYRP.CTPAPPCOD=PY11RPTFPD.RPPCOD)
You are selecting multiple columns from one table, in the first case, and from a join of two tables in the second case. There is nothing in either sub-query which restricts the results to a single row. If you are going to include a sub-query in your select list the sub-query has to return a single row per row in your main query. Also, I've never seen a sub-query with multiple columns.
Since I have no clue from your table and column names what it is the query is meant to do, I can't give you much definitive advice about how to fix the syntax errors. I would say keep your sub-selects to one column each. This is what the error message is telling you. Also you should either correlate the subqueries with the main query so that only one value is possible or use an aggregate function in the sub-queries to ensure that only a single value is possible for each record in the main query.
I will also say as an aside that you should learn ANSI join syntax. It seems tricky at first, but it is your friend once you get used to it.
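To illustrate the one-column, one-row rule, here is a rough sketch using Python's sqlite3 module. The tables pay_rows and pay_types and their columns are invented simplifications, not the real schema, and SQLite words its error differently from SQL Server. A sub-query with two columns in the select list is rejected, while a correlated sub-query with a single aggregated column works.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the two tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pay_rows (emp_code TEXT, pay_code TEXT);
CREATE TABLE pay_types (pay_code TEXT, col_total INTEGER);
INSERT INTO pay_rows VALUES ('E1', 'P1'), ('E2', 'P2');
INSERT INTO pay_types VALUES ('P1', 10), ('P1', 15), ('P2', 7);
""")

# Two columns inside a select-list sub-query: the statement is rejected.
bad_query = """
SELECT r.emp_code,
       (SELECT t.col_total, t.pay_code
        FROM pay_types AS t
        WHERE t.pay_code = r.pay_code)
FROM pay_rows AS r
"""
try:
    conn.execute(bad_query).fetchall()
    multi_column_error = None
except sqlite3.OperationalError as exc:
    multi_column_error = str(exc)

# One column, correlated with the outer row, and aggregated so that
# at most one value comes back per outer row.
good_query = """
SELECT r.emp_code,
       r.pay_code,
       (SELECT MAX(t.col_total)
        FROM pay_types AS t
        WHERE t.pay_code = r.pay_code) AS col_total
FROM pay_rows AS r
ORDER BY r.emp_code
"""
fixed_rows = conn.execute(good_query).fetchall()

print(multi_column_error)
print(fixed_rows)
```

Whether MAX is the right aggregate depends on what the column total is supposed to mean. It is used here only to guarantee a single value per outer row.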
