Selecting counts of a substring - sql

I know this has to be an easy select but I am having no luck figuring it out. I've got a table has a field of customer grouping codes and I'm trying to get a count of each distinct character 2 through 6 sets. In my past foxpro experience a simple
select distinct substr(custcode,2,5), count (*) from a group by 1
would work, but this doesn't appear to work in sql server queries. The error message indicated it didn't like using the number reference in the group by so I changed it to custcode but the count just returns 1 for each, as I assume the count is after the distinct occurs so there is only one. If I change the count to count(distinct substring(custcode,2,5)) and remove the first distinct substring I just get a count of how many different codes exist. Can someone point out what I'm doing wrong here? Thanks.

The DISTINCT and GROUP BY are redundant, you just want GROUP BY, and you want to GROUP BY the same thing you are selecting:
select substr(custcode,2,5), count (*)
from a
group by substr(custcode,2,5)
In SQL Server you can use column aliases/numbers in the ORDER BY clause, but not in GROUP BY.
Ie. ORDER BY 1 will order by the first selected column, but many consider it bad practice to use column indexes, using aliases/column names is clearer.

Related

What is the difference between these 2 select statements that use Count functions? SQL Oracle

Select Rating_Code as "Rating", Count(Rating_code) as "Total Movies"
From Movie
Group by Rating_code
Order by count(Rating_code) desc;
and
Select Rating_Code as "Rating", Count(*) as "Total Movies"
From Movie
Group by Rating_code
Order by count(*) desc;
They give the same results from what I can see
Also, when would use Count(rating_code) instead of Count(*)?
Using count with column name only counts non-null values.
In your case would be the differnce if you'd have unrated movies, see this fiddle
This starts really making sense if you group by other column(s), not (only) Rating_code...
For the preference of count(1) over count(*) see the comment of Koen Lostrie.
The documentation of COUNT says
If you specify expr, then COUNT returns the number of rows where expr
is not null. ... If you specify the asterisk (*), then this function
returns all rows, including duplicates and nulls.
Your case is a bit more special as you group by a column and count the same column, which is a bit unusual. Your second query is more widely used and well understood, I'd say. People reading your first query will likely need to pause, check the nullability of the column and try to understand what your goal was.
I believe that the column syntax came into being to answer questions like "How many movies are rated", and "how many different rating codes" are there:
SELECT COUNT(rating_code), COUNT(DISTINCT rating_code) FROM movie;

How to get a distinct result ordered by a different column? SQL Postgresql

I'm trying to figure the best way to perform this query in postgresql. I have a messages table and I want to grab the last message a user received from each distinct user. I need to select everything from the row.
I would think this is where I want to group by the senders id "msgfromid", but when I do this it complains that I haven't included everything from my select statement in my group by statement, but I only want to group by the one column, not all of them. So if I try to use Distinct on the one column it forces me to order by the 'distinct on' column first ("msgfromid") which prevents me from being able to get the correct row I need (ordered by the last message sent from the sender "msgsenttime").
My goal is to make this as efficient as possible on my server and database.
This is what I have right now, not working. (This is a sub-query of another query I use to join relevant information afterwards but I figure that is irrelevant)
SELECT DISTINCT ON ("msgfromid") "msgfromid", "msgid", "msgtoid", "msgsenttime", "msgreadtime", "msgcontent", "msgreportstatus", "senderun", "recipientun"
FROM "messages"
WHERE "msgtoid" = ?
ORDER BY "msgsenttime" desc, "msgfromid"
I thought maybe if I pre-ordered them in a sub-query it would work but it just seems to randomly pick one anyway, and this can't be very efficient, even if it were to work, since I'm pulling every message out to begin with, right?:
SELECT DISTINCT ON ("msgfromid") "msgfromid", "msgid", "msgtoid", "msgsenttime", "msgreadtime", "msgcontent", "msgreportstatus", "senderun", "recipientun"
FROM
(
SELECT * FROM "messages"
WHERE "msgtoid" = ?
ORDER BY "msgsenttime" desc
) as "mqo"
Thanks for any help.
Your order by keys are in the wrong order:
SELECT DISTINCT ON ("msgfromid") m.*
FROM "messages" m
WHERE "msgtoid" = ?
ORDER BY "msgfromid", "msgsenttime" desc;
For DISTINCT ON, the keys in parentheses need to be the first keys in the ORDER BY.
If you want the final result ordered in a different way, then you need to use a subquery, with a different ORDER BY on the outer query.

Oracle SQL Developer(4.0.0.12)

First time posting here, hopes it goes well.
I try to make a query with Oracle SQL Developer, where it returns a customer_ID from a table and the time of the payment from another. I'm pretty sure that the problems lies within my logicflow (It was a long time I used SQL, and it was back in school so I'm a bit rusty in it). I wanted to list the IDs as DISTINCT and ORDER BY the dates ASCENDING, so only the first date would show up.
However the returned table contains the same ID's twice or even more in some cases. I even found the same ID and same DATE a few times while I was scrolling through it.
If you would like to know more please ask!
SELECT DISTINCT
FIRM.customer.CUSTOMER_ID,
FIRM.account_recharge.X__INSDATE FELTOLTES
FROM
FIRM.customer
INNER JOIN FIRM.account
ON FIRM.customer.CUSTOMER_ID = FIRM.account.CUSTOMER
INNER JOIN FIRM.account_recharge
ON FIRM.account.ACCOUNT_ID = FIRM.account_recharge.ACCOUNT
WHERE
FIRM.account_recharge.X__INSDATE BETWEEN TO_DATE('14-01-01', 'YY-MM-DD') AND TO_DATE('14-12-31', 'YY-MM-DD')
ORDER
BY FELTOLTES
Your select works like this because a CUSTOMER_ID indeed has more than one X__INSDATE, therefore the records in the result will be distinct. If you need only the first date then don't use DISTINCT and ORDER BY but try to select for MIN(X__INSDATE) and use GROUP BY CUSTOMER_ID.
SELECT DISTINCT FIRM.customer.CUSTOMER_ID,
FIRM.account_recharge.X__INSDATE FELTOLTES
Distinct is applied to both the columns together, which means you will get a distinct ROW for the set of values from the two columns. So, basically the distinct refers to all the columns in the select list.
It is equivalent to a select without distinct but a group by clause.
It means,
select distinct a, b....
is equivalent to,
select a, b...group by a, b
If you want the desired output, then CONCATENATE the columns. The distict will then work on the single concatenated resultset.

How to use distinct all column of table in SQL Server

I want to get unique value form table. But all values should be unique.
So suggest how to get.
SELECT DISTINCT ProCode
, id,SubCat
,SmlImgPath
,RupPrice
,ActualPrice
,ProName
FROM product
WHERE ProCode='FZ10003-EBA';
(one day I may be able to post comments!)
SQLFiddle to show normal, distinct and returning a single row
SELECT DISTINCT works fine but it doesn't work the way you want it to work. From the data you posted in the comment under Klas' answer, it's clear you're expecting a single result when there are data differences somewhere in the columns. For example
/Products/CELEBRITY/KANGANA
is completely DISTINCT from
/Products/SALWAR
What you appear to be looking for cannot work with DISTINCT nor can it work with GROUP BY. Basically the only way two (or three, or ten, or 100) rows will become ONE row is if the data in ALL SEVEN COLUMNS in your SELECT are IDENTICAL.
Take a step back and think about what it is, exactly, you're trying to achieve here.
Are you saying that you want one record only? This is called aggregation. In case there are more records then one (three in your example), you would have to decide for each column, which value to show.
Which SubCat, which SmlImgPath, etc. do you want to see in your result line? The maximum value? The minimum? Or the string 'various'? An example:
SELECT
ProCode
, CASE WHEN MIN(id) <> MAX(id) THEN 'various' ELSE MIN(id) END
, MIN(SubCat)
, MAX(SmlImgPath)
, AVG(RupPrice)
, AVG(ActualPrice)
, MAX(ProName)
FROM product
WHERE ProCode='FZ10003-EBA'
GROUP BY ProCode;
DISTINCT refers to all selected columns, so the answer is that your SELECT already does that.
EDIT:
It seems your problem isn't related to DISTINCT. What you want is to get a single row when your search returns multiple rows.
If you don't care which row you get then you can use:
MS SQL Server syntax:
SELECT TOP 1 ProCode
, id,SubCat
,SmlImgPath
,RupPrice
,ActualPrice
,ProName
FROM product
WHERE ProCode='FZ10003-EBA';
MYSQL syntax:
SELECT ProCode
, id,SubCat
,SmlImgPath
,RupPrice
,ActualPrice
,ProName
FROM product
WHERE ProCode='FZ10003-EBA'
LIMIT 1;
Oracle syntax:
SELECT ProCode
, id,SubCat
,SmlImgPath
,RupPrice
,ActualPrice
,ProName
FROM product
WHERE ProCode='FZ10003-EBA'
AND rownum <= 1;

SQL Nested Where with Sums

I've run into a syntax issue with SQL. What I'm trying to do here is add together all of the amounts paid on each order (paid each) an then only select those that are greater than sum of of paid each for a specific order# (1008). I've been trying to move around lots of different things here and I'm not having any luck.
This is what I have right now, though I've had many different things. Trying to use this simply returns an SQL statement not ended properly error. Any help you guys could give would be greatly appreciated. Do I have to use DISTINCT anywhere here?
SELECT ORDER#,
TO_CHAR(SUM(PAIDEACH), '$999.99') AS "Amount > Order 1008"
FROM ORDERITEMS
GROUP BY ORDER#
WHERE TO_CHAR > (SUM (PAIDEACH))
WHERE ORDER# = 1008;
Some versions of SQL regard the hash character (#) as the beginning of a comment. Others use double hyphen (--) and some use both. So, my first thought is that your ORDER# field is named incorrectly (though I can't imagine the engine would let you create a field with that name).
You have two WHERE keywords, which isn't allowed. If you have multiple WHERE conditions, you must link them together using boolean logic, with AND and OR keywords.
You have your WHERE condition after GROUP BY which should be reversed. Specify WHERE conditions before GROUP BY.
One of your WHERE conditions makes no sense. TO_CHAR > (SUM(paideach)): TO_CHAR() is a function which as far as I know is an Oracle function that converts numeric values to strings according to a specified format. The equivalent in SQL Server is CAST or CONVERT.
I'm guessing that you are trying to write a query that finds orders with amounts exceeding a particular value, but it's not very clear because one of your WHERE conditions specifies that the order number should be 1008, which would presumably only return one record.
The query should probably look more like this:
SELECT order,
SUM(paideach) AS amount
FROM orderitems
GROUP BY order
HAVING amount > 999.99;
This would select records from the orderitems table where the sum of paideach exceeds 999.99.
I'm not sure how order 1008 fits into things, so you will have to elaborate on that.
Other have commented on some of the things wrong with your query. I'll try to give more explicit hints about what I think you need to do to get the result I think you're looking for.
The problem seems to break into distinct sections, first finding the total for each order which you're close to and I think probably started from:
SELECT ORDER#, SUM(PAIDEACH) AS AMOUNT
FROM ORDERITEMS
GROUP BY ORDER#;
... finding the total for a specific order:
SELECT SUM(PAIDEACH)
FROM ORDERITEMS
WHERE ORDER# = 1008;
... and combining them, which is where you're stuck. The simplest way, and hopefully something you've recently been taught, is to use the HAVING clause, which comes after the GROUP BY and acts as a kind of filter that can be applied to the aggregated columns (which you can't do in the WHERE clause). If you had a fixed amount you could do this:
SELECT ORDER#, SUM(PAIDEACH) AS AMOUNT
FROM ORDERITEMS
GROUP BY ORDER#
HAVING SUM(PAIDEACH) > 5;
(Note that as #Bridge indicated you can't use the column alias, AMOUNT, in the having clause, you have to repeat the aggregation function SUM). But you don't have a fixed value, you want to use the actual total for order 1008, so you need to replace that fixed value with another query. I'll let you take that last step...
I'm not familiar with Oracle, and since it's homework I won't give you the answers, just a few ideas of what I think is wrong.
select statement should only have one where statement - can have more than one condition of course, just separated by logical operators (anything that evaluates to true will be included). E.g. : WHERE (column1 > column2) AND (column3 = 100)
Group by statements should after WHERE clauses
You can't refer to columns you've aliased in the select in the where clause of the same statement by their aliased name. For example this won't work:
SELECT column1 as hello
FROM table1
WHERE hello = 1
If there's a group by, the columns you're selecting should be the same as in that statement (or aggregates of those). This page does a better explanation of this than I do.