Column value divided by row count in SQL Server - sql

What happens when each column value in a table is divided by the total row count of the table? What operation is SQL Server actually performing? Can anyone help?
More specifically: what is the difference between sum(column value) / row count and column value / row count? For example:
select cast(officetotal as float) /count(officeid) as value,
sum(officetotal)/ count(officeid) as average from check1
where officeid ='50009' group by officeid,officetotal
What operation is performed in each of the two select expressions?

In your example both expressions will always return the same value, because count(officeid) is always 1: officeid is fixed by the WHERE clause and officetotal is also listed in the GROUP BY clause, so each group holds a single row (unless several rows share the same officetotal). The example therefore does not show any difference, because no real grouping takes place.
When you remove officetotal from the GROUP BY, you will get following message:
Column 'officetotal' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
This means that you cannot use officetotal and SUM(officetotal) in the same select: SUM operates on a set of values, and summing a single value is pointless.
It is simply not possible to write it this way in SQL with GROUP BY. If you are looking for something like the first or last value of a group, you will have to use MIN(officetotal) or MAX(officetotal), or some other approach.
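To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example runs anywhere). The table and column names come from the question; the three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check1 (officeid TEXT, officetotal INTEGER)")
conn.executemany(
    "INSERT INTO check1 VALUES (?, ?)",
    [("50009", 10), ("50009", 20), ("50009", 60)],  # made-up sample rows
)

# Grouping by officeid AND officetotal: every group holds one row here,
# so COUNT(officeid) is 1 and both expressions return the row's own value.
per_row = conn.execute(
    """
    SELECT CAST(officetotal AS REAL) / COUNT(officeid) AS value,
           CAST(SUM(officetotal) AS REAL) / COUNT(officeid) AS average
    FROM check1
    WHERE officeid = '50009'
    GROUP BY officeid, officetotal
    ORDER BY officetotal
    """
).fetchall()

# Grouping by officeid only: one group of three rows,
# so SUM / COUNT is the mean of the group.
per_office = conn.execute(
    """
    SELECT CAST(SUM(officetotal) AS REAL) / COUNT(officeid) AS average
    FROM check1
    WHERE officeid = '50009'
    GROUP BY officeid
    """
).fetchall()

print(per_row)
print(per_office)
```

With these rows, the first query returns each officetotal unchanged in both columns, while the second returns a single average for the office.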

Related

Pivot multiple rows (2 columns) into a single row

I have a table with only 2 columns: the first column is a name identifier and the second column is the value for that identifier (basically, the table holds default values). Below is a screenshot of that table.
What I want is to convert the table from multiple rows into a single row, where the values become columns and the first column supplies the column names. For example, the current values would be transformed into the below.
I read about the PIVOT operator; however, it requires an aggregate function in the pivot clause, and I don't think I can use an aggregate function in this case, since it's just setting row values as column values.
Is this possible with PIVOT or is there another construct I should use to achieve this?
There is already a correct technical answer, showing how to pivot in your case.
Let me explain why this "pivoting" is indeed an aggregation, at a logical level.
You have a group of four rows, and you want to generate a "summary row" for the group. (Imagine, in parallel, that you had several employees identified by employee id, in an additional column; each employee had up to four rows, for the same attributes. Then you are grouping by employee id, each group has up to four rows - fewer if there are missing attributes - and you want to get a "summary row" for each group.)
This is a form of aggregation. But for what aggregate function? You seem to only have one value for AGE, only one value for STATUS, etc.
In fact, you can think of AGE as existing in each of the four rows. When the CODE is 'AGE' then the value is 42, and when the CODE is something else then the value is NULL. You could use SUM(), AVG(), MIN(), MAX() over these four values (one is 42, the rest are NULL); they would all return the same answer, 42, since these aggregate functions ignore NULL.
What if the values are strings, not numbers? Answer: same thing, except you can't use SUM() or AVG(). You still have MIN() and MAX(). In fact you could use other aggregate functions too; they just have to be string aggregates. For example you could use LISTAGG(). Again, you are aggregating a single non-NULL string, the others are NULL, so the result will be just that one non-NULL string.
Before Oracle introduced the PIVOT operator in version 11.1 of the database, programmers were already able to pivot - using a conditional aggregation just like I explained. Something like
select max(case when code = 'AGE' then value end) as AGE,
...
from ...
group by EMPLOYEE_ID -- in the more general case
(in your simple case you don't need to group by anything.)
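As a rough, runnable illustration of the conditional aggregation described above, here is a sketch using Python's sqlite3 module in place of Oracle (an assumption for portability). The table name `defaults` and the sample values other than AGE = 42 are invented; the codes are the ones mentioned in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "defaults" is a hypothetical table name; the values are made-up sample data.
conn.execute("CREATE TABLE defaults (code TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO defaults VALUES (?, ?)",
    [
        ("AGE", "42"),
        ("FIRST_NAME", "John"),
        ("LAST_NAME", "Smith"),
        ("STATUS", "ACTIVE"),
    ],
)

# For each output column, the CASE yields the value on the matching row and
# NULL on every other row; MAX ignores the NULLs and keeps the one real value.
row = conn.execute(
    """
    SELECT MAX(CASE WHEN code = 'AGE'        THEN value END) AS AGE,
           MAX(CASE WHEN code = 'FIRST_NAME' THEN value END) AS FIRST_NAME,
           MAX(CASE WHEN code = 'LAST_NAME'  THEN value END) AS LAST_NAME,
           MAX(CASE WHEN code = 'STATUS'     THEN value END) AS STATUS
    FROM defaults
    """
).fetchone()

print(row)
```

No GROUP BY is needed here because the whole table forms a single group; with an extra EMPLOYEE_ID column you would group by it to get one row per employee.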
You can use pivot clause for that purpose, like below (Your table has only 2 columns and I assume you don't have any duplicate code)
select *
from Yourtable
pivot (
max(value) for code in (
'AGE' as AGE
, 'FIRST_NAME' as FIRST_NAME
, 'LAST_NAME' as LAST_NAME
, 'STATUS' as STATUS
)
)

Get latest data for all people in a table and then filter based on some criteria

I am attempting to return the row of the highest value for timestamp (an integer) for each person (that has multiple entries) in a table. Additionally, I am only interested in rows with the field containing ABCD, but this should be done after filtering to return the latest (max timestamp) entry for each person.
SELECT table."person", max(table."timestamp")
FROM table
WHERE table."type" = 1
HAVING table."field" LIKE '%ABCD%'
GROUP BY table."person"
For some reason, I am not receiving the data I expect. The returned table is nearly twice the size I expect. Is there some step here that I am not getting right?
You can first compute max(timestamp) per person in a subquery, then join it back to the table to pick up the other columns of that latest row and filter on them. The query would look like this:
SELECT t."person", t."timestamp"
FROM table t
JOIN (SELECT "person", max("timestamp") AS max_ts FROM table GROUP BY "person") m
ON m."person" = t."person" AND m.max_ts = t."timestamp"
WHERE t."type" = 1 AND t."field" LIKE '%ABCD%'
Direct answer: as I understand your end goal, just move the HAVING clause to the WHERE section:
SELECT
table."person", MAX(table."timestamp")
FROM table
WHERE
table."type" = 1
AND table."field" LIKE '%ABCD%'
GROUP BY table."person";
This should return no more than 1 row per table."person", with their associated maximum timestamp.
As an aside, I am surprised your query worked at all. Your HAVING clause referenced a column that is neither grouped nor aggregated in your query. From the documentation (and my experience):
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.
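The distinction matters for this question, because filtering before and after picking the latest row can give different answers. Below is a small sketch with Python's sqlite3 module (used here only as a convenient stand-in for the asker's database). The table name `events` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "events" is a hypothetical table name; the sample rows are made up.
conn.execute(
    'CREATE TABLE events (person TEXT, "timestamp" INTEGER, type INTEGER, field TEXT)'
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("ann", 1, 1, "xxABCDxx"),
        ("ann", 2, 1, "other"),  # ann's latest row does not contain ABCD
        ("bob", 1, 1, "other"),
        ("bob", 3, 1, "ABCD"),   # bob's latest row does
    ],
)

# Filter first (WHERE), then aggregate: max timestamp among matching rows only.
filter_then_max = conn.execute(
    """
    SELECT person, MAX("timestamp")
    FROM events
    WHERE type = 1 AND field LIKE '%ABCD%'
    GROUP BY person
    ORDER BY person
    """
).fetchall()

# Aggregate first, join back to fetch the latest row, then filter that row.
max_then_filter = conn.execute(
    """
    SELECT e.person, e."timestamp"
    FROM events AS e
    JOIN (SELECT person, MAX("timestamp") AS max_ts
          FROM events
          WHERE type = 1
          GROUP BY person) AS m
      ON m.person = e.person AND m.max_ts = e."timestamp"
    WHERE e.type = 1 AND e.field LIKE '%ABCD%'
    ORDER BY e.person
    """
).fetchall()

print(filter_then_max)
print(max_then_filter)
```

In this data, filtering first still returns a row for ann (her older ABCD entry), whereas taking the latest row first drops her, because her most recent entry does not match.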

What value is selected into parameter in SQL query without where clause

For example, I have this query
SELECT @param = column from table
What value is pulled into @param?
I tried this and can't figure out which value is being pulled. It is neither the oldest record nor the newest one.
The documentation states:
the variable is assigned the last value that is returned
But without a WHERE clause that uniquely identifies a row nor an ORDER BY clause that specifies a unique value for ordering, the row chosen for the variable assignment is undefined and not deterministic when the table has more than one row.
You could add ORDER BY to the query to return the last ordered row. A more efficient method would be to use SELECT TOP(1)...ORDER BY...DESC. Conversely, SELECT TOP(1)...ORDER BY...ASC will return the first ordered row. Again, the ORDER BY column(s) need to be unique for a deterministic value.
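A minimal sketch of the deterministic version, run through Python's sqlite3 module. SQLite has no TOP or variable assignment, so LIMIT 1 stands in for TOP(1) and a Python variable stands in for the T-SQL variable; the table `items` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "items" is a hypothetical table; id is unique, so the ordering is unambiguous.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(2, "b"), (1, "a"), (3, "c")],  # inserted out of order on purpose
)

# Rough T-SQL shape (not executed here):
#   SELECT TOP (1) @param = col FROM items ORDER BY id ASC;
first_value = conn.execute(
    "SELECT col FROM items ORDER BY id ASC LIMIT 1"
).fetchone()[0]

# Rough T-SQL shape (not executed here):
#   SELECT TOP (1) @param = col FROM items ORDER BY id DESC;
last_value = conn.execute(
    "SELECT col FROM items ORDER BY id DESC LIMIT 1"
).fetchone()[0]

print(first_value, last_value)
```

Because the ordering column is unique, the same value comes back on every run, regardless of insertion order.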
This is the value in the column referenced. It seems like it should have a TOP 1 in it, with a WHERE Clause designed to fetch 1 row only.

SQL - Count(*) not behaving in expected manner [duplicate]

This question already has an answer here:
Access query producing results like ROW_NUMBER() in T-SQL
I have the following code
SELECT C_Record.BunchOfColumns, Count(*) AS Degrees
FROM C_Record
WHERE (((C_Record.[C#])=[Enter Value])) //Parameter Input from User
GROUP BY C_Record.BunchofColumns;
My Degrees column never increments, it shows 1 always no matter how many rows are returned from the query. I am suspecting that I have not implemented my GROUP BY method properly. If I understand it correctly, all columns that are selected and are not part of the aggregate function (COUNT in my case) should be put together in GROUP BY. Any help is much appreciated. Thanks in advance
Edit: What I am trying to achieve is to check how many rows have a particular value for a column, then select all other relevant columns and create an index column. For example, if there are three rows that meet my requirement
Col1 Col2 Degrees
A X 1
B Y 2
C Z 3
and if only 2 rows meet my requirement then
Col1 Col2 Degrees
P X 1
Q Y 2
P.S - my C_Record.BunchofColumns consists of about 10 columns that I did not include for the sake of brevity.
P.P.S - If I try to skip out on any column it gives me the error You Tried to execute a query that does not include the specified expression <<column_name>> as part of an aggregate function
When you use Count() with a GROUP BY, the count returned is the number of rows in each group. So to get a count greater than one, you would need more than one row in your table with exactly the same values in the grouped columns. If you are selecting 10 different columns, it seems likely that no two rows in the database have exactly the same values in all 10.
If you start by selecting and grouping by a single column, you will see counts of more than one.
That is not how GROUP BY works.
GROUP BY completely changes the meaning of your query. Each row of the result is an "aggregate grouping" of the original rows. Each aggregate grouping consists of all the rows with a particular combination of values for their GROUP BY columns. So if you GROUP BY ten columns, each grouping will consist of rows which are identical on all ten columns.
Once these groupings have been formed, you SELECT various aggregate values like count() or sum(), which provide you with information about the group as a whole. count(*) gives you the number of rows in the group, while count(column) gives you the number of rows in which column is non-NULL. You can also select any of the columns which appear in the GROUP BY clause, because those columns are identical across the whole group.
You are getting a count(*) of one because each of your groups only contains a single row. This is probably because you are grouping by ten columns, and there are no two rows which are identical for all ten columns.
If you just want a count of how many rows satisfy some query, and you don't want this aggregation at all, you write it like this:
SELECT count(*)
FROM something
WHERE something
-- no GROUP BY
;
That will form a single aggregate group of your whole query, and count the rows.
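Here is a small sketch of that behaviour using Python's sqlite3 module rather than Access (an assumption so that it is runnable). The table `c_record`, its columns, and the rows are invented; `c_no` stands in for the `C#` column from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data; c_no plays the role of the question's C# column.
conn.execute("CREATE TABLE c_record (col1 TEXT, col2 TEXT, c_no INTEGER)")
conn.executemany(
    "INSERT INTO c_record VALUES (?, ?, ?)",
    [("A", "X", 7), ("B", "Y", 7), ("C", "Z", 7), ("D", "W", 8)],
)

# Grouping by every selected column: each group is a single row, so count is 1.
per_full_row = conn.execute(
    """
    SELECT col1, col2, COUNT(*) AS degrees
    FROM c_record
    WHERE c_no = 7
    GROUP BY col1, col2
    ORDER BY col1
    """
).fetchall()

# No GROUP BY: all matching rows form one group, so count is the row total.
total = conn.execute(
    "SELECT COUNT(*) FROM c_record WHERE c_no = 7"
).fetchone()[0]

# Grouping by a single column with repeated values gives counts above one.
per_c_no = conn.execute(
    "SELECT c_no, COUNT(*) FROM c_record GROUP BY c_no ORDER BY c_no"
).fetchall()

print(per_full_row, total, per_c_no)
```

The first query reproduces the "always 1" symptom from the question; the other two show where counts greater than one come from.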
If you want something else, you will need to further explain what you're trying to do.

In SQL, why does group by make a difference when using having count()

I have a table that stores zone_id. Sometimes a zone id is twice in the database. I wrote a query to show only entries that have two or more entries of the same zone_id in the table.
The following query returns the correct result:
select *, count(zone_id)
from proxies.storage_used
group by zone_id desc
having count(zone_id) > 1;
However, if I group by last_updated or company_id, it returns random values. If I don't add a group by clause, it only displays one value as per the screenshot below. First output shows above query string, second output shows same query string without the 'group by' line and returns only one value:
correction: I'm a new member and thus can't post pictures directly, so I added it on minus: http://min.us/m3yrlkSMu#1o
While my query works, I don't understand why. Can somebody help me understand why group by is altering the actual output, instead of only the grouping of the output? I am using MySQL.
A group by divides the resulting rows into groups and performs the aggregate function on the records in each group. If you do a count(*) without a group by, you will get a single count of all rows in the table: since you didn't specify a group by, there is only one group, containing all records in the table. If you do a count(*) with a group by of zone id, you will get a count of how many records there are for each zone id. If you do a count(*) with a group by of zone id and last updated date, you will get a count of how many rows were updated on each date in each zone.
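The three cases can be tried out with a short sketch. It uses Python's sqlite3 module instead of MySQL (an assumption for portability); the table name follows the question, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table name follows the question; the rows are made-up sample data.
conn.execute(
    "CREATE TABLE storage_used (zone_id INTEGER, company_id TEXT, last_updated TEXT)"
)
conn.executemany(
    "INSERT INTO storage_used VALUES (?, ?, ?)",
    [
        (1, "acme", "2020-01-01"),
        (1, "acme", "2020-01-02"),
        (2, "acme", "2020-01-01"),
        (3, "beta", "2020-01-03"),
        (3, "beta", "2020-01-03"),
    ],
)

# No GROUP BY: the whole table is one group.
all_rows = conn.execute("SELECT COUNT(*) FROM storage_used").fetchone()[0]

# One group per zone_id; HAVING keeps only zones that occur more than once.
dup_zones = conn.execute(
    """
    SELECT zone_id, COUNT(zone_id)
    FROM storage_used
    GROUP BY zone_id
    HAVING COUNT(zone_id) > 1
    ORDER BY zone_id
    """
).fetchall()

# One group per (zone_id, last_updated) pair: the groups are smaller,
# so fewer of them pass the same HAVING test.
dup_zone_dates = conn.execute(
    """
    SELECT zone_id, last_updated, COUNT(*)
    FROM storage_used
    GROUP BY zone_id, last_updated
    HAVING COUNT(*) > 1
    ORDER BY zone_id, last_updated
    """
).fetchall()

print(all_rows, dup_zones, dup_zone_dates)
```

Changing the GROUP BY columns changes which rows are counted together, which is why the output itself changes and not just its ordering.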
Without a group by clause, everything is placed in the same group, so you get a single result. If there is more than one row in your table, then the having will succeed. So, you'll end up counting all the rows in your table...
From what I got, you could create a query with having and without group by only in two situations:
You have a where clause, and you want to test a condition on an aggregation of all rows that satisfy that clause.
Same as above, but for all rows in your table (in practice, it doesn't make sense, though).