I have a table with only two columns: the first column is a name identifier and the second column is the value for that identifier (basically, the table holds default values). Below is a screenshot of that table.
What I want is to convert the table from multiple rows into a single row, where the values become columns and the first column supplies the column names. For example, the current values would be transformed into the result below.
I read about the PIVOT operator; however, it requires an aggregate function in the pivot clause, and I don't think I can use an aggregate function in this case, since it's just setting row values as column values.
Is this possible with PIVOT or is there another construct I should use to achieve this?
There is already a correct technical answer, showing how to pivot in your case.
Let me explain why this "pivoting" is indeed an aggregation, at a logical level.
You have a group of four rows, and you want to generate a "summary row" for the group. (Imagine, in parallel, that you had several employees identified by employee id, in an additional column; each employee had up to four rows, for the same attributes. Then you are grouping by employee id, each group has up to four rows - fewer if there are missing attributes - and you want to get a "summary row" for each group.)
This is a form of aggregation. But for what aggregate function? You seem to only have one value for AGE, only one value for STATUS, etc.
In fact, you can think of AGE as existing in each of the four rows. When the CODE is 'AGE' then the value is 42, and when the CODE is something else then the value is NULL. You could use SUM(), AVG(), MIN(), MAX() over these four values (one is 42, the rest are NULL); they would all return the same answer, 42 - since these aggregate functions ignore NULL.
What if the values are strings, not numbers? Answer: same thing - except you can't use SUM() or AVG(). You still have MIN() and MAX(). In fact you could use other aggregate functions too - they just have to be string aggregates. For example, you could use LISTAGG(). Again, you are aggregating a single non-NULL string, the others are NULL, so the result will be just that one non-NULL string.
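As a rough illustration of the point about NULLs, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle. The table name vals, its columns, and the sample string 'ACTIVE' are made up for the example, and GROUP_CONCAT plays the role that LISTAGG would play in Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One group of four rows: a single non-NULL entry and three NULLs,
# mirroring "the value is 42 in one row and NULL in the other three".
conn.execute("CREATE TABLE vals (num INTEGER, txt TEXT)")
conn.executemany(
    "INSERT INTO vals VALUES (?, ?)",
    [(42, "ACTIVE"), (None, None), (None, None), (None, None)],
)

# Numeric aggregates skip the NULLs and see only the value 42.
numeric = conn.execute(
    "SELECT SUM(num), AVG(num), MIN(num), MAX(num) FROM vals"
).fetchone()

# String aggregates behave the same way on the single non-NULL string.
text = conn.execute(
    "SELECT MIN(txt), MAX(txt), GROUP_CONCAT(txt) FROM vals"
).fetchone()

# COUNT(column) also skips NULLs; COUNT(*) counts rows, so it does not.
counts = conn.execute("SELECT COUNT(num), COUNT(*) FROM vals").fetchone()

print(numeric)
print(text)
print(counts)
```

Note that COUNT(*) is the exception: it counts rows rather than values, so it would not be a suitable aggregate for pivoting.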
Before Oracle introduced the PIVOT operator in version 11.1 of the database, programmers were already able to pivot - using a conditional aggregation just like I explained. Something like
select max(case when code = 'AGE' then value end) as AGE,
...
from ...
group by EMPLOYEE_ID -- in the more general case
(in your simple case you don't need to group by anything.)
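To make the conditional-aggregation idea concrete, here is a small runnable sketch. It uses Python's sqlite3 module rather than Oracle, and the table names (defaults, emp_attrs) and all sample values other than the age 42 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simple case: one two-column table, no grouping needed.
conn.execute("CREATE TABLE defaults (code TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO defaults VALUES (?, ?)",
    [("AGE", "42"), ("FIRST_NAME", "John"),
     ("LAST_NAME", "Doe"), ("STATUS", "ACTIVE")],
)
single_row = conn.execute("""
    SELECT MAX(CASE WHEN code = 'AGE'        THEN value END) AS age,
           MAX(CASE WHEN code = 'FIRST_NAME' THEN value END) AS first_name,
           MAX(CASE WHEN code = 'LAST_NAME'  THEN value END) AS last_name,
           MAX(CASE WHEN code = 'STATUS'     THEN value END) AS status
    FROM defaults
""").fetchone()

# General case: one summary row per employee; a missing attribute stays NULL.
conn.execute("CREATE TABLE emp_attrs (employee_id INTEGER, code TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO emp_attrs VALUES (?, ?, ?)",
    [(1, "AGE", "42"), (1, "STATUS", "ACTIVE"), (2, "AGE", "35")],
)
per_employee = conn.execute("""
    SELECT employee_id,
           MAX(CASE WHEN code = 'AGE'    THEN value END) AS age,
           MAX(CASE WHEN code = 'STATUS' THEN value END) AS status
    FROM emp_attrs
    GROUP BY employee_id
    ORDER BY employee_id
""").fetchall()

print(single_row)
print(per_employee)
```

Each CASE expression yields the value on the matching row and NULL elsewhere, so MAX simply picks out the one non-NULL value per output column.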
You can use the PIVOT clause for this, as shown below (your table has only two columns, and I assume it contains no duplicate codes):
select *
from Yourtable
pivot (
    max(value) for code in (
        'AGE' as AGE
      , 'FIRST_NAME' as FIRST_NAME
      , 'LAST_NAME' as LAST_NAME
      , 'STATUS' as STATUS
    )
)
Related
I am struggling to understand what the output of SELECT is meant to be in SQL (I am using MS ACCESS), and what sort of criteria this output needs to specify, if any. As a result, I don't understand why some queries work and others don't. So I know it retrieves data from a table, does calculations with it and displays it. But I don't understand the "inner" working of SELECT function. For instance, what is the name of data structure / entity it displays? Is it a "new" table?
And for example, suppose I have a table called "table_name", with 5 columns. One of the columns called "column_3", and there are 20 records.
SELECT column_3, COUNT(*) AS Count
FROM table_name;
Why does this query fail to run? By logic, I would expect it to display two columns: first column will be "column_3", containing 20 rows with relevant data, and second column will be "Count", containing just one non-empty row (displaying 20), and other 19 rows will be empty (or NULL maybe)?
Is it because SELECT is meant to produce equal number of rows for each column?
Your questions involve a basic understanding of SQL. SELECT statements do not create tables, but instead return virtual result sets. Nothing is persisted unless you change it to an INSERT.
In your example question, you will need to "tell" the SQL engine what you want a count "of". Because you added column_3, you need to write:
SELECT column_3, COUNT(*) AS Count
FROM table_name
GROUP BY column_3
If you wanted a count of all the rows, simply:
SELECT COUNT(*) FROM table_name
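Here is a minimal sketch of the two working forms, run through Python's sqlite3 module instead of Access. The six sample rows are invented. One caveat: SQLite is more permissive than Access and would accept the original ungrouped query (returning an arbitrary column_3 value), so only the two well-defined forms are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_3 TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?)",
    [("a",), ("a",), ("b",), ("c",), ("c",), ("c",)],
)

# One result row per distinct column_3 value, with the size of each group.
per_value = conn.execute("""
    SELECT column_3, COUNT(*) AS cnt
    FROM table_name
    GROUP BY column_3
    ORDER BY column_3
""").fetchall()

# No GROUP BY: the whole table is one group, so exactly one row comes back.
total = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]

print(per_value)
print(total)
```

Either way, every row of the result set has a value for every selected column, which is why a 20-row column cannot sit next to a 1-row count.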
I am attempting to return the row of the highest value for timestamp (an integer) for each person (that has multiple entries) in a table. Additionally, I am only interested in rows with the field containing ABCD, but this should be done after filtering to return the latest (max timestamp) entry for each person.
SELECT table."person", max(table."timestamp")
FROM table
WHERE table."type" = 1
HAVING table."field" LIKE '%ABCD%'
GROUP BY table."person"
For some reason, I am not receiving the data I expect. The returned table is nearly twice the size of expectation. Is there some step here that I am not getting correct?
You can first compute max(timestamp) per person in a subquery, join it back to the table to recover the full latest row, and then filter in the outer query:
SELECT t."person", t."timestamp"
FROM table t
JOIN (SELECT "person", MAX("timestamp") AS max_ts
      FROM table GROUP BY "person") latest
  ON latest."person" = t."person" AND latest.max_ts = t."timestamp"
WHERE t."type" = 1 AND t."field" LIKE '%ABCD%'
Direct answer: as I understand your end goal, just move the HAVING clause to the WHERE section:
SELECT
table."person", MAX(table."timestamp")
FROM table
WHERE
table."type" = 1
AND table."field" LIKE '%ABCD%'
GROUP BY table."person";
This should return no more than 1 row per table."person", with their associated maximum timestamp.
As an aside, I am surprised your query worked at all. Your HAVING clause references a column that is neither in the GROUP BY nor inside an aggregate. From the documentation (and my experience):
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.
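The difference between filtering before and after aggregation can be seen in a small sketch. It uses Python's sqlite3 module, and the table name entries and the four sample rows are made up. The third query is one possible way (an assumption, not taken from the answers) to pick each person's latest row first and only then apply the ABCD filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE entries ("person" TEXT, "timestamp" INTEGER, '
    '"type" INTEGER, "field" TEXT)'
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [("alice", 1, 1, "xxABCDxx"), ("alice", 2, 1, "other"),
     ("bob", 1, 1, "other"), ("bob", 3, 1, "ABCD")],
)

# WHERE runs before grouping: MAX only sees rows that contain ABCD.
filter_first = conn.execute("""
    SELECT "person", MAX("timestamp")
    FROM entries
    WHERE "type" = 1 AND "field" LIKE '%ABCD%'
    GROUP BY "person"
    ORDER BY "person"
""").fetchall()

# HAVING runs after grouping: it keeps or drops whole groups.
having_filter = conn.execute("""
    SELECT "person", MAX("timestamp")
    FROM entries
    GROUP BY "person"
    HAVING MAX("timestamp") > 2
""").fetchall()

# Latest row per person first, then the ABCD filter on that row only.
latest_first = conn.execute("""
    SELECT e."person", e."timestamp"
    FROM entries AS e
    JOIN (SELECT "person", MAX("timestamp") AS max_ts
          FROM entries WHERE "type" = 1 GROUP BY "person") AS latest
      ON latest."person" = e."person" AND latest.max_ts = e."timestamp"
    WHERE e."type" = 1 AND e."field" LIKE '%ABCD%'
    ORDER BY e."person"
""").fetchall()

print(filter_first)
print(having_filter)
print(latest_first)
```

With this data, alice appears in the first result (her older row matches ABCD) but not in the last one, because her latest row does not match.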
I have the following code
SELECT C_Record.BunchOfColumns, Count(*) AS Degrees
FROM C_Record
WHERE (((C_Record.[C#])=[Enter Value])) //Parameter Input from User
GROUP BY C_Record.BunchofColumns;
My Degrees column never increments, it shows 1 always no matter how many rows are returned from the query. I am suspecting that I have not implemented my GROUP BY method properly. If I understand it correctly, all columns that are selected and are not part of the aggregate function (COUNT in my case) should be put together in GROUP BY. Any help is much appreciated. Thanks in advance
Edit: What I am trying to achieve is to check how many rows have a particular value for a column, then select all other relevant columns and create an index column. For example, if there are three rows that meet my requirement:
Col1 Col2 Degrees
A X 1
B Y 2
C Z 3
and if only 2 rows meet my requirement then
Col1 Col2 Degrees
P X 1
Q Y 2
P.S - my C_Record.BunchofColumns consists of about 10 columns that I did not include for the sake of brevity.
P.P.S - If I try to skip out on any column it gives me the error You Tried to execute a query that does not include the specified expression <<column_name>> as part of an aggregate function
When you use Count() with a GROUP BY, the count returned is the number of rows in each group. So to get a count greater than one, you would need more than one row in your table with exactly the same values. If you are selecting 10 different columns, it seems likely that no two rows in the database have exactly the same values in all 10 columns.
If you start by selecting and grouping by a single column, you will see counts of more than one.
That is not how GROUP BY works.
GROUP BY completely changes the meaning of your query. Each row of the result is an "aggregate grouping" of the original rows. Each aggregate grouping consists of all the rows with a particular combination of values for their GROUP BY columns. So if you GROUP BY ten columns, each grouping will consist of rows which are identical on all ten columns.
Once these groupings have been formed, you SELECT various aggregate values like count() or sum(), which provide you with information about the group as a whole. count(*) gives you the number of rows in the group, while count(column) gives you the number of rows in which column is non-NULL. You can also select any of the columns which appear in the GROUP BY clause, because those columns are identical across the whole group.
You are getting a count(*) of one because each of your groups only contains a single row. This is probably because you are grouping by ten columns, and there are no two rows which are identical for all ten columns.
If you just want a count of how many rows satisfy some query, and you don't want this aggregation at all, you write it like this:
SELECT count(*)
FROM something
WHERE something
-- no GROUP BY
;
That will form a single aggregate group of your whole query, and count the rows.
If you want something else, you will need to further explain what you're trying to do.
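A short sketch of the behaviour described above, using Python's sqlite3 module in place of Access. The table c_record, its columns, and the four rows are hypothetical stand-ins for the asker's ten-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE c_record (col1 TEXT, col2 TEXT, c_no INTEGER)")
conn.executemany(
    "INSERT INTO c_record VALUES (?, ?, ?)",
    [("A", "X", 7), ("B", "Y", 7), ("C", "Z", 7), ("D", None, 8)],
)

# Grouping by every selected column: each group holds one row, so count is 1.
fine_groups = conn.execute("""
    SELECT col1, col2, COUNT(*) AS degrees
    FROM c_record
    WHERE c_no = 7
    GROUP BY col1, col2
    ORDER BY col1
""").fetchall()

# No GROUP BY: the filtered rows form a single group, and COUNT(*) counts them.
matching = conn.execute(
    "SELECT COUNT(*) FROM c_record WHERE c_no = 7"
).fetchone()[0]

# COUNT(*) counts rows; COUNT(col2) counts only rows where col2 is non-NULL.
star_vs_column = conn.execute(
    "SELECT COUNT(*), COUNT(col2) FROM c_record"
).fetchone()

print(fine_groups)
print(matching)
print(star_vs_column)
```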
What happens when each column value in a table is divided by the total table row count? What operation does SQL Server actually perform? Can anyone help?
More specifically: what is the difference between sum(column value) / row count and column value / row count? For example:
select cast(officetotal as float) /count(officeid) as value,
sum(officetotal)/ count(officeid) as average from check1
where officeid ='50009' group by officeid,officetotal
What is the operation performed on both select?
In your example both expressions will always return the same value: because officetotal is also in the GROUP BY clause, each group contains a single distinct officetotal, so count(officeid) is 1 whenever no two rows share the same officeid and officetotal. In effect, no real grouping is applied.
When you remove officetotal from the GROUP BY, you will get following message:
Column 'officetotal' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
It means that you cannot use the bare column officetotal and SUM(officetotal) in one select unless officetotal is also in the GROUP BY: SUM works on a set of values, and it is pointless to SUM only one value.
It is just not possible to write it this way in SQL using GROUP BY. If you are looking for something like the first or last value from a group, you will have to use MIN(officetotal) or MAX(officetotal) or some other approach.
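The effect of the GROUP BY columns on the two expressions can be checked with a small sketch. It runs on Python's sqlite3 module rather than SQL Server (so CAST ... AS REAL replaces CAST ... AS float), and the two sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check1 (officeid TEXT, officetotal INTEGER)")
conn.executemany(
    "INSERT INTO check1 VALUES (?, ?)",
    [("50009", 100), ("50009", 300)],
)

# Grouping by officeid AND officetotal: one row per group, count is 1,
# so both expressions just return officetotal itself.
grouped_by_both = conn.execute("""
    SELECT CAST(officetotal AS REAL) / COUNT(officeid) AS value,
           SUM(officetotal) / COUNT(officeid)          AS average
    FROM check1
    WHERE officeid = '50009'
    GROUP BY officeid, officetotal
    ORDER BY officetotal
""").fetchall()

# Grouping by officeid only: SUM / COUNT is now a real average over two rows.
grouped_by_office = conn.execute("""
    SELECT SUM(officetotal) * 1.0 / COUNT(officeid) AS average,
           AVG(officetotal)                         AS builtin_avg
    FROM check1
    WHERE officeid = '50009'
    GROUP BY officeid
""").fetchone()

print(grouped_by_both)
print(grouped_by_office)
```

Multiplying by 1.0 avoids integer division, which would otherwise truncate the result when both operands are integers.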
How can I generate an ID value for every set of duplicate records, as seen in the second table with the ID column? In other words, how can I make the first table look like the second table using a SQL query?
Assume that first name and last name in the first table can appear in duplicates.
Each first name and last name can have one or many purchase yr and cost.
The given image is just a sample. Total records in table 1 can reach thousands.
I'm using Oracle SQL.
Note: I'm working with one table only that is the first one. The second table is what I want.
You can use the DENSE_RANK analytic function to assign IDs as below:
EDIT:
Simplified query to generate IDs.
SELECT
DENSE_RANK() OVER (ORDER BY First_Name, Last_Name) ID,
t.*
FROM Table1 t;
Reference:
DENSE_RANK on Oracle Database SQL Reference
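For a quick way to see what DENSE_RANK produces, here is a sketch run through Python's sqlite3 module instead of Oracle. It assumes the bundled SQLite is version 3.25 or later (the first release with window functions), and the table contents and the purchase_yr and cost column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (first_name TEXT, last_name TEXT, "
    "purchase_yr INTEGER, cost INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [("Ann", "Lee", 2019, 10), ("Ann", "Lee", 2020, 20),
     ("Bob", "Kim", 2018, 5), ("Cara", "Diaz", 2021, 7),
     ("Bob", "Kim", 2019, 6)],
)

# Rows with the same (first_name, last_name) tie in the ordering, so they
# share one rank; DENSE_RANK leaves no gaps between consecutive ranks.
ranked = conn.execute("""
    SELECT DENSE_RANK() OVER (ORDER BY first_name, last_name) AS id,
           first_name, last_name, purchase_yr
    FROM table1
    ORDER BY id, purchase_yr
""").fetchall()

for row in ranked:
    print(row)
```

RANK would also give duplicates the same number but would skip values after each tie, which is why DENSE_RANK is the better fit for consecutive IDs.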