I'm having trouble summing a column that has both numeric and nvarchar values, where the numerics are summed (and grouped) but the strings are left as-is.
I.e.:
from:
| ID | Value |
+----+-------+
| a | 4 |
| b | 3 |
| c | hello |
| a | 8 |
+----+-------+
to:
| ID | Value |
+----+-------+
| a | 12 |
| b | 3 |
| c | hello |
+----+-------+
So far I have:
SELECT
    [ID],
    CASE
        WHEN ISNUMERIC([Value]) = 1 THEN SUM(CAST([Value] AS INT))
        ELSE [Value]
    END AS Value
FROM db
GROUP BY [ID]
But I get an error that "the column Value is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause".
Use try_convert()/try_cast() instead:
SELECT [ID], SUM(TRY_CAST([Value] as int))
FROM db
GROUP BY [ID];
Incidentally, the error you are getting is because the CASE expression references [Value] outside the SUM(). You have a compile error because [Value] is not in the GROUP BY; and even if it were, the aggregate is evaluated before the CASE, so you would still get a run-time conversion error on the non-numeric values.
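If you also need the non-numeric strings preserved in the output, as in your desired result, here is a minimal sketch of one way to do it, assuming SQL Server 2012+ and that any ID carrying a string (like 'hello') has no numeric rows mixed in (numeric rows in a mixed group would win):

SELECT [ID],
       COALESCE(CAST(SUM(TRY_CAST([Value] AS int)) AS nvarchar(50)),
                MAX([Value])) AS [Value]
FROM db
GROUP BY [ID];

TRY_CAST turns 'hello' into NULL, SUM over only NULLs is NULL, and COALESCE then falls back to the original string for that group.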
I have a table that has a number column and an attribute column like this:
1.
+-----+-----+
| num | att |
+-----+-----+
|  1  |  a  |
|  1  |  b  |
|  1  |  a  |
|  2  |  a  |
|  2  |  b  |
|  2  |  b  |
+-----+-----+
I want to make the number unique, and the attribute to be whichever attribute occurred most often for that number (this is the end product I'm interested in), like this:
2.
+-----+-----+
| num | att |
+-----+-----+
|  1  |  a  |
|  2  |  b  |
+-----+-----+
I have been working on this for a while and managed to write myself a query that looks up how many times an attribute occurs for a given number like this:
3.
+-----+-----+-------+
| num | att | count |
+-----+-----+-------+
|  1  |  a  |   2   |
|  1  |  b  |   1   |
|  2  |  a  |   1   |
|  2  |  b  |   2   |
+-----+-----+-------+
But I can't think of a way to only select those rows from the above table where the count is the highest (for each number of course).
So basically what I am asking is: given table 3, how do I select only the rows with the highest count for each number? (Of course, an answer providing a way to get from table 1 to table 2 directly also works. :) )
You can use aggregation and window functions:
select num, att
from (
    select num, att,
           row_number() over(partition by num order by count(*) desc, att) rn
    from mytable
    group by num, att
) t
where rn = 1
For each num, this returns the most frequent att; if there are ties, the smaller att is retained.
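If you wanted to keep all tied attributes instead, here is a sketch using rank() in place of row_number() (same table and columns as above); it returns every att that shares the top count:

select num, att
from (
    select num, att,
           rank() over(partition by num order by count(*) desc) rnk
    from mytable
    group by num, att
) t
where rnk = 1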
Oracle has an aggregate function that does exactly this, stats_mode():
select num, stats_mode(att)
from mytable
group by num;
In statistics, the most common value is called the mode -- hence the name of the function.
Here is a db<>fiddle.
You can use GROUP BY and COUNT to produce table 3:

select num, att, count(*) as count
from mytable
group by num, att
I have a crosstab() query similar to the one in my previous question:
Unexpected effect of filtering on result from crosstab() query
The common case is to filter the extra1 field with multiple values: extra1 IN (value1, value2, ...). For each value included in the extra1 filter, I have added an ordering expression like (extra1 <> valueN), as described in the post mentioned above. The resulting query is as follows:
SELECT *
FROM crosstab(
'SELECT row_name, extra1, extra2..., another_table.category, value
FROM table t
JOIN another_table ON t.field_id = another_table.field_id
WHERE t.field = certain_value AND t.extra1 IN (val1, val2, ...) --> more values
ORDER BY row_name ASC, (extra1 <> val1), (extra1 <> val2)', ... --> more ordering expressions
'SELECT category_name FROM category_name WHERE field = certain_value'
) AS ct(extra1, extra2...)
WHERE extra1 = val1; --> condition on the result
The first value of extra1 included in the ordering expressions (val1) gets the correct resulting rows. However, the following ones (val2, val3, ...) get the wrong number of results, with fewer rows for each one. Why is that?
UPDATE:
Giving this as our source table (table t):
+----------+--------+--------+------------------------+-------+
| row_name | Extra1 | Extra2 | another_table.category | value |
+----------+--------+--------+------------------------+-------+
| Name1 | 10 | A | 1 | 100 |
| Name2 | 11 | B | 2 | 200 |
| Name3 | 12 | C | 3 | 150 |
| Name2 | 11 | B | 3 | 150 |
| Name3 | 12 | C | 2 | 150 |
| Name1 | 10 | A | 2 | 100 |
| Name3 | 12 | C | 1 | 120 |
+----------+--------+--------+------------------------+-------+
And this as our category table:
+-------------+--------+
| category_id | value |
+-------------+--------+
| 1 | Cat1 |
| 2 | Cat2 |
| 3 | Cat3 |
+-------------+--------+
Using the CROSSTAB, the idea is to get a table like this:
+----------+--------+--------+------+------+------+
| row_name | Extra1 | Extra2 | cat1 | cat2 | cat3 |
+----------+--------+--------+------+------+------+
| Name1 | 10 | A | 100 | 100 | |
| Name2 | 11 | B | | 200 | 150 |
| Name3 | 12 | C | 120 | 150 | 150 |
+----------+--------+--------+------+------+------+
The idea is to be able to filter the resulting table so I get the rows whose Extra1 column has the value 10 or 11, as follows:
+----------+--------+--------+------+------+------+
| row_name | Extra1 | Extra2 | cat1 | cat2 | cat3 |
+----------+--------+--------+------+------+------+
| Name1 | 10 | A | 100 | 100 | |
| Name2 | 11 | B | | 200 | 150 |
+----------+--------+--------+------+------+------+
The problem is that in my query I get a different result size for Extra1 = 10 than for Extra1 = 11. With (Extra1 <> 10) I get the correct result size for that value, but not for 11.
Here is a fiddle demonstrating the problem in more detail:
https://dbfiddle.uk/?rdbms=postgres_11&fiddle=5c401f7512d52405923374c75cb7ff04
All "extra" columns are copied from the first row of the group (as pointed out in my previous answer)
While you filter with:
.... WHERE extra1 = 'val1';
...it makes no sense to add more ORDER BY expressions on the same column. Only rows that have at least one extra1 = 'val1' in their source group survive.
From your various comments, I guess you might want to see all distinct existing values of extra (within the set filtered in the WHERE clause) for the same unixdatetime. If so, aggregate before pivoting. Like:
SELECT *
FROM crosstab(
$$
SELECT unixdatetime, x.extras, c.name, s.value
FROM (
SELECT unixdatetime, array_agg(extra) AS extras
FROM (
SELECT DISTINCT unixdatetime, extra
FROM source_table s
WHERE extra IN (1, 2) -- condition moves here
ORDER BY unixdatetime, extra
) sub
GROUP BY 1
) x
JOIN source_table s USING (unixdatetime)
JOIN category_table c ON c.id = s.gausesummaryid
ORDER BY 1
$$
, $$SELECT unnest('{trace1,trace2,trace3,trace4}'::text[])$$
) AS final_result (unixdatetime int
, extras int[]
, trace1 numeric
, trace2 numeric
, trace3 numeric
, trace4 numeric);
Aside: advice given in the following related answer about the 2nd function parameter applies to your case as well:
PostgreSQL crosstab doesn't work as desired
I demonstrate a query with a static 2nd parameter above. While we're at it: you don't need to join to category_table at all. The same result, a bit shorter and faster:
SELECT *
FROM crosstab(
$$
SELECT unixdatetime, x.extras, s.gausesummaryid, s.value
FROM (
SELECT unixdatetime, array_agg(extra) AS extras
FROM (
SELECT DISTINCT unixdatetime, extra
FROM source_table
WHERE extra IN (1, 2) -- condition moves here
ORDER BY unixdatetime, extra
) sub
GROUP BY 1
) x
JOIN source_table s USING (unixdatetime)
ORDER BY 1
$$
, $$SELECT unnest('{923,924,926,927}'::int[])$$
) AS final_result (unixdatetime int
, extras int[]
, trace1 numeric
, trace2 numeric
, trace3 numeric
, trace4 numeric);
db<>fiddle here - added my queries at the bottom of your fiddle.
I have an existing database that has column values abstracted out to a separate 'values' table (for localization reasons, but not necessarily important in the context of this question). In my case, say I have two tables:
***ITEM Table***
 id | cat |
----|-----|
  1 |  A  |
  2 |  A  |
  3 |  B  |
***VALUES Table***
 seq | id | code | value | type  |
-----|----|------|-------|-------|
  10 |  1 | NAME | Name1 | Type1 |
  11 |  1 | DESC | Desc1 | Type1 |
  12 |  1 | NAME | Name1 | Type2 |
  13 |  1 | DESC | Desc1 | Type2 |
  14 |  2 | NAME | Name2 | Type1 |
  15 |  2 | DESC | Desc2 | Type1 |
Currently, I can retrieve names and descriptions like this:
***Current Result Set***
 id | code | value |
----|------|-------|
  1 | NAME | Name1 |
  1 | DESC | Desc1 |
  2 | NAME | Name2 |
  2 | DESC | Desc2 |
However, I would like to retrieve names/descriptions like this:
***Target Result Set***
 id | NAME  | DESC  |
----|-------|-------|
  1 | Name1 | Desc1 |
  2 | Name2 | Desc2 |
I thought a CTE/Window function may be appropriate in this case, but I'm not sure how to tackle this.
In essence, how can I create column aliases and associated values, based on a value in a column (in this case, if the 'code' column contains 'NAME', a virtual 'NAME' column would be created, with the value from the associated 'value' column)?
I considered a CASE statement too, but couldn't use it to create a dynamic alias like this.
If this is impossible, as in (Dynamic column alias based on column value), would it be possible to do this if I knew the "CODE" values ahead of time (i.e. I know that only "NAME" and "DESC" are valid codes)?
Actually you need the data to be pivoted; you can use CASE for this with the MAX aggregate function.
If you know the values in advance, it can be coded like the query below.
If the values are dynamic, then dynamic SQL is preferred; a sketch of that follows the query.
SELECT I.id,
       MAX(CASE WHEN V.code = 'NAME' THEN V.value END) AS [NAME],
       MAX(CASE WHEN V.code = 'DESC' THEN V.value END) AS [DESC]
FROM ITEM I
INNER JOIN [VALUES] V
        ON I.id = V.id
GROUP BY I.id
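For the dynamic case, here is a minimal sketch, assuming SQL Server (STUFF with FOR XML PATH for the string aggregation and sp_executesql for execution; the VALUES table name is bracketed because VALUES is a reserved word):

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build one MAX(CASE ...) expression per distinct code value.
SELECT @cols = STUFF((
        SELECT DISTINCT ', MAX(CASE WHEN V.code = ''' + code
                      + ''' THEN V.value END) AS [' + code + ']'
        FROM [VALUES]
        FOR XML PATH('')
    ), 1, 2, '');

SET @sql = N'SELECT I.id, ' + @cols + N'
FROM ITEM I
INNER JOIN [VALUES] V ON I.id = V.id
GROUP BY I.id;';

EXEC sp_executesql @sql;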
I have a table like the following:
+-------+--------------+
| Value | Date |
+-------+--------------+
| 14 | 10/11/2010 |
| 12 | 10/12/2010 |
| 12 | 10/13/2010 |
| 10 | 10/14/2010 |
| 8 | 10/15/2010 |
| 6 | 10/16/2010 |
| 4 | 10/17/2010 |
| 2 | 10/18/2010 |
+-------+--------------+
I would like to calculate the return (the quotient) between every row and the last row (the one with the latest date). E.g. for the row with date "10/16/2010", the result should be 6/2 = 3.
Hence, the resulting table should be
+-------+--------------+
| result| Date |
+-------+--------------+
| 7 | 10/11/2010 |
| 6 | 10/12/2010 |
| 6 | 10/13/2010 |
| 5 | 10/14/2010 |
| 4 | 10/15/2010 |
| 3 | 10/16/2010 |
| 2 | 10/17/2010 |
| 1 | 10/18/2010 |
+-------+--------------+
Is it possible to do this? Thank you!
You can first get the value you want to divide by. Since that's always going to be a single row, you can just use a cross join against it and perform your division. SQL Fiddle
with maxdate as (
    select max([Date]) as maxdate
    from table1
),
divby as (
    select value as divby
    from table1
    inner join maxdate md
            on md.maxdate = table1.[Date]
)
select value / divby as result,
       [Date]
from table1
cross join divby
To break it down a bit: the first CTE (cleverly named maxdate) gets the maximum date for the whole table. The second CTE (divby) gets the value (that you will be dividing by) for that max date. As long as you only get one row back from it, you can safely use a cross join, resulting in each row in your table being divided by that one value.
Another possible solution: JOIN the table to itself.
SQL Fiddle Example
select (t1b.value / t1a.value) as result,
       t1b.date
from table1 t1a
join table1 t1b
  on t1a.date = (select max(date) from table1)
Thanks for the fiddle, Andrew! This can also be accomplished with a window function on 2008 and above (fiddle: http://sqlfiddle.com/#!3/ecda1/11):
SELECT [Value] / MIN([Value]) OVER () AS result,
[Date]
FROM Table1
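Note that MIN([Value]) OVER () only matches the last row here because Value happens to decrease as the date increases. If that doesn't hold for your data, a sketch using FIRST_VALUE (SQL Server 2012+) divides by the value at the latest date explicitly:

SELECT [Value] / FIRST_VALUE([Value]) OVER (ORDER BY [Date] DESC) AS result,
       [Date]
FROM Table1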
The procedure is to fill the "City" column in Table B based on the "Letter" column from Table A.
TABLE A
+----------+-------+
| Number | Letter|
+----------+-------+
| 1 | A |
| 1 | |
| 1 | |
| 2 | |
| 2 | |
| 3 | |
| 3 | B |
| 3 | |
| 3 | C |
+----------+-------+
TABLE B
+----------+-------+
| AC | City |
+----------+-------+
| 1 | A |
| 1 | A |
| 1 | A |
| 1 | A |
| 2 | |
| 2 | |
| 2 | |
| 2 | |
| 3 | B |
| 3 | B |
| 3 | B |
+----------+-------+
If AC=1, refer to Number=1, and loop through the "Letter" values from top to bottom to get the top-most value.
For Number=1, the topmost value is A, so for AC=1, fill in all "City" column as A.
For AC=2, Number=2, and there are no values in Table A, so fill in all "City" for each AC=2 as blank.
For AC=3, Number=3, and the top-most value is B, so fill in all "City" for each AC=3 as B.
How do you code this in standard SQL?
I am using the Caspio software and will be inserting the SQL into the "City" column itself, but that shouldn't interfere too much with the code.
This is what I have so far:
SELECT Letter
FROM TableA
WHERE TableA.Number = TableB.AC
AND TableA.Number != ""
LIMIT 1
But it doesn't seem to be working, and I think it's necessary to loop through Table A to find the City value for each AC=Number.
Thanks for any help.
EDIT:
I have figured out the solution:
SELECT TOP 1 Letter
FROM TableA
WHERE Letter !='' AND Number=AC
Thanks.
It doesn't work because you are not including TableB in your FROM clause or joining to it. You can try this one:
SELECT Letter
FROM TableA
WHERE Number IN (SELECT AC FROM TableB WHERE City != '' AND City IS NOT NULL)
  AND Letter != '' AND Letter IS NOT NULL
First things first: don't think of "looping" in SQL; it means you're thinking about the problem wrong. You need to use set-based thinking.
So think about what you want to do, not how you want to do it.
You want to update TableB.City based on the value of TableA.Letter:
UPDATE TableB
SET City = Letter
FROM (
    SELECT Number, Letter,
           ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Number) AS SortOrder
    FROM TableA
    WHERE Letter IS NOT NULL AND Letter != ''
) AS A
WHERE A.SortOrder = 1 AND TableB.AC = A.Number
I have included the ROW_NUMBER sorting to ensure you get the first letter. Please note that you should order by your PK, assuming you have one and that it's an int IDENTITY column.
See the sqlFiddle
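If TableA does have such a key, here is a minimal sketch of the inner query, assuming a hypothetical int identity column named id:

SELECT Number, Letter,
       ROW_NUMBER() OVER (PARTITION BY Number ORDER BY id) AS SortOrder
FROM TableA
WHERE Letter IS NOT NULL AND Letter != ''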
EDIT
Sure, you can just do a select.
SELECT TableB.AC, A.Letter
FROM TableB
LEFT OUTER JOIN (
    SELECT Number, Letter,
           ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Number) AS SortOrder
    FROM TableA
    WHERE Letter IS NOT NULL AND Letter != ''
) AS A
       ON TableB.AC = A.Number AND A.SortOrder = 1