Multiplying values of same PKs - sql

I have the following table:
+-----+------+
| qwe | asdd |
+-----+------+
| a | 3 |
| a | 4 |
| b | 5 |
| b | 6 |
+-----+------+
The result should be something like this:
+-----+------+
| qwe | asdd |
+-----+------+
| a | 12 |
| b | 30 |
+-----+------+
I wrote a query that works only for this exact table; as soon as a row is added, it gives wrong results:
select qwe, (SUM(asdd) - MIN(asdd)) * MIN(asdd) a from t
group by qwe
How should I modify this query so that it also works for tables like this?
+-----+------+
| qwe | asdd |
+-----+------+
| a | 3 |
| b | 4 |
| b | 5 |
| a | 6 |
| a | 7 |
+-----+------+
And get table like this:
+-----+------+
| qwe | asdd |
+-----+------+
| a | 126 |
| b | 20 |
+-----+------+

There is no built-in PRODUCT() aggregate function. Alas.
Assuming all your values are positive, you can sum the logarithms and exponentiate the result, since exp(log(a) + log(b)) = a * b:
select qwe, exp(sum(log(asdd))) as aggregate_product
from t
group by qwe;
Note: This can be extended to handle 0 and negative values. That just adds a lot of extra stuff to the expression, which hides the fundamental logic.
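To handle zeros, return 0 as soon as a group contains one. Wrapping the value in nullif() alone is not enough, because sum() skips NULLs and the zero would simply be ignored:
select qwe, case when min(abs(asdd)) = 0 then 0 else exp(sum(log(nullif(asdd, 0)))) end as aggregate_product
from t group by qwe;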
Negative numbers are a bit trickier.
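A minimal sketch of one way to cover zeros and negative values together: compute the magnitude from the absolute values, and derive the sign from the parity of the count of negative values. To keep it runnable, it uses an in-memory SQLite database through Python's sqlite3 module, which is an assumption (the question does not name a database). The helpers nat_log and nat_exp are hypothetical names registered only for this example, and the extra groups c and d are made-up test data.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# nat_log / nat_exp are hypothetical helper names registered for this sketch,
# so it does not depend on SQLite's optional built-in math functions.
conn.create_function("nat_log", 1, lambda x: None if x is None else math.log(x))
conn.create_function("nat_exp", 1, lambda x: None if x is None else math.exp(x))

conn.execute("CREATE TABLE t (qwe TEXT, asdd INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 3), ("a", 6), ("a", 7),   # positive values only
     ("b", 4), ("b", 5),
     ("c", -2), ("c", 3),            # one negative value (made-up test data)
     ("d", 0), ("d", 9)],            # contains a zero (made-up test data)
)

query = """
SELECT qwe,
       CASE
           -- any zero in the group makes the whole product zero
           WHEN MIN(ABS(asdd)) = 0 THEN 0.0
           -- sign: -1 for an odd count of negatives, +1 otherwise
           ELSE (1 - 2 * (SUM(CASE WHEN asdd < 0 THEN 1 ELSE 0 END) % 2))
                -- magnitude: exp of the summed logs of the absolute values
                * nat_exp(SUM(nat_log(ABS(NULLIF(asdd, 0)))))
       END AS aggregate_product
FROM t
GROUP BY qwe
ORDER BY qwe
"""

# Round away floating-point noise introduced by exp/log.
products = {qwe: round(value, 6) for qwe, value in conn.execute(query)}
print(products)
```

Because the product goes through floating-point logarithms, the raw result is only approximately an integer, which is why the sketch rounds it.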

Related

TSQL - Number groups based on distinct values in certain columns

Let's say I have a table like this:
| ID | ColA | ColB | ColC | ... |
|-----|------|------|------|-----|
| 1 | 111 | XXX | foo | |
| 1 | 111 | XXX | bar | |
| ... | ... | ... | ... | |
| 1 | 111 | YYY | foo | |
| 1 | 111 | YYY | bar | |
| ... | ... | ... | ... | |
| 1 | 999 | XXX | foo | |
| 1 | 999 | XXX | bar | |
| ... | ... | ... | ... | |
| 1 | 999 | YYY | foo | |
| 1 | 999 | YYY | bar | |
| ... | ... | ... | ... | |
| 2 | 111 | XXX | foo | |
| 2 | 111 | XXX | bar | |
| ... | ... | ... | ... | |
There are further columns to the right with all sorts of other values.
I want to partition this table in T-SQL into distinct groups only by columns "ID", "ColA" and "ColB", without regard to all other columns. Then I want to sequentially number those groups. My final result should look like this:
| ID | ColA | ColB | ColC | ... | GroupNumber |
|-----|------|------|------|-----|-------------|
| 1 | 111 | XXX | foo | | 1 |
| 1 | 111 | XXX | bar | | 1 |
| ... | ... | ... | ... | | ... |
| 1 | 111 | YYY | foo | | 2 |
| 1 | 111 | YYY | bar | | 2 |
| ... | ... | ... | ... | | ... |
| 1 | 999 | XXX | foo | | 3 |
| 1 | 999 | XXX | bar | | 3 |
| ... | ... | ... | ... | | ... |
| 1 | 999 | YYY | foo | | 4 |
| 1 | 999 | YYY | bar | | 4 |
| ... | ... | ... | ... | | ... |
| 2 | 111 | XXX | foo | | 5 |
| 2 | 111 | XXX | bar | | 5 |
| ... | ... | ... | ... | | ... |
It seems like this should be an easy problem but I struggle to get a handle on it. I have a certain suspicion that this should work somehow with DENSE_RANK and the partitioning clause in that function. My approach is:
SELECT
*,
DENSE_RANK() OVER(
PARTITION BY ID, ColA, ColB
ORDER BY ColC
) AS GroupNumber
FROM my_table
but this keeps increasing the GroupNumber within each one of these blocks as well.
If I understand correctly, you have the right idea, but you don't need to partition the data inside the ranking function. You want the rank of each combination of ID, ColA, and ColB within the entire dataset, not the rank of rows within each combination.
If that's the case, simply drop the PARTITION BY clause from DENSE_RANK() and order by those columns instead:
SELECT
*,
DENSE_RANK() OVER(ORDER BY ID, ColA, ColB) AS GroupNumber
FROM my_table
This assumes the groups only need to be numbered in the order of ID, ColA, and ColB, which I think is what you want. You used ORDER BY ColC in your original attempt, but I'm guessing that was only because a ranking function requires an ORDER BY clause.
If you do need to number the groups in a different order, say so, because that would require something a little different.
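To see the numbering on a few rows, here is a small sketch that runs the same DENSE_RANK() expression against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, which is an assumption made only so the example is runnable (window functions need SQLite 3.25 or later), and the five sample rows are a cut-down version of the table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (ID INTEGER, ColA INTEGER, ColB TEXT, ColC TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [(1, 111, "XXX", "foo"),
     (1, 111, "XXX", "bar"),
     (1, 111, "YYY", "foo"),
     (1, 999, "XXX", "foo"),
     (2, 111, "XXX", "bar")],
)

# No PARTITION BY: rows with the same (ID, ColA, ColB) tie in the ordering,
# so DENSE_RANK gives them the same number and never skips a number.
numbered = conn.execute(
    """
    SELECT ID, ColA, ColB, ColC,
           DENSE_RANK() OVER (ORDER BY ID, ColA, ColB) AS GroupNumber
    FROM my_table
    ORDER BY ID, ColA, ColB
    """
).fetchall()

group_numbers = [row[-1] for row in numbered]
print(group_numbers)
```

The two rows sharing (1, 111, XXX) get the same group number regardless of ColC, and each new combination gets the next number.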

How to concat rows based on ID

Using standard SQL in BigQuery:
Given a table such as the following, where the values have already been counted so each appears only once:
| id | key | value |
--------------------
| 1 | read | aa |
| 1 | read | bb |
| 1 | name | abc |
| 2 | read | bb |
| 2 | read | cc |
| 2 | name | def |
| 2 | value| some |
| 3 | read | aa |
How can I make each row represent one user together with all of their values (something like NEST)?
So the table would look like:
| id | key | value |
--------------------
| 1 | read | aa |
| | read | bb |
| | name | abc |
| 2 | read | bb |
| | read | cc |
| | name | def |
| | value| some |
| 3 | read | aa |
I've tried ARRAY_AGG on a single column, but that just lists all the values of that column.
I need each row to be a single user with multiple values, as shown above, the way BigQuery displays repeated fields.
Below is for BigQuery Standard SQL
#standardSQL
SELECT id, ARRAY_AGG(STRUCT(key AS key, value AS value)) params
FROM `project.dataset.table`
GROUP BY id
Applied to your sample data, this returns one row per id, with params holding an array of (key, value) structs.

SQL Concat Id column with another column

Here is what I want to do:
I have this table
+----+-------------+
| id | data |
+----+-------------+
| 1 | max |
| 2 | linda |
| 3 | sam |
| 4 | henry |
+----+-------------+
and I want to update the data column by appending the id to it, so that it looks like this:
+----+-------------+
| id | data |
+----+-------------+
| 1 | max1 |
| 2 | linda2 |
| 3 | sam3 |
| 4 | henry4 |
+----+-------------+
This sounds like what you want (T-SQL; other platforms may use different methods for type conversion and concatenation):
update myTable
set data=data+convert(varchar(50),id)
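As an illustration of the "other platforms" remark, here is a sketch of the same update in SQLite, run from Python so it can be executed as-is. Choosing SQLite is an assumption made for the example: it concatenates with || and converts with CAST rather than + and CONVERT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "max"), (2, "linda"), (3, "sam"), (4, "henry")],
)

# SQLite dialect: || concatenates strings, CAST turns the integer id into text.
conn.execute("UPDATE myTable SET data = data || CAST(id AS TEXT)")

updated = conn.execute("SELECT id, data FROM myTable ORDER BY id").fetchall()
print(updated)
```

Running the update a second time would append the id again, so it is worth guarding it if the statement might be re-executed.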

Showing data from another table if it exists

I am having a hard time trying to get the correct data out of my DB.
I have a couple of tables:
events_template
| id | something |
==================
| 1 | something |
| 2 | something |
| 3 | something |
| 4 | something |
| 5 | something |
| 6 | something |
| 7 | something |
| 8 | something |
| 9 | something |
| 10 | something |
| 11 | something |
| 12 | something |
| 13 | something |
| 14 | something |

laser_events
| id | extid | added |
===========================
| 1 | 7 | added |
| 2 | 4 | added |
| 3 | 2 | added |
| 4 | 1 | added |
| 5 | 9 | added |
| 6 | 3 | added |
What I am trying to do is get some output that will show me the results of both tables together linked by id and extid, but still show the results from events_template even if there isn't a matching laser_events row.
I've tried something like
SELECT
events_template.id,
laser_events.extid
FROM
events_template,
laser_events
WHERE
events_template.id = laser_events.extid;
But that doesn't show me the events_template rows if there isn't a matching laser_events row.
Any help would be appreciated!
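You have to use a LEFT JOIN, which keeps every row of events_template and returns NULL for the laser_events columns where there is no match:
SELECT e.id, l.extid
FROM events_template e
LEFT JOIN laser_events l ON e.id = l.extid;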
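A quick way to check the behaviour is to load the sample rows from the question into an in-memory SQLite database and run the LEFT JOIN from Python. This is only a sketch: SQLite is an assumption (the question does not name the database), the join column is taken to be extid as in the table listing, and the something/added columns are left out because they do not affect the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events_template (id INTEGER)")
conn.execute("CREATE TABLE laser_events (id INTEGER, extid INTEGER)")

# events_template has ids 1..14; laser_events references six of them.
conn.executemany("INSERT INTO events_template VALUES (?)",
                 [(i,) for i in range(1, 15)])
conn.executemany("INSERT INTO laser_events VALUES (?, ?)",
                 [(1, 7), (2, 4), (3, 2), (4, 1), (5, 9), (6, 3)])

joined = conn.execute(
    """
    SELECT e.id, l.extid
    FROM events_template e
    LEFT JOIN laser_events l ON e.id = l.extid
    ORDER BY e.id
    """
).fetchall()

# Template rows without a matching laser_events row come back with extid = None.
unmatched = [template_id for template_id, extid in joined if extid is None]
print(len(joined), unmatched)
```

Every events_template row appears exactly once here because each extid occurs at most once; a template with several laser_events rows would appear once per match.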

Joining two tables and calculating divide-SUM from the resulting table in SQL Server

I have one table that looks like this:
+---------------+---------------+-----------+-------+------+
| id_instrument | id_data_label | Date | Value | Note |
+---------------+---------------+-----------+-------+------+
| 1 | 57 | 1.10.2010 | 200 | NULL |
| 1 | 57 | 2.10.2010 | 190 | NULL |
| 1 | 57 | 3.10.2010 | 202 | NULL |
| | | | | |
+---------------+---------------+-----------+-------+------+
And the other that looks like this:
+----------------+---------------+---------------+--------------+-------+-----------+------+
| id_fundamental | id_instrument | id_data_label | quarter_code | value | AnnDate | Note |
+----------------+---------------+---------------+--------------+-------+-----------+------+
| 1 | 1 | 20 | 20101 | 3 | 28.2.2010 | NULL |
| 2 | 1 | 20 | 20102 | 4 | 1.8.2010 | NULL |
| 3 | 1 | 20 | 20103 | 5 | 2.11.2010 | NULL |
| | | | | | | |
+----------------+---------------+---------------+--------------+-------+-----------+------+
What I would like to do is to merge/join these two tables in one in a way that I get something like this:
+------------+--------------+--------------+----------+--------------+
| Date | Table1.Value | Table2.Value | AnnDate | quarter_code |
+------------+--------------+--------------+----------+--------------+
| 1.10.2010 | 200 | 4 | 1.8.2010 | 20102 |
| 2.10.2010 | 190 | 4 | 1.8.2010 | 20102 |
| 3.10.2010 | 202 | 4 | 1.8.2010 | 20102 |
| | | | | |
+------------+--------------+--------------+----------+--------------+
The idea is to order the rows by Date from Table1; since Table2 values only change when a new AnnDate arrives, each Table1 row is populated with the Table2 values of the most recent announcement.
After that I would like to go through the resulting table and create another (final) table as follows.
For Date 1.10.2010, take the last 4 AnnDates (1.8.2010 and, for example, 20.3.2010, 30.1.2010 and 15.11.2009) and the Table2 values on those AnnDates. Sum those 4 values, then divide the Table1 Value by that sum.
So we would get something like:
+-----------+---------------------------------------------------------------+
| Date | FinalValue |
+-----------+---------------------------------------------------------------+
| 1.10.2010 | 200/(Table2.Value on 1.8.2010+Table2.Value on 20.3.2010 +...) |
| | |
+-----------+---------------------------------------------------------------+
Is there any way this can be done?
EDIT:
Hmm, yes, now I see that I really didn't do a good job of explaining it.
What I meant is this. I tried an INNER JOIN:
SELECT TableOne.Date, TableOne.Value, TableTwo.Value, TableTwo.AnnDate, TableTwo.quarter_code
FROM TableOne
INNER JOIN TableTwo ON TableOne.id_instrument = TableTwo.id_instrument
WHERE TableOne.id_data_label = somevalue AND TableTwo.id_data_label = somevalue AND date > xxx AND date < yyy
This inner join returns 2620*40 rows, which means that for every AnnDate from Table2 it returns all Dates from Table1.
What I want is 2620 rows: each Date from Table1, the Table1 value on that date, and the Table2 value that applies to that date.
For example:
Table1:
+-------+-------+
| Date | Value |
+-------+-------+
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
+-------+-------+
Table2
+-------+---------+
| Value | AnnDate |
+-------+---------+
| x | 1 |
| y | 4 |
+-------+---------+
Resulting table:
+-------+---------+---------+
| Date | ValueT1 | ValueT2 |
+-------+---------+---------+
| 1 | a | x |
| 2 | b | x |
| 3 | c | x |
| 4 | d | y |
+-------+---------+---------+
You need a JOIN statement for your first query. Try:
SELECT TableOne.Date, TableOne.Value, TableTwo.Value, TableTwo.AnnDate, TableTwo.quarter_code
FROM TableOne
INNER JOIN TableTwo
ON TableOne.id_instrument = TableTwo.id_instrument;
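A join on id_instrument alone pairs every Table1 row with every Table2 row of that instrument, which is the row explosion described in the question. One possible way to get a single Table2 value per date is a correlated subquery that picks the latest announcement on or before each date. The sketch below is not the answerer's method; it runs the toy example from the question in an in-memory SQLite database, which is an assumption made to keep it runnable, and it leaves out the id_instrument and id_data_label filters for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableOne (Date INTEGER, Value TEXT)")
conn.execute("CREATE TABLE TableTwo (Value TEXT, AnnDate INTEGER)")

# Toy data from the question: dates 1..4, announcements on dates 1 and 4.
conn.executemany("INSERT INTO TableOne VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c"), (4, "d")])
conn.executemany("INSERT INTO TableTwo VALUES (?, ?)",
                 [("x", 1), ("y", 4)])

result = conn.execute(
    """
    SELECT t1.Date,
           t1.Value AS ValueT1,
           -- latest announcement made on or before this date
           (SELECT t2.Value
            FROM TableTwo t2
            WHERE t2.AnnDate <= t1.Date
            ORDER BY t2.AnnDate DESC
            LIMIT 1) AS ValueT2
    FROM TableOne t1
    ORDER BY t1.Date
    """
).fetchall()
print(result)
```

In SQL Server the same idea is usually written with OUTER APPLY and TOP (1) instead of LIMIT. The same correlated pattern with a limit of 4 and a SUM over the selected values would give the denominator for the final ratio.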