Limit CONCAT to 32.767 chars [IMPALA] - sql

Basically, I have a table that looks something like this:
| id | string_col |
| -- | ---------- |
| 1 | aaaaaaa |
| 1 | bbbbbbb |
| 1 | ccccccc |
| 2 | aaaaaaa |
| 2 | bbbbbbb |
| 2 | ccccccc |
and a query that groups the rows by ids and concatenates the string_col values for that id:
SELECT
id,
CONCAT('[', GROUP_CONCAT(DISTINCT(string_col)), ']') AS strings_list
FROM my_table
GROUP BY id
the result looks like this:
| id | strings_list |
| -- | --------------------------- |
| 1 | [aaaaaaa, bbbbbbb, ccccccc] |
| 2 | [aaaaaaa, bbbbbbb, ccccccc] |
The query does what I want, but I need to export the results to Excel. The maximum cell length in Excel is 32,767 characters, and in some cases strings_list is far longer than that, which messes up the entire file.
Is there any way I could limit the CONCAT result to 32,767 characters?

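Just use the LEFT function with a length argument, e.g.
LEFT(CONCAT('[', GROUP_CONCAT(DISTINCT(string_col)), ']'), 32767)
Note that a value that actually gets truncated loses its closing bracket.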
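As a rough illustration of the truncation, here is a sketch that runs against SQLite from Python rather than Impala (an assumption, made only so the example is self-contained). SQLite's substr(x, 1, n) stands in for LEFT(x, n), and the list is cut to 32,765 characters before the brackets are added, so the final string never exceeds the Excel limit and still ends with ']'. The sample rows are made up.

```python
import sqlite3

EXCEL_CELL_LIMIT = 32767  # maximum number of characters Excel keeps in one cell

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, string_col TEXT)")
# Hypothetical data: id 1 concatenates to far more than the limit, id 2 does not.
rows = [(1, "a" * 20000), (1, "b" * 20000), (2, "ccccccc")]
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# Truncate the concatenated list first, then wrap it in brackets.
query = """
SELECT id,
       '[' || substr(group_concat(DISTINCT string_col), 1, ?) || ']' AS strings_list
FROM my_table
GROUP BY id
ORDER BY id
"""
result = dict(conn.execute(query, (EXCEL_CELL_LIMIT - 2,)).fetchall())

for row_id, strings_list in result.items():
    print(row_id, len(strings_list))
```

The last list element may still be cut in the middle; if that matters, the truncation would have to happen on element boundaries instead.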

Related

Multiple unnest hits custom Dimensions BigQuery

I have this table in BigQuery
+---------------------------+---------------------------+---------------------------+
| date |hits.customDimensions.index|hits.customDimensions.value|
+---------------------------+---------------------------+---------------------------+
| 24/09/2021 | 1 | ARG |
| | 2 | production |
| | 3 | id1 |
| | 4 | label1|label2 |
| 24/09/2021 | 1 | GER |
| | 2 | production |
| | 3 | id2 |
| | 4 | label1|label4 |
+---------------------------+---------------------------+---------------------------+
I would like to get a table like this:
+-------------+----------+----------+
| date | Country | labels |
+-------------+----------+----------+
| 24/09/2021 | ARG | label1 |
| 24/09/2021 | ARG | label2 |
| 24/09/2021 | GER | label1 |
| 24/09/2021 | GER | label4 |
+-------------+----------+----------+
I tried to UNNEST hits.customDimensions individually, but I couldn't join the country and label information into one table.
Consider the approach below:
select date,
(
select value from t.hits hit,
hit.customDimensions customDimension
where index = 1
) country,
label
from data t,
unnest((
select split(value, '|') from t.hits hit,
hit.customDimensions customDimension
where index = 4
)) label
Applied to the sample data in your question, the output has one row per label, with the columns date, country and label.
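To make the logic of that query easier to follow, here is a small plain-Python sketch of the same idea: pick the value at index 1 as the country, split the value at index 4 on '|', and emit one row per label. The nested dictionaries are only an assumed stand-in for the BigQuery record layout, and the sketch assumes every index occurs exactly once per row.

```python
# Assumed in-memory stand-in for the nested hits.customDimensions records.
sessions = [
    {"date": "24/09/2021", "hits": [{"customDimensions": [
        {"index": 1, "value": "ARG"},
        {"index": 2, "value": "production"},
        {"index": 3, "value": "id1"},
        {"index": 4, "value": "label1|label2"},
    ]}]},
    {"date": "24/09/2021", "hits": [{"customDimensions": [
        {"index": 1, "value": "GER"},
        {"index": 2, "value": "production"},
        {"index": 3, "value": "id2"},
        {"index": 4, "value": "label1|label4"},
    ]}]},
]


def flatten(sessions):
    """Return one (date, country, label) tuple per label."""
    rows = []
    for session in sessions:
        # Map each custom dimension index to its value.
        dims = {
            dim["index"]: dim["value"]
            for hit in session["hits"]
            for dim in hit["customDimensions"]
        }
        # Index 1 holds the country, index 4 the '|'-separated labels.
        for label in dims[4].split("|"):
            rows.append((session["date"], dims[1], label))
    return rows


flat_rows = flatten(sessions)
for row in flat_rows:
    print(row)
```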

How to concat rows based on ID

Using standard SQL in BigQuery:
Given a table such as this one, where the values have already been counted so each appears only once:
| id | key | value |
--------------------
| 1 | read | aa |
| 1 | read | bb |
| 1 | name | abc |
| 2 | read | bb |
| 2 | read | cc |
| 2 | name | def |
| 2 | value| some |
| 3 | read | aa |
How can I make it so each row is one user and their respective values? e.g. NEST
So the table would look like:
| id | key | value |
--------------------
| 1 | read | aa |
| | read | bb |
| | name | abc |
| 2 | read | bb |
| | read | cc |
| | name | def |
| | value| some |
| 3 | read | aa |
I've tried using ARRAY_AGG on the column, which ends up listing all the values of that column.
I just need to have each row as a single user with multiple values, as shown above.
Like BigQuery does here, this is what I want it to look like:
Below is for BigQuery Standard SQL
#standardSQL
SELECT id, ARRAY_AGG(STRUCT(key AS key, value AS value)) params
FROM `project.dataset.table`
GROUP BY id
Applied to your sample data, the result is one row per id, with params holding that id's key/value pairs.
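For readers who want to see the shape of that result outside BigQuery, here is a plain-Python sketch of what ARRAY_AGG(STRUCT(key, value)) grouped by id produces: one entry per id, each holding a list of key/value pairs. The function name nest_by_id is made up for this illustration.

```python
# Sample rows from the question as (id, key, value) tuples.
rows = [
    (1, "read", "aa"), (1, "read", "bb"), (1, "name", "abc"),
    (2, "read", "bb"), (2, "read", "cc"), (2, "name", "def"),
    (2, "value", "some"), (3, "read", "aa"),
]


def nest_by_id(rows):
    """Group rows by id, collecting each id's key/value pairs in a list."""
    nested = {}
    for row_id, key, value in rows:
        nested.setdefault(row_id, []).append({"key": key, "value": value})
    return nested


params_by_id = nest_by_id(rows)
for row_id, params in params_by_id.items():
    print(row_id, params)
```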

How to get value in one row in postgreSQL?

I have a database table named emp_leave in PostgreSQL 9.3 like this:
|emp_name|department|ann_leave|med_leave|cas_leave|org_ann_lv|org_med_lv|org_cas_lv|
| Tame | IT | | | 3 | | | 25 |
| Tame | IT | 4 | | | 20 | | |
| Tame | IT | | 3 | | | 30 | |
I want the query result to look like this:
|emp_name|department|ann_leave|med_leave|cas_leave|org_ann_lv|org_med_lv|org_cas_lv|
| Tame | IT | 4 | 3 | 3 | 20 | 30 | 25 |
You want aggregation:
select el.emp_name, el.department,
max(el.ann_leave),
. . . ,
max(el.org_cas_lv)
from emp_leave el
group by el.emp_name, el.department;
This assumes the blank cells are NULL.
To eliminate the NULL values and collapse the rows into a single row:
select
emp_name,
department,
min(ann_leave),
min(med_leave),
min(cas_leave),
min(org_ann_lv),
min(org_med_lv),
min(org_cas_lv)
from emp_leave
group by
emp_name,
department
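Both answers rely on the fact that aggregates such as MAX and MIN skip NULLs, so each column collapses to its single non-NULL value per group. Here is a quick sketch of that behaviour, run against SQLite from Python instead of PostgreSQL (an assumption, made only to keep the example self-contained), and assuming the blank cells are NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE emp_leave (
    emp_name TEXT, department TEXT,
    ann_leave INTEGER, med_leave INTEGER, cas_leave INTEGER,
    org_ann_lv INTEGER, org_med_lv INTEGER, org_cas_lv INTEGER
)
""")
# The three sample rows; None is stored as NULL.
conn.executemany(
    "INSERT INTO emp_leave VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("Tame", "IT", None, None, 3, None, None, 25),
        ("Tame", "IT", 4, None, None, 20, None, None),
        ("Tame", "IT", None, 3, None, None, 30, None),
    ],
)

# MAX ignores NULLs, so each column yields its one non-NULL value.
merged = conn.execute("""
SELECT emp_name, department,
       MAX(ann_leave), MAX(med_leave), MAX(cas_leave),
       MAX(org_ann_lv), MAX(org_med_lv), MAX(org_cas_lv)
FROM emp_leave
GROUP BY emp_name, department
""").fetchall()
print(merged)
```

If a group ever holds more than one non-NULL value in a column, MAX and MIN will give different results, so the choice between them only matters in that case.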

SQL 'Sum' Text Fields, Delim with commas

I have a table like this:
+----+-------+-----------------+
| ID | Name | Email |
+----+-------+-----------------+
| 1 | Jane | Jane#doe.com |
| 2 | Will | Will#gmail.com |
| 3 | Will | wsj#example.com |
| 4 | Jerry | jj2#test.com |
+----+-------+-----------------+
Unfortunately, I have duplicate records because some people have multiple emails. I would like to run a SQL query to generate this:
+----+-------+---------------------------------+
| ID | Name | Email |
+----+-------+---------------------------------+
| 1 | Jane | Jane#doe.com |
| 2 | Will | Will#gmail.com, wsj#example.com |
| 4 | Jerry | jj2#test.com |
+----+-------+---------------------------------+
I know with numbers you'd do something like this, but I don't know how to 'sum' text fields:
SELECT *,
SUM(Number_Field) AS Number_Field
FROM table
Thanks!
Edit: I am using MS Access
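To pin down the result being asked for, here is a sketch that uses SQLite's group_concat from Python. This is not an MS Access solution: SQLite is used only because it ships with Python, and the table name contacts is made up. It keeps the lowest ID per name and joins that name's emails with ', '.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "contacts" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE contacts (ID INTEGER, Name TEXT, Email TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1, "Jane", "Jane#doe.com"),
        (2, "Will", "Will#gmail.com"),
        (3, "Will", "wsj#example.com"),
        (4, "Jerry", "jj2#test.com"),
    ],
)

# One row per name: the smallest ID plus all emails joined by ', '.
merged = conn.execute("""
SELECT MIN(ID) AS first_id, Name, group_concat(Email, ', ') AS Email
FROM contacts
GROUP BY Name
ORDER BY first_id
""").fetchall()

for row in merged:
    print(row)
```

SQLite does not promise a particular order for the concatenated emails, so the sketch makes no claim about which address comes first.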

Joining two tables and calculating divide-SUM from the resulting table in SQL Server

I have one table that looks like this:
+---------------+---------------+-----------+-------+------+
| id_instrument | id_data_label | Date | Value | Note |
+---------------+---------------+-----------+-------+------+
| 1 | 57 | 1.10.2010 | 200 | NULL |
| 1 | 57 | 2.10.2010 | 190 | NULL |
| 1 | 57 | 3.10.2010 | 202 | NULL |
| | | | | |
+---------------+---------------+-----------+-------+------+
And the other that looks like this:
+----------------+---------------+---------------+--------------+-------+-----------+------+
| id_fundamental | id_instrument | id_data_label | quarter_code | value | AnnDate | Note |
+----------------+---------------+---------------+--------------+-------+-----------+------+
| 1 | 1 | 20 | 20101 | 3 | 28.2.2010 | NULL |
| 2 | 1 | 20 | 20102 | 4 | 1.8.2010 | NULL |
| 3 | 1 | 20 | 20103 | 5 | 2.11.2010 | NULL |
| | | | | | | |
+----------------+---------------+---------------+--------------+-------+-----------+------+
What I would like to do is to merge/join these two tables in one in a way that I get something like this:
+------------+--------------+--------------+----------+--------------+
| Date | Table1.Value | Table2.Value | AnnDate | quarter_code |
+------------+--------------+--------------+----------+--------------+
| 1.10.2010. | 200 | 3 | 1.8.2010 | 20102 |
| 2.10.2010. | 190 | 3 | 1.8.2010 | 20102 |
| 3.10.2010. | 202 | 3 | 1.8.2010 | 20102 |
| | | | | |
+------------+--------------+--------------+----------+--------------+
So the idea is to order the rows by Date from Table1, and since the Table2 values only change when AnnDate changes, we fill the resulting table with the same Table2 values until the next AnnDate.
After that I would like to go through the resulting table and create another one (the final table) as follows.
For Date 1.10.2010, take the last 4 AnnDates (that would be 1.8.2010 and, e.g., 20.3.2010, 30.1.2010 and 15.11.2009) and the Table2 values on those AnnDates. Sum those 4 values and then divide the Table1 Value by that sum.
So we would get something like:
+-----------+---------------------------------------------------------------+
| Date | FinalValue |
+-----------+---------------------------------------------------------------+
| 1.10.2010 | 200/(Table2.Value on 1.8.2010+Table2.Value on 20.3.2010 +...) |
| | |
+-----------+---------------------------------------------------------------+
Is there any way this can be done?
EDIT:
Hmm, yes, now I see that I really didn't do a good job explaining it.
What I meant is this: I tried an INNER JOIN like this:
SELECT TableOne.Date, TableOne.Value, TableTwo.Value, TableTwo.AnnDate, TableTwo.quarter_code
FROM TableOne
INNER JOIN TableTwo ON TableOne.id_instrument=TableTwo.id_instrument WHERE TableOne.id_data_label = somevalue AND TableTwo.id_data_label = somevalue AND date > xxx AND date < yyy
This inner join returns 2620*40 rows, which means that for every AnnDate from Table2 it returns all Dates from Table1.
What I want is to return 2620 rows: the Dates from Table1, the Values from Table1 on each date, and the Values from Table2 that correspond to that period of dates,
e.g.
Table1:
+-------+-------+
| Date | Value |
+-------+-------+
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
+-------+-------+
Table2
+-------+---------+
| Value | AnnDate |
+-------+---------+
| x | 1 |
| y | 4 |
+-------+---------+
Resulting table:
+-------+---------+---------+
| Date | ValueT1 | ValueT2 |
+-------+---------+---------+
| 1 | a | x |
| 2 | b | x |
| 3 | c | x |
| 4 | d | y |
+-------+---------+---------+
You need a JOIN statement for your first query. Try:
SELECT TableOne.Date, TableOne.Value, TableTwo.Value, TableTwo.AnnDate, TableTwo.quarter_code FROM TableOne
INNER JOIN TableTwo
ON TableOne.id_instrument=TableTwo.id_instrument;
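A plain join on id_instrument pairs every Date with every AnnDate, which is the 2620*40 row result described in the question. One way to get the "resulting table" from the small example instead is to pick, for each Date, the Table2 row with the latest AnnDate that is not after that Date. Here is a sketch of that idea on the toy data, run against SQLite from Python (an assumption; in SQL Server the LIMIT 1 subquery would be written with TOP 1 or OUTER APPLY). The instrument and data label filters are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableOne (Date INTEGER, Value TEXT)")
conn.execute("CREATE TABLE TableTwo (Value TEXT, AnnDate INTEGER)")
# Toy data from the question's small example.
conn.executemany(
    "INSERT INTO TableOne VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)
conn.executemany("INSERT INTO TableTwo VALUES (?, ?)", [("x", 1), ("y", 4)])

# For each Date, take the Table2 value announced most recently on or before it.
query = """
SELECT t1.Date,
       t1.Value AS ValueT1,
       (SELECT t2.Value
        FROM TableTwo AS t2
        WHERE t2.AnnDate <= t1.Date
        ORDER BY t2.AnnDate DESC
        LIMIT 1) AS ValueT2
FROM TableOne AS t1
ORDER BY t1.Date
"""
resulting = conn.execute(query).fetchall()
for row in resulting:
    print(row)
```

This keeps exactly one output row per Table1 row. The sum over the last 4 AnnDates could be built the same way, by summing over a subquery limited to 4 rows instead of 1.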