How to easily list columns in BigQuery - sql

I have a table with many columns in BigQuery.
I want to list its columns in a SELECT query, but writing out all of the columns is tedious.
I want to do something like this:
SELECT
col1,
col2,
col3,
...
SOME_METHOD(col30),
...
col50
FROM
foo.bar;
Is there an easy way to write such a query?

Below is for BigQuery Standard SQL
SELECT * EXCEPT(col30), SOME_METHOD(col30)
FROM foo.bar
or
SELECT * REPLACE(SOME_METHOD(col30) as col30)
FROM foo.bar
for example
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 col1, 2 col2, 3 col3, 4 col4, 5 col5
)
SELECT * EXCEPT(col3), 2 * col3 AS col3
FROM `project.dataset.table`
with result
Row col1 col2 col4 col5 col3
1 1 2 4 5 6
or
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 col1, 2 col2, 3 col3, 4 col4, 5 col5
)
SELECT * REPLACE(2 * col3 AS col3)
FROM `project.dataset.table`
with result
Row col1 col2 col3 col4 col5
1 1 2 6 4 5

This is untested in BigQuery, but one trick that is available in other databases, such as SQL Server, is to do a SELECT * and then also list the other items you want to select. So you could try one of the following:
SELECT *, SOME_METHOD(col30) AS output
FROM yourTable;
Or
SELECT SOME_METHOD(col30) AS output, *
FROM yourTable;
Note that, depending on which other items you list explicitly, the same column (and name) can appear more than once in the result set.
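The duplicate-name caveat can be seen in a small sketch. SQLite (through Python's sqlite3 module) is used here only as a stand-in engine, which is an assumption; the table bar and its values are made up, and BigQuery may report such a result differently.

```python
import sqlite3

# SQLite is a stand-in engine here; the table and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.execute("INSERT INTO bar VALUES (1, 2, 3)")

# SELECT * already returns col3, and the extra expression reuses that name.
cur = conn.execute("SELECT *, 2 * col3 AS col3 FROM bar")
column_names = [d[0] for d in cur.description]
row = cur.fetchone()

print(column_names)  # col3 is listed twice
print(row)
```

Giving the computed column a different alias avoids the clash.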

Related

Union all in Vertica SQL based on tables with different number of columns?

Hello, I have two tables in Vertica SQL:
table 1
col1 col2 col3
1 3 5
2 4 6
table 2
col1 col2
11 33
22 44
I would like to UNION these two tables, so that the result is:
col1 col2 col3
1 3 5
2 4 6
11 33 NULL
22 44 NULL
How can I do this in Vertica?
In general, you should use UNION ALL and define the extra column with whatever default value you want:
select col1, col2, col3
from table1
union all
select col1, col2, NULL as col3
from table2;
UNION incurs overhead for removing duplicates. In general, you should use UNION ALL unless you intend to remove duplicates.
Use NULL for the missing column, as follows:
select col1, col2, col3 from table1
union
select col1, col2, null from table2
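As a quick check of the NULL-padding approach, here is a minimal sketch that runs the UNION ALL query against the sample data from the question. It uses SQLite through Python's sqlite3 module as a stand-in for Vertica, which is an assumption; the ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# SQLite stands in for Vertica here (an assumption); data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 INTEGER, col3 INTEGER);
    CREATE TABLE table2 (col1 INTEGER, col2 INTEGER);
    INSERT INTO table1 VALUES (1, 3, 5), (2, 4, 6);
    INSERT INTO table2 VALUES (11, 33), (22, 44);
""")

# Pad the shorter table with NULL so both SELECTs have three columns.
padded = conn.execute("""
    SELECT col1, col2, col3 FROM table1
    UNION ALL
    SELECT col1, col2, NULL AS col3 FROM table2
    ORDER BY col1
""").fetchall()

# UNION removes duplicate rows, UNION ALL keeps them.
kept = conn.execute("SELECT 1 AS x UNION ALL SELECT 1").fetchall()
deduped = conn.execute("SELECT 1 AS x UNION SELECT 1").fetchall()

for r in padded:
    print(r)
```

The NULL column comes back as None in Python, and the last two queries show the duplicate handling that separates UNION from UNION ALL.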

Getting the value of no grouping column

I know the basics in SQL programming and I know how to apply some tricks in SQL Server in order to get the result set, but I don't know all tricks in Oracle.
I have these columns:
col1 col2 col3
And I wrote this query
SELECT
col1, MAX(col3) AS mx3
FROM
myTable
GROUP BY
col1
I need to get the value of col2 from the same row where the max value of col3 was found. Do you know a trick to solve this problem?
The easiest way to do this, IMHO, is not to use max, but the window function rank:
SELECT col1 , col2, col3
FROM (SELECT col1, col2, col3,
RANK() OVER (PARTITION BY col1 ORDER BY col3 DESC) rk
FROM myTable) t
WHERE rk = 1
BTW, the same syntax should also work for MS SQL Server and most other modern databases. MySQL was the notable exception before version 8.0, which added window functions.
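One point worth seeing in action is that RANK() returns every row tied for the maximum. The sketch below runs the query on SQLite (3.25 or later, for window functions) via Python's sqlite3 module, which is an assumption standing in for Oracle; the sample rows are borrowed from another answer in this thread, and the ROW_NUMBER() variant with a col2 tie-breaker is my own illustration.

```python
import sqlite3

# SQLite (3.25+) stands in for Oracle here; this is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 2), (1, 1, 3), (1, 3, 3),
     (2, 10, 1), (2, 23, 2), (2, 12, 2)],
)

# RANK() gives rank 1 to every row that ties for the highest col3.
ranked = conn.execute("""
    SELECT col1, col2, col3
    FROM (SELECT col1, col2, col3,
                 RANK() OVER (PARTITION BY col1 ORDER BY col3 DESC) AS rk
          FROM myTable) t
    WHERE rk = 1
    ORDER BY col1, col2
""").fetchall()

# ROW_NUMBER() with a tie-breaker returns exactly one row per col1.
numbered = conn.execute("""
    SELECT col1, col2, col3
    FROM (SELECT col1, col2, col3,
                 ROW_NUMBER() OVER (PARTITION BY col1
                                    ORDER BY col3 DESC, col2 DESC) AS rn
          FROM myTable) t
    WHERE rn = 1
    ORDER BY col1
""").fetchall()

print(ranked)
print(numbered)
```

If ties on col3 are possible and only one row per group is wanted, the ROW_NUMBER() form with an explicit tie-breaker is the safer choice.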
A couple of different ways to do this:
In both cases I'm treating your initial query as either a common table expression or as an inline view and joining it back to the base table to get your added column. The trick here is that the INNER JOIN eliminates all the records not in your max query.
SELECT A.*
FROM myTable A
INNER JOIN (SELECT col1 , MAX( col3 ) AS mx3 FROM myTable GROUP BY col1) B
on A.Col1=B.Col1
and B.mx3 = A.Col3
or
with CTE AS (SELECT col1 , MAX( col3 ) AS mx3 FROM myTable GROUP BY col1)
SELECT A.*
FROM myTable A
INNER JOIN CTE B
on A.col1 = B.col1
and A.col3 = B.mx3
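The join-back idea can be checked with a small sketch. SQLite via Python's sqlite3 module is assumed as a stand-in for Oracle, and the sample rows are illustrative; note that every row tied for the maximum survives the join.

```python
import sqlite3

# SQLite stands in for Oracle here (an assumption); the rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 2), (1, 1, 3), (1, 3, 3),
     (2, 10, 1), (2, 23, 2), (2, 12, 2)],
)

# The CTE finds the max col3 per col1; the inner join keeps only base-table
# rows whose col3 equals that maximum.
joined = conn.execute("""
    WITH cte AS (SELECT col1, MAX(col3) AS mx3 FROM myTable GROUP BY col1)
    SELECT a.col1, a.col2, a.col3
    FROM myTable a
    INNER JOIN cte b
            ON a.col1 = b.col1
           AND a.col3 = b.mx3
    ORDER BY a.col1, a.col2
""").fetchall()

print(joined)
```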
Here's an alternative that's just a slight extension of your existing GROUP BY query (i.e., it doesn't require querying the same table more than once):
with mytable as (select 1 col1, 1 col2, 1 col3 from dual union all
select 1 col1, 2 col2, 2 col3 from dual union all
select 1 col1, 1 col2, 3 col3 from dual union all
select 1 col1, 3 col2, 3 col3 from dual union all
select 2 col1, 10 col2, 1 col3 from dual union all
select 2 col1, 23 col2, 2 col3 from dual union all
select 2 col1, 12 col2, 2 col3 from dual)
SELECT
col1,
MAX(col2) keep (dense_rank first order by col3 desc) mx2,
MAX(col3) AS mx3
FROM
myTable
GROUP BY
col1;
COL1 MX2 MX3
---------- ---------- ----------
1 3 3
2 23 2

Distinct records based on some columns in SQL

I have a table named myTable in a SQL Server database. Let's say the column names are:
col1, col2, col3, col4, col5
There are thousands of records in the table.
I want to select records with no repetition, based on only 4 of the columns.
Currently I use the following query:
SELECT DISTINCT col1, col2, col3, col4 FROM myTable
The query does return unique and distinct records. However, I need col5 in the result too, even though I do not want col5 to be considered when determining which records are distinct.
For example, there are three records in the table as follows:
col1 col2 col3 col4 col5
1 2 3 4 5
2 5 6 9 7
1 2 3 4 10
I want the result to be something like this:
col1 col2 col3 col4 col5
1 2 3 4 5
1 2 3 4 10
This gives you the records you want, but only col1 to col4:
SELECT col1, col2, col3, col4
FROM myTable
group by col1, col2, col3, col4
having count(*) > 1
If you also need col5 then use
select t1.*
from myTable t1
join
(
SELECT col1, col2, col3, col4
FROM myTable
group by col1, col2, col3, col4
having count(*) > 1
) t2 on t1.col1 = t2.col1
and t1.col2 = t2.col2
and t1.col3 = t2.col3
and t1.col4 = t2.col4
Edit
After you edited your question, this is the answer:
SELECT col1, col2, col3, col4, min(col5) as col5
FROM myTable
group by col1, col2, col3, col4
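The two approaches above return different things, which a small sketch makes concrete. It runs both queries on the three sample rows from the question, using SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption); the ORDER BY clauses are added only for a stable output order.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption); data is from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (col1 INTEGER, col2 INTEGER, col3 INTEGER, "
    "col4 INTEGER, col5 INTEGER)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?)",
    [(1, 2, 3, 4, 5), (2, 5, 6, 9, 7), (1, 2, 3, 4, 10)],
)

# Join back to the repeated (col1..col4) groups: every repeated row, with col5.
repeated = conn.execute("""
    SELECT t1.*
    FROM myTable t1
    JOIN (SELECT col1, col2, col3, col4
          FROM myTable
          GROUP BY col1, col2, col3, col4
          HAVING COUNT(*) > 1) t2
      ON t1.col1 = t2.col1 AND t1.col2 = t2.col2
     AND t1.col3 = t2.col3 AND t1.col4 = t2.col4
    ORDER BY t1.col5
""").fetchall()

# One row per (col1..col4) group, carrying the smallest col5 of the group.
one_per_group = conn.execute("""
    SELECT col1, col2, col3, col4, MIN(col5) AS col5
    FROM myTable
    GROUP BY col1, col2, col3, col4
    ORDER BY col1
""").fetchall()

print(repeated)
print(one_per_group)
```

The first query lists only the rows whose first four columns repeat, while the second collapses each combination to a single row.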

T-SQL Eliminating duplicate rows while ignoring certain columns

I'm struggling to find the proper statements to select non-duplicate entries that are duplicates only for particular columns. As an example, in the following table I only care about rows that have unique values in col1, col2, and col3 and the values in col4 and col5 do not matter. This means I would consider row 1 and row 2 to be duplicates and row 4 and row 5 to be duplicates:
col1 col2 col3 col4 col5
A 2 p 0 2
A 2 p 1 8
A 3 r 4 12
B 0 f 3 1
B 0 f 6 5
And I would want to select only the following:
col1 col2 col3 col4 col5
A 2 p 0 2
A 3 r 4 12
B 0 f 3 1
Is there a way to combine multiple DISTINCT statements to achieve this or specify certain columns to ignore when comparing rows for duplicates?
You have to choose which rows you want to keep. You can use the ROW_NUMBER() function for this:
SELECT col1, col2, col3, col4, col5
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY col1, col2, col3 ORDER BY col4 DESC) 'RowRank'
FROM table
)sub
WHERE RowRank = 1
You can change the ORDER BY section to change which row you keep and which you toss. As written, ORDER BY col4 DESC keeps the row with the highest col4 in each group; to keep the rows shown in the question, order by col4 ascending instead. The ROW_NUMBER() function just assigns a number to each row. In this example, you want to preserve each combination of col1, col2, col3, so you PARTITION BY them, meaning that numbering will start at 1 for each combination of them. You can run just the inner query to get the idea.
Alternatively, you could use GROUP BY and aggregate functions, e.g.:
SELECT col1, col2, col3, MAX(col4), MAX(col5)
FROM table
GROUP BY col1, col2, col3
The downside here is that the MAX() of col4 and col5 might come from different rows, so you're not necessarily returning one single row from your original table, but if you don't care which row you return then it doesn't matter.
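That downside is easy to miss because the sample data in the question happens not to trigger it. The sketch below uses made-up rows (not the question's data) where the largest col4 and the largest col5 sit in different rows, and runs on SQLite 3.25 or later via Python's sqlite3 module as a stand-in for SQL Server, which is an assumption. The table is named t because table is a reserved word.

```python
import sqlite3

# Hypothetical rows, chosen so max col4 and max col5 are in different rows.
source_rows = [
    ("A", 2, "p", 0, 8),
    ("A", 2, "p", 1, 2),
    ("A", 3, "r", 4, 12),
]

# SQLite (3.25+) stands in for SQL Server here; this is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (col1 TEXT, col2 INTEGER, col3 TEXT, "
    "col4 INTEGER, col5 INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", source_rows)

# ROW_NUMBER() keeps one whole original row per (col1, col2, col3) group.
row_number_result = conn.execute("""
    SELECT col1, col2, col3, col4, col5
    FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY col1, col2, col3
                                       ORDER BY col4) AS RowRank
          FROM t) sub
    WHERE RowRank = 1
    ORDER BY col1, col2, col3
""").fetchall()

# MAX() is computed per column, so the output can mix values from several rows.
group_by_result = conn.execute("""
    SELECT col1, col2, col3, MAX(col4), MAX(col5)
    FROM t
    GROUP BY col1, col2, col3
    ORDER BY col1, col2, col3
""").fetchall()

print(row_number_result)
print(group_by_result)
```

Here the GROUP BY version produces a row with col4 = 1 and col5 = 8 for the first group, a combination that never occurs in the table.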

Element-wise quotient of two columns in SQL

How can I combine the columns returned by two SELECT statements to give their element-wise quotient?
Query 1:
SELECT COUNT(*) AS count
FROM table1
WHERE col2 = 1 AND col3 > 5
GROUP BY col4
ORDER BY col4
Query 2:
SELECT COUNT(*) AS count
FROM table1
WHERE col2 = 1
GROUP BY col4
ORDER BY col4
So if they return something like:
Query 1 Query 2
count count
-----------------------
1 5
2 4
I will get:
quotient
-------
0.2
0.5
With the 4-column version of the question, we can assume that the quotient is between groups with the same value in col4. So, the answer becomes:
SELECT col4, SUM(CASE WHEN col3 > 5 THEN 1.0 ELSE 0 END) / COUNT(*) AS quotient
FROM table1
WHERE col2 = 1
GROUP BY col4;
I've retained col4 in the output because I don't think the ratios (quotients) will be useful without something to identify which quotient is associated with which values, though strictly speaking, the question doesn't ask for that column in the output.
In this case, you don't need two separate queries at all (this form relies on the engine treating a boolean expression as 0 or 1, as MySQL does):
SELECT SUM(col3 > 5) / COUNT(*)
FROM table1
WHERE col2 = 1
GROUP BY col4
ORDER BY col4
In case your actual queries cannot be simplified as per the other answers, you can join the subqueries, like this:
select j1.count / j2.count as quotient
from (
SELECT col4, COUNT(*) AS count
FROM table1
WHERE col2 = 1 AND col3 > 5
GROUP BY col4
) j1
join (
SELECT col4, COUNT(*) AS count
FROM table1
WHERE col2 = 1
GROUP BY col4
) j2 on j1.col4=j2.col4
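Whether dividing two counts gives 0.2 or 0 depends on how the engine divides integers. The sketch below is a minimal check on SQLite through Python's sqlite3 module, which is an assumption since the question does not name an engine; SQLite truncates integer division, so it shows the pitfall and one way around it. The rows are made up to reproduce the counts 1/5 and 2/4 from the question.

```python
import sqlite3

# SQLite is assumed as the engine; rows are hypothetical and chosen so that
# col4 = 1 has 1 of 5 rows with col3 > 5, and col4 = 2 has 2 of 4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col2 INTEGER, col3 INTEGER, col4 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 6, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1),
     (1, 7, 2), (1, 8, 2), (1, 2, 2), (1, 3, 2),
     (0, 9, 1)],  # filtered out by col2 = 1
)

# Integer numerator and denominator: SQLite truncates the result to 0.
truncated = conn.execute("""
    SELECT col4, SUM(CASE WHEN col3 > 5 THEN 1 ELSE 0 END) / COUNT(*)
    FROM table1
    WHERE col2 = 1
    GROUP BY col4
    ORDER BY col4
""").fetchall()

# Summing 1.0 instead of 1 makes the numerator a real number.
quotients = conn.execute("""
    SELECT col4, SUM(CASE WHEN col3 > 5 THEN 1.0 ELSE 0 END) / COUNT(*)
    FROM table1
    WHERE col2 = 1
    GROUP BY col4
    ORDER BY col4
""").fetchall()

print(truncated)
print(quotients)
```

On engines that already return a decimal from integer division the 1.0 is harmless, so it is a reasonable default when the target engine is unknown.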