How can I separate column values into column names? - sql

I'm trying to make a table with dynamic columns.
I have this table. It is simplified; in other cases there may be more than 3 values.
name
----------------------
Fall
Medication
Wander
(3 rows)
I am trying to get this result. I need to separate the values into columns.
Fall | Medication | Wander
--------+------------+--------
(0 rows)

You need to PIVOT the table. Unfortunately, MySQL (unlike Oracle http://www.oracle.com/technetwork/articles/sql/11g-pivot-097235.html) doesn't have a PIVOT operator, but there are workarounds; see "MySQL pivot table".
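One widely used workaround is conditional aggregation: one SUM(CASE ...) expression per output column. The sketch below runs the SQL through Python's built-in sqlite3 module purely so it can be executed as-is; the incident table and its resident column are hypothetical, and the column names still have to be written by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE incident (resident TEXT, name TEXT);  -- hypothetical table
    INSERT INTO incident VALUES
        ('Ann', 'Fall'), ('Ann', 'Fall'), ('Ann', 'Wander'), ('Bob', 'Medication');
""")

# One SUM(CASE ...) per output column; each value of "name" becomes a column.
pivot_sql = """
    SELECT resident,
           SUM(CASE WHEN name = 'Fall'       THEN 1 ELSE 0 END) AS Fall,
           SUM(CASE WHEN name = 'Medication' THEN 1 ELSE 0 END) AS Medication,
           SUM(CASE WHEN name = 'Wander'     THEN 1 ELSE 0 END) AS Wander
    FROM incident
    GROUP BY resident
    ORDER BY resident
"""
cursor = conn.execute(pivot_sql)
columns = [d[0] for d in cursor.description]  # column names of the result
rows = cursor.fetchall()
print(columns)
print(rows)
```

The same SELECT should work in MySQL, since it only uses CASE, SUM and GROUP BY.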

You might try one of the crosstab() functions in newer PostgreSQL versions. See https://www.postgresql.org/docs/current/static/tablefunc.html (F.35.1.2 and F.35.1.4). They allow you to specify a query that returns row names (which create the rows), categories (which create the columns), and values (which populate the cells of the table).
The variant with 2 queries can be especially useful, since it lets you specify the columns you want to use with a separate query.

Related

How can I create multiple rows based on the value of one column in SQL?

I have a column of type string in my table, where multiple values are separated by the pipe character. For example:
Value1|Value2|Value3
Now, what I want is to have a query, which will show three rows for this row. Basically something similar to the concept of explode in Dataframes.
Note that I am using Spark SQL. And I want to achieve this using SQL, not dataframes.
I got it working by using the following query.
select t.*, explode(split(values, "\\|")) as value
from table t
\\| here can also be replaced by [|]. Just specifying | doesn't work.
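The reason a bare | fails is that the second argument of split is a regular expression, where | means alternation. A rough illustration in Python follows; Python's re module stands in for Spark's regex engine here, and explode_split is a hypothetical helper, not a Spark function.

```python
import re

def explode_split(row_id, packed, pattern):
    """Return one (row_id, value) pair per piece of `packed`,
    roughly what explode(split(...)) produces for a single input row."""
    return [(row_id, piece) for piece in re.split(pattern, packed)]

packed = "Value1|Value2|Value3"

escaped = explode_split(1, packed, r"\|")      # literal pipe, escaped
char_class = explode_split(1, packed, r"[|]")  # literal pipe, in a character class
bare = explode_split(1, packed, r"|")          # alternation of two empty patterns
                                               # (needs Python 3.7+ to run)

print(escaped)
print(bare == escaped)
```

The escaped and character-class patterns give three rows; the bare pattern matches the empty string everywhere, so it does not split on the pipe.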

How to flatten tables correctly in BigQuery?

I have the following tables:
In table 2 (yellow looking fields), the first field is part of the following:
name1 RECORD NULLABLE
name1.name2 RECORD REPEATED
name1.name2.date_inserted TIMESTAMP NULLABLE
As you can see, the last (sub-row?) of row 25 is greyed out because it is part of the repeated record name1.name2.
I am trying to join table 2 with table 1 (orange looking fields) on another field. I have 0 experience with records or repeated records, but using FLATTEN() I managed to join them.
The problem is that some dates from the 2nd table return NULL after the join, although there aren't any NULLs before it. Since I can't figure out what the greyed cells are, I guess I am doing something wrong.
All this sums up to: How can I totally flatten all tables that I want to use so that there won't be any records at all and so I can go through the data with simple SQL statements? Please provide an example as well. Looking for something generic.
How can I totally flatten all tables that I want to use so that there won't be any records at all and so I can go through the data with simple SQL statements?
It really depends on the schemas you are working with. You can preprocess them, flatten the arrays, and rename the struct fields, then use that as your base table to work with simple SQL statements.
For your scenario, you can start by flattening the name2 column of table 2 like this:
SELECT
name2.date_inserted -- Add additional fields you want on the result
FROM table2, table2.name1.name2
You can do CROSS JOIN and LEFT JOIN to further adjust your results.
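To make the difference between the two joins concrete, here is a small plain-Python model of flattening a repeated record. It is only a sketch: the id field and the sample rows are hypothetical, and flatten is not a BigQuery function. It shows that a cross join drops parent rows whose array is empty, while a left join keeps them with a NULL (None) date, which may be related to the NULLs seen after the join.

```python
def flatten(rows, keep_empty=False):
    """Yield one flat dict per element of name1.name2.

    keep_empty=False mimics a CROSS JOIN: parents with no elements vanish.
    keep_empty=True mimics a LEFT JOIN: they survive with date_inserted = None.
    """
    for row in rows:
        children = (row.get("name1") or {}).get("name2") or []
        if not children and keep_empty:
            yield {"id": row["id"], "date_inserted": None}
        for child in children:
            yield {"id": row["id"], "date_inserted": child["date_inserted"]}

# Hypothetical sample of table 2: row 25 has two nested entries, row 26 has none.
table2 = [
    {"id": 25, "name1": {"name2": [{"date_inserted": "2016-01-01"},
                                   {"date_inserted": "2016-01-02"}]}},
    {"id": 26, "name1": {"name2": []}},
]

cross = list(flatten(table2))
left = list(flatten(table2, keep_empty=True))
print(len(cross), len(left))
```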
Please provide an example as well. Looking for something generic.
I'm not sure about a generic approach, since each schema would probably have distinct requirements. The key concept is knowing how to flatten arrays and how to query structs with arrays and arrays of structs.
You can find plenty of examples in that documentation.

Equivalent of GROUP_CONCAT for multiple columns

Do you have any idea how to display the values obtained by GROUP_CONCAT in multiple columns, instead of having a single column containing all the values separated by commas?
Thanks in advance :-)
You cannot do this in SQL.
One of the fixed rules of SQL is that the columns in your select-list must be set at the time you prepare your query. The select-list does not expand dynamically to match the values it finds as it examines the data.
This comes from the origins of SQL in the relational model. A relation (not a relationship, lots of people get this wrong) is a data structure with a fixed set of columns, a header defining the names and data types of the columns, and then a set of rows, where every row has the same set of columns as the header.
The select-list of an SQL SELECT statement effectively defines the header of the relation returned as the result-set of that query. The number and names of the columns are defined by the query, not by the data in the result.
A commenter above asks if you want to do a pivot, but a pivot also requires that you name the columns in the select-list. There is no such thing as a SQL pivot query that grows its select-list according to the data in the result.
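Because the select-list has to be fixed before the query runs, the usual way around this is two queries issued from application code: the first discovers the values, the second is generated with one named column per value. A minimal sketch using Python's sqlite3 module follows; the item table, its grp and val columns, and the quote_identifier helper are all hypothetical names.

```python
import sqlite3

def quote_identifier(name):
    """Double-quote an identifier so a data value can be used as a column name."""
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (grp TEXT, val TEXT);  -- hypothetical table
    INSERT INTO item VALUES ('a', 'x'), ('a', 'y'), ('b', 'y'), ('b', 'z');
""")

# Query 1: find out which values exist right now.
values = [row[0] for row in conn.execute("SELECT DISTINCT val FROM item ORDER BY val")]

# Query 2: a select-list with one explicitly named column per value found.
select_list = ", ".join(
    f"MAX(CASE WHEN val = ? THEN val END) AS {quote_identifier(v)}" for v in values
)
sql = f"SELECT grp, {select_list} FROM item GROUP BY grp ORDER BY grp"

cursor = conn.execute(sql, values)  # the values themselves are bound as parameters
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

The second query is still an ordinary fixed-column query; it is simply written by the program instead of by hand, so it has to be regenerated whenever the data gains a new value.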

Select only a specific number of columns from a table - Hive

How can I select only a specific number of columns from a table in Hive? For example, if I have a table with 50 columns, how can I select just the first 25 columns? Is there an easy way to do it other than hard-coding the column names?
I guess that you're asking about using the order in which you defined your columns in your CREATE TABLE statement. No, that's not possible in Hive for the moment.
You could work around it by adding a new column COLUMN_NUMBER and using it in your WHERE clauses, but I would think twice about the trade-off between spending a little more time typing your queries and cluttering your whole table design with unnecessary columns. Besides, if you need to change your table schema in the future (for instance, by adding a new column), adapting your previous code to the different column numbers would be painful.
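A different way to avoid typing the names, without touching the table design, is to generate the column list from the table's metadata in a script. The sketch below does this against SQLite via PRAGMA table_info; with Hive the names would have to come from its own metadata (for example the output of DESCRIBE), which is an assumption not shown here. The wide table and both helper functions are hypothetical.

```python
import sqlite3

def quote_identifier(name):
    """Double-quote an identifier so it can be spliced into SQL text."""
    return '"' + name.replace('"', '""') + '"'

def first_n_columns_sql(conn, table, n):
    """Build a SELECT that names the first n columns of `table` in declaration order."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    info = conn.execute(f"PRAGMA table_info({quote_identifier(table)})").fetchall()
    names = [name for _cid, name, *_rest in sorted(info)[:n]]
    column_list = ", ".join(quote_identifier(name) for name in names)
    return f"SELECT {column_list} FROM {quote_identifier(table)}"

conn = sqlite3.connect(":memory:")
# Hypothetical 50-column table with columns c1 .. c50.
conn.execute("CREATE TABLE wide (" + ", ".join(f"c{i} INTEGER" for i in range(1, 51)) + ")")

sql = first_n_columns_sql(conn, "wide", 25)
selected = [d[0] for d in conn.execute(sql).description]
print(selected[0], selected[-1], len(selected))
```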

Selecting data from multiple tables using regexp in postgres or any SQL platform

Is there a way to select from multiple tables using regexp in postgres or any SQL platform?
I want to select data from all tables in the database provided the table names follow a pattern.
I attempted the following but without any success...
SELECT cpu_id,date,time,duration,state,speed
FROM like 'count%' ","where date = '2012-09-27'
The table names are count1, count2 and so on. I want to take all of them.
I am sorry, I do not know how to put a reproducible example/data in this case.
There is no way to do that directly. Separate tables are meant to hold different kinds of rows and data. Have you considered merging the tables (that is, using some sort of index column instead of different table names)?
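If the tables cannot be merged permanently, a script can get a similar effect in two steps: look up the matching table names in the catalog, then generate one SELECT per table and join them with UNION ALL. The sketch below uses SQLite's sqlite_master catalog so that it runs as-is; in PostgreSQL the names would come from information_schema.tables instead. The sample rows, the reduced column set, and the source_table column are assumptions for illustration.

```python
import sqlite3

def quote_identifier(name):
    """Double-quote an identifier so it can be spliced into SQL text."""
    return '"' + name.replace('"', '""') + '"'

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE count1 (cpu_id INTEGER, date TEXT, state TEXT);
    CREATE TABLE count2 (cpu_id INTEGER, date TEXT, state TEXT);
    CREATE TABLE other  (cpu_id INTEGER, date TEXT, state TEXT);
    INSERT INTO count1 VALUES (1, '2012-09-27', 'idle'), (2, '2012-09-28', 'busy');
    INSERT INTO count2 VALUES (3, '2012-09-27', 'busy');
    INSERT INTO other  VALUES (9, '2012-09-27', 'idle');
""")

# Step 1: ask the catalog which tables match the name pattern.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE 'count%' ORDER BY name"
)]

# Step 2: one SELECT per matching table, combined with UNION ALL.
# source_table plays the role of the "index" that replaces the table name.
parts = [
    f"SELECT ? AS source_table, cpu_id, date, state FROM {quote_identifier(t)} WHERE date = ?"
    for t in tables
]
sql = " UNION ALL ".join(parts) + " ORDER BY source_table, cpu_id"
params = [p for t in tables for p in (t, "2012-09-27")]
rows = conn.execute(sql, params).fetchall()
print(rows)
```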