Pivot a Table in PostgreSQL with Many Keys [duplicate]

I am pivoting a table of the form id, key, value where there are many distinct keys (130). There are too many keys to enumerate the output columns explicitly in the crosstab() call, or to write out the crosstab_N function definition as recommended in the crosstab documentation:
CREATE TYPE tablefunc_crosstab_N AS (
    row_name TEXT,
    category_1 TEXT,
    category_2 TEXT,
        .
        .
        .
    category_N TEXT
);
How do I pivot this table into wide format with columns id, category_1, category_2, ... category_130? I find it hard to believe you can't pivot such tables in SQL without explicitly enumerating the column types. For example, in R with the tidyverse package I would just call dataframe %>% spread(key=key, value=value).

I find it hard to believe you can't pivot such tables in SQL without explicitly enumerating the column types.
A SQL query returns a fixed set of columns. SQL is a declarative language: you must tell the database which columns you want in the result set so it can understand your requirement and build a proper query plan.
Typical solutions involve dynamic SQL: that is, use a query to generate the actual query string, then execute it. That is an additional level of indirection, which can be somewhat challenging.
One halfway solution is JSON. If you can live with a result where all key/value pairs are aggregated into a single JSON column, you can do:
select id, json_object_agg(key, value) obj
from mytable
group by id
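If you need real columns rather than JSON, the same aggregation can feed the dynamic-SQL route described above. Here is a minimal PL/pgSQL sketch, assuming the source table is mytable(id, key, value); the view name mytable_wide and the DO block are illustrative, not a standard recipe:

DO $$
DECLARE
    cols text;
BEGIN
    -- build "obj ->> 'k1' AS k1, obj ->> 'k2' AS k2, ..." from the distinct keys
    SELECT string_agg(format('obj ->> %L AS %I', key, key), ', ' ORDER BY key)
    INTO cols
    FROM (SELECT DISTINCT key FROM mytable) k;

    EXECUTE format(
        'CREATE OR REPLACE VIEW mytable_wide AS
         SELECT id, %s
         FROM (SELECT id, json_object_agg(key, value) AS obj
               FROM mytable GROUP BY id) t',
        cols);
END $$;

After running the block, SELECT * FROM mytable_wide returns one row per id with one column per key.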

Related

Selecting all columns from a table but casting certain columns

I've read through other similar questions but can't find an answer.
I have a SQL query such as:
SELECT * FROM tblName
tblName has several columns, e.g. id, name, date, etc.
I would like to cast id as bigint.
However, in my application tblName is dynamic. The user has a list of all the tables in the DB, let's say 1000 tables, and the application then gets all columns from the chosen table. The only column every table has in common is the id column.
The application uses Flask and pyodbc, so any larger numbers get converted to decimal/float, which is a whole other headache. The workaround is to cast the int as bigint within SQL.
Is this possible? I'm unable to rewrite any part of the application, so I'm asking whether it can be done in SQL.
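One hedged sketch of how the cast could be pushed into SQL, assuming SQL Server 2017+ (a common pyodbc target; the dbo.tblName name and the use of sys.columns are assumptions here): build the select list from the catalog, casting only id.

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- assemble "CAST(id AS bigint) AS id, [name], [date], ..." from the catalog
SELECT @cols = STRING_AGG(
           CASE WHEN c.name = 'id'
                THEN 'CAST(id AS bigint) AS id'
                ELSE QUOTENAME(c.name) END, ', ')
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.tblName');

SET @sql = N'SELECT ' + @cols + N' FROM dbo.tblName';
EXEC sys.sp_executesql @sql;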

How can I separate column values into column names?

I'm trying to make a table with dynamic columns.
I have this table. This is simplified; in other instances there may be more than 3 values.
name
----------------------
Fall
Medication
Wander
(3 rows)
I am trying to get this result. I need to separate the values into columns.
Fall | Medication | Wander
--------+------------+--------
(0 rows)
You need to PIVOT the table. Unfortunately MySQL, unlike Oracle (http://www.oracle.com/technetwork/articles/sql/11g-pivot-097235.html), doesn't have a PIVOT operator, but there are workarounds; see, for example, the question "MySQL pivot table".
You might try one of the crosstab() functions in newer PostgreSQL. See https://www.postgresql.org/docs/current/static/tablefunc.html (F.35.1.2 and F.35.1.4). It allows you to specify a query that has row names (which create rows), categories (which create columns), and values (which populate the inside of the table).
The variant with two queries can be especially useful, since it lets you specify the categories you want to use with a separate query.
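For instance, a minimal sketch of the two-query form, assuming the tablefunc extension is installed and a hypothetical table events(resident text, name text) holding rows like those above; note that crosstab() still requires the output columns to be declared in the AS clause:

CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT *
FROM crosstab(
    -- source query: row name, category, value
    $$ SELECT resident, name, 1 FROM events ORDER BY 1 $$,
    -- category query: one row per output column, in order
    $$ SELECT DISTINCT name FROM events ORDER BY 1 $$
) AS ct (resident text, "Fall" int, "Medication" int, "Wander" int);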

SQL OR statement vs multiple SELECT queries

I have a table with an id and a name.
I get a list of ids and I need their names.
To my knowledge I have two options.
Create a for loop in my code which executes:
SELECT name from table where id=x
where x is always a number.
Or write a single query like this:
SELECT name from table where id=1 OR id=2 OR id=3
The list of ids and names is enormous, so I think you wouldn't want that.
The problem is that an id is not always a number but a randomly generated id containing numbers and characters, so ranges are not a solution.
I'm asking this from a performance point of view.
What's a nice solution for this problem?
SQLite has limits on the size of a query, so if there is no known upper limit on the number of IDs, you cannot use a single query.
When you are reading multiple rows (note: IN (1, 2, 3) is easier than many ORs), you don't know to which ID a name belongs unless you also SELECT that, or sort the results by the ID.
There should be no noticeable difference in performance; SQLite is an embedded database without client/server communication overhead, and the query does not need to be parsed again if you use a prepared statement.
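To illustrate that last point, here is a small sketch using SQLite's ? placeholders, which are bound by the application's prepared statement, one per id; the table name is the question's placeholder, quoted because TABLE is a reserved word:

SELECT id, name          -- select the id too, so each name maps back to its id
FROM "table"
WHERE id IN (?, ?, ?);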
A "nice" solution is using the INoperator:
SELECT name from table where id in (1,2,3)
Also, the IN operator is syntactic sugar built for exactly this purpose.
SELECT name from table where id IN (1,2,3,4,5,6.....)
Assuming you receive the list of IDs you need names for as an input temp table #InputIDTable:
SELECT name from table WHERE ID IN (SELECT id from #InputIDTable)

How to select all fields in SQL joins without getting duplicate column names?

Suppose I have a table A with 10 fields and a table B with 5 fields.
B links to A via a column named "key", which exists in both A and B under the same name.
I am generating a generic piece of SQL that queries a main table A, receives a table name parameter to join to, and selects all of A's fields plus B's.
In this case I will get all the 15 fields I want, or, more precisely, 16, because I get "key" twice, once from A and once from B.
What I want is to get only 15 fields (all fields from the main table plus the ones in the generic table), without getting "key" twice.
Of course I could list the fields explicitly in the SELECT itself, but that defeats my very objective of building generic SQL.
It really depends on which RDBMS you're using it against, and how you're assembling your dynamic SQL. For instance, if you're using Oracle and it's a PL/SQL procedure putting together your SQL, you're presumably querying USER_TAB_COLS or something like that. In that case, you could get your final list of columns names like
SELECT DISTINCT column_name
FROM user_tab_cols
WHERE table_name IN ('TABLEA', 'TABLEB');  -- unquoted Oracle identifiers are stored in uppercase
but basically, we're going to need to know a lot more about how you're building your dynamic SQL.
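As one concrete, hedged illustration of the idea, assuming Oracle tables named TABLEA and TABLEB: build a de-duplicated select list, arbitrarily keeping the alphabetically first table when a column name appears in both.

SELECT LISTAGG(src || '.' || column_name, ', ')
           WITHIN GROUP (ORDER BY column_name) AS select_list
FROM (
    -- one row per distinct column name; "key" appears in both tables,
    -- so MIN(table_name) picks a single source for it
    SELECT column_name, MIN(table_name) AS src
    FROM user_tab_cols
    WHERE table_name IN ('TABLEA', 'TABLEB')
    GROUP BY column_name
);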
Re-thinking what I asked makes me conclude that this is not plausible. Selecting columns in a SELECT statement picks the columns we are interested in from the list of tables provided. In cases where the same column name exists in more than one of the tables involved, which are the cases my question addresses, it would ideally be nice if the DB engine could return a unique list of fields. But for that it would have to decide by itself which column, and from which table, to choose among the matches, and that is something the DB cannot do, because it depends solely on the user's choice.

Loop and Array in SQL

I have a table called SOURCE_TAG into which I want to insert data where all the insert statements differ only in one of the columns (this column is a primary key id in a table called SOURCE_LU). However, getting that id also takes some work.
The following is a list of stringKeys (a column in SOURCE_LU).
So first, I would do something like the following pseudocode in Oracle SQL:
stringKeys = {"foo", "bar", "foobar", "barfoo", ..., "etc"}
for (each s in stringKeys) {
    SELECT id FROM SOURCE_LU WHERE stringKeys = s and store the id in a list (let's say idList)
}
After getting the list of ids, insert each id into SOURCE_TAG with otherwise identical data in each row:
for (each id in idList) {
    INSERT INTO SOURCE_TAG VALUES (x, y, id)
}
Sorry, I am a Java guy with little SQL knowledge. So how should I use arrays and loops in Oracle SQL? The simpler the solution, the better. Thank you.
SQL itself doesn’t have loops, but Oracle has a procedural language called PL/SQL that you can use. It has loops, conditionals, variables, and other things you might be used to.
However, I think what you are trying to accomplish can be done in regular SQL. Mind you, I haven’t used an Oracle installation in years and don’t have access to one right now, but in PostgreSQL you can do something like:
INSERT INTO SOURCE_TAG
(YEAR_ID,SOURCE_TAG_LU_ID,PRIORITY_ORDER,STATUS_EN,SOURCE_LU_ID)
select 4 as year_id, 2 as source_tag, 1000 as priority_order, 'ACTIVE' as status_en, id
from source_lu
where stringkeys in ('foo', 'bar', ...)
group by year_id, source_tag, priority_order, status_en, id;
It’s possible that group by id is enough in the last line.
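And since the question asked about loops: for completeness, a hedged PL/SQL sketch of the explicit-loop version, reusing the column names assumed in the query above.

BEGIN
  -- cursor FOR loop: one INSERT per matching SOURCE_LU row
  FOR r IN (SELECT id
            FROM source_lu
            WHERE stringkeys IN ('foo', 'bar', 'foobar', 'barfoo')) LOOP
    INSERT INTO source_tag
      (year_id, source_tag_lu_id, priority_order, status_en, source_lu_id)
    VALUES (4, 2, 1000, 'ACTIVE', r.id);
  END LOOP;
END;
/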