postgresql transpose with different attribute row - sql

This is the first time I've used a transpose (crosstab), so I don't know whether what I want is possible.
This is my query:
SELECT *
FROM crosstab(
'select p.name, a.attributekey, a.attributevalue
from productz p
join attribute a on a.itemid=p.id
order by p.name, a.attributekey')
AS final_result(name varchar, interface varchar, negativemargin varchar,parity varchar);
select p.name, a.attributekey, a.attributevalue
from productz p
join attribute a on a.itemid=p.id
order by p.name, a.attributekey;
here's the link
http://rextester.com/IQNSY51011
But the output is different from what I want, because productz 1 has two rows and productz 2 has three rows:
name interface negativemargin parity
dufan true true NULL
waterboom android true false
The output I want is the one below, without having to insert an (interface, NULL) row into the database:
name interface negativemargin parity
dufan NULL true true
waterboom android true false
Note: please click "run it" after opening the link

Solution to your problem:
SELECT *
FROM crosstab(
'select p.name, a.attributekey, a.attributevalue
from productz p
join attribute a on a.itemid=p.id
order by p.name, a.attributekey',
'SELECT DISTINCT attributekey FROM attribute ORDER BY 1')
AS final_result(name varchar, interface varchar, negativemargin varchar,parity varchar);
LINK: http://rextester.com/OPNK82802
Use crosstab(text, text) with 2 input parameters.
The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end.
What is the problem with CROSSTAB(text) i.e. crosstab with 1 parameter?
The main limitation of the single-parameter form of crosstab is that it treats all values in a group alike, inserting each value into the first available column. If you want the value columns to correspond to specific categories of data, and some groups might not have data for some of the categories, that doesn't work well. The two-parameter form of crosstab handles this case by providing an explicit list of the categories corresponding to the output columns.
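Here is a minimal, self-contained sketch of that difference, with table definitions and values inferred from the question's expected output (the exact DDL behind the rextester link may differ):
CREATE EXTENSION IF NOT EXISTS tablefunc;

CREATE TABLE productz (id int PRIMARY KEY, name varchar);
CREATE TABLE attribute (itemid int REFERENCES productz(id),
                        attributekey varchar,
                        attributevalue varchar);

INSERT INTO productz VALUES (1, 'dufan'), (2, 'waterboom');
INSERT INTO attribute VALUES
  (1, 'negativemargin', 'true'),
  (1, 'parity',         'true'),
  (2, 'interface',      'android'),
  (2, 'negativemargin', 'true'),
  (2, 'parity',         'false');

-- The one-parameter crosstab fills the value columns left to right, so dufan's
-- two values land under "interface" and "negativemargin". The two-parameter
-- form pins each value to its attributekey and leaves NULL where a product has
-- no row for that key, which is exactly the desired output.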
For more info on crosstab follow the below link:
https://www.postgresql.org/docs/9.2/static/tablefunc.html

Related

SQLite How to fetch every column after the 5th

Let's say I have the following SQL table named urls:
url      redirect    revenue  realRevenue  clicksGermany  clicksUSA  clicksIndia
gaxktgq  google.com  0.321    69.51        15             28         33
oqjkgf1  example.cn  0.252    1424.3       1202           10         69
gaxktgq  corn.shop   1.242    42525.2      325525         1230       420
Now I want to fetch every column after realRevenue.
In this example you could just list the names of the columns that come after realRevenue, but in my case there are many more columns after it.
What query do I need?
If your table really has only the columns shown in your sample input, you should simply select the ones you need by name. If there are many more, read on.
Since SQLite does not support dynamic SQL, you cannot build such a query automatically within SQLite itself.
However, you can retrieve the column names you are interested in from "sqlite_master" and "pragma_table_info", which hold your table names and your table's column metadata respectively. By filtering on the table name and on the column id (cid), you get a list of the columns you want.
SELECT p.name AS column_name
FROM sqlite_master AS m
JOIN pragma_table_info(m.name) AS p ON m.name = 'tab' AND p.cid >= 4
Output:
column_name
clicksGermany
clicksUSA
clicksIndia
You can also have them ready to be hardcoded into a SELECT statement, by applying GROUP_CONCAT to the concatenation of the table name and each column name.
SELECT GROUP_CONCAT(m.name || '.' || p.name, ', ') AS fields
FROM sqlite_master AS m
JOIN pragma_table_info(m.name) AS p ON m.name = 'tab' AND p.cid >= 4
fields
tab.clicksGermany, tab.clicksUSA, tab.clicksIndia
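The generated list can then be pasted into the final query; with this sample table (here named tab, as in the queries above) that would be roughly:
SELECT tab.clicksGermany, tab.clicksUSA, tab.clicksIndia
FROM tab;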
Check the demo here.
Note: this solution is useful when the number of columns is very large, so that writing them all by hand becomes a time-consuming task.

How to delete records in BigQuery based on values in an array?

In Google BigQuery, I would like to delete a subset of records, based on the value of a specific column. It's a query that I need to run repeatedly and that I would like to run automatically.
The problem is that this specific column is of the form STRUCT<column_1 ARRAY (STRING), column_2 ARRAY (STRING), ... >, and I don't know how to use such a column in the where-clause when using the delete-command.
Here is basically what I am trying to do (this code does not work):
DELETE
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
The error that I'm getting is: Syntax error: Expected end of input but got keyword LEFT at [3:1]
If I replace the DELETE with SELECT *, it does work:
SELECT *
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
Does somebody know how to use such a column to delete a subset of records?
EDIT:
Here is some code to create a reproducible example with some silly data (fill in your own dataset and table name in all queries):
Suppose you want to delete all rows where category.type contains the value 'food'.
1 - create a table:
CREATE TABLE <DATASET>.<TABLE_NAME>
(
article STRING,
category STRUCT<
color STRING,
type ARRAY<STRING>
>
);
2 - Insert data into the new table:
INSERT <DATASET>.<TABLE_NAME>
SELECT "apple" AS article, STRUCT('red' AS color, ['fruit','food'] as type) AS category
UNION ALL
SELECT "cabbage" AS article, STRUCT('blue' AS color, ['vegetable', 'food'] as type) AS category
UNION ALL
SELECT "book" AS article, STRUCT('red' AS color, ['object'] as type) AS category
UNION ALL
SELECT "dog" AS article, STRUCT('green' AS color, ['animal', 'pet'] as type) AS category;
3 - Show that select works (return all rows where category.type contains the value 'food'; these are the rows I want to delete):
SELECT *
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
Initial Result
4 - My attempt at deleting rows where category.type contains 'food' does not work:
DELETE
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
Syntax error: Unexpected keyword LEFT at [3:1]
Desired Result
This is the code I used to delete the desired records (the records where category.type contains the value 'food'.)
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS(SELECT 1 FROM UNNEST(t1.category.type) t2 WHERE t2 = 'food')
The embarrassing thing is that I've seen this kind of answer on similar questions (for example, on update queries). But I come from Oracle SQL, where I thought you were required to correlate the subquery with the main query in the subquery's WHERE clause (i.e. connect t1 with t2), so I didn't understand those answers. That's why I posted this question.
However, I learned that BigQuery understands how to connect table t1 and 'table' t2 automatically; you don't have to correlate them explicitly.
Now it is possible to still do this (perhaps even recommended?):
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS (
  SELECT 1
  FROM <DATASET>.<TABLE_NAME> t2
  LEFT JOIN UNNEST(t2.category.type) AS type
  WHERE type = 'food' AND t1.article = t2.article
)
but a second difficulty for me was that the ID in my actual data is hidden inside an ARRAY<STRUCT<...>> construction, so I got stuck connecting t1 and t2. Fortunately, explicit correlation is not always strictly necessary.
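As a sanity check before running the DELETE with EXISTS above, the same predicate can first be used in a plain SELECT (a sketch, reusing the placeholder names and sample schema from the question):
SELECT article, category
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS (SELECT 1 FROM UNNEST(t1.category.type) t2 WHERE t2 = 'food');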
Since you did not provide any sample data, I am going to explain using some dummy data. If you add your sample data, I can update the answer.
Firstly, according to your description, you have only a STRUCT, not an ARRAY<STRUCT<col_1, col_2>>. For this reason, you do not need UNNEST to access the values within the data. Below is an example of how to access a particular field within a STRUCT.
WITH data AS (
SELECT 1 AS id, STRUCT("Alex" AS name, 30 AS age, "NYC" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Leo" AS name, 18 AS age, "Sydney" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Robert" AS name, 25 AS age, "Paris" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Mary" AS name, 28 AS age, "London" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Ralph" AS name, 45 AS age, "London" AS city) AS info
)
SELECT * FROM data
WHERE info.city = "London"
Notice that the STRUCT is named info and that we access its city field in the WHERE clause.
Now, in order to delete the rows that contain a specific value within the STRUCT (in your case I assume it would be your_struct.column_1), you can use DELETE, or MERGE combined with DELETE. I have saved the above data in a table to run the examples below, which produce the same output.
First method: DELETE
DELETE FROM `project.dataset.table`
WHERE info.city = "Sydney"
Second method: MERGE and DELETE
MERGE `project.dataset.table` a
USING (SELECT * FROM `project.dataset.table` WHERE info.city = "Sydney") b
ON a.info.city = b.info.city
WHEN MATCHED AND b.id = 1 THEN
DELETE
And the output for both queries,
Row id info.name info.age info.city
1 1 Alex 30 NYC
2 1 Robert 25 Paris
3 1 Ralph 45 London
4 1 Mary 28 London
As you can see the row where info.city = "Sydney" was deleted in both cases.
It is important to point out that the deleted rows are permanently removed from your source table, so be careful.
Note: since you want to run this process every day, you could use a Scheduled Query within the BigQuery console, appending or overwriting the results after each run. Also, it is good practice not to delete data from your source table; consider instead creating a new table from the source table without the rows you do not want.
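A sketch of that non-destructive approach, reusing the EXISTS predicate and the placeholder names from the question (the _filtered table name is just an example):
CREATE OR REPLACE TABLE <DATASET>.<TABLE_NAME>_filtered AS
SELECT *
FROM <DATASET>.<TABLE_NAME> t
WHERE NOT EXISTS (SELECT 1 FROM UNNEST(t.category.type) v WHERE v = 'food');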

Sqlite fetch from table

I have a table named Text_Field which has a column named ID.
I have another table named Content which has a column named value.
I want to fetch those values of ID from the Text_Field table which are present in the value column of Content and satisfy a given condition.
I know I can construct a query like this
SELECT ID
FROM Text_Field
WHERE ID IN (
SELECT value
FROM CONTENT
WHERE USER='CURRENT_USER')
My only problem is that in some scenarios the value column might contain the ID inside a string.
So the inner query might return something like:
56789
12334
12348
Rtf(833405)
Now if my ID is 833405, it is present in the value column, but the IN comparison would not match it.
I tried
group_concat(value)
so that the inner query returns a single row which is one string:
56789,12334,12348,Rtf(833405)
I want to know whether, after group_concat, I can use something like LIKE to satisfy my need,
or is there some other way I can do this?
Use exists instead, with like:
SELECT t.ID
FROM Text_Field t
WHERE EXISTS (SELECT 1
FROM CONTENT c
WHERE c.USER = 'CURRENT_USER' AND
(c.value = t.id OR
c.value LIKE '%(' || t.id || ')%'
)
);
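A quick self-contained demo of the idea, with hypothetical data based on the values shown in the question:
CREATE TABLE Text_Field (ID INTEGER);
CREATE TABLE CONTENT (USER TEXT, value TEXT);

INSERT INTO Text_Field (ID) VALUES (833405), (99999);
INSERT INTO CONTENT (USER, value) VALUES
  ('CURRENT_USER', '56789'),
  ('CURRENT_USER', 'Rtf(833405)');

-- The query above returns 833405: the LIKE branch builds the pattern '%(833405)%',
-- which matches 'Rtf(833405)'; 99999 has no matching value and is not returned.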

Updating table where LIKE has several criteria

I have two tables in PostgreSQL (version 9.3). The first holds id and title, and the second holds schdname. I'm trying to create a SELECT statement that retrieves id and title where the title contains the schdname from the other table. The id/title table can hold several thousand rows. I can do this fine with WHERE ... LIKE for an individual schdname, but there are 40-plus names, so that is not practical.
My original query looked like this; I know it doesn't work, but it shows what I'm trying to achieve.
SELECT
id,
title,
dname
FROM
mytable
WHERE
title LIKE (
SELECT
schdname
FROM
schedule
)
This produces the error "more than one row returned by a subquery used as an expression". So my question is: can this be achieved another way?
Here is one way to do that:
SELECT id, title, dname FROM mytable
JOIN schedule ON mytable.title LIKE '%' || schedule.schdname || '%'
Or a slightly more readable way:
SELECT id, title, dname FROM mytable
JOIN schedule ON POSITION(schedule.schdname IN mytable.title) <> 0
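If a title can match more than one schdname, the JOIN above returns that row once per match. A sketch of an EXISTS variant that returns each row at most once, assuming the same mytable/schedule columns:
SELECT id, title
FROM mytable m
WHERE EXISTS (
  SELECT 1
  FROM schedule s
  WHERE m.title LIKE '%' || s.schdname || '%'
);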
Are you actually using a wildcard with LIKE? You don't say so above. If not, you can replace LIKE with IN. If you do want a wildcard join, I'd recommend taking a substring of the columns and comparing that, e.g.:
names
james
jack
janice
select substr(names, 1, 2) as names_abbr
from names_table
where substr(names, 1, 2) = (select ...)

SQL Query For Grouping

Hi, I need help with a query.
I have a table that links Jobs and Employees, called EmployeeToJobsApplied:
Id EmployeeId JobsId Applied Viewed
1 1 1 True True
2 1 2 False True
3 1 1 True True
4 1 3 True True
If you noticed, there are repeating values, like in Id = 3.
I didn't create the database structure, and I can't do much about it at this point since this is a post-production project.
The thing I can change is the stored procedure that retrieves information from this table.
So what I need is a single-column, single-row value holding the total number of jobs applied for.
Basically, based on this example, I need to get the value
2 jobs applied for Employee Id = 1
ignoring the duplicates.
Thank you!
UPDATE
I do need the total of the result:
I need the total count (not the list) of employees who applied for a specific job.
I tried using COUNT and it's not working as expected, because it also counts the duplicate rows. Thank you for your kind help.
If you need to aggregate on distinct values, then you can write:
select EmployeeId, count(distinct JobsId) as JobsApplied
from EmployeeToJobsApplied
where Applied = 1
group by EmployeeId
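For the update (the total count of employees who applied for a specific job), the same idea works per job; a sketch, using JobsId = 1 purely as an example value:
select count(distinct EmployeeId) as TotalApplicants
from EmployeeToJobsApplied
where Applied = 1
  and JobsId = 1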
Use distinct.
select distinct * from tablename
If you don't need Id in your result set, then it's simple to use DISTINCT, something like:
SELECT distinct employeeid, JobsId, Applied, Viewed
FROM EmployeeToJobsApplied
Additionally, you can use a WHERE clause to remove results where Applied is false:
WHERE Applied = true
select distinct EmployeeId, JobsId from EmployeeToJobsApplied
since you don't seem to care about the applied and viewed columns
Instead of DISTINCT we can GROUP BY the columns:
select EmployeeId, JobsId
from EmployeeToJobsApplied
group by EmployeeId, JobsId
as DISTINCT can hurt performance.