Include header row while extracting a table as CSV to S3 in AWS Data Pipeline

While extracting an entire table from RDS to S3 in CSV format, the header row is not included.
Which option should we choose to include the header row in the CSV file?

SELECT 'firstName', 'lastName', 'email'
UNION ALL
SELECT firstName, lastName, email
FROM users

SELECT '1st_name', 'last_name' UNION ALL SELECT u.firstName as 1st_name, u.lastName FROM users u;
When a column has an alias, use the alias name as the string literal in the header row. It works.
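The UNION ALL trick above can be sketched with sqlite3 standing in for RDS (table and column names are taken from the answer; the data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (firstName TEXT, lastName TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('John', 'Smith', 'john@example.com')")

# First SELECT produces the header row as string literals,
# the second appends the actual data rows.
rows = conn.execute("""
    SELECT 'firstName', 'lastName', 'email'
    UNION ALL
    SELECT firstName, lastName, email FROM users
""").fetchall()
print(rows[0])  # ('firstName', 'lastName', 'email')
print(rows[1])  # ('John', 'Smith', 'john@example.com')
```

One caveat: a compound SELECT without ORDER BY does not formally guarantee row order in every engine, so production pipelines often add an explicit sort key column to pin the header row first.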


How to combine multiple rows as one in Bigquery?

I have a BigQuery table with data as shown in the image below.
I wish to create a table out of this data as shown in the second image.
So here I wish to:
remove the email column data
combine the emp_type column values into a comma-separated value
have just 1 row per id
I tried using the STRING_AGG function of BigQuery but was unable to achieve what I specified above.
The table actually has more than 30 columns, but for the sake of explaining the issue I reduced it to 7 columns.
How do I combine multiple rows into one in a query?
Consider below approach
select
any_value((select as struct * except(email, emp_type) from unnest([t]))).*,
string_agg(emp_type, ', ') emp_type
from data t
group by to_json_string((select as struct * except(email, emp_type) from unnest([t])))
If applied to the sample data in your question, the output is as shown above.
As you can see, it will work no matter how many columns you have, 30+ or 100+. You don't even need to type them at all!
I see two possible options. If you want a unique row per combination of all columns except email and emp_type:
SELECT id, name, status, `count`, is_hybrid, STRING_AGG(emp_type, ', ')
FROM data
GROUP BY id, name, status, `count`, is_hybrid
If you want to have just one row per id, you can group by id and select arbitrary value(from rows with this id) for other columns:
SELECT id, ANY_VALUE(name), ANY_VALUE(status), ANY_VALUE(`count`), ANY_VALUE(is_hybrid), STRING_AGG(emp_type, ', ')
FROM data
GROUP BY id
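The GROUP BY id variant can be sketched in sqlite3, where GROUP_CONCAT plays the role of BigQuery's STRING_AGG; SQLite has no ANY_VALUE, so MIN stands in for it here (the table contents are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INT, name TEXT, email TEXT, emp_type TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", [
    (1, 'Ann', 'a@x.com', 'full-time'),
    (1, 'Ann', 'a@y.com', 'contract'),
    (2, 'Bob', 'b@x.com', 'part-time'),
])

# One row per id: email is dropped, emp_type values are joined.
rows = conn.execute("""
    SELECT id, MIN(name), GROUP_CONCAT(emp_type, ', ')
    FROM data
    GROUP BY id
    ORDER BY id
""").fetchall()
print(rows)
```

Note that, like STRING_AGG without an ORDER BY clause, GROUP_CONCAT does not guarantee the order of the joined values within a group.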

PostgreSQL multi-table query with name, first_name, and last_name

I have two tables in my PostgreSQL database.
One table, call it 'Archive', has a 'name' column containing two-word first and last names, e.g. 'John Smith', and an empty 'user_id' column that needs to be filled.
The second table, call it 'User Accounts', has three columns: 'first_name', 'last_name', and 'id'.
My goal is to write a query that uses the 'name' column in Archive to look up the corresponding first_name and last_name in User Accounts and write the matching 'id' into Archive's 'user_id' column.
You can use a join. For instance:
select a.*, ua.id as user_id
from archive a left join
user_accounts ua
on a.name = ua.first_name || ' ' || ua.last_name;
If you actually want to update the column:
update archive a
set user_id = ua.id
from user_accounts ua
where a.name = ua.first_name || ' ' || ua.last_name;
Note that name matching can be quite tricky. You may not get as many matches as you expect. If this turns out to be an issue, ask another question. Include sample data and desired results in the question.
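Both forms can be sketched with sqlite3, where string concatenation with || works the same way as in Postgres (the table contents are invented; SQLite needs a correlated subquery in place of Postgres's UPDATE ... FROM):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE archive (name TEXT, user_id INT);
    CREATE TABLE user_accounts (id INT, first_name TEXT, last_name TEXT);
    INSERT INTO archive VALUES ('John Smith', NULL);
    INSERT INTO user_accounts VALUES (7, 'John', 'Smith');
""")

# The SELECT form: match on the concatenated name.
rows = conn.execute("""
    SELECT a.name, ua.id
    FROM archive a
    LEFT JOIN user_accounts ua
      ON a.name = ua.first_name || ' ' || ua.last_name
""").fetchall()
print(rows)  # [('John Smith', 7)]

# The UPDATE form, as a correlated subquery.
conn.execute("""
    UPDATE archive
    SET user_id = (SELECT id FROM user_accounts ua
                   WHERE archive.name = ua.first_name || ' ' || ua.last_name)
""")
```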

How to give user specific access to BigQuery columns?

I am currently trying to limit the columns that my view may return while keeping the user's ability to filter on the hidden ones, for example:
Table
{ f_name: string, l_name: string, ssn: string }
View
{ f_name: string, l_name: string}
but allowing queries like this: SELECT * FROM view WHERE ssn = '1234567890'
I am pretty sure that there is better approach but I am too deep to see it :)
Below is a high-level idea for you.
This is for BigQuery Standard SQL
#standardSQL
WITH `yourTable` AS (
SELECT 'a' f_name, 'x' l_name, '1234567890' ssn UNION ALL
SELECT 'b', 'y', '2234567890' UNION ALL
SELECT 'c', 'z', '3234567890' UNION ALL
SELECT 'd', 'v', '4234567890' UNION ALL
SELECT 'r', 'w', '5234567890'
),
`yourView` AS (
SELECT f_name, l_name, FARM_FINGERPRINT(ssn) ssn
FROM `yourTable`
)
SELECT *
FROM `yourView`
WHERE ssn = FARM_FINGERPRINT('3234567890')
Below is an implementation outline:
1. Create the yourView view in a dataset separate from yourTable's dataset.
2. Authorize the yourView view as a reader of yourTable's dataset.
3. Now any user who has access to your view will be able to run the query below.
4. Of course, make sure your users do not have access to yourTable's dataset.
#standardSQL
SELECT *
FROM `yourView`
WHERE ssn = FARM_FINGERPRINT('3234567890')
And even if the ssn column is visible, it does not contain the real value.
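The fingerprint idea can be sketched in plain Python, with hashlib.sha256 standing in for BigQuery's FARM_FINGERPRINT (the rows mirror the sample data above):

```python
import hashlib

def fingerprint(value):
    # One-way hash: filterable, but the original value is not exposed.
    return hashlib.sha256(value.encode()).hexdigest()

table = [
    {"f_name": "a", "l_name": "x", "ssn": "1234567890"},
    {"f_name": "c", "l_name": "z", "ssn": "3234567890"},
]

# The "view" replaces ssn with its fingerprint.
view = [{**row, "ssn": fingerprint(row["ssn"])} for row in table]

# A user who knows an ssn hashes it to filter; the real ssn never leaves.
matches = [r for r in view if r["ssn"] == fingerprint("3234567890")]
print(matches[0]["f_name"])  # c
```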
There is no way to expose only a subset of columns but to allow access to all of them if a filter is present. What if the filter was ssn = ssn, for instance? That would always be true unless ssn was null. You can, however, set up different datasets with different permissions and create views that expose some subset of columns in them. There is a good tutorial on creating authorized views in the BigQuery documentation.
For example, you could have:
restricted_dataset: Only certain people in your team/organization can query or manage the tables and views in this dataset. Contains a table named all_info containing all data.
open_dataset: Anyone in your team/organization can query tables/views in this dataset. Contains a view named filtered_info defined e.g. as SELECT f_name, l_name FROM all_info;

How to use other table JSON data in select query

I am using Postgres and have 2 tables. deviceTble has the following columns: deviceName, device_id, type, deviceOwnerPerson_id, deviceAccessPerson_id.
The other table is Person_kv and has 2 columns, id and data (containing person info in JSON format).
I want to write a select query on deviceTble that also returns the first_name and last_name of the persons referenced by deviceOwnerPerson_id and deviceAccessPerson_id, which are stored in the Person_kv table.
Here is what I have to get the data from the person_kv table in tabular form:
select data :: json ->> 'id' as id
, data :: json ->> 'name' as first_name
, data :: json ->> 'surename' as last_name
from Person_kv
and expected deviceTble query:
select deviceName,device_id,type from deviceTble
I am confused whether I should use a WITH clause on the person_kv query and then join it (once for deviceOwnerPerson_id and once for deviceAccessPerson_id), or whether there is another way using an inner query.
Can someone tell me how I can get the required result?
From your description, you can just join them:
select deviceName, device_id, type, p.data::json ->> 'name', p.data::json ->> 'surname'
from deviceTble d
join Person_kv p on p.data::json ->> 'id' = deviceOwnerPerson_id::text
                 or p.data::json ->> 'id' = deviceAccessPerson_id::text
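The lookup logic can be sketched in plain Python with json.loads standing in for data::json ->> (the rows are invented, and the JSON keys follow the answer's spelling). Note that to get both the owner's and the accessor's names on one row, you effectively join Person_kv twice, once per id column:

```python
import json

device_tble = [
    {"deviceName": "sensor-1", "device_id": 10, "type": "iot",
     "deviceOwnerPerson_id": 1, "deviceAccessPerson_id": 2},
]
person_kv = [
    {"id": 1, "data": '{"id": "1", "name": "John", "surname": "Smith"}'},
    {"id": 2, "data": '{"id": "2", "name": "Jane", "surname": "Doe"}'},
]

# Index each person's parsed JSON by its embedded id,
# mirroring the join condition on data::json ->> 'id'.
people = {json.loads(p["data"])["id"]: json.loads(p["data"]) for p in person_kv}

for d in device_tble:
    owner = people[str(d["deviceOwnerPerson_id"])]
    accessor = people[str(d["deviceAccessPerson_id"])]
    print(d["deviceName"], owner["name"], owner["surname"],
          accessor["name"], accessor["surname"])
```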

merged SQL table names

I have a merged table from several unions and I want to know from which of those tables each result came. Is that possible?
example...
select name from users where name like '%alex%'
union
select name from admins where name like '%alex%';
Would return, let's say, two rows: Alexander and Alexandra. Alexander is an admin and Alexandra is a user. How can I tell them apart?
SELECT
Name,
'Users' AS Type
FROM users
WHERE name LIKE '%alex%'
UNION
SELECT
Name,
'Admins' AS Type
FROM admins
WHERE name LIKE '%alex%'
Include a virtual column in your select that will allow you to identify the source table:
select name, 'Users' as Source from users where name like '%alex%'
union select name, 'Admins' as Source from admins where name like '%alex%';
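The source-label UNION can be sketched with sqlite3, whose LIKE is case-insensitive for ASCII just as MySQL's is (the table contents are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (name TEXT);
    CREATE TABLE admins (name TEXT);
    INSERT INTO users VALUES ('Alexandra');
    INSERT INTO admins VALUES ('Alexander');
""")

# Each branch tags its rows with a literal naming the source table.
rows = conn.execute("""
    SELECT name, 'Users' AS Source FROM users WHERE name LIKE '%alex%'
    UNION
    SELECT name, 'Admins' AS Source FROM admins WHERE name LIKE '%alex%'
    ORDER BY name
""").fetchall()
print(rows)  # [('Alexander', 'Admins'), ('Alexandra', 'Users')]
```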