How do I add a comma as my delimiter for the output of awk

> cat file.csv
col1,col2,col3,col4
col1,col2,col3
col1,col2
col1
col1,col2,col3,col4,col5
This is my attempt:
> awk -F, 'BEGIN{FS=OFS=","} {print $1 $2}' file.csv
col1col2
col1col2
col1col2
col1
col1col2
>
What I want is this
col1,col2
col1,col2
col1,col2
col1,
col1,col2
Below is just for my reference:
> awk -F, '{print $0}' file.csv
col1,col2,col3,col4
col1,col2,col3
col1,col2
col1
col1,col2,col3,col4,col5
> awk -F, '{print $1}' file.csv
col1
col1
col1
col1
col1
> awk -F, '{print $1 $2}' file.csv
col1col2
col1col2
col1col2
col1
col1col2

The statement print $1 $2 prints one output field, the concatenation of $1 and $2. Hence there will be no OFS output.
What you're looking for is print $1, $2 which prints two distinct fields, and will therefore have the OFS inserted between them. You can see the difference in the following transcript:
pax@styx:~> echo "1 2" | awk 'BEGIN{OFS="-xyzzy-"}{print $1 $2}'
12
pax@styx:~> echo "1 2" | awk 'BEGIN{OFS="-xyzzy-"}{print $1, $2}'
1-xyzzy-2
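Applied to the original file, the same fix looks like this (a minimal sketch of the command above; output is for the sample file.csv):
> awk 'BEGIN{FS=OFS=","} {print $1, $2}' file.csv
col1,col2
col1,col2
col1,col2
col1,
col1,col2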

if you actually want that trailing "," :
mawk NF=2 FS=, OFS=,
col1,col2
col1,col2
col1,col2
col1,
col1,col2
...without the trailing "," :
gawk 'NF < 2 || NF = 2' FS=, OFS=,
col1,col2
col1,col2
col1,col2
col1
col1,col2
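The NF=2 trick works because assigning to NF makes awk rebuild $0 from the first two fields joined with OFS, creating an empty $2 when a line has only one field. A more explicit, equivalent sketch (gawk and mawk rebuild the record on NF assignment; strictly POSIX awks may behave differently):
awk 'BEGIN{FS=OFS=","} {NF=2; print}' file.csv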

Related

How to replace IN clause with JOIN in Postgres?

I have the following query.
select *
from table table0_
where (table0_.col1, table0_.col2, table0_.col3) in (($1, $2, $3) , ($4, $5, $6) , ($7, $8, $9) , ($10, $11, $12) , ($13, $14, $15))
How can I replace the IN clause with a JOIN, as in the query shown below, in Postgres?
select *
from table0_
where table0_.col1=$1
and table0_.col2=$2
and table0_.col3=$3
EDIT: I read somewhere that the IN operator does not make use of indexes. Also, this query takes more time as more parameters are passed.
I don't know why you would need to do that, because there is actually no difference between them. You can use the query below, which uses a CTE to build a temporary row set and joins against it.
with data as (
  select *
  from (
    values ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15)
  ) t (col1, col2, col3)
)
select table0_.*
from table0_, data
where table0_.col1 = data.col1
  and table0_.col2 = data.col2
  and table0_.col3 = data.col3
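If you prefer the JOIN spelled out explicitly rather than written as a comma join, the same CTE can be used like this (a sketch; the t and d aliases are arbitrary):
with data (col1, col2, col3) as (
  values ($1, $2, $3), ($4, $5, $6), ($7, $8, $9), ($10, $11, $12), ($13, $14, $15)
)
select t.*
from table0_ t
join data d
  on t.col1 = d.col1
 and t.col2 = d.col2
 and t.col3 = d.col3;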

Snowflake - Count distinct values in comma seperated list

I basically have a column that looks like below.
"[
""what"",
""how"",
]"
"[
""how"",
""what"",
]"
"[
""project_management"",
""do it"",
""personal""
]"
"[
""do it"",
""finance"",
""events"",
""save""
]"
"[
""do it"",
""sales"",
""events""
]"
"[
""finance"",
""sales"",
""events""
]"
"[
""events""
]"
I am simply trying to get a count of each unique value within the column, and output the counts for each value in the comma-separated list. The output should look like the following:
What: 2
how: 2
do it: 3
Finance: 4
etc.
I tried the following, but the problem is that it only counts identical lists, not the individual values within each list:
select (i.OUTCOMES), count(i.OUTCOMES)
from table i
GROUP BY 1;
You'll need to flatten the values.
If the variant is an array as described:
with data as (
select parse_json('["a", "b"]') v
union select parse_json('["a", "a", "c"]')
)
select x.value::string val, count(*) c
from data, table(flatten(v)) x
group by 1
;
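Applied to the table from the question, that looks roughly like the following (a sketch: my_table is a placeholder for your table name, and TRY_PARSE_JSON is only needed if OUTCOMES is stored as text rather than as a VARIANT):
select f.value::string as outcome, count(*) as cnt
from my_table t,
     lateral flatten(input => try_parse_json(t.outcomes)) f
group by 1
order by cnt desc;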
It seems to be an array of arrays, so you need to use FLATTEN twice:
with data as (
select ARRAY_CONSTRUCT( ARRAY_CONSTRUCT('what','how'),
ARRAY_CONSTRUCT('how','what'),
ARRAY_CONSTRUCT('project_management','do it', 'personal') ) OUTCOMES
)
select item.VALUE::string, count(*) from data,
lateral flatten( OUTCOMES ) v,
lateral flatten( v.VALUE ) item
group by item.VALUE;
+--------------------+----------+
| ITEM.VALUE::STRING | COUNT(*) |
+--------------------+----------+
| what | 2 |
| how | 2 |
| project_management | 1 |
| do it | 1 |
| personal | 1 |
+--------------------+----------+
Using the SPLIT_TO_TABLE and REPLACE functions:
SELECT COL_VAL,COUNT(COL_VAL) FROM
(
SELECT REPLACE(REPLACE(REPLACE(VALUE,'['),'"'),']') COL_VAL FROM TABLE( SPLIT_TO_TABLE('"[
""what"",
""how"",
]"
"[
""how"",
""what"",
]"
"[
""project_management"",
""do it"",
""personal""
]"
"[
""do it"",
""finance"",
""events"",
""save""
]"
"[
""do it"",
""sales"",
""events""
]"
"[
""finance"",
""sales"",
""events""
]"
"[
""events""
]"',','))) GROUP BY COL_VAL;

Column values to a row in Db2 like xargs in Linux

I want to do the following in only SQL:
db2 -x "select colname from syscat.columns where tabschema like 'SYSCAT%' and tabname = 'TABLES' order by colno" | xargs
How can I do that: convert the list of values into a single row, like xargs does in Linux?
I want something dynamic, not a CASE expression, because the table name changes and the result should be a single row.
Original query:
col1
col2
col3
After xargs
col1 col2 col3
I know there is a function called ARRAY_AGG, but it only works inside a compound block, not a SQL query.
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0050494.html
Use the LISTAGG function instead.
db2 -x "select listagg(colname, ' ') within group (order by colno) from syscat.columns where tabschema like 'SYSCAT%' and tabname = 'TABLES'" | xargs
How about this:
SELECT TABNAME
, LISTAGG(COLNAME,',') WITHIN GROUP (ORDER BY COLNO) AS COLNAMES
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA LIKE 'SYSCAT%'
GROUP BY TABNAME
ORDER BY TABNAME
It returns output such as this:
TABNAME COLNAMES
------------------------------ -----------------------------------
BUFFERPOOLDBPARTITIONS BUFFERPOOLID,DBPARTITIONNUM,NPAGES
BUFFERPOOLEXCEPTIONS BUFFERPOOLID,MEMBER,NPAGES
BUFFERPOOLNODES BUFFERPOOLID,NODENUM,NPAGES
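If the table name needs to change dynamically, the same LISTAGG call can be limited to one table with a parameter marker (a sketch; bind the table name as a host variable or parameter):
SELECT LISTAGG(COLNAME, ' ') WITHIN GROUP (ORDER BY COLNO) AS COLNAMES
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA LIKE 'SYSCAT%'
  AND TABNAME = ?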

Use different line separator in awk

I have a file as follows:
cat file
00:29:01|10.3.57.60|dbname1| SELECT
re.id,
re.event_type_cd,
re.event_ts,
re.source_type,
re.source_id,
re.properties
FROM
table1 re
WHERE
re.id > 621982999
AND re.id <= 884892348
ORDER BY
re.id
^
00:01:00|10.3.56.101|dbname2|BEGIN;declare "SQL_CUR00000000009CE140" cursor for SELECT id, cast(event_type_cd as character(4)) event_type_cd, CAST(event_ts AS DATE) event_ts, CAST(source_id AS character varying(100)) source_id, CAST(tx_id AS character varying(100)) tx_id, CAST(properties AS character varying(4000)) properties, CAST(source_type AS character(1)) source_type FROM table1 WHERE ID > 514725989 ORDER BY ID limit 500000;fetch 500000 in "SQL_CUR00000000009CE140"^
These are SQL results delimited by a pipe (|). To mark the end of each record, I added a ^ at the end of each row.
I want to get the output as:
1/00:29:01|10.3.57.60|dbname1| SELECT
re.id,
re.event_type_cd,
re.event_ts,
re.source_type,
re.source_id,
re.properties
FROM
table1 re
WHERE
re.id > 621982999
AND re.id <= 884892348
ORDER BY
re.id
2/00:01:00|10.3.56.101|dbname2|BEGIN;declare "SQL_CUR00000000009CE140" cursor for SELECT id, cast(event_type_cd as character(4)) event_type_cd, CAST(event_ts AS DATE) event_ts, CAST(source_id AS character varying(100)) source_id, CAST(tx_id AS character varying(100)) tx_id, CAST(properties AS character varying(4000)) properties, CAST(source_type AS character(1)) source_type FROM table1 WHERE ID > 514725989 ORDER BY ID limit 500000;fetch 500000 in "SQL_CUR00000000009CE140"
But when I use:
cat file | awk -F '|' -v RS="^" '{ print FNR "/" $0 }'
I get:
1/00:29:01|10.3.57.60|dbname1| SELECT
re.id,
re.event_type_cd,
re.event_ts,
re.source_type,
re.source_id,
re.properties
FROM
table1 re
WHERE
re.id > 621982999
AND re.id <= 884892348
ORDER BY
re.id
2/
00:01:00|10.3.56.101|dbname2|BEGIN;declare "SQL_CUR00000000009CE140" cursor for SELECT id, cast(event_type_cd as character(4)) event_type_cd, CAST(event_ts AS DATE) event_ts, CAST(source_id AS character varying(100)) source_id, CAST(tx_id AS character varying(100)) tx_id, CAST(properties AS character varying(4000)) properties, CAST(source_type AS character(1)) source_type FROM table1 WHERE ID > 514725989 ORDER BY ID limit 500000;fetch 500000 in "SQL_CUR00000000009CE140"
3/
One approach: skip the lines that contain only the ^ marker and prefix every line containing a | (the start of a record) with an incrementing counter:
awk '/^\^/{next}/\|/{sub("^",++c"/")}1' file
Another approach keeps RS='^', strips the leading newline from each record, and prints the fields with labels:
awk -vRS='^' -F '|' '{sub("^\n","")}{printf "%s/TIME:%s HOST:%s DB:%s SQL:%s",FNR,$1,$2,$3,$4}' file
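An alternative sketch, assuming gawk (a regular-expression RS is a gawk extension) and that the file ends with a newline after the final ^: treat the ^ together with its surrounding newlines as the record separator, so no leading newline needs to be stripped and no empty trailing record is produced:
gawk 'BEGIN{RS="\n?\\^\n"} {print FNR "/" $0}' file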

SQL - condensing repetitive boolean 'WHEN' statements

Please can anyone suggest a way of condensing this code, to reduce its repetitive nature? Many thanks.
select case
when c=1 and cs=1 and f=0 and fs=0 then 'FPL02'
when c=0 and cs=0 and f=1 and fs=1 then 'FPL03'
when c=1 and cs=0 and f=0 and fs=0 then 'FPL04'
when c=0 and cs=0 and f=1 and fs=0 then 'FPL05'
when c=1 and cs=1 and f=1 and fs=1 then 'FPL06'
when c=1 and cs=1 and f=1 and fs=0 then 'FPL07'
when c=1 and cs=0 and f=1 and fs=1 then 'FPL08'
when c=1 and cs=0 and f=1 and fs=0 then 'FPL09'
when Ab=1 then 'FPL10'
when cpc=1 and plo=0 then 'FPC01'
when cpc=0 and plo=1 then 'FPC02'
when cpc=1 and plo=1 then 'FPC03'
else 'FPL01' end
from (select ptmatter, BillLHAbsolute as Ab, BillLHChildren as C, BillLHChildrenSettle as CS, BillLHFinances as F, BillLHFinancesSettle as FS, BillLHCPC as CPC, BillLHPLO as PLO from MatterDataDef) as mmd
where ptmatter=$matter$
With that many different conditional statements on different columns, I sincerely doubt you can condense that code while having it still be maintainable by someone else.
For example, you would need this:
select case
when c IN (0, 1) AND cs IN (0, 1) AND f IN (0, 1) AND fs IN (0, 1) then
case
when c=1 and cs=1 and f=1 and fs=0 then 'FPL07'
when c=1 and cs=0 and f=1 and fs=0 then 'FPL09'
else 'FPL0' + cast(c * 5 + f * 6 - cs * 2 - fs * 2 - 1 as char(1))
end
when Ab = 1 then
'FPL10'
when cpc IN (0, 1) AND plo IN (0, 1) then
'FPC0' + cast(cpc * 1 + plo * 2 as char(1))
else
'FPL01'
end
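For example, with c=1, cs=1, f=1 and fs=1 the fall-through arithmetic gives 1*5 + 1*6 - 1*2 - 1*2 - 1 = 6, i.e. 'FPL06', matching the original mapping; FPL07 and FPL09 do not fit the formula, which is why they keep their own WHEN branches.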
It's condensed (sort of), but you're trading off fewer lines for less readability.
All in all, it's really not that many WHEN statements.