I'm writing a bash script to automatically perform some changes to SQL files. My problem is: how can I use sed to convert CamelCase text to snake_case ONLY where it's enclosed within backticks?
e.g. a line like the following:
INSERT INTO TableName (ColumnOne, ColumnTwo) VALUES (120, "YouTube video", "Linux and macOS");
should become:
INSERT INTO table_name (column_one, column_two) VALUES (120, "YouTube video", "Linux and macOS");
The following expression
sed -i -r 's/([a-z0-9])([A-Z])/\1_\L\2/g' filename.sql
performs the first part of the desired job (if needed, the matched text can easily be lowercased afterwards with sed -i 's/`[^`]*`/\L&/g' filename.sql), but I need to limit its scope to just those parts enclosed within backticks (i.e. table and column names), leaving anything else untouched. How can I achieve the desired result?
With GNU sed:
sed -E 's/(`[^`]*)([A-Z])([^`]*`)/\L\1\L_\2\3/g' file
Don't see the backticks? The command assumes the names really are backtick-quoted. So, with this input:
INSERT INTO `TableName (ColumnOne, ColumnTwo)` VALUES (120, "YouTube video", "Linux and macOS");
You can try this sed:
sed -E ':A;s/(`.+)([a-z])([A-Z])([^`]+`)/\1\2_\L\3\4/;tA'
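As a sketch, here is the looping substitution above followed by the lowercasing pass from the question, applied to a sample line where the identifiers are backtick-quoted (the sample is illustrative, not from the question):

```shell
echo 'INSERT INTO `TableName` (`ColumnOne`, `ColumnTwo`) VALUES (120, "YouTube video");' |
  sed -E ':A;s/(`.+)([a-z])([A-Z])([^`]+`)/\1\2_\L\3\4/;tA' |
  sed 's/`[^`]*`/\L&/g'
# -> INSERT INTO `table_name` (`column_one`, `column_two`) VALUES (120, "YouTube video");
```

Note that "YouTube video" survives untouched: the `[^`]+`` in group 4 means the lowercase-uppercase pair must be followed by a closing backtick, which never happens inside the double-quoted string.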
If you don't mind using Perl, here is a general-purpose alternative: one which can handle multi-line files, multi-hump CamelCase, and will properly ignore quoted strings (" or ') and SQL statements:
perl -i -p -e 's/(?<!\x22|\x27)([A-Z][a-z0-9]++)(?!\s|,|\))/\L\1_/g; s/(_)+([A-Z])?/\L\1\2/g' file
I have a script ./cpptest.sh to which I am passing a command-line parameter.
For example:
$ ./cpptest.sh /srv/repository/Software/Wind_1.0.2/
The above command-line parameter is stored in the variable $1.
When I echo $1, the output is correct (the path).
Actual issue...
There is another file, say abc.properties. In this file there is a key-value field, something like location.1=stg_area.
I want to replace 'stg_area' with the value stored in $1 (the path), so that after substitution the line reads location.1=/srv/repository/Software/Wind_1.0.2/
Now, to achieve this, I tried all of the options below with sed, and none worked:
sed -i "s/stg_area/$1/" /srv/ppc/abc.properties    # sed: -e expression #1, char 17: unknown option to `s'
sed -i 's/stg_area/'"$1'"/' /srv/ppc/abc.properties    # sed: -e expression #1, char 18: unknown option to `s'
sed -i s/stg_area/$1/ /srv/ppc/abc.properties    # sed: -e expression #1, char 17: unknown option to `s'
I think I have tried all possible ways... Any answer on this is appreciated. Thanks in advance.
You know that sed is using / as a special separator in the command s/pattern/replacement/, right? You've used it yourself in the question.
So obviously there's a problem when you have a replacement string containing /, like yours does.
As the documentation says:
The / characters may be uniformly replaced by any other single character within any given s command. The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character.
So the two available solutions are:
use a different separator for the s command, such as
s#stg_area#$1#
(although you still need to check there are no # characters in the replacement string)
sanitize the replacement string so it doesn't contain any special characters (either /, or sequences like \1, or anything else sed treats as special), for example by escaping them with \
sanitized=$(sed 's#/#\\/#g' <<< "$1")
(and then use $sanitized instead of $1 in your sed script)
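Putting the first option together, a minimal sketch with hypothetical file contents mirroring the question (a local abc.properties instead of the real path):

```shell
# hypothetical setup mirroring the question
printf 'location.1=stg_area\n' > abc.properties
path='/srv/repository/Software/Wind_1.0.2/'

# '#' as the s-command separator, so the '/' characters in $path are harmless
sed -i "s#stg_area#$path#" abc.properties
cat abc.properties
# -> location.1=/srv/repository/Software/Wind_1.0.2/
```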
Each file's name starts with "input". One example file looks like:
0.0005
lii_bk_new
traj_new.xyz
0
73001
146300
I want to delete the lines which contain only '0'; the expected output is:
0.0005
lii_bk_new
traj_new.xyz
73001
146300
I have tried with
sed -i 's/^0\n//g' input_*
and
grep -RiIl '^0\n' input_* | xargs sed -i 's/^0\n//g'
but neither works.
Please give some suggestions.
Could you please try changing your attempted code to the following; run it on a single input file first.
sed 's/^0$//' Input_file
Or, as per the OP's comment, to delete the resulting null lines as well:
sed 's/^0$//;/^$/d' Input_file
I have intentionally not used the -i option here: first test this on a single file, and only if the output looks good should you run it with -i on multiple files.
Also, the problem in your attempt was putting \n in the sed regex: \n is the line separator, which sed strips before processing each line, so it can never match. We need $ instead, to tell sed to match lines that start and end with 0.
In case you want to take a backup of the files (assuming you have enough space available in your file system), you could use sed's -i.bak option, which saves a copy of each file before editing. This isn't necessary, but it's there if you want to be on the safer side.
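A minimal sketch of that workflow, with a hypothetical input_1 reproducing the sample from the question:

```shell
# reproduce the sample file from the question
printf '%s\n' 0.0005 lii_bk_new traj_new.xyz 0 73001 146300 > input_1

# dry run: blank out the 0-only lines, then delete the empty lines
sed 's/^0$//;/^$/d' input_1

# when the output looks right, edit in place, keeping .bak backups
sed -i.bak 's/^0$//;/^$/d' input_*
```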
$ sed '/^0$/d' file
0.0005
lii_bk_new
traj_new.xyz
73001
146300
In your regexp you were confusing \n (the literal linefeed character, which will not be present in the string sed is analyzing, since sed reads one \n-separated line at a time) with $ (the end-of-string regexp metacharacter, which represents end-of-line when the string being parsed is a line, as it is with sed by default).
The other mistake in your script was replacing 0 with null in the matching line instead of just deleting the matching line.
I would use GNU awk's -i inplace for this, in the following way:
awk -i inplace '!/^0$/' input_*
This simply preserves all lines which do not match ^0$, i.e. (start of line)0(end of line).
In the following file I want to replace every ; with a ,, except that when a ; appears inside a string (delimited by two "), it must not be replaced.
Example:
Input
A;B;C;D
5cc0714b9b69581f14f6427f;5cc0714b9b69581f14f6428e;1;"5cc0714b9b69581f14f6427f;16a4fba8d13";xpto;
5cc0723b9b69581f14f64285;5cc0723b9b69581f14f64294;2;"5cc0723b9b69581f14f64285;16a4fbe3855";xpto;
5cc072579b69581f14f6428a;5cc072579b69581f14f64299;3;"5cc072579b69581f14f6428a;16a4fbea632";xpto;
output
A,B,C,D
5cc0714b9b69581f14f6427f,5cc0714b9b69581f14f6428e,1,"5cc0714b9b69581f14f6427f;16a4fba8d13",xpto,
5cc0723b9b69581f14f64285,5cc0723b9b69581f14f64294,2,"5cc0723b9b69581f14f64285;16a4fbe3855",xpto,
5cc072579b69581f14f6428a,5cc072579b69581f14f64299,3,"5cc072579b69581f14f6428a;16a4fbea632",xpto,
For sed I have: sed 's/;/,/g' input.txt > output.txt but this would replace everything.
A regex for the "-delimited string: \".*;.*\".
(A regex for hexadecimal would be better -- something like: [0-9a-fA-F]+)
My problem is combining it all to make a grep -o / sed that replaces everything except for that pattern.
The file size is on the order of two-digit gigabytes (max 99 GB), so performance is important.
Any ideas are appreciated.
sed is for doing simple s/old/new on individual strings. grep is for doing g/re/p. You're not trying to do either of those tasks so you shouldn't be considering either of those tools. That leaves the other standard UNIX tool for manipulating text - awk.
You have a ;-separated CSV that you want to make ,-separated. That's simply:
$ awk -v FPAT='[^;]*|"[^"]+"' -v OFS=',' '{$1=$1}1' file
A,B,C,D
5cc0714b9b69581f14f6427f,5cc0714b9b69581f14f6428e,1,"5cc0714b9b69581f14f6427f;16a4fba8d13",xpto,
5cc0723b9b69581f14f64285,5cc0723b9b69581f14f64294,2,"5cc0723b9b69581f14f64285;16a4fbe3855",xpto,
5cc072579b69581f14f6428a,5cc072579b69581f14f64299,3,"5cc072579b69581f14f6428a;16a4fbea632",xpto,
The above uses GNU awk for FPAT. See What's the most robust way to efficiently parse CSV using awk? for more details on parsing CSVs with awk.
If I understand your requirements correctly, one option would be a three-pass approach.
From your comment about hex, I'll assume nothing like # can appear in the input, so you can do (using GNU sed):
sed -E 's/("[^"]+);([^"]+")/\1#\2/g' original > transformed
sed -i 's/;/,/g' transformed
sed -i 's/#/;/g' transformed
The idea is to replace each ; within quotes by some other character and write the result to a new file, then replace all remaining ; with ,, and finally restore the ; placeholder in place within the same file (the -i flag of sed).
The three passes can be combined into a single command:
sed -E 's/("[^"]+);([^"]+")/\1#\2/g;s/;/,/g;s/#/;/g' original > transformed
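A quick sanity check of the combined command on a single made-up line (assuming, as above, that no # occurs in the data):

```shell
echo '1;"a;b";2;' |
  sed -E 's/("[^"]+);([^"]+")/\1#\2/g;s/;/,/g;s/#/;/g'
# -> 1,"a;b",2,
```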
That said, there are plenty of CSV parsers which already handle quoted fields, and you could probably use one for the final use case, as I bet this is just an intermediate step for something later in the chain.
From Ed Morton's comment: if you do it in one pass, you can use \n as the placeholder, since a newline can't appear in text that is processed line by line.
This might work for you (GNU sed):
sed -E ':a;s/^([^"]*("[^"]*"[^"]*)*"[^";]*);/\1\n/;ta;y/;/,/;y/\n/;/' file
Replace ;'s inside double quotes with newlines, transliterate the remaining ;'s to ,'s, and then transliterate the newlines back to ;'s.
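Checked against a shortened version of the sample input (hex IDs abbreviated for readability), this one-pass sketch reproduces the expected output:

```shell
printf '%s\n' 'A;B;C;D' '5cc0714b;5cc0714e;1;"5cc0714b;16a4fba8d13";xpto;' |
  sed -E ':a;s/^([^"]*("[^"]*"[^"]*)*"[^";]*);/\1\n/;ta;y/;/,/;y/\n/;/'
# -> A,B,C,D
# -> 5cc0714b,5cc0714e,1,"5cc0714b;16a4fba8d13",xpto,
```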
I have a *.sql file which was dumped from PHPMyAdmin, and all of the tables have a prefix of ff_. How can I remove this prefix? I tried using Notepad++, but it doesn't work, because the INSERT data contains the word too.
Try replacing `ff_ with a plain backtick, in plain Notepad, Notepad++, or sed; sed is no different here.
For this simple replacement to work, your dump should be created with backticks forced around table names.
GNU sed is here to help:
sed -i 's/`ff_/`/g' *.sql
On macOS, use gsed instead of the stock sed. Note the backtick in the patterns.
If you think that one of your files contains `ff_ in a string other than table name, you can check that with:
grep '`ff_' *.sql
If this is the case, consider the following:
sed -i 's/INSERT INTO `ff_/INSERT INTO `/g' *.sql
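A small sketch (with hypothetical dump contents) of why anchoring the match on the backtick is enough to protect strings that happen to contain ff_:

```shell
printf '%s\n' 'CREATE TABLE `ff_users` (id INT);' 'INSERT INTO `ff_users` VALUES ("likes ff_ prefix");' > dump.sql
sed -i 's/`ff_/`/g' dump.sql
cat dump.sql
# -> CREATE TABLE `users` (id INT);
# -> INSERT INTO `users` VALUES ("likes ff_ prefix");
```

The ff_ inside the double-quoted string survives because it isn't preceded by a backtick.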
UPDATE:
This is what works!
fgrep -ircl --include=*.sql -- -- *
I have various SQL files with '--' comments and we migrated to the latest version of MySQL and it hates these comments. I want to replace -- with #.
I am looking for a recursive, inplace replace one-liner.
This is what I have:
perl -p -i -e 's/--/# /g' `fgrep -- -- *`
A sample .sql file:
use myDB;
--did you get an error
I get the following error:
Unrecognized switch: --did (-h will show valid options).
P.S.: making fgrep skip the two dashes was just discussed in another question, if you are interested.
Any help is appreciated.
The command-line arguments after the -e 's/.../.../' argument should be filenames. Use fgrep -l to return names of files that contain a pattern:
perl -p -i -e 's/--/# /g' `fgrep -l -- -- *`
I'd use a combination of find and in-place sed:
find . -name '*.sql' -exec sed -i -e "s/^--/#/" '{}' \;
Note that it will only replace -- at the beginning of a line.
The regex becomes vastly more complex if you also want to handle, for example:
INSERT INTO stuff VALUES (...) -- values used for xyz
because the -- might just as well appear inside your data (which I guess you don't want to replace):
INSERT INTO stuff VALUES (42, "<!-- sboing -->") -- values used for xyz
The equivalent of that in script form is:
#!/usr/bin/perl -i
use warnings;
use strict;

while (<>) {
    s/--/# /g;
    print;
}
If I have several files with comments of the form --comment and feed any number of names to this script, they are changed in place to # comment. You could use find, ls, grep, etc. to collect the file names.
There is nothing wrong per se with using a one-liner.
Is that what you are looking for?